# Cohortful API Guide
## Base URL
- Production: `https://app.cohortful.com`
- API prefix: `/api/v1`
## Authentication
Use a bearer token on every API call:
```http
Authorization: Bearer <token>
```
If token auth fails, the API returns `401 Unauthorized`.
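The same header can be attached in any HTTP client. A minimal Python sketch using only the standard library; reading the token from a `TOKEN` environment variable mirrors the curl examples in this guide and is an assumption, not a requirement:

```python
import os
import urllib.request

API_BASE = "https://app.cohortful.com/api/v1"

def authed_request(path: str, method: str = "GET") -> urllib.request.Request:
    """Build a request carrying the bearer token (a 401 means it is missing or invalid)."""
    token = os.environ.get("TOKEN", "")
    return urllib.request.Request(
        API_BASE + path,
        method=method,
        headers={"Authorization": f"Bearer {token}"},
    )

req = authed_request("/profiles")
print(req.full_url)  # → https://app.cohortful.com/api/v1/profiles
```

Pass the resulting request to `urllib.request.urlopen` to execute the call.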
## Profiles
Profile endpoints are scoped to the account attached to the bearer token. You can only list, read, create, update, and delete profiles within that token scope.
### List profiles
`GET /api/v1/profiles`
```bash
curl "https://app.cohortful.com/api/v1/profiles" \
-H "Authorization: Bearer $TOKEN"
```
### Get a profile
`GET /api/v1/profiles/:id`
### Create a profile
`POST /api/v1/profiles`
Send JSON with a `profile` object. Omitted tuning fields use the model defaults. Set `default` to `true` to make the profile the account default.
```bash
curl -X POST "https://app.cohortful.com/api/v1/profiles" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "profile": {
      "name": "Baseline",
      "description": "Default assumptions",
      "draws": 500,
      "target_accept": 0.97,
      "zero_revenue": "keep",
      "arpu_bar_mu": 0.3,
      "arpu_bar_sigma": 0.3,
      "arpu_sigma": 0.3,
      "cv_bar_mu": 2.0,
      "cv_bar_sigma": 0.8,
      "cv_sigma": 1.0,
      "p_bar_mu": 0.5,
      "p_bar_sigma": 0.3,
      "p_sigma": 0.2,
      "y_sd_sigma": 0.5,
      "t_roas": 1.0,
      "t_roas_interval_left": 0.8,
      "t_roas_interval_right": 10.0,
      "default": true
    }
  }'
```
### Update a profile
`PATCH /api/v1/profiles/:id`
Send the same `profile` JSON shape as create, but only include the fields you want to change.
### Delete a profile
`DELETE /api/v1/profiles/:id`
Returns `204 No Content` on success.
## Create Dataset
`POST /api/v1/datasets`
Uploads a CSV, creates a dataset, stores Parquet, and queues inference.
The columns you pass in `features` are treated as ML model features (for example: geo, creative, channel, build). Cohortful fits the statistical model on your uploaded cohort data to produce early-signal estimates for:
- conversion rate
- ARPU
- CV (coefficient of variation / whale-risk signal)
Do not use high-cardinality identifier columns as features (for example `user_id`, `device_id`, or other near-unique IDs), because they degrade generalization and inference quality.
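One way to screen candidate feature columns before upload is a simple uniqueness check on a sample of the CSV. A hedged sketch; the 0.5 uniqueness threshold is an arbitrary assumption for illustration, not a Cohortful rule:

```python
import csv
from io import StringIO

def usable_features(csv_text: str, candidates: list[str], max_unique_ratio: float = 0.5) -> list[str]:
    """Drop candidate columns whose values are nearly unique per row (ID-like columns)."""
    rows = list(csv.DictReader(StringIO(csv_text)))
    keep = []
    for col in candidates:
        values = [r[col] for r in rows]
        if len(set(values)) / max(len(values), 1) <= max_unique_ratio:
            keep.append(col)
    return keep

sample = "user_id,geo,arpu\n1,US,0.5\n2,US,0.7\n3,DE,0.4\n4,DE,0.6\n"
print(usable_features(sample, ["user_id", "geo"]))  # → ['geo']
```

Here `user_id` is unique on every row and gets dropped, while `geo` repeats across rows and survives as a feature.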
### Content type
- `multipart/form-data`
### Required form fields
- `name` (string)
- `features[]` (one or more feature column names used by the model)
- `arpu_name` (ARPU column name)
- `file` (CSV file)
### Optional fields
- `aggregated` (`true` or `false`, default `true`)
- `arpu_std_name` (required when `aggregated=true`)
- `size_name` (required when `aggregated=true`)
- `spend_name` (spend column name)
- `cpi_name` (CPI column name)
- `impressions_name` (impressions column name)
- `installs_name` (installs column name)
- `profile_id` (inference profile to use)
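The conditional requirements above (for example, `arpu_std_name` and `size_name` only when `aggregated=true`) can be checked client-side before building the multipart request. A minimal sketch of that validation logic:

```python
def validate_dataset_fields(fields: dict) -> list[str]:
    """Return a list of missing-field errors for the dataset upload form."""
    errors = []
    for required in ("name", "arpu_name"):
        if not fields.get(required):
            errors.append(f"{required} is required")
    if not fields.get("features"):
        errors.append("at least one feature column is required")
    # aggregated defaults to true, which pulls in two extra required columns
    if str(fields.get("aggregated", "true")).lower() == "true":
        for required in ("arpu_std_name", "size_name"):
            if not fields.get(required):
                errors.append(f"{required} is required when aggregated=true")
    return errors

ok = {"name": "my-dataset", "features": ["cohort"], "arpu_name": "arpu", "aggregated": "false"}
print(validate_dataset_fields(ok))  # → []
```

An aggregated upload that omits `arpu_std_name` or `size_name` would produce errors here instead of a `422` from the API.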
### How to get `profile_id`
List profiles for your account, then use one of the returned `id` values as `profile_id` in dataset creation.
`GET /api/v1/profiles`
```bash
curl "https://app.cohortful.com/api/v1/profiles" \
-H "Authorization: Bearer $TOKEN"
```
### cURL example
```bash
curl -X POST "https://app.cohortful.com/api/v1/datasets" \
  -H "Authorization: Bearer $TOKEN" \
  -F "name=my-dataset" \
  -F "aggregated=false" \
  -F "features[]=cohort" \
  -F "arpu_name=arpu" \
  -F "cpi_name=cpi" \
  -F "impressions_name=impressions" \
  -F "file=@/path/to/dataset.csv;type=text/csv"
```
### Success response (`201`)
```json
{
  "dataset": {
    "id": 123,
    "name": "my-dataset",
    "aggregated": false,
    "url": "s3://.../datasets/123.parquet",
    "tokens": 456,
    "inference": null,
    "created_at": "2026-01-11T12:34:56Z"
  },
  "status": "queued"
}
```
## Get Dataset
`GET /api/v1/datasets/:id`
Returns dataset metadata and inference info (if available).
### cURL example
```bash
curl "https://app.cohortful.com/api/v1/datasets/123" \
-H "Authorization: Bearer $TOKEN"
```
### Success response (`200`)
```json
{
  "dataset": {
    "id": 123,
    "name": "my-dataset",
    "aggregated": false,
    "url": "s3://.../datasets/123.parquet",
    "tokens": 456,
    "inference": {
      "url": "s3://.../inference/123.json",
      "inference_time": 12
    },
    "created_at": "2026-01-11T12:34:56Z",
    "updated_at": "2026-01-11T12:35:10Z"
  },
  "status": "complete"
}
```
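A client can decide whether inference output is downloadable from the `status` and `inference` fields of this response. A small sketch, checked against the two sample payloads shown in this guide:

```python
def inference_ready(payload: dict) -> bool:
    """True when the dataset response reports a completed inference with an output URL."""
    return payload.get("status") == "complete" and bool((payload.get("dataset") or {}).get("inference"))

queued = {"dataset": {"id": 123, "inference": None}, "status": "queued"}
done = {"dataset": {"id": 123, "inference": {"url": "s3://.../inference/123.json", "inference_time": 12}}, "status": "complete"}
print(inference_ready(queued), inference_ready(done))  # → False True
```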
## Pull Inference (CSV)
There are two supported ways to download inference as CSV.
### 1) By inference id
`GET /api/v1/inferences/:id`
Returns a CSV file (`text/csv`) for that inference.
```bash
curl -L "https://app.cohortful.com/api/v1/inferences/456" \
-H "Authorization: Bearer $TOKEN" \
-o inference.csv
```
### 2) By dataset id
`GET /api/v1/datasets/:id.csv`
If inference is ready, this also returns the inference CSV for the dataset.
```bash
curl -L "https://app.cohortful.com/api/v1/datasets/123.csv" \
-H "Authorization: Bearer $TOKEN" \
-o inference.csv
```
If inference is not ready, the API returns `422` with error code `inference_not_ready`.
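Because a not-yet-ready inference surfaces as a `422`, download loops should retry rather than fail. A sketch with an injectable fetcher so the retry logic runs without the network; the `fetch() -> (status, body)` signature is an assumption for illustration:

```python
import time
from typing import Callable, Optional

def download_inference_csv(fetch: Callable[[], tuple[int, bytes]], attempts: int = 5, delay: float = 0.0) -> Optional[bytes]:
    """Call fetch() until it returns HTTP 200; treat 422 (inference_not_ready) as retryable."""
    for _ in range(attempts):
        status, body = fetch()
        if status == 200:
            return body
        if status != 422:
            raise RuntimeError(f"unexpected status {status}")
        time.sleep(delay)
    return None

# Simulated server: not ready twice, then the CSV.
responses = iter([(422, b""), (422, b""), (200, b"cohort,arpu\nUS_ios,0.5\n")])
print(download_inference_csv(lambda: next(responses)))  # → b'cohort,arpu\nUS_ios,0.5\n'
```

In production, `fetch` would wrap the curl call above (or its `urllib` equivalent) and `delay` would be a sensible backoff, not zero.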
## Score Data Against Inference
`POST /api/v1/datasets/:dataset_id/inference`
Runs prediction against an existing completed inference.
Required JSON field:
- `data` (array of objects, each keyed by the dataset's feature columns)
```bash
curl -X POST "https://app.cohortful.com/api/v1/datasets/123/inference" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"data":[{"cohort":"US_ios"}]}'
```
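Since each entry in `data` is one row to score, keyed by feature column, the body is easy to build from tabular input. A small builder sketch:

```python
import json

def score_payload(feature_names: list[str], rows: list[list[str]]) -> str:
    """Build the JSON body for POST /api/v1/datasets/:dataset_id/inference."""
    data = [dict(zip(feature_names, row)) for row in rows]
    return json.dumps({"data": data})

print(score_payload(["cohort"], [["US_ios"], ["DE_android"]]))
# → {"data": [{"cohort": "US_ios"}, {"cohort": "DE_android"}]}
```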
## Errors
- `401 Unauthorized`: token missing/invalid
- `404 Not Found`: dataset not found for your account
- `413 Payload Too Large`: CSV too large
- `422 Unprocessable Entity`: invalid params or ingestion failure
Example `422` body:
```json
{
  "error": {
    "code": "validation_error",
    "message": "Dataset is invalid",
    "details": {
      "features": ["can't be blank"]
    }
  }
}
```
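Validation failures arrive in a consistent envelope, so clients can flatten `details` into readable messages. A sketch run against the sample body above:

```python
import json

def format_api_error(body: str) -> list[str]:
    """Flatten a Cohortful error envelope into 'field: message' strings."""
    err = json.loads(body).get("error", {})
    messages = [f"{err.get('code', 'error')}: {err.get('message', '')}"]
    for field, problems in (err.get("details") or {}).items():
        messages.extend(f"{field}: {p}" for p in problems)
    return messages

body = '{"error": {"code": "validation_error", "message": "Dataset is invalid", "details": {"features": ["can\'t be blank"]}}}'
print(format_api_error(body))
# → ['validation_error: Dataset is invalid', "features: can't be blank"]
```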
## Cohortful integration
If your team uses ChatGPT, Grok, or any other AI assistant, you don't need a developer to connect Cohortful. Paste the script below into your agent and it sets up the integration for you. You're live in minutes, without touching a single line of code yourself.