# Baseer OCR



Overview [#overview]

**Baseer** (بصير) is a vision-to-text model built for Arabic documents. It goes beyond character recognition by understanding document structure, rendering tables as clean HTML, and preserving the semantic layout of classical Arabic texts.

The OCR API is **asynchronous** — you upload a file, poll for status, and retrieve results when processing completes. This enables handling of large PDFs and complex documents without request timeouts.

***

Workflow [#workflow]

<div className="fd-steps [&_h3]:fd-step">
  Upload a file [#upload-a-file]

  Submit your image or PDF to `POST /v1/ocr`. You'll receive a `fileId` and a `Location` header pointing to the status endpoint.

  Poll for status [#poll-for-status]

  Use `GET /v1/ocr/{fileId}/status` to check whether processing is `pending`, `processing`, `completed`, or `failed`.

  Retrieve results [#retrieve-results]

  Once the status is `completed`, fetch the extracted content from `GET /v1/ocr/{fileId}/results`. Results can only be consumed **once** — they are deleted after retrieval.
</div>

***

Endpoints [#endpoints]

Upload File [#upload-file]

```
POST /v1/ocr
```

Accepts a file upload and begins asynchronous OCR processing. Returns **202 Accepted** with a `fileId` and a `Location` header.

Requires an [API Key](/docs/api-keys) passed via the `x-api-key` header.

Request [#request]

The request must be sent as `multipart/form-data`.

| Field     | Type     | Required | Description                                                                            |
| --------- | -------- | -------- | -------------------------------------------------------------------------------------- |
| `file`    | `binary` | Yes      | Image (`jpg`, `png`, `tiff`) or PDF file. Images up to **5 MB**, PDFs up to **50 MB**. |
| `model`   | `string` | Yes      | The OCR model to use. See [Models](/docs/models/ocr#models).                           |
| `options` | `object` | No       | Sampling parameters for inference. See [Options](/docs/models/ocr#options).            |
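
The size and type limits in the table can be checked on the client before any bytes are sent. The following is a minimal sketch, not part of the API: `check_upload` is a hypothetical helper, it accepts only the extensions listed above, and it assumes 1 MB means 1024 × 1024 bytes, which this page does not specify.

```bash
# Client-side pre-check against the documented upload limits.
# check_upload is a hypothetical helper, not part of the API.
# Assumption: 1 MB = 1024 * 1024 bytes.
check_upload() (
  file=$1
  [ -f "$file" ] || { echo "not a file: $file" >&2; exit 2; }

  # Lowercase the extension so "SCAN.PDF" is treated like "scan.pdf".
  ext=$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    jpg|png|tiff) limit=$((5 * 1024 * 1024)) ;;    # images: 5 MB
    pdf)          limit=$((50 * 1024 * 1024)) ;;   # PDFs: 50 MB
    *) echo "unsupported extension: $ext" >&2; exit 3 ;;
  esac

  # wc pads its output with spaces on some systems; strip them.
  size=$(wc -c < "$file" | tr -d ' ')
  [ "$size" -le "$limit" ] || { echo "too large: $size bytes (limit $limit)" >&2; exit 4; }
)
```

A call such as `check_upload /path/to/document.pdf && curl ...` skips the upload when the file would be rejected under these assumptions.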

Response [#response]

**Status: `202 Accepted`**

**Headers:**

* `Location: /v1/ocr/{fileId}/status`

```json
{
  "fileId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}
```

| Field    | Type     | Description                                                                             |
| -------- | -------- | --------------------------------------------------------------------------------------- |
| `fileId` | `string` | Unique identifier for the uploaded file. Use this to check status and retrieve results. |

Example [#example]

```bash
curl -X POST https://api-dev.kawn.io/v1/ocr \
  -H "x-api-key: <YOUR_API_KEY>" \
  -F "model=baseer/baseer-v2" \
  -F "file=@/path/to/document.pdf"
```
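
The upload example above prints only the JSON body. To also read the `Location` header, curl can write the response headers to a file with `-D`. The sketch below shows one way to pull the header value out of that file; `location_from_headers` and the file names `headers.txt` and `upload.json` are illustrative, not part of the API.

```bash
# Extract the Location header value from a saved header dump
# (as written by `curl -D headers.txt`). Header names are case-insensitive
# and header lines end in CRLF, so both are normalised here.
# location_from_headers is a hypothetical helper.
location_from_headers() {
  tr -d '\r' < "$1" | awk 'tolower($1) == "location:" { print $2; exit }'
}

# Usage (not run here):
#   curl -s -D headers.txt -o upload.json -X POST https://api-dev.kawn.io/v1/ocr \
#     -H "x-api-key: <YOUR_API_KEY>" \
#     -F "model=baseer/baseer-v2" \
#     -F "file=@/path/to/document.pdf"
#   STATUS_PATH=$(location_from_headers headers.txt)
```

As documented above, the header holds a path (`/v1/ocr/{fileId}/status`), so it needs to be joined with the API host before it is requested.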

***

Check Status [#check-status]

```
GET /v1/ocr/{fileId}/status
```

Returns the current processing status of a previously uploaded file.

Requires an [API Key](/docs/api-keys) passed via the `x-api-key` header.

Path Parameters [#path-parameters]

| Parameter | Type     | Required | Description                                    |
| --------- | -------- | -------- | ---------------------------------------------- |
| `fileId`  | `string` | Yes      | The file ID returned from the upload endpoint. |

Response [#response-1]

```json
{
  "fileId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "completed"
}
```

| Field    | Type     | Description                                             |
| -------- | -------- | ------------------------------------------------------- |
| `fileId` | `string` | The file identifier.                                    |
| `status` | `string` | One of: `pending`, `processing`, `completed`, `failed`. |

Status Values [#status-values]

| Status       | Description                                                          |
| ------------ | -------------------------------------------------------------------- |
| `pending`    | File has been received and is queued for processing.                 |
| `processing` | The OCR model is actively extracting text from the file.             |
| `completed`  | Processing finished successfully. Results are ready to be retrieved. |
| `failed`     | The file could not be processed.                                     |
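
Of the four values, `completed` and `failed` are terminal, while `pending` and `processing` mean the client should keep polling. A bounded polling loop built on that distinction might look like the sketch below. `poll_status`, the `API_KEY` environment variable, and the default attempt count and delay are assumptions made for illustration; this page does not recommend a polling interval.

```bash
# poll_status is a hypothetical helper: it polls until a terminal status
# (`completed` or `failed`) is seen or max_attempts is reached.
# Assumes the API key is exported as API_KEY.
# Exit code: 0 = completed, 1 = failed, 2 = gave up or unexpected status.
poll_status() (
  file_id=$1
  max_attempts=${2:-60}   # illustrative default
  delay=${3:-2}           # seconds between polls, illustrative default
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    status=$(curl -s "https://api-dev.kawn.io/v1/ocr/$file_id/status" \
      -H "x-api-key: $API_KEY" | jq -r '.status')
    case "$status" in
      completed) exit 0 ;;
      failed)    exit 1 ;;
      pending|processing) sleep "$delay" ;;
      # Anything else (including "null" or an empty string) is treated as an error.
      *) echo "unexpected status: $status" >&2; exit 2 ;;
    esac
    attempt=$((attempt + 1))
  done
  echo "gave up after $max_attempts attempts" >&2
  exit 2
)
```

Bounding the loop and rejecting unknown values keeps a script from polling forever when a request fails or the key is wrong.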

Example [#example-1]

```bash
curl https://api-dev.kawn.io/v1/ocr/a1b2c3d4-e5f6-7890-abcd-ef1234567890/status \
  -H "x-api-key: <YOUR_API_KEY>"
```

***

Retrieve Results [#retrieve-results-1]

```
GET /v1/ocr/{fileId}/results
```

Returns the extracted content for a completed file. **Results can only be retrieved once** — the data is deleted immediately after a successful response.

Requires an [API Key](/docs/api-keys) passed via the `x-api-key` header.

Path Parameters [#path-parameters-1]

| Parameter | Type     | Required | Description                                    |
| --------- | -------- | -------- | ---------------------------------------------- |
| `fileId`  | `string` | Yes      | The file ID returned from the upload endpoint. |

Response [#response-2]

```json
{
  "fileId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "model": "baseer/baseer-v2",
  "pages": [
    {
      "index": 0,
      "content": "<p>بسم الله الرحمن الرحيم</p>"
    },
    {
      "index": 1,
      "content": "<p>الحمد لله رب العالمين</p>"
    }
  ],
  "creditsConsumed": 2
}
```

| Field             | Type     | Description                                                                                |
| ----------------- | -------- | ------------------------------------------------------------------------------------------ |
| `fileId`          | `string` | The file identifier.                                                                       |
| `model`           | `string` | The model that processed the request.                                                      |
| `pages`           | `array`  | Ordered list of page results. Each page has an `index` and its extracted `content` (HTML). |
| `creditsConsumed` | `number` | Number of credits deducted for this request.                                               |

<Callout type="warn">
  Results are consumed on retrieval. To keep the extracted content, persist it on your side as soon as you fetch it.
</Callout>
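
Because a second request cannot recover the data, it can help to write the response straight to disk and only then process it. The sketch below is one way to do that under stated assumptions: `fetch_results_once` is a hypothetical helper, the API key is read from an `API_KEY` environment variable, and the `.part` suffix is an arbitrary choice.

```bash
# fetch_results_once is a hypothetical helper. It writes the response to a
# temporary file and renames it only if curl succeeded, so a failed request
# never leaves a truncated file at the destination.
# Assumes the API key is exported as API_KEY.
fetch_results_once() (
  file_id=$1
  dest=$2
  tmp="$dest.part"

  # Results are deleted after one retrieval, so never re-fetch over a saved copy.
  [ -s "$dest" ] && { echo "already saved: $dest" >&2; exit 0; }

  # --fail makes curl exit non-zero on HTTP error responses.
  if curl -s --fail -o "$tmp" \
      "https://api-dev.kawn.io/v1/ocr/$file_id/results" \
      -H "x-api-key: $API_KEY"; then
    mv "$tmp" "$dest"
  else
    rm -f "$tmp"
    echo "could not retrieve results for $file_id" >&2
    exit 1
  fi
)
```

For example, `fetch_results_once "$FILE_ID" results.json && jq . results.json` reads from the saved copy rather than from the network.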

Example [#example-2]

```bash
curl https://api-dev.kawn.io/v1/ocr/a1b2c3d4-e5f6-7890-abcd-ef1234567890/results \
  -H "x-api-key: <YOUR_API_KEY>"
```
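
Once a response has been saved to a file, the per-page HTML can be joined into a single document with `jq`. This is a small sketch: `pages_to_html` is a hypothetical helper, and the file name in the usage line is illustrative. The `pages` array is documented as ordered, so the sort by `index` is only a defensive step.

```bash
# pages_to_html is a hypothetical helper: it reads a saved results response
# and prints each page's HTML content, one page per line, in ascending
# `index` order.
pages_to_html() {
  jq -r '.pages | sort_by(.index) | .[].content' "$1"
}

# Usage (not run here):
#   pages_to_html results.json > document.html
```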

***

Full Example [#full-example]

Here's a complete example using `curl` and `jq` to upload a file, poll until completion, and retrieve results:

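```bash
# 1. Upload the file
FILE_ID=$(curl -s -X POST https://api-dev.kawn.io/v1/ocr \
  -H "x-api-key: <YOUR_API_KEY>" \
  -F "model=baseer/baseer-v2" \
  -F "file=@/path/to/document.pdf" | jq -r '.fileId')

# jq prints "null" if the response has no fileId field
if [ -z "$FILE_ID" ] || [ "$FILE_ID" = "null" ]; then
  echo "Upload failed"
  exit 1
fi

echo "File ID: $FILE_ID"

# 2. Poll for status until completed
while true; do
  STATUS=$(curl -s "https://api-dev.kawn.io/v1/ocr/$FILE_ID/status" \
    -H "x-api-key: <YOUR_API_KEY>" | jq -r '.status')
  echo "Status: $STATUS"

  if [ "$STATUS" = "completed" ]; then
    break
  elif [ "$STATUS" = "failed" ]; then
    echo "Processing failed"
    exit 1
  elif [ "$STATUS" != "pending" ] && [ "$STATUS" != "processing" ]; then
    # Stop instead of looping forever on an unrecognised or missing status
    echo "Unexpected status: $STATUS"
    exit 1
  fi

  sleep 2
done

# 3. Retrieve results (one-time consumption): save a copy first, then display it
curl -s "https://api-dev.kawn.io/v1/ocr/$FILE_ID/results" \
  -H "x-api-key: <YOUR_API_KEY>" -o results.json
jq . results.json
```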

***

Models [#models]

| Model ID           | Description               |
| ------------------ | ------------------------- |
| `baseer/baseer-v2` | Default production model. |

Options [#options]

| Field         | Type     | Description                                                               |
| ------------- | -------- | ------------------------------------------------------------------------- |
| `topP`        | `number` | Nucleus sampling threshold.                                               |
| `topK`        | `number` | Top-K sampling parameter.                                                 |
| `temperature` | `number` | Controls randomness of output. Higher values produce more varied results. |
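
This page does not show how the `options` object is encoded inside a `multipart/form-data` request. One common convention, assumed here and not confirmed by this documentation, is to send it as a JSON string in the `options` form field. The sketch below builds such a string; `build_options` is a hypothetical helper and the numeric values in the usage line are arbitrary illustrations, not recommended settings.

```bash
# build_options is a hypothetical helper that assembles the three documented
# sampling fields into a compact JSON object. --argjson makes jq reject
# arguments that are not valid JSON, so a typo fails before any upload.
build_options() {
  jq -cn --argjson t "$1" --argjson p "$2" --argjson k "$3" \
    '{temperature: $t, topP: $p, topK: $k}'
}

# Usage (not run here; assumes `options` is accepted as a JSON form field):
#   curl -X POST https://api-dev.kawn.io/v1/ocr \
#     -H "x-api-key: <YOUR_API_KEY>" \
#     -F "model=baseer/baseer-v2" \
#     -F "file=@/path/to/document.pdf" \
#     --form-string "options=$(build_options 0.2 0.9 40)"
```

`--form-string` is used instead of `-F` for the JSON value so that curl sends it literally rather than interpreting characters such as `@`, `<`, or `;`.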

***

How It Works [#how-it-works]

The Baseer processor runs each file through a four-stage pipeline before returning results:

1. **Preprocessing** — Images are compressed and normalized for optimal inference accuracy.
2. **Inference** — The file is passed to the Baseer model.
3. **Validation** — Output is checked by validators (e.g. repetition detection, hallucination checks, Quranic ayah validation).
4. **Retry Logic** — If validation fails, the system automatically retries with adjusted parameters (higher temperature or repetition penalty) before surfacing an error.
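
The validate-then-retry pattern in steps 3 and 4 can be sketched as a loop. This is purely illustrative and is not Baseer's implementation: `run_inference` and `validate` are hypothetical stand-ins that the caller must define, and the starting temperature, step size, and attempt limit are made-up values.

```bash
# Illustrative only. run_inference and validate are hypothetical functions
# supplied by the caller:
#   run_inference INPUT TEMPERATURE  -> prints the model output
#   validate OUTPUT                  -> exit 0 if the output passes all checks
run_with_retries() (
  input=$1
  max_attempts=${2:-3}    # made-up default
  temperature=0.0         # made-up starting value
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    output=$(run_inference "$input" "$temperature")
    if validate "$output"; then
      printf '%s\n' "$output"
      exit 0
    fi
    # Validation failed: raise the temperature before the next attempt.
    temperature=$(awk -v t="$temperature" 'BEGIN { printf "%.1f", t + 0.2 }')
    attempt=$((attempt + 1))
  done
  # Only after every attempt has failed is an error surfaced.
  echo "validation failed after $max_attempts attempts" >&2
  exit 1
)
```

The point of the sketch is the control flow: an error is reported only once every adjusted attempt has failed validation.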
