Overview
The parse endpoint extracts text content from files (PDFs, images, DOCX, text files). Files are sent as base64-encoded payloads and processed in batch.
Supported file types
| MIME type | Description |
|---|---|
| application/pdf | PDF documents |
| application/vnd.openxmlformats-officedocument.wordprocessingml.document | DOCX files |
| application/msword | Legacy DOC files |
| application/json | JSON files |
| text/* | Any plain text file |
| image/* | Images (OCR extraction) |
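A client may want to reject unsupported files before uploading them. The following is a minimal sketch of such a check, assuming the wildcard rows in the table (text/*, image/*) behave like ordinary glob patterns; the helper name `is_supported_mime` is hypothetical and not part of the API.

```python
from fnmatch import fnmatchcase

# Patterns copied from the supported file types table.
SUPPORTED_MIME_PATTERNS = [
    "application/pdf",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "application/msword",
    "application/json",
    "text/*",
    "image/*",
]


def is_supported_mime(mime_type: str) -> bool:
    """Return True if mime_type matches one of the documented patterns.

    Hypothetical client-side helper: the server remains the authority on
    what it accepts. Parameters such as "; charset=utf-8" are ignored.
    """
    normalized = mime_type.split(";", 1)[0].strip().lower()
    return any(fnmatchcase(normalized, p) for p in SUPPORTED_MIME_PATTERNS)
```

For example, `is_supported_mime("text/csv; charset=utf-8")` matches the text/* row, while `is_supported_mime("video/mp4")` matches nothing in the table.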
Limits
- Max file size: 20 MB per file
- Max files per request: 20
- Max pages (PDF): 50 (configurable via options.max_pages)
- Request timeout: 110 seconds
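These limits can be checked on the client before a request is sent. The sketch below is one possible pre-flight check, under two assumptions the text above does not settle: that "20 MB" means 20,000,000 bytes (the stricter reading) and that the limit applies to the raw file size rather than the base64-encoded size. The function name `check_limits` is hypothetical.

```python
# Assumption: "20 MB" is read as 20,000,000 bytes (the stricter interpretation).
MAX_FILE_BYTES = 20 * 1000 * 1000
MAX_FILES = 20
MAX_PAGES_CAP = 50


def check_limits(file_sizes, max_pages=None):
    """Return a list of problems found; an empty list means the request fits.

    file_sizes: raw size in bytes of each file to be sent.
    max_pages:  intended options.max_pages value, or None to use the default.
    """
    problems = []
    if not 1 <= len(file_sizes) <= MAX_FILES:
        problems.append(f"expected 1..{MAX_FILES} files, got {len(file_sizes)}")
    for index, size in enumerate(file_sizes):
        if size > MAX_FILE_BYTES:
            problems.append(f"file {index} is {size} bytes, over {MAX_FILE_BYTES}")
    if max_pages is not None and not 1 <= max_pages <= MAX_PAGES_CAP:
        problems.append(f"max_pages must be 1..{MAX_PAGES_CAP}, got {max_pages}")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem at once.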
1. The Request
Method: POST
Endpoint: https://api.sciforium.com/api/attachments/parse
Content-Type: application/json
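As an illustration, the method, endpoint, and content type above can be combined into a request object using only the Python standard library. This is a sketch rather than an official client: the function names are hypothetical, authentication headers are not covered in this section and are therefore omitted, and the client-side timeout of 120 seconds is an assumption chosen to sit slightly above the documented 110-second request timeout.

```python
import json
import urllib.request

PARSE_URL = "https://api.sciforium.com/api/attachments/parse"


def build_http_request(body: dict) -> urllib.request.Request:
    """Wrap a JSON-serializable body in a POST request for the parse endpoint."""
    return urllib.request.Request(
        PARSE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout_seconds: float = 120.0) -> dict:
    """Send the request and decode the JSON response body.

    Assumption: the client timeout is set a little above the server's
    110-second limit so the server, not the client, ends a slow request.
    """
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes the headers and body easy to inspect without touching the network.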
Request fields
| Field | Type | Required | Description |
|---|---|---|---|
| files | array | Yes | 1..20 file objects |
| files[].url | string | Yes | Data URI: data:<mime>;base64,<bytes> |
| files[].filename | string | Yes | File name (max 255 chars) |
| files[].media_type | string | No | MIME type hint |
| options.max_pages | integer | No | 1..50, default 50 |
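The fields above might be assembled as follows. This sketch assumes the dotted name options.max_pages corresponds to a nested JSON object (`{"options": {"max_pages": ...}}`), which the table implies but does not state outright; the helper names are hypothetical.

```python
import base64


def build_file_entry(filename: str, data: bytes, media_type: str) -> dict:
    """Build one element of the files array from raw file bytes."""
    if len(filename) > 255:
        raise ValueError("filename exceeds the documented 255-character limit")
    encoded = base64.b64encode(data).decode("ascii")
    return {
        # Data URI in the documented form: data:<mime>;base64,<bytes>
        "url": f"data:{media_type};base64,{encoded}",
        "filename": filename,
        "media_type": media_type,  # optional hint, included here for clarity
    }


def build_parse_body(files: list, max_pages=None) -> dict:
    """Build the request body; options is omitted unless max_pages is given.

    Assumption: options.max_pages is sent as a nested "options" object.
    """
    body = {"files": files}
    if max_pages is not None:
        body["options"] = {"max_pages": max_pages}
    return body
```

A body for a single text file could then be built with `build_parse_body([build_file_entry("notes.txt", b"hello", "text/plain")], max_pages=10)`.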
Example CURL Request (PDF)
Example response — POST /api/attachments/parse
200 OK — Content-Type: application/json