curl --request GET \
--url https://api.chunkr.ai/tasks/{task_id}/parse \
--header 'Authorization: <api-key>'{
"completed": true,
"configuration": {
"chunk_processing": {
"ignore_headers_and_footers": null,
"target_length": 4096,
"tokenizer": {
"Enum": "Word"
}
},
"error_handling": "Fail",
"ocr_strategy": "All",
"pipeline": "Chunkr",
"segment_processing": {
"Caption": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Footnote": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"FormRegion": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Formula": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"GraphicalItem": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Legend": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"LineNumber": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"ListItem": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Page": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"PageFooter": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"PageHeader": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"PageNumber": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Picture": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Table": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Text": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Title": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Unknown": {
"description": null,
"extended_context": null,
"llm": "<string>"
}
},
"segmentation_strategy": "LayoutAnalysis"
},
"created_at": "2023-11-07T05:31:56Z",
"file_info": {
"url": "<string>",
"mime_type": "<string>",
"name": "<string>",
"page_count": 1,
"ss_cell_count": 1
},
"message": "<string>",
"task_id": "<string>",
"version_info": {
"client_version": "Legacy",
"server_version": "<string>"
},
"expires_at": "2023-11-07T05:31:56Z",
"finished_at": "2023-11-07T05:31:56Z",
"input_file_url": "<string>",
"output": {
"chunks": [
{
"chunk_length": 1,
"segments": [
{
"bbox": {
"height": 123,
"left": 123,
"top": 123,
"width": 123
},
"page_height": 123,
"page_number": 1,
"page_width": 123,
"segment_id": "<string>",
"confidence": 123,
"content": "<string>",
"description": "<string>",
"embed": "<string>",
"image": "<string>",
"llm": "<string>",
"ocr": [
{
"bbox": {
"height": 123,
"left": 123,
"top": 123,
"width": 123
},
"text": "<string>",
"confidence": 123,
"ocr_id": "<string>",
"ss_cell_ref": "<string>"
}
],
"segment_length": 1,
"ss_cells": [
{
"cell_id": "<string>",
"range": "<string>",
"text": "<string>",
"formula": "<string>",
"hyperlink": "<string>",
"style": {
"bg_color": "<string>",
"font_face": "<string>",
"is_bold": true,
"text_color": "<string>"
},
"value": "<string>"
}
],
"ss_header_bbox": {
"height": 123,
"left": 123,
"top": 123,
"width": 123
},
"ss_header_ocr": [
{
"bbox": {
"height": 123,
"left": 123,
"top": 123,
"width": 123
},
"text": "<string>",
"confidence": 123,
"ocr_id": "<string>",
"ss_cell_ref": "<string>"
}
],
"ss_header_range": "<string>",
"ss_header_text": "<string>",
"ss_range": "<string>",
"ss_sheet_name": "<string>",
"text": "<string>"
}
],
"chunk_id": "<string>",
"content": "<string>",
"embed": "<string>"
}
],
"file_name": "<string>",
"mime_type": "<string>",
"page_count": 1,
"pages": [
{
"image": "<string>",
"page_height": 123,
"page_number": 1,
"page_width": 123,
"dpi": 123,
"ss_sheet_name": "<string>"
}
],
"pdf_url": "<string>"
},
"started_at": "2023-11-07T05:31:56Z",
"task_url": "<string>"
}Get Parse Task
Retrieves the current state of a parse task.
Returns task details such as processing status, configuration, output (when available), file metadata, and timestamps.
Typical uses:
- Poll a task during processing
- Retrieve the final output once processing is complete
- Access task metadata and configuration
curl --request GET \
--url https://api.chunkr.ai/tasks/{task_id}/parse \
--header 'Authorization: <api-key>'{
"completed": true,
"configuration": {
"chunk_processing": {
"ignore_headers_and_footers": null,
"target_length": 4096,
"tokenizer": {
"Enum": "Word"
}
},
"error_handling": "Fail",
"ocr_strategy": "All",
"pipeline": "Chunkr",
"segment_processing": {
"Caption": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Footnote": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"FormRegion": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Formula": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"GraphicalItem": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Legend": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"LineNumber": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"ListItem": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Page": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"PageFooter": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"PageHeader": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"PageNumber": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Picture": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Table": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Text": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Title": {
"description": null,
"extended_context": null,
"llm": "<string>"
},
"Unknown": {
"description": null,
"extended_context": null,
"llm": "<string>"
}
},
"segmentation_strategy": "LayoutAnalysis"
},
"created_at": "2023-11-07T05:31:56Z",
"file_info": {
"url": "<string>",
"mime_type": "<string>",
"name": "<string>",
"page_count": 1,
"ss_cell_count": 1
},
"message": "<string>",
"task_id": "<string>",
"version_info": {
"client_version": "Legacy",
"server_version": "<string>"
},
"expires_at": "2023-11-07T05:31:56Z",
"finished_at": "2023-11-07T05:31:56Z",
"input_file_url": "<string>",
"output": {
"chunks": [
{
"chunk_length": 1,
"segments": [
{
"bbox": {
"height": 123,
"left": 123,
"top": 123,
"width": 123
},
"page_height": 123,
"page_number": 1,
"page_width": 123,
"segment_id": "<string>",
"confidence": 123,
"content": "<string>",
"description": "<string>",
"embed": "<string>",
"image": "<string>",
"llm": "<string>",
"ocr": [
{
"bbox": {
"height": 123,
"left": 123,
"top": 123,
"width": 123
},
"text": "<string>",
"confidence": 123,
"ocr_id": "<string>",
"ss_cell_ref": "<string>"
}
],
"segment_length": 1,
"ss_cells": [
{
"cell_id": "<string>",
"range": "<string>",
"text": "<string>",
"formula": "<string>",
"hyperlink": "<string>",
"style": {
"bg_color": "<string>",
"font_face": "<string>",
"is_bold": true,
"text_color": "<string>"
},
"value": "<string>"
}
],
"ss_header_bbox": {
"height": 123,
"left": 123,
"top": 123,
"width": 123
},
"ss_header_ocr": [
{
"bbox": {
"height": 123,
"left": 123,
"top": 123,
"width": 123
},
"text": "<string>",
"confidence": 123,
"ocr_id": "<string>",
"ss_cell_ref": "<string>"
}
],
"ss_header_range": "<string>",
"ss_header_text": "<string>",
"ss_range": "<string>",
"ss_sheet_name": "<string>",
"text": "<string>"
}
],
"chunk_id": "<string>",
"content": "<string>",
"embed": "<string>"
}
],
"file_name": "<string>",
"mime_type": "<string>",
"page_count": 1,
"pages": [
{
"image": "<string>",
"page_height": 123,
"page_number": 1,
"page_width": 123,
"dpi": 123,
"ss_sheet_name": "<string>"
}
],
"pdf_url": "<string>"
},
"started_at": "2023-11-07T05:31:56Z",
"task_url": "<string>"
}Authorizations
Path Parameters
Id of the task to retrieve
Query Parameters
Whether to return base64 encoded URLs. If false, the URLs will be returned as presigned URLs.
Whether to include chunks in the output response
Response
Task details.
True when the task reaches a terminal state i.e. status is Succeeded or Failed or Cancelled
Show child attributes
Show child attributes
The date and time when the task was created and queued.
Information about the input file.
Show child attributes
Show child attributes
A message describing the task's status or any errors that occurred.
The status of the task.
Starting, Processing, Succeeded, Failed, Cancelled The unique identifier for the task.
Parse, Extract Version information for the task.
Show child attributes
Show child attributes
The date and time when the task will expire.
The date and time when the task was finished.
The presigned URL of the input file.
Deprecated use file_info.url instead.
The processed results of a document parsing task
Show child attributes
Show child attributes
The date and time when the task was started.
The presigned URL of the task.