Extract information from one or more documents.

Given a document, this endpoint extracts information from it using the specified model.

Supported document formats are:
- pdf
- jpeg / jpg
- png
- tiff
- webp
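
As a minimal sketch, a client could check a file's extension against the list above before uploading. The helper name `is_supported_document` is hypothetical, and the check looks only at the extension, not at the file contents.

```python
from pathlib import Path

# Extensions taken from the supported-formats list above.
SUPPORTED_EXTENSIONS = {".pdf", ".jpeg", ".jpg", ".png", ".tiff", ".webp"}


def is_supported_document(filename: str) -> bool:
    """Return True if the file extension matches one of the listed formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported_document("invoice.PDF")` is true because the comparison is case-insensitive, while a `.docx` file is rejected.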

Path Parameters
  • organization_id integer required

    The unique ID of your workspace

  • project_id integer required

    The unique ID of your project

  • model_id integer required

    The unique ID of your trained AI model

Query Parameters
  • return_bboxes string

    Whether to return bounding boxes

  • return_annotated_pages string

    Whether to return an image of each page, annotated with the matches found, as a base64-encoded string

Request Body required
  • file binary required

    The file to upload
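
The sketch below shows one way to assemble the request URL from the path and query parameters above. The endpoint path is not given in this section, so the caller supplies a URL template. The helper name `build_extract_url`, the example host, the example path layout, and the `"true"` value are all assumptions for illustration.

```python
from typing import Optional
from urllib.parse import urlencode


def build_extract_url(
    url_template: str,
    organization_id: int,
    project_id: int,
    model_id: int,
    return_bboxes: Optional[str] = None,
    return_annotated_pages: Optional[str] = None,
) -> str:
    """Fill the path parameters and append only the query parameters that were set."""
    url = url_template.format(
        organization_id=organization_id,
        project_id=project_id,
        model_id=model_id,
    )
    query = {}
    if return_bboxes is not None:
        query["return_bboxes"] = return_bboxes
    if return_annotated_pages is not None:
        query["return_annotated_pages"] = return_annotated_pages
    return f"{url}?{urlencode(query)}" if query else url


# Hypothetical template: replace with the real endpoint path for your account.
EXAMPLE_TEMPLATE = (
    "https://api.example.com/organizations/{organization_id}"
    "/projects/{project_id}/models/{model_id}/extract"
)
url = build_extract_url(EXAMPLE_TEMPLATE, 1, 2, 3, return_bboxes="true")
```

The document itself would then be sent in the request body under the `file` field using an HTTP client of your choice. The body encoding is not stated here, so check the endpoint's request details before relying on a particular one.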

Responses

OK


Schema
  • model string

    Name of the model used to extract the information

  • all_required_fields_found boolean

    Whether all required labels were found while processing the documents

  • all_confidence_thresholds_met boolean

    Whether the expected confidence was met for every label while processing the documents

  • all_data_conversion_passed boolean

    Whether all data was converted according to the label configuration while processing the documents

  • total_credits_used integer

    Total credits consumed by the extraction

  • documents object[]

    Information extracted from the documents, per document

  • name string

    Name of the document

  • all_required_fields_found boolean

    Whether all required labels were found while processing this specific document

  • all_confidence_thresholds_met boolean

    Whether the expected confidence was met for every label while processing this specific document

  • all_data_conversion_passed boolean

    Whether all data was converted according to the label configuration while processing this specific document

  • credits_used integer

    Total credits consumed by the extraction for this specific document

  • matches object[]

    All the information extracted, per document

  • bboxes array[]

    All the bounding boxes found, per document

  • annotated_pages string[]

    All the document pages with annotated data

  • hitl object

    Information about the human-in-the-loop (HITL) review

  • review_url string

    The API endpoint from which to retrieve the review

  • status string

    The HITL status (pending or completed)

  • result object[]

    All the issues found by the HITL review

  • confidences object[]

    All the matches below the expected confidence threshold

  • label string

    Label associated with the value

  • value string

    Value found

  • expected_confidence number

    Expected confidence for the label

  • found number

    Found confidence for the label

  • labels object[]

    All project labels and configuration for each label

  • id integer

    The ID of the label

  • name string

    The label's name.
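
As a rough illustration of how a client might read this schema, the sketch below picks out documents whose status flags are not all true and reports how far each entry in `confidences` fell short of its expected value. It assumes the per-document fields are nested under `documents`, which the flattened listing above suggests but does not state. The function names and the sample values are hypothetical.

```python
from typing import Any

DOCUMENT_FLAGS = (
    "all_required_fields_found",
    "all_confidence_thresholds_met",
    "all_data_conversion_passed",
)


def documents_needing_attention(response: dict[str, Any]) -> list[str]:
    """Return the names of documents where at least one status flag is not true."""
    return [
        doc.get("name", "")
        for doc in response.get("documents", [])
        if not all(doc.get(flag, False) for flag in DOCUMENT_FLAGS)
    ]


def confidence_shortfalls(confidences: list[dict[str, Any]]) -> list[tuple[str, float]]:
    """Pair each label with the gap between its expected and found confidence."""
    return [
        (entry["label"], entry["expected_confidence"] - entry["found"])
        for entry in confidences
    ]


# Made-up sample data, shaped like the schema above (assumed nesting).
sample_response = {
    "model": "invoices",
    "documents": [
        {
            "name": "a.pdf",
            "all_required_fields_found": True,
            "all_confidence_thresholds_met": True,
            "all_data_conversion_passed": True,
        },
        {
            "name": "scan.png",
            "all_required_fields_found": True,
            "all_confidence_thresholds_met": False,
            "all_data_conversion_passed": True,
        },
    ],
}
flagged = documents_needing_attention(sample_response)
```

A missing flag is treated as false here, so incomplete responses are flagged for review rather than silently passing.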
