Breakdown of the Structure:
{
"message": "Success", // overall status of the request
"result": [ // result array carrying the extracted data
{
"message": "Success", // status of the request at the file level
"input": "Multi-Table Image Sample 2.png", // name of the input file
"prediction": [ // prediction array carrying the extracted data
{
"id": "0f051447-c4f3-4f6b-bf61-21d2a6346c0a",
"label": "table", // table-level information, starting with the first table
"xmin": 5, // co-ordinates of the image where the first table is present
"ymin": 9,
"xmax": 967,
"ymax": 209,
"score": 1, // confidence score indicating the probability that the data was extracted correctly
"ocr_text": "table",
"type": "table", // "type" indicates tabular data, can be used to differentiate between key-value pairs and tabular data
"cells": [ // extracted cell-level data in the array object "cells"
{
"id": "470c04f8-c621-475e-8c54-b1550c0cbf3a",
"row": 1, // row and column numbers to distinguish the cell
"col": 1,
"row_span": 1,
"col_span": 1,
"label": "", // "label" contains the name of the data point extracted from the cell, usually empty in case of tables
"xmin": 5, // co-ordinates of the cell on the page
"ymin": 9,
"xmax": 316,
"ymax": 34,
"score": 0.73095703, // confidence score indicating the accuracy of extraction
"text": "Original Performance", // extracted text from the cell
"row_label": "",
"verification_status": "correctly_predicted", // status message indicating the accuracy of extraction
"status": "",
"failed_validation": "", // flag to identify whether the cell passed the validation rules in the workflow
"label_id": "",
"lookup_edited": false
},
...
... // above code block repeats for all cells
]
...
... // above code block repeats for all tables
- Top-level Fields:
  - message: Indicates the overall status of the request.
  - result: Contains the main content of the response, which includes predictions for detected tables.
    - Each element represents a detection result for a specific input image ("input" field).
- Detection Object (result[i]):
  ◦ message: Status message for this specific detection result.
  ◦ input: Name of the input image file processed.
  ◦ prediction: Array of predictions for tables detected in the image.
- Prediction Object (prediction[j]):
  ◦ id: Unique identifier for this prediction.
  ◦ label: Indicates the type of object detected ("table" in this case).
  ◦ xmin, ymin, xmax, ymax: Coordinates indicating the bounding box of the detected table.
  ◦ score: Confidence score for the detection.
  ◦ ocr_text: Text recognised by OCR for this object (here it's "table").
  ◦ type: Type of the detected object ("table").
  ◦ cells: Array containing details of cells detected within the table.
- Cell Object (cells[k]): Detailed information about each cell detected within the table, including:
  ◦ id: Unique identifier for the cell.
  ◦ row, col: Row and column indices of the cell.
  ◦ row_span, col_span: Span of rows and columns occupied by the cell.
  ◦ xmin, ymin, xmax, ymax: Bounding box coordinates of the cell.
  ◦ score: Confidence score for OCR text recognition within the cell.
  ◦ text: Recognised text content of the cell.
  ◦ row_label: Additional labeling information if applicable.
  ◦ verification_status, status, failed_validation, label_id, lookup_edited: Various attributes related to verification and editing status.
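The nesting described above (result, then prediction, then cells) can be walked with a few lines of Python. This is a minimal sketch, assuming the response has already been parsed into a dict (for example with json.loads). The helper name iter_cells and the trimmed sample payload are illustrative and not part of the API; only the field names come from the structure above.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_cells(response: Dict[str, Any]) -> Iterator[Tuple[str, str, Dict[str, Any]]]:
    """Yield (input file name, table id, cell) for every cell in the response.

    iter_cells is a hypothetical helper; only the field names come from
    the documented response structure.
    """
    for detection in response.get("result", []):
        for prediction in detection.get("prediction", []):
            for cell in prediction.get("cells", []):
                yield detection.get("input", ""), prediction.get("id", ""), cell


# Trimmed sample built from the documented fields (one table, one cell).
sample = {
    "message": "Success",
    "result": [{
        "message": "Success",
        "input": "Multi-Table Image Sample 2.png",
        "prediction": [{
            "id": "0f051447-c4f3-4f6b-bf61-21d2a6346c0a",
            "label": "table",
            "type": "table",
            "cells": [{"row": 1, "col": 1, "text": "Original Performance",
                       "score": 0.73095703}],
        }],
    }],
}

for file_name, table_id, cell in iter_cells(sample):
    print(file_name, table_id, cell["row"], cell["col"], cell["text"])
```

Using .get() with empty defaults keeps the loop from failing when a level of the structure is missing, at the cost of silently skipping malformed entries.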
- Additional Notes:
  • This structure is designed to provide detailed information about each detected table and its cells, facilitating further processing or display of structured data extracted from the image.
  • Each detected table comes with its own Prediction Object, whose label is always "table".
  • Depending on the API's capabilities and your requirements, there may be additional fields or variations in the structure of the response, particularly if there are workflows engaged or additional metadata to be extracted.
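Since the response may carry other kinds of predictions or omit some fields, client code can select tables by "type" and read everything else defensively. The following is a minimal sketch, assuming predictions are plain dicts. select_tables is a hypothetical helper, and the non-table entry in the sample (with "type": "field") is invented purely to exercise the filter, because the value used for key-value predictions is not shown in this section.

```python
from typing import Any, Dict, List


def select_tables(predictions: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only predictions whose "type" is "table" (hypothetical helper)."""
    return [p for p in predictions if p.get("type") == "table"]


predictions = [
    {"id": "0f051447-c4f3-4f6b-bf61-21d2a6346c0a", "label": "table",
     "type": "table", "score": 1, "cells": []},
    # Invented non-table entry: the real "type" value for key-value
    # predictions is an assumption and may differ.
    {"id": "example-id", "label": "invoice_number", "type": "field"},
]

tables = select_tables(predictions)
for table in tables:
    # .get() tolerates fields that a given response may omit.
    print(table["id"], table.get("page_no"), len(table.get("cells", [])))
```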
Example:
{
  "message": "Success",
  "result": [
    {
      "message": "Success",
      "input": "Multi-Table Image Sample 2.png",
      "prediction": [
        {
          "id": "0f051447-c4f3-4f6b-bf61-21d2a6346c0a",
          "label": "table",
          "xmin": 5, "ymin": 9, "xmax": 967, "ymax": 209,
          "score": 1,
          "ocr_text": "table",
          "type": "table",
          "cells": [
            {
              "id": "470c04f8-c621-475e-8c54-b1550c0cbf3a",
              "row": 1, "col": 1, "row_span": 1, "col_span": 1,
              "label": "",
              "xmin": 5, "ymin": 9, "xmax": 316, "ymax": 34,
              "score": 0.73095703,
              "text": "Original Performance",
              "row_label": "",
              "verification_status": "correctly_predicted",
              "status": "", "failed_validation": "", "label_id": "", "lookup_edited": false
            },
            {
              "id": "97d8379d-116a-4558-a6e0-82e2252df734",
              "row": 1, "col": 2, "row_span": 1, "col_span": 1,
              "label": "",
              "xmin": 316, "ymin": 10, "xmax": 370, "ymax": 34,
              "score": 0.73095703,
              "text": "",
              "row_label": "",
              "verification_status": "correctly_predicted",
              "status": "", "failed_validation": "", "label_id": "", "lookup_edited": false
            },
            ......
            ......
            {
              "id": "e713bf83-b55e-4ca1-a4de-751478eeffbf",
              "row": 9, "col": 14, "row_span": 1, "col_span": 1,
              "label": "",
              "xmin": 937, "ymin": 187, "xmax": 966, "ymax": 209,
              "score": 0.8413086,
              "text": "",
              "row_label": "",
              "verification_status": "correctly_predicted",
              "status": "", "failed_validation": "", "label_id": "", "lookup_edited": false
            }
          ],
          "status": "correctly_predicted",
          "page_no": 0,
          "label_id": "",
          "lookup_edited": false,
          "lookup_parent_id": ""
        },
        {
          "id": "a13ccfba-f7a9-4fe2-8915-bc15e54500a8",
          "label": "table",
          "xmin": 6, "ymin": 222, "xmax": 964, "ymax": 557,
          "score": 1,
          "ocr_text": "table",
          "type": "table",
          "cells": [
            {
              "id": "8b9c1b63-21c5-4df3-927b-474126eefa4a",
              "row": 1, "col": 1, "row_span": 1, "col_span": 1,
              "label": "",
              "xmin": 6, "ymin": 222, "xmax": 307, "ymax": 245,
              "score": 0.91259766,
              "text": "ShortDescription",
              "row_label": "",
              "verification_status": "correctly_predicted",
              "status": "", "failed_validation": "", "label_id": "", "lookup_edited": false
            },
For instance, this JSON response can be read as follows:
- "label": "table" marks the prediction as a table; the first prediction object describes the first table.
- "xmin", "xmax", "ymin", "ymax" denote the coordinates of the table on the image.
- "cells" is an array containing information about each individual cell. The cell at "row": 1, "col": 1 has "text": "Original Performance", and so on, up to the cell at "row": 9, "col": 14, whose text is empty.
- "id": "a13ccfba-f7a9-4fe2-8915-bc15e54500a8" is the unique identifier of the second table, which is the second prediction object of "type": "table". It also contains a "cells" array that details each individual cell in this table.
- For example, the cell at "row": 1, "col": 1 of this table contains "text": "ShortDescription", and so on.
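The row and column numbers walked through above are enough to rebuild a table as a grid of strings. The following is a minimal sketch, assuming "row" and "col" start at 1 as they do in the example response. cells_to_grid is a hypothetical helper, and it places the text of a spanning cell only at its top-left position, which is one possible convention rather than documented behavior.

```python
from typing import Any, Dict, List


def cells_to_grid(cells: List[Dict[str, Any]]) -> List[List[str]]:
    """Arrange cell texts into a 2D list using 1-based "row"/"col" values.

    Hypothetical helper: a spanning cell's text is placed only at its
    top-left position; the other covered positions stay empty.
    """
    if not cells:
        return []
    # Grid size is the furthest row/column reached by any cell, spans included.
    n_rows = max(c["row"] + c.get("row_span", 1) - 1 for c in cells)
    n_cols = max(c["col"] + c.get("col_span", 1) - 1 for c in cells)
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for c in cells:
        grid[c["row"] - 1][c["col"] - 1] = c.get("text", "")
    return grid


# Two cells taken from the first table of the example response.
cells = [
    {"row": 1, "col": 1, "row_span": 1, "col_span": 1,
     "text": "Original Performance"},
    {"row": 1, "col": 2, "row_span": 1, "col_span": 1, "text": ""},
]

grid = cells_to_grid(cells)
print(grid)  # [['Original Performance', '']]
```

Applied to every cell of the first table, the same function would produce a grid of 9 rows and 14 columns, since the last cell shown sits at "row": 9, "col": 14.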
-