Overview
This feature lets you automatically classify documents and route each one to the appropriate OCR model.
For example:
- Say you want to discard all non-invoice documents that your users send to Nanonets. You can do this by creating a document classification model and routing only the invoice documents to the correct OCR model.
- You can also route different document types, such as receipts, invoices and purchase orders, to the correct OCR models. Create a document classification model with 3 labels, one for each document type, and then select which OCR model the documents for each label should flow to.
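The routing idea in the examples above can be sketched in a few lines of Python. This only illustrates the concept and is not Nanonets code: the `ROUTES` table, the placeholder model IDs, the `other` label and the `route` function are all hypothetical, and `None` stands in for discarding a document.

```python
from typing import Optional

# Hypothetical routing table: classification label -> OCR model ID.
# The IDs are placeholders; None means the document is discarded.
ROUTES = {
    "invoices": "INVOICE_OCR_MODEL_ID",
    "receipts": "RECEIPT_OCR_MODEL_ID",
    "purchase_orders": "PO_OCR_MODEL_ID",
    "other": None,
}


def route(label: str) -> Optional[str]:
    """Return the OCR model ID for a label, or None to discard the document."""
    return ROUTES.get(label)


for label in ("invoices", "other"):
    target = route(label)
    print(label, "->", target if target else "discarded")
```

In the product itself this mapping is configured in the UI when you create the labels, so no such table needs to be written by hand.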
Creating a Document Router Model
- Create a Document Classification Model
  - Head over to https://app.nanonets.com/#/
  - Scroll down to the Document Sorting section
  - Click on the Document Classification Model Card
- Create Labels
  - Create the necessary labels
    - For example: invoices, receipts
  - Choose the respective OCR models that you want to route the documents to
    - For example: the receipt and AP workflow pretrained models
- Train Model
  - Upload 10 images to each category
    - For example: 10 images of receipts, 10 images of invoices
  - Hit Train
  - Your model should finish training within about 1 hour
Integration
- Once you have created and trained your own model, you have a few options to integrate:
  - Email - You can send documents to a dedicated email inbox. To get the address, head over to https://app.nanonets.com/#/ic/extract/MODEL_ID, click the upload file button and choose the email option. All attachments sent in an email will be processed, and each attachment will then be routed to the respective OCR model.
  - API - You can send files directly to the Document Router API, documented at https://nanonets.com/documentation/#operation/ImageCategorizationLabelFilePost. As with email, every document sent in this POST API call will be sent to the respective OCR model.
    - Response Format
      - The OCR model results will be present in the `data_extraction_result` key
      - This structure will be the same as the OCR model results
{
"message": "Success",
"result": [{
"message": "Success",
"prediction": [{
"label": "receipts",
"probability": 0.836599
},
{
"label": "invoices",
"probability": 0.16340101
}
],
"file": "00ba7aad-bd43-4449-b649-add832b325ae.jpeg",
"page": 0,
"label": "receipts"
}],
"signed_urls": {
"uploadedfiles/06f748b9-350b-46b2-ac8a-dfae2298b09c/7e096fca-3315-4cf8-83cc-66641252bc5b.jpeg": {
"original": "https://nnts.imgix.net/uploadedfiles/06f748b9-350b-46b2-ac8a-dfae2298b09c/7e096fca-3315-4cf8-83cc-66641252bc5b.jpeg?expires=1665595218&or=0&s=d8c571407eb57941f84b7e4f8abba1b2",
"original_compressed": "https://nnts.imgix.net/uploadedfiles/06f748b9-350b-46b2-ac8a-dfae2298b09c/7e096fca-3315-4cf8-83cc-66641252bc5b.jpeg?auto=compress&expires=1665595218&or=0&s=481d4f025c664175cdd23f902efe496c",
"thumbnail": "https://nnts.imgix.net/uploadedfiles/06f748b9-350b-46b2-ac8a-dfae2298b09c/7e096fca-3315-4cf8-83cc-66641252bc5b.jpeg?auto=compress&expires=1665595218&w=240&s=7f4323f31fa714b42f1757df50ebc50a",
"acw_rotate_90": "https://nnts.imgix.net/uploadedfiles/06f748b9-350b-46b2-ac8a-dfae2298b09c/7e096fca-3315-4cf8-83cc-66641252bc5b.jpeg?auto=compress&expires=1665595218&or=270&s=f387fb219d5cabdcf55a2a6e4f588e7b",
"acw_rotate_180": "https://nnts.imgix.net/uploadedfiles/06f748b9-350b-46b2-ac8a-dfae2298b09c/7e096fca-3315-4cf8-83cc-66641252bc5b.jpeg?auto=compress&expires=1665595218&or=180&s=b9f8ec62e6e91cf7c8f80d3aa024b85a",
"acw_rotate_270": "https://nnts.imgix.net/uploadedfiles/06f748b9-350b-46b2-ac8a-dfae2298b09c/7e096fca-3315-4cf8-83cc-66641252bc5b.jpeg?auto=compress&expires=1665595218&or=90&s=97c6299c60a8456386d5b4a423a1f8ab",
"original_with_long_expiry": "https://nnts.imgix.net/uploadedfiles/06f748b9-350b-46b2-ac8a-dfae2298b09c/7e096fca-3315-4cf8-83cc-66641252bc5b.jpeg?expires=1681132818&or=0&s=1b380a6e0a65b5edfb9c38d2fbed4d6d"
}
},
"data_extraction_result": {
"message": "Success",
"result": [{
"message": "Success",
"input": "00ba7aad-bd43-4449-b649-add832b325ae.jpeg",
"prediction": [],
"page": 0,
"request_file_id": "55272e5a-44b9-4a57-9a55-11e99eee0960",
"filepath": "PredictionImages/",
"id": "9dfe75f5-4a30-11ed-8bda-96810894b27e",
"rotation": 0,
"file_url": "uploadedfiles/09d205e6-7283-44be-a360-8428b410233a/RawPredictions/00ba7aad-bd43-4449-b649-add832b325ae-2022-10-12T13-20-18.811.jpeg",
"request_metadata": ""
}],
"signed_urls": {
"PredictionImages/": {
"original": "https://nnts.imgix.net/PredictionImages/?expires=1665595219&or=0&s=cb21580e3ff217d04eac31421505215b",
"original_compressed": "https://nnts.imgix.net/PredictionImages/?auto=compress&expires=1665595219&or=0&s=6d8a3328edc6646279b7b579d12354a8",
"thumbnail": "https://nnts.imgix.net/PredictionImages/?auto=compress&expires=1665595219&w=240&s=d2323f8f0f5f0ed1b6e0e37ef87f5fc3",
"acw_rotate_90": "https://nnts.imgix.net/PredictionImages/?auto=compress&expires=1665595219&or=270&s=1e75b08dfa73df680eb09f640081c17f",
"acw_rotate_180": "https://nnts.imgix.net/PredictionImages/?auto=compress&expires=1665595219&or=180&s=931f27a6d764cc548d897e07b013d23d",
"acw_rotate_270": "https://nnts.imgix.net/PredictionImages/?auto=compress&expires=1665595219&or=90&s=30915233f2b024fb6954d50565e4889b",
"original_with_long_expiry": "https://nnts.imgix.net/PredictionImages/?expires=1681132819&or=0&s=5b112c38944b793a22cb191943be68b7"
},
"uploadedfiles/09d205e6-7283-44be-a360-8428b410233a/RawPredictions/00ba7aad-bd43-4449-b649-add832b325ae-2022-10-12T13-20-18.811.jpeg": {
"original": "https://nanonets.s3.us-west-2.amazonaws.com/uploadedfiles/09d205e6-7283-44be-a360-8428b410233a/RawPredictions/00ba7aad-bd43-4449-b649-add832b325ae-2022-10-12T13-20-18.811.jpeg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA5F4WPNNTLX3QHN4W%2F20221012%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20221012T132019Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&response-cache-control=no-cache&X-Amz-Signature=639c02ffa9af432ac731ad48c0d68c28f7ad068cd65a0d05282ad45c494fad30",
"original_compressed": "",
"thumbnail": "",
"acw_rotate_90": "",
"acw_rotate_180": "",
"acw_rotate_270": "",
"original_with_long_expiry": ""
}
}
}
}
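As a minimal sketch of how a client might read the response above, the snippet below pulls out the label chosen for each page and the nested OCR results. It assumes the response has exactly the shape of the sample. The helper names `summarize` and `extraction_results` are hypothetical, and `sample` is a trimmed copy of the JSON above with the signed URLs left out.

```python
import json

# Trimmed copy of the sample response above (signed URLs omitted).
sample = json.loads("""
{
  "message": "Success",
  "result": [{
    "message": "Success",
    "prediction": [
      {"label": "receipts", "probability": 0.836599},
      {"label": "invoices", "probability": 0.16340101}
    ],
    "file": "00ba7aad-bd43-4449-b649-add832b325ae.jpeg",
    "page": 0,
    "label": "receipts"
  }],
  "data_extraction_result": {
    "message": "Success",
    "result": [{
      "message": "Success",
      "input": "00ba7aad-bd43-4449-b649-add832b325ae.jpeg",
      "prediction": [],
      "page": 0
    }]
  }
}
""")


def summarize(response):
    """Collect, per classified page, the chosen label and its probability."""
    pages = []
    for item in response.get("result", []):
        # Probability reported for the label that was chosen for this page.
        prob = next(
            (p.get("probability") for p in item.get("prediction", [])
             if p.get("label") == item.get("label")),
            None,
        )
        pages.append({"file": item.get("file"), "page": item.get("page"),
                      "label": item.get("label"), "probability": prob})
    return pages


def extraction_results(response):
    """Return the OCR model results nested under `data_extraction_result`."""
    return response.get("data_extraction_result", {}).get("result", [])


print(summarize(sample))
print(len(extraction_results(sample)))
```

Using `.get` with defaults keeps the helpers from raising if a key is missing, for instance when a document was discarded and no OCR results are attached. Whether the key is actually absent in that case is an assumption, not something the sample confirms.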
Behaviours and Caveats
- If you want to discard documents, use the `Do Nothing` option available in Step #1 while creating the model
- If files are uploaded to the Document Classification Model via email, each document will be automatically routed to the OCR model. We recommend using the webhooks export in the OCR model to retrieve the results
- The Document Classification data will also be present in the responses of the OCR model Get APIs
- The classification data will be present in the `page_classification_results` key
{
"moderated_images_count": 1,
"unmoderated_images_count": 1,
"moderated_images": [{
"model_id": "05618833-0469-4fd2-a5fc-8e4a61e64486",
"day_since_epoch": 19200,
"is_moderated": true,
"hour_of_day": 13,
"id": "bfcf1e47-0dad-11ed-8ede-3200f1e279c7",
"url": "uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/85975318-3bf7-4c0b-aec1-908f616b0192.jpeg",
"predicted_boxes": [],
"moderated_boxes": [],
"page_classification_results": {
"message": "Success",
"model_id": "00000000-0000-0000-0000-000000000000",
"result": [{
"file": "string",
"message": "Success",
"page": 0,
"prediction": [{
"label": "category1",
"probability": 0.9
},
{
"label": "category2",
"probability": 0.1
}
]
}]
},
"size": {
"width": 2380,
"height": 3368
},
"page": 0,
"request_file_id": "bfcf1e53-0dad-11ed-8edf-3200f1e279c7",
"original_file_name": "SAMPLE-Passed.jpg",
"custom_response": null,
"assigned_member": "",
"is_deleted": false,
"source": "api",
"no_of_fields": 30,
"cost": 0.3,
"payable_cost": 0,
"status": "success",
"export_status": "",
"retries": 0,
"rotation": 0,
"updated_at": "88ce6249-0e89-11ed-b146-3a9bc324f25c",
"verified_at": "88ce6237-0e89-11ed-b145-3a9bc324f25c",
"verified_by": "rushabh@nanonets.com",
"current_stage_id": "ffffffff-ffff-ffff-ffff-ffffffffffff",
"uploaded_by": "sahil@nanonets.com",
"upload_channel": "ui",
"file_url": "uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/RawPredictions/SAMPLE-Passed-2022-07-07T10-46-23.132.jpg",
"request_metadata": "",
"raw_ocr": [],
"delay_post_prediction_tasks": false
}],
"unmoderated_images": [{
"model_id": "05618833-0469-4fd2-a5fc-8e4a61e64486",
"day_since_epoch": 19200,
"is_moderated": false,
"hour_of_day": 13,
"id": "bfd084fc-0dad-11ed-8ee0-3200f1e279c7",
"url": "uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/ae861e25-3fe5-4727-9287-7ec166a3522b.jpeg",
"predicted_boxes": [],
"moderated_boxes": [],
"page_classification_results": {
"message": "Success",
"model_id": "00000000-0000-0000-0000-000000000000",
"result": [{
"file": "string",
"message": "Success",
"page": 0,
"prediction": [{
"label": "category1",
"probability": 0.9
},
{
"label": "category2",
"probability": 0.1
}
]
}]
},
"size": {
"width": 595,
"height": 842
},
"page": 0,
"request_file_id": "bfd08506-0dad-11ed-8ee1-3200f1e279c7",
"original_file_name": "SAMPLE-Flagged.jpg",
"custom_response": null,
"assigned_member": "sahil@nanonets.com",
"is_deleted": false,
"source": "api",
"no_of_fields": 30,
"cost": 0.3,
"payable_cost": 0,
"status": "success",
"export_status": "",
"retries": 0,
"rotation": 0,
"updated_at": "bfd084fc-0dad-11ed-8ee0-3200f1e279c7",
"verified_at": "bfd084fc-0dad-11ed-8ee0-3200f1e279c7",
"verified_by": "",
"current_stage_id": "f5934a81-d6a6-42fe-a130-246ce1e338d3",
"uploaded_by": "sahil@nanonets.com",
"upload_channel": "ui",
"file_url": "uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/RawPredictions/SAMPLE-Flagged-2022-07-07T10-46-22.158.jpg",
"request_metadata": "",
"raw_ocr": [],
"delay_post_prediction_tasks": false
}],
"signed_urls": {
"uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/85975318-3bf7-4c0b-aec1-908f616b0192.jpeg": {
"original": "https://nnts.imgix.net/uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/85975318-3bf7-4c0b-aec1-908f616b0192.jpeg?expires=1659968049&or=0&s=17e5b56292fbe00cd7277326899a13c8",
"original_compressed": "https://nnts.imgix.net/uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/85975318-3bf7-4c0b-aec1-908f616b0192.jpeg?auto=compress&expires=1659968049&or=0&s=dd77542ea92d42714da2f1e2922493f1",
"thumbnail": "https://nnts.imgix.net/uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/85975318-3bf7-4c0b-aec1-908f616b0192.jpeg?auto=compress&expires=1659968049&w=240&s=2e6c3bf0fd613cab82f0489dd84193e1",
"acw_rotate_90": "https://nnts.imgix.net/uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/85975318-3bf7-4c0b-aec1-908f616b0192.jpeg?auto=compress&expires=1659968049&or=270&s=d656532d0fbf471d7ebf14d259119679",
"acw_rotate_180": "https://nnts.imgix.net/uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/85975318-3bf7-4c0b-aec1-908f616b0192.jpeg?auto=compress&expires=1659968049&or=180&s=9b6a07be6961ee768c4c2466d84e555b",
"acw_rotate_270": "https://nnts.imgix.net/uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/85975318-3bf7-4c0b-aec1-908f616b0192.jpeg?auto=compress&expires=1659968049&or=90&s=70d6639f21150c690ce6139119a8a05a",
"original_with_long_expiry": "https://nnts.imgix.net/uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/85975318-3bf7-4c0b-aec1-908f616b0192.jpeg?expires=1675505649&or=0&s=448c749daaec494a05f6570d0d88361d"
},
"uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/ae861e25-3fe5-4727-9287-7ec166a3522b.jpeg": {
"original": "https://nnts.imgix.net/uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/ae861e25-3fe5-4727-9287-7ec166a3522b.jpeg?expires=1659968049&or=0&s=fa7d37cb7e1c6a8e9cd9e9db9cdf52ea",
"original_compressed": "https://nnts.imgix.net/uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/ae861e25-3fe5-4727-9287-7ec166a3522b.jpeg?auto=compress&expires=1659968049&or=0&s=ccc2ee9d540cffee6f614466cc6c216f",
"thumbnail": "https://nnts.imgix.net/uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/ae861e25-3fe5-4727-9287-7ec166a3522b.jpeg?auto=compress&expires=1659968049&w=240&s=c5cfd1e3893ab3379783bdb3970a3ab8",
"acw_rotate_90": "https://nnts.imgix.net/uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/ae861e25-3fe5-4727-9287-7ec166a3522b.jpeg?auto=compress&expires=1659968049&or=270&s=423bb22e9aa01ebcf2b2a51ad2ba3895",
"acw_rotate_180": "https://nnts.imgix.net/uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/ae861e25-3fe5-4727-9287-7ec166a3522b.jpeg?auto=compress&expires=1659968049&or=180&s=eb614306c30f1f76741ba8a819261bb0",
"acw_rotate_270": "https://nnts.imgix.net/uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/ae861e25-3fe5-4727-9287-7ec166a3522b.jpeg?auto=compress&expires=1659968049&or=90&s=385ce16f0c75a4c1fce987310191dfab",
"original_with_long_expiry": "https://nnts.imgix.net/uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/PredictionImages/ae861e25-3fe5-4727-9287-7ec166a3522b.jpeg?expires=1675505649&or=0&s=b71cbea278a03bfe0e27a417d8e49a31"
},
"uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/RawPredictions/SAMPLE-Flagged-2022-07-07T10-46-22.158.jpg": {
"original": "https://nanonets.s3.us-west-2.amazonaws.com/uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/RawPredictions/SAMPLE-Flagged-2022-07-07T10-46-22.158.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA5F4WPNNTLX3QHN4W%2F20220808%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20220808T101409Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&response-cache-control=no-cache&X-Amz-Signature=1f0331d94c1e8b87497a19d324211af01565e443f461febeaa5b16f34bd6eb1d",
"original_compressed": "",
"thumbnail": "",
"acw_rotate_90": "",
"acw_rotate_180": "",
"acw_rotate_270": "",
"original_with_long_expiry": ""
},
"uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/RawPredictions/SAMPLE-Passed-2022-07-07T10-46-23.132.jpg": {
"original": "https://nanonets.s3.us-west-2.amazonaws.com/uploadedfiles/415f8096-f114-41f8-a167-cc3c5b7fdd13/RawPredictions/SAMPLE-Passed-2022-07-07T10-46-23.132.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA5F4WPNNTLX3QHN4W%2F20220808%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20220808T101409Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&response-cache-control=no-cache&X-Amz-Signature=6c31d04733e35297cb6c0bac5b9736eb69e316f143b516d1726c1a11dcb2a934",
"original_compressed": "",
"thumbnail": "",
"acw_rotate_90": "",
"acw_rotate_180": "",
"acw_rotate_270": "",
"original_with_long_expiry": ""
}
}
}
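The classification data in a Get API response like the one above could be read along these lines. This is a sketch that assumes the response shape shown in the sample. The function name `top_labels` is hypothetical, and `sample` is a heavily trimmed stand-in for the full JSON.

```python
def top_labels(get_response):
    """Map each image ID to its highest-probability classification label."""
    labels = {}
    images = ((get_response.get("moderated_images") or [])
              + (get_response.get("unmoderated_images") or []))
    for image in images:
        classification = image.get("page_classification_results") or {}
        for page in classification.get("result", []):
            predictions = page.get("prediction", [])
            if predictions:
                # If an image had several result entries, the last one wins.
                best = max(predictions, key=lambda p: p["probability"])
                labels[image["id"]] = best["label"]
    return labels


# Trimmed stand-in for the response above.
sample = {
    "moderated_images": [{
        "id": "bfcf1e47-0dad-11ed-8ede-3200f1e279c7",
        "page_classification_results": {
            "message": "Success",
            "result": [{"page": 0, "prediction": [
                {"label": "category1", "probability": 0.9},
                {"label": "category2", "probability": 0.1},
            ]}],
        },
    }],
    "unmoderated_images": [],
}

print(top_labels(sample))
```

Images without a `page_classification_results` entry are simply skipped, so the same helper can be run over responses from OCR models that are not fed by a document router.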