What are instant learning models?
An instant learning model learns from each file that you upload, correct, and approve, so you don't have to wait a long time to see improvements based on your feedback. With instant learning models, the learning process is immediate, so the model adjusts rapidly to new data and corrections.
1. Create a New Model, Add Fields and Table Headers
Step-by-Step Instructions:
- Accessing Manage Labels: After creating a new model, navigate to the left-hand panel and select the "AI Training" section. Here, you'll find the "Manage Labels" option which includes "Fields" and "Table Headers".
- Defining Fields and Table Headers: In the "Manage Labels" section, you can define all the fields and table headers that you want to extract from the documents you process using Nanonets.
What is the difference between Fields and Table Headers?
- Fields: When defining a field, it is assumed that on a given page or within a specific document, the field will have a singular value.
- Table Headers: When defining a table header, it is assumed that the header can have multiple values per page or document, typically one per table row.
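The distinction can be pictured as a data structure. The following is a minimal sketch, not the actual Nanonets output format: the `ExtractedDocument` class, its attribute names, and the sample values are all hypothetical. It shows a field holding one value per document, while each table header holds a list of values.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ExtractedDocument:
    """Hypothetical container for extracted data (not the Nanonets schema)."""

    # Fields: a single value per page or document.
    fields: Dict[str, str] = field(default_factory=dict)
    # Table headers: each header maps to a list of values, one per table row.
    table_columns: Dict[str, List[str]] = field(default_factory=dict)

    def rows(self) -> List[Dict[str, str]]:
        """Zip the table columns back into one dictionary per row."""
        headers = list(self.table_columns)
        return [
            dict(zip(headers, values))
            for values in zip(*self.table_columns.values())
        ]


doc = ExtractedDocument(
    fields={"invoice_number": "INV-00012345"},
    table_columns={
        "description": ["Widget", "Gadget"],
        "quantity": ["2", "5"],
    },
)

print(doc.fields["invoice_number"])  # one value for the whole document
print(doc.rows())                    # one entry per table row
```

In this sketch, `invoice_number` would be defined as a field, while `description` and `quantity` would be defined as table headers.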
- Adding Descriptions: Adding clear descriptions to each field and table header is crucial for improving model accuracy. Below are examples of how to effectively describe the invoice_number field:
  - Good Description: "The invoice_number field should contain a unique alphanumeric identifier located at the top right of the invoice, typically starting with 'INV-' followed by up to 8 digits. Ensure this field captures the complete identifier without any surrounding text."
  - Bad Description: "Enter the number from the invoice." This description is vague and does not specify the location, format, or uniqueness of the invoice number, potentially leading to incorrect or incomplete data capture.
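A precise description also gives you something to check extracted values against. As a minimal sketch, assuming the format stated in the good description ('INV-' followed by 1 to 8 digits, with no surrounding text), the check below flags values that do not match. The pattern and the function name `is_valid_invoice_number` are illustrative assumptions, not part of Nanonets.

```python
import re

# Assumed format from the example description: "INV-" followed by 1 to 8 digits.
INVOICE_NUMBER_PATTERN = re.compile(r"INV-[0-9]{1,8}")


def is_valid_invoice_number(value: str) -> bool:
    """Return True only if the whole value matches the assumed format."""
    return INVOICE_NUMBER_PATTERN.fullmatch(value) is not None


for candidate in ["INV-00012345", "Invoice No: INV-123", "INV-123456789"]:
    print(candidate, is_valid_invoice_number(candidate))
```

`fullmatch` is used rather than `search` so that a value with surrounding text, such as "Invoice No: INV-123", is rejected, which mirrors the requirement to capture the complete identifier and nothing else.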
2. Upload a Document
Step-by-Step Instructions:
- Configuring Fields/Table Headers: Ensure that all fields and table headers are configured before proceeding with the upload.
- Manual Upload: Documents can be uploaded manually by clicking the "Upload Files" button on the "Extract Data" screen.
- Automated Upload: For automated uploads, refer to the How to Setup Workflow? page for detailed instructions on setting up the import block.
3. Correct Mistakes in Fields and Tables
Step-by-Step Instructions:
- Post-Upload Processing: After uploading and processing a file, navigate to the extracted data to make corrections.
- Adjustments and Corrections: Adjust the boundaries of the annotated boxes to ensure accurate text capture, modify any incorrectly predicted values, and save the changes.
- Saving Changes: Click the "Save" button to apply corrections.
4. Approve File in Order to Train the Model
Step-by-Step Instructions:
- Learning from Approved Files: The model learns only from approved files, so you must approve a file for its corrections to be incorporated into the model training.
- Approving the File: After making all necessary corrections to the labels, use the "Approve" button located at the bottom right corner of the interface to finalize the training data input.
Debugging the Results of an Instant Learning Model
When working with an instant learning model, it's important to ensure that the data extraction is accurate and aligned with your requirements. If the predictions do not meet expectations, you may encounter one of the following issues:
1. Blank Predictions for a Field
Cause: The model may give blank predictions if the field name or its description is unclear, causing confusion about what data needs to be extracted.
Resolution:
- Review Field Definitions: Ensure that each field name and description is explicit and informative. Descriptions should clearly state where the information is typically found, what it looks like, and any distinguishing characteristics (e.g., alphanumeric content, a specific format).
- Train the Model: Train the model by approving files, and approve only files where the extracted data is correct.
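Before approving a file, it can help to list which expected fields came back blank. The following is a minimal sketch that assumes the predictions are available as a plain dictionary of field name to value; that shape, the field names, and the helper `find_blank_fields` are hypothetical and do not reflect a specific Nanonets export format.

```python
from typing import Dict, List, Optional


def find_blank_fields(
    predictions: Dict[str, Optional[str]], expected_fields: List[str]
) -> List[str]:
    """Return the expected field names whose prediction is missing or empty."""
    return [
        name
        for name in expected_fields
        if not (predictions.get(name) or "").strip()
    ]


# Hypothetical predictions for one processed file.
predictions = {
    "invoice_number": "INV-00012345",
    "invoice_date": "",
    "total_amount": None,
}
expected = ["invoice_number", "invoice_date", "total_amount", "vendor_name"]

blank = find_blank_fields(predictions, expected)
print(blank)
```

Fields that show up in this list are the ones whose names and descriptions are worth reviewing first.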
2. Incorrect Data Extraction Location
Cause: The model may extract data from an incorrect part of the document, leading to erroneous outputs.
Resolution:
- Inspect the Bounding Box: Click on the extracted data to view the bounding box for that value. This visual feedback shows you the exact location on the document from which the model is pulling the data.
- Adjust the Bounding Box: If the bounding box is not correctly placed, manually adjust it to the correct location on the document to ensure that the model learns the right areas for data extraction in future operations.
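If you know roughly where a field should appear, the same inspection can be expressed as a simple geometric check. The sketch below is an illustration under stated assumptions: the `Box` type, pixel coordinates with the origin at the top-left corner of the page, and the "top right" rule (taken from the invoice_number example description) are all hypothetical and not a Nanonets data format.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical bounding box in pixels, origin at the top-left of the page."""

    xmin: float
    ymin: float
    xmax: float
    ymax: float


def in_top_right(box: Box, page_width: float, page_height: float) -> bool:
    """Return True if the centre of the box lies in the top-right quadrant."""
    center_x = (box.xmin + box.xmax) / 2
    center_y = (box.ymin + box.ymax) / 2
    return center_x >= page_width / 2 and center_y <= page_height / 2


# Example page of 1000 x 1400 pixels.
expected_location = Box(700, 40, 950, 80)      # near the top right
wrong_location = Box(50, 1200, 300, 1240)      # near the bottom left

print(in_top_right(expected_location, 1000, 1400))
print(in_top_right(wrong_location, 1000, 1400))
```

A box that fails this kind of check is a candidate for manual adjustment before the file is approved.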