This feature is free to use on all models (free or paid) and can be set up from the Manage Labels page in the app.
What can you do with captured field data?
A Label Type can perform three kinds of operations, and all three can be present on a label at the same time.
- Post-processing - Changes the extracted data to match your specifications.
- Formatting - Displays and stores data in a particular format. This is purely aesthetic.
- Validations - Checks whether the data matches the conditions and rules you set, and marks the label invalid if it does not.
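To make the relationship between the three kinds of operations concrete, here is a minimal Python sketch. It is not the app's implementation: the `LabelType` class, its field names, and the order in which the operations run (post-processing, then validations, then formatting) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class LabelType:
    """Hypothetical model of a Label Type holding all three kinds of operations."""
    post_processing: List[Callable[[str], str]] = field(default_factory=list)
    # Each validation is an (error message, check function) pair.
    validations: List[Tuple[str, Callable[[str], bool]]] = field(default_factory=list)
    formatter: Optional[Callable[[str], str]] = None

    def apply(self, raw: str) -> Tuple[str, List[str]]:
        """Return the displayed value and a list of validation errors."""
        value = raw
        # Post-processing changes the extracted data itself (assumed to run first).
        for step in self.post_processing:
            value = step(value)
        # Validations only flag problems; they never change the value.
        errors = [message for message, check in self.validations if not check(value)]
        # Formatting only affects how the value is displayed or stored.
        display = self.formatter(value) if self.formatter else value
        return display, errors


# Example: a phone number label that trims whitespace and requires 10 digits.
phone = LabelType(
    post_processing=[str.strip],
    validations=[("must be 10 digits", lambda v: len(v) == 10 and v.isdigit())],
)

print(phone.apply(" 9876543210 "))  # ('9876543210', [])
print(phone.apply("12345"))         # ('12345', ['must be 10 digits'])
```

A label with an empty error list would count as valid; any error would mark it invalid.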
Choose from date formats (US/EU/Friendly) under Formatting.
eg: Change 12 Jan 2021 to 12/01/2021 (EU format)
Remove special characters from an Amount field with Post-processing to strip currency symbols.
eg: Change $4000 to 4000
Mark a file as invalid with Validations if an entry does not match the character length you set.
eg: Set entry length to 10 digits for phone numbers
Remove alphabetic characters from Invoice number entries with Post-processing.
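The examples above can be approximated in a few lines of Python. This is only a sketch of the behaviour described, not the app's code: the function names are hypothetical, and the exact rules the app uses (for instance, which characters count as "special") are assumptions.

```python
import re
from datetime import datetime


def format_date_eu(text: str) -> str:
    """Formatting: parse a date like '12 Jan 2021' and render it as dd/mm/yyyy.

    Assumes English month abbreviations (the default C locale).
    """
    return datetime.strptime(text, "%d %b %Y").strftime("%d/%m/%Y")


def remove_special_characters(text: str) -> str:
    """Post-processing: keep only letters, digits and whitespace."""
    return re.sub(r"[^A-Za-z0-9\s]", "", text)


def remove_alphabets(text: str) -> str:
    """Post-processing: drop ASCII letters and keep everything else."""
    return re.sub(r"[A-Za-z]", "", text)


def has_length(text: str, length: int) -> bool:
    """Validation: the entry must be exactly `length` characters long."""
    return len(text) == length


print(format_date_eu("12 Jan 2021"))        # 12/01/2021
print(remove_special_characters("$4000"))   # 4000
print(remove_alphabets("INV2021A"))         # 2021
print(has_length("9876543210", 10))         # True
```

Note that a blanket "remove special characters" rule also strips decimal points and thousands separators, so an amount such as $4,000.50 would need a narrower rule.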
Steps to set these up
1. Find the Manage Labels page in any of these ways:
- On the Extract data or Prepare for Training screen, click on an image to open it, then click on the ✏️ (edit) icon next to Labels.
- On the Prepare model for Training page, under Model details, click on the ⚙️ (settings) icon.
- In the Build section of the side navigation bar, select Manage Labels.
2. Set a Label Type for any field:
For example, if you want all the extracted dates from your files to match the format mm/dd/yyyy, here's how you can do that:
- Go to the Manage Labels page from one of the screens mentioned above.
- Against each field, you will see a 'Select Type' dropdown.
- Click on the dropdown to select a preset. (In this example we have selected Date.)
- Once the Label Type is selected, you can view or change its default settings by clicking on the ⚙️ icon. (See the steps in the next section to learn more about custom types.)
Custom Label Types
Steps to create a Custom label type or edit default settings:
- If you have already added a preset Label Type, skip to the next step. If you have not added a Label Type yet, follow the steps above to add one first.
- After adding a preset Label Type, click on the ⚙️ icon to open the Customise field settings panel on the right side of the page.
- You can now edit existing operations/validations or add new ones under the relevant section (Post-processing, Formatting, or Validations).
- Click on Apply Changes.
Custom options list
- Change case: Change case of the string to uppercase/lowercase/titlecase
- Closest match to: Specify text that the entry should match
- Concatenate: Join character strings end-to-end. eg: "snow" and "ball" become "snowball".
- ToAscii: Convert characters to ASCII codes.
- Keep only one: Keep only one entry and discard the entries with lower confidence.
- Find and Replace: Find one or more pieces of text and replace them with the specified text.
- Match Regex: Create patterns that help match, locate, and manage text.
- Remove: Remove numbers, alphabets or special characters. eg: $400 to 400
- Content length is: Allow entries only if they match the specified number of characters.
- Match databases: Set up integrations with Salesforce, Postgres, etc. and check whether the data matches entries in those databases.
- Valid date: Allow an entry only if it matches one of the EU, US, or Friendly date formats.
- EU (dd/mm/yyyy)
- ISO (2020-08-05)
- US (mm/dd/yyyy)
- Lower Case
- Upper Case
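Several of the options in this list map onto standard string operations. The sketch below shows one plausible reading of a few of them using only the Python standard library. The function names are hypothetical, and using `difflib` for "Closest match to" is a stand-in technique; the article does not say how the app computes the match.

```python
import difflib
import re
from typing import List, Optional, Tuple


def change_case(text: str, mode: str) -> str:
    """Change case: mode is 'upper', 'lower' or 'title'."""
    if mode == "upper":
        return text.upper()
    if mode == "lower":
        return text.lower()
    if mode == "title":
        return text.title()
    raise ValueError(f"unknown mode: {mode}")


def closest_match(text: str, candidates: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Closest match to: return the most similar candidate, or None if nothing is close."""
    matches = difflib.get_close_matches(text, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def concatenate(parts: List[str]) -> str:
    """Concatenate: join strings end-to-end."""
    return "".join(parts)


def find_and_replace(text: str, targets: List[str], replacement: str) -> str:
    """Find and Replace: replace every occurrence of each target string."""
    for target in targets:
        text = text.replace(target, replacement)
    return text


def matches_regex(text: str, pattern: str) -> bool:
    """Match Regex: the whole entry must match the pattern."""
    return re.fullmatch(pattern, text) is not None


def keep_only_one(entries: List[Tuple[str, float]]) -> str:
    """Keep only one: given (value, confidence) pairs, keep the most confident value."""
    return max(entries, key=lambda entry: entry[1])[0]


print(change_case("acme corp", "title"))                   # Acme Corp
print(closest_match("Invioce", ["Invoice", "Receipt"]))    # Invoice
print(concatenate(["snow", "ball"]))                       # snowball
print(find_and_replace("USD 400 usd", ["USD ", " usd"], ""))  # 400
print(matches_regex("INV-0042", r"INV-\d{4}"))             # True
print(keep_only_one([("4000", 0.62), ("4008", 0.91)]))     # 4008
```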
Check invalid entries on your files
- Upload files to the Extract data screen on your model.
- In the Files list you will see a Validation column.
- If ALL labels on a file have valid entries: the file will be marked with a blue tick.
- If there is a label with an invalid entry: the file will be marked with a red error symbol.
- Click to open the invalid file (the one with the red error icon).
- You will see the label(s) containing the invalid entry marked in red with an error message icon.
- Click on the error message icon to see why the entry is marked as invalid.
- Correct the data to match the validation criteria.
- If there are no more invalid entries, the file will now be marked as valid with the blue tick.
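The rule behind the Validation column, as described above, is that a file is valid only when every label on it is valid. A minimal Python sketch of that rule follows; the `LabelResult` class and its fields are hypothetical and do not reflect the app's data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LabelResult:
    """Hypothetical record for one label on a file."""
    name: str
    value: str
    error: Optional[str] = None  # None means the entry passed every validation


def invalid_labels(labels: List[LabelResult]) -> List[LabelResult]:
    """Labels that would be marked in red with an error message."""
    return [label for label in labels if label.error is not None]


def file_is_valid(labels: List[LabelResult]) -> bool:
    """Blue tick only if ALL labels are valid; a single invalid label fails the file."""
    return not invalid_labels(labels)


labels = [
    LabelResult("invoice_date", "12/01/2021"),
    LabelResult("phone", "98765", error="Content length is not 10"),
]
print(file_is_valid(labels))  # False

# Correcting the entry clears the error, and the file becomes valid.
labels[1] = LabelResult("phone", "9876543210")
print(file_is_valid(labels))  # True
```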