Document AI Accelerator Scales and Improves Efficiency
Cindy Barrientos, Senior Data Science Engineer, Rackspace Technology
An estimated 80% of all business data is in the form of unstructured data, including emails, images, multimedia files and PDFs. This unstructured data contains highly valuable information, such as HR records, patient records, insurance claims, invoices, purchase orders, maintenance logs, and order scheduling and tracking.
Onica by Rackspace Technology™ leverages AWS’s powerful Intelligent Document Processing (IDP) managed services as an Accelerator program that helps organizations make their unstructured data searchable, so they can intelligently access their data quickly.
Onica IDP is built entirely on AWS cloud infrastructure. It is fully customizable and configurable, so it can plug into an organization's existing systems. The solution uses Amazon Textract and Amazon Comprehend models to increase data quality and reduce costs by operationalizing document processing workflows. In addition, it provides scalability, improves efficiency and enhances governance and security while supporting compliance and regulatory requirements. The uses for this solution are vast, and include:
- Classifying and tagging documents and images
- Creating intelligent filing systems
- Eliminating the need for manual data entry
- Formatting document/form fields into logical formats
- Translating documents into different languages
- Digitizing paper documents like invoices, receipts, photos, forms and other files
- Reducing physical storage space and costs
Onica has built a front-end user interface that makes uploading, searching and analyzing documents easy. The analysis begins with Amazon Textract, which runs text detection or document analysis. Wait steps check whether each AWS managed service job has finished. If it has not, the pipeline keeps waiting; once the job completes, transformation and extraction of the output continue. A word map built from the Textract output is the precursor to combining the outputs of Textract and Comprehend. The Comprehend classifier and entity recognition run in parallel.
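The wait step can be illustrated with a small polling loop. This is a minimal sketch and not the pipeline's actual implementation: the article does not say how the wait steps are built, and `wait_for_job`, `get_status`, `poll_seconds` and `max_polls` are hypothetical names introduced here. The status strings are the ones Textract reports for asynchronous jobs.

```python
import time
from typing import Callable

# Statuses at which a Textract asynchronous job has stopped running.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "PARTIAL_SUCCESS"}


def wait_for_job(
    get_status: Callable[[], str],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches a terminal status, then return that status.

    get_status is a hypothetical callable. In a real pipeline it could wrap
    a call such as textract.get_document_text_detection(JobId=...) and
    return the "JobStatus" field of the response.
    """
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)  # the wait step: pause before checking again
    raise TimeoutError(f"job still running after {max_polls} polls")


# Simulated job that finishes on the third check (no real AWS call, no real sleep).
simulated = iter(["IN_PROGRESS", "IN_PROGRESS", "SUCCEEDED"])
final_status = wait_for_job(lambda: next(simulated), sleep=lambda seconds: None)
```

Passing the status check and the sleep function in as arguments keeps the loop testable without AWS credentials, and the `max_polls` cap prevents a stuck job from blocking the pipeline indefinitely.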
Any documents classified with low confidence scores are sent to Amazon Augmented AI (Amazon A2I) via a Lambda function. End users are invited to review and label the data, both to ensure the labels are correct and to provide corrected examples for retraining Comprehend. Post-processed Textract and Comprehend outputs are sent to S3 and an OpenSearch cluster.
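The low-confidence check might look something like the sketch below. It assumes the classifier result is a list of `{"Name", "Score"}` entries, which is the shape of the `Classes` field that Comprehend returns for a classified document. The 0.80 cut-off and the function names are assumptions for illustration; the article does not state the threshold used in the pipeline.

```python
from typing import Dict, List

REVIEW_THRESHOLD = 0.80  # assumed cut-off; the article does not state one


def top_class(classes: List[Dict]) -> Dict:
    """Return the highest-scoring entry from a list of {"Name", "Score"} dicts."""
    if not classes:
        raise ValueError("classifier returned no classes")
    return max(classes, key=lambda c: c["Score"])


def needs_human_review(classes: List[Dict], threshold: float = REVIEW_THRESHOLD) -> bool:
    """True when the best label's confidence falls below the threshold."""
    return top_class(classes)["Score"] < threshold


# Illustrative classifier outputs (made-up scores, not real results).
confident = [{"Name": "invoice", "Score": 0.97}, {"Name": "receipt", "Score": 0.02}]
uncertain = [{"Name": "invoice", "Score": 0.55}, {"Name": "receipt", "Score": 0.41}]

review_queue = [doc for doc in (confident, uncertain) if needs_human_review(doc)]
```

Only the `uncertain` document ends up in `review_queue`. In the pipeline described above, that is the point at which a Lambda function would hand the document to the human review workflow.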
CloudWatch metrics of the document processing Lambdas are tracked to prevent or quickly recognize failures. Considering the hard limits of AWS Lambda, we want to be aware of processing loads that are reaching capacity before problems arise.
We provide a CloudWatch dashboard with an overview of the health of the processing pipeline. When one or more Lambdas in the pipeline reach an 80% threshold of memory utilization or execution duration, an alarm is triggered to notify an established point of contact. This user can then determine the source by investigating the CloudWatch dashboard, and opt to change the processing schedule. Because of the scheduled batch processing feature, our pipeline can handle most document workloads. To reach a target pipeline utilization, we set a schedule to match the typical day-to-day usage.
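The 80% rule can be expressed as a simple comparison against each function's configured limits. The following is a sketch of that logic only, under the assumption that peak memory and duration figures have already been collected. The `LambdaUsage` type and `breached_limits` function are hypothetical, and in practice the comparison would be configured as CloudWatch alarms rather than run as application code.

```python
from dataclasses import dataclass
from typing import List

ALARM_RATIO = 0.80  # the 80% threshold described above


@dataclass
class LambdaUsage:
    name: str
    max_memory_used_mb: float  # peak memory observed for an invocation
    memory_limit_mb: float     # memory configured for the function
    duration_ms: float         # observed execution time
    timeout_ms: float          # timeout configured for the function


def breached_limits(usage: LambdaUsage, ratio: float = ALARM_RATIO) -> List[str]:
    """Return which limits ("memory", "duration") are at or above the ratio."""
    reasons = []
    if usage.max_memory_used_mb >= ratio * usage.memory_limit_mb:
        reasons.append("memory")
    if usage.duration_ms >= ratio * usage.timeout_ms:
        reasons.append("duration")
    return reasons


# Illustrative figures: 900 MB used of 1,024 MB, 100 s used of a 900 s timeout.
sample = LambdaUsage("extract-text", 900, 1024, 100_000, 900_000)
alerts = breached_limits(sample)
```

Here `alerts` contains only `"memory"`, because 900 MB exceeds 80% of 1,024 MB while the duration is well under 80% of the timeout.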
Logging and Data Visualization
We export CloudWatch logs to S3 using Amazon Kinesis Data Firehose. AWS Glue crawlers then catalog the exported logs as tables that can be queried with Amazon Athena, which lets users explore the data quickly in Amazon QuickSight. A dashboard with a high-level overview of the document processing pipeline is published for easy access through the front-end interface.
The front-end is hosted and deployed on AWS Amplify for easy infrastructure management. We then use Amazon Cognito to authenticate users who are granted access to the document data.
New documents may be uploaded for processing on the upload page. The processing workflow is run on a customizable schedule (e.g., hourly, weekly, monthly).
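As an illustration of what a customizable schedule could look like in configuration, the sketch below maps the three example frequencies to schedule expressions in the syntax used by Amazon EventBridge. This is an assumption: the article does not name the scheduling service, and `SCHEDULES` and `schedule_expression` are hypothetical names.

```python
# Hypothetical mapping from a schedule name to an EventBridge-style expression.
SCHEDULES = {
    "hourly": "rate(1 hour)",
    "weekly": "rate(7 days)",
    "monthly": "cron(0 0 1 * ? *)",  # 00:00 UTC on the 1st of each month
}


def schedule_expression(name: str) -> str:
    """Look up the expression for a named schedule, ignoring letter case."""
    try:
        return SCHEDULES[name.lower()]
    except KeyError:
        raise ValueError(
            f"unknown schedule {name!r}; expected one of {sorted(SCHEDULES)}"
        ) from None


expression = schedule_expression("Weekly")
```

Keeping the schedule as a single named setting makes it easy to change the processing frequency when the pipeline's alarms show that the typical load has shifted.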
You can find documents of interest by filtering on the document type that the classifier assigned for your use case. The following example shows case study documents grouped by relevant industry. When we select an industry in the search bar, previews of the matching documents are displayed, and each preview can be clicked to enlarge it.
Alternatively, or in addition to filtering by classification, you can search for text in the documents, such as a client name, invoice line item or prescriber's name.
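Since the processed outputs are sent to an OpenSearch cluster, a combined classification filter and text search could be expressed as a `bool` query in the OpenSearch query DSL. The sketch below only builds the query body. The field names `content` and `classification` are assumptions (with `classification` assumed to be indexed as a keyword field), and the article does not show the index mapping actually used.

```python
from typing import Any, Dict, Optional


def build_search_query(
    text: Optional[str] = None,
    doc_class: Optional[str] = None,
) -> Dict[str, Any]:
    """Build an OpenSearch query body from optional text and class filters."""
    bool_query: Dict[str, Any] = {}
    if text:
        # Full-text match on the extracted document text (hypothetical field).
        bool_query["must"] = [{"match": {"content": text}}]
    if doc_class:
        # Exact match on the classifier label (hypothetical keyword field).
        bool_query["filter"] = [{"term": {"classification": doc_class}}]
    if not bool_query:
        return {"query": {"match_all": {}}}  # no criteria: return everything
    return {"query": {"bool": bool_query}}


# Search invoices for a client name (illustrative values).
query = build_search_query(text="Acme Corp", doc_class="invoice")
```

Placing the classification in `filter` rather than `must` means it narrows the result set without affecting relevance scoring, so results are ranked by how well the text matches.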
The ability to quickly classify documents and detect entities opens up many workflow possibilities for increasing efficiency and improving employee productivity. For instance, we can include additional post-processing on documents that have been identified as purchase orders, receipts or invoices to extract specific line items and organize them in a database. Or we could route incoming fax pages to their respective departments for more streamlined communications. Automating these processes can help your business improve accuracy and free your employees to work on more meaningful tasks.
Our generative AI services are part of the Foundry for AI by Rackspace Technology.
Foundry for AI by Rackspace (FAIR™) is a groundbreaking global practice dedicated to accelerating the secure, responsible, and sustainable adoption of generative AI solutions across industries.