

Just as digitization has become ubiquitous, automation is following a similar path through Robotic Process Automation (RPA) and Artificial Intelligence (AI) technologies. Automation technologies define a context for a series of inputs and dictate the corresponding outputs. These outputs can take many forms and are produced automatically from predefined parameters. The most obvious difference between AI and RPA is the ability to evolve: AI can move past its original parameters to produce new outputs beyond the rules that are explicitly dictated, while RPA produces rule-bound outputs.

Overall, RPA improves business processes by reducing human error, and by taking manual labor out of repetitive tasks, it reduces operating costs. Our client has implemented RPA technology to reduce both resource and process inefficiencies. Their intelligent document capture technology classifies and separates page images into documents and extracts metadata from each document's Optical Character Recognition (OCR) content.



The client's web-based user interface allows operators to review documents and validate the extracted content. The assembled document and the associated metadata can be exported to other Enterprise Content Management (ECM) systems for further processing.

All organizations require tools to manage their information and knowledge. Document management, workflow, web content management, document capture, records management, portals, and other knowledge management systems are a few of the tools categorized as ECM. Since information can be stored in many different electronic systems, ECM tools communicate not only with each other, but also with other corporate systems such as ERP, CRM, and their associated databases.

Essentially, Chetu was tasked with engineering a document capture and data classification platform capable of accelerating the identification of various documents and extracting meaningful data to feed back-office applications and business processes. Ephesoft software is used to extract the desired data.

In the data extraction process, the supporting software must be trained with extraction rules, which define the operating requirements. These rules take the form of text patterns: regular expressions that match small or large-scale structures in the text. After a trial extraction, each rule is validated, and once it produces the desired result it can be applied. When a rule is applied, each extraction follows the rule and produces the desired result.
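The write-then-validate cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Ephesoft's rule format or API: the `ExtractionRule` class, the `validate_rule` helper, the invoice-number pattern, and the sample texts are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ExtractionRule:
    """A hypothetical extraction rule: a field name plus a regex with one capture group."""
    field: str
    pattern: str

    def extract(self, text: str) -> Optional[str]:
        # Return the first captured value, or None when the pattern does not match.
        match = re.search(self.pattern, text)
        return match.group(1) if match else None


def validate_rule(rule: ExtractionRule, samples: List[Tuple[str, str]]) -> bool:
    """Return True only if the rule yields the expected value for every sample text."""
    return all(rule.extract(text) == expected for text, expected in samples)


# Assumed pattern for illustration: "Invoice No:" or "Invoice #" followed by an ID.
invoice_rule = ExtractionRule(
    field="invoice_number",
    pattern=r"Invoice\s*(?:No\.?|#)\s*:?\s*([A-Z0-9-]+)",
)

# Sample OCR text paired with the value a reviewer expects to see extracted.
samples = [
    ("Invoice No: INV-1042\nDate: 2019-03-01", "INV-1042"),
    ("Invoice #A77 issued to ACME", "A77"),
]

# The rule is only applied to production documents once it passes validation.
rule_is_ready = validate_rule(invoice_rule, samples)
```

Keeping the expected values next to the sample texts means a rule that is later edited can be re-validated against the same samples before it is applied again.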


The development environment is based on Ephesoft software. The rules are based on the classification of the documents: batch classes are created once the documents have been classified, and each document is then indexed, which is where the extraction rules are established. There are three types of extraction:

  • Free Form Extraction
  • Fixed Form Extraction
  • Table/Line Item Extraction

We used regular expressions (Regex) to set up the rules. Ephesoft software supports the rules we define to jumpstart the desired extraction.
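As an illustration of how a regular expression can drive table or line item extraction, the sketch below applies one row pattern to every line of OCR text. It assumes a simple "description, quantity, unit price" row layout; the `LINE_ITEM` pattern, the `extract_line_items` function, and the sample text are hypothetical and do not represent Ephesoft's configuration.

```python
import re

# Assumed row layout: free-text description, integer quantity, price with two decimals.
LINE_ITEM = re.compile(
    r"^(?P<description>.+?)\s+(?P<quantity>\d+)\s+(?P<unit_price>\d+\.\d{2})$"
)


def extract_line_items(ocr_text):
    """Return one dict per line that matches the row pattern; other lines are skipped."""
    items = []
    for line in ocr_text.splitlines():
        match = LINE_ITEM.match(line.strip())
        if match:
            items.append({
                "description": match["description"],
                "quantity": int(match["quantity"]),
                "unit_price": float(match["unit_price"]),
            })
    return items


sample = """Description Qty Price
Widget A 2 10.50
Long cable 10 3.25
Total 53.50"""

# The header row and the total row do not fit the pattern, so only two rows remain.
items = extract_line_items(sample)
```

Real scanned tables tend to need more tolerant patterns (currency symbols, thousands separators, OCR misreads), so a pattern like this would normally be tuned against sample files first.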

An administrator acts as the Quality Analyst and validates the extracted data to ensure it aligns with the customer's requirements. If the extraction does not meet those requirements, the administrator reassigns the work to a Data Extraction team, who complete the process manually.

For data extraction approved by the administrator, the corresponding output is then imported into the immediate extracting system. This is followed by data transformation and possibly the addition of metadata prior to exporting to another stage in the workflow.
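The review step can be pictured as a simple routing decision. The sketch below is only an assumption about how such a hand-off might be modeled: the `ExtractedDocument` class, the `route` function, the two queues, and the `review_status` metadata key are hypothetical names, not part of the client's system.

```python
from dataclasses import dataclass, field


@dataclass
class ExtractedDocument:
    """Hypothetical container for one document's extraction result."""
    document_type: str
    fields: dict
    metadata: dict = field(default_factory=dict)


def route(document, approved, export_queue, manual_queue):
    """Send approved documents onward for export; send the rest to manual extraction."""
    if approved:
        # Metadata can be added before the document moves to the next workflow stage.
        document.metadata["review_status"] = "approved"
        export_queue.append(document)
    else:
        manual_queue.append(document)


export_queue, manual_queue = [], []

good = ExtractedDocument("invoice", {"invoice_number": "INV-1042"})
bad = ExtractedDocument("invoice", {"invoice_number": None})

# The administrator's decision is passed in explicitly; here it is hard-coded.
route(good, approved=True, export_queue=export_queue, manual_queue=manual_queue)
route(bad, approved=False, export_queue=export_queue, manual_queue=manual_queue)
```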

Multiple users can access the extraction and validation steps. Once multiple sample files in one format are provided to create extraction rules, the rules can then be applied in an automated extraction. Each desired extraction must be written as a rule for the automation to work properly. Our client provides documents that list all the "document types" and "fields" to be extracted. These enumerations are built into the system so that the extractions can be obtained with greater accuracy. The Regex library accommodates all possible fields, and the "document types" are described by annotations within the document.
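One way to picture the enumerated "document types" and "fields" together with a regex library is the sketch below. Every name and pattern in it (`DocumentType`, `Field`, `REGEX_LIBRARY`, `FIELDS_BY_TYPE`, `extract_fields`) is a hypothetical stand-in; the client's actual lists and Ephesoft's internal representation are not shown in this article.

```python
import re
from enum import Enum


class DocumentType(Enum):
    INVOICE = "invoice"
    PURCHASE_ORDER = "purchase_order"


class Field(Enum):
    INVOICE_NUMBER = "invoice_number"
    INVOICE_DATE = "invoice_date"
    PO_NUMBER = "po_number"


# Regex library: one compiled pattern per field, each with a single capture group.
REGEX_LIBRARY = {
    Field.INVOICE_NUMBER: re.compile(r"Invoice No:\s*([A-Z0-9-]+)"),
    Field.INVOICE_DATE: re.compile(r"Date:\s*(\d{4}-\d{2}-\d{2})"),
    Field.PO_NUMBER: re.compile(r"PO Number:\s*(\d+)"),
}

# Which fields are expected for each document type.
FIELDS_BY_TYPE = {
    DocumentType.INVOICE: [Field.INVOICE_NUMBER, Field.INVOICE_DATE],
    DocumentType.PURCHASE_ORDER: [Field.PO_NUMBER],
}


def extract_fields(doc_type, text):
    """Run every rule registered for the document type; unmatched fields map to None."""
    results = {}
    for fld in FIELDS_BY_TYPE[doc_type]:
        match = REGEX_LIBRARY[fld].search(text)
        results[fld.value] = match.group(1) if match else None
    return results


invoice_result = extract_fields(
    DocumentType.INVOICE, "Invoice No: INV-7\nDate: 2020-01-15"
)
```

Because the field list is fixed per document type, a missing value shows up explicitly as `None`, which gives the reviewer something concrete to flag.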

Ultimately, the Ephesoft software becomes more attuned to the client's needs as time passes. With RPA technology, the software has to be "trained." The client has been able to train our software effectively, improving nearly every facet of their infrastructure and rendering repetitive manual processes obsolete. Our relationship with this client has carried on far past this project: our collaborative RPA work has transitioned into a long-term partnership.


Chetu Limited is a company registered in England and Wales with company number 11882245

Copyright © 2000- Chetu Inc. All Rights Reserved.
