
Invoice Data Capture: How to Choose the Right Process for You

Accounts payable automation comprises several components, including invoice matching, approval workflows, adding data to an ERP system, paying suppliers, and ultimately generating reports and analytics.

Robert Lynch, P2P Insights Analyst
Published on March 6, 2019

One step of this process that often gets overlooked is invoice data capture. This is the stage where an invoice is received and the required data is extracted and entered into the system. Before a P2P or AP automation platform can begin to function, data from the invoice needs to be entered accurately.

Organizations typically receive invoices in paper, PDF, or electronic format. Depending on factors such as invoice volume and format, there are several ways to extract the data and enter it into an AP automation system.


Data capture methods


OCR (Optical Character Recognition)

OCR is a technology that allows software to interpret machine-printed text on scanned images. Invoice processing software uses OCR to locate familiar areas on an invoice, such as the invoice number or total, and read the data from there. Many OCR-based invoice tools can also learn from corrections: if a field is misread and then manually corrected, the software uses that correction to improve recognition on future invoices.
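
To illustrate the "read the data from familiar areas" step, here is a minimal Python sketch of pulling header fields out of text that an OCR engine has already produced. The sample text, the field labels, and the function name `extract_header_fields` are assumptions made for illustration; real invoices vary by supplier, and commercial tools use far more robust techniques than fixed patterns.

```python
import re

# Hypothetical text, as an OCR engine might return it for a scanned invoice.
ocr_text = """ACME Supplies Ltd
Invoice Number: INV-1042
Invoice Date: 2019-03-06
Total: 1,250.00
"""

# Assumed label patterns; real layouts differ from supplier to supplier.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"Invoice Number:\s*(\S+)"),
    "invoice_date": re.compile(r"Invoice Date:\s*(\d{4}-\d{2}-\d{2})"),
    "total": re.compile(r"Total:\s*([\d,]+\.\d{2})"),
}

def extract_header_fields(text):
    """Return the fields that were found and the names of those that were not."""
    fields, needs_review = {}, []
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            fields[name] = match.group(1)
        else:
            # Anything the patterns miss is flagged for a person to key manually.
            needs_review.append(name)
    return fields, needs_review

fields, needs_review = extract_header_fields(ocr_text)
```

Fields that cannot be found are returned separately so that a person can verify or key them, which mirrors the manual correction step described above.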


Manual Keying

Organizations have the option to manually key data from an invoice directly into the AP automation system. This process is time-consuming but is often less costly than the alternatives. It suits organizations that receive a low volume of invoices or that don’t need to extract a lot of data.



Outsourcing

The third option is to outsource the process of capturing data from an invoice. Suppliers can send invoices directly to the external organization. Paper invoices can be sent to a PO box from which the outsourcing provider collects them. Tools like ‘Email Collector’ can be configured to ensure that invoices received as email attachments are automatically added to a queue for data capture.

As with any outsourced process, the main advantage is that organizations don’t have to handle invoices themselves and can rely on a specialist to perform the process. However, organizations that handle sensitive data often prefer to perform the process internally so that they keep full control.


Factors to consider when choosing an invoice data capture method

Choosing the right invoice data capture method for your organization can be challenging. The four factors below are the most important ones to weigh when making your decision.


Factor #1: Invoice volume

Invoice volume is a major factor to consider when choosing an invoice data capture system. If an organization receives a low number of invoices, then manual keying could be a simple solution. A single individual could spend all or some of their time manually keying data into the system.

If an organization receives a large number of invoices, then outsourcing would be a strong option. An external organization could offer operational efficiencies and process each invoice as soon as it is received.


Factor #2: Format of invoices received

Invoices are usually received in one of three formats: paper, email attachment (generally PDF), or electronic (EDI or XML). The method of capturing data from invoices depends on the format in which they are received.

  • Invoices that are received in electronic format are preferred because they are the easiest, quickest and cheapest to process. EDI allows for the transfer of documents in a standard format, making processing quicker and removing delays in issuing payment.
  • If an organization receives mostly paper-based invoices, it will need to manually key the data or scan the invoices so that they can then be read by OCR (Optical Character Recognition). This can be done in-house or through outsourcing services. It is very common for AP teams to have their own desktop scanners and then verify the data output by OCR.
  • Invoices that are received as PDFs will need to be manually keyed into the system or processed through OCR. Email Collector is a useful tool for AP teams who are looking to streamline the invoice data capture process. It can be configured so that invoices attached to emails are automatically added to a queue to be manually keyed, run through OCR, or sent to an outsourced partner.
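
The queueing idea can be sketched in a few lines of Python. This is not how any particular email collector product works; the queue names, the file extensions, and the routing policy in `route_attachment` are all assumptions chosen for illustration.

```python
from collections import deque
from pathlib import PurePath

# Hypothetical queues; a real email collector is configured rather than hand-coded.
queues = {
    "structured_import": deque(),
    "ocr": deque(),
    "manual_keying": deque(),
}

def route_attachment(filename):
    """Place an attachment on a queue based on its file type (assumed policy)."""
    suffix = PurePath(filename).suffix.lower()
    if suffix == ".xml":
        queue_name = "structured_import"   # already machine-readable
    elif suffix in {".pdf", ".tif", ".tiff", ".png", ".jpg"}:
        queue_name = "ocr"                 # image-like, needs recognition
    else:
        queue_name = "manual_keying"       # anything unexpected goes to a person
    queues[queue_name].append(filename)
    return queue_name
```

An organization that outsources capture could add a fourth queue for its partner; the point is only that the format of each incoming invoice decides where it goes next.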

Electronic data interchange, or EDI, is the system-to-system transfer of data between organizations. If you are placing a large number of orders with vendors, you may also have the bargaining power to ask vendors to start sending invoices using EDI, which would ensure that the majority of invoices are captured the same way, allowing for even faster processing times.
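
To show why structured electronic invoices are the cheapest to process, here is a small Python sketch that reads an XML invoice using only the standard library. The element names (`Invoice`, `Number`, `Date`, `Total`) are a made-up layout, not a real EDI or e-invoicing standard, and `parse_invoice` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical invoice layout; real standards define their own schemas.
invoice_xml = """<Invoice>
  <Number>INV-1042</Number>
  <Date>2019-03-06</Date>
  <Total>1250.00</Total>
</Invoice>"""

def parse_invoice(xml_text):
    """Read header fields straight from the structured document."""
    root = ET.fromstring(xml_text)
    return {
        "number": root.findtext("Number"),
        "date": root.findtext("Date"),
        # Decimal avoids floating-point rounding on monetary amounts.
        "total": Decimal(root.findtext("Total")),
    }

header = parse_invoice(invoice_xml)
```

Because every field arrives labelled, there is no recognition step and nothing to verify by eye, which is where the speed and cost advantage comes from.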


Factor #3: Type and volume of data that needs to be extracted

On an invoice, each line item represents a product or service that has been billed, along with its quantity, rate, and price. Header data is all other information on an invoice, such as invoice number, total price and document date. If your organization's invoices typically contain a lot of line items, then it will be very time-consuming to manually key the data into an ERP system. OCR would be able to extract all line item data from an invoice in a fraction of the time. Although the extracted information still needs to be verified, AP staff will be able to re-allocate the time saved to other tasks.
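
The distinction between header data and line items can be made concrete with a small Python sketch. The class and field names are assumptions, the sample values are invented, and the check ignores tax and discounts for simplicity.

```python
from dataclasses import dataclass, field
from decimal import Decimal

@dataclass
class LineItem:
    description: str
    quantity: int
    unit_price: Decimal

    @property
    def amount(self):
        return self.quantity * self.unit_price

@dataclass
class Invoice:
    # Header data: one value per invoice.
    invoice_number: str
    document_date: str
    total: Decimal
    # Line-item data: one entry per product or service billed.
    line_items: list = field(default_factory=list)

    def lines_match_total(self):
        """Simple verification step: do the captured lines add up to the header total?"""
        line_sum = sum((item.amount for item in self.line_items), Decimal("0"))
        return line_sum == self.total

invoice = Invoice("INV-1042", "2019-03-06", Decimal("1250.00"), [
    LineItem("Paper, A4 (box)", 10, Decimal("25.00")),
    LineItem("Desktop scanner", 2, Decimal("500.00")),
])
```

A check like `lines_match_total()` is one way to catch a mis-keyed or misread line before the invoice reaches the ERP system, whichever capture method is used.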

If an invoice doesn’t have a lot of line items then organizations could manually key the data into their system. If there isn’t much data to be extracted, employees won’t need to spend too much time on data entry.

In some cases, it may not be necessary to extract all of the data on the invoice. If all of the data on the invoice needs to be extracted, then OCR may be the most efficient and accurate option. If only certain information needs to be extracted, then manually keying the information could be a strong solution.


Factor #4: Resources

When selecting an invoice data capture method, it is important to consider the financial and human resources that you have available. All organizations have to allocate budgets for projects like data capture, and these budgets vary from company to company. The cost of an invoice data capture project depends on each of the previous three factors: higher invoice volumes and more complex invoices will increase the price of the overall project.

For organizations that have smaller budgets, manual keying may be the best option. An AP employee can be responsible for manually keying the data from each invoice into the ERP system. The main cost of this method would be the salary of the staff involved. That said, there need to be sufficient staff resources available.

Alternatively, if an organization has a larger budget, it should look into outsourcing or OCR as its preferred method. Both of these methods take little to no time from your workforce, which makes them a good option for organizations that are tight on staff resources.
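
The four factors can be combined into a rough rule of thumb, sketched here in Python. The function `suggest_capture_method` and its parameters are hypothetical, and the thresholds are deliberately left for the caller to supply because the right cut-off points differ for every organization. It is a simplification of the guidance above, not a substitute for a proper assessment.

```python
def suggest_capture_method(invoices_per_month, avg_line_items,
                           has_spare_staff, has_budget,
                           volume_threshold, line_item_threshold):
    """Return a rough suggestion; thresholds are the caller's own assumptions."""
    high_volume = invoices_per_month >= volume_threshold
    data_heavy = avg_line_items >= line_item_threshold

    # Smaller budgets point to manual keying (which still needs available staff).
    if not has_budget:
        return "manual keying"
    # Low volume, little data to extract, and staff available: keying is simple.
    if not high_volume and not data_heavy and has_spare_staff:
        return "manual keying"
    # Large volumes favor an external partner with operational efficiencies.
    if high_volume:
        return "outsourcing"
    # Otherwise, data-heavy invoices or limited staff favor OCR.
    return "OCR"
```

The order of the checks is a design choice: budget is tested first because, in the reasoning above, it constrains which of the other options are available at all.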

[Figure: Choosing an invoice data capture process based on invoice volume, invoice format, type and volume of data to capture, and resources.]



Keeping on top of incoming invoices can be a time-consuming task that requires a lot of resources. The penalties associated with late payments and errors in paying invoices can be substantial. Invoice data capture is the first stage of a long accounts payable process and, when done correctly, can lead to significant time and cost savings, allowing AP staff to spend their time on tasks that add more value.
