The pdf2Data SDK is a native Java (or .NET) application. Its primary function is to extract data from PDF files using predefined extraction rules.
The extracted data is output in either JSON or XML format.
The preferred way to set up iText pdf2Data in Java is to use a build system such as Maven or Gradle and download the pdf2Data artifacts from the iText Artifactory (https://repo.itextsupport.com/pdf2data/).
The groupId is com.itextpdf.pdf2data, and the artifactId is
In Maven, the configuration would look similar to the example below:
You can browse for the desired NuGet package manually or install it with the NuGet Package Manager command Install-Package itext7.pdf2data.
Integrating pdf2Data into your code
As of iText pdf2Data 4.0, the format of extraction templates has changed compared to iText pdf2Data 3.*. See the Migration guide for details.
With the pdf2Data Manager in iText pdf2Data 4.0, you can download templates optimized for use in the pdf2Data SDK, so you can extract data in two lines of code.
Make sure to load the license file before invoking any other code.
A Pdf2DataExtractor instance should now be initialized from a processed template with a single function call:
Parse PDF using the extraction template
You can use the extracted values directly from the result or save them in one of two structured formats, JSON or XML.
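As a library-independent illustration of the two output formats, the sketch below takes a map of field names to extracted values and renders it once as JSON and once as XML. It does not call the pdf2Data API: the class ExtractedFieldsDemo, the field names, and the output layout are all hypothetical, and the SDK's actual JSON and XML schemas may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: hypothetical class, not part of the pdf2Data SDK.
class ExtractedFieldsDemo {

    // Escape characters that must not appear raw inside a JSON string.
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Escape characters that are reserved in XML text and attribute values.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Render the fields as a flat JSON object (assumed layout).
    static String toJson(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) {
                sb.append(",");
            }
            sb.append('"').append(escapeJson(e.getKey())).append("\":\"")
              .append(escapeJson(e.getValue())).append('"');
            first = false;
        }
        return sb.append("}").toString();
    }

    // Render the fields as a list of <field> elements (assumed layout).
    static String toXml(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("<fields>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append("<field name=\"").append(escapeXml(e.getKey())).append("\">")
              .append(escapeXml(e.getValue())).append("</field>");
        }
        return sb.append("</fields>").toString();
    }

    public static void main(String[] args) {
        // Hypothetical values standing in for data extracted from a PDF.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("invoiceNumber", "INV-001");
        fields.put("vendor", "Smith & Sons");
        System.out.println(toJson(fields));
        System.out.println(toXml(fields));
    }
}
```

In practice the SDK's own result object would handle serialization; the sketch only shows how the same field/value pairs map onto each format and why values need format-specific escaping.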