A-PDF Text Extractor is a free PDF program that converts .pdf files to .txt, enabling you to use all the formulas in Foxtrot to extract information from the document with high precision, flexibility, and speed. To use A-PDF Text Extractor, you must download it using the following link:
For a one-time purchase of 35 dollars, you can also get the A-PDF Text Extractor application as a Windows console utility (PTCMD). This is a standalone .exe program that lets you perform the text extraction directly via the DOS Command action in Foxtrot, which makes the process significantly faster and more stable. To purchase it, go to this link:
As an alternative to A-PDF Text Extractor, you might consider the Poppler utility library, which is completely free and offers the same capabilities as the A-PDF Text Extractor CMD utility plus some additional useful features. However, it is arguably a bit more advanced to work with. To read more about the Poppler utility library, see this article:
For later reference, here are some articles that show how to use the most commonly used formulas in Foxtrot:
IMPORTANT: Remember that both A-PDF Text Extractor and the Poppler utility library work well with "text" PDF files, but they cannot extract text from "image" PDF files. If you are not sure which type you have, please read our overall guide on how to work with PDF files:
A-PDF Text Extractor Command Line
After purchasing and downloading PTCMD, we recommend that you place the .exe program directly on the C: drive for easier access during scripting. After that, you simply need to follow the instructions on their website. In Foxtrot, use the DOS Command action to call the PTCMD program. If you have placed the .exe program on the C: drive, here is how you could convert a file in the downloads folder:
cd C:\ && PTCMD "[*DOWNLOADS_DIRECTORY]example.pdf"
This first navigates to the C:\ directory using the "cd" command and then runs the PTCMD program to convert the "example.pdf" file in the downloads directory. In this example, no output file name is specified, so the output file is written to the same folder with the same name, simply with ".txt" instead of ".pdf".
If you wish to specify the output file destination, the command could be:
cd C:\ && PTCMD "[*DOWNLOADS_DIRECTORY]example.pdf" "[*DESKTOP_DIRECTORY]temp.txt"
Of course, you can utilize this in conjunction with Files & Folders lists if you need to convert multiple files in the same folder in a loop.
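To make the loop idea concrete, here is a minimal sketch, written in Python purely for illustration, of how one command per PDF could be assembled from a list of file paths. In Foxtrot itself you would do this with a Files & Folders list and a loop; the function name build_commands and the sample paths are hypothetical, and the sketch only builds the command strings without running PTCMD.

```python
from pathlib import PureWindowsPath


def build_commands(pdf_paths, exe_dir="C:\\"):
    """Return one DOS command string per PDF, mirroring the single-file example.

    exe_dir is assumed to be the folder that contains PTCMD (here the C: root).
    """
    commands = []
    for pdf in pdf_paths:
        # Skip anything that is not a .pdf file (case-insensitive check).
        if PureWindowsPath(pdf).suffix.lower() != ".pdf":
            continue
        # No output name is given, so PTCMD writes the .txt next to the .pdf.
        commands.append(f'cd {exe_dir} && PTCMD "{pdf}"')
    return commands


if __name__ == "__main__":
    # Hypothetical folder content, for illustration only.
    files = [
        r"C:\Users\me\Downloads\a.pdf",
        r"C:\Users\me\Downloads\notes.docx",
    ]
    for command in build_commands(files):
        print(command)
```

Each resulting string corresponds to what the DOS Command action would receive on one pass of the loop.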
Here is a best practice example where you have two variables:
- FilePath = the path of the pdf to convert
- CmdPath = the path of the A-PDF Text Extractor program
Notice how formulas are used to make the command dynamic and some additional options are used at the end to match the settings typically used in the desktop version of A-PDF Text Extractor. This command will navigate to the folder of the program, then call the program to convert the file specified in the "FilePath" variable and generate an output file in the same directory with the same name but with ".txt" extension.
IMPORTANT: If you have placed the program in the root of c:\, replace "[?LeftOf([%CmdPath],"\",,False)]" with "c:\".
cd [?LeftOf([%CmdPath],"\",,False)] && [?RightOf([%CmdPath],"\",,False)] "[%FilePath]" "[?Replace("[%FilePath]",".pdf",".txt")]" -O"Smart" -F" = Page &p ="
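To see how the pieces of this command resolve, the following Python sketch mimics the three formulas. It assumes, as the description above implies, that the LeftOf and RightOf calls split CmdPath at its last backslash; the function name build_command and the sample paths are hypothetical and not part of Foxtrot or A-PDF.

```python
def build_command(cmd_path: str, file_path: str) -> str:
    """Assemble the command string the way the formulas above are described."""
    # Split the program path at its last backslash (assumed LeftOf/RightOf behavior):
    # the left part is the folder, the right part is the .exe file name.
    folder, _, exe = cmd_path.rpartition("\\")
    # Same substitution as the Replace formula: ".pdf" becomes ".txt".
    out_path = file_path.replace(".pdf", ".txt")
    return (
        f'cd {folder} && {exe} "{file_path}" "{out_path}" '
        '-O"Smart" -F" = Page &p ="'
    )


if __name__ == "__main__":
    # Hypothetical values for the two variables.
    print(build_command(r"C:\Tools\PTCMD.exe", r"C:\Docs\invoice.pdf"))
```

With a program stored in the drive root, such as C:\PTCMD.exe, the same split yields "C:" without a trailing backslash, and "cd C:" does not change to the root directory. That is likely why the note above recommends writing "c:\" directly in that case.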
After producing the output text files, you can use the Read File action to read the content of each output file into a variable in Foxtrot. You can then extract the desired data either with text formulas or by sending the content to Excel and using the features available there.
A-PDF Text Extractor Desktop Application
In the attached file (find at the end of the article), you can find an example of how to build a script that extracts data from PDF files. You can download the script and try to run it. Remember to save the Foxtrot Project File to the same location as your other projects, typically in this destination: C:\ProgramData\Foxtrot Suite
You can also find a folder with PDF samples attached. Place it on your Desktop to run the script.
- Foxtrot starts by opening the program and choosing the way to display extracted text.
- The next step is to create a list of all PDF files in the folder and a loop that opens each file in A-PDF Text Extractor and extracts its data to a text file.
- Next, Foxtrot applies formulas to read the values into variables.
The important thing is to know the text structure. For example, this formula gets an invoice number: it takes the value between “Invoice” and the next space, and it works for all documents with that structure.
- The final steps are to write a log, close the program, and clear the variables.
- The result looks like this:
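Returning to the formula step in the list above, the logic of taking the value between "Invoice" and the next space can be sketched outside Foxtrot as follows. This is a Python illustration only, not the formula used in the attached script; the helper name text_between, the start marker "Invoice " (with a trailing space), and the sample line are all assumptions.

```python
def text_between(text: str, start: str, end: str) -> str:
    """Return the text after the first `start` marker up to the next `end` marker."""
    begin = text.find(start)
    if begin == -1:
        # Start marker not found: nothing to extract.
        return ""
    begin += len(start)
    stop = text.find(end, begin)
    # If the end marker is missing, take everything to the end of the text.
    return text[begin:] if stop == -1 else text[begin:stop]


if __name__ == "__main__":
    # Hypothetical line from an extracted text file.
    line = "Invoice INV-1042 Date 01/02/2020"
    print(text_between(line, "Invoice ", " "))
```

The approach depends entirely on the markers appearing in the same position in every document, which is why knowing the text structure matters.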