jPDFText Source Code Samples
The following Java samples use jPDFText to extract text content from PDF documents. More samples are available in the jPDFText knowledge base.
ExtractAllText.java – Simple program that extracts the entire text of a document as a single String and then saves it to a file.
ExtractTextByPage.java – Program that extracts the text for each page in a document and writes it to a file.
GetWordList.java – Program that gets all the words from a PDF document and echoes them to the console.
GetLinesAndPositions.java – Program that gets all the lines from a PDF document and echoes them to the console. This program can be used to extract data from structured reports such as invoices and statements.
GetWordsAndPositions.java – Extracts all the words in the document with their position information and echoes them to the console.
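
As a rough illustration of how line positions can help when pulling data out of a structured report, the sketch below looks up the value printed to the right of a label on the same row. It deliberately does not call jPDFText: the PositionedLine class, the valueRightOf method, the coordinates, and the sample values are all hypothetical stand-ins for whatever text-and-position data the samples above print, and the library's own types may look different.

```java
import java.util.ArrayList;
import java.util.List;

public class ReportFieldSketch {

    // Hypothetical stand-in for one line of text and its position on the page.
    // This is NOT a jPDFText class; it only models "text plus coordinates".
    static final class PositionedLine {
        final String text;
        final double x; // left edge of the line, in page units
        final double y; // vertical position of the line, in page units

        PositionedLine(String text, double x, double y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    // Returns the text of the closest line to the right of the given label
    // on (roughly) the same row, or null if the label or a value is not found.
    static String valueRightOf(List<PositionedLine> lines, String label, double yTolerance) {
        // Find the first line whose text matches the label.
        PositionedLine anchor = null;
        for (PositionedLine line : lines) {
            if (line.text.trim().equalsIgnoreCase(label)) {
                anchor = line;
                break;
            }
        }
        if (anchor == null) {
            return null;
        }

        // Among lines on the same row and to the right, keep the nearest one.
        PositionedLine best = null;
        for (PositionedLine line : lines) {
            boolean sameRow = Math.abs(line.y - anchor.y) <= yTolerance;
            boolean toTheRight = line.x > anchor.x;
            if (line != anchor && sameRow && toTheRight
                    && (best == null || line.x < best.x)) {
                best = line;
            }
        }
        return best == null ? null : best.text.trim();
    }

    public static void main(String[] args) {
        // Made-up sample data resembling an invoice layout.
        List<PositionedLine> lines = new ArrayList<>();
        lines.add(new PositionedLine("Invoice No:", 50, 100));
        lines.add(new PositionedLine("INV-1042", 150, 100.5));
        lines.add(new PositionedLine("Total:", 50, 300));
        lines.add(new PositionedLine("$1,250.00", 150, 300));

        System.out.println(valueRightOf(lines, "Invoice No:", 2.0));
        System.out.println(valueRightOf(lines, "Total:", 2.0));
    }
}
```

The small vertical tolerance is there because text on the same visual row often has slightly different coordinates; a suitable value depends on the report's font size and layout.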