public class TessJNI extends Object
| Constructor and Description |
|---|
| TessJNI() |

| Modifier and Type | Method and Description |
|---|---|
| String | getOCRText(String language, PDFPage pdfPage, int dpi, OCROptions options): Performs OCR on a PDF page and returns a plain text result string. |
| String | getTesseractVersion(): Returns the version of Tesseract that we're running. |
| String | performOCR(String language, BufferedImage image, int imageDPI): Deprecated. Use the version of this method that takes a pageNumber argument. |
| String | performOCR(String language, BufferedImage image, int imageDPI, int pageNumber): Performs OCR on an image and returns an hOCR result string. |
| String | performOCR(String language, PDFPage pdfPage, int dpi): Performs OCR on a PDF page and returns an hOCR result string. |
| String | performOCR(String language, PDFPage pdfPage, int dpi, OCROptions options): Performs OCR on a PDF page and returns an hOCR result string. |
| OCRPageResults | performOCRandOSD(String language, PDFPage pdfPage, int dpi, OCROptions options): Converts a PDF page to an image and runs OCR on it, returning the OCR text in hOCR format together with page orientation information. |
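
Most of the methods above return hOCR, an HTML-based format, while getOCRText returns plain text. As a rough illustration of the difference, the sketch below pulls the recognized words out of an hOCR string. It assumes the common hOCR layout in which each word sits in a span with class ocrx_word; the class HocrWords and its extract method are hypothetical helpers, not part of TessJNI, and the sample string is hand-written rather than real OCR output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of TessJNI: extracts word text from hOCR markup.
class HocrWords {
    // Assumption: each recognized word is wrapped in <span class='ocrx_word' ...>...</span>.
    private static final Pattern WORD = Pattern.compile(
            "<span[^>]*class=['\"]ocrx_word['\"][^>]*>(.*?)</span>", Pattern.DOTALL);

    static List<String> extract(String hocr) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(hocr);
        while (m.find()) {
            // Drop inline markup such as <strong> or <em>; HTML entities are left undecoded.
            String text = m.group(1).replaceAll("<[^>]+>", "").trim();
            if (!text.isEmpty()) {
                words.add(text);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // Hand-written sample fragment for illustration only.
        String sample = "<span class='ocr_line' title='bbox 36 92 200 116'>"
                + "<span class='ocrx_word' title='bbox 36 92 96 116; x_wconf 95'>Hello</span> "
                + "<span class='ocrx_word' title='bbox 110 92 200 116; x_wconf 91'>world</span>"
                + "</span>";
        System.out.println(String.join(" ", extract(sample)));
    }
}
```

A regular expression is enough for a quick look at the output; for production use a real HTML parser would be more robust. When only the text is needed, calling getOCRText avoids this step entirely.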
public String performOCR(String language, BufferedImage image, int imageDPI, int pageNumber) throws OCRException
Parameters:
language - The language to use in performing the OCR. This can be multiple languages separated by a plus character (ex. "eng+fra").
image - The image to process.
Throws:
OCRException

public String performOCR(String language, BufferedImage image, int imageDPI) throws OCRException
Throws:
OCRException

public String performOCR(String language, PDFPage pdfPage, int dpi) throws PDFException, OCRException
Parameters:
language - The language to use in performing the OCR. This can be multiple languages separated by a plus character (ex. "eng+fra").
pdfPage - The PDF page to process.
dpi - Dots per inch at which to render the image.
Throws:
OCRException
PDFException

public String performOCR(String language, PDFPage pdfPage, int dpi, OCROptions options) throws PDFException, OCRException
Parameters:
language - The language to use in performing the OCR. This can be multiple languages separated by a plus character (ex. "eng+fra").
pdfPage - The PDF page to process.
dpi - Dots per inch at which to render the image.
options - The OCR options.
Throws:
PDFException
OCRException

public String getOCRText(String language, PDFPage pdfPage, int dpi, OCROptions options) throws PDFException, OCRException
Parameters:
language - The language to use in performing the OCR. This can be multiple languages separated by a plus character (ex. "eng+fra").
pdfPage - The PDF page to process.
dpi - Dots per inch at which to render the image.
options - The OCR options.
Throws:
PDFException
OCRException

public OCRPageResults performOCRandOSD(String language, PDFPage pdfPage, int dpi, OCROptions options) throws PDFException, OCRException
Parameters:
language - Language to use when running OCR.
pdfPage - The PDF page to run OCR on.
dpi - The image resolution, in dots per inch.
options - The options to use when running OCR.
Throws:
PDFException - For errors converting the PDF page to an image.
OCRException - For OCR errors.

public String getTesseractVersion()
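
The language parameter described above accepts several languages joined by a plus character, as in "eng+fra". A minimal sketch of building that string from individual codes follows; OcrLanguages and join are hypothetical names introduced for illustration and are not part of TessJNI.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper, not part of TessJNI: builds a multi-language string such as "eng+fra".
class OcrLanguages {
    static String join(String... codes) {
        // Trim each code, skip blanks, and join the rest with the plus separator.
        String joined = Arrays.stream(codes)
                .map(String::trim)
                .filter(code -> !code.isEmpty())
                .collect(Collectors.joining("+"));
        if (joined.isEmpty()) {
            throw new IllegalArgumentException("at least one language code is required");
        }
        return joined;
    }

    public static void main(String[] args) {
        // The result would be passed as the language argument of performOCR or getOCRText.
        System.out.println(join("eng", "fra"));
    }
}
```

The helper only formats the string. Whether a given code is usable depends on the language data available to the underlying Tesseract installation, which this sketch does not check.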