public class TessJNI extends Object
Constructor and Description |
---|
TessJNI() |
Modifier and Type | Method and Description |
---|---|
String | getOCRText(String language, PDFPage pdfPage, int dpi, OCROptions options) Performs OCR on a PDF page and returns a plain text result string. |
String | getTesseractVersion() Returns the version of Tesseract that we're running. |
String | performOCR(String language, BufferedImage image, int imageDPI) Deprecated. Use the version of this method that takes a pageNumber argument. |
String | performOCR(String language, BufferedImage image, int imageDPI, int pageNumber) Performs OCR on an image and returns an hOCR result string. |
String | performOCR(String language, PDFPage pdfPage, int dpi) Performs OCR on a PDF page and returns an hOCR result string. |
String | performOCR(String language, PDFPage pdfPage, int dpi, OCROptions options) Performs OCR on a PDF page and returns an hOCR result string. |
OCRPageResults | performOCRandOSD(String language, PDFPage pdfPage, int dpi, OCROptions options) Converts a PDF page to an image and runs OCR on it, returning the OCR text in hOCR format along with page orientation information. |
public String performOCR(String language, BufferedImage image, int imageDPI, int pageNumber) throws OCRException
Parameters:
language - The language to use in performing the OCR. This can be multiple languages separated by a plus character (e.g. "eng+fra").
image - The image to process.
Throws:
OCRException
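The `language` argument accepts several languages joined by a plus character. As a minimal sketch, the helper below builds such a string from a list of codes. `OcrLanguages` and `join` are hypothetical names introduced for illustration; they are not part of TessJNI.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of TessJNI) that builds the language argument. */
final class OcrLanguages {
    private OcrLanguages() {}

    /** Joins language codes with '+', so ["eng", "fra"] becomes "eng+fra". */
    static String join(List<String> codes) {
        if (codes == null || codes.isEmpty()) {
            throw new IllegalArgumentException("At least one language code is required");
        }
        StringBuilder sb = new StringBuilder();
        for (String code : codes) {
            String trimmed = (code == null) ? "" : code.trim();
            // Reject blank codes and codes that already contain the separator.
            if (trimmed.isEmpty() || trimmed.contains("+")) {
                throw new IllegalArgumentException("Invalid language code: " + code);
            }
            if (sb.length() > 0) {
                sb.append('+');
            }
            sb.append(trimmed);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("eng", "fra"))); // prints "eng+fra"
    }
}
```

The resulting string can be passed as the `language` argument of any of the OCR methods on this page.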
public String performOCR(String language, BufferedImage image, int imageDPI) throws OCRException
Deprecated. Use the version of this method that takes a pageNumber argument.
Throws:
OCRException
public String performOCR(String language, PDFPage pdfPage, int dpi) throws PDFException, OCRException
Parameters:
language - The language to use in performing the OCR. This can be multiple languages separated by a plus character (e.g. "eng+fra").
pdfPage - The PDF page to process.
dpi - Dots per inch at which to render the image.
Throws:
OCRException
PDFException
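The string returned by performOCR is hOCR, an HTML-based format in which each recognized word carries its bounding box in a `title` attribute. As a rough sketch, the code below extracts words and boxes with a regular expression. It assumes Tesseract-style markup, with single or double quotes and `class` appearing before `title`. `HocrWords` and `Word` are hypothetical names, not part of this API, and a real HTML parser is a safer choice for production use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of TessJNI) that pulls words out of an hOCR string. */
final class HocrWords {
    private HocrWords() {}

    /** One recognized word and its bounding box in image pixels. */
    static final class Word {
        final String text;
        final int x0, y0, x1, y1;

        Word(String text, int x0, int y0, int x1, int y1) {
            this.text = text;
            this.x0 = x0;
            this.y0 = y0;
            this.x1 = x1;
            this.y1 = y1;
        }
    }

    // Assumed markup shape:
    // <span class='ocrx_word' id='...' title='bbox x0 y0 x1 y1; x_wconf N'>text</span>
    private static final Pattern WORD = Pattern.compile(
        "<span[^>]*\\bclass=['\"]ocrx_word['\"][^>]*\\btitle=['\"]bbox (\\d+) (\\d+) (\\d+) (\\d+)[^>]*>(.*?)</span>",
        Pattern.DOTALL);

    static List<Word> parse(String hocr) {
        List<Word> words = new ArrayList<>();
        Matcher m = WORD.matcher(hocr);
        while (m.find()) {
            // Drop nested inline tags such as <strong>; HTML entities are left undecoded.
            String text = m.group(5).replaceAll("<[^>]+>", "").trim();
            words.add(new Word(text,
                Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3)), Integer.parseInt(m.group(4))));
        }
        return words;
    }

    public static void main(String[] args) {
        String sample =
            "<span class='ocrx_word' id='word_1_1' title='bbox 36 92 96 116; x_wconf 96'>Hello</span> "
          + "<span class='ocrx_word' id='word_1_2' title='bbox 109 92 191 116; x_wconf 95'>world</span>";
        for (Word w : parse(sample)) {
            System.out.println(w.text + " @ " + w.x0 + "," + w.y0 + " to " + w.x1 + "," + w.y1);
        }
    }
}
```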
public String performOCR(String language, PDFPage pdfPage, int dpi, OCROptions options) throws PDFException, OCRException
Parameters:
language - The language to use in performing the OCR. This can be multiple languages separated by a plus character (e.g. "eng+fra").
pdfPage - The PDF page to process.
dpi - Dots per inch at which to render the image.
options - The OCR options.
Throws:
PDFException
OCRException
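The `dpi` argument sets the pixel size of the rendered image, and hOCR bounding boxes are expressed in those pixels. Default PDF user space uses 72 units per inch, so the two are related by a factor of dpi / 72. The sketch below shows the arithmetic. `DpiMath` is a hypothetical helper, not part of this API, and it ignores page rotation and the difference in y-axis direction between image and PDF coordinates.

```java
/** Hypothetical helper (not part of TessJNI) for converting between pixels and PDF points. */
final class DpiMath {
    private DpiMath() {}

    /** Default PDF user space uses 72 units (points) per inch. */
    static final double POINTS_PER_INCH = 72.0;

    /** Pixel length of a distance given in points when rendered at the given dpi. */
    static int toPixels(double points, int dpi) {
        requirePositive(dpi);
        return (int) Math.round(points * dpi / POINTS_PER_INCH);
    }

    /** Length in points of a pixel distance taken from an image rendered at the given dpi. */
    static double toPoints(int pixels, int dpi) {
        requirePositive(dpi);
        return pixels * POINTS_PER_INCH / dpi;
    }

    private static void requirePositive(int dpi) {
        if (dpi <= 0) {
            throw new IllegalArgumentException("dpi must be positive: " + dpi);
        }
    }

    public static void main(String[] args) {
        // A US Letter page is 612 x 792 points; at 300 dpi it renders to 2550 x 3300 pixels.
        System.out.println(toPixels(612, 300) + " x " + toPixels(792, 300)); // prints "2550 x 3300"
        // A bounding-box edge at pixel 300 in a 300 dpi image is one inch, or 72 points.
        System.out.println(toPoints(300, 300)); // prints "72.0"
    }
}
```

Higher dpi values produce larger images, which generally helps recognition of small text at the cost of more memory and processing time.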
public String getOCRText(String language, PDFPage pdfPage, int dpi, OCROptions options) throws PDFException, OCRException
Parameters:
language - The language to use in performing the OCR. This can be multiple languages separated by a plus character (e.g. "eng+fra").
pdfPage - The PDF page to process.
dpi - Dots per inch at which to render the image.
options - The OCR options.
Throws:
PDFException
OCRException
public OCRPageResults performOCRandOSD(String language, PDFPage pdfPage, int dpi, OCROptions options) throws PDFException, OCRException
Parameters:
language - Language to use when running OCR.
pdfPage - The PDF page to run OCR on.
dpi - The image resolution, in dots per inch.
options - The options to use when running OCR.
Throws:
PDFException - For errors converting the PDF page to an image.
OCRException - For OCR errors.
public String getTesseractVersion()