OCR PDF
PDF Studio can run OCR on documents using any of the available OCR languages. OCR adds text to scanned documents or images so that the document can be searched or marked up like any other text document. PDF Studio can also run OCR with two languages at once. For more information on using OCR with two languages, see OCR Preferences.
What is OCR?
Optical character recognition (OCR) is the mechanical or electronic conversion of images of typed or printed text into machine-encoded searchable text data.
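To make the definition concrete, here is a toy sketch of the idea in Python: small bitmaps of characters are compared against known templates, and the closest match is emitted as text. This is only an illustration. The templates, the function names, and the nearest-template approach are assumptions made for this example and do not describe how PDF Studio's OCR engine works internally.

```python
# Toy illustration only: real OCR engines are far more sophisticated.
# Each glyph is a 3x5 bitmap written as rows of '#' (ink) and '.' (blank).
TEMPLATES = {
    "0": ("###", "#.#", "#.#", "#.#", "###"),
    "1": (".#.", "##.", ".#.", ".#.", "###"),
    "7": ("###", "..#", ".#.", ".#.", ".#."),
}


def pixel_distance(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(
        pa != pb
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def recognize_glyph(bitmap):
    """Return the template character whose bitmap is closest to the input."""
    return min(TEMPLATES, key=lambda ch: pixel_distance(TEMPLATES[ch], bitmap))


def recognize_line(glyphs):
    """Turn a sequence of glyph bitmaps into a text string."""
    return "".join(recognize_glyph(g) for g in glyphs)


# A "scanned" line of three glyphs; the first has one noisy pixel.
scanned = [
    ("###", "#.#", "#.#", "#.#", "##."),  # noisy "0"
    (".#.", "##.", ".#.", ".#.", "###"),  # "1"
    ("###", "..#", ".#.", ".#.", ".#."),  # "7"
]

print(recognize_line(scanned))  # prints "017"
```

Even with one damaged pixel, the first glyph is still closest to the "0" template, which is why cleaner input images tend to give more reliable results.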
From Existing Document
Text can be added to an existing document using OCR.
- Launch PDF Studio and open the PDF document that you wish to add searchable text to
- Go to the Document Tab > OCR from the toolbar
- From the Language drop down select the language you wish to use
- Note: The first time you use OCR, you will need to download the language packs. To do so, click on “Download OCR Languages”, then select the languages you wish to use and click on “Download”.
- Select the Page Range and Resolution that you wish to use
- Note: A resolution of 300 DPI produces good OCR results for most images. For scans that contain noise, a lower DPI setting may reduce the noise and give better OCR results.
- Choose additional options
- Discard Invisible Text - removes any previous OCR text that has been added to the page.
- Auto Deskew Images - when checked, if the document’s text or images are slanted too far in one direction or are misaligned, PDF Studio will attempt to rotate the document automatically to correct the alignment.
- Click on “OK” to begin the OCR process
- You will see a progress dialog showing the current page being processed. Once complete, click on “OK” to close the dialog
- Your document is now ready to be searched, edited, or marked up with highlight, underline, cross-out, or caret annotations.
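The Resolution setting in the steps above determines how many pixels each page image contains. The short Python sketch below works through that arithmetic for a US Letter page (8.5 x 11 inches). The page size and the helper names `raster_size` and `megapixels` are assumptions chosen for illustration; they are not part of PDF Studio.

```python
def raster_size(width_in, height_in, dpi):
    """Pixel dimensions of a page rasterized at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


def megapixels(width_in, height_in, dpi):
    """Total pixel count of the rasterized page, in millions of pixels."""
    w, h = raster_size(width_in, height_in, dpi)
    return w * h / 1_000_000


# Compare a few common resolutions for an assumed US Letter page.
for dpi in (150, 300, 600):
    w, h = raster_size(8.5, 11, dpi)
    print(f"{dpi} DPI: {w} x {h} px ({megapixels(8.5, 11, dpi):.1f} MP)")
```

At 300 DPI the page is 2550 x 3300 pixels. Doubling the resolution quadruples the pixel count, so higher settings mean more data to process per page.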
When Scanning a Document
OCR can add text to a document while it is being scanned with PDF Studio.
- Start the Scanning Dialog as normal
- In the scanning dialog you will see an option to OCR the document after scanning
- From the Language drop down select the language you wish to use
- Note: The first time you use OCR, you will need to download the language packs. To do so, click on “Download OCR Languages”, then select the languages you wish to use and click on “Download”.
- After setting all of your scanning and OCR settings click on “Scan” to begin scanning the document
- Once the scanning completes, the OCR process will begin and you will see a progress dialog showing the current page being processed. Once complete, click on “OK” to close the dialog
- Your document is now ready to be searched, edited, or marked up with highlight, underline, cross-out, or caret annotations.
Available OCR Languages
The following language dictionary files are available for download directly from within PDF Studio's OCR functions. Using the appropriate language file will improve the accuracy of OCR results. See Tips on Improving OCR Results for additional information.
- Afrikaans
- Albanian – shqip
- Arabic – العربية
- Azerbaijani – azərbaycan
- Basque – euskara
- Belarusian – беларуская
- Bengali – বাংলা
- Bulgarian – български
- Catalan – català
- Cherokee
- Chinese (Simplified) – 中文(简体中文)
- Chinese (Traditional) – 中文(繁體)
- Croatian – hrvatski
- Czech – čeština
- Danish – Dansk
- Danish (Fraktur) – Dansk (Fraktur)
- Dutch – Nederlands
- English
- Estonian – eesti
- Finnish – suomi
- French – Français
- Galician – galego
- German – Deutsch
- Greek – Ελληνικά
- Hebrew – עברית
- Hindi – हिन्दी
- Hungarian – magyar
- Icelandic – íslenska
- Indonesian – Bahasa Indonesia
- Italian – Italiano
- Italian (Old) – italiano vecchio
- Japanese – 日本語
- Kannada – ಕನ್ನಡ
- Korean – 한국어
- Latvian – latviešu
- Lithuanian – lietuvių
- Macedonian – македонски
- Malay – Bahasa Melayu
- Malayalam – മലയാളം
- Maltese – Malti
- Math / Equations
- Norwegian – Norsk
- Polish – polski
- Portuguese – Português
- Romanian – română
- Russian – русский
- Serbian – српски
- Slovakian – slovenčina
- Slovakian (Fraktur) – slovenčina (Fraktur)
- Slovenian – slovenščina
- Spanish – Español
- Spanish (Old) – español (Antiguo)
- Swahili – Kiswahili
- Swedish – svenska
- Tagalog
- Tamil – தமிழ்
- Telugu – తెలుగు
- Thai – ไทย
- Turkish – Türkçe
- Ukrainian – українська
- Vietnamese – Tiếng Việt