The PDF Text to Variable node creates variables from text content stored within a PDF. You first set the areas or regions to extract text from on a template PDF document. Once these are set, each variable is populated, as documents are processed, with the text found at the location defined for that variable.
To set the regions in a PDF, launch the Text to Variable tool and select a template document to use as a reference when defining the locations to extract text from. The template document is for reference only.
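The idea of region-based extraction can be illustrated with a minimal sketch. It is not the server's implementation: the `Region` and `Word` classes, the coordinate convention (origin at the top-left, word positions given by their center point), and the `text_in_region` function are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical rectangle drawn on the template document."""
    page: int
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


@dataclass(frozen=True)
class Word:
    """Hypothetical extracted word, positioned by its center point."""
    text: str
    page: int
    x: float
    y: float


def text_in_region(words, region):
    """Join the words whose center falls inside the region, in reading order."""
    hits = [w for w in words
            if w.page == region.page and region.contains(w.x, w.y)]
    hits.sort(key=lambda w: (w.y, w.x))  # top to bottom, then left to right
    return " ".join(w.text for w in hits)


words = [
    Word("Invoice", 1, 50, 20),
    Word("12345", 1, 110, 20),
    Word("Total", 1, 50, 300),
]
invoice_region = Region(page=1, x=40, y=10, width=100, height=20)
print(text_in_region(words, invoice_region))
```

Because the region is stored as coordinates rather than as text, the same definition can be applied to every processed document that shares the template's layout.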
Variable Name: The name of the variable to be created
Type: PAS currently supports the following user defined data types for variables
Format: Sets the default format for the variable. The format can also be set dynamically at output time using Variable Formatting
Required: Sets whether the variable must contain a value for the flow to continue. When set to true, Trouble Handling is triggered if no variable is created on this node. When set to false and no value is found, an empty string is passed through.
Regular Expression: Allows you to use a regular expression to perform more advanced search queries
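The Required behavior described above can be sketched as follows. This is only an illustration: the `TroubleHandling` exception and the `resolve_variable` function are hypothetical stand-ins, and treating whitespace-only text as "no value" is an assumption.

```python
class TroubleHandling(Exception):
    """Hypothetical stand-in for the flow's Trouble Handling path."""


def resolve_variable(name, extracted_text, required):
    """Return the variable value, applying the Required rule."""
    # Assumption: text that is missing or only whitespace counts as no value.
    value = (extracted_text or "").strip()
    if value:
        return value
    if required:
        raise TroubleHandling(f"required variable '{name}' has no value")
    return ""  # not required: pass an empty string through


print(repr(resolve_variable("invoice_number", "INV-20431", required=True)))
print(repr(resolve_variable("po_number", None, required=False)))
```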
Examples:
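As a hedged illustration of how a regular expression can narrow down the text found in a region, the sketch below uses Python's `re` module. The regular expression flavor used by the node is not stated here, so the patterns stick to syntax common to most engines. The sample text, the patterns, and the `apply_pattern` function are hypothetical, as is the rule of returning the first capture group when one is present.

```python
import re


def apply_pattern(region_text, pattern):
    """Return the matched text, or an empty string when nothing matches."""
    match = re.search(pattern, region_text)
    if match is None:
        return ""
    if match.re.groups:
        # Assumption: when the pattern has groups, the first one is the value.
        return match.group(1) or ""
    return match.group(0)


region_text = "Invoice No: INV-20431 Date: 03/15/2024"

print(apply_pattern(region_text, r"INV-\d+"))
print(apply_pattern(region_text, r"Date:\s*(\d{2}/\d{2}/\d{4})"))
```

The first pattern keeps only the invoice number; the second uses a capture group to drop the `Date:` label and keep the date itself.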
Template text: Preview of the text found in the area selected
Variables require a default format to be set in order to be used by the workflow. You can use one of the predefined formats in the drop-down or write your own using the following formatting symbols:
NO formatting options, plain text strings only
0 = digit placeholder (000 = 001)
# = digit no placeholder (### = 1)
. = decimal separator (#.# = 0.5)
, = grouping separator (#,### = 1,000)
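To make the number symbols concrete, here is a minimal sketch of a formatter that reproduces the four examples listed above. It is not the server's formatter: `format_number` is a hypothetical name, and the sketch assumes half-up rounding, grouping in blocks of three, and at least one integer digit always being shown (which is what the `#.# = 0.5` example suggests).

```python
from decimal import Decimal, ROUND_HALF_UP


def _group_thousands(digits):
    """Insert a comma between every block of three digits, from the right."""
    parts = []
    while len(digits) > 3:
        parts.insert(0, digits[-3:])
        digits = digits[:-3]
    parts.insert(0, digits)
    return ",".join(parts)


def format_number(value, pattern):
    int_pat, _, frac_pat = pattern.partition(".")
    min_int = max(int_pat.count("0"), 1)  # assumption: always show one digit
    grouping = "," in int_pat
    min_frac = frac_pat.count("0")        # 0 = digit always shown
    max_frac = len(frac_pat)              # 0 and # both allow a digit

    quantum = Decimal(1).scaleb(-max_frac)  # 1, 0.1, 0.01, ...
    rounded = Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)
    sign = "-" if rounded < 0 else ""

    int_digits, _, frac_digits = format(abs(rounded), "f").partition(".")
    int_digits = int_digits.zfill(min_int)
    if grouping:
        int_digits = _group_thousands(int_digits)
    # Drop optional trailing zeros, then restore the required ones.
    frac_digits = frac_digits.rstrip("0").ljust(min_frac, "0")

    return sign + int_digits + ("." + frac_digits if frac_digits else "")


for value, pattern in [(1, "000"), (1, "###"), (0.5, "#.#"), (1000, "#,###")]:
    print(pattern, "=", format_number(value, pattern))
```

Symbols can be combined, so under the same assumptions a pattern such as `#,##0.00` would render 1234.5 as `1,234.50`.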
yy = two-digit year
yyyy = four-digit year
m = month (1-12)
mm = two-digit month (01-12)
mmm = three letter month abbreviation (01=Jan)
mmmm = full month name (01=January)
d = day of month (1-31)
dd = two-digit day of month (01 - 31)
H = hours (0-23)
HH = two digit hour (00 - 23) (AM/PM NOT allowed)
M = minutes (0-59)
MM = two digit minute (00 - 59)
s = seconds (0-59)
ss = two digit second (00 - 59)
h = hour (0 - 12 | 12-hour AM/PM format)
hh = two digit hour (00 - 12 | 12-hour AM/PM format)
tt = AM/PM based on the time
zz = time zone (Pacific Daylight Time = PDT)
zzzz = time zone (Pacific Daylight Time)
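The date and time symbols above can be read as a small token table. The sketch below shows one way to apply such a table; it is an illustration, not the server's formatter. `format_date` is a hypothetical name, English month names are assumed, the 12-hour symbols follow the conventional clock where midnight and noon are shown as 12, and the time zone symbols (`zz`, `zzzz`) are left out. Any letter in the pattern that matches a symbol is replaced, so this sketch does not support literal text containing those letters.

```python
import re
from datetime import datetime

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def _hour12(dt):
    # Assumption: conventional 12-hour clock (0 and 12 are both shown as 12).
    return dt.hour % 12 or 12


TOKENS = {
    "yyyy": lambda dt: f"{dt.year:04d}",
    "yy": lambda dt: f"{dt.year % 100:02d}",
    "mmmm": lambda dt: MONTHS[dt.month - 1],
    "mmm": lambda dt: MONTHS[dt.month - 1][:3],
    "mm": lambda dt: f"{dt.month:02d}",
    "m": lambda dt: str(dt.month),
    "dd": lambda dt: f"{dt.day:02d}",
    "d": lambda dt: str(dt.day),
    "HH": lambda dt: f"{dt.hour:02d}",
    "H": lambda dt: str(dt.hour),
    "MM": lambda dt: f"{dt.minute:02d}",
    "M": lambda dt: str(dt.minute),
    "ss": lambda dt: f"{dt.second:02d}",
    "s": lambda dt: str(dt.second),
    "hh": lambda dt: f"{_hour12(dt):02d}",
    "h": lambda dt: str(_hour12(dt)),
    "tt": lambda dt: "AM" if dt.hour < 12 else "PM",
}

# Longest symbols first, so "mmmm" is matched before "mmm", "mm", and "m".
TOKEN_RE = re.compile("|".join(sorted(TOKENS, key=len, reverse=True)))


def format_date(dt, pattern):
    """Replace each formatting symbol in the pattern with its value."""
    return TOKEN_RE.sub(lambda match: TOKENS[match.group(0)](dt), pattern)


sample = datetime(2024, 3, 5, 14, 7, 9)
print(format_date(sample, "yyyy-mm-dd HH:MM:ss"))
print(format_date(sample, "mmmm d, yyyy"))
print(format_date(sample, "h:MM tt"))
```

Note that the symbols are case-sensitive: lowercase `m` and `mm` are months, while uppercase `M` and `MM` are minutes.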
True / False
1 / 0 = binary format
yes / no
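The three boolean formats amount to a choice between two labels. A minimal sketch, assuming the format is identified by the strings listed above (`BOOLEAN_FORMATS` and `format_boolean` are hypothetical names):

```python
# Hypothetical lookup: format name -> (text for true, text for false)
BOOLEAN_FORMATS = {
    "True / False": ("True", "False"),
    "1 / 0": ("1", "0"),
    "yes / no": ("yes", "no"),
}


def format_boolean(value, fmt):
    """Render a boolean using one of the predefined formats."""
    true_text, false_text = BOOLEAN_FORMATS[fmt]
    return true_text if value else false_text


print(format_boolean(True, "yes / no"))
print(format_boolean(False, "1 / 0"))
```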
Qoppa Software's PDF Automation Server for Windows, Linux, Unix, and macOS
Automate PDF Document Workflows through RESTful Web Services & Folder Watching
Copyright © 2002-Present Qoppa Software. All rights reserved.