Batch Processing Guide

EMC ® Document Sciences ®
xPression ©
Version 4.6
EMC Corporation
Corporate Headquarters
Hopkinton, MA 01748-9103
1-508-435-1000
www.EMC.com
Legal Notice
Copyright © 2003-2015 EMC Corporation. All Rights Reserved.
EMC believes the information in this publication is accurate as of its publication date. The information is subject to change
without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS
OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY
DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.
For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. Adobe and Adobe PDF
Library are trademarks or registered trademarks of Adobe Systems Inc. in the U.S. and other countries. All other trademarks
used herein are the property of their respective owners.
Documentation Feedback
Your opinion matters. We want to hear from you regarding our product documentation. If you have feedback
about how we can make our documentation better or easier to use, please send us your feedback directly at
ECD.Documentation.Feedback@emc.com
Table of Contents
Preface
Chapter 1  xPression Batch Processing
  xPression Handling Process for Batch Jobs
    Understanding Job Definitions
  An In-Depth Look at xPression Batch
    How xPression Batch Fits into Your Environment
    Important Information About Your Data
    Oracle Databases and Batch
  Multi-Threading in xPression
    An In-Depth Look at Multi-Threading in xPression Batch
    Configuring Multi-Threading
      Configure BatchRunner.properties
      Configure xDashboard
  xPression Batch and Subdocuments
  Debugging Your Batch Environment
Chapter 2  Command Line Batch Processing
  Command Line Parameters
  Overriding Data Sources
    Data Override with xRevise Custom Batch
  Debugging Your Batch Environment
    Increasing Maximum Message Size
    Increasing Memory Size
  Running Traditional Chinese Jobs From the Command Line
Chapter 3  Manually Creating Your Job Definition
  Simple xPresso Job Definition Sample
  Simple xDesign Job Definition Sample
  Complex Job Definition Sample
  xPressionPublishJob Element
  Pre and Post-Processing Scripts Element
  Instantiate Element
  UseBDT Element
  Parameters Element
  xPRSInstantiate Element
  JobLog Element
  If You are Using a DB2 Database
Chapter 4  Sample Batch Trigger File
  Sample File
Chapter 5  Batch Event Monitor
  Implementing the Event Notification Monitor
  Event Notification Sequence
  LifecycleListenerFactory Abstract Class
    BatchLifecycleListener
    LifecycleListenerFactory
    Sample LifecycleListenerFactory Abstract Class
  BatchLifecycleListener Interface
    Sample BatchLifecycleListener Interface
  BatchInfo Class
    Attributes
    Methods
  BatchError Class
    Attributes
    Methods
  DocInfo Class
    Attributes
    Methods
  EventListenerException Class
    Methods
  FactoryConfigurationException Class
    Methods
List of Figures
Figure 1. Queues and threads that are created during batch job processing
Preface
This guide introduces you to the batch process in xPression. This guide provides an overview of the
batch process, shows you how to run batch from the command line, and explains how previously
run jobs work. Additionally, this guide provides a sample trigger file that you can review and
instructions for creating a batch event monitor.
Intended Audience
The guide is intended for users responsible for generating xPression outputs by running batch jobs.
Conventions
The following conventions are used in this document:
Font Type    Meaning
boldface     Graphical user interface elements associated with an action
italic       Book titles, emphasis, or placeholder variables for which you supply particular values
monospace    Commands within a paragraph, URLs, code in examples, text that appears on the screen,
             or text that you enter
xPressionHome
The term “xPressionHome” refers to the location where xPression was installed on your server. On
Windows servers, the default location is C:\xPression.
Revision History
The following changes have been made to this document.
Revision Date    Description
November 2015    Initial publication
Chapter 1
xPression Batch Processing
Running jobs in batch mode enables you to produce large volumes of personalized xPression
documents in a single batch run. You can run batch jobs in three ways:
• Schedule batch jobs to run as unattended jobs
• Manually initiate the batch job from the command line
• Run the batch job directly from Job Management
xPression Handling Process for Batch Jobs
When running in batch mode, xPression produces one or more personalized documents from a set of
customer data records. When you run a batch job, you are executing the Batch Runner command,
passing along parameters and a job definition.
Understanding Job Definitions
The job definition is the key component of a batch job in xPression. It controls everything about
your batch job except when and where it executes. Job definitions are XML files that provide the
following information:
• Which documents to select
• How to select them
• Which data source to use
• How to process the documents
• How to customize batch reports and batch logs
Job definitions can be created through Job Management or composed manually using a text or XML
editor.
Job definitions generated by Job Management are stored in the content repository. Manually
composed job definitions are stored on your file system as an XML file. To learn more about creating
job definitions, see the xDashboard User Guide.
An In-Depth Look at xPression Batch
The following figure shows an in-depth look at the entire batch process.
Figure 1. Queues and threads that are created during batch job processing
The main job thread created by Batch Runner starts an xPression component that reads your
customer data, and based on the instructions specified in the job definition, executes the job. It
then sends the job record to the Task Queue. The customer data reader reads a block of customer
data, which the batch component starts processing by launching a configurable number of parallel
threads, each of which invokes an instance of the assembly engine. While the assembly engine
threads are assembling personalized documents from the block of data read, xPression reads in the
next block. All of these threads feed into a single thread to the Output Profile Controller, which in
turn calls the composition engines.
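The flow described above can be sketched in a few lines of Python. This is only an illustration of the pattern (read a block, assemble it on parallel threads, read the next block in the meantime, and feed everything into a single output step). It is not xPression code: the names read_blocks, assemble, and run_batch are hypothetical, and the assemble function is a stand-in for a real assembly engine call.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def read_blocks(records, block_size):
    """Yield successive blocks of records (stand-in for the customer data reader)."""
    iterator = iter(records)
    while True:
        block = list(islice(iterator, block_size))
        if not block:
            return
        yield block


def assemble(record):
    """Hypothetical stand-in for one assembly engine call."""
    return f"document for {record}"


def run_batch(records, block_size=3, assembly_threads=2):
    """Assemble blocks on parallel threads and collect results in a single place."""
    output = []   # stand-in for the single thread that feeds the output stage
    pending = []  # futures of the block that is currently being assembled
    with ThreadPoolExecutor(max_workers=assembly_threads) as pool:
        for block in read_blocks(records, block_size):
            # Start assembling this block on the worker threads.
            futures = [pool.submit(assemble, record) for record in block]
            # Collect the previous block; the next loop iteration then reads
            # a new block while the current one is still being assembled.
            output.extend(future.result() for future in pending)
            pending = futures
        output.extend(future.result() for future in pending)
    return output


if __name__ == "__main__":
    for document in run_batch(["r1", "r2", "r3", "r4", "r5"], block_size=2):
        print(document)
```

The block size and thread count in this sketch are arbitrary values chosen for readability. In xPression, the corresponding settings are configured as described in Configuring Multi-Threading.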
How xPression Batch Fits into Your Environment
A batch job definition contains your specifications for the customer data source, document source,
publisher, and error log preferences. When processing a batch job, xPression merges this information
with a previously defined output profile that directs your batch job to the correct output format
and output device.
The left side of the diagram in Figure 1, page 10 shows that the job definition and output profile
are sent as input to the batch process.
xPression produces one or more output streams, each of which contains a set of personalized
documents in a specific output format, targeted for a specific output device.
The following table shows how output management and job definition elements control your output.
xPression Element          What it Contributes to Your Output
Job Definition             Selects customer data, document data, creates reports, and customizes
                           the batch error log.
Output Definition/         Determines the output format of the documents in the batch job. This
Format Definition          definition is passed along as part of an output profile.
Distribution Definition    Sends the documents to a specific output device for publishing. This
                           definition is passed along as part of an output profile. xPression Batch
                           does not support the Return to Caller distribution definition.
Important Information About Your Data
Your data is defined in your data sources as everything between one delimiter node and the next.
For example, one of the sample XML data sources supplied with xPression, AUTOPAY.xml, uses
<Transaction> and </Transaction> to mark the start and end of data blocks.
xPression Batch stops processing data for a single customer record set when the number of bytes
read reaches the XMLCUSTOMERDATA_MAXRECORDLEN setting in the customerdata.properties
file. To ensure that all the XML data is read for a single customer record set, set the
XMLCUSTOMERDATA_MAXRECORDLEN value higher than the byte length of your longest
customer record set.
xPression Batch sets a default memory size of 20 MB, which is used to store all of the items selected
for inclusion in a customer’s document. This value is stored in the customerdata.properties file.
In most cases, this will be enough memory to successfully assemble the output. If you need to
increase the memory setting, change the XMLCUSTOMERDATA_MAXRECORDLEN value in the
customerdata.properties file.
XMLCUSTOMERDATA_BUFSIZE = 5000
XMLCUSTOMERDATA_MAXRECORDLEN=20000000
The XMLCUSTOMERDATA_BUFSIZE setting in the customerdata.properties file sets the upper limit
for the number of bytes of memory used to process a customer record. If the
XMLCUSTOMERDATA_MAXRECORDLEN setting is smaller than the XMLCUSTOMERDATA_BUFSIZE setting,
xPression will read in and process a complete customer record, and then begin to read the next record
into the remaining memory. When the remaining memory is too small to successfully process
the second customer record, xPression issues an error. To avoid partial processing of customer
records, make sure that the XMLCUSTOMERDATA_BUFSIZE setting is always smaller than the
XMLCUSTOMERDATA_MAXRECORDLEN setting.
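If you want to check these two rules before a batch run, a small script can do it. The following Python sketch is an assumption-laden illustration, not an xPression utility: parse_properties is a simplified reader for key=value lines (it does not cover the full Java properties syntax), check_customer_data_settings is a hypothetical helper, and longest_record_bytes is a number you would have to measure from your own data.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_customer_data_settings(props, longest_record_bytes):
    """Return a list of problems found in the two settings (empty if none)."""
    bufsize = int(props["XMLCUSTOMERDATA_BUFSIZE"])
    maxlen = int(props["XMLCUSTOMERDATA_MAXRECORDLEN"])
    problems = []
    if bufsize >= maxlen:
        problems.append(
            "XMLCUSTOMERDATA_BUFSIZE should be smaller than XMLCUSTOMERDATA_MAXRECORDLEN"
        )
    if maxlen <= longest_record_bytes:
        problems.append(
            "XMLCUSTOMERDATA_MAXRECORDLEN should exceed the longest customer record set"
        )
    return problems


if __name__ == "__main__":
    sample = """
    XMLCUSTOMERDATA_BUFSIZE = 5000
    XMLCUSTOMERDATA_MAXRECORDLEN=20000000
    """
    # 150000 is an invented record length used only for this example.
    print(check_customer_data_settings(parse_properties(sample), 150000))
```

With the sample values shown earlier in this section, the check reports no problems as long as your longest customer record set is shorter than 20,000,000 bytes.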
Oracle Databases and Batch
If your customer data is in an Oracle database and you encounter random dropped records when
running a large batch job, you may need to adjust some parameters in the init.ora file. See your
xPression installation documentation.
Multi-Threading in xPression
The concept behind multi-threading is to enable xPression to process more than one customer record
at a time. xPression’s batch capabilities were designed to be thread-safe, enabling xPression to open
up several concurrent processing threads, significantly improving batch performance. xPression
sends these multiple threads to xPublish, which is thread-safe. See the following topics:
• An In-Depth Look at Multi-Threading in xPression Batch, page 12
• Configuring Multi-Threading, page 14
An In-Depth Look at Multi-Threading in xPression Batch
xPression uses its main thread, the batch job reading thread, to launch an instance of the customer
data reader. The customer data reader parses data according to the instructions contained in the job
definition. After the customer data reader reads a block of data, xPression processes it by launching a
configurable number of parallel threads. Each thread invokes an instance of the assembly engine.
While the assembly engine threads are assembling personalized documents for one block of customer
data, the customer data reader reads in the next block.
The following figure shows how the batch component moves customer data into the assembly engine.
These threads feed into the multiple worker threads of the Streamer, which in turn calls xPublish.
The TaskQueue synchronizes the work of BatchRunner, the customer data reader, the assembly
engine, and the Streamer. It is implemented using the Java class SyncQueue().
The TaskQueue is a data synchronization channel between Batch Runner’s main thread (also known
as the batch job reading thread) and the assembly engine threads. The OPCQueue is the data
synchronization channel between the assembly engine threads and the Streamer. Both queues have
two configurable thresholds: EnqueueThreshold and DequeueThreshold.
The following figure shows TaskQueue.
When the number of records in a queue exceeds the EnqueueThreshold value, the component that is
moving records into the queue must stop and wait until the number of records has fallen to the
DequeueThreshold value. When the number of records falls to the DequeueThreshold value,
xPression signals that component to start moving records into the queue again.
You can set this threshold in your xDashboard job definition page.
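The two-threshold behavior can be illustrated with a short Python sketch. This is not the SyncQueue implementation, whose internals are not shown in this guide. The ThresholdQueue class and its attribute names are hypothetical and only demonstrate the pattern: the producer is paused once the queue grows past the enqueue threshold, and it is released only after consumers have drained the queue down to the dequeue threshold.

```python
import threading
from collections import deque


class ThresholdQueue:
    """Queue whose producer pauses above one threshold and resumes at a lower one."""

    def __init__(self, enqueue_threshold, dequeue_threshold):
        # Assumption of this sketch: the resume level is below the pause level.
        if dequeue_threshold >= enqueue_threshold:
            raise ValueError("dequeue_threshold must be lower than enqueue_threshold")
        self._items = deque()
        self._cond = threading.Condition()
        self._enqueue_threshold = enqueue_threshold
        self._dequeue_threshold = dequeue_threshold
        self.paused = False

    def put(self, item):
        """Add an item; block while the producer side is paused."""
        with self._cond:
            while self.paused:
                self._cond.wait()
            self._items.append(item)
            # Pause the producer once the queue exceeds the enqueue threshold.
            if len(self._items) > self._enqueue_threshold:
                self.paused = True
            self._cond.notify_all()  # wake consumers waiting for an item

    def get(self):
        """Remove an item; release the producer at the dequeue threshold."""
        with self._cond:
            while not self._items:
                self._cond.wait()
            item = self._items.popleft()
            if self.paused and len(self._items) <= self._dequeue_threshold:
                self.paused = False
                self._cond.notify_all()  # wake the paused producer
            return item


if __name__ == "__main__":
    queue = ThresholdQueue(enqueue_threshold=3, dequeue_threshold=1)
    for record in range(4):
        queue.put(record)
    print("paused after 4 puts:", queue.paused)
    for _ in range(3):
        queue.get()
    print("paused after 3 gets:", queue.paused)
```

The gap between the two thresholds keeps the producer from starting and stopping on every single record. The values 3 and 1 are arbitrary and chosen only to keep the example short.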
Configuring Multi-Threading
The settings that control multi-threading reside in the BatchRunner.properties file and on the xDashboard
Job Definition page. You should configure xPression Batch to run most efficiently on your system.
• Configure BatchRunner.properties, page 14
• Configure xDashboard, page 14
Configure BatchRunner.properties
BatchRunner.properties is located in the xPression installation directory on your application server. It
contains the TaskQueue_EnqueueThreshold and TaskQueue_DequeueThreshold properties.
TaskQueue_EnqueueThreshold controls the size of TaskQueue, which is the maximum number of
customer job records that can be queued up for assembly. When the TaskQueue surpasses this value,
the customer data reader ceases reading new blocks of data until the DequeueThreshold is met. EMC
Document Sciences recommends a starting value in the tens. A large value can consume a lot of
memory. Adjust this number as needed by your requirements.
The customer data reader reads in blocks of data to the TaskQueue. If too many records accumulate
in the TaskQueue, the customer data reader stops reading new blocks of data until the number of
records in the queue match the TaskQueue_DequeueThreshold value. When the number of records in
the TaskQueue matches or is lower than this value, the customer data reader resumes reading new
blocks of customer data records. EMC Document Sciences recommends a starting value of 5. For more
information on TaskQueue, see An In-Depth Look at Multi-Threading in xPression Batch, page 12.
BatchRunner.properties also contains two properties that should not be altered:
OPCQueue_DequeueThreshold and OPCQueue_EnqueueThreshold. Please leave these properties at
their default values.
The recommendations provided above may not be ideal for your specific installation because of
various factors, such as CPU speed, available memory, and I/O speed. You may need to adjust the
configurations based on observed performance.
Configure xDashboard
The xDashboard Job Definition page contains performance parameters. These parameters configure
multi-threading in xPression:
• Instantiation Thread Pool Size: Defines the number of worker threads available
for each batch run. The Customer Data Reader and xPression Assembly components of the
batch process use this setting to distribute customer records across parallel threads to improve
performance.
• Streaming Thread Pool Size: Controls the number of threads used for document output.
If you do not define a value, the number of threads equals the number of streams in the
Output Profile. If there are many streams, setting a small thread count may help to avoid
OutOfMemory errors.
• Customer Record Buffer: Number of customer records the main batch thread reads in
at a time.
• Job Level: Indicates what type of information will be collected for your job.
  Statistics: Collects only batch statistics, such as start time, end time, and publish type.
  Statistics with errors: Collects all the statistics information and information about failed
  customer documents.
  Statistics with details: Collects all the statistics information and customer document
  information for all documents.
xPression Batch and Subdocuments
When using data override with subdocuments, the customer data source for both master and
subdocument must use the same schema.
Debugging Your Batch Environment
If you are experiencing problems while processing your jobs with xPression Batch, your first step is
to discover the source of the error.
Often, the easiest way to accomplish this is to check for errors in the log file specified on the Job Log
tab of your job definition. This log file should contain a list of errors encountered by the batch process.
When you run batch jobs from the command line or through web service calls, errors and other
events are recorded in the job log only. When you run jobs from xDashboard, the messages in the
job log are also recorded in the xPression log.
From 4.6, xPression supports custom distribution types. When you run batch jobs using your custom
distribution types, do not manually remove or modify the xPublishTempDir/BatchDisItem
folder and the files in it. Otherwise, your batch jobs might fail. xPublishTempDir refers to the folder
for xPression-generated temporary files, which you can define in xPressionPublish.properties. By
default, the folder is xPressionHome/Publish/Temp.
From 4.5 SP1, the job log can include additional types of warning and error messages that are
generated by assembly calls from batch jobs, for example, warnings about content missing during
composition. Some of these messages are logged as errors although they are actually warnings.
Before 4.5 SP1, these messages were found only in the xPression log. The inclusion of these
messages in the job log is controlled by the property “JobLogAddExtendedServerMessages” in the
BatchRunner.properties file. Set the property value to “on” to include these messages and “off” to
exclude them. The default value is “on”. The BatchRunner.properties file is located in the xPression
installation directory.
For jobs executed in xPression versions before 4.5 SP1, setting JobLogAddExtendedServerMessages
to “on” might change the status of the jobs and their exit or return codes. For example, newly
included errors might lead to the exceeding of the Error Threshold value you have set, and a job that
previously succeeded could seemingly fail. Or a job that ran successfully previously could complete
with warnings. In those cases, you can set JobLogAddExtendedServerMessages to “off” to ignore
the log messages from the server as they are actually all warning messages. The exclusion of those
messages will not affect the outcome of a job.
Chapter 2
Command Line Batch Processing
For Windows operating systems, EMC Document Sciences provides BatchRunner as a batch file. For
UNIX platforms, EMC Document Sciences provides Batch Runner as a shell script.
To run xPression Batch directly from the command line, navigate to the xPression .ear directory in
your application server installation directory and type the BatchRunner command. The BatchRunner
command has the following syntax:
• For Windows operating systems, type the following command:
BatchRunner -i JobID -j JobDefName -f JobDefLocation -q
OutDataOverrideFile -o InDataOverrideFile -n BatchParameter -p
OutputFilePath -ignoreDebug -d DiagnosticOutput -disableReporting
-r JobRunIDOverride
• For UNIX operating systems, the command is case sensitive. Type the following command:
$ BatchRunner.sh -i JobID -j JobDefName -f JobDefLocation
-q OutDataOverrideFile -o InDataOverrideFile -n BatchParameter -p
OutputFilePath -ignoreDebug -d DiagnosticOutput -disableReporting -r
JobRunIDOverride
For example, if you installed WebSphere to the C:\ directory, your BatchRunner command would
look like this:
C:\WebSphere\appServer\installedApps\xPression.ear\BatchRunner -j NightlyBatchRun
The default location of xPression Batch output is: <xPressionHome>\Publish\Output\
Command Line Parameters
The following table contains descriptions of the BatchRunner parameters.
Parameter
Definition
-i
This parameter identifies the Job ID. Follow this parameter with the ID of the
job definition as it appears on the Job Management tab. You can use this
parameter instead of the -j parameter. If you use both parameters, only the -i
parameter will be used.
-j
This parameter identifies the job definition you want to run. Use this
parameter if you created your job definition in xDashboard. Follow the -j
parameter with the name of the job definition. You cannot use this parameter
if you are also using the -f parameter. If you created your job definition
manually, see the instructions for the -f parameter.
NOTE: If your job definition name contains more than one word, you must
enclose the name in double quotes. For example, "Sample Job Def".
-f
(Optional) This parameter identifies the location of a manually created
job definition. You should only use this parameter if you created your job
definition manually.
Follow the -f parameter with the fully qualified name and location of the job
definition file. Do not use this parameter if you created your job definition
with xDashboard. You cannot use this parameter if you are also using the
-j parameter.
-q
(Optional) This parameter creates a list of the data sources used in the batch
run. If you follow this parameter with an output file name, xPression will
write the list to the file. If you do not supply a file name, xPression will echo
the results to the user screen.
By specifying a filename, you can edit this file to change the data source
designations and import it into the job definition using the -o parameter.
For more information, see Overriding Data Sources, page 20.
-o
(Optional) This parameter enables you to override the data sources of a batch
run. Typically, you will first create a data override file with the -q parameter.
You can then edit this file to change the data source designations, and import
it with the -o parameter.
Supply the name of the file you want to import after the -o parameter.
For more information, see Overriding Data Sources, page 20.
-n
(Optional) The BatchParameter. This parameter enables you to add an
identifier to your report file, log file, and print file names. This identifier
helps you identify output files for a particular job. This is especially helpful
when output from more than one job resides in the same directory. Follow
the -n parameter with text to identify the output files of a batch run.
For example: -n 1stRun
If you have added the BatchParameter variable to your report file, log file, or
print file names, xPression will take the identifier and apply it to those file
names.
-p
(Optional) This parameter specifies the output file location for the print and
log files. If used, it overrides any print, log, or report file paths you specified
in xAdmin. For example: -p C:\Test Jobs\Daily Run\
The following characters are invalid for file naming or path naming and
will be removed:
• File naming: \ / : * ? " < > |
• Path naming: : * ? < > | "
A colon can only be used after the drive letter on Windows.
For example, c:\test:1\work is treated as c:\test1\work.
-ignoreDebug
When running in DEBUG mode, xPression will warn you that your
LogConfiguration file or MigrationBatchLogConfiguration file is set to
DEBUG. When you run in DEBUG mode, xPression writes a lot of information
to the xPression.log file. This can slow down your publishing time. xPression
requires you to confirm running in DEBUG mode by typing "Y". To disable
this warning, use the -ignoreDebug parameter. This parameter is used by
default when running batch through xAdmin.
-d
The diagnostic output parameter. To use the diagnostic utility with xPression
batch, simply add the -d command line parameter to the batch start command.
The -d parameter should be followed by an identification string. This string
can consist of any characters that are valid for a filename.
For example: BatchRunner -j MyJobDefinition -d
TestDiagnosticOutput
-disableReporting
xPression’s reporting capabilities can slow down your batch performance.
Use this parameter to disable xPression’s reporting capabilities for the
current batch run. xPression will not capture any job run information when
this parameter is used.
-r
Enables you to specify the job run ID as an input parameter to xPression
during a batch run. If you do not specify the job run ID, xPression
automatically creates a unique identifier.
If you want to query your batch jobs from the database in an automated
fashion, you should use this parameter.
Please make sure the job run ID you choose is unique. xPression will report
an error if you attempt to create a job run ID that already exists. The job run
ID must contain only numbers and be no longer than 31 characters.
For example: -r 142990045
The xPression-generated job run ID consists of the following values:
<IP_address><object_hashcode><system_milliseconds><global_counter>
-dcf
This feature is for use with EMC Document Sciences Support. It enables users
with RDB customer data to create an XML file with the records used in the
assembly. The subset data is dumped into the XML file specified in the path.
The file is overwritten each time a document is assembled. To use this feature,
supply the -dcf switch and supply the path and filename for the XML dump
file. For example: -dcf C:\DCF.xml
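If you start batch runs from a script, it can help to assemble the argument list in one place and apply the rules from the table above before launching anything. The following Python sketch is a hedged illustration that covers only a subset of the parameters. The build_batchrunner_args function and its keyword names are hypothetical, and the way -i combines with -f is an assumption because the table does not describe it.

```python
def build_batchrunner_args(executable="BatchRunner", job_id=None, job_name=None,
                           job_file=None, batch_parameter=None, output_path=None,
                           ignore_debug=False, disable_reporting=False, run_id=None):
    """Build a BatchRunner argument list from the documented parameters."""
    if job_name and job_file:
        raise ValueError("-j and -f cannot be used together")
    args = [executable]
    if job_id:
        # If both -i and -j are given, only -i is used, so -j is left out.
        # Assumption: -i also takes priority over -f (not stated in the table).
        args += ["-i", str(job_id)]
    elif job_name:
        args += ["-j", job_name]
    elif job_file:
        args += ["-f", job_file]
    if batch_parameter:
        args += ["-n", batch_parameter]
    if output_path:
        args += ["-p", output_path]
    if ignore_debug:
        args.append("-ignoreDebug")
    if disable_reporting:
        args.append("-disableReporting")
    if run_id is not None:
        run_id = str(run_id)
        # The job run ID must contain only numbers and be at most 31 characters.
        if not (run_id.isascii() and run_id.isdigit() and len(run_id) <= 31):
            raise ValueError("job run ID must be numeric and at most 31 characters")
        args += ["-r", run_id]
    return args


if __name__ == "__main__":
    print(build_batchrunner_args(job_name="Sample Job Def",
                                 batch_parameter="1stRun",
                                 run_id="142990045"))
```

The function returns a list rather than a single string, so a multi-word job definition name stays in one argument. If you type the command yourself, enclose such names in double quotes as described for the -j parameter.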
Overriding Data Sources
You can override the designated data sources when running a batch job by making use of the -q
and -o parameters. Please be aware that you cannot override the data source specified in the Job
Definition if your xPublish document includes an xPresso package as a subdocument or Universal
Content item. The -q parameter enables you to output a data file that lists the data sources for each
step in the job. You can alter this file and use the -o parameter to input the altered data file to the
batch runner. You can edit this file to change the data source and trigger file designations. You
cannot change the step number or step name. To override only the trigger file in step 1, change
the output data file as shown here:
<?xml version="1.0" encoding="UTF-8"?>
<JobDatasourceOverride>
  <JobStepDatasourceOverride
    jobStepID=""
    jobStepName=""
    definedAssembleDatasource=""
    definedTriggerDatasource=""
    overridenAssembleDatasource=""
    overridenAssembleDataFile=""
    overridenTriggerDatasource=""
    overridenTriggerDataFile="" />
</JobDatasourceOverride>
The following parameters make up the JobStepDatasourceOverride element:
Parameter
Description
jobStepID
The unique ID for the job step.
jobStepName
The name of the job step. It has to match the job step name you
defined in xDashboard for the job.
definedAssembleDatasource
The name of the defined assemble data source. This data source is
specified in xDashboard when you define the job.
definedTriggerDatasource
The name of the defined trigger data source.
overridenAssembleDatasource
The name of the overridden assemble data source.
overridenAssembleDataFile
The name of the overridden assemble data file. This is the XML file
where the override data resides.
overridenTriggerDatasource
The name of the overridden trigger data source.
overridenTriggerDataFile
The name of the overridden trigger data file.
To import an altered output data file, use the -o parameter as follows:
• From your Windows command line, type BatchRunner -j Test -o OverrideTrigger.txt
• From your UNIX command line, type $ BatchRunner.sh -j Test -o OverrideTrigger.txt
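If you prefer to script the change rather than edit the file by hand, the override can be sketched in Java with the standard DOM API. This is a minimal sketch, not part of xPression: the class name OverrideFileEditor, the step name Step1, and the file name NewTrigger.xml are assumptions for illustration, and the input should be the file that -q actually produced for your job.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not an xPression class.
class OverrideFileEditor {

    // Sets overridenTriggerDataFile on the step with the given name.
    // jobStepID and jobStepName are read but never modified.
    static String overrideTriggerFile(String xml, String stepName, String triggerFile)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList steps = doc.getElementsByTagName("JobStepDatasourceOverride");
        for (int i = 0; i < steps.getLength(); i++) {
            Element step = (Element) steps.item(i);
            if (stepName.equals(step.getAttribute("jobStepName"))) {
                step.setAttribute("overridenTriggerDataFile", triggerFile);
            }
        }
        // Serialize the modified document back to text.
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample content; a real file comes from the -q parameter.
        String xml = "<JobDatasourceOverride>"
                + "<JobStepDatasourceOverride jobStepID='1' jobStepName='Step1'"
                + " overridenTriggerDatasource='' overridenTriggerDataFile=''/>"
                + "</JobDatasourceOverride>";
        System.out.println(overrideTriggerFile(xml, "Step1", "NewTrigger.xml"));
    }
}
```

Save the result to a file and pass that file to the batch runner with the -o parameter.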
Data Override with xRevise Custom Batch
A new attribute, customerXMLFilePath, is available in the JobDatasourceOverride file for
xRevise Custom Batch. It is set on the JobStepDatasourceOverride element. For example:
<JobDatasourceOverride>
<JobStepDatasourceOverride jobStepID="0" jobStepName="xxx"
customerXMLFilePath="xxx"/>
</JobDatasourceOverride>
When customerXMLFilePath points to the customer data XML file, xRevise Custom Batch uses that
data to replace variables in the selected completed work item.
Debugging Your Batch Environment
If you are experiencing problems while processing your jobs with xPression Batch, your first step
should be to discover the source of the error.
Often, the easiest way to accomplish this is to check for errors in the log file specified on the Job Log
tab of your job definition. This log file should contain a list of errors encountered by the batch process.
For more information about the problems your batch run encountered, see the main log file,
xPression.log. You can set up this log file to produce three levels of detail: INFO, ERROR, and
DEBUG.
Note: When you run batch jobs from the command line, errors and other events are recorded in the
job log only. When you run jobs from xDashboard, information about the job is recorded in the
xPression log as well as the job log.
Increasing Maximum Message Size
If you encounter MaxMessageSizeExceededException errors when running batch jobs on
WebLogic servers, increase the value of MaxMessageSize in the BatchRunner.bat/sh and
BatchRunner.properties files. For example:
$JAVA_HOME/bin/java -Xms512M -Xmx2048M -Dweblogic.MaxMessageSize=300000000
Increasing Memory Size
If you encounter out-of-memory errors when running batch jobs, increase the value of the -Xmx
parameter in the BatchRunner.bat/sh and BatchRunner.properties files. For example:
MEM_ARGS="-Xms1024m -Xmx5120m -ss1m -XX:MaxPermSize=512m"
Running Traditional Chinese Jobs From the Command Line
In order to show Traditional Chinese characters in the Windows command console, you need to set
the codepage of the console to 950.
To change the codepage of the command console:
1. Click the Start menu, select Settings, and click Control Panel. The Control Panel window
appears.
2. Double-click Regional and Language Options. The Regional and Language Options dialog
box appears.
3. Click the Advanced tab.
4. Select Chinese(Taiwan) from the Language for non-Unicode programs drop-down list.
5. Click OK and then close the Control Panel window.
6. Open the command prompt window and type:
>chcp 950
This command changes the codepage to 950.
Chapter 3
Manually Creating Your Job Definition
This section contains three examples of manually created Job Definitions and other important
information about using a manually created Job Definition:
• Simple xPresso Job Definition Sample
• Simple xDesign Job Definition Sample
• Complex Job Definition Sample
• xPressionPublishJob Element
• Pre and Post-Processing Scripts Element
• Instantiate Element
• UseBDT Element
• Parameters Element
• xPRSInstantiate Element
• JobLog Element
• If You are Using a DB2 Database
Simple xPresso Job Definition Sample
A simple Job Definition for xPresso documents.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Job>
<Job version="2.0">
    <xPressionPublishJob outputProfileName="PDF to File"
            deletePreviousJobsFirst="no" threadPoolSize="" preserveDataInputOrder="true"
            nextTasksNumber="" jobLevel="1">
        <xPRSInstantiate name="stepName" category="categoryName"
                dataSourceGroup="DSG" dataSource="DS"
                documentPackage="packageName.xword" keyPath="" />
        <JobLog logLevel="DEBUG" logPath="C:\xPression\" logFile="Job_{JobRunID}.log"
                errorThreshold="100" appendToExisting="no" />
    </xPressionPublishJob>
</Job>
This sample contains the following elements:
• xPressionPublishJob Element
• xPRSInstantiate Element
• JobLog Element
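Before handing a hand-written Job Definition to the batch runner, it can help to confirm that the file is well-formed XML and that it declares the job steps you expect. The following is a minimal sketch using the standard Java DOM parser. The class name JobDefinitionCheck is an assumption for illustration, and the check covers only the root element and the step names, not the full set of rules that xPression applies.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not an xPression class.
class JobDefinitionCheck {

    // Parses the Job Definition and returns the name of every job step.
    // xPRSInstantiate steps are listed first, then Instantiate steps.
    // Parsing fails with an exception if the XML is not well-formed.
    static List<String> stepNames(String jobXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(jobXml)));
        Element root = doc.getDocumentElement();
        if (!"Job".equals(root.getTagName())) {
            throw new IllegalArgumentException("Root element must be Job");
        }
        List<String> names = new ArrayList<>();
        for (String tag : new String[] {"xPRSInstantiate", "Instantiate"}) {
            NodeList steps = root.getElementsByTagName(tag);
            for (int i = 0; i < steps.getLength(); i++) {
                names.add(((Element) steps.item(i)).getAttribute("name"));
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String job = "<Job version='2.0'>"
                + "<xPressionPublishJob outputProfileName='PDF to File' jobLevel='1'>"
                + "<xPRSInstantiate name='stepName' category='categoryName'/>"
                + "<JobLog logLevel='DEBUG' errorThreshold='100'/>"
                + "</xPressionPublishJob></Job>";
        System.out.println(stepNames(job));
    }
}
```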
Simple xDesign Job Definition Sample
A simple Job Definition for xDesign documents.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Job>
<Job version="2.0">
    <xPressionPublishJob outputProfileName="PDF to File"
            deletePreviousJobsFirst="no" threadPoolSize="" preserveDataInputOrder="false"
            nextTasksNumber="" jobLevel="1">
        <Instantiate invalidCustomerRecordOutputFile=""
                errorRecordsPath="c:\xPressionHome\"
                errorRecordsFile="JobErrorRecords_{JobRunID}_sample.xml"
                enableErrRecordsOutput="disable"
                assembleDataSource="AUTOPAY1.XML" SQL="SELECT * FROM AUTOPAY"
                doAutoCustomization="no" useAs="collection" persistAfterJob="no"
                name="sample">
            <UseBDT BDTName="Automatic Payment Letter" />
            <Parameters>
                <Parameter name="AUTOPAY_KEY" value="$AUTOPAY_KEY" dtype="integer" />
                <Parameter name="STATUS" value="Any" dtype="string" />
                <Parameter name="optionalObjectsMode" value="3" dtype="integer" />
            </Parameters>
        </Instantiate>
        <JobLog logLevel="DEBUG" logPath="c:\xPressionHome\"
                logFile="Job_{JobRunID}.log" errorThreshold="100" appendToExisting="no" />
    </xPressionPublishJob>
</Job>
This sample contains the following elements:
• xPressionPublishJob Element
• Instantiate Element
• UseBDT Element
• Parameters Element
• JobLog Element
Complex Job Definition Sample
A more complex Job Definition example with pre and post-processing scripts.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Job>
<Job version="2.0">
    <PreProcessingScript>
        <Script name="PreProcess" command="c:\\PreProcess.bat -1 ZIP" />
    </PreProcessingScript>
    <xPressionPublishJob outputProfileName="PDF to File" deletePreviousJobsFirst="no"
            threadPoolSize="55" preserveDataInputOrder="true" nextTasksNumber="222"
            jobLevel="2">
        <xPRSInstantiate name="stepName" category="categoryName"
                dataSourceGroup="DSG" dataSource="DS"
                documentPackage="packageName.xword" keyPath="" />
        <JobLog logLevel="INFO" logPath="C:\Temp"
                logFile="Job_{JobRunID}RPJobSampleLog.log"
                errorThreshold="88" appendToExisting="yes" />
        <DIFDiagnosis difPath="C:\xPression" />
    </xPressionPublishJob>
    <PostProcessingScript>
        <Script name="PostProcessPrint" command="c:\\PrintProd.bat -1 AFP" />
    </PostProcessingScript>
</Job>
This sample contains the following elements:
• xPressionPublishJob Element
• Pre and Post-Processing Scripts Element
• xPRSInstantiate Element
• JobLog Element
xPressionPublishJob Element
Identifies the Job Type as an xPression Publish job as opposed to a CompuSet job. This element
contains the following parameters:
• outputProfileName—For this parameter, specify the output profile that you want to associate
with this job definition. This output profile will apply to all the documents included in the batch
run (unless you are using queued documents, which enables you to override it). An output
profile is a set of output streams, each of which is associated with a format or printer definition,
and a distribution definition.
• deletePreviousJobsFirst—For CompuSet jobs only. This parameter indicates whether or not you
want to delete old job records associated with the current job.
• threadPoolSize—Defines the number of worker threads available for each batch run. The
Customer Data Reader and xPression Assembly components of the batch process use this setting
to distribute customer records across parallel threads to improve performance. Some DOCX,
HTML, and text output files may be lost if Thread Pool Size is greater than 1 and a counter is
used as the sole means of ensuring unique file names for the output. Files with identical names
may be produced by parallel threads and counters do not reliably reflect different threads, so
files with duplicate names are overwritten.
• preserveDataInputOrder—(xPublish only) This option publishes the output in the order provided
in the customer data. To use this option effectively, sort the data in advance so that the desired
order is provided in the customer data. This avoids the need for sorting at publish time and so
prevents failures related to generating large numbers of temp files.
• nextTasksNumber—The number of customer records the main batch thread reads in at a time.
• jobLevel—Indicates what type of information will be collected for your job. Type 1, 2, or 3 for
the value.
(1) Collects only batch statistics, such as start time, end time, and publish type.
(2) Collects all the statistics information and information about failed customer documents.
(3) Collects all the statistics information and customer document information for all documents.
Select this option for information to appear on the Job History tab.
Pre and Post-Processing Scripts Element
This element contains information about any pre or post-processing scripts you want to implement.
See Complex Job Definition Sample for an example. This element contains the following
parameters:
• PreProcessingScript—This element should appear at the top of the Job Definition. Use this
element for pre-processing scripts.
• PostProcessingScript—This element should appear at the bottom of the Job Definition. Use this
element for post-processing scripts.
• Script—This element contains parameters for your scripts. Create one script element for each
script.
• name—An identifying name for the script.
• command—Supply the command to execute the script. This command should contain the
path, script filename, and needed parameters. For example: “C://startup.bat -log=true
-path=C://startuplog.log”
Instantiate Element
This element contains information about the Job Step. Instantiate is used for xDesign documents.
Create this element for each Job Step. See Simple xDesign Job Definition Sample for an
example. This element contains the following parameters:
• invalidCustomerRecordOutputFile—This parameter is no longer used and can be ignored.
• errorRecordsPath—Specify the location where you want to save the Job Error Customer Records
file. By default, xPression saves error logs to C:\xPression\. See the xDashboard User Guide for
more information on the Job Error Customer Records.
• errorRecordsFile—The name of the Job Error Customer Records file. You can use the following
variables in the file name by typing them with braces ({}).
{JobRunID} returns the current Job ID.
{JobName} returns the current Job name.
{BatchParam} returns the Batch Parameter value.
{Date} returns the date.
• enableErrorRecordsOutput—To enable the creation of a Job Error Customer Records file, type
“enable” as the value. To disable this file, type “disable” as the value.
• assembleDataSource—The name of the datasource file.
• SQL—Used only for RDB data sources. Its value is a SQL statement that indicates how to return
customer records.
• doAutoCustomization—Specifies whether to apply customization for xRevise jobs.
• useAs—This parameter is no longer used. Leave it at the default value.
• persistAfterJob—This parameter is no longer used. Leave it at the default value.
• name—The name of the Job Step.
UseBDT Element
This element is used only for xDesign Job Definitions. See Simple xDesign Job Definition Sample
for an example. This element contains the following parameter:
• BDTName — The name of the document you want to publish.
Parameters Element
This element is used only for xDesign Job Definitions. See Simple xDesign Job Definition Sample
for an example. This element contains the following parameters:
• Parameters—Contains one or more Parameter elements.
• Parameter—Defines the primary key, status, or optional objects mode. Each Parameter element
includes name, value, and dtype parameters. For primary keys, this parameter identifies the
primary key name. For STATUS, this parameter indicates whether the job should assemble
content items of a certain status. For optionalObjectsMode, this parameter indicates if
the job should include optional content in the xDesign document.
• name—Can be the name of the primary key, STATUS, or optionalObjectsMode.
• value—If identifying the primary key, the value will be the name of the primary key preceded by
a dollar sign ($). If identifying the STATUS, the value can be ANY (includes all content items) or
APPROVED (includes only approved content items). If identifying the optionalObjectsMode, the
value is 3.
• dtype—Can be integer, string, or date/time. For the primary key and optionalObjectsMode, the
value is integer. For STATUS, the value is string.
xPRSInstantiate Element
This element contains information about the Job Step. xPRSInstantiate is used for xPresso documents.
Create this element for each Job Step. See Simple xPresso Job Definition Sample for an
example. This element contains the following parameters:
• name—The name of the Job Step.
• category—The document category.
• dataSourceGroup—The data source group that contains the data source you want to use for
publishing.
• dataSource—The data source that contains the data you want to use for publishing.
• documentPackage—The name of the document that you want to publish.
• keyPath—xPresso document only. Specify the XPath value of the primary key in the customer
record. xPression will use this value when reporting in the log.
JobLog Element
This element contains the following parameters:
• logLevel—Controls the depth of the information provided in the log ordered from the smallest
amount of data to the largest. You do not have to restart the server to change the log level.
Set the value to ERROR to output the least amount of data which includes only error messages,
both fatal and non-fatal. This is the most commonly used option in a production environment
since it produces smaller log files.
Set the value to INFO to output a moderate amount of data including error messages, warnings
and informational messages that may indicate events of interest or concern.
Set the value to DEBUG to output the most information including error messages, informational
messages, plus additional detailed messages needed for debugging.
• logPath—Specify the path where you want xPression to save your job logs. By default,
xPression saves job logs to C:\xPression\.
• logFile—The name of the log file. You can use the following variables in the file name by typing
them with braces ({}).
{JobRunID} returns the current Job ID.
{JobName} returns the current Job name.
{BatchParam} returns the Batch Parameter value.
{Date} returns the date.
• errorThreshold—The number of errors that can accumulate during the job run before the job
aborts. If the number of document-level errors is equal to or greater than the value defined here,
the job exits.
• appendToExisting—Value is Yes or No. Define what should be done with new job log records if
the log file already exists. If set to Yes, xPression will append new records to the existing file. If
set to No, a new file will be created.
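The file name variables and the error threshold described above can be illustrated with a short Java sketch. This is not xPression code: the class JobLogSketch and both methods are hypothetical, the value 1042 is a made-up Job ID, and the sketch only mirrors the documented behavior (variables in braces are replaced by their values, and the job stops once the error count reaches the threshold).

```java
import java.util.Map;

// Hypothetical illustration, not an xPression class.
class JobLogSketch {

    // Replaces each {Name} variable with its value from the map.
    // Variables that have no value in the map are left unchanged.
    static String expand(String pattern, Map<String, String> values) {
        String result = pattern;
        for (Map.Entry<String, String> e : values.entrySet()) {
            result = result.replace("{" + e.getKey() + "}", e.getValue());
        }
        return result;
    }

    // Mirrors the errorThreshold rule: the job exits when the number of
    // document-level errors is equal to or greater than the threshold.
    static boolean shouldAbort(int documentErrors, int errorThreshold) {
        return documentErrors >= errorThreshold;
    }

    public static void main(String[] args) {
        // logFile="Job_{JobRunID}.log" with an assumed Job ID of 1042
        System.out.println(expand("Job_{JobRunID}.log", Map.of("JobRunID", "1042")));
        // errorThreshold="100": 99 errors continue, 100 errors abort
        System.out.println(shouldAbort(99, 100));
        System.out.println(shouldAbort(100, 100));
    }
}
```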
If You are Using a DB2 Database
If you are manually creating your Job Definitions in XML and are using a DB2 xPression database,
you must use entity notation to represent the following symbols.
Symbol    Notation
>         &gt;
<         &lt;
>=        &gt;=
<>        &lt;&gt;
<=        &lt;=
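If you generate Job Definitions programmatically, the substitutions in the table can be applied to a SQL statement before it is written into the SQL attribute. The following is a minimal sketch. The class name Db2SqlEscaper is an assumption for illustration, and the method handles only the < and > characters that the table covers.

```java
// Hypothetical helper, not an xPression class.
class Db2SqlEscaper {

    // Replaces < with &lt; and > with &gt;, which also covers the
    // compound operators >=, <=, and <> listed in the table.
    static String toEntityNotation(String sql) {
        return sql.replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String sql = "SELECT * FROM AUTOPAY WHERE AUTOPAY_KEY <> 0 AND AUTOPAY_KEY <= 100";
        System.out.println(toEntityNotation(sql));
    }
}
```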
Chapter 4
Sample Batch Trigger File
An xPression trigger file contains two elements, the primary key (PK) and the document name. In
most cases, the PK identifies the customer record.
Sample File
Please examine the following sample trigger file:
<CustomerData>
    <TRIGGERDATA>
        <AUTOPAY_KEY>1</AUTOPAY_KEY>
        <BDT>Insurance Letter</BDT>
    </TRIGGERDATA>
    <TRIGGERDATA>
        <AUTOPAY_KEY>5</AUTOPAY_KEY>
        <BDT>Invoice</BDT>
    </TRIGGERDATA>
    <TRIGGERDATA>
        <AUTOPAY_KEY>8</AUTOPAY_KEY>
        <BDT>Invoice</BDT>
    </TRIGGERDATA>
    <TRIGGERDATA>
        <AUTOPAY_KEY>45</AUTOPAY_KEY>
        <BDT>Insurance Letter</BDT>
    </TRIGGERDATA>
    <TRIGGERDATA>
        <AUTOPAY_KEY>70</AUTOPAY_KEY>
        <BDT>Insurance Letter</BDT>
    </TRIGGERDATA>
</CustomerData>
In this example, <AUTOPAY_KEY> is the primary key for the data source. When creating your
trigger file, use the name for your primary key.
The <BDT> element identifies the document name.
You can specify only one primary key and one document name for each TRIGGERDATA record. If you
want to publish multiple documents for the same customer record, you must create a separate
TRIGGERDATA record for each document. For example:
<TRIGGERDATA>
    <AUTOPAY_KEY>45</AUTOPAY_KEY>
    <BDT>Insurance Letter</BDT>
</TRIGGERDATA>
<TRIGGERDATA>
    <AUTOPAY_KEY>45</AUTOPAY_KEY>
    <BDT>Invoice</BDT>
</TRIGGERDATA>
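Trigger files with many records are easier to generate than to type. The following is a minimal Java sketch that writes one TRIGGERDATA record for each pair of primary key value and document name, so a customer who needs two documents simply appears twice. The class name TriggerFileBuilder and its methods are assumptions for illustration, not part of xPression.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper, not an xPression class.
class TriggerFileBuilder {

    // Escapes characters that are not allowed in XML text content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Builds a trigger file. keyName is the name of your primary key
    // element; each entry is (primary key value, document name).
    static String build(String keyName, List<Map.Entry<String, String>> records) {
        StringBuilder sb = new StringBuilder("<CustomerData>\n");
        for (Map.Entry<String, String> record : records) {
            sb.append("    <TRIGGERDATA>\n");
            sb.append("        <").append(keyName).append(">")
              .append(escape(record.getKey()))
              .append("</").append(keyName).append(">\n");
            sb.append("        <BDT>").append(escape(record.getValue())).append("</BDT>\n");
            sb.append("    </TRIGGERDATA>\n");
        }
        return sb.append("</CustomerData>\n").toString();
    }

    public static void main(String[] args) {
        // Customer 45 receives two documents, so it has two records.
        System.out.print(build("AUTOPAY_KEY", List.of(
                Map.entry("45", "Insurance Letter"),
                Map.entry("45", "Invoice"))));
    }
}
```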
Chapter 5
Batch Event Monitor
xPression enables you to monitor your batch jobs and receive notification when an error is generated.
The event monitor is provided through a Java abstract class named LifecycleListenerFactory and a
Java interface named BatchLifecycleListener. To use the event monitor, you must create
custom Java classes that extend the abstract class and implement the interface.
(Diagram: event monitor architecture.)
Implementing the Event Notification Monitor
You need to use the following classes:
• Interface: com.dsc.xpression.batch.tracking.BatchLifecycleListener
• Abstract Class: com.dsc.xpression.batch.tracking.LifecycleListenerFactory
• Exception Class: com.dsc.xpression.batch.tracking.exception.EventListenerException
All needed classes are found in UniArch_BatchRunner.jar, which is located in the xPression.ear
folder on your server. In order to write a custom batch listener, you will need to reference this
jar file in your project.
To Implement the Event Notification Monitor
1. Create a class to implement the BatchLifecycleListener Interface. This class should have a
default constructor.
2. Create a class to extend the LifecycleListenerFactory Abstract Class. You must override the
method named createBatchLifecycleListenerFor().
3. Edit the batchrunner.properties file located in your xPressionHome directory. Add the following
property to this file:
ListenerFactoryImpClass=<ClassName>
Where <ClassName> is the class you created to extend the LifecycleListenerFactory abstract class.
Event Notification Sequence
When fully implemented, the event notification sequence operates as follows:
1. The batch is executed.
2. xPression creates the BatchInfo object.
3. xPression creates the LifecycleListenerFactory as dictated by the
ListenerFactoryImpClass=<ClassName> property in the batchrunner.properties file. You set this
property using the name of the class you created to extend the LifecycleListenerFactory class.
4. xPression creates the BatchLifecycleListener instance.
5. xPression invokes the BatchLifecycleListener.onBatchStart() method.
6. If the batch error handling mechanism captures errors, xPression invokes the onFatalError() or
onDocumentError() methods.
7. When the batch job completes, xPression calls the onBatchDone() method.
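The order of the callbacks in the sequence above can be sketched without the xPression classes. In the following Java example, the Listener interface is a simplified stand-in for BatchLifecycleListener: it uses strings where the real interface takes BatchError and DocInfo objects, and the run() method only imitates the part of the sequence that invokes the listener. All names in the block are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not xPression code.
class EventSequenceSketch {

    // Simplified stand-in for BatchLifecycleListener.
    interface Listener {
        void onBatchStart();
        void onBatchDone();
        void onFatalError(String error);
        void onDocumentError(String customerKeys, String error);
    }

    // Records the name of each event in the order it is received.
    static class RecordingListener implements Listener {
        final List<String> events = new ArrayList<>();
        public void onBatchStart() { events.add("onBatchStart"); }
        public void onBatchDone() { events.add("onBatchDone"); }
        public void onFatalError(String error) { events.add("onFatalError"); }
        public void onDocumentError(String customerKeys, String error) {
            events.add("onDocumentError:" + customerKeys);
        }
    }

    // Imitates the listener calls: start, one document error per
    // failed customer key, then done.
    static List<String> run(List<String> failedKeys) {
        RecordingListener listener = new RecordingListener();
        listener.onBatchStart();
        for (String key : failedKeys) {
            listener.onDocumentError(key, "sample error");
        }
        listener.onBatchDone();
        return listener.events;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("45")));
    }
}
```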
LifecycleListenerFactory Abstract Class
This is the Java abstract class provided by EMC Document Sciences. The class you create must be
derived from this abstract class. It has two methods, createBatchLifecycleListenerFor() and
createInstance(), which are described below under the names of the types they return:
BatchLifecycleListener and LifecycleListenerFactory.
This class is found in UniArch_BatchRunner.jar, which is located in the xPression.ear folder on your
server. In order to write a custom batch listener, you will need to reference this jar file in your project.
BatchLifecycleListener
public abstract BatchLifecycleListener
createBatchLifecycleListenerFor(BatchInfo batchInfo) throws EventListenerException
LifecycleListenerFactory
This static method, createInstance(), acquires configuration information from the
batchrunner.properties file and creates an instance of the class that you derived from
LifecycleListenerFactory. The derived class implements the createBatchLifecycleListenerFor() method.
public static LifecycleListenerFactory
createInstance() throws FactoryConfigurationException
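This guide does not show how createInstance() is implemented. A common way to turn a class name from a properties file into an object in Java is reflection, which also explains why the class needs a constructor without arguments. The following sketch is self-contained and uses stand-in types: ListenerFactory and MyListenerFactory are hypothetical, and only the property name ListenerFactoryImpClass comes from this guide.

```java
import java.io.StringReader;
import java.util.Properties;

// Hypothetical illustration, not xPression code.
class FactoryLoadingSketch {

    // Stand-in for the LifecycleListenerFactory abstract class.
    abstract static class ListenerFactory {
        abstract String createListenerFor(String jobName);
    }

    // Stand-in for the custom factory class you would write.
    static class MyListenerFactory extends ListenerFactory {
        public MyListenerFactory() { }
        String createListenerFor(String jobName) {
            return "listener for " + jobName;
        }
    }

    // Reads the class name from the property and instantiates that
    // class through its no-argument constructor.
    static ListenerFactory createInstance(Properties props) throws Exception {
        String className = props.getProperty("ListenerFactoryImpClass");
        if (className == null) {
            throw new IllegalStateException("ListenerFactoryImpClass is not set");
        }
        Class<?> cls = Class.forName(className);
        return (ListenerFactory) cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // Equivalent to one line in a properties file.
        Properties props = new Properties();
        props.load(new StringReader(
                "ListenerFactoryImpClass=" + MyListenerFactory.class.getName()));
        System.out.println(createInstance(props).createListenerFor("Test"));
    }
}
```

With this approach, a misspelled class name surfaces as a ClassNotFoundException at startup, so it is worth checking the property value against the fully qualified name of your class.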
Sample LifecycleListenerFactory Abstract Class
The following text is a sample class that extends the LifecycleListenerFactory abstract class.
import com.dsc.xpression.batch.tracking.*;

public class BatchLifecycleFactoryImp extends LifecycleListenerFactory {
    // Returns the listener that receives events for this batch run.
    public BatchLifecycleListener createBatchLifecycleListenerFor(BatchInfo info) {
        return new TestListenerImp(info);
    }
}
BatchLifecycleListener Interface
The interface provides four basic events:
+onBatchStart()
+onBatchDone()
+onFatalError(in error : BatchError)
+onDocumentError(in info : DocInfo, in error : BatchError)
If the onBatchStart(), onFatalError(), or onDocumentError() method fails, it can throw an
EventListenerException. You can catch these exceptions for debugging purposes. The following
sample code declares the exception on each of these methods.
Sample BatchLifecycleListener Interface
The following text is a sample implementation of the BatchLifecycleListener interface.
import com.dsc.xpression.batch.tracking.*;
import com.dsc.xpression.batch.tracking.exception.*;

public class TestListenerImp implements BatchLifecycleListener {
    BatchInfo info;

    public TestListenerImp(BatchInfo info) {
        this.info = info;
    }

    public void onBatchStart() throws EventListenerException {
        System.out.println("onBatchStart(). " + info.getJobName() + " Start time: " +
                info.getStartTime());
    }

    public void onBatchDone() {
        System.out.println("onBatchDone(). " + " End Time: " + info.getEndTime());
    }

    public void onFatalError(BatchError error) throws EventListenerException {
        System.out.println("onFatalError(). ");
        System.out.println("code = " + error.getErrorCode());
        System.out.println("level = " + error.getErrorLevel());
        System.out.println("message = " + error.getErrorMessage());
    }

    public void onDocumentError(DocInfo info, BatchError error) throws EventListenerException {
        System.out.println("onDocumentError().");
        System.out.println("code = " + error.getErrorCode());
        System.out.println("level = " + error.getErrorLevel());
        System.out.println("message = " + error.getErrorMessage());
        System.out.println("customerkeys = " + info.getCustomerKeys());
    }
}
BatchInfo Class
This class captures the batch information and provides “set” and “get” methods that enable you to
access these attributes.
Attributes
The BatchInfo Class has the following attributes.
Attribute : Type        Definition
-jobName : String       The name of the batch job.
-startTime : Date       The start time of the batch job.
-endTime : Date         The ending time of the batch job.
Methods
The BatchInfo class has the following methods.
+setJobName(in jobName : String)
+getJobName() : String
+setStartTime(in startTime : Date)
+getStartTime() : Date
+setEndTime(in endTime : Date)
+getEndTime() : Date
BatchError Class
This class captures batch error information and provides “set” and “get” methods that enable you to
access these attributes.
Attributes
The BatchError class has the following attributes.
Attribute : Type           Definition
-errorCode : String        The error message code.
-errorMessage : String     The error message.
-errorLevel : String       The recorded error level.
Methods
The BatchError class has the following methods.
+setErrorCode(in code : String) : void
+getErrorCode() : String
+setErrorMessage(in message : String) : void
+getErrorMessage() : String
+setErrorLevel(in errorLevel : String) : void
+getErrorLevel() : String
DocInfo Class
This class captures the customerKeys attribute, and provides “set” and “get” methods that enable
you to access the attribute.
Attributes
The DocInfo class has the following attributes.
Attribute : Type           Definition
-customerKeys : String     The customer keys included in the batch job.
Methods
The DocInfo class has the following methods.
+setCustomerKeys(in customerKeys : String)
+getCustomerKeys() : String
EventListenerException Class
See Methods.
Methods
The EventListenerException class has the following methods.
+EventListenerException()
+EventListenerException(in s: String)
FactoryConfigurationException Class
See Methods.
Methods
The FactoryConfigurationException class has the following methods.
+FactoryConfigurationException()
+FactoryConfigurationException(in s: String)