DB2 Version 9.1 for z/OS
Performance Monitoring and Tuning Guide
SC18-9851-08
Note
Before using this information and the product it supports, be sure to read the general information under “Notices” at the
end of this information.
Ninth edition (December 2010)
This edition applies to DB2 Version 9.1 for z/OS (DB2 V9.1 for z/OS), product number 5635-DB2, and to any
subsequent releases until otherwise indicated in new editions. Make sure you are using the correct edition for the
level of the product.
Specific changes are indicated by a vertical bar to the left of a change. A vertical bar to the left of a figure caption
indicates that the figure has changed. Editorial changes that have no technical significance are not noted.
© Copyright IBM Corporation 1982, 2010.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.
Contents

About this information
  Who should read this information
  DB2 Utilities Suite
  Terminology and citations
  Accessibility features for DB2 Version 9.1 for z/OS
  How to send your comments

Part 1. Planning your performance strategy

Chapter 1. Managing performance in general

Chapter 2. Setting reasonable performance objectives
  Defining your workloads
  Sizing your workloads
  Translating resource requirements into performance objectives
  Reviewing performance during external design
  Reviewing performance during internal design
  Reviewing performance during coding and testing
  Reviewing performance after development

Chapter 3. Planning to review performance data
  Typical review questions
  Validating your performance objectives

Part 2. Managing your system performance

Chapter 4. Managing CPU and I/O to improve response time and throughput
  Controlling the number of I/O operations
  Read operations
  Write operations
  Keeping access path statistics updated
  Making buffer pools large enough for the workload
  Making I/O operations faster
  Distributing data sets efficiently
  Creating additional work file table spaces to reduce contention
  Formatting early and speed-up formatting
  Avoiding excessively small extents

Chapter 5. z/OS performance options for DB2
  Determining z/OS Workload Manager velocity goals
  How DB2 assigns I/O priorities

Chapter 6. Configuring storage for performance
  Storage servers and channel subsystems
  Balancing the storage controller cache and buffer resources
  Improving the use of real and virtual storage
  Real storage
  Virtual storage
  Tuning DB2 buffer, EDM, RID, and sort pools
  Managing the opening and closing of data sets
  Improving disk storage
  Selecting and configuring storage devices
  Using disk space effectively

Chapter 7. Improving DB2 log performance
  Improving log write performance
  Types of log writes
  Improving log read performance
  Log statistics
  Calculating average log record size
  Improving log capacity
  Total capacity and the number of logs
  Choosing a checkpoint frequency
  Increasing the number of active log data sets
  Setting the size of active log data sets
  Controlling the amount of log data
  Controlling log size for utilities
  Controlling log size for SQL operations

Chapter 8. Improving the performance of stored procedures and user-defined functions
  Maximizing the number of procedures or functions that run in an address space
  Assigning stored procedures and functions to WLM application environments
  Accounting for nested activities

Chapter 9. Using materialized query tables to improve SQL performance
  Configuring automatic query rewrite
  Materialized query tables and automatic query rewrite
  Enabling automatic query rewrite
  Creating a materialized query table
  Populating and maintaining materialized query tables
  Enabling a materialized query table for automatic query rewrite
  Recommendations for materialized query table and base table design
  Materialized query tables—examples shipped with DB2

Chapter 10. Managing DB2 threads
  Setting thread limits
  Allied thread allocation
  Step 1: Thread creation
  Step 2: Resource allocation
  Step 3: SQL statement execution
  Step 4: Commit and thread termination
  Variations on thread management
  Reusing threads
  Distributed database access threads
  Setting thread limits for database access threads
  Pooling of INACTIVE MODE threads for DRDA-only connections
  Using threads with private-protocol connections
  Reusing threads for remote connections
  Using z/OS Workload Manager to set performance objectives
  Classifying DDF threads
  Establishing performance periods for DDF threads
  Establishing performance objectives for DDF threads
  Setting CICS options for threads
  Setting IMS options for threads
  Setting TSO options for threads
  Setting DB2 QMF options for threads

Chapter 11. Designing DB2 statistics for performance
  Maintaining statistics in the catalog
  Statistics used for access path selection
  Setting default statistics for created temporary tables
  History statistics
  Additional statistics that provide index costs
  Modeling your production system
  Real-time statistics
  Setting up your system for real-time statistics
  Contents of the real-time statistics tables
  Operating with real-time statistics

Part 3. Programming DB2 applications for performance

Chapter 12. Tuning your queries
  Coding SQL statements as simply as possible
  Coding queries with aggregate functions efficiently
  Using non-column expressions efficiently
  Materialized query tables and query performance
  Encrypted data and query performance
  XML data and query performance
  Best practices for XML performance in DB2
  Writing efficient predicates
  Ensuring that predicates are coded correctly
  Properties of predicates
  Predicates in the ON clause
  Using predicates efficiently
  When DB2 evaluates predicates
  Summary of predicate processing
  Examples of predicate properties
  Predicate filter factors
  Avoiding problems with correlated columns
  DB2 predicate manipulation
  Predicates with encrypted data
  Using host variables efficiently
  Writing efficient subqueries
  Correlated and non-correlated subqueries
  When DB2 transforms a subquery into a join
  When DB2 correlates and de-correlates subqueries
  Subquery tuning
  Using scrollable cursors efficiently
  Efficient queries for tables with data-partitioned secondary indexes
  Making predicates eligible for index on expression
  Improving the performance of queries for special situations
  Using the CARDINALITY clause to improve the performance of queries with user-defined table function references
  Reducing the number of matching columns
  Indexes for efficient star schema processing
  Rearranging the order of tables in a FROM clause
  Improving outer join processing
  Using a subsystem parameter to optimize queries with IN-list predicates
  Providing more information to DB2 for access path selection
  Fetching a limited number of rows: FETCH FIRST n ROWS ONLY
  Minimizing overhead for retrieving few rows: OPTIMIZE FOR n ROWS
  Favoring index access
  Using a subsystem parameter to favor matching index access
  Updating catalog statistics to influence access path selection
  Managing query access paths
  Plan management policies
  Managing access paths for static SQL statements
  Influencing access path selection by using optimization hints
  Reoptimizing SQL statements at run time

Chapter 13. Programming for concurrency
  Concurrency and locks
  Suspension
  Time out
  Deadlock
  Promoting basic concurrency
  Using system and subsystem options to promote concurrency
  Designing your databases for concurrency
  Programming your applications for concurrency
  Aspects of transaction locks
  Lock size
  The duration of a lock
  Lock modes
  The object of a lock
  How DB2 chooses lock types
  Options for tuning locks
  IRLM startup procedure options
  Setting installation options for wait times
  Bind options for locks
  Using other options to control locking
  Controlling DB2 locks for LOBs
  LOB locks
  Controlling the number of LOB locks
  Explicitly locking LOB tables
  Controlling lock size for LOB table spaces
  Controlling DB2 locks for XML data
  XML locks
  Controlling the number of XML locks
  Explicitly locking XML data
  Specifying the size of locks for XML data
  Claims and drains for concurrency control
  Claims
  Drains
  How DB2 uses drain locks
  Utility locks on the catalog and directory
  Compatibility of utilities
  Concurrency during REORG
  Utility operations with nonpartitioned indexes

Chapter 14. Programming for parallel processing
  Parallel processing
  Methods of parallel processing
  Partitioning for optimal parallel performance
  Determining if a query is I/O- or processor-intensive
  Determining the number of partitions for parallel processing
  Working with a table space that is already partitioned
  Making the partitions the same size
  Working with partitioned indexes
  Enabling parallel processing
  Restrictions for parallelism

Chapter 15. Tuning distributed applications
  Remote access
  Application and requesting systems
  BIND options for distributed applications
  SQL statement options for distributed applications
  Block fetch
  Optimizing for very large results sets for DRDA
  Optimizing for small results sets for DRDA
  Data encryption security options
  Serving system

Part 4. Monitoring DB2 for z/OS performance

Chapter 16. Planning for performance monitoring
  Continuous performance monitoring
  Planning for periodic monitoring
  Detailed performance monitoring
  Exception performance monitoring

Chapter 17. Using tools to monitor performance
  Investigating SQL performance with EXPLAIN
  IBM Tivoli OMEGAMON XE
  Tivoli Decision Support for z/OS
  Response time reporting
  Using z/OS, CICS, and IMS tools
  Monitoring system resources
  Monitoring transaction manager throughput

Chapter 18. Using profiles to monitor and optimize performance
  Profiles
  Creating profiles
  Starting and stopping profiles
  How DB2 resolves conflicting rows in the profile attributes table

Chapter 19. Using DB2 Trace to monitor performance
  Minimizing the effects of traces on DB2 performance
  Types of traces
  Statistics trace
  Accounting trace
  Audit trace
  Performance trace
  Monitor trace
  Recording SMF trace data
  Activating SMF
  Allocating SMF buffers
  Reporting data in SMF
  Recording GTF trace data

Chapter 20. Programming for the instrumentation facility interface (IFI)
  Submitting DB2 commands through IFI
  Obtaining trace data through IFI
  Passing data to DB2 through IFI
  IFI functions
  Invoking IFI from your program
  Using IFI from stored procedures
  COMMAND: Syntax and usage with IFI
  Authorization for DB2 commands through IFI
  Syntax for DB2 commands through IFI
  Using READS requests through IFI
  Authorization for READS requests through IFI
  Syntax for READS requests through IFI
  Which qualifications are used for READS requests issued through IFI?
  Synchronous data and READS requests through IFI
  Monitoring the dynamic statement cache with READS calls
  Monitoring deadlocks and timeouts with READS calls
  Controlling collection of dynamic statement cache statistics with IFCID 0318
  Using READA requests through IFI
  Authorization for READA requests through IFI
  Syntax for READA requests through IFI
  Asynchronous data and READA requests through IFI
  How DB2 processes READA requests through IFI
  Using WRITE requests through IFI
  Authorization for WRITE requests through IFI
  Syntax for WRITE requests through IFI
  Common communication areas for IFI calls
  Instrumentation facility communications area (IFCA)
  Return area
  IFCID area
  Output area
  Using IFI in a data sharing group
  Data integrity and IFI
  Auditing data and IFI
  Locking considerations for IFI
  Recovery considerations for IFI
  Errors and IFI

Chapter 21. Monitoring the use of IBM specialty engines
  IBM System z Integrated Information Processors
  IBM System z Application Assist Processor

Chapter 22. Monitoring storage
  Monitoring I/O activity of data sets
  Buffer Pool Analyzer
  Monitoring and tuning buffer pools using online commands
  Using OMEGAMON to monitor buffer pool statistics
  Monitoring work file data sets

Chapter 23. Monitoring concurrency and locks
  Scenario for analyzing concurrency
  Analyzing concurrency
  OMEGAMON online locking conflict display
  Using the statistics and accounting traces to monitor locking
  Using EXPLAIN to identify locks chosen by DB2
  Deadlock detection scenarios
  Scenario 1: Two-way deadlock with two resources
  Scenario 2: Three-way deadlock with three resources

Chapter 24. Monitoring statistics
  Querying the catalog for statistics
  Gathering and updating statistics

Chapter 25. Monitoring SQL performance
  Monitoring SQL performance with IBM optimization tools
  DB2-supplied user tables for optimization tools
  Using EXPLAIN to capture information about SQL statements
  Creating EXPLAIN tables
  Updating the format of an existing PLAN_TABLE
  Capturing EXPLAIN information
  Working with and retrieving EXPLAIN table data
  Monitoring SQL statements by using profiles
  Obtaining snapshot information for monitored queries
  Limiting the number of statement reports from a monitor profile
  How DB2 resolves conflicting rows in the profile attributes table
  Gathering information about SQL statements for IBM Software Support

Chapter 26. Checking for invalid plans and packages

Chapter 27. Monitoring parallel operations

Chapter 28. Monitoring DB2 in a distributed environment
  The DISPLAY command
  Tracing distributed events
  Reporting server-elapsed time
  Monitoring distributed processing with RMF
  Duration of an enclave
  RMF records for enclaves

Part 5. Analyzing performance data

Chapter 29. Looking at the entire system

Chapter 30. Beginning to look at DB2

Chapter 31. A general approach to problem analysis in DB2

Chapter 32. Interpreting DB2 trace output
  The sections of the trace output
  SMF writer header section
  GTF writer header section
  Self-defining section
  Product section
  Trace field descriptions

Chapter 33. Reading accounting reports from OMEGAMON
  The accounting report (short format)
  The accounting report (long format)

Chapter 34. Interpreting records returned by IFI
  Trace data record format
  Command record format

Chapter 35. Interpreting data access by using EXPLAIN
  Table space scan access (ACCESSTYPE='R' and PREFETCH='S')
  Aggregate function access (COLUMN_FN_EVAL)
  Index access (ACCESSTYPE is 'I', 'I1', 'N', 'MX', or 'DX')
  Overview of index access
  Index access paths
  Direct row access (PRIMARY_ACCESSTYPE='D')
  Predicates that qualify for direct row access
  Reverting to ACCESSTYPE
  Access methods that prevent direct row access
  Example: Coding with row IDs for direct row access
  Scans limited to certain partitions (PAGE_RANGE='Y')
  Parallel processing access (PARALLELISM_MODE='I', 'C', or 'X')
  Complex trigger WHEN clause access (QBLOCKTYPE='TRIGGR')
  Prefetch access paths (PREFETCH='D', 'S', or 'L')
  Dynamic prefetch
  Sequential prefetch
  List prefetch
  Sort access
  Sorts of data
  Sorts of RIDs
  The effect of sorts on OPEN CURSOR
  Join operations
  Cartesian join with small tables first
  Nested loop join (METHOD=1)
  When a MERGE statement is used (QBLOCK_TYPE='MERGE')
  Merge scan join (METHOD=2)
  Hybrid join (METHOD=4)
  Star schema access
  Subquery access
  View and nested table expression access
  Merge processing
  Materialization
  Performance of merge versus materialization
  Using EXPLAIN to determine when materialization occurs
  Using EXPLAIN to determine UNION, INTERSECT, and EXCEPT activity and query rewrite
  Interpreting query parallelism
  Examining PLAN_TABLE columns for parallelism
  PLAN_TABLE examples showing parallelism
  Estimating the cost of SQL statements
  Cost categories
  Retrieving rows from a statement table

Part 6. Tuning DB2 for z/OS performance

Chapter 36. Controlling resource usage
  Facilities for controlling resource usage
  Prioritizing resources
  Limiting resources for each job
  Limiting resources for TSO sessions
  Limiting resources for IMS and CICS
  Limiting resources for a stored procedure
  Setting limits for system resource usage
  Using reactive governing
  Using predictive governing
  Combining reactive and predictive governing
  Limiting resource usage for plans or packages
  Limiting resource usage by client information for middleware servers
  Managing resource limit tables
  Governing statements from a remote site
  Calculating service unit values for resource limit tables
  Restricting bind operations
  Restricting parallelism modes
  The DB2 system monitor
  Reducing processor resource consumption
  Reusing threads for your high-volume transactions
  Minimizing the use of DB2 traces

Chapter 37. Providing cost information, for accessing user-defined table functions, to DB2

Chapter 38. Optimizing subsystem parameters
  Optimizing subsystem parameters for SQL statements by using profiles

Chapter 39. Improving index and table space access
  How clustering affects access path selection
  When to reorganize indexes and table spaces
  Whether to rebind after gathering statistics

Chapter 40. Updating the catalog
  Correlations in the catalog
  Recommendation for COLCARDF and FIRSTKEYCARDF
  Updating HIGH2KEY and LOW2KEY values
  Statistics for distributions
  Recommendation for using the TIMESTAMP column

Chapter 41. Tuning parallel processing
  Disabling query parallelism

Part 7. Appendixes

Appendix A. DB2-supplied stored procedures for managing performance
  DSNAEXP stored procedure
  DSNACCOR stored procedure
  DSNACCOX stored procedure

Appendix B. DB2-supplied user tables
  EXPLAIN tables
  PLAN_TABLE
  DSN_DETCOST_TABLE
  DSN_FILTER_TABLE
  DSN_FUNCTION_TABLE
  DSN_PGRANGE_TABLE
  DSN_PGROUP_TABLE
  DSN_PREDICAT_TABLE
  DSN_PTASK_TABLE
  DSN_QUERYINFO_TABLE
  DSN_QUERY_TABLE
  DSN_SORTKEY_TABLE
  DSN_SORT_TABLE
  DSN_STATEMENT_CACHE_TABLE
  DSN_STATEMNT_TABLE
  DSN_STRUCT_TABLE
  DSN_VIEWREF_TABLE
  Input tables
  DSN_VIRTUAL_INDEXES
  Profile tables
  SYSIBM.DSN_PROFILE_TABLE
  SYSIBM.DSN_PROFILE_HISTORY
  SYSIBM.DSN_PROFILE_ATTRIBUTES
  SYSIBM.DSN_PROFILE_ATTRIBUTES_HISTORY
  Resource limit facility tables
  DSNRLMTxx
  DSNRLSTxx
  Runtime information tables
  DSN_OBJECT_RUNTIME_INFO
  DSN_STATEMENT_RUNTIME_INFO
  Workload control center tables
  DB2OSC.DSN_WCC_WORKLOADS
  DB2OSC.DSN_WCC_WL_SOURCES
  DB2OSC.DSN_WCC_WL_SOURCE_DTL
  DB2OSC.DSN_WCC_WL_AUTHIDS
  DB2OSC.DSN_WCC_TB_STATUS
  DB2OSC.DSN_WCC_EV_HISTORY
  DB2OSC.DSN_WCC_EP_HISTORY
  DB2OSC.DSN_WCC_MESSAGES
  DB2OSC.DSN_WCC_STMT_TEXTS
  DB2OSC.DSN_WCC_STMT_INSTS
  DB2OSC.DSN_WCC_STATEMENT_INST_SUMMY
  DB2OSC.DSN_WCC_TASKS
  DB2OSC.DSN_WCC_STMT_RUNTM
  DB2OSC.DSN_WCC_OBJ_RUNTM
  DB2OSC.DSN_WCC_WL_INFO
  DB2OSC.DSN_WCC_STMT_INFO
  DB2OSC.DSN_WCC_CAP_TMP_ES
  DB2OSC.DSN_WCC_RP_TBS
  DB2OSC.DSN_WCC_REPORT_TBS_HIS
  DB2OSC.DSN_WCC_RP_IDXES
  DB2OSC.DSN_WCC_RP_IDX_HIS
  DB2OSC.DSN_WCC_RP_STMTS
  Workload Statistics advisor tables
  DB2OSC.DSN_WSA_SESSIONS
  DB2OSC.DSN_WSA_DATABASES
  DB2OSC.DSN_WSA_TSS
  DB2OSC.DSN_WSA_TABLES
  DB2OSC.DSN_WSA_COLUMNS
  DB2OSC.DSN_WSA_COLGROUPS
  DB2OSC.DSN_WSA_CGFREQS
  DB2OSC.DSN_WSA_CGHISTS
  DB2OSC.DSN_WSA_INDEXES
  DB2OSC.DSN_WSA_KEYS
  DB2OSC.DSN_WSA_KEYTARGETS
  DB2OSC.DSN_WSA_KTGS
  DB2OSC.DSN_WSA_KTGFREQS
  DB2OSC.DSN_WSA_KTGHISTS
  DB2OSC.DSN_WSA_LITERALS
  DB2OSC.DSN_WSA_ADVICE

Information resources for DB2 for z/OS and related products

How to obtain DB2 information

How to use the DB2 library

Notices
  Programming Interface Information
  General-use Programming Interface and Associated Guidance Information
  Product-sensitive Programming Interface and Associated Guidance Information
  Trademarks

Glossary

Index
About this information
This information describes performance monitoring and tuning tasks for DB2®
Version 9.1 for z/OS®.
This information assumes that your DB2 subsystem is running in Version 9.1
new-function mode. Generally, new functions that are described, including changes
to existing functions, statements, and limits, are available only in new-function
mode. Two exceptions to this general statement are new and changed utilities and
optimization enhancements, which are also available in conversion mode unless
stated otherwise.
Who should read this information
This information is primarily intended for system and database administrators.
It assumes that the user is familiar with:
v The basic concepts and facilities of DB2
v Time Sharing Option (TSO) and Interactive System Productivity Facility (ISPF)
v The basic concepts of Structured Query Language (SQL)
v The basic concepts of Customer Information Control System (CICS®)
v The basic concepts of Information Management System (IMS™)
v How to define and allocate z/OS data sets using job control language (JCL)
Certain tasks require additional skills, such as knowledge of Transmission Control
Protocol/Internet Protocol (TCP/IP) or Virtual Telecommunications Access Method
(VTAM®) to set up communication between DB2 subsystems.
DB2 Utilities Suite
Important: In this version of DB2 for z/OS, the DB2 Utilities Suite is available as
an optional product. You must separately order and purchase a license to such
utilities, and discussion of those utility functions in this publication is not intended
to otherwise imply that you have a license to them.
The DB2 Utilities Suite can work with DB2 Sort and the DFSORT program, which
you are licensed to use in support of the DB2 utilities even if you do not otherwise
license DFSORT for general use. If your primary sort product is not DFSORT,
consider the following informational APARs mandatory reading:
v II14047/II14213: USE OF DFSORT BY DB2 UTILITIES
v II13495: HOW DFSORT TAKES ADVANTAGE OF 64-BIT REAL
ARCHITECTURE
These informational APARs are periodically updated.
Related information
DB2 utilities packaging (Utility Guide)
Terminology and citations
In this information, DB2 Version 9.1 for z/OS is referred to as "DB2 for z/OS." In
cases where the context makes the meaning clear, DB2 for z/OS is referred to as
"DB2." When this information refers to titles of DB2 for z/OS books, a short title is
used. (For example, "See DB2 SQL Reference" is a citation to IBM® DB2 Version 9.1
for z/OS SQL Reference.)
When referring to a DB2 product other than DB2 for z/OS, this information uses
the product's full name to avoid ambiguity.
The following terms are used as indicated:
DB2    Represents either the DB2 licensed program or a particular DB2 subsystem.
OMEGAMON®
       Refers to any of the following products:
       v IBM Tivoli® OMEGAMON XE for DB2 Performance Expert on z/OS
       v IBM Tivoli OMEGAMON XE for DB2 Performance Monitor on z/OS
       v IBM DB2 Performance Expert for Multiplatforms and Workgroups
       v IBM DB2 Buffer Pool Analyzer for z/OS
C, C++, and C language
       Represent the C or C++ programming language.
CICS   Represents CICS Transaction Server for z/OS.
IMS    Represents the IMS Database Manager or IMS Transaction Manager.
MVS™   Represents the MVS element of the z/OS operating system, which is
       equivalent to the Base Control Program (BCP) component of the z/OS
       operating system.
RACF®  Represents the functions that are provided by the RACF component of the
       z/OS Security Server.
Accessibility features for DB2 Version 9.1 for z/OS
Accessibility features help a user who has a physical disability, such as restricted
mobility or limited vision, to use information technology products successfully.
Accessibility features
The following list includes the major accessibility features in z/OS products,
including DB2 Version 9.1 for z/OS. These features support:
v Keyboard-only operation.
v Interfaces that are commonly used by screen readers and screen magnifiers.
v Customization of display attributes such as color, contrast, and font size.
Tip: The Information Management Software for z/OS Solutions Information
Center (which includes information for DB2 Version 9.1 for z/OS) and its related
publications are accessibility-enabled for the IBM Home Page Reader. You can
operate all features using the keyboard instead of the mouse.
Keyboard navigation
You can access DB2 Version 9.1 for z/OS ISPF panel functions by using a keyboard
or keyboard shortcut keys.
For information about navigating the DB2 Version 9.1 for z/OS ISPF panels using
TSO/E or ISPF, refer to the z/OS TSO/E Primer, the z/OS TSO/E User's Guide, and
the z/OS ISPF User's Guide. These guides describe how to navigate each interface,
including the use of keyboard shortcuts or function keys (PF keys). Each guide
includes the default settings for the PF keys and explains how to modify their
functions.
Related accessibility information
Online documentation for DB2 Version 9.1 for z/OS is available in the Information
Management Software for z/OS Solutions Information Center, which is available at
the following website: http://publib.boulder.ibm.com/infocenter/imzic
IBM and accessibility
See the IBM Accessibility Center at http://www.ibm.com/able for more information
about the commitment that IBM has to accessibility.
How to send your comments
Your feedback helps IBM to provide quality information. Please send any
comments that you have about this book or other DB2 for z/OS documentation.
You can use the following methods to provide comments:
v Send your comments by email to db2zinfo@us.ibm.com and include the name of
the product, the version number of the product, and the number of the book. If
you are commenting on specific text, please list the location of the text (for
example, a chapter and section title or a help topic title).
v You can send comments from the web. Visit the DB2 for z/OS - Technical
Resources website at:
http://www.ibm.com/support/docview.wss?rs=64&uid=swg27011656
This website has an online reader comment form that you can use to send
comments.
v You can also send comments by using the Feedback link at the footer of each
page in the Information Management Software for z/OS Solutions Information
Center at http://publib.boulder.ibm.com/infocenter/imzic.
Part 1. Planning your performance strategy
The first step toward ensuring that DB2 meets the performance requirements of
your solution is planning.
About this task
You can avoid some performance problems completely by planning for
performance when you first design your system. As you begin to plan your
performance, remember the following information.
v DB2 is only a part of your overall system. Any change to System z9® and
System z10® hardware, disk subsystems, z/OS, IMS, CICS, TCP/IP, VTAM, the
network, WebSphere®, or distributed application platforms (such as Windows,
UNIX, or Linux) that share your enterprise IT infrastructure can affect how DB2
and its applications run.
v The recommendations for managing performance are based on current
knowledge of DB2 performance for “normal” circumstances and “typical”
systems. Therefore, there is no guarantee that this information provides the
best or most appropriate advice for any specific site. In particular, the advice
on performance often approaches situations from a performance viewpoint only.
Other factors of higher priority might make some of these performance
recommendations inappropriate for your specific solution.
v The recommendations are general. Performance measurements are highly
dependent on workload and system characteristics that are external to DB2.
Related concepts
Performance monitoring and tuning (DB2 Data Sharing Planning and
Administration)
Chapter 1. Managing performance in general
The steps for managing DB2 performance are like those for any system.
About this task
Performance management is an iterative process.
To manage performance:
Procedure
1. Establish your performance objectives when you plan your system.
2. Consider performance as you design and implement your system.
3. Plan how you will monitor performance and capture performance data.
4. Analyze performance reports to decide whether the objectives have been met.
5. If performance is thoroughly satisfactory, use one of the following options:
v Monitor less, because monitoring itself uses resources.
v Continue monitoring to generate a history of performance to compare with
future results.
6. If performance has not been satisfactory, take the following actions:
a. Determine the major constraints in the system.
b. Decide where you can afford to make trade-offs and which resources can
bear an additional load. Nearly all tuning involves trade-offs among system
resources.
c. Tune your system by adjusting its characteristics to improve performance.
d. Continue to monitor the system.
What to do next
Periodically, or after significant changes to your system or workload, reexamine
your objectives, and refine your monitoring and tuning strategy accordingly.
Chapter 2. Setting reasonable performance objectives
Reasonable performance objectives are realistic, in line with your budget,
understandable, and measurable.
How you define good performance for your DB2 subsystem depends on your
particular data processing needs and their priority.
Common objectives include values for:
Acceptable response time
A duration within which some percentage of all applications have
completed.
Average throughput
The total number of transactions or queries that complete within a given
time.
System availability
Which includes mean time to failure and the durations of down time.
Objectives such as these define the workload for the system and determine the
requirements for resources, such as processor speed, amount of storage, additional
software, and so on. Often, however, available resources limit the maximum
acceptable workload, which requires revising the objectives.
Service-level agreements
Presumably, your users have a say in your performance objectives. A mutual
agreement on acceptable performance, between the data processing and user
groups in an organization, is often formalized and called a service-level agreement.
Service-level agreements can include expectations of query response time, the
workload throughput per day, hour, or minute, and windows provided for batch
jobs (including utilities). These agreements list criteria for determining whether or
not the system is performing adequately.
Example
A service-level agreement might require that 90% of all response times sampled on
a local network in the prime shift be under 2 seconds, or that the average response
time not exceed 6 seconds even during peak periods. (For a network of remote
terminals, consider substantially higher response times.)
Amount of processing
Performance objectives must reflect not only elapsed time, but also the amount of
processing expected. Consider whether to define your criteria in terms of the
average, the ninetieth percentile, or even the worst-case response time. Your choice
can depend on your site's audit controls and the nature of the workloads.
z/OS Workload Manager (WLM) can manage to the performance objectives in the
service-level agreement and provide performance reporting analysis. The terms
used in the service-level agreement and the WLM service policy are similar.
Defining your workloads
Before installing DB2, you should gather design data and evaluate your
performance objectives with that information.
Procedure
To define the workload of the system:
Determine the type of workload. For each type of workload, describe a preliminary
workload profile that includes the following information:
v A definition of the workload type in terms of its function and its volume. You
are likely to have many workloads that perform the same general function (for
example, order entry through CICS, IMS, WebSphere Application Server, or other
transaction managers) and have an identifiable workload profile. Other
workload types include SPUFI and DB2 QMF™ queries, transactions, utilities,
and batch jobs.
For the volume of a workload that is already processed by DB2, use the
summary of its volumes from the DB2 statistics trace.
v The relative priority of the type, including periods during which the priorities
change.
v The resources that are required to do the work, including physical resources that
are managed by the operating system (such as real storage, disk I/O, and
terminal I/O) and logical resources managed by the subsystem (such as control
blocks and buffers).
What to do next
You should review and reevaluate your performance objectives at all phases of
system development.
Sizing your workloads
You can look at base estimates for transactions, query use, and batch processing to
find ways to reduce the workload.
About this task
Design changes in the early stages, before contention with other programs arises,
are likely to be the most effective. Later, you can compare the actual production profile
against the base. Make an estimate even if these quantities are uncertain at this
stage.
Procedure
To establish your resource requirements:
Estimate resource requirements for the following items:
Transactions
v Availability of transaction managers, such as IMS, CICS, or WebSphere
v Number of message pairs for each user function, either to and from a
terminal or to and from a distributed application
v Network bandwidth and network device latencies
v Average and maximum number of concurrent users, either terminal
operators or distributed application requesters
v Maximum rate of workloads per second, minute, hour, day, or week
v Number of disk I/O operations per user workload
v Average and maximum processor usage per workload type and total
workload
v Size of tables
v Effects of objectives on operations and system programming
Query use
v Time required to key in user data
v Online query processing load
v Limits to be set for the query environment or preformatted queries
v Size of tables
v Effects of objectives on operations and system programming
Batch processing
v Batch windows for data reorganization, utilities, data definition
activities, and BIND processing
v Batch processing load
v Length of batch window
v Number of records to process, data reorganization activity, use of
utilities, and data definition activity
v Size of tables and complexity of the queries
v Effects of objectives on operations and system programming
Translating resource requirements into performance
objectives
Your estimated resource requirements are an important input into the process of
defining performance objectives.
Procedure
To translate your resource requirements into performance objectives:
1. For each workload type, convert your estimated resource requirements into
measurable performance objectives. Include the following factors when you
consider your estimates:
System response time
You cannot guarantee requested response times before any of the
design has been done. Therefore, plan to review your performance
targets when you design and implement the system.
Response times can vary for many reasons. Therefore, include
acceptable tolerances in your descriptions of targets. Remember that
distributed data processing adds overhead at both the local and remote
locations.
Exclude from the targets any unusual applications that have
exceptionally heavy requirements for processing or database access, or
establish individual targets for those applications.
Network response time
Responses in the processor are likely to be in microseconds, whereas
responses in the network with appropriate facilities can be about a
millisecond. This difference in response times means that an overloaded
network can impact the delivery of server responses to user terminals
or distributed applications regardless of the speed of the processor.
Disk response time
I/O operations are generally responsible for much internal processing
time. Consider all I/O operations that affect a workload.
Existing workload
Consider the effects of additional work on existing applications. In
planning the capacity of the system, consider the total load on each
major resource, not just the load for the new application.
Business factors
When calculating performance estimates, concentrate on the expected
peak throughput rate. Allow for daily peaks (for example, after receipt
of mail), weekly peaks (for example, a Monday peak after weekend
mail), and seasonal peaks as appropriate to the business. Also allow for
peaks of work after planned interruptions, such as preventive
maintenance periods and public holidays. Remember that the
availability of input data is one of the constraints on throughput.
2. Include statements about the throughput rates to be supported (including any
peak periods) and the internal response time profiles to be achieved.
3. Make assumptions about I/O rates, paging rates, and workloads.
Reviewing performance during external design
You should review performance during the external design phase for your system.
Procedure
During the external design phase, you must:
1. Estimate the network, web server, application server, processor, and disk
subsystem workload.
2. Refine your estimates of logical disk accesses. Ignore physical accesses at this
stage. One of the major difficulties is determining the number of I/Os per
statement.
Reviewing performance during internal design
You should review performance objectives during the internal design of your
system.
Procedure
During the internal design phase, you must:
1. Refine your estimated workload against the actual workload.
2. Refine disk access estimates against database design. After internal design, you
can define physical data accesses for application-oriented processes and
estimate buffer hit ratios.
3. Add the accesses for DB2 work file database, DB2 log, program library, and
DB2 sorts.
4. Consider whether additional processor loads can cause a significant constraint.
5. Refine estimates of processor usage.
6. Estimate the internal response time as the sum of processor time and
synchronous I/O time or as asynchronous I/O time, whichever is larger.
7. Prototype your DB2 system. Before committing resources to writing code, you
can create a small database, update the statistics stored in the DB2 catalog
tables, run SQL statements, and examine the results.
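For example, a minimal sketch, assuming a hypothetical prototype table
PROTO.ORDERS, of modeling production volumes by updating catalog statistics
directly, so that EXPLAIN on the prototype reflects production-scale access paths:
UPDATE SYSIBM.SYSTABLES
   SET CARDF = 1000000,    -- rows expected in production
       NPAGESF = 50000     -- data pages expected in production
 WHERE CREATOR = 'PROTO'
   AND NAME = 'ORDERS';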
Reviewing performance during coding and testing
You should review your performance objectives during the coding and testing
phase for your system.
Procedure
During the coding and testing phases, you must:
1. Refine the internal design estimates of disk and processing resources.
2. Run the monitoring tools you have selected and check the results against your
estimates. You might use a terminal network simulator such as TeleProcessing
Network Simulator (TPNS) or other tools to test the system and simulate load
conditions.
Reviewing performance after development
When you are ready to test the complete system, review its performance in detail.
Procedure
Take the following steps to complete your performance review:
1. Validate system performance and response times against your performance
objectives.
2. Identify resources whose usage requires regular monitoring.
3. Incorporate the observed figures into future estimates:
a. Identify discrepancies from the estimated resource usage
b. Identify the cause of the discrepancies
c. Assign priorities to remedial actions
d. Identify resources that are consistently heavily used
e. Set up utilities to provide graphic representation of those resources
f. Project the processor usage against the planned future system growth to
ensure that adequate capacity is available
g. Update the design document with the observed performance figures
h. Modify your procedures for making estimates according to what you have
learned
Results
You need feedback from users and might have to solicit it. Establish reporting
procedures, and teach your users how to use them. Consider logging incidents
such as:
v System, line, and transaction or query failures
v System unavailable time
v Response times that are outside the specified limits
v Incidents that imply performance constraints, such as deadlocks, deadlock
abends, and insufficient storage
v Situations, such as recoveries, that use additional system resources
You should log detailed information for such incidents:
v Time
v Date
v Location
v Duration
v Cause (if it can be determined)
v Action taken to resolve the problem
Chapter 3. Planning to review performance data
When establishing requirements and planning to monitor performance, you should
also plan how to review the results of monitoring.
About this task
You can inspect your performance data to determine whether performance has
been satisfactory, to identify problems, and to evaluate the monitoring process.
Procedure
v Plan to review the performance data systematically. Review daily data weekly
and weekly data monthly; review data more often when reports raise specific
questions that require investigation. Depending on your system, the weekly
review might require about an hour, particularly after you have had some
experience with the process and are able to locate quickly any items that require
special attention. The monthly review might take half a day at first, less time
later on. But when new applications are installed, workload volumes increase,
or terminals are added, allow more time for review.
v Review the data on a gross level, looking for problem areas. Review details only
if a problem arises or if you need to verify measurements.
v When reviewing performance data, try to identify the basic pattern in the
workload, and then identify variations of the pattern. After a certain period,
discard most of the data you have collected, but keep a representative sample.
For example, save the report from the last week of a month for three months; at
the end of the year, discard all but the last week of each quarter. Similarly, keep
a representative selection of daily and monthly figures. Because of the potential
volume of data, consider using Tivoli Decision Support for z/OS, Application
Monitor for z/OS, or a similar tool to track historical data in a manageable form.
Typical review questions
You can use specific review questions to help guide your review of performance
data.
Use the following questions as a basis for your own checklist. They are not limited
strictly to performance items, but your historical data can provide most of their
answers. If the performance data is for modeled workloads or changed workloads,
the first question to ask for each category is, "What changed?"
How often was each transaction and SQL statement used?
1. Considering variations in the workload mix over time, are the monitoring
times appropriate?
2. Should monitoring be done more frequently during the day, week, or month
to verify this?
3. How many SELECT, INSERT, UPDATE, DELETE, PREPARE, DESCRIBE,
DESCRIBE TABLE, OPEN, FETCH, and CLOSE statements are
issued per second and per commit?
4. How many IRLM and buffer pool requests are issued per second and per
commit?
How were processor and I/O resources used?
1. Has usage increased for functions that run at a higher priority than DB2 tasks?
Examine CICS, IMS, z/OS, JES, TCP/IP, VTAM, WebSphere Application Server,
WebSphere MQ (formerly called MQSeries®), other subsystems, and key
applications.
2. Is the report of processor usage consistent with previous observations?
3. Are scheduled batch jobs able to run successfully?
4. Do any incident reports show that the first invocation of a function takes much
longer than later ones? This increased time can happen when programs have to
open data sets.
5. What is the CPU time and where is it accumulated? Separate CPU time into
accounting TCB and SRB time, and distinguish non-nested, stored procedure,
user-defined function, and trigger CPU times. Note the times for DB2 address
spaces, DBM1, MSTR, IRLM, and DDF.
6. In a data sharing environment, how are the coupling facility (CF) lock, group
buffer pool, and SCA structures performing? What is the peak CF CPU
utilization?
7. Are unnecessary DB2 traces on?
8. Are online monitors performing unnecessary or excessive tracing?
How much real storage was used, and how effective is storage?
1. Is the paging rate increasing? Adequate real storage is very important for DB2
performance.
2. What are the hit ratios for the following types of storage:
v Buffer pools
v EDM RDS pools (above and below)
v EDM statement cache
v EDM DBD cache
v EDM skeleton pool
v Dynamic statement cache
To what degree was disk used?
Is the number of I/O requests increasing? DB2 records both physical and logical
requests. The number of physical I/Os depends on the configuration of indexes,
the number of data records per control interval, and the buffer allocations.
To what extent were DB2 log resources used?
1. Is the log subject to undue contention from other data sets?
Recommendation: Do not put a recoverable (updated) resource and a log
under the same RAID controller. If that controller fails, you lose both the
resource and the log, and you are unable to perform forward recovery.
2. What is the I/O rate for requests and physical blocks on the log?
3. What is the logging rate for one log in MB per second?
4. How fast are the disk units that are used for logging?
Do any figures indicate design, coding, or operational errors?
1. Are disk, I/O, log, or processor resources heavily used? If so, was that heavy
use expected at design time? If not, can the heavy use be explained by
increased workload volumes?
2. Is the heavy usage associated with a particular application? If so, is there
evidence of planned growth or peak periods?
3. What are your needs for concurrent read/write and query activity?
4. Are there any disk, channel, or path problems?
5. Are there any abends or dumps?
What are the effects of DB2 locks?
1. What are the incidents of deadlocks and timeouts?
2. What percentage of elapsed time is due to lock suspensions? How much lock
or latch contention was encountered? Check the contention rate per second by
class.
3. How effective is lock avoidance?
Were there any bottlenecks?
1. Were any critical thresholds reached?
2. Are any resources approaching high utilization?
Related concepts
“Accounting trace” on page 439
“Statistics trace” on page 438
“Using z/OS, CICS, and IMS tools” on page 427
Related tasks
“Monitoring system resources” on page 428
Chapter 23, “Monitoring concurrency and locks,” on page 497
Validating your performance objectives
After beginning to review and monitor performance, you need to find out if your
objectives are reasonable.
About this task
You should consider questions about the objectives, such as:
v Are they achievable, given the available hardware?
v Are they based upon actual measurements of the workload?
Procedure
To measure performance against initial objectives and report the results to users:
1. Identify any systematic differences between the internal response time (the
measured performance data) and the external response time (what the user sees).
2. If the measurements differ greatly from the estimates:
v Revise response-time objectives for the application
v Upgrade your system
v Plan a reduced application workload
If, however, the differences between internal and external response times are
not too large, you can begin monitoring and tuning the entire system.
Part 2. Managing your system performance
By considering performance as you design and configure your system, you can
help to ensure better performance from DB2.
Chapter 4. Managing CPU and I/O to improve response time
and throughput
To ensure that DB2 meets your goals for response time and throughput, you
should manage how your system uses processor resources and I/O processing.
Related tasks
“Setting limits for system resource usage” on page 656
Controlling the number of I/O operations
You can improve the response time of your applications and queries by reducing
the number of unnecessary I/O operations.
Related concepts
“Overview of index access” on page 591
Read operations
DB2 uses prefetch I/O to read data in almost all cases, but uses synchronous I/O
in certain cases.
When synchronous read is used, just one page is retrieved per I/O operation.
DB2 uses prefetch mechanisms, such as dynamic prefetch, sequential prefetch,
and list prefetch, whenever possible to avoid the costly wait times of
synchronous I/O.
Prefetch I/O
DB2 uses different read types to prefetch data and avoid costly synchronous read
operations that can cause application wait times.
Prefetch is a mechanism for reading a set of pages, usually 32, into the buffer pool
with only one asynchronous I/O operation. Prefetch provides for substantial
savings in both CPU and I/O costs. The maximum number of pages read by a
single prefetch operation is determined by the size of the buffer pool that is used
for the operation.
DB2 uses the following types of prefetch to avoid using costly synchronous read to
access data, indexes, and LOBs:
Sequential prefetch
DB2 uses sequential prefetch for table scans and for sequential access to data
in a multi-table segmented table space when index access is not available.
Dynamic prefetch
In dynamic prefetch, DB2 uses a sequential detection algorithm to detect
whether pages are being read sequentially. DB2 tries to distinguish
clustered or sequential pages from random pages. DB2 uses
multi-page asynchronous prefetch I/Os for the sequential pages, and
synchronous I/Os for the random pages.
For example, if the cluster ratio for an index is 100% and a table is read in
key-sequential order according to that index, all of the data pages are
clustered and read sequentially. However, if the cluster ratio is somewhat
© Copyright IBM Corp. 1982, 2010
17
less than 100%, then some of the pages are random and only those pages
are read using synchronous I/Os. Dynamic prefetch works for both
forward and backwards scans.
Because dynamic prefetch uses sequential detection, it is more adaptable to
dynamically changing access patterns than sequential prefetch. DB2 uses
dynamic prefetch in almost all situations; the main exception is table
space scans. Index scans always use dynamic prefetch.
List prefetch
DB2 uses list prefetch to read a set of data pages that is determined by a list
of row IDs taken from an index or from the DB2 log. DB2 also uses list
prefetch to read index leaf pages which are determined from the non-leaf
pages, or to read LOB pages which are determined from the LOB map.
Unlike other types of prefetch, list prefetch does not use any kind of
sequential detection. The following situations are examples of cases in which
DB2 might use list prefetch:
v Reading leaf pages of a disorganized index.
v The optimizer chooses a list prefetch access path.
v Access to LOB data.
v Incremental image copy.
v Log apply operations.
When the optimizer chooses the list prefetch access path, DB2 uses the
following process:
1. Retrieve the list of row identifiers through a matching index scan on
one or more indexes.
2. Sort the list of row identifiers in ascending order by page number.
3. Prefetch the pages in order, by using the sorted list of row identifiers.
List prefetch access paths are ideally suited for queries in which the qualified
rows, as determined in index key sequence, are not sequential or are
skip-sequential but sparse, or for indexes in which the value of the
DATAREPEATFACTORF statistic is large.
You can use the sequential steal threshold (VPSEQT) to protect randomly accessed
pages in the buffer pool. It is beneficial to protect the random pages from the
sequential pages, because it is generally faster to read sequential pages than
random pages from disk, and sequential pages are less likely to be re-accessed.
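For example, a sketch of lowering the sequential steal threshold so that sequential
pages can occupy at most half of buffer pool BP1 (the pool name and value are
illustrative; the default VPSEQT is 80):
-ALTER BUFFERPOOL(BP1) VPSEQT(50)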
Because all prefetch I/Os are executed under a service request block in the DBM1
address space, the I/O time for prefetch I/Os is asynchronous with respect to
CPU time. When a get page operation waits for a prefetch I/O to complete, the
class 3 suspension time is captured as "other read I/O" suspension.
Prefetch CPU time is captured as system SRB time. CPU time for prefetch is
usually small, but it can become significant for index scans because the compressed
index pages are decompressed by the prefetch engine.
Related concepts
“Prefetch access paths (PREFETCH='D', 'S', or 'L')” on page 608
The number of pages read by prefetch:
The number of pages read by prefetch depends on the type of prefetch, the buffer
pool size (VPSIZE), in some cases, the sequential steal threshold (VPSEQT), and the
number of buffers.
The following table shows the number of pages read by prefetch for each
asynchronous I/O, by buffer pool page size (4 KB, 8 KB, 16 KB, and 32 KB).
Table 1. The number of pages read for each asynchronous I/O by prefetch, by buffer pool size

Buffer pool  Number of buffers              Pages read by      Pages read by      Pages read by
size                                        sequential and     dynamic and        utility
                                            LOB list prefetch  non-LOB list       sequential
                                                               prefetch           prefetch
4 KB         VPSIZE < 224                   8                  8                  16
             225 < VPSIZE < 1,000           16                 16                 32
             1,000 <= VPSIZE < 40,000       32                 32                 64
               or VPSIZE*VPSEQT < 40,000
             40,000 <= VPSIZE*VPSEQT        64                 32                 64
               < 80,000
             80,000 <= VPSIZE*VPSEQT        64                 32                 128
8 KB         VPSIZE < 48                    4                  4                  8
             48 < VPSIZE < 400              8                  8                  16
             400 <= VPSIZE < 20,000         16                 16                 32
               or VPSIZE*VPSEQT < 20,000
             20,000 <= VPSIZE*VPSEQT        32                 16                 32
               < 40,000
             40,000 <= VPSIZE*VPSEQT        32                 16                 64
16 KB        VPSIZE < 24                    2                  2                  4
             24 < VPSIZE < 200              4                  4                  8
             200 <= VPSIZE < 10,000         8                  8                  16
               or VPSIZE*VPSEQT < 10,000
             10,000 <= VPSIZE*VPSEQT        16                 8                  16
               < 20,000
             20,000 <= VPSIZE*VPSEQT        16                 8                  32
32 KB        VPSIZE < 12                    1                  1                  2
             12 < VPSIZE < 100              2                  2                  4
             100 <= VPSIZE < 5,000          4                  4                  8
               or VPSIZE*VPSEQT < 5,000
             5,000 <= VPSIZE*VPSEQT         8                  4                  8
               < 10,000
             10,000 <= VPSIZE*VPSEQT        8                  4                  16
Sequential detection at execution time:
Even if DB2 does not choose prefetch at bind time, it can sometimes use prefetch
at execution time. This method is called sequential detection.
When sequential detection is used
DB2 can use sequential detection for both index leaf pages and data pages. It is
most commonly used on the inner table of a nested loop join, if the data is
accessed sequentially. If a table is accessed repeatedly using the same statement
(for example, DELETE in a do-while loop), the data or index leaf pages of the table
can be accessed sequentially. This is common in a batch processing environment.
Sequential detection can then be used if access is through:
v SELECT or FETCH statements
v UPDATE and DELETE statements
v INSERT statements when existing data pages are accessed sequentially
DB2 can use sequential detection if it did not choose sequential prefetch at bind
time because of an inaccurate estimate of the number of pages to be accessed.
Because sequential detection is chosen by DB2 at run time, EXPLAIN tables do not
contain any information about whether it was used. However, you can check
the IFCID 0003 record in the accounting trace or the IFCID 0006 record in the
performance trace to learn whether sequential detection was used.
Sequential detection is not used for an SQL statement that is subject to referential
constraints.
How sequential detection works
The pattern of data access on a page is tracked when the application scans DB2
data through an index. Tracking is done to detect situations where the access
pattern that develops is sequential or nearly sequential.
The most recent eight pages are tracked. A page is considered page-sequential if it
is within P/2 advancing pages of the current page, where P is the prefetch
quantity. P is usually 32.
If a page is page-sequential, DB2 determines further if data access is sequential or
nearly sequential. Data access is declared sequential if more than 4 out of the last
eight pages are page-sequential; this is also true for index-only access. The tracking
is continuous, allowing access to slip into and out of data access sequential.
When data access is first declared sequential, which is called initial data access
sequential, three page ranges are calculated as follows:
v Let A be the page being requested. RUN1 is defined as the page range of length
P/2 pages starting at A.
v Let B be page A + P/2. RUN2 is defined as the page range of length P/2 pages
starting at B.
v Let C be page B + P/2. RUN3 is defined as the page range of length P pages
starting at C.
Sequential detection example
For example, assume that page A is 10. The following figure illustrates the page
ranges that DB2 calculates.
[Figure 1. Initial page ranges to determine when to use prefetch. With P = 32
pages and page A = 10: RUN1 starts at page A (10) and is 16 pages long, RUN2
starts at page B (26) and is 16 pages long, and RUN3 starts at page C (42) and
is 32 pages long.]
For initial data access sequential, prefetch is requested starting at page A for P
pages (RUN1 and RUN2). The prefetch quantity is always P pages.
For subsequent page requests, if the page is page-sequential and data access
sequential is still in effect, prefetch is requested as follows:
v If the desired page is in RUN1, no prefetch is triggered because it was already
triggered when data access sequential was first declared.
v If the desired page is in RUN2, prefetch for RUN3 is triggered; RUN2
becomes RUN1, RUN3 becomes RUN2, and a new RUN3 is defined as the page
range starting at C+P for a length of P pages.
If a data access pattern develops such that data access sequential is no longer in
effect and, thereafter, a new pattern develops that is sequential, then initial data
access sequential is declared again and handled accordingly.
Because, at bind time, the number of pages to be accessed can only be estimated,
sequential detection acts as a safety net and is employed when the data is being
accessed sequentially.
In extreme situations, when certain buffer pool thresholds are reached, sequential
prefetch can be disabled.
Write operations
Write operations are usually performed concurrently with user requests.
Updated pages are queued by data set until they are written when one of the
following events occurs:
v A checkpoint is taken
v The percentage of updated pages in a buffer pool for a single data set exceeds a
preset limit called the vertical deferred write threshold (VDWQT)
v The percentage of unavailable pages in a buffer pool exceeds a preset limit
called the deferred write threshold (DWQT)
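Both thresholds are attributes of the buffer pool and can be tuned with the
ALTER BUFFERPOOL command; the following sketch uses illustrative values:
-ALTER BUFFERPOOL(BP2) VDWQT(1) DWQT(30)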
The following table lists how many pages DB2 can write in a single I/O operation.
Table 2. Number of pages that DB2 can write in a single I/O operation

Page size    Number of pages
4 KB         32
8 KB         16
16 KB        8
32 KB        4
The following table lists how many pages DB2 can write in a single utility I/O
operation. If the number of buffers is large enough, DB2 can write twice as many
pages for each I/O operation for a utility write.
Table 3. The number of pages that DB2 can write for a single I/O operation for utility writes

Page size    Number of buffers    Number of pages
4 KB         BP > 80,000          128
             BP < 80,000          64
8 KB         BP > 40,000          64
             BP < 40,000          32
16 KB        BP > 20,000          32
             BP < 20,000          16
32 KB        BP > 10,000          16
             BP < 10,000          8
As with utility write operations, DB2 can write twice as many pages for each I/O
in a LOB write operation. The following table shows the number of pages that DB2
can write for each I/O operation for a LOB write.
Table 4. The number of pages that DB2 can write in a single I/O operation for LOB writes

Page size    Number of buffers    Number of pages
4 KB         BP > 80,000          64
             BP < 80,000          32
8 KB         BP > 40,000          32
             BP < 40,000          16
16 KB        BP > 20,000          16
             BP < 20,000          8
32 KB        BP > 10,000          8
             BP < 10,000          4
Keeping access path statistics updated
You can reduce the number of unnecessary I/O operations by using the
RUNSTATS utility to keep statistics updated in the DB2 catalog.
About this task
The RUNSTATS utility collects statistics about DB2 objects. These statistics can be
stored in the DB2 catalog and are used during the bind process to choose the path
for accessing data. If you never use RUNSTATS and subsequently rebind your
packages or plans, DB2 does not have the information it needs to choose the most
efficient access path. This lack of information can result in unnecessary I/O operations and excessive
processor consumption.
Procedure
To ensure that catalog statistics are updated:
v Run RUNSTATS at least once against each table and its associated indexes. How
often you rerun the utility depends on how current you need the catalog data to
be. If data characteristics of the table vary significantly over time, you should
keep the catalog current with those changes. RUNSTATS is most beneficial for
the following situations:
– Table spaces that contain frequently accessed tables
– Tables involved in a sort
– Tables with many rows
– Tables against which SELECT statements having many search arguments are
performed
v For some tables, you cannot find a good time to run RUNSTATS. For example,
you might use some tables for work that is in process. The tables might have
only a few rows in the evening when it is convenient to run RUNSTATS, but
they might have thousands or millions of rows in them during the day. For such
tables, consider these possible approaches:
– Set the statistics to a relatively high number and hope your estimates are
appropriate.
– Use volatile tables so that SQL operations choose index access whenever
possible.
Whichever approach you choose, monitor the tables because optimization is
adversely affected by incorrect information.
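For example, a minimal RUNSTATS control statement (the database and table
space names are illustrative) that collects statistics for all tables and indexes
in a table space:
RUNSTATS TABLESPACE DSNDB04.ORDERSTS
  TABLE(ALL) INDEX(ALL)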
Related concepts
“Gathering and updating statistics” on page 510
Related reference
ALTER TABLE (DB2 SQL)
CREATE TABLE (DB2 SQL)
Making buffer pools large enough for the workload
You might improve the performance of I/O operations by increasing the size of
your buffer pools.
You should make buffer pools as large as you can afford for the following reasons:
v Using larger buffer pools might mean fewer I/O operations and therefore faster
access to your data.
v Using larger buffer pools can reduce I/O contention for the most frequently
used tables and indexes.
v Using larger buffer pools can speed sorting by reducing I/O contention for work
files.
However, many factors affect how you determine the number of buffer pools to
have and how big they should be.
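For example, assuming that buffer pool BP1 backs frequently accessed tables and
indexes and that sufficient real storage is available, a sketch of enlarging it to
100,000 buffers:
-ALTER BUFFERPOOL(BP1) VPSIZE(100000)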
Making I/O operations faster
You can use several methods to reduce the time required to perform individual
I/O operations.
Related concepts
“Parallel processing” on page 391
Distributing data sets efficiently
Avoid I/O contention and increase throughput through the I/O subsystem by
placing frequently used data sets on fast disk devices and by distributing I/O
activity.
About this task
Distributing I/O activity is less important when you use disk devices with parallel
access volumes (PAV) support and multiple allegiance support.
Putting frequently used data sets on fast devices
You can improve performance by assigning frequently used data sets to faster disk
devices.
Procedure
To make the best use of your disk devices:
v Assign the most frequently used data sets to the faster disk devices at your
disposal.
v For partitioned table spaces, you might choose to have some partitions on faster
devices than other partitions. Placing frequently used data sets on fast disk
devices also improves performance for nonpartitioned table spaces.
v Consider partitioning any nonpartitioned table spaces that have excessive I/O
contention at the data set level.
Distributing the I/O
By distributing your data sets, you can prevent I/O requests from being queued in
z/OS.
Procedure
To distribute I/O operations:
v If you do not have parallel access volumes, allocate frequently used data sets or
partitions across your available disk volumes so that I/O operations are
distributed. Even with RAID devices, in which the data set is spread across the
physical disks in an array, data should be accessed at the same time on separate
logical volumes to reduce the chance of an I/O request being queued in z/OS.
v Consider isolating data sets that have characteristics that do not complement
other data sets.
Partitioning schemes and data clustering for partitioned table spaces:
Depending on the type of operations that your applications emphasize, you have
several options for distributing I/O.
If the partitions of your partitioned table spaces must be of relatively the same size
(which can be a great benefit for query parallelism), consider using a ROWID
column as all or part of the partitioning key.
For partitions that are of such unequal size that performance is negatively affected,
alter the limit key values to set new partition boundaries and then reorganize the
affected partitions to rebalance the data. Alternatively, you can use the REORG
utility with the REBALANCE keyword to set new partition boundaries.
REBALANCE causes DB2 to change the limit key values such that the rows in the
range of partitions being reorganized are distributed across those partitions as
evenly as possible.
If your performance objectives emphasize inserts, distribute the data in a manner
that reduces the amount of clustered key values. Consider designing your database
with randomized index keys to remove clustering. You can also take
advantage of index page splitting by choosing an appropriate size for index
pages. If the rate of inserts does not require you to spread out the inserts, consider
creating or altering tables with the APPEND YES option.
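For example, a sketch (the table, index, and column names are hypothetical) of
creating an index with a randomized key and of enabling the APPEND option:
CREATE INDEX ORDIX
  ON ORDERS (ORDERNO RANDOM);

ALTER TABLE ORDERS APPEND YES;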
In contrast, for performance objectives that emphasize read operations, your data
clustering should reflect the sequence in which queries will be processed so that
DB2 can use the sequential processing method of parallelism to reduce I/O and
CPU time.
Partition data that will be subject to frequent update operations in a manner that
provides plenty of free space, especially if new and updated rows might expand
because they contain columns with varying-length data types or compressed data.
Related concepts
“Index splitting for sequential INSERT activity” on page 80
“Methods of parallel processing” on page 391
Related tasks
“Designing your databases for concurrency” on page 324
Increasing the number of data sets for an index:
By increasing the number of data sets that are used for an index and spreading
those data sets across the available I/O paths, you can reduce the physical
contention on the index.
Using data-partitioned secondary indexes, or making the piece size of a nonpartitioned
index smaller, increases the number of data sets that are used for the index.
A secondary index on a partitioned table space can be partitioned. When you
partition an index, the partitioning scheme is that of the data in the underlying
table, and each index partition has its own data set. Although data-partitioned
secondary indexes promote partition independence and can provide performance
advantages, they do not always improve query performance. Before using a
data-partitioned secondary index, understand the advantages and disadvantages.
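For example, a sketch (names are illustrative) of a nonpartitioned index whose
data sets are limited to 2 GB each, so that a large index spreads across more data
sets and I/O paths:
CREATE INDEX ACCTIX
  ON ACCOUNTS (ACCTNO)
  PIECESIZE 2 G;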
Creating additional work file table spaces to reduce
contention
You can minimize I/O contention in certain situations by creating additional work
file table spaces.
About this task
For a single query, the recommended number of work file disk volumes to have is
one-fifth the maximum number of data partitions, with 5 as a minimum and 50 as
a maximum. For concurrently running queries, multiply this value by the number
of concurrent queries. Depending on the number of table spaces and the amount of
concurrent activity, performance can vary. In general, adding more table spaces
improves performance.
Procedure
To fine tune the combination of different types of tables in the work file database:
v In query parallelism environments, ensure that the number of work file disk
volumes is at least equal to the maximum number of parallel operations used
for queries in a given workload. Then:
1. Place the volumes on different channel or control unit paths.
2. Monitor the I/O activity for the work file table spaces. You might need to
further separate this work file activity to avoid contention.
3. As the amount of work file activity increases, consider increasing the size of
the buffer pool for work files to support concurrent activities more efficiently.
The general recommendation for the work file buffer pool is to increase the
size to minimize the following buffer pool statistics:
– MERGE PASSES DEGRADED, which should be less than 1% of MERGE
PASS REQUESTED
– WORKFILE REQUESTS REJECTED, which should be less than 1% of
WORKFILE REQUEST ALL MERGE PASSES
– Synchronous read I/O, which should be less than 1% of pages read by
prefetch
– Prefetch quantity of 4 or less, which should be near 8
v If your applications require extensive use of temporary objects or operations that
can be processed in work files that span more than one table space, define
multiple table spaces in the work file database that have a zero secondary
quantity and are stored in DB2-managed data sets. Processing that uses work
files and can span more than one table space includes objects and operations
such as:
– Large concurrent sorts and single large sorts
– Created temporary tables
– Some merge, star, and outer joins
– Non-correlated subqueries
– Materialized views
– Materialized nested table expressions
– Triggers with transition variables
When a table space in the work file database is stored in user-managed data
sets, DB2 does not detect during table space selection whether any secondary
allocation exists. Consequently when space is allocated for work files, such table
spaces are given the same preference as table spaces that are stored in
DB2-managed data sets and have a zero secondary quantity, even when the table
space has a secondary allocation.
DB2 gives preference to table spaces that are stored in DB2-managed data sets
and have zero secondary quantity when allocating space for work files for
processing that can span more than one table space. By creating multiple work
file table spaces with zero secondary quantity, you support efficient concurrent
read and write I/Os to the work files.
v If your applications require extensive use of temporary objects or operations that
can be processed only in a single table space, define some table spaces in the
work file database that have a non-zero secondary quantity and are stored in
DB2-managed data sets. Processing that uses work files and is limited to a single
table space includes objects and operations such as:
– Declared global temporary tables
– Scrollable cursors
– SQL MERGE statements
The table spaces with non-zero secondary quantity help to minimize contention
for space between temporary objects or operations that can span multiple table
spaces and those that cannot. DB2 gives preference for processing that cannot
span more than one table space to table spaces that are stored in DB2-managed
data sets and that can grow beyond the primary space allocation because the
secondary quantity is greater than zero.
v Ensure that the work file database contains at least one 32-KB page size table
space before you create declared global temporary tables. The rows of declared
global temporary tables reside in a table space in the work file database, and
DB2 does not create an implicit table space for the declared global temporary
table.
v To further improve performance, consider allocating more 32-KB data sets. DB2
uses 32-KB work file data sets when the total work file record size, including
record overhead, exceeds 100 bytes. In some cases, this choice results in better
performance and reduced work file buffer pool and disk space usage.
Example
To create new work file table spaces:
1. Use the VSAM DEFINE CLUSTER statement to define the required data sets.
(If you are using DB2-managed data sets, omit this step.) You can use the
definitions in the edited DSNTIJTM installation job as a model.
2. Create the work file table space by entering the following SQL statement:
CREATE TABLESPACE xyz IN DSNDB07
BUFFERPOOL BP7
CLOSE NO
USING VCAT DSNC910;
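A DB2-managed variant with a zero secondary quantity, as recommended above for
processing that can span table spaces, might look like the following sketch (the
storage group name and primary quantity are illustrative):
CREATE TABLESPACE WFTS01 IN DSNDB07
  USING STOGROUP SYSDEFLT
  PRIQTY 720000 SECQTY 0
  BUFFERPOOL BP7
  CLOSE NO;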
Related concepts
“How sort work files are allocated” on page 58
Related information
Work file sizing
Formatting early and speed-up formatting
You can improve the performance of applications that use heavy insert processing
by allocating space so that cylinders are used as the allocation amount, and by
pre-formatting a table space before inserting data.
Allocating space in cylinders or in large primary and secondary
quantities
Specify your space allocation amounts to ensure allocation by CYLINDER. If you
use record allocation for more than a cylinder, cylinder allocation is used. Cylinder
allocation can reduce the time required to do SQL mass inserts and to perform
LOGONLY recovery; it does not affect the time required to recover a table space
from an image copy or to run the REBUILD utility.
When inserting records, DB2 pre-formats space within a page set as needed. The
allocation amount, which is either CYLINDER or TRACK, determines the amount
of space that is pre-formatted at any one time.
Because less space is pre-formatted at one time for the TRACK allocation amount,
a mass insert can take longer when the allocation amount is TRACK than the same
insert when the allocation amount is CYLINDER. However, smart secondary space
allocation minimizes the difference between TRACK and CYLINDER.
The allocation amount is dependent on device type and the number of bytes you
specify for PRIQTY and SECQTY when you define table spaces and indexes. The
default SECQTY is 10% of the PRIQTY, or 3 times the page size, whichever is
larger. This default quantity is an efficient use of storage allocation. Choosing a
SECQTY value that is too small in relation to the PRIQTY value results in track
allocation.
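For example, a sketch (names and values are illustrative) that requests quantities
large enough for cylinder allocation; on a 3390 device, one cylinder holds about
720 KB of 4-KB pages:
CREATE TABLESPACE HISTTS IN HISTDB
  USING STOGROUP SYSDEFLT
  PRIQTY 7200 SECQTY 720;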
Pre-formatting during LOAD or REORG
When DB2 pre-formatting delays impact the performance or execution time
consistency of applications that do heavy insert processing, and if the table size
can be predicted for a business processing cycle, consider using the PREFORMAT
option of LOAD and REORG. If you preformat during LOAD or REORG, DB2
does not have to preformat new pages during execution. When the pre-formatted
space is used and when DB2 has to extend the table space, normal data set
extending and pre-formatting occurs.
Consider pre-formatting only if pre-formatting is causing a measurable delay with
the insert processing or causing inconsistent elapsed times for insert applications.
Recommendation: Quantify the results of pre-formatting in your environment by
assessing the performance both before and after using pre-formatting.
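For example, a sketch (the table space name is illustrative) that pre-formats the
allocated space during reorganization:
REORG TABLESPACE DSNDB04.ORDERSTS PREFORMAT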
Related concepts
Secondary space allocation (DB2 Administration Guide)
Improving performance with LOAD or REORG PREFORMAT (DB2 Utilities)
Avoiding excessively small extents
Data set extent size affects performance because excessively small extents can
degrade performance during a sequential database scan.
Example
Suppose that the sequential data transfer speed is 100 MB per second and that the
extent size is 10 MB. The sequential scan must move to a new extent ten times per
second.
Recommendation: Maintain extent sizes that are large enough to avoid excessively
frequent extent moving during scans. Because as many as 16 cylinders can be
pre-formatted at the same time, keep the extent size greater than 16 cylinders for
large data sets.
Maximum number of extents
An SMS-managed linear data set is limited to 123 extents on a volume and 7257
total extents on all volumes. A non-SMS-managed data set is limited to 123 extents
on a volume and 251 total extents on all volumes. If a data set grows and extents
are not monitored, jobs eventually fail due to these extent limitations.
Recommendation: Monitor the number of extents to avoid reaching the maximum
number of extents on a volume and the maximum number of extents on all
volumes.
Specifying primary quantity for nonpartitioned indexes
Specifying sufficient primary and secondary allocations for frequently used data
sets minimizes I/O time, because the data is not located at different places on the
disks.
Listing the catalog or VTOC occasionally to determine the number of secondary
allocations that have been made for your more frequently used data sets can also
be helpful. Alternatively, you can use IFCID 0258 in the statistics class 3 trace and
real time statistics to monitor data set extensions. OMEGAMON monitors IFCID
0258.
To prevent wasted space for non-partitioned indexes, take one of the
following actions:
v Let DB2 use the default primary quantity and calculate the secondary quantities.
Do this by specifying 0 for the IXQTY subsystem parameter, and by omitting a
PRIQTY and SECQTY value in the CREATE INDEX statement or ALTER INDEX
statement. If a primary and secondary quantity were previously specified for an
index, you can specify PRIQTY -1 and SECQTY -1 to change to the default
primary quantity and calculated secondary quantity.
v If the MGEXTSZ subsystem parameter is set to NO, so that you control
secondary space allocations, make sure that the value of PRIQTY + (N ×
SECQTY) is a value that evenly divides into PIECESIZE.
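For example, a sketch (the index name is illustrative) that switches an index back
to the default primary quantity and a calculated secondary quantity:
ALTER INDEX ACCTIX PRIQTY -1 SECQTY -1;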
Chapter 5. z/OS performance options for DB2
You can set z/OS performance options for DB2 using z/OS Workload Manager.
With Workload Manager (WLM), you define performance goals and assign a
business importance to each goal. You define the goals for work in business terms,
and the system decides how much resource, such as CPU and storage, should be
given to the work to meet its goal.
WLM controls the dispatching priority based on the goals you supply. WLM raises
or lowers the priority as needed to meet the specified goal. Thus, you do not need
to fine-tune the exact priorities of every piece of work in the system and can focus
instead on business objectives.
The three kinds of goals are:
Response time
How quickly you want the work to be processed.
Execution velocity
How fast the work should be run when ready, without being delayed for
processor, storage, I/O access, and queue delay.
Discretionary
A category for low priority work for which you define no performance
goals.
Response times are appropriate goals for “end user” applications, such as DB2
QMF users running under the TSO address space goals, or users of CICS using the
CICS workload goals. You can also set response time goals for distributed users,
as described in “Using z/OS Workload Manager to set performance objectives” on
page 134.
For DB2 address spaces, velocity goals are more appropriate. A small amount of
the work done in DB2 is counted toward this velocity goal. Most of the work done
in DB2 applies to the end user goal.
For information about WLM and defining goals through the service definition, see
z/OS MVS Planning: Workload Management.
Determining z/OS Workload Manager velocity goals
With Workload Manager (WLM), you define performance goals and assign a
business importance to each goal.
Before you begin
v You are managing CICS, IMS, or DDF according to WLM response-time goals.
v You are set up to use WLM-established stored procedures address spaces.
Procedure
To set velocity goals:
v Use the following service classes for velocity:
– The default SYSSTC service class for the following address spaces:
- VTAM
- TCP/IP
- IRLM (IRLMPROC)
- ssnmMSTR
Important: The ssnmMSTR address space should have SYSSTC priority, so
that the DB2 system monitor task runs at high enough priority to monitor
CPU stalls and virtual storage constraints. The VTAM, TCP/IP, IRLM, and
ssnmMSTR address spaces must always have a higher dispatching priority
than the other DBMS address spaces, their attached address spaces, and their
subordinate address spaces. Do not allow WLM to reduce the priority of
VTAM, TCP/IP, IRLM or ssnmMSTR address spaces to or below the priority
of the other DBMS address spaces.
– A high-velocity goal for a service class whose name you define, such as
PRODREGN, for the following address spaces:
- DB2 address spaces, including ssnmDBM1, and ssnmDIST, which should
always have the same service class.
- CICS address spaces (all region types)
- IMS address spaces (all region types except BMPs)
The velocity goals for CICS and IMS regions are only important during startup
or restart. After transactions begin running, WLM ignores the CICS or IMS
velocity goals and assigns priorities based on the goals of the transactions that
are running in the regions. A high velocity goal is good for ensuring that
startups and restarts are performed as quickly as possible.
Similarly, when you set response time goals for DDF threads, the only work
controlled by the DDF or stored procedure velocity goals is the DB2 service
tasks (work performed for DB2 that cannot be attributed to a single user). The
user work runs under separate WLM service class goals assigned to the enclaves
through the classification rules that are defined for the subsystem DDF in the
WLM policy.
IMS BMPs can be treated along with other batch jobs or given a velocity goal,
depending on what business and functional requirements you have at your site.
v Consider the following other Workload Manager recommendations:
– IRLM must be eligible for the SYSSTC service class. To make IRLM eligible
for SYSSTC, do not classify IRLM to one of your own service classes.
– If you need to change a goal, change the velocity by between 5 and 10%. Velocity
goals do not translate directly to priority. Higher velocity tends to have higher
priority, but this is not always the case.
– WLM can assign I/O priority (based on I/O delays) separately from
processor priority.
For information about how read and write I/O priorities are determined, see
“How DB2 assigns I/O priorities”.
– z/OS Workload Manager dynamically manages storage isolation to meet the
goals you set.
How DB2 assigns I/O priorities
DB2 informs z/OS about which address space's priority is to be associated with a
particular I/O request.
32
Performance Monitoring and Tuning Guide
WLM handles the management of the request from that point on. The following
table describes the enclave or address space with which DB2 associates I/O read requests.
Table 5. How read I/O priority is determined

Request type         Local                          DDF or Sysplex query parallelism (assistant only)
Synchronous reads    Application's address space    Enclave priority
Prefetch reads       Application's address space    Enclave priority
The following table describes the enclave or address space with which DB2
associates I/O write requests.
Table 6. How write I/O priority is determined

Request type         Local                          DDF
Synchronous writes   Application's address space    DDF address space
Deferred writes      ssnmDBM1 address space         ssnmDBM1 address space
Chapter 6. Configuring storage for performance
Increasing the I/O rate and decreasing the frequency of disk access to move data
between real storage and storage devices is key to good performance.
About this task
To meet the diverse needs of application data, a range of storage options is
available, each with different access speeds, capacities, and costs per megabyte.
This broad selection of storage alternatives supports requirements for enhanced
performance and expanded online storage options, providing more options in
terms of performance and price.
The levels in the DB2 storage hierarchy include real storage, storage controller
cache, disk, and auxiliary storage.
Storage servers and channel subsystems
With IBM TotalStorage and other newer storage servers, channels can be more of
a bottleneck than any other component of the I/O subsystem.
The degree of I/O parallelism that can be sustained efficiently is largely a function
of the number of channels. In general, more channels mean better performance.
However, not all channels are alike. ESCON® channels, which used to be the
predominant channel type, have a maximum instantaneous data transfer rate of
approximately 17 MB per second. FICON® channels currently have a speed of 4 Gb
per second. FICON is the z/OS equivalent of Open Systems Fibre Channel
Protocol (FCP). The FICON speed is bidirectional, theoretically allowing 4 Gb per
second to be sustained in both directions. Channel adaptors in the host processor
and the storage server limit the actual speed. The FICON channels in the System
z9 and System z10 servers are faster than those in the prior processors, and they
feature MIDAW (Modified Indirect Data Address Word) channel program
improvements.
Balancing the storage controller cache and buffer resources
The amount of cache to use for DB2 depends primarily on the relative importance
of price and performance.
Having large memory resources for both DB2 buffers and storage controller cache
is not often effective. If you decide to concentrate on the storage controller cache
for performance gains, use the maximum available cache size. If the cache is
substantially larger than the DB2 buffer pools, DB2 can make effective use of the
cache to reduce I/O times for random I/O. For sequential I/O, the improvement
from the cache provides is generally small.
Improving the use of real and virtual storage
You can use several techniques to minimize the use of storage by DB2.
About this task
The amount of real storage often needs to be close to the amount of virtual
storage.
Procedure
To minimize the amount of storage that DB2 uses:
v Use less buffer pool storage. Using fewer and smaller buffer pools reduces the
amount of real storage space DB2 requires. Buffer pool size can also affect the
number of I/O operations performed; the smaller the buffer pool, the more I/O
operations needed. Also, some SQL operations, such as joins, can create a result
row that does not fit on a 4-KB page.
v Commit frequently to minimize the storage needed for locks.
v Improve the performance for sorting. The highest performance sort is the sort
that is avoided. However, because some sorting cannot be avoided, make sorting
as efficient as possible. For example, assign the work file table spaces in database
DSNDB07, which are used in sorting, to a buffer pool other than BP0, such as
BP7.
v Provide for pooled threads. Distributed threads that are allowed to be pooled
use less storage than inactive database access threads. On a per connection basis,
pooled threads use even less storage than inactive database access threads.
v Ensure ECSA size is adequate. The extended common service area (ECSA) is a
system area that DB2 shares with other programs. Shortage of ECSA at the
system level leads to use of the common service area.
DB2 places some load modules and data into the common service area. These
modules require primary addressability to any address space, including the
address space of the application. Some control blocks are obtained from common
storage and require global addressability. For more information, see DB2
Installation Guide.
v Ensure EDM pool space is being used efficiently. Monitor your use of EDM pool
storage using DB2 statistics.
v Use the long-term page fix option for I/O-intensive buffer pools. Use PGFIX(YES)
for buffer pools with a high I/O rate, that is, a high number of pages read or
written.
Related concepts
“EDM storage” on page 51
“Overview of index access” on page 591
“Sort access” on page 611
Estimating storage (DB2 Installation Guide)
Related tasks
“Improving the performance of sort processing” on page 59
“Fixing a buffer pool in real storage” on page 50
Planning storage for (DB2 Installation Guide)
Real storage
Real storage refers to the processor storage where program instructions reside while
they are executing.
Real storage also refers to where data is held, for example, data in DB2 buffer
pools that has not been paged out to auxiliary storage, the EDM pools, and the
sort pool. To be used, data must either reside or be brought into processor storage
or processor special registers. The maximum amount of real storage that one DB2
subsystem can use is the real storage of the processor, although other limitations
might be encountered first.
The large capacity for buffers in real storage and the write avoidance and
sequential access techniques allow applications to avoid a substantial amount of
read and write I/O, combining single accesses into sequential access, so that the
disk devices are used more effectively.
Virtual storage
Virtual storage is auxiliary storage space that can be regarded as addressable
storage because virtual addresses are mapped to real addresses.
Tuning DB2 buffer, EDM, RID, and sort pools
Proper tuning of your buffer pools, EDM pools, RID pools, and sort pools can
improve the response time and throughput for your applications and provide
optimum resource utilization. Using data compression can also improve
buffer-pool hit ratios and reduce table space I/O rates.
Tuning database buffer pools
Buffer pools are areas of virtual storage that temporarily store pages of table spaces
or indexes. When an application program accesses a row of a table, DB2 places the
page that contains that row in a buffer.
About this task
If the requested data is already in a buffer, the application program does not have
to wait for it to be retrieved from disk. Avoiding the need to retrieve data from
disk results in faster performance.
If the row is changed, the data in the buffer must be written back to disk
eventually. But that write operation might be delayed until DB2 takes a checkpoint,
or until one of the related write thresholds is reached. (In a data sharing
environment, however, the writing mechanism is somewhat different.)
The data remains in the buffer until DB2 decides to use the space for another page.
Until that time, the data can be read or changed without a disk I/O operation.
DB2 allows you to use up to 50 buffer pools that contain 4-KB buffers and up to 10
buffer pools each for 8-KB, 16-KB, and 32-KB buffers. You can set the size of each
of those buffer pools separately during installation.
Buffer Pool Analyzer: You can use the Buffer Pool Analyzer for z/OS to get
recommendations for buffer pool allocation changes and to do “what if” analysis of
your buffer pools.
Procedure
To change the size and other characteristics of a buffer pool, or enable DB2
automatic buffer pool size management:
Use the ALTER BUFFERPOOL command. You can issue the ALTER BUFFERPOOL
command at any time while DB2 is running.
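For example, a command of the following form changes the size of a buffer pool.
(The pool name BP1 and the size of 40000 buffers are placeholder values chosen
for illustration only.)

-ALTER BUFFERPOOL(BP1) VPSIZE(40000)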
Related concepts
Write operations (DB2 Data Sharing Planning and Administration)
Types of buffer pool pages:
The pages in a database buffer pool can be classified into the following types:
in-use, updated, and available pages.
In-use pages
Pages that contain data that is currently being read or updated. This group
of pages is important because insufficient space for these pages might
cause DB2 to queue or even stop work. These pages are not available to be
overwritten, or stolen, by a new page of data that is read into the buffer
pool.
Updated pages
Pages that contain data that has been updated but not yet written to disk
storage. The data on these pages can be reused by the same thread in the
same unit of work, and by any other thread if row locking is used and the
separate threads do not lock the same row. However, these pages are not
available to be stolen and overwritten when a new page of data is read
into the buffer pool.
Available pages
Pages that contain data that can be considered for reuse to avoid I/O and
can be overwritten by data from a different page that has to be read
into the buffer pool. Available pages are normally stolen on a least recently
used basis, but you can also specify a first-in-first-out (FIFO) page-stealing
algorithm. An important subset of the available pages consists of those that
have been prefetched into the pool by a sequential, list, or dynamic
prefetch, but have not yet been used. These pages, like other available
pages, are available for page stealing. When an available page is stolen
before it is used and is subsequently needed, DB2 schedules a synchronous
I/O operation to read the page into the buffer pool.
Assigning a table space or index to a buffer pool:
How you assign data to buffer pools can have a significant impact on performance.
About this task
Restriction: You cannot use the ALTER statement to change the assignment of the
catalog and the directory.
BP0 is the default buffer pool for sorting, but you can change that by assigning the
work file table spaces to another buffer pool. BP0 has a default size of 20000 and a
minimum size of 2000. As with any other buffer pool, use the ALTER
BUFFERPOOL command to change the size of BP0.
Procedure
To assign a table space or an index to a particular buffer pool:
Issue one of the following SQL statements and specify the BUFFERPOOL clause:
v CREATE TABLESPACE
v ALTER TABLESPACE
v CREATE INDEX
v ALTER INDEX
The buffer pool is actually allocated the first time a table space or index assigned
to it is opened.
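For example, the following statements illustrate the BUFFERPOOL clause. (The
object names MYDB.MYTS and MYUSER.MYINDEX, and the pool names, are
hypothetical placeholders.)

ALTER TABLESPACE MYDB.MYTS BUFFERPOOL BP2;
ALTER INDEX MYUSER.MYINDEX BUFFERPOOL BP3;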
Specifying a default buffer pool for user data:
You can assign a buffer pool other than BP0 to user data and user indexes.
About this task
Choosing values other than the default value, BP0, for both user data and user
indexes is a good idea, because BP0 must be used by the DB2 catalog and
directory. BP0 is much more difficult to monitor and tune if user data and indexes
also use that buffer pool.
Procedure
In the DSNTIP1 installation panel, change the values in the following fields:
v DEFAULT BUFFER POOL FOR USER DATA
v DEFAULT BUFFER POOL FOR USER INDEXES
Buffer pool thresholds:
How DB2 uses a buffer pool is governed by several preset values called
thresholds.
Each threshold is a level of use which, when exceeded, causes DB2 to take some
action. Certain thresholds might indicate a buffer pool shortage problem, while
other thresholds merely report normal buffer management by DB2. The level of
use is usually expressed as a percentage of the total size of the buffer pool. For
example, the “immediate write threshold” of a buffer pool is set at 97.5%. When
the percentage of unavailable pages in a buffer pool exceeds that value, DB2 writes
pages to disk when updates are completed.
For very small buffer pools, of fewer than 1000 buffers, some of the thresholds
might be lower to prevent “buffer pool full” conditions, but those thresholds are
not described here.
Fixed thresholds:
Some thresholds, such as the immediate write threshold, cannot be changed. You
should monitor buffer pool usage and note how often those thresholds are reached.
If the fixed thresholds are reached too often, the remedy is to increase the size of
the buffer pool, which you can do with the ALTER BUFFERPOOL command.
However, increasing the size can affect other buffer pools, depending on the total
amount of real storage available for your buffers.
The fixed thresholds are more critical for performance than the variable thresholds.
Generally, you want to set buffer pool sizes large enough to avoid reaching any of
these thresholds, except occasionally.
Each of the fixed thresholds is expressed as a percentage of the buffer pool that
might be occupied by unavailable pages.
From the highest value to the lowest value, the fixed thresholds are:
Immediate write threshold (IWTH): 97.5%
This threshold is checked whenever a page is to be updated. If the
threshold has been exceeded, the updated page is written to disk as soon
as the update completes. The write is synchronous with the SQL request;
that is, the request waits until the write is completed. The two operations
do not occur concurrently.
Reaching this threshold has a significant effect on processor usage and I/O
resource consumption. For example, updating three rows per page in 10
sequential pages ordinarily requires one or two write operations. However,
when IWTH has been exceeded, the updates require 30 synchronous
writes.
Sometimes DB2 uses synchronous writes even when the IWTH has not
been exceeded. For example, when more than two checkpoints pass
without a page being written, DB2 uses synchronous writes. Situations
such as these do not indicate a buffer shortage.
Data management threshold (DMTH): 95%
This threshold is checked before a page is read or updated. If the threshold
is not exceeded, DB2 accesses the page in the buffer pool once for each
page, no matter how many rows are retrieved or updated in that page. If
the threshold is exceeded, DB2 accesses the page in the buffer pool once
for each row that is retrieved or updated in that page.
Recommendation: Avoid reaching the DMTH because
it has a significant effect on processor usage.
The DMTH is maintained for each individual buffer pool. When the
DMTH is reached in one buffer pool, DB2 does not release pages from
other buffer pools.
Sequential prefetch threshold (SPTH): 90%
This threshold is checked at two different times:
v Before scheduling a prefetch operation. If the threshold has been
exceeded, the prefetch is not scheduled.
v During buffer allocation for an already-scheduled prefetch operation. If
the threshold has been exceeded, the prefetch is canceled.
When the sequential prefetch threshold is reached, sequential prefetch is
inhibited until more buffers become available. Operations that use
sequential prefetch, such as those using large and frequent scans, are
adversely affected.
Thresholds that you can change:
You can change some thresholds directly by using the ALTER BUFFERPOOL
command.
Changing a threshold in one buffer pool has no effect on any other buffer pool.
From highest to lowest default value, the variable thresholds are:
Sequential steal threshold (VPSEQT)
This threshold is a percentage of the buffer pool that might be occupied by
sequentially accessed pages. These pages can be in any state: updated, in-use, or
available. Hence, any page might or might not count toward exceeding any other
buffer pool threshold.
The default value for this threshold is 80%. You can change that to any value from
0% to 100% by using the VPSEQT option of the ALTER BUFFERPOOL command.
This threshold is checked before stealing a buffer for a sequentially accessed page
instead of accessing the page in the buffer pool. If the threshold has been
exceeded, DB2 tries to steal a buffer that holds a sequentially accessed page rather
than one that holds a randomly accessed page.
Setting the threshold to 0% prevents any sequential pages from taking up space in
the buffer pool. In this case, prefetch is disabled, and any sequentially accessed
pages are discarded as soon as the number of available buffers is exceeded by the
number of objects being accessed. Setting VPSEQT to 0% is recommended for
avoiding unnecessary prefetch scheduling when the pages are already in the buffer
pool, such as in the case of in-memory indexes or data. However, setting VPSEQT
to 0 might disable parallelism.
Setting the threshold to 100% allows sequential pages to monopolize the entire
buffer pool.
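For example, the following command sets the sequential steal threshold to 0% for
a pool that holds in-memory indexes or data. (BP4 is a placeholder pool name.)

-ALTER BUFFERPOOL(BP4) VPSEQT(0)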
Virtual buffer pool parallel sequential threshold (VPPSEQT)
This threshold is a portion of the buffer pool that might be used to support parallel
operations. It is measured as a percentage of the sequential steal threshold
(VPSEQT). Setting VPPSEQT to zero disables parallel operation.
The default value for this threshold is 50% of the sequential steal threshold
(VPSEQT). You can change that to any value from 0% to 100% by using the
VPPSEQT option on the ALTER BUFFERPOOL command.
Virtual buffer pool assisting parallel sequential threshold (VPXPSEQT)
This threshold is a portion of the buffer pool that might be used to assist with
parallel operations initiated from another DB2 in the data sharing group. It is
measured as a percentage of VPPSEQT. Setting VPXPSEQT to zero disallows this
DB2 from assisting with Sysplex query parallelism at run time for queries that use
this buffer pool.
The default value for this threshold is 0% of the parallel sequential threshold
(VPPSEQT). You can change that to any value from 0% to 100% by using the
VPXPSEQT option on the ALTER BUFFERPOOL command.
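For example, the following command allows 80% of the sequential steal threshold
to support local parallel operations and half of that amount to assist parallel
queries from other data sharing members. (BP5 and the values are placeholders
chosen for illustration.)

-ALTER BUFFERPOOL(BP5) VPPSEQT(80) VPXPSEQT(50)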
Deferred write threshold (DWQT)
This threshold is a percentage of the buffer pool that might be occupied by
unavailable pages, including both updated pages and in-use pages.
The default value for this threshold is 30%. You can change that to any value from
0% to 90% by using the DWQT option on the ALTER BUFFERPOOL command.
DB2 checks this threshold when an update to a page is completed. If the
percentage of unavailable pages in the buffer pool exceeds the threshold, write
operations are scheduled for enough data sets (at up to 128 pages per data set) to
decrease the number of unavailable buffers to 10% below the threshold. For
example, if the threshold is 50%, the number of unavailable buffers is reduced to
40%.
When the deferred write threshold is reached, the data sets with the oldest
updated pages are written asynchronously. DB2 continues writing pages until the
ratio goes below the threshold.
Vertical deferred write threshold (VDWQT)
This threshold is similar to the deferred write threshold, but it applies to the
number of updated pages for a single page set in the buffer pool. If the percentage
or number of updated pages for the data set exceeds the threshold, writes are
scheduled for that data set, up to 128 pages.
You can specify this threshold in one of two ways:
Percentage
Percentage of the buffer pool that might be occupied by updated pages
from a single page set. The default value for this threshold is 5%. You can
change the percentage to any value from 0% to 90%.
Absolute number
The total number of buffers in the buffer pools that might be occupied by
updated pages from a single page set. You can specify the number of
buffers from 0 to 9999. If you want to use the number of buffers as your
threshold, you must set the percentage threshold to 0.
You can change the percent or number of buffers by using the VDWQT keyword
on the ALTER BUFFERPOOL command.
Because any buffers that count toward VDWQT also count toward DWQT, setting
the VDWQT percentage higher than DWQT has no effect: DWQT is reached first,
write operations are scheduled, and VDWQT is never reached. Therefore, the
ALTER BUFFERPOOL command does not allow you to set the VDWQT percentage
to a value greater than DWQT. You can specify a number of buffers for VDWQT
that is higher than DWQT, but again, with no effect.
This threshold is overridden by certain DB2 utilities, which use a constant limit of
64 pages rather than a percentage of the buffer pool size. LOAD, REORG, and
RECOVER use a constant limit of 128 pages.
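For example, the following command sets the deferred write threshold to 30% and
uses an absolute vertical threshold of 64 buffers by setting the VDWQT percentage
to 0. (BP2 and the values are illustrative placeholders.)

-ALTER BUFFERPOOL(BP2) DWQT(30) VDWQT(0,64)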
VDWQT set to 0:
If you set VDWQT to zero, DB2 implicitly uses the smaller of 1% of the buffer pool
(a specific number of pages), or the number determined by the buffer pool page
size as shown in the following table, to avoid synchronous writes to disk.
Table 7. Number of changed pages based on buffer pool size

Buffer pool page size    Number of changed pages
4 KB                     40
8 KB                     24
16 KB                    16
32 KB                    12
Related concepts
Improving the response time for read-only queries (DB2 Data Sharing Planning
and Administration)
Guidelines for setting buffer pool thresholds:
How you set buffer pool thresholds depends on your workload and the type and size
of data being cached. But always think about the entire system when making buffer
pool tuning decisions.
For additional help in tuning your buffer pools, try the Buffer Pool Analyzer for
z/OS.
Frequently re-referenced and updated pages
Suppose that you have a workload such as a branch table in a bank that contains a
few hundred rows and is updated by every transaction. For such a workload, you
want a high value (90%) for the deferred write and vertical deferred write
thresholds. The result is that I/O is deferred until DB2 checkpoint and you have a
lower I/O rate to disk.
However, if the set of pages updated exceeds the size of the buffer pool, setting
both DWQT and VDWQT to 90% might cause the sequential prefetch threshold
(and possibly the data management threshold and the immediate write threshold)
to be reached frequently. You might need to set DWQT and VDWQT lower in that
case.
Rarely referenced pages
Suppose that you have a customer table in a bank that has millions of rows that
are accessed randomly or are updated sequentially in batch.
In this case, lowering the DWQT or VDWQT thresholds (perhaps down to 0) can
avoid a surge of write I/Os caused by DB2 checkpoint. Lowering those thresholds
causes the write I/Os to be distributed more evenly over time. Lowering them can
also improve performance for the storage controller cache by avoiding the problem
of flooding the device at DB2 checkpoint.
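For example, commands of the following form sketch these two cases: high
deferred write thresholds for a small, frequently updated table, and low thresholds
for large, randomly accessed data. (The pool names BP11 and BP12 and the exact
values are illustrative placeholders.)

-ALTER BUFFERPOOL(BP11) DWQT(90) VDWQT(90,0)
-ALTER BUFFERPOOL(BP12) DWQT(0) VDWQT(0,0)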
Query-only buffer pools
For a buffer pool that is used exclusively for sequential processing, setting VPSEQT
to 99% is reasonable and also might enable DB2 to keep space maps in the buffer.
If parallel query processing is a large part of the workload, set VPPSEQT and, if
applicable, VPXPSEQT, to a very high value.
Mixed workloads
For a buffer pool used for both query and transaction processing, the value you set
for VPSEQT should depend on the respective priority of the two types of
processing. The higher you set VPSEQT, the better queries tend to perform, at the
expense of transactions. If you are not sure what value to set for VPSEQT, use the
default setting.
Buffer pools that contain LOBs
Put LOB data in buffer pools that are not shared with other data. For both LOG
YES and LOG NO LOBs, use a deferred write threshold (DWQT) of 0. LOBs
specified with LOG NO have their changed pages written at commit time
(force-at-commit processing). If you set DWQT to 0, those writes happen
continuously in the background rather than in a large surge at commit. Dedicating
a single buffer pool to LOB objects is especially efficient in data sharing
environments.
LOBs defined with LOG YES can use deferred write, but by setting DWQT to 0,
you can avoid massive writes at DB2 checkpoints.
Set group buffer pool cast-out thresholds to a low value to reduce the need for a
large group buffer pool for LOB objects.
Determining size and number of buffer pools:
The size and the number of buffer pools that you use in your DB2 subsystem can
significantly affect the performance of that subsystem.
Enabling automatic buffer pool size management:
You can reduce the amount of time that you spend monitoring and adjusting
buffer pools by enabling the DB2 automatic buffer pool size management feature.
About this task
Automatic buffer pool management does not completely replace existing tools to
configure, monitor, and tune buffer pool size. However, when you have initially
sized your buffer pools, DB2 and WLM can "fine tune" the buffer pool size, based
on long term trends and steady state growth. The DISPLAY BUFFERPOOL output
includes an AUTOSIZE attribute. You can enable or disable automatic buffer pool
management at the individual buffer pool level. Automatic buffer pool
management is off by default.
Procedure
To enable automatic buffer pool size management:
Issue an ALTER BUFFERPOOL command and specify the AUTOSIZE(YES) option.
DB2 performs dynamic buffer pool size adjustments that are based on real-time
workload monitoring.
For example, a noncritical DB2 subsystem can automatically reduce its buffer pool
size. By doing so, it frees the storage so that it can be used by a mission-critical
subsystem on the same LPAR, if important transactions fail to meet performance
goals.
When you enable automatic buffer pool management, DB2 reports the buffer pool
size and hit ratio for random reads to the z/OS Workload Manager (WLM)
component. DB2 also automatically increases or decreases buffer pool size, as
appropriate, by up to 25% of the originally allocated size.
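For example, the following command enables automatic size management for a
single buffer pool. (BP3 is a placeholder pool name.)

-ALTER BUFFERPOOL(BP3) AUTOSIZE(YES)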
Choosing buffer pool sizes:
Initially, you set the sizes (in number of pages) of your buffer pools on installation
panels DSNTIP1 and DSNTIP2.
About this task
Because you can use the ALTER BUFFERPOOL command to modify the sizes of
buffer pools, and enable automatic buffer pool management, choosing an exact size
initially is not important.
Buffer pool size guidelines:
DB2 handles large buffer pools very efficiently. Searching in large buffer pools does
not use any more of the processor's resources than searching in smaller pools.
If insufficient real storage exists to back the buffer pool storage, the resulting
paging activity might cause performance degradation. If you see significant paging
activity, increase the amount of real storage or decrease the size of the buffer pools.
Important: Insufficient real storage causes paging and, in extreme situations, might
cause the system to enter a wait state and require an IPL of the system.
Advantages of large buffer pools
In general, larger buffer pool sizes can:
v Result in a higher buffer pool hit ratio, which can reduce the number of I/O
operations. Fewer I/O operations can reduce I/O contention, which can provide
better response time and reduce the processor resources that are needed for I/O
operations.
v Give an opportunity to achieve higher transaction rates with the same response
time. For any given response time, the transaction rate depends greatly on buffer
pool size.
v Prevent I/O contention for the most frequently used disks, particularly the
catalog tables and frequently referenced user tables and indexes. In addition, a
large buffer pool is beneficial when a DB2 sort is used during a query, because
I/O contention on the disks containing the work file table spaces is reduced.
The buffer pool hit ratio:
Buffer pool hit ratio is a measure of how often a page access (a getpage) is satisfied
without requiring an I/O operation.
You can help some of your applications and queries by making the buffer
pools large enough to increase the buffer hit ratio.
Accounting reports, which are application related, show the hit ratio for specific
applications. An accounting trace report shows the ratio for single threads. The
OMEGAMON buffer pool statistics report shows the hit ratio for the subsystem as
a whole. For example, the buffer-pool hit ratio is shown in field A in the
following figure.
TOT4K READ OPERATIONS         QUANTITY  TOT4K WRITE OPERATIONS       QUANTITY
---------------------------   --------  ---------------------------  --------
BPOOL HIT RATIO (%)       A      73.12  BUFFER UPDATES                 220.4K
GETPAGE REQUEST                1869.7K  PAGES WRITTEN                35169.00
GETPAGE REQUEST-SEQUENTIAL     1378.5K  BUFF.UPDATES/PAGES WRITTEN       6.27
GETPAGE REQUEST-RANDOM          491.2K
SYNCHRONOUS READS         B   54187.00  SYNCHRONOUS WRITES               3.00
SYNCHRON. READS-SEQUENTIAL    35994.00  ASYNCHRONOUS WRITES           5084.00
SYNCHRON. READS-RANDOM        18193.00  PAGES WRITTEN PER WRITE I/O      5.78
GETPAGE PER SYN.READ-RANDOM      27.00
                                        HORIZ.DEF.WRITE THRESHOLD        2.00
SEQUENTIAL PREFETCH REQUEST   41800.00  VERTI.DEF.WRITE THRESHOLD        0.00
SEQUENTIAL PREFETCH READS     14473.00  DM THRESHOLD                     0.00
PAGES READ VIA SEQ.PREFETCH C   444.0K  WRITE ENGINE NOT AVAILABLE       0.00
S.PRF.PAGES READ/S.PRF.READ      30.68  PAGE-INS REQUIRED FOR WRITE     45.00
LIST PREFETCH REQUESTS         9046.00
LIST PREFETCH READS            2263.00
PAGES READ VIA LST PREFETCH D  3046.00
L.PRF.PAGES READ/L.PRF.READ       1.35
DYNAMIC PREFETCH REQUESTED     6680.00
DYNAMIC PREFETCH READS          142.00
PAGES READ VIA DYN.PREFETCH E  1333.00
D.PRF.PAGES READ/D.PRF.READ       9.39
PREF.DISABLED-NO BUFFER           0.00
PREF.DISABLED-NO READ ENG         0.00
PAGE-INS REQUIRED FOR READ      460.4K

Figure 2. OMEGAMON database buffer pool statistics (modified)
The buffer hit ratio uses the following formula to determine how many getpage
operations did not require an I/O operation:
Hit ratio = (getpages - pages_read_from_disk) / getpages
In the formula, pages_read_from_disk is the sum of the following fields:
Number of synchronous reads (B)
Number of pages read via sequential prefetch (C)
Number of pages read via list prefetch (D)
Number of pages read via dynamic prefetch (E)
If you have 1000 getpages and 100 pages were read from disk, the equation would
be as follows:
Hit ratio = (1000-100)/1000
The hit ratio in this case is 0.9.
Highest and lowest hit ratios
Highest hit ratio
The highest possible value for the hit ratio is 1.0, which is achieved when
every page requested is always in the buffer pool. Index non-leaf
pages tend to have a very high hit ratio, because they are frequently
re-referenced and thus tend to stay in the buffer pool.
Lowest hit ratio
The lowest hit ratio occurs when the requested page is not in the buffer
pool; in this case, the hit ratio is 0 or less. A negative hit ratio means that
prefetch has brought pages into the buffer pool that are not subsequently
referenced. The pages are not referenced because either the query stops
before it reaches the end of the table space or DB2 must take the pages
away to make room for newer ones before the query can access them.
A low hit ratio is not always bad
While it might seem desirable to make the buffer hit ratio as close to 100% as
possible, do not automatically assume a low buffer-pool hit ratio is bad. The hit
ratio is a relative value, based on the type of application. For example, an
application that browses huge amounts of data using table space scans might very
well have a buffer-pool hit ratio of 0. Watch for cases where the hit ratio drops
significantly for the same application; in those cases, it might be helpful to
investigate further.
Checking for wait times
You can also check OTHER READ I/O WAIT in accounting class 3 to determine
whether requested pages are read in without any wait time or whether wait times
occurred.
Hit ratios for additional processes
The hit ratio measurement becomes less meaningful if the buffer pool is being used
by additional processes, such as work files or utilities. Some utilities and SQL
statements use a special type of getpage request that reserves an empty buffer
without requiring that the page be read from disk.
A getpage is issued for each empty work file page without read I/O during sort
input processing. The hit ratio can be calculated if the work files are isolated in
their own buffer pools. If they are, then the number of getpages used for the hit
ratio formula is divided in half as follows:
Hit ratio = ((getpages / 2) - pages_read_from_disk) / (getpages / 2)
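For example, if a dedicated work file buffer pool shows 1000 getpages and 100
pages read from disk (illustrative numbers), the adjusted hit ratio is:

Hit ratio = ((1000 / 2) - 100) / (1000 / 2) = 0.8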
Allocating buffer pool storage to avoid paging:
DB2 limits the total amount of storage that is allocated for virtual buffer pools to
approximately twice the amount of real storage. However, to avoid paging, it is
strongly recommended that you set the total buffer pool size to less than the real
storage that is available to DB2.
About this task
Paging occurs when the virtual storage requirements for a buffer pool exceed the
real storage capacity for the z/OS image. In this case, the least recently used data
pages in the buffer pool are migrated to auxiliary storage. Subsequent access to
these pages results in a page fault and the page must be brought into real storage
from auxiliary storage. Paging of buffer pool storage can negatively affect DB2
performance. The statistics for PAGE-INS REQUIRED FOR WRITE and PAGE-INS
REQUIRED FOR READ in the OMEGAMON statistics report are useful in
determining if the buffer pool size setting is too large for available real storage.
If the amount of virtual storage that is allocated to buffer pools is more than twice
the amount of real storage, you cannot increase the buffer pool size. DB2 allocates
the minimum buffer pool storage for the BP0, BP8K0, BP16K0, and BP32K buffer
pools as shown in the following table.
Table 8. Buffer pool storage allocation for BP0, BP8K0, BP16K0, and BP32K

Buffer pool page size    Minimum number of pages allocated
4 KB                     2000
8 KB                     1000
16 KB                    500
32 KB                    250
Procedure
To avoid problems with paging:
Set the total buffer pool size to a value that is less than the amount of real storage
that is available to DB2.
Deciding whether to create additional buffer pools:
You can assign all objects of each page size to the corresponding default buffer
pool, or you can create additional buffer pools of each size according to your
specific situation and requirements.
About this task
DB2 creates four default buffer pools, one for each of the four page sizes:
v BP0
v BP8K0
v BP16K0
v BP32K
Using only default buffer pools:
In certain situations, you should use the default buffer pools for each page size.
Procedure
Choose only the default buffer pools for each page size when:
v Your system is already storage constrained.
v You have no one with the application knowledge that is necessary to do more
specialized tuning.
v Your system is a test system.
Using additional buffer pools:
You might be able to improve performance by creating additional buffer pools
beyond the default buffer pools.
Procedure
To take advantage of additional buffer pools, use any of the following methods:
v Isolate data in separate buffer pools to favor certain applications, data, and
indexes.
– You can favor certain data and indexes by assigning more buffers. For
example, you might improve the performance of large buffer pools by putting
indexes into separate pools from data.
– You can customize buffer pool tuning parameters to match the characteristics
of the data. For example, you might want to put tables and indexes that are
updated frequently into a buffer pool with different characteristics from those
that are frequently accessed but infrequently updated.
v Put work files into separate buffer pools. Doing so provides better performance
for sort-intensive queries. Applications that use created temporary tables use
work files for those tables. Keeping work files separate allows you to monitor
temporary table activity more easily.
Results
This process of segregating different activities and data into separate buffer pools
has the advantage of providing good and relatively inexpensive performance
diagnosis data from statistics and accounting traces.
Choosing a page-stealing algorithm:
When DB2 must remove a page from the buffer pool to make room for a newer
page, the action is called stealing the page from the buffer pool.
About this task
By default, DB2 uses a least-recently-used (LRU) algorithm for managing pages in
storage. This algorithm removes pages that have not been recently used and retains
recently used pages in the buffer pool. However, DB2 can use different
page-stealing algorithms to manage buffer pools more efficiently.
Table 9. Types of page-stealing algorithms

Least-recently-used (PGSTEAL(LRU))
Description: Monitors which pages have not been recently used in the pool.
Advantages: This option keeps pages that are being used frequently in the buffer
pool and removes unused pages. This algorithm ensures that the most frequently
accessed pages are always in the buffer pool. Most buffer pools should use this
algorithm.

First-in, first-out (PGSTEAL(FIFO))
Description: Monitors how long a page is in the buffer pool.
Advantages: This option removes the oldest pages no matter how frequently they
are referenced. This simple approach to page stealing results in a small decrease
in the cost of doing a getpage operation, and can reduce internal DB2 latch
contention in environments that require very high concurrency.
Procedure
To specify the page-stealing algorithm:
1. Determine which page-stealing algorithm is the most efficient for the buffer
pool.
v In most cases, keep the default value, which is LRU.
v Specify the FIFO value for buffer pools that have no I/O, or buffer pools that
have table space or index entries that should always remain in memory.
Because no page-stealing is expected in these types of buffer pools, the
additional cost of the LRU algorithm is not required.
2. Issue an ALTER BUFFERPOOL command, and specify the PGSTEAL option.
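For example, the following command selects the FIFO algorithm for a pool whose
objects are expected to stay resident. (BP8 is a placeholder pool name.)

-ALTER BUFFERPOOL(BP8) PGSTEAL(FIFO)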
Related reference
-ALTER BUFFERPOOL (DB2) (DB2 Commands)
Fixing a buffer pool in real storage:
You can use the PGFIX keyword with the ALTER BUFFERPOOL command to fix a
buffer pool in real storage for an extended period of time.
About this task
The PGFIX keyword has the following options:
PGFIX(YES)
The buffer pool is fixed in real storage for the long term. Page buffers are
fixed when they are first used and remain fixed.
PGFIX(NO)
The buffer pool is fixed in real storage only for the duration of an I/O
operation. Page buffers are fixed and unfixed in real storage, allowing for
paging to disk. PGFIX(NO) is the default option.
To prevent PGFIX(YES) buffer pools from exceeding the real storage capacity, DB2
uses an 80% threshold when allocating PGFIX(YES) buffer pools. If the threshold is
exceeded, DB2 overrides the PGFIX(YES) option with PGFIX(NO).
Procedure
To fix a buffer pool in real storage:
Issue an ALTER BUFFERPOOL command and specify PGFIX(YES). You should use
PGFIX(YES) for buffer pools with a high I/O rate, that is, buffer pools with a high
number of pages read or written. For buffer pools with zero I/O, such as some
read-only data or some indexes with a nearly 100% hit ratio, PGFIX(YES) is not
recommended because it does not provide any performance advantage.
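For example, the following command fixes a high-I/O buffer pool in real storage.
(BP1 is a placeholder pool name.)

-ALTER BUFFERPOOL(BP1) PGFIX(YES)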
Designing EDM storage space for performance
The environmental descriptor manager (EDM) pools contain skeleton application
plans and packages, database descriptors, and cached dynamic SQL statements.
You can design them to reduce the number of I/O operations and reduce
processing times.
About this task
You can design your EDM storage pools to avoid allocation I/O (a significant part
of the total number of I/Os for a transaction), reduce the time that is required to
check whether users who attempt to execute a plan are authorized to do so, and
reduce the time that is required to prepare statements with the statement cache
pool.
When pages are needed from the EDM storage pools, any pages that are available
are allocated first. If the available pages do not provide enough space to satisfy the
request, pages are “stolen” from an inactive SKCT, SKPT, DBD, or dynamic SQL
skeleton. If enough space is still not available, an SQL error code is sent to the
application program.
EDM storage pools that are too small cause the following problems:
v Increased I/O activity in DSNDB01.SCT02, DSNDB01.SPT01, and
DSNDB01.DBD01
v Increased response times, due to loading the SKCTs, SKPTs, and DBDs
v Increased processing and response times for full prepares of dynamic SQL
statements when the EDM statement cache is too small
v Fewer threads used concurrently, due to a lack of storage
Procedure
To ensure the best performance from EDM pools:
Design your EDM storage according to the following table.
Table 10. Designing the EDM storage pools

Design...             To contain...
EDM RDS below pool    Part of a plan (CT) or package (PT) that is below the 2-GB bar
EDM RDS above pool    Part of a plan (CT) or package (PT) that is above the 2-GB bar
EDM DBD pool          Database descriptors
EDM statement pool    The cached dynamic SQL statements
EDM skeleton pool     Skeleton copies of plans (SKCTs) and packages (SKPTs)
EDM storage:
The environmental descriptor manager (EDM) pools contain skeleton application
plans and packages, database descriptors, and cached dynamic SQL statements.
EDM storage is composed of the following components, each of which is in a
separate storage area:
EDM RDS pool below
A below-the-bar pool, which contains the part of the cursor tables (CTs)
and package tables (PTs) that must be below the bar.

EDM RDS pool above
An above-the-bar pool that contains the part of the PTs and CTs that can
be above the bar.

EDM DBD pool
An above-the-bar pool that contains database descriptors (DBDs).

EDM statement pool
An above-the-bar pool that contains dynamic cached statements.

EDM skeleton pool
An above-the-bar pool that contains skeleton package tables (SKPTs) and
skeleton cursor tables (SKCTs).
During the installation process, the DSNTINST CLIST calculates the sizes of the
following types of storage:
v EDM pools (above and below)
v the EDM statement cache
v the EDM DBD cache
v the EDM skeleton pool
You can check the calculated sizes on the DSNTIPC installation panel.
For data sharing, you might need to increase the EDM DBD cache storage estimate
to compensate for the need to store multiple concurrent DBD copies. Each member
maintains a separate copy of a DBD that is referenced by multiple members. New
and separate references to the same DBD might result in multiple copies being
loaded while invalidated copies remain, until the threads that use them are either
committed or deallocated.
Because of an internal process that changes the size of plans that are initially bound
in one release and then rebound in a later release, you should carefully monitor the
sizes of the EDM storage pools, and increase their sizes, if necessary.
Related concepts
Estimating storage for the EDM pool (DB2 Installation Guide)
Related reference
CLIST calculations panel 1: DSNTIPC (DB2 Installation Guide)
Controlling EDM storage size:
By following certain recommendations you can control the size of your EDM
storage.
Procedure
To keep EDM storage, and especially the EDM pool, under control:
1. Use more packages. By using multiple packages you can increase the
effectiveness of EDM pool storage management by having smaller objects in the
pool.
2. Use RELEASE(COMMIT) when appropriate. Using the bind option
RELEASE(COMMIT) for infrequently used packages and plans can cause
objects to be removed from the EDM pool sooner.
3. Understand the impact of using DEGREE(ANY). A plan or package that is
bound with DEGREE(ANY) can require 50 to 70% more storage for the CTs and
PTs in the EDM pool than one bound with DEGREE(1). If you change a plan or
package to DEGREE(ANY), check the change in the column AVGSIZE in
SYSPLAN or SYSPACKAGE to determine the increase required.
Measuring the efficiency of EDM pools:
You can use information in the DB2 statistics record to calculate the efficiency of the
EDM skeleton pool, the EDM DBD cache, and the EDM statement cache.
Procedure
To measure the efficiency of the EDM pools:
Gather the following ratios from the OMEGAMON statistics report:
v DBD HIT RATIO (%)
v CT HIT RATIO (%)
v PT HIT RATIO (%)
v STMT HIT RATIO (%)
These ratios for the EDM pool depend upon your location's workload. In most
DB2 subsystems, a value of 80% or more is acceptable. This value means that at
least 80% of the requests were satisfied without I/O.
The number of free pages is shown in FREE PAGES. For pools with stealable
objects, if this value is more than 20% of the number of pages for the
corresponding pools during peak periods, the EDM pool size is probably too large
for that type of pool. In this case, you can reduce its size without affecting the
efficiency ratios significantly.
The below-the-bar and above-the-bar pools should have enough space for the
maximum workload. The number of times that a failure occurred because a
particular pool was full is shown in the values that are recorded in the FAILS DUE
TO type POOL fields. You should increase the size of the corresponding pool if any
of those fields contains a non-zero value.
EDM pools in the statistics report
The DB2 statistics record provides information on the EDM skeleton pool, the
EDM DBD cache, and the EDM statement cache.
The following example shows how OMEGAMON presents information about EDM
pools in the statistics report.
EDM POOL                      QUANTITY
---------------------------   --------
PAGES IN RDS POOL (BELOW)     32768.00
HELD BY CT                        0.00
HELD BY PT                       34.33
FREE PAGES                    32733.67
FAILS DUE TO POOL FULL            0.00

PAGES IN RDS POOL (ABOVE)       524.3K
HELD BY CT                        0.00
HELD BY PT                        0.00
FREE PAGES                      524.3K
FAILS DUE TO RDS POOL FULL        0.00

PAGES IN DBD POOL (ABOVE)       262.1K
HELD BY DBD                      81.00
FREE PAGES                      262.1K
FAILS DUE TO DBD POOL FULL        0.00

PAGES IN STMT POOL (ABOVE)      262.1K
HELD BY STATEMENTS              182.00
FREE PAGES                      262.0K
FAILS DUE TO STMT POOL FULL       0.00

PAGES IN SKEL POOL (ABOVE)    25600.00
HELD BY SKCT                      2.00
HELD BY SKPT                     44.00
FREE PAGES                    25554.00
FAILS DUE TO SKEL POOL FULL       0.00

DBD REQUESTS                    633.2K
DBD NOT FOUND                     0.00
DBD HIT RATIO (%)               100.00
CT REQUESTS                       0.00
CT NOT FOUND                      0.00
CT HIT RATIO (%)                   N/C
PT REQUESTS                     608.7K
PT NOT FOUND                      0.00
PT HIT RATIO (%)                100.00
PKG SEARCH NOT FOUND              0.00
PKG SEARCH NOT FOUND INSERT       0.00
PKG SEARCH NOT FOUND DELETE       0.00
STATEMENTS IN GLOBAL CACHE        48.0
For more information about the statistics report, see the IBM Tivoli OMEGAMON
DB2 Performance Expert on z/OS Report Reference.
Calculating the EDM statement cache hit ratio:
If you have caching turned on for dynamic SQL, the EDM storage statistics
provide information that can help you determine how successful your applications
are at finding statements in the cache.
About this task
PREPARE REQUESTS (A) records the number of requests to search the cache.
FULL PREPARES (B) records the number of times that a statement was inserted
into the cache, which can be interpreted as the number of times a statement was
not found in the cache. To determine how often the dynamic statement was used
from the cache, check the value in GLOBAL CACHE HIT RATIO (C).
Procedure
To calculate the EDM statement cache hit ratio:
Use the following formula:
(PREPARE REQUESTS - FULL PREPARES) / PREPARE REQUESTS = hit ratio
Example
The following figure shows the dynamic SQL statements part of OMEGAMON
statistics report.
DYNAMIC SQL STMT                QUANTITY
---------------------------     --------
PREPARE REQUESTS            A    8305.3K
FULL PREPARES               B       0.00
SHORT PREPARES                   8544.5K
GLOBAL CACHE HIT RATIO (%)  C     100.00

IMPLICIT PREPARES                   0.00
PREPARES AVOIDED                    0.00
CACHE LIMIT EXCEEDED                0.00
PREP STMT PURGED                    0.00
LOCAL CACHE HIT RATIO (%)            N/C

Figure 3. EDM storage usage for dynamic SQL statements in the OMEGAMON statistics report
For more information, see the IBM Tivoli OMEGAMON DB2 Performance Expert on
z/OS Report Reference.
Controlling DBD size for large databases:
When you design your databases, be aware that a very large number of objects in
your database means a larger DBD for that database.
About this task
If a large number of create, alter, and drop operations are performed on objects in
a database with a large DBD, DB2 might encounter more contention on the DBD
among transactions that access different objects, because storage is not
automatically reclaimed in the DBD.
Procedure
To control the size of DBDs for large databases:
v Monitor and manage DBDs to prevent them from becoming too large. Very large
DBDs can reduce concurrency and degrade the performance of SQL operations
that create or alter objects because of increased I/O and logging. DBDs that are
created or altered in DB2 Version 6 or later do not need contiguous storage, but
can use pieces of approximately 32 KB. Older DBDs require contiguous storage.
v When you create, alter, and drop objects in a database, use the MODIFY utility
to reclaim storage in the DBD. Storage is not automatically reclaimed in a DBD
for these operations.
Related reference
MODIFY RECOVERY (DB2 Utilities)
Balancing EDM space fragmentation and performance:
You can control whether DB2 emphasizes space usage or performance for EDM
pools of different sizes.
About this task
For smaller EDM pools, space utilization or fragmentation is normally more critical
than for larger EDM pools. For larger EDM pools, performance is normally more
critical. DB2 emphasizes performance and uses a less optimal EDM storage
allocation when the EDM pool size exceeds 40 MB.
To specify the search algorithm that DB2 uses:
Procedure
Set the keyword EDMBFIT in the DSNTIJUZ job. The EDMBFIT keyword adjusts
the search algorithm on systems with EDM pools that are larger than 40 MB.
Option: Set EDMBFIT to NO
Description: Tells DB2 to use a first-fit algorithm. Doing so is especially
important when high class 24 latch (EDM LRU latch) contention exists. For
example, be sure to set EDMBFIT to NO when class 24 latch contention exceeds
500 contentions per second.

Option: Set EDMBFIT to YES
Description: Tells DB2 to use a better-fit algorithm. Do this when EDMPOOL full
conditions occur for an EDM pool size that exceeds 40 MB.
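As an illustrative fragment only (the surrounding parameters and job statements
are omitted, and the exact layout varies by installation), the keyword appears in
the DSN6SPRM macro invocation within the DSNTIJUZ job:

DSN6SPRM EDMBFIT=NO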
Increasing RID pool size
The RID pool is used for all record identifier (RID) processing. You can improve
the performance of transactions that use the RID pool by specifying a sufficient
size for the RID pool.
About this task
The RID pool is used for enforcing unique keys while updating multiple rows and
for sorting RIDs during the following operations:
v List prefetch, including single index list prefetch
v Access via multiple indexes
v Hybrid joins
The RID Pool Processing section of the OMEGAMON accounting trace record
contains information about whether a transaction used the RID pool.
Procedure
To favor the selection and efficient completion of those access paths:
Increase the maximum RID pool size. If the available RID pool storage is too small,
the statement might revert to a table space scan.
The general formula for computing RID pool size is:
Number of concurrent RID processing activities ×
average number of RIDs × 2 × 5 bytes per RID
v Increasing the RID pool size often results in a significant performance
improvement for queries that use the RID pool. The default RID pool size is 8
MB. You can override this value on installation panel DSNTIPC.
v The RID pool, which all concurrent work shares, is limited to a maximum of
10 000 MB. The RID pool is created at system initialization, but no space is
allocated until RID storage is needed. Space is allocated in 32-KB blocks as
needed, until the maximum size that you specified on installation panel
DSNTIPC is reached.
v A RID pool size of 0 disables those access paths. If you specify a RID pool size
of 0, plans or packages that were previously bound with a non-zero RID pool
size might experience significant performance degradation. Rebind any plans or
packages that include SQL statements that use RID processing.
Results
Whether your SQL statements that use RID processing complete efficiently or not
depends on other concurrent work that uses the RID pool.
Example
Three concurrent RID processing activities, with an average of 4000 RIDs each,
would require 120 KB of storage, because:
3 × 4000 × 2 × 5 = 120 KB
Controlling sort pool size and sort processing
A sort operation is invoked when a cursor is opened for a SELECT statement that
requires sorting.
About this task
The maximum size of the sort work area allocated for each concurrent sort user
depends on the value that you specified for the SORT POOL SIZE field on
installation panel DSNTIPC. The default value is 2 MB.
Estimating the maximum size of the sort pool:
You can change the maximum size of the sort pool by using the installation panels
in UPDATE mode.
Procedure
To determine a rough estimate for the maximum size:
Use the following formula.
32000 × (16 + sort key length + sort data length)
For sort key length and sort data length, use values that represent the maximum
values for the queries you run. To determine these values, refer to the QW0096KL
(key length) and QW0096DL (data length) fields in IFCID 0096, as mapped by
macro DSNDQW01. You can also determine these values from an SQL activity
trace.
Example
If a column is in the ORDER BY clause that is not in the select clause, that column
should be included in the sort data length and the sort key length as shown in the
following example:
SELECT C1, C2, C3
FROM tablex
ORDER BY C1, C4;
If C1, C2, C3, and C4 are each 10 bytes in length, you could estimate the sort pool
size as follows:
32000 × (16 + 20 + (10 + 10 + 10 + 10)) = 2432000 bytes
The values used in the example above include the items in the following table:
Table 11. Values used in the sort pool size example

Attribute                                          Value
Maximum number of sort nodes                       32000
Size (in bytes) of each node                       16
Sort key length (ORDER BY C1, C4)                  20
Sort data length (each column is 10 bytes long)    10+10+10+10
How sort work files are allocated:
The work files that are used in sort are logical work files, which reside in work file
table spaces in your work file database (which is DSNDB07 in a non data-sharing
environment).
The sort begins with the input phase, when ordered sets of rows are written to
work files. At the end of the input phase, when all the rows have been sorted and
inserted into the work files, the work files are merged together, if necessary, into a
single work file that contains the sorted data. The merge phase is skipped if only
one work file exists at the end of the input phase. In some cases, intermediate
merging might be needed if the maximum number of sort work files has been
allocated.
DB2 uses the buffer pool when writing to the logical work file. Only the buffer
pool size limits the number of work files that can be used for sorting.
A sort can complete in the buffer pool without I/Os. This ideal situation might be
unlikely, especially if the amount of data being sorted is large. The sort row size is
actually made up of the columns being sorted (the sort key length) and the
columns that the user selects (the sort data length). Having a very large buffer pool
for sort activity can help you to avoid disk I/Os.
When your application needs to sort data, DB2 tries to allocate each sort work file
on a table space that has no secondary allocation (SECQTY = 0) and is least
recently used. When table spaces with the preferred features are not available,
any available table space is used.
For example, if five logical work files are to be used in the sort, and the installation
has three work file table spaces allocated, then the following table shows which
work file table space would contain each logical work file.
Table 12. How work file table spaces are allocated for logical work files

Logical work file    Work file table space
1                    1
2                    2
3                    3
4                    1
5                    2
Improving the performance of sort processing:
Many factors affect the performance of sort operations, but you can follow certain
recommendations to reduce I/O contention and minimize sort row size.
About this task
The following factors affect the performance of DB2 sort processing:
v Sort pool size
v I/O contention
v Sort row size
v Whether the data is already sorted
For any SQL statement that initiates sort activity, the OMEGAMON SQL activity
reports provide information on the efficiency of the sort that is involved.
Procedure
To minimize the performance impacts of sort processing:
v Increase the size of the sort pool. The larger the sort pool, the more efficient the
sort is.
v Minimize I/O contention on the I/O paths to the physical work files, and make
sure that physical work files are allocated on different I/O paths and packs to
minimize I/O contention. Using disk devices with Parallel Access Volumes
(PAV) support is another way to significantly minimize I/O contention. When
I/Os occur in the merge phase of a sort, DB2 uses sequential prefetch to bring
pages into the buffer pool with a prefetch quantity of eight pages. However, if
the buffer pool is constrained, then DB2 uses a prefetch quantity of four pages
or less, or disables prefetch entirely because of the unavailability of enough
pages.
v Allocate additional physical work files in excess of the defaults, and put those
work files in their own buffer pool.
Segregating work file activity enables you to better monitor and tune sort
performance. It also allows DB2 to handle sorts more efficiently because these
buffers are available only for sort without interference from other DB2 work.
v Increase the amount of available space for work files. Applications that use
created temporary tables use work file space until a COMMIT or ROLLBACK
occurs. (If a cursor is defined WITH HOLD, then the data is held past the
COMMIT.) If sorts are happening concurrently with the temporary table's
existence, then you probably need more space to handle the additional use of
the work files.
Applications that require star join, materialized views, materialized nested table
expressions, non-correlated subqueries or triggers also use work files.
v Write applications to sort only columns that need to be sorted, because sorted
rows appear twice in the sort row size. A smaller sort row size means that more
rows can fit in the sort pool.
v Select VARCHAR columns only when they are required. Varying length columns
are padded to their maximum length for sort row size.
v Set the buffer pool sequential steal threshold (VPSEQT) to 99% unless sparse
index is used to access the work files. The default value, which is 80%, allows
20% of the buffers to go unused. A value of 99% prevents space map pages,
which are randomly accessed, from being overwritten by massive prefetch.
v Increase the buffer pool deferred write threshold (DWQT) or data set deferred
write threshold (VDWQT) values. If DWQT or VDWQT are reached, writes are
scheduled. For a large sort that uses many logical work files, scheduled writes
are difficult to avoid, even if a very large buffer pool is specified. As you
increase the value of VDWQT, watch for buffer shortage conditions and either
increase the work file buffer pool size or reduce VDWQT if buffer shortages
occur.
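As an illustration of these buffer pool recommendations, the following DB2
commands set a 99% sequential steal threshold and raised deferred write
thresholds for a hypothetical work file buffer pool, BP7; the threshold values
are examples to adjust for your workload, not prescribed settings:

-ALTER BUFFERPOOL(BP7) VPSEQT(99)
-ALTER BUFFERPOOL(BP7) DWQT(70) VDWQT(40)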
Managing the opening and closing of data sets
Having the needed data sets open and available for use is important for the
performance of transactions.
However, the number of open data sets affects the amount of available storage,
and the number of open data sets in read-write state affects restart time.
Determining the maximum number of open data sets
DB2 defers closing and de-allocating the table spaces or indexes until the number
of open data sets reaches 99% of the value that you specified for DSMAX.
When DSMAX is reached, DB2 closes 300 data sets or 3% of the value of DSMAX,
whichever number is smaller. Consequently, DSMAX controls not only the limit of
open data sets, but also the number of data sets that are closed when that limit
is reached.
How DB2 determines DSMAX:
DB2 uses a formula to calculate the initial value for DSMAX.
Initially, DB2 calculates DSMAX according to the following formula.
v Let concdb be the number of concurrent databases specified on installation panel
DSNTIPE.
v Let tables be the number of tables per database specified on installation panel
DSNTIPD.
v Let indexes be the number of indexes per table. The installation CLIST sets this
variable to 2.
v Let tblspaces be the number of table spaces per database specified on installation
panel DSNTIPD.
DB2 calculates the number of open data sets with the following formula:
concdb × {(tables × indexes) + tblspaces}
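For illustration only, with hypothetical installation values of 70 concurrent
databases, 40 tables per database, 2 indexes per table, and 50 table spaces per
database, the formula yields:

70 × {(40 × 2) + 50} = 70 × 130 = 9100 open data sets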
Modifying DSMAX:
If you have many partitioned table spaces or LOB table spaces, you might need to
increase DSMAX.
About this task
The formula used by DB2 does not take partitioned or LOB table spaces into
account. Those table spaces can have many data sets. Do not forget to consider the
data sets for nonpartitioned indexes defined on partitioned table spaces with many
partitions and multiple partitioned indexes. If those indexes are defined with a
small PIECESIZE, there could be many data sets. You can modify DSMAX by
updating field DSMAX - MAXIMUM OPEN DATA SETS on installation panel
DSNTIPC.
DSMAX should be larger than the maximum number of data sets that are open
and in use at one time. For the most accurate count of open data sets, refer to the
OPEN/CLOSE ACTIVITY section of the OMEGAMON statistics report. Make sure
the statistics trace was run at a peak period, so that you can obtain the most
accurate maximum figure.
The best indicator of when to increase DSMAX is when the open and close activity
of data sets is high; more than 1 event per second is a general guideline. Refer
to the OPEN/CLOSE value under the SER.TASK SWITCH section of the OMEGAMON
accounting report or NUMBER OF DATASET OPENS in the buffer pool statistics
(which provides statistics for specific buffer pools). Consider increasing
DSMAX when these values show more than 1 event per second.
Procedure
To calculate the total number of data sets (rather than the number that are open
during peak periods), you can do the following:
1. To find the number of simple and segmented table spaces, use the following
query. The calculation assumes that you have one data set for each simple,
segmented, and LOB table space. These catalog queries are included in
DSNTESP in SDSNSAMP. You can use them as input to SPUFI.
SELECT CLOSERULE, COUNT(*)
FROM SYSIBM.SYSTABLESPACE
WHERE PARTITIONS = 0
GROUP BY CLOSERULE;
2. To find the number of data sets for the partitioned table spaces, use the
following query, which returns the number of partitioned table spaces and the
total number of partitions.
SELECT CLOSERULE, COUNT(*), SUM(PARTITIONS)
FROM SYSIBM.SYSTABLESPACE
WHERE PARTITIONS > 0
GROUP BY CLOSERULE;
Partitioned table spaces can require up to 4096 data sets for the data, and a
corresponding number of data sets for each partitioned index.
3. To find the number of data sets required for each nonpartitioned index, use the
following query.
SELECT CLOSERULE, COUNT(*)
FROM SYSIBM.SYSINDEXES T1, SYSIBM.SYSINDEXPART T2
WHERE T1.NAME = T2.IXNAME
AND T1.CREATOR = T2.IXCREATOR
AND T2.PARTITION = 0
GROUP BY CLOSERULE;
The calculation assumes that you have only one data set for each
nonpartitioned index. If you use pieces, adjust accordingly.
4. To find the number of data sets for the partitioned indexes, use the following
query, which returns the number of index partitions.
SELECT CLOSERULE, COUNT(*)
FROM SYSIBM.SYSINDEXES T1, SYSIBM.SYSINDEXPART T2
WHERE T1.NAME = T2.IXNAME
AND T1.CREATOR = T2.IXCREATOR
AND T2.PARTITION > 0
GROUP BY CLOSERULE;
You have one data set for each index partition.
5. To find the total number of data sets, add the numbers that result from the four
queries. (For Query 2, use the sum of the partitions that was obtained.)
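For illustration only, suppose that the queries return the following
hypothetical counts: 500 simple and segmented table spaces from query 1, a total
of 400 partitions from query 2, 800 nonpartitioned indexes from query 3, and 400
index partitions from query 4. The total is then 500 + 400 + 800 + 400 = 2100
data sets.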
Related tasks
“Switching to read-only for infrequently updated and infrequently accessed page
sets” on page 63
Related reference
Active log data set parameters: DSNTIPL (DB2 Installation Guide)
RO SWITCH CHKPTS field (PCLOSEN subsystem parameter) (DB2 Installation
Guide)
RO SWITCH TIME field (PCLOSET subsystem parameter) (DB2 Installation
Guide)
Recommendations for DSMAX:
As with many recommendations in DB2, you must weigh the cost of performance
versus availability when choosing a value for DSMAX.
Consider the following factors:
v For best performance, you should leave enough margin in your specification of
DSMAX so that frequently used data sets can remain open after they are no
longer referenced. If data sets are opened and closed frequently, such as every
few seconds, you can improve performance by increasing DSMAX.
v The number of open data sets on your subsystem that are in read/write state
affects checkpoint costs and log volumes. To control how long data sets stay
open in a read/write state, specify values for the RO SWITCH CHKPTS and RO
SWITCH TIME fields of installation panel DSNTIPL.
v Consider segmented table spaces to reduce the number of data sets.
To reduce open and close activity, you can try reducing the number of data sets
by combining tables into segmented table spaces. This approach is most useful
for development or end-user systems that include a lot of smaller tables that can
be combined into single table spaces.
Understanding the CLOSE YES and CLOSE NO options
The CLOSE value for a table space or index affects the process of closing an
object's data sets and how DB2 manages data set closing. Page set refers to a table
space or index.
The process of closing:
DB2 dynamically manages page sets using two levels of page set closure—logical
close and physical close.
Logical close
This occurs when the application has been deallocated from that page set. This is
at either commit or deallocation time, depending on the RELEASE(COMMIT/
DEALLOCATE) option of the BIND command, and is driven by the use count.
When a page set is logically closed, the page set use count is decremented. When
the page set use count is zero, the page set is considered not in use; this makes it a
candidate for physical close.
Physical close
This happens when DB2 closes and deallocates the data sets for the page set.
When the data sets are closed:
The number of open data sets determines when DB2 must close data sets. When
DB2 closes data sets, all data sets for a particular table space, index, or partition
are closed.
The value you specify for CLOSE determines the order in which page sets that are
not in use are closed. When the open data set count becomes greater than 99% of
DSMAX, DB2 first closes page sets defined with CLOSE YES. The least recently
used page sets are closed first.
If the number of open data sets cannot be limited by closing page sets or partitions
defined with CLOSE YES, DB2 must close page sets or partitions defined with
CLOSE NO. The least recently used CLOSE NO data sets are closed first.
Delaying the physical closure of page sets or partitions until necessary is called
deferred close. Deferred closing of a page set or partition that is no longer being
used means that another application or user can access the table space and employ
the accompanying indexes without DB2 reopening the data sets. Thus, deferred
closing of page sets or partitions can improve your applications' performance by
avoiding I/O processing.
Recommendation: For a table space whose data is continually referenced, in most
cases it does not matter whether it is defined with CLOSE YES or CLOSE NO; the
data sets remain open. This is also true, but less so, for a table space whose data is
not referenced for short periods of time; because DB2 uses deferred close to
manage data sets, the data sets are likely to be open when they are used again.
You might find CLOSE NO appropriate for page sets that contain data that you use
infrequently but that is so performance-critical that you cannot afford the
delay of opening the data sets.
If the number of open data sets is a concern, choose CLOSE YES for page sets with
many partitions or data sets.
Switching to read-only for infrequently updated and infrequently
accessed page sets
By converting infrequently used page sets from read-write to read-only, you can
improve performance and data recovery by minimizing logging activity and
reducing the number of log records.
About this task
For both CLOSE YES and CLOSE NO page sets, SYSLGRNX entries are updated
when the page set is converted from read-write state to read-only state. When this
conversion occurs for table spaces, the SYSLGRNX entry is closed and any updated
pages are externalized to disk. For indexes defined as COPY NO, no SYSLGRNX
entries occur, but the updated pages are externalized to disk.
By converting infrequently used page sets from read-write to read-only state, you
can achieve the following performance benefits:
v Improved data recovery performance because SYSLGRNX entries are more
precise, closer to the last update transaction commit point. As a result, the
RECOVER utility has fewer log records to process.
v Minimized logging activities. Log records for page set open, checkpoint, and
close operations are only written for updated page sets or partitions. Log records
are not written for read-only page sets or partitions.
Procedure
To specify when unused page sets or partitions are converted to read-only:
Specify the values for the RO SWITCH CHKPTS and RO SWITCH TIME fields of
the DSNTIPL installation panel.
RO SWITCH CHKPTS (PCLOSEN subsystem parameter)
The number of consecutive DB2 checkpoints since a page set or partition
was last updated.
RO SWITCH TIME (PCLOSET subsystem parameter)
The amount of elapsed time since a page set or partition was last updated.
Use the following recommendations to determine how to set these values:
v In most cases, the default values are adequate. However, if you find that the
amount of R/O switching is causing a performance problem for the updates to
SYSLGRNX, consider increasing the value of RO SWITCH TIME.
v For larger buffer pools, you can increase the values of RO SWITCH CHKPTS
and RO SWITCH TIME to bypass delays caused by exhaustive buffer pool scans
for registering and checking the validity of pages when entering and leaving
GBP-dependent states.
v For table spaces that are defined with the NOT LOGGED option, the values for
RO SWITCH CHKPTS and RO SWITCH TIME are set to the recommended
value of 1. Changing these values is not recommended. All read-write table
spaces that are defined with the NOT LOGGED option and not in use are
converted to read-only whenever a DB2 checkpoint occurs. If a checkpoint does
not occur, the not logged table spaces are converted to read-only one minute
after the commit of the last update. DB2 writes the table space from the buffer
pool to external media when it converts the table space from read-write to
read-only, externalizing any unprotected modifications to the data.
Related reference
Active log data set parameters: DSNTIPL (DB2 Installation Guide)
RO SWITCH CHKPTS field (PCLOSEN subsystem parameter) (DB2 Installation
Guide)
RO SWITCH TIME field (PCLOSET subsystem parameter) (DB2 Installation
Guide)
Improving disk storage
You can configure your storage devices and disk space to ensure better
performance from DB2.
Selecting and configuring storage devices
Whether you use newer storage servers, such as the IBM TotalStorage DS8000®, or
older storage device types affects the performance of your DB2 subsystem.
Selecting storage devices
Some storage device types are better suited for certain types of applications.
Procedure
To choose storage devices types:
Consider the following hardware characteristics that affect performance.
v The size of the cache
v The number of channels and type of channels that are attached and online to a
group of logical volumes, including high performance FICON
v The size of non-volatile storage (NVS), if deferred write performance is a
problem
v Disk arrays
v Advanced features such as Parallel Access Volumes (PAV), HyperPAV, Multiple
Allegiance, and FlashCopy®
v Fast remote replication techniques
Storage servers
An I/O subsystem typically consists of many storage disks, which are housed in
storage servers such as the IBM TotalStorage DS8000.
Storage servers provide increased functionality and performance over that of “Just
a Bunch of Disks” technology.
Cache is one of the additional functions. Cache acts as a secondary buffer as data is
moved between real storage and disk. Storing the same data in processor storage
and the cache is not useful. To be useful, the cache must be significantly larger
than the buffers in real storage, store different data, or provide another
performance advantage. TotalStorage and many other new storage servers use
large caches and always pre-stage the data in the cache. You do not need to
actively manage the cache in the newer storage servers as you must do with older
storage device types.
With IBM TotalStorage and other new storage servers, disk performance does not
generally affect sequential I/O performance. The measure of disk speed in terms of
RPM (revolutions per minute) is relevant only if the cache hit ratio is low and the
I/O rate is very high. If the I/O rate per disk is proportional to the disk size, small
disks perform better than large disks. Large disks are very efficient for storing
infrequently accessed data. As with cache, spreading the data across more disks is
always better.
Remote replication is a significant factor in storage performance. When I/O
performance problems occur, especially when remote replication is involved,
investigate the storage systems and communications network before looking for
problems with the host database system.
Storage servers and advanced features:
IBM TotalStorage offers many advanced features to further boost performance.
Other storage servers might offer similar functions.
Extended address volumes (EAV):
Chapter 6. Configuring storage for performance
65
With extended address volumes (EAV), you can store more data in VSAM data sets
on a single volume than you can store on non-extended address volumes.
The maximum amount of data that you can store in a single DB2 table space or
index space is the same for extended and non-extended address volumes. The
same DB2 data sets might use more space on extended address volumes than on
non-extended address volumes because space allocations in the extended area are
multiples of 21 cylinders on extended address volumes.
Parallel Access Volumes (PAV):
The parallel access volumes feature allows multiple concurrent I/Os on a given
device when the I/O requests originate from the same system.
Parallel access volumes (PAV) make storing multiple partitions on the same
volume with almost no loss of performance possible. In older disk subsystems, if
more than one partition is placed on the same volume (intentionally or otherwise),
attempts to read the partitions result in contention, which shows up as I/O
subsystem queue (IOSQ) time. Without PAVs, poor placement of a single data set can
almost double the elapsed time of a parallel query.
Multiple allegiance:
The multiple allegiance feature allows multiple active concurrent I/Os on a single
device when the I/O requests originate from different systems.
Together, parallel access volumes (PAVs) and multiple allegiance dramatically
improve I/O performance for parallel work on the same volume by nearly
eliminating I/O subsystem queue or PEND time and drastically lowering elapsed
time for transactions and queries.
FlashCopy:
The FlashCopy feature provides for fast copying of full volumes.
After an initialization period is complete, the logical copy is considered complete
but the physical movement of the data is deferred.
Peer-to-Peer Remote Copy (PPRC):
PPRC and PPRC XD (Extended Distance) provide a faster method for recovering DB2
subsystems at a remote site in the event of a disaster at the local site.
Related tasks
Backing up with RVA storage control or Enterprise Storage Server (DB2
Administration Guide)
Older storage device types:
Unlike newer storage servers, older devices have much smaller caches.
The small caches require some user management. You can use DFSMS to provide
dynamic management of the cache storage.
Among the processes that can cause performance to suffer on such devices are sort
work files. Sort work files can have a large number of concurrent processes that
can overload a storage controller with a small cache and thereby degrade system
performance. For example, one large sort could use 100 sequential files, needing 60
MB of storage. Unless the cache sizes are large, you might need to specify BYPASS
on installation panel DSNTIPE or use DFSMS controls to prevent the use of the
cache during sort processing. Separate units for sort work can give better
performance.
Using disk space effectively
How you allocate and manage data sets, compress your data, and design your
indexes can affect the performance of DB2.
Procedure
To use disk space more efficiently:
v Change your allocation of data sets to keep data sets within primary allocations.
v Manage them with the Hierarchical Storage Management functional component
(DFSMShsm) of DFSMS.
v Compress your data.
v Choose a page size that gives you good disk use and I/O performance
characteristics.
v Evaluate the need for and characteristics of your indexes.
Example
To manage the use of disk, you can use RMF™ to monitor how your devices are
used. Watch for usage rates that are higher than 30% to 35%, and for disk devices
with high activity rates. Log devices can have more than 50% utilization without
performance problems.
Related concepts
Recommendations for page size (DB2 Administration Guide)
Related information
Managing DB2 data sets with DFSMShsm (DB2 Administration Guide)
Allocating and extending data sets
Primary and secondary allocation sizes are the main factors that affect the amount
of disk space that DB2 uses.
In general, the primary allocation must be large enough to handle the storage
needs that you anticipate. The secondary allocation must be large enough for your
applications to continue operating until the data set is reorganized.
If the secondary allocation space is too small, the data set might have to be
extended more times to satisfy those activities that need a large space.
IFCID 0258 allows you to monitor data set extension activities by providing
information, such as the primary allocation quantity, maximum data set size, high
allocated space before and after extension activity, number of extents before and
after the extend, maximum volumes of a VSAM data set, and number of volumes
before and after the extend. Access IFCID 0258 in Statistics Class 3 (SC03) through
an IFI READA request.
Related information
Creating storage groups and managing DB2 data sets (DB2 Administration
Guide)
Planning the placement of DB2 data sets:
To improve performance, plan the placement of DB2 data sets carefully.
Concentrate mainly on data sets for system files (especially the active logs), for the
DB2 catalog and directory, and for user data and indexes. The objective is to
balance I/O activity between different volumes, control units, and channels. Doing
so minimizes the I/O elapsed time and I/O queuing.
Identifying crucial DB2 data sets:
When you place your data sets, you need to first consider which data sets are
crucial for DB2 to function properly.
Procedure
To gather this information:
Use the I/O reports from the DB2 performance trace. If these reports are not
available, consider the following data sets to be most important:
For transactions
v DSNDB01.SCT02 and its index
v DSNDB01.SPT01 and its index
v DSNDB01.DBD01
v DSNDB06.SYSPLAN table space and indexes on SYSPLANAUTH table
v DSNDB06.SYSPKAGE
v Active and archive logs
v Most frequently used user table spaces and indexes
For queries
v DSNDB01.DBD01
v DSNDB06.SYSPLAN table space and indexes on SYSPLANAUTH
v DSNDB06.SYSPKAGE
v DSNDB06.SYSDBASE table space and its indexes
v DSNDB06.SYSVIEWS table space and the index on SYSVTREE
v Work file table spaces
v DB2 QMF system table data sets
v Most frequently used user table spaces and indexes
These lists do not include other data sets that are less crucial to DB2 performance,
such as those that contain program libraries, control blocks, and formats. Those
types of data sets have their own design recommendations.
Estimating concurrent I/O requests:
The number of concurrent I/O requests is important when you calculate the
number of data paths for your DB2 subsystem.
About this task
DB2 has a multi-tasking structure in which each user's request runs under a
different task control block (TCB). In addition, the DB2 system itself has its own
TCBs and SRBs for logging and database writes.
Procedure
To estimate the maximum number of concurrent I/O requests when your system is
loaded with data:
Use the following formula:
MAX USERS + 600 prefetches + 900 asynchronous writes
Changing catalog and directory size and location:
You can change the size or location of your DB2 catalog or directory.
Procedure
To change the size or location of DB2 catalog or directory data sets:
Choose one of the following actions:
v Run the RECOVER utility on the appropriate database
v Run the REORG utility on the appropriate table space
What to do next
A hierarchy of recovery dependencies determines the order in which you should
try to recover data sets.
Related concepts
Chapter 17, “Using tools to monitor performance,” on page 419
Related reference
RECOVER (DB2 Utilities)
Formatting early and speed-up formatting
You can improve the performance of applications that use heavy insert processing
by allocating space so that cylinders are used as the allocation amount, and by
pre-formatting a table space before inserting data.
Allocating space in cylinders or in large primary and secondary
quantities
Specify your space allocation amounts to ensure allocation by CYLINDER. If you
use record allocation for more than a cylinder, cylinder allocation is used. Cylinder
allocation can reduce the time required to do SQL mass inserts and to perform
LOGONLY recovery; it does not affect the time required to recover a table space
from an image copy or to run the REBUILD utility.
When inserting records, DB2 pre-formats space within a page set as needed. The
allocation amount, which is either CYLINDER or TRACK, determines the amount
of space that is pre-formatted at any one time.
Because less space is pre-formatted at one time for the TRACK allocation amount,
a mass insert can take longer when the allocation amount is TRACK than the same
insert when the allocation amount is CYLINDER. However, smart secondary space
allocation minimizes the difference between TRACK and CYLINDER.
The allocation amount is dependent on device type and the number of bytes you
specify for PRIQTY and SECQTY when you define table spaces and indexes. The
default SECQTY is 10% of the PRIQTY, or 3 times the page size, whichever is
larger. This default quantity is an efficient use of storage allocation. Choosing a
SECQTY value that is too small in relation to the PRIQTY value results in track
allocation.
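As an illustration only, on a 3390 device one cylinder holds 720 KB of 4-KB
pages, so PRIQTY and SECQTY values of at least 720 result in cylinder
allocation; the database, table space, and storage group names below are
hypothetical:

CREATE TABLESPACE TS1 IN DB1
  USING STOGROUP SG1
  PRIQTY 7200    -- 10 cylinders on a 3390 (720 KB per cylinder)
  SECQTY 720;    -- 1 cylinder, large enough to avoid track allocation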
Pre-formatting during LOAD or REORG
When DB2 pre-formatting delays impact the performance or execution time
consistency of applications that do heavy insert processing, and if the table size
can be predicted for a business processing cycle, consider using the PREFORMAT
option of LOAD and REORG. If you preformat during LOAD or REORG, DB2
does not have to preformat new pages during execution. When the pre-formatted
space is used and when DB2 has to extend the table space, normal data set
extending and pre-formatting occurs.
Consider pre-formatting only if pre-formatting is causing a measurable delay with
the insert processing or causing inconsistent elapsed times for insert applications.
Recommendation: Quantify the results of pre-formatting in your environment by
assessing the performance both before and after using pre-formatting.
Related concepts
Secondary space allocation (DB2 Administration Guide)
Improving performance with LOAD or REORG PREFORMAT (DB2 Utilities)
Avoiding excessively small extents
Data set extent size affects performance because excessively small extents can
degrade performance during a sequential database scan.
Example
Suppose that the sequential data transfer speed is 100 MB per second and that the
extent size is 10 MB. The sequential scan must move to a new extent ten times per
second.
Recommendation: Maintain extent sizes that are large enough to avoid excessively
frequent extent moving during scans. Because as many as 16 cylinders can be
pre-formatted at the same time, keep the extent size greater than 16 cylinders for
large data sets.
Maximum number of extents
An SMS-managed linear data set is limited to 123 extents on a volume and 7257
total extents on all volumes. A non-SMS-managed data set is limited to 123 extents
on a volume and 251 total extents on all volumes. If a data set grows and extents
are not monitored, jobs eventually fail due to these extent limitations.
Recommendation: Monitor the number of extents to avoid reaching the maximum
number of extents on a volume and the maximum number of extents on all
volumes.
Specifying primary quantity for nonpartitioned indexes
Specifying sufficient primary and secondary allocations for frequently used data
sets minimizes I/O time, because the data is not located at different places on the
disks.
Listing the catalog or VTOC occasionally to determine the number of secondary
allocations that have been made for your more frequently used data sets can also
be helpful. Alternatively, you can use IFCID 0258 in the statistics class 3 trace and
real time statistics to monitor data set extensions. OMEGAMON monitors IFCID
0258.
To prevent wasted space for non-partitioned indexes, take one of the
following actions:
v Let DB2 use the default primary quantity and calculate the secondary quantities.
Do this by specifying 0 for the IXQTY subsystem parameter, and by omitting a
PRIQTY and SECQTY value in the CREATE INDEX statement or ALTER INDEX
statement. If a primary and secondary quantity were previously specified for an
index, you can specify PRIQTY -1 and SECQTY -1 to change to the default
primary quantity and calculated secondary quantity.
v If the MGEXTSZ subsystem parameter is set to NO, so that you control
secondary space allocations, make sure that the value of PRIQTY + (N ×
SECQTY) is a value that evenly divides into PIECESIZE.
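For example, the following hypothetical statement switches an existing index
back to the default primary quantity and a calculated secondary quantity; the
index name is illustrative:

ALTER INDEX USER1.IX1 PRIQTY -1 SECQTY -1;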
Reserving free space
By reserving free space, you can maintain the physical clustering of data and
reduce the need to frequently reorganize table spaces and indexes. However, you
do not want to allocate too much disk space.
About this task
You can use PCTFREE and FREEPAGE clauses to reserve free space in
table spaces and indexes.
The decision of whether to specify the percentage of free space per page or the
number of free pages depends on the type of SQL and the distribution of that
activity across the table space or index.
When free space is needed, using PCTFREE is recommended rather than
FREEPAGE in most situations.
v If update activity on compressed data, which often results in longer rows, is
heavy or insert volume is heavy, use a PCTFREE value greater than the default.
v For concurrency, use MAXROWS or larger PCTFREE values for small tables and
shared table spaces that use page locking. This reduces the number of rows per
page, thus reducing the frequency that any given page is accessed.
In some situations, using FREEPAGE is recommended over PCTFREE:
v Use FREEPAGE rather than PCTFREE if MAXROWS is 1 or rows are larger than
half a page because DB2 cannot insert a second row on a page.
v For the DB2 catalog table spaces and indexes, use the default values for
PCTFREE. If additional free space is needed, use FREEPAGE.
The table spaces and indexes for the DB2 catalog can also be altered to modify
FREEPAGE and PCTFREE. These options are not applicable for LOB table spaces.
When to reserve free space:
By reserving free space, you can maintain the physical clustering of data and
reduce the need to frequently reorganize table spaces and indexes.
However, you do not want to allocate too much disk space. When deciding
whether to allocate free space, consider the data and each index separately and
assess the insert and update activity on the data and indexes.
When you specify a sufficient amount of free space, advantages during normal
processing include:
v Better clustering of rows (giving faster access)
v Fewer overflows
v Less frequent reorganizations needed
v Less information locked by a page lock
v Fewer index page splits
Disadvantages of specifying free space include:
v More disk space occupied
v Less information transferred per I/O
v More pages to scan
v Possibly more index levels
v Less efficient use of buffer pools and storage controller cache
When free space is not needed
You do not need to specify free space for tables and indexes in the following situations:
v The object is read-only.
If you do not plan to insert or update data in a table, no free space is needed for
either the table or its indexes.
v The object is not read-only, but inserts are at the end, and updates that lengthen
varying-length columns are few.
For example, if inserts are in ascending order by key of the clustering index or are
caused by LOAD RESUME SHRLEVEL NONE and update activity is only on
fixed-length columns with non-compressed data, the free space for both the table
and clustering index should be zero.
Generally, free space is beneficial for a non-clustering index because inserts are
usually random. However, if the non-clustering index contains a column with a
timestamp value that causes the inserts into the index to be in sequence, the free
space should be zero.
Specifying free space on pages:
The PCTFREE clause specifies what percentage of each page in a table space or
index is left free when loading or reorganizing the data.
About this task
DB2 uses the free space later when you insert or update your data. When
no free space is available, DB2 holds your additional data on another page. When
several records are physically located out of sequence, performance suffers. If you
have previously used a large PCTFREE value to force one row per page, you
should specify a MAXROWS clause instead.
Procedure
To specify the percentage of free space on a page:
v To determine the amount of free space currently on a page, run the RUNSTATS
utility, and examine the PERCACTIVE column of SYSIBM.SYSTABLEPART.
v Specify the value of PCTFREE for the table space or index. For example, the
default value for PCTFREE is 5, meaning that 5% of every page is kept free
when you load or reorganize the data. You can include the PCTFREE clause for
a number of statements.
– ALTER INDEX
– ALTER TABLESPACE
– CREATE INDEX
– CREATE TABLESPACE
The default value of PCTFREE for indexes is 10. The maximum amount of space
that is left free in index nonleaf pages is 10%, even if you specify a value higher
than 10 for PCTFREE.
v If you previously used a large PCTFREE value to force one row per page,
specify a MAXROWS clause with a value of 1 on the CREATE or ALTER
TABLESPACE statement instead. The MAXROWS clause has the advantage of
maintaining the free space even when new data is inserted.
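For example, the following hypothetical statements reserve 10% free space on
each page at the next LOAD or REORG; the object names are illustrative:

ALTER TABLESPACE DB1.TS1 PCTFREE 10;
ALTER INDEX USER1.IX1 PCTFREE 10;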
Specifying single-row pages:
If you have previously used a large PCTFREE value to force one row per page,
you should specify a MAXROWS clause instead.
About this task
The MAXROWS clause has the advantage of maintaining the free space even when
new data is inserted.
Procedure
To specify single-row pages:
Include a MAXROWS clause with a value of 1 on the CREATE or ALTER
TABLESPACE statement.
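For example, this hypothetical statement limits each page to a single row; the
table space name is illustrative:

ALTER TABLESPACE DB1.TS1 MAXROWS 1;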
What to do next
If more free space is needed, use FREEPAGE rather than PCTFREE if MAXROWS
is 1.
Specifying the ratio of free pages:
The FREEPAGE clause specifies how often DB2 leaves a full page of free space
when loading data or when reorganizing data or indexes.
About this task
DB2 uses the free space that you allocated later when you insert or update your
data.
For example, if you specify 10 for the value of the FREEPAGE clause, DB2 leaves
every 10th page free.
Procedure
Specify the value of FREEPAGE for the table space or index. You can include the
FREEPAGE clause for the following statements:
v ALTER INDEX
v ALTER TABLESPACE
v CREATE INDEX
v CREATE TABLESPACE
The maximum value you can specify for FREEPAGE is 255; however, in a
segmented table space, the maximum value is 1 less than the number of pages
specified for SEGSIZE.
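For example, the following hypothetical statement leaves a full page of free
space after every 15 pages at the next LOAD or REORG; the object name and value
are illustrative:

ALTER TABLESPACE DB1.TS1 FREEPAGE 15;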
Compressing your data
By compressing your data, you can reduce the amount of disk space that it uses.
Before you begin
You can use the DSN1COMP utility to determine how well compression of your
data will work. Data in a LOB table space or a table space that is defined in the
work file database (the table space for declared temporary tables) cannot be
compressed.
About this task
When you compress data, bit strings that occur frequently are replaced by shorter
strings. Information about the mapping of bit strings to their replacements is
stored in a compression dictionary. Computer processing is required to compress
data before it is stored and to decompress the data when it is retrieved from
storage. In many cases, using the COMPRESS clause can significantly reduce the
amount of disk space needed to store data, but the compression ratio that you
achieve depends on the characteristics of your data.
With compressed data, you might see some of the following performance benefits,
depending on the SQL work load and the amount of compression:
v Higher buffer pool hit ratios
v Fewer I/Os
v Fewer getpage operations
Procedure
To compress data:
1. Specify COMPRESS YES in the appropriate SQL statement:
v CREATE TABLESPACE
v ALTER TABLESPACE
2. Populate the table space with data by taking one of the following actions:
v Run the LOAD utility with REPLACE, RESUME NO, or RESUME YES (if the
table space contains no rows) and without KEEPDICTIONARY.
v Run the REORG utility without KEEPDICTIONARY.
If no compression dictionary already exists, and the amount of data in the
table space reaches a threshold determined by DB2, a compression dictionary
is created. After the compression dictionary is built, DB2 uses it to compress all
subsequent data added to the table space.
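For example, the following hypothetical statement enables compression for a
table space; the object name is illustrative:

ALTER TABLESPACE DB1.TS1 COMPRESS YES;
-- Then run the REORG utility without KEEPDICTIONARY so that DB2
-- builds the compression dictionary and compresses the existing rows.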
Deciding whether to compress data:
Consider the following factors before you decide whether to compress data:
Data row size
DB2 compresses the data of one record at a time. (The prefix of the record is not
compressed.) As row lengths become shorter, compression yields diminishing
returns because 8 bytes of overhead are required to store each record in a data
page. On the other hand, when row lengths are very long, compression of the data
portion of the row might yield little or no reduction in data set size because DB2
rows cannot span data pages. In the case of very long rows, using a larger page
size can enhance the benefits of compression, especially if the data is accessed
primarily in a sequential mode.
If compressing the record produces a result that is no shorter than the original,
DB2 does not compress the record.
Table space size
Compression can work very well for large table spaces. With small table spaces,
the size of the compression dictionary (64 KB) can offset the space savings that
compression provides.
Processing costs
Decompressing a row of data costs significantly less than compressing that same
row.
The data access path that DB2 uses affects the processor cost for data compression.
In general, the relative overhead of compression is higher for table space scans and
lower for index access.
I/O costs
When rows are accessed sequentially, fewer I/Os might be required to access data
that is stored in a compressed table space. However, the reduced I/O resource
consumption is traded for extra processor cost for decoding the data.
v If random I/O is necessary to access the data, the number of I/Os does not
decrease significantly, unless the associated buffer pool is larger than the table
and the other applications require little concurrent buffer pool usage.
v Some types of data compress better than others. Data that contains hexadecimal
characters or strings that occur with high frequency compresses quite well, while
data that contains random byte frequencies might not compress at all. For
example, textual and decimal data tends to compress well because certain byte
strings occur frequently.
Data patterns
The frequency of patterns in the data determines the compression savings. Data
with many repeated strings (such as state and city names or numbers with
sequences of zeros) results in good compression savings.
Table space design
Each table space or partition that contains compressed data has a compression
dictionary. The compression dictionary is built when one of the following
operations happens:
v REORG utility without KEEPDICTIONARY
v LOAD utility with REPLACE, RESUME NO, or RESUME YES (if the table space
contains no rows), and without KEEPDICTIONARY.
The dictionary contains a fixed number of entries, usually 4096, and resides with
the data. The dictionary content is based on the data at the time it was built, and
does not change unless the dictionary is rebuilt or recovered, or compression is
disabled with ALTER TABLESPACE.
If you use the REORG utility to build the compression dictionary, DB2 uses a
sampling technique to build the dictionary. This technique uses the first n rows
from the table space and then continues to sample rows for the remainder of the
UNLOAD phase. The value of n is determined by how much your data can be
compressed. In most cases, this sampling technique produces a better dictionary
and might produce better results for table spaces that contain tables with dissimilar
kinds of data.
Otherwise, DB2 uses only the first n rows added to the table space to build the
contents of the dictionary.
If you have a table space that contains more than one table, and the data used to
build the dictionary comes from only one or a few of those tables, the data
compression might not be optimal for the remaining tables. Therefore, put a table
that you want to compress into a table space by itself, or into a table space that
only contains tables with similar kinds of data.
Existing exit routines
An exit routine is executed before compression or after decompression, so you can
use DB2 data compression with your existing exit routines. However, do not use
DB2 data compression in conjunction with DSN8HUFF. (DSN8HUFF is a sample
edit routine, provided with DB2, that compresses data by using the Huffman
algorithm.) Using DSN8HUFF adds little additional compression at the cost of
significant extra CPU processing.
Logging effects
If a data row is compressed, all data that is logged because of SQL changes to that
data is compressed. Thus, you can expect less logging for insertions and deletions;
the amount of logging for updates varies. Applications that are sensitive to
log-related resources can experience some benefit with compressed data.
External routines that read the DB2 log cannot interpret compressed data without
access to the compression dictionary that was in effect when the data was
compressed. However, using IFCID 306, you can cause DB2 to write log records of
compressed data in decompressed format. You can retrieve those decompressed
records by using the IFI function READS.
Distributed data
DB2 decompresses data before transmitting it to VTAM.
Related concepts
Compressing data (DB2 Utilities)
Increasing free space for compressed data:
You can provide free space to avoid the potential problem of more getpage and
lock requests for compressed data.
About this task
In some cases, using compressed data results in an increase in the number of
getpages, lock requests, and synchronous read I/Os. Sometimes, updated
compressed rows cannot fit in the home page, and they must be stored in the
overflow page. This can cause additional getpage and lock requests. If a page
contains compressed fixed-length rows with no free space, an updated row
probably has to be stored in the overflow page.
Procedure
To avoid the potential problem of more getpage and lock requests:
Add more free space within the page. Start with 10% additional free space and
adjust further, as needed. If, for example, 10% free space was used without
compression, start with 20% free space with compression for most cases. This
recommendation is especially important for data that is heavily updated.
Determining the effectiveness of compression:
Before compressing data, you can use the DSN1COMP stand-alone utility to
estimate how well the data can be compressed.
About this task
After data is compressed, you can use compression reports and catalog statistics to
determine how effectively it was compressed.
Procedure
To find the effectiveness of data compression:
v Use the DSN1COMP stand-alone utility to find out how much space can be
saved and how much processing the compression of your data requires. Run
DSN1COMP on a data set that contains a table space, a table space partition, or
an image copy. DSN1COMP generates a report of compression statistics but does
not compress the data.
v Examine the compression reports after you use REORG or LOAD to build the
compression dictionary and compress the data. Both utilities issue a report
message (DSNU234I or DSNU244I). The report message gives information about
how well the data is compressed and how much space is saved. (REORG with
the KEEPDICTIONARY option does not produce the report.)
v Query catalog tables to find information about data compression:
– The PAGESAVE column of SYSIBM.SYSTABLEPART tells you the percentage
of pages that are saved by compressing the data.
– The PCTROWCOMP columns of SYSIBM.SYSTABLES and SYSIBM.SYSTABSTATS
tell you the percentage of the rows that were compressed in the table or
partition the last time RUNSTATS was run. Use the RUNSTATS utility to
update these catalog columns.
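For example, the following query, which assumes a hypothetical database name,
reports the percentage of pages saved for each table space partition in that
database:

SELECT DBNAME, TSNAME, PARTITION, PAGESAVE
FROM SYSIBM.SYSTABLEPART
WHERE DBNAME = 'DB1'
ORDER BY PAGESAVE DESC;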
Related reference
DSN1COMP (DB2 Utilities)
Evaluating your indexes
In many cases, you might be able to eliminate indexes that are no longer necessary
or change the characteristics of an index to reduce disk usage.
About this task
Dropping unneeded indexes also improves performance because of savings in
index maintenance.
Eliminating unnecessary partitioning indexes:
A partitioning index requires as many data sets as the partitioned table space.
About this task
In table-controlled partitioning, the partitioning key and limit keys are specified
in the CREATE TABLE statement rather than in a CREATE INDEX statement, which
eliminates the need for any partitioning index data sets.
Procedure
To eliminate the disk space that is required for the partitioning index data sets:
1. Drop partitioning indexes that are used solely to partition the data, and not to
access it.
2. Convert to table-controlled partitioning.
Related concepts
Differences between partitioning methods (DB2 Administration Guide)
Dropping indexes that were created to avoid sorts:
Indexes that are defined only to avoid a sort for queries with an ORDER BY clause
are unnecessary if DB2 can perform a backward scan of another index to avoid the
sort.
About this task
In earlier versions of DB2, you might have created ascending and descending
versions of the same index for the sole purpose of avoiding a sort operation.
Procedure
To recover the space that is used by these indexes:
Drop indexes that were created to avoid sorts.
For example, consider the following query:
SELECT C1, C2, C3 FROM T
WHERE C1 > 1
ORDER BY C1 DESC;
Having an ascending index on C1 would not have prevented a sort to order the
data. To avoid the sort, you needed a descending index on C1. DB2 can scan an
index either forwards or backwards, which can eliminate the need to have indexes
with the same columns but with different ascending and descending
characteristics.
For DB2 to be able to scan an index backwards, the index must be defined on the
same columns as the ORDER BY and the ordering must be exactly opposite of
what is requested in the ORDER BY. For example, if an index is defined as C1
DESC, C2 ASC, DB2 can use:
v A forward scan of the index for ORDER BY C1 DESC, C2 ASC
v A backward scan of the index for ORDER BY C1 ASC, C2 DESC
However, DB2 does need to sort for either of the following ORDER BY clauses:
v ORDER BY C1 ASC, C2 ASC
v ORDER BY C1 DESC, C2 DESC
Using non-padded indexes:
You can save disk space by using non-padded indexes instead of padded indexes.
About this task
When you define an index as NOT PADDED, the varying-length columns in the
index are not padded to their maximum length. If the index contains at least one
varying-length column, the length information is stored with the key.
Consequently, the amount of savings depends on the number of varying-length
columns in the index and the actual length of the columns in those indexes versus
their maximum lengths.
Procedure
To use index padding efficiently:
As a general rule, use non-padded indexes only if the average amount that is
saved is greater than about 18 bytes per column. For example, assume that you
have an index key that is defined on a VARCHAR(128) column and the actual
length of the key is 8 bytes. An index that is defined as NOT PADDED would
require approximately 9 times less storage than an index that is defined as
PADDED, as shown by the following calculation:
(128 + 4) / (8 + 2 + 4) = 9
Compressing indexes:
You can reduce the amount of space that an index takes up on disk by
compressing the index.
About this task
However, keep in mind that index compression is heavily data-dependent, and
some indexes might contain data that will not yield significant space savings.
Compressed indexes might also use more real and virtual storage than
non-compressed indexes. The amount of additional real and virtual storage used
depends on the compression ratio used for the compressed keys, the amount of
free space, and the amount of space used by the key map.
The overhead of compressed indexes can be zero even for random key updates, as
long as the index pages can be kept in the buffer pool.
Procedure
To reduce the size of the index on disk:
v Specify the COMPRESS YES option when you issue an ALTER INDEX or
CREATE INDEX statement.
v Use the DSN1COMP utility on existing indexes to get an indication of the
appropriate page size for new indexes. You can choose 8-KB and 16-KB buffer
pool page sizes for the index. Choosing a 16-KB buffer pool instead of an 8-KB
buffer pool accommodates a potentially higher compression ratio, but also
increases the potential to use more storage. Estimates for index space savings
from the DSN1COMP utility, whether on the true index data or on similar
index data, are not exact.
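For example, the following hypothetical statement creates a compressed index
that uses an 8-KB buffer pool; the index, table, and column names are
illustrative:

CREATE INDEX USER1.IX2
  ON USER1.T1 (C1)
  BUFFERPOOL BP8K0
  COMPRESS YES;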
Index splitting for sequential INSERT activity:
DB2 detects sequential inserts and splits index pages asymmetrically to improve
space usage and reduce split processing.
You can further improve performance by choosing the appropriate page size for
index pages.
When all the entries in a leaf page are consumed during inserts, DB2 allocates a
new page and moves some entries from the old page to the new page. DB2 detects
when a series of inserts adds keys in ascending or descending sequential order.
When such a pattern is detected, DB2 splits the index pages asymmetrically, by
placing more or fewer keys on the newly allocated page. In this way, DB2 allocates
page space more efficiently and reduces the frequency of split processing
operations.
Traditionally, DB2 split index pages by moving approximately half the entries to
the new page. According to that logic, when sequential inserts added keys in
ascending order, the freed space in the old index page was never used. This meant
that an index used only half of the allocated page space. Page-splitting also
occurred more frequently because the index would fill the available half of each
newly allocated page very quickly.
Larger index page sizes can be beneficial in cases where frequent index splits
result from heavy inserts. You can determine the frequency of index splitting
from the LEAFNEAR, LEAFFAR, and NLEAF values in the SYSINDEXES and
SYSINDEXPART catalog tables, from contention on latch 70 (latch class 6 in
statistics) in data sharing, and from contention on latch 254 (latch class 7 in
statistics) in non-data-sharing environments, as shown in the performance trace.
A smaller index page size can be beneficial for achieving higher buffer pool hit
ratios in random read-intensive applications.
Chapter 7. Improving DB2 log performance
By understanding the day-to-day activity on the log, you can more effectively
pinpoint when problems occur and better understand how to tune for best
performance.
About this task
DB2 logs changes made to data, and other significant events, as they occur. The
characteristics of your workload have a direct effect on log write performance.
Long-running tasks that commit infrequently accumulate much more data to write at
commit time than a typical transaction does. These tasks can affect the whole
subsystem because of excess storage consumption, locking contention, and the
resources that are consumed for a rollback.
Do not forget to consider the cost of reading the log as well. The cost of reading
the log directly affects how long a restart or a recovery takes because DB2 must
read the log data before applying the log records back to the table space.
Related tasks
Managing the log and the bootstrap data set (DB2 Administration Guide)
Improving log write performance
By following certain recommendations, you can reduce the performance impact of
writing data to the log data sets.
Procedure
To improve log write performance, use any of the following approaches:
v If you replicate your logs to remote sites, choose the storage system that provides
the best possible performance for remote replication.
v Choose the largest size that your system can tolerate for the log output buffer. A
larger size for the log output buffer might decrease the number of forced I/O
operations that occur because additional buffers are unavailable, and can also
reduce the number of wait conditions. You can use the OUTPUT BUFFER field
(the OUTBUFF subsystem parameter) of installation panel DSNTIPL to specify
the size of the output buffer used for writing active log data sets. The maximum
size of the log output buffer is 400,000 KB.
To validate the OUTBUFF setting, you can collect IFCID 0001 (system services
statistics) trace records. The QJSTWTB field indicates the number of times the
buffer was full and caused a log record to wait for I/O to complete. A non-zero
count for QJSTWTB might indicate that the log output buffer is too small.
Similarly the value of the QJSTBPAG, if too large, might indicate that the log
output buffer is too large in relation to the demand for real storage.
v Choose fast devices for log data sets. The devices that are assigned to the active
log data sets must be fast. In environments with high levels of write activity,
high-capacity storage systems, such as the IBM TotalStorage DS8000 series, are
recommended to avoid logging bottlenecks.
v Avoid device contention. Place the copy of the bootstrap data set and, if using
dual active logging, the copy of the active log data sets, on volumes that are
accessible on a path different than that of their primary counterparts.
v Preformat new active log data sets. Whenever you allocate new active log data
sets, preformat them using the DSNJLOGF utility. This action avoids the
overhead of preformatting the log, which normally occurs at unpredictable
times.
v Stripe active log data sets. The active logs can be striped using DFSMS. Striping
is a technique to improve the performance of data sets that are processed
sequentially. Striping is achieved by splitting the data set into segments or
stripes and spreading those stripes across multiple volumes. Striping can
improve the maximum throughput log capacity and is most effective when
many changes to log records occur between commits. Striping is useful if you
have a high I/O rate for the logs. Striping is needed more with ESCON channels
than with the faster FICON channels.
v Stripe archive log data sets on disk. If writes to the archive log do not complete as
fast as writes to the active log, transactions might slow down while waiting for
the active log to be emptied. Also, when applying log records from the
archive log, striping helps DB2 read the archive data sets faster.
Types of log writes
Log writes are divided into two categories: asynchronous and synchronous.
Asynchronous writes
Asynchronous writes are the most common. These writes occur when data is
updated: the before-image and after-image records are usually moved to the log
output buffer, and control is returned to the application. However, if no log
buffer is available, the application must wait for one to become available.
Synchronous writes
Synchronous writes usually occur at commit time when an application has
updated data. This write is called 'forcing' the log because the application must
wait for DB2 to force the log buffers to disk before control is returned to the
application. If the log data set is not busy, all log buffers are written to disk. If the
log data set is busy, the requests are queued until it is freed.
Writing to two logs
Dual logging is shown in the figure below.
[Figure 4. Dual logging during two-phase commit. The figure shows an application
time line on which the forced log writes at the end of phase 1 and at the
beginning of phase 2 are performed as I/Os to both Log 1 and Log 2, and the
application waits for logging until the end of COMMIT.]
If you use dual logging (recommended for availability), the write to the first log
sometimes must complete before the write to the second log begins. The first time
a log control interval is written to disk, the write I/Os to the log data sets are
performed in parallel. However, if the same 4-KB log control interval is again
written to disk, the write I/Os to the log data sets must be done serially to prevent
any possibility of losing log data in case of I/O errors on both copies
simultaneously.
Two-phase commit log writes
Because they use two-phase commit, applications that use the CICS, IMS, and RRS
attachment facilities force writes to the log twice. The first write forces all the log
records of changes to be written (if they have not been written previously because
of the write threshold being reached). The second write writes a log record that
takes the unit of recovery into an in-commit state.
Related reference
DSNJLOGF (preformat active log) (DB2 Utilities)
Improving log read performance
The performance impact of log reads is evident during a rollback, restart, and
database recovery.
About this task
DB2 must read from the log and apply changes to the data on disk. Every process
that requests a log read has an input buffer dedicated to that process. DB2 searches
for log records in the following order:
1. Output buffer
2. Active log data set
3. Archive log data set
If the log records are in the output buffer, DB2 reads the records directly from that
buffer. If the log records are in the active or archive log, DB2 moves those log
records into the input buffer used by the reading process (such as a recovery job or
a rollback).
DB2 reads the log records faster from the active log than from the archive log.
Access to archived information can be delayed for a considerable length of time if
a unit is unavailable or if a volume mount is required (for example, a tape mount).
Procedure
To improve log read performance:
v Archive to disk. If the archive log data set resides on disk, it can be shared by
many log readers. In contrast, an archive on tape cannot be shared among log
readers. Although it is always best to avoid reading archives altogether, if a
process must read the archive, that process is serialized with anyone else who
must read the archive tape volume. For example, every rollback that accesses the
archive log must wait for any previous rollback work that accesses the same
archive tape volume to complete. If you do not have enough space to maintain
the archive data sets on disk, consider using DFHSM to write the archive data
sets to tape. This method has a disadvantage in that HSM must read the archive
data set from disk in order to write it to tape, but the recall process is improved
for a number of reasons. You can pre-stage the recalls of archive data sets in
parallel (to striped data sets), and when the data sets are recalled, parallel
readers can proceed.
v Avoid device contention on the log data sets by placing your active log data sets
on different volumes and I/O paths to avoid I/O contention in periods of high
concurrent log read activity. When multiple concurrent readers access the active
log, DB2 can ease contention by assigning some readers to a second copy of the
log. Therefore, for performance and error recovery, use dual logging and place
the active log data sets on a number of different volumes and I/O paths.
Whenever possible, put data sets within a copy or within different copies on
different volumes and I/O paths. Ensure that no data sets for the first copy of
the log are on the same volume as data sets for the second copy of the log.
v Stripe active log data sets. The active logs can be striped using DFSMS. Striping
is a technique to improve the performance of data sets that are processed
sequentially. Striping is achieved by splitting the data set into segments or
stripes and spreading those stripes across multiple volumes. Striping can
improve the maximum throughput log capacity and is most effective when
many changes to log records occur between commits. Striping is useful if you
have a high I/O rate for the logs. Striping is needed more with ESCON channels
than with the faster FICON channels.
Log statistics
You can gather statistics for logging activities from the OMEGAMON statistics
report.
A non-zero value for A in the following example indicates that your output
buffer is too small. Ensure that the size you choose is backed up by real storage. A
non-zero value for B is an indicator that your output buffer is too large for the
amount of available real storage.
LOG ACTIVITY                  QUANTITY  /SECOND  /THREAD  /COMMIT
---------------------------   --------  -------  -------  -------
READS SATISFIED-OUTPUT BUFF       0.00     0.00      N/C     0.00
READS SATISFIED-OUTP.BUF(%)        N/C
READS SATISFIED-ACTIVE LOG        0.00     0.00      N/C     0.00
READS SATISFIED-ACTV.LOG(%)        N/C
READS SATISFIED-ARCHIVE LOG       0.00     0.00      N/C     0.00
READS SATISFIED-ARCH.LOG(%)        N/C
TAPE VOLUME CONTENTION WAIT       0.00     0.00      N/C     0.00
READ DELAYED-UNAVAIL.RESOUR       0.00     0.00      N/C     0.00
ARCHIVE LOG READ ALLOCATION       0.00     0.00      N/C     0.00
ARCHIVE LOG WRITE ALLOCAT.        0.00     0.00      N/C     0.00
CONTR.INTERV.OFFLOADED-ARCH       0.00     0.00      N/C     0.00
LOOK-AHEAD MOUNT ATTEMPTED        0.00     0.00      N/C     0.00
LOOK-AHEAD MOUNT SUCCESSFUL       0.00     0.00      N/C     0.00
UNAVAILABLE OUTPUT LOG BUFF A     0.00     0.00      N/C     0.00
OUTPUT LOG BUFFER PAGED IN  B     0.00     0.00      N/C     0.00
LOG RECORDS CREATED         C  9456.6K    10.5K      N/C    16.41
LOG CI CREATED              D   277.3K   308.18      N/C     0.48
LOG WRITE I/O REQ (LOG1&2)      758.3K   842.70      N/C     1.32
LOG CI WRITTEN (LOG1&2)         976.8K  1085.48      N/C     1.70
LOG RATE FOR 1 LOG (MB)            N/A     2.12      N/A      N/A

Figure 5. Log statistics in the OMEGAMON statistics report
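The fields in this report are populated by the DB2 statistics trace. If the
statistics trace is not already active, the following command, shown as a
minimal sketch, starts statistics class 1, which includes IFCID 0001:

-START TRACE(STAT) CLASS(1)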
Calculating average log record size
One way to determine how much log volume you need is to consider the average
size of a log record.
About this task
As a general estimate, you can start with 200 bytes as the average size. To increase
the accuracy of this estimate, get the real size of the log records that are written.
Procedure
To calculate the average size of log records that are written:
1. Collect the following values from the statistics report:
LOG RECORDS CREATED
The number of log records created (C)
LOG CI CREATED
The number of control intervals created in the active log counter (D)
2. Use the following formula:
   D × 4 KB / C = average size of a log record
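For example, with the values shown in the sample OMEGAMON report above,
C = 9456.6K log records created and D = 277.3K control intervals created, so the
average log record size is approximately (277300 × 4096) / 9456600, or about
120 bytes.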
Improving log capacity
The capacity that you specify for the active log affects DB2 performance
significantly.
About this task
If you specify a capacity that is too small, DB2 might need to access data in the
archive log during rollback, restart, and recovery. Accessing an archive takes a
considerable amount of time.
The following subsystem parameters affect the capacity of the active log. In each
case, increasing the value that you specify for the parameter increases the capacity
of the active log.
NUMBER OF LOGS field on installation panel DSNTIPL
Controls the number of active log data sets that you create.
ARCHIVE LOG FREQ field on installation panel DSNTIPL
Where you provide an estimate of how often active log data sets are
copied to the archive log.
UPDATE RATE field on installation panel DSNTIPL
Where you provide an estimate of how many database changes (inserts,
updates, and deletes) you expect per hour.
The DB2 installation CLIST
Uses UPDATE RATE and ARCHIVE LOG FREQ to calculate the data set
size of each active log data set.
CHECKPOINT FREQ field on installation panel DSNTIPN
Specifies the number of log records that DB2 writes between checkpoints
or the number of minutes between checkpoints.
Related reference
Update selection menu panel: DSNTIPB (DB2 Installation Guide)
Active log data set parameters: DSNTIPL (DB2 Installation Guide)
Total capacity and the number of logs
You need to have sufficient capacity in the active log to avoid reading the archives,
and you need to consider how that total capacity should be divided.
Those requirements can make the configuration of your active log data sets a
challenge. Having too many or too few active log data sets has ramifications. This
information is summarized in the following table.
Table 13. The effects of installation options on log data sets. You can modify
the size of the data sets in installation job DSNTIJIN.

Value for ARCHIVE   Value for NUMBER
LOG FREQ            OF LOGS            Result
Low                 High               Many small data sets. Can cause
                                       operational problems when archiving to
                                       tape. Checkpoints occur too frequently.
High                Low                Few large data sets. Can result in a
                                       shortage of active log data sets.
Choosing a checkpoint frequency
If log data sets are too small, checkpoints occur too frequently, and database writes
are not efficient.
About this task
At least one checkpoint is taken each time DB2 switches to a new active log data
set. As a guideline, you should provide enough active log space for at least 10
checkpoint intervals.
Procedure
To specify the checkpoint interval:
v Specify the CHECKPOINT FREQ subsystem parameter. You can change
CHECKPOINT FREQ dynamically with the SET LOG or SET SYSPARM
command.
You can specify the interval in terms of the number of log records that are
written between checkpoints or the number of minutes between checkpoints.
v Avoid taking more than one checkpoint per minute by raising the
CHECKPOINT FREQ value so that the checkpoint interval becomes at least one
minute during peak periods. In general, the recommended checkpoint frequency
is between 2 and 5 minutes.
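For example, either form of the SET LOG command adjusts the checkpoint interval
dynamically; the values here are illustrative only. To checkpoint on a time
basis, every 3 minutes:

-SET LOG CHKTIME(3)

To checkpoint on a log-record basis instead, every 500000 log records:

-SET LOG LOGLOAD(500000)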
Increasing the number of active log data sets
You can use the change log inventory utility (DSNJU003) to add more active log
data sets to the BSDS.
About this task
You can specify a maximum of 93 data sets per active log copy.
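For example, the following sketch adds a fourth data set to each active log
copy. DB2 must be stopped when you run DSNJU003, the VSAM data sets must
already be defined with Access Method Services, and the data set and BSDS names
here are assumptions:

//ADDLOG   EXEC PGM=DSNJU003
//STEPLIB  DD DISP=SHR,DSN=DSNC910.SDSNLOAD   ASSUMED DB2 LOAD LIBRARY
//SYSUT1   DD DISP=OLD,DSN=DSNC910.BSDS01     ASSUMED BSDS COPY 1
//SYSUT2   DD DISP=OLD,DSN=DSNC910.BSDS02     ASSUMED BSDS COPY 2
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  NEWLOG DSNAME=DSNC910.LOGCOPY1.DS04,COPY1
  NEWLOG DSNAME=DSNC910.LOGCOPY2.DS04,COPY2
/*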
Related reference
DSNJCNVB (DB2 Utilities)
DSNJU003 (change log inventory) (DB2 Utilities)
Setting the size of active log data sets
You can modify the DSNTIJIN installation job to change the size of your active
log data sets.
Procedure
To choose the most effective size for your active log data set:
v When you calculate the size of the active log data set, identify the longest unit of
work in your application programs. For example, if a batch application program
commits only once every 20 minutes, the active log data set should be twice as
large as the update information that is produced during this period by all of the
application programs that are running.
For more information on determining and setting the size of your active log data
sets, refer to DB2 Installation Guide.
v Allow time for possible operator interventions, I/O errors, and tape drive
shortages if off-loading to tape. DB2 supports up to 20 tape volumes for a single
archive log data set. If your archive log data sets are under the control of
DFSMShsm, also consider the Hierarchical Storage Manager recall time, if the
data set has been migrated by Hierarchical Storage Manager.
v When archiving to disk, set the primary space quantity and block size for the
archive log data set so that you can offload the active log data set without
forcing the use of secondary extents in the archive log data set. This action
avoids space abends when writing the archive log data set.
v Make the number of records for the active log be divisible by the blocking factor
of the archive log (disk or tape). DB2 always writes complete blocks when it
creates the archive log copy of the active log data set. If you make the archive
log blocking factor evenly divisible into the number of active log records, DB2
does not have to pad the archive log data set with nulls to fill the block. This
action can prevent REPRO errors if you should ever have to REPRO the archive
log back into the active log data set, such as during disaster recovery.
To determine the blocking factor of the archive log, divide the value specified
on the BLOCK SIZE field of installation panel DSNTIPA by 4096 (that is, BLOCK
SIZE / 4096). Then modify the DSNTIJIN installation job so that the number of
records in the DEFINE CLUSTER field for the active log data set is a multiple of
the blocking factor. (A worked example follows this list.)
v If you offload to tape, consider adjusting the size of each of your active log data
sets to contain the same amount of space as can be stored on a nearly full tape
volume. Doing so minimizes tape handling and volume mounts and maximizes
the use of the tape resource.
If you change the size of your active log data set to fit on one tape volume,
remember that the bootstrap data set is copied to the tape volume along with
the copy of the active log data set. Therefore, decrease the size of your active log
data set to offset the space that is required on the archive tape for the bootstrap
data set.
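As a worked example of the blocking-factor calculation above: if the BLOCK SIZE
field on installation panel DSNTIPA specifies 24576, the blocking factor is
24576 / 4096 = 6, so the number of records in the DEFINE CLUSTER statement for
each active log data set should be a multiple of 6. (The block size here is
illustrative.)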
Controlling the amount of log data
Certain processes such as the LOAD and REORG utility and certain SQL
statements can cause a large amount of information to be logged, requiring a large
amount of log space.
Controlling log size for utilities
The REORG and LOAD LOG(YES) utilities cause all reorganized or loaded data to
be logged.
About this task
For example, if a table space contains 200 million rows of data, this data, along
with control information, is logged when this table space is the object of a REORG
utility job. If you use REORG with the DELETE option to eliminate old data in a
table and run CHECK DATA to delete rows that are no longer valid in dependent
tables, you can use LOG(NO) to control log volume.
Procedure
To reduce the log size:
v When populating a table with many records or reorganizing table spaces or
indexes, specify LOG(NO) and take an inline copy or take a full image copy
immediately after the LOAD or REORG. (A sample REORG statement follows this
list.)
v Specify LOGGED when adding less than 1% of the total table space. Doing so
creates additional logging, but eliminates the need for a full image copy.
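As an illustration of the first recommendation, the following statement, a
sketch that uses the DB2 sample table space (substitute your own object names
and assume that a SYSCOPY DD statement or template is available), reorganizes
without logging the reloaded rows and takes an inline image copy:

REORG TABLESPACE DSN8D91A.DSN8S91E
  LOG NO
  COPYDDN(SYSCOPY)
  SHRLEVEL REFERENCE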
Controlling log size for SQL operations
The amount of logging performed for applications depends on how much data is
changed.
About this task
Certain SQL statements are quite powerful, and a single statement can sometimes
modify a large amount of data. Such statements include:
INSERT with a fullselect
A large amount of data can be inserted into a table or view, depending on
the result of the query.
Mass deletes and mass updates (except for deleting all rows for a table in a
segmented or universal table space)
For non-segmented table spaces, each of these statements results in the
logging of all database data that changes. For example, if a table contains
200 million rows of data, that data and control information are logged if all
of the rows of a table are deleted with the SQL DELETE statement. No
intermediate commit points are taken during this operation.
For segmented and universal table spaces, a mass delete results in the
logging of the data of the deleted records when any of the following
conditions are true:
v The table is the parent table of a referential constraint.
v The table is defined as DATA CAPTURE(CHANGES), which causes
additional information to be logged for certain SQL operations.
v A delete trigger is defined on the table.
TRUNCATE TABLE
Essentially a mass-delete that does not activate delete triggers
Data definition statements
The entire database descriptor (DBD) for which the change was made is
logged. For very large DBDs, this can be a significant amount of logging.
Modification to rows that contain LOB data
Procedure
To control the use of log space by powerful SQL statements:
v For mass delete operations, consider using segmented table spaces or universal
table spaces. If segmented table spaces are not an option, and no triggers exist
on the table or your application can safely ignore any triggers on the table,
create one table per table space, and use TRUNCATE.
v For inserting a large amount of data, instead of using an SQL INSERT statement,
use the LOAD utility with LOG(NO) and take an inline copy.
v For updates, consider your workload when defining a table's columns. The
amount of data that is logged for update depends on whether the row contains
all fixed-length columns or not. For fixed-length non-compressed rows, changes
are logged only from the beginning of the first updated column to the end of the
last updated column. Consequently, you should keep frequently updated
columns close to each other to reduce log quantities.
For varying-length rows (a varying-length row contains one or more
varying-length columns), data is logged from the first changed byte to the end
of the last updated column if the length does not change. However, if the
length changes, which is more common, the data is logged from the
first changed byte to the end of the row.
end of the row to improve read performance, and keep all frequently updated
columns near the end of the row to improve update performance. However, if
only fixed-length columns are updated frequently, keep those columns close to
each other at the beginning of the row.
To determine whether a workload is read-intensive or update-intensive, check
the log data rate. You can find the rate in the LOG RATE FOR 1 LOG (MB/SEC)
field in the log statistics. Determine the average log size and divide that by 60 to
get the average number of log bytes written per second.
– If you log less than 5 MB per second, the workload is read-intensive.
– If you log more than 5 MB per second, the workload is update-intensive.
v If you have many data definition statements (CREATE, ALTER, DROP) for a
single database, issue them within a single unit of work to avoid logging the
changed DBD for each data definition statement. However, be aware that the
DBD is locked until the COMMIT is issued.
v Use the NOT LOGGED option for any LOB or XML data that requires frequent
updating and for which the trade off of non-recoverability of LOB or XML data
from the log is acceptable. (You can still use the RECOVER utility on LOB or
XML table spaces to recover control information that ensures physical
consistency of the LOB or XML table space.) Because LOB and XML table spaces
defined as NOT LOGGED are not recoverable from the DB2 log, you should
make a recovery plan for that data. For example, if you run batch updates, be
sure to take an image copy after the updates are complete.
v For data that is modified infrequently, except during certain periods such as
year-end processing, when frequent or large changes to the data occur over a
short time, follow these steps (a sample SQL sequence appears after this list):
1. Make an image copy of the data.
2. Alter the table space to NOT LOGGED.
3. Make the massive changes.
4. Stop other activities that update the data.
5. Make an image copy of the data.
6. Alter the table space to LOGGED.
v For changes to tables, such as materialized query tables, that contain propagated
data, use the NOT LOGGED option because the data exists elsewhere. If the
data becomes damaged, you can refresh the entire table from the original source.
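The following SQL sketch shows the alter steps of the preceding year-end
sequence for a hypothetical table space DBYR.TSYREND; the image copies are taken
with the COPY utility and are indicated as comments:

-- Step 1 (utility): COPY TABLESPACE DBYR.TSYREND FULL YES
ALTER TABLESPACE DBYR.TSYREND NOT LOGGED;
-- Steps 3 and 4: make the massive changes, then stop other update activity
-- Step 5 (utility): COPY TABLESPACE DBYR.TSYREND FULL YES
ALTER TABLESPACE DBYR.TSYREND LOGGED;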
Chapter 8. Improving the performance of stored procedures
and user-defined functions
You can improve the performance of stored procedures and user-defined functions
by following certain recommendations.
Procedure
To improve the performance of stored procedures and user-defined functions, use
any of the following recommendations:
v Update the ASUTIME column of the SYSIBM.SYSROUTINES catalog table to set
processor limits for each stored procedure or function. The limits that you
specify enable DB2 to cancel procedures or functions that loop. (A sample
statement appears after this list.)
v Limit the number of times that a stored procedure can terminate abnormally by
specifying one of the following options:
– The MAX ABEND COUNT field on installation panel DSNTIPX. The limit
that you specify applies to all stored procedures and prevents a problem
procedure from overwhelming the system with abend dump processing.
– The STOP AFTER FAILURES option on the ALTER or CREATE PROCEDURE
statement. The limit that you specify overrides the system limit that is
specified in the MAX ABEND COUNT field to specify limits for specific
stored procedures.
v Maximize the number of procedures or functions that can run concurrently in a
WLM-established stored procedure address space.
v Group your stored procedures in WLM application environments. For more
information, see Defining application environments.
v Use indicator variables in your programs and pass the indicator variables as
parameters. When output parameters occupy a large amount of storage, passing
the entire storage areas to your stored procedure can be wasteful. However, you
can use indicator variables in the calling program to pass only a two-byte area
to the stored procedure and receive the entire area from the stored procedure.
v Set a high-enough priority for the WLM-managed stored procedures address
spaces.
v Set the performance-related options appropriately in the CREATE PROCEDURE
statement. The following table shows the recommended values.
Table 14. Recommended values for performance-related options in the CREATE
PROCEDURE statement

Option             Recommended setting
PROGRAM TYPE       SUB
STAY RESIDENT      YES
PARAMETER STYLE    GENERAL WITH NULLS or SQL
COMMIT ON RETURN   NO for stored procedures that are called locally; YES for
                   stored procedures that are called from distributed client
                   applications in environments where sysplex workload
                   balancing is not used.
v Do not use the DSNTRACE DD statement in any of your stored procedures
address space startup procedures. DSNTRACE is a facility that can be used to
capture all trace messages for offline reference and diagnosis. However,
DSNTRACE greatly increases the stored procedure initialization overhead. Also,
DSNTRACE does not function in a multitasking environment because the CAF
does not serialize access to the DSNTRACE trace data set.
v Specify a large enough value for the CACHERAC subsystem parameter on the
DSNTIPP installation panel. The CACHERAC parameter specifies how much
storage to allocate for the caching of routine authorization information for all
routines on the DB2 member.
v Set the CMTSTAT subsystem parameter to INACTIVE. This setting causes
distributed threads to become inactive at commit when possible. The inactive
threads become available for thread reuse, which reduces the amount of
thread storage that is needed for the workload by reducing the number of
active distributed threads.
v Convert external stored procedures to native SQL procedures whenever possible.
The body of a native SQL procedure is written in SQL, and DB2 does not
generate an associated C program for native SQL procedures. Native SQL
procedures typically perform better and have more functionality than external
procedures.
v Study your workload for external stored procedures and functions carefully. You
can use DB2 Performance Expert or DB2 Performance Monitor to monitor stored
procedures and user-defined functions.
v Use partitioned data set extended (PDSE) members for load libraries that contain
stored procedures. By using PDSE members, you might eliminate the need to
stop and start the stored procedures address space because of growth in load
libraries, because the new extent information is dynamically updated. If a load
library grows from additions or replacements, the library might have to be
extended. If you use partitioned data set (PDS) members, load failures might
occur because the new extent information is not available.
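For example, the first recommendation in this list might be implemented with the
following statement for an external stored procedure; the routine name and limit
are hypothetical. The statement records the limit in the ASUTIME column of
SYSIBM.SYSROUTINES:

ALTER PROCEDURE MYSCHEMA.MYPROC
  ASUTIME LIMIT 50000;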
Related concepts
"IBM Tivoli OMEGAMON XE" on page 423
Case study: Stored procedure that runs RUNSTATS in parallel
Related tasks
"Limiting resources for a stored procedure" on page 655
Creating a native SQL procedure (Application programming and SQL)
Passing large output parameters to stored procedures by using indicator
variables (Application programming and SQL)
Related reference
ROUTINE AUTH CACHE field (CACHERAC subsystem parameter) (DB2
Installation Guide)
Related information
Defining Application Environments
Maximizing the number of procedures or functions that run in an
address space
You can improve the performance of stored procedures and user-defined functions
by maximizing the number of procedures or functions that can run concurrently in
a WLM-established stored procedures address space.
About this task
Each task control block that runs in a WLM-established stored procedures address
space uses approximately 200 KB below the 16-MB line. DB2 needs this storage for
stored procedures and user-defined functions because you can create both main
programs and subprograms, and DB2 must create an environment for each.
Procedure
To maximize the number of procedures or functions that can run concurrently in a
WLM-established stored procedures address space:
v Set the region size for the address spaces to REGION=0 to obtain the largest
possible amount of storage below the 16-MB line.
v Limit storage required by application programs below the 16-MB line by using
the following methods:
– Link edit programs above the line with AMODE(31) and RMODE(ANY)
attributes
– Use the RES and DATA(31) compiler options for COBOL programs
v Limit storage required by Language Environment® by using the following
run-time options:
HEAP(,,ANY)
       Allocates program heap storage above the 16-MB line
STACK(,,ANY,)
       Allocates program stack storage above the 16-MB line
STORAGE(,,,4K)
       Reduces reserve storage area below the line to 4 KB
BELOWHEAP(4K,,)
       Reduces the heap storage below the line to 4 KB
LIBSTACK(4K,,)
       Reduces the library stack below the line to 4 KB
ALL31(ON)
       Causes all programs contained in the external user-defined function to
       execute with AMODE(31) and RMODE(ANY)
The definer can list these options as values of the RUN OPTIONS parameter of
CREATE FUNCTION, or the system administrator can establish these options as
defaults during Language Environment installation. For example, the RUN
OPTIONS option parameter could contain:
H(,,ANY),STAC(,,ANY,),STO(,,,4K),BE(4K,,),LIBS(4K,,),ALL31(ON)
v Set the NUMTCB parameter for WLM-established stored procedures address
spaces to a value greater than 1 to allow more than one TCB to run in an
address space. Be aware that setting NUMTCB to a value greater than 1 also
reduces your level of application program isolation. For example, a bad pointer
in one application can overwrite memory that is allocated by another application.
A stored procedure can invoke only one utility in one address space at any
given time because of the resource requirements of utilities. On the WLM
Application-Environment panel, set NUMTCB to 1. With NUMTCB=1, or with the
value of NUMTCB being forced to 1, multiple WLM address spaces are created
to run each concurrent utility request that comes from a stored procedure call.
Related tasks
Specifying the number of stored procedures that can run concurrently
(Application programming and SQL)
Assigning stored procedures and functions to WLM application
environments
You can assign procedures to WLM environments to route the work that is
associated with the procedures to specific address spaces.

About this task

Workload manager routes work to address spaces based on the application
environment name and service class that are associated with the stored procedure
or function.

Procedure

To assign stored procedures or user-defined functions to run in WLM-established
stored procedures address spaces:
1. Make sure that you have a numeric value specified in the TIMEOUT VALUE field
of installation panel DSNTIPX. If you have problems with setting up the
environment, this timeout value ensures that your request to execute a stored
procedure does not wait for an unlimited amount of time.
2. Minimize the number of application environments and z/OS service classes to
prevent WLM from creating too many address spaces. WLM creates one address
space for each combination of application environment and service class. For
example, if five application environments have calling threads, and six service
classes exist, WLM might create as many as 30 address spaces.
3. Use the WLM application environment panels to associate the environment
name with the JCL procedure. The following figure shows an example of this
panel. For detailed information about workload management panels and how
to use them, see Using the WLM ISPF Application.
Application-Environment  Notes  Options  Help
------------------------------------------------------------------------
                   Create an Application Environment
Command ===> ___________________________________________________________

Application Environment Name . : WLMENV2                     Required
Description  . . . . . . . . . . Large Stored Proc Env.
Subsystem Type . . . . . . . . . DB2                         Required
Procedure Name . . . . . . . . . DSN1WLM
Start Parameters . . . . . . . . DB2SSN=DB2A,NUMTCB=2,APPLENV=WLMENV2
                                 _______________________________________
                                 ___________________________________

Starting of server address spaces for a subsystem instance:
1   1. Managed by WLM
    2. Limited to a single address space per system
    3. Limited to a single address space per sysplex

Figure 6. WLM panel to create an application environment. You can also use the
variable &IWMSSNM for the DB2SSN parameter (DB2SSN=&IWMSSNM). This variable
represents the name of the subsystem for which you are starting this address
space. This variable is useful for using the same JCL procedure for multiple DB2
subsystems.
4. Specify the WLM application environment name for the
WLM_ENVIRONMENT option of the CREATE or ALTER PROCEDURE (or
FUNCTION) statement to associate a stored procedure or user-defined function
with an application environment. (A sample statement follows these steps.)
5. Using the install utility in the WLM application, install the WLM service
definition that contains information about this application environment into the
couple data set.
6. Activate a WLM policy from the installed service definition.
7. Begin running stored procedures.
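For step 4, a statement such as the following, a sketch that assumes an existing
external procedure named MYSCHEMA.MYPROC, associates the procedure with the
WLMENV2 application environment from the figure:

ALTER PROCEDURE MYSCHEMA.MYPROC
  WLM ENVIRONMENT WLMENV2;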
Recommendations for assigning stored procedures to WLM environments

The NUMTCB value that you choose for each application environment will vary
by language. The following table provides general recommendations for the WLM
procedure NUMTCB setting for the different language stored procedures. These
recommendations are based on available resources and should be tuned
accordingly.
Table 15. Recommended WLM environments to define

Stored procedure language      NUMTCB value  Comments
COBOL, C/C++, PL/I             10-40
Debugging COBOL, C/C++, and    10-40
PL/I stored procedures
REXX                           1             REXX stored procedures must run in
                                             a WLM procedure with NUMTCB = 1. If
                                             they execute in a procedure with
                                             NUMTCB > 1, unpredictable results,
                                             such as an 0C4 abend, can occur.
Java (non-resettable mode)     20-40         Each JVM is started in
                                             non-resettable mode and is never
                                             reset. This allows you to run many
                                             more Java stored procedures.
External SQL stored            10-40         Must have one unauthorized data
procedures                                   set. COBOL, C/C++, and PL/I stored
                                             procedures can share the same
                                             application environment if the
                                             STEPLIB data sets and runtime
                                             options are the same for all
                                             languages.
WLM Application Environment recommendations
Related tasks
Managing authorizations for creation of stored procedures in WLM
environments (DB2 Administration Guide)
Specifying the number of stored procedures that can run concurrently
(Application programming and SQL)
Related concepts
Setting up WLM for DB2 stored procedures
Creating the required WLM application environment
Related information
Defining Application Environments
Accounting for nested activities
The accounting class 1 and class 2 CPU and elapsed times for triggers, stored
procedures, and user-defined functions are accumulated in separate fields and
exclude any time accumulated in other nested activity.
These CPU and elapsed times are accumulated for each category during the
execution of each agent until agent deallocation. Package accounting can be used
to break out accounting data for execution of individual stored procedures,
user-defined functions, or triggers. The following figure shows an agent that
executes multiple types of DB2 nested activities.
Figure 7. Time spent executing nested activities. The figure shows a time line
from T0 to T21 with columns for the application, DB2, a stored procedure, and a
user-defined function: the application issues SQL statements; a trigger fires
and calls a stored procedure; the stored procedure issues SQL, one statement of
which starts a user-defined function; the user-defined function issues SQL and
ends; control then returns to the stored procedure, back through the trigger,
and finally to the application.
The following table shows the formula used to determine time for nested activities.
Table 16. Sample for time used for execution of nested activities

Count for                                      Formula                    Class
Application elapsed                            T21-T0                     1
Application task control block (TU)            T21-T0                     1
Application in DB2 elapsed                     T2-T1 + T4-T3 + T20-T19    2
Application in DB2 task control block (TU)     T2-T1 + T4-T3 + T20-T19    2
Trigger in DB2 elapsed                         T6-T4 + T19-T18            2
Trigger in DB2 task control block (TU)         T6-T4 + T19-T18            2
Wait for STP time                              T7-T6 + T18-T17            3
Stored procedure elapsed                       T11-T6 + T18-T16           1
Stored procedure task control block (TU)       T11-T6 + T18-T16           1
Stored procedure SQL elapsed                   T9-T8 + T11-T10 + T17-T16  2
Stored procedure SQL task control block (TU)   T9-T8 + T11-T10 + T17-T16  2
Wait for user-defined function time            T12-T11                    3
User-defined function elapsed                  T16-T11                    1
User-defined function task control block (TU)  T16-T11                    1
User-defined function SQL elapsed              T14-T13                    2
User-defined function SQL task control         T14-T13                    2
block (TU)

Note: In the preceding table, TU = time used.
The total class 2 time is the total of the "in DB2" times for the application, trigger,
stored procedure, and user-defined function. The class 1 "wait" times for the stored
procedures and user-defined functions need to be added to the total class 3 times.
Chapter 9. Using materialized query tables to improve SQL
performance
Materialized query tables can simplify query processing, greatly improve the
performance of dynamic SQL queries, and are particularly effective in data
warehousing applications, where you can use them to avoid costly aggregations
and joins against large fact tables.
About this task
PSPI
Materialized query tables are tables that contain information that is derived
and summarized from other tables. Materialized query tables pre-calculate and
store the results of queries with expensive join and aggregation operations. By
providing this summary information, materialized query tables can simplify query
processing and greatly improve the performance of dynamic SQL queries.
Materialized query tables are particularly effective in data warehousing
applications.
Automatic query rewrite is the process DB2 uses to access data in a materialized
query table. If you enable automatic query rewrite, DB2 determines if it can resolve
a dynamic query or part of the query by using a materialized query table.
Procedure
To take advantage of eligible materialized query tables:
Rewrite the queries to use materialized query tables instead of the underlying base
tables. Keep in mind that a materialized query table can yield query results that
are not current if the base tables change after the materialized query table is
updated.
PSPI
Configuring automatic query rewrite
You can enable DB2 to rewrite certain queries to use materialized query tables
instead of base tables for faster processing.
Procedure
PSPI
To take advantage of automatic query rewrite with materialized query
tables:
1. Define materialized query tables.
2. Populate materialized query tables.
3. Refresh materialized query tables periodically to maintain data currency with
base tables. However, realize that refreshing materialized query tables can be
an expensive process.
4. Enable automatic query rewrite, and exploit its functions by submitting
read-only dynamic queries.
5. Evaluate the effectiveness of the materialized query tables. Drop under-utilized
tables, and create new tables as necessary.
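For steps 2 and 3, the REFRESH TABLE statement both populates a
system-maintained materialized query table initially and refreshes it
thereafter; the table name here is hypothetical:

REFRESH TABLE MYSCHEMA.SALESMQT;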
PSPI
Materialized query tables and automatic query rewrite
As the amount of data has increased, so has the demand for more interactive
queries.
PSPI
Because databases have grown substantially over the years, queries must
operate over huge amounts of data. For example, in a data warehouse
environment, decision-support queries might need to operate over 1 to 10 terabytes
of data, performing multiple joins and complex aggregation. As the amount of data
has increased, so has the demand for more interactive queries.
Despite the increasing amount of data, these queries still require a response time in
the order of minutes or seconds. In some cases, the only solution is to pre-compute
the whole or parts of each query. You can store these pre-computed results in a
materialized query table. You can then use the materialized query table to answer
these complicated queries when the system receives them. Using a process called
automatic query rewrite, DB2 recognizes when it can transparently rewrite a
submitted query to use the stored results in a materialized query table. By
querying the materialized query table instead of computing the results from the
underlying base tables, DB2 can process some complicated queries much more
efficiently. If the estimated cost of the rewritten query is less than the estimated
cost of the original query, DB2 uses the rewritten query.
Automatic query rewrite
When it uses Automatic query rewrite, DB2 compares user queries with the fullselect
query that defined a materialized query table. It then determines whether the
contents of a materialized query table overlap with the contents of a query. When
an overlap exists, the query and the materialized query table are said to match.
After discovering a match, DB2 rewrites the query to access the matched
materialized query table instead of the one or more base tables that the original
query specified. If a materialized query table overlaps with only part of a query,
automatic query rewrite can use the partial match. Automatic query rewrite
compensates for the non-overlapping parts of the query by accessing the tables
that are specified in the original query.
Automatic query rewrite tries to search for materialized query tables that result in
an access path with the lowest cost after the rewrite. DB2 compares the estimated
costs of the rewritten query and of the original query and chooses the query with
the lower estimated cost.
Example
Suppose that you have a very large table named TRANS that contains one row for
each transaction that a certain company processes. You want to tally the total
amount of transactions by some time period. Although the table contains many
columns, you are most interested in these four columns:
v YEAR, MONTH, DAY, which contain the date of a transaction
v AMOUNT, which contains the amount of the transaction
To total the amount of all transactions between 2001 and 2006, by year, you would
use the following query:
SELECT YEAR, SUM(AMOUNT)
FROM TRANS
WHERE YEAR >= '2001' AND YEAR <= '2006'
GROUP BY YEAR
ORDER BY YEAR;
This query might be very expensive to run, particularly if the TRANS table is a
very large table with millions of rows and many columns.
Now suppose that you define a materialized query table named STRANS by using
the following CREATE TABLE statement:
CREATE TABLE STRANS AS
(SELECT YEAR AS SYEAR,
MONTH AS SMONTH,
DAY AS SDAY,
SUM(AMOUNT) AS SSUM
FROM TRANS
GROUP BY YEAR, MONTH, DAY)
DATA INITIALLY DEFERRED REFRESH DEFERRED;
After you populate STRANS with a REFRESH TABLE statement, the table contains
one row for each day of each month and year in the TRANS table.
Using the automatic query rewrite process, DB2 can rewrite the original query into
a new query. The new query uses the materialized query table STRANS instead of
the original base table TRANS:
SELECT SYEAR, SUM(SSUM)
FROM STRANS
WHERE SYEAR >= '2001' AND SYEAR <= '2006'
GROUP BY SYEAR
ORDER BY SYEAR;
If you maintain data currency in the materialized query table STRANS, the
rewritten query provides the same results as the original query. The rewritten
query offers better response time and requires less CPU time.
Queries that are eligible for rewrite
DB2 supports automatic query rewrite only for read-only, dynamic queries that do
not contain parameter markers. DB2 cannot automatically rewrite statically bound
queries.
PSPI
You can always refer to a materialized query table explicitly in a statically
bound query or in a dynamically prepared query. However, you should consider
updating materialized query table data more frequently if you frequently query the
table directly. Also, allowing end users to refer directly to materialized query tables
reduces an installation's flexibility of dropping and creating materialized query
tables without affecting applications. PSPI
How DB2 considers automatic query rewrite
In general, DB2 considers automatic query rewrite at the query block level. A
read-only, dynamic query can contain multiple query blocks.
PSPI
For example, it might contain a subselect of UNION or UNION ALL,
temporarily materialized views, materialized table expressions, and subquery
predicates. DB2 processes the query without automatic query rewrite if the query
block is or contains any of the following items:
v A fullselect in the UPDATE SET statement.
v A fullselect in the INSERT statement.
v A fullselect in the materialized query table definition in the REFRESH TABLE
statement.
v An outer join.
v A query block that contains user-defined scalar or table functions with the
EXTERNAL ACTION attribute or the NON-DETERMINISTIC attribute, or with
the built-in function RAND.
v Parameter markers.
If none of these items exist in the query block, DB2 considers automatic query
rewrite. DB2 analyzes the query block in the user query and the fullselect in the
materialized query table definition to determine if it can rewrite the query. The
materialized query table must contain all of the data from the source tables (in
terms of both columns and rows) that DB2 needs to satisfy the query. For DB2 to
choose a rewritten query, the rewritten query must provide the same results as the
user query. (DB2 assumes the materialized query table contains current data.)
Furthermore, the rewritten query must offer better performance than the original
user query.
DB2 performs a sophisticated analysis to determine whether it can obtain the
results for the query from a materialized query table:
v DB2 compares the set of base tables that were used to populate the materialized
query table to the set of base tables that are referenced by the user query. If
these sets of tables share base tables in common, the query is a candidate for
query rewrite.
v DB2 compares the predicates in the materialized query table fullselect to the
predicates in the user query. The following factors influence the comparison:
– The materialized query table fullselect might contain predicates that are not in
the user query. If so, DB2 assumes that these predicates might have resulted
in discarded rows when the materialized query table was refreshed. Thus,
any rewritten query that makes use of the materialized query table might not
give the correct results. The query is not a candidate for query rewrite.
Exception: DB2 behavior differs if a predicate joins a common base table to
an extra table that is unique to the materialized query table fullselect. The
predicate does not result in discarded data if you define a referential
constraint between the two base tables to make the predicate lossless.
However, the materialized query table fullselect must not have any local
predicates that reference this extra table.
For an example of a lossless predicate, see Example 2 under “Automatic
query rewrite—complex examples” on page 105.
– Referential constraints on the source tables are very important in determining
whether automatic query rewrite uses a materialized query table.
– Predicates are much more likely to match if you code the predicate in the
user query so that it is the same or very similar to the predicate in the
materialized query table fullselect. Otherwise, the matching process might fail
on some complex predicates.
For example, the matching process between the simple equal predicates such
as COL1 = COL2 and COL2 = COL1 succeeds. Furthermore, the matching process
between simple equal predicates such as COL1 * (COL2 + COL3) = COL5 and
COL5 = (COL3 + COL2) * COL1 succeeds. However, the matching process
between equal predicates such as (COL1 + 3) * 10 = COL2 and COL1 * 10 +
30 = COL2 fails.
– The items in an IN-list predicate do not need to be in exactly the same order
for predicate matching to succeed.
v DB2 compares GROUP BY clauses in the user query to GROUP BY clauses in the
materialized query table fullselect. If the user query requests data at the same or
higher grouping level as the data in the materialized query table fullselect, the
materialized query table remains a candidate for query rewrite. DB2 uses
functional dependency information and column equivalence in this analysis.
v DB2 compares the columns that are requested by the user query with the
columns in the materialized query table. If DB2 can derive the result columns
from one or more columns in the materialized query table, the materialized
query table remains a candidate for query rewrite.DB2 uses functional
dependency information and column equivalence in this analysis.
v DB2 examines predicates in the user query that are not identical to predicates in
the materialized query table fullselect. Then, DB2 determines if it can derive
references to columns in the base table from columns in the materialized query
table instead. If DB2 can derive the result columns from the materialized query
table, the materialized query table remains a candidate for query rewrite.
If all of the preceding analyses succeed, DB2 rewrites the user query. DB2 replaces
all or some of the references to base tables with references to the materialized
query table. If DB2 finds several materialized query tables that it can use to rewrite
the query, it might use multiple tables simultaneously. If DB2 cannot use the tables
simultaneously, it chooses which one to use according to a set of rules.
After writing the new query, DB2 determines the cost and the access path of that
query. DB2 uses the rewritten query if the estimated cost of the rewritten query is
less than the estimated cost of the original query. The rewritten query might give
only approximate results if the data in the materialized query table is not up to
date. PSPI
Automatic query rewrite—complex examples
These examples can help you understand how DB2 applies automatic query
rewrite to avoid costly aggregations and joins against large fact tables.
The following examples assume a scenario in which a data warehouse has a star
schema. The star schema represents the data of a simplified credit card application,
as shown in the following figure.
Figure 8. Multi-fact star schema. In this simplified credit card application,
the fact tables TRANSITEM and TRANS form the hub of the star schema. The schema
also contains four dimensions: product (PLINE and PGROUP), location (LOC),
account (ACCT and CUST), and time (TRANS).
The data warehouse records transactions that are made with credit cards. Each
transaction consists of a set of items that are purchased together. At the center of
the data warehouse are two large fact tables. TRANS records the set of credit card
purchase transactions. TRANSITEM records the information about the items that
are purchased. Together, these two fact tables are the hub of the star schema. The
star schema is a multi-fact star schema because it contains these two fact tables.
The fact tables are continuously updated for each new credit card transaction.
In addition to the two fact tables, the schema contains four dimensions that
describe transactions: product, location, account, and time.
v The product dimension consists of two normalized tables, PGROUP and PLINE,
that represent the product group and product line.
v The location dimension consists of a single, denormalized table, LOC, that
contains city, state, and country.
v The account dimension consists of two normalized tables, ACCT and CUST, that
represent the account and the customer.
v The time dimension consists of the TRANS table that contains day, month, and
year.
Analysts of such a credit card application are often interested in the aggregation of
the sales data. Their queries typically perform joins of one or more dimension
tables with fact tables. The fact tables contain significantly more rows than the
dimension tables, and complicated queries that involve large fact tables can be
very costly. In many cases, you can use materialized query tables to summarize
and store information from the fact tables. Using materialized query tables can
help you avoid costly aggregations and joins against large fact tables. PSPI
Example 1
An analyst submits the following query to count the number of
transactions that are made in the United States for each credit card. The analyst
requests the results grouped by credit card account, state, and year:
UserQ1
------
SELECT T.ACCTID, L.STATE, T.YEAR, COUNT(*) AS CNT
FROM TRANS T, LOC L
WHERE T.LOCID = L.ID AND
      L.COUNTRY = 'USA'
GROUP BY T.ACCTID, L.STATE, T.YEAR;
Assume that the following CREATE TABLE statement created a materialized query
table named TRANSCNT:
CREATE TABLE TRANSCNT AS
(SELECT ACCTID, LOCID, YEAR, COUNT(*) AS CNT
FROM TRANS
GROUP BY ACCTID, LOCID, YEAR )
DATA INITIALLY DEFERRED
REFRESH DEFERRED;
If you enable automatic query rewrite, DB2 can rewrite UserQ1 as NewQ1. NewQ1
accesses the TRANSCNT materialized query table instead of the TRANS fact table.
NewQ1
-----
SELECT A.ACCTID, L.STATE, A.YEAR, SUM(A.CNT) AS CNT
FROM TRANSCNT A, LOC L
WHERE A.LOCID = L.ID AND
      L.COUNTRY = 'USA'
GROUP BY A.ACCTID, L.STATE, A.YEAR;
DB2 can use query rewrite in this case because of the following reasons:
v The TRANS table is common to both UserQ1 and TRANSCNT.
v DB2 can derive the columns of the query result from TRANSCNT.
v The GROUP BY in the query requests data that are grouped at a higher level
than the level in the definition of TRANSCNT.
Because customers typically make several hundred transactions per year, with
most of them in the same city, TRANSCNT is about one hundred times smaller than
TRANS. Therefore, rewriting UserQ1 into a query that uses TRANSCNT instead of
TRANS improves response time significantly.
Example 2
Assume that an analyst wants to find the number of televisions, with a
price over 100 and a discount greater than 0.1, that were purchased by each credit
card account. The analyst submits the following query:
UserQ2
------
SELECT T.ID, TI.QUANTITY * TI.PRICE * (1 - TI.DISCOUNT) AS AMT
FROM TRANSITEM TI, TRANS T, PGROUP PG
WHERE TI.TRANSID = T.ID   AND
      TI.PGID = PG.ID     AND
      TI.PRICE > 100      AND
      TI.DISCOUNT > 0.1   AND
      PG.NAME = 'TV';
If you define the following materialized query table TRANSIAB, DB2 can rewrite
UserQ2 as NewQ2:
TRANSIAB
--------
CREATE TABLE TRANSIAB AS
  (SELECT TI.TRANSID, TI.PRICE, TI.DISCOUNT, TI.PGID,
          L.COUNTRY, TI.PRICE * TI.QUANTITY AS VALUE
   FROM TRANSITEM TI, TRANS T, LOC L
   WHERE TI.TRANSID = T.ID AND
         T.LOCID = L.ID    AND
         TI.PRICE > 1      AND
         TI.DISCOUNT > 0.1)
  DATA INITIALLY DEFERRED
  REFRESH DEFERRED;
NewQ2
-----
SELECT A.TRANSID, A.VALUE * (1 - A.DISCOUNT) AS AMT
FROM TRANSIAB A, PGROUP PG
WHERE A.PGID = PG.ID AND
      A.PRICE > 100  AND
      PG.NAME = 'TV';
DB2 can rewrite UserQ2 as a new query that uses materialized query table
TRANSIAB because of the following reasons:
v Although the predicate T.LOCID = L.ID appears only in the materialized query
table, it does not result in rows that DB2 might discard. The referential
constraint between the TRANS.LOCID and LOC.ID columns makes the join
between TRANS and LOC in the materialized query table definition lossless. The
join is lossless only if the foreign key in the constraint is NOT NULL.
v The predicates TI.TRANSID = T.ID and TI.DISCOUNT > 0.1 appear in both the
user query and the TRANSIAB fullselect.
v The fullselect predicate TI.PRICE >1 in TRANSIAB subsumes the user query
predicate TI.PRICE > 100 in UserQ2. Because the fullselect predicate is more
inclusive than the user query predicate, DB2 can compute the user query
predicate from TRANSIAB.
v The user query predicate PG.NAME = ’TV’ refers to a table that is not in the
TRANSIAB fullselect. However, DB2 can compute the predicate from the
PGROUP table. A predicate like PG.NAME = ’TV’ does not disqualify other
predicates in a query from qualifying for automatic query rewrite. In this case
PGROUP is a relatively small dimension table, so a predicate that refers to the
table is not overly costly.
v DB2 can derive the query result from the materialized query table definition,
even when the derivation is not readily apparent:
– DB2 derives T.ID in the query from TI.TRANSID in the TRANSIAB fullselect.
Although these two columns originate from different tables, they are
equivalent because of the predicate TI.TRANSID = T.ID. DB2 recognizes such
column equivalency through join predicates. Thus, DB2 derives T.ID from
TI.TRANSID, and the query qualifies for automatic query rewrite.
– DB2 derives AMT in the query UserQ2 from DISCOUNT and VALUE in the
TRANSIAB fullselect.
Example 3
This example shows how DB2 matches GROUP BY items and aggregate
functions between the user query and the materialized query table fullselect.
Assume that an analyst submits the following query to find the average value of
the transaction items for each year:
UserQ3
------
SELECT YEAR, AVG(QUANTITY * PRICE) AS AVGVAL
FROM TRANSITEM TI, TRANS T
WHERE TI.TRANSID = T.ID
GROUP BY YEAR;
If you define the following materialized query table TRANSAVG, DB2 can rewrite
UserQ3 as NewQ3:
TRANSAVG
--------
CREATE TABLE TRANSAVG AS
  (SELECT T.YEAR, T.MONTH, SUM(QUANTITY * PRICE) AS TOTVAL, COUNT(*) AS CNT
   FROM TRANSITEM TI, TRANS T
   WHERE TI.TRANSID = T.ID
   GROUP BY T.YEAR, T.MONTH)
  DATA INITIALLY DEFERRED
  REFRESH DEFERRED;
NewQ3
-----
SELECT YEAR, CASE WHEN SUM(CNT) = 0 THEN NULL
                  ELSE SUM(TOTVAL)/SUM(CNT)
             END AS AVGVAL
FROM TRANSAVG
GROUP BY YEAR;
DB2 can rewrite UserQ3 as a new query that uses materialized query table
TRANSAVG because of the following reasons:
v DB2 considers YEAR in the user query and YEAR in the materialized query
table fullselect to match exactly.
v DB2 can derive the AVG function in the user query from the SUM function and
the COUNT function in the materialized query table fullselect.
v The GROUP BY clause in the query NewQ3 requests data at a higher level than
the level in the definition of TRANSAVG.
v DB2 can compute the yearly average in the user query by using the monthly
sums and counts of transaction items in TRANSAVG. DB2 derives the yearly
averages from the CNT and TOTVAL columns of the materialized query table by
using a case expression.
Determining whether query rewrite occurred
You can use EXPLAIN to determine whether DB2 has rewritten a user query to use
a materialized query table.
About this task
When DB2 rewrites the query, the PLAN TABLE shows the name of the
materialized query table that DB2 uses. The value of the TABLE_TYPE column is
M to indicate that the table is a materialized query table.
Example
Consider the following user query:
SELECT YEAR, AVG(QUANTITY * PRICE) AS AVGVAL
FROM TRANSITEM TI, TRANS T
WHERE TI.TRANSID = T.ID
GROUP BY YEAR;
If DB2 rewrites the query to use a materialized query table, a portion of the plan
table output might look like the following table.
Table 17. Plan table output for an example with a materialized query table

PLANNO  METHOD  TNAME     JOIN_TYPE  TABLE_TYPE
1       0       TRANSAVG  -          M
2       3       2         -          ?
The value M in TABLE_TYPE indicates that DB2 used a materialized query table.
TNAME shows that DB2 used the materialized query table named TRANSAVG.
You can also obtain this information from a performance trace (IFCID 0022).
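For instance, you could issue statements like the following to check a query for
rewrite; the QUERYNO value 100 is an arbitrary number chosen for this
illustration:

EXPLAIN PLAN SET QUERYNO = 100 FOR
  SELECT YEAR, AVG(QUANTITY * PRICE) AS AVGVAL
  FROM TRANSITEM TI, TRANS T
  WHERE TI.TRANSID = T.ID
  GROUP BY YEAR;

SELECT PLANNO, METHOD, TNAME, JOIN_TYPE, TABLE_TYPE
FROM PLAN_TABLE
WHERE QUERYNO = 100
ORDER BY QBLOCKNO, PLANNO;

If the rewrite occurred, the TNAME column shows the materialized query table
and the TABLE_TYPE column contains M.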
Enabling automatic query rewrite
Whether DB2 can consider automatic query rewrite depends on properly defined
materialized query tables, and the values of two special registers.
Before you begin
v The isolation levels of the materialized query tables must be equal to or higher
than the isolation level of the dynamic query being considered for automatic
query rewrite.
v You must populate system-maintained materialized query tables before DB2
considers them in automatic query rewrite.
About this task
The values of two special registers, CURRENT REFRESH AGE and
CURRENT MAINTAINED TABLE TYPES FOR OPTIMIZATION, determine
whether DB2 can consider using materialized query tables in automatic query
rewrite.
Procedure
To enable automatic query rewrite:
v Specify ANY for the CURRENT REFRESH AGE special register.
The value in special register CURRENT REFRESH AGE represents a refresh age.
The refresh age of a materialized query table is the time since the REFRESH
TABLE statement last refreshed the table. (When you run the REFRESH TABLE
statement, you update the timestamp in the REFRESH_TIME column in catalog
table SYSVIEWS.) The special register CURRENT REFRESH AGE specifies the
maximum refresh age that a materialized query table can have. Specifying the
maximum age ensures that automatic query rewrite does not use materialized
query tables with old data. The CURRENT REFRESH AGE has only two values:
0 or ANY. A value of 0 means that DB2 considers no materialized query tables in
automatic query rewrite. A value of ANY means that DB2 considers all
materialized query tables in automatic query rewrite.
The CURRENT REFRESH AGE field on installation panel DSNTIP8 determines
the initial value of the CURRENT REFRESH AGE special register. The default
value for the CURRENT REFRESH AGE field is 0.
v Specify the appropriate value for the CURRENT MAINTAINED TABLE TYPES
FOR OPTIMIZATION special register.
The refresh age of a user-maintained materialized query table might not truly
represent the freshness of the data in the table. In addition to the REFRESH
TABLE statement, user-maintained query tables can be updated with the
INSERT, UPDATE, MERGE, TRUNCATE, and DELETE statements and the
LOAD utility. Therefore, you can use the CURRENT MAINTAINED TABLE
TYPES FOR OPTIMIZATION special register to determine which type of
materialized query tables, system-maintained or user-maintained, DB2 considers
in automatic query rewrite. The special register has four possible values that
indicate which materialized query tables DB2 considers for automatic query
rewrite:
SYSTEM
DB2 considers only system-maintained materialized query tables.
USER DB2 considers only user-maintained materialized query tables.
ALL
DB2 considers both types of materialized query tables.
NONE
DB2 considers no materialized query tables.
The CURRENT MAINT TYPES field on installation panel DSNTIP4 determines
the initial value of the CURRENT MAINTAINED TABLE TYPES FOR
OPTIMIZATION special register. The default value for CURRENT MAINT
TYPES is SYSTEM.
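For example, a dynamic SQL application that wants DB2 to consider every
populated materialized query table, regardless of refresh age, might issue the
following statements (a sketch; choose the values that fit your data currency
requirements):

SET CURRENT REFRESH AGE ANY;
SET CURRENT MAINTAINED TABLE TYPES FOR OPTIMIZATION ALL;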
Results
The following table summarizes how to use the CURRENT REFRESH AGE and
CURRENT MAINTAINED TABLE TYPES FOR OPTIMIZATION special registers
together. The table shows which materialized query tables DB2 considers in
automatic query rewrite.
Table 18. The relationship between CURRENT REFRESH AGE and CURRENT MAINTAINED TABLE TYPES FOR
OPTIMIZATION special registers

Value of CURRENT MAINTAINED     CURRENT REFRESH AGE=ANY        CURRENT REFRESH AGE=0
TABLE TYPES FOR OPTIMIZATION
SYSTEM                          All system-maintained          None
                                materialized query tables
USER                            All user-maintained            None
                                materialized query tables
ALL                             All materialized query tables  None
                                (both system-maintained and
                                user-maintained)
NONE                            None                           None
Creating a materialized query table
You can create a materialized query table, which is defined by the result of a
query, to improve the performance of certain SQL applications.
About this task
You should create materialized query tables in a table space that is defined as NOT
LOGGED to avoid the performance overhead created by logging changes to the
data.
Procedure
To create a new table as a materialized query table:
1. Write a CREATE TABLE statement, and specify a fullselect. You can explicitly
specify the column names of the materialized query table or allow DB2 to
derive the column names from the fullselect. The column definitions of a
materialized query table are the same as those for a declared global temporary
table that is defined with the same fullselect.
2. Include the DATA INITIALLY DEFERRED and REFRESH DEFERRED clauses
when you define a materialized query table.
DATA INITIALLY DEFERRED clause
DB2 does not populate the materialized query table when you create
the table. You must explicitly populate the materialized query table.
v For system-maintained materialized query tables, populate the tables
for the first time by using the REFRESH TABLE statement.
v For user-maintained materialized query tables, populate the table by
using the LOAD utility, INSERT statement, or REFRESH TABLE
statement.
REFRESH DEFERRED clause
DB2 does not immediately update the data in the materialized query
table when its base tables are updated. You can use the REFRESH
TABLE statement at any time to update materialized query tables and
maintain data currency with underlying base tables.
3. Specify who maintains the materialized query table:
MAINTAINED BY SYSTEM clause
Specifies that the materialized query table is a system-maintained
materialized query table. You cannot update a system-maintained
materialized query table by using the LOAD utility or the INSERT,
UPDATE, MERGE, TRUNCATE, or DELETE statements. You can
update a system-maintained materialized query table only by using the
REFRESH TABLE statement. MAINTAINED BY SYSTEM is the default behavior if you
do not specify a MAINTAINED BY clause.
Create system-maintained materialized query tables in segmented or
universal table spaces because the REFRESH TABLE statement triggers
a mass delete operation.
MAINTAINED BY USER clause
Specifies that the table is a user-maintained materialized query table.
You can update a user-maintained materialized query table by using
the LOAD utility, the INSERT, UPDATE, MERGE, TRUNCATE, and
DELETE statements, as well as the REFRESH TABLE statement.
4. Specify whether query optimization is enabled.
ENABLE QUERY OPTIMIZATION clause
Specifies that DB2 can consider the materialized query table in
automatic query rewrite. When you enable query optimization, DB2 is
more restrictive about what you can include in the fullselect for a
materialized query table.
DISABLE QUERY OPTIMIZATION clause
Specifies that DB2 cannot consider the materialized query table in
automatic query rewrite.
Recommendation: When creating a user-maintained materialized query table,
initially disable query optimization. Otherwise, DB2 might automatically
rewrite queries to use the empty materialized query table. After you populate
the user-maintained materialized query table, you can alter the table to enable
query optimization.
Results
The isolation level of the materialized table is the isolation level at which the
CREATE TABLE statement is executed.
After you create a materialized query table, it looks and behaves like other tables
in the database system, with a few exceptions. DB2 allows materialized query
tables in database operations wherever it allows other tables, with a few
restrictions. As with any other table, you can create indexes on the materialized
query table; however, the indexes that you create must not be unique. Instead, DB2
uses the materialized query table's definition to determine if it can treat the index
as a unique index for query optimization.
Example
The following CREATE TABLE statement defines a materialized query table named
TRANSCNT. The TRANSCNT table summarizes the number of transactions in
table TRANS by account, location, and year:
CREATE TABLE TRANSCNT (ACCTID, LOCID, YEAR, CNT) AS
(SELECT ACCOUNTID, LOCATIONID, YEAR, COUNT(*)
FROM TRANS
GROUP BY ACCOUNTID, LOCATIONID, YEAR )
DATA INITIALLY DEFERRED
REFRESH DEFERRED
MAINTAINED BY SYSTEM
ENABLE QUERY OPTIMIZATION;
The fullselect, together with the DATA INITIALLY DEFERRED clause and the
REFRESH DEFERRED clause, defines the table as a materialized query table.
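As noted earlier, indexes that you create on a materialized query table must not
be unique. For instance, you might support TRANSCNT with a non-unique index
on its grouping columns; the index name here is hypothetical:

CREATE INDEX TRANSCIX
  ON TRANSCNT (ACCTID, LOCID, YEAR);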
Related reference
CREATE TABLE (DB2 SQL)
Rules for materialized query tables
If one or more source tables in the materialized query table definition contain a
security label column, certain rules apply to creating a materialized query table.
About this task
v If only one source table contains a security label column, the following
conditions apply:
– You must define the security label column in the materialized query table
definition with the AS SECURITY LABEL clause.
– The materialized query table inherits the security label column from the
source table.
– The MAINTAINED BY USER option is allowed.
v If only one source table contains a security label column and the materialized
query table is defined with the DEFINITION ONLY clause, the materialized
query table inherits the values in the security label column from the source
table. However, the inherited column is not a security label column.
v If more than one source table contains a security label column, DB2 returns an
error code and the materialized query table is not created.
Registering an existing table as a materialized query table
You might already have manually created base tables that act like materialized
query tables and have queries that directly access the base tables. These base tables
are often referred to as summary tables.
Before you begin
To ensure the accuracy of data that is used in automatic query rewrite, ensure that
the summary table is current before registering it as a materialized query table.
Alternatively, you can follow these steps:
1. Register the summary table as a materialized query table with automatic query
rewrite disabled.
2. Update the newly registered materialized query table to refresh the data.
3. Use the ALTER TABLE statement on the materialized query table to enable
automatic query rewrite.
Procedure
To take advantage of automatic query rewrite for an existing summary table:
Use the ALTER TABLE statement with the DATA INITIALLY DEFERRED and
MAINTAINED BY USER clauses to register the table as a materialized query table.
The fullselect must specify the same number of columns as the table you register
as a materialized query table. The columns must have the same definitions and
have the same column names in the same ordinal positions.
The DATA INITIALLY DEFERRED clause indicates that the table data is to remain
the same when the ALTER statement completes. The MAINTAINED BY USER
clause indicates that the table is user-maintained.
Results
The table becomes immediately eligible for use in automatic query rewrite. The
isolation level of the materialized query table is the isolation level at which the
ALTER TABLE statement is executed.
You can continue to update the data in the table by using the LOAD utility or the
INSERT, UPDATE, MERGE, TRUNCATE, or DELETE statements. You can also use
the REFRESH TABLE statement to update the data in the table.
Example
Assume that you have an existing summary table named TRANSCOUNT. The
TRANSCOUNT table has four columns to track the number of transactions by
account, location, and year. Assume that TRANSCOUNT was created with this
CREATE TABLE statement:
CREATE TABLE TRANSCOUNT
 (ACCTID INTEGER NOT NULL,
  LOCID  INTEGER NOT NULL,
  YEAR   INTEGER NOT NULL,
  CNT    INTEGER NOT NULL);
The following SELECT statement then populated TRANSCOUNT with data that
was derived from aggregating values in the TRANS table:
SELECT ACCTID, LOCID, YEAR, COUNT(*)
FROM TRANS
GROUP BY ACCTID, LOCID, YEAR ;
You could use the following ALTER TABLE statement to register TRANSCOUNT
as a materialized query table. The statement specifies the ADD MATERIALIZED
QUERY clause:
ALTER TABLE TRANSCOUNT ADD MATERIALIZED QUERY
(SELECT ACCTID, LOCID, YEAR, COUNT(*) as cnt
FROM TRANS
GROUP BY ACCTID, LOCID, YEAR )
DATA INITIALLY DEFERRED
REFRESH DEFERRED
MAINTAINED BY USER;
Related reference
ALTER TABLE (DB2 SQL)
Altering an existing materialized query table
You can use the ALTER TABLE statement to change the attributes of a materialized
query table or change a materialized query table to a base table.
About this task
Altering a materialized query table to enable it for query optimization makes the
table immediately eligible for use in automatic query rewrite. You must ensure that
the data in the materialized query table is current. Otherwise, automatic query
rewrite might return results that are not current.
One reason you might want to change a materialized query table into a base table
is to perform table operations that are restricted for a materialized query table. For
example, you might want to rotate the partitions on your partitioned materialized
query table. In order to rotate the partitions, you must change your materialized
query table into a base table. While the table is a base table, you can rotate the
partitions. After you rotate the partitions, you can change the table back to a
materialized query table.
In addition to using the ALTER TABLE statement, you can change a materialized
query table by dropping the table and recreating the materialized query table with
a different definition.
Procedure
To change the attributes of a materialized query table:
1. Issue an ALTER TABLE statement.
v Enable or disable automatic query rewrite for a materialized query table with
the ENABLE QUERY OPTIMIZATION or DISABLE QUERY OPTIMIZATION
clause.
v Change the type of materialized query table between system-maintained and
user-maintained by using the MAINTAINED BY SYSTEM or MAINTAINED
BY USER clause.
2. Change a materialized query table into a base table.
For example, assume that you no longer want the TRANSCOUNT table to be a materialized
query table. The following ALTER TABLE statement, which specifies the DROP
MATERIALIZED QUERY clause, changes the materialized query table into a
base table.
ALTER TABLE TRANSCOUNT DROP MATERIALIZED QUERY;
The column definitions and the data in the table do not change.
However, DB2 can no longer use the table in automatic query rewrite. You can
no longer update the table with the REFRESH TABLE statement.
Populating and maintaining materialized query tables
After you define a materialized query table, you need to maintain the accuracy of
the data in the table.
About this task
This maintenance includes populating the table for the first time and periodically
refreshing the data in the table. You need to refresh the data because any changes
that are made to the base tables are not automatically reflected in the materialized
query table.
The only way to change the data in a system-maintained materialized query table
is through the REFRESH TABLE statement. The INSERT, UPDATE, MERGE,
TRUNCATE and DELETE statements, and the LOAD utility cannot refer to a
system-maintained materialized query table as a target table. Therefore, a
system-maintained materialized query table is read-only.
Any view or cursor that is defined on a system-maintained materialized query
table is read-only. However, for a user-maintained materialized query table, you
can alter the data with the INSERT, UPDATE, and DELETE statements, and the
LOAD utility.
Populating a new materialized query table
You can populate a newly created table for the first time by using the REFRESH
TABLE statement or by using INSERT, UPDATE, MERGE, TRUNCATE, or DELETE
statements, or the LOAD utility.
About this task
When you create a materialized query table with the CREATE TABLE statement,
the table is not immediately populated.
Procedure
To populate a materialized query table:
v Issue a REFRESH TABLE statement. For example, the following REFRESH
TABLE statement populates a materialized query table named SALESCNT:
REFRESH TABLE SALESCNT;
You should avoid using the REFRESH TABLE statement to update
user-maintained materialized query tables because the REFRESH TABLE
statement uses a fullselect and can result in a long-running query.
The REFRESH TABLE statement is an explainable statement. The explain output
contains rows for INSERT with the fullselect in the materialized query table
definition.
v Use the INSERT, UPDATE, MERGE, TRUNCATE, or DELETE statements, or the
LOAD utility.
You cannot use the INSERT, UPDATE, MERGE, TRUNCATE, or DELETE
statements, or the LOAD utility to change system-maintained materialized query
tables.
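For example, for a user-maintained materialized query table, repeating the
defining fullselect in an INSERT statement is one way to populate the table; this
sketch reuses the TRANSCNT definition from the earlier example:

INSERT INTO TRANSCNT
  SELECT ACCOUNTID, LOCATIONID, YEAR, COUNT(*)
  FROM TRANS
  GROUP BY ACCOUNTID, LOCATIONID, YEAR;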
Refreshing a system-maintained materialized query table
You can use the REFRESH TABLE statement to refresh the data in any materialized
query table at any time.
Chapter 9. Using materialized query tables to improve SQL performance
117
Procedure
To refresh an existing materialized query table:
Issue a REFRESH TABLE statement. When you issue the REFRESH TABLE
statement, DB2 performs the following actions:
1. Deletes all the rows in the materialized query table
2. Executes the fullselect in the materialized query table definition to recalculate
the data from the tables that are specified in the fullselect with the isolation
level for the materialized query table
3. Inserts the calculated result into the materialized query table
4. Updates the DB2 catalog with a refresh timestamp and the cardinality of the
materialized query table
Although the REFRESH TABLE statement involves both deleting and inserting
data, DB2 completes these operations in a single commit scope. Therefore, if a
failure occurs during execution of the REFRESH TABLE statement, DB2 rolls back
all changes that the statement made.
Refreshing user-maintained materialized query tables
You can update the data in user-maintained materialized query tables by using the
INSERT, UPDATE, MERGE, TRUNCATE, and DELETE statements, and the LOAD
utility.
About this task
You should avoid using the REFRESH TABLE statement to update user-maintained
materialized query tables. Because the REFRESH TABLE statement uses a fullselect
to refresh a materialized query table, the statement can result in a long-running
query. Using insert, update, delete, or load operations might be more efficient than
using the REFRESH TABLE statement.
Depending on the size and frequency of changes in base tables, you might use
different strategies to refresh your materialized query tables. For example, for
infrequent, minor changes to dimension tables, you could immediately propagate
the changes to the materialized query tables by using triggers. For larger or more
frequent changes, you might consider refreshing your user-maintained materialized
query tables incrementally to improve performance.
Procedure
To refresh a user-maintained materialized query table without using the REFRESH
TABLE statement:
Use INSERT, UPDATE, MERGE, TRUNCATE, or DELETE statements, or the LOAD
utility. For example, you might find it faster to generate the data for your
materialized query table and execute the LOAD utility to populate the data.
Example
For example, assume that you need to add a large amount of data to a fact table.
Then, you need to refresh your materialized query table to reflect the new data in
the fact table. To do this, perform these steps:
v Collect and stage the new data in a separate table.
v Evaluate the new data and apply it to the materialized table as necessary.
v Merge the new data into the fact table.
For an example of such code, see member DSNTEJ3M in DSN910.SDSNSAMP,
which is shipped with DB2.
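The following statements sketch that flow for the user-maintained TRANSCOUNT
table from the earlier examples; TRANS_STAGE is a hypothetical staging table
with the same columns as the TRANS fact table:

-- Apply the staged rows to the materialized query table. This sketch
-- assumes that the staged rows form new groups; existing groups would
-- need UPDATE or MERGE logic instead.
INSERT INTO TRANSCOUNT (ACCTID, LOCID, YEAR, CNT)
  SELECT ACCTID, LOCID, YEAR, COUNT(*)
  FROM TRANS_STAGE
  GROUP BY ACCTID, LOCID, YEAR;

-- Merge the staged rows into the fact table.
INSERT INTO TRANS
  SELECT * FROM TRANS_STAGE;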
Updating statistics on materialized query tables
For optimal performance of materialized query tables, you need to provide DB2
with accurate catalog statistics for access path selection.
About this task
When you run the REFRESH TABLE statement, the only statistic that DB2
updates for the materialized query table is the cardinality statistic.
Procedure
To keep catalog statistics current for materialized query tables:
Run the RUNSTATS utility after executing a REFRESH TABLE statement or after
changing the materialized query table significantly. Otherwise, DB2 uses default or
out-of-date statistics, and the estimated performance of queries that are generated
by automatic rewrite might inaccurately appear less favorable than that of the
original query.
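For example, a utility control statement of this general form collects statistics for
the table space that holds a materialized query table; the database and table space
names are placeholders:

RUNSTATS TABLESPACE DSNMQTDB.MQTTS TABLE(ALL) INDEX(ALL)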
Rules for using materialized query tables in a multilevel security
environment
If source tables have multilevel security with row-level granularity enabled, some
additional rules apply to working with the materialized query table and the source
tables.
Tables with multilevel security enabled contain a security label column,
which is defined with the AS SECURITY LABEL clause. The values in the security
label column indicate which users can access the data in each row.
Creating a materialized query table
If one or more source tables in the materialized query table definition contain a
security label column, certain rules apply to creating a materialized query table.
Only one source table contains a security label column
The following conditions apply:
v You must define the security label column in the materialized query
table definition with the AS SECURITY LABEL clause.
v The materialized query table inherits the security label column from the
source table.
v The MAINTAINED BY USER option is allowed.
Only one source table contains a security label column, and a DEFINITION
ONLY clause was used
The materialized query table inherits the values in the security label
column from the source table. However, the inherited column is not a
security label column.
More than one source table contains a security label column
DB2 returns an error code, and the materialized query table is not created.
Altering a source table
An ALTER TABLE statement to add a security label column to a table fails if the
table is a source table for a materialized query table.
Refreshing a materialized query table
The REFRESH TABLE statement deletes the data in a materialized query table and
then repopulates the materialized query table according to its table definition.
During this refresh process, DB2 does not check for multilevel security with
row-level granularity.
Related concepts
Multilevel security (DB2 Administration Guide)
Enabling a materialized query table for automatic query
rewrite
After you populate a user-maintained materialized query table, you can alter the
table to enable query optimization.
Before you begin
If the materialized query table is user-maintained, ensure that it is populated with data.
About this task
When creating a user-maintained materialized query table, initially disable
query optimization. Otherwise, DB2 might automatically rewrite queries to use the
empty materialized query table.
Procedure
To enable a materialized query table for automatic query rewrite:
Issue an ALTER TABLE statement and specify:
v The ENABLE QUERY OPTIMIZATION clause
v An isolation level that is equal to or higher than the isolation level of the
dynamic queries that might use the materialized query table in automatic
query rewrite
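For example, after the user-maintained TRANSCOUNT table from the earlier
examples is populated, an ALTER TABLE statement of this general form enables it
for rewrite (see ALTER TABLE (DB2 SQL) for the exact clause syntax):

ALTER TABLE TRANSCOUNT
  ALTER MATERIALIZED QUERY
  SET ENABLE QUERY OPTIMIZATION;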
Recommendations for materialized query table and base table
design
By following certain best practices, you might improve the performance of your
materialized query tables and the queries that use them. These recommendations,
however, do not represent a complete design theory.
120
Performance Monitoring and Tuning Guide
Designing materialized query tables for automatic query rewrite
By following these recommendations, you might improve the performance of
queries that use materialized query tables.
Procedure
To get better performance from your materialized query tables:
v Include aggregate functions strategically in the fullselect of a materialized query
table definition:
– Include COUNT(*) and SUM(expression).
– Include SUM(expression*expression) only if you plan to query VAR(expression),
STDDEV(expression), VAR_SAMP(expression), or STDDEV_SAMP(expression).
– Include COUNT(expression) in addition to COUNT(*) if expression is nullable.
– Include MIN(expression) and MAX(expression) if you plan to query them.
– Do not include AVG(expression), VAR(expression), or STDDEV(expression)
directly if you include either of the following combinations of functions:
- SUM(expression), SUM(expression*expression), and COUNT(*)
- SUM(expression), SUM(expression*expression), and COUNT(expression)
DB2 can derive AVG(expression), VAR(expression), and STDDEV(expression)
from SUM(expression), SUM(expression*expression), and the appropriate
COUNT aggregate function.
v Include the foreign key of a dimension table in the GROUP BY clause of a
materialized query table definition. For example, if you include PGROUP.ID,
also include PGROUP.LINEID. Then DB2 can use the materialized query table to
derive a summary at the LINEID level, without rejoining PGROUP.
v Include all the higher-level columns in the materialized query table if DB2 does
not know the functional dependency in a denormalized dimension table. For
example, if you include CITY in a GROUP BY clause, also include STATE and
COUNTRY in the clause. Similarly, if you include MONTH in the GROUP BY
clause, also include YEAR in the clause.
v Do not use the HAVING clause in your materialized query tables. A materialized
query table with a HAVING clause in its definition has limited usability during
query rewrite.
v Create indexes on materialized query tables as you would for base tables.
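For example, a definition that applies several of these guidelines to the tables
from the earlier examples might look like the following sketch; the SALESSUM
table name is hypothetical:

CREATE TABLE SALESSUM AS
 (SELECT T.YEAR, T.MONTH, PG.LINEID, PG.ID AS PGID,
         SUM(TI.QUANTITY * TI.PRICE) AS TOTVAL,
         SUM(TI.QUANTITY * TI.PRICE * TI.QUANTITY * TI.PRICE) AS TOTVALSQ,
         COUNT(*) AS CNT
  FROM TRANSITEM TI, TRANS T, PGROUP PG
  WHERE TI.TRANSID = T.ID AND
        TI.PGID = PG.ID
  GROUP BY T.YEAR, T.MONTH, PG.LINEID, PG.ID)
 DATA INITIALLY DEFERRED
 REFRESH DEFERRED
 MAINTAINED BY SYSTEM
 ENABLE QUERY OPTIMIZATION;

The fullselect includes COUNT(*) and the SUM expressions from which DB2 can
derive AVG, VAR, and STDDEV; it carries the PGROUP foreign key LINEID
alongside PG.ID; it includes YEAR together with MONTH; and it avoids a
HAVING clause.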
Designing base tables for automatic query rewrite
These recommendations describe base table design strategies that might increase
the performance and eligibility of your materialized query tables.
Procedure
To make your base tables work well with materialized query tables:
v Define referential integrity as either ENFORCED or NOT ENFORCED whenever
possible.
v Define an index as unique if it is truly unique.
v Define all base table columns as NOT NULL if possible, so that COUNT(x) is the
same as COUNT(*). Then you do not need to include COUNT(x) for each
nullable column x in a materialized query table definition. If necessary, use a
special value to replace NULL.
v Emphasize normalized dimension design over denormalized dimension design
in your base tables. When you use normalized dimension design, you do not
need to include non-primary key columns in a materialized query table, thereby
saving you storage space. DB2 compensates for the lack of non-primary key
columns by deriving these columns through a re-join of the dimension table. If
normalization causes performance problems, you can define materialized query
tables to denormalize the snowflake dimensions.
Materialized query tables—examples shipped with DB2
In addition to the examples shown in this information, DB2 provides a number of
samples to help you design materialized query tables for automatic query rewrite.
The samples are based on a data warehouse with a star schema database.
The star schema contains one fact table, SALESFACT, and these four hierarchical
dimensions:
v A TIME dimension that consists of one dimension table
v A PRODUCT dimension that is a snowflake that consists of four fully
normalized tables
v A LOCATION dimension that is a snowflake that consists of five partially
normalized tables
v A CUSTOMER dimension that is a snowflake that consists of four fully
normalized tables
See member DSNTEJ3M in data set DSN910.SDSNSAMP for all of the code,
including the following items:
v SQL statements to create and populate the star schema
v SQL statements to create and populate the materialized query tables
v Queries that DB2 rewrites to use the materialized query tables
Chapter 10. Managing DB2 threads
Threads are an important DB2 resource. A thread is a DB2 structure that describes
a connection made by an application and traces its progress.
When you install DB2, you choose a maximum number of active allied and
database access threads that can be allocated concurrently. Choosing a good
number for maximum threads is important to keep applications from queuing and
to provide good response time.
When writing an application, you should know when threads are created and
terminated and when they can be reused, because thread allocation can be a
significant part of the cost in a short transaction.
This information provides a general introduction to how DB2 uses threads,
including the following information:
v A discussion of how to choose the maximum number of concurrent threads.
v A description of the steps in creating and terminating an allied thread.
v An explanation of the differences between allied threads and database access
threads (DBATs) and a description of how DBATs are created, including how
they become active or pooled and how to set performance goals for individual
DDF threads.
v Design options for reducing thread allocations and improving performance
generally.
Setting thread limits
You can limit the maximum number of DB2 threads that can be allocated
concurrently.
About this task
Set these values to provide good response time without wasting resources, such as
virtual and real storage. The value you specify depends upon your machine size,
your work load, and other factors. When specifying values for these fields,
consider the following:
v Fewer threads than needed underutilize the processor and cause queuing for
threads.
v More threads than needed do not improve the response time. They require more
real storage for the additional threads and might cause more paging and, hence,
performance degradation.
Procedure
To limit the number of allied and database access threads that can be allocated
concurrently:
1. Use the MAX USERS and MAX REMOTE ACTIVE fields on installation panel
DSNTIPE. The combined maximum allowed for MAX USERS and MAX
REMOTE ACTIVE is 2000. If virtual storage or real storage is the limiting
factor, set MAX USERS and MAX REMOTE ACTIVE according to the available
storage.
2. For the TSO and call attachment facilities, you limit the number of threads
indirectly by choosing values for the MAX TSO CONNECT and MAX BATCH
CONNECT fields of installation panel DSNTIPE. These values limit the number
of connections to DB2. The number of threads and connections allowed affects
the amount of work that DB2 can process.
Related reference
Thread management panel: DSNTIPE (DB2 Installation Guide)
Allied thread allocation
This information describes at a high level the steps in allocating an allied thread,
and some of the factors related to the performance of those steps. This information
does not explain how a database access thread is allocated.
Step 1: Thread creation
During thread creation with ACQUIRE(ALLOCATE), the resources needed to
execute the application are acquired. During thread creation with ACQUIRE(USE),
only the thread is created.
The following list shows the main steps in thread creation.
1. Check the maximum number of threads.
DB2 checks whether the maximum number of active threads, specified as MAX
USERS for local threads or MAX REMOTE ACTIVE for remote threads on the
Storage Sizes panel (DSNTIPE) when DB2 was installed, has been exceeded. If
it has been exceeded, the request waits. The wait for threads is not traced, but
the number of requests queued is provided in the performance trace record
with IFCID 0073.
2. Check the plan authorization.
The authorization ID for an application plan is checked in the SYSPLANAUTH
catalog table (IFCID 0015). If this check fails, the table SYSUSERAUTH is
checked for the SYSADM special privilege.
3. For an application plan, load the control structures associated with the plan.
The control block for an application plan is divided into sections. The header
and directory contain control information; SQL sections contain SQL statements
from the application. A copy of the plan's control structure is made for each
thread executing the plan. Only the header and directory are loaded when the
thread is created.
4. Load the descriptors necessary to process the plan.
Some of the control structures describe the DB2 table spaces, tables, and
indexes used by the application. If ACQUIRE(ALLOCATE) is used, all the
descriptors referred to in the plan are loaded now. If the plan is bound with
ACQUIRE(USE), they are loaded when SQL statements are executed.
The most relevant factors from a system performance point of view are:
Thread reuse
Thread creation is a significant cost for small and medium transactions.
When execution of a transaction is terminated, the thread can sometimes
be reused by another transaction using the same plan.
ACQUIRE option of BIND
ACQUIRE(ALLOCATE) causes all the resources referred to in the
application to be allocated when the thread is created. ACQUIRE(USE)
allocates the resources only when an SQL statement is about to be
executed. In general, ACQUIRE(USE) is recommended. However, if most of
the SQL is used in every execution of the transaction,
ACQUIRE(ALLOCATE) is recommended.
EDM pool size
The size of the EDM pool influences the number of I/Os needed to load
the control structures necessary to process the plan or package. To avoid a
large number of allocation I/Os, the EDM pool must be large enough to
contain the structures that are needed.
Step 2: Resource allocation
Some of the structures necessary to process the statement are stored in 4 KB pages.
If they are not already present, those are read into database buffer pool BP0 and
copied from there into the EDM pool.
If the plan was bound with ACQUIRE(USE), it acquires resources when the
statement is about to execute.
1. Load the control structures necessary to process the SQL section.
If it is not already in the EDM pool, DB2 loads the control structure's section
corresponding to this SQL statement.
2. Load structures necessary to process statement.
Load any structures referred to by this SQL statement that are not already in
the EDM pool.
3. Allocate and open data sets.
When the control structure is loaded, DB2 locks the resources used.
The most important performance factors for resource allocation are the same as the
factors for thread creation.
Step 3: SQL statement execution
If the statement resides in a package, the directory and header of the package's
control structure is loaded at the time of the first execution of a statement in the
package.
The control structure for the package is allocated at statement execution time. This
is contrasted with the control structures for plans bound with
ACQUIRE(ALLOCATE), which are allocated at thread creation time. The header of
the plan's control structures is allocated at thread creation time regardless of
ACQUIRE(ALLOCATE) or ACQUIRE(USE).
When the package is allocated, DB2 checks authorization using the package
authorization cache or the SYSPACKAUTH catalog table. DB2 checks to see that
the plan owner has execute authority on the package. On the first execution, the
information is not in the cache; therefore, the catalog is used. After the first
execution, the cache is used.
For dynamic bind, authorization checking also occurs at statement execution time.
A summary record, produced at the end of the statement (IFCID 0058), contains
information about each scan that is performed. Included in the record is the
following information:
v The number of updated rows
v The number of processed rows
v The number of deleted rows
v The number of examined rows
v The number of pages that are requested through a getpage operation
v The number of rows that are evaluated during the first stage (stage 1) of
processing
v The number of rows that are evaluated during the second stage (stage 2) of
processing
v The number of getpage requests that are issued to enforce referential constraints
v The number of rows that are deleted or set null to enforce referential constraints
v The number of inserted rows
From a system performance perspective, the most important factor in the
performance of SQL statement execution is the size of the database buffer pool. If
the buffer pool is large enough, some index and data pages can remain there and
can be accessed again without an additional I/O operation.
Step 4: Commit and thread termination
Commit processing can occur many times while a thread is active.
For example, an application program running under the control structure of the
thread could issue an explicit COMMIT or SYNCPOINT several times during its
execution. When the application program or the thread terminates, an implicit
COMMIT or SYNCPOINT is issued.
When a COMMIT or SYNCPOINT is issued from an IMS application running with
DB2, the two-phase commit process begins if DB2 resources have been changed
since the last commit point. In a CICS or RRSAF application, the two-phase
commit process begins only if DB2 resources have changed and a non-DB2
resource has changed within the same commit scope.
The significant events that show up in a performance trace of a commit and thread
termination operation occur in the following sequence:
1. Commit phase 1
In commit phase 1 (IFCID 0084), DB2 writes an end of phase 1 record to the log
(IFCIDs 0032 and 0033). The trace shows two I/Os, one to each active log data
set (IFCIDs 0038 and 0039).
2. Commit phase 2
In commit phase 2 (IFCID 0070), DB2 writes a beginning of phase 2 record to
the log. Again, the trace shows two I/Os. Page and row locks, held to a
commit point, are released. An unlock (IFCID 0021) with a requested token of
zeros frees any lock for the specified duration. A summary lock record (IFCID
0020) is produced, which gives the maximum number of page locks held and
the number of lock escalations. DB2 writes an end of phase 2 record to the log.
If RELEASE(COMMIT) is used, the following events also occur:
v Table space locks are released.
v All the storage used by the thread is freed, including storage for control
blocks, CTs and PTs, and working areas.
v The use counts of the DBDs are decreased by one. If space is needed in the
EDM DBD cache, a DBD can be freed when its use count reaches zero.
v Those table spaces and index spaces with no claimers are made candidates
for deferred close.
3. Thread termination.
When the thread is terminated, the accounting record is written. It does not
report transaction activity that takes place before the thread is created.
If RELEASE(DEALLOCATE) is used to release table space locks, the DBD use
count is decreased, and the thread storage is released.
Related concepts
Multiple system consistency (DB2 Administration Guide)
Variations on thread management
Transaction flow can vary in different environments and when dynamic SQL
statements are executed.
TSO and call attachment facility
You can use the TSO attachment facility and call attachment facility (CAF) to
request that SQL statements be executed in TSO foreground and batch. The
processes differ from CICS or IMS transactions in that:
v No sign-on is required. The user is identified when the TSO address space is
connected.
v Commit requires only a single phase and only one I/O operation to each log.
Single phase commit records are IFCID 0088 and 0089.
v Threads cannot be reused because the thread is allocated to the user address
space.
Resource Recovery Services attachment facility (RRSAF)
With RRSAF, you have sign-on capabilities, the ability to reuse threads, and the
ability to coordinate commit processing across different resource managers.
SQL under DB2 QMF
DB2 QMF uses CAF to create a thread when a request for work, such as a SELECT
statement, is issued. A thread is maintained until the end of the session only if the
requester and the server reside in different DB2 subsystems. If the requester and
the server are both in the local DB2 subsystem, the thread is not maintained.
For more information about DB2 QMF connections, see DB2 Query Management
Facility: Installing and Managing DB2 QMF for TSO/CICS.
Related tasks
Controlling connections (DB2 Administration Guide)
Monitoring and displaying RRSAF connections (DB2 Administration Guide)
Reusing threads
In general, you want transactions to reuse threads when transaction volume is high
and the cost of creating threads is significant, but thread reuse is also useful for a
lower volume of priority transactions.
About this task
For a transaction of five to ten SQL statements (10 I/O operations), the cost of
thread creation can be 10% of the processor cost. But the steps needed to reuse
threads can incur costs of their own.
Reusing threads through bind options
In DB2, you can prepare allied threads for reuse when you bind the plan.
Procedure
To prepare allied threads for reuse:
Bind the plan with the ACQUIRE(USE) and RELEASE(DEALLOCATE) options;
otherwise, the allocation cost is not eliminated but only slightly reduced. Be aware
of the following effects:
v ACQUIRE(ALLOCATE) acquires all resources needed by the plan, including
locks, when the thread is created. ACQUIRE(USE) acquires resources only when
they are needed to execute a particular SQL statement. If most of the SQL
statements in the plan are executed whenever the plan is executed,
ACQUIRE(ALLOCATE) costs less. If only a few of the SQL statements are likely
to be executed, ACQUIRE(USE) costs less and improves concurrency. But with
thread reuse, if most of your SQL statements eventually get issued,
ACQUIRE(USE) might not be as much of an improvement.
v RELEASE(DEALLOCATE) does not free cursor tables (SKCTs) at a commit point.
Therefore, the cursor table could grow as large as the plan. If you are using
created temporary tables, the logical work file space is not released until the
thread is deallocated. Thus, many uses of the same created temporary table do
not cause reallocation of the logical work files, but be careful about holding onto
this resource for long periods of time if you do not plan to use it.
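For example, a bind subcommand of this general form prepares a plan for thread
reuse; the plan and collection names are illustrative:

BIND PLAN(TXNPLAN) PKLIST(TXNCOLL.*) ACQUIRE(USE) RELEASE(DEALLOCATE)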
Analyzing the reuse of threads
The OMEGAMON accounting report can help you identify, by plan, when threads
were reused.
About this task
The following sections of the report contain information about the thread reuse:
v NORMAL TERM.
v ABNORMAL TERM.
v IN DOUBT
Example
The following example report shows the location of the following values:
v NEW USER (A) tells how many threads were not terminated at the end of the
previous transaction or query, and hence reused.
v DEALLOCATION (B) tells how many threads were terminated at the end of
the query or transaction.
v APPL. PROGR. END (C) groups all the other reasons for accounting. Since the
agent did not abend, these are considered normal terminations.
NORMAL TERM.        TOTAL  ABNORMAL TERM.     TOTAL  IN DOUBT         TOTAL
------------------  -----  -----------------  -----  ---------------  -----
NEW USER        (A)     0  APPL.PROGR. ABEND      0  APPL.PGM. ABEND      0
DEALLOCATION    (B)     0  END OF MEMORY          0  END OF MEMORY        0
APPL.PROGR. END (C)     0  RESOL.IN DOUBT         0  END OF TASK          0
RESIGNON        (D)     0  CANCEL FORCE           0  CANCEL FORCE         0
DBAT INACTIVE           0
TYPE2 INACTIVE        193
RRS COMMIT              0
END U. THRESH           0
BLOCK STORAGE           0
STALENESS               0
This technique is accurate in IMS but not in CICS, where threads are reused
frequently by the same user. For CICS, also consider looking at the number of
commits and aborts per thread. For CICS:
v NEW USER (A) is thread reuse with a different authorization ID or transaction
code.
v RESIGN-ON (D) is thread reuse with the same authorization ID.
Distributed database access threads
Database access threads are created to access data at a DB2 server on behalf of a
requester.
A database access thread is created when a new connection is accepted from a
remote requester, or if DB2 is configured with INACTIVE MODE and a new
request is received from a remote requester and a pooled database access thread is
unavailable to service the new request. Allied threads perform work at a
requesting DB2 subsystem.
Database access threads differ from allied threads in the following ways:
v Database access threads have two modes of processing: ACTIVE MODE and
INACTIVE MODE. These modes are controlled by setting the DDF THREADS
field on the installation panel DSNTIPR.
ACTIVE MODE
When the value of DDF THREADS is ACTIVE, a database access thread
is always active from initial creation to termination.
INACTIVE MODE
When the value of DDF THREADS is INACTIVE, a database access
thread can be active or pooled. When a database access thread in
INACTIVE MODE is active, it is processing requests from client
connections within units of work. When a database access thread is
pooled, it is waiting for the next request from a client to start a new unit
of work.
v Database access threads run in enclave SRB mode.
v In INACTIVE MODE only, a database access thread is terminated after it
has processed 200 units of work or after the database access thread has been idle
in the pool for the amount of time specified in the POOL THREAD TIMEOUT
field on the installation panel DSNTIP5. This termination does not prevent
another database access thread from being created to meet processing demand,
as long as the value of the MAX REMOTE ACTIVE field on panel DSNTIPE has
not been reached.
Recommendation: Use INACTIVE MODE threads instead of ACTIVE MODE
threads whenever possible.
Setting thread limits for database access threads
When you install DB2, you choose a maximum number of active threads that can
be allocated concurrently.
The MAX USERS field on panel DSNTIPE represents the maximum number of
allied threads, and the MAX REMOTE ACTIVE field on panel DSNTIPE
represents the maximum number of database access threads. Together, the values
you specify for these fields cannot exceed 1999.
In the MAX REMOTE CONNECTED field of panel DSNTIPE, you can specify up
to 150,000 as the maximum number of remote connections that can concurrently
exist within DB2. This upper limit is reached only if you specify the
recommended value, INACTIVE, for the DDF THREADS field of installation panel
DSNTIPR. Figure 9 illustrates the relationship between the number of active
threads in the system and the total number of connections.
Figure 9. Relationship between active threads and maximum number of connections. (The
figure shows up to 1999 maximum remote active threads and users within up to 150,000
maximum remote connections.)
Related reference
Thread management panel: DSNTIPE (DB2 Installation Guide)
Distributed data facility panel 1: DSNTIPR (DB2 Installation Guide)
DDF THREADS field (CMTSTAT subsystem parameter) (DB2 Installation
Guide)
MAX USERS field (CTHREAD subsystem parameter) (DB2 Installation Guide)
MAX REMOTE ACTIVE field (MAXDBAT subsystem parameter) (DB2
Installation Guide)
Pooling of INACTIVE MODE threads for DRDA-only
connections
When the value of DDF THREADS is INACTIVE, DRDA® connections with a
client enable pooling behavior in database access threads.
A database access thread that is not currently processing a unit of work is called a
pooled thread, and it is disconnected. DB2 always tries to pool database access
threads, but in some cases cannot do so. The conditions listed in the following
table determine whether a thread can be pooled. When the conditions are true, the
thread can be pooled when a COMMIT is issued.
Table 19. Requirements for pooled threads

If there is...                                             Thread can be pooled?
A DRDA hop to another location                             Yes
A package that is bound with RELEASE(COMMIT)               Yes
A package that is bound with RELEASE(DEALLOCATE)           Yes
A held cursor, a held LOB locator, or a package bound
with KEEPDYNAMIC(YES)                                      No
A declared temporary table that is active (the table was
not explicitly dropped through the DROP TABLE statement
or the ON COMMIT DROP TABLE clause on the DECLARE
GLOBAL TEMPORARY TABLE statement)                          No
After a ROLLBACK, a thread can be pooled even if it had open cursors defined
WITH HOLD or a held LOB locator because ROLLBACK closes all cursors and
LOB locators. ROLLBACK is also the only way to use the KEEPDYNAMIC(YES)
bind option to clear information about a dynamic SQL statement.
Note: A different database access behavior is used when private-protocol
connections are involved.
Related tasks
“Using threads with private-protocol connections” on page 133
Advantages of database access threads in INACTIVE mode
You can allow distributed database access threads to be pooled to improve
performance.
Allowing threads to be pooled has the following advantages:
v You can leave an application that is running on a workstation connected to DB2
from the time the application is first activated until the workstation is shut
down and thus avoid the delay of repeated connections.
v DB2 can support a larger number of DDF connections (150,000 maximum; not
limited by the maximum number of threads).
v Less storage is used for each DDF connection. (A connection uses significantly
less storage than a database access thread.)
v You get an accounting trace record each time that a thread is pooled rather than
once for the entire time that you are connected. When a pooled thread becomes
active, the accounting fields for that thread are re-initialized. As a result, the
accounting record contains information about active threads only, which makes
it easier to study how distributed applications are performing. If a pooled mode
thread remains active because of sections that are specified with the
KEEPDYNAMIC(YES) bind option, an accounting record is still written.
Exception: If you employ account record accumulation, an accounting trace is
not produced each time that a thread becomes pooled. Accounting records can
be rolled up by concatenated combinations of the following values:
– End user ID
– End transaction name
– End user workstation name
v Each time that a thread is pooled, workload manager resets the information that
it maintains on that thread. The next time that thread is activated, workload
manager begins managing to the goals that you have set for the transactions that
run in that service class. If you use multiple performance periods, it is possible
to favor short-running units of work that use fewer resources while giving fewer
resources over time to long running units of work.
v You can use response time goals, which is not recommended when using
ACTIVE MODE threads.
v INACTIVE MODE threads can better take advantage of the ability to time out
idle active threads.
v Thread reuse lets DDF use a small pool of database access threads to support a
large group of network clients.
v The response times reported by RMF include periods between requests and
within the same unit of work. These times are shown as idle delays.
Related tasks
“Establishing performance periods for DDF threads” on page 136
“Timing out idle active threads”
Enabling threads to be pooled
You must specify INACTIVE on the DDF THREADS field of installation panel
DSNTIPR to allow threads to be pooled.
Timing out idle active threads
You can specify a time limit for active threads to remain idle.
About this task
When a thread remains idle for longer than the specified limit, DB2 might cancel
the thread. However, pooled and in-doubt threads are not canceled.
Procedure
To specify limits for active idle threads:
v Set the value of the IDLE THREAD TIMEOUT field (IDTHTOIN subsystem
parameter). The timeout period is an approximation. If a server thread has been
waiting for a request from the requesting site for this period of time, it is
canceled unless the thread is currently pooled or in doubt. A value of 0,
the default, means that server threads cannot be canceled because of an idle
thread timeout. You can specify a value from 0 to 9999 seconds.
v Set the value of the CMTSTAT subsystem parameter to ACTIVE. When you
specify ACTIVE, an application must start its next unit of work within the
specified timeout period; otherwise its thread is canceled.
v Set a value for the TCPKPALV subsystem parameter. A TCP/IP keep-alive
interval of 5 minutes or less, in conjunction with an IDLE THREAD TIMEOUT
value, can ensure that resources are not locked for a long time when a
network outage occurs.
Related reference
DDF THREADS field (CMTSTAT subsystem parameter) (DB2 Installation
Guide)
IDLE THREAD TIMEOUT field (IDTHTOIN subsystem parameter) (DB2
Installation Guide)
TCP/IP KEEPALIVE field (TCPKPALV subsystem parameter) (DB2 Installation
Guide)
Using threads with private-protocol connections
Database access threads with private-protocol connections behave differently than
threads with DRDA-only connections.
About this task
When a database access thread is associated with a private-protocol connection,
whether inbound, outbound, or both, the database access thread remains with the
connection or connections. During periods of inactivity, DB2 reduces the memory
footprint of threads until the next unit of work begins.
The MAX INACTIVE DBATS setting determines whether DB2 reduces the memory
footprint of a private-protocol connection that involves a database access thread.
When a private-protocol connection ends a unit of work, DB2 first compares the
number of current inactive database access threads to the value that is specified for
your installation for MAX INACTIVE DBATS. Based on these values, DB2 either
makes the thread inactive or allows it to remain active:
v If the current number of inactive database access threads is below the value in
MAX INACTIVE DBATS, the thread becomes inactive. It cannot be used by
another connection.
v If the current number of inactive database access threads meets or exceeds the
value in MAX INACTIVE DBATS, the remote connection is terminated.
Procedure
To limit the number of inactive database access threads that can be created:
Specify a value in the MAX INACTIVE DBATS field of installation panel
DSNTIPR. The default is 0, which means that any database access thread that
involves a private-protocol-connection remains active.
While using a private-protocol connection, DB2 always tries to make threads
inactive, but in some cases cannot do so. If MAX INACTIVE DBATS is greater than
0 and the value has not been exceeded, the conditions listed in the following table
determine if a thread can be inactive.
Table 20. Requirements for inactive threads

If there is...                                               Thread can be inactive?
A hop to another location                                    Yes
A package that is bound with RELEASE(COMMIT)                 Yes
A package that is bound with RELEASE(DEALLOCATE)             No
A held cursor, a held LOB locator, or a package bound with
KEEPDYNAMIC(YES)                                             No
A declared temporary table that is active (the table was
not explicitly dropped through the DROP TABLE statement)     No
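As a sketch of how the bind options in the preceding table interact with thread pooling, a remote package could be kept eligible to become inactive by binding it without the disqualifying options; the location, collection, and member names here are hypothetical:
   BIND PACKAGE(DB2PLOC.ONLINE) MEMBER(PAYPGM) RELEASE(COMMIT) KEEPDYNAMIC(NO)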
Reusing threads for remote connections
The cost to create a thread can be significant and reusing threads is a way to avoid
that cost. DB2 for z/OS can reuse threads at the requester and at the server.
At the requester, a thread can be reused for an application that uses the CICS, IMS,
or RRS attachment facility. As a server, DB2 can assign that connection to a pooled
database access thread. Those threads can be shared and reused by thousands of
client connections, which lets DB2 support very large client networks at minimal
cost. (Inactive threads are only eligible to be reused by the same connection.)
If your server is not DB2 for z/OS, or some other server that can reuse threads,
then reusing threads for your requesting CICS, IMS, or RRS applications is not a
benefit for distributed access. Thread reuse occurs when sign-on occurs with a new
authorization ID. If that request is bound for a server that does not support thread
reuse, that change in the sign-on ID causes the connection between the requester
and server to be released so that it can be rebuilt again for the new ID.
Using z/OS Workload Manager to set performance objectives
You can use z/OS Workload Manager (WLM) support to establish z/OS
performance objectives for individual DDF server threads.
About this task
z/OS supports enclave system request blocks (SRBs). A z/OS enclave lets each
thread have its own performance objective. For details on using Workload
Manager, see z/OS MVS Planning: Workload Management.
The z/OS performance objective of the DDF address space does not govern the
performance objective of the user thread. As described in Chapter 5, “z/OS
performance options for DB2,” on page 31, you should assign the DDF address
space to a z/OS performance objective that is similar to the DB2 database services
address space (ssnmDBM1). The z/OS performance objective of the DDF address
space determines how quickly DB2 is able to perform operations associated with
managing the distributed DB2 work load, such as adding new users or removing
users that have terminated their connections. This performance objective should be
a service class with a single velocity goal. This performance objective is assigned
by modifying the WLM Classification Rules for started tasks (STC).
Classifying DDF threads
You can classify DDF threads by, among other things, authorization ID and stored
procedure name. The stored procedure name is used as a classification only if the
first statement that the client issues to begin a new unit of work is an SQL CALL
statement.
Procedure
Use the WLM administrative application to define the service classes you want
z/OS to manage. These service classes are associated with performance objectives.
When a WLM-established stored procedure call originates locally, it inherits the
performance objective of the caller, such as TSO or CICS.
Important: If classification rules do not exist to classify some or all of your DDF
transactions into service classes, those unclassified transactions are assigned to the
default service class, SYSOTHER, which has no performance goal and is even
lower in importance than a service class with a discretionary goal.
Classification attributes
Each of the WLM classification attributes has a two- or three-character abbreviation
that you can use when entering the attribute on the WLM menus.
The following WLM classification attributes pertain to DB2 DDF threads:
AI
Accounting information. The value of the DB2 accounting string associated
with the DDF server thread, described by QMDAAINF in the
DSNDQMDA mapping macro. WLM imposes a maximum length of 143
bytes for accounting information.
CI
The DB2 correlation ID of the DDF server thread, described by QWHCCV
in the DSNDQWHC mapping macro.
CN
The DB2 collection name of the first SQL package accessed by the DRDA
requester in the unit of work.
LU
The VTAM LUNAME of the system that issued the SQL request.
NET
The VTAM NETID of the system that issued the SQL request.
PC
Process name. This attribute can be used to classify the application name
or the transaction name. The value is defined by QWHCEUTX in the
DSNDQWHC mapping macro.
PK
The name of the first DB2 package accessed by the DRDA requester in the
unit of work.
PN
The DB2 plan name associated with the DDF server thread. For DB2
private protocol requesters and DB2 DRDA requesters that are at Version 3
or subsequent releases, this is the DB2 plan name of the requesting
application. For other DRDA requesters, use 'DISTSERV' for PN.
PR
Stored procedure name. This classification only applies if the first SQL
statement from the client is a CALL statement.
SI
Subsystem instance. The DB2 server's z/OS subsystem name.
SPM
Subsystem parameter. This qualifier has a maximum length of 255 bytes.
The first 16 bytes contain the client's user ID. The next 18 bytes contain the
client's workstation name. The remaining 221 bytes are reserved.
Important: If the length of the client's user ID is less than 16 bytes, the user
ID is padded with blanks. If the length of the client's workstation name is less
than 18 bytes, the workstation name is padded with blanks.
SSC
Subsystem collection name. When the DB2 subsystem is a member of a
DB2 data sharing group, this attribute can be used to classify the data
sharing group name. The value is defined by QWHADSGN in the
DSNDQWHA mapping macro.
UI
User ID. The DDF server thread's primary authorization ID, after inbound
name translation, which occurs only with SNA DRDA or private protocol
connections.
Figure 10 shows how you can associate DDF threads with service classes.
 Subsystem-Type Xref  Notes  Options  Help
 --------------------------------------------------------------------------
                 Create Rules for the Subsystem Type        Row 1 to 5 of 5

 Subsystem Type . . . . . . . . DDF          (Required)
 Description . . . Distributed DB2
 Fold qualifier names? . . Y  (Y or N)

 Enter one or more action codes: A=After  B=Before  C=Copy  D=Delete
 M=Move  I=Insert rule  IS=Insert Sub-rule  R=Repeat

         -------Qualifier-------------      -------Class--------
 Action  Type      Name      Start          Service     Report
                                 DEFAULTS:  PRDBATCH    ________
 ____ 1  SI        DB2P      ___            PRDBATCH    ________
 ____ 2    CN      ONLINE    ___            PRDONLIN    ________
 ____ 2    PRC     PAYPROC   ___            PRDONLIN    ________
 ____ 2    UI      SYSADM    ___            PRDONLIN    ________
 ____ 2    PK      QMFOS2    ___            PRDQUERY    ________
 ____ 1  SI        DB2T      ___            TESTUSER    ________
 ____ 2    PR      PAYPROCT  ___            TESTPAYR    ________
 ****************************** BOTTOM OF DATA *****************************
Figure 10. Classifying DDF threads using z/OS Workload Manager. You assign performance goals to service classes
using the service classes menu of WLM.
The preceding figure shows the following classifications:
v All DB2P applications accessing their first SQL package in the collection
ONLINE are in service class PRDONLIN.
v All DB2P applications that call stored procedure PAYPROC first are in service
class PRDONLIN.
v All work performed by DB2P user SYSADM is in service class PRDONLIN.
v Users other than SYSADM that run the DB2P package QMFOS2 are in the
PRDQUERY class. (The QMFOS2 package is not in collection ONLINE.)
v All other work on the production system is in service class PRDBATCH.
v All users of the test DB2 system are assigned to the TESTUSER class except for
work that first calls stored procedure PAYPROCT, which is in service class
TESTPAYR.
Establishing performance periods for DDF threads
You can establish performance periods for DDF threads, including threads that run
in the WLM-established stored procedures address space.
About this task
By establishing multiple performance periods, you can cause the thread's
performance objectives to change based upon the thread's processor consumption.
Thus, a long-running unit of work can move down the priority order and let
short-running transactions get in and out at a higher priority.
Procedure
To design performance strategies for these threads:
v Take into account the events that cause a DDF thread to reset its z/OS
performance period.
v Use velocity goals and use a single-period service class for threads that are
always active. Because threads that are always active do not terminate the
enclave and thus do not reset the performance period to period 1, a
long-running thread always ends up in the last performance period. Any new
business units of work that use that thread suffer the performance consequences.
This makes performance periods unattractive for long-running threads.
Establishing performance objectives for DDF threads
Threads are assigned a service class by the classification rules in the active WLM
service policy. Each service class period has a performance objective (goal), and
WLM raises or lowers that period's access to system resources as needed to meet
the specified goal.
About this task
For example, the goal might be “application APPL8 should run in less than 3
seconds of elapsed time 90% of the time”.
No matter what your service class goals are, a request to start an address space
might time out, depending on the timeout value that is specified in the TIMEOUT
VALUE field of installation panel DSNTIPX. If the timeout value is too small, you might
need to increase it to account for a busy system.
Procedure
To establish performance objectives for DDF threads and the related address
spaces:
1. Create a WLM service definition that assigns service classes to the DDF threads
under subsystem type DDF and to the DDF address space under subsystem
type STC.
2. Install the service definition using the WLM menus and activate a policy
(VARY WLM,POLICY=policy).
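For example, to activate a service policy named PRODPOL (a hypothetical name) from the operator console, you might enter:
   VARY WLM,POLICY=PRODPOL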
Setting CICS options for threads
The CICS attachment facility provides a multithread connection to DB2 that allows
you to operate DB2 with CICS.
About this task
Threads allow each CICS application transaction or DB2 command to access DB2
resources.
Procedure
Use the CICS resource definition online (RDO) to tune the CICS attachment facility
and define the characteristics of your threads.
When a transaction needs a thread, an existing thread can be reused or a new
thread can be created. If no existing thread is available, and if the maximum
number of threads has not been reached, a thread is created. For more information,
see:
v CICS Transaction Server for OS/390 Resource Definition Guide for information about
RDO
v CICS Transaction Server for z/OS DB2 Guide for information about DB2
performance considerations and setup of the CICS attachment facility
Setting IMS options for threads
The IMS attachment facility provides a number of design options for threads.
Procedure
You can use the following IMS options for threads:
v Control the number of IMS regions connected to DB2. For IMS, this is also the
maximum number of concurrent threads.
v A dependent region with a subsystem member (SSM) that is not empty is
connected to DB2 at startup time. Regions with a null SSM cannot create a
thread to DB2. A thread to DB2 is created at the first execution of an SQL
statement in an IMS application schedule; it is terminated when the application
terminates.
The maximum number of concurrent threads used by IMS can be controlled by
the number of IMS regions that can connect to DB2 by transaction class
assignments. You can control the number by doing the following:
– Minimize the number of regions needing a thread by the way in which you
assign applications to regions.
– Provide an empty SSM member for regions that do not connect to DB2.
v Optimize the number of concurrent threads used by IMS.
v Provide efficient thread reuse for high volume transactions.
Thread creation and termination is a significant cost in IMS transactions. IMS
transactions identified as wait for input (WFI) can reuse threads: they create a
thread at the first execution of an SQL statement and reuse it until the region is
terminated. In general, though, use WFI only for transactions that reach a region
utilization of at least 75%.
Some degree of thread reuse can also be achieved with IMS class scheduling,
queuing, and a PROCLIM count greater than one. IMS Fast Path (IFP)
dependent regions always reuse the DB2 thread.
Setting TSO options for threads
You can specify limits for the number of threads taken by the TSO and batch
environments.
Procedure
To tune your TSO attachment facility:
v Specify values for the following parameters on the Storage Sizes installation
panel (DSNTIPE):
MAX TSO CONNECT
The maximum number of TSO foreground connections (including DB2I,
DB2 QMF, and foreground applications)
MAX BATCH CONNECT
The maximum number of TSO background connections (including batch
jobs and utilities)
v Because DB2 must be stopped to set new values, consider setting a higher MAX
BATCH CONNECT for batch periods. The statistics record (IFCID 0001) provides
information on the create thread queue. The OMEGAMON statistics report
shows that information under the SUBSYSTEM SERVICES section, as in the
example below.
Example
For TSO or batch environments, having 1% of the requests queued is probably a
good number to aim for by adjusting the MAX USERS value of installation panel
DSNTIPE. Queuing at create thread time is not desirable in the CICS and IMS
environments. If you are running IMS or CICS in the same DB2 subsystem as TSO
and batch, use MAX BATCH CONNECT and MAX TSO CONNECT to limit the
number of threads taken by the TSO and batch environments. The goal is to allow
enough threads for CICS and IMS so that their threads do not queue. To determine
the number of allied threads queued, see the QUEUED AT CREATE THREAD field
(A) of the OMEGAMON statistics report.
 SUBSYSTEM SERVICES              QUANTITY
 ---------------------------     --------
 IDENTIFY                        30757.00
 CREATE THREAD                   30889.00
 SIGNON                              0.00
 TERMINATE                       61661.00
 ROLLBACK                          644.00

 COMMIT PHASE 1                      0.00
 COMMIT PHASE 2                      0.00
 READ ONLY COMMIT                    0.00

 UNITS OF RECOVERY INDOUBT           0.00
 UNITS OF REC.INDBT RESOLVED         0.00

 SYNCHS(SINGLE PHASE COMMIT)     30265.00
 QUEUED AT CREATE THREAD (A)         0.00
 SUBSYSTEM ALLIED MEMORY EOT         1.00
 SUBSYSTEM ALLIED MEMORY EOM         0.00
 SYSTEM EVENT CHECKPOINT             0.00

Figure 11. Thread queuing in the OMEGAMON statistics report
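In this example, QUEUED AT CREATE THREAD (A) is 0.00 against 30889.00 CREATE THREAD requests, so no queuing occurred. As a hypothetical comparison, if the same report showed 340.00 queued requests, the queuing rate would be 340 / 30889, or roughly 1.1%, slightly above the 1% guideline for TSO and batch, which would suggest raising the MAX USERS value.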
Setting DB2 QMF options for threads
You can change the impact that DB2 QMF has on the performance of DB2 by
specifying certain options in DB2 QMF.
About this task
For more information on these aspects of DB2 QMF and how they affect
performance, see DB2 Query Management Facility: Installing and Managing DB2 QMF
for TSO/CICS.
Procedure
To set DB2 QMF performance options:
Specify the following options:
v The DSQSIROW parameter of the ISPSTART command
v SPACE parameter of the user DB2 QMF profile (Q.PROFILES)
v DB2 QMF region size and the spill file attributes
v TRACE parameter of the user DB2 QMF profile (Q.PROFILES)
Chapter 11. Designing DB2 statistics for performance
Accurate and up-to-date statistics are vital for maintaining good SQL
processing performance in DB2.
Maintaining statistics in the catalog
DB2 stores catalog statistics that the optimizer uses to determine the best access
paths for optimal performance from your SQL statements.
The optimizer uses catalog statistics, database design, and the details of the SQL
statement to choose access paths. The optimizer also considers the central
processor model, number of central processors, buffer pool size, and RID pool size.
The optimizer considers the number of processors only to determine appropriate
degrees of parallelism.
Several important calculations for access path selection depend upon buffer pool
statistics. The central processor model also affects the optimizer's access path
selection. These two factors can change your queries' access paths from one system
to another, even if all the catalog statistics are identical. You should keep this in
mind when you migrate from a test system to a production system, or when you
model a new application.
Mixed central processor models in a data sharing group can also affect access path
selection.
Related concepts
Access path selection in a data sharing group (DB2 Data Sharing Planning and
Administration)
Statistics used for access path selection
DB2 uses statistics from certain catalog table columns when selecting query access
paths.
DB2 uses certain values from the catalog tables directly when selecting access
paths.
For example, the SYSTABLES and SYSTABLESPACE catalog tables indicate how
much data the tables referenced by a query contain and how many pages hold
data, the SYSINDEXES table indicates the most efficient index for a query, and the
SYSCOLUMNS and SYSCOLDIST catalog tables indicate estimated filter factors for
predicates.
The following tables list columns in catalog tables that DB2 uses for access path
selection, values that trigger the use of a default value, and corresponding default
values. Catalog table columns that are used directly by DB2 during access path
selection are identified by a "Yes" value in the Used for access paths? column of
the following tables.
Every table that RUNSTATS updates
As shown in the following table, the STATSTIME column is updated in
every table that RUNSTATS updates.
Table 21. Columns in every table that RUNSTATS updates that are used for access path selection

STATSTIME
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   If updated most recently by RUNSTATS, the date and time of that update; not
   updatable in SYSINDEXPART and SYSTABLEPART. Used for access path selection
   for SYSCOLDIST if duplicate column values exist for the same column (by user
   insertion).
SYSIBM.SYSCOLDIST
Contains table-level frequency, histogram, and multi-column cardinality
statistics that DB2 uses to estimate filter factors. The columns in this
catalog table that are used for access path selection are shown in the
following table.
Table 22. Columns in the SYSIBM.SYSCOLDIST table that are updated by RUNSTATS or used for access path selection

CARDF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   For TYPE='C', the number of distinct values gathered in the column group; for
   TYPE='F', the number of distinct values for the column group -1; for TYPE='H',
   the number of distinct values in the column group of the interval indicated by
   the value of the QUANTILENO column.

COLGROUPCOLNO
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   The set of columns associated with the statistics. Contains an empty string if
   NUMCOLUMNS = 1.

COLVALUE
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Frequently occurring value in the distribution.

FREQUENCYF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   A number, which multiplied by 100, gives the percentage of rows that contain
   the value of COLVALUE; for TYPE='H', the percentage of rows with the value of
   COLVALUE that fall into the range between LOWVALUE and HIGHVALUE for the
   interval indicated by the value of the QUANTILENO column.

HIGHVALUE
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): Yes.
   For TYPE='H', the high bound for the interval indicated by the value of the
   QUANTILENO column.

LOWVALUE
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): Yes.
   For TYPE='H', the low bound for the interval indicated by the value of the
   QUANTILENO column.

NUMCOLUMNS
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   The number of columns that are associated with the statistics. The default
   value is 1.

TYPE
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   The type of statistics gathered:
   C   Cardinality
   F   Frequent value
   N   Non-padded
   H   Histogram statistics

QUANTILENO
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): Yes.
   For histogram statistics, the ordinary sequence number of the quantile in the
   whole consecutive value range from low to high.
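For example, to see which distribution statistics exist for a particular table, you can query SYSIBM.SYSCOLDIST directly. The following query is a minimal sketch; the owner and table names are hypothetical sample objects:
   SELECT NAME, NUMCOLUMNS, TYPE, CARDF, COLVALUE, FREQUENCYF, QUANTILENO
     FROM SYSIBM.SYSCOLDIST
    WHERE TBOWNER = 'DSN8910'
      AND TBNAME  = 'EMP'
    ORDER BY NAME, TYPE, QUANTILENO;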
SYSIBM.SYSCOLDISTSTATS
Contains partition-level frequency, histogram, and multi-column cardinality
statistics that are used by RUNSTATS to aggregate table-level frequency,
histogram, and multi-column cardinality statistics that are stored in
SYSIBM.SYSCOLDIST. The columns in this catalog table that are used for
access path selection are shown in the following table.
Table 23. Columns in the SYSCOLDISTSTATS catalog table that are updated by RUNSTATS or used for access path selection

CARDF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   A number, which multiplied by 100, gives the percentage of rows that contain
   the value of COLVALUE; for TYPE='F' or TYPE='N', the number of rows or keys in
   the partition for which the FREQUENCYF value applies; for TYPE='H', the
   percentage of rows with the value of COLVALUE that fall into the range between
   LOWVALUE and HIGHVALUE for the interval indicated by the value of the
   QUANTILENO column.

COLGROUPCOLNO
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   The set of columns associated with the statistics.

COLVALUE
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Frequently occurring value in the distribution.

FREQUENCYF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   A number, which multiplied by 100, gives the percentage of rows that contain
   the value of COLVALUE; for TYPE='H', the percentage of rows with the value of
   COLVALUE that fall into the range between LOWVALUE and HIGHVALUE for the
   interval indicated by the value of the QUANTILENO column.

HIGHVALUE
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   For TYPE='H', the high bound for the interval indicated by the value of the
   QUANTILENO column.

KEYCARDDATA
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   The internal representation of the estimate of the number of distinct values
   in the partition.

LOWVALUE
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   For TYPE='H', the low bound for the interval indicated by the value of the
   QUANTILENO column.

NUMCOLUMNS
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   The number of columns associated with the statistics. The default value is 1.

TYPE
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   The type of statistics gathered:
   C   Cardinality
   F   Frequent value
   N   Non-padded
   H   Histogram statistics

QUANTILENO
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   For histogram statistics, the ordinary sequence number of the quantile in the
   whole consecutive value range from low to high.
SYSIBM.SYSCOLSTATS
Contains partition-level column statistics that are used by DB2 to
determine the degree of parallelism, and are also sometimes used to bound
filter factor estimates. The columns in this catalog table that are used for
access path selection are shown in the following table.
Table 24. Columns in the SYSIBM.SYSCOLSTATS catalog table that are updated by RUNSTATS or used for access path selection

COLCARD
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   The number of distinct values in the partition. Do not update this column
   manually without first updating COLCARDDATA to a value of length 0. For XML
   column indicators, NODEID columns, and XML tables, the value of this column is
   set to -2.

COLCARDDATA
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   The internal representation of the estimate of the number of distinct values
   in the partition. A value appears here only if RUNSTATS TABLESPACE is run on
   the partition. Otherwise, this column contains a string of length 0,
   indicating that the actual value is in COLCARD.

HIGHKEY
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   First 2000 bytes of the highest value of the column within the partition. If
   the partition is empty, the value is set to a string of length 0. For LOB
   columns, XML column indicators, NODEID columns, and XML tables, the value of
   this column is set to blank.

HIGH2KEY
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   First 2000 bytes of the second highest value of the column within the
   partition. If the partition is empty, the value is set to a string of length
   0. For LOB columns, XML column indicators, NODEID columns, and XML tables, the
   value of this column is set to blank. This column is updated with decoded
   values if the column is a randomized key column.

LOWKEY
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   First 2000 bytes of the lowest value of the column within the partition. If
   the partition is empty, the value is set to a string of length 0. For LOB
   columns, XML column indicators, NODEID columns, and XML tables, the value of
   this column is set to blank.

LOW2KEY
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   First 2000 bytes of the second lowest value of the column within the
   partition. If the partition is empty, the value is set to a string of length
   0. For LOB columns, XML column indicators, NODEID columns, and XML tables, the
   value of this column is set to blank. This column is updated with decoded
   values if the column is a randomized key column.

PARTITION
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Partition number for the table space that contains the table in which the
   column is defined.
SYSIBM.SYSCOLUMNS
The columns in this catalog table that are used for access path selection are
shown in the following table.
Table 25. Columns in the SYSCOLUMNS catalog table that are updated by RUNSTATS or used for access path selection

COLCARDF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Estimated number of distinct values in the column, -1 to trigger use of the
   default value (25), and -2 for auxiliary indexes, XML column indicators,
   NODEID columns, and XML tables.

HIGH2KEY
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   First 2000 bytes of the second highest value in this column. If the table is
   empty, the value is set to a string of length 0. For auxiliary indexes, XML
   column indicators, NODEID columns, and XML tables, the value of this column is
   set to blank. RUNSTATS does not update HIGH2KEY if the column is a randomized
   key column.

LOW2KEY
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   First 2000 bytes of the second lowest value in this column. If the table is
   empty, the value is set to a string of length 0. For auxiliary indexes, XML
   column indicators, NODEID columns, and XML tables, the value of this column is
   set to blank. RUNSTATS does not update LOW2KEY if the column is a randomized
   key column.
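Because COLCARDF, HIGH2KEY, and LOW2KEY can be updated by the user, you can supply a more accurate cardinality than RUNSTATS gathered when you know one. The following statement is a minimal sketch with hypothetical sample creator, table, and column names; note that statements that are already bound typically pick up the new value only after a rebind or re-prepare:
   UPDATE SYSIBM.SYSCOLUMNS
      SET COLCARDF = 2
    WHERE TBCREATOR = 'DSN8910'
      AND TBNAME    = 'EMP'
      AND NAME      = 'SEX';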
SYSIBM.SYSINDEXES
Contains table-level index statistics that are used by DB2 for index costing.
The following columns in this catalog table are used for access path
selection.
Table 26. Columns in the SYSINDEXES catalog table that are updated by RUNSTATS or used for access path selection

AVGKEYLEN
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   Average key length.

CLUSTERED
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Whether the table is actually clustered by the index. The value of this column
   is set to blank for auxiliary indexes, NODEID indexes, and XML indexes.

CLUSTERING
   Set by RUNSTATS: No. User can update: No. Used for access paths (note 1): Yes.
   Whether the index was created using CLUSTER.

CLUSTERRATIOF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   A number which, when multiplied by 100, gives the percentage of rows in
   clustering order. For example, 1 indicates that all rows are in clustering
   order and .87825 indicates that 87.825% of the rows are in clustering order.
   For a partitioned index, it is the weighted average of all index partitions in
   terms of the number of rows in the partition. The value of this column is set
   to -2 for auxiliary indexes, NODEID indexes, and XML indexes. If this column
   contains the default, 0, DB2 uses the value in CLUSTERRATIO, a percentage, for
   access path selection.

FIRSTKEYCARDF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Number of distinct values of the first key column, or an estimate if updated
   while collecting statistics on a single partition; -1 to trigger use of the
   default value (25).

FULLKEYCARDF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Number of distinct values of the full key, -1 to trigger use of the default
   value (25).

NLEAF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Number of active leaf pages in the index, -1 to trigger use of the default
   value (SYSTABLES.CARD/300).

NLEVELS
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Number of levels in the index tree, -1 to trigger use of the default value (2).

SPACEF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Disk storage in KB.

DATAREPEATFACTORF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   The number of times that data pages are repeatedly scanned after the index key
   is ordered. This number is -1 if statistics have not been collected. Valid
   values are -1 or any value that is equal to or greater than 1.
SYSIBM.SYSINDEXPART
Contains statistics for index space utilization and index organization. For
the partitioning index of an index-controlled partitioned table space, the
limit key column is also used in limited partition scan scenarios. The
following columns in this catalog table are used for access path selection.
Table 27. Columns in the SYSINDEXPART catalog table that are updated by RUNSTATS or used for access path selection

AVGKEYLEN
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   Average key length.

CARDF
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   Number of rows or LOBs referenced by the index or partition.

DSNUM
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Number of data sets.

EXTENTS
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Number of data set extents (for multiple pieces, the value is for the extents
   in the last data set).

FAROFFPOSF
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   Number of times that accessing a different, “far-off” page is necessary when
   accessing all the data records in index order. Each time that DB2 accesses a
   far-off page, accessing the “next” record in index order probably requires
   I/O activity.
   For nonsegmented table spaces, a page is considered far-off from the present
   page if the two page numbers differ by 16 or more. For segmented table spaces,
   a page is considered far-off from the present page if the two page numbers
   differ by SEGSIZE * 2 or more.
   Together, NEAROFFPOSF and FAROFFPOSF indicate how well the index follows the
   cluster pattern of the table space. For a clustering index, NEAROFFPOSF and
   FAROFFPOSF approach a value of 0 as clustering improves. A reorganization
   should bring them nearer their optimal values; however, if a nonzero FREEPAGE
   value is specified on the CREATE TABLESPACE statement, the NEAROFFPOSF after
   reorganization reflects the table on which the index is defined. Do not expect
   optimal values for non-clustering indexes. The value is -1 if statistics have
   not been gathered. The value of this column is set to -2 for node ID indexes
   and XML indexes.

LEAFDIST
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   100 times the number of pages between successive leaf pages. The value of this
   column is set to -2 for NODEID indexes and XML indexes.

LEAFFAR
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Number of leaf pages located physically far away from previous leaf pages for
   successive active leaf pages accessed in an index scan. See “LEAFNEAR and
   LEAFFAR columns” on page 684 for more information.

LEAFNEAR
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Number of leaf pages located physically near previous leaf pages for
   successive active leaf pages. See “LEAFNEAR and LEAFFAR columns” on page 684
   for more information.

LIMITKEY
   Set by RUNSTATS: No. User can update: No. Used for access paths (note 1): Yes.
   The limit key of the partition in an internal format, 0 if the index is not
   partitioned.

NEAROFFPOSF
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   Number of times that accessing a different, “near-off” page would be necessary
   when accessing all the data records in index order. Each time that DB2
   accesses a near-off page, accessing the “next” record in index order would
   probably require I/O activity. For more information about NEAROFFPOSF, see the
   description of FAROFFPOSF.
   NEAROFFPOSF is incremented if the current indexed row is not on the same or
   next data page of the previous indexed row, and if the distance between the
   two data pages does not qualify for FAROFFPOSF.
   For nonsegmented table spaces, a page is considered near-off from the present
   page if the difference between the two page numbers is greater than or equal
   to 2, and less than 16. For segmented table spaces, a page is considered
   near-off from the present page if the difference between the two page numbers
   is greater than or equal to 2, and less than SEGSIZE * 2. A nonzero value in
   the NEAROFFPOSF field after a REORG might be attributed to the number of space
   map pages that are contained in the segmented table space. The value of this
   column is set to -2 for node ID indexes and XML indexes.

PQTY
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   The primary space allocation in 4 KB blocks for the data set.

PSEUDO_DEL_ENTRIES
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Number of pseudo-deleted keys.

SECQTYI
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   Secondary space allocation in units of 4 KB, stored in integer format instead
   of the small integer format supported by SQTY. If a storage group is not used,
   the value is 0.

SPACE
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   The number of KB of space currently allocated for all extents (contains the
   accumulated space used by all pieces if a page set contains multiple pieces).

SQTY
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   The secondary space allocation in 4 KB blocks for the data set.

SPACEF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Disk storage in KB.
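Because NEAROFFPOSF, FAROFFPOSF, LEAFNEAR, and LEAFFAR indicate how well an index follows the clustering of its table, they are natural inputs to REORG decisions. The following query is a minimal sketch (the creator name is a hypothetical sample value) that lists the most disorganized index partitions first:
   SELECT IXNAME, PARTITION, CARDF,
          NEAROFFPOSF, FAROFFPOSF, LEAFNEAR, LEAFFAR
     FROM SYSIBM.SYSINDEXPART
    WHERE IXCREATOR = 'DSN8910'
    ORDER BY FAROFFPOSF DESC;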
SYSIBM.SYSINDEXSTATS
Contains partition-level index statistics that are used by RUNSTATS to
aggregate table-level index statistics that are stored in
SYSIBM.SYSINDEXES. The following columns in this catalog table are used
for access path selection.
Table 28. Columns in the SYSINDEXSTATS catalog table that are updated by RUNSTATS or used for access path selection

CLUSTERRATIOF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   A number which, when multiplied by 100, gives the percentage of rows in
   clustering order. For example, 1 indicates that all rows are in clustering
   order and .87825 indicates that 87.825% of the rows are in clustering order.

FIRSTKEYCARDF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Number of distinct values of the first key column, or an estimate if updated
   while collecting statistics on a single partition.

FULLKEYCARDDATA
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   The internal representation of the number of distinct values of the full key.

FULLKEYCARDF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Number of distinct values of the full key.

KEYCOUNTF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Number of rows in the partition, -1 to trigger use of the value in KEYCOUNT.

NLEAF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Number of leaf pages in the index.

NLEVELS
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Number of levels in the index tree.

DATAREPEATFACTORF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   The number of times that data pages are repeatedly scanned after the index key
   is ordered. This number is -1 if statistics have not been collected. Valid
   values are -1 or any value that is equal to or greater than 1.
SYSIBM.SYSKEYTARGETS
Contains table-level frequency, histogram, and multi-column cardinality
statistics for column-expression index keys. The values are used by DB2 in
filter factor estimation algorithms for matched expressions. The following
columns in this catalog table are used for access path selection.
Table 29. Columns in the SYSKEYTARGETS catalog table that are updated by RUNSTATS or used for access path selection

CARDF
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): Yes.
   Number of distinct values for the key-target. The value of this column is set
   to -2 for NODEID indexes and XML indexes.

HIGH2KEY
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Second highest key value.

LOW2KEY
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Second lowest key value.

STATS_FORMAT
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Type of statistics gathered:
   blank   No statistics have been collected, or VARCHAR column statistical
           values are padded
   N       Varchar statistical values are not padded
SYSIBM.SYSKEYTARGETSTATS
Contains partition-level key statistics for keys in column-expression
indexes. The values are used by RUNSTATS to aggregate table-level key
column-expression statistics. The following columns in this catalog table
are used for access path selection.
Table 30. Columns in the SYSKEYTARGETSTATS catalog table that are updated by RUNSTATS or used for access path selection

HIGHKEY
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Highest key value.

HIGH2KEY
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Second highest key value.

LOWKEY
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Lowest key value.

LOW2KEY
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Second lowest key value.

STATS_FORMAT
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Type of statistics gathered:
   blank   No statistics have been collected, or VARCHAR column statistical
           values are padded
   N       Varchar statistical values are not padded
SYSIBM.SYSKEYTGTDIST
Contains table-level frequency, histogram, and multi-column cardinality
statistics for column-expression index keys. The values are used by DB2 in
filter factor estimation algorithms for matched expressions. The following
columns in this catalog table are used for access path selection.
Table 31. Columns in the SYSKEYTGTDIST catalog table that are updated by RUNSTATS or used for access path selection

CARDF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   For TYPE='C', the number of distinct values gathered in the key group; for
   TYPE='F', the number of distinct values for the key group -1; for TYPE='H',
   the number of distinct values in the column group of the interval indicated by
   the value in the QUANTILENO column.

KEYGROUPKEYNO
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   The set of keys associated with the statistics. Contains an empty string if
   NUMKEYS = 1.

KEYVALUE
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Frequently occurring value in the distribution.

FREQUENCYF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   A number, which multiplied by 100, gives the percentage of rows that contain
   the value of KEYVALUE; for TYPE='H', the percentage of rows with the value of
   KEYVALUE that fall into the range between LOWVALUE and HIGHVALUE for the
   interval indicated by the value of the QUANTILENO column.

HIGHVALUE
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   For TYPE='H', the high bound for the interval indicated by the value of the
   QUANTILENO column.

LOWVALUE
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   For TYPE='H', the low bound for the interval indicated by the value of the
   QUANTILENO column.

NUMKEYS
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   The number of keys associated with the statistics. The default value is 1.

TYPE
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   The type of statistics gathered:
   C   Cardinality
   F   Frequent value
   N   Non-padded
   H   Histogram statistics

QUANTILENO
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   For histogram statistics, the ordinary sequence number of the quantile in the
   whole consecutive value range from low to high.
SYSIBM.SYSKEYTGTDISTSTATS
Contains partition-level frequency, histogram, and multi-column cardinality
statistics for column-expression index keys. The values are used by
RUNSTATS to aggregate table-level statistics that are stored in
SYSIBM.SYSKEYTGTDIST. The following columns in this catalog table are used
for access path selection.
Table 32. Columns in the SYSKEYTGTDISTSTATS catalog table that are updated by RUNSTATS or used for access path selection

CARDF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   A number, which multiplied by 100, gives the percentage of rows that contain
   the value of KEYVALUE; for TYPE='H', the percentage of rows with the value of
   KEYVALUE that fall into the range between LOWVALUE and HIGHVALUE for the
   interval indicated by the value of the QUANTILENO column.

KEYGROUPKEYNO
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   The set of keys associated with the statistics.

KEYVALUE
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Frequently occurring value in the distribution.

FREQUENCYF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   A number, which multiplied by 100, gives the percentage of rows that contain
   the value of KEYVALUE; for TYPE='H', the percentage of rows with the value of
   KEYVALUE that fall into the range between LOWVALUE and HIGHVALUE for the
   interval indicated by the value of the QUANTILENO column.

HIGHVALUE
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   For TYPE='H', the high bound for the interval indicated by the value of the
   QUANTILENO column.

LOWVALUE
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   For TYPE='H', the low bound for the interval indicated by the value of the
   QUANTILENO column.

QUANTILENO
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   For histogram statistics, the ordinary sequence number of the quantile in the
   whole consecutive value range from low to high.
SYSIBM.SYSLOBSTATS
Contains LOB table space statistics. The following columns in this catalog
table are used for access path selection.
Table 33. Columns in the SYSLOBSTATS catalog table that are updated by RUNSTATS or used for access path selection

AVGSIZE
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Average size of a LOB in bytes.

FREESPACE
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   The number of KB of available space in the LOB table space.

ORGRATIO
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   The percentage of organization in the LOB table space. A value of 100
   indicates perfect organization of the LOB table space. A value of 1 indicates
   that the LOB table space is disorganized. A value of 0.00 indicates that the
   LOB table space is totally disorganized. An empty LOB table space has an
   ORGRATIO value of 100.00.
SYSIBM.SYSROUTINES
Contains statistics for table functions. The following columns in this
catalog table are used for access path selection.
Table 34. Columns in the SYSROUTINES catalog table that are updated by RUNSTATS or used for access path selection

CARDINALITY
   Set by RUNSTATS: No. User can update: Yes. Used for access paths (note 1): Yes.
   The predicted cardinality of a table function, -1 to trigger use of the
   default value (10 000).

INITIAL_INSTS
   Set by RUNSTATS: No. User can update: Yes. Used for access paths (note 1): Yes.
   Estimated number of instructions executed the first and last time the function
   is invoked, -1 to trigger use of the default value (40 000).

INITIAL_IOS
   Set by RUNSTATS: No. User can update: Yes. Used for access paths (note 1): Yes.
   Estimated number of I/Os performed the first and last time the function is
   invoked, -1 to trigger use of the default value (0).

INSTS_PER_INVOC
   Set by RUNSTATS: No. User can update: Yes. Used for access paths (note 1): Yes.
   Estimated number of instructions per invocation, -1 to trigger use of the
   default value (4 000).

IOS_PER_INVOC
   Set by RUNSTATS: No. User can update: Yes. Used for access paths (note 1): Yes.
   Estimated number of I/Os per invocation, -1 to trigger use of the default
   value (0).
SYSIBM.SYSTABLEPART
Contains statistics for space utilization. The following columns in this
catalog table are used for access path selection.
Table 35. Columns in the SYSTABLEPART catalog table that are updated by RUNSTATS or used for access path selection

AVGROWLEN
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   Average row length.

CARDF
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   Total number of rows in the table space or partition. For LOB table spaces,
   the number of LOBs in the table space.

DSNUM
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Number of data sets.

EXTENTS
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Number of data set extents (for multiple pieces, the value is for the extents
   in the last data set).

FARINDREF
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   Number of rows that are relocated far from their original page.
   If an update operation increases the length of a record by more than the
   amount of available space in the page in which it is stored, the record is
   moved to another page. Until the table space is reorganized, the record
   requires an additional page reference when it is accessed. The sum of
   NEARINDREF and FARINDREF is the total number of such records.
   For nonsegmented table spaces, a page is considered “near” the present page if
   the two page numbers differ by 16 or fewer; otherwise, it is “far from” the
   present page. For segmented table spaces, a page is considered “near” the
   present page if the two page numbers differ by (SEGSIZE * 2) or less.
   Otherwise, it is “far from” its original page.
   A record that is relocated near its original page tends to be accessed more
   quickly than one that is relocated far from its original page.

NEARINDREF
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   Number of rows relocated near their original page.

PAGESAVE
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   Percentage of pages, times 100, saved in the table space or partition as a
   result of using data compression. For example, a value of 25 indicates a
   savings of 25%, so that the required pages are only 75% of what would be
   required without data compression. The value is 0 if no savings from using
   data compression are likely, or if statistics have not been gathered. The
   value can be negative if using data compression causes an increase in the
   number of pages in the data set.
   This calculation includes the overhead bytes for each row, the required bytes
   for the dictionary, and the required bytes for the current FREEPAGE and
   PCTFREE specification for the table space and partition.
   This calculation is based on an average row length, and the result varies
   depending on the actual lengths of the rows.

PERCACTIVE
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   Percentage of space occupied by active rows, containing actual data from
   active tables; -2 for LOB table spaces.
   This value is influenced by the PCTFREE and the FREEPAGE parameters on the
   CREATE TABLESPACE statement and by unused segments of segmented table spaces.

PERCDROP
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   For non-segmented table spaces, the percentage of space occupied by rows of
   data from dropped tables; for segmented table spaces, 0.

PQTY
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   The primary space allocation in 4 KB blocks for the data set.

SECQTYI
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   Secondary space allocation in units of 4 KB, stored in integer format instead
   of the small integer format supported by SQTY. If a storage group is not used,
   the value is 0.

SPACE
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   The number of KB of space currently allocated for all extents (contains the
   accumulated space used by all pieces if a page set contains multiple pieces).

SPACEF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Disk storage in KB.

SQTY
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   The secondary space allocation in 4 KB blocks for the data set.
SYSIBM.SYSTABLES
Contains table-level table statistics that are used by DB2 throughout the
query costing process. The following columns in this catalog table are used
for access path selection.
Table 36. Columns in the SYSTABLES catalog table that are updated by RUNSTATS or used for access path selection

AVGROWLEN
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Average row length of the table specified in the table space.

CARD
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Total number of rows in the table or total number of LOBs in an auxiliary
   table, -1 to trigger use of the default value (10 000).

EDPROC
   Set by RUNSTATS: No. User can update: No. Used for access paths (note 1): Yes.
   Non-blank value if an edit exit routine is used.

NPAGES
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Total number of pages on which rows of this table appear, -1 to trigger use of
   the default value (CEILING(1 + CARD/20)).

NPAGESF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Number of pages used by the table.

PCTPAGES
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   For non-segmented table spaces, percentage of total pages of the table space
   that contain rows of the table; for segmented table spaces, the percentage of
   total pages in the set of segments assigned to the table that contain rows of
   the table.

PCTROWCOMP
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Percentage of rows compressed within the total number of active rows in the
   table.

SPACEF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Disk storage in KB.
SYSIBM.SYSTABLESPACE
Contains table-space-level statistics that are used by DB2 for costing of
non-segmented table spaces. The following columns in this catalog table
are used for access path selection.
Table 37. Columns in the SYSTABLESPACE catalog table that are updated by RUNSTATS or used for access path selection

AVGROWLEN
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   Average row length.

NACTIVEF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Number of active pages in the table space, the number of pages touched if a
   cursor is used to scan the entire file, 0 to trigger use of the value in the
   NACTIVE column instead. If NACTIVE contains 0, DB2 uses the default value
   (CEILING(1 + CARD/20)).

SPACE
   Set by RUNSTATS: Yes. User can update: No. Used for access paths (note 1): No.
   Disk storage in KB.

SPACEF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Disk storage in KB.
SYSIBM.SYSTABSTATS
Contains partition-level table statistics that are used by DB2 when costing
limited partition scans, and are also used by RUNSTATS to aggregate
table-level table statistics that are stored in SYSIBM.SYSTABLES. The
following columns in this catalog table are used for access path selection.
Table 38. Columns in the SYSTABSTATS catalog table that are updated by RUNSTATS or used for access path selection

CARDF
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Total number of rows in the partition, -1 to trigger use of the value in the
   CARD column. If CARD is -1, DB2 uses a default value (10 000).

NACTIVE
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Number of active pages in the partition.

NPAGES
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): Yes.
   Total number of pages on which rows of the partition appear, -1 to trigger use
   of the default value (CEILING(1 + CARD/20)).

PCTPAGES
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Percentage of total active pages in the partition that contain rows of the
   table.

PCTROWCOMP
   Set by RUNSTATS: Yes. User can update: Yes. Used for access paths (note 1): No.
   Percentage of rows compressed within the total number of active rows in the
   partition, -1 to trigger use of the default value (0).
Notes:
1. Statistics on LOB-related values are not used for access path selection. The
SYSCOLDISTSTATS and SYSINDEXSTATS catalog tables are not used for
parallelism access paths. Information in the SYSCOLSTATS catalog table (the
COLCARD, HIGHKEY, LOWKEY, HIGH2KEY, and LOW2KEY columns) is used to determine
the degree of parallelism.
Filter factors and catalog statistics
DB2 needs an accurate estimate of the number of rows that qualify after applying
each predicate in order to determine optimal access paths.
When multiple tables are accessed, filtering also affects the cost of join
order and join method. The catalog tables SYSIBM.SYSCOLUMNS and
SYSIBM.SYSCOLDIST are the main source of statistics for calculating predicate
filter factors. The following columns are particularly important:
v SYSCOLUMNS.COLCARDF indicates whether statistics exist for a column or
not. A positive value is an estimate of the number of distinct values in the
column (cardinality). A value of '-1' results in the use of default statistics.
The value of COLCARDF generated by RUNSTATS TABLESPACE is an estimate
determined by a sampling method. If you know a more accurate number for
COLCARDF, you can supply it by updating the catalog. If the column is the first
column of an index, the value generated by RUNSTATS INDEX is exact.
v Columns in SYSCOLDIST contain statistics about the frequency (or distribution)
of values for a single column. It can also contain statistics about the cardinality
of a group of columns and the frequency of values for that group.
When frequency statistics do not exist, DB2 assumes that the data is uniformly
distributed and all values in the column occur with the same frequency. This
assumption can lead to an inaccurate estimate of the number of qualifying rows
if the data is skewed, which can result in performance problems.
Assume that a column (AGE_CATEGORY) contains five distinct values
(COLCARDF), each of which occurs with the following frequency:

   AGE_CATEGORY   FREQUENCY
   ------------   ---------
   INFANT              5%
   CHILD              15%
   ADOLESCENT         25%
   ADULT              40%
   SENIOR             15%
Without this frequency information, DB2 would use a default filter factor of 1/5
(1/COLCARDF), or 20%, to estimate the number of rows that qualify for
predicate AGE_CATEGORY=ADULT. However, the actual frequency of that age
category is 40%. Thus, the number of qualifying rows is underestimated by 50%.
When collecting statistics about indexes, you can specify the KEYCARD option
of RUNSTATS to collect cardinality statistics on the specified indexes. You can
also specify the FREQVAL option with KEYCARD to control whether distribution
statistics are collected, and for how many concatenated index columns. By
default, distribution statistics are collected on the first column of each
index for the 10 most frequently occurring values. FIRSTKEYCARDF and
FULLKEYCARDF are also collected by default. (A sample RUNSTATS invocation
that combines these options appears after this list.)
The value of FULLKEYCARDF generated by RUNSTATS on a DPSI type index
is an estimate determined by a sampling method. If you know a more accurate
number for FULLKEYCARDF, you can supply it by updating the catalog.
When collecting statistics at the table level, you can specify the COLUMN option
of RUNSTATS to collect cardinality statistics on just the specified columns. You
can also use the COLGROUP option to specify a group of columns for which to
collect cardinality statistics. If you use the FREQVAL option with COLGROUP,
you can also collect distribution statistics for the column group.
To limit the resources required to collect statistics, you only need to collect
column cardinality and frequency statistics that have changed. For example, a
column on GENDER is likely to have a COLCARDF of 2, with M and F as the
possible values. It is unlikely that the cardinality for this column ever changes.
The distribution of the values in the column might not change often, depending
on the volatility of the data.
Recommendation: If query performance is not satisfactory, consider the
following actions:
– Collect cardinality statistics on all columns that are used as predicates in a
WHERE clause.
– Collect frequencies for all columns with a low cardinality that are used as
COL op constant predicates.
– Collect frequencies for a column when the column can contain default data,
the default data is skewed, and the column is used as a COL op constant
predicate.
– Collect KEYCARD on all candidate indexes.
– Collect column group statistics on all join columns.
v LOW2KEY and HIGH2KEY columns are limited to storing the first 2000 bytes of
a key value. If the column is nullable, values are limited to 1999 bytes.
v The closer SYSINDEXES.CLUSTERRATIOF is to 100% (a value of 1), the more
closely the ordering of the index entries matches the physical ordering of the
table rows.
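The following RUNSTATS invocation is a minimal sketch that applies several of these recommendations at once, using the KEYCARD, FREQVAL, and COLGROUP options described above. The database, table space, table, index, and column names are hypothetical sample objects, and the exact option syntax should be verified against the RUNSTATS reference:
   RUNSTATS TABLESPACE DSN8D91A.DSN8S91E
     TABLE(DSN8910.EMP)
       COLGROUP(WORKDEPT, JOB) FREQVAL COUNT 10
     INDEX(DSN8910.XEMP1
       KEYCARD
       FREQVAL NUMCOLS 2 COUNT 10)
     SHRLEVEL CHANGE
Here KEYCARD collects cardinality statistics on the index, FREQVAL NUMCOLS 2 COUNT 10 collects the 10 most frequent values of the first two concatenated key columns, and COLGROUP with FREQVAL collects cardinality and distribution statistics on the specified column group.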
Related concepts
“How clustering affects access path selection” on page 681
Related reference
RUNSTATS (DB2 Utilities)
Histogram statistics
Restriction: RUNSTATS cannot collect histogram statistics on randomized key
columns.
DB2 chooses the best access path for a query based on predicate selectivity
estimation, which in turn relies heavily on data distribution statistics. Histogram
statistics summarize data distribution on an interval scale by dividing the entire
range of possible values within a data set into a number of intervals.
DB2 creates equal-depth histogram statistics, meaning that it divides the whole
range of values into intervals that each contain about the same percentage of
the total number of rows. The following columns in a histogram statistics table
define an interval:
QUANTILENO
An ordinal sequence number that identifies the interval.
HIGHVALUE
The value that serves as the upper bound for the interval.
LOWVALUE
A value that serves as the lower bound for the interval.
Note the following characteristics of histogram statistics intervals:
v Each interval includes approximately the same number, or percentage, of the
rows. A highly frequent single value might occupy an interval by itself.
v A single value is never broken into more than one interval, meaning that the
maximum number of intervals is equal to the number of distinct values on the
column. The maximum number of intervals cannot exceed 100, which is the
maximum number that DB2 supports.
v Adjacent intervals sometimes skip values that do not appear in the table,
especially when doing so avoids a large range of skipped values within an
interval. For example, if a value of 30 has a frequency of 1%, placing it in the
seventh interval would balance the percentage of rows in the sixth and seventh
intervals. However, doing so would introduce a large skipped range to the
seventh interval.
v HIGHVALUE and LOWVALUE can be inclusive or exclusive, but an interval
generally represents a non-overlapped value range.
v NULL values, if any exist, occupy a single interval.
Histogram statistics enable DB2 to improve access path selection by estimating
predicate selectivity from value-distribution statistics that are collected over the
entire range of values in a data set.
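You collect histogram statistics by specifying the HISTOGRAM keyword on
RUNSTATS. The following sketch, in which the object names are only illustrative,
collects histogram statistics with 10 quantiles on one column and then inspects
the resulting intervals in the catalog:
RUNSTATS TABLESPACE DSN8D91A.ADDRTS
  TABLE (DSN8910.ADDR)
  COLGROUP (CITY) HISTOGRAM NUMQUANTILES 10

SELECT NAME, QUANTILENO, LOWVALUE, HIGHVALUE, CARDF, FREQUENCYF
  FROM SYSIBM.SYSCOLDIST
  WHERE TBOWNER = 'DSN8910'
    AND TBNAME = 'ADDR'
    AND TYPE = 'H'
  ORDER BY QUANTILENO;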
Statistics for partitioned table spaces
For a partitioned table space, DB2 keeps statistics separately by partition and also
collectively for the entire table space.
The following table shows the catalog tables that contain statistics by partition and,
for each one, the table that contains the corresponding aggregate statistics.
Table 39. The catalog tables that contain statistics by partition and the tables that
contain the corresponding aggregate statistics

Statistics by partition are in      Aggregate statistics are in
SYSTABSTATS                         SYSTABLES
SYSINDEXSTATS                       SYSINDEXES
SYSCOLSTATS                         SYSCOLUMNS
SYSCOLDISTSTATS                     SYSCOLDIST
SYSKEYTARGETSTATS                   SYSKEYTARGETS
SYSKEYTGTDISTSTATS                  SYSKEYTGTDIST
If you run RUNSTATS for separate partitions of a table space, DB2 uses the results
to update the aggregate statistics for the entire table space. You should either run
RUNSTATS once on the entire object before collecting statistics on separate
partitions or use the appropriate option to ensure that the statistics are aggregated
appropriately, especially if some partitions are not loaded with data. For
recommendations about running RUNSTATS on separate partitions, see “Gathering
and updating statistics” on page 510.
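For example, a statement like the following sketch, in which the table space name
is only illustrative, collects statistics for a single partition; DB2 uses the result to
update the aggregate statistics for the entire table space:
RUNSTATS TABLESPACE DSN8D91A.DSN8S91E PART 3 TABLE (ALL)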
Setting default statistics for created temporary tables
You can establish default statistical values for a particular created temporary
table if you can estimate its normal cardinality and number of pages.
About this task
When preparing an SQL statement that refers to a created temporary table, if the
table has been instantiated, DB2 uses the cardinality and number of pages that are
maintained for that table in storage. If the table has not been instantiated, DB2
looks at the CARDF and NPAGES columns of the SYSTABLES row for the created
temporary table. These values are normally -1 because RUNSTATS cannot run
against a created temporary table.
You can manually update the values in the CARDF and NPAGES columns of the
SYSTABLES row for the created temporary table. These values become the default
values that are used if more accurate values are not available or if more accurate
values cannot be used. The more accurate values are available only for dynamic
SQL statements that are prepared after the instantiation of the created temporary
table, but within the same unit of work. These more accurate values are not used if
the result of the dynamic bind is destined for the dynamic statement cache.
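For example, a statement like the following sketch establishes default values; the
table name and the statistical values are only illustrative:
UPDATE SYSIBM.SYSTABLES
  SET CARDF = 10000, NPAGES = 200
  WHERE CREATOR = 'DSN8910'
    AND NAME = 'TEMP_ORDERS';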
History statistics
Several catalog tables provide historical statistics for other catalog tables.
These catalog history tables include:
v SYSIBM.SYSCOLDIST_HIST
v SYSIBM.SYSCOLUMNS_HIST
v SYSIBM.SYSINDEXES_HIST
v SYSIBM.SYSINDEXPART_HIST
v SYSIBM.SYSINDEXSTATS_HIST
v SYSIBM.SYSKEYTARGETS_HIST
v SYSIBM.SYSKEYTGTDIST_HIST
v SYSIBM.SYSLOBSTATS_HIST
v SYSIBM.SYSTABLEPART_HIST
v SYSIBM.SYSTABLES_HIST
v SYSIBM.SYSTABSTATS_HIST
For example, SYSIBM.SYSTABLEPART_HIST provides statistics for activity in
SYSIBM.SYSTABLEPART, SYSIBM.SYSTABLES_HIST provides statistics for
activity in SYSIBM.SYSTABLES, and so on.
When DB2 adds or changes rows in a catalog table, DB2 might also write
information about the rows to the corresponding catalog history table. Although
the catalog history tables are not identical to their counterpart tables, they do
contain the same columns for access path information and space utilization
information. The history statistics provide a way to study trends, to determine
when utilities, such as REORG, should be run for maintenance, and to aid in space
management.
SYSIBM.SYSCOLDIST_HIST
Table 40. Historical statistics columns in the SYSCOLDIST_HIST catalog table. All
of the following columns provide access path statistics¹; none provide space
statistics.
CARDF
    For TYPE='C', the number of distinct values gathered in the column group;
    for TYPE='H', the number of distinct values in the column group of the
    interval indicated by the value in the QUANTILENO column.
COLGROUPCOLNO
    Identifies the columns involved in multi-column statistics.
COLVALUE
    Frequently occurring value in the key distribution.
FREQUENCYF
    A number that, when multiplied by 100, gives the percentage of rows that
    contain the value of COLVALUE; for TYPE='H', the percentage of rows with
    the value of COLVALUE that fall into the range between LOWVALUE and
    HIGHVALUE for the interval indicated by the value of the QUANTILENO
    column.
HIGHVALUE
    For TYPE='H', the value of the high bound for the interval indicated by the
    value of the QUANTILENO column.
LOWVALUE
    For TYPE='H', the value of the low bound for the interval indicated by the
    value of the QUANTILENO column.
NUMCOLUMNS
    Number of columns involved in multi-column statistics.
TYPE
    The type of statistics gathered:
    C    Cardinality
    F    Frequent value
    P    Non-padded
    H    Histogram statistics
QUANTILENO
    For histogram statistics, the ordinal sequence number of the quantile in the
    whole consecutive value range, from low to high.
SYSIBM.SYSCOLUMNS_HIST
Table 41. Historical statistics columns in the SYSCOLUMNS_HIST catalog table.
All of the following columns provide access path statistics¹; none provide space
statistics.
COLCARDF
    Estimated number of distinct values in the column.
HIGH2KEY
    Second highest value of the column, or blank.
LOW2KEY
    Second lowest value of the column, or blank.
SYSIBM.SYSINDEXES_HIST
Table 42. Historical statistics columns in the SYSINDEXES_HIST catalog table. All
of the following columns provide access path statistics¹; none provide space
statistics.
CLUSTERING
    Whether the index was created with CLUSTER.
CLUSTERRATIOF
    A number that, when multiplied by 100, gives the percentage of rows in
    clustering order.
FIRSTKEYCARDF
    Number of distinct values in the first key column.
FULLKEYCARDF
    Number of distinct values in the full key.
NLEAF
    Number of active leaf pages.
NLEVELS
    Number of levels in the index tree.
DATAREPEATFACTORF
    The number of times that data pages are repeatedly scanned after the index
    key is ordered. This number is -1 if statistics have not been collected. Valid
    values are -1 or any value that is equal to or greater than 1.
SYSIBM.SYSINDEXPART_HIST
Table 43. Historical statistics columns in the SYSINDEXPART_HIST catalog table.
All of the following columns provide space statistics and none provide access
path statistics¹, except CARDF, which provides neither.
CARDF
    Number of rows or LOBs referenced.
DSNUM
    Number of data sets.
EXTENTS
    Number of data set extents (for multiple pieces, the value is for the extents
    in the last data set).
FAROFFPOSF
    Number of rows referenced far from the optimal position.
LEAFDIST
    100 times the number of pages between successive leaf pages.
LEAFFAR
    Number of leaf pages located physically far away from previous leaf pages
    for successive active leaf pages accessed in an index scan.
LEAFNEAR
    Number of leaf pages located physically near previous leaf pages for
    successive active leaf pages.
NEAROFFPOSF
    Number of rows referenced near but not at the optimal position.
PQTY
    Primary space allocation in 4K blocks for the data set.
PSEUDO_DEL_ENTRIES
    Number of pseudo-deleted keys.
SECQTYI
    Secondary space allocation in 4K blocks for the data set.
SPACEF
    Disk storage in KB.
SYSIBM.SYSINDEXSTATS_HIST
Table 44. Historical statistics columns in the SYSINDEXSTATS_HIST catalog table.
All of the following columns provide access path statistics¹; none provide space
statistics.
CLUSTERRATIOF
    A number that, when multiplied by 100, gives the percentage of rows in
    clustering order.
FIRSTKEYCARDF
    Number of distinct values of the first key column.
FULLKEYCARDF
    Number of distinct values of the full key.
KEYCOUNTF
    Total number of rows in the partition.
NLEAF
    Number of leaf pages.
NLEVELS
    Number of levels in the index tree.
DATAREPEATFACTORF
    The number of times that data pages are repeatedly scanned after the index
    key is ordered. This number is -1 if statistics have not been collected. Valid
    values are -1 or any value that is equal to or greater than 1.
SYSIBM.SYSKEYTARGETS_HIST
Table 45. Historical statistics columns in the SYSKEYTARGETS_HIST catalog
table. All of the following columns provide access path statistics¹; none provide
space statistics.
KEYCARDF
    For type C statistics, the number of distinct values for the key-target.
HIGH2KEY
    The second highest key value.
LOW2KEY
    The second lowest key value.
STATS_FORMAT
    Type of statistics gathered:
    blank    No statistics have been collected, or VARCHAR column statistical
             values are padded
    N        VARCHAR statistical values are not padded
SYSIBM.SYSKEYTGTDIST_HIST
Table 46. Historical statistics columns in the SYSKEYTGTDIST_HIST catalog
table. All of the following columns provide access path statistics¹; none provide
space statistics.
CARDF
    For TYPE='C', the number of distinct values gathered in the key group; for
    TYPE='H', the number of distinct values in the key group of the interval
    indicated by the value in the QUANTILENO column.
KEYGROUPKEYNO
    Identifies the keys involved in multi-key statistics.
KEYVALUE
    Frequently occurring value in the key distribution.
HIGHVALUE
    For TYPE='H', the value of the high bound for the interval indicated by the
    value of the QUANTILENO column.
FREQUENCYF
    A number that, when multiplied by 100, gives the percentage of rows that
    contain the value of KEYVALUE; for TYPE='H', the percentage of rows
    with the value of KEYVALUE that fall into the range between LOWVALUE
    and HIGHVALUE for the interval indicated by the value of the
    QUANTILENO column.
LOWVALUE
    For TYPE='H', the value of the low bound for the interval indicated by the
    value of the QUANTILENO column.
NUMKEYS
    Number of keys involved in multi-key statistics.
TYPE
    The type of statistics gathered:
    C    Cardinality
    F    Frequent value
    P    Non-padded
    H    Histogram statistics
QUANTILENO
    For histogram statistics, the ordinal sequence number of the quantile in the
    whole consecutive value range, from low to high.
SYSIBM.SYSLOBSTATS_HIST
Table 47. Historical statistics columns in the SYSLOBSTATS_HIST catalog table.
All of the following columns provide space statistics; none provide access path
statistics¹.
FREESPACE
    The amount of free space in the LOB table space.
ORGRATIO
    The percentage of organization in the LOB table space. A value of 100
    indicates perfect organization of the LOB table space. A value of 1 indicates
    that the LOB table space is disorganized. A value of 0.00 indicates that the
    LOB table space is totally disorganized. An empty LOB table space has an
    ORGRATIO value of 100.00.
SYSIBM.SYSTABLEPART_HIST
Table 48. Historical statistics columns in the SYSTABLEPART_HIST catalog table.
All of the following columns provide space statistics; none provide access path
statistics¹.
CARDF
    Number of rows in the table space or partition.
DSNUM
    Number of data sets.
EXTENTS
    Number of data set extents (for multiple pieces, the value is for the extents
    in the last data set).
FARINDREF
    Number of rows relocated far from their original position.
NEARINDREF
    Number of rows relocated near their original position.
PAGESAVE
    Percentage of pages saved by data compression.
PERCACTIVE
    Percentage of space occupied by active pages.
PERCDROP
    Percentage of space occupied by pages from dropped tables.
PQTY
    Primary space allocation in 4K blocks for the data set.
SECQTYI
    Secondary space allocation in 4K blocks for the data set.
SPACEF
    The number of KB of space currently used.
SYSIBM.SYSTABLES_HIST
Table 49. Historical statistics columns in the SYSTABLES_HIST catalog table
AVGROWLEN (access path statistics¹: No; space statistics: Yes)
    Average row length of the table specified in the table space.
CARDF (access path statistics¹: Yes; space statistics: No)
    Number of rows in the table or number of LOBs in an auxiliary table.
NPAGESF (access path statistics¹: Yes; space statistics: No)
    Number of pages used by the table.
PCTPAGES (access path statistics¹: No; space statistics: Yes)
    Percentage of pages that contain rows.
PCTROWCOMP (access path statistics¹: Yes; space statistics: No)
    Percentage of active rows compressed.
SYSIBM.SYSTABSTATS_HIST
Table 50. Historical statistics columns in the SYSTABSTATS_HIST catalog table.
All of the following columns provide access path statistics¹; none provide space
statistics.
CARDF
    Number of rows in the partition.
NPAGES
    Total number of pages with rows.
Notes:
1. The access path statistics in the history tables are collected for historical
purposes and are not used for access path selection.
Additional statistics that provide index costs
Certain statistics in the SYSINDEXES catalog table also provide information about
costs associated with index processing.
The following columns of the SYSIBM.SYSINDEXES catalog table provide
cost information for index processing:
FIRSTKEYCARDF
The number of distinct values of the first index key column. When an
indexable equal predicate is specified on the first index key column,
1/FIRSTKEYCARDF is the filter factor for the predicate and the index. The
higher the number, the lower the cost.
FULLKEYCARDF
The number of distinct values for the entire index key. When indexable
equal predicates are specified on all the index key columns,
1/FULLKEYCARDF is the filter factor for the predicates and the index.
The higher the number, the lower the cost.
When the number of matching columns is greater than 1 and less than the
number of index key columns, the filter factor for the index lies between
1/FIRSTKEYCARDF and 1/FULLKEYCARDF.
NLEAF
The number of active leaf pages in the index. NLEAF is a portion of the
cost to scan the index. The smaller the number, the lower the cost. The cost
is also lower when the filtering of the index is high, which derives from
FIRSTKEYCARDF, FULLKEYCARDF, and other indexable predicates.
NLEVELS
The number of levels in the index tree. NLEVELS is another portion of the
cost to traverse the index. The same conditions as NLEAF apply. The
smaller the number, the lower the cost.
DATAREPEATFACTORF
The number of times that data pages are repeatedly scanned after the
index key is ordered.
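You can query the catalog to inspect these cost statistics. The following sketch, in
which the table creator and name are only illustrative, lists the cost-related
columns for the indexes on one table:
SELECT NAME, FIRSTKEYCARDF, FULLKEYCARDF, NLEAF, NLEVELS,
       DATAREPEATFACTORF
  FROM SYSIBM.SYSINDEXES
  WHERE TBCREATOR = 'DSN8910'
    AND TBNAME = 'EMP';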
Modeling your production system
To see what access paths your production queries use, you can update the catalog
statistics on your test system to be the same as your production system.
About this task
To do that, run RUNSTATS on your production tables to get current statistics for
access path selection. Then retrieve them and use them to build SQL statements to
update the catalog of the test system.
Example: You can use queries similar to the following queries to build those
statements. To successfully model your production system, the table definitions
must be the same on both the test and production systems. For example, they must
have the same creator, name, indexes, number of partitions, and so on.
Use the following statements to update SYSTABLESPACE, SYSTABLES,
SYSINDEXES, and SYSCOLUMNS:
SELECT DISTINCT 'UPDATE SYSIBM.SYSTABLESPACE SET NACTIVEF='
CONCAT STRIP(CHAR(NACTIVEF))
CONCAT ',NACTIVE=' CONCAT STRIP(CHAR(NACTIVE))
CONCAT ' WHERE NAME=''' CONCAT TS.NAME
CONCAT ''' AND DBNAME =''' CONCAT TS.DBNAME CONCAT '''*'
FROM SYSIBM.SYSTABLESPACE TS, SYSIBM.SYSTABLES TBL
WHERE TS.NAME = TSNAME
AND TBL.CREATOR IN (creator_list)
AND TBL.NAME IN (table_list)
AND (NACTIVEF >= 0 OR NACTIVE >= 0);
SELECT 'UPDATE SYSIBM.SYSTABLES SET CARDF='
CONCAT STRIP(CHAR(CARDF))
CONCAT ',NPAGES=' CONCAT STRIP(CHAR(NPAGES))
CONCAT ',PCTROWCOMP=' CONCAT STRIP(CHAR(PCTROWCOMP))
CONCAT ' WHERE NAME=''' CONCAT NAME
CONCAT ''' AND CREATOR =''' CONCAT CREATOR CONCAT '''*'
FROM SYSIBM.SYSTABLES WHERE
CREATOR IN (creator_list)
AND NAME IN (table_list)
AND CARDF >= 0;
SELECT 'UPDATE SYSIBM.SYSINDEXES SET FIRSTKEYCARDF='
CONCAT STRIP(CHAR(FIRSTKEYCARDF))
CONCAT ',FULLKEYCARDF=' CONCAT STRIP(CHAR(FULLKEYCARDF))
CONCAT ',NLEAF=' CONCAT STRIP(CHAR(NLEAF))
CONCAT ',NLEVELS=' CONCAT STRIP(CHAR(NLEVELS))
CONCAT ',CLUSTERRATIO=' CONCAT STRIP(CHAR(CLUSTERRATIO))
CONCAT ',CLUSTERRATIOF=' CONCAT STRIP(CHAR(CLUSTERRATIOF))
CONCAT ',DATAREPEATFACTORF=' CONCAT STRIP(CHAR(DATAREPEATFACTORF))
CONCAT ' WHERE NAME=''' CONCAT NAME
CONCAT ''' AND CREATOR =''' CONCAT CREATOR CONCAT '''*'
FROM SYSIBM.SYSINDEXES
WHERE TBCREATOR IN (creator_list)
AND TBNAME IN (table_list)
AND FULLKEYCARDF >= 0;
SELECT 'UPDATE SYSIBM.SYSCOLUMNS SET COLCARDF='
CONCAT STRIP(CHAR(COLCARDF))
CONCAT ',HIGH2KEY= X''' CONCAT HEX(HIGH2KEY)
CONCAT ''',LOW2KEY= X''' CONCAT HEX(LOW2KEY)
CONCAT ''' WHERE TBNAME=''' CONCAT TBNAME CONCAT ''' AND COLNO='
CONCAT STRIP(CHAR(COLNO))
CONCAT ' AND TBCREATOR =''' CONCAT TBCREATOR CONCAT '''*'
FROM SYSIBM.SYSCOLUMNS
WHERE TBCREATOR IN (creator_list)
AND TBNAME IN (table_list)
AND COLCARDF >= 0;
SYSTABSTATS and SYSCOLDIST require deletes and inserts.
Delete statistics from SYSTABSTATS on the test subsystem for the specified tables
by using the following statement:
DELETE FROM (TEST_SUBSYSTEM).SYSTABSTATS
WHERE OWNER IN (creator_list)
AND NAME IN (table_list);
Use INSERT statements to repopulate SYSTABSTATS with production statistics that
are generated from the following statement:
SELECT 'INSERT INTO SYSIBM.SYSTABSTATS'
CONCAT '(CARD,NPAGES,PCTPAGES,NACTIVE,PCTROWCOMP'
CONCAT ',STATSTIME,IBMREQD,DBNAME,TSNAME,PARTITION'
CONCAT ',OWNER,NAME,CARDF) VALUES('
CONCAT STRIP(CHAR(CARD))
CONCAT ' ,'
CONCAT STRIP(CHAR(NPAGES))
CONCAT ' ,'
CONCAT STRIP(CHAR(PCTPAGES))
CONCAT ' ,'
CONCAT STRIP(CHAR(NACTIVE))
CONCAT ' ,'
CONCAT STRIP(CHAR(PCTROWCOMP))
CONCAT ' ,'
CONCAT '''' CONCAT CHAR(STATSTIME)
CONCAT ''' ,'
CONCAT '''' CONCAT IBMREQD
CONCAT ''' ,'
CONCAT '''' CONCAT STRIP(DBNAME)
CONCAT ''' ,'
CONCAT '''' CONCAT STRIP(TSNAME)
CONCAT ''' ,'
CONCAT STRIP(CHAR(PARTITION))
CONCAT ' ,'
CONCAT '''' CONCAT STRIP(OWNER)
CONCAT ''' ,'
CONCAT '''' CONCAT STRIP(NAME)
CONCAT ''' ,'
CONCAT STRIP(CHAR(CARDF))
CONCAT ')*'
FROM SYSIBM.SYSTABSTATS
WHERE OWNER IN (creator_list)
AND NAME IN (table_list);
Delete statistics from SYSCOLDIST on the test subsystem for the specified
tables by using the following statement:
DELETE FROM (TEST_SUBSYSTEM).SYSCOLDIST
WHERE TBOWNER IN (creator_list)
AND TBNAME IN (table_list);
Use INSERT statements to repopulate SYSCOLDIST with production
statistics that are generated from the following statement:
SELECT 'INSERT INTO SYSIBM.SYSCOLDIST '
CONCAT '(FREQUENCY,STATSTIME,IBMREQD,TBOWNER'
CONCAT ',TBNAME,NAME,COLVALUE,TYPE,CARDF,COLGROUPCOLNO'
CONCAT ',NUMCOLUMNS,FREQUENCYF) VALUES( '
CONCAT STRIP(CHAR(FREQUENCY))
CONCAT ' ,'
CONCAT '''' CONCAT CHAR(STATSTIME)
CONCAT ''' ,'
CONCAT '''' CONCAT IBMREQD
CONCAT ''' ,'
CONCAT '''' CONCAT STRIP(TBOWNER)
CONCAT ''' ,'
CONCAT '''' CONCAT STRIP(TBNAME)
CONCAT ''','
CONCAT '''' CONCAT STRIP(NAME)
CONCAT ''' ,'
CONCAT 'X''' CONCAT STRIP(HEX(COLVALUE)) CONCAT ''' ,'
CONCAT '''' CONCAT TYPE
CONCAT ''' ,'
CONCAT STRIP(CHAR(CARDF))
CONCAT ' ,'
CONCAT 'X''' CONCAT STRIP(HEX(COLGROUPCOLNO)) CONCAT ''' ,'
CONCAT CHAR(NUMCOLUMNS)
CONCAT ' ,'
CONCAT STRIP(CHAR(FREQUENCYF))
CONCAT ')*'
FROM SYSIBM.SYSCOLDIST
WHERE TBOWNER IN (creator_list)
AND TBNAME IN (table_list);
Note about SPUFI:
v If you use SPUFI to execute the preceding SQL statements, you might need to
increase the default maximum character column width to avoid truncation.
v Asterisks (*) appear in the examples to avoid having the semicolon interpreted
as the end of the SQL statement. Edit the result to change the asterisk to a
semicolon.
Access path differences from test to production: When you bind applications on the
test system with production statistics, access paths should be similar to, but still
might differ from, the access paths that you see when the same query is bound on
your production system. The access paths can differ from test to production for the
following reasons:
v The processor models are different.
v The number of processors is different. (Differences in the number of processors
can affect the degree of parallelism that is obtained.)
v The buffer pool sizes are different.
v The RID pool sizes are different.
v Data in SYSIBM.SYSCOLDIST is mismatched. (This mismatch occurs only if
some of the previously mentioned steps are not followed exactly.)
v The service levels are different.
v The values of optimization subsystem parameters, such as STARJOIN,
NPGTHRSH, and PARAMDEG (MAX DEGREE on installation panel DSNTIP8),
are different.
v The use of techniques such as optimization hints and volatile tables is different.
If your production system is accessible from your test system, you can use
DB2 PM EXPLAIN on your test system to request EXPLAIN information from
your production system. This request can reduce the need to simulate a production
system by updating the catalog.
You can also use IBM Data Studio or Optimization Service Center for DB2 for
z/OS to compare access plan graphs and capture information about your DB2
environment. IBM Data Studio replaces Optimization Service Center, which is
deprecated.
Related concepts
“Investigating SQL performance with EXPLAIN” on page 421
“Using EXPLAIN to capture information about SQL statements.” on page 514
Real-time statistics
The following topics provide detailed information about the real-time statistics
tables.
Setting up your system for real-time statistics
DB2 always generates in-memory statistics for each table space and index space in
your system, including catalog objects. However, only certain global values are
generated for the SYSRTS table space, and no incremental counters are maintained.
About this task
No statistics are generated for the directory. For partitioned spaces, DB2 generates
information for each partition. However, you need to set the interval for writing
statistics and establish base values for statistics.
Setting the interval for writing real-time statistics
You can set the interval for writing real-time statistics when you install DB2, and
you can subsequently update that interval online.
About this task
The installation field is REAL TIME STATS on panel DSNTIPO. In a data sharing
environment, each member has its own interval for writing real-time statistics.
Procedure
To update the interval:
Modify the value of the STATSINT subsystem parameter. The default interval is
30 minutes.
Establishing base values for real-time statistics
Many columns in the real-time statistics tables show the number of times an
operation was performed between the last time a particular utility was run and
when the real-time statistics are written.
About this task
For example, STATSINSERT in SYSTABLESPACESTATS indicates the number of
records or LOBs that have been inserted after the last RUNSTATS utility was run
on the table space or partition. Therefore, for each object for which you want
real-time statistics, run the appropriate utility (REORG, RUNSTATS, LOAD
REPLACE, REBUILD INDEX, or COPY) to establish a base value from which the
delta value can be calculated.
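For example, running RUNSTATS as in the following sketch, in which the object
names are only illustrative, establishes a base value; afterward, STATSINSERT
reflects only the inserts that occur after that run:
RUNSTATS TABLESPACE DSN8D91A.DSN8S91E TABLE (ALL)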
Contents of the real-time statistics tables
Real-time statistics tables contain statistics for indexes and table spaces.
Related reference
SYSIBM.SYSINDEXSPACESTATS table (DB2 SQL)
SYSIBM.SYSTABLESPACESTATS table (DB2 SQL)
Operating with real-time statistics
To use the real-time statistics effectively, you need to understand when DB2
collects and externalizes them, and what factors in your system can affect the
statistics.
DSNACCOX stored procedure
The DB2 real-time statistics stored procedure (DSNACCOX) is a sample stored
procedure that makes recommendations to help you maintain your DB2 databases.
The DSNACCOX stored procedure represents an enhancement to the DSNACCOR
stored procedure and provides the following improvements:
v Improved recommendations
v New fields
v New formulas
v The option to choose the formula for making recommendations
However, the DSNACCOX stored procedure requires that the real-time statistics
tables are migrated to the DB2 catalog, which requires DB2 Version 9.1 for z/OS
new-function mode. Otherwise, use the DSNACCOR stored procedure instead.
In particular, DSNACCOX performs the following actions:
v Recommends when you should reorganize, image copy, or update statistics for
table spaces or index spaces
v Indicates when a data set has exceeded a specified threshold for the number of
extents that it occupies.
v Indicates whether objects are in a restricted state
DSNACCOX uses data from catalog tables, including real-time statistics tables, to
make its recommendations. DSNACCOX provides its recommendations in a result
set.
DSNACCOX uses the set of criteria that are shown in “DSNACCOX formulas for
recommending actions” on page 185 to evaluate table spaces and index spaces. By
default, DSNACCOX evaluates all table spaces and index spaces in the subsystem
that have entries in the real-time statistics tables. However, you can override this
default through input parameters.
Important information about DSNACCOX recommendations:
v DSNACCOX makes recommendations based on general formulas that require
input from the user about the maintenance policies for a subsystem. These
recommendations might not be accurate for every installation.
v If the real-time statistics tables contain information for only a small percentage
of your DB2 subsystem, the recommendations that DSNACCOX makes might
not be accurate for the entire subsystem.
v Before you perform any action that DSNACCOX recommends, ensure that the
object for which DSNACCOX makes the recommendation is available, and that
the recommended action can be performed on that object. For example, REORG
might be recommended for an object, but the object might be stopped.
Environment
DSNACCOX must run in a WLM-established stored procedure address space.
You should bind the package for DSNACCOX with isolation UR to avoid lock
contention. You can find the installation steps for DSNACCOX in job DSNTIJSG.
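For example, a bind similar to the following sketch specifies isolation UR. The
collection ID DSNACC is an assumption that is based on the default LocalSchema
value; job DSNTIJSG performs the actual bind:
BIND PACKAGE(DSNACC) MEMBER(DSNACCOX) ISOLATION(UR)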
Authorization required
To execute the CALL DSNACCOX statement, the owner of the package or plan
that contains the CALL statement must have one or more of the following
privileges on each package that the stored procedure uses:
v The EXECUTE privilege on the package for DSNACCOX
v Ownership of the package
v PACKADM authority for the package collection
v SYSADM authority
The owner of the package or plan that contains the CALL statement must also
have:
v SELECT authority on catalog tables
v The DISPLAY system privilege
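For example, a statement like the following sketch grants the first privilege in the
list; the collection ID and the authorization ID are only illustrative:
GRANT EXECUTE ON PACKAGE DSNACC.DSNACCOX TO DBADM01;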
Syntax diagram
The following syntax diagram shows the CALL statement for invoking
DSNACCOX. Because the linkage convention for DSNACCOX is GENERAL WITH
NULLS, if you pass parameters in host variables, you need to include a null
indicator with every host variable. Null indicators for input host variables must be
initialized before you execute the CALL statement.
CALL DSNACCOX (
    QueryType, ObjectType, ICType, CatlgSchema, LocalSchema, ChkLvl,
    Criteria, Unused,
    CRUpdatedPagesPct, CRUpdatedPagesAbs, CRChangesPct, CRDaySncLastCopy,
    ICRUpdatedPagesPct, ICRUpdatedPagesAbs, ICRChangesPct, CRIndexSize,
    RRTInsertsPct, RRTInsertsAbs, RRTDeletesPct, RRTDeletesAbs,
    RRTUnclustInsPct, RRTDisorgLOBPct, RRTDataSpaceRat, RRTMassDelLimit,
    RRTIndRefLimit,
    RRIInsertsPct, RRIInsertsAbs, RRIDeletesPct, RRIDeletesAbs,
    RRIAppendInsertPct, RRIPseudoDeletePct, RRIMassDelLimit, RRILeafLimit,
    RRINumLevelsLimit,
    SRTInsDelUpdPct, SRTInsDelUpdAbs, SRTMassDelLimit,
    SRIInsDelUpdPct, SRIInsDelUpdAbs, SRIMassDelLimit,
    ExtentLimit,
    LastStatement, ReturnCode, ErrorMsg, IFCARetCode, IFCAResCode, XsBytes )
You can specify NULL for any input parameter to use its default value. For most
of the numeric criteria, a negative value, such as -1, turns the criterion off.
Option descriptions
In the following option descriptions, the default value for an input parameter is
the value that DSNACCOX uses if you specify a null value.
QueryType
Specifies the types of actions that DSNACCOX recommends. This field
contains one or more of the following values. Each value is enclosed in single
quotation marks and separated from other values by a space.
ALL
Makes recommendations for all of the following actions.
COPY Makes a recommendation on whether to perform an image copy.
RUNSTATS
Makes a recommendation on whether to perform RUNSTATS.
174
Performance Monitoring and Tuning Guide
REORG
Makes a recommendation on whether to perform REORG. Choosing
this value causes DSNACCOX to process the EXTENTS value also.
EXTENTS
Indicates when data sets have exceeded a user-specified extents limit.
RESTRICT
Indicates which objects are in a restricted state.
DSNACCOX recommends REORG on the table space when one of the
following conditions is true, and REORG (or ALL) is also specified for
the value of QUERYTYPE:
v The table space is in REORG-pending status.
v The table space is in advisory REORG-pending status as the result of
an ALTER TABLE statement.
DSNACCOX recommends REORG on the index when one of the following
conditions is true, and REORG (or ALL) is also specified for the value
of QUERYTYPE:
v The index is in REORG-pending status.
v The index is in advisory REORG-pending status as the result of an
ALTER TABLE statement.
DSNACCOX recommends FULL COPY on the table space when one of the
following conditions is true, and COPY (or ALL) is also specified for
the value of QUERYTYPE:
v The table space is in COPY-pending status.
v The table space is in informational COPY-pending status.
DSNACCOX recommends FULL COPY on the index when one of the
following conditions is true, COPY (or ALL) is also specified for the
value of QUERYTYPE, and SYSINDEXES.COPY='Y':
v The index is in COPY-pending status.
v The index is in informational COPY-pending status.
QueryType is an input parameter of type VARCHAR(40). The default value is
ALL.
ObjectType
Specifies the types of objects for which DSNACCOX recommends actions:
ALL
Table spaces and index spaces.
TS
Table spaces only.
IX
Index spaces only.
ObjectType is an input parameter of type VARCHAR(3). The default value is
ALL.
ICType
Specifies the types of image copies for which DSNACCOX is to make
recommendations:
F
Full image copy.
I
Incremental image copy. This value is valid for table spaces only.
B
Full image copy or incremental image copy.
ICType is an input parameter of type VARCHAR(1). The default is B.
CatlgSchema
Specifies the qualifier for DB2 catalog table names. CatlgSchema is an input
parameter of type VARCHAR(128). The default value is SYSIBM.
LocalSchema
Specifies the qualifier for the names of local tables that DSNACCOX references.
LocalSchema is an input parameter of type VARCHAR(128). The default value is
DSNACC.
ChkLvl
Specifies the types of checking that DSNACCOX performs, and indicates
whether to include objects that fail those checks in the DSNACCOX
recommendations result set. This value is the sum of any combination of the
following values:
0
DSNACCOX performs none of the following actions.
1
Exclude rows from the DSNACCOX recommendations result set for
RUNSTATS on:
v Index spaces that are related to tables that are defined as VOLATILE.
v Table spaces for which all of the tables are defined as VOLATILE.
2
Reserved for future use.
4
Check whether rows that are in the DSNACCOX recommendations
result set refer to objects that are in the exception table. For
recommendations result set rows that have corresponding exception
table rows, copy the contents of the QUERYTYPE column of the
exception table to the INEXCEPTTABLE column of the
recommendations result set.
8
Check whether objects that have rows in the recommendations result
set are restricted. Indicate the restricted status in the OBJECTSTATUS
column of the result set.
Important: A row is inserted for objects in the restricted state.
16
Reserved for future use.
32
Exclude rows from the DSNACCOX recommendations result set for
index spaces for which the related table spaces have been
recommended for REORG or RUNSTATS.
64
For index spaces that are listed in the DSNACCOX recommendations
result set, check whether the related table spaces are listed in the
exception table. For recommendations result set rows that have
corresponding exception table rows, copy the contents of the
QUERYTYPE column of the exception table to the INEXCEPTTABLE
column of the recommendations result set. Specifying the value 64 also
activates the values 32 and 4.
ChkLvl is an input parameter of type INTEGER. The default is 5 (values 1+4).
Criteria
Narrows the set of objects for which DSNACCOX makes recommendations.
This value is the search condition of an SQL WHERE clause. Criteria is an input
parameter of type VARCHAR(4096). The default is that DSNACCOX makes
recommendations for all table spaces and index spaces in the subsystem. The
search condition can use any column in the result set, and wildcards are
allowed.
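For example, to limit recommendations to objects in databases whose names
begin with DSN8, you might pass a Criteria value such as the following sketch,
which assumes that DBNAME is a column of the recommendations result set:
DBNAME LIKE 'DSN8%'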
Unused
A parameter that is reserved for future use. Specify the null value for this
parameter. Unused is an input parameter of type VARCHAR(80).
CRUpdatedPagesPct
Specifies, when combined with CRUpdatedPagesAbs, a criterion for
recommending a full image copy on a table space or index space. If both of the
following conditions are true for a table space, DSNACCOX recommends an
image copy:
v The total number of distinct updated pages, divided by the total number of
preformatted pages (expressed as a percentage) is greater than
CRUpdatedPagesPct.
v The total number of distinct updated pages is greater than
CRUpdatedPagesAbs.
If all of the following conditions are true for an index space, DSNACCOX
recommends an image copy:
v The total number of distinct updated pages, divided by the total number of
preformatted pages (expressed as a percentage) is greater than
CRUpdatedPagesPct.
v The total number of distinct updated pages is greater than
CRUpdatedPagesAbs.
v The number of active pages in the index space or partition is greater than
CRIndexSize.
CRUpdatedPagesPct is an input parameter of type DOUBLE. The default is 20.0.
A negative value turns off both this criterion and CRUpdatedPagesAbs.
CRUpdatedPagesABS
Specifies, when combined with CRUpdatedPagesPct, a criterion for
recommending a full image copy on a table space or index space. If both of the
following conditions are true for a table space, DSNACCOX recommends an
image copy:
v The total number of distinct updated pages, divided by the total number of
preformatted pages (expressed as a percentage) is greater than
CRUpdatedPagesPct.
v The total number of distinct updated pages is greater than
CRUpdatedPagesAbs.
If all of the following conditions are true for an index space, DSNACCOX
recommends an image copy:
v The total number of distinct updated pages, divided by the total number of
preformatted pages (expressed as a percentage) is greater than
CRUpdatedPagesPct.
v The total number of distinct updated pages is greater than
CRUpdatedPagesAbs.
v The number of active pages in the index space or partition is greater than
CRIndexSize.
CRUpdatedPagesAbs is an input parameter of type INTEGER. The default value
is 0.
CRChangesPct
Specifies a criterion for recommending a full image copy on a table space or
index space. If the following condition is true for a table space, DSNACCOX
recommends an image copy:
The total number of insert, update, and delete operations since the last
image copy, divided by the total number of rows or LOBs in a table space
or partition (expressed as a percentage) is greater than CRChangesPct.
If both of the following conditions are true for an index space,
DSNACCOX recommends an image copy:
v The total number of insert and delete operations since the last image copy,
divided by the total number of entries in the index space or partition
(expressed as a percentage) is greater than CRChangesPct.
v The number of active pages in the index space or partition is greater than
CRIndexSize.
CRChangesPct is an input parameter of type DOUBLE. The default is 10.0. A
negative value turns off this criterion.
CRDaySncLastCopy
Specifies a criterion for recommending a full image copy on a table space or
index space. If the number of days since the last image copy is greater than
this value, DSNACCOX recommends an image copy.
CRDaySncLastCopy is an input parameter of type INTEGER. The default is 7. A
negative value turns off this criterion.
ICRUpdatedPagesPct
Specifies a criterion for recommending an incremental image copy on a table
space. If both of the following conditions are true, DSNACCOX recommends
an incremental image copy:
v The number of distinct pages that were updated since the last image copy,
divided by the total number of active pages in the table space or partition
(expressed as a percentage) is greater than ICRUpdatedPagesPct.
v The number of distinct pages that were updated since the last image copy is
greater than ICRUpdatedPagesAbs.
ICRUpdatedPagesPct is an input parameter of type DOUBLE. The default value
is 1.0. A negative value turns off this criterion and ICRUpdatedPagesAbs.
ICRUpdatedPagesAbs
Specifies, when combined with ICRUpdatedPagesPct, a criterion for
recommending an incremental image copy on a table space. If both of the
following conditions are true, DSNACCOX recommends an incremental image
copy:
v The number of distinct pages that were updated since the last image copy,
divided by the total number of active pages in the table space or partition
(expressed as a percentage) is greater than ICRUpdatedPagesPct.
v The number of distinct pages that were updated since the last image copy is
greater than ICRUpdatedPagesAbs.
ICRUpdatedPagesAbs is an input parameter of type INTEGER. The default is 0.
ICRChangesPct
Specifies a criterion for recommending an incremental image copy on a table
space. If the following condition is true, DSNACCOX recommends an
incremental image copy:
The ratio of the number of insert, update, or delete operations since the last
image copy, to the total number of rows or LOBs in a table space or
partition (expressed as a percentage) is greater than ICRChangesPct.
ICRChangesPct is an input parameter of type DOUBLE. The default is 1.0. A
negative value turns off this criterion.
CRIndexSize
Specifies the minimum index size before the CRUpdatedPagesPct or
CRChangesPct criteria are checked for recommending a full image copy on an
index space. CRIndexSize is an input parameter of type INTEGER. The default
is 50. A negative value turns off this criterion and the CRChangesPct criterion
for index spaces.
RRTInsertsPct
Specifies, when combined with RRTInsertsAbs, a criterion for recommending
that the REORG utility is to be run on a table space. If both of the following
conditions are true, DSNACCOX recommends running REORG:
v The sum of insert operations since the last REORG, divided by the total
number of rows or LOBs in the table space or partition (expressed as a
percentage) is greater than RRTInsertsPct.
v The sum of insert operations since the last REORG is greater than
RRTInsertsAbs.
RRTInsertsPct is an input parameter of type DOUBLE. The default value is 25.0.
A negative value turns off this criterion and RRTInsertsAbs.
RRTInsertsAbs
Specifies, when combined with RRTInsertsPct, a criterion for recommending
that the REORG utility is to be run on a table space. If both of the following
conditions are true, DSNACCOX recommends running REORG:
v The sum of insert operations since the last REORG, divided by the total
number of rows or LOBs in the table space or partition (expressed as a
percentage) is greater than RRTInsertsPct.
v The sum of insert operations since the last REORG is greater than
RRTInsertsAbs.
RRTInsertsAbs is an input parameter of type INTEGER. The default value is 0.
RRTDeletesPct
Specifies, when combined with RRTDeletesAbs, a criterion for recommending
that the REORG utility is to be run on a table space. If both of the following
conditions are true, DSNACCOX recommends running REORG:
v The sum of delete operations since the last REORG, divided by the total
number of rows or LOBs in the table space or partition (expressed as a
percentage) is greater than RRTDeletesPct.
v The sum of delete operations since the last REORG is greater than
RRTDeletesAbs.
RRTDeletesPct is an input parameter of type DOUBLE. The default value is 25.0.
A negative value turns off this criterion and RRTDeletesAbs.
RRTDeletesAbs
Specifies, when combined with RRTDeletesPct, a criterion for recommending
that the REORG utility is to be run on a table space. If both of the following
conditions are true, DSNACCOX recommends running REORG:
v The sum of delete operations since the last REORG, divided by the total
number of rows or LOBs in the table space or partition (expressed as a
percentage) is greater than RRTDeletesPct.
v The sum of delete operations since the last REORG is greater than
RRTDeletesAbs.
RRTDeletesAbs is an input parameter of type INTEGER. The default value is 0.
RRTUnclustInsPct
Specifies a criterion for recommending that the REORG utility is to be run on a
table space. If the following condition is true, DSNACCOX recommends
running REORG:
The number of unclustered insert operations, divided by the total number
of rows or LOBs in the table space or partition (expressed as a percentage)
is greater than RRTUnclustInsPct.
RRTUnclustInsPct is an input parameter of type DOUBLE. The default is 10.0.
A negative value turns off this criterion.
RRTDisorgLOBPct
Specifies a criterion for recommending that the REORG utility is to be run on a
table space. If the following condition is true, DSNACCOX recommends
running REORG:
The number of imperfectly chunked LOBs, divided by the total number of
rows or LOBs in the table space or partition (expressed as a percentage) is
greater than RRTDisorgLOBPct.
RRTDisorgLOBPct is an input parameter of type DOUBLE. The default is 50.0.
A negative value turns off this criterion.
RRTDataSpaceRat
Specifies a criterion for recommending that the REORG utility is to be run on
a table space for space reclamation. If the following condition is true,
DSNACCOX recommends running REORG:
The SPACE allocated is greater than RRTDataSpaceRat multiplied by the
actual space used. (SPACE > RRTDataSpaceRat × (DATASIZE/1024))
RRTDataSpaceRat is an input parameter of type DOUBLE. The default value is
2.0. A negative value turns off this criterion.
RRTMassDelLimit
Specifies a criterion for recommending that the REORG utility is to be run on a
table space. If one of the following values is greater than RRTMassDelLimit,
DSNACCOX recommends running REORG:
v The number of mass deletes from a segmented or LOB table space since the
last REORG or LOAD REPLACE
v The number of dropped tables from a nonsegmented table space since the
last REORG or LOAD REPLACE
RRTMassDelLimit is an input parameter of type INTEGER. The default is 0.
RRTIndRefLimit
Specifies a criterion for recommending that the REORG utility is to be run on a
table space. If the following value is greater than RRTIndRefLimit, DSNACCOX
recommends running REORG:
The total number of overflow records that were created since the last
REORG or LOAD REPLACE, divided by the total number of rows or LOBs
in the table space or partition (expressed as a percentage)
RRTIndRefLimit is an input parameter of type DOUBLE. The default is 5.0 in a
data sharing environment and 10.0 in a non-data-sharing environment.
RRIInsertsPct
Specifies a criterion for recommending that the REORG utility is to be run on
an index space. If both of the following conditions are true, DSNACCOX
recommends running REORG:
v The sum of the number of index entries that were inserted since the last
REORG, divided by the total number of index entries in the index space or
partition (expressed as a percentage) is greater than RRIInsertsPct.
v The sum of the number of index entries that were inserted since the last
REORG is greater than RRIInsertsAbs.
RRIInsertsPct is an input parameter of type DOUBLE. The default is 30.0. A
negative value turns off this criterion.
RRIInsertsAbs
Specifies a criterion for recommending that the REORG utility is to be run on
an index space. If both of the following conditions are true, DSNACCOX
recommends running REORG:
v The sum of the number of index entries that were inserted since the last
REORG, divided by the total number of index entries in the index space or
partition (expressed as a percentage) is greater than RRIInsertsPct.
v The sum of the number of index entries that were inserted since the last
REORG is greater than RRIInsertsAbs.
RRIInsertsAbs is an input parameter of type INTEGER. The default is 0. A
negative value turns off this criterion.
RRIDeletesPct
Specifies a criterion for recommending that the REORG utility is to be run on
an index space. If both of the following conditions are true, DSNACCOX
recommends running REORG:
v The sum of the number of index entries that were deleted since the last
REORG, divided by the total number of index entries in the index space or
partition (expressed as a percentage) is greater than RRIDeletesPct.
v The sum of the number of index entries that were deleted since the last
REORG is greater than RRIDeletesAbs.
RRIDeletesPct is an input parameter of type DOUBLE. The default is 30.0. A
negative value turns off this criterion.
RRIDeletesAbs
Specifies a criterion for recommending that the REORG utility is to be run on
an index space. If both of the following conditions are true, DSNACCOX
recommends running REORG:
v The sum of the number of index entries that were deleted since the last
REORG, divided by the total number of index entries in the index space or
partition (expressed as a percentage) is greater than RRIDeletesPct.
v The sum of the number of index entries that were deleted since the last
REORG is greater than RRIDeletesAbs.
RRIDeletesAbs is an input parameter of type INTEGER. The default is 0. A
negative value turns off this criterion.
RRIAppendInsertPct
Specifies a criterion for recommending that the REORG utility is to be run on
an index space. If the following value is greater than RRIAppendInsertPct,
DSNACCOX recommends running REORG:
The number of index entries that were inserted since the last REORG,
REBUILD INDEX, or LOAD REPLACE with a key value greater than the
maximum key value in the index space or partition, divided by the number
of index entries in the index space or partition (expressed as a percentage)
RRIAppendInsertPct is an input parameter of type DOUBLE. The default is 20.0.
A negative value turns off this criterion.
RRIPseudoDeletePct
Specifies a criterion for recommending that the REORG utility is to be run on
an index space. If the following value is greater than RRIPseudoDeletePct,
DSNACCOX recommends running REORG:
The number of index entries that were pseudo-deleted since the last
REORG, REBUILD INDEX, or LOAD REPLACE, divided by the number of
index entries in the index space or partition (expressed as a percentage)
RRIPseudoDeletePct is an input parameter of type DOUBLE. The default is 5.0
in a data sharing environment and 10.0 in a non-data-sharing environment. A
negative value turns off this criterion.
RRIMassDelLimit
Specifies a criterion for recommending that the REORG utility is to be run on
an index space. If the number of mass deletes from an index space or partition
since the last REORG, REBUILD, or LOAD REPLACE is greater than this
value, DSNACCOX recommends running REORG.
RRIMassDelLimit is an input parameter of type INTEGER. The default is 0. A
negative value turns off this criterion.
RRILeafLimit
Specifies a criterion for recommending that the REORG utility is to be run on
an index space. If the following value is greater than RRILeafLimit,
DSNACCOX recommends running REORG:
The number of index page splits that occurred since the last REORG,
REBUILD INDEX, or LOAD REPLACE in which the higher part of the split
page was far from the location of the original page, divided by the total
number of active pages in the index space or partition (expressed as a
percentage)
RRILeafLimit is an input parameter of type DOUBLE. The default is 10.0. A
negative value turns off this criterion.
RRINumLevelsLimit
Specifies a criterion for recommending that the REORG utility is to be run on
an index space. If the following value is greater than RRINumLevelsLimit,
DSNACCOX recommends running REORG:
The number of levels in the index tree that were added or removed since
the last REORG, REBUILD INDEX, or LOAD REPLACE
RRINumLevelsLimit is an input parameter of type INTEGER. The default is 0. A
negative value turns off this criterion.
SRTInsDelUpdPct
Specifies, when combined with SRTInsDelUpdAbs, a criterion for
recommending that the RUNSTATS utility is to be run on a table space. If both
of the following conditions are true, DSNACCOX recommends running
RUNSTATS:
v The number of insert, update, or delete operations since the last RUNSTATS
on a table space or partition, divided by the total number of rows or LOBs
in the table space or partition (expressed as a percentage) is greater than
SRTInsDelUpdPct.
v The number of insert, update, and delete operations since the last
RUNSTATS on a table space or partition is greater than SRTInsDelUpdAbs.
SRTInsDelUpdPct is an input parameter of type DOUBLE. The default is 20.0. A
negative value turns off this criterion.
SRTInsDelUpdAbs
Specifies, when combined with SRTInsDelUpdPct, a criterion for recommending
that the RUNSTATS utility is to be run on a table space. If both of the
following conditions are true, DSNACCOX recommends running RUNSTATS:
v The number of insert, update, and delete operations since the last
RUNSTATS on a table space or partition, divided by the total number of
rows or LOBs in the table space or partition (expressed as a percentage) is
greater than SRTInsDelUpdPct.
v The number of insert, update, and delete operations since the last
RUNSTATS on a table space or partition is greater than SRTInsDelUpdAbs.
SRTInsDelUpdAbs is an input parameter of type INTEGER. The default is 0.
SRTMassDelLimit
Specifies a criterion for recommending that the RUNSTATS utility is to be run
on a table space. If the following condition is true, DSNACCOX recommends
running RUNSTATS:
v The number of mass deletes from a table space or partition since the last
REORG or LOAD REPLACE is greater than SRTMassDelLimit.
SRTMassDelLimit is an input parameter of type INTEGER. The default is 0. A
negative value turns off this criterion.
SRIInsDelUpdPct
Specifies, when combined with SRIInsDelUpdAbs, a criterion for recommending
that the RUNSTATS utility is to be run on an index space. If both of the
following conditions are true, DSNACCOX recommends running RUNSTATS:
v The number of inserted and deleted index entries since the last RUNSTATS
on an index space or partition, divided by the total number of index entries
in the index space or partition (expressed as a percentage) is greater than
SRIInsDelUpdPct.
v The sum of the number of inserted and deleted index entries since the last
RUNSTATS on an index space or partition is greater than SRIInsDelUpdAbs.
SRIInsDelUpdPct is an input parameter of type DOUBLE. The default is 20.0. A
negative value turns off this criterion.
SRIInsDelUpdAbs
Specifies, when combined with SRIInsDelUpdPct, a criterion for recommending
that the RUNSTATS utility is to be run on an index space. If both of the
following conditions are true, DSNACCOX recommends running RUNSTATS:
v The number of inserted and deleted index entries since the last RUNSTATS
on an index space or partition, divided by the total number of index entries
in the index space or partition (expressed as a percentage) is greater than
SRIInsDelUpdPct.
v The sum of the number of inserted and deleted index entries since the last
RUNSTATS on an index space or partition is greater than SRIInsDelUpdAbs.
SRIInsDelUpdAbs is an input parameter of type INTEGER. The default is 0.
SRIMassDelLimit
Specifies a criterion for recommending that the RUNSTATS utility is to be run
on an index space. If the number of mass deletes from an index space or
partition since the last REORG, REBUILD INDEX, or LOAD REPLACE is
greater than this value, DSNACCOX recommends running RUNSTATS.
SRIMassDelLimit is an input parameter of type INTEGER. The default value is
0. A negative value turns off this criterion.
ExtentLimit
Specifies a criterion for recommending that the REORG utility is to be run on a
table space or index space. Also specifies that DSNACCOX is to warn the user
that the table space or index space has used too many extents. DSNACCOX
recommends running REORG, and altering data set allocations, if the following
condition is true:
v The number of physical extents in the index space, table space, or partition
is greater than ExtentLimit.
ExtentLimit is an input parameter of type INTEGER. The default value is 254.
A negative value turns off this criterion.
LastStatement
When DSNACCOX returns a severe error (return code 12), this field contains
the SQL statement that was executing when the error occurred. LastStatement is
an output parameter of type VARCHAR(8012).
ReturnCode
The return code from DSNACCOX execution. Possible values are:
0    DSNACCOX executed successfully.
4    DSNACCOX completed with a warning. The ErrorMsg parameter
     contains the input parameters that might be incompatible.
8    DSNACCOX terminated with errors. The ErrorMsg parameter contains
     a message that describes the error.
12   DSNACCOX terminated with severe errors. The ErrorMsg parameter
     contains a message that describes the error. The LastStatement
     parameter contains the SQL statement that was executing when the
     error occurred.
14   DSNACCOX terminated because the real-time statistics tables were not
     yet migrated to the catalog.
15   DSNACCOX terminated because it encountered a problem with one of
     the declared temporary tables that it defines and uses.
16   DSNACCOX terminated because it could not define a declared
     temporary table.
NULL DSNACCOX terminated but could not set a return code.
ReturnCode is an output parameter of type INTEGER.
ErrorMsg
Contains information about DSNACCOX execution when DSNACCOX
terminates with a non-zero value for ReturnCode. ErrorMsg is an output
parameter of type VARCHAR(1331).
IFCARetCode
Contains the return code from an IFI COMMAND call. DSNACCOX issues
commands through the IFI interface to determine the status of objects.
IFCARetCode is an output parameter of type INTEGER.
IFCAResCode
Contains the reason code from an IFI COMMAND call. IFCAResCode is an
output parameter of type INTEGER.
XsBytes
Contains the number of bytes of information that did not fit in the IFI return
area after an IFI COMMAND call. XsBytes is an output parameter of type
INTEGER.
DSNACCOX formulas for recommending actions
The following formulas specify the criteria that DSNACCOX uses for its
recommendations and warnings. The variables in italics are DSNACCOX input
parameters. The capitalized variables are columns of the
SYSIBM.SYSTABLESPACESTATS or SYSIBM.SYSINDEXSPACESTATS tables. The
numbers to the right of selected items are reference numbers for the option
descriptions in “Option descriptions” on page 174.
The figure below shows the formula that DSNACCOX uses to recommend a full
image copy on a table space.
(((QueryType=’COPY’ OR QueryType=’ALL’) AND
(ObjectType=’TS’ OR ObjectType=’ALL’) AND
(ICType=’F’ OR ICType=’B’)) AND
(COPYLASTTIME IS NULL OR
REORGLASTTIME>COPYLASTTIME OR
LOADRLASTTIME>COPYLASTTIME OR
(CURRENT DATE-COPYLASTTIME)>CRDaySncLastCopy OR
((COPYUPDATEDPAGES×100)/NACTIVE>CRUpdatedPagesPct AND
(COPYUPDATEDPAGES>CRUpdatedPagesAbs)) OR
(COPYCHANGES×100)/TOTALROWS>CRChangesPct)) OR
((QueryType=’RESTRICT’ OR QueryType=’ALL’) AND
(ObjectType=’TS’ OR ObjectType=’ALL’) AND
The table space is in COPY-pending status or informational COPY-pending status)
Figure 12. DSNACCOX formula for recommending a full image copy on a table space
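Example: You can approximate part of this check with a query against the
real-time statistics table. The following query is an illustrative sketch only, not
what DSNACCOX executes: it assumes the default SYSIBM qualifier, hard-codes
sample thresholds (7 days since the last copy, 20 percent updated pages), and
omits the restricted-status checks.
SELECT DBNAME, NAME, PARTITION
  FROM SYSIBM.SYSTABLESPACESTATS
  WHERE COPYLASTTIME IS NULL
     OR DAYS(CURRENT DATE) - DAYS(COPYLASTTIME) > 7
     OR (NACTIVE > 0
         AND (COPYUPDATEDPAGES * 100) / NACTIVE > 20
         AND COPYUPDATEDPAGES > 0);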
The figure below shows the formula that DSNACCOX uses to recommend a full
image copy on an index space.
(((QueryType=’COPY’ OR QueryType=’ALL’) AND
(ObjectType=’IX’ OR ObjectType=’ALL’) AND
(ICType=’F’ OR ICType=’B’)) AND
(SYSINDEXES.COPY = ’Y’)) AND
(COPYLASTTIME IS NULL OR
REORGLASTTIME>COPYLASTTIME OR
LOADRLASTTIME>COPYLASTTIME OR
REBUILDLASTTIME>COPYLASTTIME OR
(CURRENT DATE-COPYLASTTIME)>CRDaySncLastCopy OR
(NACTIVE>CRIndexSize AND
(((COPYUPDATEDPAGES×100)/NACTIVE>CRUpdatedPagesPct) AND
(COPYUPDATEDPAGES>CRUpdatedPagesAbs)) OR
(COPYCHANGES×100)/TOTALENTRIES>CRChangesPct)) OR
((QueryType=’RESTRICT’ OR QueryType=’ALL’) AND
(ObjectType=’IX’ OR ObjectType=’ALL’) AND
(SYSINDEXES.COPY = ’Y’) AND
The index space is in COPY-pending status or informational COPY-pending status))
Figure 13. DSNACCOX formula for recommending a full image copy on an index space
The figure below shows the formula that DSNACCOX uses to recommend an
incremental image copy on a table space.
((QueryType=’COPY’ OR QueryType=’ALL’) AND
(ObjectType=’TS’ OR ObjectType=’ALL’) AND
(ICType=’I’) AND
COPYLASTTIME IS NOT NULL) AND
(LOADRLASTTIME>COPYLASTTIME OR
REORGLASTTIME>COPYLASTTIME OR
(((COPYUPDATEDPAGES×100)/NACTIVE>ICRUpdatedPagesPct) AND
(COPYUPDATEDPAGES>ICRUpdatedPagesAbs)) OR
(COPYCHANGES×100)/TOTALROWS>ICRChangesPct)
Figure 14. DSNACCOX formula for recommending an incremental image copy on a table space
The figure below shows the formula that DSNACCOX uses to recommend a
REORG on a table space. If the table space is a LOB table space, and ChkLvl=1,
the formula does not include EXTENTS>ExtentLimit.
(((QueryType=’REORG’ OR QueryType=’ALL’) AND
(ObjectType=’TS’ OR ObjectType=’ALL’)) AND
(REORGLASTTIME IS NULL AND LOADRLASTTIME IS NULL) OR
(NACTIVE IS NULL OR NACTIVE > 5) AND
((((REORGINSERTS×100)/TOTALROWS>RRTInsertsPct) AND
REORGINSERTS>RRTInsertsAbs) OR
(((REORGDELETES×100)/TOTALROWS>RRTDeletesPct) AND
REORGDELETES>RRTDeletesAbs) OR
(REORGUNCLUSTINS×100)/TOTALROWS>RRTUnclustInsPct OR
(REORGDISORGLOB×100)/TOTALROWS>RRTDisorgLOBPct OR
(SPACE×1024)/DATASIZE>RRTDataSpaceRat OR
((REORGNEARINDREF+REORGFARINDREF)×100)/TOTALROWS>RRTIndRefLimit OR
REORGMASSDELETE>RRTMassDelLimit OR
EXTENTS>ExtentLimit)) OR
((QueryType=’RESTRICT’ OR QueryType=’ALL’) AND
(ObjectType=’TS’ OR ObjectType=’ALL’) AND
The table space is in advisory or informational reorg pending status))
Figure 15. DSNACCOX formula for recommending a REORG on a table space
The figure below shows the formula that DSNACCOX uses to recommend a
REORG on an index space.
(((QueryType=’REORG’ OR QueryType=’ALL’) AND
(ObjectType=’IX’ OR ObjectType=’ALL’) AND
(REORGLASTTIME IS NULL AND REBUILDLASTTIME IS NULL) OR
(NACTIVE IS NULL OR NACTIVE > 5) AND
((((REORGINSERTS×100)/TOTALENTRIES>RRIInsertsPct) AND
REORGINSERTS>RRIInsertsAbs) OR
(((REORGDELETES×100)/TOTALENTRIES>RRIDeletesPct) AND
REORGDELETES>RRIDeletesAbs) OR
(REORGAPPENDINSERT×100)/TOTALENTRIES>RRIAppendInsertPct OR
(REORGPSEUDODELETES×100)/TOTALENTRIES>RRIPseudoDeletePct OR
REORGMASSDELETE>RRIMassDeleteLimit OR
(REORGLEAFFAR×100)/NACTIVE>RRILeafLimit OR
REORGNUMLEVELS>RRINumLevelsLimit OR
EXTENTS>ExtentLimit)) OR
((QueryType=’RESTRICT’ OR QueryType=’ALL’) AND
(ObjectType=’IX’ OR ObjectType=’ALL’) AND
An index is in advisory-REBUILD-pending status (ARBDP)))
Figure 16. DSNACCOX formula for recommending a REORG on an index space
The figure below shows the formula that DSNACCOX uses to recommend
RUNSTATS on a table space.
((QueryType=’RUNSTATS’ OR QueryType=’ALL’) AND
(ObjectType=’TS’ OR ObjectType=’ALL’) AND
Table Space is not cloned) AND
(STATSLASTTIME IS NULL OR
STATSLASTTIME<LOADRLASTTIME OR
STATSLASTTIME<REORGLASTTIME OR
(((STATSINSERTS+STATSDELETES+STATSUPDATES)×100)/TOTALROWS>SRTInsDelUpdPct AND
(STATSINSERTS+STATSDELETES+STATSUPDATES)>SRTInsDelUpdAbs) OR
STATSMASSDELETE>SRTMassDelLimit)
Figure 17. DSNACCOX formula for recommending RUNSTATS on a table space
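Example: The staleness portion of this check can be approximated with a query
such as the following one. This is an illustrative sketch only, not what
DSNACCOX executes; it assumes the default SYSIBM qualifier and omits the
change-counter and mass-delete criteria.
SELECT DBNAME, NAME, PARTITION
  FROM SYSIBM.SYSTABLESPACESTATS
  WHERE STATSLASTTIME IS NULL
     OR STATSLASTTIME < LOADRLASTTIME
     OR STATSLASTTIME < REORGLASTTIME;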
The figure below shows the formula that DSNACCOX uses to recommend
RUNSTATS on an index space.
((QueryType=’RUNSTATS’ OR QueryType=’ALL’) AND
(ObjectType=’IX’ OR ObjectType=’ALL’) AND
Table Space for the index is not cloned) AND
(STATSLASTTIME IS NULL OR
STATSLASTTIME<LOADRLASTTIME OR
STATSLASTTIME<REORGLASTTIME OR
(((STATSINSERTS+STATSDELETES)×100)/TOTALENTRIES>SRIInsDelUpdPct AND
(STATSINSERTS+STATSDELETES)>SRIInsDelUpdAbs) OR
STATSMASSDELETE>SRIMassDelLimit)
Figure 18. DSNACCOX formula for recommending RUNSTATS on an index space
Using an exception table
An exception table is an optional, user-created DB2 table that you can use to place
information in the INEXCEPTTABLE column of the recommendations result set.
You can put any information in the INEXCEPTTABLE column, but the most
common use of this column is to filter the recommendations result set. Each row in
the exception table represents an object for which you want to provide information
for the recommendations result set.
To create the exception table, execute a CREATE TABLE statement similar to the
following one. You can include other columns in the exception table, but you must
include at least the columns that are shown.
CREATE TABLE DSNACC.EXCEPT_TBL
(DBNAME CHAR(8) NOT NULL,
NAME CHAR(8) NOT NULL,
QUERYTYPE CHAR(40))
CCSID EBCDIC;
The meanings of the columns are:
DBNAME
The database name for an object in the exception table.
NAME
The table space name or index space name for an object in the exception table.
QUERYTYPE
The information that you want to place in the INEXCEPTTABLE column of the
recommendations result set.
If you put a null value in this column, DSNACCOX puts the value YES in the
INEXCEPTTABLE column of the recommendations result set row for the object
that matches the DBNAME and NAME values.
Recommendation: If you plan to put many rows in the exception table, create a
nonunique index on DBNAME, NAME, and QUERYTYPE.
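For example, the following statement creates such an index. The index name
EXCEPT_TBL_IX is arbitrary; any name is acceptable.
CREATE INDEX DSNACC.EXCEPT_TBL_IX
  ON DSNACC.EXCEPT_TBL
  (DBNAME, NAME, QUERYTYPE);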
After you create the exception table, insert a row for each object for which you
want to include information in the INEXCEPTTABLE column.
Example: Suppose that you want the INEXCEPTTABLE column to contain the
string 'IRRELEVANT' for table space STAFF in database DSNDB04. You also want
the INEXCEPTTABLE column to contain 'CURRENT' for table space DSN8S91D in
database DSN8D91A. Execute these INSERT statements:
INSERT INTO DSNACC.EXCEPT_TBL VALUES(’DSNDB04 ’, ’STAFF ’, ’IRRELEVANT’);
INSERT INTO DSNACC.EXCEPT_TBL VALUES(’DSN8D91A’, ’DSN8S91D’, ’CURRENT’);
To use the contents of INEXCEPTTABLE for filtering, include a condition that
involves the INEXCEPTTABLE column in the search condition that you specify in
your Criteria input parameter.
Example: Suppose that you want to include all rows for database DSNDB04 in the
recommendations result set, except for those rows that contain the string
'IRRELEVANT' in the INEXCEPTTABLE column. You might include the following
search condition in your Criteria input parameter:
DBNAME=’DSNDB04’ AND INEXCEPTTABLE<>’IRRELEVANT’
Example
The figure below is a COBOL example that shows variable declarations and an
SQL CALL for obtaining recommendations for objects in databases DSN8D91A and
DSN8D91L. This example also outlines the steps that you need to perform to
retrieve the two result sets that DSNACCOX returns. These result sets are
described in “DSNACCOX output” on page 193. See DB2 Application Programming
and SQL Guide for more information about how to retrieve result sets from a stored
procedure.
WORKING-STORAGE SECTION.
***********************
* DSNACCOX PARAMETERS *
***********************
01 QUERYTYPE.
   49 QUERYTYPE-LN        PICTURE S9(4) COMP VALUE 40.
   49 QUERYTYPE-DTA       PICTURE X(40) VALUE ’ALL’.
01 OBJECTTYPE.
   49 OBJECTTYPE-LN       PICTURE S9(4) COMP VALUE 3.
   49 OBJECTTYPE-DTA      PICTURE X(3) VALUE ’ALL’.
01 ICTYPE.
   49 ICTYPE-LN           PICTURE S9(4) COMP VALUE 1.
   49 ICTYPE-DTA          PICTURE X(1) VALUE ’B’.
01 STATSSCHEMA.
   49 STATSSCHEMA-LN      PICTURE S9(4) COMP VALUE 128.
   49 STATSSCHEMA-DTA     PICTURE X(128) VALUE ’SYSIBM’.
01 CATLGSCHEMA.
   49 CATLGSCHEMA-LN      PICTURE S9(4) COMP VALUE 128.
   49 CATLGSCHEMA-DTA     PICTURE X(128) VALUE ’SYSIBM’.
01 LOCALSCHEMA.
   49 LOCALSCHEMA-LN      PICTURE S9(4) COMP VALUE 128.
   49 LOCALSCHEMA-DTA     PICTURE X(128) VALUE ’DSNACC’.
01 CHKLVL                 PICTURE S9(9) COMP VALUE +3.
01 CRITERIA.
   49 CRITERIA-LN         PICTURE S9(4) COMP VALUE 4096.
   49 CRITERIA-DTA        PICTURE X(4096) VALUE SPACES.
01 UNUSED.
   49 UNUSED-LN           PICTURE S9(4) COMP VALUE 80.
   49 UNUSED-DTA          PICTURE X(80) VALUE SPACES.
01 CRUPDATEDPAGESPCT      USAGE COMP-2 VALUE +0.
01 CRUPDATEDPAGESABS      PICTURE S9(9) COMP VALUE +0.
01 CRCHANGESPCT           USAGE COMP-2 VALUE +0.
01 CRDAYSNCLASTCOPY       PICTURE S9(9) COMP VALUE +0.
01 ICRUPDATEDPAGESPCT     USAGE COMP-2 VALUE +0.
01 ICRUPDATEDPAGESABS     PICTURE S9(9) COMP VALUE +0.
01 ICRCHANGESPCT          PICTURE S9(9) COMP VALUE +0.
01 CRINDEXSIZE            PICTURE S9(9) COMP VALUE +0.
01 RRTINSERTSPCT          USAGE COMP-2 VALUE +0.
01 RRTINSERTSABS          PICTURE S9(9) COMP VALUE +0.
01 RRTDELETESPCT          USAGE COMP-2 VALUE +0.
01 RRTDELETESABS          PICTURE S9(9) COMP VALUE +0.
01 RRTUNCLUSTINSPCT       USAGE COMP-2 VALUE +0.
01 RRTDISORGLOBPCT        USAGE COMP-2 VALUE +0.
01 RRTDATASPACERAT        PICTURE S9(9) COMP VALUE +0.
01 RRTMASSDELLIMIT        PICTURE S9(9) COMP VALUE +0.
01 RRTINDREFLIMIT         PICTURE S9(9) COMP VALUE +0.
01 RRIINSERTSPCT          USAGE COMP-2 VALUE +0.
01 RRIINSERTSABS          PICTURE S9(9) COMP VALUE +0.
01 RRIDELETESPCT          USAGE COMP-2 VALUE +0.
01 RRIDELETESABS          PICTURE S9(9) COMP VALUE +0.
01 RRIAPPENDINSERTPCT     USAGE COMP-2 VALUE +0.
01 RRIPSEUDODELETEPCT     USAGE COMP-2 VALUE +0.
01 RRIMASSDELLIMIT        PICTURE S9(9) COMP VALUE +0.
01 RRILEAFLIMIT           PICTURE S9(9) COMP VALUE +0.
01 RRINUMLEVELSLIMIT      PICTURE S9(9) COMP VALUE +0.
01 SRTINSDELUPDPCT        PICTURE S9(9) COMP VALUE +0.
01 SRTINSDELUPDABS        PICTURE S9(9) COMP VALUE +0.
01 SRTMASSDELLIMIT        PICTURE S9(9) COMP VALUE +0.
01 SRIINSDELUPDPCT        USAGE COMP-2 VALUE +0.
01 SRIINSDELUPDABS        PICTURE S9(9) COMP VALUE +0.
01 SRIMASSDELLIMIT        PICTURE S9(9) COMP VALUE +0.
01 EXTENTLIMIT            PICTURE S9(9) COMP VALUE +0.
01 LASTSTATEMENT.
   49 LASTSTATEMENT-LN    PICTURE S9(4) COMP VALUE 8012.
   49 LASTSTATEMENT-DTA   PICTURE X(8012) VALUE SPACES.
01 RETURNCODE             PICTURE S9(9) COMP VALUE +0.
01 ERRORMSG.
   49 ERRORMSG-LN         PICTURE S9(4) COMP VALUE 1331.
   49 ERRORMSG-DTA        PICTURE X(1331) VALUE SPACES.
01 IFCARETCODE            PICTURE S9(9) COMP VALUE +0.
01 IFCARESCODE            PICTURE S9(9) COMP VALUE +0.
01 XSBYTES                PICTURE S9(9) COMP VALUE +0.
* RESULT SET LOCATORS
01 LOC1 SQL TYPE IS RESULT-SET-LOCATOR VARYING.
01 LOC2 SQL TYPE IS RESULT-SET-LOCATOR VARYING.
*****************************************
* INDICATOR VARIABLES.                  *
* INITIALIZE ALL NON-ESSENTIAL INPUT    *
* VARIABLES TO -1, TO INDICATE THAT THE *
* INPUT VALUE IS NULL.                  *
*****************************************
01 QUERYTYPE-IND          PICTURE S9(4) COMP-4 VALUE +0.
01 OBJECTTYPE-IND         PICTURE S9(4) COMP-4 VALUE +0.
01 ICTYPE-IND             PICTURE S9(4) COMP-4 VALUE +0.
01 STATSSCHEMA-IND        PICTURE S9(4) COMP-4 VALUE -1.
01 CATLGSCHEMA-IND        PICTURE S9(4) COMP-4 VALUE -1.
01 LOCALSCHEMA-IND        PICTURE S9(4) COMP-4 VALUE -1.
01 CHKLVL-IND             PICTURE S9(4) COMP-4 VALUE -1.
01 CRITERIA-IND           PICTURE S9(4) COMP-4 VALUE -1.
01 UNUSED-IND             PICTURE S9(4) COMP-4 VALUE -1.
01 CRUPDATEDPAGESPCT-IND  PICTURE S9(4) COMP-4 VALUE -1.
01 CRUPDATEDPAGESABS-IND  PICTURE S9(4) COMP-4 VALUE -1.
01 CRCHANGESPCT-IND       PICTURE S9(4) COMP-4 VALUE -1.
01 CRDAYSNCLASTCOPY-IND   PICTURE S9(4) COMP-4 VALUE -1.
01 ICRUPDATEDPAGESPCT-IND PICTURE S9(4) COMP-4 VALUE -1.
01 ICRUPDATEDPAGESABS-IND PICTURE S9(4) COMP-4 VALUE -1.
01 ICRCHANGESPCT-IND      PICTURE S9(4) COMP-4 VALUE -1.
01 CRINDEXSIZE-IND        PICTURE S9(4) COMP-4 VALUE -1.
01 RRTINSERTSPCT-IND      PICTURE S9(4) COMP-4 VALUE -1.
01 RRTINSERTSABS-IND      PICTURE S9(4) COMP-4 VALUE -1.
01 RRTDELETESPCT-IND      PICTURE S9(4) COMP-4 VALUE -1.
01 RRTDELETESABS-IND      PICTURE S9(4) COMP-4 VALUE -1.
01 RRTUNCLUSTINSPCT-IND   PICTURE S9(4) COMP-4 VALUE -1.
01 RRTDISORGLOBPCT-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 RRTDATASPACERAT-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 RRTMASSDELLIMIT-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 RRTINDREFLIMIT-IND     PICTURE S9(4) COMP-4 VALUE -1.
01 RRIINSERTSPCT-IND      PICTURE S9(4) COMP-4 VALUE -1.
01 RRIINSERTSABS-IND      PICTURE S9(4) COMP-4 VALUE -1.
01 RRIDELETESPCT-IND      PICTURE S9(4) COMP-4 VALUE -1.
01 RRIDELETESABS-IND      PICTURE S9(4) COMP-4 VALUE -1.
01 RRIAPPENDINSERTPCT-IND PICTURE S9(4) COMP-4 VALUE -1.
01 RRIPSEUDODELETEPCT-IND PICTURE S9(4) COMP-4 VALUE -1.
01 RRIMASSDELLIMIT-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 RRILEAFLIMIT-IND       PICTURE S9(4) COMP-4 VALUE -1.
01 RRINUMLEVELSLIMIT-IND  PICTURE S9(4) COMP-4 VALUE -1.
01 SRTINSDELUPDPCT-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 SRTINSDELUPDABS-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 SRTMASSDELLIMIT-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 SRIINSDELUPDPCT-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 SRIINSDELUPDABS-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 SRIMASSDELLIMIT-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 EXTENTLIMIT-IND        PICTURE S9(4) COMP-4 VALUE -1.
01 LASTSTATEMENT-IND      PICTURE S9(4) COMP-4 VALUE +0.
01 RETURNCODE-IND         PICTURE S9(4) COMP-4 VALUE +0.
01 ERRORMSG-IND           PICTURE S9(4) COMP-4 VALUE +0.
01 IFCARETCODE-IND        PICTURE S9(4) COMP-4 VALUE +0.
01 IFCARESCODE-IND        PICTURE S9(4) COMP-4 VALUE +0.
01 XSBYTES-IND            PICTURE S9(4) COMP-4 VALUE +0.
PROCEDURE DIVISION.
*********************************************************
* SET VALUES FOR DSNACCOX INPUT PARAMETERS:             *
* - USE THE CHKLVL PARAMETER TO CAUSE DSNACCOX TO CHECK *
*   FOR RELATED TABLE SPACES WHEN PROCESSING INDEX      *
*   SPACES, AND DELETE RECOMMENDATIONS FOR INDEX SPACES *
*   WHEN AN ACTION (SUCH AS REORG) ON THE TABLE SPACE   *
*   WILL ALSO CAUSE THE ACTION TO BE DONE ON THE INDEX  *
*   SPACE. (CHKLVL=64)                                  *
* - USE THE CRITERIA PARAMETER TO CAUSE DSNACCOX TO     *
*   MAKE RECOMMENDATIONS ONLY FOR OBJECTS IN DATABASES  *
*   DSN8D91A AND DSN8D91L.                              *
* - FOR THE FOLLOWING PARAMETERS, SET THESE VALUES,     *
*   WHICH ARE LOWER THAN THE DEFAULTS:                  *
*     CRUPDATEDPAGESPCT   4                             *
*     CRCHANGESPCT        2                             *
*     RRTINSERTSPCT       2                             *
*     RRTUNCLUSTINSPCT    5                             *
*     RRTDISORGLOBPCT     5                             *
*     RRIAPPENDINSERTPCT  5                             *
*     SRTINSDELUPDPCT     5                             *
*     SRIINSDELUPDPCT     5                             *
*     EXTENTLIMIT         3                             *
* - EXCLUDE CHECKING FOR THESE CRITERIA BY SETTING THE  *
*   FOLLOWING VALUES TO A NEGATIVE VALUE:               *
*     RRTMASSDELLIMIT    -1                             *
*     RRIMASSDELLIMIT    -1                             *
*********************************************************
MOVE 64 TO CHKLVL.
MOVE SPACES TO CRITERIA-DTA.
MOVE ’DBNAME = ’’DSN8D91A’’ OR DBNAME = ’’DSN8D91L’’’
  TO CRITERIA-DTA.
MOVE 46 TO CRITERIA-LN.
MOVE 4 TO CRUPDATEDPAGESPCT.
MOVE 2 TO CRCHANGESPCT.
MOVE 2 TO RRTINSERTSPCT.
MOVE 5 TO RRTUNCLUSTINSPCT.
MOVE 5 TO RRTDISORGLOBPCT.
MOVE 5 TO RRIAPPENDINSERTPCT.
MOVE 5 TO SRTINSDELUPDPCT.
MOVE 5 TO SRIINSDELUPDPCT.
MOVE 3 TO EXTENTLIMIT.
MOVE -1 TO RRTMASSDELLIMIT.
MOVE -1 TO RRIMASSDELLIMIT.
********************************
* INITIALIZE OUTPUT PARAMETERS *
********************************
MOVE SPACES TO LASTSTATEMENT-DTA.
MOVE 1 TO LASTSTATEMENT-LN.
MOVE 0 TO RETURNCODE.
MOVE SPACES TO ERRORMSG-DTA.
MOVE 1 TO ERRORMSG-LN.
MOVE 0 TO IFCARETCODE.
MOVE 0 TO IFCARESCODE.
MOVE 0 TO XSBYTES.
*******************************************************
* SET THE INDICATOR VARIABLES TO 0 FOR NON-NULL INPUT *
* PARAMETERS (PARAMETERS FOR WHICH YOU DO NOT WANT    *
* DSNACCOX TO USE DEFAULT VALUES) AND FOR OUTPUT      *
* PARAMETERS.                                         *
*******************************************************
MOVE 0 TO CHKLVL-IND.
MOVE 0 TO CRITERIA-IND.
MOVE 0 TO CRUPDATEDPAGESPCT-IND.
MOVE 0 TO CRCHANGESPCT-IND.
MOVE 0 TO RRTINSERTSPCT-IND.
MOVE 0 TO RRTUNCLUSTINSPCT-IND.
MOVE 0 TO RRTDISORGLOBPCT-IND.
MOVE 0 TO RRIAPPENDINSERTPCT-IND.
MOVE 0 TO SRTINSDELUPDPCT-IND.
MOVE 0 TO SRIINSDELUPDPCT-IND.
MOVE 0 TO EXTENTLIMIT-IND.
MOVE 0 TO LASTSTATEMENT-IND.
MOVE 0 TO RETURNCODE-IND.
MOVE 0 TO ERRORMSG-IND.
MOVE 0 TO IFCARETCODE-IND.
MOVE 0 TO IFCARESCODE-IND.
MOVE 0 TO XSBYTES-IND.
MOVE 0 TO RRTMASSDELLIMIT-IND.
MOVE 0 TO RRIMASSDELLIMIT-IND.
*****************
* CALL DSNACCOX *
*****************
EXEC SQL
 CALL SYSPROC.DSNACCOX
 (:QUERYTYPE           :QUERYTYPE-IND,
  :OBJECTTYPE          :OBJECTTYPE-IND,
  :ICTYPE              :ICTYPE-IND,
  :STATSSCHEMA         :STATSSCHEMA-IND,
  :CATLGSCHEMA         :CATLGSCHEMA-IND,
  :LOCALSCHEMA         :LOCALSCHEMA-IND,
  :CHKLVL              :CHKLVL-IND,
  :CRITERIA            :CRITERIA-IND,
  :UNUSED              :UNUSED-IND,
  :CRUPDATEDPAGESPCT   :CRUPDATEDPAGESPCT-IND,
  :CRUPDATEDPAGESABS   :CRUPDATEDPAGESABS-IND,
  :CRCHANGESPCT        :CRCHANGESPCT-IND,
  :CRDAYSNCLASTCOPY    :CRDAYSNCLASTCOPY-IND,
  :ICRUPDATEDPAGESPCT  :ICRUPDATEDPAGESPCT-IND,
  :ICRUPDATEDPAGESABS  :ICRUPDATEDPAGESABS-IND,
  :ICRCHANGESPCT       :ICRCHANGESPCT-IND,
  :CRINDEXSIZE         :CRINDEXSIZE-IND,
  :RRTINSERTSPCT       :RRTINSERTSPCT-IND,
  :RRTINSERTSABS       :RRTINSERTSABS-IND,
  :RRTDELETESPCT       :RRTDELETESPCT-IND,
  :RRTDELETESABS       :RRTDELETESABS-IND,
  :RRTUNCLUSTINSPCT    :RRTUNCLUSTINSPCT-IND,
  :RRTDISORGLOBPCT     :RRTDISORGLOBPCT-IND,
  :RRTDATASPACERAT     :RRTDATASPACERAT-IND,
  :RRTMASSDELLIMIT     :RRTMASSDELLIMIT-IND,
  :RRTINDREFLIMIT      :RRTINDREFLIMIT-IND,
  :RRIINSERTSPCT       :RRIINSERTSPCT-IND,
  :RRIINSERTSABS       :RRIINSERTSABS-IND,
  :RRIDELETESPCT       :RRIDELETESPCT-IND,
  :RRIDELETESABS       :RRIDELETESABS-IND,
  :RRIAPPENDINSERTPCT  :RRIAPPENDINSERTPCT-IND,
  :RRIPSEUDODELETEPCT  :RRIPSEUDODELETEPCT-IND,
  :RRIMASSDELLIMIT     :RRIMASSDELLIMIT-IND,
  :RRILEAFLIMIT        :RRILEAFLIMIT-IND,
  :RRINUMLEVELSLIMIT   :RRINUMLEVELSLIMIT-IND,
  :SRTINSDELUPDPCT     :SRTINSDELUPDPCT-IND,
  :SRTINSDELUPDABS     :SRTINSDELUPDABS-IND,
  :SRTMASSDELLIMIT     :SRTMASSDELLIMIT-IND,
  :SRIINSDELUPDPCT     :SRIINSDELUPDPCT-IND,
  :SRIINSDELUPDABS     :SRIINSDELUPDABS-IND,
  :SRIMASSDELLIMIT     :SRIMASSDELLIMIT-IND,
  :EXTENTLIMIT         :EXTENTLIMIT-IND,
  :LASTSTATEMENT       :LASTSTATEMENT-IND,
  :RETURNCODE          :RETURNCODE-IND,
  :ERRORMSG            :ERRORMSG-IND,
  :IFCARETCODE         :IFCARETCODE-IND,
  :IFCARESCODE         :IFCARESCODE-IND,
  :XSBYTES             :XSBYTES-IND)
END-EXEC.
*************************************************************
* ASSUME THAT THE SQL CALL RETURNED +466, WHICH MEANS THAT  *
* RESULT SETS WERE RETURNED. RETRIEVE RESULT SETS.          *
*************************************************************
* LINK EACH RESULT SET TO A LOCATOR VARIABLE
EXEC SQL ASSOCIATE LOCATORS (:LOC1, :LOC2)
 WITH PROCEDURE SYSPROC.DSNACCOX
END-EXEC.
* LINK A CURSOR TO EACH RESULT SET
EXEC SQL ALLOCATE C1 CURSOR FOR RESULT SET :LOC1
END-EXEC.
EXEC SQL ALLOCATE C2 CURSOR FOR RESULT SET :LOC2
END-EXEC.
* PERFORM FETCHES USING C1 TO RETRIEVE ALL ROWS FROM FIRST RESULT SET
* PERFORM FETCHES USING C2 TO RETRIEVE ALL ROWS FROM SECOND RESULT SET
DSNACCOX output
If DSNACCOX executes successfully, in addition to the output parameters
described in “Option descriptions” on page 174, DSNACCOX returns two result
sets.
The first result set contains the results from IFI COMMAND calls that DSNACCOX
makes. The following table shows the format of the first result set.
Table 51. Result set row for first DSNACCOX result set

Column name    Data type    Contents
RS_SEQUENCE    INTEGER      Sequence number of the output line
RS_DATA        CHAR(80)     A line of command output
The second result set contains DSNACCOX's recommendations. This result set
contains one or more rows for a table space or index space. A nonpartitioned table
space or nonpartitioning index space can have at most one row in the result set. A
partitioned table space or partitioning index space can have at most one row for
each partition. A table space, index space, or partition has a row in the result set if
both of the following conditions are true:
v If the Criteria input parameter contains a search condition, the search condition
is true for the table space, index space, or partition.
v DSNACCOX recommends at least one action for the table space, index space, or
partition.
The following table shows the columns of a result set row.
Table 52. Result set row for second DSNACCOX result set

DBNAME (VARCHAR(24))
   Name of the database that contains the object.
NAME (VARCHAR(128))
   Table space name or index name.
PARTITION (INTEGER)
   Data set number or partition number.
INSTANCE (SMALLINT)
   Indicates if the object is associated with a data set instance.
CLONE (CHAR(1))
   'Y' or 'N', 'Y' indicates a cloned object.
OBJECTTYPE (CHAR(2))
   DB2 object type:
   v 'TS' for a table space
   v 'IX' for an index space
INDEXSPACE (VARCHAR(24))
   Index space name.
CREATOR (VARCHAR(128))
   Index creator name.
OBJECTSTATUS (CHAR(40))
   Status of the object:
   v ORPHANED, if the object is an index space with no corresponding table
     space, or if the object does not exist
   v If the object is in a restricted state, one of the following values:
     – TS=restricted-state, if OBJECTTYPE is TS
     – IX=restricted-state, if OBJECTTYPE is IX
     – LS=restricted-state, if the object is a LOB table space
     – LX=restricted-state, if the object is an XML table space
     restricted-state is one of the status codes that appear in the output of the
     DISPLAY DATABASE command.
   v A, if the object is in an advisory state.
   v L, if the object is a logical partition, but not in an advisory state.
   v AL, if the object is a logical partition and in an advisory state.
IMAGECOPY (CHAR(4))
   COPY recommendation:
   v If OBJECTTYPE is TS: FULL (full image copy), INC (incremental image
     copy), or NO
   v If OBJECTTYPE is IX: YES or NO
RUNSTATS (CHAR(3))
   RUNSTATS recommendation: YES or NO.
EXTENTS (CHAR(3))
   Indicates whether the data sets for the object have exceeded ExtentLimit:
   YES or NO.
REORG (CHAR(3))
   REORG recommendation: YES or NO.
INEXCEPTTABLE (CHAR(40))
   A string that contains one of the following values:
   v Text that you specify in the QUERYTYPE column of the exception table.
   v YES, if you put a row in the exception table for the object that this result
     set row represents, but you specify NULL in the QUERYTYPE column.
   v NO, if the exception table exists but does not have a row for the object
     that this result set row represents.
   v Null, if the exception table does not exist, or if the ChkLvl input
     parameter does not include the value 4.
ASSOCIATEDTS (VARCHAR(128))
   If OBJECTTYPE is IX, this value is the name of the table space that is
   associated with the index space. Otherwise null.
COPYLASTTIME (TIMESTAMP)
   Timestamp of the last full image copy on the object. Null if COPY was
   never run, or if the last COPY execution is unknown.
LOADRLASTTIME (TIMESTAMP)
   Timestamp of the last LOAD REPLACE on the object. Null if LOAD
   REPLACE was never run, or if the last LOAD REPLACE execution is
   unknown.
REBUILDLASTTIME (TIMESTAMP)
   Timestamp of the last REBUILD INDEX on the object. Null if REBUILD
   INDEX was never run, or if the last REBUILD INDEX execution is
   unknown.
CRUPDPGSPCT (DOUBLE)
   If OBJECTTYPE is TS or IX and IMAGECOPY is YES, the ratio of distinct
   updated pages to preformatted pages, expressed as a percentage.
   Otherwise null. If the ratio does not exceed the value specified for
   CRUpdatedPagesPct (for table spaces) or ICRUpdatedPagesPct (for
   indexes), this value is null.
CRUPDPGSABS (INTEGER)
   If OBJECTTYPE is TS or IX and IMAGECOPY is YES, the number of
   distinct updated pages. Otherwise null. If the number of distinct updated
   pages does not exceed the value specified for CRUpdatedPagesAbs, this
   value is null.
CRCPYCHGPCT (DOUBLE)
   If OBJECTTYPE is TS and IMAGECOPY is YES, the ratio of the total
   number of insert, update, and delete operations since the last image copy
   to the total number of rows or LOBs in the table space or partition,
   expressed as a percentage. If OBJECTTYPE is IX and IMAGECOPY is YES,
   the ratio of the total number of insert and delete operations since the last
   image copy to the total number of entries in the index space or partition,
   expressed as a percentage. Otherwise null. If the ratio does not exceed the
   value specified for CRChangesPct (for table spaces) or ICRChangesPct (for
   index spaces), this value is null.
CRDAYSNCLSTCPY (INTEGER)
   If OBJECTTYPE is TS or IX and IMAGECOPY is YES, the number of days
   since the last image copy. Otherwise null. If the number of days since the
   last image copy does not exceed the value specified for CRDaySncLastCopy,
   this value is null.
CRINDEXSIZE (INTEGER)
   If OBJECTTYPE is IX and IMAGECOPY is YES, the number of active
   pages in the index space or partition. Otherwise null. If the number of
   active pages in the index space or partition does not exceed the value
   specified for CRIndexSize, this value is null.
REORGLASTTIME (TIMESTAMP)
   Timestamp of the last REORG on the object. Null if REORG was never
   run, or if the last REORG execution was terminated.
RRTINSERTSPCT (DOUBLE)
   If OBJECTTYPE is TS and REORG is YES, the ratio of the number of
   insert operations since the last REORG to the total number of rows or
   LOBs in the table space or partition, expressed as a percentage. Otherwise
   null. If the ratio does not exceed the value specified for RRTInsertsPct,
   this value is null.
RRTINSERTSABS (INTEGER)
   If OBJECTTYPE is TS and REORG is YES, the total number of insert
   operations since the last REORG on the table space or partition.
   Otherwise null. If the number of insert operations since the last REORG
   does not exceed the value specified for RRTInsertsAbs, this value is null.
RRTDELETESPCT (DOUBLE)
   If OBJECTTYPE is TS and REORG is YES, the ratio of the number of
   delete operations since the last REORG to the total number of rows in the
   table space or partition, expressed as a percentage. Otherwise null. If the
   ratio does not exceed the value specified for RRTDeletesPct, this value is
   null.
RRTDELETESABS (INTEGER)
   If OBJECTTYPE is TS and REORG is YES, the total number of delete
   operations since the last REORG on the table space or partition.
   Otherwise null. If the total number of delete operations since the last
   REORG does not exceed the value specified for RRTDeletesAbs, this value
   is null.
RRTUNCINSPCT (DOUBLE)
   If OBJECTTYPE is TS and REORG is YES, the ratio of the number of
   unclustered insert operations to the total number of rows or LOBs in the
   table space or partition, expressed as a percentage. Otherwise null. If the
   ratio does not exceed the value specified for RRTUnclustInsPct, this value
   is null.
RRTDISORGLOBPCT (DOUBLE)
   If OBJECTTYPE is TS and REORG is YES, the ratio of the number of
   imperfectly chunked LOBs to the total number of rows or LOBs in the
   table space or partition, expressed as a percentage. Otherwise null. If the
   ratio does not exceed the value of RRTDisorgLOBPct, this value is null.
RRTDATASPACERAT (DOUBLE)
   If OBJECTTYPE is TS and REORG is YES, the ratio of the space that is
   allocated to the space that is used. Otherwise null. If the ratio does not
   exceed the RRTDataSpaceRat threshold, this value is null.
RRTMASSDELETE (INTEGER)
   If OBJECTTYPE is TS, REORG is YES, and the table space is a segmented
   table space or LOB table space, the number of mass deletes since the last
   REORG or LOAD REPLACE. If OBJECTTYPE is TS, REORG is YES, and
   the table space is nonsegmented, the number of dropped tables since the
   last REORG or LOAD REPLACE. Otherwise null. If the number of mass
   deletes or dropped tables since the last REORG or LOAD REPLACE does
   not exceed the value specified for RRTMassDelLimit, this value is null.
RRTINDREF (DOUBLE)
   If OBJECTTYPE is TS and REORG is YES, the ratio of the total number of
   overflow records that were created since the last REORG or LOAD
   REPLACE to the total number of rows or LOBs in the table space or
   partition, expressed as a percentage. Otherwise null. If the ratio does not
   exceed the value specified for RRTIndRefLimit, this value is null.
RRIINSERTSPCT (DOUBLE)
   If OBJECTTYPE is IX and REORG is YES, the ratio of the total number of
   insert operations since the last REORG to the total number of index
   entries in the index space or partition, expressed as a percentage.
   Otherwise null. If the ratio does not exceed the value specified for
   RRIInsertsPct, this value is null.
RRIINSERTSABS (INTEGER)
   If OBJECTTYPE is IX and REORG is YES, the total number of insert
   operations since the last REORG. Otherwise null. If the total number of
   insert operations since the last REORG does not exceed the value
   specified for RRIInsertsAbs, this value is null.
RRIDELETESPCT (DOUBLE)
   If OBJECTTYPE is IX and REORG is YES, the ratio of the total number of
   delete operations since the last REORG to the total number of index
   entries in the index space or partition, expressed as a percentage.
   Otherwise null. If the ratio does not exceed the value specified for
   RRIDeletesPct, this value is null.
RRIDELETESABS (INTEGER)
   If OBJECTTYPE is IX and REORG is YES, the total number of delete
   operations since the last REORG. Otherwise null. If the total number of
   delete operations since the last REORG does not exceed the value
   specified for RRIDeletesAbs, this value is null.
RRIAPPINSPCT (DOUBLE)
   If OBJECTTYPE is IX and REORG is YES, the ratio of the number of
   index entries that were inserted since the last REORG, REBUILD INDEX,
   or LOAD REPLACE that had a key value greater than the maximum key
   value in the index space or partition, to the number of index entries in
   the index space or partition, expressed as a percentage. Otherwise null. If
   the ratio does not exceed the value specified for RRIAppendInsertPct, this
   value is null.
RRIPSDDELPCT (DOUBLE)
   If OBJECTTYPE is IX and REORG is YES, the ratio of the number of
   index entries that were pseudo-deleted (the RID entry was marked as
   deleted) since the last REORG, REBUILD INDEX, or LOAD REPLACE to
   the number of index entries in the index space or partition, expressed as
   a percentage. Otherwise null. If the ratio does not exceed the value
   specified for RRIPseudoDeletePct, this value is null.
RRIMASSDELETE (INTEGER)
   If OBJECTTYPE is IX and REORG is YES, the number of mass deletes
   from the index space or partition since the last REORG, REBUILD, or
   LOAD REPLACE. Otherwise null. If the number of mass deletes does not
   exceed the value specified for RRIMassDelLimit, this value is null.
RRILEAF (DOUBLE)
   If OBJECTTYPE is IX and REORG is YES, the ratio of the number of
   index page splits that occurred since the last REORG, REBUILD INDEX,
   or LOAD REPLACE in which the higher part of the split page was far
   from the location of the original page, to the total number of active pages
   in the index space or partition, expressed as a percentage. Otherwise null.
   If the ratio does not exceed the value specified for RRILeafLimit, this
   value is null.
RRINUMLEVELS (INTEGER)
   If OBJECTTYPE is IX and REORG is YES, the number of levels in the
   index tree that were added or removed since the last REORG, REBUILD
   INDEX, or LOAD REPLACE. Otherwise null. If the number of levels that
   were added or removed does not exceed the value specified for
   RRINumLevelsLimit, this value is null.
STATSLASTTIME (TIMESTAMP)
   Timestamp of the last RUNSTATS on the object. Null if RUNSTATS was
   never run, or if the last RUNSTATS execution is unknown.
SRTINSDELUPDPCT (DOUBLE)
   If OBJECTTYPE is TS and RUNSTATS is YES, the ratio of the total
   number of insert, update, and delete operations since the last RUNSTATS
   on the table space or partition, to the total number of rows or LOBs in
   the table space or partition, expressed as a percentage. Otherwise null. If
   the ratio does not exceed the value specified for SRTInsDelUpdPct, this
   value is null.
SRTINSDELUPDABS (INTEGER)
   If OBJECTTYPE is TS and RUNSTATS is YES, the total number of insert,
   update, and delete operations since the last RUNSTATS on the table
   space or partition. Otherwise null. If the total number of insert, update,
   and delete operations since the last RUNSTATS does not exceed the value
   specified for SRTInsDelUpdAbs, this value is null.
SRTMASSDELETE (INTEGER)
   If OBJECTTYPE is TS and RUNSTATS is YES, the number of mass
   deletes from the table space or partition since the last REORG or LOAD
   REPLACE. Otherwise null. If the number of mass deletes from the table
   space or partition since the last REORG or LOAD REPLACE does not
   exceed the value specified for SRTMassDelLimit, this value is null.
SRIINSDELPCT (DOUBLE)
   If OBJECTTYPE is IX and RUNSTATS is YES, the ratio of the total
   number of insert and delete operations since the last RUNSTATS on the
   index space or partition, to the total number of index entries in the index
   space or partition, expressed as a percentage. Otherwise null. If the ratio
   does not exceed the value specified for SRIInsDelUpdPct, this value is
   null.
SRIINSDELABS (INTEGER)
   If OBJECTTYPE is IX and RUNSTATS is YES, the number of insert and
   delete operations since the last RUNSTATS on the index space or
   partition. Otherwise null. If the total number of insert and delete
   operations since the last RUNSTATS does not exceed the value specified
   for SRIInsDelUpdAbs, this value is null.
SRIMASSDELETE (INTEGER)
   If OBJECTTYPE is IX and RUNSTATS is YES, the number of mass
   deletes from the index space or partition since the last REORG, REBUILD
   INDEX, or LOAD REPLACE. Otherwise, this value is null. If the number
   of mass deletes does not exceed the value specified for SRIMassDelLimit,
   this value is null.
TOTALEXTENTS (SMALLINT)
   If EXTENTS is YES, the number of physical extents in the table space,
   index space, or partition. Otherwise, this value is null. If the number of
   physical extents does not exceed the value specified for ExtentLimit, this
   value is null.
Related reference
CREATE DATABASE (DB2 SQL)
CREATE TABLESPACE (DB2 SQL)
DSNACCOR stored procedure
The DB2 real-time statistics stored procedure (DSNACCOR) is a sample stored
procedure that makes recommendations to help you maintain your DB2 databases.
The DSNACCOX stored procedure replaces the DSNACCOR stored procedure and
provides improved recommendations. If you have migrated your real-time
statistics to the DB2 catalog, or DB2 is running in DB2 Version 9.1 for z/OS
new-function mode or later, you can use the DSNACCOX stored procedure to take
advantage of enhancements, including new fields, improved formulas, and the
option to select the formula that is used for making recommendations.
In particular, DSNACCOR performs the following actions:
v Recommends when you should reorganize, image copy, or update statistics for
table spaces or index spaces
v Indicates when a data set has exceeded a specified threshold for the number of
extents that it occupies.
v Indicates whether objects are in a restricted state
DSNACCOR uses data from the SYSIBM.SYSTABLESPACESTATS and
SYSIBM.SYSINDEXSPACESTATS real-time statistics tables to make its
recommendations. DSNACCOR provides its recommendations in a result set.
DSNACCOR uses the set of criteria that are shown in “DSNACCOR formulas for
recommending actions” on page 210 to evaluate table spaces and index spaces. By
default, DSNACCOR evaluates all table spaces and index spaces in the subsystem
that have entries in the real-time statistics tables. However, you can override this
default through input parameters.
Important information about DSNACCOR recommendations:
v DSNACCOR makes recommendations based on general formulas that require
input from the user about the maintenance policies for a subsystem. These
recommendations might not be accurate for every installation.
v If the real-time statistics tables contain information for only a small percentage
of your DB2 subsystem, the recommendations that DSNACCOR makes might
not be accurate for the entire subsystem.
v Before you perform any action that DSNACCOR recommends, ensure that the
object for which DSNACCOR makes the recommendation is available, and that
the recommended action can be performed on that object. For example, before
you can perform an image copy on an index, the index must have the COPY
YES attribute.
Environment
DSNACCOR must run in a WLM-established stored procedure address space.
You should bind the package for DSNACCOR with isolation UR to avoid lock
contention. You can find the installation steps for DSNACCOR in job DSNTIJSG.
Authorization required
To execute the CALL DSNACCOR statement, the owner of the package or plan
that contains the CALL statement must have one or more of the following
privileges on each package that the stored procedure uses:
v The EXECUTE privilege on the package for DSNACCOR
v Ownership of the package
v PACKADM authority for the package collection
v SYSADM authority
The owner of the package or plan that contains the CALL statement must also
have:
v SELECT authority on the real-time statistics tables
v SELECT authority on catalog tables
v The DISPLAY system privilege
Syntax diagram
The following syntax diagram shows the CALL statement for invoking
DSNACCOR. Because the linkage convention for DSNACCOR is GENERAL WITH
NULLS, if you pass parameters in host variables, you need to include a null
indicator with every host variable. Null indicators for input host variables must be
initialized before you execute the CALL statement.
CALL DSNACCOR ( QueryType, ObjectType, ICType, StatsSchema, CatlgSchema,
                LocalSchema, ChkLvl, Criteria, Restricted, CRUpdatedPagesPct,
                CRChangesPct, CRDaySncLastCopy, ICRUpdatedPagesPct,
                ICRChangesPct, CRIndexSize, RRTInsDelUpdPct, RRTUnclustInsPct,
                RRTDisorgLOBPct, RRTMassDelLimit, RRTIndRefLimit,
                RRIInsertDeletePct, RRIAppendInsertPct, RRIPseudoDeletePct,
                RRIMassDelLimit, RRILeafLimit, RRINumLevelsLimit,
                SRTInsDelUpdPct, SRTInsDelUpdAbs, SRTMassDelLimit,
                SRIInsDelUpdPct, SRIInsDelUpdAbs, SRIMassDelLimit,
                ExtentLimit, LastStatement, ReturnCode, ErrorMsg,
                IFCARetCode, IFCAResCode, ExcessBytes )
You can specify NULL in place of any input parameter.
Option descriptions
In the following option descriptions, the default value for an input parameter is
the value that DSNACCOR uses if you specify a null value.
QueryType
Specifies the types of actions that DSNACCOR recommends. This field
contains one or more of the following values. Each value is enclosed in single
quotation marks and separated from other values by a space.
ALL
Makes recommendations for all of the following actions.
COPY Makes a recommendation on whether to perform an image copy.
RUNSTATS
Makes a recommendation on whether to perform RUNSTATS.
REORG
Makes a recommendation on whether to perform REORG. Choosing
this value causes DSNACCOR to process the EXTENTS value also.
EXTENTS
Indicates when data sets have exceeded a user-specified extents limit.
RESTRICT
Indicates which objects are in a restricted state.
QueryType is an input parameter of type VARCHAR(40). The default is ALL.
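Example: To request both image copy and RUNSTATS recommendations, specify
the QueryType value ’COPY RUNSTATS’.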
ObjectType
Specifies the types of objects for which DSNACCOR recommends actions:
ALL
Table spaces and index spaces.
TS
Table spaces only.
IX
Index spaces only.
ObjectType is an input parameter of type VARCHAR(3). The default is ALL.
ICType
Specifies the types of image copies for which DSNACCOR is to make
recommendations:
F
Full image copy.
I
Incremental image copy. This value is valid for table spaces only.
B
Full image copy or incremental image copy.
ICType is an input parameter of type VARCHAR(1). The default is B.
StatsSchema
Specifies the qualifier for the real-time statistics table names. StatsSchema is an
input parameter of type VARCHAR(128). The default is SYSIBM.
CatlgSchema
Specifies the qualifier for DB2 catalog table names. CatlgSchema is an input
parameter of type VARCHAR(128). The default is SYSIBM.
LocalSchema
Specifies the qualifier for the names of tables that DSNACCOR creates.
LocalSchema is an input parameter of type VARCHAR(128). The default is
DSNACC.
ChkLvl
Specifies the types of checking that DSNACCOR performs, and indicates
whether to include objects that fail those checks in the DSNACCOR
recommendations result set. This value is the sum of any combination of the
following values:
0
DSNACCOR performs none of the following actions.
1
For objects that are listed in the recommendations result set, check the
SYSTABLESPACE or SYSINDEXES catalog tables to ensure that those
objects have not been deleted. If value 16 is not also chosen, exclude
rows for the deleted objects from the recommendations result set.
DSNACCOR excludes objects from the recommendations result set if
those objects are not in the SYSTABLESPACE or SYSINDEXES catalog
tables.
When this setting is specified, DSNACCOR does not use
EXTENTS>ExtentLimit to determine whether a LOB table space should
be reorganized.
2
For index spaces that are listed in the recommendations result set,
check the SYSTABLES, SYSTABLESPACE, and SYSINDEXES catalog
tables to determine the name of the table space that is associated with
each index space.
Choosing this value causes DSNACCOR to also check for rows in the
recommendations result set for objects that have been deleted but have
entries in the real-time statistics tables (value 1). This means that if
value 16 is not also chosen, rows for deleted objects are excluded from
the recommendations result set.
4
Check whether rows that are in the DSNACCOR recommendations
result set refer to objects that are in the exception table. For
recommendations result set rows that have corresponding exception
Chapter 11. Designing DB2 statistics for performance
203
table rows, copy the contents of the QUERYTYPE column of the
exception table to the INEXCEPTTABLE column of the
recommendations result set.
8
Check whether objects that have rows in the recommendations result
set are restricted. Indicate the restricted status in the OBJECTSTATUS
column of the result set.
16
For objects that are listed in the recommendations result set, check the
SYSTABLESPACE or SYSINDEXES catalog tables to ensure that those
objects have not been deleted (value 1). In result set rows for deleted
objects, specify the word ORPHANED in the OBJECTSTATUS column.
32
Exclude rows from the DSNACCOR recommendations result set for
index spaces for which the related table spaces have been
recommended for REORG. Choosing this value causes DSNACCOR to
perform the actions for values 1 and 2.
64
For index spaces that are listed in the DSNACCOR recommendations
result set, check whether the related table spaces are listed in the
exception table. For recommendations result set rows that have
corresponding exception table rows, copy the contents of the
QUERYTYPE column of the exception table to the INEXCEPTTABLE
column of the recommendations result set.
ChkLvl is an input parameter of type INTEGER. The default is 7 (values
1+2+4).
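For example, to check for deleted objects (value 1), to determine the table spaces
that are associated with index spaces (value 2), and to indicate restricted objects
(value 8), specify a ChkLvl value of 11 (1+2+8).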
Criteria
Narrows the set of objects for which DSNACCOR makes recommendations.
This value is the search condition of an SQL WHERE clause. Criteria is an
input parameter of type VARCHAR(4096). The default is that DSNACCOR
makes recommendations for all table spaces and index spaces in the
subsystem. The search condition can use any column in the result set and
wildcards are allowed.
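Example: A Criteria value of DBNAME LIKE ’DSN8%’ limits the
recommendations to objects in databases whose names begin with DSN8.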
Restricted
A parameter that is reserved for future use. Specify the null value for this
parameter. Restricted is an input parameter of type VARCHAR(80).
CRUpdatedPagesPct
Specifies a criterion for recommending a full image copy on a table space or
index space. If the following condition is true for a table space, DSNACCOR
recommends an image copy:
The total number of distinct updated pages, divided by the total number of
preformatted pages (expressed as a percentage) is greater than
CRUpdatedPagesPct.
See item 2 in Figure 19 on page 210. If both of the following conditions are true
for an index space, DSNACCOR recommends an image copy:
v The total number of distinct updated pages, divided by the total number of
preformatted pages (expressed as a percentage) is greater than
CRUpdatedPagesPct.
v The number of active pages in the index space or partition is greater than
CRIndexSize. See items 2 and 3 in Figure 20 on page 211.
CRUpdatedPagesPct is an input parameter of type INTEGER. The default is 20.
CRChangesPct
Specifies a criterion for recommending a full image copy on a table space or
index space. If the following condition is true for a table space, DSNACCOR
recommends an image copy:
The total number of insert, update, and delete operations since the last
image copy, divided by the total number of rows or LOBs in a table space
or partition (expressed as a percentage) is greater than CRChangesPct.
See item 3 in Figure 19 on page 210. If both of the following conditions are true
for an index space, DSNACCOR recommends an image copy:
v The total number of insert and delete operations since the last image copy,
divided by the total number of entries in the index space or partition
(expressed as a percentage) is greater than CRChangesPct.
v The number of active pages in the index space or partition is greater than
CRIndexSize.
See items 2 and 4 in Figure 20 on page 211. CRChangesPct is an input
parameter of type INTEGER. The default is 10.
CRDaySncLastCopy
Specifies a criterion for recommending a full image copy on a table space or
index space. If the number of days since the last image copy is greater than
this value, DSNACCOR recommends an image copy. (See item 1 in Figure 19
on page 210 and item 1 in Figure 20 on page 211.) CRDaySncLastCopy is an
input parameter of type INTEGER. The default is 7.
ICRUpdatedPagesPct
Specifies a criterion for recommending an incremental image copy on a table
space. If the following condition is true, DSNACCOR recommends an
incremental image copy:
The number of distinct pages that were updated since the last image copy,
divided by the total number of active pages in the table space or partition
(expressed as a percentage) is greater than ICRUpdatedPagesPct.
(See item 1 in Figure 21 on page 211.) ICRUpdatedPagesPct is an input
parameter of type INTEGER. The default is 1.
ICRChangesPct
Specifies a criterion for recommending an incremental image copy on a table
space. If the following condition is true, DSNACCOR recommends an
incremental image copy:
The ratio of the number of insert, update, or delete operations since the last
image copy, to the total number of rows or LOBs in a table space or
partition (expressed as a percentage) is greater than ICRChangesPct.
(See item 2 in Figure 21 on page 211.) ICRChangesPct is an input parameter of
type INTEGER. The default is 1.
CRIndexSize
Specifies, when combined with CRUpdatedPagesPct or CRChangesPct, a criterion
for recommending a full image copy on an index space. (See items 2, 3, and 4
in Figure 20 on page 211.) CRIndexSize is an input parameter of type INTEGER.
The default is 50.
RRTInsDelUpdPct
Specifies a criterion for recommending that the REORG utility is to be run on a
table space. If the following condition is true, DSNACCOR recommends
running REORG:
The sum of insert, update, and delete operations since the last REORG,
divided by the total number of rows or LOBs in the table space or partition
(expressed as a percentage) is greater than RRTInsDelUpdPct.
(See item 1 in Figure 22 on page 211.) RRTInsDelUpdPct is an input parameter
of type INTEGER. The default is 20.
RRTUnclustInsPct
Specifies a criterion for recommending that the REORG utility is to be run on a
table space. If the following condition is true, DSNACCOR recommends
running REORG:
The number of unclustered insert operations, divided by the total number
of rows or LOBs in the table space or partition (expressed as a percentage)
is greater than RRTUnclustInsPct.
(See item 2 in Figure 22 on page 211.) RRTUnclustInsPct is an input parameter
of type INTEGER. The default is 10.
RRTDisorgLOBPct
Specifies a criterion for recommending that the REORG utility is to be run on a
table space. If the following condition is true, DSNACCOR recommends
running REORG:
The number of imperfectly chunked LOBs, divided by the total number of
rows or LOBs in the table space or partition (expressed as a percentage) is
greater than RRTDisorgLOBPct.
(See item 3 in Figure 22 on page 211.) RRTDisorgLOBPct is an input parameter
of type INTEGER. The default is 10.
RRTMassDelLimit
Specifies a criterion for recommending that the REORG utility is to be run on a
table space. If one of the following values is greater than RRTMassDelLimit,
DSNACCOR recommends running REORG:
v The number of mass deletes from a segmented or LOB table space since the
last REORG or LOAD REPLACE
v The number of dropped tables from a nonsegmented table space since the
last REORG or LOAD REPLACE
(See item 5 in Figure 22 on page 211.) RRTMassDelLimit is an input parameter
of type INTEGER. The default is 0.
RRTIndRefLimit
Specifies a criterion for recommending that the REORG utility is to be run on a
table space. If the following value is greater than RRTIndRefLimit, DSNACCOR
recommends running REORG:
The total number of overflow records that were created since the last
REORG or LOAD REPLACE, divided by the total number of rows or LOBs
in the table space or partition (expressed as a percentage)
(See item 4 in Figure 22 on page 211.) RRTIndRefLimit is an input parameter of
type INTEGER. The default is 10.
RRIInsertDeletePct
Specifies a criterion for recommending that the REORG utility is to be run on
an index space. If the following value is greater than RRIInsertDeletePct,
DSNACCOR recommends running REORG:
The sum of the number of index entries that were inserted and deleted
since the last REORG, divided by the total number of index entries in the
index space or partition (expressed as a percentage)
(See item 1 in Figure 23 on page 212.) RRIInsertDeletePct is an input parameter
of type INTEGER. The default is 20.
RRIAppendInsertPct
Specifies a criterion for recommending that the REORG utility is to be run on
an index space. If the following value is greater than RRIAppendInsertPct,
DSNACCOR recommends running REORG:
The number of index entries that were inserted since the last REORG,
REBUILD INDEX, or LOAD REPLACE with a key value greater than the
maximum key value in the index space or partition, divided by the number
of index entries in the index space or partition (expressed as a percentage)
(See item 2 in Figure 23 on page 212.) RRIAppendInsertPct is an input parameter
of type INTEGER. The default is 10.
RRIPseudoDeletePct
Specifies a criterion for recommending that the REORG utility is to be run on
an index space. If the following value is greater than RRIPseudoDeletePct,
DSNACCOR recommends running REORG:
The number of index entries that were pseudo-deleted since the last
REORG, REBUILD INDEX, or LOAD REPLACE, divided by the number of
index entries in the index space or partition (expressed as a percentage)
(See item 3 in Figure 23 on page 212.) RRIPseudoDeletePct is an input parameter
of type INTEGER. The default is 10.
RRIMassDelLimit
Specifies a criterion for recommending that the REORG utility is to be run on
an index space. If the number of mass deletes from an index space or partition
since the last REORG, REBUILD, or LOAD REPLACE is greater than this
value, DSNACCOR recommends running REORG.
(See item 4 in Figure 23 on page 212.) RRIMassDelLimit is an input parameter
of type INTEGER. The default is 0.
RRILeafLimit
Specifies a criterion for recommending that the REORG utility is to be run on
an index space. If the following value is greater than RRILeafLimit,
DSNACCOR recommends running REORG:
The number of index page splits that occurred since the last REORG,
REBUILD INDEX, or LOAD REPLACE that resulted in a large separation
between the parts of the original page, divided by the total number of
active pages in the index space or partition (expressed as a percentage)
(See item 5 in Figure 23 on page 212.) RRILeafLimit is an input parameter of
type INTEGER. The default is 10.
RRINumLevelsLimit
Specifies a criterion for recommending that the REORG utility is to be run on
an index space. If the following value is greater than RRINumLevelsLimit,
DSNACCOR recommends running REORG:
The number of levels in the index tree that were added or removed since
the last REORG, REBUILD INDEX, or LOAD REPLACE
(See item 6 in Figure 23 on page 212.) RRINumLevelsLimit is an input parameter
of type INTEGER. The default is 0.
SRTInsDelUpdPct
Specifies, when combined with SRTInsDelUpdAbs, a criterion for
recommending that the RUNSTATS utility is to be run on a table space. If both
of the following conditions are true, DSNACCOR recommends running
RUNSTATS:
v The number of insert, update, or delete operations since the last RUNSTATS
on a table space or partition, divided by the total number of rows or LOBs
in the table space or partition (expressed as a percentage) is greater than
SRTInsDelUpdPct.
v The sum of the number of insert, update, and delete operations since the last
RUNSTATS on the table space or partition is greater than SRTInsDelUpdAbs.
(See items 1 and 2 in Figure 24 on page 212.) SRTInsDelUpdPct is an input
parameter of type INTEGER. The default is 20.
SRTInsDelUpdAbs
Specifies, when combined with SRTInsDelUpdPct, a criterion for recommending
that the RUNSTATS utility is to be run on a table space. If both of the
following conditions are true, DSNACCOR recommends running RUNSTATS:
v The number of insert, update, and delete operations since the last
RUNSTATS on a table space or partition, divided by the total number of
rows or LOBs in the table space or partition (expressed as a percentage) is
greater than SRTInsDelUpdPct.
v The sum of the number of insert, update, and delete operations since the last
RUNSTATS on the table space or partition is greater than SRTInsDelUpdAbs.
(See items 1 and 2 in Figure 24 on page 212.) SRTInsDelUpdAbs is an input
parameter of type INTEGER. The default is 0.
SRTMassDelLimit
Specifies a criterion for recommending that the RUNSTATS utility is to be run
on a table space. If the following condition is true, DSNACCOR recommends
running RUNSTATS:
v The number of mass deletes from a table space or partition since the last
REORG or LOAD REPLACE is greater than SRTMassDelLimit.
(See item 3 in Figure 24 on page 212.) SRTMassDelLimit is an input parameter
of type INTEGER. The default is 0.
SRIInsDelUpdPct
Specifies, when combined with SRIInsDelUpdAbs, a criterion for recommending
that the RUNSTATS utility is to be run on an index space. If both of the
following conditions are true, DSNACCOR recommends running RUNSTATS:
v The number of inserted and deleted index entries since the last RUNSTATS
on an index space or partition, divided by the total number of index entries
in the index space or partition (expressed as a percentage) is greater than
SRIInsDelUpdPct.
v The sum of the number of inserted and deleted index entries since the last
RUNSTATS on an index space or partition is greater than SRIInsDelUpdAbs.
(See items 1 and 2 in Figure 25 on page 212.) SRIInsDelUpdPct is an input
parameter of type INTEGER. The default is 20.
SRIInsDelUpdAbs
Specifies, when combined with SRIInsDelUpdPct, a criterion for recommending
that the RUNSTATS utility is to be run on an index space. If both of the
following conditions are true, DSNACCOR recommends running RUNSTATS:
v The number of inserted and deleted index entries since the last RUNSTATS
on an index space or partition, divided by the total number of index entries
in the index space or partition (expressed as a percentage) is greater than
SRIInsDelUpdPct.
v The sum of the number of inserted and deleted index entries since the last
RUNSTATS on an index space or partition is greater than SRIInsDelUpdAbs.
(See items 1 and 2 in Figure 25 on page 212.) SRIInsDelUpdAbs is an input
parameter of type INTEGER. The default is 0.
SRIMassDelLimit
Specifies a criterion for recommending that the RUNSTATS utility is to be run
on an index space. If the number of mass deletes from an index space or
partition since the last REORG, REBUILD INDEX, or LOAD REPLACE is
greater than this value, DSNACCOR recommends running RUNSTATS.
(See item 3 in Figure 25 on page 212.) SRIMassDelLimit is an input parameter of
type INTEGER. The default is 0.
ExtentLimit
Specifies a criterion for recommending that the REORG utility is to be run on a
table space or index space. Also specifies that DSNACCOR is to warn the user
that the table space or index space has used too many extents. DSNACCOR
recommends running REORG and altering data set allocations if the following
condition is true:
v The number of physical extents in the index space, table space, or partition
is greater than ExtentLimit.
(See Figure 26 on page 212.) ExtentLimit is an input parameter of type
INTEGER. The default is 50.
LastStatement
When DSNACCOR returns a severe error (return code 12), this field contains
the SQL statement that was executing when the error occurred. LastStatement is
an output parameter of type VARCHAR(8012).
ReturnCode
The return code from DSNACCOR execution. Possible values are:
0
DSNACCOR executed successfully. The ErrorMsg parameter contains
the approximate percentage of the total number of objects in the
subsystem that have information in the real-time statistics tables.
4
DSNACCOR completed, but one or more input parameters might be
incompatible. The ErrorMsg parameter contains the input parameters
that might be incompatible.
8
DSNACCOR terminated with errors. The ErrorMsg parameter contains
a message that describes the error.
12
DSNACCOR terminated with severe errors. The ErrorMsg parameter
contains a message that describes the error. The LastStatement
parameter contains the SQL statement that was executing when the
error occurred.
14
DSNACCOR terminated because it could not access one or more of the
real-time statistics tables. The ErrorMsg parameter contains the names
of the tables that DSNACCOR could not access.
15
DSNACCOR terminated because it encountered a problem with one of
the declared temporary tables that it defines and uses.
16
DSNACCOR terminated because it could not define a declared
temporary table. No table spaces were defined in the TEMP database.
NULL DSNACCOR terminated but could not set a return code.
ReturnCode is an output parameter of type INTEGER.
ErrorMsg
Contains information about DSNACCOR execution. If DSNACCOR runs
successfully (ReturnCode=0), this field contains the approximate percentage of
objects in the subsystem that are in the real-time statistics tables. Otherwise,
this field contains error messages. ErrorMsg is an output parameter of type
VARCHAR(1331).
IFCARetCode
Contains the return code from an IFI COMMAND call. DSNACCOR issues
commands through the IFI interface to determine the status of objects.
IFCARetCode is an output parameter of type INTEGER.
IFCAResCode
Contains the reason code from an IFI COMMAND call. IFCAResCode is an
output parameter of type INTEGER.
ExcessBytes
Contains the number of bytes of information that did not fit in the IFI return
area after an IFI COMMAND call. ExcessBytes is an output parameter of type
INTEGER.
DSNACCOR formulas for recommending actions
The following formulas specify the criteria that DSNACCOR uses for its
recommendations and warnings. The variables in italics are DSNACCOR input
parameters. The capitalized variables are columns of the
SYSIBM.SYSTABLESPACESTATS or SYSIBM.SYSINDEXSPACESTATS tables. The
numbers to the right of selected items are reference numbers for the option
descriptions in “Option descriptions” on page 202.
The figure below shows the formula that DSNACCOR uses to recommend a full
image copy on a table space.
((QueryType=’COPY’ OR QueryType=’ALL’) AND
 (ObjectType=’TS’ OR ObjectType=’ALL’) AND
 ICType=’F’) AND
(COPYLASTTIME IS NULL OR
 REORGLASTTIME>COPYLASTTIME OR
 LOADRLASTTIME>COPYLASTTIME OR
 (CURRENT DATE-COPYLASTTIME)>CRDaySncLastCopy OR           1
 (COPYUPDATEDPAGES*100)/NACTIVE>CRUpdatedPagesPct OR       2
 (COPYCHANGES*100)/TOTALROWS>CRChangesPct)                 3
Figure 19. DSNACCOR formula for recommending a full image copy on a table space
The figure below shows the formula that DSNACCOR uses to recommend a full
image copy on an index space.
((QueryType=’COPY’ OR QueryType=’ALL’) AND
 (ObjectType=’IX’ OR ObjectType=’ALL’) AND
 (ICType=’F’ OR ICType=’B’)) AND
(COPYLASTTIME IS NULL OR
 REORGLASTTIME>COPYLASTTIME OR
 LOADRLASTTIME>COPYLASTTIME OR
 REBUILDLASTTIME>COPYLASTTIME OR
 (CURRENT DATE-COPYLASTTIME)>CRDaySncLastCopy OR           1
 (NACTIVE>CRIndexSize AND                                  2
  ((COPYUPDATEDPAGES*100)/NACTIVE>CRUpdatedPagesPct OR     3
   (COPYCHANGES*100)/TOTALENTRIES>CRChangesPct)))          4
Figure 20. DSNACCOR formula for recommending a full image copy on an index space
The figure below shows the formula that DSNACCOR uses to recommend an
incremental image copy on a table space.
((QueryType=’COPY’ OR QueryType=’ALL’) AND
 (ObjectType=’TS’ OR ObjectType=’ALL’) AND
 ICType=’I’ AND
 COPYLASTTIME IS NOT NULL) AND
(LOADRLASTTIME>COPYLASTTIME OR
 REORGLASTTIME>COPYLASTTIME OR
 (COPYUPDATEDPAGES*100)/NACTIVE>ICRUpdatedPagesPct OR      1
 (COPYCHANGES*100)/TOTALROWS>ICRChangesPct))               2
Figure 21. DSNACCOR formula for recommending an incremental image copy on a table space
The figure below shows the formula that DSNACCOR uses to recommend a
REORG on a table space. If the table space is a LOB table space, and CHCKLVL=1,
the formula does not include EXTENTS>ExtentLimit.
((QueryType=’REORG’ OR QueryType=’ALL’) AND
 (ObjectType=’TS’ OR ObjectType=’ALL’)) AND
(REORGLASTTIME IS NULL OR
 ((REORGINSERTS+REORGDELETES+REORGUPDATES)*100)/TOTALROWS>RRTInsDelUpdPct OR   1
 (REORGUNCLUSTINS*100)/TOTALROWS>RRTUnclustInsPct OR                           2
 (REORGDISORGLOB*100)/TOTALROWS>RRTDisorgLOBPct OR                             3
 ((REORGNEARINDREF+REORGFARINDREF)*100)/TOTALROWS>RRTIndRefLimit OR            4
 REORGMASSDELETE>RRTMassDelLimit OR                                            5
 EXTENTS>ExtentLimit)                                                          6
Figure 22. DSNACCOR formula for recommending a REORG on a table space
The figure below shows the formula that DSNACCOR uses to recommend a
REORG on an index space.
((QueryType=’REORG’ OR QueryType=’ALL’) AND
 (ObjectType=’IX’ OR ObjectType=’ALL’)) AND
(REORGLASTTIME IS NULL OR
 ((REORGINSERTS+REORGDELETES)*100)/TOTALENTRIES>RRIInsertDeletePct OR   1
 (REORGAPPENDINSERT*100)/TOTALENTRIES>RRIAppendInsertPct OR             2
 (REORGPSEUDODELETES*100)/TOTALENTRIES>RRIPseudoDeletePct OR            3
 REORGMASSDELETE>RRIMassDelLimit OR                                     4
 (REORGLEAFFAR*100)/NACTIVE>RRILeafLimit OR                             5
 REORGNUMLEVELS>RRINumLevelsLimit OR                                    6
 EXTENTS>ExtentLimit)                                                   7
Figure 23. DSNACCOR formula for recommending a REORG on an index space
The figure below shows the formula that DSNACCOR uses to recommend
RUNSTATS on a table space.
((QueryType=’RUNSTATS’ OR QueryType=’ALL’) AND
 (ObjectType=’TS’ OR ObjectType=’ALL’)) AND
(STATSLASTTIME IS NULL OR
 (((STATSINSERTS+STATSDELETES+STATSUPDATES)*100)/TOTALROWS>SRTInsDelUpdPct AND   1
  (STATSINSERTS+STATSDELETES+STATSUPDATES)>SRTInsDelUpdAbs) OR                   2
 STATSMASSDELETE>SRTMassDelLimit)                                                3
Figure 24. DSNACCOR formula for recommending RUNSTATS on a table space
The figure below shows the formula that DSNACCOR uses to recommend
RUNSTATS on an index space.
((QueryType=’RUNSTATS’ OR QueryType=’ALL’) AND
 (ObjectType=’IX’ OR ObjectType=’ALL’)) AND
(STATSLASTTIME IS NULL OR
 (((STATSINSERTS+STATSDELETES)*100)/TOTALENTRIES>SRIInsDelUpdPct AND   1
  (STATSINSERTS+STATSDELETES)>SRIInsDelUpdAbs) OR                      2
 STATSMASSDELETE>SRIMassDelLimit)                                      3
Figure 25. DSNACCOR formula for recommending RUNSTATS on an index space
The figure below shows the formula that DSNACCOR uses to warn that too many
index space or table space extents have been used.
EXTENTS>ExtentLimit
Figure 26. DSNACCOR formula for warning that too many data set extents for a table space or index space are used
Using an exception table
An exception table is an optional, user-created DB2 table that you can use to place
information in the INEXCEPTTABLE column of the recommendations result set.
You can put any information in the INEXCEPTTABLE column, but the most
common use of this column is to filter the recommendations result set. Each row in
the exception table represents an object for which you want to provide information
for the recommendations result set.
To create the exception table, execute a CREATE TABLE statement similar to the
following one. You can include other columns in the exception table, but you must
include at least the columns that are shown.
CREATE TABLE DSNACC.EXCEPT_TBL
(DBNAME CHAR(8) NOT NULL,
NAME CHAR(8) NOT NULL,
QUERYTYPE CHAR(40))
CCSID EBCDIC;
The meanings of the columns are:
DBNAME
The database name for an object in the exception table.
NAME
The table space name or index space name for an object in the exception table.
QUERYTYPE
The information that you want to place in the INEXCEPTTABLE column of the
recommendations result set.
If you put a null value in this column, DSNACCOR puts the value YES in the
INEXCEPTTABLE column of the recommendations result set row for the object
that matches the DBNAME and NAME values.
Recommendation: If you plan to put many rows in the exception table, create a
nonunique index on DBNAME, NAME, and QUERYTYPE.
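For example, a minimal sketch of such an index follows. The index name
DSNACC.XEXCEPT_TBL is illustrative; DSNACCOR does not require a particular
name:
CREATE INDEX DSNACC.XEXCEPT_TBL
  ON DSNACC.EXCEPT_TBL
  (DBNAME, NAME, QUERYTYPE);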
After you create the exception table, insert a row for each object for which you
want to include information in the INEXCEPTTABLE column.
Example: Suppose that you want the INEXCEPTTABLE column to contain the
string 'IRRELEVANT' for table space STAFF in database DSNDB04. You also want
the INEXCEPTTABLE column to contain 'CURRENT' for table space DSN8S91D in
database DSN8D91A. Execute these INSERT statements:
INSERT INTO DSNACC.EXCEPT_TBL VALUES(’DSNDB04 ’, ’STAFF ’, ’IRRELEVANT’);
INSERT INTO DSNACC.EXCEPT_TBL VALUES(’DSN8D91A’, ’DSN8S91D’, ’CURRENT’);
To use the contents of INEXCEPTTABLE for filtering, include a condition that
involves the INEXCEPTTABLE column in the search condition that you specify in
your Criteria input parameter.
Example: Suppose that you want to include all rows for database DSNDB04 in
the recommendations result set, except for those rows that contain the string
'IRRELEVANT' in the INEXCEPTTABLE column. You might include the following
search condition in your Criteria input parameter:
DBNAME=’DSNDB04’ AND INEXCEPTTABLE<>’IRRELEVANT’
Example
The following COBOL example shows variable declarations and an SQL CALL
for obtaining recommendations for objects in databases DSN8D91A and
DSN8D91L. This example also outlines the steps that you need to perform to
retrieve the two result sets that DSNACCOR returns.
WORKING-STORAGE SECTION.
.
.
.
***********************
* DSNACCOR PARAMETERS *
***********************
01 QUERYTYPE.
   49 QUERYTYPE-LN       PICTURE S9(4) COMP VALUE 40.
   49 QUERYTYPE-DTA      PICTURE X(40) VALUE ’ALL’.
01 OBJECTTYPE.
   49 OBJECTTYPE-LN      PICTURE S9(4) COMP VALUE 3.
   49 OBJECTTYPE-DTA     PICTURE X(3) VALUE ’ALL’.
01 ICTYPE.
   49 ICTYPE-LN          PICTURE S9(4) COMP VALUE 1.
   49 ICTYPE-DTA         PICTURE X(1) VALUE ’B’.
01 STATSSCHEMA.
   49 STATSSCHEMA-LN     PICTURE S9(4) COMP VALUE 128.
   49 STATSSCHEMA-DTA    PICTURE X(128) VALUE ’SYSIBM’.
01 CATLGSCHEMA.
   49 CATLGSCHEMA-LN     PICTURE S9(4) COMP VALUE 128.
   49 CATLGSCHEMA-DTA    PICTURE X(128) VALUE ’SYSIBM’.
01 LOCALSCHEMA.
   49 LOCALSCHEMA-LN     PICTURE S9(4) COMP VALUE 128.
   49 LOCALSCHEMA-DTA    PICTURE X(128) VALUE ’DSNACC’.
01 CHKLVL                PICTURE S9(9) COMP VALUE +3.
01 CRITERIA.
   49 CRITERIA-LN        PICTURE S9(4) COMP VALUE 4096.
   49 CRITERIA-DTA       PICTURE X(4096) VALUE SPACES.
01 RESTRICTED.
   49 RESTRICTED-LN      PICTURE S9(4) COMP VALUE 80.
   49 RESTRICTED-DTA     PICTURE X(80) VALUE SPACES.
01 CRUPDATEDPAGESPCT     PICTURE S9(9) COMP VALUE +0.
01 CRCHANGESPCT          PICTURE S9(9) COMP VALUE +0.
01 CRDAYSNCLASTCOPY      PICTURE S9(9) COMP VALUE +0.
01 ICRUPDATEDPAGESPCT    PICTURE S9(9) COMP VALUE +0.
01 ICRCHANGESPCT         PICTURE S9(9) COMP VALUE +0.
01 CRINDEXSIZE           PICTURE S9(9) COMP VALUE +0.
01 RRTINSDELUPDPCT       PICTURE S9(9) COMP VALUE +0.
01 RRTUNCLUSTINSPCT      PICTURE S9(9) COMP VALUE +0.
01 RRTDISORGLOBPCT       PICTURE S9(9) COMP VALUE +0.
01 RRTMASSDELLIMIT       PICTURE S9(9) COMP VALUE +0.
01 RRTINDREFLIMIT        PICTURE S9(9) COMP VALUE +0.
01 RRIINSERTDELETEPCT    PICTURE S9(9) COMP VALUE +0.
01 RRIAPPENDINSERTPCT    PICTURE S9(9) COMP VALUE +0.
01 RRIPSEUDODELETEPCT    PICTURE S9(9) COMP VALUE +0.
01 RRIMASSDELLIMIT       PICTURE S9(9) COMP VALUE +0.
01 RRILEAFLIMIT          PICTURE S9(9) COMP VALUE +0.
01 RRINUMLEVELSLIMIT     PICTURE S9(9) COMP VALUE +0.
01 SRTINSDELUPDPCT       PICTURE S9(9) COMP VALUE +0.
01 SRTINSDELUPDABS       PICTURE S9(9) COMP VALUE +0.
01 SRTMASSDELLIMIT       PICTURE S9(9) COMP VALUE +0.
01 SRIINSDELUPDPCT       PICTURE S9(9) COMP VALUE +0.
01 SRIINSDELUPDABS       PICTURE S9(9) COMP VALUE +0.
01 SRIMASSDELLIMIT       PICTURE S9(9) COMP VALUE +0.
01 EXTENTLIMIT           PICTURE S9(9) COMP VALUE +0.
01 LASTSTATEMENT.
   49 LASTSTATEMENT-LN   PICTURE S9(4) COMP VALUE 8012.
   49 LASTSTATEMENT-DTA  PICTURE X(8012) VALUE SPACES.
01 RETURNCODE            PICTURE S9(9) COMP VALUE +0.
01 ERRORMSG.
   49 ERRORMSG-LN        PICTURE S9(4) COMP VALUE 1331.
   49 ERRORMSG-DTA       PICTURE X(1331) VALUE SPACES.
01 IFCARETCODE           PICTURE S9(9) COMP VALUE +0.
01 IFCARESCODE           PICTURE S9(9) COMP VALUE +0.
01 EXCESSBYTES           PICTURE S9(9) COMP VALUE +0.
*****************************************
* INDICATOR VARIABLES.                  *
* INITIALIZE ALL NON-ESSENTIAL INPUT    *
* VARIABLES TO -1, TO INDICATE THAT THE *
* INPUT VALUE IS NULL.                  *
*****************************************
01 QUERYTYPE-IND          PICTURE S9(4) COMP-4 VALUE +0.
01 OBJECTTYPE-IND         PICTURE S9(4) COMP-4 VALUE +0.
01 ICTYPE-IND             PICTURE S9(4) COMP-4 VALUE +0.
01 STATSSCHEMA-IND        PICTURE S9(4) COMP-4 VALUE -1.
01 CATLGSCHEMA-IND        PICTURE S9(4) COMP-4 VALUE -1.
01 LOCALSCHEMA-IND        PICTURE S9(4) COMP-4 VALUE -1.
01 CHKLVL-IND             PICTURE S9(4) COMP-4 VALUE -1.
01 CRITERIA-IND           PICTURE S9(4) COMP-4 VALUE -1.
01 RESTRICTED-IND         PICTURE S9(4) COMP-4 VALUE -1.
01 CRUPDATEDPAGESPCT-IND  PICTURE S9(4) COMP-4 VALUE -1.
01 CRCHANGESPCT-IND       PICTURE S9(4) COMP-4 VALUE -1.
01 CRDAYSNCLASTCOPY-IND   PICTURE S9(4) COMP-4 VALUE -1.
01 ICRUPDATEDPAGESPCT-IND PICTURE S9(4) COMP-4 VALUE -1.
01 ICRCHANGESPCT-IND      PICTURE S9(4) COMP-4 VALUE -1.
01 CRINDEXSIZE-IND        PICTURE S9(4) COMP-4 VALUE -1.
01 RRTINSDELUPDPCT-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 RRTUNCLUSTINSPCT-IND   PICTURE S9(4) COMP-4 VALUE -1.
01 RRTDISORGLOBPCT-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 RRTMASSDELLIMIT-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 RRTINDREFLIMIT-IND     PICTURE S9(4) COMP-4 VALUE -1.
01 RRIINSERTDELETEPCT-IND PICTURE S9(4) COMP-4 VALUE -1.
01 RRIAPPENDINSERTPCT-IND PICTURE S9(4) COMP-4 VALUE -1.
01 RRIPSEUDODELETEPCT-IND PICTURE S9(4) COMP-4 VALUE -1.
01 RRIMASSDELLIMIT-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 RRILEAFLIMIT-IND       PICTURE S9(4) COMP-4 VALUE -1.
01 RRINUMLEVELSLIMIT-IND  PICTURE S9(4) COMP-4 VALUE -1.
01 SRTINSDELUPDPCT-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 SRTINSDELUPDABS-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 SRTMASSDELLIMIT-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 SRIINSDELUPDPCT-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 SRIINSDELUPDABS-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 SRIMASSDELLIMIT-IND    PICTURE S9(4) COMP-4 VALUE -1.
01 EXTENTLIMIT-IND        PICTURE S9(4) COMP-4 VALUE -1.
01 LASTSTATEMENT-IND      PICTURE S9(4) COMP-4 VALUE +0.
01 RETURNCODE-IND         PICTURE S9(4) COMP-4 VALUE +0.
01 ERRORMSG-IND           PICTURE S9(4) COMP-4 VALUE +0.
01 IFCARETCODE-IND        PICTURE S9(4) COMP-4 VALUE +0.
01 IFCARESCODE-IND        PICTURE S9(4) COMP-4 VALUE +0.
01 EXCESSBYTES-IND        PICTURE S9(4) COMP-4 VALUE +0.
PROCEDURE DIVISION.
.
.
.
*********************************************************
* SET VALUES FOR DSNACCOR INPUT PARAMETERS:             *
* - USE THE CHKLVL PARAMETER TO CAUSE DSNACCOR TO CHECK *
*   FOR ORPHANED OBJECTS AND INDEX SPACES WITHOUT       *
*   TABLE SPACES, BUT INCLUDE THOSE OBJECTS IN THE      *
*   RECOMMENDATIONS RESULT SET (CHKLVL=1+2+16=19)       *
* - USE THE CRITERIA PARAMETER TO CAUSE DSNACCOR TO     *
*   MAKE RECOMMENDATIONS ONLY FOR OBJECTS IN DATABASES  *
*   DSN8D91A AND DSN8D91L.                              *
* - FOR THE FOLLOWING PARAMETERS, SET THESE VALUES,     *
*   WHICH ARE LOWER THAN THE DEFAULTS:                  *
*   CRUPDATEDPAGESPCT    4                              *
*   CRCHANGESPCT         2                              *
*   RRTINSDELUPDPCT      2                              *
*   RRTUNCLUSTINSPCT     5                              *
*   RRTDISORGLOBPCT      5                              *
*   RRIAPPENDINSERTPCT   5                              *
*   SRTINSDELUPDPCT      5                              *
*   SRIINSDELUPDPCT      5                              *
*   EXTENTLIMIT          3                              *
*********************************************************
MOVE 19 TO CHKLVL.
MOVE SPACES TO CRITERIA-DTA.
MOVE ’DBNAME = ’’DSN8D91A’’ OR DBNAME = ’’DSN8D91L’’’
TO CRITERIA-DTA.
MOVE 46 TO CRITERIA-LN.
MOVE 4 TO CRUPDATEDPAGESPCT.
MOVE 2 TO CRCHANGESPCT.
MOVE 2 TO RRTINSDELUPDPCT.
MOVE 5 TO RRTUNCLUSTINSPCT.
MOVE 5 TO RRTDISORGLOBPCT.
MOVE 5 TO RRIAPPENDINSERTPCT.
MOVE 5 TO SRTINSDELUPDPCT.
MOVE 5 TO SRIINSDELUPDPCT.
MOVE 3 TO EXTENTLIMIT.
********************************
* INITIALIZE OUTPUT PARAMETERS *
********************************
MOVE SPACES TO LASTSTATEMENT-DTA.
MOVE 1 TO LASTSTATEMENT-LN.
MOVE 0 TO RETURNCODE.
MOVE SPACES TO ERRORMSG-DTA.
MOVE 1 TO ERRORMSG-LN.
MOVE 0 TO IFCARETCODE.
MOVE 0 TO IFCARESCODE.
MOVE 0 TO EXCESSBYTES.
*******************************************************
* SET THE INDICATOR VARIABLES TO 0 FOR NON-NULL INPUT *
* PARAMETERS (PARAMETERS FOR WHICH YOU DO NOT WANT    *
* DSNACCOR TO USE DEFAULT VALUES) AND FOR OUTPUT      *
* PARAMETERS.                                         *
*******************************************************
MOVE 0 TO CHKLVL-IND.
MOVE 0 TO CRITERIA-IND.
MOVE 0 TO CRUPDATEDPAGESPCT-IND.
MOVE 0 TO CRCHANGESPCT-IND.
MOVE 0 TO RRTINSDELUPDPCT-IND.
MOVE 0 TO RRTUNCLUSTINSPCT-IND.
MOVE 0 TO RRTDISORGLOBPCT-IND.
MOVE 0 TO RRIAPPENDINSERTPCT-IND.
MOVE 0 TO SRTINSDELUPDPCT-IND.
MOVE 0 TO SRIINSDELUPDPCT-IND.
MOVE 0 TO EXTENTLIMIT-IND.
MOVE 0 TO LASTSTATEMENT-IND.
MOVE 0 TO RETURNCODE-IND.
MOVE 0 TO ERRORMSG-IND.
MOVE 0 TO IFCARETCODE-IND.
MOVE 0 TO IFCARESCODE-IND.
MOVE 0 TO EXCESSBYTES-IND.
.
.
.
*****************
* CALL DSNACCOR *
*****************
EXEC SQL
 CALL SYSPROC.DSNACCOR
 (:QUERYTYPE           :QUERYTYPE-IND,
  :OBJECTTYPE          :OBJECTTYPE-IND,
  :ICTYPE              :ICTYPE-IND,
  :STATSSCHEMA         :STATSSCHEMA-IND,
  :CATLGSCHEMA         :CATLGSCHEMA-IND,
  :LOCALSCHEMA         :LOCALSCHEMA-IND,
  :CHKLVL              :CHKLVL-IND,
  :CRITERIA            :CRITERIA-IND,
  :RESTRICTED          :RESTRICTED-IND,
  :CRUPDATEDPAGESPCT   :CRUPDATEDPAGESPCT-IND,
  :CRCHANGESPCT        :CRCHANGESPCT-IND,
  :CRDAYSNCLASTCOPY    :CRDAYSNCLASTCOPY-IND,
  :ICRUPDATEDPAGESPCT  :ICRUPDATEDPAGESPCT-IND,
  :ICRCHANGESPCT       :ICRCHANGESPCT-IND,
  :CRINDEXSIZE         :CRINDEXSIZE-IND,
  :RRTINSDELUPDPCT     :RRTINSDELUPDPCT-IND,
  :RRTUNCLUSTINSPCT    :RRTUNCLUSTINSPCT-IND,
  :RRTDISORGLOBPCT     :RRTDISORGLOBPCT-IND,
  :RRTMASSDELLIMIT     :RRTMASSDELLIMIT-IND,
  :RRTINDREFLIMIT      :RRTINDREFLIMIT-IND,
  :RRIINSERTDELETEPCT  :RRIINSERTDELETEPCT-IND,
  :RRIAPPENDINSERTPCT  :RRIAPPENDINSERTPCT-IND,
  :RRIPSEUDODELETEPCT  :RRIPSEUDODELETEPCT-IND,
  :RRIMASSDELLIMIT     :RRIMASSDELLIMIT-IND,
  :RRILEAFLIMIT        :RRILEAFLIMIT-IND,
  :RRINUMLEVELSLIMIT   :RRINUMLEVELSLIMIT-IND,
  :SRTINSDELUPDPCT     :SRTINSDELUPDPCT-IND,
  :SRTINSDELUPDABS     :SRTINSDELUPDABS-IND,
  :SRTMASSDELLIMIT     :SRTMASSDELLIMIT-IND,
  :SRIINSDELUPDPCT     :SRIINSDELUPDPCT-IND,
  :SRIINSDELUPDABS     :SRIINSDELUPDABS-IND,
  :SRIMASSDELLIMIT     :SRIMASSDELLIMIT-IND,
  :EXTENTLIMIT         :EXTENTLIMIT-IND,
  :LASTSTATEMENT       :LASTSTATEMENT-IND,
  :RETURNCODE          :RETURNCODE-IND,
  :ERRORMSG            :ERRORMSG-IND,
  :IFCARETCODE         :IFCARETCODE-IND,
  :IFCARESCODE         :IFCARESCODE-IND,
  :EXCESSBYTES         :EXCESSBYTES-IND)
END-EXEC.
*************************************************************
* ASSUME THAT THE SQL CALL RETURNED +466, WHICH MEANS THAT  *
* RESULT SETS WERE RETURNED. RETRIEVE RESULT SETS.          *
*************************************************************
* LINK EACH RESULT SET TO A LOCATOR VARIABLE
EXEC SQL ASSOCIATE LOCATORS (:LOC1, :LOC2)
WITH PROCEDURE SYSPROC.DSNACCOR
END-EXEC.
* LINK A CURSOR TO EACH RESULT SET
EXEC SQL ALLOCATE C1 CURSOR FOR RESULT SET :LOC1
END-EXEC.
EXEC SQL ALLOCATE C2 CURSOR FOR RESULT SET :LOC2
END-EXEC.
* PERFORM FETCHES USING C1 TO RETRIEVE ALL ROWS FROM FIRST RESULT SET
* PERFORM FETCHES USING C2 TO RETRIEVE ALL ROWS FROM SECOND RESULT SET
Figure 27. Example of DSNACCOR invocation
Output
If DSNACCOR executes successfully, in addition to the output parameters
described in “Option descriptions” on page 202, DSNACCOR returns two result
sets.
The first result set contains the results from IFI COMMAND calls that DSNACCOR
makes. The following table shows the format of the first result set.
Table 53. Result set row for first DSNACCOR result set
Column name     Data type     Contents
RS_SEQUENCE     INTEGER       Sequence number of the output line
RS_DATA         CHAR(80)      A line of command output
The second result set contains DSNACCOR's recommendations. This result set
contains one or more rows for a table space or index space. A nonpartitioned table
space or nonpartitioning index space can have at most one row in the result set. A
partitioned table space or partitioning index space can have at most one row for
each partition. A table space, index space, or partition has a row in the result set if
both of the following conditions are true:
v If the Criteria input parameter contains a search condition, the search condition
is true for the table space, index space, or partition.
v DSNACCOR recommends at least one action for the table space, index space, or
partition.
The following table shows the columns of a result set row.
Table 54. Result set row for second DSNACCOR result set
DBNAME (CHAR(8))
   Name of the database that contains the object.
NAME (CHAR(8))
   Table space or index space name.
PARTITION (INTEGER)
   Data set number or partition number.
OBJECTTYPE (CHAR(2))
   DB2 object type:
   v TS for a table space
   v IX for an index space
OBJECTSTATUS (CHAR(36))
   Status of the object:
   v ORPHANED, if the object is an index space with no corresponding table
     space, or if the object does not exist
   v If the object is in a restricted state, one of the following values:
     – TS=restricted-state, if OBJECTTYPE is TS
     – IX=restricted-state, if OBJECTTYPE is IX
     restricted-state is one of the status codes that appear in DISPLAY
     DATABASE output. See DB2 Command Reference for details.
   v A, if the object is in an advisory state.
   v L, if the object is a logical partition, but not in an advisory state.
   v AL, if the object is a logical partition and in an advisory state.
IMAGECOPY (CHAR(3))
   COPY recommendation:
   v If OBJECTTYPE is TS: FUL (full image copy), INC (incremental image
     copy), or NO
   v If OBJECTTYPE is IX: YES or NO
RUNSTATS (CHAR(3))
   RUNSTATS recommendation: YES or NO.
EXTENTS (CHAR(3))
   Indicates whether the data sets for the object have exceeded ExtentLimit:
   YES or NO.
REORG (CHAR(3))
   REORG recommendation: YES or NO.
INEXCEPTTABLE (CHAR(40))
   A string that contains one of the following values:
   v Text that you specify in the QUERYTYPE column of the exception table.
   v YES, if you put a row in the exception table for the object that this
     result set row represents, but you specify NULL in the QUERYTYPE
     column.
   v NO, if the exception table exists but does not have a row for the object
     that this result set row represents.
   v Null, if the exception table does not exist, or if the ChkLvl input
     parameter does not include the value 4.
ASSOCIATEDTS (CHAR(8))
   If OBJECTTYPE is IX and the ChkLvl input parameter includes the value 2,
   this value is the name of the table space that is associated with the index
   space. Otherwise null.
COPYLASTTIME (TIMESTAMP)
   Timestamp of the last full image copy on the object. Null if COPY was
   never run, or if the last COPY execution was terminated.
LOADRLASTTIME (TIMESTAMP)
   Timestamp of the last LOAD REPLACE on the object. Null if LOAD
   REPLACE was never run, or if the last LOAD REPLACE execution was
   terminated.
REBUILDLASTTIME (TIMESTAMP)
   Timestamp of the last REBUILD INDEX on the object. Null if REBUILD
   INDEX was never run, or if the last REBUILD INDEX execution was
   terminated.
CRUPDPGSPCT (INTEGER)
   If OBJECTTYPE is TS or IX and IMAGECOPY is YES, the ratio of distinct
   updated pages to preformatted pages, expressed as a percentage.
   Otherwise null.
CRCPYCHGPCT (INTEGER)
   If OBJECTTYPE is TS and IMAGECOPY is YES, the ratio of the total
   number of insert, update, and delete operations since the last image copy
   to the total number of rows or LOBs in the table space or partition,
   expressed as a percentage. If OBJECTTYPE is IX and IMAGECOPY is YES,
   the ratio of the total number of insert and delete operations since the last
   image copy to the total number of entries in the index space or partition,
   expressed as a percentage. Otherwise null.
CRDAYSCELSTCPY (INTEGER)
   If OBJECTTYPE is TS or IX and IMAGECOPY is YES, the number of days
   since the last image copy. Otherwise null.
CRINDEXSIZE (INTEGER)
   If OBJECTTYPE is IX and IMAGECOPY is YES, the number of active pages
   in the index space or partition. Otherwise null.
REORGLASTTIME (TIMESTAMP)
   Timestamp of the last REORG on the object. Null if REORG was never
   run, or if the last REORG execution was terminated.
RRTINSDELUPDPCT (INTEGER)
   If OBJECTTYPE is TS and REORG is YES, the ratio of the sum of insert,
   update, and delete operations since the last REORG to the total number of
   rows or LOBs in the table space or partition, expressed as a percentage.
   Otherwise null.
RRTUNCINSPCT (INTEGER)
   If OBJECTTYPE is TS and REORG is YES, the ratio of the number of
   unclustered insert operations to the total number of rows or LOBs in the
   table space or partition, expressed as a percentage. Otherwise null.
RRTDISORGLOBPCT (INTEGER)
   If OBJECTTYPE is TS and REORG is YES, the ratio of the number of
   imperfectly chunked LOBs to the total number of rows or LOBs in the
   table space or partition, expressed as a percentage. Otherwise null.
RRTMASSDELETE (INTEGER)
   If OBJECTTYPE is TS, REORG is YES, and the table space is a segmented
   table space or LOB table space, the number of mass deletes since the last
   REORG or LOAD REPLACE. If OBJECTTYPE is TS, REORG is YES, and
   the table space is nonsegmented, the number of dropped tables since the
   last REORG or LOAD REPLACE. Otherwise null.
RRTINDREF (INTEGER)
   If OBJECTTYPE is TS and REORG is YES, the ratio of the total number of
   overflow records that were created since the last REORG or LOAD
   REPLACE to the total number of rows or LOBs in the table space or
   partition, expressed as a percentage. Otherwise null.
RRIINSDELPCT (INTEGER)
   If OBJECTTYPE is IX and REORG is YES, the ratio of the total number of
   insert and delete operations since the last REORG to the total number of
   index entries in the index space or partition, expressed as a percentage.
   Otherwise null.
RRIAPPINSPCT (INTEGER)
   If OBJECTTYPE is IX and REORG is YES, the ratio of the number of index
   entries that were inserted since the last REORG, REBUILD INDEX, or
   LOAD REPLACE that had a key value greater than the maximum key
   value in the index space or partition, to the number of index entries in the
   index space or partition, expressed as a percentage. Otherwise null.
RRIPSDDELPCT (INTEGER)
   If OBJECTTYPE is IX and REORG is YES, the ratio of the number of index
   entries that were pseudo-deleted (the RID entry was marked as deleted)
   since the last REORG, REBUILD INDEX, or LOAD REPLACE to the
   number of index entries in the index space or partition, expressed as a
   percentage. Otherwise null.
RRIMASSDELETE (INTEGER)
   If OBJECTTYPE is IX and REORG is YES, the number of mass deletes
   from the index space or partition since the last REORG, REBUILD, or
   LOAD REPLACE. Otherwise null.
RRILEAF (INTEGER)
   If OBJECTTYPE is IX and REORG is YES, the ratio of the number of index
   page splits that occurred since the last REORG, REBUILD INDEX, or
   LOAD REPLACE in which the higher part of the split page was far from
   the location of the original page, to the total number of active pages in the
   index space or partition, expressed as a percentage. Otherwise null.
RRINUMLEVELS (INTEGER)
   If OBJECTTYPE is IX and REORG is YES, the number of levels in the
   index tree that were added or removed since the last REORG, REBUILD
   INDEX, or LOAD REPLACE. Otherwise null.
STATSLASTTIME (TIMESTAMP)
   Timestamp of the last RUNSTATS on the object. Null if RUNSTATS was
   never run, or if the last RUNSTATS execution was terminated.
SRTINSDELUPDPCT (INTEGER)
   If OBJECTTYPE is TS and RUNSTATS is YES, the ratio of the total number
   of insert, update, and delete operations since the last RUNSTATS on a
   table space or partition, to the total number of rows or LOBs in the table
   space or partition, expressed as a percentage. Otherwise null.
SRTINSDELUPDABS (INTEGER)
   If OBJECTTYPE is TS and RUNSTATS is YES, the total number of insert,
   update, and delete operations since the last RUNSTATS on a table space
   or partition. Otherwise null.
SRTMASSDELETE (INTEGER)
   If OBJECTTYPE is TS and RUNSTATS is YES, the number of mass deletes
   from the table space or partition since the last REORG or LOAD
   REPLACE. Otherwise null.
SRIINSDELPCT (INTEGER)
   If OBJECTTYPE is IX and RUNSTATS is YES, the ratio of the total number
   of insert and delete operations since the last RUNSTATS on the index
   space or partition, to the total number of index entries in the index space
   or partition, expressed as a percentage. Otherwise null.
SRIINSDELABS (INTEGER)
   If OBJECTTYPE is IX and RUNSTATS is YES, the number of insert and
   delete operations since the last RUNSTATS on the index space or
   partition. Otherwise null.
SRIMASSDELETE (INTEGER)
   If OBJECTTYPE is IX and RUNSTATS is YES, the number of mass deletes
   from the index space or partition since the last REORG, REBUILD INDEX,
   or LOAD REPLACE. Otherwise, this value is null.
TOTALEXTENTS (SMALLINT)
   If EXTENTS is YES, the number of physical extents in the table space,
   index space, or partition. Otherwise, this value is null.
Related reference
CREATE DATABASE (DB2 SQL)
CREATE TABLESPACE (DB2 SQL)
When DB2 externalizes real-time statistics
DB2 externalizes real-time statistics in certain circumstances.
v When you issue STOP DATABASE(DSNDB06) SPACENAM(space-name).
This command stops the in-memory statistics database and externalizes statistics
for all objects in the subsystem. No statistics are externalized for the DSNRTSDB
database.
v When you issue STOP DATABASE(database-name) SPACENAM(space-name)
This command externalizes statistics only for database-name and space-name. No
statistics are externalized when the DSNRTSDB database is stopped.
v At the end of the time interval that you specify during installation (the
STATSINT system parameter).
v When you issue STOP DB2 MODE(QUIESCE).
DB2 writes any statistics that are in memory when you issue this command to
the statistics tables. However, if you issue STOP DB2 MODE(FORCE), DB2 does
not write the statistics, and you lose them.
v During utility operations.
Some utilities modify the statistics tables.
DB2 does not maintain real-time statistics for any objects in the real-time
statistics database. Therefore, if you run a utility with a utility list, and the list
contains any real-time statistics objects, DB2 does not externalize real-time
statistics during the execution of that utility for any of the objects in the utility
list. DB2 does not maintain interval counter real-time statistics for SYSLGRNX
and its indexes during utility operation. DB2 maintains statistics for these objects
only during non-utility operation.
Recommendation: Do not include real-time statistics objects in utility lists.
DB2 does not externalize real-time statistics at a tracker site.
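For example, to force DB2 to externalize the in-memory statistics for one table
space, you might stop and then restart that space. The object names here are
illustrative:
-STOP DATABASE(DSN8D91A) SPACENAM(DSN8S91D)
-START DATABASE(DSN8D91A) SPACENAM(DSN8S91D)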
How DB2 utilities affect the real-time statistics
In general, SQL INSERT, UPDATE, and DELETE statements cause DB2 to modify
the real-time statistics.
However, certain DB2 utilities also affect the statistics. The following topics discuss
the effect of each of those utilities on the statistics.
How LOAD affects real-time statistics:
When you run LOAD REPLACE on a table space or table space partition, you
change the statistics associated with that table space or partition.
The table below shows how running LOAD REPLACE on a table space or table
space partition affects the SYSTABLESPACESTATS statistics.
Table 55. Changed SYSTABLESPACESTATS values during LOAD REPLACE
Column name          Settings for LOAD REPLACE after RELOAD phase
TOTALROWS            Number of loaded rows or LOBs (note 1)
DATASIZE             Actual value
NPAGES               Actual value
NACTIVE              Actual value
SPACE                Actual value
EXTENTS              Actual value
LOADRLASTTIME        Current timestamp
REORGINSERTS         0
REORGDELETES         0
REORGUPDATES         0
REORGDISORGLOB       0
REORGUNCLUSTINS      0
REORGMASSDELETE      0
REORGNEARINDREF      0
REORGFARINDREF       0
STATSLASTTIME        Current timestamp (note 2)
STATSINSERTS         0 (note 2)
STATSDELETES         0 (note 2)
STATSUPDATES         0 (note 2)
STATSMASSDELETE      0 (note 2)
COPYLASTTIME         Current timestamp (note 3)
COPYUPDATEDPAGES     0 (note 3)
COPYCHANGES          0 (note 3)
COPYUPDATELRSN       Null (note 3)
COPYUPDATETIME       Null (note 3)
Notes:
1. Under certain conditions, such as a utility restart, the LOAD utility might not have an
accurate count of loaded records. In those cases, DB2 sets this value to null. Some rows
that are loaded into a table space and are included in this value might later be removed
during the index validation phase or the referential integrity check. DB2 includes counts
of those removed records in the statistics that record deleted records.
2. DB2 sets this value only if the LOAD invocation includes the STATISTICS option.
3. DB2 sets this value only if the LOAD invocation includes the COPYDDN option.
Table 56. Changed SYSINDEXSPACESTATS values during LOAD REPLACE
Column name          Settings for LOAD REPLACE after BUILD phase
TOTALENTRIES         Number of index entries added (note 1)
NLEAF                Actual value
NLEVELS              Actual value
NACTIVE              Actual value
SPACE                Actual value
EXTENTS              Actual value
LOADRLASTTIME        Current timestamp
REORGINSERTS         0
REORGDELETES         0
REORGAPPENDINSERT    0
REORGPSEUDODELETES   0
REORGMASSDELETE      0
REORGLEAFNEAR        0
REORGLEAFFAR         0
REORGNUMLEVELS       0
STATSLASTTIME        Current timestamp (note 2)
STATSINSERTS         0 (note 2)
STATSDELETES         0 (note 2)
STATSMASSDELETE      0 (note 2)
COPYLASTTIME         Current timestamp (note 3)
COPYUPDATEDPAGES     0 (note 3)
COPYCHANGES          0 (note 3)
COPYUPDATELRSN       Null (note 3)
COPYUPDATETIME       Null (note 3)
Notes:
1. Under certain conditions, such as a utility restart, the LOAD utility might not
have an accurate count of loaded records. In those cases, DB2 sets this value to
null.
2. DB2 sets this value only if the LOAD invocation includes the STATISTICS
option.
3. DB2 sets this value only if the LOAD invocation includes the COPYDDN
option.
For a logical index partition:
v A LOAD operation without the REPLACE option behaves similarly to an SQL
INSERT operation in that the number of records loaded is counted in the
incremental counters, such as REORGINSERTS, REORGAPPENDINSERT,
STATSINSERTS, and COPYCHANGES. A LOAD operation without the
REPLACE option affects the organization of the data and can be a trigger to
run REORG, RUNSTATS, or COPY.
v DB2 does not reset the nonpartitioned index when it does a LOAD REPLACE on
a partition. Therefore, DB2 does not reset the statistics for the index. The REORG
counters from the last REORG are still correct. DB2 updates LOADRLASTTIME
when the entire nonpartitioned index is replaced.
v When DB2 does a LOAD RESUME YES on a partition, after the BUILD phase,
DB2 increments TOTALENTRIES by the number of index entries that were
inserted during the BUILD phase.
How REORG affects real-time statistics:
When you run the REORG utility, DB2 modifies some of the real-time statistics
for the involved table space or index.
The table below shows how running REORG on a table space or table space
partition affects the SYSTABLESPACESTATS statistics.
Table 57. Changed SYSTABLESPACESTATS values during REORG
Column name        Settings for REORG SHRLEVEL     Settings for REORG SHRLEVEL
                   NONE after RELOAD phase         REFERENCE or CHANGE after
                                                   SWITCH phase
TOTALROWS          Number of rows or LOBs          For SHRLEVEL REFERENCE: number
                   loaded (note 1)                 of rows or LOBs loaded during the
                                                   RELOAD phase.
                                                   For SHRLEVEL CHANGE: number of
                                                   rows or LOBs loaded during the
                                                   RELOAD phase, plus the number of
                                                   rows inserted during the LOG APPLY
                                                   phase, minus the number of rows
                                                   deleted during the LOG phase.
DATASIZE           Actual value                    Actual value
NPAGES             Actual value                    Actual value
NACTIVE            Actual value                    Actual value
SPACE              Actual value                    Actual value
EXTENTS            Actual value                    Actual value
REORGLASTTIME      Current timestamp               Current timestamp
REORGINSERTS       0                               Actual value (note 2)
REORGDELETES       0                               Actual value (note 2)
REORGUPDATES       0                               Actual value (note 2)
REORGDISORGLOB     0                               Actual value (note 2)
REORGUNCLUSTINS    0                               Actual value (note 2)
REORGMASSDELETE    0                               Actual value (note 2)
REORGNEARINDREF    0                               Actual value (note 2)
REORGFARINDREF     0                               Actual value (note 2)
STATSLASTTIME      Current timestamp (note 3)      Current timestamp (note 3)
STATSINSERTS       0 (note 3)                      Actual value (note 2)
STATSDELETES       0 (note 3)                      Actual value (note 2)
STATSUPDATES       0 (note 3)                      Actual value (note 2)
STATSMASSDELETE    0 (note 3)                      Actual value (note 2)
COPYLASTTIME       Current timestamp (note 4)      Current timestamp
COPYUPDATEDPAGES   0 (note 4)                      Actual value (note 2)
COPYCHANGES        0 (note 4)                      Actual value (note 2)
COPYUPDATELRSN     Null (note 4)                   Actual value (note 5)
COPYUPDATETIME     Null (note 4)                   Actual value (note 5)
Notes:
1. Under certain conditions, such as a utility restart, the REORG utility might not
have an accurate count of loaded records. In those cases, DB2 sets this value to
null. Some rows that are loaded into a table space and are included in this
value might later be removed during the index validation phase or the
referential integrity check. DB2 includes counts of those removed records in the
statistics that record deleted records.
2. This is the actual number of inserts, updates, or deletes that are due to
applying the log to the shadow copy.
3. DB2 sets this value only if the REORG invocation includes the STATISTICS
option.
4. DB2 sets this value only if the REORG invocation includes the COPYDDN
option.
5. This is the LRSN or timestamp for the first update that is due to applying the
log to the shadow copy.
The table below shows how running REORG affects the SYSINDEXSPACESTATS
statistics for an index space or physical index partition.
Table 58. Changed SYSINDEXSPACESTATS values during REORG
Column name          Settings for REORG SHRLEVEL   Settings for REORG SHRLEVEL
                     NONE after RELOAD phase       REFERENCE or CHANGE after
                                                   SWITCH phase
TOTALENTRIES         Number of index entries       For SHRLEVEL REFERENCE: number
                     added (note 1)                of index entries added during the
                                                   BUILD phase.
                                                   For SHRLEVEL CHANGE: number of
                                                   index entries added during the BUILD
                                                   phase, plus the number of index
                                                   entries added during the LOG phase,
                                                   minus the number of index entries
                                                   deleted during the LOG phase.
NLEAF                Actual value                  Actual value
NLEVELS              Actual value                  Actual value
NACTIVE              Actual value                  Actual value
SPACE                Actual value                  Actual value
EXTENTS              Actual value                  Actual value
REORGLASTTIME        Current timestamp             Current timestamp
REORGINSERTS         0                             Actual value (note 2)
REORGDELETES         0                             Actual value (note 2)
REORGAPPENDINSERT    0                             Actual value (note 2)
REORGPSEUDODELETES   0                             Actual value (note 2)
REORGMASSDELETE      0                             Actual value (note 2)
REORGLEAFNEAR        0                             Actual value (note 2)
REORGLEAFFAR         0                             Actual value (note 2)
REORGNUMLEVELS       0                             Actual value (note 2)
STATSLASTTIME        Current timestamp (note 3)    Current timestamp (note 3)
STATSINSERTS         0 (note 3)                    Actual value (note 2)
STATSDELETES         0 (note 3)                    Actual value (note 2)
STATSMASSDELETE      0 (note 3)                    Actual value (note 2)
COPYLASTTIME         Current timestamp (note 4)    Unchanged (note 5)
COPYUPDATEDPAGES     0 (note 4)                    Unchanged (note 5)
COPYCHANGES          0 (note 4)                    Unchanged (note 5)
COPYUPDATELRSN       Null (note 4)                 Unchanged (note 5)
COPYUPDATETIME       Null (note 4)                 Unchanged (note 5)
Notes:
1. Under certain conditions, such as a utility restart, the REORG utility might not have an accurate count of loaded
records. In those cases, DB2 sets this value to null.
2. This is the actual number of inserts, updates, or deletes that are due to applying the log to the shadow copy.
3. DB2 sets this value only if the REORG invocation includes the STATISTICS option.
4. DB2 sets this value only if the REORG invocation includes the COPYDDN option.
5. Inline COPY is not allowed for SHRLEVEL CHANGE or SHRLEVEL REFERENCE.
For a logical index partition, DB2 does not reset the nonpartitioned index when it
does a REORG on a partition. Therefore, DB2 does not reset the statistics for the
index. The REORG counters and REORGLASTTIME are relative to the last time the
entire nonpartitioned index was reorganized. In addition, the REORG counters
might be low because some index entries are changed during the REORG of a
partition.
How REBUILD INDEX affects real-time statistics:
Rebuilding an index has certain effects on the statistics for the index involved.
The table below shows how running REBUILD INDEX affects the
SYSINDEXSPACESTATS statistics for an index space or physical index partition.
Table 59. Changed SYSINDEXSPACESTATS values during REBUILD INDEX
Column name          Settings after BUILD phase
TOTALENTRIES         Number of index entries added (note 1)
NLEAF                Actual value
NLEVELS              Actual value
NACTIVE              Actual value
SPACE                Actual value
EXTENTS              Actual value
REBUILDLASTTIME      Current timestamp
REORGINSERTS         0
REORGDELETES         0
REORGAPPENDINSERT    0
REORGPSEUDODELETES   0
REORGMASSDELETE      0
REORGLEAFNEAR        0
REORGLEAFFAR         0
REORGNUMLEVELS       0
Notes:
1. Under certain conditions, such as a utility restart, the REBUILD utility might
not have an accurate count of loaded records. In those cases, DB2 sets this
value to null.
For a logical index partition, DB2 does not collect TOTALENTRIES statistics for the
entire nonpartitioned index when it runs REBUILD INDEX. Therefore, DB2 does
not reset the statistics for the index. The REORG counters from the last REORG are
still correct. DB2 updates REBUILDLASTTIME when the entire nonpartitioned
index is rebuilt.
How RUNSTATS affects real-time statistics:
When the RUNSTATS job starts, DB2 externalizes all in-memory statistics to the
real-time statistics tables.
Only RUNSTATS UPDATE ALL affects the real-time statistics.
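For example, the following RUNSTATS control statement, with illustrative object
names, is one way to invoke the utility so that it updates the real-time statistics:
RUNSTATS TABLESPACE DSN8D91A.DSN8S91D TABLE(ALL) UPDATE ALL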
The table below shows how running RUNSTATS UPDATE ALL on a table space or
table space partition affects the SYSTABLESPACESTATS statistics.
Table 60. Changed SYSTABLESPACESTATS values during RUNSTATS UPDATE ALL
Column name          During UTILINIT phase        After RUNSTATS phase
STATSLASTTIME        Current timestamp (note 1)   Timestamp of the start of the
                                                  RUNSTATS phase
STATSINSERTS         Actual value (note 1)        Actual value (note 2)
STATSDELETES         Actual value (note 1)        Actual value (note 2)
STATSUPDATES         Actual value (note 1)        Actual value (note 2)
STATSMASSDELETE      Actual value (note 1)        Actual value (note 2)
Notes:
1. DB2 externalizes the current in-memory values.
2. This value is 0 for SHRLEVEL REFERENCE, or the actual value for SHRLEVEL
CHANGE.
The table below shows how running RUNSTATS UPDATE ALL on an index affects
the SYSINDEXSPACESTATS statistics.
Table 61. Changed SYSINDEXSPACESTATS values during RUNSTATS UPDATE ALL
Column name          During UTILINIT phase        After RUNSTATS phase
STATSLASTTIME        Current timestamp (note 1)   Timestamp of the start of the
                                                  RUNSTATS phase
STATSINSERTS         Actual value (note 1)        Actual value (note 2)
STATSDELETES         Actual value (note 1)        Actual value (note 2)
STATSMASSDELETE      Actual value (note 1)        Actual value (note 2)
Notes:
1. DB2 externalizes the current in-memory values.
2. This value is 0 for SHRLEVEL REFERENCE, or the actual value for SHRLEVEL
CHANGE.
How COPY affects real-time statistics:
When a COPY job starts, DB2 externalizes all in-memory statistics to the real-time
statistics tables. Statistics are gathered for a full image copy or an incremental copy,
but not for a data set copy.
The table below shows how running COPY on a table space or table space
partition affects the SYSTABLESPACESTATS statistics.
Table 62. Changed SYSTABLESPACESTATS values during COPY
Column name          During UTILINIT phase        After COPY phase
COPYLASTTIME         Current timestamp (note 1)   Timestamp of the start of the
                                                  COPY phase
COPYUPDATEDPAGES     Actual value (note 1)        Actual value (note 2)
COPYCHANGES          Actual value (note 1)        Actual value (note 2)
COPYUPDATELRSN       Actual value (note 1)        Actual value (note 3)
COPYUPDATETIME       Actual value (note 1)        Actual value (note 3)
Notes:
1. DB2 externalizes the current in-memory values.
2. This value is 0 for SHRLEVEL REFERENCE, or the actual value for SHRLEVEL
CHANGE.
3. This value is null for SHRLEVEL REFERENCE, or the actual value for
SHRLEVEL CHANGE.
The table below shows how running COPY on an index affects the
SYSINDEXSPACESTATS statistics.
Table 63. Changed SYSINDEXSPACESTATS values during COPY
Column name          During UTILINIT phase        After COPY phase
COPYLASTTIME         Current timestamp (note 1)   Timestamp of the start of the
                                                  COPY phase
COPYUPDATEDPAGES     Actual value (note 1)        Actual value (note 2)
COPYCHANGES          Actual value (note 1)        Actual value (note 2)
COPYUPDATELRSN       Actual value (note 1)        Actual value (note 3)
COPYUPDATETIME       Actual value (note 1)        Actual value (note 3)
Notes:
1. DB2 externalizes the current in-memory values.
2. This value is 0 for SHRLEVEL REFERENCE, or the actual value for SHRLEVEL
CHANGE.
3. This value is null for SHRLEVEL REFERENCE, or the actual value for
SHRLEVEL CHANGE.
How RECOVER affects real-time statistics:
After recovery to the current state, the in-memory counter fields are still valid, so
DB2 does not modify them. However, after a point-in-time recovery, the statistics
might not be valid.
Consequently, DB2 sets all the REORG, STATS, and COPY counter statistics to null
after a point-in-time recovery. After recovery to the current state, DB2 sets
NACTIVE, SPACE, and EXTENTS to their new values. After a point-in-time
recovery, DB2 sets NLEVELS, NACTIVE, SPACE, and EXTENTS to their new
values.
How non-DB2 utilities affect real-time statistics
Non-DB2 utilities do not affect real-time statistics. Therefore, an object that is the
target of a non-DB2 COPY, LOAD, REBUILD, REORG, or RUNSTATS job can cause
incorrect statistics to be inserted in the real-time statistics tables. Follow this
process to ensure correct statistics when you run non-DB2 utilities:
1. Stop the table space or index on which you plan to run the utility. This action
causes DB2 to write the in-memory statistics to the real-time statistics tables
and initialize the in-memory counters. If DB2 cannot externalize the statistics,
the STOP command does not fail.
2. Run the utility.
3. When the utility completes, update the statistics tables with new totals and
timestamps, and put zero values in the incremental counters.
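For example, step 3 for a table space might look like the following sketch. The
object names and the new total are illustrative, and the columns that you update
depend on the utility that you ran:
UPDATE SYSIBM.SYSTABLESPACESTATS
  SET TOTALROWS = 1000000,
      LOADRLASTTIME = CURRENT TIMESTAMP,
      REORGINSERTS = 0,
      REORGDELETES = 0,
      REORGUPDATES = 0
  WHERE DBNAME = ’DSN8D91A’ AND NAME = ’DSN8S91D’;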
Real-time statistics on objects in work file databases
Although you cannot run utilities on objects in the work file databases, DB2
records the NACTIVE, SPACE, and EXTENTS statistics on table spaces in those
databases.
Real-time statistics for DEFINE NO objects
For objects that are created with DEFINE NO, no row is inserted into the real-time
statistics table until the object is physically defined.
Real-time statistics on read-only or nonmodified objects
DB2 does not externalize the NACTIVE, SPACE, or EXTENTS statistics for
read-only objects or objects that are not modified.
How creating objects affects real-time statistics
When you create a table space or index, DB2 adds statistics to the real-time
statistics tables.
A row is inserted into the real-time statistics tables when a table space or index is
created. The timestamp for when the object is created is stored in the
REORGLASTTIME field. The values for all fields that contain incremental statistics
in the row are set to 0, and real-time statistics for the object are maintained from
that time forward.
How dropping objects affects real-time statistics
If you drop a table space or index, DB2 deletes its statistics from the real-time
statistics tables.
However, if the real-time statistics database is not available when you drop a table
space or index, the statistics remain in the real-time statistics tables, even though
the corresponding object no longer exists. You need to use SQL DELETE statements
to manually remove those rows from the real-time statistics tables.
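For example, the following DELETE statements, with illustrative object names,
remove the leftover rows for a dropped table space and a dropped index space:
DELETE FROM SYSIBM.SYSTABLESPACESTATS
  WHERE DBNAME = ’DSN8D91A’ AND NAME = ’DSN8S91D’;
DELETE FROM SYSIBM.SYSINDEXSPACESTATS
  WHERE DBNAME = ’DSN8D91A’ AND NAME = ’XDSN8S91’;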
If a row still exists in the real-time statistics tables for a dropped table space or
index, and if you create a new object with the same DBID and PSID as the
dropped object, DB2 reinitializes the row before it updates any values in that row.
How SQL operations affect real-time statistics counters
SQL operations affect the counter columns in the real-time statistics tables. These
are the columns that record the number of insert, delete, or update operations, as
well as the total counters, TOTALROWS, and TOTALENTRIES.
UPDATE
When you issue an UPDATE statement, DB2 increments the update
counters.
INSERT
When you issue an INSERT statement, DB2 increments the insert counters.
DB2 keeps separate counters for clustered and unclustered INSERTs.
DELETE
When you issue a DELETE statement, DB2 increments the delete counters.
ROLLBACK
When you issue a ROLLBACK statement, or when an implicit rollback
occurs, DB2 increments the counters, depending on the type of SQL
operation that is rolled back:

Rolled-back SQL statement     Incremented counters
UPDATE                        Update counters
INSERT                        Delete counters
DELETE                        Insert counters
Notice that for INSERT and DELETE, the counter for the inverse operation
is incremented. For example, if two INSERT statements are rolled back, the
delete counter is incremented by 2.
UPDATE of partitioning keys
If an update to a partitioning key causes rows to move to a new partition,
the following real-time statistics are affected:

Action                          Incremented counters
When UPDATE is executed         Update count of old partition = +1
                                Insert count of new partition = +1
When UPDATE is committed        Delete count of old partition = +1
When UPDATE is rolled back      Update count of old partition = +1
                                (compensation log record)
                                Delete count of new partition = +1
                                (remove inserted record)

If an update to a partitioning key does not cause rows to move to a new
partition, the counts are accumulated as expected:

Action                          Incremented counters
When UPDATE is executed         Update count of current partition = +1
                                NEAR/FAR indirect reference count = +1
                                (if overflow occurred)
When UPDATE is rolled back      Update count of current partition = +1
                                (compensation log record)
Mass DELETE
Performing a mass delete operation on a table space does not cause DB2 to
reset the counter columns in the real-time statistics tables. After a mass
delete operation, the value in a counter column includes the count from a
time prior to the mass delete operation, as well as the count after the mass
delete operation.
Related concepts
How DB2 rolls back work (DB2 Administration Guide)
Related reference
DELETE (DB2 SQL)
INSERT (DB2 SQL)
ROLLBACK (DB2 SQL)
UPDATE (DB2 SQL)
Real-time statistics in data sharing
In a data sharing environment, DB2 members update their statistics serially.
Each member reads the target row from the statistics table, obtains a lock,
aggregates its in-memory statistics, and updates the statistics table with the new
totals. Each member sets its own interval for writing real-time statistics.
DB2 does locking based on the lock size of the DSNRTSDB.DSNRTSTS table space.
DB2 uses cursor stability isolation and CURRENTDATA(YES) when it reads the
statistics tables.
At the beginning of a RUNSTATS job, all data sharing members externalize their
statistics to the real-time statistics tables and reset their in-memory statistics. If all
members cannot externalize their statistics, DB2 sets STATSLASTTIME to null. An
error in gathering and externalizing statistics does not prevent RUNSTATS from
running.
At the beginning of a COPY job, all data sharing members externalize their
statistics to the real-time statistics tables and reset their in-memory statistics. If all
members cannot externalize their statistics, DB2 sets COPYLASTTIME to null. An
error in gathering and externalizing statistics does not prevent COPY from
running.
Utilities that reset page sets to empty can invalidate the in-memory statistics of
other DB2 members. The member that resets a page set notifies the other DB2
members that a page set has been reset to empty, and the in-memory statistics are
invalidated. If the notify process fails, the utility that resets the page set does not
fail. DB2 sets the appropriate timestamp (REORGLASTTIME, STATSLASTTIME, or
COPYLASTTIME) to null in the row for the empty page set to indicate that the
statistics for that page set are unknown.
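For example, a query along the following lines (a sketch; the timestamp
columns are those of SYSIBM.SYSTABLESPACESTATS) can identify page sets whose
statistics are in this unknown state:

SELECT DBNAME, NAME, PARTITION,
       REORGLASTTIME, STATSLASTTIME, COPYLASTTIME
FROM SYSIBM.SYSTABLESPACESTATS
WHERE REORGLASTTIME IS NULL
   OR STATSLASTTIME IS NULL
   OR COPYLASTTIME IS NULL;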
How the EXCHANGE command affects real-time statistics

When the EXCHANGE command is used for clone tables, real-time statistics are
affected.

The values of the INSTANCE columns in the SYSTABLESPACESTATS and
SYSINDEXSPACESTATS tables identify the VSAM data set that is associated with
the real-time statistics row. For a cloned object, the real-time statistics
table might contain two rows, one for each instance of the object.

Utility operations and SQL operations can be run separately on each instance
of a cloned object. Therefore, each instance of an object can be used to
monitor activity, and allows recommendations on whether the base object, the
clone, or both objects require a REORG, RUNSTATS, or COPY operation.

How real-time statistics affect sort work data set allocation for DB2 utilities

You can specify whether DB2 uses real-time statistics data to estimate the
size of sort work data sets to allocate for certain utilities.

When the value of the UTSORTAL subsystem parameter is set to YES, and
real-time statistics data is available, DB2 uses real-time statistics to
determine data set sizes for dynamically allocated sort work data sets for the
following utilities:
v CHECK INDEX
v REBUILD INDEX
v REORG TABLESPACE
v RUNSTATS with COLGROUP
Improving concurrency with real-time statistics
You can specify certain options to prevent timeouts and deadlocks when you work
with data in real-time statistics tables.
Procedure
To reduce the risk of timeouts and deadlocks:
v When you run COPY, RUNSTATS, or REORG on the real-time statistics objects,
use SHRLEVEL CHANGE.
v When you execute SQL statements to query the real-time statistics tables, use
uncommitted read isolation.
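For example, a monitoring query might read the statistics with uncommitted
read isolation on a statement basis (a sketch; the object names are
hypothetical):

SELECT TOTALROWS
FROM SYSIBM.SYSTABLESPACESTATS
WHERE DBNAME = 'MYDB'
  AND NAME = 'MYTS'
WITH UR;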
Recovering the real-time statistics tables
When you recover a DB2 subsystem after a disaster, DB2 starts with the
ACCESS(MAINT) option. No statistics are externalized in this state.
Procedure
Consequently, you need to perform the following actions on the real-time statistics
database:
v Recover the real-time statistics objects after you recover the DB2 catalog and
directory.
v Start the real-time statistics database explicitly, after DB2 restart.
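For example, a command of the following general form (a sketch) starts the
real-time statistics database in read-write mode after a restart:

-START DATABASE(DSNRTSDB) SPACENAM(*) ACCESS(RW)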
Statistics accuracy
In general, the real-time statistics values are very accurate.
However, several factors can affect the accuracy of the statistics:
v Certain utility restart scenarios
v Certain utility operations that leave indexes in a restrictive database
state, such as RECOVER-pending (RECP)
v A DB2 subsystem failure
v A notify failure in a data sharing environment
Always consider the restrictive database state of objects before accepting a
utility recommendation that is based on real-time statistics.
If you think that some statistics values might be inaccurate, you can correct the
statistics by running REORG, RUNSTATS, or COPY on the objects for which DB2
generated the statistics.
Part 3. Programming DB2 applications for performance
You can achieve better performance from DB2 by considering performance as you
program and deploy your applications.
Chapter 12. Tuning your queries
You might be able to rewrite certain queries to enable DB2 to choose more efficient
access paths.
Before rewriting queries, you should consider running the REORG utility on your
tables, and make sure that you have correct and up-to-date catalog statistics. If you
still have performance problems after you have tried the suggestions in this
information, you can use other more risky techniques.
Coding SQL statements as simply as possible
Simpler SQL statements require less processing and generally perform better
than more complex statements.
About this task
By following certain general guidelines when you write SQL statements you can
keep them simple, limit the amount of processing that is required to execute the
statement, and ensure better performance from the statements.
Procedure
To get the best performance from your SQL statements:
v Avoid selecting unused columns.
v Do not include unnecessary ORDER BY or GROUP BY clauses.
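For example, if an application needs only employee numbers and last names for
one department, a narrow statement such as the following one (a sketch against
the DSN8910.EMP sample table) avoids returning and sorting unused columns:

SELECT EMPNO, LASTNAME
FROM DSN8910.EMP
WHERE WORKDEPT = 'D11';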
Coding queries with aggregate functions efficiently
If your query involves aggregate functions, you can take measures to increase
the chances that they are evaluated when the data is retrieved, rather than
afterward. Doing so can improve the performance of the query.
About this task
PSPI
In general, an aggregate function performs best when it is evaluated during
data access, and next best when it is evaluated during a DB2 sort. The least
preferable option is to have an aggregate function evaluated after the data
has been retrieved. You can use EXPLAIN to determine when DB2 evaluates the
aggregate functions.
Queries that involve the functions MAX or MIN might be able to take advantage
of one-fetch access.
Procedure
To ensure that an aggregate function is evaluated when DB2 retrieves the data:
Code the query so that every aggregate function that it contains meets the
following criteria:
v No sort is needed for GROUP BY. Check this in the EXPLAIN output.
v No stage 2 (residual) predicates exist. Check this in your application.
v No distinct set functions exist, such as COUNT(DISTINCT C1).
v If the query is a join, all set functions must be on the last table joined. Check
this by looking at the EXPLAIN output.
v All aggregate functions must be on single columns with no arithmetic
expressions.
v The aggregate function is not one of the following aggregate functions:
– STDDEV
– STDDEV_SAMP
– VAR
– VAR_SAMP
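For example, the following query (a sketch against the DSN8910.EMP sample
table) meets these criteria, so DB2 can evaluate MAX during data access,
particularly if an index exists on (WORKDEPT, SALARY):

SELECT WORKDEPT, MAX(SALARY)
FROM DSN8910.EMP
GROUP BY WORKDEPT;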
PSPI
Using non-column expressions efficiently
DB2 can evaluate certain predicates at an earlier stage of processing called stage 1,
so that the query that contains the predicate takes less time to run. When a
predicate contains column and non-column expressions on the same side of the
operator, DB2 must evaluate the predicate at a later stage.
Procedure
PSPI
To enable stage 1 processing of queries that contain non-column
expressions:
Write each predicate so that all non-column expressions appear on the opposite
side of the operator from any column expressions.
Example
The following predicate combines a column, SALARY, with values that are not
from columns on one side of the operator:
WHERE SALARY + (:hv1 * SALARY) > 50000
If you rewrite the predicate in the following way, DB2 can evaluate it more
efficiently:
WHERE SALARY > 50000/(1 + :hv1)
In the second form, the column is by itself on one side of the operator, and all the
other values are on the other side of the operator. The expression on the right is
called a non-column expression.
PSPI
Materialized query tables and query performance
One way to improve the performance of dynamic queries that operate on very
large amounts of data is to generate the results of all or parts of the queries in
advance, and store the results in materialized query tables.
PSPI
Materialized query tables are user-created tables. Depending on how the
tables are defined, they are user-maintained or system-maintained. If
subsystem parameters are set, or an application sets special registers, to
tell DB2 to use materialized query tables, then when DB2 executes a dynamic
query, DB2 uses the contents of applicable materialized query tables if it
finds a performance advantage in doing so.
PSPI
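For example, a system-maintained materialized query table that pre-aggregates
a hypothetical TRANS table might be defined as follows (a sketch; whether DB2
actually rewrites a dynamic query to use it depends on the subsystem
parameters and special registers mentioned above):

CREATE TABLE TRANSCNT (ACCTID, LOCID, YEAR, CNT) AS
  (SELECT ACCTID, LOCID, YEAR, COUNT(*)
   FROM TRANS
   GROUP BY ACCTID, LOCID, YEAR)
  DATA INITIALLY DEFERRED
  REFRESH DEFERRED
  MAINTAINED BY SYSTEM
  ENABLE QUERY OPTIMIZATION;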
Encrypted data and query performance
Encryption and decryption can degrade the performance of some queries.
However, you can lessen the performance impact of encryption and decryption by
writing your queries carefully and designing your database with encrypted data in
mind.
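For example, one way to keep such a predicate cheap is to encrypt the
comparison value once instead of decrypting the column for every row (a
sketch, assuming a hypothetical EMP table whose SSN column was stored with
ENCRYPT_TDES under the same encryption password):

SET ENCRYPTION PASSWORD = :hv1;

SELECT LASTNAME
FROM EMP
WHERE SSN = ENCRYPT_TDES('123-45-6789');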
XML data and query performance
XML data is more expensive to process than other data types and can affect
the performance of SQL statements that reference it. Each row of an XML
column contains an XML document, which requires more processing and more
space in DB2.
PSPI
When you use an XPath expression to search or extract XML data, you can
lessen the performance impact by avoiding the descendant or
descendant-or-self axis in the expression. The XMLEXISTS predicate is always
stage 2. However, you can use an XML index to reduce the number of rows, that
is, the number of XML documents, that are to be searched at the second stage.
For information about how to design your XML index so that the XPath
predicates in the XMLEXISTS predicate can be used as the matching predicate
in a matching index scan, see Matching index scan (MATCHCOLS > 0).
Recommendation: Creating and maintaining an XML index is more costly than
creating and maintaining a non-XML index. If possible, write your query to
use non-XML indexes to filter as many rows as possible before the second
stage. PSPI
Best practices for XML performance in DB2

By observing certain best practices you can help to improve the performance
of XML data that is stored in DB2 for z/OS.
Choose the granularity of XML documents carefully

When you design your XML application, and your XML document structure in
particular, you might have a choice about which business data is kept
together in a single XML document.

For example, the department table and sample data that are shown in the
following statement and figure use one XML document per department:

create table dept(unitID char(8), deptdoc xml)
unitID   deptdoc

WWPR     <dept deptID="PR27">
           <employee id="901">
             <name>Jim Qu</name>
             <phone>408 555 1212</phone>
           </employee>
           <employee id="902">
             <name>Peter Pan</name>
             <office>216</office>
           </employee>
         </dept>

WWPR     <dept deptID="V15">
           <employee id="673">
             <name>Matt Foreman</name>
             <phone>416 891 7301</phone>
             <office>216</office>
           </employee>
           <description>This dept supports sales world wide</description>
         </dept>

S-USE    ...

Figure 28. Sample data for the dept table
This intermediate granularity is a reasonable choice if a department is the
predominant granularity at which your application accesses and processes the
data. Alternatively, you might decide to combine multiple or many departments
into a single XML document, such as all departments that belong to one unit.
This coarse granularity, however, is sub-optimal if your application
typically processes only one department at a time.

You might also choose one XML document per employee, with an additional
"dept" attribute on each employee to indicate which department he or she
belongs to. This fine granularity would be a very good choice if employees
are the business objects of interest, which are often accessed and processed
independently from the other employees in the same department. However, if
the application typically processes all employees in one department together,
one XML document per department might be the better choice.
Use attributes and elements appropriately in XML

A common question related to XML document design is when to use attributes
instead of elements, and how that choice affects performance.

This question is much more relevant to data modeling than to performance.
However, as a general rule, XML elements are more flexible than attributes
because they can be repeated and nested.

For example, the department documents shown in the preceding example use an
element "phone", which allows multiple occurrences of "phone" for an employee
who has multiple numbers. This design is also extensible in case you later
need to break phone numbers into fragments, such as child elements for
country code, area code, extension, and so on.
By contrast, if "phone" is an attribute of the employee element instead, it
can exist only once per employee, and you could not add child elements. Such
limitations might hinder future schema evolution.

Although you can probably model all of your data without using attributes,
they can be a very intuitive choice for data items that are known not to
repeat for a single element and not to have any sub-fields. Attributes can
reduce the size of XML data slightly because they have only a single
name-value pair, as opposed to elements, which have both a start tag and an
end tag.

In DB2, you can use attributes in queries, predicates, and index definitions
just as easily as elements. Because attributes are less extensible than
elements, DB2 can apply certain storage and access optimizations. However,
these advantages should be considered an extra performance bonus rather than
an incentive to convert elements to attributes for the sake of performance,
especially when data modeling considerations call for elements.
Be aware of XML schema validation overhead

XML schema validation is an optional activity during XML parsing. Performance
studies have shown that XML parsing in general is significantly more
CPU-intensive if schema validation is enabled.

This overhead can vary drastically depending on the structure and size of
your XML documents, and particularly on the size and complexity of the XML
schema that is used. For example, you might find 50% higher CPU consumption
because of schema validation with moderately complex schemas. Unless your XML
inserts are heavily I/O bound, the increased CPU consumption typically
translates to reduced insert throughput.

An XML schema defines the structure, elements and attributes, data types, and
value ranges that are allowed in a set of XML documents. DB2 allows you to
validate XML documents against XML schemas. If you choose to validate
documents, you typically do so at insert time. Validation ensures that data
that is inserted into the database is compliant with the schema definition,
meaning that you prevent "junk data" from entering your tables.
Determine whether your application needs the stricter type checking for XML
queries and XML schema compliance. For example, if you are using an
application server that receives, validates, and processes XML documents
before they are stored in the database, the documents probably do not need to
be validated again in DB2. At that point you already know that they are
valid. Likewise, if the database receives XML documents from a trusted
application, maybe even one that you control, and you know that the XML data
is always valid, avoid schema validation for the benefit of higher insert
performance. If, however, your DB2 database receives XML data from untrusted
sources and you need to ensure schema compliance at the DB2 level, then you
need to spend some extra CPU cycles on that.
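As an illustration only, validation at insert time typically wraps the input
in a validation function inside XMLPARSE. The function qualifier, its exact
signature, and the registered schema name SYSXSR.DEPTXSD below are
assumptions; check the exact form for your release:

INSERT INTO dept (unitID, deptdoc)
  VALUES ('WWPR',
          XMLPARSE(DOCUMENT DSN_XMLVALIDATE(:dochv, 'SYSXSR.DEPTXSD')));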
Specify full paths in XPath expressions when possible

When you know where in the structure of an XML document the desired element
is located, it is best to provide that information in the form of a fully
specified path, to avoid unneeded overhead.

Consider the table that is created by the following SQL statement:

CREATE TABLE customer(info XML);
The following figure shows sample data in the info column.

<customerinfo Cid="1004">
  <name>Matt Foreman</name>
  <addr country="Canada">
    <street>1596 Baseline</street>
    <city>Toronto</city>
    <state>Ontario</state>
    <pcode>M3Z-5H9</pcode>
  </addr>
  <phone type="work">905-555-4789</phone>
  <phone type="home">416-555-3376</phone>
</customerinfo>

Figure 29. Sample data in a customerinfo XML document
If you want to retrieve customer phone numbers or the cities where customers
live, you can choose from several possible path expressions to get that data.

Both /customerinfo/phone and //phone would get you the phone numbers.
Likewise, /customerinfo/addr/city and /customerinfo/*/city both return the
city. For best performance, the fully specified path is preferred over using
either * or // because the fully specified path enables DB2 to navigate
directly to the desired elements, skipping over non-relevant parts of the
document.

In other words, if you know where in the document the desired element is
located, it is best to provide that information in the form of a fully
specified path to avoid unneeded overhead. If you ask for //phone instead of
/customerinfo/phone, you ask for phone elements anywhere in the document.
This requires DB2 to navigate down into the "addr" subtree of the document to
look for phone elements at any level of the document.
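For example, against the customer table above, the first of the following
queries gives DB2 a direct navigation path, while the second forces DB2 to
search every level of each document (a sketch; both return the same phone
elements for these sample documents):

SELECT XMLQUERY('$i/customerinfo/phone' passing info as "i")
FROM customer;

SELECT XMLQUERY('$i//phone' passing info as "i")
FROM customer;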
Using * and // can also lead to undesired or unexpected query results. For
example, suppose that some of the "customerinfo" documents also contain
"assistant" information, as shown in the following figure. The path //phone
would return the customer phone numbers and the assistant phone numbers,
without distinguishing them. From the query result, you might mistakenly
process the assistant's phone as a customer phone number.
<customerinfo Cid="1004">
<name>Matt Foreman</name>
<addr country="Canada">
<street>1596 Baseline</street>
<city>Toronto</city>
<state/>Ontario
<pcode>M3Z-5H9</pcode>
</addr>
<phone type="work">905-555-4789</phone>
<phone type="home">416-555-3376</phone>
<assistant>
<name>Peter Smith</name>
<phone type="home">416-555-3426</phone>
</assistant>
</customerinfo>
|
|
| Figure 30. Sample data with phone and name elements at multiple levels
|
242
Performance Monitoring and Tuning Guide
Define lean indexes for XML data to avoid extra overhead

Assume that queries often search for "customerinfo" XML documents by customer
name. An index on the customer name element, as shown in the following
statements, can greatly improve the performance of such queries:

CREATE TABLE customer(info XML);
CREATE INDEX custname1 ON customer(info)
GENERATE KEY USING XMLPATTERN '/customerinfo/name' as sql varchar(20);
CREATE INDEX custname2 ON customer(info)
GENERATE KEY USING XMLPATTERN '//name' as sql varchar(20);

Both of the indexes defined above are eligible to evaluate the XMLEXISTS
predicate on the customer name in the following query:

SELECT * FROM customer
WHERE XMLEXISTS('$i/customerinfo[name = "Matt Foreman"]' passing info as "i");

However, the custname2 index might be substantially larger than the custname1
index because it contains index entries not only for customer names but also
for assistant names. This is because the XML pattern //name matches name
elements anywhere in the document. However, if you never search by assistant
name, you don't need the assistant names indexed.
For read operations, the custname1 index is smaller and therefore potentially
better performing. For insert, update, and delete operations, the custname1
index incurs index maintenance overhead only for customer names, while the
custname2 index requires index maintenance for customer and assistant names.
You certainly don't want to pay that extra price if you require maximum
insert, update, and delete performance and you don't need indexed access
based on assistant names.
Use XMLEXISTS for predicates that filter at the document level

Consider the following table and sample data:

CREATE TABLE customer(info XML);

<customerinfo>
  <name>Matt Foreman</name>
  <phone>905-555-4789</phone>
</customerinfo>

<customerinfo>
  <name>Peter Jones</name>
  <phone>905-123-9065</phone>
</customerinfo>

<customerinfo>
  <name>Mary Clark</name>
  <phone>905-890-0763</phone>
</customerinfo>

Figure 31. Sample data in the customer table
Assume, for example, that you want to return the names of customers who have
the phone number "905-555-4789". You might be tempted to write the following
query:

SELECT XMLQUERY('$i/customerinfo[phone = "905-555-4789"]/name' passing info as "i")
FROM customer;
However, this query is not what you want for several reasons:
v It returns the following result set, which has as many rows as there are
rows in the table. This is because the SQL statement has no WHERE clause and
therefore cannot eliminate any rows. The result is shown in the following
figure.

<name>Matt Foreman</name>

3 record(s) selected

Figure 32. Result for the preceding example query
v For each row in the table that doesn't match the predicate, a row that
contains an empty XML sequence is returned. This is because the XQuery
expression in the XMLQUERY function is applied to one row, or document, at a
time and never removes a row from the result set; it only modifies the row's
value. The value that is produced by that XQuery expression is either the
customer's name element, if the predicate is true, or the empty sequence
otherwise. These empty rows are semantically correct, according to the
SQL/XML standard, and must be returned if the query is written as shown.
v The performance of the query is poor. First, an index that might exist on
/customerinfo/phone cannot be used because this query is not allowed to
eliminate any rows. Second, returning many empty rows makes this query
needlessly slow.
To resolve the performance issues and get the desired output, you should use
the XMLQUERY function in the SELECT clause only to extract the customer
names, and move the search condition, which should eliminate rows, into an
XMLEXISTS predicate in the WHERE clause. Doing so allows index usage and row
filtering, and avoids the overhead of empty result rows. You could write the
query as follows:

SELECT XMLQUERY('$i/customerinfo/name' passing info as "i")
FROM customer
WHERE XMLEXISTS('$i/customerinfo[phone = "905-555-4789"]' passing info as "i")
Use square brackets to avoid Boolean predicates in XMLEXISTS

A common error is to write the previous query without the square brackets in
the XMLEXISTS function, as shown in the following query:

SELECT XMLQUERY('$i/customerinfo/name' passing info as "i")
FROM customer
WHERE XMLEXISTS('$i/customerinfo/phone = "905-555-4789"' passing info as "i")

Writing the query this way produces the results that are shown in the
following figure:

<name>Matt Foreman</name>
<name>Peter Jones</name>
<name>Mary Clark</name>

3 record(s) selected

Figure 33. Sample results for the preceding example query
The expression in the XMLEXISTS predicate is written such that XMLEXISTS
always evaluates to true, so no rows are eliminated. For a given row, the
XMLEXISTS predicate evaluates to false only if the XQuery expression inside
it returns the empty sequence. However, without the square brackets, the
XQuery expression is a Boolean expression that always returns a Boolean value
and never the empty sequence. Note that XMLEXISTS truly checks for the
existence of a value and evaluates to true if a value exists, even if that
value happens to be the Boolean value "false". This behavior is correct
according to the SQL/XML standard, although it is probably not what you
intended to express.
The impact is again that an index on phone cannot be used because no rows are
eliminated, and you receive many more rows than you actually want. Also,
beware not to make this same mistake when you use two or more predicates, as
shown in the following query:

SELECT XMLQUERY('$i/customerinfo/name' passing info as "i")
FROM customer
WHERE XMLEXISTS('$i/customerinfo[phone = "905-555-4789"] and
                 $i/customerinfo[name = "Matt Foreman"]'
                passing info as "i")

The XQuery expression is still a Boolean expression because it has the form
"exp1 and exp2." You would write the query as follows to filter rows and
allow for index usage:

SELECT XMLQUERY('$i/customerinfo/name' passing info as "i")
FROM customer
WHERE XMLEXISTS('$i/customerinfo[phone = "905-555-4789" and name = "Matt Foreman"]'
                passing info as "i")
Use RUNSTATS to collect statistics for XML data and indexes

The RUNSTATS utility has been extended to collect statistics on XML tables
and indexes, and the DB2 optimizer uses these statistics to generate
efficient execution plans for XML queries. Consequently, continue to use
RUNSTATS on XML tables and indexes as you would for relational data. To
obtain the XML table statistics, you need to specify XML table space names
explicitly, or use LISTDEF with the ALL or XML keywords to include XML
objects.
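For example, a utility job might use a LISTDEF so that base and XML objects
are processed together (a sketch; MYDB and the list name are hypothetical,
and the exact LISTDEF keywords should be verified against the utility
documentation):

LISTDEF XMLLIST INCLUDE TABLESPACES DATABASE MYDB ALL

RUNSTATS TABLESPACE LIST XMLLIST
  TABLE(ALL) INDEX(ALL)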
Use SQL/XML publishing views to expose relational data as XML

You can include relational columns in a SQL/XML publishing view, and when
querying the view, express any predicates on those columns rather than on the
constructed XML.

SQL/XML publishing functions allow you to convert relational data into XML
format. Hiding the SQL/XML publishing functions in a view definition can be
beneficial. Applications or other queries can simply select the constructed
XML documents from the view, instead of dealing with the publishing functions
themselves. The following statements create a view that contains hidden
SQL/XML publishing functions:

CREATE TABLE unit( unitID char(8), name char(20), manager varchar(20));

CREATE VIEW UnitView(unitID, name, unitdoc) as
SELECT unitID, name,
       XMLELEMENT(NAME "Unit",
                  XMLELEMENT(NAME "ID", u.unitID),
                  XMLELEMENT(NAME "UnitName", u.name),
                  XMLELEMENT(NAME "Mgr", u.manager)
       )
FROM unit u;
Note that the view definition includes relational columns. This does not
create any physical redundancy because it is only a view, not a materialized
view. Exposing the relational columns helps to query this view efficiently.

The following query uses a relational predicate to ensure that only the XML
document for "WWPR" is constructed, resulting in a shorter runtime,
especially on a large data set:

SELECT unitdoc
FROM UnitView
WHERE unitID = 'WWPR';
Use XMLTABLE views to expose XML data in relational format

You might also want to use a view to expose XML data in relational format.
Similar caution needs to be applied as before, but in the reverse way. In the
following example, the SQL/XML function XMLTABLE returns values from XML
documents in tabular format:

CREATE TABLE customer(info XML);

CREATE VIEW myview(CustomerID, Name, Zip, Info) AS
SELECT T.*, info
FROM customer, XMLTABLE ('$c/customerinfo' passing info as "c"
     COLUMNS
       "CID"  INTEGER     PATH './@Cid',
       "Name" VARCHAR(30) PATH './name',
       "Zip"  CHAR(12)    PATH './addr/pcode' ) as T;

The view definition includes the XML column info to help query the view
efficiently. Assume that you want to retrieve a tabular list of customer IDs
and names for a given ZIP code. Both of the following queries can do that,
but the second one tends to perform better than the first.

In the first query, the filtering predicate is expressed on the CHAR column
"Zip" that is generated by the XMLTABLE function:

SELECT CustomerID, Name
FROM myview
WHERE Zip = '95141';

However, not all relational predicates can be applied to the underlying XML
column or its indexes. Consequently, the query requires the view to generate
rows for all customers and then picks out the one for ZIP code "95141".

The second query uses an XML predicate to ensure that only the rows for
"95141" get generated, resulting in a shorter runtime, especially on a large
data set:

SELECT CustomerID, Name
FROM myview
WHERE xmlexists('$i/customerinfo[addr/pcode = "95141"]' passing info as "i");
Use SQL and XML statements with parameter markers for short queries and OLTP
applications

The SQL/XML functions XMLQUERY, XMLTABLE, and XMLEXISTS support external
parameters.

Very short database queries often execute so fast that the time to compile
and optimize them is a substantial portion of their total response time.
Consequently, you might want to compile, or "prepare," them just once and
only pass predicate literal values for each execution. This technique is
recommended for applications with short and repetitive queries. The following
query shows how you can use parameter markers to achieve the result of the
preceding example:

SELECT info
FROM customer
WHERE xmlexists('$i/customerinfo[phone = $p]'
                passing info as "i", cast(? as varchar(12)) as "p")
Avoid code page conversion during XML insert and retrieval

XML is different from other types of data in DB2 because it can be internally
or externally encoded. Internally encoded means that the encoding of your XML
data can be derived from the data itself. Externally encoded means that the
encoding is derived from external information.

The data type of the application variables that you use to exchange XML data
with DB2 determines how the encoding is derived. If your application uses
character type variables for XML, the XML data is externally encoded. If you
use binary application data types, the XML data is considered internally
encoded.

Internally encoded means that the encoding is determined by either a Unicode
byte order mark (BOM) or an encoding declaration in the XML document itself,
such as: <?xml version="1.0" encoding="UTF-8" ?>
From a performance point of view, the goal is to avoid code page conversions
as much as possible because they consume extra CPU cycles. Internally encoded
XML data is preferred over externally encoded data because it can prevent
unnecessary code page conversion.

This means that in your application you should prefer binary data types over
character types. For example, in CLI, when you use SQLBindParameter() to bind
parameter markers to input data buffers, you should use SQL_C_BINARY data
buffers rather than SQL_C_CHAR, SQL_C_DBCHAR, or SQL_C_WCHAR. In host
applications, use XML AS BLOB as the host variable type.

When you insert XML data from Java applications, reading in the XML data as a
binary stream (setBinaryStream) is better than reading it as a string
(setString). Similarly, if your Java application receives XML from DB2 and
writes it to a file, code page conversion can occur if the XML is written as
non-binary data.
When you retrieve XML data from DB2 into your application, it is serialized.
Serialization is the inverse operation of XML parsing. It is the process that
DB2 uses to convert the internal XML format, which is a parsed, tree-like
representation, into the textual XML format that your application can
understand. In most cases, it is best to let DB2 perform implicit
serialization. This means that your SQL/XML statements simply select XML-type
values, as shown in the following example, and DB2 performs the serialization
into your application variables as efficiently as possible:

CREATE TABLE customer(info XML);

SELECT info FROM customer WHERE...;

SELECT XMLQUERY('$i/customerinfo/name' passing info as "i")
FROM customer
WHERE...;

If your application deals with very large XML documents, it might benefit from
using LOB locators for data retrieval. This requires explicit serialization to a LOB
type, preferably BLOB, because explicit serialization into a character type such as
CLOB can introduce encoding issues and unnecessary code page conversion.
Explicit serialization uses the XMLSERIALIZE function as shown in the following
query.
SELECT XMLSERIALIZE(info as BLOB(1M)) FROM customer WHERE...;
Related information
IBM Binary XML Specification
Writing efficient predicates
Predicates are found in the WHERE, HAVING, or ON clauses of SQL statements;
they describe attributes of data. Because SQL allows you to express the same query
in different ways, knowing how predicates affect path selection helps you write
queries that access data efficiently.
PSPI
Predicates are usually based on the columns of a table and either qualify
rows (through an index) or reject rows (returned by a scan) when the table is
accessed. The resulting qualified or rejected rows are independent of the access
path chosen for that table.
The following query has three predicates: an equal predicate on C1, a BETWEEN
predicate on C2, and a LIKE predicate on C3.
SELECT * FROM T1
WHERE C1 = 10 AND
C2 BETWEEN 10 AND 20 AND
C3 NOT LIKE ’A%’
PSPI
Ensuring that predicates are coded correctly
Whether you code the predicates of your SQL statements correctly has a great
effect on the performance of those queries.
Procedure
PSPI
To ensure the best performance from your predicates:
v Make sure all the predicates that you think should be indexable are coded so
that they can be indexable. Refer to Table 65 on page 255 to see which predicates
are indexable and which are not.
v Try to remove any predicates that are unintentionally redundant or not needed;
they can slow down performance.
v For string comparisons other than equal comparisons, ensure that the declared
length of a host variable is less than or equal to the length attribute of the table
column that it is compared to. For languages in which character strings are
null-terminated, the string length can be less than or equal to the column length
plus 1. If the declared length of the host variable is greater than the column
length, in a non-equal comparison, the predicate is stage 1 but cannot be a
matching predicate for an index scan.
For example, assume that a host variable and an SQL column are defined as
follows:

C language declaration:  char string_hv[15]
SQL definition:          STRING_COL CHAR(12)
A predicate such as WHERE STRING_COL > :string_hv is not a matching predicate
for an index scan because the length of string_hv is greater than the length of
STRING_COL. One way to avoid an inefficient predicate using character host
variables is to declare the host variable with a length that is less than or equal to
the column length:
char string_hv[12]
Because this is a C language example, the host variable length could be 1 byte
greater than the column length:
char string_hv[13]
For numeric comparisons, a comparison between a DECIMAL column and a
float or real host variable is stage 2 if the precision of the DECIMAL column
is greater than 15. For example, assume that a host variable and an SQL
column are defined as follows:

C language declaration:  float float_hv
SQL definition:          DECIMAL_COL DECIMAL(16,2)
A predicate such as WHERE DECIMAL_COL = :float_hv is not a matching predicate
for an index scan because the precision of DECIMAL_COL is greater than 15.
However, if DECIMAL_COL is defined as DECIMAL(15,2), the predicate is stage
1 and indexable. PSPI
Properties of predicates
Predicates in the WHERE and ON clauses of a SQL statement affect how DB2
selects the access path for the statement.
PSPI
Predicates in a HAVING clause are not used when DB2 selects access paths.
Consequently, in this topic, the term 'predicate' means a predicate after WHERE or
ON. A predicate influences the selection of an access path according to the
following factors:
v The type of predicate, according to its operator or syntax.
v Whether the predicate is indexable.
v Whether the predicate is stage 1 or stage 2.
v Whether the predicate contains a ROWID column.
v Whether the predicate is part of an ON clause.
The following terms are used to differentiate and classify certain kinds of
predicates:
Simple or compound
A compound predicate is the result of two predicates, whether simple or
compound, connected together by AND or OR Boolean operators. All
others are simple.
Local or join
Local predicates reference only one table. They are local to the table and
restrict the number of rows returned for that table. Join predicates involve
more than one table or correlated reference. They determine the way rows
are joined from two or more tables.
Boolean term
Any predicate that is not contained by a compound OR predicate structure
is a Boolean term. If a Boolean term is evaluated false for a particular row,
the whole WHERE clause is evaluated false for that row.
PSPI
Predicate types
The type of a predicate depends on its operator or syntax. The type determines
what type of processing and filtering occurs when DB2 evaluates the predicate.
PSPI
The following table shows the different predicate types.

Table 64. Definitions and examples of predicate types

Type      Definition                                          Example
Subquery  Any predicate that includes another SELECT          C1 IN (SELECT C10 FROM TABLE1)
          statement.
Equal     Any predicate that is not a subquery predicate      C1=100
          and has an equal operator and no NOT operator.
          Also included are predicates of the form
          C1 IS NULL and C1 IS NOT DISTINCT FROM.
Range     Any predicate that is not a subquery predicate      C1>100
          and contains one of the following operators:
          >, >=, <, <=, LIKE, or BETWEEN.
IN-list   A predicate of the form column IN (list of          C1 IN (5,10,15)
          values).
NOT       Any predicate that is not a subquery predicate      C1 <> 5 or C1 NOT BETWEEN 10 AND 20
          and contains a NOT operator. Also included are
          predicates of the form C1 IS DISTINCT FROM.
The following two examples show how the predicate type can influence the way
that DB2 chooses an access path. In each one, assume that a unique index, I1 (C1),
exists on table T1 (C1, C2), and that all values of C1 are positive integers.
Example equal predicate
The following query contains an equal predicate:
SELECT * FROM T1 WHERE C1 = 0;
DB2 chooses index access in this case because the index is highly selective on
column C1.
Example range predicate
The following query contains a range predicate:
SELECT C1, C2 FROM T1 WHERE C1 >= 0;
However, the predicate does not eliminate any rows of T1. Therefore, DB2 might
determine during bind that a table space scan is more efficient than the index scan.
PSPI
Indexable and non-indexable predicates:
An indexable predicate can match index entries; predicates that cannot match index
entries are said to be non-indexable.
PSPI
To make your queries as efficient as possible, you can use indexable
predicates in your queries and create suitable indexes on your tables. Indexable
predicates allow the possible use of a matching index scan, which is often a very
efficient access path.
Indexable predicates might or might not become matching predicates of an index;
depending on the availability of indexes and the access path that DB2 chooses at
bind time.
For example, if the employee table has an index on the column LASTNAME, the
following predicate can be a matching predicate:
SELECT * FROM DSN8910.EMP WHERE LASTNAME = 'SMITH';
In contrast, the following predicate cannot be a matching predicate, because it is
not indexable.
SELECT * FROM DSN8910.EMP WHERE SEX <> 'F';
PSPI
Stage 1 and stage 2 predicates:
Rows retrieved for a query go through two stages of processing. Certain
predicates can be applied during the first stage of processing, whereas
others cannot be applied until the second stage of processing. You can
improve the performance of your queries by using predicates that can be
applied during the first stage whenever possible.
PSPI
Predicates that can be applied during the first stage of processing are
called Stage 1 predicates. These predicates are also sometimes said to be sargable.
Similarly, predicates that cannot be applied until the second stage of processing are
called stage 2 predicates, and sometimes described as nonsargable or residual
predicates.
Whether a predicate is stage 1 or stage 2 depends on the following factors:
v The syntax of the predicate.
v Data type and length of constants or columns in the predicate.
A simple predicate whose syntax classifies it as indexable and stage 1 might not
be indexable or stage 1 because of data types that are associated with the
predicate. For example, a predicate that is associated with either columns or
constants of the DECFLOAT data type is never treated as stage 1. Similarly a
predicate that contains constants or columns whose lengths are too long also
might not be stage 1 or indexable.
For example, the following predicate is not indexable:
CHARCOL<'ABCDEFG', where CHARCOL is defined as CHAR(6)
The predicate is not indexable because the length of the column is shorter than
the length of the constant.
The following predicate is not stage 1:
DECCOL>34.5e0, where DECCOL is defined as DECIMAL(18,2)
The predicate is not stage 1 because the precision of the decimal column is
greater than 15.
v Whether DB2 evaluates the predicate before or after a join operation. A predicate
that is evaluated after a join operation is always a stage 2 predicate.
v Join sequence.
The same predicate might be stage 1 or stage 2, depending on the join sequence.
Join sequence is the order in which DB2 joins tables when it evaluates a query.
The join sequence is not necessarily the same as the order in which the tables
appear in the predicate.
For example, the following predicate might be stage 1 or stage 2:
T1.C1=T2.C1+1
If T2 is the first table in the join sequence, the predicate is stage 1, but if T1 is
the first table in the join sequence, the predicate is stage 2.
You can determine the join sequence by executing EXPLAIN on the query and
examining the resulting plan table.
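For example (a sketch; the QUERYNO value is arbitrary), you can populate and
then inspect the plan table as follows. The PLANNO order of the rows shows
the join sequence:

EXPLAIN PLAN SET QUERYNO = 100 FOR
  SELECT * FROM T1, T2 WHERE T1.C1 = T2.C1 + 1;

SELECT QUERYNO, PLANNO, TNAME, METHOD, ACCESSTYPE, MATCHCOLS
FROM PLAN_TABLE
WHERE QUERYNO = 100
ORDER BY QUERYNO, PLANNO;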
All indexable predicates are stage 1. The predicate C1 LIKE '%BC' is stage 1,
but it is not indexable.
PSPI
Boolean term predicates:
You can improve the performance of queries by choosing Boolean term predicates
over non-Boolean term predicates for join operations whenever possible.
PSPI
A Boolean term predicate is a simple or compound predicate that, when it is
evaluated false for a particular row, makes the entire WHERE clause false for that
particular row.
For example, in the following query P1, P2 and P3 are simple predicates:
SELECT * FROM T1 WHERE P1 AND (P2 OR P3);
v P1 is a simple Boolean term predicate.
v P2 and P3 are simple non-Boolean term predicates.
v P2 OR P3 is a compound Boolean term predicate.
v P1 AND (P2 OR P3) is a compound Boolean term predicate.
In single-index processing, only Boolean term predicates are chosen for matching
predicates. Hence, only indexable Boolean term predicates are candidates for
matching index scans. To match index columns by predicates that are not Boolean
terms, DB2 considers multiple-index access.
In join operations, Boolean term predicates can reject rows at an earlier stage than
can non-Boolean term predicates.
Recommendation: For join operations, choose Boolean term predicates over
non-Boolean term predicates whenever possible.
PSPI
Predicates in the ON clause
The ON clause supplies the join condition in an outer join. For a full outer join, the
clause can use only equal predicates. For other outer joins, the clause can use any
predicates except predicates that contain subqueries.
PSPI
For left and right outer joins, and for inner joins, join predicates in the ON
clause are treated the same as other stage 1 and stage 2 predicates. A stage 2
predicate in the ON clause is treated as a stage 2 predicate of the inner table.
For full outer join, the ON clause is evaluated during the join operation like a
stage 2 predicate.
In an outer join, predicates that are evaluated after the join are stage 2 predicates.
Predicates in a table expression can be evaluated before the join and can therefore
be stage 1 predicates.
For example, in the following statement, the predicate EDLEVEL > 100 is evaluated
before the full join and is a stage 1 predicate:
SELECT * FROM (SELECT * FROM DSN8910.EMP
WHERE EDLEVEL > 100) AS X FULL JOIN DSN8910.DEPT
ON X.WORKDEPT = DSN8910.DEPT.DEPTNO;
PSPI
Using predicates efficiently
By following certain rules for how you write predicates, you can improve how
DB2 processes SQL statements.
Procedure
PSPI
To use predicates most efficiently in SQL statements:
v Use stage 1 predicates whenever possible. Stage 1 predicates are better than
stage 2 predicates because they disqualify rows earlier and reduce the amount of
processing that is needed at stage 2. In terms of resource usage, the earlier a
predicate is evaluated, the better.
v Write queries to evaluate the most restrictive predicates first. When predicates
with a high filter factor are processed first, unnecessary rows are screened as
early as possible, which can reduce processing cost at a later stage. However, a
predicate's restrictiveness is only effective among predicates of the same type
and at the same evaluation stage.
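For example, when two equal predicates are at the same stage, list the more
restrictive one first (a sketch against the DSN8910.EMP sample table; the
filter factors are assumed for illustration):

SELECT EMPNO, LASTNAME
FROM DSN8910.EMP
WHERE JOB = 'DESIGNER'
  AND WORKDEPT = 'D11';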
PSPI
When DB2 evaluates predicates
Two sets of rules determine the order of predicate evaluation.
PSPI
The first set of rules describes the order of predicate evaluation by stage:
1. Indexable predicates are applied first. All matching predicates on index key
columns are applied first and evaluated when the index is accessed.
Next, stage 1 predicates that have not been picked as matching predicates, but
still refer to index columns, are applied to the index. This is called index
screening.
2. Other stage 1 predicates are applied next.
After data page access, stage 1 predicates are applied to the data.
3. Finally, the stage 2 predicates are applied on the returned data rows.
The second set of rules describes the order of predicate evaluation within each of
the stages:
1. All equal predicates (including column IN list, where list has only one element,
or column BETWEEN value1 AND value1) are evaluated.
2. All range predicates and predicates of the form column IS NOT NULL are
evaluated.
3. All other predicate types are evaluated.
After both sets of rules are applied, predicates are evaluated in the order in which
they appear in the query. Because you specify that order, you have some control
over the order of evaluation.
Exception: Regardless of coding order, non-correlated subqueries are evaluated
before correlated subqueries, unless DB2 correlates, de-correlates, or transforms the
subquery into a join.
PSPI
Summary of predicate processing
The following table lists many of the simple predicates and tells whether those
predicates are indexable or stage 1.
PSPI
The following terms are used:
subq
A correlated or noncorrelated subquery
noncor subq
A non-correlated subquery
cor subq
A correlated subquery
op
any of the operators >, >=, <, <=, ¬>, ¬<
value A constant, host variable, or special register.
pattern Any character string that does not start with the special characters for
percent (%) or underscore (_).
char
Any character string that does not include the special characters for percent
(%) or underscore (_).
expression
Any expression that contains arithmetic operators, scalar functions,
aggregate functions, concatenation operators, columns, constants, host
variables, special registers, or date or time expressions.
noncol expr
A non-column expression, which is any expression that does not contain a
column. That expression can contain arithmetic operators, scalar functions,
concatenation operators, constants, host variables, special registers, or date
or time expressions.
An example of a non-column expression is
CURRENT DATE - 50 DAYS
Tn col expr
An expression that contains a column in table Tn. The expression might be
only that column.
predicate
A predicate of any type.
In general, if you form a compound predicate by combining several simple
predicates with OR operators, the result of the operation has the same
characteristics as the simple predicate that is evaluated latest. For example, if two
indexable predicates are combined with an OR operator, the result is indexable. If a
stage 1 predicate and a stage 2 predicate are combined with an OR operator, the
result is stage 2. Any predicate that is associated with the DECFLOAT data type is
neither stage 1 nor indexable.
Table 65. Predicate types and processing

Predicate Type                                 Indexable?  Stage 1?  Notes
COL = value                                    Y           Y         16
COL = noncol expr                              Y           Y         9, 11, 12, 15
COL IS NULL                                    Y           Y         20, 21
COL op value                                   Y           Y         13
COL op noncol expr                             Y           Y         9, 11, 12, 13
COL BETWEEN value1 AND value2                  Y           Y         13
COL BETWEEN noncol expr1 AND noncol expr2      Y           Y         9, 11, 12, 13, 23
value BETWEEN COL1 AND COL2                    N           N
COL BETWEEN COL1 AND COL2                      N           N         10
COL BETWEEN expression1 AND expression2        Y           Y         6, 7, 11, 12, 13, 14, 15, 27
COL LIKE 'pattern'                             Y           Y         5
COL IN (list)                                  Y           Y         17, 18
COL <> value                                   N           Y         8, 11
COL <> noncol expr                             N           Y         8, 11
COL IS NOT NULL                                Y           Y         21
COL NOT BETWEEN value1 AND value2              N           Y
COL NOT BETWEEN noncol expr1 AND noncol expr2  N           Y
value NOT BETWEEN COL1 AND COL2                N           N
COL NOT IN (list)                              N           Y
COL NOT LIKE 'char'                            N           Y         5
COL LIKE '%char'                               N           Y         1, 5
COL LIKE '_char'                               N           Y         1, 5
COL LIKE host variable                         Y           Y         2, 5
T1.COL = T2 col expr                           Y           Y         6, 9, 11, 12, 14, 15, 25, 27
T1.COL op T2 col expr                          Y           Y         6, 9, 11, 12, 13, 14, 15
T1.COL <> T2 col expr                          N           Y         8, 11, 27
T1.COL1 = T1.COL2                              N           N         3, 25
T1.COL1 op T1.COL2                             N           N         3
T1.COL1 <> T1.COL2                             N           N         3
COL = (noncor subq)                            Y           Y         28
COL = ANY (noncor subq)                        Y           Y         22
COL = ALL (noncor subq)                        N           N
COL op (noncor subq)                           Y           Y
COL op ANY (noncor subq)                       Y           Y         22
COL op ALL (noncor subq)                       Y           Y
COL <> (noncor subq)                           N           Y
COL <> ANY (noncor subq)                       N           N         22
COL <> ALL (noncor subq)                       N           N
COL IN (noncor subq)                           Y           Y         19, 24
(COL1,...COLn) IN (noncor subq)                Y           Y
COL NOT IN (noncor subq)                       N           N
(COL1,...COLn) NOT IN (noncor subq)            N           N
COL = (cor subq)                               N           N         4
COL = ANY (cor subq)                           Y           Y         19, 22
COL = ALL (cor subq)                           N           N
COL op (cor subq)                              N           N         4
COL op ANY (cor subq)                          N           N         22
COL op ALL (cor subq)                          N           N
COL <> (cor subq)                              N           N         4
COL <> ANY (cor subq)                          N           N         22
COL <> ALL (cor subq)                          N           N
COL IN (cor subq)                              Y           Y
(COL1,...COLn) IN (cor subq)                   N           N
COL NOT IN (cor subq)                          N           N
(COL1,...COLn) NOT IN (cor subq)               N           N
COL IS DISTINCT FROM value                     N           Y         8, 11
COL IS NOT DISTINCT FROM value                 Y           Y         16
COL IS DISTINCT FROM noncol expr               N           Y         8, 11
COL IS NOT DISTINCT FROM noncol expr           Y           Y         9, 11, 12, 15
T1.COL1 IS DISTINCT FROM T2.COL2               N           N         3
T1.COL1 IS NOT DISTINCT FROM T2.COL2           N           N         3
T1.COL1 IS DISTINCT FROM T2 col expr           N           Y         8, 11
T1.COL1 IS NOT DISTINCT FROM T2 col expr       Y           Y         6, 9, 11, 12, 14, 15
COL IS DISTINCT FROM (noncor subq)             N           Y
COL IS NOT DISTINCT FROM (noncor subq)         Y           Y         19
COL IS NOT DISTINCT FROM (cor subq)            N           N         4
EXISTS (subq)                                  N           N         19
NOT EXISTS (subq)                              N           N
expression = value                             N           N         27
expression <> value                            N           N         27
expression op value                            N           N         27
expression op (subq)                           N           N
XMLEXISTS                                      Y           N         26
NOT XMLEXISTS                                  N           N
Notes:
1. Indexable only if an ESCAPE character is specified and used in the LIKE
predicate. For example, COL LIKE '+%char' ESCAPE '+' is indexable.
2. Indexable only if the pattern in the host variable is an indexable constant (for
example, host variable='char%').
3. If both COL1 and COL2 are from the same table, access through an index on
either one is not considered for these predicates. However, the following
query is an exception:
SELECT *
FROM T1 A, T1 B WHERE A.C1 = B.C2;
By using correlation names, the query treats one table as if it were two
separate tables. Therefore, indexes on columns C1 and C2 are considered for
access.
4. If the subquery has already been evaluated for a given correlation value, then
the subquery might not have to be reevaluated.
5. Not indexable or stage 1 if a field procedure exists on that column.
6. The column on the left side of the join sequence must be in a different table
from any columns on the right side of the join sequence.
7. The tables that contain the columns in expression1 or expression2 must already
have been accessed.
8. The processing for WHERE NOT COL = value is like that for WHERE COL <>
value, and so on.
9. If noncol expr, noncol expr1, or noncol expr2 is a noncolumn expression of one of
these forms, then the predicate is not indexable:
v noncol expr + 0
v noncol expr - 0
v noncol expr * 1
v noncol expr / 1
v noncol expr CONCAT empty string
10. COL, COL1, and COL2 can be the same column or different columns. The
columns are in the same table.
11. Any of the following sets of conditions make the predicate stage 2:
v The first value obtained before the predicate is evaluated is DECIMAL(p,s),
where p>15, and the second value obtained before the predicate is evaluated
is REAL or FLOAT.
v The first value obtained before the predicate is evaluated is CHAR,
VARCHAR, GRAPHIC, or VARGRAPHIC, and the second value obtained
before the predicate is evaluated is DATE, TIME, or TIMESTAMP.
12. The predicate is stage 1 but not indexable if the first value obtained before the
predicate is evaluated is CHAR or VARCHAR, the second value obtained
before the predicate is evaluated is GRAPHIC or VARGRAPHIC, and the first
value obtained before the predicate is evaluated is not Unicode mixed.
13. If both sides of the comparison are strings, any of the following sets of
conditions makes the predicate stage 1 but not indexable:
v The first value obtained before the predicate is evaluated is CHAR or
VARCHAR, and the second value obtained before the predicate is evaluated
is GRAPHIC or VARGRAPHIC.
v Both of the following conditions are true:
– Both sides of the comparison are CHAR or VARCHAR, or both sides of
the comparison are BINARY or VARBINARY
– The length of the first value obtained before the predicate is evaluated is
less than the length of the second value obtained before the predicate is
evaluated.
v Both of the following conditions are true:
– Both sides of the comparison are GRAPHIC or VARGRAPHIC.
– The length of the first value obtained before the predicate is evaluated is
less than the length of the second value obtained before the predicate is
evaluated.
v Both of the following conditions are true:
– The first value obtained before the predicate is evaluated is GRAPHIC or
VARGRAPHIC, and the second value obtained before the predicate is
evaluated is CHAR or VARCHAR.
– The length of the first value obtained before the predicate is evaluated is
less than the length of the second value obtained before the predicate is
evaluated.
14. If both sides of the comparison are strings, but the two sides have different
CCSIDs, the predicate is stage 1 and indexable only if the first value obtained
before the predicate is evaluated is Unicode and the comparison does not
meet any of the conditions in note 13.
15. Under either of these circumstances, the predicate is stage 2:
v noncol expr is a case expression.
v All of the following conditions are true:
– noncol expr is the product or the quotient of two non-column expressions
– noncol expr is an integer value
– COL is a FLOAT or a DECIMAL column
16. If COL has the ROWID data type, DB2 tries to use direct row access instead of
index access or a table space scan.
17. If COL has the ROWID data type, and an index is defined on COL, DB2 tries
to use direct row access instead of index access.
18. IN-list predicates are indexable and stage 1 if the following conditions are
true:
v The IN list contains only simple items, such as constants, host variables,
parameter markers, and special registers.
v The IN list does not contain any aggregate functions or scalar functions.
v The IN list is not contained in a trigger's WHEN clause.
v For numeric predicates where the left side column is DECIMAL with
precision greater than 15, none of the items in the IN list are FLOAT.
v For string predicates, the coded character set identifier is the same as the
identifier for the left side column.
v For DATE, TIME, and TIMESTAMP predicates, the left side column must be
DATE, TIME, or TIMESTAMP.
19. Certain predicates might become indexable and stage 1 depending on how
they are transformed during processing.
20. The predicate types COL IS NULL and COL IS NOT NULL are stage 2
predicates when they query a column that is defined as NOT NULL.
21. If the predicate type is COL IS NULL and the column is defined as NOT
NULL, the table is not accessed because the column cannot be NULL.
22. The ANY and SOME keywords behave similarly. If a predicate with the ANY
keyword is not indexable and not stage 1, a similar predicate with the SOME
keyword is not indexable and not stage 1.
23. Under either of these circumstances, the predicate is stage 2:
v noncol expr is a case expression.
v noncol expr is the product or the quotient of two noncolumn expressions,
that product or quotient is an integer value, and COL is a FLOAT or a
DECIMAL column.
24. COL IN (noncor subq) is stage 1 for type N access only. Otherwise, it is stage 2.
25. If the inner table is an EBCDIC or ASCII column and the outer table is a
Unicode column, the predicate is stage 1 and indexable.
26. The XMLEXISTS predicate is always stage 2. But the same predicate can be
indexable and become the matching predicate if an XML index can be used to
evaluate the XPath expression in the predicate. The XMLEXISTS predicate can
never be a screening predicate.
27. The predicate might be indexable by an index on expression if it contains an
expression that is a column reference, invokes a built-in function, or contains a
general expression.
28. This type of predicate is not stage 1 when a nullability mismatch is possible.
Examples of predicate properties
The following examples can help you to understand how and at which stage DB2
processes different predicates.
Assume that predicates P1 and P2 are simple, stage 1, indexable predicates:
P1 AND P2 is a compound, stage 1, indexable predicate.
P1 OR P2 is a compound, stage 1 predicate, not indexable except by a union of
RID lists from two indexes.
The following examples of predicates illustrate the general rules of predicate
processing. In each case, assume that an index has been created on columns
(C1,C2,C3,C4) of the table and that 0 is the lowest value in each column.
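For reference, the assumed index could be created as follows (a sketch; the table
and index names are hypothetical placeholders):
CREATE INDEX IX1
  ON T (C1, C2, C3, C4);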
WHERE C1=5 AND C2=7
Both predicates are stage 1 and the compound predicate is indexable. A
matching index scan could be used with C1 and C2 as matching columns.
WHERE C1=5 AND C2>7
Both predicates are stage 1 and the compound predicate is indexable. A
matching index scan could be used with C1 and C2 as matching columns.
WHERE C1>5 AND C2=7
Both predicates are stage 1, but only the first matches the index. A
matching index scan could be used with C1 as a matching column.
WHERE C1=5 OR C2=7
Both predicates are stage 1 but not Boolean terms. The compound is
indexable. Multiple-index access for the compound predicate is not
possible because no index has C2 as the leading column. For single-index
access, C1 and C2 can be only index screening columns.
WHERE C1=5 OR C2<>7
The first predicate is indexable and stage 1, and the second predicate is
stage 1 but not indexable. The compound predicate is stage 1 and not
indexable.
WHERE C1>5 OR C2=7
Both predicates are stage 1 but not Boolean terms. The compound is
indexable. Multiple-index access for the compound predicate is not
possible because no index has C2 as the leading column. For single-index
access, C1 and C2 can be only index screening columns.
WHERE C1 IN (cor subq) AND C2=C1
As written, both predicates are stage 2 and not indexable. The index is not
considered for matching-index access, and both predicates are evaluated at
stage 2. However, DB2 might transform the correlated subquery to a
non-correlated subquery during processing, in which case both predicates
become indexable and stage 1.
WHERE C1=5 AND C2=7 AND (C3 + 5) IN (7,8)
Only the first two predicates are stage 1 and indexable. The index is
considered for matching-index access, and all rows satisfying those two
predicates are passed to stage 2 to evaluate the third predicate.
WHERE C1=5 OR C2=7 OR (C3 + 5) IN (7,8)
The third predicate is stage 2. The compound predicate is stage 2 and all
three predicates are evaluated at stage 2. The simple predicates are not
Boolean terms and the compound predicate is not indexable.
WHERE C1=5 OR (C2=7 AND C3=C4)
The third predicate is stage 2. The two compound predicates (C2=7 AND
C3=C4) and (C1=5 OR (C2=7 AND C3=C4)) are stage 2. All predicates are
evaluated at stage 2.
WHERE (C1>5 OR C2=7) AND C3 = C4
The compound predicate (C1>5 OR C2=7) is indexable and stage 1. The
simple predicate C3=C4 is not stage 1, so the index is not considered for
matching-index access. Rows that satisfy the compound predicate (C1>5
OR C2=7) are passed to stage 2 for evaluation of the predicate C3=C4.
WHERE C1 = 17 AND C2 <> 100
In this example, assuming that the RANDOM ordering option has been
specified on C2 in the CREATE INDEX statement, the query can use the
index only in a limited way. The index is an effective filter on C1, but it
would not match on C2 because of the random values. The index is
scanned for all values where C1=17, and only then are the values for C2
checked to ensure that they are not equal to 100.
WHERE (C1 = 1 OR C2 = 1) AND XMLEXISTS('/a/b[c = 1]' PASSING
XML_COL1) AND XMLEXISTS('/a/b[(e = 2 or f[g] = 3) and /h/i[j] = 4]' PASSING
XML_COL2)
The compound predicate (C1 = 1 OR C2 = 1) is indexable and stage 1. The
first XMLEXISTS predicate is indexable and can become a matching
predicate if the XML index /a/b/c has been created. The second
XMLEXISTS predicate is indexable and can use multiple index access if the
XML indexes, /a/b/e, /a/b/f/g, and /a/b/h/i/j, can be used to evaluate
three XPath segments in the predicate. All rows satisfying the three
indexable predicates (one compound and two XMLEXISTS) are passed to
stage 2 to evaluate the same first and second XMLEXISTS predicates again.
Predicate filter factors
By understanding how DB2 uses filter factors, you can write more efficient
predicates.
The filter factor of a predicate is a number between 0 and 1 that estimates
the proportion of rows in a table for which the predicate is true. Those rows are
said to qualify by that predicate.
For example, suppose that DB2 can determine that column C1 of table T contains
only five distinct values: A, D, Q, W and X. In the absence of other information,
DB2 estimates that one-fifth of the rows have the value D in column C1. Then the
predicate C1=’D’ has the filter factor 0.2 for table T.
How DB2 uses filter factors:
DB2 uses filter factors to estimate the number of rows qualified by a set of
predicates.
For simple predicates, the filter factor is a function of three variables:
v The constant value in the predicate; for instance, 'D' in the previous example.
v The operator in the predicate; for instance, '=' in the previous example and '<>'
in the negation of the predicate.
v Statistics on the column in the predicate. In the previous example, those include
the information that column T.C1 contains only five values.
Recommendation: Control the first two of those variables when you write a
predicate. Your understanding of how DB2 uses filter factors should help you write
more efficient predicates.
Values of the third variable, statistics on the column, are kept in the DB2 catalog.
You can update many of those values, either by running the utility RUNSTATS or
by executing UPDATE for a catalog table.
If you intend to update the catalog with statistics of your own choice, you should
understand how DB2 uses filter factors and interpolation formulas.
Default filter factors for simple predicates
DB2 uses default filter factor values when no other statistics exist.
The following table lists default filter factors for different types of
predicates.
Table 66. DB2 default filter factors by predicate type

Predicate type                           Filter factor
Col = constant                           1/25
Col <> constant                          1 - (1/25)
Col IS NULL                              1/25
Col IS NOT DISTINCT FROM                 1/25
Col IS DISTINCT FROM                     1 - (1/25)
Col IN (constant list)                   (number of constants)/25
Col Op constant (note 1)                 1/3
Col LIKE constant                        1/10
Col BETWEEN constant1 AND constant2      1/10

Note:
1. Op is one of these operators: <, <=, >, >=.
Example
The default filter factor for the predicate C1 = 'D' is 1/25 (0.04). If the actual
proportion of rows that contain the value D is not close to 0.04, the default
probably does not lead to an optimal access path.
Filter factors for other predicate types:
The preceding examples represent only the most common types of predicates. If P1
is a predicate and F is its filter factor, then the filter factor of the predicate NOT P1
is (1 - F). However, filter factor calculation depends on many factors, so a specific
filter factor cannot be given for all predicate types.
Filter factors for uniform distributions
In certain situations DB2 assumes that data is distributed uniformly and
calculates filter factors accordingly.
DB2 uses the filter factors in the following table if:
v The value in column COLCARDF of catalog table SYSIBM.SYSCOLUMNS for
the column “Col” is a positive value.
v No additional statistics exist for “Col” in SYSIBM.SYSCOLDIST.
Table 67. DB2 uniform filter factors by predicate type

Predicate type                           Filter factor
Col = constant                           1/COLCARDF
Col <> constant                          1 - (1/COLCARDF)
Col IS NULL                              1/COLCARDF
Col IS NOT DISTINCT FROM                 1/COLCARDF
Col IS DISTINCT FROM                     1 - (1/COLCARDF)
Col IN (constant list)                   (number of constants)/COLCARDF
Col Op1 constant (note 1)                interpolation formula
Col Op2 constant (note 2)                interpolation formula
Col LIKE constant                        interpolation formula
Col BETWEEN constant1 AND constant2      interpolation formula

Notes:
1. Op1 is < or <=, and the constant is not a host variable.
2. Op2 is > or >=, and the constant is not a host variable.
Example
If D is one of only five values in column C1, using RUNSTATS puts the value 5 in
column COLCARDF of SYSCOLUMNS. If no additional statistics are available, the
filter factor for the predicate C1 = 'D' is 1/5 (0.2).
Interpolation formulas
For a predicate that uses a range of values, DB2 calculates the filter factor by an
interpolation formula.
The formula is based on an estimate of the ratio of the number of values
in the range to the number of values in the entire column of the table.
The formulas
The formulas that follow are rough estimates, which are subject to further
modification by DB2. They apply to a predicate of the form col op constant. The
value of (Total Entries) in each formula is estimated from the values in columns
HIGH2KEY and LOW2KEY in catalog table SYSIBM.SYSCOLUMNS for column col:
Total Entries = (HIGH2KEY value - LOW2KEY value).
v For the operators < and <=, where the constant is not a host variable:
(constant value - LOW2KEY value) / (Total Entries)
v For the operators > and >=, where the constant is not a host variable:
(HIGH2KEY value - constant value) / (Total Entries)
v For LIKE or BETWEEN:
(High constant value - Low constant value) / (Total Entries)
Example
For column C2 in a predicate, suppose that the value of HIGH2KEY is 1400 and
the value of LOW2KEY is 200. For C2, DB2 calculates Total Entries = 1400 - 200, or
1200.
For the predicate C2 BETWEEN 800 AND 1100, DB2 calculates the filter factor F as:
F = (1100 - 800)/1200 = 1/4 = 0.25
Interpolation for LIKE
DB2 treats a LIKE predicate as a type of BETWEEN predicate. Two values that
bound the range qualified by the predicate are generated from the constant string
in the predicate. Only the leading characters found before the first wildcard
character ('%' or '_') are used to generate the bounds. So if a wildcard character is
the first character of the string, the filter factor is estimated as 1, and the predicate is
estimated to reject no rows.
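For example (a sketch of the idea, not the exact internal computation), the bounds
generated for a LIKE predicate with the leading characters 'AB' can be pictured as
a range:
-- The predicate
WHERE C1 LIKE 'AB%'
-- is estimated, for filter-factor purposes, roughly like a range
-- predicate whose bounds come from the leading characters 'AB':
WHERE C1 >= 'AB' AND C1 < 'AC'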
Defaults for interpolation
DB2 might not interpolate in some cases; instead, it can use a default filter factor.
Defaults for interpolation are:
v Relevant only for ranges, including LIKE and BETWEEN predicates
v Used only when interpolation is not adequate
v Based on the value of COLCARDF
v Used whether uniform or additional distribution statistics exist on the column if
either of the following conditions is met:
– The predicate does not contain constants
– COLCARDF < 4.
The following table shows interpolation defaults for the operators <, <=, >, >= and
for LIKE and BETWEEN.
Table 68. Default filter factors for interpolation

COLCARDF        Factor for Op (note 1)    Factor for LIKE or BETWEEN
>=100000000     1/10,000                  3/100,000
>=10000000      1/3,000                   1/10,000
>=1000000       1/1,000                   3/10,000
>=100000        1/300                     1/1,000
>=10000         1/100                     3/1,000
>=1000          1/30                      1/100
>=100           1/10                      3/100
>=2             1/3                       1/10
=1              1/1                       1/1
<=0             1/3                       1/10

Note:
1. Op is one of these operators: <, <=, >, >=.
Filter factors for all distributions
RUNSTATS can generate additional statistics for a column or set of columns. DB2
can use that information to calculate filter factors.
DB2 collects two kinds of distribution statistics:
Frequency
The percentage of rows in the table that contain a value for a column or set
of columns
Cardinality
The number of distinct values in a set of columns
The table that follows lists the types of predicates on which these statistics are
used.
Table 69. Predicates for which distribution statistics are used

Type of statistic   Single or concatenated columns   Predicates
Frequency           Single                           COL=constant
                                                     COL IS NULL
                                                     COL IN (constant-list)
                                                     COL op constant (note 1)
                                                     COL BETWEEN constant AND constant
                                                     COL=host-variable
                                                     COL1=COL2
                                                     T1.COL=T2.COL
                                                     COL IS NOT DISTINCT FROM
Frequency           Concatenated                     COL=constant
                                                     COL IS NOT DISTINCT FROM
Cardinality         Single                           COL=constant
                                                     COL IS NULL
                                                     COL IN (constant-list)
                                                     COL op constant (note 1)
                                                     COL BETWEEN constant AND constant
                                                     COL=host-variable
                                                     COL1=COL2
                                                     T1.COL=T2.COL
                                                     COL IS NOT DISTINCT FROM
Cardinality         Concatenated                     COL=constant
                                                     COL=:host-variable
                                                     COL1=COL2
                                                     COL IS NOT DISTINCT FROM

Note:
1. op is one of these operators: <, <=, >, >=.
How DB2 uses frequency statistics
Columns COLVALUE and FREQUENCYF in table SYSCOLDIST contain
distribution statistics. Regardless of the number of values in those columns,
running RUNSTATS deletes the existing values and inserts rows for frequent
values.
You can run RUNSTATS without the FREQVAL option, with the FREQVAL option
in the correl-spec, with the FREQVAL option in the colgroup-spec, or in both, with
the following effects:
v If you run RUNSTATS without the FREQVAL option, RUNSTATS inserts rows
for the 10 most frequent values for the first column of the specified index.
v If you run RUNSTATS with the FREQVAL option in the correl-spec, RUNSTATS
inserts rows for concatenated columns of an index. The NUMCOLS option
specifies the number of concatenated index columns. The COUNT option
specifies the number of frequent values. You can collect most-frequent values,
least-frequent values, or both.
v If you run RUNSTATS with the FREQVAL option in the colgroup-spec,
RUNSTATS inserts rows for the columns in the column group that you specify.
The COUNT option specifies the number of frequent values. You can collect
most-frequent values, least-frequent values, or both.
v If you specify the FREQVAL option in both the correl-spec and the colgroup-spec,
RUNSTATS inserts rows for columns of the specified index and for columns in
the column group.
DB2 uses the frequencies in column FREQUENCYF for predicates that use the
values in column COLVALUE and assumes that the remaining data are uniformly
distributed.
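For example, a RUNSTATS control statement of the kind described above might
look like the following sketch, which collects the 10 most frequent values for the
first two concatenated columns of an index (the object names come from the DB2
sample database and stand in for your own objects):
RUNSTATS TABLESPACE DSN8D91A.DSN8S91E
  INDEX(DSN8910.XEMP1
        KEYCARD
        FREQVAL NUMCOLS 2 COUNT 10 MOST)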
Example: Filter factor for a single column
Suppose that the predicate is C1 IN ('3','5') and that SYSCOLDIST contains these
values for column C1:
COLVALUE   FREQUENCYF
'3'        .0153
'5'        .0859
'8'        .0627
The filter factor is .0153 + .0859 = .1012.
Example: Filter factor for correlated columns
Suppose that columns C1 and C2 are correlated. Suppose also that the predicate is
C1='3' AND C2='5' and that SYSCOLDIST contains these values for columns C1
and C2:
COLVALUE     FREQUENCYF
'1' '1'      .1176
'2' '2'      .0588
'3' '3'      .0588
'3' '5'      .1176
'4' '4'      .0588
'5' '3'      .1764
'5' '5'      .3529
'6' '6'      .0588
The filter factor is .1176.
Histogram statistics
You can improve access path selection by specifying the HISTOGRAM option in
RUNSTATS.
RUNSTATS normally collects frequency statistics for a single column or a single
column group. Because catalog space and bind time performance concerns make
the collection of these types of statistics on every distinct value found in the target
column or columns very impractical, such frequency statistics are commonly
collected only on the most frequent or least frequent, and therefore most biased,
values.
Such limited statistics often do not provide an accurate prediction of the value
distribution because they require a rough interpolation across the entire range of
values. For example, suppose that the YRS_OF_EXPERIENCE column on an
EMPLOYEE table contains the following value frequencies:
Table 70. Example frequency statistics for values on the YRS_OF_EXPERIENCE column in
an EMPLOYEE table

VALUE   FREQUENCY
2       10%
25      15%
26      15%
27      7%
12      0.02%
13      0.01%
40      0.0001%
41      0.00001%
Example predicates that can benefit from histogram statistics
Some example predicates on values in this table include:
v Equality predicate with unmatched value:
SELECT EMPID FROM EMPLOYEE T
WHERE T.YRS_OF_EXPERIENCE = 6;
v Range predicate:
SELECT T.EMPID FROM EMPLOYEE T
WHERE T.YRS_OF_EXPERIENCE BETWEEN 5 AND 10;
v Non-local predicate:
SELECT T1.EMPID FROM EMPLOYEE T1, OPENJOBS T2
WHERE T1.SPECIALTY = T2.AREA AND T1.YRS_OF_EXPERIENCE > T2.YRS_OF_EXPERIENCE;
For each of these predicates, distribution statistics for any single value cannot
help DB2 to estimate predicate selectivity, other than by uniform interpolation of
filter factors over the uncollected part of the value range. The result of such
interpolation might lead to inaccurate estimation and undesirable access path
selection.
How DB2 uses histogram statistics
DB2 creates a number of intervals such that each interval contains approximately
the same percentage of rows from the data set. The number of intervals is specified
by the value of NUMQUANTILES when you use the HISTOGRAM option of
RUNSTATS. Each interval has an identifier value, QUANTILENO, and values, in
the LOWVALUE and HIGHVALUE columns, that bound the interval. DB2 collects
distribution statistics for each interval.
When you use RUNSTATS to collect statistics on a column that contains such
wide-ranging frequency values, specify the histogram statistics option to collect
more granular distribution statistics that account for the distribution of values
across the entire range of values. The following table shows the result of collecting
histogram statistics for the years of experience values in the employee table. In this
example, the statistics have been collected with 7 intervals:
Table 71. Histogram statistics for the column YRS_OF_EXPERIENCE in an EMPLOYEE
table

QUANTILENO   LOWVALUE   HIGHVALUE   CARDF   FREQUENCYF
1            0          3           4       14%
2            4          15          8       14%
3            18         24          7       12%
4            25         25          1       15%
5            26         26          1       15%
6            27         30          4       16%
7            35         40          6       14%
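For example, histogram statistics like those shown in the table might be collected
with a control statement similar to this sketch (the database and table space names
are hypothetical):
RUNSTATS TABLESPACE DB1.TS1
  TABLE(EMPLOYEE)
  COLGROUP(YRS_OF_EXPERIENCE)
  HISTOGRAM NUMQUANTILES 7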
How DB2 uses multiple filter factors to determine the cost of a
query
When DB2 estimates the cost of a query, it determines the filter factor repeatedly
and at various levels.
For example, suppose that you execute the following query:
SELECT COLS FROM T1
WHERE C1 = 'A'
AND C3 = 'B'
AND C4 = 'C';
Table T1 consists of columns C1, C2, C3, and C4. Index I1 is defined on table T1
and contains columns C1, C2, and C3.
Suppose that the simple predicates in the compound predicate have the following
characteristics:
C1='A'
Matching predicate
C3='B' Screening predicate
C4='C' Stage 1, nonindexable predicate
To determine the cost of accessing table T1 through index I1, DB2 performs these
steps:
1. Estimates the matching index cost. DB2 determines the index matching filter
factor by using single-column cardinality and single-column frequency statistics
because only one column can be a matching column.
2. Estimates the total index filtering. This includes matching and screening
filtering. If statistics exist on column group (C1,C3), DB2 uses those statistics.
Otherwise DB2 uses the available single-column statistics for each of these
columns.
DB2 also uses FULLKEYCARDF as a bound. Therefore, it can be critical to have
column group statistics on column group (C1, C3) to get an accurate estimate.
3. Estimates the table-level filtering. If statistics are available on column group
(C1,C3,C4), DB2 uses them. Otherwise, DB2 uses statistics that exist on subsets
of those columns.
Important: If you supply appropriate statistics at each level of filtering, DB2 is
more likely to choose the most efficient access path.
You can use RUNSTATS to collect any of the needed statistics.
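For example, the column group statistics for steps 2 and 3 might be collected with
a control statement similar to this sketch (the database and table space names are
hypothetical):
RUNSTATS TABLESPACE DB1.TS1
  TABLE(T1)
  COLGROUP(C1,C3)
  COLGROUP(C1,C3,C4)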
Filter factor estimation for the XMLEXISTS predicate
When the statistics are available for the XML NODEID index, the value
FIRSTKEYCARD from the NODEID index is the number of distinct DOCID values
in the XML table.
FIRSTKEYCARD/CARDF(base table) transforms the filtering of an XMLEXISTS
predicate from the XML table to the base table. The filter factors of all XPath
predicates in the XMLEXISTS predicate indicate the filtering on the XML table.
After the filter factors are multiplied by FIRSTKEYCARD/CARDF(base table), the
filter factors on the XML tables are transformed to filter factors on the base table.
From the NODEID index statistics, if the value FULLKEYCARD is available, the
FULLKEYCARD is used as the default value of CARDF of the XML table.
When the statistics are available for the XML index, the value FIRSTKEYCARD can
be used as the COLCARD of the comparison operand in the XPath predicates. For
each comparison type, the following rules are used to calculate the filter factor for
the XPath predicates:
v A default filter factor, without any statistics information, is the same as for
non-XPath predicates with the same comparison type.
v If the value index can be used to evaluate the XPath predicate, the default filter
factor is redefined based on the FIRSTKEYCARD value. The filter factor is the
same as for non-XPath predicates with the same comparison type and the same
COLCARD.
v Because frequency statistics are not available, for range predicates, interpolation
is performed based on HIGH2KEY/LOW2KEY and the key value of the
comparison.
When no statistics are available for the NODEID index and the value index, no
statistics exist from indexes to facilitate the filter factor estimation for the XPath
predicates in the XMLEXISTS predicate. The following default statistics are used
for the value index in the index costing:
– NLEAF reuses the current formula NLEAF = CARDF(xml_table)/300.
– NLEVELS uses the current default value 1 for the NODEID index and the
value index for the XMLEXISTS predicate.
Because the index statistics are not available to help the default filter factor
estimation, the predicate filter factor is set according to the predicate comparison
type. FIRSTKEYCARD/CARDF(base table) is set to value 1.
Avoiding problems with correlated columns
Two columns in a table are said to be correlated if the values in the columns do not
vary independently.
About this task
DB2 might not determine the best access path when your queries include
correlated columns.
Correlated columns
Two columns of data, A and B of a single table, are correlated if the values in
column A do not vary independently of the values in column B.
For example, the following table is an excerpt from a large single table.
Columns CITY and STATE are highly correlated, and columns DEPTNO and SEX
are entirely independent.
Table 72. Sample data from the CREWINFO table

CITY          STATE   DEPTNO   SEX   EMPNO   ZIPCODE
Fresno        CA      A345     F     27375   93650
Fresno        CA      J123     M     12345   93710
Fresno        CA      J123     F     93875   93650
Fresno        CA      J123     F     52325   93792
New York      NY      J123     M     19823   09001
New York      NY      A345     M     15522   09530
Miami         FL      B499     M     83825   33116
Miami         FL      A345     F     35785   34099
Los Angeles   CA      X987     M     12131   90077
Los Angeles   CA      A345     M     38251   90091
In this simple example, for every value of column CITY that equals 'FRESNO', the
STATE column contains the same value ('CA').
Impacts of correlated columns:
DB2 might not determine the best access path, table order, or join method when
your query uses columns that are highly correlated.
Column correlation can make the estimated cost of operations lower than it
actually is. Correlated columns affect both single-table queries and join queries.
Column correlation on the best matching columns of an index
The following query selects rows with females in department A345 from Fresno,
California. Two indexes are defined on the table, Index 1 (CITY,STATE,ZIPCODE)
and Index 2 (DEPTNO,SEX).
Query 1
SELECT ... FROM CREWINFO WHERE
  CITY = 'FRESNO' AND STATE = 'CA'          (PREDICATE1)
  AND DEPTNO = 'A345' AND SEX = 'F';        (PREDICATE2)
Consider the two compound predicates (labeled PREDICATE1 and PREDICATE2),
their actual filtering effects (the proportion of rows they select), and their DB2 filter
factors. Unless the proper catalog statistics are gathered, the filter factors are
calculated as if the columns of the predicate are entirely independent (not
correlated).
When the columns in a predicate correlate but the correlation is not reflected in
catalog statistics, the actual filtering effect can be significantly different from the DB2
filter factor. The following table shows how the actual filtering effect and the DB2
filter factor can differ, and how that difference can affect index choice and
performance.
Table 73. Effects of column correlation on matching columns

                                        INDEX 1                       INDEX 2
Matching predicates                     Predicate1:                   Predicate2:
                                        CITY=FRESNO AND STATE=CA      DEPTNO=A345 AND SEX=F
Matching columns                        2                             2
DB2 estimate for matching columns       column=CITY, COLCARDF=4       column=DEPTNO, COLCARDF=4
(filter factor)                         Filter Factor=1/4             Filter Factor=1/4
                                        column=STATE, COLCARDF=3      column=SEX, COLCARDF=2
                                        Filter Factor=1/3             Filter Factor=1/2
Compound filter factor for              1/4 × 1/3 = 0.083             1/4 × 1/2 = 0.125
matching columns
Qualified leaf pages based on           0.083 × 10 = 0.83             0.125 × 10 = 1.25
DB2 estimations                         INDEX CHOSEN (0.83 < 1.25)
Actual filter factor based on           4/10                          2/10
data distribution
Actual number of qualified leaf pages   4/10 × 10 = 4                 2/10 × 10 = 2
based on compound predicate                                           BETTER INDEX CHOICE (2 < 4)
DB2 chooses an index that returns the fewest rows, partly determined by the
smallest filter factor of the matching columns. Assume that filter factor is the only
influence on the access path. The combined filtering of columns CITY and STATE
seems very good, whereas the matching columns for the second index do not seem
to filter as much. Based on those calculations, DB2 chooses Index 1 as an access
path for Query 1.
The problem is that the filtering of columns CITY and STATE should not look
good. Column STATE does almost no filtering. Because columns DEPTNO and SEX
do a better job of filtering out rows, DB2 should favor Index 2 over Index 1.
Column correlation on index screening columns of an index
Correlation might also occur on nonmatching index columns, used for index
screening. See “Nonmatching index scan (ACCESSTYPE='I' and MATCHCOLS=0)”
on page 595 for more information. Index screening predicates help reduce the
number of data rows that qualify while scanning the index. However, if the index
screening predicates are correlated, they do not filter as many data rows as their
filter factors suggest. To illustrate this, use Query 1 with the following indexes on
Table 72 on page 271:
Index 3 (EMPNO,CITY,STATE)
Index 4 (EMPNO,DEPTNO,SEX)
In the case of Index 3, because the columns CITY and STATE of Predicate 1 are
correlated, the index access is not improved as much as estimated by the screening
predicates and therefore Index 4 might be a better choice. (Note that index
screening also occurs for indexes with matching columns greater than zero.)
Multiple table joins
In Query 2, the data shown in the following table is added to the original query
(see Query 1) to show the impact of column correlation on join queries.
Table 74. Data from the DEPTINFO table

CITY          STATE   MANAGER   DEPT   DEPTNAME
Fresno        CA      Smith     J123   ADMIN
Los Angeles   CA      Jones     A345   LEGAL
Query 2
SELECT ... FROM CREWINFO T1,DEPTINFO T2
WHERE T1.CITY = 'FRESNO' AND T1.STATE='CA'          (PREDICATE1)
AND T1.DEPTNO = T2.DEPT AND T2.DEPTNAME = 'LEGAL';
The order that tables are accessed in a join statement affects performance. The
estimated combined filtering of Predicate1 is lower than its actual filtering. So table
CREWINFO might look better as the first table accessed than it should.
Also, due to the smaller estimated size for table CREWINFO, a nested loop join
might be chosen for the join method. But, if many rows are selected from table
CREWINFO because Predicate1 does not filter as many rows as estimated, then
another join method or join sequence might be better.
Detecting correlated columns
The first indication that correlated columns might be a problem is poor response
times when DB2 has chosen an inappropriate access path. If you suspect that you
have problems with correlated columns, you can issue SQL statements to test
whether columns are correlated.
Procedure
To determine whether columns are correlated:
Issue SQL statements and compare the results as shown in the following example.
Example
If you suspect that two columns in a table, such as the CITY and STATE columns
in the CREWINFO table, might be correlated, then you can issue the following SQL
queries that reflect the relationships between the columns:
SELECT COUNT (DISTINCT CITY) AS CITYCOUNT,
COUNT (DISTINCT STATE) AS STATECOUNT FROM CREWINFO;
The result of the count of each distinct column is the value of COLCARDF in the
DB2 catalog table SYSCOLUMNS. Multiply the previous two values together to get
a preliminary result:
CITYCOUNT x STATECOUNT = ANSWER1
Then issue the following SQL statement:
SELECT COUNT(*) FROM
(SELECT DISTINCT CITY, STATE
FROM CREWINFO) AS V1;
(ANSWER2)
Compare the result of the previous count (ANSWER2) with ANSWER1. If
ANSWER2 is less than ANSWER1, then the suspected columns are correlated.
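For instance, with the sample data in Table 72, the first query returns CITYCOUNT
= 4 and STATECOUNT = 3, so ANSWER1 = 4 x 3 = 12. The second query counts
only the distinct (CITY, STATE) pairs that actually occur, which gives ANSWER2 =
4. Because 4 is less than 12, the CITY and STATE columns are correlated.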
What to do about correlated columns
When correlated columns cause DB2 to choose an inappropriate access path, you
can use several techniques to try to alter the access path.
Procedure
To address problems with correlated columns:
v For leading indexed columns, run the RUNSTATS utility with the KEYCARD
option to determine the column correlation. For all other column groups, run
the RUNSTATS utility with the COLGROUP option.
v Run the RUNSTATS utility to collect column correlation information for any
column group with the COLGROUP option.
v Update the catalog statistics manually.
v Use SQL that forces access through a particular index.
Results
The RUNSTATS utility collects the statistics that DB2 needs to make proper choices
about queries. With RUNSTATS, you can collect statistics on the concatenated key
columns of an index and the number of distinct values for those concatenated
columns. This gives DB2 accurate information to calculate the filter factor for the
query.
Example
For example, RUNSTATS collects statistics that benefit queries like this:
SELECT * FROM T1
WHERE C1 = ’a’ AND C2 = ’b’ AND C3 = ’c’ ;
Where:
v The first three index keys are used (MATCHCOLS = 3).
v An index exists on C1, C2, C3, C4, C5.
v Some or all of the columns in the index are correlated in some way.
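A control statement that collects these statistics might look like the following
sketch (the database, table space, and index names are hypothetical; the index is
assumed to be the one on C1, C2, C3, C4, C5):
RUNSTATS TABLESPACE DB1.TS1
  INDEX(IX15 KEYCARD)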
DB2 predicate manipulation
In some specific cases, DB2 either modifies some predicates, or generates extra
predicates. Although these modifications are transparent to you, they have a direct
impact on the access path selection and your PLAN_TABLE results.
This manipulation occurs because DB2 always uses an index access path when it
is cost effective. Generating extra predicates potentially provides more indexable
predicates, which creates more chances for an efficient index access path.
Therefore, to understand your PLAN_TABLE results, you must understand how
DB2 manipulates predicates. The information in Table 65 on page 255 is also
helpful.
PSPI
How DB2 modifies IN-list predicates
DB2 automatically modifies some queries that contain IN-list predicates to enable
more access path options.
DB2 converts an IN-list predicate that has only one item in the IN-list to an
equality predicate. A set of simple, Boolean term, equal predicates on the same
column that are connected by OR predicates can be converted into an IN-list
predicate. For example, DB2 converts the predicate C1=5 OR C1=10 OR C1=15
to C1 IN (5,10,15).
Related concepts
“Predicates generated through transitive closure” on page 277
When DB2 simplifies join operations
DB2 can simplify a join operation when the query contains a predicate or an ON
clause that eliminates the null values that are generated by the join operation.
When DB2 encounters a join operation that it can simplify, it attempts to do so.
However, because full outer joins are less efficient than left or right joins,
and left and right joins are less efficient than inner joins, you should always try to
use the simplest type of join operation in your queries.
Example: ON clause that eliminates null values
Consider the following query:
SELECT * FROM T1 X FULL JOIN T2 Y
ON X.C1=Y.C1
WHERE X.C2 > 12;
The outer join operation yields these result table rows:
v The rows with matching values of C1 in tables T1 and T2 (the inner join result)
v The rows from T1 where C1 has no corresponding value in T2
v The rows from T2 where C1 has no corresponding value in T1
However, when you apply the predicate, you remove all rows in the result table
that came from T2 where C1 has no corresponding value in T1. DB2 transforms the
full join into a left join, which is more efficient:
SELECT * FROM T1 X LEFT JOIN T2 Y
ON X.C1=Y.C1
WHERE X.C2 > 12;
Example: Predicate that eliminates null values
The predicate, X.C2>12, filters out all null values that result from the right join:
SELECT * FROM T1 X RIGHT JOIN T2 Y
ON X.C1=Y.C1
WHERE X.C2>12;
Therefore, DB2 can transform the right join into a more efficient inner join without
changing the result:
SELECT * FROM T1 X INNER JOIN T2 Y
ON X.C1=Y.C1
WHERE X.C2>12;
Example: Predicates that follow join operations
The predicate that follows a join operation must have the following characteristics
before DB2 transforms an outer join into a simpler outer join or inner join:
v The predicate is a Boolean term predicate.
v The predicate is false if one table in the join operation supplies a null value for
all of its columns.
These predicates are examples of predicates that can cause DB2 to simplify join
operations:
T1.C1 > 10
T1.C1 IS NOT NULL
T1.C1 > 10 OR T1.C2 > 15
T1.C1 > T2.C1
T1.C1 IN (1,2,4)
T1.C1 LIKE 'ABC%'
T1.C1 BETWEEN 10 AND 100
12 BETWEEN T1.C1 AND 100
Example: ON clause that eliminates unmatched values
This example shows how DB2 can simplify a join operation because the query
contains an ON clause that eliminates rows with unmatched values:
SELECT * FROM T1 X LEFT JOIN T2 Y
FULL JOIN T3 Z ON Y.C1=Z.C1
ON X.C1=Y.C1;
Because the last ON clause eliminates any rows from the result table for which
column values that come from T1 or T2 are null, DB2 can replace the full join with
a more efficient left join to achieve the same result:
SELECT * FROM T1 X LEFT JOIN T2 Y
LEFT JOIN T3 Z ON Y.C1=Z.C1
ON X.C1=Y.C1;
Example: Full outer join processed as a left outer join
In one case, DB2 transforms a full outer join into a left join when you cannot write
code to do it. This is the case where a view specifies a full outer join, but a
subsequent query on that view requires only a left outer join.
Consider this view:
CREATE VIEW V1 (C1,T1C2,T2C2) AS
SELECT COALESCE(T1.C1, T2.C1), T1.C2, T2.C2
FROM T1 FULL JOIN T2
ON T1.C1=T2.C1;
This view contains rows for which values of C2 that come from T1 are null.
However, if you execute the following query, you eliminate the rows with null
values for C2 that come from T1:
SELECT * FROM V1
WHERE T1C2 > 10;
Therefore, for this query, a left join between T1 and T2 would have been adequate.
DB2 can execute this query as if the view V1 was generated with a left outer join
so that the query runs more efficiently.
Predicates generated through transitive closure
When the set of predicates that belong to a query logically imply other predicates,
DB2 can generate additional predicates to provide more information for access
path selection.
Rules for generating predicates
For single-table or inner join queries, DB2 generates predicates for
transitive closure if any of the following conditions are true:
v The query has an equal type predicate: COL1=COL2. This could be:
– A local predicate
– A join predicate
v The query also has a Boolean term predicate on one of the columns in the first
predicate, with one of the following formats:
– COL1 op value
op is =, <>, >, >=, <, or <=.
value is a constant, host variable, or special register.
– COL1 (NOT) BETWEEN value1 AND value2
– COL1=COL3
v The query is an outer join query and has an ON clause in the form of
COL1=COL2 that comes before a join that has one of the following forms:
– COL1 op value
op is =, <>, >, >=, <, or <=.
– COL1 (NOT) BETWEEN value1 AND value2
DB2 generates a transitive closure predicate for an outer join query only if the
generated predicate does not reference the table with unmatched rows. That is, the
generated predicate cannot reference the left table for a left outer join or the right
table for a right outer join.
For a multiple-CCSID query, DB2 does not generate a transitive closure predicate if
the predicate that would be generated has any of the following characteristics:
v The generated predicate is a range predicate (op is >, >=, <, or <=).
v Evaluation of the query with the generated predicate results in different CCSID
conversion from evaluation of the query without the predicate.
When a predicate meets the transitive closure conditions, DB2 generates a new
predicate, whether or not it already exists in the WHERE clause.
The generated predicates have one of the following formats:
v COL op value
op is =, >, >=, <, or <=.
value is a constant, host variable, or special register.
v COL (NOT) BETWEEN value1 AND value2
v COL1=COL2 (for single-table or inner join queries only)
DB2 does not generate a predicate through transitive closure for any predicate that
is associated with the DECFLOAT data type (column or constant).
Example of transitive closure for an inner join: Suppose that you have written this
query, which meets the conditions for transitive closure:
SELECT * FROM T1, T2
WHERE T1.C1=T2.C1 AND
T1.C1>10;
DB2 generates an additional predicate to produce this query, which is more
efficient:
SELECT * FROM T1, T2
WHERE T1.C1=T2.C1 AND
T1.C1>10 AND
T2.C1>10;
Example of transitive closure for an outer join
Suppose that you have written this outer join query:
SELECT * FROM
(SELECT T1.C1 FROM T1 WHERE T1.C1>10) X
LEFT JOIN
(SELECT T2.C1 FROM T2) Y
ON X.C1 = Y.C1;
The before-join predicate, T1.C1>10, meets the conditions for transitive closure, so
DB2 generates a query that has the same result as this more-efficient query:
SELECT * FROM
(SELECT T1.C1 FROM T1 WHERE T1.C1>10) X
LEFT JOIN
(SELECT T2.C1 FROM T2 WHERE T2.C1>10) Y
ON X.C1 = Y.C1;
Predicate redundancy
A predicate is redundant if evaluation of other predicates in the query already
determines the result that the predicate provides. You can specify redundant
predicates or DB2 can generate them. However, DB2 does not determine that any
of your query predicates are redundant. All predicates that you code are evaluated
at execution time regardless of whether they are redundant. In contrast, if DB2
generates a redundant predicate to help select access paths, that predicate is
ignored at execution.
Adding extra predicates to improve access paths:
By adding extra predicates to a query, you might enable DB2 to take advantage of
more efficient access paths.
About this task
DB2 performs predicate transitive closure only on equal and range
predicates. However, you can help DB2 to choose a better access path by adding
transitive closure predicates for other types of operators, such as IN or LIKE.
Example
For example, consider the following SELECT statement:
SELECT * FROM T1,T2
WHERE T1.C1=T2.C1
AND T1.C1 LIKE 'A%';
If T1.C1=T2.C1 is true, and T1.C1 LIKE 'A%' is true, then T2.C1 LIKE 'A%' must
also be true. Therefore, you can give DB2 extra information for evaluating the
query by adding T2.C1 LIKE 'A%':
SELECT * FROM T1,T2
WHERE T1.C1=T2.C1
AND T1.C1 LIKE 'A%'
AND T2.C1 LIKE 'A%';
Transformation of SQL predicates to XML predicates
DB2 sometimes transforms an SQL query to change the timing at which a
predicate is applied to improve the performance of the query. DB2 might use such
a transformation to push SQL predicates into the XPath expression embedded in
the XMLTABLE function.
For example, a query finds all books that were published after 1991 and lists the
year, title and publisher for each book.
SELECT X.*
FROM T1,
XMLTABLE('/bib/book'
PASSING T1.bib_xml
COLUMNS YEAR INT PATH '@year',
TITLE VARCHAR(30) PATH 'title',
PUBLISHER VARCHAR(30) PATH 'publisher') X
WHERE X.YEAR > 1991;
DB2 can rewrite the query to process the WHERE X.YEAR > 1991 predicate in the
XMLTABLE function. In the rewritten query the original predicate becomes an
XPath predicate that is associated with the row-xpath-expression of the XMLTABLE
function:
SELECT X.*
FROM T1,
XMLTABLE('/bib/book[@year>1991]'
PASSING T1.bib_xml
COLUMNS YEAR INT PATH '@year',
TITLE VARCHAR(30) PATH 'title',
PUBLISHER VARCHAR(30) PATH 'publisher') X
Implications of truncation and trailing blanks
Unlike SQL, in which trailing blanks have no significance, in XPath trailing blanks
are significant. For example, the following query contains an additional predicate,
X.publisher = 'Addison-Wesley':
SELECT *
FROM T1,
XMLTABLE('/bib/book'
PASSING T1.bib_xml
COLUMNS year INT PATH '@year',
title VARCHAR(30) PATH 'title',
publisher VARCHAR(30) PATH 'publisher') X
WHERE X.year > 1991
AND X.publisher = 'Addison-Wesley';
Because of the possible truncation when a publisher value is cast to VARCHAR(30),
and the possibility of trailing blanks in the original XML data, DB2 must add an
internal operator, db2:rtrim, to simulate the SQL semantics in order to push the
predicate into XPath: the predicate X.publisher = 'Addison-Wesley' is transformed
into [db2:rtrim(publisher,30)="Addison-Wesley"].
Predicates that are eligible for transformation to XML predicates in
XMLTABLE
A predicate that satisfies the following criteria is eligible for transformation to be
processed by the XMLTABLE function:
v The predicate must have one of the following forms, where op stands for any of
the following operators: =, <, >, <=, >=, or <>:
– Column op constant, parameter, or host variable, where the column is from
the result table.
– Column op column, where the column on the left hand side is from the result
table and the column on the right hand side is from either the result table or
one of the input tables.
– Column op expression, where the column is from the result table and the
expression is any SQL expression that contains only columns from the input
table.
– A BETWEEN predicate that can be transformed into one of the above forms.
– COLUMN IS (NOT) NULL
– A predicate that is composed of the above forms combined with AND and
OR.
– COLUMN (NOT) IN (expression1, ..., expressionn), where the column is from
the result table and each of the expressions is either a column from the result
table or an SQL expression that contains neither columns from the result
table nor columns from a table that is not an input table.
v The predicate is a Boolean term predicate.
v The predicate can be applied before any join operations.
v The result column of the XMLTABLE function that is involved in the predicate is
not of any of the following data types:
DATE
TIME
TIMESTAMP
DECFLOAT(16)
REAL
DOUBLE
This restriction does not apply to the IS (NOT) NULL predicate.
v The result column of the XMLTABLE function involved in the predicate does not
have a default clause.
v The XMLTABLE function does not have a FOR ORDINALITY column.
Predicates with encrypted data
DB2 provides built-in functions for data encryption and decryption. These
functions can secure sensitive data, but they can also degrade the performance of
some statements if they are not used carefully.
If a predicate contains any operator other than = and <>, encrypted data
must be decrypted before comparisons can be made. Decryption makes the
predicates stage 2.
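For example, the following sketch uses the built-in ENCRYPT_TDES and
DECRYPT_CHAR functions (the table and column names are hypothetical, and an
encryption password is assumed to have been set):
-- Equality comparison: the constant can be encrypted once and
-- compared with the stored values, so no per-row decryption is needed
SELECT EMPNO FROM EMP
  WHERE SSN_ENC = ENCRYPT_TDES('123-45-6789');
-- Range comparison: each row must be decrypted before the
-- comparison, which makes the predicate stage 2
SELECT EMPNO FROM EMP
  WHERE DECRYPT_CHAR(SSN_ENC) > '500-00-0000';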
Using host variables efficiently
When host variables or parameter markers are used in a query, the actual values
are not known when you bind the package or plan that contains the query. DB2
uses a default filter factor to determine the best access path for an SQL statement.
If that access path proves to be inefficient, you can do several things to obtain a
better access path.
About this task
Host variables require default filter factors. When you bind a static SQL
statement that contains host variables, DB2 uses a default filter factor to determine
the best access path for the SQL statement. DB2 often chooses an access path that
performs well for a query with several host variables. However, in a new release
or after maintenance has been applied, DB2 might choose a new access path that
does not perform as well as the old access path. In many cases, the change in
access paths is due to the default filter factors, which might lead DB2 to optimize
the query in a different way.
Procedure
To change the access path for a query that contains host variables, use one of the
following actions:
v Bind the package or plan that contains the query with the options
REOPT(ALWAYS), REOPT(AUTO), or REOPT(ONCE).
v Rewrite the query.
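For example, a bind subcommand for the first option might look like this sketch
(the collection and member names are hypothetical):
BIND PACKAGE(COLLA) MEMBER(PGM1) REOPT(ALWAYS)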
Related tasks
“Reoptimizing SQL statements at run time” on page 315
Writing efficient subqueries
A subquery is a SELECT statement within the WHERE or HAVING clause of an
INSERT, UPDATE, MERGE, or DELETE SQL statement. By understanding how
DB2 processes subqueries, you can estimate the best method to use when writing a
given query when several methods can achieve the same result.
About this task
In many cases two or more different SQL statements can achieve identical
results, particularly those that contain subqueries. The statements have different
access paths, however, and probably perform differently.
Subqueries might also contain their own subqueries. Such nested subqueries can be
either correlated or non-correlated. DB2 uses the same processing techniques with
nested subqueries that it does for non-nested subqueries, and the same
optimization techniques apply.
No absolute rules exist for deciding how or whether to code a subquery. DB2
might transform one type of subquery to another, depending on the optimizer
estimation.
Procedure
To ensure the best performance from SQL statements that contain subqueries:
Follow these general guidelines:
v If efficient indexes are available on the tables in the subquery, then a correlated
subquery is likely to be the most efficient kind of subquery.
v If no efficient indexes are available on the tables in the subquery, then a
non-correlated subquery would be likely to perform better.
v If multiple subqueries are in any parent query, make sure that the subqueries are
ordered in the most efficient manner.
Example
Assume that MAIN_TABLE has 1000 rows:
SELECT * FROM MAIN_TABLE
WHERE TYPE IN (subquery 1) AND
PARTS IN (subquery 2);
Assuming that subquery 1 and subquery 2 are the same type of subquery (either
correlated or non-correlated) and the subqueries are stage 2, DB2 evaluates the
subquery predicates in the order they appear in the WHERE clause. Subquery 1
rejects 10% of the total rows, and subquery 2 rejects 80% of the total rows:
v The predicate in subquery 1 (which is referred to as P1) is evaluated 1000 times,
and the predicate in subquery 2 (which is referred to as P2) is evaluated 900
times, for a total of 1900 predicate checks. However, if the order of the subquery
predicates is reversed, P2 is evaluated 1000 times, but P1 is evaluated only 200
times, for a total of 1200 predicate checks.
v Coding P2 before P1 appears to be more efficient if P1 and P2 take an equal
amount of time to execute. However, if P1 is 100 times faster to evaluate than
P2, then coding subquery 1 first might be advisable. If you notice a performance
degradation, consider reordering the subqueries and monitoring the results.
If you are unsure, run EXPLAIN on the query with both a correlated and a
non-correlated subquery. By examining the EXPLAIN output and understanding
your data distribution and SQL statements, you should be able to determine
which form is more efficient.
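For example, you might compare the two forms with statements like the following
sketch (the QUERYNO value is arbitrary, and the subqueries remain placeholders
as above):
EXPLAIN PLAN SET QUERYNO = 1 FOR
  SELECT * FROM MAIN_TABLE
  WHERE TYPE IN (subquery 1) AND
        PARTS IN (subquery 2);
SELECT * FROM PLAN_TABLE
  WHERE QUERYNO = 1
  ORDER BY QBLOCKNO, PLANNO;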
This general principle can apply to all types of predicates. However, because
subquery predicates can potentially be thousands of times more processor- and
I/O-intensive than all other predicates, the order of subquery predicates is
particularly important.
Regardless of coding order, DB2 performs non-correlated subquery predicates
before correlated subquery predicates, unless the subquery is transformed into a
join.
Correlated and non-correlated subqueries
Different subqueries require different approaches for efficient processing by DB2.
All subqueries can be classified into one of two categories: correlated and
non-correlated.
Correlated subqueries
Correlated subqueries contain a reference to a table or column that is outside of the
scope of the subquery.
In the following query, for example, the correlation name X is a value from a table
that is not listed in the FROM clause of the subquery. The inclusion of X illustrates
that the subquery references the outer query block:
SELECT * FROM DSN8910.EMP X
WHERE JOB = ’DESIGNER’
AND EXISTS (SELECT 1
FROM DSN8910.PROJ
WHERE DEPTNO = X.WORKDEPT
AND MAJPROJ = ’MA2100’);
Non-correlated subqueries
Non-correlated subqueries do not refer to any tables or columns that are outside of
the scope of the subquery.
The following example query refers only to tables that are within the scope of
its own FROM clause.
SELECT * FROM DSN8910.EMP
WHERE JOB = ’DESIGNER’
AND WORKDEPT IN (SELECT DEPTNO
FROM DSN8910.PROJ
WHERE MAJPROJ = ’MA2100’);
When DB2 transforms a subquery into a join
For a SELECT, UPDATE, or DELETE statement, DB2 can sometimes transform a
subquery into a join between the result table of the subquery and the result table
of the outer query.
SELECT statements
For a SELECT statement, DB2 transforms the subquery into a join if the following
conditions are true:
v The transformation does not introduce redundancy.
v The subquery appears in a WHERE clause.
v The subquery does not contain GROUP BY, HAVING, ORDER BY, FETCH FIRST
n ROWS ONLY, or aggregate functions.
v The subquery has only one table in the FROM clause.
v For a correlated subquery, the comparison operator of the predicate containing
the subquery is IN, = ANY, or = SOME.
v For a noncorrelated subquery, the comparison operator of the predicate
containing the subquery is IN, EXISTS, = ANY, or = SOME.
v For a noncorrelated subquery, the subquery select list has only one column,
guaranteed by a unique index to have unique values.
v For a noncorrelated subquery, the left side of the predicate is a single column
with the same data type and length as the subquery's column. (For a correlated
subquery, the left side can be any expression.)
v Query parallelism is NOT enabled.
Example
The following subquery can be transformed into a join because it meets the above
conditions for transforming a SELECT statement:
SELECT * FROM EMP
WHERE DEPTNO IN
(SELECT DEPTNO FROM DEPT
WHERE LOCATION IN (’SAN JOSE’, ’SAN FRANCISCO’)
AND DIVISION = ’MARKETING’);
If a department in the marketing division has branches in both San Jose and San
Francisco, the result of the SQL statement is not the same as if a join were
performed. The join makes each employee in this department appear twice,
because it matches once for the location San Jose and again for the location San
Francisco, although it is the same department. Therefore, to transform a subquery
into a join, the uniqueness of the subquery select list must be guaranteed. For this
example, a unique index on any of the following sets of columns would guarantee
uniqueness (a sample definition follows this list):
v (DEPTNO)
v (DIVISION, DEPTNO)
v (DEPTNO, DIVISION)
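For example (a sketch; the index name is hypothetical), the first of those sets
could be covered by:
CREATE UNIQUE INDEX XDEPT1
  ON DEPT (DEPTNO);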
The resulting query is:
SELECT EMP.* FROM EMP, DEPT
WHERE EMP.DEPTNO = DEPT.DEPTNO AND
DEPT.LOCATION IN ('SAN JOSE', 'SAN FRANCISCO') AND
DEPT.DIVISION = 'MARKETING';
UPDATE, DELETE, and other SELECT statements
For an UPDATE or DELETE statement, or a SELECT statement that does not meet
the previous conditions for transformation, DB2 transforms a correlated subquery
into a join if the following conditions are true:
v The transformation does not introduce redundancy.
v The subquery is correlated to its immediate outer query.
v The FROM clause of the subquery contains only one table, and the outer query
for SELECT, UPDATE, or DELETE references only one table.
v The outer predicate is a quantified predicate with an operator of =ANY, or an
IN predicate, and the following conditions are true:
– The left side of the outer predicate is a single column.
– The right side of the outer predicate is a subquery that references a single
column.
– The two columns have the same data type and length.
v The subquery does not contain the GROUP BY or DISTINCT clauses.
v The subquery does not contain aggregate functions.
v The SELECT clause of the subquery does not contain a user-defined function
with an external action or a user-defined function that modifies data.
v The subquery predicate is a Boolean term predicate.
v The predicates in the subquery that provide correlation are stage 1 predicates.
v The subquery does not contain nested subqueries.
v The subquery does not contain a self-referencing UPDATE or DELETE.
v For a SELECT statement, the query does not contain the FOR UPDATE OF
clause.
v For an UPDATE or DELETE statement, the statement is a searched UPDATE or
DELETE.
v For a SELECT statement, parallelism is not enabled.
For a statement with multiple subqueries, DB2 transforms only the last subquery
in the statement that qualifies for transformation.
Example
The following subquery can be transformed into a join because it meets the above
conditions for UPDATE statements:
UPDATE T1 SET T1.C1 = 1
WHERE T1.C1 =ANY
(SELECT T2.C1 FROM T2
WHERE T2.C2 = T1.C2);
When DB2 correlates and de-correlates subqueries
Correlated and non-correlated subqueries have different processing advantages.
DB2 transforms subqueries to whichever type is most efficient, especially when a
subquery cannot be transformed into a join.
DB2 might transform a correlated subquery to a non-correlated one, or de-correlate
the subquery, to improve processing efficiency. Likewise, DB2 might correlate a
non-correlated subquery. When the correlated and non-correlated forms of a
subquery achieve the same result, which form is most efficient depends on the data.
DB2 chooses to correlate or de-correlate subqueries based on cost. Correlated
subqueries allow more filtering to be done within the subquery. Non-correlated
subqueries allow more filtering to be done on the table whose columns are being
compared to the subquery result.
DB2 might correlate a non-correlated subquery, or de-correlate a correlated
subquery, that cannot be transformed into a join to improve access path selection
and processing efficiency.
Example:
DB2 can transform the following non-correlated subquery into a correlated
subquery:
SELECT * FROM T1
WHERE T1.C1 IN (SELECT T2.C1 FROM T2, T3
WHERE T2.C1 = T3.C1)
DB2 can transform the query to the following correlated form:
SELECT * FROM T1
WHERE EXISTS (SELECT 1 FROM T2, T3
WHERE T2.C1 = T3.C1 AND T2.C1 = T1.C1)
Some queries cannot be transformed from one form to another. Expressions that
can prevent such transformation include:
Set functions and grouping functions
Most set functions and grouping functions make it difficult to transform a
subquery from one form to another.
Example:
In the following query, the non-correlated subquery cannot be correlated to
T1 because it would change the result of the SUM function. Consequently,
only the non-correlated form of the query can be considered.
SELECT * FROM T1
WHERE T1.C2 IN (SELECT SUM(T2.C2) FROM T2, T3
WHERE T2.C1 = T3.C1
GROUP BY T2.C1)
Correlated ranges and <> comparisons
Some range comparisons involving correlated columns make it difficult to
de-correlate the subquery. When a correlated subquery is de-correlated,
DB2 might have to remove duplicates in order to consider the virtual table
in the outer position (by using "early out" processing). This duplicate
removal requires a set of equal-join predicate columns as the key. Without
equal-join predicates, the early-out process breaks down, and the virtual
table can be considered only in correlated form (as the inner table of the
join).
Example:
DB2 cannot de-correlate the following query and use it to access T1
because removing duplicates on the T2.C2 subquery result does not
guarantee that the range predicate correlation does not qualify multiple
rows from T1.
SELECT * FROM T1
WHERE EXISTS (SELECT 1 FROM T2, T3
WHERE T2.C1 = T3.C1 AND T2.C2 > T1.C2 AND T2.C2 < T1.C3)
Subquery tuning
DB2 automatically performs some subquery tuning through subquery-to-join
transformation and through subquery correlation and de-correlation.
However, you should be aware of the differences among subqueries such as those
in the following examples. You might need to code a query in one of the following
ways for performance reasons that stem from restrictions on DB2's ability to
transform a given query, or from limits on how accurately the DB2 optimizer can
estimate the cost of the various transformation choices.
Each of the following three queries retrieves the same rows. All three retrieve data
about all designers in departments that are responsible for projects that are part of
major project MA2100. These three queries show that you can retrieve a desired
result in several ways.
Query A: a join of two tables
SELECT DSN8910.EMP.* FROM DSN8910.EMP, DSN8910.PROJ
  WHERE JOB = 'DESIGNER'
    AND WORKDEPT = DEPTNO
    AND MAJPROJ = 'MA2100';
Query B: a correlated subquery
SELECT * FROM DSN8910.EMP X
  WHERE JOB = 'DESIGNER'
    AND EXISTS (SELECT 1 FROM DSN8910.PROJ
                WHERE DEPTNO = X.WORKDEPT
                  AND MAJPROJ = 'MA2100');
Query C: a noncorrelated subquery
SELECT * FROM DSN8910.EMP
  WHERE JOB = 'DESIGNER'
    AND WORKDEPT IN (SELECT DEPTNO FROM DSN8910.PROJ
                     WHERE MAJPROJ = 'MA2100');
Choosing between a subquery and a join
If you need columns from both tables EMP and PROJ in the output, you must use
the join. Query A might be the one that performs best, and as a general practice
you should code a subquery as a join whenever possible. However, in this
example, PROJ might contain duplicate values of DEPTNO in the subquery, so that
an equivalent join cannot be written. In that case, whether the correlated or
non-correlated form is most efficient depends upon where the application of each
predicate in the subquery provides the most benefit.
When looking at a problematic subquery, check whether the query can be rewritten
into another format, especially as a join, or whether you can create an index to
improve the performance of the subquery. Consider the sequence of evaluation for
the different subquery predicates and for all other predicates in the query. If one
subquery predicate is costly, look for another predicate that could be evaluated
first to reject more rows before the evaluation of the problem subquery predicate.
Using scrollable cursors efficiently
Scrollable cursors are a valuable tool for writing applications such as screen-based
applications, in which the result table is small and you often move back and forth
through the data.
Procedure
To get the best performance from your scrollable cursors:
v Determine when scrollable cursors work best for you.
Scrollable cursors require more DB2 processing than non-scrollable cursors. If
your applications require large result tables or you only need to move
sequentially forward through the data, use non-scrollable cursors.
v Declare scrollable cursors as SENSITIVE only if you need to see the latest data.
If you do not need to see updates that are made by other cursors or application
processes, using a cursor that you declare as INSENSITIVE requires less
processing by DB2.
If you need to see only some of the latest updates, and you do not need to see
the results of insert operations, declare scrollable cursors as SENSITIVE STATIC.
If you need to see all of the latest updates and inserts, declare scrollable cursors
as SENSITIVE DYNAMIC.
288
Performance Monitoring and Tuning Guide
v To ensure maximum concurrency when you use a scrollable cursor for
positioned update and delete operations, specify ISOLATION(CS) and
CURRENTDATA(NO) when you bind packages and plans that contain
updatable scrollable cursors.
v Use the FETCH FIRST n ROWS ONLY clause with scrollable cursors when it is
appropriate. In a distributed environment, when you need to retrieve a limited
number of rows, FETCH FIRST n ROWS ONLY can improve your performance
for distributed queries that use DRDA access by eliminating unneeded network
traffic.
In a local environment, if you need to scroll through a limited subset of rows in
a table, you can use FETCH FIRST n ROWS ONLY to make the result table
smaller.
v In a distributed environment, if you do not need to use your scrollable cursors
to modify data, do your cursor processing in a stored procedure. Using stored
procedures can decrease the amount of network traffic that your application
requires.
v In a work file database, create table spaces that are large enough for processing
your scrollable cursors.
DB2 uses declared temporary tables for processing the following types of
scrollable cursors:
– SENSITIVE STATIC SCROLL
– INSENSITIVE SCROLL
– ASENSITIVE SCROLL, if the effective cursor sensitivity is INSENSITIVE. A
cursor that meets the criteria for a read-only cursor has an effective
sensitivity of INSENSITIVE.
v Remember to commit changes often for the following reasons:
– You frequently need to leave scrollable cursors open longer than
non-scrollable cursors.
– An increased chance of deadlocks with scrollable cursors occurs because
scrollable cursors allow rows to be accessed and updated in any order.
Frequent commits can decrease the chances of deadlocks.
– To prevent cursors from closing after commit operations, declare your
scrollable cursors WITH HOLD.
v While SENSITIVE STATIC scrollable cursors are open against a table, DB2
disallows reuse of space in that table space to prevent the scrollable cursor from
fetching newly inserted rows that were not in the original result set. Although
this behavior is normal, it can result in a seemingly false out-of-space indication.
The problem can be more noticeable in a data sharing environment with
transactions that access LOBs. Consider the following preventive measures:
– Code applications so that they commit frequently.
– Close sensitive scrollable cursors when they are no longer needed.
– Remove the WITH HOLD option from the sensitive scrollable cursor, if
possible.
– Isolate LOB table spaces in a dedicated buffer pool in the data sharing
environment.
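For example, the following minimal sketch (the cursor name and host variables are
illustrative assumptions; the DSN8910.EMP sample table is used) declares an
INSENSITIVE scrollable cursor that is held across commits, limits the result table,
and fetches in both directions:
DECLARE C1 INSENSITIVE SCROLL CURSOR WITH HOLD FOR
  SELECT LASTNAME, FIRSTNAME, EMPNO
  FROM DSN8910.EMP
  ORDER BY LASTNAME
  FETCH FIRST 100 ROWS ONLY;

OPEN C1;

FETCH ABSOLUTE +50 FROM C1 INTO :lastname, :firstname, :empno;  -- position directly on row 50
FETCH PRIOR FROM C1 INTO :lastname, :firstname, :empno;         -- scroll backward one row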
Efficient queries for tables with data-partitioned secondary indexes
The number of partitions that DB2 accesses to evaluate a query predicate can affect
the performance of the query. A query that provides data retrieval through a
data-partitioned secondary index (DPSI) might access some or all partitions of the
DPSI.
For a query that is based only on a DPSI key value or range, DB2 must
examine all partitions. If the query also has predicates on the leading columns of
the partitioning key, DB2 does not need to examine all partitions. The removal
from consideration of inapplicable partitions is known as limited partition scan.
A limited partition scan can be determined at bind time or at run time. For
example, a limited partition scan can be determined at bind time for a predicate in
which a column is compared to a constant. A limited partition scan occurs at run
time if the column is compared to a host variable, parameter marker, or special
register.
Example: limited partition scan
The following example demonstrates how you can use a partitioning index to
enable a limited partition scan on a set of partitions that DB2 needs to examine to
satisfy a query predicate.
Suppose that you create table Q1, with partitioning index DATE_IX and DPSI
STATE_IX:
CREATE TABLESPACE TS1 NUMPARTS 3;
CREATE TABLE Q1 (DATE      DATE,
                 CUSTNO    CHAR(5),
                 STATE     CHAR(2),
                 PURCH_AMT DECIMAL(9,2))
  IN TS1
  PARTITION BY (DATE)
   (PARTITION 1 ENDING AT ('2002-1-31'),
    PARTITION 2 ENDING AT ('2002-2-28'),
    PARTITION 3 ENDING AT ('2002-3-31'));
CREATE INDEX DATE_IX ON Q1 (DATE) PARTITIONED CLUSTER;
CREATE INDEX STATE_IX ON Q1 (STATE) PARTITIONED;
Now suppose that you want to execute the following query against table Q1:
SELECT CUSTNO, PURCH_AMT
FROM Q1
WHERE STATE = 'CA';
Because the predicate is based only on values of a DPSI key (STATE), DB2 must
examine all partitions to find the matching rows.
Now suppose that you modify the query in the following way:
SELECT CUSTNO, PURCH_AMT
FROM Q1
WHERE DATE BETWEEN '2002-01-01' AND '2002-01-31' AND
      STATE = 'CA';
Because the predicate is now based on values of a partitioning index key (DATE)
and on values of a DPSI key (STATE), DB2 can eliminate the scanning of data
partitions 2 and 3, which do not satisfy the query for the partitioning key. This can
be determined at bind time because the columns of the predicate are compared to
constants.
Now suppose that you use host variables instead of constants in the same query:
SELECT CUSTNO, PURCH_AMT
FROM Q1
WHERE DATE BETWEEN :hv1 AND :hv2 AND
STATE = :hv3;
DB2 can use the predicate on the partitioning column to eliminate the scanning of
unneeded partitions at run time.
Example: limited partition scan when correlation exists
Writing queries to take advantage of limited partition scan is especially useful
when a correlation exists between columns that are in a partitioning index and
columns that are in a DPSI.
For example, suppose that you create table Q2, with partitioning index DATE_IX
and DPSI ORDERNO_IX:
CREATE TABLESPACE TS2 NUMPARTS 3;
CREATE TABLE Q2 (DATE      DATE,
                 ORDERNO   CHAR(8),
                 STATE     CHAR(2),
                 PURCH_AMT DECIMAL(9,2))
  IN TS2
  PARTITION BY (DATE)
   (PARTITION 1 ENDING AT ('2004-12-31'),
    PARTITION 2 ENDING AT ('2005-12-31'),
    PARTITION 3 ENDING AT ('2006-12-31'));
CREATE INDEX DATE_IX ON Q2 (DATE) PARTITIONED CLUSTER;
CREATE INDEX ORDERNO_IX ON Q2 (ORDERNO) PARTITIONED;
Also suppose that the first 4 bytes of each ORDERNO column value represent the
four-digit year in which the order is placed. This means that the DATE column and
the ORDERNO column are correlated.
To take advantage of limited partition scan, when you write a query that has the
ORDERNO column in the predicate, also include the DATE column in the
predicate. The partitioning index on DATE lets DB2 eliminate the scanning of
partitions that are not needed to satisfy the query. For example:
SELECT ORDERNO, PURCH_AMT
FROM Q2
WHERE ORDERNO BETWEEN '2005AAAA' AND '2005ZZZZ' AND
      DATE BETWEEN '2005-01-01' AND '2005-12-31';
Making predicates eligible for index on expression
You can create an index on an expression to improve the performance of queries
that use column-expression predicates.
About this task
Unlike a simple index, an index on expression uses key values that are
transformed by an expression that is specified when the index is created. However,
DB2 cannot always use an index on expression. For example, DB2 might not be
able to use an index on expression for queries that contain multiple outer joins,
materialized views, and materialized table expressions.
Procedure
To enable DB2 to use an index on expression, use the following approaches:
v Create an index on expression for queries that contain predicates that use
column-expressions.
v Rewrite queries that contain multiple outer joins so that a predicate that can be
satisfied by an index on expression is in a different query block than the outer
joins. For example, DB2 cannot use an index on expression for the UPPER
predicate in the following query:
SELECT ...
FROM T1
LEFT OUTER JOIN T2
ON T1.C1 = T2.C1
LEFT OUTER JOIN T3
ON T1.C1 = T3.C1
WHERE UPPER(T1.C2, 'EN_US') = 'ABCDE'
However, you can rewrite the query so that DB2 can use an index on expression
for the UPPER predicate by placing the UPPER expression in a separate query
block from the outer joins, as shown in the following query:
SELECT ...
FROM (
SELECT C1
FROM T1
WHERE UPPER(T1.C2, 'EN_US') = 'ABCDE'
) T1
LEFT OUTER JOIN T2
ON T1.C1 = T2.C1
LEFT OUTER JOIN T3
ON T1.C1 = T3.C1
v Rewrite queries that contain materialized views and table expressions so that
any predicate that might benefit from an index on expression is coded inside the
view or table expression. For example, in the following query as written, the
table expression X is materialized because of the DISTINCT keyword, and DB2
cannot use an index on expression for the UPPER predicate:
SELECT ...
FROM (
SELECT DISTINCT C1, C2 FROM T1
) X
, T2
WHERE X.C1 = T2.C1
AND UPPER(X.C2, 'En_US') = 'ABCDE'
However, you can enable DB2 to use an index on expression for the UPPER
predicate by rewriting the query so that the UPPER predicate is coded inside the
table expression, as shown in the following example:
SELECT ...
FROM (
SELECT DISTINCT C1, C2 FROM T1
WHERE UPPER(T1.C2, 'En_US') = 'ABCDE'
) X
, T2
WHERE X.C1 = T2.C1
Improving the performance of queries for special situations
You can use special techniques to improve the access paths of queries for
particular types of data and specific types of applications.
Using the CARDINALITY clause to improve the performance of
queries with user-defined table function references
The cardinality of a user-defined table function is the number of rows that are
returned when the function is invoked. DB2 uses this number to estimate the cost
of executing a query that invokes a user-defined table function.
The cost of executing a query is one of the factors that DB2 uses when it
calculates the access path. Therefore, if you give DB2 an accurate estimate of a
user-defined table function's cardinality, DB2 can better calculate the best access
path.
You can specify a cardinality value for a user-defined table function by using the
CARDINALITY clause of the SQL CREATE FUNCTION or ALTER FUNCTION
statement. However, this value applies to all invocations of the function, whereas a
user-defined table function might return different numbers of rows, depending on
the query in which it is referenced.
To give DB2 a better estimate of the cardinality of a user-defined table function for
a particular query, you can use the CARDINALITY or CARDINALITY
MULTIPLIER clause in that query. DB2 uses those clauses at bind time when it
calculates the access cost of the user-defined table function. Using this clause is
recommended only for programs that run on DB2 for z/OS because the clause is
not supported on earlier versions of DB2.
Example of using the CARDINALITY clause to specify the
cardinality of a user-defined table function invocation
Suppose that when you created user-defined table function TUDF1, you set a
cardinality value of 5, but in the following query, you expect TUDF1 to return 30
rows:
SELECT *
FROM TABLE(TUDF1(3)) AS X;
Add the CARDINALITY 30 clause to tell DB2 that, for this query, TUDF1 should
return 30 rows:
SELECT *
FROM TABLE(TUDF1(3) CARDINALITY 30) AS X;
Example of using the CARDINALITY MULTIPLIER clause to
specify the cardinality of a user-defined table function invocation
Suppose that when you created user-defined table function TUDF2, you set a
cardinality value of 5, but in the following query, you expect TUDF2 to return 30
times that many rows:
SELECT *
FROM TABLE(TUDF2(10)) AS X;
Add the CARDINALITY MULTIPLIER 30 clause to tell DB2 that, for this query,
TUDF2 should return 5*30, or 150, rows:
SELECT *
FROM TABLE(TUDF2(10) CARDINALITY MULTIPLIER 30) AS X;
Reducing the number of matching columns
You can discourage the use of a poorly performing index by making a predicate on
one of its leading columns nonindexable, which reduces the number of matching
columns for that index.
About this task
Consider the example in Figure 34 on page 295, where the index that DB2
picks is less than optimal.
CREATE TABLE PART_HISTORY (
  PART_TYPE   CHAR(2),    -- IDENTIFIES THE PART TYPE
  PART_SUFFIX CHAR(10),   -- IDENTIFIES THE PART
  W_NOW       INTEGER,    -- TELLS WHERE THE PART IS
  W_FROM      INTEGER,    -- TELLS WHERE THE PART CAME FROM
  DEVIATIONS  INTEGER,    -- TELLS IF ANYTHING SPECIAL WITH THIS PART
  COMMENTS    CHAR(254),
  DESCRIPTION CHAR(254),
  DATE1       DATE,
  DATE2       DATE,
  DATE3       DATE);

CREATE UNIQUE INDEX IX1 ON PART_HISTORY
  (PART_TYPE, PART_SUFFIX, W_FROM, W_NOW);
CREATE UNIQUE INDEX IX2 ON PART_HISTORY
  (W_FROM, W_NOW, DATE1);
+------------------------------------------------------------------+
| Table statistics            | Index statistics     IX1      IX2  |
|-----------------------------+------------------------------------|
| CARDF          100,000      | FIRSTKEYCARDF       1000       50  |
| NPAGES          10,000      | FULLKEYCARDF     100,000  100,000  |
|                             | CLUSTERRATIO         99%      99%  |
|                             | NLEAF               3000     2000  |
|                             | NLEVELS                3        3  |
|------------------------------------------------------------------|
| column        cardinality    HIGH2KEY    LOW2KEY                 |
| Part_type         1000         'ZZ'        'AA'                  |
| w_now               50         1000           1                  |
| w_from              50         1000           1                  |
+------------------------------------------------------------------+
Q1:
SELECT * FROM PART_HISTORY     -- SELECT ALL PARTS
  WHERE PART_TYPE = 'BB'  P1   -- THAT ARE 'BB' TYPES
    AND W_FROM = 3        P2   -- THAT WERE MADE IN CENTER 3
    AND W_NOW = 3         P3   -- AND ARE STILL IN CENTER 3
+------------------------------------------------------------------+
| Filter factor of these predicates:                               |
|   P1 = 1/1000 = .001                                             |
|   P2 = 1/50   = .02                                              |
|   P3 = 1/50   = .02                                              |
|------------------------------------------------------------------|
|        ESTIMATED VALUES         |       WHAT REALLY HAPPENS      |
| index  matchcols  filter  data  | index  matchcols  filter  data |
|                   factor  rows  |                   factor  rows |
| ix2        2     .02*.02    40  | ix2        2     .02*.50  1000 |
| ix1        1     .001      100  | ix1        1     .001      100 |
+------------------------------------------------------------------+
Figure 34. Reducing the number of MATCHCOLS
DB2 picks IX2 to access the data, but IX1 would be roughly 10 times quicker. The
problem is that 50% of all parts from center number 3 are still in Center 3; they
have not moved. Assume that no statistics are available on the correlated columns
in catalog table SYSCOLDIST. Therefore, DB2 assumes that the parts from center
number 3 are evenly distributed among the 50 centers.
You can get the desired access path by changing the query. To discourage the use
of IX2 for this particular query, you can change the third predicate to be
nonindexable.
SELECT * FROM PART_HISTORY
  WHERE PART_TYPE = 'BB'
    AND W_FROM = 3
    AND (W_NOW = 3 + 0)   <-- PREDICATE IS MADE NONINDEXABLE
Now index IX2 is not picked, because it has only one matching column. The
preferred index, IX1, is picked. The third predicate is a nonindexable predicate, so
an index is not used for the compound predicate.
You can make a predicate non-indexable in many ways. The recommended way is
to add 0 to a predicate that evaluates to a numeric value or to concatenate an
empty string to a predicate that evaluates to a character value.
Indexable                Nonindexable
T1.C3=T2.C4              (T1.C3=T2.C4 CONCAT '')
T1.C1=5                  T1.C1=5+0
These techniques do not affect the result of the query and cause only a small
amount of overhead.
The preferred technique for improving the access path when a table has correlated
columns is to generate catalog statistics on the correlated columns. You can do that
either by running RUNSTATS or by updating catalog table SYSCOLDIST manually.
Indexes for efficient star schema processing
You can create indexes to enable DB2 to use special join methods for star schemas.
A star schema is a database design that, in its simplest form, consists of a
large table called a fact table, and two or more smaller tables, called dimension tables.
More complex star schemas can be created by breaking one or more of the
dimension tables into multiple tables.
To access the data in a star schema design, you often write SELECT statements that
include join operations between the fact table and the dimension tables, but no join
operations between dimension tables. These types of queries are known as star-join
queries.
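For example, the following hypothetical star-join query (the fact table
SALES_FACT and the dimension tables TIME_DIM, PROD_DIM, and STORE_DIM
are illustrative assumptions) joins the fact table to each dimension table, but joins
no dimension table to another:
SELECT T.YEAR, P.PRODUCT_NAME, SUM(F.PURCH_AMT)
FROM SALES_FACT F, TIME_DIM T, PROD_DIM P, STORE_DIM S
WHERE F.TIME_ID  = T.TIME_ID  AND
      F.PROD_ID  = P.PROD_ID  AND
      F.STORE_ID = S.STORE_ID AND
      T.YEAR   = 2006         AND
      S.REGION = 'WEST'
GROUP BY T.YEAR, P.PRODUCT_NAME;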
For a star-join query, DB2 might use special join types, star join and pair-wise join, if
the following conditions are true:
v The tables meet the conditions of a star join. (JOIN_TYPE='S')
v The tables meet the conditions of a pair-wise join. (JOIN_TYPE='P')
v The STARJOIN system parameter is set to ENABLE, and the number of tables in
the query block is greater than or equal to the minimum number that is
specified in the SJTABLES system parameter.
Whether DB2 uses star join, pair-wise join, or traditional join methods for
processing a star schema query is based on which method results in the lowest
cost access path. The existence of a star schema does not guarantee that either star
join or pair-wise join access will be chosen.
Enabling efficient access for queries on star schemas
Pair-wise join processing simplifies index design by using single-column indexes to
join a fact table and the associated dimension tables according to AND predicates.
Procedure
To design indexes to enable pair-wise join:
1. Create an index for each key column in the fact table that corresponds to a
dimension table.
2. Partition by, and cluster data according to, commonly used dimension keys.
Doing so can reduce the I/O that is required on the fact table for pair-wise join.
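For example, a minimal sketch, assuming a hypothetical fact table SALES_FACT
whose DATE_ID, PROD_ID, and STORE_ID columns correspond to dimension
tables:
CREATE INDEX SALES_DATE_IX  ON SALES_FACT (DATE_ID);
CREATE INDEX SALES_PROD_IX  ON SALES_FACT (PROD_ID);
CREATE INDEX SALES_STORE_IX ON SALES_FACT (STORE_ID);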
What to do next
If you require further performance improvement for some star schema queries,
consider the following index design recommendations to encourage DB2 to use
star join access:
v Define a multi-column index on all key columns of the fact table. Key columns
are fact table columns that have corresponding dimension tables.
v If you do not have information about the way that your data is used, first try a
multi-column index on the fact table that is based on the correlation of the data.
Put less highly correlated columns later in the index key than more highly
correlated columns.
v As the correlation of columns in the fact table changes, reevaluate the index to
determine if columns in the index should be reordered.
v Define indexes on dimension tables to improve access to those tables.
v When you have executed a number of queries and have more information about
the way that the data is used, follow these recommendations:
– Put more selective columns at the beginning of the multi-column index.
– If a number of queries do not reference a dimension, put the column that
corresponds to that dimension at the end of the index, or remove it
completely.
Rearranging the order of tables in a FROM clause
The order of tables or views in the FROM clause can affect the access path that
DB2 chooses for an SQL query.
About this task
If your query performs poorly, it could be because the join sequence is
inefficient. You can determine the join sequence within a query block from the
PLANNO column in the PLAN_TABLE. If you think that the join sequence is
inefficient, try rearranging the order of the tables and views in the FROM clause to
match a join sequence that might perform better.
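For example, a hypothetical sketch (the table names are illustrative, and DB2
remains free to choose its own join sequence):
-- Original statement
SELECT *
FROM T1, T2, T3
WHERE T1.C1 = T2.C1 AND
      T2.C2 = T3.C2;

-- Same query with the FROM clause rearranged, which might lead DB2
-- to evaluate a different join sequence
SELECT *
FROM T3, T2, T1
WHERE T1.C1 = T2.C1 AND
      T2.C2 = T3.C2;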
Improving outer join processing
You can use a subsystem parameter to improve how DB2 processes an outer join.
About this task
Procedure
To improve outer join processing:
Set the OJPERFEH subsystem parameter to YES. DB2 takes the following actions,
which can improve outer join processing in most cases:
v Does not merge table expressions or views if the parent query block of a table
expression or view contains an outer join, and the merge would cause a column
in a predicate to become an expression.
v Does not attempt to reduce work file usage for outer joins.
v Uses transitive closure for the ON predicates in outer joins.

What to do next

These actions might not improve performance for some outer joins, and you
should verify that performance improves. If the performance of queries that
contain outer joins remains inadequate, set OJPERFEH to NO, restart DB2, and
rerun those queries.
Using a subsystem parameter to optimize queries with IN-list
predicates
You can use the INLISTP parameter to control IN-list predicate optimization.
About this task
If you set the INLISTP parameter to a number n that is between 1 and
5000, DB2 optimizes for an IN-list predicate with up to n values. If you set the
INLISTP parameter to zero, the optimization is disabled. The default value for the
INLISTP parameter is 50.
When you enable the INLISTP parameter, you enable two primary means of
optimizing some queries that contain IN-list predicates:
v The IN-list predicate is pushed down from the parent query block into the
materialized table expression.
v A correlated IN-list predicate in a subquery that is generated by transitive
closure is moved up to the parent query block.
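For example, a hypothetical sketch of the first of these optimizations (the table
and column names are illustrative):
SELECT *
FROM (SELECT DISTINCT C1, C2 FROM T1) X   -- materialized because of DISTINCT
WHERE X.C1 IN (1, 2, 3);

-- With IN-list predicate push-down, DB2 can internally rewrite the query
-- so that the IN-list predicate is applied inside the table expression:
SELECT *
FROM (SELECT DISTINCT C1, C2 FROM T1
      WHERE C1 IN (1, 2, 3)) X;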
Providing more information to DB2 for access path selection
In certain cases you can improve access path selection for SQL statements by
providing more information to DB2.
Fetching a limited number of rows: FETCH FIRST n ROWS
ONLY
In some applications, you execute queries that can return a large number of rows,
but you need only a small subset of those rows. Retrieving the entire result table
from the query can be inefficient.
About this task
You can specify the FETCH FIRST n ROWS ONLY clause in a SELECT
statement to limit the number of rows in the result table of a query to n rows. In
addition, for a distributed query that uses DRDA access and specifies FETCH
FIRST n ROWS ONLY, DB2 prefetches only n rows.
Example: Suppose that you write an application that requires information on only
the 20 employees with the highest salaries. To return only the rows of the
employee table for those 20 employees, you can write a query like this:
SELECT LASTNAME, FIRSTNAME, EMPNO, SALARY
FROM EMP
ORDER BY SALARY DESC
FETCH FIRST 20 ROWS ONLY;
You can also use FETCH FIRST n ROWS ONLY within a subquery.
Example:
SELECT * FROM EMP
WHERE EMPNO IN (
SELECT RESPEMP FROM PROJECT
ORDER BY PROJNO FETCH FIRST 3 ROWS ONLY)
Interaction between OPTIMIZE FOR n ROWS and FETCH FIRST n ROWS ONLY:
In general, if you specify FETCH FIRST n ROWS ONLY but not OPTIMIZE FOR n
ROWS in a SELECT statement, DB2 optimizes the query as if you had specified
OPTIMIZE FOR n ROWS.
When both the FETCH FIRST n ROWS ONLY clause and the OPTIMIZE FOR n
ROWS clause are specified, the value for the OPTIMIZE FOR n ROWS clause is
used for access path selection.
Example: Suppose that you submit the following SELECT statement:
SELECT * FROM EMP
FETCH FIRST 5 ROWS ONLY
OPTIMIZE FOR 20 ROWS;
The OPTIMIZE FOR value of 20 rows is used for access path selection.
Minimizing overhead for retrieving few rows: OPTIMIZE FOR n
ROWS
When an application executes a SELECT statement, DB2 assumes that the
application retrieves all the qualifying rows.
This assumption is most appropriate for batch environments. However, for
interactive SQL applications, such as SPUFI, it is common for a query to define a
very large potential result set but retrieve only the first few rows. The access path
that DB2 chooses might not be optimal for those interactive applications.
This topic discusses the use of OPTIMIZE FOR n ROWS to affect the performance
of interactive SQL applications. Unless otherwise noted, this information pertains
to local applications.
What OPTIMIZE FOR n ROWS does: The OPTIMIZE FOR n ROWS clause lets an
application declare its intent to do either of these things:
v Retrieve only a subset of the result set
v Give priority to the retrieval of the first few rows
DB2 uses the OPTIMIZE FOR n ROWS clause to choose access paths that minimize
the response time for retrieving the first few rows. For distributed queries, the
value of n determines the number of rows that DB2 sends to the client on each
DRDA network transmission.
Use OPTIMIZE FOR 1 ROW to avoid sorts: You can influence the access path
most by using OPTIMIZE FOR 1 ROW. OPTIMIZE FOR 1 ROW tells DB2 to select
an access path that returns the first qualifying row quickly. This means that
whenever possible, DB2 avoids any access path that involves a sort. If you specify
a value for n that is anything but 1, DB2 chooses an access path based on cost, and
you do not necessarily avoid sorts.
How to specify OPTIMIZE FOR n ROWS for a CLI application: For a Call Level
Interface (CLI) application, you can specify that DB2 uses OPTIMIZE FOR n ROWS
for all queries. To do that, specify the keyword OPTIMIZEFORNROWS in the
initialization file.
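For example, a minimal sketch of such an initialization file entry (the data source
name SAMPLEDB and the exact file layout are illustrative assumptions):
[SAMPLEDB]
OPTIMIZEFORNROWS=20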
How many rows you can retrieve with OPTIMIZE FOR n ROWS: The OPTIMIZE
FOR n ROWS clause does not prevent you from retrieving all the qualifying rows.
However, if you use OPTIMIZE FOR n ROWS, the total elapsed time to retrieve all
the qualifying rows might be significantly greater than if DB2 had optimized for
the entire result set.
When OPTIMIZE FOR n ROWS is effective: OPTIMIZE FOR n ROWS is effective
only on queries that can be performed incrementally. If the query causes DB2 to
gather the whole result set before returning the first row, DB2 ignores the
OPTIMIZE FOR n ROWS clause, as in the following situations:
v The query uses SELECT DISTINCT or a set function distinct, such as
COUNT(DISTINCT C1).
v Either GROUP BY or ORDER BY is used, and no index can give the necessary
ordering.
v An aggregate function is used without a GROUP BY clause.
v The query uses UNION.
Example: Suppose that you query the employee table regularly to determine the
employees with the highest salaries. You might use a query like this:
SELECT LASTNAME, FIRSTNAME, EMPNO, SALARY
FROM EMP
ORDER BY SALARY DESC;
An index is defined on column EMPNO, so employee records are ordered by
EMPNO. If you have also defined a descending index on column SALARY, that
index is likely to be very poorly clustered. To avoid many random, synchronous
I/O operations, DB2 would most likely use a table space scan, then sort the rows
on SALARY. This technique can cause a delay before the first qualifying rows can
be returned to the application.
If you add the OPTIMIZE FOR n ROWS clause to the statement, DB2 probably
uses the SALARY index directly because you have indicated that you expect to
retrieve the salaries of only the 20 most highly paid employees.
Example: The following statement uses that strategy to avoid a costly sort
operation:
SELECT LASTNAME,FIRSTNAME,EMPNO,SALARY
FROM EMP
ORDER BY SALARY DESC
OPTIMIZE FOR 20 ROWS;
Effects of using OPTIMIZE FOR n ROWS:
v The join method could change. Nested loop join is the most likely choice,
because it has low overhead cost and appears to be more efficient if you want to
retrieve only one row.
v An index that matches the ORDER BY clause is more likely to be picked. This is
because no sort would be needed for the ORDER BY.
v List prefetch is less likely to be picked.
v Sequential prefetch is less likely to be requested by DB2 because it infers that
you only want to see a small number of rows.
v In a join query, the table with the columns in the ORDER BY clause is likely to
be picked as the outer table if an index created on that outer table gives the
ordering needed for the ORDER BY clause.
Recommendation: For a local query, specify OPTIMIZE FOR n ROWS only in
applications that frequently fetch only a small percentage of the total rows in a
query result set. For example, an application might read only enough rows to fill
the end user's terminal screen. In cases like this, the application might read the
remaining part of the query result set only rarely. For an application like this,
OPTIMIZE FOR n ROWS can result in better performance by causing DB2 to favor
SQL access paths that deliver the first n rows as fast as possible.
When you specify OPTIMIZE FOR n ROWS for a remote query, a small value of n
can help limit the number of rows that flow across the network on any given
transmission.
You can improve the performance for receiving a large result set through a remote
query by specifying a large value of n in OPTIMIZE FOR n ROWS. When you
specify a large value, DB2 attempts to send the n rows in multiple transmissions.
For better performance when retrieving a large result set, in addition to specifying
OPTIMIZE FOR n ROWS with a large value of n in your query, do not execute
other SQL statements until the entire result set for the query is processed. If
retrieval of data for several queries overlaps, DB2 might need to buffer result set
data in the DDF address space.
For local or remote queries, to influence the access path most, specify OPTIMIZE
FOR 1 ROW. This value does not have a detrimental effect on distributed queries.
Favoring index access
One common database design involves tables that contain groups of rows that
logically belong together. Within each group, the rows should be accessed in the
same sequence every time.
The sequence is determined by the primary key on the table. Lock
contention can occur when DB2 chooses different access paths for different
applications that operate on a table with this design.
To minimize contention among applications that access tables with this design,
specify the VOLATILE keyword when you create or alter the tables. A table that is
defined with the VOLATILE keyword is known as a volatile table. When DB2
executes queries that include volatile tables, DB2 uses index access whenever
possible. One exception is for DELETE statements on a VOLATILE table in a
segmented table space when no WHERE clause is specified. In this case, a table
space scan is used. As well as minimizing contention, using index access preserves
the access sequence that the primary key provides.
Defining a table as volatile has a similar effect on a query to setting the
NPGTHRSH subsystem parameter to favor matching index access for all qualified
tables. However, the effect of NPGTHRSH is subsystem-wide, and index access
might not be appropriate for many queries. Defining tables as volatile lets you
limit the set of queries that favor index access to queries that involve the volatile
tables.
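For example, a minimal sketch (the table and its columns are illustrative
assumptions):
CREATE TABLE ORDER_ITEMS
  (ORDERNO CHAR(8)  NOT NULL,
   LINENO  SMALLINT NOT NULL,
   ITEM    CHAR(10))
  VOLATILE;

-- Or mark an existing table (name assumed) as volatile:
ALTER TABLE EXISTING_ORDERS VOLATILE;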
Using a subsystem parameter to favor matching index access
DB2 often scans a table space or nonmatching index when the data access statistics
indicate that a table is small, even though matching index access is possible. This is
a problem if the table is small or empty when statistics are collected, but the table
is large when it is queried.
About this task
In that case, the statistics are not accurate and can lead DB2 to pick an
inefficient access path.
The best solution to the problem is to run RUNSTATS again after the table is
populated. However, if you cannot do that, you can use subsystem parameter
NPGTHRSH to cause DB2 to favor matching index access over a table space scan
and over nonmatching index access.
The value of NPGTHRSH is an integer that indicates the tables for which DB2
favors matching index access. Values of NPGTHRSH and their meanings are:
0     DB2 selects the access path based on cost, and no tables qualify for
      special handling. This is the default.
n>=1  If data access statistics have been collected for all tables, DB2 favors
      matching index access for tables for which the total number of pages on
      which rows of the table appear (NPAGES) is less than n.

Tables with default statistics for NPAGES (NPAGES = -1) are presumed to have 501
pages. For such tables, DB2 favors matching index access only when NPGTHRSH
is set above 501.
Recommendation: Before you use NPGTHRSH, be aware that in some cases,
matching index access can be more costly than a table space scan or nonmatching
index access. Specify a small value for NPGTHRSH (10 or less), which limits the
number of tables for which DB2 favors matching index access. If you need to use
matching index access only for specific tables, create or alter those tables with the
VOLATILE parameter, rather than using the system-wide NPGTHRSH parameter.
Updating catalog statistics to influence access path selection
If you have the proper authority, you can influence access path selection by using
an SQL UPDATE or INSERT statement to change statistical values in the DB2
catalog. However, doing so is not generally recommended except as a last resort.
About this task
Although updating catalog statistics can help a certain query, other queries
can be affected adversely. Also, the UPDATEs to the catalog must be repeated
whenever RUNSTATS resets the catalog values.
Important: The access path selection techniques that are described here might
cause significant performance degradation if they are not carefully implemented
and monitored. Also, the selection method might change in a later release of DB2,
causing your changes to degrade performance.
Consequently, the following recommendations apply if you make any such
changes.
v Save the original catalog statistics or SQL statements before you consider making
any changes to control the choice of access path.
v Before and after you make any changes, take performance measurements.
v Be prepared to back out any changes that have degraded performance.
v When you migrate to a new release, evaluate the performance again.
If you update catalog statistics for a table space or index manually, and you are
using dynamic statement caching, you must invalidate statements in the cache that
involve those table spaces or indexes. To invalidate statements in the dynamic
statement cache without updating catalog statistics or generating reports, you can
run the RUNSTATS utility with the REPORT NO and UPDATE NONE options on
the table space or the index that the query is dependent on.
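For example, a minimal sketch of such a RUNSTATS invocation (the database and
table space names are illustrative assumptions):
RUNSTATS TABLESPACE DSN8D91A.DSN8S91E REPORT NO UPDATE NONE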
The example shown in Figure 34 on page 295 involves this query:
SELECT * FROM PART_HISTORY     -- SELECT ALL PARTS
  WHERE PART_TYPE = 'BB'  P1   -- THAT ARE 'BB' TYPES
    AND W_FROM = 3        P2   -- THAT WERE MADE IN CENTER 3
    AND W_NOW = 3         P3   -- AND ARE STILL IN CENTER 3
This query has a problem with data correlation. DB2 does not know that 50% of
the parts that were made in Center 3 are still in Center 3. The problem was
circumvented by making a predicate nonindexable. But suppose that hundreds of
users are writing queries similar to that query. Having all users change their
queries would be impossible. In this type of situation, the best solution is to
change the catalog statistics.
For the above query, you can update the catalog statistics in one of two ways:
v Run the RUNSTATS utility, and request statistics on the correlated columns
W_FROM and W_NOW. This is the preferred method.
v Update the catalog statistics manually.
Updating the catalog to adjust for correlated columns: One catalog table that you
can update is SYSIBM.SYSCOLDIST, which gives information about a column or
set of columns in a table. Assume that because columns W_NOW and W_FROM
are correlated, only 100 distinct values exist for the combination of the two
columns, rather than 2500 (50 for W_FROM * 50 for W_NOW). Insert a row like
this to indicate the new cardinality:
INSERT INTO SYSIBM.SYSCOLDIST
  (FREQUENCY, FREQUENCYF, IBMREQD,
   TBOWNER, TBNAME, NAME, COLVALUE,
   TYPE, CARDF, COLGROUPCOLNO, NUMCOLUMNS)
VALUES (0, -1, 'N',
        'USRT001', 'PART_HISTORY', 'W_FROM', ' ',
        'C', 100, X'00040003', 2);
You can also use the RUNSTATS utility to put this information in SYSCOLDIST.
You tell DB2 about the frequency of a certain combination of column values by
updating SYSIBM.SYSCOLDIST. For example, you can indicate that 1% of the rows
in PART_HISTORY contain the values 3 for W_FROM and 3 for W_NOW by
inserting this row into SYSCOLDIST:
INSERT INTO SYSIBM.SYSCOLDIST
  (FREQUENCY, FREQUENCYF, STATSTIME, IBMREQD,
   TBOWNER, TBNAME, NAME, COLVALUE,
   TYPE, CARDF, COLGROUPCOLNO, NUMCOLUMNS)
VALUES (0, .0100, '2006-12-01-12.00.00.000000', 'N',
        'USRT001', 'PART_HISTORY', 'W_FROM', X'00800000030080000003',
        'F', -1, X'00040003', 2);
Managing query access paths
You can prevent unwanted access path changes at rebind for critical applications
that use static SQL statements.
About this task
Query optimization in DB2 depends on many factors, and even minor changes in
the database environment often cause significant changes to access paths. Because
DB2 must often rely on incomplete information, such as statistics, suboptimal
access paths are not uncommon and reoptimization sometimes yields access paths
that cause performance regressions and even application outages.
Plan management policies

Plan management policies enable you to specify whether DB2 saves historical
information about the access paths for SQL statements.
Plan management policy options

When you rebind a package, DB2 saves information about the access paths for
static SQL statements as package copies, based on the value that you specify for
the PLANMGMT bind option. You can specify the following values:
OFF
    No copies are saved. OFF is the default value of the PLANMGMT
    subsystem parameter.
BASIC
    DB2 saves the active copy and one older copy, which is known as the
    previous copy. DB2 replaces the previous copy with the former active
    copy and saves the newest copy as the active copy.
EXTENDED
    DB2 saves the active copy and two older copies, which are known as the
    previous and original copies. DB2 replaces the previous copy with the
    former active copy, saves the newest copy as the active copy, and saves
    the former previous copy as the original copy.
Managing access paths for static SQL statements

You can save copies of the access paths for static SQL statements, try to enforce an
existing access path when you rebind static SQL statements, detect access path
changes at rebind, and recover previous access paths after a regression occurs
because of a new access path.
Package copies
When a package is bound, DB2 stores relevant package information for static SQL
statements as records in several catalog tables and in the directory.
DB2 records the following types of information about static SQL statements when
you bind a package:
v Metadata, in the SYSIBM.SYSPACKAGES catalog table
v Query text, in the SYSIBM.SYSPACKSTMT catalog table
v Dependencies, in the SYSIBM.SYSPACKDEP catalog table
v Authorizations
v Access paths
v Compiled runtime structures, in the DSNDB01.SPT01 directory table space
When you rebind a package, DB2 deletes many of these records and replaces them
with new records. However, you can specify a plan management policy so that
DB2 retains such records in package copies when you rebind packages. The
default plan management policy is OFF, which means that DB2 does not save
package copies unless you specify otherwise when you rebind packages or modify
the PLANMGMT subsystem parameter. When you specify EXTENDED for the
plan management policy, DB2 retains active, previous, and original copies of the
package.
Although each copy might contain different metadata and compiled runtime
structures, the following attributes are common to all copies in a corresponding set
of active, previous and original package copies:
v Location
v Collection
v Package name
v Version
v Consistency token
You can use package copies with the following types of applications:
v Regular packages
v Non-native SQL procedures
v External procedures
v Trigger packages
Invalid package copies
Packages can become invalid when any object that the package depends on is
altered or dropped. For example, a package can become invalid when an index or
table is dropped, or when a table column is altered. The SYSIBM.SYSPACKDEP
catalog table records package dependencies. Depending on the change, different
copies of the package can be affected differently. In the case of a dropped table, all
copies of the package are invalidated, whereas in the case of a dropped index,
only the copies that depend on the dropped index might be invalidated.
When DB2 finds that the active copy for a package is invalid, it uses an automatic
bind to replace the current copy. The automatic bind is not affected by invalid
status on any previous or original copies, and the automatic bind replaces only the
active copy.
Related concepts
“Plan management policies” on page 304
Related tasks
“Reverting to saved access paths for static SQL statements” on page 306
“Saving access path information for static SQL statements”
Saving access path information for static SQL statements
You can use package copies to automatically save pertinent catalog table and
directory records for static SQL statements when you bind a package or rebind an
existing package.
About this task
When a performance regression occurs after you rebind a package, you can
use the saved historical information to switch back to the older copy of the
package and regain the old access paths.
Procedure
To save access path information for static SQL statements:
Specify the PLANMGMT bind option when you issue one of the following
commands:
v REBIND PACKAGE
v REBIND TRIGGER PACKAGE
The values of the PLANMGMT option have the following meanings:
OFF
    No copies are saved. OFF is the default value of the PLANMGMT
    subsystem parameter.
BASIC
    DB2 saves the active copy and one older copy, which is known as the
    previous copy. DB2 replaces the previous copy with the former active
    copy and saves the newest copy as the active copy.
EXTENDED
    DB2 saves the active copy and two older copies, which are known as the
    previous and original copies. DB2 replaces the previous copy with the
    former active copy, saves the newest copy as the active copy, and saves
    the former previous copy as the original copy.
When you rebind a package with the PLANMGMT (BASIC) or (EXTENDED)
options, the following options must remain the same for package copies to remain
usable:
v OWNER
v QUALIFIER
v ENABLE
v DISABLE
v PATH
v PATHDEFAULT
v IMMEDWRITE
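For example, a minimal sketch (the collection and package names are illustrative
assumptions):
REBIND PACKAGE(MYCOLL.MYPKG) PLANMGMT(EXTENDED)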
Related concepts
“Package copies” on page 304
Reverting to saved access paths for static SQL statements
If a package that contains static SQL statements suffers a performance regression
after a rebind, you can fall back to a copy of the better-performing access paths
for the package.
Before you begin
The following prerequisites have been met:
v You previously specified a plan management policy so that DB2 saves access
path information.
Procedure
To revert a package to use previously saved access paths:
Specify the SWITCH option when you issue one of the following commands:
v REBIND PACKAGE
v REBIND TRIGGER PACKAGE
You can specify one of the following options:
SWITCH(PREVIOUS)
    DB2 toggles the active and previous packages:
    v The existing active copy takes the place of the previous copy.
    v The existing previous copy takes the place of the active copy.
    v Any existing original copy remains unchanged.
SWITCH(ORIGINAL)
    DB2 replaces the active copy with the original copy:
    v The existing active copy replaces the previous copy.
    v The existing previous copy is discarded.
    v The existing original copy remains unchanged.
You can use wild cards (*) in the syntax to restore the previous or original
packages for multiple packages. When you specify the SWITCH option, package
copies that are stored as rows of data in the following objects are modified:
v DSNDB01.SPT01 table space
v SYSIBM.SYSPACKDEP catalog table
If no previous or original package copies exist, DB2 issues an error message for
each package that does not have copies and processes the remainder of the
packages normally.
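For example, a minimal sketch (the collection and package names are illustrative
assumptions):
REBIND PACKAGE(MYCOLL.MYPKG) SWITCH(PREVIOUS)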
Related concepts
“Package copies” on page 304
Freeing saved access paths for static SQL statements
You can remove saved copies of access path information for static SQL statements
to free up the disk space used to save them.
Before you begin
The following prerequisites have been met:
v You previously specified a plan management policy so that DB2 saves access
path information.
About this task
When you save access path copies for static SQL statements, some disk space is
used to save the copies. The amount of space that is used depends on the plan
management policy that you choose. When you specify an extended plan
management policy, the amount of space that is used might be triple the amount
that is used when plan management is off. Consequently, you might want to
remove some or all unneeded historical copies of the access path information to
free disk space.
Procedure
To remove saved copies of access path information for static SQL statements:
Issue a FREE PACKAGE command, and specify a scope for the action with the
PLANMGMTSCOPE option:
PLANMGMTSCOPE(ALL)
    DB2 removes all copies of the access path information, including the
    active copy.
PLANMGMTSCOPE(INACTIVE)
    DB2 removes only the previous and original copies, and retains the active
    copy for future use.
For example:
FREE PACKAGE (collection-name.package-name) PLANMGMTSCOPE(INACTIVE)
The PLANMGMTSCOPE option cannot be used for remote processing.
Influencing access path selection by using optimization hints
You can suggest that DB2 selects a particular access path for SQL statements by
creating and enabling optimization hints. However, there are limitations, and DB2
cannot always enforce the access path that is specified in a hint.
About this task
You can create the following types of optimization hints:
User-level optimization hints
You can specify that DB2 tries to enforce the access path for an SQL
statement that is issued from a particular authorization ID when the
statement is bound or prepared. Hints of this type apply only when the
statement is issued by the same authorization ID that qualifies the
PLAN_TABLE instance that contains the hint. DB2 uses the value that is
specified in the QUERYNO clause of the SQL statement and the
QUERYNO column of the PLAN_TABLE to identify applicable hints.
Enabling optimization hints

You can specify whether the DB2 subsystem uses optimization hints.

Procedure

To enable optimization hints on the DB2 subsystem:
1. Set the value of the OPTHINTS subsystem parameter to 'YES'. This value is set
by the OPTIMIZATION HINTS field on the performance and optimization
installation panel. When you specify 'YES', DB2 enables the following actions:
v SET CURRENT OPTIMIZATION HINT statements.
v The CURRENT OPTIMIZATION HINT bind option.
Otherwise, those actions are blocked by DB2.
2. Create the following index on instances of PLAN_TABLE that contain the
optimization hints:
CREATE INDEX userid.PLAN_TABLE_HINT_IX
  ON userid.PLAN_TABLE
  ("QUERYNO",
   "APPLNAME",
   "PROGNAME",
   "VERSION",
   "COLLID",
   "OPTHINT")
  USING STOGROUP stogroup-name
  ERASE NO
  BUFFERPOOL BP0
  CLOSE NO;
The statement that creates the index is also included as part of the DSNTESC
member of the SDSNSAMP library.
Related tasks
“Creating user-level optimization hints”
Creating user-level optimization hints

You can create optimization hints to try to enforce a particular access path for an
SQL statement that is issued by a specific authorization ID.

Before you begin

The following prerequisites are met:
v The value of the OPTHINTS subsystem parameter is set to 'YES'.
v An instance of the PLAN_TABLE table is created under the authorization ID
that issues the SQL statement.
v An index is created on the following columns of that PLAN_TABLE:
  – QUERYNO
  – APPLNAME
  – PROGNAME
  – VERSION
  – COLLID
  – OPTHINT
About this task

When you create optimization hints by the method described below, the hint
applies only to the particular specified SQL statement, and only for instances of
that statement that are issued by the authorization ID that owns the PLAN_TABLE
that contains the hint. DB2 does not attempt to use the hint for instances of that
same statement that are issued by other authorization IDs.
Procedure

To create user-level optimization hints:
1. Optional: Include a QUERYNO clause in your SQL statements. The following
query contains an example of the QUERYNO clause:
SELECT * FROM T1
  WHERE C1 = 10 AND
        C2 BETWEEN 10 AND 20 AND
        C3 NOT LIKE 'A%'
  QUERYNO 100;
This step is not required for the use of optimization hints. However, by
specifying a query number to identify each SQL statement, you can eliminate
ambiguity in the relationships between rows in the PLAN_TABLE and the
corresponding SQL statements.
For example, the statement number for dynamic applications is the number of
the statement that prepares the statements in the application. For some
applications, such as DSNTEP2, the same statement in the application prepares
each dynamic statement, meaning that every dynamic statement has the same
statement number.
Similarly, when you modify an application that contains static statements, the
statement numbers might change, causing rows in the PLAN_TABLE to be out
of sync with the modified application. Statements that use the QUERYNO
clause are not dependent on the statement numbers. You can move those
statements around without affecting the relationship between rows in the
PLAN_TABLE and the corresponding statements in the application.
Such ambiguity might prevent DB2 from enforcing the hints.
2. Insert a name for the hint into the OPTHINT column of the PLAN_TABLE
rows for the SQL statement. This step enables DB2 to identify the
PLAN_TABLE rows that become the hint.
UPDATE PLAN_TABLE
  SET OPTHINT = 'NOHYB'
  WHERE QUERYNO = 200 AND
        APPLNAME = ' ' AND
        PROGNAME = 'DSNTEP2' AND
        VERSION = ' ' AND
        COLLID = 'DSNTEP2';
3. Optional: Modify the PLAN_TABLE rows to instruct DB2 to try to enforce a
different access path. You might also use optimization hints only to try to
enforce the same access path after a rebind or prepare. In that case, you can
omit this step.
For example, suppose that DB2 chooses a hybrid join (METHOD = 4) when you
know that a sort merge join (METHOD = 2) might perform better. You might issue
the following statement.
UPDATE PLAN_TABLE
SET METHOD = 2
WHERE QUERYNO = 200 AND
APPLNAME = ' ' AND
PROGNAME = 'DSNTEP2' AND
VERSION = '' AND
COLLID = 'DSNTEP2' AND
OPTHINT = 'NOHYB' AND
METHOD = 4;
4. Instruct DB2 to begin enforcing the hint:
For dynamic statements...
1. Issue a SET CURRENT OPTIMIZATION HINT = 'hint-name' statement.
2. If the SET CURRENT OPTIMIZATION HINT statement is a static SQL
statement, rebind the plan or package.
3. Issue an EXPLAIN statement for statements that use the hint. DB2 adds
rows to the plan table for the statement and inserts the 'hint-name' value
into the HINT_USED column.
If the dynamic statement cache is enabled, DB2 tries to use the hint only
when no match is found for the statement in the dynamic statement cache.
Otherwise, DB2 uses the cached plan, and does not prepare the statement or
consider the hint.
For static statements...
Rebind the plan or package that contains the statements and specify the
EXPLAIN(YES) and OPTHINT('hint-name') options.
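For example, assuming the NOHYB hint that was created in the previous steps, a
dynamic application might enable the hint with the following statement (a sketch
that carries over only the hint name from the earlier example):
SET CURRENT OPTIMIZATION HINT = 'NOHYB';
Statements that are subsequently prepared while this special register is in effect,
and that match the PLAN_TABLE rows, are considered for the hint.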
DB2 uses the following PLAN_TABLE columns when matching hints to SQL
statements:
v QUERYNO
v APPLNAME
v PROGNAME
v VERSION
v COLLID
v OPTHINT
It is recommended that you create an index on these columns for the
PLAN_TABLE when you use optimization hints. If DB2 uses all of the hints that
you provided, it returns SQLCODE +394 from the PREPARE of the EXPLAIN
statement and from the PREPARE of SQL statements that use the hints. If any
of your hints are invalid, or if any duplicate hints were found, DB2 issues
SQLCODE +395. If DB2 does not find an optimization hint, DB2 returns
another SQLCODE. Usually, this SQLCODE is 0. DB2 also returns a message at
the completion of the bind operation to identify the number of statements for
which hints were fully applied, partially applied or not applied, and not found.
5. Select from the PLAN_TABLE to check whether DB2 used the hints. For
example, you might issue the following statement:
SELECT *
FROM PLAN_TABLE
WHERE QUERYNO = 200
ORDER BY TIMESTAMP, QUERYNO, QBLOCKNO, PLANNO, MIXOPSEQ;
The following table shows the example PLAN_TABLE data. The NOHYB hint
is indicated by a value in the OPTHINT column. You can also see that DB2
used that hint, as indicated by the NOHYB value in the HINT_USED column.
Table 75. PLAN_TABLE that shows that the NOHYB optimization hint is used
QUERYNO   METHOD   TNAME        OPTHINT   HINT_USED
200       0        EMP          NOHYB
200       2        EMPPROJACT   NOHYB
200       3                     NOHYB
200       0        EMP                    NOHYB
200       2        EMPPROJACT             NOHYB
200       3                               NOHYB
PSPI
Related tasks
“Enabling optimization hints” on page 308
Related reference
“PLAN_TABLE” on page 754
Related information
+394 (DB2 Codes)
+395 (DB2 Codes)
DSNT222I (DB2 Messages)
How DB2 validates optimization hints
DB2 cannot always use access paths that are specified by optimization hints. When
a hint cannot be used, DB2 marks the hint as invalid.
PSPI
If an access path that you specify has major problems, DB2 invalidates all
hints for that query block. In that event, DB2 determines the access path as it
normally does.
DB2 uses only the PLAN_TABLE columns that are shown in the following table
when validating hints.
Table 76. PLAN_TABLE columns that DB2 validates
METHOD
Must be 0, 1, 2, 3, or 4. Any other value invalidates the hints.
CREATOR and TNAME
Must be specified and must name a table, materialized view, or
materialized nested table expression. Blank if METHOD is 3. If a table is
named that does not exist or is not involved in the query, then the hints
are invalid.
MATCHCOLS
This value is used only when ACCESSTYPE is IN. The value must be
greater than or equal to 0.
TABNO
Required only if CREATOR, TNAME, and CORRELATION_NAME do not
uniquely identify the table. This situation might occur when the same
table is used in multiple views (with the same CORRELATION_NAME).
This field is ignored if it is not needed.
ACCESSTYPE
Must contain one of the following values. Any other value invalidates the
hints:
v H
v I
v I1
v M
v N
v O
v R
v RW
v T
v V
Values of I, I1, IN, and N all mean single index access. DB2 determines
which type to use based on the index specified in the ACCESSNAME
column.
M indicates multiple index access. DB2 uses only the first row in the
authid.PLAN_TABLE for multiple index access (MIXOPSEQ=0). The choice
of indexes, and the AND and OR operations involved, is determined by
DB2. If multiple index access is not possible, then the hints are invalidated.
ACCESSCREATOR and ACCESSNAME
Ignored if ACCESSTYPE is R or M. If ACCESSTYPE is I, I1, IN, or N, then
these fields must identify an index on the specified table.
If the index does not exist, or if the index is defined on a different table,
then the hints are invalid. Also, if the specified index cannot be used, then
the hints are invalid.
SORTN_JOIN and SORTC_JOIN
Must be Y, N, or blank. Any other value invalidates the hints.
This value determines whether DB2 sorts the new (SORTN_JOIN) or
composite (SORTC_JOIN) table. This value is ignored if the specified join
method, join sequence, access type, and access name dictate whether a sort
of the new or composite tables is required.
PREFETCH
Must be D, S, L, or blank. Any other value invalidates the hints.
This value determines whether DB2 uses dynamic prefetch (D), sequential
prefetch (S), list prefetch (L), or no prefetch (blank). (A blank does not
prevent sequential detection at run time.) This value is ignored if the
specified access type and access name dictate the type of prefetch required.
PAGE_RANGE
Must be Y, N, or blank. Any other value invalidates the hints.
PARALLELISM_MODE
This value is used only if it is possible to run the query in parallel; that is,
the SET CURRENT DEGREE special register contains ANY, or the plan or
package was bound with DEGREE(ANY).
If parallelism is possible, this column must contain one of the following
values:
v C
v I
v X
v null
All of the restrictions involving parallelism still apply when using access
path hints. If the specified mode cannot be performed, the hints are either
invalidated or the mode is modified by the optimizer, possibly resulting in
the query being run sequentially. If the value is null, then the optimizer
determines the mode.
ACCESS_DEGREE or JOIN_DEGREE
If PARALLELISM_MODE is specified, use this field to specify the degree
of parallelism. If you specify a degree of parallelism, this value must be a
number greater than zero, and DB2 might adjust the parallel degree from
what you set here. If you want DB2 to determine the degree, do not enter
a value in this field.
If you specify a value for ACCESS_DEGREE or JOIN_DEGREE, you must
also specify a corresponding ACCESS_PGROUP_ID and JOIN_PGROUP_ID.
WHEN_OPTIMIZE
Must be R, B, or blank. Any other value invalidates the hints.
When a statement in a plan that is bound with REOPT(ALWAYS) qualifies
for reoptimization at run time, and you have provided optimization hints
for that statement, the value of WHEN_OPTIMIZE determines whether
DB2 reoptimizes the statement at run time. If the value of
WHEN_OPTIMIZE is blank or B, DB2 uses only the access path that is
provided by the optimization hints at bind time. If the value of
WHEN_OPTIMIZE is R, DB2 determines the access path at bind time using
the optimization hints. At run time, DB2 searches the PLAN_TABLE for
hints again, and if hints for the statement are still in the PLAN_TABLE
and are still valid, DB2 optimizes the access path using those hints again.
QBLOCK_TYPE
A value must be specified. A blank in the QBLOCK_TYPE column
invalidates the hints.
PRIMARY_ACCESSTYPE
Must be D, T, or blank. Any other value invalidates the hints.
PSPI
Related reference
“PLAN_TABLE” on page 754
Limitations on optimization hints
DB2 cannot always apply optimization hints.
PSPI
In certain situations, DB2 cannot apply optimization hints, and hints might
be used differently from release to release, or not at all in certain releases. For
example:
v Optimization hints cannot force or undo query transformations, such as
subquery transformation to join or materialization or merge of a view or table
expression.
A query that is not transformed in one release of DB2 might be transformed in a
later release. Therefore, if you include a hint in one release for a query that is
not transformed in that release, and the query is transformed in a later release,
DB2 does not use the hint in the later release.
v DB2 might use an optimization hint in two different releases, but use an
equivalent access path that does not look exactly the same in both releases. For
example, prior to Version 9, work files that represent non-correlated subqueries
were not shown in the PLAN_TABLE. An access path based on the same hint
that was used in an earlier version might now contain a row for that work file.
v DB2 ignores any PLAN_TABLE row that contains METHOD=3.
v DB2 ignores any PLAN_TABLE row that contains any of the following values for
the QBLOCK_TYPE column:
– INSERT
– UNION
– UNIONA
– INTERS
– INTERA
– EXCEPT
– EXCEPTA
PSPI
Reoptimizing SQL statements at run time
You can specify whether DB2 reoptimizes statements at execution time, based on
literal values that are provided for host variables, parameter markers, and special
registers in the statement.
Procedure
To manage whether DB2 reoptimizes dynamic SQL statements at run time:
1. Identify the statements that execute most efficiently when DB2 follows the
rules of each bind option.
REOPT(ALWAYS)
DB2 always uses literal values that are provided for host variables,
parameter markers, and special registers to reoptimize the access path for
any SQL statement at every execution of the statement.
Consider using the REOPT(ALWAYS) bind option in the
following circumstances:
v The SQL statement does not perform well with the
access path that is chosen at bind time.
v The SQL statement takes a relatively long time to
execute. For long-running SQL statements, the
performance gain from the better access path that is
chosen based on the input variable values for each run
can be greater than the performance cost of
reoptimizing the access path each time that the
statement runs.
You can issue the following statements to identify
statements that are reoptimized under the
REOPT(ALWAYS) bind option:
SELECT PLNAME,
CASE WHEN STMTNOI <> 0
THEN STMTNOI
ELSE STMTNO
END AS STMTNUM,
SEQNO, TEXT
FROM SYSIBM.SYSSTMT
WHERE STATUS IN ('B','F','G','J')
ORDER BY PLNAME, STMTNUM, SEQNO;
SELECT COLLID, NAME, VERSION,
CASE WHEN STMTNOI <> 0
THEN STMTNOI
ELSE STMTNO
END AS STMTNUM,
SEQNO, STMT
FROM SYSIBM.SYSPACKSTMT
WHERE STATUS IN ('B','F','G','J')
ORDER BY COLLID, NAME, VERSION, STMTNUM, SEQNO;
REOPT(AUTO)
DB2 determines at execution time whether to reoptimize
access paths for cached dynamic SQL statements based
on literal values that are provided for parameter
markers, host variables and special registers.
Consider using the REOPT(AUTO) bind option to
achieve a better balance between the costs of
reoptimization and the costs of processing a statement.
You might use the REOPT(AUTO) bind option for many
statements for which you could choose either the
REOPT(ALWAYS) or REOPT(NONE) bind options, and
especially in the following situations:
v The statement is a dynamic SQL statement and can be
cached. If dynamic statement caching is not turned on
when DB2 executes a statement specified with the
REOPT(AUTO) bind option, no reoptimization occurs.
v The SQL statement sometimes takes a relatively long
time to execute, depending on the values of referenced
parameter markers, especially when parameter
markers refer to columns that contain skewed values
or that are used in range predicates. In such situations
the estimation of qualifying rows might change based
upon the literal values that are used at execution time.
For such SQL statements, the performance gain from a new access path
that is chosen based on the input variable values for each execution might
or might not be greater than the performance cost of reoptimization when
the statement runs.
REOPT(ONCE)
DB2 reoptimizes cached dynamic SQL statements at
execution time for the first execution of the statement
based on literal values that are provided for parameter
markers, and special registers.
The REOPT(ONCE) bind option determines the access
path for an SQL statement only once at run time and
works only with dynamic SQL statements. The
REOPT(ONCE) bind option allows DB2 to store the
access path for dynamic SQL statements in the dynamic
statement cache.
Consider using the REOPT(ONCE) bind option in the
following circumstances:
v The SQL statement is a dynamic SQL statement.
v The SQL statement does not perform well with the
access path that is chosen at bind time.
v The SQL statement is relatively simple and takes a
relatively short time to execute. For simple SQL
statements, reoptimizing the access path each time that
the statement runs can degrade performance more
than using the access path from the first run for each
subsequent run.
v The same SQL statement is repeated many times in a
loop, or is run by many threads. Because of the
dynamic statement cache, the access path that DB2
chooses for the first set of input variables performs
well for subsequent executions of the same SQL
statement, even if the input variable values are
different each time.
REOPT(NONE)
You can specify that DB2 always uses the access path that was initially
selected, and does not reoptimize it at run time. Keep in mind that an
SQL statement that performs well with the REOPT(NONE) bind option
might perform even better with the bind options that change the access
path at run time.
2. Separate the statements that run best under the different reoptimization
options into different packages according to the best reoptimization option for
each statement, and specify the appropriate bind option when you bind each
package.
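For example, a package whose statements run best under REOPT(AUTO) might be
rebound as follows. This is a sketch; the collection and package names
MYCOLL.MYPKG are hypothetical:
REBIND PACKAGE(MYCOLL.MYPKG) REOPT(AUTO)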
Related tasks
“Using host variables efficiently” on page 281
Chapter 13. Programming for concurrency
DB2 processes acquire, or avoid acquiring, locks based on certain general
parameters.
About this task
PSPI
To preserve data integrity, your application process acquires locks implicitly, under
the control of DB2. It is not necessary for a process to explicitly request a lock to
conceal uncommitted data. Therefore, you do not always need to do anything
about DB2 locks. However, you can make better use of your resources and
improve concurrency by understanding the effects of the parameters that DB2 uses
to control locks.
PSPI
Concurrency and locks
Concurrency is the ability of more than one application process to access the same
data at essentially the same time.
PSPI
For example, an application for order entry is used by many transactions simultaneously. Each
transaction makes inserts in tables of invoices and invoice items, reads a table of
data about customers, and reads and updates data about items on hand. Two
operations on the same data, by two simultaneous transactions, might be separated
only by microseconds. To the users, the operations appear concurrent.
Why DB2 controls concurrency
Concurrency must be controlled to prevent lost updates and such possibly
undesirable effects as unrepeatable reads and access to uncommitted data.
Lost updates
Without concurrency control, two processes, A and B, might both read the
same row from the database, and both calculate new values for one of its
columns, based on what they read. If A updates the row with its new
value, and then B updates the same row, A's update is lost.
Access to uncommitted data
Also without concurrency control, process A might update a value in the
database, and process B might read that value before it was committed.
Then, if A's value is not later committed, but backed out, B's calculations
are based on uncommitted (and presumably incorrect) data.
Unrepeatable reads
Some processes require the following sequence of events: A reads a row
from the database and then goes on to process other SQL requests. Later, A
reads the first row again and must find the same values it read the first
time. Without control, process B could have changed the row between the
two read operations.
To prevent those situations from occurring unless they are specifically allowed,
DB2 might use locks to control concurrency.
How DB2 uses locks
A lock associates a DB2 resource with an application process in a way that affects
how other processes can access the same resource. The process associated with the
resource is said to “hold” or “own” the lock. DB2 uses locks to ensure that no
process accesses data that has been changed, but not yet committed, by another
process. For LOB and XML data, DB2 also uses locks to ensure that an application
cannot access partial or incomplete data.
Locks might cause situations that degrade DB2 performance, such as suspension,
time out, and deadlock.
Suspension
An application process is suspended when it requests a lock that is already held by
another application process and cannot be shared. The suspended process
temporarily stops running.
PSPI
Order of precedence for lock requests
Incoming lock requests are queued. Requests for lock promotion, and requests for
a lock by an application process that already holds a lock on the same object,
precede requests for locks by new applications. Within those groups, the request
order is “first in, first out.”
Example suspension
Using an application for inventory control, two users attempt to reduce the
quantity on hand of the same item at the same time. The two lock requests are
queued. The second request in the queue is suspended and waits until the first
request releases its lock.
Effects of suspension
The suspended process resumes running when:
v All processes that hold the conflicting lock release it.
v The requesting process times out or deadlocks and the process resumes to deal
with an error condition.
PSPI
Time out
An application process is said to time out when it is terminated because it has been
suspended for longer than a preset interval.
PSPI
Example time out
An application process attempts to update a large table space that is being
reorganized by the utility REORG TABLESPACE with SHRLEVEL NONE. It is
likely that the utility job does not release control of the table space before the
application process times out.
Effects of time outs
DB2 terminates the process, issues two messages to the console, and returns
SQLCODE -911 or -913 to the process (SQLSTATEs '40001' or '57033'). Reason code
00C9008E is returned in the SQLERRD(3) field of the SQLCA. Alternatively, you
can use the GET DIAGNOSTICS statement to check the reason code. If statistics
trace class 3 is active, DB2 writes a trace record with IFCID 0196.
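For example, an embedded application might retrieve the reason code after it
receives SQLCODE -911 or -913 with a statement like the following sketch, where
:HVREASON is a hypothetical host variable:
EXEC SQL GET DIAGNOSTICS CONDITION 1 :HVREASON = DB2_REASON_CODE;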
Effects in IMS
If you are using IMS, and a timeout occurs, the following actions take place:
v In a DL/I batch application, the application process abnormally terminates with
a completion code of 04E and a reason code of 00D44033 or 00D44050.
v In any IMS environment except DL/I batch:
– DB2 performs a rollback operation on behalf of your application process to
undo all DB2 updates that occurred during the current unit of work.
– For a non-message driven BMP, IMS issues a rollback operation on behalf of
your application. If this operation is successful, IMS returns control to your
application, and the application receives SQLCODE -911. If the operation is
unsuccessful, IMS issues user abend code 0777, and the application does not
receive an SQLCODE.
– For an MPP, IFP, or message driven BMP, IMS issues user abend code 0777,
rolls back all uncommitted changes, and reschedules the transaction. The
application does not receive an SQLCODE.
COMMIT and ROLLBACK operations do not time out. The command STOP
DATABASE, however, might time out and send messages to the console, but it
retries up to 15 times. PSPI
Deadlock
A deadlock occurs when two or more application processes each hold locks on
resources that the others need and without which they cannot proceed.
Example deadlock
PSPI
The following figure illustrates a deadlock between two transactions.
[Figure: two jobs, EMPLJCHG and PROJNCHG, are each suspended while waiting
for a page lock (page A of table N, containing record 000010, and page B of table
M, containing record 000300) that the other job holds. The numbered steps are
described in the following notes.]
Notes:
1. Jobs EMPLJCHG and PROJNCHG are two transactions. Job EMPLJCHG accesses table M, and acquires an
exclusive lock for page B, which contains record 000300.
2. Job PROJNCHG accesses table N, and acquires an exclusive lock for page A, which contains record 000010.
3. Job EMPLJCHG requests a lock for page A of table N while still holding the lock on page B of table M. The job is
suspended, because job PROJNCHG is holding an exclusive lock on page A.
4. Job PROJNCHG requests a lock for page B of table M while still holding the lock on page A of table N. The job is
suspended, because job EMPLJCHG is holding an exclusive lock on page B. The situation is a deadlock.
Figure 35. A deadlock example
Effects of deadlocks
After a preset time interval (the value of DEADLOCK TIME), DB2 can roll back
the current unit of work for one of the processes or request a process to terminate.
That frees the locks and allows the remaining processes to continue. If statistics
trace class 3 is active, DB2 writes a trace record with IFCID 0172. Reason code
00C90088 is returned in the SQLERRD(3) field of the SQLCA. Alternatively, you
can use the GET DIAGNOSTICS statement to check the reason code. (The codes
that describe the exact DB2 response depend on the operating environment.)
It is possible for two processes to be running on distributed DB2 subsystems, each
trying to access a resource at the other location. In that case, neither subsystem can
detect that the two processes are in deadlock; the situation resolves only when one
process times out.
Indications of deadlocks: In some cases, a deadlock can occur if two application
processes attempt to update data in the same page or table space.
Deadlocks and TSO, Batch, and CAF
When a deadlock or timeout occurs in these environments, DB2 attempts to roll
back the SQL for one of the application processes. If the ROLLBACK is successful,
that application receives SQLCODE -911. If the ROLLBACK fails, and the
application does not abend, the application receives SQLCODE -913.
Deadlocks and IMS
If you are using IMS, and a deadlock occurs, the following actions take place:
v In a DL/I batch application, the application process abnormally terminates with
a completion code of 04E and a reason code of 00D44033 or 00D44050.
v In any IMS environment except DL/I batch:
– DB2 performs a rollback operation on behalf of your application process to
undo all DB2 updates that occurred during the current unit of work.
– For a non-message driven BMP, IMS issues a rollback operation on behalf of
your application. If this operation is successful, IMS returns control to your
application, and the application receives SQLCODE -911. If the operation is
unsuccessful, IMS issues user abend code 0777, and the application does not
receive an SQLCODE.
– For an MPP, IFP, or message driven BMP, IMS issues user abend code 0777,
rolls back all uncommitted changes, and reschedules the transaction. The
application does not receive an SQLCODE.
Deadlocks and CICS
If you are using CICS and a deadlock occurs, the CICS attachment facility decides
whether or not to roll back one of the application processes, based on the value of
the ROLBE or ROLBI parameter. If your application process is chosen for rollback,
it receives one of two SQLCODEs in the SQLCA:
-911
A SYNCPOINT command with the ROLLBACK option was issued on
behalf of your application process. All updates (CICS commands and DL/I
calls, as well as SQL statements) that occurred during the current unit of
work have been undone. (SQLSTATE '40001')
-913
A SYNCPOINT command with the ROLLBACK option was not issued.
DB2 rolls back only the incomplete SQL statement that encountered the
deadlock or timed out. CICS does not roll back any resources. Your
application process should either issue a SYNCPOINT command with the
ROLLBACK option itself or terminate. (SQLSTATE '57033')
Consider using the DSNTIAC subroutine to check the SQLCODE and display the
SQLCA. Your application must take appropriate actions before resuming. PSPI
Promoting basic concurrency
By following certain basic recommendations, you can promote concurrency in your
DB2 system.
About this task
Recommendations are grouped by their scope.
Using system and subsystem options to promote concurrency
Some performance problems can appear to be locking problems although they are
really problems somewhere else in the system.
About this task
For example, a table space scan of a large table can result in timeout situations.
Procedure
To improve concurrency on your DB2 system:
v Resolve overall system, subsystem, and application performance problems to
ensure that you not only eliminate locking symptoms but also correct other
underlying performance problems.
v Consider reducing the number of threads or initiators, increasing the priority for
the DB2 tasks, and providing more processing, I/O, and real memory. If a task is
waiting or is swapped out and the unit of work has not been committed, then it
still holds locks. When a system is heavily loaded, contention for processing,
I/O, and storage can cause waiting.
Designing your databases for concurrency
By following general recommendations and best practices for database design you
can ensure improved concurrency on your DB2 system.
Procedure
PSPI
To design your database to promote concurrency:
v Keep like things together:
– Give each application process that creates private tables a private database.
– Put tables relevant to the same application into the same database.
– Put tables together in a segmented table space if they are similar in size and
can be recovered together.
v Use an adequate number of databases, schema or authorization-ID qualifiers,
and table spaces to keep unlike things apart. Concurrency and performance are
improved for SQL data definition statements, GRANT statements, REVOKE
statements, and utilities. For example, a general guideline is a maximum of 50
tables per database.
v Plan for batch inserts. If your application does sequential batch insertions,
excessive contention on the space map pages for the table space can occur.
This problem is especially apparent in data sharing, where contention on the
space map means the added overhead of page P-lock negotiation. For these
types of applications, consider using the MEMBER CLUSTER option of CREATE
TABLESPACE. This option causes DB2 to disregard the clustering index (or
implicit clustering index) when assigning space for the SQL INSERT statement.
v Use LOCKSIZE ANY until you have reason not to. LOCKSIZE ANY is the
default for CREATE TABLESPACE.
It allows DB2 to choose the lock size, and DB2 usually chooses LOCKSIZE
PAGE and LOCKMAX SYSTEM for non-LOB/non-XML table spaces. For LOB
table spaces, DB2 chooses LOCKSIZE LOB and LOCKMAX SYSTEM. Similarly,
for XML table spaces, DB2 chooses LOCKSIZE XML and LOCKMAX SYSTEM.
You should use LOCKSIZE TABLESPACE or LOCKSIZE TABLE only for
read-only table spaces or tables, or when concurrent access to the object is not
needed. Before you choose LOCKSIZE ROW, you should estimate whether doing
so increases the overhead for locking, especially from P-locks in a data sharing
environment, and weigh that against the increase in concurrency.
v For small tables with high concurrency requirements, estimate the number of
pages in the data and in the index. In this case, you can spread out your data to
improve concurrency, or consider it a reason to use row locks. If the index
entries are short or they have many duplicates, then the entire index can be one
root page and a few leaf pages.
v Partition large tables to take advantage of parallelism for online queries, batch
jobs, and utilities.
When batch jobs are run in parallel and each job goes after different partitions,
lock contention is reduced. In addition, in data sharing environments, data
sharing overhead is reduced when applications that are running on different
members go after different partitions.
Certain utility operations, such as LOAD and REORG, cannot take advantage of
parallelism in partition-by-growth table spaces.
v Partition secondary indexes. By using data-partitioned secondary indexes
(DPSIs) you can promote partition independence and thereby reduce lock
contention and improve index availability, especially for utility processing,
partition-level operations (such as dropping or rotating partitions), and recovery
of indexes.
However, using data-partitioned secondary indexes does not always improve the
performance of queries. For example, for a query with a predicate that references
only the columns of a data-partitioned secondary index, DB2 must probe each
partition of the index for values that satisfy the predicate if index access is
chosen as the access path. Therefore, take into account data access patterns and
maintenance practices when deciding to use a data-partitioned secondary index.
Replace a nonpartitioned index with a partitioned index only if you will realize
perceivable benefits such as improved data or index availability, easier data or
index maintenance, or improved performance.
v Specify fewer rows of data per page. You can use the MAXROWS clause of
CREATE or ALTER TABLESPACE to specify the maximum number of rows that
can be on a page. For example, if you use MAXROWS 1, each row occupies a
whole page, and you confine a page lock to a single row. Consider this option if
you have a reason to avoid using row locking, such as in a data sharing
environment where row locking overhead can be greater. (See the sketch that
follows this list.)
v If multiple applications access the same table, consider defining the table as
VOLATILE. DB2 uses index access whenever possible for volatile tables, even if
index access does not appear to be the most efficient access method because of
volatile statistics. Because each application generally accesses the rows in the
table in the same order, lock contention can be reduced.
v Consider using randomized index key columns. In a data sharing environment,
you can use randomized index key columns to reduce locking contention at the
possible cost of more CPU usage, from increased locking and getpage
operations, and more index page read and write I/Os.
This technique is effective for reducing contention on certain types of equality
predicates. For example, if you create an index on a timestamp column, where
the timestamp is always filled with the current time, every insertion on the
index would be the greatest value and cause contention on the page at the end
of the index. An index on a column of sequential values, such as invoice
numbers, causes similar contention, especially in heavy transaction workload
environments. In each case, using the RANDOM index order causes the values
to be stored at random places in the index tree, reducing the chance that
consecutive insertions hit the same page and cause contention.
Although the randomized index can relieve contention problems for sets of
similar or sequential values, it does not help with identical values. Identical
values encode the same way and are inserted at the same place in the index
tree. PSPI
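The following statements are a minimal sketch of the MAXROWS, VOLATILE, and
RANDOM recommendations in this list. The names DB1.TS1, T1, TSCOL, and IX1
are hypothetical:
ALTER TABLESPACE DB1.TS1 MAXROWS 1;
ALTER TABLE T1 VOLATILE;
CREATE INDEX IX1 ON T1 (TSCOL RANDOM);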
Programming your applications for concurrency
By following general recommendations and best practices for application design
you can ensure improved concurrency on your DB2 system.
Procedure
PSPI
To design your applications for concurrency:
v When two different applications access the same data, try to make them do so in
the same sequence. For example, if two applications access five rows of data,
make both access rows 1,2,3,5 in that order. In that case, the first application to
access the data delays the second, but the two applications cannot deadlock. For
the same reason, try to make different applications access the same tables in the
same order.
v To avoid unnecessary lock contention, issue a COMMIT statement as soon as
possible after reaching a point of consistency, even in read-only applications.
Statements issued through SPUFI can be committed immediately by the SPUFI
autocommit feature.
Taking commit points frequently in a long running unit of recovery (UR) has the
following benefits at the possible cost of more CPU usage and log write I/Os:
– Reduces lock contention, especially in a data sharing environment
– Improves the effectiveness of lock avoidance, especially in a data sharing
environment
– Reduces the elapsed time for DB2 system restart following a system failure
– Reduces the elapsed time for a unit of recovery to rollback following an
application failure or an explicit rollback request by the application
– Provides more opportunity for utilities, such as online REORG, to break in
Consider using the UR CHECK FREQ field or the UR LOG WRITE CHECK field
of installation panel DSNTIPN to help you identify those applications that are
not committing frequently. UR CHECK FREQ, which identifies when too many
checkpoints have occurred without a UR issuing a commit, is helpful in
monitoring overall system activity. UR LOG WRITE CHECK enables you to
detect applications that might write too many log records between commit
points, potentially creating a lengthy recovery situation for critical tables.
Even though an application might conform to the commit frequency standards
of the installation under normal operational conditions, variation can occur
based on system workload fluctuations. For example, a low-priority application
might issue a commit frequently on a system that is lightly loaded. However,
under a heavy system load, the use of the CPU by the application might be
preempted, and, as a result, the application might violate the rule set by the UR
CHECK FREQ parameter. For this reason, add logic to your application to
commit based on time elapsed since last commit, and not solely based on the
amount of SQL processing performed. In addition, take frequent commit points
in a long running unit of work that is read-only to reduce lock contention and to
provide opportunities for utilities, such as online REORG, to access the data.
Committing frequently is equally important for objects that are not logged and
objects that are logged. Make sure, for example, that you commit work
frequently even if the work is done on a table space that is defined with the
NOT LOGGED option. Even when a given transaction modifies only tables that
reside in not logged table spaces, a unit of recovery is still established before
updates are performed. Undo processing will continue to read the log in the
backward direction looking for undo log records that must be applied until it
detects the beginning of this unit of recovery as recorded on the log. Therefore,
such transactions should perform frequent commits to limit the distance undo
processing might have to go backward on the log to find the beginning of the
unit of recovery.
v Include logic in a batch program so that it retries an operation after a deadlock
or timeout. Such a method could help you recover from the situation without
assistance from operations personnel.
– Field SQLERRD(3) in the SQLCA returns a reason code that indicates whether
a deadlock or timeout occurred.
– Alternatively, you can use the GET DIAGNOSTICS statement to check the
reason code.
v Bind plans with the ACQUIRE(USE) option in most cases. ACQUIRE(USE), which
indicates that DB2 acquires table and table space locks when the objects are first
used and not when the plan is allocated, is the best choice for concurrency.
Packages are always bound with ACQUIRE(USE), by default.
ACQUIRE(ALLOCATE), which is an option for plans, but not for packages, can
provide better protection against timeouts. Consider ACQUIRE(ALLOCATE) for
applications that need gross locks instead of intent locks or that run with other
applications that might request gross locks instead of intent locks. Acquiring the
locks at plan allocation also prevents any one transaction in the application from
incurring the cost of acquiring the table and table space locks. If you need
ACQUIRE(ALLOCATE), you might want to bind all DBRMs directly to the plan.
v Bind applications with the ISOLATION(CS) and CURRENTDATA(NO) options
in most cases. ISOLATION(CS) lets DB2 release acquired row and page locks as
soon as possible. CURRENTDATA(NO) lets DB2 avoid acquiring row and page
locks as often as possible. When you use ISOLATION(CS) and
CURRENTDATA(NO), consider setting the SKIPUNCI subsystem parameter
to YES so that readers do not wait for the outcome of uncommitted inserts.
If you do not use ISOLATION(CS) and CURRENTDATA(NO), in order of
decreasing preference for concurrency, use the bind options:
1. ISOLATION(CS) with CURRENTDATA(YES), when data returned to the
application must not be changed before your next FETCH operation.
2. ISOLATION(RS), when data returned to the application must not be changed
before your application commits or rolls back. However, you do not care if
other application processes insert additional rows.
3. ISOLATION(RR), when data evaluated as the result of a query must not be
changed before your application commits or rolls back. New rows cannot be
inserted into the answer set.
v Use the ISOLATION(UR) option cautiously. UR isolation acquires almost no
locks on rows or pages. It is fast and causes little contention, but it reads
uncommitted data. Do not use it
unless you are sure that your applications and end users can accept the logical
inconsistencies that can occur.
As an alternative, consider using the SKIP LOCKED data option if omitting data
is preferable to reading uncommitted data in your application.
v Use sequence objects to generate unique, sequential numbers. (See the sketch at
the end of this topic.) Using an identity column is one way to generate unique
sequential numbers.
However, as a column of a table, an identity column is associated with and tied
to the table, and a table can have only one identity column. Your applications
might need to use one sequence of unique numbers for many tables or several
sequences for each table. As a user-defined object, sequences provide a way for
applications to have DB2 generate unique numeric key values and to coordinate
the keys across multiple rows and tables.
The use of sequences can avoid the lock contention problems that can result
when applications implement their own sequences, such as in a one-row table
that contains a sequence number that each transaction must increment. With
DB2 sequences, many users can access and increment the sequence concurrently
without waiting. DB2 does not wait for a transaction that has incremented a
sequence to commit before allowing another transaction to increment the
sequence again.
v Examine multi-row operations such as multi-row inserts, positioned updates,
and positioned deletes, which have the potential of expanding the unit of work.
This situation can affect the concurrency of other users that access the data. You
can minimize contention by adjusting the size of the host-variable-array,
committing between inserts and updates, and preventing lock escalation.
v Use global transactions. The Resource Recovery Services attachment facility
(RRSAF) relies on a z/OS component called Resource Recovery Services (RRS).
RRS provides system-wide services for coordinating two-phase commit
operations across z/OS products. For RRSAF applications and IMS transactions
that run under RRS, you can group together a number of DB2 agents into a
single global transaction.
A global transaction allows multiple DB2 agents to participate in a single global
transaction and thus share the same locks and access the same data. When two
agents that are in a global transaction access the same DB2 object within a unit
of work, those agents do not deadlock or timeout with each other. The following
restrictions apply:
– Parallel Sysplex® is not supported for global transactions.
– Because each of the "branches" of a global transaction are sharing locks,
uncommitted updates issued by one branch of the transaction are visible to
other branches of the transaction.
– Claim/drain processing is not supported across the branches of a global
transaction, which means that attempts to issue CREATE, DROP, ALTER,
GRANT, or REVOKE might deadlock or timeout if they are requested from
different branches of the same global transaction.
– LOCK TABLE might deadlock or timeout across the branches of a global
transaction.
v Use optimistic concurrency control. Optimistic concurrency control represents a
faster, more scalable locking alternative to database locking for concurrent data
access. It minimizes the time for which a given resource is unavailable for use
by other transactions.
When an application uses optimistic concurrency control, locks are obtained
immediately before a read operation and released immediately. Update locks are
obtained immediately before an update operation and held until the end of the
transaction. Optimistic concurrency control uses the RID and a row change
token to test whether data has been changed by another transaction since the
last read operation.
Because DB2 can determine when a row was changed, you can ensure data
integrity while limiting the time that locks are held. With optimistic concurrency
control, DB2 releases the row or page locks immediately after a read operation.
DB2 also releases the row lock after each FETCH, taking a new lock on a row
only for a positioned update or delete to ensure data integrity.
To safely implement optimistic concurrency control, you must establish a row
change timestamp column with a CREATE TABLE statement or an ALTER
TABLE statement. The column must be defined with one of the following null
characteristics:
– NOT NULL GENERATED ALWAYS FOR EACH ROW ON UPDATE AS ROW
CHANGE TIMESTAMP
– NOT NULL GENERATED BY DEFAULT FOR EACH ROW ON UPDATE AS
ROW CHANGE TIMESTAMP
After you establish a row change timestamp column, DB2 maintains the contents
of this column. When you want to use this change token as a condition when
making an update, you can specify an appropriate predicate for this column in a
WHERE clause.
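For example, the following statements are a minimal sketch of this approach. The
table T1, its columns, and the host variables are hypothetical:
ALTER TABLE T1
ADD RCT TIMESTAMP NOT NULL
GENERATED ALWAYS FOR EACH ROW ON UPDATE AS ROW CHANGE TIMESTAMP;
UPDATE T1
SET C1 = :HVNEW
WHERE ID = :HVID
AND ROW CHANGE TIMESTAMP FOR T1 = :HVRCT;
The UPDATE affects the row only if no other transaction has changed it since the
timestamp in :HVRCT was read.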
PSPI
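The sequence-object approach that is recommended earlier in this topic might look
like the following sketch. The sequence ORDER_SEQ and the table T1 are
hypothetical:
CREATE SEQUENCE ORDER_SEQ AS INTEGER
START WITH 1 INCREMENT BY 1 CACHE 20;
INSERT INTO T1 (ORDER_ID, C1)
VALUES (NEXT VALUE FOR ORDER_SEQ, 'A');
Many transactions can execute the INSERT concurrently without serializing on a
one-row counter table.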
Aspects of transaction locks
Understanding the sizes, durations, modes, and objects of transaction locks can
help you understand why a process suspends or times out or why two processes
deadlock, and how you might change the situation.
Lock size
The size (sometimes scope or level) of a lock on data in a table describes the amount
of data that is controlled by the lock. The same piece of data can be controlled by
locks of different sizes.
PSPI
DB2 uses locks of the following sizes:
v Table space
v Table
v Partition
v Page
v Row
A table space lock (the largest size) controls the most data, all the data in an entire
table space. A page or row lock controls only the data in a single page or row.
As the following figure suggests, row locks and page locks occupy an equal place
in the hierarchy of lock sizes.
[Figure: lock size hierarchies by table space type. In segmented and simple table
spaces, a table space lock is above a table lock, which is above row and page
locks. In a LOB table space, a LOB table space lock is above LOB locks; in an XML
table space, an XML table space lock is above XML locks. In partitioned and
universal table spaces, partition locks are above row and page locks.]
Figure 36. Sizes of objects locked
Locking larger or smaller amounts of data allows you to trade performance for
concurrency. Using page or row locks instead of table or table space locks has the
following effects:
v Concurrency usually improves, meaning better response times and higher
throughput rates for many users.
v Processing time and use of storage increases. That is especially evident in batch
processes that scan or update a large number of rows.
Using only table or table space locks has the following effects:
v Processing time and storage usage is reduced.
v Concurrency can be reduced, meaning longer response times for some users but
better throughput for one user.
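For example, you can control the lock size for a table space with the LOCKSIZE
clause. This sketch assumes a hypothetical table space DB1.TS1:
ALTER TABLESPACE DB1.TS1 LOCKSIZE PAGE;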
Lock sizes and table space type
DB2 uses different lock sizes depending on the type of table spaces where the locks
are acquired.
Partitioned and universal table space
In a partitioned table space or universal table space, locks are obtained at
the partition level. Individual partitions are locked as they are accessed.
Gross locks (S, U, or X) can be obtained on individual partitions instead of
on the entire partitioned table space. (Partition locks are always acquired
with the LOCKPART NO clause. For table spaces that are defined with the
LOCKPART NO option, DB2 no longer locks the entire table space with
one lock when any partition of the table space is accessed.)
Restriction: If one of the following conditions is true, DB2 must lock all
partitions:
v The plan is bound with ACQUIRE(ALLOCATE).
v The table space is defined with LOCKSIZE TABLESPACE.
v The LOCK TABLE statement is used without the PART option.
Segmented table space
In a segmented table space without partitions, rows from different tables
are contained in different pages. Locking a page does not lock data from
more than one table. Also, DB2 can acquire a table lock, which locks only
the data from one specific table. Because a single row, of course, contains
data from only one table, the effect of a row lock is the same as for a
simple or partitioned table space: it locks one row of data from one table.
Simple table space
DB2 no longer supports the creation of simple table spaces. However, an
existing simple table space can contain more than one table. A lock on the
table space locks all the data in every table. A single page of the table
space can contain rows from every table. A lock on a page locks every row
in the page, no matter what tables the data belongs to. Thus, a lock needed
to access data from one table can make data from other tables temporarily
unavailable. That effect can be partly undone by using row locks instead of
page locks.
LOB table space
In a LOB table space, pages are not locked. Because the concept of rows
does not occur in a LOB table space, rows are not locked. Instead, LOBs
are locked.
XML table space
In an XML table space, XML locks are acquired.
Example: simple versus segmented table spaces
Suppose that tables T1 and T2 reside in table space TS1. In a simple table space, a
single page can contain rows from both T1 and T2. If User 1 and User 2 acquire
incompatible locks on different pages, such as exclusive locks for updating data,
neither can access all the rows in T1 and T2 until one of the locks is released. (User
1 and User 2 can both hold a page lock on the same page when the modes of the
locks are compatible, such as locks for reading data.)
As the figure also shows, in a segmented table space, a table lock applies only to
segments assigned to a single table. Thus, User 1 can lock all pages assigned to the
segments of T1 while User 2 locks all pages assigned to segments of T2. Similarly,
User 1 can lock a page of T1 without locking any data in T2.
[Figure: page locking for simple and segmented table spaces. In a simple table
space, a table space lock applies to every table in the table space, and a page lock
applies to data from every table on the page. In a segmented table space, a table
lock applies to only one table in the table space, and a page lock applies to data
from only one table, because each segment holds pages of a single table.]
Figure 37. Page locking for simple and segmented table spaces
PSPI
The duration of a lock
The duration of a lock is the length of time the lock is held. It varies according to
when the lock is acquired and when it is released.
Lock modes
The mode (sometimes state) of a lock tells what access to the locked object is
permitted to the lock owner and to any concurrent processes.
PSPI
When a page or row is locked, the table, partition, or table space containing it is
also locked. In that case, the table, partition, or table space lock has one of the
intent modes: IS, IX, or SIX. The modes S, U, and X of table, partition, and table
space locks are sometimes called gross modes. In the context of reading, SIX is a
gross mode lock because you don't get page or row locks; in this sense, it is like an
S lock.
Example: An SQL statement locates John Smith in a table of customer data and
changes his address. The statement locks the entire table space in mode IX and the
specific row that it changes in mode X.
PSPI
Modes of page and row locks
Modes and their effects are listed in the order of increasing control over resources.
PSPI
S (SHARE)
The lock owner and any concurrent processes can read, but not change, the
locked page or row. Concurrent processes can acquire S or U locks on the
page or row or might read data without acquiring a page or row lock.
U (UPDATE)
The lock owner can read, but not change, the locked page or row.
Concurrent processes can acquire S locks or might read data without
acquiring a page or row lock, but no concurrent process can acquire a U
lock.
U locks reduce the chance of deadlocks when the lock owner is reading a
page or row to determine whether to change it, because the owner can
start with the U lock and then promote the lock to an X lock to change the
page or row.
X (EXCLUSIVE)
The lock owner can read or change the locked page or row. A concurrent
process cannot acquire S, U, or X locks on the page or row; however, a
concurrent process, such as those bound with the CURRENTDATA(NO) or
ISO(UR) options or running with YES specified for the EVALUNC
subsystem parameter, can read the data without acquiring a page or row
lock. PSPI
Modes of table, partition, and table space locks
Modes and their effects are listed in the order of increasing control over resources.
PSPI
IS (INTENT SHARE)
The lock owner can read data in the table, partition, or table space, but not
change it. Concurrent processes can both read and change the data. The
lock owner might acquire a page or row lock on any data it reads.
IX (INTENT EXCLUSIVE)
The lock owner and concurrent processes can read and change data in the
table, partition, or table space. The lock owner might acquire a page or row
lock on any data it reads; it must acquire one on any data it changes.
S (SHARE)
The lock owner and any concurrent processes can read, but not change,
data in the table, partition, or table space. The lock owner does not need
page or row locks on data it reads.
U (UPDATE)
The lock owner can read, but not change, the locked data; however, the
owner can promote the lock to an X lock and then can change the data.
Processes concurrent with the U lock can acquire S locks and read the data,
but no concurrent process can acquire a U lock. The lock owner does not
need page or row locks.
U locks reduce the chance of deadlocks when the lock owner is reading
data to determine whether to change it. U locks are acquired on a table
space when the lock size is TABLESPACE and the statement is a SELECT
with a FOR UPDATE clause. Similarly, U locks are acquired on a table
when lock size is TABLE and the statement is a SELECT with a FOR
UPDATE clause.
SIX (SHARE with INTENT EXCLUSIVE)
The lock owner can read and change data in the table, partition, or table
space. Concurrent processes can read data in the table, partition, or table
space, but not change it. Only when the lock owner changes data does it
acquire page or row locks.
X (EXCLUSIVE)
The lock owner can read or change data in the table, partition, or table
space. A concurrent process can access the data if the process runs with UR
isolation or if data in a partitioned table space is running with CS isolation
and CURRENTDATA(NO). The lock owner does not need page or row
locks. PSPI
Compatibility of lock modes
DB2 uses lock mode to determine whether one lock is compatible with another.
PSPI
The major effect of the lock mode is to determine whether one lock is
compatible with another.
Some lock modes do not shut out all other users. Assume that application process
A holds a lock on a table space that process B also wants to access. DB2 requests,
on behalf of B, a lock of some particular mode. If the mode of A's lock permits B's
request, the two locks (or modes) are said to be compatible.
If the two locks are not compatible, B cannot proceed. It must wait until A releases
its lock. (And, in fact, it must wait until all existing incompatible locks are
released.)
Compatible lock modes
Compatibility for page and row locks is easy to define. The table below shows
whether page locks of any two modes, or row locks of any two modes, are
compatible (Yes) or not (No). No question of compatibility of a page lock with a
row lock can arise, because a table space cannot use both page and row locks.
Table 77. Compatibility of page lock and row lock modes
Lock Mode   S     U     X
S           Yes   Yes   No
U           Yes   No    No
X           No    No    No
Compatibility for table space locks is slightly more complex. The following table
shows whether or not table space locks of any two modes are compatible.
Table 78. Compatibility of table and table space (or partition) lock modes
Lock Mode   IS    IX    S     U     SIX   X
IS          Yes   Yes   Yes   Yes   Yes   No
IX          Yes   Yes   No    No    No    No
S           Yes   No    Yes   Yes   No    No
U           Yes   No    Yes   No    No    No
SIX         Yes   No    No    No    No    No
X           No    No    No    No    No    No
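For example, a process can explicitly request a gross lock whose compatibility
follows the preceding table. This sketch assumes a hypothetical table T1:
LOCK TABLE T1 IN SHARE MODE;
Concurrent readers that request IS or S locks can proceed, but writers that request
IX or X locks must wait.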
PSPI
The object of a lock
The object of a lock is the resource that is being locked.
PSPI
You might have to consider locks on any of the following objects:
User data in target tables
A target table is a table that is accessed specifically in an SQL statement,
and especially one that the statement updates, either by name or through a
view. Locks on those tables are the most common concern, and the ones
over which you have most control.
User data in related tables
Operations subject to referential constraints can require locks on related
tables. For example, if you delete from a parent table, DB2 might delete
rows from the dependent table as well. In that case, DB2 locks data in the
dependent table as well as in the parent table.
Similarly, operations on rows that contain LOB or XML values might
require locks on the LOB or XML table space and possibly on LOB or XML
values within that table space.
If your application uses triggers, any triggered SQL statements can cause
additional locks to be acquired.
DB2 internal objects
You might notice the following locks on internal objects:
v Portions of the DB2 catalog. For more information, see “Contention on
the DB2 catalog.”
v The skeleton cursor table (SKCT) that represents an application plan.
v The skeleton package table (SKPT) that represents a package.
v The database descriptor (DBD) that represents a DB2 database.
PSPI
Indexes and data-only locking
No index page locks are acquired during processing. Instead, DB2 uses a technique
called data-only locking to serialize changes.
PSPI
Index page latches are acquired to serialize changes within a page and
guarantee that the page is physically consistent. Acquiring page latches ensures
that transactions accessing the same index page concurrently do not see the page
in a partially changed state.
The underlying data page or row locks are acquired to serialize the reading and
updating of index entries to ensure the data is logically consistent, meaning that
the data is committed and not subject to rollback or abort. The data locks can be
held for a long duration such as until commit. However, the page latches are only
held for a short duration while the transaction is accessing the page. Because the
index pages are not locked, hot spot insert scenarios (which involve several
transactions trying to insert different entries into the same index page at the same
time) do not cause contention problems in the index.
A query that uses index-only access might lock the data page or row, and that lock
can contend with other processes that lock the data. However, using lock
avoidance techniques can reduce the contention.
To provide better concurrency, when DB2 searches using XML values (the first key
values) in the XML index key entries, it does not acquire the index page latch and
does not lock either the base table data pages or rows, or the XML table space.
When the matched-value index key entries are found, the corresponding DOCID
values (the second key value) are retrieved. The retrieved DOCID values are used
to retrieve the base table RIDs using the DOCID index. At this time, the regular
data-only locking technique is applied on the DOCID index page and base table
data page (or row).
PSPI
Contention on the DB2 catalog
SQL data definition statements, GRANT statements, and REVOKE statements
require locks on the DB2 catalog. If different application processes are issuing these
types of statements, catalog contention can occur.
Contention within the SYSDBASE table space
PSPI
SQL statements that update the catalog table space SYSDBASE contend
with each other when those statements are on the same table space. Those
statements are:
v CREATE TABLESPACE, TABLE, and INDEX
v ALTER TABLESPACE, TABLE, and INDEX
v DROP TABLESPACE, TABLE, and INDEX
v CREATE VIEW, SYNONYM, and ALIAS
v DROP VIEW, SYNONYM, and ALIAS
v COMMENT ON and LABEL ON
v GRANT and REVOKE of table privileges
v RENAME TABLE
v RENAME INDEX
v ALTER VIEW
Recommendations:
v Reduce the concurrent use of statements that update SYSDBASE for the same
table space.
v When you alter a table or table space, quiesce other work on that object.
PSPI
Contention independent of databases:
Some processes result in contention that is independent of the database where the
resources reside.
PSPI
The following limitations on concurrency are independent of the
referenced database:
v CREATE and DROP statements for a table space or index that uses a storage
group contend significantly with other such statements.
v CREATE, ALTER, and DROP DATABASE, and GRANT and REVOKE database
privileges all contend with each other and with any other function that requires
a database privilege.
v CREATE, ALTER, and DROP STOGROUP contend with any SQL statements that
refer to a storage group and with extensions to table spaces and indexes that use
a storage group.
v GRANT and REVOKE for plan, package, system, or use privileges contend with
other GRANT and REVOKE statements for the same type of privilege and with
data definition statements that require the same type of privilege.
PSPI
Locks on the skeleton tables (SKCT and SKPT):
The skeleton table of a plan (SKCT) or package (SKPT) is locked while the plan or
package is running.
PSPI
The following operations require incompatible locks on the SKCT or SKPT,
whichever is applicable, and cannot run concurrently:
v Binding, rebinding, or freeing the plan or package
v Dropping a resource or revoking a privilege that the plan or package depends
on
v In some cases, altering a resource that the plan or package depends on
PSPI
Locks on the database descriptors (DBDs):
Whether a process locks a target DBD depends largely on whether the DBD is
already in the EDM DBD cache.
PSPI
If the DBD is not in the EDM DBD cache
Most processes acquire locks on the database descriptor table space
(DBD01). That has the effect of locking the DBD and can cause conflict
with other processes.
If the DBD is in the EDM DBD cache
The lock on the DBD depends on the type of process, as shown in the
following table:
Table 79. Contention for locks on a DBD in the EDM DBD cache

  Process type   Process                                  Lock acquired   Conflicts with process type
  1              Static DML statements (SELECT, DELETE,   None            None
                 INSERT, UPDATE)1
  2              Dynamic DML statements2                  S               3
  3              Data definition statements (ALTER,       X               2, 3, 4
                 CREATE, DROP)
  4              Utilities                                S               3
Notes:
1. Static DML statements can conflict with other processes because of
locks on data.
2. If caching of dynamic SQL is turned on, no lock is acquired on the DBD
when a statement is prepared, either for insertion into the cache or for a
statement that is already in the cache.
PSPI
How DB2 chooses lock types
Different types of SQL data manipulation statements acquire locks on target tables.
PSPI
The lock that is acquired because of an SQL statement is not always constant
throughout the time of execution. In certain situations, DB2 can change
acquired locks during execution. Many other processes and operations also acquire locks. PSPI
Locks acquired for SQL statements
When SQL statements access or modify data, locks must be acquired to prevent
other applications from accessing data that has been changed but not committed.
How and when the locks are acquired for a particular SQL statement depend on
the type of processing, the access method, and attributes of the application and
target tables.
PSPI
The following tables show the locks that certain SQL processes acquire
and the modes of those locks. Whether locks are acquired at all and the mode of
those locks depend on the following factors:
v The type of processing being performed
v The value of LOCKSIZE for the target table
v The isolation level of the plan, package, or statement
v The method of access to data
v Whether the application uses the SKIP LOCKED DATA option
Example SQL statement
The following SQL statement and sample steps provide a way to understand the
following tables.
EXEC SQL DELETE FROM DSN8910.EMP WHERE CURRENT OF C1;
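For context, this positioned DELETE assumes a cursor declared along the
following lines (a minimal sketch; the predicate is illustrative):
EXEC SQL DECLARE C1 CURSOR FOR
  SELECT EMPNO, LASTNAME FROM DSN8910.EMP
  WHERE WORKDEPT = 'D11'
  FOR UPDATE;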
Use the following sample steps to understand the table:
1. Find the portion of the table that describes DELETE operations using a cursor.
2. Find the row for the appropriate values of LOCKSIZE and ISOLATION. The
table space that contains DSN8910.EMP is defined with LOCKSIZE ANY. The
default value of ISOLATION is CS with CURRENTDATA(NO).
3. Find the sub-row for the expected access method. The operation probably uses
the index on employee number. Because the operation deletes a row, it must
update the index. Hence, you can read the locks acquired in the sub-row for
“Index, updated”:
v An IX lock on the table space
v An IX lock on the table (but see the step that follows)
v An X lock on the page containing the row that is deleted
4. Check the notes to the entries you use, at the end of the table. For this sample
operation, see:
v Note 2, on the column heading for “Table”. If the table is not segmented, or
if the table is segmented and partitioned, no separate lock is taken on the
table.
v Note 3, on the column heading for “Data Page or Row”. Because LOCKSIZE
for the table space is ANY, DB2 can choose whether to use page locks, table
locks, or table space locks. Typically it chooses page locks.
SELECT with read-only or ambiguous cursor, or with no cursor
The following table shows locks that are acquired during the processing of
SELECT with read-only or ambiguous cursor, or with no cursor SQL statements.
UR isolation is allowed and requires none of these locks.
Table 80. Locks acquired for SQL statements SELECT with read-only or ambiguous cursor

  LOCKSIZE      ISOLATION   Access method1      Lock mode      Lock mode        Lock mode data
                                                table space9   table2           page or row3
  TABLESPACE    CS RS RR    Any                 S              n/a              n/a
  TABLE2        CS RS RR    Any                 IS             S                n/a
  PAGE, ROW,    CS          Index, any use      IS4,10         IS4              S5
  or ANY                    Table space scan    IS4,11         IS4              S5
  PAGE, ROW,    RS          Index, any use      IS4,10         IS4              S5, U11, or X11
  or ANY                    Table space scan    IS4,10         IS4              S5, U11, or X11
  PAGE, ROW,    RR          Index/data probe    IS4            IS4              S5, U11, or X11
  or ANY                    Index scan6         IS4 or S       S, IS4, or n/a   S5, U11, X11, or n/a
                            Table space scan6   IS2 or S       S or n/a         n/a
INSERT, VALUES(...), or INSERT fullselect7
The following table shows locks that are acquired during the processing of
INSERT, VALUES(...), or INSERT fullselect SQL statements.
Table 81. Locks acquired for SQL statements INSERT ... VALUES(...) or INSERT ... fullselect

  LOCKSIZE      ISOLATION   Access method1   Lock mode      Lock mode   Lock mode data
                                             table space9   table2      page or row3
  TABLESPACE    CS RS RR    Any              X              n/a         n/a
  TABLE2        CS RS RR    Any              IX             X           n/a
  PAGE, ROW,    CS RS RR    Any              IX             IX          X
  or ANY
UPDATE or DELETE without cursor
The following table shows locks that are acquired during the processing of
UPDATE or DELETE without cursor SQL statements. Data page and row locks
apply only to selected data.
Table 82. Locks acquired for SQL statements UPDATE or DELETE without cursor

  LOCKSIZE      ISOLATION   Access method1         Lock mode      Lock mode   Lock mode data
                                                   table space9   table2      page or row3
  TABLESPACE    CS RS RR    Any                    X              n/a         n/a
  TABLE2        CS RS RR    Any                    IX             X           n/a
  PAGE, ROW,    CS          Index selection        IX             IX          For delete: X
  or ANY                                                                      For update: U→X
                            Index/data selection   IX             IX          U→X
                            Table space scan       IX             IX          U→X
  PAGE, ROW,    RS          Index selection        IX             IX          For update: S or U8→X
  or ANY                                                                      For delete: [S→X] or X
                            Index/data selection   IX             IX          S or U8→X
                            Table space scan       IX             IX          S or U8→X
  PAGE, ROW,    RR          Index selection        IX             IX          For update: [S or U8→X] or X
  or ANY                                                                      For delete: [S→X] or X
                            Index/data selection   IX             IX          S or U8→X
                            Table space scan       IX2 or X       X or n/a    n/a
SELECT with FOR UPDATE OF
The following table shows locks that are acquired during the processing of
SELECT with FOR UPDATE OF SQL statements. Data page and row locks apply
only to selected data.
Table 83. Locks acquired for SQL statements SELECT with FOR UPDATE OF

  LOCKSIZE      ISOLATION   Access method1      Lock mode      Lock mode       Lock mode data
                                                table space9   table2          page or row3
  TABLESPACE    CS RS RR    Any                 S or U12       n/a             n/a
  TABLE2        CS RS RR    Any                 IS or IX       U               n/a
  PAGE, ROW,    CS          Index, any use      IX             IX              U
  or ANY                    Table space scan    IX             IX              U
  PAGE, ROW,    RS          Index, any use      IX             IX              S, U, or X8
  or ANY                    Table space scan    IX             IX              S, U, or X8
  PAGE, ROW,    RR          Index/data probe    IX             IX              S, U, or X8
  or ANY                    Index scan6         IX or X        X, IX, or n/a   S, U, X8, or n/a
                            Table space scan6   IX2 or X       X or n/a        S, U, X8, or n/a
UPDATE or DELETE with cursor
The following table shows locks that are acquired during the processing of
UPDATE or DELETE with cursor SQL statements.
Table 84. Locks acquired for SQL statements UPDATE or DELETE with cursor

  LOCKSIZE      ISOLATION       Access method1      Lock mode      Lock mode   Lock mode data
                                                    table space9   table2      page or row3
  TABLESPACE    Any             Any                 X              n/a         n/a
  TABLE2        Any             Any                 IX             X           n/a
  PAGE, ROW,    CS, RS, or RR   Index, updated      IX             IX          X
  or ANY                        Index not updated   IX             IX          X
Mass delete or TRUNCATE
Lock modes for TRUNCATE depend solely on the type of table space, regardless
of LOCKSIZE or isolation level:
Simple table space
Locks the table space with an X lock
Segmented table space (not partitioned)
Locks the table with an X lock and locks the table space with an IX lock
Partitioned table space (including segmented)
Locks each partition with an X lock
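For example, each of the following statements is a mass delete (the sample
DEPT table is used for illustration):
EXEC SQL DELETE FROM DSN8910.DEPT;
EXEC SQL TRUNCATE TABLE DSN8910.DEPT;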
Notes for this topic
1. All access methods are either scan-based or probe-based. Scan-based means
that the index or table space is scanned for successive entries or rows.
Probe-based means that the index is searched for a specific entry, as
opposed to the range of entries that a scan examines. ROWIDs provide data
probes to look for a single data row directly. The type of lock used
depends on the underlying access method. Access methods might be
index-only, data-only, or index-to-data.
Index-only
The index alone identifies both the qualifying rows and the returned
data.
Data-only
The data alone identifies both the qualifying rows and the returned
data, such as in a table space scan or the use of a ROWID for a probe.
Index-to-data
The index, or the index plus the data, is used to evaluate the
predicate:
Index selection
The index is used to evaluate the predicate, and the data is
used to return values.
Index/data selection
The index and the data are used to evaluate the predicate, and
the data is used to return values.
2. Used only for segmented table spaces that are not partitioned.
3. These locks are taken on pages if LOCKSIZE is PAGE or on rows if
LOCKSIZE is ROW. When the maximum number of locks per table space
(LOCKMAX) is reached, locks escalate to a table lock for tables in a
segmented table space without partitions, or to a table space lock for tables in
a non-segmented table space. Using LOCKMAX 0 in CREATE or ALTER
TABLESPACE disables lock escalation.
4. If the table or table space is started for read-only access, DB2 attempts to
acquire an S lock. If an incompatible lock already exists, DB2 acquires the IS
lock.
5. SELECT statements that do not use a cursor, or that use read-only or
ambiguous cursors and are bound with CURRENTDATA(NO), might not
require any lock if DB2 can determine that the data to be read is committed.
This is known as lock avoidance. If your application can tolerate incomplete or
inconsistent results, you can also specify the SKIP LOCKED DATA option in
your query to avoid lock wait times.
6. Even if LOCKMAX is 0, the bind process can promote the lock size to TABLE
or TABLESPACE. If that occurs, SQLCODE +806 is issued.
7. The locks listed are acquired on the object into which the insert is made. A
subselect acquires additional locks on the objects it reads, as if for SELECT
with read-only cursor or ambiguous cursor, or with no cursor.
8. An installation option determines whether the lock is S, U, or X. If you use
the WITH clause to specify the isolation as RR or RS, you can use the USE
AND KEEP UPDATE LOCKS option to obtain and hold a U lock instead of an
S lock, or you can use the USE AND KEEP EXCLUSIVE LOCKS option to
obtain and hold an X lock instead of an S lock.
9. Includes partition locks, and does not include LOB table space locks.
10. If the table space is partitioned, locks can be avoided on the partitions.
11. If you use the WITH clause to specify the isolation as RR or RS, you can use
the USE AND KEEP UPDATE LOCKS option to obtain and hold a U lock
instead of an S lock, or you can use the USE AND KEEP EXCLUSIVE LOCKS
option to obtain and hold an X lock instead of an S lock (see the sketch
that follows these notes).
12. The type of lock that is acquired for isolation levels RS and RR depends on
the setting of the RRULOCK subsystem parameter. If RRULOCK=YES, an S
lock is acquired for isolation levels RR and RS. Otherwise, a U lock is
acquired. PSPI
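As a sketch of the option that notes 8, 11, and 12 describe, the following
cursor (the table and predicate are illustrative) specifies RS isolation at
the statement level and holds U locks, rather than S locks, on the qualifying
rows:
EXEC SQL DECLARE C2 CURSOR FOR
  SELECT SALARY FROM DSN8910.EMP
  WHERE WORKDEPT = 'D11'
  WITH RS USE AND KEEP UPDATE LOCKS;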
Lock promotion
Lock promotion is the action of exchanging one lock on a resource for a more
restrictive lock on the same resource, held by the same application process.
PSPI
Example
An application reads data, which requires an IS lock on a table space. Based on
further calculation, the application updates the same data, which requires an IX
lock on the table space. The application is said to promote the table space lock from
mode IS to mode IX.
Effects
When promoting the lock, DB2 first waits until any incompatible locks held by
other processes are released. When locks are promoted, they are promoted in the
direction of increasing control over resources: from IS to IX, S, or X; from IX to SIX
or X; from S to X; from U to X; and from SIX to X.
PSPI
Lock escalation
Lock escalation is the act of releasing a large number of page, row, LOB, or XML
locks, held by an application process on a single table or table space, to acquire a
table or table space lock, or a set of partition locks, of mode S or X instead.
PSPI
When lock escalation occurs, DB2 issues message DSNI031I, which identifies
the table space for which lock escalation occurred and includes information
to help you identify the plan or package that was running when the escalation
occurred.
Lock counts are always kept on a table or table space level. For an application
process that is accessing LOBs or XML, the LOB or XML lock count on the LOB or
XML table space is maintained separately from the base table space, and lock
escalation occurs separately from the base table space.
When escalation occurs for a partitioned table space, only partitions that are
currently locked are escalated. Unlocked partitions remain unlocked. After lock
escalation occurs, any unlocked partitions that are subsequently accessed are
locked with a gross lock.
For an application process that is using Sysplex query parallelism, the lock count is
maintained on a member basis, not globally across the group for the process. Thus,
escalation on a table space or table by one member does not cause escalation on
other members.
Example lock escalation
Assume that a segmented table space without partitions is defined with LOCKSIZE
ANY and LOCKMAX 2000. DB2 can use page locks for a process that accesses a
table in the table space and can escalate those locks. If the process attempts to lock
more than 2000 pages in the table at one time, DB2 promotes its intent lock on the
table to mode S or X and then releases its page locks.
If the process is using Sysplex query parallelism and a table space that it accesses
has a LOCKMAX value of 2000, lock escalation occurs for a member only if more
than 2000 locks are acquired for that member.
When lock escalation occurs
Lock escalation balances concurrency with performance by using page or row locks
while a process accesses relatively few pages or rows, and then changing to table
space, table, or partition locks when the process accesses many. When it occurs,
lock escalation varies by table space, depending on the values of LOCKSIZE and
LOCKMAX. Lock escalation is suspended during the execution of SQL statements
for ALTER, CREATE, DROP, GRANT, and REVOKE.
Recommendations
The DB2 statistics and performance traces can tell you how often lock escalation
has occurred and whether it has caused timeouts or deadlocks. As a rough
estimate, if one quarter of your lock escalations cause timeouts or deadlocks, then
escalation is not effective for you. You might alter the table space to
increase LOCKMAX and thus decrease the number of escalations.
Alternatively, if lock escalation is a problem, use LOCKMAX 0 to disable lock
escalation.
Example
Assume that a table space is used by transactions that require high concurrency
and that a batch job updates almost every page in the table space. For high
concurrency, you should probably create the table space with LOCKSIZE PAGE
and make the batch job commit every few seconds.
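In DDL terms, those recommendations might look like the following sketch (the
database and table space names and the LOCKMAX values are illustrative):
-- Favor concurrency: page locks, with lock escalation disabled
CREATE TABLESPACE TS1 IN DB1
  LOCKSIZE PAGE LOCKMAX 0;
-- Alternatively, raise LOCKMAX so that escalation rarely occurs
ALTER TABLESPACE DB1.TS1 LOCKMAX 10000;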
LOCKSIZE ANY
LOCKSIZE ANY is a possible choice, if you take other steps to avoid lock
escalation. If you use LOCKSIZE ANY, specify a LOCKMAX value large enough so
that locks held by transactions are not normally escalated. Also, LOCKS PER USER
must be large enough so that transactions do not reach that limit.
If the batch job is:
Concurrent with transactions
It must use page or row locks and commit frequently: for example, every
100 updates. Review LOCKS PER USER to avoid exceeding the limit. The
page or row locking uses significant processing time. Binding with
ISOLATION(CS) might discourage lock escalation to an X table space lock
for those applications that read a lot and update occasionally. However,
this might not prevent lock escalation for those applications that are
update intensive.
Non-concurrent with transactions
It need not use page or row locks. The application could explicitly lock the
table in exclusive mode. PSPI
Modes of transaction locks for various processes
DB2 uses different lock modes for different types of processes.
PSPI
The rows in the following table show a sample of several types of DB2
processes. The columns show the most restrictive mode of locks used for different
objects and the possible conflicts between application processes.
Table 85. Modes of DB2 transaction locks

  Process                          Catalog table   Skeleton tables   Database descriptor   Target table
                                   spaces          (SKCT and SKPT)   (DBD) (1)             space (2)
  Transaction with static SQL      IS (3)          S                 n/a (4)               Any (5)
  Query with dynamic SQL           IS (6)          S                 S                     Any (5)
  BIND process                     IX              X                 S                     n/a
  SQL CREATE TABLE statement       IX              n/a               X                     n/a
  SQL ALTER TABLE statement        IX              X (7)             X                     n/a
  SQL ALTER TABLESPACE statement   IX              X (9)             X                     n/a
  SQL DROP TABLESPACE statement    IX              X (8)             X                     n/a
  SQL GRANT statement              IX              n/a               n/a                   n/a
  SQL REVOKE statement             IX              X (8)             n/a                   n/a
Notes:
1. In a lock trace, these locks usually appear as locks on the DBD.
2. The target table space is one of the following table spaces:
v Accessed and locked by an application process
v Processed by a utility
v Designated in the data definition statement
3. The lock is held briefly to check EXECUTE authority.
4. If the required DBD is not already in the EDM DBD cache, locks are acquired
on table space DBD01, which effectively locks the DBD.
5. Except while checking EXECUTE authority, IS locks on catalog tables are held
until a commit point.
6. The plan or package that uses the SKCT or SKPT is marked invalid if a
referential constraint (such as a new primary key or foreign key) is added or
changed, or the AUDIT attribute is added or changed for a table.
7. The plan or package using the SKCT or SKPT is marked invalid as a result of
this operation.
8. These locks are not held when ALTER TABLESPACE is changing the following
options: PRIQTY, SECQTY, PCTFREE, FREEPAGE, CLOSE, and ERASE. PSPI
Related reference
“Locks acquired for SQL statements”
Options for tuning locks
The following options affect how DB2 uses transaction locks.
IRLM startup procedure options
You can control how DB2 uses locks by specifying certain options when you start
the internal resource lock manager (IRLM).
About this task
PSPI
When you issue the z/OS START irlmproc command, the values of the
options are passed to the startup procedure for the DB2 IRLM. (If an option is not
explicitly specified on the command, the value of its corresponding installation
parameter is used.)
The options that are relevant to DB2 locking are:
SCOPE
Whether IRLM is used for data sharing (GLOBAL) or not (LOCAL). Use
LOCAL unless you are using data sharing. If you use data sharing, specify
GLOBAL.
DEADLOK
The two values of this option specify:
1. The number of seconds between two successive scans for a local
deadlock
2. The number of local scans that occur before a scan for global deadlock
starts
PC
Ignored by IRLM. However, PC is positional and must be maintained in
the IRLM for compatibility.
MAXCSA
Ignored by IRLM. However, MAXCSA is positional and must be
maintained in the IRLM for compatibility.
The maximum amount of storage available for IRLM locks is limited to 90% of the
total space given to the IRLM private address space during the startup procedure.
The other 10% is reserved for IRLM system services, z/OS system services, and
“must complete” processes to prevent the IRLM address space from abending,
which would bring down your DB2 system. When the storage limit is reached,
lock requests are rejected with an out-of-storage reason code.
You can use the F irlmproc,STATUS,STOR command to monitor the amount of
storage that is available for locks and the MODIFY irlmproc,SET command to
dynamically change the maximum amount of IRLM private storage to use for
locks.
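For example, if your IRLM started task is named irlmproc, commands like the
following display the storage that is used for locks and adjust the limit.
(The PVT operand and its value are an assumption for illustration; check the
IRLM command documentation for the operands that your IRLM level supports.)
F irlmproc,STATUS,STOR
F irlmproc,SET,PVT=1000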
PSPI
Estimating the storage needed for locks
An estimate of the amount of storage needed for locks is calculated when DB2 is
installed.
Procedure
To estimate the storage that is required for DB2 locks:
1. Gather the following information:
v The maximum number of row or pages locks per user (the LOCKS PER
USER field on the DSNTIPJ installation panel).
v The maximum number of allied threads that can be active concurrently
(MAX USERS field on the DSNTIPE installation panel).
v The maximum number of database access threads that can be active
concurrently (the MAX REMOTE ACTIVE field on the DSNTIPE installation
panel).
2. Use the following formula, which assumes that each lock needs 540 bytes of
storage:
Bytes of Storage = (MAX USERS + MAX REMOTE ACTIVE) × LOCKS PER USER × 540
The result is a high-end estimate of the storage space that is needed for locks
because the formula assumes that the maximum number of users are
connected, and each user holds the maximum number of locks.
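For example, assuming hypothetical installation values of MAX USERS = 70,
MAX REMOTE ACTIVE = 64, and LOCKS PER USER = 10000, the estimate is:
Bytes of Storage = (70 + 64) × 10000 × 540 = 723 600 000 bytes (roughly 690 MB)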
Setting installation options for wait times
These options determine how long it takes DB2 to identify that a process must be
timed out or is deadlocked. They affect locking in your entire DB2 subsystem.
Specifying the interval for detecting deadlocks
You can specify the interval at which DB2 scans for deadlocked processes.
About this task
Deadlock detection can cause latch suspensions.
Procedure
To specify the interval for detecting deadlocks:
Specify a value in seconds for the DEADLOCK TIME field on installation panel
DSNTIPJ.
v For systems in which deadlocking is not a problem, have deadlock detection run
less frequently for the best performance and concurrency (but do not choose a
value greater than 5 seconds).
v If your system is prone to deadlocks, you want those detected as quickly as
possible. In that case, choose 1.
The default value of the DEADLOCK TIME field is 1 second.
Specifying the amount of inactive time before a timeout
You can specify how long your system waits for suspended processes.
Procedure
To specify the minimum number of seconds before a timeout can occur:
Specify a value for the IRLMRWT subsystem parameter (the RESOURCE
TIMEOUT field on installation panel DSNTIPI). A small value can cause a large
number of timeouts. With a larger value, suspended processes more often resume
normally, but they remain inactive for longer periods. The default value is 60
seconds.
v If you can allow a suspended process to remain inactive for 60 seconds, use the
defaults for both RESOURCE TIMEOUT and DEADLOCK TIME.
v If you specify a different inactive period, consider how DB2 calculates
the wait time for timeouts.
How DB2 calculates the wait time for timeouts
When a process requests a transaction lock that is unavailable, it waits for some
period of time. DB2 determines the appropriate wait time by multiplying a timeout
period by a multiplier based on the type of process.
The timeout period
PSPI
DB2 calculates a timeout period from the values of the RESOURCE TIMEOUT and
DEADLOCK TIME options.
For example, assume that the value of the DEADLOCK TIME option is 5 and the
value of the RESOURCE TIMEOUT option is 18. You can use the following
calculations to see how DB2 calculates a timeout period.
1. Divide RESOURCE TIMEOUT by DEADLOCK TIME (18/5 = 3.6). IRLM limits
the result of this division to 255.
2. Round the result to the next largest integer (Round up 3.6 to 4).
3. Multiply the DEADLOCK TIME by that integer (4 * 5 = 20).
The result, the timeout period (20 seconds), is always at least as large as the value
of RESOURCE TIMEOUT (18 seconds), except when the RESOURCE TIMEOUT
divided by DEADLOCK TIME exceeds 255.
The timeout multiplier
Requests from different types of processes wait for different multiples of the
timeout period according to the timeout multiplier. In a data sharing environment,
you can add another multiplier to those processes to wait for retained locks.
In some cases, you can modify the multiplier value. The following table indicates
the multiplier value by type of process, and whether you can change it.
Table 86. Timeout multiplier by type

  Type                                               Multiplier1   Modifiable?
  IMS MPP, IMS Fast Path Message Processing, CICS,   1             No
  DB2 QMF, CAF, TSO batch and online, RRSAF,
  global transactions
  IMS BMPs                                           4             Yes
  IMS DL/I batch                                     6             Yes
  IMS Fast Path Non-message processing               6             No
  BIND subcommand processing                         3             No
  STOP DATABASE command processing                   10            No
  Utilities                                          6             Yes
  Retained locks for all types                       0             Yes
Note:
1. If the transaction occurs on a table space that is not logged, the timeout
multiplier is either three or the current timeout multiplier for the thread,
whichever is greater.
Changing the multiplier for IMS BMP and DL/I batch
You can modify the multipliers for IMS BMP and DL/I batch by modifying the
following subsystem parameters on installation panel DSNTIPI:
IMS BMP TIMEOUT
The timeout multiplier for IMS BMP connections. A value from 1 to 254 is
acceptable. The default is 4.
DL/I BATCH TIMEOUT
The timeout multiplier for IMS DL/I batch connections. A value from 1 to
254 is acceptable. The default is 6.
Additional multiplier for retained lock
For data sharing, you can specify an additional timeout multiplier to be applied to
the connection's normal timeout multiplier. This multiplier is used when the
connection is waiting for a retained lock, which is a lock held by a failed member
of a data sharing group. A value of zero means that the process does not wait for retained locks.
The scanning schedule
The following figure illustrates an example of scanning to detect a timeout:
v DEADLOCK TIME is set to 5 seconds.
v RESOURCE TIMEOUT was chosen to be 18 seconds. Therefore, the timeout
period is 20 seconds.
v A bind operation starts 4 seconds before the next scan. The operation multiplier
for a bind operation is 3.
In this example, BIND starts at 0 seconds; the deadlock time is 5 seconds, the
resource timeout is 18 seconds, and the timeout period is therefore 20
seconds. Deadlock scans occur every 5 seconds along the time line (at 4, 9,
14, ... 69 seconds); after a grace period of one deadlock-time interval, the
BIND operation waits three timeout periods and times out at an elapsed time
of 69 seconds.
Figure 38. An example of scanning for timeout
The scans proceed through the following steps:
1. A scan starts 4 seconds after the bind operation requests a lock. As determined
by the DEADLOCK TIME, scans occur every 5 seconds. The first scan in the
example detects that the operation is inactive.
2. IRLM allows at least one full interval of DEADLOCK TIME as a “grace period”
for an inactive process. After that, its lock request is judged to be waiting. At 9
seconds, the second scan detects that the bind operation is waiting.
3. The bind operation continues to wait for a multiple of the timeout period. In
the example, the multiplier is 3 and the timeout period is 20 seconds. The bind
operation continues to wait for 60 seconds longer.
4. The scan that starts 69 seconds after the bind operation detects that the process
has timed out.
Consequently, an operation can remain inactive for longer than the value of
RESOURCE TIMEOUT.
If you are in a data sharing environment, the deadlock and timeout detection
process is longer than that for non-data-sharing systems.
You should carefully consider the length of inactive time when choosing your
own values of DEADLOCK TIME and RESOURCE TIMEOUT.
PSPI
Specifying how long an idle thread can use resources
You can specify a limit for the amount of time that active distributed threads can
use resources without doing any processing.
Procedure
To limit the amount of time that distributed threads can remain idle:
Specify a value other than 0 for the IDTHTOIN subsystem parameter (the IDLE
THREAD TIMEOUT field on installation panel DSNTIPR). DB2 detects threads that
have been idle for the specified period and cancels them. Because the
scan occurs only at 2-minute intervals, your idle threads generally remain idle for
somewhat longer than the value you specify.
The cancellation applies only to active threads. If your installation permits
distributed threads to be inactive and hold no resources, those threads are allowed
to remain idle indefinitely.
The default value is 0. That value disables the scan to time out idle threads. The
threads can then remain idle indefinitely.
Specifying how long utilities wait for resources
You can specify how long DB2 waits before timing out utilities that wait for locks.
Procedure
To specify the operation multiplier for utilities that wait for drain locks, transaction
locks, or claims to be released:
Specify the value of the UTMOUT subsystem parameter (the UTILITY TIMEOUT
field on installation panel DSNTIPI). The default value is 6. With the default
value, a utility generally waits longer for a resource than an SQL application does.
Calculating wait times for drains
You can calculate how long DB2 waits for drains.
About this task
PSPI
A process that requests a drain might wait for two events:
Acquiring the drain lock.
If another user holds the needed drain lock in an incompatible lock mode,
then the drainer waits.
Releasing all claims on the object.
Even after the drain lock is acquired, the drainer waits until all claims are
released before beginning to process.
If the process drains more than one claim class, it must wait for those events to
occur for each claim class that it drains.
Procedure
To calculate the maximum amount of wait time:
1. Determine the wait time for a drain lock and the wait time for claim
release. Both wait times are based on the timeout period that is calculated
by DB2. For the REORG, REBUILD INDEX, CHECK DATA, and CHECK LOB utilities
with the SHRLEVEL CHANGE option, you can use utility parameters to specify
the wait time for a drain lock and to indicate whether additional attempts
should be made to acquire the drain lock.
Drainer
Each wait time is:
Utility
(timeout period) × (value of UTILITY TIMEOUT)
Other process
timeout period
2. Add the wait time for a drain lock and the wait time for claim release.
3. Multiply the result by the number of claim classes drained.
Example
Maximum wait time: Because the maximum wait time for a drain lock is the same
as the maximum wait time for releasing claims, you can calculate the total
maximum wait time as follows:
For utilities
2 × (timeout period) × (UTILITY TIMEOUT) × (number of claim classes)
For other processes
2 × (timeout period) × (operation multiplier) × (number of claim classes)
For example, suppose that LOAD must drain 3 claim classes, that the timeout
period is 20 seconds, and that the value of UTILITY TIMEOUT is 6. Use the
following calculation to determine how long the LOAD utility might be
suspended before being timed out:
Maximum wait time = 2 × 20 × 6 × 3 = 720 seconds
Wait times less than maximum: The maximum drain wait time is the longest
possible time that a drainer can wait for a drain, not the length of time that
it always waits. For example, the following table lists the steps that LOAD
takes to drain the table space and the maximum amount of wait time for each
step. A timeout can occur at any step. At step 1, the utility can wait 120
seconds for the repeatable read drain lock. If that lock is not available by
then, the utility times out after 120 seconds. It does not wait 720 seconds.
Table 87. Maximum drain wait times: LOAD utility

  Step                                           Maximum wait time (seconds)
  1. Get repeatable read drain lock              120
  2. Wait for all RR claims to be released       120
  3. Get cursor stability read drain lock        120
  4. Wait for all CS claims to be released       120
  5. Get write drain lock                        120
  6. Wait for all write claims to be released    120
  Total                                          720
PSPI
Bind options for locks
Certain BIND options determine when an application process acquires and releases
its locks and how it isolates its actions from effects of concurrent processes.
Choosing ACQUIRE and RELEASE options
The ACQUIRE and RELEASE options of bind determine when DB2 locks an object
(table, partition, or table space) your application uses and when it releases the lock.
(The ACQUIRE and RELEASE options do not affect page, row, LOB, or XML
locks.)
About this task
PSPI
The options apply to static SQL statements, which are bound before your
program executes. If your program executes dynamic SQL statements, the objects
that they lock are locked when first accessed and released at the next commit
point, though some locks acquired for dynamic SQL might be held past commit points.
The ACQUIRE and RELEASE options are:
ACQUIRE(ALLOCATE)
Acquires the lock when the object is allocated. This option is not allowed
for BIND or REBIND PACKAGE.
ACQUIRE(USE)
Acquires the lock when the object is first accessed.
RELEASE(DEALLOCATE)
Releases the lock when the object is deallocated (the application ends). The
value has no effect on dynamic SQL statements, which always use
RELEASE(COMMIT), unless you are using dynamic statement caching. The
value also has no effect on packages that are executed on a DB2 server
through a DRDA connection with the client system.
RELEASE(COMMIT)
Releases the lock at the next commit point, unless cursors are held. If the
application accesses the object again, it must acquire the lock again.
The default options for ACQUIRE and RELEASE depend on the type of bind
option as shown in the following table.
Table 88. Default ACQUIRE and RELEASE values for different bind options

  Operation                Default values
  BIND PLAN                ACQUIRE(USE) and RELEASE(COMMIT)
  BIND PACKAGE             No option exists for ACQUIRE; ACQUIRE(USE) is always
                           used. At the local server, the default for RELEASE is
                           the value used by the plan that includes the package in
                           its package list. At a remote server, the default is
                           COMMIT.
  REBIND PLAN or PACKAGE   The existing values for the plan or package that is
                           being rebound.
Partition locks: Partition locks follow the same rules as table space locks, and all
partitions are held for the same duration. Thus, if one package is using
RELEASE(COMMIT) and another is using RELEASE(DEALLOCATE), all partitions
use RELEASE(DEALLOCATE).
Dynamic statement caching: Generally, the RELEASE option has no effect on
dynamic SQL statements with one exception. When you use the bind options
RELEASE(DEALLOCATE) and KEEPDYNAMIC(YES), and your subsystem is
installed with YES for field CACHE DYNAMIC SQL on installation panel
DSNTIP8, DB2 retains prepared SELECT, INSERT, UPDATE, and DELETE
statements in memory past commit points. For this reason, DB2 can honor the
RELEASE(DEALLOCATE) option for these dynamic statements. The locks are held
until deallocation, or until the commit after the prepared statement is freed from
memory, in the following situations:
v The application issues a PREPARE statement with the same statement identifier.
v The statement is removed from memory because it has not been used.
v An object that the statement is dependent on is dropped or altered, or a
privilege needed by the statement is revoked.
v RUNSTATS is run against an object that the statement is dependent on.
If a lock is to be held past commit and it is an S, SIX, or X lock on a table space or
a table in a segmented table space, DB2 sometimes demotes that lock to an intent
lock (IX or IS) at commit. DB2 demotes a gross lock if it was acquired for one of
the following reasons:
v DB2 acquired the gross lock because of lock escalation.
v The application issued a LOCK TABLE.
v The application issued a mass delete (DELETE FROM object without a WHERE
clause or TRUNCATE).
Procedure
Choose a combination of values for ACQUIRE and RELEASE based on the
characteristics of the particular application.
Example
An application selects employee names and telephone numbers from a table,
according to different criteria. Employees can update their own telephone numbers.
They can perform several searches in succession. The application is bound with the
options ACQUIRE(USE) and RELEASE(DEALLOCATE), for these reasons:
v The alternative to ACQUIRE(USE), ACQUIRE(ALLOCATE), gets a lock of mode
IX on the table space as soon as the application starts, because that is needed if
an update occurs. Most uses of the application do not update the table and so
need only the less restrictive IS lock. ACQUIRE(USE) gets the IS lock when the
table is first accessed, and DB2 promotes the lock to mode IX if that is needed
later.
v Most uses of this application do not update and do not commit. For those uses,
little difference exists between RELEASE(COMMIT) and
RELEASE(DEALLOCATE). However, administrators might update several phone
numbers in one session with the application, and the application commits after
each update. In that case, RELEASE(COMMIT) releases a lock that DB2 must
acquire again immediately. RELEASE(DEALLOCATE) holds the lock until the
application ends, avoiding the processing needed to release and acquire the lock
several times. PSPI
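A bind for such an application might therefore look like the following sketch
(the plan and member names are hypothetical):
BIND PLAN(PHONEPLN) MEMBER(PHONEPGM) ACQUIRE(USE) RELEASE(DEALLOCATE) ISOLATION(CS) CURRENTDATA(NO)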
Combinations of ACQUIRE and RELEASE options:
Different combinations of bind options have advantages and disadvantages for
certain situations.
ACQUIRE(ALLOCATE) / RELEASE(DEALLOCATE)
In some cases, this combination can avoid deadlocks by locking all needed
resources as soon as the program starts to run. This combination is most useful for
a long-running application that runs for hours and accesses various tables, because
it prevents an untimely deadlock from wasting that processing.
v All tables or table spaces used in DBRMs bound directly to the plan are locked
when the plan is allocated. (LOB and XML table spaces are not locked when the
plan is allocated and are only locked when accessed.)
v All tables or table spaces are unlocked only when the plan terminates.
v The locks used are the most restrictive needed to execute all SQL statements in
the plan regardless of whether the statements are actually executed.
v Restrictive states are not checked until the page set is accessed. Locking when
the plan is allocated ensures that the job is compatible with other SQL jobs.
Waiting until the first access to check restrictive states provides greater
availability; however, it is possible that an SQL transaction could:
– Hold a lock on a table space or partition that is stopped
– Acquire a lock on a table space or partition that is started for DB2 utility
access only (ACCESS(UT))
– Acquire an exclusive lock (IX, X) on a table space or partition that is started
for read access only (ACCESS(RO)), thus prohibiting access by readers
Disadvantages: This combination reduces concurrency. It can lock resources in
high demand for longer than needed. Also, the option ACQUIRE(ALLOCATE)
turns off selective partition locking; if you are accessing a partitioned table space,
all partitions are locked.
Restriction: This combination is not allowed for BIND PACKAGE. Use this
combination if processing efficiency is more important than concurrency. It is a
good choice for batch jobs that would release table and table space locks only to
reacquire them almost immediately. It might even improve concurrency, by
allowing batch jobs to finish sooner. Generally, do not use this combination if your
application contains many SQL statements that are often not executed.
ACQUIRE(USE) / RELEASE(DEALLOCATE)
This combination results in the most efficient use of processing time in most cases.
v A table, partition, or table space used by the plan or package is locked only if it
is needed while running.
v All tables or table spaces are unlocked only when the plan terminates.
v The least restrictive lock needed to execute each SQL statement is used, with the
exception that if a more restrictive lock remains from a previous statement, that
lock is used without change.
Disadvantages: This combination can increase the frequency of deadlocks. Because
all locks are acquired in a sequence that is predictable only in an actual run, more
concurrent access delays might occur.
ACQUIRE(USE) / RELEASE(COMMIT)
This combination is the default combination and provides the greatest concurrency,
but it requires more processing time if the application commits frequently.
v A table, partition, or table space is locked only when needed. That locking is
important if the process contains many SQL statements that are rarely used or
statements that are intended to access data only in certain circumstances.
v All tables and table spaces are unlocked when:
TSO, Batch, and CAF
An SQL COMMIT or ROLLBACK statement is issued, or your
application process terminates
IMS
A CHKP or SYNC call (for single-mode transactions), a GU call to the
I/O PCB, or a ROLL or ROLB call is completed
CICS
A SYNCPOINT command is issued.
Exception:
If the cursor is defined WITH HOLD, table or table space locks necessary to
maintain cursor position are held past the commit point.
v Table, partition, or table space locks are released at the next commit point unless
the cursor is defined WITH HOLD.
v The least restrictive lock needed to execute each SQL statement is used except
when a more restrictive lock remains from a previous statement. In that case,
that lock is used without change.
Disadvantages: This combination can increase the frequency of deadlocks. Because
all locks are acquired in a sequence that is predictable only in an actual run, more
concurrent access delays might occur.
ACQUIRE(ALLOCATE) / RELEASE(COMMIT)
This combination is not allowed; it results in an error message from BIND.
Choosing an ISOLATION option
The various isolation levels offer more or less concurrency at the cost of
less or more protection from the effects of other application processes.
About this task
PSPI
The ISOLATION option of an application specifies the degree to which
operations are isolated from the possible effects of other operations acting
concurrently. Based on this information, DB2 releases S and U locks on rows or
pages as soon as possible.
Regardless of the isolation level that you specify, outstanding claims on DB2
objects can inhibit the execution of DB2 utilities or commands.
The default ISOLATION option differs for different types of bind operations, as
shown in the following table.
Table 89. The default ISOLATION values for different types of bind operations

  Operation                Default value
  BIND PLAN                ISOLATION(CS) with CURRENTDATA(NO)
  BIND PACKAGE             The value used by the plan that includes the package
                           in its package list
  REBIND PLAN or PACKAGE   The existing value for the plan or package being
                           rebound
Procedure
To ensure that your applications can access your data concurrently:
Choose an isolation level according to the needs and characteristics of the
particular application. The recommended order of preference for isolation levels is:
1. Cursor stability (CS)
2. Uncommitted read (UR)
3. Read stability (RS)
4. Repeatable read (RR)
For ISOLATION(CS), the CURRENTDATA(NO) option is preferred over
CURRENTDATA(YES).
Although uncommitted read provides the lowest level of isolation, cursor stability
is recommended in most cases because it provides a high level of concurrency,
without sacrificing data integrity.
PSPI
The ISOLATION (CS) option:
The ISOLATION (CS) or cursor stability option allows maximum concurrency with
data integrity.
However, after the process leaves a row or page, another process can
change the data. With CURRENTDATA(NO), the process does not have to leave a
row or page to allow another process to change the data. If the first process returns
to read the same row or page, the data is not necessarily the same. Consider these
consequences of that possibility:
v For table spaces created with LOCKSIZE ROW, PAGE, or ANY, a change can
occur even while executing a single SQL statement, if the statement reads the
same row more than once. In the following statement, data read by the inner
SELECT can be changed by another transaction before it is read by the outer
SELECT.
SELECT * FROM T1
WHERE C1 = (SELECT MAX(C1) FROM T1);
Therefore, the information returned by this query might be from a row that is no
longer the one with the maximum value for C1.
v In another case, if your process reads a row and returns later to update it, that
row might no longer exist or might not exist in the state that it did when your
application process originally read it. That is, another application might have
deleted or updated the row. If your application is doing non-cursor operations
on a row under the cursor, make sure that the application can tolerate “not
found” conditions.
Similarly, assume that another application updates a row after you read it. If
your process returns later to update the row based on the value that you
originally read, you are, in effect, erasing the update that the other process
made. If you use ISOLATION(CS) with update, your process might need to lock
out concurrent updates. One method is to declare a cursor with the FOR UPDATE clause.
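A minimal sketch of such a cursor (the table, column, and host variable are
illustrative):
EXEC SQL DECLARE C1 CURSOR FOR
  SELECT PHONENO FROM DSN8910.EMP
  WHERE EMPNO = :empno
  FOR UPDATE OF PHONENO;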
For packages and plans that contain updatable static scrollable cursors,
ISOLATION(CS) lets DB2 use optimistic concurrency control. DB2 can use optimistic
concurrency control to shorten the amount of time that locks are held in the
following situations:
v Between consecutive fetch operations
v Between fetch operations and subsequent positioned update or delete operations
DB2 cannot use optimistic concurrency control for dynamic scrollable cursors. With
dynamic scrollable cursors, the most recently fetched row or page from the base
table remains locked to maintain position for a positioned update or delete.
The following two figures show the processing of positioned update and delete
operations without optimistic concurrency control and with optimistic
concurrency control.
Figure 39. Positioned updates and deletes with a static non-scrollable cursor
and without optimistic concurrency control. (The figure shows a time line on
which DB2 locks row 1 when the application fetches it, releases that lock and
locks row 2 when the application fetches row 2, and keeps the lock on row 2
through the UPDATE WHERE CURRENT OF operation.)

Figure 40. Positioned updates and deletes with a static sensitive scrollable
cursor and with optimistic concurrency control.
Optimistic concurrency control consists of the following steps:
1. When the application requests a fetch operation to position the cursor on a row,
DB2 locks that row, executes the FETCH, and releases the lock.
2. When the application requests a positioned update or delete operation on the
row, DB2 performs the following steps:
a. Locks the row.
b. Reevaluates the predicate to ensure that the row still qualifies for the result
table.
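For example, an updatable static scrollable cursor of the kind that is
eligible for optimistic concurrency control might be declared as follows (a
sketch with illustrative names):
EXEC SQL DECLARE C1 SENSITIVE STATIC SCROLL CURSOR FOR
  SELECT EMPNO, SALARY FROM DSN8910.EMP
  WHERE WORKDEPT = 'D11'
  FOR UPDATE OF SALARY;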
The ISOLATION (UR) option:
The ISOLATION (UR) or uncommitted read option allows an application to read
while acquiring few locks, at the risk of reading uncommitted data. UR isolation
applies only to read-only operations: SELECT, SELECT INTO, or FETCH from a
read-only result table.
Reading uncommitted data introduces an element of uncertainty.
For example, an application tracks the movement of work from station to station
along an assembly line. As items move from one station to another, the application
subtracts from the count of items at the first station and adds to the count of items
at the second. Assume you want to query the count of items at all the stations,
while the application is running concurrently.
If your query reads data that the application has changed but has not committed:
v If the application subtracts an amount from one record before adding it to
another, the query could miss the amount entirely.
v If the application adds first and then subtracts, the query could add the amount
twice.
If those situations can occur and are unacceptable, do not use UR isolation.
Restrictions for using ISOLATION (UR)
You cannot use the ISOLATION (UR) option for the following types of statements:
v INSERT, UPDATE, DELETE, and MERGE
v SELECT FROM INSERT, UPDATE, DELETE, or MERGE.
v Any cursor defined with a FOR UPDATE clause
If you bind with ISOLATION(UR) and the statement does not specify WITH RR or
WITH RS, DB2 uses CS isolation for these types of statements.
When an application uses ISO(UR) and runs concurrently with applications that
update variable-length records such that the update creates a double-overflow
record, the ISO(UR) application might miss rows that are being updated.
When to use ISOLATION (UR)
You can probably use UR isolation in cases such as the following examples:
When errors cannot occur
The following examples describe situations in which errors can be avoided
while using the ISOLATION (UR) option.
Reference tables
For example, a table of descriptions of parts by part number. Such tables
are rarely updated, and reading an uncommitted update is
probably no more damaging than reading the table 5 seconds
earlier.
Tables with limited access
The employee table of Spiffy Computer, our hypothetical user. For
security reasons, updates can be made to the table only by
members of a single department. And that department is also the
only one that can query the entire table. It is easy to restrict queries
to times when no updates are being made and then run with UR
isolation.
When an error is acceptable
Spiffy Computer wants to do some statistical analysis on employee data. A
typical question is, “What is the average salary by sex within education
level?” Because reading an occasional uncommitted record cannot affect the
averages much, UR isolation can be used (see the sketch that follows this list).
When the data already contains inconsistent information
Spiffy Computer gets sales leads from various sources. The data is often
inconsistent or wrong, and end users of the data are accustomed to dealing
with that. Inconsistent access to a table of data on sales leads does not add
to the problem.
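The statistical query mentioned in the list above might be written with
statement-level UR isolation, as in the following sketch against the sample
EMP table:
EXEC SQL DECLARE C1 CURSOR FOR
  SELECT SEX, EDLEVEL, AVG(SALARY)
  FROM DSN8910.EMP
  GROUP BY SEX, EDLEVEL
  WITH UR;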
When not to use ISOLATION (UR)
Do not use uncommitted read, ISOLATION (UR), in the following cases:
v When computations must balance
v When the answer must be accurate
v When you are unsure whether using the ISOLATION (UR) might cause damage
The ISOLATION (RS) option:
The ISOLATION (RS) or read stability option allows the application to read the
same pages or rows more than once without allowing qualifying rows to be
updated or deleted by another process.
It offers possibly greater concurrency than repeatable read, because although other
applications cannot change rows that are returned to the original application, they
can insert new rows or update rows that did not satisfy the original search
condition of the application. Only those rows or pages that satisfy the stage 1
predicate (and all rows or pages evaluated during stage 2 processing) are locked
until the application commits. The following figure illustrates this process. In the
example, the rows held by locks L2 and L4 satisfy the predicate.
Figure 41. How an application using RS isolation acquires locks when no lock
avoidance techniques are used. (The figure shows a time line on which DB2
acquires and releases locks L1 and L3 as the application requests successive
rows; locks L2 and L4, on the rows that satisfy the predicate, are held until
the application commits.)
Applications using read stability can leave rows or pages locked for long periods,
especially in a distributed environment.
If you do use read stability, plan for frequent commit points.
An installation option determines the mode of lock chosen for a cursor defined
with the FOR UPDATE OF clause and bound with read stability.
The ISOLATION (RR) option:
The ISOLATION (RR) or repeatable read option allows the application to read the
same pages or rows more than once without allowing any update, insert, or delete
operations by another process. All accessed rows or pages are locked, even if they
do not satisfy the predicate.
Applications that use repeatable read can leave rows or pages locked for
longer periods, especially in a distributed environment, and they can claim more
logical partitions than similar applications using cursor stability.
Applications that use repeatable read and access a nonpartitioned index cannot run
concurrently with utility operations that drain all claim classes of the
nonpartitioned index, even if they are accessing different logical partitions. For
example, an application bound with ISOLATION(RR) cannot update partition 1
while the LOAD utility loads data into partition 2. Concurrency is restricted
because the utility needs to drain all the repeatable-read applications from the
nonpartitioned index to protect the repeatability of the reads by the application.
They are also subject to being drained more often by utility operations.
Because so many locks can be taken, lock escalation might take place. Frequent
commits release the locks and can help avoid lock escalation.
With repeatable read, lock promotion occurs for table space scans to prevent the
insertion of rows that might qualify for the predicate. (If access is through an
index, DB2 locks the key range. If access is through a table space scan, DB2 locks
the table, partition, or table space.)
An installation option determines the mode of lock chosen for a cursor defined
with the FOR UPDATE OF clause and bound with repeatable read.
Repeatable read and CP parallelism
For CP parallelism, locks are obtained independently by each task. This situation
can possibly increase the total number of locks taken for applications that have the
following attributes:
v Use the repeatable read isolation level.
v Use CP parallelism.
v Repeatedly access the table space using a lock mode of IS without issuing
COMMIT statements.
Repeatable read or read stability isolation cannot be used with Sysplex query
parallelism.
Choosing a CURRENTDATA option
The CURRENTDATA bind option applies differently for applications that access
local and remote data.
About this task
PSPI
Generally, CURRENTDATA (NO) increases concurrency with some risk to
the currency of data returned to the application, and CURRENTDATA (YES)
reduces concurrency but ensures greater currency of the data returned to the
application.
PSPI
CURRENTDATA for local access:
For local access, the CURRENTDATA option tells whether the data upon which
your cursor is positioned must remain identical to (or “current with”) the data in
the local base table.
For cursors positioned on data in a work file, the CURRENTDATA option has no
effect. The CURRENTDATA option applies only to read-only or ambiguous cursors
in plans or packages bound with CS isolation.
CURRENTDATA (YES)
CURRENTDATA (YES) means that the data upon which the cursor is positioned
cannot change while the cursor is positioned on it. If the cursor is positioned on
data in a local base table or index, then the data returned with the cursor is current
with the contents of that table or index. If the cursor is positioned on data in a
work file, the data returned with the cursor is current only with the contents of the
work file; it is not necessarily current with the contents of the underlying table or
index.
The following figure shows locking with CURRENTDATA (YES).
Figure 42. How an application using CS isolation with CURRENTDATA (YES) acquires locks. This figure shows access to the base table. The L2 and L4 locks are released after DB2 moves to the next row or page. When the application commits, the last lock is released.
As with work files, if a cursor uses query parallelism, data is not necessarily
current with the contents of the table or index, regardless of whether a work file is
used. Therefore, for work file access or for parallelism on read-only queries, the
CURRENTDATA option has no effect.
If you are using parallelism but want to maintain currency with the data, you have
the following options:
v Disable parallelism (Use SET DEGREE = '1' or bind with DEGREE(1)).
v Use isolation RR or RS (parallelism can still be used).
v Use the LOCK TABLE statement (parallelism can still be used).
CURRENTDATA(NO)
CURRENTDATA(NO) is similar to CURRENTDATA(YES) except for the case where a cursor is accessing
a base table rather than a result table in a work file. In those cases, although
CURRENTDATA(YES) can guarantee that the cursor and the base table are current,
CURRENTDATA(NO) makes no such guarantee.
CURRENTDATA for remote access:
For a request to a remote system, CURRENTDATA has an effect for ambiguous
cursors using isolation levels RR, RS, or CS.
For access to a remote table or index, CURRENTDATA(YES) turns off
block fetching for ambiguous cursors. The data returned with the cursor is current
with the contents of the remote table or index for ambiguous cursors. Turning on
block fetch offers best performance, but it means the cursor is not current with the
base table at the remote site.
Lock avoidance:
With CURRENTDATA(NO), you have much greater opportunity for avoiding
locks.
DB2 can test to see if a row or page has committed data on it. If it has,
DB2 does not have to obtain a lock on the data at all. Unlocked data is returned to
the application, and the data can be changed while the cursor is positioned on the
row. (For SELECT statements in which no cursor is used, such as those that return
a single row, a lock is not held on the row unless you specify WITH RS or WITH
RR on the statement.)
To take the best advantage of this method of avoiding locks, make sure all
applications that are accessing data concurrently issue COMMIT statements
frequently.
The following figure shows how DB2 can avoid taking locks and the table below
summarizes the factors that influence lock avoidance.
Figure 43. Best case of avoiding locks using CS isolation with CURRENTDATA(NO). This figure shows access to the base table. If DB2 must take a lock, then locks are released when DB2 moves to the next row or page, or when the application commits (the same as CURRENTDATA(YES)).
Table 90. Lock avoidance factors. “Returned data” means data that satisfies the predicate.
“Rejected data” is that which does not satisfy the predicate.

                                        Avoid locks on   Avoid locks on
Isolation  CURRENTDATA  Cursor type     returned data?   rejected data?
UR         N/A          Read-only       N/A              N/A
CS         YES          Any             No               Yes (note 1)
CS         NO           Read-only       Yes              Yes (note 1)
CS         NO           Updatable       No               Yes (note 1)
CS         NO           Ambiguous       Yes              Yes (note 1)
RS         N/A          Any             No               Yes (notes 1, 2)
RR         N/A          Any             No               No
Notes:
1. Locks are avoided when the row is disqualified after stage 1 processing.
2. When using ISO(RS) and multi-row fetch, DB2 releases locks that were
   acquired on stage 1 qualified rows, but which subsequently failed to qualify
   for stage 2 predicates, at the next fetch of the cursor.
Problems with ambiguous cursors:
A cursor is considered ambiguous if DB2 cannot tell whether it is used for update
or read-only purposes.
If the cursor appears to be used only for read-only purposes, but dynamic SQL
could modify data through the cursor, then the cursor is ambiguous. If you use
CURRENTDATA to indicate that an ambiguous cursor is read-only when it is
actually targeted by dynamic SQL for modification, you receive an error.
Ambiguous cursors can sometimes prevent DB2 from using lock avoidance
techniques. Moreover, misuse of an ambiguous cursor can cause your program to
receive a -510 SQLCODE, which means that all of the following conditions are met:
v The plan or package is bound with CURRENTDATA(NO)
v An OPEN CURSOR statement is performed before a dynamic DELETE WHERE
CURRENT OF statement against that cursor is prepared
v One of the following conditions is true for the open cursor:
– Lock avoidance is successfully used on that statement.
– Query parallelism is used.
– The cursor is distributed, and block fetching is used.
In all cases, it is a good programming technique to eliminate the ambiguity by
declaring the cursor with either the FOR FETCH ONLY or the FOR UPDATE
clause.
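For example, the following declarations (a sketch using the DSN8910.EMP sample table) make the intent of each cursor explicit to DB2:
EXEC SQL DECLARE C1 CURSOR FOR
  SELECT EMPNO, LASTNAME, SALARY
    FROM DSN8910.EMP
  FOR FETCH ONLY;
EXEC SQL DECLARE C2 CURSOR FOR
  SELECT EMPNO, SALARY
    FROM DSN8910.EMP
  FOR UPDATE OF SALARY;
C1 is unambiguously read-only, and therefore eligible for lock avoidance and block fetch; C2 is unambiguously updatable.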
Conflicting plan and package bind options
A plan bound with one set of options can include packages in its package list that
were bound with different sets of options.
PSPI
In general, statements in a DBRM bound as a package use the options that the
package was bound with. Statements in DBRMs bound to a plan use the options
that the plan was bound with.
For example, the plan value for CURRENTDATA has no effect on the packages
executing under that plan. If you do not specify a CURRENTDATA option
explicitly when you bind a package, the default is CURRENTDATA(YES).
The rules are slightly different for the bind options RELEASE and ISOLATION.
The values of those two options are set when the lock on the resource is acquired
and usually stay in effect until the lock is released. But a conflict can occur if a
statement that is bound with one pair of values requests a lock on a resource that
is already locked by a statement that is bound with a different pair of values. DB2
resolves the conflict by resetting each option with the available value that causes
the lock to be held for the greatest duration.
If the conflict is between RELEASE(COMMIT) and RELEASE(DEALLOCATE), then
the value used is RELEASE(DEALLOCATE).
The table below shows how conflicts between isolation levels are resolved. The
first column is the existing isolation level, and the remaining columns show what
happens when another isolation level is requested by a new application process.
Table 91. Resolving isolation conflicts

Existing    UR     CS     RS     RR
UR          n/a    CS     RS     RR
CS          CS     n/a    RS     RR
RS          RS     RS     n/a    RR
RR          RR     RR     RR     n/a
PSPI
Using SQL statements to override isolation levels
You can override the isolation level with which a plan or package is bound.
Procedure
PSPI
To override the isolation level for a specific SQL statement:
v Issue the SQL statements, and include a WITH isolation level clause. The WITH
isolation level clause:
– Can be used on these statements:
- SELECT
- SELECT INTO
- Searched DELETE
- INSERT from fullselect
- Searched UPDATE
– Cannot be used on subqueries.
– Can specify the isolation levels that specifically apply to its statement. (For
example, because WITH UR applies only to read-only operations, you cannot
use it on an INSERT statement.)
– Overrides the isolation level for the plan or package only for the statement in
which it appears.
The following statement finds the maximum, minimum, and average bonus in
the sample employee table.
SELECT MAX(BONUS), MIN(BONUS), AVG(BONUS)
INTO :MAX, :MIN, :AVG
FROM DSN8910.EMP
WITH UR;
The statement is executed with uncommitted read isolation, regardless of the
value of ISOLATION with which the plan or package containing the statement is
bound.
v If you use the WITH RR or WITH RS clause, you can issue SELECT and
SELECT INTO statements, and specify the following options:
– USE AND KEEP EXCLUSIVE LOCKS
– USE AND KEEP UPDATE LOCKS
– USE AND KEEP SHARE LOCKS
To use these options, specify them as shown in the following example:
SELECT ...
WITH RS USE AND KEEP UPDATE LOCKS;
Results
By using one of these options, you tell DB2 to acquire and hold a specific mode of
lock on all the qualified pages or rows. The following table shows which mode of
lock is held on rows or pages when you specify the SELECT using the WITH RS or
WITH RR isolation clause.
Table 92. Which mode of lock is held on rows or pages when you specify the SELECT using
the WITH RS or WITH RR isolation clause

Option Value                      Lock Mode
USE AND KEEP EXCLUSIVE LOCKS      X
USE AND KEEP UPDATE LOCKS         U
USE AND KEEP SHARE LOCKS          S
With read stability (RS) isolation, a row or page that is rejected during stage 2
processing might still have a lock held on it, even though it is not returned to the
application.
With repeatable read (RR) isolation, DB2 acquires locks on all pages or rows that
fall within the range of the selection expression.
All locks are held until the application commits. Although this option can reduce
concurrency, it can prevent some types of deadlocks and can better serialize access
to data. PSPI
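For example, an application that reads a value and updates it later in the same unit of work can hold a U lock on the qualifying row from the time of the read. The table and column names in this sketch are hypothetical:
SELECT ONHAND_QTY
  INTO :QTY
  FROM PARTS_INVENTORY
  WHERE PART_NO = :PARTNO
  WITH RS USE AND KEEP UPDATE LOCKS;
Holding the U lock prevents another transaction from acquiring a U or X lock on the same row between the read and the later update, which avoids a common deadlock pattern.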
Locking a table explicitly
You can override the DB2 rules for choosing initial lock attributes.
About this task
You can use LOCK TABLE on any table, including the auxiliary tables of LOB and
XML table spaces. However, LOCK TABLE has no effect on locks that are acquired
at a remote server.
Procedure
PSPI
To lock a table explicitly:
Issue a LOCK TABLE statement. Two examples are:
LOCK TABLE table-name IN SHARE MODE;
LOCK TABLE table-name PART n IN EXCLUSIVE MODE;
Executing the statement requests a lock immediately, unless a suitable lock exists
already. The bind option RELEASE determines when locks acquired by LOCK
TABLE or LOCK TABLE with the PART option are released. PSPI
Related tasks
“Explicitly locking LOB tables” on page 380
“Explicitly locking XML data” on page 384
When to use LOCK TABLE:
The LOCK TABLE statement is often appropriate for particularly high-priority
applications.
About this task
It can improve performance if LOCKMAX disables lock escalation or sets
a high threshold for it.
For example, suppose that you intend to execute an SQL statement to change job
code 21A to code 23 in a table of employee data. The table is defined with:
v The name PERSADM1.EMPLOYEE_DATA
v LOCKSIZE ROW
v LOCKMAX 0, which disables lock escalation
Because the change affects about 15% of the employees, the statement can require
many row locks of mode X.
Procedure
v To avoid the overhead for locks, first execute the following statement:
LOCK TABLE PERSADM1.EMPLOYEE_DATA IN EXCLUSIVE MODE;
v If EMPLOYEE_DATA is a partitioned table space, you can choose to lock
individual partitions as you update them. For example:
LOCK TABLE PERSADM1.EMPLOYEE_DATA PART 1 IN EXCLUSIVE MODE;
When the statement is executed, DB2 locks partition 1 with an X lock. The lock
has no effect on locks that already exist on other partitions in the table space.
The effect of LOCK TABLE:
Different locks are acquired when you issue a LOCK TABLE statement, depending
on the mode of the lock and the type of table space.
The table below shows the modes of locks acquired in segmented and
nonsegmented table spaces for the SHARE and EXCLUSIVE modes of LOCK
TABLE. Auxiliary tables of LOB table spaces are considered nonsegmented table
spaces and have the same locking behavior.
Table 93. Modes of locks acquired by LOCK TABLE. LOCK TABLE on partitions behaves the
same as on nonsegmented table spaces.

                   Nonsegmented or     Segmented table     Segmented table
LOCK TABLE IN      universal table     space: tables       space: table spaces
                   spaces
EXCLUSIVE MODE     X                   X                   IX
SHARE MODE         S or SIX            S or SIX            IS

Note: The SIX lock is acquired if the process already holds an IX lock. SHARE
MODE has no effect if the process already has a lock of mode SIX, U, or X.
Recommendations for using LOCK TABLE:
You can use LOCK TABLE to prevent other application processes from changing
any row in a table or partition that your process accesses.
About this task
For example, suppose that you access several tables. You can tolerate
concurrent updates on all the tables except one; for that one, you need RR or RS
isolation. You can handle the situation in several ways, as described in the
procedure below.
Caution: If other tables exist in the same table space, the LOCK TABLE statement
locks all tables in a simple table space, even though you name only one table. No
other process can update the table space for the duration of the lock. If the lock is
in exclusive mode, no other process can read the table space, unless that process is
running with UR isolation.
You might want to lock a table or partition that is normally shared for any of the
following reasons:
Taking a “snapshot”
If you want to access an entire table throughout a unit of work as it was at
a particular moment, you must lock out concurrent changes. If other
processes can access the table, use LOCK TABLE IN SHARE MODE. (RR
isolation is not enough; it locks out changes only from rows or pages you
have already accessed.)
Avoiding overhead
If you want to update a large part of a table, it can be more efficient to
prevent concurrent access than to lock each page as it is updated and
unlock it when it is committed. Use LOCK TABLE IN EXCLUSIVE MODE.
Preventing timeouts
Your application has a high priority and must not risk timeouts from
contention with other application processes. Depending on whether your
application updates or not, use either LOCK TABLE IN EXCLUSIVE MODE or
LOCK TABLE IN SHARE MODE.
Procedure
v Bind the application plan with RR or RS isolation. But that affects all the tables
you access and might reduce concurrency.
v Design the application to use packages and access the exceptional table in only a
few packages, and bind those packages with RR or RS isolation, and the plan
with CS isolation. Only the tables accessed within those packages are accessed
with RR or RS isolation.
v Add the clause WITH RR or WITH RS to statements that must be executed with
RR or RS isolation. Statements that do not use WITH are executed as specified
by the bind option ISOLATION.
v Bind the application plan with CS isolation and execute LOCK TABLE for the
exceptional table. (If other tables exist in the same table space, see the caution
that follows.) The LOCK TABLE statement locks out changes by any other
process, giving the exceptional table a degree of isolation even more thorough
than repeatable read. All tables in other table spaces are shared for concurrent
update.
How access paths affect locks:
The access path that DB2 uses can affect the mode, size, and even the object of a
lock.
For example, an UPDATE statement using a table space scan might need an X lock
on the entire table space. If rows to be updated are located through an index, the
same statement might need only an IX lock on the table space and X locks on
individual pages or rows.
If you use the EXPLAIN statement to investigate the access path chosen for an
SQL statement, then check the lock mode in column TSLOCKMODE of the
resulting PLAN_TABLE. If the table resides in a nonsegmented table space, or is
defined with LOCKSIZE TABLESPACE, the mode shown is that of the table space
lock. Otherwise, the mode is that of the table lock.
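For example, a sketch of such an investigation, assuming a JOBCODE column as in the PERSADM1.EMPLOYEE_DATA example later in this chapter, might be:
EXPLAIN PLAN SET QUERYNO = 100 FOR
  UPDATE PERSADM1.EMPLOYEE_DATA
    SET JOBCODE = '23'
    WHERE JOBCODE = '21A';
SELECT QUERYNO, TNAME, TSLOCKMODE
  FROM PLAN_TABLE
  WHERE QUERYNO = 100;
An intent mode such as IX in TSLOCKMODE suggests lower-level page or row locks; a gross mode such as X suggests that the whole table space is locked.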
Important points that you should consider when you work with DB2 locks include:
v You usually do not have to lock data explicitly in your program.
v DB2 ensures that your program does not retrieve uncommitted data unless you
specifically allow that.
v Any page or row that your program updates, inserts, or deletes stays locked
at least until the end of a unit of work, regardless of the isolation level. No other
process can access the object in any way until then, unless you specifically allow
that access to that process.
v Commit often for concurrency. Determine points in your program where
changed data is consistent. At those points, you should issue:
TSO, Batch, and CAF
An SQL COMMIT statement
IMS
An CHKP or SYNC call, or (for single-mode transactions) a GU call to the
I/O PCB
CICS A SYNCPOINT command.
v Bind with ACQUIRE(USE) to improve concurrency.
v Set ISOLATION (usually RR, RS, or CS) when you bind the plan or package.
– With RR (repeatable read), all accessed pages or rows are locked until the
next commit point.
– With RS (read stability), all qualifying pages or rows are locked until the next
commit point.
– With CS (cursor stability), only the pages or rows currently accessed can be
locked, and those locks might be avoided. (You can access one page or row
for each open cursor.)
You can also set isolation for specific SQL statements, using WITH.
v A deadlock can occur if two processes each hold a resource that the other needs.
One process is chosen as “victim”, its unit of work is rolled back, and an SQL
error code is issued.
v You can lock an entire nonsegmented table space, or an entire table in a
segmented table space, by using the LOCK TABLE statement:
– To let other users retrieve, but not update, delete, or insert, issue the
following statement:
LOCK TABLE table-name IN SHARE MODE
– To prevent other users from accessing rows in any way, except by using UR
isolation, issue the following statement:
LOCK TABLE table-name IN EXCLUSIVE MODE
Related tasks
“Designing your databases for concurrency” on page 324
Improving concurrency for applications that tolerate incomplete
results
You can use the SKIP LOCKED DATA option to skip rows that are locked to
increase the concurrency of applications and transactions that can tolerate
incomplete results.
Before you begin
PSPI
Your application must use one of the following isolation levels:
v Cursor stability (CS)
v Read stability (RS)
The SKIP LOCKED DATA clause is ignored for applications that use uncommitted
read (UR) or repeatable read (RR) isolation levels.
About this task
The SKIP LOCKED DATA option allows a transaction to skip rows that are
incompatibly locked by other transactions when those locks would hinder the
progress of the transaction. Because the SKIP LOCKED DATA option skips these
rows, the performance of some applications can be improved by eliminating lock
wait time. However, you must use the SKIP LOCKED DATA option only for
applications that can reasonably tolerate the absence of the skipped rows in the
returned data. If your transaction uses the SKIP LOCKED DATA option, it does not
read or modify data that is held by locks.
However, keep in mind that your application cannot rely on DB2 to skip all data
for which locks are held. DB2 skips only locked data that would block the progress
of the transaction that uses the SKIP LOCKED DATA option. If DB2 determines
through lock avoidance that the locked data is already committed, the locked data
is not skipped. Instead, the data is returned with no wait for the locks.
Important: When DB2 skips data because of the SKIP LOCKED DATA option, it
does not issue a warning. Even if only a subset of the data that satisfies a query is
returned or modified, the transaction completes as if no data was skipped. Use the
SKIP LOCKED DATA option only when the requirements and expectations of the
application match this behavior.
Procedure
To improve concurrency for applications that require fast results and can tolerate
incomplete results:
Specify the SKIP LOCKED DATA clause in one of the following SQL statements:
v SELECT
v SELECT INTO
v PREPARE
v Searched UPDATE
v Searched DELETE
You can also use the SKIP LOCKED DATA option with the UNLOAD utility. Lock
mode compatibility for transactions that use the SKIP LOCKED DATA option is the
same as lock mode compatibility for other page- and row-level locks, except that a
transaction that uses the SKIP LOCKED DATA option does not wait for the locks
to be released and skips the locked data instead.
Example
Suppose that table WORKQUEUE exists in a table space with row-level locking
and has as part of its definition an ELEMENT column, a PRIORITY column, and a
STATUS column, which contain the following data:

ELEMENT   PRIORITY   STATUS
1         1          OPEN
2         1          OPEN
3         3          OPEN
4         1          IN_ANALYSIS
Suppose that a transaction has issued an UPDATE against ELEMENT 1 to change
its STATUS from OPEN to IN_ANALYSIS, and that the UPDATE has not yet
committed.
UPDATE WORKQUEUE
  SET STATUS = 'IN_ANALYSIS'
  WHERE ELEMENT = 1;
Suppose that a second transaction issues the following SELECT statement to find
the highest priority work item:
SELECT ELEMENT
  FROM WORKQUEUE
  WHERE PRIORITY = '1' AND STATUS = 'OPEN'
  SKIP LOCKED DATA;
This query locates ELEMENT 2 without waiting for the transaction that holds a
lock on the row that contains ELEMENT 1 to commit or roll back its operation.
However, you cannot always expect DB2 to skip this data. For example, DB2 might
use lock avoidance or other techniques to avoid acquiring certain locks. PSPI
Using other options to control locking
You can use various options to control such things as how many locks are used
and which mode is used for certain locks. These options include the LOCKPART
clause of the CREATE and ALTER TABLESPACE statements.
Specifying the maximum number of locks that a single process
can hold
About this task
The LOCKS PER USER field of installation panel DSNTIPJ specifies the maximum
number of page, row, LOB, or XML locks that can be held by a single process at
any one time. It includes locks for both the DB2 catalog and directory and for user
data.
PSPI
When a request for a page, row, LOB, or XML lock exceeds the specified
limit, it receives SQLCODE -904: “resource unavailable” (SQLSTATE '57011'). The
requested lock cannot be acquired until some of the existing locks are released.
The default value is 10 000.
The default should be adequate for 90 percent of the workload when using page
locks. If you use row locks on very large tables, you might want a higher value. If
you use LOBs or XML data, you might need a higher value.
Procedure
v Review application processes that require higher values to see if they can use
table space locks rather than page, row, LOB, or XML locks. The accounting
trace shows the maximum number of page, row, LOB, or XML locks that a process
held while an application runs.
v Remember that the value specified is for a single application. Each concurrent
application can potentially hold up to the maximum number of locks specified.
Do not specify zero or a very large number unless it is required to run your
applications. PSPI
Specifying the size of locks for a table space
The LOCKSIZE clause of CREATE and ALTER TABLESPACE statements specifies
the size for locks held on a table or table space by any application process that
accesses it.
About this task
PSPI
In addition to using the ALTER TABLESPACE statement to change the
lock size for user data, you can also change the lock size of any DB2 catalog table
space that is neither a LOB table space nor a table space that contains links. The
relevant options are:
LOCKSIZE TABLESPACE
A process acquires no table, page, row, LOB, or XML locks within the table
space. That improves performance by reducing the number of locks
maintained, but greatly inhibits concurrency.
LOCKSIZE TABLE
A process acquires table locks on tables in a segmented table space without
partitions. If the table space contains more than one table, this option can
provide acceptable concurrency with little extra cost in processor resources.
LOCKSIZE PAGE
A process acquires page locks, plus table, partition, or table space locks of
modes that permit page locks (IS, IX, or SIX). The effect is not absolute: a
process can still acquire a table, partition, or table space lock of mode S or X,
without page locks, if that is needed. In that case, the bind process issues a
message warning that the lock size has been promoted as described under
“Lock promotion” on page 343.
LOCKSIZE ROW
A process acquires row locks, plus table, partition, or table space locks of
modes that permit row locks (IS, IX, or SIX). The effect is not absolute: a
process can still acquire a table, partition, or table space lock of mode S or X,
without row locks, if that is needed. In that case, the bind process issues a
message warning that the lock size has been promoted as described under
“Lock promotion” on page 343.
LOCKSIZE ANY
DB2 chooses the size of the lock, usually LOCKSIZE PAGE.
LOCKSIZE LOB
If a LOB must be accessed, a process acquires LOB locks and the necessary
LOB table space locks (IS or IX). This option is valid only for LOB table spaces.
See “LOB locks” on page 378 for more information about LOB locking.
LOCKSIZE XML
If XML must be accessed, a process acquires XML locks and the necessary
XML table space locks (IS or IX). This option is valid only for XML table
spaces. See “XML locks” on page 381 for more information about XML locking.
DB2 attempts to acquire an S lock on table spaces that are started with read-only
access. If the LOCKSIZE is PAGE, ROW, or ANY and DB2 cannot get the S lock, it
requests an IS lock. If a partition is started with read-only access, DB2 attempts to
get an S lock on the partition that is started RO. For a complete description of how
the LOCKSIZE clause affects lock attributes, see “How DB2 chooses lock types” on
page 338.
The default option is LOCKSIZE ANY, and the LOCKRULE column of the
SYSIBM.SYSTABLESPACE catalog table records the current value for each table
space.
If you do not use the default, base your choice upon the results of monitoring
applications that use the table space.
Procedure
When considering changing the lock size for a DB2 catalog table space:
Be aware that, in addition to user queries, DB2 internal processes such as bind and
authorization checking and utility processing can access the DB2 catalog.
Page locks versus row locks:
The question of whether to use row or page locks depends on your data and your
applications. If you are experiencing contention on data pages of a table space now
defined with LOCKSIZE PAGE, consider LOCKSIZE ROW. But consider also the
trade-offs.
The resource required to acquire, maintain, and release a row lock is about the
same as that required for a page lock. If your data has 10 rows per page, a table
space scan or an index scan can require nearly 10 times as much resource for row
locks as for page locks. But locking only a row at a time, rather than a page, might
reduce the chance of contention with some other process by 90%, especially if
access is random. (Row locking is not recommended for sequential processing.)
Lock avoidance is very important when row locking is used. Therefore, use
ISOLATION(CS) CURRENTDATA(NO) or ISOLATION(UR) whenever possible. In
many cases, DB2 can avoid acquiring a lock when reading data that is known to be
committed. Thus, if only 2 of 10 rows on a page contain uncommitted data, DB2
must lock the entire page when using page locks, but might ask for locks on only
the 2 rows when using row locks. Then, the resource required for row locks would
be only twice as much, not 10 times as much, as that required for page locks.
On the other hand, if two applications update the same rows of a page, and not in
the same sequence, then row locking might even increase contention. With page
locks, the second application to access the page must wait for the first to finish and
might time out. With row locks, the two applications can access the same page
simultaneously, and might deadlock while trying to access the same set of rows.
In short, no single answer fits all cases.
PSPI
Specifying the maximum number of locks that a process can
hold on a table space
You can specify the LOCKMAX clause of the CREATE and ALTER TABLESPACE
statements for tables of user data and also for tables in the DB2 catalog, by using
ALTER TABLESPACE.
About this task
PSPI
The values of the LOCKMAX clause have the following meanings:
LOCKMAX n
Specifies the maximum number of page or row locks that a single application
process can hold on the table space before those locks are escalated. For LOB
table spaces, this value specifies the number of LOB locks that the application
process can hold before escalating. For XML table spaces, this value specifies
the number of page, row, and XML locks that the application process can hold
before escalating. For an application that uses Sysplex query parallelism, a lock
count is maintained on each member.
LOCKMAX SYSTEM
Specifies that n is effectively equal to the system default set by the field
LOCKS PER TABLE(SPACE) of installation panel DSNTIPJ.
LOCKMAX 0
Disables lock escalation entirely.
The default value depends on the value of LOCKSIZE, as shown in the following
table.
Table 94. How the default for LOCKMAX is determined

LOCKSIZE    Default for LOCKMAX
ANY         SYSTEM
other       0
Note: For XML table spaces, the default value of LOCKMAX is inherited from the
base table space.
Catalog record: Column LOCKMAX of table SYSIBM.SYSTABLESPACE.
Procedure
Use one of the following approaches if you do not use the default value:
v Base your choice upon the results of monitoring applications that use the table
space.
v Aim to set the value of LOCKMAX high enough that, when lock escalation
occurs, one application already holds so many locks that it significantly
interferes with others. For example, if an application holds half a million locks
on a table with a million rows, it probably already locks out most other
applications. Yet lock escalation can prevent it from acquiring another half
million locks.
v If you alter a table space from LOCKSIZE PAGE or LOCKSIZE ANY to
LOCKSIZE ROW, consider increasing LOCKMAX to allow for the increased
number of locks that applications might require. PSPI
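For example, a sketch of such an alteration follows; the table space name is the sample DSN8D91A.DSN8S91E, and the LOCKMAX value is illustrative rather than a recommendation:
ALTER TABLESPACE DSN8D91A.DSN8S91E
  LOCKSIZE ROW
  LOCKMAX 50000;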
Specifying a default value for the LOCKMAX option
The LOCKS PER TABLE(SPACE) field of installation panel DSNTIPJ becomes the
default value (SYSTEM) for the LOCKMAX clause of the SQL statements CREATE
TABLESPACE and ALTER TABLESPACE.
About this task
PSPI
The default value of the LOCKS PER TABLE(SPACE) field is 1000.
Procedure
v Use the default or, if you are migrating from a previous release of DB2, continue
to use the existing value. The value should be less than the value for LOCKS
PER USER, unless the value of LOCKS PER USER is 0.
v When you create or alter a table space, especially when you alter one to use row
locks, use the LOCKMAX clause explicitly for that table space. PSPI
Specifying lock modes for statements bound with ISOLATION RR
or RS
The RRULOCK subsystem parameter (U LOCK FOR RR/RS field on the DSNTIPI
installation panel) determines the mode of the lock first acquired on a row or page,
table, partition, or table space for certain statements that are bound with RR or RS
isolation.
About this task
PSPI
Those statements include:
SELECT with FOR UPDATE OF
The following table shows which mode of lock is held on rows or pages
when you specify the SELECT using the WITH RS or WITH RR isolation
clause.
Table 95. Which mode of lock is held on rows or pages when you specify the SELECT using
the WITH RS or WITH RR isolation clause

Option Value                      Lock Mode
USE AND KEEP EXCLUSIVE LOCKS      X
USE AND KEEP UPDATE LOCKS         U
USE AND KEEP SHARE LOCKS          S
UPDATE and DELETE, without a cursor
The following table shows which mode of lock is held on rows or pages
when you specify an update or a delete without a cursor.
Table 96. Which mode of lock is held on rows or pages when you specify an update or a
delete without a cursor

Option Value     Lock Mode
NO (default)     S
YES              U or X
The YES option can avoid deadlocks but it reduces concurrency.
PSPI
Disabling update locks for searched UPDATE and DELETE
You can use the XLKUPDLT option on the DSNTIPI installation panel to disable
the update lock (ULOCK) on searched UPDATE and DELETE statements.
About this task
PSPI
When you do so, DB2 does not need to issue a second lock request to upgrade the
lock from U to X (exclusive lock) for each updated row. This feature is primarily
beneficial in a data sharing environment.
The default value of the XLKUPDLT field is NO.
Procedure
To disable update locks on searched UPDATE and DELETE statements:
Specify YES for the XLKUPDLT field when most or all searched UPDATE and
DELETE statements use an index or can be evaluated by stage 1 processing. The
possible values have the following effects:
v When you specify NO, DB2 might use lock avoidance when scanning for
qualifying rows. When a qualifying row is found, an S lock or a U lock is
acquired on the row. The lock on any qualifying row or page is then upgraded
to an X lock before performing the update or delete. For stage 1
non-qualifying rows or pages, the lock is released if ISOLATION(CS) or
ISOLATION(RS) is used. For ISOLATION(RR), an S lock is retained on the row
or page until the next commit point. This option is best for achieving the highest
rates of concurrency.
v When you specify YES, DB2 uses an X lock on rows or pages that qualify during
stage 1 processing. With ISOLATION(CS), the lock is released if the row or page
is not updated or deleted because it is rejected by stage 2 processing. With
ISOLATION(RR) or ISOLATION(RS), DB2 acquires an X lock on all rows that
fall within the range of the selection expression. Thus, a lock upgrade request is
not needed for qualifying rows though the lock duration is changed from
manual to commit. The lock duration change is not as costly as a lock upgrade.
v When you specify TARGET, DB2 treats the rows or pages of the specific table
targeted by the update or delete as if the value of XLKUPDLT was YES, and
treats rows or pages of other tables referenced by the query, such as those
referenced only in the WHERE clause, as if XLKUPDLT were set to NO. By
specifying this blended processing, you can prevent timeouts caused by strong
lock acquisition of the read-only non-target objects referenced in the update or
delete statement.
PSPI
Avoiding locks during predicate evaluation
The EVALUATE UNCOMMITTED field of installation panel DSNTIP8 indicates if
predicate evaluation can occur on uncommitted data of other transactions.
About this task
PSPI
The option applies only to stage 1 predicate processing that uses table
access (table space scan, index-to-data access, and RID list processing) for queries
with isolation level RS or CS.
Although this option influences whether predicate evaluation can occur on
uncommitted data, it does not influence whether uncommitted data is returned to
an application. Queries with isolation level RS or CS return only committed data.
They never return the uncommitted data of other transactions, even if predicate
evaluation occurs on such data. If data satisfies the predicate during evaluation, the
data is locked as needed, and the predicate is evaluated again as needed before the
data is returned to the application.
A value of NO specifies that predicate evaluation occurs only on committed data
(or on the uncommitted changes made by the application). NO ensures that all
qualifying data is always included in the answer set.
A value of YES specifies that predicate evaluation can occur on uncommitted data
of other transactions. With YES, data might be excluded from the answer set. Data
that does not satisfy the predicate during evaluation but then, because of undo
processing (ROLLBACK or statement failure), reverts to a state that does satisfy the
predicate is missing from the answer set. A value of YES enables DB2 to take fewer
locks during query processing. The number of locks avoided depends on the
following factors:
v The query's access path
v The number of evaluated rows that do not satisfy the predicate
v The number of those rows that are on overflow pages
The default value for this field is NO.
Procedure
Specify YES to improve concurrency if your applications can tolerate a returned
result set that falsely excludes data that would be included as the result of undo
processing (ROLLBACK or statement failure).
PSPI
Disregarding uncommitted inserts
The SKIP UNCOMMITTED INSERTS field on installation panel DSNTIP8 controls
whether uncommitted inserts are ignored.
About this task
PSPI
DB2 can handle uncommitted inserts in the following ways:
v DB2 can wait until the INSERT transaction completes (commits or rolls back)
and return data accordingly. This is the default option, NO.
v DB2 can ignore uncommitted inserts, which in many cases can improve
concurrency. This behavior must be specified as YES.
Procedure
For greater concurrency:
Chapter 13. Programming for concurrency
377
Specify YES for most applications. However, the following examples indicate some
instances when the default option, NO, is preferred.
One transaction creates another: Suppose that an initial transaction produces a
second transaction. The initial transaction passes information to the second
transaction by inserting data into a table that the second transaction reads. In this
case, NO should be used.
Data modified by DELETE and INSERT: Suppose that you frequently modify
data by deleting the data and inserting the new image of the data. In such cases,
which avoid UPDATE statements, the default option, NO, should be used.
PSPI
Controlling DB2 locks for LOBs
You can control how DB2 uses locks to control the concurrency of access to LOB
data.
LOB locks
The purpose of LOB locks is different from that of regular transaction locks. A lock
that is taken on a LOB value in a LOB table space is called a LOB lock.
Relationship between transaction locks and LOB locks
LOB column values are stored in a different table space, called a LOB table space,
from the values in the base table.
PSPI
An application that reads or updates a row in a table that contains LOB
columns obtains its normal transaction locks on the base table. The locks on the
base table also control concurrency for the LOB table space. When locks are not
acquired on the base table, such as for ISO(UR), DB2 maintains data consistency by
using LOB locks.
ISOLATION (UR)
When an application is reading rows using uncommitted read, no page or row
locks are taken on the base table. Therefore, these readers must take an S LOB lock
to ensure that they are not reading a partial LOB or a LOB value that is
inconsistent with the base row. This LOB lock is acquired and released
immediately, which is sufficient for DB2 to ensure that a complete copy of the LOB
data is ready for subsequent reference.
PSPI
Hierarchy of LOB locks
Just as page, row, and table space locks have a hierarchical relationship, LOB locks
and locks on LOB table spaces have a hierarchical relationship.
PSPI
If the LOB table space is locked with a gross lock, then LOB locks are not
acquired. In a data sharing environment, the lock on the LOB table space is used to
determine whether the lock on the LOB must be propagated beyond the local
IRLM.
PSPI
LOB and LOB table space lock modes
This information describes the modes of LOB locks and LOB table space locks.
Modes of LOB locks
PSPI
The following LOB lock modes are possible:
S (SHARE)
The lock owner and any concurrent processes can read, update, or delete
the locked LOB. Concurrent processes can acquire an S lock on the LOB.
X (EXCLUSIVE)
The lock owner can read or change the locked LOB. Concurrent processes
cannot access the LOB.
Modes of LOB table space locks
The following lock modes are possible on the LOB table space:
IS (INTENT SHARE)
The lock owner can update LOBs to null or zero-length, or read or delete
LOBs in the LOB table space. Concurrent processes can both read and
change LOBs in the same table space.
IX (INTENT EXCLUSIVE)
The lock owner and concurrent processes can read and change data in the
LOB table space. The lock owner acquires a LOB lock on any data it
accesses.
S (SHARE)
The lock owner and any concurrent processes can read and delete LOBs in
the LOB table space. An S lock is acquired on a LOB only in the case of
ISO(UR).
SIX (SHARE with INTENT EXCLUSIVE)
The lock owner can read and change data in the LOB table space. If the
lock owner is inserting (INSERT or UPDATE), the lock owner obtains a
LOB lock. Concurrent processes can read or delete data in the LOB table
space (or update to a null or zero-length LOB).
X (EXCLUSIVE)
The lock owner can read or change LOBs in the LOB table space. The lock
owner does not need LOB locks. Concurrent processes cannot access the
data.
PSPI
LOB lock and LOB table space lock duration
This information describes the duration of LOB locks and LOB table space locks.
The duration of LOB locks
PSPI
Locks on LOBs are taken when they are needed for INSERT or
UPDATE operations and released immediately at the completion of the operation.
LOB locks are not held for SELECT and DELETE operations. In the case of an
application that uses the uncommitted read option, a LOB lock might be acquired,
but only to test the LOB for completeness. The lock is released immediately after it
is acquired.
The duration of LOB table space locks
Locks on LOB table spaces are acquired when they are needed; that is, the
ACQUIRE option of BIND has no effect on when the table space lock on the LOB
table space is taken. When the table space lock is released is determined by a
combination of factors:
v The RELEASE option of bind
v Whether the SQL statement is static or dynamic
v Whether there are held cursors or held locators
When the release option is COMMIT, the lock is released at the next commit point,
unless there are held cursors or held locators. If the release option is
DEALLOCATE, the lock is released when the object is deallocated (the application
ends). The BIND option has no effect on dynamic SQL statements, which always
use RELEASE(COMMIT), unless you use dynamic statement caching.
PSPI
When LOB table space locks are not taken
A lock might not be acquired on a LOB table space at all.
PSPI
For example, if a row is deleted from a table and the value of the LOB
column is null, the LOB table space associated with that LOB column is not locked.
DB2 does not access the LOB table space if the application:
v Selects a LOB that is null or zero length
v Deletes a row where the LOB is null or zero length
v Inserts a null or zero length LOB
v Updates a null or zero-length LOB to null or zero-length PSPI
Controlling the number of LOB locks
You can control the number of LOB locks that are taken.
PSPI
LOB locks are counted toward the total number of locks allowed per user.
Control this number by the value you specify on the LOCKS PER USER field of
installation panel DSNTIPJ. The number of LOB locks that are acquired during a
unit of work is reported in IFCID 0020. As with any table space, use the
LOCKMAX clause of the CREATE TABLESPACE or ALTER TABLESPACE
statement to control the number of LOB locks that are acquired within a particular
LOB table space. PSPI
Explicitly locking LOB tables
The reasons for using LOCK TABLE on an auxiliary table are somewhat different
from those for regular tables.
About this task
PSPI
You might use the LOCK TABLE statement for LOBs for any of the following
reasons:
v You can use LOCK TABLE to control the number of locks acquired on the
auxiliary table.
v You can use LOCK TABLE IN SHARE MODE to prevent other applications from
inserting LOBs.
With auxiliary tables, LOCK TABLE IN SHARE MODE does not prevent any
changes to the auxiliary table. The statement does prevent LOBs from being
inserted into the auxiliary table, but it does not prevent deletes. Updates are
generally restricted also, except where the LOB is updated to a null value or a
zero-length string.
v You can use LOCK TABLE IN EXCLUSIVE MODE to prevent other applications
from accessing LOBs.
With auxiliary tables, LOCK TABLE IN EXCLUSIVE MODE also prevents access
from uncommitted readers.
v Either statement eliminates the need for lower-level LOB locks. PSPI
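For example, assuming a hypothetical auxiliary table named DSN8910.AUX_EMP_RESUME, the following statement prevents other applications from inserting LOBs into that table until the lock is released:
LOCK TABLE DSN8910.AUX_EMP_RESUME IN SHARE MODE;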
Controlling lock size for LOB table spaces
You can use the LOCKSIZE option to control the size of locks that are acquired
when applications access data in LOB table spaces.
About this task
PSPI
The LOCKSIZE TABLE, PAGE, ROW, and XML options are not valid for
LOB table spaces. The other options act as follows:
LOCKSIZE TABLESPACE
A process acquires no LOB locks.
LOCKSIZE ANY
DB2 chooses the size of the lock. For a LOB table space, this is usually
LOCKSIZE LOB.
LOCKSIZE LOB
If LOBs are accessed, a process acquires the necessary LOB table space
locks (IS or IX), and might acquire LOB locks. PSPI
Controlling DB2 locks for XML data
You can control how DB2 uses locks to control the concurrency of access to XML
data.
XML locks
This information describes the locking that occurs when XML data is accessed.
Locks that are acquired for operations on XML data
DB2 stores XML column values in a separate XML table space. An application that
reads or updates a row in a table that contains XML columns might use lock
avoidance or obtain transaction locks on the base table.
PSPI
If an XML column is updated or read, the application might also acquire
transaction locks on the XML table space and XML values that are stored in the
XML table space. A lock that is taken on an XML value in an XML table space is
called an XML lock.
In data sharing, P page locks are acquired during insert, update, and delete
operations.
In summary, the main purpose of XML locks is to manage the space used by
XML data and to ensure that XML readers do not read partially updated XML
data.
The following table shows the relationship between an operation that is performed
on XML data and the associated XML table space and XML locks that are acquired.
Table 97. Locks that are acquired for operations on XML data. This table does not account
for gross locks that can be taken because of the LOCKSIZE TABLESPACE option, the LOCK
TABLE statement, or lock escalation.

Operation on     XML table     XML lock          Comment
XML value        space lock
Read             IS            S                 Prevents storage from being reused
(including UR)                                   while the XML data is being read.
Insert           IX            X                 Prevents other processes from
                                                 seeing partial XML data.
Delete           IX            X                 Holds space in case the delete is
                                                 rolled back. Storage is not reusable
                                                 until the delete is committed and
                                                 no other readers of the XML data
                                                 exist.
Update           IS->IX        Two XML locks:    Operation is a delete followed by
                               an X-lock for     an insert.
                               the delete and
                               an X-lock for
                               the insert.
ISOLATION(UR) or ISOLATION(CS): When an application reads rows using
uncommitted read or lock avoidance, no page or row locks are taken on the base
table. Therefore, these readers must take an S XML lock to ensure that they are not
reading a partial XML value or an XML value that is inconsistent with the base
row. When a conditional XML lock cannot be acquired for a SQL statement with
UR isolation, DB2 might return no rows and issue an SQL return code +100.
PSPI
Hierarchy of XML locks
Just as page locks (or row locks) and table space locks have a hierarchical
relationship, XML locks and locks on XML table spaces have a hierarchical
relationship.
PSPI
If the XML table space is locked with a gross lock, then XML locks are not
acquired. In a data sharing environment, the lock on the XML table space is used
to determine whether the lock on the XML must be propagated beyond the local
IRLM.
PSPI
XML and XML table space lock modes
This information describes the modes of XML locks and XML table space locks.
PSPI
S (SHARE)
The lock owner and any concurrent processes can read the locked XML
data. Concurrent processes can acquire an S lock on the XML data. The
purpose of the S lock is to reserve the space used by the XML data.
X (EXCLUSIVE)
The lock owner can read, update, or delete the locked XML data.
Concurrent processes cannot access the XML data.
PSPI
XML lock and XML table space lock duration
This information describes the duration of XML locks and XML table space locks.
PSPI
The duration of XML locks
X-locks on XML data that are acquired for insert, update, and delete statements are
usually released at commit. The duration of XML locks acquired for select
statements varies depending upon isolation level, the setting of the
CURRENTDATA parameter, and whether work files or multi-row fetch are used.
XML locks acquired for fetch are not normally held until commit and are either
released at next fetch or at close cursor. Because XML locks for updating (INSERT,
UPDATE, and DELETE) are held until commit and because locks are put on each
XML column in both a source table and a target table, it is possible that a
statement such as an INSERT with a fullselect that involves XML columns can
accumulate many more locks than a similar statement that does not involve XML
data. To prevent system problems caused by too many locks, you can:
v Ensure that you have lock escalation enabled for the XML table spaces that are
involved in the INSERT. In other words, make sure that LOCKMAX is non-zero
for those XML table spaces.
v Alter the XML table space to change the LOCKSIZE to TABLESPACE before
executing the INSERT with fullselect.
v Increase the LOCKMAX value on the table spaces involved and ensure that the
user lock limit is sufficient.
v Use LOCK TABLE statements to lock the XML table spaces. (Locking the
auxiliary table that is contained in a partitioned XML table space locks the XML
table space. In the case of segmented, but non-partitioned, XML table spaces, the
LOCK TABLE statement locks the table with a gross lock and locks the table
space with an intent lock.)
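For example, a sketch of the second approach follows; all table and table space names are hypothetical:
ALTER TABLESPACE MYDB.XMLTS1 LOCKSIZE TABLESPACE;
INSERT INTO ORDERS_ARCHIVE
  SELECT * FROM ORDERS;
ALTER TABLESPACE MYDB.XMLTS1 LOCKSIZE XML;
The final ALTER restores XML-level locking after the INSERT completes.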
The duration of XML table space locks
Locks on XML table spaces are acquired when they are needed; that is, the
ACQUIRE option of BIND has no effect on when the table space lock on the XML
table space is taken. The table space lock is released according to the value
specified on the RELEASE option of BIND (except when a cursor is defined WITH
HOLD).
PSPI
When XML table space locks are not taken
A lock might not be acquired on an XML table space at all.
PSPI
DB2 does not access the XML table space if the application:
v Selects an XML value that is null
v Deletes a row where the XML value is null
v Inserts a null XML value
v Updates an XML value to null
PSPI
Controlling the number of XML locks
You can control the number of XML locks that are taken.
About this task
PSPI
XML locks are counted toward the total number of locks allowed per user.
Procedure
Control this number by the value you specify on the LOCKS PER USER field of
installation panel DSNTIPJ. The number of XML locks that are acquired during a
unit of work is reported in IFCID 0020. PSPI
Controlling XML lock escalation
As with any table space, use the LOCKMAX clause of the ALTER TABLESPACE
statement to control the number of locks that are acquired within a particular XML
table space before the lock is escalated.
About this task
PSPI
When the total number of page, row, and XML locks reaches the
maximum that you specify in the LOCKMAX clause, the XML locks escalate to a
gross lock on the XML table space, and the page, row, and XML locks are released.
Information about XML locks and lock escalation is reported in IFCID 0020. PSPI
Explicitly locking XML data
You can use a LOCK TABLE statement to explicitly lock XML data.
About this task
PSPI
The reasons for using LOCK TABLE on an auxiliary table are somewhat
different than that for regular tables.
v You can use LOCK TABLE to control the number of locks acquired on the
auxiliary table.
v You can use LOCK TABLE IN SHARE MODE to prevent other applications from
inserting, updating, or deleting XML data.
v You can use LOCK TABLE IN EXCLUSIVE MODE to prevent other applications
from accessing XML data.
With auxiliary tables, LOCK TABLE IN EXCLUSIVE MODE also prevents access
from uncommitted readers.
v Either statement eliminates the need for lower-level XML locks. PSPI
Specifying the size of locks for XML data
The LOCKSIZE TABLE, PAGE, ROW, and LOB options are not valid for XML table
spaces.
About this task
PSPI
The other options act as follows:
LOCKSIZE TABLESPACE
A process acquires no XML locks.
LOCKSIZE ANY
DB2 chooses the size of the lock. For an XML table space, this is usually
LOCKSIZE XML.
LOCKSIZE XML
If XML data is accessed, a process acquires XML locks and the necessary XML
table space locks (IS or IX).
PSPI
Claims and drains for concurrency control
DB2 utilities, commands, and some ALTER, CREATE, and DROP statements can
take over access to some objects independently of any transaction locks that are
held on the object.
Claims
A claim is a notification to DB2 that an object is being accessed.
When an application first accesses an object, within a unit of work, it makes a
claim on the object. It releases the claim at the next commit point. Unlike a
transaction lock, a claim normally does not persist past the commit point. To access
the object in the next unit of work, the application must make a new claim.
However, an exception exists. If a cursor defined with the clause WITH HOLD is
positioned on the claimed object, the claim is not released at a commit point.
A claim indicates activity on or interest in a particular page set or partition to DB2.
Claims prevent drains from occurring until the claim is released.
Three classes of claims
The following table shows the three classes of claims and the actions that they
allow.
Table 98. Three classes of claims and the actions that they allow

Claim class             Actions allowed
Write                   Reading, updating, inserting, and deleting
Repeatable read         Reading only, with repeatable read (RR) isolation
Cursor stability read   Reading only, with read stability (RS), cursor stability
                        (CS), or uncommitted read (UR) isolation
Detecting long-running read claims
DB2 can detect read claims that are held longer than a specified period. You
can set the length of the period, in minutes, by using the LRDRTHLD
subsystem parameter on the DSNTIPE thread management panel.
Drains
A drain is the action of taking control of access to an object by preventing new
claims and by waiting for existing claims to be released.
A utility can drain a partition when applications are accessing it. The drain
quiesces the applications by allowing each one to reach a commit point, but
preventing any of them, or any other applications, from making a new claim.
When no more claims exist, the process that drains (the drainer) controls access to
the drained object. The applications that were drained can still hold transaction
locks on the drained object, but they cannot make new claims until the drainer has
finished.
Drained claim classes
A drainer does not always need complete control. It could drain the following
combinations of claim classes:
v Only the write claim class
v Only the repeatable read claim class
v All claim classes
Example
The CHECK INDEX utility needs to drain only writers from an index space and its
associated table space. RECOVER, however, must drain all claim classes from its
table space. The REORG utility can drain either writers (with DRAIN WRITERS) or
all claim classes (with DRAIN ALL).
How DB2 uses drain locks
A drain lock prevents conflicting processes from trying to drain the same object at
the same time.
Processes that drain only writers can run concurrently, but a process that
drains all claim classes cannot drain an object concurrently with any other process.
To drain an object, a drainer first acquires one or more drain locks on the object,
one for each claim class that it needs to drain. When the locks are in place, the
drainer can begin after all processes with claims on the object have released their
claims.
386
Performance Monitoring and Tuning Guide
A drain lock also prevents new claimers from accessing an object while a drainer
has control of it.
Types of drain locks
Three types of drain locks on an object correspond to the three claim classes:
v Write
v Repeatable read
v Cursor stability read
In general, after an initial claim has been made on an object by a user, no other
user in the system needs a drain lock. When the drain lock is granted, no drains
on the object are in process for the claim class needed, and the claimer can
proceed.
Exception
The claimer of an object requests a drain lock in two exceptional cases:
v A drain on the object is in process for the claim class needed. In this case, the
claimer waits for the drain lock.
v The claim is the first claim on an object before its data set has been physically
opened. Here, acquiring the drain lock ensures that no exception states prohibit
allocating the data set.
When the claimer gets the drain lock, it makes its claim and releases the lock
before beginning its processing.
Utility locks on the catalog and directory
When the target of a utility is an object in the catalog or directory, such as a
catalog table, the utility either drains or claims the object.
When the target is a user-defined object, the utility claims or drains it but also uses
the directory and, perhaps, the catalog; for example, to check authorization. In
those cases, the utility uses transaction locks on catalog and directory tables. It
acquires those locks in the same way as an SQL transaction does. For information
about the SQL statements that require locks on the catalog, see “Contention on the
DB2 catalog” on page 336.
The UTSERIAL lock
Access to the SYSUTILX table space in the directory is controlled by a unique lock
called UTSERIAL. A utility must acquire the UTSERIAL lock to read or write in
SYSUTILX, whether SYSUTILX is the target of the utility or is used only
incidentally.
Compatibility of utilities
Two utilities are considered compatible if they do not need access to the same object
at the same time in incompatible modes.
Chapter 13. Programming for concurrency
387
The concurrent operation of two utilities is not typically controlled by
either drain locks or transaction locks, but merely by a set of compatibility rules.
Before a utility starts, it is checked against all other utilities running on the same
target object. The utility starts only if all the others are compatible.
The check for compatibility obeys the following rules:
v The check is made for each target object, but only for target objects. Typical
utilities access one or more table spaces or indexes, but if two utility jobs use
none of the same target objects, the jobs are always compatible.
An exception is a case in which one utility must update a catalog or directory
table space that is not the direct target of the utility. For example, the LOAD
utility on a user table space updates DSNDB06.SYSCOPY. Therefore, other
utilities that have DSNDB06.SYSCOPY as a target might not be compatible.
v Individual data and index partitions are treated as distinct target objects.
Utilities operating on different partitions in the same table or index space are
compatible.
v When two utilities access the same target object, their most restrictive access
modes determine whether they are compatible. For example, if utility job 1 reads
a table space during one phase and writes during the next, it is considered a
writer. It cannot start concurrently with utility 2, which allows only readers on
the table space. (Without this restriction, utility 1 might start and run
concurrently with utility 2 for one phase; but then it would fail in the second
phase, because it could not become a writer concurrently with utility 2.)
For details on which utilities are compatible, refer to each utility's description in
DB2 Utility Guide and Reference.
The following figure illustrates how SQL applications and DB2 utilities can operate
concurrently on separate partitions of the same table space.
Figure 44. SQL and utility concurrency. Two LOAD jobs execute concurrently on two partitions of a table space. (The original time-line figure shows an SQL application and LOAD jobs operating on partitions P1 and P2; the sequence of events is summarized below.)

Time  Event
t1    An SQL application obtains a transaction lock on every partition in the
      table space. The duration of the locks extends until the table space is
      deallocated.
t2    The SQL application makes a write claim on data partition 1 and index
      partition 1.
t3    The LOAD jobs begin draining all claim classes on data partitions 1 and 2
      and index partitions 1 and 2. LOAD on partition 2 operates concurrently
      with the SQL application on partition 1. LOAD on partition 1 waits.
t4    The SQL application commits, releasing its write claims on partition 1.
      LOAD on partition 1 can begin.
t6    LOAD on partition 2 completes.
t7    LOAD on partition 1 completes, releasing its drain locks. The SQL
      application (if it has not timed out) makes another write claim on data
      partition 1.
t10   The SQL application deallocates the table space and releases its
      transaction locks.
Concurrency during REORG
You can specify certain options that might prevent timeouts and deadlocks when
you run the REORG utility.
Procedure
To improve concurrency for REORG operations:
v If you get timeouts or deadlocks when you use REORG with the SHRLEVEL
CHANGE option, run the REORG utility with the DRAIN ALL option. The
default is DRAIN WRITERS, which is done in the log phase. The specification of
DRAIN ALL indicates that both writers and readers are drained when the
MAXRO threshold is reached.
v Consider the DRAIN ALL option in environments where a lot of update activity
occurs during the log phase. With this specification, no subsequent drain is
required in the switch phase.
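The following utility control statement is a minimal sketch with hypothetical object names and an arbitrary MAXRO value; other options that SHRLEVEL CHANGE requires in practice (such as a mapping table) are omitted:

REORG TABLESPACE MYDB.MYTS
  SHRLEVEL CHANGE
  DRAIN ALL
  MAXRO 300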
Utility operations with nonpartitioned indexes
In a nonpartitioned index, either a partitioning index or a secondary index, an
entry can refer to any partition in the underlying table space.
DB2 can process a set of entries of a nonpartitioned index that all refer to
a single partition and achieve the same results as for a partition of a partitioned
index. (Such a set of entries is called a logical partition of the nonpartitioned index.)
Suppose that two LOAD jobs execute concurrently on different partitions of the
same table space. When the jobs proceed to build a partitioned index, either a
partitioning index or a secondary index, they operate on different partitions of the
index and can operate concurrently. Concurrent operations on different partitions
are possible because the index entries in an index partition refer only to data in the
corresponding data partition for the table.
Utility processing can be more efficient with partitioned indexes because, with the
correspondence of index partitions to data partitions, they promote partition-level
independence. For example, the REORG utility with the PART option can run
faster and with more concurrency when the indexes are partitioned. REORG
rebuilds the parts for each partitioned index during the BUILD phase, which can
increase parallel processing and reduce the lock contention of nonpartitioned
indexes.
Similarly, for the LOAD PART and REBUILD INDEX PART utilities, the parts for
each partitioned index can be built in parallel during the BUILD phase, which
reduces lock contention and improves concurrency. The LOAD PART utility also
processes partitioned indexes with append logic, instead of the insert logic that it
uses to process nonpartitioned indexes, which also improves performance.
Chapter 14. Programming for parallel processing
You can significantly reduce the response time for data or processor-intensive
queries by taking advantage of the ability of DB2 to initiate multiple parallel
operations when it accesses data from a table or index in a partitioned table space.
Parallel processing
DB2 can initiate multiple parallel operations when it accesses data from a table or
index in a partitioned table space.
Query I/O parallelism manages concurrent I/O requests for a single query, fetching
pages into the buffer pool in parallel. This processing can significantly improve the
performance of I/O-bound queries. I/O parallelism is used only when one of the
other parallelism modes cannot be used.
Query CP parallelism enables true multitasking within a query. A large query can be
broken into multiple smaller queries. These smaller queries run simultaneously on
multiple processors accessing data in parallel, which reduces the elapsed time for a
query.
To expand even further the processing capacity available for processor-intensive
queries, DB2 can split a large query across different DB2 members in a data
sharing group. This feature is known as Sysplex query parallelism.
DB2 can use parallel operations for processing the following types of operations:
v Static and dynamic queries
v Local and remote data access
v Queries using single table scans and multi-table joins
v Access through an index, by table space scan or by list prefetch
v Sort
When a view or table expression is materialized, DB2 generates a temporary work
file. This type of work file is shareable in CP mode if the query does not
involve a full outer join.
Parallelism for partitioned and nonpartitioned table spaces
Parallel operations usually involve at least one table in a partitioned table space.
Scans of large partitioned table spaces have the greatest performance
improvements where both I/O and central processor (CP) operations can be
carried out in parallel.
Partitioned, nonpartitioned, and partition-by-growth table spaces can all take
advantage of query parallelism. Parallelism is also enabled for access through
non-clustering indexes: table access can be run in parallel when the application
is bound with DEGREE(ANY) and the table is accessed through a non-clustering index.
Methods of parallel processing
The figures in this topic show how the parallel methods compare with sequential
prefetch and with each other.
All three techniques assume access to a table space with three partitions, P1,
P2, and P3. R1, R2, R3, and so on, are requests for sequential prefetch; the
combination P2R1, for example, means the first prefetch request from partition 2.
Sequential processing
The following figure shows sequential processing. With sequential processing, DB2
takes the three partitions in order, completing partition 1 before starting to process
partition 2, and completing 2 before starting 3. Sequential prefetch allows overlap
of CP processing with I/O operations, but I/O operations do not overlap with
each other. In the example in the following figure, a prefetch request takes longer
than the time to process it. The processor is frequently waiting for I/O.
Figure 45. CP and I/O processing techniques. Sequential processing. (The original time-line figure shows prefetch requests P1R1 through P3R2 processed one partition at a time, with the processor frequently waiting for I/O.)
Parallel I/O
The following figure shows parallel I/O operations. With parallel I/O, DB2 manages
data from the three partitions at the same time. The processor processes the first
request from each partition, then the second request from each partition, and so
on. The processor is not waiting for I/O, but there is still only one processing task.
Figure 46. CP and I/O processing techniques. Parallel I/O processing. (The original time-line figure shows I/O for partitions P1, P2, and P3 proceeding concurrently while a single CP task processes requests P1R1, P2R1, P3R1, P1R2, and so on in turn.)
Parallel CP processing and sysplex query parallelism
The following figure shows parallel CP processing. With parallel CP processing, DB2
can use multiple parallel tasks to process the query. Three tasks working
concurrently can greatly reduce the overall elapsed time for data-intensive and
processor-intensive queries. The same principle applies for Sysplex query parallelism,
except that the work can cross the boundaries of a single CPC.
Figure 47. CP and I/O processing techniques. Query processing using CP parallelism. The tasks can be contained within a single CPC or can be spread out among the members of a data sharing group. (The original time-line figure shows three CP tasks, each with its own I/O stream, processing partitions P1, P2, and P3 concurrently.)
Queries that are most likely to take advantage of parallel
operations
Queries that can take advantage of parallel processing are those queries in
which:
v DB2 spends most of the time fetching pages—an I/O-intensive query
A typical I/O-intensive query is something like the following query, assuming
that a table space scan is used on many pages:
SELECT COUNT(*) FROM ACCOUNTS
WHERE BALANCE > 0 AND
DAYS_OVERDUE > 30;
v DB2 spends processor time and I/O time to process rows for certain types of
queries. Those queries include:
Queries with intensive data scans and high selectivity
Those queries involve large volumes of data to be scanned but relatively
few rows that meet the search criteria.
Queries that contain aggregate functions
Column functions (such as MIN, MAX, SUM, AVG, and COUNT)
typically involve large amounts of data to be scanned but return only a
single aggregate result.
Queries that access long data rows
Those queries access tables with long data rows, and the ratio of rows
per page is low (one row per page, for example).
Queries that require large amounts of central processor time
Those queries might be read-only queries that are complex,
data-intensive, or that involve a sort. A typical processor-intensive
query is something like:
SELECT MAX(QTY_ON_HAND) AS MAX_ON_HAND,
       AVG(PRICE) AS AVG_PRICE,
       AVG(DISCOUNTED_PRICE) AS DISC_PRICE,
       SUM(TAX) AS SUM_TAX,
       SUM(QTY_SOLD) AS SUM_QTY_SOLD,
       SUM(QTY_ON_HAND - QTY_BROKEN) AS QTY_GOOD,
       AVG(DISCOUNT) AS AVG_DISCOUNT,
       ORDERSTATUS,
       COUNT(*) AS COUNT_ORDERS
  FROM ORDER_TABLE
  WHERE SHIPPER = 'OVERNIGHT' AND
        SHIP_DATE < DATE('2006-01-01')
  GROUP BY ORDERSTATUS
  ORDER BY ORDERSTATUS;
Terminology
When the term task is used with information about parallel processing, consider
the context. For parallel query CP processing or Sysplex query parallelism, a task is
an actual z/OS execution unit used to process a query. For parallel I/O processing,
a task simply refers to the processing of one of the concurrent I/O streams.
A parallel group is the term used to name a particular set of parallel operations
(parallel tasks or parallel I/O operations). A query can have more than one parallel
group, but each parallel group within the query is identified by its own unique ID
number.
The degree of parallelism is the number of parallel tasks or I/O operations that DB2
determines can be used for the operations on the parallel group. The maximum
number of parallel operations that DB2 can generate is 254. However, for most
queries and DB2 environments, DB2 chooses a lower number.
You might need to limit the maximum number further because more parallel
operations consume processor, real storage, and I/O resources. If resource
consumption is high in your parallelism environment, use the MAX DEGREE field
on installation panel DSNTIP8 to explicitly limit the maximum number of parallel
operations that DB2 generates.
In a parallel group, an originating task is the TCB (SRB for distributed requests) that
coordinates the work of all the parallel tasks. Parallel tasks are executable units
composed of special SRBs, which are called preemptable SRBs.
With preemptable SRBs, the z/OS dispatcher can interrupt a task at any time to
run other work at the same or higher dispatching priority. For non-distributed
parallel work, parallel tasks run under a type of preemptable SRB called a client
SRB, which lets the parallel task inherit the importance of the originating address
space. For distributed requests, the parallel tasks run under a preemptable SRB
called an enclave SRB.
Related tasks
“Enabling parallel processing” on page 398
Partitioning for optimal parallel performance
The following are general considerations for how to partition data for the best
performance when using parallel processing. Bear in mind that DB2 does not
always choose parallelism, even if you partition the data.
About this task
This exercise assumes the following:
v You have narrowed the focus to a few, critical queries that are running
sequentially. It is best to include a mix of I/O-intensive and processor-intensive
queries into this initial set. You know how long those queries take now and
what your performance objectives for those queries are. Although tuning for one
set of queries might not work for all queries, overall performance and
throughput can be improved.
v You are optimizing for query-at-a-time operations, and you want a query to
make use of all the processor and I/O resources available to it.
When running many queries at the same time, you might need to increase the
number of partitions and the amount of processing power to achieve similar
elapsed times.
This information guides you through the following analyses:
1. Determining the nature of the query (what balance of processing and I/O
resources it needs)
2. Determining how many partitions the table space should have to meet your
performance objective, a number that is based on the nature of the query and
on the processor and I/O configuration at your site
Determining if a query is I/O- or processor-intensive
How DB2 can best take advantage of parallel processing for a particular query
depends on whether the query is I/O-intensive or processor-intensive.
Procedure
To determine whether your sequential queries are I/O-intensive or processor-intensive:
Examine the DB2 accounting reports:
v If the “other read I/O time” is close to the total query elapsed time, then the
query is I/O-intensive. “Other read I/O time” is the time that DB2 is waiting for
pages to be read in to the buffer pools.
v If “CPU time” is close to the total query elapsed time, then the query is
processor-intensive.
v If the processor time is somewhere between 30 and 70 percent of the elapsed
time, then the query is fairly well balanced in terms of CPU and I/O.
Determining the number of partitions for parallel processing
You can calculate the number of partitions that will enable your queries to best
take advantage of parallel processing.
About this task
This information provides general guidance for determining the number of
partitions. However, you must take into account the I/O subsystem, the nature of
the queries that you run, and plan for the data to grow.
If your physical and logical design are not closely tied together, and you can
specify any number of partitions, immediately specifying more partitions than you
need causes no harm. However, you should start with a reasonable number of
partitions, because you can always add more partitions later with the ALTER
TABLESPACE statement.
You can also create partition-by-growth table spaces, which begin as
single-partition table spaces and automatically add partitions as needed to
accommodate data growth. Consider creating a partition-by-growth table space in
cases such as a table space with a single table that is expected to become larger
than 64 GB and that does not include a suitable partitioning key.
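As a minimal sketch with hypothetical names, the following statement creates a partition-by-growth table space that can grow to 64 partitions:

CREATE TABLESPACE GROWTS IN MYDB
  MAXPARTITIONS 64
  LOCKSIZE ANY;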
Consider too the operational complexity of managing many partitions. This
complexity might not be as much of an issue at sites that use tools, such as the
DB2 Automated Utilities Generator and job scheduler.
In general, the number of partitions falls in a range between the number of CPs
and the maximum number of I/O paths to the data. When determining the
number of partitions that use a mixed set of processor- and I/O-intensive queries,
always choose the largest number of partitions in the range you determine.
Procedure
v For processor-intensive queries, specify, at a minimum, a number that is equal to
the number of CPs in the system that you want to use for parallelism, whether
you have a single CPC or multiple CPCs in a data sharing group. If the query is
processor-intensive, it can use all CPs available in the system. If you plan to use
Sysplex query parallelism, then choose a number that is close to the total
number of CPs (including partial allocation of CPs) that you plan to allocate for
decision support processing across the data sharing group. Do not include
processing resources that are dedicated to other, higher-priority work.
v For I/O-intensive queries:
1. Calculate the ratio of elapsed time to processor time.
2. Multiply that ratio by the number of processors allocated for decision
support processing.
3. Round up the resulting number to determine how many partitions you can
use to the best advantage, assuming that these partitions can be on separate
devices and have adequate paths to the data.
This calculation also assumes that you have adequate processing power to
handle the increase in partitions, which might not be much of an issue with an
extremely I/O-intensive query.
By partitioning the amount indicated previously, the query is brought into
balance by reducing the I/O wait time. If the number of partitions is less than
the number of CPs available on your system, increase this number close to the
number of CPs available. By doing so, other queries that read this same table,
but that are more processor-intensive, can take advantage of the additional
processing power.
Example: Suppose that you have a 10-way CPC and the calculated number of
partitions is five. Instead of limiting the table space to five partitions, use 10, to
equal the number of CPs in the CPC.
Example configurations for an I/O-intensive query
If the I/O cost of your queries is about twice as much as the processing cost, the
optimal number of partitions when run on a 10-way processor is 20 (2 * number of
processors). The figure below shows an I/O configuration that minimizes the
elapsed time and allows the CPC to run at 100% busy. It assumes the suggested
guideline of four devices per control unit and four channels per control unit
(see the note at the end of this topic).
Figure 48. I/O configuration that maximizes performance for an I/O-intensive query. (The original figure shows a 10-way CPC connected through 20 ESCON channels and an ESCON director to storage control units and disk devices.)
Working with a table space that is already partitioned
You can examine an existing partitioned table space to determine whether parallel
processing can be improved.
About this task
Assume that a table space already has 10 partitions and a particular query uses CP
parallelism on a 10-way CPC. When you add “other read I/O wait time” (from
accounting class 3) and processing time (from accounting class 2), you determine
that I/O cost is three times more than the processing cost. In this case, the optimal
number of partitions is 30 (three times more I/O paths). However, if you can run
on a data sharing group and you add another DB2 subsystem to the group that is
running on a 10-way CPC, the I/O configuration that minimizes the elapsed time
and allows both CPCs to run at 100% would be 60 partitions.
Making the partitions the same size
The degree of parallelism is influenced by the size of the largest physical partition.
About this task
In most cases, DB2 divides the table space into logical pieces, called work ranges
(to differentiate them from physical pieces), based on the size of the largest
physical partition of a given table. Suppose that a table consists of 10 000 pages and 10
physical partitions, the largest of which is 5000 pages. DB2 is most likely to create
only two work ranges, and the degree of parallelism would be 2. If the same table
has evenly sized partitions of 1000 pages each and the query is I/O-intensive, then
ten logical work ranges might be created. This example would result in a degree of
parallelism of 10 and reduced elapsed time.
DB2 tries to create equal work ranges by dividing the total cost of running the
work by the logical partition cost. This division often has some left over work. In
this case, DB2 creates an additional task to handle the extra work, rather than
making all the work ranges larger, which would reduce the degree of parallelism.
Note: A lower-cost configuration could use as few as two to three channels per
control unit, shared among all controllers through an ESCON director. However,
using four paths minimizes contention and provides the best performance. Paths
might also need to be taken offline for service.
Procedure
To rebalance partitions that have become skewed:
Reorganize the table space, and specify the REBALANCE keyword on the REORG
utility statement.
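For example, the following control statement is a sketch with hypothetical object names that redistributes rows evenly across partitions 1 through 10:

REORG TABLESPACE MYDB.PARTTS PART 1:10 REBALANCE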
Working with partitioned indexes
The degree of parallelism for accessing partitioned indexes depends on the nature
of the query and on the processor and I/O configuration at your site.
About this task
For an I/O-intensive query, the degree of parallelism for access to a partitioned
index depends on the number of index partitions that are referenced, whereas the
degree of parallelism for access to a nonpartitioned index depends on the number
of CPs in the system. For a processor-intensive query, the degree of parallelism for
both partitioned and nonpartitioned indexes is influenced by the number of CPs in
the system.
Enabling parallel processing
Queries cannot take advantage of parallelism unless you enable parallel processing.
Before you begin
DB2 must be running on a central processor complex that contains two or more
tightly coupled processors (sometimes called central processors, or CPs). If only
one CP is online when the query is bound, DB2 considers only parallel I/O
operations.
DB2 also considers only parallel I/O operations if you declare a cursor WITH
HOLD and bind with isolation RR or RS.
Procedure
To enable parallel processing:
v For static SQL, specify DEGREE(ANY) on BIND or REBIND. This bind option
affects static SQL only and does not enable parallelism for dynamic statements.
v For dynamic SQL, set the CURRENT DEGREE special register to 'ANY'.
– You can set the special register with the following SQL statement:
SET CURRENT DEGREE='ANY';
– You can also change the special register default from 1 to ANY for the entire
DB2 subsystem by modifying the CURRENT DEGREE field on installation
panel DSNTIP8.
Setting the special register affects dynamic statements only. It has no effect on
your static SQL statements. You should also make sure that parallelism is not
disabled for your plan, package, or authorization ID in the RLST.
v If you bind with isolation CS, also choose the option CURRENTDATA(NO), if
possible. This option can improve performance in general, but it also ensures
that DB2 considers parallelism for ambiguous cursors. If you bind with
CURRENTDATA(YES) and DB2 cannot tell if the cursor is read-only, DB2 does
not consider parallelism. When a cursor is read-only, it is recommended that you
specify the FOR FETCH ONLY or FOR READ ONLY clause on the DECLARE
CURSOR statement to explicitly indicate that the cursor is read-only.
v Specify a virtual buffer pool parallel sequential threshold (VPPSEQT) value that
is large enough to provide adequate buffer pool space for parallel processing. If
you enable parallel processing when DB2 estimates a given query's I/O and
central processor cost is high, multiple parallel tasks can be activated if DB2
estimates that elapsed time can be reduced by doing so.
v For parallel sorts, allocate sufficient work files to maintain performance.
v For complex queries, run the query in parallel within a member of a data
sharing group. With Sysplex query parallelism, use the power of the data
sharing group to process individual complex queries on many members of the
data sharing group.
v Limit the degree of parallelism. If you want to limit the maximum number of
parallel tasks that DB2 generates, you can use the MAX DEGREE field on
installation panel DSNTIP4. If system resources are limited, the recommended
value of MAX DEGREE is 1 to 2 times the number of online CPUs. Changing
MAX DEGREE, however, is not the way to turn parallelism off. You use the
DEGREE bind parameter or the CURRENT DEGREE special register to turn
parallelism off.
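As a minimal sketch of the first two items in the preceding list (the plan and collection names are hypothetical):

For static SQL:
REBIND PLAN(MYPLAN) DEGREE(ANY)
REBIND PACKAGE(MYCOLL.*) DEGREE(ANY)

For dynamic SQL:
SET CURRENT DEGREE = 'ANY';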
Restrictions for parallelism
Parallelism is not used for all queries; for some access paths, incurring parallelism
overhead makes no sense. Similarly, certain access paths that would reduce the
effectiveness of parallelism are removed from consideration when parallelism is
enabled.
When parallelism is not used
For example, if you are selecting from a temporary table, parallelism is not used.
Check the following table to determine whether your query uses any of the access
paths that do not allow parallelism.
Table 99. Checklist of parallel modes and query restrictions

If query uses this...                          I/O          CP           Sysplex      Comments
                                               parallelism  parallelism  parallelism
Access through RID list (list prefetch         Yes          Yes          No           Indicated by 'L' in the PREFETCH column of
  and multiple index access)                                                            PLAN_TABLE, or an M, MX, MI, or MQ in the
                                                                                        ACCESSTYPE column of PLAN_TABLE.
Query blocks that access LOB values            No           No           No
Merge scan join on more than one column        Yes          Yes          Yes
Queries that qualify for direct row access     No           No           No           Indicated by 'D' in the
                                                                                        PRIMARY_ACCESS_TYPE column of PLAN_TABLE.
Materialized views or materialized nested      No           No           No           'Yes' for CP applies when there is no full
  table expressions at reference time                                                   outer join.
EXISTS subquery block                          No           No           No
Security label column on table                 Yes          Yes          No
Multi-row fetch                                Maybe        Maybe        Maybe        Parallelism might be disabled for the last
                                                                                        parallel group in the top-level query block.
                                                                                        For some queries that have only a single
                                                                                        parallel group, parallelism might be
                                                                                        disabled completely.
Query blocks that access XML values            No           No           No
Multiple index access to return a DOCID list   No           No           No           Indicated by 'DX', 'DI', or 'DU' in the
                                                                                        ACCESSTYPE column of PLAN_TABLE.
Outer join result at reference time            No           No           No
CTE reference                                  No           No           No
Table function                                 No           No           No
Create global temporary table                  No           No           No
Access through IN-list                         Yes          Yes          No           Indicated by ACCESSTYPE='N' or 'I' in the
                                                                                        PLAN_TABLE.
Access through IN-subquery                     No           No           No           Indicated by ACCESSTYPE='N' in the
                                                                                        PLAN_TABLE.
A DPSI is used to access the fact table        No           No           No
  in a star-join
Access paths that are restricted by parallelism
To ensure that you can take advantage of parallelism, DB2 does not select certain
access paths when parallelism is enabled. When the plan or package is bound with
DEGREE(ANY) or the CURRENT DEGREE special register is set to 'ANY,' DB2:
v Does not choose Hybrid joins with SORTN_JOIN=Y.
v Does not transform certain subqueries to joins.
Chapter 15. Tuning distributed applications
A query that is sent to a remote system can sometimes take longer to execute than
the same query, accessing tables of the same size, on the local DB2 subsystem.
The principal reasons for this potential increase in execution time are:
v The time required to send messages across the network
v Overhead processing, including startup and communication subsystem session
management
Some aspects of overhead processing, for instance, network processing, are not
under DB2 control.
Monitoring and tuning performance in a distributed environment is a complex task
that requires knowledge of several products. Some guidelines follow for improving
the performance of distributed applications.
Related tasks
Maximizing the performance of an application that accesses distributed data
(Application programming and SQL)
Tuning the VTAM system (DB2 Installation Guide)
Tuning TCP/IP (DB2 Installation Guide)
Remote access
DB2 supports remote access between requester and server relational database
management systems (DBMSs).
The two types of access are DRDA access and DB2 private protocol. When objects
with three-part names (or aliases for such objects) are referenced,
DB2 chooses between the two connection types based on the bind option that you
choose (or the default protocol set at your site).
Important: Use DRDA for new applications, and migrate existing private protocol
applications to DRDA. No enhancements are planned for private protocol.
Characteristics of DRDA
With DRDA, the application can remotely bind packages and can execute packages
of static or dynamic SQL that have previously been bound at that location. DRDA
has the following characteristics and benefits:
v With DRDA access, an application can access data at any server that supports
DRDA, not just a DB2 server on a z/OS operating system.
v DRDA supports all SQL features, including user-defined functions, LOBs, stored
procedures, and XML data.
v DRDA can avoid multiple binds and minimize the number of binds that are
required.
v DRDA supports multiple-row FETCH.
DRDA is the preferred method for remote access with DB2.
Characteristics of DB2 private protocol
Private protocol is an older method for remote access. It can be used only between
DB2 subsystems and only over a SNA network. Private protocol has not been
enhanced to support many new SQL features. Because of these limitations, it is
highly recommended that you migrate to DRDA protocol for communicating
between DB2 subsystems.
Application and requesting systems
Minimizing the number of messages sent between the requester and the server is a
primary way to improve performance.
BIND options for distributed applications
In many cases, certain bind options can improve the performance of SQL
statements that run as part of distributed applications.
Procedure
Consider using the following bind options to improve performance:
v Use the DEFER(PREPARE) bind option, which can reduce the number of
messages that must be sent back and forth across the network.
v Bind application plans and packages with ISOLATION(CS) to reduce contention
and message overhead.
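A minimal sketch that combines both options; the location, collection, and member names are hypothetical:

BIND PACKAGE(SERVER1.MYCOLL) MEMBER(MYPROG) DEFER(PREPARE) ISOLATION(CS)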
Related reference
Program preparation options for remote packages (Application programming
and SQL)
SQL statement options for distributed applications
In many cases, you can use certain strategies to improve the performance of SQL
statements that run on distributed systems.
About this task
Such strategies include the following (a short combined sketch follows this list):
Committing frequently
Commit frequently to avoid holding resources at the server.
Using fewer SQL statements
Avoid using several SQL statements when one well-tuned SQL statement
can retrieve the desired results. Alternatively, put your SQL statements in a
stored procedure, issue your SQL statements at the server through the
stored procedure, and return the result. Using a stored procedure creates
only one send and receive operation (for the CALL statement) instead of a
potential send and receive operation for each SQL statement.
Depending on how many SQL statements are in your application, using
stored procedures can significantly decrease your elapsed time and might
decrease your processor costs.
Using the RELEASE statement and the DISCONNECT(EXPLICIT) bind option
The RELEASE statement minimizes the network traffic that is needed to
release a remote connection at commit time. For example, if the application
has connections to several different servers, specify the RELEASE
statement when the application has completed processing for each server.
The RELEASE statement does not close cursors, release any resources, or
prevent further use of the connection until the COMMIT is issued. It just
makes the processing at COMMIT time more efficient.
The bind option DISCONNECT(EXPLICIT) destroys all remote connections
for which RELEASE was specified.
Using the COMMIT ON RETURN YES clause
Consider using the COMMIT ON RETURN YES clause of the
CREATE PROCEDURE statement to indicate that DB2 should issue an
implicit COMMIT on behalf of the stored procedure upon return from the
CALL statement. Using the clause can reduce the length of time locks are
held and can reduce network traffic. With COMMIT ON RETURN YES,
any updates made by the client before calling the stored procedure are
committed with the stored procedure changes.
Setting CURRENT RULES special register to DB2
When requesting LOB data, set the CURRENT RULES special
register to DB2 instead of to STD before performing a CONNECT. A value
of DB2, which is the default, can offer performance advantages. When a
DB2 for z/OS server receives an OPEN request for a cursor, the server uses
the value in the CURRENT RULES special register to determine whether
the application intends to switch between LOB values and LOB locator
values when fetching different rows in the cursor. If you specify a value of
DB2 for CURRENT RULES, the application indicates that the first FETCH
request specifies the format for each LOB column in the answer set and
that the format does not change in a subsequent FETCH request. However,
if you set the value of CURRENT RULES to STD, the application intends
to fetch a LOB column into either a LOB locator host variable or a LOB
host variable.
Although a value of STD for CURRENT RULES gives you more
programming flexibility when you retrieve LOB data, you can get better
performance if you use a value of DB2. With the STD option, the server
does not block the cursor, while with the DB2 option it might block the
cursor where it is possible to do so. For more information, see “LOB and
XML data and its effect on block fetch for DRDA” on page 405.
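The following statements sketch several of these strategies together. All object and location names are hypothetical, and the procedure body is indicative only:

SET CURRENT RULES = 'DB2';          -- set before CONNECT when fetching LOB data
CONNECT TO SITE2;
-- ... issue SQL statements at the server ...
RELEASE SITE2;                      -- with the DISCONNECT(EXPLICIT) bind option,
COMMIT;                             -- the connection is destroyed at this commit

CREATE PROCEDURE MYSCHEMA.UPD_ORDER (IN P_ORDERNO INTEGER)
  LANGUAGE SQL
  COMMIT ON RETURN YES              -- DB2 commits on return from the CALL
BEGIN
  UPDATE MYSCHEMA.ORDERS
    SET STATUS = 'SHIPPED'
    WHERE ORDERNO = P_ORDERNO;
END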
Block fetch
Block fetch can significantly decrease the number of messages sent across the
network. Block fetch is used only with cursors that do not update or delete data.
With block fetch, DB2 groups the rows that are retrieved by an SQL query into as
large a “block” of rows as can fit in a message buffer. DB2 then transmits the block
over the network, without requiring a separate message for each row.
DB2 can use two different types of block fetch:
v Limited block fetch
v Continuous block fetch
Both types of block fetch are used for DRDA and private protocol, but the
implementation of continuous block fetch for DRDA is slightly different than that
for private protocol.
Continuous block fetch
In terms of response time, continuous block fetch is most efficient for larger result
sets because fewer messages are transmitted from the requester to retrieve the
entire result set and because overlapped processing is performed at the requester
and the server.
However, continuous block fetch uses more networking resources than limited
block fetch. When networking resources are critical, use limited block fetch to run
applications.
The requester can use both forms of blocking at the same time and with different
servers.
If an application is doing read-only processing and can use continuous block fetch,
the sequence goes like this:
1. The requester sends a message to open a cursor and begins fetching the block
of rows at the server.
2. The server sends back a block of rows and the requester begins processing the
first row.
3. The server continues to send blocks of rows to the requester, without further
prompting. The requester processes the second and later rows as usual, but
fetches them from a buffer on the requester's system.
For private protocol, continuous block fetch uses one conversation for each open
cursor. Having a dedicated conversation for each cursor allows the server to
continue sending until all the rows are returned.
For DRDA, only one conversation is used, and it must be made available to the
other SQL statements that are in the application. Thus, the server usually sends
back a subset of all the rows. The number of rows that the server sends depends
on the following factors:
v The size of each row
v The number of extra blocks that are requested by the requesting system
compared to the number of extra blocks that the server returns
For a DB2 for z/OS requester, the EXTRA BLOCKS REQ field on installation
panel DSNTIP5 determines the maximum number of extra blocks requested. For
a DB2 for z/OS server, the EXTRA BLOCKS SRV field on installation panel
DSNTIP5 determines the maximum number of extra blocks allowed.
Example: Suppose that the requester asks for 100 extra query blocks and that the
server allows only 50. The server returns no more than 50 extra query blocks.
The server might choose to return fewer than 50 extra query blocks for any
number of reasons that DRDA allows.
v Whether continuous block fetch is enabled, and the number of extra rows that
the server can return if it regulates that number.
To enable continuous block fetch for DRDA and to regulate the number of extra
rows sent by a DB2 for z/OS server, you must use the OPTIMIZE FOR n ROWS
clause on your SELECT statement. See “Optimizing for very large results sets for
DRDA” on page 408 for more information.
If you want to use continuous block fetch for DRDA, have the application fetch all
the rows of the cursor before doing any other SQL. Fetching all the rows first
prevents the requester from having to buffer the data, which can consume a lot of
storage. Choose carefully which applications should use continuous block fetch for
DRDA.
Limited block fetch
Limited block fetch guarantees the transfer of a minimum amount of data in
response to each request from the requesting system.
With limited block fetch, a single conversation is used to transfer messages and
data between the requester and server for multiple cursors. Processing at the
requester and server is synchronous. The requester sends a request to the server,
which causes the server to send a response back to the requester. The server must
then wait for another request to tell it what should be done next.
Block fetch with scrollable cursors for DRDA
When a DB2 for z/OS requester uses a scrollable cursor to retrieve data from a
DB2 for z/OS server, the following conditions are true.
v The requester never requests more than 64 rows in a query block, even if more
rows fit in the query block. In addition, the requester never requests extra query
blocks. This is true even if the setting of field EXTRA BLOCKS REQ in the
DISTRIBUTED DATA FACILITY PANEL 2 installation panel on the requester
allows extra query blocks to be requested.
v The requester discards rows of the result table if the application does not use
those rows.
Example: If the application fetches row n and then fetches row n+2, the
requester discards row n+1.
The application gets better performance for a blocked scrollable cursor if it
mostly scrolls forward, fetches most of the rows in a query block, and avoids
frequent switching between FETCH ABSOLUTE statements with negative and
positive values.
v If the scrollable cursor does not use block fetch, the server returns one row for
each FETCH statement.
LOB and XML data and its effect on block fetch for DRDA
For a non-scrollable blocked cursor, the server sends all the non-LOB and
non-XML data columns for a block of rows in one message, including LOB locator
values.
As each row is fetched by the application, the requester obtains the non-LOB data
columns directly from the query block. If the row contains non-null and non-zero
length LOB values, those values are retrieved from the server at that time. This
behavior limits the impact to the network by pacing the amount of data that is
returned at any one time. If all LOB data columns are retrieved into LOB locator
host variables or if the row does not contain any non-null or non-zero length LOB
columns, then the whole row can be retrieved directly from the query block.
For a scrollable blocked cursor, the LOB data columns are returned at the same
time as the non-LOB and non-XML data columns. When the application fetches a
row that is in the block, a separate message is not required to get the LOB
columns.
Ensuring block fetch
To use either limited or continuous block fetch, DB2 must determine that the
cursor is not used for updating or deleting.
About this task
The easiest way to indicate that the cursor does not modify data is to add
the FOR FETCH ONLY or FOR READ ONLY clause to the query in the DECLARE
CURSOR statement as in the following example:
EXEC SQL
  DECLARE THISEMP CURSOR FOR
    SELECT EMPNO, LASTNAME, WORKDEPT, JOB
      FROM DSN8910.EMP
      WHERE WORKDEPT = 'D11'
      FOR FETCH ONLY
END-EXEC.
If you do not use FOR FETCH ONLY or FOR READ ONLY, DB2 still uses block
fetch for the query if the following conditions are true:
v The cursor is a non-scrollable cursor, and the result table of the cursor is
read-only. This applies to static and dynamic cursors except for read-only views.
v The cursor is a scrollable cursor that is declared as INSENSITIVE, and the result
table of the cursor is read-only.
v The cursor is a scrollable cursor that is declared as SENSITIVE, the result table
of the cursor is read-only, and the value of bind option CURRENTDATA is NO.
v The result table of the cursor is not read-only, but the cursor is ambiguous, and
the value of bind option CURRENTDATA is NO. A cursor is ambiguous when:
– It is not defined with the clauses FOR FETCH ONLY, FOR READ ONLY, or
FOR UPDATE OF.
– It is not defined on a read-only result table.
– It is not the target of a WHERE CURRENT clause on an SQL UPDATE or
DELETE statement.
– It is in a plan or package that contains the SQL statements PREPARE or
EXECUTE IMMEDIATE.
DB2 triggers block fetch for static SQL only when it can detect that no updates or
deletes are in the application. For dynamic statements, because DB2 cannot detect
what follows in the program, the decision to use block fetch is based on the
declaration of the cursor.
DB2 does not use continuous block fetch if the following conditions are true:
v The cursor is referred to in the statement DELETE WHERE CURRENT OF
elsewhere in the program.
v The cursor statement appears to be updatable at the requesting system.
(DB2 does not check whether the cursor references a view at the server that
cannot be updated.)
Results
The following tables summarize the conditions under which a DB2 server uses
block fetch.
The following table shows the conditions for a non-scrollable cursor.
Table 100. Effect of CURRENTDATA and cursor type on block fetch for a non-scrollable cursor

Isolation level   CURRENTDATA   Cursor type   Block fetch
CS, RR, or RS     Yes           Read-only     Yes
                                Updatable     No
                                Ambiguous     No
CS, RR, or RS     No            Read-only     Yes
                                Updatable     No
                                Ambiguous     Yes
UR                Yes           Read-only     Yes
UR                No            Read-only     Yes
The following table shows the conditions for a scrollable cursor that is not used to
retrieve a stored procedure result set.
Table 101. Effect of CURRENTDATA and isolation level on block fetch for a scrollable cursor that is not used for a stored procedure result set

Isolation level   Cursor sensitivity   CURRENTDATA   Cursor type   Block fetch
CS, RR, or RS     INSENSITIVE          Yes           Read-only     Yes
CS, RR, or RS     INSENSITIVE          No            Read-only     Yes
CS, RR, or RS     SENSITIVE            Yes           Read-only     No
                                                     Updatable     No
                                                     Ambiguous     No
CS, RR, or RS     SENSITIVE            No            Read-only     Yes
                                                     Updatable     No
                                                     Ambiguous     Yes
UR                INSENSITIVE          Yes           Read-only     Yes
UR                INSENSITIVE          No            Read-only     Yes
UR                SENSITIVE            Yes           Read-only     Yes
UR                SENSITIVE            No            Read-only     Yes
The following table shows the conditions for a scrollable cursor that is used to
retrieve a stored procedure result set.
Table 102. Effect of CURRENTDATA and isolation level on block fetch for a scrollable cursor that is used for a stored procedure result set

Isolation level   Cursor sensitivity   CURRENTDATA   Cursor type   Block fetch
CS, RR, or RS     INSENSITIVE          Yes           Read-only     Yes
CS, RR, or RS     INSENSITIVE          No            Read-only     Yes
CS, RR, or RS     SENSITIVE            Yes           Read-only     No
CS, RR, or RS     SENSITIVE            No            Read-only     Yes
UR                INSENSITIVE          Yes           Read-only     Yes
UR                INSENSITIVE          No            Read-only     Yes
UR                SENSITIVE            Yes           Read-only     Yes
UR                SENSITIVE            No            Read-only     Yes
Related reference
DECLARE CURSOR (DB2 SQL)
Optimizing for very large results sets for DRDA
Enabling a DB2 client to request multiple query blocks on each transmission can
reduce network activity and improve performance significantly for applications
that use DRDA access to download large amounts of data.
You can specify a large value of n in the OPTIMIZE FOR n ROWS clause of a
SELECT statement to increase the number of DRDA query blocks that a DB2 server
returns in each network transmission for a non-scrollable cursor. If n is greater
than the number of rows that fit in a DRDA query block, OPTIMIZE FOR n ROWS
lets the DRDA client request multiple blocks of query data on each network
transmission instead of requesting a new block when the first block is full. This use
of OPTIMIZE FOR n ROWS is intended only for applications in which the
application opens a cursor and downloads great amounts of data. The OPTIMIZE
FOR n ROWS clause has no effect on scrollable cursors.
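For example, the following sketch (the table name is hypothetical) asks the server to send as many query blocks per network transmission as the EXTRA BLOCKS settings allow for a large download:

SELECT * FROM MYSCHEMA.SALES_HISTORY
  OPTIMIZE FOR 10000 ROWS;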
Recommendation: Because the application SQL uses only one conversation, do not
try to do other SQL work until the entire answer set is processed. If the application
issues another SQL statement before the previous statement's answer set has been
received, DDF must buffer them in its address space. You can buffer up to 10 MB
in this way.
Because specifying a large number of network blocks can saturate the network,
limit the number of blocks according to what your network can handle. You can
limit the number of blocks used for these large download operations. When the
client supports extra query blocks, DB2 chooses the smallest of the following
values when determining the number of query blocks to send:
v The number of blocks into which the number of rows (n) on the OPTIMIZE
clause can fit. For example, assume you specify 10000 rows for n, and the size of
each row that is returned is approximately 100 bytes. If the block size used is 32
KB (32768 bytes), the calculation is as follows:
(10000 * 100) / 32768 = 31 blocks
v The DB2 server value for the EXTRA BLOCKS SRV field on installation panel
DSNTIP5. The maximum value that you can specify is 100.
v The client's extra query block limit, which is obtained from the DRDA
MAXBLKEXT parameter received from the client. When DB2 for z/OS acts as a
DRDA client, you set this parameter at installation time with the EXTRA
BLOCKS REQ field on installation panel DSNTIP5. The maximum value that
you can specify is 100. DB2 Connect sets the MAXBLKEXT parameter to -1
(unlimited).
If the client does not support extra query blocks, the DB2 server on z/OS
automatically reduces the value of n to match the number of rows that fit within a
DRDA query block.
Recommendation for cursors that are defined WITH HOLD: Do not set a large
number of query blocks for cursors that are defined WITH HOLD. If the
application commits while there are still a lot of blocks in the network, DB2 buffers
the blocks in the requester's memory (the ssnmDIST address space if the requester
is a DB2 for z/OS) before the commit can be sent to the server.
Related concepts
The effect of the OPTIMIZE FOR n ROWS clause in distributed applications
(Application programming and SQL)
Optimizing for small results sets for DRDA
When a client does not need all the rows from a potentially large result set,
preventing the DB2 server from returning all the rows for a query can reduce
network activity and improve performance significantly for DRDA applications.
About this task
You can use either the OPTIMIZE FOR n ROWS clause or the FETCH
FIRST n ROWS ONLY clause of a SELECT statement to limit the number of rows
returned to a client program.
Using OPTIMIZE FOR n ROWS: When you specify OPTIMIZE FOR n ROWS and
n is less than the number of rows that fit in the DRDA query block (default size on
z/OS is 32 KB), the DB2 server prefetches and returns only as many rows as fit
into the query block. For example, if the client application is interested in seeing
only one screen of data, specify OPTIMIZE FOR n ROWS, choosing a small
number for n, such as 3 or 4. The OPTIMIZE FOR n ROWS clause has no effect on
scrollable cursors.
Using FETCH FIRST n ROWS ONLY: The FETCH FIRST n ROWS ONLY clause
does not affect network blocking. If FETCH FIRST n ROWS ONLY is specified and
OPTIMIZE FOR n ROWS is not specified, DB2 uses the FETCH FIRST value to
optimize the access path. However, DRDA does not consider this value when it
determines network blocking.
When both the FETCH FIRST n ROWS ONLY clause and the OPTIMIZE FOR n
ROWS clause are specified, the value for the OPTIMIZE FOR n ROWS clause is
used for access path selection.
Example: Suppose that you submit the following SELECT statement:
SELECT * FROM EMP
FETCH FIRST 5 ROWS ONLY
OPTIMIZE FOR 20 ROWS;
The OPTIMIZE FOR value of 20 rows is used for network blocking and access path
selection.
When you use FETCH FIRST n ROWS ONLY, DB2 might use a fast implicit close.
Fast implicit close means that during a distributed query, the DB2 server
automatically closes the cursor when it prefetches the nth row if FETCH FIRST n
ROWS ONLY is specified or when there are no more rows to return. Fast implicit
close can improve performance because it can save an additional network
transmission between the client and the server.
DB2 uses fast implicit close when the following conditions are true:
v The query uses limited block fetch.
v The query retrieves no LOBs.
v The query retrieves no XML data.
v The cursor is not a scrollable cursor.
v Any of the following conditions is true:
– The cursor is declared WITH HOLD, and the package or plan that contains
the cursor is bound with the KEEPDYNAMIC(YES) option.
– The cursor is declared WITH HOLD and the DRDA client passes the
QRYCLSIMP parameter set to SERVER MUST CLOSE, SERVER DECIDES, or
SERVER MUST NOT CLOSE.
– The cursor is not defined WITH HOLD.
When you use FETCH FIRST n ROWS ONLY and DB2 does a fast implicit close,
the DB2 server closes the cursor after it prefetches n rows, or when there are no
more rows.
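As a minimal sketch, the following non-scrollable cursor, which is not defined
WITH HOLD and retrieves no LOB or XML data, satisfies the conditions above
when limited block fetch is used, so the server can close it implicitly after it
prefetches the fifth row:
DECLARE C1 CURSOR FOR            -- not scrollable, not WITH HOLD
  SELECT EMPNO, LASTNAME         -- no LOB or XML columns
  FROM EMP
  FETCH FIRST 5 ROWS ONLY;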
Data encryption security options
Data encryption security options provide added security for the security-sensitive
data that an application requests from the system. However, the encryption options
can also have a negative impact on performance.
About this task
The following encryption options have a larger performance cost than other
options:
v Encrypted user ID and encrypted security-sensitive data
v Encrypted user ID, encrypted password, and encrypted security-sensitive data
Recommendation: To maximize performance of requester systems, use the
minimum level of security that is required by the sensitivity of the data.
Serving system
For access that uses DRDA, the serving system is the system on which your
remotely bound package executes. For access that uses DB2 private protocol, the
serving system is the DB2 system on which the SQL is dynamically executed.
If you are executing a package on a remote DBMS, then improving performance on
the server depends on the nature of the server. If the remote DBMS on which the
package executes is another DB2 subsystem, then you can use EXPLAIN
information to investigate access path considerations.
Considerations that could affect performance on a remote DB2 server are:
v The maximum number of database access threads that the server allows to be
allocated concurrently. (This is the MAX REMOTE ACTIVE field on installation
panel DSNTIPE.) A request can be queued while waiting for an available thread.
Making sure that requesters commit frequently can let threads be used by other
requesters.
v The Workload Manager priority of database access threads on the remote
system. A low priority could impede your application's distributed performance.
v Whether IDs are managed through DB2, which avoids RACF calls at the server.
When DB2 is the server, it is a good idea to activate accounting trace class 7. This
trace provides accounting information at the package level, which can help you
isolate performance problems.
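For example, a command such as the following sketch starts package-level
accounting; DEST(SMF) is an assumption, so substitute the trace destination that
your site uses:
-START TRACE(ACCTG) CLASS(7) DEST(SMF)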
Related concepts
Chapter 35, “Interpreting data access by using EXPLAIN,” on page 589
“Setting thread limits for database access threads” on page 130
Managing connection requests from remote applications (DB2 Administration
Guide)
Related tasks
“Using z/OS Workload Manager to set performance objectives” on page 134
Controlling connections to remote systems (DB2 Administration Guide)
Part 4. Monitoring DB2 for z/OS performance
Proactive performance monitoring is a key element of maintaining the health of
your system.
Chapter 16. Planning for performance monitoring
When you plan to monitor DB2 performance, consider how to monitor
performance continuously, how and when to perform periodic monitoring, how
to monitor exceptions, and the costs that are associated with monitoring.
Your plan for monitoring DB2 performance should include:
v A master schedule of monitoring. Large batch jobs or utility runs can cause
activity peaks. Coordinate monitoring with other operations so that it need not
conflict with unusual peaks, unless that is what you want to monitor.
v The kinds of analysis to be performed and the tools to be used. Document the
data that is extracted from the monitoring output. These reports can be
produced using Tivoli Decision Support for z/OS, IBM Tivoli OMEGAMON XE,
other reporting tools, manual reduction, or a program of your own that extracts
information from standard reports.
v A list of people who should review the results. The results of monitoring and
the conclusions based on them should be available to the user support group
and to system performance specialists.
v A strategy for tuning DB2 that describes how often changes are permitted and
sets standards for testing their effects. Include the tuning strategy in regular
system management procedures.
Tuning recommendations might include generic database and application design
changes. You should update development standards and guidelines to reflect
your experience and to avoid repeating mistakes.
Cost factors of performance monitoring
You should consider the following cost factors when planning for performance
monitoring and tuning.
v Trace overhead for global, accounting, statistics, audit, and performance traces
v Trace data reduction and reporting times
v Time spent on report analysis and tuning action
Related concepts
Chapter 17, “Using tools to monitor performance,” on page 419
Related tasks
“Minimizing the use of DB2 traces” on page 672
Continuous performance monitoring
Continuous monitoring watches system throughput, resource usage (processor,
I/Os, and storage), changes to the system, and significant exceptions that might
affect system performance.
Procedure
v Try to continually run classes 1, 3, 4, and 6 of the DB2 statistics trace and classes
1 and 3 of the DB2 accounting trace, as shown in the command sketch after this
list.
v In the data that you collect, look for statistics or counts that differ from past
records.
v Pay special attention to peak periods of activity, both of any new application
and of the system as a whole.
v Run accounting class 2 as well as class 1 to separate DB2 times from application
times.
When you run CICS without the open transaction environment (OTE), there is
less need to run with accounting class 2. Application and non-DB2 processing
take place under the CICS main TCB. Because SQL activity takes place under
the SQL TCB, the class 1 and class 2 times are generally close. The CICS
attachment work is spread across class 1, class 2, and time spent processing
outside of DB2; class 1 time thus reports the SQL TCB time and some of the
CICS attachment work. If you are concerned about class 2 overhead and you
use CICS, you can generally run without accounting class 2.
v Statistics and accounting information can be very helpful for application and
database designers. Consider putting this information into a performance
warehouse so that the data can be analyzed more easily by all the personnel
who need the information.
IBM Tivoli OMEGAMON DB2 Performance Expert on z/OS includes a
performance warehouse that allows you to define, schedule, and run processes
that help in monitoring performance trends and tuning.
The data in the performance warehouse can be accessed by any member of the
DB2 family or by any product that supports Distributed Relational Database
Architecture™ (DRDA).
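For illustration, commands such as the following sketch start the statistics and
accounting trace classes that are recommended at the start of this list; DEST(SMF)
is an assumption, so substitute the trace destination that your site uses:
-START TRACE(STAT) CLASS(1,3,4,6) DEST(SMF)
-START TRACE(ACCTG) CLASS(1,2,3) DEST(SMF)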
Planning for periodic monitoring
Periodic monitoring serves to check system throughput, utilized resources
(processor, I/Os, and storage), changes to the system, and significant exceptions
that might affect system performance during peak periods when constraints and
response-time problems are more pronounced.
About this task
A typical periodic monitoring interval of about ten minutes provides information
on the workload achieved, resources used, and significant changes to the system.
In effect, you are taking “snapshots” at peak loads and under normal conditions.
The current peak is also a good indicator of the future average. You might have to
monitor more frequently at first to confirm that expected peaks correspond with
actual ones. Do not base conclusions on one or two monitoring periods, but on
data from several days representing different periods.
You might notice that subsystem response is becoming increasingly sluggish, or
that more applications fail from lack of resources (such as from locking contention
or concurrency limits). You also might notice an increase in the processor time DB2
is using, even though subsystem responses seem normal. In any case, if the
subsystem continues to perform acceptably and you are not having any problems,
DB2 might not need additional tuning.
Procedure
To monitor peak periods:
Gather information from the different parts of your system, including:
v DB2 for z/OS
v z/OS
v The transaction manager (IMS, CICS, or WebSphere)
v DB2 Connect™
v The network
v Distributed application platforms (such as Windows, UNIX, or Linux)
To compare the different results from each source, monitor each source for the
same period of time. Because the monitoring tools themselves require resources,
also consider the overhead of using these tools.
Related tasks
“Minimizing the use of DB2 traces” on page 672
Detailed performance monitoring
You can add detailed monitoring to periodic monitoring when you discover or
suspect a problem. You can also use detailed monitoring to investigate areas that
are not covered by your periodic monitoring. To minimize the cost of the detailed
monitoring, limit the information to the specific application and data as much as
possible.
Procedure
v If you have a performance problem, first verify that it is not caused by faulty
design of an application or database.
v If you believe that the problem is caused by the choice of system parameters,
I/O device assignments, or other factors, begin monitoring DB2 to collect data
about its internal activity.
v If you have access path problems, use the query tuning and visual explain
features of IBM Data Studio, Optimization Service Center, or the DB2 EXPLAIN
facility to locate and tune the problems.
Related concepts
Chapter 17, “Using tools to monitor performance,” on page 419
Related tasks
Planning for and designing DB2 applications (Application programming and
SQL)
Related information
Designing a database (DB2 Administration Guide)
Exception performance monitoring
You can use exception monitoring to watch for specific exceptional values or
events, such as very high response times or deadlocks. Exception monitoring is
most appropriate for response-time and concurrency problems.
With IBM Tivoli OMEGAMON XE, exception monitoring is available in both batch
reporting and the Online Monitor. For information about how to use exception
processing, set exception threshold limits in the threshold limit data set, and use
the exception profiling function of OMEGAMON DB2 Performance Expert, see:
v Using IBM Tivoli OMEGAMON XE on z/OS
v OMEGAMON Report Reference
v OMEGAMON Monitoring Performance from ISPF
v OMEGAMON Monitoring Performance from Performance Expert Client
Related concepts
“Scenario for analyzing concurrency” on page 497
Chapter 17. Using tools to monitor performance
These topics describe the various facilities for monitoring DB2 activity and
performance.
The included information covers facilities within the DB2 product as well as tools
that are available outside of DB2.
Figure 49. Monitoring tools in a DB2 environment
Table 103 describes these monitoring tools.
Table 103. Monitoring tools in a DB2 environment

CICS Attachment Facility statistics
   Provide information about the use of CICS threads. This information can be
   displayed on a terminal or printed in a report.

OMEGAMON CICS Monitoring Facility (CMF)
   Provides performance information about each CICS transaction executed. It
   can be used to investigate the resources used and the time spent processing
   transactions. Be aware that overhead is significant when CMF is used to
   gather performance information.

DB2 catalog queries
   Help you determine when to reorganize table spaces and indexes. See “When
   to reorganize indexes and table spaces” on page 683.

DB2 Connect
   Can monitor and report DB2 server-elapsed time for client applications that
   access DB2 data. See “Reporting server-elapsed time” on page 545.

DB2 DISPLAY command
   Gives you information about the status of threads, databases, buffer pools,
   traces, allied subsystems, applications, and the allocation of tape units for the
   archive read process. For information about the DISPLAY BUFFERPOOL
   command, see “Monitoring and tuning buffer pools using online commands”
   on page 489. For information about using the DISPLAY command to monitor
   distributed data activity, see “The DISPLAY command” on page 541.

DB2 EXPLAIN statement
   Provides information about the access paths used by DB2. See “Investigating
   SQL performance with EXPLAIN” on page 421.

IBM Tivoli OMEGAMON XE for DB2 Performance Expert on z/OS
   A licensed program that integrates the function of DB2 Buffer Pool Analyzer
   and DB2 Performance Monitor (DB2 PM). OMEGAMON provides
   performance monitoring, reporting, buffer pool analysis, and a performance
   warehouse, all in one tool. OMEGAMON monitors all subsystem instances
   across many different platforms in a consistent way. You can use
   OMEGAMON to analyze DB2 trace records and optimize buffer pool usage.
   See “IBM Tivoli OMEGAMON XE” on page 423 for more information.

IBM Tivoli OMEGAMON XE for DB2 Performance Monitor (DB2 PM)
   An orderable feature of DB2 that you can use to analyze DB2 trace records.
   As indicated previously, OMEGAMON includes the function of DB2 PM.
   OMEGAMON is described under “IBM Tivoli OMEGAMON XE” on page 423.

DB2 RUNSTATS utility
   Can report space use and access path statistics in the DB2 catalog. See
   “Gathering and updating statistics” on page 510.

DB2 STOSPACE utility
   Provides information about the actual space allocated for storage groups,
   table spaces, table space partitions, index spaces, and index space partitions.

DB2 trace facility
   Provides DB2 performance and accounting information. It is described under
   Chapter 19, “Using DB2 Trace to monitor performance,” on page 437.

Generalized Trace Facility (GTF)
   A z/OS service aid that collects information to analyze particular situations.
   GTF can also be used to analyze seek times and Supervisor Call instruction
   (SVC) usage, and for other services. See “Recording GTF trace data” on page
   444 for more information.

IMS DFSUTR20 utility
   A print utility for IMS Monitor reports.

IMS Fast Path Log Analysis utility (DBFULTA0)
   An IMS utility that provides performance reports for IMS Fast Path
   transactions.

OMEGAMON IMS Performance Analyzer (IMS PA)
   A separately licensed program that can be used to produce transit time
   information based on the IMS log data set. It can also be used to investigate
   response-time problems of IMS DB2 transactions.

Resource Measurement Facility™ (RMF)
   An optional feature of z/OS that provides system-wide information on
   processor utilization, I/O activity, storage, and paging. RMF provides for
   three basic types of sessions: Monitor I, Monitor II, and Monitor III. Monitor I
   and Monitor II sessions collect and report data primarily about specific
   system activities. Monitor III sessions collect and report data about overall
   system activity in terms of work flow and delay.

System Management Facility (SMF)
   A z/OS service aid that is used to collect information from various z/OS
   subsystems. This information is dumped and reported periodically, such as
   once a day. Refer to “Recording SMF trace data” on page 442 for more
   information.

Tivoli Decision Support for z/OS
   A licensed program that collects SMF data into a DB2 database and allows
   you to create reports on the data. See “Tivoli Decision Support for z/OS” on
   page 424.
Related reference
RUNSTATS (DB2 Utilities)
STOSPACE (DB2 Utilities)
Investigating SQL performance with EXPLAIN
The DB2 EXPLAIN facility enables you to capture detailed information about the
performance of SQL plans, packages, and statements.
By using DB2 EXPLAIN, you can capture and analyze the following types of
information (a brief sketch follows the list):
v A plan, package, or SQL statement when it is bound. The output appears in a
table that you create, called PLAN_TABLE, which is also called a plan table.
Experienced users can also use PLAN_TABLE to give optimization hints to DB2.
v An estimated cost of executing an SQL SELECT, INSERT, UPDATE, or DELETE
statement. The output appears in a table that you create called
DSN_STATEMNT_TABLE, which is also called a statement table.
v User-defined functions referred to in the statement, including the specific name
and schema. The output appears in a table that you create called
DSN_FUNCTION_TABLE, which is also called a function table.
v Execution of SQL statements and groups of SQL statements. By creating a profile
table, you can monitor statements, monitor exceptions such as RLF constraint
violations, obtain snapshot execution reports, and specify statements to be
monitored and explained later.
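As a brief sketch, assuming that a PLAN_TABLE already exists under your
authorization ID, you might explain a statement and query the resulting rows as
follows; QUERYNO 13 and the predicate are illustrative:
EXPLAIN PLAN SET QUERYNO = 13 FOR
SELECT * FROM EMP
WHERE JOB = 'DESIGNER';

SELECT QUERYNO, QBLOCKNO, PLANNO, METHOD, ACCESSTYPE, ACCESSNAME
FROM PLAN_TABLE
WHERE QUERYNO = 13
ORDER BY QBLOCKNO, PLANNO;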
Related tools and accessories: The following tools can also help you to monitor,
analyze, and tune SQL performance:
IBM Data Studio
In addition to providing other administrative functions, IBM Data Studio
provides query serviceability and tuning capabilities from workstation
computers for queries that run in DB2 for z/OS environments. IBM Data
Studio replaces Optimization Service Center for DB2 for z/OS, which is
deprecated. You can use IBM Data Studio for the following query
optimization tasks:
v View query activity and identify problematic queries
v Get tuning recommendations for statistics that could improve query
performance
v Graph the access plan for a query
v Graphically generate optimization hints
v Generate reports with information about tables and predicates that are
associated with a particular query
v View the values of subsystem parameters
IBM Optim Query Workload Tuner for z/OS
Optim Query Tuner and Optim Query Workload Tuner replace the full
query tuning functionality of DB2 Optimization Expert for z/OS.
Optimization Service Center for DB2 for z/OS
IBM Optimization Service Center for DB2 for z/OS is a workstation tool
that helps you tune your queries and query workloads. You can quickly
get customized tuning recommendations or perform your own in-depth
analysis by graphing the access plan for a query. You can perform the
following key tasks from Optimization Service Center:
v View query activity and identify problematic queries and query
workloads
v Get tuning recommendations for statistics that could improve query
performance
v Get tuning recommendations for statistics that could improve workload
performance
v Graph the access plan for a query
v Graphically generate optimization hints
v Generate reports with information about tables and predicates that are
associated with a particular query
v View the values of subsystem parameters
v View a query with relevant statistics displayed next to the appropriate
objects
OMEGAMON
IBM Tivoli OMEGAMON XE for DB2 on z/OS is a performance
monitoring tool that formats performance data. OMEGAMON combines
information from EXPLAIN and from the DB2 catalog. It displays
information about the following objects and structures:
v Access paths
v DBRMs
v Host variable definitions
v Indexes
v Join sequences
v Lock types
v Ordering
v Plans
v Packages
v Tables
v Table spaces
v Table access sequences
Output is presented in a dialog rather than as a table, making the
information easy to read and understand. DB2 Performance Monitor (DB2
PM) performs some of the functions of OMEGAMON.
DB2-supplied EXPLAIN stored procedure
Users without authority to run EXPLAIN directly can obtain access path
information for certain statements by calling the DB2-supplied EXPLAIN
stored procedure (DSNAEXP).
Related reference
“EXPLAIN tables” on page 753
IBM Tivoli OMEGAMON XE
OMEGAMON provides performance monitoring, reporting, buffer pool analysis,
and a performance warehouse all in one tool.
v OMEGAMON XE for DB2 Performance Expert on z/OS includes the function of
OMEGAMON DB2 Performance Monitor (DB2 PM), which is also available as a
stand-alone product. Both products report DB2 instrumentation in a form that is
easy to understand and analyze. The instrumentation data is presented in the
following ways:
– The Batch report sets present the data you select in comprehensive reports or
graphs containing system-wide and application-related information for both
single DB2 subsystems and DB2 members of a data sharing group. You can
combine instrumentation data from several different DB2 locations into one
report.
Batch reports can be used to examine performance problems and trends over
a period of time.
– The Online Monitor gives a current “snapshot” view of a running DB2
subsystem, including applications that are running. Its history function
displays information about subsystem and application activity in the recent
past.
Both a host-based and a Workstation Online Monitor are provided. The
Workstation Online Monitor substantially improves usability, simplifies online
monitoring and problem analysis, and offers significant advantages. For
example, from Workstation Online Monitor, you can launch Visual Explain so
you can examine the access paths and processing methods chosen by DB2 for
the currently executing SQL statement.
For more information about the Workstation Online Monitor, see
OMEGAMON Monitoring Performance from Performance Expert Client or DB2
Performance Expert for z/OS and Multiplatforms Monitoring Performance from
Workstation for z/OS and Multiplatforms.
In addition, OMEGAMON contains a Performance Warehouse function that lets
you:
– Save DB2 trace and report data in a performance database for further
investigation and trend analysis
– Configure and schedule the report and load process from the workstation
interface
– Define and apply analysis functions to identify performance bottlenecks.
v OMEGAMON for DB2 Performance Expert also includes the function of DB2
Buffer Pool Analyzer, which is also available as a stand-alone product. Both
products help you optimize buffer pool usage by offering comprehensive
reporting of buffer pool activity, including:
– Ordering by various identifiers such as buffer pool, plan, object, and primary
authorization ID
– Sorting by getpage, sequential prefetch, and synchronous read
– Filtering capability
In addition, you can simulate buffer pool usage for varying buffer pool sizes and
analyze the results of the simulation reports to determine the impact of any
changes before making those changes to your current system.
Related information
Tivoli OMEGAMON XE for DB2 Performance Expert on z/OS
Tivoli OMEGAMON XE for DB2 Performance Monitor on z/OS
Tivoli Decision Support for z/OS
Tivoli Decision Support for z/OS, formerly known as Tivoli Performance Reporter
for z/OS, collects data into a DB2 database and allows you to create graphical and
tabular reports to use in managing systems performance.
The data can come from different sources, including SMF, the IMS log, the CICS
journal, RMF, and DB2.
When you plan to use Tivoli Decision Support for z/OS, consider the following
points:
v Tivoli Decision Support data collection and reporting are based on user
specifications. Therefore, an experienced user can produce more suitable reports
than the predefined reports produced by other tools.
v Tivoli Decision Support provides historical performance data that you can use to
compare a current situation with previous data.
v Tivoli Decision Support can be used very effectively for reports based on the
DB2 statistics and accounting records. When you use it for the performance
trace, consider that:
– Because of the large numb