Oracle FMW 11g R1 SOA with Oracle Database Real Application Clusters Assessment
Oracle Maximum Availability Architecture White Paper
November 2011

Maximum Availability Architecture: Oracle Best Practices For High Availability
Oracle White Paper—Oracle FMW 11g R1 SOA with Oracle Database Real Application Cluster Assessment
Executive Summary ........................................................... 2
Introduction ................................................................ 3
Oracle FMW SOA RAC Database Index Block Contention .......................... 3
Analyzing the SOA Oracle Database Wait Events ............................... 3
Possible Solutions for Index Contention Scenarios in the SOA Database ...... 13
What indexes should be partitioned? ........................................ 21
How to partition the contended indexes? .................................... 25
Summary .................................................................... 33
Oracle FMW SOA Datasources' Pool Capacity and Statement Cache Size ......... 28
Oracle FMW SOA Datasources' Seconds to Trust and Idle Connection ........... 30
References ................................................................. 34
Executive Summary
More and more customers are using Oracle Real Application Clusters as the High Availability and Scalability
solution for Oracle Fusion Middleware Service Oriented Architecture’s dehydration and metadata repository.
Oracle Real Application Clusters (Oracle RAC) enables an Oracle database to run across a cluster of servers,
providing fault tolerance, performance, and scalability with no application changes necessary. Multiple
customers using Oracle RAC for Oracle FMW SOA with its default configuration have reported problems in
the system’s performance and scalability as their database grows in size. One of the common approaches that
has been taken to tackle this problem is purging the Oracle FMW SOA schemas so that the information from
old composites is removed and the database's size is reduced. However, in many cases (due to regulations, or
because growth rates do not allow purge cycles to be as frequent as required) this may not be a
feasible approach. The purpose of this paper is to provide both a procedure for analyzing the possible causes
of performance degradations in Oracle FMW SOA’s RAC Database and the recommended solutions for them.
By using the appropriate connection pool tuning and database adjustments for the schemas used by the Oracle
FMW SOA components (mainly global hash partitions) the document will show how to achieve performance
gains in the order of 15-25% (end to end response times) as compared to the default configuration for the
Oracle FMW SOA’s RAC Database.
Introduction
Oracle Fusion Middleware Service Oriented Architecture (SOA) is a database intensive middleware component.
Data growth from auditing and dehydration usually has a significant impact on database performance and
throughput. Depending on the type of SOA composites running, multiple accesses to tables like
composite_instance, cube_instance, mediator_instance, dlv_message, etc. may take place. One row of
information is usually stored per instance in these tables and a unique identifier (ID) based on a sequence is
used for each instance of a composite. The standard tuning of database parameters, redo log sizing and
automatic storage management that is generic to any Oracle Real Application Clusters database applies to
Oracle FMW SOA's repository. There are, however, very specific areas that are especially relevant to Oracle
FMW SOA Suite’s performance and high availability and that have been identified as critical to optimize an
Oracle FMW SOA System:
• Oracle FMW SOA RAC Database Index Contention
• Oracle FMW SOA Datasources Configuration
The following sections address these areas separately.
Oracle FMW SOA RAC Database Index Block Contention
Index block contention is very common in busy databases, and it is especially common on tables that have
monotonically increasing key values. In OLTP systems, index management activities are constantly taking place
and these events can cause transient waits. One of the most common incarnations of index contention in an
Oracle environment is the “high key” problem derived from the b-tree structure of an index. Oracle FMW
SOA makes intensive use of sequences for identifying instances, documents etc. When rows are inserted based
on these sequences, each insert of a low-level index tree node must propagate upwards to the high-key
indicators in the b-tree index. Until this propagation completes, other index operations must wait. This may
cause a considerable amount of wait events during SOA processing/updates to the Oracle RAC database. Each
specific type of composite may cause contention on different indexes; it is hence necessary to determine
which objects are the most critical causes of the bottlenecks.
Analyzing the SOA Oracle Database Wait Events
The best way to understand how to determine the elements that are causing undesired waits in the Oracle
FMW SOA Database is by using a real example and walking through the steps to detect the possible
performance issues. In our case, we set up a two mid-tier SOA cluster fronted with OHS and an LBR, using a two-node
Oracle 11g R2 RAC. The Fusion Order Demo (FOD) example was deployed to the SOA cluster. Oracle
Application Testing Suite (ATS) was used to stress the system. An Oracle Load Testing (OLT) Controller was
set up in a Windows node and an Oracle ATS agent was deployed in a Linux server to stress the system.
Different numbers of virtual users (20, 25, 40) were configured to trigger the FOD order processing web
service. Figure 1 shows the topology used for testing:
Image 1: Topology used for the tests
Each invocation by a virtual user involved 3 different composites, with multiple Oracle Mediator, Oracle BPEL
and JMS and File adapter invocations. A total number of 4000 to 10000 iterations (depending on the test
batteries) were performed per virtual user. At the end of the test, both Oracle ATS and Oracle Enterprise
Manager Automatic Workload Repository (AWR) reports were used to analyze the performance of the system.
The first test was conducted with the default database objects, schemas and indexes that Oracle Repository
Creation Utility (RCU) creates in the database for the soa_infra schema. The following diagram obtained from
Oracle Application Testing Suite reflects the evolution of the “transactions per second” metric during the tests:
Image 2: Performance for the SOA System with the default configuration
Although not easy to appreciate due to the scale of the transactions/sec Y axis, there is a constant degradation
in performance. This was also confirmed by the throughput measured in the database and in the SQL query
execution times. Comparing the initial and final stages of the tests makes this easier to appreciate. The
following diagrams show the throughput in the initial stages of the test and in its last iterations:
Initial stage of the test:
Image 3: Initial performance for the SOA System with the default configuration
Final stages of the test:
Image 4: Final performance for the SOA System with the default configuration
From values around 12 transactions per second in the initial stages, the system evolves to values in the order of
11.5 transactions per second. Although the CPU idle time does not diminish much on the mid-tier nodes, it is
constantly decreasing on the database nodes. The following diagram shows the evolution of the CPU idle
percentage, in green for the middle tiers and in orange for the database nodes:
Image 5: CPU Idle time for the Midtier and Database
Let’s analyze the Oracle FMW SOA RAC database metrics. For this we use the Automatic Workload
Repository (AWR) reports for the window of time used for the test (Oracle ATS allows generating snapshots
at the beginning and end of the tests):
1. Log on to the Oracle Enterprise Manager Database Control.
2. Click the Server tab.
3. Click the "Automatic Workload Repository" link under "Statistics Management".
4. Click "Run AWR Report" and select the appropriate snapshots (beginning and end of the test).
The relevant information in the AWR report is in the wait event tables:
Image 6: Top events for a SOA RAC database with the default configuration
The table contains the following relevant wait events:
• db file sequential read : According to the Oracle Database documentation, a db file sequential read is
an event that shows a wait for a foreground process while doing a sequential read from the database.
This is an operation most commonly used for single block reads. Single block reads are most
commonly seen for index block access or table block access by rowid (e.g., to access a table block after
an index entry has been seen).
• log file sync: Commits are not complete until LGWR writes log buffers including commit redo records
to log files. After posting LGWR to write, user or background processes wait for LGWR to signal
back, with a 1-second timeout. This wait may be caused by CPU starvation or slow I/O, so it is unlikely
to be specific to the SOA system.
• gc current block busy: The "gc current block busy", "gc buffer busy acquire" and "gc cr block busy"
wait events indicate that the local instance that is making the request is not receiving a current or
consistent read block fast enough. The term "busy" in these events' names indicates that the sending of
the block was delayed on a remote instance. For example, a block cannot be shipped immediately if
Oracle Database has not yet written the redo for the block's changes to a log file.
Leaving aside the performance implications derived from CPU starvation, slow storage or the cluster
interconnect speed (none of which is intrinsic to the SOA system), the AWR
reports give us clear indications that there is some sort of contention for blocks being accessed. We can use
Oracle Active Session History (ASH) to analyze the specific root causes in detail.
Let's first identify the number of wait events for each wait class. To do this, we query the
dba_hist_active_sess_history view using the snapshot IDs that were generated by Oracle ATS when the tests
started and ended:
SQL>select wait_class_id, wait_class, count(*) cnt from dba_hist_active_sess_history
where snap_id between 2417 and 2426 group by wait_class_id, wait_class order by 3;
TABLE 1. WAIT EVENTS AND THEIR OCCURRENCES

WAIT_CLASS_ID   WAIT_CLASS      OCCURRENCES
2723168908      Idle            3
2000153315      Network         5
4217450380      Application     50
3290255840      Configuration   569
3875070507      Concurrency     1172
4108307767      System I/O      2966
1893977003      Other           3637
3386400367      Commit          5532
1740759767      User I/O        7403
3871361733      Cluster         11327
The results show that cluster waits are the main cause of contention (they account for over a third of the
wait events). Let's see the associated wait events in detail:
SQL>select event_id, event, count(*) cnt from dba_hist_active_sess_history
where snap_id between 2417 and 2426 and wait_class_id=3871361733 group by event_id, event
order by 3;
TABLE 2. WAIT EVENTS AND THEIR OCCURRENCES

EVENT_ID        EVENT                   OCCURRENCES
1520064534      gc cr block busy        1290
2000153315      gc buffer busy acquire  2252
2701629120      gc current block busy   5255
This confirms the overall results in the AWR report: there is a considerable gap between the "gc current
block busy" event and the other events. Together, the "gc buffer busy acquire" and "gc current block busy"
events are causing 85% of the cluster waits. So what is the SQL code behind these events?
SQL>select sql_id, count(*) cnt from dba_hist_active_sess_history
where snap_id between 2417 and 2426 and event_id in (2701629120, 1912606394) group by
sql_id having count(*)>1000 order by 2;
TABLE 3. SQL IDS FOR THE CONTENDING QUERIES AND THEIR OCCURRENCES

SQL_ID          OCCURRENCES
88m9cswp1ccx9   329
1kv8x2bwzn6mm   387
dz4v8ywm0yq5b   456
43kzjnjyatq9s   499
7jjp97nb9h2up   536
6gdrgwgb1gbja   537
ffnmvfxa4bg7u   835
0r5xv5d42p3p6   970
We can determine the specific SQL statement for each ID:
SQL>select sql_text from dba_hist_sqltext where sql_id='0r5xv5d42p3p6';

INSERT INTO REFERENCE_INSTANCE (ID, UPDATED_TIME, PROTOCOL_CORRELATION_ID,
BINDING_TYPE, CREATED_TIME, REFERENCE_NAME, CONVERSATION_ID, STACK_TRACE, PARENT_ID,
ERROR_CODE, COMPOSITE_INSTANCE_ID, OPERATION_NAME, COMPOSITE_DN, CREATED_BY, STATE,
ECID, ERROR_MESSAGE, ADDITIONAL_PROPERTIES, UPDATED_BY, CPST_PARTITION_DATE) VALUES
(:1 , :2 , :3 , :4 , :5 , :6 , :7 , :8 , :9 , :10 , :11 , :12 , :13 , :14 , :15 , :16 ,
:17 , :18 , :19 , :20 )
Using similar queries for the other statements, it was shown that the reference_instance,
mediator_case_instance and composite_instance tables were the major cause of waits and delays in
processing. We can dig a little deeper to find the specific objects causing the contention:
SQL>select current_obj#, count(*) cnt from dba_hist_active_sess_history where snap_id
between 2417 and 2426 and event_id=2701629120 and sql_id='0r5xv5d42p3p6' group by
current_obj# order by 2;
TABLE 5. OBJECT IDS CAUSING THE CONTENTION AND THEIR OCCURRENCES

OBJECT_ID   OCCURRENCES
97366       112
97367       144
97370       207
97365       246
SQL>select object_id, owner, object_name, object_type from dba_objects where object_id
in (97366, 97367, 97370,97365);
TABLE 6. OBJECT NAMES CAUSING CONTENTION

OBJECT_ID   OWNER      OBJECT_NAME                   OBJECT_TYPE
97366       SOAINFRA   REFERENCE_INSTANCE_ID         INDEX
97367       SOAINFRA   REFERENCE_INSTANCE_CO_ID      INDEX
97370       SOAINFRA   REFERENCE_INSTANCE_TIME_CDN   INDEX
97365       SOAINFRA   REFERENCE_INSTANCE_ECID       INDEX
This gives us a good idea of the problem: the Oracle RAC cluster used for SOA is waiting on indexes
(which grow constantly as new composite instances arrive) on different tables. Similar queries to the
ones above can be used to determine the indexes causing this contention in other insert operations.
This is also consistent with the top SQL statements highlighted in the Oracle AWR report for cluster waits.
The following indexes were identified for this specific case (Oracle Fusion Order Demo) after a few tests:
• SOAINFRA.REFERENCE_INSTANCE_ECID
• SOAINFRA.REFERENCE_INSTANCE_CO_ID
• SOAINFRA.STATE_TYPE_DATE
• SOAINFRA.REFERENCE_INSTANCE_ID
• SOAINFRA.REFERENCE_INSTANCE_TIME_CDN
• SOAINFRA.MEDIATOR_CASE_DETAIL_INDEX1
• SOAINFRA.CASE_INSTANCE_PKEY (primary key for the SOAINFRA.MEDIATOR_CASE_INSTANCE table)
• SOAINFRA.CI_ECID
• SOAINFRA.CI_CREATION_DATE
• SOAINFRA.MEDIATOR_INSTANCE_INDEX1
• SOAINFRA.MEDIATOR_INSTANCE_INDEX2
• SOAINFRA.COMPOSITE_INSTANCE_CREATED
• SOAINFRA.COMPOSITE_INSTANCE_ID
• SOAINFRA.AT_PK (audit trail enabled; primary key for the SOAINFRA.AUDIT_TRAIL table)
• SOAINFRA.AC_PK (audit trail enabled; primary key for the SOAINFRA.AUDIT_COUNTER table)
Note that the objects causing the contention can (and likely will) vary depending on the type of
composite. It is hence recommended to use the set of steps and verifications above to determine, in each
specific case, the objects causing the contention in the cluster.
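The step-by-step drill-down above (wait class, then wait event, then SQL ID, then object) can be condensed into a single ASH query. The following is a simplified sketch, not an official Oracle diagnostic script; the snap_id range is the one from this paper's tests and must be replaced with the snapshot IDs of your own test window:

```sql
-- Sketch: drill down from cluster wait events directly to the contended objects.
-- Replace the snap_id range (2417-2426) with your own snapshot interval.
SELECT o.owner, o.object_name, o.object_type, h.event, COUNT(*) AS occurrences
FROM   dba_hist_active_sess_history h
JOIN   dba_objects o
ON     o.object_id = h.current_obj#
WHERE  h.snap_id BETWEEN 2417 AND 2426
AND    h.wait_class = 'Cluster'
GROUP  BY o.owner, o.object_name, o.object_type, h.event
ORDER  BY occurrences DESC;
```

The objects at the top of this listing are the candidates for the partitioning measures discussed in the next section.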
Possible Solutions for Index Contention Scenarios in the SOA Database
There are different measures that can be taken to reduce this contention. Some websites and blogs external
to Oracle have recommended using failover mode instead of load balancing for the JDBC pool. The
reasoning is that by using one single database instance the contention will be eliminated while still
providing high availability. This is not recommended by Oracle. In fact, a simple test with the out-of-the-box
configuration for the SOA multi data source proves that using failover mode provides worse performance, as
shown in the following diagrams:
Image 7: Throughput for SOA Servers using failover mode (single database instance) vs. load balancing mode (2 RAC
instances)
TABLE 7. THROUGHPUT VALUES FOR THE WLS_SOA SERVERS

SERVER     DATABASE         AVERAGE THROUGHPUT
WLS_SOA1   FAILOVER MODE    129.50
WLS_SOA2   FAILOVER MODE    142.45
WLS_SOA1   LOAD BALANCING   151.17
WLS_SOA2   LOAD BALANCING   152.17
That is, without any tuning at all, the throughput of a SOA system using failover mode is approximately 12%
lower than the throughput of a SOA system using RAC in load balancing mode. Additionally, the
implications of using failover mode go well beyond the pure performance aspect: with failover mode the
failover latency is worse, since there is a delay while services restart, and the system's horizontal
scalability is null, since only one database instance is providing service at any point in time.
There are a few other strategies intended to reduce the effect of this type of contention that the Oracle
Database documentation recommends. The main ones are:
• Using reverse key indexes: A reverse key index, compared to a standard index,
reverses the bytes of each indexed column (except the rowid) while keeping the column order.
Such an arrangement can help avoid performance degradation with Oracle Real Application
Clusters where modifications to the index are concentrated on a small set of leaf blocks. By
reversing the keys of the index, the insertions become distributed across all leaf keys in the
index. However, using the reverse key arrangement eliminates the ability to run an index range
scan on the index. Because lexically adjacent keys are not stored next to each other in
a reverse key index, only fetch-by-key or full-index (table) scans can be performed.
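As an illustration only (the index and table names below are hypothetical, not part of the SOAINFRA schema, and this was not the approach adopted for the tests in this paper), a reverse key index is created by appending the REVERSE keyword:

```sql
-- Hypothetical example: recreate an index with reversed key bytes to spread
-- monotonically increasing inserts across leaf blocks. Names are illustrative.
DROP INDEX SOAINFRA.SOME_SEQ_INDEX;
CREATE INDEX SOAINFRA.SOME_SEQ_INDEX ON SOAINFRA.SOME_TABLE (ID) REVERSE;
```

Keep in mind that, as noted above, range scans on the ID column would no longer be able to use this index.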
• Setting the cache and noorder options for the sequences used: If there is insufficient caching of
sequences, contention will show up as an increase in service times for DML. However, CACHE and
NOORDER are already used by the relevant sequences in the SOAINFRA schema:
SQL>select dbms_metadata.get_ddl('SEQUENCE','WFTASKSEQ','SOAINFRA') from dual;

CREATE SEQUENCE "PRODPS3J20_SOAINFRA"."WFTASKSEQ" MINVALUE 1 MAXVALUE
9999999999999999999999999999 INCREMENT BY 1 START WITH 425200 CACHE 20 NOORDER NOCYCLE
If sequences cause contention, the gv$enqueue_stat view will show events with enqueue type SQ:
SQL>select count(*), eq_type from gv$enqueue_stat group by eq_type order by 1;
This count was low for the Oracle FMW SOA composites tested. If it reaches high values in your system,
consider increasing the cache size assigned to the sequence.
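Should the SQ enqueue count grow, the sequence cache can be enlarged in place. The sequence name below is one of the SOAINFRA sequences shown earlier; the cache value of 1000 is only an illustrative starting point, not an Oracle recommendation:

```sql
-- Hypothetical tuning step: enlarge the sequence cache to reduce SQ enqueue
-- contention. The value 1000 is an example only; size it for your workload.
ALTER SEQUENCE SOAINFRA.WFTASKSEQ CACHE 1000;
```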
• Adjusting the index block size: According to the Oracle Database documentation, a block
size of 8 KB is optimal for most systems. A small block size reduces block contention; however, it is
not recommended for large rows, and SOA uses large rows in many tables of the SOAINFRA
schema.
• Using hash partitioned global indexes: This is the most recommended approach for
performance improvement in index contention scenarios. Hash partitioning can improve the
performance of indexes where a small number of leaf blocks in the index have high contention in
a multiuser OLTP environment. In some OLTP applications, index insertions happen only at the
right edge of the index. This situation occurs when the index is defined on monotonically
increasing columns, as is the case for a few tables in the SOAINFRA schema. In such
situations, the right edge of the index becomes a hotspot because of contention for index pages,
buffers, latches for update, and additional index maintenance activity, which results in
performance degradation. With hash partitioned global indexes, index entries are hashed to
different partitions based on the partitioning key and the number of partitions. This spreads out
contention over the number of defined partitions, resulting in increased throughput. Since index
maintenance can present a significant CPU and I/O resource demand in any write-intensive
application, it is recommended to rebuild only the necessary indexes.
In Oracle Fusion Order Demo’s case, the database’s performance improvement when using global hash
partitioned indexes is dramatic:
Image 8: Top events for a SOA RAC database using global hash partitions for identified contended indexes
As we can see, the improvement is considerable:
• db file sequential read wait time without global hash partitions: 62021 secs
• db file sequential read wait time with global hash partitions: 18110 secs (wait reduced by 70%)
• gc current block busy wait time without global hash partitions: 52469 secs
• gc current block busy wait time with global hash partitions: 14973 secs (wait reduced by 71%)
In total, the wait time was reduced by 43911 + 37496 = 81407 sec (i.e., around 0.3 seconds per web service
invocation). This was also consistent with the reduction in wait time and average active sessions
shown by Oracle Enterprise Manager Database Console for the SOA database cluster:
Image 9: Active Sessions comparison for a SOA System without/with global hash partitioned indexes
In our case, this initial improvement in the Oracle Database wait events (a result of using global hash
partitions for the main contended indexes) showed up as a 10% gain in the end-to-end response
time (i.e., in the response time of the Fusion Order Demo web service invocation). This was
primarily due to the fact that storage latency and CPU saturation were also high and prevailed over
other events (as can be verified in the "log file sync" wait time metrics in Images 7 and 8). However, the
values got dramatically better as the number of instances in the SOA database and the concurrency
increased (i.e., as more concurrent users accessed the system and more SOA instances were stored in
the RAC database).
The following shows the data comparison for tests conducted with 40 concurrent users and a wait time of 1
second, for a period of time when the number of instances in the database was already reaching the order of
tens of thousands. It compares the exact same stress test for a SOA database without the recommended
hash partitions for its indexes and for the same database with hash partitions. This first diagram describes
the number of sessions and the events that they are waiting for. The grey area reflects the waits on cluster work,
which is the event class affected by index contention:
Image 10: Active Sessions comparison for a SOA System without/with global hash partitioned indexes with high number
of composite instances already stored
In this table we can see the amount of time spent waiting on each event. The table on the right (labeled
"2nd") shows the times measured for the hash partitioned SOA database. Observe the tremendous reduction
in wait time for the "gc current block busy", "gc buffer busy acquire" and "enq TX: index contention"
events.
Image 11: Comparison of top wait events for a SOA RAC Database with the default configuration and a SOA RAC
Database using global hash partitions for the most contended indexes
This translated into much lower contention:
Image 12: Buffer waits comparison for a SOA RAC Database with the default configuration and a SOA RAC Database using
global hash partitions for the most contended indexes
and in a considerable improvement in the execution time for the most critical queries in the SOA system:
Image 13: Cluster waits for the top SQL statements in a SOA RAC Database with the default configuration and a SOA RAC
Database using global hash partitions for the most contended indexes
Image 14: Execution time for the top SQL queries in a SOA RAC Database with the default configuration and a SOA RAC
Database using global hash partitions for the most contended indexes
Finally, here is the response time (end to end, i.e., from the invocation of the web service to the return of the
response) for the two stress tests:
Image 15: End to end response times for a SOA System with/without global hash partitioned indexes
The system has improved its response time by approximately 25%.
If the number of composite instances gets even larger, the performance improvement is even better. The
following picture compares the performance and throughput for a SOA system using a single instance
database, a RAC database with the default configuration for SOA schemas and a RAC database with
appropriate indexes using global partitions by hash. This comparison was done with over 300000 instances
pre-seeded in the database.
Image 16: Throughput and response time for different SOA Database Configurations
What indexes should be partitioned?
For each specific SOA system, the indexes causing the contention may vary. The analysis presented in the
section "Analyzing the SOA Oracle Database Wait Events" should be performed: the indexes causing the
contention should be identified and then recreated with the appropriate global partitions.
Global hash partitioned indexes may, however, negatively affect the performance of range queries.
Specifically, the impact on Oracle Enterprise Manager FMW Control date-range queries (or any other
custom range queries on the SOAINFRA tables) should be considered before partitioning the involved
indexes. The possible negative impact on analysis-oriented queries vs. the runtime improvements achieved
by using global hash partitioned indexes should be analyzed individually for each Oracle FMW SOA
system.
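As an illustration, an analysis-style date-range search resembles the following sketch. This is a simplified predicate, not the literal SQL issued by Enterprise Manager FMW Control, and the column names are assumed from the insert statement shown earlier. With a hash partitioned time-based index, such a scan must probe every index partition instead of a single index subtree:

```sql
-- Simplified sketch of a date-range query of the kind penalized by hash
-- partitioning time-based indexes; not the literal EM FMW Control SQL.
SELECT ID, COMPOSITE_DN, CREATED_TIME
FROM   SOAINFRA.COMPOSITE_INSTANCE
WHERE  CREATED_TIME BETWEEN TO_TIMESTAMP('2011-10-01', 'YYYY-MM-DD')
                        AND TO_TIMESTAMP('2011-10-02', 'YYYY-MM-DD');
```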
The following indexes are especially critical when retrieving instance information and doing time-based
analysis (operations like searching for composite instances that occurred in specific intervals) in Oracle
Enterprise Manager Fusion Middleware Control:
• STATE_TYPE_DATE ON SOAINFRA.DLV_MESSAGE
• REFERENCE_INSTANCE_TIME_CDN ON SOAINFRA.REFERENCE_INSTANCE
• CI_CREATION_DATE ON SOAINFRA.CUBE_INSTANCE
• COMPOSITE_INSTANCE_CREATED ON SOAINFRA.COMPOSITE_INSTANCE
• MEDIATOR_INSTANCE_INDEX2 ON SOAINFRA.MEDIATOR_INSTANCE
The response time of Enterprise Manager searches can be penalized if these indexes are partitioned. The
following tables summarize the performance results obtained for a typical date-range search operation in the
SOA instances screen for the FOD sample composite (the test includes logon, access to the soa-infra screen
and a search for OrderBooking composite instances that occurred between two specific dates).
Image 17: Average response times for date-range searches in EM when using the default indexes
Image 18: Average response times for date-range searches in EM when using global hash partitioned indexes for the
STATE_TYPE_DATE, REFERENCE_INSTANCE_TIME_CDN, CI_CREATION_DATE,
COMPOSITE_INSTANCE_CREATED and MEDIATOR_INSTANCE_INDEX2 indexes
Image 19: Average response times for date-range searches in EM when using global hash partitioned indexes for hot block
indexes excluding the time-based ones
The results demonstrate that partitioning the time-based indexes makes the search operation 10 times slower.
When only the non-time-based indexes are partitioned, the results are 15% better as compared to the default
indexes. The wait times in the database are reduced:
Image 20: Database wait time improvements when using global hash partitioned indexes (excluding date-related indexes)
compared to not using global hash partitioned indexes at all
The time per transaction also improves to a great extent:
Image 21: Performance improvement per transaction when using global hash partitioned indexes (excluding date-related
indexes) compared to not using global hash partitioned indexes at all
The number of partitions used for the indexes also affects the behavior (both in reducing contention, hence
improving throughput, and in delaying range-based queries). Depending on how Oracle Enterprise Manager
FMW Control is used for SOA analysis, it may be worth partitioning the indexes to a small degree, thus
reducing contention without penalizing analysis operations much. The following diagram shows the impact
of using different numbers of partitions for the date-related indexes on the response time of EM instance
search operations.
Image 22: EM load testing response time vs number of partitions for the date-based indexes
If range-based queries are not expected to be performed regularly on the SOAINFRA schema (EM
searches, EM browsing of instances or custom queries against the SOA schemas), using a small number of
partitions for the date-based indexes may be a feasible approach.
How to partition the contended indexes?
For new installations the best approach is to just drop the existing affected indexes and recreate them with the
appropriate hash partitions. It is recommended to maintain the parameters originally used for the definition of
the index as these may have been optimized already. To determine the existing definition of the index, the
dbms_metadata package can be used:
SQL>select dbms_metadata.get_ddl('INDEX','REFERENCE_INSTANCE_ECID','SOAINFRA') from
dual;
OUTPUT: CREATE INDEX "SOAINFRA"."REFERENCE_INSTANCE_ECID" ON
"SOAINFRA"."REFERENCE_INSTANCE" ("ECID") PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE
STATISTICS TABLESPACE "SOAINFRA"
It is easy to take the output of the query and add the hash partitioning:
SQL>DROP INDEX SOAINFRA.REFERENCE_INSTANCE_ECID;
SQL>CREATE INDEX SOAINFRA.REFERENCE_INSTANCE_ECID ON SOAINFRA.REFERENCE_INSTANCE
(ECID) PCTFREE 10 INITRANS 100 MAXTRANS 255 COMPUTE STATISTICS STORAGE(INITIAL 65536
NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST
GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE SOAINFRA GLOBAL PARTITION BY HASH (ECID) PARTITIONS 128;
(NOTE: Oracle recommends using a power of 2 for the number of partitions to prevent data from clustering
within specific partitions.)
For systems that are already in use, the impact of re-shaping the indexes can be minimized by using the
dbms_redefinition package. This package provides a mechanism to make table-structure modifications
without significantly affecting the availability of the table (online redefinition). Redefining tables online
provides a substantial increase in availability compared to traditional methods of redefining tables. When a
table is redefined online, it is accessible to both queries and DML during most of the redefinition process.
The table is locked in exclusive mode only during a very small window that is independent of the size
of the table and the complexity of the redefinition, and that is completely transparent to users. To redefine
online an index that will use global hash partitions, the steps are:
1. Create an interim table with the partitioned index (without constraints if the index is related to any). Use a
different name for the partitioned index in the interim table from the one in the original table
(example: pk_new vs. pk)
2. Use the dbms_redefinition package to redefine the table without copying indexes but copying
constraints and all other dependent objects
3. Drop the interim table
4. Rename the index as it was initially named in the original table (rename pk_new to pk)
The following script provides an example using the AC_PK index case (the primary key in the
AUDIT_COUNTER table):
CREATE TABLE SOAINFRA.AUDIT_COUNTER_RD (CIKEY NUMBER(*,0), LAST_COUNT_ID NUMBER(*,0),
LAST_EVENT_ID NUMBER(*,0), LAST_DETAIL_ID NUMBER(*,0), CI_PARTITION_DATE TIMESTAMP(6))
SEGMENT CREATION IMMEDIATE
PCTFREE 10 PCTUSED 40 INITRANS 40 MAXTRANS 255
NOCOMPRESS LOGGING TABLESPACE SOAINFRA;

CREATE INDEX SOAINFRA.AC_PK_RD ON SOAINFRA.AUDIT_COUNTER_RD (CIKEY) PCTFREE 10
INITRANS 40 MAXTRANS 255 COMPUTE STATISTICS TABLESPACE SOAINFRA GLOBAL PARTITION BY
HASH(CIKEY) PARTITIONS 64;

DECLARE
  redefinition_errors PLS_INTEGER := 0;
BEGIN
  DBMS_REDEFINITION.START_REDEF_TABLE (
    uname        => 'SOAINFRA'
   ,orig_table   => 'AUDIT_COUNTER'
   ,int_table    => 'AUDIT_COUNTER_RD'
   ,col_mapping  => NULL
   ,options_flag => DBMS_REDEFINITION.CONS_USE_PK
  );
  DBMS_REDEFINITION.COPY_TABLE_DEPENDENTS (
    uname            => 'SOAINFRA'
   ,orig_table       => 'AUDIT_COUNTER'
   ,int_table        => 'AUDIT_COUNTER_RD'
   ,copy_indexes     => 0
   ,copy_triggers    => TRUE
   ,copy_constraints => TRUE
   ,copy_privileges  => TRUE
   ,ignore_errors    => TRUE
   ,num_errors       => redefinition_errors
   ,copy_statistics  => FALSE
   ,copy_mvlog       => FALSE);
  IF (redefinition_errors > 0) THEN
    DBMS_OUTPUT.PUT_LINE('>>> AUDIT_COUNTER_RD to AUDIT_COUNTER failed: ' ||
    TO_CHAR(redefinition_errors));
  END IF;
  DBMS_REDEFINITION.FINISH_REDEF_TABLE (
    uname      => 'SOAINFRA'
   ,orig_table => 'AUDIT_COUNTER'
   ,int_table  => 'AUDIT_COUNTER_RD'
  );
END;
/
DROP TABLE SOAINFRA.AUDIT_COUNTER_RD;
ALTER INDEX SOAINFRA.AC_PK_RD RENAME TO AC_PK;
With this approach, the original table and its indexes are accessible during and after the redefinition process,
except during the small lock window while executing the finish_redef_table procedure. The manual index
renaming happens after the redefinition and can cause cursor invalidation, but renaming an index is a
metadata-only operation which is very light and completes in a sub-second period. If any cursor is invalidated,
the query is automatically retried without any noticeable downtime in the application.
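If the redefinition fails midway (for example, because of a space or privilege problem), the process can be cleaned up and retried from scratch using the standard abort procedure of the package. A hedged sketch, using the same schema and table names as the example above:

```sql
-- Abort a failed online redefinition so it can be restarted cleanly.
-- This releases the materialized view log created by START_REDEF_TABLE.
BEGIN
  DBMS_REDEFINITION.ABORT_REDEF_TABLE (
    uname      => 'SOAINFRA'
   ,orig_table => 'AUDIT_COUNTER'
   ,int_table  => 'AUDIT_COUNTER_RD'
  );
END;
/
```

After aborting, the interim table can be dropped or reused and the start_redef_table call issued again.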
Oracle FMW SOA Datasources’ Pool Capacity and Statement Cache Size
The Oracle FMW SOA Suite installation process using a RAC database as repository configures by default a
pool with an initial pool capacity of 0 and a maximum pool capacity of 50. These settings are intended to
minimize the start time for the SOA servers. These values have direct implications on the server’s failover
latency in server migration scenarios. They also affect other aspects of the failure behavior of a server. When a
server’s failover takes place, all incoming requests are routed to the available server causing all connections to
be routed through that server’s connection pool. If the failover takes place during a high load window (i.e.
when a higher number of connections are in use) it is likely that the pool’s maximum capacity will be reached in
the failover server and contention can take place. A possible solution is to increase the pool’s maximum size
above the required value (i.e. beyond what is needed by the system during load peaks) so that when failover
happens, there are enough connections in the failover server to serve requests. Since the pool is enlarged to
maximum capacity on a demand basis, this does not affect the server’s start/restart latency; hence, provided
that enough resources are available, the maximum capacity can be increased to values larger than 50.
Contrary to the pool’s maximum capacity, the pool’s initial capacity may have an adverse effect in failover
scenarios. This is mainly because the number of connections specified by the initial capacity parameter is
created upfront when the server starts. Connection creation is quite resource intensive and has an impact on
the system. The following diagram shows the restart latency for different initial pool sizes:
Image 23: Effect of increasing the initial pool size on the server's start time
As the diagram demonstrates, there is a clear increase in start times with larger initial pool sizes (note that the
pools reflected in the diagram are only SOALocalTxDataSource and SOADataSource; the effect
would be larger if the EDN, MDS and Orasdpm datasources were also changed). If the load on the system
remains relatively constant through the day, it is unlikely that connections will remain unused for long periods,
hence a large initial pool will not be required to eliminate bad response times on initial access. If, on the
contrary, the system is likely to leave connections unused for long periods of time, it will be necessary to
analyze whether the improvement in the response time of the initial requests to the server outweighs the
increased latency in the servers’ restarts.
A similar behavior takes place with the statement cache. The Oracle SOA database connection pool’s
statement cache is filled as queries are sent to the database and is adjusted based on a Least Recently Used
(LRU) algorithm. However, large statement caches and pools have a multiplying effect on the open cursors in
the database: if your SOA cluster includes 4 nodes, your maximum pool capacity is 200 and you set a statement
cache of 40, the system can end up with 4x200x40=32,000 open cursors in the database. If a SOA system
contains many different types of composites (some with Adapter instances, some with Mediator, etc.) it is
unlikely that the LRU stack in the cache will remain constant. Consider reducing the statement cache in
these cases. In a highly variable system, the LRU algorithm may be forcing continuous updates in the cache
with unnecessary replacements, causing additional overhead. Using the Oracle Database AWR reports, as
shown in the analysis for the index contention issues, determine the top statements and adjust the cache
accordingly. If many statements compete for the top positions, do not increase the cache. A bad statement
cache is worse than no cache at all.
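The multiplying effect can be observed directly by counting the cursors held against the SOA schema on each RAC instance (a hedged example; GV$OPEN_CURSOR requires the appropriate SELECT privileges and reflects cursors cached per session):

```sql
-- Count open cursors against the SOAINFRA schema on each RAC instance.
SELECT inst_id, COUNT(*) AS open_cursors
FROM   gv$open_cursor
WHERE  user_name = 'SOAINFRA'
GROUP  BY inst_id
ORDER  BY inst_id;
```

Comparing this count before and after a statement cache change gives a concrete measure of the reduction obtained.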
As a summary for connection pools and the statement cache for the SOA system:
• Increase the initial pool size only when it is strictly necessary to improve ramp-up response times. Keep in
mind that the server’s restart/failover latency increases with larger initial pools, and increasing the server
restart latency may cause contention in the servers that remain alive (i.e. the sooner the failed server
resumes processing, the better for the other servers that remained alive)
• When a non-zero initial capacity is used and one of the database instances is down while the SOA
server is being restarted, the server’s start latency is increased (the pool’s creation will keep retrying
during startup)
• Increase the pool’s maximum capacity to plan for scenarios where a WLS server may be down and a
single pool (in the server that remains alive) needs to sustain all the incoming requests. The remaining
server’s pool maximum capacity should be large enough to avoid contention.
• The statement cache has a considerable impact on resource consumption in the database. If your SOA
system contains many different types of composites, it may be harmful to increase the statement cache.
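These pool settings live in the WebLogic JDBC module descriptors of the SOA domain (typically files such as SOADataSource-jdbc.xml under the domain’s config/jdbc directory). The following is a hedged sketch of the relevant elements; the values shown are illustrative only, not recommendations, and must be sized from your own peak-load analysis:

```xml
<!-- Fragment of a WebLogic JDBC module descriptor (illustrative values) -->
<jdbc-connection-pool-params>
  <!-- Keep initial capacity low to avoid slow server restarts -->
  <initial-capacity>0</initial-capacity>
  <!-- Sized so one surviving server can absorb a failed server's load -->
  <max-capacity>100</max-capacity>
  <statement-cache-size>10</statement-cache-size>
  <statement-cache-type>LRU</statement-cache-type>
</jdbc-connection-pool-params>
```

The same changes can equally be made through the WebLogic Administration Console on the datasource’s Connection Pool page.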
Oracle FMW SOA Datasources’ Seconds to Trust an Idle Connection
To adjust properly the “Test Connections on Reserve”, “Test Frequency” and “Seconds to Trust an Idle
Pool Connection” values for the Oracle FMW SOA connection pools, it is necessary to understand what each of
these parameters implies. Connections in an Oracle FMW SOA server’s WLS datasource are tested when
different events occur:
• When they are created
• When they are handed to the SOA system for use
• At regular intervals determined by the “Test Frequency” parameter.
By default, the Oracle FMW SOA database connections are tested when they are created using the “SELECT 1
FROM DUAL” SQL statement. This happens when the Oracle FMW SOA servers are started. Additionally,
since Oracle FMW SOA sets the default “Test Frequency” for the database pools to 300 seconds, all those
connections that have not been handed to the SOA system are tested periodically with the same query every 5
minutes. Also, “Test Connections on Reserve” is enabled for SOA by default, which implies that connections
are verified before being handed to the SOA applications. If any of the tests fails, Oracle WebLogic Server
closes the connection, recreates it and tests the new connection before returning it to the pool.
Whether or not all these tests are required will depend on the usage pattern of the SOA system. Testing
connections can cause a delay in executing composites, but it makes sure that the connection is viable when
Oracle FMW SOA gets it; i.e. enabling “Test Connections on Reserve” and using short test frequencies makes
the system more reliable. Using “Seconds to Trust an Idle Pool Connection” can reduce the number of tests
(hence reducing overhead) but makes the system less reliable. This parameter determines the number of
seconds since the connection was last used during which WebLogic Server will trust it, skipping other
connection tests. “Seconds to Trust an Idle Pool Connection” is set to 0 by default for the Oracle FMW SOA
connection pools because this makes the system more resilient to failures in the database. This guarantees that
connections are always verified before being handed to the SOA applications. However, if this parameter is set
to a value other than 0, WebLogic Server will trust the connection and will skip the connection tests (the “Test
Frequency” and “Test Connections on Reserve” tests) if the SOA connection is requested within the specified
interval after the connection was last used.
If an Oracle FMW SOA system is expected to receive requests constantly or regularly, then most of the connections
will remain in frequent use and the “Seconds to Trust an Idle Pool Connection” parameter can be used to skip
the periodic tests determined by the “Test Frequency” and “Test Connections on Reserve” settings. If, on the
contrary, the Oracle FMW SOA system has irregular peaks of use and remains idle during long periods, it is
better to keep the parameter set to 0, because this will force the connection’s verification on every reserve and
on every frequency period.
Image 24: Connection usage pattern effect on the system's reliability
In a stress test, the load remains constant and most connections remain in use most of the time; hence many
tests are skipped and a non-negligible performance improvement is perceived. The following diagrams show
the AWR reports with the number of executions of the SELECT 1 FROM DUAL statement in each case and the
overhead caused by it.
Image 25: TOP SQL executions when Seconds to trust an Idle Connection is set to 0 (default for SOA)
Image 26: TOP SQL executions when Seconds to trust an Idle Connection is set to 50
However, this property has serious implications on the system’s reliability: if a failure happens in the database,
invalid connections may be handed to the SOA system and the exceptions will be detected deeper in the SOA
system’s code. The connection failover latency will also be higher, and for large pools with many connections
in an idle state, the chances of repeated failures are high until the pool is invalidated. Oracle recommends
weighing these considerations against the performance gains and, if strictly necessary, using short periods for
“Seconds to Trust an Idle Pool Connection” so that the “Test Frequency” and “Test Connections on Reserve”
verifications still take place periodically.
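In the JDBC module descriptor, the test settings discussed in this section map to elements of the connection pool parameters. A hedged sketch (values illustrative; the 0 shown for seconds-to-trust matches the SOA default described above):

```xml
<!-- Connection testing settings in a WebLogic JDBC module descriptor -->
<jdbc-connection-pool-params>
  <test-table-name>SQL SELECT 1 FROM DUAL</test-table-name>
  <test-connections-on-reserve>true</test-connections-on-reserve>
  <test-frequency-seconds>300</test-frequency-seconds>
  <!-- 0 (the SOA default) forces verification on every reserve; increase only
       after weighing the reliability trade-offs discussed in this section -->
  <seconds-to-trust-an-idle-pool-connection>0</seconds-to-trust-an-idle-pool-connection>
</jdbc-connection-pool-params>
```

Changing seconds-to-trust-an-idle-pool-connection to a small non-zero value (for example 10) preserves most of the test-skipping gains while keeping the verification window short.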
Summary
Oracle Fusion Middleware Service Oriented Architecture Suite makes intensive use of indexes in different
tables of the SOAINFRA database schema. Performance may be negatively impacted by contention in the
access to the blocks holding these indexes when Oracle Real Application Clusters is used for the database
holding the Oracle SOA schemas. It is necessary to analyze the objects causing the contention (which may
vary depending on the type of composite) and use global hash partitioning for the appropriate indexes. The
benefits of using global hash-partitioned indexes increase greatly when high concurrency takes place and
high-order indexes are used in the SOA database. Attention needs to be paid to the possible impact of
partitioning on time-range queries, and weighted decisions are required to balance regular instance
insertions against searches for specific instances. Finally, when adjusting the pools for the SOA datasources, the
required failover scenarios need to be accounted for, with special adjustments for the pool’s capacity as well
as the trust-an-idle-connection pool parameters.
References
1. Oracle Fusion Middleware Enterprise Deployment Guide for Oracle SOA Suite
http://download.oracle.com/docs/cd/E14571_01/core.1111/e12036/toc.htm
2. Oracle Fusion Middleware High Availability Guide
http://download.oracle.com/docs/cd/E14571_01/core.1111/e10106/toc.htm
3. Oracle Fusion Middleware Configuring and Managing JDBC Data Sources for Oracle WebLogic Server
http://download.oracle.com/docs/cd/E17904_01/web.1111/e13737/toc.htm
4. Oracle Real Application Clusters Administration and Deployment Guide
http://download.oracle.com/docs/cd/B28359_01/rac.111/b28254/toc.htm
Also, visit the MAA Web site on OTN for a list of Oracle Fusion Middleware high availability documentation and best practice white papers at:
http://www.oracle.com/technetwork/database/features/availability/fusion-middleware-maa-155387.html
Oracle FMW 11g R1 SOA High Availability
Assessment
November 2011
Author: Fermin Castro
Contributing Authors:
Oracle Corporation
World Headquarters
500 Oracle Parkway
Redwood Shores, CA 94065
U.S.A.
Copyright © 2009, Oracle and/or its affiliates. All rights reserved. This document is provided for information purposes only and
the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other
warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or
fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are
formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any
means, electronic or mechanical, for any purpose, without our prior written permission.
Worldwide Inquiries:
Phone: +1.650.506.7000
Fax: +1.650.506.7200
oracle.com
Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective
owners.
0109