Who Ate My DB Time? Advanced Techniques For Database
Performance Analysis
Prabhaker Gongloor (GP) & Kurt Engeleiter
Product Management, Database Manageability, Oracle Corporation
Copyright © 2011, Oracle and/or its affiliates. All rights
reserved.
The following is intended to outline our general product direction.
It is intended for information purposes only, and may not be
incorporated into any contract. It is not a commitment to deliver
any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, and
timing of any features or functionality described for Oracle’s
products remains at the sole discretion of Oracle.
Agenda
• What is DB Time?
• Performance Tuning with DB Time
– Online Analysis
• Current Features
• New Features (Enterprise Manager 12)
– Offline Analysis
• Customer Experiences: Boeing Inc.
DB Time is Pervasive, Familiar, Evolutionary
• DB Time is the fundamental measure of Oracle performance
– Arrived at progressively but designed-in from 10g
• You are likely using DB Time methods already
– AWR and ASH reports
– Enterprise Manager
• DB Time is performance analysis science matured
– Wait event analysis, YAPP and tracing are all its ancestors
Why Do We Care About Time?
• Human time is critical to the enterprise
• Systems performance affects business goals
– Human time + technology resource time
• “Time is money”
• Performance improvement means doing things faster
Performance is always and only about time
DB Time Definition
• Total time in database calls by foreground sessions
• Includes
– CPU time
– IO time
– non-idle wait time
• DB Time is not called response time on purpose
• Common currency for Oracle performance analysis
[Figure: DB Time broken down into CPU, I/O, and Wait]
DB Time: Quiz
Question: On a four CPU system, for one hour
elapsed time, what is the maximum amount of CPU
time that can be accumulated?
Answer: Four hours of CPU time maximum.
DB Time: Quiz
Question: On a four CPU system, for one hour elapsed time, what
is the maximum amount of DB Time that can be accumulated?
Hint: How many sessions can wait in a queue?
Answer: Number of ‘sessions’ * elapsed time
The maximum number of sessions that are on CPU or in a non-idle
wait is the init.ora parameter ‘sessions’. If ‘sessions’ is 300 then:
300 * 1 hour = 300 hours of DB Time
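The two quiz answers are simple upper bounds. The sketch below restates them as arithmetic; it is illustrative Python, not Oracle code, and the function names are made up for this example.

```python
def max_cpu_time_hours(cpu_count: int, elapsed_hours: float) -> float:
    """Upper bound on CPU time: every CPU is busy for the whole interval."""
    return cpu_count * elapsed_hours


def max_db_time_hours(sessions: int, elapsed_hours: float) -> float:
    """Upper bound on DB Time: every allowed session is on CPU or in a
    non-idle wait for the whole interval."""
    return sessions * elapsed_hours


# Values taken from the quiz slides: 4 CPUs, 'sessions' = 300, 1 hour elapsed.
print(max_cpu_time_hours(4, 1.0))   # 4.0
print(max_db_time_hours(300, 1.0))  # 300.0
```

DB Time can exceed CPU time by a wide margin because waiting sessions accumulate DB Time without consuming CPU.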
DB Time on the Performance Page
• Area under curve is equal to DB time
• DB Time displayed as CPU, I/O or non-idle wait class
• Wait class breakdown indicates solution space
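The "area under the curve" idea can be sketched as a rectangle sum over an active-sessions chart. The sample values, the 60-second interval, and the function name below are assumptions for illustration only; they are not read from Enterprise Manager.

```python
from collections import defaultdict


def db_time_by_class(samples, interval_seconds):
    """Sum (active sessions x interval) per wait class.

    Each sample maps a wait class to the average number of active
    sessions in that class during one chart interval.
    """
    totals = defaultdict(float)
    for sample in samples:
        for wait_class, active_sessions in sample.items():
            totals[wait_class] += active_sessions * interval_seconds
    return dict(totals)


# Hypothetical chart data: three one-minute intervals.
samples = [
    {"CPU": 1.0, "User I/O": 1.0},
    {"CPU": 2.0, "User I/O": 1.5},
    {"CPU": 1.0, "User I/O": 3.0, "Concurrency": 0.5},
]

by_class = db_time_by_class(samples, 60)
total_db_time = sum(by_class.values())

print(by_class)        # DB Time in seconds per wait class
print(total_db_time)   # 600.0 seconds of DB Time in 180 seconds elapsed
```

Dividing the total by the elapsed time gives the average number of active sessions, which is the height of the curve averaged over the window.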
DB Time on the Top Activity Page
• Plots DB Time from ASH data
• Shows top DB Time consumers by SQL and Session
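A rough sketch of how top consumers can be derived from ASH-style data. It assumes each sampled row stands for about one second of DB Time (in-memory ASH samples active sessions once per second); the row layout and function name are hypothetical, not the real ASH view definition.

```python
from collections import Counter


def top_consumers(ash_rows, key, seconds_per_sample=1, limit=5):
    """Rank values of `key` by estimated DB Time (sample count x seconds)."""
    counts = Counter(row[key] for row in ash_rows)
    return [(value, n * seconds_per_sample)
            for value, n in counts.most_common(limit)]


# Hypothetical samples: one row per active session per sample time.
rows = [
    {"session_id": 101, "sql_id": "a1"},
    {"session_id": 101, "sql_id": "a1"},
    {"session_id": 102, "sql_id": "a1"},
    {"session_id": 101, "sql_id": "b2"},
    {"session_id": 102, "sql_id": "b2"},
    {"session_id": 101, "sql_id": "c3"},
]

print(top_consumers(rows, "sql_id"))      # [('a1', 3), ('b2', 2), ('c3', 1)]
print(top_consumers(rows, "session_id"))  # [(101, 4), (102, 2)]
```

The same counting works for any sampled dimension, which is why one data source can drive both the Top SQL and the Top Sessions lists.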
DB Time in AWR Report
• If you tune by Wait Events you are tuning by DB Time
• Top 5 Timed Events in the AWR report shows CPU time and non-idle
waits for foregrounds
Automatic Database Diagnostic Monitor
[Diagram labels: Automatic Workload Repository Snapshots; Self-Diagnostic Engine; High Load SQL; IO / CPU Issues; RAC Issues; Performance & Management Advisors]
[Side banner: Boost Administrator Productivity; Eliminate Performance Bottlenecks; Lower IT Management Costs; Deliver Higher Quality of Service]
• DB Time data collected in Automatic Workload Repository snapshots
• Immediate DB Time analysis done by Automatic Database Diagnostic Monitor
• Resolve performance issues faster with drill-down root-cause analysis
• Classification tree based on Oracle performance tuning expertise
• Performance expert; now a RAC specialist too in Oracle Database 11g
Compare Period ADDM
Why Compare Period ADDM
• Existing ‘Compare Period Report’ compares
across two AWR time ranges
• Displays a large number of metrics
• Organized by topic – interpreting each section
requires a distinct and significant base of
knowledge (a lot to learn)
• Challenging to separate the insignificant from
the critical (no common denominator)
• Focused on the WHAT and the HOW, omits the
WHY
Compare Period ADDM
[Diagram labels: AWR Snapshot Period 1; AWR Snapshot Period 2; Compare Period ADDM; Analysis Report; SQL Commonality; Regressed SQL; I/O Bound; Undersized SGA]
• Analyzes performance across two AWR time ranges
• Executes full ADDM analysis and makes actionable recommendations along with impact
• Identifies workload changes, configuration changes, regressed SQL, resource limits
• Correlates changes to root causes – encapsulates years of expert Oracle experience
Compare Period ADDM: Method
Identify what changed
• Configuration changes, workload changes
– Example: 30% smaller Buffer cache
– Example: 10% new SQL
Quantify performance differences
• Uses DB Time as basis for measuring performance
– Example: Top SQL increased 45%
– Example: Read I/O up 55%
Identify root cause
• Correlate performance differences with changes
– Example: Buffer cache reduction caused read I/O increase
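The "quantify" step can be sketched as a comparison of DB Time rates between two periods. This is only an illustration of using DB Time as the common yardstick, not how Compare Period ADDM is implemented; the dictionary layout, function names, and figures are hypothetical.

```python
def pct_change(before, after):
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


def compare_periods(p1, p2):
    """Compare two periods using DB Time per elapsed second."""
    # Normalize by elapsed time so periods of different length are comparable.
    rate1 = p1["db_time_s"] / p1["elapsed_s"]
    rate2 = p2["db_time_s"] / p2["elapsed_s"]

    regressed = {}
    for sql_id, t1 in p1["sql_db_time_s"].items():
        t2 = p2["sql_db_time_s"].get(sql_id)
        if t2 is None:
            continue  # SQL absent from period 2: a workload change, not a regression
        r1, r2 = t1 / p1["elapsed_s"], t2 / p2["elapsed_s"]
        if r1 > 0 and r2 > r1:
            regressed[sql_id] = round(pct_change(r1, r2), 1)

    new_sql = sorted(set(p2["sql_db_time_s"]) - set(p1["sql_db_time_s"]))
    return {
        "db_time_rate_change_pct": round(pct_change(rate1, rate2), 1),
        "regressed_sql": regressed,
        "new_sql": new_sql,
    }


# Hypothetical one-hour periods.
period1 = {"elapsed_s": 3600, "db_time_s": 7200,
           "sql_db_time_s": {"a1": 2000, "b2": 1000}}
period2 = {"elapsed_s": 3600, "db_time_s": 10800,
           "sql_db_time_s": {"a1": 2900, "b2": 1000, "c3": 500}}

result = compare_periods(period1, period2)
print(result)
```

Here the overall DB Time rate rises by 50%, statement a1 consumes 45% more DB Time, and c3 shows up as new SQL. Each of those is a candidate "what changed" to correlate with a root cause.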
Correlation Rules Examples
• Symptom: Paging
– Cause: Physical memory size
– Cause: SGA/PGA Target
• Symptom: Hard/Soft Parse Time
– Cause: Cursor sharing
– Cause: Shared Pool size
– Cause: Session cached cursors
• Symptom: SQL Regressions
– Cause: Optimizer parameters
– Cause: SQL parameters
• Symptom: Temp I/O
– Cause: PGA Aggregate Target
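The symptom-to-cause rules above can be pictured as a lookup table that is intersected with whatever actually changed between the two periods. This is a minimal sketch of that idea, not ADDM's rule engine; the function name and the sample inputs are assumptions.

```python
# Rules copied from the slide: symptom -> settings worth checking.
CORRELATION_RULES = {
    "Paging": ["Physical memory size", "SGA/PGA Target"],
    "Hard/Soft Parse Time": ["Cursor sharing", "Shared Pool size",
                             "Session cached cursors"],
    "SQL Regressions": ["Optimizer parameters", "SQL parameters"],
    "Temp I/O": ["PGA Aggregate Target"],
}


def candidate_causes(symptoms, changed_settings):
    """For each observed symptom, keep only rule causes that changed."""
    return {
        symptom: [cause for cause in CORRELATION_RULES.get(symptom, [])
                  if cause in changed_settings]
        for symptom in symptoms
    }


# Hypothetical comparison: two symptoms observed, two settings changed.
found = candidate_causes(
    ["Paging", "Temp I/O"],
    {"SGA/PGA Target", "Cursor sharing"},
)
print(found)  # {'Paging': ['SGA/PGA Target'], 'Temp I/O': []}
```

An empty list for a symptom means none of the listed settings changed, so the cause has to be sought elsewhere, for example in a workload change.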
Compare Period ADDM: Report
• SQL commonality offers a sanity test – are the workloads comparable?
• Not impacted by SQLs with low DB time
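The slides do not say how SQL commonality is calculated. The sketch below assumes one plausible definition, the DB-Time-weighted share of each period spent in SQL that appears in both periods, averaged across the two periods. The formula, function name, and figures are assumptions for illustration only.

```python
def sql_commonality(p1, p2):
    """Assumed metric: average share of DB Time spent in SQL common to both periods.

    p1 and p2 map sql_id -> DB Time in seconds for each period.
    """
    common = p1.keys() & p2.keys()
    share1 = sum(p1[s] for s in common) / sum(p1.values())
    share2 = sum(p2[s] for s in common) / sum(p2.values())
    return round((share1 + share2) / 2 * 100, 1)


# Hypothetical periods: the heavy statements match, the light ones differ.
period1 = {"a1": 900, "b2": 90, "x9": 10}
period2 = {"a1": 800, "b2": 150, "y7": 50}

print(sql_commonality(period1, period2))  # 97.0
```

Weighting by DB Time is what keeps low-DB-Time statements such as x9 and y7 from moving the result much, which matches the second bullet above.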
Compare Period ADDM Examples
Real-Time ADDM
Real-Time ADDM - Motivation
• Extremely slow or hung databases
– Cannot connect to or diagnose unresponsive databases
– Current remedy: bounce the database
  • All diagnostic information is lost
  • Problem can occur again since root causes not identified
– Need quick, lightweight analysis mechanism
  • No I/O or global resources such as enqueues or latches
• Real-time analysis
– System under performance stress
– AWR snapshot may exacerbate problem
– Normal ADDM time interval may be too long to diagnose short spikes
Real-Time ADDM – Architecture
[Diagram labels: Enterprise Manager; EM Agent; Diagnostic Connection (Unresponsive DB: Hangs, Deadlocks, Latches); JDBC Connection (Real time analysis); Database; ADDM Analysis]
• Uses a pre-established diagnostic connection for unresponsive systems
• Initiates a standard JDBC connection for real-time analysis
• Diagnostic connection collects data without holding latches or running SQL
• First intelligent advisor to diagnose problems in real-time as they occur, no matter how sick the system is
Top Issues Identified by Real-Time ADDM
Resource Constraints
• CPU Bound Hosts
• Memory Paging
• I/O Bound
• Interconnect Bound
Memory Issues
• Library Cache Issues
• Memory Allocation Issues
• Excessive PGA growth
Hangs
• Instance Shutting Down
• Top Blocker Analysis
• Memory Pool Flushing
• ASM Unresponsive
Resource Limits Reached
• Sessions
• Processes
Deadlocks
Real-Time ADDM – Data Collected
JDBC Collection
• Database and Instance Metadata
• Instance Metrics
• ASH Data
• Top SQL from ASH
• Top Sessions from ASH
• Database Initialization Parameters
• Host Hardware Information
• Instance and Host Memory Usage
• Host Metrics
Diagnostic Collection
• Hang analysis data
• Session activity by wait class
• I/O metrics
• Host metrics
Real-Time ADDM
• Real-Time ADDM screen shows database activity over the last hour
• Analysis done for most recent five minutes
• Click ‘Start’ button to initiate analysis
Real-Time ADDM
• EM communicates with RDBMS in a tight loop
• Analysis runs in real-time and updates as each UI refresh brings more data
• Progress tab shows present and upcoming collections, gives an insight into the
analysis yet to be performed
• Completes quickly, but no need to wait for the final results – finding set is
cumulative
• Data supporting the analysis is presented alongside the findings
Real-Time ADDM Examples
ASH Analytics
[Screenshot callouts: Flexible Time Picker; Flexible Activity Chart; Flexible Top Chart]
Offline Analysis
Real Application Testing
• Can be used to reproduce, analyze and resolve production problems in test
• Provides comprehensive offline testing and DB Time analysis
• Allows you to thoroughly investigate a large set of performance options before deploying to production
• Enables root cause analysis of production problems in test
Real Application Testing – Two Components
End-to-end testing with real workloads
[Diagram labels: Create Test System; Capture Workload; Deploy Replay Clients; Replay Workload]
• SQL Performance Analyzer
– SQL unit testing for response time
– Identify and tune regressed SQL
– Integrated into SQL Tuning Advisor and SQL Plan Baseline
• Database Replay
– Load, performance testing for throughput
– Remediate application concurrency problems
– Identify operations using the most DB Time
Real Application Testing Database Replay
• Database load and performance testing with real production workloads
– Production workload characteristics such as timing, transaction dependency, think time, etc., fully maintained
• Test and measure transaction throughput improvements
• Identify application scalability and concurrency problems
• Remediate issues pre-production for risk-free migration
[Diagram labels: Production (Clients, Storage); Test (Replay Driver, Storage); Capture; Process; Replay; Analysis & Reporting]
Offline Analysis
Database Replay
• The most common use of Database Replay has been as a testing tool for assessing impact of system changes
• It can also be used for offline troubleshooting in situations where AWR/ASH data alone may not suffice
• It enables faster root cause analysis through comprehensive testing and DB Time analysis
Offline Analysis
Database Replay Offline Examples
• Reproducing hang situations in test - e.g., when a batch job or certain pattern of events is known to cause a hang
• Reproducing significant problems observed in production on test - e.g., RAC node eviction, cache brown out, fail-over, ORA-600, ORA-7445 issues
• Performing destructive testing with real workload that may not be possible in production - e.g., hot swapping disks, memory boards, etc.
• Enabling tracing with production workload in test rather than in production because of the high performance overhead of tracing. Tracing can sometimes be necessary to provide support diagnostics - e.g., setting diagnostic events
Offline Analysis – ORA-600 Example
• An ORA-600 was occurring during a production batch process
• Rather than troubleshoot on the production system, Database Replay was used to capture and replay the batch process in test
• The ORA-600 reproduced in test
Offline Analysis – ORA-600 Example
• EM Support Workbench Guided Workflow used for the ORA-600 alert
• SQL Repair Advisor was run using the Guided Resolution Workflow
• SQL Patch was applied as recommended by SQL Repair Advisor
Offline Analysis – ORA-600 Example
• Batch job replayed with SQL Patch implemented
• ORA-600 did not reproduce – patch verified for production
Database Replay
A Powerful Oracle Performance and Load Test Tool
First-hand experience by Boeing
Andre Scharkowski
Introduction
Engineering, Operations & Technology | Information Technology
Architecture & Information Management
• About myself
– Oracle Database Administration
– Performance & Scalability Testing
• About the Boeing Company
– 200+ DBAs
– Oracle, SQL Server, DB2, Teradata
• About my work group - DCAC/MRM
– 16 Oracle DBAs
– Applications for the design, configuration & manufacturing of airplanes
– 30 production and about 400 non-production databases
Copyright © 2008 Boeing. All rights reserved.
Introduction (cont.)
• Database sizes & load
– From 100s of MB to 3 TB
– 20K sessions, 15K active
• Architecture and equipment
– Oracle 11.2.0.2 on AIX 6.1, IBM pSeries (Power 6)
– RAC and HACMP / BCV
• Oracle features in use
– RAC, DataGuard, EBR (Edition-Based Redefinition)
– DB-links, Materialized Views
– Triggers, PL/SQL, Synonyms, Complex Views & Grants
– EM-grid, OCM, rman
– Partitioning, some Parallel Query
Why is Database Replay such a big deal?
• Tool is able to reproduce true production load in all its variety and complexity
• DBA has ability to test database changes without having to involve other, non-DBA
groups
• Test tool of choice for Oracle patches, Oracle upgrades, parameter changes,
architecture and hardware changes
• Tool has the potential of saving large amounts of resources normally spent on test script
development
• Tool can be used to generate a background database load that can supplement load
generated by traditional test tools (save licenses)
• Tool can reduce the size and complexity of test environments -- no need for application
servers, web servers, client machines
Deployment status of Database Replay at Boeing
• Performed about 50 production captures and over 100 replays
• Replays helped validate 10.2.0.4 to 11.2.0.1 upgrade
• Using the tool to validate new EMC V-Max tiered storage
• Plan on making Database Replay part of regular test cycle
– issue: few outages, so many types of changes happen at once
• Plan on working with Oracle to improve the product
– timeout feature
– explore Oracle’s Application Testing Suite (ATS)
– make use of Workload Analyzer
capture in progress
Active Session View in OEM during Replay
[Screenshot callouts: Start of replay; Contention; Replay Finished]
Conclusion
• Exciting performance and load test tool
– true production load at very low overhead
– rapid testing possible
– simple setup
– resource savings (HW, scripting)
Acknowledgement
Many Thanks To:
Ashish Agrawal (Oracle)
Prabhaker Gongloor (Oracle)
Yuri Grinshteyn (Oracle)
Oracle Support
The DCAC/MRM DBA team at Boeing
Q&A