Updates

    Oracle library cache tuning – A Real Case Study.

    Library cache tuning – A Real Case Study.

    Scenario :
    Users started complaining about very poor application performance, hung and incomplete transactions. Some severity one ticket for immediate action.
    Environment :
    Oracle 10g (10.2.0.4) Two node RAC, About 15 different Java and .net legacy applications.
    Highly transactional.

    11 Gig Shared pool
    16 Gig DB cache

    Availability : 24*7, Outage unacceptable to business
    I made some observations, thought it would be a nice idea to share them with you all.
    Here are the citation from AWR report(1 hr.) during the problem period.
    Snap Id Snap Time Sessions Cursors/Session
    Begin Snap: 39718 07-Jun-11 05:00:13 438 66.3
    End Snap: 39719 07-Jun-11 06:00:19 651 64.4
    Elapsed:   60.09 (mins)
    DB Time:   490.75 (mins)

    • Very poor response time can be visible from the differences between db time and elapsed time – No wonder why users were complaining.
    Load Profile :
    Redo size: 698,307.96 9,249.30
    Logical reads: 187,227.45 2,479.88
    Block changes: 4,078.58 54.02
    Physical reads: 29.63 0.39
    Physical writes: 79.30 1.05
    User calls: 2,638.29 34.94
    Parses: 1200.23 5.81
    Hard parses: 52.84 0.70
    Sorts: 230.95 3.06
    Logons: 0.91 0.01
    Executes: 601.64 7.97
    Transactions: 75.50

    • Parse count is just half of user calls, so every two user call there is a parse request.Dammn bad, this going to be a huge CPU consumer.
    • Hard parses per second is 52.84, if I said very bad for parse count, I should say very very very bad for hard parse – this is very much resource intensive. Unacceptable !!
    Instance Efficiency :
    Buffer Nowait %: 100.00 Redo NoWait %: 100.00
    Buffer Hit %: 99.99 In-memory Sort %: 100.00
    Library Hit %: 91.75 Soft Parse %: 87.95
    Execute to Parse %: 27.13 Latch Hit %: 99.97
    Parse CPU to Parse Elapsd %: 2.17 % Non-Parse CPU: 97.35

    • Low execute to parse ratio denotes CPU is significantly busy in parsing. Soft Parse% showing, most of the parse are soft parses. It means we should focus on soft parsing activity.
    • Parse CPU to Parse Elapsed % is quite low, which indicate  some bottleneck is there related to parsing. It could be a side-effect of huge parsing pressure. Like CPU cycles are not available.
    • Library Hit % is very poor, it should be atleast above 95% .
    Top 5 Timed Events :
    latch: library cache 6,470 16,152 2,496 54.9 Concurrency
    CPU time   11,675   39.7
    log file sync 193,236 1,109 6 3.8 Commit
    log file parallel write 224,479 795 4 2.7 System I/O
    db file sequential read 38,110 386 10 1.3 User I/O

    • High wait on latch: library cache . So very clear that slow performance is due to the contention in Library cache.
    • Also looked at SQL ordered by Parse Calls and SQL ordered by Version Count. That pin point the issue at library chache.
    Whats next ? ?
    Above observations indicate that there is some serious problem to share the sqls. It could be a result of bad application code wihout bind variables . Each calls are going for a parse – either soft or hard and no caching. The obvious advantage to caching cursors by session is reduced parse times, which can leads to faster overall execution times.
    Caching Caching .. Lets look at the Oracle caching parameters now. The value was
    cursor_sharing= similar
    open_cursors= 3000
    session_cached_cursors= 100
    Two things I played around was cursor_sharing and session_cached_cursors. I can ignore open_cursors for time being as it is already high for my environment.
    I have chosen to use SESSION_CACHED_CURSORS to help out in this scenario as an application that is continually closing and reopening cursors, you can monitor its effectiveness via two more statistics in v$sesstat. The statistic “session cursor cache hits” reflects the number of times that a statement the session sent for parsing was found in the session cursor cache, meaning it didn’t have to be reparsed and your session didn’t have to search through the library cache for it. You can compare this to the statistic “parse count (total)”; subtract “session cursor cache hits” from “parse count (total)” to see the number of parses that actually occurred.
    v$sesstat also provides a statistic to monitor the number of cursors each session has in its session cursor cache.
    –session cached cursors, by session
    select a.value, s.username, s.sid, s.serial#
    from v$sesstat a, v$statname b, v$session s
    where a.statistic# = b.statistic#  and s.sid=a.sid
    and b.name = ‘session cursor cache count’ ;

    –session cached cursor usage.
    select a.value curr_cached, p.value max_cached, s.username, s.sid, s.serial#
    from v$sesstat a, v$statname b, v$session s, v$parameter2 p
    where a.statistic# = b.statistic#  and s.sid=a.sid
    and p.name=’session_cached_cursors’
    and b.name = ‘session cursor cache count

    Above query gave a result that CUR_CACHED=MAX_CACHED . If the session cursor cache count is maxed out, session_cursor_cache_hits is low compared to all parses, and you suspect that the application is re-submitting the same queries for parsing repeatedly, then increasing SESSION_CURSOR_CACHE_COUNT will help with latch contention and can give boost to performance.
    One fix is ready now i.e increase SESSION_CACHED_CURSOR from 100 to 300 (Decided to put 300 first and  monitor the session cursor caching).
    We had a high hard parse rate as well – So the second parameters to play around was CURSOR_SHARING. For poorly coded application Oracle highly recomend to keep CURSOR_SHARING=FORCE . Also there are few bugs reported with CURSOR_SHARING=SIMILAR option as well as it is Deprecated from 11g onwards. [ANNOUNCEMENT: Deprecating the cursor_sharing = ‘SIMILAR’ setting [MOS Note ID 1169017.1]]
    Conclusion :
    Changed initialization parameters for session cached cursors and cursor sharing hence reduced library cache contention by avoiding parse calls and releasing cpu cycle.
    I have changed to
    CURSOR_SHARING=FORCE
    SESSION_CACHED_CURSOR=300

    Certaily you can dynamicaly do CURSOR_SHARING=FORCE
    ALTER SYSTEM SET CURSOR_SHARING=FORCE SCOPE=BOTH;
    But SESSION_CACHED_CURSOR is not dynamic and you need a bounce to reflects this. I was in trouble for this because it is a zero tolerence system and outage should be planned a month before. I cannot wait until that to tune this perf issue. Again there is anotehr back door method to implement SESSION_CACHED_CURSOR to get effect, but that’s a topic for another blog – you can see it in another post here ..
    I have mentioned at the very start cursor parameters that can introduce  surprises. Yes, very true. After these two changes all the library cache contention was vanished.
    Hard parses reduced from 58 per sec to 1 or 0 per sec. Library hit ratio become 98% and soft parse ration become 99%  and response time become jet fast.

    1 comments:

    Seshu said...

    I guess you set this at the session level by creating a triger to fix this issue "SESSION_CACHED_CURSOR=300"

    Post a Comment

    Copyright © ORACLE ONLINE DBA
    Developed By Pavan Yennampelli