1g 之前, 通常我们数据库 hang 住了之后, 我们会对数据库做 hang analyze 来进行分析, 在 11g 之后, 我们可以通过一个新的视图 v$wait_chains 来诊断数据库 hang 和 contention 在 11gR1 这个版本里面, Oracle 通过 diag 进程实现了一个功能, 每隔 3 秒做一次本地的 hang analyze, 每隔 10 秒做一次 global 的 hang analyze 而这些信息会存放在内存里面, Oracle 把这一块内存称作 hang analysis cache 而这一部分内存信息, 对我们数据库诊断 hang 和 contention 起着非常重要的作用而数据库还有一些特性及工具也需要使用这块内存区域比如 Hang Management, Resource Manager Idle Blocker Kill, SQL Tune Hang Avoidance 和 pmon 清除, 还有一些外部工具如 Procwatcher
我们看一下 v$wait_chains 视图的定义以 11gR2 为例
- SQL> desc v$wait_chains
- Name Null? Type
- ----------------------------------------- -------- ----------------------------
- CHAIN_ID NUMBER
- CHAIN_IS_CYCLE VARCHAR2(5)
- CHAIN_SIGNATURE VARCHAR2(801)
- CHAIN_SIGNATURE_HASH NUMBER
- INSTANCE NUMBER
- OSID VARCHAR2(25)
- PID NUMBER
- SID NUMBER
- SESS_SERIAL# NUMBER
- BLOCKER_IS_VALID VARCHAR2(5)
- BLOCKER_INSTANCE NUMBER
- BLOCKER_OSID VARCHAR2(25)
- BLOCKER_PID NUMBER
- BLOCKER_SID NUMBER
- BLOCKER_SESS_SERIAL# NUMBER
- BLOCKER_CHAIN_ID NUMBER
- IN_WAIT VARCHAR2(5)
- TIME_SINCE_LAST_WAIT_SECS NUMBER
- WAIT_ID NUMBER
- WAIT_EVENT NUMBER
- WAIT_EVENT_TEXT VARCHAR2(64)
- P1 NUMBER
- P1_TEXT VARCHAR2(64)
- P2 NUMBER
- P2_TEXT VARCHAR2(64)
- P3 NUMBER
- P3_TEXT VARCHAR2(64)
- IN_WAIT_SECS NUMBER
- TIME_REMAINING_SECS NUMBER
- NUM_WAITERS NUMBER
- ROW_WAIT_OBJ# NUMBER
- ROW_WAIT_FILE# NUMBER
- ROW_WAIT_BLOCK# NUMBER
- ROW_WAIT_ROW# NUMBER
继续查询该视图的定义可以发现该数据来自于基表 x$ksdhng_chains 因为前面介绍过进程会 10 秒做一次 global 的 hang, 所以这个视图是包含了全局的信息的虽然它是 v$ 开头的
- SQL> select * from V$FIXED_VIEW_DEFINITION where view_name like %WAIT_CHAINS%;
- VIEW_NAME VIEW_DEFINITION
- -------------------- ----------------------------------------------------------------------------------------------------------------------------------
- V$WAIT_CHAINS select s.chain_id, decode(s.chain_is_cycle, 0,FALSE,TRUE), s.chain_signature, s.chain_signature_hash, s.instance, s.osid,
- s.pid, s.sid, s.sess_serial#, decode(s.blocker_is_valid, 0,FALSE,TRUE), decode(s.blocker_is_valid, 0, to_number(null), s.
- blocker_instance), s.blocker_osid, decode(s.blocker_is_valid, 0, to_number(null), s.blocker_pid), decode(s.blocker_is_valid, 0,
- to_number(null), s.blocker_sid), decode(s.blocker_is_valid, 0, to_number(null), s.blocker_sess_serial#), decode(s.blocker_chain
- _id, 0, to_number(null), s.blocker_chain_id), decode(s.in_wait, 0,FALSE,TRUE), decode(s.in_wait, 0, s.time_since_last_wait
- _secs, to_number(null)), decode(s.in_wait, 0, to_number(null), s.wait_id), decode(s.in_wait, 0, to_number(null), s.wait_event),
- s.wait_event_text, decode(s.in_wait, 0, to_number(null), s.p1), s.p1_text, decode(s.in_wait, 0, to_number(null), s.p2), s.p2
- _text, decode(s.in_wait, 0, to_number(null), s.p3), s.p3_text, decode(s.in_wait, 0, to_number(null), s.in_wait_secs), decode(
- s.in_wait, 0, to_number(null), s.time_remaining_secs), s.num_waiters, decode(s.in_wait, 0, to_number(null), s.row_wait_obj#),
- decode(s.in_wait, 0, to_number(null), s.row_wait_file#), decode(s.in_wait, 0, to_number(null), s.row_wait_block#), decode(s.in_w
- ait, 0, to_number(null), s.row_wait_row#) from X$KSDHNG_CHAINS s
Oracle 在 mos 上提供了一些脚本来做一些信息诊断一种是普通版本的都可以使用的, 还有一种是 11gR2 专用的因为在 11gR2 的 v$session 视图中有一个字段叫 final_blocking_session, 这个字段能够去查看最上层的阻塞者最终的 blocker 一般都处于 wait_chain 的顶端这样的 session 才会引起问题我们先来看看普通的查询. 首先随便制造两个 session 共同更新一行的情况
- SQL> SELECT chain_id, num_waiters, in_wait_secs, osid, blocker_osid, substr(wait_event_text,1,30)
- FROM v$wait_chains;
- CHAIN_ID NUM_WAITERS IN_WAIT_SECS OSID BLOCKER_OSID SUBSTR(WAIT_EVENT_TEXT,1,30)
- ---------- ----------- ------------ ------------------------- ------------------------- ------------------------------------------------------------
- 8 31377 31447 enq: TX - row lock contention
- 15 31447 SQL*Net message from client
通用的查询:
接下来在执行下一个基础的格式化后的脚本.
- set pages 1000
- set lines 120
- set heading off
- column w_proc format a50 tru
- column instance format a20 tru
- column inst format a28 tru
- column wait_event format a50 tru
- column p1 format a16 tru
- column p2 format a16 tru
- column p3 format a15 tru
- column Seconds format a50 tru
- column sincelw format a50 tru
- column blocker_proc format a50 tru
- column waiters format a50 tru
- column chain_signature format a100 wra
- column blocker_chain format a100 wra
- SELECT *
- FROM (SELECT Current Process: ||osid W_PROC, SID ||i.instance_name INSTANCE,
- INST #: ||instance INST,Blocking Process: ||decode(blocker_osid,null,,blocker_osid)||
- from Instance ||blocker_instance BLOCKER_PROC,Number of waiters: ||num_waiters waiters,
- Wait Event: ||wait_event_text wait_event, P1: ||p1 p1, P2: ||p2 p2, P3: ||p3 p3,
- Seconds in Wait: ||in_wait_secs Seconds, Seconds Since Last Wait: ||time_since_last_wait_secs sincelw,
- Wait Chain: ||chain_id ||: ||chain_signature chain_signature,Blocking Wait Chain: ||decode(blocker_chain_id,null,
- ,blocker_chain_id) blocker_chain
- FROM v$wait_chains wc,
- v$instance i
- WHERE wc.instance = i.instance_number (+)
- AND ( num_waiters> 0
- OR ( blocker_osid IS NOT NULL
- AND in_wait_secs> 10 ) )
- ORDER BY chain_id,
- num_waiters DESC)
- WHERE ROWNUM <101;
最终结果如下图所示, 我们能够清楚的看到进程 31447 阻塞了进程 31377 进程 31377 在等待 enq: TX row lock contention
基于 11gR2 的查询
使用 final_blocking_session 字段, 能查到最上端的阻塞进程
- set pages 1000
- set lines 120
- set heading off
- column w_proc format a50 tru
- column instance format a20 tru
- column inst format a28 tru
- column wait_event format a50 tru
- column p1 format a16 tru
- column p2 format a16 tru
- column p3 format a15 tru
- column Seconds format a50 tru
- column sincelw format a50 tru
- column blocker_proc format a50 tru
- column fblocker_proc format a50 tru
- column waiters format a50 tru
- column chain_signature format a100 wra
- column blocker_chain format a100 wra
- SELECT *
- FROM (SELECT Current Process: ||osid W_PROC, SID ||i.instance_name INSTANCE,
- INST #: ||instance INST,Blocking Process: ||decode(blocker_osid,null,,blocker_osid)||
- from Instance ||blocker_instance BLOCKER_PROC,
- Number of waiters: ||num_waiters waiters,
- Final Blocking Process: ||decode(p.spid,null,,
- p.spid)|| from Instance ||s.final_blocking_instance FBLOCKER_PROC,
- Program: ||p.program image,
- Wait Event: ||wait_event_text wait_event, P1: ||wc.p1 p1, P2: ||wc.p2 p2, P3: ||wc.p3 p3,
- Seconds in Wait: ||in_wait_secs Seconds, Seconds Since Last Wait: ||time_since_last_wait_secs sincelw,
- Wait Chain: ||chain_id ||: ||chain_signature chain_signature,Blocking Wait Chain: ||decode(blocker_chain_id,null,
- ,blocker_chain_id) blocker_chain
- FROM v$wait_chains wc,
- gv$session s,
- gv$session bs,
- gv$instance i,
- gv$process p
- WHERE wc.instance = i.instance_number (+)
- AND (wc.instance = s.inst_id (+) and wc.sid = s.sid (+)
- and wc.sess_serial# = s.serial# (+))
- AND (s.final_blocking_instance = bs.inst_id (+) and s.final_blocking_session = bs.sid (+))
- AND (bs.inst_id = p.inst_id (+) and bs.paddr = p.addr (+))
- AND ( num_waiters> 0
- OR ( blocker_osid IS NOT NULL
- AND in_wait_secs> 10 ) )
- ORDER BY chain_id,
- num_waiters DESC)
- WHERE ROWNUM < 101;
这里可以看到当前会话是 2395 在等待 enq: TM contention, 而它的顶级阻塞者是 2309 通过这些脚本我们能够方便的进行查询, 能够方便的找到谁是阻塞者, 甚至是最上层的阻塞者当然在我们的 diaghang.sql 的脚本里面, 我们看到了下列内容这里 Oracle 通过我们内存直接访问, 从 x$ksdhng_chains 里面把 chain 的信息全部获取出来, 用于最终的 hang 分析的诊断
- -- dump hang analysis chains
- oradebug direct_access enable trace
- oradebug direct_access disable reply
- oradebug direct_access set content_type = text/plain
- oradebug direct_access select * from x$ksdhng_chains
来源: http://www.bubuko.com/infodetail-2529318.html