使用coe脚本固定执行计划和手工指定profile的outline的方式有什么区别?

测试目的:
1,假设下面的两个语句分别为A语句和B语句:
A语句: select count(*) from lunar where extent_id=1;
B语句: select /*+ FULL(lunar) */ count(*) from lunar where extent_id=1;
2,使用coe的脚本(在SQLT中可以知道整套coe脚本)手工为A语句指定B语句的执行计划,看看什么效果
(即,让原本正常走索引的A语句使用全表扫描的B语句的执行计划)

3,使用dbms_sqltune.import_sql_profile为A语句指定执行计划,看看什么效果

首先,做一个测试表:

[oracle@lunar /]$ ss

SQL*Plus: Release 11.2.0.3.0 Production on Sat Jun 7 21:29:52 2014

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Welcome Lunar's oracle world!

Love you , baby !

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning and Real Application Testing options

SYS@lunar>conn lunar/lunar
Welcome Lunar's oracle world!

Love you , baby !

Connected.
LUNAR@lunar>create table lunar as select * from dba_extents;

Table created.

Elapsed: 00:00:04.61

在extent_id列上创建非唯一索引:

LUNAR@lunar>create index idx_lunar_extent_id on lunar(extent_id) ;

Index created.

Elapsed: 00:00:00.02

LUNAR@lunar>EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'LUNAR',TABNAME => 'LUNAR',DEGREE => 5,
 2  CASCADE => TRUE,ESTIMATE_PERCENT => DBMS_STATS.AUTO_SAMPLE_SIZE);

PL/SQL procedure successfully completed.

Elapsed: 00:00:01.45

可以看到下面的语句正确的使用了索引(把这条语句称之为A语句):

LUNAR@lunar>select count(*) from lunar where extent_id=1;

  COUNT(*)
----------
       323

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1050880347  ---这里是A语句的执行计划的plan hash value

-----------------------------------------------------------------------------------------
| Id  | Operation         | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                     |     1 |     3 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |                     |     1 |     3 |            |          |
|*  2 |   INDEX RANGE SCAN| IDX_LUNAR_EXTENT_ID |    84 |   252 |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("EXTENT_ID"=1)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        527  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

LUNAR@lunar>

我使用hint指定该语句必须走全表扫描,姑且把这条语句称为B语句:

LUNAR@lunar>select /*+ FULL(lunar) */ count(*) from lunar where extent_id=1;

  COUNT(*)
----------
       323

Elapsed: 00:00:00.02

Execution Plan
----------------------------------------------------------
Plan hash value: 291688323  ---这里是B语句的执行计划的plan hash value

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |     1 |     3 |    16   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |       |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| LUNAR |    84 |   252 |    16   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("EXTENT_ID"=1)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         50  consistent gets
          0  physical reads
          0  redo size
        527  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

LUNAR@lunar>

找到A语句和B语句的sql_id:

LUNAR@lunar>select sql_id from v$sql where sql_text like '%FULL(lunar)%'; 

SQL_ID
-------------
078qb40kcsnnu
7f8h3z6pj9w52
05538tk3vkzu3

Elapsed: 00:00:00.10

LUNAR@lunar>select sql_id,sql_text from v$sql where sql_text like '%extent_id=1%'; 

SQL_ID
-------------
SQL_TEXT
----------------------------------------------------------------------------------------------------------------------------------------------------
078qb40kcsnnu
EXPLAIN PLAN SET STATEMENT_ID='PLUS271153' FOR select /*+ FULL(lunar) */ count(*) from lunar where extent_id=1

39sr1xpxpkbmv
select sql_id,sql_text from v$sql where sql_text like '%extent_id=1%'

05538tk3vkzu3  ---------注意,这里是B语句的sql_id
select /*+ FULL(lunar) */ count(*) from lunar where extent_id=1

fvrmp2a2t38dc  ---------注意,这里是A语句的sql_id
select count(*) from lunar where extent_id=1

g4bcvwhrgradr
EXPLAIN PLAN SET STATEMENT_ID='PLUS271153' FOR select count(*) from lunar where extent_id=1

g4bcvwhrgradr
EXPLAIN PLAN SET STATEMENT_ID='PLUS271153' FOR select count(*) from lunar where extent_id=1


6 rows selected.

Elapsed: 00:00:00.03
LUNAR@lunar>

使用coe的脚本来固定执行计划,291688323是B语句的执行计划:

LUNAR@lunar>@coe_xfr_sql_profile fvrmp2a2t38dc 291688323

Parameter 1:
SQL_ID (required)



PLAN_HASH_VALUE AVG_ET_SECS
--------------- -----------
     1050880347        .039

Parameter 2:
PLAN_HASH_VALUE (required)


Values passed to coe_xfr_sql_profile:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SQL_ID         : "fvrmp2a2t38dc"
PLAN_HASH_VALUE: "291688323"

SQL>BEGIN
  2    IF :sql_text IS NULL THEN
  3      RAISE_APPLICATION_ERROR(-20100, 'SQL_TEXT for SQL_ID &&sql_id. was not found in memory (gv$sqltext_with_newlines) or AWR (dba_hist_sqltext).');
  4    END IF;
  5  END;
  6  /
SQL>SET TERM OFF;
SQL>BEGIN
  2    IF :other_xml IS NULL THEN
  3      RAISE_APPLICATION_ERROR(-20101, 'PLAN for SQL_ID &&sql_id. and PHV &&plan_hash_value. was not found in memory (gv$sql_plan) or AWR (dba_hist_sql_plan).');
  4    END IF;
  5  END;
  6  /
SQL>SET TERM OFF;

Execute coe_xfr_sql_profile_fvrmp2a2t38dc_291688323.sql
on TARGET system in order to create a custom SQL Profile
with plan 291688323 linked to adjusted sql_text.


COE_XFR_SQL_PROFILE completed.
SQL>@coe_xfr_sql_profile_fvrmp2a2t38dc_291688323.sql
SQL>REM
SQL>REM $Header: 215187.1 coe_xfr_sql_profile_fvrmp2a2t38dc_291688323.sql 11.4.4.4 2014/06/08 carlos.sierra $
SQL>REM
SQL>REM Copyright (c) 2000-2012, Oracle Corporation. All rights reserved.
SQL>REM
SQL>REM AUTHOR
SQL>REM   carlos.sierra@oracle.com
SQL>REM
SQL>REM SCRIPT
SQL>REM   coe_xfr_sql_profile_fvrmp2a2t38dc_291688323.sql
SQL>REM
SQL>REM DESCRIPTION
SQL>REM   This script is generated by coe_xfr_sql_profile.sql
SQL>REM   It contains the SQL*Plus commands to create a custom
SQL>REM   SQL Profile for SQL_ID fvrmp2a2t38dc based on plan hash
SQL>REM   value 291688323.
SQL>REM   The custom SQL Profile to be created by this script
SQL>REM   will affect plans for SQL commands with signature
SQL>REM   matching the one for SQL Text below.
SQL>REM   Review SQL Text and adjust accordingly.
SQL>REM
SQL>REM PARAMETERS
SQL>REM   None.
SQL>REM
SQL>REM EXAMPLE
SQL>REM   SQL> START coe_xfr_sql_profile_fvrmp2a2t38dc_291688323.sql;
SQL>REM
SQL>REM NOTES
SQL>REM   1. Should be run as SYSTEM or SYSDBA.
SQL>REM   2. User must have CREATE ANY SQL PROFILE privilege.
SQL>REM   3. SOURCE and TARGET systems can be the same or similar.
SQL>REM   4. To drop this custom SQL Profile after it has been created:
SQL>REM  EXEC DBMS_SQLTUNE.DROP_SQL_PROFILE('coe_fvrmp2a2t38dc_291688323');
SQL>REM   5. Be aware that using DBMS_SQLTUNE requires a license
SQL>REM  for the Oracle Tuning Pack.
SQL>REM   6. If you modified a SQL putting Hints in order to produce a desired
SQL>REM  Plan, you can remove the artifical Hints from SQL Text pieces below.
SQL>REM  By doing so you can create a custom SQL Profile for the original
SQL>REM  SQL but with the Plan captured from the modified SQL (with Hints).
SQL>REM
SQL>WHENEVER SQLERROR EXIT SQL.SQLCODE;
SQL>REM
SQL>VAR signature NUMBER;
SQL>VAR signaturef NUMBER;
SQL>REM
SQL>DECLARE
  2  sql_txt CLOB;
  3  h       SYS.SQLPROF_ATTR;
  4  PROCEDURE wa (p_line IN VARCHAR2) IS
  5  BEGIN
  6  DBMS_LOB.WRITEAPPEND(sql_txt, LENGTH(p_line), p_line);
  7  END wa;
  8  BEGIN
  9  DBMS_LOB.CREATETEMPORARY(sql_txt, TRUE);
 10  DBMS_LOB.OPEN(sql_txt, DBMS_LOB.LOB_READWRITE);
 11  -- SQL Text pieces below do not have to be of same length.
 12  -- So if you edit SQL Text (i.e. removing temporary Hints),
 13  -- there is no need to edit or re-align unmodified pieces.
 14  wa(q'[select count(*) from lunar where extent_id=1]');
 15  DBMS_LOB.CLOSE(sql_txt);
 16  h := SYS.SQLPROF_ATTR(
 17  q'[BEGIN_OUTLINE_DATA]',
 18  q'[IGNORE_OPTIM_EMBEDDED_HINTS]',
 19  q'[OPTIMIZER_FEATURES_ENABLE('11.2.0.3')]',
 20  q'[DB_VERSION('11.2.0.3')]',
 21  q'[ALL_ROWS]',
 22  q'[OUTLINE_LEAF(@"SEL$F5BB74E1")]',
 23  q'[MERGE(@"SEL$2")]',
 24  q'[OUTLINE(@"SEL$1")]',
 25  q'[OUTLINE(@"SEL$2")]',
 26  q'[FULL(@"SEL$F5BB74E1" "LUNAR"@"SEL$2")]',
 27  q'[END_OUTLINE_DATA]');
 28  :signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt);
 29  :signaturef := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt, TRUE);
 30  DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
 31  sql_text    => sql_txt,
 32  profile     => h,
 33  name        => 'coe_fvrmp2a2t38dc_291688323',
 34  description => 'coe fvrmp2a2t38dc 291688323 '||:signature||' '||:signaturef||'',
 35  category    => 'DEFAULT',
 36  validate    => TRUE,
 37  replace     => TRUE,
 38  force_match => FALSE /* TRUE:FORCE (match even when different literals in SQL). FALSE:EXACT (similar to CURSOR_SHARING) */ );
 39  DBMS_LOB.FREETEMPORARY(sql_txt);
 40  END;
 41  /

PL/SQL procedure successfully completed.

SQL>WHENEVER SQLERROR CONTINUE
SQL>SET ECHO OFF;

            SIGNATURE
---------------------
 12989549146262258752


           SIGNATUREF
---------------------
 11568928226314596944


... manual custom SQL Profile has been created


COE_XFR_SQL_PROFILE_fvrmp2a2t38dc_291688323 completed
SQL>

根据提示执行COE_XFR_SQL_PROFILE_fvrmp2a2t38dc_291688323.sql脚本:

SQL>@coe_xfr_sql_profile_fvrmp2a2t38dc_291688323.sql
SQL>REM
SQL>REM $Header: 215187.1 coe_xfr_sql_profile_fvrmp2a2t38dc_291688323.sql 11.4.4.4 2014/06/08 carlos.sierra $
SQL>REM
SQL>REM Copyright (c) 2000-2012, Oracle Corporation. All rights reserved.
SQL>REM
SQL>REM AUTHOR
SQL>REM   carlos.sierra@oracle.com
SQL>REM
SQL>REM SCRIPT
SQL>REM   coe_xfr_sql_profile_fvrmp2a2t38dc_291688323.sql
SQL>REM
SQL>REM DESCRIPTION
SQL>REM   This script is generated by coe_xfr_sql_profile.sql
SQL>REM   It contains the SQL*Plus commands to create a custom
SQL>REM   SQL Profile for SQL_ID fvrmp2a2t38dc based on plan hash
SQL>REM   value 291688323.
SQL>REM   The custom SQL Profile to be created by this script
SQL>REM   will affect plans for SQL commands with signature
SQL>REM   matching the one for SQL Text below.
SQL>REM   Review SQL Text and adjust accordingly.
SQL>REM
SQL>REM PARAMETERS
SQL>REM   None.
SQL>REM
SQL>REM EXAMPLE
SQL>REM   SQL> START coe_xfr_sql_profile_fvrmp2a2t38dc_291688323.sql;
SQL>REM
SQL>REM NOTES
SQL>REM   1. Should be run as SYSTEM or SYSDBA.
SQL>REM   2. User must have CREATE ANY SQL PROFILE privilege.
SQL>REM   3. SOURCE and TARGET systems can be the same or similar.
SQL>REM   4. To drop this custom SQL Profile after it has been created:
SQL>REM  EXEC DBMS_SQLTUNE.DROP_SQL_PROFILE('coe_fvrmp2a2t38dc_291688323');
SQL>REM   5. Be aware that using DBMS_SQLTUNE requires a license
SQL>REM  for the Oracle Tuning Pack.
SQL>REM   6. If you modified a SQL putting Hints in order to produce a desired
SQL>REM  Plan, you can remove the artifical Hints from SQL Text pieces below.
SQL>REM  By doing so you can create a custom SQL Profile for the original
SQL>REM  SQL but with the Plan captured from the modified SQL (with Hints).
SQL>REM
SQL>WHENEVER SQLERROR EXIT SQL.SQLCODE;
SQL>REM
SQL>VAR signature NUMBER;
SQL>VAR signaturef NUMBER;
SQL>REM
SQL>DECLARE
  2  sql_txt CLOB;
  3  h       SYS.SQLPROF_ATTR;
  4  PROCEDURE wa (p_line IN VARCHAR2) IS
  5  BEGIN
  6  DBMS_LOB.WRITEAPPEND(sql_txt, LENGTH(p_line), p_line);
  7  END wa;
  8  BEGIN
  9  DBMS_LOB.CREATETEMPORARY(sql_txt, TRUE);
 10  DBMS_LOB.OPEN(sql_txt, DBMS_LOB.LOB_READWRITE);
 11  -- SQL Text pieces below do not have to be of same length.
 12  -- So if you edit SQL Text (i.e. removing temporary Hints),
 13  -- there is no need to edit or re-align unmodified pieces.
 14  wa(q'[select count(*) from lunar where extent_id=1]');
 15  DBMS_LOB.CLOSE(sql_txt);
 16  h := SYS.SQLPROF_ATTR(
 17  q'[BEGIN_OUTLINE_DATA]',
 18  q'[IGNORE_OPTIM_EMBEDDED_HINTS]',
 19  q'[OPTIMIZER_FEATURES_ENABLE('11.2.0.3')]',
 20  q'[DB_VERSION('11.2.0.3')]',
 21  q'[ALL_ROWS]',
 22  q'[OUTLINE_LEAF(@"SEL$F5BB74E1")]',
 23  q'[MERGE(@"SEL$2")]',
 24  q'[OUTLINE(@"SEL$1")]',
 25  q'[OUTLINE(@"SEL$2")]',
 26  q'[FULL(@"SEL$F5BB74E1" "LUNAR"@"SEL$2")]',
 27  q'[END_OUTLINE_DATA]');
 28  :signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt);
 29  :signaturef := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt, TRUE);
 30  DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
 31  sql_text    => sql_txt,
 32  profile     => h,
 33  name        => 'coe_fvrmp2a2t38dc_291688323',
 34  description => 'coe fvrmp2a2t38dc 291688323 '||:signature||' '||:signaturef||'',
 35  category    => 'DEFAULT',
 36  validate    => TRUE,
 37  replace     => TRUE,
 38  force_match => FALSE /* TRUE:FORCE (match even when different literals in SQL). FALSE:EXACT (similar to CURSOR_SHARING) */ );
 39  DBMS_LOB.FREETEMPORARY(sql_txt);
 40  END;
 41  /

PL/SQL procedure successfully completed.

SQL>WHENEVER SQLERROR CONTINUE
SQL>SET ECHO OFF;

            SIGNATURE
---------------------
 12989549146262258752


           SIGNATUREF
---------------------
 11568928226314596944


... manual custom SQL Profile has been created


COE_XFR_SQL_PROFILE_fvrmp2a2t38dc_291688323 completed

再次查询,验证一下A语句是否已经按照我们指定的那样采用了B语句的全表扫描的执行计划:

SQL>select count(*) from lunar where extent_id=1;

  COUNT(*)
----------
       323


Execution Plan
----------------------------------------------------------
Plan hash value: 1050880347

--------------------------------------------------------------------------------
---------

| Id  | Operation         | Name                | Rows  | Bytes | Cost (%CPU)| T
ime     |

--------------------------------------------------------------------------------
---------

|   0 | SELECT STATEMENT  |                     |     1 |     3 |     1   (0)| 0
0:00:01 |

|   1 |  SORT AGGREGATE   |                     |     1 |     3 |            |
        |

|*  2 |   INDEX RANGE SCAN| IDX_LUNAR_EXTENT_ID |    84 |   252 |     1   (0)| 0
0:00:01 |

--------------------------------------------------------------------------------
---------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("EXTENT_ID"=1)

Note
-----
   - SQL profile "coe_fvrmp2a2t38dc_291688323" used for this statement


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        527  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

结论:我们的猜测是错误的,使用coe的脚本固定执行计划,必须是从该sql_id的已经有的所有执行计划中挑选其一,而不能凭空指定一个你认为合适的。

下面,我们使用其他方法固定SQL的执行计划。首先找出B语句的执行计划和outline:

SQL>set lines 1000
SQL>set pages 10000
SQL>select * from table(dbms_xplan.display_cursor('05538tk3vkzu3',null,'outline'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  05538tk3vkzu3, child number 0
-------------------------------------
select /*+ FULL(lunar) */ count(*) from lunar where extent_id=1

Plan hash value: 291688323

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |       |       |    16 (100)|          |
|   1 |  SORT AGGREGATE    |       |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| LUNAR |    84 |   252 |    16   (0)| 00:00:01 |
----------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
      DB_VERSION('11.2.0.3')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "LUNAR"@"SEL$1")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("EXTENT_ID"=1)


33 rows selected.

SQL>

使用下面的脚本为A语句指定B语句的outline:

LUNAR@lunar>declare
  2  v_hints sys.sqlprof_attr;
  3  sql_txt clob;
  4  begin 
  5  v_hints:=sys.sqlprof_attr(
  6        'BEGIN_OUTLINE_DATA',
  7        'IGNORE_OPTIM_EMBEDDED_HINTS',
  8        'OPTIMIZER_FEATURES_ENABLE(''11.2.0.3'')',
  9        'DB_VERSION(''11.2.0.3'')',
 10        'ALL_ROWS',
 11        'OUTLINE_LEAF(@"SEL$1")',
 12        'FULL(@"SEL$1" "LUNAR"@"SEL$1")',
 13        'END_OUTLINE_DATA'
 14  );
 15    select sql_fulltext into sql_txt from v$sql where sql_id='fvrmp2a2t38dc';
 16    dbms_sqltune.import_sql_profile(sql_text => sql_txt,
 17    profile => v_hints,name => 'SQLPROF_Lunar',
 18    replace => TRUE,force_match => TRUE);
 19  end;
 20  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.19
LUNAR@lunar>

再次测试,发现语句A已经采用了我们指定的outline,即使用了B语句的全表扫描的执行计划:
LUNAR@lunar>select count(*) from lunar where extent_id=1;

  COUNT(*)
----------
       323

Elapsed: 00:00:00.02

Execution Plan
----------------------------------------------------------
Plan hash value: 291688323

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |     1 |     3 |    16   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |       |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| LUNAR |    84 |   252 |    16   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("EXTENT_ID"=1)

Note
-----
   - SQL profile "SQLPROF_Lunar" used for this statement  注意,这里已经采用了SQLPROF_Lunar


Statistics
----------------------------------------------------------
         10  recursive calls
          0  db block gets
         57  consistent gets
          1  physical reads
          0  redo size
        527  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

LUNAR@lunar>

结论:
1,使用coe的脚本固定执行计划,必须是从该sql_id的已经有的所有执行计划中挑选其一,而不能凭空指定一个你认为合适的。
因此,如果需要从某条sql已经执行过的执行计划中挑选一个合适的,那么coe的脚本简单轻巧,非常合适。

2,如果要手工构造一个该SQL语句从未使用的执行计划,可以采用dbms_sqltune.import_sql_profile的方式手工设置outline,从而改变执行计划。
在这种需求下,coe的脚本是无能为力的。

发表在 Performence Tuning | 标签为 , , | 5 条评论

打赌结果——12.1.0.2.0曾经可以这样

今天跟深圳小兄弟打赌,内容不详述了,结果如下,当事人和当时围观的人自然清楚,不清楚的我也不便解释了,O(∩_∩)O哈哈~

其实这个测试我本来不用做的,因为飞总早在几个月前已经测试过了,但是兄弟不相信,我不得不再弄一次,O(∩_∩)O哈哈~
先看下当前环境,注意,这里的信息都足以证明是一个非exadata环境……


[root@localhost ~]# imageinfo
-bash: imageinfo: command not found
[root@localhost ~]# imagehistory
-bash: imagehistory: command not found
[root@localhost ~]# uname -a
Linux localhost.localdomain 2.6.32-300.10.1.el5uek #1 SMP Wed Feb 22 17:37:40 EST 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost ~]# cat /etc/*relea*
Enterprise Linux Enterprise Linux Server release 5.8 (Carthage)
cat: /etc/lsb-release.d: Is a directory
Oracle Linux Server release 5.8
Red Hat Enterprise Linux Server release 5.8 (Tikanga)
[root@localhost ~]# 


[root@localhost ~]# ls /etc/*relea*
/etc/enterprise-release  /etc/oracle-release  /etc/redhat-release

/etc/lsb-release.d:
core-4.0-amd64  core-4.0-noarch     graphics-4.0-ia32    printing-4.0-amd64  printing-4.0-noarch
core-4.0-ia32   graphics-4.0-amd64  graphics-4.0-noarch  printing-4.0-ia32
[root@localhost ~]# 
[root@localhost ~]# 

[root@localhost opt]# ps -ef|grep xd
root     23530 23350  0 20:35 pts/1    00:00:00 grep xd
[root@localhost opt]# 

[root@localhost ~]# cd /opt
[root@localhost opt]# ls
ORCLfmap
[root@localhost opt]# pwd
/opt
[root@localhost opt]# 

下面我们来测试in-memory option功能(注意版本):

SQL> SELECT * FROM V$VERSION;
 
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit                         0
PL/SQL Release 12.1.0.2.0 -                                                               0
CORE    12.1.0.2.0                                                                        0
TNS for Linux: Version 12.1.0.2.0 -                                                       0
NLSRTL Version 12.1.0.2.0 -                                                               0
 
SQL> SHOW PARAMETER inmemory;
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
inmemory_clause_default              string
inmemory_force                       string      DEFAULT
inmemory_query                       string      ENABLE
inmemory_size                        big integer 200M


SQL> select TABLE_NAME,INMEMORY_PRIORITY,INMEMORY_DISTRIBUTE,INMEMORY_COMPRESSION from user_tables;
 
TABLE_NAME                     INMEMORY INMEMORY_DISTRI INMEMORY_COMPRESS
------------------------------ -------- --------------- -----------------
LUNAR
 
SQL>  SELECT * FROM V$INMEMORY_AREA;
 
POOL                       ALLOC_BYTES USED_BYTES POPULATE_STATUS                CON_ID
-------------------------- ----------- ---------- -------------------------- ----------
1MB POOL                     166723584          0 DONE                                3
64KB POOL                     33554432          0 DONE                                3

SQL>  alter table  LUNAR inmemory;
 
Table altered.
 
SQL> SELECT COUNT(*) FROM LUNAR;
 
  COUNT(*)
----------
     90000

SQL> SELECT * FROM V$INMEMORY_AREA;
 
POOL                       ALLOC_BYTES USED_BYTES POPULATE_STATUS                CON_ID
-------------------------- ----------- ---------- -------------------------- ----------
1MB POOL                     166723584    4194304 DONE                                3
64KB POOL                     33554432     131072 DONE                                3
 

猜猜,谁赢了,O(∩_∩)O哈哈~

发表在 ORACLE 12C | 标签为 , | 留下评论

DUL第二篇——使用DUL抽取dmp文件内容

dul常用命令:
DUL> unload database ;
DUL> unload user ;
DUL> unload table ;
DUL> scan database;
DUL> scan tables;

记录一下测试表的记录数目:

SYS@lunar>select count(*) from lunar.lunar;

  COUNT(*)
----------
     17634  这里我们看到是17634 行记录

Elapsed: 00:00:00.09
SYS@lunar>

其余的配置参数,参考第一篇DUL 第一篇 —— DUL是什么?
启动DUL,然后直接执行unpump header,之后就可以抽取了:

DUL> unpump header dump file lunar.01.dmp;
Version is 769
check sum is 1864601239
data pump id is 6783164
master_obj_no is 18333
header blocks is 1
data pump file number is 1
block size is 4096
character set id is 873
master table block offset is 411
(Master table is at byte offset (411 -1) * 4096 = 1679360)
DUL> unpump table lunar.dmp (OWNER VARCHAR2(30),OBJECT_NAME VARCHAR2(128),SUBOBJECTNAME VARCHAR2(30),OBJECT_ID NUMBER,DATA_OBJECT_ID NUMBER,OBJECT_TYPE VARCHAR2(19),CREATED DATE,LAST_DDL_TIME DATE,TIMESTAMP VARCHAR2(19),STATUS VARCHAR2(7),TEMPORARY VARCHAR2(1),GENERATED VARCHAR2(1),SECONDARY VARCHAR2(1),NAMESPACE NUMBER,EDITION_NAME VARCHAR2(30)) dump file lunar.01.dmp from 15048 until 1676342; 
17634 rows unloaded
DUL> 

这里我们看到17634 行记录全部抽取出来了,如果你想测试的更好玩,可以dd掉其中的一部分数据,然后测试看dul怎么工作。

然后直接导入数据,就这么简单,O(∩_∩)O哈哈~


[oracle@lunar dul]$ sqlldr userid=lunar/lunar control=/home/oracle/test/dul/dump000.ctl

SQL*Loader: Release 11.2.0.3.0 - Production on Thu Mar 6 13:11:44 2014

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Commit point reached - logical record count 64
Commit point reached - logical record count 128
Commit point reached - logical record count 192
Commit point reached - logical record count 256
Commit point reached - logical record count 320
Commit point reached - logical record count 384
Commit point reached - logical record count 448
Commit point reached - logical record count 512
Commit point reached - logical record count 576
Commit point reached - logical record count 640
Commit point reached - logical record count 704
Commit point reached - logical record count 768
Commit point reached - logical record count 832
Commit point reached - logical record count 896
................
Commit point reached - logical record count 17270
Commit point reached - logical record count 17334
Commit point reached - logical record count 17398
Commit point reached - logical record count 17462
Commit point reached - logical record count 17526
Commit point reached - logical record count 17590
Commit point reached - logical record count 17634
[oracle@lunar dul]$

 

这里我们看到全部的17634 行数据都导入表中了。

发表在 DUL ODU | 标签为 , | 2 条评论

DUL 第一篇 —— DUL是什么?

最近打算陆续研究一些DUL的东西,这个在互联网上已经有太多了,我本人用的不多,主要是确实很多时候用不上(运维做得好,这些其实都几乎用不上……)

不过这些东西涉及了很多对于Oracle的运行机制和原理的理解,玩玩还是很有意思的 :)

DUL是Oracle Internal工具,专供Oracle Support使用,请勿自行在工作环境操作,否则后果自负!!

第一篇 DUL是什么?

发表在 DUL ODU | 标签为 | 留下评论

Exadata上,ASM的Req_mir_free_MB 是如何得到的?

Exadata上,ASM的Req_mir_free_MB 是如何得到的?
首先,我们都知道,可通过v$asm_diskgroup来查看required_mirror_free_mb:

SQL> set linesize 150
SQL> set pages 999
SQL> SELECT name, type, group_number, total_mb, free_mb, required_mirror_free_mb FROM v$asm_diskgroup ORDER BY name;

NAME                           TYPE   GROUP_NUMBER   TOTAL_MB    FREE_MB REQUIRED_MIRROR_FREE_MB
------------------------------ ------ ------------ ---------- ---------- -----------------------
DATA_DM01                      NORMAL            1   15593472    9371300                 2074432
DBFS_DG                        NORMAL            2     894720     893248                   81338
RECO_DM01                      NORMAL            3    3896064    2597260                  434924

SQL> 

这里看到,实际上lsdg的输出跟v$asm_diskgroup是一致的(注意,早起的数据库版本由于rebalance时的bug,可能造成显示不一致,例如 Bug 7699985)

ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  4194304  15593472  9371300          2074432         3648434              0             N  DATA_DM01/
MOUNTED  NORMAL  N         512   4096  4194304    894720   893248            81338          405955              0             Y  DBFS_DG/
MOUNTED  NORMAL  N         512   4096  4194304   3896064  2597260           434924         1081168              0             N  RECO_DM01/
ASMCMD> 

由于Exadata上,没有外部冗余,数据完全通过ASM镜像,那么就要考虑当磁盘故障或者cell故障时的数据保护,需要相应的根据Normal Redundancy或者High Redundancy来考虑分别两种情况:

1,在Normal Redundancy时,要考虑1块磁盘损坏或者1个cell不能启动时,如果修复时间超过缺省的3.6小时,需要使用现有系统中可用磁盘空间做Rebalance

2,在HighRedundancy时,要考虑2块磁盘损坏或者2个cell不能启动时,如果修复时间超过缺省的3.6小时,需要使用现有系统中可用磁盘空间做Rebalance

Oracle根据内部算法,使用一个调解因子做系数,比如,不同版本的Oracle,评估方法不同:

如果数据库是11.2.0.3或者11.2.0.4,那么这个调解因子为1.10,如果是低于这个版本的数据库,调解因子是1.5(具体该因子的算法尚未知道)

对于每一个磁盘组,他们需要调解的空间大小(即,预留空间)=v$asm_diskgroup.required_mirror_free_mb * 调解因子

然后根据不同的镜像(Normal或者High),以及不同的Exadata配置(1/8,1/4、1/2,满配)再分别乘以一个系数,下面的代码根据每个磁盘组的最大的failgroup的大小来确定需要保留的空间:
然后,计算出来总共的需要“预留的空间”按照不同配置的Exadata乘以一个比例,就是我们这看到的 REQUIRED_MIRROR_FREE_MB:

对于Normal Redundancy:


      -- Eighth Rack
      IF dg.type = 'NORMAL' THEN
      
         -- Eighth Rack
         IF (v_num_disks < 36) THEN
            -- Use eqn: y = 1.21344 x+ 17429.8
            v_required_free_mb :=  1.21344 * v_max_total_mb + 17429.8;
            IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
         
         -- Quarter Rack
         ELSIF (v_num_disks >= 36 AND v_num_disks < 84) THEN 
            -- Use eqn: y = 1.07687 x+ 19699.3
   -- Revised 2/21/14 for 11.2.0.4 to use eqn: y=0.803199x + 156867, more space but safer
            v_required_free_mb := 0.803199 * v_max_total_mb + 156867;
            IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
         
         -- Half Rack
         ELSIF (v_num_disks >= 84 AND v_num_disks < 168) THEN 
            -- Use eqn: y = 1.02475 x+53731.3
            v_required_free_mb := 1.02475 * v_max_total_mb + 53731.3;
            IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;

         -- Full rack is most conservative, it will be default
         ELSE
            -- Use eqn: y = 1.33333 x+83220.
            v_required_free_mb := 1.33333 * v_max_total_mb + 83220;
            IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;      
         
         END IF;
         
         -- DISK usable file MB
         v_usable_mb := ROUND((dg.free_mb - v_required_free_mb)/2);
         v_disk_desc := 'ONE disk';
         
         -- CELL usable file MB
         v_cell_usable_mb := ROUND( (dg.free_mb - v_one_cell_req_mir_free_mb)/2 );
         v_one_cell_usable_mb := v_cell_usable_mb;

对于High Redundancy:


         -- Eighth Rack
         IF (v_num_disks <= 18) THEN
            -- Use eqn: y = 4x + 0
   -- Updated for 11.2.0.4 to higher value:  y = 3.84213x + 84466.4
            v_required_free_mb :=  3.84213 * v_max_total_mb + 84466.4;
            IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
         
         -- Quarter Rack
         ELSIF (v_num_disks > 18 AND v_num_disks <= 36) THEN 
            -- Use eqn: y = 3.87356 x+417692.
            v_required_free_mb := 3.87356 * v_max_total_mb + 417692;
            IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
         
         -- Half Rack
         ELSIF (v_num_disks > 36 AND v_num_disks <= 84) THEN 
            -- Use eqn: y = 2.02222 x+56441.6
            v_required_free_mb := 2.02222 * v_max_total_mb + 56441.6;
            IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;

         -- Full rack is most conservative, it will be default
         ELSE
            -- Use eqn: y = 2.14077 x+54276.4
            v_required_free_mb := 2.14077 * v_max_total_mb + 54276.4;
            IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;      
         
         END IF;
      
         -- DISK usable file MB
         v_usable_mb := ROUND((dg.free_mb - v_required_free_mb)/3);      
         v_disk_desc := 'TWO disks';   
         
         -- CELL usable file MB
         v_one_cell_usable_mb := ROUND( (dg.free_mb - v_one_cell_req_mir_free_mb)/3 );

完整的代码可在这里下载

发表在 ASM, FAQ, 内部机制 | 标签为 , , | 留下评论

Exadata和非Exadata平台上,ASM的可用空间如何计算?

曾经很多次,有客户问到ASM上可用的空闲空间问题,实际上,由于ASM带有3中冗余设置方式,分别应对不同场景的数据冗余情况,因此,通常在Exadata上的选择和非Exadata上的选择是不同的。
.
在非Exadata的环境中,通常我们会使用RAID,比如RAID10,因此ASM中使用External Redundancy,例如:

ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576     51200    45862                0           45862              0             N  CFARCH1/
MOUNTED  EXTERN  N         512   4096  1048576     51200    47653                0           47653              0             N  CFARCH2/
MOUNTED  EXTERN  N         512   4096  1048576    563200   486047                0          486047              0             N  CFDATA/
MOUNTED  EXTERN  N         512   4096  1048576     51200    50804                0           50804              0             Y  CRS/
MOUNTED  EXTERN  N         512   4096  1048576     51200    51057                0           51057              0             N  HEARCH1/
MOUNTED  EXTERN  N         512   4096  1048576     51200    50965                0           50965              0             N  HEARCH2/
MOUNTED  EXTERN  N         512   4096  1048576   1536000  1177575                0         1177575              0             N  HEDATA/
MOUNTED  EXTERN  N         512   4096  1048576     51200    51066                0           51066              0             N  MZARCH1/
MOUNTED  EXTERN  N         512   4096  1048576     51200    50951                0           50951              0             N  MZARCH2/
MOUNTED  EXTERN  N         512   4096  1048576    358400   155375                0          155375              0             N  MZDATA/
MOUNTED  EXTERN  N         512   4096  1048576    102400      967                0             967              0             N  OGGACFS/
MOUNTED  EXTERN  N         512   4096  1048576     51200    49896                0           49896              0             N  TLARCH1/
MOUNTED  EXTERN  N         512   4096  1048576     51200    50157                0           50157              0             N  TLARCH2/
MOUNTED  EXTERN  N         512   4096  1048576    563200   381834                0          381834              0             N  TLDATA/
ASMCMD>

.
此时,不需要考虑使用ASM内部的预留空间做故障盘冗余等操作,因此Req_mir_free_MB这列为0,Free_MB的含义和数值跟Usable_file_MB是一致的。
.
而在Exadata上,我们都知道所有的存储节点是不做RAID的(每个cell的前两快盘的前面29.1G做了软RAID1),数据的冗余和条带划都是通过ASM自身的机制来保证的。
.
因此,在Exadata上不支持External Redundancy,缺省安装采用Normal Redundancy。
.
========================Image 11.2.2.4.2========Oracle 11.2.0.2==========

这个例子是Exadata 1/4 Rack下面的输出:

ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB   Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  4194304  16625664  16624360          1511424         7556468              0             Y  DATA_DG/
MOUNTED  NORMAL  N         512   4096  4194304    894720    893568            81338          406115              0             N  DBFS_DG/
MOUNTED  NORMAL  N         512   4096  4194304   2863872   2863224           260352         1301436              0             N  RECO_DM01/
ASMCMD>

要知道这里面每一个数据是怎么计算出来的,首先要知道每一个ASM DISK是怎么创建出来的,下面是这套系统安装时创建GRIDDISK的命令:
.
INFO: We will be creating the following disk groups…
INFO: DATA_DM01 with size 451G remaining space 77.734375g
INFO: RECO_DM01 with size ALL=77.734375g remaining space 0
.
cellcli -e create griddisk ALL HARDDISK prefix=testgd,size=528.6875G
cellcli -e create griddisk DBFS_DG_CD_02_dm01cel01 celldisk=CD_02_dm01cel01,size=29.125G —注意,DBFS_DG只在每个cell的后面10块盘上划分,因为前两快盘的那29.125G 用来装OS了
.
cellcli -e drop griddisk ALL prefix=testgd force
cellcli -e create griddisk ALL HARDDISK prefix=DATA_DM01, size=451G
cellcli -e create griddisk ALL HARDDISK prefix=RECO_DM01

这里我们看到,onecommand缺省安装时,在创建grid的时候,总是先划分出来最内侧的528.6875G用来创建了griddisk:testdg
.
之后,创建了29.125的DBFS_DG,再之后又删除了之前的testdg,然后分别划分和创建了DATA_DM01和RECO_DM01。
.
这么做的考虑是Oracle希望按照磁盘的外道和内道进行划分,尽量让速度相对较快的空间给DATA_DM01来使用,RECO_DM01在中间,而DBFS_DG在相对最慢的空间。
.
不过我个人觉得,这个快与慢相对于整个系统来说,其速度恐怕没有太多的影响,因为OLTP,我们会借助FLASH CACHE,而OLAP系统,这点磁盘外道和内道的区别,相比跑几个小时甚至更长时间的系统来说,还有那么重要么?
.
1,1/4配置的Exadata(600G的盘)上,数据库可用的总空间(此处的总空间是不考虑“安全使用的情况”)为TOTAL_MB/2
即,
Total_MB : (16625664+894720+2863872)/2=20384256/2=10192128 MB
其中,每个磁盘组显示的Total_MB的数据的来源:
DATA_DM01 : 451*12*3*1024=16625664 MB
RECO_DM01 : 77.734375*12*3*1024=2865600 MB
DBFS_DG : 29.125*10*3*1024=894720 MB
.
我们都知道,1/4配置的3个cell,每个cell 12块盘600G盘,裸容量是: 600*12*3=22118400MB=21TB,而这里,我们看到ASM可用的总容量是20384256MB(19.44TB),那么其余的1.6TB(21-19.4=1.56T)在哪里呢?
.
我计算了一下,按照上面的划盘操作,每个cell的前两快盘的前面29.125G用来做操作系统了,后面10块盘的最前面的29.125G用来做DBFS_DG了。然后,每块盘上面有528.6875G用来做DATA_DM01和RECO_DM01。
.
即 600-528.6875-29.125=42.19GB
.
也就是说,没块盘上还有42.19G左右的空间既没有做GRIDDISK,也没有做操作系统空间,这个应该就是文档上说的用于管理的空间了。
.
我们计算一下,42.19*12*3=1.48T,这个值跟1.56T已经很接近了。 我猜,文档上说的用于管理的空间,恐怕就是这个了,O(∩_∩)O哈哈~
.
2,目前可以使用的总空间(不考虑“安全使用的情况”为)FREE_MB/2:
Free_MB : (16624360+893568+2863224)/2=10190576
.
3,可以安全使用的总空间是 :(FREE_MB – REQUIRED_MIRROR_FREE_MB)/2,即 USABLE_FILE_MB的值
Usable_file_MB=(FREE_MB – REQUIRED_MIRROR_FREE_MB)/2
其中,每个磁盘组显示的Usable_file_MB的数据的来源:
DATA_DM01 : (16624360-1511424)/2=7556468 MB
RECO_DM01 : (893568-81338)/2=406115 MB
DBFS_DG : (2863224-260352)/2=1301436 MB
.
对于USABLE_FILE_MB为负值的解释如下:
(1)、当Free_MB < Req_mir_free_MB时,可以安全使用的空间(即USABLE_FILE_MB)就会为负值.
比如,这里DATA_DM01为 -2.33 TB,这个表示当磁盘组如果有一个磁盘失效,是不能有效恢复成redundancy的,即也有部分数据会处在”不安全”的状态下。
举例来说,就是,此时如果DATA_DM01所属的某块物理盘损坏(目前采用的是每块600GB的HP盘),那么将会有600GB的数据只能保有一份,不能进行redudancy。
.
(2)、如果在实际运行中,储备了部分磁盘或者可以确定硬件工程师能够很快更换硬盘,那么当有磁盘损坏时,ASM就会使用新更换的硬盘做redundancy.
而不依赖于现有online的磁盘上的空闲空间去进行数据的冗余恢复机制。
此时,USABLE_FILE_MB为负值对系统就没有太大影响了,即,此时Usable_file_MB 只是反映了当前”可安全利用的剩余空间”.
这种情况下,客户只需要关注FREE_MB,只要FREE_MB为正值,就没有问题。
.
当然,不同的版本,使用的空间可能不同,比如更早期的版本,可能使用的是如下的划分方法。但是,变化主要在DATA_DM01和RECO_DM01的大小上:

INFO: We will be creating the following disk groups...
INFO: DATA_DM01 with size 451G remaining space 77.734375g
INFO: RECO_DM01 with size ALL=77.734375g remaining  space 0
INFO: checking nodes in /opt/oracle.SupportTools/onecommand/cell_group...

对于系统预留的管理空间和DBFS_DG的空间,没有变化。
.
SQL> @a.sql
—— DISK and CELL Failure Diskgroup Space Reserve Requirements ——
This procedure determines how much space you need to survive a DISK or CELL failure.
It also shows the usable space available when reserving space for disk or cell failure.
Please see MOS note 1551288.1 for more information.
. . .
Description of Derived Values:
One Cell Required Mirror Free MB : Required Mirror Free MB to permit successful rebalance after losing largest CELL regardless of redundancy type
Disk Required Mirror Free MB : Space needed to rebalance after loss of single or double disk failure (for normal or high redundancy)
Disk Usable File MB : Usable space available after reserving space for disk failure and accounting for mirroring
Cell Usable File MB : Usable space available after reserving space for SINGLE cell failure and accounting for mirroring
. . .
ASM Version: 11.2.0.2 – WARNING DISK FAILURE COVERAGE ESTIMATES HAVE NOT BEEN VERIFIED ON THIS VERSION!
. . .
————————————————————————-
DG Name: DATA_DM01
DG Type: NORMAL
Num Disks: 36
Disk Size MB: 433,152
. . .
DG Total MB: 15,593,472
DG Used MB: 6,222,172
DG Free MB: 9,371,300
. . .
One Cell Required Mirror Free MB: 7,796,736
. . .
Disk Required Mirror Free MB: 504,774
. . .
Disk Usable File MB: 4,433,263
Cell Usable File MB: 787,282
. . .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: PASS
————————————————————————-
DG Name: DBFS_DG
DG Type: NORMAL
Num Disks: 30
Disk Size MB: 29,824
. . .
DG Total MB: 894,720
DG Used MB: 1,472
DG Free MB: 893,248
. . .
One Cell Required Mirror Free MB: 447,360
. . .
Disk Required Mirror Free MB: 53,619
. . .
Disk Usable File MB: 419,814
Cell Usable File MB: 222,944
. . .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: PASS
————————————————————————-
DG Name: RECO_DM01
DG Type: NORMAL
Num Disks: 36
Disk Size MB: 108,224
. . .
DG Total MB: 3,896,064
DG Used MB: 1,298,804
DG Free MB: 2,597,260
. . .
One Cell Required Mirror Free MB: 1,948,032
. . .
Disk Required Mirror Free MB: 243,792
. . .
Disk Usable File MB: 1,176,734
Cell Usable File MB: 324,614
. . .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: PASS
. . .
Script completed.

PL/SQL procedure successfully completed.

SQL>

发表在 ASM, FAQ, 内部机制 | 标签为 , , | 2 条评论

我在Oracle的第二个ppt——Exadata运维交流

在Oracle工作8年了,这是第二个ppt,最满意的我学会了修改ppt模板,O(∩_∩)O哈哈~

本次交流的主要能容源于本网站,主要是跟同事一起交流Exadata运维中的常见问题,以及如何更好的为客户做好Exadata的相关服务。

Exadata运维交流

发表在 日常运维 | 2 条评论

Exadata上的进程-Diskmon进程

Master Diskmon是Oracle Clusterware 11.1.0.7版本引入的一个新的进程(主要是为了Exadata Storage Server软件而设计的),该进程作为缺省安装的一部分随着Oracle Clusterware的安装就存在了。
Master Diskmon主要负责监控cell,并负责跟数据库节点的diskmon进程通信。该进程还参与IO fencing机制和IORM(IO Resource Manager)。
Master Diskmon进程是一个单独的进程,他跟ocssd进程通信,即便是非Exadata环境,该进程也是存在的(只是非Exadata环境,Diskmon进程没有什么作用,后面会解释这个)。

在11.1.0.7中,/bin/sh /etc/init.d/init.cssd 会启动2个diskmon相关进程,即:

root 1717 0.0 0.0 6716 1368 ? Ss 11:43 0:07 /bin/sh /etc/init.d/init.cssd fatal
<span>root 2799 0.0 0.0 6720 1364 ? S 11:44 0:00 \_ /bin/sh /etc/init.d/init.cssd diskmon
oracle 3317 0.0 0.9 108864 18976 ? Sl 11:44 0:00 | \_ /u01/64bit/B107/crs/bin/diskmon.bin -d -f</span>

再来看下11.2的情况。我们都知道11.2的Oracle Clusterware结构与11.1中变化很大,所有的进程都是由ohasd启动的。下面是11.2的进程启动流程:



无标题


从图中可以看出来,当机器启动时,init 脚本会启动ohasd进程,而ohasd进程启动orarootagent 守护进程,这个orarootagent守护进程是root用户的所有ohasd资源的agent,他负责启动crsd、ctssd、ACFS Drivers和diskmon.
虽然名字是“Diskmon”,但是其主要作用是在Exadata的数据库节点上监控所有节点和网络的连接,作用是确认这些节点的存活状态。
在Exadata上,ocssd的IO Fencing进程就是跟diskmon进程通信来处理IO fencing的。

另外,从11.2以后,我们可以看到,

[root@dm01db01 init.d]# strings /u01/app/11.2.0.3/grid/bin/orarootagent.bin|grep diskmon
ora.diskmon.type
diskmon
diskmon
/valgrind_diskmon
diskmon -d -f
diskmon
DiskmonAgent::DiskmonAgent diskmon is enabled
DiskmonAgent::DiskmonAgent diskmon is disabled
DiskmonAgent:: diskmon shutting down cleanly %d
DiskmonAgent:: Unable to reboot..clean ora.diskmon...
DiskmonAgent::check: diskmon OS pid unknown %u
DiskmonAgent::check: diskmon OS pid %u
[root@dm01db01 init.d]#

上面我们看到diskmon的启动方式为“diskmon.bin -d -f”,其中:
-d 表示打开trace选项
-f 创建该进程前,remove之前创建的管道

这个很好理解,因为我们知道在Exadata上,通信是DB进程的读写是使用pipe的方式的。

[root@dm01db01 ~]# ps -ef|grep d.bin
root 2388 1 0 19:10 ? 00:00:13 /u01/app/11.2.0.3/grid/bin/ohasd.bin reboot
grid 2610 1 0 19:10 ? 00:00:08 /u01/app/11.2.0.3/grid/bin/oraagent.bin
grid 2624 1 0 19:10 ? 00:00:00 /u01/app/11.2.0.3/grid/bin/mdnsd.bin
grid 2636 1 0 19:10 ? 00:00:07 /u01/app/11.2.0.3/grid/bin/gpnpd.bin
<span>root 2648 1 0 19:10 ? 00:01:21 /u01/app/11.2.0.3/grid/bin/orarootagent.bin</span>
grid 2651 1 0 19:10 ? 00:00:12 /u01/app/11.2.0.3/grid/bin/gipcd.bin
root 2670 1 0 19:10 ? 00:00:23 /u01/app/11.2.0.3/grid/bin/osysmond.bin
root 2684 1 0 19:10 ? 00:00:02 /u01/app/11.2.0.3/grid/bin/cssdmonitor
root 2706 1 0 19:10 ? 00:00:01 /u01/app/11.2.0.3/grid/bin/cssdagent
grid 2710 1 0 19:10 ? 00:00:02 /u01/app/11.2.0.3/grid/bin/diskmon.bin -d -f
grid 2733 1 0 19:10 ? 00:00:20 /u01/app/11.2.0.3/grid/bin/ocssd.bin
root 2851 1 0 19:11 ? 00:00:03 /u01/app/11.2.0.3/grid/bin/octssd.bin reboot
grid 2877 1 0 19:11 ? 00:00:03 /u01/app/11.2.0.3/grid/bin/evmd.bin
root 3216 1 2 19:12 ? 00:07:13 /u01/app/11.2.0.3/grid/bin/ologgerd -M -d /u01/app/11.2.0.3/grid/crf/db/dm01db01
root 3279 1 0 19:13 ? 00:00:13 /u01/app/11.2.0.3/grid/bin/crsd.bin reboot
grid 3392 2877 0 19:13 ? 00:00:00 /u01/app/11.2.0.3/grid/bin/evmlogger.bin -o /u01/app/11.2.0.3/grid/evm/log/evmlogger.info -l /u01/app/11.2.0.3/grid/evm/log/evmlogger.log
grid 3445 1 0 19:13 ? 00:00:08 /u01/app/11.2.0.3/grid/bin/oraagent.bin
grid 3452 1 0 19:13 ? 00:00:01 /u01/app/11.2.0.3/grid/bin/scriptagent.bin
root 3455 1 0 19:13 ? 00:00:27 /u01/app/11.2.0.3/grid/bin/orarootagent.bin
oracle 3807 1 0 19:14 ? 00:00:09 /u01/app/11.2.0.3/grid/bin/oraagent.bin
root 22357 22301 0 23:34 pts/4 00:00:00 grep d.bin
[root@dm01db01 ~]#

查看ocssd.bin进程的调用信息:

[root@dm01db01 ~]# pstack 2733
。。。。。。。。。。。。

Thread 3 (Thread 0x4243c940 (LWP 2834)):
#0 0x0000003946c0aee9 in <a href="mailto:pthread_cond_wait@@GLIBC_2.3.2">pthread_cond_wait@@GLIBC_2.3.2</a> () from /lib64/libpthread.so.0
#1 0x00002b12cee6bc78 in sltspcwait () from /u01/app/11.2.0.3/grid/lib/libclntsh.so.11.1
<span>#2 0x00002b12cf04ccfb in kgzf_send_main () from /u01/app/11.2.0.3/grid/lib/libclntsh.so.11.1 --发送消息
</span>#3 0x0000003946c0673d in start_thread () from /lib64/libpthread.so.0
#4 0x00000039460d44bd in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x425bd940 (LWP 2835)):
#0 0x00000039460cb696 in poll () from /lib64/libc.so.6
#1 0x00002b12cf25f686 in skgznp_read_msg () from /u01/app/11.2.0.3/grid/lib/libclntsh.so.11.1
<span>#2 0x00002b12cf04c25e in kgzf_recv_main () from /u01/app/11.2.0.3/grid/lib/libclntsh.so.11.1 ---接收消息
</span>#3 0x0000003946c0673d in start_thread () from /lib64/libpthread.so.0
#4 0x00000039460d44bd in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x2b12d0b2d330 (LWP 2733)):
#0 0x0000003946c0b150 in <a href="mailto:pthread_cond_timedwait@@GLIBC_2.3.2">pthread_cond_timedwait@@GLIBC_2.3.2</a> () from /lib64/libpthread.so.0
#1 0x00002b12cee6bc36 in sltspctimewait () from /u01/app/11.2.0.3/grid/lib/libclntsh.so.11.1
#2 0x00002b12ccc56a18 in clsucvtimewait () from /u01/app/11.2.0.3/grid/lib/libhasgen11.so
#3 0x0000000000464829 in clssnmNMGetStatus ()
#4 0x0000000000494a12 in clssgmStartNMMon ()
#5 0x000000000040a504 in clssscmain ()
#6 0x0000000000407d50 in main ()
[root@dm01db01 ~]#

查看diskmon.bin的调用信息:

[root@dm01db01 ~]# pstack 2710
Thread 11 (Thread 0x40808940 (LWP 2714)):
#0 0x0000003946c0aee9 in <a href="mailto:pthread_cond_wait@@GLIBC_2.3.2">pthread_cond_wait@@GLIBC_2.3.2</a> () from /lib64/libpthread.so.0
#1 0x00002aeb5478bc78 in sltspcwait () from /u01/app/11.2.0.3/grid/lib/libclntsh.so.11.1
#2 0x00002aeb562bf640 in clsd_logThread () from /u01/app/11.2.0.3/grid/lib/libhasgen11.so
#3 0x0000003946c0673d in start_thread () from /lib64/libpthread.so.0
#4 0x00000039460d44bd in clone () from /lib64/libc.so.6
Thread 10 (Thread 0x410b8940 (LWP 2715)):
#0 0x00000039460cb696 in poll () from /lib64/libc.so.6
#1 0x00002aeb55e64768 in sosstcppoll () from /u01/app/11.2.0.3/grid/lib/libcell11.so
#2 0x00002aeb55e431eb in ossnet_poll_in_batches () from /u01/app/11.2.0.3/grid/lib/libcell11.so
#3 0x00002aeb55e4389e in ossnet_async_monitor_poll () from /u01/app/11.2.0.3/grid/lib/libcell11.so
#4 0x00002aeb55e3ca72 in oss_wait () from /u01/app/11.2.0.3/grid/lib/libcell11.so
#5 0x00002aeb55e3d118 in oss_wait_postable () from /u01/app/11.2.0.3/grid/lib/libcell11.so
<span>#6 0x00000000004243a0 in dskm_tcpmon_thrd_main () ----监控tcp端口</span>
#7 0x0000003946c0673d in start_thread () from /lib64/libpthread.so.0
#8 0x00000039460d44bd in clone () from /lib64/libc.so.6
Thread 9 (Thread 0x412b9940 (LWP 2716)):
#0 0x00000039460cb696 in poll () from /lib64/libc.so.6
#1 0x00002aeb56515194 in sgipcwWaitHelper () from /u01/app/11.2.0.3/grid/lib/libhasgen11.so
#2 0x00002aeb56512da1 in sgipcwWait () from /u01/app/11.2.0.3/grid/lib/libhasgen11.so
#3 0x00002aeb5636cb47 in gipcWaitOsd () from /u01/app/11.2.0.3/grid/lib/libhasgen11.so
#4 0x00002aeb56359599 in gipcInternalWait () from /u01/app/11.2.0.3/grid/lib/libhasgen11.so
#5 0x00002aeb56306f60 in gipcWaitF () from /u01/app/11.2.0.3/grid/lib/libhasgen11.so
<span>#6 0x00002aeb56212d4b in clsssRecvMsg () from /u01/app/11.2.0.3/grid/lib/libhasgen11.so --这三个thread的主要作用是监控其他进程的状态,并接收消息
#7 0x00002aeb561eea9c in clssgsGroupGetStatus () from /u01/app/11.2.0.3/grid/lib/libhasgen11.so
#8 0x00002aeb561e4ec6 in clssgsgrpstat () from /u01/app/11.2.0.3/grid/lib/libhasgen11.so
</span>#9 0x000000000042957f in dskm_rac_thrd_main ()
#10 0x0000003946c0673d in start_thread () from /lib64/libpthread.so.0
#11 0x00000039460d44bd in clone () from /lib64/libc.so.6
Thread 8 (Thread 0x414ba940 (LWP 2717)):
#0 0x0000003946c0b150 in <a href="mailto:pthread_cond_timedwait@@GLIBC_2.3.2">pthread_cond_timedwait@@GLIBC_2.3.2</a> () from /lib64/libpthread.so.0
#1 0x00002aeb5478bc36 in sltspctimewait () from /u01/app/11.2.0.3/grid/lib/libclntsh.so.11.1
#2 0x0000000000425996 in dskm_oss_thrd_main ()
#3 0x0000003946c0673d in start_thread () from /lib64/libpthread.so.0
#4 0x00000039460d44bd in clone () from /lib64/libc.so.6
Thread 7 (Thread 0x40be3940 (LWP 2735)):
#0 0x00000039460cb696 in poll () from /lib64/libc.so.6
<span>#1 0x00002aeb54b7f686 in skgznp_read_msg () from /u01/app/11.2.0.3/grid/lib/libclntsh.so.11.1 --读取接收到的消息
</span>#2 0x0000000000424dfe in dskm_slave_thrd_main ()
#3 0x0000003946c0673d in start_thread () from /lib64/libpthread.so.0
#4 0x00000039460d44bd in clone () from /lib64/libc.so.6
Thread 6 (Thread 0x416bb940 (LWP 2837)):
#0 0x00000039460cb696 in poll () from /lib64/libc.so.6
#1 0x00002aeb54b7f686 in skgznp_read_msg () from /u01/app/11.2.0.3/grid/lib/libclntsh.so.11.1
#2 0x0000000000424dfe in dskm_slave_thrd_main ()
#3 0x0000003946c0673d in start_thread () from /lib64/libpthread.so.0
#4 0x00000039460d44bd in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x40e3a940 (LWP 2846)):
#0 0x00000039460cb696 in poll () from /lib64/libc.so.6
#1 0x00002aeb5603d6fe in ssskgxp_poll () from /u01/app/11.2.0.3/grid/lib/libskgxp11.so
#2 0x00002aeb5603687f in sskgxp_select () from /u01/app/11.2.0.3/grid/lib/libskgxp11.so
#3 0x00002aeb55fe9392 in skgxpiwait () from /u01/app/11.2.0.3/grid/lib/libskgxp11.so
#4 0x00002aeb55fe7ed4 in skgxpwaiti () from /u01/app/11.2.0.3/grid/lib/libskgxp11.so
#5 0x00002aeb56025c14 in skgxpwait () from /u01/app/11.2.0.3/grid/lib/libskgxp11.so
#6 0x0000000000425e85 in dskm_rcv_thrd_main ()
#7 0x0000003946c0673d in start_thread () from /lib64/libpthread.so.0
#8 0x00000039460d44bd in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x418bc940 (LWP 3154)):
#0 0x00000039460cb696 in poll () from /lib64/libc.so.6
#1 0x00002aeb54b7f686 in skgznp_read_msg () from /u01/app/11.2.0.3/grid/lib/libclntsh.so.11.1
#2 0x0000000000424dfe in dskm_slave_thrd_main ()
#3 0x0000003946c0673d in start_thread () from /lib64/libpthread.so.0
#4 0x00000039460d44bd in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x41abd940 (LWP 3238)):
#0 0x00000039460cb696 in poll () from /lib64/libc.so.6
#1 0x00002aeb5603d6fe in ssskgxp_poll () from /u01/app/11.2.0.3/grid/lib/libskgxp11.so
#2 0x00002aeb5603687f in sskgxp_select () from /u01/app/11.2.0.3/grid/lib/libskgxp11.so
#3 0x00002aeb55fe9392 in skgxpiwait () from /u01/app/11.2.0.3/grid/lib/libskgxp11.so
#4 0x00002aeb55fe7ed4 in skgxpwaiti () from /u01/app/11.2.0.3/grid/lib/libskgxp11.so
#5 0x00002aeb56025c14 in skgxpwait () from /u01/app/11.2.0.3/grid/lib/libskgxp11.so
<span>#6 0x0000000000417ea3 in dskm_hb_thrd_main () --心跳检测
</span>#7 0x0000003946c0673d in start_thread () from /lib64/libpthread.so.0
#8 0x00000039460d44bd in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x41cbe940 (LWP 4004)):
#0 0x00000039460cb696 in poll () from /lib64/libc.so.6
#1 0x00002aeb54b7f686 in skgznp_read_msg () from /u01/app/11.2.0.3/grid/lib/libclntsh.so.11.1
#2 0x0000000000424dfe in dskm_slave_thrd_main ()
#3 0x0000003946c0673d in start_thread () from /lib64/libpthread.so.0
#4 0x00000039460d44bd in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x2aeb56ee1c50 (LWP 2710)):
#0 0x00000039460cb696 in poll () from /lib64/libc.so.6
#1 0x00002aeb54b80441 in skgznp_accept () from /u01/app/11.2.0.3/grid/lib/libclntsh.so.11.1
#2 0x000000000042d339 in dskm_main ()
#3 0x000000000040ae3a in main ()
[root@dm01db01 ~]#

上面我们看到libskgxp11.so动态库,这个文件(libskgxpX.so,其中X代表Oracle版本号9/10/11)最早是9i RAC引入的。
skgxp 是System Kernel Generic Interface Inter-Process Communications的缩写,是oracle开放的一个应用接口,用于传输GCS和GES 的数据。

非Infiniband环境下,Oracle自带的libskgxp文件定义的传输协议是UDP/IP。
在Infiniband网络中,Oracle自带的libskgxp文件定义的传输协议是IPoIB(IP over Infiniband),而在Exadata环境中,缺省使用的传输协议是RDS.

在diskmon进程的日志中显示了该进程的版本,进程号、跟cell的通信信息、监控细节等等:

I/O Fencing and SKGXP HA monitoring daemon -- Version 1.2.0.0
Process 2710 started on 2014-03-15 at 19:10:49.694

2014-03-15 19:10:49.702: [ DISKMON][2710] dskm main: starting up
2014-03-15 19:10:49.702: [ DISKMON][2710:1458445392] dskm_tcpmon_thrd_creat: thread created
2014-03-15 19:10:49.703: [ DISKMON][2710:1091275072] dskm_tcpmon_thrd_main: running
2014-03-15 19:10:49.703: [ DISKMON][2710:1091275072] dskm_tcpmon_thrd_main: initFlag = 81
2014-03-15 19:10:49.705: [ DISKMON][2710:1093376320] dskm_rac_thrd_main: running
2014-03-15 19:10:49.705: [ DISKMON][2710:1458445392] dskm_rac_thrd_creat2: got the post from the css event handling thread
2014-03-15 19:10:49.707: [ DISKMON][2710:1093376320] CELL communication is configured to use 1 interface(s):
2014-03-15 19:10:49.707: [ DISKMON][2710:1093376320] 192.168.56.31
2014-03-15 19:10:49.711: [ DISKMON][2710:1095477568] dskm_oss_thrd_main: running
2014-03-15 19:10:49.711: [ DISKMON][2710:1458445392] dskm_oss_thrd_creat2: got the post from the oss check status thread
2014-03-15 19:10:49.711: [ DISKMON][2710:1458445392] dskm main: startup complete
2014-03-15 19:10:49.711: [ DISKMON][2710:1458445392] listening on -&gt; /var/tmp/.oracle/master_diskmon
2014-03-15 19:10:49.736: [ DISKMON][2710:1093376320] IPC version: Oracle UDP/IP (generic)
2014-03-15 19:10:49.736: [ DISKMON][2710:1093376320] IPC Vendor 1 Protocol 2
2014-03-15 19:10:49.736: [ DISKMON][2710:1093376320] Version 4.1
2014-03-15 19:10:49.739: [ DISKMON][2710:1093376320] dskm_pre_oss_ini6: oss context reconnect initialized, parameter values:
_dskm_disable_reconnect_to_cell = 0
_dskm_reconnect_to_oss_attempts = 7
_dskm_reconnect_to_oss_freq_in_sec = 2
_dskm_reconnect_to_oss_counter_reset_freq_in_sec = 60
2014-03-15 19:10:49.739: [ DISKMON][2710:1093376320] dskm_clss_ini1: calling clssscbinit
2014-03-15 19:10:49.739: [ DISKMON][2710:1093376320] dskm_clss_ini2: calling clsssinit
2014-03-15 19:10:50.253: [ DISKMON][2710:1093376320] dskm_clss_ini2: calling clsssinit
2014-03-15 19:10:50.633: [ DISKMON][2710:1458445392] dskm_slave_thrd_creat: thread created
2014-03-15 19:10:50.633: [ DISKMON][2710:1086208320] dskm_slave_thrd_main1: slave 1 running
2014-03-15 19:10:50.633: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_IDENTIFY (0x0001)
2014-03-15 19:10:50.633: [ DISKMON][2710:1086208320] dskm_proc_identify8: client orarootagent/2648, diskmon kgzm 2.1, slave 1, reid cid=DUMMY,icin=-1,nmn=-1,lnid=-1,gid=-1,gin=-1,gmn=-1,umemid=-1,opid=-1,opsn=-1,lvl=process hdr=0xfece0100
2014-03-15 19:10:50.633: [ DISKMON][2710:1086208320] dskm_send_version1:
2014-03-15 19:10:50.637: [ DISKMON][2710:1086208320] dskm_send_version4: done
2014-03-15 19:10:50.637: [ DISKMON][2710:1086208320] dskm_process_msg7: processed msg 0 type KGZM_IDENTIFY (0x0001), retcode 0
2014-03-15 19:10:50.638: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-15 19:10:50.764: [ DISKMON][2710:1093376320] dskm_clss_ini2: calling clsssinit
2014-03-15 19:10:51.276: [ DISKMON][2710:1093376320] dskm_clss_ini2: calling clsssinit
2014-03-15 19:10:51.291: [ DISKMON][2710:1093376320] dskm_clss_ini5: successful clsssinit(), clssvers 2.1
2014-03-15 19:10:51.291: [ DISKMON][2710:1093376320] dskm_clss_ini6: calling clssnsqlnum
2014-03-15 19:10:53.643: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-15 19:10:56.645: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-15 19:10:59.650: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-15 19:11:02.654: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-15 19:11:05.657: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-15 19:11:08.662: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-15 19:11:11.669: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-15 19:11:14.671: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-15 19:11:17.667: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-15 19:11:20.377: [ DISKMON][2710:1093376320] dskm_clss_ini8: calling clsssattrib
2014-03-15 19:11:20.377: [ DISKMON][2710:1093376320] dskm_clss_ini11: calling clssnsqname
2014-03-15 19:11:20.384: [ DISKMON][2710:1093376320] dskm_clss_ini13: calling clsssattrib
2014-03-15 19:11:20.384: [ DISKMON][2710:1093376320] dskm_clss_ini15: calling clssgsregnodegrp

。。。。。。。。

<span>2014-03-16 00:48:48.442: [ DISKMON][2710:1103882560] dskm_issue_ioctl_helper4: oss_ioctl 13 to device o/192.168.56.11 (oss_fd 1, handle 0x1699eb28)
2014-03-16 00:48:48.446: [ DISKMON][2710:1103882560] dskm_bcast_oss11: oss_ioctl request 0x1699eb28 completed
2014-03-16 00:48:48.446: [ DISKMON][2710:1103882560] dskm_process_msg7: processed msg 342 type KGZM_BCAST_OSS_IOCTL (0x000a), retcode 0
</span>2014-03-16 00:48:48.711: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:48:51.719: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:48:54.713: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:48:57.721: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:00.735: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:03.742: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:06.815: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:09.754: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:12.760: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:15.763: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:18.765: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:21.770: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:24.784: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:27.794: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:30.800: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:33.800: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:36.803: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:39.977: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:42.815: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:46.362: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
<span>2014-03-16 00:49:48.458: [ DISKMON][2710:1103882560] dskm_process_msg5: received msg type KGZM_BCAST_OSS_IOCTL (0x000a)
2014-03-16 00:49:48.459: [ DISKMON][2710:1103882560] dskm_issue_ioctl_helper4: oss_ioctl 13 to device o/192.168.56.11 (oss_fd 1, handle 0x1699eb28)
2014-03-16 00:49:48.464: [ DISKMON][2710:1103882560] dskm_bcast_oss11: oss_ioctl request 0x1699eb28 completed
2014-03-16 00:49:48.464: [ DISKMON][2710:1103882560] dskm_process_msg7: processed msg 343 type KGZM_BCAST_OSS_IOCTL (0x000a), retcode 0
</span>2014-03-16 00:49:49.267: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:51.833: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 00:49:54.844: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)

。。。。。。。。。。

2014-03-16 01:43:42.362: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 01:43:45.367: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 01:43:48.371: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 01:43:51.377: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
<span>2014-03-16 01:43:51.766: [ DISKMON][2710:1103882560] dskm_process_msg5: received msg type KGZM_BCAST_OSS_IOCTL (0x000a)
2014-03-16 01:43:51.766: [ DISKMON][2710:1103882560] dskm_issue_ioctl_helper4: oss_ioctl 13 to device o/192.168.56.11 (oss_fd 1, handle 0x1699eb28)
2014-03-16 01:43:51.886: [ DISKMON][2710:1103882560] dskm_bcast_oss11: oss_ioctl request 0x1699eb28 completed
2014-03-16 01:43:51.886: [ DISKMON][2710:1103882560] dskm_process_msg7: processed msg 395 type KGZM_BCAST_OSS_IOCTL (0x000a), retcode 0
</span>2014-03-16 01:43:54.378: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 01:43:57.384: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 01:44:00.389: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 01:44:03.388: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 01:44:06.393: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 01:44:09.403: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 01:44:12.411: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)
2014-03-16 01:44:15.418: [ DISKMON][2710:1086208320] dskm_process_msg5: received msg type KGZM_PING (0x0011)

从diskmon的日志信息我们可以发现,每隔1分钟,ocss进程会发送一个KGZM_BCAST_OSS_IOCTL指令给diskmon,然后diskmon会去跟cell通信(我这里是VM环境,只有一个cell,IP地址是:192.168.56.11)

显然,这些都说明了,diskmon进程确实只在Exadata环境才有作用。

在11.1.0.7的Cluster Resource中,可以看到diskmon进程是缺省就启动的(前面说过,这个版本首次引入了diskmon进程,不过该功能是为了Exadata环境设计的):

$GRID_HOME/bin/crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE rac1 Started
ora.crsd
1 ONLINE ONLINE rac1
ora.cssd
1 ONLINE ONLINE rac1
ora.cssdmonitor
1 ONLINE ONLINE rac1
ora.ctssd
1 ONLINE ONLINE rac1 OBSERVER
<span>ora.diskmon
1 ONLINE ONLINE rac1
</span>ora.drivers.acfs
1 ONLINE ONLINE rac1
ora.evmd
1 ONLINE ONLINE rac1
ora.gipcd
1 ONLINE ONLINE rac1
ora.gpnpd
1 ONLINE ONLINE rac1
ora.mdnsd
1 ONLINE ONLINE rac1

从11.2.0.2以后,Cluster Resource又多出了两个进程,即:(关于haip和crf不做这里的重点讨论):

ora.cluster_interconnect.haip
1 ONLINE ONLINE rac1
ora.crf
1 ONLINE ONLINE rac1

从11.2.0.3以后,在非Exadata上, ora.diskmon的缺省状态为offline:
ora.diskmon
1 OFFLINE OFFLINE rac1

这里面主要是两点原因:
1,非Exadata环境中,diskmon是没有什么用处的
2,Diskmon在非Exadata环境下,不但没走用,还容易触发一些bug,因此索性缺省为offline

从12c以后, Cluster Resource家族又增添了新成员:ora.storage,这个资源是为了Flex ASM而设计的:

ora.storage
1 ONLINE ONLINE racnode1 STABLE


Diskmon进程的日志位置:
GRID_HOME/log/host/diskmon – Disk Monitor Daemon
发表在 FAQ | 标签为 , | 留下评论

Exadata上的Writeback和Writethrouth-4-X3和X4上的Flash Card

从X3开始采用 Flash Accelerator F40 400GB Solid State Memory ,每个cell上有4块F40卡,每个卡上有4个100GB的FDOM(每个F40的容量为4*100GB=400GB),这样,每个cell的总Flash容量为1.6T。可见从容量上已经突飞猛进了。

这里,我们再回来思考上一篇《Exadata上的Writeback和Writethrouth-4-V2和X2上的Flash Card》中提到的一个问题:是否可以将Exadata上的Flash Card作为ASM disk使用?

Exadata上的Writeback和Writethrouth-4-X3和X4上的Flash Card

发表在 FAQ, 体系架构 | 标签为 , , , | 留下评论

Exadata上的Writeback和Writethrouth-3-V2和X2上的Flash Card

从V2和X2的机器上硬件带有4块写缓存功能Sun Flash Accelerator F20 PCIe Card卡,每个F20上有4个FMOD模块,每个FMOD模块是24G(每个F20为24*4=96GB容量)。因此V2和X2的每台存储服务器上可以支持4*96GB=384GB的缓存,如图:


无标题


Exadata上的Writeback和Writethrouth-3-V2和X2上的Flash Card

发表在 FAQ, 体系架构 | 标签为 , , , | 留下评论