联系:QQ(5163721)
标题:Exadata上cell如何判断何时进行智能扫描?————1.在Exadata环境的测试
作者:Lunar©版权所有[文章允许转载,但必须以链接方式注明源地址,否则追究法律责任.]
我们知道,SmartScan的先决条件是下面3个:
1,必须是对象上的全表扫描或者全索引快速扫描(TABLE ACCESS FULL或INDEX FAST FULL SCAN)
2,必须使用直接路径读取(Direct Path Read)。SmartScan的数据流是无法缓存在SGA缓冲池中的。直接路径读取可以串行,也可以并行,缓冲在进程的PGA(heap)中。
3,对象必须存储在Exadata的Cell节点上
.
在Exadata环境的数据库节点上:
1,ASM实例使用libcell11.a这个动态库跟cell节点通信,
2,数据库实例使用kcfis( Kernel File Intelligent Storage )执行SmartScan。
3,Diskmon进行心跳监控,I/O fencing和IORM 。
.
而cell节点上,主要是cellsrv(多线程的服务)完成相关的IO请求的操作(MS主要是监控,RS是重启其他进程)
.
并且,对于Linux环境,Exadata和非Exadata上面安装的Oracle介质并无却别,那么有两个问题:
1,这两个环境是上执行计划为什么会不同(一个是全表扫描,一个是智能扫描)?
2,cell在收到数据库节点的请求后,怎么判断是否执行只能扫描呢?
.
首先,在Exadata上大体有下面3种读取数据块的方式:
1,普通的block读:从磁盘读取数据块,读入到SGA
2,直接路径读:从磁盘读取数据块,读入PGA
3,智能扫描:由cell直接从底盘读取,读入PGA
.
而在非Exadata环境,通常只有前两种。
我们分别用gdb跟踪一下。
首先在我的Exadata VM上测试:
创建测试表:
------------ session 1 ---------------- LUNAR@lunar>create table lunar as select * from dba_objects where rownum<=100; Table created. Elapsed: 00:00:00.35 LUNAR@lunar>select pid, spid from v$process p, v$session s where p.addr = s.paddr and s.sid=(select sid from v$mystat where rownum=1); PID SPID ---------- ------------------------------------------------ 24 12868 Elapsed: 00:00:00.05 LUNAR@lunar>
使用gdb跟踪这个会话:
———— 回到 session 2 —————-
(gdb) c
Continuing.
继续跟踪
———— 回到 session 1 —————-
执行: LUNAR@bbff1>select count(1) from lunar;
hang住,等待gdb继续跟踪……
———— 回到 session 2 继续跟踪 —————-
(gdb) c
Continuing. Breakpoint 6, 0x00000000021c9cd6 in kcfis_tablespace_is_on_sage () (gdb) c Continuing. Breakpoint 170, 0x000000000681b1b6 in kcfis_tbs_storage_attr () (gdb) c Continuing. Breakpoint 14, 0x00000000021cb0ae in kcfis_get_ts_attrs () (gdb) c Continuing. Breakpoint 2, 0x00000000021c9bee in kcfis_capability_tab_get () (gdb) c Continuing. Breakpoint 6, 0x00000000021c9cd6 in kcfis_tablespace_is_on_sage () (gdb) c Continuing. Breakpoint 170, 0x000000000681b1b6 in kcfis_tbs_storage_attr () (gdb) c Continuing. Breakpoint 14, 0x00000000021cb0ae in kcfis_get_ts_attrs () (gdb) c Continuing. Breakpoint 2, 0x00000000021c9bee in kcfis_capability_tab_get () (gdb) c Continuing. Breakpoint 2, 0x00000000021c9bee in kcfis_capability_tab_get () (gdb) c Continuing. Breakpoint 6, 0x00000000021c9cd6 in kcfis_tablespace_is_on_sage () (gdb) c Continuing. Breakpoint 170, 0x000000000681b1b6 in kcfis_tbs_storage_attr () (gdb) c Continuing. Breakpoint 14, 0x00000000021cb0ae in kcfis_get_ts_attrs () (gdb) c Continuing. Breakpoint 2, 0x00000000021c9bee in kcfis_capability_tab_get () (gdb) c Continuing. Breakpoint 6, 0x00000000021c9cd6 in kcfis_tablespace_is_on_sage () (gdb) c Continuing. Breakpoint 170, 0x000000000681b1b6 in kcfis_tbs_storage_attr () (gdb) c Continuing. Breakpoint 14, 0x00000000021cb0ae in kcfis_get_ts_attrs () (gdb) c Continuing. Breakpoint 2, 0x00000000021c9bee in kcfis_capability_tab_get () (gdb) c Continuing. Breakpoint 6, 0x00000000021c9cd6 in kcfis_tablespace_is_on_sage () (gdb) c Continuing. Breakpoint 170, 0x000000000681b1b6 in kcfis_tbs_storage_attr () (gdb) c Continuing. Breakpoint 14, 0x00000000021cb0ae in kcfis_get_ts_attrs () (gdb) c Continuing. Breakpoint 2, 0x00000000021c9bee in kcfis_capability_tab_get () (gdb) c Continuing. Breakpoint 2, 0x00000000021c9bee in kcfis_capability_tab_get () (gdb) c Continuing. Breakpoint 6, 0x00000000021c9cd6 in kcfis_tablespace_is_on_sage () (gdb) c Continuing. Breakpoint 170, 0x000000000681b1b6 in kcfis_tbs_storage_attr () (gdb) c Continuing. Breakpoint 14, 0x00000000021cb0ae in kcfis_get_ts_attrs () (gdb) c Continuing. Breakpoint 2, 0x00000000021c9bee in kcfis_capability_tab_get () (gdb) c Continuing. c Breakpoint 6, 0x00000000021c9cd6 in kcfis_tablespace_is_on_sage () (gdb) c Continuing. Breakpoint 170, 0x000000000681b1b6 in kcfis_tbs_storage_attr () (gdb) c Continuing. Breakpoint 14, 0x00000000021cb0ae in kcfis_get_ts_attrs () (gdb) c Continuing. Breakpoint 6, 0x00000000021c9cd6 in kcfis_tablespace_is_on_sage () (gdb) c Continuing. Breakpoint 170, 0x000000000681b1b6 in kcfis_tbs_storage_attr () (gdb) c Continuing. Breakpoint 14, 0x00000000021cb0ae in kcfis_get_ts_attrs () (gdb) c Continuing.
———— 回到 session 1 —————-
这时,我们看到session已经返回了结果:
LUNAR@bbff1>select count(1) from lunar; COUNT(1) ---------- 100 Elapsed: 00:01:58.50 LUNAR@bbff1>
我们发现,对于这个SQL的执行,基本上是反复重复下面的动作,这里例子中是重复做了8次:(后面注释为个人猜测)
Breakpoint 2, 0x00000000021c9bee in kcfis_capability_tab_get () ---猜测是读表的大小或者各种存储属性等等(storage参数的相关内容,是否压缩,是否加密等等) Breakpoint 6, 0x00000000021c9cd6 in kcfis_tablespace_is_on_sage () ---判断表空间是否在cell上,也就是SAGE(Storage Appliance for Grid Environments)是Exadata最早的雏形,也是Oracle内部开发时该项目的名称 Breakpoint 170, 0x000000000681b1b6 in kcfis_tbs_storage_attr () ---判断表空间的属性 1 Breakpoint 14, 0x00000000021cb0ae in kcfis_get_ts_attrs () ---判断表空间的属性 2 猜测这两判断的来源应该不同
在使用event 10046跟踪一下:
LUNAR@bbff1>select pid, spid from v$process p, v$session s where p.addr = s.paddr and s.sid=(select sid from v$mystat where rownum=1); PID SPID ---------- ------------------------ 32 20361 Elapsed: 00:00:06.14 LUNAR@bbff1>
从跟踪文件上看,执行计划是“TABLE ACCESS STORAGE FULL LUNAR”,确实是Smart Scan:
PARSING IN CURSOR #47327115779400 len=26 dep=0 uid=37 oct=3 lid=37 tim=1394896509806091 hv=3764746106 ad='7e7afa50' sqlid='dprpmj3h6axvu' select count(1) from lunar END OF STMT PARSE #47327115779400:c=32002,e=4391771,p=0,cr=5,cu=0,mis=1,r=0,dep=0,og=1,plh=291688323,tim=1394896509806089 EXEC #47327115779400:c=0,e=628,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=291688323,tim=1394896509806815 WAIT #47327115779400: nam='SQL*Net message to client' ela= 8 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1394896509806894 FETCH #47327115779400:c=0,e=74,p=0,cr=4,cu=0,mis=0,r=1,dep=0,og=1,plh=291688323,tim=1394896509806997 STAT #47327115779400 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=4 pr=0 pw=0 time=91 us)' STAT #47327115779400 id=2 cnt=100 pid=1 pos=1 obj=17867 op='TABLE ACCESS STORAGE FULL LUNAR (cr=4 pr=0 pw=0 time=138 us cost=3 size=0 card=100)' WAIT #47327115779400: nam='SQL*Net message from client' ela= 323 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1394896509807410 FETCH #47327115779400:c=0,e=4,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,plh=291688323,tim=1394896509807575 WAIT #47327115779400: nam='SQL*Net message to client' ela= 6 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1394896509807623
格式化一下:
SQL ID: dprpmj3h6axvu Plan Hash: 291688323 select count(1) from lunar call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.03 4.39 0 5 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 2 0.00 0.00 0 4 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 0.03 4.39 0 9 0 1 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 37 Number of plan statistics captured: 1 Rows (1st) Rows (avg) Rows (max) Row Source Operation ---------- ---------- ---------- --------------------------------------------------- 1 1 1 SORT AGGREGATE (cr=4 pr=0 pw=0 time=91 us) 100 100 100 TABLE ACCESS STORAGE FULL LUNAR (cr=4 pr=0 pw=0 time=138 us cost=3 size=0 card=100) Elapsed times include waiting on following events: Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ SQL*Net message from client 2 126.89 126.89 Disk file operations I/O 1 0.00 0.00 SQL*Net message to client 2 0.00 0.00
顺便解释一下,之所以之前重复了8次执行kcfis那些函数,还包括了动态采样,O(∩_∩)O哈哈~:
******************************************************************************** SQL ID: 08vx76v2xp6jp Plan Hash: 291688323 SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),:"SYS_B_0"), NVL(SUM(C2),:"SYS_B_1") FROM (SELECT /*+ NO_PARALLEL("LUNAR") FULL("LUNAR") NO_PARALLEL_INDEX("LUNAR") */ :"SYS_B_2" AS C1, :"SYS_B_3" AS C2 FROM "LUNAR"."LUNAR" "LUNAR") SAMPLESUB call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.30 0 0 0 0 Execute 1 0.01 0.95 0 0 0 0 Fetch 1 0.00 0.21 0 4 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 3 0.01 1.46 0 4 0 1 Misses in library cache during parse: 1 Misses in library cache during execute: 1 Optimizer mode: ALL_ROWS Parsing user id: 37 (recursive depth: 1) Number of plan statistics captured: 1 Rows (1st) Rows (avg) Rows (max) Row Source Operation ---------- ---------- ---------- --------------------------------------------------- 1 1 1 SORT AGGREGATE (cr=4 pr=0 pw=0 time=211740 us) 100 100 100 TABLE ACCESS STORAGE FULL LUNAR (cr=4 pr=0 pw=0 time=9494 us cost=3 size=0 card=409) ********************************************************************************
未完,待续……