The User Guide lists every metric; here I cover only the ones people usually care about:
CellCLI> list metriccurrent FC_BY_USED, FC_BY_DIRTY
FC_BY_USED FLASHCACHE 307,012 MB
FC_BY_DIRTY FLASHCACHE 250,320 MB
FC_BY_USED – number of MB cached (total)
FC_BY_DIRTY – number of dirty MB cached (data written only to FlashCache but not to disks)
CellCLI> list metriccurrent GD_BY_FC_DIRTY
GD_BY_FC_DIRTY DATA_CD_00_cel14 7,214 MB
GD_BY_FC_DIRTY DATA_CD_01_cel14 6,698 MB
. . .
CellCLI> list metriccurrent CD_BY_FC_DIRTY where metricObjectName=FD_05_cel14
CD_BY_FC_DIRTY FD_05_cel14 19,430 MB
GD_BY_FC_DIRTY – number of dirty MB cached for the griddisk
CD_BY_FC_DIRTY – number of dirty MB cached on the flash celldisk
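As a rough illustration of how these metrics can be used, the sketch below parses `list metriccurrent` output and computes what share of the cached data is dirty. The helper name `parse_metriccurrent` is my own, and the parser assumes the four-column layout shown above (metric, object, value, unit); it is not part of CellCLI.

```python
import re

def parse_metriccurrent(text):
    """Parse lines shaped like: METRIC OBJECT VALUE UNIT (layout assumed from the listing above)."""
    rows = []
    for line in text.splitlines():
        m = re.match(r"\s*(\S+)\s+(\S+)\s+(\d[\d,]*(?:\.\d+)?)\s+(\S+)\s*$", line)
        if m:
            name, obj, value, unit = m.groups()
            rows.append((name, obj, float(value.replace(",", "")), unit))
    return rows

# Sample values copied from the output above.
sample = """\
FC_BY_USED FLASHCACHE 307,012 MB
FC_BY_DIRTY FLASHCACHE 250,320 MB
"""

metrics = {name: value for name, _obj, value, _unit in parse_metriccurrent(sample)}
dirty_ratio = metrics["FC_BY_DIRTY"] / metrics["FC_BY_USED"]
print(f"dirty share of cached data: {dirty_ratio:.1%}")
```

A high dirty share simply means that much of the cached data has not yet been written back to the disks.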
To measure the flash cache load generated by an application, you can usually follow this procedure:
1. Reset the flash cache statistics:
CellCLI> alter cell events="immediate cellsrv.cellsrv_flashcache(resetStats,0,0,0)"
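2. Run the application test to generate load.
3. Dump the statistics:
Look at the trace files whose names start with svtrc under the cellsrv ADR…/trace/ directory. In my environment, for example: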
export CELL_ADR=/opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/log/diag/asm/cell
CellCLI> alter cell events="immediate cellsrv.cellsrv_flashcache(dumpStats,0,0,L)"
L is any non-negative number, used as a label for the result.
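If you repeat this reset/load/dump cycle often, generating the two command strings from a script helps avoid typos. The following is only a sketch: `flashcache_event` is a hypothetical helper of mine, it only builds the text of the commands shown above, and it does not run anything against the cell.

```python
def flashcache_event(action, a=0, b=0, c=0):
    """Build the CellCLI command text for a cellsrv_flashcache event (string only, nothing is executed)."""
    return f'alter cell events="immediate cellsrv.cellsrv_flashcache({action},{a},{b},{c})"'

def reset_stats_command():
    # Step 1 of the procedure: reset the flash cache statistics.
    return flashcache_event("resetStats")

def dump_stats_command(label):
    # Step 3 of the procedure: dump the statistics, tagged with a non-negative label.
    if not isinstance(label, int) or label < 0:
        raise ValueError("label must be a non-negative integer")
    return flashcache_event("dumpStats", 0, 0, label)

print(reset_stats_command())
print(dump_stats_command(7))
```

Using a different label for each test run makes it easier to tell the dumps apart in the svtrc trace files.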
Dynamically adjust how often data is re-read from the griddisk:
CellCLI> alter cell events = "immediate cellsrv.cellsrv_setparam('_cell_flashcache_diag_reads_frequency','20')"
The following shows how to inspect the structure of the flash cache control block.
First, find messages like the following in the cellsrv alert log:
...
Sun Jun 30 10:24:28 2013
Caching enabled on FlashCache Part dm01cel01_FLASHCACHE guid=67d1e918-7455-4afa-9b6f-7ecb2da11b82 (504684860), size=816MB, cdisk=FD_02_dm01cel01
Caching enabled on FlashCache Part dm01cel01_FLASHCACHE guid=d6400f21-abcc-4ea3-a09c-79425619ef36 (1922598212), size=816MB, cdisk=FD_01_dm01cel01
Caching enabled on FlashCache Part dm01cel01_FLASHCACHE guid=decfc31a-e1ba-4d63-9161-50f6db206572 (1710125700), size=816MB, cdisk=FD_03_dm01cel01
Caching enabled on FlashCache Part dm01cel01_FLASHCACHE guid=4dec16e9-1559-468d-9cfe-39443ac18ddc (1241609148), size=816MB, cdisk=FD_00_dm01cel01
FlashCache: allowing client IOs (mode=Writeback)
Sun Jun 30 10:24:52 2013
Smart Flash Logging enabled on FlashLog dm01cel01_FLASHLOG (2124647276), size=128MB, cdisk=FD_00_dm01cel01
Sun Jun 30 10:24:52 2013
Smart Flash Logging enabled on FlashLog dm01cel01_FLASHLOG (885498860), size=128MB, cdisk=FD_03_dm01cel01
Sun Jun 30 10:24:53 2013
Smart Flash Logging enabled on FlashLog dm01cel01_FLASHLOG (1996959724), size=128MB, cdisk=FD_02_dm01cel01
...
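The number in parentheses is the flash ID that the dump commands take as an argument, so it can be handy to pull these IDs out of the alert log automatically. Here is a minimal sketch; the regular expression is derived only from the message format seen in my alert log and may not match other versions.

```python
import re

# Pattern assumed from the alert log messages above.
FC_PART = re.compile(
    r"Caching enabled on FlashCache Part (\S+) guid=([0-9a-f-]+) "
    r"\((\d+)\), size=(\d+)MB, cdisk=(\w+)"
)

def flashcache_parts(alert_text):
    """Return {flash_id: {...}} for every 'Caching enabled' message found."""
    parts = {}
    for name, guid, flash_id, size_mb, cdisk in FC_PART.findall(alert_text):
        parts[int(flash_id)] = {
            "name": name, "guid": guid, "size_mb": int(size_mb), "cdisk": cdisk,
        }
    return parts

# Two lines copied from the alert log excerpt above.
sample = (
    "Caching enabled on FlashCache Part dm01cel01_FLASHCACHE "
    "guid=67d1e918-7455-4afa-9b6f-7ecb2da11b82 (504684860), size=816MB, cdisk=FD_02_dm01cel01\n"
    "Caching enabled on FlashCache Part dm01cel01_FLASHCACHE "
    "guid=d6400f21-abcc-4ea3-a09c-79425619ef36 (1922598212), size=816MB, cdisk=FD_01_dm01cel01\n"
)

parts = flashcache_parts(sample)
for flash_id, info in parts.items():
    print(flash_id, info["cdisk"], info["size_mb"], "MB")
```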
Then dump the control block structure of this flash cache part:
CellCLI> list cell detail
name: dm01cel01
bbuTempThreshold: 60
bbuChargeThreshold: 800
bmcType: absent
cellVersion: OSS_11.2.3.2.1_LINUX.X64_130109
cpuCount: 2
diagHistoryDays: 7
fanCount: 1/1
fanStatus: normal
flashCacheMode: WriteBack
id: ef56a78d-a9cc-4c04-baac-648a18370eb7
interconnectCount: 1
interconnect1: eth0
iormBoost: 0.0
ipaddress1: 192.168.56.11/24
kernelVersion: 2.6.18-274.el5xen
makeModel: Fake hardware
metricHistoryDays: 7
offloadEfficiency: 1,000.0
powerCount: 1/1
powerStatus: normal
releaseVersion: 11.2.3.2.1
releaseTrackingBug: 14522699
status: online
temperatureReading: 0.0
temperatureStatus: normal
upTime: 0 days, 0:09
cellsrvStatus: running
msStatus: running
rsStatus: running
CellCLI> alter cell events="immediate cellsrv.cellsrv_flashcache(dumpctrlblock, 504684860,0,L)"
Dump sequence #1 has been written to /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/log/diag/asm/cell/dm01cel01/trace/svtrc_2312_58.trc
Cell dm01cel01 successfully altered
CellCLI>
The trace contents are as follows:
Trace file /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/log/diag/asm/cell/dm01cel01/trace/svtrc_2312_58.trc
ORACLE_HOME = /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109
System name: Linux
Node name: dm01cel01
Release: 2.6.18-274.el5xen
Version: #1 SMP Mon Jul 25 14:24:57 EDT 2011
Machine: x86_64
CELL SW Version: OSS_11.2.3.2.1_LINUX.X64_130109
*** 2013-06-30 10:24:25.763
UserThread: LWPID: 2648 userId: 58 kernelId: 58 pthreadID: 0x6641a940
FCC: Control block memory dump for Flash ID 504684860:
2013-06-30 10:33:35.042335*: Dump sequence #1:
2AAB6BD00000 A624DCC2 00000000 54434346 4B4C424C [..$.....FCCTLBLK]
2AAB6BD00010 64373600 31396531 34372D38 342D3535 [.67d1e918-7455-4]
2AAB6BD00020 2D616661 66366239 6365372D 61643262 [afa-9b6f-7ecb2da]
2AAB6BD00030 38623131 03050032 00000002 00000000 [11b82...........]
2AAB6BD00040 00280000 00000000 33000000 00000000 [..(........3....]
2AAB6BD00050 5B0B8A32 0000013F 1E14E13C 00000002 [2..[?...<.......]
2AAB6BD00060 00000010 00000000 54434346 4B4C424C [........FCCTLBLK]
2AAB6BD00070 00000000 00000000 [........]
FCC: Control block formatted dump for Flash ID 504684860:
fid=504684860, head/tail FCCTLBLK/FCCTLBLK, mdSize=2621440, chksum=2787433666, flashSz=855638016, flags=5, cacheline size shift 16, creation_timestamp=1371622050354, version=2, guid=67d1e918-7455-4afa-9b6f-7ecb2da11b82, pers_mode=Writeback, verif_level=crc.
FCC: Control block for flashID=504684860 is located on cdisk='FD_02_dm01cel01' at offset 184549376
FCC: Primary FC metadata for flashID=504684860 is located on cdisk='FD_02_dm01cel01', offset/size: 184614912 1261568
FCC: Shadow FC metadata for flashID=504684860 is located on cdisk='FD_02_dm01cel01', offset/size: 185876480 1261568
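The raw hex dump and the formatted dump describe the same control block, and a few fields can be cross-checked by hand. The sketch below does that arithmetic with values copied from this trace. Treating each dumped word as a little-endian 32-bit value, and treating `creation_timestamp` as epoch milliseconds, are my assumptions based on this one dump, not a documented layout.

```python
from datetime import datetime, timezone

# Values copied from the trace above (rows at offsets 0x00 and 0x50 of the dump).
row_00 = ["A624DCC2", "00000000", "54434346", "4B4C424C"]
row_50 = ["5B0B8A32", "0000013F", "1E14E13C", "00000002"]
formatted = {"chksum": 2787433666, "flashSz": 855638016,
             "cacheline_shift": 16, "creation_timestamp": 1371622050354}
ctrl_offset = 184549376
primary_offset, primary_size = 184614912, 1261568
shadow_offset = 185876480

def word_to_ascii(word):
    """Assumption: words are little-endian, so reverse the bytes to get the text."""
    return bytes.fromhex(word)[::-1].decode("ascii")

magic = word_to_ascii(row_00[2]) + word_to_ascii(row_00[3])
chksum = int(row_00[0], 16)
low, high = int(row_50[0], 16), int(row_50[1], 16)
creation_ms = (high << 32) | low          # 64-bit value stored as low word, high word
created = datetime.fromtimestamp(creation_ms / 1000, tz=timezone.utc)
cacheline_bytes = 1 << formatted["cacheline_shift"]
flash_mb = formatted["flashSz"] / 2**20

print("magic:", magic)
print("checksum matches:", chksum == formatted["chksum"])
print("timestamp matches:", creation_ms == formatted["creation_timestamp"])
print("created (if epoch milliseconds):", created.isoformat())
print("flash size in MB:", flash_mb)
print("primary metadata starts one cache line after the control block:",
      ctrl_offset + cacheline_bytes == primary_offset)
print("shadow metadata starts right after the primary copy:",
      primary_offset + primary_size == shadow_offset)
```

In this dump the flash size works out to the same 816 MB that the alert log reports for the part, which is a useful sanity check that you dumped the flash ID you intended.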
Check whether a given griddisk is cached, and dump its cache-line header information
First, look at the alert log:
...
Sun Jun 30 10:24:26 2013
CellDisk v0.7 name=CD_disk01_dm01cel01 status=NORMAL confine status=NONE confine reason=CD_GOOD guid=55031178-14e1-49f8-b080-ce4293b5c683 found on dev=/opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/disk01
GridDisk name=data_CD_disk01_dm01cel01 guid=6065c05e-8eae-461c-9b43-02b5c46fd6bb (542754140), cached by these FlashCache parts: 504684860
GridDisk name=reco_CD_disk01_dm01cel01 guid=6d40fc07-34e3-4e71-9fd8-a51a98e68769 (709161532), cached by these FlashCache parts: 504684860
Initialization of celldisk CD_disk01_dm01cel01 on /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/disk01 completed.
GridDisk name=data_CD_disk12_dm01cel01 guid=0fb93c36-a5b5-431a-aa7b-025b52f7cbe4 (2820417692), cached by these FlashCache parts: 1241609148
GridDisk name=reco_CD_disk12_dm01cel01 guid=bba5ae9b-dc0f-4515-94f1-2104e8d0bc44 (1611296652), cached by these FlashCache parts: 1241609148
...
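These GridDisk messages tell you both the griddisk ID and which flash cache part caches it. A small sketch for extracting that mapping follows; the pattern is taken from the messages in my alert log, and the handling of several flash IDs on one line is an assumption, since every line here lists only one.

```python
import re

# Pattern assumed from the GridDisk messages in the alert log above.
GRIDDISK = re.compile(
    r"GridDisk name=(\S+) guid=([0-9a-f-]+) \((\d+)\), "
    r"cached by these FlashCache parts: ([\d, ]*\d)"
)

def griddisk_cache_map(alert_text):
    """Return {griddisk_name: (griddisk_id, [flash_id, ...])}."""
    mapping = {}
    for name, _guid, gd_id, flash_ids in GRIDDISK.findall(alert_text):
        ids = [int(x) for x in re.split(r"[,\s]+", flash_ids) if x]
        mapping[name] = (int(gd_id), ids)
    return mapping

# Lines copied from the alert log excerpt above.
sample = (
    "GridDisk name=data_CD_disk01_dm01cel01 guid=6065c05e-8eae-461c-9b43-02b5c46fd6bb "
    "(542754140), cached by these FlashCache parts: 504684860\n"
    "GridDisk name=data_CD_disk12_dm01cel01 guid=0fb93c36-a5b5-431a-aa7b-025b52f7cbe4 "
    "(2820417692), cached by these FlashCache parts: 1241609148\n"
)

cache_map = griddisk_cache_map(sample)
for name, (gd_id, flash_ids) in cache_map.items():
    print(name, gd_id, flash_ids)
```

The griddisk ID found this way (542754140 for data_CD_disk01_dm01cel01) is the value passed to the dump command later in this post.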
Use the griddisk ID and the griddisk offset to dump the related cache metadata from the flash cache:
SQL> conn lunar/lunar
Connected.
SQL> create table lunartest as select * from dba_objects;
Table created.
SQL> alter table lunartest STORAGE (CELL_FLASH_CACHE keep);
Table altered.
SQL> select object_id from user_objects;
OBJECT_ID
----------
17852
SQL> select count(*) from lunartest;
COUNT(*)
----------
17580
SQL>
CellCLI> LIST FLASHCACHECONTENT WHERE objectNumber=17852 DETAIL
cachedKeepSize: 0
cachedSize: 966656
dbID: 3118431096
dbUniqueName: BBFF
hitCount: 0
missCount: 0
objectNumber: 17852
tableSpaceNumber: 4
CellCLI>
SQL> select count(*) from lunartest;
COUNT(*)
----------
17580
SQL>
CellCLI> LIST FLASHCACHECONTENT WHERE objectNumber=17852 DETAIL
cachedKeepSize: 1925120
cachedSize: 1925120
dbID: 3118431096
dbUniqueName: BBFF
hitCount: 25
hoursToExpiration: 24
missCount: 2
objectNumber: 17852
tableSpaceNumber: 4
CellCLI>
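To compare the two `LIST FLASHCACHECONTENT ... DETAIL` listings more easily, you can turn the output into a dictionary and derive a hit ratio from it. This is only a sketch: `parse_detail` is a name I made up, it assumes the `attribute: value` layout shown above, and the ratio is simply hitCount / (hitCount + missCount).

```python
def parse_detail(text):
    """Parse 'attribute: value' lines into a dict, converting plain integers."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            value = value.strip()
            record[key.strip()] = int(value) if value.isdigit() else value
    return record

# Second listing copied from the output above (after the table was read again).
sample = """\
cachedKeepSize: 1925120
cachedSize: 1925120
dbID: 3118431096
dbUniqueName: BBFF
hitCount: 25
hoursToExpiration: 24
missCount: 2
objectNumber: 17852
tableSpaceNumber: 4
"""

stats = parse_detail(sample)
total = stats["hitCount"] + stats["missCount"]
hit_ratio = stats["hitCount"] / total if total else None
cached_keep_kb = stats["cachedKeepSize"] / 1024

print(f"object {stats['objectNumber']}: hit ratio {hit_ratio:.1%}, "
      f"{cached_keep_kb:.0f} KB cached in KEEP")
```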
CellCLI> alter cell events="immediate cellsrv.cellsrv_flashcache(dumpmdchunk,1,542754140,1)"
Cell dm01cel01 successfully altered
CellCLI>
[root@dm01cel01 trace]# cat svtrc_2312_80.trc
Trace file /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/log/diag/asm/cell/dm01cel01/trace/svtrc_2312_80.trc
ORACLE_HOME = /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109
System name: Linux
Node name: dm01cel01
Release: 2.6.18-274.el5xen
Version: #1 SMP Mon Jul 25 14:24:57 EDT 2011
Machine: x86_64
CELL SW Version: OSS_11.2.3.2.1_LINUX.X64_130109
*** 2013-06-30 10:24:25.608
UserThread: LWPID: 2670 userId: 80 kernelId: 80 pthreadID: 0x74030940
2013-06-30 10:24:26.061175*: For GridDisk data_CD_disk05_dm01cel01 set these caching FlashIDs: 1922598212
2013-06-30 10:24:26.061175*: For GridDisk reco_CD_disk05_dm01cel01 set these caching FlashIDs: 1922598212
2013-06-30 10:24:26.139905*: [CDP] initCDPers – found persdata for guid: fb9bb908-044e-44a1-afa1-2428c065b9bc
dmgType: DMG_UNKNOWN dmgSlot: 255 predFailStat: 0 ioTimeIndex: 0 lastIOCompTime: 1372559066070289 lastIOSubmitTime: 0 histIOLatIndex: 0
CellDisk UUID: fb9bb908-044e-44a1-afa1-2428c065b9bc CellDiskPersObj File offset: 7360
confTransIdx: 0 Current confine state: NONE Health incarnation number: 0 ConfineTransIndex cstate ccause activeForced activeAlertSent inactiveForced inactiveAlertSent asmRespond testsFailed testOutcomeForced noneTime activeTime inactiveTime finalTime
0 NONE CD_GOOD 0 0 0 0 0 0 0 0 0 0 0
1 NONE CD_GOOD 0 0 0 0 0 0 0 0 0 0 0
2 NONE CD_GOOD 0 0 0 0 0 0 0 0 0 0 0
3 NONE CD_GOOD 0 0 0 0 0 0 0 0 0 0 0
4 NONE CD_GOOD 0 0 0 0 0 0 0 0 0 0 0
5 NONE CD_GOOD 0 0 0 0 0 0 0 0 0 0 0
6 NONE CD_GOOD 0 0 0 0 0 0 0 0 0 0 0
7 NONE CD_GOOD 0 0 0 0 0 0 0 0 0 0 0
8 NONE CD_GOOD 0 0 0 0 0 0 0 0 0 0 0
9 NONE CD_GOOD 0 0 0 0 0 0 0 0 0 0 0
2013-06-30 10:25:15.733434*: New info from MS for CD CD_disk09_dm01cel01: diskMediaGroup: DMG_UNKNOWN, disk slot number: 1, predictive failure on disk: 0
No Cache header ID=1, loc=542703616
[root@dm01cel01 trace]#
I am not sure whether it is because my cell is a VM, but this command did not actually give me a satisfying result...
