今天本想测试一个东西,却遇到一个ORA-1578,比较郁闷,最近vm总是出现乱七八糟郁闷问题。。。。。
LUNAR@travel>select count(*) from ff; select count(*) from ff * ERROR at line 1: ORA-01578: ORACLE data block corrupted (file # 4, block # 129) ORA-01110: data file 4: '/stage/travel/users01.dbf' LUNAR@travel> SYS@travel>select file#,name from v$datafile; FILE# NAME ---------- ------------------------------------------------------- 1 /stage/travel/system01.dbf 2 /stage/travel/sysaux01.dbf 3 /stage/travel/undotbs01.dbf 4 /stage/travel/users01.dbf 5 /stage/travel/lunar01.dbf SYS@travel> 使用dbv检测了一下,发现两个坏块: [oracle@lunar ~]$ dbv DBVERIFY: Release 11.2.0.3.0 - Production on Sat Oct 26 20:32:04 2013 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. Keyword Description (Default) ---------------------------------------------------- FILE File to Verify (NONE) START Start Block (First Block of File) END End Block (Last Block of File) BLOCKSIZE Logical Block Size (8192) LOGFILE Output Log (NONE) FEEDBACK Display Progress (0) PARFILE Parameter File (NONE) USERID Username/Password (NONE) SEGMENT_ID Segment ID (tsn.relfile.block) (NONE) HIGH_SCN Highest Block SCN To Verify (NONE) (scn_wrap.scn_base OR scn) [oracle@lunar ~]$ dbv file=/stage/travel/users01.dbf DBVERIFY: Release 11.2.0.3.0 - Production on Sat Oct 26 20:32:26 2013 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. DBVERIFY - Verification starting : FILE = /stage/travel/users01.dbf Page 128 is influx - most likely media corrupt Corrupt block relative dba: 0x01000080 (file 4, block 128) Fractured block found during dbv: Data in bad block: type: 30 format: 2 rdba: 0x01000080 last change scn: 0x0000.00055d9a seq: 0x4 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x5d9a2004 check value in block header: 0xd9f8 computed block checksum: 0x0 Page 129 is marked corrupt Corrupt block relative dba: 0x01000081 (file 4, block 129) Bad check value found during dbv: Data in bad block: type: 33 format: 2 rdba: 0x01000081 last change scn: 0x0000.00057901 seq: 0x3 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x79012103 check value in block header: 0xc13e computed block checksum: 0x41f0 |
原因已经很清楚了,file 4 block 128是物理坏块;file 4 block 129是checksum问题
DBVERIFY - Verification complete Total Pages Examined : 32000 Total Pages Processed (Data) : 29573 Total Pages Failing (Data) : 0 Total Pages Processed (Index): 0 Total Pages Failing (Index): 0 Total Pages Processed (Other): 380 Total Pages Processed (Seg) : 0 Total Pages Failing (Seg) : 0 Total Pages Empty : 2045 Total Pages Marked Corrupt : 2 Total Pages Influx : 1 Total Pages Encrypted : 0 Highest block SCN : 442564 (0.442564) 我来试试rman检测的效果: [oracle@lunar ~]$ rman target / Recovery Manager: Release 11.2.0.3.0 - Production on Sat Oct 26 20:58:42 2013 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. connected to target database: TRAVEL (DBID=2874070637) RMAN> backup validate check logical database; Starting backup at 26-OCT-13 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=140 device type=DISK channel ORA_DISK_1: starting full datafile backup set channel ORA_DISK_1: specifying datafile(s) in backup set input datafile file number=00001 name=/stage/travel/system01.dbf input datafile file number=00002 name=/stage/travel/sysaux01.dbf input datafile file number=00004 name=/stage/travel/users01.dbf input datafile file number=00003 name=/stage/travel/undotbs01.dbf input datafile file number=00005 name=/stage/travel/lunar01.dbf channel ORA_DISK_1: backup set complete, elapsed time: 00:00:45 List of Datafiles ================= File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 1 OK 0 12609 44800 537449 File Name: /stage/travel/system01.dbf Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 22483 Index 0 6617 Other 0 3091 File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 2 OK 0 24301 38400 537463 File Name: /stage/travel/sysaux01.dbf Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 4311 Index 0 3504 Other 0 6284 File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 3 OK 0 1 19840 537463 File Name: /stage/travel/undotbs01.dbf Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 0 Index 0 0 Other 0 19839 File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 4 FAILED 0 2045 32000 442564 File Name: /stage/travel/users01.dbf Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 29573 Index 0 0 Other 2 382 File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 5 OK 0 12475 12800 442504 File Name: /stage/travel/lunar01.dbf Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 0 Index 0 1 Other 0 324 validate found one or more corrupt blocks See trace file /u01/app/oracle/diag/rdbms/travel/travel/trace/travel_ora_11181.trc for details channel ORA_DISK_1: starting full datafile backup set channel ORA_DISK_1: specifying datafile(s) in backup set including current control file in backup set including current SPFILE in backup set channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02 List of Control File and SPFILE =============================== File Type Status Blocks Failing Blocks Examined ------------ ------ -------------- --------------- SPFILE OK 0 2 Control File OK 0 594 Finished backup at 26-OCT-13 RMAN> exit Recovery Manager complete. [oracle@lunar ~]$ ss SQL*Plus: Release 11.2.0.3.0 Production on Sat Oct 26 21:02:20 2013 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Partitioning and Real Application Testing options SYS@travel>select * from v$database_block_corruption; FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTION_TYPE ---------- ---------- ---------- ------------------ ------------------ 4 128 1 0 FRACTURED 这里的信息与dbv的检测完全一致,非常方便,block corrupt的类型清晰明了 4 129 1 0 CHECKSUM SYS@travel> 尝试使用event 10231跳过都失败了: LUNAR@travel>set autotrace on explain LUNAR@travel>alter session set events '10231 trace name context forever, level 1'; Session altered. LUNAR@travel>select count(*) from ff; select count(*) from ff * ERROR at line 1: ORA-01578: ORACLE data block corrupted (file # 4, block # 129) ORA-01110: data file 4: '/stage/travel/users01.dbf' LUNAR@travel> LUNAR@travel>alter session set events '10231 trace name context forever, level 10'; Session altered. LUNAR@travel>select count(*) from ff; select count(*) from ff * ERROR at line 1: ORA-01578: ORACLE data block corrupted (file # 4, block # 129) ORA-01110: data file 4: '/stage/travel/users01.dbf' LUNAR@travel> 尝试按照rowid抽取,也失败了: declare n number:=0; bad_rows number := 0; error_code number; ora1578 EXCEPTION; PRAGMA EXCEPTION_INIT(ora1578, -1578); begin for i in (select rowid rid from ff) loop begin insert into newff select * from ff where rowid=i.rid; n:=n+1; exception when ora1578 then bad_rows := bad_rows + 1; insert into bad_rows values(i.rid,1578); commit; when others then error_code:=SQLCODE; bad_rows := bad_rows + 1; insert into bad_rows values(i.rid,error_code); commit; end; end loop; dbms_output.put_line('Total Bad Rows: '||bad_rows); dbms_output.put_line('Total Good rows: '||n); end; / LUNAR@travel>declare 2 n number:=0; 3 bad_rows number := 0; 4 error_code number; 5 ora1578 EXCEPTION; 6 PRAGMA EXCEPTION_INIT(ora1578, -1578); 7 begin 8 for i in (select rowid rid from ff) loop 9 begin 10 insert into newff 11 select * 12 from ff 13 where rowid=i.rid; 14 n:=n+1; 15 exception 16 when ora1578 then 17 bad_rows := bad_rows + 1; 18 insert into bad_rows values(i.rid,1578); 19 commit; 20 when others then 21 error_code:=SQLCODE; 22 bad_rows := bad_rows + 1; 23 insert into bad_rows values(i.rid,error_code); 24 commit; 25 end; 26 end loop; 27 dbms_output.put_line('Total Bad Rows: '||bad_rows); 28 dbms_output.put_line('Total Good rows: '||n); 29 end; 30 / declare * ERROR at line 1: ORA-01578: ORACLE data block corrupted (file # 4, block # 129) ORA-01110: data file 4: '/stage/travel/users01.dbf' ORA-06512: at line 8 LUNAR@travel> |
尝试DBMS_REPAIR也不行:
execute DBMS_REPAIR.SKIP_CORRUPT_BLOCKS('','', flags=>dbms_repair.noskip_flag); select owner, segment_name, segment_type, partition_name from dba_segments where header_file = &AFN and header_block = &BL; exec DBMS_REPAIR.ADMIN_TABLES('REPAIR_TABLE',1,1,'SYSTEM'); exec DBMS_REPAIR.ADMIN_TABLES('ORPHAN_TABLE',2,1,'SYSTEM'); SET serveroutput ON DECLARE cc NUMBER; BEGIN DBMS_REPAIR.check_object(schema_name => 'LUNAR',object_name =>'FF',corrupt_count => cc); DBMS_OUTPUT.put_line(a => TO_CHAR(cc)); END; / SYS@travel>SET serveroutput ON DECLARE SYS@travel> 2 cc NUMBER; 3 BEGIN 4 DBMS_REPAIR.check_object(schema_name => 'LUNAR',object_name =>'FF',corrupt_count => cc); 5 DBMS_OUTPUT.put_line(a => TO_CHAR(cc)); 6 END; 7 / DECLARE * ERROR at line 1: ORA-01578: ORACLE data block corrupted (file # 4, block # 129) ORA-01110: data file 4: '/stage/travel/users01.dbf' ORA-06512: at "SYS.DBMS_REPAIR", line 294 ORA-06512: at line 4 SYS@travel> dump一下,看看什么情况: SYS@travel>oradebug setmypid Statement processed. SYS@travel> ALTER system dump datafile 4 block 129; System altered. SYS@travel>oradebug tracefile_name /u01/app/oracle/diag/rdbms/travel/travel/trace/travel_ora_11492.trc SYS@travel> *** 2013-10-26 22:06:33.568 Start dump data blocks tsn: 4 file#:4 minblk 129 maxblk 129 Block dump from cache: Dump of buffer cache at level 4 for tsn=4 rdba=16777345 BH (0x693e6418) file#: 4 rdba: 0x01000081 (4/129) class: 9 ba: 0x69182000 set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 110,28 dbwrid: 0 obj: 17946 objn: 17946 tsn: 4 afn: 4 hint: f hash: [0x7448c5d0,0x7448c5d0] lru: [0x693e6630,0x693e63d0] ckptq: [NULL] fileq: [NULL] objq: [0x697f40d8,0x693e63f8] objaq: [0x697f40e8,0x693e6408] st: XCURRENT md: NULL fpin: 'ktspswh6: ktspInitScan' tch: 0 flags: auto_bmr_tried LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [65535] Block dump from disk: buffer tsn: 4 rdba: 0x01000081 (4/129) scn: 0x0000.00057901 seq: 0x03 flg: 0x04 tail: 0x79012103 frmt: 0x02 chkval: 0xc13e type: 0x21=SECOND LEVEL BITMAP BLOCK ------这里看到是type: 0x21=SECOND LEVEL BITMAP BLOCK,event 10231等等手段是跳不过去的 Hex dump of corrupt header 3 = CHKVAL 尝试使用bbed修改: BBED> verify DBVERIFY - Verification starting FILE = /stage/travel/users01.dbf BLOCK = 129 Block 129 is corrupt Corrupt block relative dba: 0x01000081 (file 0, block 129) Bad check value found during verification Data in bad block: type: 33 format: 2 rdba: 0x01000081 last change scn: 0x0000.00057901 seq: 0x3 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x79012103 check value in block header: 0xc13e computed block checksum: 0x41f0 DBVERIFY - Verification complete Total Blocks Examined : 1 Total Blocks Processed (Data) : 0 Total Blocks Failing (Data) : 0 Total Blocks Processed (Index): 0 Total Blocks Failing (Index): 0 Total Blocks Empty : 0 Total Blocks Marked Corrupt : 1 Total Blocks Influx : 0 Message 531 not found; product=RDBMS; facility=BBED BBED> BBED> sum apply Check value for File 4, Block 129: current = 0x80ce, required = 0x80ce BBED> verify DBVERIFY - Verification starting FILE = /stage/travel/users01.dbf BLOCK = 129 DBVERIFY - Verification complete Total Blocks Examined : 1 Total Blocks Processed (Data) : 0 Total Blocks Failing (Data) : 0 Total Blocks Processed (Index): 0 Total Blocks Failing (Index): 0 Total Blocks Empty : 0 Total Blocks Marked Corrupt : 0 Total Blocks Influx : 0 Message 531 not found; product=RDBMS; facility=BBED BBED> |
好了,已经不抱错了,再试试,dbv,这里理论上应该是只有1个坏块了,file 4 block 129已经修好了:
[oracle@lunar bbed]$ dbv file=/stage/travel/users01.dbf DBVERIFY: Release 11.2.0.3.0 - Production on Sat Oct 26 23:10:51 2013 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. DBVERIFY - Verification starting : FILE = /stage/travel/users01.dbf Page 128 is influx - most likely media corrupt Corrupt block relative dba: 0x01000080 (file 4, block 128) Fractured block found during dbv: Data in bad block: type: 30 format: 2 rdba: 0x01000080 last change scn: 0x0000.00055d9a seq: 0x4 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x5d9a2004 check value in block header: 0xd9f8 computed block checksum: 0x0 DBVERIFY - Verification complete Total Pages Examined : 32000 Total Pages Processed (Data) : 29565 Total Pages Failing (Data) : 0 Total Pages Processed (Index): 0 Total Pages Failing (Index): 0 Total Pages Processed (Other): 389 Total Pages Processed (Seg) : 0 Total Pages Failing (Seg) : 0 Total Pages Empty : 2045 Total Pages Marked Corrupt : 1 Total Pages Influx : 1 Total Pages Encrypted : 0 Highest block SCN : 539524 (0.539524) [oracle@lunar bbed]$ |
记录一下:
bbed修改后的block: 有问题的block:
*** 2013-10-26 23:12:54.998 *** 2013-10-26 22:06:33.568 Start dump data blocks tsn: 4 file#:4 minblk 129 maxblk 129 Start dump data blocks tsn: 4 file#:4 minblk 129 maxblk 129 Block dump from cache: Block dump from cache: Dump of buffer cache at level 4 for tsn=4 rdba=16777345 Dump of buffer cache at level 4 for tsn=4 rdba=16777345 BH (0x693e6418) file#: 4 rdba: 0x01000081 (4/129) class: 9 ba: 0x69182000 BH (0x693e6418) file#: 4 rdba: 0x01000081 (4/129) class: 9 ba: 0x69182000 set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 116,28 set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 110,28 dbwrid: 0 obj: 17946 objn: 17946 tsn: 4 afn: 4 hint: f dbwrid: 0 obj: 17946 objn: 17946 tsn: 4 afn: 4 hint: f hash: [0x7448c5d0,0x7448c5d0] lru: [0x693e6630,0x693e63d0] hash: [0x7448c5d0,0x7448c5d0] lru: [0x693e6630,0x693e63d0] ckptq: [NULL] fileq: [NULL] objq: [0x697f40d8,0x693e63f8] objaq: [0x697f40e8,0x693e6408] ckptq: [NULL] fileq: [NULL] objq: [0x697f40d8,0x693e63f8] objaq: [0x697f40e8,0x693e6408] st: XCURRENT md: NULL fpin: 'ktspswh6: ktspInitScan' tch: 0 st: XCURRENT md: NULL fpin: 'ktspswh6: ktspInitScan' tch: 0 flags: auto_bmr_tried flags: auto_bmr_tried LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [65535] LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [65535] Block dump from disk: Block dump from disk: buffer tsn: 4 rdba: 0x01000081 (4/129) buffer tsn: 4 rdba: 0x01000081 (4/129) scn: 0x0000.00057901 seq: 0x03 flg: 0x04 tail: 0x79012103 scn: 0x0000.00057901 seq: 0x03 flg: 0x04 tail: 0x79012103 frmt: 0x02 chkval: 0x80ce type: 0x21=SECOND LEVEL BITMAP BLOCK frmt: 0x02 chkval: 0xc13e type: 0x21=SECOND LEVEL BITMAP BLOCK Hex dump of block: st=0, typ_found=1 Hex dump of corrupt header 3 = CHKVAL Dump of memory from 0x00007FFDE0FECA00 to 0x00007FFDE0FEEA00 |
对比修复之前的file 4 block129的dbv的信息:
Page 129 is marked corrupt Corrupt block relative dba: 0x01000081 (file 4, block 129) Bad check value found during dbv: Data in bad block: type: 33 format: 2 rdba: 0x01000081 last change scn: 0x0000.00057901 seq: 0x3 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x79012103 check value in block header: 0xc13e computed block checksum: 0x41f0 |
再次尝试event 10231跳过,发现还是不行:
SYS@travel>conn lunar/lunar Connected. LUNAR@travel>alter session set events '10231 trace name context forever, level 10'; Session altered. LUNAR@travel>select count(*) from ff; select count(*) from ff * ERROR at line 1: ORA-01578: ORACLE data block corrupted (file # 4, block # 129) ORA-01110: data file 4: '/stage/travel/users01.dbf' LUNAR@travel>! |
bbed再次检查,发现没有问题:
[oracle@lunar bbed]$ dbv file=/stage/travel/users01.dbf DBVERIFY: Release 11.2.0.3.0 - Production on Sat Oct 26 23:20:14 2013 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. DBVERIFY - Verification starting : FILE = /stage/travel/users01.dbf Page 128 is influx - most likely media corrupt Corrupt block relative dba: 0x01000080 (file 4, block 128) Fractured block found during dbv: Data in bad block: type: 30 format: 2 rdba: 0x01000080 last change scn: 0x0000.00055d9a seq: 0x4 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x5d9a2004 check value in block header: 0xd9f8 computed block checksum: 0x0 DBVERIFY - Verification complete Total Pages Examined : 32000 Total Pages Processed (Data) : 29565 Total Pages Failing (Data) : 0 Total Pages Processed (Index): 0 Total Pages Failing (Index): 0 Total Pages Processed (Other): 389 Total Pages Processed (Seg) : 0 Total Pages Failing (Seg) : 0 Total Pages Empty : 2045 Total Pages Marked Corrupt : 1 Total Pages Influx : 1 Total Pages Encrypted : 0 Highest block SCN : 539524 (0.539524) [oracle@lunar bbed]$ |
怀疑是buffer中该块还是老的,因此试试刷新一下:
SYS@travel>alter system flush BUFFER_CACHE; System altered. SYS@travel>conn lunar/lunar Connected. LUNAR@travel>alter session set events '10231 trace name context forever, level 10'; Session altered. LUNAR@travel>select count(*) from ff; COUNT(*) ---------- 141112 LUNAR@travel> |
问题解决了,O(∩_∩)O哈哈~
bbed改的块哪个offset呢?