Focus On Oracle

Installing, Backup & Recovery, Performance Tuning,
Troubleshooting, Upgrading, Patching

Oracle Engineered System


当前位置: 首页 » 技术文章 » 内部原理

如何用BBED使Offline的数据文件Online

环境:IBM AIX两节点RAC,数据库版本10.2.0.4(使用了ASM) 

故障原因:

经了解,目前此数据库的容量为6T,ASM磁盘组空间几乎用尽,在向磁盘组增加磁盘时,由于种种原因最终导致2个数据文件offline。由于数据库比较大,数据库没有备份,可怜的是,归档日志是定期清除的,当发现这个问题时,所需的归档日志已被清除,想通过常规手段使文件online已不可能,幸运的时,通过BBED最终使文件online成功,虽然后续还要一些问题,最终得以解决。

alert日志内容
Mon Jul  4 09:55:25 2016
KCF: write/open error block=0x63a093 online=1
     file=11 +ASM_DG02/abdd/datafile/dth_img.261.769872303
     error=15078 txt: ''
Automatic datafile offline due to write error on
file 11: +ASM_DG02/abdd/datafile/dth_img.261.769872303
Mon Jul  4 09:55:38 2016
Errors in file /u01/oracle/admin/abdd/bdump/abdd1_j000_3605220.trc:
ORA-12012: error on auto execute of job 10303
ORA-01115: IO error reading block from file 67 (block # 42653)
ORA-01110: data file 67: '+ASM_DG02/abdd/datafile/data02.dbf'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-01115: IO error reading block from file 67 (block # 198500)
ORA-01110: data file 67: '+ASM_DG02/abdd/datafile/data02.dbf'

恢复思路:
A.安装BBED(由于是10.2.0.4的库,自身就有bbed编译所需的文件)
B.找出2个Offline文件在磁盘上的位置
C.通过dd生成备份/恢复这两个数据文件头的命令
D.正常关闭数据库
E.用dd复制出2个正常的数据文件头部和2个Offline的数据文件头部
   注意:这2个offline的文件头部备份2份,因为后面要修改。复制出2个正常的数据文件头部用作参考。
F.用bbed查看正常文件的头部在偏移量484到512的数值
G.用bbed修改2个offline文件头部在偏移量484到512的数值,确保Offline文件和正常文件头部的数值是一致的
H.用sqlplus连接其中一节点并启动数据库到mount
I.恢复数据文件
   recover datafile 67;
   recover datafile 11;
J.已只读模式打开数据库
   alter database open read only;
   alter database datafile 11 online;
   alter database datafile 67 online;
   select distinct status from v$datafile;
   select * from v$recover_file;
K.关闭数据库,以正常模式打开数据库
   shutdown immediate
   startup
L.经过一段时间观察,没有出现ora-600等异常情况
M.关闭数据库,用srvctl start database

注意:在操作前,注意备份system表空间,offline的文件,当前的控制文件,在线日志文件

打开链接参考生成备份ASM数据文件头部的脚本

dd if=/dev/rhdisk5 of=/home/oracle/backup/1_1_264.FH bs=1048576 skip=283 count=1
dd if=/dev/rhdisk8 of=/home/oracle/backup/1_2_262.FH bs=1048576 skip=116 count=1
dd if=/dev/rhdisk17 of=/home/oracle/backup/2_8_261.FH bs=1048576 skip=53055 count=1
dd if=/dev/rhdisk12 of=/home/oracle/backup/2_2_264.FH bs=1048576 skip=3616 count=1

BBED具体操作
BBED> info all
 File#  Name                                                        Size(blks)
 -----  ----                                                        ----------
     1  /home/oracle/backup/1_1_264.FH                                       0
     2  /home/oracle/backup/1_2_262.FH                                       0
     3  /home/oracle/backup/2_2_264.FH                                       0
     4  /home/oracle/backup/2_8_261.FH                                       0 

BBED> set dba 1,1
        DBA             0x00400001 (4194305 1,1)

BBED> p kcvfhckp
struct kcvfhckp, 160 bytes                  @484     
   struct kcvcpscn, 8 bytes                 @484     
      ub4 kscnbas                           @484      0x36a28488
      ub2 kscnwrp                           @488      0x0d58
   ub4 kcvcptim                             @492      0x369faee7
   ub2 kcvcpthr                             @496      0x0002
   union u, 12 bytes                        @500     
      struct kcvcprba, 12 bytes             @500     
         ub4 kcrbaseq                       @500      0x00015a93
         ub4 kcrbabno                       @504      0x00006b8d
         ub2 kcrbabof                       @508      0x0010
为了方便查看,后面部分省略
BBED> set dba 2,1 DBA 0x00800001 (8388609 2,1) BBED> p kcvfhckp struct kcvfhckp, 160 bytes @484 struct kcvcpscn, 8 bytes @484 ub4 kscnbas @484 0x36a28488 ub2 kscnwrp @488 0x0d58 ub4 kcvcptim @492 0x369faee7 ub2 kcvcpthr @496 0x0002 union u, 12 bytes @500 struct kcvcprba, 12 bytes @500 ub4 kcrbaseq @500 0x00015a93 ub4 kcrbabno @504 0x00006b8d ub2 kcrbabof @508 0x0010
为了方便查看,后面部分省略
BBED> set dba 3,1 DBA 0x00c00001 (12582913 3,1) BBED> p kcvfhckp struct kcvfhckp, 160 bytes @484 struct kcvcpscn, 8 bytes @484 ub4 kscnbas @484 0x36195936 ub2 kscnwrp @488 0x0d58 ub4 kcvcptim @492 0x369dbc5a ub2 kcvcpthr @496 0x0002 union u, 12 bytes @500 struct kcvcprba, 12 bytes @500 ub4 kcrbaseq @500 0x000159e9 ub4 kcrbabno @504 0x00000002 ub2 kcrbabof @508 0x0010 为了方便查看,后面部分省略 BBED> set dba 4,1 DBA 0x01000001 (16777217 4,1) BBED> p kcvfhckp struct kcvfhckp, 160 bytes @484 struct kcvcpscn, 8 bytes @484 ub4 kscnbas @484 0x36195936 ub2 kscnwrp @488 0x0d58 ub4 kcvcptim @492 0x369dbc5a ub2 kcvcpthr @496 0x0002 union u, 12 bytes @500 struct kcvcprba, 12 bytes @500 ub4 kcrbaseq @500 0x000159e9 ub4 kcrbabno @504 0x00000002 ub2 kcrbabof @508 0x0010
为了方便查看,后面部分省略

从上面我们可以看到,文件1,2头部是一样的,这2个文件是正常的,后面2个文件是Offline的文件,我们需要做的就是修改checkpoint的信息以及RBA的信息

BBED操作

注意:AIX是大端,修改时顺序无需调换

先查看偏移量484到508的相信信息

BBED> set count 26
        COUNT           26
BBED> d offset 484 dba 1,1
 File: /home/oracle/backup/1_1_264.FH (1)
 Block: 1                Offsets:  484 to  509           Dba:0x00400001
------------------------------------------------------------------------
 36a28488 0d580000 369faee7 0002a054 00015a93 00006b8d 0010 

 <32 bytes per line>

BBED> d offset 484 dba 2,1
 File: /home/oracle/backup/1_2_262.FH (2)
 Block: 1                Offsets:  484 to  509           Dba:0x00800001
------------------------------------------------------------------------
 36a28488 0d580000 369faee7 0002a054 00015a93 00006b8d 0010 

 <32 bytes per line>

BBED> d offset 484 dba 3,1
 File: /home/oracle/backup/2_2_264.FH (3)
 Block: 1                Offsets:  484 to  509           Dba:0x00c00001
------------------------------------------------------------------------
 36195936 0d580cf8 369dbc5a 0002de80 000159e9 00000002 0010 

 <32 bytes per line>

BBED> d offset 484 dba 4,1
 File: /home/oracle/backup/2_8_261.FH (4)
 Block: 1                Offsets:  484 to  509           Dba:0x01000001
------------------------------------------------------------------------
 36195936 0d580cf8 369dbc5a 0002de80 000159e9 00000002 0010 

 <32 bytes per line>

BBED>
修改文件3,4在偏移量484到508的相关信息,修改完成后,确认1,2,3,4偏移量在484到508的信息是一致的
BBED> set dba 3,1
        DBA             0x00c00001 (12582913 3,1)

BBED> m /x 0x0d580000 offset 488
 File: /home/oracle/backup/2_2_264.FH (3)
 Block: 1                Offsets:  488 to  513           Dba:0x00c00001
------------------------------------------------------------------------
 0d580000 369faee7 0002de80 00015a93 00006b8d 00100000 0600 

 <32 bytes per line>

BBED> m /x 0x0002a054 offset 496
 File: /home/oracle/backup/2_2_264.FH (3)
 Block: 1                Offsets:  496 to  521           Dba:0x00c00001
------------------------------------------------------------------------
 0002a054 00015a93 00006b8d 00100000 06000000 00000000 0000 

 <32 bytes per line>

BBED> sum
Check value for File 3, Block 1:
current = 0x44a7, required = 0x368b

BBED> sum apply
Check value for File 3, Block 1:
current = 0x368b, required = 0x368b

BBED> set dba 4,1
        DBA             0x01000001 (16777217 4,1)

BBED> m /x 0x0d580000 offset 488
 File: /home/oracle/backup/2_8_261.FH (4)
 Block: 1                Offsets:  488 to  513           Dba:0x01000001
------------------------------------------------------------------------
 0d580000 369faee7 0002de80 00015a93 00006b8d 00100000 0600 

 <32 bytes per line>

BBED> m /x 0x0002a054 offset 496
 File: /home/oracle/backup/2_8_261.FH (4)
 Block: 1                Offsets:  496 to  521           Dba:0x01000001
------------------------------------------------------------------------
 0002a054 00015a93 00006b8d 00100000 06000000 00000000 0000 

 <32 bytes per line>

BBED> sum
Check value for File 4, Block 1:
current = 0x8ef7, required = 0xfcdb

BBED> sum apply
Check value for File 4, Block 1:
current = 0xfcdb, required = 0xfcdb

BBED> d offset 484 dba 1,1
 File: /home/oracle/backup/1_1_264.FH (1)
 Block: 1                Offsets:  484 to  509           Dba:0x00400001
------------------------------------------------------------------------
 36a28488 0d580000 369faee7 0002a054 00015a93 00006b8d 0010 

 <32 bytes per line>

BBED> d offset 484 dba 2,1
 File: /home/oracle/backup/1_2_262.FH (2)
 Block: 1                Offsets:  484 to  509           Dba:0x00800001
------------------------------------------------------------------------
 36a28488 0d580000 369faee7 0002a054 00015a93 00006b8d 0010 

 <32 bytes per line>

BBED> d offset 484 dba 3,1
 File: /home/oracle/backup/2_2_264.FH (3)
 Block: 1                Offsets:  484 to  509           Dba:0x00c00001
------------------------------------------------------------------------
 36a28488 0d580000 369faee7 0002a054 00015a93 00006b8d 0010 

 <32 bytes per line>

BBED> d offset 484 dba 4,1
 File: /home/oracle/backup/2_8_261.FH (4)
 Block: 1                Offsets:  484 to  509           Dba:0x01000001
------------------------------------------------------------------------
 36a28488 0d580000 369faee7 0002a054 00015a93 00006b8d 0010 

 <32 bytes per line>
用dd还原修改后的数据文件头部
dd  if=/home/oracle/backup/2_8_261.FH of=/dev/rhdisk17 bs=1048576 seek=53055 count=1 conv=notrunc
dd  if=/home/oracle/backup/2_2_264.FH of=/dev/rhdisk12 bs=1048576 seek=3616 count=1 conv=notrunc

正常打开后数据库alert日志

ALTER DATABASE RECOVER  datafile 11  
Tue Jul  5 23:03:43 2016
Media Recovery Start
Tue Jul  5 23:03:43 2016
SUCCESS: diskgroup ASM_DG02 was mounted
Tue Jul  5 23:03:45 2016
 parallel recovery started with 15 processes
Tue Jul  5 23:03:45 2016
Media Recovery Complete (abdd1)
Tue Jul  5 23:03:45 2016
SUCCESS: diskgroup ASM_DG02 was dismounted
Tue Jul  5 23:03:45 2016
Completed: ALTER DATABASE RECOVER  datafile 11  
Tue Jul  5 23:03:54 2016
ALTER DATABASE RECOVER  datafile 67  
Media Recovery Start
Tue Jul  5 23:03:54 2016
SUCCESS: diskgroup ASM_DG02 was mounted
Tue Jul  5 23:03:54 2016
 parallel recovery started with 15 processes
Tue Jul  5 23:03:54 2016
Media Recovery Complete (abdd1)
Tue Jul  5 23:03:54 2016
SUCCESS: diskgroup ASM_DG02 was dismounted
Tue Jul  5 23:03:54 2016
Completed: ALTER DATABASE RECOVER  datafile 67  
Tue Jul  5 23:04:16 2016
alter database open read only
Tue Jul  5 23:04:16 2016
This instance was first to open
Tue Jul  5 23:04:16 2016
SUCCESS: diskgroup ASM_DG02 was mounted
Tue Jul  5 23:04:24 2016
Picked broadcast on commit scheme to generate SCNs
Tue Jul  5 23:04:32 2016
SMON: enabling cache recovery
Tue Jul  5 23:04:32 2016
Database Characterset is ZHS16GBK
Opening with internal Resource Manager plan
where NUMA PG = 1, CPUs = 16
replication_dependency_tracking turned off (no async multimaster replication found)
Completed: alter database open read only
Tue Jul  5 23:04:34 2016
Errors in file /u01/oracle/admin/abdd/udump/abdd1_ora_5571420.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-16000: database open for read-only access
ORA-06512: at line 2
alter database datafile 11 online
Tue Jul  5 23:08:05 2016
Starting control autobackup
Control autobackup written to DISK device
	handle '+ASM_DG01/abdd/autobackup/2016_07_05/s_916434663.16892.916441687'
Completed: alter database datafile 11 online
Tue Jul  5 23:08:17 2016
alter database datafile 67 online
Tue Jul  5 23:08:17 2016
Starting control autobackup
Control autobackup written to DISK device
	handle '+ASM_DG01/abdd/autobackup/2016_07_05/s_916434663.16727.916441699'
Completed: alter database datafile 67 online
Tue Jul  5 23:09:01 2016
Thread 1 advanced to log sequence 134864 (LGWR switch)
  Current log# 5 seq# 134864 mem# 0: +ASM_DG01/abdd/onlinelog/group_5.408.776366043
  Current log# 5 seq# 134864 mem# 1: +ASM_DG01/abdd/onlinelog/group_5.420.776366045
Wed Jul  6 00:01:36 2016
Thread 1 advanced to log sequence 134865 (LGWR switch)
  Current log# 7 seq# 134865 mem# 0: +ASM_DG01/abdd/onlinelog/group_7.314.776366093
  Current log# 7 seq# 134865 mem# 1: +ASM_DG01/abdd/onlinelog/group_7.325.776366093
Wed Jul  6 00:10:50 2016
Thread 1 advanced to log sequence 134866 (LGWR switch)
  Current log# 2 seq# 134866 mem# 0: +ASM_DG01/abdd/onlinelog/group_2.260.769366221
  Current log# 2 seq# 134866 mem# 1: +ASM_DG01/abdd/onlinelog/group_2.261.769366221
Wed Jul  6 00:16:49 2016
Thread 1 advanced to log sequence 134867 (LGWR switch)
  Current log# 1 seq# 134867 mem# 0: +ASM_DG01/abdd/onlinelog/group_1.258.769366219
  Current log# 1 seq# 134867 mem# 1: +ASM_DG01/abdd/onlinelog/group_1.259.769366219
Wed Jul  6 00:24:23 2016
Thread 1 advanced to log sequence 134868 (LGWR switch)
  Current log# 5 seq# 134868 mem# 0: +ASM_DG01/abdd/onlinelog/group_5.408.776366043
  Current log# 5 seq# 134868 mem# 1: +ASM_DG01/abdd/onlinelog/group_5.420.776366045
Wed Jul  6 00:32:29 2016
Thread 1 advanced to log sequence 134869 (LGWR switch)
  Current log# 7 seq# 134869 mem# 0: +ASM_DG01/abdd/onlinelog/group_7.314.776366093
  Current log# 7 seq# 134869 mem# 1: +ASM_DG01/abdd/onlinelog/group_7.325.776366093
Wed Jul  6 00:47:23 2016

解决ktsplbfmb-sync错误

ORA-00600: internal error code, arguments: [ktsplbfmb-sync], [], [], [], [], [], [], []

通过dbv校验数据文件,发现之前有坏块,根据file id,block id可以查到,坏块涉及的对象有2个:一个为Lob Index(一个块),一个为Lob Segment(多个块)

尝试通过expdp导出这2个表,Lob Index损坏的表可以正常导出,然后通过move table,应用程序端出现的错误消失,Lob Segment损坏的表,无法通过expdp导出。最终通过找到损坏的表的rowid,忍痛割爱通过empty_blob()重新初始化,好在损坏的行不多,只有2行,至此问题圆满解决。


小结:在数据库上不论做什么操作,都要认真去分析调查,小心无大错。


Reference

Export Receives The Errors ORA-1555 ORA-22924 ORA-1578 ORA-22922 (Doc ID 787004.1)

Export Fails With Errors ORA-2354 ORA-1555 ORA-22924 And How To Confirm LOB Segment Corruption Using Export Utility (Doc ID 833635.1)



关键词:recover block bbed 

相关文章

RMAN备份脚本样本
ORA-600 【2662】
Oracle Database Recover Check
如何用BBED使Offline的数据文件Online
ORA-00338/ORA-00312
如何使用BBED
Interpret The Datafile OS Header/Block Zero
如何计算Oracle Block的校验值
10g,11g,12c X$KCVFH(一)
如何安装BBED

联系我们

韩朝阳



    Phone:186-9589-1286

    E-mail:ohsdba@qq.com

    微  信: ohsdba


电话(186-9589-1286)

QQ(375349564)

微信(ohsdba)


常用链接

Oracle Database Upgrade

Oracle

eDelivery
My Oracle Support

Exadata 12.1 Document

Exadata 12.2 Document

Oracle 11gR2 Document

Oracle 12cR1 Document

Oracle 12cR2 Document

GoldenGate 12c

Oracle Fusion Middleware

Mysql Document

PostgreSQL

Python

github

sourceforge



Top
Cell186-9589-1286
QQ 375349564
E-mail ohsdba@qq.com