1. Environment
(1) Oracle 11.2.0.3 RAC on Oracle Linux 6 x86_64, with a single ASM external-redundancy disk group, DATA.
(2) The OCR, voting disk, datafiles, controlfiles and SPFILE all reside in this disk group.
2. Failure
(1) A storage failure caused the ASM disks to be lost.
(2) Because the OCR and voting disk were lost, every Clusterware service stopped except OHAS, which stayed online.
3. Backups
(1) RMAN backups: controlfile, database, spfile and archivelog.
(2) OCR backups: no manual backup was ever taken, but the automatic CRS backup files exist under $CRS_HOME/cdata.
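4. Procedure
Note: the plan is to restore the OCR from the automatic CRS backup and to restore the database from the RMAN backups. While preparing the restore, the ASM disk groups are also reorganized so that the OCR and voting disk are stored separately from the database files.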
4.1 Restoring the OCR and voting disk
(1) Stop the CRS stack on all RAC nodes.
[root@rac1 ~]# crsctl stop has -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2673: Attempting to stop 'ora.crf' on 'rac1'
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@rac2 ~]# crsctl stop has -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac2'
CRS-2673: Attempting to stop 'ora.crf' on 'rac2'
CRS-2677: Stop of 'ora.mdnsd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.crf' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac2'
CRS-2677: Stop of 'ora.gipcd' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac2'
CRS-2677: Stop of 'ora.gpnpd' on 'rac2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac2' has completed
CRS-4133: Oracle High Availability Services has been stopped.
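Step (1) has to be repeated on every node. The following is a minimal sketch of a wrapper for that, not something the original procedure used: it assumes root SSH access to each node and that `crsctl` is on root's PATH in a non-interactive session. The function name `stop_has_on_nodes` and the `DRY_RUN` variable are hypothetical.

```shell
# Hypothetical helper: run `crsctl stop has -f` on each node passed as an argument.
# Assumes root SSH access to every node; set DRY_RUN=1 to only print the commands.
stop_has_on_nodes() {
    for node in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh root@${node} crsctl stop has -f"
        else
            # Stop at the first node where the command fails.
            ssh "root@${node}" crsctl stop has -f || return 1
        fi
    done
}

# Usage (dry run): DRY_RUN=1 stop_has_on_nodes rac1 rac2
```

Running it with `DRY_RUN=1` first shows exactly what would be executed before anything is stopped.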
(2) On one node, start the stack in exclusive mode without the CRS daemon (-excl -nocrs). This also starts the ASM instance.
[root@rac1 ~]# crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded
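The rest of the procedure depends on ASM being up, so it can help to check the startup output for the `ora.asm` success message rather than reading it by eye. A small sketch, assuming the standard message wording `CRS-2676: Start of '<resource>' on '<node>' succeeded`; the function name `resource_started` is hypothetical.

```shell
# Reads `crsctl start crs -excl -nocrs` output on stdin and succeeds only if
# resource $1 (default: ora.asm) reported a successful start on node $2.
resource_started() {
    res="${1:-ora.asm}"
    node="$2"
    # -F: match the message as a fixed string, so dots in the name are literal.
    grep -qF "CRS-2676: Start of '${res}' on '${node}' succeeded"
}

# Usage:
#   crsctl start crs -excl -nocrs 2>&1 | tee /tmp/start.log
#   resource_started ora.asm rac1 < /tmp/start.log && echo "ASM is up"
```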
(3) Three new disks have been added and bound with udev. Check their status.
[root@rac1 ~]# su - grid
[grid@rac1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.3.0 Production on Fri Jul 5 17:41:49 2013
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> select group_number group#, disk_number disk#, OS_MB, state, path, header_status from v$asm_disk order by 1,2;
    GROUP#      DISK#      OS_MB STATE      PATH                 HEADER_STATUS
---------- ---------- ---------- ---------- -------------------- ----------------------
         0          0       1024 NORMAL     /dev/asm-diskc       CANDIDATE
         0          1       5120 NORMAL     /dev/asm-diskd       CANDIDATE
         0          2      20480 NORMAL     /dev/asm-diskb       CANDIDATE
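Only disks whose header status is CANDIDATE are free to go into a new disk group. As a rough sketch, the listing above can be filtered with awk; this assumes the six-column layout produced by the query in this step (GROUP#, DISK#, OS_MB, STATE, PATH, HEADER_STATUS), and `candidate_disks` is a hypothetical name.

```shell
# Reads the v$asm_disk listing on stdin and prints "path size_mb" for each
# disk whose HEADER_STATUS column is CANDIDATE. Header and separator lines
# are skipped because their first field is not a number.
candidate_disks() {
    awk '$1 ~ /^[0-9]+$/ && $6 == "CANDIDATE" { print $5, $3 }'
}

# Usage: save the query output to a file, then
#   candidate_disks < disks.txt
```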
(4) Create three disk groups. SYSTEMDG is for CRS and holds the OCR, the voting disk and the ASM instance SPFILE. The other two are for the database: DATADG holds the datafiles, controlfiles, redo logs and spfile; ARCLOGDG holds the archived logs.
SQL> create diskgroup SYSTEMDG external redundancy
  disk '/dev/asm-diskc'
  ATTRIBUTE 'compatible.rdbms' = '11.2', 'compatible.asm' = '11.2';
Diskgroup created.
SQL> create diskgroup DATADG external redundancy
  disk '/dev/asm-diskb'
  ATTRIBUTE 'compatible.rdbms' = '11.2', 'compatible.asm' = '11.2';
Diskgroup created.
SQL> create diskgroup ARCLOGDG external redundancy
  disk '/dev/asm-diskd'
  ATTRIBUTE 'compatible.rdbms' = '11.2', 'compatible.asm' = '11.2';
Diskgroup created.
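The three statements differ only in the group name and the disk path, so they can be generated instead of typed. A minimal sketch with a hypothetical function `create_dg_sql`; it covers only the single-disk, external-redundancy case used in this step.

```shell
# Prints a CREATE DISKGROUP statement for an external-redundancy group.
# $1 = disk group name, $2 = disk path, $3 = compatibility level (default 11.2).
create_dg_sql() {
    compat="${3:-11.2}"
    printf "create diskgroup %s external redundancy\n" "$1"
    printf "  disk '%s'\n" "$2"
    printf "  attribute 'compatible.rdbms' = '%s', 'compatible.asm' = '%s';\n" "$compat" "$compat"
}

# Usage (as the grid user, with the ASM environment set):
#   create_dg_sql SYSTEMDG /dev/asm-diskc | sqlplus -S "/ as sysasm"
```

Printing the statement first and piping it to sqlplus only after reviewing it keeps the destructive part of the step explicit.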
(5) Prepare to restore the OCR and voting disk. /etc/oracle/ocr.loc records the OCR location; change the value of ocrconfig_loc so that the OCR is restored into the new disk group.
[root@rac1 ~]# more /etc/oracle/ocr.loc
ocrconfig_loc=+DATA
local_only=FALSE
[root@rac1 ~]# vi /etc/oracle/ocr.loc
ocrconfig_loc=+SYSTEMDG
local_only=FALSE
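The same edit can be made non-interactively, which also leaves a copy of the old file behind. This is a sketch rather than the method used above (the article edits the file with vi); `set_ocr_location` is a hypothetical name and the `.bak` suffix is an arbitrary choice.

```shell
# Points ocrconfig_loc at a new location and keeps the old file as <file>.bak.
# $1 = path to ocr.loc, $2 = new location such as +SYSTEMDG.
set_ocr_location() {
    file="$1"
    new_loc="$2"
    cp -p "$file" "${file}.bak" || return 1
    # Rewrite only the ocrconfig_loc line; every other line is copied unchanged.
    sed "s|^ocrconfig_loc=.*|ocrconfig_loc=${new_loc}|" "${file}.bak" > "$file"
}

# Usage (as root): set_ocr_location /etc/oracle/ocr.loc +SYSTEMDG
```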
(6) Restore the OCR.
[root@rac1 ~]# ocrconfig -showbackup
PROT-26: Oracle Cluster Registry backup locations were retrieved from a local copy
rac1 2013/07/05 12:30:00 /u01/app/11.2.0/grid/cdata/rac-cluster/backup00.ocr
rac1 2013/07/05 08:30:00 /u01/app/11.2.0/grid/cdata/rac-cluster/backup01.ocr
rac1 2013/07/05 04:30:00 /u01/app/11.2.0/grid/cdata/rac-cluster/backup02.ocr
rac1 2013/07/05 00:29:59 /u01/app/11.2.0/grid/cdata/rac-cluster/day.ocr
rac1 2013/07/05 00:29:59 /u01/app/11.2.0/grid/cdata/rac-cluster/week.ocr
PROT-25: Manual backups for the Oracle Cluster Registry are not available
[root@rac1 ~]# ocrconfig -restore /u01/app/11.2.0/grid/cdata/rac-cluster/backup00.ocr
[root@rac1 ~]#
[root@rac1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
    Version                  : 3
    Total space (kbytes)     : 262120
    Used space (kbytes)      : 2840
    Available space (kbytes) : 259280
    ID                       : 59415097
    Device/File Name         : +SYSTEMDG
        Device/File integrity check succeeded
        Device/File not configured
        Device/File not configured
        Device/File not configured
        Device/File not configured
    Cluster registry integrity check succeeded
    Logical corruption check succeeded
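The backup restored above was simply the newest one in the list. As a sketch, that choice can be automated by sorting on the date and time columns; this assumes each backup line has the form `node YYYY/MM/DD HH:MM:SS path`, as in the listing in this step, and `latest_ocr_backup` is a hypothetical name. The newest backup is not always the right one to restore, for example if it was taken after the configuration was already damaged.

```shell
# Reads `ocrconfig -showbackup` output on stdin and prints the path of the
# most recent backup. Lines whose second field is not a YYYY/MM/DD date
# (such as the PROT-26 and PROT-25 messages) are ignored.
latest_ocr_backup() {
    awk '$2 ~ /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]$/ { print $2, $3, $4 }' |
        sort -r |
        head -n 1 |
        awk '{ print $3 }'
}

# Usage (as root):
#   ocrconfig -restore "$(ocrconfig -showbackup | latest_ocr_backup)"
```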
(7) Recreate the voting disk.
[root@rac1 ~]# crsctl replace votedisk +SYSTEMDG
CRS-4602: Failed 27 to add voting file afb0ca0f35684f1abfd43d5ec2dc1123.
Failed to replace voting disk group with +SYSTEMDG.
CRS-4000: Command Replace failed, or completed with errors.
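The error above occurs because, when the ASM disks are bound with udev, the default disk discovery path has to be changed to /dev/asm*. Change the ASM disk discovery string: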
[root@rac1 ~]# su - grid
[grid@rac1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.3.0 Production on Fri Jul 5 19:03:25 2013
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> show parameter asm_diskstring
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskstring                       string
SQL>
SQL>
SQL> alter system set asm_diskstring = '/dev/asm*';
System altered.
SQL> create spfile from memory;
create spfile from memory
*
ERROR at line 1:
ORA-00349: failure obtaining block size for
'+DATA/rac-cluster/asmparameterfile/registry.253.819922365'
ORA-15001: diskgroup "DATA" does not exist or is not mounted
SQL> create spfile='+SYSTEMDG' from memory;
File created.
SQL> startup force mount;
ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance
ASM instance started
Total System Global Area 283930624 bytes
Fixed Size 2227664 bytes
Variable Size 256537136 bytes
ASM Cache 25165824 bytes
ASM diskgroups mounted
Recreate the voting disk again; this time it succeeds.
[root@rac1 init]# crsctl replace votedisk +SYSTEMDG
Successful addition of voting disk 8ebb7a63accb4fa8bfa7ab65df7a8c8a.
Successfully replaced voting disk group with +SYSTEMDG.
CRS-4266: Voting file(s) successfully replaced
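The failure in this step came down to the discovery string not covering the udev device names. A quick way to sanity-check a pattern against a device path before setting it is the shell's own glob matching, sketched below. This is only an approximation of how ASM evaluates asm_diskstring: it handles a single pattern, not a comma-separated list, and `matches_diskstring` is a hypothetical name.

```shell
# Succeeds if the device path $2 matches the shell-style pattern $1.
# Only one pattern is handled; a comma-separated asm_diskstring is not parsed.
matches_diskstring() {
    case "$2" in
        $1) return 0 ;;   # unquoted on purpose so that $1 is treated as a glob
        *)  return 1 ;;
    esac
}

# Usage:
#   matches_diskstring '/dev/asm*' /dev/asm-diskc && echo "covered"
```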
(8) With the OCR and voting disk both restored, restart CRS in normal mode.
[root@rac1 ~]# crsctl stop has -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'
CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@rac1 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root@rac1 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[root@rac1 ~]#
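After `crsctl start crs` returns, the stack can take a while to come fully online, so the check usually has to be repeated. A sketch of a test that passes only when all four components report online, based on the message codes shown in the output of this step; `crs_all_online` is a hypothetical name.

```shell
# Reads `crsctl check crs` output on stdin; succeeds only if High Availability
# Services, Cluster Ready Services, Cluster Synchronization Services and
# Event Manager all report "is online".
crs_all_online() {
    out=$(cat)
    for code in CRS-4638 CRS-4537 CRS-4529 CRS-4533; do
        printf '%s\n' "$out" | grep -q "^${code}: .* is online" || return 1
    done
}

# Usage (as root), polling every 10 seconds:
#   until crsctl check crs | crs_all_online; do sleep 10; done
```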
Source: http://www.92to.com/bangong/2018/05-21/33818197.html