Oracle 11g DG手工switchover切换标准化流程
环境:RHEL 6.5 + Oracle GI 11.2.0.4 + Oracle DB 11.2.0.4
Primary RAC(2 nodes) + Standby RAC(2 nodes)
Oracle DG切换类型有两种:switchover和failover。对于switchover而言,是计划内的由DBA主动去执行的操作,所以它的操作步骤一定是可以形成标准化流程的。
本文就在我的实验环境下做一次基本的标准化switchover流程:
切换前准备:最好可以先关闭主备库RAC的其他节点
我这里是分别关闭主备库的第二个节点:
- PRIMARY NODE2'Instance
- [grid@jyrac2 ~]$ srvctl stop instance -d jyzhao -i jyzhao2
- STANDBY NODE2'Instance
- [grid@jystdrac2 ~]$ srvctl stop instance -d mynas -i jyzhao2
当然也可以SQLplus操作关闭其他节点的实例.
注:如果不关闭,正常切换时也会自动被关闭。只是为了防止某些环境有其他问题,手工先关闭其他实例可以方便排查。
1.主库切换为备库:
- ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY WITH SESSION SHUTDOWN;
观察主库alert日志:
- Sun Aug 13 09:54:53 2017
- alter database commit to switchover to standby with session shutdown
- ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY [Process Id: 13741] (jyzhao1)
- Waiting for all non-current ORLs to be archived...
- All non-current ORLs have been archived.
- Waiting for all FAL entries to be archived...
- All FAL entries have been archived.
- Waiting for potential Physical Standby switchover target to become synchronized...
- Active, synchronized Physical Standby switchover target has been identified
- Sun Aug 13 09:54:56 2017
- Errors in file /opt/app/oracle/diag/rdbms/jyzhao/jyzhao1/trace/jyzhao1_j000_29834.trc:
- ORA-12012: error on auto execute of job 3
- ORA-16456: switchover to standby in progress or completed
- Switchover End-Of-Redo Log thread 1 sequence 182 has been fixed
- Switchover: Primary highest seen SCN set to 0x0.0x456df2
- ARCH: Noswitch archival of thread 1, sequence 182
- ARCH: End-Of-Redo Branch archival of thread 1 sequence 182
- ARCH: LGWR is actively archiving destination LOG_ARCHIVE_DEST_2
- ARCH: Standby redo logfile selected for thread 1 sequence 182 for destination LOG_ARCHIVE_DEST_2
- Archived Log entry 411 added for thread 1 sequence 182 ID 0x958da9ee dest 1:
- ARCH: Archiving is disabled due to current logfile archival
- Primary will check for some target standby to have received alls redo
- Final check for a synchronized target standby. Check will be made once.
- Sun Aug 13 09:54:59 2017
- Process (ospid 4297) is suspended due to switchover to physical standby operation.
- LOG_ARCHIVE_DEST_2 is a potential Physical Standby switchover target
- Active, synchronized target has been identified
- Target has also received all redo
- Backup controlfile written to trace file /opt/app/oracle/diag/rdbms/jyzhao/jyzhao1/trace/jyzhao1_ora_13741.trc
- Clearing standby activation ID 2509089262 (0x958da9ee)
- The primary database controlfile was created using the
- 'MAXLOGFILES 192' clause.
- There is space for up to 188 standby redo logfiles
- Use the following SQL commands on the standby database to create
- standby redo logfiles that match the primary database:
- ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 52428800;
- ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 52428800;
- ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 52428800;
- ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 52428800;
- ALTER DATABASE ADD STANDBY LOGFILE 'srl5.f' SIZE 52428800;
- Archivelog for thread 1 sequence 182 required for standby recovery
- Switchover: Primary controlfile converted to standby controlfile succesfully.
- Switchover: Complete - Database shutdown required
- USER (ospid: 13741): terminating the instance
- Sun Aug 13 09:55:00 2017
- ORA-1092 : opitsk aborting process
- Instance terminated by USER, pid = 13741
- Completed: alter database commit to switchover to standby with session shutdown
- Shutting down instance (abort)
- License high water mark = 11
- Sun Aug 13 09:55:01 2017
- Instance shutdown complete
主要注意到正常应该有“End-Of-Redo Branch archival”字样,并且最终成功切换到standby,最后数据库是关闭的。
操作之前,可以看alert日志,也可以使用SQL查询是否可以切换:
- select OPEN_MODE, DATABASE_ROLE, SWITCHOVER_STATUS, FORCE_LOGGING, DATAGUARD_BROKER, GUARD_STATUS from v$database;
2.备库切换为主库:
- ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
查看备库alert日志:
- Sun Aug 13 09:58:30 2017
- alter database commit to switchover to primary
- ALTER DATABASE SWITCHOVER TO PRIMARY (jyzhao1)
- Maximum wait for role transition is 15 minutes.
- Switchover: Media recovery is still active
- Role Change: Canceling MRP - no more redo to apply
- Sun Aug 13 09:58:31 2017
- MRP0: Background Media Recovery cancelled with status 16037
- Errors in file /opt/app/oracle/diag/rdbms/mynas/jyzhao1/trace/jyzhao1_mrp0_7745.trc:
- ORA-16037: user requested cancel of managed recovery operation
- Sun Aug 13 09:58:31 2017
- Managed Standby Recovery not using Real Time Apply
- Recovery interrupted!
- MRP0: Background Media Recovery process shutdown (jyzhao1)
- Role Change: Canceled MRP
- All dispatchers and shared servers shutdown
- CLOSE: killing server sessions.
- CLOSE: all sessions shutdown successfully.
- Sun Aug 13 09:58:34 2017
- SMON: disabling cache recovery
- Backup controlfile written to trace file /opt/app/oracle/diag/rdbms/mynas/jyzhao1/trace/jyzhao1_ora_7669.trc
- SwitchOver after complete recovery through change 4550130
- Online log +DATA/mynas/onlinelog/group_1.266.951608731: Thread 1 Group 1 was previously cleared
- Online log +FRA/mynas/onlinelog/group_1.257.951608737: Thread 1 Group 1 was previously cleared
- Online log +DATA/mynas/onlinelog/group_2.267.951608745: Thread 1 Group 2 was previously cleared
- Online log +FRA/mynas/onlinelog/group_2.258.951608751: Thread 1 Group 2 was previously cleared
- Online log +DATA/mynas/onlinelog/group_3.268.951608757: Thread 2 Group 3 was previously cleared
- Online log +FRA/mynas/onlinelog/group_3.259.951608763: Thread 2 Group 3 was previously cleared
- Online log +DATA/mynas/onlinelog/group_4.269.951608769: Thread 2 Group 4 was previously cleared
- Online log +FRA/mynas/onlinelog/group_4.260.951608775: Thread 2 Group 4 was previously cleared
- Standby became primary SCN: 4550128
- AUDIT_TRAIL initialization parameter is changed back to its original value as specified in the parameter file.
- Switchover: Complete - Database mounted as primary
- Completed: alter database commit to switchover to primary
- Sun Aug 13 09:59:07 2017
- ARC1: Becoming the 'no SRL' ARCH
最后注意到备库成功切换到主库,启动到mount状态。
上面已经完成了切换,这一步只是把新主库open,新备库启动并开启MRP:
- --NEW PRIMARY:
- ALTER DATABASE OPEN;
- --NEW STANDBY:
- STARTUP
- ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;
确定同步正常,启动其他节点:
- PRIMARY NODE2'Instance
- [grid@jyrac2 ~]$ srvctl stop instance -d jyzhao -i jyzhao2
- STANDBY NODE2'Instance
- [grid@jystdrac2 ~]$ srvctl stop instance -d mynas -i jyzhao2
至此,完成Oracle 11g标准化switchover切换操作。
我这里发现一个小问题,就是切换后发现无法实时同步,最终发现是备库的配置还是ARCH:
- log_archive_dest_2 string SERVICE=jyzhao ARCH VALID_FOR=
- (ONLINE_LOGFILES,PRIMARY_ROLE)
- DB_UNIQUE_NAME=jyzhao
修改备库的配置,去掉ARCH,也就是使用LGWR传输即可:
- alter SYSTEM SET log_archive_dest_2 = 'SERVICE=jyzhao VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=jyzhao'sid = '*'SCOPE = BOTH;
关于使用这两种方式传输日志的区别可参考《ARCH和LGWR进程同步DG日志的区别》。
来源: http://www.linuxidc.com/Linux/2017-08/146331.htm