Listener使用ip导致dgbroker报ORA-12514

有一套adg环境,数据正常同步,同时也部署配置了dgbroker管理工具, 但是在运行一会后会出现TNS的报错,在做swithover和failover也会失败,根据alert日志可以发现做切换的时候会去访问db_unique_name_DGB的TNS,而查看主备起的监听,却并没有发现这样一个service,经过查资料,这个监听是会在dgbroker服务起来的时候动态注册的,动态注册的时候是去找的隐含参数__dg_broker_service_names

下面是一个正常的环境下的情况

1
2
3
4
5
6
SQL> @pd dg_broker_service
Show all parameters and session values from x$ksppi/x$ksppcv...

NUM N_HEX NAME VALUE DESCRIPTION
---------- ----- -------------------------------------------------------- ------------------------------ ---------------------------------------------------------------------------------------------------
775 307 __dg_broker_service_names ora12c_DGB, ora12c_CFG service names for broker use

正常情况下已经启动的监听

阅读更多

12c新特性:Far Sync

12c版本dataguard中引入了一个新特性:far sync备库,它作为级联备库主要作为redo log仓库给终端库使用。它不包含任何数据文件,在far sync备库上只有日志传输服务是启用的。

far sync备库的好处主要在于它可以处于最大可用模式下作为主库的一个本地归档日志库,而其后的物理或者逻辑备库可以放在较远的其他地方。当之前的主备库距离较远时,可以先通过同步的方式将日志快速传到far sync备库,而后从far sync备库将日志通过异步的方式传到物理备库,这样既兼顾了效率同时也保证了安全。

Far Sync工作原理

原理其实很简单,farsync端接收到从主库同步过来的数据,然后将其异步推送给最多29个远端目的地。

far sync并不是一个标准的数据库,因为它只包含一些必要的文件,比如控制文件、spfile和standby redo日志。这个库必须要放在离主库较近的地方,这样才能确保较低的延时。为了高可用性,一般建议不要放在同一个物理位置。

如果主库出现故障,则跟以往一样实施failover,而far sync实例的存在保证了在failover过程中数据的安全性和完整性。

测试环境

Host Role
HostA Primary
HostB Far Sync
HostC Standby

这里已经有一个正在运行的主备库,为了简化,这里就只在HostB上新建一个far sync实例。

创建Far Sync实例

从主库上创建所需的控制文件和pfile

1
2
3
alter database create far sync instance controlfile as '/u01/app/oracle/control01.ctl';

create pfile='/u01/app/oracle/initora12c.ora' from spfile;

将文件传到HostB

1
scp control01.ctl initora12c.ora racnode1:/u01/app/oracle/

编辑pfile,修改control_files参数的路径

1
*.control_files='/u01/app/oracle/oradata/ORA12C/controlfile/control01.ctl'

创建相关目录,并将文件移到到合适位置

1
2
3
4
5
6
7
mkdir -p /u01/app/oracle/fast_recovery_area/ora12c/ORA12C
mkdir -p /u01/app/oracle/admin/ora12c/adump
mkdir -p /u01/arch/ora12c
mkdir -p /u01/app/oracle/oradata/ORA12C/{controlfile,onlinelog}

mv control01.ctl oradata/ORA12C/controlfile/
mv initora12c.ora product/12.2.0/dbhome_1/dbs/

通过spfile启动farsync实例,然后修改成mount状态,创建standby redo log

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
SQL> create spfile from pfile;

File created.

SQL> startup mount
ORACLE instance started.

Total System Global Area 1946157056 bytes
Fixed Size 8794264 bytes
Variable Size 587206504 bytes
Database Buffers 1342177280 bytes
Redo Buffers 7979008 bytes
Database mounted.

set linesize 300
col member for a100
select GROUP#,MEMBER,TYPE from v$logfile;

GROUP# MEMBER TYPE
---------- ---------------------------------------------------------------------------------------------------- -------
3 /u01/app/oracle/oradata/ORA12C/onlinelog/o1_mf_3_gcm27r5h_.log ONLINE
4 /u01/app/oracle/oradata/ORA12C/onlinelog/o1_mf_4_gcwxsqn3_.log STANDBY
2 /u01/app/oracle/oradata/ORA12C/onlinelog/o1_mf_2_gcm27r55_.log ONLINE
5 /u01/app/oracle/oradata/ORA12C/onlinelog/o1_mf_5_gcwxswdf_.log STANDBY
1 /u01/app/oracle/oradata/ORA12C/onlinelog/o1_mf_1_gcm27r4t_.log ONLINE
6 /u01/app/oracle/oradata/ORA12C/onlinelog/o1_mf_6_gcwxt172_.log STANDBY
7 /u01/app/oracle/oradata/ORA12C/onlinelog/o1_mf_7_gcwxt5w8_.log STANDBY

Dataguard 配置

三个库的tnsnames.ora都要修改成一样

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
ORA12C =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = testyum)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = ora12c)
)
)

ora12c_dg =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = stbyum)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = ora12c)
)
)

ora12c_fs =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = racnode1)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = ora12c)
)
)

修改far sync实例的db_unique_name,启用dg broker

1
2
3
4
5
6
alter system set db_unique_name='ORA12C_FS' scope=spfile;

alter system set dg_broker_start=true;

-- 重启实例
startup mount

连到主库的dgmgrl,增加far sync实例

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
DGMGRL> connect sys/oracle
Connected to "ora12c"
Connected as SYSDG.
DGMGRL> add FAR_SYNC ora12c_fs as connect identifier is ora12c_fs;
far sync instance "ora12c_fs" added

DGMGRL> show configuration verbose;

Configuration - ora12ccfg

Protection Mode: MaxPerformance
Members:
ora12c - Primary database
ora12c_dg - Physical standby database
ora12c_fs - Far sync instance (disabled)

Properties:
FastStartFailoverThreshold = '30'
OperationTimeout = '30'
TraceLevel = 'USER'
FastStartFailoverLagLimit = '30'
CommunicationTimeout = '180'
ObserverReconnect = '0'
FastStartFailoverAutoReinstate = 'TRUE'
FastStartFailoverPmyShutdown = 'TRUE'
BystandersFollowRoleChange = 'ALL'
ObserverOverride = 'FALSE'
ExternalDestination1 = ''
ExternalDestination2 = ''
PrimaryLostWriteAction = 'CONTINUE'
ConfigurationWideServiceName = 'ora12c_CFG'

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

目前far sync还未激活,需要设置redo路由再激活

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
edit database ora12c set property redoroutes='(local:ora12c_fs sync)';

edit far_sync ora12c_fs set property redoroutes='(ora12c:ora12c_dg)';

edit configuration set protection mode as MaxAvailability;

enable far_sync ora12c_fs;

DGMGRL> show configuration verbose;

Configuration - ora12ccfg

Protection Mode: MaxAvailability
Members:
ora12c - Primary database
ora12c_fs - Far sync instance
ora12c_dg - Physical standby database

Properties:
FastStartFailoverThreshold = '30'
OperationTimeout = '30'
TraceLevel = 'USER'
FastStartFailoverLagLimit = '30'
CommunicationTimeout = '180'
ObserverReconnect = '0'
FastStartFailoverAutoReinstate = 'TRUE'
FastStartFailoverPmyShutdown = 'TRUE'
BystandersFollowRoleChange = 'ALL'
ObserverOverride = 'FALSE'
ExternalDestination1 = ''
ExternalDestination2 = ''
PrimaryLostWriteAction = 'CONTINUE'
ConfigurationWideServiceName = 'ora12c_CFG'

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

这里设置了从主库到far sync是通过sync的方式,而far sync的redo路由设置成从primary获取的redo都会发送到物理备库。

比较三个库生成的log_archive_dest_n参数

DATABASE log_archive_dest_1 log_archive_dest_2
Pri LOCATION=/u01/arch/ora12c VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=ora12c service=”ora12c_fs”, SYNC AFFIRM delay=0 optional compression=disable max_failure=0 max_connections=1 reopen=300 db_unique_name=”ora12c_fs” net_timeout=30, valid_for=(online_logfile,all_roles)
Far Sync location=”USE_DB_RECOVERY_FILE_DEST”, valid_for=(STANDBY_LOGFILE,STANDBY_ROLE) service=”ora12c_dg”, ASYNC NOAFFIRM delay=0 optional compression=disable max_failure=0 max_connections=1 reopen=300 db_unique_name=”ora12c_dg” net_timeout=30, valid_for=(standby_logfile,all_roles)
Stb LOCATION=/u01/arch/ora12c VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=ora12c_dg

可以看到Pri的日志是只会传送到far sync实例,所以做个测试看是否数据能正常同步

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-- primary
sys@ORA12C> create table test as select * from dba_tables where 1=2;

Table created.

sys@ORA12C> drop table test purge;

Table dropped.

-- standby
sys@ORA12C> select * from test;

no rows selected

sys@ORA12C> /
select * from test
*
ERROR at line 1:
ORA-00942: table or view does not exist

配置全部结束,far sync主要是针对较远距离的传输场景,兼顾了数据的安全性也减轻了主库的压力,而far sync本身也比较轻量,不会造成瓶颈。

12c rac to rac 物理备库最佳实践

这篇文章主要目的是详细记录在12cR2环境中,配置主备都是rac环境的dataguard,下图列出了本次将要创建的环境

准备步骤说明

  • 通过vmware创建虚拟机
  • 总共4台,两台作为主库,两台作为备库
  • 主库安装12cR2 rac环境,创建实例
  • 备库安装12cR2 grid软件和database软件,不创建实例

测试环境介绍

画图的时候手误,scan hostname两节点是一致的

主库:

备库:

所用软件

  • VMWARE 14 pro
  • Oracle Grid Infrastructure 12c R2
  • Oracle Database 12c R2
  • Oracle Linux 7.6

Dataguard 配置

  • Protection mode: Maximum Performance
  • Fast-start failover: Disabled
  • Standby database type: Physical Standby

Rac环境中dg注意事项

rac环境中配置dg与单机中的配置有所不同:

  • 在Rman中执行duplicate database for standby from active database命令时,只会在备库创建一个单实例,你需要手动设置rac参数包括INSTANCE_NUMBER和INSTANCE_NAME等,在另一个主机上启用第二个实例,然后添加新创建的数据库到OCR中
  • INSTANCE_NUMBERINSTANCE_NAME参数必须在第二个实例当中手动修改
  • 不管主库或者备库都可以是单机的非集群环境
  • standby redo logfiles必须在共享磁盘上,每个备库实例必须配置归档standby redo log到同样目录
  • 如果没设置OMF,则LOG_ARCHIVE_FORMAT参数必须设置,需要加上%t或者T%来确保名称的唯一性

安装大纲

下面列出了安装的标准化步骤

这里默认主备库都已经安装好了所需环境,如果对12cR2 rac安装不清楚的可以参考我之前的文章:在vmware linux 7.5安装oracle 12c rac

准备主库环境

  • 开启归档
  • 配置ARCHIVELOG DELETION POLICY策略
  • 开启forced logging
  • 配置standby redo log files
  • 配置主库相关参数
  • 开启数据库闪回
  • 配置CONTROL_FILE_RECORD_KEEP_TIME参数
  • 配置tnsnames.ora

准备备库环境

  • 创建相关目录
  • 创建静态监听
  • 创建init.ora
  • 拷贝主库password文件
  • 配置tnsnames.ora

创建物理备库

  • 通过rman duplicate在线创建
  • 设置备库参数
  • 开启redo apply
  • 检查物理备库同步情况

检查环境

确保每个节点都能访问其他所有地址

主库

1
2
3
4
5
6
7
8
9
10
192.168.146.121  racnode1 racnode1.oracle.com
192.168.146.123 racnode1-vip

192.168.146.122 racnode2 racnode2.oracle.com
192.168.146.124 racnode2-vip

192.168.146.125 racnode-scan

10.10.10.121 racnode1-priv
10.10.10.122 racnode2-priv

备库

1
2
3
4
5
6
7
8
9
10
192.168.146.131  stbracnode1 stbracnode1.oracle.com
192.168.146.133 stbracnode1-vip

192.168.146.132 stbracnode2 stbracnode2.oracle.com
192.168.146.134 stbracnode2-vip

192.168.146.135 stbracnode-scan

10.10.10.131 stbracnode1-priv
10.10.10.132 stbracnode2-priv

crs状态

1
crs_stat -t

检查服务状态

1
srvctl status racnode -n xxx

准备主库环境

检查数据库是否是归档模式,如果不是,则设置成归档模式,归档日志存放在+FRA下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
sys@RAC1> select log_mode from v$database;

LOG_MODE
------------
NOARCHIVELOG

sys@RAC1> show parameter log_archive_dest_1

PARAMETER_NAME TYPE VALUE
------------------------------------------------------------ ----------- ---------------------------------------------------------------------------
log_archive_dest_1 string

ALTER SYSTEM SET LOG_ARCHIVE_DEST_1 = 'LOCATION=USE_DB_RECOVERY_FILE_DEST' scope=BOTH;

srvctl stop database -d rac -o immediate
srvctl start database -d rac -o mount

sqlplus / as sysdba
ALTER DATABASE ARCHIVELOG;

sys@RAC1> alter system switch logfile;

System altered.

sys@RAC1> select name from v$archived_log;

NAME
---------------------------------------------------------------------------
+FRA/RAC/ARCHIVELOG/2019_04_17/thread_1_seq_2.261.1005831111

开启数据库的强制日志选项

1
2
3
4
5
6
7
8
9
sys@RAC1> SELECT FORCE_LOGGING FROM V$DATABASE;

FORCE_LOGGING
---------------------------------------
NO

sys@RAC1> ALTER DATABASE FORCE LOGGING;

Database altered.

确定所需的standby redo logfile数量,可以通过以下公式得出

(maximum number of redo logfiles +1) * number of threads

根据下面语句得出的结果,可以看到需要(4+1)*2=10组standby redo logfile

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
SELECT GROUP#, BYTES/1024/1024 MB FROM V$LOG ORDER BY 1,2;
col INSTANCE format a5

GROUP# MB
---------- ----------
1 200
2 200
3 200
4 200

sys@RAC1> sys@RAC1> SELECT DISTINCT THREAD#, INSTANCE FROM GV$THREAD ORDER BY 1,2;

THREAD# INSTA
---------- -----
1 rac1
2 rac2

创建standby redo logfile

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 '+FRA' SIZE 200M ;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 '+FRA' SIZE 200M ;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 '+FRA' SIZE 200M ;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 '+FRA' SIZE 200M ;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 '+FRA' SIZE 200M ;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 2 '+FRA' SIZE 200M ;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 2 '+FRA' SIZE 200M ;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 2 '+FRA' SIZE 200M ;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 2 '+FRA' SIZE 200M ;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 2 '+FRA' SIZE 200M ;

--检查
SELECT INST_ID, GROUP#,THREAD#,SEQUENCE#,ARCHIVED,STATUS
FROM GV$STANDBY_LOG ORDER BY 2,1;

INST_ID GROUP# THREAD# SEQUENCE# ARC STATUS
---------- ---------- ---------- ---------- --- ----------
1 5 1 0 YES UNASSIGNED
2 5 1 0 YES UNASSIGNED
1 6 1 0 YES UNASSIGNED
2 6 1 0 YES UNASSIGNED
1 7 1 0 YES UNASSIGNED
2 7 1 0 YES UNASSIGNED
1 8 1 0 YES UNASSIGNED
2 8 1 0 YES UNASSIGNED
1 9 1 0 YES UNASSIGNED
2 9 1 0 YES UNASSIGNED
1 10 2 0 YES UNASSIGNED
2 10 2 0 YES UNASSIGNED
1 11 2 0 YES UNASSIGNED
2 11 2 0 YES UNASSIGNED
1 12 2 0 YES UNASSIGNED
2 12 2 0 YES UNASSIGNED
1 13 2 0 YES UNASSIGNED
2 13 2 0 YES UNASSIGNED
1 14 2 0 YES UNASSIGNED
2 14 2 0 YES UNASSIGNED

在主集群上两台机器上均配置tnsnames.ora,这里备库只填写一号节点

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
RAC =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = racnode-scan)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = rac)
)
)

STBRAC =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = stbracnode1)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = rac)
)
)

在备集群一号节点上配置tnsnames.ora,添加下面的连接描述

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
RAC =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = racnode-scan)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = rac)
)
)

STBRAC =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = stbracnode1)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = rac)(UR=A) >>>>==== UR=A能让监听处于blocked等状态时客户端连接上来
)
)

在备集群二号节点上配置tnsnames.ora

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
RAC =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = racnode-scan)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = rac)
)
)

STBRAC =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = stbracnode2)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = rac)
)
)

在所有机器上测试连接

1
2
tnsping rac
tnsping stbrac

在主库上配置dataguard相关参数

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
show parameter DB_UNIQUE_NAME

# 注意参数大小写敏感:
ALTER SYSTEM SET LOG_ARCHIVE_CONFIG='DG_CONFIG=(rac,stbrac)' SCOPE=BOTH SID='*';
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2= 'SERVICE=stbrac ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=stbrac' SCOPE=BOTH SID='*';

alter system set log_archive_dest_state_1 = ENABLE SCOPE=BOTH SID='*';
alter system set log_archive_dest_state_2 = ENABLE SCOPE=BOTH SID='*';

# 应该返回EXCLUSIVE
show parameter REMOTE_LOGIN_PASSWORDFILE

# FAL = fetch archive log
ALTER SYSTEM SET FAL_SERVER='stbrac' SCOPE=BOTH SID='*';
ALTER SYSTEM SET FAL_CLIENT='rac' SCOPE=BOTH SID='*';

# 如果主备库文件目录一致,则不用设置
show parameter DB_FILE_NAME_CONVERT
show parameter STANDBY_FILE_MANAGEMENT
ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT='AUTO' SCOPE=BOTH SID='*';

# 设置ARCn进程最大值 ( default 4):
ALTER SYSTEM SET LOG_ARCHIVE_MAX_PROCESSES=8 SCOPE=BOTH SID='*';

主库开启数据库闪回

1
2
ALTER SYSTEM SET DB_FLASHBACK_RETENTION_TARGET =2880 scope=BOTH;
ALTER DATABASE FLASHBACK ON;

在所有节点配置sqlnet.ora,添加下面配置

1
NAMES.DIRECTORY_PATH= (TNSNAMES, EZCONNECT)

准备备库环境

在备集群的一号节点上,创建相关目录。因为都采用的ASM存储,所以要进到ASM目录下去创建

1
2
3
4
5
6
7
su - grid
asmcmd
cd DATA
mkdir stbrac

-- 在两个节点上都创建目录
mkdir -p /u01/app/oracle/admin/stbrac/{adump,cdump}

在stbracnode1上,创建initstbrac1.ora参数文件,先只用填写DB_NAME即可

1
echo 'DB_NAME=rac' > $ORACLE_HOME/dbs/initstbrac1.ora

在racnode1上,将密码文件复制到备集群的两个节点。密码文件目前存放在ASM当中,通过asmcmd工具拷贝到文件系统当中,然后传输到备集群

1
2
3
4
5
6
7
8
9
10
11
12
13
14
su - grid
asmcmd pwget --dbuniquename rac

asmcmd pwcopy +DATA/RAC/PASSWORD/pwdrac.256.1005822021 /tmp/orapwstbrac

# racnode1
scp /tmp/orapwstbrac stbracnode1:/u01/app/oracle/product/12.2.0/dbhome_1/dbs/orapwstbrac1
scp /tmp/orapwstbrac stbracnode2:/u01/app/oracle/product/12.2.0/dbhome_1/dbs/orapwstbrac2

# stbracnode1
chown oracle:oinstall /u01/app/oracle/product/12.2.0/dbhome_1/dbs/orapwstbrac1

# stbracnode2
chown oracle:oinstall /u01/app/oracle/product/12.2.0/dbhome_1/dbs/orapwstbrac2

在stbracnode1上创建静态监听

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
su - grid
cp listener.ora listener.ora.bak
vim listener.ora

--添加下面这段
SID_LIST_LISTENER=
(SID_LIST=
(SID_DESC=
(GLOBAL_DBNAME=rac)
(SID_NAME=stbrac1)
(ORACLE_HOME=/u01/app/oracle/product/12.2.0/dbhome_1)
)
)

-- 重启监听
srvctl stop listener
srvctl start listener
lsnrctl services | grep stbrac

创建备库

启动stbracnode1上的实例到nomount状态

1
2
3
4
5
6
7
su - oracle
sqlplus / as sysdba

export ORACLE_SID=stbrac1
startup nomount pfile='/u01/app/oracle/product/12.2.0/dbhome_1/dbs/initstbrac1.ora';

create spfile from pfile='/u01/app/oracle/product/12.2.0/dbhome_1/dbs/initstbrac1.ora';

通过rman duplicate来创建备库,这个命令可以从主库执行也可以从备库执行。

如果duplicate命令在关掉辅助实例以后失败了,你必须重新关闭和启动备库,而且重新执行之前要先删掉你所建好的spfile

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
rman target sys/oracle@rac auxiliary sys/oracle@stbrac

run {
allocate channel c1 type disk;
allocate channel c2 type disk;
allocate auxiliary channel stbc1 type disk;
DUPLICATE TARGET DATABASE FOR STANDBY FROM ACTIVE DATABASE
SPFILE
set 'db_unique_name'='stbrac'
set instance_name='stbrac1'
set instance_number='1'
set db_create_online_log_dest_1='+FRA'
set db_create_online_log_dest_2='+DATA'
set db_recovery_file_dest='+FRA'
set audit_file_dest='/u01/app/oracle/admin/stbrac/adump'
set core_dump_dest='/u01/app/oracle/admin/stbrac/cdump'
nofilenamecheck
dorecover;
}

详细日志:点击这里

检查备库实例是否运行正常

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
sys@STBRAC1> SELECT NAME, DB_UNIQUE_NAME, DATABASE_ROLE, OPEN_MODE FROM GV$DATABASE;

NAME DB_UNIQUE_NAME DATABASE_ROLE OPEN_MODE
--------- ------------------------------ ---------------- --------------------
RAC stbrac PHYSICAL STANDBY MOUNTED

COL DATAFILE FORMAT a100
SELECT INST_ID, NAME DATAFILE FROM GV$DATAFILE;

INST_ID DATAFILE
---------- ----------------------------------------------------------------------------------------------------
1 +DATA/STBRAC/DATAFILE/system.263.1005926817
1 +DATA/STBRAC/DATAFILE/sysaux.260.1005926819
1 +DATA/STBRAC/DATAFILE/undotbs1.301.1005926851
1 +DATA/STBRAC/86B2264711E008FCE0537992A8C0E982/DATAFILE/system.290.1005926841
1 +DATA/STBRAC/86B2264711E008FCE0537992A8C0E982/DATAFILE/sysaux.291.1005926833
1 +DATA/STBRAC/DATAFILE/users.284.1005926855
1 +DATA/STBRAC/86B2264711E008FCE0537992A8C0E982/DATAFILE/undotbs1.262.1005926847
1 +DATA/STBRAC/DATAFILE/undotbs2.300.1005926855
1 +DATA/STBRAC/86B2433955FB2449E0537992A8C02B22/DATAFILE/system.264.1005926841
1 +DATA/STBRAC/86B2433955FB2449E0537992A8C02B22/DATAFILE/sysaux.293.1005926833
1 +DATA/STBRAC/86B2433955FB2449E0537992A8C02B22/DATAFILE/undotbs1.261.1005926849
1 +DATA/STBRAC/86B2433955FB2449E0537992A8C02B22/DATAFILE/undo_2.303.1005926849
1 +DATA/STBRAC/86B2433955FB2449E0537992A8C02B22/DATAFILE/users.283.1005926855
1 +DATA/STBRAC/DATAFILE/test.302.1005926851


sys@STBRAC1> SELECT INST_ID,GROUP#,THREAD#,SEQUENCE#,ARCHIVED,STATUS FROM GV$STANDBY_LOG ORDER BY 1,2,3;

INST_ID GROUP# THREAD# SEQUENCE# ARC STATUS
---------- ---------- ---------- ---------- --- ----------
1 5 1 0 YES UNASSIGNED
1 6 1 0 YES UNASSIGNED
1 7 1 0 YES UNASSIGNED
1 8 1 0 YES UNASSIGNED
1 9 1 0 YES UNASSIGNED
1 10 2 0 YES UNASSIGNED
1 11 2 0 YES UNASSIGNED
1 12 2 0 YES UNASSIGNED
1 13 2 0 YES UNASSIGNED
1 14 2 0 YES UNASSIGNED


sys@STBRAC1> SELECT TYPE, count(*) FROM GV$LOGFILE GROUP BY TYPE;

TYPE COUNT(*)
------- ----------
ONLINE 8
STANDBY 20

可以看到standby redo logfile 有很多重复,删除重复项

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
column member format a50
select group#, member from v$logfile where type='STANDBY' order by 1,2;

GROUP# MEMBER
---------- --------------------------------------------------
5 +DATA/STBRAC/ONLINELOG/group_5.278.1005926909
5 +FRA/STBRAC/ONLINELOG/group_5.280.1005926905
6 +DATA/STBRAC/ONLINELOG/group_6.277.1005926915
6 +FRA/STBRAC/ONLINELOG/group_6.279.1005926911
7 +DATA/STBRAC/ONLINELOG/group_7.276.1005926921
7 +FRA/STBRAC/ONLINELOG/group_7.278.1005926917
8 +DATA/STBRAC/ONLINELOG/group_8.275.1005926925
8 +FRA/STBRAC/ONLINELOG/group_8.277.1005926923
9 +DATA/STBRAC/ONLINELOG/group_9.274.1005926931
9 +FRA/STBRAC/ONLINELOG/group_9.276.1005926929
10 +DATA/STBRAC/ONLINELOG/group_10.273.1005926937
10 +FRA/STBRAC/ONLINELOG/group_10.272.1005926935
11 +DATA/STBRAC/ONLINELOG/group_11.272.1005926943
11 +FRA/STBRAC/ONLINELOG/group_11.273.1005926939
12 +DATA/STBRAC/ONLINELOG/group_12.271.1005926947
12 +FRA/STBRAC/ONLINELOG/group_12.295.1005926945
13 +DATA/STBRAC/ONLINELOG/group_13.314.1005926953
13 +FRA/STBRAC/ONLINELOG/group_13.296.1005926951
14 +DATA/STBRAC/ONLINELOG/group_14.315.1005926959
14 +FRA/STBRAC/ONLINELOG/group_14.297.1005926957

set linesize 150
set pagesize 100
select 'alter database drop standby logfile member ''' || member || ''';'
from v$logfile where type='STANDBY' and member like '+DATA%';

'ALTERDATABASEDROPSTANDBYLOGFILEMEMBER'''||MEMBER||''';'
------------------------------------------------------------------------------------------------------------------------------------------------------
alter database drop standby logfile member '+DATA/STBRAC/ONLINELOG/group_5.278.1005926909';
alter database drop standby logfile member '+DATA/STBRAC/ONLINELOG/group_6.277.1005926915';
alter database drop standby logfile member '+DATA/STBRAC/ONLINELOG/group_7.276.1005926921';
alter database drop standby logfile member '+DATA/STBRAC/ONLINELOG/group_8.275.1005926925';
alter database drop standby logfile member '+DATA/STBRAC/ONLINELOG/group_9.274.1005926931';
alter database drop standby logfile member '+DATA/STBRAC/ONLINELOG/group_10.273.1005926937';
alter database drop standby logfile member '+DATA/STBRAC/ONLINELOG/group_11.272.1005926943';
alter database drop standby logfile member '+DATA/STBRAC/ONLINELOG/group_12.271.1005926947';
alter database drop standby logfile member '+DATA/STBRAC/ONLINELOG/group_13.314.1005926953';
alter database drop standby logfile member '+DATA/STBRAC/ONLINELOG/group_14.315.1005926959';

-- 确认删除完毕
sys@STBRAC1> SELECT GROUP#, MEMBER FROM V$LOGFILE WHERE TYPE='STANDBY' ORDER BY 1,2;

GROUP# MEMBER
---------- --------------------------------------------------
5 +FRA/STBRAC/ONLINELOG/group_5.280.1005926905
6 +FRA/STBRAC/ONLINELOG/group_6.279.1005926911
7 +FRA/STBRAC/ONLINELOG/group_7.278.1005926917
8 +FRA/STBRAC/ONLINELOG/group_8.277.1005926923
9 +FRA/STBRAC/ONLINELOG/group_9.276.1005926929
10 +FRA/STBRAC/ONLINELOG/group_10.272.1005926935
11 +FRA/STBRAC/ONLINELOG/group_11.273.1005926939
12 +FRA/STBRAC/ONLINELOG/group_12.295.1005926945
13 +FRA/STBRAC/ONLINELOG/group_13.296.1005926951
14 +FRA/STBRAC/ONLINELOG/group_14.297.1005926957

设置备库的参数

1
2
3
4
5
ALTER SYSTEM SET FAL_SERVER=rac SCOPE=BOTH SID='*';
ALTER SYSTEM SET FAL_CLIENT=stbrac SCOPE=BOTH SID='*';
ALTER SYSTEM SET LOG_ARCHIVE_CONFIG='DG_CONFIG=(rac,stbrac)' SID='*';
ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT=AUTO SID='*';
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=rac ASYNC DB_UNIQUE_NAME=rac VALID_FOR=(ONLINE_LOGFILE,PRIMARY_ROLE)' SCOPE=BOTH SID='*';

注册备库到ocr

1
2
3
4
su - oracle 
srvctl add database -d stbrac -o $ORACLE_HOME -role physical_standby -startoption mount -diskgroup DATA,FRA
srvctl add instance -d stbrac -i stbrac1 -n stbracnode1
srvctl add instance -d stbrac -i stbrac2 -n stbracnode2

将备库的spfile从文件系统中导入asm

通过pfile生成spfile

1
2
create pfile from spfile;
create spfile='+data/stbrac/PARAMETERFILE/spfilestbrac.ora' from pfile;

修改pfile指向spfile

1
2
3
4
5
echo "SPFILE='+data/stbrac/PARAMETERFILE/spfilestbrac.ora'" > $ORACLE_HOME/dbs/initstbrac1.ora
rm $ORACLE_HOME/dbs/spfilestbrac1.ora

-- stbracnode2
echo "SPFILE='+data/stbrac/PARAMETERFILE/spfilestbrac.ora'" > $ORACLE_HOME/dbs/initstbrac2.ora

更新OCR中的spfile信息

1
srvctl modify database -d stbrac -p +DATA/STBRAC/PARAMETERFILE/spfilestbrac.ora

重启备库实例

1
2
3
4
5
6
7
8
9
srvctl stop instance -d stbrac -i stbrac1
srvctl start instance -d stbrac -i stbrac1
srvctl status database -d stbrac

SQL> show parameter spfile;

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
spfile string +DATA/STBRAC/PARAMETERFILE/spfilestbrac.ora

修改spfile中的INSTANCE_NAMEINSTANCE_NUMBER参数

1
2
3
4
ALTER SYSTEM SET INSTANCE_NUMBER=1 SCOPE=SPFILE SID='stbrac1';
ALTER SYSTEM SET INSTANCE_NUMBER=2 SCOPE=SPFILE SID='stbrac2';
ALTER SYSTEM SET INSTANCE_NAME='stbrac1' SCOPE=SPFILE SID='stbrac1';
ALTER SYSTEM SET INSTANCE_NAME='stbrac2' SCOPE=SPFILE SID='stbrac2';

启动备库的第二个实例

1
2
srvctl start instance -d stbrac -i stbrac2
srvctl status database -d stbrac

开启备库的日志应用进程

1
alter database recover managed standby database using current logfile disconnect from session;

检查每个节点的dg配置情况

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
select inst_id, database_role, db_unique_name instance, open_mode, protection_mode, protection_level from gv$database;

INST_ID DATABASE_ROLE INSTANCE OPEN_MODE PROTECTION_MODE PROTECTION_LEVEL
---------- ---------------- ------------------------------ -------------------- -------------------- --------------------
1 PHYSICAL STANDBY stbrac READ ONLY WITH APPLY MAXIMUM PERFORMANCE MAXIMUM PERFORMANCE
2 PHYSICAL STANDBY stbrac READ ONLY WITH APPLY MAXIMUM PERFORMANCE MAXIMUM PERFORMANCE


select inst_id, database_mode, recovery_mode, protection_mode
from gv$archive_dest_status
where dest_name ='LOG_ARCHIVE_DEST_1';

INST_ID DATABASE_MODE RECOVERY_MODE PROTECTION_MODE
---------- --------------- ----------------------- --------------------
2 OPEN_READ-ONLY MANAGED REAL TIME APPLY MAXIMUM PERFORMANCE
1 OPEN_READ-ONLY MANAGED REAL TIME APPLY MAXIMUM PERFORMANCE

检查redo传输服务

1
2
3
4
5
6
7
8
9
10
11
alter session set nls_date_format ='yyyy-mm-dd hh24:mi:ss';
select inst_id, sequence#, applied, first_time, next_time
from gv$archived_log order by 2,1,4;

-- 主库建表
create table test as select * from dba_tables where 1=2;

-- 检查备库
SQL> select * from test;

no rows selected

备库检查dg统计信息

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
set linesize 300
col start_time format a20
col item format a20
select to_char(start_time, 'yyyy-mm-dd hh24:mi:ss') start_time, item , sofar, units
from v$recovery_progress
where item in ('Active Apply Rate', 'Average Apply Rate', 'Redo Applied');

START_TIME ITEM SOFAR UNITS
-------------------- -------------------- ---------- --------------------------------
2019-04-18 17:26:33 Active Apply Rate 1148 KB/sec
2019-04-18 17:26:33 Average Apply Rate 58 KB/sec
2019-04-18 17:26:33 Redo Applied 22 Megabytes



col name for a13
col value for a13
col unit for a30
set lines 132
select name, value, unit, time_computed
from v$dataguard_stats where name in ('transport lag', 'apply lag');

NAME VALUE UNIT TIME_COMPUTED
------------- ------------- ------------------------------ ------------------------------
transport lag +00 00:00:00 day(2) to second(0) interval 04/18/2019 17:33:57
apply lag +00 00:00:00 day(2) to second(0) interval 04/18/2019 17:33:57

优化配置

配置主库归档日志删除策略

1
2
rman target /
CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY;

启用备库的闪回

1
2
3
4
alter system set db_flashback_retention_target =2880;
alter database recover managed standby database cancel;
alter database flashback on;
alter database recover managed standby database using current logfile disconnect from session;

配置dg broker

查看系统现有dg broker配置

1
2
3
4
5
6
7
sys@RAC1> show parameter dg_broker

PARAMETER_NAME TYPE VALUE
------------------------------------------------------------ ----------- -------------------------------------------------------------------------------------
dg_broker_config_file1 string /u01/app/oracle/product/12.2.0/dbhome_1/dbs/dr1rac.dat
dg_broker_config_file2 string /u01/app/oracle/product/12.2.0/dbhome_1/dbs/dr2rac.dat
dg_broker_start boolean FALSE

rac环境中dgbroker的配置文件要存放在共享目录上,在ASM上新建一个目录

1
2
3
su - grid
ASMCMD>cd +data/rac/
ASMCMD>mkdir DGCONFIG

停掉备库apply

1
alter database recover managed standby database cancel;

修改主库参数

1
2
3
4
alter system set LOG_ARCHIVE_DEST_2='' scope=both;
ALTER SYSTEM SET DG_BROKER_CONFIG_FILE1='+DATA/RAC/DGCONFIG/dr1rac.dat' SCOPE=spfile sid='*';
ALTER SYSTEM SET DG_BROKER_CONFIG_FILE2='+DATA/RAC/DGCONFIG/dr2rac.dat' SCOPE=spfile sid='*';
alter system set dg_broker_start=true scope=both sid='*';

同时也修改备库

1
2
3
4
5
6
7
8
su - grid
ASMCMD>cd +data/stbrac/
ASMCMD>mkdir DGCONFIG

alter system set LOG_ARCHIVE_DEST_2='' scope=both;
ALTER SYSTEM SET DG_BROKER_CONFIG_FILE1='+DATA/STBRAC/DGCONFIG/dr1stbrac.dat' SCOPE=spfile sid='*';
ALTER SYSTEM SET DG_BROKER_CONFIG_FILE2='+DATA/STBRAC/DGCONFIG/dr2stbrac.dat' SCOPE=spfile sid='*';
alter system set dg_broker_start=true scope=both sid='*';

重启主备库

1
2
3
4
5
srvctl stop database -d rac
srvctl start database -d rac

srvctl stop database -d stbrac
srvctl start database -d stbrac

创建dgbroker配置

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
su - oracle
dgmgrl>
create configuration 'raccfg' as primary database is 'rac' connect identifier is 'rac';
add database 'stbrac' as connect identifier is 'stbrac';
ENABLE configuration;
enable database 'rac';
enable database 'stbrac';


DGMGRL> show configuration

Configuration - raccfg

Protection Mode: MaxPerformance
Members:
rac - Primary database
stbrac - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS (status updated 17 seconds ago)

测试通过dgbroker做swithover切换操作

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
DGMGRL> switchover to 'stbrac';
Performing switchover NOW, please wait...
Operation requires a connection to database "stbrac"
Connecting ...
Connected to "stbrac"
Connected as SYSDBA.
New primary database "stbrac" is opening...
Oracle Clusterware is restarting database "rac" ...
Connected to "rac"
Connected to "rac"
Switchover succeeded, new primary is "stbrac"
DGMGRL> show configuration;

Configuration - raccfg

Protection Mode: MaxPerformance
Members:
stbrac - Primary database
rac - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS (status updated 110 seconds ago)

--重新swithover回来
DGMGRL> switchover to 'rac';
Performing switchover NOW, please wait...
Operation requires a connection to database "rac"
Connecting ...
Connected to "rac"
Connected as SYSDBA.
New primary database "rac" is opening...
Oracle Clusterware is restarting database "stbrac" ...
Switchover succeeded, new primary is "rac"

至此整个安装配置过程结束。

ORA-16086: standby database does not contain available standby log files

早上来检查告警邮件,一套10.2.0.5的dg主库一直在报ora-16086错误,根据错误提示大概是备库没有可用的standby log files。在11g以后,这个错误名称换成了ORA-16086 Redo data cannot be written to the standby redo log

1
2
3
4
5
16086, 0000, "standby database does not contain available standby log files"
// *Cause: The primary database is in "no data loss" mode, but the standby
// database does not contain any "standby log files".
// *Action: Add one or more standby log files to the standby database.
// This can be done while the standby database is mounted.

当前主备同步依旧正常,切换日志也能顺利传到备库,备库也能正常应用。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
主:
SQL> archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /u01/arch
Oldest online log sequence 384007
Next log sequence to archive 384012
Current log sequence 384012

备:
SQL> archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /u01/arch
Oldest online log sequence 383998
Next log sequence to archive 0
Current log sequence 384012

主备库均为最大可用模式

1
2
3
4
5
SQL> select protection_mode,protection_level from v$database;

PROTECTION_MODE PROTECTION_LEVEL
-------------------- --------------------
MAXIMUM AVAILABILITY RESYNCHRONIZATION

而主备库的standby log file均存在,且大小一致,状态正常

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
SQL> select * from v$logfile;

GROUP# STATUS TYPE MEMBER IS_
---------- ------- ------- -------------------------------------------------------------------------------- ---
1 ONLINE /u01/app/oracle/oradata/test/redo01.log NO
2 ONLINE /u01/app/oracle/oradata/test/redo02.log NO
3 ONLINE /u01/app/oracle/oradata/test/redo03.log NO
4 ONLINE /u01/app/oracle/oradata/test/redo04.log NO
5 ONLINE /u01/app/oracle/oradata/test/redo05.log NO
6 ONLINE /u01/app/oracle/oradata/test/redo06.log NO
7 STANDBY /u01/app/oracle/oradata/test/redo07.log NO
8 STANDBY /u01/app/oracle/oradata/test/redo08.log NO
9 STANDBY /u01/app/oracle/oradata/test/redo09.log NO
10 STANDBY /u01/app/oracle/oradata/test/redo10.log NO
11 STANDBY /u01/app/oracle/oradata/test/redo11.log NO
12 STANDBY /u01/app/oracle/oradata/test/redo12.log NO
13 STANDBY /u01/app/oracle/oradata/test/redo13.log NO

MOS上有说可能是因为fra目录满导致的bug,但检查数据库并没有启用fra

1
2
3
4
5
6
7
SQL> show parameter recovery

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest string
db_recovery_file_dest_size big integer 0
recovery_parallelism integer 0

尝试重建standby log file,主备库均删除重建

1
2
3
4
5
6
7
8
9
10
11
12
13
alter database drop standby logfile group 7;
alter database drop standby logfile group 8;
alter database drop standby logfile group 9;
alter database drop standby logfile group 10;
alter database drop standby logfile group 11;
alter database drop standby logfile group 12;
alter database drop standby logfile group 13;

select 'alter database add standby logfile group '||group#||' '''||member||''' size 1g reuse;' from v$logfile where type='STANDBY';

recover managed standby database cancel;

recover managed standby database using current logfile disconnect from session;

观察了一小时,并未再出现告警。