上一篇文章我们介绍了 DG Broker 的手动切换,一条命令搞定切库。有同学就说了那不还得用手么?有木有办法在主库挂掉时自动切换?请看下文……
1
FSFO简介
Fast-Start Failover是建立在broker基础上的一个快速故障转换的机制,通过fast-start failover可以自动检测primary的故障,然后自动的failover到预先指定的standby上面,这样可以最大化的减少故障时间,提高数据库的可用性。
2
FSFO测试环境
OS:linux Redhat 7
oracle:11.2.0.4
10.110.110.22主库 DB_UNIQUE_NAME test_pri
10.110.110.21备库 DB_UNIQUE_NAME test_std
10.110.110.30 observer主机
DG broker 配置名称为test
sys密码:123456
3
FSFO前提条件
1)DG broker configuration的配置必须在最大可用或者最大性能模式下2)保护模式、standby redo logs和传输模式
最大可用 sync
最大性用 async
主库的LogXptMode和备库一致。
主备的standby redo要保持一致。
3)主备库都打开数据库级别闪回4)在观察者主机上安装DGMGRL5)观察者主机上配置 TNSNAMES.ORA 文件6)主备库配置静态服务
修改主库监听
SID_LIST_LISTENER=
加入:
(SID_LIST=
(SID_DESC=
(GLOBAL_DBNAME=test_pri_dgmgrl)
(ORACLE_HOME=/home/oracle/app/product/11.2.0/db_1)
(SID_NAME=test))
)
修改备库监听
SID_LIST_LISTENER=
加入:
(SID_LIST=
(SID_DESC=
(GLOBAL_DBNAME=test_std_DGMGRL
(ORACLE_HOME=/home/oracle/app/product/11.2.0/db_1)
(SID_NAME=test))
)
4
FSFO配置
1)根据DG环境确定目标备库,一般一主一备。
2)指定目标备库 FastStartFailoverTarget
DGMGRL> edit database 'TEST_PRI' set property FastStartFailoverTarget='test_std';
Property "faststartfailovertarget" updated
DGMGRL> show database 'TEST_PRI' FastStartFailoverTarget;
FastStartFailoverTarget = 'test_std'
如果系统只有一主一备,这个步骤可以省略,在FSFO启动的时候,会自动指定。
3)配置保护模式。
DGMGRL> EDIT DATABASE ' TEST_PRI ' SET PROPERTY LogXptMode=SYNC;
DGMGRL> EDIT DATABASE ' test_std ' SET PROPERTY LogXptMode=SYNC;
DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability;
4)设置 FastStartFailoverThreshold 参数
DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold = 15;
默认值30 秒,最小值 6.
5)可选的数据库配置
FastStartFailoverPmyShutdown 默认 TRUE
FastStartFailoverLagLimit 默认30 seconds
FastStartFailoverAutoReinstate 默认 TRUE
6)启用额外可选的fast-start failover条件。
Datafile Offline Yes
Corrupted Dictionary Yes
Corrupted Controlfile Yes
Inaccessible Logfile No
Stuck Archiver No
还可以指定ORA类型错误作为FSFO的触发条件。
7)启用 fast-start failover.
DGMGRL> ENABLE FAST_START FAILOVER;
Enabled.
8)启动观察者
主库必须是启动状态。
DGMGRL> START OBSERVER;
9)检查 fast-start failover 环境。
DGMGRL> SHOW FAST_START FAILOVER;
Fast-Start Failover: ENABLED
Threshold: 15 seconds
Target: test_std
Observer: 10-110-110-30
Lag Limit: 30 seconds (not in use)
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Observer Reconnect: (none)
Observer Override: FALSE
Configurable Failover Conditions
Health Conditions:
Corrupted Controlfile YES
Corrupted Dictionary YES
Inaccessible Logfile NO
Stuck Archiver NO
Datafile Offline YES
Oracle Error Conditions:
(none)
5
客户端TAF切换配置(主库)
1)在主库上配置一个TAF的service
此服务在数据库出现故障时会发送通知给客户端,允许查询语句在故障转移发生后继续运行。
在主库端运行:
begin
DBMS_SERVICE.CREATE_SERVICE(
service_name => 'SERVICE_PRI',
network_name => 'SERVICE_PRI',
aq_ha_notifications => TRUE,
failover_method => 'BASIC',
failover_type => 'SELECT',
failover_retries => 15,
failover_delay => 5);
end;
/
2)创建在主库启动服务的存储过程
创建一个存储过程来实现此目的,如果当前数据库是主库它就启动此服务,如果是备库就停止。
主库执行:
create or replace procedure SERVICE_PRI is
v_role VARCHAR(30);
begin
select DATABASE_ROLE into v_role from V$DATABASE;
if v_role = 'PRIMARY' then
DBMS_SERVICE.START_SERVICE('SERVICE_PRI');
else
DBMS_SERVICE.STOP_SERVICE('SERVICE_PRI');
end if;
end;
/
3)创建一个触发器来确保服务运行
创建触发器,让数据库在启动或角色转换时运行此存储过程。用于当数据库open时,不需要重启数据库,如果是主库则执行存储过程。当数据库切换后,如果是主库则执行存储过程。
主库执行:
create or replace TRIGGER FSFO_TRIGGER
after startup or db_role_change on database
begin
SERVICE_PRI;
end;
/
4)启动新创建的service
在主库上执行该存储过程(或者重启数据库,在启动数据库时会触发执行触发器),在主库做日志的切换,将变化应用到备库。
主库执行:
SQL> exec SERVICE_PRI;
PL/SQL 过程已成功完成。
已用时间: 00: 00: 00.01
查看监听状态:lsnrctl status
22主库上,多出来一个服务:
Services Summary...
Service "TEST_PRI" has 1 instance(s).
Instance "test", status READY, has 1 handler(s) for this service...
Service "TEST_PRI_DGB" has 1 instance(s).
Instance "test", status READY, has 1 handler(s) for this service...
Service "SERVICE_PRI" has 1 instance(s).
Instance "test", status READY, has 1 handler(s) for this service...
Service "test" has 1 instance(s).
Instance "test", status READY, has 1 handler(s) for this service...
Service "testXDB" has 1 instance(s).
Instance "test", status READY, has 1 handler(s) for this service...
Service "test_pri_dgmgrl" has 1 instance(s).
Instance "test", status UNKNOWN, has 1 handler(s) for this service...
Service "test_std" has 1 instance(s).
Instance "test", status READY, has 1 handler(s) for this service...
The command completed successfully
21备库不会出现该服务:
Services Summary...
Service "test" has 1 instance(s).
Instance "test", status READY, has 2 handler(s) for this service...
Service "test_pri" has 1 instance(s).
Instance "test", status READY, has 2 handler(s) for this service...
Service "test_std" has 1 instance(s).
Instance "test", status READY, has 2 handler(s) for this service...
Service "test_std_DGB" has 1 instance(s).
Instance "test", status READY, has 2 handler(s) for this service...
Service "test_std_DGMGRL" has 1 instance(s).
Instance "test", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully
5)在备库查询,确认触发器和存过已经应用
sys@test> select trigger_name, TRIGGERING_EVENT from dba_triggers where trigger_name = 'FSFO_TRIGGER';
TRIGGER_NAME TRIGGERING_EVENT
--------------------------------------------
FSFO_TRIGGER STARTUP
6)客户端tnsnames 配置
此时在dgmgrl中手动shutdown abort掉主库,那么连接的cmd中会停顿一会,等待fast-start failover切换完成后,则继续返回结果。主备切换不影响用户的select操作,但是如果是dml操作,则所有事务回滚:
FSFO =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = tcp)(HOST = 10.110.110.22)(PORT = 9539))
(ADDRESS = (PROTOCOL = tcp)(HOST = 10.110.110.21)(PORT = 9539))
(LOAD_BALANCE = off)
(FAILOVER = on)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = SERVICE_PRI)
(FAILOVER_MODE =
(TYPE = select)
(METHOD = basic)
(RETRIES = 180)
(DELAY = 5)
)
)
)
6
JDBC中连接串的配置
在数据库中建立服务、存储过程和触发器后,需要如下配置:
注意:在druid连接池中连接串的配置必须在一行中。
好啦,FSFO的配置本文先介绍到这里,关于主库挂掉后的自动切换过程,咱们下回讲解。
网易MySQL开源中间件Cetus
__________________________
github地址
https://github.com/Lede-Inc/cetus/blob/master/doc/cetus-quick-try.md
欢迎加star关注
社群
技术专家在线及时反馈
领取专属 10元无门槛券
私享最新 技术干货