一、架构说明
MHA(Master High Availability)是一个用于 MySQL 主从复制管理和自动故障转移的开源工具集。MHA 的主要目的是提供 MySQL 环境的高可用性和自动故障转移功能,确保在主库发生故障时能够快速切换到备库,降低业务中断时间。
MHA 主要由以下几个组件组成:
1.Manager:管理节点,负责监控主库的状态,并处理自动故障转移。Manager 会不断检查数据库的主从同步状态,当主库发生故障时,会自动将备库提升为新的主库,并重新配置其他从库的复制关系。
如果主从延迟,则等待所有relay log全部应用后,将VIP漂移到新主库上,整个故障转移过程对应用程序完全透明。
2.Node:节点,即 MySQL 实例。MHA 通过在每个 MySQL 实例上部署 MHA 脚本来对节点进行监控和操作。
3.SSH:MHA 使用 SSH 协议来进行节点之间的通信和操作。确保在使用 MHA 之前,通过 SSH 配置了节点之间的互信。
MHA 的工作流程通常如下:
1.Manager 通过 SSH 登录到节点上监控 MySQL 主库的状态,包括主从复制延迟、节点健康状况等。
2.当发现主库发生故障或不可用时,Manager 会触发自动故障转移流程,将备库晋升为新的主库。
3.Manager 会更新其他从库的复制关系,确保它们能够正确复制新的主库。
主从复制(一主一从) + 增强半同步 + GTID + MHA(ping_type=INSERT)
架构 | 操作系统 | IP | VIP | 主机名 | 服务版本 | 端口 | 磁盘空间 | 内存 | CPU |
---|---|---|---|---|---|---|---|---|---|
MHA | redhat7.9 | 192.168.111.34 | 192.168.111.36 | mha-master01 | mysql-8.0.35、manager-0.58、node-0.58 | 3307 | 20G | 4G | 8C |
– | – | 192.168.111.35 | 192.168.111.36 | mha-slave01 | mysql-8.0.35、manager-0.58、node-0.58 | 3307 | 20G | 4G | 8C |
注:该方案仅支持一主一从
缺点:
1.manager管理节点存在单点故障
2. io_threads异常或者延迟时,有丢失数据的风险(主要由网络故障引发,所以主从节点必须在同一个网段)
二、环境准备
2.1 环境准备、搭建主从在此链接跳转
删除依赖冲突yum remove mariadb*
注意:搭建MHA需要的依赖包如下(两个节点都安装)
perl依赖
yum -y install perl-Module-Install perl-Module-Build
MySQL驱动
yum -y install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManageryum install perl-Class-Load perl-MIME-Lite perl-Mail-Send perl-Mail-Sender perl-Mail-Sendmail perl-Params-Validate perl-Sys-Syslog perl-Email-Date-Format perl-MIME-Types
以上依赖包如果yum源没有则手动下载安装,依赖包可在以下网址搜索perl-DBD-MySQL-4.023-6.el7.x86_64perl-Config-Tiny-2.14-7.el7.noarchperl-Log-Dispatch-2.41-1.el7.1.noarchperl-Parallel-ForkManager-1.18-2.el7.noarchperl-MIME-Lite-3.030-1.el7.noarchperl-Mail-Sender-0.8.23-1.el7.noarchperl-Mail-Sendmail-0.79-21.el7.noarchperl-Email-Date-Format-1.002-15.el7.noarchperl-MIME-Types-1.38-2.el7.noarch
若出现libmysqlclient.so.18报错问题以下是解决方式。
https://rpm.pbone.net/wget ftp://ftp.pbone.net/mirror/download.fedora.redhat.com/pub/fedora/epel/7/aarch64/Packages/p/perl-Log-Dispatch-2.41-1.el7.1.noarch.rpmwget ftp://ftp.pbone.net/mirror/archive.fedoraproject.org/epel/7.2020-10-05/x86_64/Packages/p/perl-Parallel-ForkManager-1.18-2.el7.noarch.rpmwget ftp://ftp.pbone.net/mirror/archive.fedoraproject.org/epel/7.2020-10-05/ppc64/Packages/p/perl-MIME-Lite-3.030-1.el7.noarch.rpmwget ftp://ftp.pbone.net/mirror/archive.fedoraproject.org/epel/7.2019-05-29/ppc64le/Packages/p/perl-Mail-Sender-0.8.23-1.el7.noarch.rpmwget ftp://ftp.pbone.net/mirror/download.fedora.redhat.com/pub/fedora/epel/7/aarch64/Packages/p/perl-Mail-Sendmail-0.79-21.el7.noarch.rpmwget ftp://ftp.pbone.net/mirror/download.fedora.redhat.com/pub/fedora/epel/7/ppc64le/Packages/p/perl-Email-Date-Format-1.002-15.el7.noarch.rpmwget ftp://ftp.pbone.net/mirror/archive.fedoraproject.org/epel/7.2020-04-20/x86_64/Packages/p/perl-MIME-Types-1.38-2.el7.noarch.rpm安装mysql驱动wget ftp://ftp.pbone.net/mirror/dev.mysql.com/pub/Downloads/MySQL-8.0/mysql-community-libs-8.0.28-1.el7.x86_64.rpmwget ftp://ftp.pbone.net/mirror/dev.mysql.com/pub/Downloads/MySQL-8.0/mysql-community-common-8.0.28-1.el7.x86_64.rpmwget ftp://ftp.pbone.net/mirror/dev.mysql.com/pub/Downloads/MySQL-8.0/mysql-community-libs-compat-8.0.28-1.el7.x86_64.rpmwget ftp://ftp.pbone.net/mirror/dev.mysql.com/pub/Downloads/MySQL-8.0/mysql-community-client-plugins-8.0.28-1.el7.x86_64.rpmwget ftp://ftp.pbone.net/mirror/dev.mysql.com/pub/Downloads/MySQL-5.6/MySQL-shared-5.6.51-1.el7.x86_64.rpmwget ftp://ftp.pbone.net/mirror/dev.mysql.com/pub/Downloads/MySQL-8.0/mysql-community-client-plugins-8.0.28-1.el7.x86_64.rpmrpm -ivh mysql-community-common-8.0.28-1.el7.x86_64.rpmrpm -ivh mysql-community-client-plugins-8.0.28-1.el7.x86_64.rpmrpm -ivh mysql-community-libs-8.0.28-1.el7.x86_64.rpmrpm -ivh mysql-community-libs-compat-8.0.28-1.el7.x86_64.rpmrpm -ivh MySQL-shared-5.6.51-1.el7.x86_64.rpm 与mysql-community-libs 互补
2.2 数据库用户权限管理
2.2.1 权限管理
在主从环境按照以下建议赋予权限,禁止赋予all privileges权限
普通用户DDL权限控制
管理员账号,账号名统一为admin_root,由项目负责人管理,用于数据初始化等临时操作
create user 'admin_root'@'%' identified by 'myttrepl@2222#TO'; grant alter,Alter routine,Create,Create routine,Create view,Delete,Drop,Select,Insert,Show view,Update on *.* to 'admin_root'@'%';grant Event,Execute,Index,Lock tables,Process,References,Reload,Show databases,Trigger on *.* to 'admin_root'@'%';
应用账号权限
#无特殊需求是,默认赋予以下权限
create user 'app_du'@'%' identified by 'myttrepl@2222#TO'; grant select,update,delete,insert,Show view,Trigger,Execute,Alter routine on db.* to 'app_du'@'%';grant event on *.* to 'app_du'@'%';
如果主库宕机后,能够快速上线,要避免业务写从库,这样主库可以以slave身份重新加入,并保持数据同步。
处理方式:不给业务分配超级权限账号,就可以避免从库写入(因为从库始终处于read_only模式)
2.3 配置互信
方法1:使用ssh-copy命令配置(分别在2个节点执行)
ssh-keygen -t rsassh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.111.34ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.111.35
方法2:手工配置
执行ssh-keygen -t rsa生成一对密钥
然后将公钥(id_rsa.pub)分别写入34、35两台机器的/root/.ssh/authorized_keys中
测试是否成功
ssh 192.168.111.34 date&& ssh 192.168.111.35 date
三、安装MHA
3.1 创建目录
mkdir -p /data/mha/{scripts,conf,manager}
3.2 配置在线切换、故障切换检测脚本
vi /data/mha/scripts/master_ip_failover_vip需要修改的地方my $vip = '192.168.111.36/24';my $ssh_start_vip = "/usr/sbin/ifconfig ens32:$key $vip";my $ssh_stop_vip = "/usr/sbin/ifconfig ens32:$key down";
#!/usr/bin/env perluse strict;use warnings FATAL => 'all';use Getopt::Long; my ($command,$ssh_user,$orig_master_host, $orig_master_ip,$orig_master_port, $new_master_host, $new_master_ip,$new_master_port); my $vip = '192.168.111.36/24';my $key = '1';my $ssh_start_vip = "/sbin/ifconfig ens32:$key $vip";my $ssh_stop_vip = "/sbin/ifconfig ens32:$key down"; GetOptions('command=s'=> \$command,'ssh_user=s' => \$ssh_user,'orig_master_host=s' => \$orig_master_host,'orig_master_ip=s' => \$orig_master_ip,'orig_master_port=i' => \$orig_master_port,'new_master_host=s'=> \$new_master_host,'new_master_ip=s'=> \$new_master_ip,'new_master_port=i'=> \$new_master_port,);exit &main();sub main {print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";if ( $command eq "stop" || $command eq "stopssh" ) { my $exit_code = 1;eval {print "Disabling the VIP on old master: $orig_master_host \n";&stop_vip();$exit_code = 0;};if ($@) {warn "Got Error: $@\n";exit $exit_code;}exit $exit_code;}elsif ( $command eq "start" ) { my $exit_code = 10;eval {print "Enabling the VIP - $vip on the new master - $new_master_host \n";&start_vip();$exit_code = 0;};if ($@) {warn $@;exit $exit_code;}exit $exit_code;}elsif ( $command eq "status" ) {print "Checking the Status of the script.. OK \n";exit 0;}else {&usage();exit 1;}}sub start_vip() {`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;}sub stop_vip() { return 0unless($ssh_user);`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;} sub usage {print"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";}
vi /data/mha/scripts/master_ip_online_change_vip需要修改的地方my $vip = '192.168.111.36/24';my $ssh_start_vip = "/usr/sbin/ifconfig ens32:$key $vip";my $ssh_stop_vip = "/usr/sbin/ifconfig ens32:$key down";
#!/usr/bin/env perluse strict;use warnings FATAL =>'all';use Getopt::Long; my $vip = '192.168.111.36/24';# Virtual IPmy $key = "1";my $ssh_start_vip = "/sbin/ifconfig ens32:$key $vip";my $ssh_stop_vip = "/sbin/ifconfig ens32:$key down";my $exit_code = 0;my ($command,$orig_master_is_new_slave, $orig_master_host,$orig_master_ip, $orig_master_port, $orig_master_user,$orig_master_password, $orig_master_ssh_user, $new_master_host,$new_master_ip,$new_master_port,$new_master_user,$new_master_password,$new_master_ssh_user,);GetOptions('command=s'=> \$command,'orig_master_is_new_slave' => \$orig_master_is_new_slave,'orig_master_host=s' => \$orig_master_host,'orig_master_ip=s' => \$orig_master_ip,'orig_master_port=i' => \$orig_master_port,'orig_master_user=s' => \$orig_master_user,'orig_master_password=s' => \$orig_master_password,'orig_master_ssh_user=s' => \$orig_master_ssh_user,'new_master_host=s'=> \$new_master_host,'new_master_ip=s'=> \$new_master_ip,'new_master_port=i'=> \$new_master_port,'new_master_user=s'=> \$new_master_user,'new_master_password=s'=> \$new_master_password,'new_master_ssh_user=s'=> \$new_master_ssh_user,);exit &main();sub main {#print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";if ( $command eq "stop" || $command eq "stopssh" ) {# $orig_master_host, $orig_master_ip, $orig_master_port are passed.# If you manage master ip address at global catalog database,# invalidate orig_master_ip here.my $exit_code = 1;eval {print "\n\n\n***************************************************************\n";print "Disabling the VIP - $vip on old master: $orig_master_host\n";print "***************************************************************\n\n\n\n";&stop_vip();$exit_code = 0;};if ($@) {warn "Got Error: $@\n";exit $exit_code;}exit $exit_code;}elsif ( $command eq "start" ) {# all arguments are passed.# If you manage master ip address at global catalog database,# activate new_master_ip here.# You can also grant write access (create user, set read_only=0, etc) here.my $exit_code = 10;eval {print "\n\n\n***************************************************************\n";print "Enabling the VIP - $vip on new master: $new_master_host \n";print "***************************************************************\n\n\n\n";&start_vip();$exit_code = 0;};if ($@) {warn $@;exit $exit_code;}exit $exit_code;}elsif ( $command eq "status" ) {print "Checking the Status of the script.. OK \n";`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_start_vip \"`;exit 0;}else {&usage();exit 1;}}# A simple system call that enable the VIP on the new mastersub start_vip() {`ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;}# A simple system call that disable the VIP on the old_mastersub stop_vip() {`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;}sub usage {print"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";}
chmod +x /data/mha/scripts/master_ip_failover_vipchmod +x /data/mha/scripts/master_ip_online_change_vip
3.3 安装MHA软件
mha4mysql-node-0.58.tar.gz —> 两节点都安装
mha4mysql-manager-0.58.tar.gz —> 两节点都安装
cd /data/mhawget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58.tar.gzwget https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58.tar.gztar -zxvf mha4mysql-node-0.58.tar.gzcd mha4mysql-node-0.58perl Makefile.PLmake && make installln -s /data/mha/mha4mysql-node-0.58/bin/* /usr/local/bin/ll /usr/local/bin/tar -zxvf mha4mysql-manager-0.58.tar.gzcd mha4mysql-manager-0.58perl Makefile.PLmake && make installln -s /data/mha/mha4mysql-manager-0.58/bin/* /usr/local/bin/ll /usr/local/bin/
3.4 在manager节点配置masterha.cnf、app.cnf(两个节点都配置)
创建mha管理账号(仅在节点1创建)
create user 'i_mha'@'192.168.%.%' identified by "myttrepl@2222#TO";GRANT ALL PRIVILEGES ON *.* TO 'i_mha'@'192.168.%.%' with grant option;
mkdir -p /data/mha/masterha/vi /data/mha/conf/masterha.cnf
[server default]user=i_mha#设置监控用户mhapassword=myttrepl@2222#TO#设置mysql中mha用户的密码,这个密码是前文中创建监控用户的那个密码repl_user=repl#设置复制用户的用户replrepl_password=myttrepl@2222#TO#设置复制用户repl的密码ssh_user=root#设置ssh的登录用户名ping_interval=10#设置监控主库,发送ping包的时间间隔10秒,默认是3秒,尝试三次没有回应的时候自动进行failoverping_type=CONNECTremote_workdir=/data/mha/masterha/#manager工作目录
vi /data/mha/conf/app.cnf
manager_workdir=/data/mha/manager/#manager工作目录manager_log=/data/mha/manager/manager.log#manager日志#workdir on the node for mysql servermaster_binlog_dir=/data/mysql8.0.35/3307/binlog/#master保存binlog的位置,这里的路径要与master里配置的binlog的路径一致,以便MHA能找到#自动故障切换master脚本master_ip_failover_script=/data/mha/scripts/master_ip_failover_vip#手动切换时运行的脚本master_ip_online_change_script= /data/mha/scripts/master_ip_online_change_vip#指定检查的从服务器IP地址secondary_check_script = /usr/local/bin/masterha_secondary_check -s 192.168.111.34 -s 192.168.111.35[server1]hostname=192.168.111.34ssh_port=22port=3307#设置为候选master,设置该参数以后,发生主从切换以后将会将此从库提升为主库,即使这个主库不是集群中最新的slavecandidate_master=1#默认情况下如果一个slave落后master 超过100M的relay logs的话,MHA将不会选择该slave作为一个新的master, 因为对于这个slave的恢复需要花费很长时间;通过设置check_repl_delay=0,MHA触发切换在选择一个新的master的时候将会忽略复制延时,这个参数对于设置了candidate_master=1的主机非常有用,因为这个候选主在切换的过程中一定是新的mastercheck_repl_delay=0 #忽略relay logs日志的复制延迟[server2]hostname=192.168.111.35ssh_port=22port=3307candidate_master=1check_repl_delay=0
3.5 检测互信
[root@mhaserver01 conf]# masterha_check_ssh --global_conf=/data/mha/conf/masterha.cnf --conf=/data/mha/conf/app.cnfThu Feb 22 16:34:13 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.Thu Feb 22 16:34:13 2024 - [info] Reading application default configuration from /data/mha/conf/app.cnf..Thu Feb 22 16:34:13 2024 - [info] Reading server configuration from /data/mha/conf/app.cnf..Thu Feb 22 16:34:13 2024 - [info] Starting SSH connection tests..Thu Feb 22 16:34:13 2024 - [debug]Thu Feb 22 16:34:13 2024 - [debug]Connecting via SSH from root@192.168.111.34(192.168.111.34:22) to root@192.168.111.35(192.168.111.35:22)..Thu Feb 22 16:34:13 2024 - [debug] ok.Thu Feb 22 16:34:14 2024 - [debug]Thu Feb 22 16:34:13 2024 - [debug]Connecting via SSH from root@192.168.111.35(192.168.111.35:22) to root@192.168.111.34(192.168.111.34:22)..Thu Feb 22 16:34:14 2024 - [debug] ok.Thu Feb 22 16:34:14 2024 - [info] All SSH connection tests passed successfully.
3.6 检测主从状态
[root@mhaserver01 conf]# masterha_check_repl --global_conf=/data/mha/conf/masterha.cnf --conf=/data/mha/conf/app.cnfFri Feb 23 10:50:09 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.Fri Feb 23 10:50:09 2024 - [info] Reading application default configuration from /data/mha/conf/app.cnf..Fri Feb 23 10:50:09 2024 - [info] Reading server configuration from /data/mha/conf/app.cnf..Fri Feb 23 10:50:09 2024 - [info] MHA::MasterMonitor version 0.58.Fri Feb 23 10:50:10 2024 - [info] GTID failover mode = 1Fri Feb 23 10:50:10 2024 - [info] Dead Servers:Fri Feb 23 10:50:10 2024 - [info] Alive Servers:Fri Feb 23 10:50:10 2024 - [info] 192.168.111.34(192.168.111.34:3307)Fri Feb 23 10:50:10 2024 - [info] 192.168.111.35(192.168.111.35:3307)Fri Feb 23 10:50:10 2024 - [info] Alive Slaves:Fri Feb 23 10:50:10 2024 - [info] 192.168.111.35(192.168.111.35:3307)Version=8.0.35 (oldest major version between slaves) log-bin:enabledFri Feb 23 10:50:10 2024 - [info] GTID ONFri Feb 23 10:50:10 2024 - [info] Replicating from 192.168.111.34(192.168.111.34:3307)Fri Feb 23 10:50:10 2024 - [info] Primary candidate for the new Master (candidate_master is set)Fri Feb 23 10:50:10 2024 - [info] Current Alive Master: 192.168.111.34(192.168.111.34:3307)Fri Feb 23 10:50:10 2024 - [info] Checking slave configurations..Fri Feb 23 10:50:10 2024 - [info]read_only=1 is not set on slave 192.168.111.35(192.168.111.35:3307).Fri Feb 23 10:50:10 2024 - [info] Checking replication filtering settings..Fri Feb 23 10:50:10 2024 - [info]binlog_do_db= , binlog_ignore_db=Fri Feb 23 10:50:10 2024 - [info]Replication filtering check ok.Fri Feb 23 10:50:10 2024 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.Fri Feb 23 10:50:10 2024 - [info] Checking SSH publickey authentication settings on the current master..Fri Feb 23 10:50:11 2024 - [info] HealthCheck: SSH to 192.168.111.34 is reachable.Fri Feb 23 10:50:11 2024 - [info]192.168.111.34(192.168.111.34:3307) (current master) +--192.168.111.35(192.168.111.35:3307)Fri Feb 23 10:50:11 2024 - [info] Checking replication health on 192.168.111.35..Fri Feb 23 10:50:11 2024 - [info]ok.Fri Feb 23 10:50:11 2024 - [info] Checking master_ip_failover_script status:Fri Feb 23 10:50:11 2024 - [info] /data/mha/scripts/master_ip_failover_vip --command=status --ssh_user=root --orig_master_host=192.168.111.34 --orig_master_ip=192.168.111.34 --orig_master_port=3307IN SCRIPT TEST====/usr/sbin/ifconfig ens32:1 down==/usr/sbin/ifconfig ens32:1 192.168.111.36/24===Checking the Status of the script.. OKFri Feb 23 10:50:11 2024 - [info]OK.Fri Feb 23 10:50:11 2024 - [warning] shutdown_script is not defined.Fri Feb 23 10:50:11 2024 - [info] Got exit code 0 (Not master dead).MySQL Replication Health is OK.
3.7 开启MHA-manager监控进程(mhamaster02启用manager管理)
注意:第一次启动,主库上的VIP 不会自动绑定,需要手动去绑定,主库发生故障切换会进行vip的漂移
#起服务nohup masterha_manager --global_conf=/data/mha/conf/masterha.cnf--conf=/data/mha/conf/app.cnf --ignore_last_failover /data/mha/run.log 2>&1 start slave;//启动slave进程 --查看slave状态确认IO线程、SQL线程都以运行mysql> show slave status\G;//查看当前的从库状态Slave_IO_Running: Yes//IO线程已运行 Slave_SQL_Running: Yes//SQL线程已运行mysql> show slave status\G;//查看当前的从库状态
4.2.4 主库启动manager服务并查看MHA状态
nohup masterha_manager --global_conf=/data/mha/conf/masterha.cnf--conf=/data/mha/conf/app.cnf --ignore_last_failover /data/mha/run.log 2>&1 &masterha_check_status --global_conf=/data/mha/conf/masterha.cnf--conf=/data/mha/conf/app.cnfapp (pid:14837) is running(0:PING_OK), master:192.168.111.34
五、MHA 日常维护命令集:
1 查看ssh 登陆是否成功
masterha_check_ssh --global_conf=/data/mha/conf/masterha.cnf --conf=/data/mha/conf/app.cnf
2 查看主从同步情况
masterha_check_repl--global_conf=/data/mha/conf/masterha.cnf --conf=/data/mha/conf/app.cnf
3 检查启动的状态
masterha_check_status--global_conf=/data/mha/conf/masterha.cnf --conf=/data/mha/conf/app.cnf
4 停止mha
masterha_stop --global_conf=/data/mha/conf/masterha.cnf --conf=/data/mha/conf/app.cnf
5 启动mha
nohup masterha_manager --global_conf=/data/mha/conf/masterha.cnf--conf=/data/mha/conf/app.cnf --ignore_last_failover /data/mha/run.log 2>&1 &
注意:当有slave 节点宕掉的情况是启动不了的,加上–ignore_fail_on_start 即使有节点宕掉也能启动mha,需要在配置文件中设置ignore_fail=1
6 failover 后下次重启
每次failover 切换后会在管理目录生成文件app.failover.complete ,下次在切换的时候会发现有这个文件导致切换不成功,需要手动清理掉。rm -rf /masterha/app1/app1.failover.complete也可以加上参数--ignore_last_failover
7 手工failover
手工failover 场景,master 死掉,但是masterha_manager 没有开启,可以通过手工failover:masterha_master_switch --global_conf=/data/mha/conf/masterha.cnf--conf=/data/mha/conf/app.cnf --dead_master_host=old_ip --dead_port=port --master_state=dead --new_master_host=new_ip --new_master_port=port --ignore_last_failover
8 masterha_manager 是一种监视和故障转移的程序。另一方面,masterha_master_switch 程序不监控主库。masterha_master_switch 可以用于主库故障转移,也可用于在线总开关。
9 手动在线切换(比如做维护切换时)
masterha_master_switch ---global_conf=/data/mha/conf/masterha.cnf--conf=/data/mha/conf/app.cnf --master_state=alive --new_master_host=192.168.111.34 --orig_master_is_new_slave或者masterha_master_switch --global_conf=/data/mha/conf/masterha.cnf--conf=/data/mha/conf/app.cnf --master_state=alive --new_master_host=192.168.111.34 -orig_master_is_new_slave --running_updates_limit=10000--orig_master_is_new_slave 切换时加上此参数是将原master 变为slave 节点,如果不加此参数,原来的master 将不启动--running_updates_limit=10000 切换时候选master 如果有延迟的话,mha 切换不能成功,加上此参数表示延迟在此时间范围内都可切换(单位为s),但是切换的时间长短是由recover时relay 日志的大小决定手动在线切换mha,切换时需要将在运行的mha 停掉后才能切换。在备库先执行DDL,一般先stop slave,一般不记录mysql 日志,可以通过set SQL_LOG_BIN =0 实现。然后进行一次主备切换操作,再在原来的主库上执行DDL。这种方法适用于增减索引,如果是增加字段就需要额外注意。