Exploring a GTID-Based MySQL MHA High-Availability Setup

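Disclaimer: this article is compiled from material found on the internet together with my own notes and summaries. The goal is to build a MySQL MHA high-availability architecture from scratch.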

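1. Software Overview

MHA (Master High Availability) is currently a relatively mature solution for MySQL high availability. It was developed by Yoshinori Matsunobu at DeNA in Japan (he later joined Facebook) and handles failover and slave promotion in a MySQL replication environment. When the master fails, MHA can usually complete the failover automatically within 30 seconds, and during the switch it preserves data consistency as far as it can.

The software has two parts: MHA Manager (the management node) and MHA Node (the data node). MHA Manager can be deployed on a dedicated machine to manage several master-slave clusters, or on one of the slaves. MHA Node runs on every MySQL server. MHA Manager probes the master at regular intervals; when the master fails, it automatically promotes the slave with the most recent data to be the new master and repoints all other slaves to it. Apart from the brief switch itself, the failover is transparent to applications.

During automatic failover MHA tries to save the binary logs from the crashed master so that as little data as possible is lost, but this is not always possible. For example, if the master has a hardware failure or cannot be reached over SSH, MHA cannot save the binary logs; it fails over anyway and the most recent data is lost. Semi-synchronous replication, available since MySQL 5.5, greatly reduces this risk, and MHA can be combined with it: as long as at least one slave has received the latest binary log events, MHA can apply them to all the other slaves and so keep every node consistent.

MHA mainly targets a one-master, multiple-slave topology. The setup described here needs at least three database servers in the replication cluster, one master and two slaves: one acts as the master, one as the candidate master, and one as a plain slave.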

2. Environment

2.1 System environment

OS version	CentOS release 6.5 (Final)
Kernel version	2.6.32-431.el6.x86_64

2.2 Software versions

MySQL: the latest 5.7 release, deployed from the binary tarball.

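2.3 Roles

192.168.129.86	master (node86)
192.168.129.87	slave (node87)
192.168.129.88	slave (node88)
192.168.129.89	MHA Manager
192.168.129.91	VIP, bound to bond0:91 on the current master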

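2.4 Directory conventions

/usr/local/mysql	MySQL base directory
/data/mysqldata	MySQL data directory
/data/mha	MHA configuration, scripts, and logs
/data/scripts	relay log purge script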

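3. MySQL Installation

Official download page: https://dev.mysql.com/downloads/mysql/

3.1 Prepare the package

Download the binary tarball and place it under /usr/local on each of the three MySQL servers. The package is named mysql-5.7.18-linux-glibc2.5-x86_64.tar.gz.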

3.2 Remove the system MySQL libraries

# rpm -qa | grep mysql
mysql-libs-5.1.71-1.el6.x86_64
# rpm -e mysql-libs-5.1.71-1.el6.x86_64 --nodeps

3.3 Create the mysql user

# groupadd mysql
# useradd -r -g mysql -s /bin/false mysql    # -r creates a system account; -s /bin/false prevents interactive login

3.4 Install the binaries

Extract the tarball, create a symlink to it, and change its owner and group.

# pwd
/usr/local
# ls mysql-5.7.18-linux-glibc2.5-x86_64.tar.gz
mysql-5.7.18-linux-glibc2.5-x86_64.tar.gz
# tar -zxvf mysql-5.7.18-linux-glibc2.5-x86_64.tar.gz
# ln -s mysql-5.7.18-linux-glibc2.5-x86_64 mysql
# ll mysql
lrwxrwxrwx 1 root root 34 May 25 12:54 mysql -> mysql-5.7.18-linux-glibc2.5-x86_64
# chown -R mysql:mysql mysql-5.7.18-linux-glibc2.5-x86_64

3.5 Initialize the database (do this step after 3.7)

# mkdir -pv /data/mysqldata    # create the MySQL data directory
mkdir: created directory `/data/mysqldata'
# chown -R mysql:mysql /data/mysqldata/
# pwd
/usr/local/mysql
# ./bin/mysqld --user=mysql --basedir=/usr/local/mysql/ --datadir=/data/mysqldata/ --initialize
# MySQL 5.7 uses mysqld itself to initialize the database
2017-05-25T05:37:55.408516Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2017-05-25T05:37:55.674238Z 0 [Warning] InnoDB: New log files created, LSN=45790
2017-05-25T05:37:55.721899Z 0 [Warning] InnoDB: Creating foreign key constraint system tables.
2017-05-25T05:37:55.780139Z 0 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: 4dbe04ea-410c-11e7-82cb-000e1ebbacb0.
2017-05-25T05:37:55.780754Z 0 [Warning] Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened.
2017-05-25T05:37:55.781792Z 1 [Note] A temporary password is generated for root@localhost: rtSOJ:hyx54y    # a temporary root password is generated here

3.6 Edit the init script

# pwd
/usr/local/mysql
# cp -a ./support-files/mysql.server /etc/init.d/mysqld
# vim /etc/init.d/mysqld
basedir='/usr/local/mysql'
datadir='/data/mysqldata'

3.7 Create the configuration file

# /etc/my.cnf
[mysqld]
basedir = /usr/local/mysql
datadir = /data/mysqldata
port = 3306
server_id = 86    # must be different on every server
socket = /tmp/mysql.sock
pid-file = /data/mysqldata/mysql.pid
sql_mode = NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
gtid_mode = on
enforce_gtid_consistency = on
log-bin = mysqlbin
log-slave-updates = 1
binlog_format = row
log_error = /data/mysqldata/mysql-error.log
innodb_buffer_pool_size = 6G    # adjust to your RAM, roughly 70% to 80% of memory
innodb_log_buffer_size = 64M
innodb_max_dirty_pages_pct = 50
default-storage-engine = InnoDB
sync_binlog = 1
innodb_flush_log_at_trx_commit = 2
innodb_file_per_table = 1
innodb_data_file_path = ibdata1:1G:autoextend

3.8 Start the database

# /etc/init.d/mysqld start
Starting MySQL.Logging to '/data/mysqldata/mysql-error.log'...... SUCCESS!

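3.9 Reset the initial password

MySQL 5.7 generates a temporary root password during initialization. With mysqld --initialize it is written to the error log (or printed to the terminal, as in 3.5); when mysql_install_db is used it is stored in /root/.mysql_secret.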

# cat /root/.mysql_secret
# Password set for user 'root@localhost' at 2017-05-25 12:59:48
OU1qrvuT1t)?
# /usr/local/mysql/bin/mysql -uroot -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.18
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> SET PASSWORD = PASSWORD('dbpass123');
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

3.10 Enable start at boot

chkconfig --level 35 mysqld on

4. Setting Up Master-Slave Replication

Since MySQL 5.6.5, replication can be configured in two ways: by binlog file name and position (POS), or by GTID.

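4.1 How GTID works

1. GTID stands for global transaction identifier.
2. Each GTID maps to exactly one transaction and is unique across the whole replication topology.
3. A given GTID is executed only once on a server, which prevents the duplicate execution that leads to corrupted data or master-slave inconsistency.
4. GTID replaces the traditional way of starting replication: instead of MASTER_LOG_FILE + MASTER_LOG_POS, replication is started with MASTER_AUTO_POSITION = 1.
5. GTID has been supported since MySQL 5.6.5 and matured after MySQL 5.6.10.
6. With traditional replication the slave does not need binlog enabled, but in MySQL 5.6 GTID mode requires binlog on the slave so that it can record the GTIDs it has executed. MySQL 5.7 additionally persists them in the mysql.gtid_executed table.

Process:

1. When a transaction is executed and committed on the master, a GTID is generated and written to the binlog along with it.
2. The binlog is sent to the slave and stored in its relay log. The slave reads the GTID and sets the gtid_next variable to it, which tells the slave which GTID to execute next.
3. The SQL thread takes the GTID from the relay log and checks whether the slave's own binlog already contains it.
4. If it does, the transaction has already been executed and the slave skips it.
5. If it does not, the slave executes the transaction and records the GTID in its own binlog. Before executing, it checks that no other session owns that GTID, so the transaction cannot be executed twice.
6. When applying row events, the slave looks rows up by primary key if there is one, otherwise by a secondary index, and otherwise falls back to a full table scan.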
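The skip-or-apply decision in steps 3 to 5 can be illustrated with a small shell sketch. It is only a model of the idea, not how mysqld implements it: gtid_in_set is a hypothetical helper that tests whether a single GTID (uuid:number) falls inside a GTID set written in the usual uuid:1-5:7-9 interval notation.

```shell
# Hypothetical helper (not part of MySQL): succeed if GTID $1 is in GTID set $2.
gtid_in_set() {
    local gtid="$1" set="$2"
    local uuid="${gtid%%:*}" txn="${gtid##*:}"
    local entry interval lo hi
    local IFS=','
    for entry in $set; do
        # Drop whitespace that may follow the commas in a multi-UUID set.
        entry=$(printf '%s' "$entry" | tr -d '[:space:]')
        [ "${entry%%:*}" = "$uuid" ] || continue
        IFS=':'
        for interval in ${entry#*:}; do
            lo="${interval%-*}"    # "7-9" -> 7, a single "7" stays 7
            hi="${interval#*-}"    # "7-9" -> 9, a single "7" stays 7
            if [ "$txn" -ge "$lo" ] && [ "$txn" -le "$hi" ]; then
                return 0
            fi
        done
        IFS=','
    done
    return 1
}

# Example values: the UUID is the one from the initialization log in 3.5.
executed="4dbe04ea-410c-11e7-82cb-000e1ebbacb0:1-5:7-9"
for n in 5 6; do
    if gtid_in_set "4dbe04ea-410c-11e7-82cb-000e1ebbacb0:$n" "$executed"; then
        echo "transaction $n: already executed, skip"
    else
        echo "transaction $n: not yet executed, apply"
    fi
done
```

On a real server the executed set can be read with SELECT @@GLOBAL.gtid_executed.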

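4.2 Required configuration parameters

[mysqld]
server_id = 86                   # server ID; must be unique per MySQL instance (the last octet of the IP is a convenient choice)
gtid_mode = on                   # enable GTID mode
enforce_gtid_consistency = on    # enforce GTID consistency; some statements, such as CREATE TABLE ... SELECT, are then rejected
log-bin = mysqlbin               # enable binlog
log-slave-updates = 1
binlog_format = row              # strongly recommended; other formats may cause inconsistent data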
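The suggestion to use the last octet of the IP as server_id is easy to script. A minimal sketch, assuming IPv4 addresses and a single /24-style network where the last octet is unique; server_id_from_ip is a hypothetical helper name:

```shell
# Hypothetical helper: derive server_id from the last octet of an IPv4 address.
server_id_from_ip() {
    local ip="$1"
    # Remove everything up to and including the last dot.
    echo "${ip##*.}"
}

for ip in 192.168.129.86 192.168.129.87 192.168.129.88; do
    echo "$ip -> server_id = $(server_id_from_ip "$ip")"
done
```

If the servers span several subnets the last octet may repeat, in which case a different scheme is needed.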

4.3 Create the replication account

Create the replication account on the master and grant the slaves permission to replicate with it.

GRANT REPLICATION SLAVE ON *.* TO 'repluser'@'192.168.129.87' IDENTIFIED BY 'replpass';
GRANT REPLICATION SLAVE ON *.* TO 'repluser'@'192.168.129.88' IDENTIFIED BY 'replpass';
flush privileges;

4.4 Start replication on the slaves

# run on both slaves
CHANGE MASTER TO
  MASTER_HOST = '192.168.129.86',    # master IP
  MASTER_USER = 'repluser',          # replication account
  MASTER_PASSWORD = 'replpass',      # account password
  MASTER_AUTO_POSITION = 1;
START SLAVE;

Official GTID replication how-to: https://dev.mysql.com/doc/refman/5.6/en/replication-gtids-howto.html

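4.5 Check replication status

On each slave, run show slave status to check replication. When both Slave_IO_Running and Slave_SQL_Running are Yes, replication is healthy. You can then change data on the master and confirm on the slaves that the change is replicated.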
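The same check can be scripted. The sketch below assumes the vertical (\G) output format of SHOW SLAVE STATUS; slave_threads_ok is a hypothetical helper that reads that text on standard input and succeeds only when both threads report Yes.

```shell
# Hypothetical helper: read `SHOW SLAVE STATUS\G` output on stdin and
# succeed only if Slave_IO_Running and Slave_SQL_Running are both "Yes".
slave_threads_ok() {
    awk -F': *' '
        /^ *Slave_IO_Running:/  { io  = $2 }
        /^ *Slave_SQL_Running:/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Demonstration with canned output instead of a live server.
if printf '  Slave_IO_Running: Yes\n  Slave_SQL_Running: Yes\n' | slave_threads_ok; then
    echo "replication threads are running"
else
    echo "replication is NOT healthy"
fi

# Against a real slave it would be used like this:
#   mysql -uroot -p -e 'SHOW SLAVE STATUS\G' | slave_threads_ok
```

The patterns require the colon directly after the field name, so the Slave_SQL_Running_State line that MySQL 5.7 also prints does not interfere.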

That completes the GTID-based master-slave replication. What remains is setting up MHA.

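5. Setting Up MHA

The MHA Node role must be deployed on every host; MHA Manager only needs to be deployed on the MHA control node, 192.168.129.89.

5.1 Configure the hosts file

Host aliases keep the configuration simpler. Add the following entries to /etc/hosts on every machine:

# add to /etc/hosts
192.168.129.86 node86 master
192.168.129.87 node87 slave87
192.168.129.88 node88 slave88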

5.2 Deploying MHA Node

MHA is written in Perl, so it needs a number of Perl dependencies; installing them with yum is recommended. MHA Node must be installed on every server.

5.2.1 Install the dependencies

yum install -y perl-DBD-MySQL.x86_64 \
    perl-DBI.x86_64 perl-ExtUtils-CBuilder \
    perl-ExtUtils-MakeMaker perl-CPAN.x86_64 \
    perl-Mail-Sender perl-Log-Dispatch

5.2.2 Install MHA Node

tar xf mha4mysql-node-0.56.tar.gz
cd mha4mysql-node-0.56
perl Makefile.PL
make
make install

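5.2.3 Main Node tools

The Node package contains the following tools. They are normally invoked by MHA Manager scripts and need no manual operation.

save_binary_logs         saves and copies the master's binary logs
apply_diff_relay_logs    identifies differential relay log events and applies them to the other slaves
filter_mysqlbinlog       removes unnecessary ROLLBACK events (no longer used by MHA)
purge_relay_logs         purges relay logs without blocking the SQL thread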

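5.3 Deploying MHA Manager (requires internet access)

MHA Manager, the control node, gets a server of its own: 192.168.129.89.

Because MHA Node is already installed, the Perl modules it depends on are in place and MHA Manager can be installed directly.

5.3.1 Install MHA Manager

tar -zxvf mha4mysql-manager-0.56.tar.gz
cd mha4mysql-manager-0.56
perl Makefile.PL
make
make install

5.3.2 Main Manager tools

masterha_check_ssh         checks MHA's SSH configuration
masterha_check_repl        checks MySQL replication
masterha_manager           starts MHA
masterha_check_status      checks the current MHA running state
masterha_master_monitor    detects whether the master is down
masterha_master_switch     controls failover (automatic or manual)
masterha_conf_host         adds or removes server entries in the configuration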

5.4 Configure SSH key login

Passwordless SSH login is required, because during a switch MHA has to run commands on the hosts. All hosts should be able to log in to each other without a password.

Note that password login must not be disabled on the hosts, otherwise errors occur (ssh-copy-id needs a password login to install the key).

The MHA Manager host must be able to log in to all three node hosts. On 192.168.129.89, run:

# ssh-keygen    # press Enter at every prompt
# cd ~/.ssh
# ssh-copy-id -i ./id_rsa.pub root@192.168.129.86
# ssh-copy-id -i ./id_rsa.pub root@192.168.129.87
# ssh-copy-id -i ./id_rsa.pub root@192.168.129.88

Generate a key pair on each of 192.168.129.86/87/88 in the same way, then set up key login between all of them.

On 86:
# ssh-copy-id -i ./id_rsa.pub root@192.168.129.88
# ssh-copy-id -i ./id_rsa.pub root@192.168.129.87
On 87:
# ssh-copy-id -i ./id_rsa.pub root@192.168.129.86
# ssh-copy-id -i ./id_rsa.pub root@192.168.129.88
On 88:
# ssh-copy-id -i ./id_rsa.pub root@192.168.129.86
# ssh-copy-id -i ./id_rsa.pub root@192.168.129.87
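The full mesh between the three database hosts is just every ordered pair of distinct hosts. The sketch below only prints the commands rather than running them; print_mesh_commands is a hypothetical helper and the key path ~/.ssh/id_rsa.pub is the ssh-keygen default assumed above.

```shell
# Hosts that must reach each other without a password (from this article).
hosts="192.168.129.86 192.168.129.87 192.168.129.88"

# Hypothetical helper: print one ssh-copy-id command per ordered host pair.
print_mesh_commands() {
    local src dst
    for src in $hosts; do
        for dst in $hosts; do
            [ "$src" = "$dst" ] && continue
            echo "on $src: ssh-copy-id -i ~/.ssh/id_rsa.pub root@$dst"
        done
    done
}

print_mesh_commands
```

Three hosts give six ordered pairs, which is why masterha_check_ssh reports six results in 5.9.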

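5.5 Set the slaves read-only

Set read_only on both slaves. The slaves serve read traffic; the setting is deliberately not written to the configuration file, because a slave may be promoted to master at any time.

# run on both slaves, 87 and 88
# mysql -uroot -p -e 'set global read_only=1'

5.6 Create the monitoring account

# run on the master
grant all privileges on *.* to 'mhamon'@'192.168.129.89' identified by 'mhamonpass';
flush privileges;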

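5.7 Configure relay log purging

During a switch, MHA relies on the relay logs to recover the slaves, so automatic relay log purging must be turned OFF and the relay logs purged manually. By default, a slave deletes each relay log automatically once the SQL thread has finished executing it.

# run on both slaves, 87 and 88, to turn off automatic relay log purging
# mysql -uroot -p -e 'set global relay_log_purge=0'

In an MHA environment these relay logs may be needed to recover other slaves, so automatic deletion has to be disabled. Purging them periodically, however, raises the problem of replication delay: on an ext3 file system, deleting a large file takes a noticeable amount of time and can cause serious replication lag. To avoid this, a hard link to the relay log is created first, because on Linux removing a file name that still has another hard link returns almost immediately. (The same hard-link technique is commonly used when dropping large tables in MySQL.)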
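The hard-link behaviour can be seen with ordinary files. This is only a demonstration of the file system mechanism on a small temporary file, not of purge_relay_logs itself; hardlink_demo and the file names are made up for the example.

```shell
# Hypothetical demo: a file with two names keeps its data when one name is removed.
hardlink_demo() {
    local dir links content
    dir=$(mktemp -d) || return 1
    printf 'relay log payload\n' > "$dir/relay-bin.000001"
    # Second name for the same inode.
    ln "$dir/relay-bin.000001" "$dir/relay-bin.000001.link"
    # Field 2 of `ls -l` is the link count.
    links=$(ls -l "$dir/relay-bin.000001" | awk '{print $2}')
    # Removing the original name only drops one directory entry;
    # the data blocks stay until the last link is gone.
    rm -f "$dir/relay-bin.000001"
    content=$(cat "$dir/relay-bin.000001.link")
    echo "links=$links"
    echo "content=$content"
    rm -rf "$dir"
}

hardlink_demo
```

The expensive part, freeing the data blocks, happens only when the last link is removed, and that can be done outside the SQL thread's path.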

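MHA Node ships the purge_relay_logs tool, which works as follows: it creates hard links for the relay logs, executes SET GLOBAL relay_log_purge=1, waits a few seconds for the SQL thread to switch to a new relay log, and then executes SET GLOBAL relay_log_purge=0.

See also: http://blog.csdn.net/leshami/article/details/45688503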

purge_relay_logs options

--user                       MySQL user name
--password                   MySQL password
--port                       port number
--workdir                    where the hard links to the relay logs are created; the default is /var/tmp. Hard links cannot cross file systems, so point this at a directory on the same partition as the relay logs. After the script succeeds, the hard-linked relay log files are deleted.
--disable_relay_log_purge    by default the script does nothing and exits if relay_log_purge=1. With this option it sets relay_log_purge to 0 in that case, purges the relay logs, and leaves the parameter OFF at the end.

Schedule the purge script on both slaves.

# crontab -l
0 4 * * * /bin/bash /data/scripts/purge_relay_log.sh
# cat /data/scripts/purge_relay_log.sh
#!/bin/bash
# relay log cleanup script
. /etc/profile
. ~/.bash_profile
. ~/.bashrc
user=root
passwd=dbpass123
port=3306
log_dir='/data/masterha/log'
work_dir='/data/masterha'
purge='/usr/local/bin/purge_relay_logs'
if [ ! -d "$log_dir" ]; then
    mkdir -p "$log_dir"
fi
$purge --user=$user --password=$passwd --disable_relay_log_purge --port=$port --workdir=$work_dir >> $log_dir/purge_relay_logs.log 2>&1

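5.8 Configure MHA Manager

5.8.1 MHA directory layout

# tree /data/mha/    # MHA configuration and scripts
/data/mha/
├── app1                             # log directory
│   ├── app1.log                     # log file
│   └── app1.master_status.health    # health-check file
├── app1.cnf                         # MHA configuration file
├── bin
│   └── mhaCli.sh                    # MHA start script, written by hand
├── master_ip_failover               # switch script run after MHA detects a failure
└── sendEmail                        # script that sends the alert email after a switch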

5.8.2 MHA configuration file

/data/mha/app1.cnf

[server default]
# manager log file
manager_log=/data/mha/app1/app1.log
# manager working directory
manager_workdir=/data/mha/app1/
# where the master keeps its binlogs, so that MHA can find them; here it is the MySQL data directory
master_binlog_dir=/data/mysqldata
# script run during automatic failover
master_ip_failover_script=/data/mha/master_ip_failover
# script run during a manual switchover
master_ip_online_change_script=/data/mha/master_ip_failover
# script that sends the alert after a switch
report_script=/data/mha/sendEmail
# where binlogs are saved on the remote MySQL hosts during a switch
remote_workdir=/tmp
# interval in seconds between pings to the master (default 3); failover starts after three missed replies
ping_interval=10
# monitoring user and its password
user=mhamon
password=mhamonpass
# replication account and its password
repl_user=repluser
repl_password=replpass
# SSH port and login user
ssh_port=22
ssh_user=root
secondary_check_script=/usr/local/bin/masterha_secondary_check -s slave87 -s slave88

[server1]
hostname=192.168.129.86
port=3306

[server2]
hostname=192.168.129.87
port=3306
# The next two parameters are not used here but are worth knowing:
# candidate_master=1
#   Marks this host as the candidate master. After a failover this slave is promoted even if it does not hold the most recent events in the cluster.
# check_repl_delay=0
#   By default MHA will not pick a slave that is more than 100 MB of relay logs behind the master, because recovering it would take a long time. With check_repl_delay=0, MHA ignores replication delay when choosing the new master. This is very useful together with candidate_master=1, because that host must become the new master during a switch.

[server3]
hostname=192.168.129.88
port=3306
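A quick way to double-check which hosts a configuration file actually declares is to list its [serverN] blocks. The sketch below is a hypothetical helper, not an MHA tool; it assumes one key=value per line as in the file above and falls back to port 3306 when a block has no port line.

```shell
# Hypothetical helper: print "section host:port" for each [serverN] block of an MHA config.
mha_hosts() {
    awk -F'=' '
        function emit() {
            if (sec ~ /^server[0-9]+$/ && host != "") print sec, host ":" port
        }
        /^[ \t]*\[/ {
            emit()
            sec = $0
            gsub(/[^A-Za-z0-9_ ]/, "", sec)   # strip the brackets
            gsub(/^ +| +$/, "", sec)          # trim surrounding spaces
            host = ""; port = 3306
            next
        }
        /^[ \t]*hostname[ \t]*=/ { v = $2; gsub(/[ \t]/, "", v); host = v }
        /^[ \t]*port[ \t]*=/     { v = $2; gsub(/[ \t]/, "", v); port = v }
        END { emit() }
    ' "$1"
}

# Demonstration on a temporary copy of the relevant part of app1.cnf.
demo_cnf=$(mktemp)
printf '[server default]\nuser=mhamon\n\n[server1]\nhostname=192.168.129.86\nport=3306\n\n[server2]\nhostname=192.168.129.87\nport=3306\n' > "$demo_cnf"
mha_hosts "$demo_cnf"
rm -f "$demo_cnf"
```

The [server default] section is skipped on purpose, because it holds shared settings rather than a host.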

5.8.3 MHA start script

# cat bin/mhaCli.sh
#!/bin/bash
. /etc/profile
. ~/.bash_profile
. ~/.bashrc
# number of running masterha_manager processes
run_num=$(ps -ef | grep masterha_manager | grep -v grep | wc -l)
pid_file='/data/mha/app1/app1.master_status.health'
start() {
    if [[ $run_num -lt 1 ]]; then
        args="--conf=/data/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover"
        nohup masterha_manager $args < /dev/null > /data/mha/app1/app1.log 2>&1 &
    else
        echo 'mha is already running...'
    fi
}
stop() {
    if [[ $run_num -lt 1 ]]; then
        echo 'mha not running ...'
        exit 64
    else
        ps -ef | grep masterha_manager | grep -v grep | awk '{print $2}' | xargs kill -9
        rm -f $pid_file
        echo 'mha stop...'
    fi
}
status() {
    masterha_check_status --conf=/data/mha/app1.cnf
}
case "$1" in
    start) start ;;
    stop) stop ;;
    status) status ;;
    *) echo 'mhaCli {stop|start|status}' ;;
esac

5.8.4 MHA automatic failover script

Use the script below as is; just adjust the environment-specific values (the VIP, the key, and the bond0 interface name).

# cat master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);
my $vip = '192.168.129.91/26';    # the VIP
my $key = '91';                   # the VIP is bound to bond0:91
my $ssh_start_vip = "/sbin/ifconfig bond0:$key $vip";    # bring the VIP up
my $ssh_stop_vip  = "/sbin/ifconfig bond0:$key down";    # bring the VIP down
GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);
exit &main();
sub main {
    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
    if ($command eq "stop" || $command eq "stopssh") {
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif($command eq "start") {
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif($command eq "status") {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    } else {
        &usage();
        exit 1;
    }
}
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
     return 0  unless  ($ssh_user);
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
    print
    "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

5.8.5 Alert script

# cat sendEmail
#!/usr/bin/perl
#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
## Note: This is a sample script and is not complete. Modify the script based on your environment.
use strict;
use warnings FATAL => 'all';
use Mail::Sender;
use Getopt::Long;
# new_master_host and new_slave_hosts are set only when recovering master succeeded
my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );
my $smtp      = 'smtp.163.com';
my $mail_from = 'xxx@163.com';
my $mail_user = 'xxx@163.com';
my $mail_pass = 'xxxxxx';
my $mail_to   = [ 'xxx@xxx.cn', 'xxxxx@xxx.cn' ];
GetOptions(
    'orig_master_host=s' => \$dead_master_host,
    'new_master_host=s'  => \$new_master_host,
    'new_slave_hosts=s'  => \$new_slave_hosts,
    'subject=s'          => \$subject,
    'body=s'             => \$body,
);
mailToContacts( $smtp, $mail_from, $mail_user, $mail_pass, $mail_to, $subject, $body );
sub mailToContacts {
    my ( $smtp, $mail_from, $user, $passwd, $mail_to, $subject, $msg ) = @_;
    open my $DEBUG, "> /tmp/monitormail.log"
      or die "Can't open the debug file:$!\n";
    my $sender = new Mail::Sender {
        ctype       => 'text/plain; charset=utf-8',
        encoding    => 'utf-8',
        smtp        => $smtp,
        from        => $mail_from,
        auth        => 'LOGIN',
        TLS_allowed => '0',
        authid      => $user,
        authpwd     => $passwd,
        to          => $mail_to,
        subject     => $subject,
        debug       => $DEBUG
    };
    $sender->MailMsg(
        {
            msg   => $msg,
            debug => $DEBUG
        }
    ) or print $Mail::Sender::Error;
    return 1;
}
# Do whatever you want here
exit 0;

5.9 Checking and starting the MHA cluster

Test passwordless SSH: masterha_check_ssh --conf=/data/mha/app1.cnf

Six OK results and no failures mean the check passed.

Test MySQL replication: masterha_check_repl --conf=/data/mha/app1.cnf

Wed May 31 16:14:56 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed May 31 16:14:56 2017 - [info] Reading application default configuration from /data/mha/app1.cnf..
Wed May 31 16:14:56 2017 - [info] Reading server configuration from /data/mha/app1.cnf..
Wed May 31 16:14:56 2017 - [info] MHA::MasterMonitor version 0.56.
Wed May 31 16:14:57 2017 - [info] GTID failover mode = 1
Wed May 31 16:14:57 2017 - [info] Dead Servers:
Wed May 31 16:14:57 2017 - [info] Alive Servers:
Wed May 31 16:14:57 2017 - [info]   192.168.129.86(192.168.129.86:3306)
Wed May 31 16:14:57 2017 - [info]   192.168.129.87(192.168.129.87:3306)
Wed May 31 16:14:57 2017 - [info]   192.168.129.88(192.168.129.88:3306)
Wed May 31 16:14:57 2017 - [info] Alive Slaves:
Wed May 31 16:14:57 2017 - [info]   192.168.129.87(192.168.129.87:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Wed May 31 16:14:57 2017 - [info]     GTID ON
Wed May 31 16:14:57 2017 - [info]     Replicating from 192.168.129.86(192.168.129.86:3306)
Wed May 31 16:14:57 2017 - [info]   192.168.129.88(192.168.129.88:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Wed May 31 16:14:57 2017 - [info]     GTID ON
Wed May 31 16:14:57 2017 - [info]     Replicating from 192.168.129.86(192.168.129.86:3306)
Wed May 31 16:14:57 2017 - [info] Current Alive Master: 192.168.129.86(192.168.129.86:3306)
Wed May 31 16:14:57 2017 - [info] Checking slave configurations..
Wed May 31 16:14:57 2017 - [info]  read_only=1 is not set on slave 192.168.129.87(192.168.129.87:3306).
Wed May 31 16:14:57 2017 - [info]  read_only=1 is not set on slave 192.168.129.88(192.168.129.88:3306).
Wed May 31 16:14:57 2017 - [info] Checking replication filtering settings..
Wed May 31 16:14:57 2017 - [info]  binlog_do_db= , binlog_ignore_db=
Wed May 31 16:14:57 2017 - [info]  Replication filtering check ok.
Wed May 31 16:14:57 2017 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Wed May 31 16:14:57 2017 - [info] Checking SSH publickey authentication settings on the current master..
Wed May 31 16:14:57 2017 - [info] HealthCheck: SSH to 192.168.129.86 is reachable.
Wed May 31 16:14:57 2017 - [info]
192.168.129.86(192.168.129.86:3306) (current master)
 +--192.168.129.87(192.168.129.87:3306)
 +--192.168.129.88(192.168.129.88:3306)
Wed May 31 16:14:57 2017 - [info] Checking replication health on 192.168.129.87..
Wed May 31 16:14:57 2017 - [info]  ok.
Wed May 31 16:14:57 2017 - [info] Checking replication health on 192.168.129.88..
Wed May 31 16:14:57 2017 - [info]  ok.
Wed May 31 16:14:57 2017 - [info] Checking master_ip_failover_script status:
Wed May 31 16:14:57 2017 - [info]   /data/mha/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.129.86 --orig_master_ip=192.168.129.86 --orig_master_port=3306
IN SCRIPT TEST====/sbin/ifconfig bond0:91 down==/sbin/ifconfig bond0:91 192.168.129.91/26===
Checking the Status of the script.. OK
Wed May 31 16:14:57 2017 - [info]  OK.
Wed May 31 16:14:57 2017 - [warning] shutdown_script is not defined.
Wed May 31 16:14:57 2017 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
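Because the output is long, a wrapper script can simply look for the final verdict line. A minimal sketch, assuming the success message is exactly the one shown above; repl_health_ok is a hypothetical helper that reads the masterha_check_repl output on standard input.

```shell
# Hypothetical helper: succeed if the check output contains the success verdict.
repl_health_ok() {
    grep -q 'MySQL Replication Health is OK\.'
}

# Demonstration with the last lines of the output shown above.
if printf 'Got exit code 0 (Not master dead).\nMySQL Replication Health is OK.\n' | repl_health_ok; then
    echo "replication check passed"
fi

# Against the real tool it would be used like this:
#   masterha_check_repl --conf=/data/mha/app1.cnf 2>&1 | repl_health_ok
```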

Start the MHA Manager node:

/data/mha/bin/mhaCli.sh start

Check the state of MHA Manager:

masterha_check_status --conf=/data/mha/app1.cnf
or:
/data/mha/bin/mhaCli.sh status

5.10 Configure the VIP

Because the VIP is managed by a script, it has to be configured by hand the first time. On the master, 192.168.129.86, run:

# /sbin/ifconfig bond0:91 192.168.129.91/26
# ifconfig
bond0     Link encap:Ethernet  HWaddr 00:0E:1E:BD:03:E0
          inet addr:192.168.129.86  Bcast:192.168.129.127  Mask:255.255.255.192
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:1335051 errors:0 dropped:0 overruns:0 frame:0
          TX packets:708712 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1524679739 (1.4 GiB)  TX bytes:708278247 (675.4 MiB)
bond0:91  Link encap:Ethernet  HWaddr 00:0E:1E:BD:03:E0
          inet addr:192.168.129.91  Bcast:192.168.129.127  Mask:255.255.255.192
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1

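6. MHA Cluster Test

To test, stop MySQL on the master and watch the MHA log, which shows the switch in detail.

After MySQL on 86 is stopped, VIP 91 floats to the new master, and one of the slaves is promoted to master.

Run ifconfig on 87 to verify.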
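The verification can be scripted by searching the ifconfig output for the VIP. A minimal sketch; has_vip is a hypothetical helper that reads the output on standard input and compares whole address fields, so 192.168.129.9 does not match 192.168.129.91.

```shell
# Hypothetical helper: succeed if address $1 appears as an interface address on stdin.
# Accepts both "inet addr:A.B.C.D" (ifconfig on CentOS 6) and "inet A.B.C.D/NN" (ip addr).
has_vip() {
    awk -v vip="$1" '
        {
            for (i = 1; i <= NF; i++) {
                f = $i
                sub(/^addr:/, "", f)       # ifconfig prefix
                sub(/\/[0-9]+$/, "", f)    # CIDR suffix from ip addr
                if (f == vip) found = 1
            }
        }
        END { exit !found }
    '
}

# Demonstration with a canned line from the output in 5.10.
if printf '          inet addr:192.168.129.91  Bcast:192.168.129.127  Mask:255.255.255.192\n' | has_vip 192.168.129.91; then
    echo "VIP is bound on this host"
fi

# On the new master it would be used like this:
#   /sbin/ifconfig | has_vip 192.168.129.91
```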

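7. Bringing a Node Back Online

Once the original master, server A, has been repaired and needs to rejoin the cluster, proceed as follows:

1. Set up MySQL on server A, preferably with the same configuration parameters as before, and make sure passwordless SSH between the servers works.
2. On the current master or on a slave, back up the data with mysqldump, using the --master-data=2 -A options.
3. Restore the backup on server A, then run flush privileges to reload the grants.
4. Run the GTID CHANGE MASTER statement (see 4.4), then start slave.
5. Add the host back to the MHA configuration file so that MHA Manager sees the node.
6. Run the MHA checks (see 5.9); if they pass, start MHA.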
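Step 5 matters because mhaCli.sh starts the manager with --remove_dead_master_conf, so the failed master's section is no longer in app1.cnf after a failover. The sketch below appends it again; add_server_block is a hypothetical helper (MHA also ships masterha_conf_host for this purpose, see 5.3.2), and it is demonstrated on a temporary file rather than on /data/mha/app1.cnf.

```shell
# Hypothetical helper: append a [name] block to an MHA config unless it already exists.
add_server_block() {
    local cnf="$1" name="$2" host="$3" port="${4:-3306}"
    if grep -q "^\[$name\]" "$cnf"; then
        echo "$name is already present in $cnf"
        return 0
    fi
    printf '\n[%s]\nhostname=%s\nport=%s\n' "$name" "$host" "$port" >> "$cnf"
    echo "added $name ($host:$port) to $cnf"
}

# Demonstration on a temporary file standing in for app1.cnf after a failover.
demo_cnf=$(mktemp)
printf '[server default]\nuser=mhamon\n\n[server2]\nhostname=192.168.129.87\nport=3306\n' > "$demo_cnf"
add_server_block "$demo_cnf" server1 192.168.129.86
add_server_block "$demo_cnf" server1 192.168.129.86    # second call is a no-op
rm -f "$demo_cnf"
```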

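8. Summary

Several solutions provide database high availability to some degree, for example MMM, heartbeat+DRBD, MySQL Cluster, and Galera-based clusters such as Percona XtraDB Cluster.

Each of these has its strengths and weaknesses. The choice depends mainly on the workload and on how strict the data-consistency requirements are. Given the requirements here for both availability and data consistency, MHA is the recommended architecture.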

References:

https://yq.aliyun.com/articles/58004?spm=5176.100239.blogcont57855.9.jUuCt0

http://www.cnblogs.com/gomysql/p/3675429.html

http://blog.csdn.net/wb96a1007/article/details/51559741

https://github.com/yoshinorim/mha4mysql-manager/wiki/ReleaseNotes#changes-in-manager-057

Source: http://www.cnblogs.com/topicjie/p/7188533.html
