资讯专栏INFORMATION COLUMN

MySQL集群搭建(6)-双主+keepalived高可用

CarlBenjamin / 2055人阅读

摘要:双主是一个比较简单的高可用架构,适用于中小集群,今天就说说怎么用做的高可用。缺点也比较明显,就是增加从节点的情况下,从节点不会主动切换同步对象,而且脚本需要自己实现,有一定风险。

双主 + keepalived 是一个比较简单的 MySQL 高可用架构,适用于中小 MySQL 集群,今天就说说怎么用 keepalived 做 MySQL 的高可用。

1 概述 1.1 keepalived 简介

简单地说,keepalived 就是通过管理 VIP 来实现机器的高可用的,在使用 keepalived 的情况下,只有一台服务器能够提供服务(通过 VIP 来实现),当 Master 主机宕机后,VIP 会自动飘移到另一台服务器

keepalived 采用 Master/Slave 模式, 在 Master 上设置配置文件的 VIP,当 Master 宕机后,VIP 自动漂移到另一台 keepalived 服务器上

keepalived 可以用来做各种软件的高可用集群,它会一直检测服务器的状态,如果有一台服务器宕机,或工作出现故障,keepalived 将检测到,并将有故障的服务器从系统中剔除,同时使用其他服务器代替该服务器的工作,当服务器工作正常后 keepalived 自动将服务器加入到服务器群中。

1.2 keepalived 配合双主

keepalived 使用默认配置只能做到主机级别的高可用,但是我们的 MySQL 要做高可用至少要增加以下功能

能够检测 MySQL 服务状态

主节点 read_only=0,备节点 read_only=1

切换时,备节点要等待主节点同步完成

所以,keepalived 实现 MySQL 高可用需要使用自定义脚本来进行扩展

2 环境准备 2.1 数据库环境

操作前已经准备好了一套主主架构数据库,搭建方法参考 MySQL集群搭建(2)-主主从模式

节点信息
IP 系统 端口 MySQL版本 节点 读写 说明
10.0.0.247 Centos6.5 3306 5.7.9 Master 读写 主节点
10.0.0.248 Centos6.5 3306 5.7.9 Standby 只读,可切换为读写 备主节点
VIP 信息
简称 VIP 类型
RW-VIP 10.0.0.237 读写VIP
Master 参考配置

</>复制代码

  1. [client]
  2. port = 3306
  3. default-character-set=utf8mb4
  4. socket = /data/mysql_db/test_db/mysql.sock
  5. [mysqld]
  6. datadir = /data/mysql_db/test_db
  7. basedir = /usr/local/mysql57
  8. tmpdir = /tmp
  9. socket = /data/mysql_db/test_db/mysql.sock
  10. pid-file = /data/mysql_db/test_db/mysql.pid
  11. skip-external-locking = 1
  12. skip-name-resolve = 1
  13. port = 3306
  14. server_id = 2473306
  15. default-storage-engine = InnoDB
  16. character-set-server = utf8mb4
  17. default_password_lifetime=0
  18. auto_increment_offset = 1
  19. auto_increment_increment = 2
  20. #### log ####
  21. log_timestamps=system
  22. log_bin = /data/mysql_log/test_db/mysql-bin
  23. log_bin_index = /data/mysql_log/test_db/mysql-bin.index
  24. binlog_format = row
  25. relay_log_recovery=ON
  26. relay_log=/data/mysql_log/test_db/mysql-relay-bin
  27. relay_log_index=/data/mysql_log/test_db/mysql-relay-bin.index
  28. log_error = /data/mysql_log/test_db/mysql-error.log
  29. #### replication ####
  30. log_slave_updates = 1
  31. replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.%
  32. #### semi sync replication settings #####
  33. plugin_dir=/usr/local/mysql57/lib/plugin
  34. plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
  35. loose_rpl_semi_sync_master_enabled = 1
  36. loose_rpl_semi_sync_slave_enabled = 1
Slave 参考配置

</>复制代码

  1. [client]
  2. port = 3306
  3. default-character-set=utf8mb4
  4. socket = /data/mysql_db/test_db/mysql.sock
  5. [mysqld]
  6. datadir = /data/mysql_db/test_db
  7. basedir = /usr/local/mysql57
  8. tmpdir = /tmp
  9. socket = /data/mysql_db/test_db/mysql.sock
  10. pid-file = /data/mysql_db/test_db/mysql.pid
  11. skip-external-locking = 1
  12. skip-name-resolve = 1
  13. port = 3306
  14. server_id = 2483306
  15. default-storage-engine = InnoDB
  16. character-set-server = utf8mb4
  17. default_password_lifetime=0
  18. auto_increment_offset = 2
  19. auto_increment_increment = 2
  20. #### log ####
  21. log_timestamps=system
  22. log_bin = /data/mysql_log/test_db/mysql-bin
  23. log_bin_index = /data/mysql_log/test_db/mysql-bin.index
  24. binlog_format = row
  25. relay_log_recovery=ON
  26. relay_log=/data/mysql_log/test_db/mysql-relay-bin
  27. relay_log_index=/data/mysql_log/test_db/mysql-relay-bin.index
  28. log_error = /data/mysql_log/test_db/mysql-error.log
  29. #### replication ####
  30. log_slave_updates = 1
  31. replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.%
  32. #### semi sync replication settings #####
  33. plugin_dir=/usr/local/mysql57/lib/plugin
  34. plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
  35. loose_rpl_semi_sync_master_enabled = 1
  36. loose_rpl_semi_sync_slave_enabled = 1
2.2 创建监控用的账号

</>复制代码

  1. - 由于是测试环境,账号密码设置比较随便
  2. create user monitor@"localhost" identified by "monitor";
  3. grant all on *.* to monitor@"localhost";
  4. flush privileges;
2.3 安装 keepalived

我们在 Master 和 Slave 上部署 keepalived

1). yum 安装

如果有对应的 yum 源,直接安装就可以了

</>复制代码

  1. yum install -y keepalived
2). 源码安装

下载安装包, 下载地址 keepalived, 使用 1.2.24 版本举例

</>复制代码

  1. # 安装依赖
  2. yum install -y gcc popt-devel openssl openssl-devel libssl-dev libnl-devel popt-devel libnfnetlink-devel
  3. # 下载包
  4. wget http://www.keepalived.org/software/keepalived-1.2.24.tar.gz
  5. # 解压安装
  6. tar -xvz -f keepalived-1.2.24.tar.gz
  7. cd keepalived-1.2.24
  8. ./configure --prefix=/usr/local/keepalived
  9. make && make install
  10. cp /usr/local/keepalived/sbin/keepalived /usr/sbin/
  11. cp /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/init.d/
  12. cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
  13. mkdir /etc/keepalived/
  14. cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
3 配置高可用 3.1 keepalived 配置

打开 /etc/keepalived/keepalived.conf 文件, 按照实际情况加上下面的配置

</>复制代码

  1. global_defs {
  2. router_id MYSQL_MM # 标识
  3. vrrp_skip_check_adv_addr
  4. vrrp_strict # 严格执行 VRRP 协议规范
  5. vrrp_garp_interval 0
  6. vrrp_gna_interval 0
  7. }
  8. vrrp_script check_mysql {
  9. script "/bin/sh /etc/keepalived/keepalived_mysql_check.sh" # 检查脚本
  10. interval 10 # 检查周期
  11. }
  12. vrrp_instance MYSQL_MM {
  13. state BACKUP # 都设为 BACKUP,避免起来后抢占
  14. interface eth0 # 网卡名称,根据实际情况填写
  15. virtual_router_id 243 # 用来区分 VRRP 组播的标记,取值 0-255
  16. priority 100
  17. advert_int 1
  18. nopreempt # 设为非抢占
  19. authentication {
  20. auth_type PASS
  21. auth_pass 1111
  22. }
  23. # Master 节点可以注释掉下面语句,防止启动 keepalived 的时候执行脚本
  24. notify_master "/bin/sh /etc/keepalived/keepalived_mysql_start.sh" # 变为 MASTER 时执行
  25. virtual_ipaddress {
  26. 10.0.0.237
  27. }
  28. # Slave 节点可以注释下面检查脚本,Slave 没有必要一直检查
  29. track_script {
  30. check_mysql
  31. }
  32. }
3.2 配置检查脚本

打开 /etc/keepalived/keepalived_mysql_check.sh, 写入检测脚本

</>复制代码

  1. #!/bin/sh
  2. # @Author: chengqm
  3. # MySQL 检测脚本
  4. MyPath=$(cd $(dirname $0); pwd)
  5. cd $MyPath
  6. ThisTime=`date "+%F %T"`
  7. log_file="/var/log/keepalived_mysql.log"
  8. # MySQL 连接方式,根据实际情况调整
  9. export MYSQL_PWD="monitor"
  10. MYSQL_USER="monitor"
  11. MYSQL_SOCKET="/data/mysql_db/test_db/mysql.sock"
  12. mysql_connect="mysql -u${MYSQL_USER} -S${MYSQL_SOCKET} "
  13. # 美化输出
  14. function techo() {
  15. message=$1
  16. message_level=$2
  17. if [ -e $message_level ];then
  18. message_level="info"
  19. fi
  20. echo "`date "+%F %T"` - [${message_level}] $message" >> $log_file
  21. }
  22. # 检查函数, 正常返回 0
  23. function check {
  24. ret=`$mysql_connect -N -e "select 1 as value"`
  25. if [ $? -ne 0 ] || [ $ret -ne "1" ];then
  26. return 1
  27. else
  28. return 0
  29. fi
  30. }
  31. function read_only {
  32. param=$1
  33. $mysql_connect -e "set global read_only = ${param}"
  34. techo "设置是否只读 read_only ${param}"
  35. }
  36. # 失效转移
  37. function failover {
  38. techo "开始执行失效转移"
  39. # 1. 停止 keepalived
  40. killall keepalived
  41. # 2. 如果还能执行的话,设为 read_only
  42. read_only 1
  43. if [ $? -eq 0 ];then
  44. # 3. 如果还能执行,kill 所有的连接
  45. $mysql_connect -e "select concat("KILL ",id,";") from information_schema.processlist where user!="root" AND db is not null into outfile "/tmp/kill.txt.${ThisTime}";"
  46. if [ $? -eq 0 ];then
  47. $mysql_connect -e "source /tmp/kill.txt.${ThisTime};"
  48. fi
  49. fi
  50. # 4. 其他操作,比如说自动关机
  51. techo "失效转移执行成功,当前数据库关闭访问"
  52. }
  53. # 有问题检查 4 次
  54. for ((i=1; i<=4; i ++))
  55. do
  56. check
  57. if [ $? -eq 0 ];then
  58. techo "MySQL is ok"
  59. # 正常退出脚本
  60. exit 0
  61. else
  62. techo "Connection failed $i time(s)"
  63. sleep 1
  64. fi
  65. done
  66. techo "无法连接当前数据库"
  67. # 失效转移
  68. failover

注意:脚本没有经过严格测试,需要根据实际情况调整

3.3 配置提升为 Master 时执行的脚本

打开 /bin/sh /etc/keepalived/keepalived_mysql_start.sh", 写入脚本内容

</>复制代码

  1. #!/bin/sh
  2. # @Author: chengqm
  3. # keepalived 变为 Master 时执行
  4. MyPath=$(cd $(dirname $0); pwd)
  5. cd $MyPath
  6. ThisTime=`date "+%F %T"`
  7. log_file="/var/log/keepalived_mysql.log"
  8. # MySQL 连接方式,根据实际情况调整
  9. export MYSQL_PWD="monitor"
  10. MYSQL_USER="monitor"
  11. MYSQL_SOCKET="/data/mysql_db/test_db/mysql.sock"
  12. mysql_connect="mysql -u${MYSQL_USER} -S${MYSQL_SOCKET} "
  13. # 美化输出
  14. function techo() {
  15. message=$1
  16. message_level=$2
  17. if [ -e $message_level ];then
  18. message_level="info"
  19. fi
  20. echo "`date "+%F %T"` - [${message_level}] $message" >> $log_file
  21. }
  22. # 检查函数, 正常返回 0
  23. function check {
  24. ret=`$mysql_connect -N -e "select 1 as value"`
  25. if [ $? -ne 0 ] || [ $ret -ne "1" ];then
  26. return 1
  27. else
  28. return 0
  29. fi
  30. }
  31. # 获取 slave status 的信息
  32. function slave_info() {
  33. tmp_file=/tmp/slave_info.tmp
  34. $mysql_connect -e "show slave statusG" > /tmp/slave_info.tmp
  35. slave_sql=`grep "Slave_SQL_Running:" $tmp_file | sed "s/s*//g" | tr "A-Z" "a-z" | awk -F":" "{print $2}"`
  36. seconds_behind_master=`grep "Seconds_Behind_Master:" $tmp_file | sed "s/s*//g" | tr "A-Z" "a-z" | awk -F":" "{print $2}"`
  37. master_log_file=`grep "Master_Log_File:" $tmp_file | head -1 | sed "s/s*//g" | tr "A-Z" "a-z" | awk -F":" "{print $2}"`
  38. master_log_pos=`grep "Read_Master_Log_Pos:" $tmp_file | sed "s/s*//g" | tr "A-Z" "a-z" | awk -F":" "{print $2}"`
  39. relay_master_log_file=`grep "Relay_Master_Log_File:" $tmp_file | sed "s/s*//g" | tr "A-Z" "a-z" | awk -F":" "{print $2}"`
  40. exec_master_log_pos=`grep "Exec_Master_Log_Pos:" $tmp_file | sed "s/s*//g" | tr "A-Z" "a-z" | awk -F":" "{print $2}"`
  41. }
  42. # 设置是否可读
  43. function read_only {
  44. param=$1
  45. $mysql_connect -e "set global read_only = ${param}"
  46. techo "设置是否只读 read_only ${param}"
  47. }
  48. # 处理数据同步
  49. function sync_master_log() {
  50. # 如果是数据一致性优先,等待同步完毕。如果是服务可用性优先,可以注销下面的代码
  51. slave_info
  52. if [ $slave_sql == "yes" ];then
  53. techo "当前同步位置 Master ${master_log_file} ${master_log_pos}"
  54. techo "等待同步到 Master ${master_log_file} ${master_log_pos}"
  55. $mysql_connect -e "select master_pos_wait("$master_log_file", $master_log_pos);" > /dev/null
  56. techo "同步完毕"
  57. fi
  58. }
  59. techo "当前数据库提升为主库"
  60. check
  61. if [ $? -ne 0 ];then
  62. techo "无法连接当前数据库"
  63. exit 1
  64. fi
  65. # 等待同步
  66. sync_master_log
  67. # 设为可写
  68. read_only 0

注意:脚本没有经过严格测试,需要根据实际情况调整

3.4 启动 keepalived

由于配置了 BACKUP 模式,所以两个 keepalived 先起来的是主,先后在主备节点执行

</>复制代码

  1. /etc/init.d/keepalived start

检查 /var/log/message 日志,确认 keepalived 没有报错

检查 Master IP 状态, 确认设置了 VIP

</>复制代码

  1. [root@cluster01 shell]# ip addr
  2. 1: lo: mtu 16436 qdisc noqueue state UNKNOWN
  3. link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  4. inet 127.0.0.1/8 scope host lo
  5. inet6 ::1/128 scope host
  6. valid_lft forever preferred_lft forever
  7. 2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
  8. link/ether fa:16:3e:de:80:33 brd ff:ff:ff:ff:ff:ff
  9. inet 10.0.0.247/16 brd 10.0.255.255 scope global eth0
  10. inet 10.0.0.237/32 scope global eth0
  11. inet6 fe80::f816:3eff:fede:8033/64 scope link
  12. valid_lft forever preferred_lft forever

检查 MySQL 检测脚本执行情况,确认正常运行

</>复制代码

  1. [root@cluster01 ~]# tail -f /var/log/keepalived_mysql.log
  2. ...
  3. 2019-01-28 15:04:18 - [info] MySQL is ok
  4. 2019-01-28 15:04:28 - [info] MySQL is ok
4 失效转移测试

mytest 库里新建 nowdate 测试表,只有 idctime 字段,然后每秒插入一条数据

</>复制代码

  1. [root@cluster03 ~]# while true; do date;mysql -h10.0.0.237 -P3306 -umytest -e "use mytest;insert into nowdate values (null, now());"; sleep 1;done
  2. Mon Jan 28 15:04:26 CST 2019
  3. Mon Jan 28 15:04:27 CST 2019
  4. ...

kill 掉 Master 进程

</>复制代码

  1. killall mysqld

查看旧 Master 日志

</>复制代码

  1. 2019-01-28 15:04:48 - [info] MySQL is ok
  2. 2019-01-28 15:04:58 - [info] Connection failed 1 time(s)
  3. 2019-01-28 15:04:59 - [info] Connection failed 2 time(s)
  4. 2019-01-28 15:05:00 - [info] Connection failed 3 time(s)
  5. 2019-01-28 15:05:01 - [info] Connection failed 4 time(s)
  6. 2019-01-28 15:05:02 - [info] 无法连接当前数据库
  7. 2019-01-28 15:05:02 - [info] 开始执行失效转移
  8. 2019-01-28 15:05:02 - [info] 设置是否只读 read_only 1
  9. 2019-01-28 15:05:02 - [info] 失效转移执行成功,当前数据库关闭访问

查看新 Master 日志

</>复制代码

  1. 2019-01-28 15:05:04 - [info] 当前数据库提升为主库
  2. 2019-01-28 15:05:04 - [info] 当前同步位置 Master mysql-bin.000015 32338
  3. 2019-01-28 15:05:04 - [info] 等待同步到 Master mysql-bin.000015 32338
  4. 2019-01-28 15:05:04 - [info] 同步完毕
  5. 2019-01-28 15:05:04 - [info] 设置是否只读 read_only 0
  6. 2019-01-28 15:05:05 - [info] MySQL is ok

查看新 Master IP,确认 VIP 已经飘过来了

</>复制代码

  1. [root@cluster02 ~]# ip addr
  2. 1: lo: mtu 16436 qdisc noqueue state UNKNOWN
  3. link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  4. inet 127.0.0.1/8 scope host lo
  5. inet6 ::1/128 scope host
  6. valid_lft forever preferred_lft forever
  7. 2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
  8. link/ether fa:16:3e:66:7e:e8 brd ff:ff:ff:ff:ff:ff
  9. inet 10.0.0.248/16 brd 10.0.255.255 scope global eth0
  10. inet 10.0.0.237/32 scope global eth0
  11. inet6 fe80::f816:3eff:fe66:7ee8/64 scope link
  12. valid_lft forever preferred_lft forever

查看插入数据执行情况,大概有 12 秒是不可用的

</>复制代码

  1. Mon Jan 28 15:04:51 CST 2019
  2. ERROR 2003 (HY000): Can"t connect to MySQL server on "10.0.0.237" (111)
  3. Mon Jan 28 15:04:52 CST 2019
  4. ERROR 2003 (HY000): Can"t connect to MySQL server on "10.0.0.237" (111)
  5. Mon Jan 28 15:04:53 CST 2019
  6. ERROR 2003 (HY000): Can"t connect to MySQL server on "10.0.0.237" (111)
  7. Mon Jan 28 15:04:54 CST 2019
  8. ERROR 2003 (HY000): Can"t connect to MySQL server on "10.0.0.237" (111)
  9. Mon Jan 28 15:04:55 CST 2019
  10. ERROR 2003 (HY000): Can"t connect to MySQL server on "10.0.0.237" (111)
  11. Mon Jan 28 15:04:56 CST 2019
  12. ERROR 2003 (HY000): Can"t connect to MySQL server on "10.0.0.237" (111)
  13. Mon Jan 28 15:04:57 CST 2019
  14. ERROR 2003 (HY000): Can"t connect to MySQL server on "10.0.0.237" (111)
  15. Mon Jan 28 15:04:58 CST 2019
  16. ERROR 2003 (HY000): Can"t connect to MySQL server on "10.0.0.237" (111)
  17. Mon Jan 28 15:05:00 CST 2019
  18. ERROR 2003 (HY000): Can"t connect to MySQL server on "10.0.0.237" (111)
  19. Mon Jan 28 15:05:01 CST 2019
  20. ERROR 2003 (HY000): Can"t connect to MySQL server on "10.0.0.237" (111)
  21. Mon Jan 28 15:05:02 CST 2019
  22. ERROR 2003 (HY000): Can"t connect to MySQL server on "10.0.0.237" (111)
  23. Mon Jan 28 15:05:03 CST 2019

失效切换成功

5 总结

使用双主 + keepalived 的优点是部署简单,双主加半同步情况下,理论上不会丢数据,适用于中小型 MySQL 集群。缺点也比较明显,就是增加从节点的情况下,从节点不会主动切换同步对象,而且脚本需要自己实现,有一定风险。

文章版权归作者所有,未经允许请勿转载,若此文章存在违规行为,您可以联系管理员删除。

转载请注明本文地址:https://www.ucloud.cn/yun/17896.html

相关文章

  • MySQL集群搭建(6)-双主+keepalived可用

    摘要:双主是一个比较简单的高可用架构,适用于中小集群,今天就说说怎么用做的高可用。缺点也比较明显,就是增加从节点的情况下,从节点不会主动切换同步对象,而且脚本需要自己实现,有一定风险。 双主 + keepalived 是一个比较简单的 MySQL 高可用架构,适用于中小 MySQL 集群,今天就说说怎么用 keepalived 做 MySQL 的高可用。 1 概述 1.1 keepalive...

    lily_wang 评论0 收藏0
  • MySQL可用方案测试

    MySQL高可用方案测试 img{ display:block; margin:0 auto !important; width:100%; } body{ width:75%; margin...

    IT那活儿 评论0 收藏2496

发表评论

0条评论

CarlBenjamin

|高级讲师

TA的文章

阅读更多
最新活动
阅读需要支付1元查看
<