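I previously wrote about master/slave failover with redis + sentinel. This time keepalived is added to the mix: it suits projects that can only connect to a single address but still need high availability.
This post focuses on the keepalived configuration: how to expose a single access point while staying highly available, so that the master/slave switchover is seamless.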
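1. One thing to watch out for
When the master goes down and the backup takes over, the old master must not become master again when it comes back up. If it took over again after recovering, the service would flap back and forth. This is what the nopreempt parameter is for.
nopreempt: disables preemption. It can only be set on a node whose state is backup, and that node's priority must be higher than the other node's.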
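First, the overall idea:
A keepalived custom script monitors the local redis service. When the script detects that redis is unhealthy, the priority of the local keepalived changes, which in turn changes the master/backup roles. keepalived runs notify scripts whenever its role changes, and that gives us the chance to change the redis master/slave state. The goal is to keep the data on the redis master and slave consistent.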
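There are three cases when running keepalived + redis:
1 keepalived and redis both go down. The VIP floats away, and the sentinel cluster automatically switches the redis master/slave and resynchronizes the data, keeping it consistent.
2 keepalived goes down but redis does not. After the VIP floats away, the redis master/slave relationship is still the old one. Left unchanged, data would be written to the redis slave and never reach the master, so the scripts have to reverse the redis master/slave relationship. Leave some time for data synchronization first, then reverse the roles.
3 keepalived stays up but redis goes down. The check script detects that redis is down and lowers the priority of the keepalived master, which also makes the VIP float away. This is the same as case 2: synchronize the data, then reverse the current redis master/slave relationship.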
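The priority arithmetic behind case 3 can be sketched in a few lines of shell. The helper name effective_priority is hypothetical; the numbers 200, 190 and -20 are the ones used in the configuration files in this post. The sketch assumes the usual keepalived behaviour for a negative weight: it is applied to the priority only while the tracked script is failing.

```shell
#!/bin/bash
# Hypothetical helper: compute the effective VRRP priority of a node.
# $1 = base priority, $2 = weight of the tracked script (negative here),
# $3 = exit status of the check script (0 = redis healthy).
effective_priority() {
    local base=$1 weight=$2 check_status=$3
    if [ "$check_status" -eq 0 ]; then
        echo "$base"
    else
        echo $(( base + weight ))
    fi
}

main_prio=$(effective_priority 200 -20 1)    # redis is down on the main node
backup_prio=$(effective_priority 190 -20 0)  # redis is healthy on the other node
echo "main=$main_prio backup=$backup_prio"
if [ "$main_prio" -lt "$backup_prio" ]; then
    echo "VIP moves to the other node"
fi
```

With these values the main node drops to 180, below the other node's 190, so the VIP moves. A weight smaller than the gap between the two priorities (10 here) would not trigger a switchover.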
On to the main topic. I'll skip installing keepalived and redis and go straight to the keepalived configuration files.
I. Configure the main keepalived
! Configuration File for keepalived
global_defs {
lvs_id LVS_redis
}
vrrp_script chk_redis {
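script "/etc/keepalived/scripts/redis_check.sh" # script to run
weight -20 # priority change caused by the script result: -20 lowers the priority by 20 while the script fails; a positive 20 would raise it by 20 while the script succeeds
interval 2 # interval between script runs, in seconds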
}
vrrp_instance VI_1 {
state BACKUP
interface eth0 # network interface that carries the VIP
virtual_router_id 51
nopreempt # do not preempt; set only on the main keepalived
priority 200 # priority value
advert_int 5
track_script {
chk_redis
}
virtual_ipaddress {
192.168.18.230 # the VIP
}
notify_master /etc/keepalived/scripts/redis_master.sh
notify_backup /etc/keepalived/scripts/redis_backup.sh
notify_fault /etc/keepalived/scripts/redis_fault.sh
notify_stop /etc/keepalived/scripts/redis_stop.sh
}
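II. Create the redis state-switch scripts on the main node
Create log and scripts directories under /etc/keepalived.
The scripts directory holds five scripts: redis_check.sh checks the redis state, and the other four run when the keepalived state changes. keepalived has four states: master, backup, stop and fault. Since what we care about is the service running on the system, entering fault or stop is treated the same as entering backup, and the redis master/slave relationship has to be reversed. Otherwise the VIP would move while the redis roles stay unchanged, leaving the data inconsistent. So the four scripts end up with only two distinct bodies.
(1) Check script redis_check.sh (the same on both nodes, apart from the redis-cli path)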
#!/bin/bash
###/etc/keepalived/scripts/redis_check.sh
ALIVE=$(/opt/redis/src/redis-cli PING) # to reuse this script, only the redis-cli path needs changing
if [ "$ALIVE" = "PONG" ]; then
    echo "$ALIVE"
    exit 0
else
    echo "$ALIVE"
    exit 1
fi
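Since keepalived runs the check every 2 seconds, a redis-cli call that hangs could stall the check. Below is a minimal sketch of a more defensive variant that wraps PING in the coreutils timeout command. The CHECK_TIMEOUT variable and the check_redis function are my own assumptions, not part of the original script.

```shell
#!/bin/bash
# Sketch of a more defensive redis check.
# REDISCLI and CHECK_TIMEOUT are assumptions; adjust them to your install.
REDISCLI="${REDISCLI:-/opt/redis/src/redis-cli}"
CHECK_TIMEOUT="${CHECK_TIMEOUT:-1}"

# Return 0 only when redis answers PONG within CHECK_TIMEOUT seconds.
check_redis() {
    local reply
    # timeout(1) kills redis-cli if it does not answer in time
    reply=$(timeout "$CHECK_TIMEOUT" "$REDISCLI" PING 2>/dev/null)
    [ "$reply" = "PONG" ]
}

if check_redis; then
    echo "redis check: OK"
else
    echo "redis check: FAILED"
fi
```

In a real redis_check.sh the two echo lines would become exit 0 and exit 1, because keepalived only looks at the exit status of the script.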
(2) Script run when keepalived enters the master state: redis_master.sh
#!/bin/bash
###/etc/keepalived/scripts/redis_master.sh
REDISCLI="/opt/redis/src/redis-cli" # adjust the redis-cli location; if it is already on your PATH, plain "redis-cli" is enough
LOGFILE="/etc/keepalived/log/redis-state.log" # path of the log file
pid=$$
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[slaver]" >> $LOGFILE
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[slaver] Run 'SLAVEOF 192.168.18.137 6379'" >> $LOGFILE # change this to the IP of the slave redis
$REDISCLI SLAVEOF 192.168.18.137 6379 >> $LOGFILE 2>&1 # change this to the IP of the slave redis
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[slaver] wait 10 sec for data sync from old master" >> $LOGFILE
sleep 10
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[slaver] data sync from old master ok..." >> $LOGFILE
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[master] Run slaveof no one,close master/slave" >> $LOGFILE
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[master] wait other slave connect...." >> $LOGFILE
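The fixed sleep 10 above is only a guess at how long synchronization takes. As a rough sketch, the wait could instead poll INFO replication until the replica reports master_link_status:up and master_sync_in_progress:0, giving up after a maximum number of seconds. The wait_for_sync helper is hypothetical and is not part of the original scripts.

```shell
#!/bin/bash
# Sketch: poll the replication status instead of sleeping a fixed time.
# wait_for_sync is a hypothetical helper; REDISCLI is the same path
# variable the scripts in this post use.
REDISCLI="${REDISCLI:-/opt/redis/src/redis-cli}"

# $1 = maximum number of seconds to wait.
# Returns 0 once the local redis reports a live link to its master and
# no sync in progress, 1 if that did not happen within the limit.
wait_for_sync() {
    local max_wait=$1 waited=0 info
    while [ "$waited" -lt "$max_wait" ]; do
        # INFO lines end with \r\n, so strip the carriage returns first
        info=$("$REDISCLI" INFO replication 2>/dev/null | tr -d '\r')
        if echo "$info" | grep -q '^master_link_status:up$' &&
           echo "$info" | grep -q '^master_sync_in_progress:0$'; then
            return 0
        fi
        sleep 1
        waited=$(( waited + 1 ))
    done
    return 1
}
```

In redis_master.sh a call such as wait_for_sync 10 would take the place of sleep 10 between the two SLAVEOF commands. The upper limit matters: if the old master is completely down, the link never comes up and the script still has to go on to SLAVEOF NO ONE.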
(3) Script run when keepalived enters the backup state (the stop and fault scripts are identical to the backup script, so just cp it and change the name)
#!/bin/bash
###/etc/keepalived/scripts/redis_backup.sh
REDISCLI="/opt/redis/src/redis-cli"
LOGFILE="/etc/keepalived/log/redis-state.log"
pid=$$
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[master]" >> $LOGFILE
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[master] Being slave state..." >> $LOGFILE 2>&1
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[master] wait 10 sec for data sync from old master" >> $LOGFILE
sleep 10
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[master] data sync from old master ok..." >> $LOGFILE
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[slaver] Run 'SLAVEOF 192.168.18.137 6379'" >> $LOGFILE # change this to the IP of the slave redis
$REDISCLI SLAVEOF 192.168.18.137 6379 >> $LOGFILE 2>&1
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[slaver] slave connect to 192.168.18.137 ok..." >> $LOGFILE # change this to the IP of the slave redis
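Since the fault and stop scripts are plain copies of redis_backup.sh, creating them can be sketched as below. The function name install_notify_copies is hypothetical; the directory and file names are the ones used in this post.

```shell
#!/bin/bash
# Sketch: derive redis_fault.sh and redis_stop.sh from redis_backup.sh.
# SCRIPT_DIR defaults to the scripts directory used in this post.
SCRIPT_DIR="${SCRIPT_DIR:-/etc/keepalived/scripts}"

# $1 = directory that already contains redis_backup.sh
install_notify_copies() {
    local dir=$1 name
    for name in redis_fault.sh redis_stop.sh; do
        cp "$dir/redis_backup.sh" "$dir/$name" || return 1
    done
    # keepalived must be able to execute every script it is pointed at
    chmod +x "$dir"/redis_*.sh
}

# Only act when the backup script is actually in place.
if [ -f "$SCRIPT_DIR/redis_backup.sh" ]; then
    install_notify_copies "$SCRIPT_DIR"
fi
```

The copies keep the path comment of redis_backup.sh at the top, which is harmless but can be edited by hand.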
III. keepalived configuration file on the slave node
! Configuration File for keepalived
global_defs {
lvs_id LVS_redis
}
vrrp_script chk_redis {
script "/etc/keepalived/scripts/redis_check.sh"
weight -20
interval 2
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 51
priority 190 # must be lower than the main node's priority
advert_int 5
track_script {
chk_redis
}
virtual_ipaddress {
192.168.18.230
}
notify_master /etc/keepalived/scripts/redis_master.sh
notify_backup /etc/keepalived/scripts/redis_backup.sh
notify_fault /etc/keepalived/scripts/redis_fault.sh
notify_stop /etc/keepalived/scripts/redis_stop.sh
}
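IV. redis state-switch scripts on the slave node
Create log and scripts directories under /etc/keepalived.
(1) redis service check script redis_check.sh on the slave node (the one on 136 has the same content)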
#!/bin/bash
###/etc/keepalived/scripts/redis_check.sh
ALIVE=$(/opt/redis-3.2.3/src/redis-cli PING) # only the redis-cli path needs changing; if it is already on your PATH, plain "redis-cli" is enough
if [ "$ALIVE" = "PONG" ]; then
    echo "$ALIVE"
    exit 0
else
    echo "$ALIVE"
    exit 1
fi
(2) Script run when the slave node's keepalived enters the master state: redis_master.sh
#!/bin/bash
###/etc/keepalived/scripts/redis_master.sh
REDISCLI="/opt/redis-3.2.3/src/redis-cli" # only the redis-cli path needs changing
LOGFILE="/etc/keepalived/log/redis-state.log"
pid=$$
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[backup]" >> $LOGFILE
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[backup] Run 'SLAVEOF 192.168.18.136 6379'" >> $LOGFILE # on the slave node, use the IP of the main redis
$REDISCLI SLAVEOF 192.168.18.136 6379 >> $LOGFILE 2>&1 # on the slave node, use the IP of the main redis
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[backup] wait 10 sec for data sync from old master" >> $LOGFILE
sleep 10
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[master] data sync from old master ok..." >> $LOGFILE
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[master] Run slaveof no one,close master/slave" >> $LOGFILE
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[master] wait other slave connect...." >> $LOGFILE
(3) Script run when the slave node's keepalived enters backup/stop/fault. The contents are identical, so only redis_backup.sh is shown
#!/bin/bash
###/etc/keepalived/scripts/redis_backup.sh
REDISCLI="/opt/redis-3.2.3/src/redis-cli"
LOGFILE="/etc/keepalived/log/redis-state.log"
pid=$$
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[master] Being slave state..." >> $LOGFILE 2>&1
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[master] wait 15 sec for data sync from old master" >> $LOGFILE
sleep 15
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[master] data sync from old master ok..." >> $LOGFILE
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[slaver] Run 'SLAVEOF 192.168.18.136 6379'" >> $LOGFILE
$REDISCLI SLAVEOF 192.168.18.136 6379 >> $LOGFILE 2>&1
echo "`date +'%Y-%m-%d:%H:%M:%S'`|$pid|state:[slaver] slave connect to 192.168.18.136 ok..." >> $LOGFILE
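V. Running the experiment
Because nopreempt is set, the order in which keepalived is started matters. We want 136 to be both the redis master and the keepalived master (even though both configuration files say backup). Since nopreempt is set on that node, the keepalived on the redis master node must be started first. With nopreempt, keepalived always enters the backup state right after starting, and the scripts make a node that enters backup connect to the other node and synchronize data from it. So there is one more precondition before starting keepalived: the data on the redis master and slave must already be identical. If the keepalived on the redis master node is then started first, the redis master does connect to the redis slave to synchronize, but because both sides hold the same data at that point, nothing goes wrong.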
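The precondition above can be checked roughly before keepalived is started. The sketch below is only an approximation: the helper names are hypothetical, port 6379 is the one used in the scripts, and an equal DBSIZE on both sides is a crude stand-in for "the data is identical", not a proof of it.

```shell
#!/bin/bash
# Sketch: pre-flight check to run before starting keepalived.
# All function names here are hypothetical helpers.
REDISCLI="${REDISCLI:-/opt/redis/src/redis-cli}"

# Print the replication role ("master" or "slave") of the redis at host $1.
redis_role() {
    "$REDISCLI" -h "$1" -p 6379 INFO replication 2>/dev/null |
        tr -d '\r' | sed -n 's/^role://p'
}

# Print the number of keys in the default database of the redis at host $1.
redis_dbsize() {
    "$REDISCLI" -h "$1" -p 6379 DBSIZE 2>/dev/null | tr -d '\r'
}

# Return 0 when $1 is a master, $2 is a slave and both report the same key count.
preflight_ok() {
    [ "$(redis_role "$1")" = "master" ] &&
    [ "$(redis_role "$2")" = "slave" ] &&
    [ "$(redis_dbsize "$1")" = "$(redis_dbsize "$2")" ]
}
```

For the two nodes in this post the call would be preflight_ok 192.168.18.136 192.168.18.137. Only when it succeeds would I start keepalived, on 136 first and then on 137.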
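1. Simulation one: start the redis instances in order, then check sentinel and redis.log to see whether the data is synchronized.
The redis log shows that it has already synchronized with 137.
Now shut down the main redis: the old master has become a slave, and the VIP has floated over to 137.
2. Simulation two: stop keepalived while redis keeps running normally. The VIP also moves automatically and the master is turned into a slave.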
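To confirm on each node where the VIP ended up after either simulation, a small check like the one below can help. It is a sketch: has_vip is a hypothetical helper, and the VIP and interface are the ones from the configuration files in this post.

```shell
#!/bin/bash
# Sketch: report whether this node currently holds the VIP.
VIP="192.168.18.230"
IFACE="eth0"

# Read `ip -4 addr show` output on stdin; return 0 if it lists address $1.
has_vip() {
    grep -qF "inet $1/"
}

# Only run the live check when the ip command is available.
if command -v ip >/dev/null 2>&1; then
    if ip -4 addr show "$IFACE" 2>/dev/null | has_vip "$VIP"; then
        echo "this node holds the VIP"
    else
        echo "this node does not hold the VIP"
    fi
fi
```

Running redis-cli INFO replication on the same node then shows whether its redis role matches: the node holding the VIP should report role:master.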
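Summary: as shown above, whether keepalived or redis goes down, the scripts run automatically and switch the roles, so the switchover is seamless. Further switch-timing and timeout parameters are left for you to tune. You can also simulate other failure scenarios yourself. If you run into problems during the experiment, leave a comment below and I will get back to you.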