├── README.md ├── README_Chinese.md ├── gr_mw_mode_sw_cheker.sh ├── gr_sw_mode_checker.sh └── proxysql_groupreplication_checker.sh /README.md: -------------------------------------------------------------------------------- 1 | # Introduction 2 | 3 | According to the blog [HA with MySQL Group Replication and ProxySQL](http://lefred.be/content/ha-with-mysql-group-replication-and-proxysql/), we can use ProxySQL to make HA with MySQL Group Replication(MGR), and even we can realize Read-Write split above it. This project provides several shell scripts to be used in ProxySQL scheduler in order to meet the function we memtion. 4 | 5 | ## proxysql_groupreplication_checker.sh 6 | 7 | This script is an example of scheduler that can be used with ProxySQL to monitor MySQL Group Replication members 8 | 9 | Modify from : [https://github.com/lefred/proxysql_groupreplication_checker](https://github.com/lefred/proxysql_groupreplication_checker) 10 | 11 | ### Features and Limitations 12 | 13 | #### Features 14 | 15 | - Read-Write split 16 | - Multi Write node 17 | - Automatic switch over when write node fail 18 | 19 | #### Limitations 20 | 21 | - MGR must run in multi-primary mode 22 | 23 | In general, we just need one write node and other node to be read, and we need to adapt MGR two mode: Multi-Primary Mode and Single-Primary Mode, so the scripts below come into the world. 24 | 25 | ## gr_mw_mode_sw_cheker.sh 26 | 27 | This script is using for monitoring MySQL Group Replication in **Multi-Primary Mode**, and we limit **there is only one node to be write node at a time.** 28 | 29 | ### Features and Limitations 30 | 31 | #### Features 32 | 33 | - Read-Write split 34 | - Automatic switch over when single write node fail 35 | 36 | #### Limitations 37 | 38 | - MGR must run in multi-primary Mode 39 | - Only one node to be write node at a time 40 | 41 | ### Configuration 42 | 43 | Assume that we have one MGR group with 3 node deploy in multi-primary mode: 44 | 45 | ```sql 46 | mysql> SELECT * FROM performance_schema.replication_group_members; 47 | +---------------------------+--------------------------------------+-------------+-------------+--------------+ 48 | | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | 49 | +---------------------------+--------------------------------------+-------------+-------------+--------------+ 50 | | group_replication_applier | 4a48f63a-d47b-11e6-a16f-a434d920fb4d | CrazyPig-PC | 24801 | ONLINE | 51 | | group_replication_applier | 592b4ea5-d47b-11e6-a3cd-a434d920fb4d | CrazyPig-PC | 24802 | ONLINE | 52 | | group_replication_applier | 6610aa92-d47b-11e6-a60e-a434d920fb4d | CrazyPig-PC | 24803 | ONLINE | 53 | +---------------------------+--------------------------------------+-------------+-------------+--------------+ 54 | 3 rows in set (0.00 sec) 55 | ``` 56 | 57 | and we have deployed ProxySQL(version 1.3.2) correctly on one same machine. 58 | 59 | Now we need to config MGR and ProxySQL to act together : 60 | 61 | **1) let ProxySQL have the privilege to connect and SELECT something from MGR members** 62 | 63 | In all MGR members, execute: 64 | 65 | ```sql 66 | set SQL_LOG_BIN=0; 67 | grant SELECT on *.* to 'proxysql'@'%' identified by 'proxysql'; 68 | flush privileges; 69 | set SET SQL_LOG_BIN=1; 70 | ``` 71 | 72 | **2) create custom function and view in MGR members** 73 | 74 | According to [https://github.com/lefred/mysql_gr_routing_check/blob/master/addition_to_sys.sql](https://github.com/lefred/mysql_gr_routing_check/blob/master/addition_to_sys.sql), execute the sql in all of the MGR members. 75 | 76 | > Tip: we will use the custom function and view in the shell script. 77 | 78 | **3) config proxysql** 79 | 80 | Add MGR members to proyxsql `mysql_servers` table: 81 | 82 | ```sql 83 | insert into mysql_servers (hostgroup_id, hostname, port) values(1, '127.0.0.1', 24801); 84 | insert into mysql_servers (hostgroup_id, hostname, port) values(2, '127.0.0.1', 24801); 85 | insert into mysql_servers (hostgroup_id, hostname, port) values(2, '127.0.0.1', 24802); 86 | insert into mysql_servers (hostgroup_id, hostname, port) values(2, '127.0.0.1', 24803); 87 | ``` 88 | 89 | `hostgroup_id = 1` represent the write group, and we have only one write node at a time, `hostgroup_id = 2` represent the read group and it includes all the MGR members. 90 | 91 | It’s time to add some routing rules to be able to use those hostgroups: 92 | 93 | ```sql 94 | insert into mysql_query_rules (active, match_pattern, destination_hostgroup, apply) 95 | values (1,"^SELECT",2,1); 96 | ``` 97 | 98 | We will route all queries starting by select to hostgroup which hostgroup_id is 2. 99 | 100 | > This is not a recommendation of course a we will also send to hostgroup 2 all SELECT… FOR UPDATE, for example 101 | 102 | And then we need to change the default proxysql monitor user and password we create in **step 1)** 103 | 104 | ```sql 105 | UPDATE global_variables SET variable_value='proxysql' WHERE variable_name='mysql-monitor_username'; 106 | UPDATE global_variables SET variable_value='proxysql' WHERE variable_name='mysql-monitor_password'; 107 | ``` 108 | 109 | Finally we can load `global_variables`, `mysql_servers`, `mysql_query_rules` to runtime and even to disk: 110 | 111 | ```sql 112 | LOAD MYSQL VARIABLES TO RUNTIME; 113 | SAVE MYSQL VARIABLES TO DISK; 114 | LOAD MYSQL SERVERS TO RUNTIME; 115 | SAVE MYSQL SERVERS TO DISK; 116 | LOAD MYSQL QUERY RULES TO RUNTIME; 117 | SAVE MYSQL QUERY RULES TO DISK; 118 | ``` 119 | 120 | **4) config scheduler** 121 | 122 | First, put the script `gr_mw_mode_sw_cheker.sh` to path : `/var/lib/proxysql/` 123 | 124 | Finally, we just need to config our script into proxysql scheduler and load it to runtime and even save to disk: 125 | 126 | ```sql 127 | insert into scheduler(id, active, interval_ms, filename, arg1, arg2, arg3, arg4) 128 | values(1, 1, 5000, '/var/lib/proxysql/gr_mw_mode_sw_checker.sh', 1, 2, 1, '/var/lib/proxysql/checker.log'); 129 | 130 | LOAD SCHEDULER TO RUNTIME; 131 | SAVE SCHEDULER TO DISK; 132 | ``` 133 | 134 | - *active* : 1: enable scheduler to schedule the script we provide 135 | - *interval_ms* : invoke one by one in cycle (eg: 5000(ms) = 5s represent every 5s invoke the script) 136 | - *filename*: represent the script file path 137 | - *arg1~arg4*: represent the input parameters the script received 138 | 139 | The script Usage: 140 | 141 | ```bash 142 | gr_mw_mode_sw_cheker.sh writehostgroup_id readhostgroup_id [writeNodeCanRead] [log file] 143 | ``` 144 | 145 | So : 146 | 147 | - *arg1* -> writehostgroup_id 148 | - *arg2* -> readhostgroup_id 149 | - *arg3* -> writeNodeCanRead, 1(YES, the default value), 0(NO) 150 | - *arg4* -> log file, default: `'./checker.log'` 151 | 152 | ## gr_sw_mode_checker.sh 153 | 154 | This script is using for monitoring MySQL Group Replication in **Single-Primary Mode**, so the limit is also : **there is only one node to be write node at a time.** 155 | 156 | ### Features and Limitations 157 | 158 | #### Features 159 | 160 | - Read-Write split 161 | - Switch over automatic when single write node failure 162 | 163 | #### Limitations 164 | 165 | - MGR(MySQL Group Replication) run in single-primary Mode 166 | - Only one node to be write node at a time 167 | 168 | ### Configuration 169 | 170 | the same configuration step as [gr_mw_mode_sw_cheker.sh](https://github.com/ZzzCrazyPig/proxysql_groupreplication_checker#gr_mw_mode_sw_cheker.sh), just in **step 4)**, replace the script with `gr_sw_mode_cheker.sh` 171 | -------------------------------------------------------------------------------- /README_Chinese.md: -------------------------------------------------------------------------------- 1 | # 介绍 2 | 3 | 根据博客[HA with MySQL Group Replication and ProxySQL](http://lefred.be/content/ha-with-mysql-group-replication-and-proxysql/)的介绍,利用ProxySQL可以为MySQL Group Replication(后面简称MGR) 提供高可用及读写分离方案。这个项目里面主要提供了几个可以被ProxySQL scheduler调度,用于监控MGR成员状态,实现MGR高可用,读写分离且写故障切换功能的脚本。 4 | 5 | ## proxysql_groupreplication_checker.sh 6 | 7 | 脚本修改自 : [https://github.com/lefred/proxysql_groupreplication_checker](https://github.com/lefred/proxysql_groupreplication_checker),提供了MGR Multi-Primary 模式下读写分离功能,以及写节点故障切换功能。 8 | 9 | ### 特性和限制 10 | 11 | #### 特性 12 | 13 | - 读写分离 14 | - 同一时刻可以有多个写节点 15 | - 写节点故障切换 16 | 17 | #### 限制 18 | 19 | - MGR只能是Multi-Primary Mode。 20 | 21 | 22 | 目前,MGR多写特性仍然不够完善,在多个节点并发写很可能出现写冲突,因此,在实际应用中,我们的场景是: 只有一个节点用于写,其他节点用于读。 23 | 24 | 同时我们要支持MGR的两种模式: Multi-Primary Mode和Single-Primary Mode 25 | 26 | 于是,就诞生了接下来的两个脚本: 27 | 28 | > Tip: 后面脚本的设计灵感来自这个示例脚本 29 | 30 | ## gr_mw_mode_sw_cheker.sh 31 | 32 | 实现Multi-Primary Mode下的MySQL Group Replication高可用和读写分离的脚本,限制只能有一个节点用于写。 33 | 34 | ### 特性和限制 35 | 36 | #### 特性 37 | 38 | - 读写分离 39 | - 写节点故障自动切换 40 | 41 | #### 限制 42 | 43 | - MGR必须部署于Multi-Primary Mode 44 | - 只有一个节点用于写 45 | 46 | ### 配置方式 47 | 48 | 假设我们已经部署了一个3节点组成的MGR集群,模式为Multi-Primary: 49 | 50 | ```sql 51 | mysql> SELECT * FROM performance_schema.replication_group_members; 52 | +---------------------------+--------------------------------------+-------------+-------------+--------------+ 53 | | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | 54 | +---------------------------+--------------------------------------+-------------+-------------+--------------+ 55 | | group_replication_applier | 4a48f63a-d47b-11e6-a16f-a434d920fb4d | CrazyPig-PC | 24801 | ONLINE | 56 | | group_replication_applier | 592b4ea5-d47b-11e6-a3cd-a434d920fb4d | CrazyPig-PC | 24802 | ONLINE | 57 | | group_replication_applier | 6610aa92-d47b-11e6-a60e-a434d920fb4d | CrazyPig-PC | 24803 | ONLINE | 58 | +---------------------------+--------------------------------------+-------------+-------------+--------------+ 59 | 3 rows in set (0.00 sec) 60 | ``` 61 | 62 | 并且,已经成功的在同一台机器上部署了ProxySQL(version 1.3.2),并成功运行。 63 | 64 | 现在我们需要配置MGR和ProxySQL,让它们能够共同协作,实现高可用和读写分离: 65 | 66 | **1) MGR节点创建ProxySQL能够连接和执行SELECT操作的用户** 67 | 68 | 在所有的MGR节点上,执行: 69 | 70 | ```sql 71 | set SQL_LOG_BIN=0; 72 | grant SELECT on *.* to 'proxysql'@'%' identified by 'proxysql'; 73 | flush privileges; 74 | set SET SQL_LOG_BIN=1; 75 | ``` 76 | 77 | **2) MGR节点创建定制的自定义函数和视图** 78 | 79 | 这些自定义函数和视图会被脚本所使用,提供MGR节点状态的相关指标。在所有的MGR节点上执行[https://github.com/lefred/mysql_gr_routing_check/blob/master/addition_to_sys.sql](https://github.com/lefred/mysql_gr_routing_check/blob/master/addition_to_sys.sql)提供的脚本。 80 | 81 | **3) 配置proxysql** 82 | 83 | 将MGR节点信息写入proxysql `mysql_servers`表: 84 | 85 | ```sql 86 | insert into mysql_servers (hostgroup_id, hostname, port) values(1, '127.0.0.1', 24801); 87 | insert into mysql_servers (hostgroup_id, hostname, port) values(2, '127.0.0.1', 24801); 88 | insert into mysql_servers (hostgroup_id, hostname, port) values(2, '127.0.0.1', 24802); 89 | insert into mysql_servers (hostgroup_id, hostname, port) values(2, '127.0.0.1', 24803); 90 | ``` 91 | 92 | `hostgroup_id = 1`代表write group,针对我们提出的限制,这个地方只配置了一个节点;`hostgroup_id = 2`代表read group,包含了MGR的所有节点。 93 | 94 | 然后,加入读写分离规则,让所有的SELECT操作路由到hostgroup_id为2的hostgroup: 95 | 96 | ```sql 97 | insert into mysql_query_rules (active, match_pattern, destination_hostgroup, apply) 98 | values (1,"^SELECT",2,1); 99 | ``` 100 | 101 | > 这么做会引起所有的SELECT .. FOR UPDATE也发到hostgroup_id为2的hostgroup,更细粒度的正则表达式能够避免这个情况,这里不讨论这个细粒度的实现。 102 | 103 | 接下来我们需要修改proxysql的监控用户和密码为我们上面 **step 1)** 提供的用户和密码。 104 | 105 | ```sql 106 | UPDATE global_variables SET variable_value='proxysql' WHERE variable_name='mysql-monitor_username'; 107 | UPDATE global_variables SET variable_value='proxysql' WHERE variable_name='mysql-monitor_password'; 108 | ``` 109 | 110 | 最后我们需要将`global_variables`,`mysql_servers`和`mysql_query_rules`表的信息加载到RUNTIME,更进一步加载到DISK: 111 | 112 | ```sql 113 | LOAD MYSQL VARIABLES TO RUNTIME; 114 | SAVE MYSQL VARIABLES TO DISK; 115 | LOAD MYSQL SERVERS TO RUNTIME; 116 | SAVE MYSQL SERVERS TO DISK; 117 | LOAD MYSQL QUERY RULES TO RUNTIME; 118 | SAVE MYSQL QUERY RULES TO DISK; 119 | ``` 120 | 121 | **4) 配置scheduler** 122 | 123 | 首先,将我们提供的脚本`gr_mw_mode_sw_cheker.sh`放到目录`/var/lib/proxysql/`下 124 | 125 | 最后,我们在proxysql的scheduler表里面加载如下记录,然后加载到RUNTIME使其生效,同时还可以持久化到磁盘: 126 | 127 | ```sql 128 | insert into scheduler(id, active, interval_ms, filename, arg1, arg2, arg3, arg4) 129 | values(1, 1, 5000, '/var/lib/proxysql/gr_mw_mode_sw_checker.sh', 1, 2, 1, '/var/lib/proxysql/checker.log'); 130 | 131 | LOAD SCHEDULER TO RUNTIME; 132 | SAVE SCHEDULER TO DISK; 133 | ``` 134 | 135 | - *active* : 1: 表示使脚本调度生效 136 | - *interval_ms* : 每隔多长时间执行一次脚本 (eg: 5000(ms) = 5s 表示每隔5s脚本被调用一次) 137 | - *filename*: 指定脚本所在的具体路径,如上面的`/var/lib/proxysql/checker.log` 138 | - *arg1~arg4*: 指定传递给脚本的参数 139 | 140 | 脚本及对应的参数说明如下: 141 | 142 | ```bash 143 | gr_mw_mode_sw_cheker.sh writehostgroup_id readhostgroup_id [writeNodeCanRead] [log file] 144 | ``` 145 | 146 | - *arg1* -> 指定writehostgroup_id 147 | - *arg2* -> 指定readhostgroup_id 148 | - *arg3* -> 写节点是否可以用于读, 1(YES, the default value), 0(NO) 149 | - *arg4* -> log file, default: `'./checker.log'` 150 | 151 | ## gr_sw_mode_checker.sh 152 | 153 | 实现Single-Primary Mode下的MySQL Group Replication高可用和读写分离的脚本,限制只能有一个节点用于写。 154 | 155 | ### 特性与限制 156 | 157 | #### 特性 158 | 159 | - 读写分离 160 | - 写节点故障自动切换 161 | 162 | > 当写节点故障,脚本会寻找下一个真正的写节点,进行故障切换,这个过程比Multi-Primary Mode要复杂些。 163 | 164 | #### 限制 165 | 166 | - MGR必须部署在Single-Primary模式 167 | - 同一时间只有一个节点用于写 168 | 169 | ### 配置方式 170 | 171 | 与上一个脚本`gr_mw_mode_sw_cheker.sh`的部署方式基本一致,只是在 **step 4)** 中将脚本替换为`gr_sw_mode_checker.sh` 172 | -------------------------------------------------------------------------------- /gr_mw_mode_sw_cheker.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # 3 | # author: CrazyPig 4 | # date: 2017-01-07 5 | # version: 1.0 6 | 7 | function usage() { 8 | echo "Usage: $0 [write node can be read : 1(YES: default) or 0(NO)] [log_file]" 9 | exit 0 10 | } 11 | 12 | if [ "$1" = '-h' -o "$1" = '--help' ] || [ -z "$1" -o -z "$2" ]; then 13 | usage 14 | fi 15 | 16 | # receive input arg 17 | writeGroupId="${1:-1}" 18 | readGroupId="${2:-2}" 19 | writeNodeCanRead="${3:-1}" 20 | errFile="${4:-"./checker.log"}" 21 | 22 | # variable define 23 | proxysql_user="admin" 24 | proxysql_password="admin" 25 | proxysql_host="127.0.0.1" 26 | proxysql_port="6032" 27 | 28 | switchOver=0 29 | timeout=3 30 | 31 | # enable(1) debug info or not(0) 32 | debug=1 33 | function debug() { 34 | local appendToFile="${2:-0}" 35 | local msg="[`date \"+%Y-%m-%d %H:%M:%S\"`] $1" 36 | if [ $debug -eq 1 ]; then 37 | echo $msg 38 | fi 39 | if [ $appendToFile -eq 1 ]; then 40 | echo $msg >> $errFile 41 | fi 42 | } 43 | 44 | debug "writeGroupId : $writeGroupId, readGroupId : $readGroupId, writeNodeCanRead : $writeNodeCanRead, errFile : $errFile" 45 | proxysql_cmd="mysql -u$proxysql_user -p$proxysql_password -h$proxysql_host -P$proxysql_port -Nse" 46 | debug "proxysql_cmd : $proxysql_cmd" 47 | mysql_credentials=$($proxysql_cmd "SELECT variable_value FROM global_variables WHERE variable_name IN ('mysql-monitor_username','mysql-monitor_password') ORDER BY variable_name DESC") 48 | mysql_user=$(echo $mysql_credentials | awk '{print $1}') 49 | mysql_password=$(echo $mysql_credentials | awk '{print $2}') 50 | debug "mysql_user : $mysql_user, mysql_password : $mysql_password" 51 | mysql_cmd="mysql -u$mysql_user -p$mysql_password" 52 | debug "mysql_cmd : $mysql_cmd" 53 | 54 | update_servers_cmd_opts="LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;" 55 | 56 | # check node status is OK or not 57 | function isNodeOk() { 58 | local gr_status=$1 59 | local trx_behind=$2 60 | debug "gr_status : $gr_status, trx_behind : $trx_behind" 61 | if [ "$gr_status" == "YES" ]; then 62 | return 1 63 | elif [ "$gr_status" == "" -o "$gr_status" == "NO" ]; then 64 | return 0 65 | fi 66 | } 67 | 68 | # main logic 69 | 70 | # find current write node 71 | read cwn_hostname cwn_port <<< $($proxysql_cmd "SELECT hostname,port FROM mysql_servers WHERE hostgroup_id = $writeGroupId LIMIT 1;") 72 | debug "current write node hostname : ${cwn_hostname}, port : ${cwn_port}" 73 | # for every read node in read hostgroup 74 | output=$($proxysql_cmd "SELECT hostgroup_id, hostname, port, status FROM mysql_servers WHERE hostgroup_id = $readGroupId;" 2>> $errFile) 75 | while read hostgroup_id hostname port status 76 | do 77 | # check node is ok 78 | read gr_status trx_behind <<< $(timeout $timeout $mysql_cmd -h$hostname -P$port -Nse "SELECT viable_candidate, transactions_behind FROM sys.gr_member_routing_candidate_status" 2>>$errFile | tail -1 2>>$errFile) 79 | isNodeOk $gr_status $trx_behind 80 | isOK=$? 81 | debug "node [hostgroup_id: $hostgroup_id, hostname: $hostname, port: $port, status: $status, isOK: $isOK ]" 82 | # node is current write node 83 | if [ "$hostname" == "$cwn_hostname" -a "$port" == "$cwn_port" ]; then 84 | debug "node is the current write node" 85 | if [ $isOK -eq 0 ]; then 86 | # need to find new write node 87 | switchOver=1 88 | debug "current write node [hostgroup_id: $hostgroup_id, hostname: $hostname, port: $port, isOK: $isOK] is not OK, we need to do switch over" 1 89 | $proxysql_cmd "UPDATE mysql_servers SET status = 'OFFLINE_SOFT' WHERE hostgroup_id = $readGroupId AND hostname = '$hostname' AND port = $port; ${update_servers_cmd_opts}" 2>> $errFile 90 | $proxysql_cmd "UPDATE mysql_servers SET status = 'OFFLINE_HARD' WHERE hostgroup_id = $writeGroupId AND hostname = '$cwn_hostname' AND port = $cwn_port; ${update_servers_cmd_opts}" 2>> $errFile 91 | else # isOK = 1 92 | if [ $writeNodeCanRead -eq 0 ]; then # write node can not be read 93 | debug "isOK : $isOK, write node can not be read, will update node status to OFFLINE_SOFT [hostgroup_id: $readGroupId, hostname: $hostname, port: $port]" 94 | $proxysql_cmd "UPDATE mysql_servers SET status = 'OFFLINE_SOFT' WHERE hostgroup_id = $readGroupId AND hostname = '$hostname' AND port = $port; ${update_servers_cmd_opts}" 2>> $errFile 95 | else 96 | if [ "$status" != "ONLINE" ]; then 97 | debug "isOK : $isOK, write node can be read, will update node status to ONLINE [hostgroup_id: $readGroupId, hostname: $hostname, port: $port]" 98 | $proxysql_cmd "UPDATE mysql_servers SET status = 'ONLINE' WHERE hostgroup_id = $writeGroupId AND hostname = '$hostname' AND port = $port; ${update_servers_cmd_opts}" 2>> $errFile 99 | $proxysql_cmd "UPDATE mysql_servers SET status = 'ONLINE' WHERE hostgroup_id = $readGroupId AND hostname = '$hostname' AND port = $port; ${update_servers_cmd_opts}" 2>> $errFile 100 | fi 101 | fi 102 | fi 103 | # node is not current write node and status is not OK 104 | elif [ $isOK -eq 0 ]; then 105 | debug "read node [hostgroup_id: $hostgroup_id, hostname: $hostname, port: $port, isOK: $isOK] is not OK, we will set it's status to be 'OFFLINE_SOFT'" 1 106 | $proxysql_cmd "UPDATE mysql_servers SET status = 'OFFLINE_SOFT' WHERE hostgroup_id = $readGroupId AND hostname = '$hostname' AND port = $port; ${update_servers_cmd_opts}" 2>> $errFile 107 | elif [ $isOK -eq 1 -a "$status" == "OFFLINE_SOFT" ]; then 108 | debug "read node [hostgroup_id: $hostgroup_id, hostname: $hostname, port: $port, isOK: $isOK] is OK, but is's status is 'OFFLINE_SOFT', we will update it to be 'ONLINE' 1" 109 | $proxysql_cmd "UPDATE mysql_servers SET status = 'ONLINE' WHERE hostgroup_id = $readGroupId AND hostname = '$hostname' AND port = $port; ${update_servers_cmd_opts}" 2>> $errFile 110 | fi 111 | done <> $errFile) 118 | if [ "$nwn_hostname" != "" ]; then 119 | debug "find new write node from read hostgroup, [hostgroup_id: $readGroupId, hostname : $nwn_hostname, port : $nwn_port]" 1 120 | $proxysql_cmd "UPDATE mysql_servers SET hostname = '$nwn_hostname', port = $nwn_port, status = 'ONLINE' WHERE hostgroup_id = $writeGroupId; ${update_servers_cmd_opts}" 2>> $errFile 121 | if [ $writeNodeCanRead -eq 0 ]; then 122 | $proxysql_cmd "UPDATE mysql_servers SET status = 'OFFLINE_SOFT' WHERE hostgroup_id = $readGroupId AND hostname = '$nwn_hostname' AND port = $nwn_port; ${update_servers_cmd_opts}" 2>> $errFile 123 | fi 124 | else 125 | debug "from read hostgroup, we can not find new node to be write node" 1 126 | fi 127 | fi 128 | -------------------------------------------------------------------------------- /gr_sw_mode_checker.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # 3 | # author: CrazyPig 4 | # date: 2017-01-08 5 | # version: 1.0 6 | 7 | function usage() { 8 | echo "Usage: $0 [write node can be read : 1(YES: default) or 0(NO)] [log_file]" 9 | exit 0 10 | } 11 | 12 | if [ "$1" = '-h' -o "$1" = '--help' ] || [ -z "$1" -o -z "$2" ]; then 13 | usage 14 | fi 15 | 16 | # receive input arg 17 | writeGroupId="${1:-1}" 18 | readGroupId="${2:-2}" 19 | writeNodeCanRead="${3:-1}" 20 | errFile="${4:-"./checker.log"}" 21 | 22 | # variable define 23 | proxysql_user="admin" 24 | proxysql_password="admin" 25 | proxysql_host="127.0.0.1" 26 | proxysql_port="6032" 27 | 28 | switchOver=0 29 | timeout=3 30 | 31 | # enable(1) debug info or not(0) 32 | debug=1 33 | function debug() { 34 | local appendToFile="${2:-0}" 35 | local msg="[`date \"+%Y-%m-%d %H:%M:%S\"`] $1" 36 | if [ $debug -eq 1 ]; then 37 | echo $msg 38 | fi 39 | if [ $appendToFile -eq 1 ]; then 40 | echo $msg >> $errFile 41 | fi 42 | } 43 | 44 | debug "writeGroupId : $writeGroupId, readGroupId : $readGroupId, writeNodeCanRead : $writeNodeCanRead, errFile : $errFile" 45 | proxysql_cmd="mysql -u$proxysql_user -p$proxysql_password -h$proxysql_host -P$proxysql_port -Nse" 46 | debug "proxysql_cmd : $proxysql_cmd" 47 | mysql_credentials=$($proxysql_cmd "SELECT variable_value FROM global_variables WHERE variable_name IN ('mysql-monitor_username','mysql-monitor_password') ORDER BY variable_name DESC") 48 | mysql_user=$(echo $mysql_credentials | awk '{print $1}') 49 | mysql_password=$(echo $mysql_credentials | awk '{print $2}') 50 | debug "mysql_user : $mysql_user, mysql_password : $mysql_password" 51 | mysql_cmd="mysql -u$mysql_user -p$mysql_password" 52 | debug "mysql_cmd : $mysql_cmd" 53 | 54 | update_servers_cmd_opts="LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;" 55 | 56 | # check node status is OK or not 57 | function isNodeOk() { 58 | local gr_status=$1 59 | local trx_behind=$2 60 | debug "gr_status : $gr_status, trx_behind : $trx_behind" 61 | if [ "$gr_status" == "YES" ]; then 62 | return 1 63 | elif [ "$gr_status" == "" -o "$gr_status" == "NO" ]; then 64 | return 0 65 | fi 66 | } 67 | 68 | # main logic 69 | 70 | # find current write node 71 | read cwn_hostname cwn_port <<< $($proxysql_cmd "SELECT hostname,port FROM mysql_servers WHERE hostgroup_id = $writeGroupId LIMIT 1;") 72 | debug "current write node hostname : ${cwn_hostname}, port : ${cwn_port}" 73 | # for every read node in read hostgroup 74 | output=$($proxysql_cmd "SELECT hostgroup_id, hostname, port, status FROM mysql_servers WHERE hostgroup_id = $readGroupId;" 2>> $errFile) 75 | while read hostgroup_id hostname port status 76 | do 77 | # check node is ok 78 | read gr_status trx_behind <<< $(timeout $timeout $mysql_cmd -h$hostname -P$port -Nse "SELECT viable_candidate, transactions_behind FROM sys.gr_member_routing_candidate_status" 2>>$errFile | tail -1 2>>$errFile) 79 | isNodeOk $gr_status $trx_behind 80 | isOK=$? 81 | debug "node [hostgroup_id: $hostgroup_id, hostname: $hostname, port: $port, status: $status, isOK: $isOK ]" 82 | # node is current write node 83 | if [ "$hostname" == "$cwn_hostname" -a "$port" == "$cwn_port" ]; then 84 | debug "node is the current write node" 85 | if [ $isOK -eq 0 ]; then 86 | # need to find new write node 87 | switchOver=1 88 | debug "current write node [hostgroup_id: $hostgroup_id, hostname: $hostname, port: $port, isOK: $isOK] is not OK, we need to do switch over" 1 89 | $proxysql_cmd "UPDATE mysql_servers SET status = 'OFFLINE_SOFT' WHERE hostgroup_id = $readGroupId AND hostname = '$hostname' AND port = $port; ${update_servers_cmd_opts}" 2>> $errFile 90 | $proxysql_cmd "UPDATE mysql_servers SET status = 'OFFLINE_HARD' WHERE hostgroup_id = $writeGroupId AND hostname = '$cwn_hostname' AND port = $cwn_port; ${update_servers_cmd_opts}" 2>> $errFile 91 | else # isOK = 1 92 | read isPrimaryNode <<< $(timeout $timeout $mysql_cmd -h$hostname -P$port -Nse "SELECT IF((SELECT @@server_uuid) = (SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME='group_replication_primary_member'), 1, 0);" 2>> $errFile) 93 | if [ $isPrimaryNode -eq 0 ]; then 94 | debug "current write node [hostgroup_id: $hostgroup_id, hostname: $hostname, port: $port, isOK: $isOK] is no longer the write node, we need to do switch over" 1 95 | $proxysql_cmd "UPDATE mysql_servers SET status = 'OFFLINE_HARD' WHERE hostgroup_id = $writeGroupId AND hostname = '$hostname' AND port = '$port'; ${update_servers_cmd_opts}" 2>> $errFile 96 | switchOver=1 97 | if [ "$status" != "ONLINE" ]; then 98 | debug "isOK : $isOK, write node can be read, will update node status to ONLINE [hostgroup_id: $readGroupId, hostname: $hostname, port: $port]" 99 | $proxysql_cmd "UPDATE mysql_servers SET status = 'ONLINE' WHERE hostgroup_id = $readGroupId AND hostname = '$hostname' AND port = $port; ${update_servers_cmd_opts}" 2>> $errFile 100 | fi 101 | continue 102 | fi 103 | if [ $writeNodeCanRead -eq 0 ]; then # write node can not be read 104 | debug "isOK : $isOK, write node can not be read, will update node status to OFFLINE_SOFT [hostgroup_id: $readGroupId, hostname: $hostname, port: $port]" 105 | $proxysql_cmd "UPDATE mysql_servers SET status = 'OFFLINE_SOFT' WHERE hostgroup_id = $readGroupId AND hostname = '$hostname' AND port = $port; ${update_servers_cmd_opts}" 2>> $errFile 106 | else 107 | if [ "$status" != "ONLINE" ]; then 108 | debug "isOK : $isOK, write node can be read, will update node status to ONLINE [hostgroup_id: $readGroupId, hostname: $hostname, port: $port]" 109 | $proxysql_cmd "UPDATE mysql_servers SET status = 'ONLINE' WHERE hostgroup_id = $writeGroupId AND hostname = '$hostname' AND port = $port; ${update_servers_cmd_opts}" 2>> $errFile 110 | $proxysql_cmd "UPDATE mysql_servers SET status = 'ONLINE' WHERE hostgroup_id = $readGroupId AND hostname = '$hostname' AND port = $port; ${update_servers_cmd_opts}" 2>> $errFile 111 | fi 112 | fi 113 | fi 114 | # node is not current write node and status is not OK 115 | elif [ $isOK -eq 0 ]; then 116 | debug "read node [hostgroup_id: $hostgroup_id, hostname: $hostname, port: $port, isOK: $isOK] is not OK, we will set it's status to be 'OFFLINE_SOFT'" 1 117 | $proxysql_cmd "UPDATE mysql_servers SET status = 'OFFLINE_SOFT' WHERE hostgroup_id = $readGroupId AND hostname = '$hostname' AND port = $port; ${update_servers_cmd_opts}" 2>> $errFile 118 | elif [ $isOK -eq 1 -a "$status" == "OFFLINE_SOFT" ]; then 119 | debug "read node [hostgroup_id: $hostgroup_id, hostname: $hostname, port: $port, isOK: $isOK] is OK, but is's status is 'OFFLINE_SOFT', we will update it to be 'ONLINE' 1" 120 | $proxysql_cmd "UPDATE mysql_servers SET status = 'ONLINE' WHERE hostgroup_id = $readGroupId AND hostname = '$hostname' AND port = $port; ${update_servers_cmd_opts}" 2>> $errFile 121 | fi 122 | done <> $errFile) 131 | while read hostname port 132 | do 133 | read isPrimaryNode <<< $(timeout $timeout $mysql_cmd -h$hostname -P$port -Nse "SELECT IF((SELECT @@server_uuid) = (SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME= 'group_replication_primary_member'), 1, 0);" 2>> $errFile) 134 | if [ "$isPrimaryNode" != "" -a $isPrimaryNode -eq 1 ]; then 135 | # success in finding new primary node from read hostgroup 136 | debug "success in finding new primary node from read hostgroup [hostgroup_id: $readGroupId, hostname: $hostname, port: $port]" 1 137 | $proxysql_cmd "UPDATE mysql_servers SET hostname = '$hostname', port = '$port', status = 'ONLINE' WHERE hostgroup_id = $writeGroupId; ${update_servers_cmd_opts}" 2>> $errFile 138 | successSwitchOver=1 139 | if [ $writeNodeCanRead -eq 0 ]; then 140 | $proxysql_cmd "UPDATE mysql_servers SET status = 'OFFLINE_SOFT' WHERE hostgroup_id = $readGroupId AND hostname = '$hostname' AND port = $port; ${update_servers_cmd_opts}" 2>> $errFile 141 | fi 142 | break 143 | fi 144 | done < [hostgroup_id read] [number writers] [writers are readers 0|1} [log_file]" 17 | exit 0 18 | } 19 | 20 | if [ "$1" = '-h' -o "$1" = '--help' -o -z "$1" ] 21 | then 22 | usage 23 | fi 24 | 25 | if [ $# -lt 1 ] 26 | then 27 | echo "Invalid number of arguments" 28 | usage 29 | fi 30 | 31 | HOSTGROUP_WRITER_ID="${1}" 32 | HOSTGROUP_READER_ID="${2:--1}" 33 | NUMBER_WRITERS="${3:-0}" 34 | WRITER_IS_READER="${4:-1}" 35 | ERR_FILE="${5:-/dev/null}" 36 | 37 | #echo "Hostgroup writers $HOSTGROUP_WRITER_ID" 38 | #echo "Hostgroup readers $HOSTGROUP_READER_ID" 39 | #echo "Number of writers $NUMBER_WRITERS" 40 | #echo "Writers are readers $WRITER_IS_READER" 41 | #echo "log file $ERR_FILE" 42 | 43 | #Timeout exists for instances where mysqld may be hung 44 | TIMEOUT=10 45 | 46 | PROXYSQL_CMDLINE="mysql -u$PROXYSQL_USERNAME -p$PROXYSQL_PASSWORD -h $PROXYSQL_HOSTNAME -P $PROXYSQL_PORT --protocol=tcp -Nse" 47 | MYSQL_CREDENTIALS=$($PROXYSQL_CMDLINE "SELECT variable_value FROM global_variables WHERE variable_name IN ('mysql-monitor_username','mysql-monitor_password') ORDER BY variable_name DESC") 48 | MYSQL_USERNAME=$(echo $MYSQL_CREDENTIALS | awk '{print $1}') 49 | MYSQL_PASSWORD=$(echo $MYSQL_CREDENTIALS | awk '{print $2}') 50 | #echo $MYSQL_CREDENTIALS 51 | #echo $MYSQL_USERNAME $MYSQL_PASSWORD 52 | MYSQL_CMDLINE="timeout $TIMEOUT mysql -u$MYSQL_USERNAME -p$MYSQL_PASSWORD " 53 | 54 | $PROXYSQL_CMDLINE "SELECT hostgroup_id, hostname, port, status, max_replication_lag FROM mysql_servers WHERE hostgroup_id IN ($HOSTGROUP_WRITER_ID, $HOSTGROUP_READER_ID) AND status <> 'OFFLINE_HARD'" | while read hostgroup server port stat max_replication_lag 55 | do 56 | read GR_STATUS READONLY TRX_BEHIND <<<$($MYSQL_CMDLINE -h $server -P $port -Nse "SELECT viable_candidate, read_only, transactions_behind FROM sys.gr_member_routing_candidate_status" 2>>${ERR_FILE} | tail -1 2>>${ERR_FILE}) 57 | echo "`date` Check server $hostgroup:$server:$port , status $stat , GR_STATUS $GR_STATUS, READONLY $READONLY, TRX_BEHIND $TRX_BEHIND" >> ${ERR_FILE} 58 | UPDATE_STATS_ONLINE_CMD="UPDATE mysql_servers SET status='ONLINE' WHERE hostgroup_id=$hostgroup AND hostname='$server' AND port='$port'; LOAD MYSQL SERVERS TO RUNTIME;" 59 | UPDATE_STATS_OFFLINE_SOFT_CMD="UPDATE mysql_servers SET status='OFFLINE_SOFT' WHERE hostgroup_id=$hostgroup AND hostname='$server' AND port='$port'; LOAD MYSQL SERVERS TO RUNTIME;" 60 | if [ "${GR_STATUS}" == "" -a "${READONLY}" == "" -a "${TRX_BEHIND}" == "" ] ; then 61 | # case : mysql server can not reach 62 | echo "`date` Changing server $hostgroup:$server:$port to status OFFLINE_SOFT" >> ${ERR_FILE} 63 | $PROXYSQL_CMDLINE "${UPDATE_STATS_OFFLINE_SOFT_CMD}" 2>> ${ERR_FILE} 64 | elif [ "${GR_STATUS}" == "YES" -a "${READONLY}" == "NO" -a "$stat" != "ONLINE" -a ${TRX_BEHIND} -le $max_replication_lag ] ; then 65 | echo "`date` Changing server $hostgroup:$server:$port to status ONLINE" >> ${ERR_FILE} 66 | $PROXYSQL_CMDLINE "${UPDATE_STATS_ONLINE_CMD}" 2>> ${ERR_FILE} 67 | elif [ "${GR_STATUS}" == "YES" -a "${READONLY}" == "YES" -a "$stat" != "ONLINE" -a "$hostgroup" == "$HOSTGROUP_READER_ID" -a ${TRX_BEHIND} -le $max_replication_lag ] ; then 68 | echo "`date` Changing server $hostgroup:$server:$port to status ONLINE" >> ${ERR_FILE} 69 | $PROXYSQL_CMDLINE "${UPDATE_STATS_ONLINE_CMD}" 2>> ${ERR_FILE} 70 | elif [ "${GR_STATUS}" == "NO" -o "${READONLY}" == "YES" -a "$stat" = "ONLINE" -a "$hostgroup" == "$HOSTGROUP_WRITER_ID" ] ; then 71 | echo "`date` Changing server $hostgroup:$server:$port to status OFFLINE_SOFT" >> ${ERR_FILE} 72 | $PROXYSQL_CMDLINE "${UPDATE_STATS_OFFLINE_SOFT_CMD}" 2>> ${ERR_FILE} 73 | elif [ "${GR_STATUS}" == "YES" -a "${READONLY}" == "NO" -a "$stat" = "ONLINE" -a ${TRX_BEHIND} -gt $max_replication_lag ] ; then 74 | echo "`date` Changing server $hostgroup:$server:$port to status OFFLINE_SOFT" >> ${ERR_FILE} 75 | $PROXYSQL_CMDLINE "${UPDATE_STATS_OFFLINE_SOFT_CMD}" 2>> ${ERR_FILE} 76 | fi 77 | done 78 | 79 | if [ $NUMBER_WRITERS -gt 0 ] 80 | then 81 | CONT=0 82 | # Only check online servers 83 | $PROXYSQL_CMDLINE "SELECT hostname, port FROM mysql_servers WHERE hostgroup_id = $HOSTGROUP_WRITER_ID AND status = 'ONLINE' order by hostname, port" | while read server port 84 | do 85 | if [ $CONT -ge $NUMBER_WRITERS ] 86 | then 87 | # Number of writers reached, disabling extra servers 88 | echo "`date` Number of writers reached, disabling extra write server $HOSTGROUP_WRITER_ID:$server:$port to status OFFLINE_SOFT" >> ${ERR_FILE} 89 | $PROXYSQL_CMDLINE "UPDATE mysql_servers set status = 'OFFLINE_SOFT' WHERE hostgroup_id = $HOSTGROUP_WRITER_ID AND hostname = '$server' AND port = $port;" 2>> ${ERR_FILE} 90 | fi 91 | CONT=$(( $CONT + 1 )) 92 | done 93 | fi 94 | 95 | if [ $WRITER_IS_READER -eq 0 ] 96 | then 97 | # Writer is not a read node, but only if we have another read node online 98 | READER_NON_WRITER=$($PROXYSQL_CMDLINE "SELECT count(*) FROM mysql_servers ms1 LEFT JOIN mysql_servers ms2 ON ms1.hostname = ms2.hostname AND ms1.port = ms2.port AND ms1.hostgroup_id <> ms2.hostgroup_id WHERE ms1.hostgroup_id = $HOSTGROUP_READER_ID AND ms1.status = 'ONLINE' AND (ms2.hostgroup_id = $HOSTGROUP_WRITER_ID OR ms2.hostgroup_id IS NULL) AND (ms2.status = 'OFFLINE_SOFT' OR ms2.hostgroup_id IS NULL);" 2>>${ERR_FILE}) 99 | if [ $READER_NON_WRITER -gt 0 ] 100 | then 101 | $PROXYSQL_CMDLINE "SELECT hostname, port FROM mysql_servers WHERE hostgroup_id = $HOSTGROUP_WRITER_ID AND status = 'ONLINE' order by hostname, port" | while read server port 102 | do 103 | echo "`date` Disabling read for write server $HOSTGROUP_READER_ID:$server:$port to status OFFLINE_SOFT" >> ${ERR_FILE} 104 | $PROXYSQL_CMDLINE "UPDATE mysql_servers set status = 'OFFLINE_SOFT' WHERE hostgroup_id = $HOSTGROUP_READER_ID AND hostname = '$server' AND port = $port;" 2>> ${ERR_FILE} 105 | done 106 | else 107 | echo "`date` Not enough read servers, we won't disable read in write servers" >> ${ERR_FILE} 108 | fi 109 | fi 110 | 111 | echo "`date` Enabling config" >> ${ERR_FILE} 112 | $PROXYSQL_CMDLINE "LOAD MYSQL SERVERS TO RUNTIME;" 2>> ${ERR_FILE} 113 | 114 | exit 0 115 | --------------------------------------------------------------------------------