Linux Administration: Deploying Highly Available MySQL with MHA and HAProxy

Setting up a highly available and scalable MySQL system is a problem that has been worked on for more than a decade, and many solutions exist: MariaDB Galera Cluster, Percona XtraDB Cluster, MySQL Cluster CGE and various custom setups, to name a few. The one I am going to demonstrate today involves MHA [1], HAProxy [2] (for scaling reads and writes) and keepalived to implement VRRP [3]. The benefit of using MHA is that you can use any storage engine you want, on Oracle MySQL, MariaDB or Percona Server. It also works with either traditional replication or GTIDs.

MHA (Master High Availability Manager and tools for MySQL) is a set of tools written in Perl, consisting of a manager that sits on a dedicated host and a collection of scripts residing on the MySQL nodes. The manager monitors the status of the cluster and, when the master fails, promotes the most current slave to be the new master and executes a script to deal with the failover (e.g. moving the virtual IP or changing a config file). It does this by ssh-ing to the MySQL nodes, running the scripts there, scp-ing relay logs and so on. If allowing a process to ssh to your database servers is not an option for you, then this setup is not the best choice.

For this example we'll have four servers: two for the MHA manager and HAProxy, and two for the master-slave MySQL servers. I'll be using GTID-based replication [4].

First, let's start by setting up GTID replication between the master and the slave.

On both database servers:

File: gistfile1.sh
------------------
[root@mysql-n01 ~] cat /etc/mysql/conf.d/replication.cnf
[mysqld]
server_id = 1
report-host = 1
report-port = 1
read_only = 0

# binary logs
log_bin = /var/log/mysql/replica-1-bin
expire_logs_days = 3
max_binlog_size = 1G
log_slave_updates = 1
sync-binlog = 0
binlog_format = MIXED

# GTID
gtid_mode = ON
enforce-gtid-consistency

# Relay logs
relay_log = /var/log/mysql/replica-1-relay
relay_log_purge = 1
relay_log_recovery = 1
relay_log_space_limit = 5G

[root@mysql-n01 ~] mysql
mysql> GRANT REPLICATION SLAVE ON *.* TO 'replicationuser'@'%' IDENTIFIED BY 'somepassword';
mysql> FLUSH PRIVILEGES;

[root@mysql-n02 ~] cat /etc/mysql/conf.d/replication.cnf
[mysqld]
server_id = 2
report-host = 1
report-port = 1
read_only = 1

# binary logs
log_bin = /var/log/mysql/replica-2-bin
expire_logs_days = 3
max_binlog_size = 1G
log_slave_updates = 1
sync-binlog = 0
binlog_format = MIXED

# GTID
gtid_mode = ON
enforce-gtid-consistency

# Relay logs
relay_log = /var/log/mysql/replica-2-relay
relay_log_purge = 1
relay_log_recovery = 1
relay_log_space_limit = 5G

[root@mysql-n02 ~] mysql
mysql> GRANT REPLICATION SLAVE ON *.* TO 'replicationuser'@'%' IDENTIFIED BY 'somepassword';
mysql> FLUSH PRIVILEGES;
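The two option files must declare different server_id values, or replication will not work. As a quick sanity check, here is a minimal sketch that compares the IDs in two option files. It assumes you have both files available locally (for example copied from the two nodes); the function names get_server_id and server_ids_differ are my own and not part of any tool.

```shell
#!/bin/bash

# Print the server_id value found in a MySQL option file.
# Accepts both the "server_id" and "server-id" spellings.
get_server_id() {
    local cnf_file=$1
    sed -n -E 's/^[[:space:]]*server[-_]id[[:space:]]*=[[:space:]]*([0-9]+).*$/\1/p' "$cnf_file" | head -n 1
}

# Succeed only if both option files declare a server ID and the IDs differ.
server_ids_differ() {
    local id_a id_b
    id_a=$(get_server_id "$1")
    id_b=$(get_server_id "$2")
    [ -n "$id_a" ] && [ -n "$id_b" ] && [ "$id_a" != "$id_b" ]
}

# Example (hypothetical local copies of the two files):
#   server_ids_differ n01-replication.cnf n02-replication.cnf && echo "server IDs are unique"
```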

To start the replication, on the current slave execute:

File: gistfile1.sh
------------------
mysql> change master to master_host="10.188.50.124", master_user="replicationuser", master_password="somepassword", master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.14 sec)

mysql> start slave;
Query OK, 0 rows affected (0.06 sec)

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.188.50.124
                  Master_User: replicationuser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: replica-2-bin.000004
          Read_Master_Log_Pos: 231
               Relay_Log_File: replica-1-relay.000002
                Relay_Log_Pos: 369
        Relay_Master_Log_File: replica-2-bin.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 231
              Relay_Log_Space: 573
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 2
                  Master_UUID: ebc53c30-5de6-11e4-ac82-0018518bc543
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set: e443c439-5de6-11e4-ac82-0018511fca0b:1-6,
ebc53c30-5de6-11e4-ac82-0018518bc543:1-3
                Auto_Position: 1
1 row in set (0.00 sec)
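The two fields to look at in that output are Slave_IO_Running and Slave_SQL_Running; both should say Yes. If you want to script that check, something like the following sketch could work. The function name replication_state is my own, and it only inspects those two fields, so treat it as a starting point rather than a full health check.

```shell
#!/bin/bash

# Read "SHOW SLAVE STATUS\G" output on stdin and report whether both
# replication threads are running. Prints "ok" or "broken".
replication_state() {
    awk -F': ' '
        /^[[:space:]]*Slave_IO_Running:/  { io  = $2 }
        /^[[:space:]]*Slave_SQL_Running:/ { sql = $2 }
        END {
            if (io == "Yes" && sql == "Yes") print "ok"
            else print "broken"
        }
    '
}

# Example, run on the slave:
#   mysql -e 'SHOW SLAVE STATUS\G' | replication_state
```

The patterns include the trailing colon, so the Slave_SQL_Running_State line is not mistaken for Slave_SQL_Running.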

Now that replication has been set up, let's install the MHA tools on both database servers:

File: gistfile1.txt
-------------------
[root@mysql-n01 ~] wget https://mysql-master-ha.googlecode.com/files/mha4mysql-node_0.54-0_all.deb
[root@mysql-n01 ~] apt-get install libdbd-mysql-perl
[root@mysql-n01 ~] dpkg --install mha4mysql-node_0.54-0_all.deb

Next, on the two manager nodes, let's install the MHA manager, HAProxy and keepalived. The manager nodes also need the MHA node package:

File: gistfile1.sh
------------------
[root@manager-n01 ~] apt-get install libdbd-mysql-perl
[root@manager-n01 ~] apt-get install libconfig-tiny-perl
[root@manager-n01 ~] apt-get install liblog-dispatch-perl
[root@manager-n01 ~] apt-get install libparallel-forkmanager-perl
[root@manager-n01 ~] wget https://mysql-master-ha.googlecode.com/files/mha4mysql-manager_0.55-0_all.deb
[root@manager-n01 ~] wget https://mysql-master-ha.googlecode.com/files/mha4mysql-node_0.54-0_all.deb
[root@manager-n01 ~] dpkg --install mha4mysql-node_0.54-0_all.deb
[root@manager-n01 ~] dpkg --install mha4mysql-manager_0.55-0_all.deb
[root@manager-n01 ~] apt-get install haproxy keepalived

Make sure you have a user with ssh keys deployed that MHA can use to ssh between all servers. The configs for MHA, HAProxy and keepalived follow:

File: gistfile1.sh
------------------
[root@manager-n01 ~] cat /etc/app1.cnf
[server default]
# MySQL admin user and password that MHA connects with; ssh_user is the OS account used to ssh to the database nodes
user=mhamanager
password=somepassword
ssh_user=root
# working directory on the manager
manager_workdir=/var/log/masterha/app1
# working directory on MySQL servers
remote_workdir=/var/log/masterha/app1
master_ip_failover_script=/usr/bin/master_ip_failover

[server1]
hostname=10.188.49.114

[server2]
hostname=10.188.50.124

[root@manager-n01 ~] cat /usr/bin/master_ip_failover
#!/bin/bash

COMMAND=$1
OLD_MASTER_IP=$(echo $4 | cut -d"=" -f2)
NEW_MASTER_IP=$(echo $7 | cut -d"=" -f2)

if [ "$(echo $COMMAND | grep start)" != "" ]
then
    logger "Failover detected. Changing HAProxy config file"
    logger "Failed Master IP: $OLD_MASTER_IP, New Master IP: $NEW_MASTER_IP"

    MASTER_STANZA=$(cat /etc/haproxy/haproxy.cfg | grep "server mysql-master")
    sed -i "s/${MASTER_STANZA}/    server mysql-master ${NEW_MASTER_IP}:3306/" /etc/haproxy/haproxy.cfg

    logger "Restarting HAProxy"
    /etc/init.d/haproxy restart
fi

[root@manager-n01 ~] cat /etc/haproxy/haproxy.cfg
global
    log 127.0.0.1 local1
    log-tag haproxy
    maxconn 4096
    user haproxy
    group haproxy
    daemon
    stats socket /var/run/haproxy.sock mode 600 level admin
    stats timeout 2m
    tune.ssl.default-dh-param 2048

defaults
    log global
    mode tcp
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
    option dontlognull
    option tcplog
    option logasap

frontend mysql_master
    bind 10.188.50.121:3306
    default_backend mysql_master

frontend mysql_slave
    bind 10.188.50.121:3307
    default_backend mysql_slaves

backend mysql_master
    server mysql-master 10.188.50.100:3306

backend mysql_slaves
    server mysql-slaves 10.188.50.110:3306 check

[root@manager-n01 ~] cat /etc/keepalived/keepalived.conf
vrrp_instance management_network {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        10.188.50.121/20 dev eth0 label eth0:0
    }
    nopreempt
    notify /usr/local/bin/mha_manager.sh
}

[root@manager-n01 ~] cat /usr/local/bin/mha_manager.sh
#!/bin/bash

TYPE=$1
NAME=$2
STATE=$3

case $STATE in
    "MASTER")
        /etc/init.d/haproxy start && /usr/bin/masterha_manager --conf=/etc/app1.cnf
        exit 0
        ;;
    "BACKUP")
        /etc/init.d/haproxy stop && /usr/bin/masterha_stop --conf=/etc/app1.cnf
        exit 0
        ;;
    "FAULT")
        /etc/init.d/haproxy stop && /usr/bin/masterha_stop --conf=/etc/app1.cnf
        exit 0
        ;;
    *)
        echo "unknown state"
        exit 1
        ;;
esac

The first config is for the MHA manager. The master_ip_failover_script option specifies which script to execute after the new master is promoted. The script here just rewrites the HAProxy config and restarts HAProxy, but it can do anything. The [server1] and [server2] sections define the MySQL servers. MHA will determine which one is the master and which one is the slave.
The HAProxy config file is pretty self-explanatory; for more information you can read my other HAProxy posts.
The only interesting part of the keepalived config is the notify directive, which names the script keepalived triggers on a state change, such as when the current MHA server fails. In that case it will basically start HAProxy on the standby server along with the MHA manager.
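The failover script above picks its arguments by position ($1, $4, $7), which breaks if the argument order ever differs. A slightly more defensive variant is to look the arguments up by name. The sketch below assumes MHA passes them in --name=value form (which is what the cut -d"=" calls above rely on) and that the names are --command and --new_master_ip; check what your MHA version actually passes before using it. The helper names get_opt and set_haproxy_master are my own.

```shell
#!/bin/bash

# Return the value of a "--name=value" argument, wherever it appears
# in the argument list. Returns non-zero if the option is absent.
get_opt() {
    local name=$1; shift
    local arg
    for arg in "$@"; do
        case $arg in
            --"$name"=*) printf '%s\n' "${arg#*=}"; return 0 ;;
        esac
    done
    return 1
}

# Point the "server mysql-master" line of an HAProxy config at a new
# address, keeping the indentation and any trailing server options.
# Uses GNU sed's in-place editing (-i).
set_haproxy_master() {
    local cfg=$1 new_ip=$2
    sed -i -E "s/^([[:space:]]*server[[:space:]]+mysql-master[[:space:]]+)[^[:space:]]+/\1${new_ip}:3306/" "$cfg"
}

# Example use inside a failover script (assumed option names):
#   command=$(get_opt command "$@")
#   new_ip=$(get_opt new_master_ip "$@")
#   [ "$command" = "start" ] && set_haproxy_master /etc/haproxy/haproxy.cfg "$new_ip"
```

Matching on the server name rather than on the whole existing line also means the substitution does not depend on the old IP or on the exact whitespace in the file.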

To manually start the MHA manager, check its status, and stop it, run:

File: gistfile1.sh
------------------
[root@manager-n01 ~] /usr/bin/masterha_manager --conf=/etc/app1.cnf
[root@manager-n01 ~] /usr/bin/masterha_check_status --conf=/etc/app1.cnf
[root@manager-n01 ~] /usr/bin/masterha_stop --conf=/etc/app1.cnf
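Before starting the manager for the first time, it is worth running the pre-flight checks that ship with the MHA manager package: masterha_check_ssh verifies the ssh connectivity between the nodes, and masterha_check_repl verifies the replication setup. The small sketch below only prints the two commands for a given config file so you can review them first; the function name mha_preflight_commands is my own, and the default path is just the config file used in this post.

```shell
#!/bin/bash

# Print the MHA pre-flight check commands for a given config file.
# Nothing is executed here; pipe the output to "sh" to run the checks.
mha_preflight_commands() {
    local conf=${1:-/etc/app1.cnf}
    printf '%s\n' \
        "masterha_check_ssh --conf=${conf}" \
        "masterha_check_repl --conf=${conf}"
}

# Example:
#   mha_preflight_commands /etc/app1.cnf        # review the commands
#   mha_preflight_commands /etc/app1.cnf | sh   # run them on a manager node
```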

To test a failover, stop MySQL on the current master and watch MHA promote the slave to a master, and change the HAProxy config to reflect that.
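To confirm that the HAProxy config really changed, you can compare the address on the "server mysql-master" line before and after stopping MySQL. Here is a minimal sketch for reading it; the function name current_haproxy_master is my own, and it assumes the backend server is named mysql-master as in the config above.

```shell
#!/bin/bash

# Print the address:port that the "server mysql-master" line of an
# HAProxy config currently points at.
current_haproxy_master() {
    local cfg=${1:-/etc/haproxy/haproxy.cfg}
    awk '$1 == "server" && $2 == "mysql-master" { print $3; exit }' "$cfg"
}

# Example, run on the active manager node:
#   before=$(current_haproxy_master)
#   ... stop MySQL on the current master and wait for MHA to finish ...
#   after=$(current_haproxy_master)
#   [ "$before" != "$after" ] && echo "HAProxy now points at $after"
```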

Resources:
[1]. https://code.google.com/p/mysql-master-ha/
[2]. http://www.haproxy.org/
[3]. http://www.keepalived.org/
[4]. http://dev.mysql.com/doc/refman/5.6/en/replication-gtids-howto.html
