Asterisk High Availability

From open-voip.org

High Availability

There are many ways to configure high availability with Asterisk, for example using DNS SRV from the device's perspective or putting a load balancer such as SER (Kamailio) in front. Here I will discuss HA within Asterisk itself.

So there are three issues we should take care of:

  • 1. The Asterisk IP
  • 2. The DB (mysql)
  • 3. The local files, such as the ASTDB and sound files like voicemail messages.

The setup: we will have two PBXs, 192.168.0.147 and 192.168.0.148. The virtual IP will be 192.168.0.149.


Configure IP redundancy using Heartbeat

Our goal is to have one virtual IP for the PBX. At any time only one of the PBXs will be active and the other will be in standby mode. If there is any problem with the master, the standby PBX becomes the new master.

My favorite way to keep the Asterisk IP up when one of the nodes is down is Heartbeat. Follow the steps below to build a Heartbeat setup with your Asterisk nodes:

1.1 Give a host name to each node. On 192.168.0.147:

vi /etc/sysconfig/network
HOSTNAME=PBX-147

Do the same on the second node, 192.168.0.148:

vi /etc/sysconfig/network
HOSTNAME=PBX-148

On both servers, edit the "hosts" file and add the two lines below:

 vi /etc/hosts
 192.168.0.147           PBX-147
 192.168.0.148           PBX-148
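
The same two entries can also be added from a script instead of by hand. The following is only a minimal sketch and not part of the original procedure: the function name ensure_host_entry is hypothetical, and the file is passed as a parameter so it can be tried on a copy of /etc/hosts first.

```shell
#!/bin/sh
# ensure_host_entry FILE IP NAME   (hypothetical helper, for illustration)
# Appends "IP NAME" to FILE unless FILE already has a line that starts
# with IP and lists NAME as one of its host names.
ensure_host_entry() {
    file=$1
    ip=$2
    name=$3
    if awk -v ip="$ip" -v name="$name" '
        $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }' "$file"
    then
        return 0    # entry already present, nothing to do
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$file"
}

# Example (as root, on both nodes):
#   ensure_host_entry /etc/hosts 192.168.0.147 PBX-147
#   ensure_host_entry /etc/hosts 192.168.0.148 PBX-148
```

Because the function checks before appending, running it twice does not duplicate the lines.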

Restart the network service on both servers:

service network restart

Disable SELinux:

 echo 0 >/selinux/enforce
 vi  /etc/selinux/config 

Instead of "SELINUX=enforcing", change it to:

 SELINUX=disabled
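
If you prefer not to open the editor, the same change can be scripted. This is a sketch under the assumption that the config file contains a line starting with SELINUX=; the function name set_selinux_disabled is hypothetical.

```shell
#!/bin/sh
# set_selinux_disabled FILE   (hypothetical helper, for illustration)
# Rewrites the SELINUX= line of FILE to SELINUX=disabled.
# Other lines (for example SELINUXTYPE=...) are left untouched.
set_selinux_disabled() {
    file=$1
    tmp=$(mktemp) || return 1
    # Write through a temp file, then copy back so FILE keeps its permissions.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example (as root):
#   set_selinux_disabled /etc/selinux/config
```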

Install Heartbeat:

 yum install heartbeat

Configure heartbeat

 vi /etc/ha.d/ha.cf

The ha.cf file should be as follows (identical on both servers):

 use_logd yes
 keepalive 200ms
 deadtime 2
 warntime 1
 initdead 120
 bcast eth3
 node PBX-147
 node PBX-148
 crm off
 auto_failback off
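
The timing directives above depend on each other, so a quick sanity check can catch a typo before Heartbeat is started. The sketch below encodes rules of thumb that I am assuming rather than quoting from this page: keepalive below warntime, warntime below deadtime, and initdead at least twice deadtime. It also assumes all four directives are present and use whole numbers, either in seconds ("2") or with an "ms" suffix ("200ms"). The function names are hypothetical.

```shell
#!/bin/sh
# to_ms VALUE: convert a time value ("2" = seconds, "200ms") to milliseconds.
to_ms() {
    case $1 in
        *ms) echo "${1%ms}" ;;
        *)   echo $(( $1 * 1000 )) ;;
    esac
}

# ha_value FILE KEY: print the value of the first KEY directive in FILE.
ha_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# check_ha_timing FILE: succeed only if
#   keepalive < warntime < deadtime  and  initdead >= 2 * deadtime
check_ha_timing() {
    keepalive=$(to_ms "$(ha_value "$1" keepalive)")
    warntime=$(to_ms "$(ha_value "$1" warntime)")
    deadtime=$(to_ms "$(ha_value "$1" deadtime)")
    initdead=$(to_ms "$(ha_value "$1" initdead)")
    [ "$keepalive" -lt "$warntime" ] &&
    [ "$warntime" -lt "$deadtime" ] &&
    [ "$initdead" -ge $(( deadtime * 2 )) ]
}

# Example:
#   check_ha_timing /etc/ha.d/ha.cf && echo "timing looks sane"
```

With the values shown above (200ms, 1, 2 and 120) the check passes.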

Edit (or add) the haresources file:

 vi /etc/ha.d/haresources

The file should include the following line (again, identical on both servers). Note that we want Heartbeat to start the "asterisk" and "sync_astdb" services when a node becomes master and to stop them when it becomes standby:

 PBX-147 192.168.0.149/24 asterisk sync_astdb

Edit the authkeys file:

 vi /etc/ha.d/authkeys

It should have the following (identical on both servers):

 auth 1
 1 sha1 HIlalalala!

Change the permissions of the authkeys file on both servers:

 chmod 600 /etc/ha.d/authkeys
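
Instead of typing a fixed string as the key, the file can be generated with a random secret. This is a sketch, not the original author's method: the function name write_authkeys is hypothetical, and it assumes /dev/urandom, dd, od and tr are available. The key must be identical on both servers, so generate the file on one node and copy it to the other.

```shell
#!/bin/sh
# write_authkeys FILE   (hypothetical helper, for illustration)
# Creates FILE with a random sha1 key, readable by the owner only.
write_authkeys() {
    file=$1
    # 32 random bytes rendered as 64 hexadecimal characters
    key=$(dd if=/dev/urandom bs=32 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ -n "$key" ] || return 1
    # Create the file in a subshell so the restrictive umask does not leak out.
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$file" ) || return 1
    chmod 600 "$file"
}

# Example (as root, on one node; then copy the file to the other node):
#   write_authkeys /etc/ha.d/authkeys
```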

Start Heartbeat on both servers:

 /etc/init.d/heartbeat start

Use "ifconfig" to check that the virtual IP is up, or ping 192.168.0.149.

In the boot sequence we want Heartbeat to start after the network services are up:
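
To check for the virtual IP from a script rather than by eye, the interface listing can be piped through a small filter. A minimal sketch, with a hypothetical function name has_ip; it reads the output of ifconfig (or "ip -4 addr show", assuming the ip tool is installed) on standard input.

```shell
#!/bin/sh
# has_ip ADDRESS   (hypothetical helper, for illustration)
# Reads interface listing text on stdin and succeeds if ADDRESS appears
# as a complete IPv4 address (so 192.168.0.14 does not match 192.168.0.149).
has_ip() {
    # Escape the dots so they match literally.
    pattern=$(printf '%s' "$1" | sed 's/\./\\./g')
    grep -Eq "(^|[^0-9.])${pattern}([^0-9.]|\$)"
}

# Example:
#   ifconfig | has_ip 192.168.0.149 && echo "this node holds the virtual IP"
```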

 vi /etc/rc.local

add the following lines:

 service network restart
 sleep 6
 /etc/init.d/heartbeat start

remove heartbeat from the regular startup sequence:

 chkconfig heartbeat --level 345 off

Test failover by rebooting the master server (147) or by stopping Heartbeat on the master server:

 /etc/init.d/heartbeat stop

Make sure the ping to the virtual IP is still answered, and run "ifconfig" on the standby server (148, which is now the active one). Failover can also be tested using the Heartbeat commands:

 /usr/lib/heartbeat/hb_takeover         
 /usr/lib/heartbeat/hb_standby 

Debugging can be done using:

 tail -f /var/log/messages

Configure the MySQL DB redundancy

In order to keep the MySQL DB up while one of the PBX servers is down, one of the following can be done:

  • Have a remote DB server, with both PBXs working against the same remote MySQL (cluster?)
  • Each server has its own MySQL DB, and MySQL is replicated from the active server to the standby

We will discuss the second option:

Configure MySQL replication on my.cnf

/etc/my.cnf on PBX-147 should be:

 [mysqld]
 datadir=/var/lib/mysql
 socket=/var/lib/mysql/mysql.sock
 user=mysql
 # Default to using old password format for compatibility with mysql 3.x
 # clients (those using the mysqlclient10 compatibility package).
 old_passwords=1
 # add for replication from here
 server-id=2
 master-host = 192.168.0.148
 master-user = replication
 master-password = slave
 master-port = 3306
 #
 log-bin
 binlog-do-db=asteriskrealtime
 #
 [mysql.server]
 user=mysql
 basedir=/var/lib
 ################################################################
 [mysqld_safe]
 log-error=/var/log/mysqld.log
 pid-file=/var/run/mysqld/mysqld.pid

/etc/my.cnf on PBX-148 should be:

 [mysqld]
 datadir=/var/lib/mysql
 socket=/var/lib/mysql/mysql.sock
 user=mysql
 # Default to using old password format for compatibility with mysql 3.x
 # clients (those using the mysqlclient10 compatibility package).
 old_passwords=1
 # add for replication from here
 log-bin
 binlog-do-db=asteriskrealtime
 binlog-ignore-db=mysql
 binlog-ignore-db=test
 #
 server-id=1
 #
 master-host = 192.168.0.147
 master-user = replication
 master-password = slave
 master-port = 3306
 #
 [mysql.server]
 user=mysql
 basedir=/var/lib
 ########################################################################
 [mysqld_safe]
 log-error=/var/log/mysqld.log
 pid-file=/var/run/mysqld/mysqld.pid

Configure MySQL permissions

On PBX-147, log in to MySQL, entering the root password when prompted:

 mysql -p

Grant access to the local "replication" user and to the "replication" user from the remote server:

 GRANT ALL PRIVILEGES ON *.* TO 'replication'@'localhost' IDENTIFIED BY 'slave' WITH GRANT OPTION;
 GRANT ALL PRIVILEGES ON `asteriskrealtime`.* TO 'replication'@'localhost';
 GRANT ALL PRIVILEGES ON *.* TO 'replication'@'192.168.0.148' IDENTIFIED BY 'slave' WITH GRANT OPTION;
 GRANT ALL PRIVILEGES ON `asteriskrealtime`.* TO 'replication'@'192.168.0.148';
 commit;

On PBX-148, log in to MySQL, entering the root password when prompted:

 mysql -p

Grant access to the local "replication" user and to the "replication" user from the remote server:

 GRANT ALL PRIVILEGES ON *.* TO 'replication'@'localhost' IDENTIFIED BY 'slave' WITH GRANT OPTION;
 GRANT ALL PRIVILEGES ON `asteriskrealtime`.* TO 'replication'@'localhost';
 GRANT ALL PRIVILEGES ON *.* TO 'replication'@'192.168.0.147' IDENTIFIED BY 'slave' WITH GRANT OPTION;
 GRANT ALL PRIVILEGES ON `asteriskrealtime`.* TO 'replication'@'192.168.0.147';
 commit;


Start the MySQL replication

On both PBX-147 and PBX-148, do the following. Log in to MySQL as the replication user:

 mysql -u replication -p

Start the replication:

 start slave;

On one of the servers, copy the DB so that both start from the same point:

 load data from master;

check status on each server:

 show master status;
 show slave status\G

If replication stops working and you want to start it over:

 reset slave;

debug using:

 cat /var/log/mysqld.log 
 or
 tail -f /var/log/mysqld.log
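
The slave status output is long, and the two fields that matter most for a quick health check are Slave_IO_Running and Slave_SQL_Running. The filter below is a minimal sketch with a hypothetical function name, slave_ok; it assumes the vertical (\G) output format is piped into it.

```shell
#!/bin/sh
# slave_ok   (hypothetical helper, for illustration)
# Reads the vertical output of "show slave status" on stdin and succeeds
# only if both replication threads report "Yes".
slave_ok() {
    awk -F': *' '
        { sub(/^ +/, "") }                       # drop the left padding
        $1 == "Slave_IO_Running"  { io  = $2 }
        $1 == "Slave_SQL_Running" { sql = $2 }
        END { exit ((io == "Yes" && sql == "Yes") ? 0 : 1) }'
}

# Example (run on each server):
#   mysql -u replication -p -e 'show slave status\G' | slave_ok \
#       && echo "replication threads are running"
```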

Test the mysql Replication

  • 1. Connect to the MySQL servers of both PBX-147 and PBX-148 from SQLyog
  • 2. Make a change in one table on PBX-147 and save it
  • 3. Refresh the table data on PBX-148
  • 4. Verify that the change was replicated
  • 5. Now test in the other direction, from PBX-148 to PBX-147

File:mysql_replication_test.gif


Asterisk file replication:

Permit ssh between servers

In order to copy files between the servers, we will allow ssh without a password between the two PBX servers. Generate a key, pressing Enter at every prompt:

 [root@PBX-147 ~]# cd /root
 [root@PBX-147 ~]# ssh-keygen -t rsa
 Generating public/private rsa key pair.
 Enter file in which to save the key (/root/.ssh/id_rsa): 
 Created directory '/root/.ssh'.
 Enter passphrase (empty for no passphrase): 
 Enter same passphrase again: 
 Your identification has been saved in /root/.ssh/id_rsa.
 Your public key has been saved in /root/.ssh/id_rsa.pub.
 The key fingerprint is:
 a5:0c:38:d6:73:51:3b:ca:2c:98:e2:2a:f6:82:88:00 root@PBX-147

Connect through ssh to 148 and create the .ssh directory; enter the 148 root password when prompted:

 [root@PBX-147 ~]# ssh root@PBX-148 mkdir -p .ssh
 The authenticity of host 'pbx-148 (192.168.0.148)' can't be established.
 RSA key fingerprint is e0:ed:63:7c:df:0f:8d:03:f0:a5:25:8b:5f:1b:8c:01.
 Are you sure you want to continue connecting (yes/no)? yes
 Warning: Permanently added 'pbx-148,192.168.0.148' (RSA) to the list of known hosts.
 root@pbx-148's password:

Copy the key to the other PBX (enter the password when prompted):

 [root@PBX-147 ~]# cat .ssh/id_rsa.pub | ssh root@PBX-148 'cat >> .ssh/authorized_keys'
 root@pbx-148's password: 

verify by connecting through ssh to the other PBX and print its hostname:

 [root@PBX-147 ~]# ssh root@PBX-148 hostname
 PBX-148

Now we will allow ssh from PBX-148 to PBX-147:

 [root@PBX-148 ~]# cd /root
 [root@PBX-148 ~]# ssh-keygen -t rsa
 Generating public/private rsa key pair.
 Enter file in which to save the key (/root/.ssh/id_rsa): 
 Enter passphrase (empty for no passphrase): 
 Enter same passphrase again: 
 Your identification has been saved in /root/.ssh/id_rsa.
 Your public key has been saved in /root/.ssh/id_rsa.pub.
 The key fingerprint is:
 e0:a1:27:17:c6:5d:69:60:73:73:b9:bd:80:ed:0f:df root@PBX-148
 [root@PBX-148 ~]# ssh root@PBX-147 mkdir -p .ssh
 The authenticity of host 'pbx-147 (192.168.0.147)' can't be established.
 RSA key fingerprint is e0:ed:63:7c:df:0f:8d:03:f0:a5:25:8b:5f:1b:8c:01.
 Are you sure you want to continue connecting (yes/no)? yes
 Warning: Permanently added 'pbx-147,192.168.0.147' (RSA) to the list of known hosts.
 root@pbx-147's password: 
 [root@PBX-148 ~]# cat .ssh/id_rsa.pub | ssh root@PBX-147 'cat >> .ssh/authorized_keys'
 root@pbx-147's password: 
 [root@PBX-148 ~]#  ssh root@PBX-147 hostname
 PBX-147

In order to synchronize the files between the master PBX and the standby PBX, we will create a script that copies the astdb to the standby PBX. If the standby becomes the master, the replication on the old master should be stopped and then started from the new master to the new standby. (The same should be done for the voicemail files, /var/spool/asterisk/voicemail/*.)


Creating the sync file as a service

create the following file on PBX-147:

 vi /opt/sync_astdb 

save file with the following:

 #!/bin/sh
 while true
 do
    rsync -r /var/lib/asterisk/astdb root@PBX-148:/var/lib/asterisk/astdb
 #  logger Rsync myscript
    sleep 30  
 done

and the same on PBX-148:

 vi /opt/sync_astdb 

save the file with the following (note that now the copy is to 147 and NOT 148):

 #!/bin/sh
 while true
 do
    rsync -r /var/lib/asterisk/astdb root@PBX-147:/var/lib/asterisk/astdb
 #  logger Rsync myscript
    sleep 30  
 done
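
The two scripts differ only in the target host, and the voicemail files mentioned earlier are not copied yet. One possible variation, shown here only as a sketch, is a single script that works on both nodes: it picks the peer from the local hostname and also pushes the voicemail spool. The function names peer_for and sync_once and the RSYNC override variable are hypothetical, and the use of "rsync -a" with trailing slashes on the directory is my choice rather than something from the original script.

```shell
#!/bin/sh
# peer_for HOSTNAME: print the name of the other node of the pair.
peer_for() {
    case $1 in
        PBX-147) echo PBX-148 ;;
        PBX-148) echo PBX-147 ;;
        *) echo "unknown node: $1" >&2; return 1 ;;
    esac
}

# RSYNC can be overridden (for example RSYNC=echo) to print the commands
# instead of running them.
RSYNC=${RSYNC:-rsync}

# sync_once PEER: push the AstDB file and the voicemail spool to PEER.
sync_once() {
    $RSYNC -a /var/lib/asterisk/astdb "root@$1:/var/lib/asterisk/astdb" &&
    $RSYNC -a /var/spool/asterisk/voicemail/ "root@$1:/var/spool/asterisk/voicemail/"
}

# Main loop, using the same 30-second interval as the scripts above:
#   peer=$(peer_for "$(hostname)") || exit 1
#   while true; do sync_once "$peer"; sleep 30; done
```

Running it once with RSYNC=echo is a cheap way to see exactly which paths would be copied before letting it loose on a live PBX.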


Adding sync_astdb as a service (on both servers)

 vi /etc/init.d/sync_astdb

save the file with the following:

 #!/bin/sh
 # 
 # chkconfig: 2345 75 05
 # description: Startup script AstDb replication
 # processname: sync_astdb
 # pidfile: /var/run/sync_astdb.pid
 # config: /opt/sync_astdb
 # Source function library.
 . /etc/init.d/functions
 SLAPD_HOST=`hostname -a`
 SLAPD_DIR=/opt/
 PIDFILE=$SLAPD_DIR/logs/pid
 STARTPIDFILE=$SLAPD_DIR/logs/startpid
 if [ -f /etc/sysconfig/sync_astdb ]; then
         . /etc/sysconfig/sync_astdb
 fi
 start() {
         echo -n "Starting AstDB replication: "
         if [ -f $STARTPIDFILE ]; then
                 PID=`cat $STARTPIDFILE`
                 echo AstDB replication is  already running: $PID
                 exit 2;
         elif [ -f $PIDFILE ]; then
                 PID=`cat $PIDFILE`
                 echo AstDB replication is  already running: $PID
                 exit 2;
         else
                 cd $SLAPD_DIR
                 daemon  ./sync_astdb $OPTIONS &
                 RETVAL=$?
                 echo
                 [ $RETVAL -eq 0 ] && touch /var/lock/subsys/sync_astdb
                 return $RETVAL
         fi
 }
 stop() {
         echo -n "Shutting down AstDb replication: "
         echo
         killproc sync_astdb
         echo
         rm -f /var/lock/subsys/sync_astdb
         return 0
 }
 case "$1" in
   start)
       start
       ;;
   stop)
       stop
       ;;
   status)
       status sync_astdb
       ;;
   restart)
       stop
       start
       ;;
   *)
       echo "Usage: $0 {start|stop|status|restart}"
       exit 1
       ;;
 esac
 exit $?

Add sync_astdb as a service, but remove it from startup:

 chkconfig --add sync_astdb
 chkconfig sync_astdb --level 345 off

change the service permission on both servers:

 chmod 755 /etc/init.d/sync_astdb
 chmod 755 /opt/sync_astdb

verify on both servers:

 /etc/init.d/sync_astdb start
 /etc/init.d/sync_astdb stop

Test the Asterisk redundancy

So, we have master/standby Asterisk servers. The master holds the active Heartbeat resource (the virtual IP), a MySQL DB that replicates data to the standby, and a service that replicates the needed files such as the Asterisk DB (ASTDB). When a switchover happens, the virtual IP jumps to the standby (now the new master), the MySQL replication is already up, and the service that replicates the files is started on the standby server (now the master) and stopped on the old master. Let's test the new setup:

  • open a PING to the PBX IPs and the virtual IP (192.168.0.147, 192.168.0.148, 192.168.0.149)
  • reboot both servers
  • when both are up, PBX-147 should be active:
    • verify the virtual IP 192.168.0.149 using ifconfig.
    • verify the MySQL replication by changing data (use SQLyog) on 147 and checking on 148.
    • verify that the "sync_astdb" service is running on PBX-147 and not on PBX-148 (ps -aux | grep syn)
  • reboot PBX-147
  • check that the virtual IP is still up and that it is now on PBX-148 (ifconfig)
  • check that the "sync_astdb" service has started on PBX-148.
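
The checklist boils down to one rule: the active node holds the virtual IP and runs sync_astdb, and the standby node does neither. A small sketch of that rule as a script follows. The function name role_consistent is hypothetical, and the commands in the comments that gather the observations assume the ip and pgrep tools are installed.

```shell
#!/bin/sh
# role_consistent ROLE VIP SYNC   (hypothetical helper, for illustration)
#   ROLE: "active" or "standby"; VIP and SYNC: "yes" or "no".
# Succeeds when the observations match the expected state of the node:
# active means virtual IP present and sync_astdb running, standby means neither.
role_consistent() {
    case "$1:$2:$3" in
        active:yes:yes|standby:no:no)
            return 0 ;;
        *)
            echo "unexpected state for $1 node: vip=$2 sync=$3" >&2
            return 1 ;;
    esac
}

# Gathering the observations on a node:
#   ip -4 addr show | grep -q '192\.168\.0\.149/' && vip=yes || vip=no
#   pgrep -f sync_astdb >/dev/null && sync=yes || sync=no
#   role_consistent active "$vip" "$sync"
```

Run it with "active" on the node you expect to be the master and with "standby" on the other one, before and after the failover.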