picture of Michael Tanoviceanu
Here at Soundbreak we stream out live audio and video 24 hours a day, so it didn't take much convincing to be able to run some testing on MySQL's new replication feature. We discoverd that we could use it to keep an up to date standby database server that could handle all of the queries should the primary fail for any reason. It's not dificult to configure two servers for just such a purpose, and I'll walk through the procedure, as well as a discussion as to how you can use PHP to redirect the queries should the primary server fail.
MySQL's internal replication is built on a master-slave relationship between two or more servers, with one acting as the master, and any number acting as slaves. I'll walk through configuring two servers as a master slave pair, describing the process as we move through it. I initially performed the procedure outlined below on 3.23.22, and have also tested it on 3.23.23. The MySQL developers recommend that you use the most recent version, and that both the master and slave use the same version while the 3.23 series is still in beta as versions may not be backward compatible. I currently have not yet used this procedure on a live site for that reason, as one of the advantages of having fail-over capabilities is to be able to upgrade one server without interrupting any of the queries.

Step One: Configure The Master

For the remainder of the article, I'll refer to two servers, A (10.1.1.1), the primary server, and B (10.1.1.2), the standby server.
MySQL's replication is done by having the slave server (B) connect to the master (A) and read the binary update log, incorporating those changes into its own databases. The slave needs a user account to connect to the master, so on the master (A) create an account with only the FILE privilege with the following:
GRANT FILE ON *.* TO replicate@10.1.1.2 IDENTIFIED BY 'password';
Don't worry about running 'FLUSH PRIVILEGES' on the master to ensure that the slave can connect, as we'll be stopping the server in the next step.
Now we need a snapshot of the codesent data, and to configure the master to start generating binary update logs. First edit the 'my.cnf' file to enable the binary update log, so somewhere under [mysqld] portion add the line: 'log-bin'. Now the next time the server starts, we'll be generating the binary update log (named -bin. ). When you shut down the MySQL server to enable the binary update logging, copy all of the database directories of the master to another directory, then restart mysqld Be sure to get all of the databases, or you could end up with errors when replicating if a table or database exists on the master but not on the slave. Now you have a snapshot of the data, as well as a binary log of any updates since the snapshot. Please note that the MySQL data files (the *.MYD, *.MYI, and *.frm) are file system dependent, so you can't transfer files from Solaris to Linux. If you are in a heterogeneous server environment you'll have to use mysqldump or other custom script to get your data snapshot.

Step Two: Configure The Slave.

Go ahead and stop the MySQL server on the slave, and move the database directories you copied above to the data directory on the slave server. Be sure to change the ownership and group of the directories recursively to the MySQL user, and change the file mode to 660 (read-write for owner and group only) on the files, and the directories themselves to 770 (read-write-execute for owner and group only).
Now go ahead and start the MySQL server on the slave to ensure everything is working fine. Run a few select queries (no updates or inserts) to make sure the data snapshot you took in step one was successful. Go ahead and shutdown the server after successful testing.
The slave needs to be configured to look to a specific master to receive its updates, so we need to edit the 'my.cnf' file on the slave, adding the following lines to the[mysqld]portion.
master-host=10.1.1.1
master-user=replicate
master-password=password
After starting the slave server, it will automatically look the master specified in the 'my.cnf' file for any updates and incorporate those changes into its databases. The slave server keeps track of what updates it has received from its master in the 'master.info' file. The status of the slave thread can be seen through the sql command 'SHOW SLAVE STATUS'. Any errors in processing the binary logs on the slave will cause the slave thread to exit, and generate a message in the *.err log. The errors can then be corrected, and the sql statement 'SLAVE START' can be used to restart the slave thread, where it will pick up where it left off in the binary log of the master.
By now the changes made to the data on the master should have replicated to the slave, and you can test this by inserting or updating a record on the master, and then selecting it on the slave.
Now we have a master -> slave relationship from A -> B, which would allow us to redirect all of our queries to B if A should be down, but we have no way of getting any updates to the databases back to A when it is brought back up. To solve that problem, we create a master -> slave relationship from B -> A.

Step Three: Create A Mutual Master-Slave Relationship

First go ahead and add 'log-bin' to the [mysqld] portion the my.cnf file on B and restart mysqld, then create the account for the replication user on it with:
GRANT FILE ON *.* TO replicate@10.1.1.1 IDENTIFIED BY 'password';
Go ahead and run a 'FLUSH PRIVILEGES' on B to load the new grant tables after adding the replication user, and go back to server A, and add the following to it's 'my.cnf' file:
master-host=10.1.1.2
master-user=replicate
master-password=password
After restarting server A, we now have a mutual master-slave relationship between A and B. A record that is updated or inserted on either server will be replicated to the other. A word of caution: I'm not sure just how fast a slave incorporates bin-log changes, so it might not be good idea to load-balance your insert or update statements to codevent any corruption of your data.
Step Four: Alter Your Database Connection Routine
Now that you have established a mutual relationship between servers A and B, you need to alter your database connection routines to take advantage of this. The function below first attempts to connect to server A, and then to server B if no connection can be made.

<?php

/********************************************************
function db_connect()

returns a link identifier on success, or false on error
********************************************************/
function db_connect(){
$username "replUser";
$password "password";
$primary "10.1.1.1";
$backup "10.1.1.2";

# attempt connection to primary
if(!$link_id = @mysql_connect($primary$username$password))
# attempt connection to secondary
$link_id = @mysql_connect($secondary$username$password)
return 
$link_id;
}

?>
I tested my setup using the above technique under two conditions, with the primary MySQL server shutdown, but otherwise the server was running, and with the primary server shutdown. Connections were made to the backup immediately if just mysqld was shut down on the primary, but if the entire server was shutdown, there was an indefinite hang (I lost track after two minutes - short attention span) while PHP was looking for a non-existent server. Unfortunately, the mysql_connect function does not have any time-out parameters like fsockopen, however we can use fsockopen to fake a timeout for us.

Step Five: An Improved Database Connection Routine


<?php

/********************************************************
function db_connect_plus()

returns a link identifier on success, or false on error
********************************************************/
function db_connect_plus(){
    
$username "username";
    
$password "password";
    
$primary "10.1.1.1";
    
$backup "10.1.1.2";
    
$timeout 15;  // timeout in seconds

    
if($fp fsockopen($primary3306, &$errno, &$errstr$timeout)){
        
fclose($fp);
        return 
$link mysql_connect($primary$username$password);
    }
    if(
$fp fsockopen($secondary3306, &$errno, &$errstr$timeout)){
        
fclose($fp);
          return 
$link mysql_connect($secondary$username$password);
    }

    return 
0;
}

?>
This new and improved function gives us an adjustable timeout feature that the mysql_connect function lacks. If the connection fails right away, such as if the machine is up, but mysqld is down, the function immediately moves to the secondary server. The function above is quite robust, testing to see if the server is listening on the port before attempting a connection, letting your scripts time out in an acceptable period leaving you to handle the error condition appropriately. Be sure to alter the port number if you've changed it from the default port of 3306.
Conclusions and Pointers
First of all, be sure to get a good data snapshot. Forgetting to copy a table or database will cause the slave thread to stop. Timing of the snapashot is critical. You want to be sure that binary logging is not enabled before you copy the data files, and that it is immediately after you do. If you were to enable binary logging before getting the snapshot, the slave thread could stop when it tried importing importing records due to duplicate primary keys. This is best accomplished by the procedure discussed in section two: shutdown-copy-restart with binary logging. Without those precautions.
You may want to set up replication one-way initially, and keep an eye on the slave server for good period of time to ensure that it is keeping in step with the master.
I haven't tested out a system for load balancing with the replication feature, but I would be leery of using such a system for balancing inserts and updates. For excample, what if two records were given the same auto_increment number on two servers, would the slave thread on each stop? Questions like those would keep any load balancing scenario as 'read-only', where one server handled all of the inserts and updates, while a team of slaves (yes, you can have multiple slaves off of a master) handled all of the selects.
I'm very happy that MySQL has taken on some sort of replication system, and that it's so simple to configure. That should get you started in provding an extra measure of security against events beyond your control. I have only covered the features of replication that I have tested and used, but there are a few more which are detailed in section 11 of MySQL's on-line documentation.
~Michael