Adventures with MySQL
One of our managed server customers came to us with a straightforward-sounding problem. They needed to do a massive data import into their database which was going to take several days to complete. They hadn’t managed to test their full import because they didn’t have a test machine with enough memory, but the procedure looked very solid on smaller datasets. After some discussion we came up with a plan: we’d put together a faster replacement server for them, they could repeatedly try their import until the database was complete and tested, and then we’d move the full site over to the new, faster machine, reclaiming the old one back into our pool.
As this seemed an ideal opportunity to test our backup and restore procedures and to give training to our newest staff member, instead of copying their server over to the new one we restored the new one from backup. Happily, we restored the state of the server fairly easily and did a little troubleshooting of the boot process. A summary of the expected issues in getting the new machine up follows, with rough sketches of the fixes after the list:
- udev remembers your network cards’ MAC addresses, so on the first boot the Ethernet cards come up as eth2/eth3 instead of eth0/eth1, which means the network doesn’t start. Handily we have serial console / recovery mode on our servers, so this is easy to fix.
- /dev can’t be backed up, so you need to manually generate a minimal set of device nodes.
- it’s important to make sure you’ve generated the /proc, /sys, /tmp and /run mount points with the right permissions.
- when installing GRUB in a chroot you need to make sure you’ve done a bind mount of /proc, /sys and /dev, or you’re going to get very confused.
- you can’t back up a MySQL database purely by copying its files, so we take a MySQL dump which we restore from. This means you need to start from a blank MySQL database and do the restore into it. In order to let the Debian MySQL scripts work you need to move /etc/mysql/debian.cnf out of the way, create your blank MySQL database, do the restore, stop the database, and move debian.cnf back in, so that your system-maintainer password matches the database at each stage.
- On our replacement server we’d installed LVM (the original was just a partition), so we needed to install the LVM tools into the chroot to build the initramfs to boot the system.
- We needed to edit /etc/fstab to point the machine at the correct partitions to boot from, because the filesystem UUIDs had changed.
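For the udev problem, the fix over the serial console is simply to throw away the rules that were generated for the old machine’s network cards so udev regenerates them for the new hardware. A minimal sketch, assuming the standard Debian persistent-net rules file:

    # The backup carries the old NICs' MAC addresses in this file, which is
    # why the new cards appear as eth2/eth3. Remove it and let udev
    # regenerate it with eth0/eth1 on the next boot.
    rm /etc/udev/rules.d/70-persistent-net.rules
    reboot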
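The device nodes, mount points and bind mounts for installing GRUB look roughly like this. It’s a sketch, assuming the restored filesystem is mounted at /mnt/target (a path we’ve made up for illustration) and that the boot disk is /dev/sda:

    # Minimal device nodes, since /dev isn't in the backup
    mkdir -p /mnt/target/dev
    mknod -m 600 /mnt/target/dev/console c 5 1
    mknod -m 666 /mnt/target/dev/null    c 1 3
    mknod -m 666 /mnt/target/dev/zero    c 1 5
    mknod -m 666 /mnt/target/dev/random  c 1 8
    mknod -m 666 /mnt/target/dev/urandom c 1 9

    # Mount points with the right permissions (/tmp is world-writable and sticky)
    mkdir -p /mnt/target/proc /mnt/target/sys /mnt/target/run
    mkdir -p -m 1777 /mnt/target/tmp

    # Bind mounts so GRUB inside the chroot can see the real devices
    mount --bind /dev  /mnt/target/dev
    mount --bind /proc /mnt/target/proc
    mount --bind /sys  /mnt/target/sys

    # Install the bootloader from inside the chroot
    chroot /mnt/target grub-install /dev/sda
    chroot /mnt/target update-grub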
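The MySQL dance goes roughly as follows once the restored system is up. Again a sketch, assuming a full dump at /root/all-databases.sql (the file name is ours) and a MySQL of that era, where mysql_install_db creates a blank data directory:

    # Move the restored debian.cnf out of the way while working with the blank database
    mv /etc/mysql/debian.cnf /etc/mysql/debian.cnf.orig

    # Create a blank MySQL data directory and restore the dump into it
    mysql_install_db
    /etc/init.d/mysql start
    mysql < /root/all-databases.sql

    # Stop the database and put the original debian.cnf back, so the
    # system-maintainer password matches the restored grant tables
    /etc/init.d/mysql stop
    mv /etc/mysql/debian.cnf.orig /etc/mysql/debian.cnf
    /etc/init.d/mysql start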
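Finally, because the new machine uses LVM and the filesystem UUIDs have changed, the last two items look something like this (same illustrative /mnt/target path):

    # Inside the chroot: install the LVM userspace tools and rebuild the
    # initramfs so the system can find its root volume at boot
    chroot /mnt/target apt-get install lvm2
    chroot /mnt/target update-initramfs -u -k all

    # Look up the new filesystem UUIDs and update /etc/fstab to match
    blkid
    editor /mnt/target/etc/fstab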
This is all fine and expected. We handed the machine to our customer, they were happy and it all seemed done. Two days later they came back: MySQL on the new server was much slower than on the previous server. They gave us a sample query and wondered why it answered in a fraction of a second on their development and current production servers but took far longer on the new one. We started investigating: the new server was running at 100% CPU with no disk IO and the entire set of database files cached, so we hadn’t done something obviously stupid. We then started looking at the MySQL query planner, which had a different query plan for the same query on the same database on two different machines.
How odd, we thought.
Looking further at the indexes, we discovered that the cardinality of the indexes for the tables in question was different. Further investigation showed that the indexes are built differently if you do INSERT, INSERT, INSERT, as happened when the original database was created, compared to the single multi-row inserts that the restore from backup did.
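You can see the difference directly from the MySQL client by comparing the two machines; the table and query below are placeholders rather than the customer’s schema:

    -- The Cardinality column of SHOW INDEX differed between the original
    -- machine and the restored one for the same table
    SHOW INDEX FROM some_table;

    -- With the statistics off, the planner chooses a different (and much
    -- slower) plan for the same query
    EXPLAIN SELECT * FROM some_table WHERE some_column = 42;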
The solution is to run
optimize table table_name
which finishes building the indexes properly; the query plan changes and the queries then run quickly. We have now added mysqloptimize --all-databases to our recovery procedure, and handily put it into this blog post in the hope of saving someone else the investigation we’ve just done.
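As a recovery-procedure step, that amounts to one extra command after the restore. On Debian, mysqloptimize is the same program as mysqlcheck run in optimize mode; the credentials file shown here is the Debian maintenance one, so adjust for your own setup:

    # Rebuild the indexes and their statistics for every table after
    # restoring from a dump, so the query planner gets sane cardinalities
    mysqloptimize --defaults-file=/etc/mysql/debian.cnf --all-databases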
If your database backup procedure doesn’t include this step and attempting to figure it out sounds hard, you should consider our Managed hosting services. If investigating this sort of thing sounds like a fun way to spend your working days, you should consider applying for a job.