
MySQL 5.1 Released

Quietly, so quietly that Gentoo still doesn't include it, MySQL 5.1 has been released.

In some ways, I think MySQL just doesn't care anymore. 5.1 took forever to come around and is far from perfect. I really need partitioning, which is why I've been watching it. I was intrigued to read this post from Monty (effectively the 'M' of MySQL) explaining what went wrong and what is wrong with 5.1:

http://monty-says.blogspot.com/2008/11/oops-we-did-it-again-mysql-51-rel...

I still like MySQL; it's still my preferred database. There's nothing like getting intercontinental master/master replication running. But I have had Oracle running here for a long time (1.5GB of bloody RAM, for one table, to record a 20 byte row every few minutes - argh!) and I feel it really is time I learned how to use Postgres. I've started to remove my dependency on phpMyAdmin (my new job is in Python, and PHP simply doesn't get installed on any servers at all), which means I'm far more comfortable using the command line tools, as I should be really. This should help me make more use of tools like Postgres, which don't have anything of the calibre of phpMyAdmin (laugh if you must, but it is productivity that matters to me and PMA is productive in the extreme).


PHP, MySQL Optimisation

One of my current projects is the scaling up and out of a fairly large website. It currently sits on a dedicated server: quite a large machine, but now three years old, and we're starting to hit some limits and see slowdown at peak times. The site has already outgrown one server, scaling up to the current box.

The site sees steady growth and is doubling every six months or so. A target has been set and so I must build a system to meet it. The new system is currently in the testing phase so I expect some changes to what I'll write here as we go on.

Some background before I begin detailing some of the experiences I've had working on the site recently:
The site was originally written in 2003 by an unknown third party as a PHP4 site - register_globals and ext/mysql abound. Since then it's been in constant development, with me taking over in 2004, but the essential framework has not been significantly altered away from typical PHP4 ways. In the last year or so I've started to introduce some classes into the system in an attempt to compartmentalise functionality, which is otherwise so often spread over a dozen files. The size of the code base has increased tenfold since 2004; while I have made some bad decisions in some of the things I've implemented, the majority of the site is quite well written, given that it's procedural PHP4, and I thank the original author for the clean design.
The hosting platform is the highly capable FreeBSD. MySQL is version 4.1. The current PHP platform is 5.2.

The site itself is one code base with two front ends: the customer side and the administration side. The admin side is very complex and contains a mass of functionality to run a multinational business. The customer side is much simpler but data intensive (text and images) and very much the focus of traffic. It also customises itself based on the URL - the software serves many hundreds of sites, each one potentially run by a physical office which could be anywhere in the world (we're on three continents now, I believe).

Okay, so how do we go about scaling up what is essentially a PHP4 site running on FreeBSD?

The first job is to get some new servers in and separate out some of the tasks.
1. A new database server: masses of RAM and very fast disks, as those get the priority. We've got the disks in RAID5, which perhaps isn't ideal but should see us good for a few years to come.
2. Repurpose old server as a database slave.
3. An application server with as much CPU and RAM as funds can bear - running only PHP.
4. Another application server identical to the first.

The above scheme gives us two ways to scale horizontally: more application servers and more database slaves. Our limit then is database writes: once we start to stress that, I would look seriously at moving to Oracle, but MySQL continues to improve and we'll see what happens over the next three to five years.

The structure of the site is developing like this:

  • db1: hosting MySQL 5.1rc, PHP files, image files and Apache configurations. The PHP, the images and the Apache configuration are exported over NFS.
  • db2: not implemented yet.
  • app1: running Apache 2.2 and PHP 5.2.6. /home and /usr/local/etc/apache/Includes are both imported from db1. In this way each box has its own httpd.conf but gets the global site configuration from a central point.
  • app2: a clone of app1, apart from IPs and hostname.

Every service listed above (files, database, ..) is allocated a host name in DNS and is treated separately in software, so that if necessary we can break services out from a single server onto new dedicated boxes.

In terms of centralised server management, I have written a small Perl daemon (actually a reuse of something I wrote for my home network years and years ago). It consists of a program running on each server which knows certain commands and can perform them on the local machine, and a control program, available on each node, which can connect to every appropriate daemon and execute those functions.
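The real daemon is Perl, but the pattern is simple enough to sketch. The following is only an illustration of the idea, written in PHP like the rest of the examples here; the command set, paths and port are invented, not the real code. Each node runs a listener with a whitelist of named commands; the control program connects, sends a command name and reads back the output.

    <?php
    // Illustrative per-node agent: a whitelist of named commands, a TCP
    // listener, and whatever the command prints is returned to the controller.
    // Command names, paths and the port are placeholders for the example.
    $commands = array(
        'apache_reload' => '/usr/local/etc/rc.d/apache22 reload',
        'disk_usage'    => 'df -h',
    );

    $server = stream_socket_server('tcp://0.0.0.0:9999', $errno, $errstr);
    if (!$server) {
        die("listen failed: $errstr\n");
    }

    while ($conn = stream_socket_accept($server, -1)) {
        $name = trim(fgets($conn));
        if (isset($commands[$name])) {
            fwrite($conn, shell_exec($commands[$name]));
        } else {
            fwrite($conn, "unknown command\n");
        }
        fclose($conn);
    }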

I have my doubts about hosting the PHP over NFS, even on gigabit - a caching NFS client would make me feel happier. If I detect significant performance problems I will have the entire file structure replicated to each box using rsync.

With this being a single-purpose cluster, it also makes sense to centralise login and authentication, so I am using NIS from the central DB server. My only complaint is that NIS under FreeBSD is nowhere near as transparent as under Linux: I must manually copy the master.passwd file, trim out the system accounts and then remake the yp files. Under Linux (Mandriva and Gentoo at least) this is more automatic.

That's it for the hardware and system configuration of the cluster - the rest is in the load balancers and software.

The load balancers (there are two, in a failover cluster) are also the firewalls and intrusion detection boxes for this part of the ISP; the machines are flexible enough that we can centralise much of our network management here. They are two Gentoo Linux machines running iptables, tc, ipvs, snort and a number of other services. Each machine is identical, and configuration is managed by propagating changes from any one machine to all the others (today there are only two machines, but we can scale even to active/active for any number of boxes); otherwise the machines are completely independent. A custom Heartbeat resource was written and is capable of failing over within 35 seconds of a problem being detected, with the ARP problem solved by migrating MAC addresses around the cluster. Even all the cabling is in pairs for redundancy. I'll write more on this system later, perhaps, but suffice it to say that incoming traffic is shared between application servers in a controlled way, and we can detect node failures at any point and work around them automatically within seconds.

That's the hardware pretty much out of the way. Onto the software changes.

Over the last few days I've made a raft of changes to improve performance and prepare for clustering. As part of this I've also spent time optimising to withstand DDoS attacks, as we've recently suffered from spambots in geographically disparate botnets hitting forms at a rate of hundreds of requests per second.

Naturally, I would like to rewrite the site starting pretty much from scratch (programmers are like architects: once something has been built they want to do it again, but bigger and better). To do that would take a very long time and would probably kill the site dead in terms of technological lead; rewrites are very often the worst thing you can do.

So, what have I started doing?

  • Sessions: standard PHP sessions don't cut it in a cluster if you send users to servers round-robin. While it's better to keep users on one server where you can, there's no real gain unless your machines are untaxed or you can measure server load by current connections (which isn't appropriate for short-lived HTTP hits); a true cluster needs sessions shared among the application nodes. So a PHP session handler class was written and implemented, which was mostly transparent to the site code (a rough sketch of such a handler appears after this list). I have given the session class its own connection to the database - we'll see if that's a good idea or not soon enough...
  • Modernisation of database control. I moved the entire site from ext/mysql to ext/mysqli for performance, transactions and a class interface. I added a database class as a subclass of mysqli to centralise all SQL state and functionality, and wrote wrappers for all the ext/mysql functions in use to avoid large rewrites. Why was this important? Centralising SQL and not calling mysql_*() functions directly will allow me to add features to handle a replicated MySQL database - I haven't added this yet, though - and being in a class provides a way to add functionality more easily later on.

    In essence, all mysql_*() calls have been removed from the code and replaced with sql_*() functions. These work in exactly the same way as the old mysql_*() functions; behind the scenes, they call methods on $db.

    The wrapper class (instantiated as $db) contains all the same functions again as methods. That is:

    $result = sql_query_value( "SELECT a FROM .." );

    can be written as:

    $db->query_value( "SELECT a FROM .." );

    This is exactly the same way that ext/mysqli itself provides both procedural and object-oriented interfaces.

    This also means we can implement the same class interface again to do other things with SQL without rewriting any calling code, which brings me on to:

  • memcached: this is marvellous, it really is. The increased memory on each application node makes it a realistic option now. memcached holds data in memory for a set period of time. Example: on the front page we make more than a hundred SQL calls, the majority of which return data that changes on a scale of days, some over months and some immediately. Why tax the database by requerying this data a hundred times a second? Okay, so the possibilities for performance gains are obvious. What about the implementation? Can we do it without a big rewrite? Yes.

    We write a class to manage memcached, extending the memcached PECL extension class. Importantly, this class implements the same interface as our new database class: we call it with normal SQL and, if the result is cached, we get that back; if it is not cached, it goes off and queries the database for us (a rough sketch of this arrangement appears after this list).

    The database class was implemented like this: mysql_query_value() became sql_query_value(), which is a wrapper for $db->query_value().
    The memcached class works in just the same way: $memcached->query_value().
    (query_value is a helper function, written in PHP and in use since day one - much of the site is written like this and it makes development rapid.)

    As you can see, as long as you can identify time consuming or rapidly hit calls, you can cache them with ease.

    An additional feature we can then provide is a polymorphic constructor. Should memcached fail, we can return an instance of the mysqli wrapper to the caller. Since they implement the same interface, the same calls work and the site continues to run - just perhaps somewhat slower.

    All this is elementary to Java developers but it's quite exciting to see it working in PHP, particularly to drive such a PHP4 site!
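For the sessions bullet above, here is a minimal sketch of a database-backed handler. The sessions table (id, data, expires), the credentials and the exact SQL are invented for illustration; the real class will differ in its details.

    <?php
    // Sketch of a DB-backed session handler (PHP 5.2 style callbacks).
    // Assumed table: sessions (id CHAR(32) PRIMARY KEY, data TEXT, expires INT).
    class DbSessionHandler
    {
        private $db;

        public function __construct(mysqli $db) { $this->db = $db; }

        public function open($save_path, $name) { return true; }
        public function close()                 { return true; }

        public function read($id)
        {
            $id  = $this->db->real_escape_string($id);
            $res = $this->db->query("SELECT data FROM sessions
                                     WHERE id = '$id' AND expires > UNIX_TIMESTAMP()");
            if ($res && ($row = $res->fetch_row())) {
                return $row[0];
            }
            return '';   // PHP expects an empty string for a missing session
        }

        public function write($id, $data)
        {
            $id      = $this->db->real_escape_string($id);
            $data    = $this->db->real_escape_string($data);
            $expires = time() + (int) ini_get('session.gc_maxlifetime');
            return (bool) $this->db->query(
                "REPLACE INTO sessions (id, data, expires)
                 VALUES ('$id', '$data', $expires)");
        }

        public function destroy($id)
        {
            $id = $this->db->real_escape_string($id);
            return (bool) $this->db->query("DELETE FROM sessions WHERE id = '$id'");
        }

        public function gc($maxlifetime)
        {
            return (bool) $this->db->query(
                "DELETE FROM sessions WHERE expires < UNIX_TIMESTAMP()");
        }
    }

    // The handler gets its own mysqli connection, as described above
    // (host and credentials are placeholders).
    $sessionDb = new mysqli('db1.example.com', 'sessions', 'secret', 'site');
    $handler   = new DbSessionHandler($sessionDb);
    session_set_save_handler(
        array($handler, 'open'),    array($handler, 'close'),
        array($handler, 'read'),    array($handler, 'write'),
        array($handler, 'destroy'), array($handler, 'gc')
    );
    register_shutdown_function('session_write_close');   // flush before objects are destroyed
    session_start();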
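And for the shared database/memcached interface described above, a rough sketch of the arrangement. The class names, the key scheme (a hash of the SQL), the TTL and the assumption that query_value() returns the first column of the first row are all mine; the production classes differ, and they may extend a different memcached extension class than the Memcached one used here.

    <?php
    // Sketch only: a mysqli subclass and a memcached front-end answering the
    // same calls, plus the procedural sql_*() wrapper and the fallback factory.
    class SiteDb extends mysqli
    {
        // Return a single value: first column of the first row (assumed semantics).
        public function query_value($sql)
        {
            $res = $this->query($sql);
            if ($res && ($row = $res->fetch_row())) {
                return $row[0];
            }
            return null;
        }
    }

    class SiteCache extends Memcached
    {
        private $db;

        public function __construct(SiteDb $db)
        {
            parent::__construct();
            $this->db = $db;
            $this->addServer('app1.example.com', 11211);   // assumed cache host
        }

        // Same interface as SiteDb: answer from the cache, fall through to MySQL.
        public function query_value($sql, $ttl = 3600)
        {
            $key   = 'qv_' . md5($sql);
            $value = $this->get($key);
            if ($value === false) {    // miss (or a cached false - good enough for a sketch)
                $value = $this->db->query_value($sql);
                $this->set($key, $value, $ttl);
            }
            return $value;
        }
    }

    // Procedural wrapper, so existing sql_query_value() call sites need no changes.
    function sql_query_value($sql)
    {
        global $db;                  // $db may be a SiteDb or a SiteCache instance
        return $db->query_value($sql);
    }

    // The "polymorphic constructor": hand back the plain DB wrapper if memcached is down.
    function db_frontend(SiteDb $db)
    {
        $cache = new SiteCache($db);
        if (!$cache->set('probe', 1, 10)) {          // crude availability check
            return $db;
        }
        return $cache;
    }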

The memcached subclass and the mysqli wrapper class are implemented as singletons, so they can be instantiated anywhere and will reuse existing connections already made during the page. Thus we begin to eliminate global variables.
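A sketch of that singleton-style access, reusing the illustrative SiteDb and SiteCache classes from above (the accessor names and credentials are, again, invented):

    <?php
    // Each accessor keeps one shared instance, so any include can reuse the
    // connection already opened for the page instead of reaching for a global.
    class Db
    {
        private static $instance;

        public static function get()
        {
            if (self::$instance === null) {
                self::$instance = new SiteDb('db1.example.com', 'site', 'secret', 'site');
            }
            return self::$instance;
        }
    }

    class Cache
    {
        private static $instance;

        public static function get()
        {
            if (self::$instance === null) {
                self::$instance = new SiteCache(Db::get());
            }
            return self::$instance;
        }
    }

    // Anywhere in the code base:
    $name = Cache::get()->query_value("SELECT name FROM offices WHERE id = 1");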

Would you like some figures? Some of the more complex pages within the site now load in a third of the time, but they get perhaps half the number of hits that the main home page gets. So let's look at that:

Without caching:

DB Queries: 156
Execution Time: 0.1264 seconds

With caching:

DB Queries: 15
Execution Time: 0.0530 seconds

i.e. the page is more than twice as fast to load and the database server is doing almost nothing (relatively speaking).
This is on top of the performance increases from physically separating the MySQL and PHP servers, and of course the speed increases of the new servers themselves.

Interestingly, the partial modernisation of the code framework into classes gives us the flexibility to actually collect these sorts of statistics; previously I did not have these numbers. A fair bit of data can be collected now, much more than I reveal above.
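Collecting those numbers is now a few lines in the wrapper, since every query passes through one method. A sketch, again on top of the assumed SiteDb class from earlier:

    <?php
    // Count and time every query so each page can report its own figures.
    class InstrumentedDb extends SiteDb
    {
        public $query_count = 0;
        public $query_time  = 0.0;

        public function query($sql, $result_mode = MYSQLI_STORE_RESULT)
        {
            $start = microtime(true);
            $res   = parent::query($sql, $result_mode);
            $this->query_time += microtime(true) - $start;
            $this->query_count++;
            return $res;
        }
    }

    // At the foot of the page template, for example:
    // printf("DB Queries: %d, Execution Time: %.4f", $db->query_count, $db->query_time);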

Whilst doing all this I also made some more basic optimisations. In some cases I had left unoptimised code in place (I swear, it must have been for testing!), such as a couple of SELECT * queries used to fetch a single value from a row, with each row being around 64KB!
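For illustration (table and column names invented, sql_query_value() being the wrapper described earlier), the kind of change involved:

    <?php
    // Before: fetch the whole ~64KB row just to read one field.
    $res   = $db->query("SELECT * FROM pages WHERE id = $id");
    $row   = $res->fetch_assoc();
    $intro = $row['intro'];

    // After: ask the database for exactly the column we need.
    $intro = sql_query_value("SELECT intro FROM pages WHERE id = $id");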

I also looked at the cacheability of images and pages. Work had been done years ago to ensure processing effort was cached on the server, but much less had been done on the more basic aspect of caching on the user/proxy side.

The best example here was the main image unit, which appears on every page of the site. It is one of many possible photos, automatically scaled and processed on the server and presented to the user. The photos simply do not change over time; often, once set, they stay set for years.

These images were marked as no-cache, due to no-cache headers being sent by the framework for every page. Although we were not reprocessing on each view, we did lose bandwidth to the hit fetching the data from the file system. That was quickly solved, and then the next problem hit: the URL for the image contains a '?', which essentially kills caching dead. The solution was already implemented elsewhere on the site for the PHP pages: use mod_rewrite to provide a clean URL for the client to fetch the image.
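The header side of that change looks something like the sketch below. The clean URL scheme (say, /photos/1234-small.jpg rewritten by Apache to an image script), the file paths and the 30-day lifetime are assumptions for the example, not the site's actual values.

    <?php
    // Serve a processed photo with cache-friendly headers instead of the
    // framework's blanket no-cache. Paths and parameters are placeholders.
    $id   = (int) $_GET['id'];
    $size = preg_replace('/[^a-z]/', '', $_GET['size']);   // crude sanitising for the example
    $file = "/var/site/photos/processed/$id-$size.jpg";

    if (!is_file($file)) {
        header('HTTP/1.1 404 Not Found');
        exit;
    }

    $mtime = filemtime($file);

    // Long expiry: once set, these photos effectively never change.
    header('Cache-Control: public, max-age=2592000');                        // 30 days
    header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 2592000) . ' GMT');
    header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $mtime) . ' GMT');

    // Answer conditional requests without resending the image body.
    if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) &&
        strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) >= $mtime) {
        header('HTTP/1.1 304 Not Modified');
        exit;
    }

    header('Content-Type: image/jpeg');
    header('Content-Length: ' . filesize($file));
    readfile($file);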

The final result is that, with a primed cache, only two calls are made to the server for the home page, down from twenty-six. I may soon be able to reduce this to one.

As an aside, YSlow is reporting the wrong numbers for this and still showing masses of hits for the page; I can't think why, as Firebug explicitly shows what is really fetched.

That's all I can think to write about for now.
I will try to report back as development and testing continue, and on what happens when the cluster goes live - particularly on whether NFS can handle serving up all those /home directories.
