13 November 2005

Added 'rcall' to mycat project

I just finished a couple days' work documenting and adapting the 'rcall' (Remote Call) tool, and added it to the Mycat project on SourceForge. The readme is available here.

'rcall' is designed to run in non-homogeneous *nix clusters, easing their use by creating a single place from which to securely run commands across logical groups of servers.

I'll give some examples. The past two companies I have worked for both had separate web (Apache) servers, mailing (MTA) servers, and database (MySQL) servers. Using rcall (its previous incarnation, that is), I was able to easily track down an error that was causing seemingly random entries in the databases to have wrong timestamps: `rcall -on web -C 'date'` printed the current date on each web server, making it easy to see which one's clock was wrong. After making a change to Apache's configuration, it was easy to restart all the web servers without any visible interruption in service: `rcall -on web -delay 15 -C '/usr/local/apache/bin/apachectl restart'`. The delay gives our load balancer time to handle the momentary loss of each individual server, so collectively there is no downtime. Checking disk utilization across all database servers is as easy as `rcall -on data -C 'df -h | grep -P "data|backup"'`.

`rcall` does not directly relate to MySQL, but it has helped me in my work with MySQL Cluster, and with clusters of MySQL servers (using the term loosely there). Of course, the other tool already in the mycat project is `rep_mon`, which IS directly relevant to MySQL.

Over the coming weeks, I will be adding another tool to this project, namely `conftool`, which creates a single location from which to sync configuration files (and back them up, restore from backups, and so on) throughout the same groups/clusters of servers that `rcall` and `rep_mon` operate on. After that, I hope to add my database backup system, but I have yet to find a way to generalize it so that it works on someone else's systems without hassle.
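To give a feel for what I mean, here is a bare-bones sketch of the sync-and-backup idea in Python (the real conftool will be far more complete than this; the group definition, hostnames, and paths below are invented purely for illustration):

import subprocess

# Invented group definition; the real tool reads its groups from mycat.cnf.
GROUPS = {"web": ["web1", "web2", "web3"]}

def push_config(group, local_path, remote_path):
    # Keep a backup of the current remote file before overwriting it,
    # then copy the local master copy into place over ssh/scp.
    for host in GROUPS[group]:
        subprocess.call(["ssh", host,
                         "cp %s %s.bak" % (remote_path, remote_path)])
        subprocess.call(["scp", local_path, "%s:%s" % (host, remote_path)])

push_config("web", "httpd.conf", "/usr/local/apache/conf/httpd.conf")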

I've found these tools indispensable in my own work; I hope that others can find at least some use in them!

01 November 2005

Replication Monitor finished!


Finally! Last night, I put the finishing touches on the documentation and uploaded to SourceForge. Well, finishing touches for a first alpha, hehe - still, it's a good step! Anyone with Perl installed should be able to download Mycat-0.1.2.tgz, untar it, edit the config file (mycat.cnf), and run rep_mon.pl to see the status of replication on their servers.

Mycat-0.1.2 release notes

I will add the "rcall" script next, but I can't decide whether its name comes from "Run Command on all" or "Remote Call"... decisions, decisions... Anyhow, rcall relies on ssh key-based remote execution of commands and is not specific to any MySQL installation. It is nonetheless an indispensable tool for me when working with large heterogeneous groups of servers.
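The core mechanism is nothing exotic: stripped to its essentials, it is a loop over hostnames with ssh. A minimal Python sketch of the idea (the real rcall does considerably more; the group map below is invented):

import subprocess, time

# Invented group -> host mapping; rcall reads its groups from a config file.
GROUPS = {"web": ["web1", "web2"], "data": ["db1", "db2"]}

def rcall(group, command, delay=0):
    # Run `command` on every host in `group` over ssh, relying on
    # key-based auth so no password prompt interrupts the loop.
    for host in GROUPS[group]:
        print "=== %s ===" % host
        subprocess.call(["ssh", host, command])
        if delay:
            time.sleep(delay)  # e.g. give the load balancer time to adjust

rcall("web", "date")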

26 October 2005

Replication Monitor preview posted

I posted the source to my replication monitor (rep_mon.pl) over at sourceforge, along with a screen shot of it in action. It doesn't look all that fancy, but when you've got a dozen MySQL servers, it sure helps to have one script that checks them all at once!
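rep_mon.pl itself does the real work (in Perl); purely to illustrate the idea, here is a rough Python equivalent using the MySQLdb module - the server list and credentials are invented:

import MySQLdb
import MySQLdb.cursors

# Invented server list; the real rep_mon reads its servers from mycat.cnf.
SLAVES = ["db1.example.com", "db2.example.com"]

for host in SLAVES:
    conn = MySQLdb.connect(host=host, user="monitor", passwd="secret",
                           cursorclass=MySQLdb.cursors.DictCursor)
    cur = conn.cursor()
    cur.execute("SHOW SLAVE STATUS")
    row = cur.fetchone()
    if row:
        print "%s: IO=%s SQL=%s behind=%s" % (
            host, row["Slave_IO_Running"], row["Slave_SQL_Running"],
            row["Seconds_Behind_Master"])
    else:
        print "%s: not a slave" % host
    conn.close()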

24 October 2005

Open Source

Just this morning, I received permission to make the tools I've developed available via SourceForge, under a GPL license! Wooo! (heh, sorry, couldn't help myself) I've already set up the project:

http://sourceforge.net/projects/mycat/

It will take me a little time to prepare the code, so check back soon!

26 August 2005

Python Challenge

Challenge: reverse every word in a text file in place (preserving line breaks and word order)

Answer:

# Reversing the whole line reverses both the letters and the word order;
# reversing the split list back restores the word order, leaving each
# individual word reversed.
for line in open("file_name").read().split('\n'):
    print ' '.join(line[::-1].split()[::-1])

23 August 2005

Distributed database, a few tables at a time

We've got six dedicated web servers reading from and writing to just one database server. All fine and good - until something goes wrong. Over the past few months, lots of little things have been going wrong: some developer runs a "bad" query which locks the tables needed by the web servers; a new ad campaign generates much more traffic than any previous one, and the database is not up to handling that many requests all at once. Of course, I would love it if the web developers never accessed the database without first showing me their SQL statements, and if we could afford the server downtime necessary to completely restructure the tables (convert them to InnoDB and add much better indexes). However, I came up with another means to alleviate some (not all) of the stress on the database...

Distribute the most critical tables to a tiny mysqld process on each web server, so that the Apache/PHP processes no longer depend on the central database server to generate those pages.

This diagram shows the path of replication within MySQL. The web servers draw their data from either their local mysqld process or from the master, depending on the table; all writes are directed to the master.
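On each web server, the local mysqld can be told to replicate only the critical tables by using MySQL's replication filters. A sketch of the relevant my.cnf lines - the database and table names here are placeholders, not the real schema:

# my.cnf for the small mysqld on each web server (a slave of the master)
[mysqld]
server-id = 101                           # must be unique per server
replicate-do-table = mydb.critical_pages  # replicate only the hot tables
replicate-do-table = mydb.sessions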

20 August 2005

Previously, MySQL Cluster

About a year ago, I worked for a few months prototyping a MySQL Cluster system to try to replace an Oracle RAC database containing ~90 million user account records and handling ~4,000 transactions per second. Even though the cluster's speed and performance actually exceeded what our Oracle servers were doing, I could not come up with a way to fit 90 million records into main memory in a cost-efficient way.

Here's a diagram of the cluster towards the end of my time testing it. (Edit: Original picture stated that RRDNS was used between the application and cluster layers; a load balancer would be more appropriate.)

Every computer in my test cluster was a single-processor 3.4GHz Intel Xeon with 2GB of RAM, running Mandrake 10.1 and a custom-compiled MySQL 4.1.7. I used the super-smack utility, running on the API nodes, to generate test data and bombard the Cluster with reads and writes.

With the servers in the configuration I've shown here, I was able to get about 2,000 inserts per second per API node on an indexed varchar field; this rate scaled linearly up to three API nodes. I no longer have the precise results of the benchmarks (and the Cluster has been updated a lot since then anyway). Regardless, the speed was not the limiting factor for me; it was the capacity. NDB is an "in memory" storage engine, and there was no cost-effective way to store ~60GB of data in main memory - with 2GB of RAM per machine, and NDB typically keeping two replicas of every row, that data alone would have demanded dozens of data nodes. I'm told that in version 5.1 this limitation will be removed from NDB, allowing infrequently accessed data to be left on disk. That will make the Cluster even more applicable to VLDBs.

Initializing...

Well, I've been working with MySQL on Linux for a number of years now, and lately I've been trying to help folks on the mailing list as much as possible (both to be a service and to learn more myself). I figure it's time I started blogging what it is that I do, and keeping track of the different bumps I encounter. Maybe it'll help someone else.

If nothing else, it'll give me a good laugh to read this a few years down the road :)


Cheers!
Devananda vdv