About a year ago, I spent a few months prototyping a MySQL Cluster system to try to replace an Oracle RAC database containing ~90 million user account records and handling ~4,000 transactions per second. Even though the cluster's speed and performance actually exceeded what our Oracle servers were delivering, I could not come up with a cost-efficient way to fit 90 million records into main memory.
Here's a diagram of the cluster towards the end of my time testing it. (Edit: The original picture stated that RRDNS was used between the application and cluster layers; a load balancer would be more appropriate.)
Every computer in my test cluster was a single-processor 3.4 GHz Intel Xeon machine with 2 GB of RAM, running Mandrake 10.1 and a custom-compiled MySQL 4.1.7. I used the super-smack utility, running on the API nodes, to generate test data and bombard the cluster with reads and writes.
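For reference, a cluster laid out like the one in the diagram would be described in the management node's config.ini along these lines. This is a sketch from memory rather than my original file; the hostnames and memory figures are illustrative placeholders, not the values I actually used.

```ini
# config.ini read by the management node (ndb_mgmd).
# Hostnames and memory sizes are placeholders for illustration.

[ndbd default]
NoOfReplicas=2          # each table fragment is stored on two data nodes
DataMemory=1600M        # most of each 2 GB node goes to row data
IndexMemory=200M        # hash indexes also live in RAM

[ndb_mgmd]
hostname=mgm.example.com     # management node

[ndbd]
hostname=ndb1.example.com    # data (storage) nodes running ndbd
[ndbd]
hostname=ndb2.example.com

[mysqld]
hostname=api1.example.com    # API nodes: mysqld + super-smack
[mysqld]
hostname=api2.example.com
[mysqld]
hostname=api3.example.com
```

With `NoOfReplicas=2`, every row is held in memory twice across the data nodes, which matters for the capacity arithmetic below.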
With the servers in the configuration shown here, I was able to get about 2,000 inserts per second per API node on an indexed varchar field, and this rate scaled linearly up to three API nodes. I no longer have the precise benchmark results (and the Cluster has been updated a lot since then anyway). Regardless, speed was not the limiting factor for me; capacity was. NDB is an "in memory" storage engine, and there was no cost-effective way to store ~60 GB of data in main memory. I'm told that in version 5.1 this limitation will be removed from NDB, allowing infrequently accessed data to be left on disk. That will make the Cluster even more applicable to VLDBs.
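To make the capacity problem concrete, here is the back-of-the-envelope arithmetic. The average row size is an assumption I've picked to match the ~60 GB figure above; the replica count is NDB's usual default of two.

```python
# Rough sizing for ~90M rows on 2 GB data nodes.
# ROW_BYTES is an assumed average (row data plus overhead),
# chosen to illustrate the ~60 GB figure from the text.

RECORDS = 90_000_000        # ~90 million user account rows
ROW_BYTES = 700             # assumed average bytes per row
REPLICAS = 2                # NDB default NoOfReplicas
RAM_PER_NODE_GB = 2         # each test machine had 2 GB of RAM

data_gb = RECORDS * ROW_BYTES / 1024**3
total_gb = data_gb * REPLICAS

# Minimum number of 2 GB data nodes just to hold the replicated data,
# ignoring indexes, the OS, and buffers -- so this is optimistic.
nodes_needed = -(-total_gb // RAM_PER_NODE_GB)   # ceiling division

print(f"raw data: ~{data_gb:.0f} GB, with replicas: ~{total_gb:.0f} GB")
print(f"2 GB data nodes required: at least {int(nodes_needed)}")
```

Even under these optimistic assumptions, the data set needs dozens of 2 GB machines just to sit in memory, which is why capacity, not throughput, killed the project.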