While at Percona Live, I attended some presentations about MySQL Cluster and spoke with several people about the latest with this product. It had always been my understanding (as one who has not worked with MySQL Cluster in detail) that it was basically a straightforward clustered database. But I was not entirely correct. What I learned is that MySQL Cluster is, in reality, a collection of MySQL front-ends that all provide access to a distributed NDB-based data store.
The Elastic DBMS Blog
Sharding represents a new class of distributed systems!
Having spoken with many people at Percona Live about their experiences with sharding, and attended some presentations describing sharding nightmares, I have come to the considered opinion that sharding represents a new class of distributed systems.
Just as a quick recap, the currently recognized classes of distributed systems are:
Another high profile AWS "exit"
Last week I read this article that talked about the decision by HubSpot to move off Amazon AWS and instead use Rackspace.
Thinking in parallel!
It is one thing to perform computations in a single stream, but in the world of the cloud, virtualization, and massive parallelism, single-stream computation just doesn't cut it. Whether you are using technologies like Hadoop/MapReduce, or developing MPP database software like we do at ParElastic, you have got to think in parallel.
Consider the simple case of computing the arithmetic mean of a set of numbers. The formula for computing the mean is simply stated as the sum of the numbers divided by how many numbers there are.
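A minimal sketch of the parallel version in Python, assuming the input is split into contiguous chunks and handed to a small thread pool. The names `partial_sum_count` and `parallel_mean`, the chunking scheme, and the default worker count are illustrative choices, not taken from the original post. The idea it shows is that each worker returns a (sum, count) pair, and the pairs are combined only at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum_count(chunk):
    """Return the (sum, count) pair for one chunk of numbers."""
    return sum(chunk), len(chunk)


def parallel_mean(values, workers=4):
    """Compute the mean by combining per-chunk (sum, count) pairs.

    Hypothetical helper for illustration; chunking and worker count
    are assumptions, not part of the original text.
    """
    if not values:
        raise ValueError("mean of an empty sequence is undefined")

    # Split the input into roughly equal contiguous chunks.
    size = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]

    # Each worker computes its own partial sum and count independently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_count, chunks))

    # Combine step: add up the partial sums and the partial counts.
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


if __name__ == "__main__":
    print(parallel_mean([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Carrying the count along with each partial sum matters: averaging the per-chunk means directly gives the wrong answer whenever the chunks are of unequal size.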
Boston Amazon Web Services Meetup
Tomorrow I will be presenting at the Boston Amazon Web Services Meetup at 7pm at Microsoft NERD.
You can visit the meetup site here.
Title: Challenges and travails of scaling your database in AWS
Internship and Co-Op opportunities at ParElastic
We've been busily cooking up some cool technologies at ParElastic and we have a number of openings for members of our software development team including internships and co-ops. These have just been posted on our web site at http://www.parelastic.com/careers and we will be updating that with new openings continuously.
Check them out and if you are interested in some of the openings, or know of others who may be interested, please share the link with them.
What would you be doing if you were an intern or Co-Op at ParElastic?
More about cloud variability
In an earlier blog post, I described the annoying fact that there was an enormous variability in database response time in the cloud. Using a very simple benchmark like sysbench, and using just a single thread on a large Amazon instance that would run both MySQL and sysbench, the variability in performance was extraordinary!
Open Database Camp, Boston, 2013
Want to know more about ParElastic? Come visit the Open Database Camp, Boston 2013 (which is part of the North East Linux Fest), and listen to a presentation about ParElastic there!
The ParElastic talk will be about "Elastic Database Virtualization with ParElastic".
The slides that I will be presenting are here.
Annoying things with the cloud!
An extremely annoying thing about working in the cloud (and I love working in the cloud) is that performance testing is much harder because the underlying platform has such highly variable and erratic performance.
Consider this simple example: you are attempting to study the performance of a 'method' (and I use the term in a general sort of way), so you write a test harness that exercises the method a number of times in some very controlled manner, and you observe that the method provides you responses in { t_i }.
Furthermore, you observe that
The death of the anonymous user
As web sites have become more interactive, consumers have increasingly begun to expect personalization and customization of content. No longer is it ok to greet visitors as “Hello visitor”; you have to, instead, greet visitors by their first name.


