Wednesday, January 21, 2009

Implementing Sharding in the Database

Over the past few weeks (years really) there has been some discussion on sharding. Instead of discussing when sharding is required, as there are good discussions on this already, I want to discuss how I would like to have sharding implemented in the database.

I want the database to handle sharding automatically, and where it can't be automatic, I want the database to help as much as it can.  Just like I want my business logic in a language ideally suited to it, and not stored procs (generally, there are always exceptions); I want all my physical persistence to be handled by the place that already does most of it, the database.  Having the database handle some of the physical persistence and the object relational layer handle the sharding logic isn’t ideal to me, and not just because the current object relational layers don’t have all the sharding functionality I want.  So here is what I want my database to do.  

1) I want to automatically route simple queries to the one shard that has the relevant data.  Hibernate shards already does this; I just want it implemented in the database.  The MPP (massively parallel processing) version of DB2 can also do this.

2) I need the ability to reshard data when a rebalancing of the shards is required, without downtime.  This is similar in concept to an index rebuild being done while the database is still up, which the expensive databases have implemented.  This is more complex than an index rebuild as multiple tables will need to be resharded before the resharding is complete. DB2 has functionality to reshard, but it requires down time last time I checked (which was a few years ago) .

3) Finally, and this is a superset of #1, I want to be able to run one SQL statement across multiple shards at the same time.  The database will send the SQL to all the relevant shards, run as much of the SQL on that shard as possible, using something like the function shipping that DB2 MPP has, and then aggregate the data from the various shards into one result set.  All I need to do this is execute one SQL statement.  Stated differently, all the shards will look like one database.  This won’t be easy to implement, and will probably be limited in functionality at first, but this long term goal is why implementing sharding in the database makes sense. 

This might sound a bit like an advertisement for the MPP version of DB2, but if it were open source I would at strongly consider implementing it for applications that need database sharding. Given that that is not likely to happen, I hope a reliable open source database, like MySQL, or Drizzle, or PostgreSQL, starts implementing this reasonably soon.

This sharding functionality, for the types of problems I’m seeing, is more important than any other new MySQL functionality, including better scaling on multi-core systems. With all above functionality, I can reasonably easily have multiple shards on one machine. Not that I don’t want much better multi-core CPU scalability, I do, as more shards are still harder to manage than less shards, it just isn’t as important as better sharding functionality.  I absolutely need the ability to shard; I don't absolutely need the ability to effectively use every CPU on a server with one shard. 

On another note, I’ll be giving a presentation on table partitioning at the MySQL conference.  This won’t be a basic presentation on how to implement partitioning, instead I’ll be highlighting scenarios where partitioning will degrade performance and cases where it improves performance.  I’ll show examples where, in spite of degraded performance for some tasks, it still might make sense to implement it.  An analogy here would be indexes, where adding an index can speed up some selects at the expense of slowing down inserts and some updates.

6 comments:

Antony said...

I keep meaning to post/publish a small patch which would enable the use of FederatedX with Partitioning. This would allow someone to configure a MySQL server to interrogate multiple MySQL shards when executing queries.

krow said...

So right now in the new Drizzle protocol we have a concept of a sharding key to support this sort of thing. If you want to hook up at the MySQL Users Conference I can show it to you.

Cheers,
-Brian

Frank Flynn said...

Check out Spockproxy a sharding / connection pooling only version of MySQL proxy. It does 1 & 3 easily (some limitations) and 2 is more complicated, it will do whatever you want but there is some time where some records are on two or no shards while they are moving.

I will be giving a talk on it at the conference.

Kristian Nielsen said...

Do you have any thoughts on the relation between sharding and using a distributed shared-nothing storage engine?

I have been thinking a lot about the possibility of having a general-purpose shared-nothing storage engine for MySQL. There is NDB, but it is much more geared towards real-time/non-SQL usage, not really general purpose (no MVCC, disk-indexes, self-tuning ability, ...).

With a shared-nothing storage engine, your tables are partitioned on your sharding key, with different partitions on different servers. MySQL (because of NDB) already have some very good optimisations to handle the distribution, though especially cross-shard updates are still challenging.

Any thoughts?

dbscience said...

I would be very interested in learning more about the Drizzle sharding approach. I have a feeling others would be interested as well.

As for Spockproxy, it looks promising. My bias is for something to be implemented in the database, however. Ideally, I would want to have the same set based logic (code) being applied against data across multiple shards as is running in a shard. It seems to me that would be easier to develop, maintain, and for me, understand, as the feature set would be the same. But I will use anything that works as nothing is going to be perfect at first.

I haven't thought about writing a shared nothing storage engine. It appears that Drizzle's approach of extending an existing engine like InnoDB makes sense, but I really don't know enough about this to have a strong opinion.

lsmith said...

Right, NDB already does #3 and I guess sort of #2. But like you point out it has a different goal. While NDB arranges data in the cluster without taking semantics into account at all, in sharding we want control over what data ends up on how many servers. And we want the server to know about this, so the server can decide when a join can be executed locally entirely and when it needs to hit another server.