tag:blogger.com,1999:blog-36354476625050020712024-02-08T19:05:44.551+00:00Database Sciencedbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.comBlogger20125tag:blogger.com,1999:blog-3635447662505002071.post-55460883826556173322009-09-01T19:19:00.001+01:002009-09-03T17:52:09.076+01:00Enterprise Software as a Service (SaaS) and Partitioning<p> </p> <p>Partitioning, new with MySQL 5.1, has complicated interactions with queries and indexes.  If one isn’t careful it is easy to degrade performance.  For example, select queries that go with that grain (queries where partition elimination occurs) can be much quicker, but select queries that go against that grain can be much slower.  Queries that go against the grain must query each partition, so for a table with 12 partitions, one query against that table can result in 12 queries, one against each of the partitions.  An example of this would be a query against a month partitioned table that is looking to see how much activity a product had in the past 12 months.  </p> <p>The ideal partitioning scheme would be a system where all queries only need to access data from one partition.  This describes enterprise software deployed as a service where multiple enterprise tenants all exist within one database.  As one enterprise tenant (think of a company like a bank, manufacturing firm, or retailer, not a consumer using Facebook or Twitter) only queries their own data, the enterprise tenantId provides an ideal grain on which to divide up the data.  This means each table that has tenant specific data must have a tenantId.  A Sale table for a multi-tenant database would look like this: </p> <p> </p> <p><font face="Courier New">CREATE TABLE  Sale ( <br /><font color="#ff0000">   tenantId int NOT NULL,</font> <br />   orderId int NOT NULL, <br />   customerId int NOT NULL, <br />   productId int NOT NULL, <br />   unit int NOT NULL, <br />   purchaseAmount decimal(16,2) NOT NULL, <br />   purchaseCost decimal(16,2) NOT NULL, <br />   purchaseDate datetime NOT NULL, <br />   PRIMARY KEY (tenantId, orderId), <br />   KEY idx_sale_product (productId), <br />   KEY idx_sale_customer (customerId), <br />   KEY idx_sale_purchaseDate (purchaseDate) <br />)  <font color="#ff0000">PARTITION BY LIST(tenantId) ( <br />   PARTITION t1 VALUES IN (1) ENGINE=InnoDB, <br />   PARTITION t2 VALUES IN (2) ENGINE=InnoDB, <br />   PARTITION t3 VALUES IN (3) ENGINE=InnoDB, <br />   PARTITION t4 VALUES IN (4) ENGINE=InnoDB, <br />   PARTITION t5 VALUES IN (5) ENGINE=InnoDB)</font></font> </p> <p> </p> <p>If you are using InnoDB, an alternative to partitioning by tenant is to create clustered indexes by tenantId.  Before MySQL had partitioning, this was a good way to implement a multi-tenant database.  If you are curious about this type of solution you can find more here:</p> <p><a title="http://dbscience.blogspot.com/2008_07_01_archive.html" href="http://dbscience.blogspot.com/2008_07_01_archive.html">http://dbscience.blogspot.com/2008_07_01_archive.html</a></p> <p>Both partitioning by tenant and using InnoDB clustered indexes as in the above article are roughly going to perform the same for large data volumes.   </p> <p>The advantage that partitioning provides is on administrative tasks like server splits.  When there are too many tenants on one server and a split needs to occur, the stressed server’s data can be replicated to another server.  After the replication there will be two servers, each with roughly half of the tenants inactive.  
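</p> <p>Once the split is done, cleaning up the tenants that moved is a single statement per table.  A sketch against the Sale table above, assuming tenants 4 and 5 were the ones moved to the new server (DROP PARTITION removes both the partition definition and all of its rows):</p> <blockquote> <p><font face="Courier New">ALTER TABLE Sale DROP PARTITION t4, t5;</font></p> </blockquote>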
<p>Such a drop is far faster than a slow, hardware-consuming mass delete of the now-inactive tenants.  While the server split is still painful, this makes the reallocation of tenants across servers easier and the system is fully available far earlier.  </p> <p>There are also other administrative benefits with partitioning, such as dropping the data for an inactive tenant quickly.   </p> <p>A downside is that you have to keep the partitioning list or range current as new tenants are added.  You will probably want to pre-allocate tenant partitions to avoid having to add partitions at the last moment.  </p> <p>However, be aware of the partitioning limitations, such as only 1024 partitions per table.  This means only 1024 tenant partitions per table, so if you store more than 1024 tenants in one database you will want to combine multiple tenants into one partition.  </p> <p>If you expect to overwhelm a single database server, which is very possible if you are developing enterprise software as a service, as even simple enterprise applications seem to generate terabytes of data these days, you should strongly consider partitioning tables by the tenant.   </p> dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com0tag:blogger.com,1999:blog-3635447662505002071.post-85950787695853068692009-08-05T20:45:00.001+01:002009-08-06T17:39:13.816+01:00MySQL and MS SQL Server<p>Recently, MySQL published an article comparing MySQL and SQL Server at <a title="http://dev.mysql.com/tech-resources/articles/move_from_microsoft_SQL_Server.html" href="http://dev.mysql.com/tech-resources/articles/move_from_microsoft_SQL_Server.html">http://dev.mysql.com/tech-resources/articles/move_from_microsoft_SQL_Server.html</a></p> <p>There is one clarification I would like to make to this article.  The article states that MS SQL Server has row locking, and while this is true, MS SQL Server doesn’t always use row locking and will often resort to page locking.  A page is 8k of data, so, in effect, many rows are locked at the same time even if only one row in that page is being updated.  At high data throughputs, this can lead to serious lock contention and to deadlocks, even though the two processes with the contention or deadlock are updating different rows.   Stated differently, no matter how well you order your multi-row update processes in MS SQL Server you should expect deadlocks.  </p> <p>There is a way to turn off page locks on an index, but an index can’t be reorganized if page locks are off, as index reorganization uses page locks.   Alternatively, one can add row lock hints to each SQL statement, but this becomes tiresome fast.  The general approach to this problem is to keep each transaction small, smaller than you perhaps want, which reduces but doesn’t eliminate the page lock contention and deadlocks.   </p> <p>Easy to use with sensible defaults, MS SQL Server is a great database; but page locking is a relic from the past, when row locks consumed too much of then-scarce memory.  In the 64 bit processor world, memory is too plentiful to worry about how much memory row locks take, and page locks hurt scalability.   As a more recent database design, the MySQL model of only using row locks makes far more sense.  
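</p> <p>For those who want to try the two workarounds mentioned above, they look roughly like this in T-SQL.  This is a sketch only: the table and index names are placeholders, and remember that an index with page locks disabled can no longer be reorganized.</p> <blockquote> <p><font face="Courier New">-- stop the engine from taking page locks on one index (it will use row or table locks instead) <br />ALTER INDEX idx_sale_customer ON Sale SET (ALLOW_PAGE_LOCKS = OFF); <br /><br />-- or hint an individual statement to lock at the row level <br />UPDATE Sale WITH (ROWLOCK) <br />   SET unit = unit + 1 <br /> WHERE orderId = 42;</font></p> </blockquote> <p>Neither option is free, which is why keeping transactions small remains the usual advice.  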
</p> dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com0tag:blogger.com,1999:blog-3635447662505002071.post-31184581804666006932009-05-14T20:04:00.001+01:002009-05-14T22:19:25.840+01:00Partitioning Presentation from MySQL Conference<p>It appears that my presentation isn’t available at the conference site. I’ve added it below.   Sorry for the delay.  </p> <p>Without some context that I talked about at the conference the presentation may not make sense. </p> <p>Partitioning creates a grain in the table.  Select queries that go with that grain can be quicker, at times much faster, but select queries that go against that grain can be slower, at times much slower.  </p> <p>An example is a table that is partitioned by date and has an orderId primary key.  Queries that select data by date will either be almost as fast (partitioning can add a slight overhead) as the non-partitioned table to much faster.   But queries that don’t query by date will have to query all the partitions.  A table partitioned 12 times, one partition for each month, results in 12 index partitions.  As the indexes are partitioned by date, a query by the primary key of orderId means all twelve index partitions must be queried to look for the order in question.  Querying twelve index partitions is  going to be slower than just one unpartitioned table.  </p> <p>Having said that, there are cases where partitioning dramatically improves performance, such as optimizing a table partition by partition.  </p> <p>But now I’m starting to cover too much of what is in the presentation.  So here it is.  </p> <p><a title="http://www.slideshare.net/brooksaix/divide-and-be-conquered" href="http://www.slideshare.net/brooksaix/divide-and-be-conquered">http://www.slideshare.net/brooksaix/divide-and-be-conquered</a></p> dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com0tag:blogger.com,1999:blog-3635447662505002071.post-21949467186149729442009-04-01T18:31:00.001+01:002009-04-01T20:47:49.501+01:00Two New Databases Announced<p> </p> <p>The open source coderati announced a new database today called “Monsoon”.  Norm “Al” Modelle said much effort was spent by the team so it can handle internet sized floods of data.  He expected it to have the licensing of PostgreSQL, the multi-engine support of MySQL, the functionality of Oracle, the ease of use of MS SQL Server, and the massively parallel scalability of DB2.</p> <p>Not to be outdone, Computer Dilettantes (CD) also announced a new database, the oddly named “Mud Puddle”. Eye Samme, the Executive VP in charge of databases we intend to milk for support revenue, said he knows the customer is being squeezed by licensing costs in these hard times.  So CD spent considerable effort designing a real time application that constantly analyzes the customer’s use of the database.  It then notifies the sales force how they can maximize revenue extraction from the customer. He said the database will have the licensing of Oracle, the ease of use of DB2, the single engine support of PostgreSQL, the scalability of MySQL, and the functionality of MS SQL Server.</p> <p>Oh, and please attend MySQL partitioning session if you are thinking about implementing partitioning.  I promise it will have a bit more technical content.  
</p> dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com2tag:blogger.com,1999:blog-3635447662505002071.post-57009198655702062562009-02-25T18:46:00.001+00:002009-02-25T18:46:42.647+00:00Is SQL Slow?<p>Last time I demonstrated a case where stored procedures are slow when they have to do computationally expensive work.  The more interesting, to me at least, question is how slow is SQL?  The answer to that is far more complex.  </p> <p>For example, this simple SQL takes 2.2 seconds to run on MySQL.  This is dramatically faster than the 696 seconds it took a stored procedure to produce similar results.  </p> <blockquote> <p><font face="Courier New">select customerName, count(*) <br />  from Salet s <br />  join customer c on s.customerId = c.customerId <br /> group by customerName</font></p> </blockquote> <p>As demonstrated in the previous article, the equivalent C# code took 1.4 seconds to produce the same results.  Some may find it surprising that it takes less time to ship 1 million rows out of the database and then summarize them in code than it does for MySQL to summarize the same data in the database.  </p> <p>In this simple case the performance difference isn’t much and is not worth the extra code complexity.  For more complex SQL, with more joins, perhaps nested case statements and temp tables, and similar standard SQL techniques, it is often much faster to do the transformation logic in non-SQL code.   I’ve found cases where I was able to improve performance by over ten times by using C# or Java code over SQL.  Still, my inclination is to always see if I can’t get acceptable performance from SQL first, as less code is generally required, and usually far less code.  SQL is an exceptionally expressive language for certain types of problems.  </p> <p>Plus, the performance advantage of C# or Java won’t be true in all cases.  Stating the obvious, shipping out all the sales data won’t make sense for more selective queries that query only a few sales.  In this case it makes far more sense to write in SQL.  </p> <p>Deciding where to execute code, in the database or elsewhere, is a complex problem that would require a series of articles to answer reasonably (a hint: think about using sequential IO as much as possible).  For now I just want to point out that running everything in SQL isn’t always the best performing method.  </p> dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com13tag:blogger.com,1999:blog-3635447662505002071.post-39957278763031134502009-02-23T19:36:00.001+00:002009-02-23T21:12:48.248+00:00Stored Procedures Are Slow Part 2<p>Last time I demonstrated a case where stored procedures are slow.  There were a few comments that I should include selects in the stored procedure to make the tests more realistic.  From experience I already knew the answer so I didn’t go into that level of detail, but since stored procedures are all about database access this is a reasonable comment.  </p> <p>This is a simple stored procedure that selects data and then summarizes it by customer.  No one would actually write this as it is far too easy to use a SQL statement instead.  Assume the logic is more complex and can’t be done easily by the standard SQL techniques of case statements, temp tables, etc, and then this makes more sense.   </p> <p>The end result is this stored procedure takes 696 seconds to run.  
</p> <p> </p> <p><font face="Courier New">create procedure slowCounter() <br />begin </font></p> <p><font face="Courier New">    declare done int default 0; <br />    declare curCustomerId int default 0; <br />    declare customerCount int default 0; <br />    declare slowCur cursor for select customerId from saleT; <br />    declare continue handler for not found <br />        set done = TRUE; </font></p> <p><font face="Courier New">    create temporary table tempCount <br />    ( <br />        tempCustomerId int not null, <br />        tempCustomerCount int not null, <br />        primary key (tempCustomerId) <br />    ); </font></p> <p><font face="Courier New">    open slowCur; <br />    fetch slowCur into curCustomerId ; <br />    while not done do <br />        insert into tempCount(tempCustomerId, tempCustomerCount) <br />            values (curCustomerId , 1) <br />        on duplicate key update tempCustomerCount = tempCustomerCount + 1; <br />        fetch slowCur into curCustomerId ; <br />    end while; </font></p> <p><font face="Courier New">    close slowCur; </font></p> <p><font face="Courier New">    select * from tempCount; </font></p> <p><font face="Courier New">end</font></p> <p> </p> <p>This more complex C# code (and still not fully exception proofed) only takes 1.4 seconds to run.  That is nearly 1/500 the time it took the stored procedure to execute.   </p> <p> </p> <blockquote> <p><font face="Courier New">using System; <br />using System.Collections.Generic; <br />using MySql.Data.MySqlClient; </font></p> <p><font face="Courier New">namespace Summarization <br />{ <br />    /// <summary> <br />    /// Tests the performance of C# to simulate a SQL group by <br />    /// </summary> <br />    class SumTestGroup : BaseSum <br />    { <br />        protected MySqlConnection sqlConn; </font></p> <p><font face="Courier New">        private Dictionary<Int32, String> customer = <br />            new Dictionary<Int32, String>(); <br />        private Dictionary<String, Int32> customerRowCount = <br />            new Dictionary<String, Int32>(); </font></p> <p><font face="Courier New">        public void execute() <br />        { <br />            try <br />            { <br />                sqlConn = getConnectionMySql(); <br />                sqlConn.Open(); <br />                int count = 0; <br />                Int32 customerId; <br />                String customerName; </font></p> <p><font face="Courier New">                getCustomers(); </font></p> <p><font face="Courier New">                MySqlCommand cmd = new MySqlCommand("select s.customerId from Salet s", sqlConn); <br />                MySqlDataReader rdr = cmd.ExecuteReader(); </font></p> <p><font face="Courier New">                while (rdr.Read()) <br />                { <br />                    customerId = (Int32)rdr["customerId"]; <br />                    customerName = customer[customerId]; <br />                    count++; </font></p> <p><font face="Courier New">                    if (customerRowCount.ContainsKey(customerName)) <br />                    { <br />                        customerRowCount[customerName]++; <br />                    } <br />                    else <br />                    { <br />                        customerRowCount[customerName] = 1; <br />                    } <br />                    //Console.WriteLine("Count: " + count); <br />                } </font></p> <p><font face="Courier New">                foreach (KeyValuePair<String, Int32> customerRow in 
customerRowCount) <br />                { <br />                    Console.WriteLine("Customer: " + customerRow.Key + " Customer Count: " +  customerRow.Value); <br />                } <br />                rdr.Close(); </font></p> <p><font face="Courier New">            } <br />            finally <br />            { <br />                sqlConn.Close(); <br />            } <br />        } </font></p> <p><font face="Courier New">        private void getCustomers() <br />        { <br />            MySqlCommand cmd = new MySqlCommand("select customerId, customerName from customer", sqlConn); <br />            MySqlDataReader rdr = cmd.ExecuteReader(); </font></p> <p><font face="Courier New">            Int32 customerId; <br />            String customerName; </font></p> <p><font face="Courier New">            while (rdr.Read()) <br />            { <br />                if (!rdr.IsDBNull(0)) <br />                { <br />                    customerId = (Int32)rdr["customerId"]; <br />                    customerName = (String)rdr["customerName"]; <br />                    customer.Add(customerId, customerName); <br />                } <br />            } <br />            rdr.Close(); <br />        } <br />    } <br />}</font></p> </blockquote> <p>Both produce results like this:</p> <blockquote> <p><font face="Courier New">customer Name 0, 100287 <br />customer Name 1, 100009 <br />customer Name 2, 100130 <br />customer Name 3, 99655 <br />customer Name 4, 100134 <br />customer Name 5, 100102 <br />customer Name 6, 99854 <br />customer Name 7, 100172 <br />customer Name 8, 99846 <br />customer Name 9, 99812</font></p> </blockquote> <p> </p> <p>In this case, the effort required to retrieve one million rows out of the database and then process them takes orders of magnitude less than a stored procedure takes to process the data inside the database.   </p> <p><font face="Courier New"></font></p> <p>No one would actually write this code when a simple SQL alternative is possible.  But given complex logic that isn’t possible or easy to express in SQL that needs to be applied against a large volume of data, using a stored procedure cursor (or loop in general) isn’t a fast way to go.  This is odd as the whole point of stored procedures is to write code that deals with data contained in a database.   </p> dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com2tag:blogger.com,1999:blog-3635447662505002071.post-19027434160741324732009-02-13T20:59:00.001+00:002009-02-13T20:59:38.764+00:00Are MySQL stored procedures slow?<p> </p> <p>Yes, if compared to code in Java or C#. For example, this overly simple code took 284 seconds.</p> <blockquote> <p><font face="Courier New">CREATE PROCEDURE CountTest() <br />begin <br />    declare counter int default 0; <br />    select now(); <br />    repeat <br />        set counter = counter + 1; <br />    until counter > 120000000 <br />    end repeat; <br />    select counter; <br />    select now(); <br />end</font> </p> </blockquote> <p><font face="Courier New"></font></p> <p><font face="Courier New"></font></p> <blockquote> <p><font face="Courier New"></font></p> </blockquote> <p>Ignoring my off by one error, here is equivalent code in C# (the language I’m currently learning).  It took 419 milliseconds, or MySQL took 677 times longer to execute. 
From my experience, Java isn’t going to be any slower.</p> <blockquote> <p> <font face="Courier New">int counter = 0; <br />while (counter < 120000000) <br />{ <br />    counter++; <br />}</font></p> </blockquote> <p><font face="Courier New"></font></p> <blockquote> <p><font face="Courier New"></font></p> </blockquote> <p>Slow stored procedure performance is one of the reasons why it usually isn’t wise to implement computationally expensive business logic in the database.   With networks not being the bottleneck they once were it is often better to extract the data to the application layer and process it in the programming language of your choice, even if that takes a few extra round trips to the database.  There are exceptions where those extra round trips are too numerous and prohibitively expensive.  With something as complex as a database there are always exceptions. </p> <p>This example is a bit too simple, but, based on experience, it is representative of the general performance of stored procedure logic.</p> <p>How can MySQL get away with such poor performance? Well, the equivalent code in MS SQL Server took 80 seconds on the same hardware, which is also two orders of magnitude slower than C# or Java code.  </p> <blockquote> <p><font face="Courier New">create procedure testCount <br />as <br />begin <br />    declare @countNo int; <br />    set @countNo = 0; <br />    while (@countNo < 120000000) <br />    begin <br />        set @countNo = @countNo + 1 <br />    end <br />    select @countNo <br />end</font> </p> </blockquote> <p><font face="Courier New"></font></p> <p><font face="Courier New"></font></p> <p>MS SQL Server code runs 3.5 times faster than MySQL, but given how much older MS SQL Server is, MySQL is doing well here.  So are MySQL Stored procedures slow?  Not really, if compared to other databases I’ve used.</p> <p>Just don’t use them to do computationally expensive business logic.</p> dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com5tag:blogger.com,1999:blog-3635447662505002071.post-63562929485972130382009-01-21T23:38:00.001+00:002009-01-21T23:38:11.910+00:00Implementing Sharding in the Database<p>Over the past few weeks (years really) there has been some discussion on sharding. Instead of discussing when sharding is required, as there are good discussions on this already, I want to discuss how I would like to have sharding implemented in the database.</p> <p>I want the database to handle sharding automatically, and where it can't be automatic, I want the database to help as much as it can.  Just like I want my business logic in a language ideally suited to it, and not stored procs (generally, there are always exceptions); I want all my physical persistence to be handled by the place that already does most of it, the database.  Having the database handle some of the physical persistence and the object relational layer handle the sharding logic isn’t ideal to me, and not just because the current object relational layers don’t have all the sharding functionality I want.  So here is what I want my database to do.   </p> <p>1) I want to automatically route simple queries to the one shard that has the relevant data.  Hibernate shards already does this; I just want it implemented in the database.  The MPP (massively parallel processing) version of DB2 can also do this.</p> <p>2) I need the ability to reshard data when a rebalancing of the shards is required, without downtime.  
This is similar in concept to an index rebuild being done while the database is still up, which the expensive databases have implemented.  This is more complex than an index rebuild, as multiple tables will need to be resharded before the resharding is complete.  DB2 has functionality to reshard, but it required downtime the last time I checked (which was a few years ago).</p> <p>3) Finally, and this is a superset of #1, I want to be able to run one SQL statement across multiple shards at the same time.  The database will send the SQL to all the relevant shards, run as much of the SQL on each shard as possible, using something like the function shipping that DB2 MPP has, and then aggregate the data from the various shards into one result set.  All I need to do this is execute one SQL statement.  Stated differently, all the shards will look like one database.  This won’t be easy to implement, and will probably be limited in functionality at first, but this long term goal is why implementing sharding in the database makes sense.  </p> <p>This might sound a bit like an advertisement for the MPP version of DB2, but if it were open source I would strongly consider implementing it for applications that need database sharding. Given that that is not likely to happen, I hope a reliable open source database, like MySQL, or Drizzle, or PostgreSQL, starts implementing this reasonably soon.</p> <p>This sharding functionality, for the types of problems I’m seeing, is more important than any other new MySQL functionality, including better scaling on multi-core systems. With all the above functionality, I can reasonably easily have multiple shards on one machine. Not that I don’t want much better multi-core CPU scalability (I do, as more shards are still harder to manage than fewer); it just isn’t as important as better sharding functionality.  I absolutely need the ability to shard; I don't absolutely need the ability to effectively use every CPU on a server with one shard.  </p> <p>On another note, I’ll be giving a presentation on table partitioning at the MySQL conference.  This won’t be a basic presentation on how to implement partitioning; instead I’ll be highlighting scenarios where partitioning will degrade performance and cases where it improves performance.  I’ll show examples where, in spite of degraded performance for some tasks, it still might make sense to implement it.  An analogy here would be indexes, where adding an index can speed up some selects at the expense of slowing down inserts and some updates.</p> dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com6tag:blogger.com,1999:blog-3635447662505002071.post-47904443143272781052008-10-01T00:28:00.001+01:002008-10-01T00:52:50.817+01:00Infobright Review – Part 2<p>First, a retraction: it turns out that the performance problem with datetimes in the previous article wasn’t due to high cardinality (I speculated too much here), but due to a type conversion issue.  From a helpful comment from Victoria Eastwood of Infobright (a good sign for a startup), the Infobright engine considered ‘2001-01-01’ to be a date, not a datetime, and it couldn’t do a conversion to a datetime.  Instead it pushed the date filtering logic from the Infobright engine to MySQL.  Effectively, the slow queries were a table scan.   The solution is to add the 00:00:00 to the dates to make them datetimes.   </p> <p>With that in mind, here are some much better numbers for Infobright.   For Infobright this query took 0.05 seconds.  
</p> <p>1) Select sum(unit) from Sale where purchaseDate >= '2001-04-01 00:00:00' and purchaseDate < '2001-05-01 00:00:00'</p> <p>This compares very favorably to 4 seconds for InnoDB if the data is cached and 8.8 seconds if the data needs to be read from disk.  </p> <p>In doing further tests I found some interesting patterns one might not expect coming from a row database background.  The more selective query #2 took 0.27 seconds for Infobright, longer than the less selective query #1.  Still, 0.27 seconds is fast enough.  The odd part is that a row based database is generally faster, or at least no slower, the more selective the query.  For Infobright the exact opposite occurs, at least in the cases I explored.   For this more selective query, InnoDB took 0.047 seconds with a cold cache (all following queries will be using a cold cache unless otherwise noted) compared to the 0.27 seconds for Infobright.  It seems that Infobright is faster than InnoDB for less selective queries, and InnoDB is faster for more selective queries.  </p> <p>2) select sum(unit) from Sale where purchaseDate >= '2001-04-01 00:00:00' and purchaseDate < '2001-05-01 00:00:00' and productId = 199824</p> <p>The even more selective query #3 took longer still, at 0.99 seconds for Infobright, but only took 0.047 seconds for InnoDB, the same time as query #2.   </p> <p>3) select sum(unit) from Sale where purchaseDate >= '2001-04-01 00:00:00' and purchaseDate < '2001-05-01 00:00:00' and productId = 199824 and customerId = 418845</p> <p>The same pattern becomes more evident with double the time range for the queries, with two months of data compared to just one.   Query #4 took 0.07 seconds for Infobright whereas InnoDB took 19.9 seconds, far longer.   </p> <p>4) select sum(unit) from Sale where purchaseDate >= '2001-04-01 00:00:00' and purchaseDate < '2001-06-01 00:00:00'</p> <p>At 1.25 seconds for Infobright, adding another criterion to query #4 noticeably slowed performance.  At 0.061 seconds, InnoDB was much faster than it was for the less selective query #4.  </p> <p>5) select sum(unit) from Sale where purchaseDate >= '2001-04-01 00:00:00' and purchaseDate < '2001-06-01 00:00:00' and productId = 199824</p> <p>At 1.99 seconds, the more selective query #6 was again slower than #5 for Infobright.  Following the trend, InnoDB was even faster at 0.048 seconds.   </p> <p>6) select sum(unit) from Sale where purchaseDate >= '2001-04-01 00:00:00' and purchaseDate < '2001-06-01 00:00:00' and productId = 199824 and customerId = 418845</p> <p>But there is another difference between InnoDB and Infobright.  The below query #7 takes 0.06 seconds for InnoDB, and takes 11.5 seconds for Infobright with a cold cache (I restarted the database in both cases).  Once the cache was warmed up Infobright took 2.2 seconds to run and retrieve one row.  As there are 120 million different orderIds in the Sale table, perhaps my speculation that Infobright does worse the higher the cardinality of a column wasn't that off.   </p> <p>7) select * from Sale where orderId = 1</p> <p>At least for the cases I examined, InnoDB is better for more selective queries, at times far better, and Infobright is better for less selective ones, at times far better.  This assumes defining good indexes for InnoDB, work which Infobright does not require.  Which one to use for reporting?  It really does depend on the nature of the workload.    
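</p> <p>If you rerun these comparisons on your own data, it is worth checking which index the InnoDB optimizer actually picks, since that drives most of the InnoDB numbers above.  For example, using query #2 (a sketch; your plan may well differ):</p> <p>explain select sum(unit) from Sale where purchaseDate >= '2001-04-01 00:00:00' and purchaseDate < '2001-05-01 00:00:00' and productId = 199824 \G</p> <p>Infobright has no indexes to inspect, which is part of its appeal, but it also means there is less to tune when a query is slow.  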
</p> <p>Having said that, Infobright behaves quite differently from InnoDB and I know there is much I'm missing as I only put in a few hours of work.   I would try your own workload and come to your own conclusions.  </p> dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com8tag:blogger.com,1999:blog-3635447662505002071.post-2292097175575517912008-09-23T22:51:00.001+01:002008-09-23T23:57:24.062+01:00An Infobright Review<p>With open source software I can install reasonably complete software and try it with my data. This way I get to see how it works in a realistic setting without having to rely on benchmarks and hoping they are a good match for my environment. And I get to do this without having to deal with commercial software sales people.</p> <p>So I was glad to hear that Infobright had gone open source, as I have been wanting to test a column based database for a while. I was even happier that it was a MySQL based engine, as I would already know many of the commands. I decided to run some of the same tests I had run when comparing InnoDB and MyISAM for reporting (<a href="http://dbscience.blogspot.com/2008/08/innodb-suitability-for-reporting.html">http://dbscience.blogspot.com/2008/08/innodb-suitability-for-reporting.html</a> ).  InnoDB performed better than MyISAM in my reporting tests so I’m going to compare Infobright to InnoDB.</p> <p>The first task is to load the data with this command.</p> <p>load data infile '/dbdata/info/data/sale.dat' into table Sale fields terminated by '\t' ; </p> <p>Infobright loaded 120 million rows in 8 min 2.62 seconds. Loading 14 million rows a minute is fast enough for me, considering I’m running on a commodity machine with an external disk subsystem that cost about 12k USD in total.</p> <p>Next I run a simple query that takes 4 seconds to run on InnoDB when the data is cached, and 8.8 seconds when disk is involved. This type of query is common in reporting systems.</p> <p>select sum(unit) from Sale where purchaseDate >= '2001-11-01' and purchaseDate < '2001-12-01'</p> <p>This same query takes 151 seconds to run on Infobright. I get the message “Brighthouse: Value out of range. Query executed by MySQL engine.” I have 631,530 different purchaseDate values for 120,000,000 rows in the Sale table, so perhaps it is the high cardinality of the column that is causing the problem. I’ve heard that column based databases can have this issue. I don’t see any noticeable disk io.</p> <p>But, if I run the exact same query again it now takes 0.0069 seconds; obviously data is being cached at some level.</p> <p>I’m curious if the various caches need to warm up before Infobright has fast performance. To test this I run the same query on a subset of the above date range. This sql still takes 153 seconds. So it appears only the result set is being cached.</p> <p>select sum(unit) from Sale where purchaseDate >= '2001-11-02' and purchaseDate < '2001-12-01'</p> <p>If I limit the result set to one customer the performance doesn’t improve, with this sql taking 205 seconds. I’m still getting the message “Brighthouse: Value out of range. Query executed by MySQL engine.”</p> <p>select sum(unit) from Sale where purchaseDate >= '2001-11-01' and purchaseDate < '2001-12-01' and customerId = 50</p> <p>If I leave dates out of the query and just go with simple int conditions this sql takes 1.25 seconds. I’m no longer getting the message “Brighthouse: Value out of range. 
Query executed by MySQL engine”, which explains the better performance.</p> <p>select sum(unit) from Sale where customerId = 50 and productId = 22641</p> <p>The same query in InnoDB takes 0.035 seconds if the io is required and 0.0079 if the data is cached (I'm not using result set caching, I'm just letting the db cache the underlying data).</p> <p>Perhaps the problem is all my data is too high in cardinality. I have created 10,000,000 products and 1,000,000 customers, with the data being randomly generated. Is this too much for Infobright to handle? Perhaps the data is not realistic enough; perhaps it has too high a cardinality. I do have a tendency to push to extremes to find out when things get interesting (in other words, break). Yet InnoDB handled this cardinality without a problem. Plus, in my case, for the SaaS application I’m working on, the customer count is low if we are a success and 10 million customers is plausible. Storing dates to the second precision isn't an unreasonable requirement, although 10 million products is undeniably unrealistic for most organizations.  </p> <p>In spite of that, next I think about truncating the seconds on the purchaseDate column to reduce the cardinality. Then I remember the community version doesn’t allow updates (or deletes, or inserts). This is a bit too crippled for my taste for I often want to update customer demographic information.  This concerns me, as the data would have to be stored elsewhere to be updated and then the updated data, in fact the entire table, dropped and recreated in Infobright.  Having to store data twice reduces the compression advantages of Infobright and adds to the complexity of the solution, when one of the stated advantages of Infobright is simplicity compared to row based databases.  </p> <p>And at that I will end this preliminary assessment. I may reduce the cardinality of the data and try the tests again.</p> dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com5tag:blogger.com,1999:blog-3635447662505002071.post-17752572636014561662008-09-19T01:44:00.001+01:002008-09-20T01:47:42.209+01:00Datetimes and Ints<p>In my last article I found, and others confirmed (thanks), that timestamps were much slower than datetimes. I next wondered if datetimes were slower than storing dates as ints in the UNIX timestamp format (seconds since January 1, 1970). The conclusion is that ints are a bit faster than datetimes, but not enough for me to give up all the functionality that datetimes provide.</p> <p>I ran this sql 100 times against tables of about 20 gig, with the innodb_buffer_pool_size=2G and innodb_flush_method=O_DIRECT, to get physical io and make the tests reasonably realistic.</p> <p>Running this datetime sql 100 times took 21.2 minutes.  The month randomly varied for each of the 100 executions to avoid caching issues.  </p> <p><font face="Courier New" size="2">select sum(unit) from Sale where purchaseDate >= '2001-02-01' and purchaseDate < '2001-03-01'</font></p> <p>Running the equavlent int sql 100 times took 20.7 minutes, a difference of about 2.4%.</p> <p><font face="Courier New" size="2">select sum(unit) from SaleTI where purchaseDate >= 981014400 and purchaseDate < 983433600</font></p> <p>So even though a datetime takes 8 bytes and an int 4 bytes, the performance is roughly comparable.  Since I find datetimes much easier to work with than ints I'll be using datetimes.  
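</p> <p>If you do go the int route, UNIX_TIMESTAMP() and FROM_UNIXTIME() take some of the pain out of the conversions.  Here is a sketch of how the literals in the int query above can be generated; note that the result of UNIX_TIMESTAMP() depends on the session time_zone, so generate the constants and run the queries under the same setting:</p> <p><font face="Courier New" size="2">select unix_timestamp('2001-02-01 00:00:00'), unix_timestamp('2001-03-01 00:00:00'); <br /><br />select sum(unit) from SaleTI <br /> where purchaseDate >= unix_timestamp('2001-02-01 00:00:00') <br />   and purchaseDate <  unix_timestamp('2001-03-01 00:00:00');</font></p> <p>FROM_UNIXTIME() goes the other way when you want readable output.  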
</p> <p>For the curious, here is the datetime version of Sale table:</p> <p> </p> <p><font face="Courier New" size="2">CREATE TABLE `test`.`Sale` (</font></p> <p><font face="Courier New" size="2">`orderId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">`customerId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">`productId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">`productBigId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">`unit` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">`purchaseAmount` decimal(16,2) NOT NULL,</font></p> <p><font face="Courier New" size="2">`purchaseCost` decimal(16,2) NOT NULL,</font></p> <p><font face="Courier New" size="2">`purchaseDate` datetime NOT NULL,</font></p> <p><font face="Courier New" size="2">PRIMARY KEY (`purchaseDate`,`orderId`),</font></p> <p><font face="Courier New" size="2">UNIQUE KEY `idx_sale_order` (`orderId`),</font></p> <p><font face="Courier New" size="2">KEY `pf_sale_product` (`productId`),</font></p> <p><font face="Courier New" size="2">KEY `pf_sale_customer` (`customerId`),</font></p> <p><font face="Courier New" size="2">CONSTRAINT `pf_sale_customer` FOREIGN KEY (`customerId`) REFERENCES `Customer` (`customerId`),</font></p> <p><font face="Courier New" size="2">CONSTRAINT `pf_sale_product` FOREIGN KEY (`productId`) REFERENCES `Product` (`productId`)</font></p> <p><font face="Courier New" size="2">) ENGINE=InnoDB </font></p> <p> </p> <p>And here is the int version of SaleTI table.</p> <p> </p> <p><font face="Courier New" size="2">CREATE TABLE `test`.`SaleTI` (</font></p> <p><font face="Courier New" size="2">`orderId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">`customerId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">`productId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">`productBigId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">`unit` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">`purchaseAmount` decimal(16,2) NOT NULL,</font></p> <p><font face="Courier New" size="2">`purchaseCost` decimal(16,2) NOT NULL,</font></p> <p><font face="Courier New" size="2">`purchaseDate` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">PRIMARY KEY (`purchaseDate`,`orderId`),</font></p> <p><font face="Courier New" size="2">UNIQUE KEY `idx_saleTI_order` (`orderId`),</font></p> <p><font face="Courier New" size="2">KEY `idx_saleTI_product` (`productId`),</font></p> <p><font face="Courier New" size="2">KEY `idx_saleTI_customer` (`customerId`),</font></p> <p><font face="Courier New" size="2">CONSTRAINT `fk_saleTI_customer` FOREIGN KEY (`customerId`) REFERENCES `Customer` (`customerId`),</font></p> <p><font face="Courier New" size="2">CONSTRAINT `fk_saleTI_product` FOREIGN KEY (`productId`) REFERENCES `Product` (`productId`)</font></p> <p><font face="Courier New" size="2">) ENGINE=InnoDB </font></p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="2"></font></p> <p>Update:  While it doesn't represent the environments I work in, I just did a ran with same sql with innodb_buffer_pool_size=28G.  The tables were now able to fit into memory.  In this case datetimes were about 1% faster than ints, which I can't explain.  I suspect the difference is just random fluctuation.  The conclusion remains the same, that datetimes performance is about the same as Unix ints and datetimes have far more functionality.  
</p> dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com2tag:blogger.com,1999:blog-3635447662505002071.post-49962084511844691692008-08-29T22:47:00.006+01:002008-08-29T23:43:14.235+01:00Can a timestamp be slower than a datetime?<p>One should use timestamps rather than datetimes as timestamps are four bytes and datetimes are eight bytes, right?  This makes sense as smaller indexes and tables are faster than larger ones. However, when I was trying to quantify the improvement (I’m strange that way), I found the exact opposite - when used as a criteria, a timestamp is slower than a datetime. Often much slower. </p> <p>For example, for this simple sql, when purchaseDate was a timestamp, the query took about 9.4 seconds and the datetime version took about 4.3 seconds. In this case the data was in memory. </p> <p>    select sum(unit) from Sale where purchaseDate >= '2001-07-01' and purchaseDate < '2001-08-01' </p> <p>I did other tests, such as when data wasn’t cached and there was io, and I found cases where datetimes were much faster than timestamps, differences so extreme I’m hesitant to report them until I have a better explanation of what is going on. I really can’t believe the performance differences I'm measuring and I have this suspicion that something else is occurring as these numbers seem too extreme, but what I can’t say. </p> <p>Was there a bug for this I don’t know about? It is hard to find information on timestamps as it is such a common word, but I did scan the MySQL bugs for timestamps and didn’t find anything. Is it possible the extra functionality of timestamps, like timezone conversions, makes timestamps slower than datetimes? </p> <p>In a few other cases, I found them about the same speed; only in one case were timestamps faster than datetimes, and that was 94 seconds for timestamps and 97 seconds for datetimes, which is an insignificant difference. </p> <p>I tested this on 5.1.26 and 5.0.22, and found the performance difference between timestamps and datetimes on both versions. </p> <p>I'm not sure what to conclude from this. If you aren’t frequently using the date oriented columns as criteria in queries, and dates are something like a rarely used record of when an event occurred, the smaller timestamp is the better choice. If, however, if many queries are run against the date in question, and assuming these results I’m getting are accurate, then a datetime column should at least be under consideration. In other words, benchmark datetimes and timestamps in your environment with representative queries. The results are worse for timestamps when they are used as a selection criteria. </p> <p>If you get the crazy performance differences I’m getting, or if you don’t, let me know. </p> <p> </p> <p>For the curious, the plans for the timestamp queries looked like this: </p> <p><font face="Courier New" size="2">mysql> explain select sum(unit) from SaleT where purchaseDate >= '2001-07-01'  and purchaseDate < '2001-08-01' \G <br />*************************** 1. 
row *************************** <br />           id: 1 <br />  select_type: SIMPLE <br />        table: SaleT <br />         type: range <br />possible_keys: PRIMARY <br />          key: PRIMARY <br />      key_len: 4 <br />          ref: NULL <br />         rows: 20267520 <br />        Extra: Using where <br />1 row in set (0.04 sec)</font></p> <p>And the datetime queries looked like this: </p> <p><font face="Courier New" size="2">mysql> explain select sum(unit) from Sale where purchaseDate >= '2001-07-01'  and purchaseDate < '2001-08-01' \G <br />*************************** 1. row *************************** <br />           id: 1 <br />  select_type: SIMPLE <br />        table: Sale <br />         type: range <br />possible_keys: PRIMARY <br />          key: PRIMARY <br />      key_len: 8 <br />          ref: NULL <br />         rows: 20548080 <br />        Extra: Using where <br />1 row in set (0.06 sec)</font></p> <p>And the table looks like this (with the other version having a purchaseDate timestamp column instead of a datetime). </p> <p> </p> <p><font face="Courier New" size="2">CREATE TABLE  `test`.`Sale` ( <br />  `orderId` int(11) NOT NULL, <br />  `customerId` int(11) NOT NULL, <br />  `productId` int(11) NOT NULL, <br />  `productBigId` int(11) NOT NULL, <br />  `unit` int(11) NOT NULL, <br />  `purchaseAmount` decimal(16,2) NOT NULL, <br />  `purchaseCost` decimal(16,2) NOT NULL, <br />  `purchaseDate` datetime NOT NULL, <br />  PRIMARY KEY  (`purchaseDate`,`orderId`), <br />  UNIQUE KEY `idx_sale_order` (`orderId`), <br />  KEY `pf_sale_product` (`productId`), <br />  KEY `pf_sale_customer` (`customerId`), <br />  CONSTRAINT `pf_sale_customer` FOREIGN KEY (`customerId`) REFERENCES `Customer` (`customerId`), <br />  CONSTRAINT `pf_sale_product` FOREIGN KEY (`productId`) REFERENCES `Product` (`productId`) <br />) ENGINE=InnoDB</font> </p> <p> </p> <p>On a somewhat related note, I also got this repeatable bug when trying to insert data into a timestamp column on 5.1.26: </p> <p>    Incorrect datetime value: '2001-03-11 02:00:39' for column 'purchaseDate' at row 226 </p> dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com7tag:blogger.com,1999:blog-3635447662505002071.post-15699486231585991702008-08-22T00:03:00.001+01:002008-08-22T17:59:04.639+01:00InnoDB's Suitability for Reporting<p><font size="2">I started using Oracle, a MVCC database, to develop reporting (data warehousing, BI, take your pick) systems years ago.  I’ve come to appreciate the scalability improvements that MVCC provides, particularly for pseudo real-time reporting applications, the ones where loads are occurring at the same time as report generation.  So when people say InnoDB, partly due to MVCC, isn’t as good as MyISAM for reporting I had to look into this in more detail.</font></p> <p><font size="2">What I found is InnoDB is a good engine for reporting.  In some ways, such as performance, it is at times better than MyISAM, and one of the downsides, such as a larger disk requirement, can be mitigated.  The trick is to for the primary key to be the one predominant access path.  In this example, the InnoDB clustered index, is purchaseDate and another column, such as orderId is added to make it unique.  This has a number of advantages.  In my experience, most reporting systems queries are for a date range, so date provides a great all round clustered index.   Also, one of the problems with indexes is they can fragment, which both slows down queries and takes up more disk.  
But all the new date oriented data is added to the end of the clustered index, meaning most of the index doesn’t change and the old data, if defragmented, remains so.   Finally, as most queries are for the recent data, clustering by date tends to keep the most frequently used recent data in memory, a benefit that is hard to quantify but can be substantial.  </font></p> <p><font size="2">Here is the InnoDB version of the table.  It holds 120 million rows, about 10 million for each month in 2001.  As the server has 28 gig of memory, in all cases I ensured that all these queries ran in memory and there was no physical io.  This isn’t completely fair as MyISAM does compact data and takes up less disk, but I’ll show later how to minimize the amount of disk that InnoDB consumes.  Plus, if there is enough (any? - comments would be great) contention on this point, I’ll run some tests with physical io for both engines.</font></p> <p> </p> <p><font face="Courier New">CREATE TABLE  `test`.`Sale` (</font></p> <p><font face="Courier New">  `orderId` int(11) NOT NULL,</font></p> <p><font face="Courier New">  `customerId` int(11) NOT NULL,</font></p> <p><font face="Courier New">  `productId` int(11) NOT NULL,</font></p> <p><font face="Courier New">  `productBigId` int(11) NOT NULL,</font></p> <p><font face="Courier New">  `unit` int(11) NOT NULL,</font></p> <p><font face="Courier New">  `purchaseAmount` decimal(16,2) NOT NULL,</font></p> <p><font face="Courier New">  `purchaseCost` decimal(16,2) NOT NULL,</font></p> <p><font face="Courier New">  `purchaseDate` datetime NOT NULL,</font></p> <p><font face="Courier New">  PRIMARY KEY  (`purchaseDate`,`orderId`),</font></p> <p><font face="Courier New">  UNIQUE KEY `idx_sale_order` (`orderId`),</font></p> <p><font face="Courier New">  KEY `pf_sale_product` (`productId`),</font></p> <p><font face="Courier New">  KEY `pf_sale_customer` (`customerId`),</font></p> <p><font face="Courier New">  CONSTRAINT `fk_sale_customer` FOREIGN KEY (`customerId`) REFERENCES `Customer` (`customerId`),</font></p> <p><font face="Courier New">  CONSTRAINT `fk_sale_product` FOREIGN KEY (`productId`) REFERENCES `Product` (`productId`)</font></p> <p><font face="Courier New">) ENGINE=InnoDB</font></p> <p> </p> <p>The InnoDB version of this query takes 4 seconds and sensibly uses the primary key to limit the amount of data it needs to query.   This means only about 1/12 of the table needs to be queried, and when it is read the access is sequential, which is fast.   </p> <p> </p> <p><font face="Courier New">explain select sum(unit) from Sale where purchaseDate >= '2001-11-01' and purchaseDate < '2001-12-01'  \G</font></p> <p><font face="Courier New">*************************** 1. row ***************************</font></p> <p><font face="Courier New">           id: 1</font></p> <p><font face="Courier New">  select_type: SIMPLE</font></p> <p><font face="Courier New">        table: Sale</font></p> <p><font face="Courier New">         type: range</font></p> <p><font face="Courier New">possible_keys: PRIMARY</font></p> <p><font face="Courier New">          key: PRIMARY</font></p> <p><font face="Courier New">      key_len: 8</font></p> <p><font face="Courier New">          ref: NULL</font></p> <p><font face="Courier New">         rows: 19447290</font></p> <p><font face="Courier New">        Extra: Using where</font></p> <p> </p> <p>This is the equivalent MyISAM table with a more standard primary key of orderId.  
</p> <p> </p> <p><font face="Courier New">CREATE TABLE  `test`.`SaleI` (</font></p> <p><font face="Courier New">  `orderId` int(11) NOT NULL,</font></p> <p><font face="Courier New">  `customerId` int(11) NOT NULL,</font></p> <p><font face="Courier New">  `productId` int(11) NOT NULL,</font></p> <p><font face="Courier New">  `productBigId` int(11) NOT NULL,</font></p> <p><font face="Courier New">  `unit` int(11) NOT NULL,</font></p> <p><font face="Courier New">  `purchaseAmount` decimal(16,2) NOT NULL,</font></p> <p><font face="Courier New">  `purchaseCost` decimal(16,2) NOT NULL,</font></p> <p><font face="Courier New">  `purchaseDate` datetime NOT NULL,</font></p> <p><font face="Courier New">  PRIMARY KEY  (`orderId`),</font></p> <p><font face="Courier New">  KEY `pk_salei_product` (`productId`),</font></p> <p><font face="Courier New">  KEY `pk_salei_customer` (`customerId`),</font></p> <p><font face="Courier New">  KEY `idx_saleI_purchaseDate` (`purchaseDate`)</font></p> <p><font face="Courier New">) ENGINE=MyISAM</font></p> <p> </p> <p>As 12 seconds, the same query for MyISAM is much slower as it must visit both the index and the page which contains the data.   </p> <p> </p> <p><font face="Courier New">mysql> explain select sum(unit) from SaleI where purchaseDate >= '2001-11-01' and purchaseDate < '2001-12-01' \G;</font></p> <p><font face="Courier New">*************************** 1. row ***************************</font></p> <p><font face="Courier New">           id: 1</font></p> <p><font face="Courier New">  select_type: SIMPLE</font></p> <p><font face="Courier New">        table: SaleI</font></p> <p><font face="Courier New">         type: range</font></p> <p><font face="Courier New">possible_keys: idx_saleI_purchaseDate</font></p> <p><font face="Courier New">          key: idx_saleI_purchaseDate</font></p> <p><font face="Courier New">      key_len: 8</font></p> <p><font face="Courier New">          ref: NULL</font></p> <p><font face="Courier New">         rows: 9079951</font></p> <p><font face="Courier New">        Extra: Using where</font></p> <p> </p> <p>However, this isn’t completely fair to MyISAM as a better covering index for MyISAM can be written that replicates some of performance of a clustered index.   So I change the table definition to add the idx_saleI_covering index.  </p> <p> </p> <p><font face="Courier New">CREATE TABLE  `test`.`SaleI` (</font></p> <p><font face="Courier New">  `orderId` int(11) NOT NULL,</font></p> <p><font face="Courier New">  `customerId` int(11) NOT NULL,</font></p> <p><font face="Courier New">  `productId` int(11) NOT NULL,</font></p> <p><font face="Courier New">  `productBigId` int(11) NOT NULL,</font></p> <p><font face="Courier New">  `unit` int(11) NOT NULL,</font></p> <p><font face="Courier New">  `purchaseAmount` decimal(16,2) NOT NULL,</font></p> <p><font face="Courier New">  `purchaseCost` decimal(16,2) NOT NULL,</font></p> <p><font face="Courier New">  `purchaseDate` datetime NOT NULL,</font></p> <p><font face="Courier New">  PRIMARY KEY  (`orderId`),</font></p> <p><font face="Courier New">  KEY `pk_salei_product` (`productId`),</font></p> <p><font face="Courier New">  KEY `pk_salei_customer` (`customerId`),</font></p> <p><font face="Courier New">  KEY `idx_saleI_covering` (`purchaseDate`,`productId`,`customerId`,`unit`,`purchaseAmount`,`purchaseCost`)</font></p> <p><font face="Courier New">) ENGINE=MyISAM </font></p> <p> </p> <p>With the better idx_saleI_covering index the query now takes 8 seconds, still about twice a long as InnoDB.   
</p> <p> </p> <p><font face="Courier New">explain select sum(unit) from SaleI where purchaseDate >= '2001-11-01' and purchaseDate < '2001-12-01' \G</font></p> <p><font face="Courier New">*************************** 1. row ***************************</font></p> <p><font face="Courier New">           id: 1</font></p> <p><font face="Courier New">  select_type: SIMPLE</font></p> <p><font face="Courier New">        table: SaleI</font></p> <p><font face="Courier New">         type: range</font></p> <p><font face="Courier New">possible_keys: idx_saleI_covering</font></p> <p><font face="Courier New">          key: idx_saleI_covering</font></p> <p><font face="Courier New">      key_len: 8</font></p> <p><font face="Courier New">          ref: NULL</font></p> <p><font face="Courier New">         rows: 25370368</font></p> <p><font face="Courier New">        Extra: Using where; Using index</font></p> <p> </p> <p>How does a more selective query do, one which picks a specific date range and another criterion, in this case customer?  For InnoDB, almost zero seconds, 0.0079 seconds to be exact.  </p> <p> </p> <p><font face="Courier New">explain select sum(unit) from Sale where purchaseDate >= '2001-11-01' and purchaseDate < '2001-12-01' and customerId = 50 \G</font></p> <p><font face="Courier New">*************************** 1. row ***************************</font></p> <p><font face="Courier New">           id: 1</font></p> <p><font face="Courier New">  select_type: SIMPLE</font></p> <p><font face="Courier New">        table: Sale</font></p> <p><font face="Courier New">         type: index_merge</font></p> <p><font face="Courier New">possible_keys: PRIMARY,pf_sale_customer</font></p> <p><font face="Courier New">          key: pf_sale_customer,PRIMARY</font></p> <p><font face="Courier New">      key_len: 4,8</font></p> <p><font face="Courier New">          ref: NULL</font></p> <p><font face="Courier New">         rows: 12</font></p> <p><font face="Courier New">        Extra: Using intersect(pf_sale_customer,PRIMARY); Using where</font></p> <p><font face="Courier New">1 row in set (0.00 sec)</font></p> <p> </p> <p>The query plan for MyISAM is different, but the results are the same, with the query taking 0.0079 seconds to run.  </p> <p> </p> <p><font face="Courier New">mysql> explain select sum(unit) from SaleI where purchaseDate >= '2001-11-01' and purchaseDate < '2001-12-01' and customerId = 50 \G</font></p> <p><font face="Courier New">*************************** 1. row ***************************</font></p> <p><font face="Courier New">           id: 1</font></p> <p><font face="Courier New">  select_type: SIMPLE</font></p> <p><font face="Courier New">        table: SaleI</font></p> <p><font face="Courier New">         type: ref</font></p> <p><font face="Courier New">possible_keys: pk_salei_customer,idx_saleI_covering</font></p> <p><font face="Courier New">          key: pk_salei_customer</font></p> <p><font face="Courier New">      key_len: 4</font></p> <p><font face="Courier New">          ref: const</font></p> <p><font face="Courier New">         rows: 102</font></p> <p><font face="Courier New">        Extra: Using where</font></p> <p> </p> <p><font size="2">So, not an extensive series of tests, but for two of the more common types of reporting queries InnoDB is as fast as or faster than MyISAM.  At this point, all other things being equal, I would much prefer the higher reliability of InnoDB over MyISAM.  </font></p> <p><font size="2">But what about the size of the data on disk?  
MyISAM, for the version of SaleI with the covering index, at 13.6 gig, is 63.3% of InnoDB’s 21.5 gigs.   A significant advantage for MyISAM, yes, but not enough of itself to choose MyISAM over InnoDB.  Part of the reason is all the indexes are on numbers and dates, which MyISAM doesn’t compress by default.  Generally, for performance reasons, indexes should be non-character in a reporting system, so non-character indexes are a reasonable assumption.   But still, this comparison isn’t entirely fair as the InnoDB table has been optimized and hasn’t had a chance to get fragmented, right?   What happens if another month of data is loaded into the InnoDB Sale table without it being optimized?  I use this SQL to populate a new month of data. </font></p> <p> </p> <p><font face="Courier New">insert into Sale (orderId, customerId, productId, productBigId, unit, </font></p> <p><font face="Courier New">                  purchaseAmount, purchaseCost, purchaseDate)</font></p> <p><font face="Courier New">select orderId + 120000000, customerId, productId, productBigId, unit, </font></p> <p><font face="Courier New">                  purchaseAmount, purchaseCost, date_add(purchaseDate, interval 1 year)</font></p> <p><font face="Courier New">  from Sale</font></p> <p><font face="Courier New">where purchaseDate  >= '2001-01-01'</font></p> <p><font face="Courier New">   and purchaseDate <  '2001-02-01'</font></p> <p> </p> <p><font size="2">And the end result of adding one more month of data to twelve existing months is InnoDB now consumes 23.4 gigs.  In other words, adding 8.3% more data results in 8.1% more disk consumed.  The lesson from this is, if you can, load your data in primary key order.  </font></p> <p><font size="2">However, the data will still get fragmented over time, in particular the other non-clustered indexes.  Plus, it isn’t always possible to load data in primary key order.   A table optimize always optimizes the entire table, so this takes a while.  </font></p> <p><font size="2">At least it does for 5.0.  There is a new way with 5.1 to reduce the table maintenance of InnoDB - one can partition by date.  The advantage this provides is instead of optimizing an entire InnoDB table, one can do an ALTER TABLE ... REBUILD PARTITION on only the most recent data.  This means that the table defrag should take much less time.  I’ve never done this yet, so I’m not sure if it really works and I can’t recommend it, but if/when it does work, it is another mitigating factor against the larger database sizes of InnoDB.  However, it could also be used to make date selective MyISAM queries much faster.  But see my <a href="http://dbscience.blogspot.com/2008/01/divide-and-be-conquered.html">http://dbscience.blogspot.com/2008/01/divide-and-be-conquered.html</a> article on a potential performance downside of partitioning. </font></p> <p><font size="2">This issue is more complex than this one article can cover, but for those reporting applications where there is one predominant access path, such as date, the clustered indexing advantages of the InnoDB engine make it worth consideration.   
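</font></p> <p><font size="2">To make that partition maintenance idea concrete, here is roughly what I have in mind, and it is only a sketch: I haven't benchmarked it, the SaleRange name and the partition names are invented for illustration, only a few partitions are shown, and the unique key on orderId has to become a plain index because every unique key on a partitioned table must include the column used in the partitioning expression.  </font></p> <p><font face="Courier New">CREATE TABLE SaleRange ( <br />   orderId int NOT NULL, <br />   customerId int NOT NULL, <br />   productId int NOT NULL, <br />   productBigId int NOT NULL, <br />   unit int NOT NULL, <br />   purchaseAmount decimal(16,2) NOT NULL, <br />   purchaseCost decimal(16,2) NOT NULL, <br />   purchaseDate datetime NOT NULL, <br />   PRIMARY KEY (purchaseDate, orderId), <br />   KEY idx_salerange_order (orderId) <br />) ENGINE=InnoDB <br />PARTITION BY RANGE (TO_DAYS(purchaseDate)) ( <br />   PARTITION p200112 VALUES LESS THAN (TO_DAYS('2002-01-01')), <br />   PARTITION p200201 VALUES LESS THAN (TO_DAYS('2002-02-01')), <br />   PARTITION pmax VALUES LESS THAN MAXVALUE); <br /> <br />-- rebuild (defragment) only the newest month rather than optimizing the whole table <br />ALTER TABLE SaleRange REBUILD PARTITION p200201;</font></p> <p><font size="2">If rebuilding a single partition behaves as advertised, the periodic maintenance cost becomes proportional to one month of data rather than to the whole table, which is the point of the exercise.  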
</font></p> dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com1tag:blogger.com,1999:blog-3635447662505002071.post-79146799935029123742008-07-23T15:18:00.001+01:002008-07-23T15:18:58.090+01:00SAAS Multi-tenant Databases<p>There are some good high level discussions on the various database architectures for implementing software as a service.  I’m not going to duplicate that.  Instead, this article is going to highlight some of the issues and tradeoffs when multiple tenants (customers of the software service provider) are put into one database.   The core of the problem is that having multiple tenants in one set of database tables can make queries that select data for only one customer much slower.  </p> <p>First, I’m testing on a new system, one with 32 gig of memory.  Currently, the innodb_buffer_pool_size is set to 28 gig.   I’m using the same size tables as previously, both Sale and SaleTenant have 120 million rows and are about 27 and 28 gig respectively.  This means that the tables fit into memory, so the following tests do not have an IO component.  Not perfect, but it shouldn't make a difference in the conclusions for these tests.   </p> <p>For the other two tables, there are 100 tenants, and 1 million customers.  </p> <p>The ddl looks like this (yeah, key and constraint names don’t make sense).   First, the non-multi-tenant table.  </p> <p> </p> <p><font face="Courier New" size="2">CREATE TABLE  `test`.`Sale` (</font></p> <p><font face="Courier New" size="2">  `orderId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `customerId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `productId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `productBigId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `unit` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `purchaseAmount` decimal(16,2) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `purchaseCost` decimal(16,2) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `purchaseDate` datetime NOT NULL,</font></p> <p><font face="Courier New" size="2">  PRIMARY KEY  (`purchaseDate`,`orderId`),</font></p> <p><font face="Courier New" size="2">  UNIQUE KEY `idx_sale_order` (`orderId`),</font></p> <p><font face="Courier New" size="2">  KEY `pf_sale_product` (`productId`),</font></p> <p><font face="Courier New" size="2">  KEY `pf_sale_customer` (`customerId`),</font></p> <p><font face="Courier New" size="2">  CONSTRAINT `pf_sale_customer` FOREIGN KEY (`customerId`) REFERENCES `Customer` (`customerId`),</font></p> <p><font face="Courier New" size="2">  CONSTRAINT `pf_sale_product` FOREIGN KEY (`productId`) REFERENCES `Product` (`productId`)</font></p> <p><font face="Courier New" size="2">) ENGINE=InnoDB </font></p> <p> </p> <p>Then the same table with a tenantId column added.  
</p> <p> </p> <p><font face="Courier New" size="2">CREATE TABLE  `test`.`SaleTenant` (</font></p> <p><font face="Courier New" size="2">  `orderId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `tenantId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `customerId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `productId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `productBigId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `unit` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `purchaseAmount` decimal(16,2) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `purchaseCost` decimal(16,2) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `purchaseDate` datetime NOT NULL,</font></p> <p><font face="Courier New" size="2">  PRIMARY KEY  (`tenantId`,`purchaseDate`,`orderId`),</font></p> <p><font face="Courier New" size="2">  UNIQUE KEY `idx_saletenant_order` (`orderId`),</font></p> <p><font face="Courier New" size="2">  KEY `pf_saletenant_product` (`productId`),</font></p> <p><font face="Courier New" size="2">  KEY `pf_saletenant_customer` (`customerId`),</font></p> <p><font face="Courier New" size="2">  CONSTRAINT `pf_saletenant_customer` FOREIGN KEY (`customerId`) REFERENCES `Customer` (`customerId`),</font></p> <p><font face="Courier New" size="2">  CONSTRAINT `pf_saletenant_product` FOREIGN KEY (`productId`) REFERENCES `Product` (`productId`),</font></p> <p><font face="Courier New" size="2">  CONSTRAINT `pf_saletenant_tentant` FOREIGN KEY (`tenantId`) REFERENCES `Tenant` (`tenantId`)</font></p> <p><font face="Courier New" size="2">) ENGINE=InnoDB </font></p> <p> </p> <p>Now, the tenant table.  </p> <p> </p> <p><font face="Courier New" size="2">CREATE TABLE  `test`.`Tenant` (</font></p> <p><font face="Courier New" size="2">  `tenantId` int(11) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `tenantName` varchar(32) NOT NULL,</font></p> <p><font face="Courier New" size="2">  PRIMARY KEY  (`tenantId`)</font></p> <p><font face="Courier New" size="2">) ENGINE=InnoDB DEFAULT</font></p> <p> </p> <p>And a customer table with a tenantId column.  </p> <p> </p> <p><font face="Courier New" size="2">CREATE TABLE  `test`.`Customer` (</font></p> <p><font face="Courier New" size="2">  `customerId` int(11) NOT NULL default '0',</font></p> <p><font face="Courier New" size="2">  `customerName` varchar(32) NOT NULL,</font></p> <p><font face="Courier New" size="2">  `tenantId` int(11) NOT NULL default '0',</font></p> <p><font face="Courier New" size="2">  PRIMARY KEY  (`customerId`),</font></p> <p><font face="Courier New" size="2">  KEY `pf_customer_tenant` (`tenantId`),</font></p> <p><font face="Courier New" size="2">  CONSTRAINT `pf_customer_tenant` FOREIGN KEY (`tenantId`) REFERENCES `Tenant` (`tenantId`)</font></p> <p><font face="Courier New" size="2">) ENGINE=InnoDB </font></p> <p> </p> <p>#1) Against a table that is clustered by the TenantId, this query takes 0.5 seconds to run.  As it selects about 1/100 of the Sale Tenant database, the performance isn’t bad.</p> <p> </p> <p>select sum(unit) from SaleTenant  where TenantId = 77;</p> <p> </p> <p>And the query plan shows that the custered index was used to scan on the relevant rows.  </p> <p><font face="Courier New" size="2">*************************** 1. 
row ***************************</font></p> <p><font face="Courier New" size="2">           id: 1</font></p> <p><font face="Courier New" size="2">  select_type: SIMPLE</font></p> <p><font face="Courier New" size="2">        table: SaleTenant</font></p> <p><font face="Courier New" size="2">         type: ref</font></p> <p><font face="Courier New" size="2">possible_keys: PRIMARY</font></p> <p><font face="Courier New" size="2">          key: PRIMARY</font></p> <p><font face="Courier New" size="2">      key_len: 4</font></p> <p><font face="Courier New" size="2">          ref: const</font></p> <p><font face="Courier New" size="2">         rows: 2708904</font></p> <p><font face="Courier New" size="2">        Extra:</font></p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="2"></font></p> <p>#2) The query against the Sale table without the tenantId is about 5 times slower, at 2.5 seconds. </p> <p> </p> <p><font face="Courier New" size="2">select sum(unit) </font></p> <p><font face="Courier New" size="2">  from Customer c </font></p> <p><font face="Courier New" size="2">  join Sale s</font></p> <p><font face="Courier New" size="2">    on c.customerId = s.customerId</font></p> <p><font face="Courier New" size="2">where c.TenantId = 77</font></p> <p> </p> <p>The plan shows that the database drives off the Customer table and joins to the Sale table on the customerId column.  </p> <p> </p> <p><font face="Courier New" size="2">*************************** 1. row ***************************</font></p> <p><font face="Courier New" size="2">           id: 1</font></p> <p><font face="Courier New" size="2">  select_type: SIMPLE</font></p> <p><font face="Courier New" size="2">        table: c</font></p> <p><font face="Courier New" size="2">         type: ref</font></p> <p><font face="Courier New" size="2">possible_keys: PRIMARY,pf_customer_tenant</font></p> <p><font face="Courier New" size="2">          key: pf_customer_tenant</font></p> <p><font face="Courier New" size="2">      key_len: 4</font></p> <p><font face="Courier New" size="2">          ref: const</font></p> <p><font face="Courier New" size="2">         rows: 5040</font></p> <p><font face="Courier New" size="2">        Extra: Using index</font></p> <p><font face="Courier New" size="2">*************************** 2. row ***************************</font></p> <p><font face="Courier New" size="2">           id: 1</font></p> <p><font face="Courier New" size="2">  select_type: SIMPLE</font></p> <p><font face="Courier New" size="2">        table: s</font></p> <p><font face="Courier New" size="2">         type: ref</font></p> <p><font face="Courier New" size="2">possible_keys: pf_sale_customer</font></p> <p><font face="Courier New" size="2">          key: pf_sale_customer</font></p> <p><font face="Courier New" size="2">      key_len: 4</font></p> <p><font face="Courier New" size="2">          ref: test.c.customerId</font></p> <p><font face="Courier New" size="2">         rows: 57</font></p> <p><font face="Courier New" size="2">        Extra:</font></p> <p> </p> <p>#3) Further qualifying by date shows how adding the tenantId to the SaleTenant clustered index further improves performance.   This sql runs in about 0.08 seconds, and is much faster than query #1, which doesn't have the more selective date criteria.  
</p> <p> </p> <p><font face="Courier New" size="2">select sum(unit) from SaleTenant</font></p> <p><font face="Courier New" size="2">where TenantId = 77</font></p> <p><font face="Courier New" size="2">   and purchaseDate >= '2001-06-01'</font></p> <p><font face="Courier New" size="2">   and purchaseDate < '2001-07-01'</font></p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="2"></font></p> <p>The query plan shows that the clustered index of (tenantId, pruchaseDate, orderId) allowed the database to quickly find the relevant rows and  sequentially the relevant rows.  </p> <p> </p> <p><font face="Courier New" size="2">*************************** 1. row ***************************</font></p> <p><font face="Courier New" size="2">           id: 1</font></p> <p><font face="Courier New" size="2">  select_type: SIMPLE</font></p> <p><font face="Courier New" size="2">        table: SaleTenant</font></p> <p><font face="Courier New" size="2">         type: range</font></p> <p><font face="Courier New" size="2">possible_keys: PRIMARY</font></p> <p><font face="Courier New" size="2">          key: PRIMARY</font></p> <p><font face="Courier New" size="2">      key_len: 12</font></p> <p><font face="Courier New" size="2">          ref: NULL</font></p> <p><font face="Courier New" size="2">         rows: 150491</font></p> <p><font face="Courier New" size="2">        Extra: Using where</font></p> <p> </p> <p>4) As the tenant criteria drives off of the customer table, and the date criteria drives off of the sales table, and the optimizer needs to start with one of these two tables, this sql can’t be as efficient as the query #3.  In this case, the optimizer again (like query #2) decides to drive off the customer table.  Unlike query #3, the extra date criteria doesn’t improve performance as it must join to each Sale row to determine if the date is valid, and in this case, only about 1/12 of the rows are relevant.  This means 11/12 of the joins are, in effect, tossed away.   Thus, it has to do the same amount of work as query #2 and runs in about the same time as #2 2.5 seconds, or about 31 times slower than query #3.  </p> <p> </p> <p><font face="Courier New" size="2">select sum(unit)</font></p> <p><font face="Courier New" size="2">  from Customer c</font></p> <p><font face="Courier New" size="2">  join Sale s</font></p> <p><font face="Courier New" size="2">    on c.customerId = s.customerId</font></p> <p><font face="Courier New" size="2">where c.TenantId = 77</font></p> <p><font face="Courier New" size="2">   and purchaseDate >= '2001-06-01'</font></p> <p><font face="Courier New" size="2">   and purchaseDate < '2001-07-01'</font></p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="2">*************************** 1. 
row ***************************</font></p> <p><font face="Courier New" size="2">           id: 1</font></p> <p><font face="Courier New" size="2">  select_type: SIMPLE</font></p> <p><font face="Courier New" size="2">        table: c</font></p> <p><font face="Courier New" size="2">         type: ref</font></p> <p><font face="Courier New" size="2">possible_keys: PRIMARY,pf_customer_tenant</font></p> <p><font face="Courier New" size="2">          key: pf_customer_tenant</font></p> <p><font face="Courier New" size="2">      key_len: 4</font></p> <p><font face="Courier New" size="2">          ref: const</font></p> <p><font face="Courier New" size="2">         rows: 5040</font></p> <p><font face="Courier New" size="2">        Extra: Using index</font></p> <p><font face="Courier New" size="2">*************************** 2. row ***************************</font></p> <p><font face="Courier New" size="2">           id: 1</font></p> <p><font face="Courier New" size="2">  select_type: SIMPLE</font></p> <p><font face="Courier New" size="2">        table: s</font></p> <p><font face="Courier New" size="2">         type: ref</font></p> <p><font face="Courier New" size="2">possible_keys: PRIMARY,pf_sale_customer</font></p> <p><font face="Courier New" size="2">          key: pf_sale_customer</font></p> <p><font face="Courier New" size="2">      key_len: 4</font></p> <p><font face="Courier New" size="2">          ref: test.c.customerId</font></p> <p><font face="Courier New" size="2">         rows: 57</font></p> <p><font face="Courier New" size="2">        Extra: Using where</font></p> <p> </p> <p>So, obviously there are cases where adding a tenantId will make the queries run much faster.  The issue with adding a tenantId to all the tables and clustering on that column is queries that are not limited to a tenant (or set of tenants) will be slower in some cases.  The next two sql statements will show such a case.  </p> <p>5) In this sql a simple sum is run against the Sale table for a month, summing about 1/12 of the table.  As the table is clustered by date the query is fast and runs in about 4.4 seconds.</p> <p> </p> <p><font face="Courier New" size="2">select sum(unit)</font></p> <p><font face="Courier New" size="2">  from Sale s</font></p> <p><font face="Courier New" size="2">where purchaseDate >= '2001-06-01'</font></p> <p><font face="Courier New" size="2">   and purchaseDate < '2001-07-01'</font></p> <p> </p> <p>The query plan shows the clustered index being used to scan the table.  </p> <p> </p> <p><font face="Courier New" size="2">*************************** 1. row ***************************</font></p> <p><font face="Courier New" size="2">           id: 1</font></p> <p><font face="Courier New" size="2">  select_type: SIMPLE</font></p> <p><font face="Courier New" size="2">        table: Sale</font></p> <p><font face="Courier New" size="2">         type: range</font></p> <p><font face="Courier New" size="2">possible_keys: PRIMARY</font></p> <p><font face="Courier New" size="2">          key: PRIMARY</font></p> <p><font face="Courier New" size="2">      key_len: 8</font></p> <p><font face="Courier New" size="2">          ref: NULL</font></p> <p><font face="Courier New" size="2">         rows: 26733636</font></p> <p><font face="Courier New" size="2">        Extra: Using where</font></p> <p> </p> <p>The equivalent query against the SaleTenant table takes 39 seconds to run.  In this case, the entire table must be scanned as there is no date oriented index.   
</p> <p> </p> <p><font face="Courier New" size="2">select sum(unit) from SaleTenant</font></p> <p><font face="Courier New" size="2">where purchaseDate >= '2001-06-01'</font></p> <p><font face="Courier New" size="2">  and purchaseDate < '2001-07-01'</font></p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="2"></font></p> <p>The plan show the full table scan.  </p> <p> </p> <p><font face="Courier New" size="2">*************************** 1. row ***************************</font></p> <p><font face="Courier New" size="2">           id: 1</font></p> <p><font face="Courier New" size="2">  select_type: SIMPLE</font></p> <p><font face="Courier New" size="2">        table: SaleTenant</font></p> <p><font face="Courier New" size="2">         type: ALL</font></p> <p><font face="Courier New" size="2">possible_keys: NULL</font></p> <p><font face="Courier New" size="2">          key: NULL</font></p> <p><font face="Courier New" size="2">      key_len: NULL</font></p> <p><font face="Courier New" size="2">          ref: NULL</font></p> <p><font face="Courier New" size="2">         rows: 123303794</font></p> <p><font face="Courier New" size="2">        Extra: Using where</font></p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="2"><font face="Trebuchet MS">Assuming this cross-tenant access pattern is frequent enough to be a performance problem, a potential solution to this problem is to add a covering index.  However, this might not be an issue as the only organization likely to execute such a query would be the hosting company.  </font></font></p> <p><font size="2">Given enough tenants in a database, adding a tenantId to a mutli-tenant database is recommended as some queries are going to be at least an order of magnitude slower if you don’t.   As denormalizations go, this is a reasonable one as how often would the tenantId (owner) of a row change?  Just be aware that queries that don’t include a tenantId in the criteria can be much slower.   </font></p> <p><font size="2">I didn’t cover this, but partitioning by tenantId also makes sense and can make moving customers from one sharded database to another much easier.  Perhaps more about that later.</font></p>dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com2tag:blogger.com,1999:blog-3635447662505002071.post-75609200935209772872008-02-22T06:43:00.001+00:002008-02-22T11:19:04.986+00:00Clustered indexing and query performance<p>Last time I showed where partitioning could negatively impact performance, with one partitioned query being four times slower than a non-partitioned one when the data was partitioned by the same column as it was clustered by.  This time I’m going to show a way to get better performance by selecting a good clustered index. With the InnoDB, the create table primary key syntax encourages one to create the clustered index the same as the primary key. For transaction systems, in many cases, this makes sense.  But there are times, particularly for reporting systems, when this isn't advisable.  </p> <p>To demonstrate this two similar tables will be created where the only difference is the indexing.  The below SQL shows an one of these tables, a 20 gig, 120 million rows tables, representing one year (about 10 million per month) of data.  This table is clustered by the primary key.  
</p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="1">create table SaleOrderCluster (</font></p> <p><font face="Courier New" size="1">  orderId int not null,</font></p> <p><font face="Courier New" size="1">  customerId int not null,</font></p> <p><font face="Courier New" size="1">  productId int not null,</font></p> <p><font face="Courier New" size="1">  productBigId int not null,</font></p> <p><font face="Courier New" size="1">  unit int not null,</font></p> <p><font face="Courier New" size="1">  purchaseAmount decimal(16,2) not null,</font></p> <p><font face="Courier New" size="1">  purchaseCost decimal(16,2) not null,</font></p> <p><font face="Courier New" size="1">  purchaseDate datetime not null,</font></p> <p><font face="Courier New" size="1">  primary key (orderId),</font></p> <p><font face="Courier New" size="1">  key idx_SaleOrderCluster_purchaseDate (purchaseDate),</font></p> <p><font face="Courier New" size="1">  key idx_SaleOrderCluster_product (productId),</font></p> <p><font face="Courier New" size="1">  key idx_SaleOrderCluster_customer (customerId),</font></p> <p><font face="Courier New" size="1">  constraint pf_SaleOrderCluster_customer foreign key (customerId) references Customer (customerId),</font></p> <p><font face="Courier New" size="1">  constraint pf_SaleOrderCluster_product foreign key (productId) references Product (productId)</font></p> <p><font face="Courier New" size="1">) ENGINE=InnoDB</font></p> <p> </p> <p>But in other cases, it makes sense to cluster the data on another key. In this case, the primary key of purchasedate is used to cluster the data by date as many queries scan a range of dates. There is still the need for a constraint on the orderId column, so a unique index is created on this column. This SQL creates the 28 gig, 120 million row table.  Note that the larger clustered index results in a noticeably larger disk utilization than the previous table (28 vs 20 gig). This is because all the other indexes point to the clustered index, which means the large clustered index increases the size of all the other indexes.  
In both cases the table is much larger than the 4 gig of memory assigned to MySQL.</p> <p> </p> <p><font face="Courier New" size="1">CREATE TABLE  Sale (</font></p> <p><font face="Courier New" size="1">  orderId int not null,</font></p> <p><font face="Courier New" size="1">  customerId int(11) not null,</font></p> <p><font face="Courier New" size="1">  productId int(11) not null,</font></p> <p><font face="Courier New" size="1">  productBigId int(11) not null,</font></p> <p><font face="Courier New" size="1">  unit int(11) not null,</font></p> <p><font face="Courier New" size="1">  purchaseAmount decimal(16,2) not null,</font></p> <p><font face="Courier New" size="1">  purchaseCost decimal(16,2) not null,</font></p> <p><font face="Courier New" size="1">  purchaseDate datetime not null,</font></p> <p><font face="Courier New" size="1">  primary key  (purchaseDate,orderId),</font></p> <p><font face="Courier New" size="1">  unique key idx_sale_order (orderId),</font></p> <p><font face="Courier New" size="1">  key pf_sale_product (productId),</font></p> <p><font face="Courier New" size="1">  key pf_sale_customer (customerId),</font></p> <p><font face="Courier New" size="1">  constraint pf_sale_customer foreign key (customerId)  references Customer (customerId),</font></p> <p><font face="Courier New" size="1">  constraint pf_sale_product foreign key (productId)  references  Product (productId)</font></p> <p><font face="Courier New" size="1">) ENGINE=InnoDB</font> </p> <p> </p> <p>The only difference between these two table is which column is the clustered index.  To test the performance of these two indexes this stored procedure is used to see how fast a single row lookup executed.  </p> <p> </p> <p><font face="Courier New" size="1">CREATE PROCEDURE testClusterSpeed()</font></p> <p><font face="Courier New" size="1">begin</font></p> <p><font face="Courier New" size="1">    declare vorderId INT default 0;</font></p> <p><font face="Courier New" size="1">    declare loopcounter INT default 0;</font></p> <p><font face="Courier New" size="1">    declare vcustomerId INT default 0;</font></p> <p><font face="Courier New" size="1">    repeat</font></p> <p><font face="Courier New" size="1">        set loopcounter = loopcounter + 1;</font></p> <p><font face="Courier New" size="1">        set vorderId  = floor(1 + (rand()*120000000));</font></p> <p><font face="Courier New" size="1">        set vcustomerId = (select customerId from SaleOrderCluster where orderId = vorderId);</font></p> <p><font face="Courier New" size="1">    until loopcounter > 1000000</font></p> <p><font face="Courier New" size="1">    end repeat;</font></p> <p><font face="Courier New" size="1">end </font></p> <p><font face="Courier" size="2"></font></p> <p><font face="Courier" size="2"></font></p> <p><font face="Courier"><font size="2"><font face="Trebuchet MS">The execution plan show it uses the clustered index to find the data.  
The procedure took 983 seconds to query the data.</font>  </font></font></p> <p><font face="Courier" size="2"></font></p> <p><font face="Courier New" size="1">+----+-------------+------------------+-------+---------+---------+-------+------+ <br />| id | select_type | TABLE            | type  | KEY     | key_len | ref   | rows | <br />+----+-------------+------------------+-------+---------+---------+-------+------+ <br />|  1 | SIMPLE      | SaleOrderCluster | const | PRIMARY | 4       | const | 1    |   <br />+----+-------------+------------------+-------+---------+---------+-------+------+</font></p> <p> </p> <p>This is the same basic stored procedure except it is against the table with the purchasedate clustered index.  </p> <p> </p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="1">CREATE PROCEDURE  testNoPartitionSpeed()</font></p> <p><font face="Courier New" size="1">begin</font></p> <p><font face="Courier New" size="1">    declare vorderId INT default 0;</font></p> <p><font face="Courier New" size="1">    declare loopcounter INT default 0;</font></p> <p><font face="Courier New" size="1">    declare vcustomerId INT default 0;</font></p> <p><font face="Courier New" size="1">    repeat</font></p> <p><font face="Courier New" size="1">        set loopcounter = loopcounter + 1;</font></p> <p><font face="Courier New" size="1">        set vorderId  = floor(1 + (rand()*120000000));</font></p> <p><font face="Courier New" size="1">        set vcustomerId = (select customerId from Sale where orderId = vorderId);</font></p> <p><font face="Courier New" size="1">    until loopcounter > 1000000</font></p> <p><font face="Courier New" size="1">    end repeat;</font></p> <p> </p> <p>The plan for this is the expected lookup by the unique idx_sale_order index.</p> <p> </p> <font face="Courier New" size="1">+----+-------------+-------+-------+----------------+---------+-------+------+ <br />| id | select_type | TABLE | type  | KEY            | key_len | ref   | rows | <br />+----+-------------+-------+-------+----------------+---------+-------+------+ <br />|  1 | SIMPLE      | Sale  | const | idx_sale_order | 4       | const | 1    |   <br />+----+-------------+-------+-------+----------------+---------+-------+------+</font> <p> </p> <p>As the unique index needs to point to the clustered index, this stored proc takes about three times longer to run – 2780 seconds.  Stated differently, MySQL first uses the idx_sale_order to find the OrderID, but as all non clustered indexes point to the clustered index, the clustered index must also be read to ultimately find the row.  This last step isn't required if the clustered index is by OrderId as the data is organized by OrderId.  If most of the queries are against OrderId then using OrderId as the clustered index makes sense.  Plus, the larger Sale table with the purchaseDate clustered index, which would result in less cache hits for queries as less of it could fit into memory, can't help performance either.  </p> <p> </p> <p>However, if most of the queries are against purchaseDate the performance of the two indexes flips. The below query runs against the table with the clustered index on the purchaseDate and runs in about 335 seconds. I ran this after restarting MySQL to ensure the data wasn’t in memory. 
If it was in memory it ran in seconds.</p> <p> </p> <p><font face="Courier New" size="2">select sum(unit) from Sale</font></p> <p><font face="Courier New" size="2">where purchaseDate < '2002-01-01'</font></p> <p><font face="Courier New" size="2">   and purchaseDate > '2001-12-01'</font></p> <p> </p> <p><font face="Courier New" size="1">+----+-------------+-------+-------+----------------+---------+-------+----------+ <br />| id | select_type | TABLE | type  | KEY            | key_len | ref   | rows     | <br />+----+-------------+-------+-------+----------------+---------+-------+----------+ <br />|  1 | SIMPLE      | Sale  | const | PRIMARY        | 8       |       | 28836984 |   <br />+----+-------------+-------+-------+----------------+---------+-------+----------+</font> </p> <p> </p> <p>When I ran the corresponding query on the table with the clustered index OrderId, I had to kill this query after 9000 seconds, which is over 27 times longer. This is because the query plan wasn’t very good as MySQL was using the date index to find each row rather than a much faster table scan.</p> <p> </p> <p><font face="Courier New">select sum(unit) from SaleOrderCluster </font></p> <p><font face="Courier New">where purchaseDate < '2002-01-01'</font></p> <p><font face="Courier New">   and purchaseDate > '2001-12-01'</font></p> <p><font face="Courier New"></font></p> <font face="Courier New" size="1">+----+-------------+------------------+-------+-----------------------------------+---------+-------+----------+ <br />| id | select_type | TABLE            | type  | KEY                               | key_len | ref   | rows     | <br />+----+-------------+------------------+-------+-----------------------------------+---------+-------+----------+ <br />|  1 | SIMPLE      | SaleOrderCluster | range | idx_SaleOrderCluster_purchaseDate | 8       |       | 16055948 |   <br />+----+-------------+------------------+-------+-----------------------------------+---------+-------+----------+</font> <p> </p> <p>This is an example where the MySQL query optimizer is a bit immature.  So, I hinted the query to do a table scan and it took 3000 seconds, still about nine times slower than the other index, but much faster than a indexed lookup on 10 million rows.  The reason this is being so slow is the data for a month is distributed throughout the table as the table is organized by orderId, so finding one month of data means the entire table must be scanned.  
</p> <p> </p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="2">select sum(unit) from SaleOrderCluster use index (primary)</font></p> <p><font face="Courier New" size="2"> where purchaseDate < '2002-01-01'</font></p> <p><font face="Courier New" size="2">   and purchaseDate > '2001-12-01'</font></p> <p><font face="Courier New" size="2"></font></p> <font face="Courier New" size="1">+----+-------------+------------------+--------+-----+---------+-------+----------+ <br />| id | select_type | TABLE            | type   | KEY | key_len | ref   | rows     | <br />+----+-------------+------------------+--------+-----+---------+-------+----------+ <br />|  1 | SIMPLE      | SaleOrderCluster | all    |     | 8       |       | 29216712 |   <br />+----+-------------+------------------+--------+-----+---------+-------+----------+</font> <p> </p> <p>Proving this, the time to query the for one month of dates takes a long as a query against the full table, as the below sql takes 3000 seconds to run as well.</p> <p> </p> <p><font face="Courier New" size="2">select sum(unit) from SaleOrderCluster</font></p> <p><font face="Courier New" size="2"></font></p> <font face="Courier New" size="1">+----+-------------+------------------+--------+-----+---------+-------+-----------+ <br />| id | select_type | TABLE            | type   | KEY | key_len | ref   | rows      | <br />+----+-------------+------------------+--------+-----+---------+-------+-----------+ <br />|  1 | SIMPLE      | SaleOrderCluster | all    |     | 8       |       | 129216712 |   <br />+----+-------------+------------------+--------+-----+---------+-------+-----------+</font> <p> </p> <p>As more months are added, assuming the older months aren’t deleted, the SaleOrderCluster table performance of this type of date range queries will degrade even further.  Against the purchaseDate clustered table the same date range queries will run in close to constant time as the table grows, assuming the amount of data for a month remains constant.  This is because, while the size of the purchase date clustered index will be growing, the number of rows for the month of date being range scanned in that index will remain constant.  Partitioning by date in this case won't provide performance benefits as the data is already organized by date.  </p> <p>So it seems there are two choices, either optimize for random reads by the real primary key of OrderId, or, optimize for date range queries.  There is a third alternative of adding a covering index on the all the columns being queried, not just the purchase date, on the SaleOrderCluster table, with the first column of the index being the PurchaseDate.  In this case all the above queries will be fast as both queries will be answered fully by a index, but the downside is the inserts against a larger covering index will be slower.  Perhaps I shall investigate that next time.  </p>dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com0tag:blogger.com,1999:blog-3635447662505002071.post-80361977057502599652008-01-16T13:45:00.001+00:002008-01-16T16:34:39.178+00:00Divide and be conquered?<p>Over the past four articles I've demonstrated cases where a denormalized data model is faster than a normalized one, but often not by that much, and a case where a normalized data model was a bit faster than a denormalized one.  
My general conclusion was that with today's optimizers one should target a normalized data model and then denormalize where it makes sense, even for reporting.  I'm not as much of a fan of the star schema, a heavily denormalized data model popularized by Ralph Kimball, as I used to be.  Star schemas are costly to build and maintain and the time spent creating them can often be spent better on more productive optimizations, such as the creation of summary tables and better indexing.  I'm not saying that denormalization doesn't make sense in some cases, just that it doesn't make sense in all cases.  </p> <p>Time to move on to the topic of this article, partitioning.  </p> <p>There are many good articles out there on the new MySQL 5.1 partitioning feature so I'm not going to repeat them.  One of my favorites is <a href="http://dev.mysql.com/tech-resources/articles/mysql_5.1_partitions.html">http://dev.mysql.com/tech-resources/articles/mysql_5.1_partitions.html</a>.  However, I want to point out how much partitions can degrade performance as I haven't seen that highlighted enough.  The difference between my tests and other tests I've seen is the below tests deal with larger data volumes and result in more physical disk reads.  As the tables that are most likely to be partitioned are the larger tables, the tables that won't fit into memory, I believe these tests better represent reality.  Or at least my reality.  As I currently use InnoDB these tests will be run on it.  </p> <p>Right below is the non-partitioned, 10.5 gig, 60 million row table, which has a clustered index (primary key) of purchaseDate and orderId.  MySQL has 1 gig of memory assigned to it so most of the table won't fit into memory.  </p> <p><font face="Courier New" size="2">CREATE TABLE  sale (  <br />    orderId int(11) NOT NULL,  <br />    customerId int(11) NOT NULL,  <br />    productId int(11) NOT NULL,  <br />    productBigId int(11) NOT NULL,  <br />    unit int(11) NOT NULL,  <br />    purchaseAmount decimal(16,2) NOT NULL,  <br />    purchaseCost decimal(16,2) NOT NULL,  <br />    purchaseDate datetime NOT NULL,  <br />    primary key (purchaseDate,orderId),  <br />    unique key idx_sale_order (orderId),  <br />    key pf_sale_product (productId),  <br />    key pf_sale_customer (customerId),  <br />    constraint pf_sale_product foreign key (productId) references product (productId) <br />) engine=InnoDB</font> </p> <p></p> <p>This is the partitioned version of the above table.  The numbers in the partition by range clause represent monthly partitions of '2001-01-01', '2001-02-01', '2001-03-01', etc.  Also, the unique key on orderId is no longer possible due to a partitioning limitation so it was changed to a non-unique index.  
</p> <p><font size="2"><font face="Courier New" size="2">create table salepartition ( <br />    orderId int(11) not null, <br />    customerId int(11) not null, <br />    productId int(11) not null, <br />    productId int(11) not null, <br />    unit int(11) not null, <br />    purchaseAmount decimal(16,2) not null, <br />    purchaseCost decimal(16,2) not null, <br />    purchaseDate datetime not null, <br />    primary key (purchaseDate,orderId),  <br />    key idx_salePartition_order (orderId), <br />    key pf_salePartition_product (productId), <br />    key pf_salePartition_customer (customerId) <br />) engine engine =InnoDB <br />  partition by range (to_days(purchaseDate))  <br />   (partition p0 values less than (730882) engine = InnoDB,  <br />    partition p1 values less than (730910) engine = InnoDB,  <br />    partition p2 values less than (730941) engine = InnoDB,  <br />    partition p3 values less than (730971) engine = InnoDB,  <br />    partition p4 values less than (731002) engine = InnoDB,  <br />    partition p5 values less than (731032) engine = InnoDB,  <br />    partition p6 values less than (731063) engine = InnoDB,  <br />    partition p7 values less than (731094) engine = InnoDB,  <br />    partition p8 values less than (731124) engine = InnoDB,  <br />    partition p9 values less than (731155) engine = InnoDB,  <br />    partition p10 values less than (731185) engine = InnoDB,  <br />    partition p11 values less than (731216) engine = InnoDB,  <br />    partition p12 values less than maxvalue engine = InnoDB) <br /> </font></font> </p> <p>What I did to test was to to write two short programs that would query the same customerId column of both tables.   The below program runs against the non-partitioned table and runs in about 18 seconds. </p> <p> <br /><font face="Courier New">create procedure testNoPartitionSpeed() <br />begin <br />    declare vorderId INT default 0; <br />    declare loopcounter INT default 0; <br />    declare vcustomerId INT default 0; <br />    repeat <br />        set loopcounter = loopcounter + 1; <br />        set vorderId  = floor(1 + (rand()*60000000)); <br />        set vcustomerId = (select customerId from sale where orderId = vorderId); <br />    until loopcounter > 1000 <br />end</font></p> <p><font face="Courier New">The and the plan is simple.  </font></p> <p><font face="Courier New"></font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------------------+---------+------------------+----------+ <br />| id | select_type | TABLE | type     | KEY                  | key_len | ref              | rows     | <br />+----+-------------+-------+----------+----------------------+---------+------------------+----------+ <br />|  1 | SIMPLE      | sale  | const    | 'idx_sale_order'     | 4       | const            | 1        |   <br />+----+-------------+-------+----------+----------------------+---------+------------------+----------+ </font></p> <p><font face="Courier New"></font></p> <p>Next I ran a similar program against the partitioned table and it ran in about 77 seconds, about four times slower. 
</p> <p><font face="Courier New" size="2">create procedure testPartitionSpeed() <br />begin <br />    declare vorderId INT default 0; <br />    declare loopcounter INT default 0; <br />    declare vcustomerId INT default 0; <br />    repeat <br />        set loopcounter = loopcounter + 1; <br />        set vorderId  = floor(1 + (rand()*60000000)); <br />        set vcustomerId = (select customerId from salepartition where orderId = vorderId); <br />    until loopcounter > 1000 <br />    end repeat; <br />end </font></p> <p><font face="Courier New" size="1">+----+------------+---------------+----------------------+------+--------------------+--------+-------+------+ <br />| id | selecttype | TABLE         | partitions           | type | KEY                | keylen | ref   | rows | <br />+----+------------+---------------+----------------------+---------------------------+--------+-------+------+ <br />|  1 | SIMPLE     | salepartition | p0,p1,p2,p3,p4,p5,p6,| ref  | idx_salepart_order | 4      | const | 13   | <br />|    |            |               | p7,p8,p9,p10,p11,p12 |      |                    |        |       |      |    <br />+----+------------+---------------+----------------------+------+--------------------+--------+-------+------+</font></font> </font></p> <p></p> <p></p> <p><font face="Courier New"></font></p> <p>The reason this slow performance is the salePartition table is composed of 13 partitions, 12 with actual data, and each one of which has a separate idx_salePartition_customer index.  This means when a query needs to find a customerId it needs to probe into all 13 partitions, which shows up as the 13 in the explain rows column.  This performance hit is why some databases have global indexes that cross all partitions.  MySQL doesn't have this functionality yet, but since SQL Server just added partitioning functionality in late 2005 MySQL isn't far behind. </p> <p>Querying by a non-partitioned index was a severe performance hit, but the partition by date will improve queries that limit the data by date, right?  Well, unfortunately not.  The below query runs in four seconds. </p> <p><font face="Courier New" size="2">select sum(unit) from sale <br />where purchaseDate >= '2001-09-01' <br />   and purchaseDate < '2001-10-01'</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------------------+---------+------------------+----------+ <br />| id | select_type | TABLE | type     | KEY                  | key_len | ref              | rows     | <br />+----+-------------+-------+----------+----------------------+---------+------------------+----------+ <br />|  1 | SIMPLE      | sale  | range    | PRIMARY              | 4       | null             | 9907110  |   <br />+----+-------------+-------+----------+----------------------+---------+------------------+----------+ </font></p> <p><font face="Courier New"></font></p> <p>And this query also runs in about four seconds. 
</p> <p><font face="Courier New" size="2">select sum(unit) from salepartition <br />where purchaseDate >= '2001-09-01' <br />   and purchaseDate < '2001-10-01'</font></p> <p><font face="Courier New" size="1">+----+-------------+----------------+------------+---------+---------+---------+------------------+----------+ <br />| id | select_type | TABLE          | partitions | type    | KEY     | key_len | ref              | rows     | <br />+----+-------------+----------------+------------+-------------------+---------+------------------+----------+ <br />|  1 | SIMPLE      | salepartition  | 'p8,p9'    | range   | primary | 8       | null             | 2464709  |   <br />+----+-------------+----------------+------------+---------+---------+---------+------------------+----------+ </font></p> <p>Both queries run in the same amount of time as the primary key already orders and therefore limits the data by date, meaning the partition doesn't have any performance benefit in this case. </p> <p>For InnoDB (other databases will behave differently, I didn't test them), where the partitioning is by the leading column of the primary key, partitioning isn't going to provide performance benefits and will instead decrease performance in some cases.  This assumes if multiple disks are used to store a table all of the table, partitioned or not, is striped across all the disks (as compared to putting a partition on each disk, which is suboptimal as the load is unlikely to be evenly distributed over the disks).  </p> <p>There are still solid operational reasons to implement partitioning, such as making it easier to backup seperate partitions.  But as others have said, if you aren't careful and test you can find partitioning will degrade performance.   <font face="Courier New"> </font></p> <p><font face="Courier New"> </font></p> <p><font face="Courier New"> </font></p> <p><font face="Courier New"> </font></p> <p><font face="Courier New"> </font></p> <p><font face="Courier New"> </font></p> <p><font face="Courier New"></font></p>dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com0tag:blogger.com,1999:blog-3635447662505002071.post-10392484204338112542007-12-14T17:33:00.001+00:002007-12-20T04:58:05.037+00:00A setting star<p>On my second blog I stated I would demonstrate a case where a denormalized Star schema is 50 times slower than a normalized data model.  Well, while writing this I looked at the tables again and realized that I didn't have optimal indexes and my statistics weren't optimal.  Yeah, I feel a bit foolish.  Once I fixed those issues there wasn't a performance difference.  I'll go more into this during the post test discussion.  </p> <p>First, these are the table that will be used to test the performance of the normalized and denormalized data models.  The ProductNormal table represents the normalized table, has 10 million rows, and is about 0.8 gig.  
</p> <p> </p> <p><font face="Courier New" size="2">create table ProductNormal ( <br />     productId int(11) not null, <br />     productName varchar(32) not null, <br />     productGroupId int(11) not null, <br />     productBrandId int(11) not null, <br />     primary key (productId), <br />     key `idx_ProductNormal_GroupIdBrandId` (`productGroupId`,`productBrandId`), <br />     key `idx_ProductNormal_BrandGroup` (`productBrandId`,`productGroupId`),  <br />     constraint `fk_ProductNormal_productBrandId`  <br />         foreign key (`productBrandId`) REFERENCES     `ProductBrand` (`productBrandId`),  <br />     constraint `fk_ProductNormal_productGroupId`  <br />         foreign key (`productGroupId`) REFERENCES `ProductGroup` (`productGroupId`) <br />) engine=InnoDB </font></p> <p></p> <p> </p> <p>The small ProductBrand table has 11 rows.   </p> <p> </p> <p><font face="Courier New" size="2">create table ProductBrand ( <br />    productBrandId int(11) not null, <br />    productBrandName varchar(32) not null, <br />    primary key (productBrandId), <br />    unique key idx_ProductBrand_Name (ProductBrandName) <br />) engine=InnoDB</font></p> <p> </p> <p>The ProductGroup table has 101 rows.   </p> <p> </p> <p><font face="Courier New" size="2">create table ProductGroup ( <br />    productGroupId int(11) not null, <br />    productGroupName varchar(32) not null, <br />    primary key (productGroupId), <br />    unique key idx_ProductGroup_Name (ProductGroupName) <br />) engine=InnoDB</font></p> <p> </p> <p>The 10 million row ProductDenorm combines the previous three tables into a 1.27 gig table.  </p> <p> <br /><font face="Courier New" size="2">create table ProductDenorm ( <br />    productId int(11) not null, <br />    productName varchar(32) not null, <br />    productGroupId int(11) not null, <br />    productGroupName varchar(32) not null, <br />    productBrandId int(11) not null, <br />    productBrandName varchar(32) not null, <br />    primary key (productId), <br />    key idx_ProductDenorm_GroupNameBrandName (productGroupName, productBrandName), <br />    key idx_ProductDenorm_BrandNameGroupName (productBrandName, productGroupName) <br />) engine=InnoDB</font></p> <p> </p> <p>The 120 million row Sale table is about 30 gig.  </p> <p> <br /><font face="Courier New" size="2">create table Sale ( <br />  orderId int(11) not null, <br />  customerId int(11) not null, <br />  productId int(11) not null, <br />  productBigId int(11) not null, <br />  unit int(11) not null, <br />  purchaseAmount decimal(16,2) not null, <br />  purchaseCost decimal(16,2) not null, <br />  purchaseDate datetime not null, <br />  primary key  (purchaseDate,orderId), <br />  unique key idx_sale_order (orderId), <br />  key idx_sale_product (productId), <br />  key idx_sale_customer (customerId), <br />  constraint pf_sale_customer foreign key (customerId) references Customer (customerId), <br />  constraint pf_sale_product foreign key  (productId) references Product (productId) <br />) engine=InnoDB </font></p> <p> </p> <p>First, the below denormalized version of the query takes about 0.39 seconds, averaged over 100 executions.  The rest of the queries are also averaged over 100 runs.  What is interesting is the driving table of the plan is not the sale table, but the product table.  There are two basics query plans the optimizer can decide on.  The first is to drive off of the Sale fact table, which was covered in previous blogs.  
The second is to drive off of the dimension (descriptive tables) like the ProductGroup table.  After this article, both cases will have been tested.  </p> <p> </p> <p><font face="Courier New" size="2">select sum(unit) from Sale s <br />  join ProductDenorm p <br />    on s.productId = p.productId <br />where p.productBrandName = 'ProductBrand 1' <br />   and p.productGroupName = 'Group 1' <br />   and purchaseDate > '2001-12-30'</font></p> <p></p> <p> </p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+---------------------+---------+-------------+-------+------------+ <br />| id | select_type | table | type     | key                 | key_len | ref         | rows  | extra <br />+----+-------------+-------+----------+---------------------+---------+-------------+-------+------------+ <br />|  1 | SIMPLE      | p     | ref      | idx_ProductDenorm_  | 70      | const,const | 13608 | Using where <br />|    |             |       |          | BrandNameGroupName  |         |             |       |  <br />+----+-------------+-------+----------+---------------------+---------+-------------+-------+------------+ <br />|  1 | SIMPLE      | s     | ref      | pf_sale_product     | 4       | p.productId | 10    | Using where <br />+----+-------------+-------+----------+---------------------+---------+-------------+-------+------------+ </font></p> <p></p> <p> </p> <p>The normalized version also takes about 0.39 seconds.  The plan is similar to the previous one, but the driving table is the normalized ProductGroup and not the denormalized ProductDenorm.   In both cases the last table in the plan is the Sale table.  </p> <p> </p> <p><font face="Courier New" size="2">select sum(unit) <br />  from Sale s <br />  join ProductNormal p <br />    on s.productId = p.productId <br />  join ProductGroup pg <br />    on p.productGroupId = pg.productGroupId <br />  join ProductBrand pb <br />    on p.productBrandId = pb.productBrandId  <br />where pb.productBrandName = 'ProductBrand 1' <br />   and pg.productGroupName = 'Group 1' <br />   and purchaseDate > '2001-12-30';</font></p> <p> </p> <p><font face="Courier New" size="1">+----+-------------+-------+------+------------------------------+---------+--------------------+------+------------+ <br />| id | select_type | table | type | key                          | key_len | ref                | rows | extra <br />+----+-------------+-------+------+------------------------------+---------+--------------------+------+------------+ <br />|  1 | SIMPLE      | pg    | ref  | idx_ProductGroup_Name        | 34      | const              | 1    | where,index  <br />+----+-------------+-------+------+------------------------------+---------+--------------------+------+------------+ <br />|  1 | SIMPLE      | pb    | ref  | idx_ProductBrand_Name        | 34      | const              | 1    | where,index <br />+----+-------------+-------+------+------------------------------+---------+--------------------+------+------------+ <br />|  1 | SIMPLE      | p     | ref  | idx_ProductNormal_BrandGroup | 8       | pb.productBrandId, | 3258 | where,index  <br />|    |             |       |      |                              |         | pg.productGroupId  |      | <br />+----+-------------+-------+------+------------------------------+---------+--------------------+------+------------+ <br />|  1 | SIMPLE      | s     | ref  | pf_sale_product              | 4       | p.productId        | 10   | where,index <br 
/>+----+-------------+-------+------+------------------------------+---------+--------------------+------+------------+ </font></p> <p></p> <p><font face="Courier New" size="1"></font></p> <p> </p> <p>How about the same basic query, but for a month of data?  It also takes about .39 seconds.  </p> <p> </p> <p><font face="Courier New" size="2">select sum(unit) from Sale s <br />  join ProductDenorm p <br />    on s.productId = p.productId <br />where p.productBrandName = 'ProductBrand 1' <br />   and p.productGroupName = 'Group 1' <br />   and purchaseDate > '2001-12-01'</font></p> <p></p> <p> </p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+--------------------+---------+-------------+----------+-----------+ <br />| id | select_type | table | type     | key                | key_len | ref         |          | extra <br />+----+-------------+-------+----------+--------------------+---------+-------------+----------+-----------+ <br />|  1 | SIMPLE      | p     | ref      | idx_ProductDenorm_ | 70      | const,const | 13608    | where <br />|    |             |       |          | BrandNameGroupName |         |             |          | <br />+----+-------------+-------+----------+--------------------+---------+-------------+----------+-----------+ <br />|  1 | SIMPLE      | s     | ref      | pf_sale_product    | 4       | p.productId | 10       | where <br />+----+-------------+-------+----------+--------------------+---------+-------------+----------+-----------+ </font></p> <p></p> <p> </p> <p>The normalized version also takes about .39 seconds against a month of data.   So, at least for these simple queries the normalized data model is just as fast as the denormalized one.  </p> <p> </p> <p></p> <p><font face="Courier New" size="2">select sum(unit) from Sale s <br />  join ProductNormal p <br />    on s.productId = p.productId <br />  join ProductGroup pg <br />    on p.productGroupId = pg.productGroupId <br />  join ProductBrand pb <br />    on p.productBrandId = pb.productBrandId <br />where pb.productBrandName = 'ProductBrand 1' <br />   and pg.productGroupName = 'Group 1' <br />   and purchaseDate > '2001-12-01'</font></p> <p> </p> <p><font face="Courier New" size="1">+----+-------------+-------+------+------------------------------+---------+--------------------+------+------------+ <br />| id | select_type | table | type | key                          | key_len | ref                | rows | extra <br />+----+-------------+-------+------+------------------------------+---------+--------------------+------+------------+ <br />|  1 | SIMPLE      | pg    | ref  | idx_ProductGroup_Name        | 34      | const              | 1    | where,index  <br />+----+-------------+-------+------+------------------------------+---------+--------------------+------+------------+ <br />|  1 | SIMPLE      | pb    | ref  | idx_ProductBrand_Name        | 34      | const              | 1    | where,index <br />+----+-------------+-------+------+------------------------------+---------+--------------------+------+------------+ <br />|  1 | SIMPLE      | p     | ref  | idx_ProductNormal_BrandGroup | 8       | pb.productBrandId, | 3258 | where,index <br />|    |             |       |      |                              |         | pg.productGroupId  |      | <br />+----+-------------+-------+------+------------------------------+---------+--------------------+------+------------+ <br />|  1 | SIMPLE      | s     | ref  | pf_sale_product              | 4       | p.productId       
 | 10   | where,index <br />+----+-------------+-------+------+------------------------------+---------+--------------------+------+------------+</font> </p> <p> </p> <p>Over the course of four blog entries, tests have shown that joins are expensive, but in spite of that, denormalized data isn't that much faster than a normalized data model.  The best case was when a denormalized table was a bit more than twice as fast as a few normalized tables.  Then there was the case where a normalized data model was faster than the denormalized one because the normalized one used memory more efficiently.  And finally, when the optimizer drives off of the dimensional (descriptive) data and not the fact table, there isn't a noticeable difference in performance.  I know there are exceptions where normalized data model performance is worse than what I've shown; but they do appear to be exactly that, exceptions.  </p> <p>I could run more complex tests, and I probably will over time, but there are enough of them to start drawing tentative conclusions.   In the transactional world the general rule is to normalize and only denormalize when a specific query or set of queries warrants it.  Given that denormalization can degrade performance at times, this is wise.  The time spent denormalizing the data model would be better spent adding better indexes and getting the statistics right.  </p> <p>For reporting systems I would, hesitantly, recommend the same practice - normalize first and then denormalize when required.  I believe the Star schema made sense when it was first proposed a decade ago, when database optimizers were weak.  A denormalized data model has fewer join permutations to consider, so an optimizer was better able to find a fast query plan when the data model was simple.   </p> <p>Over the past decade, database optimizers have improved and are better able to handle the complexity of a normalized data model.  When I was working on this article my normalized data model was initially much slower than my denormalized version.  One of the main reasons my normalized data model was slow was that my statistics weren't that good.  With better stats, MySQL, which isn't known to have the best optimizer, was at times able to find a query plan for the normalized model that was just as fast as the denormalized version.  If this is true for the youthful MySQL, it should be true for the more established databases like SQL Server and Oracle, which are better at creating stats.  </p> <p>It takes considerable effort to develop and populate a Star Schema.  I now believe that, in many cases at least, a normalized data model is best for reporting, with denormalization applied on an exceptional basis where it makes sense.  Develop the physical data model for reporting systems just like you would a transactional physical data model.  If denormalization makes an important query much faster, then denormalize.  </p> <p>This may be reflective of the start up environments where I work, where we have reasonably simple data models of a hundred or so tables, but some of those tables have up to a billion rows.  But based on this experience and these tests, the time spent denormalizing the data model would be better spent on creating better indexes and finding and populating summary tables.  This is where the huge, orders of magnitude, improvements are to be found.  Yes, a star schema would improve performance of that important query, but would a summary table be even better?  
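</p> <p>To make that last point concrete, here is a sketch of the kind of summary table I have in mind.  The SaleBrandGroupDay name and columns are mine, not something from the tests above: it pre-aggregates units by day, brand, and group, so a question like the brand and group query earlier reads a far smaller number of pre-aggregated rows instead of millions of Sale rows.  The cost, of course, is that the summary has to be refreshed as new sales arrive.</p> <p><font face="Courier New" size="2">create table SaleBrandGroupDay ( <br />    purchaseDay date not null, <br />    productBrandId int(11) not null, <br />    productGroupId int(11) not null, <br />    unitSum bigint not null, <br />    primary key (purchaseDay, productBrandId, productGroupId) <br />) engine=InnoDB; <br /> <br />insert into SaleBrandGroupDay <br />select date(s.purchaseDate), p.productBrandId, p.productGroupId, sum(s.unit) <br />  from Sale s <br />  join ProductNormal p <br />    on s.productId = p.productId <br /> group by date(s.purchaseDate), p.productBrandId, p.productGroupId;</font></p> <p>The brand and group reports above could then join this much smaller table to ProductBrand and ProductGroup rather than touching the Sale table at all.</p> <p>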
</p> <p>I'm sure there are exceptions to this; with software as complex as a database, and some of those huge and complex databases out there, there are always exceptions.  MySQL is still young and, for example, the statistics functionality of MySQL isn't reliable enough for this to be true in all cases.  I believe a simple reporting application that has a limited number of predefined reports can get by with a normalized data model that is selectively denormalized when required, but a complex reporting application with frequent ad hoc reporting needs would still do better with a Star schema.  Where the dividing line falls is a judgment call, but if in doubt I would start with a normalized data model and iterate to a more denormalized one.  </p> <p>Yet as the capabilities of the optimizers continue to improve, I suspect the development of reporting systems should shift to a more normalized data modeling approach.  </p> dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com0tag:blogger.com,1999:blog-3635447662505002071.post-1550462379971617292007-12-12T17:52:00.001+00:002007-12-14T15:14:03.311+00:00Denormalized data, not so fast?<p>The last article demonstrated that a denormalized data model was a bit faster than a normalized data model when all the data fit into memory, but only by 20-30 percent or so, at least for the experiments I ran.  This article deals with larger data volumes and shows a case where normalized data is faster than denormalized data, and then details another case where a denormalized data model is faster.  </p> <p>First, this is what the tables in question look like.  All the product tables have 10 million rows.  The denormalized ProductAll table is 5.2 gig, and as innodb_buffer_pool_size is set to 4 gig, this table won't fit into memory.  The filler column represents other product attributes without having to specify them in detail.</p> <p> </p> <p><font face="Courier New" size="2">create table ProductAll ( <br />    productAllId int(11) not null, <br />    productAllName varchar(32) not null, <br />    productGroupId int(11) not null, <br />    productGroupName varchar(32) not null, <br />    filler varchar(4000) not null, <br />    productBrandId int(11) not null, <br />    productBrandName varchar(32) not null, <br />    primary key (productAllId) <br />) engine=InnoDB</font></p> <p> </p> <p>The normalized ProductLess table is 4.9 gig, a bit less than the ProductAll table, but it still has the filler column.    </p> <p> <br /><font face="Courier New" size="2">create table ProductLess ( <br />    productId int(11) not null, <br />    productName varchar(32) not null, <br />    productGroupId int(11) not null, <br />    productBrandId int(11) not null, <br />    filler varchar(4000) not null, <br />    primary key (`productId`) <br />) engine=InnoDB</font></p> <p> </p> <p>The 4.6 gig ProductBig table will be used to explore how joining two large tables, both larger than the memory allocated to MySQL, performs.  
</p> <p> <br /><font face="Courier New" size="2">create table ProductBig ( <br />    productBigId int(11) not null, <br />    filler varchar(4000) default NULL, <br />    productBrandId int(11) not null, <br />    PRIMARY KEY (productBigId), <br />    CONSTRAINT on_ProductBig_productBigId FOREIGN KEY (productBigId) REFERENCES Product (productId) <br />) ENGINE=InnoDB</font></p> <p> </p> <p>The 9.2 gig Product2Filler table is a combination of the ProductBig and the ProductLess tables.  Where the filler columns of the ProductBig and the ProductLess tables are each 384 characters, the Product2Filler filler column is twice that, at 768 characters.</p> <p> <br /><font face="Courier New" size="2">create table Product2Filler ( <br />    productId int(11) not null, <br />    productName varchar(32) not null, <br />    productGroupId int(11) not null, <br />    productBrandId int(11) not null, <br />    filler varchar(4000) not null, <br />    primary key  (`productId`) <br />) engine=InnoDB;</font>  </p> <p> </p> <p>The Sale table has 120 million rows and is about 30 gig.   Even the one month of it I query against (out of 12 months, or 1/12 of the data), combined with the product tables, doesn't fit into memory.</p> <p> </p> <p><font face="Courier New" size="2">create table Sale ( <br />  orderId int(11) not null, <br />  customerId int(11) not null, <br />  productId int(11) not null, <br />  productBigId int(11) not null, <br />  unit int(11) not null, <br />  purchaseAmount decimal(16,2) not null, <br />  purchaseCost decimal(16,2) not null, <br />  purchaseDate datetime not null, <br />  primary key  (purchaseDate,orderId), <br />  unique key idx_sale_order (orderId), <br />  key pf_sale_product (productId), <br />  key pf_sale_customer (customerId), <br />  constraint `pf_sale_customer` foreign key (customerId) references Customer (customerId), <br />  constraint `pf_sale_product` foreign key (productId) references Product (productId) <br />) engine=InnoDB</font></p> <p> </p> <p>The tiny ProductGroup table is just 101 rows.  </p> <p> <br /><font face="Courier New" size="2">create table ProductGroup ( <br />    productGroupId int(11) not null, <br />    productGroupName varchar(32) not null, <br />    primary key  (productGroupId) <br />) engine=InnoDB</font></p> <p> </p> <p>On to the actual SQL.  The below denormalized query takes about 4380 seconds.  As all the data doesn't fit into memory, MySQL needs to do some serious reading, which shows up in the iostat numbers; all the below queries are read bound.  The buffer pool hit rate of 950/1000 (or so - I averaged a few samples) verifies the heavy io.  This is a huge performance difference from the blog of December 5, where the equivalent sql, running against product tables that fit into memory and don't have the filler column, ran in 25 seconds.  
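</p> <p>As an aside before the query itself: the hit rate figures quoted here come from "show engine innodb status", whose "Buffer pool hit rate n / 1000" line is computed over the interval since the status was last displayed.  Roughly the same number can be derived from the server counters; the following is one way to sample it, shown as a sketch rather than the exact commands used for the numbers above.</p> <p><font face="Courier New" size="2">show engine innodb status; <br />-- look for a line like "Buffer pool hit rate 950 / 1000" <br /> <br />show global status like 'Innodb_buffer_pool_read_requests'; <br />show global status like 'Innodb_buffer_pool_reads'; <br />-- sample both counters twice, a minute or so apart, then: <br />--   hit rate (per 1000) = 1000 * (1 - delta(reads) / delta(read_requests))</font></p> <p>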
</p> <p> </p> <p><font face="Courier New" size="2">select productGroupName, sum(unit) <br />  from Sale s <br />  join ProductAll p <br />    on s.productId = p.productAllId <br /> where purchaseDate >= '2001-12-01' <br /> group by productGroupName;</font></p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font face="Courier New" size="1">| id | select_type | TABLE | type     | KEY      | key_len | ref              | rows     | Extra             |</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | s     | range    | PRIMARY  | 8       | null             | 30931362 | Using filesort... |</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | p     | eq_ref   | PRIMARY  | 4       | s.productId      | 1        | null              |</font></p> <p><font size="1"><font face="Courier New">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+</font> </font></p> <p> </p> <p><font face="Courier New" size="2"></font></p> <p>At 3790 seconds, the normalized sql runs a bit faster than the above denormalized version.  Like the previous query, It also is disk bound, but it has a better buffer hit rate of 972 / 1000 (or so).  As the data is smaller it is better able to fit into memory, which explains the higher buffer hit rate and the better performance?  I'm not convinced this is true as it needs to do more reads the get the small productGroup data, which is likely to be in memory all the time even with the MySQL least recently used caching - does anyone have insight into this?</p> <p> </p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="2">select productGroupName, sum(unit) <br />  from Sale s <br />  join ProductLess p <br />    on s.productId = p.productId <br />  join ProductGroup pg <br />    on p.productGroupId = pg.productGroupId <br /> where purchaseDate >= '2001-12-01' <br /> group by productGroupName</font></p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font face="Courier New" size="1">| id | select_type | TABLE | type     | KEY      | key_len | ref              | rows     | Extra             |</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | s     | range    | PRIMARY  | 8       | null             | 30931362 | Using filesort... 
|</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | p     | eq_ref   | PRIMARY  | 4       | s.productId      | 1        | null              |</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font face="Courier New" size="1">  1 | SIMPLE       | pg    | eq_ref   | PRIMARY  | 4       | p.productGroupId | 1        | null              |</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+</font> </p> <p><font face="Courier New" size="2"></font></p> <p> </p> <p>From this one might expect that a normalized data model is faster in all cases when dealing with large data volumes.  But how about joining a large table, not only to a small table in memory table, but also to another large table.  The below join of the ProductLess table to the ProductBig table takes 20,429 seconds.  <font face="Courier New" size="2">  </font></p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="2">select productGroupName, sum(unit) <br />  from Sale s <br />  join ProductLess p <br />    on s.productId = p.productId <br />  join ProductGroup pg <br />    on p.productGroupId = pg.productGroupId <br />  join ProductBig pb <br />    on p.productId = pb.productBigId <br /> where purchaseDate >= '2001-12-01' <br /> group by productGroupName</font></p> <p><font face="Courier New" size="2"></font></p> <p> </p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font face="Courier New" size="1">| id | select_type | TABLE | type     | KEY      | key_len | ref              | rows     | Extra             |</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | s     | range    | PRIMARY  | 8       | null             | 30931362 | Using filesort... 
|</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | pb    | eq_ref   | PRIMARY  | 4       | s.productId      | 1        | Using index       |</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | p     | eq_ref   | PRIMARY  | 4       | pb.productBigId  | 1        | Using where       |</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font face="Courier New" size="1">  1 | SIMPLE       | pg    | eq_ref   | PRIMARY  | 4       | p.productGroupId | 1        |                   |</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font size="1"> </font></p> <p>When the ProductLess table and the ProductBig tables are combined into the Product2Filler table similar SQL takes 9219 seconds, about half as long.  In at least in some cases, joining two large tables, tables that can't fully fit into memory, a combined data model is faster.  I know this isn't really a denormalization as both are keyed by productId, but I believe it has the same performance characteristics as a denormalized data model.  At least it shows an example of what happens when two large tables are combined into one even larger one.  </p> <p> </p> <p><font face="Courier New" size="2">select productGroupName, sum(unit) <br />  from Sale s <br />  join Product2Filler p <br />    on s.productId = p.productId <br />  join ProductGroup pg <br />    on p.productGroupId = pg.productGroupId <br /> where purchaseDate >= '2001-12-01' <br /> group by productGroupName</font></p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="2"></font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font face="Courier New" size="1">| id | select_type | TABLE | type     | KEY      | key_len | ref              | rows     | Extra             |</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | s     | range    | PRIMARY  | 8       | null             | 30931362 | Using filesort... 
|</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | p     | eq_ref   | PRIMARY  | 4       | s.productId      | 1        |                   |</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+ </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | pg    | eq_ref   | PRIMARY  | 4       | p.productGroupId | 1        |                   |</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+</font> </p> <p> </p> <p>As a side note, the below query, run against data that is not cached (different dates than the other queries), runs in 340 seconds.  It looks like the cost of a sequential table (range) scan is insignificant compared to the random index reads of another table that can't fit into memory.  Stated differently, joins are at times costly.</p> <p> </p> <p><font face="Courier New" size="2">select count(*) from Sale <br /> where purchaseDate >= '2001-03-01' <br />   and purchaseDate < '2001-04-01'</font></p> <p> </p> <p>What has been shown is that normalized data is sometimes faster than denormalized data, and sometimes combined data is faster.   So far the case for the denormalized data model isn't that persuasive.  This is true both when all the data can fit into memory and when it can't and many reads are required.  Why, then, do so many people use star schemas for reporting systems?  In the next article there will be a case where a denormalized data model is a bit faster than what has been shown so far.  </p>dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com0tag:blogger.com,1999:blog-3635447662505002071.post-63624049694695142682007-12-05T15:43:00.001+00:002007-12-05T23:54:51.707+00:00Denormalized data is a bit faster than normalized data (but by enough?)<p>In the last article the performance impact of joins was shown.  This one will demonstrate cases where a denormalized data model is a bit faster, as will the third article with larger data volumes.  The fourth article, the most interesting one, will show where a denormalized data model can be 50 times faster than a normalized data model.  </p> <p>Here are the tables that will be involved in the sql.  The normalized ProductSmall table has 100 million rows and is about 0.67 gig.  
</p> <p> </p> <p><font face="Courier New" size="2">create table ProductSmall (</font></p> <p><font face="Courier New" size="2">    productSmallId int(11) not null,</font></p> <p><font face="Courier New" size="2">    productSmallName varchar(32) not null,</font></p> <p><font face="Courier New" size="2">    productGroupId int(11) not null,</font></p> <p><font face="Courier New" size="2">    primary key (productSmallId),</font></p> <p><font face="Courier New" size="2">    key idx_ProductSmall_productGroup (productGroupId),</font></p> <p><font face="Courier New" size="2">    constraint con_ProductSmall_productGroup </font></p> <p><font face="Courier New" size="2">        foreign key (productGroupId) references ProductGroup (productGroupId)</font></p> <p><font face="Courier New" size="2">) ENGINE=InnoDB;</font></p> <p><font face="Courier New" size="2"></font> </p> <p>This 101 row table has the product group text and joins to the ProductSmall table.    </p> <p> </p> <p><font face="Courier New" size="2"><font face="Courier New" size="2">create table</font> ProductGroup (</font></p> <p><font face="Courier New" size="2">    productGroupId int(11) not null,</font></p> <p><font face="Courier New" size="2">    productGroupName varchar(32) not null,</font></p> <p><font face="Courier New" size="2">    primary key (productGroupId)</font></p> <p><font face="Courier New" size="2">) ENGINE=InnoDB;</font></p> <p><font face="Courier New" size="2"></font> </p> <p>This is the 100 million rows denormalized combination of the previous two tables.  It is 1.2 gig and is about twice the size of the normalized version.  </p> <p>   </p> <p><font face="Courier New" size="2">create table Product (</font></p> <p><font face="Courier New" size="2">    productId int(11) not null,</font></p> <p><font face="Courier New" size="2">    productName varchar(32) not null,</font></p> <p><font face="Courier New" size="2">    productGroupId int(11) not null,</font></p> <p><font face="Courier New" size="2">    productGroupName varchar(32) not null,</font></p> <p><font face="Courier New" size="2">    primary key (productId)</font></p> <p><font face="Courier New" size="2">) ENGINE=InnoDB;</font></p> <p><font face="Courier New" size="2"></font> </p> <p>First, how fast does it take to do a sequential scan of 652,637 rows of the 120 million rows Sale table?  At 0.3 seconds it is fast as it is cached is memory.  This is the same Sale table from the previous article.  Purchase date is the leading edge of the primary key, which explains the range scan on the primary key.  
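</p> <p>A side note before the query itself: a rough way to check whether a table can be fully cached is to compare its size, as reported by information_schema (approximate for InnoDB), against the configured buffer pool.  This is a sketch of such a check, not something from the original test runs.</p> <p><font face="Courier New" size="2">show variables like 'innodb_buffer_pool_size'; <br /> <br />select table_name, <br />       round((data_length + index_length) / 1024 / 1024) as total_meg <br />  from information_schema.tables <br /> where table_schema = database() <br />   and table_name in ('Sale', 'Product', 'ProductSmall', 'ProductGroup');</font></p> <p>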
</p> <p> </p> <p><font face="Courier New" size="2">select sum(unit) from Sale</font></p> <p><font face="Courier New" size="2">where purchaseDate >= '2001-12-30'</font></p> <p><font face="Courier New" size="2"></font> </p> <p><font face="Courier New" size="1">+----+-------------+-------+-------+---------------+----------+---------+------+----------+-------------+</font></p> <p><font face="Courier New" size="1">| id | select_type | TABLE | type  | possible_keys | KEY      | key_len | ref  | rows     | Extra       |</font> </p> <p><font face="Courier New" size="1">+----+-------------+-------+-------+---------------+----------+---------+------+----------+-------------+</font> </p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | Sale  | range | PRIMARY       | PRIMARY  | 8       | null | 1882540  | USING WHERE | </font></p> <p><font face="Courier New" size="1">+----+-------------+-------+-------+---------------+----------+---------+------+----------+-------------+</font> </p> <p><font face="Courier New" size="2"></font>  </p> <p>Adding a join to a small table (that also fits into memory) is bit slower than the previous simple table scan, but the the below sql runs in a about 1.1 seconds.  Still reasonably fast and less than four times slower than a sequential scan of the data.  This is great performance as btree lookup can be much more expensive than a sequential table scans.  This is a sign that the innodb adaptive hash indexing is working well for reads.  And running "show engine innodb status" shows "18617.95 hash searches/s, 5.00 non-hash searches/s".  Show engine also shows "Buffer pool hit rate 1000 / 1000", indicating that all the data does fit into memory.  I won't show it, but the rest of the queries in this article have similar numbers.</p> <p> </p> <p><font face="Courier New" size="2">select sum(unit) from Sale s</font></p> <p><font face="Courier New" size="2">  join ProductSmall p</font></p> <p><font face="Courier New" size="2">    on s.productId = p.productSmallId</font></p> <p><font face="Courier New" size="2">where purchaseDate >= '2001-12-30';</font></p> <p><font face="Courier New" size="2"></font> </p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+-------------+----------+-------------+</font></p> <p><font size="1"><font face="Courier New">| id | select_type | TABLE | type     |  KEY     | key_len | ref         | rows     | Extra       |</font> </font></p> <p><font size="1"><font face="Courier New">+----+-------------+-------+----------+----------+---------+-------------+----------+-------------+</font> </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | s     | range    |  PRIMARY |8        | null        |  1882540 | Using where | </font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+-------------+----------+-------------+</font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | p     |  eq_ref  |  PRIMARY | 4       | s.productId | 1        | Using index | </font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+-------------+----------+-------------+</font></p> <p> </p> <p>Using this normalized data model, where product group names are in a different tables, the extra join slows down performance but only by a bit.  This sql runs in about 1.4 seconds.   
</p> <p> </p> <p><font face="Courier New" size="2">select sum(unit) from Sale s</font></p> <p><font face="Courier New" size="2">  join ProductSmall p</font></p> <p><font face="Courier New" size="2">    on s.productId = p.productSmallId</font></p> <p><font face="Courier New" size="2">  join ProductGroup pg</font></p> <p><font face="Courier New" size="2">    on p.productGroupId = pg.productGroupId</font></p> <p><font face="Courier New" size="2">where purchaseDate >= '2001-12-30';</font></p> <p><font face="Courier New" size="2"></font> </p> <p><font face="Courier New" size="2"><font face="Trebuchet MS">The plan looks similar to the previous one, with the addition of one more join.</font>  </font></p> <p><font face="Courier New" size="2"></font> </p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------+</font></p> <p><font size="1"><font face="Courier New">| id | select_type | TABLE | type     | KEY      | key_len | ref              | rows     | Extra       |</font> </font></p> <p><font size="1"><font face="Courier New">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------+</font> </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | s     | range    | PRIMARY  | 8       | null             | 1882540  | Using where | </font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------+</font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | p     |  eq_ref  | PRIMARY  | 4       | s.productIdn     | 1        | Using index | </font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------+</font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | pg    |  eq_ref  | PRIMARY  | 4       | p.productGroupId | 1        | Using index | </font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------+</font></p> <p> </p> <p>Ok, the denormalized Product table is a larger than the normalized ProductSmall table, but has the productGroupName of the ProductGroup table, removing one of the joins and making the below query a bit faster at 1.1 seconds.  Nothing is substantially different here, but if everything can fit into memory, which this data does, the denormalized data model performs better.  
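</p> <p>As an aside, the denormalized Product table is essentially the join of ProductSmall and ProductGroup materialized once, so it could be loaded with something along these lines (a sketch; the actual load script isn't shown in these articles):</p> <p><font face="Courier New" size="2">insert into Product (productId, productName, productGroupId, productGroupName) <br />select p.productSmallId, p.productSmallName, p.productGroupId, pg.productGroupName <br />  from ProductSmall p <br />  join ProductGroup pg <br />    on p.productGroupId = pg.productGroupId;</font></p> <p>The flip side of removing the join is that a change to a productGroupName now has to be rewritten into every affected Product row instead of a single ProductGroup row, which is part of the maintenance cost being weighed here.  Back to the query against the denormalized table.</p> <p>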
</p> <p> </p> <p><font face="Courier New" size="2">select sum(unit) from Sale s</font></p> <p><font face="Courier New" size="2">  join Product p</font></p> <p><font face="Courier New" size="2">    on s.productId = p.productId</font></p> <p><font face="Courier New" size="2">where purchaseDate >= '2001-12-30';</font></p> <p><font face="Courier New" size="2"></font> </p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+-------------+----------+-------------+</font></p> <p><font size="1"><font face="Courier New">| id | select_type | TABLE | type     | KEY      | key_len | ref         | rows     | Extra       |</font> </font></p> <p><font size="1"><font face="Courier New">+----+-------------+-------+----------+----------+---------+-------------+----------+-------------+</font> </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | s     | range    | PRIMARY  |8        | null        |  1882540 | Using where | </font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+-------------+----------+-------------+</font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | p     |  eq_ref  | PRIMARY  | 4       | s.productId | 1        | Using index | </font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+-------------+----------+-------------+</font></p> <p><font face="Courier New" size="2"></font> </p> <p></p> <p>Now to run the same series of tests but with a summary of the productGroupName just to verify the above pattern holds.  In summary, it does.  </p> <p>Ok, now we know the simple cost of a join, grouping by a value on the joined to table does slow down the sql, but again, this is reasonably fast as 2.2 seconds.  </p> <p> </p> <p><font face="Courier New" size="2">select productGroupName, sum(unit)</font></p> <p><font face="Courier New" size="2">  from Sale s</font></p> <p><font face="Courier New" size="2">  join ProductSmall p</font></p> <p><font face="Courier New" size="2">    on s.productId = p.productSmallId</font></p> <p><font face="Courier New" size="2">  join ProductGroup pg</font></p> <p><font face="Courier New" size="2">    on p.productGroupId = pg.productGroupId</font></p> <p><font face="Courier New" size="2">where purchaseDate >= '2001-12-30'</font></p> <p><font face="Courier New" size="2">group by productGroupName</font></p> <p><font face="Courier New" size="2"></font> </p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+</font></p> <p><font size="1"><font face="Courier New">| id | select_type | TABLE | type     | KEY      | key_len | ref              | rows     | Extra             |</font> </font></p> <p><font size="1"><font face="Courier New">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+</font> </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | s     | range    | PRIMARY  |8        | null             | 1882540  | Using filesort ... 
</font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+</font> </p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | p     |  eq_ref  | PRIMARY  | 4       | s.productId      | 1        | Using index       | </font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+</font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | pg    |  eq_ref  | PRIMARY  | 4       | p.productGroupId | 1        | Using index       | </font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+</font></p> <p> </p> <p>The denormalized data model is faster again, but not by much, as 1.8 seconds.  </p> <p> </p> <p><font face="Courier New" size="2">select productGroupName, sum(unit)</font></p> <p><font face="Courier New" size="2">  from Sale s</font></p> <p><font face="Courier New" size="2">  join Product p</font></p> <p><font face="Courier New" size="2">    on s.productId = p.productId</font></p> <p><font face="Courier New" size="2">where purchaseDate >= '2001-12-30'</font></p> <p><font face="Courier New" size="2">group by productGroupName</font></p> <p><font face="Courier New" size="2"></font> </p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+-------------+----------+------------------------+</font></p> <p><font size="1"><font face="Courier New">| id | select_type | TABLE | type     | KEY      | key_len | ref         | rows     | Extra                  |</font> </font></p> <p><font size="1"><font face="Courier New">+----+-------------+-------+----------+----------+---------+-------------+----------+------------------------+</font> </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | s     | range    | PRIMARY  |8        | null        | 1882540  | Using filesort ...  </font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+-------------+----------+------------------------+</font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | p     |  eq_ref  | PRIMARY  | 4       | s.productId | 1        | Using index            | </font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+-------------+----------+------------------------+</font></p> <p> </p> <p>Ok, what occurs when the same sql is run against 10,182,566 rows?  
If cached, the simple range scan query takes about 4 seconds.</p> <p> </p> <p><font face="Courier New" size="2">select sum(unit)</font></p> <p><font face="Courier New" size="2">  from Sale s</font></p> <p><font face="Courier New" size="2">where purchaseDate >= '2001-12-01'</font></p> <p><font face="Courier New" size="2"></font> </p> <p><font face="Courier New" size="1">+----+-------------+-------+-------+----------+---------+------+----------+-------------+</font></p> <p><font size="1"><font face="Courier New">| id | select_type | TABLE | type  | KEY      | key_len | ref  | rows     | Extra       |</font> </font></p> <p><font size="1"><font face="Courier New">+----+-------------+-------+-------+----------+---------+------+----------+-------------+</font> </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | s     | range | PRIMARY  | 8       | null | 30931362 | USING WHERE | </font></p> <p><font face="Courier New" size="1">+----+-------------+-------+-------+----------+---------+------+----------+-------------+</font></p> <p> </p> <p>Now, what about a join to the denormalized product table?  This takes 25 seconds, which shows the cost of a table join and a group by over a sequential table scan.  But as all the data can fit into memory the execution times are still reasonably fast.  </p> <p> </p> <p><font face="Courier New" size="2">select productGroupName, sum(unit)</font></p> <p><font face="Courier New" size="2">  from Sale s</font></p> <p><font face="Courier New" size="2">  join Product p</font></p> <p><font face="Courier New" size="2">    on s.productId = p.productId</font></p> <p><font face="Courier New" size="2">where purchaseDate >= '2001-12-01'</font></p> <p><font face="Courier New" size="2">group by productGroupName</font></p> <p><font face="Courier New" size="2"></font> </p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+-------------+----------+------------------------+</font></p> <p><font size="1"><font face="Courier New">| id | select_type | TABLE | type     | KEY      | key_len | ref         | rows     | Extra                  |</font> </font></p> <p><font size="1"><font face="Courier New">+----+-------------+-------+----------+----------+---------+-------------+----------+------------------------+</font> </font></p> <p><font size="1"><font face="Courier New">|  1 | SIMPLE      | s     | range    | PRIMARY  |8        | null        | 30931362 | Using filesort ...  
</font></font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+-------------+----------+------------------------+</font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | p     |  eq_ref  | PRIMARY  | 4       | s.productId | 1        |                        | </font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+-------------+----------+------------------------+</font></p> <p> </p> <p>Once again, the below normalized version is slower than the above denormalized version, and runs in 33 seconds.</p> <p> </p> <p><font face="Courier New" size="2">select productGroupName, sum(unit)</font></p> <p><font face="Courier New" size="2">  from Sale s</font></p> <p><font face="Courier New" size="2">  join ProductSmall p</font></p> <p><font face="Courier New" size="2">    on s.productId = p.productSmallId</font></p> <p><font face="Courier New" size="2">  join ProductGroup pg</font></p> <p><font face="Courier New" size="2">    on p.productGroupId = pg.productGroupId</font></p> <p><font face="Courier New" size="2">where purchaseDate >= '2001-12-01'</font></p> <p><font face="Courier New" size="2">group by productGroupName</font></p> <p> </p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+</font></p> <p><font size="1"><font face="Courier New">| id | select_type | TABLE | type     | KEY      | key_len | ref              | rows     | Extra             |</font> </font></p> <p><font size="1"><font face="Courier New">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+</font> </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | s     | range    | PRIMARY  | 8       | null             | 30931362 | Using filesort ... </font></p> <p><font size="1"><font face="Courier New">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+</font> </font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | p     |  eq_ref  | PRIMARY  | 4       | s.productId      | 1        |                   | </font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+</font></p> <p><font face="Courier New" size="1">|  1 | SIMPLE      | pg    |  eq_ref  | PRIMARY  | 4       | p.productGroupId | 1        |                   | </font></p> <p><font face="Courier New" size="1">+----+-------------+-------+----------+----------+---------+------------------+----------+-------------------+</font></p> <p> </p> <p>So far, the case for a denormalized data model, such as a star schema, isn't that robust.  Yes, the performance is better, but only by about 20-30 percent (25 versus 33 seconds for the month of data above, and 1.8 versus 2.2 seconds for the two days of data earlier).</p> <p>But this series of articles isn't finished yet.   In the next blog I'll show what happens when both the normalized and the denormalized data model struggle to fit into memory.   
</p>dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com0tag:blogger.com,1999:blog-3635447662505002071.post-3452839223701722252007-11-28T17:00:00.000+00:002007-11-28T17:15:13.438+00:00Joins are slow, memory is fast<p>I've been working with various databases for a number of years. In that time I've found there is plenty of documentation on various features but not much on how database modeling impacts performance. Some people say one should use a fully normalized data model with a few denormalizations that are required for performance reasons, but those denormalizations are never discussed. And Ralph Kimball has many articles and a book concentrating on the physical data modeling, but they are more of the "here is how you solve the problem" and they don't detail why his method works better.</p> <p>I've always found this odd as the physical data model has a major impact on database performance. I want this to be more of the whys behind various physical data modeling options with some examples showing the magnitude of the performance differences. </p> <p>The first goal of this blog (over a few articles) will be to show cases where a denormalized dimensional data model substantially improves performance instances and where a normalized model optimized for updates is appropriate. The server I have available to do this isn't the best, so the largest table is only going to be 120 million rows and 30 gig on disk, but I belie<span style="font-size:100%;">ve the principles shown will scale to larger data volumes on more capable servers.</span></p> <p>At first, at least, I'm going to demonstrate this with MySQL using the InnoDB database as I'm currently using this at work and I need to learn more about it. I'm more comfortable with Oracle and MS SQL Server, so if I get something wrong with MySQL, or it isn't true in all cases (and what is with a database?), or I'm just being an idiot, point it out.</p> <p>The machine I ran these tests on has a innodb_buffer_pool_size of 2048M and innodb_flush_method=O_DIRECT to avoid double caching by the OS. The Linux machine I ran these tests on doesn’t have a great disk subsystem with only 4 disks in a striped mirrored configuration.</p> <p>Enough of the preliminaries, here are the tables. This is the small customer dimension of 1 million rows and 60 meg of data that easily fits into memory.</p> <p><span style="font-family:Courier New,monospace;"><span style="font-size:85%;">CREATE TABLE Customer<br /> (customerId int NOT NULL default '0',<br /> customerName varchar(32) NOT NULL,<br /> postalCode varchar(7) NOT NULL,<br /> PRIMARY KEY (`customerId`)<br />) ENGINE=InnoDB;</span></span></p> <p>The product dimension is larger, with 10 million rows and 1043 meg of data. It will fit into memory, but it takes half of the available memory.</p> <p><span style="font-family:Courier New,monospace;"><span style="font-size:85%;">CREATE TABLE Product<br /> (productId int NOT NULL,<br /> productName varchar(32) default NULL,<br /> productGroupId int default NULL,<br /> productGroupName varchar(32) default NULL,<br /> PRIMARY KEY (`productId`)<br />) ENGINE=InnoDB;</span></span></p> <p><span style="font-size:100%;">Finally, the large sale fact table, with 120 million rows and about 28 gig of data. It will never fit into memory. Note the trick to create the clustered index (primary key) based on the date so the common date based queries can quickly do a range scan based on the clustered index, but there is a severe downside to this. 
Since all the other indexes point to the clustered index, they will all be larger and will therefore take longer to insert data and read. When to use this trick is a topic in and of itself, so enough of this for now. The range of dates is from 2001-01-01 to 2001-12-31, with the data being randomly distributed over the year.</span></p> <p><span style="font-family:Courier New,monospace;"><span style="font-size:85%;">CREATE TABLE Sale<br /> (orderId int NOT NULL,<br /> customerId int(11) NOT NULL,<br /> productId int(11) NOT NULL,<br /> productBigId int(11) NOT NULL,<br /> unit int(11) NOT NULL,<br /> purchaseAmount decimal(16,2) NOT NULL,<br /> purchaseCost decimal(16,2) NOT NULL,<br /> purchaseDate datetime NOT NULL,<br /> PRIMARY KEY idx_sale_dateOrder (purchaseDate,orderId),<br /> UNIQUE KEY pk_sale_order (orderId),<br /> KEY idx_sale_product (productId),<br /> KEY idx_sale_customer (customerId),<br /> CONSTRAINT pf_sale_product FOREIGN KEY (productId)<br /> REFERENCES Product (productId),<br /> CONSTRAINT pf_sale_customer FOREIGN KEY (customerId)<br /> REFERENCES Customer (customerId)<br />) ENGINE=InnoDB;</span></span></p> <p>So, my testing technique is to run a query a number of times and throw out the first execution. This isn’t entirely realistic, but the alternative is to flush all the data and have no caching at all. As reporting performance is heavily dependent on memory caching, I prefer a testing methodology that takes caching into account, even if it is overstated.</p> <p>Ok, the month of December comprises a bit more than 10 million rows. Stripping out all the keys that aren’t used, one month of this table fits into memory, which shows up in the 3 to 4 second execution times of this sql after the first execution. </p> <p><span style="font-family:Courier New,monospace;"><span style="font-size:85%;">select sum(unit) from Sale<br />where purchaseDate >= '2001-12-01'</span></span></p> <p>And the execution plan shows that the primary key is being used to do a range scan.</p> <p><span style="font-family:Courier New,monospace;"><span style="font-size:85%;">1, 'SIMPLE', 'Sale', 'range', 'PRIMARY', 'PRIMARY', '8', '', 30931362, 'Using where; Using index'</span></span></p> <p>Adding a simple join to the small customer table results in an execution time of 11 to 12 seconds. So while everything is fitting into memory, an in-memory join still makes a query run about 3 times slower in this case. As expected, the execution plan shows the sale table primary key being used for a range scan, with the join to the customer table using its primary key for an index lookup. As a note, MySQL needs to do a better job on table analysis, as I didn’t get a good plan on this simple query without a hint.</p> <p><span style="font-family:Courier New,monospace;"><span style="font-size:85%;">select straight_join sum(unit) from Sale s<br />join Customer c<br /> on s.customerId = c.customerId<br />where s.purchaseDate >= '2001-12-01'</span></span></p> <p>And here is the plan from the above hint, which drives from the sales table and looks up the related customer information.</p> <p><span style="font-family:Courier New,monospace;"><span style="font-size:85%;">1, 'SIMPLE', 's', 'range', 'PRIMARY', 'PRIMARY', '8', '', 30931362, 'Using where'<br />1, 'SIMPLE', 'c', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'perfbig.s.customerId', 1, 'Using index'</span></span></p> <p>In the next case the sales table is joined to the much larger product table. 
One month of the sale table and the product table together can’t fit into memory; instead of fast memory access we are dealing with slow disk reads and query times of around 417 to 525 seconds.</p> <p>I was surprised by how much slower this is than the equivalent customer query. I’m also wondering about the strength of the simple InnoDB least recently used buffer pool algorithm at this point. As expected with the slow execution time, iostat shows heavy disk utilization.</p> <p><span style="font-family:Courier New,monospace;"><span style="font-size:85%;">select count(*) from Sale s<br />join Product p<br /> on s.productId = p.productId<br />where purchaseDate >= '2001-12-01';</span></span></p> <p>The plan is reasonable as it drives from the sale table and joins to the product table.</p> <p><span style="font-family:Courier New,monospace;"><span style="font-size:85%;">1, 'SIMPLE', 's', 'range', 'PRIMARY,pf_sale_product', 'PRIMARY', '8', '', 30931362, 'Using where'<br />1, 'SIMPLE', 'p', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'perfbig.s.productId', 1, 'Using index'</span></span></p> <p>Instead of using a full month of data, how does querying just two days of the sale table behave? Well, it executes in .3 seconds. Obviously, everything fits into memory at this point.</p> <p><span style="font-family:Courier New,monospace;"><span style="font-size:85%;">select sum(unit) from Sale s<br />where purchaseDate >= '2001-12-30';</span></span></p> <p>The plan is good, with a range scan of the sale table.</p> <p><span style="font-family:Courier New,monospace;"><span style="font-size:85%;">1, 'SIMPLE', 's', 'range', 'PRIMARY', 'PRIMARY', '8', '', 1882540, 'Using where'</span></span></p> <p>How about adding a simple join to the customer table? That results in an execution time of one second, so even small in-memory joins impact execution times.</p> <p><span style="font-family:Courier New,monospace;"><span style="font-size:85%;">select sum(unit) from Sale s<br />join Customer c<br /> on s.customerId = c.customerId<br />where purchaseDate >= '2001-12-30';</span></span></p> <p>The plan is good.</p> <p><span style="font-family:Courier New,monospace;"><span style="font-size:85%;">1, 'SIMPLE', 's', 'range', 'PRIMARY,pf_sale_customer', 'PRIMARY', '8', '', 1882540, 'Using where'<br />1, 'SIMPLE', 'c', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'perfbig.s.customerId', 1, 'Using index'</span></span></p> <p>What is interesting is that the join to the much larger product table takes about the same amount of time as the join to the Customer table, one second. I’m going to guess this is because everything can still fit into memory. This is verified as iostat does not show any disk io.</p> <p><span style="font-family:Courier New,monospace;"><span style="font-size:85%;">select sum(unit) from Sale s<br />join Product p<br /> on s.productId = p.productId<br />where purchaseDate >= '2001-12-30';</span></span></p> <p>And the product plan looks the same as the join to the customer table.</p> <p><span style="font-family:Courier New,monospace;"><span style="font-size:85%;">1, 'SIMPLE', 's', 'range', 'PRIMARY,pf_sale_product', 'PRIMARY', '8', '', 1882540, 'Using where'<br />1, 'SIMPLE', 'p', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'perfbig.s.productId', 1, 'Using index'</span></span></p> <p>So, one of the insights of the dimensional model is that joins are expensive. This has been demonstrated by these tests. 
The size of the dimension doesn't impact performance if the dimension can fit into memory, but a large dimension that doesn't fit into memory has a severe impact on performance. </p> <p>Others, looking at the dimensional model, will say normalization is best as it minimizes the size of each table, and memory optimization means more of the data fits into memory, resulting in fewer of those expensive physical reads. So which is more important, minimizing the number of joins or minimizing the size of tables? That will be covered in another (hopefully shorter) entry.</p> dbsciencehttp://www.blogger.com/profile/05472838753270367869noreply@blogger.com1