Last time I demonstrated a case where stored procedures are slow. There were a few comments that I should include selects in the stored procedure to make the tests more realistic. From experience I already knew the answer so I didn’t go into that level of detail, but since stored procedures are all about database access this is a reasonable comment.
This is a simple stored procedure that selects data and then summarizes it by customer. No one would actually write this as it is far too easy to use a SQL statement instead. Assume the logic is more complex and can’t be done easily by the standard SQL techniques of case statements, temp tables, etc, and then this makes more sense.
The end result is this stored procedure takes 696 seconds to run.
create procedure slowCounter()
begin
declare done int default 0;
declare curCustomerId int default 0;
declare customerCount int default 0;
declare slowCur cursor for select customerId from saleT;
declare continue handler for not found
set done = TRUE;
create temporary table tempCount
(
tempCustomerId int not null,
tempCustomerCount int not null,
primary key (tempCustomerId)
);
open slowCur;
fetch slowCur into curCustomerId ;
while not done do
insert into tempCount(tempCustomerId, tempCustomerCount)
values (curCustomerId , 1)
on duplicate key update tempCustomerCount = tempCustomerCount + 1;
fetch slowCur into curCustomerId ;
end while;
close slowCur;
select * from tempCount;
end
This more complex C# code (and still not fully exception proofed) only takes 1.4 seconds to run. That is nearly 1/500 the time it took the stored procedure to execute.
using System;
using System.Collections.Generic;
using MySql.Data.MySqlClient;
namespace Summarization
{
/// <summary>
/// Tests the performance of C# to simulate a SQL group by
/// </summary>
class SumTestGroup : BaseSum
{
protected MySqlConnection sqlConn;
private Dictionary<Int32, String> customer =
new Dictionary<Int32, String>();
private Dictionary<String, Int32> customerRowCount =
new Dictionary<String, Int32>();
public void execute()
{
try
{
sqlConn = getConnectionMySql();
sqlConn.Open();
int count = 0;
Int32 customerId;
String customerName;
getCustomers();
MySqlCommand cmd = new MySqlCommand("select s.customerId from Salet s", sqlConn);
MySqlDataReader rdr = cmd.ExecuteReader();
while (rdr.Read())
{
customerId = (Int32)rdr["customerId"];
customerName = customer[customerId];
count++;
if (customerRowCount.ContainsKey(customerName))
{
customerRowCount[customerName]++;
}
else
{
customerRowCount[customerName] = 1;
}
//Console.WriteLine("Count: " + count);
}
foreach (KeyValuePair<String, Int32> customerRow in customerRowCount)
{
Console.WriteLine("Customer: " + customerRow.Key + " Customer Count: " + customerRow.Value);
}
rdr.Close();
}
finally
{
sqlConn.Close();
}
}
private void getCustomers()
{
MySqlCommand cmd = new MySqlCommand("select customerId, customerName from customer", sqlConn);
MySqlDataReader rdr = cmd.ExecuteReader();
Int32 customerId;
String customerName;
while (rdr.Read())
{
if (!rdr.IsDBNull(0))
{
customerId = (Int32)rdr["customerId"];
customerName = (String)rdr["customerName"];
customer.Add(customerId, customerName);
}
}
rdr.Close();
}
}
}
Both produce results like this:
customer Name 0, 100287
customer Name 1, 100009
customer Name 2, 100130
customer Name 3, 99655
customer Name 4, 100134
customer Name 5, 100102
customer Name 6, 99854
customer Name 7, 100172
customer Name 8, 99846
customer Name 9, 99812
In this case, the effort required to retrieve one million rows out of the database and then process them takes orders of magnitude less than a stored procedure takes to process the data inside the database.
No one would actually write this code when a simple SQL alternative is possible. But given complex logic that isn’t possible or easy to express in SQL that needs to be applied against a large volume of data, using a stored procedure cursor (or loop in general) isn’t a fast way to go. This is odd as the whole point of stored procedures is to write code that deals with data contained in a database.