Today I had the pleasure (or the pain?) of tuning a bunch of SQLs written for MySQL (5.0.26). One that bothered me most was a sql that took about 210 seconds and it appeared to be a very innocent SQL except for the fact that the where clause is useless and ending up in a full table scan. It was of the form
select a,b,count(distinct c),count(distinct d) from a-bunch-of-tables-and-where group by a,b;
To rule out the possibility of problem due to full table scan, I tried to do a simple query with out the distinct in the count and to my surprise, it returned back in under 3 seconds. Not bad for more than half-a-million rows!
Now, this difference of 210 to 3 seconds really worried me. After a bit of searching, came to realize that this is currently a limitation with MySQL as mentioned at http://forge.mysql.com/worklog/task.php?id=3220
So, I changed the query to the form
select a,b,count(distinct c),count(d) from (select distinct a,b,c,d from ... where ...) group by a,b;
and the query started working returning results in 12 seconds. Ofcourse, 12 seconds is no good, but as the query is for an aggregate report, I am fine with it.
In general, all the SQLs I tuned today required using sub-queries. Having come from Oracle database world, things I took for granted weren’t working the same with MySQL. And my reading on MySQL tuning makes me conclude that MySQL is way behind Oracle in terms of optimizing queries. While the simple queries required for most B2C applications may work well for MySQL, most of the aggregate reporting type of queries needed for Intelligence Reporting seems to require a fair bit of planning and re-organizing the SQL queries to guide MySQL to execute them faster. With Oracle CBO, that’s usually never the case. Things are far more intuitive and easy in Oracle world.
If anyone has any other stories of their performance tuning experience with MySQL, feel free to comment on them.
High Performance MySQL is the latest book on MySQL Performance tuning.