MySQL Query PerformanceIn this blog post, I’ll look at whether optimizing indexing is always the key to improving MySQL query performance (spoiler, it isn’t).

As we look at MySQL query performance, our first concern is often whether a query is using the right indexes to retrieve the data. This is based on the assumption that finding the data is the most expensive operation – and the one you should focus on for MySQL query optimization. However, this is not always the case.

Let’s look at this query for illustration:

Looking at this query, many might assume the main problem is that this query is doing a full table scan. One could wonder then, “Why does the MySQL optimizer not use index (k)?”  (It is because the clause is not selective enough, by the way.) This thought might cause someone to force using the index, and get even worse performance:

Or someone might extend the index on (k) to (k,g) to be a covering index for this query. This won’t improve performance either:

This wasted effort is all due to focusing on the wrong thing: figuring out how can we find all the rows that match k<1000000  as soon as possible. This is not the problem in this case. In fact, the query that touches all the same columns but doesn’t use GROUP BY runs 10 times as fast:

For this particular query, whether or not it is using the index for lookup should not be the main question. Instead, we should look at how to optimize GROUP BY – which is responsible for some 90% of the query response time.

In my next blog post, I will write about four ways to execute the MySQL GROUP BY operation to provide further help on optimizing these queries.

2 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Raffa Cabo Frio

I liked the article. It’s a topic that interests me a lot. But it seemed to lack a conclusion. 🙁

kabilesh

very excited about next blog