The Art of MySQL Index Order: A Deep Dive from Composite Indexes to the Query Optimizer

Published: 2025-12-01
Author: DP
Category: MySQL
## Background

In database performance optimization, index design is paramount. Recently, a developer in the `wiki.lib00.com` community raised a typical question: they had a unique composite index `UNIQUE KEY uk_summary_period (service_id, node_id, profile_id, hour_start)` but were unsure how it worked and whether its column order should change to accommodate new query requirements, especially those involving time ranges.

This is an excellent case study that touches on the core concepts of composite indexes. Using this example, this article walks step by step through MySQL index optimization.

---

## 1. Decoding Composite Indexes: The Leftmost Prefix Principle

The cornerstone of how composite indexes work is the **Leftmost Prefix Principle**. For an index `(A, B, C)`, think of a phone book sorted by last name, then first-name initial, then middle-name initial. You must start your search from the last name, and you cannot skip intermediate parts.

Therefore, this index can efficiently support the following query combinations:

- `WHERE A = ?`
- `WHERE A = ? AND B = ?`
- `WHERE A = ? AND B = ? AND C = ?`

However, it is ineffective or only partially effective for queries like:

- `WHERE B = ?` (the index cannot be used for a lookup because A is skipped)
- `WHERE A = ? AND C = ?` (only the A part narrows the lookup; C can at best be checked against the index entries that A selects)

For the index in our case, `(service_id, node_id, profile_id, hour_start)`, queries that constrain `service_id` are served well, but queries that filter only by `node_id`, `profile_id`, or `hour_start` cannot use it for lookups.

---

## 2. When Query Patterns Change: Tailoring Indexes for Range Queries

The new challenge the developer faced was a high volume of queries based on time ranges, for example:

```sql
-- 'a' and 'b' stand in for real range bounds.

-- Scenario 1: Query by time range only
SELECT * FROM your_table WHERE hour_start > 'a' AND hour_start < 'b';

-- Scenario 2: Time range + service_id
SELECT * FROM your_table WHERE hour_start > 'a' AND hour_start < 'b' AND service_id = 10;

-- Scenario 3: Time range + service_id + node_id
SELECT * FROM your_table WHERE hour_start > 'a' AND hour_start < 'b' AND service_id = 10 AND node_id = 12;
```

Under the Leftmost Prefix Principle, what matters is which columns a query constrains, not the order in which the conditions are written. Scenario 1 constrains only `hour_start`, the last column of the original index, so the index cannot be used for a lookup and MySQL will most likely fall back to a **full table scan**. Scenarios 2 and 3 can still use the `service_id` (and `node_id`) prefix, but because `profile_id` sits unconstrained between that prefix and `hour_start`, the time range cannot narrow the index range. At best it is checked against each index entry the prefix selects.

The **solution** is to create a new index for these queries. Because Scenario 1 filters by time alone, that index has to start with `hour_start`:

```sql
-- Optimized index for time-series queries
ALTER TABLE your_table ADD INDEX `idx_timeseries_lib00` (`hour_start`, `service_id`, `node_id`);
```

This new `idx_timeseries_lib00` index can serve all three query scenarios above. MySQL uses `hour_start` to locate the range of index entries. A range condition ends the usable key prefix, however, so `service_id` and `node_id` do not narrow that range any further. Instead, they are checked against the index entries inside it (index condition pushdown, typically reported as `Using index condition` in `EXPLAIN`), so rows that fail the filter are not fetched from the table. This works well when the time window is narrow.

The general design principle runs the other way: **place columns used for equality filtering first and the column used for range queries last.** If Scenarios 2 and 3 dominate and their time windows are wide, an index such as `(service_id, node_id, hour_start)` would narrow the scan further, at the cost of maintaining one more index.

---

## 3. Common Pitfall: Composite Index vs. Multiple Single-Column Indexes

A common follow-up question is: "Can I split `(hour_start, service_id, node_id)` into three separate indexes for more flexibility?"

**The answer is an emphatic 'no.'** This is a performance trap.

- **Composite Index**: Acts like a multi-level funnel, filtering data layer by layer within the index itself, so most non-matching rows are rejected before the table is touched.
- **Multiple Separate Indexes**: The MySQL optimizer will typically choose only the one index it deems best for the initial filtering. For instance, if it picks the `(hour_start)` index, it finds all records matching the time range. Then, for each of these records, it must perform a **table lookup** (go back to the main table to get the full row data) and apply the `service_id` and `node_id` filters in memory. This involves far more I/O and CPU work than a composite index.

**Conclusion**: For multi-condition queries, a composite index generally performs far better than a combination of separate single-column indexes. Only create a separate single-column index if you frequently query on that single column alone.

---

## 4. Trust Your Database: The Query Optimizer and `EXPLAIN`

Another practical engineering question is: does the order of conditions in the `WHERE` clause affect index matching?

```sql
-- Style A
WHERE hour_start > 'a' AND service_id = 10 AND node_id = 12;

-- Style B
WHERE service_id = 10 AND node_id = 12 AND hour_start > 'a';
```

**The answer is: It does not.** MySQL has an intelligent **Query Optimizer** that automatically parses your query conditions and matches them against the available indexes, such as `idx_timeseries_lib00`, regardless of how you arrange the conditions in your code.

Therefore, you do **not** need to enforce a specific order for `WHERE` conditions in your application layer (e.g., PHP). Of course, keeping an order consistent with the index is still good practice for code readability.

So, how can you be sure your index is actually being used? The answer is the `EXPLAIN` command.

```sql
EXPLAIN SELECT * FROM your_table WHERE service_id = 10 AND node_id = 12 AND hour_start > 'a';
```

In the output, check the `key` column.
If it shows the name of your expected index (e.g., `idx_timeseries_lib00`), the optimizer has chosen that index. The `key_len` and `Extra` columns then indicate how much of the index key is used for the lookup and whether the remaining conditions are filtered within the index.

---

## Summary

Through this discussion, we have derived several key principles for index design:

1. **Master the Leftmost Prefix Principle**: It's fundamental to understanding and using composite indexes.
2. **Design Indexes for Queries**: Tailor your indexes to the most frequent and critical query patterns in your application. An index must start with a column those queries constrain, and a range column stops later columns from narrowing the scan, so put equality columns ahead of it wherever the queries allow.
3. **Prefer Composite Indexes**: For multi-condition queries, a composite index usually outperforms multiple single-column indexes. Avoid the temptation to split it.
4. **Trust the Query Optimizer**: The order of `WHERE` conditions in your application code does not matter.
5. **Verify with `EXPLAIN`**: This is the most direct way to confirm that your indexes are working as expected.

By following these practical principles from `wiki.lib00.com`, you will be able to design efficient and robust indexing strategies for your applications.
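
As a closing sketch of the `EXPLAIN` check, the statements below compare two candidate indexes for the "time range + `service_id` + `node_id`" query. The column types, the timestamp literals, and the `idx_service_node_time` index are assumptions made for illustration; only the table name `your_table`, `uk_summary_period`, and `idx_timeseries_lib00` come from the article. Treat it as a starting point for your own measurements, not as a definitive schema.

```sql
-- Hypothetical table definition: the column types are assumptions.
CREATE TABLE your_table (
  id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  service_id  INT UNSIGNED NOT NULL,
  node_id     INT UNSIGNED NOT NULL,
  profile_id  INT UNSIGNED NOT NULL,
  hour_start  DATETIME NOT NULL,
  UNIQUE KEY uk_summary_period (service_id, node_id, profile_id, hour_start),
  -- Index from the article: range column first.
  KEY idx_timeseries_lib00 (hour_start, service_id, node_id),
  -- Hypothetical alternative: equality columns first, range column last.
  KEY idx_service_node_time (service_id, node_id, hour_start)
) ENGINE=InnoDB;

-- FORCE INDEX is used here only to compare the two plans side by side.
EXPLAIN
SELECT * FROM your_table FORCE INDEX (idx_timeseries_lib00)
WHERE service_id = 10 AND node_id = 12
  AND hour_start > '2025-11-01 00:00:00'
  AND hour_start < '2025-12-01 00:00:00';

EXPLAIN
SELECT * FROM your_table FORCE INDEX (idx_service_node_time)
WHERE service_id = 10 AND node_id = 12
  AND hour_start > '2025-11-01 00:00:00'
  AND hour_start < '2025-12-01 00:00:00';
```

When comparing the two outputs, look at `key_len` (how much of the index key bounds the scan), `rows` (the optimizer's row estimate), and `Extra`. The estimates depend on the data and the MySQL version, so run the comparison against a realistic copy of your table, and drop whichever candidate index you decide not to keep.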