How to improve MySQL index performance
Contents |
Indexing strategies
Whenever you see that the query is taking too long to read, an index could be helpful. So benchamrk and profile it to determine what to do. There are special cases for optimization and specialized behaviors:
- Isolate the column:
- This means that "WHERE id+1='5'" won't use indexes
- "WHERE TO_DAYS(CURRENT_DATE) - TO_DAYS(date)<10" also will skip the indexes because TO_DAYS() is a function
- A better way to write this query would be "WHERE `date`>DATE_SUB(CURRENCT_DATE,INTERVAL_..."
- But yet CURRENT_DATE will prevent the query to be cahced. So an even better way would be "WHERE date>DATE_SUB('2009-12-01',INTERVAL_..."
- Prefix indexes and index selectivity:
- Prefix index is the indexation of part of a string. Is you have a very long string to index, use the hash emulation. But there is another option: you can index just the first few characters of this string.
- Index selectivity is the number of distinct indexed values against the total num of rows.
- The more selective the better, because it lets MySQL filter out more rows at a time
- The prefix index saves space, but makes it less selective
- A prefix of a column is often selective enough to improve performance, specially for a BLOB or a TEXT or long VARCHARs
- Indexation in full length of BLOB, TEXT and long VARCHARS is disallow by MySQL
- The main idea here is to create an index as selective as the full length of the column. That's why the id(INT) index is so powerful.
Counting selectivity
- You count the num of occurrences of a column's prefix:
SELECT COUNT(id) AS `cid`, LEFT(`city`,3) AS `pref` FROM xxx GROUP BY `pref` ORDER BY `cid`
OUTPUT:
| cid | pref |
| 10 | Ale |
| 20 | Ive |
| 30 | Pav |
| ... | ... |
SELECT COUNT(id) AS `cid`, LEFT(`city`,4) AS `pref` FROM xxx GROUP BY `pref` ORDER BY `cid`
OUTPUT:
| cid | pref |
| 5 | Alex |
| 5 | Ales |
| 3 | Ivet |
| 17 | Iven |
| ... | ... |
This returns fewer matches and more rows, that means, more selective.
- Another way to calculate selectivity:
SELECT COUNT(DISTINCT `city`)/COUNT(*) AS `full_length`, SELECT COUNT(DISTINCT LEFT(`city`,3))/COUNT(*) AS `pref3`, SELECT COUNT(DISTINCT LEFT(`city`,4))/COUNT(*) AS `pref4`, SELECT COUNT(DISTINCT LEFT(`city`,5))/COUNT(*) AS `pref5` FROM xxx
OUTPUT:
| full_length | pref3 | pref4 | pref5 |
| 0.0031 | 0.0216 | 0.019 | 0.004 |
The closer to the full_length, the better the index but the more memory it will use
Last Tips
- If you want to index the suffix you can store the string inversely and index it's prefix
- Clustered indexes