,

How to improve MySQL index performance

Contents


Indexing strategies

Whenever you see that the query is taking too long to read, an index could be helpful. So benchamrk and profile it to determine what to do. There are special cases for optimization and specialized behaviors:

  • Isolate the column:
    • This means that "WHERE id+1='5'" won't use indexes
    • "WHERE TO_DAYS(CURRENT_DATE) - TO_DAYS(date)<10" also will skip the indexes because TO_DAYS() is a function
    • A better way to write this query would be "WHERE `date`>DATE_SUB(CURRENCT_DATE,INTERVAL_..."
    • But yet CURRENT_DATE will prevent the query to be cahced. So an even better way would be "WHERE date>DATE_SUB('2009-12-01',INTERVAL_..."
  • Prefix indexes and index selectivity:
    • Prefix index is the indexation of part of a string. Is you have a very long string to index, use the hash emulation. But there is another option: you can index just the first few characters of this string.
    • Index selectivity is the number of distinct indexed values against the total num of rows.
    • The more selective the better, because it lets MySQL filter out more rows at a time
    • The prefix index saves space, but makes it less selective
    • A prefix of a column is often selective enough to improve performance, specially for a BLOB or a TEXT or long VARCHARs
    • Indexation in full length of BLOB, TEXT and long VARCHARS is disallow by MySQL
    • The main idea here is to create an index as selective as the full length of the column. That's why the id(INT) index is so powerful.

Counting selectivity

  • You count the num of occurrences of a column's prefix:
SELECT COUNT(id) AS `cid`, LEFT(`city`,3) AS `pref` FROM xxx GROUP BY `pref` ORDER BY `cid` 

OUTPUT:

cid pref
10 Ale
20 Ive
30 Pav
... ...
SELECT COUNT(id) AS `cid`, LEFT(`city`,4) AS `pref` FROM xxx GROUP BY `pref` ORDER BY `cid`

OUTPUT:

cid pref
5 Alex
5 Ales
3 Ivet
17 Iven
... ...

This returns fewer matches and more rows, that means, more selective.

  • Another way to calculate selectivity:
SELECT COUNT(DISTINCT `city`)/COUNT(*) AS `full_length`, 
  SELECT COUNT(DISTINCT LEFT(`city`,3))/COUNT(*) AS `pref3`,
  SELECT COUNT(DISTINCT LEFT(`city`,4))/COUNT(*) AS `pref4`,
  SELECT COUNT(DISTINCT LEFT(`city`,5))/COUNT(*) AS `pref5`
  FROM xxx

OUTPUT:

full_length pref3 pref4 pref5
0.0031 0.0216 0.019 0.004

The closer to the full_length, the better the index but the more memory it will use

Last Tips

  • If you want to index the suffix you can store the string inversely and index it's prefix
  • Clustered indexes