关于mysql count(*)优化

在工作中，往往遇到分页查询、统计之类，免不了使用count(*)。可是count(*)该如何优化？

测试数据库4核4g内存：表tb1, id自增，c2字符串uuid，c3值为1,2,3。结果均为5次中的最小时间。

表结构，如下：

mysql> desc tb1;
+-------+-----------------+------+-----+---------+----------------+
| Field | Type            | Null | Key | Default | Extra          |
+-------+-----------------+------+-----+---------+----------------+
| id    | bigint unsigned | NO   | PRI | NULL    | auto_increment |
| c2    | varchar(255)    | YES  |     | NULL    |                |
| c3    | int             | YES  |     | NULL    |                |
+-------+-----------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

表中数据共2086094行

mysql> SELECT count(*) FROM tb1 WHERE 1 = 1;
+----------+
| count(*) |
+----------+
|  2086094 |
+----------+
1 row in set (0.06 sec)

count 测试sql如下：

mysql> SELECT count(*)  FROM tb1  WHERE 1 = 1  AND (c2 LIKE 'a%' or c2 like 'b%' or c2 like 'c%') AND c3 = 3;
+----------+
| count(*) |
+----------+
|   126306 |
+----------+
1 row in set (0.47 sec)

可以看到用时高达0.47s。

第一步，给c2加索引

mysql> alter table tb1 add index idx_tb1_c2(c2);
Query OK, 0 rows affected (22.92 sec)
Records: 0  Duplicates: 0  Warnings: 0

效果如下：

mysql> SELECT count(*)  FROM tb1  WHERE 1 = 1  AND (c2 LIKE 'a%' or c2 like 'b%' or c2 like 'c%') AND c3 = 3;
+----------+
| count(*) |
+----------+
|   126306 |
+----------+
1 row in set (0.48 sec)

可以看到，并没有什么变化，explain一把，如图：

mysql> explain SELECT count(*)  FROM tb1  WHERE 1 = 1  AND (c2 LIKE 'a%' or c2 like 'b%' or c2 like 'c%') AND c3 = 3;
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | tb1   | NULL       | ALL  | idx_tb1_c2    | NULL | NULL    | NULL | 2079297 |     3.61 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

结果显示没有用到索引，走的全表扫描，众所周知，当like最左固定的时候，是可以用到索引的

第二步，给c3加索引

mysql> alter table tb1 add index idx_tb1_c3(c3);
Query OK, 0 rows affected (9.75 sec)
Records: 0  Duplicates: 0  Warnings: 0

效果如下：

mysql> SELECT count(*)  FROM tb1  WHERE 1 = 1  AND (c2 LIKE 'a%' or c2 like 'b%' or c2 like 'c%') AND c3 = 3;
+----------+
| count(*) |
+----------+
|   126306 |
+----------+
1 row in set (1.86 sec)

可以看到，查询反而更慢了，explain一把，如图：

mysql> explain SELECT count(*)  FROM tb1  WHERE 1 = 1  AND (c2 LIKE 'a%' or c2 like 'b%' or c2 like 'c%') AND c3 =
 3;
+----+-------------+-------+------------+------+-----------------------+------------+---------+-------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys         | key        | key_len | ref   | rows    | filtered | Extra       |
+----+-------------+-------+------------+------+-----------------------+------------+---------+-------+---------+----------+-------------+
|  1 | SIMPLE      | tb1   | NULL       | ref  | idx_tb1_c2,idx_tb1_c3 | idx_tb1_c3 | 5       | const | 1039648 |    36.06 | Using where |
+----+-------------+-------+------------+------+-----------------------+------------+---------+-------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

结果显示用到了索引 idx_tb1_c3

第三步，建立联合索引c2在前，c3在后

mysql> alter table tb1 add index idx_tb1_c2_c3(c2,c3);
Query OK, 0 rows affected (28.07 sec)
Records: 0  Duplicates: 0  Warnings: 0

效果如下：

mysql> SELECT count(*)  FROM tb1  WHERE 1 = 1  AND (c2 LIKE 'a%' or c2 like 'b%' or c2 like 'c%') AND c3 = 3;
+----------+
| count(*) |
+----------+
|   126306 |
+----------+
1 row in set (10.58 sec)

可以看到，查询效率更慢了，explain一把，如图：

mysql> explain SELECT count(*)  FROM tb1  WHERE 1 = 1  AND (c2 LIKE 'a%' or c2 like 'b%' or c2 like 'c%') AND c3 = 
 3;
+----+-------------+-------+------------+------+-------------------------------------+------------+---------+-------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys                       | key        | key_len | ref   | rows    | filtered | Extra       |
+----+-------------+-------+------------+------+-------------------------------------+------------+---------+-------+---------+----------+-------------+
|  1 | SIMPLE      | tb1   | NULL       | ref  | idx_tb1_c2,idx_tb1_c3,idx_tb1_c2_c3 | idx_tb1_c3 | 5       | const | 1039648 |    36.06 | Using where |
+----+-------------+-------+------------+------+-------------------------------------+------------+---------+-------+---------+----------+-------------+
1 row in set, 1 warning (0.03 sec)

结果显示用到索引idx_tb1_c3，新建的idx_tb1_c2_c3并没生效

第四步，建立联合索引，c3在前，c2在后

mysql> alter table tb1 add index idx_tb1_c3_c2(c3,c2);
Query OK, 0 rows affected (25.46 sec)
Records: 0  Duplicates: 0  Warnings: 0

效果如下：

mysql> SELECT count(*)  FROM tb1  WHERE 1 = 1  AND (c2 LIKE 'a%' or c2 like 'b%' or c2 like 'c%') AND c3 = 3;
+----------+
| count(*) |
+----------+
|   126306 |
+----------+
1 row in set (0.06 sec)

可以看到，查询效率非常的快，仅0.06秒，explain一把，如下：

mysql> explain SELECT count(*)  FROM tb1  WHERE 1 = 1  AND (c2 LIKE 'a%' or c2 like 'b%' or c2 like 'c%') AND c3 = 
 3;
+----+-------------+-------+------------+-------+---------------------------------------------------+---------------+---------+------+--------+----------+--------------------------+
| id | select_type | table | partitions | type  | possible_keys                                     | key           | key_len | ref  | rows   | filtered | Extra                    |
+----+-------------+-------+------------+-------+---------------------------------------------------+---------------+---------+------+--------+----------+--------------------------+
|  1 | SIMPLE      | tb1   | NULL       | range | idx_tb1_c2,idx_tb1_c3,idx_tb1_c2_c3,idx_tb1_c3_c2 | idx_tb1_c3_c2 | 773     | NULL | 250480 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+-------+---------------------------------------------------+---------------+---------+------+--------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

总结：

第三步，为什么用不到索引，众所周知，当为联合索引时，b+树遵循最左匹配原则，c2叶子节点上会维护一个链表或b+树，存放c3的数据，但c3的数据比较少，所以仅会用到链表，即使用到索引，查询效果也不理想。第四步，为什么会用到索引，同理，c3的数据比较少，所以b+树的高度非常低，然后在c3的叶子节点维护一个b+树，为什么是b+树呢，因为c2的数据非常多。所以当我们在建立联合索引时，尽可能把穷举少的列放在前面。

以上就是全部内容，各位大佬们有好的建议，欢迎在评论区留言。

原文链接：https://blog.csdn.net/wyanyi/article/details/123751028