说明:
1.in(list)运算,如果list是一个明确的项目列表,in(list)运算性能尚可。
2.in(list)运算,如果list是一个子查询,那么只有在满足 1)父查询是覆盖索引查询 2)过滤性够好(我的不知道错误还是正确的85%无效结果的排除率) 的时候父查询才会用的上索引,反之则用不上索引。
就如上面那句
select feed_name, feed_url from feed where feed_id = ( select feed_id from user,subscription where user.user_id = subscription.user_id and user_name = 'Tom');
大部分人的想法可能是 首先 子查询查询一堆feed_id 之后 返回feed_id的列表给父查询
而实际上MySQL优化器并不会这么做,而是先对feed表进行全表扫描,或者如果有个 idx1(feed_id,feed_name, feed_url) 索引,则进行全索引扫描,之后把feed表的feed_id代入到子查询,效率有多低相信你应该知道。至少目前已经GA的版本都是这样,not in一样的方式。
mysql> show create table a\G
*************************** 1. row ***************************
Table: a
Create Table: CREATE TABLE `a` (
`id` int(10) NOT NULL AUTO_INCREMENT,
`uid` int(10) NOT NULL,
`time` datetime NOT NULL,
`type` tinyint(4) DEFAULT NULL,
`status` varchar(320) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_uid` (`uid`)
) ENGINE=InnoDB AUTO_INCREMENT=500001 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
+----+-----+---------------------+------+-----------+
| id | uid | time | type | status |
+----+-----+---------------------+------+-----------+
| 1 | 1 | 2012-01-16 17:34:24 | 1 | vvvvvvvvv |
| 2 | 1 | 2012-01-16 17:34:10 | 2 | vvvvvvvvv |
| 3 | 1 | 2012-01-11 12:16:56 | 0 | vvvvvvvvv |
| 4 | 1 | 2012-01-16 17:34:24 | 1 | vvvvvvvvv |
| 5 | 1 | 2012-01-16 17:34:10 | 2 | vvvvvvvvv |
+----+-----+---------------------+------+-----------+
mysql> select * from c;
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
+----+
10 rows in set (0.00 sec)
1.如果list是一个明确的项目列表
mysql> explain select count(*) from a where uid in (1,2,3,4,5,6,7,8,9,10)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: a
type: range
possible_keys: idx_uid
key: idx_uid
key_len: 4
ref: NULL
rows: 4996
Extra: Using where; Using index
1 row in set (0.00 sec)
mysql> select sql_no_cache count(id) from b where uid in (1,2,3,4,5,6,7,8,9,10);
+-----------+
| count(id) |
+-----------+
| 5000 |
+-----------+
1 row in set (0.00 sec)
2.如果list是一个子查询,满足 1)父查询是覆盖索引查询 2)过滤性够好 父查询用的上索引
mysql> explain select count(uid) from a where uid in (select id from c)\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: a
type: index
possible_keys: NULL
key: idx_uid
key_len: 4
ref: NULL
rows: 500101
Extra: Using where; Using index
*************************** 2. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: c
type: unique_subquery
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: func
rows: 1
Extra: Using index
2 rows in set (0.00 sec)
mysql> select sql_no_cache count(uid) from a where uid in (select id from c);
+------------+
| count(uid) |
+------------+
| 5000 |
+------------+
1 row in set (1.61 sec)
2.如果list是一个子查询,不满足 1)父查询是覆盖索引查询 2)过滤性够好 父查询用不上索引
mysql> explain select count(time) from a where uid in (select id from c)\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: a
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 500101
Extra: Using where
*************************** 2. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: c
type: unique_subquery
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: func
rows: 1
Extra: Using index
2 rows in set (0.00 sec)
mysql> select sql_no_cache count(time) from a where uid in (select id from c);
+-------------+
| count(time) |
+-------------+
| 5000 |
+-------------+
1 row in set (1.68 sec)