left join 和not in效率问题？ - 调试易

left join 和not in效率问题？

在进行查询时要尽量少用in,not in,要用left join替换，但是最近遇到这样一个问题，在使用not in的查询时间比left join的查询时间要少，不知道什么原因？希望大家帮忙看看
查询的功能是，查询出集合a不在集合b中的数据。执行这条sql,查询出来的结果耗时0.39 sec
select a.ruid,b.ruid from( select distinct RUID from UserMsg where ID =12 and Time >= '2009-8-14 15:30:00' and Time<='2009-8-17 16:00:00') a  left join  ( select distinct RUID from UserMsg where ID =12 and Time < '2009-8-14 15:30:00'  ) b on a.ruid = b.ruid where b.ruid is null
exlpain 之后为
+----+-------------+------------+-------+----------------------+------------+---------+------+------+------------------------------+
| id | select_type | table      | type  | possible_keys        | key        | key_len | ref  | rows | Extra                        |
+----+-------------+------------+-------+----------------------+------------+---------+------+------+------------------------------+
|  1 | PRIMARY     | <derived2> | ALL   | NULL                 | NULL       |    NULL | NULL |  452 |                              |
|  1 | PRIMARY     | <derived3> | ALL   | NULL                 | NULL       |    NULL | NULL | 1112 | Using where; Not exists      |
|  3 | DERIVED     | UserMsg    | ref   | ID，Time             | ID         |       5 |      | 6755 | Using where; Using temporary |
|  2 | DERIVED     | UserMsg    | range | ID,Time              | Time       |       9 | NULL | 2115 | Using where; Using temporary |
+----+-------------+------------+-------+----------------------+------------+---------+------+------+------------------------------+
而执行下面这条not in之后执行时间为0.09sec
select add_tb.RUID from (select distinct RUID from UserMsg where ID =12 and Time>'2009-8-14 15:30:00'  and Time<='2009-8-17 16:00:00' ) add_tb where add_tb.RUID not in (select distinct RUID from UserMsg where ID =12 and  Time<'2009-8-14 15:30:00');
+----+--------------------+------------+----------------+---------------------------+------------+---------+------+------+------------------------------+
| id | select_type        | table      | type           | possible_keys             | key        | key_len | ref  | rows | Extra                        |
+----+--------------------+------------+----------------+---------------------------+------------+---------+------+------+------------------------------+
|  1 | PRIMARY            | <derived2> | ALL            | NULL                      | NULL       |    NULL | NULL |  452 | Using where                  |
|  3 | DEPENDENT SUBQUERY | UserMsg    | index_subquery | RUID,ID,Time              | RUID       |      96 | func |    2 | Using index; Using where     |
|  2 | DERIVED            | UserMsg    | range          | ID,Time                   |  Time      |       9 | NULL | 2113 | Using where; Using temporary |
+----+--------------------+------------+----------------+---------------------------+------------+---------+------+------+------------------------------+

解决方案 »

免费领取超大流量手机卡，每月29元包185G流量+100分钟通话, 中国电信官方发货

用了这么多子查询，看着眼花啊你这样试一下
select distinct a.RUID
from UserMsg a
left join UserMsg b on a.ruid = b.ruid and b.ID =12 and b.Time < '2009-8-14 15:30:00'
where a.ID =12 and a.Time >= '2009-8-14 15:30:00' and a.Time <='2009-8-17 16:00:00'
and b.ruid is null
这个是可以的，谢谢
但是你能说一下为什么我查询的两个sql的效率问题是怎么回事吗？谢谢
你用了那些子查询，导致执行计划里会使用临时表，效率就低了，你看一下我那个sql的执行计划对比一下
很明显的问题啦一样的数据和索引下，第2种情况肯定比1种情况快啦：第1种情况生成了2个临时表(IO量大），临时表之间的关联是没索引可用的。
第1种情况生成了1个临时表(IO量相对小），临时表与后面的子查询关联时，是用上了索引的。
另外，在进行查询时要尽量少用in,not in,要用left join替换
这句话的真正含义实际上指用join代替子查询，mysql对子查询处理的不好，这是个bug，一直没有很好的解决。
第一种方法
select a.ruid,b.ruid
from ( select distinct RUID
from UserMsg
where ID =12
and Time >= '2009-8-14 15:30:00'
and Time <='2009-8-17 16:00:00'
) a  left join  (
select distinct RUID
from UserMsg
where ID =12 and Time  < '2009-8-14 15:30:00'
) b on a.ruid = b.ruid
where b.ruid is null
你直接用了两个临时表, 如果两个临时表的返回记录数量均不多，则效率上是可行的，但如果这两个表 a,b 结果集比较大则无索引可用了。可以试一下这句。创建 (id,time) , (id,ruid,time) 两个索引
select a.ruid
from UserMsg a
where a.ID =12
and a.Time >= '2009-8-14 15:30:00'
and a.Time <='2009-8-17 16:00:00'
and not exists (
select RUID
from UserMsg
where ID =12 and Time  < '2009-8-14 15:30:00'
and ruid=a.ruid
)
感谢各位大侠的关注和指导，小弟谢过了，不知道除了上述方法能解决问题，还有什么其他的sql方法能实现同样的功能呢？谢谢