sql优化的问题：(not)exists 比 (not)in 好吗？

请比较下面两个检索sql，哪一个好一些呢：
select * from table1 where not exists
           (select i1 from table2  where table2.i1=table1.i1);select * from table1 where i1 not in (select i1 from table2 );
一般来说exists要比in效率高，但我用优化器分析了一下
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
   Object_owner object_name  cost  cardinality bytes IO cost
SELECT STATEMENT, GOAL = ALL_ROWS 3 1 12 3
HASH JOIN ANTI 3 1 12 3
  TABLE ACCESS FULL     abc table1 1 8 72 1
  TABLE ACCESS FULL     abc table2 1 8 24 1
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
和：
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
   Object_owner object_name  cost  cardinality bytes IO cost
SELECT STATEMENT, GOAL = ALL_ROWS 1 1 9 1
FILTER
  TABLE ACCESS FULL abc table1 1 1 9 1
  TABLE ACCESS FULL abc table2 1 1 3 1
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
好像用not in 的效果好？为什么呢？
另外，是不是上面的数值小的效率就高？请大家指点。

解决方案 »

免费领取超大流量手机卡，每月29元包185G流量+100分钟通话, 中国电信官方发货

你的TABLE2的i1字段有索引吗？如果没有的话，就会很慢了
没索引时两个效率差不多，建了索引后，not in效率就差远了，尤其是数据量比较大时
原来没有建索引，现在以i1字段建立索引，上面用not exists的基于代价的分析数据变大了。是不是上面的数据大的效率低呀？
not in与not exists还有一区别
当数据量较少时,not in执行速度比not exists快
反之,not exists较可取
(not) in and (not) exists
数据量大的情况下，(not) exists要快的多；
数据量小的情况下，两者差不多
但是单纯来讲的话，好象跟索引没什么关系。
ASKTOM.ORACLE.COM上有专门的文章，我简单总结了一下:
    1. 如果要查询的列上没有NULL值，NOT IN 和 NOT Exist效率
       是一样的.
    2. 跟beckhamboo说的基本类似:
        当数据量较少时,not in执行速度比not exists快
        反之,not exists较可取
    3. 同样，处理 IN 和Exist 也有下面的规则:        内查询为小表，外查询为大表时，使用IN的效率高,例如:
          select count(*) from bigtable
          where id in(select id from smalltable)
            效率高        select count(*)  from bigtable
           where exists ( select null from smalltable
           where smalltable.id = bigtable.id )
         效率低
        反之,内查询为大表，外查询为小表时，使用Exist的效率高.

           select count(*) from smalltable
           where id in (select id from bigtable)

           效率差一点             select count(*)  from smalltable
           where exists ( select null from bigtable
           where smalltable.id = bigtable.id )           效率好一点.
再次强调，数据量小的差距是可以接受的，我们应该关心数据量比较大的情况
而且not exists必须加索引才能大幅度提高性能
举一个比较鲜明对比的例子：用了两个not in或者not exists
从时间上可以清楚判断二者的效率差距，这个是加完索引以后的查询
加索引之前，not in依旧，not exists费时6分多种
fjmam@FJTVORA>select count(*) from oldsys.mainjm_temp;  COUNT(*)
----------
     14434已用时间:  00: 00: 00.01
fjmam@FJTVORA>select count(*) from oldsys.subjm_temp;  COUNT(*)
----------
     54694已用时间:  00: 00: 00.04
fjmam@FJTVORA>select count(*) from oldsys.littjm_temp;  COUNT(*)
----------
     12632已用时间:  00: 00: 00.01fjmam@FJTVORA>select count(*) from oldsys.littjm_temp a where not exists(select 1 from oldsys.mainjm
_temp b where b.mi = a.zhu)
  2  and not exists(select 1 from oldsys.subjm_temp c where c.索取号 = a.索取号);  COUNT(*)
----------
         5已用时间:  00: 00: 00.01
fjmam@FJTVORA>
fjmam@FJTVORA>select count(*) from oldsys.littjm_temp where zhu not in(select mi from oldsys.mainjm_
temp)
  2  and 索取号 not in(select 索取号 from oldsys.subjm_temp);  COUNT(*)
----------
         5已用时间:  00: 06: 27.01
再看看sql_trace的分析，结果一目了然了
********************************************************************************select count(*) from oldsys.littjm_temp a where not exists(select 1 from oldsys.mainjm_temp b where b.mi = a.zhu)
and not exists(select 1 from oldsys.subjm_temp c where c.索取号 = a.索取号)call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.07       0.17        197       5562          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.07       0.17        197       5562          0           1Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 66  Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=5562 r=197 w=0 time=175144 us)
      5   FILTER  (cr=5562 r=197 w=0 time=175120 us)
  12632    TABLE ACCESS FULL LITTJM_TEMP (cr=560 r=197 w=0 time=122603 us)
   1825    INDEX RANGE SCAN IDX_MAINJMTEMP_1 (cr=4492 r=0 w=0 time=15461 us)(object id 32028)
    252    INDEX RANGE SCAN IDX_SUBJMTEMP_2 (cr=510 r=0 w=0 time=2207 us)(object id 32030)********************************************************************************select count(*) from oldsys.littjm_temp where zhu not in(select mi from oldsys.mainjm_temp)
and 索取号 not in(select 索取号 from oldsys.subjm_temp)call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     57.73     387.18    1054902    2188181          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     57.73     387.18    1054902    2188181          0           1Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 66  Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=2188181 r=1054902 w=0 time=41669224 us)
      5   FILTER  (cr=2188181 r=1054902 w=0 time=199542625 us)
  12632    TABLE ACCESS FULL LITTJM_TEMP (cr=560 r=204 w=0 time=248655 us)
   1825    TABLE ACCESS FULL MAINJM_TEMP (cr=1830610 r=876578 w=0 time=308817129 us)
    252    TABLE ACCESS FULL SUBJM_TEMP (cr=357011 r=178120 w=0 time=78031991 us)********************************************************************************
(not)exists和(not)in 各有优、缺点
看你怎么样使用，考虑索引，可以根据记录数目等计算运算次数决定使用哪个，因此不分好坏！
强烈建议用not exists取代not in
btw
to kevin218(张智勇) :
你以前在深圳轩冕工作过吗？
可是not in 还使用的方便，有时候差不了多少的
HOHO, 又多了这么多发言。不知有哪位朋友去ASKTOM.Oracle.com
仔细看了, TOM 对IN/Exists, NOT IN/NOT Exists 都有专门的分析.
选 exists. 不过,子查询都影响速度.
用oracle优化提示:
select * from table1 where i1 not in (select /*+ hash_aj */ i1 from table2 );
速度飞快.
楼上的:
   你的例子中 il 列没有NULL值吧，如果有NULL值,你再试试,
   速度不是一般的慢.