Need help with a SQL query

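There are two tables, t1(id, charge) and t2(id, charge).
Both tables have an index on the id column, and each holds more than 10 million rows.
I need to list the rows in t1 whose id does not appear in t2. This is my first time querying data at this scale, so any pointers from the experts would be appreciated.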

Solutions »

select * from t1 newt1 where not exists
(select 1 from t2 where newt1.id=t2.id)
-- match on id only
select t1.* from t1 where id not in (select id from t2)
-- match on both id and charge
select t1.* from t1 where not exists (select 1 from t2 where t2.id = t1.id and t2.charge = t1.charge)
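One caveat about the NOT IN form: if t2.id can contain NULL, NOT IN returns no rows at all, while NOT EXISTS still returns the unmatched ids. The sketch below shows this on a tiny data set. It uses Python's built-in sqlite3 module as a stand-in for Oracle, and the sample rows (including the NULL id) are made up for illustration; the NULL semantics shown are standard SQL behaviour.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle tables in the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, charge INTEGER);
    CREATE TABLE t2 (id INTEGER, charge INTEGER);
    INSERT INTO t1 VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO t2 VALUES (1, 10), (NULL, 99);  -- hypothetical NULL id in t2
""")

# NOT IN: "2 NOT IN (1, NULL)" evaluates to unknown, so the row is filtered out.
not_in = conn.execute(
    "SELECT id FROM t1 WHERE id NOT IN (SELECT id FROM t2) ORDER BY id"
).fetchall()

# NOT EXISTS: the NULL row in t2 simply never matches, so ids 2 and 3 survive.
not_exists = conn.execute(
    "SELECT id FROM t1 WHERE NOT EXISTS "
    "(SELECT 1 FROM t2 WHERE t2.id = t1.id) ORDER BY id"
).fetchall()

print(not_in)      # []
print(not_exists)  # [(2,), (3,)]
```

If id is declared NOT NULL in both tables, the two forms return the same rows.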
If you only need the id:
select id from t01 where not exists (select 1 from t02 where t01.id = t02.id)
select id from t01 where id not in (select id from t02)
Statements like these are the fastest. They use the indexes on both tables, and the execution plans are exactly the same.
If you want all the columns from t01:
select * from t01 where not exists (select 1 from t02 where t01.id = t02.id)
select * from t01 where id not in (select id from t02)
these use:
1. a TABLE ACCESS FULL on t01
2. an INDEX FAST FULL SCAN on the t02 index
In other words, the execution plans are again the same.
select t1.* from t1 where not exists (select 1 from t2 where t2.id = t1.id and t2.charge = t1.charge)
A statement like this is the worst. It needs two TABLE ACCESS FULL operations, that is, two full table scans, heh.
That will kill you.
select * from t01 where not exists (select 1 from t02 where t01.id = t02.id)
select * from t01 where id not in (select id from t02)
I tried these, and the first statement is much faster than the second.
Can anyone explain why?
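Heh, take a look at your own execution plans and you will see. The most common explanation:
it has to do with execution order.
With EXISTS, the main query runs first, then the subquery runs for each row and stops at the first match.
With IN, the subquery is evaluated first and its result is stored in a temporary table. Only after the subquery has finished does the main query start. Building the temporary table, and then working against it, is where the time goes.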
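The two execution orders described above can be sketched as a toy model in plain Python. This only illustrates the explanation given in the thread; it is not how Oracle actually executes these statements, and the optimizer may well rewrite both forms into the same anti-join, which is why checking the real plan matters. The function names and sample ids are hypothetical.

```python
from bisect import bisect_left


def anti_join_probe(outer_ids, sorted_inner_ids):
    """EXISTS-style order: walk the outer rows and probe a sorted 'index'
    for each one, stopping at the first match."""
    result = []
    for oid in outer_ids:
        pos = bisect_left(sorted_inner_ids, oid)
        found = pos < len(sorted_inner_ids) and sorted_inner_ids[pos] == oid
        if not found:
            result.append(oid)
    return result


def anti_join_materialize(outer_ids, inner_ids):
    """IN-style order as described in the thread: build the full subquery
    result first (the 'temporary table'), then filter the outer rows."""
    temp = set(inner_ids)  # extra up-front work before any outer row is read
    return [oid for oid in outer_ids if oid not in temp]


outer_ids = [1, 2, 3, 4, 5]   # stands in for t01.id
inner_ids = [6, 4, 2]         # stands in for t02.id

print(anti_join_probe(outer_ids, sorted(inner_ids)))   # [1, 3, 5]
print(anti_join_materialize(outer_ids, inner_ids))     # [1, 3, 5]
```

Both strategies return the same rows; they differ only in when the work on the inner table happens.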
select t1.* from t1, t2 where t1.id = t2.id(+) and t2.id is null
(outer join)
Note: ideally the data of both tables is stored in partitioned tablespaces, with the indexes partitioned as well.
Or subtract directly with MINUS:
select id from t1
minus
select id from t2
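To compare the outer-join and MINUS approaches with NOT EXISTS, here is a small sketch using Python's built-in sqlite3 module as a stand-in for Oracle. SQLite has neither the (+) operator nor MINUS, so the sketch assumes the ANSI equivalents LEFT JOIN ... IS NULL and EXCEPT. The sample rows, including the duplicate id in t1, are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, charge INTEGER);
    CREATE TABLE t2 (id INTEGER, charge INTEGER);
    INSERT INTO t1 VALUES (1, 10), (2, 20), (2, 25), (3, 30);  -- id 2 appears twice
    INSERT INTO t2 VALUES (1, 10);
""")

# NOT EXISTS anti-join: keeps every unmatched t1 row.
not_exists = conn.execute(
    "SELECT id FROM t1 WHERE NOT EXISTS "
    "(SELECT 1 FROM t2 WHERE t2.id = t1.id) ORDER BY id"
).fetchall()

# Outer join, keeping only rows where no t2 partner was found.
outer_join = conn.execute(
    "SELECT t1.id FROM t1 LEFT JOIN t2 ON t1.id = t2.id "
    "WHERE t2.id IS NULL ORDER BY t1.id"
).fetchall()

# Set difference (EXCEPT here, MINUS in Oracle): returns distinct ids only.
except_ids = conn.execute(
    "SELECT id FROM t1 EXCEPT SELECT id FROM t2 ORDER BY id"
).fetchall()

print(not_exists)  # [(2,), (2,), (3,)]
print(outer_join)  # [(2,), (2,), (3,)]
print(except_ids)  # [(2,), (3,)]
```

Note that the set-difference form removes duplicates and only returns the columns being compared, so it answers "which ids are missing" rather than "which rows are missing".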