EXISTS is fast. In this case you can also use an inner join, which is faster:
select T1.*
from test1 T1, test2 T2
where T1.NAME = T2.NAME
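One caveat on the join rewrite is worth checking before relying on it: a join returns one row per match, so it is only interchangeable with EXISTS when test2.NAME is unique. The sketch below uses Python's built-in sqlite3 module as a stand-in database (an assumption; the thread itself concerns Oracle), and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test1 (NAME TEXT);
    CREATE TABLE test2 (NAME TEXT);
    INSERT INTO test1 VALUES ('a'), ('b'), ('c');
    -- 'a' appears twice in test2 on purpose
    INSERT INTO test2 VALUES ('a'), ('a'), ('b');
""")

# Inner join: one output row per matching pair of rows.
join_rows = conn.execute(
    "SELECT T1.NAME FROM test1 T1, test2 T2 "
    "WHERE T1.NAME = T2.NAME ORDER BY T1.NAME"
).fetchall()

# EXISTS: each test1 row is returned at most once.
exists_rows = conn.execute(
    "SELECT T1.NAME FROM test1 T1 WHERE EXISTS "
    "(SELECT 1 FROM test2 T2 WHERE T2.NAME = T1.NAME) ORDER BY T1.NAME"
).fetchall()

print(join_rows)
print(exists_rows)
```

With this data the join returns 'a' twice while EXISTS returns it once, so the join needs a DISTINCT (or a unique test2.NAME) to give the same answer.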
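Replace IN with EXISTS
In many queries against a base table, satisfying a condition requires joining to another table. In such cases, using EXISTS (or NOT EXISTS) will usually make the query more efficient.
Inefficient:
SELECT *
FROM EMP -- base table
WHERE EMPNO > 0
AND DEPTNO IN (SELECT DEPTNO
               FROM DEPT
               WHERE LOC = 'MELB')
Efficient:
SELECT *
FROM EMP -- base table
WHERE EMPNO > 0
AND EXISTS (SELECT 'X'
            FROM DEPT
            WHERE DEPT.DEPTNO = EMP.DEPTNO
            AND LOC = 'MELB')
(Translator's note: relatively speaking, replacing NOT IN with NOT EXISTS improves efficiency even more markedly, as the next section will point out.)
What you are seeing happens because test2 holds only a small amount of data. In that situation there is generally no need to work out which form is better, because the difference will not be large.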
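As a quick sanity check of the EMP/DEPT rewrite above, the following sketch runs both forms against a tiny in-memory SQLite database and compares the rows they return. SQLite (via Python's sqlite3 module) and the sample rows are assumptions made for illustration; the check says nothing about performance, only about results. It also shows why the NOT IN to NOT EXISTS swap mentioned in the translator's note needs care: once the subquery can return NULL, the two negated forms no longer return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPT (DEPTNO INTEGER, LOC TEXT);
    CREATE TABLE EMP  (EMPNO INTEGER, DEPTNO INTEGER);
    INSERT INTO DEPT VALUES (10, 'MELB'), (20, 'SYD');
    INSERT INTO EMP  VALUES (1, 10), (2, 20), (3, 10), (4, NULL), (5, 30);
""")

in_sql = """
    SELECT EMPNO FROM EMP
    WHERE EMPNO > 0
      AND DEPTNO IN (SELECT DEPTNO FROM DEPT WHERE LOC = 'MELB')
    ORDER BY EMPNO
"""
exists_sql = """
    SELECT EMPNO FROM EMP
    WHERE EMPNO > 0
      AND EXISTS (SELECT 'X' FROM DEPT
                  WHERE DEPT.DEPTNO = EMP.DEPTNO AND LOC = 'MELB')
    ORDER BY EMPNO
"""
in_rows = [r[0] for r in conn.execute(in_sql)]
exists_rows = [r[0] for r in conn.execute(exists_sql)]

# Add a department whose DEPTNO is NULL, then compare the negated forms.
conn.execute("INSERT INTO DEPT VALUES (NULL, 'PER')")
not_in_rows = [r[0] for r in conn.execute(
    "SELECT EMPNO FROM EMP "
    "WHERE DEPTNO NOT IN (SELECT DEPTNO FROM DEPT) ORDER BY EMPNO")]
not_exists_rows = [r[0] for r in conn.execute(
    "SELECT EMPNO FROM EMP WHERE NOT EXISTS "
    "(SELECT 'X' FROM DEPT WHERE DEPT.DEPTNO = EMP.DEPTNO) ORDER BY EMPNO")]

print(in_rows, exists_rows)          # same employees from both forms
print(not_in_rows, not_exists_rows)  # differ once the subquery yields a NULL
```

Here IN and EXISTS agree, but NOT IN returns nothing because comparing against a NULL in the subquery result is never true, while NOT EXISTS still returns the employees without a matching department.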
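On massive data sets EXISTS is extremely fast; IN is worth considering when the data is on the order of tens of thousands of rows.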
Well, the two are processed very, very differently.
IN & EXISTS (reposted): some examples of situations where IN is better than EXISTS
Select * from T1 where x in ( select y from T2 )
is typically processed as:
select *
from t1, ( select distinct y from t2 ) t2
where t1.x = t2.y;
The subquery is evaluated, distinct'ed, indexed (or hashed or sorted) and then
joined to the original table -- typically.
As opposed to
select * from t1 where exists ( select null from t2 where y = x )
That is processed more like:
for x in ( select * from t1 )
loop
   if ( exists ( select null from t2 where y = x.x ) )
   then
      OUTPUT THE RECORD
   end if
end loop
It always results in a full scan of T1 whereas the first query can make use of
an index on T1(x).
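The two processing styles described above can be sketched in plain Python over in-memory lists. This is only a toy model of the description, not how any real optimizer is implemented; the function names, the dictionary standing in for an index on t2(y), and the sample values are all made up for illustration.

```python
# Toy in-memory tables; the values are made up for illustration.
t1 = [{"x": 1}, {"x": 2}, {"x": 3}, {"x": 2}]
t2 = [{"y": 2}, {"y": 2}, {"y": 3}, {"y": 4}]


def in_style(t1, t2):
    """IN: evaluate the subquery once, de-duplicate it, then join."""
    distinct_y = {row["y"] for row in t2}   # select distinct y from t2
    return [row for row in t1 if row["x"] in distinct_y]


def exists_style(t1, t2):
    """EXISTS: scan t1 and probe t2 once per outer row."""
    index_on_y = {}                         # stands in for an index on t2(y)
    for row in t2:
        index_on_y.setdefault(row["y"], []).append(row)
    out = []
    for row in t1:                          # full scan of t1
        if row["x"] in index_on_y:          # index probe into t2
            out.append(row)                 # OUTPUT THE RECORD
    return out


print(in_style(t1, t2))
print(exists_style(t1, t2))
```

Both functions return the same rows; what differs is where the work happens: the IN style pays up front to materialise the distinct subquery, while the EXISTS style pays one probe per row of t1.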
So, when is WHERE EXISTS appropriate, and when is IN appropriate?
Let's say the result of the subquery
( select y from T2 )
is "huge" and takes a long time. But the table T1 is relatively small and
executing ( select null from t2 where y = x.x ) is very, very fast (nice index on
t2(y)). Then EXISTS will be faster, as the time to full scan T1 and do the
index probe into T2 could be less than the time to simply full scan T2 to build
the subquery we need to distinct on.
Let's say the result of the subquery is small -- then IN is typically more
appropriate.
If both the subquery and the outer table are huge -- either might work as well
as the other -- it depends on the indexes and other factors.
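The rule of thumb above can be turned into a back-of-the-envelope comparison. The cost model below is a deliberately crude assumption, not something taken from the source or from any optimizer: one unit per row scanned, log2(n) units per index probe, every subquery row treated as distinct, and the subquery result treated as the whole of T2. The function names are hypothetical.

```python
import math


def probe_cost(n_rows):
    """Assumed cost of one index lookup into a table of n_rows rows."""
    return math.log2(max(n_rows, 2))


def cost_in(n_outer, n_sub):
    """IN: scan the subquery source once, then probe an index on T1(x)
    once per subquery row (upper bound: every y is distinct)."""
    return n_sub + n_sub * probe_cost(n_outer)


def cost_exists(n_outer, n_sub):
    """EXISTS: full scan of T1 plus one probe of the index on t2(y)
    per T1 row."""
    return n_outer + n_outer * probe_cost(n_sub)


def cheaper(n_outer, n_sub):
    """Name the form with the lower estimated cost under this toy model."""
    if cost_in(n_outer, n_sub) < cost_exists(n_outer, n_sub):
        return "IN"
    return "EXISTS"


print(cheaper(100, 1_000_000))   # small T1, huge subquery
print(cheaper(1_000_000, 100))   # huge T1, small subquery
```

Under these assumptions the model picks EXISTS for a small T1 with a huge subquery and IN for a huge T1 with a small subquery, which matches the advice above; when both sizes are large the two estimates are close, and real indexes and statistics decide.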
http://asktom.oracle.com/pls/ask/f?p=4950:8:2063933::NO::F4950_P8_DISPLAYID,F4950_P8_B:953229842074,Y