表数据如下:
--------------------------------------------------------------
ID NAME ADDR ROWID
1 1 test addr1 AAARGiAAEAAAAA+AAA
2 1 test addr2 AAARGiAAEAAAAA/AAA
3 1 test addr3 AAARGiAAEAAAAA/AAB
4 2 gogogo addr1 AAARGiAAEAAAAA+AAB
5 2 gogogo addr2 AAARGiAAEAAAAA/AAC
6 3 aaaaaf bbbbbb AAARGiAAEAAAAA+AAC
--------------------------------------------------------------
现在想删除重复记录。重复列为ID和NAME(测试用,无主键)思路:获得全部重复记录,分组并取得最小rowid,删除其他rowid的行
=================================================================
SQL1:
select t.*, rowid
from emp t
where t.rowid not in (select min(rowid)
from emp e
where e.id = t.id
and e.name = t.name
group by e.id, e.name
having count(*)>1)查询结果如下:
ID NAME ADDR ROWID
1 3 aaaaaf bbbbbb AAARGiAAEAAAAA+AAC
2 1 test addr2 AAARGiAAEAAAAA/AAA
3 1 test addr3 AAARGiAAEAAAAA/AAB
4 2 gogogo addr2 AAARGiAAEAAAAA/AAC
--------------------------------------------------------------
SQL2:
select t.*, rowid
from emp t
where t.rowid <> (select min(rowid)
from emp e
where e.id = t.id
and e.name = t.name
group by e.id, e.name
having count(*)>1)
查询结果如下: ID NAME ADDR ROWID
1 1 test addr2 AAARGiAAEAAAAA/AAA
2 1 test addr3 AAARGiAAEAAAAA/AAB
3 2 gogogo addr2 AAARGiAAEAAAAA/AAC
--------------------------------------------------------------
问题:为何SQL1,2取得结果不同,跪求原理。望大神告知。
--------------------------------------------------------------
ID NAME ADDR ROWID
1 1 test addr1 AAARGiAAEAAAAA+AAA
2 1 test addr2 AAARGiAAEAAAAA/AAA
3 1 test addr3 AAARGiAAEAAAAA/AAB
4 2 gogogo addr1 AAARGiAAEAAAAA+AAB
5 2 gogogo addr2 AAARGiAAEAAAAA/AAC
6 3 aaaaaf bbbbbb AAARGiAAEAAAAA+AAC
--------------------------------------------------------------
现在想删除重复记录。重复列为ID和NAME(测试用,无主键)思路:获得全部重复记录,分组并取得最小rowid,删除其他rowid的行
=================================================================
SQL1:
select t.*, rowid
from emp t
where t.rowid not in (select min(rowid)
from emp e
where e.id = t.id
and e.name = t.name
group by e.id, e.name
having count(*)>1)查询结果如下:
ID NAME ADDR ROWID
1 3 aaaaaf bbbbbb AAARGiAAEAAAAA+AAC
2 1 test addr2 AAARGiAAEAAAAA/AAA
3 1 test addr3 AAARGiAAEAAAAA/AAB
4 2 gogogo addr2 AAARGiAAEAAAAA/AAC
--------------------------------------------------------------
SQL2:
select t.*, rowid
from emp t
where t.rowid <> (select min(rowid)
from emp e
where e.id = t.id
and e.name = t.name
group by e.id, e.name
having count(*)>1)
查询结果如下: ID NAME ADDR ROWID
1 1 test addr2 AAARGiAAEAAAAA/AAA
2 1 test addr3 AAARGiAAEAAAAA/AAB
3 2 gogogo addr2 AAARGiAAEAAAAA/AAC
--------------------------------------------------------------
问题:为何SQL1,2取得结果不同,跪求原理。望大神告知。
where not exists (select 1
from emp e
where e.id = t.id
and e.name = t.name
group by e.id, e.name
having min(rowid)=t.rowid and count(*)>1);
当遇到唯一的时候 count(*)>1 永远为假 所以唯一的值也会查询出来第2 个是使用<> 比较 即使不存在 是<>null 也是真
感谢回复。
关于<>的回复,我不是很理解。可否详细说明下?
<>就是直接循环用的<>比较 如果不存在 <>min(rowid) 这很明显啦 即使min(rowid)为空 也<> 也是假
filter false 不查询出来 前面说是真错啦 Null和任何值 任何比较都是 null
是不是你第一个查询到第二个查询间有删除了数据呢?
或者你先执行一下子查询看一下结果
另外,对于SQL2的问题我明白了:当主查询循环到aaaaaf这条记录时,子查询返回的结果是NULL。
判断rowid <> NULL,则返回unknown,所以aaaaaf不会添加的结果集中。让我困惑的是SQL1, 也就是not in NULL 为何会将结果添加到结果集中,还是不能用SQL2的方式去解析SQL1?
有大神知道么?继续跪求原理
-------------------------------
Name Type Nullable Default Storage Comments
ID NUMBER N
NAME VARCHAR2(100) Y
ADDR VARCHAR2(200) Y