根据emai字段去除重复,有个附加条件是另一个字段出现频率最高的数据保留,其他的删除
数据表中有
id email country
1 [email protected] Japan
2 [email protected] China
3 [email protected] China
4 [email protected] Japan
5 [email protected] Chain
6 [email protected] Chain
去除重复项的结果为[email protected] China
[email protected] Japan
[email protected] Chain 重复的email [email protected] country ,China有3次,Japan有两次,左右保留的Chain
数据表中有
id email country
1 [email protected] Japan
2 [email protected] China
3 [email protected] China
4 [email protected] Japan
5 [email protected] Chain
6 [email protected] Chain
去除重复项的结果为[email protected] China
[email protected] Japan
[email protected] Chain 重复的email [email protected] country ,China有3次,Japan有两次,左右保留的Chain
from tb
group by email,country
SELECT email,country,count(*) as gs from [long] group by email,country) a
where not exists(select 1 from (SELECT email,country,count(*) as gs from [long] group by email,country) a1 where a.email=a1.email and a.gs<a1.gs)
from 数据表