现在有一个表A,字段如下:
id date times
10 2009-01-01 3
11 2009-01-02 0
12 2009-01-03 0
13 2009-01-04 0
14 2009-01-05 2
15 2009-01-06 0
16 2009-01-07 0
17 2009-01-08 0
18 2009-01-09 0
19 2009-01-10 0
20 2009-01-05 6
21 2009-01-06 0
22 2009-01-07 0
23 2009-01-08 0
24 2009-01-09 0
25 2009-01-10 0
26 2009-01-09 0
27 2009-01-09 0
要实现的是过滤连续5条或5条以上times字段为0的数据,
过滤后的结果集如下:
id date times
10 2009-01-01 3
11 2009-01-02 0
12 2009-01-03 0
13 2009-01-04 0
14 2009-01-05 2
20 2009-01-05 6
数据量比较大,有2000多万,所以不希望用太多条嵌套的SQL语句。问题解决了我把所有的分都给您。非常感谢!
id date times
10 2009-01-01 3
11 2009-01-02 0
12 2009-01-03 0
13 2009-01-04 0
14 2009-01-05 2
15 2009-01-06 0
16 2009-01-07 0
17 2009-01-08 0
18 2009-01-09 0
19 2009-01-10 0
20 2009-01-05 6
21 2009-01-06 0
22 2009-01-07 0
23 2009-01-08 0
24 2009-01-09 0
25 2009-01-10 0
26 2009-01-09 0
27 2009-01-09 0
要实现的是过滤连续5条或5条以上times字段为0的数据,
过滤后的结果集如下:
id date times
10 2009-01-01 3
11 2009-01-02 0
12 2009-01-03 0
13 2009-01-04 0
14 2009-01-05 2
20 2009-01-05 6
数据量比较大,有2000多万,所以不希望用太多条嵌套的SQL语句。问题解决了我把所有的分都给您。非常感谢!
select * from A a1
where exists (select * from A where id>=a1.id and id<=a1.id+4 and times<>0)
select * from A a1
where exists (select * from A where id>=a1.id and id<=a1.id+4 and times<>0)
or exists (select * from A where id>=a1.id-1 and id<=a1.id+3 and times<>0)
or exists (select * from A where id>=a1.id-2 and id<=a1.id+2 and times<>0)
or exists (select * from A where id>=a1.id-3 and id<=a1.id+1 and times<>0)
or exists (select * from A where id>=a1.id-4 and id<=a1.id and times<>0)
select * from A a1
where exists (select 1 from A a2 where a2.id>=a1.id and id<=a1.id+4 and times<>0)
select * from A a1
where exists (select * from A where id>=a1.id and id<=a1.id+4 and times<>0)
and exists (select * from A where id>=a1.id-1 and id<=a1.id+3 and times<>0)
and exists (select * from A where id>=a1.id-2 and id<=a1.id+2 and times<>0)
and exists (select * from A where id>=a1.id-3 and id<=a1.id+1 and times<>0)
and exists (select * from A where id>=a1.id-4 and id<=a1.id and times<>0)
select * from A a1
where not exists (select 1 from A a2 where a2.id>=a1.id and id<=a1.id+4 and times=0)
非常感谢,不过id不是连续的。就算连续的话,我试了一下这个SQL语句也实现不了。
不过4楼的方法对于连续的id应该是可以的select * from A a1
where exists (select * from A where id>=a1.id and id<=a1.id+4 and times<>0)
and exists (select * from A where id>=a1.id-1 and id<=a1.id+3 and times<>0)
and exists (select * from A where id>=a1.id-2 and id<=a1.id+2 and times<>0)
and exists (select * from A where id>=a1.id-3 and id<=a1.id+1 and times<>0)
and exists (select * from A where id>=a1.id-4 and id<=a1.id and times<>0)
select id,date,times
from A a1
where 5<=(select count(*) from A a2
where id<(select min(id) from A WHERE ID >A1.ID AND TIMES<>0)
and id>(select max(id) from A WHERE ID <A1.ID AND TIMES<>0))
select * from A a1
where exists (select * from A where id>=a1.id and id <=a1.id+4 and times <>0)
and exists (select * from A where id>=a1.id-1 and id <=a1.id+3 and times <>0)
and exists (select * from A where id>=a1.id-2 and id <=a1.id+2 and times <>0)
and exists (select * from A where id>=a1.id-3 and id <=a1.id+1 and times <>0)
and exists (select * from A where id>=a1.id-4 and id <=a1.id and times <>0)
这个语句肯定是不行的。对于2000万条数据来说,这效率也太低了。还有ID不是连续的。select id,date,times
from A a1
where 5<=(select count(*) from A a2
where id<(select min(id) from A WHERE ID >A1.ID AND TIMES<>0)
and id>(select max(id) from A WHERE ID <A1.ID AND TIMES<>0))
刚试过这个语句,实现不了。
--------------------------------------- ----------- ---------------------------------------
10 2009-1-1 3
11 2009-1-2 0
12 2009-1-3 0
13 2009-1-4 0
14 2009-1-5 2
15 2009-1-6 0
16 2009-1-7 0
17 2009-1-8 0
18 2009-1-9 0
19 2009-1-10 0
20 2009-1-5 6
21 2009-1-6 0
22 2009-1-7 0
23 2009-1-8 0
24 2009-1-9 0
25 2009-1-10 0
26 2009-1-9 0
27 2009-1-9 0
28 2009-1-9 119 rows selectedSQL>
SQL> SELECT ID, SJ, TIMES
2 FROM (SELECT t.ID,
3 t.SJ,
4 t.TIMES,
5 COUNT(1) OVER(PARTITION BY ZS) C,
6 ROW_NUMBER() OVER(PARTITION BY ZS ORDER BY ROWNUM) RID
7 FROM (SELECT A.*, SUM(TIMES) OVER(ORDER BY ID) ZS FROM A) t)
8 WHERE C < 6 or (C >= 6 AND RID = 1)
9 / ID SJ TIMES
--------------------------------------- ----------- ---------------------------------------
10 2009-1-1 3
11 2009-1-2 0
12 2009-1-3 0
13 2009-1-4 0
14 2009-1-5 2
20 2009-1-5 6
28 2009-1-9 17 rows selectedSQL>
非常感谢您,我试过了。效率不会比原来的差很多。又可以实现这个效果。结贴了~~
SELECT ID, SJ, TIMES
2 FROM (SELECT t.ID,
3 t.SJ,
4 t.TIMES,
5 COUNT(1) OVER(PARTITION BY ZS) C,
6 ROW_NUMBER() OVER(PARTITION BY ZS ORDER BY ROWNUM) RID
7 FROM (SELECT A.*, SUM(TIMES) OVER(ORDER BY ID) ZS FROM A) t)
8 WHERE C < 6 or (C >= 6 AND RID = 1)