去重问题, 如何筛选出1小时内的重复数据每条记录都有一个时间戳
和上一条记录时间相差1小时内即为重复数据
如何筛选出重复数据,然后删除
例如:
时间转换成时分秒格式方便观看
id time
1 2012-12-25 10:10:11
2 2012-12-25 10:30:11 *
3 2012-12-25 10:40:11 *
4 2012-12-25 11:20:11记录2,3为重复,需要筛选出来以备删除
记录4虽然和3相差不到一小时,但记录2,3为重复数据,4和1对比超过一小时为不重复数据
数据量巨大, 用程序双循环时间太慢, 有什么好办法没?
和上一条记录时间相差1小时内即为重复数据
如何筛选出重复数据,然后删除
例如:
时间转换成时分秒格式方便观看
id time
1 2012-12-25 10:10:11
2 2012-12-25 10:30:11 *
3 2012-12-25 10:40:11 *
4 2012-12-25 11:20:11记录2,3为重复,需要筛选出来以备删除
记录4虽然和3相差不到一小时,但记录2,3为重复数据,4和1对比超过一小时为不重复数据
数据量巨大, 用程序双循环时间太慢, 有什么好办法没?
where exists (
select *
from table1 t
where time between d.time-interval 1 hour and d.time
and not exists (
select 1 from table1
where time between t.time-interval 1 hour and t.time
)
)
那跨小时就可能出现问题
例如:
时间转换成时分秒格式方便观看
id time
1 2012-12-25 09:50:11
2 2012-12-25 10:30:11 *
3 2012-12-25 10:40:11 *
4 2012-12-25 11:20:11
DELETE FROM table WHERE id IN (
SELECT DISTINCT id
FROM (
SELECT if( time - @pre >3600, 0, id ) AS id, if( time - @pre >3600, @pre := time, 0 )
FROM table
WHERE 1
ORDER BY time ASC
) t
)
-------------------------------------
上面的sql可以实现了,,,,
怎么把赋值写到一句话里?
最里面的两个if, 现在显示两个字段所以上面再嵌套一层需要取出一个id比较麻烦,除了用concat以外的办法, 怎么才能不让后面的id出结果???
1 2012-12-25 09:10:11
2 2012-12-25 10:30:11
3 2012-12-25 10:40:11
4 2012-12-25 11:20:11
2、3还算重复?SELECT @PRE,IF( `time` - @pre >3600, 0, id ) AS id, IF( `time` - @pre >3600, @pre:=`time`, 0 )
FROM tTQ,(SELECT @pre :='2012-1-1 00:00:01' ) B
WHERE 1
ORDER BY `time` ASC
SELECT if(@pre is null,@pre:=0,@pre:=@pre), if(............
这样的效果, 怎么mysql不支持那,郁闷啊
FROM tTQ,(SELECT @pre :='2012-1-1 00:00:01' ) B
WHERE 1
ORDER BY `time` ASC
在5.5下测试通过
1 2012-12-25 09:25:11
2 2012-12-25 10:30:11
3 2012-12-25 10:40:11
4 2012-12-25 11:20:11
要求结果是什么
1 2012-12-25 09:25:11
2 2012-12-25 10:30:11
3 2012-12-25 10:40:11 * 与id=2 比不足一小时,重复
4 2012-12-25 11:20:11 * 与id=2 比不足一小时,重复和上一条记录时间相差1小时内即为重复数据, 上一条为不重复的数据
这...这是mysql的吗? 不能执行啊, Mysql 5.1 另外time字段为标准的 unix时间戳
+----+---------------------+
| id | time |
+----+---------------------+
| 1 | 2012-12-25 10:10:11 |
| 2 | 2012-12-25 10:30:11 |
| 3 | 2012-12-25 10:40:11 |
| 4 | 2012-12-25 11:20:11 |
+----+---------------------+
4 rows in set (0.00 sec)mysql> select version();
+------------------+
| version() |
+------------------+
| 5.1.58-community |
+------------------+
1 row in set (0.00 sec)mysql> delete x from table1 x ,(
-> select id from table1 d
-> where exists (
-> select 1
-> from table1 t
-> where time > d.time-interval 1 hour and time < d.time
-> and not exists (
-> select 1 from table1
-> where time > t.time-interval 1 hour and time < t.time
-> )
-> )
-> ) y
-> where x.id=Y.id;
Query OK, 2 rows affected (0.00 sec)mysql> select ROW_COUNT();
+-------------+
| ROW_COUNT() |
+-------------+
| 2 |
+-------------+
1 row in set (0.00 sec)mysql> delete x from table1 x ,(
-> select id from table1 d
-> where exists (
-> select 1
-> from table1 t
-> where time > d.time-interval 1 hour and time < d.time
-> and not exists (
-> select 1 from table1
-> where time > t.time-interval 1 hour and time < t.time
-> )
-> )
-> ) y
-> where x.id=Y.id;
Query OK, 0 rows affected (0.00 sec)mysql> select ROW_COUNT();
+-------------+
| ROW_COUNT() |
+-------------+
| 0 |
+-------------+
1 row in set (0.00 sec)mysql> select * from table1;
+----+---------------------+
| id | time |
+----+---------------------+
| 1 | 2012-12-25 10:10:11 |
| 4 | 2012-12-25 11:20:11 |
+----+---------------------+
2 rows in set (0.00 sec)mysql>
SELECT `time`,@a:= `time`,@b:=1 FROM ttq LIMIT 1;
SELECT *,TIMEDIFF(`time`,@a),
@b:=IF(TIME_TO_SEC(TIMEDIFF(`time`,@a))>=3600 AND HOUR(@b)<>HOUR(`time`),1,0) AS bz,
@a:=IF(@b=0,@a,`time`) AS bz2
FROM ttq LIMIT 1,10000BZ为1满足条件