SELECT SUBSTR((concat(a.customer_id, b.customer_id)), 1, 4),
SUBSTR((concat(a.customer_id, b.customer_id)), 5),
a.commodity_id,
count(concat(a.customer_id, b.customer_id))
FROM (SELECT TO_CHAR(C4.Trade_Date, 'yyyymmdd') as Trade_Date,
Customer_id,
C4.Commodity_ID,
sign(buy_holding_change) as buy,
sign(sell_holding_change) as sell
FROM TRADE_Customer_Commodity_Total C4
WHERE C4.Trade_Date BETWEEN TO_DATE('20060801', 'yyyymmdd') AND
TO_DATE('20060810', 'yyyymmdd')
AND LENGTH(C4.Customer_id) = 4
AND LENGTH(C4.Commodity_ID) = 2
AND ABS(buy_holding_change) >= 800) a,
(SELECT TO_CHAR(C4.Trade_Date, 'yyyymmdd') as Trade_Date,
Customer_id,
C4.Commodity_ID,
sign(buy_holding_change) as buy,
sign(sell_holding_change) as sell
FROM TRADE_Customer_Commodity_Total C4
WHERE C4.Trade_Date BETWEEN TO_DATE('20060801', 'yyyymmdd') AND
TO_DATE('20060810', 'yyyymmdd')
AND LENGTH(C4.Customer_id) = 4
AND LENGTH(C4.Commodity_ID) = 2
AND ABS(buy_holding_change) >= 800) b WHERE a.TRADE_date = b.TRADE_date
and a.commodity_id = b.commodity_id
and a.customer_id > b.customer_id
AND a.buy = b.buy
AND a.buy <> 0
GROUP BY a.commodity_id, concat(a.customer_id, b.customer_id)
----------------------------------
a b 表相同,自身联合取满足条件:
a.TRADE_date = b.TRADE_date
and a.commodity_id = b.commodity_id
and a.customer_id > b.customer_id
AND a.buy = b.buy
AND a.buy <> 0的Customer_id不同组合的个数,
朋友们看看还能不能优化,现在数据量一大速度就特别慢。
SUBSTR((concat(a.customer_id, b.customer_id)), 5),
a.commodity_id,
count(concat(a.customer_id, b.customer_id))
FROM (SELECT TO_CHAR(C4.Trade_Date, 'yyyymmdd') as Trade_Date,
Customer_id,
C4.Commodity_ID,
sign(buy_holding_change) as buy,
sign(sell_holding_change) as sell
FROM TRADE_Customer_Commodity_Total C4
WHERE C4.Trade_Date BETWEEN TO_DATE('20060801', 'yyyymmdd') AND
TO_DATE('20060810', 'yyyymmdd')
AND LENGTH(C4.Customer_id) = 4
AND LENGTH(C4.Commodity_ID) = 2
AND ABS(buy_holding_change) >= 800) a,
(SELECT TO_CHAR(C4.Trade_Date, 'yyyymmdd') as Trade_Date,
Customer_id,
C4.Commodity_ID,
sign(buy_holding_change) as buy,
sign(sell_holding_change) as sell
FROM TRADE_Customer_Commodity_Total C4
WHERE C4.Trade_Date BETWEEN TO_DATE('20060801', 'yyyymmdd') AND
TO_DATE('20060810', 'yyyymmdd')
AND LENGTH(C4.Customer_id) = 4
AND LENGTH(C4.Commodity_ID) = 2
AND ABS(buy_holding_change) >= 800) b WHERE a.TRADE_date = b.TRADE_date
and a.commodity_id = b.commodity_id
and a.customer_id > b.customer_id
AND a.buy = b.buy
AND a.buy <> 0
GROUP BY a.commodity_id, concat(a.customer_id, b.customer_id)
----------------------------------
a b 表相同,自身联合取满足条件:
a.TRADE_date = b.TRADE_date
and a.commodity_id = b.commodity_id
and a.customer_id > b.customer_id
AND a.buy = b.buy
AND a.buy <> 0的Customer_id不同组合的个数,
朋友们看看还能不能优化,现在数据量一大速度就特别慢。
and a.customer_id > b.customer_id
是什么含义?子查询的自连接肯定很慢,不过我想肯定可以修改一下的
表一年的数据有1K万吧,a表查出的结果几万或几十万。楼主可以分别考虑一下a b两个结果集,再做整体考虑
现在要的结果就是从满足a的条件里再选出满足:
a.TRADE_date = b.TRADE_date
and a.commodity_id = b.commodity_id
and a.customer_id > b.customer_id
AND a.buy = b.buy
AND a.buy <> 0这些条件的,要用自连接实现的,好像a,b表不好弄两个结果集
主键是:Trade_Date,Customer_ID,Commodity_ID,已在主键上建立索引2) and a.commodity_id = b.commodity_id
and a.customer_id > b.customer_id
是什么含义?
a.commodity_id = b.commodity_id 统计同品种的
a.customer_id > b.customer_id 保证不同客户只取一次组合(排除自己和自己),
如查出结果有001,002两个客户,只取002<->001这各组合,不要001<->002不然结果就重复了,
AND a.buy = b.buy 保证买方向变化相同,全是正的或全是负的
子查询的自连接肯定很慢,不过我想肯定可以修改一下的
自连接一般改哪些方面效率会明显一些?谢谢指点。
2、既然LENGTH(C4.Customer_id) = 4 都为4, 那么在分组和输出中的:
concat(a.customer_id, b.customer_id)、
SUBSTR((concat(a.customer_id, b.customer_id)), 1, 4),
SUBSTR((concat(a.customer_id, b.customer_id)), 5)
是什么意义,直接用group a.customer_id, b.customer_id,
输出字段直接用a.customer_id, b.customer_id就行了3、组合的结果:在一天,对于同样的Commodity_ID,
001 买三次,002买四次,你的count()结果应该是12了吗(先不考虑存在买的情况)4、 一天中对于每 Commodity_ID、buy,一个customer_id 相同的会有多少?
WHERE a.TRADE_date = b.TRADE_date
and a.commodity_id = b.commodity_id
and a.customer_id > b.customer_id
AND a.buy = b.buy
如
SELECT mgr, ename, hiredate, sal,
AVG(sal) OVER (PARTITION BY mgr ORDER BY hiredate
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS c_mavg
FROM emp;
其他的不太明确信息。
---------------------------------------
SELECT a.customer_id, b.customer_id, a.commodity_id, count(a.customer_id)
FROM (SELECT Trade_Date,Customer_id,C4.Commodity_ID,
sign(buy_holding_change) as buy
FROM TRADE_Customer_Commodity_Total C4
WHERE C4.Trade_Date >=TO_DATE('20060701', 'yyyymmdd') AND
Trade_Date<=TO_DATE('20060801', 'yyyymmdd')
AND LENGTH(C4.Customer_id) = 4
AND LENGTH(C4.Commodity_ID) = 2
AND ABS(buy_holding_change) >= 500) a,
(SELECT Trade_Date,Customer_id,C4.Commodity_ID,
sign(buy_holding_change) as buy
FROM TRADE_Customer_Commodity_Total C4
WHERE C4.Trade_Date >=TO_DATE('20060701', 'yyyymmdd') AND
Trade_Date<=TO_DATE('20060801', 'yyyymmdd')
AND LENGTH(C4.Customer_id) = 4
AND LENGTH(C4.Commodity_ID) = 2
AND ABS(buy_holding_change) >= 500) b
WHERE a.TRADE_date = b.TRADE_date and a.commodity_id = b.commodity_id
and a.customer_id > b.customer_id AND a.buy = b.buy
GROUP BY a.commodity_id, a.customer_id, b.customer_id
1、可以考虑把子查询拆开
这个我现在还想不出来拆开怎么查:(
2、既然LENGTH(C4.Customer_id) = 4 都为4, 那么在分组和输出中的:
concat(a.customer_id, b.customer_id)、
SUBSTR((concat(a.customer_id, b.customer_id)), 1, 4),
SUBSTR((concat(a.customer_id, b.customer_id)), 5)
是什么意义,直接用group a.customer_id, b.customer_id,
输出字段直接用a.customer_id, b.customer_id就行了
--------------------
这是一个同事当时写的代码,现在他走了我在维护,现在已经改了不用concat()3、组合的结果:在一天,对于同样的Commodity_ID,
001 买三次,002买四次,你的count()结果应该是12了吗(先不考虑存在买的情况)4、 一天中对于每 Commodity_ID、buy,一个customer_id 相同的会有多少?
---------------------
这个表是个汇总表,一个Customer_ID对于一个Commodity_ID一天只有一条记录
======================
lz想要统计 某段时间内 各品种的什么信息? 是不是可以不这样写楼主能不能把这个语句的目的说明白一点
条件 AND a.buy = b.buy 这个条件的使用是不是有问题
----------------------
【这个语句的目的:】
从成交表中按照会员/客户分买卖统计其交易当天的持仓变动量:
继而,过滤出会员/客户的买/卖持仓变动量的绝对值大于持仓量变化下限的记录,将过滤结果中持仓变动买卖方向相同的客户码两两组合输出,联动次数以联动出现的天数计算。我现在通过sign()函数把变动方向分开,正的为1负的为-1,a.buy = b.buy 还是有用的来保证这两个会员的变动方向是一致的
可以用分析工具检查索引使用情况
SELECT a.customer_id, b.customer_id, a.commodity_id, count(a.customer_id)
FROM TRADE_Customer_Commodity_Total a
inner join TRADE_Customer_Commodity_Total b
ON a.TRADE_date = b.TRADE_date AND a.commodity_id = b.commodity_id
AND a.customer_id > b.customer_id
AND (b.buy_holding_change<=-500
AND a.buy_holding_change<=-500
OR b.buy_holding_change>=500
AND b.buy_holding_change>=500 ) --AND sign(a.buy_holding_change) = sign(b.buy_holding_change) AND b.Trade_Date between TO_DATE('20060701', 'yyyymmdd') AND TO_DATE('20060801', 'yyyymmdd')
AND b.Customer_id like '____'
AND b.Commodity_ID like '__'
AND ABS(b.buy_holding_change) >= 500
-- AND ( b.buy_holding_change<=-500)
-- OR b.buy_holding_change>=500) )
WHERE a.Trade_Date between TO_DATE('20060701', 'yyyymmdd') AND TO_DATE('20060801', 'yyyymmdd')
AND a.Customer_id like '____' --LENGTH(a.Customer_id) = 4 --两种效果可能一样,可以比对一下
AND a.Commodity_ID like '__' --LENGTH(b.Commodity_ID) = 2
--AND (a.buy_holding_change<=-500
--OR a.buy_holding_change>=500 ) --ABS(a.buy_holding_change) >= 500 --可以检查一下是否使用索引了GROUP BY a.commodity_id, a.customer_id, b.customer_id
--尽可能使用索引或者
trunc( b.Trade_Date) = trunc( a.Trade_Date)
===================
这样自己和自己比较的逻辑最好用分组函数实现。
WHERE a.TRADE_date = b.TRADE_date
and a.commodity_id = b.commodity_id
and a.customer_id > b.customer_id
AND a.buy = b.buy
如
SELECT mgr, ename, hiredate, sal,
AVG(sal) OVER (PARTITION BY mgr ORDER BY hiredate
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS c_mavg
FROM emp;这个现在还想不出来用分组函数怎么实现,现在想到的只能自连接,老兄帮想一下,谢谢
-------------------------------------------------------------To:baojianjun(包子)
===================
建议将执行计划(EXPLAIN PLAN)贴出来看看,同时表的信息也贴一下
PL/SQL里取的:
SELECT STATEMENT, GOAL = ALL_ROWS Cost=10 Cardinality=1 Bytes=80
SORT GROUP BY Cost=10 Cardinality=1 Bytes=80
TABLE ACCESS BY INDEX ROWID Object owner=RISK Object name=TRADE_CUSTOMER_COMMODITY_TOTAL Cost=3 Cardinality=1 Bytes=40
NESTED LOOPS Cost=6 Cardinality=1 Bytes=80
TABLE ACCESS BY INDEX ROWID Object owner=RISK Object name=TRADE_CUSTOMER_COMMODITY_TOTAL Cost=3 Cardinality=1 Bytes=40
INDEX RANGE SCAN Object owner=RISK Object name=TRADE_CUSTOMER_COMMODITY_TOTAL Cost=2 Cardinality=27
INDEX RANGE SCAN Object owner=RISK Object name=TRADE_CUSTOMER_COMMODITY_TOTAL
==============
只能提一点建议 BETWEEN and 改为 〉 〈 可以利用到索引`
其他的不太明确信息。><比between and效率要高还是between and查询的时候不会用索引,望兄台指点。
=======================================================================郁闷,csdn一个人对同一主题最多连续回复三次,再发就不让了,不爽。
=======================
谢谢山林兄,照你的方法试了下也没有什么明显提高,最后我把这个表:
(SELECT Trade_Date,Customer_id,C4.Commodity_ID,
sign(buy_holding_change) as buy
FROM TRADE_Customer_Commodity_Total C4
WHERE C4.Trade_Date >=TO_DATE('20060701', 'yyyymmdd') AND
Trade_Date<=TO_DATE('20060801', 'yyyymmdd')
AND LENGTH(C4.Customer_id) = 4
AND LENGTH(C4.Commodity_ID) = 2
AND ABS(buy_holding_change) >= 500)
全读到内存里统计了,速度快了很多,没有办法,想偷懒也不行:)最后谢谢大家的帮忙。