代码如下:
----------------------------
SELECT NVL(T1.ol_Qty, 0),
(CASE WHEN T2.Qty = 0 THEN 0 ELSE 100 * NVL(T1.ol_Qty, 0) / T2.Qty END),
NVL(T1.ol_Num, 0),
(CASE WHEN T2.Num = 0 THEN 0 ELSE 100 * NVL(T1.ol_Num, 0) / T2.Num END)
FROM (SELECT NVL(SUM(Qty), 0) AS ol_Qty,
COUNT(DISTINCT(Customer_ID)) AS ol_Num
FROM TRD_TRADES_HIST
WHERE Trade_Date BETWEEN
To_Date('20060718000000', 'yyyymmddhh24miss') AND
To_Date('20060918235959', 'yyyymmddhh24miss')
AND ol_Flag = 'L'
AND to_date(to_char(Trade_Date,'yyyymmdd'),'yyyymmdd')=Open_Trade_Date) T1,
(SELECT SUM(Qty) AS Qty,
COUNT(DISTINCT(Customer_ID)) AS Num
FROM TRD_TRADES_HIST
WHERE Trade_Date BETWEEN
To_Date('20060718000000', 'yyyymmddhh24miss') AND
To_Date('20060918235959', 'yyyymmddhh24miss')) T2
=================================================================
数据库里的Trade_Date字段带有时分秒,Open_Trade_Date字段没有时分秒,客户端提交的日期没有时分秒,有时分秒的和无时分秒的需要比较,目前知道的去掉时分秒的方法:
1、TRUNC(Trade_Date) 效率太低,尤其数据量特别大的时候。
TRUNC(Trade_Date) BETWEEN To_Date('20060718', 'yyyymmdd') AND To_Date('20060918', 'yyyymmdd'))
2、To_Date(To_Char(Trade_Date)),比方法1好一点,但效率也很慢。数据库里有几千万条数据,用上面的语句要十几分钟,哪们大哥帮小弟优化一下,谢谢了先。
----------------------------
SELECT NVL(T1.ol_Qty, 0),
(CASE WHEN T2.Qty = 0 THEN 0 ELSE 100 * NVL(T1.ol_Qty, 0) / T2.Qty END),
NVL(T1.ol_Num, 0),
(CASE WHEN T2.Num = 0 THEN 0 ELSE 100 * NVL(T1.ol_Num, 0) / T2.Num END)
FROM (SELECT NVL(SUM(Qty), 0) AS ol_Qty,
COUNT(DISTINCT(Customer_ID)) AS ol_Num
FROM TRD_TRADES_HIST
WHERE Trade_Date BETWEEN
To_Date('20060718000000', 'yyyymmddhh24miss') AND
To_Date('20060918235959', 'yyyymmddhh24miss')
AND ol_Flag = 'L'
AND to_date(to_char(Trade_Date,'yyyymmdd'),'yyyymmdd')=Open_Trade_Date) T1,
(SELECT SUM(Qty) AS Qty,
COUNT(DISTINCT(Customer_ID)) AS Num
FROM TRD_TRADES_HIST
WHERE Trade_Date BETWEEN
To_Date('20060718000000', 'yyyymmddhh24miss') AND
To_Date('20060918235959', 'yyyymmddhh24miss')) T2
=================================================================
数据库里的Trade_Date字段带有时分秒,Open_Trade_Date字段没有时分秒,客户端提交的日期没有时分秒,有时分秒的和无时分秒的需要比较,目前知道的去掉时分秒的方法:
1、TRUNC(Trade_Date) 效率太低,尤其数据量特别大的时候。
TRUNC(Trade_Date) BETWEEN To_Date('20060718', 'yyyymmdd') AND To_Date('20060918', 'yyyymmdd'))
2、To_Date(To_Char(Trade_Date)),比方法1好一点,但效率也很慢。数据库里有几千万条数据,用上面的语句要十几分钟,哪们大哥帮小弟优化一下,谢谢了先。
2.这么大的数据量可考虑建立分区表,分区表的优点是,在磁盘提供并行读取的情况下可提高性能
2 FROM DUAL
3 WHERE TO_DATE ('2006-10-16 12:00:00', 'yyyy-mm-dd hh24:mi:ss') >
4 TO_DATE ('2006-10-16', 'yyyy-mm-dd')
5 /D
-
XSQL> SELECT *
2 FROM DUAL
3 WHERE TO_DATE ('2006-10-16 12:00:00', 'yyyy-mm-dd hh24:mi:ss') <
4 TO_DATE ('2006-10-16', 'yyyy-mm-dd')
5 /no rows selectedSQL>
to_date(to_char(Trade_Date,'yyyymmdd'),'yyyymmdd')=Open_Trade_Date
为:
Trade_Date >= Open_Trade_Date and Trade_Date < Open_Trade_Date + 1
直接通过日期进行比较,不再进行格式转换;
在没有分区,没有索引,100万数据下,时间能够比使用To_Date(To_Char(...))和trunc(...)提高将近一倍
1、去掉了trunc()直接比较日期;
2、将where里的两句合并成一句:Trade_Date >= Open_Trade_Date and Trade_Date < Open_Trade_Date + 1 [夜里不产生记录所以用了between and]。
现在已经由原来的十几分提高到现在的160s左右。
============================Code============================
SELECT NVL(T1.ol_Qty, 0),
(CASE WHEN T2.Qty = 0 THEN 0 ELSE 100 * NVL(T1.ol_Qty, 0) / T2.Qty END),
NVL(T1.ol_Num, 0),
(CASE WHEN T2.Num = 0 THEN 0 ELSE 100 * NVL(T1.ol_Num, 0) / T2.Num END)
FROM (SELECT 100 AS ID,
NVL(SUM(Qty), 0) AS ol_Qty,
COUNT(DISTINCT(Customer_ID)) AS ol_Num
FROM TRD_TRADES_HIST
WHERE Trade_Date BETWEEN Open_Trade_Date AND Open_Trade_Date+1
AND ol_Flag = 'L') T1,
(SELECT 100 AS ID,
NVL(SUM(Qty), 0) AS Qty,
COUNT(DISTINCT(Customer_ID)) AS Num
FROM TRD_TRADES_HIST
WHERE Trade_Date BETWEEN To_Date('20060718','yyyymmdd') AND
To_Date('20060918','yyyymmdd')+1) T2
============================================================需要说明的是TRD_TRADES_HIST表是原系统的库里的表,原系统每天产生的记录追加到这里,我们需要从这个表里查询数据但不能改变它,所以不适合建立分区表。正常索引会在后期测试的时候建立。【现在的问题:】
客户要求在1分钟内查询出结果,所以还需要尽最大可能的优化,大
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
家帮看一下还能不能再优化一些。
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-----------------------------------------
另外问一下,between 和><哪个效率更高一点?
-----------------------------------------TO:xiaoxiao1984(笨猫儿)
不好意思,我只测了一下To_Date(To_Char(...)),测trunc(...)的时候很慢没等到出结果关了,先入为主的以为To_Date(To_Char(...))效率高一点,见谅:)
不同的条件写到select后,使用CASE WHEN 2、按照你目前的写法,考虑一下追加2个表之间的关联和查到表的索引子段追加个无关紧要的条件。
2、可以考虑用物化视图代替两个子表T1、T2
1. 不能考虑和原系统的维护人员商量修改成分区表么,或者和你们的头儿反映一下情况,权衡一下利弊;
这个会考虑的,谢谢。
================================
licsth() ( ) 信誉:99 Blog
1、2个子表的条件类似,可以试试不用子表的写法,把共同的条件作为条件;
不同的条件写到select后,使用CASE WHEN
----------------------------
SELECT SUM(CASE WHEN Trade_Date BETWEEN Open_Trade_Date AND Open_Trade_Date+1 AND ol_Flag = 'L'
THEN Qty END),
(CASE WHEN SUM(Qty)=0 THEN 0 ELSE 100*SUM(CASE WHEN Trade_Date BETWEEN Open_Trade_Date AND
Open_Trade_Date+1 AND ol_Flag = 'L' THEN Qty END)/SUM(Qty) END),
COUNT(DISTINCT(CASE WHEN Trade_Date BETWEEN Open_Trade_Date AND Open_Trade_Date+1 AND ol_Flag = 'L'
THEN Customer_ID END)),
(CASE WHEN COUNT(DISTINCT(Customer_ID))=0 THEN 0 ELSE 100*COUNT(DISTINCT(CASE WHEN Trade_Date
BETWEEN Open_Trade_Date AND Open_Trade_Date+1 AND ol_Flag = 'L'
THEN Customer_ID END))/COUNT(DISTINCT(Customer_ID)) END)
FROM TRD_TRADES_HIST
WHERE Trade_Date BETWEEN To_Date('20060718','yyyymmdd') AND To_Date('20060918','yyyymmdd')+1
----------------------------
这个同一个sum()和count()好像要进行两次,实际测试好像要比用子表查询要慢。
2、按照你目前的写法,考虑一下追加2个表之间的关联和查到表的索引子段追加个无关紧要的条件。
这个不太明白,能不能说的再详细点。
===================================
mu_gua(木瓜) ( ) 信誉:100 Blog
1、如果Trade_Date字段的时分秒没有用,直接改成日期型不保存时分秒不是更简单吗?
这是原系统记录的交易时间,时分秒在原系统是必要的。
2、可以考虑用物化视图代替两个子表T1、T2
物化视图?就是普通视图代替T1、T2吗?但是我的T1、T2在不同的条件下会有很多种情况。
=============================
方法一:
SELECT T2.Commodity_ID,
NVL(T1.ol_Qty,0),
(CASE WHEN T2.Qty = 0 THEN 0 ELSE 100 * NVL(T1.ol_Qty,0) / T2.Qty END),
NVL(T1.ol_Num,0),
(CASE WHEN T2.Num = 0 THEN 0 ELSE 100 * NVL(T1.ol_Num,0) / T2.Num END)
FROM (SELECT Commodity_ID,
NVL(SUM(Qty),0) AS ol_Qty,
COUNT(DISTINCT(Customer_ID)) AS ol_Num
FROM TRD_TRADES_HIST
WHERE Trade_Date BETWEEN To_Date('20060718', 'yyyymmdd') AND
To_Date('20060918', 'yyyymmdd') + 1
AND ol_Flag = 'L'
AND Trade_Date BETWEEN Open_Trade_Date AND Open_Trade_Date + 1
GROUP BY Commodity_ID) T1,
(SELECT Commodity_ID,
SUM(Qty) AS Qty,
COUNT(DISTINCT(Customer_ID)) AS Num
FROM TRD_TRADES_HIST
WHERE Trade_Date BETWEEN To_Date('20060718', 'yyyymmdd') AND
To_Date('20060918', 'yyyymmdd') + 1
GROUP BY Commodity_ID) T2
WHERE T2.Commodity_ID = T1.Commodity_ID(+)
方法二:
SELECT Commodity_ID,
NVL(SUM(CASE WHEN Trade_Date BETWEEN Open_Trade_Date AND Open_Trade_Date+1 AND ol_Flag = 'L' THEN Qty END),0),
(CASE WHEN SUM(Qty)=0 THEN 0 ELSE
100*NVL(SUM(CASE WHEN Trade_Date BETWEEN Open_Trade_Date AND Open_Trade_Date+1 AND ol_Flag = 'L' THEN Qty END),0)/SUM(Qty) END),
COUNT(DISTINCT(CASE WHEN Trade_Date BETWEEN Open_Trade_Date AND Open_Trade_Date+1 AND ol_Flag = 'L' THEN Customer_ID END)),
(CASE WHEN COUNT(DISTINCT(Customer_ID))=0 THEN 0 ELSE
100*COUNT(DISTINCT(CASE WHEN Trade_Date BETWEEN Open_Trade_Date AND Open_Trade_Date+1 AND ol_Flag = 'L'
THEN Customer_ID END))/COUNT(DISTINCT(Customer_ID)) END)
FROM TRD_TRADES_HIST
WHERE Trade_Date BETWEEN To_Date('20060718', 'yyyymmdd') AND
To_Date('20060918', 'yyyymmdd') + 1
GROUP BY Commodity_ID现在的语句如上,主要是把Trunc(Date)换成了日期直接比较,方法一和二的效率相当,查询600万数据时间只差一点。除了建分区表(这个正在考虑)还有没有别的地方可优化的?请老鸟不吝赐教================================
还有一点要问:
NVL(SUM(CASE WHEN Trade_Date BETWEEN Open_Trade_Date AND Open_Trade_Date+1 AND ol_Flag = 'L' THEN Qty END),0),
(CASE WHEN SUM(Qty)=0 THEN 0 ELSE
100*NVL(SUM(CASE WHEN Trade_Date BETWEEN Open_Trade_Date AND Open_Trade_Date+1 AND ol_Flag = 'L' THEN Qty END),0)/SUM(Qty) END),
把上面这句写在select里面如上面方法二,SUM(Qty)和SUM(CASE WHEN...END)实际上会执行几遍?方法一和方法二从理论上分析哪个效率应该高一点?