Embarrassing to admit: that stored procedure wasn't written by me. It was written by someone who has since resigned, and it's very complex. I don't really understand it either; I just know how to use it.
Next, design the table structure with query speed in mind. If the table is written to as well as read, and the accuracy requirements aren't extremely high, you can also allow dirty reads and the like (NOLOCK).
Then make some adjustments in the application to cut down database round trips as much as possible,
and so on.
These are all good approaches.
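For the dirty-read suggestion above, a minimal sketch; the table and column names here are made up for illustration:

```sql
-- Read without taking shared locks: the query may see uncommitted ("dirty")
-- rows, so only do this where slightly inaccurate results are acceptable.
SELECT OrderID, Qty
FROM dbo.Orders WITH (NOLOCK)
WHERE OrderDate >= '2008-10-01'

-- Equivalent session-level form:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECT OrderID, Qty
FROM dbo.Orders
WHERE OrderDate >= '2008-10-01'
```

The hint removes lock waits on a heavily written table, at the cost of possibly reading rows that are later rolled back.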
Nine million rows isn't that many. With well-built indexes it's entirely manageable; you may not even need a partitioned table.
It comes down to your skill at building indexes. It's not a matter of creating one and being done; query speeds can differ enormously.
First is the choice of clustered index, which is critical! If you make an auto-increment identity id column the primary key, it becomes the clustered index by default, which is quite wasteful and won't solve your problem. I'd suggest finding some good articles on index optimization and studying them carefully.
I ran tests before: with 4 million+ rows and the right indexes, I could find the data I needed in 0.2 seconds (with conditions that could use the indexes, of course).
Then optimize the SQL statements. Those are the two main things. As for the partitioned tables a poster above mentioned, I'm not familiar with them either; I need to brush up myself.
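To illustrate the clustered-index point above, a sketch with hypothetical table and index names: keep the identity primary key, but make it nonclustered, and put the clustered index on the column your range queries scan:

```sql
-- A surrogate identity key does not have to be the clustered index.
CREATE TABLE dbo.Barcode_Demo (
    ID               int IDENTITY(1,1) NOT NULL,
    Scanned_DateTime datetime NOT NULL,
    Code             char(10) NOT NULL,
    CONSTRAINT PK_Barcode_Demo PRIMARY KEY NONCLUSTERED (ID)
)

-- Cluster on the range-scan column so BETWEEN queries on dates
-- read physically contiguous pages.
CREATE CLUSTERED INDEX IX_Barcode_Demo_Date
    ON dbo.Barcode_Demo (Scanned_DateTime)

-- A supporting nonclustered index for equality filters on Code.
CREATE NONCLUSTERED INDEX IX_Barcode_Demo_Code
    ON dbo.Barcode_Demo (Code)
```

Which column deserves the clustered index depends on the dominant query pattern; this is only one plausible choice for date-range reporting.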
Here it is in full. I doubt anyone will spend much effort reading it, and even if you do you may not understand it.
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO
--usp_EnqDailyEfficency '!','Z','!','Z','!','Z','2008/10/12','2008/10/12'
ALTER PROC [dbo].[usp_EnqDailyEfficency]
(@strFrStaffID char(10),@strToStaffID char(10),@strFrDept char(10),@strToDetp char(10),@strFrPost char(3),@strToPost char(3),@strFrDate char(10),@strToDate char(10))
AS
SELECT CODE,NAME,LONGNAME,ISNULL(POST,'') AS POST ,ISNULL(POSTCODE,'') AS POSTCODE ,State,LeaveSeason,PyDate
INTO #TSTAFF
FROM TXCARD.dbo.uvw_StaffMaster
WHERE CODE BETWEEN @strFrStaffID AND @strToStaffID
AND DEPT between @strFrDept and @strToDetp
AND ISNULL(POSTCODE,'!') BETWEEN @strFrPost AND @strToPost
SELECT b.LongName as Dept,b.POST,a.Code,b.Name,a.Style_No,a.HK_MO_CODE,a.Step_No,a.Step_Description,a.Scanned_DateTime,SUM(a.No_of_PPB) AS PrdQty,MAX(b.State) AS STATE,MAX(b.LeaveSeason) AS LEAVESEASON,MAX(PyDate) as PyDate
INTO #TB
FROM Barcode a (NOLOCK),#TSTAFF b (NOLOCK)
WHERE a.Scanned=1
AND a.Code<>'VOID'
AND a.Code=b.Code
AND a.MO_Code<>'111111'
AND a.Scanned_DateTime between @strFrDate and @strToDate
GROUP BY b.LongName,b.Post,a.Code,b.Name,a.Style_No,a.HK_MO_CODE,a.Step_No,a.Step_Description,a.Scanned_DateTime
SELECT A.*,B.Comment,B.HK_Style_No,C.Step_Price,C.StepPrice2,round(A.PrdQty*0.99,0) AS SalaryPrd,
round(A.PrdQty*0.99,0)*ISNULL(C.Step_Price,0) AS Amt,round(A.PrdQty*0.99,0)*ISNULL(C.StepPrice2,0) AS Amt2,ISNULL(StandardRate,0) AS StandardRate
INTO #TT
FROM #TB A,dbo.uvw_Styles B,dbo.Style_Step C
WHERE A.Style_No=B.Style_No
AND A.Style_No=C.Style_No
AND A.Step_NO=C.Step_No
SELECT a.Code,a.Scanned_DateTime,sum(a.No_of_PPB) as TimeWork
INTO #TIMEWORK
FROM Barcode a (NOLOCK),#TSTAFF b (NOLOCK)
WHERE a.Scanned=1
AND a.Code<>'VOID'
AND a.Code=b.Code
AND a.Scanned_DateTime between @strFrDate and @strToDate
AND a.MO_Code='111111'
GROUP BY a.Code,a.Scanned_DateTime
select b.code,a.[date] as WorkDate,sum(isnull(a.workDayHour,0)) as WorkTime -- was a.workDayHourRear; 080928: changed from scheduled shift hours to actual working hours
into #ATT
from ATT.dbo.M_DefultCheckinout a (nolock) , ATT.dbo.ShowPTPperson b (nolock),ATT.dbo.USERINFO c (nolock)
where a.Userid=c.userid and [date] between @strFrDate and @strToDate and b.Kqid=c.BADGENUMBER COLLATE Chinese_PRC_CI_AS
group by b.code,a.[date]
select b.code,a.[date] as WorkDate,sum(isnull(a.workDayHourRear,0)) as WorkTime into #ATT2
from ATT2.dbo.M_DefultCheckinout a (nolock) , ATT2.dbo.ShowPTPperson b (nolock),ATT2.dbo.USERINFO c (nolock)
where a.Userid=c.userid and [date] between @strFrDate and @strToDate and b.Kqid=c.BADGENUMBER COLLATE Chinese_PRC_CI_AS
group by b.code,a.[date]
UPDATE #TT SET Amt2=0 WHERE Amt2 IS NULL
SELECT A.*,isnull(B.WorkTime,0) as WorkTime
INTO #STAFFWORK
FROM #TT A, #ATT B
WHERE A.Code*=B.Code and A.Scanned_DateTime*=B.WorkDate
UPDATE A SET A.WorkTime=isnull(B.WorkTime,0)
FROM #STAFFWORK A,#ATT2 B
WHERE A.Code=B.Code and A.Scanned_DateTime=B.WorkDate
SELECT A.Dept,A.POST,A.Code,A.Name,Convert(char(2),datepart(month,A.Scanned_Datetime))+'_'+Convert(char(2),datepart(day,A.scanned_Datetime)) as scanned_DateTime,A.PyDate,
CASE WHEN STATE=9 THEN '离职' else '在职' END AS Status, -- '离职' = resigned, '在职' = employed
sum(A.PrdQty) as PrdQty,sum(A.Amt) as Amt,max(isnull(B.TimeWork,0)) as TimeWork
INTO #ENQSTAFFSALARY
FROM #STAFFWORK A,#TIMEWORK B
WHERE A.Code*=B.Code and A.Scanned_DateTime*=B.Scanned_DateTime
GROUP BY A.Dept,A.POST,A.Code,A.Name,A.Scanned_DateTime,A.PyDate,STATE
SELECT DISTINCT Scanned_Datetime INTO #workdate FROM #ENQSTAFFSALARY
DECLARE @SQL1 NVARCHAR(4000)
SET @SQL1='SELECT Dept,post,code,name,pydate,status,'
SELECT @SQL1=@SQL1+'sum(case scanned_DateTime when '''+Scanned_Datetime+''' then Amt else 0 end) AS '''+Scanned_Datetime+''' , ' from (select * from #workdate) as a
SELECT @SQL1=LEFT(@SQL1,LEN(@SQL1)-1)+' FROM #ENQSTAFFSALARY GROUP BY Dept,post,code,name,pydate,status '
PRINT @SQL1
EXEC (@SQL1)
GO
SET QUOTED_IDENTIFIER OFF
GO
SET ANSI_NULLS ON
GO
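One thing worth noting about the procedure above: the `*=` operator is the old-style outer join, which SQL Server 2005 and later reject unless the database runs at compatibility level 80. One of those joins rewritten in ANSI syntax might look like this (a sketch, not tested against the real schema):

```sql
-- Modern equivalent of:
--   FROM #TT A, #ATT B
--   WHERE A.Code*=B.Code and A.Scanned_DateTime*=B.WorkDate
SELECT A.*, ISNULL(B.WorkTime, 0) AS WorkTime
INTO #STAFFWORK
FROM #TT A
LEFT JOIN #ATT B
    ON  A.Code = B.Code
    AND A.Scanned_DateTime = B.WorkDate
```

The LEFT JOIN keeps every row of #TT and fills WorkTime with 0 where no attendance row matches, which is what the `*=` version was doing.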
Partitioned-table references:
http://blog.csdn.net/jinjazz/archive/2008/06/26/2589635.aspx
http://blog.csdn.net/jinjazz/archive/2008/06/24/2582235.aspx
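Since partitioned tables keep coming up: on SQL Server 2005 and later, partitioning a big table by date works roughly like this. The boundary dates, names, and columns below are illustrative only:

```sql
-- 1. Partition function: assign rows to ranges by scan date.
CREATE PARTITION FUNCTION pfByMonth (datetime)
AS RANGE RIGHT FOR VALUES ('2008-09-01', '2008-10-01', '2008-11-01')

-- 2. Partition scheme: map each partition to a filegroup
--    (all to PRIMARY here; production would spread them out).
CREATE PARTITION SCHEME psByMonth
AS PARTITION pfByMonth ALL TO ([PRIMARY])

-- 3. Create the table on the scheme. Queries that filter on
--    Scanned_DateTime then touch only the relevant partitions
--    (partition elimination).
CREATE TABLE dbo.Barcode_Part (
    Code             char(10) NOT NULL,
    Scanned_DateTime datetime NOT NULL
) ON psByMonth (Scanned_DateTime)
```
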
If queries are frequent, I'd suggest using Lucene or a similar engine as a helper. I've used Lucene for forum search (around 3 million posts); queries usually return within 700 ms.
A case study:
The database has one main data table; call it MasterTable. Its structure is shown below: over 1 billion rows and around 30 columns. Most queries in the system involve it, and because of the huge data volume they are very slow.
Solution:
Generate temporary tables such as TemporaryTable1, TemporaryTable2, etc. (structures below). Each contains only the few MasterTable columns a given query needs. Every 2 hours (a business requirement) these tables are truncated and repopulated. This greatly improves query speed.

MasterTable
-----------------------------
| Col1 | Col2 | Col3 | …… |
-----------------------------
| Data | Data | Data | …… |
---------------
TemporaryTable1
---------------
| Col1 | Col2 |
---------------
| Data | Data |
---------------
TemporaryTable2
---------------
| Col1 | Col3 |
---------------
| Data | Data |
---------------
Of course this approach has a limitation: the data you query is not the very latest; strictly speaking it can be up to 2 hours old. But if that satisfies the business need, it's acceptable.
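The refresh described above could be scheduled as a job along these lines (table and column names are taken from the example; the real column lists depend on what the queries need):

```sql
-- Runs every 2 hours: discard the narrow copies and rebuild them.
TRUNCATE TABLE dbo.TemporaryTable1        -- deallocates pages; minimally logged
INSERT INTO dbo.TemporaryTable1 (Col1, Col2)
SELECT Col1, Col2 FROM dbo.MasterTable    -- only the columns this query path uses

TRUNCATE TABLE dbo.TemporaryTable2
INSERT INTO dbo.TemporaryTable2 (Col1, Col3)
SELECT Col1, Col3 FROM dbo.MasterTable
```

TRUNCATE is preferable to DELETE here because it resets the table almost instantly instead of logging every row.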
I'm just here to bow to the experts... orz
I've never worked with partitioned tables; I've only heard of them.
Where can I find detailed material on partitioned tables? I've read your post, but there are parts I don't understand.
Also, someone just mentioned Lucene. I've used it before (maybe a version issue?): it ran fine on a server with 2 CPUs, but on an 8-CPU server it threw memory errors.
As for open source, it's best not to use it as-is. When something breaks, or it can't meet one of your requirements, you're in trouble: you have to seriously understand all of its code and then build a lot of awkward workarounds on top of it. Not worth it.
Is this stored procedure used for website queries? Have you tested it under concurrency? Does it throw errors under concurrent load?