数据库插入效率问题

用什么方法可以提高数据库的插入效率？(sqlserver2008)
目前的状况是需要先查询后插入（有重复的记录则数值+1）我使用的是存储过程（3w多条记录花了十几分钟）有什么方法可以提高的吗？

解决方案 »

免费领取超大流量手机卡，每月29元包185G流量+100分钟通话, 中国电信官方发货

这是存储过程
create proc CheckInsert --定义全局临时过程
@fWord varchar(100),@Word varchar(100),@lWord varchar(100)
AS
    declare @tmpID int
    declare @tmpCount int
    set @tmpID = 0
    set @tmpCount = 0
    --查询重复项
    select @tmpID = FormatCheckEn.ID,@tmpCount = FormatCheckEn.Count from FormatCheckEn
    where FormatCheckEn.FirstWord = @fWord and
  FormatCheckEn.Word = @Word and
  FormatCheckEn.LastWord = @lWord
if @tmpID > 0 --如果存在就更新
begin
update FormatCheckEn set FormatCheckEn.count = @tmpCount+1 where FormatCheckEn.ID = @tmpID
end
else --如果不存在则插入
begin
insert into FormatCheckEn(FirstWord,Word,LastWord,count) values(@fWord,@Word,@lWord,1)
end
go
1、   where FormatCheckEn.FirstWord = @fWord and
          FormatCheckEn.Word = @Word and
          FormatCheckEn.LastWord = @lWord
      select这句是否有相关索引
2、update 那句是否有索引
3、磁盘是什么情况，是否有raid？raid5还是raid10？
4、ldf文件是否独立存放？
5、建议条件允许时，checkpoint改成10分钟
语句中不存在索引，因为我怕插入数据的时候索引维护会花费大量的时间，毕竟3w条数据只是测试使用
idf文件存放在某个盘的文件夹下（其中只有这一个数据库的文件）
磁盘情况到没有考虑太清楚
修改checkpoint点会带来多大的好处？开销又是什么呢？
请使用Merge语句
先将需要批次插入的表放入一张缓存的表中(假设名字为FormatCheckEn_tmp , 结构与FormatCheckEn相同)create proc CheckInsert --定义全局临时过程
AS
    --declare @tmpID int
    --declare @tmpCount int
    --set @tmpID = 0
    --set @tmpCount = 0
    ----查询重复项
    --select @tmpID = FormatCheckEn.ID,@tmpCount = FormatCheckEn.Count from FormatCheckEn
    --where FormatCheckEn.FirstWord = @fWord and
    --      FormatCheckEn.Word = @Word and
    --      FormatCheckEn.LastWord = @lWord
    --if @tmpID > 0 --如果存在就更新
    --    begin
    --        update FormatCheckEn set FormatCheckEn.count = @tmpCount+1 where FormatCheckEn.ID = @tmpID
    --    end
    --else     --如果不存在则插入
    --    begin
    --        insert into FormatCheckEn(FirstWord,Word,LastWord,count) values(@fWord,@Word,@lWord,1)
    --    end
    SELECT FirstWord,Word,LastWord,COUNT(1) AS count FROM FormatCheckEn_tmp INTO #t1 GROUP BY FirstWord,Word,LastWord

    UPDATE FormatCheckEn SET FormatCheckEn.COUNT = FormatCheckEn.COUNT + #t1.count FROM  FormatCheckEn  INNER JOIN #t1
    ON FormatCheckEn.FirstWord = #t1.FirstWord and FormatCheckEn.Word = #t1.Word and FormatCheckEn.LastWord = #t1.LastWord

    INSERT INTO FormatCheckEn(FirstWord,Word,LastWord,count)
    SELECT * FROM #t1 WHERE NOT EXISTS
    (SELECT Word FROM FormatCheckEn WHERE FormatCheckEn.FirstWord = #t1.FirstWord and FormatCheckEn.Word = #t1.Word and FormatCheckEn.LastWord = #t1.LastWord)

go
create proc CheckInsert
这存储过程并发操作很容易死锁吧？
这个过程使用的是逐条处理，建议使用批量处理的方式。 5# 提到的 merge ，是不错的选择。
语句中不存在索引，因为我怕插入数据的时候索引维护会花费大量的时间，毕竟3w条数据只是测试使用
idf文件存放在某个盘的文件夹下（其中只有这一个数据库的文件）
磁盘情况到没有考虑太清楚
修改checkpoint点会带来多大的好处？开销又是什么呢？你现在只说是存储过程执行慢，这要分析proc中每一步是否有可以调优的地方；
如果没有索引，那你查询就是全表扫描，这个性能最好的解决方法就是添加索引；
此外，关于LDF单独存放和修改checkpoint，都是充分利用磁盘IO的方法；
默认的checkpoint大约1分钟左右进行一次，在你insert的时候，频繁的checkpoint会导致IO繁忙；
建议使用Distinct 先查询，然后批量插入和更新：Insert Into FormatCheckEn(FirstWord,Word,LastWord,count)
Select Distinct a,b,c,1
From TUpdate FormatCheckEn Set Count=(Select Count(1) From T Where FormatCheckEn.FirstWord =T.xx And
          FormatCheckEn.Word =T.xx And
          FormatCheckEn.LastWord = T.xx