一百万以上的文本数据如何快速的导入到SQL Server?

解决方案 »

  1.   

    一百万条以上的文本数据如何快速的导入到SQL Server?
      

  2.   

                                                                                                                                                                                                                                           ( 02070-LOND6010 )
                                                                                                                    零售贷款风险分类帐户清单日报 
                                                                                                                  ================================
      机构 :  中国银行北京中关村南大街支行                                                                         日期 : 2011/09/25                                                      货币 : CNY                                               页码 :     1  客户号           客户名                         产品码     贷款账号            放款日期       期限 ( 月 )  逾期天数       贷款合同金额           贷款余额         未到期本金       拖欠本金余额           拖欠利息           罚息合计   系统分类结果  手工分类结果 
      0000000109521149  乔峰                        10360007   00000318156588696   2011/01/24     00240        00000            860,000.00         849,996.85         849,996.85               0.00               0.00               0.00    正常          正常 
      0000000109199694  雪儿                        10360007   00000325956590374   2011/01/25     00204        00000            560,000.00         551,295.14         551,295.14               0.00               0.00               0.00    正常          正常 
      0000000109519284  林冲                          10360007   00000327256718740   2011/02/18     00120        00000          1,150,000.00       1,111,077.88       1,111,077.88               0.00               0.00               0.00    正常          正常 
      0000000095571780  宋江                        10360007   00000328556899146   2011/03/23     00360        00000            770,000.00         766,674.69         766,674.69               0.00               0.00               0.00    正常          正常 
      0000000142824394  于铁木                          10360007   00000331157262971   2011/06/10     00264        00000            600,000.00         597,595.48         597,595.48               0.00               0.00               0.00    正常          正常 
      0000000100044704  田田                          10360007   00000335056748400   2011/02/23     00360        00000          1,150,000.00       1,143,777.55       1,143,777.55               0.00               0.00               0.00    正常          正常 
      

  3.   

    一百万的数据还是得一条一条来。不要问怎么能快速。要不然Google买那么多服务器干什么?Intel不断开发更快的CPU干什么?
      

  4.   

    是什么格式的啊?以前到过excel
      

  5.   

    下面所有的数据都是这样的格式?
    类似.
    0000000109521149  乔峰                        10360007   00000318156588696   2011/01/24     00240        00000            860,000.00         849,996.85         849,996.85               0.00               0.00               0.00    正常          正常 如果是, 可以直接导入到sqlserver里面sql里面新建一个表. 点右键-导入数据--数据源选择(平面数据). 之后跳过前几行导入.你可以自己复制少量数据试试.
      

  6.   

    直接利用SQL SERVER的导入导出工具就可以
      

  7.   

     零售贷款风险分类帐户清单日报 
                                                                                                                  ================================
      机构 :  中国银行北京中关村南大街支行                                                                         日期 : 2011/09/25                                                      货币 : CNY                                               页码 :     1  客户号           客户名                         产品码     贷款账号            放款日期       期限 ( 月 )  逾期天数       贷款合同金额           贷款余额         未到期本金       拖欠本金余额           拖欠利息           罚息合计   系统分类结果  手工分类结果 
      0000000109521149  乔峰                        10360007   00000318156588696   2011/01/24     00240        00000            860,000.00         849,996.85         849,996.85               0.00               0.00               0.00    正常          正常 
      0000000109199694  雪儿                        10360007   00000325956590374   2011/01/25     00204        00000            560,000.00         551,295.14         551,295.14               0.00               0.00               0.00    正常          正常 
      0000000109519284  林冲                          10360007   00000327256718740   2011/02/18     00120        00000          1,150,000.00       1,111,077.88       1,111,077.88               0.00               0.00               0.00    正常          正常 
      0000000095571780  宋江                        10360007   00000328556899146   2011/03/23     00360        00000            770,000.00         766,674.69         766,674.69               0.00               0.00               0.00    正常          正常 
      0000000142824394  于铁木                          10360007   00000331157262971   2011/06/10     00264        00000            600,000.00         597,595.48         597,595.48               0.00               0.00               0.00    正常          正常 
      0000000100044704  田田                          10360007   00000335056748400   2011/02/23     00360        00000          1,150,000.00       1,143,777.55       1,143,777.55               0.00               0.00               0.00    正常          正常 
     零售贷款风险分类帐户清单日报 
                                                                                                                  ================================
      机构 :  中国银行北京中关村南大街支行                                                                         日期 : 2011/09/25                                                      货币 : CNY                                               页码 :     1  客户号           客户名                         产品码     贷款账号            放款日期       期限 ( 月 )  逾期天数       贷款合同金额           贷款余额         未到期本金       拖欠本金余额           拖欠利息           罚息合计   系统分类结果  手工分类结果 
      0000000109521149  乔峰                        10360007   00000318156588696   2011/01/24     00240        00000            860,000.00         849,996.85         849,996.85               0.00               0.00               0.00    正常          正常 
      0000000109199694  雪儿                        10360007   00000325956590374   2011/01/25     00204        00000            560,000.00         551,295.14         551,295.14               0.00               0.00               0.00    正常          正常 
      0000000109519284  林冲                          10360007   00000327256718740   2011/02/18     00120        00000          1,150,000.00       1,111,077.88       1,111,077.88               0.00               0.00               0.00    正常          正常 
      0000000095571780  宋江                        10360007   00000328556899146   2011/03/23     00360        00000            770,000.00         766,674.69         766,674.69               0.00               0.00               0.00    正常          正常 
      0000000142824394  于铁木                          10360007   00000331157262971   2011/06/10     00264        00000            600,000.00         597,595.48         597,595.48               0.00               0.00               0.00    正常          正常 
      0000000100044704  田田                          10360007   00000335056748400   2011/02/23     00360        00000          1,150,000.00       1,143,777.55       1,143,777.55               0.00               0.00               0.00    正常          正常 
      

  8.   

    用SqlServer Management Studio的Import Data功能
    选择flat File Source类型,分割选择tab
    后面自己研究下吧
      

  9.   

    bulk insert  网上说是大数据量导入,我没有试过,楼主可以到网上了解一下
      

  10.   

    你有这个格式的话,可以用sql直接导入的,何必要借助程序呢
      

  11.   

    方法一,使用dataset,
    方法二,搞清sql数据组织方式,直接把文件转换成mdf文件 不知行不行
      

  12.   

    分批次多线程+SqlBulkCopy+事务
      

  13.   

    System.Data.SqlClient.SqlBulkCopy
    至于怎么用 你自己去搜吧 我用的时候400万的数据2分钟全部导入 数据是以txt格式保存的
      

  14.   

    考虑提供几种格式给用户
    SqlServer可以用DTS进行导入
      

  15.   

    信了你的邪,才100W就不知道怎么导了,哥每天都是导千万条的数据.直接写个文本分析的,组成sql往里面堆不就可以了。最多2分钟时间
      

  16.   

      Dim filePath As String = Parameter(FILE_PATH)
                    Dim sheetTabelName As String = Parameter(GETSHEETNAME)
                    Dim strReadExcel As String = "select * from " + sheetTabelName
                    Dim excelConnStr As String = Nothing
                    excelConnStr = OleDbHelper.CreateExcelConnectionString(filePath, OleDbHelper.AccessMode.Export, OleDbHelper.FirstRowIsTitle.Yes)                ' Get data from excel
                    Dim ds As DataSet = OleDbHelper.ExecuteDataset(excelConnStr, CommandType.Text, strReadExcel)
                    ' import data to db
                    Me.ImportDataToTempTable(ds)  Public Function ImportDataToTempTable(ByVal ds As DataSet) As TaskResultType
                Dim UploadFacilityID As Integer = 0
                Dim UserID As Integer = 1
                Dim Facility As Integer = 2
                Dim Curr As Integer = 3
                Dim TotalAmount As Integer = 4
                Dim MaturityDate As Integer = 5
                Dim IssueDate As Integer = 6
                Dim Curr_Alt As Integer = 7
                Dim TotalAmount_Alt As Integer = 8
                Dim Comp_ID As Integer = 9
                Dim Role As Integer = 10
                Dim ParticipationAmt As Integer = 11
                Dim IntStatus As Integer = 12
                Dim ContNo As Integer = 13            Dim strInsertData As String = "Insert Into intUploadFacility(UploadFacilityID,UserID,Facility,Curr,TotalAmount,MaturityDate,IssueDate,Curr_Alt,TotalAmount_Alt,Comp_ID,Role,ParticipationAmt,IntStatus,ContNo) " & _
                    "values(" & _
                    "@UploadFacilityID,@UserID,@Facility,@Curr,@TotalAmount,@MaturityDate,@IssueDate,@Curr_Alt," & _
                    "@TotalAmount_Alt,@Comp_ID,@Role,@ParticipationAmt,@IntStatus,@ContNo)"            Dim intRecordID As Integer = 0            Dim tr As SqlTransaction
                tr = Conn.BeginTransaction("IMPORT_DATA")            Try
                    For i As Integer = 0 To ds.Tables(0).Rows.Count - 1                    Dim row As DataRow = ds.Tables(0).Rows(i)                    Dim arrParam() As SqlParameter = _
                            { _
                                New SqlParameter("@UploadFacilityID", SqlDbType.Int), _
                                New SqlParameter("@UserID", SqlDbType.VarChar), _
                                New SqlParameter("@Facility", SqlDbType.NVarChar), _
                                New SqlParameter("@Curr", SqlDbType.VarChar), _
                                New SqlParameter("@TotalAmount", SqlDbType.Decimal), _
                                New SqlParameter("@MaturityDate", SqlDbType.Date), _
                                New SqlParameter("@IssueDate", SqlDbType.Date), _
                                New SqlParameter("@Curr_Alt", SqlDbType.VarChar), _
                                New SqlParameter("@TotalAmount_Alt", SqlDbType.Decimal), _
                                New SqlParameter("@Comp_ID", SqlDbType.Int), _
                                New SqlParameter("@Role", SqlDbType.NVarChar), _
                                New SqlParameter("@ParticipationAmt", SqlDbType.Decimal), _
                                New SqlParameter("@IntStatus", SqlDbType.VarChar), _
                                New SqlParameter("@ContNo", SqlDbType.NVarChar)}
                        intRecordID = intRecordID + 1
                        arrParam(UploadFacilityID).Value = intRecordID
                        ' todo: user id 
                        Dim useId As String = Parameter(USER_ID).ToString()
                        arrParam(UserID).Value = TruncString(useId.ToString(), 50)
                        arrParam(Facility).Value = TruncString(row("Facility").ToString(), 200)
                        arrParam(Curr).Value = TruncString(row("Curr").ToString(), 3)
                        arrParam(TotalAmount).Value = row("Amount")
                        arrParam(MaturityDate).Value = row("MaturityDate")
                        arrParam(IssueDate).Value = row("IssueDate")
                        arrParam(Curr_Alt).Value = TruncString(row("CurrAlt").ToString(), 3)
                        arrParam(TotalAmount_Alt).Value = row("AmountAlt")
                        arrParam(Comp_ID).Value =row("Comp_ID")
                        arrParam(Role).Value = TruncString(row("Role").ToString(), 100)
                        arrParam(ParticipationAmt).Value = row("ParticipationAmt")
                        arrParam(IntStatus).Value = ""
                        arrParam(ContNo).Value = TruncString(row("ContNo").ToString(), 10)                    '
                        SqlHelper.ExecuteNonQuery(tr, CommandType.Text, strInsertData, arrParam)
                    Next            Catch ex As Exception
                    tr.Rollback()
                    Throw ex            End Try            tr.Commit()
                tr.Dispose()            Return TaskResultType.Success
            End Function        Private Function TruncString(ByVal str As String, ByVal len As Integer) As String
                Return str.Substring(0, IIf(str.Length > len, len, str.Length)).ToString()
            End Function
      

  17.   


          Const CONN_EXCEL As String = "Provider=Microsoft.Jet.OLEDB.4.0; Data Source=""{0}""; Extended Properties=""Excel 8.0;HDR={1:G};IMEX={2:D};"""
            Const CONN_CSV As String = "Provider=Microsoft.Jet.OLEDB.4.0; Data Source=""{0}""; Extended Properties=""Text;HDR={1:G};IMEX={2:D};FMT=Delimited;"""
            Public Enum AccessMode
                Export = 0
                Import = 1
                FullUpdate = 2
            End Enum        Public Enum FirstRowIsTitle
                Yes
                No
            End Enum        Public Shared Function CreateExcelConnectionString(ByVal filePath As String, ByVal accessType As AccessMode, Optional ByVal firstRowAsHeader As FirstRowIsTitle = FirstRowIsTitle.Yes) As String
                Dim res As String = Nothing, format As String = Nothing            Dim firstRowHeader As String = Nothing            filePath = filePath.ToUpper
                If filePath.EndsWith(".XLS") Then
                    format = CONN_EXCEL
                ElseIf filePath.EndsWith(".CSV") Then
                    format = CONN_CSV
                Else
                    Throw New ArgumentOutOfRangeException("Invalid file extention.")
                End If            res = String.Format(format, filePath, firstRowAsHeader, accessType)            Return res
            End Function
      

  18.   

    这个问题有待研究,如果不考虑效率问题的话大可用excel直导
      

  19.   

    我这几天也在研究这个,我写了一个,不过我的数据跟你有些不同,但万变不离其中啊,看代码
    先把数据读取到DataTable, (我现在在想如果导入的时候要判断重复怎么办,100万啊)
    private void ExeclAutoDao()
            {
                if (System.IO.File.Exists(txttmFile.Text))
                {
                    string strResult = "", msg = "";
                    StreamReader sr = new StreamReader(txttmFile.Text);
                    strResult = sr.ReadLine().Trim();
                    //int k = 0;
                    //Hashtable hs = new Hashtable();
                    DataTable dt = new DataTable();
                    dt.Columns.Add("q_partno", typeof(string));
                    dt.Columns.Add("gongsi", typeof(string));
                    dt.Columns.Add("qy_id", typeof(int));
                    while (strResult != null)
                    {
                        //hs.Add(k, strResult);
                        //k += 1;
                        DataRow dr = dt.NewRow();
                        dr["q_partno"] = strResult;//hx[j].ToString().Trim();
                        dr["gongsi"] = qy.Qy_yuyinName;
                        dr["qy_id"] = qy.Qy_pkid;
                        dt.Rows.Add(dr);
                        strResult = sr.ReadLine();
                    }
                    if (dc.ExecuteLink(dt))
                    {//msg = qyMid.check_partnoHX(hs
                        MessageBox.Show("导入数据成功",共导入,+dt.Rows.Count, "提示", MessageBoxButtons.OKCancel, MessageBoxIcon.Warning);
                    }
                }
                else
                {
                    MessageBox.Show("要导入的文件不存在", "提示", MessageBoxButtons.OKCancel, MessageBoxIcon.Error );
                }
            }
    public bool  ExecuteLink(DataTable dt)
            {
                bool isok = false;
                SqlBulkCopy sqlbc;
                try
                {
                    Connection = new SqlConnection(getConnectionText());                using (sqlbc = new SqlBulkCopy(Connection))
                    {
                        Connection.Open();
                        sqlbc.BatchSize = 10000;
                        sqlbc.BulkCopyTimeout = 300;
                        sqlbc.DestinationTableName = "db_interner";
                        sqlbc.ColumnMappings.Add("q_partno", "q_partno");
                        sqlbc.ColumnMappings.Add("gongsi", "gongsi");
                        sqlbc.ColumnMappings.Add("qy_id", "qy_id");
                        sqlbc.WriteToServer(dt);                }
                    isok = true;
                }
                catch
                { isok = false; }
                //finally
                //{
                //    //Connection.Close();
                //    //Command.Clone();
                //    //sqlread.Close();
                //    //sqlbc.Close();
                //}
                finally
                {
                    Connection.Close();
                    Connection.Dispose();
                   // sqlbc.Close();
                }
                return isok;你试试看
      

  20.   

    SqlBulkCopy 就是对大量数据进行操作的!
      

  21.   

    bcp命令具体参考http://www.haogongju.net/art/459613
    或google一下 bcp命令
      

  22.   

    我知道的有三个方法,都用过
    第一个是BCP,只支持固定格式导入 
    第二个是DTS,这个更灵活些,可以做些过滤,做些数据处理
    第三个就是SqlBulkCopy,这个大家都说过。这三个的性能基本差不多,这个导入100W很简单,至于说写文本分析,再一条条Insert,对于批数据处理来讲,还是不大专业的,当然也可以用SSIS,不过对版本有要求。以上这三个都可以用C#实现。
      

  23.   

    bulk insert 专做这种事情,速度很快
      

  24.   

    如果是我 估计就用dataset 
    写一个 事务 导进去
      

  25.   

    我的想法是先看看对数据进行拆分,比如处理掉1000行就开 一个线程,然后提交一个sql事务这样可以不用等全部处理完然后再进行操作了!
      

  26.   

    使用命令行工具bcp啊  最适合这种格式化的文本数据大批量导入了
      

  27.   

    BULK INSERT 库名..表名 
    FROM 'c:\test.txt' 
    WITH ( 
      --BATCHSIZE = 1000,--每1000条数据提交一次。
    FIELDTERMINATOR = ';', 
    ROWTERMINATOR = '\n' 

      

  28.   

    有一个大数据量倒入的类sqlbulkcopy
    可以看看,如果是你的数据以后在数据库中还要查询的话,新建索引时,会导致插入比较的慢,可以在插入以后再建立索引。
      

  29.   

    可以复制到Excel里!自动导入!我昨天才弄的!!
      

  30.   

    sql 2008的SSIS应用方法: 将多个一定格式的文本文件批量导入到数据库 .http://blog.csdn.net/keenweiwei/article/details/6869158
      

  31.   

    bulk insert tb
    from  'c:\x.txt'把数据库恢复模式改为 大容量日志模式
      

  32.   

    bulk insert,我导入过40w数据,很快就好了
      

  33.   

    如果是规则文本想办法读取,拆分,大约2~3万一个数据包在读成DataTable
    调用一下程序        //数据导入
            public static void ExcelUpDataBase(DataTable dt)
            {
                SqlBulkCopy bulkCopy = new SqlBulkCopy(连接串);
                bulkCopy.DestinationTableName = dt.TableName;
                bulkCopy.BulkCopyTimeout = 60;
                bulkCopy.WriteToServer(dt);
                dt.Dispose();
            }
    这是走程序的方式,不过要注意你的文本文件的编码格式,否则会乱码。
    也可以尝试SSIS包的方式,配置也比较简单,而且速度也十分快。
      

  34.   

    Ctrl + CCtrl + V ok....
      

  35.   

    将文本导入到excel中,然后每个字段之间以“|”分割。再导回成文本。
    利用bcp命令很快就可以导入到sqlserver。前提是数据不能有乱字符。为空的也要有“||”分割。
    bcp命令如下:
    bcp 库名..EX_表名 in 文本文件.txt -S数据库服务名 -U数据库用户 -P密码 -c -t"|"
      

  36.   

    http://www.microsoft.com/biztalk/zh/cn/default.aspx我感觉 可行的方案就是用biztalk完成,整合性强
      

  37.   

    零售贷款风险分类帐户清单日报 
                                                                                                                  ================================
      机构 :  中国银行北京中关村南大街支行                                                                         日期 : 2011/09/25                                                      货币 : CNY                                               页码 :     1  客户号           客户名                         产品码     贷款账号            放款日期       期限 ( 月 )  逾期天数       贷款合同金额           贷款余额         未到期本金       拖欠本金余额           拖欠利息           罚息合计   系统分类结果  手工分类结果 
      0000000109521149  乔峰                        10360007   00000318156588696   2011/01/24     00240        00000            860,000.00         849,996.85         849,996.85               0.00               0.00               0.00    正常          正常 
      0000000109199694  雪儿                        10360007   00000325956590374   2011/01/25     00204        00000            560,000.00         551,295.14         551,295.14               0.00               0.00               0.00    正常          正常 
      0000000109519284  林冲                          10360007   00000327256718740   2011/02/18     00120        00000          1,150,000.00       1,111,077.88       1,111,077.88               0.00               0.00               0.00    正常          正常 
      0000000095571780  宋江                        10360007   00000328556899146   2011/03/23     00360        00000            770,000.00         766,674.69         766,674.69               0.00               0.00               0.00    正常          正常 
      0000000142824394  于铁木                          10360007   00000331157262971   2011/06/10     00264        00000            600,000.00         597,595.48         597,595.48               0.00               0.00               0.00    正常          正常 
      0000000100044704  田田                          10360007   00000335056748400   2011/02/23     00360        00000这个怎么读啊?
      

  38.   

    我觉得楼主可以先导入到execl然后处理,txt导入到execl应该很容易,然后写程序导入数据库,若直接execl和数据库自带的导入数据库,就会出现你的数据错误,0000000142824394不能正确导入,我刚写完程序!
      

  39.   

    直接利用SQL SERVER的导入导出工具就可以
      

  40.   

    如果是在程序里,可以用 bcp 导入工具.
      

  41.   


      public void insertintosql()
            {
                string path = "C:\\Documents and Settings\\桌面\\C#.txt";
                StreamReader sr = new StreamReader(path);
                while (sr.Peek()!=-1)
                {
                    //数据以特定的规律以空格为分隔符
                   string[] name= sr.ReadLine().Split(new char[]{' '});
                    //连接sql,插入数据库
                   name[0].ToString();
                   name[1].ToString();
     
                }
     
            }