My current code can only index plain-text files. How can I index the contents of .doc or .docx files?
The code below works for a .txt file but stops working once I switch to a .doc file:
IndexWriter writer = null;
Document doc = new Document();
writer = new IndexWriter(@"E:\ss\i", new StandardAnalyzer(), true);
StreamReader sr = new StreamReader(@"E:\ss\111.txt", System.Text.Encoding.GetEncoding("gb2312"));
// Field arguments are (name, value, store, index).
doc.Add(new Field("name", @"E:\ss\111.txt", Field.Store.YES, Field.Index.UN_TOKENIZED));
doc.Add(new Field("content", sr.ReadToEnd(), Field.Store.YES, Field.Index.TOKENIZED));
sr.Close();
doc.Add(Field.UnIndexed("filename", "111.txt"));
writer.AddDocument(doc);
writer.Optimize();
writer.Close();
Solutions »
{
    try
    {
        // Create a new index at indexpath (true = overwrite any existing index).
        IndexWriter writer = new IndexWriter(indexpath, getAnalyzer(), true);
        // Buffer many documents in memory and merge segments rarely to speed up bulk indexing.
        writer.mergeFactor = 3000;
        writer.minMergeDocs = 3000;
        writer.maxMergeDocs = int.MaxValue;
        for (int i = 0; i < dsMarketResult.Tables[0].Rows.Count; i++)
        {
            Document doc = new Document();
            // id: stored and indexed as a single untokenized term.
            doc.Add(Field.Keyword("id", dsMarketResult.Tables[0].Rows[i][0].ToString()));
            // title: stored only, not searchable.
            doc.Add(Field.UnIndexed("title", ClassIndex.RemoveHTML(dsMarketResult.Tables[0].Rows[i][1].ToString())));
            // content: taken from the same column as the title; appending column 2 is left disabled.
            string content = dsMarketResult.Tables[0].Rows[i][1].ToString();
            //content += dsMarketResult.Tables[0].Rows[i][2].ToString();
            // content: stored, tokenized and indexed for full-text search.
            doc.Add(Field.Text("content", ClassIndex.RemoveHTML(content)));
            writer.AddDocument(doc);
        }
        writer.Optimize();
        writer.Close();
    }
    catch (Exception exp2)
    {
        // Append the exception to a log file instead of rethrowing it.
        System.IO.StreamWriter write = new System.IO.StreamWriter(@"d:\index\createindexlog.txt", true, System.Text.Encoding.Unicode, 10240);
        write.WriteLine("Logged at: " + DateTime.Now.ToString());
        write.Write(exp2.ToString());
        write.Flush();
        write.Close();
    }
}
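The snippet above indexes strings that already sit in a DataSet, so it does not cover the step the question is missing: turning a Word file into plain text before handing it to Lucene. Below is a minimal sketch of that step for .docx only. It relies on the fact that a .docx file is a zip archive whose main text lives in word/document.xml, inside w:t elements. The class name DocxText and its methods are made up for this illustration and are not part of Lucene.Net. It assumes .NET 4.5 or later (on .NET Framework, add references to System.IO.Compression and System.IO.Compression.FileSystem). It ignores headers, footers, footnotes and comments, and it may repeat text that Word stores twice for compatibility. The legacy binary .doc format is not a zip archive and needs a separate extractor, which this sketch does not attempt.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper (not part of Lucene.Net): pulls plain text out of a .docx file.
public static class DocxText
{
    // Main WordprocessingML namespace used by word/document.xml.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Opens the .docx as a zip archive and extracts the text of its main document part.
    public static string Extract(string path)
    {
        using (ZipArchive zip = ZipFile.OpenRead(path))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found; not a .docx file?");

            using (Stream stream = entry.Open())
                return ExtractFromXml(XDocument.Load(stream));
        }
    }

    // Walks the XML in document order: each w:t contributes its text,
    // and each w:p after the first starts a new line.
    public static string ExtractFromXml(XDocument xml)
    {
        var sb = new StringBuilder();
        foreach (XElement e in xml.Descendants())
        {
            if (e.Name == W + "p")
            {
                if (sb.Length > 0) sb.Append('\n');
            }
            else if (e.Name == W + "t")
            {
                sb.Append(e.Value);
            }
        }
        return sb.ToString();
    }
}
```

With a helper like this, the "content" field in the question's code could be filled from DocxText.Extract(@"E:\ss\111.docx") instead of the StreamReader (the .docx path is an assumed example); the rest of the indexing code stays the same.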