How can Lucene 2.4.0 support searching xls, doc, and xml files? The examples I found online do not support doc or xls!

Solutions »

You can solve this with the third-party jacob.jar package. Download it here:
http://sourceforge.net/project/showfiles.php?group_id=109543&package_id=118368
A Jacob usage example:
http://cache.baidu.com/c?m=9d78d513d99d06fe0fb1c5291a16a6234414d6777b978e1a2592d50a8465285c5a23a6fe302267548d9829365db8492bbbad696f704277f798c295128afbd263388f53642e41d35c428d44fad64624ca27955aedaa0ee7cdaa74ccf0&p=9360c64ad18905ef44bd9b780d42&user=baidu
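Lucene does not care about the source or the format of the data.
  For any document that is not plain text, such as Word, Excel, or PDF, you have to use a suitable tool yourself (Jacob, POI, jxl, iText, and so on) to parse the content into a String, and only then tokenize it and build the index.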
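One way to organize the "parse to a String first" step is to pick an extractor by file extension. The sketch below is only an illustration: `TextExtractor` and `ExtractorRegistry` are hypothetical names, not part of Lucene or any of the libraries mentioned, and the real doc/xls extractors (POI, jxl, TextMining, ...) would have to be plugged in by you.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical interface: turns the raw bytes of a file into plain text. */
interface TextExtractor {
    String extract(byte[] data) throws Exception;
}

/** Hypothetical registry that maps a file extension to an extractor. */
class ExtractorRegistry {
    private final Map<String, TextExtractor> byExtension = new HashMap<>();

    /** Registers an extractor for an extension given without the dot, e.g. "doc". */
    void register(String extension, TextExtractor extractor) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    /** Returns the lower-cased extension without the dot, or "" if there is none. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Extracts plain text with the extractor registered for the file's extension. */
    String extract(String fileName, byte[] data) throws Exception {
        TextExtractor extractor = byExtension.get(extensionOf(fileName));
        if (extractor == null) {
            throw new IllegalArgumentException("No extractor registered for: " + fileName);
        }
        return extractor.extract(data);
    }
}
```

The String returned by `extract` is what you would then hand to Lucene as the content of a field. Matching on the lower-cased extension also covers names like `REPORT.DOC` without listing each spelling.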
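  Text formats that carry markup, such as XML and HTML, can simply be indexed as ordinary plain text, but to keep the index useful and accurate it is better to first use a parser to pull out the content that needs indexing, and then index only that.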
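For XML, the JDK's own DOM parser is enough to pull out just the elements you want to index. This is a minimal sketch under assumptions: `XmlFieldExtractor` and `textOf` are made-up names, and the element names depend entirely on your own documents.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: extracts the text of selected XML elements for indexing. */
class XmlFieldExtractor {

    /** Returns the text of every element named tagName, one element per line. */
    static String textOf(String xml, String tagName) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagName(tagName);
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < nodes.getLength(); i++) {
            if (text.length() > 0) {
                text.append('\n');
            }
            // getTextContent() joins the text of all descendants and drops the tags
            text.append(nodes.item(i).getTextContent().trim());
        }
        return text.toString();
    }
}
```

Indexing the result of `textOf(xml, "body")` instead of the raw file keeps tag names and attributes out of the index, so a search for a word like "title" does not match every document.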
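// Needs java.io.FileInputStream, java.io.InputStream, java.io.StringWriter,
// plus WordExtractor from the TextMining library (package assumed to be org.textmining.text.extraction)
String fileName = file.getName();
String content = "";
FileInputStream input = null;
InputStream is = null;
StringWriter docTextWriter = null;
//System.out.println(file.getPath());
// Match the extension case-insensitively (.doc, .DOC, ...)
if (file.getPath().toLowerCase().endsWith(".doc")) {
    // POI approach: reading fails when the Word file contains many tables, so it is no longer used
    // is = new FileInputStream(file);
    // WordDocument wd = new WordDocument(is);
    // docTextWriter = new StringWriter();
    // wd.writeAllText(new PrintWriter(docTextWriter));
    // content = docTextWriter.toString();

    // TextMining approach
    input = new FileInputStream(file);
    try {
        WordExtractor extractor = new WordExtractor();
        content = extractor.extractText(input);
    } finally {
        // Always release the file handle, even if extraction fails
        input.close();
    }
    //System.out.println(content);
}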