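I'm a beginner who has just started using POI and have run into a lot of problems; I'd appreciate help from the experts. My POI version is poi-3.5-beta1, downloaded from the official site.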
String s = ""; // put the file path here
File file = new File(s);
if (file.length() == 0) return;
InputStream inputStream = new FileInputStream(file);
WordExtractor we = new WordExtractor(inputStream); // every error below is thrown by this line!
1. java.lang.NullPointerException
at org.apache.poi.poifs.property.DirectoryProperty.addChild(DirectoryProperty.java:289)
at org.apache.poi.poifs.property.PropertyTable.populatePropertyTree(PropertyTable.java:174)
at org.apache.poi.poifs.property.PropertyTable.<init>(PropertyTable.java:82)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:171)
at org.apache.poi.hwpf.HWPFDocument.verifyAndBuildPOIFS(HWPFDocument.java:127)
at org.apache.poi.hwpf.extractor.WordExtractor.<init>(WordExtractor.java:49)
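This Word file opens normally in Word, and if I copy its contents into a newly created Word file, the new file reads fine, so I suspect it has something to do with the Word version.
2. java.lang.NullPointerException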
at org.apache.poi.hwpf.sprm.ParagraphSprmUncompressor.uncompressPAP(ParagraphSprmUncompressor.java:50)
at org.apache.poi.hwpf.model.StyleSheet.createPap(StyleSheet.java:248)
at org.apache.poi.hwpf.model.StyleSheet.<init>(StyleSheet.java:121)
at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:248)
at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:152)
at org.apache.poi.hwpf.extractor.WordExtractor.<init>(WordExtractor.java:57)
at org.apache.poi.hwpf.extractor.WordExtractor.<init>(WordExtractor.java:49)
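This Word file also opens normally in Word, and if I copy its contents into a newly created Word file, the new file reads fine, so I suspect it has something to do with the Word version.
3. java.io.IOException: Invalid header signature; read -8325795626950723393, expected -2226271756974174256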
at org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:112)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:151)
at org.apache.poi.hwpf.HWPFDocument.verifyAndBuildPOIFS(HWPFDocument.java:127)
at org.apache.poi.hwpf.extractor.WordExtractor.<init>(WordExtractor.java:49)
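This Word file shows only garbled text when opened in Word.
Those are the three problems I've hit so far; any pointers would be appreciated.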

Solutions »

  1.   

    OP, I just tested with 3.5: both existing .doc files and newly created .doc files read fine.
      

  2.   

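    The Word file contains something that POI doesn't recognize. Try a simple Word file: for example, create a new document with simple content and test with that.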
      

  3.   

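    First of all, thanks to everyone who replied. I forgot to mention that I'm reading a large number of Word documents. The others all read fine; only a few have problems, and those few are the ones I picked out to ask about.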
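When only a handful of files out of many fail, it can help to catch the failure per file and collect the offenders instead of letting one exception stop the whole run. The following is a minimal sketch under stated assumptions, not the poster's code: `TextExtractor` is a hypothetical stand-in for the real extraction call (for example `new WordExtractor(in).getText()`), and `BatchReader` and `findFailures` are names made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BatchReader {
    /** Hypothetical stand-in for the real extraction call. */
    interface TextExtractor {
        String extract(InputStream in) throws Exception;
    }

    /** Tries every file; returns path -> failure description for the files that could not be read. */
    static Map<Path, String> findFailures(List<Path> files, TextExtractor extractor) {
        Map<Path, String> failures = new LinkedHashMap<>();
        for (Path file : files) {
            // try-with-resources closes the stream even when extraction throws
            try (InputStream in = Files.newInputStream(file)) {
                extractor.extract(in);
            } catch (Exception e) {
                // Catches IOException as well as runtime failures such as NullPointerException
                failures.put(file, e.toString());
            }
        }
        return failures;
    }

    public static void main(String[] args) throws IOException {
        Path good = Files.createTempFile("good", ".doc");
        Path empty = Files.createTempFile("empty", ".doc");
        Files.write(good, new byte[] {1, 2, 3});
        // Toy extractor for the demo: rejects empty input, standing in for a parser failure
        TextExtractor toy = in -> {
            if (in.read() < 0) throw new IOException("empty file");
            return "text";
        };
        System.out.println(findFailures(Arrays.asList(good, empty), toy));
    }
}
```

The resulting map gives a list of problem files that can be inspected one by one, while the readable documents are still processed.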
      

  4.   

    http://topic.csdn.net/u/20080704/16/0102f610-f630-48ee-8b4f-0f70ddbfdc19.html
    I replied in that thread; I hope it's useful to the OP.
      

  5.   

    I had a look; it appears to be about Excel import. Sorry, I forgot to say that all I'm doing is reading Word documents.
      

  6.   

    What version of Word is it? poi-3.5-beta1 adds support for Office 2007, but it is a beta and still in testing. If this is for a real project, never use a beta release!
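One rough way to answer the "which version" question without opening Word is to look at the first few bytes of the file. The sketch below assumes the usual magic numbers: `D0 CF 11 E0 A1 B1 1A E1` for the OLE2 container used by binary .doc files, `PK\3\4` for the ZIP package used by Office 2007 .docx files, and `{\rtf` for RTF. `FormatSniffer` and `Kind` are illustrative names, not part of POI. An OLE2 signature only identifies the container, so it does not by itself guarantee that the Word format inside is one the library can parse.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class FormatSniffer {
    enum Kind { OLE2, ZIP, RTF, UNKNOWN }

    /** Classifies a file by the first bytes of its content. */
    static Kind sniff(byte[] head) {
        if (startsWith(head, 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1)) return Kind.OLE2; // binary .doc container
        if (startsWith(head, 'P', 'K', 3, 4)) return Kind.ZIP;   // .docx is a ZIP package
        if (startsWith(head, '{', '\\', 'r', 't', 'f')) return Kind.RTF;
        return Kind.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            // Mask to compare as unsigned values; Java bytes are signed
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    /** Reads up to eight bytes from the file and classifies them. */
    static Kind sniff(Path file) throws IOException {
        byte[] head = new byte[8];
        try (InputStream in = Files.newInputStream(file)) {
            int n = 0, r;
            while (n < head.length && (r = in.read(head, n, head.length - n)) > 0) n += r;
            return sniff(Arrays.copyOf(head, n));
        }
    }

    public static void main(String[] args) throws IOException {
        for (String name : args) {
            System.out.println(name + ": " + sniff(java.nio.file.Paths.get(name)));
        }
    }
}
```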
      

  7.   

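    Oh, there won't be any Office 2007 files, but there may be some from versions before 97.
    As for the third problem, I think I've found the answer: the file is not in a normal Word format. On a friend's advice I looked at the byte array read from the InputStream. The byte array of a normal Word document should start with the following bytes: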
    -48 -49 17 -32 -95 -79 26 -31 ... (the remaining bytes of the dump were printed without separators)
    A file that is not in a normal Word format doesn't start this way, which is presumably what causes the Invalid header signature error?
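The numbers in the exception message can be tied back to those leading bytes. As a sketch, assuming the library compares the first eight bytes of the file as one little-endian `long` (which is what the "expected" value suggests), the bytes -48 -49 17 -32 -95 -79 26 -31 turn into exactly the expected value from the message. The helper names here are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class HeaderSignature {
    // First eight bytes of a normal .doc file, as signed Java bytes (D0 CF 11 E0 A1 B1 1A E1)
    static final byte[] OLE2_MAGIC = {-48, -49, 17, -32, -95, -79, 26, -31};

    /** Interprets the first eight bytes as a little-endian long. */
    static long toLittleEndianLong(byte[] first8) {
        return ByteBuffer.wrap(first8, 0, 8).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    /** Inverse direction: turns a value from the error message back into the eight file bytes. */
    static byte[] toBytes(long signature) {
        return ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN).putLong(signature).array();
    }

    public static void main(String[] args) {
        System.out.println(toLittleEndianLong(OLE2_MAGIC)); // prints -2226271756974174256

        // Decode the "read" value from the error message to see how the bad file starts
        StringBuilder hex = new StringBuilder();
        for (byte b : toBytes(-8325795626950723393L)) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println(hex.toString().trim());
    }
}
```

Decoding the "read" value this way shows the first eight bytes of the failing file, which can then be compared against known file signatures.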
      

  8.   

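    In the open-source world there are two fairly influential APIs for exporting to Excel: POI and jExcelAPI. jExcelAPI is reportedly the work of a Korean programmer. It may not have POI's distinguished pedigree, but in practice I found it simple and convenient, with very good Chinese support and fairly powerful features. It can be downloaded from http://www.andykhan.com/jexcelapi/ and I have already used it in one of my projects to export database data to Excel successfully.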
    The code is as follows:
    /*
     * Generated by MyEclipse Struts
     * Template path: templates/java/JavaClass.vtl
     */
    package com.cptwyscg.struts.action;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.struts.action.Action;
    import org.apache.struts.action.ActionForm;
    import org.apache.struts.action.ActionForward;
    import org.apache.struts.action.ActionMapping;
    import com.cptwyscg.struts.form.YskgbbForm;
    import java.io.File;
    import java.sql.*;
    import java.util.ArrayList;
    import java.util.List;
    import java.text.DecimalFormat;
    import jxl.*;
    import jxl.write.*;
    import java.io.FileOutputStream;

    public class ToexcelAction extends Action {
        public ActionForward execute(ActionMapping mapping, ActionForm form,
                HttpServletRequest request, HttpServletResponse response) {
            Statement st = null;
            ResultSet rs = null;
            List ysx = new ArrayList();
            double ysze;
            double ysye;
            double usemony;
            int i = 0; // index of the sheet row currently being written

            YskgbbForm yskgbbForm = (YskgbbForm) form;
            // jsas400 is apparently the project's own connection helper (not shown here)
            Connection conn = jsas400.getConnection();

            int yearmonth = 2008;
            String sqlstr = null;

            sqlstr = "select * from xxx";

            try {
                // Create the output workbook file
                WritableWorkbook book = Workbook.createWorkbook(new File("E:\\updowload\\testexcel.xls"));
                // Create a worksheet named "first"; the argument 0 makes it the first sheet
                WritableSheet sheet = book.createSheet("first", 0);
                try {
                    st = conn.createStatement();
                    rs = st.executeQuery(sqlstr);
                    while (rs.next()) {
                        ...
                        ...
                        ...
                        // The Label constructor takes (column, row, text):
                        // here column 0 of row i, filled from the first result column
                        Label label1 = new Label(0, i, rs.getString(1));
                        // Add the finished cell to the worksheet
                        sheet.addCell(label1);
                        Label label2 = new Label(1, i, rs.getString(2));
                        sheet.addCell(label2);
                        Label label3 = new Label(2, i, rs.getString(3));
                        sheet.addCell(label3);
                        /* Create cells that hold numbers.
                           Number must be written with its full package name,
                           otherwise it is ambiguous with java.lang.Number.
                           These cells go in columns 3 and 4 of row i. */
                        jxl.write.Number number1 = new jxl.write.Number(3, i, Double.valueOf(rs.getString(4)));
                        sheet.addCell(number1);
                        jxl.write.Number number2 = new jxl.write.Number(4, i, Double.valueOf(rs.getString(5)));
                        sheet.addCell(number2);
                        Label label4 = new Label(5, i, rs.getString(6));
                        sheet.addCell(label4);
                        Label label5 = new Label(6, i, rs.getString(7));
                        sheet.addCell(label5);
                        // Move on to the next sheet row
                        i++;
                    }
                } catch (SQLException ex) {
                    ex.printStackTrace();
                    //throw ex;
                } finally {
                    jsas400.releaseConnection(conn, st, rs);
                }
                // Write the data out and close the file
                book.write();
                book.close();
            } catch (Exception e) {
                System.out.println(e);
            }
            return mapping.findForward("success");
        }

        private String String(byte[] bytes, String string) {
            // TODO Auto-generated method stub
            return null;
        }
    }
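One thing to watch in the listing above: `rs.getString(4)` returns `null` when the column is SQL NULL, and `Double.valueOf(null)` throws a `NullPointerException`, which the outer `catch` would report after aborting the export. A small guard could look like the sketch below; `CellValues` and `parseOrNull` are hypothetical names, and treating blank or non-numeric text as "no value" is an assumption about what the export should do.

```java
class CellValues {
    /** Returns the parsed number, or null when the text is null, blank, or not numeric. */
    static Double parseOrNull(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return null;
        }
        try {
            return Double.valueOf(raw.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrNull("789.123")); // prints 789.123
        System.out.println(parseOrNull(null));      // prints null
    }
}
```

The caller can then add the numeric cell only when the result is non-null and leave the cell empty otherwise.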