A question about converting PDF files to txt format

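I want to convert a PDF document to a txt document, but for some reason the Chinese characters all show up as garbage in the txt file, while every other character displays fine.
I'm at my wits' end. Could someone please help me out? :( (Note: four ways to extract text from Word and PDF documents in Java: http://touchpdf.blogdriver.com/touchpdf/98972.html) The program is as follows: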
import java.io.*;

public class PdfWin {
  public PdfWin() {
  }

  public static void main(String args[]) throws Exception
  {
    String PATH_TO_XPDF="C:\\Documents and Settings\\Frank\\pdftotext.exe";
    String filename="c:\\a.pdf";
    // Ask pdftotext to write UTF-8 text to stdout ("-")
    String[] cmd = new String[] { PATH_TO_XPDF, "-enc", "UTF-8", "-q", filename, "-"};
    Process p = Runtime.getRuntime().exec(cmd);
    BufferedInputStream bis = new BufferedInputStream(p.getInputStream());
    // The reader already decodes the UTF-8 bytes into Java chars
    InputStreamReader reader = new InputStreamReader(bis, "UTF-8");
    // No charset given here, so result.txt is written in the platform default encoding
    PrintWriter resultFile = new PrintWriter(new FileOutputStream(".\\result.txt"));
    char [] buf = new char[10000];
    byte [] bytebuf = new byte[2];
    char [] buf2 = new char[2];      // unused
    String tempstring1 = null;
    String tempstring2 = null;
    int len;
    int Unicode = 0;                 // unused
    try {
      while ((len = reader.read(buf)) >= 0)
      {
        tempstring1 = null;          // += on a null String prepends the text "null"
        for (int i = 0; i < len; i++)
        {
          // The cast keeps only the low 8 bits of an already-decoded char
          bytebuf[0] = (byte)buf[i];
          if ((bytebuf[0]-160+256) > 99)
          {
            tempstring2 = String.valueOf(buf[i]);
          }
          else
          {
            // Treats two chars as one two-byte character and decodes the pair
            // with the platform default charset
            bytebuf[0] = (byte)buf[i];
            bytebuf[1] = (byte)buf[i+1];
            tempstring2 = new String(bytebuf);
            i++;
          }
          tempstring1 += tempstring2;
        }
        System.out.println(tempstring1);
        resultFile.print(tempstring1);
        resultFile.flush();
      }
    } catch (FileNotFoundException e1) {
      e1.printStackTrace();
    } catch (IOException e2) {
      e2.printStackTrace();
    }
    reader.close();
  }
}
Waiting online for replies!

Solutions »

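Can the pdftotext.exe tool actually convert Chinese text?
If it can, read its usage help carefully again.
How do I read Chinese characters out of UTF-8 encoded text?
Is anyone there? :(
English works, so Chinese should work too. Try converting it with new String and see.
I tested the original poster's code and ran into the same problem!
Not quite the same, though: when I convert a PDF document to txt, about half of the Chinese characters in the txt file are garbled and the other half are correct.
Frustrating!
Hoping someone can offer an answer!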
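pdftotext.exe still has some problems with certain Chinese PDFs (those with embedded fonts), which is why half of the Chinese characters come out garbled while the other half are correct.
xpdf and the Chinese language package can be downloaded from: http://www.foolabs.com/xpdf/
When calling it from Java, watch the language package paths in the xpdf configuration file xpdfrc. The simplest approach is to use absolute paths, so that pdftotext can still find the language package when Java launches it.
String[] cmd = new String[] { PATH_TO_XPDF, "-layout","-enc", "GBK", "-q", pdfFile, "-"};
Process p = Runtime.getRuntime().exec(cmd);
BufferedInputStream bis = new BufferedInputStream(p.getInputStream());
InputStreamReader reader = new InputStreamReader(bis, "GBK"); // decode with the same charset passed to -enc
That should take care of the garbled Chinese.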
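The point of the reply above is that the charset used to decode the pdftotext output has to match the value given to -enc. A minimal sketch of that idea follows, assuming the paths from the original post; the class name PdfTextCopy and the helper copy are made up for illustration, and writing the result as UTF-8 is just one possible choice.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class PdfTextCopy {
    /** Decodes bytes from 'in' as charset 'from' and writes them to 'out' encoded as 'to'. */
    static void copy(InputStream in, Charset from, OutputStream out, Charset to) throws IOException {
        Reader reader = new BufferedReader(new InputStreamReader(in, from));
        Writer writer = new BufferedWriter(new OutputStreamWriter(out, to));
        char[] buf = new char[8192];
        int len;
        while ((len = reader.read(buf)) >= 0) {
            // The chars are already decoded, so they are written as-is
            writer.write(buf, 0, len);
        }
        writer.flush();
    }

    public static void main(String[] args) throws Exception {
        String encName = "GBK"; // must match the -enc value below
        // Paths taken from the original post; adjust for your machine
        ProcessBuilder pb = new ProcessBuilder(
            "C:\\Documents and Settings\\Frank\\pdftotext.exe",
            "-layout", "-enc", encName, "-q", "c:\\a.pdf", "-");
        pb.redirectError(ProcessBuilder.Redirect.INHERIT); // keep stderr from filling up unread
        Process p = pb.start();
        try (InputStream in = p.getInputStream();
             OutputStream out = new FileOutputStream("result.txt")) {
            copy(in, Charset.forName(encName), out, StandardCharsets.UTF_8);
        }
        p.waitFor();
    }
}
```

Naming the charset on both the reader and the writer keeps the result independent of the platform default encoding, which is where the original program goes wrong.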
laughsmile(海边的星空):
I tried your method and it doesn't work.
Also, on Windows, how should the language package paths in the xpdfrc configuration file be set?
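One possible approach on Windows, assuming the pdftotext in use is xpdf's own build (which accepts a -cfg option naming the configuration file), is to pass the absolute path of xpdfrc on the command line so nothing depends on where the tool searches for it. The sketch below only builds the command; the class name PdfToTextCommand and the C:\xpdf paths are hypothetical. The language package entries inside xpdfrc would still need absolute Windows paths, as suggested earlier in the thread.

```java
import java.util.ArrayList;
import java.util.List;

class PdfToTextCommand {
    /** Builds a pdftotext command line that names the config file explicitly. */
    static List<String> build(String exe, String xpdfrc, String enc, String pdf) {
        List<String> cmd = new ArrayList<>();
        cmd.add(exe);
        cmd.add("-cfg");   // read this config file instead of the default one
        cmd.add(xpdfrc);
        cmd.add("-enc");   // output text encoding
        cmd.add(enc);
        cmd.add("-q");     // suppress messages
        cmd.add(pdf);
        cmd.add("-");      // write the text to stdout
        return cmd;
    }

    public static void main(String[] args) {
        // Hypothetical install location; use wherever xpdf was unpacked
        List<String> cmd = build("C:\\xpdf\\pdftotext.exe", "C:\\xpdf\\xpdfrc", "GBK", "c:\\a.pdf");
        System.out.println(cmd);
    }
}
```

The returned list can be handed to new ProcessBuilder(cmd), or converted to an array for Runtime.getRuntime().exec.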
String gb = new String(utfstring.getBytes("UTF-8"),"gb2312");
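That means the encoding is wrong. You could first see whether you can determine the internal encoding of the PDF file,
and then interpret the text using that encoding. That should do it.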
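A caveat on the one-liner new String(utfstring.getBytes("UTF-8"),"gb2312"): a Java String is already decoded text, so encoding it as UTF-8 and then decoding those bytes as gb2312 misreads every non-ASCII character. Conversion between encodings works on bytes: decode with the charset the bytes are really in, then encode with the target. The sketch below shows that, plus a rough check of whether some bytes decode without errors in a candidate charset. The class and method names are made up for illustration, and a clean decode only means the bytes are valid in that charset, not that it is the right one.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class EncodingTools {
    /** Returns true if 'data' decodes in 'cs' with no malformed or unmappable input. */
    static boolean decodesCleanly(byte[] data, Charset cs) {
        CharsetDecoder decoder = cs.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Re-encodes bytes: decode with the source charset, then encode with the target. */
    static byte[] transcode(byte[] data, Charset from, Charset to) {
        return new String(data, from).getBytes(to);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the two-character word for "Chinese"
        byte[] utf8 = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodesCleanly(utf8, StandardCharsets.UTF_8));
        byte[] utf16 = transcode(utf8, StandardCharsets.UTF_8, StandardCharsets.UTF_16BE);
        System.out.println(new String(utf16, StandardCharsets.UTF_16BE));
    }
}
```

To try candidates such as GBK on the pdftotext output, pass Charset.forName("GBK") to decodesCleanly, provided the JRE ships that charset.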