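As far as I can tell, my code is no different from the examples I found online. So why do other people get garbled text at worst, while I get this error?
Here is the code: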
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

import org.pdfbox.pdfparser.PDFParser;
import org.pdfbox.pdmodel.PDDocument;
import org.pdfbox.util.PDFTextStripper;

public class pdf3 {
    public pdf3() {
    }

    public static void main(String[] args) {
        FileInputStream is = null;
        try {
            is = new FileInputStream("d:\\test1.pdf");
        } catch (FileNotFoundException e) {
            e.printStackTrace();
            return; // nothing to parse if the file could not be opened
        }
        PDFParser parser = null;
        PDFTextStripper stripper = null;
        PDDocument pdfdocument = null;
        String result = "";
        try {
            parser = new PDFParser(is);
            parser.parse();
            pdfdocument = parser.getPDDocument();
            stripper = new PDFTextStripper();
            result = stripper.getText(pdfdocument); // this is the line that fails
        } catch (IOException e1) {
            e1.printStackTrace();
        }
        System.out.println("the string length is" + result.length() + "\n");
        System.out.println("============\n" + result + "\n");
    }
}
The converted text comes out really well (it keeps the layout of the original PDF).