I want to download a web page with the program below

I want to download a web page with the program below. It already writes the page to a file successfully, but I don't want it written to a file; I only want to get the content as a string and print it to the console. How should I change it? I tried reading with Content = input.read(buf, 0, ReadByte), but Content turns out to be a strange string, and I really can't work out how to get the text. Very odd.

import java.io.IOException; // import classes
import java.io.*;
import java.net.URL;
import java.net.URLConnection;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.InputStreamReader;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.concurrent.TimeUnit;
import java.io.FileReader;
import java.io.InputStream;
import java.io.Reader;
import java.net.*;

public class TextURL {
    public static void getHTMLResource(String htmlFile) throws IOException { // read the content of the page at the given URL
        try {
            RandomAccessFile random = new RandomAccessFile("c:/11.txt", "rw");
            int ReadByte = 1024;
            URL url = new URL(htmlFile); // create a URL object from the address
            HttpURLConnection httpConnection = (HttpURLConnection) url.openConnection(); // open the connection to the remote resource
            InputStream input = httpConnection.getInputStream(); // get the input stream
            byte[] buf = new byte[ReadByte]; // byte array that holds the data read
            while ((Content = input.read(buf, 0, ReadByte)) > 0) { // read the data
                random.write(buf, 0, ReadByte); // write to the file
                System.out.println(input);
            }
            random.close(); // release resources
            input.close();
            System.out.println("" + 99);
            System.out.println("" + Content);
        } catch (Exception e) {}
    }

    public static void main(String[] args) throws IOException { // program entry point
        String htmlFile = "http://www.tianya.cn/publicforum/content/no05/1/219013.shtml";
        getHTMLResource(htmlFile);
    }
}

Solutions »

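What exactly is "strange" about the string? Could it be that no character set was specified when calling new String?
I tried your program. Doesn't it fail to compile? The variable Content is never declared. Also, the way you try to print to the console is wrong. It should be: System.out.println(new String(buf, 0, Content));
What you currently have is System.out.println(input); which only prints information about the input object itself, something like "sun.net.www.protocol.http.HttpURLConnection$HttpInputStream@32c41a".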
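The difference between printing the stream object and printing the decoded bytes can be seen in a minimal sketch. It assumes an in-memory stream in place of the HTTP connection so that it runs offline, and ASCII-only sample text; the class name StreamToConsoleDemo and the helper readAll are hypothetical and not part of the original program.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class StreamToConsoleDemo {
    // Reads the whole stream, decoding only the bytes actually read in each pass.
    // Per-chunk decoding is safe here only because the sample text is ASCII.
    static String readAll(InputStream input) throws IOException {
        byte[] buf = new byte[1024];
        StringBuilder sb = new StringBuilder();
        int len; // number of bytes read in this pass, -1 at end of stream
        while ((len = input.read(buf, 0, buf.length)) != -1) {
            sb.append(new String(buf, 0, len, StandardCharsets.US_ASCII));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for httpConnection.getInputStream() (assumption for the demo)
        InputStream input = new ByteArrayInputStream(
                "<html>hello</html>".getBytes(StandardCharsets.US_ASCII));
        System.out.println(input);          // prints the object's class name and hash code, not the content
        System.out.println(readAll(input)); // prints the content that was read
    }
}
```

Note that the value returned by read is a byte count, so it is used as the length argument when building the string rather than printed as the content.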
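You want to print what you read to the console instead of writing it to a file. In that case, iterating over the byte array you read into gives you the content. The statement that does the reading needs to change, though.
byte[] buf = new byte[1024];
int len = 0;
while ((len = reader.read(buf)) != -1) {
    writer.write(buf, 0, len);
}
for (byte b : buf) {
    System.out.print(b);
}
System.out.println();
This way you avoid running the JVM out of memory and can also print the data you read.
I modified your program; here is the result: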
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.net.HttpURLConnection;
import java.net.URL;

public class TextURL {
    public static void getHTMLResource(String htmlFile) throws IOException {
        try {
            RandomAccessFile random = new RandomAccessFile("d:/tt.txt", "rw");
            int ReadByte = 1024;
            URL url = new URL(htmlFile);
            HttpURLConnection httpConnection = (HttpURLConnection) url.openConnection();
            InputStream input = httpConnection.getInputStream();
            byte[] buf = new byte[1024];
            int len = 0;
            while ((len = input.read(buf)) != -1) {
                random.write(buf, 0, len);
            }
            // Prints each byte in the buffer as a decimal number
            for (byte b : buf) {
                System.out.print(b);
            }
            System.out.println();
            random.close();
            input.close();
            System.out.println("Reading finished");
        } catch (Exception e) {
        }
    }

    public static void main(String[] args) throws IOException {
        String htmlFile = "http://www.tianya.cn/publicforum/content/no05/1/219013.shtml";
        getHTMLResource(htmlFile);
    }
}
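Running it prints a string made up of the digits 0 to 9. The output is very long, so here is part of it:
"91143943391051121163210897110103117971031016134106971189711599114105112116343211511499613410411611"
The content received in the file is the source of that page.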
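You printed the bytes directly. Can you see any of the text that way? Once you have the stream, producing a string is easy; have a look at the String API
(String(byte[] bytes, String charsetName)).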
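As an illustration of that constructor, here is a small sketch contrasting a loop that prints each byte with decoding the same bytes through String(byte[] bytes, String charsetName). The class and method names are hypothetical, and the sample bytes are an assumption chosen for the demo: the digits quoted in the thread appear to end with the ASCII codes for src="htt, so a similar fragment is used here.

```java
import java.io.UnsupportedEncodingException;

class BytesVsStringDemo {
    // Concatenates the decimal value of every byte, which is what
    // calling System.out.print(b) in a loop produces.
    static String asDecimalDigits(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(b);
        }
        return sb.toString();
    }

    // Decodes the bytes as text using the named character set.
    static String decode(byte[] bytes, String charsetName) throws UnsupportedEncodingException {
        return new String(bytes, charsetName);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        byte[] ascii = "src=\"http".getBytes("US-ASCII");
        System.out.println(asDecimalDigits(ascii));    // 115114996134104116116112
        System.out.println(decode(ascii, "US-ASCII")); // src="http

        // Two Chinese characters written as Unicode escapes, encoded as UTF-8
        String original = "\u7f51\u9875";
        byte[] utf8 = original.getBytes("UTF-8");
        System.out.println(decode(utf8, "UTF-8").equals(original));      // true
        System.out.println(decode(utf8, "ISO-8859-1").equals(original)); // false: wrong charset
    }
}
```

The charset name has to match the encoding the page was actually served in; decoding with any other name yields garbled text rather than an error.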
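public static void getHTMLResource(String htmlFile) throws IOException { // read the content of the page at the given URL
    try {
        RandomAccessFile random = new RandomAccessFile("/root/Desktop/11.txt", "rw");
        int ReadByte = 1024;
        URL url = new URL(htmlFile); // create a URL object from the address
        HttpURLConnection httpConnection = (HttpURLConnection) url.openConnection(); // open the connection to the remote resource
        InputStream input = httpConnection.getInputStream(); // get the input stream
        byte[] buf = new byte[ReadByte]; // byte array that holds the data read
        int Content = 0; // number of bytes read in each pass
        StringBuffer bf = new StringBuffer();
        while ((Content = input.read(buf, 0, ReadByte)) != -1) { // read the data
            random.write(buf, 0, Content); // write to the file
            bf.append(new String(buf, 0, Content));
        }
        System.out.println(bf.toString());
        random.close(); // release resources
        input.close();
    } catch (Exception e) {
        // nothing
    }
}
Besides writing to the file on each pass, this also stores the result in a StringBuffer.
This is not a final solution, though: reading in chunks and converting each chunk to a string separately can produce garbled characters where a multi-byte character is split across two chunks. It is better to read the data with InputStreamReader and set the encoding according to the web page.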
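The chunk-boundary problem can be reproduced offline with a small sketch. It assumes UTF-8 text and a deliberately tiny buffer so that a multi-byte character is split between two reads; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ChunkBoundaryDemo {
    // Decodes each chunk of bytes on its own; a character whose bytes
    // straddle two chunks is decoded incorrectly.
    static String decodePerChunk(byte[] data, int chunkSize) throws IOException {
        InputStream in = new ByteArrayInputStream(data);
        byte[] buf = new byte[chunkSize];
        StringBuilder sb = new StringBuilder();
        int len;
        while ((len = in.read(buf)) != -1) {
            sb.append(new String(buf, 0, len, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Lets InputStreamReader do the decoding; it keeps incomplete byte
    // sequences until the remaining bytes arrive.
    static String decodeWithReader(byte[] data, int chunkSize) throws IOException {
        Reader reader = new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
        char[] buf = new char[chunkSize];
        StringBuilder sb = new StringBuilder();
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String original = "\u7f51\u9875"; // two characters, three UTF-8 bytes each
        byte[] data = original.getBytes(StandardCharsets.UTF_8);
        // A 4-byte buffer splits the second character across two reads
        System.out.println(decodePerChunk(data, 4).equals(original));   // false
        System.out.println(decodeWithReader(data, 4).equals(original)); // true
    }
}
```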
public static void getHTMLResource(String htmlFile) throws IOException { // read the content of the page at the given URL
    try {
        RandomAccessFile random = new RandomAccessFile("/root/Desktop/11.txt", "rw");
        URL url = new URL(htmlFile); // create a URL object from the address
        InputStreamReader in = new InputStreamReader(url.openStream(), "gbk"); // set the encoding according to the web page
        char[] chars = new char[1024]; // char array that holds the characters read
        StringBuffer bf = new StringBuffer();
        int n; // number of characters read in each pass
        while ((n = in.read(chars)) != -1) { // read the data
            String part = new String(chars, 0, n); // use only the characters just read, so stale data is never reused
            random.writeChars(part); // write to the file
            bf.append(part);
        }
        System.out.println(bf.toString());
        random.close(); // release resources
        in.close();
    } catch (Exception e) {
    }
}
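Since the original question asked for console output only, here is a minimal sketch that drops the file entirely. It is one possible arrangement, not the answerers' code: the class name PageToConsole, the helpers readAll and fetch, and the command-line arguments are all hypothetical, and "GBK" is only a default taken from the encoding used in the answer above.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URI;

class PageToConsole {
    // Collects everything the reader produces into one string.
    static String readAll(Reader reader) throws IOException {
        char[] chars = new char[1024];
        StringBuilder sb = new StringBuilder();
        int n; // number of characters read in this pass
        while ((n = reader.read(chars)) != -1) {
            sb.append(chars, 0, n);
        }
        return sb.toString();
    }

    // Opens the address and decodes the response body with the given charset.
    static String fetch(String address, String charsetName) throws IOException {
        try (Reader reader = new InputStreamReader(
                URI.create(address).toURL().openStream(), charsetName)) {
            return readAll(reader);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java PageToConsole <url> [charset]");
            return;
        }
        // Assumed default; pass the page's real encoding as the second argument
        String charset = args.length > 1 ? args[1] : "GBK";
        System.out.println(fetch(args[0], charset));
    }
}
```

It could be run as java PageToConsole followed by the page address and, optionally, a charset name. The try-with-resources block closes the stream even if reading fails, and exceptions propagate to the caller instead of being swallowed by an empty catch.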