Frustrated: why is the I/O stream sometimes GB2312 and sometimes not?

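I debugged this many times and got the same result as you. I tried many URLs, and only sports.tom.com fails to display correctly.
Following this thread.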



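Yes, I noticed that too. Only TOM behaves this way; switching to sina makes the problem go away, which is strange. At first I thought the problem was in my program, but now I think it is on TOM's side, although TOM's page doesn't look broken either. I don't get it.
Maybe it is caused by an ActiveX control on the Tom page!
An ActiveX control? That doesn't make sense. We are only fetching the HTML.
Don't use Reader/Writer.
Use the stream directly.
Reader and Writer work in units of characters.
For reading a web page like this, it is best to read it in as a binary stream.
That makes sense!
url.openStream() returns an InputStream, which has a read() method for reading the bytes!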
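import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

public static String getHtml() {
    //BufferedReader reader = null;
    InputStream in = null;
    String line = null;
    String title = null;
    String time = null;
    String content = null;
    String temp = null;
    try {
        // URL is the original poster's String constant holding the page address (definition not shown)
        URL url = new URL(URL); // Create the URL
        URLConnection connection = url.openConnection();
        //reader = new BufferedReader(new InputStreamReader(connection.getInputStream(),"GB2312"));
        in = connection.getInputStream();
        byte[] bytes = new byte[1024];
        int len;
        // Read raw bytes until end of stream, without any character decoding
        while ((len = in.read(bytes)) != -1) {
            /*
             * I don't know what this code of yours is meant to do, so I'm leaving it aside for now!
            if (line.indexOf("") != -1) {
                title = line;
            }
            if (line.indexOf("") != -1) {
                time = line;
            }
            if (line.indexOf("") != -1) {
                content = line;
            }
            */
            // Copy the bytes to standard output unchanged
            System.out.write(bytes, 0, len);
        }
        //temp = title + time + content;

    } catch (Exception e) {
        System.err.println(e);
    } finally {
        // in is still null if opening the connection failed, so guard the close
        if (in != null) {
            try {
                in.close();
            } catch (IOException e1) {
                e1.printStackTrace();
            }
        }
    }
    // temp is never assigned in this version, so this returns null
    return temp;
}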
Changed it! Unfortunately, reading tom as bytes gives the same result: a pile of garbled text! Other sites all come out fine.
System.out.print(new String(bytes,0,len,"GBK"));
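System.out.print(new String(bytes,0,len,"GBK"));
No, that's not it. If you save the tom page and open it in UltraEdit, you can see <META http-equiv=Content-Type content="text/html; charset=gb2312">, so the encoding should be fine.
Fetching it with Delphi works without any problem, so it is probably still an issue with how Java handles the stream.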
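The charset declared in the page's META tag can also be read in code rather than by opening the saved file in an editor. Below is a minimal sketch, not code from this thread: the class name MetaCharsetSniffer, the method sniff, and the 1024-byte limit are made up for illustration, and the regex only handles the simple charset=... form shown above.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named for this example only.
class MetaCharsetSniffer {
    // Matches charset=NAME, with or without a quote before the name, ignoring case.
    private static final Pattern CHARSET =
        Pattern.compile("charset\\s*=\\s*[\"']?([A-Za-z0-9_\\-]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the declared charset name, or null if none appears in the first `limit` bytes. */
    static String sniff(byte[] page, int limit) {
        int n = Math.min(page.length, limit);
        // ISO-8859-1 maps every byte to exactly one char, so this never drops bytes.
        String head = new String(page, 0, n, StandardCharsets.ISO_8859_1);
        Matcher m = CHARSET.matcher(head);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        byte[] sample = "<META http-equiv=Content-Type content=\"text/html; charset=gb2312\">"
            .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(sniff(sample, 1024));
    }
}
```

The name it returns could then be passed to Charset.forName before decoding the page bytes.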
System.out.print(new String(bytes,0,len,"GBK"));
System.out.print(new String(bytes,0,len,"GB2312"));
Either of these two will do.
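One caveat with decoding each buffer on its own: GBK and GB2312 use two bytes per Chinese character, so a read() that ends between those two bytes leaves a broken character at the chunk boundary. Whether that is what goes wrong with tom is not settled in this thread. A rough sketch that avoids the issue by collecting all bytes first and decoding once; the class name WholePageDecoder and the method readAll are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

// Hypothetical helper, named for this example only.
class WholePageDecoder {
    /** Reads the stream to the end, then decodes all bytes in a single step. */
    static String readAll(InputStream in, Charset charset, int bufferSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[bufferSize];
        int len;
        while ((len = in.read(buf)) != -1) {
            out.write(buf, 0, len); // keep raw bytes; no decoding per chunk
        }
        return new String(out.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        Charset gbk = Charset.forName("GBK");
        String original = "\u4e2d\u6587\u7f51\u9875"; // four Chinese characters, 8 bytes in GBK
        byte[] page = original.getBytes(gbk);
        // A 3-byte buffer forces reads to end in the middle of a character.
        String text = readAll(new ByteArrayInputStream(page), gbk, 3);
        System.out.println(text.equals(original));
    }
}
```

For a real page, the ByteArrayInputStream would be replaced by connection.getInputStream() from the code above.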
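Write the binary stream to a file and you will see that there is no garbled text.
If I save it first, it is hard for me to parse it. Is there any problem with writing it the way I originally did? I'll just stop using TOM and switch to another site.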
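For anyone who wants to try that check, here is a minimal sketch of saving the raw bytes to a file with no decoding at all. The class name RawPageSaver and the temp-file target are assumptions for this example; the in-memory stream stands in for url.openStream().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, named for this example only.
class RawPageSaver {
    /** Copies the stream to the target file byte for byte and returns the byte count. */
    static long save(InputStream in, Path target) throws IOException {
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("page", ".html");
        // 0xD6 0xD0 is the GB2312 encoding of the character U+4E2D.
        byte[] fake = {(byte) 0xD6, (byte) 0xD0};
        // With a real page, pass url.openStream() instead of this in-memory stream.
        long written = save(new ByteArrayInputStream(fake), target);
        System.out.println(written + " bytes written to " + target);
    }
}
```

Opening the saved file in an editor that understands GB2312 then shows whether the bytes themselves are intact.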