String url = "http://roll.sohu.com/20111026/n323511012.shtml";
String str = getHttp(url);
System.out.println(str);

public String getHttp(String url) {
    try {
        URL u = new URL(url);
        HttpURLConnection http = (HttpURLConnection) u.openConnection();
        // Decode the response body as GBK, the charset the page appears to use.
        BufferedReader in = new BufferedReader(new InputStreamReader(http.getInputStream(), "gbk"));
        StringBuilder sb = new StringBuilder();
        String line;
        // Read the body line by line, normalizing line endings to "\n".
        while ((line = in.readLine()) != null) {
            sb.append(line).append("\n");
        }
        in.close();
        http.disconnect();
        return sb.toString();
    } catch (Exception ex) {
        Logger.getLogger(Http.class.getName()).log(Level.SEVERE, null, ex);
        return null;
    }
}

The page at http://roll.sohu.com/20111026/n323511012.shtml is clearly GBK, so why does the text I read come out garbled?
.getInputStream(), "utf-8"));
Try setting it to utf-8. The page's encoding may actually be utf-8.
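Rather than guessing the charset, it can sometimes be read from the response's Content-Type header. The following is only a rough sketch: the class name CharsetSniffer and the method fromContentType are made up for illustration, and it assumes the server actually sends a charset parameter (many pages declare it only in an HTML meta tag, in which case the fallback is used).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of the original code in this thread.
class CharsetSniffer {
    /**
     * Returns the charset named in a Content-Type header value such as
     * "text/html; charset=GBK", or the fallback if none is present or usable.
     */
    static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback; // header missing
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                // Strip the "charset=" prefix and any surrounding quotes.
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // unknown or malformed charset name
                }
            }
        }
        return fallback; // no charset parameter in the header
    }
}
```

In getHttp this could be called as CharsetSniffer.fromContentType(http.getContentType(), Charset.forName("GBK")), passing the result to the InputStreamReader in place of the hard-coded "gbk".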
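That doesn't work. If you view the source of http://roll.sohu.com/20111026/n323511012.shtml in a browser, it is clearly GBK, and reading it as UTF-8 gives garbled text just the same.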
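When both GBK and UTF-8 give garbled output, one thing worth checking is whether the response body is gzip-compressed. That is only a guess about this server and has not been confirmed in this thread. If it is compressed, the bytes have to be decompressed before they are decoded as text. A minimal sketch, where the class name BodyDecoder and the method decode are hypothetical:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of the original code in this thread.
class BodyDecoder {
    /**
     * Reads the whole stream and decodes it with the given charset.
     * If contentEncoding is "gzip" (the value of the Content-Encoding
     * response header), the bytes are decompressed first.
     */
    static String decode(InputStream raw, String contentEncoding, Charset charset) throws IOException {
        InputStream in = "gzip".equalsIgnoreCase(contentEncoding) ? new GZIPInputStream(raw) : raw;
        try (InputStream body = in) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            // Copy the (possibly decompressed) bytes into memory.
            while ((n = body.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            // Decode only after all bytes are read, so multi-byte characters are never split.
            return new String(buf.toByteArray(), charset);
        }
    }
}
```

In getHttp this could be used as BodyDecoder.decode(http.getInputStream(), http.getContentEncoding(), Charset.forName("GBK")). Printing http.getContentEncoding() first shows whether the server reports compression at all.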