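I originally wanted to scrape search results.
But occasionally the download returns only a string like:
<!-- gsp15.search.cnb.yahoo.com uncompressed/chunked Thu May 22 11:25:49 CST 2008 -->
and none of the actual page content. What is going on?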
Solutions »
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;

public static String getHtmlText(String strUrl, int timeout, String strEnCoding) {
    if (strUrl == null || strUrl.length() == 0) {
        return null;
    }
    StringBuilder strHtml = null;
    String strLine = "";
    HttpURLConnection httpConnection = null; // this can be declared as HttpURLConnection
    InputStream urlStream = null;
    BufferedInputStream buff = null;
    BufferedReader br = null;
    boolean isError = false;
    try {
        // Connect and fetch the page source.
        URL url = new URL(strUrl);
        httpConnection = (HttpURLConnection) url.openConnection();
        httpConnection.addRequestProperty("User-Agent", "IcewolfHttp/1.0");
        // Media types in an Accept header are separated by commas, not semicolons.
        httpConnection.addRequestProperty("Accept",
                "www/source, text/html, image/gif, */*");
        httpConnection.addRequestProperty("Accept-Language", "");
        httpConnection.setConnectTimeout(timeout);
        httpConnection.setReadTimeout(timeout);
        urlStream = httpConnection.getInputStream();
        buff = new BufferedInputStream(urlStream);
        Reader r = null;
        if (strEnCoding == null || "null".equals(strEnCoding)) {
            // No encoding given: fall back to the platform default charset.
            r = new InputStreamReader(buff);
        } else {
            try {
                r = new InputStreamReader(buff, strEnCoding);
            } catch (UnsupportedEncodingException e) {
                r = new InputStreamReader(buff);
            }
        }
        br = new BufferedReader(r);
        strHtml = new StringBuilder();
        while ((strLine = br.readLine()) != null) {
            strHtml.append(strLine).append("\r\n");
        }
    } catch (Exception e) {
        //e.printStackTrace();
        System.out.println(e.getClass() + ": failed to download page " + strUrl);
        isError = true;
    } finally {
        try {
            if (br != null)
                br.close();
            if (buff != null)
                buff.close();
            if (urlStream != null)
                urlStream.close();
        } catch (Exception e) {
            // Only log here: returning from finally would discard a page that was already read.
            System.out.println(e.getClass() + ": failed to close connection for page " + strUrl);
        }
    }
    if (strHtml == null || isError)
        return null;
    return strHtml.toString();
}
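
The thread does not establish why the server sometimes sends only the trailing comment, so the following is just a workaround sketch, not a confirmed fix. It assumes the short response is transient and that a retry may return the full page. `CommentOnlyCheck`, `isCommentOnly`, and `fetchWithRetry` are hypothetical names introduced for illustration, and the code uses Java 8 lambdas.

```java
import java.util.function.Supplier;

final class CommentOnlyCheck {

    /** Returns true if the body is null, blank, or contains nothing but HTML comments. */
    public static boolean isCommentOnly(String html) {
        if (html == null) {
            return true;
        }
        // (?s) lets '.' match line breaks; the reluctant '.*?' stops at the first "-->".
        String stripped = html.replaceAll("(?s)<!--.*?-->", "").trim();
        return stripped.isEmpty();
    }

    /**
     * Calls the fetcher up to maxAttempts times and returns the first result
     * that has real content, or null if every attempt was empty or comment-only.
     */
    public static String fetchWithRetry(Supplier<String> fetcher, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String html = fetcher.get();
            if (!isCommentOnly(html)) {
                return html;
            }
        }
        return null;
    }
}
```

It could be wrapped around the method above like this, with the URL, timeout, and encoding as placeholders: `String html = CommentOnlyCheck.fetchWithRetry(() -> getHtmlText(strUrl, 10000, "UTF-8"), 3);`. A pause between attempts may also be worth adding so the server is not hit repeatedly in a tight loop.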