It's not hard! Just write a program to fetch the page.
import java.io.*;
import java.net.*;

public class GetWebPage {
    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java GetWebPage hostname");
            return;
        }
        String host = args[0];
        InetAddress addr = InetAddress.getByName(host);
        Socket socket = new Socket(addr, 80);
        InputStream is = socket.getInputStream();
        OutputStream os = socket.getOutputStream();
        BufferedReader br = new BufferedReader(new InputStreamReader(is));
        PrintWriter pw = new PrintWriter(new OutputStreamWriter(os));
        // Minimal HTTP/1.0 request for the root page; no Host header is sent.
        pw.print("GET / HTTP/1.0\r\n\r\n");
        pw.flush();
        String line;
        while ((line = br.readLine()) != null) { // read until EOF
            System.out.println(line);
        }
        pw.close();
        br.close();
    }
}

After compiling, running
java GetWebPage java.sun.com
works fine, but sina always returns:
HTTP/1.0 403 Forbidden
Server: squid/2.5.STABLE4
Mime-Version: 1.0
Date: Fri, 30 Apr 2004 01:58:23 GMT
Content-Type: text/html
Content-Length: 1080
Expires: Fri, 30 Apr 2004 01:58:23 GMT
X-Squid-Error: ERR_ACCESS_DENIED 0
X-Cache: MISS from xa-179.sina.com.cn
Connection: close

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML><HEAD><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<TITLE>ERROR: The requested URL could not be retrieved</TITLE>
<STYLE type="text/css"><!--BODY{background-color:#ffffff;font-family:verdana,sans-serif}PRE{font-family:sans-serif}--></STYLE>
</HEAD><BODY>
<H1>ERROR</H1>
<H2>The requested URL could not be retrieved</H2>
<HR noshade size="1px">
<P>
While trying to retrieve the URL:
<A HREF="http://218.30.12.179/index.html">http://218.30.12.179/index.html</A>
<P>
The following error was encountered:
<UL>
<LI>
<STRONG>
Access Denied.
</STRONG>
<P>
Access control configuration prevents your request from
being allowed at this time. Please contact your service provider if
you feel this is incorrect.
</UL>
<P>Your cache administrator is <A HREF="mailto:webmaster">webmaster</A>.
<BR clear="all">
<HR noshade size="1px">
<ADDRESS>
Generated Fri, 30 Apr 2004 01:58:23 GMT by xa-179.sina.com.cn (squid/2.5.STABLE4)
</ADDRESS>
</BODY></HTML>
There seems to be some kind of restriction.
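
One possible cause, and this is only a guess from the output above: the Squid error page reports the requested URL as http://218.30.12.179/index.html, i.e. a bare IP address, which suggests the proxy never learned which site was being requested. GetWebPage sends a plain "GET / HTTP/1.0" with no Host header, and a front-end cache serving many virtual hosts may refuse such a request. A minimal sketch that adds the header is below; the class name GetWebPageWithHost and the helper buildRequest are made up for illustration, and it is not confirmed that this is what sina's ACL actually checks.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class GetWebPageWithHost {
    // Hypothetical helper: builds an HTTP/1.0 request that also carries a
    // Host header, so a name-based proxy can tell which site is wanted.
    static String buildRequest(String host, String path) {
        return "GET " + path + " HTTP/1.0\r\n"
             + "Host: " + host + "\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java GetWebPageWithHost hostname");
            return;
        }
        String host = args[0];
        // try-with-resources closes the socket (and its streams) on exit.
        try (Socket socket = new Socket(host, 80)) {
            OutputStream os = socket.getOutputStream();
            os.write(buildRequest(host, "/").getBytes(StandardCharsets.US_ASCII));
            os.flush();

            // ISO-8859-1 maps every byte to a char, so headers print cleanly;
            // the page body may use another charset (assumption: not decoded here).
            BufferedReader br = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1));
            String line;
            while ((line = br.readLine()) != null) { // read until EOF
                System.out.println(line);
            }
        }
    }
}
```

Run it as java GetWebPageWithHost www.sina.com.cn (the exact hostname is an assumption; the original post only says "sina"). If the 403 persists, the restriction is probably something else, such as an IP-based access rule on the cache.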