How do I write a program that extracts the elements of a web page and their link addresses? I think there is an example of this in The Art of Java.

Here is an example that gets all the links on a web page:
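// HTTPSocket, HTMLPage and Link come from the bot package, which must be on the classpath.
import java.util.ArrayList;
import java.util.Iterator;
import java.util.Vector;

// Returns the href of every link found on the page at the given URL.
public ArrayList getUrls(String url) throws Exception {
    HTTPSocket http = new HTTPSocket();
    HTMLPage page = new HTMLPage(http);
    page.open(url, null);
    Vector links = page.getLinks();
    Iterator it = links.iterator();
    ArrayList hrefs = new ArrayList();
    while (it.hasNext()) {
        Link link = (Link) it.next();
        String href = link.getHREF();
        System.out.println(href + " : " + link.getPrompt());
        hrefs.add(href); // collect each href so the returned list is not empty
    }
    return hrefs;
}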
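Note that this requires the bot package from the book <<JAVA机器人编程指南>>. Download: http://www.jeffheaton.com

Sorry, a few things got deleted from the example above, so it has some problems. The lines that matter are these: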
HTTPSocket http = new HTTPSocket();
HTMLPage page = new HTMLPage(http);
page.open(url, null);
Vector links = page.getLinks();
Iterator it = links.iterator();
while (it.hasNext()) {
    Link link = (Link) it.next();
    System.out.println(link.getHREF() + " : " + link.getPrompt());
}
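
If you cannot get the bot package, the same idea (walk over every link and print its address and text) can be roughly sketched with the JDK alone. This is not the bot package's API and not the example from the book: the class name LinkExtractor and the method extractLinks are made up for illustration, and it assumes the HTML is already in a String. It uses a regular expression, which is fine for simple pages but will miss unusual markup, so treat it as a starting point rather than a real HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the bot package.
final class LinkExtractor {

    // Matches <a ... href=VALUE ...>TEXT</a>, where VALUE may be
    // double-quoted (group 1), single-quoted (group 2) or unquoted (group 3).
    private static final Pattern ANCHOR = Pattern.compile(
        "<a\\b[^>]*?\\bhref\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))[^>]*>(.*?)</a>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns one {href, text} pair per anchor found in the HTML string.
    static List<String[]> extractLinks(String html) {
        List<String[]> links = new ArrayList<>();
        Matcher m = ANCHOR.matcher(html);
        while (m.find()) {
            String href = m.group(1) != null ? m.group(1)
                        : m.group(2) != null ? m.group(2)
                        : m.group(3);
            // Drop any tags nested inside the link text, e.g. <b>...</b>.
            String text = m.group(4).replaceAll("<[^>]*>", "").trim();
            links.add(new String[] { href, text });
        }
        return links;
    }

    public static void main(String[] args) {
        // Sample HTML made up for the demo.
        String html = "<p><a href=\"http://www.jeffheaton.com\">Jeff Heaton</a> "
                    + "<A HREF='docs/index.html'><b>Docs</b></A></p>";
        for (String[] link : extractLinks(html)) {
            System.out.println(link[0] + " : " + link[1]);
        }
    }
}
```

The hrefs come back exactly as written in the page, so relative addresses such as docs/index.html still have to be resolved against the page URL before you can fetch them.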