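I assume everyone knows the site www.51job.com.
I want to imitate a browser and search for all jobs whose full text contains the keyword "java".
51job offers a normal search and an advanced search. I used the normal search with the "full text" option, as shown in Figure 1. After clicking Search, the browser sends a POST request and 51job returns a response.
I captured both of them.
See the two images below (Figure 2 and Figure 3).
The part marked in red in the images is the returned response, which may not be relevant to your answer.
I am looking for Java code that imitates the browser and sends the request shown in Figure 2 and Figure 3. The technology could be HttpURLConnection,
Apache httpcomponents or httpclient, but any framework is fine as long as the Java code does what I need.
Below is the text from the images, so that you can copy it.
Response Headers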
Date Sat, 13 Sep 2008 15:51:43 GMT
Server Apache/1.3.37 (Unix)
Set-Cookie ord_list_field=0%7C1; expires=Mon, 23-Jul-18 15:51:43 GMT; path=/;
domain=.51job.com last_search=0000%7E%609%7E%6099%7E%6099%7E%600000%7E%600000%7E%6000%7E%6099%7E%
6099%7E%6099%7E%60java%7E%602%7E%601%7E%601221321103%40%230200%7E%609%7E%6099%7E%6099%7E%600000%
7E%600000%7E%6000%7E%6099%7E%6099%7E%6099%7E%60java%7E%602%7E%601%7E%601221320941; expires=Mon,
23-Jul-18 15:51:43 GMT; path=/; domain=.51job.com
Keep-Alive timeout=15, max=97
Connection Keep-Alive
Content-Type text/html
Cache-Control private
Content-Encoding gzip
Transfer-Encoding chunked
Request Headers
Host search.51job.com
User-Agent Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.1) Gecko/2008070208
Firefox/3.0.1
Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language zh-cn,zh;q=0.5
Accept-Encoding gzip,deflate
Accept-Charset gb2312,utf-8;q=0.7,*;q=0.7
Keep-Alive 300
Connection keep-alive
Referer http://www.51job.com/
Cookie guid=12212883571560640084; ord_list_field=0%7C1; last_search=0000%7E%609%7E%6099%7E%6099%
7E%600000%7E%600000%7E%6000%7E%6099%7E%6099%7E%6099%7E%60java%7E%602%7E%601%7E%601221321062%40%
230200%7E%609%7E%6099%7E%6099%7E%600000%7E%600000%7E%6000%7E%6099%7E%6099%7E%6099%7E%60java%7E%
602%7E%601%7E%601221320941; 51job=cenglish%3D0; nolife=fromdomain%3D
Cache-Control max-age=0
Post Parameters
image.x 32
image.y 14
jobarea
keyword java
keywordtype 2
stype 2
If the three images no longer load, try the links below:
url1: http://img129.imageshack.us/img129/5034/88081134gx3.jpg
url2: http://img178.imageshack.us/img178/176/84283260nn6.jpg
url3: http://img90.imageshack.us/img90/9711/44104996jv3.jpg
Solutions »
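import java.io.BufferedOutputStream;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// domain, tradeType, catalog and pnum are String variables defined elsewhere
StringBuilder result = new StringBuilder();
try {
    URL url = new URL("http://192.168.1.66:8080/regSite"); // will be changed to www.minma.com
    HttpURLConnection urlcon = (HttpURLConnection) url.openConnection();
    urlcon.setDoOutput(true);
    urlcon.setRequestMethod("POST");
    OutputStreamWriter out = new OutputStreamWriter(
            new BufferedOutputStream(urlcon.getOutputStream()), "UTF-8");
    /* Pass 5 parameters to regSite; values are URL-encoded so '&' or '=' cannot break the body */
    out.write("domain=" + URLEncoder.encode(domain, "UTF-8")
            + "&tradeType=" + URLEncoder.encode(tradeType, "UTF-8")
            + "&catalog=" + URLEncoder.encode(catalog, "UTF-8")
            + "&pnum=" + URLEncoder.encode(pnum, "UTF-8")
            + "&version=1.0");
    out.flush();
    out.close();
    /* Read the content of the returned page; the stream is closed only after reading */
    /* (the reader uses the platform default charset; pass one if the server's encoding is known) */
    BufferedReader br = new BufferedReader(new InputStreamReader(urlcon.getInputStream()));
    System.out.println("==================Begin====================");
    String s;
    while ((s = br.readLine()) != null) {
        result.append(s).append('\n');
    }
    br.close();
    System.out.println(result);
    System.out.println("===================End======================");
} catch (Exception e) {
    System.out.println("Network not connected.");
    e.printStackTrace();
}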
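Adapted to the 51job capture, the same approach might look like the sketch below. It is only a sketch under assumptions: the class name JobSearchPost and its helper methods are made up for illustration, and the form's action URL is not part of the pasted text (only the Host header search.51job.com is), so it has to be passed in as an argument. The form fields, User-Agent and Referer are copied from the capture. The cookies are not sent here, and the site may or may not require them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;

public class JobSearchPost {

    /** Builds an application/x-www-form-urlencoded body from the given fields. */
    public static String buildFormBody(Map<String, String> fields, String charset) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : fields.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), charset))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), charset));
            }
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalArgumentException("Unknown charset: " + charset, ex);
        }
        return sb.toString();
    }

    /** Form fields as they appear in the captured request (jobarea was empty there). */
    public static Map<String, String> searchFields(String keyword) {
        Map<String, String> f = new LinkedHashMap<>();
        f.put("image.x", "32");
        f.put("image.y", "14");
        f.put("jobarea", "");
        f.put("keyword", keyword);
        f.put("keywordtype", "2");
        f.put("stype", "2");
        return f;
    }

    /** Sends the POST and returns the page text. actionUrl must be supplied by the caller. */
    public static String post(String actionUrl, String body, String pageCharset) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(actionUrl).openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        con.setRequestProperty("User-Agent",
                "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1");
        con.setRequestProperty("Referer", "http://www.51job.com/");
        con.setRequestProperty("Accept-Encoding", "gzip");
        con.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        try (OutputStream out = con.getOutputStream()) {
            out.write(body.getBytes("US-ASCII")); // the encoded body is pure ASCII
        }
        InputStream in = con.getInputStream();
        // The captured response had Content-Encoding gzip, so unwrap it when present
        if ("gzip".equalsIgnoreCase(con.getContentEncoding())) {
            in = new GZIPInputStream(in);
        }
        StringBuilder page = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, pageCharset))) {
            String line;
            while ((line = br.readLine()) != null) {
                page.append(line).append('\n');
            }
        }
        return page.toString();
    }

    public static void main(String[] args) throws IOException {
        // UTF-8 is an assumption; for non-ASCII keywords use the charset the site expects
        String body = buildFormBody(searchFields("java"), "UTF-8");
        System.out.println(body);
        if (args.length > 0) {
            // args[0]: the search form's action URL, taken from your own capture
            System.out.println(post(args[0], body, "UTF-8"));
        }
    }
}
```

Run without arguments it only prints the encoded form body, which you can compare with the parameters in the capture. If the server rejects the request, adding the Cookie header from the capture with setRequestProperty would be the next thing to try.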
But I see that the cookie also contains a long string of characters:
Cookie guid=12212883571560640084; ord_list_field=0%7C1; last_search=0000%7E%609%7E%6099%7E%6099%
7E%600000%7E%600000%7E%6000%7E%6099%7E%6099%7E%6099%7E%60java%7E%602%7E%601%7E%601221321062%40%
230200%7E%609%7E%6099%7E%6099%7E%600000%7E%600000%7E%6000%7E%6099%7E%6099%7E%6099%7E%60java%7E%
602%7E%601%7E%601221320941; 51job=cenglish%3D0; nolife=fromdomain%3D
It contains many special characters prefixed with %. What encoding is this: UTF-8, GBK, or some HTTP-specific encoding? And how do I convert it back to normal characters?
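The %XX sequences look like ordinary percent-encoding (URL encoding), where each %XX is one byte written in hexadecimal. The JDK's java.net.URLDecoder can reverse it. Below is a minimal sketch; the class name CookieDecode is made up for illustration, and UTF-8 is an assumption that does not matter for the values above, because every escaped byte in them is plain ASCII. For escaped Chinese text you would need to pass the charset the site actually uses.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class CookieDecode {

    /** Decodes a percent-encoded value, e.g. a cookie value from the capture. */
    public static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("0%7C1"));               // prints 0|1
        System.out.println(decode("cenglish%3D0"));        // prints cenglish=0
        System.out.println(decode("0000%7E%609%7E%6099")); // prints 0000~`9~`99
    }
}
```

Note that URLDecoder also turns '+' into a space, which is the form-encoding convention. The cookie values above contain no '+', so that does not affect them.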