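I assume everyone knows the site www.51job.com.
I want to imitate a browser and search for every job posting whose full text contains the keyword "java".
51job offers a basic search and an advanced search. I used the basic search with the scope set to "full text", as shown in Figure 1.
[Figure 1]
After I click Search, the browser automatically sends a POST request and 51job returns a response.
All of this was captured by yours truly, haha,
as shown in the two figures below (Figures 2 and 3).
[Figure 2] [Figure 3]
The part marked in red in the figures is the returned response, which may not be useful for your answer.
I'm looking for a piece of Java code that can imitate the browser and send the request shown in Figures 2 and 3. The relevant technology is probably HttpURLConnection,
Apache HttpComponents or HttpClient, but as long as your Java code does what I need, any framework is fine.
Below is the bulk of the text from the figures, for you to copy.

Response Headers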
Date Sat, 13 Sep 2008 15:51:43 GMT
Server Apache/1.3.37 (Unix)
Set-Cookie ord_list_field=0%7C1; expires=Mon, 23-Jul-18 15:51:43 GMT; path=/;
domain=.51job.com last_search=0000%7E%609%7E%6099%7E%6099%7E%600000%7E%600000%7E%6000%7E%6099%7E%
6099%7E%6099%7E%60java%7E%602%7E%601%7E%601221321103%40%230200%7E%609%7E%6099%7E%6099%7E%600000%
7E%600000%7E%6000%7E%6099%7E%6099%7E%6099%7E%60java%7E%602%7E%601%7E%601221320941; expires=Mon, 
23-Jul-18 15:51:43 GMT; path=/; domain=.51job.com
Keep-Alive timeout=15, max=97
Connection Keep-Alive
Content-Type text/html
Cache-Control private
Content-Encoding gzip
Transfer-Encoding chunked

Request Headers
Host search.51job.com
User-Agent Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.1) Gecko/2008070208 
Firefox/3.0.1
Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language zh-cn,zh;q=0.5
Accept-Encoding gzip,deflate
Accept-Charset gb2312,utf-8;q=0.7,*;q=0.7
Keep-Alive 300
Connection keep-alive
Referer http://www.51job.com/
Cookie guid=12212883571560640084; ord_list_field=0%7C1; last_search=0000%7E%609%7E%6099%7E%6099%
7E%600000%7E%600000%7E%6000%7E%6099%7E%6099%7E%6099%7E%60java%7E%602%7E%601%7E%601221321062%40%
230200%7E%609%7E%6099%7E%6099%7E%600000%7E%600000%7E%6000%7E%6099%7E%6099%7E%6099%7E%60java%7E%
602%7E%601%7E%601221320941; 51job=cenglish%3D0; nolife=fromdomain%3D
Cache-Control max-age=0

POST parameters
image.x 32
image.y 14
jobarea
keyword java
keywordtype 2
stype 2

If the three images no longer load, try the links below:
url1: http://img129.imageshack.us/img129/5034/88081134gx3.jpg
url2: http://img178.imageshack.us/img178/176/84283260nn6.jpg
url3: http://img90.imageshack.us/img90/9711/44104996jv3.jpg

Solutions »

  1.   

    import java.io.*;
    import java.net.*;

    /* Connect to http://192.168.1.66:8080/regSite and pass parameters to it.
       The statements below go inside a method; domain, tradeType, catalog and
       pnum are Strings supplied by the caller. */
    String result = ""; // declared here so the snippet is complete
    try {
        URL url = new URL("http://192.168.1.66:8080/regSite"); // will be changed to www.minma.com
        HttpURLConnection urlcon = (HttpURLConnection) url.openConnection();
        urlcon.setDoOutput(true);
        urlcon.setRequestMethod("POST");
        OutputStream buf = new BufferedOutputStream(urlcon.getOutputStream());
        OutputStreamWriter out = new OutputStreamWriter(buf, "UTF-8");
        /* Pass 5 parameters to regSite (URL-encode the values if they can contain special characters) */
        out.write("domain=" + domain + "&tradeType=" + tradeType + "&catalog=" + catalog + "&pnum=" + pnum + "&version=1.0");
        out.flush();
        out.close();

        /* Read the content of the returned page; the stream must stay open until reading is done */
        InputStream in = urlcon.getInputStream();
        BufferedReader br = new BufferedReader(new InputStreamReader(in));
        System.out.println("==================Begin====================");
        String s = null;
        while ((s = br.readLine()) != null) {
            result += s;
        }
        br.close(); // also closes the underlying input stream
        System.out.println(result);
        System.out.println("===================End======================");
    } catch (Exception e) {
        System.out.println("Network not connected.");
        e.printStackTrace(); // println(e.getStackTrace()) would only print an array reference
    }
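One detail the code above does not cover: the captured response carries Content-Encoding gzip, and HttpURLConnection does not decompress a body for you. If you copy the browser's Accept-Encoding header, a helper along the following lines could unwrap the body before it is read as text. This is only a sketch: GzipBodyReader, readBody and gzip are hypothetical names, and the page charset is an assumption (the capture shows Content-Type text/html with no charset; the demo uses UTF-8 on in-memory data so that it needs no network).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipBodyReader {

    /** Reads the whole body, unwrapping gzip when contentEncoding is "gzip". */
    public static String readBody(InputStream raw, String contentEncoding, Charset charset) {
        try (InputStream in = "gzip".equalsIgnoreCase(contentEncoding)
                ? new GZIPInputStream(raw) : raw) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper only: gzip-compresses text so that main() needs no network. */
    public static byte[] gzip(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(charset));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Round trip on in-memory data; UTF-8 is used only for this demo.
        byte[] packed = gzip("<html>java</html>", StandardCharsets.UTF_8);
        System.out.println(readBody(new ByteArrayInputStream(packed), "gzip", StandardCharsets.UTF_8));
    }
}
```

With a live connection the call would look like readBody(urlcon.getInputStream(), urlcon.getContentEncoding(), charset). If you do not send Accept-Encoding at all, the server will normally reply uncompressed and this step is unnecessary.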
      

  2.   

    This can be done with the HttpClient class; take a look at the relevant API.
      

  3.   

    HttpClient (Apache Commons) or HttpURLConnection (JDK)
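To make the JDK option concrete for the request captured in the question, here is a rough sketch using HttpURLConnection. Several things in it are assumptions: the class and method names (SearchPostSketch, buildFormBody, capturedFields, post) are hypothetical, jobarea is sent empty because the capture shows no value for it, the Content-Type is the standard form type (it is not visible in the posted headers), and the form's action URL is not in the posted text, so post() takes it as a parameter and main() never calls it.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class SearchPostSketch {

    /** Encodes name/value pairs as application/x-www-form-urlencoded. */
    public static String buildFormBody(Map<String, String> fields, String charset) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> f : fields.entrySet()) {
                if (body.length() > 0) {
                    body.append('&');
                }
                body.append(URLEncoder.encode(f.getKey(), charset))
                    .append('=')
                    .append(URLEncoder.encode(f.getValue(), charset));
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
        return body.toString();
    }

    /** The form fields listed in the capture, in the same order. */
    public static Map<String, String> capturedFields() {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("image.x", "32");
        fields.put("image.y", "14");
        fields.put("jobarea", ""); // assumption: shown without a value in the capture
        fields.put("keyword", "java");
        fields.put("keywordtype", "2");
        fields.put("stype", "2");
        return fields;
    }

    /** Sends the POST. actionUrl must be taken from the search form; it is not in the posted text. */
    public static HttpURLConnection post(String actionUrl, String cookie, String body) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(actionUrl).openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        // Headers copied from the capture; Host and Connection are left to HttpURLConnection.
        con.setRequestProperty("User-Agent",
                "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1");
        con.setRequestProperty("Referer", "http://www.51job.com/");
        con.setRequestProperty("Content-Type", "application/x-www-form-urlencoded"); // assumption
        if (cookie != null) {
            con.setRequestProperty("Cookie", cookie);
        }
        try (OutputStream out = con.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.US_ASCII)); // the encoded body is pure ASCII
        }
        return con;
    }

    public static void main(String[] args) {
        // Only builds and prints the body; no request is sent.
        System.out.println(buildFormBody(capturedFields(), "UTF-8"));
    }
}
```

For an ASCII keyword such as java the charset passed to buildFormBody makes no difference. For a Chinese keyword it would have to match what the site expects, which the capture does not state.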
      

  4.   

    The answer in reply 1 is actually quite good already: on the server side, request.getParameter("keyword") can receive content such as keyword=java.
    But I see that the cookie also contains a long string of characters: Cookie guid=12212883571560640084; ord_list_field=0%7C1; last_search=0000%7E%609%7E%6099%7E%6099% 
    7E%600000%7E%600000%7E%6000%7E%6099%7E%6099%7E%6099%7E%60java%7E%602%7E%601%7E%601221321062%40% 
    230200%7E%609%7E%6099%7E%6099%7E%600000%7E%600000%7E%6000%7E%6099%7E%6099%7E%6099%7E%60java%7E% 
    602%7E%601%7E%601221320941; 51job=cenglish%3D0; nolife=fromdomain%3D
    It contains many special characters prefixed with %. What encoding is this: UTF-8? GBK? Or some HTTP-specific encoding? And how do I convert it back into normal characters?
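On the % characters: this looks like ordinary URL percent-encoding, where each %XX is one byte written in hexadecimal. The escapes in this cookie (%7C, %7E, %60, %40, %23, %3D) are all ASCII bytes, so the choice between UTF-8 and GBK does not change the result here. A minimal sketch of decoding it with java.net.URLDecoder follows; CookieValueDecoder and decode are hypothetical names, and the reading of ~` and @# as separators is a guess from the decoded output, not something the site documents.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class CookieValueDecoder {

    /** Turns %XX escapes back into characters (note: URLDecoder also maps '+' to a space). */
    public static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("0%7C1"));                      // 0|1
        System.out.println(decode("cenglish%3D0"));               // cenglish=0
        System.out.println(decode("java%7E%602%40%230200"));      // java~`2@#0200
    }
}
```

When replaying the request you do not need to decode anything: the Cookie header can be sent back exactly as captured. Decoding is only useful for seeing what the site stored.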