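I've only recently started learning Java. For a current task I often need to fetch a web page's source from my program, but the source I get back is frequently garbled. What is going on?
Here is the code I use to fetch the page source from Google:
import java.io.BufferedReader;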
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test2 {
    public static void main(String[] args) throws IOException {
        HttpURLConnection httpurlconnection = null;
        String googlesite = "http://www.google.cn/search?hl=zh-CN&newwindow=1&client=aff-maxthon&hs=7tK&channel=channel4&source=hp&q=%E4%BC%98%E9%85%B7&btnG=Google+%E6%90%9C%E7%B4%A2&aq=f&oq=";
        String line = null;
        URL url = new URL(googlesite);
        httpurlconnection = (HttpURLConnection) url.openConnection();
        httpurlconnection.setConnectTimeout(60000);
        httpurlconnection.setReadTimeout(60000);
        httpurlconnection.setDoOutput(false);
        httpurlconnection.setDoInput(true);
        httpurlconnection.setRequestMethod("GET");
        httpurlconnection.setRequestProperty("x-requested-with",
                "XMLHttpRequest");
        httpurlconnection.setRequestProperty("Accept-Language", "zh-cn");
        httpurlconnection.setRequestProperty("Referer",
                "http://www.google.cn/search?hl=zh-CN&newwindow=1&client=aff-maxthon&hs=7tK&channel=channel4&source=hp&q=%E4%BC%98%E9%85%B7&btnG=Google+%E6%90%9C%E7%B4%A2&aq=f&oq=");
        httpurlconnection.setRequestProperty("Accept",
                "image/jpeg, application/x-ms-application, image/gif, application/xaml+xml, image/pjpeg, application/x-ms-xbap, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*");
        httpurlconnection.setRequestProperty("Content-Type",
                "application/x-www-form-urlencoded");
        httpurlconnection.setRequestProperty("Accept-Encoding",
                "gzip, deflate");
        httpurlconnection.setRequestProperty("User-Agent",
                "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Mathon 2.0)");
        httpurlconnection.setRequestProperty("Host", "www.google.cn");
        httpurlconnection.setRequestProperty("Connection", "Keep-Alive");
        httpurlconnection.connect();
        BufferedReader brr = new BufferedReader(new InputStreamReader(httpurlconnection.getInputStream(), "utf-8"));
        while ((line = brr.readLine()) != null) {
            System.out.println(line);
        }
        httpurlconnection.disconnect();
    }
}
Solutions
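The output is garbled because your request asks for a gzip-encoded response, so the bytes you read are compressed data rather than text. There are two ways to fix it:
1. Comment out this line:
// httpurlconnection.setRequestProperty("Accept-Encoding", "gzip, deflate");
2. Or decompress the response before reading it.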
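Garbled output usually has one of two causes. One is a character-encoding mismatch.
The other is that the transferred data is compressed. You can tell which it is from the HTTP response headers: Content-Type carries the charset, and Content-Encoding says whether the body is compressed.
Here it is probably compression, because your request headers tell the server that your program accepts gzip-compressed data.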
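// needs: import java.util.zip.GZIPInputStream;
// Decompress first, then decode the bytes with an explicit charset instead of the platform default.
BufferedReader brr = new BufferedReader(new InputStreamReader(new GZIPInputStream(httpurlconnection.getInputStream()), "utf-8"));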
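To tell the two cases apart as suggested above, you could look at the response headers before reading the body. The following is only a rough sketch: the class ResponseDiagnosis and its methods isGzip and charsetOf are made-up names for illustration, and the header values in main are sample strings. In the original program they would come from httpurlconnection.getContentEncoding() and httpurlconnection.getContentType().

```java
import java.util.Locale;

// Hypothetical helper (not part of the original post): decides from header
// values whether the body is compressed and which charset it declares.
class ResponseDiagnosis {

    /** True if the Content-Encoding header value indicates a gzip-compressed body. */
    static boolean isGzip(String contentEncoding) {
        return contentEncoding != null
                && contentEncoding.trim().toLowerCase(Locale.ROOT).contains("gzip");
    }

    /** Returns the charset parameter of a Content-Type value, or fallback if there is none. */
    static String charsetOf(String contentType, String fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // The value may be quoted, e.g. charset="UTF-8".
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? fallback : value;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // Sample header values; a real program would read them from the connection.
        System.out.println("compressed: " + isGzip("gzip"));
        System.out.println("charset: " + charsetOf("text/html; charset=GB2312", "utf-8"));
    }
}
```

If isGzip returns true, the stream needs to be decompressed; if it returns false and the text is still garbled, the charset passed to InputStreamReader is the more likely culprit.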
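gzip is a data compression format. I don't think you need most of the other setRequestProperty calls; setting more of them just makes the request look more like a real browser. You can get by with only these two:
httpurlconnection.setRequestProperty("User-Agent","Mozilla/5.0"); // present a browser-like User-Agent
httpurlconnection.setRequestProperty("Accept-Encoding","gzip,deflate"); // accept compressed responses; the body must then be decompressed before reading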
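If you keep the Accept-Encoding header, the body has to be decompressed according to what the server actually sent back. Here is a minimal sketch of that step. The class BodyDecoder and its methods unwrap, readAll and gzip are hypothetical names, and main uses an in-memory gzip round trip in place of a real connection. It also assumes that "deflate" means zlib-wrapped data, which not every server follows.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper (not part of the original post).
class BodyDecoder {

    /** Wraps the raw stream in a decompressor chosen from the Content-Encoding value. */
    static InputStream unwrap(InputStream raw, String contentEncoding) throws IOException {
        if (contentEncoding == null) {
            return raw; // no Content-Encoding header: body is not compressed
        }
        String enc = contentEncoding.trim().toLowerCase(Locale.ROOT);
        if (enc.equals("gzip") || enc.equals("x-gzip")) {
            return new GZIPInputStream(raw);
        }
        if (enc.equals("deflate")) {
            return new InflaterInputStream(raw); // assumes zlib-wrapped deflate
        }
        return raw;
    }

    /** Reads the whole body, decompressing if needed, and decodes it with the given charset. */
    static String readAll(InputStream raw, String contentEncoding, String charset) throws IOException {
        try (InputStream in = unwrap(raw, contentEncoding)) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            // Decode once at the end so multi-byte characters are never split.
            return buf.toString(charset);
        }
    }

    /** Test helper: gzip-compresses a string so the sketch runs without a network. */
    static byte[] gzip(String text, String charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(charset));
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip("\u4f18\u9177 search results", "utf-8");
        // With a real connection, pass httpurlconnection.getInputStream() and
        // httpurlconnection.getContentEncoding() here instead.
        String text = readAll(new ByteArrayInputStream(compressed), "gzip", "utf-8");
        System.out.println(text);
    }
}
```

Choosing the decompressor from the response header rather than always wrapping in GZIPInputStream also covers the case where the server ignores Accept-Encoding and replies uncompressed.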