My JVM's default encoding is UTF-8. I am requesting http://tech.qq.com/web/it/telerss.xml over a raw socket (cpdetector reports the feed as GB2312). Here is my code:
try {
URL u=new URL("http://tech.qq.com/web/it/telerss.xml");
String host=u.getHost();
String path=u.getPath();
String param=u.getQuery();
try {
Socket s=new Socket(host,80);
s.setKeepAlive(true); // long-lived connection
OutputStream os=s.getOutputStream();
PrintWriter osOut=new PrintWriter(os,true);
//InputStream osIn=s.getInputStream();
BufferedReader osIn=new BufferedReader(
new InputStreamReader(s.getInputStream(),Charset.forName("GB2312")));
osOut.println("GET "+path+"?"+param+" HTTP/1.1");
osOut.println("Host: "+host+"");
osOut.println("User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.76 Safari/537.36");
osOut.println("Accept-Charset: iso-8859-1, unicode-1-1;q=0.8");
osOut.println("Accept-Language: en-us,en;q=0.8");
osOut.println("Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5");
osOut.println("Accept-Encoding: gzip,deflate");
osOut.println("Connection: Close");
osOut.println();
//get response
StringBuilder sb=new StringBuilder(); int c;
while((c=osIn.read())!=-1){
sb.append((char)c);
}
s.close();
System.out.println(sb.toString());
} catch (UnknownHostException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
} catch (MalformedURLException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
Can someone point out at which step the text gets garbled, and how to fix it?
(1)
osOut.println("Accept-Charset: iso-8859-1, unicode-1-1;q=0.8");
(2)
One more thing:
XML is normally UTF-8 encoded, so decoding it as GB2312 may well be the problem.
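To check which of the two encodings the feed really uses, one option is to look at the encoding declared in the XML prolog before decoding anything. The following is only a sketch: the class name XmlEncodingSniffer and its methods are made up for illustration, and it assumes you already hold the raw, uncompressed body bytes. It also ignores byte order marks and UTF-16 input.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original post.
final class XmlEncodingSniffer {
    // Matches encoding="..." or encoding='...' inside the XML declaration.
    private static final Pattern ENCODING = Pattern.compile(
            "<\\?xml[^>]*\\bencoding\\s*=\\s*[\"']([A-Za-z0-9][A-Za-z0-9._-]*)[\"']");

    /** Returns the charset named in the XML declaration, or UTF-8 if none is declared. */
    static Charset sniff(byte[] xmlBytes) {
        // The declaration is plain ASCII in both UTF-8 and GB2312,
        // so ISO-8859-1 is a safe way to peek at the first bytes.
        int n = Math.min(xmlBytes.length, 200);
        String head = new String(xmlBytes, 0, n, StandardCharsets.ISO_8859_1);
        Matcher m = ENCODING.matcher(head);
        if (m.find() && Charset.isSupported(m.group(1))) {
            return Charset.forName(m.group(1));
        }
        return StandardCharsets.UTF_8;
    }

    /** Decodes the whole document with the sniffed charset. */
    static String decode(byte[] xmlBytes) {
        return new String(xmlBytes, sniff(xmlBytes));
    }
}
```

If the declaration says gb2312, then GB2312 is the right charset for the InputStreamReader and the garbling comes from somewhere else.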
(3)
Read the response byte by byte, and try all three of these approaches.
In osOut.println("Accept-Charset: iso-8859-1, unicode-1-1;q=0.8"); try changing iso-8859-1 to gb2312.
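Reading by bytes could look roughly like the sketch below: collect the whole response as bytes, split the headers from the body at the first blank line, and only then decode the body. The class RawHttpResponse and its members are hypothetical names. The sketch assumes the full response is already in a byte array and that the body is neither compressed nor sent with chunked transfer encoding, which an HTTP/1.1 server is allowed to use.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original post.
final class RawHttpResponse {
    private static final Pattern CHARSET = Pattern.compile(
            "^content-type:.*?charset\\s*=\\s*\"?([A-Za-z0-9][A-Za-z0-9._-]*)",
            Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

    final String headers; // status line plus header lines
    final byte[] body;    // undecoded body bytes

    private RawHttpResponse(String headers, byte[] body) {
        this.headers = headers;
        this.body = body;
    }

    /** Splits a complete raw response at the first CRLF CRLF. */
    static RawHttpResponse parse(byte[] raw) {
        for (int i = 0; i + 3 < raw.length; i++) {
            if (raw[i] == '\r' && raw[i + 1] == '\n'
                    && raw[i + 2] == '\r' && raw[i + 3] == '\n') {
                // Header bytes are ASCII, so ISO-8859-1 decodes them losslessly.
                String h = new String(raw, 0, i, StandardCharsets.ISO_8859_1);
                return new RawHttpResponse(h, Arrays.copyOfRange(raw, i + 4, raw.length));
            }
        }
        throw new IllegalArgumentException("no header/body separator found");
    }

    /** Charset from the Content-Type header, or the fallback if absent or unsupported. */
    Charset charsetOr(Charset fallback) {
        Matcher m = CHARSET.matcher(headers);
        if (m.find() && Charset.isSupported(m.group(1))) {
            return Charset.forName(m.group(1));
        }
        return fallback;
    }

    String bodyAsText(Charset fallback) {
        return new String(body, charsetOr(fallback));
    }
}
```

Because the request sends Connection: Close, the bytes can be gathered by reading s.getInputStream() until it returns -1 (on Java 9 and later, readAllBytes() does this) and then passed to parse. Printing the headers field first also shows what Content-Type and Content-Encoding the server actually sent.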
Try replacing
while((c=osIn.read())!=-1){
sb.append((char)c);
}
with
String line;
while((line = osIn.readLine()) != null){
sb.append(line);
}
That doesn't work either.
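Since none of the charset changes helped, one thing worth checking (nobody in this thread has confirmed it) is the Accept-Encoding: gzip,deflate header in the request. It allows the server to send a compressed body, and compressed bytes pushed through an InputStreamReader will look garbled whatever charset is chosen. The simplest test is to drop that header. Alternatively, the body can be decompressed before decoding, as in the sketch below. GzipBodyDecoder is a hypothetical name, and the body bytes are assumed to be already separated from the headers and de-chunked.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of the original post.
final class GzipBodyDecoder {
    /** True if the bytes start with the gzip magic number 0x1f 0x8b. */
    static boolean looksGzipped(byte[] body) {
        return body.length >= 2
                && (body[0] & 0xff) == 0x1f
                && (body[1] & 0xff) == 0x8b;
    }

    /** Gunzips the body if needed, then decodes it with the given charset. */
    static String decode(byte[] body, Charset charset) {
        if (!looksGzipped(body)) {
            return new String(body, charset);
        }
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(body));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the server's response headers contain Content-Encoding: gzip, calling decode(body, Charset.forName("GB2312")) on the raw body bytes is the step that was missing from the original loop.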