When I access a site such as www.baidu.com through a raw socket, the response comes back as garbled characters. The code is as follows:
// webAddress = www.baidu.com
public void linkButClick() throws Exception, IOException{
    String webAddress = webAddressFie.getText().trim();
    Socket socket = new Socket(webAddress, 80);
    StringBuffer sb = new StringBuffer("GET "+"/"+" HTTP/1.1\r\n");
    sb.append("Host: "+webAddress+"\r\n");
    sb.append("Accept: */*\r\n");
    sb.append("Accept-Language: zh-cn\r\n");
    sb.append("Accept-Encoding: gzip,deflate\r\n");
    sb.append("User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.8)\r\n");
    sb.append("Connection: keep-alive\r\n\r\n");
    System.out.println(sb.toString());
    OutputStream socketOut = socket.getOutputStream();
    socketOut.write(sb.toString().getBytes());
    socket.shutdownOutput();
    InputStream socketIn = socket.getInputStream();
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    byte[] buff = new byte[1024];
    int len = -1;
    while((len = socketIn.read(buff)) != -1){
        buffer.write(buff, 0, len);
    }
    System.out.println(new String(buffer.toByteArray()));
}
The output is as follows:
GET / HTTP/1.1 // the request string that was sent
Host: www.baidu.com
Accept: */*
Accept-Language: zh-cn
Accept-Encoding: gzip,deflate
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.8)
Connection: keep-alive
HTTP/1.1 200 OK // the response string
Date: Sun, 20 Sep 2009 07:17:49 GMT
Server: BWS/1.0
Content-Length: 1745
Content-Type: text/html
Cache-Control: private
Expires: Sun, 20 Sep 2009 07:17:49 GMT
Content-Encoding: gzip
Set-Cookie: BAIDUID=E9DF2B0254DB765B944F62650997DCBA:FG=1; expires=Sun, 20-Sep-39 07:17:49 GMT; path=/; domain=.baidu.com
P3P: CP=" OTI DSP COR IVA OUR IND COM "
// everything below is garbled
pn]鞥]赝呚+縘y龅鐃?# g?N
<漸鮩筟U炸]涇桍.泊佹党唍?紳?pn]鞥]赝呚+縘y龅鐃?# g?N
<漸鮩筟U炸]涇桍.泊佹党唍?紳?pn]鞥]赝呚+縘y龅鐃?# g?N
<漸鮩筟U炸]涇桍.泊佹党唍?紳?篲?虹梑蒺篲?虹梑蒺篲?虹梑蒺篲?虹梑蒺
..............
Could anyone point me in the right direction? I couldn't find a solution online. Thanks!
How about this?
String str = buffer.toString("gbk");
System.out.println(str);
If GBK doesn't work, try UTF-8.
Why are the response headers (
HTTP/1.1 200 OK
Connection: close
Date: Sun, 20 Sep 2009 09:30:55 GMT
Server: Microsoft-IIS/6.0
…………
) not garbled, while the page source is?
String test = new String(buffer.toByteArray());
System.out.println(new String( test.getBytes("utf-8"),"GBK"));
I tried this as well, and it doesn't work either.
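Compression speeds up browsing. The request sends Accept-Encoding: gzip,deflate, so the server returns the page gzip-compressed (note Content-Encoding: gzip in the response), and gzip-compressed page source printed directly as text shows up as garbage. No charset conversion can fix that, because the body bytes are not text yet. The fix:
sb.append("Accept-Encoding: \r\n"); // the response is no longer compressed, so the garbling goes away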
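If you would rather keep asking for gzip, the other option is to decompress the body yourself before turning it into a String. Below is a minimal sketch of that idea, not a full HTTP client. The class name GzipBodyDemo and its helper methods are made up for illustration. It assumes the whole response has already been read into a byte array (as the buffer in the question does), that the body is not sent with Transfer-Encoding: chunked, and that you know the page's charset. The main method builds a fake response in memory instead of contacting a server.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
class GzipBodyDemo {

    // Index of the first body byte, just past the blank line (\r\n\r\n); -1 if not found.
    static int bodyStart(byte[] response) {
        for (int i = 0; i + 3 < response.length; i++) {
            if (response[i] == '\r' && response[i + 1] == '\n'
                    && response[i + 2] == '\r' && response[i + 3] == '\n') {
                return i + 4;
            }
        }
        return -1;
    }

    // Decompress a gzip-compressed byte array.
    static byte[] gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    // Compress a byte array with gzip; only used here to fake a server response.
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        }
        return out.toByteArray();
    }

    // Split headers from body, gunzip the body if the headers say so, then decode it.
    static String decodeBody(byte[] response, String charsetName) throws IOException {
        int start = bodyStart(response);
        if (start < 0) {
            throw new IOException("no header/body separator found");
        }
        // Headers are plain ASCII, so a single-byte charset is safe for them.
        String headers = new String(response, 0, start, StandardCharsets.ISO_8859_1);
        byte[] body = Arrays.copyOfRange(response, start, response.length);
        if (headers.toLowerCase(Locale.ROOT).contains("content-encoding: gzip")) {
            body = gunzip(body);
        }
        return new String(body, charsetName);
    }

    public static void main(String[] args) throws IOException {
        // Fake response: ASCII headers followed by a gzip-compressed body.
        byte[] head = "HTTP/1.1 200 OK\r\nContent-Encoding: gzip\r\n\r\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        byte[] body = gzip("<html>hello</html>".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream response = new ByteArrayOutputStream();
        response.write(head);
        response.write(body);
        System.out.println(decodeBody(response.toByteArray(), "UTF-8"));
    }
}
```

In the code from the question, the call would be something like decodeBody(buffer.toByteArray(), "gbk"), with the charset being a guess unless the server states it. Note also that with Connection: keep-alive the read loop may block until the server closes the connection, so sending Connection: close is probably the safer choice for a one-off request.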