I access a Servlet directly from the browser. The Servlet code I wrote is as follows:
protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
//request.setCharacterEncoding("utf-8");
PrintWriter out = response.getWriter();
out.print("呵呵");
String name = request.getParameter("uname");
System.out.println(name);
out.print(name);
}
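I type http://localhost:8080/servlet/Myservlet?uname=长 into the address bar.
The result is as follows:
The browser page shows only: %E5%91%B5%E5%91%B5长, meaning "呵呵" was not displayed correctly and came out garbled, while "长" was displayed correctly.
Console output: é??, meaning nothing there was displayed correctly. Another interesting thing: after submitting, the address bar changes to http://localhost:8080/servlet/Myservlet?uname=%E9%95%BF
I then replaced "呵呵" with "%E9%95%BF"; the result was not "长长" but "%E9%95%BF长".
Could an expert please explain this behavior?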
http://linyinan02.iteye.com/blog/403171
You can test it like this:
http://localhost:8080/servlet/Myservlet?uname=%E9%95%BF
response.setContentType("text/html;charset=UTF-8");
PrintWriter out = response.getWriter();
out.print("呵呵");
String name = request.getParameter("uname");
System.out.println(new String(name.getBytes("iso-8859-1"),"UTF-8"));
out.print(new String(name.getBytes("iso-8859-1"),"UTF-8"));
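"%E9%95%BF" is itself plain ASCII text, so your page simply received it and printed it as-is.
Finally, the console: the wrong value there shows that you did not constrain the encoding used by println.
Encoding problems like this come up often. Don't be afraid of getting it wrong, but it is still best to use one encoding everywhere to reduce this kind of trouble.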
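The `new String(name.getBytes("iso-8859-1"), "UTF-8")` trick used above can be tried outside a servlet container. The following is a minimal sketch, assuming the container decoded the UTF-8 bytes of the parameter as ISO-8859-1; the class name `MojibakeRepair` and the method `repair` are hypothetical names made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original thread.
final class MojibakeRepair {

    // Undo a wrong ISO-8859-1 decode: recover the raw bytes, then read them as UTF-8.
    public static String repair(String misdecoded) {
        byte[] rawBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u957F"; // the character from the question's uname parameter

        // Assumption: simulate the container reading UTF-8 bytes E9 95 BF as ISO-8859-1.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        System.out.println(misdecoded.length());                  // 3 (one char per byte)
        System.out.println(repair(misdecoded).equals(original));  // true
    }
}
```

This works only because ISO-8859-1 maps every byte value to exactly one character, so no byte is lost on the way in or out.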
URLs themselves do not support Chinese characters.
You can try String uname = java.net.URLDecoder.decode(request.getParameter("uname"), "UTF-8");
out.print(uname);
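To see what `URLDecoder` does with the percent-encoded form, here is a small standalone sketch. The class `UrlCodecDemo` and its two helpers are hypothetical. Note that a container normally percent-decodes parameters before `request.getParameter` returns them, so decoding again probably only changes anything when the value is still percent-encoded.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, not part of the original thread.
final class UrlCodecDemo {

    // Percent-encode a value using UTF-8 bytes.
    public static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Percent-decode a value, reading the decoded bytes as UTF-8.
    public static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = encode("\u957F");
        System.out.println(encoded);                          // %E9%95%BF
        System.out.println(decode(encoded).equals("\u957F")); // true
        System.out.println(decode("abc"));                    // abc (plain ASCII is unchanged)
    }
}
```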
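1. Because you are accessing the Servlet directly from the browser, everything defaults to ISO-8859-1. If the encoding had been set through a page form, this would not happen.
2. For the console output: System.out.println(new String(name.getBytes("ISO-8859-1"),"gbk"));
3. As for writing output directly in the Servlet, you must call response.setContentType("text/html;charset=utf-8"); The explanation is the same as in 2: with no encoding set, the default is used for display.
4. If you replace "呵呵" with "%E9%95%BF", of course it is displayed exactly as written. Do you need an interpreter to speak Mandarin with a Chinese person?
5. You said above that the browser page displays 长 correctly. I don't quite understand how it could. I tried it specifically and got garbled text.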
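Point 3 can be illustrated without a servlet container by looking at the bytes each charset produces. This is a sketch under the assumption that a response writer with no charset set falls back to ISO-8859-1, as the replies here suggest; `ResponseCharsetDemo` and `hex` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original thread.
final class ResponseCharsetDemo {

    // Render bytes as zero-padded uppercase hex.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u5475\u5475"; // the two-character literal printed by the servlet

        // ISO-8859-1 cannot represent these characters, so each becomes '?' (0x3F).
        System.out.println(hex(text.getBytes(StandardCharsets.ISO_8859_1))); // 3F3F

        // UTF-8 keeps them: three bytes per character.
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_8)));      // E591B5E591B5
    }
}
```

Once a character has been replaced by `?` on the way out, nothing on the browser side can recover it, which is why the charset has to be set before the writer is obtained.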
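a: For GET requests in Tomcat, add URIEncoding="utf-8" to the Connector element for port 8080 in conf/server.xml.
b: For POST requests, call request.setCharacterEncoding("utf-8"); in the servlet before request.getParameter.
If the servlet responds with Chinese text and the client shows garbled characters: call response.setCharacterEncoding("utf-8");
before response.getWriter.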
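What the GET-side setting changes can be modelled as "which charset is used to turn the percent-decoded bytes into characters". The sketch below is only a model of that idea using `URLDecoder`, not Tomcat's actual code path; `QueryDecodeDemo` and `decodeParam` are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; a model of the container's choice, not Tomcat internals.
final class QueryDecodeDemo {

    // Percent-decode a raw query value, reading the bytes with the given charset.
    public static String decodeParam(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "%E9%95%BF"; // what the browser actually sends for uname

        // Assumed default: three bytes become three unrelated Latin-1 characters.
        System.out.println(decodeParam(raw, "ISO-8859-1").length());    // 3

        // With UTF-8, the same three bytes become the one intended character.
        System.out.println(decodeParam(raw, "UTF-8").equals("\u957F")); // true
    }
}
```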
2. Add the following below the title tag: <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
3. Add method="post" to the form, for example: <form action="xxx" method="post">
-----------------------------------------------------------------------
4.servlet
----------------------------------
public void doPost(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException {
request.setCharacterEncoding("utf-8");
response.setContentType("text/html;charset=utf-8");
PrintWriter out = response.getWriter();
out.print("呵呵");
String name = request.getParameter("userName");
System.out.println(name);
out.print(name);
}
Output when tested: 呵呵测试中文乱码问题
Conclusion: the garbled-text problem is solved.
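But with the GET/POST settings suggested above, when I pass the value through the address bar, whatever I type is output exactly as typed. Yet when I enter "长" in a text input named uname and submit the form (method=get), garbled text is displayed. What does this behavior indicate?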
First, get the GBK and UTF-8 encodings of "呵呵" and "长":
public static void main(String[] args) throws UnsupportedEncodingException {
echoGBKEncoding("长");
echoUTF8Encoding("长");
echoGBKEncoding("呵呵");
echoUTF8Encoding("呵呵");
}
public static void echoGBKEncoding(String str) throws UnsupportedEncodingException{
if(str == null) return;
System.out.print(str + " GBK encoding : ");
for(byte i : str.getBytes("GBK")){
System.out.print(String.format("%02X", i & 0xFF)); // zero-padded so bytes below 0x10 keep two digits
}
System.out.println();
}
public static void echoUTF8Encoding(String str) throws UnsupportedEncodingException{
if(str == null) return;
System.out.print(str + " UTF-8 encoding : ");
for(byte i : str.getBytes("UTF-8")){
System.out.print(String.format("%02X", i & 0xFF)); // zero-padded so bytes below 0x10 keep two digits
}
System.out.println();
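}
Running the program above shows that the UTF-8 encoding of "长" is E995BF. Isn't that the same as the encoding in the URL (with the % removed; as for why the URL adds %, you know the reason)? Then test like this:
System.out.println("UTF-8 --> GBK --> UTF-8");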
String gbkStr = new String("长".getBytes("UTF-8"),"GBK");
System.out.println(gbkStr);
String utf_8Str = new String(gbkStr.getBytes("GBK"),"UTF-8");
System.out.println(utf_8Str);
System.out.println("========================");
System.out.println("GBK --> UTF-8 --> GBK");
utf_8Str = new String("长".getBytes("GBK"),"UTF-8");
System.out.println(utf_8Str);
gbkStr = new String(utf_8Str.getBytes("UTF-8"),"GBK");
System.out.println(gbkStr);
System.out.println("========================");
System.out.println("UTF-8 --> iso8859-1 --> UTF-8");
String iso8859_1str = new String("长".getBytes("UTF-8"),"iso8859-1");
System.out.println(iso8859_1str);
utf_8Str = new String(iso8859_1str.getBytes("iso8859-1"),"UTF-8");
System.out.println(utf_8Str);
System.out.println("========================");
System.out.println("GBK --> iso8859-1 --> GBK");
iso8859_1str = new String("长".getBytes("GBK"),"iso8859-1");
System.out.println(iso8859_1str);
gbkStr = new String(iso8859_1str.getBytes("iso8859-1"),"GBK");
System.out.println(gbkStr);
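System.out.println("========================");
You will find that GBK --> iso8859-1 --> GBK and UTF-8 --> iso8859-1 --> UTF-8 do not produce errors. As for why, you can work it out yourself (hint: a typical Chinese character takes 2 bytes in GBK and 3 bytes in UTF-8, while iso8859-1 uses 1 byte per character; the original can only be restored if decoding loses no bytes).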
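The four experiments above can be condensed into a single check that returns true only when the text survives a wrong decode followed by its reversal. This is a sketch with hypothetical names (`RoundTripCheck`, `survives`), and it assumes the GBK charset is available in the running JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original thread.
final class RoundTripCheck {

    // Encode with `real`, wrongly decode with `wrong`, then try to undo that mistake.
    public static boolean survives(String text, Charset real, Charset wrong) {
        String misdecoded = new String(text.getBytes(real), wrong);
        String restored = new String(misdecoded.getBytes(wrong), real);
        return restored.equals(text);
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK"); // assumption: GBK is installed
        Charset utf8 = StandardCharsets.UTF_8;
        Charset latin1 = StandardCharsets.ISO_8859_1;
        String text = "\u957F";

        System.out.println(survives(text, utf8, latin1)); // true
        System.out.println(survives(text, gbk, latin1));  // true
        System.out.println(survives(text, utf8, gbk));    // false
        System.out.println(survives(text, gbk, utf8));    // false
    }
}
```

The two failing cases fail because the wrong decoder meets byte sequences it considers invalid and substitutes a replacement character, so the original bytes are gone before the reverse step starts.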
Hints:
IE: Page -> Encoding
Firefox: View -> Character Encoding
Console: cmd -> right-click -> Properties -> Options tab -> Current code page (936 means GBK)
Eclipse: Run As -> Run Configurations -> Common tab -> Encoding
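Besides checking these settings by hand, the JVM can report its default charset, and a `PrintStream` can be given an explicit encoding so console output does not depend on that default. A minimal sketch follows; `ConsoleEncodingDemo` and `printedBytes` are hypothetical names, and GBK is assumed to be available in the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical demo class, not part of the original thread.
final class ConsoleEncodingDemo {

    // Return the bytes a PrintStream with the given encoding emits for `text`.
    public static byte[] printedBytes(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintStream ps = new PrintStream(sink, true, charsetName)) {
            ps.print(text);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // Platform dependent, so no expected value is shown here.
        System.out.println(Charset.defaultCharset());

        System.out.println(printedBytes("\u957F", "UTF-8").length); // 3
        System.out.println(printedBytes("\u957F", "GBK").length);   // 2
    }
}
```

If the console's code page and the stream's encoding disagree, the bytes are still correct for the stream's charset but the console renders them as the wrong characters.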