Another encoding problem!

While checking something today I ran into the following, which I don't quite understand:

    String test = "a";
    byte[] tests = test.getBytes();
    for (int i = 0; i < tests.length; i++) {
        System.out.println(tests[i]);
    }

getBytes() uses the system default charset (GBK here), and the output is 97.

Now look at this code:

    String test = "a";
    byte[] tests = test.getBytes("unicode");
    for (int i = 0; i < tests.length; i++) {
        System.out.println(tests[i]);
    }

With "unicode", the output is instead -1 -2 97 0.

    String test = "ab";
    byte[] tests = test.getBytes("unicode");
    for (int i = 0; i < tests.length; i++) {
        System.out.println(tests[i]);
    }

This prints -1 -2 97 0 98 0.
Surely the Unicode encoding of a one-character string should be only two bytes!
So what are the leading -1 -2?

Solutions »

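-1 -2 are the bytes FF FE, the UTF-16 byte order mark (BOM). UCS specifies that when Unicode text is transmitted, the character U+FEFF (named ZERO WIDTH NO-BREAK SPACE) can be placed at the front, whereas U+FFFE is not a valid character and should never be encoded. Because byte order matters in transmission, the order has to be indicated: receiving FE FF means Big-Endian, and receiving FF FE means Little-Endian.

Big-Endian means the high-order byte of each UTF-16 code unit comes first and the low-order byte second. Little-Endian is the opposite: low-order byte first, high-order byte second. A received FF FE is therefore the FEFF mark with its high and low bytes swapped.

According to what I found online, PCs generally use Little-Endian for transmission, while Macs use Big-Endian.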
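
To make the byte orders concrete, here is a minimal sketch that encodes "a" with the three standard UTF-16 charsets and prints the bytes as unsigned hex. The class name Utf16EncodeDemo and the hex helper are made up for illustration. It assumes a JDK with java.nio.charset.StandardCharsets (Java 7 or later); on such JDKs the "UTF-16" charset is documented to write a big-endian BOM, so the result differs from the -1 -2 97 0 reported above for the legacy "unicode" name, whose behavior may depend on the JDK version.

```java
import java.nio.charset.StandardCharsets;

public class Utf16EncodeDemo {
    // Format bytes as unsigned hex. Java's byte type is signed,
    // which is why 0xFF and 0xFE print as -1 and -2.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String test = "a";

        // UTF-16: big-endian with a leading BOM
        System.out.println(hex(test.getBytes(StandardCharsets.UTF_16)));   // FE FF 00 61
        // UTF-16LE: low byte first, no BOM written
        System.out.println(hex(test.getBytes(StandardCharsets.UTF_16LE))); // 61 00
        // UTF-16BE: high byte first, no BOM written
        System.out.println(hex(test.getBytes(StandardCharsets.UTF_16BE))); // 00 61

        // The bytes from the question, shown both ways
        byte[] fromQuestion = {-1, -2, 97, 0};
        System.out.println(hex(fromQuestion));                             // FF FE 61 00
    }
}
```

Read this way, -1 -2 97 0 is FF FE 61 00: a little-endian BOM followed by the single little-endian code unit 0x0061, so the character itself still takes only two bytes.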
Got it, thanks.
So on a Mac the output would probably be -1 -2 0 97 0 98, right?
On this point, I found a reference (Unicode Technical Note #23: To the BMP and Beyond); it is explained on page 18.
http://www.unicode.org/notes/tn23/Muller-Slides+Narr.pdf

Character encoding schemes
* mapping of code units to bytes
* UTF-8: obvious
* UTF-16LE
     little endian
     initial FF FE (if present) is a character
* UTF-16BE
     big endian
     initial FE FF (if present) is a character
* UTF-16
     either endianness
     may have a BOM: FF FE or FE FF, not part of text
     if no BOM, then must be BE
* UTF-32: similarly, UTF-32LE, UTF-32BE and UTF-32

The final layer of the character model deals with the serialization in bytes of the code units.

For UTF-8, where the code units are already bytes, this step is trivial, and there is a single encoding scheme.

For UTF-16, there are three encoding schemes:

in UTF-16LE, the least significant byte of each code unit comes first. If the string starts with the bytes FF FE, those two bytes should be interpreted as the FEFF code unit, i.e. as the character U+FEFF ZERO WIDTH NO-BREAK SPACE.

in UTF-16BE, the most significant byte of each code unit comes first. If the string starts with the bytes FE FF, those two bytes should be interpreted as the FEFF code unit, i.e. as the character U+FEFF ZERO WIDTH NO-BREAK SPACE.

in UTF-16, either endianness is possible. The endianness may be indicated by starting the byte stream with FF FE (little endian) or FE FF (big endian), and those bytes are not part of the string. If no endianness is specified, then the byte order must be big endian.

UTF-32 also has three encoding schemes, defined in a similar way.
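
The three UTF-16 schemes quoted above can be observed from the decoding side as well. The following is a small sketch, not taken from the technical note: the class name Utf16DecodeDemo and its helper methods are hypothetical, and it assumes Java's standard UTF_16 and UTF_16LE charsets behave as their documentation describes.

```java
import java.nio.charset.StandardCharsets;

public class Utf16DecodeDemo {
    // Decode with the "UTF-16" scheme: a leading BOM selects the byte order
    // and is not part of the resulting text.
    static String decodeUtf16(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16);
    }

    // Decode with the "UTF-16LE" scheme: byte order is fixed, so a leading
    // FF FE is kept as the character U+FEFF.
    static String decodeUtf16LE(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xFF, (byte) 0xFE, 0x61, 0x00};
        byte[] noBom = {0x00, 0x61};

        // BOM consumed, one character left
        System.out.println(decodeUtf16(withBom).length());     // 1
        // BOM kept as U+FEFF, so two characters
        System.out.println(decodeUtf16LE(withBom).length());   // 2
        // No BOM under "UTF-16": treated as big-endian
        System.out.println(decodeUtf16(noBom));                // a
    }
}
```

The same four bytes thus yield either one or two characters depending on which scheme the reader assumes, which is why the scheme name matters as much as the bytes.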