A Java question about telling Chinese-character bytes apart (interview question)

Interview question, Java


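The expected output is the following. Why does yours come out like the result above, and how do you tell the two cases apart?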
我
我叫
我叫A
我叫AB
我叫ABC
我叫ABC哈
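Doesn't Java treat Chinese and English characters as the same size, at least as chars inside a String?
If n counts characters, the result is what the previous poster wrote; if n counts bytes, then n=1 could not output “我” either.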
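The difference between counting characters and counting bytes can be checked directly. The sketch below is only an illustration: the class name CharVsByteDemo and the helper byteLength are made up for this example, and it assumes the running JDK provides the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharVsByteDemo {
    // Number of bytes the text occupies once encoded in the given charset.
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String s = "我叫ABC";
        // Assumption: GBK is available in this runtime (it is an extended charset).
        Charset gbk = Charset.forName("GBK");
        System.out.println(s.length());                            // 5 chars (UTF-16 code units)
        System.out.println(byteLength(s, gbk));                    // 7 bytes: 2 + 2 + 1 + 1 + 1
        System.out.println(byteLength(s, StandardCharsets.UTF_8)); // 9 bytes: 3 + 3 + 1 + 1 + 1
    }
}
```

So inside a String both kinds of character are one char each, but the byte count depends entirely on which encoding getBytes uses.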
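public static String getBytes(int count, String str) {
    // Encodes with the platform default charset.
    byte[] bytes = str.getBytes();

    String str1 = new String(bytes, 0, count);
    // Grow the byte window until it decodes to text that really occurs in str,
    // i.e. until it no longer ends in the middle of a character.
    while (!str.contains(str1)) {
        str1 = new String(bytes, 0, ++count);
    }
    return str1;
}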
public class java {

    public static String getBytes(int count, String str) {
        byte[] bytes = str.getBytes();

        String str1 = new String(bytes, 0, count);
        while (!str.contains(str1)) {
            str1 = new String(bytes, 0, ++count);
        }
        return str1;
    }

    // A non-negative byte is a single-byte (ASCII) character;
    // a negative byte is treated as the start of a two-byte character.
    public static String getBytes2(int count, String str) {
        byte[] bytes = str.getBytes();
        int i = 0;
        while (i < count) {
            if (bytes[i] >= 0) {
                i++;
            } else {
                i += 2;
            }
        }
        return new String(bytes, 0, i);
    }

    public static void main(String[] args) {
        String s = "我叫ABC哈哈哈";
        for (int i = 1; i < 9; i++) {
            System.out.println(getBytes(i, s));
        }
        for (int i = 1; i < 9; i++) {
            System.out.println(getBytes2(i, s));
        }
    }
}
To the poster on the 2nd floor:

    public static String getBytes3(int count, String str) {
        byte[] bytes = str.getBytes();
        int i = 0, ic = 0; // i: byte offset, ic: characters consumed so far
        while (ic < count) {
            if (bytes[i] >= 0) {
                i++;
            } else {
                i += 2;
            }
            ic++;
        }
        return new String(bytes, 0, i);
    }
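getBytes2 may not be the optimal solution, but it is certainly a fairly good one.
However, it only works for double-byte encodings such as GBK; on a machine whose default encoding is UTF-8, the code above is wrong.
So it should use getBytes("GBK") and new String(bytes, 0, i, "GBK").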
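A minimal sketch of that suggestion, pinning the charset instead of relying on the platform default. The class name GbkTruncate and the method name prefixByBytes are invented for this example; it assumes the runtime ships the GBK charset and that the text is representable in GBK. It also clamps count to the array length, which the posted code does not do.

```java
import java.nio.charset.Charset;

final class GbkTruncate {
    // Assumption: GBK is available; Charset.forName throws if it is not.
    private static final Charset GBK = Charset.forName("GBK");

    // Shortest prefix of str covering at least `count` GBK bytes,
    // extended so that it never ends in the middle of a character.
    static String prefixByBytes(String str, int count) {
        byte[] bytes = str.getBytes(GBK);
        int limit = Math.min(count, bytes.length); // avoid running past the array
        int i = 0;
        while (i < limit) {
            // In GBK an ASCII byte is non-negative; a negative byte
            // is the lead byte of a two-byte character.
            i += (bytes[i] >= 0) ? 1 : 2;
        }
        return new String(bytes, 0, Math.min(i, bytes.length), GBK);
    }

    public static void main(String[] args) {
        String s = "我叫ABC哈哈哈";
        for (int n = 1; n < 9; n++) {
            System.out.println(n + " -> " + prefixByBytes(s, n));
        }
    }
}
```

Because the charset is named explicitly, the result is the same whether the machine defaults to GBK or to UTF-8.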
Thanks to the poster above for the praise and the correction (with UTF-8, i would advance by 1, 2, or 3).
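For UTF-8 the step size can be read off the lead byte of each sequence. The following is a rough sketch rather than anyone's posted solution: Utf8Truncate, sequenceLength and prefixByBytes are names made up here. Note that UTF-8 also has 4-byte sequences, used for characters outside the Basic Multilingual Plane, so the step can be 4 as well.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Truncate {
    // Length of the UTF-8 sequence that begins with this lead byte.
    static int sequenceLength(byte lead) {
        int b = lead & 0xFF;      // view the signed byte as 0..255
        if (b < 0x80) return 1;   // 0xxxxxxx: ASCII
        if (b < 0xE0) return 2;   // 110xxxxx
        if (b < 0xF0) return 3;   // 1110xxxx: most Chinese characters
        return 4;                 // 11110xxx: supplementary characters
    }

    // Shortest prefix of str covering at least `count` UTF-8 bytes
    // without cutting a character in half.
    static String prefixByBytes(String str, int count) {
        byte[] bytes = str.getBytes(StandardCharsets.UTF_8);
        int limit = Math.min(count, bytes.length);
        int i = 0;
        while (i < limit) {
            i += sequenceLength(bytes[i]); // i always sits on a lead byte
        }
        return new String(bytes, 0, Math.min(i, bytes.length), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = "我叫ABC哈哈哈";
        for (int n = 1; n < 9; n++) {
            System.out.println(n + " -> " + prefixByBytes(s, n));
        }
    }
}
```

With this string, byte counts 1 to 3 all give “我”, and 4 to 6 give “我叫”, since each Chinese character takes three bytes in UTF-8.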
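Has nobody above heard of code points?
Java's String has long had methods that handle code points directly,
for example
String.codePointBefore(int);
String.codePointAt(int); Reading the API more never hurts.
Surrogates may still need some special handling,
because certain variant and rare characters sit in awkward positions. It is worth looking up. Character encoding runs deep.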
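One way the code point methods could be applied to the byte-count problem is sketched below. It walks the String one code point at a time and asks the charset how many bytes each one needs, so no per-encoding byte rules are required. The names CodePointTruncate and prefixByBytes are hypothetical, and the sketch assumes a charset whose encoder does not emit a byte-order mark (GBK or UTF-8, for instance, but not Java's "UTF-16").

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CodePointTruncate {
    // Shortest prefix of str whose encoding covers at least `count` bytes,
    // never splitting a code point (including surrogate pairs).
    static String prefixByBytes(String str, int count, Charset charset) {
        int used = 0; // bytes consumed so far
        int end = 0;  // char index just past the last code point kept
        while (end < str.length() && used < count) {
            int cp = str.codePointAt(end);
            int width = Character.charCount(cp); // 2 for a surrogate pair, otherwise 1
            used += str.substring(end, end + width).getBytes(charset).length;
            end += width;
        }
        return str.substring(0, end);
    }

    public static void main(String[] args) {
        String s = "我叫ABC哈哈哈";
        for (int n = 1; n < 9; n++) {
            System.out.println(n + " -> " + prefixByBytes(s, n, StandardCharsets.UTF_8));
        }
    }
}
```

Encoding each code point separately is slower than scanning a byte array, but it handles surrogate pairs correctly and works unchanged for any charset that meets the assumption above.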