Splitting a string into separate Chinese characters and English words

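$str="我爱中国word  i love";
$strlen=mb_strlen($str,"utf-8");
for($i=0;$i<$strlen;$i++){
  // A first byte above 0xa0 means a multibyte (here: Chinese) character
  if(ord(mb_substr($str,$i,1,"utf-8"))>0xa0){
     echo mb_substr($str,$i,1,"utf-8")."<br>";
  }else{
    // Collect characters up to the next space; a space is the only stop condition here
    $temp = "";
    while($i<$strlen && mb_substr($str,$i,1,"utf-8")!=" ") {
      $temp.=mb_substr($str,$i,1,"utf-8");
      $i++;
    }
    echo $temp."<br>";
  }
}
OP, give this a try; it should work.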

Solutions

Assuming the string uses a GB national-standard Chinese encoding (GBK, GB2312):
$s = "我爱中国China I Love";
preg_match_all("/[\x80-\xff].|\w+/", $s, $r);
print_r($r);
Output:
Array
(
    [0] => Array
        (
            [0] => 我
            [1] => 爱
            [2] => 中
            [3] => 国
            [4] => China
            [5] => I
            [6] => Love
        )

)
Tested and it works; 唠叨's code is fine. For UTF-8 encoded strings, the following also passes my test:
$output = null;
$str='我sdfg爱dfg16d到访撒旦飞4!@#你abddd';
preg_match_all("/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xff][\x80-\xbf]{3}/", $str, $output);
var_export($output);
Output:
array (
  0 =>
  array (
    0 => '我',
    1 => 's',
    2 => 'd',
    3 => 'f',
    4 => 'g',
    5 => '爱',
    6 => 'd',
    7 => 'f',
    8 => 'g',
    9 => '1',
    10 => '6',
    11 => 'd',
    12 => '到',
    13 => '访',
    14 => '撒',
    15 => '旦',
    16 => '飞',
    17 => '4',
    18 => '!',
    19 => '@',
    20 => '#',
    21 => '你',
    22 => 'a',
    23 => 'b',
    24 => 'd',
    25 => 'd',
    26 => 'd',
  ),
)
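The byte-range pattern above spells out the UTF-8 sequence lengths by hand. As a rough alternative sketch, assuming the input is valid UTF-8 and PCRE was built with UTF-8 support, the `u` modifier lets PCRE find the character boundaries itself:

```php
<?php
// Sketch only: split a UTF-8 string into single characters using the u modifier.
$str = '我sdfg爱dfg16d到访撒旦飞4!@#你abddd';

// An empty pattern with /u matches between code points;
// PREG_SPLIT_NO_EMPTY drops the empty pieces at both ends.
$chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

print_r($chars);
```

This should yield the same 27 single-character elements as the listing above. On PHP 7.4 or later, `mb_str_split($str)` is another option.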
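Borrowing from the third-floor reply:
$s = "我爱中国China I Love";
preg_match_all("/[\x80-\xff].|\w+/",iconv("UTF-8","gb2312",$s), $r);
// Adjust the part marked in red to your actual situation; of course the results have to be converted back before display.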
print_r($r);
Thanks, both approaches work, but the third-floor code has a small problem, so I modified it as follows:
Approach 1:
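$str="我爱中国word  i love";
$strlen=mb_strlen($str,"utf-8");
for($i=0;$i<$strlen;$i++){
  if(ord(mb_substr($str,$i,1,"utf-8"))>0xa0){
    echo mb_substr($str,$i,1,"utf-8")." <br>";
  }else{
    $temp = "";
    // Stop at the end of the string, at a space, or at a multibyte (Chinese) character
    while($i<$strlen && mb_substr($str,$i,1,"utf-8")!=" " && ord(mb_substr($str,$i,1,"utf-8"))<=0xa0) {
      $temp.=mb_substr($str,$i,1,"utf-8");
      $i++;
    }
    echo $temp." <br>";
    // If the loop stopped on a Chinese character, step back so the for loop's $i++ does not skip it
    if($i<$strlen && mb_substr($str,$i,1,"utf-8")!=" ") $i--;
  }
}
Approach 2: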
public static function getSplitStr($str){
     preg_match_all("/[\x80-\xff].|\w+/",iconv("UTF-8","gb2312",$str), $r);
     return iconv("gb2312","UTF-8",implode(" ",$r[0]));
}
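
For input that is already UTF-8, the iconv round trip in the function above might be avoidable. The following is a minimal sketch, not the method used in this thread: `splitHanAndWords` is a hypothetical name, and it assumes PCRE with Unicode property support. It uses an explicit ASCII class instead of `\w`, because under the `u` modifier `\w` may also match Han characters and would then glue them onto a neighbouring Latin word.

```php
<?php
// Hypothetical helper (the name is illustrative, not from the thread).
function splitHanAndWords(string $str): array
{
    // \p{Han} matches one Han character at a time;
    // [A-Za-z0-9_]+ matches a run of ASCII word characters.
    preg_match_all('/\p{Han}|[A-Za-z0-9_]+/u', $str, $m);
    return $m[0];
}

print_r(splitHanAndWords("我爱中国word  i love"));
```

For the sample string this returns 我, 爱, 中, 国, word, i, love as separate elements, and spaces are dropped because neither alternative matches them.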