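Say I have a string like "求一PHP正则表达式like,フレーズ", which mixes Chinese, English, and Japanese text.
I want to store each character of this string in an array, with each English word kept as a single element,
for example {"求","一","PHP","正则","表","达","式","like","フ","レ","ー","ズ"}
I found some code online that can store the Chinese characters in an array:
// Use a regex to match a single half-width or full-width character and store the matches in array $ar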
preg_match_all("/[\x80-\xff]+?\\x00/",$str,$ar);
$ar = $ar[0];
// Drop the entries of $ar that are just the ASCII 0 (NUL) character
$ar_new = array();
for ( $i = 0; $i < count($ar); $i++ ){
    if ($ar[$i] != chr(0x00)) {
        $ar_new[]=$ar[$i];
        echo "==".$ar[$i];
    }
}
preg_match_all("/[\x80-\xff].|\w+/", $s, $r);
print_r($r[0]);
Output:
Array ( [0] => 求 [1] => 一 [2] => PHP [3] => 正 [4] => 则 [5] => 表 [6] => 达 [7] => 式 [8] => like [9] => フ [10] => レ [11] => ズ )
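A note on how this pattern behaves: `[\x80-\xff].` matches one high byte plus whatever byte follows it, so non-ASCII text is consumed two bytes at a time. That fits a double-byte encoding such as GBK (an assumption about the setup used here; the thread does not state it), but not UTF-8, where each of these characters occupies three bytes. A minimal sketch, with "求一" spelled out as UTF-8 bytes so the result does not depend on how the file is saved:

```php
<?php
// "求一" as explicit UTF-8 bytes: 求 = E6 B1 82, 一 = E4 B8 80.
$utf8 = "\xE6\xB1\x82\xE4\xB8\x80";

// Same pattern as above, without the /u modifier: the subject is raw bytes.
preg_match_all("/[\x80-\xff].|\w+/", $utf8, $r);

// Print each match as hex so the byte boundaries are visible.
$chunks = array_map('bin2hex', $r[0]);
print_r($chunks);
```

The six bytes come back as three two-byte chunks that cut across the character boundaries, which is the kind of result that shows up as garbled text on a UTF-8 page.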
<?php
$str = '"求","一","PHP","正则","表","达","式","like","フ","レ","ー","ズ"';
$str = str_ireplace(array(',', '"'), array("", ""), $str);
$pattern = "/[^\x4e00-\x9fa5]{2}|[\w]+/i";
preg_match_all($pattern, $str, $aMatch);
print_r($aMatch);
?>
That is exactly the result I want, but why do I get garbled output when I run it?
<html>
<head>
<title> test</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head><body>
<?php
$s = '求一PHP正则表达式like,フレ?ズ';
preg_match_all("/[\x80-\xff].|\w+/", $s, $r);
print_r($r[0]);
//echo preg_replace('#[\x{4e00}-\x{9fa5}]#ue','chinese_unicode("\\0")',$str2);// make sure $str2 is UTF-8.
$q = trim($_GET['q']);
?>
</body>
</html>
Check the meta charset='' (or header()), and also the file encoding.
Didn't you read my code? I have already set charset=UTF-8 everywhere, including the file encoding.
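If the page and the file really are UTF-8, a likely cause of the garbling is the pattern itself: `[\x80-\xff].` takes two bytes per match, while each of these characters is three bytes in UTF-8. A minimal sketch of a UTF-8-aware alternative, assuming the string is valid UTF-8 (the names `$m` and `$tokens` are only illustrative):

```php
<?php
// Assumption: this file is saved as UTF-8, so $s holds UTF-8 bytes.
$s = '求一PHP正则表达式like,フレーズ';

// The /u modifier makes PCRE read pattern and subject as UTF-8, so a
// non-ASCII character is matched as one code point rather than byte by byte.
//   [A-Za-z0-9_]+  a run of ASCII word characters, kept as one element
//   [^\x00-\x7F]   any single non-ASCII code point (Chinese, kana, ...)
// \w is avoided on purpose: with /u it can also match CJK letters,
// which would glue neighbouring characters into one element.
preg_match_all('/[A-Za-z0-9_]+|[^\x00-\x7F]/u', $s, $m);
$tokens = $m[0];
print_r($tokens);
```

With the sample string this yields 13 elements: each Chinese or Japanese character on its own (including "ー"), plus "PHP" and "like" as whole words. The ASCII comma matches neither alternative and is dropped.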