How do I extract the Chinese characters from a string that mixes Chinese and English?

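As the title says. For example, given CString s="wadfd地地道道打斗打斗地点的wdefhhaf大胆地Wwddddddddd\\f房方法";
how can I pull out the Chinese characters, separate them from the other characters, and store the two groups in two different CString objects?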

Solutions »

void CAaaView::OnButton1()
{
    // TODO: Add your control notification handler code here
    CString ChargeItemName;
    CString aa = "9494858受得失测试585858585888d888888888888888";
    int len = 0;
    ChargeItemName = InterceptString(len, aa);   // leading part, cut on a character boundary
    AfxMessageBox(ChargeItemName);
    len = ChargeItemName.GetLength();
    ChargeItemName = aa.Mid(len);                // the rest of the string
    AfxMessageBox(ChargeItemName);
}

CString CAaaView::InterceptString(int qlen, CString strSource)
{
    int len, y;
    CString sTemp, sreturn, ceshi;
    strSource.TrimLeft();
    strSource.TrimRight();
    len = strSource.GetLength();
    sTemp = strSource.Right(len - qlen);
    len = sTemp.GetLength();
    y = 0;
    // Advance about 26 bytes without cutting a double-byte character in half,
    // and never index past the end of the string.
    while (y < len && y < 26)
    {
        if ((BYTE)sTemp[y] > 127)   // lead byte of a double-byte character
            y = y + 2;
        else
            y = y + 1;
    }
    ceshi.Format("%d", y);
    AfxMessageBox(ceshi);
    sreturn = sTemp.Left(y);
    return sreturn;
}
Convert the string to UTF-16; then every character in it occupies the same width.
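A minimal sketch of the UTF-16 idea, assuming the text has already been converted (the conversion itself is not shown) and assuming "Chinese character" means the CJK Unified Ideographs block U+4E00..U+9FFF. The names IsCjkIdeograph and SplitUtf16 are made up for illustration and use std::u16string instead of CString so the snippet compiles without MFC.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: true if a UTF-16 code unit lies in the CJK Unified
// Ideographs block (U+4E00..U+9FFF). This range is an assumption; full-width
// punctuation and the extension blocks are deliberately not counted.
inline bool IsCjkIdeograph(char16_t unit)
{
    return unit >= 0x4E00 && unit <= 0x9FFF;
}

// Splits a UTF-16 string into {chinese, other}, one code unit at a time.
inline std::pair<std::u16string, std::u16string> SplitUtf16(const std::u16string& text)
{
    std::u16string chinese;
    std::u16string other;
    for (char16_t unit : text) {
        if (IsCjkIdeograph(unit)) {
            chinese += unit;
        } else {
            other += unit;
        }
    }
    return {chinese, other};
}
```

Because each character in the example string is a single 16-bit unit, no lead-byte or trail-byte bookkeeping is needed here.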
CString s="wadfd地地道道打斗打斗地点的wdefhhaf大胆地Wwddddddddd\\f房方法";
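OP, written that way, your string should be GB2312 rather than UNICODE. In GB2312 the high byte of a Chinese character is in the range 0xB0-0xF7 and the low byte is in the range 0xA1-0xFE, so you can scan from the start: a byte below 0xB0 should be a single-byte English letter or symbol, while a byte of 0xB0 or above, together with the byte that follows it, is taken as a Chinese character.
http://baike.baidu.com/view/25492.htm?fr=ala0_1_1 Here is a link, OP; have a look at how Chinese characters are encoded. Once you understand the underlying encoding, separating Chinese from English becomes straightforward.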
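The byte-range test described above could look roughly like this. It is only a sketch: it assumes the bytes are strict GB2312 (not GBK), uses std::string in place of CString so it compiles without MFC, and the names IsGb2312Hanzi and SplitGb2312 are hypothetical. Double-byte characters outside the Hanzi range are kept as a pair and put with the non-Chinese text.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// True if (lead, trail) fall in the GB2312 Hanzi area quoted in the reply:
// lead byte 0xB0..0xF7, trail byte 0xA1..0xFE.
inline bool IsGb2312Hanzi(unsigned char lead, unsigned char trail)
{
    return lead >= 0xB0 && lead <= 0xF7 && trail >= 0xA1 && trail <= 0xFE;
}

// Splits GB2312 bytes into {hanzi, other}.
inline std::pair<std::string, std::string> SplitGb2312(const std::string& bytes)
{
    std::string hanzi;
    std::string other;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        // Assumption: any byte >= 0xA1 with a following byte starts a pair.
        if (lead >= 0xA1 && i + 1 < bytes.size()) {
            unsigned char trail = static_cast<unsigned char>(bytes[i + 1]);
            std::string& target = IsGb2312Hanzi(lead, trail) ? hanzi : other;
            target.append(bytes, i, 2);  // copy both bytes of the pair
            ++i;                         // skip the trail byte
        } else {
            other += bytes[i];           // single-byte character
        }
    }
    return {hanzi, other};
}
```

Casting to unsigned char first avoids depending on whether plain char is signed on the compiler in use.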
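#include <afx.h>
#include <stdio.h>
int main()
{
    CString sText = "wadfd地地道道打斗打斗地点的wdefhhaf大胆地Wwddddddddd\\f房方法", sChinese, sEnglish;
    char ch;
    for (int i = 0; i < sText.GetLength(); i++)
    {
        ch = sText.GetAt(i);
        if (ch < 0 && i + 1 < sText.GetLength()) // a Chinese (double-byte) character
        {
            sChinese += ch;
            sChinese += sText.GetAt(++i);        // take the second byte as well
        }
        else // otherwise an English letter or a digit
        {
            sEnglish += ch;
        }
    }
    // Cast explicitly; do not pass a CString object through printf's varargs.
    printf("%s\n%s\n", (LPCTSTR)sChinese, (LPCTSTR)sEnglish);
    return 0;
}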
CString strText(_T("wadfd地地道道打斗打斗地点的wdefhhaf大胆地Wwddddddddd\\f房方法"));

CString strAlpha(_T(""));
CString strChinese(_T(""));
for(int i=0; i<strText.GetLength(); i++)
{
    TCHAR ch = strText.GetAt(i);
    if(ch < 255)
    {
        strAlpha += ch;
    }
    else
    {
        strChinese += ch;
    }
}
AfxMessageBox(strAlpha);
AfxMessageBox(strChinese);
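Thanks, everyone. I am using ANSI; which one should I use?
I tested the code from reply #8. It is written so that it compiles for both ASCII (ANSI) and UNICODE, but the algorithm inside is a UNICODE one.
So in an ANSI build it gives the wrong result, because the value of a char is always less than 255.
That means this branch is always taken: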
       if(ch < 255)
        {
            strAlpha += ch;
        }
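Reply #7 splits on "less than 0".
That actually treats the other full-width characters as Chinese characters too,
because only codes from 0xB0A1 onward are Chinese characters, and testing for less than 0 is the same as testing for greater than or equal to 0x80.
Here is a GB2312 code table:
http://www.knowsky.com/resource/gb2312tbl.htm
Below is a small excerpt. You can see that if you only test for less than 0, full-width characters such as 0xA1C0 are also taken to be Chinese characters.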
code  +0 +1 +2 +3 +4 +5 +6 +7 +8 +9 +A +B +C +D +E +F
A1A0     　、。 · ˉ ˇ ¨ 〃々 — ～ ‖ … ‘ ’
A1B0  “ ” 〔〕〈〉《》「」『』〖〗【】
A1C0  ± × ÷ ∶ ∧ ∨ ∑ ∏ ∪ ∩ ∈ ∷ √ ⊥ ∥ ∠
A1D0  ⌒ ⊙ ∫ ∮ ≡ ≌ ≈ ∽ ∝ ≠ ≮ ≯ ≤ ≥ ∞ ∵
A1E0  ∴ ♂ ♀ ° ′ ″ ℃ ＄ ¤ ￠￡ ‰ § № ☆ ★
A1F0  ○ ● ◎ ◇ ◆ □ ■ △ ▲ ※ → ← ↑ ↓ 〓
If the OP only wants to tell full-width characters from half-width ones, I think testing only ch < 0 is a decent approach.
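To make the difference between the two tests concrete, here is a small sketch that sorts a first byte into three groups. The enum and function names are made up for illustration, and the Hanzi lead-byte range 0xB0..0xF7 is the one quoted earlier in this thread. Note that the "ch < 0" test only works where plain char is signed; casting to unsigned char and comparing with 0x80 says the same thing without that dependency.

```cpp
// Hypothetical classification of the first byte of a GB2312 character.
enum class ByteKind {
    HalfWidth,      // 0x00..0x7F: ASCII letters, digits, symbols
    HanziLead,      // 0xB0..0xF7: first byte of a Chinese character
    OtherHighByte   // any other byte >= 0x80, e.g. 0xA1 (full-width symbols)
};

inline ByteKind ClassifyLeadByte(char ch)
{
    unsigned char b = static_cast<unsigned char>(ch);
    if (b < 0x80) {
        return ByteKind::HalfWidth;      // what "ch >= 0" means for a signed char
    }
    if (b >= 0xB0 && b <= 0xF7) {
        return ByteKind::HanziLead;
    }
    return ByteKind::OtherHighByte;      // "ch < 0" would lump this in with Hanzi
}
```

With this split, a character such as 0xA1C0 from the table lands in OtherHighByte instead of being counted as a Chinese character.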
CString strText(_T("wadfd地地道道打斗打斗地点的wdefhhaf大胆地Wwddddddddd\\f房方法"));
CString strAlpha(_T(""));
CString strChinese(_T(""));
for(int i=0; i<strText.GetLength(); i++)
{
    TCHAR ch = strText.GetAt(i);
#ifdef _UNICODE
#define WIDE_CHAR
    // Wide build: one TCHAR per character.
    if(ch < 255)
    {
        strAlpha += ch;
    }
    else
    {
        strChinese += ch;
    }
#else
#define MULTI_CHAR
    // Multi-byte build: a lead byte starts a two-byte character.
    if(IsDBCSLeadByte((BYTE)ch) && i + 1 < strText.GetLength())
    {
        TCHAR ch2 = strText.GetAt(++i);
        strChinese += ch;
        strChinese += ch2;
    }
    else
    {
        strAlpha += ch;
    }
#endif
}
AfxMessageBox(strAlpha);
AfxMessageBox(strChinese);
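Reply #12 is impressive; I did not know there was an IsDBCSLeadByte function, which can tell full-width from half-width. OP, do you only need to tell full-width from half-width, or do you need to tell Chinese characters from non-Chinese ones? There are a lot of full-width characters that are not Chinese characters.
If the output goes to the console, how do I print the Chinese characters, i.e. strChinese in the program? Using wcout does not work either!
With GB2312 / ANSI it is fairly easy to tell them apart.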
Reply #12, that is really impressive.
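It is simple: each Chinese character takes two bytes, and if the ANSI value of the first byte is greater than 127, it is a Chinese character.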