How do I tell whether a file I am reading is Unicode or ASCII? Please give a couple of lines of demo code.
0xEF, 0xBB, 0xBF (these three bytes are the byte-order mark at the start of a UTF-8 file, when the file has one)
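The byte check mentioned above can be sketched roughly as follows. This is only an illustration: the names Bom and DetectBom are made up for this example and are not part of any Windows or MFC API. It also looks for the two UTF-16 marks (FF FE and FE FF), ignores UTF-32, and reports nothing for files saved without a BOM.

```cpp
#include <cstddef>

// Hypothetical helper type for this sketch; not a Windows/MFC type.
enum class Bom { None, Utf8, Utf16LE, Utf16BE };

// Look at the first bytes of a buffer and report which BOM, if any, it starts with.
// `len` must be the number of bytes actually read, not the capacity of the buffer.
Bom DetectBom(const unsigned char* data, std::size_t len)
{
    // UTF-8 BOM: EF BB BF
    if (len >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return Bom::Utf8;
    // UTF-16 little-endian BOM: FF FE
    if (len >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return Bom::Utf16LE;
    // UTF-16 big-endian BOM: FE FF
    if (len >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return Bom::Utf16BE;
    // No BOM found: the file may still be UTF-8 or ANSI text without a mark.
    return Bom::None;
}
```

A result of Bom::None does not prove the file is ASCII; it only means there is no mark to go on.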
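int op = IS_TEXT_UNICODE_STATISTICS;  // in: tests to run; out: tests that passed
if (IsTextUnicode(buff, sizeof(buff), &op))  // ideally pass the number of bytes actually read
{
    // Probably Unicode (UTF-16)
}
else
{
    // Probably ASCII
}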
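CFile file(pszPathName, CFile::modeRead | CFile::typeBinary);
// sizeof only gives the buffer size if pcBuf is an array, not a pointer
UINT nRead = file.Read(pcBuf, sizeof(pcBuf));  // nRead = bytes actually read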
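I thought the approach above would let me tell them apart, but every file comes out as ASCII. Should I look at the file extension instead?
If the file is UTF-8: MultiByteToWideChar(CP_UTF8,0,pcLineBuf,-1,NULL,0);
If the file is ASCII: MultiByteToWideChar(CP_ACP,0,pcLineBuf,-1,NULL,0);
The whole point is to choose between those two code-page flags.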
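One likely reason everything comes out as ASCII is that IsTextUnicode looks for UTF-16 text, so a UTF-8 file does not pass its tests. A different way to choose between CP_UTF8 and CP_ACP, when there is no BOM, is to check whether the bytes form well-formed UTF-8. The sketch below is a heuristic under that assumption, not a guaranteed detector; IsValidUtf8 is a name invented for this example.

```cpp
#include <cstddef>

// Hypothetical helper: returns true if the buffer is well-formed UTF-8.
// Pure ASCII also passes, because ASCII is a subset of UTF-8.
bool IsValidUtf8(const unsigned char* s, std::size_t len)
{
    // Smallest code point allowed for a sequence with N continuation bytes;
    // anything lower is an overlong encoding.
    static const unsigned long kMinCp[] = {0x0, 0x80, 0x800, 0x10000};

    std::size_t i = 0;
    while (i < len) {
        const unsigned char c = s[i];
        if (c < 0x80) {           // single-byte (ASCII) character
            ++i;
            continue;
        }

        std::size_t extra = 0;    // number of continuation bytes expected
        unsigned long cp = 0;     // decoded code point
        if ((c & 0xE0) == 0xC0)      { extra = 1; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; }
        else return false;        // stray continuation byte or invalid lead byte

        if (i + extra >= len)     // sequence is cut off at the end of the buffer
            return false;

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)   // continuation bytes must look like 10xxxxxx
                return false;
            cp = (cp << 6) | (cc & 0x3F);
        }

        // Reject overlong forms, UTF-16 surrogates, and values above U+10FFFF.
        if (cp < kMinCp[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;

        i += extra + 1;
    }
    return true;
}
```

If the check passes, converting with CP_UTF8 is a reasonable choice; if it fails, the text is more likely in the system ANSI code page and CP_ACP applies. Note that a buffer cut off in the middle of a multi-byte character fails the check, so it is safer to test whole lines or the whole file.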
The IsTextUnicode function determines whether a buffer is likely to contain a form of Unicode text. The function uses various statistical and deterministic methods to make its determination, under the control of flags passed via lpi. When the function returns, the results of such tests are reported via lpi.
BOOL IsTextUnicode(
CONST VOID* pBuffer, // input buffer to be examined
int cb, // size of input buffer
LPINT lpi // options
);
Parameters
pBuffer
[in] Pointer to the input buffer to be examined.
cb
[in] Specifies the size, in bytes, of the input buffer pointed to by pBuffer.
lpi
[in/out] On input, specifies the tests to be applied to the input buffer text. On output, receives the results of the specified tests: 1 if the contents of the buffer pass a test, zero for failure. Only flags that are set upon input to the function are significant upon output.
If lpi is NULL, the function uses all available tests to determine whether the data in the buffer is likely to be Unicode text. This parameter can be one or more of the following values.
Value    Meaning
IS_TEXT_UNICODE_ASCII16 The text is Unicode, and contains only zero-extended ASCII values/characters.
IS_TEXT_UNICODE_REVERSE_ASCII16 Same as the preceding, except that the Unicode text is byte-reversed.
IS_TEXT_UNICODE_STATISTICS The text is probably Unicode, with the determination made by applying statistical analysis. Absolute certainty is not guaranteed. See the following Remarks section.
IS_TEXT_UNICODE_REVERSE_STATISTICS Same as the preceding, except that the probably-Unicode text is byte-reversed.
IS_TEXT_UNICODE_CONTROLS The text contains Unicode representations of one or more of these nonprinting characters: RETURN, LINEFEED, SPACE, CJK_SPACE, TAB.
IS_TEXT_UNICODE_REVERSE_CONTROLS Same as the preceding, except that the Unicode characters are byte-reversed.
IS_TEXT_UNICODE_BUFFER_TOO_SMALL There are too few characters in the buffer for meaningful analysis (fewer than two bytes).
IS_TEXT_UNICODE_SIGNATURE The text contains the Unicode byte-order mark (BOM) 0xFEFF as its first character.
IS_TEXT_UNICODE_REVERSE_SIGNATURE The text contains the Unicode byte-reversed byte-order mark (Reverse BOM) 0xFFFE as its first character.
IS_TEXT_UNICODE_ILLEGAL_CHARS The text contains one of these Unicode-illegal characters: embedded Reverse BOM, UNICODE_NUL, CRLF (packed into one WORD), or 0xFFFF.
IS_TEXT_UNICODE_ODD_LENGTH The number of characters in the string is odd. A string of odd length cannot (by definition) be Unicode text.
IS_TEXT_UNICODE_NULL_BYTES The text contains null bytes, which indicate non-ASCII text.
IS_TEXT_UNICODE_UNICODE_MASK This flag constant is a combination of IS_TEXT_UNICODE_ASCII16, IS_TEXT_UNICODE_STATISTICS, IS_TEXT_UNICODE_CONTROLS, IS_TEXT_UNICODE_SIGNATURE.
IS_TEXT_UNICODE_REVERSE_MASK This flag constant is a combination of IS_TEXT_UNICODE_REVERSE_ASCII16, IS_TEXT_UNICODE_REVERSE_STATISTICS, IS_TEXT_UNICODE_REVERSE_CONTROLS, IS_TEXT_UNICODE_REVERSE_SIGNATURE.
IS_TEXT_UNICODE_NOT_UNICODE_MASK This flag constant is a combination of IS_TEXT_UNICODE_ILLEGAL_CHARS, IS_TEXT_UNICODE_ODD_LENGTH, and two currently unused bit flags.
IS_TEXT_UNICODE_NOT_ASCII_MASK This flag constant is a combination of IS_TEXT_UNICODE_NULL_BYTES and three currently unused bit flags.
If what you want is to tell the files apart by extension, you can get the name with GetFileName() and compare.