Help: CStdioFile::ReadString returns garbled characters when reading a file in a UNICODE build

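I am porting a small program I wrote to a WM (Windows Mobile) phone, where the build environment is restricted to UNICODE.
Under UNICODE, ReadString keeps misbehaving, even though I have confirmed that the text file itself is fine.
Every line I read has garbled characters inserted into its content. After some digging I learned that this is a Unicode encoding issue. The approaches I found online mostly declare a separate temporary WCHAR variable and copy the result of CString.GetBuffer() into it,
and then copy it back into a CString. In my experiments the data was copied over verbatim, so the result did not change. I struggled for a long time without finding where the problem is.
It looks as if ReadString has already read the garbage into the content, so copying it around naturally gives the same result.
GetLength() is always the actual length + 2 (I have not checked every line, but it is roughly always +2). I have been at this for several days. Offering high points for help: I need a way to read the text line by line without the garbage.
For a beginner like me, the sudden switch to a UNICODE environment is really hard.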

Solutions »


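CFile works; I have not tried CStdioFile under UNICODE. Just keep in mind that lengths such as GetLength are counted in bytes, and that L"\r\n" is the line-break marker.
I use a plain file, read it as binary, and then convert.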
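CFile has no ReadString.
I am not very practiced at working out line-by-line reading myself, which is why I wanted to use CStdioFile's ReadString to put each line straight into a CString.
That keeps things clear in my head ^_^ Boss, an answer like that is agony for us beginners.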
Didn't I say to look for L"\r\n"? I can see you want code; wait a bit.
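As I recall, reading Unicode text from a file opened in text mode is problematic. Microsoft did a very poor job here, so the OP needs to open the file in typeBinary mode, Read the entire content into memory, and then cast it to UNICODE (the file must really be Unicode to begin with).
Because CStdioFile opens the file as typeText by default, it naturally has problems with Unicode files.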
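The "open as binary and read everything into memory" step can be sketched roughly as follows. This is only an illustration in portable C++ so that it compiles outside MFC; the helper name readAllBytes is made up for this example, and in an MFC program the same role would be played by CFile opened with CFile::typeBinary.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: read an entire file as raw bytes.
// Binary mode asks the runtime not to translate any bytes on the way in.
// Returns an empty vector if the file cannot be opened (or is empty).
std::vector<unsigned char> readAllBytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return {};
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(in),
                                      std::istreambuf_iterator<char>());
}
```

The result is a byte count, not a character count: a UTF-16 file holding N code units plus a BOM comes back as 2 * N + 2 bytes.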
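In a UNICODE build, open the file in binary mode (CFile::typeBinary).
1. If the text is ANSI, read it with CFile::Read into a char buffer, then convert it to WCHAR with MultiByteToWideChar.
2. If the text is UNICODE, read it with CFile::Read into a WCHAR buffer, and skip the two-byte "FF FE" file marker at the start of the file.
Then parse _T("\r\n") yourself, and the result is the same as what ReadString would give you.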
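For case 2, the "skip FF FE, then split on \r\n" idea might look like the sketch below. It is written in portable C++ with std::u16string rather than WCHAR and CString so that it can be compiled anywhere; the function names decodeUtf16le and splitLines are invented for this example, and it assumes the bytes are little-endian UTF-16 that were already read in binary mode.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: turn UTF-16LE bytes into 16-bit code units,
// skipping a leading FF FE byte-order mark if one is present.
std::u16string decodeUtf16le(const std::vector<unsigned char>& bytes)
{
    std::size_t pos = 0;
    if (bytes.size() >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE)
        pos = 2;  // skip the byte-order mark

    std::u16string text;
    for (; pos + 1 < bytes.size(); pos += 2)
        text.push_back(static_cast<char16_t>(bytes[pos] | (bytes[pos + 1] << 8)));
    return text;  // a trailing odd byte, if any, is ignored
}

// Hypothetical helper: split on "\r\n"; the terminator is not kept.
std::vector<std::u16string> splitLines(const std::u16string& text)
{
    std::vector<std::u16string> lines;
    std::size_t start = 0;
    while (start < text.size()) {
        std::size_t end = text.find(u"\r\n", start);
        if (end == std::u16string::npos) {
            lines.push_back(text.substr(start));  // last line, no terminator
            break;
        }
        lines.push_back(text.substr(start, end - start));
        start = end + 2;  // step over "\r\n"
    }
    return lines;
}
```

In an MFC UNICODE build each resulting line could then be copied into a CString; that last step is left out here because it depends on the MFC headers.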
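I don't quite understand how to use _T("\r\n") to put the text line by line into a CString.
Append and advance?
I gave it a quick try; str is still garbled.
Am I using CFile's Read incorrectly?
I also tried CFile::typeText...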
CFile fileRead;
CString tempStr;
fileRead.Open(_T("\123.txt"), CFile::modeRead | CFile::typeBinary);
wchar_t tmpchar[10];
fileRead.Read(tmpchar, 9);
CString str;
str.Format(_T("%s"), tmpchar);
AfxMessageBox(str);
fileRead.Close();
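A side note on the snippet above, offered as something to check rather than as a diagnosis. In a C++ string literal, "\123" is an octal escape, so a path written with a single backslash is not the path it appears to be. Also, Read takes a byte count, so 9 bytes is four and a half 2-byte wchar_t values on Windows, and nothing null-terminates the buffer before it is passed to %s. The escape behaviour can be shown with a tiny sketch (the variable names are made up for illustration):

```cpp
#include <cassert>
#include <string>

// "\123" is an octal escape (octal 123 = 0x53 = 'S'), so this is "S.txt".
const std::string singleBackslash = "\123.txt";

// A doubled backslash yields one literal backslash followed by "123.txt".
const std::string doubledBackslash = "\\123.txt";
```

The same rule applies inside _T("...") and L"..." literals.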
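Then your text is probably UNICODE. Read it into a char buffer and show it with MessageBoxA, and you will know.
CFile fileRead;
CString tempStr;
fileRead.Open(_T("\\123.txt"), CFile::modeRead | CFile::typeBinary);
char tmpchar[20] = {0}; // zero-filled, so the 19 bytes read stay null-terminated
fileRead.Read(tmpchar, 19);
::MessageBoxA(NULL, tmpchar, NULL, 0);
fileRead.Close();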
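I wrote that wrong just now: your text is probably ANSI. Read it into a char buffer and show it with MessageBoxA, and you will know. To use it in a CString, convert it with MultiByteToWideChar first.
CFile fileRead;
CString tempStr;
fileRead.Open(_T("\\123.txt"), CFile::modeRead | CFile::typeBinary);
char tmpchar[20] = {0}; // zero-filled, so the 19 bytes read stay null-terminated
fileRead.Read(tmpchar, 19);
::MessageBoxA(NULL, tmpchar, NULL, 0);
fileRead.Close();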
CStdioFile cannot handle UNICODE directly; please write your own extension.
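I took a look at CStdioFile:
virtual LPTSTR ReadString(LPTSTR lpsz, UINT nMax);
virtual BOOL ReadString(CString& rString);
Neither seems to have a Unicode incompatibility; both use TCHAR, don't they?
If it compiles on the PC side after you change _MBCS to _UNICODE, it should be compatible with WM. Check whether the problem is the format of your Unicode text file: does it start with the 2 bytes 0xff 0xfe?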
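Checking the first bytes of the file can be automated. The following is a minimal sketch in portable C++; the enum TextEncoding and the function detectBom are names invented for this example, and it only looks at the UTF-8 and UTF-16 byte-order marks (UTF-32 is not considered).

```cpp
#include <cassert>
#include <vector>

// Hypothetical result type for the check below.
enum class TextEncoding { Utf16LE, Utf16BE, Utf8Bom, NoBom };

// Inspect the first bytes of a file for a byte-order mark.
TextEncoding detectBom(const std::vector<unsigned char>& head)
{
    if (head.size() >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
        return TextEncoding::Utf8Bom;
    if (head.size() >= 2 && head[0] == 0xFF && head[1] == 0xFE)
        return TextEncoding::Utf16LE;   // the "FF FE" case mentioned in the thread
    if (head.size() >= 2 && head[0] == 0xFE && head[1] == 0xFF)
        return TextEncoding::Utf16BE;
    return TextEncoding::NoBom;         // possibly ANSI, or Unicode saved without a BOM
}
```

A file with no BOM is not necessarily ANSI, so NoBom only means that the encoding has to be decided some other way.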
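The file is confirmed to be UNICODE; in WinHex I see that it starts with FF FF...
ReadString does seem to have trouble with UNICODE. The reason appears to be that the file is always opened in text mode, which is an MFC design issue...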
I found a CStdioFileEx demo on CodeProject, but its rewritten ReadString doesn't seem to help (the program compiles, but at run time tempStr is empty):
CStdioFileEx fileReadEx;
CString tempStr;
fileReadEx.Open(_T("\\123.txt"), CStdioFile::modeRead);
fileReadEx.ReadString(tempStr);
AfxMessageBox(tempStr);
The CStdioFileEx override is as follows:
BOOL CStdioFileEx::ReadString(CString& rString)
{
    const int nMAX_LINE_CHARS = 4096;
    BOOL bReadData = FALSE;
    LPTSTR lpsz;
    int nLen = 0;
    wchar_t* pszUnicodeString = NULL;
    char* pszMultiByteString = NULL;
    int nChars = 0;

    try
    {
        // If at position 0, discard byte-order mark before reading
        if (!m_pStream || (GetPosition() == 0 && m_bIsUnicodeText))
        {
            wchar_t cDummy;
            // Read(&cDummy, sizeof(_TCHAR));
            Read(&cDummy, sizeof(wchar_t));
        }

        // If compiled for Unicode
#ifdef _UNICODE
        if (m_bIsUnicodeText)
        {
            // Do standard stuff - Unicode to Unicode. Seems to work OK.
            bReadData = CStdioFile::ReadString(rString);
        }
        else
        {
            pszUnicodeString = new wchar_t[nMAX_LINE_CHARS];
            pszMultiByteString = new char[nMAX_LINE_CHARS];

            // Initialise to something safe
            memset(pszUnicodeString, 0, sizeof(wchar_t) * nMAX_LINE_CHARS);
            memset(pszMultiByteString, 0, sizeof(char) * nMAX_LINE_CHARS);

            // Read the string
            bReadData = (NULL != fgets(pszMultiByteString, nMAX_LINE_CHARS, m_pStream));

            if (bReadData)
            {
                // Convert multibyte to Unicode, using the specified code page
                nChars = GetUnicodeStringFromMultiByteString(pszMultiByteString, pszUnicodeString, nMAX_LINE_CHARS, m_nFileCodePage);

                if (nChars > 0)
                {
                    rString = (CString)pszUnicodeString;
                }
            }
        }
#else
        if (!m_bIsUnicodeText)
        {
            // Do standard stuff -- read ANSI in ANSI
            bReadData = CStdioFile::ReadString(rString);

            // Get the current code page
            UINT nLocaleCodePage = GetCurrentLocaleCodePage();

            // If we got it OK...
            if (nLocaleCodePage > 0)
            {
                // if file code page does not match the system code page, we need to do a double conversion!
                if (nLocaleCodePage != (UINT)m_nFileCodePage)
                {
                    int nStringBufferChars = rString.GetLength() + 1;

                    pszUnicodeString = new wchar_t[nStringBufferChars];

                    // Initialise to something safe
                    memset(pszUnicodeString, 0, sizeof(wchar_t) * nStringBufferChars);

                    // Convert to Unicode using the file code page
                    nChars = GetUnicodeStringFromMultiByteString(rString, pszUnicodeString, nStringBufferChars, m_nFileCodePage);

                    // Convert back to multibyte using the system code page
                    // (This doesn't really confer huge advantages except to avoid "mangling" of non-convertible special
                    // characters. So, if a file in the E.European code page is displayed on a system using the
                    // western European code page, special accented characters which the system cannot display will be
                    // replaced by the default character (a hash or something), rather than being incorrectly mapped to
                    // other, western European accented characters).
                    if (nChars > 0)
                    {
                        // Calculate how much we need for the MB buffer (it might be larger)
                        nStringBufferChars = GetRequiredMultiByteLengthForUnicodeString(pszUnicodeString, nLocaleCodePage);
                        pszMultiByteString = new char[nStringBufferChars];

                        nChars = GetMultiByteStringFromUnicodeString(pszUnicodeString, pszMultiByteString, nStringBufferChars, nLocaleCodePage);
                        rString = (CString)pszMultiByteString;
                    }
                }
            }
        }
        else
        {
            pszUnicodeString = new wchar_t[nMAX_LINE_CHARS];

            // Initialise to something safe
            memset(pszUnicodeString, 0, sizeof(wchar_t) * nMAX_LINE_CHARS);

            // Read as Unicode, convert to ANSI
            // Bug fix by Dennis Jeryd 06/07/2003: initialise bReadData
            bReadData = (NULL != fgetws(pszUnicodeString, nMAX_LINE_CHARS, m_pStream));

            if (bReadData)
            {
                // Calculate how much we need for the multibyte string
                int nRequiredMBBuffer = GetRequiredMultiByteLengthForUnicodeString(pszUnicodeString, m_nFileCodePage);
                pszMultiByteString = new char[nRequiredMBBuffer];

                nChars = GetMultiByteStringFromUnicodeString(pszUnicodeString, pszMultiByteString, nRequiredMBBuffer, m_nFileCodePage);

                if (nChars > 0)
                {
                    rString = (CString)pszMultiByteString;
                }
            }
        }
#endif

        // Then remove end-of-line character if in Unicode text mode
        if (bReadData)
        {
            // Copied from FileTxt.cpp but adapted to Unicode and then adapted for end-of-line being just '\r'.
            nLen = rString.GetLength();
            if (nLen > 1 && rString.Mid(nLen-2) == sNEWLINE)
            {
                rString.GetBufferSetLength(nLen-2);
            }
            else
            {
                lpsz = rString.GetBuffer(0);
                if (nLen != 0 && (lpsz[nLen-1] == _T('\r') || lpsz[nLen-1] == _T('\n')))
                {
                    rString.GetBufferSetLength(nLen-1);
                }
            }
        }
    }
    // Ensure we always delete in case of exception
    catch(...)
    {
        if (pszUnicodeString) delete [] pszUnicodeString;
        if (pszMultiByteString) delete [] pszMultiByteString;
        throw;
    }

    if (pszUnicodeString) delete [] pszUnicodeString;
    if (pszMultiByteString) delete [] pszMultiByteString;

    return bReadData;
}
Try this one: http://www.pcdog.com/edu/ren/2005/08/n017093.html