Progress bar problem when fetching a web page with WebRequest

public static string GetUrlData(string url, ProgressBar Prog)
{
    HttpWebResponse res = null;
    HttpWebRequest req = (HttpWebRequest)HttpWebRequest.Create("http://www.baidu.com/");
    res = (HttpWebResponse)req.GetResponse();
    long totalBytes = res.ContentLength;
    Stream input = null;
    input = res.GetResponseStream();
    int totalDownloadedByte = 0;
    byte[] by = new byte[10245];
    string Content = null;
    Encoding encoder = Encoding.GetEncoding("GB2312");
    do
    {
        // Read the next chunk and decode it immediately.
        totalDownloadedByte = input.Read(by, 0, (int)by.Length);
        Content += encoder.GetString(by, 0, totalDownloadedByte);
        // Advance the progress bar by the number of bytes just read.
        if (Prog.Value + totalDownloadedByte <= Prog.Maximum)
        {
            Prog.Value += totalDownloadedByte;
            Application.DoEvents();
        }
        else
        {
            Prog.Value = Prog.Maximum;
        }
    }
    while (totalDownloadedByte != 0);
    res.Close();
    return Content;
}
The output often contains a small amount of garbled text, for example ?? marks, or something like ?崭窭己蟊噶?. 99% of the text is fine. How should I write this so that the output is 100% correct?

Solutions »

No. The errors show up in a different place every time.
If I drop the progress bar and change the code to:
HttpWebResponse res = null;
HttpWebRequest req = (HttpWebRequest)HttpWebRequest.Create("http://www.baidu.com/");
res = (HttpWebResponse)req.GetResponse();
Stream input = null;
input = res.GetResponseStream();
StreamReader sr = new StreamReader(input, Encoding.GetEncoding("gb2312"));
Content = sr.ReadToEnd();
res.Close();
return Content;
then the output has no errors.
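The StreamReader version works because StreamReader decodes through a stateful decoder internally, so a character split across two reads is still assembled correctly. If the progress bar is still wanted, one possible approach is to wrap the response stream in a pass-through stream that counts bytes. This is only a sketch: `CountingStream`, `ReaderWithProgress` and the `onProgress` callback are hypothetical names, not part of the original code or of the .NET library.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: forwards reads to an inner stream and reports
// the running total of bytes read through a callback.
public sealed class CountingStream : Stream
{
    private readonly Stream inner;
    private readonly Action<long> onProgress;
    private long total;

    public CountingStream(Stream inner, Action<long> onProgress)
    {
        this.inner = inner;
        this.onProgress = onProgress;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int read = inner.Read(buffer, offset, count);
        if (read > 0)
        {
            total += read;
            if (onProgress != null) onProgress(total);
        }
        return read;
    }

    // The remaining members are required by Stream; this wrapper is read-only.
    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { return total; }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}

public static class ReaderWithProgress
{
    // StreamReader does the decoding; the wrapper only observes byte counts.
    public static string ReadText(Stream input, Encoding encoding, Action<long> onProgress)
    {
        using (var reader = new StreamReader(new CountingStream(input, onProgress), encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

In the question's code the callback could set Prog.Value, capped at Prog.Maximum. StreamReader reads ahead into its own buffer, so the reported count can run slightly ahead of the text decoded so far.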
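Changing
byte[] by = new byte[10245];
to
byte[] by = new byte[90245];
still produces errors.
The smaller the buffer, the more errors appear; the larger the buffer, the fewer. But the output is still never completely correct.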
It is probably an encoding problem. Change
Encoding encoder = Encoding.GetEncoding("GB2312");
to:
Encoding encoder = Encoding.GetEncoding("UTF-8");
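It is not an encoding problem, and GB2312 is the correct encoding. With UTF-8 everything comes out garbled. The problem is in these two lines:
totalDownloadedByte = input.Read(by, 0, (int)by.Length);
Content += encoder.GetString(by, 0, totalDownloadedByte);
Is there a better way to write this?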
The part highlighted in red is where it goes wrong.
I suggest reading all of the data first and only then calling encoder.GetString().
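The suggestion to read everything first and decode once could look roughly like the following. It is a minimal sketch, not the poster's actual code: `BufferedDownload`, `ReadAllThenDecode` and the `onProgress` callback are made-up names, and the 8192-byte buffer size is arbitrary. Because the bytes are only collected inside the loop, the buffer size no longer affects the decoded text.

```csharp
using System;
using System.IO;
using System.Text;

public static class BufferedDownload
{
    // Collects every chunk into a MemoryStream, reporting the running byte
    // count, and decodes only once after the stream has ended.
    public static string ReadAllThenDecode(Stream input, Encoding encoding, Action<long> onProgress)
    {
        byte[] buffer = new byte[8192];
        long total = 0;
        using (var collected = new MemoryStream())
        {
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                collected.Write(buffer, 0, read);
                total += read;
                if (onProgress != null) onProgress(total);
            }
            // All bytes are present, so no character can be cut in half.
            return encoding.GetString(collected.ToArray());
        }
    }
}
```

With the question's code, this would be called with res.GetResponseStream() and Encoding.GetEncoding("GB2312"), and the callback would update Prog.Value. The cost is that the whole response is held in memory before any text is available.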
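I have run into this error too. The main cause is that the encoded length of a character is not fixed.
You have to watch out for this whenever you use encodings such as GB2312 or UTF-8, because encoder.GetString() converts one buffer at a time. When the encoded length varies, the last byte of one buffer and the first byte of the next buffer may be two parts of the same character, and the encoder cannot recognize either part correctly. So in this situation you either receive all of the data before processing it, or write your own function that splits the data according to the encoding rules. The latter is a lot of work, though, and runs inefficiently.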
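For what it is worth, .NET also offers a Decoder object (obtained from Encoding.GetDecoder()) that remembers incomplete trailing bytes between calls, which may save writing a splitting function by hand. The sketch below contrasts per-chunk GetString with a Decoder. The names `ChunkedDecoding`, `DecodeNaive` and `DecodeStream` are invented for illustration, and the usage note relies on UTF-8 so it needs no extra code-page support; the same split happens with GB2312's two-byte characters.

```csharp
using System;
using System.IO;
using System.Text;

public static class ChunkedDecoding
{
    // Decodes each chunk independently, as in the question. A character that
    // straddles a chunk boundary is decoded as two broken halves.
    public static string DecodeNaive(byte[] data, int chunkSize, Encoding encoding)
    {
        var sb = new StringBuilder();
        for (int offset = 0; offset < data.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, data.Length - offset);
            sb.Append(encoding.GetString(data, offset, count));
        }
        return sb.ToString();
    }

    // Decodes chunk by chunk with a stateful Decoder, which carries an
    // incomplete trailing byte sequence over to the next call.
    public static string DecodeStream(Stream input, Encoding encoding, int chunkSize, Action<long> onProgress)
    {
        Decoder decoder = encoding.GetDecoder();
        byte[] bytes = new byte[chunkSize];
        char[] chars = new char[encoding.GetMaxCharCount(chunkSize)];
        var sb = new StringBuilder();
        long total = 0;
        int read;
        while ((read = input.Read(bytes, 0, bytes.Length)) > 0)
        {
            int charCount = decoder.GetChars(bytes, 0, read, chars, 0);
            sb.Append(chars, 0, charCount);
            total += read;
            if (onProgress != null) onProgress(total);
        }
        // Flush so that any bytes still pending at end of stream are handled.
        int tail = decoder.GetChars(bytes, 0, 0, chars, 0, true);
        sb.Append(chars, 0, tail);
        return sb.ToString();
    }
}
```

As a usage example, take the UTF-8 bytes of "你好，世界" (three bytes per character) with a chunk size of 4: DecodeNaive returns text containing replacement characters, while DecodeStream returns the original string. This also fits the observation that a smaller buffer gives more errors, since more chunk boundaries mean more chances to cut a character in half.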