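Why do I get garbled output when I fetch this URL with the TIdHTTP component, while other ordinary URLs work fine? How do I fix it? Thanks in advance!
The URL is:
http://bill.finance.sina.com.cn/bill/trade_item.php?stock_code=sh580013&pages=0&time=1218809515
The program is: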
unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, IdBaseComponent, IdComponent, IdTCPConnection, IdTCPClient, IdHTTP,
  StdCtrls, ExtCtrls, WebAdapt, WebComp, WinINet, ComCtrls, OleCtrls, SHDocVw;

type
  TForm1 = class(TForm)
    Panel1: TPanel;
    Button1: TButton;
    Memo1: TMemo;
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure TForm1.Button1Click(Sender: TObject);
var
  url: string;
  IdHTTP: TIdHTTP;
  stream: TMemoryStream;
begin
  url := 'http://bill.finance.sina.com.cn/bill/trade_item.php?stock_code=sh580013&pages=0&time=1218809515';
  IdHTTP := TIdHTTP.Create(nil);
  stream := TMemoryStream.Create;
  try
    IdHTTP.Get(url, stream);
    stream.SaveToFile('c:\1.txt');
  finally
    stream.Free;
    IdHTTP.Free;
  end;
end;

end.
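I used to fetch pages like this with a WebBrowser control instead:
begin
  // IHTMLDocument2 is declared in the MSHTML unit; "as" performs a proper QueryInterface
  Result := (WB.Document as IHTMLDocument2).body.outerHTML;
end;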
Try IdHTTP1.Request.AcceptCharset := 'gb2312'; and let us know. Waiting for your result.
I added
IdHTTP.Request.AcceptCharset := 'gb2312';
but I still get garbled data....
If you can decompress the response with a gzip component, that is the best option. If not, send a request that tells the server you do not support gzip:
procedure TForm1.Button1Click(Sender: TObject);
var
  url: string;
  IdHTTP: TIdHTTP;
  stream: TMemoryStream;
begin
  url := 'http://bill.finance.sina.com.cn/bill/trade_item.php?stock_code=sh580013&pages=0&time=1218809515';
  IdHTTP := TIdHTTP.Create(nil);
  IdHTTP.Request.AcceptEncoding := 'aaa'; // added: a dummy value, so gzip is not advertised
  IdHTTP.ProtocolVersion := pv1_0;        // added: use HTTP/1.0
  stream := TMemoryStream.Create;
  try
    IdHTTP.Get(url, stream);
    stream.SaveToFile('c:\1.txt');
  finally
    stream.Free;
    IdHTTP.Free;
  end;
end;

end.
I don't have a development environment here, so I'll try it when I'm back at the office.
HTTP/1.1 200 OK
Server: nginx/0.5.19
Date: Wed, 27 Aug 2008 06:10:40 GMT
Content-Type: text/html
Connection: close
Content-Encoding: deflate // this is the compression encoding
Content-Length: 7204
It looks like you will need a third-party ziplib component, or DecompressGZipStream from Indy 10 or later.
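The headers above can also be read from TIdHTTP itself, which makes it easy to check which encoding a given response really used. A minimal sketch, assuming Indy's Response.ContentEncoding, Response.ContentType and Response.RawHeaders properties; LogResponseHeaders is a hypothetical helper name, not something posted in this thread:

```pascal
// Needs Classes, SysUtils and IdHTTP in the unit's uses clause.

// Hypothetical helper: fetch AUrl and copy the response headers into ALog,
// so the Content-Encoding of the reply can be inspected.
procedure LogResponseHeaders(const AUrl: string; ALog: TStrings);
var
  Http: TIdHTTP;
  Body: TMemoryStream;
begin
  Http := TIdHTTP.Create(nil);
  Body := TMemoryStream.Create;
  try
    Http.Get(AUrl, Body);
    // Parsed header values as Indy exposes them
    ALog.Add('Content-Encoding: ' + Http.Response.ContentEncoding);
    ALog.Add('Content-Type: ' + Http.Response.ContentType);
    ALog.Add('Body size in bytes: ' + IntToStr(Body.Size));
    // Full header list exactly as received
    ALog.AddStrings(Http.Response.RawHeaders);
  finally
    Body.Free;
    Http.Free;
  end;
end;
```

For example, calling LogResponseHeaders(url, Memo1.Lines) from Button1Click would list the headers in the memo. If Content-Encoding comes back as deflate or gzip, the saved body is compressed bytes, which is why it looks like garbage in a text editor.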
I'm on D7 and have done this with ziplib before. Decompression worked, so you can follow the same approach.
Good luck.
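I have installed it, but after an hour or two I still can't work out how to use this component (VCLUnzip) to decompress the data.
If it's convenient, could you take a look at how to decompress the data above automatically?
Thanks!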
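Sometimes it works, but strangely, a moment later it fails again.....
I don't know what is going on. What I did was set IdHTTP's Compressor property to IdCompressorZLibEx1.
I'll try something else.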
Getting the algorithm right still took quite a while.
Previously I was decompressing gzip content. This time it is deflate, which lacks the file header, so
my old approach kept running into a "data error".
I used the zLib component
(http://www.2ccc.com/article.asp?articleid=4269). Here is a working code snippet:
uses ZLibEx;

procedure TForm1.Button3Click(Sender: TObject);
var
  url: string;
  IdHTTP: TIdHTTP;
  stream, out_stream: TMemoryStream;
begin
  url := 'http://bill.finance.sina.com.cn/bill/trade_item.php?stock_code=sh580013&pages=0&time=1218809515';
  IdHTTP := TIdHTTP.Create(nil);
  stream := TMemoryStream.Create;
  out_stream := TMemoryStream.Create;
  try
    IdHTTP.Get(url, stream);
    stream.Position := 0; // rewind before decompressing
    ZLibEx.ZDecompressStream2(stream, out_stream, -15);
    stream.SaveToFile('c:\1.txt');     // the compressed bytes as received
    out_stream.SaveToFile('c:\2.txt'); // this is the HTML content we want
  finally
    stream.Free;
    out_stream.Free;
    IdHTTP.Free;
  end;
end;
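Indy 10 is somewhat more convenient, but if the built-in functionality falls short I would rather find a third-party component.
Installing Indy 10 changes the existing network development environment too much, adds factors you can't control,
and can easily affect earlier projects.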
The code you gave solved it.
But why is the third argument of ZLibEx.ZDecompressStream2 -15? Why a negative number? What is this parameter for?
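I saw that line used in zlib and simply copied it to try it out.
Replacing -15 with -10 also works. If you want to understand it, read through the source. I'm lazy too :)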
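On the -15 question: in zlib's inflateInit2 the windowBits argument is the base-2 logarithm of the history window (8 to 15), and a negative value means "raw deflate, no zlib header or trailer" with a window of 2^|windowBits| bytes. So -15 is raw deflate with the full 32 KB window. A smaller magnitude such as -10 only succeeds when the data never refers back further than that window, so it can fail on other responses. The sketch below assumes ZDecompressStream2 forwards its third argument to inflateInit2 unchanged; HasZlibHeader and InflateHttpDeflate are hypothetical helper names, and the header test is a heuristic:

```pascal
// Needs Classes, SysUtils and ZLibEx in the unit's uses clause.

// Hypothetical helper: True if the first two bytes look like a zlib
// (RFC 1950) header: compression method 8 and a 16-bit value divisible by 31.
function HasZlibHeader(Data: TMemoryStream): Boolean;
var
  P: PByteArray; // declared in SysUtils
begin
  Result := False;
  if Data.Size < 2 then
    Exit;
  P := Data.Memory;
  Result := ((P^[0] and $0F) = 8) and
            ((Integer(P^[0]) * 256 + P^[1]) mod 31 = 0);
end;

// Hypothetical helper: decode a "Content-Encoding: deflate" body that may be
// either zlib-wrapped or raw deflate (the server in this thread sends raw).
procedure InflateHttpDeflate(Raw: TMemoryStream; Decoded: TStream);
begin
  Raw.Position := 0;
  if HasZlibHeader(Raw) then
    ZDecompressStream2(Raw, Decoded, 15)   // positive: expect the zlib header
  else
    ZDecompressStream2(Raw, Decoded, -15); // negative: raw deflate, 32 KB window
end;
```

With this in place, the Button3Click code could call InflateHttpDeflate(stream, out_stream) instead of hard-coding -15. It also explains the earlier "data error": a decoder that expects a zlib or gzip header rejects a raw deflate stream at the first bytes.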