I'm building a stock analysis system and need to fetch historical quote data for some stocks.
http://quote.eastmoney.com/flash/flashk.html?c=600651&m=1&n=%e9%a3%9e%e4%b9%90%e9%9f%b3%e5%93%8d
After capturing the traffic with HttpWatch, I found this URL:
http://hq2fls.eastmoney.com/EM_Quote2010PictureApplication/Flash.aspx?Type=CHD&ID=6006511&lastnum=300&r=0.12576666055247188
But the data that comes back is:
x湐}i8?z'?M园嵉x%媻_>t酻熹姸娾?F??R\rx磕€??q斴D騙z.?D?r?eBT?y淇?耽W琶豉鐔~絕f鉕?$蛆瞑}借?q??婵4*l魊!H?yh縚/1)瑞 錝b栖糫
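and so on, a whole pile of garbage.
I considered that it might be an encoding problem and tried many encodings (UTF-8, GB2312, and so on), but the result is still garbage.
HTTP/1.1 200 OK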
Cache-Control: private
Content-Length: 87929
Content-Type: text/html
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Date: Tue, 11 Oct 2011 04:48:20 GMT
Those are the HTTP headers that came back; they do not specify an encoding.
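One thing worth checking before trying more text encodings: the dump above starts with `x`, which is byte 0x78, a common first byte of a zlib stream. That is only a hint, not proof. Below is a minimal Python sketch of a header check based on the zlib format (RFC 1950). The function name `looks_like_zlib` is made up for illustration, and the sample bytes are generated locally rather than taken from the server.

```python
import zlib


def looks_like_zlib(body: bytes) -> bool:
    """Heuristic: does `body` start with a plausible zlib (RFC 1950) header?"""
    if len(body) < 2:
        return False
    cmf, flg = body[0], body[1]
    # The low 4 bits of the first byte give the compression method (8 = deflate),
    # and the first two bytes read as a big-endian number must be a multiple of 31.
    return (cmf & 0x0F) == 8 and ((cmf << 8) | flg) % 31 == 0


# Locally generated sample, NOT the real response body.
sample = zlib.compress(b"2011-10-11,9.77,9.71,9.87,9.42,16612936,159970736\n")
print(sample[:1])                  # first byte of a zlib stream
print(looks_like_zlib(sample))     # compressed bytes
print(looks_like_zlib(b"<html>"))  # ordinary text
```

If the check passes on the raw response bytes, the body is probably compressed binary data rather than text, which would explain why no character encoding makes it readable.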
I'd recommend a tool: Charles, for capturing traffic. It supports viewing AMF messages, but it doesn't support Windows 7...
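The official zlib site (www.zlib.net) seems to be blocked by the firewall, though. If you don't want to get around it, you can download from the address below:
http://www.componentace.com/download/download.php?editionid=25
After decompression, the data is all in this format: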
1990-12-21,336.30,336.30,336.30,336.30,100,1000
1990-12-24,336.30,353.20,353.20,336.30,1700,23000
1990-12-26,389.50,389.50,389.50,389.50,400,6000
1990-12-27,393.40,393.40,393.40,393.40,6600,104000
1990-12-28,397.30,397.30,397.30,397.30,3000,48000
...
2011-09-28,8.95,9.39,9.45,8.89,17452081,160561392
2011-09-29,9.24,9.06,9.36,9.00,11440596,104881744
2011-09-30,9.16,9.39,9.74,9.10,23676018,224456224
2011-10-10,9.39,9.59,9.70,9.32,16371824,156133968
2011-10-11,9.77,9.71,9.87,9.42,16612936,159970736
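There are a bit over 5k records in total, probably daily open and close prices and the like; the exact meaning still needs more study.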
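For anyone who would rather do this from a script, here is a rough Python sketch of the decompress-and-parse step using the standard `zlib` module. It assumes the response body is a single zlib stream containing ASCII text laid out like the rows above; neither assumption is confirmed for the real server. The names `Row`, `parse_rows` and `decode_body` are invented for this example. Because the column meanings are not settled, the fields are kept as generic `prices` and `counts` tuples. In the rows shown, the third decimal is never smaller than the other three and the fourth is never larger, which would fit high/low, but that is only a guess.

```python
import zlib
from datetime import date, datetime
from typing import NamedTuple


class Row(NamedTuple):
    day: date
    prices: tuple  # four decimal fields; exact meaning not confirmed
    counts: tuple  # two integer fields; exact meaning not confirmed


def parse_rows(text: str) -> list:
    """Parse lines shaped like 'YYYY-MM-DD,p,p,p,p,n,n' into Row tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.strip().split(",")
        if len(parts) != 7:
            continue  # skip blank lines and anything that doesn't match the sample layout
        day = datetime.strptime(parts[0], "%Y-%m-%d").date()
        prices = tuple(float(p) for p in parts[1:5])
        counts = tuple(int(p) for p in parts[5:7])
        rows.append(Row(day, prices, counts))
    return rows


def decode_body(body: bytes) -> list:
    # Assumption: the whole body is one zlib stream of ASCII text.
    return parse_rows(zlib.decompress(body).decode("ascii"))


# Round-trip demo on two of the rows quoted above (compressed locally,
# not fetched from the server).
sample_text = (
    "1990-12-21,336.30,336.30,336.30,336.30,100,1000\n"
    "2011-10-11,9.77,9.71,9.87,9.42,16612936,159970736\n"
)
rows = decode_body(zlib.compress(sample_text.encode("ascii")))
print(len(rows), rows[-1].day, rows[-1].prices, rows[-1].counts)
```

With real data, `body` would be the raw bytes of the HTTP response, read without any text decoding. If `zlib.decompress` raises an error, the stream may carry a prefix or use a different wrapper, which this sketch does not handle.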
Analyzing the Baidu Index page with HTTP Analyzer shows that it uses AMF.