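I found an example online that fetches a web page, but it only retrieves the raw HTML. I need to split out the title, the page content, and the URL, and store them in a database. Has anyone done this? Any pointers would be appreciated.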
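// Requires: using System.IO; using System.Net; using System.Text;
private void button1_Click(object sender, EventArgs e)
{
    // PageUrl is assumed to be a string field declared on the form.
    PageUrl = textBox1.Text;
    WebRequest request = WebRequest.Create(PageUrl);
    // The using blocks release the connection and the stream even if reading throws.
    using (WebResponse response = request.GetResponse())
    using (Stream resStream = response.GetResponseStream())
    using (StreamReader sr = new StreamReader(resStream, Encoding.Default))
    {
        // Encoding.Default depends on the runtime and system settings,
        // so it may not match the charset the page declares.
        textBox2.Text = sr.ReadToEnd();
    }
}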
The remaining pieces can each be extracted with regular expressions.
http://www.cnblogs.com/overred/articles/846419.html
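As a rough sketch of the regex approach suggested above, the helper below pulls the title and the visible text out of an HTML string that has already been downloaded. The class name `PageParser` and its method names are hypothetical, not from the original post, and regular expressions are fragile on malformed markup, so treat this as a starting point rather than a complete solution. The URL needs no extraction: it is the string that was passed to `WebRequest.Create`.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original post.
public static class PageParser
{
    private const RegexOptions Opts =
        RegexOptions.IgnoreCase | RegexOptions.Singleline;

    // Returns the text inside the first <title> element, or "" if there is none.
    public static string ExtractTitle(string html)
    {
        Match m = Regex.Match(html, @"<title\b[^>]*>(.*?)</title\s*>", Opts);
        return m.Success ? WebUtility.HtmlDecode(m.Groups[1].Value).Trim() : "";
    }

    // Returns the visible text: head, scripts, styles, comments and tags removed.
    public static string ExtractText(string html)
    {
        // Prefer the <body> contents when a complete body element is present.
        Match body = Regex.Match(html, @"<body\b[^>]*>(.*)</body\s*>", Opts);
        string text = body.Success ? body.Groups[1].Value : html;

        // Drop elements whose contents are never shown as page text.
        text = Regex.Replace(text, @"<(script|style|head)\b[^>]*>.*?</\1\s*>", " ", Opts);
        text = Regex.Replace(text, @"<!--.*?-->", " ", Opts);

        // Strip the remaining tags, decode entities, and collapse whitespace.
        text = Regex.Replace(text, @"<[^>]+>", " ");
        text = WebUtility.HtmlDecode(text);
        return Regex.Replace(text, @"\s+", " ").Trim();
    }
}
```

In the click handler, this could be used by reading the response into a string variable first and then calling `PageParser.ExtractTitle` and `PageParser.ExtractText` on it. The three values (URL, title, text) can then be saved as separate fields.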
http://www.beijing-hyundai.com.cn/eting/eting.shtml
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
<title>北京现代汽车有限公司-市场活动</title>
<meta name="Keywords" content="市场活动,最新活动,精彩活动,活动回顾,免费检测活动,促销活动" />
<meta name="Description" content="公司最新及以往活动信息,包括促销活动,免费检测活动,形象代言选拔等信息。" />
<link rel="stylesheet" href="/css/header.css" type="text/css" media="all" />
<link rel="stylesheet" href="/css/eting_content.css" type="text/css" media="all" />
<script type="text/javascript" src="/js/nav.js"></script>
</head>
<body id="eting"> <div class="eting_item_right">
<h5 class="blue_a02"><a href="./hd/080714/index.html" target="_blank">回娘家—为中国加油!</a></h5>
<p>鼎沸中国,悦动八月</p>
<p>北京现代诚邀2008位客户回娘家!您将可参观北京现代先进的生产线,可切身感受首都八月浓郁的奥运氛围,游览大气磅礴的体育场馆,观摩激动人心的体育赛事……</p>
<p>让我们一起来,悦动八月!</p>
</p>
<p class="date">2008.08.06-2008.08.24</p>
</div>
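The sample page above declares `charset=gb2312` in its `<meta>` tag, while the download code decodes with `Encoding.Default`, which may be a different encoding and would then garble the Chinese text before any extraction runs. One possible way around this, sketched below under the assumption that the raw response bytes are available, is to read the declared charset first and decode with it. `CharsetSniffer` and its methods are hypothetical names, and the 4096-byte probe size is an arbitrary choice.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original post.
public static class CharsetSniffer
{
    // Returns the charset named in a <meta> tag, or "" when none is declared.
    public static string DetectCharset(string markup)
    {
        Match m = Regex.Match(
            markup,
            @"<meta\b[^>]*charset\s*=\s*[""']?\s*([\w-]+)",
            RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : "";
    }

    // Decodes raw page bytes with the declared charset, falling back to UTF-8.
    public static string Decode(byte[] raw)
    {
        // The <meta> tag itself is plain ASCII in ASCII-compatible encodings,
        // so an ASCII decode of the first bytes is enough to find it.
        string probe = Encoding.ASCII.GetString(raw, 0, Math.Min(raw.Length, 4096));
        string name = DetectCharset(probe);

        Encoding enc = Encoding.UTF8;
        if (name.Length > 0)
        {
            try { enc = Encoding.GetEncoding(name); }
            catch (ArgumentException) { /* unknown or unavailable charset: keep UTF-8 */ }
        }
        return enc.GetString(raw);
    }
}
```

To use it, copy the response stream into a `MemoryStream`, call `ToArray()`, and pass the bytes to `CharsetSniffer.Decode` instead of wrapping the stream in a `StreamReader`. Note that on .NET Core and later, legacy code pages such as gb2312 are only available after registering `CodePagesEncodingProvider.Instance` with `Encoding.RegisterProvider`; without that, the sketch silently falls back to UTF-8.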