<div class="content">
<div class="stream-item-header">
<a class="account-group js-user-profile-link" href="/babybamboo_hk">
<img class="avatar js-action-profile-avatar " src="https://si0.twimg.com/profile_images/1111616402/111_normal.jpg" alt="babybamboo" data-user-id="183241857"/>
<strong class="fullname js-action-profile-name">babybamboo</strong>
<span>‏</span>
<span class="username js-action-profile-name">@babybamboo_hk</span>
</a>
</div>
<p class="bio ">
想唔想玩盡、食盡香港,而又不花太多錢?BabyBamboo可以一次過滿足你兩個願望!與商戶合作,BabyBamboo以團購方式為會員爭取最低的價格,每日精選至少一項優惠,日日新鮮,款款唔同,其中包括高質素的餐廳、咖啡廳、戲院、美容院、酒店…以及城中各式新潮的消遣熱點,應有盡有!
</p>
</div>
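Could someone please help? From the HTML source above I only want to extract these three values: the image URL https://si0.twimg.com/profile_images/1111616402/111_normal.jpg, the name babybamboo, and the bio text beginning 想唔想玩盡、食盡香港,而又不花太多錢? How should the regular expression be written in C#?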
using System;
using System.Text.RegularExpressions;
string html = @"your HTML source";
// Named groups: src = image URL, alt = display name, pcontent = bio text.
string pattern = @"(?is)<div\s*class=""content"">[\s\S]*?<img[^>]*?src=""(?<src>[^""]*?)""[^>]*?alt=""(?<alt>[^""]*?)""[^>]*?/>[\s\S]*?<p\s*class=""bio "">(?<pcontent>.*?)</p>";
Match match = new Regex(pattern).Match(html); // run the match once and reuse it
Console.WriteLine(match.Groups["src"].Value);
Console.WriteLine(match.Groups["alt"].Value);
Console.WriteLine(match.Groups["pcontent"].Value);
string pattern = @"(?is)<div\s*class=""content"">[\s\S]*?<img[^>]*?src=""(?<src>[^""]*?)""[^>]*?alt=""(?<alt>[^""]*?)""[^>]*?/>[\s\S]*?<p\s*class=""bio "">(?<pcontent>.*?)</p>";
Regex reg = new Regex(pattern);
Match match = reg.Match(html); // match once, then read each named group
MessageBox.Show(match.Groups["src"].Value);
MessageBox.Show(match.Groups["alt"].Value);
MessageBox.Show(match.Groups["pcontent"].Value);
The HTML is the source shown above, saved as 1.txt on the C: drive, but I still cannot extract the content I want.
string htmlCode = html.GetHTML("https://twitter.com/following", "gb2312");
//string pattern=@"<p\s*class='bio\s*'>(?<content>[^<]*)</p>";
Regex Comment = new Regex(@"(?is)<div\s*class=""content"">[\s\S]*?<img[^>]*?src=""(?<src>[^""]*?)""[^>]*?alt=""(?<alt>[^""]*?)""[^>]*?/>[\s\S]*?<p\s*class=""bio "">(?<pcontent>.*?)</p>");
System.Text.RegularExpressions.MatchCollection co = Comment.Matches(htmlCode);
// Iterate over the matches
foreach (Match m in co)
{
    // m.Value is the full text of the current match
    richTextBox2.AppendText(m.Value + "\r\n");
}
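class Html
{
    public string GetHTML(string url, string encoding)
    {
        System.Net.WebClient web = new System.Net.WebClient();
        byte[] buffer = web.DownloadData(url);
        return Encoding.GetEncoding(encoding).GetString(buffer);
    }
}
That is my code. Filtering data this way has worked for me before: I first download the site's HTML source and then filter it with the regular expression you wrote. I also tried the method you posted, and it still finds nothing.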
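When looping over a MatchCollection, each Match carries its own named groups, so the three values can be read per match. The following is only a minimal sketch: the class name ProfileExtractor, the method ExtractAll, and the slightly looser \s+ and bio\s* in the pattern are my own assumptions, not part of the answer posted above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the thread): collects every profile block in the HTML.
public static class ProfileExtractor
{
    // Same shape as the pattern posted above; (?is) = ignore case, and . also matches newlines.
    private static readonly Regex ProfilePattern = new Regex(
        @"(?is)<div\s+class=""content"">.*?<img[^>]*?src=""(?<src>[^""]*)""[^>]*?alt=""(?<alt>[^""]*)""[^>]*?/>.*?<p\s+class=""bio\s*"">(?<pcontent>.*?)</p>");

    // Returns one string[] { src, alt, bio } per matched profile block.
    public static List<string[]> ExtractAll(string html)
    {
        var results = new List<string[]>();
        foreach (Match m in ProfilePattern.Matches(html))
        {
            results.Add(new[]
            {
                m.Groups["src"].Value,
                m.Groups["alt"].Value,
                m.Groups["pcontent"].Value.Trim() // drop the line breaks around the bio text
            });
        }
        return results;
    }
}
```

It could be called as ProfileExtractor.ExtractAll(htmlCode), appending string.Join(" | ", item) to richTextBox2 for each item. If the list comes back empty, it may be that the downloaded page is not the same markup as the saved sample (for example a login page, or text decoded with a mismatched encoding), so writing htmlCode to a file and comparing it with the sample could help narrow things down.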
string htmlCode = html.GetHTML("https://twitter.com/following", "gb2312");
This is the value of htmlCode:
<div class="stream-item-header">
<a class="account-group js-user-profile-link" href="/babybamboo_hk">
<img class="avatar js-action-profile-avatar " src="https://si0.twimg.com/profile_images/1111616402/111_normal.jpg" alt="babybamboo" data-user-id="183241857"/>
<strong class="fullname js-action-profile-name">babybamboo</strong>
<span>‏</span>
<span class="username js-action-profile-name">@babybamboo_hk</span>
</a>
</div>
<p class="bio ">
想唔想玩盡、食盡香港,而又不花太多錢?BabyBamboo可以一次過滿足你兩個願望!與商戶合作,BabyBamboo以團購方式為會員爭取最低的價格,每日精選至少一項優惠,日日新鮮,款款唔同,其中包括高質素的餐廳、咖啡廳、戲院、美容院、酒店…以及城中各式新潮的消遣熱點,應有盡有!
</p>
</div>
</div>
From this markup I only want to extract the following text:
想唔想玩盡、食盡香港,而又不花太多錢?BabyBamboo可以一次過滿足你兩個願望!與商戶合作,BabyBamboo以團購方式為會員爭取最低的價格,每日精選至少一項優惠,日日新鮮,款款唔同,其中包括高質素的餐廳、咖啡廳、戲院、美容院、酒店…以及城中各式新潮的消遣熱點,應有盡有!
How should the regular expression for this be written?
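For the bio text alone, a narrower pattern anchored only on the p tag with class "bio " may be enough, since it does not depend on the surrounding div. This is a minimal sketch under the assumption that the paragraph contains no nested closing p tag; BioExtractor and ExtractBios are illustrative names, not from the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the thread): pulls the text of every bio paragraph.
public static class BioExtractor
{
    // Singleline lets . cross the line breaks inside the paragraph;
    // bio\s* tolerates the trailing space in class="bio ".
    private static readonly Regex BioPattern = new Regex(
        @"<p\s+class=""bio\s*""[^>]*>(?<bio>.*?)</p>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static List<string> ExtractBios(string html)
    {
        var bios = new List<string>();
        foreach (Match m in BioPattern.Matches(html))
        {
            // Trim removes the newlines that surround the text in the markup.
            bios.Add(m.Groups["bio"].Value.Trim());
        }
        return bios;
    }
}
```

BioExtractor.ExtractBios(htmlCode) would then return one string per bio paragraph found, or an empty list when the markup contains none.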