<span class="prodList_gridViewCell">
<!--DynamicProductGrid.ascx Product -->
<div style="margin-bottom: 5px">
<div style="height: 130px; overflow: hidden;">
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductImageWrapLabel">
<div style="text-align: center;">
<a id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductImageLink" title="AHAVA Refreshing Cleansing Gel" href="/p-16557-ahava-refreshing-cleansing-gel.aspx"><img title="AHAVA Refreshing Cleansing Gel" src="//skincare-img.skinstore.com/resources/dynamic/store/indeximages/AH175-cleansing-gel.jpg"(image path) style="border-width:0px;" /></a></div>
</span>
</div>
<br>
<div id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_titleDiv" style="overflow:hidden;margin-bottom:8px;height:45px;">
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductTitleWrapLabel"><a id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductTitleLink" href="/p-16557-ahava-refreshing-cleansing-gel.aspx">AHAVA Refreshing Cleansing Gel(product name)</a></span>
</div>
<div id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_shortDescriptionDiv" style="overflow:hidden;margin-bottom:5px;display:block;height:52px;">
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductDescriptionLabel" class="subNav">A mild cleanser that washes away makeup and impurities to leave skin clean and refreshed.(product description)</span>
</div>
<div id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ratingDiv" style="height:20px;overflow:hidden;display:block;">
<a id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductReviewsImageLink" href="http://reviews.skinstore.com/7554/16557/reviews.htm"></a>
</div>
<div id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_badgeDiv" style="height:20px;overflow:hidden;margin-top:10px;display:none;">
</div>
<div style="margin-top: 10px; overflow: hidden;">
<span style="float: left">
<strong>$20.00(price)
</strong>(price)
</span>
</div>
<div style="margin-top: 5px; overflow: hidden;">
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductUnitLabel" class="subNav">3.4oz(size)</span>
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductSKULabel" class="subNav"> <font color="black">|</font> AH175(product ID)</span>
</div>
<div class="clr">
</div>
<div style="margin-top: 12px; overflow: hidden; height: 25px;">
<span style="float: left;">
<a id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductAddToCartImageLink" class="searchAddBtn" href="/checkout/shoppingCart.aspx?cProductID=16557"><img src="/resources/nav/nav_add_cart.gif" style="border-width:0px;" /></a></span> <span style="float: right">
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductStockStatusLabel" class="stockStatus">In Stock(new product or not)</span></span>
</div>
</div>
<!--/DynamicProductGrid.ascx Product -->
</span>
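The page contains many span tags like this one. I want to extract the content that sits immediately before each parenthesis inside the tag.
From the img src, get //skincare-img.skinstore.com/resources/dynamic/store/indeximages/AH175-cleansing-gel.jpg
From the strong tag, get 20.00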
AHAVA Refreshing Cleansing Gel
======================================================================
Match 2 (line 26, column 109, length 89):
A mild cleanser that washes away makeup and impurities to leave skin clean and refreshed.
======================================================================
Match 3 (line 44, column 33, length 6):
$20.00
======================================================================
Match 4 (line 50, column 16, length 5):
3.4oz
======================================================================
Match 5 (line 52, column 56, length 17):
AH175
======================================================================
Match 6 (line 64, column 21, length 8):
In Stock
======================================================================
// html holds your page's HTML source.
StringBuilder sb = new StringBuilder();
// Match text that starts right after a '>' and runs up to the next '(' without crossing a tag.
foreach (Match ma in Regex.Matches(html, @"(?is)(?<=\>)[^<(]+?(?=\()"))
{
    sb.Append(ma.Value + "\n");
}
Console.Write(sb.ToString());
// str holds the page HTML. The pattern matches the text directly in front of each "(...)" annotation,
// allowing the value to be wrapped in an optional quote, as the img src is.
Regex reg = new Regex(@"(?<=(['""]?))[^()>'""]+(?=\1\([^)]+\))");
foreach (Match m in reg.Matches(str))
    Console.WriteLine(m.Value);
/*
//skincare-img.skinstore.com/resources/dynamic/store/indeximages/AH175-cleansing-gel.jpg
AHAVA Refreshing Cleansing Gel
A mild cleanser that washes away makeup and impurities to leave skin clean and refreshed.
$20.00
3.4oz
AH175
In Stock*/
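So you want the non-Chinese content inside the span tags, and if a span contains other tags, those tags are not extracted.
Is that right?
I just want the content inside this span tag that is marked with parentheses.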
<span class="prodList_gridViewCell">
<!--DynamicProductGrid.ascx Product -->
<div style="margin-bottom: 5px">
<div style="height: 130px; overflow: hidden;">
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductImageWrapLabel">
<div style="text-align: center;">
<a id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductImageLink" title="AHAVA Refreshing Cleansing Gel" href="/p-16557-ahava-refreshing-cleansing-gel.aspx"><img title="AHAVA Refreshing Cleansing Gel" src="(//skincare-img.skinstore.com/resources/dynamic/store/indeximages/AH175-cleansing-gel.jpg)" style="border-width:0px;" /></a></div>
</span>
</div>
<br>
<div id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_titleDiv" style="overflow:hidden;margin-bottom:8px;height:45px;">
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductTitleWrapLabel"><a id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductTitleLink" href="/p-16557-ahava-refreshing-cleansing-gel.aspx">(AHAVA Refreshing Cleansing Gel)</a></span>
</div>
<div id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_shortDescriptionDiv" style="overflow:hidden;margin-bottom:5px;display:block;height:52px;">
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductDescriptionLabel" class="subNav">(A mild cleanser that washes away makeup and impurities to leave skin clean and refreshed.)</span>
</div>
<div id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ratingDiv" style="height:20px;overflow:hidden;display:block;">
<a id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductReviewsImageLink" href="http://reviews.skinstore.com/7554/16557/reviews.htm"></a>
</div>
<div id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_badgeDiv" style="height:20px;overflow:hidden;margin-top:10px;display:none;">
</div>
<div style="margin-top: 10px; overflow: hidden;">
<span style="float: left">
<strong>$(20.00)
</strong> </span>
</div>
<div style="margin-top: 5px; overflow: hidden;">
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductUnitLabel" class="subNav">(3.4oz)</span>
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductSKULabel" class="subNav"> <font color="black">|</font> (AH175)</span>
</div>
<div class="clr">
</div>
<div style="margin-top: 12px; overflow: hidden; height: 25px;">
<span style="float: left;">
<a id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductAddToCartImageLink" class="searchAddBtn" href="/checkout/shoppingCart.aspx?cProductID=16557"><img src="/resources/nav/nav_add_cart.gif" style="border-width:0px;" /></a></span> <span style="float: right">
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductStockStatusLabel" class="stockStatus">(In Stock)</span></span>
</div>
</div>
<!--/DynamicProductGrid.ascx Product -->
</span>
Extract the content enclosed in parentheses.
// str holds the page HTML. Group 1 is captured repeatedly inside the outer span,
// so every parenthesised value is read from the group's Captures collection.
Regex reg = new Regex(@"(?is)(?<=<span[^>]*?>)(?:[^(]*\(([^)]+)\))*.*?(?=</span>)");
foreach (Capture c in reg.Match(str).Groups[1].Captures)
    Console.WriteLine(c.Value);
/*
//skincare-img.skinstore.com/resources/dynamic/store/indeximages/AH175-cleansing-gel.jpg
AHAVA Refreshing Cleansing Gel
A mild cleanser that washes away makeup and impurities to leave skin clean and refreshed.
20.00
3.4oz
AH175
In Stock*/
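If the wanted values are the only parenthesised text in the markup, a much simpler pattern may be enough. The following is only a minimal sketch under that assumption; the class name ParenExtractor and the shortened sample string are made up for illustration and are not part of the page.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ParenExtractor
{
    // Returns the text between each pair of literal parentheses, in document order.
    public static List<string> Extract(string html)
    {
        var values = new List<string>();
        // \( and \) match the literal parentheses; group 1 captures what lies between them.
        foreach (Match m in Regex.Matches(html, @"\(([^()]+)\)"))
            values.Add(m.Groups[1].Value);
        return values;
    }

    static void Main()
    {
        // Shortened stand-in for the annotated markup above (an assumption, not the full page).
        string str = "<strong>$(20.00)\n</strong> <span class=\"subNav\">(3.4oz)</span>" +
                     "<span class=\"subNav\"><font color=\"black\">|</font> (AH175)</span>";

        foreach (string v in Extract(str))
            Console.WriteLine(v); // prints 20.00, 3.4oz and AH175, one per line
    }
}
```

Note that this also picks up any other parentheses on the page, for example in inline CSS or script, so it only fits markup where the annotations are the sole parenthesised text.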
<span class="prodList_gridViewCell">
<!--DynamicProductGrid.ascx Product -->
<div style="margin-bottom: 5px">
<div style="height: 130px; overflow: hidden;">
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductImageWrapLabel">
<div style="text-align: center;">
<a id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductImageLink" title="CHI Air Expert Tourmaline Ceramic Nylon Round Brush - Small" href="/p-15645-chi-air-expert-tourmaline-ceramic-nylon-round-brush-small.aspx"><img title="CHI Air Expert Tourmaline Ceramic Nylon Round Brush - Small" src="//skincare-img.skinstore.com/resources/dynamic/store/indeximages/CE016-nylon-round.jpeg" style="border-width:0px;" /></a></div>
</span>
</div>
<br>
<div id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_titleDiv" style="overflow:hidden;margin-bottom:8px;height:45px;">
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductTitleWrapLabel"><a id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductTitleLink" href="/p-15645-chi-air-expert-tourmaline-ceramic-nylon-round-brush-small.aspx">CHI Air Expert Tourmaline Ceramic Nylon Round Brush - Small</a></span>
</div>
<div id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_shortDescriptionDiv" style="overflow:hidden;margin-bottom:5px;display:block;height:52px;">
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductDescriptionLabel" class="subNav">Get professional results and cut styling time in half with this ceramic round brush.</span>
</div>
<div id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ratingDiv" style="height:20px;overflow:hidden;display:block;">
<a id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductReviewsImageLink" href="http://reviews.skinstore.com/7554/15645/reviews.htm"></a>
</div>
<div id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_badgeDiv" style="height:20px;overflow:hidden;margin-top:10px;display:none;">
</div>
<div style="margin-top: 10px; overflow: hidden;">
<span style="float: left">
<strong>$14.00</strong>
</span>
</div>
<div style="margin-top: 5px; overflow: hidden;">
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductUnitLabel" class="subNav"> </span>
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductSKULabel" class="subNav"> <font color="black">|</font> CE016</span>
</div>
<div class="clr">
</div>
<div style="margin-top: 12px; overflow: hidden; height: 25px;">
<span style="float: left;">
<a id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductAddToCartImageLink" class="searchAddBtn" href="/checkout/shoppingCart.aspx?cProductID=15645"><img src="/resources/nav/nav_add_cart.gif" style="border-width:0px;" /></a></span> <span style="float: right">
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductStockStatusLabel" class="stockStatus">In Stock</span></span>
</div>
</div>
<!--/DynamicProductGrid.ascx Product -->
</span>
This is the data fetched from the actual page. I want the same values you just extracted.
<div style="text-align: center;">
<a id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductImageLink" title="CHI Air Expert Tourmaline Ceramic Nylon Round Brush - Small" href="/p-15645-chi-air-expert-tourmaline-ceramic-nylon-round-brush-small.aspx">
<img title="CHI Air Expert Tourmaline Ceramic Nylon Round Brush - Small" src="//skincare-img.skinstore.com/resources/dynamic/store/indeximages/CE016-nylon-round.jpeg" style="border-width:0px;" /></a></div>
</span>
// From this span, get the src of the img.
<span style="float: left">
<strong>$14.00</strong>
</span>
// From this span, get the $14.00 inside strong.
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductSKULabel"
class="subNav"> <font color="black">|</font> CE016</span>
// From this span, get CE016.
<span id="ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductStockStatusLabel" class="stockStatus">In Stock</span></span>
// From this span, get In Stock.
What pattern do these follow?
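If you force all of this into a single regex, it has to handle many branches and does a lot of backtracking, so it is inefficient.
You can first extract the content of each span.
Then check whether it contains an img; if so, take the src.
Check whether it contains a strong; if so, take the content of the strong.
Check whether it contains tags you do not need, such as the font tag above:
<font color="black">|</font> CE016
If it does, replace them with an empty string.
There still has to be a pattern,
and you need to work it out yourself.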
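The steps above could be sketched roughly as follows. This is only an illustration, not a tested scraper: the class SpanFields and the helper ExtractValue are hypothetical names, and the code assumes you have already isolated the inner HTML of a single span.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

static class SpanFields
{
    // Hypothetical helper: reduces the inner HTML of one span to the value of interest.
    public static string ExtractValue(string inner)
    {
        // Step 1: if there is an <img>, return its src attribute.
        Match img = Regex.Match(inner, @"<img\b[^>]*?\bsrc=""([^""]*)""", RegexOptions.IgnoreCase);
        if (img.Success) return img.Groups[1].Value;

        // Step 2: otherwise, if there is a <strong>, return its text.
        Match strong = Regex.Match(inner, @"<strong\b[^>]*>(.*?)</strong>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        if (strong.Success) return Clean(strong.Groups[1].Value);

        // Step 3: drop unwanted elements such as <font>...</font>, then clean what is left.
        string text = Regex.Replace(inner, @"<font\b[^>]*>.*?</font>", "",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return Clean(text);
    }

    // Strips remaining tags, decodes entities such as &nbsp; and trims whitespace.
    static string Clean(string s)
    {
        s = Regex.Replace(s, @"<[^>]+>", "");
        s = WebUtility.HtmlDecode(s).Replace('\u00A0', ' ');
        return s.Trim();
    }

    static void Main()
    {
        Console.WriteLine(ExtractValue("<a href=\"x\"><img title=\"t\" src=\"//host/a.jpg\" /></a>")); // //host/a.jpg
        Console.WriteLine(ExtractValue("\n<strong>$14.00</strong>\n"));                                // $14.00
        Console.WriteLine(ExtractValue("&nbsp;<font color=\"black\">|</font>&nbsp;CE016"));             // CE016
    }
}
```

Each check is a small, cheap pattern, which avoids the heavy branching and backtracking of one combined expression.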
Thanks!
string input = @"<span class=""prodList_gridViewCell"">
<!--DynamicProductGrid.ascx Product -->
<div style=""margin-bottom: 5px"">
<div style=""height: 130px; overflow: hidden;"">
<span id=""ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductImageWrapLabel"">
<div style=""text-align: center;"">
<a id=""ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductImageLink"" title=""AHAVA Refreshing Cleansing Gel"" href=""/p-16557-ahava-refreshing-cleansing-gel.aspx""><img title=""AHAVA Refreshing Cleansing Gel"" src=""(//skincare-img.skinstore.com/resources/dynamic/store/indeximages/AH175-cleansing-gel.jpg)"" style=""border-width:0px;"" /></a></div>
</span>
</div>
<br>
<div id=""ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_titleDiv"" style=""overflow:hidden;margin-bottom:8px;height:45px;"">
<span id=""ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductTitleWrapLabel""><a id=""ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductTitleLink"" href=""/p-16557-ahava-refreshing-cleansing-gel.aspx"">(AHAVA Refreshing Cleansing Gel)</a></span>
</div>
<div id=""ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_shortDescriptionDiv"" style=""overflow:hidden;margin-bottom:5px;display:block;height:52px;"">
<span id=""ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductDescriptionLabel"" class=""subNav"">(A mild cleanser that washes away makeup and impurities to leave skin clean and refreshed.)</span>
</div>
<div id=""ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ratingDiv"" style=""height:20px;overflow:hidden;display:block;"">
<a id=""ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductReviewsImageLink"" href=""http://reviews.skinstore.com/7554/16557/reviews.htm""></a>
</div>
<div id=""ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_badgeDiv"" style=""height:20px;overflow:hidden;margin-top:10px;display:none;"">
</div>
<div style=""margin-top: 10px; overflow: hidden;"">
<span style=""float: left"">
<strong>$(20.00)
</strong> </span>
</div>
<div style=""margin-top: 5px; overflow: hidden;"">
<span id=""ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductUnitLabel"" class=""subNav"">(3.4oz)</span>
<span id=""ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductSKULabel"" class=""subNav""> <font color=""black"">|</font> (AH175)</span>
</div>
<div class=""clr"">
</div>
<div style=""margin-top: 12px; overflow: hidden; height: 25px;"">
<span style=""float: left;"">
<a id=""ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductAddToCartImageLink"" class=""searchAddBtn"" href=""/checkout/shoppingCart.aspx?cProductID=16557""><img src=""/resources/nav/nav_add_cart.gif"" style=""border-width:0px;"" /></a></span> <span style=""float: right"">
<span id=""ProductsSearchLayout_DynamicProductGrid1_ProductsDataList_ctl00_ProductStockStatusLabel"" class=""stockStatus"">(In Stock)</span></span>
</div>
</div>
<!--/DynamicProductGrid.ascx Product -->
</span>
"
;
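// Match each run of text that sits between two tags.
string pattern = @"(?is)(?<=(\<[^>]+?\>)+?)[^<]+?(?=(\</?[^>]+?>)+?)";
// Remove &nbsp; entities first so they are not reported as text.
foreach (Match ma in Regex.Matches(input.Replace("&nbsp;", ""), pattern))
{
    // Skip matches that consist only of whitespace between tags.
    if (!string.IsNullOrEmpty(ma.Value.Trim()))
        Console.WriteLine(ma.Value);
}
Console.Read();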
//
// str holds the page HTML. Group 1 captures an img src; group 2 captures text that follows
// any tag other than font and is not just whitespace or &nbsp; before the next tag.
Regex reg = new Regex(@"(?is)<img[^>]*?src=""([^""]+)""[^>]*?>|(?<=<(?!font\b)[^>]*?>)(?!(?:\s*|(?:&nbsp;)*)<)([^<>]+?)(?=</?[^>]+?>)");
foreach (Match m in reg.Matches(str))
{
    if (m.Groups[1].Success)
        Console.WriteLine(m.Groups[1].Value);
    else
        Console.WriteLine(m.Groups[2].Value);
}
/*
//skincare-img.skinstore.com/resources/dynamic/store/indeximages/CE016-nylon-round.jpeg
CHI Air Expert Tourmaline Ceramic Nylon Round Brush - Small
Get professional results and cut styling time in half with this ceramic round brush.
$14.00
CE016
/resources/nav/nav_add_cart.gif
In Stock*/
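Since the page holds many of these product cells, one option is to cut the page into cells first and then pull the fields out of each cell. The sketch below assumes every cell is wrapped in the two DynamicProductGrid.ascx comments seen in the samples and that element ids keep the suffixes shown there; ProductCells, Split and TextById are hypothetical names, and the shortened ids in the demo are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ProductCells
{
    // Each cell in the samples sits between these two HTML comments.
    static readonly Regex Cell = new Regex(
        @"<!--DynamicProductGrid\.ascx Product -->(.*?)<!--/DynamicProductGrid\.ascx Product -->",
        RegexOptions.Singleline);

    // Returns the markup of every product cell found in the page.
    public static List<string> Split(string html)
    {
        var cells = new List<string>();
        foreach (Match m in Cell.Matches(html))
            cells.Add(m.Groups[1].Value);
        return cells;
    }

    // Text of the <a> or <span> whose id ends with the given suffix, e.g. "ProductTitleLink".
    public static string TextById(string cell, string idSuffix)
    {
        Match m = Regex.Match(cell,
            @"id=""[^""]*" + Regex.Escape(idSuffix) + @"""[^>]*>(.*?)</(?:a|span)>",
            RegexOptions.Singleline);
        return m.Success ? Regex.Replace(m.Groups[1].Value, @"<[^>]+>", "").Trim() : "";
    }

    static void Main()
    {
        // Two shortened cells; the Grid_ctl00/Grid_ctl01 ids are illustrative assumptions.
        string html =
            "<span class=\"prodList_gridViewCell\"><!--DynamicProductGrid.ascx Product -->" +
            "<a id=\"Grid_ctl00_ProductTitleLink\" href=\"/p-16557.aspx\">AHAVA Refreshing Cleansing Gel</a>" +
            "<span id=\"Grid_ctl00_ProductStockStatusLabel\" class=\"stockStatus\">In Stock</span>" +
            "<!--/DynamicProductGrid.ascx Product --></span>" +
            "<span class=\"prodList_gridViewCell\"><!--DynamicProductGrid.ascx Product -->" +
            "<a id=\"Grid_ctl01_ProductTitleLink\" href=\"/p-15645.aspx\">CHI Round Brush</a>" +
            "<span id=\"Grid_ctl01_ProductStockStatusLabel\" class=\"stockStatus\">In Stock</span>" +
            "<!--/DynamicProductGrid.ascx Product --></span>";

        foreach (string cell in Split(html))
            Console.WriteLine(TextById(cell, "ProductTitleLink") + " | " +
                              TextById(cell, "ProductStockStatusLabel"));
    }
}
```

Splitting on the comment markers sidesteps the nested span tags inside each cell, which a plain `<span ...>.*?</span>` pattern would cut off at the first inner `</span>`.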
Hi, my page has several span tags like this one. How do I handle that?
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Billion Dollar Brows - 20% Off Coupon FAB20</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<meta name="keywords" content="Billion Dollar Brows, Billion Dollar Brows Products, Billion Dollar Brows Reviews"></meta>
<meta name="description" content="Billion Dollar Brows and 6000+ other Skin care products at SkinStore.com: 115% Price Protection, Free Shipping, Billion Dollar Brows 20% Auto Ship Savings, Free Samples."></meta>
<link href="/css/template.css" type="text/css" rel="stylesheet" />
<link href="/css/search.css" type="text/css" rel="stylesheet" />
<link href="/css/department.css" type="text/css" rel="stylesheet" />
<link href="/css/brand.css" type="text/css" rel="stylesheet" />
<script type="text/javascript" src="/js/slider.js"></script>
<script type="text/javascript" src="/sharedCode/global.js"></script>
</head>
<body>
<div id="mainContainer" class="container_16">
<div class="grid_16" style="position: relative;">
<!-- title and breadcrumbs -->
<div id="titleBlock">
<span id="BreadCrumb1" style="font-weight:normal;"> <a href="/index.aspx">Home</a> : <a href="/brands/brands.aspx">Brands</a> : <a href="billion-dollar-brows.aspx">Billion Dollar Brows</a></span>
<br />
<br />
Right now I only want what is inside all of the <span class="prodList_gridViewCell">...........</span> tags on the page