<div class="tb-detail-hd">
    <h3><a href="http://detail.tmall.com/venus/spu_detail.htm?spu_id=136191697&amp;no_switch=1&amp;default_item_id=13133052500" target="_blank">【五折】Jack Jones杰克琼斯连帽含羊毛双层毛衣B浅211425001104</a></h3>
    <p>                <span>
                                                                                                  举报此商品(<a href="http://support.taobao.com/myservice/suit/accuse_punish.jhtml?auction_num_id=13133052500&amp;display_type=3">举报</a>)
</span>
    </p>
</div>
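I want to extract the h3 inside the div whose class is tb-detail-hd. My current pattern is: reg = @"(?is)<div class=""tb-detail-hd""><h3>(<a[^>]*>)?([^<]*)(</a>)?</h3></div>"; but the result is empty.
If I write reg = "<h3>(<a[^>]*>)?([^<]*)(</a>)?</h3>"; instead,
it does extract something, but the page contains other h3 tags and those get extracted as well. Any help is appreciated.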

Solutions »

  1.   

    <div class=""tb-detail-hd""><h3>
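    There is whitespace (a line break and indentation) before the <h3> in the page, so this part of the pattern cannot match.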
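
To illustrate the whitespace point, here is a minimal sketch. `WhitespaceDemo`, `Sample`, `Strict`, `Tolerant`, and `ExtractTitle` are hypothetical names, and the sample HTML is a shortened stand-in for the markup in the question (the title and `href` are placeholders). It assumes the page keeps this div/h3 layout and that the title contains no nested tags.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceDemo
{
    // Shortened stand-in for the question's HTML: note the line break
    // and indentation between the <div> tag and the <h3> tag.
    public const string Sample =
        "<div class=\"tb-detail-hd\">\n" +
        "    <h3><a href=\"#\" target=\"_blank\">Sample title</a></h3>\n" +
        "    <p><span>report</span></p>\n" +
        "</div>";

    // The question's pattern: <h3> must follow the <div> tag immediately,
    // and </div> must follow </h3> immediately.
    public const string Strict =
        @"(?is)<div class=""tb-detail-hd""><h3>(<a[^>]*>)?([^<]*)(</a>)?</h3></div>";

    // Same idea with \s* between the tags and without the trailing </div>.
    public const string Tolerant =
        @"(?is)<div\s+class=""tb-detail-hd"">\s*<h3>\s*(<a[^>]*>)?([^<]*)(</a>)?\s*</h3>";

    // Returns the text captured by group 2, or "" when the pattern does not match.
    public static string ExtractTitle(string html, string pattern)
    {
        Match m = Regex.Match(html, pattern);
        return m.Success ? m.Groups[2].Value.Trim() : "";
    }
}
```

Calling `WhitespaceDemo.ExtractTitle(WhitespaceDemo.Sample, WhitespaceDemo.Strict)` finds nothing because of the whitespace, while the `Tolerant` pattern reaches the text inside the h3.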
      

  2.   

    string s = @"<div class=""tb-detail-hd"">
        <h3><a href=""http://detail.tmall.com/venus/spu_detail.htm?spu_id=136191697&amp;no_switch=1&amp;default_item_id=13133052500"" target=""_blank"">【五折】Jack Jones杰克琼斯连帽含羊毛双层毛衣B浅211425001104</a></h3>
        <p>                <span>
                                                                                                                                                        举报此商品(<a href=""http://support.taobao.com/myservice/suit/accuse_punish.jhtml?auction_num_id=13133052500&amp;display_type=3"">举报</a>)
                                                        </span>
        </p>
    </div>";
    Match match = Regex.Match(s, @"(?is)<div\s+class=""tb-detail-hd"">\s*(<h3>.+?</h3>).*?</div>");
    Response.Write(Server.HtmlEncode(match.Groups[1].Value));
      

  3.   


    Regex re = new Regex("(?is)<div\\s*class=\"tb-detail-hd\">[^<]+<h3>(.*?)</h3>.*?</div>", RegexOptions.None);
      

  4.   


    A small change to the OP's pattern also works:
    Regex re = new Regex("(?is)<div\\s*class=\"tb-detail-hd\">\\s*<h3>(<a[^>]*>)?[^<]*(</a>)?</h3>.*?</div>", RegexOptions.None);
      

  5.   

      public static string get_tianmao(string content, int type)
            {
                string result = "";
                string reg = "";
                switch (type)
                {
                    case 0: return "";
                    case 1: reg = @"J_ImgBooth\b[^<>]*?\bsrc[\s\t\r\n]*=[\s\t\r\n]*[""']?[\s\t\r\n]*(?<imgUrl>[^\s\t\r\n""'<>]*)[^<>]*?/?[\s\t\r\n]*>"; break;
                    //case 2: reg = "<div class=\"tb-detail-hd\"><h3>(<a[^>]*>)?([^<]*)(</a>)?</h3></div>"; break;
                    case 2: reg = @"(?is)<div\s+class=""tb-detail-hd"">\s*(<h3>.+?</h3>).*?</div>"; break;
                    case 3: reg = "J_StrPrice[^>]*>([^<>]*)(</)"; break;
                }
                string regex = reg;
                Regex re = new Regex(regex);
                MatchCollection matches = re.Matches(content);
                System.Collections.IEnumerator enu = matches.GetEnumerator();
                switch (type)
                {
                    case 0: return "";
                    case 1:
                        while (enu.MoveNext() && enu.Current != null)
                        {
                            Match match = (Match)(enu.Current);
                            result += match.Groups["imgUrl"];
                        } break;
                    case 2:
                        while (enu.MoveNext() && enu.Current != null)
                        {
                            Match match = (Match)(enu.Current);
                            result += match.Groups[2];
                        } break;
                    case 3:
                        while (enu.MoveNext() && enu.Current != null)
                        {
                            Match match = (Match)(enu.Current);
                            result += match.Groups[1];
                        } break;
                }
                return result;
            }
    Still not working. Is the problem in this method? The result is still empty.
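
One possible cause, offered as a reading of the code shown rather than a confirmed diagnosis: the `case 2` pattern defines a single capture group, `(<h3>.+?</h3>)`, but the loop reads `match.Groups[2]`. In .NET, asking for a group number that the pattern does not define returns an unsuccessful group whose value is the empty string, so `result` stays empty even when the match itself succeeds. The sketch below uses hypothetical names (`GroupIndexDemo`, `Sample`, `ReadGroup`) and placeholder HTML.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupIndexDemo
{
    // Placeholder HTML with the same layout as the question's sample.
    public const string Sample =
        "<div class=\"tb-detail-hd\">\n" +
        "    <h3><a href=\"#\" target=\"_blank\">Sample title</a></h3>\n" +
        "    <p><span>report</span></p>\n" +
        "</div>";

    // The case 2 pattern from get_tianmao, unchanged: it has one capture group.
    public const string Pattern =
        @"(?is)<div\s+class=""tb-detail-hd"">\s*(<h3>.+?</h3>).*?</div>";

    // Returns the value of the requested group number from the first match.
    // An undefined group number yields an empty string instead of an exception.
    public static string ReadGroup(string html, int groupNumber)
    {
        Match m = Regex.Match(html, Pattern);
        return m.Groups[groupNumber].Value;
    }
}
```

Here `ReadGroup(Sample, 1)` holds the whole `<h3>...</h3>` element and `ReadGroup(Sample, 2)` is empty, so with this pattern the loop would need `Groups[1]`, and the tags would still have to be stripped to get the bare title.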
      

  6.   


    string strMatch = Regex.Match(strHtml, @"(?<=<div class=""tb-detail-hd"">\s*)<h3>(<a[^>]*>)?([^<]*)(</a>)?</h3>", RegexOptions.IgnoreCase).Value;
    return strMatch;
      

  7.   

    Change it slightly:
    reg = @"(?is)<div class=""tb-detail-hd""><h3>(<a[^>]*>)?(.*?)(</a>)?</h3></div>";
      

  8.   


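    I tried both of yours. The first one still returns "" and the second one returns "</a>". All I want is the text under that div, which is the Tmall product title. I am testing with this URL: http://detail.tmall.com/item.htm?id=3372931960&is_b=1&cat_id=50025829&key_words=&spm=1008.1000032.1000012.16 Please help, it just does not work.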
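
The `</a>` result is consistent with reading `Groups[2]` from reply 4's pattern, where the text part `[^<]*` has no parentheses, so group 1 is the opening `<a ...>` tag and group 2 is `</a>`. This is an inference from the patterns posted in this thread, not something the thread confirms. A minimal sketch of one way to avoid depending on group numbers is a named group; `NamedGroupDemo`, `ExtractTitle`, and the group name `title` are hypothetical, and the sketch assumes the title contains no nested tags.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // (?:...) groups do not capture, so "title" is the only capture group.
    // \s* between the tags tolerates line breaks and indentation.
    private static readonly Regex TitleRegex = new Regex(
        @"<div\s+class=""tb-detail-hd"">\s*<h3>\s*(?:<a[^>]*>)?(?<title>[^<]*)(?:</a>)?\s*</h3>",
        RegexOptions.IgnoreCase);

    // Returns the text inside the h3 of the tb-detail-hd div, or "" if absent.
    public static string ExtractTitle(string html)
    {
        Match m = TitleRegex.Match(html);
        return m.Success ? m.Groups["title"].Value.Trim() : "";
    }
}
```

Because the pattern is anchored on the div, other h3 tags on the page are not picked up, and `m.Groups["title"]` keeps working if optional groups are added or removed later.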
      

  9.   

    <div class="tb-detail-hd">
        <h3><a target="_blank" href="http://detail.tmall.com/venus/spu_detail.htm?spu_id=47663902&no_switch=1&default_item_id=3372931960">包快递2012春季新款圆头鞋平跟鞋浅口单鞋女牛津鞋大码女鞋娃娃鞋</a></h3>
        <p>                <span>
                                                                                                      举报此商品(<a href="http://support.taobao.com/myservice/suit/accuse_punish.jhtml?auction_num_id=3372931960&display_type=3">举报</a>)
    </span>
        </p>
    </div>
    Is the value you want to get: 包快递2012春季新款圆头鞋平跟鞋浅口单鞋女牛津鞋大码女鞋娃娃鞋?
      

  10.   

    http://www.deerchao.net/tutorials/regex/regex.htm#negativelookaround
    I only just learned this myself.
      

  11.   

    string str = @"<div class=""tb-detail-hd"">
        <h3><a target=""_blank"" href=""http://detail.tmall.com/venus/spu_detail.htm?spu_id=47663902&no_switch=1&default_item_id=3372931960"">包快递2012春季新款圆头鞋平跟鞋浅口单鞋女牛津鞋大码女鞋娃娃鞋</a></h3>
        <p>                <span>
                                                                                                                                                        举报此商品(<a href=""http://support.taobao.com/myservice/suit/accuse_punish.jhtml?auction_num_id=3372931960&display_type=3"">举报</a>)
                                                        </span>
        </p>
    </div>";
    string resultStr = string.Empty;
    Regex re = new Regex("(?is)<div\\s*class=\"tb-detail-hd\">[^<]+<h3><a[^>]+>(.*?)</a></h3>.*?</div>", RegexOptions.None);
    MatchCollection mc = re.Matches(str);
    foreach (Match ma in mc)
    {
        resultStr = ma.Groups[1].Value;
    }
    Console.WriteLine(resultStr);

    Console.ReadLine();