I want to build a one-click sitemap generator: a program that walks the site's structure and collects its links.

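This is basically the same as a search engine crawler, right? I tried writing something to get the hyperlinks on a page:

        using System;
        using System.Collections.Generic;
        using System.Net;
        using System.Text;
        using System.Text.RegularExpressions;

        private static Dictionary<string, string> GetUrlDict(string url)
        {
            Dictionary<string, string> dict = new Dictionary<string, string>();
            string html;
            // WebClient is disposable, so release it once the page has been downloaded.
            using (WebClient wc = new WebClient())
            {
                wc.Encoding = Encoding.UTF8;
                html = wc.DownloadString(url);
            }
            // Captures the href value ("url") and the anchor text ("text") of each <a> element.
            Regex reg = new Regex(@"(?is)<a(?:(?!href=).)*href=(['""]?)(?<url>[^""\s>]*)\1[^>]*>(?<text>(?:(?!</?a\b).)*)</a>");
            Uri baseUri = new Uri(url);
            foreach (Match m in reg.Matches(html))
            {
                string href = m.Groups["url"].Value;
                if (href == string.Empty) continue;
                // Resolve relative links against the page URL instead of concatenating strings;
                // absolute links (http or https) pass through unchanged.
                Uri absolute;
                if (!Uri.TryCreate(baseUri, href, out absolute)) continue;
                // Skip non-web links such as mailto: or javascript:.
                if (absolute.Scheme != Uri.UriSchemeHttp && absolute.Scheme != Uri.UriSchemeHttps) continue;
                href = absolute.AbsoluteUri;
                if (!dict.ContainsKey(href))
                {
                    dict.Add(href, m.Groups["text"].Value);
                }
            }
            return dict;
        }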
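Scan the home page first, then scan every link found there, then scan the links found on those pages in turn; going three or four levels deep should be about enough. External links have to be excluded, of course. The sitemap itself is just an XML file, and the format should not need to differ between Baidu and Google.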
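The level-by-level scan described above could be sketched as a breadth-first walk with a depth limit. This is only a rough sketch, not a finished crawler: `SiteCrawler`, `Crawl`, and the `getLinks` callback are names made up for illustration, and `getLinks` is assumed to return absolute URLs for a page (for example, the keys of a link dictionary). External links are filtered by comparing the host with the start page's host, which is one possible rule among several.

```csharp
using System;
using System.Collections.Generic;

public static class SiteCrawler
{
    // Breadth-first walk from startUrl, following links at most maxDepth levels deep.
    // getLinks is a hypothetical callback returning the absolute URLs found on a page.
    public static List<string> Crawl(string startUrl, int maxDepth,
                                     Func<string, IEnumerable<string>> getLinks)
    {
        string host = new Uri(startUrl).Host;
        HashSet<string> seen = new HashSet<string> { startUrl };
        List<string> result = new List<string> { startUrl };
        Queue<KeyValuePair<string, int>> queue = new Queue<KeyValuePair<string, int>>();
        queue.Enqueue(new KeyValuePair<string, int>(startUrl, 0));

        while (queue.Count > 0)
        {
            KeyValuePair<string, int> current = queue.Dequeue();
            // Pages at the depth limit are recorded but not expanded further.
            if (current.Value >= maxDepth) continue;

            foreach (string link in getLinks(current.Key))
            {
                Uri uri;
                if (!Uri.TryCreate(link, UriKind.Absolute, out uri)) continue;
                // Exclude external links: keep only URLs on the start page's host.
                if (!string.Equals(uri.Host, host, StringComparison.OrdinalIgnoreCase)) continue;
                // HashSet.Add returns false for a URL that was already queued.
                if (!seen.Add(link)) continue;
                result.Add(link);
                queue.Enqueue(new KeyValuePair<string, int>(link, current.Value + 1));
            }
        }
        return result;
    }
}
```

Called as `SiteCrawler.Crawl(home, 3, page => GetUrlDict(page).Keys)`, it would cover the home page plus three levels of links. Passing the link extractor as a callback keeps the traversal separate from downloading and parsing, so either part can be swapped out.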

Solutions »

If you have access to the database, it is better to iterate over the records there and generate the sitemap from them.
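Look up the Google sitemap format. There are two kinds of file: a sitemap and a sitemap index. The rest is just reading the database and generating the sitemap. Watch the size of each sitemap: one sitemap should contain no more than 50,000 loc entries and must stay under a size limit (10 MB as I remember it; the current sitemaps.org protocol gives 50 MB uncompressed). gzip only reduces the transfer size, so a file over the limit has to be split into several sitemaps listed in a sitemap index. That's all for now. Write down each of these steps, then work through them one at a time.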
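To make the sitemap / sitemap index split concrete, here is a minimal sketch using `System.Xml.Linq`. The class and method names (`SitemapWriter`, `BuildSitemaps`, `BuildIndex`) are made up for illustration, and only the required `loc` element is written. It splits by URL count only; checking the byte size of each file is left out and would need to be added separately.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SitemapWriter
{
    // Namespace defined by the sitemaps.org protocol.
    private static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
    public const int MaxUrlsPerFile = 50000;

    // Builds one <urlset> document per chunk of at most MaxUrlsPerFile URLs.
    public static List<XDocument> BuildSitemaps(IEnumerable<string> urls)
    {
        List<string> all = urls.ToList();
        List<XDocument> docs = new List<XDocument>();
        for (int i = 0; i < all.Count; i += MaxUrlsPerFile)
        {
            XElement urlset = new XElement(Ns + "urlset",
                all.Skip(i).Take(MaxUrlsPerFile)
                   .Select(u => new XElement(Ns + "url", new XElement(Ns + "loc", u))));
            docs.Add(new XDocument(new XDeclaration("1.0", "utf-8", null), urlset));
        }
        return docs;
    }

    // Builds the <sitemapindex> document that points at each sitemap file.
    public static XDocument BuildIndex(IEnumerable<string> sitemapUrls)
    {
        XElement index = new XElement(Ns + "sitemapindex",
            sitemapUrls.Select(u => new XElement(Ns + "sitemap", new XElement(Ns + "loc", u))));
        return new XDocument(new XDeclaration("1.0", "utf-8", null), index);
    }
}
```

Each returned document can be written out with `doc.Save("sitemap1.xml")` and so on, and the index is then built from the public URLs of those files. `XElement` escapes characters such as `&` in the URLs when saving.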
Er, looks like I replied in the wrong place. Is this the SEO board?~~