How to remove the <div>....</div> HTML tags from FCKeditor output

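Requirement: after the tags are removed and the text is exported to Word, the formatting should stay basically the same; at the very least the line breaks should survive. Removing the tags alone already works [using a regular expression], but in the Word output the content is all jammed together, with no line breaks or formatting left.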

Solutions »

Strip the HTML tags:
Regex.Replace(str, @"<[^>]+>", "");
For line breaks, use Environment.NewLine (\r\n) as the replacement for the markers.
Save as HTML.
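The suggestion above can be sketched roughly as follows: turn the tags that end a block into Environment.NewLine first, and only then strip whatever tags remain. The class name HtmlToText, the method name ToPlainText, and the particular list of block-level tags are assumptions made for this illustration; they do not come from the thread, and FCKeditor output may need a different list.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the thread: keeps line breaks while stripping tags.
public static class HtmlToText
{
    // Closing block-level tags and <br> are treated as line-break markers (assumed list).
    private static readonly Regex BlockBreak = new Regex(
        @"</(div|p|li|h[1-6])\s*>|<br\s*/?>", RegexOptions.IgnoreCase);

    // Any tag still present after the markers were replaced.
    private static readonly Regex AnyTag = new Regex(@"<[^>]+>");

    public static string ToPlainText(string html)
    {
        // 1. Replace the markers with a real line break.
        string s = BlockBreak.Replace(html, Environment.NewLine);
        // 2. Remove every remaining tag.
        s = AnyTag.Replace(s, "");
        // 3. Decode entities such as &amp; and &nbsp;.
        s = WebUtility.HtmlDecode(s);
        // 4. Drop leading and trailing whitespace, including the final line break.
        return s.Trim();
    }
}
```

Calling HtmlToText.ToPlainText("<div>first</div><div>second</div>") should give the two words separated by one Environment.NewLine, so Word receives text with its paragraph breaks intact. Inline formatting such as bold or font size is still lost with this approach.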
// Requires: using System.Text.RegularExpressions;
public static string StripHTML(string HTML) // found by googling "StripHTML"
        {
            string[] Regexs = {
                                  @"<script[^>]*?>.*?</script>",
                                  @"<(\/\s*)?!?((\w+:)?\w+)(\w+(\s*=?\s*(([""'])(\\[""'tbnr]|[^\7])*?\7|\w+)|.{0})|\s)*?(\/\s*)?>",
                                  @"([\r\n])[\s]+",
                                  @"&(quot|#34);",
                                  @"&(amp|#38);",
                                  @"&(lt|#60);",
                                  @"&(gt|#62);",
                                  @"&(nbsp|#160);",
                                  @"&(iexcl|#161);",
                                  @"&(cent|#162);",
                                  @"&(pound|#163);",
                                  @"&(copy|#169);",
                                  @"&#(\d+);",
                                  @"-->",
                                  @"<!--.*\n"
                               };
            string[] Replaces = {
                                      "",
                                      "",
                                      "",
                                      "\"",
                                      "&",
                                      "<",
                                      ">",
                                      " ",
                                      "\xa1", //chr(161),
                                      "\xa2", //chr(162),
                                      "\xa3", //chr(163),
                                      "\xa9", //chr(169),
                                      "",
                                      "\r\n",
                                      ""
                                  };
            string s = HTML;
            for (int i = 0; i < Regexs.Length; i++)
            {
                s = new Regex(Regexs[i], RegexOptions.Multiline | RegexOptions.IgnoreCase).Replace(s, Replaces[i]);
            }
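            // Strings are immutable: Replace returns a new string, so the result must be assigned.
            s = s.Replace("<", "");
            s = s.Replace(">", "");
            s = s.Replace("\r\n", ""); // removes every remaining line break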
            return s;
        }
This strips the HTML tags, but afterward all the content runs together.
string xx = Regex.Replace(source, @"<div[^>]*>([\s\S]*?)</div>", "<p>$1</p>"); // source: the input HTML string
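One way to combine this replacement with the earlier "save as HTML" suggestion is sketched below: convert each <div> to a <p>, wrap the result in a minimal HTML document, and write it to a file that Word can open. The class name WordHtmlExport, its method names, and the output path are hypothetical, and the lazy pattern pairs each <div> with the nearest </div>, so nested <div> elements would need different handling.

```csharp
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the thread: keeps the markup and lets Word render it.
public static class WordHtmlExport
{
    public static string BuildDocument(string fragment)
    {
        // Same replacement as in the post above: each <div>...</div> becomes <p>...</p>.
        string body = Regex.Replace(
            fragment, @"<div[^>]*>([\s\S]*?)</div>", "<p>$1</p>", RegexOptions.IgnoreCase);

        // Minimal wrapper so the fragment is a complete HTML document.
        return "<html><head><meta charset=\"utf-8\"></head><body>" + body + "</body></html>";
    }

    public static void Save(string fragment, string path)
    {
        // UTF-8 keeps non-ASCII text such as Chinese intact.
        File.WriteAllText(path, BuildDocument(fragment), Encoding.UTF8);
    }
}
```

For example, WordHtmlExport.Save(editorHtml, "export.html") would write a file in which every former <div> is its own paragraph; editorHtml here stands for whatever string the editor returned.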