Hi everyone, I have a question about regular expressions. Below is the original markup, and I want to extract some content from it:
<div id="main">
<h2>Madonna</h2>
<dl>
<dt>Website</dt>
<dd>
<a href="http://www.madonna.com/" title="Go to the Madonna website">www.madonna.com </a>
</dd>
<dt id="artPhoto">Artist photo</dt>
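<dd id="photo"><img src="images/madonna.jpg" alt="Madonna" title="Madonna"/></dd>
I want to extract "Madonna" and "www.madonna.com " from inside id="main", and the image madonna.jpg from id="photo". If I use a regular expression for this, what should the expression look like? Please help; this is only my first day learning regular expressions, so I am fairly lost. If you know any good sites about regular expressions, please share them so I can study more. Thanks.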
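// str is assumed to hold the full HTML text shown above
// DOTALL lets '.' also match line breaks, so one pattern can span several lines
Pattern p = Pattern.compile("<div id=\"main\">.*?<a href=\"http://(.*?)[/]?\".*?<img src=\"images/(.*?)\"", Pattern.DOTALL);
Matcher m = p.matcher(str);
while (m.find()) {
    System.out.println(m.group(1)); // host name from the href
    System.out.println(m.group(2)); // image file name
}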
// 竹子 left one out (the <h2> text):
public static void main(String[] args) {
    String s = "<div id=\"main\">\n<h2>Madonna </h2>\n<dl>\n<dt>Website</dt>\n<dd>\n <a href=\"http://www.madonna.com/\" title=\"Go to the Madonna website\">www.madonna.com </a>\n</dd>\n<dt id=\"artPhoto\">Artist photo </dt>\n<dd id=\"photo\"> <img src=\"images/madonna.jpg\" alt=\"Madonna\" title=\"Madonna\"/> </dd> ";
    // \\s* tolerates any whitespace between the tags; DOTALL lets '.' cross line breaks
    Pattern p = Pattern.compile(
            "<div id=\"main\">\\s*<h2>(.*?)</h2>.*?<a href=\"http://(.*?)[/]?\".*?<img src=\"images/(.*?)\"",
            Pattern.DOTALL);
    Matcher m = p.matcher(s);
    while (m.find()) {
        System.out.println(m.group(1)); // artist name from <h2>
        System.out.println(m.group(2)); // host name from the href
        System.out.println(m.group(3)); // image file name
    }
}
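One long pattern fails as a whole as soon as any one part of the markup changes. A possible alternative, sketched here under assumptions, is one small pattern per field. The class name `ArtistFields`, its method names, and the `SAMPLE` constant are hypothetical and not from this thread; the sample markup is copied from the question.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the thread: one small pattern per field.
final class ArtistFields {
    // Sample markup copied from the question
    static final String SAMPLE =
            "<div id=\"main\">\n<h2>Madonna </h2>\n<dl>\n<dt>Website</dt>\n<dd>\n"
            + " <a href=\"http://www.madonna.com/\" title=\"Go to the Madonna website\">www.madonna.com </a>\n"
            + "</dd>\n<dt id=\"artPhoto\">Artist photo </dt>\n"
            + "<dd id=\"photo\"> <img src=\"images/madonna.jpg\" alt=\"Madonna\" title=\"Madonna\"/> </dd> ";

    // Text of the <h2> directly inside <div id="main">, surrounding whitespace excluded
    private static final Pattern NAME =
            Pattern.compile("<div id=\"main\">\\s*<h2>\\s*(.*?)\\s*</h2>", Pattern.DOTALL);
    // Visible text of the first link whose href starts with http://
    private static final Pattern LINK_TEXT =
            Pattern.compile("<a\\s[^>]*href=\"http://[^\"]*\"[^>]*>\\s*(.*?)\\s*</a>", Pattern.DOTALL);
    // File name under images/ of the <img> inside <dd id="photo">
    private static final Pattern PHOTO =
            Pattern.compile("<dd id=\"photo\">\\s*<img\\s[^>]*src=\"images/([^\"]+)\"");

    // Returns capture group 1 of the first match, or empty if the pattern does not match
    private static Optional<String> first(Pattern p, String html) {
        Matcher m = p.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    static Optional<String> name(String html) { return first(NAME, html); }

    static Optional<String> linkText(String html) { return first(LINK_TEXT, html); }

    static Optional<String> photo(String html) { return first(PHOTO, html); }

    public static void main(String[] args) {
        System.out.println(name(SAMPLE).orElse("(no name found)"));
        System.out.println(linkText(SAMPLE).orElse("(no link found)"));
        System.out.println(photo(SAMPLE).orElse("(no photo found)"));
    }
}
```

Each method returns an empty `Optional` when its field is missing, so a change in one part of the page does not hide the other two values.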
The program is as follows:
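import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SimpleAccess {
    public static void main(String[] args) throws Exception {
        URL myURL = new URL("http://nebula.dcs.shef.ac.uk:8087/com4280/artists/artistd708.html?id=3");
        // The pattern spans several lines, so collect the whole page first;
        // matching each line on its own can never succeed
        StringBuilder page = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(myURL.openStream()))) {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                page.append(inputLine).append('\n');
            }
        }
        // \\s* allows any whitespace between <div id="main"> and <h2>
        Pattern p = Pattern.compile(
                "<div id=\"main\">\\s*<h2>(.*?)</h2>.*?<a href=\"http://(.*?)[/]?\".*?<img src=\"images/(.*?)\"",
                Pattern.DOTALL);
        Matcher m = p.matcher(page);
        while (m.find()) {
            System.out.println(m.group(1));
            System.out.println(m.group(2));
            System.out.println(m.group(3));
        }
    }
}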
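A pattern that hard-codes line breaks between `<div id="main">` and `<h2>` depends on the exact whitespace the server sends. In the page source quoted below, a single space separates the two tags, so literal `\n` characters in the pattern would not match there. The following is a small sketch to check this; the class name `WhitespaceCheck` and the shortened `PAGE` string are assumptions made for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical check, not from the thread: literal newlines versus \s* between tags.
final class WhitespaceCheck {
    // Shortened copy of the served markup: a space, not a newline, follows <div id="main">
    static final String PAGE =
            "<div id=\"main\"> <h2>Madonna</h2>\n<dl>\n<dt>Website:</dt>\n";

    // Requires exactly two line feeds between the tags
    static final Pattern LITERAL_NEWLINES =
            Pattern.compile("<div id=\"main\">\n\n<h2>(.*?)</h2>", Pattern.DOTALL);
    // Accepts any run of whitespace (including none) between the tags
    static final Pattern ANY_WHITESPACE =
            Pattern.compile("<div id=\"main\">\\s*<h2>(.*?)</h2>", Pattern.DOTALL);

    // True if the pattern matches anywhere in the text
    static boolean matches(Pattern p, String text) {
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println("literal newlines match: " + matches(LITERAL_NEWLINES, PAGE));
        System.out.println("\\s* matches: " + matches(ANY_WHITESPACE, PAGE));
    }
}
```

Using `\\s*` between tags keeps the pattern independent of how the page happens to be indented or wrapped.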
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" >
<head>
<title>Music Warehouse » Artist » </title>
<link type="text/css" rel="stylesheet" href="css/style3.css" />
</head><body id="artist"><div id="contentlayer">
<div id="header">
<h1><a href="index.html">The Music Warehouse</a></h1>
<p id="byline">All your favourite artists and albums in one place</p>
</div>
<div id="main"> <h2>Madonna</h2>
<dl>
<dt>Website:</dt>
<dd>
<a href="http://www.madonna.com/" title="Go to the Madonna website">www.madonna.com
</a>
</dd>
<dt id="artPhoto">Artist photo</dt>
<dd id="photo"><img src="images/madonna.jpg" alt="Madonna" title="Madonna"/></dd>