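MySQL can certainly do this, and has been able to since version 3.23. As Reve(仨仁仕) said, this is a database matter, so no PHP code is needed here. Add a FULLTEXT index when you create the table, e.g.:
CREATE TABLE articles (
    id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
    title VARCHAR(200),
    body TEXT,
    FULLTEXT (title,body)
);
After that you can run:
SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('database');
This uses the MATCH function to run a full-text search over those two columns.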
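For anyone who does want to call this from PHP, here is a minimal sketch, assuming PHP 7 or later and a PDO connection. buildFulltextQuery() is a made-up helper name, not something from the MySQL manual. It only builds the SQL text and the parameter list, so the search word is bound instead of being pasted into the query. Note that the column list passed to MATCH has to be the same list the FULLTEXT index was created on.

```php
<?php
// Sketch only: build a parameterized full-text query for the articles table.
// buildFulltextQuery() is a hypothetical helper name.
function buildFulltextQuery(string $table, array $columns, string $term): array
{
    // Identifiers cannot be bound as parameters, so accept only plain names.
    foreach (array_merge([$table], $columns) as $name) {
        if (!preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', $name)) {
            throw new InvalidArgumentException("unsafe identifier: $name");
        }
    }
    $sql = sprintf(
        'SELECT * FROM %s WHERE MATCH (%s) AGAINST (?)',
        $table,
        implode(',', $columns)
    );
    return [$sql, [$term]];
}

list($sql, $params) = buildFulltextQuery('articles', ['title', 'body'], 'database');
echo $sql, "\n";

// With a live connection (assumed here: a PDO handle in $pdo) it would run as:
// $stmt = $pdo->prepare($sql);
// $stmt->execute($params);
// $rows = $stmt->fetchAll();
```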
vivanboy(被迫早起的鸟儿), I tried the method you described,
SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('database');
many times and it never returned any rows.
CREATE TABLE articles (
    id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
    title VARCHAR(200),
    body TEXT,
    FULLTEXT (title,body)
);
INSERT INTO articles VALUES(0,'MySQL Tutorial', 'DBMS stands for DataBase Management ...');
INSERT INTO articles VALUES(0,'How To Use MySQL Efficiently', 'After you went through a ...');
INSERT INTO articles VALUES(0,'Optimizing MySQL','In this tutorial we will show how to ...');
INSERT INTO articles VALUES(0,'1001 MySQL Trick','1. Never run mysqld as root. 2. Normalize ...');
INSERT INTO articles VALUES(0,'MySQL vs. YourSQL', 'In the following database comparison we ...');
INSERT INTO articles VALUES(0,'MySQL Security', 'When configured properly, MySQL could be ...');
Then run the query:
SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('database');
Everything works fine here. The above is taken from the English manual, so it cannot be wrong.
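<?php
// One way to gain a little speed and simplify the code is to store the files by category, numbered in sequence, for example
// News: news/0.htm, news/1.htm, ..., news/9999.htm, ...
// Docs: doc/0.htm, doc/1.htm, ...
// ...
// The category the user asks for then selects the subdirectory,
// and the files in it are read by number. The search range is smaller and there is no need for findfirst/findnext,
// which speeds things up noticeably.
// Of course, the file count of each directory (its highest number) has to be kept in the database,
// and the database has to be updated whenever a new page is created.

// Assume the client sends two parameters:
// type: article category (news/doc/...)
// keyword: the search keyword

// The code below only demonstrates searching file contents; it leaves out input validation, AND/OR handling, and so on.
// In particular, $type must be checked against the known categories before it is used in the SQL or in the path.
$type = $_POST["type"];
$keyword = $_POST["keyword"];

$sql = "select max_ord from table1 where type='$type'";
$result = query($sql); // query() stands for your own database helper; it is not a PHP built-in
$max_ord = $result['max_ord'];
for ( $ord = 0; $ord <= $max_ord; $ord++ )
{
    $fileName = "./" . $type . "/" . $ord . ".htm";
    $fp = fopen($fileName, "rb");
    if ( $fp )
    {
        $str = fread($fp, filesize($fileName));
        // search the string $str here and output results as needed
        fclose($fp);
    }
}
?>
If you want results by line, i.e. to print the whole line that contains the keyword,
you can skip fopen/fread and call $array = file($fileName); to get an array of the file's lines, then search within each line.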
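The line-by-line variant could look roughly like this. It is only a sketch: findMatchingLines() is a made-up name, the match is a plain case-insensitive substring test, and the demo writes a temporary file so it does not depend on any real pages.

```php
<?php
// Sketch only: return every line of a file that contains the keyword.
// findMatchingLines() is a hypothetical helper name.
function findMatchingLines(string $fileName, string $keyword): array
{
    $matches = [];
    if ($keyword === '' || !is_readable($fileName)) {
        return $matches;
    }
    // file() returns the file as an array of lines.
    $lines = file($fileName, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        return $matches;
    }
    foreach ($lines as $index => $line) {
        // stripos() is a case-insensitive substring search.
        if (stripos($line, $keyword) !== false) {
            $matches[$index + 1] = $line; // key = 1-based line number
        }
    }
    return $matches;
}

// Demo on a temporary file.
$demoFile = tempnam(sys_get_temp_dir(), 'srch');
file_put_contents($demoFile, "MySQL Tutorial\nnothing here\nUsing mysql with PHP\n");
$hits = findMatchingLines($demoFile, 'mysql');
unlink($demoFile);
print_r($hits);
```

Keeping the line number as the array key makes it easy to print "file, line, text" style results.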
http://www.evolt.org/article/Boolean_Fulltext_Searching_with_PHP_and_MySQL/18/15665/
where [field] LIKE '%(string)%'
Or write the query like this:
select * from tablename where fieldname = BINARY ...
OR
select * from tablename where BINARY fieldname = ...
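A small sketch of both variants from PHP, assuming a driver with bound parameters such as PDO. likeCondition() is a made-up helper, and the column name is assumed to come from your own code, not from the user. It escapes % and _ in the user's text so they are matched literally (backslash is MySQL's default LIKE escape character) and optionally adds BINARY for a case-sensitive comparison.

```php
<?php
// Sketch only: build a LIKE condition with a bound, escaped pattern.
// likeCondition() is a hypothetical helper; $column must be a trusted name.
function likeCondition(string $column, string $term, bool $caseSensitive = false): array
{
    // Escape the escape character itself and both LIKE wildcards in one pass.
    $escaped = strtr($term, ['\\' => '\\\\', '%' => '\\%', '_' => '\\_']);
    // BINARY forces a byte-by-byte, case-sensitive comparison.
    $lhs = $caseSensitive ? "BINARY $column" : $column;
    return ["$lhs LIKE ?", '%' . $escaped . '%'];
}

list($condition, $pattern) = likeCondition('title', '50%_off');
echo "SELECT * FROM articles WHERE $condition\n";
echo $pattern, "\n";

$binary = likeCondition('title', 'MySQL', true);
echo "SELECT * FROM articles WHERE {$binary[0]}\n";
```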
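Our company also builds search engines, covering full-text retrieval over databases, Office documents, PDF, Dyna Doc and the like. The usual approach is to build an index database first. It is hard to explain exactly what an index database is; think of it as a dictionary's table of contents: you look in the contents first and then go to the entry, which is much faster. Unfortunately it is fairly expensive, 20,000 to 30,000. You may not want to spend that much.
Better to look at phpnuke. The Chinese localization from the Taiwanese phpnuke site is of average quality; there used to be mainland localizations too, but they are all fairly old versions. The Taiwanese phpnuke site is http://www.phpnuke-tw.com/
Other free community software written in PHP also comes with full-text search, but this kind of search gets very slow when the data volume is large. If your data set is small, it is usable.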
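A truly professional full-text retrieval system should work on the file system and build its index as inverted files.
The index files should be loaded into memory ahead of time. Look at how fast Google searches and you will see
that it is certainly not using a general-purpose database for full-text retrieval.
You can look up papers on Information Retrieval.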
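To make the inverted-file idea concrete, here is a toy sketch in PHP. It is an illustration only, not how any real engine is implemented: the function names are made up, the index lives in a plain array, and the tokenizer only handles space-separated ASCII text (Chinese would need a word segmenter first).

```php
<?php
// Sketch only: a tiny in-memory inverted index (term => list of document ids).
// tokenize(), buildInvertedIndex() and searchAll() are hypothetical names.
function tokenize(string $text): array
{
    // Lowercase and split on anything that is not an ASCII letter or digit.
    $tokens = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    return array_values(array_unique($tokens));
}

function buildInvertedIndex(array $documents): array
{
    $index = [];
    foreach ($documents as $docId => $text) {
        foreach (tokenize($text) as $term) {
            $index[$term][] = $docId; // posting list for this term
        }
    }
    return $index;
}

// Return the ids of documents that contain every term of the query.
function searchAll(array $index, string $query): array
{
    $result = null;
    foreach (tokenize($query) as $term) {
        $postings = $index[$term] ?? [];
        $result = ($result === null)
            ? $postings
            : array_values(array_intersect($result, $postings));
    }
    return $result ?? [];
}

$docs = [
    1 => 'MySQL Tutorial',
    2 => 'How To Use MySQL Efficiently',
    3 => 'Optimizing MySQL',
];
$index = buildInvertedIndex($docs);
print_r(searchAll($index, 'mysql tutorial'));
```

A lookup touches only the posting lists of the query terms, never the documents themselves, which is why this structure stays fast as the collection grows.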
<?php
function isOperator($keyword,$is_ereg_whole_str = false){
    //check whether $keyword is a search operator, or, in whole-string
    //mode, whether it contains one as a separate word
    //setOperator() has already mapped every symbol to one of these words:
    /*
    AND = && = & = +
    OR = || = |
    NOT = ! = -
    % = *
    _ = ?
    */
    $operators = array("AND", "OR", "NOT");
    if(trim($keyword) == ""){
        return false;
    }
    foreach($operators as $operator){
        if($is_ereg_whole_str){
            //look for the operator as a space-delimited word
            if(stripos($keyword, " $operator ") !== false){
                return true;
            }
        }else{
            if($operator == strtoupper(trim($keyword))){
                return true;
            }
        }
    }
    return false;
}

function setOperator($keyword){
    //normalize every operator spelling to the words AND, OR, NOT
    if($keyword == "")
        return "";
    //\b keeps words such as "android" or "for" from being cut apart
    $keyword = preg_replace('/\band\b/i'," AND ",$keyword);
    $keyword = preg_replace('/\bor\b/i'," OR ",$keyword);
    $keyword = preg_replace('/\bnot\b/i'," NOT ",$keyword);
    $keyword = str_replace("&&"," AND ",$keyword);
    $keyword = str_replace("&"," AND ",$keyword);
    $keyword = str_replace("+"," AND ",$keyword);
    $keyword = str_replace("||"," OR ",$keyword);
    $keyword = str_replace("|"," OR ",$keyword);
    $keyword = str_replace("!"," NOT ",$keyword);
    $keyword = str_replace("-"," NOT ",$keyword);
    //map the wildcards * and ? to the SQL LIKE wildcards
    $keyword = str_replace("*","%",$keyword);
    $keyword = str_replace("?","_",$keyword);
    return $keyword;
}

function parseKeyword($keyword, $fields){
    //function to parse keyword for search Engine
    //RETURN SQL CMD
    //@PARA $fields array of column names (a single name may be passed as a string)
    //@PARA $keyword input keyword
    //@version 1.3.1
    //@last update 20020910
    //@created 20020906
    //@author walksing chen [[email protected]]
    /*eg:
    keyword : jsp asp or php not cn
    fields[0]->caption,fields[0]->content
    result:( caption LIKE '%jsp asp%' OR content LIKE '%jsp asp%' ) OR ( caption LIKE '%php%' OR content LIKE '%php%' ) AND NOT ( caption LIKE '%cn%' OR content LIKE '%cn%' )
    keyword : jsp asp php cn
    fields[0]->caption,fields[0]->content
    result:( caption LIKE '%jsp%' OR content LIKE '%jsp%' ) OR ( caption LIKE '%asp%' OR content LIKE '%asp%' ) OR ( caption LIKE '%php%' OR content LIKE '%php%' ) OR ( caption LIKE '%cn%' OR content LIKE '%cn%' )
    keyword : jsp asp or php not cn
    fields[0]->caption
    result:( caption LIKE '%jsp asp%' ) OR ( caption LIKE '%php%' ) AND NOT ( caption LIKE '%cn%' )
    keyword : jsp asp php cn
    fields[0]->caption
    result:( caption LIKE '%jsp%' ) OR ( caption LIKE '%asp%' ) OR ( caption LIKE '%php%' ) OR ( caption LIKE '%cn%' )
    */
    if($keyword == "")
        return "";
    $fields = (array)$fields; //accept one column name or an array of them
    //set regular operator
    $keyword = setOperator($keyword);
    $keys = explode(" ", $keyword);
    $sql = "";
    $max = count($keys);
    //look for an operator anywhere in the keyword
    $has_operator = isOperator($keyword,true);
    //without operators every space is treated as OR
    $relation = ($has_operator ? "":"OR");
    $fldv = "";
    for($i = 0; $i<$max;$i++){
        if(!$has_operator){
            $fldv = trim($keys[$i]);
            $relation = "OR";
            if($fldv == ""){
                continue;
            }
        }else{
            if(trim($keys[$i]) == "" && $i == 0 )
                continue; //skip 0 & ""
            if(!isOperator($keys[$i])){
                //collect the words of the current term in fldv
                $fldv .= trim($keys[$i])." ";
                if($i != ($max-1)){
                    continue;
                }
            }else{
                $relation = strtoupper($keys[$i]);
                $relation = ($relation == "NOT") ? "AND NOT" : $relation;
            }
        }
        //escape quotes before the value goes into the SQL text;
        //with a live connection use the driver's own escaping instead
        $value = addslashes(trim($fldv));
        $tmp = "";
        foreach($fields as $fld_value){
            if($fld_value == "")
                continue;
            $tmp .= $fld_value . " LIKE '%".$value."%' OR ";
        }
        $tmp = substr($tmp, 0, strlen($tmp)-3);
        $tmp = $tmp == "" ? "":"( $tmp )";
        $sql .= " $tmp ".$relation." ";
        $fldv = "";
    }
    $sql = trim($sql);
    //drop the dangling operator left after the last term
    $sql = substr($sql, 0, strlen($sql)- strlen($relation));
    return $sql;
}

function makeupStr($str, $keyword = ""){
    //this function is used to mark up $keyword in $str, eg: color it
    if($str == "" )
        return "";
    if($keyword == "")
        return $str;
    $keyword = trim(setOperator($keyword));
    $str = trim($str);
    $keys = explode(" ", $keyword);
    if(count($keys) == 1){
        $buf = "";
        //split the keyword at the wildcards; keywords may include CN text
        //eg:keyword=い%
        $len = strlen($keyword);
        for($j = 0; $j<$len; $j++){
            $char = substr($keyword, $j, 1);
            if($char != "%" && $char != "_" ){
                $buf .= $char;
                if($j != ($len -1))
                    continue;
            }
            $key = $buf;
            $keyword_n = "<font color=red><strong>".
                $key."</strong></font>";
            if($key != ""){
                //literal, case-insensitive replace, so regex
                //metacharacters in the keyword cannot break it
                $str = str_ireplace($key, $keyword_n, $str);
            }
            $buf = "";
        }
        return $str;
    }
    foreach($keys as $keyword){
        if(trim($keyword) == "" || isOperator($keyword)){
            continue;
        }
        $keyword_n = "<font color=red><strong>".$keyword.
            "</strong></font>";
        $str = str_ireplace($keyword, $keyword_n, $str);
    }
    return $str;
}
//End Search Engine
?>
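One thing the highlighting code above does not handle is text that will be printed as HTML: the content is not escaped, and a later keyword can match inside markup inserted for an earlier one. A possible alternative, as a sketch only: highlightTerms() is a made-up name, it uses plain <strong> instead of the red font tag, and it assumes the caller has already dropped operators and wildcards from the term list.

```php
<?php
// Sketch only: highlight search terms in text that will be printed as HTML.
// highlightTerms() is a hypothetical helper name.
function highlightTerms(string $text, array $terms): string
{
    // Quote each term so regex metacharacters are matched literally.
    $quoted = [];
    foreach ($terms as $term) {
        if ($term !== '') {
            $quoted[] = preg_quote($term, '/');
        }
    }
    if (!$quoted) {
        return htmlspecialchars($text);
    }
    // One combined pattern, so inserted markup is never scanned again.
    $pattern = '/(' . implode('|', $quoted) . ')/i';
    // With DELIM_CAPTURE the pieces alternate: plain text, match, plain text, ...
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    $out = '';
    foreach ($parts as $i => $part) {
        $safe = htmlspecialchars($part);
        $out .= ($i % 2 === 1) ? "<strong>$safe</strong>" : $safe;
    }
    return $out;
}

echo highlightTerms('PHP & MySQL search', ['mysql', 'php']), "\n";
```

Unlike a fixed replacement string, this keeps the original capitalization of the matched text.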
#
# Table structure for table 'textnews'
#

CREATE TABLE textnews (
  id int(11) NOT NULL auto_increment,
  docid varchar(50) NOT NULL,
  pubcode varchar(50),
  pubdate date DEFAULT '0000-00-00' NOT NULL,
  author varchar(50),
  pageno varchar(20),
  section varchar(250),
  pathfile varchar(250),
  headline varchar(250),
  content mediumtext,
  PRIMARY KEY (id, docid)
);
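With a current driver the same kind of LIKE search over this table can be written with bound parameters instead of building the values into the SQL text. A minimal sketch, assuming PDO and the headline and content columns above; buildLikeSearch() is a made-up name, it only supports the plain "any word matches" case (no AND/NOT), and it leaves % and _ in the keywords working as wildcards.

```php
<?php
// Sketch only: OR together "column LIKE ?" tests for every term and column.
// buildLikeSearch() is a hypothetical helper; column names must come from
// your own code, never from user input.
function buildLikeSearch(array $columns, string $keywords): array
{
    $terms = preg_split('/\s+/', trim($keywords), -1, PREG_SPLIT_NO_EMPTY);
    $groups = [];
    $params = [];
    foreach ($terms as $term) {
        $tests = [];
        foreach ($columns as $column) {
            $tests[] = "$column LIKE ?";
            $params[] = '%' . $term . '%'; // one bound value per placeholder
        }
        $groups[] = '( ' . implode(' OR ', $tests) . ' )';
    }
    return [implode(' OR ', $groups), $params];
}

list($where, $params) = buildLikeSearch(['headline', 'content'], 'jsp php');
echo "SELECT id, headline FROM textnews WHERE $where\n";

// With a live connection (assumed here: a PDO handle in $pdo):
// $stmt = $pdo->prepare("SELECT id, headline FROM textnews WHERE $where");
// $stmt->execute($params);
```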
http://www.foresight.com.hk/bbs/application/webpub/showdetail.asp?serialnum=70551038175334&boardid=12&num=72
This is all so deep. http://web.scuec.edu.cn/~game002/newweb0/ Time to hand out the points!!!
http://jakarta.apache.org/lucene/docs/index.html
A high-quality full-text search engine. I built full-text search with Chinese word segmentation on top of it.
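The above is correct, and for searching English text there is no problem at all, but double-byte text such as Chinese is not supported.
A varchar column can be declared with the binary attribute, after which full-text search can be used on it, but text has no binary attribute.
The MySQL manual also says that full-text search will be improved in the next version.
For example, with the data above, searching for database works, but words such as for do not.
Both work. It looks like MySQL's full-text search has too many bugs, and the problem does not seem to be limited to multi-byte text. I'm confused too. To the original poster: try switching to MySQL 4.0 and see how it goes.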
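A likely explanation for the "for" case, going by the MySQL manual rather than a bug: MyISAM full-text indexes skip words shorter than ft_min_word_len (4 by default) and words on the stopword list, and a natural-language search also ignores words that occur in half or more of the rows (which is why MySQL itself finds nothing in the sample data). MySQL 4.0.1 and later add IN BOOLEAN MODE, which does not apply that 50% rule. A rough sketch of preparing such a search from PHP follows; booleanSearchString() is a made-up helper, and the length check counts bytes, so it only approximates the server's rule.

```php
<?php
// Sketch only: turn user words into a boolean-mode search string.
// booleanSearchString() is a hypothetical helper; 4 mirrors the default
// ft_min_word_len of MyISAM full-text indexes.
function booleanSearchString(string $input, int $minLen = 4): string
{
    $words = preg_split('/\s+/', trim($input), -1, PREG_SPLIT_NO_EMPTY);
    $parts = [];
    foreach ($words as $word) {
        // Strip characters that act as boolean-mode operators.
        $word = preg_replace('/[+\-<>()~*"@]/', '', $word);
        // Words below the minimum length are not in the index anyway.
        if (strlen($word) >= $minLen) {
            $parts[] = '+' . $word; // "+" means the word must be present
        }
    }
    return implode(' ', $parts);
}

$against = booleanSearchString('for database');
$sql = 'SELECT * FROM articles WHERE MATCH (title,body) AGAINST (? IN BOOLEAN MODE)';
echo $against, "\n";

// With a live connection (assumed here: a PDO handle in $pdo):
// $stmt = $pdo->prepare($sql);
// $stmt->execute([$against]);
```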