What I do best is collecting points and searching. The following is reposted for your reference; adapt it as needed.

The first lines we add to our config file are:

SetEnvIfNoCase User-Agent "^Wget" bad_bot
SetEnvIfNoCase User-Agent "^EmailSiphon" bad_bot
SetEnvIfNoCase User-Agent "^EmailWolf" bad_bot

'SetEnvIfNoCase' simply sets an environment variable (SetEnv) called 'bad_bot' if (SetEnvIf) the 'User-Agent' string begins with Wget, EmailSiphon, or EmailWolf, regardless of case (SetEnvIfNoCase). In English, any time a client whose name starts with 'wget', 'emailsiphon', or 'emailwolf' accesses our website, we set a variable called 'bad_bot'. We'd also want to add a line for the User-Agent string of any other spider we want to deny.

Now we tell Apache which directories to block the spiders from with the <Directory> directive:

<Directory "/home/evolt/public_html/users/">
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Directory>

In English, we're denying access to the /home/lists/public_html/archive directory if an environment variable called 'bad_bot' exists. Apache will return a standard 403 Forbidden error message, and the spider gets nothing! Since most of the email addresses of members are found in lists.evolt.org/archive, this should suffice, but you'll probably want to adjust a couple of things to fit your needs.
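For anyone on Apache 2.4: Order/Allow/Deny are only available there through mod_access_compat, so the same block would more likely be written with Require. A minimal sketch, assuming Apache 2.4 with mod_authz_core and reusing the directory path from the article (change it to your own directory):

```apache
# Same bad_bot variable as in the reposted article
SetEnvIfNoCase User-Agent "^Wget" bad_bot
SetEnvIfNoCase User-Agent "^EmailSiphon" bad_bot
SetEnvIfNoCase User-Agent "^EmailWolf" bad_bot

# Path copied from the article; replace with the directory you want to protect
<Directory "/home/evolt/public_html/users/">
    <RequireAll>
        # Allow everyone ...
        Require all granted
        # ... except requests that have bad_bot set
        Require not env bad_bot
    </RequireAll>
</Directory>
```

Blocked clients still get a 403 here, the same as with the Deny version.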
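Thanks, everyone. To the poster in #2: that setup blocks at least some of them. To the poster in #4: what I need now is to catch users whose user agent is empty and return 503. Please help me think about this some more, thanks.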
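// Reject requests whose User-Agent header is missing or empty
if (empty($_SERVER['HTTP_USER_AGENT'])) {
    header('HTTP/1.1 503 Service Unavailable'); // send a real 503 status, not a redirect
    readfile('503.html'); // serve the error page body
    exit;
}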
<Directory "/home/evolt/public_html/users/">
Which directory does this refer to? Do I need to change it?
SetEnvIfNoCase User-Agent "^EmailSiphon" bad_bot
SetEnvIfNoCase User-Agent "^EmailWolf" bad_bot
This had no effect; requests from those User-Agents still got through. My environment is Linux + Apache + MySQL.
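One possible reason, offered as a guess: those patterns only match User-Agent strings that begin with the listed names, so a request with an empty User-Agent never sets bad_bot. A sketch of the extra line that would mark empty or missing User-Agents, assuming mod_setenvif is loaded and the existing Deny from env=bad_bot block is kept:

```apache
# "^$" matches an empty string; a missing header is compared as empty too
SetEnvIfNoCase User-Agent "^$" bad_bot
```

With the Deny rule these clients get a 403, not a 503.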
What I need now is to configure Apache so that users with an empty user agent are denied access to this site and get a 503 error.
How can this be done? Thanks.
This is an example.
If your Apache has mod_rewrite enabled, you can send these requests to a custom 503 page, or return Forbidden directly:
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule ^.* 503.html
#RewriteRule ^.* - [F]
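Note that rewriting to 503.html serves that page with a normal 200 status. If the status code itself should be 503, something like the following might work. This is a sketch assuming mod_rewrite is enabled and that /503.html exists under the document root (the file name is taken from the example above, its location is an assumption):

```apache
# Assumption: /503.html exists under the document root
ErrorDocument 503 /503.html

RewriteEngine On
# This only tests the header: true when User-Agent is empty or missing
RewriteCond %{HTTP_USER_AGENT} ^$
# Leave the error page itself alone, to avoid a loop
RewriteCond %{REQUEST_URI} !=/503.html
# Drop the substitution and answer with status 503
RewriteRule ^ - [R=503,L]
```

The RewriteCond does not change the User-Agent; it only checks whether it is empty.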
1. RewriteCond %{HTTP_USER_AGENT} ^$
Does this line set the user agent to empty?
2. What I want is not a redirect to a 503 page, but for 503 to show up in the error report.
#2. So you want a custom 503 error message written to error_log? Rewrite can't do that. LogFormat + CustomLog can.
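A rough idea of what the LogFormat + CustomLog approach could look like, assuming mod_log_config and mod_setenvif are loaded. The names empty_ua and ua_status and the log path are made up for illustration, and this writes to a separate access-style log, not to error_log itself:

```apache
# Hypothetical variable name: mark requests with an empty or missing User-Agent
SetEnvIf User-Agent "^$" empty_ua

# Hypothetical format nickname: client, time, request line, final status, User-Agent
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{User-Agent}i\"" ua_status

# Hypothetical log path (relative to ServerRoot); only marked requests are logged
CustomLog "logs/empty_ua.log" ua_status env=empty_ua
```

The %>s field records the final status, so requests answered with 503 would show up as 503 in that file.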