Total IPs:
select COUNT(distinct c-ip) as GueCount from 'F:ex111001.log'
Result: 1865

Total spider IPs:
select COUNT(distinct c-ip) as GueCount from 'F:ex111001.log' where cs(User-Agent) like '%baidu%' or cs(User-Agent) like '%Yahoo!+Slurp%' or cs(User-Agent) like '%google%' or cs(User-Agent) like '%YoudaoBot%' or cs(User-Agent) like '%sogou%' or cs(User-Agent) like '%msnbot%' or cs(User-Agent) like '%+bingbot%'
Result: 569

Total visitor IPs (the query is almost the same as the spider one, except that each like gets a not in front of it and each or becomes and):
select COUNT(distinct c-ip) as GueCount from 'F:ex111001.log' where cs(User-Agent) not like '%baidu%' and cs(User-Agent) not like '%Yahoo!+Slurp%' and cs(User-Agent) not like '%google%' and cs(User-Agent) not like '%YoudaoBot%' and cs(User-Agent) not like '%sogou%' and cs(User-Agent) not like '%msnbot%' and cs(User-Agent) not like '%+bingbot%'
Result: 1304

Here is the problem: in theory the result should be [total IPs] minus [spider IPs], that is 1865 - 569 = 1296, but the query returns 1304. Why are there 8 extra? Where do they come from? I hope someone experienced can help me think this through...
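So for now I am just reading the log directly with this. Of course, if I needed to import the data into a database I would choose SQL Server, because the time SqlBulkCopy takes is still tolerable.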
and not like
This is not something you can work out by subtraction. Think it through carefully.
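To illustrate why the two counts need not differ by exactly the spider total, here is a minimal Python sketch on made-up rows (the IPs and User-Agent strings are invented for illustration and are not taken from ex111001.log). It assumes that like '%x%' behaves like a case-insensitive substring match. The where clause filters individual requests, while COUNT(distinct c-ip) counts IPs, so an IP that sent both a spider User-Agent and a browser User-Agent is counted by both queries.

```python
# Toy rows of (c-ip, cs(User-Agent)); hypothetical data, not from the real log.
rows = [
    ("10.0.0.1", "Mozilla/5.0+(Windows)"),
    ("10.0.0.2", "Baiduspider+(+http://www.baidu.com/search/spider.htm)"),
    ("10.0.0.3", "Mozilla/5.0+(compatible;+bingbot/2.0)"),
    ("10.0.0.3", "Mozilla/4.0+(compatible;+MSIE+8.0)"),  # same IP, browser UA
    ("10.0.0.4", "Mozilla/5.0+(X11;+Linux)"),
]

# The patterns from the thread's queries, lowercased for matching.
SPIDER_MARKERS = ["baidu", "yahoo!+slurp", "google", "youdaobot",
                  "sogou", "msnbot", "+bingbot"]


def is_spider(user_agent):
    # Assumption: like '%x%' is modelled as a case-insensitive substring test.
    ua = user_agent.lower()
    return any(marker in ua for marker in SPIDER_MARKERS)


all_ips = {ip for ip, ua in rows}                          # no where clause
spider_ips = {ip for ip, ua in rows if is_spider(ua)}      # like ... or ...
visitor_ips = {ip for ip, ua in rows if not is_spider(ua)} # not like ... and ...
both = spider_ips & visitor_ips                            # counted twice

print(len(all_ips), len(spider_ips), len(visitor_ips), sorted(both))
# prints: 4 2 3 ['10.0.0.3']
```

In this toy data, total minus spiders is 4 - 2 = 2, but the not like query returns 3, because 10.0.0.3 appears in both sets. The gap equals the number of IPs that show up under both kinds of User-Agent.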
The cs(User-Agent) column in the log clearly contains the value jikespider, yet the result is still 1304. Baffling...
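I think subtraction is more accurate. What does everyone think?
Then work out the total number of spider IPs,
and finally subtract. Wouldn't that result be more accurate?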
As for and not like... I still can't quite figure it out, but I feel like I am starting to sense something...
Subtraction feels more accurate...
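As a rough check on the numbers reported in this thread, the sketch below applies inclusion-exclusion to the three results. It assumes every request either matches or does not match the spider patterns (so every IP falls into at least one of the two filtered counts); rows with an empty or missing User-Agent could break that assumption. The function name overlap_count is made up for this example.

```python
def overlap_count(total_ips, spider_ips, non_spider_ips):
    """IPs counted by both filtered queries, via inclusion-exclusion.

    Assumes every IP is counted by at least one of the two queries.
    """
    return spider_ips + non_spider_ips - total_ips


# The three results quoted in the original post.
total, spiders, visitors = 1865, 569, 1304

shared = overlap_count(total, spiders, visitors)
never_spider = total - spiders  # IPs with no spider-matching request at all

print(shared, never_spider, visitors - shared)
# prints: 8 1296 1296
```

Under that assumption, the 8 extra IPs are the ones that sent at least one spider-like request and at least one non-spider request. Subtraction (1296) counts IPs that never looked like a spider, while the not like query (1304) counts IPs that looked like a normal visitor at least once; which one is "more accurate" depends on which of those two definitions is wanted.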