
[Filter rules] Homemade filter rules for common ads (last updated: 2008-09-13)

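Thanks to 凤凰工作室 (Phoenix Studio): without it there would be no TheWorld.
Thanks to regular expressions: without them there would be no rules like these.
A few words up front:
1. My ability is limited, so these rules can only filter out some ads, and I can't do anything about the blank frames they leave behind. Please bear with me.
2. Make good use of TheWorld's site whitelist (you can add your own webmail and blog to it).
3. Make good use of TheWorld's option to block Flash (the vast majority of Flash content is advertising; turn it back on when you want to watch a video).

The rules have been optimized, so they are not very readable...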
VERSION 2008-09-13
// Number combinations
#ex#<(I(?:MG|FRAME)|SCRIPT|EMBED|PARAM)\x20[^-_\\\/.?="'>]+(?:[-_\\\/.?="']+[^-_\\\/.?="'>]+)*?[-_\\\/.?="']+([1-9]\d{1,2}[-_xX][1-9]\d{0,1}[05])[-_\\\/.?="'&\x20][^>]*>(?:(?:(?!<\1\b)[\s\S]){0,512}?<\\??\/[^>]*?\1>)?###<S STYLE="background:#EEE;color:#FFF;font-size:9pt" TITLE="$1">[$2]</S>
// Keywords
#ex#<(A|SCRIPT|I(?:MG|FRAME)|EMBED|PARAM)\x20[^-_\\\/.?="'>]+(?:[-_\\\/.?="']+[^-_\\\/.?="'>]+)*?[-_\\\/.?="']+((?:360quan|9v|a(?:d(?:\d+|click|id|pictures|s(?:it)?|union|v(?:iew)?|)|l(?:i(?:mama|union)|l(?:4ad|yes\d*)))|b(?:anners?|iz5)|c(?:hinesefriendfinders?|l(?:ck?|ic(?:k(?:eye)?|))|p(?:[cdlmps]|v|ro))|d(?:annonces|otmore)|e(?:iv|te(?:un)?)|g(?:g(?:ao|img|)|o(?:cpc|oglesyndication)|uang(?:g(?:ao)?|))|h(?:ao(?:ei|ye)|eima8)|i(?:dc|focus)|j(?:iaoyou|oy)|keyrun|market|narrowad|p4p|qyule|s(?:ooe|pc(?:lick|ode)|tat)|tuotu|u(?:link|n(?:i(?:gg|on(?:sky)?)|stat|))|vo(?:done|gate)|webgame|y(?:eeyoo|igao)))[-_\\\/.?="'&][^>]*>(?:(?:(?!<\1\b)[\s\S]){0,512}?<\\??\/[^>]*?\1>)?###<S STYLE="background:#EEE;color:#FFF;font-size:9pt" TITLE="$1">[$2]</S>
// Scripts
#ex#<SCRIPT[^>]*>(?:(?!<\/SCRIPT>)[\s\S])*?(?:[-_/."'?=]|^)(google(?:syndication|_ad)|alimama|allyes|sogou_param|cpro)[-_/."'?=](?:(?!<\/SCRIPT>)[\s\S])*?<\/SCRIPT>###<S STYLE="background:#EEE;color:#FFF;font-size:9pt" TITLE="SCRIPT">[$1]</S>
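The "number combinations" (数字组合) rule keys on size-like tokens such as 468x60 inside a tag's attributes. As a rough illustration only, here is a minimal Python sketch of just that size sub-pattern. It uses Python's `re` module rather than TheWorld's own engine, the helper name `find_ad_sizes` is made up for this example, and the URLs are hypothetical. The full rule additionally requires separator characters on both sides of the token, which this sketch does not check.

```python
import re

# Size sub-pattern copied from the "number combinations" rule:
# a 2-3 digit width, one separator (- _ x X), then a 2-3 digit height ending in 0 or 5.
AD_SIZE = re.compile(r"[1-9]\d{1,2}[-_xX][1-9]\d{0,1}[05]")

def find_ad_sizes(url: str) -> list[str]:
    """Return every size-like token found in the given string."""
    return AD_SIZE.findall(url)

# Hypothetical URLs, for illustration only.
print(find_ad_sizes("http://example.com/img/468x60.gif"))   # ['468x60']
print(find_ad_sizes("http://example.com/photo/2008.jpg"))   # []
```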






(Many intermediate versions omitted)


VERSION 2008-05-27
Variant A: show the filtered keyword and the tag type
#ex#<(A|SCRIPT|IFRAME|EMBED|IMG)[^>]+?(?:\.|\/|_)(a(?:2d|d(?:s?|s?click|tology|tools|union|\d{1,3})|l(?:imama|lyes\d*))|banner|gg|p4p|uni(?:gg|on(?:sys)?)|hao(?:ei|ye)|keyrun|googlesyndication|t2click|tuotu|vodone|yidaba|skype\d*|floatads?|firefox\.js|\w*gg\d*.js|(?:08|78|28|158|36578|89178|9v|360quan|c(?:hanet|o-cm|oopen)|p(?:v|hah)|yigao)\.c(?:om|n)|pro(?:img)?\.163|(?:hc|ma)\.baidu|baidu\.com\/(?:baidu|cpro)|cpc\.sogou|ufile\.kuaiche|atm\.youku|biz5\.sandai|(?:s|p4)p\.tom)(?:\.|\/|"|_|\.js)[^>]*?>(?:[\s\S]*?(?:<\\?\/\1>)|)###<span style="background-color:Silver;color:White;font-size:9pt">[<$1>($2)]</span>
Variant B: hide the match outright
#ex#<(A|SCRIPT|IFRAME|EMBED|IMG)[^>]+?(?:\.|\/|_)(?:a(?:2d|d(?:s?|s?click|tology|tools|union|\d{1,3})|l(?:imama|lyes\d*))|banner|gg|p4p|uni(?:gg|on(?:sys)?)|hao(?:ei|ye)|keyrun|googlesyndication|t2click|tuotu|vodone|yidaba|skype\d*|floatads?|firefox\.js|\w*gg\d*.js|(?:08|78|28|158|36578|89178|9v|360quan|c(?:hanet|o-cm|oopen)|p(?:v|hah)|yigao)\.c(?:om|n)|pro(?:img)?\.163|(?:hc|ma)\.baidu|baidu\.com\/(?:baidu|cpro)|cpc\.sogou|ufile\.kuaiche|atm\.youku|biz5\.sandai|(?:s|p4)p\.tom)(?:\.|\/|"|_|\.js)[^>]*?>(?:[\s\S]*?(?:<\\?\/\1>)|)###
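The two variants differ only in what follows the `###`: variant A substitutes a small grey marker built from the captured groups, while variant B substitutes nothing. The Python sketch below tries to illustrate that difference. It is not TheWorld's implementation: `PATTERN` is a heavily reduced stand-in for the real keyword list, `apply_rule` is a hypothetical helper that emulates `$1`/`$2` group references, case-insensitive matching is an assumption, and the page snippet is invented.

```python
import re

# Reduced stand-in for the variant A/B pattern: same shape, far fewer keywords.
PATTERN = r'<(A|IFRAME|IMG)[^>]+?(?:\.|/|_)(banner|gg|p4p)(?:\.|/|"|_)[^>]*?>'
# Replacement text of variant A (shows tag type and keyword) and variant B (empty).
SHOW = '<span style="background-color:Silver;color:White;font-size:9pt">[<$1>($2)]</span>'
HIDE = ""

def apply_rule(pattern: str, replacement: str, html: str) -> str:
    """Replace every match; $1, $2, ... in the replacement refer to capture groups."""
    def expand(m: re.Match) -> str:
        # Substitute each $n with the text of group n (empty if the group did not match).
        return re.sub(r"\$(\d)", lambda d: m.group(int(d.group(1))) or "", replacement)
    # Assumption: matching is case-insensitive, since pages usually use lowercase tags.
    return re.sub(pattern, expand, html, flags=re.IGNORECASE)

# Hypothetical page fragment.
PAGE = '<p>news</p><img src="http://example.com/banner/top.gif"><p>more</p>'
print(apply_rule(PATTERN, SHOW, PAGE))  # the <img> becomes a grey "[<img>(banner)]" marker
print(apply_rule(PATTERN, HIDE, PAGE))  # the <img> disappears entirely
```

The marker form is handy while tuning a rule, because it shows which keyword triggered the match; the empty form is what you would probably want for everyday browsing.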

(I removed the preview screenshots; they took up too much page height...)
If you find these rules useful, add them to your blacklist :lol:

[ Last edited by 狄烁stec on 2008-9-14 14:06 ]

Version 2008-10-18
http://hi.baidu.com/blacklistbot ... 25af16b9127b99.html
// Number combinations
#ex#<(A|SCRIPT|I(?:MG|FRAME)|EMBED|PARAM)\s+[^>]*?\b(?:CLASS|ID|NAME|HREF|SRC|VALUE)=(?=(?:[^\x20>]*?[-_\\\/.?="'])??([1-9]\d{1,2}[-_xX][1-9]\d{0,1}[05])(?:[-_\\\/.?="'&][^\x20>]*)??)([^\x20>]+)[^>]*?>(?:(?:(?!<\1[\s>])[\s\S])*?<\/\1>)?###<S STYLE="background-color:Silver;color:White;font-size:9pt" TITLE=$1[$3]>[$2]</S>
// Keywords
#ex#<(A|SCRIPT|I(?:MG|FRAME)|EMBED|PARAM)\s+[^>]*?\b(?:CLASS|ID|NAME|HREF|SRC|VALUE)=(?=(?:[^\x20>]*?[-_\\\/.?="'])??(360quan|9v|a(?:d(?:\d+|banner|c(?:lick|ode)|files?|i(?:d|m(?:g|ages))|js|pic(?:s|tures|)|s(?:\d+|cripts|e(?:nce|rv(?:er|ing))|how|it|js|ky|union|view|)|to(?:p|logy)|union|v(?:\d+|all|code|er(?:salservers|t(?:ising)?|)|iew|)|)|l(?:i(?:mama|union)|l(?:4ad|yes\d*)))|b(?:anners?|iz5)|c(?:hinesefriendfinders?|l(?:ck?|ic(?:k(?:eye)?|))|p(?:[cdlmpsv]|ro))|d(?:annonces|otmore)|e(?:iv|te(?:un)?)|f(?:loat(?:adv?|)|ootad|unshion)|g(?:g(?:ao|img|pic|)|o(?:cpc|oglesyndication)|uang(?:g(?:ao)?|))|h(?:ao(?:ei|ye)|eima8)|i(?:dc|focus)|j(?:iaoyou|oy)|k(?:eyrun|oowo)|market|narrowad|p4p|qyule|s(?:ooe|pc(?:lick|ode)|tat)|tuotu|u(?:link|n(?:i(?:gg|on(?:sky)?)|stat|)|usee)|vo(?:done|gate)|webgame|y(?:eeyoo|igao))(?:[-_\\\/.?="'&][^\x20>]*)??)([^\x20>]+)[^>]*?>(?:(?:(?!<\1[\s>])[\s\S])*?<\/\1>)?###<S STYLE="background-color:Silver;color:White;font-size:9pt" TITLE=$1[$3]>[$2]</S>
// Scripts
#ex#<SCRIPT[^>]*>(?:(?!<\/SCRIPT>)[\s\S])*?(?:[-_/."'?=]|^)(google(?:syndication|_ad)|alimama|allyes|sogou_param|cpro)[-_/."'?=](?:(?!<\/SCRIPT>)[\s\S])*?<\/SCRIPT>###<S STYLE="background:#EEE;color:#FFF;font-size:9pt" TITLE="SCRIPT">[$1]</S>
// AD DIV
#ex#<DIV\s+[^>]*?\b(?:CLASS|ID)=(?=(?:[^\x20>]*?[-_\\\/.?="'])??(ad\d*?|b(?:anner|dfs\d+))(?:[-_\\\/.?="'&][^\x20>]*)??)([^\x20>]+)[^>]*?>(?:(?!<DIV)[\s\S])*?</DIV>###<S STYLE="background-color:Silver;color:White;font-size:9pt" TITLE=DIV[$3]>[$2]</S>
// <!-- AD begin -->...<!-- AD end -->
#ex#<!--(ads?) begin-->[\s\S]+?<!--\1 end-->###<S STYLE="background:#EEE;color:#FFF;font-size:9pt" TITLE="&lt;!--&gt;">[$1]</S>
// Miscellaneous
#ex#waitingTime\s*=\s*\d+###waitingTime=0
#ex#beginPosition != -1###true
#ex#beginPosition <0###false
// Add a double border to <iframe> elements
#ex#</head>###<style type="text/css"><!--iframe {border:3px double #9cf}--></style>
// 霏凡 (crsky.com): show the real download address
#exd#*crsky.com*#'\s\+\sadlist\s\+\s'###
// 9lala comics: batch download
#exd#*www.9lala.com/html/*#<div [^>]+><img src="(http:\/\/mh\d*.9lala.com/(?:[^/]+/)+?)1.jpg"></div>[\S\s]+?<div align="center">本漫画共<[^>]+>(\d+)<[^>]+>[\s\S]+?</div>###<script language="JavaScript">var maxpage=$2;for (var i=1;i<=maxpage;i++){document.writeln('<img src="$1'+i+'.jpg" width="32" height="32" title="'+i+'" />'+i);if (i%10==0){document.writeln('<br>')}}</script>$&
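Judging only from the rules as posted (not from any TheWorld documentation), a rule line seems to have the shape `#ex#PATTERN###REPLACEMENT`, and the `#exd#` form adds a site filter: `#exd#URL-GLOB#PATTERN###REPLACEMENT`. The Python sketch below splits a line into those fields under that assumption. `Rule`, `parse_rule` and `applies_to` are names invented for this example, and treating `*` in the site filter as a shell-style wildcard is likewise a guess.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple, Optional

class Rule(NamedTuple):
    url_glob: Optional[str]  # None means the rule is applied on every site
    pattern: str             # regular expression matched against the page source
    replacement: str         # text substituted for each match (may be empty)

def parse_rule(line: str) -> Rule:
    """Split one rule line into its fields (layout inferred from the posted rules)."""
    if line.startswith("#exd#"):
        # Site-specific rule: the URL glob runs up to the next single '#'.
        url_glob, _, rest = line[len("#exd#"):].partition("#")
    elif line.startswith("#ex#"):
        url_glob, rest = None, line[len("#ex#"):]
    else:
        raise ValueError("not a rule line: " + line)
    # The first '###' separates the pattern from the replacement.
    pattern, _, replacement = rest.partition("###")
    return Rule(url_glob, pattern, replacement)

def applies_to(rule: Rule, url: str) -> bool:
    """Assumed semantics: '*' in the glob matches any run of characters."""
    return rule.url_glob is None or fnmatchcase(url, rule.url_glob)

rule = parse_rule(r"#exd#*crsky.com*#'\s\+\sadlist\s\+\s'###")
print(rule)
print(applies_to(rule, "http://www.crsky.com/soft/1.html"))
```

Splitting on the first `###` matters because replacements such as `background:#EEE` contain single `#` characters of their own.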

Are there any ready-made filter rules for 3.0?

Thanks, OP. It's great that you keep these updated!

Sorry about that: I forgot to escape the "<>" characters when I pasted the rules here. Updated.

...... After applying this rule, 绿盟 seems to be completely broken.

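I just noticed that I have edit permission now.
Updating the rules and bumping the thread while I'm at it, hehe.

P.S. There have been a lot of spam replies lately...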

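Re the post above: sorry, the blacklist in tw1.x does not currently support regular expressions. Please upgrade to tw2.x; the new version is actually quite stable now.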

I'm on TW 1.4. Why doesn't it work? (I tested it on bibidu.)

OP, could you check whether the rules kill the search on http://www.btchina.net/?

With the rules from post #2, Sina Video (新浪视频) and Sina Share (新浪共享) seem to work fine.

#ex#<(A|SCRIPT|IFRAME|EMBED|IMG)[^>]+?\b(?:href|src)[^>]*?=[^>]*?(?:\.|\/)(?:ad(?:[-_][a-z*]+|\d{1,3}|files?|frame|gifs?|graph|images|cycle|show|s?click|tology|tools|league|s?union|4all|2d|vs?|s?)|al(?:imama|l4ad|lyes\d*)|banner|cpc|g(?:uang?)?g(?:ao|img)?|pop(?:up)?(?:s)?|p4p|uni(?:gg|on(?:sys)?)|hao(?:ei|ye)|keyrun|googlesyndication|t2click|tuotu|vodone|yeeyoo|yidaba|tuijian|qyule|skype\d*|float(?:ads?)?|firefox\.(?:js|gif|html?)|\w*gg\d*.js|[\d?]{2,3}[-_x+][\d?]{2,3}|(?:(?!51|91|5460|163|126|265|17173|955)\d{2,6}|9v|360quan|c(?:hanet|o-cm|oopen)|p(?:v|hah)|yigao|sooe).c(?:om|n)|pro(?:img)?\.163|(?:hc|ma|utk|spcode)\.baidu|baidu\.com\/(?:baidu|cpro)|ufile\.kuaiche|atm\.youku|biz5\.sandai|(?:s|p4)p\.tom|eachnet\.com\/[\w./]*?\?adid)(?:\.|\/|"|=)[^>]*?>[\s\S]*?<\\?\/\1>###
Really, all you have to do is delete everything after the "###".
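That edit is mechanical enough to script. A minimal Python sketch, assuming the first `###` in a rule line always marks the start of the replacement; `hide_only` is a made-up helper name, not part of TheWorld.

```python
def hide_only(rule: str) -> str:
    """Return the rule with its replacement removed, keeping the '###' separator."""
    head, sep, _ = rule.partition("###")
    if not sep:
        raise ValueError("rule has no ### separator")
    return head + sep

# One of the rules posted in this thread, with its grey-marker replacement.
marker_rule = (
    r'#ex#<!--(ads?) begin-->[\s\S]+?<!--\1 end-->###'
    r'<S STYLE="background:#EEE;color:#FFF;font-size:9pt" TITLE="&lt;!--&gt;">[$1]</S>'
)
print(hide_only(marker_rule))
```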
