
How do I filter the content of a page like this?

Suppose the URL is ftp://***.***.***.***/********.htm
I want to replace “A” in the page with “B”.
Please give me a rule for this.
Or is there any other information you need from me?

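What you are asking for is a replacement. The basic format of an ad-filter rule is: #(type)#(url)#(restring)###(replace string)

For details, see the user manual or the tutorials on the forum.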
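
To make the four fields of that format easier to see, here is a minimal Python sketch that splits a rule line into its parts. It only illustrates the layout quoted above; `FilterRule` and `parse_rule` are hypothetical names, `example.com` is a placeholder, and the browser's own parser may behave differently (for instance when a field itself contains `#`).

```python
from typing import NamedTuple


class FilterRule(NamedTuple):
    """The four fields of '#(type)#(url)#(restring)###(replace string)'."""
    rule_type: str
    url: str
    restring: str
    replacement: str


def parse_rule(line: str) -> FilterRule:
    """Split one rule line into its fields (hypothetical helper, not the browser's code)."""
    # Everything after the first '###' is taken as the replacement string.
    head, sep, replacement = line.rstrip("\n").partition("###")
    if not sep or not head.startswith("#"):
        raise ValueError(f"not a replacement rule: {line!r}")
    # Drop the leading '#', split off type and url; whatever remains is restring.
    parts = head[1:].split("#", 2)
    if len(parts) != 3:
        raise ValueError(f"expected type, url and restring in: {line!r}")
    return FilterRule(parts[0], parts[1], parts[2], replacement)


rule = parse_rule("#exd#*example.com*#A###B")
print(rule.rule_type, rule.url, rule.restring, rule.replacement)
```

Read this way, the original question ("replace A with B") maps onto the last two fields, while the first two say which kind of rule it is and which pages it applies to.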
For a complete introduction to the features of TheWorld 2.x, see the TheWorld (世界之窗) user manual.


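I am asking precisely because I read it and still do not understand it.
In particular, what goes in the #(type)#(url)# part?
Please explain!

Also, could you give me a link to the tutorial?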


Last edited by 噬血细胞Xxs on 2009-6-5 13:51

Please give me a worked example! The page source is:
<html xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns="http://www.w3.org/TR/REC-html40">

<head>
<meta http-equiv=Content-Type content="text/html; charset=us-ascii">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 11">
<meta name=Originator content="Microsoft Word 11">
<link rel=File-List href="tongzhi2.files/filelist.xml">
<title>欢迎使用中国电信“蔚蓝校园”宽带</title>
<!--[if gte mso 9]><xml>
<o:DocumentProperties>
<o:Author>小桶</o:Author>
<o:LastAuthor>Datacom Division</o:LastAuthor>
<o:Revision>4</o:Revision>
<o:TotalTime>146</o:TotalTime>
<o:Created>2009-04-26T08:04:00Z</o:Created>
<o:LastSaved>2009-04-27T02:13:00Z</o:LastSaved>
<o:Pages>1</o:Pages>
<o:Words>60</o:Words>
<o:Characters>343</o:Characters>
<o:Company>数据</o:Company>
<o:Lines>2</o:Lines>
<o:Paragraphs>1</o:Paragraphs>
<o:CharactersWithSpaces>402</o:CharactersWithSpaces>
<o:Version>11.5606</o:Version>
</o:DocumentProperties>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:WordDocument>
<w:Zoom>200</w:Zoom>
<w:DontDisplayPageBoundaries/>
<w:SpellingState>Clean</w:SpellingState>
<w:GrammarState>Clean</w:GrammarState>
<w:PunctuationKerning/>
<w:DrawingGridVerticalSpacing>7.8 磅</w:DrawingGridVerticalSpacing>
<w:DisplayHorizontalDrawingGridEvery>0</w:DisplayHorizontalDrawingGridEvery>
<w:DisplayVerticalDrawingGridEvery>2</w:DisplayVerticalDrawingGridEvery>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:Compatibility>
<w:SpaceForUL/>
<w:BalanceSingleByteDoubleByteWidth/>
<w:DoNotLeaveBackslashAlone/>
<w:ULTrailSpace/>
<w:DoNotExpandShiftReturn/>
<w:AdjustLineHeightInTable/>
<w:BreakWrappedTables/>
<w:SnapToGridInCell/>
<w:WrapTextWithPunct/>
<w:UseAsianBreakRules/>
<w:DontGrowAutofit/>
<w:UseFELayout/>
</w:Compatibility>
<w:BrowserLevel>MicrosoftInternetExplorer4</w:BrowserLevel>
</w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" LatentStyleCount="156">
</w:LatentStyles>
</xml><![endif]-->
<style>
<!--
/* Font Definitions */
@font-face
{font-family:SimSun;
panose-1:2 1 6 0 3 1 1 1 1 1;
mso-font-alt:SimSun;
mso-font-charset:134;
mso-generic-font-family:auto;
mso-font-pitch:variable;
mso-font-signature:3 135135232 16 0 262145 0;}
@font-face
{font-family:SimHei;
panose-1:2 1 6 0 3 1 1 1 1 1;
mso-font-alt:SimHei;
mso-font-charset:134;
mso-generic-font-family:auto;
mso-font-pitch:variable;
mso-font-signature:1 135135232 16 0 262144 0;}
@font-face
{font-family:Verdana;
panose-1:2 11 6 4 3 5 4 4 2 4;
mso-font-charset:0;
mso-generic-font-family:swiss;
mso-font-pitch:variable;
mso-font-signature:536871559 0 0 0 415 0;}
@font-face
{font-family:SimHei;
panose-1:2 1 6 0 3 1 1 1 1 1;
mso-font-charset:134;
mso-generic-font-family:auto;
mso-font-pitch:variable;
mso-font-signature:1 135135232 16 0 262144 0;}
@font-face
{font-family:SimSun;
panose-1:2 1 6 0 3 1 1 1 1 1;
mso-font-charset:134;
mso-generic-font-family:auto;
mso-font-pitch:variable;
mso-font-signature:3 135135232 16 0 262145 0;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{mso-style-parent:"";
margin:0cm;
margin-bottom:.0001pt;
text-align:justify;
text-justify:inter-ideograph;
mso-pagination:none;
font-size:10.5pt;
mso-bidi-font-size:12.0pt;
font-family:"Times New Roman";
mso-fareast-font-family:SimSun;
mso-font-kerning:1.0pt;}
a:link, span.MsoHyperlink
{mso-ansi-font-size:9.0pt;
mso-bidi-font-size:9.0pt;
font-family:Verdana;
mso-ascii-font-family:Verdana;
mso-hansi-font-family:Verdana;
color:#333333;
mso-text-animation:none;
text-decoration:none;
text-underline:none;
text-decoration:none;
text-line-through:none;}
a:visited, span.MsoHyperlinkFollowed
{color:purple;
text-decoration:underline;
text-underline:single;}
span.style2
{mso-style-name:style2;}
span.GramE
{mso-style-name:"";
mso-gram-e:yes;}
/* Page Definitions */
@page
{mso-page-border-surround-header:no;
mso-page-border-surround-footer:no;}
@page Section1
{size:595.3pt 841.9pt;
margin:72.0pt 90.0pt 72.0pt 90.0pt;
mso-header-margin:42.55pt;
mso-footer-margin:49.6pt;
mso-paper-source:0;
layout-grid:15.6pt;}
div.Section1
{page:Section1;}
-->
</style>
<!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:\666E\901A\8868\683C;
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-parent:"";
mso-padding-alt:0cm 5.4pt 0cm 5.4pt;
mso-para-margin:0cm;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:10.0pt;
font-family:"Times New Roman";
mso-fareast-font-family:"Times New Roman";
mso-ansi-language:#0400;
mso-fareast-language:#0400;
mso-bidi-language:#0400;}
</style>
<![endif]--><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="16386"/>
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1"/>
</o:shapelayout></xml><![endif]-->
</head>

<body bgcolor=white lang=ZH-CN link="#333333" vlink=purple style='tab-interval:
21.0pt;text-justify-trim:punctuation'>
<!--[if gte mso 9]><xml>
<v:background id="_x0000_s1025" o:bwmode="white" o:targetscreensize="800,600">
<v:fill recolor="t" type="frame"/>
</v:background></xml><![endif]-->

<div class=Section1 style='layout-grid:15.6pt'>

<p class=MsoNormal align=center style='text-align:center;layout-grid-mode:char'><b
style='mso-bidi-font-weight:normal'><span style='font-size:26.0pt;font-family:
SimHei;mso-hansi-font-family:Verdana;color:#333333'>学院</span></b><b><span
style='font-size:26.0pt;font-family:SimHei;color:#333333'>“数字校园”网络优化通知</span></b></p>

<p class=MsoNormal style='layout-grid-mode:char'><b><span lang=EN-US
style='font-size:26.0pt;mso-ascii-font-family:SimHei;mso-fareast-font-family:
SimHei'> </span></b></p>

<p class=MsoNormal style='layout-grid-mode:char'><b><span style='font-size:
16.0pt;font-family:SimHei'>为了提升您对数字校园的使用感知,电信公司计划于<span
lang=EN-US>2009</span>年<span lang=EN-US>4</span>月<span
lang=EN-US>28</span>日上午<span lang=EN-US>8</span>点对学院数字校园宽带网络进行优化升级,升级过程中可能会导致您的上网业务出现中断,并且升级完以后,<span
style='color:red'>需要使用新的客户端拨号器才能上网</span>。请各位同学务必提前安装好新客户端拨号软件,给您带来的不便敬请谅解。</span></b></p>

<p class=MsoNormal style='layout-grid-mode:char'><span lang=EN-US
style='font-size:16.0pt;mso-ascii-font-family:SimHei;mso-fareast-font-family:
SimHei'> </span></p>

<p class=MsoNormal align=center style='text-align:center;layout-grid-mode:char'><span
style='font-size:14.0pt;font-family:SimHei'>★<b>新客户端拨号软件</b>
请点击<b><u><span lang=EN-US style='color:blue'><a
href="http://202.103.194.212:9081/pop/version/download.html"><u><span
lang=EN-US style='mso-ansi-font-size:14.0pt;mso-bidi-font-size:14.0pt'><span
lang=EN-US>下载</span></span></u></a></span></u></b><span
style='color:black'>安装 点击查看<b><u><span
lang=EN-US><a href="ftp://222.216.111.198/doc5.doc"><u><span lang=EN-US
style='mso-ansi-font-size:14.0pt;mso-bidi-font-size:14.0pt'><span lang=EN-US>帮助</span></span></u></a></span></u></b></span></span></p>

<p class=MsoNormal align=center style='text-align:center;layout-grid-mode:char'><span
style='font-size:14.0pt;font-family:SimHei'>★<b>更改密码、网上充值</b>及其他<span
class=GramE><b>自服务</b>请访问</span><b><u><span
lang=EN-US>http://gx.ct10000.com/campus_card/index.html</span></u></b></span></p>

<p class=MsoNormal align=center style='text-align:center;layout-grid-mode:char'><span
class=style2><span lang=EN-US style='font-size:16.0pt'> </span></span></p>

<p class=MsoNormal align=center style='text-align:center;layout-grid-mode:char'><span
class=style2><b><span style='font-size:16.0pt;font-family:SimHei'>客服电话:<span
lang=EN-US style='color:#FF6600'>10000</span></span></b></span><span
class=style2><b><span lang=EN-US style='font-size:16.0pt;mso-ascii-font-family:
SimHei;mso-fareast-font-family:SimHei'>  </span></b></span><span
class=style2><b><span lang=EN-US style='font-size:16.0pt;font-family:SimHei'>
24</span></b></span><span class=style2><b><span style='font-size:16.0pt;
font-family:SimHei'>小时热线</span></b></span></p>

<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>

</div>

</body>

</html>
Suppose the URL is ftp://***.***.***.***/********.htm
I want to replace the text “数字校园”网络优化通知 in the page with “ABDD”.


How should I write that? I don't understand regular expressions!


What you posted is clearly an HTTP page, so why assume the URL is FTP?
#exd#*URL*#“数字校园”网络优化通知###ABDD
Note: in the URL field, write only the domain name.
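
As a rough illustration of what such a rule is meant to do, the following Python sketch checks a page URL against the wildcard pattern and, on a match, substitutes the text. This is only a model under stated assumptions: `apply_rule` is a hypothetical function, `*` is assumed to behave like a shell-style wildcard, the search text is treated as a literal string rather than a regular expression, and `example.com` stands in for the real domain. The browser's actual matching may differ.

```python
from fnmatch import fnmatchcase


def apply_rule(page_url: str, html: str, url_pattern: str,
               find: str, replace: str) -> str:
    """Return html with find replaced, but only on pages whose URL matches.

    Assumptions: '*' in url_pattern matches any run of characters
    (fnmatch also treats '?' and '[...]' specially), and find is literal text.
    """
    if not fnmatchcase(page_url, url_pattern):
        return html  # the rule does not apply to this page
    return html.replace(find, replace)


source = "<b><span>“数字校园”网络优化通知</span></b>"
filtered = apply_rule(
    "http://example.com/notice.htm",  # placeholder URL
    source,
    "*example.com*",                  # only the domain, wrapped in wildcards
    "“数字校园”网络优化通知",
    "ABDD",
)
print(filtered)
```

Because the heading text appears as one unbroken run inside a single span in the page source, a plain literal match is enough here and no regular-expression syntax is needed.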


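Last edited by 噬血细胞Xxs on 2009-6-5 17:21

Replying to #5 (smile16888):


But that really is the source of the page on the FTP site!


Is it #exd#*ftp://***.***.***.***/********.htm*#“数字校园”网络优化通知###ABDD ?

That doesn't work!

What do I do when there is no domain name...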


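ftp: URLs are not supported.
Suppose the URL is http://bbs.ioage.com/cn/forum-36-1.html. Then the “URL” (网址) placeholder in the rule from post #5 can be the domain name or any other part of that address. It does need to be reasonably distinctive, though; otherwise the rule is no different from an #ex# rule.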
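
The two points in this reply can be modelled in a few lines of Python. The sketch below is an assumption-laden illustration, not the browser's logic: `rule_applies` is a hypothetical helper that refuses ftp: addresses and otherwise treats the URL field as a plain substring of the page address, and the 192.0.2.1 and example.org addresses are placeholders.

```python
from urllib.parse import urlsplit


def rule_applies(page_url: str, url_part: str) -> bool:
    """Model of the URL check: ftp: pages are skipped, others match by substring."""
    if urlsplit(page_url).scheme.lower() == "ftp":
        return False  # per the reply above, ftp: is not supported
    return url_part in page_url


page = "http://bbs.ioage.com/cn/forum-36-1.html"
print(rule_applies(page, "bbs.ioage.com"))   # domain name
print(rule_applies(page, "forum-36"))        # any other part of the address
print(rule_applies("ftp://192.0.2.1/notice.htm", "192.0.2.1"))
# A fragment that is not distinctive matches unrelated pages too:
print(rule_applies("http://example.org/", "http"))
```

The last call shows why distinctiveness matters: a fragment such as `http` would make the rule fire on practically every page.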
