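How to Analyze Search Engine Spider Logs
Search engine spider logs are a powerful resource that webmasters rarely use to the full. Analyzing them shows how each search engine crawls your site's content and how its spider behaves over a given period.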
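As a rough illustration of this kind of analysis, the sketch below counts requests per spider in an access log. It assumes the server writes the common "combined" log format; the sample log lines, the documentation-range IP addresses, and the `SPIDER_PATTERNS` table are invented for the example and are not taken from a real site. The pattern for 008 requires the product token `008/` so that it does not match unrelated strings such as a Gecko build date.

```python
import re
from collections import Counter

# Combined log format:
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical label -> pattern table; extend it for the spiders you care about.
SPIDER_PATTERNS = {
    "Googlebot": re.compile(r"Googlebot", re.I),
    "Baiduspider": re.compile(r"Baiduspider", re.I),
    # Match the product token "008/" only, not digits inside other tokens.
    "008": re.compile(r"(?<![\w.])008/"),
}


def count_spider_hits(lines):
    """Return a Counter mapping spider label to number of logged requests."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent")
        for name, pattern in SPIDER_PATTERNS.items():
            if pattern.search(agent):
                hits[name] += 1
                break
    return hits


# Invented sample lines, for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" "Mozilla/5.0 (compatible; 008/0.83) Gecko/2008032620"',
    '203.0.113.5 - - [10/Oct/2023:13:55:40 +0000] "GET /post/1 HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; 008/0.83) Gecko/2008032620"',
    '198.51.100.7 - - [10/Oct/2023:14:01:02 +0000] "GET / HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.44 - - [10/Oct/2023:14:05:10 +0000] "GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"',
]

if __name__ == "__main__":
    for name, count in count_spider_hits(SAMPLE_LOG).most_common():
        print(name, count)
```

To run it against a real log, pass an open file object instead of `SAMPLE_LOG`; the function only needs an iterable of lines.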

| IP address (643) | Hostname | Country |
|---|---|---|
| 24.241.227.11 | 24-241-227-11.dhcp.mdsn.wi.charter.com | US |
| 68.48.8.30 | c-68-48-8-30.hsd1.dc.comcast.net | US |
| 212.34.12.45 | 212.34.12.45 | JO |
| 64.125.222.50 | 64.125.222.50.IPYX-103455-006-ZYO.zip.zayo.com | US |
| 64.125.222.16 | 64.125.222.16.network.zip.zayo.com | US |
| 46.211.33.0 | 46-211-33-0-chg.broadband.kyivstar.net | UA |
| 91.235.71.229 | 91-235-68-229.telegroup.kiev.ua | UA |
| 175.136.140.93 | 175.136.140.93 | MY |
| 176.241.243.142 | host-176-241-243-142.jmdi.pl | PL |
| 178.206.212.157 | 178.206.212.157 | RU |

| IP address (101) | Hostname | Country |
|---|---|---|
| 128.71.100.1 | 128-71-100-1.broadband.corbina.ru | RU |
| 178.217.104.183 | 178-217-104-183.u-lan.ru | RU |
| 91.205.162.5 | 91.205.162.5 | RU |
| 36.83.136.51 | 36.83.136.51 | ID |
| 46.175.166.7 | nat6.46-175-166-7.norma4.ks.ua | UA |
| 83.149.48.3 | 83.149.48.3 | RU |
| 95.69.209.163 | ip-95-69-209-163.airbites.net.ua | UA |
| 46.146.10.3 | net10-3.perm.ertelecom.ru | RU |
| 93.159.240.179 | 93.159.240.179 | RU |
| 136.169.210.191 | 136.169.210.191 | RU |

| IP address (451) | Hostname | Country |
|---|---|---|
| 65.7.5.57 | host-65-7-5-57.jan.bellsouth.net | US |
| 24.241.227.11 | 24-241-227-11.dhcp.mdsn.wi.charter.com | US |
| 68.48.8.30 | c-68-48-8-30.hsd1.dc.comcast.net | US |
| 212.34.12.45 | 212.34.12.45 | JO |
| 64.125.222.50 | 64.125.222.50.IPYX-103455-006-ZYO.zip.zayo.com | US |
| 64.125.222.16 | 64.125.222.16.network.zip.zayo.com | US |
| 46.211.33.0 | 46-211-33-0-chg.broadband.kyivstar.net | UA |
| 91.235.71.229 | 91-235-68-229.telegroup.kiev.ua | UA |
| 175.136.140.93 | 175.136.140.93 | MY |
| 176.241.243.142 | host-176-241-243-142.jmdi.pl | PL |
| 178.206.212.157 | 178.206.212.157 | RU |
| 66.188.62.254 | 66-188-62-254.dhcp.bycy.mi.charter.com | US |
| 76.122.74.20 | c-76-122-74-20.hsd1.ga.comcast.net | US |
| 24.102.170.84 | 24.102.170.84.res-cmts.mlf.ptd.net | US |
| 71.225.19.41 | c-71-225-19-41.hsd1.pa.comcast.net | US |
| 66.227.144.4 | 66-227-144-4.dhcp.trcy.mi.charter.com | US |
| 68.160.46.214 | pool-68-160-46-214.bos.east.verizon.net | US |
| 76.101.101.90 | c-76-101-101-90.hsd1.fl.comcast.net | US |
| 74.192.248.30 | r74-192-248-30.tyrdcmta02.tylrtx.tl.dh.suddenlink.net | US |
| 75.30.229.71 | adsl-75-30-229-71.dsl.pltn13.sbcglobal.net | US |
| 71.185.63.162 | pool-71-185-63-162.phlapa.fios.verizon.net | US |
| 72.72.1.138 | 72.72.1.138 | US |
| 68.58.181.241 | c-68-58-181-241.hsd1.sc.comcast.net | US |
| 75.97.163.76 | 75.97.163.76.res-cmts.dlh.ptd.net | US |
| 98.119.106.112 | pool-98-119-106-112.lsanca.fios.verizon.net | US |
| 98.251.41.59 | c-98-251-41-59.hsd1.ga.comcast.net | US |
| 24.245.19.226 | c-24-245-19-226.hsd1.mn.comcast.net | US |
| 24.192.237.88 | d192-24-88-237.col.wideopenwest.com | CA |
| 69.152.36.146 | adsl-69-152-36-146.dsl.ksc2mo.swbell.net | US |
| 68.192.137.247 | ool-44c089f7.dyn.optonline.net | US |
| 69.214.4.81 | ppp-69-214-4-81.dsl.klmzmi.ameritech.net | US |

| IP address (451) | Hostname | Country |
|---|---|---|
| 97.120.196.218 | 97-120-196-218.ptld.qwest.net | US |
| 24.13.80.166 | c-24-13-80-166.hsd1.il.comcast.net | US |
| 66.208.218.65 | cmts.ubr01b.flshng01.mi.hfc.comcastbusiness.net | US |
| 88.170.233.27 | ivr94-9-88-170-233-27.fbx.proxad.net | FR |
| 98.74.48.106 | adsl-74-48-106.aby.bellsouth.net | US |
| 98.228.125.61 | c-98-228-125-61.hsd1.in.comcast.net | US |
| 138.88.209.58 | pool-138-88-209-58.res.east.verizon.net | US |
| 75.22.92.16 | adsl-75-22-92-16.dsl.irvnca.sbcglobal.net | US |
| 71.114.179.246 | pool-71-114-179-246.trrhin.dsl-w.verizon.net | US |
| 66.191.131.67 | 66-191-131-67.static.roch.mn.charter.com | US |

| IP address (1) | Hostname | Country |
|---|---|---|
| 64.125.222.19 | 64.125.222.19.IPYX-103455-003-ZYO.zip.zayo.com | US |
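
Consider blocking it. Crawlers usually download public internet content, which is freely accessible by default. However, if you do not want your content used for unauthorized purposes, you should block them.
You can block 008, or restrict its access, by setting user-agent rules in your site's robots.txt. We recommend installing the Spider Analyser plugin to check whether the crawler actually follows these rules.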

    # robots.txt
    # The following rules block this user agent in most cases
    User-agent: 008
    Disallow: /

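Before deploying a rule like this, it can help to confirm that it parses the way you intend. The following is a minimal sketch using Python's standard `urllib.robotparser`; the `is_allowed` helper and the `example.com` URL are hypothetical names introduced for illustration. It only checks how the rules are interpreted. It cannot show whether a given crawler honors them, since robots.txt is advisory.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt snippet above.
ROBOTS_TXT = """\
User-agent: 008
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Pass the crawler's product token (e.g. "008"), not its full
    # User-Agent header: the parser matches on the text before the first "/".
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/any/page"  # placeholder URL
    print("008 allowed:", is_allowed(ROBOTS_TXT, "008", url))
    print("Googlebot allowed:", is_allowed(ROBOTS_TXT, "Googlebot", url))
```

With these rules, 008 is denied every path, while agents with no matching group remain allowed because there is no `User-agent: *` group.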
You do not have to do this by hand: our WordPress plugin, Spider Analyser, can block unwanted spiders and crawlers for you.