
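How to analyze search engine spider logs
Search engine spider logs are a powerful resource that site owners rarely use to the full. Analyzing them shows how each search engine crawls a site's content and how a given spider behaves over a period of time.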
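
As a rough illustration of this kind of analysis, the sketch below counts requests per spider in a web server access log. It assumes the common "combined" log format; the sample lines, the IP addresses, the `ExampleBot` name, and the exact User-Agent strings are made up for illustration and are not taken from real traffic.

```python
import re
from collections import Counter

# Assumed layout (combined log format):
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def count_spider_hits(lines, spider_names):
    """Count requests per spider, matching each name case-insensitively
    as a substring of the User-Agent header."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent").lower()
        for name in spider_names:
            if name.lower() in agent:
                hits[name] += 1
    return hits


# Hypothetical sample lines, invented for this example.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512 '
    '"-" "Mozilla/5.0 (compatible; Dormouse/1.0)"',
    '192.0.2.10 - - [10/Oct/2023:13:56:01 +0000] "GET /about HTTP/1.1" 200 734 '
    '"-" "Mozilla/5.0 (compatible; Dormouse/1.0)"',
    '198.51.100.7 - - [10/Oct/2023:14:02:11 +0000] "GET /robots.txt HTTP/1.1" 200 64 '
    '"-" "ExampleBot/2.0"',
    '203.0.113.5 - - [10/Oct/2023:14:05:40 +0000] "GET / HTTP/1.1" 200 512 '
    '"https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

for spider, count in count_spider_hits(SAMPLE_LOG, ["Dormouse", "ExampleBot"]).items():
    print(spider, count)
```

Matching on the User-Agent alone is only a first pass, since that header can be spoofed; comparing the client IP against a known address list for the spider is a common second check.
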
IP address (247) | Server name | Country |
---|---|---|
45.61.92.114 | ? | CA |
23.109.195.101 | 23.109.195.101 | NL |
149.71.246.77 | 149.71.246.77 | DE |
23.109.193.136 | 23.109.193.136 | NL |
85.209.78.108 | 85.209.78.108 | GB |
45.56.135.60 | srv60.mailer-static.whitelistmaildomain.net | US |
45.56.133.159 | mail-srv45-56-133-159.host.whoisthismail.net | US |
45.129.235.25 | 45.129.235.25 | NL |
107.181.157.207 | 107.181.157.207 | GB |
45.93.129.217 | 45.93.129.217 | RO |
174.140.201.109 | ? | JP |
185.214.196.145 | ? | FR |
155.254.59.6 | ? | GB |
155.254.50.190 | 155.254.50.190 | GB |
173.211.16.77 | 173.211.16.77.rdns.colocationamerica.com | JP |
107.181.157.83 | 107.181.157.83 | GB |
185.181.122.107 | 185.181.122.107 | DE |
45.80.63.134 | 45.80.63.134 | GB |
45.93.129.212 | 45.93.129.212 | ? |
185.135.212.139 | 185.135.212.139 | GB |
155.254.50.239 | ? | GB |
155.254.50.240 | ? | GB |
154.30.105.177 | ? | US |
23.109.197.62 | ? | NL |
23.109.195.95 | ? | NL |
193.254.54.88 | ? | ? |
174.140.202.82 | ? | JP |
188.208.222.79 | ? | GB |
173.211.16.216 | 173.211.16.216.rdns.colocationamerica.com | JP |
45.129.235.234 | 45.129.235.234 | NL |
155.254.56.174 | ? | GB |
185.181.112.54 | ? | DE |
38.18.20.87 | ? | US |
158.222.119.82 | host.sindad.net | US |
155.254.58.13 | 155.254.58.13 | GB |
154.3.178.88 | 154.3.178.88 | CA |
209.147.81.145 | ? | US |
85.209.79.57 | ? | GB |
158.222.117.172 | host.sindad.net | US |
38.18.27.85 | ? | US |
45.41.130.88 | src088.host.sendmailnice.com | US |
104.232.222.99 | ? | ZA |
23.109.191.107 | ? | NL |
158.222.127.240 | host.sindad.net | US |
45.41.128.147 | src45-41-128-147.mail.berlin-business-school.org | US |
157.97.126.21 | ? | US |
45.93.131.34 | ? | RO |
185.214.199.6 | ? | FR |
173.211.29.83 | ? | JP |
185.214.199.248 | ? | FR |
37.35.46.73 | ? | RO |
45.56.135.44 | srv44.mailer-static.whitelistmaildomain.net | US |
154.3.180.161 | ? | CA |
45.41.129.117 | mail-static117.mailer-static.mailinatorlabs.com | US |
154.60.70.57 | ? | GB |
185.181.120.111 | ? | DE |
158.222.113.3 | host.sindad.net | US |
188.208.222.157 | ? | GB |
185.182.235.236 | 185.182.235.236 | ? |
185.182.235.180 | 185.182.235.180 | DE |
185.182.232.54 | ? | DE |
174.140.202.212 | 174.140.202.212.rdns.colocationamerica.com | JP |
45.65.79.116 | ? | FR |
185.214.198.217 | ? | FR |
154.30.110.214 | ? | US |
157.97.124.193 | ? | US |
185.181.120.223 | ? | DE |
23.109.193.61 | ? | NL |
45.56.133.86 | mail-srv45-56-133-86.host.whoisthismail.net | US |
45.56.135.166 | srv166.mailer-static.whitelistmaildomain.net | US |
154.60.70.255 | ? | GB |
167.160.50.11 | ? | US |
207.199.173.3 | ? | US |
89.35.89.111 | ? | NL |
185.181.122.173 | ? | DE |
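Blocking this crawler is worth considering. Crawlers usually download public web content that is freely accessible by default. However, if you do not want your content used for unauthorized purposes, you should block them.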
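You can block Dormouse, or restrict what it may access, by setting user-agent rules in your site's robots.txt. We recommend installing the Spider Analyser plugin to check whether it actually follows these rules.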

    # robots.txt
    # The following rules will generally block this agent
    User-agent: Dormouse
    Disallow: /

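
Before deploying the rule above, it can be sanity-checked with Python's standard `urllib.robotparser`. This is a minimal sketch: the helper name `is_allowed`, the `SomeOtherBot` agent, and the `example.com` URL are illustrative assumptions. It only shows how a compliant parser reads the rule; whether Dormouse actually honors it can only be confirmed from the server logs.

```python
from urllib.robotparser import RobotFileParser

# The rule from the text, embedded as a string for testing.
ROBOTS_TXT = """\
# robots.txt
User-agent: Dormouse
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if a compliant crawler with this user agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URL and second agent name, used only for the demonstration.
print(is_allowed(ROBOTS_TXT, "Dormouse", "https://example.com/page"))
print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/page"))
```

Because the file contains no `User-agent: *` group, agents other than Dormouse remain unrestricted under this rule.
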
You do not have to do this by hand: our WordPress plugin Spider Analyser can block unwanted spiders and crawlers for you.