
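How to Analyze Search Engine Spider Logs
Search engine spider logs are a powerful resource that webmasters rarely use to the full. Analyzing them shows how each search engine crawls your site's content and how its spiders behave over a period of time.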
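As a rough illustration of this kind of log analysis, the sketch below counts requests per user agent. It assumes the server writes the common "combined" access log format; the sample lines, IP addresses, and the `ExampleBot/1.0` agent are made up for illustration, and the user-agent string a real crawler sends may differ from the name used here.

```python
import re
from collections import Counter

# Assumed format: Apache/Nginx "combined" access log.
# Adjust the pattern if your server logs in a different layout.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_hits_by_agent(lines):
    """Return a Counter mapping each user-agent string to its request count."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:  # lines that do not fit the assumed format are skipped
            hits[match.group("agent")] += 1
    return hits


# Hypothetical sample data, not taken from a real log.
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jan/2024:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "hyScore.io crawler"',
    '192.0.2.10 - - [01/Jan/2024:00:00:05 +0000] "GET /about HTTP/1.1" 200 1024 "-" "hyScore.io crawler"',
    '198.51.100.7 - - [01/Jan/2024:00:01:00 +0000] "GET / HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    'not a log line',
]

hits = count_hits_by_agent(SAMPLE_LOG)
for agent, count in hits.most_common():
    print(f"{count:5d}  {agent}")
```

To run it against a real log, pass an open file object instead of `SAMPLE_LOG`; the function only needs an iterable of lines.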
IP address (22) | Server name | Country |
---|---|---|
35.86.1.7 | ec2-35-86-1-7.us-west-2.compute.amazonaws.com | US |
44.233.116.153 | ec2-44-233-116-153.us-west-2.compute.amazonaws.com | US |
44.234.110.193 | ec2-44-234-110-193.us-west-2.compute.amazonaws.com | US |
35.84.143.66 | ec2-35-84-143-66.us-west-2.compute.amazonaws.com | US |
85.26.186.123 | 85.26.186.123 | RU |
54.189.233.153 | ec2-54-189-233-153.us-west-2.compute.amazonaws.com | US |
35.166.68.246 | ec2-35-166-68-246.us-west-2.compute.amazonaws.com | US |
52.8.185.89 | ec2-52-8-185-89.us-west-1.compute.amazonaws.com | US |
104.156.53.224 | 104-156-53-224.static.hvvc.us | US |
159.69.2.127 | static.127.2.69.159.clients.your-server.de | DE |
54.144.81.251 | ec2-54-144-81-251.compute-1.amazonaws.com | US |
44.205.193.198 | ec2-44-205-193-198.compute-1.amazonaws.com | US |
54.203.242.6 | ec2-54-203-242-6.us-west-2.compute.amazonaws.com | US |
52.55.82.245 | ec2-52-55-82-245.compute-1.amazonaws.com | US |
18.214.107.67 | ec2-18-214-107-67.compute-1.amazonaws.com | US |
52.21.163.234 | ec2-52-21-163-234.compute-1.amazonaws.com | US |
IP address (14) | Server name | Country |
---|---|---|
34.234.121.220 | ec2-34-234-121-220.compute-1.amazonaws.com | US |
35.168.70.116 | ec2-35-168-70-116.compute-1.amazonaws.com | US |
20.85.20.81 | 20.85.20.81 | US |
20.72.67.178 | 20.72.67.178 | US |
52.177.83.110 | 52.177.83.110 | US |
104.236.98.42 | 104.236.98.42 | US |
104.236.77.34 | 104.236.77.34 | US |
104.236.29.201 | 104.236.29.201 | US |
159.89.86.50 | 159.89.86.50 | US |
174.138.37.186 | 704081.cloudwaysapps.com | US |
3.86.114.167 | ec2-3-86-114-167.compute-1.amazonaws.com | US |
IP address (4) | Server name | Country |
---|---|---|
85.14.122.114 | ? | PL |
34.245.1.187 | ec2-34-245-1-187.eu-west-1.compute.amazonaws.com | IE |
34.245.81.220 | ec2-34-245-81-220.eu-west-1.compute.amazonaws.com | IE |
34.241.255.205 | ec2-34-241-255-205.eu-west-1.compute.amazonaws.com | IE |
10.192.43.33 | 10.192.43.33 | ? |
10.192.165.36 | 10.192.165.36 | ? |
10.192.170.34 | 10.192.170.34 | ? |
IP address (4) | Server name | Country |
---|---|---|
34.245.81.220 | ec2-34-245-81-220.eu-west-1.compute.amazonaws.com | IE |
34.241.255.205 | ec2-34-241-255-205.eu-west-1.compute.amazonaws.com | IE |
34.241.70.77 | ec2-34-241-70-77.eu-west-1.compute.amazonaws.com | IE |
IP address (10) | Server name | Country |
---|---|---|
52.214.126.156 | ec2-52-214-126-156.eu-west-1.compute.amazonaws.com | IE |
52.49.30.209 | ec2-52-49-30-209.eu-west-1.compute.amazonaws.com | IE |
52.213.228.119 | ec2-52-213-228-119.eu-west-1.compute.amazonaws.com | IE |
52.31.73.10 | ec2-52-31-73-10.eu-west-1.compute.amazonaws.com | IE |
52.214.222.196 | ec2-52-214-222-196.eu-west-1.compute.amazonaws.com | IE |
52.19.153.209 | ec2-52-19-153-209.eu-west-1.compute.amazonaws.com | IE |
34.250.53.240 | ec2-34-250-53-240.eu-west-1.compute.amazonaws.com | IE |
34.240.160.53 | ec2-34-240-160-53.eu-west-1.compute.amazonaws.com | IE |
52.209.145.192 | ec2-52-209-145-192.eu-west-1.compute.amazonaws.com | IE |
34.251.223.143 | ec2-34-251-223-143.eu-west-1.compute.amazonaws.com | IE |
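A few rows in the tables above list 10.192.x.x addresses with no country. These fall in the private 10.0.0.0/8 range, which is not routable on the public internet, so no geolocation is possible for them. The sketch below shows one way to separate such addresses from public ones with Python's standard `ipaddress` module; the helper name `split_private` is hypothetical, and the address list is a small sample copied from the tables.

```python
import ipaddress


def split_private(addresses):
    """Split IP address strings into (public, private) lists."""
    public, private = [], []
    for text in addresses:
        ip = ipaddress.ip_address(text)
        # is_private is True for RFC 1918 ranges such as 10.0.0.0/8
        (private if ip.is_private else public).append(text)
    return public, private


# Sample addresses copied from the tables above.
addresses = [
    "85.14.122.114",
    "34.245.1.187",
    "10.192.43.33",
    "10.192.165.36",
    "10.192.170.34",
]

public, private = split_private(addresses)
print("public: ", public)
print("private:", private)
```

Private addresses in a log usually point to a proxy, load balancer, or internal network hop rather than the crawler itself, so they are best excluded before judging where a crawler operates from.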
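An unknown spider or crawler may be good or bad for your website, depending on what it actually is. Webmasters therefore need to analyze the behavior of such unidentified crawlers further before making a final decision. In our experience, however, spiders and crawlers that neither declare their purpose nor identify themselves by name usually have something to hide, and their activity should be controlled, for example by blocking them.
You can block hyScore.io crawler, or restrict its access, by setting user-agent rules in your site's robots.txt. We recommend installing the Spider Analyser plugin to check whether the crawler actually follows these rules.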
# robots.txt
# The following rules will block this user agent in most cases
User-agent: hyScore.io crawler
Disallow: /
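Before deploying the rules, you can check how a standards-following parser reads them. The minimal sketch below uses Python's standard `urllib.robotparser`; the `example.com` URL and the `ExampleBot` agent are placeholders. It only shows how the rules are interpreted. It cannot show whether the real crawler obeys them, which is what the server logs are for.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt snippet above.
ROBOTS_TXT = """\
# robots.txt
User-agent: hyScore.io crawler
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Placeholder URL; any path on the site is covered by "Disallow: /".
url = "https://example.com/any/page"

blocked_agent_allowed = parser.can_fetch("hyScore.io crawler", url)
other_agent_allowed = parser.can_fetch("ExampleBot", url)

print("hyScore.io crawler allowed:", blocked_agent_allowed)
print("ExampleBot allowed:", other_agent_allowed)
```

Agents that match no `User-agent` group, such as the placeholder `ExampleBot`, remain allowed because the file contains no `User-agent: *` group.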
You do not need to do this by hand: our WordPress plugin Spider Analyser can block unwanted spiders and crawlers for you.