
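How to Analyze Search Engine Spider Logs
Search engine spider logs are a powerful resource that webmasters tend to underuse. Analyzing them shows how each search engine crawls a site's content and how its spider behaves over a period of time.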
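
As a rough illustration of this kind of analysis, the sketch below counts requests per spider in a web server access log. It assumes the log uses the common "combined" format and that spiders can be recognized by a token in the User-Agent field. The function name `count_spider_hits` and the sample log lines are hypothetical and made up for illustration; they are not taken from a real log.

```python
import re
from collections import Counter

# Assumed "combined" log format:
# ip - user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"$'
)


def count_spider_hits(lines, spider_tokens):
    """Count requests per spider token found in the User-Agent field."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent").lower()
        for token in spider_tokens:
            if token.lower() in agent:
                hits[token] += 1
                break  # attribute each request to one spider only
    return hits


# Hypothetical sample entries, for illustration only.
SAMPLE_LOG = [
    '147.92.153.19 - - [01/Jan/2024:10:00:00 +0900] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Linespider/1.1)"',
    '147.92.153.5 - - [01/Jan/2024:10:00:05 +0900] "GET /about HTTP/1.1" 200 2048 "-" '
    '"Mozilla/5.0 (compatible; Linespider/1.1)"',
    '66.249.66.1 - - [01/Jan/2024:10:01:00 +0900] "GET /robots.txt HTTP/1.1" 200 64 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.7 - - [01/Jan/2024:10:02:00 +0900] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

if __name__ == "__main__":
    for spider, count in count_spider_hits(SAMPLE_LOG, ["Linespider", "Googlebot"]).items():
        print(spider, count)
```

A User-Agent string is easy to forge, so a count like this is only a starting point. The IP address and server name of each request are a stronger signal of where it really came from.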

IP address (20) | Server name | Country |
---|---|---|
147.92.153.19 | crawl.147-92-153-19.search.line-apps.com | JP |
147.92.153.5 | crawl.147-92-153-5.search.line-apps.com | JP |
147.92.153.18 | crawl.147-92-153-18.search.line-apps.com | JP |
147.92.153.17 | crawl.147-92-153-17.search.line-apps.com | JP |
147.92.153.16 | crawl.147-92-153-16.search.line-apps.com | JP |
147.92.153.4 | crawl.147-92-153-4.search.line-apps.com | JP |
147.92.153.8 | crawl.147-92-153-8.search.line-apps.com | JP |
147.92.153.3 | crawl.147-92-153-3.search.line-apps.com | JP |
147.92.153.6 | crawl.147-92-153-6.search.line-apps.com | JP |
147.92.153.1 | crawl.147-92-153-1.search.line-apps.com | JP |

IP address (22) | Server name | Country |
---|---|---|
147.92.153.3 | crawl.147-92-153-3.search.line-apps.com | JP |
147.92.153.19 | crawl.147-92-153-19.search.line-apps.com | JP |
147.92.153.2 | crawl.147-92-153-2.search.line-apps.com | JP |
147.92.153.4 | crawl.147-92-153-4.search.line-apps.com | JP |
147.92.153.13 | crawl.147-92-153-13.search.line-apps.com | JP |
147.92.153.15 | crawl.147-92-153-15.search.line-apps.com | JP |
147.92.153.8 | crawl.147-92-153-8.search.line-apps.com | JP |
147.92.153.18 | crawl.147-92-153-18.search.line-apps.com | JP |
147.92.153.10 | crawl.147-92-153-10.search.line-apps.com | JP |
147.92.153.9 | crawl.147-92-153-9.search.line-apps.com | JP |

IP address (28) | Server name | Country |
---|---|---|
147.92.153.5 | crawl.147-92-153-5.search.line-apps.com | JP |
147.92.153.7 | crawl.147-92-153-7.search.line-apps.com | JP |
147.92.153.14 | crawl.147-92-153-14.search.line-apps.com | JP |
147.92.153.8 | crawl.147-92-153-8.search.line-apps.com | JP |
147.92.153.15 | crawl.147-92-153-15.search.line-apps.com | JP |
147.92.153.9 | crawl.147-92-153-9.search.line-apps.com | JP |
147.92.153.16 | crawl.147-92-153-16.search.line-apps.com | JP |
147.92.153.17 | crawl.147-92-153-17.search.line-apps.com | JP |
147.92.153.18 | crawl.147-92-153-18.search.line-apps.com | JP |
147.92.153.19 | crawl.147-92-153-19.search.line-apps.com | JP |

IP address (10) | Server name | Country |
---|---|---|
203.104.154.135 | crawl.203-104-154-135.web.naver.com | JP |
203.104.154.144 | crawl.203-104-154-144.web.naver.com | JP |
203.104.154.137 | crawl.203-104-154-137.web.naver.com | JP |
203.104.154.143 | crawl.203-104-154-143.web.naver.com | JP |
203.104.154.136 | ? | JP |
203.104.154.142 | crawl.203-104-154-142.web.naver.com | JP |
203.104.154.138 | ? | JP |
203.104.154.141 | 203.104.154.141 | JP |
203.104.154.140 | crawl.203-104-154-140.web.naver.com | JP |
203.104.154.139 | crawl.203-104-154-139.web.naver.com | JP |

IP address (10) | Server name | Country |
---|---|---|
147.92.153.15 | crawl.147-92-153-15.search.line-apps.com | JP |
147.92.153.18 | crawl.147-92-153-18.search.line-apps.com | JP |
147.92.153.3 | crawl.147-92-153-3.search.line-apps.com | JP |
147.92.153.14 | crawl.147-92-153-14.search.line-apps.com | JP |
147.92.153.19 | crawl.147-92-153-19.search.line-apps.com | JP |
147.92.153.16 | crawl.147-92-153-16.search.line-apps.com | JP |
147.92.153.11 | crawl.147-92-153-11.search.line-apps.com | JP |
147.92.153.7 | crawl.147-92-153-7.search.line-apps.com | JP |
147.92.153.12 | crawl.147-92-153-12.search.line-apps.com | JP |
147.92.153.17 | crawl.147-92-153-17.search.line-apps.com | JP |
147.92.153.9 | crawl.147-92-153-9.search.line-apps.com | JP |
147.92.153.4 | crawl.147-92-153-4.search.line-apps.com | JP |
147.92.153.20 | crawl.147-92-153-20.search.line-apps.com | JP |
147.92.153.8 | crawl.147-92-153-8.search.line-apps.com | JP |
147.92.153.2 | crawl.147-92-153-2.search.line-apps.com | JP |
147.92.153.1 | crawl.147-92-153-1.search.line-apps.com | JP |
147.92.153.13 | crawl.147-92-153-13.search.line-apps.com | JP |
147.92.153.10 | crawl.147-92-153-10.search.line-apps.com | JP |
203.104.154.135 | crawl.203-104-154-135.web.naver.com | JP |
203.104.154.144 | crawl.203-104-154-144.web.naver.com | JP |
203.104.154.137 | crawl.203-104-154-137.web.naver.com | JP |
203.104.154.143 | crawl.203-104-154-143.web.naver.com | JP |
203.104.154.136 | ? | JP |
203.104.154.142 | crawl.203-104-154-142.web.naver.com | JP |
203.104.154.138 | ? | JP |
203.104.154.141 | 203.104.154.141 | JP |
203.104.154.140 | crawl.203-104-154-140.web.naver.com | JP |
203.104.154.139 | crawl.203-104-154-139.web.naver.com | JP |
147.92.153.6 | crawl.147-92-153-6.search.line-apps.com | JP |
147.92.153.5 | crawl.147-92-153-5.search.line-apps.com | JP |
147.92.246.97 | crawl.147-92-246-97.search.line-apps.com | JP |
147.92.163.130 | crawl.147-92-163-130.search.line-apps.com | JP |
147.92.246.130 | crawl.147-92-246-130.search.line-apps.com | ? |
147.92.246.100 | crawl.147-92-246-100.search.line-apps.com | JP |
147.92.246.200 | crawl.147-92-246-200.search.line-apps.com | JP |
147.92.246.70 | crawl.147-92-246-70.search.line-apps.com | JP |
147.92.246.160 | crawl.147-92-246-160.search.line-apps.com | JP |
147.92.246.40 | crawl.147-92-246-40.search.line-apps.com | JP |
147.92.246.60 | crawl.147-92-246-60.search.line-apps.com | JP |
147.92.246.50 | crawl.147-92-246-50.search.line-apps.com | JP |
147.92.246.90 | crawl.147-92-246-90.search.line-apps.com | JP |
147.92.246.20 | crawl.147-92-246-20.search.line-apps.com | JP |
147.92.246.170 | crawl.147-92-246-170.search.line-apps.com | JP |
147.92.246.180 | crawl.147-92-246-180.search.line-apps.com | JP |
147.92.246.10 | crawl.147-92-246-10.search.line-apps.com | ? |
147.92.246.140 | crawl.147-92-246-140.search.line-apps.com | JP |
147.92.246.190 | crawl.147-92-246-190.search.line-apps.com | JP |
147.92.246.120 | crawl.147-92-246-120.search.line-apps.com | JP |
147.92.246.110 | crawl.147-92-246-110.search.line-apps.com | JP |
147.92.246.80 | crawl.147-92-246-80.search.line-apps.com | JP |
147.92.246.150 | crawl.147-92-246-150.search.line-apps.com | JP |
147.92.246.30 | crawl.147-92-246-30.search.line-apps.com | JP |
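
Most server names in the tables above follow the pattern `crawl.<IP with dashes>.search.line-apps.com` or `crawl.<IP with dashes>.web.naver.com`. The sketch below checks a logged IP address against that pattern. The two suffixes are taken from the tables only, so treating them as the complete set is an assumption, and the function names are hypothetical. `verify_by_reverse_dns` needs network access and uses the standard `socket` module for a reverse lookup followed by a forward confirmation.

```python
import socket

# Hostname suffixes observed in the tables above (assumed, not an official list).
KNOWN_SUFFIXES = ("search.line-apps.com", "web.naver.com")


def expected_hostnames(ip):
    """Build the hostnames an IP would have under the observed naming pattern."""
    dashed = ip.replace(".", "-")
    return {f"crawl.{dashed}.{suffix}" for suffix in KNOWN_SUFFIXES}


def hostname_matches_ip(ip, hostname):
    """Return True if hostname follows the observed pattern for this IP."""
    return hostname.lower().rstrip(".") in expected_hostnames(ip)


def verify_by_reverse_dns(ip):
    """Reverse-resolve ip, check the name, then forward-confirm it (needs network)."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        if not hostname_matches_ip(ip, hostname):
            return False
        _name, _aliases, forward_ips = socket.gethostbyname_ex(hostname)
    except OSError:  # covers failed reverse and forward lookups
        return False
    return ip in forward_ips
```

Rows whose server name is `?` or just the bare IP address would fail this check, so a failed match calls for a closer look rather than proving that the request is forged.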
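
An unknown spider or crawler may be good or bad for a website, depending on what it is. Webmasters therefore need to analyze the behavior of these unidentified crawlers further before making a final decision. In our experience, however, spiders and crawlers that neither declare their purpose nor identify themselves by name usually have something to hide, so their activity should be controlled, for example by blocking them.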
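You can block Linespider or restrict its access by setting user-agent rules in your site's robots.txt. We recommend installing the Spider Analyser plugin to check whether it actually follows these rules.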

    # robots.txt
    # The following rules will generally block this user agent
    User-agent: Linespider
    Disallow: /

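
Before deploying rules like these, it can help to confirm that they parse the way you intend. The following is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `is_allowed` and the `example.com` URLs are assumptions made for illustration. It only checks how the rules are interpreted, not whether a crawler actually obeys them.

```python
from urllib.robotparser import RobotFileParser

# The same rules as above, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: Linespider
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Pass the short product token (e.g. "Linespider"), not the full
    # User-Agent header, because matching is done on that token.
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "Linespider", "https://example.com/page"))
    print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/page"))
```

With these rules, Linespider is denied every path while agents that are not listed remain allowed, since the file contains no `User-agent: *` entry.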
You do not need to do this manually: our WordPress plugin Spider Analyser can block unwanted spiders and crawlers.