How to effectively block Sogou spider from crawling your website content?
Method 1: Use the robots.txt file
To keep the Sogou spider from crawling your site, create a robots.txt file with the following content:
User-agent: Sogou web spider
Disallow: /
User-agent: sogou spider
Disallow: /
User-agent: *
Disallow:
Two records are written because it is unclear whether the crawler identifies itself as "sogou spider" or "Sogou web spider". Most search engines publish their spider names in their webmaster articles, but Sogou does not, which is telling in itself. Upload the file to the root directory of the website for it to take effect. Note, however, that the Sogou spider sometimes ignores the robots.txt protocol, so it may still crawl the site.
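Before uploading, the rules can be sanity-checked locally. The following is a minimal sketch using Python's standard urllib.robotparser module. The function name is_allowed and the sample user-agent strings are assumptions made for illustration, and the result only shows how a standards-following parser reads the rules, not how the Sogou spider will actually behave.

```python
from urllib.robotparser import RobotFileParser

# The rules from the robots.txt file above, copied verbatim.
ROBOTS_TXT = """\
User-agent: Sogou web spider
Disallow: /
User-agent: sogou spider
Disallow: /
User-agent: *
Disallow:
"""


def is_allowed(user_agent, path="/index.html"):
    """Return True if the rules let `user_agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Illustrative user-agent strings (assumed, not taken from Sogou documentation).
    for agent in ("Sogou web spider/4.0", "sogou spider", "Googlebot/2.1"):
        print(agent, "->", "allowed" if is_allowed(agent) else "blocked")
```

Both Sogou names should come out as blocked, while any other crawler falls through to the catch-all record and is allowed.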
Method 2: Use the .htaccess file
To back up the robots.txt file, create a new file named .htaccess and add the following content:
#block spider
<LIMIT GET HEAD POST>
order allow,deny
#Sogou block
deny from 220.181.125.71
deny from 220.181.125.68
deny from 220.181.125.69
deny from 220.181.94.235
deny from 220.181.94.233
deny from 220.181.94.236
deny from 220.181.19.84
allow from all
</LIMIT>
Upload the file to the root directory of the website. The IP addresses listed all belong to the Sogou spider. Because its addresses change frequently, add a new "deny from" line whenever a new address appears.
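Since the address list has to be maintained by hand, a small script can help keep it clean. The sketch below is one possible approach, not part of the original method: it uses Python's standard ipaddress module to validate each address, drops duplicates, and prints the "deny from" lines to paste into .htaccess. The function name render_deny_lines is hypothetical.

```python
import ipaddress


def render_deny_lines(addresses):
    """Validate IP addresses and return de-duplicated 'deny from' lines.

    Raises ValueError if any entry is not a valid IP address.
    """
    seen = set()
    lines = []
    for raw in addresses:
        ip = ipaddress.ip_address(raw.strip())  # rejects malformed input
        if ip not in seen:
            seen.add(ip)
            lines.append(f"deny from {ip}")
    return lines


if __name__ == "__main__":
    # The addresses listed in the .htaccess file above.
    sogou_ips = [
        "220.181.125.71", "220.181.125.68", "220.181.125.69",
        "220.181.94.235", "220.181.94.233", "220.181.94.236",
        "220.181.19.84",
    ]
    print("\n".join(render_deny_lines(sogou_ips)))
```

Newly observed addresses can be appended to the list and the output pasted between the "#Sogou block" comment and the "allow from all" line.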