International Search Engines
For many clients, this boils down to where you trade and who your target audience is. If you’re not interested in, for example, Chinese or Russian clients, or you actively don’t want international traffic, then it doesn’t make sense to let Yandex and Baidu crawl your site. Although you can certainly block bots in robots.txt, remember that this presumes the bots obey your instructions, and often they just don’t. VerifiedVisitors actually enforces your instructions and verifies you are blocking the bots you don’t want. There are also a huge number of local international search engines that you’ve probably never heard of.
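As a minimal sketch of the robots.txt approach described above, the following Python snippet parses a hypothetical robots.txt and checks which crawlers it would turn away. The rules, the `is_allowed` helper, the example.com URL, and the user-agent tokens `Yandex` and `Baiduspider` are illustrative assumptions, so confirm the exact tokens against each search engine's own documentation. The check only shows what a compliant crawler should do; it cannot tell you whether a given bot actually obeys.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt (an assumption for illustration):
# turn away Yandex and Baidu crawlers, allow everyone else.
ROBOTS_TXT = """\
User-agent: Yandex
Disallow: /

User-agent: Baiduspider
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler with this user agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("YandexBot", "Baiduspider", "Googlebot"):
        allowed = is_allowed(ROBOTS_TXT, agent, "https://example.com/products")
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Python's standard-library parser matches the `Yandex` group against `YandexBot` by substring; real crawlers apply their own matching rules, which is one more reason to verify the result in your traffic logs rather than rely on the file alone.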
Vendor
Bot Service
Recommendation
Description
Yandex
Yandex Content Downloaders
Recommended
Not recommended
This Yandex bot collects information in order to be able to generate previews with the aim of helping their users answer questions from within the search results.
Yandex
Yandex Verticals search bot
Recommended
Not recommended
Yandex Vertical search robot.
Yandex
Yandex sitelinks availability checker
Recommended
Not recommended
The Yandex sitelinks “fetcher” used for checking the availability of the pages detected as sitelinks.
Yandex
Yandex Video Parser
Recommended
Not recommended
The Yandex.Video indexing robot searches your site for videos that it can index. To block it, disallow it specifically in robots.txt by referencing its user-agent.
Yandex
Yandex Video Bot
Recommended
Not recommended
The Yandex.Video robot with the sole purpose of indexing video clips for the Yandex video site.
Yandex
Yandex Screenshot Bot
Recommended
Not recommended
A Yandex bot that takes a snapshot of a page. To avoid being unintentionally blocked by site owners, this bot may ignore the robots.txt directives designed for arbitrary robots (User-agent: *). You can stop it by specifically disallowing the agent YandexScreenshotBot.
Yandex
Yandex Mobile
Recommended
Not recommended
The Yandex mobile devices robot deployed to support the needs of mobile users.
Yandex
Yandex Multimedia Indexer
Recommended
Not recommended
The Yandex Multimedia data indexer bot.
Yandex
Yandex Resource Renderer
Recommended
Not recommended
The Yandex Resource Renderer loads resources to render the page with JavaScript. It will ignore instructions in robots.txt if the HTML page on which these resources are located is accessible to the Yandex robot. The robot does not access resources if the HTML pages where these resources are used are prohibited in robots.txt.
Yandex
Yandex News bot
Recommended
Not recommended
The Yandex News robot.
Yandex
Yandex Page Checker
Recommended
Not recommended
The Yandex robot that validates markup submitted through the Structured data validator form.
Yandex
Yandex Mirror Detector
Recommended
Not recommended
The Yandex robot detects site mirrors.
Yandex
Yandex Images Bot
Recommended
Not recommended
The Yandex.Images indexing robot.
Yandex
Yandex Direct Dyn Bot
Recommended
Not recommended
Yandex bot YandexDirectFetcher downloads ad landing pages to check their availability and content, so that ads can be placed in the Yandex search results and on partner sites. To avoid being unintentionally blocked by site owners, this bot may ignore the robots.txt directives designed for arbitrary robots (User-agent: *). You can stop it by specifically disallowing the agent YandexDirectDyn.
Yandex
Yandex Bot
Recommended
Not recommended
This is the blog indexing bot from Yandex. If you run a blog and want your blog comments indexed by Yandex, select this one.
Yahoo
Yahoo! Japan
Recommended
Not recommended
The crawler for Yahoo! Japan.
SoGou
SoGou Chinese Search
Recommended
Not recommended
Chinese Search Engine
Sistrix
Seekport
Recommended
Not recommended
Seekport is an internet search engine operated by SISTRIX, a platform intelligence provider from Germany. The search engine is a public, free and independent alternative to Google. Seekport does not store user data and does not profile users. It is also operated without advertising and has no conflicts of interest in the display of search results. The Seekport Bot crawls the public internet to find and update content for Seekport's search index.
Qihoo360
so.com
Recommended
Not recommended
So.com is a major search engine in China; this is another of its crawlers.
Baidu
BaiduBot
Recommended
Not recommended
This newer spider will not only crawl the HTML of the web pages but also render the page with other elements including CSS, JS, and images to help Baidu better understand the content of the page and provide more meaningful results.
Baidu
Baidu Mobile
Recommended
Not recommended
This newer spider will not only crawl the HTML of the web pages but also render the page with other elements including CSS, JS, and images to help Baidu better understand the content of the page and provide more meaningful results.
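Several rows above (for example YandexScreenshotBot and YandexDirectDyn) note that a bot may ignore directives addressed to `User-agent: *` and has to be disallowed by name. One way to handle this is to generate an explicit robots.txt group for each such agent. The sketch below assumes the agent tokens exactly as they appear in the table; the `build_robots_txt` helper and the list name are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Agent tokens copied from the table above; extend the list with any other
# bot that needs to be named explicitly (assumption: these tokens are current).
EXPLICIT_AGENTS = ["YandexScreenshotBot", "YandexDirectDyn"]


def build_robots_txt(agents, path="/"):
    """Return robots.txt text with one explicit Disallow group per agent."""
    groups = [f"User-agent: {agent}\nDisallow: {path}\n" for agent in agents]
    # Joining with a newline leaves a blank line between groups.
    return "\n".join(groups)


robots_txt = build_robots_txt(EXPLICIT_AGENTS)

if __name__ == "__main__":
    print(robots_txt)

    # Sanity check with the standard-library parser: the named agents are
    # refused, while agents that are not listed remain unaffected.
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    for agent in EXPLICIT_AGENTS + ["Googlebot"]:
        print(agent, parser.can_fetch(agent, "https://example.com/"))
```

Naming each agent keeps the rest of your crawl policy untouched, but it remains a request rather than an enforcement mechanism, so the caveat about bots that do not obey robots.txt still applies.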