Bot Database

International Search Engines

For many clients, this boils down to where you trade and who your target audience is. If you’re not interested in, for example, Chinese or Russian clients, or you actively don’t want international traffic, then it doesn’t make sense to let Yandex and Baidu crawl your site. Although you can certainly block bots in robots.txt, remember that this presumes the bots obey your instructions, and often they don’t. VerifiedVisitors enforces your instructions and verifies that the bots you don’t want are actually blocked. There are also many local search engines around the world that you’ve probably never heard of.
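As a rough illustration of the robots.txt approach, the Python sketch below uses the standard library's urllib.robotparser to check which crawlers a given robots.txt asks to stay away. The robots.txt content, the example.com URL, and the helper name is_allowed are assumptions made for this example; the user-agent tokens Yandex and Baiduspider should be checked against each vendor's own documentation. It only shows what a compliant bot would do and cannot tell you whether a bot actually obeys.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask Yandex and Baidu crawlers to stay away,
# and leave the site open to every other crawler.
ROBOTS_TXT = """\
User-agent: Yandex
Disallow: /

User-agent: Baiduspider
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("YandexBot", "Baiduspider", "Googlebot"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, "https://example.com/page") else "blocked"
        print(f"{agent}: {verdict}")
```

A check like this confirms that the file says what you intend. Enforcing it against bots that ignore robots.txt needs a separate control at the server or edge.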

Vendor

Bot Service

Recommendation

Description

Yandex

Yandex Content Downloaders

Recommended

Not recommended

This Yandex bot downloads content to generate previews that help Yandex users answer questions directly from within the search results.

Yandex

Yandex Verticals search bot

Recommended

Not recommended

Yandex Vertical search robot.

Yandex

Yandex sitelinks availability checker

Recommended

Not recommended

The Yandex sitelinks “fetcher” used for checking the availability of the pages detected as sitelinks.

Yandex

Yandex Video Parser

Recommended

Not recommended

The Yandex.Video indexing robot searches your site for videos that it can index. To block it, disallow it in robots.txt by referencing its user-agent specifically.

Yandex

Yandex Video Bot

Recommended

Not recommended

The Yandex.Video robot with the sole purpose of indexing video clips for the Yandex video site.

Yandex

Yandex Screenshot Bot

Recommended

Not recommended

The Yandex bot that takes a snapshot of a page. To avoid being unintentionally blocked by site owners, this bot may ignore the robots.txt directives aimed at arbitrary robots (User-agent: *). You can stop it by specifically disallowing the agent YandexScreenshotBot.

Yandex

Yandex Mobile

Recommended

Not recommended

The Yandex mobile devices robot deployed to support the needs of mobile users.

Yandex

Yandex Multimedia Indexer

Recommended

Not recommended

The Yandex Multimedia data indexer bot.

Yandex

Yandex Resource Renderer

Recommended

Not recommended

The Yandex Resource Renderer loads resources to render the page with JavaScript. It will ignore instructions in robots.txt if the HTML page on which these resources are located is accessible to the Yandex robot. The robot does not access resources if the HTML pages where these resources are used are prohibited in robots.txt.

Yandex

Yandex News bot

Recommended

Not recommended

The Yandex News robot.

Yandex

Yandex Page Checker

Recommended

Not recommended

The Yandex robot that validates markup submitted through the Structured data validator form.

Yandex

Yandex Mirror Detector

Recommended

Not recommended

The Yandex robot that detects site mirrors.

Yandex

Yandex Images Bot

Recommended

Not recommended

The Yandex.Images indexing robot.

Yandex

Yandex Direct Dyn Bot

Recommended

Not recommended

The Yandex bot YandexDirectFetcher downloads ad landing pages to check their availability and content, so that ads can be placed in the Yandex search results and on partner sites. To avoid being unintentionally blocked by site owners, this bot may ignore the robots.txt directives aimed at arbitrary robots (User-agent: *). You can stop it by specifically disallowing the agent YandexDirectDyn.

Yandex

Yandex Bot

Recommended

Not recommended

This is the blog indexing bot from Yandex. If you run a blog and want your blog comments indexed by Yandex, select this one.

Yahoo

Yahoo! Japan

Recommended

Not recommended

The crawler for Yahoo! Japan.

SoGou

SoGou Chinese Search

Recommended

Not recommended

Chinese Search Engine

Sistrix

Seekport

Recommended

Not recommended

Seekport is an internet search engine operated by SISTRIX, a platform intelligence provider from Germany. The search engine is a public, free and independent alternative to Google. Seekport does not store user data and does not profile users. It is also operated without advertising and has no conflicts of interest in the display of search results. The Seekport Bot crawls the public internet to find and update content for Seekport's search index.

Qihoo360

so.com

Recommended

Not recommended

So.com is a major search engine in China; this is one of its crawlers.

Baidu

BaiduBot

Recommended

Not recommended

This newer spider will not only crawl the HTML of the web pages but also render the page with other elements including CSS, JS, and images to help Baidu better understand the content of the page and provide more meaningful results.

Baidu

Baidu Mobile

Recommended

Not recommended

This newer spider will not only crawl the HTML of the web pages but also render the page with other elements including CSS, JS, and images to help Baidu better understand the content of the page and provide more meaningful results.
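
Several entries above note that a bot may ignore the wildcard group (User-agent: *) and has to be disallowed by name. The following Python sketch, offered as one possible approach and not a definitive implementation, generates an explicit robots.txt group for each named agent and checks the result with the standard library's urllib.robotparser. The helper names explicit_disallow_groups and blocks, and the example.com URL, are hypothetical; the agent tokens are the two named in the descriptions above.

```python
from urllib.robotparser import RobotFileParser

# Agent tokens taken from the bot descriptions in the table above.
AGENTS = ["YandexScreenshotBot", "YandexDirectDyn"]


def explicit_disallow_groups(agents, path="/"):
    """Return robots.txt text with one explicit Disallow group per agent."""
    groups = [f"User-agent: {agent}\nDisallow: {path}\n" for agent in agents]
    return "\n".join(groups)


def blocks(robots_txt, agent, url="https://example.com/"):
    """Return True if robots_txt tells agent not to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    text = explicit_disallow_groups(AGENTS)
    print(text)
    for agent in AGENTS + ["Googlebot"]:
        print(agent, "blocked" if blocks(text, agent) else "allowed")
```

Naming each agent keeps the rule effective for bots that skip the wildcard group, while crawlers that are not listed remain unaffected.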