Can AI see it

Check what AI sees. Measure what it's worth.

AI training crawlers

These bots crawl websites to collect training data for AI models. Many of them respect robots.txt, which gives you control over whether your content is used for AI training.
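As a rough illustration of that control, the sketch below uses Python's standard `urllib.robotparser` to check which crawlers a given robots.txt would turn away. The robots.txt content, the `is_allowed` helper, and the example URL are assumptions made up for this example. The check only tells you what a compliant crawler should do; it cannot confirm that a particular bot actually obeys the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block two AI training crawlers, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file as an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/1"  # placeholder URL
    for agent in ("GPTBot", "ClaudeBot", "Googlebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, page))
```

Listing each crawler in its own `User-agent` group keeps the rule for training bots separate from the rule for ordinary search crawlers, so blocking one does not affect the other.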

AI training crawlers fetch web pages so that their operators can use the content to train large language models and other AI systems. Understanding which bots are active on your site, and whether they respect your access rules, is the first step in controlling how your content is used.
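One way to see which of these bots are active on a site is to count their requests in the web server's access log. The sketch below is a minimal version of that idea. It assumes each log line contains the User-Agent string and that the catalog names appear in it as case-insensitive substrings; the token list, the `count_crawler_hits` helper, and the sample log lines are all invented for illustration. User-Agent headers can be spoofed, so the counts are only indicative.

```python
from collections import Counter
from typing import Iterable

# Assumed User-Agent substrings, taken from crawler names in the catalog.
CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "Amazonbot", "Meta-ExternalAgent", "PetalBot")


def count_crawler_hits(log_lines: Iterable[str]) -> Counter:
    """Count log lines per crawler token, matching case-insensitively."""
    hits: Counter = Counter()
    for line in log_lines:
        lowered = line.lower()
        for token in CRAWLER_TOKENS:
            if token.lower() in lowered:
                hits[token] += 1
                break  # attribute each request to at most one crawler
    return hits


# Invented sample lines in a combined-log-like format (not real traffic).
SAMPLE_LOG = [
    '192.0.2.1 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "GPTBot/1.0"',
    '192.0.2.2 - - [01/Jan/2025:00:00:02 +0000] "GET /a HTTP/1.1" 200 734 "-" "ClaudeBot/1.0"',
    '192.0.2.1 - - [01/Jan/2025:00:00:03 +0000] "GET /b HTTP/1.1" 200 120 "-" "GPTBot/1.0"',
    '192.0.2.3 - - [01/Jan/2025:00:00:04 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for name, count in count_crawler_hits(SAMPLE_LOG).most_common():
        print(f"{name}: {count}")
```

Comparing these counts against the paths disallowed in robots.txt gives a first indication of whether a bot is honoring the rules.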

26 AI training crawlers in our catalog

  • Amazon Kendra by Amazon

    Amazon Kendra is an intelligent search service operated by Amazon, using natural language processing for accurate search results.

  • Amazonbot by Amazon

    Amazonbot is Amazon's web crawler used to improve products and services, such as training AI models.

  • Anchor Browser by Anchor

    The Anchor Browser is a web browser for AI agents, operated by Anchor, used for automating workflows and web interactions.

  • AwarioSmartBot by Awario

    AwarioSmartBot is a web crawler operated by Awario, used for collecting new and updated web data for Internet marketers.

  • Big Sur AI by Big Sur AI

    Big Sur AI Crawler, operated by Big Sur AI, crawls user websites for AI-infused experiences.

  • Brandwatch by Brandwatch

    Brandwatch's Magpie Crawler indexes web pages for social media monitoring and analysis.

  • Ceramic TerraCotta by Ceramic

    Ceramic TerraCotta is a web crawler operated by Ceramic, an AI training infrastructure company. It crawls websites to support Ceramic's AI model training optimization platform.

  • ClaudeBot by Anthropic

    ClaudeBot is Anthropic's web crawler that collects web content for training Claude AI models. It respects robots.txt directives and supports Crawl-delay.

  • Cotoyogi by Research Organization of Information and Systems

    Cotoyogi is a bot operated by the Research Organization of Information and Systems for AI training purposes.

  • Echobot Bot by Echobox

    Echobot is a web scraping bot operated by Echobox for AI training purposes, specifically for automating content distribution for digital publishers.

  • Factset_spyderbot by Factset

    Factset Spyderbot is a web scraping bot operated by Factset for delivering reliable financial data.

  • Google NotebookLM by Google

    Google NotebookLM is a bot operated by Google for AI training.

  • Google-CloudVertexBot by Google

    Google-CloudVertexBot is a crawler operated by Google for targeted AI training on site owners' own sites.

  • GoogleOther by Google

    GoogleOther is a generic crawler operated by Google for fetching publicly accessible content from sites, used for internal research and development.

  • GPTBot by OpenAI

    GPTBot is used to crawl content that may be used in training OpenAI's generative AI foundation models.

  • ICC Crawler by NICT

    ICC Crawler is a web crawler operated by NICT, collecting web pages for AI training.

  • LINER Bot by Liner Bot

    LINER Bot is a web crawler operated by Liner Bot, used for AI training by collecting data from the internet.

  • Meta-ExternalAgent by Meta

    Meta-ExternalAgent is a bot operated by Meta that indexes content directly to train AI models or improve products.

  • netEstate Imprint Crawler by netEstate

    The netEstate Imprint Crawler crawls websites for public contact information.

  • Novellum AI Crawl by Novellum

    Novellum.ai is building out tools for building agents. This MCP tool will be used by agents to crawl sites.

  • PetalBot by Huawei

    PetalBot is a search engine crawler operated by Huawei, used for indexing websites and providing content recommendations in Petal Search engine, Huawei Assistant, and AI Search services.

  • QualifiedBot by Qualified.com, Inc.

    QualifiedBot is a crawler operated by Qualified.com, Inc. to power AI products by crawling customer websites.

  • SemrushBot-OCOB by Semrush

    SemrushBot-OCOB is a bot operated by Semrush for AI training, specifically for its Content Toolkit.

  • SemrushBotSwa by SEMrush

    SemrushBotSwa is a bot operated by SEMrush for AI training purposes, collecting data for the SEO Writing Assistant tool.

  • ShapBot by Parallel

    Not officially documented

  • WARDBot by WEBSPARK

    WARDBot is a monitoring bot operated by WEBSPARK that tracks URL status codes for users.

See all bots in the full catalog