Which user-agent strings identify MirrorWebCrawler?

MirrorWebCrawler identifies itself with the following user-agent string: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36 +https://www.mirrorweb.com

Does MirrorWebCrawler respect robots.txt rules?

According to official documentation, MirrorWebCrawler does not respect robots.txt rules.

How can I verify MirrorWebCrawler traffic with live traffic data?

You can verify MirrorWebCrawler requests by checking your server access logs for the documented user-agent string. A user-agent header can be set by any client, so for stronger verification correlate matches with any IP ranges or verification methods that MirrorWeb Ltd publishes.

What is MirrorWebCrawler?

Direct Answer: MirrorWebCrawler is a web archiving bot operated by MirrorWeb Ltd, which provides archival solutions for the financial and public sectors.

Operator: MirrorWeb Ltd
Type: Other Bot
Purpose: Web archiving for compliance and regulatory purposes

MirrorWebCrawler is a commercial web archiving bot that captures and preserves website content in real time. It is operated by MirrorWeb Ltd, a company that provides archival solutions for the financial and public sectors. The bot's primary function is to archive websites for compliance, proof, and peace of mind: meeting financial regulations, satisfying FOIA requirements, or preserving content for legal purposes.

User-Agent Identification

The following user-agent string identifies MirrorWebCrawler in your live traffic data:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36 +https://www.mirrorweb.com
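
Most of this string is a generic Chrome-on-Windows user agent, so a match on the whole string is brittle. A minimal sketch, assuming the trailing "+https://www.mirrorweb.com" token is the distinguishing part (the function and constant names are hypothetical, not from MirrorWeb's documentation):

```python
# Assumption: the "+https://www.mirrorweb.com" suffix is what sets this crawler
# apart; everything before it is an ordinary Chrome 78 user agent.
MIRRORWEB_TOKEN = "+https://www.mirrorweb.com"


def is_mirrorweb_user_agent(user_agent: str) -> bool:
    """Return True if the user-agent string carries the MirrorWeb token."""
    return MIRRORWEB_TOKEN in (user_agent or "").lower()


documented = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36 "
    "+https://www.mirrorweb.com"
)
print(is_mirrorweb_user_agent(documented))
```

A match only shows that a client claims to be MirrorWebCrawler; it does not prove where the request came from.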

robots.txt Rules for MirrorWebCrawler

Respects robots.txt: No

This bot does not commit to following robots.txt.

MirrorWebCrawler does not officially follow robots.txt directives. The only reliable way to control access is through server-side blocking (IP filtering, user-agent rules in your web server config) combined with log monitoring to verify effectiveness.
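
Where editing the web server config is not an option, a comparable user-agent rule can be applied at the application layer. The following is a rough sketch of a WSGI middleware that answers 403 to matching requests; the class name and the choice of the "+https://www.mirrorweb.com" token as the match key are assumptions for illustration, not a documented method:

```python
from typing import Callable, Iterable

# Assumption: the trailing URL token is the distinguishing part of the user agent.
MIRRORWEB_TOKEN = "+https://www.mirrorweb.com"


class BlockMirrorWebCrawler:
    """WSGI middleware that returns 403 for requests carrying the MirrorWeb token."""

    def __init__(self, app: Callable) -> None:
        self.app = app

    def __call__(self, environ: dict, start_response: Callable) -> Iterable[bytes]:
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if MIRRORWEB_TOKEN in user_agent.lower():
            body = b"Forbidden\n"
            start_response(
                "403 Forbidden",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Every other request passes through to the wrapped application.
        return self.app(environ, start_response)
```

Wrap an existing WSGI application with `BlockMirrorWebCrawler(app)`. Because this rule keys on a header the client controls, it should be paired with the log monitoring described above to confirm it has the intended effect.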


Crawl Behavior

Frequency: Continuous

Request Pattern: Real-time website capture

Crawl Activity Index

Relative crawl activity for MirrorWebCrawler over the past 28 days. Higher values indicate increased crawling intensity compared to the period baseline.

Recent activity data (last 7 days)

Date	Activity Index
Mar 28, 2026	28.0
Mar 29, 2026	26.3
Mar 30, 2026	20.2
Mar 31, 2026	17.1
Apr 1, 2026	16.4
Apr 2, 2026	23.7
Apr 3, 2026	23.3

Source: Cloudflare Radar
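
To put the seven values in context, a short sketch that summarizes them. The 28-day baseline behind the index is not given here, so the 7-day mean below is only a local reference point and an assumption of this example, not the Cloudflare Radar baseline:

```python
from statistics import mean

# Values copied from the table above (Mar 28 to Apr 3, 2026).
activity = {
    "Mar 28": 28.0,
    "Mar 29": 26.3,
    "Mar 30": 20.2,
    "Mar 31": 17.1,
    "Apr 1": 16.4,
    "Apr 2": 23.7,
    "Apr 3": 23.3,
}

average = mean(activity.values())
peak_day = max(activity, key=activity.get)
low_day = min(activity, key=activity.get)

print(f"7-day mean: {average:.1f}")
print(f"Highest: {peak_day} ({activity[peak_day]})")
print(f"Lowest: {low_day} ({activity[low_day]})")
```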

Why track MirrorWebCrawler traffic?

Identify and classify unknown crawler activity. MirrorWebCrawler may appear in your live traffic data with varying frequency. Tracking its behavior helps you decide whether to allow, throttle, or block it based on actual data.

Protect your crawl budget. Every bot request consumes server resources. Understanding what MirrorWebCrawler crawls helps you prioritize the crawlers that matter.

Log Verification

To verify MirrorWebCrawler traffic in your live traffic data:

Search access logs for the user-agent strings listed above
Check if the IP addresses match documented ranges (if provided by MirrorWeb Ltd)
Verify the crawl pattern matches documented behavior
Use reverse DNS lookup for additional verification if available
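
The first and last steps above can be sketched as a small log scan. This example assumes the common "combined" access log format (client IP first, user agent as the last quoted field); the function names, the sample IP addresses, and the sample log lines are made up for illustration:

```python
import re
import socket
from collections import Counter
from typing import Iterable, Optional

# Assumption: the trailing URL token is the distinguishing part of the user agent.
MIRRORWEB_TOKEN = "+https://www.mirrorweb.com"

# Assumes "combined" log format: IP first, user agent as the final quoted field.
LOG_LINE = re.compile(r'^(?P<ip>\S+) .* "(?P<ua>[^"]*)"\s*$')


def mirrorweb_hits_by_ip(lines: Iterable[str]) -> Counter:
    """Count requests per client IP whose user agent carries the MirrorWeb token."""
    hits: Counter = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and MIRRORWEB_TOKEN in match.group("ua").lower():
            hits[match.group("ip")] += 1
    return hits


def reverse_dns(ip: str) -> Optional[str]:
    """Return the PTR hostname for an IP, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


# Made-up sample lines using documentation-reserved IP addresses.
sample = [
    '203.0.113.7 - - [28/Mar/2026:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/78.0.3904.97 Safari/537.36 +https://www.mirrorweb.com"',
    '198.51.100.4 - - [28/Mar/2026:10:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64)"',
]
print(mirrorweb_hits_by_ip(sample))
```

In practice, pass an open log file instead of `sample`, then call `reverse_dns` on the IPs that appear most often. Since no IP verification method is officially documented for this bot, the returned hostname is something to inspect manually rather than a pass/fail check.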

Note: Observed behavior in production environments may differ from official documentation. Live traffic monitoring provides the only reliable verification of actual bot behavior.

Undocumented Information

The following information is not officially documented for MirrorWebCrawler:

crawl frequency details
IP verification method
JavaScript rendering details

Official Documentation


Information sourced from official documentation. Content generated with AI assistance.