Does New York Times Newsgathering respect robots.txt rules?

According to official documentation, New York Times Newsgathering does not respect robots.txt rules.

How can I verify New York Times Newsgathering traffic with live traffic data?

You can verify New York Times Newsgathering requests by checking your server access logs for the documented user-agent strings. For accurate verification, correlate user-agent patterns with IP ranges or verification methods provided by The New York Times.

What is New York Times Newsgathering?

Direct Answer: New York Times Newsgathering bot collects public, non-copyright data for newsroom use.

Operator: The New York Times Type: Other Bot Purpose: Collecting public, non-copyright data for newsroom use

The New York Times Newsgathering bot is used by coders within the NYT newsroom to collect public, non-copyright data from government and commercial websites. It is used for tasks such as archival projects and public-service data collection, including U.S. Elections pages and Covid-19 trackers. The bot follows industry best practices, including controlling request volume, throttling, and identifying itself with custom UserAgents and X-headers.

User-Agent Identification

The following user-agent strings identify New York Times Newsgathering in your live traffic data:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36 nyt_scraping/scraping@nytimes.com

robots.txt Rules for New York Times Newsgathering

Respects robots.txt: No

Robots.txt has limited effect on user-initiated bots

New York Times Newsgathering is triggered by user actions within The New York Times's products. While The New York Times states it respects robots.txt, the bot operates differently from autonomous crawlers — it fetches specific URLs on demand rather than systematically spidering your site. Server-log monitoring is the only reliable way to verify what actually happens.

Need continuous verification across 500+ bots? Can AI See It automates this.

Crawl Behavior

Frequency:Not Documented

Request Pattern:Not Documented

Official Documentation Quotes

"Coders within The New York Times newsroom write scripts and scrapers that collect public, non-copyright data from government and commercial websites, ranging from archival tasks to public-service data like our U.S. Elections pages and Covid-19 trackers."
Source:Official Documentation

"We bake-in industry best practices like controlling the volume of requests, throttling/concurrency and identifying our work with custom UserAgents and X-headers."
Source:Official Documentation

Crawl Activity Index

Relative crawl activity for New York Times Newsgathering over the past 28 days. Higher values indicate increased crawling intensity compared to the period baseline.

View recent activity data (last 7 days)

Date	Activity Index
Mar 26, 2026	88.0
Mar 27, 2026	82.7
Mar 28, 2026	83.1
Mar 29, 2026	81.8
Mar 30, 2026	87.3
Mar 31, 2026	90.2
Apr 1, 2026	88.9

Source: Cloudflare Radar

Why track New York Times Newsgathering traffic?

Identify and classify unknown crawler activity. New York Times Newsgathering may appear in your live traffic data with varying frequency. Tracking its behavior helps you decide whether to allow, throttle, or block it based on actual data.

Protect your crawl budget. Every bot request consumes server resources. Understanding what New York Times Newsgathering crawls helps you prioritize the crawlers that matter.

Log Verification

To verify New York Times Newsgathering traffic in your live traffic data:

Search access logs for the user-agent strings listed above
Check if the IP addresses match documented ranges (if provided by The New York Times)
Verify the crawl pattern matches documented behavior
Use reverse DNS lookup for additional verification if available

Note: Observed behavior in production environments may differ from official documentation. Live traffic monitoring provides the only reliable verification of actual bot behavior.

Undocumented Information

The following information is not officially documented for New York Times Newsgathering:

crawl frequency
request pattern
JavaScript rendering details

Official Documentation

View Official New York Times Newsgathering Documentation →

Information sourced from official documentation. Content generated with AI assistance.