GuardianScan Bot

Technical documentation for webmasters

Last updated: 30 October 2025

What is GuardianScan?

GuardianScan is a website audit tool for UK developers and agencies. We scan for Core Web Vitals, WCAG 2.2 accessibility, security headers, SEO, and modern framework optimisation.

Our bot only visits sites when explicitly requested by authenticated users. We do not perform continuous or unsolicited crawling, and we do not build search indexes. Each scan is a one-time, on-demand audit lasting approximately 45 seconds.

What is the GuardianScan user agent?

Mozilla/5.0 (Linux; Android 11; Pixel 5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Mobile Safari/537.36 (compatible; GuardianScan/1.0; +https://guardianscan.ai/bot)

We use a mobile user agent (Pixel 5) because Google uses mobile-first indexing: the mobile version of a page is what it primarily crawls and ranks. Scanning as a mobile device keeps our results comparable with the mobile data in Google Search Console's Core Web Vitals report.
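
If you want to spot GuardianScan requests in your own logs or middleware, one option is to look for the GuardianScan/<version> product token in the User-Agent header. The sketch below is a minimal, unofficial Python example; the function name guardianscan_version is hypothetical. Any client can send this header, so treat a match as a hint rather than proof of origin.

```python
import re
from typing import Optional

# Product token as it appears in the user agent string documented above.
GUARDIANSCAN_TOKEN = re.compile(r"\bGuardianScan/(\d+(?:\.\d+)*)")


def guardianscan_version(user_agent: Optional[str]) -> Optional[str]:
    """Return the GuardianScan version found in a User-Agent header, or None."""
    match = GUARDIANSCAN_TOKEN.search(user_agent or "")
    return match.group(1) if match else None


ua = (
    "Mozilla/5.0 (Linux; Android 11; Pixel 5) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/131.0.0.0 Mobile Safari/537.36 "
    "(compatible; GuardianScan/1.0; +https://guardianscan.ai/bot)"
)
print(guardianscan_version(ua))  # 1.0
```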

What We Check

Performance

• Core Web Vitals (LCP, CLS) and Total Blocking Time (TBT), the lab metric used as a proxy for responsiveness
• Lighthouse scores
• Load times
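
To give a feel for how the performance metrics above are usually interpreted, here is a small Python sketch that rates LCP, TBT, and CLS against the thresholds Google publishes for web.dev and Lighthouse. It is an assumption that these are the bands GuardianScan reports against; the names THRESHOLDS and rate_metric are hypothetical.

```python
# (good upper bound, needs-improvement upper bound) per metric.
# Values follow Google's published guidance; GuardianScan's own
# scoring bands are not documented here, so treat these as an assumption.
THRESHOLDS = {
    "lcp_ms": (2500, 4000),  # Largest Contentful Paint, milliseconds
    "tbt_ms": (200, 600),    # Total Blocking Time, milliseconds (Lighthouse mobile)
    "cls": (0.1, 0.25),      # Cumulative Layout Shift, unitless
}


def rate_metric(name: str, value: float) -> str:
    """Classify a metric value as good, needs improvement, or poor."""
    good, needs_improvement = THRESHOLDS[name]
    if value <= good:
        return "good"
    if value <= needs_improvement:
        return "needs improvement"
    return "poor"


print(rate_metric("lcp_ms", 2400))  # good
print(rate_metric("tbt_ms", 450))   # needs improvement
print(rate_metric("cls", 0.3))      # poor
```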

Accessibility

• WCAG 2.2 Level AA compliance (UK Equality Act 2010)
• Screen-reader compatibility
• Keyboard navigation

Security

• Security headers
• HTTPS configuration
• CSP policies
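
As a rough illustration of what a security header check involves, the Python sketch below reports which commonly recommended response headers are absent. The checklist is an assumption chosen for the example, not the list GuardianScan uses, and missing_security_headers is a hypothetical name.

```python
from typing import List, Mapping

# Illustrative checklist only; the headers GuardianScan actually
# inspects are not enumerated in this document.
EXPECTED_HEADERS = (
    "strict-transport-security",
    "content-security-policy",
    "x-content-type-options",
    "x-frame-options",
    "referrer-policy",
    "permissions-policy",
)


def missing_security_headers(headers: Mapping[str, str]) -> List[str]:
    """Return the expected headers not present (names compared case-insensitively)."""
    present = {name.lower() for name in headers}
    return [name for name in EXPECTED_HEADERS if name not in present]


response_headers = {
    "Strict-Transport-Security": "max-age=63072000; includeSubDomains",
    "Content-Security-Policy": "default-src 'self'",
}
print(missing_security_headers(response_headers))
```

A header being present does not mean its value is well configured; a fuller check would also inspect the values.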

Modern Standards

• Framework patterns
• Image optimisation
• SEO best practices

How do I block the GuardianScan bot?

We respect robots.txt directives. To block GuardianScan from scanning your site, add the following to your robots.txt file:

User-agent: GuardianScan
Disallow: /

Note: This will prevent GuardianScan users from auditing your website.
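
Before deploying the rule, you can check that it behaves as intended. The following is a minimal sketch using Python's standard urllib.robotparser module; the helper name is_allowed and the example.com URL are placeholders. It shows how a standard parser reads the rule and is not a description of GuardianScan's internal implementation.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: GuardianScan
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Pass the product token, not the full browser-style UA string:
    # the parser only looks at the text before the first "/".
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "GuardianScan", "https://example.com/"))  # False
print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/"))  # True
```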

Technical Details

Browser: Chromium-based (headless Chrome 131)
Origin: UK-based cloud infrastructure
Scan Duration: About 45 seconds for most sites
Standards Compliance: RFC 9309 (Robots Exclusion Protocol)
Scanning Behaviour: On-demand only (no automated crawling or indexing)

Privacy and Data Protection

Data Controller Information

Legal Entity

Numen Technology Limited

Company Number

13262519 (England and Wales)

ICO Registration

ZC018646

Registered Address

86-90 Paul Street, London, EC2A 4NE, United Kingdom

What Data We Collect

We only access publicly available content. Our scanning process collects:

Target URL and final resolved URL (after redirects)
Viewport screenshot (PNG format, visible area only)
HTTP response headers (for security analysis)
Page performance metrics (Core Web Vitals, load times)
Accessibility audit results (WCAG 2.2 findings)
Console errors and warnings (up to 100 each)
IP addresses (for rate limiting only, not stored permanently)
Email addresses (for free scan delivery only, when provided)

We do not collect cookies, session data, form inputs, authentication tokens, full-page HTML source code, or any personal data beyond what is publicly visible.

Legal Basis for Processing

Under UK GDPR Article 6, we process your data on the following lawful bases:

Performance of contract (Article 6(1)(b)) – for providing scanning services to authenticated users
Legitimate interests (Article 6(1)(f)) – for security monitoring, abuse prevention, and service improvement

Data Storage and Security

All data stored in UK-based infrastructure (Supabase London region, United Kingdom)
Screenshots encrypted at rest (AES-256) and in transit (TLS 1.3) with row-level security controls
Access restricted to the authenticated user who requested the scan via database row-level security policies
Authenticated users: scan results are retained until manual deletion or account closure.
Free scans: email addresses are removed after 30 days.
After the 30-day retention period, scan data is anonymised: all direct identifiers (email addresses, user IDs, IP addresses) are removed, small cell counts (websites with fewer than 5 scans) are suppressed, temporal data is aggregated to monthly periods, URL domains are generalised, and statistical noise is added.
This process has been assessed to ensure re-identification is not reasonably likely, in accordance with ICO guidance.
Once anonymised, the data is no longer personal data under UK GDPR and is retained indefinitely for service improvement and industry benchmarking.
No sharing, selling, or commercial use of scan data
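
The anonymisation steps described above can be sketched roughly as follows. This is an illustrative Python example, not our production pipeline: the record fields (url, scanned_on, score, email), the function name anonymise, and the choice of Gaussian noise with a configurable scale are all assumptions made for the example. Only the threshold of 5 scans and the monthly aggregation come from the description above.

```python
import random
from collections import defaultdict
from datetime import date
from urllib.parse import urlsplit

MIN_SCANS = 5  # from the text: suppress websites with fewer than 5 scans


def anonymise(records, noise_scale=1.0, rng=None):
    """Aggregate scan records into noisy per-domain, per-month rows.

    Direct identifiers (email, user ID, IP) are dropped simply by never
    copying them into the output.
    """
    rng = rng or random.Random()
    groups = defaultdict(list)
    scans_per_domain = defaultdict(int)
    for rec in records:
        domain = urlsplit(rec["url"]).hostname       # URL generalised to its host
        month = rec["scanned_on"].strftime("%Y-%m")  # date coarsened to a month
        groups[(domain, month)].append(rec["score"])
        scans_per_domain[domain] += 1

    rows = []
    for (domain, month), scores in sorted(groups.items()):
        if scans_per_domain[domain] < MIN_SCANS:     # small-cell suppression
            continue
        mean = sum(scores) / len(scores)
        noisy_mean = mean + rng.gauss(0, noise_scale)  # assumed noise model
        rows.append({"domain": domain, "month": month,
                     "scans": len(scores), "mean_score": round(noisy_mean, 1)})
    return rows


sample = [
    {"url": f"https://example.com/page{i}", "scanned_on": date(2025, 10, i + 1),
     "score": 80 + i, "email": "user@example.com"}
    for i in range(5)
]
sample.append({"url": "https://rare.example.org/",
               "scanned_on": date(2025, 10, 3), "score": 50})
# noise_scale=0.0 switches the noise off so the output is reproducible.
print(anonymise(sample, noise_scale=0.0))
```

In this sample the site with a single scan is suppressed entirely, and the remaining row carries only a domain, a month, a scan count, and an averaged score.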

Third-Party Data Processors

We use the following sub-processors to deliver our service:

Browserless.io (browser automation) – UK/EU servers, used for headless Chrome execution
Supabase (data storage and authentication) – UK/EU infrastructure, PostgreSQL database and file storage
Upstash (rate limiting) – Global infrastructure with EU data residency

All processors operate under data processing agreements compliant with UK GDPR Article 28.

Your Data Protection Rights

Under UK GDPR, you have the right to:

Access your personal data (Article 15)
Rectification of inaccurate data (Article 16)
Erasure of your data (“right to be forgotten”) (Article 17)
Restriction of processing (Article 18)
Data portability (Article 20)
Object to processing (Article 21)
Lodge a complaint with the Information Commissioner's Office (ICO) at www.ico.org.uk

To exercise these rights, contact us at privacy@guardianscan.ai

Compliance: UK GDPR (as retained in UK law post-Brexit), Data Protection Act 2018, Privacy and Electronic Communications (EC Directive) Regulations 2003 (PECR), and Electronic Commerce (EC Directive) Regulations 2002.

Contact

Questions about our bot or need to report an issue?

bot@guardianscan.ai

Postal Address

Numen Technology Limited
86-90 Paul Street
London, EC2A 4NE
United Kingdom

Website

guardianscan.ai