GuardianScan Bot
Technical documentation for webmasters
Last updated: 30 October 2025
What is GuardianScan?
GuardianScan is a website audit tool for UK developers and agencies. It scans for Core Web Vitals, WCAG 2.2 accessibility, security headers, SEO, and modern framework optimisation.
Our bot only visits sites when explicitly requested by authenticated users. We do not perform continuous or unsolicited crawling, and we do not build search indexes. Each scan is a one-time, on-demand audit lasting approximately 45 seconds.
What is the GuardianScan user agent?
Mozilla/5.0 (Linux; Android 11; Pixel 5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Mobile Safari/537.36 (compatible; GuardianScan/1.0; +https://guardianscan.ai/bot)
We use a mobile user agent (Pixel 5) because Google ranks sites based on mobile performance. This matches how your site appears in Google Search Console Core Web Vitals reports.
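Server operators who want to recognise these requests in logs or middleware may find it more robust to match on the `GuardianScan/` product token than on the full string, since the browser version portion could change. The following is a minimal sketch under that assumption; the function name `is_guardianscan` is hypothetical and is not part of any GuardianScan API.

```python
import re

# Matches the product token and captures its version, e.g. "GuardianScan/1.0".
# Case-sensitive on purpose, so the lowercase info URL in the string does not match.
_GUARDIANSCAN_TOKEN = re.compile(r"\bGuardianScan/(\d+(?:\.\d+)*)")


def is_guardianscan(user_agent: str) -> bool:
    """Return True if the User-Agent header carries the GuardianScan token."""
    return _GUARDIANSCAN_TOKEN.search(user_agent or "") is not None
```

A User-Agent header can be set by any client, so a match identifies a request that claims to be GuardianScan; it does not authenticate it.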
What We Check
Performance
- Core Web Vitals (LCP, CLS) and Total Blocking Time (TBT), Lighthouse's lab proxy for interactivity
- Lighthouse scores
- Load times
Accessibility
- WCAG 2.2 Level AA compliance (UK Equality Act 2010)
- Screen-reader compatibility
- Keyboard navigation
Security
- Security headers
- HTTPS configuration
- CSP policies
Modern Standards
- Framework patterns
- Image optimisation
- SEO best practices
How do I block the GuardianScan bot?
We respect robots.txt directives. To block GuardianScan from scanning your site, add the following to your robots.txt file:
User-agent: GuardianScan
Disallow: /
Note: This will prevent GuardianScan users from auditing your website.
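Before deploying the rule, it can be checked locally. The sketch below uses Python's standard-library `urllib.robotparser` to evaluate the directives above; it shows how that parser interprets the file, which is an approximation of, not a guarantee about, how GuardianScan itself evaluates it. The helper name `allows` and the `example.com` URL are placeholders.

```python
from urllib.robotparser import RobotFileParser

# The directives from this page, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: GuardianScan
Disallow: /
"""


def allows(robots_txt: str, agent: str, url: str) -> bool:
    """Evaluate a robots.txt body for one agent and URL, with no network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(allows(ROBOTS_TXT, "GuardianScan", "https://example.com/"))  # False
print(allows(ROBOTS_TXT, "Googlebot", "https://example.com/"))     # True
```

The product token `GuardianScan` is passed here, not the full user agent string, because `urllib.robotparser` matches on the text before the first `/` of the agent name.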
Technical Details
- Browser: Chromium-based (headless Chrome 131)
- Origin: UK-based cloud infrastructure
- Scan Duration: About 45 seconds for most sites
- Standards Compliance: RFC 9309 (Robots Exclusion Protocol)
- Scanning Behaviour: On-demand only (no automated crawling or indexing)
Privacy and Data Protection
Data Controller Information
What Data We Collect
We only access publicly available content. Our scanning process collects:
- Target URL and final resolved URL (after redirects)
- Viewport screenshot (PNG format, visible area only)
- HTTP response headers (for security analysis)
- Page performance metrics (Core Web Vitals, load times)
- Accessibility audit results (WCAG 2.2 findings)
- Console errors and warnings (up to 100 each)
- IP addresses (for rate limiting only, not stored permanently)
- Email addresses (for free scan delivery only, when provided)
We do not collect: cookies, session data, form inputs, authentication tokens, full-page HTML source code, or any personal data beyond what is publicly visible.
Legal Basis for Processing
Under UK GDPR Article 6, we process your data on the following lawful bases:
- Performance of contract (Article 6(1)(b)) – for providing scanning services to authenticated users
- Legitimate interests (Article 6(1)(f)) – for security monitoring, abuse prevention, and service improvement
Data Storage and Security
- All data stored in UK-based infrastructure (Supabase London region, United Kingdom)
- Screenshots encrypted at rest (AES-256) and in transit (TLS 1.3) with row-level security controls
- Access restricted to the authenticated user who requested the scan via database row-level security policies
- Authenticated users: scan results are retained until manual deletion or account closure.
- Free scans: email addresses are removed after 30 days.
- After the 30-day retention period, scan data is anonymised: direct identifiers (email addresses, user IDs, IP addresses) are removed, small cell counts (websites with fewer than 5 scans) are suppressed, temporal data is aggregated to monthly periods, URL domains are generalised, and statistical noise is added.
- This process has been assessed to ensure re-identification is not reasonably likely, in accordance with ICO guidance. Once anonymised, the data is no longer personal data under UK GDPR and is retained indefinitely for service improvement and industry benchmarking.
- No sharing, selling, or commercial use of scan data
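To make the anonymisation steps described above concrete, here is an illustrative sketch of that kind of pipeline. It is not GuardianScan's implementation. Several details are assumptions because the text does not specify them: the noise distribution (Gaussian here), the noise scale, reading "URL domain generalisation" as keeping only the hostname, and applying the fewer-than-5 threshold per website across all months. The function name `anonymise` is hypothetical.

```python
import random
from collections import Counter
from datetime import datetime
from urllib.parse import urlsplit

MIN_SCANS_PER_SITE = 5  # threshold stated in the retention policy


def anonymise(scans, noise_scale=1.0, rng=None):
    """Aggregate (url, iso_timestamp) pairs into noisy per-domain monthly counts.

    Direct identifiers (emails, user IDs, IPs) are assumed to be dropped
    before this step, so the input carries only a URL and a timestamp.
    """
    rng = rng or random.Random()
    cells = Counter()     # scans per (domain, month)
    per_site = Counter()  # total scans per domain, used for suppression
    for url, timestamp in scans:
        domain = urlsplit(url).hostname or ""                      # generalise URL to its domain
        month = datetime.fromisoformat(timestamp).strftime("%Y-%m")  # monthly bucket
        cells[(domain, month)] += 1
        per_site[domain] += 1

    result = {}
    for (domain, month), count in cells.items():
        if per_site[domain] < MIN_SCANS_PER_SITE:
            continue  # suppress websites with fewer than 5 scans
        noisy = count + rng.gauss(0, noise_scale)  # assumed Gaussian noise
        result[(domain, month)] = max(0, round(noisy))
    return result
```

Whether output of this shape is anonymous in the legal sense depends on the assessment the policy refers to; the code only illustrates the mechanical steps.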
Third-Party Data Processors
We use the following sub-processors to deliver our service:
- Browserless.io (browser automation) – UK/EU servers, used for headless Chrome execution
- Supabase (data storage and authentication) – UK/EU infrastructure, PostgreSQL database and file storage
- Upstash (rate limiting) – Global infrastructure with EU data residency
All processors operate under data processing agreements compliant with UK GDPR Article 28.
Your Data Protection Rights
Under UK GDPR, you have the right to:
- Access your personal data (Article 15)
- Rectification of inaccurate data (Article 16)
- Erasure of your data (“right to be forgotten”) (Article 17)
- Restriction of processing (Article 18)
- Data portability (Article 20)
- Object to processing (Article 21)
- Lodge a complaint with the Information Commissioner's Office (ICO) at www.ico.org.uk
To exercise these rights, contact us at privacy@guardianscan.ai
Compliance: UK GDPR (as retained in UK law post-Brexit), Data Protection Act 2018, Privacy and Electronic Communications Regulations (PECR) 2003, and Electronic Commerce Regulations 2002.
Contact
Questions about our bot or need to report an issue?
Numen Technology Limited
86-90 Paul Street
London, EC2A 4NE
United Kingdom