# robots.txt for digitalheroes.co.in
# Last updated: 2026-04-25; see CLAUDE.md §25 (AI crawler access policy)

# === Standard search crawlers ===
User-agent: Googlebot
Allow: /

# Googlebot-Video specifically: skip decorative hero / brand-loop videos that
# are not on a "watch page" (no VideoObject schema, no transcript, no main
# content focus). Google Search Console flagged these as "Video isn't on a
# watch page"; adding explicit Disallow stops the crawler from re-attempting
# indexation. Add new decorative .mp4 paths here as they ship.
User-agent: Googlebot-Video
Disallow: /assets/fire.mp4
Allow: /

User-agent: Bingbot
Allow: /

User-agent: DuckDuckBot
Allow: /

User-agent: YandexBot
Allow: /

User-agent: Baiduspider
Allow: /

# === AI training crawlers (default-open per CLAUDE.md §25) ===
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: CCBot
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: Meta-ExternalAgent
Allow: /

User-agent: Bytespider
Allow: /

User-agent: Amazonbot
Allow: /

# === AI search retrieval crawlers (feed AI answer engines in real time) ===
User-agent: OAI-SearchBot
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: DuckAssistBot
Allow: /

# === User-triggered fetches (ChatGPT Browse, Claude tool-use) ===
User-agent: ChatGPT-User
Allow: /

User-agent: Claude-User
Allow: /

# === Default policy for all other bots ===
User-agent: *
Allow: /
Disallow: /dist/
Disallow: /node_modules/
Disallow: /scripts/
Disallow: /_handoff/

# === Sitemap location ===
Sitemap: https://digitalheroes.co.in/sitemap.xml

# === Host ===
Host: https://digitalheroes.co.in