# Steuersoft Tools - crawl profile
# Strategy: main tool pages + knowledge sources open for search/LLM indexing,
# JSON indexes + embeddings + sensitive tools ONLY for legitimate LLM crawlers.

# ====== Default bots (e.g. scrapers, generic crawlers) ======
User-agent: *
Allow: /
Allow: /index.html
# Block data endpoints + tool indexes (= protect our pipeline assets)
Disallow: /*-index.json
Disallow: /*-embeddings.json
Disallow: /datasources.json
Disallow: /meta.json
Disallow: /freshness.json
Disallow: /tool-of-month.json
Disallow: /stats.json
Disallow: /analytics.html
Disallow: /bstbl-pdfs/
Disallow: /vendor/
Disallow: /api/
Disallow: /go
Disallow: /uebernahme-track
Disallow: /install-track
Disallow: /pdf-track/
Disallow: /copy-track/
# Sensitive tools (client data)
Disallow: /BescheidChecker/
Disallow: /EinspruchsGenerator/
Disallow: /MandantenbriefGenerator/
Disallow: /FristenCockpitV2/
Disallow: /MandantenOnboarding/
Disallow: /Vollmacht80AO/

# ====== Established search engine bots (full indexing allowed) ======
User-agent: Googlebot
Allow: /
Disallow: /bstbl-pdfs/
Disallow: /vendor/

User-agent: Bingbot
Allow: /
Disallow: /bstbl-pdfs/
Disallow: /vendor/

User-agent: DuckDuckBot
Allow: /
Disallow: /bstbl-pdfs/
Disallow: /vendor/

# ====== LLM crawlers (indexing welcome, but NOT the JSON indexes) ======
User-agent: GPTBot
Allow: /
Disallow: /*-index.json
Disallow: /*-embeddings.json
Disallow: /datasources.json

User-agent: OAI-SearchBot
Allow: /
Disallow: /*-index.json
Disallow: /*-embeddings.json
Disallow: /datasources.json

User-agent: ChatGPT-User
Allow: /
Disallow: /*-index.json

User-agent: ClaudeBot
Allow: /
Disallow: /*-index.json
Disallow: /*-embeddings.json
Disallow: /datasources.json

User-agent: anthropic-ai
Allow: /
Disallow: /*-index.json

User-agent: PerplexityBot
Allow: /
Disallow: /*-index.json

User-agent: Perplexity-User
Allow: /
Disallow: /*-index.json

User-agent: Google-Extended
Allow: /
Disallow: /*-index.json

User-agent: Applebot-Extended
Allow: /
Disallow: /*-index.json

User-agent: CCBot
Allow: /
Disallow: /*-index.json
Disallow: /*-embeddings.json

User-agent: cohere-ai
Allow: /
Disallow: /*-index.json

User-agent: MistralAI-User
Allow: /
Disallow: /*-index.json

# ====== Block known scraper bots ======
User-agent: SemrushBot
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: DotBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: DataForSeoBot
Disallow: /

User-agent: PetalBot
Disallow: /

User-agent: SeekportBot
Disallow: /

User-agent: ImagesiftBot
Disallow: /

Sitemap: https://tools.steuersoft.de/sitemap.xml