robots.txt — Content-Signal

Per site: 4.2% (21 of 502 publishers).

These are the sites whose robots.txt carries Cloudflare's Content-Signal directive, a recent proposal published at contentsignals.org. The directive defines three independent signals: search (use in search results), ai-input (use as input or RAG context for AI answers), and ai-train (use to train AI models). The columns show the yes/no value each publisher declares for each signal.
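
As a rough illustration of how such declarations could be read out of a robots.txt file, here is a minimal Python sketch. It assumes the comma-separated key=value form described at contentsignals.org (for example "Content-Signal: search=yes, ai-train=no"). The function name parse_content_signals and the sample file are hypothetical and are not taken from any listed publisher. The sketch ignores User-agent grouping and reports a signal as None when the file does not declare it.

```python
# Hypothetical helper: extract Content-Signal declarations from robots.txt text.
# Assumes the "Content-Signal: key=value, key=value" form; not a full
# robots.txt parser (User-agent groups are ignored).

SIGNALS = ("search", "ai-input", "ai-train")


def parse_content_signals(robots_txt):
    """Return {signal: True | False | None}; None means not declared."""
    declared = {name: None for name in SIGNALS}
    for raw_line in robots_txt.splitlines():
        # Drop comments so commented-out directives are not counted.
        line = raw_line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if not sep or field.strip().lower() != "content-signal":
            continue
        for pair in value.split(","):
            key, eq, setting = pair.partition("=")
            key = key.strip().lower()
            setting = setting.strip().lower()
            # Only accept known signals with an explicit yes/no value.
            if eq and key in declared and setting in ("yes", "no"):
                declared[key] = setting == "yes"
    return declared


# Illustrative input, not a real publisher's file.
example = """User-agent: *
Content-Signal: search=yes, ai-train=no
Allow: /
"""

print(parse_content_signals(example))
```

A publisher that declares only some of the signals would come back with None for the rest, which keeps an undeclared signal distinct from an explicit "no".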

Publisher Country search ai-input ai-train
Brújula Digital Bolivia (BO) yes no
CamboJA News Cambodia (KH) yes no
Dagblad Suriname Suriname (SR) yes no
Dagbladet Norway (NO) yes no no
De Ware Tijd Suriname (SR) yes no
DVB (Democratic Voice of Burma, English) Myanmar (MM) yes no
Enab Baladi (English) Syria (SY) yes no
Iraqi News Iraq (IQ) yes no
Islamic Emirate of Afghanistan - Alemarah Afghanistan (AF) yes no
Nederlands Dagblad Netherlands (NL) yes yes no
New Telegraph Nigeria (NG) yes no
Novinite (Sofia News Agency) Bulgaria (BG) yes no
Petra (Jordan News Agency) Jordan (JO) yes no
SVT Nyheter Sweden (SE) yes yes no
Tchadinfos Chad (TD) yes no
The Atlantic United States (US) yes no no
The Daily Star (Bangladesh) Bangladesh (BD) yes no
The Phnom Penh Post Cambodia (KH) yes no
Xalq So'zi (Narodnoe Slovo) Uzbekistan (UZ) yes no
Äripäev Estonia (EE) yes no
Дневник Bulgaria (BG) yes no