Responsible AI asks for consent.

You uploaded a photo, a song, a painting. Somewhere, a model is training on it, and nobody asked you first.

This site is a plain-English guide to what's happening to your data online, who it hurts, and the tools you can actually use to push back.

Built as an open educational resource: a project for Responsible AI, IIT Delhi. Contribute on GitHub ↗

It's not only artists.
It's everyone who ever put something online.

Generative AI is trained on the open web. That sounds abstract until you realise the open web means your selfies, your mother's voice note, the sketch you posted in eighth grade, the reel you filmed last Tuesday. Here are five ways that plays out.

01 · Ordinary users

Your photos, their deepfakes.

Every public photo is a training sample. Face-swap apps, undressing tools, and non-consensual deepfake porn all rely on models that learned from ordinary selfies, most of which were never meant for that purpose. Once an image leaves your phone, it can be reconstituted into something you never posed for.

e.g., a global surge in AI deepfakes driving new cases of child sexual violence.[1]

02 · Voters

Audio and video you can't trust.

Cloned political voices and AI-generated rally footage have already circulated during election seasons. Detection tools lag behind generation tools by months, sometimes years. The cost of making a convincing fake has dropped from a studio budget to a laptop and ten minutes.

e.g., AI-generated deepfake videos and cloned voices of political leaders flagged as threats across Indian elections.[2]

03 · Artists

Style mimicry isn't flattery: it's replacement.

A diffusion model that has seen a few dozen of an illustrator's pieces can generate new work in their style on demand, for free, in seconds. Commissions dry up. Clients pick the $0 version. The artist's own search results get buried under imitations. Class-action lawsuits are slow; by the time courts rule, the career damage is done.

e.g., the Glaze user study: 1,156 artists, 88% want a protection tool.[3]

04 · Creators

AI translation and the missing original.

Platforms now auto-dub creators into languages they don't speak, matching lip movements and voice timbre. Nuance, slang, and regional identity get flattened. More importantly: the AI-modified version often gets more reach than the original, so the original effectively disappears.

e.g., YouTube's auto-dubbing feature, which translates creators' videos into other languages automatically.[4][5]

05 · Everyone

You can't read what you can't find.

Most platforms bury AI-training clauses inside 40-page Terms of Service that the average person will never read: just 9% of adults say they always read privacy policies before agreeing.[7] Opting out of AI training is either missing, hidden three menus deep, or available only to users in specific jurisdictions.[6]

Consent, in practice, is manufactured by exhaustion.
The pattern across every major platform's ToS, 2023–25.

The collateral damage: a public record that's closing itself off.

The pushback from publishers isn't aimed only at OpenAI or Anthropic. When a newspaper updates its robots.txt to keep AI crawlers out, the same rule frequently locks out the Internet Archive too: the public-memory service that has preserved over a trillion web pages since 1996. By late 2024, a large share of the world's top news sites were blocking at least one major AI crawler;[8] many major news sites have also moved to block the Wayback Machine.[9] Journalism and the public record used to share one infrastructure. AI-training anxiety is pulling them apart.
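
The mechanism matters here. robots.txt speaks to crawlers one user-agent at a time, and publishers wary of scraping usually finish with a catch-all rule. Below is a hypothetical example (GPTBot and CCBot are real crawler names published by OpenAI and Common Crawl; the site itself is invented, and the catch-all at the end is the line that does the collateral damage):

    # robots.txt for a hypothetical news site tightening up against AI scraping
    User-agent: GPTBot        # OpenAI's crawler
    Disallow: /

    User-agent: CCBot         # Common Crawl, a major source of training data
    Disallow: /

    # The catch-all: turns away every well-behaved crawler not named above,
    # including the Internet Archive's.
    User-agent: *
    Disallow: /

The protocol has vocabulary for user-agents, not for purposes. There is no way to say "no model training, but archiving is fine," so publishers who want certainty block broadly.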

Figure: a split diagram. Left: the Internet Archive's open stack, one trillion-plus pages preserved since 1996, the public memory of the web. Right: 241 news sites blocking the Wayback Machine's crawler behind paywalls and padlocks. The fracture down the middle is new.

The bind is brutal: the same tools publishers reach for to keep AI models out also keep librarians, researchers, and ordinary readers from preserving what was published. The remedy we build for one problem is quietly dismantling the piece of the web most of us assumed would always be there.

India has over 800 million internet users[10]
and one of the world's weakest consent floors.

Most global AI-consent discourse is framed around GDPR and the EU AI Act. Indian users rarely see themselves in it.

India is one of the largest sources of training data on the planet: photos, voices, languages, faces, handwritten scripts. It is also, for now, one of the jurisdictions where AI companies face the least friction. The EU can fine a company billions under the AI Act. A user in Berlin can demand their data be deleted under GDPR. A user in Bengaluru, in most cases, cannot.

The regulatory gap

Jurisdiction · Key law · What it gives users
European Union · GDPR (2018)[11] + AI Act (2024)[12] · Right to erasure and to object to automated processing; AI Act penalties reach up to €35 million or 7% of global turnover for prohibited practices.
California, USA · CCPA / CPRA[13] · Right to know, delete, and opt out of sale or sharing, including for AI training.
India · DPDP Act, 2023[14] + DPDP Rules, 2025[15] · Consent-based framework; rules notified in 2025 and the Data Protection Board still being operationalised. No AI-specific provisions.
United Kingdom · Data Protection Act 2018 + emerging AI regulation · Similar to GDPR; still evolving.

Tested on India first

Several patterns recur in reporting on AI deployment in India:

  • Features launched in India before the EU or US, because regulatory review is faster and consent requirements looser.[16]
  • India positioned as a hub for building AI "for India and the world," attracting global model training on Indian-language data and identities.[17]
  • Low-wage data annotation and RLHF work subcontracted to Indian workers labelling violent or disturbing content without adequate psychological support.[18]

The pattern isn't conspiracy: it's economic gravity. Where the regulatory cost of a mistake is low and the labour cost of data work is low, companies go. India is currently both.

What creators in India can do today

Know your DPDP rights.

The Act gives you the right to consent, purpose limitation, and erasure. It is narrow, and it is still rolling out, but it is what you have.

Use the toolbox below.

Glaze, Nightshade, and C2PA work regardless of jurisdiction. Laws can wait; your images can't.

Document exposure.

If your work or likeness appears in a scraped dataset, keep a paper trail: it's the evidence base that legal and advocacy follow-up is built on.
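
One illustrative way to make that paper trail verifiable (a sketch of a workflow, not legal advice): fingerprint each work with SHA-256 so you can later prove which exact file appeared where. The fingerprint helper below is hypothetical and uses the browser's built-in Web Crypto API.

    // Hypothetical helper: SHA-256 fingerprint of a file, as a hex string.
    // Log it alongside the date and the URL where the work turned up.
    async function fingerprint(file: File): Promise<string> {
      const digest = await crypto.subtle.digest('SHA-256', await file.arrayBuffer());
      return Array.from(new Uint8Array(digest))
        .map((byte) => byte.toString(16).padStart(2, '0'))
        .join('');
    }

    // Example paper-trail entry:
    // { sha256: await fingerprint(file),
    //   date: new Date().toISOString(),
    //   source: 'URL of the dataset page where the work appeared' }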

You don't have to wait for a law.
These three tools exist now.

None of them are silver bullets. Each addresses a different layer of the problem: proving what something is, hiding what a model can learn from it, and poisoning what a model tries to steal. Used together, they shift leverage back to the creator.

Side-by-side

C2PA · Glaze · Nightshade
What it does: labels content with signed provenance · hides your style from learning · poisons what a scraper learns.
Who it's for: everyone · artists with a style at risk · artists willing to escalate.
Works if already scraped? Partially (for future edits) · no · no.
Requires cooperation? Yes: AI companies must honour it · no · no.
Stance: defensive / legal · defensive / technical · offensive / technical.
Cost to user: free (Adobe CAI) · free · free.
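
To make the C2PA column concrete: Content Credentials are signed metadata travelling with the file, and anyone can read them back. Here is a minimal reading sketch using the open-source c2pa package from the Content Authenticity Initiative; the import paths follow its quick-start for a Vite-style bundler and may differ in your setup, so treat this as a sketch and check the current docs before copying.

    import { createC2pa } from 'c2pa';
    import wasmSrc from 'c2pa/dist/assets/wasm/toolkit_bg.wasm?url';
    import workerSrc from 'c2pa/dist/c2pa.worker.min.js?url';

    // The package parses manifests in WASM, off the main thread.
    const c2pa = await createC2pa({ wasmSrc, workerSrc });

    // Read whatever Content Credentials are attached to an image.
    const { manifestStore } = await c2pa.read('https://example.com/signed.jpg');
    console.log(manifestStore?.activeManifest ?? 'No credentials found');

If the manifest is intact, activeManifest carries who signed the image and what edits it declares: the whole "defensive / legal" stance in one object.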

Try it yourself.

Below is a side-by-side sandbox. On the left: a regular image as a scraper sees it, defenceless, learnable, repeatable. On the right: the same image with C2PA credentials attached and a Glaze-style perturbation applied. Watch what the simulated scraper picks up in each case.

Raw: as the scraper sees it

Protected: C2PA + Glaze overlay

Built-in sample: a stylised portrait.

Your image never leaves your browser. Everything above is processed locally via the Canvas API: no upload, no server.
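
For the curious, the local pipeline fits in one function. The sketch below is not Glaze (Glaze computes optimised, style-targeted perturbations; this adds plain random noise), and perturbLocally is a hypothetical name: it exists only to show that an image can be read, modified, and redrawn without ever touching the network.

    // Toy, browser-only sketch of the sandbox pipeline; NOT the Glaze algorithm.
    function perturbLocally(img: HTMLImageElement, strength = 4): HTMLCanvasElement {
      const canvas = document.createElement('canvas');
      canvas.width = img.naturalWidth;
      canvas.height = img.naturalHeight;
      const ctx = canvas.getContext('2d')!;
      ctx.drawImage(img, 0, 0);

      const pixels = ctx.getImageData(0, 0, canvas.width, canvas.height);
      for (let i = 0; i < pixels.data.length; i += 4) {
        for (let c = 0; c < 3; c++) {
          // Nudge each RGB channel by at most `strength` levels; alpha untouched.
          pixels.data[i + c] += (Math.random() - 0.5) * 2 * strength;
        }
      }
      ctx.putImageData(pixels, 0, 0);
      return canvas; // stays on the page: no upload, no server
    }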

The people closest to the problem
aren't waiting for permission anymore.

Researchers at UChicago's SAND Lab surveyed 1,156 professional artists — illustrators, concept artists, animators — for the Glaze paper (USENIX Security 2023). It is, to date, the largest structured record of what working creators actually feel about generative AI training on their work. The numbers are not ambivalent.

77% style already mimicked
24% income already hit

This isn't a forecast. It's already happened.

77% say models mimic their specific style — what a client would recognise as theirs, not a generic look. 24% have already lost paid work to it. The harm is current, not hypothetical.

88% want a protection tool
93% rated Glaze successful

The demand is clear. So is the early verdict on the tool.

88% wanted a dedicated protection tool before one existed. After trying Glaze, 93% rated it successful at disrupting mimicry. The appetite for a technical opt-out isn't fringe — it's nearly the whole profession.

89% taking defensive action
50% considering going dark

When the argument can't arrive in time, self-defence becomes the argument.

89% have taken or plan defensive measures: watermarking, style-cloaking, locking portfolios, leaving platforms. Over half are considering pulling their work off the public internet entirely.

51% willing to sue

Litigation is no longer a fringe position.

51% said they would personally join a class-action against an AI company training on their work without consent. In a profession historically reluctant to litigate, a clean majority willing to go to court is a structural shift.

Methodology note

Source: Shan et al., Glaze: Protecting Artists from Style Mimicry by Text-to-Image Models, USENIX Security 2023.[3] 1,156 professional artists surveyed, recruited through artist community networks: predominantly full-time (46%) or part-time / freelance (50%), across illustration, concept art, animation, and related fields.

The survey is not a random sample of all artists; it skews towards creators already aware of AI-training concerns and engaged enough to respond. Read the figures as the view from the affected, attentive core of the profession: the people who will define what the rest eventually demand.

References.

Sources behind the claims, statistics, and examples throughout this site.

  1. Millions of children face sexual violence as AI deepfakes drive surge in new cases, latest global data, The Conversation.
  2. AI-generated deepfake videos, voice cloning emerge as potential threats during election season, The Times of India.
  3. Shan et al., Glaze: Protecting Artists from Style Mimicry by Text-to-Image Models, arXiv:2302.04222 / USENIX Security 2023.
  4. AI dubbing: is it good enough?, Clockwork Captions.
  5. Enable or disable YouTube auto-dubbing feature, 2025.
  6. Do we actually agree to these terms and conditions?, UC Berkeley School of Information.
  7. Americans' attitudes and experiences with privacy policies and laws, Pew Research Center, 2019. (Just 9% of adults say they always read privacy policies; another 13% often do.)
  8. How many news websites block AI crawlers?, Reuters Institute for the Study of Journalism.
  9. Why major news sites are blocking the Internet Archive's Wayback Machine, Forbes, 2026.
  10. India has over 800 million internet users; most use tech for OTT services, IBEF.
  11. What is GDPR?, the European Union's strict data privacy and security law.
  12. EU AI Act: Article 99 (Penalties). Fines reach up to €35 million or 7% of total worldwide annual turnover, whichever is higher, for prohibited AI practices.
  13. California Consumer Privacy Act (CCPA), Office of the California Attorney General.
  14. Digital Personal Data Protection (DPDP) Act, 2023, India's first legal framework for personal data protection.
  15. DPDP Rules, 2025, Press Information Bureau, Government of India.
  16. Europe's regulations vs India's innovation, af.net. Features often launch in India before the EU or US because regulatory review is faster and consent requirements looser.
  17. Amitabh Kant, Made in India AI for India and the world.
  18. "In the end you feel blank": India's female workers watching hours of abusive content to train AI, The Guardian, 2026.

This is open source.
Make it better.

01 Write.

Spot a missing tool, a better explanation, or a case study worth adding? Open a PR or an issue on GitHub.

02 Code.

The Agency Simulator can grow, and the site itself has rough edges. Good first issues are tagged good-first-issue on GitHub.

03 Share.

Pass it along to anyone whose work, voice, or likeness lives online. The point of an educational resource is to be read.

Go to the repo

Content licensed CC-BY-SA 4.0. Code licensed MIT. Attribution, not permission, is the default.