One Nation, One License, One Problem: Why India’s Proposed AI–Copyright Framework Cannot Hold
The Hybrid Model promises certainty for creators and developers, but its assumptions collapse under the realities of modern AI systems, global competition and rights governance.
India’s DPIIT Working Paper on Generative AI and Copyright arrives at a moment when nations are being forced to choose between the defensive instincts of legacy copyright regimes and the economic momentum of large-scale AI development. The report is ambitious, well-researched and designed with a conciliatory intent: ensure content creators are compensated, reduce uncertainty for AI developers, and establish a national framework that is neither as permissive as the United States nor as fragmented as the European Union. Its headline promise, “One Nation, One License, One Payment”, signals a desire for administrative simplicity and sovereign clarity.
Yet simplicity can conceal structural fragility. The proposed “Hybrid Model,” which permits permission-free training on all lawfully accessed copyrighted content in exchange for a mandated royalty regime administered by a national clearinghouse (CRCAT), attempts to broker peace between creators and developers. But the peace it promises cannot be delivered. Not because the goals are misguided, but because the model is constructed on assumptions that do not hold in contemporary AI ecosystems.
This review examines the Working Paper from the perspective of legal design, technical feasibility, comparative international policy, economic incentives, and long-term innovation impact. What emerges is a picture of a well-intentioned framework that underestimates the structural complexity of AI development, overestimates the ability of copyright law to map onto statistical learning systems, and stakes the future of India’s AI competitiveness on a licensing mechanism more suited to the broadcast era than the foundation-model era.
The Working Paper is a substantial step forward in recognising the issue. It is not yet a workable solution.
The Policy Ambition: A Centralised Royalty Engine for AI Training
At the core of the Working Paper is a proposal to amend the Copyright Act to allow AI developers “permission-free access” to all lawfully accessed copyrighted content for model training. In exchange, a new statutory entity, the Copyright Royalties Collective for AI Training (CRCAT), would collect royalties from AI developers and distribute them to creators through their respective CMOs or sector bodies. A government committee would set the rates, and every class of creative work would have representation.
The logic is clear:
treat training as a form of fair-but-compensated use,
avoid the transaction cost explosion of one-to-one licensing,
neutralise the opt-out chaos observed in the EU’s TDM regime,
prevent withholding of data that would degrade the quality of Indian foundation models,
and ensure creators receive a share of AI’s economic upside.
In a vacuum, the proposal is elegant. In practice, it struggles to map to the realities of how generative AI systems are built and deployed today.
A Central Contradiction: Compensation Without Attribution
The Working Paper acknowledges the near-impossible task of attributing the influence of individual works on a trained model. Deep learning systems do not store content as identifiable copies; training embeds statistical relationships across millions or billions of parameters. The process is non-linear, distributed and irreversible. Once the model has learned representations, there is no technical path to quantifying how much a specific book, image, audio file or article contributed to a model’s capability or to any given output.
Yet the proposed compensation model proceeds as though some reasonable attribution mechanism could eventually be engineered, or as though royalty distributions could be decoupled from attribution entirely without creating imbalances, distortions or inequities.
The report attempts to bypass this tension by proposing a pooled royalty approach, where compensation is tied to registration and class membership rather than contribution (a minimal sketch of such a pooled split appears at the end of this subsection). This mechanically solves the attribution problem by abandoning it, but creates a policy regime in which:
creators who are statistically irrelevant to model performance receive compensation,
creators whose works materially shaped a model may receive nothing if not registered,
new creators may enter the pool without a connection to past training datasets,
creative sectors with strong lobbying power may distort representation,
and AI developers are forced into a royalty system pegged to abstractions rather than usage.
The internal contradiction is never fully resolved. Without attribution, the royalty regime becomes symbolic rather than substantive. With attribution, the system becomes technically impossible. The Working Paper tries to have it both ways and succeeds at neither.
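To see the decoupling concretely, consider a minimal, purely illustrative sketch of a pooled split: royalties are divided by class share and registered headcount, and a work’s actual contribution to training never enters the calculation. The class shares, registry entries and figures below are hypothetical, since the Working Paper does not specify CRCAT’s distribution formula.

```python
# Illustrative only: a pooled royalty split keyed to registration and class
# membership, not to any measure of a work's contribution to model training.
# All class names, shares and registrants are hypothetical.

from collections import defaultdict

# Hypothetical sector shares a rate-setting committee might fix.
CLASS_SHARES = {"music": 0.35, "film": 0.30, "text": 0.25, "images": 0.10}

# Hypothetical registry: creator -> class. Only registered creators appear;
# an unregistered creator whose work shaped the model is simply absent.
REGISTRY = {
    "creator_a": "music",
    "creator_b": "music",
    "creator_c": "text",
}

def distribute(total_royalty_pool: float) -> dict[str, float]:
    """Split the pool by class share, then equally per registered head."""
    members = defaultdict(list)
    for creator, cls in REGISTRY.items():
        members[cls].append(creator)

    payouts: dict[str, float] = {}
    for cls, share in CLASS_SHARES.items():
        class_pool = total_royalty_pool * share
        if not members[cls]:
            continue  # unclaimed share; the Working Paper gestures at escrow
        per_head = class_pool / len(members[cls])
        for creator in members[cls]:
            payouts[creator] = per_head
    return payouts

print(distribute(1_000_000.0))
```

The point of the sketch is not the arithmetic but the absence: nothing in the payout function references what any work actually did for the model, which is exactly the gap the pooled approach cannot close.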
Global Comparisons Reveal Divergent Philosophies
The Working Paper’s strength is its comparative analysis. It recognises the United States’ position anchored in fair use as the most innovation-friendly. It recognises Japan’s broad text-and-data-mining exceptions that treat training as lawful by default. It recognises the EU’s attempt to mediate competing interests through the opt-out structure in Articles 3 and 4 of the CDSM Directive, an approach already proving unworkable in practice. It reviews Singapore and Switzerland, both of which adopt more permissive TDM exceptions.
Against this backdrop, India’s Hybrid Model attempts a middle path: permissive use combined with structured compensation. It avoids the political defensiveness of the EU and the laissez-faire approach of the US. It mirrors elements of collective licensing, but with statutory force.
However, the comparative analysis also reveals the core risk India faces: regulatory divergence. If India creates a compliance burden that its global competitors do not share, domestic developers will face higher friction and slower iteration cycles relative to developers in Silicon Valley, Shenzhen, Singapore and Tokyo. The Working Paper attempts to mitigate this by emphasising simplicity and predictable licensing, but the hidden compliance architecture of registration, governmental rate-setting, audits, disclosures, and centralised clearing introduces a heavier regulatory footprint than appears at first glance.
In a global AI race where model quality is driven by speed, scale, compute and data availability, even small regulatory overheads compound into strategic disadvantages.
The Technical Misalignment: Models Do Not Learn the Way Copyright Imagines
Copyright law is built around acts of reproduction, adaptation, communication and distribution. Generative AI training does not neatly fall into any of these categories. When a model ingests data, it does not retain copies of works; it extracts statistical patterns from vast corpora. Once patterns are encoded, there is no pathway to reverse-engineer which inputs influenced which outputs.
The Working Paper acknowledges this technical reality in several sections yet proceeds to design a policy framework dependent on a causal relationship between copyrighted inputs and model outputs. This is where the model collapses under its own weight.
Every major technical assumption underlying the compensation regime is misaligned with contemporary AI practice:
1. Lawful access cannot be verified at scale.
Most training datasets include mixed-provenance sources such as Common Crawl or LAION-5B, which aggregate content without deterministic verification. Even determining “lawful access” is non-trivial when content originates from global websites with varied licensing terms.
2. Training data is not static; models undergo continual fine-tuning.
Foundation models evolve over time. Developers fine-tune using domain-specific, proprietary and synthetic data. A licensing system designed for one-off ingestion cannot track an evolving model lifecycle.
3. Attribution is computationally intractable.
No practical method exists today to determine the contribution weight of specific works. Even researchers attempting influence-function analysis can only approximate results under narrow assumptions (see the toy sketch after this list).
4. Synthetic data undermines compensation logic.
As more model improvements rely on synthetic data loops, the economic justification for charging royalties on human works diminishes. The Working Paper does not address this trajectory.
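To ground point 3, the toy sketch below shows the kind of proxy researchers actually rely on: a first-order gradient-similarity score, in the spirit of influence-function and TracIn-style analysis, applied to a deliberately tiny linear model. The data, model and scoring rule are illustrative assumptions; at foundation-model scale, with non-convex losses and multi-stage training, even this proxy loses meaning.

```python
# Toy illustration of why per-work attribution is only ever approximate.
# A first-order gradient-similarity proxy is computed on a small linear
# model; real foundation models have billions of parameters, non-convex
# losses and multi-stage training, where these simplifications fail.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "training corpus": 100 examples with 5 features each.
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=100)

# Fit a linear model by least squares (a convex stand-in for "training").
w, *_ = np.linalg.lstsq(X, y, rcond=None)

def grad_loss(weights, x, target):
    """Gradient of the squared error for one example w.r.t. the weights."""
    return 2.0 * (x @ weights - target) * x

# A single "output" we would like to attribute back to training examples.
x_test, y_test = rng.normal(size=5), 1.0
g_test = grad_loss(w, x_test, y_test)

# Proxy score: training examples whose loss gradients align with the test
# gradient are treated as "influential". Meaningful only under strong
# convexity and other narrow assumptions.
scores = np.array([g_test @ grad_loss(w, X[i], y[i]) for i in range(len(X))])
top = np.argsort(scores)[::-1][:5]
print("Most 'influential' training examples (by proxy):", top)
```

Even in this best case, a convex model with a hundred examples, the scores are approximations rather than attributions a royalty formula could safely be pegged to.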
Any copyright framework that does not internalise these realities risks writing law for a world that no longer exists.
The Economic Layer: Who Wins and Who Loses?
The Working Paper insists the Hybrid Model protects creators while enabling innovation. Yet the incentive structures created by this model are more ambiguous.
Creators may believe they will receive steady royalties. In reality, several structural barriers exist:
Registration barriers.
Only works registered with CRCAT-affiliated CMOs are eligible. Unregistered or informal creators, who make up the vast majority in India, receive nothing.
Representation asymmetry.
Sectors with established CMOs (music, film) dominate governance. Sectors without CMOs (independent writers, illustrators, educators, journalists) depend on government-nominated representation.
Royalty dilution.
A pooled royalty system distributes value broadly, often to parties who contributed little to model training.
Weak enforcement.
The Working Paper does not articulate a robust audit mechanism for ensuring AI developers disclose training data sources or pay royalties proportionate to usage.
For AI developers, the Hybrid Model imposes predictable friction:
mandatory royalty outflows independent of business model,
compliance costs associated with registration, disclosures and audits,
potential exposure to retroactive penalties if provenance challenges arise,
and the strategic disadvantage of operating under constraints competitors in the US or Japan do not face.
For government, the regime creates a governance and liability surface significantly larger than anticipated:
rate misalignment risks political backlash,
CRCAT underperformance risks creator dissatisfaction,
auditing failures risk regulatory capture or litigation,
and long-term overhead risks bureaucratic ossification.
The model distributes risk upward and benefit downward, but the distribution is uneven and the accountability chain is unclear.
The Missing Infrastructure: Provenance, Traceability and Verifiable Compute
The Working Paper presumes disclosure mechanisms that do not yet exist. It encourages training transparency, but transparency must be operationalised through technology, not declarations.
There is no reference to:
machine-verifiable provenance standards (C2PA, IPTC, W3C Verifiable Credentials),
cryptographic attestations of training pipelines,
model cards tied to verifiable operational logs,
differential content licensing systems suited to large-scale AI,
or decentralised registries of rights that can interface with training workflows.
Without infrastructure that supports real-time provenance signalling, automated rights checks and verifiable disclosures, the proposed licensing framework becomes unenforceable in practice. The paper acknowledges this gap but offers no roadmap to fill it.
This is a critical omission. A policy vision for AI training in 2025 cannot operate without acknowledging the evolving global push toward provenance and content authenticity ecosystems.
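For illustration, here is a minimal sketch of what a verifiable training-data attestation could look like, assuming a developer hashes each source, bundles the hashes into a manifest and binds the manifest to a signature that can be checked later. The manifest fields, run identifier and key handling are hypothetical and deliberately simplified; they do not follow C2PA or any other named standard.

```python
# Minimal sketch: hash each training source, bundle the hashes into a
# manifest, and attest to the manifest so a regulator or rights holder can
# later verify what a training run claimed to contain. All field names,
# identifiers and keys are hypothetical.

import hashlib
import hmac
import json

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Hypothetical corpus items (in practice: files, shards or crawl records).
corpus = {
    "work_001.txt": b"licensed text of a registered literary work",
    "work_002.png": b"image bytes obtained under a training licence",
}

manifest = {
    "model_run_id": "example-run-0001",  # hypothetical identifier
    "items": {name: sha256_hex(blob) for name, blob in corpus.items()},
}
payload = json.dumps(manifest, sort_keys=True).encode()

# Stand-in for an asymmetric signature; a real deployment would use keys
# held by the developer (e.g. Ed25519), not a shared secret.
SECRET = b"demo-key"
attestation = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()

# Later verification: recompute over the disclosed manifest and compare.
assert hmac.compare_digest(
    attestation, hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
)
print("manifest digest:", sha256_hex(payload))
print("attestation:", attestation)
```

Even this toy version makes the policy point: enforceable disclosure requires agreed formats, key custody and verification duties, none of which the Working Paper specifies.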
The CRCAT Problem: A Centralised Solution to a Distributed System
CRCAT is the heart of the Hybrid Model. But its design reveals several structural issues:
Fragmentation of rights.
The report assumes creative works map neatly to sectoral CMOs. In reality, rights are layered, derivative and often contested. Many works have co-writers, co-publishers and multiple licensing layers. Standardising this complexity into a single registry is aspirational at best (the sketch after this list shows how quickly the data model fragments).
Governance risk.
A national licensing authority becomes a chokepoint. Governance disputes, political interference, opaque rate-setting and sectoral pressure are inevitable.
Administrative overload.
Millions of creators, thousands of publishers, dozens of content classes and evolving AI developers produce a combinatorial explosion of administrative load.
Technical infeasibility.
CRCAT is expected to maintain a public registry, distribute royalties, validate registrations, support unclaimed royalty mechanisms and facilitate grievance redress. This requires infrastructure, staffing, digital systems, cybersecurity, audit pipelines and interoperability with AI developers’ internal systems—all of which require years to mature.
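A small illustration of the fragmentation problem, using entirely hypothetical classes and names: even a single film song rarely maps to one creator, one CMO and one licence, yet that flat structure is what a single national registry implicitly assumes.

```python
# Illustrative only: a sketch of how quickly rights metadata becomes layered.
# The classes, roles and shares below are hypothetical.

from dataclasses import dataclass, field

@dataclass
class RightsHolder:
    name: str
    role: str              # e.g. "author", "co-publisher", "translator"
    revenue_share: float   # fraction of this work's royalties

@dataclass
class LicenceLayer:
    territory: str         # rights are frequently split by territory
    grantor: str
    exclusive: bool

@dataclass
class RegisteredWork:
    title: str
    work_class: str                          # sector bucket used for pooling
    holders: list[RightsHolder] = field(default_factory=list)
    licences: list[LicenceLayer] = field(default_factory=list)
    disputed: bool = False                   # ownership itself may be contested

work = RegisteredWork(
    title="Example Film Song",
    work_class="music",
    holders=[
        RightsHolder("Lyricist A", "author", 0.25),
        RightsHolder("Composer B", "author", 0.25),
        RightsHolder("Label C", "co-publisher", 0.50),
    ],
    licences=[
        LicenceLayer("IN", "Label C", exclusive=True),
        LicenceLayer("Worldwide ex-IN", "Aggregator D", exclusive=False),
    ],
)
print(work)
```

Multiply this by millions of works, contested ownership and evolving licence terms, and the registry CRCAT must maintain starts to look less like a table and more like a standing dispute-resolution system.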
The Working Paper underestimates the operational complexity of running such a body in real-world conditions.
Penalties and Liabilities: The Soft Underbelly of the Framework
The Working Paper identifies the need for grievance redress, monitoring and burden-of-proof frameworks, but stops short of articulating:
penalties for non-compliance,
audit rights for CRCAT,
mechanisms for cross-border enforcement,
guidelines for model-as-a-service APIs,
or obligations for open-weights models.
The absence of a complete enforcement model creates legal uncertainty, precisely the problem the Working Paper seeks to solve.
If penalties are weak, compliance becomes optional. If penalties are strong, innovation becomes risky. Thus, the Working Paper sits awkwardly in the middle, unable to commit to either direction.
Where the Working Paper Succeeds
The report has real strengths:
It clearly articulates stakeholder positions and the tensions between them.
It conducts a genuinely useful international comparison.
It recognises that India cannot wait for global jurisprudence to settle.
It acknowledges the importance of protecting both creators and developers.
It brings creators into a conversation where they have historically been excluded.
It attempts to align AI policy with the broader IndiaAI initiative.
These are not minor achievements. They indicate an evolving policy apparatus willing to engage with technical complexity.
Where It Falls Short
However, the model is undermined by nine structural weaknesses:
It presumes attribution where none exists.
It centralises governance in a domain that is inherently distributed.
It treats training data as static rather than evolving.
It underestimates synthetic data as a forthcoming alternative to human-created works.
It imposes friction on domestic developers relative to international competitors.
It provides weak protection for independent and informal creators.
It lacks an operational enforcement architecture.
It ignores global momentum toward provenance-based solutions.
It overestimates the administrative capacity of CRCAT.
These weaknesses are not fatal flaws, but they demand reconsideration before India codifies a system that will be difficult to unwind.
A More Coherent Direction: From Licensing to Verifiable Infrastructure
A future-ready policy architecture for AI training must integrate copyright considerations with technical and operational solutions. The pathway forward lies not in nationalising licensing but in building verifiable digital infrastructure that enables rights signalling, provenance disclosure, transparent training attestations and interoperable compliance.
A more coherent alternative would include:
machine-readable licensing signals embedded in content (see the sketch after this list),
cryptographically verifiable training logs,
global alignment with C2PA-like provenance standards,
decentralised registries for rights metadata,
training attestations tied to compute environments,
tiered licensing regimes for commercial vs non-commercial models,
safe harbours for open-source models,
and outcome-based audits rather than content-based disclosures.
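As a sketch of the licensing-signal and tiering elements above, the snippet below shows a pre-ingestion rights check that reads a machine-readable signal published alongside a work and gates training by model tier. The JSON fields and their values are hypothetical rather than drawn from any existing standard; a real scheme would need to align with emerging signals such as TDM reservation metadata or C2PA assertions.

```python
# Minimal sketch of a pre-ingestion rights check, assuming publishers expose
# a machine-readable licensing signal alongside each work. The "ai_training"
# field and its values are hypothetical, not an existing standard.

import json

def may_train(signal_json: str, model_tier: str) -> bool:
    """Return True if the declared signal permits training for this tier."""
    signal = json.loads(signal_json)
    policy = signal.get("ai_training", "reserved")  # default: rights reserved
    if policy == "allowed":
        return True
    if policy == "non_commercial_only":
        return model_tier == "non_commercial"
    return False  # "reserved" or unknown values fail closed

# Hypothetical per-work signal a crawler might fetch next to the content.
example_signal = json.dumps({
    "work_id": "urn:example:work:123",
    "ai_training": "non_commercial_only",
    "royalty_contact": "https://example.org/licensing",
})

print(may_train(example_signal, "commercial"))      # False
print(may_train(example_signal, "non_commercial"))  # True
```

The value of such a check is that it travels with the content and can be evaluated automatically at crawl time, rather than being reconstructed after the fact through registration and audit.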
India is already investing in trust infrastructure across sectors. Aligning AI governance with this broader digital public infrastructure vision would offer a far more durable and globally competitive framework.
A Necessary Debate, an Incomplete Solution
The DPIIT Working Paper deserves recognition for initiating a serious, structured national conversation. It acknowledges that AI training sits at the intersection of law, technology and creativity and that India’s regulatory approach must balance these interests. But the Hybrid Model, in its current form, rests on assumptions that the modern AI ecosystem does not support.
India has an opportunity to lead with a model that blends legal clarity, technical feasibility, economic competitiveness and creator protection. Achieving that balance requires shifting from twentieth-century licensing architectures to twenty-first-century verifiability infrastructures.


