Anyone else worried about content rights with AI? The legal landscape is getting wild
We’re a B2B publisher. Our content is being used by AI systems, and I’m getting conflicting advice.
Lawyer A says: “This is copyright infringement. Block all AI crawlers. Prepare for litigation.”
Lawyer B says: “This is fair use. You can’t stop it. Focus on maximizing visibility benefits.”
What I’m observing:
My questions:
I need a practical position, not just legal theory.
Let me give you the current state of play:
Active litigation (as of December 2025):
No final precedent yet. Courts haven’t definitively ruled on whether AI training constitutes fair use.
What AI companies argue:
What publishers argue:
The practical reality:
| Publisher Type | Typical Strategy |
|---|---|
| Major (NYT, WSJ) | Litigation + licensing negotiations |
| Large (major outlets) | Licensing negotiations, some blocking |
| Mid-size | Mostly allowing, hoping for visibility |
| Small | Allowing, focusing on traffic benefits |
Why mid-size publishers mostly allow:
On licensing deals specifically:
Who has deals:
Deal sizes (reported):
Why mid-size can’t get deals:
The uncomfortable truth: Unless you’re NYT-scale, licensing isn’t realistic.
What you CAN do:
The cost-benefit:
- Blocking = lose visibility, protect nothing meaningful
- Allowing = gain visibility, uncertain future rights
Most mid-size publishers choose visibility.
Note: Not legal advice, general information only.
Why your lawyers disagree:
Lawyer A (block/litigate):
Lawyer B (embrace/allow):
Both are right, from their perspectives.
The questions to ask:
Can you afford to litigate?
What are you actually protecting?
What’s your business model?
My observation: Most B2B publishers choose visibility because their business model benefits from awareness more than it loses to AI usage.
Here’s what we decided and why:
Our business: B2B industry publication, similar to yours.
Revenue: advertising + events + sponsored content.
Our decision: Allow all AI crawlers. Maximize visibility.
Why:
- Our revenue comes from audience, not content sales. AI visibility = more audience = more revenue.
- Blocking wouldn't help. Our content is already in training sets; blocking only affects future training.
- AI traffic is valuable. About 5% of our traffic comes from AI referrals, and those users convert well (one way to measure this is sketched after this list).
- No realistic licensing option. We approached OpenAI; no interest at our scale.
- Legal costs exceed benefits. Litigation would cost more than any potential recovery.
What we did do:
The result: AI visibility up 200%. Referral traffic growing. Brand awareness improving.
Would we accept a licensing deal? Sure. But we’re not waiting for one.
Important distinction many miss:
Training data use vs. Real-time citation
| Aspect | Training Data | Real-time Citation |
|---|---|---|
| When it happens | Model building | Each query |
| What’s used | Full content | Snippets/facts |
| Can you block? | Future only | Yes (robots.txt) |
| Legal status | Heavily disputed | Less controversial |
| Business impact | Past content included | Affects visibility now |
Different AI systems, different models:
ChatGPT (base):
ChatGPT (Search):
Perplexity:
The nuance:
- Blocking ChatGPT's training crawler (GPTBot) excludes you from future training but doesn't affect the current model.
- Blocking Perplexity loses real-time citation benefits.
Many publishers: Block training crawlers, allow citation crawlers. Balances concerns.
Here’s a nuanced robots.txt approach:
The selective strategy:
```
# Block training crawlers
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

# Allow citation/search crawlers
User-agent: ChatGPT-User
Allow: /

User-agent: PerplexityBot
Allow: /
```
What this does: blocks GPTBot, Google-Extended, and CCBot from collecting content for future training, while still letting ChatGPT-User and PerplexityBot fetch pages for real-time answers and citations.
Who uses this approach: Some major publishers trying to balance.
The limitation: content already used for training still exists in current models. This only affects future crawling.
For your lawyers: This might satisfy both:
It’s a middle ground that many find acceptable.
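If you deploy something like the selective robots.txt above, it's worth sanity-checking which crawlers actually end up allowed or blocked. A minimal sketch using Python's standard urllib.robotparser; the site URL, article path, and crawler list are placeholders:

```python
# Quick check of which user agents can fetch a given path under your robots.txt.
# Standard library only; swap in your own domain and the crawlers you care about.
from urllib.robotparser import RobotFileParser

SITE = "https://example.com"  # placeholder domain
CRAWLERS = ["GPTBot", "Google-Extended", "CCBot", "ChatGPT-User", "PerplexityBot"]

rp = RobotFileParser()
rp.set_url(f"{SITE}/robots.txt")
rp.read()  # fetches and parses the live robots.txt

for agent in CRAWLERS:
    allowed = rp.can_fetch(agent, f"{SITE}/some-article")
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```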
What’s likely to happen (my prediction):
Short term (2026):
Medium term (2027-2028):
Long term (2028+):
What this means for you:
The parallel: like early music and video streaming, this started out controversial and eventually settled into established licensing. AI content may follow a similar path.
But that took years. Don't put your business on hold waiting for a resolution.
This helped me form a position. Our strategy:
Decision: Allow with documentation
What we’re doing:
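For the documentation piece, the core is just keeping a dated record of which AI crawlers hit the site and what they fetched. A minimal sketch of that kind of log extraction, assuming combined-format access logs; the user-agent substrings are illustrative, not a definitive list:

```python
# Sketch: extract AI-crawler hits from an access log into a dated CSV, as a simple
# paper trail of which bots fetched what and when. Bot names below are illustrative.
import csv
import re

AI_BOTS = ("GPTBot", "ChatGPT-User", "OAI-SearchBot", "PerplexityBot", "CCBot", "ClaudeBot")

# Combined log format: ip - - [timestamp] "METHOD path HTTP/x" status bytes "referrer" "user-agent"
LINE_RE = re.compile(r'\[([^\]]+)\] "(\S+) (\S+) [^"]*" \d{3} \S+ "[^"]*" "([^"]*)"')

def document_ai_hits(log_path: str, out_path: str) -> None:
    with open(log_path, encoding="utf-8", errors="replace") as fh, \
         open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["timestamp", "method", "path", "bot"])
        for line in fh:
            m = LINE_RE.search(line)
            if not m:
                continue
            timestamp, method, path, user_agent = m.groups()
            for bot in AI_BOTS:
                if bot.lower() in user_agent.lower():
                    writer.writerow([timestamp, method, path, bot])
                    break

if __name__ == "__main__":
    document_ai_hits("access.log", "ai_crawler_hits.csv")
```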
How I’m framing for leadership:
“The legal situation is genuinely uncertain. Neither blocking nor allowing has clear legal protection. Given our business model relies on audience reach, we recommend maintaining AI visibility while:
For my lawyers: This gives Lawyer A the blocking/documentation they want while giving Lawyer B the visibility/pragmatism they recommend.
Key insight: This isn’t a copyright strategy - it’s a business strategy that acknowledges copyright uncertainty. We’re optimizing for what we can control (visibility) while preserving options for what we can’t (legal outcomes).
Thanks everyone for the practical perspectives.