Building a Trust & Safety Program from Scratch: Lessons from Amazon, Google, and TikTok
Every successful platform eventually needs Trust & Safety. The question is whether you build it proactively or reactively after a crisis.
I've built Trust & Safety operations from scratch twice—at TikTok LATAM (zero to regional safety infrastructure) and within new product launches at Amazon and Google. Here's what actually works when you're starting from nothing.
The Biggest Mistake: Waiting Too Long
The most common pattern I see:
- Launch: "We'll deal with safety later"
- Early growth: "We're too small to have real safety issues"
- Inflection point: First serious incident, regulatory inquiry, or press coverage
- Panic: Scramble to build safety infrastructure under crisis pressure
- Over-correct: Spend 2-3x more than if you'd built it proactively
Reality: It's much cheaper to build safety infrastructure before you need it than to retrofit it during a crisis.
When to Start Building Trust & Safety
Bare minimum triggers:
- You have user-generated content (UGC)
- Users can interact with each other
- You handle sensitive data (financial, health, children)
- You operate in a regulated industry
- You're launching in the EU (DSA requirements)
Ideal timing:
- Before public launch if high-risk category (social, dating, kids, marketplace)
- At 1,000-10,000 users if medium-risk category
- Before Series A if you want clean due diligence
Too late:
- After first serious safety incident
- During regulatory enforcement
- When investors ask "who owns Trust & Safety?" and you have no answer
Phase 1: Foundation (Weeks 1-4)
Start with the minimum viable safety infrastructure:
1. Define What "Unsafe" Means for Your Platform
Write down specific prohibited content and behavior for your platform.
Start with legal minimums:
- Illegal content in your primary market (US, EU, etc.)
- CSAM (child sexual abuse material)
- Terrorism and violent extremism
- Illegal goods and services
- Copyright/IP infringement
Add platform-specific harms:
- Harassment and bullying
- Hate speech and discrimination
- Spam and manipulation
- Fraud and scams
- Misinformation (if relevant to your platform)
Document as policy: Create community guidelines or an acceptable use policy (a machine-readable sketch follows the example below)
Example (Dating App):
- CSAM - prohibited, report to NCMEC
- Harassment - unwanted contact after block
- Catfishing - fake identity/photos
- Solicitation - prostitution or trafficking
- Scams - romance scams, financial fraud
- Hate speech - discrimination based on protected characteristics
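Capturing the taxonomy as structured data from day one keeps enforcement, logging, and later transparency reporting consistent. Here is a minimal sketch in Python; the category names, actions, and severities are illustrative, not a complete policy:

```python
# Minimal policy taxonomy as structured data. Category names, actions,
# and severities below are illustrative, not a complete policy.
POLICIES = {
    "csam": {
        "definition": "Child sexual abuse material",
        "action": "remove_and_ban",
        "report_to": "NCMEC",  # US legal reporting obligation
        "severity": "P0",
    },
    "harassment": {
        "definition": "Repeated unwanted contact after a user blocks you",
        "action": "warn_then_suspend",
        "severity": "P2",
    },
    "scams": {
        "definition": "Romance scams, financial fraud",
        "action": "suspend_pending_review",
        "severity": "P1",
    },
}

def lookup_action(category: str) -> str:
    """Return the enforcement action defined for a policy category."""
    return POLICIES[category]["action"]
```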
2. Build User Reporting Mechanism
Users need a way to report safety issues.
Minimum viable:
- "Report" button on content/profiles
- Basic form: what's wrong, where is it, who are you
- Email confirmation of receipt
- Someone monitoring reports daily
Tools to use:
- Your existing help desk (Zendesk, Intercom, Freshdesk)
- Google Forms + email forwarding (ultra-early stage)
- Dedicated report@yourplatform.com email address
Don't build custom tooling yet. Use existing systems, but standardize the fields every report captures (see the sketch below).
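Whatever intake tool you pick, the report record needs the same core fields. A minimal sketch, with illustrative field names that map just as well to a Zendesk ticket or a spreadsheet row:

```python
# Core fields every report record needs, whether it lives in Zendesk,
# a Google Form sheet, or an inbox. Field names are illustrative.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class UserReport:
    reporter_id: str           # who is reporting
    reported_content_id: str   # where the issue is (content/profile ID or URL)
    category: str              # what's wrong (maps to your policy taxonomy)
    description: str = ""      # free-text detail from the reporter
    received_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    status: str = "open"       # open -> reviewed -> actioned/dismissed
```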
3. Establish Review Process
Someone needs to review reports and take action.
Early stage (<10,000 users):
- Founder or early employee reviews reports
- 1-2 hours per day initially
- Decision: Remove content + warn user, suspend user, or dismiss the report (no action)
Response time target:
- CSAM / imminent harm: <4 hours
- Severe violations (terrorism, threats): <24 hours
- Everything else: <48 hours
Create decision log:
- Spreadsheet tracking: report received, decision made, action taken, date
- This becomes your transparency data later (a minimal sketch follows this list)
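The spreadsheet is fine at this stage; the sketch below just expresses the same append-only log as code, with illustrative column names, since these rows feed the transparency reporting described in Phase 3:

```python
# Append-only decision log: the spreadsheet, expressed as code.
# Column names are illustrative; these rows later feed transparency reports.
import csv
import os
from datetime import datetime, timezone

LOG_PATH = "moderation_decisions.csv"
COLUMNS = ["report_id", "received_at", "decision", "action_taken",
           "decided_at", "reviewer"]

def log_decision(report_id: str, received_at: str, decision: str,
                 action_taken: str, reviewer: str) -> None:
    """Append one moderation decision, writing the header on first use."""
    new_file = not os.path.exists(LOG_PATH) or os.path.getsize(LOG_PATH) == 0
    with open(LOG_PATH, "a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(COLUMNS)
        writer.writerow([report_id, received_at, decision, action_taken,
                         datetime.now(timezone.utc).isoformat(), reviewer])
```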
4. Set Up Basic Detection
Even pre-launch, implement basic automated detection:
Technical controls:
- Email verification for sign-up
- CAPTCHA to prevent bots
- Rate limiting on posting/messaging
- Profanity filter (use existing library, don't build your own)
For images/video:
- PhotoDNA for CSAM detection (Microsoft provides free access to qualified organizations)
- Simple hashing to detect exact duplicates
For text:
- Keyword lists for extreme content (terrorism, CSAM, slurs)
- Don't over-filter - false positives destroy user experience
Realistic early stage: 90% reactive (user reports) + 10% proactive (automated detection of extreme content). Two of these controls are sketched below.
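As a concrete illustration of two of the controls above: a minimal sliding-window rate limiter, and a keyword flagger that routes matches to human review rather than auto-removing. The thresholds and the keyword set are placeholders, not recommendations:

```python
# Two of the controls above: a sliding-window rate limiter and a keyword
# flagger that routes matches to human review instead of auto-removing.
# Thresholds and the keyword set are placeholders, not recommendations.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_POSTS_PER_WINDOW = 10                   # illustrative threshold
EXTREME_KEYWORDS = {"example_banned_term"}  # placeholder; use vetted lists

_post_times: dict[str, deque] = defaultdict(deque)

def allow_post(user_id: str) -> bool:
    """Allow at most MAX_POSTS_PER_WINDOW posts per user per minute."""
    now = time.monotonic()
    times = _post_times[user_id]
    while times and now - times[0] > WINDOW_SECONDS:
        times.popleft()
    if len(times) >= MAX_POSTS_PER_WINDOW:
        return False
    times.append(now)
    return True

def flag_for_review(text: str) -> bool:
    """Flag (never remove) posts containing extreme keywords; humans decide."""
    return bool(set(text.lower().split()) & EXTREME_KEYWORDS)
```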
Phase 2: Scaling Infrastructure (Months 2-6)
As you grow from thousands to tens of thousands of users:
1. Formalize Moderation Operations
Hire your first Trust & Safety person:
- Title: Trust & Safety Manager or Trust & Safety Lead
- Reports to: Founder, CEO, or COO initially
- Responsibilities: Own moderation operations, vendor management, policy refinement
Typical first hire profile:
- 2-5 years content moderation experience
- Managed moderation teams or vendors
- Understands policy development
- Comfortable with ambiguity
Salary range: $80K - $120K depending on market and experience
2. Implement Moderation Vendor (If Needed)
When to use moderation vendors:
- Volume >100 reports/day
- Need 24/7 coverage
- Multi-language requirements
- Want to avoid hiring large in-house team
Major vendors:
- TaskUs - good for startups, flexible
- Accenture - enterprise-grade
- Telus International - mid-market
- Centific (formerly Pactera EDGE) - AI training + moderation
Typical pricing: $8-$15 per hour per moderator (offshore), $25-$40 (US-based)
Hybrid model (recommended):
- Vendor handles volume and 24/7 coverage
- Internal person handles escalations, policy decisions, quality assurance
3. Build Automated Detection (v2)
Invest in better proactive detection:
Text classification:
- Use pre-trained models (OpenAI Moderation API, Perspective API, Hive)
- Don't build your own ML models yet (too expensive, not better than off-the-shelf)
Image/video detection:
- Hive, Clarifai, or AWS Rekognition for NSFW detection
- PhotoDNA for CSAM (non-negotiable; US providers must report detected CSAM to NCMEC)
- Google Vision API for general classification
Behavioral signals:
- New account spam patterns (post volume, follow patterns)
- Coordinated activity detection
- Suspicious engagement patterns
Don't over-automate: Keep humans in the loop for final decisions on removal (the flag-for-review pattern is sketched below).
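A hedged sketch of that flag-for-review pattern using the OpenAI Moderation API, one of the pre-trained options listed above. The endpoint and response shape follow OpenAI's published docs, but verify against the current API before relying on it; note the classifier only routes content into a review queue and never removes anything on its own:

```python
# Flag-for-review with a hosted classifier. Endpoint and response shape
# follow OpenAI's published moderation API; verify against current docs.
import os
import requests

def moderate_text(text: str) -> dict:
    """Classify one piece of text; returns the first moderation result."""
    resp = requests.post(
        "https://api.openai.com/v1/moderations",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={"input": text},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["results"][0]

def route(text: str, review_queue: list) -> None:
    """Automation flags; humans make the final removal decision."""
    result = moderate_text(text)
    if result["flagged"]:
        review_queue.append({"text": text, "categories": result["categories"]})
```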
4. Create Appeals Process
Users will disagree with your moderation decisions. You need a way to handle appeals.
Basic appeals process:
- User clicks "Appeal" on enforcement notification
- Form asks: why do you think this was a mistake?
- Different reviewer (not original moderator) re-reviews
- Decision within 48-72 hours
- Overturn if wrong, uphold with explanation if correct
Track appeal metrics:
- Appeal rate (what % of enforcements are appealed)
- Overturn rate (what % of appeals result in reversal)
- Target overturn rate: 5-15% (too low suggests appeals are a rubber stamp; too high signals poor initial accuracy). Both metrics are computed in the sketch below.
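Computing both metrics from your enforcement records is simple arithmetic; a minimal sketch with an illustrative record shape:

```python
# Appeal rate and overturn rate from enforcement records
# (record shape is illustrative).
def appeal_metrics(enforcements: list[dict]) -> dict:
    """Each record: {"appealed": bool, "overturned": bool}."""
    total = len(enforcements)
    appealed = [e for e in enforcements if e["appealed"]]
    overturned = [e for e in appealed if e["overturned"]]
    return {
        "appeal_rate": len(appealed) / total if total else 0.0,
        "overturn_rate": len(overturned) / len(appealed) if appealed else 0.0,
    }

# Example: 200 enforcements, 20 appealed, 2 overturned
# -> appeal_rate 0.10, overturn_rate 0.10 (inside the 5-15% band).
```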
5. Establish Incident Response Plan
You will have safety incidents. Plan before they happen.
Define severity levels:
- P0 (Critical): CSAM, imminent harm, active shooter, regulatory enforcement
- P1 (High): Media coverage, high-profile user incident, wave of similar issues
- P2 (Medium): Individual serious violation, complaint from an authority
- P3 (Low): Standard policy violation
For each severity:
- Who gets notified (CEO, legal, PR, etc.)
- How quickly must we respond
- Who has decision authority
- Who communicates externally
Practice: Run incident simulations quarterly. A severity routing table like the sketch below keeps the plan executable rather than aspirational.
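One way to make the four questions above unambiguous is to encode them per severity level. The roles, notification lists, and response clocks here are illustrative; set your own in the written plan:

```python
# Severity levels as a routing table. Roles, notification lists, and
# response clocks are illustrative; set your own in the written plan.
SEVERITY_MATRIX = {
    "P0": {"notify": ["ceo", "legal", "pr", "ts_on_call"],
           "respond_within_hours": 1, "decides": "ceo", "external_comms": "pr"},
    "P1": {"notify": ["legal", "pr", "ts_lead"],
           "respond_within_hours": 4, "decides": "ts_lead", "external_comms": "pr"},
    "P2": {"notify": ["ts_lead"],
           "respond_within_hours": 24, "decides": "ts_lead", "external_comms": None},
    "P3": {"notify": [],
           "respond_within_hours": 48, "decides": "moderator", "external_comms": None},
}

def who_to_notify(severity: str) -> list[str]:
    """Notification list for an incident of the given severity."""
    return SEVERITY_MATRIX[severity]["notify"]
```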
Phase 3: Mature Operations (Months 6-18)
As you scale to hundreds of thousands or millions of users:
1. Build Cross-Functional Safety Processes
Trust & Safety can't be siloed. Integrate with:
Product: Safety review for all new features
- Who can use this feature? (age, verified users, etc.)
- What can go wrong? (harassment, spam, fraud)
- What safety controls are needed?
- Launch checklist includes safety sign-off (a minimal sketch follows)
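That checklist can be a literal launch blocker in code; a minimal sketch with illustrative items:

```python
# Feature safety review as a launch-blocking checklist (items illustrative).
SAFETY_LAUNCH_CHECKLIST = [
    "eligibility defined (age gates, verification requirements)",
    "abuse cases enumerated (harassment, spam, fraud)",
    "required controls shipped (reporting, blocking, rate limits)",
    "trust & safety sign-off recorded",
]

def ready_to_launch(completed: set[str]) -> bool:
    """A feature ships only when every checklist item is checked off."""
    return all(item in completed for item in SAFETY_LAUNCH_CHECKLIST)
```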
Engineering: Safety infrastructure roadmap
- Detection systems
- Logging and auditability
- Account action infrastructure (warnings, suspensions, bans)
- Appeals handling
Legal: Regulatory compliance alignment
- Which regulations apply
- What documentation is required
- Regulatory reporting obligations
- Authority relationship management
Communications/PR: Crisis communication
- Spokespeople trained on safety topics
- Pre-drafted holding statements
- Escalation triggers for PR involvement
2. Implement Quality Assurance
How do you know your moderation is accurate?
Sampling and audits:
- Review 10% of all moderation decisions weekly
- Separate QA team from front-line moderators
- Track accuracy by moderator, policy type, content category
Calibration sessions:
- Weekly meetings with moderators
- Review edge cases together
- Ensure consistent policy interpretation
- Update guidance based on new patterns
User feedback:
- Track appeal overturn rates
- Survey users after enforcement (why did this happen?)
- Incorporate feedback into policy refinement
Target accuracy: 92-96% (perfect accuracy is impossible with subjective policies). The sampling and per-policy accuracy rollup are sketched below.
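A minimal sketch of the weekly 10% sample and the per-policy accuracy rollup; the record shape is illustrative, and the "correct" label comes from the separate QA team:

```python
# Weekly 10% QA sample and per-policy accuracy rollup. The record shape
# is illustrative; the "correct" label comes from the separate QA team.
import random
from collections import defaultdict

def weekly_qa_sample(decisions: list[dict], rate: float = 0.10) -> list[dict]:
    """Random sample of the week's moderation decisions."""
    if not decisions:
        return []
    return random.sample(decisions, max(1, int(len(decisions) * rate)))

def accuracy_by_policy(audited: list[dict]) -> dict[str, float]:
    """audited records: {"policy": str, "correct": bool} after QA review."""
    totals: dict[str, int] = defaultdict(int)
    correct: dict[str, int] = defaultdict(int)
    for d in audited:
        totals[d["policy"]] += 1
        correct[d["policy"]] += int(d["correct"])
    return {p: correct[p] / totals[p] for p in totals}
```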
3. Develop Transparency Reporting
Even before it's legally required (DSA, etc.), build transparency reporting:
Track (an aggregation sketch follows this list):
- Volume of user reports received
- Volume of content actioned (removed, warnings, suspensions)
- Action rate (what % of reports lead to action)
- Average time to action
- Appeal volume and overturn rate
- Proactive detection vs. user-reported
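Every number in that list falls out of the decision log you have kept since Phase 1; a minimal aggregation sketch with illustrative field names:

```python
# Aggregating the tracked metrics from per-report records
# (field names illustrative; same data as the Phase 1 decision log).
def transparency_snapshot(reports: list[dict]) -> dict:
    """reports: {"actioned": bool, "hours_to_action": float, "source": str}."""
    actioned = [r for r in reports if r["actioned"]]
    proactive = [r for r in reports if r["source"] == "automated"]
    return {
        "reports_received": len(reports),
        "content_actioned": len(actioned),
        "action_rate": len(actioned) / len(reports) if reports else 0.0,
        "avg_hours_to_action": (sum(r["hours_to_action"] for r in actioned)
                                / len(actioned)) if actioned else 0.0,
        "proactive_share": len(proactive) / len(reports) if reports else 0.0,
    }
```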
Publish: Quarterly or semi-annually
Benefits:
- Accountability to users
- Investor/board confidence
- Regulatory compliance (if/when required)
- Benchmarking and improvement
Common Mistakes When Building Trust & Safety
Mistake 1: "We're Different, Normal Rules Don't Apply"
I hear this constantly: "We're a professional network, we don't have safety issues" or "Our users are vetted, we don't need moderation."
Reality: Every platform with humans has safety issues. LinkedIn has fraud and harassment. GitHub has harassment and spam. Professional platforms aren't exempt.
Solution: Assume you'll have safety issues and build accordingly.
Mistake 2: Over-Reliance on Automation
"We'll just use AI to moderate everything automatically."
Reality: Current AI can help detect, but shouldn't auto-remove except in narrow cases (exact CSAM hashes, known malware). Humans are needed for context and edge cases.
Solution: Automation flags for human review. Humans make final decisions.
Mistake 3: Vague Policies
"Be respectful" and "don't abuse the platform" are not enforceable policies.
Reality: Moderators and users need specific definitions. What exactly is harassment? What's hate speech? What's spam?
Solution: Write specific, enforceable policies with examples. "Harassment includes repeated unwanted contact after a user blocks you" is enforceable.
Mistake 4: No Policy Enforcement for "Important" Users
"This user drives a lot of engagement, we can't ban them even though they violate policies."
Reality: Inconsistent enforcement destroys trust and creates legal risk. If your policy says X is prohibited, you must enforce on everyone.
Solution: Enforce policies consistently. If an important user is valuable despite policy violations, change the policy—don't make exceptions.
Mistake 5: Treating Safety as Cost Center
"Trust & Safety doesn't make money, minimize spending on it."
Reality: Safety failures cost more than safety infrastructure. One major incident can cost millions in legal fees, settlements, lost users, and PR damage.
Solution: Treat safety as risk mitigation. The ROI is avoiding catastrophic losses.
Need Help Building Your Trust & Safety Program?
Echelon Advisory helps startups build Trust & Safety infrastructure from scratch, based on lessons from launching operations at Amazon, Google, and TikTok.
Services:
- Phase 1: Foundation Setup ($10K-$20K) - Policies, reporting, review process, basic detection
- Phase 2: Scaling Infrastructure ($25K-$50K) - Vendor selection, advanced detection, appeals, incident response
- Phase 3: Ongoing Advisory ($5K-$10K/month) - Strategic guidance, policy updates, compliance reviews
Key Takeaways
- Start building Trust & Safety before you need it (much cheaper than retrofitting)
- Minimum viable: policies, reporting mechanism, review process, basic detection
- Use off-the-shelf tools, don't build custom systems early
- Hire your first Trust & Safety person around 10K-50K users
- Human oversight required even with automation
- Integrate safety into product development rather than bolting it on afterward
- Treat safety as risk mitigation, not cost center
- Plan for compliance with applicable regulations from day 1
Building Trust & Safety infrastructure proactively is one of the best investments a startup can make. It's cheaper, faster, and less stressful than building it reactively during a crisis.
About the Author
Maneesha Pandey is the founder of Echelon Advisory Services, specializing in Trust & Safety, AI Governance, and EU regulatory compliance. She spent 14+ years building Trust & Safety infrastructure at Amazon, Google, and TikTok, including launching TikTok's LATAM Trust & Safety operations from scratch.