What is Trust & Safety?

Trust & Safety, usually shortened to T&S, is the function at an online platform responsible for keeping users safe from harm, fraud, and abuse. Content moderation is the most visible part of the job, but the discipline also covers account integrity, platform manipulation, scam prevention, child safety, regulatory reporting, and crisis response. In practice it draws on policy, operations, engineering, legal, and data science in roughly equal measure.

What T&S teams actually do

A modern T&S team owns the full lifecycle of user harm.

The work starts with writing the community guidelines that define what is and is not allowed on the platform. From there the team builds or buys the detection systems that enforce those rules at scale, typically some combination of classifiers, heuristic rules, and human review queues. When content or behavior crosses a policy line, T&S operations handles the review, the escalation, the takedown, the account action, and the appeal.
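Stripped to its skeleton, that flow looks something like the sketch below. The thresholds, category names, and toy classifier are illustrative assumptions rather than any particular platform's setup; real systems layer many more signals and routing rules on top.

```python
# A minimal sketch of the detection-and-enforcement flow described above.
# Thresholds, phrases, and the stand-in classifier are invented for illustration.

from dataclasses import dataclass

@dataclass
class Item:
    item_id: str
    text: str

@dataclass
class Decision:
    action: str   # "allow", "remove", or "human_review"
    reason: str

BLOCKED_PHRASES = {"buy followers cheap"}   # toy heuristic rule

def classifier_score(item: Item) -> float:
    """Stand-in for a real ML model; returns the probability the item violates policy."""
    return 0.92 if "scam" in item.text.lower() else 0.05

def moderate(item: Item) -> Decision:
    # 1. Heuristic rules: cheap, deterministic checks run first.
    if any(phrase in item.text.lower() for phrase in BLOCKED_PHRASES):
        return Decision("remove", "matched blocked phrase")

    # 2. Classifier: auto-action on high confidence, queue the uncertain middle for humans.
    score = classifier_score(item)
    if score >= 0.90:
        return Decision("remove", f"classifier score {score:.2f}")
    if score >= 0.40:
        return Decision("human_review", f"classifier score {score:.2f}")

    # 3. Everything else is allowed; user reports can still pull it back into review.
    return Decision("allow", "below review threshold")

print(moderate(Item("1", "Limited time offer: scam-free crypto doubling!")))
```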

For the most severe categories (CSAM, terrorism, and imminent self-harm), the team also manages mandatory reporting to authorities such as NCMEC, law enforcement, and national regulators. These pipelines usually have their own tooling, staffing, and legal review because the consequences of a mistake are much higher.
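As a rough illustration of that split, the routing can be thought of as a category map that sends the highest-severity cases into their own legally reviewed pipeline instead of the standard queue. The destinations below name the bodies mentioned above, but the structure itself is a simplification, not a real integration.

```python
# Illustrative only: a toy map from the highest-severity categories to a
# dedicated escalation path. Real reporting pipelines involve legal review,
# evidence preservation, and jurisdiction-specific obligations.

SEVERE_CATEGORIES = {
    "csam": "NCMEC CyberTipline report",
    "terrorism": "law-enforcement referral",
    "imminent_self_harm": "emergency escalation to local authorities",
}

def route(category: str) -> str:
    """Severe categories bypass the normal queue and go to a dedicated pipeline."""
    if category in SEVERE_CATEGORIES:
        return f"dedicated pipeline -> {SEVERE_CATEGORIES[category]}"
    return "standard moderation queue"

print(route("csam"))   # dedicated pipeline -> NCMEC CyberTipline report
print(route("spam"))   # standard moderation queue
```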

How the discipline took shape

In the early 2000s, most platforms treated abuse as a support ticket problem and handed it to customer service. That stopped working once user-generated content scaled into the billions of items per day. The work outgrew ad hoc processes and turned into its own engineering-heavy function with dedicated tooling, headcount, and leadership.

Industry groups like the Trust & Safety Professional Association (TSPA) and the Digital Trust & Safety Partnership have since formalized the practice by publishing standards, frameworks, and a shared vocabulary. Regulation did the rest. Between the EU Digital Services Act, the UK Online Safety Act, and sector-specific rules for CSAM and terrorism, T&S is now a board-level concern at most large platforms.

The core pillars

Most T&S organizations are built around a handful of specialized groups:

- Content moderation handles text, image, audio, and video review.
- Platform integrity focuses on coordinated inauthentic behavior, bot networks, and spam.
- Fraud and financial crime handles scams, chargebacks, and money laundering.
- Child safety runs dedicated CSAM detection and NCMEC reporting pipelines, often with its own tooling and staffing because of the legal stakes.
- Policy and legal owns the rules themselves and deals with regulators.
- Data science and ML engineering build the classifiers that make the whole operation scale beyond what humans could ever review by hand.

How T&S work gets measured

Mature T&S teams run on metrics.

The standard ones include prevalence (how much violating content exists on the platform at any given time), proactive detection rate (how much of it is caught before a user reports it), time to action, appeal overturn rate, and moderator wellness indicators. Most of these numbers end up in transparency reports, which are now legally required in several jurisdictions and published voluntarily by many platforms that are not yet covered by such a requirement.
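To make the definitions concrete, here is a rough sketch of how the headline numbers fall out of moderation logs. The field names and figures are invented for illustration; in practice prevalence is usually estimated from a sampled human review rather than raw counts.

```python
# Illustrative metric calculations over invented counts from a reporting period.

sample = {
    "items_sampled": 10_000,        # random sample reviewed by humans
    "violating_in_sample": 37,      # items in the sample found to violate policy
    "removed_total": 5_200,         # total removals in the period
    "removed_proactively": 4_680,   # removed before any user report
    "appeals_received": 410,
    "appeals_overturned": 29,
    "hours_to_action": [0.5, 2.0, 1.1, 6.3, 0.2],  # per-case turnaround times
}

prevalence = sample["violating_in_sample"] / sample["items_sampled"]
proactive_rate = sample["removed_proactively"] / sample["removed_total"]
overturn_rate = sample["appeals_overturned"] / sample["appeals_received"]
median_hours = sorted(sample["hours_to_action"])[len(sample["hours_to_action"]) // 2]

print(f"prevalence:               {prevalence:.2%}")
print(f"proactive detection rate: {proactive_rate:.2%}")
print(f"appeal overturn rate:     {overturn_rate:.2%}")
print(f"median time to action:    {median_hours} h")
```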

Running a real T&S program means treating user safety as a measurable engineering problem with its own KPIs, postmortems, and roadmap.