Compliance
54 content moderation terms tagged compliance.
- 3 Strikes Policy
A 3 strikes policy is a moderation rule that escalates consequences for repeated violations: a warning on the first offense, a temporary suspension on the second, and a permanent ban on the third. It gives users a chance to correct behavior before removal.
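The escalation ladder above can be sketched as a simple counter. This is an illustrative sketch, not a reference implementation; the function name and action labels are assumptions.

```python
# Minimal sketch of a 3 strikes escalation. The per-user strike count
# would normally live in a database; action names are illustrative.
ACTIONS = {1: "warning", 2: "temporary_suspension", 3: "permanent_ban"}

def record_violation(strikes: int) -> tuple[int, str]:
    """Increment a user's strike count and return the resulting action."""
    strikes = min(strikes + 1, 3)  # cap at the final strike
    return strikes, ACTIONS[strikes]
```

For example, a user's first offense (`record_violation(0)`) yields a warning, while a third offense yields a permanent ban.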
- Advance Fee Scam (419 Scam)
An advance fee scam tricks the victim into paying an upfront fee in exchange for the promise of a much larger future payout that never arrives, classically framed as an inheritance, lottery win, or foreign official needing help moving funds. It is also known as a 419 scam, after the section of the Nigerian criminal code that covers the offense.
- Age Verification
Age verification is the process of confirming a user's age before granting access to age-restricted content or features, using methods ranging from government ID checks and credit card verification to facial age estimation and behavioral age assurance. Requirements vary by jurisdiction and by the risk profile of the platform.
- AI Guardrails
AI guardrails are the rules, filters, and policies built around an AI system to keep its inputs and outputs within safe and ethical boundaries — preventing the model from generating harmful, biased, or off-policy content even when prompted to do so.
- AI Voice Cloning Scam
An AI voice cloning scam uses a few seconds of recorded speech — pulled from social media, voicemail, or a short phone call — to generate a synthetic copy of someone's voice, then impersonates them in a fake emergency call demanding money. The most common variant is the grandparent scam, where the cloned voice of a child or grandchild claims to be in jail, in the hospital, or stranded abroad.
- Allowlist & Blocklist
An allowlist is a curated list of words, phrases, users, or domains that are explicitly permitted on a platform and allowed to bypass certain moderation filters. A blocklist is the inverse — a curated list of items that are banned outright. The terms have largely replaced the older "whitelist" and "blacklist".
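The interaction between the two lists — allowlist entries bypass the blocklist — can be sketched as follows. This is a minimal illustration; real filters also handle phrases, patterns, and normalization.

```python
# Hypothetical filter check: the allowlist takes precedence over the blocklist.
def is_blocked(term: str, allowlist: set[str], blocklist: set[str]) -> bool:
    """Return True if the term should be filtered out."""
    term = term.lower()
    if term in allowlist:   # explicitly permitted: bypass the blocklist
        return False
    return term in blocklist
```

A term that appears on both lists is allowed, since the allowlist is checked first.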
- Appeal Process
An appeal process is a mechanism that lets a user contest a moderation decision and request that the removal, restriction, or account action be reviewed again. Under the EU Digital Services Act, providing an accessible internal complaint-handling system is a mandatory user right.
- Banning
Banning is the act of permanently revoking a user's access to a platform or service after they have violated the community guidelines or terms of service. Bans may be enforced by account, IP address, device fingerprint, or payment method.
- Brand Safety
Brand safety is the set of measures advertisers and platforms use to prevent a brand's ads from appearing alongside inappropriate, controversial, or harmful content that could damage the brand's reputation.
- Business Email Compromise (BEC)
Business email compromise is a targeted fraud in which attackers impersonate an executive, employee, or vendor over email to redirect wire transfers, payroll, or invoice payments to accounts they control. It relies on social engineering and spoofed or compromised email accounts rather than malware.
- C2PA
The Coalition for Content Provenance and Authenticity (C2PA) publishes an open technical standard for attaching cryptographically signed provenance metadata to media, recording how a file was captured, edited, and published. The standard is designed to help platforms and users distinguish authentic media from manipulated or synthetic content.
- Catfishing
Catfishing is the practice of creating a fake online persona to deceive another person into a romantic or emotional relationship, often as a prelude to financial fraud, blackmail, or manipulation. Detection typically combines image reverse-search, behavioral signals, and conversation analysis.
- Community Guidelines
Community guidelines are a set of rules and standards published by a platform that define what behavior and content are acceptable, giving users and moderators a shared framework for keeping discussions safe and respectful.
- COPPA
The Children's Online Privacy Protection Act is a US federal law that restricts how online services may collect, use, and disclose personal information from children under 13. It requires verifiable parental consent before data collection and imposes strict obligations on data retention, disclosure, and security.
- Crypto Scam
A crypto scam is any fraud that exploits cryptocurrency rails to steal funds, including fake exchanges, celebrity giveaway scams, wallet drainers, and fraudulent token launches. The irreversibility of on-chain transactions makes recovery extremely difficult once assets have moved.
- CSAM (Child Sexual Abuse Material)
CSAM stands for Child Sexual Abuse Material — any visual depiction of sexually explicit conduct involving a minor. CSAM is illegal in virtually every jurisdiction, and US-based platforms are legally required to report known instances to the National Center for Missing & Exploited Children (NCMEC).
- Dark Web
The dark web is a portion of the internet that is not indexed by traditional search engines and can only be reached through anonymizing software such as Tor. Its anonymity makes it both a refuge for privacy-conscious users and a venue for illegal activity.
- Deepfake
A deepfake is a piece of synthetic media — typically video, audio, or image — in which a person's likeness or voice has been replaced or generated using deep learning. Deepfakes are widely used for non-consensual intimate imagery, fraud, impersonation scams, and political disinformation.
- Deepfake Scam
A deepfake scam uses AI-generated synthetic video or audio to impersonate a real person for fraud, including fake CEO video calls authorizing wire transfers, fabricated celebrity endorsements of investment schemes, and synthetic non-consensual intimate imagery used for sextortion. As generation tools get cheaper and faster, deepfakes are becoming a default layer in business email compromise and romance scams.
- Digital Services Act (DSA)
The Digital Services Act (DSA) is a European Union regulation that sets binding rules for online intermediaries on illegal content, transparency, advertising, and risk management. Very Large Online Platforms face the strictest obligations and can be fined up to 6% of global annual turnover for non-compliance.
- Disinformation
Disinformation is false information that is deliberately created and spread to deceive, manipulate, or cause harm. It is the raw material of influence operations and coordinated inauthentic behavior, and is distinguished from misinformation by the intent of whoever originates it.
- Doxxing
Doxxing is the act of publicly sharing someone's private personal information — such as home address, phone number, workplace, or identity documents — without their consent, typically to enable harassment, intimidation, or real-world harm. Most major platforms treat doxxing as an immediate takedown offense.
- Employment Scam
An employment scam is a fraud that uses fake job listings, recruiter outreach, or work-from-home offers to extract money or personal information from applicants. Common variants include upfront fees for equipment or training, fraudulent check-cashing tasks, and identity theft disguised as onboarding paperwork.
- Government Impersonation Scam
A government impersonation scam is a fraud in which criminals pose as officials from tax, social security, law enforcement, or immigration agencies to pressure victims into sending money or sharing personal information. It is among the fastest-growing categories of reported fraud in recent years.
- Hate Speech
Hate speech is content that promotes violence, discrimination, or hostility toward individuals or groups based on protected attributes such as race, religion, ethnicity, national origin, gender, sexual orientation, or disability.
- Human in the Loop
Human in the loop is a moderation approach where AI handles bulk decisions but escalates low-confidence or borderline cases to human reviewers, combining the speed of automation with the judgment of trained moderators.
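The routing logic described above typically hinges on a confidence threshold. A minimal sketch, assuming a hypothetical classifier output and an illustrative threshold value:

```python
def route(label: str, confidence: float, threshold: float = 0.9) -> str:
    """Auto-action high-confidence decisions; escalate the rest to humans.

    `label` is the model's proposed decision (e.g. "remove" or "allow");
    the 0.9 threshold is illustrative and would be tuned per policy area.
    """
    if confidence >= threshold:
        return f"auto_{label}"
    return "human_review"
```

In practice the threshold is often set per harm category, so that high-severity areas escalate more aggressively than spam.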
- Imposter Scam
An imposter scam is a fraud in which the attacker pretends to be someone the victim trusts — a government agency, a family member, a business, or a tech support agent — in order to extract money or personal information. Imposter scams are consistently the top category of fraud reported to the US Federal Trade Commission, with billions of dollars in reported losses each year.
- Investment Scam
An investment scam is a fraud that lures victims into fake trading platforms, nonexistent funds, or manipulated crypto schemes with promises of unusually high or guaranteed returns. It consistently ranks among the highest-loss fraud categories, often facilitated through social media and messaging apps.
- KOSA (Kids Online Safety Act)
KOSA is proposed US federal legislation that would impose a duty of care on online platforms to prevent and mitigate specific harms to minors, including content promoting suicide, eating disorders, substance abuse, and sexual exploitation. It would also require platforms to provide stronger default privacy settings, parental tools, and transparency reporting for users under 17.
- Misinformation
Misinformation is false or misleading information that is shared without the intent to deceive — the person spreading it believes it is true. It is distinct from disinformation, which is deliberately fabricated, though the two can blur as content travels through a network.
- NCII (Non-Consensual Intimate Imagery)
Non-consensual intimate imagery refers to sexually explicit photos or videos shared without the subject's consent, often called "revenge porn". It is now criminalized in many jurisdictions and is typically handled through dedicated victim-reporting workflows, hash-sharing programs, and expedited takedown procedures.
- NSFA (Not Safe for Ads)
NSFA stands for "Not Safe for Ads" and labels content that is unsafe to monetize because it is explicit, offensive, or controversial. Advertisers and ad networks use NSFA classification to demonetize or block placements next to such content.
- NSFW
NSFW stands for "Not Safe For Work" and is used to label content — typically nudity, graphic violence, or strong language — that is inappropriate to view in a professional or public setting and should be hidden behind a warning.
- Offensive Content
Offensive content is user-generated material likely to upset or alienate readers — including hate speech, harassment, slurs, graphic violence, and sexually explicit imagery — even when it does not necessarily break the law.
- Online Safety Act (UK)
The Online Safety Act is a UK law that imposes a legal "duty of care" on online platforms to protect users — especially children — from illegal and harmful content, with enforcement and fines administered by the regulator Ofcom.
- Phishing
Phishing is a social engineering attack in which the attacker impersonates a trusted entity — a bank, an employer, a well-known brand — in an email, text message, or website in order to trick the victim into handing over credentials, payment information, or access to a device. Variants include spear phishing, smishing (SMS), and vishing (voice).
- Pig Butchering Scam
A pig butchering scam is a long-con fraud in which a scammer builds a romantic or friendly relationship with a victim over weeks or months, then lures them into a fake cryptocurrency investment platform and drains their savings. The name comes from the Chinese "sha zhu pan" — fattening the pig before the slaughter.
- PII Detection
PII detection is the automated identification of personally identifiable information — such as names, addresses, phone numbers, government IDs, and financial details — inside user-generated content so it can be redacted, blocked, or routed for review. It is central to doxxing prevention, privacy compliance, and safe data handling.
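At its simplest, PII detection combines pattern matching with redaction. The sketch below uses two illustrative regexes; production systems layer many more patterns plus ML-based entity recognition on top of this idea.

```python
import re

# Illustrative patterns only: real detectors cover far more formats
# (international phone numbers, IDs, addresses) and use ML models.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "us_phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact_pii(text: str) -> str:
    """Replace each detected PII span with a labeled placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text
```

For example, `redact_pii("mail bob@example.com")` returns `"mail [EMAIL]"`, which can then be stored or displayed safely.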
- Romance Scam
A romance scam is a fraud in which the attacker feigns a romantic relationship with the victim to win their trust and then solicit money, gift cards, or cryptocurrency. Romance scams are frequently the entry point for pig butchering investment fraud and financial sextortion, and are among the most costly categories of consumer fraud tracked by the FBI and FTC.
- Rug Pull
A rug pull is a cryptocurrency exit scam in which the developers of a token or project abruptly abandon it and drain the liquidity pool, leaving holders with worthless assets. It typically follows aggressive marketing and artificial price pumping designed to attract retail buyers.
- Section 230
Section 230 is the provision of the 1996 Communications Decency Act that shields US online platforms from liability for content posted by their users, and protects them when they moderate or remove content in good faith. It is often described as the legal foundation of the modern internet.
- Sextortion
Sextortion is a form of online blackmail in which an attacker threatens to share sexual images or videos of a victim unless they pay money or provide more content. The FBI has warned of a sharp rise in financial sextortion targeting teenage boys, with tens of thousands of reports filed to the National Center for Missing & Exploited Children each year.
- SHAFT
SHAFT is a content moderation and advertising compliance acronym for Sex, Hate, Alcohol, Firearms, and Tobacco — five categories of regulated or sensitive material that ad networks and platforms restrict, age-gate, or prohibit outright.
- SIM Swap
A SIM swap is an attack in which a fraudster social-engineers a mobile carrier into transferring a victim's phone number to a SIM they control, allowing them to intercept SMS-based two-factor authentication codes and take over bank, email, and crypto accounts. It is one of the primary reasons security practitioners discourage SMS 2FA for high-value accounts.
- Smishing
Smishing is phishing delivered over SMS, where attackers send text messages impersonating a bank, delivery service, toll authority, or government agency to trick the recipient into clicking a malicious link or handing over credentials. The compressed format of text messages makes it harder for victims to spot the usual red flags of a phishing attempt.
- Spam
Spam is unsolicited, irrelevant, or repetitive content posted at scale, typically for advertising, link building, scams, or phishing. Spam degrades user experience and is one of the most common targets of automated moderation.
- Takedown
A takedown is the removal of a piece of content from a platform after it has been determined to violate community guidelines or legal standards, often in response to a user report, copyright notice, or government request.
- Tech Support Scam
A tech support scam is a fraud in which criminals impersonate well-known software or hardware companies — most often Microsoft or Apple — to convince victims that their device is infected, then charge for fake repairs or install remote access tools. Older adults are disproportionately targeted.
- Terms of Service (ToS)
Terms of Service is the legal agreement between a platform and its users that defines the rules for using the service, including acceptable content, user responsibilities, and the platform's right to remove content or suspend accounts.
- Toxicity
Toxicity in content moderation describes language that is harmful, abusive, or disruptive to a community — typically including hate speech, harassment, threats, and personal attacks. Toxicity classifiers score text on this dimension.
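Toxicity scores are typically turned into decisions by thresholding. A minimal sketch, assuming a score in [0, 1] from some upstream classifier; the cutoffs are illustrative, not recommended values.

```python
def moderation_action(toxicity_score: float) -> str:
    """Map a toxicity score in [0, 1] to an action (thresholds illustrative)."""
    if toxicity_score >= 0.9:
        return "remove"        # near-certain violations: auto-remove
    if toxicity_score >= 0.7:
        return "human_review"  # borderline: escalate to a moderator
    return "allow"
```

The two-threshold design creates a gray zone for human review, which is the usual way toxicity classifiers feed into a human-in-the-loop pipeline.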
- Transparency Report
A transparency report is a regular public disclosure in which a platform reports on content it removed, accounts it actioned, and government or legal requests it received. Under the EU Digital Services Act, transparency reports are a legal obligation for online platforms operating in the EU.
- Trust & Safety
Trust & Safety is the discipline within an online platform responsible for protecting users from harm — covering content moderation, fraud prevention, policy enforcement, and incident response. Content moderation is one pillar of a broader Trust & Safety function that also handles abuse, scams, and regulatory compliance.
- Vishing
Vishing is voice phishing, where an attacker calls the victim and impersonates a bank, tax authority, or tech-support agent to extract credentials, payment details, or remote access to a device. Modern vishing increasingly uses AI voice cloning to impersonate specific individuals the victim knows and trusts.
- Zero Tolerance Policy
A zero tolerance policy is a moderation rule that triggers an immediate and severe consequence — usually content removal or a permanent ban — on the first violation, with no warnings, strikes, or progressive discipline.
