False Negative
Last reviewed by Moderation API
A false negative in content moderation is an instance where harmful or policy-violating content slips through undetected by the moderation system. False negatives expose users to harm and indicate gaps in the model or rule set.
