False Positive
Last reviewed by Moderation API
A false positive in content moderation is an instance where benign content is incorrectly flagged or removed as harmful by a moderation tool or algorithm. High false positive rates frustrate users and erode trust in the platform.
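The false positive rate mentioned above is commonly computed as the share of benign items that were wrongly flagged: false positives divided by the sum of false positives and true negatives. The snippet below is a minimal sketch of that calculation, not any particular platform's implementation. The function name `false_positive_rate` and the sample data are hypothetical, and it assumes each item has a trusted ground-truth label (for example, from human review).

```python
def false_positive_rate(is_harmful, was_flagged):
    """Return FP / (FP + TN) for paired ground-truth labels and moderation decisions.

    is_harmful:  list of bools, True if the item is actually harmful (assumed ground truth).
    was_flagged: list of bools, True if the moderation tool flagged or removed the item.
    """
    if len(is_harmful) != len(was_flagged):
        raise ValueError("is_harmful and was_flagged must have the same length")

    # Benign content that was flagged anyway: the false positives.
    false_positives = sum(
        1 for harmful, flagged in zip(is_harmful, was_flagged) if flagged and not harmful
    )
    # Benign content that was correctly left alone: the true negatives.
    true_negatives = sum(
        1 for harmful, flagged in zip(is_harmful, was_flagged) if not flagged and not harmful
    )

    benign_total = false_positives + true_negatives
    if benign_total == 0:
        # No benign items in the sample, so the rate is undefined; 0.0 is returned by convention here.
        return 0.0
    return false_positives / benign_total


# Hypothetical sample: four benign posts and one harmful post.
is_harmful = [False, False, False, False, True]
was_flagged = [True, False, False, False, True]

rate = false_positive_rate(is_harmful, was_flagged)
print(f"False positive rate: {rate:.2f}")  # one of four benign posts was flagged
```

In this sample, one of the four benign posts was flagged, so the rate is 0.25. The harmful post that was correctly flagged does not affect the figure, because the rate only considers benign content.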
