Instagram Enhances Safety Features with AI and Appeal Processes

Instagram is rolling out safety updates intended to protect users from harmful content and misinformation while giving them more ways to contest moderation decisions. The updates rely on artificial intelligence (AI) for content moderation and add features that give users more control and transparency.

Appeal Process for Post Takedowns

One of the key new features is an appeal process for users whose posts have been taken down. Instagram is rolling out an in-app interface that lets users request a second opinion on the platform's decision, which means a different moderator will review the content. If that reviewer finds the original decision was wrong, the post's visibility will be restored, and the user will be notified of the outcome either way. Previously, users could only appeal account suspensions. The new system extends appeals to individual content takedowns, covering cases where content was mistakenly flagged under policies such as nudity or hate speech.

Blocking Vaccine Misinformation Hashtags

In its ongoing efforts to combat misinformation, Instagram is introducing measures to curb the spread of false information, particularly concerning vaccines. The platform will begin blocking hashtag pages that feature a significant amount of verifiably false content about vaccines. For hashtags that contain some violating content but do not meet the threshold for outright blocking, Instagram will implement a "Top-only" post setting. This means only the most authoritative content will appear, reducing the visibility of problematic posts. Instagram plans to test this approach and potentially expand it to other categories of harmful content. Additionally, users searching for vaccine-related content will be presented with educational information via pop-up notifications, similar to measures previously taken for self-harm and opioid-related content. The platform is now comfortable classifying claims that contradict the medical consensus on vaccines, such as the claim that vaccines cause autism, as verifiably false and will actively demote them.
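The tiered hashtag treatment described above can be illustrated with a short sketch. Everything in it is an assumption made for illustration: Instagram has not published how it measures "a significant amount" of false content or where the thresholds sit, so the share-based measure, the threshold values, and the names HashtagAction and hashtag_action are all hypothetical.

```python
from enum import Enum


class HashtagAction(Enum):
    SHOW_ALL = "show_all"    # hashtag page is unrestricted
    TOP_ONLY = "top_only"    # only the "Top" posts are shown
    BLOCK = "block"          # hashtag page is blocked entirely


# Hypothetical thresholds; the real values are not public.
BLOCK_THRESHOLD = 0.50
TOP_ONLY_THRESHOLD = 0.10


def hashtag_action(violating_posts: int, total_posts: int) -> HashtagAction:
    """Pick a treatment from the share of violating posts (an assumed measure)."""
    if total_posts <= 0:
        return HashtagAction.SHOW_ALL
    share = violating_posts / total_posts
    if share >= BLOCK_THRESHOLD:
        return HashtagAction.BLOCK
    if share >= TOP_ONLY_THRESHOLD:
        return HashtagAction.TOP_ONLY
    return HashtagAction.SHOW_ALL


# Example: a hashtag where 20 of 100 sampled posts violate policy.
print(hashtag_action(20, 100))
```

The point of the middle tier is that a hashtag with some violating posts is restricted rather than removed, which keeps legitimate content reachable.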

AI in Content Moderation

Instagram's content moderation efforts are increasingly powered by AI. Automated systems are employed to scan and score every post uploaded to the platform. These systems utilize classifiers designed to identify prohibited content and leverage "text-matching banks" – collections of fingerprinted content that have already been banned. This technology includes indexing text and using Optical Character Recognition (OCR) to extract words from images, enabling Instagram to find posts with similar text content. The company is actively working on extending this AI capability to video content. The systems are continuously being trained to detect not only explicit violations like threats, unwanted contact, and insults, but also more nuanced issues such as content that intentionally induces fear-of-missing-out (FOMO), taunting, shaming, and betrayals.
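As a rough illustration of the "text-matching bank" idea, the sketch below fingerprints normalized text and checks new posts against the stored fingerprints. It is not Instagram's implementation. The class TextBank and the functions normalize and fingerprint are hypothetical names, the input is assumed to be text already extracted from a caption or by an OCR step, and exact hashing is a deliberate simplification: it only catches copies that differ in case, spacing, or punctuation, whereas finding merely similar text would need a near-duplicate technique.

```python
import hashlib
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text).lower()
    text = re.sub(r"[^\w\s]", "", text)
    return " ".join(text.split())


def fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest of the normalized text."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


class TextBank:
    """Hypothetical bank of fingerprints for text that was already banned."""

    def __init__(self) -> None:
        self._fingerprints = set()

    def ban(self, text: str) -> None:
        self._fingerprints.add(fingerprint(text))

    def matches(self, text: str) -> bool:
        return fingerprint(text) in self._fingerprints


bank = TextBank()
bank.ban("Buy  PILLS now!!!")
# Text from a new post, e.g. a caption or words extracted from an image.
print(bank.matches("buy pills now"))
print(bank.matches("buy vitamins now"))
```

Storing fingerprints instead of raw text keeps lookups fast and avoids retaining the banned content itself in the matching index.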

Tally-Based Suspensions

Instagram is also revising its approach to account suspensions. Instead of basing decisions on the percentage of violating content relative to a user's total posts, the platform will now use a tally of total violations within a specific timeframe. The change is intended to make enforcement more equitable: under a percentage rule, a prolific poster can commit many violations while keeping the violation share low, whereas someone who posts rarely can cross the threshold with only a few. Instagram has stated that it will not disclose the exact timeframe or the number of violations that trigger a suspension, to prevent bad actors from gaming the system.
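The difference between the two rules can be shown with a small sketch. Because Instagram does not disclose the timeframe or the violation count, the values of WINDOW, MAX_VIOLATIONS, and MAX_VIOLATION_SHARE below are invented for illustration, as are the function names.

```python
from datetime import datetime, timedelta
from typing import Iterable

# Hypothetical values; the real ones are intentionally undisclosed.
WINDOW = timedelta(days=90)
MAX_VIOLATIONS = 3
MAX_VIOLATION_SHARE = 0.05


def suspend_by_percentage(violations: int, total_posts: int) -> bool:
    """Old-style rule: suspend when the share of violating posts is too high."""
    return total_posts > 0 and violations / total_posts >= MAX_VIOLATION_SHARE


def suspend_by_tally(violation_times: Iterable[datetime], now: datetime) -> bool:
    """New-style rule: suspend when enough violations fall inside the window."""
    recent = [t for t in violation_times if timedelta(0) <= now - t <= WINDOW]
    return len(recent) >= MAX_VIOLATIONS


now = datetime(2019, 5, 1)
violations = [now - timedelta(days=d) for d in (5, 20, 40)]

# A prolific poster with 3 violations in 1,000 posts.
print(suspend_by_percentage(len(violations), 1000))
print(suspend_by_tally(violations, now))
```

With these assumed numbers the prolific poster escapes the percentage rule (0.3% of posts) but is caught by the tally rule, which is the imbalance the change is meant to remove.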

Other Safety Initiatives

Beyond these core updates, Instagram is testing several other safety features. These include a "nudge" that warns users when they are about to post a potentially hateful comment, an "away mode" that lets users take a break from the platform without deleting their account, and a "manage interactions" feature. The latter lets users restrict specific interactions from another user, such as commenting on their content or sending them direct messages, without having to block that user entirely.

Conclusion

These comprehensive updates underscore Instagram's commitment to user safety and its increasing reliance on AI to manage content at scale. By providing appeal mechanisms, actively combating misinformation, and refining its enforcement policies, Instagram aims to create a more responsible and secure environment for its vast user base.

Author:

Josh Constine, Partner at SignalFire, formerly Editor-At-Large for TechCrunch.

Topics Covered:

Instagram, AI, Content Moderation, Misinformation, Social Media Policy, Tech Policy, Digital Safety, AI Ethics, AI Safety, NLP, OCR, Hate Speech Detection, Fake News Detection.