  • Activists and journalists are increasingly suppressed and silenced on social media platforms.
  • Content by users from marginalized populations is unfairly and disproportionately targeted for removal due to bias in algorithms and AI.
  • Three principles implemented in the tech design stage can reduce harm, end digital repression and hold certain actors and platforms accountable for censorship of documented injustices and human rights abuses.

Most of us do not have full visibility into the potential injustices, conflicts and oppression or violence that occur in any given country. We depend on citizen activists and journalists to be our eyes and ears.

Over this past year, however, we have seen an unparalleled level of censorship of individuals, indigenous communities and vulnerable populations on social media and other platforms. Activists and journalists on these platforms are increasingly suppressed and silenced to the extent of near invisibility.

At the same time, perpetrators of injustices roam unpunished, rewarded with clicks and likes.

Compounding this is the reality that digital communication platforms run on algorithms programmed to maximize engagement, which tend to promote attention-grabbing, inflammatory content. Platforms also use artificial intelligence to remove anything that violates their terms and conditions or community guidelines (such as hate speech) and to ban inappropriate accounts. However, these processes can disproportionately impact marginalized communities, which find their accounts and content – especially political speech, conflict documentation and dissent – unfairly targeted for removal. Further, since the start of the pandemic, we have seen a proliferation of social bots liking, sharing and commenting on posts. These bots can generate online hate and amplify existing hate speech and disinformation by facilitating its spread and emboldening individuals with extremist viewpoints.

To understand the scale of this problem, we need to know how effective platforms are at removing harmful content, and how often they remove content that is assumed to be harmful but is not. Facebook and Instagram provide some information on removed content they later restored, but we still have very little visibility into what percentage of content was removed by mistake. Furthermore, there is as yet no shared definition of what constitutes “terrorism” or “organized hate”, the categories automated flagging systems use to justify removal of content.
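The missing metric can be made concrete with a small sketch. It assumes a platform publishes, for a reporting period, how many items it removed and how many it later restored. The class name, field names and the example counts are hypothetical and are not taken from any transparency report. Because not every mistaken removal is appealed or caught, the resulting ratio is at best a lower bound on the true error rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnforcementPeriod:
    """Hypothetical summary of one reporting period for one policy category."""

    removed: int   # items removed in the period
    restored: int  # items later restored (after appeal or internal review)

    def restoration_rate(self) -> float:
        """Share of removals later restored.

        This is only a floor on the mistaken-removal rate: wrongly removed
        content that nobody appealed or re-reviewed is not counted.
        """
        if self.removed <= 0:
            raise ValueError("removed must be positive")
        if self.restored < 0:
            raise ValueError("restored must not be negative")
        return self.restored / self.removed


# Hypothetical counts, for illustration only.
example = EnforcementPeriod(removed=1_000_000, restored=40_000)
rate = example.restoration_rate()
print(f"At least {rate:.1%} of removals were later reversed")
```

A published figure of this kind, broken down by policy category and by language or region, would let outside researchers compare error rates across communities rather than rely on anecdotal reports.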

According to Google’s January 2021-June 2021 transparency report, 9,569,641 videos were removed for violating YouTube’s Community Guidelines. Of these videos, 95% were removed through automated flagging, of which 27.8% were removed before gaining any views. Facebook’s January 2021-March 2021 Community Standards Enforcement Report shows that Facebook removed 9 million pieces of content under its Dangerous Organizations: Terrorism and Organized Hate category and 25.2 million pieces of content deemed hate speech. Instagram removed 429,000 pieces of terrorist content and 6.3 million pieces of hate speech content.

Figure: Google transparency report, videos removed by source of first detection. According to the report, 95% of videos were removed by automated flagging. Image: Google
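To give a sense of scale, the YouTube percentages quoted above can be converted into approximate video counts. This is a rough sketch under two assumptions that the source does not spell out: that the 95% applies to the 9,569,641 removed videos, and that the 27.8% applies to the automated removals only. The variable names are illustrative.

```python
# Figures quoted from Google's January 2021-June 2021 transparency report.
TOTAL_REMOVED = 9_569_641
AUTOMATED_SHARE = 0.95   # assumed to be a share of all removed videos
ZERO_VIEW_SHARE = 0.278  # assumed to be a share of automated removals

# Videos removed after being first detected by automated flagging.
automated = round(TOTAL_REMOVED * AUTOMATED_SHARE)

# Automated removals that happened before the video had any views.
zero_views = round(automated * ZERO_VIEW_SHARE)

# Removals first detected by any other source (users, trusted flaggers, etc.).
other_sources = TOTAL_REMOVED - automated

print(f"Automated removals:        {automated:,}")
print(f"  removed before any view: {zero_views:,}")
print(f"Other detection sources:   {other_sources:,}")
```

Under these assumptions, roughly nine million removal decisions in six months were initiated by automated systems, which is why even a small error rate would affect a large number of videos.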

The dilemma is that these numbers also include content shared by activists and journalists that was removed on the false pretense of terrorism or hate speech.

We have seen how these issues can cause tangible physical, emotional and political harm to individuals, such as in the case of anti-Muslim hate speech and disinformation on Facebook and WhatsApp, which contributed to genocide in Myanmar. Ambiguity in platforms’ community guidelines and the lack of shared definitions pave the way for bias in automated flagging systems, digital repression by government authorities and censorship of activists instead of the true perpetrators.

Marginalized Communities: Targets of Content Removal

Indisputably, digital communication platforms amplify social justice causes, particularly those addressing racial injustice. Among top hashtags used on Instagram related to racial injustice, #blacklivesmatter and #endpolicebrutality have helped to educate and create calls to action against state-sanctioned violence and anti-Black racism. While social media giants have expressed solidarity with anti-racist movements, their algorithms have a track record of disproportionately removing content that raises awareness of these issues. Whether the cause is artificial intelligence incorrectly flagging content or moderators’ inability to manage the sheer volume of inflammatory language, banning such content silences historically marginalized voices.

Following the murder of George Floyd in May 2020, many Black activists reported being censored on social media for spelling out racist policies and shining a light on historical injustices. According to a study by the University of Washington, Carnegie Mellon University and the Allen Institute for Artificial Intelligence, tweets written in African American English were twice as likely as other tweets to be marked as “hate speech”. More recently, during the outbreak of violence between Israel and Palestine in May 2021, Twitter and Facebook improperly blocked or restricted millions of pro-Palestinian posts that were incorrectly associated with terrorism.

While improvements to algorithms have been made, platforms still struggle to differentiate between online terror and candid accounts of real-life experiences of discrimination, with the algorithms often reflecting the inequalities that underpin bias in society.

Case Study: Palestinian Activism

During the May 2021 airstrikes in Gaza, Palestinians and supporters around the world leveraged social media to quickly disseminate key information, share experiences and crowdsource relief. However, many accounts were suspended, with posts flagged as inappropriate, live videos blocked and content reaching fewer eyes than normal. In the first two weeks of the conflict, over 500 instances of content removal were reported to 7amleh, a nonprofit focused on social media. Platforms blamed technical glitches, but advocates have scrutinized the ways in which posts and hashtags of the Al-Aqsa mosque, the Sheikh Jarrah neighborhood in Jerusalem and other non-violent subjects were targeted for removal – making invisible the experiences of many Arabs in the larger social media ecosystem. Further exacerbating the frustrations of many Palestinians, Israeli ads suggesting violence remained on YouTube for days before being flagged for removal.

Although companies have created task forces and dedicated resources to addressing these issues, the situation in May was reflective of a broader pattern.

Case Study: Colombian Activism

Generally, there is this problem with moments of social sparks and protests in Latin American countries wherein we see more online censorship of crucial content documenting violence, raising awareness of human rights violations, and bringing evidence of the repression and abuses that happen on the streets.

—Veridiana Alimonti, Latin American Sr. Policy Analyst, Electronic Frontier Foundation

Since 28 April 2021, thousands of people demonstrating against increasing inequality, poverty and state violence in Colombia have been met with police crackdowns. Efforts to stymie dissent in the physical world have translated to the online world, wherein content regarding the protests has been met with internet disruptions and removals.

Colombian authorities started a campaign called #Colombiaesmiverdad ("Colombia is my truth") to frame anti-government voices as terrorists and vandals – not only in the physical world, but also online. This in effect prompted Instagram’s algorithm to flag and remove content considered “terrorism” or “hate speech” regardless of whether or not the content honestly depicted injustices, crimes and violence.

During this same time frame, Instagram Comms admitted to the removal issues, but many activists and organizations reported that their content was still being removed. According to Carolina Botero Cabrera, Executive Director of Karisma Foundation, the Colombian civil society digital rights organization, “We have over 1,000 reports of censorship, around 90 percent of it was by Instagram and the content was overwhelmingly about the protests”.

Automated content moderation systems pose a significant threat to freedom of expression, with a disproportionate impact on marginalized and vulnerable communities, activists and journalists. Even with the lack of transparency into the error rates, it is clear that automated systems do not understand context and can be easily manipulated by authorities and disinformation bots. Unresolved, this problem will continue to have profound consequences on the ground, particularly during times of crisis and civil unrest.

3 Ethical Technology Design Principles

Three technology design principles can be implemented to:

  • Reduce online harms and the physical harms that emanate from them
  • Stop the digital repression of activists and journalists
  • Hold certain actors and digital communications platforms accountable for censorship of documented injustices and human rights abuses

Principle 1: Service provider responsibility

The burden of safety should never fall solely upon the end user. Service providers can take preventative steps to ensure that their services are less likely to facilitate or encourage illegal and inappropriate behaviors.

Principle 2: User empowerment and autonomy

The dignity of users is of central importance, and their best interests should be a primary consideration. Features, functionality and an inclusive design approach should secure user empowerment and autonomy as part of the in-service experience.

Principle 3: Transparency and accountability

Transparency and accountability are hallmarks of a robust approach to safety. They not only provide assurances that services are operating according to their published safety objectives, but also assist in educating and empowering users about steps they can take to address safety concerns.

Based on the Safety by Design (SbD) principles from Australia’s eSafety Commissioner (eSafety), these principles were developed through extensive consultation with over 60 key stakeholder groups in recognition of the importance of proactively considering user safety during the development process, rather than retrofitting safety considerations after users have experienced online harm.

While digital injustices unfortunately exist in every society, organizations with a vested interest in developing, deploying and using ethical technology have a responsibility to make it difficult for people to perpetuate technology-enabled harms and abuses.

It is not enough to simply remove harmful content. Success rates for content removal mean very little if activists are completely censored online and marginalized offline, and if victims or survivors still have to deal with the social, reputational and psychological trauma caused by the digital abuser. We need transparency into the rate of error for mistaken content removals. We need the digital repression of vulnerable populations, activists and journalists to stop. And we need the perpetrators of digital injustices held to account.

Here is a non-comprehensive list of leaders, organizations and campaigns that advocate for digital justice and help victims, survivors and activists:

The World Economic Forum’s Global Future Council on Data Policy is leading a multistakeholder initiative aimed at exploring these issues, Pathways to Digital Justice, in collaboration with the Global Future Council on Media, Entertainment and Sport and the Global Future Council on AI for Humanity. To learn more, contact Evîn Cheikosman at evin.cheikosman@weforum.org.