Twitter had a new plan to fight extremism — then Elon arrived

The company had an ambitious plan to fight extremism on the platform. So what happened to it?

Illustration by Kristen Radtke / The Verge; Getty Images

It had been a long pandemic for Twitter’s research team. Tasked with solving some of the platform’s toughest problems around harassment, extremism, and disinformation, staffers absconded to Napa Valley in November 2021 for a company retreat. Despite a tumultuous change in leadership — Jack Dorsey had recently stepped down, appointing former chief technology officer Parag Agrawal to take his place — the group felt unified, even hopeful. After months of fighting bad actors online, employees took a moment to unwind. “We finally felt like we had a cohesive team,” one researcher says.

But at the goodbye brunch on the last day, people’s phones started pinging with alarming news: their boss, Dantley Davis, Twitter’s vice president of design, had been fired. Nobody knew it was coming. “It was like a movie,” says one attendee, who asked to remain anonymous because they are not authorized to speak publicly about the company. “People started crying. I was just sitting there eating a croissant being like, ‘What’s up with the mood?’”

The news foreshadowed a downward spiral for the research organization. Although the group was used to reorganizations, a shakeup in the middle of an outing meant to bond the team together felt deeply symbolic.

The turmoil came to a head in April, when Elon Musk signed a deal to buy Twitter. Interviews with current and former employees, along with 70 pages of internal documents, suggest the chaos surrounding Musk’s acquisition pushed some teams to the breaking point and prompted numerous health researchers to quit. Some said their colleagues were told to deprioritize projects to fight extremism in favor of focusing on bots and spam. The Musk deal might not even go through, but the effects on Twitter’s health efforts are already clear.

The health team, once tasked with fostering civil conversations on the famously uncivil platform, went from 15 full-time staffers down to two.


In 2019, Jack Dorsey asked a fundamental question about the platform he had helped create: “Can we actually measure the health of the conversation?”

Onstage at a TED conference in Vancouver, the beanie-clad CEO talked earnestly about investing in automated systems to proactively detect bad behavior and “take the burden off the victim completely.”

That summer, the company began staffing up a team of health researchers to carry out Dorsey’s mission. His talk convinced people who’d been working in academia, or for larger tech companies like Meta, to join Twitter, inspired by the prospect of working toward positive social change. 

When the process worked as intended, health researchers helped Twitter think through potential abuses of new products. In 2020, Twitter was working on a tool called “unmention” that allows users to limit who can reply to their tweets. Researchers conducted a “red team” exercise, bringing together employees across the company to explore how the tool could be misused. Unmention could allow “powerful people [to] suppress dissent, discussion, and correction” and enable “harassers seeking contact with their targets [to] coerce targets to respond in person,” the red team wrote in an internal report. 

But the process wasn’t always so smooth. In 2021, former Twitter product chief Kayvon Beykpour announced the company’s number one priority was launching Spaces. (“It was a full on assault to kill Clubhouse,” one employee says.) The team assigned to the project worked overtime trying to get the feature out the door and didn’t schedule a red team exercise until August 10th, three months after launch. In July, the exercise was canceled. Spaces went live without a comprehensive assessment of the key risks, and white nationalists and terrorists flooded the platform, as The Washington Post reported.

When Twitter eventually held a red team exercise for Spaces in January 2022, the report concluded: “We did not prioritize identifying and mitigating against health and safety risks before launching Spaces. This Red Team occurred too late. Despite critical investments in the first year and a half of building Spaces, we have been largely reactive to the real-world harms inflicted by malicious actors in Spaces. We have over relied on the general public to identify problems. We have launched products and features without adequate exploration of potential health implications.”

Earlier this year, Twitter walked back plans to monetize adult content after a red team found that the platform had failed to adequately address child sexual exploitation material. It was a problem researchers had been warning about for years. Employees said that Twitter executives have been aware of the problem but noted the company has not allocated the resources necessary to fix it. 


By late 2021, Twitter’s health researchers had spent years playing whack-a-mole with bad actors on the platform and decided to deploy a more sophisticated approach to dealing with harmful content. Externally, the company was regularly criticized for allowing dangerous groups to run amok. But internally, it sometimes felt like certain groups, like conspiracy theorists, were kicked off the platform too soon — before researchers could study their dynamics.

“The old approach was almost comically ineffective, and very reactive — a manual process of playing catch,” says a former employee, who asked to remain anonymous because they are not authorized to speak publicly about the company. “Simply defining and catching ‘bad guys’ is a losing game.”

Instead, researchers hoped to identify people who were about to engage with harmful tweets, and nudge them toward healthier content using pop-up messages and interstitials. “The pilot will allow Twitter to identify and leverage behavioral — rather than content — signals and reach users at risk from harm with redirection to supportive content and services,” read an internal project brief, viewed by The Verge.

Twitter researchers partnered with Moonshot, a company that specializes in studying violent extremists, and kicked off a project called Redirect, modeled after work that Google and Facebook had done to curb the spread of harmful communities. At Google, this work had resulted in a sophisticated campaign to target people searching for extremist content with ads and YouTube videos aimed at debunking extremist messaging. Twitter planned to do the same. 

The goal was to move the company from simply reacting to bad accounts and posts to proactively guiding users toward better behavior.

“Twitter’s efforts to stem harmful groups tends to focus on defining these groups, designating them within a policy framework, detecting their reach (though group affiliation and behaviors), and suspending or deplatforming those within the cohort,” an internal project brief reads. “This project seeks, instead, to understand and address user behaviors upstream. Instead of focusing on designating bad accounts or content, we seek to understand how users find harmful group content in accounts and then to redirect those efforts.”

In phase one of the project, which began last year, researchers focused on three communities: racially or ethnically motivated violent extremism, anti-government or anti-authority violent extremism, and incels. In a case study about the boogaloo movement, a far-right group focused on inciting a second American Civil War, Moonshot identified 17 influencers who had high engagement within the community, using Twitter to share and spread their ideology. 

The report outlined possible points of intervention: one when someone tried to search for a boogaloo term, and another when they were about to engage with a piece of boogaloo content. “Moonshot’s approach to core community identification could highlight users moving towards this sphere of influence, prompting an interstitial message from Twitter,” the report says. 

The team also suggested adding a pop-up message before users could retweet extremist content. The interventions were meant to add friction to the process of finding and engaging with harmful tweets. Done right, they would blunt the impact of extremist content on Twitter, making it harder for the groups to recruit new followers.

Before that work could be fully implemented, however, Musk reached an agreement with Twitter’s board to buy the company. Shortly afterward, employees who’d been leading the Moonshot partnership left. And in the months since Musk signed the deal, the health research team has all but evaporated, going from 15 staffers to just two. 

“Selling the company to Elon Musk was icing on the cake of a much longer track record of decisions by higher ups in the company showing safety wasn’t prioritized,” one employee says.

Multiple former researchers said the turmoil associated with Musk’s bid to purchase the company was a breaking point and led them to decide to pursue other work.

“The chaos of the deal made me realize that I didn’t want to work for a private, Musk-owned Twitter, but also that I didn’t want to work for a public, not-Musk-owned Twitter,” a former employee says. “I just no longer wanted to work for Twitter.”

Phase two of the Redirect project — which would have helped Twitter understand which interventions worked and how users were actually interacting with them — received funding. But by the time the money came through, there were no researchers available to oversee it. Some employees who remained were allegedly told to deprioritize Redirect in favor of projects related to bots and spam, which Musk has focused on in his attempt to back out of the deal. 

Twitter spokesperson Lauren Alexander declined to comment on the record.

One employee captured the team’s frustration in a tweet: “Completely uninterested in what jack or any other c-suiter has to say about this takeover,” the employee wrote, screenshotting an article about how much Twitter CEO Parag Agrawal and former CEO Jack Dorsey stood to gain from the deal with Musk. “May you all fall down a very long flight of stairs.” (The employee declined to comment.) 

According to current workers, the tweet was reported as being a threat to a coworker, and the employee was fired.