New research casts doubt on the idea of “collective intelligence.”

The concept of collective intelligence is simple—it asserts that if a team performs well on one task, it will repeat that success on other projects, regardless of the scope or focus of the work. While it sounds good in theory, it doesn’t work that way in reality, according to new research.

Marcus Credé, an assistant professor of psychology at Iowa State University, says that, unlike an individual’s performance, a team’s effectiveness cannot be predicted with one general factor, such as intelligence, because group dynamics are too complex. Instead, a variety of factors, including leadership, group communication, and decision-making skills, affect a team’s performance, he says.

Anita Woolley’s research supporting collective intelligence quickly gained traction in the business world when it came out in 2010. The attention didn’t surprise Credé. Because organizations rely heavily on group work, managers are always looking for a “silver bullet” to improve team performance, he says.

After re-analyzing the data gathered by Woolley and her colleagues, however, Credé and Garett Howardson, an assistant professor at Hofstra University, found the data didn’t support the basic premise of collective intelligence.

“For decades researchers have looked at what makes a team work well. They’ve typically found that if a team performs well in one area, that is largely unrelated to how the team will perform in a different area,” Credé says. “A team working on a production line requires a fundamentally different set of skills than a team trying to find creative solutions to a problem. While a Marine Corps fire team is great at its job, it’s not going to work well performing surgery.”

Credé notes that of the six studies included in their re-analysis, only one—a 2014 study by researchers at Indiana University—correctly concluded there was no evidence of collective intelligence.

Credé says conflicting data was just one of three major problems he and Howardson discovered. Their analysis found that participants in these studies were either unmotivated, which Credé suspects is the case, or confused by some of the tasks the groups were asked to perform. For example, as part of a brainstorming task, each team had 10 minutes to come up with different uses for a brick. Teams scored a point for each use, regardless of its practicality.

At least one team included in the analysis received a zero on this task. Credé says it’s hard to believe a team could not come up with one use for a brick. In this example, if one group does poorly because of minimal effort, it can artificially inflate correlations between performance across tasks, the researchers explain in the paper.
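The inflation effect can be illustrated with a small sketch. The scores below are invented for illustration and are not taken from Woolley's data or from the re-analysis; the sketch also assumes the low-effort team scores zero on both tasks, which the article does not state. Eight hypothetical teams are constructed so that their scores on two tasks are uncorrelated, and then a single team that scores zero on both is added.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for eight teams on two tasks (made-up numbers),
# chosen so that performance on one task says nothing about the other.
task_a = [10, 12, 14, 16, 10, 12, 14, 16]
task_b = [11, 11, 11, 11, 15, 15, 15, 15]

baseline = pearson(task_a, task_b)

# Assumption: one extra team puts in minimal effort and scores 0 on both tasks.
inflated = pearson(task_a + [0], task_b + [0])

print(f"correlation without the low-effort team: {baseline:.2f}")
print(f"correlation with the low-effort team:    {inflated:.2f}")
```

With these made-up numbers, the correlation rises from zero to roughly 0.8 once the single low-effort team is included, even though nothing about the other eight teams has changed.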

As a result, Credé says Woolley and her team may have misinterpreted the data as an indicator of collective intelligence.

Woolley and her team also did not recognize that teams can exhibit some consistency in performance across tasks, even when the team members barely interact with each other. In other words, the teams may not have functioned collectively. Instead of truly collaborating, Credé says, individual team members may have developed separate responses that were then averaged across the team.

The study participants, who were college students receiving course credit or community members receiving a stipend, also don’t reflect how teams form and function within organizations.

“In real organizations, people typically know each other; they work together over time and work on very different tasks than the ones assigned in the study,” Credé says. “A lot of teams are also comprised of members with high-level and different skill sets, and often one member functions as a leader.”

Credé says in one study, Woolley and her team recorded team conversations while each group was completing a task, which offers a better understanding of how team members interact. In some groups, one team member dominated the entire conversation, and in other groups, there were more equal contributions. Credé says team performance generally suffers when one person controls the conversation.

It is possible that a team’s performance on one task may predict its performance on another, similar task, Credé says. For researchers to fully understand this relationship, however, their work must mirror team composition and tasks in real organizations. Credé cautions that this may be difficult to replicate in a lab setting.