AI agents as a ‘virtual team’

Researchers built an all-AI team; join me for 3 min to unpack what this means for problem-solving and collaboration.


This is just a banana pants study that I need to share with people. I think all the best uses of AI right now are happening in research, and especially medical research. This paper is a great example.

As background, all you need to know is that any time a bunch of cross-disciplinary experts come together, there are costs and challenges: challenges in terminology, methodology, and competing priorities and goals, and costs in project duration, effort level, and pure people-hours. But really difficult, open-ended problems are also best solved by exactly that kind of cross-disciplinary team, so those challenges and costs have always seemed worth it. Now we may have an alternative.

In the study I’m sharing today, link below as always, the researchers created a cross-disciplinary team of experts, except all the experts were AI agents. The agents held a series of meetings and working sessions over time, many of them run in parallel. As if that weren’t wild enough, the humans in the loop only created two of the agents, and then one of those two created all the other experts on the working team.

For those of you who stop listening after 1 minute, that’s the overview. For the rest of you, let’s take a closer look and unpack the implications.

Here’s how the team was built: the human research team defined a Principal Investigator and a so-called Scientific Critic. The Critic was instructed to catch errors and provide oversight on all the agents and all the work, while the Principal Investigator was instructed to create and guide whatever team of agents it thought it needed to solve the problem at hand. The human research team supplied only the problem itself, plus the agendas and discussion points for each working session between agents.
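To make that structure concrete, here’s a minimal Python sketch of the setup as I understand it. To be clear, this is my illustration, not the paper’s code: the `Agent` class, `spawn_team`, and the example expert roles are all my own, and a real system would replace the placeholder with actual LLM calls.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    role: str
    instructions: str

# The human researchers define exactly two agents by hand.
pi = Agent(
    role="Principal Investigator",
    instructions="Create and guide whatever team of expert agents "
                 "you need to solve the problem at hand.",
)
critic = Agent(
    role="Scientific Critic",
    instructions="Catch errors and provide oversight on all agents "
                 "and all work produced.",
)

def spawn_team(pi: Agent, problem: str) -> list[Agent]:
    """The PI agent, not the humans, decides which experts exist.
    In a real system this would be an LLM call; the hardcoded return
    value below is a made-up example of what it might produce."""
    return [
        Agent("Immunologist", "Advise on antibody and binder design."),
        Agent("Computational Biologist", "Model protein structures."),
        Agent("ML Specialist", "Choose and tune predictive models."),
    ]

# The humans supply only the problem statement and, later, the agendas.
team = spawn_team(pi, problem="Design improved antibody binders")
```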

The whole paper is quite interesting. I genuinely enjoyed the tactical discussion of how they built agendas and structured working sessions for maximum efficacy, but I’ll close here with the implications.

The framing here is mine; this isn’t how they frame things in the paper. What stood out to me is that they structured this program of work around three competing things happening at once.

First, they gave the agents free rein, in order to take advantage of generative and agentic AI’s open-ended capabilities.

Second, they gave the agents clearly scoped roles and activities, and asked the Principal Investigator to do the same for the agents it created, in order to keep each agent focused on its own expertise and on task throughout the inevitable chaos of open-ended collaboration between open-ended entities.

And third, they gave the agents constant error-checking and critique, to mitigate the ever-present risk of hallucinations and factual errors knocking the rapidly paced work off track.
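Here’s how I picture those three forces interacting inside a single working session. Again, this sketch is mine, not the paper’s implementation: the loop shape, the round count, and the `llm_respond` placeholder are all assumptions for illustration.

```python
def llm_respond(role: str, transcript: list[str]) -> str:
    """Placeholder for an LLM call conditioned on the agent's role
    prompt and the discussion so far; here it just returns a stub."""
    return f"[{role} responds to message {len(transcript)}]"

def run_session(agenda: str, experts: list[str], rounds: int = 3) -> list[str]:
    transcript = [agenda]
    for _ in range(rounds):
        for role in experts:
            # Open-ended: each expert responds freely to the whole
            # discussion, but scoped: only from within its own role.
            transcript.append(llm_respond(role, transcript))
        # Critiqued: the Scientific Critic reviews every round before
        # the discussion moves on, so errors get flagged early.
        transcript.append(llm_respond("Scientific Critic", transcript))
    # The PI closes by synthesizing decisions and next steps.
    transcript.append(llm_respond("Principal Investigator", transcript))
    return transcript
```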

So they built a system that could continually and dynamically balance itself across three competing demands: open-endedness, scoping, and critical assessment. If you’re familiar with the ever-obnoxious double diamond of so-called design thinking, this is sort of like taking the double diamond, folding it in on itself, and then spinning it.

I’m calling it now: I think we will continue to see the collapse of process and, in its place, a growth in the skill of structuring systems. And not to sound cynical, but I am curious to see how this new type of work gets branded or simplified. Thanks for listening.

Source: https://www.biorxiv.org/content/10.1101/2024.11.11.623004v1