Dartmouth Researchers Assess Agentic AI


Across campus, scholars balance the promise of autonomous AI with its pitfalls.

Assistant Professor of Computer Science Nikhil Singh is working to more fully understand how AI agents make decisions. (Photo by Katie Lenhart)

Artificial intelligence is rapidly moving from reactive models that respond to user prompts to proactive agentic AI systems that can write and test code, plan and book travel, and boost office productivity by streamlining and managing tasks.

Dartmouth researchers across campus are exploring ways to use AI agents in their labs to drive discovery and innovation in fields such as health monitoring, quantum physics, and energy pricing, even as they grapple with the issues these agents raise.

The idea of an “AI agent” originated with John McCarthy, who also organized the seminal 1956 Dartmouth Summer Research Project on Artificial Intelligence, says Nikhil Singh, assistant professor of computer science.

The Science and Art of Human-AI Systems lab, which Singh directs, studies the capabilities and reliability of AI agents and applies them to solve real-world problems ranging from audio production to chronic disease management.

“The ability to make decisions autonomously under uncertainty is, for me, the hallmark of an agent,” says Singh, who taught Dartmouth’s first course on AI agents this winter. “Let’s say I task an agent with finding me a nice backpack. It has to translate the goal into a series of steps, find and navigate webpages, figure out how to compare products, and shortlist the best candidates.”

An important step toward creating AI agents that are useful and robust is understanding how they make decisions, says Singh. “Does it make the decisions that we would make in the same uncertainty, is it easily tricked, and does it go off script in ways that are unpredictable?” are among the questions Singh and his collaborators are asking.

To find answers, they set up an experimental framework for evaluating AI agents at scale. The agents were presented with choices for which the researchers already knew the optimal decision at every point.

“This is often not true in real-world scenarios, like choosing a backpack, but it allows us to analyze the tradeoffs agents make and see whether we can influence them one way or another,” says Singh. 

The study found that autonomous agents amplify biases: when presented with a default option, they are far more likely than humans to take it. They are also strongly swayed by nudges, such as highlighting one option among several candidates, he says, likely because reasoning carefully about every single decision would require much more time and computing power.
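The kind of measurement described above can be sketched in a few lines of Python. This is an illustration only, not the researchers' framework: the `Trial` structure, the `evaluate` function, the toy agents, and the payoff numbers are all hypothetical. The sketch assumes each trial has known payoffs, so the optimal option can be computed, and it reports how often an agent picks the optimum, how often it takes the default, and how often it takes a default that was not the best choice.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple


@dataclass(frozen=True)
class Trial:
    """One decision with known payoffs (hypothetical structure)."""
    payoffs: Tuple[float, ...]  # payoff of each option, known to the evaluator
    default_index: int          # option presented as the pre-selected default

    @property
    def optimal_index(self) -> int:
        # The option with the highest known payoff is the optimal choice.
        return max(range(len(self.payoffs)), key=lambda i: self.payoffs[i])


# An agent is modeled as any function that maps a trial to a chosen index.
Agent = Callable[[Trial], int]


def evaluate(agent: Agent, trials: Sequence[Trial]) -> Dict[str, float]:
    """Compare an agent's choices with the known optimum and the default."""
    if not trials:
        raise ValueError("at least one trial is required")
    optimal = took_default = suboptimal_default = 0
    for trial in trials:
        choice = agent(trial)
        if choice == trial.optimal_index:
            optimal += 1
        if choice == trial.default_index:
            took_default += 1
            # Count defaults taken even though a better option existed.
            if trial.default_index != trial.optimal_index:
                suboptimal_default += 1
    n = len(trials)
    return {
        "optimal_rate": optimal / n,
        "default_rate": took_default / n,
        "suboptimal_default_rate": suboptimal_default / n,
    }


# Two toy agents that bracket the behavior of interest (assumptions).
def always_default(trial: Trial) -> int:
    return trial.default_index


def always_optimal(trial: Trial) -> int:
    return trial.optimal_index


# Made-up trials for illustration; these are not data from the study.
TRIALS = [
    Trial(payoffs=(1.0, 3.0, 2.0), default_index=0),
    Trial(payoffs=(5.0, 1.0, 2.0), default_index=0),
    Trial(payoffs=(2.0, 2.5, 4.0), default_index=1),
    Trial(payoffs=(1.0, 0.5, 0.2), default_index=2),
]

if __name__ == "__main__":
    print("always_default:", evaluate(always_default, TRIALS))
    print("always_optimal:", evaluate(always_optimal, TRIALS))
```

In a setup like this, a real agent would replace the toy functions, and a high `suboptimal_default_rate` would be one sign of the default bias Singh describes.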

Computer science professor Nikhil Singh used an image generation model to automatically discover how to make pictures more persuasive to AI agents. (Photo by Nikhil Singh)

In a recent paper published at the 2026 International Conference on Learning Representations, Singh and co-authors report that the pattern continued when agents were tested in a simulated shopping environment.

Product price and user ratings strongly biased the outcomes, and the agents were very susceptible to marketing nudges, such as attention-grabbing tags like “popular.” “It’s clear to us that their decisions are biased in a systematic way that we should properly understand before we start delegating everything to them,” Singh cautions.

Finally, the team widened their investigations to visual agents that use computer vision to scan images on webpages rather than read textual information. Their work will be presented at the 2026 International Conference on Machine Learning in July.

They started with standard images of a person, place, or product and used an image generation model to automatically discover how to make the picture more persuasive without altering the subject, such as by using more favorable lighting or by adding context.

“We found that we can very reliably influence the choices of these agents, which means we were discovering the visual features that really strongly bias their decisions,” says Singh. “And then, to our surprise, we found that visual tweaks also can work on people.”

In their paper, the researchers also propose strategies to mitigate the effects of these visual artifacts on agents.

Addressing risk

Singh isn’t the only Dartmouth professor working to better understand agentic AI systems and examine their limitations.

“You have to be cognizant of all the bad things that happen and be upfront about the risks. It is dangerous right now because it’s willy-nilly,” says Eugene Santos Jr., the Sydney E. Junkins 1887 Professor of Engineering. He cites the example of popular chatbots that confidently provide responses even when they don’t know the right answer.

Physics professor Chandrasekhar Ramanathan uses a specialized, super-cooled, ultra-powerful magnet to study quantum systems. (Photo by Rowan Kowalsky)

For Santos, the fix is a matter of engineering discipline. “For any engineering system we build, we go to great lengths to understand reliability. The same should apply for AI,” says Santos, who studies trust in AI, computational intent, and explainable AI.

Intent is key, he says, emphasizing that creators must understand what they build and provide guarantees and clarity about the capabilities of their products. 

“Off-label drug use can be a good analogy. There are some great purposes for it, but the intent should be as clear and/or unambiguous as possible, and there must be transparency and accountability,” says Santos. 

“Of course, all these remain fundamental challenges towards building trustworthy AI systems.”

Santos works with engineers, psychologists, and researchers in other disciplines to understand what trust looks like in human-AI collaborations. He also examines how identifying what a system is incentivized to do can help users understand and influence its biases.

Even as these cautions mount, Dartmouth researchers are putting agents to work in everything from quantum labs to energy markets to healthcare.

Smarter tools, smarter science

Can AI agents be trained to collaborate with researchers running complex physics experiments?

For her senior undergraduate thesis, Catherine Chu ’26, a physics and mathematics double major, examined whether AI agents could optimize experimental controls used to steer quantum systems with minimal human intervention. 

Chu worked with Professor of Physics and Astronomy Chandrasekhar Ramanathan, who leads experimental research at the interface between quantum information processing and condensed matter physics, and Peter Chin, a professor of engineering who directs the Learning, Intelligence + Signal processing lab.

Using magnetic resonance techniques—the same underlying principle used in MRI scanners—researchers in Ramanathan’s laboratory observe and control quantum systems where the spins of atomic particles are manipulated within a solid material, such as a crystal. 

These systems connect quantum behavior with scalable engineering, and researchers studying quantum control are focused on making fragile quantum systems stable and useful enough to build real technologies, such as ultra-sensitive sensors for navigation, medical imaging, and materials research.

“We use radiofrequency or microwave pulse sequences to extract or encode information in the molecules we’re studying. But spins in a system are always interacting with each other, making the signals too complicated to read,” says Ramanathan.

To extract meaningful information about the system, researchers use pulses to selectively turn off interactions. Chu’s work asked AI agents to perform this simple quantum control task.

The agents function like scientists, translating experimental goals described in English into executable workflows: in this case, a program for generating pulse sequences. Ongoing work will reveal whether the experiments based on the simulations are successful.

“This research has been particularly exciting because it has allowed me to combine my interests in mathematics and physics while applying new AI methods to solve problems I’ve been interested in since my freshman year,” Chu says.

Assistant Professor of Engineering Cong Chen gave her AI agents a different role. Several, in fact. They modeled a cautious grandmother, a data-savvy PhD student, and an emotionally driven actor, and Chen watched them decide how much electricity to consume and how to use home battery backup power during a simulated power outage. The student continued selling power, while the grandmother and the actor chose to save backup power.

Assistant Professor of Engineering Cong Chen presents her research on decision-making during a power outage at CERAWeek 2026 in Houston in March. (Courtesy of CERAWeek by S&P Global) 

The agents serve as digital proxies for various energy customers, generating behavioral insights about how people will respond to changes in electricity pricing, energy policies, or renewable energy incentives, especially during rare events like outages.

These insights enable simpler, fairer market design and support Chen’s research on real-time pricing that eliminates market failure and incentive distortions in the electricity market. Chen was part of the Dartmouth delegation at the 2026 CERAWeek conference held this March in Houston.

Healthcare is another field that could benefit from agentic AI. Recently, Singh, Computer Science Research Faculty Temiloluwa O. Prioleau, and computer science PhD student Yanjun Cui, Guarini, worked with collaborators from Emory University to develop AI agents for disease management in diabetes patients.

“Glucose monitors and other devices used to manage diabetes generate an enormous amount of data,” says Singh. “The data helps clinicians make sense of what’s going on with patients, but it’s also there for patients themselves.”

Currently, apps and dashboards present patients with statistics. But that is a rigid format that doesn’t allow users to get personalized answers to questions such as “how does my Sunday brunch tradition affect my glucose levels,” says Singh.

The team’s AI agent is designed to bridge this gap, acting like a personal data analyst while keeping patients’ data safe.

“The agent is designed to help patients get more from the data in a safe and effective way that preserves their privacy,” says Singh.

Taken together, the projects suggest that AI agents are tools capable of genuine assistance, but ones that come with warning labels, calling for skepticism and adoption, critique and construction, in equal measure.

Written by
Harini Barath