1,000 Scientist AI Jam Session explores AI-driven scientific discovery

OpenAI researcher Aaron Jaech (center) assisted Lab scientists in working with AI models at the 1,000 Scientist AI Jam Session, held at LLNL’s Library and the Livermore Valley Open Campus.
Lawrence Livermore National Laboratory (LLNL) on Feb. 28 joined an initiative that brought together over 1,400 Department of Energy (DOE) scientists across multiple sites to explore how cutting-edge AI models could transform scientific research.
The first-ever 1,000 Scientist AI Jam Session, hosted at nine DOE labs including LLNL, immersed scientists in a full-day, hands-on collaboration with OpenAI to evaluate some of the company’s most advanced AI reasoning models on real-world scientific problems. During the event, researchers assessed the models’ capabilities in solving complex scientific challenges and reported their findings, in hopes of charting a course for the future of AI for science.
“We're really trying to do two things at the top level, and the easiest goal is to expose our scientific staff to new AI reasoning models that industry has provided,” said Brian Spears, director of LLNL’s AI Innovation Incubator. “We want to get these models in the hands of staff, make them understand the way that these tools operate, and help them understand what they can do with them and what they can't yet do with them. The other piece is to start defining what the scientific frontier looks like — we’d really like to ask questions that are just within the reach of these models, and some that are much further away that we can't yet do.”
Spurred by an ongoing DOE partnership with OpenAI, the Jam Session marked the first in a series of joint engagements aimed at integrating AI-driven tools into scientific workflows, including hypothesis generation, experiment automation and coding efficiency. Other participating DOE hosts included Argonne, Lawrence Berkeley, Brookhaven, Idaho, Los Alamos, Oak Ridge, Pacific Northwest and Princeton Plasma Physics national labs. Future events are likely to include other AI companies, including Anthropic and Meta.
Scientists engage in immersive AI experimentation
At LLNL, more than 300 AI experts and scientists gathered for working sessions at the Library and the Livermore Valley Open Campus. Scientists explored OpenAI’s reasoning models, including o3-mini, o1 and o1 Pro — which “think” before answering and can explain their reasoning — as well as GPT-4.
Participants worked independently and in teams, testing the AI models’ capabilities in data analysis, pattern recognition and problem-solving. Throughout the day, check-ins facilitated knowledge exchange and model performance assessment. The event concluded with an outbrief session, where attendees shared feedback on usability, effectiveness and areas for improvement.
LLNL physicist Jason Harke, who was using OpenAI’s reasoning models for the first time, found the experience “mind-blowing.” He noted that the o3 model solved, in about three minutes, a coding problem he’d spent a day and a half working on.
“It’s exceptionally quick,” Harke said. “Right out of the box, it’s getting C and Python code generation right. It’s impressive and extremely efficient. We need this at the Lab.”
Jhi-Young Joo, an LLNL power systems engineer focused on power grid resilience, said she attended to understand the newest reasoning model capabilities for her research. She used the o3-mini model to ask questions about fault prediction for battery systems and engaged in a lengthy dialogue with the AI model.
“It asked me questions [back] to narrow down what I was looking for, which I thought was useful, and then it took probably 10-15 minutes to do a lot of research,” she said. “It gave me extensive citations and a good summary of what it found. The background research was for a proposal that I'm writing for a project, so it was practical and urgent for me. I just want to get a better sense of what this tool can do, so I can use it for the future.”
LLNL data scientist Bogdan Kustowski, who works on inertial confinement fusion (ICF), said he participated to assess AI’s potential for ICF applications.
“We want to get an understanding of what the reasoning models can do — perhaps asking some specific questions and learning how to ask questions in such a way that the model at least has a better chance of answering them,” Kustowski said.
AI for science: insights and future directions
LLNL's Brian Van Essen, a senior computer scientist and contributor to the AI Community of Practice committee behind the event, said his team is focused on improving molecule synthesis models, adding that current large language models face challenges in transparency and accuracy. Van Essen said his team, which is also developing its own foundation model, wanted the event to serve as a testbed to gauge the effectiveness of the OpenAI models and improve the time to result.
“We’re asking, ‘Can we better represent data on molecule synthesis to improve AI reasoning models?’” Van Essen said. “I’ve been brainstorming and conducting a kind of literature search with the model saying, ‘Here's where we're at, here's the state of the practice and these are the gaps our technical team has seen.’ I'm just articulating this [in] natural language to the model and then working with it as a research assistant.”
Feedback from all jam sessions across the DOE complex will be shared to improve future AI systems so that they are built with scientists’ needs in mind. OpenAI software engineer Eric Ning, who was on hand to assist scientists, emphasized the importance of AI models showing their reasoning when tackling complex scientific problems, and explained how collaborating with DOE will help refine OpenAI’s models for scientific applications.
“I hope in the future we'll see a stronger and stronger relationship [with the national labs] and can provide early access to even more powerful models, to support scientific research,” Ning said.
Lab event organizers highlighted the importance of public-private partnerships with OpenAI, Anthropic, Meta and others to drive scientific advancements and refine AI models to optimize efficiency, accuracy and scalability. Future Jam Sessions will explore models from various AI developers, broadening the scope of AI integration in national lab research, they said.
Spears noted that while AI reasoning models have advanced significantly in recent months, further improvements are needed. Organizers will analyze scientists’ prompts, model responses and performance data to pinpoint areas where AI falls short and work with partners to enhance its capabilities.
“We’re looking at [this data] as a total corpus,” Spears said. “We’re going to really put in perspective two classes of problems — these shining lights who really had a spark of ‘oh, this is something that we can do today that we didn't know we could do,’ and these other opportunities where the model is not really good yet, but we have ambitions, so let's go see with our vendor partners if [we] can go make those things capable. And that's fantastic.”
Contact

[email protected]
(925) 422-5539
Tags
HPC, Simulation, and Data Science
Industry Collaborations