AAAI 2025 Tutorial
AI for Science in the Era of Large Language Models
Department of Computer Science, Virginia Tech, USA
Time: February 26, 2025, 8:30 am - 12:30 pm EST
Location: Pennsylvania Convention Center, Room 119A, Philadelphia, Pennsylvania
Abstract:
The capabilities of AI in science span a wide spectrum, from the atomic level, where it solves partial differential equations for quantum systems, to the molecular level, where it predicts chemical and protein structures, and even to societal predictions such as infectious disease outbreaks. Recent advances in large language models (LLMs), exemplified by models like ChatGPT, have demonstrated remarkable capabilities on natural language tasks such as translating languages, building chatbots, and answering questions. Much scientific data resembles natural language in its sequential structure: scientific literature and health records presented as text, bio-omics data arranged in sequences, and sensor data such as brain signals. The question arises: can we harness the potential of these recent LLMs to drive scientific progress? In this tutorial, we will explore the application of large language models to three crucial categories of scientific data: 1) textual data, 2) biomedical sequences, and 3) brain signals. Furthermore, we will discuss the challenges LLMs face in scientific research, including ensuring trustworthiness, achieving personalization, and adapting to multi-modal data representation.
Tutorial Recording:
A recording of our tutorial will be available here after the conference.
Slides [Combined]:
- Introduction [Slides]
- Part I: Scientific Text [Slides]
- Part II: Brain Signals [Slides]
- Part III: Biological Sequences [Slides]
- Summary [Slides]
Presenters:
Xuan Wang is an Assistant Professor in the Department of Computer Science at Virginia Tech. Her research interests are in natural language processing, data mining, AI for science, and AI for healthcare. Her current research directions include natural language understanding with limited supervision, complex reasoning and planning with large language models, and scientific discovery with multi-modal science foundation models. She was a recipient of the Cisco Research Award 2025, the NSF NAIRR Pilot Award 2024-2025, and the NAACL Best Demo Paper Award 2021. She received a Ph.D. degree in Computer Science, an M.S. degree in Statistics, and an M.S. degree in Biochemistry from the University of Illinois Urbana-Champaign in 2022, 2017, and 2015, respectively, and a B.S. degree in Biological Science from Tsinghua University in 2013. She has delivered tutorials at IEEE BigData 2019, WWW 2022, KDD 2022, and EMNLP 2024.