AAAI 2024: LLMs4Bio
An open exchange of ideas to spur foundational research and LLM-enabled biological discoveries
About
Rapid advances in large language models (LLMs) provide an unprecedented opportunity to advance inquiry across scientific disciplines and domains. Despite remarkable feats on natural language tasks often considered exclusive indicators of human intelligence, the potential of LLMs beyond natural language has yet to be realized. Outstanding challenges in diverse scientific disciplines, such as molecular biology, materials science, climate science, geology, and hydrology, and in various domains within them, such as drug discovery, quantum material design, and weather forecasting, necessitate the integration of heterogeneous, multi-modal datasets resulting from diverse physical processes, as well as the injection of deep domain knowledge accumulated over decades of discovery about the processes that govern the natural and biological world. This workshop brings together researchers from various disciplines to formulate problem spaces, datasets, standards, and benchmarks, and to spur innovation on accessible and inclusive LLMs to power the next scientific breakthroughs.
The Call for Papers has been published
Program
Opening Remarks | 1:55PM – 2:00PM
Keynote Remarks: Sorin Draghici, NSF CISE:IIS:III Program Director. Molecular Biology Research at the National Science Foundation: An Information and Intelligent Systems Perspective. | 2:00PM – 2:20PM
Session I: Genomics | 2:20PM – 3:30PM
Invited Talk: Ramana Davuluri | 20 mins
P1. Jim Clauwaert, Zahra McVey, Ramneek Gupta, John Prensner and Gerben Menschaert. RIBO-former: applying attention-based neural networks for precise delineation of translated open reading frames using ribosome profiling data. | 10 mins
P2. Bernardo de Almeida and Thomas Pierrot. Large Language Models for Genomics. | 10 mins
P3. Evan Trop, Chia-Hsiang Kao, Mckinley Polen, Yair Schiff, Bernardo P. de Almeida, Aaron Gokaslan, Thomas Pierrot and Volodymyr Kuleshov. Advancing DNA Language Models: The Genomics Long-Range Benchmark. | 7 mins
P4. Sam Boshar, Evan Trop, Bernardo P. de Almeida and Thomas Pierrot. Are Genomic Language Models All You Need? Exploring Genomic Language Models on Protein Downstream Tasks. | 7 mins
PANEL with Q&A for presenting authors of papers | 15 mins
Session II: Proteomics | 3:30PM – 4:15PM
P5. Asher Moldwin, Anowarul Kabir, and Amarda Shehu. A More Informative and Reproducible Remote Homology Evaluation for Protein Language Models. | 10 mins
P6. Amitesh Badkul, Li Xie, Shuo Zhang and Lei Xie. Trustworthy protein-ligand binding affinity prediction using large protein language model. | 10 mins
P7. Shuo Zhang and Lei Xie. Protein Large Language Model-Powered 3D Ligand Binding Site Prediction from Protein Sequence. | 10 mins
PANEL with Q&A for presenting authors of papers | 15 mins
Session III: Bio & ChemInformatics | 4:15PM – 5:40PM
Invited Talk: Aidong Zhang | 20 mins
P8. Amir Shariatmadari and Aidong Zhang. Harnessing the Power of Knowledge Graphs to Enhance LLM Explainability in the BioMedical Domain. | 10 mins
P9. Jiayu Chang, Shiyu Wang, Chen Ling, Zhaohui Qin and Liang Zhao. Gene-associated Disease Discovery Powered by Large Language Models. | 10 mins
P10. Xiuyuan Hu, Guoqing Liu, Yang Zhao and Hao Zhang. Empirical Evidence for the Fragment level understanding on Drug Molecular Structure of LLMs. | 10 mins
P11. Gaetan De Waele, Anneleen D Wieme, Gerben Menschaert, Peter Vandamme and Willem Waegeman. Transformers for MALDI-TOF mass spectra. | 7 mins
P12. Xinyang Liu, Bowei Fang, Ping Zhang, Bo Chen and Huaxiu Yao. MolGuide: 2D Molecular Optimization with Preserved Structural Motifs Guidance. | 7 mins
PANEL with Q&A for presenting authors of papers | 20 mins
Moderated Discussion with all Participants | 5:40PM – 6:00PM
Concluding Remarks | 6:00PM
Keynote Speakers
Sorin Draghici
Professor, Wayne State University
Aidong Zhang
Thomas M. Linville Professor, University of Virginia
Ramana Davuluri
Professor, Stony Brook University
Organizers
Dr. Amarda Shehu
Professor, Associate Dean of Engineering for AI Innovation and Associate Vice President of Research, George Mason University
Dr. Amarda Shehu is a Professor in the Department of Computer Science at George Mason University, where she is also Associate Dean of Engineering for AI Innovation and Associate Vice President of Research for the Institute of Digital InnovAtion. Shehu obtained her Ph.D. from Rice University in 2008, where she was also an NIH predoctoral fellow in the Nanobiology Program and was dually trained in AI and Molecular Biophysics. Shehu's research bridges foundational AI research and AI in support of progress across scientific disciplines. Her scholarship includes work at the intersection of stochastic optimization, planning, deep learning, optimization for deep learning, bioinformatics, health informatics, applied NLP, and more. She is a 2022 Fellow of the American Institute for Medical and Biological Engineering (AIMBE) and is passionate about bridging scientific communities. She regularly delivers keynotes and invited talks (IEEE BIBM 2022, BIOKDD 2021, ICCABS 2020, etc.) and organizes tutorials and workshops at ACM and IEEE bioinformatics and evolutionary computation conferences, as well as journal collections and special issues (PLoS Comput Biol, Curr Opin Struct Biol, Molecules, Intl J Mol Sci, etc.), to advance transdisciplinary research. Shehu also served as Program Director in the CISE directorate at NSF from 2019 to 2022, during which time she galvanized various sub-communities of CISE, BIO, CHEM, and ENG researchers in transdisciplinary research, education programs, and workshops. She has extensive expertise bringing siloed communities together in workshops (Women @ GECCO, CSBW, AAAI Workshops, etc.), with a strong emphasis on integration and leadership for historically-minoritized researchers.
Dr. Yana Bromberg
Professor, Departments of Computer Science and Biology, Emory University, Hans Fischer Fellow, Institute for Advanced Study, Technical University of Munich
Dr. Yana Bromberg is a Professor in the Departments of Computer Science and Biology at Emory University and a Hans Fischer Fellow at the Institute for Advanced Study at the Technical University of Munich. She is a noted researcher in computational molecular biology. A former NLM Biomedical Informatics doctoral student, she is trained in molecular biology and bioinformatics. She is known for her work in annotation of genome variant-caused functional changes, tracking of microbiome functional shifts due to environmental pressures, and exploration of protein structures at the origins of life. Her seminal SNAP method, the first neural network-based tool for predicting functional impacts of single amino acid substitutions, has enabled novel variant assessment methods, whole genome analyses, and methods for mapping genetic architectures of disease. Bromberg has advanced computational annotation, prediction, and analysis of protein functionality, establishing the library of protein folds at the origins of life. Her exploration of the microbial world has led to advances in understanding molecular functional diversity carried out by individual microbes and emergent functionality in microbial communities. Bromberg is a director of the International Society for Computational Biology and has filled a number of leading roles in the organization of ISMB, the society's annual flagship conference. She consistently delivers keynotes and invited talks at national and international conferences (ISMB, GRC, etc.) and lectures at relevant workshops and courses (STAMPs workshop @MBL, the University of Bologna International Masters Bioinformatics program, etc.).
Dr. Liang Zhao
Associate Professor, Department of Computer Science, Emory University
Dr. Liang Zhao is an Associate Professor in the Department of Computer Science at Emory University with extensive experience in spatiotemporal data mining, network modeling, deep learning, interpretable machine learning, and nonconvex and distributed optimization. His work is published at top conferences and journals, including KDD, NeurIPS, ICLR, ICDM, AAAI, IJCAI, WWW, TKDE, CSUR, PIEEE, and TPAMI. He received the NSF CAREER Award, Amazon Research Award, Meta Research Award, Cisco Faculty Research Award, and Jeffress Trust Award. His paper on deep subgraph anomaly detection won the Best Paper Award at ICDM 2022. His paper on interpretable representation learning was on the Best Paper Shortlist at WWW 2021. He has also won Best Paper Runner-up and Best Paper Candidate at ACM SIGSPATIAL 2022. His paper on deep graph transformation and applications to molecule reaction prediction won the Best Paper Award at ICDM 2019. His recent book "Graph Neural Networks: Foundations, Frontiers, and Applications" has received over 100K accesses on the publisher's SpringerLink website. Its Chinese version won the Best-Seller Award from Post & Telecom Press. Zhao has disseminated over 30 software tools and numerous datasets.
Student Co-Organizers
George Mason University
Samuel Blouir
Role: Speaker Engagement
Anowarul Kabir
Role: PC Engagement
Manpriya Dua
Role: DEI Ambassador
Asher Moldwin
Role: Social Media
Emory University
Bo Pan
Role: DEI Ambassador
Yijun Liu
Role: Workshop Website
Chen Ling
Role: Social Media
R. Prabakaran
Role: PC Engagement