Discover Science – Applications Open for Spring 2027 Undergraduate Internships

Are you an undergraduate student who is passionate about scientific research? Please see the announcement below from DOE regarding an exciting opportunity for undergraduate internships!

An amazing opportunity — students and recent grads may apply to conduct research and technical projects at DOE National Laboratories

The Department of Energy (DOE) has begun accepting applications for the Spring 2027 term of the Science Undergraduate Laboratory Internships (SULI) program and the Community College Internships (CCI) program. Mentored by scientists, engineers, and technical professionals, interns will acquire new skills through hands-on learning, be exposed to science and technology careers, and expand their professional network to advance their educational and career goals. The interns will have the opportunity to work with a team of experts to solve real-world problems and contribute to science and innovation that support the DOE mission, including the Genesis Mission and new frontiers in artificial intelligence, quantum, nuclear energy and technology, biotechnology, critical minerals and materials, and fusion science and engineering.

SULI is open to full-time students attending 4-year institutions and community colleges, and to recent graduates within two years of receiving their bachelor’s degree or associate degree. CCI is for community college students, including those who are enrolled part-time. Both programs are stipend-based and offered three times annually, in fall, spring, and summer. Participants residing outside the commuting area are offered round-trip travel to and from the host lab, and financial assistance with lodging. The application deadline for both programs is September 30, 2026, at 5:00 p.m. ET. Workshops, office hours, and technical support will be available to applicants throughout the process.

Workshop Details

Two workshops are planned to introduce the programs, explain the application process, and provide strategies for submitting a compliant application.

August 13, 2026, at 2:00 pm: SULI application assistance workshop (Register here)
August 25, 2026, at 2:00 pm: CCI application assistance workshop (Register here)

Office Hours

The program office invites applicants and letter of recommendation writers to attend office hours. In these office hours, program staff members will answer administrative questions, including those related to uploading transcripts and submitting letters of recommendation. Timing and registration details are listed below and posted on the program website.

SULI/CCI Office Hours: September 9, September 16, September 23, 2026, at 2:00 pm ET. Register here

The programs offer unique opportunities to gain a glimpse into discovery science that is at the cutting edge of national priorities like the Genesis Mission. Learn more about the internship experience through the many Lab Stories shared by previous participants.

SULI and CCI are managed by the Office of Workforce Development for Teachers and Scientists (WDTS) in the Office of Science. More information can be found at https://science.osti.gov/wdts

Turning Uncertainty into a Design Tool for AI-engineered Molecules

Our recent work on harnessing model uncertainty to improve generative molecular design has been featured on Brookhaven National Laboratory’s website. The full article can be accessed at: https://www.bnl.gov/newsroom/news.php?a=222882

This work was a collaboration between Texas A&M and BNL, involving Nafiz Abeer (TAMU), Sanket Jantre (BNL), and Nathan Urban (BNL).

H/T to Charity Plata (BNL CDS) for the wonderful article!

Accelerating AI-driven antigen design

Last week, researchers at Texas A&M and UTMB, who are part of the ARPA-H APECx “SPHERICAL” project team, convened for an intensive (but fun!) 1.5-day workshop focused on accelerating progress in AI-driven antigen design.

At the heart of the discussions was the AI-based antigen design pipeline developed by the Texas A&M team, which has been applied to optimize the design of candidate antigens for Mayaro virus (MAYV) and Rift Valley fever virus (RVFV). This represents an important step toward rapid, data-driven countermeasure development that effectively leverages the latest advances in AI, protein language modeling, and protein structure prediction.

A major milestone highlighted during the workshop was the availability of the first batch of experimental results. UTMB successfully synthesized the antigens designed by the AI pipeline and evaluated them using neutralization assays, providing critical validation data for the computational predictions.

Our team engaged in in-depth discussions analyzing the correlation between the predicted antigen properties and the observed neutralization performance. We expect that these insights will be instrumental in refining the current antigen design pipeline, improving the predictive accuracy of the AI models utilized therein, and strengthening the overall design framework in a closed-loop fashion.

The workshop concluded with forward-looking conversations between team members at the two institutions on enhancing the pipeline, including strategies to improve robustness, incorporate uncertainty quantification, and extend its capabilities toward predicting broadly effective antigens against diverse viral variants.

This collaborative effort underscores the power of integrating AI, experimental validation, and interdisciplinary expertise to accelerate the development of next-generation antigen design strategies.

For more information about the ARPA-H APECx program that is sponsoring the project, please visit: https://arpa-h.gov/explore-funding/programs/apecx

Uncertainty-aware optimization of generative molecular design models featured on the cover of RSC Journal MSDE’s Feb. 2026 issue

How can we explore model uncertainty to improve generative molecular design?

In our recent paper entitled “Enhancing generative molecular design via uncertainty-guided fine-tuning of variational autoencoders,” just featured on the cover of the Feb. 2026 issue of RSC Molecular Systems Design & Engineering (MSDE), we propose an uncertainty-aware model optimization strategy.

Specifically, our strategy takes the latent points found by any latent space optimization approach and explores the uncertainty class of the generative molecular design model (in this case, a variational autoencoder) through its low-dimensional active subspace. The goal is to find models that improve the properties of the molecules corresponding to those latent points.

Further details of our work can be found at: https://pubs.rsc.org/en/content/articlelanding/2025/me/d5me00081e

ACM-BCB 2026 will be held in Calabria, Italy (June 30-July 3, 2026)

The 17th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB 2026) is the flagship conference of ACM SIGBio. It will come to Europe for the first time and will be held in Calabria, Italy, from June 30 to July 3, 2026. The conference aims to promote big data, AI, and algorithms for health and biomedicine, including cutting-edge advances in computational biology, bioinformatics, and health informatics, at the intersection of computer science, biology, health, and medicine. The conference will be hosted at the University of Calabria Congress Center.

This year, for the first time, papers submitted to ACM-BCB will go through a two-phase review process, with the aim of publishing accepted papers in a new ACM journal format. After the first phase, authors can refine their work based on constructive critiques from reviewers, and the updated version is then considered in the second phase. Accepted papers will be included in a new journal published by ACM.

For further information about ACM-BCB 2026 and important dates, please visit the conference website: https://acm-bcb.org

Variable rate neural compression of massive particle collider data from next-generation particle physics experiments

Next-generation particle physics experiments, including those conducted in heavy-ion and electron-ion colliders, aim to recreate the extreme conditions of the early universe and to probe how matter behaves at the smallest scales. Their detectors, such as large time projection chambers (TPCs), capture three-dimensional particle tracks from every collision, producing enormous data volumes at high speed. These rich datasets contain the detailed structures scientists need to study nuclear matter, map the internal structure of protons and nuclei, and search for physics beyond the standard model. Yet, the sheer scale of the data presents a growing challenge: storing all events in full is increasingly impractical, while conventional data-reduction strategies—such as discarding events based on predefined triggers or applying generic compression—may remove rare or unexpected signals that are central to discovery. Complicating matters further, TPC data are highly sparse: only a tiny portion of the detector records activity, but the specific pattern of those signals is essential for scientific interpretation.

In our recent paper entitled “Variable rate neural compression for sparse detector data”, published in Patterns – an open access journal by Cell Press in the broad area of Data Science – we explore a data-driven approach to compression that learns directly from the structure of the data themselves.

Yi Huang, Yeonju Go, Jin Huang, Shuhang Li, Xihaier Luo, Thomas Marshall, Joseph Osborn, Christopher Pinkenburg, Yihui Ren, Evgeny Shulga, Shinjae Yoo, Byung-Jun Yoon, “Variable Rate Neural Compression for Sparse Detector Data,” Patterns, 2026, 101452, https://doi.org/10.1016/j.patter.2025.101452.

Instead of relying on physics-specific rules, we propose a method that identifies and prioritizes the most informative signals within each event, allowing storage to scale with the true complexity of the measurement. Such adaptive, learning-based strategies could help future experiments preserve far more information without exceeding storage or computing limits, and they may offer a general framework for handling large, sparse datasets across a wide range of scientific domains.

For further details, please visit the Patterns website to read the whole paper: https://www.cell.com/patterns/fulltext/S2666-3899(25)00300-9

How to explore model uncertainty to enhance generative molecular design

In recent years, artificial intelligence (AI) has made big strides in helping scientists design new molecules, whether for life-saving drugs or advanced materials. In particular, generative AI models have received significant interest from the research community for designing new molecules, opening exciting possibilities in science and medicine.

However, one big challenge remains: these models are usually trained once and then used “as-is.” Adapting them to target-specific molecular characteristics – for example, to improve a molecule’s activity against specific targets or to make the designed molecule easily synthesizable – is challenging.

In a recent publication entitled “Enhancing generative molecular design via uncertainty-guided fine-tuning of variational autoencoders,” published in RSC Molecular Systems Design & Engineering, we have proposed a smarter way to fine-tune generative AI models by quantifying and leveraging model uncertainty.

A N M Nafiz Abeer, Sanket Jantre, Nathan M Urban, and Byung-Jun Yoon, “Enhancing Generative Molecular Design via Uncertainty-guided Fine-tuning of Variational Autoencoders,” Molecular Systems Design & Engineering, 2025, https://doi.org/10.1039/D5ME00081E.

In this paper, we proposed an uncertainty-guided fine-tuning strategy that can effectively enhance a pre-trained variational autoencoder (VAE) for generative molecular design (GMD) through performance feedback in an active learning setting. The strategy begins by quantifying the model uncertainty of the generative model using an efficient active subspace-based UQ (uncertainty quantification) scheme. Next, the decoder diversity within the characterized model uncertainty class is explored to expand the viable space of molecular generation. Empirical results across six target molecular properties demonstrate that the uncertainty-guided fine-tuning strategy consistently leads to improved models that outperform the original pre-trained generative models.
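For readers unfamiliar with active subspaces, the general idea can be sketched in a few lines of NumPy. This is a minimal illustration of the standard gradient-covariance construction on a toy function, not the implementation from the paper: the function `active_subspace`, the toy loss, the dimension `D`, and the sample counts are all assumptions made up for this example.

```python
import numpy as np


def active_subspace(grads: np.ndarray, k: int) -> tuple[np.ndarray, np.ndarray]:
    """Estimate a k-dimensional active subspace from gradient samples.

    grads has shape (M, D): M sampled gradients of a scalar function of D parameters.
    Returns the eigenvalues in descending order and a (D, k) orthonormal basis.
    """
    cov = grads.T @ grads / grads.shape[0]   # (D, D) gradient covariance estimate
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    return eigvals[order], eigvecs[:, order[:k]]


# Toy setup (hypothetical): f(theta) = 0.5 * (w . theta)^2 varies along w only,
# so its gradient (w . theta) * w always points along the single direction w.
rng = np.random.default_rng(0)
D = 20
w = rng.normal(size=D)
w /= np.linalg.norm(w)
thetas = rng.normal(size=(200, D))
grads = (thetas @ w)[:, None] * w[None, :]

eigvals, basis = active_subspace(grads, k=1)
alignment = abs(basis[:, 0] @ w)             # close to 1 if the direction is recovered

# Perturb a reference parameter vector only within the active subspace.
theta0 = np.zeros(D)
z = rng.normal(size=(5, 1))                  # low-dimensional coordinates
samples = theta0 + z @ basis.T               # (5, D) parameter samples
print(alignment, samples.shape)
```

In this toy case a single direction carries all of the variation, so sampling one coordinate is enough to explore it. In a real model the number of retained directions would have to be chosen from the eigenvalue spectrum.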

For further details, please read the full paper at the following link: https://doi.org/10.1039/D5ME00081E

Nafiz Abeer wins Best Poster Award at the 16th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB 2025)

Nafiz Abeer, a senior Ph.D. student in the BioMLSP lab, recently presented his work on TCR-pMHC binding prediction at the 16th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB 2025), the flagship conference of ACM SIGBio. His poster presentation was selected by the ACM-BCB 2025 Poster Chairs for the Best Poster Award.

The interaction between the T-cell receptor (TCR) and the peptide-major histocompatibility complex (pMHC) is a crucial step in T-cell activation and the initiation of the T-cell-mediated adaptive immune response. Despite efforts to develop various computational approaches for TCR-pMHC binding prediction, identifying TCR recognition for new peptides forming pMHCs on the surface of antigen-presenting cells (APCs) remains a critical challenge. In his work, Nafiz proposed a novel dual-encoder strategy to effectively leverage predicted TCR-pMHC complex structures, thereby achieving better prediction performance for peptides that were not observed during training. His findings demonstrate that the dual-encoder-based approach often improves prediction performance (especially for EGNN) over existing single-encoder-based schemes.

ARPA-H APECx program site visit to Texas A&M University

On June 4, 2025, Dr. Andy Kilianski, Acting Deputy Director of Health Science Futures at the Advanced Research Projects Agency for Health (ARPA-H), visited Texas A&M University to meet with the team behind the SPHERICAL project.

SPHERICAL (Scientific Platform for High Efficacy Antigen Design via Robust Integration of Computational Experiments, AI, and Protein Modeling) is supported by ARPA-H’s APECx program. Over the past nine months, the entire SPHERICAL team and I have actively reimagined how AI-driven, uncertainty-quantified, and competence-aware modeling can accelerate the design of broadly effective antigens.

During the site visit, we had the pleasure of presenting our progress to Dr. Kilianski, discussing next steps, and sharing our vision for advancing the APECx mission. We look forward to continuing this collaboration and driving the project toward its milestones!

Multi-objective prompt optimization

Natural language prompt optimization, or prompt engineering, has emerged as a powerful technique to unlock the potential of Large Language Models (LLMs) for various tasks. While existing methods primarily focus on maximizing a single task-specific performance metric for LLM outputs, real-world applications often require considering trade-offs between multiple objectives.

In a recent paper entitled “Pareto Prompt Optimization”, we proposed an effective technique for multi-objective prompt optimization for LLMs to address this limitation. Specifically, the proposed method, called ParetoPrompt, takes a reinforcement learning (RL) approach that leverages dominance relationships between prompts to derive an effective policy model for prompt optimization using preference-based loss functions. By leveraging multi-objective dominance relationships, ParetoPrompt enables efficient exploration of the entire Pareto front without the need for a predefined (and typically heuristic) scalarization of multiple objectives. Experimental results show that ParetoPrompt consistently outperforms existing prompt optimization techniques on various benchmarks. Furthermore, ParetoPrompt demonstrates robust performance when the objective metrics differ between training and testing.
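To make the notion of a dominance relationship concrete, here is a small Python sketch. It only illustrates Pareto dominance and how dominance could be turned into (preferred, dispreferred) pairs. It is not the ParetoPrompt training code, and the prompt names, the two objectives, and the scores are hypothetical values invented for this example.

```python
from typing import Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if score vector a Pareto-dominates b (higher is better on every objective)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(scores: dict[str, Sequence[float]]) -> list[str]:
    """Names of the candidates that no other candidate dominates."""
    return [
        name
        for name, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != name)
    ]


def preference_pairs(scores: dict[str, Sequence[float]]) -> list[tuple[str, str]]:
    """(preferred, dispreferred) pairs implied by dominance; incomparable pairs are skipped."""
    return [
        (winner, loser)
        for winner, s in scores.items()
        for loser, t in scores.items()
        if dominates(s, t)
    ]


# Hypothetical prompts scored on two made-up objectives.
scores = {
    "prompt_a": (0.90, 0.40),
    "prompt_b": (0.70, 0.80),
    "prompt_c": (0.60, 0.70),  # dominated by prompt_b
    "prompt_d": (0.90, 0.30),  # dominated by prompt_a
}
front = pareto_front(scores)
pairs = preference_pairs(scores)
print(front)  # ['prompt_a', 'prompt_b']
print(pairs)  # [('prompt_a', 'prompt_d'), ('prompt_b', 'prompt_c')]
```

Note that prompt_a and prompt_b are incomparable: each is better on one objective, so neither is preferred over the other and both stay on the front. This is why no scalar weighting of the objectives is needed to compare candidates.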

Details of the work can be found in the following paper:

Guang Zhao, Byung-Jun Yoon, Gilchan Park, Shantenu Jha, Shinjae Yoo, Xiaoning Qian, “Pareto Prompt Optimization,” 13th International Conference on Learning Representations (ICLR), Singapore, Apr 24-28, 2025.

BioMLSP Lab

Machine Learning for Computational Network Biology @ Texas A&M University