Author Archives: Byung-Jun Yoon

About Byung-Jun Yoon

I am a Professor at Texas A&M University, Department of Electrical & Computer Engineering, and a Scientist at Brookhaven National Laboratory, Computational Science Initiative. My research interests include objective-based uncertainty quantification, optimal experimental design (OED), machine learning, and signal processing. Application areas of interest include bioinformatics, computational network biology, and AI-driven drug/materials discovery.

[Call for Papers] Special Collection in “Accelerating scientific discoveries through data-driven innovations” (Patterns – a Cell Press journal)

We are happy to announce a new Special Collection in “Accelerating scientific discoveries through data-driven innovations” in Patterns.

Patterns is a premium open-access journal from Cell Press, publishing ground-breaking original research across the full breadth of data science. The journal aims to share data science solutions to problems that cross domain boundaries.

Developing artificial intelligence (AI) and machine learning (ML) methods and models that can accelerate scientific discoveries and advance science has become one of the important research directions for the AI/ML research community. It has been gaining increasing attention from researchers in diverse scientific areas, including biomedical science, materials science, climate science, physics, chemistry, and many others. Data-driven AI/ML innovations that enable reliable predictions and optimal decision-making for scientific discoveries face several critical challenges, including high system complexity, large search spaces, incomplete knowledge, and small data, all of which demand novel strategies. Meeting these challenges, and thereby accelerating scientific discoveries and industrial innovations, calls for research that takes full advantage of the latest advances in AI/ML to integrate data-driven techniques with scientific knowledge and principles, and that can execute them at scale in a modern HPC environment.

We invite researchers working at the forefront of accelerating scientific discoveries through innovations in machine learning (ML), AI, and data-driven modeling to submit their latest research findings to the special collection to inspire the next wave of data-driven innovations in various scientific domains.

Specific topics of interest include, but are not limited to:

  • Cross-cutting research at the intersections of mathematics, ML/AI, and computing
  • Scientific ML/AI integrating data-driven techniques with scientific knowledge/principles
  • Uncertainty-aware techniques of learning, inference, and optimization
  • Computational challenges for high-performance computing and data at extreme scales
  • Data-driven discoveries/innovations in complex scientific and industrial problems

We solicit papers that showcase the state of the art in scientific ML/AI, applied mathematics, and high-performance computing, and that address the challenges stemming from large-scale data, extreme-scale computing requirements, and the need for decision-making in complex systems under significant uncertainty, with the overall aim of accelerating scientific discoveries through data-driven approaches. While we primarily seek original research articles that report the latest findings and breakthroughs in the field, we also welcome the other article types considered by Patterns, including review and perspective articles.

Manuscripts should be prepared according to the guide for authors and should be submitted online, mentioning in the cover letter that you are submitting for the “Accelerating scientific discoveries through data-driven innovations” special collection.

The article processing charge will be waived for the first five manuscripts to be accepted for publication.

For further information, visit the following link on the Patterns website:

https://www.cell.com/patterns/special-issues/call-for-papers/accelerating-scientific-discoveries-through-data-driven-innovations

Accelerating optimal experimental design (OED) via deep learning

Various real-world scientific applications involve the mathematical modeling of complex uncertain systems with numerous unknown parameters. Accurate parameter estimation is often practically infeasible in such systems, as the available training data may be insufficient and the cost of acquiring additional data may be very high. In such cases, it may be desirable to represent the uncertainty present in the model in a Bayesian paradigm, based on which one may design robust operators that retain good overall performance across all possible models. Furthermore, one may design optimal experiments that can effectively reduce the uncertainty so as to significantly enhance the performance of the robust operators.

While objective-based uncertainty quantification (objective-UQ) based on MOCU (mean objective cost of uncertainty) provides an effective means for quantifying uncertainty in complex systems, the high computational cost of estimating MOCU has been a practical challenge in applying it to real-world scientific/engineering problems.
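
To make the source of this cost concrete, the following is a minimal Monte Carlo sketch of MOCU for a toy problem. It assumes a finite set of candidate operators and a list of parameter samples drawn from the prior. The names estimate_mocu and squared_error and the toy numbers are illustrative assumptions, not code from our work. MOCU is computed as the expected cost of the robust operator (the one with the lowest average cost over the uncertainty class) minus the average cost of the operators that are optimal for each individual model.

```python
from typing import Callable, Sequence, Tuple


def estimate_mocu(
    theta_samples: Sequence[float],
    operators: Sequence[float],
    cost: Callable[[float, float], float],
) -> Tuple[float, float]:
    """Monte Carlo MOCU estimate over a finite operator class (toy sketch).

    Returns (mocu_estimate, robust_operator).
    """
    n = len(theta_samples)
    # Average cost of each candidate operator across the sampled models.
    expected = [
        sum(cost(theta, psi) for theta in theta_samples) / n for psi in operators
    ]
    # Robust operator: lowest average cost over the whole uncertainty class.
    robust_idx = min(range(len(operators)), key=expected.__getitem__)
    # Average cost achieved if the true model were known for every sample.
    # This inner minimization per sample is what makes estimation expensive
    # when the operator class or the model is large.
    optimal_mean = (
        sum(min(cost(theta, psi) for psi in operators) for theta in theta_samples) / n
    )
    return expected[robust_idx] - optimal_mean, operators[robust_idx]


def squared_error(theta: float, psi: float) -> float:
    # Hypothetical cost: squared distance between the operator and the model.
    return (theta - psi) ** 2


value, robust = estimate_mocu([0.0, 1.0], [0.0, 0.5, 1.0], squared_error)
print(value, robust)  # 0.25 0.5
```

When the uncertainty class collapses to a single model, the robust operator coincides with the optimal one and the estimate is zero, which is the sense in which MOCU measures only the uncertainty that affects the objective.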

In our recent work, we proposed a novel deep learning (DL) scheme that reduces the computational cost of MOCU-based objective-UQ, significantly accelerating both MOCU estimation and optimal experimental design (OED).

Qihua Chen, Xuejin Chen, Hyun-Myung Woo, Byung-Jun Yoon, “Neural Message Passing for Objective-Based Uncertainty Quantification and Optimal Experimental Design,” Engineering Applications of Artificial Intelligence, Volume 123, Part A, 106171, 2023, https://doi.org/10.1016/j.engappai.2023.106171

In the above study, we trained a message-passing neural network (MPNN) as a surrogate MOCU estimator, incorporating a novel axiomatic constraint loss that improves the estimation performance, and ultimately, the OED outcomes. Our results show that the proposed scheme can accelerate MOCU-based OED by four to five orders of magnitude, without any visible performance loss.

For further details, the paper can be accessed via the DOI link above.

The 2023 ACM-BCB conference will be held in Houston, TX, September 3-6, 2023.

This year, the 14th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB), the flagship conference of ACM SIGBio, will be held in Houston, TX, September 3-6, 2023. ACM-BCB is being organized for the fourteenth year, building upon the success of the first thirteen meetings in various locations, including Boston, Niagara Falls, Chicago, Orlando, Washington DC, Newport Beach, Atlanta, and Seattle.

How to design a high-throughput virtual screening (HTVS) pipeline for efficient selection of redox-active organic materials

As global interest in renewable energy continues to increase, there is a pressing need to develop novel energy storage devices based on organic electrode materials that can overcome the shortcomings of current lithium-ion batteries. One critical challenge in this quest is to find materials whose redox potential (RP) meets specific design targets.

In our recent study below, we proposed a computational framework for addressing this challenge:

Hyun-Myung Woo, Omar Allam, Junhe Chen, Seung Soon Jang, Byung-Jun Yoon, “Optimal high-throughput virtual screening pipeline for efficient selection of redox-active organic materials,” iScience (2023), doi: https://doi.org/10.1016/j.isci.2022.105735

Given a high-fidelity model for estimating the RP of a given material, we showed how a set of surrogate models with different accuracy and complexity may be designed to construct a highly accurate and efficient HTVS pipeline. The performance of the screening campaigns based on the constructed HTVS pipeline can be optimized by designing the optimal screening policy, which enables rapid screening of organic materials that satisfy the desired criteria. We demonstrated that the proposed HTVS pipeline construction and operation strategies can substantially enhance the overall screening throughput.
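
The idea of a multi-stage pipeline can be sketched as follows. This is a simplified illustration under stated assumptions, not the pipeline from the paper: the Stage class, the screen function, the toy surrogate, and the hand-picked thresholds and costs are all hypothetical. In particular, the thresholds here are fixed by hand, whereas the study is about choosing the screening policy optimally.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Stage:
    predict: Callable[[float], float]  # estimated property of a candidate
    threshold: float                   # candidates scoring below this are discarded
    cost: float                        # compute cost per evaluated candidate


def screen(candidates: Sequence[float], stages: Sequence[Stage]) -> Tuple[List[float], float]:
    """Pass candidates through the stages in order, tracking total compute cost."""
    survivors = list(candidates)
    total_cost = 0.0
    for stage in stages:
        # Every candidate that reached this stage must be evaluated by it.
        total_cost += stage.cost * len(survivors)
        survivors = [c for c in survivors if stage.predict(c) >= stage.threshold]
    return survivors, total_cost


# Toy setup: each candidate is represented directly by its true property value.
candidates = [0.5, 1.5, 2.5, 3.5, 4.5]
# Hypothetical cheap surrogate: a coarse (truncated) estimate with a lenient threshold.
cheap = Stage(predict=lambda c: float(int(c)), threshold=2.0, cost=1.0)
# Hypothetical high-fidelity model: exact value, strict target, expensive.
high_fidelity = Stage(predict=lambda c: c, threshold=3.0, cost=100.0)

survivors, total_cost = screen(candidates, [cheap, high_fidelity])
print(survivors, total_cost)  # [3.5, 4.5] 305.0

baseline, baseline_cost = screen(candidates, [high_fidelity])
print(baseline, baseline_cost)  # [3.5, 4.5] 500.0
```

In this toy case the cheap stage discards two candidates before the expensive model runs, so the same materials are selected at a lower total cost. A threshold that is too strict at an early stage would instead discard good candidates, which is why the choice of screening policy matters.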

Further details of this study can be found in iScience.

A graphical abstract illustrating the optimal design and operation of a high-throughput virtual screening (HTVS) pipeline for efficient selection of redox-active organic materials.

Dr. Yoon’s collaborative work with COVID ACT NOW and Mass General Hospital is featured on the cover of Patterns from Cell Press

The paper “Looking back on forward-looking COVID models”, of which Dr. Byung-Jun Yoon is a co-author, has been featured on the cover of the July 2022 issue of Patterns, a premium open access journal from Cell Press. This paper is an outcome of Dr. Yoon’s ongoing collaboration with COVID ACT NOW (CAN) – the COVID-focused U.S. initiative of Act Now Coalition – and the Massachusetts General Hospital (MGH).

The cover art of the July 2022 issue of Patterns features Dr. Yoon’s collaborative project with Covid Act Now and Mass General Hospital.

In this work, we presented the epidemiological model by Covid Act Now (CAN) and evaluated its performance by back-testing against historical data. For comparison, similar analyses were performed for several other COVID models and the obtained results were compared. It was found that all models generally captured the potential magnitude and directionality of the pandemic in the short term. While there are limitations to epidemiological models, understanding these limitations enables these models to be utilized as tools for “data-driven decision-making” in viral outbreaks.

“Face the Music” – the cover image on the July 2022 issue of Patterns – was created by Anna Maybach, resident artist at Act Now Coalition.

The cover image is entitled “Face the Music”. The American idiom “face the music” means to accept consequences. It is thought to originate from an exhortation to face one’s stage fright. The sound waves in this image were created by superimposing Covid Act Now trend graphs representing cases, hospitalizations, ICU hospitalizations, and deaths in the United States since March 2020. This visualization represents the path of accepting the consequences of our actions and facing our fears in order to navigate the COVID pandemic: to “face the music.”

The paper can be accessed at the link below:

Paul Chong, Byung-Jun Yoon, Debbie Lai, Michael Carlson, Jarone Lee, Shuhan He, “Looking Back on Forward-Looking COVID Models,” Patterns, volume 3, issue 7, 100492, July 08, 2022, https://doi.org/10.1016/j.patter.2022.100492.

Craig H. Neilsen Foundation will fund a new research project on reinforcement learning for optimal electrical stimulation for spinal cord injury (SCI) patients

According to the National Spinal Cord Injury Statistical Center, approximately 18,000 new spinal cord injuries (SCI) occur each year in the United States. Spinal cord injuries often lead to serious constipation or incontinence, which can lead to decreased quality of life and may even be life-threatening. After a spinal cord injury, 41% of patients rated bowel dysfunction as a severe life-limiting problem.

The Craig H. Neilsen Foundation recently announced that it will fund a new research project that aims to develop an optimal electrical stimulation method via reinforcement learning (RL) to alleviate bowel dysfunction in spinal cord injury patients. In this project, Dr. Byung-Jun Yoon will collaborate with Dr. Hangue Park (PI, Texas A&M Electrical and Computer Engineering) and Dr. Cedric Geoffroy (Texas A&M College of Medicine) to develop a closed-loop stimulation scheme, which ultimately aims to improve the quality of life for SCI patients as well as their caregivers.

For more information, please visit:

https://engineering.tamu.edu/news/2022/06/ecen-electrical-stimulation-could-help-bowel-dysfunction-after-spinal-cord-injury.html

Our recent work on transfer learning for error estimation has been featured on ACM Tech News & DOE Office of Science website

This week’s ACM Tech News has featured our recent work on Transfer Learning for Bayesian Error Estimation (TL-BEE):

https://technews.acm.org/archives.cfm?fo=2022-03-mar/mar-11-2022.html

This study has also been spotlighted on the Department of Energy (DOE) Office of Science website:

https://www.energy.gov/science/office-science

The work has been recently published in Cell Press Patterns, and the full article can be accessed at the link below:

Omar Maddouri, Xiaoning Qian, Francis J. Alexander, Edward R. Dougherty, Byung-Jun Yoon, “Robust Importance Sampling for Error Estimation in the Context of Optimal Bayesian Transfer Learning,” Patterns, 2022, DOI: https://doi.org/10.1016/j.patter.2021.100428

Omar Maddouri’s recent work on transfer learning for Bayesian error estimation featured on Texas A&M College of Engineering website

Recent work by Omar Maddouri, currently a Ph.D. candidate in the BioMLSP lab, on transfer learning for Bayesian error estimation has been featured in an article entitled “Doctoral student offers new insight into machine-learning error estimation”, which has been published on the Texas A&M College of Engineering website.

The article can be found at: https://engineering.tamu.edu/news/2022/03/ecen-doctoral-student-offers-new-insight-into-machine-learning-error-estimation.html

Our paper on error estimation via optimal Bayesian transfer learning has now been published in Patterns

Our recent study on Bayesian error estimation via optimal Bayesian transfer learning has been published in Patterns, a premium open access journal from Cell Press that publishes ground-breaking original research across the full breadth of data science.

Omar Maddouri, Xiaoning Qian, Francis J. Alexander, Edward R. Dougherty, Byung-Jun Yoon, “Robust Importance Sampling for Error Estimation in the Context of Optimal Bayesian Transfer Learning,” Patterns, DOI: https://doi.org/10.1016/j.patter.2021.100428.

Finding robust biomarkers for more accurate and reproducible disease diagnosis/prognosis

Identifying robust diagnostic/prognostic biomarkers from gene expression data that can lead to accurate and reproducible predictions is a challenging problem.

In our recent paper “Deep graph representations embed network information for robust disease marker identification”, we show that deep graph representations using graph convolutional networks (GCNs) can identify effective markers that significantly improve both predictive performance and reproducibility across different datasets and platforms.

Omar Maddouri, Xiaoning Qian, Byung-Jun Yoon, “Deep graph representations embed network information for robust disease marker identification,” Bioinformatics, btab772, https://doi.org/10.1093/bioinformatics/btab772.
