I am an Associate Professor at Texas A&M University, Department of Electrical & Computer Engineering, and a Scientist at Brookhaven National Laboratory, Computational Science Initiative. My research is focused on machine learning and signal processing theories, models, and algorithms for various scientific applications, primarily bioinformatics and computational biology.
The 2nd Workshop on Knowledge Guided Machine Learning (KGML2021) is inviting researchers working in relevant fields to share their latest research findings, including work in progress.
KGML2021 is a workshop that brings together data scientists – researchers in data mining, machine learning, and statistics – and researchers from hydrology, atmospheric science, aquatic sciences, and translational biology to discuss challenges, opportunities, and early progress in bringing scientific knowledge to machine learning.
KGML2021 is a virtual conference held August 9-11 and welcomes submissions from researchers around the world.
ACM SIGBio bridges computer science, mathematics, and statistics with biology and biomedicine. The mission of ACM SIGBio is to improve our ability to develop advanced research, training, and outreach in Bioinformatics, Computational Biology, and Biomedical Informatics by stimulating interactions among researchers, educators and practitioners from related multi-disciplinary fields.
Dr. Yoon’s term as the Treasurer will be from July 1, 2021 until June 30, 2024. For more information about ACM SIGBio, please visit http://www.sigbio.org
Various real-world applications involve modeling complex systems with immense uncertainty and optimizing multiple objectives based on the uncertain model. Being able to quantify the impact of such model uncertainty on the operational objectives of interest is critical, for example, to design optimal experiments that can most effectively reduce the uncertainty that affects the objectives pertinent to the application at hand. In fact, such objective-based uncertainty quantification (objective-UQ) has been shown to be much more efficient for optimal experimental design (OED) compared to other approaches that do not explicitly aim at reducing the “uncertainty that actually matters”.
The concept of MOCU (mean objective cost of uncertainty) provides an effective means to quantify this objective uncertainty, but its original definition was limited to the case of single objective operations.
In our recent paper, we extend the original MOCU to propose the mean multi-objective cost of uncertainty (multi-objective MOCU), which can be used for objective-based quantification of uncertainty for complex uncertain systems considering multiple operational objectives:
Based on several examples, we illustrate the concept of multi-objective MOCU and demonstrate its efficacy in quantifying the operational impact of model uncertainty when there are multiple, possibly competing, objectives.
For more information regarding the concept of objective-UQ, optimal experimental design (OED), and other relevant resources, please visit: https://objectiveUQ.org
The multi-objective MOCU quantifies the expected performance gap between the robust multi-objective operator that needs to be used to maintain good performance in the presence of model uncertainty and the optimal multi-objective operator for the true (but unknown) model.
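As a rough illustration of the "expected performance gap" idea, the sketch below computes the original single-objective MOCU for a toy, finite uncertainty class. The cost table, the prior, and the function name `mocu` are all made up for illustration; the multi-objective extension described in the paper additionally has to handle several cost functions at once, which this sketch does not attempt.

```python
def mocu(cost, prior):
    """Single-objective MOCU for a finite uncertainty class (toy sketch).

    cost[i][j] -- cost of applying operator j when model i is the true model
    prior[i]   -- probability currently assigned to model i
    Returns (MOCU value, index of the robust operator).
    """
    n_ops = len(cost[0])
    # Expected cost of each operator, averaged over the uncertain models.
    expected = [sum(p * row[j] for p, row in zip(prior, cost))
                for j in range(n_ops)]
    # Robust operator: best on average across the uncertainty class.
    robust = min(range(n_ops), key=expected.__getitem__)
    # Expected gap between the robust operator and each model's own optimum.
    gap = sum(p * (row[robust] - min(row)) for p, row in zip(prior, cost))
    return gap, robust


# Hypothetical example: two candidate models, three operators.
cost = [[0.0, 1.0, 0.4],
        [1.0, 0.0, 0.4]]
prior = [0.5, 0.5]
value, robust = mocu(cost, prior)
print(value, robust)
```

In this toy case the robust choice is the "compromise" operator (index 2), and the MOCU is the price paid, on average, for not knowing which model is true. If the prior puts all of its mass on one model, the gap drops to zero.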
The final version of our paper entitled “Physics-constrained Automatic Feature Engineering for Predictive Modeling in Materials Science”, recently presented at the 35th AAAI Conference on Artificial Intelligence (AAAI-21), is now accessible at the link below:
Automatic Feature Engineering (AFE) aims to extract useful knowledge for interpretable predictions given data for the machine learning tasks of interest. In this work, we presented a novel AFE scheme that effectively extracts relationships from data that can be interpreted based on functional formulas to discover their “physical meaning” or “new hypotheses”. Here we focused on materials science applications, where interpretable predictive modeling may enhance our understanding of materials systems and also guide the discovery of new materials.
Typically, it is computationally prohibitive to exhaustively explore all potential relationships to identify interpretable and predictive features. We overcome this challenge by designing an AFE strategy that efficiently explores a feature generation tree (FGT) using a deep Q-network (DQN) for scalable and efficient exploration of the feature space in an automated manner. We demonstrate that our proposed DQN-based AFE strategy yields promising results when benchmarked against existing AFE methods based on several materials science datasets.
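To give a feel for what a feature generation tree looks like, here is a minimal sketch that expands a single level of such a tree and ranks the resulting candidate features. It is not the DQN-based strategy from the paper: it enumerates the level exhaustively, which is exactly the brute-force step that stops scaling as the tree gets deeper. The operator set, the Pearson-correlation score, and the toy data are assumptions chosen only for illustration.

```python
import itertools
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


# Hypothetical operator set; the toy data below is strictly positive,
# so inversion and division are safe here.
UNARY = {"sq": lambda v: v * v, "inv": lambda v: 1.0 / v}
BINARY = {"mul": lambda a, b: a * b, "div": lambda a, b: a / b}


def expand(features):
    """Generate all children one level below the given feature set."""
    children = {}
    for name, col in features.items():
        for op, fn in UNARY.items():
            children[f"{op}({name})"] = [fn(v) for v in col]
    for (n1, c1), (n2, c2) in itertools.permutations(features.items(), 2):
        for op, fn in BINARY.items():
            children[f"{op}({n1},{n2})"] = [fn(a, b) for a, b in zip(c1, c2)]
    return children


def best_feature(features, target):
    """Return the candidate whose values correlate most strongly with target."""
    candidates = {**features, **expand(features)}
    return max(candidates, key=lambda k: abs(pearson(candidates[k], target)))


# Toy data: the target is a density-like ratio of the two primitive features.
features = {"m": [2.0, 3.0, 5.0, 7.0, 11.0], "v": [1.0, 4.0, 2.0, 8.0, 3.0]}
target = [m / v for m, v in zip(features["m"], features["v"])]
best = best_feature(features, target)
print(best)
```

Each generated feature keeps its formula in its name, which is what makes the result interpretable. The number of candidates grows quickly with every additional level, which is the motivation for replacing exhaustive enumeration with a learned exploration policy.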
In this work, we present a general optimal experimental design (OED) strategy for an uncertain system that is described by coupled ordinary differential equations (ODEs), whose parameters are not completely known. As a vehicle to develop the OED strategy, we focus on non-homogeneous Kuramoto models in this study as the primary example. The proposed OED strategy quantifies the objective uncertainty of the Kuramoto model based on the mean objective cost of uncertainty (MOCU), where the optimal experiment can be identified by predicting the one in a given experimental design space that is expected to maximally reduce the MOCU.
Our study highlights the importance of quantifying the operational impact of the potential experiments in selecting the optimal experiment and it demonstrates that the MOCU-based OED scheme enables us to minimize the objective cost (i.e., cost of robust control in the application considered in this paper) of the uncertain Kuramoto model with the fewest experiments compared to other alternatives.
This work was performed in collaboration with Prof. Youngjoon Hong (Department of Mathematics, Sungkyunkwan University) and Prof. Bongsuk Kwon (Department of Mathematical Sciences, Ulsan National Institute of Science and Technology).
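The experiment-selection rule described above (pick the experiment expected to leave the smallest remaining MOCU) can be sketched on a toy, finite uncertainty class. This is not the Kuramoto setting from the paper: the three candidate models, the two operators, the cost table, and the outcome likelihoods are hypothetical, and all function names are ours.

```python
def mocu(cost, prior):
    """MOCU for a finite uncertainty class; cost[i][j] = cost of operator j under model i."""
    n_ops = len(cost[0])
    expected = [sum(p * row[j] for p, row in zip(prior, cost))
                for j in range(n_ops)]
    robust = min(range(n_ops), key=expected.__getitem__)
    return sum(p * (row[robust] - min(row)) for p, row in zip(prior, cost))


def expected_remaining_mocu(cost, prior, likelihood):
    """Average the posterior MOCU over the possible outcomes of one experiment.

    likelihood[i][o] -- probability of observing outcome o if model i is true
    """
    total = 0.0
    for o in range(len(likelihood[0])):
        joint = [p * lik[o] for p, lik in zip(prior, likelihood)]
        p_outcome = sum(joint)
        if p_outcome == 0.0:
            continue  # this outcome cannot occur under the current prior
        posterior = [j / p_outcome for j in joint]
        total += p_outcome * mocu(cost, posterior)
    return total


def select_experiment(cost, prior, experiments):
    """Pick the experiment with the smallest expected remaining MOCU."""
    scores = {name: expected_remaining_mocu(cost, prior, lik)
              for name, lik in experiments.items()}
    return min(scores, key=scores.get), scores


# Models 0 and 1 share the same optimal operator; model 2 needs the other one.
cost = [[0.0, 1.0],
        [0.0, 1.0],
        [1.0, 0.0]]
prior = [1 / 3, 1 / 3, 1 / 3]
experiments = {
    # A tells models 0 and 1 apart, which does not matter for the objective.
    "A": [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    # B tells model 2 apart from models 0 and 1, which does matter.
    "B": [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
}
best, scores = select_experiment(cost, prior, experiments)
print(best, scores)
```

Both experiments are informative about the model, but only experiment B resolves the uncertainty that changes which operator should be used, so it is the one that drives the expected remaining MOCU down.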
Brookhaven National Laboratory (BNL), where Dr. Byung-Jun Yoon works as a Scientist (via joint appointment), has officially joined the Accelerating Therapeutics for Opportunities in Medicine (ATOM) consortium.
“At Brookhaven, we are excited to apply our team’s work developing and using optimization algorithms directly to ATOM’s diverse computational data-driven modeling efforts,” said Francis J. “Frank” Alexander, deputy director of the Computational Science Initiative. “Often, mathematical models and systems of interest to ATOM cancer therapy problems are uncertain and under-characterized due to their extremely complex nature. At Brookhaven, our artificial intelligence, machine learning and applied mathematics work aims to unravel complexities to design computational and laboratory experiments that achieve discovery goals in the most efficient manner. We believe these efforts will have significant applications in ATOM that can greatly benefit and enhance the program’s impact. We look forward to contributing as part of the collaboration.”
We are happy to announce that our AISTATS 2021 paper entitled “Bayesian Active Learning by Soft Mean Objective Cost of Uncertainty” is now available in the Proceedings of Machine Learning Research, which can be accessed at the following link:
In this paper, we suggest a strictly concave approximation of MOCU – referred to as “Soft MOCU” – that can be used to define an acquisition function for Bayesian active learning with a theoretical convergence guarantee. We show in this study that Soft-MOCU-based Bayesian active learning outperforms other existing methods, with the important benefit of a theoretical guarantee of convergence to the optimal classifier.
In this paper, we propose an acquisition function for active learning of a Bayesian classifier based on a weighted form of MOCU (mean objective cost of uncertainty). By quantifying the uncertainty that directly affects the classification error, the proposed method avoids the myopic behavior of previous expected loss reduction (ELR) methods. Unlike existing ELR methods, which may get stuck before reaching the optimal classifier, the proposed weighted-MOCU-based strategy provides the critical advantage that the resulting Bayesian active learning algorithm guarantees convergence to the optimal classifier of the true model. We demonstrate its performance with both synthetic and real-world datasets.
The Applied Mathematics Group of the Computational Science Initiative (CSI) at Brookhaven National Laboratory (BNL) invites exceptional candidates to apply for the Amalie Emmy Noether Fellowship in applied mathematics and scientific computing.
This fellowship offers a unique opportunity to conduct research in a broad set of fields, including reduced order modeling, uncertainty quantification and scalable computational statistics for Bayesian inference, optimization and control for decision making under uncertainty, scientific machine learning, high-dimensional inverse problems, multiscale modeling, integrated computational modeling frameworks, data science for streaming or “in-situ” (within simulation) analytics in high performance computing (HPC), and numerical methods. The methods and fundamental advances made in the course of this research will further the progress of applications of interest to BNL and the Department of Energy (DOE). Examples of such applications might include: data-driven uncertainty quantification and hybrid process-based/data-driven modeling for climate prediction and resilience planning, optimal experimental and simulation design for drug discovery and materials science, and large-scale data processing for particle accelerator experiments. The fellowship includes access to world-class HPC resources, such as the BNL Institutional Cluster and DOE leadership computing facilities. Access to these platforms will allow computing at scale and will ensure that the successful candidate will have the necessary resources to solve challenging DOE problems of interest.
This program provides full support for a period of two years at CSI. Candidates must have received a doctorate (Ph.D.) in applied mathematics or a related field (e.g., mathematics, physics, engineering, statistics, operations research, or computer science) within the past five years. This fellowship presents a unique chance to conduct interdisciplinary collaborative research in BNL programs with a strongly competitive salary. Recipients will be allowed to select a direct mentor from a list of CSI staff scientists. This mentor will help the recipient define and pursue their own research agenda during their appointment.
We are happy to announce the opportunity to apply for the 2021 NSF Mathematical Sciences Graduate Internship (MSGI) to work on the research project entitled: Uncertainty-Aware Data-Driven Models for Optimal Learning and Robust Decision Making Under Uncertainty. (Mentors: Drs. Nathan Urban & Byung-Jun Yoon)
This project aims to develop Scientific ML techniques that enable objective-driven uncertainty quantification (UQ) for data-driven models. We will focus on developing theories and algorithms that can ultimately lead to an automated learning procedure of effective surrogates for complex systems that can be used for making optimal decisions robust to system uncertainties and surrogate approximation errors. These goals will be attained based on a Bayesian ML paradigm, in which we integrate scientific prior knowledge on the system and the available data to obtain a prior directly characterizing the scientific uncertainty in the physical system, quantify the uncertainty relative to the objective, develop optimal operators robust to the uncertainty, and design strategies that can optimally reduce the uncertainty and thereby directly contribute to the attainment of the objective. Potential applications of this methodology will be discussed with the student, but may focus on biological and biomedical discovery science. Detailed information about this project can be found in the project catalog at the following link (search for reference code: BNL-URBAN1): https://orise.orau.gov/nsf-msgi/project-catalog.html
The NSF Mathematical Sciences Graduate Internship (MSGI) program is aimed at students who are interested in understanding the application of advanced mathematical and statistical techniques to “real world” problems, regardless of whether you plan to pursue an academic or nonacademic career. Internship activities will vary based on the assigned research project and hosting facility. As part of your application, you will identify your top 3 research projects from the 2021 NSF MSGI Project Catalog: https://orise.orau.gov/nsf-msgi/project-catalog.html