ACM CHIL 2020 was held virtually on July 23rd and 24th. It featured 23 research talks from accepted papers, 23 workshop spotlights, and 15 participants in the doctoral symposium.
Abstract:
A key impediment to reinforcement learning (RL) in real applications with limited, batch data is in defining a reward function that reflects what we implicitly know about reasonable behaviour for a task and allows for robust off-policy evaluation. In this work, we develop a method to identify an admissible set of reward functions for policies that (a) do not deviate too far in performance from prior behaviour, and (b) can be evaluated with high confidence, given only a collection of past trajectories. Together, these ensure that we avoid proposing unreasonable policies in high-risk settings. We demonstrate our approach to reward design on synthetic domains as well as in a critical care context, to guide the design of a reward function that consolidates clinical objectives to learn a policy for weaning patients from mechanical ventilation.
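The two admissibility criteria above can be illustrated with a toy importance-sampling screen over candidate rewards. This is a sketch of the general idea under invented policies, rewards, and thresholds, not the paper's algorithm: a candidate reward is kept only if (a) the evaluation policy's estimated value does not fall far below the behaviour value, and (b) the importance-sampling confidence interval is narrow enough.

```python
import math
import random

random.seed(0)

def is_returns(trajs, reward_fn, target_prob, behav_prob):
    """Per-trajectory importance-weighted returns under the target policy."""
    out = []
    for traj in trajs:
        w, g = 1.0, 0.0
        for s, a in traj:
            w *= target_prob(s, a) / behav_prob(s, a)
            g += reward_fn(s, a)
        out.append(w * g)
    return out

def admissible_rewards(trajs, rewards, target_prob, behav_prob,
                       max_ci_width=1.0, max_regret=0.5):
    keep = []
    for name, r in rewards.items():
        # (a) value under the logged behaviour: plain average return.
        behav_val = sum(sum(r(s, a) for s, a in t) for t in trajs) / len(trajs)
        # (b) importance-sampling estimate and a normal-approx CI width.
        rets = is_returns(trajs, r, target_prob, behav_prob)
        mean = sum(rets) / len(rets)
        var = sum((x - mean) ** 2 for x in rets) / (len(rets) - 1)
        ci = 1.96 * math.sqrt(var / len(rets))
        if ci <= max_ci_width and behav_val - mean <= max_regret:
            keep.append(name)
    return keep

# Logged data: 200 three-step trajectories from a uniform behaviour policy.
trajs = [[(random.randint(0, 1), random.randint(0, 1)) for _ in range(3)]
         for _ in range(200)]
behav = lambda s, a: 0.5
target = lambda s, a: 0.6 if a == 1 else 0.4   # mildly prefers action 1

rewards = {
    "dense": lambda s, a: 1.0 if a == 1 else 0.0,
    "sparse": lambda s, a: 100.0 if (s, a) == (1, 1) else 0.0,
}
# The dense reward evaluates with low variance; the large sparse reward
# inflates the IS confidence interval and is screened out.
```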
If the Livestream seems inaccessible, please try refreshing your browser. Clicking the "LIVE" button ensures you are in sync with the live content.
Abstract:
The abundance of modern health data provides many opportunities for the use of machine learning techniques to build better statistical models to improve clinical decision making. Predicting time-to-event distributions, also known as survival analysis, plays a key role in many clinical applications. We introduce a variational time-to-event prediction model, named Variational Survival Inference (VSI), which builds upon recent advances in distribution learning techniques and deep neural networks. VSI addresses the challenges of non-parametric distribution estimation by (i) relaxing the restrictive modeling assumptions made in classical models, and (ii) efficiently handling the censored observations, i.e., events that occur outside the observation window, all within the variational framework. To validate the effectiveness of our approach, an extensive set of experiments on both synthetic and real-world datasets is carried out, showing improved performance relative to competing solutions.
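VSI's variational machinery is beyond a short sketch, but the censoring mechanism it must handle, point (ii), is classical and easy to show: observed events contribute the density f(t) to the likelihood, while censored subjects contribute only the survival probability S(t) = P(T > t). A minimal exponential-model illustration, with made-up data:

```python
import math

def exp_loglik(rate, times, observed):
    """Log-likelihood of an exponential time-to-event model with censoring.
    Events contribute log f(t) = log(rate) - rate*t; a subject censored at t
    contributes log S(t) = -rate*t, i.e. only 'survived past t'."""
    ll = 0.0
    for t, obs in zip(times, observed):
        ll += (math.log(rate) - rate * t) if obs else (-rate * t)
    return ll

def exp_mle(times, observed):
    """Closed-form MLE for this model: observed events / total follow-up."""
    return sum(observed) / sum(times)

times = [2.0, 5.0, 1.0, 4.0]      # follow-up durations
observed = [1, 0, 1, 0]           # 1 = event seen, 0 = censored
rate = exp_mle(times, observed)   # 2 events over 12.0 time units
```

Dropping the censored terms (or treating censoring times as events) biases the estimated rate, which is why methods like VSI must model censoring explicitly.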
Abstract:
The dearth of prescribing guidelines for physicians is one key driver of the current opioid epidemic in the United States. In this work, we analyze medical and pharmaceutical claims data to draw insights on characteristics of patients who are more prone to adverse outcomes after an initial synthetic opioid prescription. Toward this end, we propose a generative model that allows discovery from observational data of subgroups that demonstrate an enhanced or diminished causal effect due to treatment. Our approach models these sub-populations as a mixture distribution, using sparsity to enhance interpretability, while jointly learning nonlinear predictors of the potential outcomes to better adjust for confounding. The approach leads to human interpretable insights on discovered subgroups, improving the practical utility for decision support.
Abstract:
Adverse drug reactions (ADRs) are detrimental and unexpected clinical incidents caused by drug intake. The increasing availability of massive quantities of longitudinal event data such as electronic health records (EHRs) has redefined ADR discovery as a big data analytics problem, where data-hungry deep neural networks are especially suitable because of the abundance of the data. To this end, we introduce neural self-controlled case series (NSCCS), a deep learning framework for ADR discovery from EHRs. NSCCS rigorously follows a self-controlled case series design to adjust implicitly and efficiently for individual heterogeneity. In this way, NSCCS is robust to time-invariant confounding issues and thus more capable of identifying associations that reflect the underlying mechanism between various types of drugs and adverse conditions. We apply NSCCS to a large-scale real-world EHR dataset and empirically demonstrate its superior performance with comprehensive experiments on a benchmark ADR discovery task.
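The classical self-controlled case series (SCCS) design that NSCCS builds on can be shown in miniature. Each person serves as their own control, so time-invariant traits cancel: conditioning on a person's total event count, the chance that an event lands in drug-exposed time is exp(beta)·t_exp / (exp(beta)·t_exp + t_unexp). The data and the crude grid search below are purely illustrative, not the paper's method:

```python
import math

def sccs_loglik(beta, cases):
    """Conditional log-likelihood given each person's total event count.
    Each case is (t_exposed, t_unexposed, events_exposed, events_unexposed)."""
    ll = 0.0
    for t_exp, t_unexp, n_exp, n_unexp in cases:
        p = math.exp(beta) * t_exp / (math.exp(beta) * t_exp + t_unexp)
        ll += n_exp * math.log(p) + n_unexp * math.log(1 - p)
    return ll

def sccs_rate_ratio(cases):
    """Crude grid search for the within-person incidence rate ratio exp(beta)."""
    grid = [i / 100 for i in range(-300, 301)]
    return math.exp(max(grid, key=lambda b: sccs_loglik(b, cases)))

# Toy cases: events are heavily enriched during the short exposed windows,
# so the estimated rate ratio comes out well above 1.
cases = [(30, 335, 2, 1), (30, 335, 1, 0), (60, 305, 2, 2)]
```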
Abstract:
Real-world predictive models in healthcare should be evaluated in terms of discrimination, the ability to differentiate between high- and low-risk events, and calibration, the accuracy of the risk estimates. Unfortunately, calibration is often neglected and only discrimination is analyzed. Calibration is crucial for personalized medicine, as predictive models play an increasing role in the decision-making process. Since the random forest is a popular model for many healthcare applications, we propose CaliForest, a new calibrated random forest. Unlike existing calibration methodologies, CaliForest utilizes the out-of-bag samples to avoid the explicit construction of a calibration set. We evaluated CaliForest on two risk prediction tasks obtained from the publicly available MIMIC-III database. Evaluation on these binary prediction tasks demonstrates that CaliForest can achieve the same discriminative power as random forest while obtaining a better-calibrated model, evaluated across six different metrics. CaliForest will be published on the standard Python software repository and the code will be openly available on GitHub.
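The out-of-bag idea is easy to sketch. Below, a toy bootstrap ensemble of one-feature decision stumps stands in for a real random forest (this illustrates the principle, not CaliForest itself): each training point is scored only by trees whose bootstrap sample omitted it, and those OOB scores then feed a Platt-style logistic calibrator, so no separate calibration set is carved out of the training data.

```python
import math
import random

random.seed(1)

def fit_stump(data):
    """One-feature stump: predict the mean label on each side of a median split."""
    xs = sorted(x for x, _ in data)
    thr = xs[len(xs) // 2]
    left = [y for x, y in data if x <= thr]
    right = [y for x, y in data if x > thr]
    p_left = sum(left) / len(left) if left else 0.5
    p_right = sum(right) / len(right) if right else 0.5
    return lambda x: p_left if x <= thr else p_right

def oob_scores(data, n_trees=50):
    """Bootstrap an ensemble; score each point only with trees it was OOB for."""
    votes = [[] for _ in data]
    for _ in range(n_trees):
        idx = [random.randrange(len(data)) for _ in data]
        tree = fit_stump([data[i] for i in idx])
        for i in set(range(len(data))) - set(idx):   # out-of-bag indices
            votes[i].append(tree(data[i][0]))
    return [sum(v) / len(v) if v else 0.5 for v in votes]

def platt_fit(scores, labels, steps=2000, lr=0.5):
    """Logistic calibration p = sigmoid(a*s + b), fit on the OOB scores."""
    a, b = 1.0, 0.0
    for _ in range(steps):
        ga = gb = 0.0
        for s, y in zip(scores, labels):
            p = 1 / (1 + math.exp(-(a * s + b)))
            ga += (p - y) * s
            gb += (p - y)
        a -= lr * ga / len(scores)
        b -= lr * gb / len(scores)
    return lambda s: 1 / (1 + math.exp(-(a * s + b)))

# Toy data: label 1 iff the feature exceeds 0.5.
data = [(i / 20, 0) for i in range(10)] + [(0.5 + i / 20, 1) for i in range(10)]
scores = oob_scores(data)
labels = [y for _, y in data]
calibrate = platt_fit(scores, labels)
```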
Abstract:
Retinal effusions and cysts, caused by the leakage of damaged macular vessels and choroid neovascularization, are symptoms of many ophthalmic diseases. Optical coherence tomography (OCT), which provides clear ten-layer cross-sectional images of the retina, is widely used to screen various ophthalmic diseases. Many researchers have applied deep learning to the semantic segmentation of lesion areas, such as effusion, in OCT images, and achieved good results. However, the low contrast of lesion areas and the uneven distribution of lesion sizes limit the accuracy of deep learning semantic segmentation models. In this paper, we propose a boundary multi-scale multi-task OCT segmentation network (BMM-Net) to address these two challenges and segment the retinal edema area, subretinal fluid, and pigment epithelial detachment in OCT images. We propose a boundary extraction module, a multi-scale information perception module, and a classification module to capture accurate position and semantic information and collaboratively extract meaningful features. We train and evaluate on the AI Challenger competition dataset. The average Dice coefficient over the three lesion areas reaches 0.8222, 3.058% higher than the most commonly used model in the field of medical image segmentation.
Abstract:
Conventional survival analysis approaches estimate risk scores or individualized time-to-event distributions conditioned on covariates. In practice, there is often great population-level phenotypic heterogeneity, resulting from (unknown) subpopulations with diverse risk profiles or survival distributions. As a result, there is an unmet need in survival analysis for identifying subpopulations with distinct risk profiles, while jointly accounting for accurate individualized time-to-event predictions. An approach that addresses this need is likely to improve the characterization of individual outcomes by leveraging regularities in subpopulations, thus accounting for population-level heterogeneity. In this paper, we propose a Bayesian nonparametrics approach that represents observations (subjects) in a clustered latent space, and encourages accurate time-to-event predictions and clusters (subpopulations) with distinct risk profiles. Experiments on real-world datasets show consistent improvements in predictive performance and interpretability relative to existing state-of-the-art survival analysis models.
Abstract:
While deep learning has shown promise in the domain of disease classification from medical images, models based on state-of-the-art convolutional neural network architectures often exhibit performance loss due to dataset shift. Models trained using data from one hospital system achieve high predictive performance when tested on data from the same hospital, but perform significantly worse when they are tested in different hospital systems. Furthermore, even within a given hospital system, deep learning models have been shown to depend on hospital- and patient-level confounders rather than meaningful pathology to make classifications. In order for these models to be safely deployed, we would like to ensure that they do not use confounding variables to make their classification, and that they will work well even when tested on images from hospitals that were not included in the training data. We attempt to address this problem in the context of pneumonia classification from chest radiographs. We propose an approach based on adversarial optimization, which allows us to learn more robust models that do not depend on confounders. Specifically, we demonstrate improved out-of-hospital generalization performance of a pneumonia classifier by training a model that is invariant to the view position of chest radiographs (anterior-posterior vs. posterior-anterior). Our approach leads to better predictive performance on external hospital data than both a standard baseline and previously proposed methods to handle confounding, and also suggests a method for identifying models that may rely on confounders.
Abstract:
Much work aims to explain a model's prediction on a static input. We consider explanations in a temporal setting where a stateful dynamical model produces a sequence of risk estimates given an input at each time step. When the estimated risk increases, the goal of the explanation is to attribute the increase to a few relevant inputs from the past. While our formal setup and techniques are general, we carry out an in-depth case study in a clinical setting. The goal here is to alert a clinician when a patient's risk of deterioration rises. The clinician then has to decide whether to intervene and adjust the treatment. Given a potentially long sequence of new events since she last saw the patient, a concise explanation helps her to quickly triage the alert. We develop methods to lift static attribution techniques to the dynamical setting, where we identify and address challenges specific to dynamics. We then experimentally assess the utility of different explanations of clinical alerts through expert evaluation.
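One of the simplest static attribution techniques that can be lifted this way is occlusion: drop one new event at a time and measure how much the risk falls. The sketch below uses a stand-in linear scorer with invented event names and weights; it illustrates the triage idea, not the paper's actual models.

```python
# Hypothetical per-event risk contributions for a toy linear risk model.
WEIGHTS = {"lactate_high": 0.4, "bp_drop": 0.3, "hr_up": 0.1, "note": 0.0}

def risk(events):
    """Toy stateful model: risk accumulates over the events seen so far."""
    return sum(WEIGHTS.get(e, 0.0) for e in events)

def attribute_increase(old_events, new_events):
    """Occlusion attribution: for each event since the last review, remove it
    and measure the resulting drop in risk; rank events by that drop so the
    clinician sees the most influential new events first."""
    full = risk(old_events + new_events)
    scores = {}
    for i, e in enumerate(new_events):
        without = old_events + new_events[:i] + new_events[i + 1:]
        scores[e] = full - risk(without)
    return sorted(scores.items(), key=lambda kv: -kv[1])

ranked = attribute_increase(["note"], ["hr_up", "lactate_high"])
```

For a genuinely stateful model, each occluded sequence must be re-run through the dynamics from the last review point, which is where the temporal challenges the abstract mentions arise.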
Abstract:
We introduce SparseVM, a method that registers clinical-quality 3D MR scans both faster and more accurately than previously possible. Deformable alignment, or registration, of clinical scans is a fundamental task for many clinical neuroscience studies. However, most registration algorithms are designed for high-resolution research-quality scans. In contrast to research-quality scans, clinical scans are often sparse, missing up to 86% of the slices available in research-quality scans. Existing methods for registering these sparse images are either inaccurate or extremely slow. We present a learning-based registration method, SparseVM, that is more accurate and orders of magnitude faster than the most accurate clinical registration methods. To our knowledge, it is the first method to use deep learning specifically tailored to registering clinical images. We demonstrate our method on a clinically-acquired MRI dataset of stroke patients and on a simulated sparse MRI dataset. Our code is available as part of the VoxelMorph package at http://voxelmorph.mit.edu.
Abstract:
Necrotizing enterocolitis (NEC) is a life-threatening intestinal disease that primarily affects preterm infants during their first weeks after birth. Mortality rates associated with NEC are 15-30%, and surviving infants are susceptible to multiple serious, long-term complications. The disease is sporadic and, with currently available tools, unpredictable. We are creating an early warning system that uses stool microbiome features, combined with clinical and demographic information, to identify infants at high risk of developing NEC. Our approach uses a multiple instance learning, neural network-based system that could be used to generate daily or weekly NEC predictions for premature infants. The approach was selected to effectively utilize sparse and weakly annotated datasets characteristic of stool microbiome analysis. Here we describe initial validation of our system, using clinical and microbiome data from a nested case-control study of 161 preterm infants. We show areas under the receiver operating characteristic curve above 0.9, with 75% of dominant predictive samples for NEC-affected infants identified at least 24 hours prior to disease onset. Our results pave the way for development of a real-time early warning system for NEC using a limited set of basic clinical and demographic details combined with stool microbiome data.
Abstract:
In this work, we examine the extent to which embeddings may encode marginalized populations differently, and how this may lead to a perpetuation of biases and worsened performance on clinical tasks. We pretrain deep embedding models (BERT) on medical notes from the MIMIC-III hospital dataset, and quantify potential disparities using two approaches. First, we identify dangerous latent relationships that are captured by the contextual word embeddings using a fill-in-the-blank method with text from real clinical notes and a log probability bias score quantification. Second, we evaluate performance gaps across different definitions of fairness on over 50 downstream clinical prediction tasks that include detection of acute and chronic conditions. We find that classifiers trained from BERT representations exhibit statistically significant differences in performance, often favoring the majority group with regards to gender, language, ethnicity, and insurance status. Finally, we explore shortcomings of using adversarial debiasing to obfuscate subgroup information in contextual word embeddings, and recommend best practices for such deep embedding models in clinical settings.
Abstract:
Single-cell RNA sequencing (scRNA-seq) has revolutionized biological discovery, providing an unbiased picture of cellular heterogeneity in tissues. While scRNA-seq has been used extensively to provide insight into health and disease, it has not been used for disease prediction or diagnostics. Graph Attention Networks have proven to be versatile for a wide range of tasks by learning from both original features and graph structures. Here we present a graph attention model for predicting disease state from single-cell data on a large dataset of Multiple Sclerosis (MS) patients. MS is a disease of the central nervous system that is difficult to diagnose. We train our model on single-cell data obtained from blood and cerebrospinal fluid (CSF) for a cohort of seven MS patients and six healthy adults (HA), resulting in 66,667 individual cells. We achieve 92% accuracy in predicting MS, outperforming other state-of-the-art methods such as a graph convolutional network, random forest, and multi-layer perceptron. Further, we use the learned graph attention model to gain insight into the features (cell types and genes) that are important for this prediction. The graph attention model also allows us to infer a new feature space for the cells that emphasizes the differences between the two conditions. Finally, we use the attention weights to learn a new low-dimensional embedding which we visualize with PHATE and UMAP. To the best of our knowledge, this is the first effort to use graph attention, and deep learning in general, to predict disease state from single-cell data. We envision applying this method to single-cell data for other diseases.
Abstract:
The International Classification of Disease (ICD) is a widely used diagnostic ontology for the classification of health disorders and a valuable resource for healthcare analytics. However, ICD is an evolving ontology and subject to periodic revisions (e.g. ICD-9-CM to ICD-10-CM), resulting in the absence of complete cross-walks between versions. While clinical experts can create custom mappings across ICD versions, this process is both time-consuming and costly. We propose an automated solution that facilitates interoperability without sacrificing accuracy. Our solution leverages the SNOMED-CT ontology, whereby medical concepts are organised in a directed acyclic graph. We use this to map ICD-9-CM to ICD-10-CM by associating codes to clinical concepts in the SNOMED graph using a nearest neighbors search in combination with natural language processing. To assess the impact of our method, the performance of a gradient boosted tree (XGBoost), developed to classify patients with Exocrine Pancreatic Insufficiency (EPI) disorder, was compared when using features constructed by our solution versus clinically-driven methods. This dataset comprised 23,204 EPI patients and 277,324 non-EPI patients, with data spanning from October 2011 to April 2017. Our algorithm generated clinical predictors with comparable stability across the ICD-9-CM to ICD-10-CM transition point when compared to ICD-9-CM/ICD-10-CM mappings generated by clinical experts. Preliminary modeling results showed highly similar performance for models based on the SNOMED mapping vs the clinically defined mapping (71% precision at 20% recall for both models). Overall, the framework does not compromise on accuracy at the individual code level or at the model level while obviating the need for time-consuming manual mapping.
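A toy version of the mapping step gives the flavour: link a source code to the target candidate whose clinical-concept description is nearest in some similarity space. Here plain Jaccard token overlap stands in for the paper's SNOMED-graph plus NLP similarity; the descriptions are simplified and the snippet is purely illustrative.

```python
def tokens(text):
    """Crude tokenizer: lowercase, drop commas, split on whitespace."""
    return set(text.lower().replace(",", " ").split())

def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    return len(a & b) / len(a | b)

def map_code(source_desc, candidates):
    """Return the candidate code whose description best matches source_desc."""
    src = tokens(source_desc)
    return max(candidates, key=lambda c: jaccard(src, tokens(candidates[c])))

# Illustrative ICD-10-CM candidates (code: description).
candidates10 = {
    "K86.81": "exocrine pancreatic insufficiency",
    "E10.9": "type 1 diabetes mellitus without complications",
}
```

The real pipeline replaces this lexical similarity with a nearest-neighbour search over concepts embedded in the SNOMED directed acyclic graph, which is what lets it handle codes whose descriptions share few surface words.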
Abstract:
Systematic review (SR) is an essential process to identify, evaluate, and summarize the findings of all relevant individual studies concerning health-related questions. However, conducting a SR is labor-intensive, as identifying relevant studies is a daunting process that entails multiple researchers screening thousands of articles for relevance. In this paper, we propose MMiDaS-AE, a Multi-modal Missing Data aware Stacked Autoencoder, for semi-automating screening for SRs. We use a multi-modal view that exploits three representations: 1) documents, 2) topics, and 3) citation networks. Documents that contain similar words will be nearby in the document embedding space. Models can also exploit the relationship between documents and the associated SR MeSH terms to capture article relevancy. Finally, related works will likely share the same citations, and thus closely related articles would, intuitively, be trained to be close to each other in the embedding space. However, using all three learned representations as features directly results in an unwieldy number of parameters. Thus, motivated by recent work on multi-modal autoencoders, we adopt a multi-modal stacked autoencoder that can learn a shared representation encoding all three representations in a compressed space. However, in practice one or more of these modalities may be missing for an article (e.g., if we cannot recover citation information). Therefore, we propose to learn to impute the shared representation even when specific inputs are missing. We find this new model significantly improves performance on a dataset consisting of 15 SRs compared to existing approaches.
Abstract:
Machine learning models for medical image analysis often suffer from poor performance on important subsets of a population that are not identified during training or testing. For example, overall performance of a cancer detection model may be high, but the model may still consistently miss a rare but aggressive cancer subtype. We refer to this problem as hidden stratification, and observe that it results from incompletely describing the meaningful variation in a dataset. While hidden stratification can substantially reduce the clinical efficacy of machine learning models, its effects remain difficult to measure. In this work, we assess the utility of several possible techniques for measuring hidden stratification effects, and characterize these effects both via synthetic experiments on the CIFAR-100 benchmark dataset and on multiple real-world medical imaging datasets. Using these measurement techniques, we find evidence that hidden stratification can occur in unidentified imaging subsets with low prevalence, low label quality, subtle distinguishing features, or spurious correlates, and that it can result in relative performance differences of over 20% on clinically important subsets. Finally, we discuss the clinical implications of our findings, and suggest that evaluation of hidden stratification should be a critical component of any machine learning deployment in medical imaging.
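The simplest measurement the abstract alludes to is direct: when labels for a suspected subgroup are available (e.g. the rare aggressive subtype), compare performance on that subset against the overall population. The sketch below uses synthetic labels and recall as the metric; names and numbers are invented for illustration.

```python
def recall(labels, preds):
    """Fraction of true positives recovered (labels/preds are 0/1 lists)."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    pos = sum(labels)
    return tp / pos if pos else float("nan")

def stratification_gap(labels, preds, subgroup):
    """Recall shortfall on the flagged subgroup relative to everyone.
    A large positive gap is evidence of hidden stratification."""
    overall = recall(labels, preds)
    sub = recall([y for y, s in zip(labels, subgroup) if s],
                 [p for p, s in zip(preds, subgroup) if s])
    return overall - sub

# Synthetic audit: 10 true positives, 8 caught overall, but the two cases
# belonging to the rare subtype are both missed.
labels = [1] * 10
preds = [1] * 8 + [0] * 2
subtype = [False] * 8 + [True] * 2
```

Because the overall recall of 0.8 looks healthy while the subtype recall is 0, aggregate metrics alone would never surface the failure, which is the paper's core point.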
Abstract:
Automated assessment of rehabilitation exercises using machine learning has the potential to improve current rehabilitation practices. However, it is challenging to fully replicate a therapist's decision making when assessing patients with various physical conditions. This paper describes an interactive machine learning approach that iteratively integrates a data-driven model with expert knowledge to assess the quality of rehabilitation exercises. Among a large set of kinematic features of the exercise motions, our approach identifies the most salient features for assessment using reinforcement learning and generates a user-specific analysis to elicit feature relevance from a therapist for personalized rehabilitation assessment. By accommodating the therapist's feedback on feature relevance, our approach can tune a generic assessment model into a personalized model. Specifically, our approach improves the average F1-score of assessment prediction from 0.8279 to 0.9116 over three upper-limb rehabilitation exercises (p < 0.01). Our work demonstrates that machine learning models with feature selection can generate kinematic-feature-based analyses as explanations of a model's predictions to elicit experts' assessment knowledge, and that machine learning models can be augmented with expert knowledge for personalized rehabilitation assessment.
Abstract:
Accurately extracting medical entities from social media is challenging because people use informal language with different expressions for the same concept, and they also make spelling mistakes. Previous work either focused on specific diseases (e.g., depression) or drugs (e.g., opioids) or, if working with a wide set of medical entities, only tackled individual and small-scale benchmark datasets (e.g., AskaPatient). In this work, we first demonstrated how to accurately extract a wide variety of medical entities such as symptoms, diseases, and drug names on three benchmark datasets from varied social media sources, and then also validated this approach on a large-scale Reddit dataset. We first implemented a deep-learning method using contextual embeddings that outperformed existing state-of-the-art methods on two existing benchmark datasets: one containing annotated AskaPatient posts (CADEC) and the other containing annotated tweets (Micromed). Second, we created an additional benchmark dataset by annotating medical entities in 2K Reddit posts (made publicly available under the name MedRed) and showed that our method also performs well on this new dataset. Finally, to demonstrate that our method accurately extracts a wide variety of medical entities on a large scale, we applied the model pre-trained on MedRed to half a million Reddit posts. The posts came from disease-specific subreddits, so we could categorise them into 18 diseases based on the subreddit. We then trained a machine-learning classifier to predict the post's category solely from the extracted medical entities. The average F1 score across categories was 0.87. These results open up new cost-effective opportunities for modeling, tracking, and even predicting health behavior at scale.
Abstract:
While machine learning is rapidly being developed and deployed in health settings such as influenza prediction, there are critical challenges in using data from one environment to predict in another due to variability in features. Even within disease labels there can be differences (e.g. "fever" may mean something different reported in a doctor's office versus in an online app). Moreover, models are often built on passive, observational data which contain different distributions of population subgroups (e.g. men or women). Thus, there are two forms of instability between environments in this observational transport problem. We first harness substantive knowledge from health research to conceptualize the underlying causal structure of this problem in a health outcome prediction task. Based on sources of stability in the model and the task, we posit that we can combine environment and population information in a novel population-aware hierarchical Bayesian domain adaptation framework that harnesses multiple invariant components through population attributes when needed. We study the conditions under which invariant learning fails, leading to reliance on the environment-specific attributes. Experimental results for an influenza prediction task on four datasets gathered from different contexts show the model can improve prediction in the case of largely unlabelled target data from a new environment and a different constituent population, by harnessing both environment and population invariant information. This work represents a novel, principled way to address a critical challenge by blending domain (health) knowledge and algorithmic innovation. The proposed approach will have significant impact in many social settings wherein who the data comes from and how it was generated matters.
Abstract:
Phenotyping electronic health records (EHR) focuses on defining meaningful patient groups (e.g., heart failure group and diabetes group) and identifying the temporal evolution of patients in those groups. Tensor factorization has been an effective tool for phenotyping. Most of the existing works assume either a static patient representation with aggregate data or only model temporal data. However, real EHR data contain both temporal (e.g., longitudinal clinical visits) and static information (e.g., patient demographics), which are difficult to model simultaneously. In this paper, we propose Temporal And Static TEnsor factorization (TASTE) that jointly models both static and temporal information to extract phenotypes. TASTE combines the PARAFAC2 model with non-negative matrix factorization to model a temporal and a static tensor. To fit the proposed model, we transform the original problem into simpler ones which are optimally solved in an alternating fashion. For each of the sub-problems, our proposed mathematical re-formulations lead to efficient sub-problem solvers. Comprehensive experiments on large EHR data from a heart failure (HF) study confirmed that TASTE is up to 14× faster than several baselines and the resulting phenotypes were confirmed to be clinically meaningful by a cardiologist. Using 60 phenotypes extracted by TASTE, a simple logistic regression can achieve the same level of area under the curve (AUC) for HF prediction compared to a deep learning model using recurrent neural networks (RNN) with 345 features.
Abstract:
In medicine, both ethical and monetary costs of incorrect predictions can be significant, and the complexity of the problems often necessitates increasingly complex models. Recent work has shown that changing just the random seed is enough for otherwise well-tuned deep neural networks to vary in their individual predicted probabilities. In light of this, we investigate the role of model uncertainty methods in the medical domain. Using RNN ensembles and various Bayesian RNNs, we show that population-level metrics, such as AUC-PR, AUC-ROC, log-likelihood, and calibration error, do not capture model uncertainty. Meanwhile, the presence of significant variability in patient-specific predictions and optimal decisions motivates the need for capturing model uncertainty. Understanding the uncertainty for individual patients is an area with clear clinical impact, such as determining when a model decision is likely to be brittle. We further show that RNNs with only Bayesian embeddings can be a more efficient way to capture model uncertainty compared to ensembles, and we analyze how model uncertainty is impacted across individual input features and patient subgroups.
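The core observation, that population-level metrics miss model uncertainty, can be seen with made-up numbers: two ensembles can agree on every patient's mean risk (so any aggregate metric is identical) while disagreeing sharply per patient, which only the spread across ensemble members reveals.

```python
import statistics

def per_patient_uncertainty(ensemble_preds):
    """ensemble_preds[m][i] = model m's predicted risk for patient i.
    Returns (mean, sample stdev) across ensemble members, per patient."""
    n_patients = len(ensemble_preds[0])
    out = []
    for i in range(n_patients):
        ps = [member[i] for member in ensemble_preds]
        out.append((statistics.mean(ps), statistics.stdev(ps)))
    return out

# Two toy 2-member ensembles over 2 patients. Both predict a mean risk of
# 0.5 for patient 0, so any population-level summary of the averaged
# predictions coincides -- but ensemble B's members disagree wildly.
ensemble_a = [[0.5, 0.5], [0.5, 0.5]]
ensemble_b = [[0.1, 0.9], [0.9, 0.1]]
```

In ensemble B the per-patient standard deviation flags exactly the brittle decisions that the averaged probability hides.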
Abstract:
The ability of caregivers and investigators to share patient data is fundamental to many areas of clinical practice and biomedical research. Prior to sharing, it is often necessary to remove identifiers such as names, contact details, and dates in order to protect patient privacy. Deidentification, the process of removing identifiers, is challenging, however. High-quality annotated data for developing models is scarce; many target identifiers are highly heterogeneous (for example, there are uncountable variations of patient names); and in practice anything less than perfect sensitivity may be considered a failure. Consequently, software for adequately deidentifying clinical data is not widely available. As a result, patient data is often withheld when sharing would be beneficial, and identifiable patient data is often divulged when a deidentified version would suffice. In recent years, advances in machine learning methods have led to rapid performance improvements in natural language processing tasks, in particular with the advent of large-scale pretrained language models. In this paper we develop and evaluate an approach for deidentification of clinical notes based on a bidirectional transformer model. We propose human-interpretable evaluation measures and demonstrate state-of-the-art performance against modern baseline models. Finally, we highlight current challenges in deidentification, including the absence of clear annotation guidelines, the lack of portability of models, and the paucity of training data. Code to develop our model is open source and simple to install, allowing for broad reuse.
+ Abstract:
+ Researchers in machine learning for healthcare face challenges to progress and reproducibility due to a lack of standardized processing frameworks for public datasets. We present MIMIC-Extract, an open source pipeline for transforming the raw electronic health record (EHR) data of critical care patients from the publicly-available MIMIC-III database into data structures that are directly usable in common time-series prediction pipelines. MIMIC-Extract addresses three challenges in making complex EHR data accessible to the broader machine learning community. First, MIMIC-Extract transforms raw vital sign and laboratory measurements into usable hourly time series, performing essential steps such as unit conversion, outlier handling, and aggregation of semantically similar features to reduce missingness and improve robustness. Second, MIMIC-Extract makes prediction of clinically relevant targets possible, extracting outcomes such as mortality and length-of-stay as well as comprehensive hourly intervention signals for ventilators, vasopressors, and fluid therapies. Finally, the pipeline emphasizes reproducibility and extensibility to future research questions. We demonstrate the pipeline's effectiveness by developing several benchmark tasks for outcome and intervention forecasting and assessing the performance of competitive models.
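The preprocessing steps named in the abstract — unit conversion, outlier handling, and hourly aggregation — can be sketched on toy data (a simplified, assumed interface; the real MIMIC-Extract pipeline operates on MIMIC-III database tables, not tuples like these):

```python
from collections import defaultdict

def to_celsius(temp_f):
    """Unit conversion: Fahrenheit to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0

def hourly_mean_temp(raw, lo=30.0, hi=45.0):
    """Bucket raw (minute, temp_f) readings into hourly mean Celsius,
    dropping physiologically implausible values (outlier handling).
    The (lo, hi) plausibility range is illustrative, not MIMIC-Extract's."""
    buckets = defaultdict(list)
    for minute, temp_f in raw:
        c = to_celsius(temp_f)
        if lo <= c <= hi:  # discard charting artifacts
            buckets[minute // 60].append(c)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}

# 500.0 F is a charting artifact and gets filtered out.
readings = [(5, 98.6), (40, 99.5), (70, 500.0), (95, 100.4)]
print(hourly_mean_temp(readings))
```

The same bucket-filter-aggregate pattern extends to any vital sign or lab, with per-feature plausibility ranges and unit tables.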
+ Abstract:
+ This talk outlines two Mila projects aimed at fighting the Covid-19 pandemic which are part of Mila's AI for Humanity mission. The first one is about discovering antivirals, either via repurposing several existing drugs using graph neural networks or via discovering new drug-like molecules using reinforcement learning and docking simulations to search in the molecular space. The second project is about using machine learning to provide early warning signals to people who are contagious -- especially if they don't realize that they are -- by exchanging information between phones of users who have had dangerous contacts with each other. This extends digital contact tracing by incorporating information about symptoms, medical condition and behavior (like wearing a mask) and relies on a sophisticated epidemiological model at the individual level in which we can simulate different individual-level and society-level strategies.
+
+ Bio:
+ Yoshua Bengio is Professor in the Computer Science and Operations Research departments at U. Montreal, founder and scientific director of Mila and of IVADO. He is a Fellow of the Royal Society of London and of the Royal Society of Canada, holds a Canada Research Chair and a Canada CIFAR AI Chair, and is a recipient of the 2018 Turing Award for pioneering deep learning. He is an officer of the Order of Canada, a member of the NeurIPS advisory board, co-founder and member of the board of the ICLR conference, and program director of the CIFAR program on Learning in Machines and Brains. His goal is to contribute to uncovering the principles giving rise to intelligence through learning, as well as to favour the development of AI for the benefit of all.
+ Abstract:
+ Countries in Sub-Saharan Africa are facing a double burden of infectious and noncommunicable diseases. The burden of noncommunicable diseases such as diabetes and hypertension is expected to continue increasing. Digital data and tools that can be used to study the patterns of health and disease in populations offer opportunities for improving public health. Digital platforms such as social media, search engines, and internet forums have been widely accepted in Sub-Saharan Africa for health information seeking and sharing. These tools can be used to improve public health in Sub-Saharan Africa in three ways: (1) monitoring health information seeking and providing health education, (2) monitoring risk factors, and (3) monitoring disease incidence. However, in order for these tools to be effective, it is important to consider and incorporate into analytical processes the distinct social, cultural, and economic context in Sub-Saharan African countries.
+
+ Bio:
+ Dr. Nsoesie is an Assistant Professor of Global Health at Boston University (BU) School of Public Health. She is also a BU Data Science Faculty Fellow as part of the BU Data Science Initiative at the Hariri Institute for Computing and a Data and Innovation Fellow at The Directorate of Science, Technology and Innovation (DSTI) in the Office of the President in Sierra Leone. Dr. Nsoesie applies data science methodologies to global health problems, using digital data and technology to improve health, particularly in the realm of surveillance of chronic and infectious diseases. She has worked with local public health departments in the United States and international organizations. She completed her postdoctoral studies at Harvard Medical School, and her PhD in Computational Epidemiology from the Genetics, Bioinformatics and Computational Biology program at Virginia Tech. She also has an MS in Statistics and a BS in Mathematics. She is the founder of Rethé – an initiative focused on providing scientific writing tools and resources to student communities in Africa in order to increase representation in scientific publications. She has written for NPR, The Conversation, Public Health Post and Quartz. Dr. Nsoesie was born and raised in Cameroon.
+ CHIL: Machine Learning in Health Care: Too Important to Be a Toy Example
+ Machine Learning in Health Care: Too Important to Be a Toy Example
+
+ Abstract:
+ The massive size of the health care sector makes data science applications in this space particularly salient for social policy. An overarching theme of this keynote is that developing machine learning methodology tailored to specific substantive health problems and the associated electronic health data is critical given the stakes involved, rather than eschewing complexity in simplified scenarios that may no longer represent an actual real-world problem.
+
+ Bio:
+ Sherri Rose, Ph.D. is an Associate Professor of Health Care Policy at Harvard Medical School and Co-Director of the Health Policy Data Science Lab. Her research in health policy focuses on risk adjustment, comparative effectiveness, and health program evaluation. Dr. Rose coauthored the first book on machine learning for causal inference and has published work across fields, including in Biometrics, JASA, PMLR, Journal of Health Economics, and NEJM. She currently serves as co-editor of the journal Biostatistics and is Chair-Elect of the American Statistical Association’s Biometrics Section. Her honors include the ISPOR Bernie J. O’Brien New Investigator Award for exceptional early career work in health economics and outcomes research and an NIH Director’s New Innovator Award to develop machine learning estimators for generalizability in health policy.
+ Bio:
+ Dr. Ruslan Salakhutdinov is a UPMC professor of Computer Science at Carnegie Mellon University. He has served as an area chair for NIPS, ICML, CVPR, and ICLR. He holds a PhD from University of Toronto and completed postdoctoral training at Massachusetts Institute of Technology.
+ Abstract:
+ In this session we will explore strategies for, and issues involved in, bringing Artificial Intelligence (AI) technologies to the clinic, safely and ethically. We will discuss the characteristics of a sound data strategy for powering a machine learning (ML) health system. The session introduces a framework for analyzing the utility of ML models in healthcare and discusses the implicit assumptions in aligning incentives for AI guided healthcare actions.
+
+ Bio:
+ Dr. Nigam Shah is Associate Professor of Medicine (Biomedical Informatics) at Stanford University, and serves as the Associate CIO for Data Science for Stanford Health Care. Dr. Shah’s research focuses on combining machine learning and prior knowledge in medical ontologies to enable the learning health system. Dr. Shah was elected into the American College of Medical Informatics (ACMI) in 2015 and was inducted into the American Society for Clinical Investigation (ASCI) in 2016. He holds an MBBS from Baroda Medical College, India, a PhD from Penn State University and completed postdoctoral training at Stanford University.
+ Machine Learning Challenges in the Fight for Social Good - the Covid-19 Case
+ Yoshua Bengio / University of Montreal
+ Digital Platforms & Public Health in Africa
+ Elaine Nsoesie / Boston University
+ Machine Learning in Health Care: Too Important to Be a Toy Example
+ Sherri Rose / Harvard Medical School
+ Incorporating Domain Knowledge into Deep Learning Models
+ Ruslan Salakhutdinov / Carnegie Mellon University
Abstract: Details to be confirmed.
+ A framework for shaping the future of AI in healthcare
+ Nigam Shah / Stanford University
+ Abstract:
+ Serena Jeblee's (University of Toronto, Expected Aug 2020) research focuses on clinical natural language processing (NLP), with a special focus on the extraction of a normalized Cause of Death (CoD) from verbal autopsy reports. This research would be especially impactful in low to middle income countries, where verbal autopsy reports are common, and physical autopsies or medically certified causes of death are less common. Serena approaches this problem by first extracting a temporally ordered list of symptoms from the verbal autopsy report, then uses these to construct a more accurate assessment of the overall CoD diagnosis. Serena has also worked on other clinical NLP tasks, including the automatic extraction of pertinent information from provider-patient dialogs.
+ Abstract:
+ Dr. Serifat Folorunso's (University of Ibadan, 2019) research focuses on augmented survival data analysis using modified generalized gamma mixture cure models (GGMCMs) for cancer research. In particular, Dr. Folorunso's work examines generalizing traditional GGMCMs to better account for the acute asymmetry in survival data by using a gamma link function. Dr. Folorunso's model demonstrated superior performance to a traditional GGMCM as well as other kinds of survival mixture-cure models on an ovarian cancer dataset from University College Hospital, Ibadan. Dr. Folorunso has also investigated additional aspects of survival models, as well as social determinants and impacts of neonatal health.
+ Abstract:
+ Dr. Savannah Bergquist's (Harvard University, 2019) research focuses on accounting for missing not at random (MNAR) data in health contexts, specifically insurance plan payment policies and lung cancer staging from insurance claims data. In the former analyses, Dr. Bergquist's work uses missingness sensitive ML methods to examine the contribution of various current practices to problematic incentives in medicare plan payment policies, and to suggest improvement. In the latter research, Dr. Bergquist focuses on predicting a clinically meaningful lung cancer staging system using classification models. Dr. Bergquist also has examined other aspects of health insurance plan design and analysis.
+ Abstract:
+ Paidamoyo Chapfuwa's (Duke University, Expected 2021) research focuses on bringing modern machine learning approaches to survival analysis, i.e., causal inference, generative modeling, and Bayesian nonparametrics. In particular, Paidamoyo's work examines generative methods for high-performance (accurate, calibrated, uncertainty-aware predictions) survival models. Moreover, her work introduces an adversarial distribution matching approach and a novel covariate-conditional Kaplan-Meier estimator, accounting for the predictive uncertainty in survival model calibration. In addition, her work also enables an interpretable time-to-event driven clustering method using a Bayesian nonparametric stick-breaking representation of the Dirichlet Process that represents patients in a clustered latent space. Recently, Paidamoyo’s work has explored a unified framework for individualized treatment effect estimation for survival outcomes from observational data.
+ Abstract:
+ Primoz Kocbek's (University of Maribor, Expected 2021) research focuses on interpretability and the use of synthetic data in machine learning models processing electronic health record (EHR) data. In particular, Primoz's research examined and provided a more nuanced analysis of the kinds of interpretability enabled by various kinds of models, including classifications of models as providing local vs. global or model-dependent vs. model-agnostic interpretability. Primoz also hopes to extend this research with the use of synthetic data to add structure, primarily leveraging the natural graph structure of some subsets of EHR data to improve predictive power.
+ Abstract:
+ Jill Furzer's (University of Toronto, Expected 2020) research focuses on combining ensemble learning methods with an economics causal inference tool-kit to predict mental health risk in childhood, assess drivers of marginal misdiagnosis, and understand long-term socioeconomic implications of missed, late or low-value diagnoses. Jill compares classic regression with regularized regression and gradient boosted trees to estimate latent mental health risk in childhood in a nationally representative longitudinal health survey dataset, and further examines how sensitive these models are to protected subgroup information, including gender, rural vs. urban, and socioeconomic status. Jill's past research has further focused on modelling the cost-effectiveness of various pediatric oncology screening guidelines and treatments.
+ Abstract:
+ Dr. Hasna Njah's (University of Sfax, 2019) research focuses on learning Bayesian networks (BNs) for health applications in the context of high-dimensional data. In particular, Dr. Njah's research proposes a new kind of BN, called a Bayesian Network Abstraction (BNA) framework, which uses latent variables to ameliorate the computational and optimization difficulties imposed by high-dimensional data. The BNA framework first uses dependency-based feature clustering algorithms to cluster input variables, followed by learning to summarize each cluster in a separate latent variable, thereby realizing the entire network in a hierarchical clustering-and-summarization BN, with the overall system learned using the greedy equilibrium criteria and hierarchical expectation maximization. In other work, Dr. Njah has focused on applying BNs to protein-protein interaction data and gene regulatory networks.
+ Abstract:
+ Vinyas Harish's (University of Toronto, Expected MD/PhD 2025) research focuses on the ways in which machine learning can complement traditional epidemiological perspectives and methods applied at the population and clinical levels, with an emphasis on promoting health systems resilience in the context of emergencies. Vinyas explores these topics in several ways, including a qualitative study on the ethics of private sector ML4H collaborations with stakeholders across technical, ethics/governance, and clinical domains, an examination of the utility of pandemic preparedness indices through cluster analysis, and the high-resolution prediction of COVID-19 transmission using mobility data and environmental covariates. Historically, Vinyas has also examined medical device safety and feasibility testing as well as the efficacy of novel methods for teaching clinicians image-guided procedures.
+ Abstract:
+ Haohan Wang's (Carnegie Mellon University, Expected 2021) research focuses on the systematic development of trustworthy machine learning (ML) systems that can be deployed to answer biomedical questions in real-world scenarios, consistently responding over significant variations of the data. In particular, Haohan's work focuses on improving robustness of ML models to dataset shift, specifically towards the application of early prediction of Alzheimer's disease from genetic and imaging data. Haohan's methods focus on using a nuanced understanding of the data generative process in order to better account for expected distributional shifts, yielding more robust and interpretable models of Alzheimer's diagnosis. In other work, Haohan has also investigated the use of ML methods on genomic and transcriptomic data for biomedical applications.
+ Abstract:
+ Mamadou Lamine MBOUP's (University of Thies, Expected 2022) research focuses on using ML methods over ultrasound data to perform early diagnosis and identification of liver damage within chronic liver disease patients and to classify said patients according to their severity. Especially in areas where chronic liver diseases, such as hepatitis, are prevalent, and liver cirrhosis and cancer are a significant health burden on the community, using ML methods to perform early diagnosis of these syndromes based on a low-cost modality like ultrasound would be extremely impactful. Mamadou's work investigates using supervised and unsupervised classical and deep learning methods to solve this problem, using data from a cohort of patients at the Aristide Le Dantec University Hospital Center. In past work, Mamadou has investigated algorithms for image compression, as well as investigated other health tasks in the cancer area.
+ Abstract:
+ Tulika Kakati's (Tezpur University, Expected 2020) research focuses on gene expression analysis using ML to identify biomarkers across disease state and the cell cycle. Tulika's work has used novel clustering methods and identification of border genes for co-expression analysis, as well as developing novel deep learning approaches to the identification of differentially expressed genes via DEGnet, validating all models across a number of gene expression datasets. Tulika has also investigated improving the computational efficiency of these methods via distributed computing, specifically with regards to the application of their clustering algorithms.
+ Abstract:
+ Nirvana Nursimulu's (University of Toronto, Expected 2021) research focuses on methods for computationally analyzing metabolic networks, with applications towards understanding pathogen growth in pursuit of drug development. Nirvana has examined the enzyme annotation problem, specifically focusing on producing methods that yield lower false positives than traditional similarity search metrics while considering full sequence diversity within enzyme classes. In addition, Nirvana has developed an automated pipeline for enzyme annotation and reconstruction of a metabolic model, focusing on increasing model coverage in order to yield more realistic simulations. In other work, Nirvana has also investigated more traditional microbiology across various pathogens.
+ Abstract:
+ Rohit Bhattacharya's (Johns Hopkins University, Expected 2021) research focuses on the development of causal methods that correct for understudied but ubiquitous sources of bias that arise during the course of data analyses, including data dependence, non-ignorable missingness, and model misspecification, in the study of infectious diseases. Rohit approaches these problems by developing novel graphical modeling techniques that can detect and correct for such sources of bias while providing the investigator with clear and interpretable representations of the underlying data dependence or missingness process. In dealing with model misspecification, Rohit has recently developed algorithms that yield doubly robust and efficient semi-parametric estimators for a wide class of causal graphical models, despite the presence of unmeasured confounders. In other work, Rohit has performed several investigations in oncology applications.
+ Abstract:
+ Kaspar Märtens's (University of Oxford, Expected 2020) research focuses on enabling feature-level interpretability in non-linear latent variable models via a synthesis of statistical and machine learning techniques. In particular, Kaspar designs novel latent variable, non-linear dimensionality reduction models that allow for feature-level interpretability, focusing primarily on Gaussian process latent variable models (GPLVMs) and variational autoencoders (VAEs), specifically augmenting these models with ideas from classical statistics, such as the functional analysis of variance (ANOVA) decomposition or probabilistic clustering algorithms. The results of these works are a class of models for flexible non-linear dimensionality reduction together with explainability, providing a mechanism to gain insights into what the model has learnt in terms of the observed features. In other work, Kaspar has examined genomic problems and applications of MCMC sampling.
+ Abstract:
+ Luis Oala's (Fraunhofer Heinrich Hertz Institute, Expected 2021) research focuses on gaining a better understanding of the vulnerabilities of deep neural networks and finding tests to make these vulnerabilities visible, primarily through the lens of uncertainty quantification. Together with his research group, Luis has developed an effective and modular alarm system for image reconstruction DNNs. The alarm system, called Interval Neural Networks, allows for high-resolution error heatmaps during inference for use cases such as CT image reconstruction. As co-chair of the Working Group on Data and AI Solution Assessment Methods in the ITU/WHO Focus Group on AI4H (FG-AI4H), he also leads a group of interdisciplinary experts working towards a standardized assessment framework for the evaluation of health AIs.
+ A Tour of Survival Analysis, from Classical to Modern
+ George H. Chen, Jeremy C. Weiss
Abstract: Survival analysis is used for predicting time-to-event outcomes, such as how long a patient will stay in the hospital, or when the recurrence of a tumor will likely happen. This tutorial aims to go over the basics of survival analysis, how it is used in healthcare, and some of its recent methodological advances from the ML community. We will also discuss open challenges. NOTE: This tutorial has a corresponding notebook: https://sites.google.com/view/chil-survival.
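The classical Kaplan-Meier product-limit estimator, part of the basics the tutorial covers, can be sketched in a few lines. This is a standalone illustration of the standard formula, not code from the linked notebook; the function name and event-flag convention (1 = event, 0 = right-censored) are illustrative choices.

```python
# Minimal Kaplan-Meier estimator: S(t) = prod over event times t_i <= t of
# (1 - d_i / n_i), where d_i is the number of events at t_i and n_i the
# number of subjects still at risk just before t_i.

def kaplan_meier(times, events):
    """Return (distinct event times, survival probabilities S(t))."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    times = [times[i] for i in order]
    events = [events[i] for i in order]
    n_at_risk = len(times)
    surv = 1.0
    out_t, out_s = [], []
    i = 0
    while i < len(times):
        t = times[i]
        deaths = removed = 0
        while i < len(times) and times[i] == t:
            deaths += events[i]   # events at this tied time
            removed += 1          # events + censorings leaving the risk set
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            out_t.append(t)
            out_s.append(surv)
        n_at_risk -= removed
    return out_t, out_s
```

For example, `kaplan_meier([1, 2, 2, 3, 5], [1, 1, 0, 1, 0])` steps the curve down only at the event times 1, 2, and 3, while the censored observations shrink the risk set without producing a step.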
+ Population and public health: challenges and opportunities
Abstract: In this tutorial, we will describe population and public health and their essential role in a comprehensive strategy to improve health. We will illustrate state of the art data and modeling approaches in population and public health. In doing so, we will identify overlaps with and open questions relevant to machine learning, causal inference and fairness.
+ Public Health Datasets for Deep Learning: Challenges and Opportunities
Abstract: With today's publicly available, de-identified clinical datasets, it is possible to ask questions like, “Can an algorithm read an electrocardiogram as well as a cardiologist can?” However, other kinds of questions like, “Does this ECG relate to a later cardiac arrest?” can’t be answered with the limited public data available to us today. Research using private datasets gives us reason to be optimistic, but progress will be slow unless suitable de-identified datasets become open, allowing researchers to efficiently collaborate and compete. Learn about an effort underway at the University of Chicago, led by Ziad Obermeyer, Sendhil Mullainathan, and their team, to provide a secure and public “ImageNet for clinical data” that balances the concerns of patients, healthcare institutions, and researchers.
+ State of the Art Deep Learning in Medical Imaging
+ Joseph Paul Cohen
Abstract: This tutorial will be styled as a graduate lecture about medical imaging with deep learning. This will cover the background of popular medical image domains (chest X-ray and histology) as well as methods to tackle multi-modality/view, segmentation, and counting tasks. These methods will be covered in terms of architecture and objective function design. Also, a discussion about incorrect feature attribution and approaches to mitigate the issue. Prerequisites: basic knowledge of computer vision (CNNs) and machine learning (regression, gradient descent).
+ Analyzing critical care data, from speculation to publication, starring MIMIC-IV (Part 1)
+ Alistair Johnson
Abstract: Despite a wealth of data, only a small fraction of decisions in critical care are evidence based. In this tutorial we will start with the conception of an idea, solidify the hypothesis, operationalize the concepts involved, and execute the study in a reproducible and communicable fashion. We will run our study on MIMIC-IV, an update to MIMIC-III, and cover some of the exciting additions in the new database. This tutorial will be interactive and result in a study performed end-to-end in a Jupyter notebook. Technical expertise is not required, as we will form groups based on skill level.
+ Analyzing critical care data, from speculation to publication, starring MIMIC-IV (Part 2)
+ Alistair Johnson
Abstract: Despite a wealth of data, only a small fraction of decisions in critical care are evidence based. In this tutorial we will start with the conception of an idea, solidify the hypothesis, operationalize the concepts involved, and execute the study in a reproducible and communicable fashion. We will run our study on MIMIC-IV, an update to MIMIC-III, and cover some of the exciting additions in the new database. This tutorial will be interactive and result in a study performed end-to-end in a Jupyter notebook. Technical expertise is not required, as we will form groups based on skill level.
+ Abstract:
+ Small datasets form a significant portion of releasable data in high sensitivity domains such as healthcare. But, providing differential privacy for small dataset release is a hard task, where current state-of-the-art methods suffer from severe utility loss. As a solution, we propose DPRP (Differentially Private Data Release via Random Projections), a reconstruction based approach for releasing differentially private small datasets. DPRP has several key advantages over the state-of-the-art. Using seven diverse real-life clinical datasets, we show that DPRP outperforms the current state-of-the-art on a variety of tasks, under varying conditions, and for all privacy budgets.
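As background for readers unfamiliar with the reconstruction-based idea, the Gaussian random projection that the name DPRP refers to can be sketched as follows. This is a generic illustration only: the actual DPRP method additionally calibrates noise to the privacy budget, which is not shown, and `project` is a hypothetical helper name.

```python
import random

# Generic Gaussian random projection (a building block of reconstruction-based
# release; NOT the DPRP algorithm itself, which also adds noise calibrated to
# the differential-privacy budget).

def project(rows, k, seed=0):
    """Project d-dimensional rows to k dimensions with a random Gaussian map."""
    rng = random.Random(seed)
    d = len(rows[0])
    # Entries N(0, 1/k), so squared distances are preserved in expectation.
    R = [[rng.gauss(0.0, 1.0) / k ** 0.5 for _ in range(k)] for _ in range(d)]
    projected = [[sum(x[i] * R[i][j] for i in range(d)) for j in range(k)]
                 for x in rows]
    return projected, R
```

Fixing the seed makes the projection reproducible, which matters when the same map must be shared between the release and reconstruction steps.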
+ Abstract:
+ Reinforcement Learning (RL) has recently been applied to several problems in healthcare, with a particular focus on offline learning from observational data. RL relies on latent states that embed sequential observations such that the embedding is sufficient to approximately predict the next observation, but the appropriate construction of such states in healthcare settings is an open question, as the variation in steady-state human physiology is poorly understood. In this work, we evaluate several information encoding schemes for offline RL using data from electronic health records (EHR). We use observations from septic patients in the MIMIC-III intensive care unit dataset, and evaluate the predictive performance of four embedding approaches on two tasks: predicting the next observation, and predicting a "k-step" look-ahead or roll-out. Our experiments highlight that the best performing state representation learning approaches use higher-dimensional recurrent neural architectures, and demonstrate the benefit of incorporating additional context with the state representation when predicting the next observation.
+ Abstract:
+ Capturing the inter-dependencies among multiple types of clinically-critical events is important not only for accurate future event prediction, but also for better treatment planning. In this work, we propose a deep latent state-space generative model to capture the interactions among different types of correlated clinical events (e.g., kidney failure, mortality) by explicitly modeling the temporal dynamics of patients' latent states. Based on these learned patient states, we further develop a new general discrete-time formulation of the hazard rate function to estimate the survival distribution of patients with significantly improved accuracy. Extensive evaluations over real EMR data show that our proposed model compares favorably to various state-of-the-art baselines. Further, our method also uncovers meaningful insights about the latent correlation among mortality and different types of organ failures.
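The discrete-time hazard formulation mentioned above rests on a standard identity: if h_k is the hazard at step k, then the survival probability is S(k) = prod over j <= k of (1 - h_j), and the event probability mass at step k is S(k-1) * h_k. A minimal sketch of that identity follows; the paper's hazards come from learned latent patient states, which are not modeled here.

```python
# Standard discrete-time identity linking hazard rates to the survival
# distribution (the general relationship the abstract builds on).

def survival_from_hazards(hazards):
    """hazards: h_k per step. Return (S(k) per step, event pmf per step)."""
    surv, pmf = [], []
    s_prev = 1.0
    for h in hazards:
        pmf.append(s_prev * h)      # P(event at step k) = S(k-1) * h_k
        s_prev *= (1.0 - h)         # survive step k
        surv.append(s_prev)
    return surv, pmf
```

Note that the pmf and the residual survival mass always sum to one, which is what makes this a proper survival distribution.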
+ Abstract:
+ Survival function estimation is used in many disciplines, but it is most common in medical analytics in the form of the Kaplan-Meier estimator. Sensitive data (patient records) is used in the estimation without any explicit control over the information leakage, which is a significant privacy concern. We propose the first differentially private estimator of the survival function and show that it can be easily extended to provide differentially private confidence intervals and test statistics without spending any extra privacy budget. We further provide extensions for differentially private estimation of the competing risk cumulative incidence function, Nelson-Aalen's estimator for the hazard function, etc. Using eleven real-life clinical datasets, we provide empirical evidence that our proposed method provides good utility while simultaneously providing strong privacy guarantees.
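For readers new to differential privacy, the Laplace mechanism that underlies estimators of this kind can be illustrated on a simple count query. This is a generic sketch, not the paper's survival-function estimator, and `dp_count` is a hypothetical name.

```python
import math
import random

# Laplace mechanism on a count query (sensitivity 1): add Laplace(1/epsilon)
# noise, sampled by inverse-CDF from a uniform draw in (-1/2, 1/2).

def dp_count(true_count, epsilon, rng=None):
    """Release true_count under epsilon-differential privacy."""
    rng = rng or random.Random()
    u = rng.random() - 0.5
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise
```

The noise scale is inversely proportional to epsilon, so a larger privacy budget yields a more accurate (but less private) release.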
+ Abstract:
+ As machine learning has become increasingly applied to medical imaging data, noise in training labels has emerged as an important challenge. Variability in diagnosis of medical images is well established; in addition, variability in training and attention to task among medical labelers may exacerbate this issue. Methods for identifying and mitigating the impact of low-quality labels have been studied, but are not well characterized in medical imaging tasks. For instance, Noisy Cross-Validation splits the training data into halves, and has been shown to identify low-quality labels in computer vision tasks, but it has not been applied to medical imaging tasks specifically. In addition, there may be concerns around label imbalance for medical image sets, where relevant pathology may be rare. In this work we introduce Stratified Noisy Cross-Validation (SNCV), an extension of Noisy Cross-Validation. SNCV allows us to measure confidence in model predictions and assign a quality score to each example; supports label stratification to handle class imbalance; and identifies likely low-quality labels so their causes can be analysed. In contrast to Noisy Cross-Validation, sample selection for SNCV occurs after training two models, not during training, which simplifies application of the method. We assess the performance of SNCV on diagnosis of glaucoma suspect risk (GSR) from retinal fundus photographs, a clinically important yet nuanced labeling task. Using training data from a previously-published deep learning model, we compute a continuous quality score (QS) for each training example. We relabel 1,277 low-QS examples using a trained glaucoma specialist; the new labels agree with the SNCV prediction over the initial label >85% of the time, indicating that low-QS examples mostly reflect labeler errors. We then quantify the impact of training with only high-QS labels, showing that strong model performance may be obtained with many fewer examples. By applying the method to a randomly sub-sampled training dataset, we show that our method can reduce the labelling burden by approximately 50% while achieving model performance non-inferior to using the full dataset on multiple held-out test sets.
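The half-splitting idea that SNCV extends can be sketched with a toy stand-in model. This is plain Noisy Cross-Validation without the stratification step; the 1-nearest-neighbour "model" and the binary quality score are illustrative simplifications of the paper's continuous QS.

```python
# Noisy-cross-validation sketch: train a model on each half of the data,
# score each example with the model trained on the *other* half, and flag
# examples whose labels the cross-model disagrees with.

def nearest_label(x, train):
    # 1-nearest-neighbour stand-in for "a trained model" (purely illustrative).
    return min(train, key=lambda p: abs(p[0] - x))[1]

def ncv_quality(data):
    """data: list of (feature, label). Return per-example quality in {0, 1}."""
    half = len(data) // 2
    a, b = data[:half], data[half:]
    quality = []
    for fold, other in ((a, b), (b, a)):
        for x, y in fold:
            quality.append(1 if nearest_label(x, other) == y else 0)
    return quality
```

On a toy set where labels follow the feature except for one mislabeled point, only that point receives a low quality score.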
+ Abstract:
+ Modeling disease progression is an active area of research. Many computational methods for progression modeling have been developed, but mostly at the population level. In this paper, we formulate personalized disease progression modeling as a multi-task regression problem in which the estimation of progression scores at different time points is defined as a learning task. We introduce a Personalized Progression Modeling (PPM) scheme as a novel way to estimate personalized disease trajectories by jointly discovering clusters of similar patients while estimating disease progression scores. The approach is formulated as an optimization problem that can be solved using existing optimization techniques. We present efficient algorithms for the PPM scheme, together with experimental results on both synthetic and real-world healthcare data demonstrating its efficacy over 4 baseline methods representing the current state of the art. On synthetic data, our algorithm achieves over 40% accuracy improvement over all the baselines. On the healthcare application, PPM achieves a 4% average accuracy improvement over the state-of-the-art baseline in predicting viral infection progression. These results highlight significant modeling performance gains obtained with PPM.
+ Abstract:
+ Clinical notes in electronic health records contain highly heterogeneous writing styles, including non-standard terminology or abbreviations. Using these notes in predictive modeling has traditionally required preprocessing (e.g. taking frequent terms or topic modeling) that removes much of the richness of the source data. We propose a pretrained hierarchical recurrent neural network model that parses minimally processed clinical notes in an intuitive fashion, and show that it improves performance for discharge diagnosis classification tasks on the Medical Information Mart for Intensive Care III (MIMIC-III) dataset, compared to models that conduct no pretraining or that treat the notes as an unordered collection of terms. We also apply an attribution technique to examples to identify the words that the model uses to make its prediction, and show the importance of the words’ nearby context.
+ Abstract:
+ Industrial equipment, devices and patients typically undergo change from a healthy state to an unhealthy state. We develop a novel approach to detect unhealthy entities and also discover the time of change to enable deeper investigation into the cause for change. In the absence of an engineering or medical intervention, health degradation only happens in one direction --- healthy to unhealthy. Our transductive learning framework leverages this chronology of observations for learning a superior model with minimal supervision. Temporal Transduction is achieved by incorporating chronological constraints in the conventional max-margin classifier --- Support Vector Machines (SVM). We utilize stochastic gradient descent to solve the resulting optimization problem. Our experiments on publicly available benchmark datasets demonstrate the effectiveness of our approach in accurately detecting unhealthy entities with less supervision as compared to other strong baselines --- conventional and transductive SVM.
+ Abstract:
+ In many Machine Learning applications, it is important to reduce the set of features used in training. This is especially important when different attributes have different acquisition costs, e.g., various blood tests. Cost-sensitive feature selection methods aim to select a subset of attributes that yields a performant Machine Learning model while keeping the total cost low. In this paper, we propose a Bayesian Optimization approach to this task. We explore the different subsets of available features by optimizing an evaluation function that weights the model's performance and total feature cost. We evaluate the proposed method on different UCI datasets, as well as a real-life one, and compare it to diverse feature selection approaches. Our results demonstrate that the Bayesian optimization cost-sensitive feature selection (BOCFS) can select a low-cost subset of informative features, therefore generating highly effective classifiers, and achieving state-of-the-art performance in some datasets.
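The objective being optimized can be written as J(S) = perf(S) - lambda * cost(S) over feature subsets S. The sketch below pairs that objective with greedy forward selection as a deliberately simplified stand-in for the paper's Bayesian-optimization search; `perf` is any evaluation function (e.g. cross-validated accuracy) supplied by the caller, and all names are illustrative.

```python
# Cost-sensitive selection: maximize J(S) = perf(S) - lam * cost(S).
# Greedy forward search here is a simplified stand-in for Bayesian
# optimization over subsets.

def greedy_cost_sensitive_select(features, costs, perf, lam):
    """Greedily grow a feature set while J keeps improving."""
    selected = []
    best_j = perf(frozenset())  # empty set incurs zero cost
    improved = True
    while improved:
        improved = False
        for f in features:
            if f in selected:
                continue
            cand = frozenset(selected + [f])
            j = perf(cand) - lam * sum(costs[x] for x in cand)
            if j > best_j:
                best_j, best_f = j, f
                improved = True
        if improved:
            selected.append(best_f)
    return selected, best_j
```

With a toy `perf` where each feature contributes equally, the search keeps cheap informative features and rejects expensive ones whose cost outweighs their contribution.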
+ Abstract:
+ With the increase in popularity of deep learning models for natural language processing (NLP) tasks in the field of Pharmacovigilance, more specifically for the identification of Adverse Drug Reactions (ADRs), there is an inherent need for large-scale social-media datasets aimed at such tasks. Most researchers either allocate large amounts of time to crawl Twitter or buy expensive pre-curated datasets that are then manually annotated by humans; these approaches do not scale well as more and more data keeps flowing into Twitter. In this work we re-purpose a publicly available archived dataset of more than 9.4 billion Tweets with the objective of creating a very large dataset of drug usage-related tweets. Using existing manually curated datasets from the literature, we then validate our filtered tweets for relevance using machine learning methods, with the end result of a publicly available dataset of 1,181,993 tweets for public use. We provide all code and a detailed procedure on how to extract this dataset and the selected tweet ids for researchers to use.
+ Abstract:
+ Clinical notes contain information about patients beyond structured data such as lab values or medications. However, clinical notes have been underused relative to structured data, because notes are high-dimensional and sparse. We aim to develop and evaluate a continuous representation of clinical notes. Given this representation, our goal is to predict 30-day hospital readmission at various timepoints of admission, including early stages and at discharge. We apply bidirectional encoder representations from transformers (BERT) to clinical text. Publicly-released BERT parameters are trained on standard corpora such as Wikipedia and BookCorpus, which differ from clinical text. We therefore pre-train BERT using clinical notes and fine-tune the network for the task of predicting hospital readmission. This defines ClinicalBERT. ClinicalBERT uncovers high-quality relationships between medical concepts, as judged by physicians. ClinicalBERT outperforms various baselines on 30-day hospital readmission prediction using both discharge summaries and the first few days of notes in the intensive care unit on various clinically-motivated metrics. The attention weights of ClinicalBERT can also be used to interpret predictions. To facilitate research, we open-source model parameters, and scripts for training and evaluation. ClinicalBERT is a flexible framework to represent clinical notes. It improves on previous clinical text processing methods and with little engineering can be adapted to other clinical predictive tasks.
+ Abstract:
+ Problem lists are intended to provide clinicians with a relevant summary of patient medical issues and are embedded in many electronic health record systems. Despite their importance, problem lists are often cluttered with resolved or currently irrelevant conditions. In this work, we develop a novel end-to-end framework to first extract problem lists from clinical notes and subsequently use the extracted problems to predict patient outcomes. This framework is both more performant and more interpretable than existing models used within the domain, achieving an AU-ROC of 0.710 for bounceback readmission and 0.869 for in-hospital mortality occurring after ICU discharge. We identify risk factors for both readmission and mortality outcomes and demonstrate that it can be used to develop dynamic problem lists that present clinical problems along with their quantitative importance. This allows clinicians to both easily identify the relevant problems and gain insight into the factors driving the model’s prediction.
+ Abstract:
+ Deep learning is increasingly common in healthcare, yet transfer learning for physiological signals (e.g., temperature, heart rate) is under-explored. Here, we present a straightforward, yet performant framework for transferring knowledge about physiological signals. Our framework is called PHASE (PHysiologicAl Signal Embeddings). It i) learns deep embeddings of physiological signals and ii) predicts adverse outcomes based on the embeddings. PHASE is the first instance of deep transfer learning in a cross-hospital, cross-department setting for physiological signals. We show that PHASE's per-signal (one for each signal) LSTM embedding functions confer a number of benefits including improved performance, successful transference between hospitals, and lower computational cost.
+ Abstract:
+ Machine-learned diagnosis models have shown promise as medical aides but are trained under a closed-set assumption, i.e. that models will only encounter conditions on which they have been trained. However, it is practically infeasible to obtain sufficient training data for every human condition, and once deployed such models will invariably face previously unseen conditions. We frame machine-learned diagnosis as an open-set learning problem, and study how state-of-the-art approaches compare. Further, we extend our study to a setting where training data is distributed across several healthcare sites that do not allow data pooling, and experiment with different strategies of building open-set diagnostic ensembles. Across both settings, we observe consistent gains from explicitly modeling unseen conditions, but find the optimal training strategy to vary across settings.
+ Abstract:
+ This paper aims to evaluate the suitability of current deep learning methods for clinical workflow, focusing on dermatology. Although deep learning methods have been applied to reach dermatologist-level accuracy on several individual conditions, they have not been rigorously tested on common clinical complaints. Most projects involve data acquired in well-controlled laboratory conditions, which may not reflect regular clinical evaluation, where image quality is not always ideal. We test the robustness of deep learning methods by simulating non-ideal characteristics on user-submitted images of ten classes of diseases. Assessing via these imitated conditions, we found that overall accuracy drops and individual predictions change significantly in many cases, despite robust training.
+ Abstract:
+ Sentiment analysis is a well-researched field of machine learning and natural language processing generally concerned with determining the degree of positive or negative polarity in free text. Traditionally, such methods have focused on analyzing user opinions directed towards external entities such as products, news, or movies. However, less attention has been paid towards understanding the sentiment of human emotion in the form of internalized thoughts and expressions of self-reflection. Given the rise of public social media platforms and private online therapy services, the opportunity for designing accurate tools to quantify emotional states is at an all-time high. Based upon findings in psychological research, in this work we propose a new type of sentiment analysis task more appropriate for assessing the valence of human emotion. Rather than assessing text on a single polarity axis ranging from positive to negative, we analyze self-expressive thoughts using a two-dimensional assignment scheme with four sentiment categories: positive, negative, both positive and negative, and neither positive nor negative. This work details the collection of a novel annotated dataset of real-world mental health therapy logs and compares several machine learning methodologies for the accurate classification of emotional valence. We found superior performance using deep transfer learning approaches, and in particular, best results were obtained using the recent breakthrough method of BERT (Bidirectional Encoder Representations from Transformers). Based on these results, it is clear that transfer learning has the potential for greatly improving the accuracy of classifiers in the mental health domain, where labeled data is often scarce. Additionally, we argue that representing emotional sentiment on decoupled valence axes via four classification labels is an appropriate modification of traditional sentiment analysis for mental health tasks.
+ Abstract:
+ Electronic Health Records (EHRs) are commonly used by the machine learning community for research on problems specifically related to health care and medicine. EHRs have the advantages that they can be easily distributed and contain many features useful for e.g. classification problems. What makes EHR data sets different from typical machine learning data sets is that they are often very sparse, due to their high dimensionality, and often contain heterogeneous data types. Furthermore, the data sets deal with sensitive information, which limits the distribution of any models learned using them, due to privacy concerns. In this work, we explore using Generative Adversarial Networks to generate synthetic, heterogeneous EHRs with the goal of using these synthetic records in place of existing data sets. We will further explore applying differential privacy (DP) preserving optimization in order to produce differentially private synthetic EHR data sets, which provide rigorous privacy guarantees, and are therefore more easily shareable. The performance (measured by AUROC, AUPRC and accuracy) of our model's synthetic, heterogeneous data is very close to the original data set (within 6.4%) for the non-DP model when tested in a binary classification task. Although incurring a 20% performance penalty, the DP synthetic data is still useful for machine learning tasks. We additionally perform a sub-population analysis and find that our model does not introduce any bias into the synthetic EHR data compared to the baseline in either male/female populations, or the 0-18, 19-50 and 51+ age groups in terms of classification performance.
+ Abstract:
+ Intensive Care Unit Electronic Health Records (ICU EHRs) store multimodal data about patients including clinical notes, sparse and irregularly sampled physiological time series, lab results, and more. To date, most methods designed to learn predictive models from ICU EHR data have focused on a single modality. In this paper, we leverage the recently proposed interpolation-prediction deep learning architecture as a basis for exploring how physiological time series data and clinical notes can be integrated into a unified mortality prediction model. We study both early and late fusion approaches, and demonstrate how the relative predictive value of clinical text and physiological data change over time. Our results show that a late fusion approach can provide a statistically significant improvement in mortality prediction performance over using individual modalities in isolation.
+ Abstract:
+ Although there have been several recent advances in the application of deep learning algorithms to chest x-ray interpretation, we identify three major challenges for the translation of chest x-ray algorithms to the clinical setting. We examine the performance of the top 10 performing models on the CheXpert challenge leaderboard on three tasks: (1) TB detection, (2) pathology detection on photos of chest x-rays, and (3) pathology detection on data from an external institution. First, we find that the top 10 chest x-ray models on the CheXpert competition achieve an average AUC of 0.851 on the task of detecting TB on two public TB datasets without fine-tuning or including the TB labels in training data. Second, we find that the average performance of the models on photos of x-rays (AUC = 0.916) is similar to their performance on the original chest x-ray images (AUC = 0.924). Third, we find that the models tested on an external dataset either perform comparably to or exceed the average performance of radiologists. We believe that our investigation will inform rapid translation of deep learning algorithms to safe and effective clinical decision support tools that can be validated prospectively with large impact studies and clinical trials.
+ Abstract:
+ Documenting patients' interactions with health providers and institutions requires summarizing highly complex data. Medical coding reduces the dimensionality of this problem to a set of manually assigned codes that are used to bill, track patient health, and summarize a patient encounter. Incorrect coding, however, can lead to significant financial, legal, and health costs to clinics and patients. To address this, we build several deep learning models -- including transfer learning of state-of-the-art BERT models -- to predict medical codes on a novel dataset of 39,000 patient encounters. We also show through several labeling experiments that model performance is robust to subjectivity in the labels, and find that our models outperform a clinic's coding when judged against charts corrected and relabeled by an expert.
+ Abstract:
+ Representation learning is a commonly touted goal in machine learning for healthcare, and for good reason. If we could learn a numerical encoding of clinical data that reflects underlying physiological similarity, this would have significant benefits in both research and application. However, many works pursuing representation learning systems evaluate only according to traditional, single-task performance metrics, and fail to assess whether the representations they produce actually contain generalizable signals capturing this underlying notion of similarity. In this work, we design an evaluation procedure specifically for representation learning systems, and use it to analyze the value of large-scale multi-task representation learners. We find mixed results: multi-task representations are commonly helpful across a battery of prediction tasks and models, yet ensemble performance is often improved by removing tasks from the trained ensemble, and the learned representations demonstrate no ability to cluster.
+ Abstract:
+ In the last few years, the FDA has begun to recognize De Novo pathways (new approval processes) for approving AI as medical devices. A major concern with this is that the review process does not adequately test for biases in these models. There are many ways in which biases can arise in data, including during data collection, training, and model deployment. In this paper, we adopt a framework for categorizing the types of bias in datasets in a fine-grained way, which enables informed, targeted interventions for each issue appropriately. From there, we propose policy recommendations to the FDA and NIH to promote the deployment of more equitable AI diagnostic systems.
+ Abstract:
+ Electronic records contain sequences of events, some of which take place all at once in a single visit, and others that are dispersed over multiple visits, each with a different timestamp. We postulate that fine temporal detail, e.g., whether a series of blood tests is completed at once or in rapid succession, should not alter predictions based on these data. Motivated by this intuition, we propose models for analyzing sequences of multivariate clinical time series data that are invariant to this temporal clustering. We propose an efficient data augmentation technique that exploits the postulated temporal-clustering invariance to regularize deep neural networks optimized for several clinical prediction tasks. We introduce two techniques to temporally coarsen (downsample) irregular time series: (i) grouping the data points based on regularly-spaced timestamps; and (ii) clustering them, yielding irregularly-paced timestamps. Moreover, we propose a MultiResolution network with Shared Weights (MRSW), improving predictive accuracy by combining predictions based on input sequences transformed by different coarsening operators. Our experiments show that MRSW improves the mAP on the benchmark mortality prediction task from 51.53% to 53.92%.
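The first coarsening operator, grouping data points onto regularly-spaced timestamps, can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the name `coarsen_regular` and the choice of mean aggregation with bin-center timestamps are assumptions.

```python
import numpy as np

def coarsen_regular(times, values, bin_width):
    """Group irregularly-timed observations into regularly-spaced bins,
    averaging the values that fall into each bin (coarsening operator (i))."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    bins = np.floor(times / bin_width).astype(int)
    out_t, out_v = [], []
    for b in np.unique(bins):
        mask = bins == b
        out_t.append((b + 0.5) * bin_width)   # bin center as the new timestamp
        out_v.append(values[mask].mean())
    return np.array(out_t), np.array(out_v)

# Three rapid-succession lab values collapse into one coarsened point,
# realizing the temporal-clustering invariance the abstract postulates.
t, v = coarsen_regular([0.1, 0.2, 0.3, 5.0], [1.0, 2.0, 3.0, 10.0], bin_width=1.0)
```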
+ Abstract:
+ In survival analysis, deep learning approaches have recently been proposed for estimating an individual's probability of survival over some time horizon. Such approaches can capture complex non-linear relationships, without relying on restrictive assumptions regarding the specific form of the relationship between an individual's characteristics and their underlying survival process. To date, however, these methods have focused primarily on optimizing discriminative performance, and have ignored model calibration. Well-calibrated survival curves present realistic and meaningful probabilistic estimates of the true underlying survival process for an individual. However, due to the lack of ground-truth regarding the underlying stochastic process of survival for an individual, optimizing for and measuring calibration in survival analysis is an inherently difficult task. In this work, we i) propose a new loss function, for training deep nonparametric survival analysis models, that maximizes discriminative performance, subject to good calibration, and ii) present a calibration metric for survival analysis that facilitates model comparison. Through experiments on two publicly available clinical datasets, we show that our proposed approach achieves the same discriminative performance as state-of-the-art methods, while leading to over a 60% reduction in calibration error.
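The abstract does not spell out the proposed calibration metric, but the underlying idea of comparing predicted and observed survival can be conveyed by a generic binned calibration check at a fixed horizon. This sketch is ours, not the paper's metric, and it ignores censoring, which the paper necessarily handles.

```python
import numpy as np

def binned_calibration_error(pred_surv, event_by_t, n_bins=10):
    """Illustrative calibration check at a fixed horizon t: bin predicted
    survival probabilities and compare each bin's mean prediction with the
    observed survival fraction, weighted by bin occupancy."""
    pred = np.asarray(pred_surv, dtype=float)
    surv = 1.0 - np.asarray(event_by_t, dtype=float)  # 1 if event-free at t
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    err, total = 0.0, len(pred)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (pred >= lo) & (pred < hi) if hi < 1 else (pred >= lo) & (pred <= hi)
        if mask.any():
            err += mask.sum() / total * abs(pred[mask].mean() - surv[mask].mean())
    return err
```

A perfectly calibrated predictor scores 0; a predictor whose probabilities are inverted scores near 1.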
+ Abstract:
+ Electronic Health Records (EHR) are high-dimensional data with implicit connections among thousands of medical concepts. These connections, for instance, the co-occurrence of diseases and lab-disease correlations, can be informative when only a subset of these variables is documented by the clinician. A feasible approach to improving the representation learning of EHR data is to associate relevant medical concepts and utilize these connections. Existing medical ontologies can be the reference for EHR structures, but they place numerous constraints on the data source. Recent progress on graph neural networks (GNN) enables end-to-end learning of topological structures for non-grid or non-sequential data. However, open problems remain in how to learn the medical graph adaptively and how to understand the effect of the medical graph on representation learning. In this paper, we propose a variationally regularized encoder-decoder graph network that achieves more robustness in graph structure learning by regularizing node representations. Our model outperforms existing graph-based and non-graph-based methods in various EHR predictive tasks based on both public data and real-world clinical data. Beyond the improvements in empirical performance, we provide an interpretation of the effect of variational regularization compared to a standard graph neural network, using singular value analysis.
+
+
+ Affinitention Nets: Kernel Perspective on Attention Architectures for Set Classification with Applications to Medical Text and Images
+
+
+ David Dov, Serge Assaad, Shijing Si, and Rui Wang (Duke University) , Hongteng Xu (Renmin University of China) , Shahar Ziv Kovalsky (UNC at Chapel Hill) , Jonathan Bell and Danielle Elliott Range (Duke University Hospital) , Jonathan Cohen (Kaplan Medical Center) , Ricardo Henao and Lawrence Carin (Duke University)
+
+ Abstract:
+ Set classification is the task of predicting a single label from a set comprising multiple instances. The examples we consider are pathology slides represented by sets of patches and medical text represented by sets of word embeddings. State-of-the-art methods, such as transformers, typically use attention mechanisms to learn representations of set data by modeling interactions between instances of the set. These methods, however, have complex heuristic architectures comprising multiple heads and layers. The complexity of attention architectures hampers their training when only a small number of labeled sets is available, as is often the case in medical applications. To address this problem, we present a kernel-based representation learning framework that relates learning affinity kernels to learning representations with attention architectures. We show that learning a combination of the sum and the product of kernels is equivalent to learning representations from multi-head multi-layer attention architectures. From our framework, we devise a simplified attention architecture which we term affinitention (affinity-attention) nets. We demonstrate the application of affinitention nets to the classification of the Set-Cifar10 dataset, thyroid malignancy prediction from pathology slides, and patient text-message triage. We show that affinitention nets provide competitive results compared to heuristic attention architectures and outperform other competing methods.
+
+
+ Abstract:
+ Machine Learning, and in particular Federated Machine Learning, opens new perspectives in terms of medical research and patient care. Although Federated Machine Learning improves over centralized Machine Learning in terms of privacy, it does not provide provable privacy guarantees. Furthermore, Federated Machine Learning is quite expensive in terms of bandwidth consumption, as it requires participant nodes to regularly exchange large updates. This paper proposes a bandwidth-efficient, privacy-preserving Federated Learning scheme that provides theoretical privacy guarantees based on Differential Privacy. We experimentally evaluate our proposal for in-hospital mortality prediction using a real dataset containing Electronic Health Records of about one million patients. Our results suggest that strong and provable patient-level privacy can be enforced at the expense of only a moderate loss of prediction accuracy.
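A standard building block for differentially private federated updates, in the DP-SGD family, is to clip each participant's update to bound its sensitivity and then add Gaussian noise calibrated to the clipping norm. The sketch below illustrates that mechanism only; the function name and parameter values are ours, not the paper's protocol.

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip a participant's model update to norm <= clip_norm (bounding its
    sensitivity), then add Gaussian noise scaled to that norm, as in the
    Gaussian mechanism underlying DP-SGD-style federated training."""
    rng = np.random.default_rng(rng)
    update = np.asarray(update, dtype=float)
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise
```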
+
+
+ Abstract:
+ Recurrent Neural Networks (RNNs) are often used for sequential modeling of adverse outcomes in electronic health records (EHRs) due to their ability to encode past clinical states. These deep, recurrent architectures have displayed increased performance compared to other modeling approaches in a number of tasks, fueling the interest in deploying deep models in clinical settings. One of the key elements in ensuring safe model deployment and building user trust is model explainability. Testing with Concept Activation Vectors (TCAV) has recently been introduced as a way of providing human-understandable explanations by comparing high-level concepts to the network's gradients. While the technique has shown promising results in real-world imaging applications, it has not been applied to structured temporal inputs. To enable an application of TCAV to sequential predictions in the EHR, we propose an extension of the method to time series data. We evaluate the proposed approach on an open EHR benchmark from the intensive care unit, as well as synthetic data where we are able to better isolate individual effects.
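TCAV's core quantity can be sketched with numpy. Here the concept activation vector is a difference-of-means stand-in for the linear classifier used in the original TCAV paper, and both function names are ours; the paper's extension additionally handles the temporal dimension of EHR inputs.

```python
import numpy as np

def concept_activation_vector(concept_acts, random_acts):
    """A simple CAV: the direction separating concept examples from random
    ones in activation space (difference of means, unit-normalized)."""
    cav = np.mean(concept_acts, axis=0) - np.mean(random_acts, axis=0)
    return cav / np.linalg.norm(cav)

def tcav_score(gradients, cav):
    """TCAV score: fraction of inputs whose class-logit gradient (taken in
    activation space) has a positive directional derivative along the CAV."""
    return float(np.mean(np.asarray(gradients) @ np.asarray(cav) > 0))
```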
+
+
+ Abstract:
+ Sharing data is critical to generate large data sets required for the training of machine learning models. Trustworthy machine learning requires incentives, guarantees of data quality, and information privacy. Applying recent advancements in data valuation methods for machine learning can help to enable these. In this work, we analyze the suitability of three different data valuation methods for medical image classification tasks, specifically pleural effusion, on an extensive data set of chest x-ray scans. Our results reveal that a heuristic for calculating the Shapley valuation scheme based on a k-nearest neighbor classifier can successfully value large quantities of data instances. We also demonstrate possible applications for incentivizing data sharing, the efficient detection of mislabeled data, and summarizing data sets to exclude private information. Thereby, this work contributes to developing modern data infrastructures for trustworthy machine learning in health care.
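The k-nearest-neighbor Shapley heuristic the abstract refers to (in the style of Jia et al.) admits an exact closed-form recursion over training points sorted by distance to a test point. This is our reconstruction of that recursion for a single test point, not the paper's code.

```python
import numpy as np

def knn_shapley(X_train, y_train, x_test, y_test, K=3):
    """Exact Shapley value of each training point for a K-NN classifier's
    accuracy on one test point: sort by distance, then sweep from the
    farthest point toward the nearest (O(N log N) per test point)."""
    X = np.asarray(X_train, dtype=float)
    order = np.argsort(np.linalg.norm(X - np.asarray(x_test, dtype=float), axis=1))
    match = (np.asarray(y_train)[order] == y_test).astype(float)
    N = len(match)
    s = np.zeros(N)
    s[N - 1] = match[N - 1] / N          # farthest point
    for i in range(N - 2, -1, -1):       # sweep toward the nearest point
        s[i] = s[i + 1] + (match[i] - match[i + 1]) / K * min(K, i + 1) / (i + 1)
    values = np.zeros(N)
    values[order] = s                    # undo the distance sort
    return values
```

By the efficiency property, the values sum to the K-NN utility on the test point (the fraction of the K nearest neighbors with the correct label), which makes mislabeled points stand out with negative values.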
+
+
+ Abstract:
+ The pressure of ever-increasing patient demand and budget restrictions make hospital bed management a daily challenge for clinical staff. Most critical is the efficient allocation of resource-heavy Intensive Care Unit (ICU) beds to the patients who need life support. Central to solving this problem is knowing for how long the current set of ICU patients are likely to stay in the unit. In this work, we propose a new deep learning model based on the combination of temporal convolution and pointwise (1x1) convolution, to solve the length of stay prediction task on the eICU and MIMIC-IV critical care datasets. The model - which we refer to as Temporal Pointwise Convolution (TPC) - is specifically designed to mitigate common challenges with Electronic Health Records, such as skewness, irregular sampling and missing data. In doing so, we have achieved significant performance benefits of 18-68% (metric and dataset dependent) over the commonly used Long Short-Term Memory (LSTM) network, and the multi-head self-attention network known as the Transformer. By adding mortality prediction as a side-task, we can improve performance further still, resulting in a mean absolute deviation of 1.55 days (eICU) and 2.28 days (MIMIC-IV) on predicting remaining length of stay.
+
+
+ Abstract:
+ Wearable devices such as smartwatches are becoming increasingly popular tools for objectively monitoring physical activity in free-living conditions. To date, research has primarily focused on the purely supervised task of human activity recognition, demonstrating limited success in inferring high-level health outcomes from low-level signals. Here, we present a novel self-supervised representation learning method using activity and heart rate (HR) signals without semantic labels. With a deep neural network, we set HR responses as the supervisory signal for the activity data, leveraging their underlying physiological relationship. In addition, we propose a custom quantile loss function that accounts for the long-tailed HR distribution present in the general population. We evaluate our model in the largest free-living combined-sensing dataset (comprising >280k hours of wrist accelerometer & wearable ECG data). Our contributions are two-fold: i) the pre-training task creates a model that can accurately forecast HR based only on cheap activity sensors, and ii) we leverage the information captured through this task by proposing a simple method to aggregate the learnt latent representations (embeddings) from the window-level to user-level. Notably, we show that the embeddings can generalize in various downstream tasks through transfer learning with linear classifiers, capturing physiologically meaningful, personalized information. For instance, they can be used to predict variables associated with individuals' health, fitness and demographic characteristics (AUC >70), outperforming unsupervised autoencoders and common bio-markers. Overall, we propose the first multimodal self-supervised method for behavioral and physiological data with implications for large-scale health and lifestyle monitoring.
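The quantile (pinball) loss the abstract mentions has a standard form; the custom variant in the paper may differ, so treat this as the textbook version only.

```python
import numpy as np

def quantile_loss(y_true, y_pred, q):
    """Pinball loss at quantile q. With q > 0.5, under-predicting the target
    (common for rare, high heart rates under a long-tailed distribution) is
    penalized more heavily than over-predicting it by the same amount."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.maximum(q * err, (q - 1) * err)))
```

At q = 0.9, under-predicting by 10 bpm costs 9, while over-predicting by 10 bpm costs only 1, pushing the model to track the distribution's upper tail.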
+ Abstract:
+ In several crucial applications, domain knowledge is encoded by a system of ordinary differential equations (ODE), often stemming from underlying physical and biological processes. A motivating example is intensive care unit patients: the dynamics of vital physiological functions, such as the cardiovascular system with its associated variables (heart rate, cardiac contractility and output and vascular resistance) can be approximately described by a known system of ODEs. Typically, some of the ODE variables are directly observed (heart rate and blood pressure for example) while some are unobserved (cardiac contractility, output and vascular resistance), and in addition many other variables are observed but not modeled by the ODE, for example body temperature. Importantly, the unobserved ODE variables are "known-unknowns": We know they exist and their functional dynamics, but cannot measure them directly, nor do we know the function tying them to all observed measurements. As is often the case in medicine, and specifically the cardiovascular system, estimating these known-unknowns is highly valuable and they serve as targets for therapeutic manipulations. Under this scenario we wish to learn the parameters of the ODE generating each observed time-series, and extrapolate the future of the ODE variables and the observations. We address this task with a variational autoencoder incorporating the known ODE function, called GOKU-net for Generative ODE modeling with Known Unknowns. We first validate our method on videos of single and double pendulums with unknown length or mass; we then apply it to a model of the cardiovascular system. We show that modeling the known-unknowns allows us to successfully discover clinically meaningful unobserved system parameters, leads to much better extrapolation, and enables learning using much smaller training sets.
+
+
+ Learning to Predict with Supporting Evidence: Applications to Clinical Risk Prediction
+
+
+ Aniruddh Raghu and John Guttag (Massachusetts Institute of Technology) , Katherine Young (Harvard Medical School) , Eugene Pomerantsev (Massachusetts General Hospital) , Adrian V. Dalca (Harvard Medical School & MIT) , Collin M. Stultz (Massachusetts Institute of Technology)
+
+ Abstract:
+ The impact of machine learning models on healthcare will depend on the degree of trust that healthcare professionals place in the predictions made by these models. In this paper, we present a method to provide clinical experts with domain-relevant evidence about why a prediction should be trusted. We first design a probabilistic model that relates meaningful latent concepts to prediction targets and observed data. Inference of latent variables in this model corresponds to both making a prediction and providing supporting evidence for that prediction. We present a two-step process to efficiently approximate inference: (i) estimating model parameters using variational learning, and (ii) approximating maximum a posteriori estimation of latent variables in the model using a neural network trained with an objective derived from the probabilistic model. We demonstrate the method on the task of predicting mortality risk for cardiovascular patients. Specifically, using electrocardiogram and tabular data as input, we show that our approach provides appropriate domain-relevant supporting evidence for accurate predictions.
+
+
+ VisualCheXbert: Addressing the Discrepancy Between Radiology Report Labels and Image Labels
+
+
+ Saahil Jain and Akshay Smit (Stanford University) , Steven QH Truong, Chanh DT Nguyen, and Minh-Thanh Huynh (VinBrain) , Mudit Jain (unaffiliated) , Victoria A. Young, Andrew Y. Ng, Matthew P. Lungren, and Pranav Rajpurkar (Stanford University)
+
+ Abstract:
+ Automatic extraction of medical conditions from free-text radiology reports is critical for supervising computer vision models to interpret medical images. In this work, we show that radiologists labeling reports significantly disagree with radiologists labeling corresponding chest X-ray images, which reduces the quality of report labels as proxies for image labels. We develop and evaluate methods to produce labels from radiology reports that have better agreement with radiologists labeling images. Our best performing method, called VisualCheXbert, uses a biomedically-pretrained BERT model to directly map from a radiology report to the image labels, with a supervisory signal determined by a computer vision model trained to detect medical conditions from chest X-ray images. We find that VisualCheXbert outperforms an approach using an existing radiology report labeler by an average F1 score of 0.14 (95% CI 0.12, 0.17). We also find that VisualCheXbert better agrees with radiologists labeling chest X-ray images than do radiologists labeling the corresponding radiology reports by an average F1 score across several medical conditions of between 0.12 (95% CI 0.09, 0.15) and 0.21 (95% CI 0.18, 0.24).
+
+
+ Abstract:
+ Deep learning methods for chest X-ray interpretation typically rely on pretrained models developed for ImageNet. This paradigm assumes that better ImageNet architectures perform better on chest X-ray tasks and that ImageNet-pretrained weights provide a performance boost over random initialization. In this work, we compare the transfer performance and parameter efficiency of 16 popular convolutional architectures on a large chest X-ray dataset (CheXpert) to investigate these assumptions. First, we find no relationship between ImageNet performance and CheXpert performance for both models without pretraining and models with pretraining. Second, we find that, for models without pretraining, the choice of model family influences performance more than size within a family for medical imaging tasks. Third, we observe that ImageNet pretraining yields a statistically significant boost in performance across architectures, with a higher boost for smaller architectures. Fourth, we examine whether ImageNet architectures are unnecessarily large for CheXpert by truncating final blocks from pretrained models, and find that we can make models 3.25x more parameter-efficient on average without a statistically significant drop in performance. Our work contributes new experimental evidence about the relationship between ImageNet and chest X-ray interpretation performance.
+
+
+ Abstract:
+ Recent advances in training deep learning models have demonstrated the potential to provide accurate chest X-ray interpretation and increase access to radiology expertise. However, poor generalization due to data distribution shifts in clinical settings is a key barrier to implementation. In this study, we measured the diagnostic performance for 8 different chest X-ray models when applied to (1) smartphone photos of chest X-rays and (2) external datasets without any finetuning. All models were developed by different groups and submitted to the CheXpert challenge, and re-applied to test datasets without further tuning. We found that (1) on photos of chest X-rays, all 8 models experienced a statistically significant drop in task performance, but only 3 performed significantly worse than radiologists on average, and (2) on the external set, none of the models performed statistically significantly worse than radiologists, and five models performed statistically significantly better than radiologists. Our results demonstrate that some chest X-ray models, under clinically relevant distribution shifts, were comparable to radiologists while other models were not. Future work should investigate aspects of model training procedures and dataset collection that influence generalization in the presence of data distribution shifts.
+
+
+ Abstract:
+ Balanced representation learning methods have been applied successfully to counterfactual inference from observational data. However, approaches that account for survival outcomes are relatively limited. Survival data are frequently encountered across diverse medical applications, e.g., drug development, risk profiling, and clinical trials, and such data are also relevant in fields like manufacturing (e.g., for equipment monitoring). When the outcome of interest is a time-to-event, special precautions for handling censored events need to be taken, as ignoring censored outcomes may lead to biased estimates. We propose a theoretically grounded unified framework for counterfactual inference applicable to survival outcomes. Further, we formulate a nonparametric hazard ratio metric for evaluating average and individualized treatment effects. Experimental results on real-world and semi-synthetic datasets, the latter of which we introduce, demonstrate that the proposed approach significantly outperforms competitive alternatives in both survival-outcome prediction and treatment-effect estimation.
+
+
+ Abstract:
+ Generating a novel and optimized molecule with desired chemical properties is an essential part of the drug discovery process. Failure to meet one of the required properties can frequently lead to failure in a clinical test, which is costly. In addition, optimizing these multiple properties is a challenging task because the optimization of one property is prone to changing other properties. In this paper, we pose this multi-property optimization problem as a sequence translation process and propose a new optimized molecule generator model based on the Transformer with two constraint networks: property prediction and similarity prediction. We further improve the model by incorporating score predictions from these constraint networks in a modified beam search algorithm. The experiments demonstrate that our proposed model outperforms state-of-the-art models by a significant margin for optimizing multiple properties simultaneously.
+
+
+ MetaPhys: Few-Shot Adaptation for Non-Contact Physiological Measurement
+
+
+ Xin Liu and Ziheng Jiang (University of Washington) , Josh Fromm (OctoML) , Xuhai Xu and Shwetak Patel (University of Washington) , Daniel McDuff (Microsoft Research)
+
+ Abstract:
+ There are large individual differences in physiological processes, making designing personalized health sensing algorithms challenging. Existing machine learning systems struggle to generalize well to unseen subjects or contexts and can often contain problematic biases. Video-based physiological measurement is not an exception. Therefore, learning personalized or customized models from a small number of unlabeled samples is very attractive, as it would allow fast calibrations to improve generalization and help correct biases. In this paper, we present a novel meta-learning approach called MetaPhys for personalized video-based cardiac measurement. Our method uses only 18 seconds of video for customization and works effectively in both supervised and unsupervised settings. We evaluate our proposed approach on two benchmark datasets and demonstrate superior performance in cross-dataset evaluation with substantial reductions (42% to 44%) in errors compared with state-of-the-art approaches. We also demonstrate that our proposed method significantly helps reduce bias across skin types.
+
+
+ Abstract:
+ Machine learning algorithms in healthcare have the potential to continually learn from real-world data generated during healthcare delivery and adapt to dataset shifts. As such, regulatory bodies like the US FDA have begun discussions on how to autonomously approve modifications to algorithms. Current proposals evaluate algorithmic modifications via hypothesis testing. However, these methods are only able to define and control the online error rate if the data is stationary over time, which is unlikely to hold in practice. In this manuscript, we investigate designing approval policies for modifications to ML algorithms in the presence of distributional shifts. Our key observation is that the approval policy that is most efficient at identifying and approving beneficial modifications varies across different problem settings. So rather than selecting a fixed approval policy a priori, we propose learning the best approval policy by searching over a family of approval strategies. We define a family of strategies that range in their level of optimism when approving modifications. This family includes the pessimistic strategy that, in fact, rescinds approval, which is necessary when no version of the ML algorithm performs well. We use the exponentially weighted averaging forecaster (EWAF) to learn the most appropriate strategy and derive tighter regret bounds assuming the distributional shifts are bounded. In simulation studies and empirical analyses, we find that wrapping approval strategies within the EWAF algorithm is a simple yet effective strategy that can help protect against distributional shifts without significantly slowing down approval of beneficial modifications.
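The exponentially weighted averaging forecaster at the heart of this approach is a classic online-learning algorithm; a minimal sketch, with the function name and learning rate as our own choices, looks like this:

```python
import numpy as np

def ewaf_weights(losses, eta=0.5):
    """Exponentially weighted averaging forecaster: keep one weight per
    expert (here, per approval strategy) and shrink each weight
    exponentially in its incurred loss. Returns the normalized mixture
    weights used at every round."""
    losses = np.asarray(losses, dtype=float)      # shape (rounds, n_experts)
    w = np.ones(losses.shape[1]) / losses.shape[1]
    history = []
    for round_losses in losses:
        history.append(w.copy())                  # weights used this round
        w *= np.exp(-eta * round_losses)          # exponential down-weighting
        w /= w.sum()
    return np.array(history)
```

Strategies that repeatedly incur loss (e.g., an over-optimistic approval policy under distribution shift) see their influence decay geometrically, which is what yields the regret guarantees the abstract mentions.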
+
+
+ Abstract:
+ The black-box nature of deep networks makes explaining why they make certain predictions extremely challenging. Saliency maps are one of the most widely-used local explanation tools to alleviate this problem. One of the primary approaches for generating saliency maps is by optimizing a mask over the input dimensions so that the output of the network is influenced the most by the masking. However, prior work only studies such influence by removing evidence from the input. In this paper, we present iGOS++, a framework to generate saliency maps that are optimized for altering the output of the black-box system by either removing or preserving only a small fraction of the input. Additionally, we propose to add a bilateral total variation term to the optimization that improves the continuity of the saliency map, especially at high resolution and with thin object parts. The evaluation results from comparing iGOS++ against state-of-the-art saliency map methods show significant improvement in locating salient regions that are directly interpretable by humans. We utilized iGOS++ in the task of classifying COVID-19 cases from x-ray images and discovered that the CNN is sometimes overfitted to the characters printed on the x-ray images when performing classification. Fixing this issue by data cleansing significantly improved the precision and recall of the classifier.
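The bilateral total-variation idea can be illustrated in a few lines of numpy: penalize differences between neighboring mask values, but down-weight the penalty across strong image edges. This is an illustrative sketch under our own weighting choice (an exponential kernel on intensity differences), not the paper's exact term.

```python
import numpy as np

def bilateral_tv(mask, image, sigma=0.1):
    """Bilateral total variation: sum of absolute differences between
    neighboring mask values, weighted to be small across strong image
    edges, so the saliency map stays smooth without blurring thin parts."""
    dx = np.abs(np.diff(mask, axis=1))
    dy = np.abs(np.diff(mask, axis=0))
    wx = np.exp(-np.abs(np.diff(image, axis=1)) / sigma)  # low weight at edges
    wy = np.exp(-np.abs(np.diff(image, axis=0)) / sigma)
    return float((wx * dx).sum() + (wy * dy).sum())
```

A mask jump that coincides with an image edge is penalized far less than the same jump on a flat region, which is what lets the mask hug thin object boundaries.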
+
+
+ Abstract:
+ Despite the large number of patients in Electronic Health Records (EHRs), the subset of usable data for modeling outcomes of specific phenotypes is often imbalanced and of modest size. This can be attributed to the uneven coverage of medical concepts in EHRs. We propose OMTL, an Ontology-driven Multi-Task Learning framework, that is designed to overcome such data limitations. The key contribution of our work is the effective use of knowledge from a predefined well-established medical relationship graph (ontology) to construct a novel deep learning network architecture that mirrors this ontology. This enables common representations to be shared across related phenotypes, which was found to improve learning performance. The proposed OMTL naturally allows for multi-task learning of different phenotypes on distinct predictive tasks. These phenotypes are tied together by their semantic relationship according to the external medical ontology. Using the publicly available MIMIC-III database, we evaluate OMTL and demonstrate its efficacy on several real patient outcome prediction tasks, comparing against state-of-the-art multi-task learning schemes. The results of evaluating the proposed approach on six experiments show improvements of 9% in the area under the ROC curve and 8% in the area under the precision-recall curve.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Abstract:
+ A single gene can encode different protein versions through a process called alternative splicing. Since proteins play major roles in cellular functions, aberrant splicing profiles can result in a variety of diseases, including cancers. Alternative splicing is determined by the gene's primary sequence and other regulatory factors, such as RNA-binding protein levels. With these as input, we formulate the prediction of RNA splicing as a regression task and build a new training dataset (CAPD) to benchmark learned models. We propose the discrete compositional energy network (DCEN), which leverages the hierarchical relationships between splice sites, junctions, and transcripts to approach this task. For alternative splicing prediction, DCEN models mRNA transcript probabilities through the energy values of their constituent splice junctions. These transcript probabilities are subsequently mapped to relative abundance values of key nucleotides and trained with ground-truth experimental measurements. Through our experiments on CAPD, we show that DCEN outperforms baselines and ablation variants.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Predictive Models for Colorectal Cancer Recurrence Using Multi-modal Healthcare Data
+
+
+ Danliang Ho (National University of Singapore) , Iain Bee Huat Tan (National Cancer Center Singapore) , Mehul Motani (National University of Singapore)
+
+ Abstract:
+ Colorectal cancer recurrence is a major clinical problem - around 30-40% of patients who are treated with curative intent surgery will experience cancer relapse. Proactive prognostication is critical for early detection and treatment of recurrence. However, the common clinical approach to monitoring recurrence through testing for carcinoembryonic antigen (CEA) does not provide strong prognostic performance. In our paper, we study a series of machine and deep learning architectures that exploit heterogeneous healthcare data to predict colorectal cancer recurrence. In particular, we demonstrate three different approaches to extract and integrate features from multiple modalities, including longitudinal as well as tabular clinical data. Our best model employs a hybrid architecture that takes in multi-modal inputs and comprises: 1) a Transformer model carefully modified to extract high-quality features from time-series data, and 2) a Multi-Layered Perceptron (MLP) that learns tabular data features, followed by feature integration and classification for prediction of recurrence. It achieves an AUROC score of 0.95, as well as precision, sensitivity and specificity scores of 0.83, 0.80 and 0.96 respectively, surpassing all known published results based on CEA, as well as most commercially available diagnostic assays. Our results could lead to better post-operative management and follow-up of colorectal cancer patients.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ B-SegNet: Branched-SegMentor Network For Skin Lesion Segmentation
+
+
+ Shreshth Saini (Indian Institute of Technology Jodhpur) , Jeon Young Seok and Mengling Feng (Saw Swee Hock School of Public Health, National University Health System, National University of Singapore)
+
+ Abstract:
+ Melanoma is the deadliest form of skin cancer. Early diagnosis of the disease and an accurate estimation of its size and shape are crucial in preventing its spread to other body parts. Manual segmentation of these lesions by a radiologist, however, is time-consuming and error-prone. It is clinically desirable to have an automatic tool to detect malignant skin lesions from dermoscopic skin images. We propose a novel end-to-end convolutional neural network (CNN) for precise and robust skin lesion localization and segmentation. The proposed network has 3 sub-encoders branching out from the main encoder. The 3 sub-encoders are inspired by Coordinate Convolution, Hourglass, and Octave Convolutional blocks: each sub-encoder summarizes different patterns, yet collectively they aim to achieve a precise segmentation. We trained our segmentation model on the ISIC 2018 dataset alone. To demonstrate its generalizability, we evaluated our model on ISIC 2018 as well as unseen datasets, including ISIC 2017 and PH$^2$. Our approach showed an average 5% improvement in performance across datasets while having fewer than half the parameters of other state-of-the-art segmentation models.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Modeling Longitudinal Dynamics of Comorbidities
+
+
+ Basil Maag, Stefan Feuerriegel, and Mathias Kraus (ETH Zurich) , Maytal Saar-Tsechansky (University of Texas at Austin) , Thomas Zueger (1. Inselspital, Bern University Hospital, University of Bern; 2. ETH Zurich)
+
+ Abstract:
+ In medicine, comorbidities refer to the presence of multiple co-occurring diseases. Due to their co-occurring nature, the course of one comorbidity is often highly dependent on the course of the other disease and, hence, treatments can have significant spill-over effects. Despite the prevalence of comorbidities among patients, a comprehensive statistical framework for modeling their longitudinal dynamics is missing. In this paper, we propose a probabilistic model for analyzing comorbidity dynamics over time in patients. Specifically, we develop a coupled hidden Markov model with a personalized, non-homogeneous transition mechanism, named Comorbidity-HMM. The specification of our Comorbidity-HMM is informed by clinical research: (1) It accounts for different disease states (i.e., acute, stable) in the disease progression by introducing latent states that are of clinical meaning. (2) It models a coupling among the trajectories of comorbidities to capture co-evolution dynamics. (3) It considers between-patient heterogeneity (e.g., risk factors, treatments) in the transition mechanism. Based on our model, we define a spill-over effect that measures the indirect effect of treatments on patient trajectories through coupling (i.e., through comorbidity co-evolution). We evaluated our proposed Comorbidity-HMM on 675 health trajectories, investigating the joint progression of diabetes mellitus and chronic liver disease. Compared to alternative models without coupling, we find that our Comorbidity-HMM achieves a superior fit. Further, we quantify the spill-over effect, that is, the extent to which diabetes treatments are associated with a change in the chronic liver disease from an acute to a stable disease state. Hence, our model is of direct relevance for both treatment planning and clinical research in the context of comorbidities.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Abstract:
+ Generating interpretable visualizations of multivariate time series in the intensive care unit is of great practical importance. Clinicians seek to condense complex clinical observations into intuitively understandable critical illness patterns, like failures of different organ systems. They would greatly benefit from a low-dimensional representation in which the trajectories of the patients' pathology become apparent and relevant health features are highlighted. To this end, we propose to use the latent topological structure of Self-Organizing Maps (SOMs) to achieve an interpretable latent representation of ICU time series and combine it with recent advances in deep clustering. Specifically, we (a) present a novel way to fit SOMs with probabilistic cluster assignments (PSOM), (b) propose a new deep architecture for probabilistic clustering (DPSOM) using a VAE, and (c) extend our architecture to cluster and forecast clinical states in time series (T-DPSOM). We show that our model achieves superior clustering performance compared to state-of-the-art SOM-based clustering methods while maintaining the favorable visualization properties of SOMs. On the eICU dataset, we demonstrate that T-DPSOM provides interpretable visualizations of patient state trajectories and uncertainty estimation. We show that our method rediscovers well-known clinical patient characteristics, such as a dynamic variant of the Acute Physiology And Chronic Health Evaluation (APACHE) score. Moreover, we illustrate how it can disentangle individual organ dysfunctions on disjoint regions of the two-dimensional SOM map.
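+ The DPSOM architecture itself is not shown here, but the classical Self-Organizing Map update that supplies the latent topological structure can be illustrated in a few lines. The grid size, toy data, and decay schedules below are arbitrary choices for illustration, not those of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# 3x3 grid of SOM nodes; each node carries a weight vector in data space.
grid = np.array([(i, j) for i in range(3) for j in range(3)], dtype=float)
weights = rng.random((9, 2))

# Toy 2-D data drawn around two cluster centers (stand-ins for patient states).
centers = [(0.1, 0.1), (0.9, 0.9)]
data = np.vstack([rng.normal(c, 0.05, size=(50, 2)) for c in centers])

for t in range(1000):
    x = data[rng.integers(len(data))]
    bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
    d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)         # distance on the 2-D grid
    sigma = max(1.5 * np.exp(-t / 300), 0.3)           # shrinking neighborhood
    lr = max(0.5 * np.exp(-t / 300), 0.02)             # decaying learning rate
    h = np.exp(-d2 / (2 * sigma ** 2))                 # neighborhood kernel
    weights += lr * h[:, None] * (x - weights)         # pull nodes toward x
```

Because neighbors on the grid are updated together, nearby nodes end up representing similar inputs; this grid topology is what makes the resulting two-dimensional map interpretable. PSOM replaces the hard best-matching-unit assignment with probabilistic (soft) cluster assignments.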
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Abstract:
+ Wearable technology opens opportunities to reduce sedentary behavior; however, commercially available devices do not provide tailored coaching strategies. Just-In-Time Adaptive Interventions (JITAIs) provide such a framework; however, most JITAIs remain conceptual to date. We conduct a study to evaluate just-in-time nudges in free-living conditions in terms of receptiveness and nudge impact. We first quantify baseline behavioral patterns in context using features such as location and step count, and assess differences in individual responses. We show there is a strong inverse relationship between average daily step counts and time spent being sedentary, indicating that steps are taken steadily throughout the day rather than in large bursts. Interestingly, the effect of nudges delivered at the workplace is larger in terms of step count than that of nudges delivered at home. We develop Random Forest models to learn nudge receptiveness using both individualized and contextualized data. We show that step count is the least important feature for determining nudge receptiveness, while location is the most important. Furthermore, we compare the developed models with a commercially available smart coach using post-hoc analysis. The results show that using contextualized and individualized information significantly outperforms non-JITAI approaches in determining nudge receptiveness.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ A Comprehensive EHR Timeseries Pre-training Benchmark
+
+
+ Matthew McDermott (Massachusetts Institute of Technology) , Bret Nestor (University of Toronto) , Evan Kim (Massachusetts Institute of Technology) , Wancong Zhang (New York University) , Anna Goldenberg (Hospital for Sick Children, University of Toronto, Vector Institute) , Peter Szolovits (MIT) , Marzyeh Ghassemi (University of Toronto , Vector Institute for Artificial Intelligence)
+
+ Abstract:
+ Pre-training (PT) has been used successfully in many areas of machine learning. One area where PT could be extremely impactful is electronic health record (EHR) data. Successful PT strategies for this modality could improve model performance in data-scarce contexts, such as modeling rare diseases, or allow smaller hospitals to benefit from data from larger health systems. While many PT strategies have been explored in other domains, much less exploration has occurred for EHR data. One reason may be the lack of standardized benchmarks suitable for developing and testing PT algorithms. In this work, we establish a PT benchmark dataset for EHR timeseries data, defining cohorts, a diverse set of fine-tuning tasks, and PT-focused evaluation regimes across two public EHR datasets: MIMIC-III and eICU. This benchmark fills an essential hole in the field by enabling a robust manner of iterating on PT strategies for this modality. To show the value of this benchmark and provide baselines for further research, we also profile two simple PT algorithms: a self-supervised, masked imputation system and a weakly-supervised, multi-task system. We find that PT strategies (in particular, weakly-supervised PT methods) can offer significant gains over traditional learning in few-shot settings, especially on tasks with strong class imbalance. Our full benchmark and code are publicly available at https://github.com/mmcdermott/comprehensive_MTL_EHR.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Abstract:
+ Clinical machine learning models have been found to degrade significantly in performance in hospitals or regions not seen during training. Recent developments in domain generalization offer a promising solution to this problem by creating models that learn invariances holding across environments. In this work, we benchmark the performance of eight domain generalization methods on clinical time series and medical imaging data. We introduce a framework to induce practical confounding and sampling bias in order to stress-test these methods beyond existing non-healthcare benchmarks. Consistent with prior work, we find that current domain generalization methods do not achieve significant gains in out-of-distribution performance over empirical risk minimization on real-world medical imaging data. However, we do find a subset of realistic confounding scenarios where significant performance gains are observed. We characterize these scenarios in detail and recommend best practices for domain generalization in the clinical setting.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Abstract:
+ Early detection of influenza-like symptoms can prevent the widespread transmission of flu viruses and enable timely treatment, particularly in the post-pandemic era. Mobile sensing leverages an increasingly diverse set of embedded sensors to capture fine-grained information about human behaviors and ambient contexts, and can serve as a promising solution for influenza-like symptom recognition. Traditionally, handcrafted and high-level features of mobile sensing data are extracted using manual feature engineering and Convolutional/Recurrent Neural Networks, respectively. In this work, we instead use a graph representation to encode the dynamics of state transitions and internal dependencies in human behaviors, apply graph embeddings to automatically extract topological and spatial features from the graph input, and propose an end-to-end Graph Neural Network (GNN) model with multi-channel mobile sensing input for influenza-like symptom recognition based on people's daily mobility, social interactions, and physical activities. Using data generated from 448 participants, we show that GNNs with GraphSAGE convolutional layers significantly outperform baseline models with handcrafted features. Furthermore, we use a GNN interpretability method to generate insights (important nodes, graph structure) for symptom recognition. To the best of our knowledge, this is the first work to apply graph representations and graph neural networks to mobile sensing data for graph-based human behavior modeling.
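+ As a rough illustration of the aggregation step inside a GraphSAGE convolution (not the authors' model or data; the tiny graph and weight matrices below are made up), a mean-aggregator layer combines each node's own features with the average of its neighbors' features:

```python
import numpy as np

def sage_mean_layer(H, A, W_self, W_neigh):
    """One GraphSAGE-style mean-aggregation layer (sketch).

    H: (n_nodes, d_in) node features; A: (n_nodes, n_nodes) 0/1 adjacency.
    """
    deg = A.sum(axis=1, keepdims=True).clip(min=1)   # avoid divide-by-zero
    H_neigh = (A @ H) / deg                          # mean of neighbor features
    return np.maximum(0.0, H @ W_self + H_neigh @ W_neigh)  # ReLU

# Tiny path graph 0-1-2 with 1-D node features.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.array([[1.0], [2.0], [3.0]])
W_self = np.array([[1.0]])   # toy weights; learned in practice
W_neigh = np.array([[1.0]])

out = sage_mean_layer(H, A, W_self, W_neigh)
```

Stacking such layers lets each node's representation incorporate behavior-state transitions several hops away, which is the property the abstract relies on for modeling mobility and interaction graphs.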
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ David Dov, Serge Assaad, Shijing Si, and Rui Wang (Duke University) ; Hongteng Xu (Renmin University of China) ; Shahar Ziv Kovalsky (UNC at Chapel Hill) ; Jonathan Bell and Danielle Elliott Range (Duke University Hospital) ; Jonathan Cohen (Kaplan Medical Center) ; Ricardo Henao and Lawrence Carin (Duke University)
+
+ Aniruddh Raghu and John Guttag (Massachusetts Institute of Technology) ; Katherine Young (Harvard Medical School) ; Eugene Pomerantsev (Massachusetts General Hospital) ; Adrian V. Dalca (Harvard Medical School & MIT) ; Collin M. Stultz (Massachusetts Institute of Technology)
+
+ Saahil Jain and Akshay Smit (Stanford University) ; Steven QH Truong, Chanh DT Nguyen, and Minh-Thanh Huynh (VinBrain) ; Mudit Jain (unaffiliated) ; Victoria A. Young, Andrew Y. Ng, Matthew P. Lungren, and Pranav Rajpurkar (Stanford University)
+
+ Xin Liu and Ziheng Jiang (University of Washington) ; Josh Fromm (OctoML) ; Xuhai Xu and Shwetak Patel (University of Washington) ; Daniel McDuff (Microsoft Research)
+
+ Danliang Ho (National University of Singapore) ; Iain Bee Huat Tan (National Cancer Center Singapore) ; Mehul Motani (National University of Singapore)
+
+ Shreshth Saini (Indian Institute of Technology Jodhpur) ; Jeon Young Seok and Mengling Feng (Saw Swee Hock School of Public Health, National University Health System, National University of Singapore)
+
+ Basil Maag, Stefan Feuerriegel, and Mathias Kraus (ETH Zurich) ; Maytal Saar-Tsechansky (University of Texas at Austin) ; Thomas Zueger (1. Inselspital, Bern University Hospital, University of Bern; 2. ETH Zurich)
+
+ Matthew McDermott (Massachusetts Institute of Technology) ; Bret Nestor (University of Toronto) ; Evan Kim (Massachusetts Institute of Technology) ; Wancong Zhang (New York University) ; Anna Goldenberg (Hospital for Sick Children, University of Toronto, Vector Institute) ; Peter Szolovits (MIT) ; Marzyeh Ghassemi (University of Toronto, Vector Institute for Artificial Intelligence)
+
+ Abstract:
+ To date, the available therapeutics have been designed by human experts, with no help from AI tools. This reliance on human knowledge and dependence on large-scale experimentation result in prohibitive development costs and high failure rates. Recent developments in machine learning algorithms for molecular modeling aim to transform this field. In my talk, I will present state-of-the-art approaches for property prediction and de-novo molecular generation, describing their use in drug design. In addition, I will highlight unsolved algorithmic questions in this field, including confidence estimation, pretraining, and deficiencies in learned molecular representations.
+
+ Bio:
+ Regina Barzilay is a professor in the Department of Electrical Engineering and Computer Science and a member of the Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology. She is an AI faculty lead for the Jameel Clinic, an MIT center for machine learning in health. Her research interests are in natural language processing and in applications of deep learning to chemistry and oncology. She is a recipient of various awards, including the NSF CAREER Award, the MIT Technology Review TR-35 Award, a Microsoft Faculty Fellowship, and several Best Paper Awards at NAACL and ACL. In 2017, she received a MacArthur fellowship, an ACL fellowship, and an AAAI fellowship. In 2020, she was awarded the AAAI Squirrel AI Award for Artificial Intelligence for the Benefit of Humanity. She received her Ph.D. in Computer Science from Columbia University and spent a year as a postdoc at Cornell University. Regina received her undergraduate degree from Ben Gurion University of the Negev, Israel.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Abstract:
+ Improved healthcare delivery and patient outcomes are the ultimate goals of many AI applications in healthcare. However, relatively few machine learning models have been translated to clinical practice so far, and among those, even fewer have undergone a randomized controlled trial (RCT) to assess their impact. This talk will highlight aspects of the clinical translational process, beyond retrospective modeling, that impact the design, development, validation, and regulation of machine learning models in healthcare. In particular, this talk focuses on our recent study of predicting favorable outcomes in hospitalized COVID-19 patients. The resulting model, which was deployed and prospectively validated at NYU Langone, underwent an RCT and was eventually shared with other institutions. I will discuss challenges around integrating our model into the EHR system and their implications, the efficacy and safety results of our RCT, and practical insights about sharing models across clinics. We will end the talk by reviewing results of a survey of over 195 clinical users who interacted with this model, summarizing when and how the model was most helpful.
+
+ Bio:
+ Narges Razavian is an assistant professor at NYU Langone Health, Center for Healthcare Innovation and Delivery Sciences, and the Predictive Analytics Unit. Her lab focuses on various applications of machine learning and AI for medicine with a clinical translation outlook, working with medical images, clinical notes, and electronic health records. Before NYU Langone, she was a postdoc at the CILVR lab in the NYU Courant CS department. She received her PhD in the Computational Biology group at CMU.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Abstract:
+ In early March 2020, Mark joined an interdisciplinary team to launch the Pandemic Response Network. Over the subsequent months, he helped build and launch programs to support health workers, university students and staff, small businesses, K-12 public schools, and historically marginalized communities through the COVID-19 pandemic. With a strong background in design and implementation of high-tech health innovations, Mark worked alongside public health practitioners and community leaders to repeatedly execute the last mile implementation of critical COVID-19 programs, including symptom monitoring in the workplace, rapid antigen testing in schools, and pop-up vaccination events in churches. The portfolio of programs rapidly shifted health care capabilities and expertise out of hospitals and clinics into community settings that were poorly supported by existing public health infrastructure. The experience forced Mark and his team to approach technology design with a new set of assumptions and led to the development of completely novel data streams and technology systems. In his talk, Mark distills insights and learnings from the front lines of the COVID-19 response and highlights important implications and opportunities for the field of machine learning and artificial intelligence in health care.
+
+ Bio:
+ Mark Sendak, MD, MPP is the Population Health & Data Science Lead at the Duke Institute for Health Innovation (DIHI), where he leads interdisciplinary teams of data scientists, clinicians, and machine learning experts to build technologies that solve real clinical problems. He has built tools to support Duke Health's Accountable Care Organization, COVID-19 Pandemic Response Network, and hospital network. Together with his team, he has integrated dozens of data-driven technologies into clinical operations and is a co-inventor of software to scale machine learning applications. He leads the DIHI Clinical Research & Innovation scholarship, which equips medical students with the business and data science skills required to lead health care innovation efforts. His work has been published in technical venues such as the Machine Learning for Healthcare and the Fairness, Accountability, and Transparency in Machine Learning proceedings, as well as in clinical journals such as PLOS Medicine, Nature Medicine, and JAMA Network Open. He has served as an expert advisor to the American Medical Association, AARP, and the National Academies of Medicine on matters related to machine learning, innovation, and policy. He obtained his MD and Master of Public Policy at Duke University as a Dean's Tuition Scholar and his Bachelor of Science in Mathematics from UCLA.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Abstract:
+ Machine learning in healthcare could have transformative impact for patients, caregivers and health systems but the potential benefits remain challenging to realise at scale. Along the path from the development of a model to the realisation of clinical and health-economic impact are a number of challenges and learnings that might be transferable across a range of applications. This talk surveys some recent progress at Google Health and shares learnings from their team in moving from early research to product development; from product development to deployment; and from deployment to early measures of clinical impact.
+
+ Bio:
+ Dr. Alan Karthikesalingam is a surgeon-scientist who leads the healthcare machine learning research group at Google Health in London (and formerly for healthcare at DeepMind).
+ He led DeepMind and Google’s teams in four landmark studies in Nature and Nature Medicine focusing on AI for breast cancer screening with Cancer Research UK, AI for the recognition and prediction of blinding eye diseases with the world’s largest eye hospital (Moorfields), and medical records research with the Veterans Affairs developing AI early warning systems for common causes of patient deterioration, like acute kidney injury.
+ He is leading work on how machine learning approaches can best promote AI safety as the team takes forward its early research into products for clinical care. Alan continues to practice clinically and supervise PhD students as a lecturer in the vascular surgery department of Imperial College, London.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Abstract:
+ In medicine, the integration of artificial intelligence (AI) and machine learning (ML) tools could lead to a paradigm shift in which human-AI collaboration becomes integrated in medical decision-making. Despite many years of enthusiasm towards these technologies, the majority of tools fail once they are deployed in the real-world, often due to failures in workflow integration and interface design. In this talk, I will share research using methods in human-computer interaction (HCI) to design and evaluate machine learning tools for real-world clinical use. Results from this work suggest that trends in explainable AI may be inappropriate for clinical environments. I will discuss paths towards designing these tools for real-world medical systems, and describe how we are using collaborations across medicine, data science, and HCI to create machine learning tools for complex medical decisions.
+
+ Bio:
+ Dr. Maia Jacobs is an assistant professor at Northwestern University in Computer Science and Preventive Medicine. Her research contributes to the fields of Computer Science, Human-Computer Interaction (HCI), and Health Informatics through the design and evaluation of novel computing approaches that provide individuals with timely, relevant, and actionable health information. Recent projects include the design and deployment of mobile tools to increase health information access in rural communities, evaluating the influence of AI interface design on expert decision making, and co-designing intelligent decision support tools with clinicians. Her research has been funded by the National Science Foundation, the National Cancer Institute, and the Harvard Data Science Institute and has resulted in the deployment of tools currently being used by healthcare systems and patients around the country. She completed her PhD in Human Centered Computing at Georgia Institute of Technology and was a postdoctoral fellow in the Center for Research on Computation and Society at Harvard University. Jacobs’ work was awarded the iSchools Doctoral Dissertation Award, the Georgia Institute of Technology College of Computing Dissertation Award, and was recognized in the 2016 report to the President of the United States from the President's Cancer Panel, which focused on improving cancer-related outcomes.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Abstract:
+ The wide adoption of electronic health record (EHR) systems has led to the availability of large clinical datasets for precision medicine research. EHR data, linked with bio-repositories, are a valuable new source for deriving real-world, data-driven prediction models of disease risk and treatment response. Yet they also bring analytical difficulties. Precise information on clinical outcomes is not readily available and requires labor-intensive manual chart review. Synthesizing information across healthcare systems is also challenging due to heterogeneity and privacy concerns. In this talk, I’ll discuss analytical approaches for mining EHR data with a focus on denoising, scalability, and transportability. These methods will be illustrated using EHR data from multiple healthcare centers.
+
+ Bio:
+ Dr. Tianxi Cai is the John Rock Professor of Population and Translational Data Science jointly appointed in the Department of Biostatistics at the Harvard T.H. Chan School of Public Health (HSPH) and the Department of Biomedical Informatics (DBMI), Harvard Medical School, where she directs the Translational Data Science Center for a learning healthcare system. Her recent research has been focusing on developing interpretable and robust statistical and machine learning methods for deriving precision medicine strategies and more broadly for mining large-scale biomedical data including electronic health records data.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Machine Learning in Healthcare: From Modeling to Clinical Impact
+
+
+
+
+
+ Narges Razavian / New York University Langone Medical Center
+
+
Abstract: Improved healthcare delivery and patient outcomes are the ultimate goals of many AI applications in healthcare. However, relatively few machine learning models have been translated to clinical practice so far and among those even fewer have undergone a randomized control trial (RCT) to assess their impact. This talk will highlight aspects of the clinical translational process, beyond retrospective modeling, that impact design, development, validation, and regulation of machine learning models in healthcare. In particular, this talk focuses on our recent study of predicting favorable outcomes in hospitalized COVID-19 patients. The resulting model, which was deployed and prospectively validated at NYU Langone, underwent an RCT, and was eventually shared with other institutions. I will discuss challenges around integrating our model in the EHR system and their implications, the efficacy and safety results of our RCT, and practical insights about sharing models across clinics. We will end the talk by reviewing results of a survey of over 195 clinical users who interacted with this model, summarizing when and how the model was most helpful.
+
+
+
Bio: Narges Razavian is an assistant professor at NYU Langone Health in the Center for Healthcare Innovation and Delivery Sciences and the Predictive Analytics Unit. Her lab focuses on applications of machine learning and AI for medicine with a clinical translation outlook, working with medical images, clinical notes, and electronic health records. Before NYU Langone, she was a postdoc at the CILVR lab in the NYU Courant CS department. She received her PhD from the Computational Biology group at CMU.
+
+ Holding a Hammer When There are no Nails - Rapid Iteration to Build COVID-19 Support Programs for Historically Marginalized Communities
+
+
+
+
+
+ Mark Sendak / Duke Institute for Health Innovation
+
+
Abstract: In early March 2020, Mark joined an interdisciplinary team to launch the Pandemic Response Network. Over the subsequent months, he helped build and launch programs to support health workers, university students and staff, small businesses, K-12 public schools, and historically marginalized communities through the COVID-19 pandemic. With a strong background in design and implementation of high-tech health innovations, Mark worked alongside public health practitioners and community leaders to repeatedly execute the last mile implementation of critical COVID-19 programs, including symptom monitoring in the workplace, rapid antigen testing in schools, and pop-up vaccination events in churches. The portfolio of programs rapidly shifted health care capabilities and expertise out of hospitals and clinics into community settings that were poorly supported by existing public health infrastructure. The experience forced Mark and his team to approach technology design with a new set of assumptions and led to the development of completely novel data streams and technology systems. In his talk, Mark distills insights and learnings from the front lines of the COVID-19 response and highlights important implications and opportunities for the field of machine learning and artificial intelligence in health care.
+
+
+
Bio: Mark Sendak, MD, MPP is the Population Health & Data Science Lead at the Duke Institute for Health Innovation (DIHI), where he leads interdisciplinary teams of data scientists, clinicians, and machine learning experts to build technologies that solve real clinical problems. He has built tools to support Duke Health's Accountable Care Organization, COVID-19 Pandemic Response Network, and hospital network. Together with his team, he has integrated dozens of data-driven technologies into clinical operations and is a co-inventor of software to scale machine learning applications. He leads the DIHI Clinical Research & Innovation scholarship, which equips medical students with the business and data science skills required to lead health care innovation efforts. His work has been published in technical venues such as the Machine Learning for Healthcare and Fairness, Accountability, and Transparency in Machine Learning proceedings, and in clinical journals such as PLOS Medicine, Nature Medicine, and JAMA Network Open. He has served as an expert advisor to the American Medical Association, AARP, and the National Academy of Medicine on matters related to machine learning, innovation, and policy. He obtained his MD and Master of Public Policy at Duke University as a Dean's Tuition Scholar and his Bachelor of Science in Mathematics from UCLA.
+
+ Lessons on the Path from Code to Clinic - Some Common Myths in Machine Learning for Healthcare
+
+
+
+
+
+ Alan Karthikesalingam / Google Health - London
+
Abstract: Machine learning in healthcare could have transformative impact for patients, caregivers and health systems but the potential benefits remain challenging to realise at scale. Along the path from the development of a model to the realisation of clinical and health-economic impact are a number of challenges and learnings that might be transferable across a range of applications. This talk surveys some recent progress at Google Health and shares learnings from their team in moving from early research to product development; from product development to deployment; and from deployment to early measures of clinical impact.
+
+
+
Bio: Dr. Alan Karthikesalingam is a surgeon-scientist who leads the healthcare machine learning research group at Google Health in London (and formerly led healthcare research at DeepMind).
He led DeepMind and Google’s teams in four landmark studies in Nature and Nature Medicine, focusing on AI for breast cancer screening with Cancer Research UK, AI for the recognition and prediction of blinding eye diseases with the world’s largest eye hospital (Moorfields), and medical-records research with the US Department of Veterans Affairs, developing AI early-warning systems for common causes of patient deterioration, such as acute kidney injury.
He is leading work on how machine learning approaches can best promote AI safety as the team takes its early research forward into products for clinical care. Alan continues to practice clinically and supervises PhD students as a lecturer in the vascular surgery department of Imperial College, London.
+
+ Bringing AI to the Bedside with User Centered Design
+
+
Abstract: In medicine, the integration of artificial intelligence (AI) and machine learning (ML) tools could lead to a paradigm shift in which human-AI collaboration becomes integrated in medical decision-making. Despite many years of enthusiasm for these technologies, the majority of tools fail once they are deployed in the real world, often due to failures in workflow integration and interface design. In this talk, I will share research using methods in human-computer interaction (HCI) to design and evaluate machine learning tools for real-world clinical use. Results from this work suggest that trends in explainable AI may be inappropriate for clinical environments. I will discuss paths towards designing these tools for real-world medical systems, and describe how we are using collaborations across medicine, data science, and HCI to create machine learning tools for complex medical decisions.
+
+
+
Bio: Dr. Maia Jacobs is an assistant professor at Northwestern University in Computer Science and Preventive Medicine. Her research contributes to the fields of Computer Science, Human-Computer Interaction (HCI), and Health Informatics through the design and evaluation of novel computing approaches that provide individuals with timely, relevant, and actionable health information. Recent projects include the design and deployment of mobile tools to increase health information access in rural communities, evaluating the influence of AI interface design on expert decision making, and co-designing intelligent decision support tools with clinicians. Her research has been funded by the National Science Foundation, the National Cancer Institute, and the Harvard Data Science Institute and has resulted in the deployment of tools currently being used by healthcare systems and patients around the country. She completed her PhD in Human Centered Computing at Georgia Institute of Technology and was a postdoctoral fellow in the Center for Research on Computation and Society at Harvard University. Jacobs’ work was awarded the iSchools Doctoral Dissertation Award, the Georgia Institute of Technology College of Computing Dissertation Award, and was recognized in the 2016 report to the President of the United States from the President's Cancer Panel, which focused on improving cancer-related outcomes.
Abstract: The wide adoption of electronic health records (EHR) systems has made large clinical datasets available for precision medicine research. EHR data, linked with biorepositories, are a valuable new source for deriving real-world, data-driven prediction models of disease risk and treatment response. Yet they also bring analytical difficulties. Precise information on clinical outcomes is not readily available and requires labor-intensive manual chart review. Synthesizing information across healthcare systems is also challenging due to heterogeneity and privacy. In this talk, I’ll discuss analytical approaches for mining EHR data with a focus on denoising, scalability, and transportability. These methods will be illustrated using EHR data from multiple healthcare centers.
+
+
+
+ Causal Inference in Clinical Research: From Theory to Practice
+
+
+
+
+
+ Linbo Wang
+
+
+
+
+
Abstract: Causal inference is an important topic in healthcare because a causal relationship between an exposure and a health outcome may suggest an intervention to improve the health outcome. In this tutorial, we provide an introduction to the field of causal inference. We will cover several fundamental topics in causal inference, including the potential outcome framework, structural equation modeling, propensity score modeling, and instrumental variable analysis. Methods will be illustrated using real clinical examples.
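To make the propensity-score idea above concrete, here is a minimal simulated illustration of inverse-probability weighting (a sketch of our own, not the tutorial's materials; the confounder, effect sizes, and variable names are all invented):

```python
import random

# Minimal inverse-probability-weighting sketch with simulated data.
# A confounder x raises both the treatment probability and the outcome;
# weighting each unit by 1 / P(treatment | x) removes that bias.
random.seed(0)

n = 100_000
records = []
for _ in range(n):
    x = random.random()                      # confounder
    p_treat = 0.2 + 0.6 * x                  # propensity: P(T=1 | x)
    t = 1 if random.random() < p_treat else 0
    y = 2.0 * t + 3.0 * x + random.gauss(0, 1)   # true treatment effect = 2
    records.append((x, t, y, p_treat))

# The naive difference in means is confounded by x.
treated = [y for x, t, y, p in records if t == 1]
control = [y for x, t, y, p in records if t == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)

# IPW estimate: weight each unit by the inverse of its treatment probability.
ipw = (sum(t * y / p for x, t, y, p in records) / n
       - sum((1 - t) * y / (1 - p) for x, t, y, p in records) / n)

print(f"naive: {naive:.2f}, IPW: {ipw:.2f}")   # IPW recovers roughly 2.0
```

In real observational data the propensity would itself be estimated from covariates rather than known, which is where the modeling topics of the tutorial come in.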
+
+
+
Bio:
+ Linbo Wang is an assistant professor in the Department of Statistical Sciences, University of Toronto. He is also an Affiliate Assistant Professor in the Department of Statistics, University of Washington, and a faculty affiliate at Vector Institute. His research interest is centered around causality and its interaction with statistics and machine learning. Prior to these roles, he was a postdoc at Harvard T.H. Chan School of Public Health. He obtained his Ph.D. from the University of Washington.
+
+ Experimental Design and Causal Inference Methods For Micro-Randomized Trials: A Framework for Developing Mobile Health Interventions
+
+
+
+
+
+ Tianchen Qian
+
+
+
+
+
Abstract: Mobile health (mHealth) technologies are providing promising new ways to deliver interventions in both clinical and non-clinical settings. Wearable sensors and smartphones collect real-time data streams that provide information about an individual’s current health, including both internal (e.g., mood, blood sugar level) and external (e.g., social, location) contexts. Both wearables and smartphones can be used to deliver interventions. mHealth interventions are in current use across a vast number of health-related fields, including medication adherence, physical activity, weight loss, mental illness, and addictions. This tutorial discusses the micro-randomized trial (MRT), an experimental trial design for optimizing the real-time delivery of sequences of treatment, with an emphasis on mHealth. We introduce the MRT design using HeartSteps, a physical activity study, as an example. We define the causal excursion effect and discuss reasons why this effect is often considered the primary causal effect of interest in MRT analysis. We introduce statistical methods for primary and secondary analyses for MRTs with continuous and binary outcomes. We discuss sample size considerations for designing MRTs.
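The core of an MRT (treatment randomized with a known probability at each decision point, followed by a proximal outcome) can be sketched with simulated data. This is an invented toy example, not the HeartSteps study, and it shows only the simplest marginal analysis:

```python
import random

# Toy micro-randomized trial: at each decision point the app randomizes
# an activity prompt with known probability p, then observes a proximal
# outcome (steps in the next hour). All names and effect sizes are invented.
random.seed(1)

p = 0.4                      # randomization probability per decision point
true_effect = 30.0           # prompts add ~30 steps on average
outcomes = []                # (prompt sent?, proximal outcome)
for _ in range(50_000):      # decision points pooled over users
    a = 1 if random.random() < p else 0
    steps = 200 + true_effect * a + random.gauss(0, 50)
    outcomes.append((a, steps))

# Because randomization probabilities are known by design, the marginal
# proximal effect is estimated by a simple contrast of means.
treated_mean = sum(y for a, y in outcomes if a == 1) / sum(a for a, _ in outcomes)
control_mean = sum(y for a, y in outcomes if a == 0) / sum(1 - a for a, _ in outcomes)
print(f"estimated proximal effect: {treated_mean - control_mean:.1f}")
```

The causal excursion effect and moderated analyses discussed in the tutorial generalize this contrast to time-varying treatments and contexts.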
+
+
+
Bio:
+ Tianchen Qian is an Assistant Professor in the Department of Statistics at University of California, Irvine. He completed his PhD at the Johns Hopkins University and was a postdoctoral fellow at Harvard University. His research is focused on the experimental design and statistical analysis methods for developing mobile health interventions. In particular, he has developed causal inference methods for analyzing micro-randomized trial data and sample size calculation approaches for designing micro-randomized trials.
+
Abstract: Offline reinforcement learning (offline RL), a.k.a. batch-mode reinforcement learning, involves learning a policy from potentially suboptimal data. In contrast to imitation learning, offline RL does not rely on expert demonstrations, but rather seeks to surpass the average performance of the agents that generated the data. Methodologies that rely on gathering new experience fall short in offline settings, requiring a reassessment of fundamental learning paradigms. In this tutorial I aim to present the necessary background and the challenges of this exciting area of research, from off-policy evaluation through bandits to deep reinforcement learning.
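Off-policy evaluation, one of the topics above, can be illustrated with a minimal importance-sampling sketch (a synthetic one-step bandit of our own devising, not code from the tutorial):

```python
import random

# Minimal off-policy evaluation sketch: estimate the value of a target
# policy from data logged by a different behaviour policy, using
# importance sampling. A one-step bandit keeps the example simple.
random.seed(2)

behaviour = {0: 0.8, 1: 0.2}      # probabilities the logging policy used
target = {0: 0.1, 1: 0.9}         # policy we want to evaluate

def reward(a):
    # action 1 pays more on average (synthetic environment)
    return random.gauss(1.0 if a == 1 else 0.2, 0.1)

# Collect logged data under the behaviour policy.
logs = []
for _ in range(200_000):
    a = 0 if random.random() < behaviour[0] else 1
    logs.append((a, reward(a)))

# Importance-sampling estimate: reweight each logged reward by the
# ratio of target to behaviour action probabilities.
is_value = sum(target[a] / behaviour[a] * r for a, r in logs) / len(logs)

# Ground truth for comparison (available here only because it's synthetic).
true_value = 0.1 * 0.2 + 0.9 * 1.0
print(f"IS estimate: {is_value:.3f}, true: {true_value:.3f}")
```

The estimator is unbiased but its variance grows as the target policy drifts from the behaviour policy, which is one of the central difficulties of offline RL.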
+
+
+
Bio:
+ Guy Tennenholtz is a fourth-year Ph.D. student at the Technion, advised by Prof. Shie Mannor. His research interests lie in the field of reinforcement learning, and specifically, how offline data can be leveraged to build better agents. Problems of large action spaces, partial observability, confounding bias, and uncertainty are only some of the problems he is actively researching. In his spare time, Guy also enjoys creating mobile games, with the vision of incorporating AI into both the game development process and gameplay.
+
+ Explainable ML: Understanding the Limits and Pushing the Boundaries
+
+
+
+
+
+ Hima Lakkaraju
+
+
+
+
+
Abstract: As machine learning black boxes are increasingly being deployed in domains such as healthcare and criminal justice, there is growing emphasis on building tools and techniques for explaining these black boxes in a post hoc manner. Such explanations are being leveraged by domain experts to diagnose systematic errors and underlying biases of black boxes. However, recent research has shed light on the vulnerabilities of popular post hoc explanation techniques. In this tutorial, I will provide a brief overview of post hoc explanation methods with special emphasis on feature attribution methods such as LIME and SHAP. I will then discuss recent research which demonstrates that these methods are brittle, unstable, and vulnerable to a variety of adversarial attacks. Lastly, I will present two solutions to address some of the vulnerabilities of these methods – (i) a generic framework based on adversarial training that is designed to make post hoc explanations more stable and robust to shifts in the underlying data, and (ii) a Bayesian framework that captures the uncertainty associated with post hoc explanations and in turn allows us to generate reliable explanations which satisfy user-specified levels of confidence. Overall, this tutorial will provide a bird's-eye view of the state-of-the-art in the burgeoning field of explainable machine learning.
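To give a flavor of feature-attribution methods such as LIME, here is a minimal local-surrogate sketch of our own (the black-box model, the instance explained, and the simplified covariance-based fit are all invented for illustration and are much cruder than the actual LIME algorithm):

```python
import random

# LIME-style sketch: explain one prediction of a "black box" by fitting a
# simple local surrogate around the input. Because the perturbations are
# sampled independently per feature, each surrogate coefficient reduces
# to cov(feature, output) / var(feature).
random.seed(3)

def black_box(x1, x2):
    # some nonlinear model we pretend not to understand
    return x1 * x1 + 3.0 * x2

x0 = (2.0, 1.0)              # instance to explain
sigma = 0.1                  # size of the local neighbourhood

# Sample perturbations around x0 and record the black-box outputs.
samples = []
for _ in range(20_000):
    z1 = x0[0] + random.gauss(0, sigma)
    z2 = x0[1] + random.gauss(0, sigma)
    samples.append((z1, z2, black_box(z1, z2)))

def local_weight(zs, ys):
    mz = sum(zs) / len(zs)
    my = sum(ys) / len(ys)
    cov = sum((z - mz) * (y - my) for z, y in zip(zs, ys)) / len(zs)
    var = sum((z - mz) ** 2 for z in zs) / len(zs)
    return cov / var

ys = [y for _, _, y in samples]
w1 = local_weight([z1 for z1, _, _ in samples], ys)
w2 = local_weight([z2 for _, z2, _ in samples], ys)
# Locally, the slope of x1^2 at x1=2 is ~4 and the slope in x2 is 3.
print(f"attributions: x1={w1:.2f}, x2={w2:.2f}")
```

The brittleness discussed in the tutorial arises precisely because such attributions depend on the sampling scheme and neighbourhood, both of which an adversary can exploit.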
+
+
+
Bio:
+ Hima Lakkaraju is an Assistant Professor at Harvard University focusing on explainability, fairness, and robustness of machine learning models. She has also been working with various domain experts in criminal justice and healthcare to understand the real world implications of explainable and fair ML. Hima has recently been named one of the 35 innovators under 35 by MIT Tech Review, and has received best paper awards at SIAM International Conference on Data Mining (SDM) and INFORMS. She has given invited workshop talks at ICML, NeurIPS, AAAI, and CVPR, and her research has also been covered by various popular media outlets including the New York Times, MIT Tech Review, TIME, and Forbes. For more information, please visit: https://himalakkaraju.github.io
+
+ Semi-supervised Phenotyping with Electronic Health Records
+
+
+
+
+
+ Jesse Gronsbell, Chuan Hong, Molei Liu, Clara-Lea Bonzel, Aaron Sonabend
+
+
+
+
+
Abstract: Phenotyping is the process of identifying a patient’s health state based on the information in their electronic health records. In this tutorial, we will discuss why phenotyping is a challenging problem from both a practical and a methodological perspective. We will focus primarily on the challenges in obtaining annotated phenotype information from patient records and present statistical learning methods that leverage unlabeled examples to improve model estimation and evaluation and reduce the annotation burden.
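As one toy illustration of the general idea of leveraging unlabeled examples, here is a generic self-training sketch of our own (the 1-D "lab value" feature, the threshold classifier, and all numbers are invented; this is not one of the estimators presented in the tutorial):

```python
import random

# Self-training sketch: fit a threshold on a few labeled examples,
# pseudo-label the confidently classified unlabeled records, and refit.
random.seed(4)

def lab_value(has_phenotype):
    # one noisy feature standing in for an EHR-derived measurement
    return random.gauss(2.0 if has_phenotype else 0.0, 1.0)

labeled = [(lab_value(y), y) for y in [0, 1] * 10]          # 20 gold labels
unlabeled = [lab_value(random.random() < 0.5) for _ in range(2000)]

def fit_threshold(data):
    # midpoint between the class means
    pos = [x for x, y in data if y == 1]
    neg = [x for x, y in data if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

t0 = fit_threshold(labeled)

# Pseudo-label unlabeled points far from the current threshold, then refit
# on the enlarged (gold + pseudo-labeled) set.
confident = [(x, 1 if x > t0 else 0) for x in unlabeled if abs(x - t0) > 1.0]
t1 = fit_threshold(labeled + confident)
print(f"threshold before: {t0:.2f}, after self-training: {t1:.2f}")
```

The semi-supervised methods in the tutorial pursue the same goal, using unlabeled records to sharpen estimation and evaluation, with far stronger statistical guarantees than this heuristic.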
+
+
+
Bio:
+ Jesse Gronsbell is an Assistant Professor in the Department of Statistical Sciences at the University of Toronto. Prior to joining U of T, Jesse spent a couple of years as a data scientist in the Mental Health Research and Development Group at Alphabet's Verily Life Sciences. Her primary interest is in the development of statistical methods for modern digital data sources such as electronic health records and mobile health data.
+
Chuan Hong is an instructor in biomedical informatics from the Department of Biomedical Informatics (DBMI) at Harvard Medical School. She received her PhD in Biostatistics from the University of Texas Health Science Center at Houston. Her doctoral research focused on meta-analysis and DNA methylation detection. At DBMI, Chuan's research interests lie in developing statistical and computational methods for biomarker evaluation, predictive modeling, and precision medicine with biomedical data. In particular, she is interested in combining electronic medical records with biorepositories and relevant resources to improve phenotyping accuracy, detect novel biomarkers, and monitor disease progression in clinical research.
+
Molei Liu is a 4th year PhD candidate in the Biostatistics department at Harvard T.H. Chan School of Public Health. He received a Bachelor's degree in Statistics from Peking University. Molei has been working in areas including high dimensional statistics, distributed learning, semi-supervised learning, semi-parametric inference, and model-X inference. He has also been working on methods for phenome-wide association studies (PheWAS) using electronic health records data.
+
Clara-Lea Bonzel is a research assistant at the Department of Biomedical Informatics at Harvard Medical School. She is mainly interested in personalized medicine using phenomic and genomic data, and model selection and evaluation. Clara-Lea received her master's degree in Applied Mathematics and Financial Engineering from the Swiss Federal Institute of Technology (EPFL).
+
Aaron Sonabend is a PhD candidate in the Biostatistics department at Harvard T.H. Chan School of Public Health. He is primarily focused on developing robust reinforcement learning and natural language processing methods for contexts with sampling bias, partially observed rewards, or strong distribution shifts. He is interested in healthcare and biomedical applications, such as finding optimal sequential treatment regimes for complex diseases, and phenotyping using electronic health records. Aaron holds a Bachelor's degree in Applied Mathematics, and in Economics from the National Autonomous Technological Institute of Mexico (ITAM).
+ Abstract:
+ In many real-world environments, the details of decision-making processes are not fully known, e.g., how oncologists decide on specific radiation therapy treatment plans for cancer patients, how clinicians decide on medication dosages for different patients, or how hypertension patients choose their diet to control their illness. While conventional machine learning and statistical methods can be used to better understand such processes, they often fail to provide meaningful insights into the unknown parameters when the problem's setting is heavily constrained. Similarly, conventional constrained inference models, such as inverse optimization, are not well equipped for data-driven problems. In this study, we develop a novel methodology (called MLIO) that combines machine learning and inverse optimization techniques to recover the utility functions of a black-box decision-making process. Our method can be applied to settings where different types of data are required to capture the problem. MLIO is specifically developed with data-intensive medical decision-making environments in mind. We evaluate our approach in the context of personalized diet recommendations for patients, building on a large dataset of historical daily food intakes of patients from NHANES. MLIO considers these prior dietary behaviors in addition to complementary data (e.g., demographics and preexisting conditions) to recover the underlying criteria that the patients had in mind when deciding on their food choices. Once the underlying criteria are known, an optimization model can be used to find personalized diet recommendations that adhere to patients' behavior while meeting all required dietary constraints.
+ Abstract:
+ Deep neural networks have increasingly been used as an auxiliary tool in healthcare applications, due to their ability to improve performance on several diagnosis tasks. However, these methods are not widely adopted in clinical settings due to practical limitations in the reliability, generalizability, and interpretability of deep learning based systems. As a result, methods have been developed that impose additional constraints during network training to gain more control as well as improve interpretability, facilitating their acceptance in the healthcare community. In this work, we investigate the benefit of using an Orthogonal Spheres (OS) constraint for classification of COVID-19 cases from chest X-ray images. The OS constraint can be written as a simple orthonormality term which is used in conjunction with the standard cross-entropy loss during classification network training. Previous studies have demonstrated significant benefits in applying such constraints to deep learning models. Our findings corroborate these observations, indicating that the orthonormality loss function effectively produces improved semantic localization via GradCAM visualizations, enhanced classification performance, and reduced model calibration error. Our approach achieves an improvement in accuracy of 1.6% and 4.8% for two- and three-class classification, respectively; similar results are found for models with data augmentation applied. In addition to these findings, our work also presents a new application of the OS regularizer in healthcare, increasing the post-hoc interpretability and performance of deep learning models for COVID-19 classification to facilitate adoption of these methods in clinical settings. We also identify the limitations of our strategy, which can be explored in future research.
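The abstract describes the OS constraint as "a simple orthonormality term" added to the cross-entropy loss. As an illustrative sketch (not the authors' implementation), such a term can be written as the squared Frobenius distance between a weight matrix's Gram matrix and the identity; the total training loss would then be cross-entropy plus a scaled penalty. The function name and the numpy formulation here are assumptions for illustration.

```python
import numpy as np

def orthonormality_penalty(W):
    """Frobenius-norm penalty ||W^T W - I||_F^2, which is zero
    exactly when the columns of W are orthonormal."""
    k = W.shape[1]
    gram = W.T @ W
    return float(np.sum((gram - np.eye(k)) ** 2))

# Columns taken from the identity are orthonormal: penalty vanishes.
W = np.eye(4)[:, :3]
assert orthonormality_penalty(W) == 0.0

# A matrix with highly correlated columns is penalized.
W2 = np.ones((4, 3))
assert orthonormality_penalty(W2) > 0.0
```

In training, this penalty would be added to the classification loss with a weighting coefficient chosen by validation.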
+ Abstract:
+ Meta-analysis is a systematic approach for understanding a phenomenon by analyzing the results of many previously published experimental studies related to the same treatment and outcome measurement. It is an important tool for medical researchers and clinicians to derive reliable conclusions regarding the overall effect of treatments and interventions (e.g., drugs) on a certain outcome (e.g., the severity of a disease). Unfortunately, conventional meta-analysis involves great human effort, i.e., it is constructed by hand and is extremely time-consuming and labor-intensive, rendering a process that is inefficient in practice and vulnerable to human bias. To overcome these challenges, we work toward automating meta-analysis with a focus on controlling for the potential biases. Automating meta-analysis consists of two major steps: (1) extracting information from scientific publications written in natural language, which is different and noisier than what humans typically extract when conducting a meta-analysis; and (2) modeling meta-analysis, from a novel causal-inference perspective, to control for the potential biases and summarize the treatment effect from the outputs of the first step. Since sufficient prior work exists for the first step, this study focuses on the second step. The core contribution of this work is a multiple causal inference algorithm tailored to the potentially noisy and biased information automatically extracted by current natural language processing systems. Empirical evaluations on both synthetic and semi-synthetic data show that the proposed approach for automated meta-analysis yields high-quality performance.
+ Abstract:
+ Attention is a powerful concept in computer vision. End-to-end networks that learn to focus selectively on regions of an image or video often perform strongly. However, other image regions, while not necessarily containing the signal of interest, may contain useful context. We present an approach that exploits the idea that statistics of noise may be shared between the regions that contain the signal of interest and those that do not. Our technique uses the inverse of an attention mask to generate a noise estimate that is then used to denoise temporal observations. We apply this to the task of camera-based physiological measurement. A convolutional attention network is used to learn which regions of a video contain the physiological signal and generate a preliminary estimate. A noise estimate is obtained by using the pixel intensities in the inverse regions of the learned attention mask; this, in turn, is used to refine the estimate of the physiological signal. We perform experiments on two large benchmark datasets and show that this approach produces state-of-the-art results, increasing the signal-to-noise ratio by up to 5.8 dB, reducing heart rate and breathing rate estimation error by as much as 30%, recovering subtle waveform dynamics, and generalizing from RGB to NIR videos without retraining.
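The inverse-attention idea above can be sketched with a toy example: if noise is shared between the attended region and its complement, a signal estimate from the attention-weighted pixels minus a noise estimate from the inversely weighted pixels cancels the shared component. This is a minimal illustration under those assumptions, not the paper's network; all names here are hypothetical.

```python
import numpy as np

def denoise_with_inverse_attention(frames, attention):
    """Form a preliminary per-frame signal estimate from the attended
    region and a noise estimate from the inverse (1 - attention)
    region, then subtract the noise estimate."""
    att = attention / attention.sum()
    inv = 1.0 - attention
    inv = inv / inv.sum()
    preliminary = (frames * att).sum(axis=(1, 2))   # one value per frame
    noise = (frames * inv).sum(axis=(1, 2))
    return preliminary - noise

# Toy video: 5 frames of 4x4 pixels with frame-wide additive noise,
# and a clean ramp signal confined to the top-left 2x2 block.
rng = np.random.default_rng(0)
shared_noise = rng.normal(size=(5, 1, 1)) * np.ones((5, 4, 4))
signal = np.zeros((5, 4, 4))
signal[:, :2, :2] = np.linspace(0, 1, 5)[:, None, None]
frames = signal + shared_noise

mask = np.zeros((4, 4))
mask[:2, :2] = 1.0
refined = denoise_with_inverse_attention(frames, mask)
# The shared noise cancels, recovering the clean ramp 0 .. 1.
assert np.allclose(refined, np.linspace(0, 1, 5))
```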
+ DynEHR: Dynamic Adaptation of Models with Data Heterogeneity in Electronic Health Records
+
+
+ Lida Zhang (Texas A&M University); Xiaohan Chen, Tianlong Chen, and Zhangyang Wang (University of Texas at Austin); Bobak J. Mortazavi (Texas A&M University)
+
+ Abstract:
+ Electronic health records (EHRs) provide an abundance of data for clinical outcomes modeling. The prevalence of EHR data has enabled a number of studies using a variety of machine learning algorithms to predict potential adverse events. However, these studies do not account for the heterogeneity present in EHR data, including various lengths of stay, various frequencies of vitals captured in invasive versus non-invasive fashion, and various repetitions (or lack thereof) of laboratory examinations. Therefore, studies limit the types of features extracted or the domain considered to provide a more homogeneous training set to machine learning models. The heterogeneity in this data represents important risk differences in each patient. In this work, we examine such data in an intensive care unit (ICU) setting, where the length of stay and the frequency of data gathered may vary significantly based upon the severity of patient condition. Therefore, it is unreasonable to use the same model for patients first entering the ICU versus those that have been there for above-average lengths of stay. Developing multiple individual models to account for different patient cohorts, different lengths of stay, and different sources for key vital sign data may be tedious and may not account for rare cases well. We address this challenge by developing a dynamic model, based upon meta-learning, to adapt to data heterogeneity and generate predictions of various outcomes across the different lengths of data. We compare this technique against a set of benchmarks on a publicly available ICU dataset (MIMIC-III) and demonstrate improved model performance by accounting for data heterogeneity.
+ Martha Ferreira (Dalhousie University); Michal Malyska and Nicola Sahar (Semantic Health); Riccardo Miotto (Icahn School of Medicine at Mount Sinai); Fernando Paulovich (Dalhousie University); Evangelos Milios (Dalhousie University, Faculty of Computer Science)
+
+ Abstract:
+ Machine Learning (ML) is widely used to automatically extract meaningful information from Electronic Health Records (EHR) to support operational, clinical, and financial decision making. However, ML models require a large number of annotated examples to provide satisfactory results, which is not possible in most healthcare scenarios due to the high cost of clinician labeled data. Active Learning (AL) is a process of selecting the most informative instances to be labeled by an expert to further train a supervised algorithm. We demonstrate the effectiveness of AL in multi-label text classification in the clinical domain. In this context, we apply a set of well-known AL methods to help automatically assign ICD-9 codes on the MIMIC-III dataset. Our results show that the selection of informative instances provides satisfactory classification with a significantly reduced training set (8.3% of the total instances). We conclude that AL methods can significantly reduce the manual annotation cost while preserving model performance.
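"Selecting the most informative instances" in active learning is often done by querying the instances the current model is least confident about. As a minimal sketch (the paper compares several AL methods; this shows only generic least-confidence sampling, not their specific choice):

```python
import numpy as np

def least_confidence_query(probs, k):
    """Return indices of the k instances whose highest predicted class
    probability is lowest, i.e. those the model is least sure about."""
    confidence = probs.max(axis=1)
    return np.argsort(confidence)[:k]

# Three unlabeled notes with predicted label distributions.
probs = np.array([
    [0.95, 0.03, 0.02],   # confident
    [0.40, 0.35, 0.25],   # most uncertain
    [0.70, 0.20, 0.10],
])
# Ask the expert to label the two most uncertain notes first.
assert list(least_confidence_query(probs, 2)) == [1, 2]
```

After the queried instances are labeled, the classifier is retrained and the query step repeats until the labeling budget is spent.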
+ Framing Social Contact Networks for Contagion Dynamics
+
+
+ Kirti Jain (Department of Computer Science, University of Delhi, Delhi, India); Sharanjit Kaur (Acharya Narendra Dev College, University of Delhi, Delhi, India); Vasudha Bhatnagar (Department of Computer Science, University of Delhi, Delhi, India)
+
+ Abstract:
+ Assessment of COVID-19 pandemic predictions indicates that differential equation-based epidemic spreading models are less than satisfactory in the contemporary world of intense human connectivity. Network-based simulations are more apt for studying contagion dynamics due to their ability to model the heterogeneity of human interactions. However, the quality of predictions in network-based models depends on how well the underlying wire-frame approximates the real social contact network of the population. In this paper, we propose a framework to create a modular wire-frame that mimics the social contact network of a geographic region by lacing it with demographic information. The proposed inter-connected network exhibits small-world topology, accommodates density variations in the geography, and emulates human interactions in family, social, and work spaces. The resulting wire-frame is a generic and potent instrument for urban planners, demographers, economists, and social scientists to simulate different "what-if" scenarios and predict epidemic variables. The basic frame can be laden with any economic, social, or urban data that can potentially shape human connectance. We present a preliminary study of the impact of variations in contact patterns due to density and demography on the epidemic variables.
+ Abstract:
+ Shaping an epidemic with an adaptive contact restriction policy that balances the disease and socioeconomic impact has been the holy grail during the COVID-19 pandemic. Most of the existing work on epidemiological models focuses on scenario-based forecasting via simulation, but techniques for explicit control of epidemics via an analytical framework are largely missing. In this paper, we consider the problem of determining the optimal control policy for the transmission rate assuming SIR dynamics, which is the most widely used epidemiological paradigm. We first demonstrate that the SIR model, with infectious patients and susceptible contacts (i.e., the product of transmission rate and susceptible population) interpreted as predators and prey respectively, reduces to a Lotka-Volterra (LV) predator-prey model. The modified SIR system (LVSIR) has a stable equilibrium point, an 'energy' conservation property, and exhibits bounded cyclic behaviour similar to an LV system. This mapping permits a theoretical analysis of the control problem, supporting some of the recent simulation-based studies that point to the benefits of periodic interventions. We use a control-Lyapunov approach to design adaptive control policies (CoSIR) to nudge the SIR model to the desired equilibrium, which permits ready extensions to richer compartmental models. We also describe a practical implementation of this transmission control method by approximating the ideal control with a finite but time-varying set of restriction levels. We provide experimental results comparing with periodic lockdowns on a few different geographical regions (India, Mexico, Netherlands) to demonstrate the efficacy of this approach.
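The SIR dynamics that the paper controls can be illustrated with a few lines of forward-Euler integration. This sketch keeps the transmission rate fixed (the paper's CoSIR approach modulates it over time); the parameter values are arbitrary illustration choices.

```python
import numpy as np

def simulate_sir(beta, gamma, s0, i0, r0, dt=0.01, steps=10000):
    """Forward-Euler integration of standard SIR dynamics
    dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I, dR/dt = gamma*I,
    with S, I, R expressed as population fractions."""
    s, i, r = s0, i0, r0
    traj = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i
        recoveries = gamma * i
        s, i, r = (s - new_infections * dt,
                   i + (new_infections - recoveries) * dt,
                   r + recoveries * dt)
        traj.append((s, i, r))
    return np.array(traj)

traj = simulate_sir(beta=0.3, gamma=0.1, s0=0.99, i0=0.01, r0=0.0)
# The total population is conserved at every step.
assert np.allclose(traj.sum(axis=1), 1.0)
# With beta/gamma > 1, infections peak and then decline.
i_path = traj[:, 1]
assert i_path.max() > i_path[-1]
```

In the paper's framing, the control variable is beta(t); the LV correspondence treats I as the predator and beta*S as the prey.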
+ Abstract:
+ A major obstacle to the integration of deep learning models for chest x-ray interpretation into clinical settings is the lack of understanding of their failure modes. In this work, we first investigate whether there are clinical subgroups that chest x-ray models are likely to misclassify. We find that older patients and patients with a lung lesion or pneumothorax finding have a higher probability of being misclassified on some diseases. Second, we develop misclassification predictors on chest x-ray models using their outputs and clinical features. We find that our best performing misclassification identifier achieves an AUROC close to 0.9 for most diseases. Third, employing our misclassification identifiers, we develop a corrective algorithm to selectively flip model predictions that have a high likelihood of misclassification at inference time. We observe F1 improvement on the prediction of Consolidation (0.008, 95%CI[0.005, 0.010]) and Edema (0.003, 95%CI[0.001, 0.006]). By carrying out our investigation on ten distinct and high-performing chest x-ray models, we are able to derive insights across model architectures and offer a generalizable framework applicable to other medical imaging tasks.
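The corrective step described above — selectively flipping predictions flagged as likely misclassifications — can be sketched in a few lines. This is an illustrative simplification for binary labels with an assumed flip threshold, not the authors' exact algorithm.

```python
import numpy as np

def selective_flip(preds, misclass_prob, threshold=0.9):
    """Flip binary predictions whose estimated misclassification
    probability exceeds the threshold; leave the rest unchanged."""
    preds = np.asarray(preds).copy()
    flip = misclass_prob > threshold
    preds[flip] = 1 - preds[flip]
    return preds

preds = np.array([1, 0, 1, 0])
misclass = np.array([0.95, 0.10, 0.20, 0.92])
# Only the first and last predictions are flagged and flipped.
assert list(selective_flip(preds, misclass)) == [0, 0, 1, 1]
```

In practice the threshold would be tuned per disease on a validation set, trading flipped false positives against flipped true positives.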
+ Abstract:
+ Contrastive learning is a form of self-supervision that can leverage unlabeled data to produce pretrained models. While contrastive learning has demonstrated promising results on natural image classification tasks, its application to medical imaging tasks like chest X-ray interpretation has been limited. In this work, we propose MoCo-CXR, which is an adaptation of the contrastive learning method Momentum Contrast (MoCo), to produce models with better representations and initializations for the detection of pathologies in chest X-rays. In detecting pleural effusion, we find that linear models trained on MoCo-CXR-pretrained representations outperform those without MoCo-CXR-pretrained representations, indicating that MoCo-CXR-pretrained representations are of higher quality. End-to-end fine-tuning experiments reveal that a model initialized via MoCo-CXR-pretraining outperforms its non-MoCo-CXR-pretrained counterpart. We find that MoCo-CXR-pretraining provides the most benefit with limited labeled training data. Finally, we demonstrate similar results on a target Tuberculosis dataset unseen during pretraining, indicating that MoCo-CXR-pretraining endows models with representations and transferability that can be applied across chest X-ray datasets and tasks.
+ Abstract:
+ Inertial Measurement Unit (IMU) sensors are becoming increasingly ubiquitous in everyday devices such as smartphones, fitness watches, etc. As a result, the array of health-related applications that tap into this data has been growing, as has the importance of designing accurate prediction models for tasks such as human activity recognition (HAR). However, one important task that has received little attention is the prediction of an individual's heart rate when undergoing a physical activity using IMU data. This could be used, for example, to determine which activities are safe for a person without having them actually perform the activities. We propose a neural architecture for this task composed of convolutional and LSTM layers, similar to state-of-the-art techniques for the closely related task of HAR. However, our model includes a convolutional network that extracts, based on sensor data from a previously executed activity, a physical conditioning embedding (PCE) of the individual to be used as the LSTM's initial hidden state. We evaluate the proposed model, dubbed PCE-LSTM, when predicting the heart rate of 23 subjects performing a variety of physical activities from IMU-sensor data available in public datasets (PAMAP2, PPG-DaLiA). For comparison, we use as baselines the only model specifically proposed for this task, and an adapted state-of-the-art model for HAR. PCE-LSTM yields over 10% lower mean absolute error. We demonstrate empirically that this error reduction is in part due to the use of the PCE. Last, we use the two datasets (PPG-DaLiA, WESAD) to show that PCE-LSTM can also be successfully applied when photoplethysmography (PPG) sensors are available to rectify heart rate measurement errors caused by movement, outperforming the state-of-the-art deep learning baselines by more than 30%.
+ Towards Reliable and Trustworthy Computer-Aided Diagnosis Predictions: Diagnosing COVID-19 from X-Ray Images
+
+
+ Krishanu Sarker (Georgia State University); Sharbani Pandit (Georgia Institute of Technology); Anupam Sarker (Institute of Epidemiology, Disease Control and Research); Saeid Belkasim and Shihao Ji (Georgia State University)
+
+ Abstract:
+ The COVID-19 pandemic has been ravaging the world as we know it since its emergence. Computer-Aided Diagnosis (CAD) systems with high precision and reliability can play a vital role in the battle against COVID-19. Most of the existing work in the literature focuses on developing sophisticated methods that yield high detection performance, yet does not address the issue of predictive uncertainty. Uncertainty estimation has been explored heavily in the literature for deep neural networks; however, little work has focused on this issue in COVID-19 detection. In this work, we explore the efficacy of state-of-the-art (SOTA) uncertainty estimation methods on COVID-19 detection. We propose to augment the best performing method with a feature denoising algorithm to gain a higher Positive Predictive Value (PPV) on COVID-positive cases. Through extensive experimentation, we identify the most lightweight and easy-to-deploy uncertainty estimation framework that can effectively identify confusing COVID-19 cases for expert analysis while performing comparably to existing resource-heavy uncertainty estimation methods. In collaboration with medical professionals, we further validate the results to ensure the viability of the framework in clinical practice.
+ CheXseen: Unseen Disease Detection for Deep Learning Interpretation of Chest X-rays
+
+
+ Siyu Shi (Department of Medicine, School of Medicine, Stanford University); Ishaan Malhi, Kevin Tran, Andrew Y. Ng, and Pranav Rajpurkar (Department of Computer Science, Stanford University)
+
+ Abstract:
+ We systematically evaluate the performance of deep learning models in the presence of diseases not labeled for or present during training. First, we evaluate whether deep learning models trained on a subset of diseases (seen diseases) can detect the presence of any one of a larger set of diseases. We find that models tend to falsely classify diseases outside of the subset (unseen diseases) as "no disease". Second, we evaluate whether models trained on seen diseases can detect seen diseases when co-occurring with diseases outside the subset (unseen diseases). We find that models are still able to detect seen diseases even when co-occurring with unseen diseases. Third, we evaluate whether feature representations learned by models may be used to detect the presence of unseen diseases given a small labeled set of unseen diseases. We find that the penultimate layer provides useful features for unseen disease detection. Our results can inform the safe clinical deployment of deep learning models trained on a non-exhaustive set of disease classes.
+ Abstract:
+ We explore the application of graph neural networks (GNNs) to the problem of estimating exposure to an infectious pathogen and probability of transmission. Specifically, given a dataset in which a subset of patients are known to be infected and information in the form of a graph about who has interacted with whom, we aim to directly estimate transmission dynamics, i.e., what types of interactions (e.g., length and number) lead to transmission events. While GNNs have proven capable of learning meaningful representations from graph data, they commonly assume tasks with high homophily (i.e., nodes that share an edge look similar). Recently, researchers have proposed techniques for addressing problems with low homophily (e.g., adding residual connections to GNNs). In our problem setting, homophily is high on average, since the majority of patients do not become infected. But homophily remains low with respect to the minority class. In this paper, we characterize this setting as particularly challenging for GNNs. Given the asymmetry in homophily between classes, we hypothesize that solutions designed to address low homophily on average will not suffice, and instead propose a solution based on attention. Applied to both real-world and synthetic network data, we test this hypothesis and explore the ability of GNNs to learn complex transmission dynamics directly from network data. Overall, attention proves to be an effective mechanism for addressing low homophily in the minority class (AUROC with 95% CI: GCN 0.684 (0.659, 0.710) vs. GAT 0.715 (0.688, 0.742)) and such a data-driven approach can outperform approaches based on potentially flawed expert knowledge.
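The homophily asymmetry described above is easy to see on a toy graph. A common homophily measure (one of several in the literature; this sketch uses the simple edge-level version, which may differ from the paper's exact definition) is the fraction of edges whose endpoints share a label:

```python
def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints share a label."""
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)

# Tiny contact graph: nodes 0-3 uninfected (0), node 4 infected (1).
labels = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1}
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
# 3 of 5 edges connect same-label nodes: homophily is high overall ...
assert edge_homophily(edges, labels) == 0.6
# ... but every edge touching the lone infected node is cross-label,
# so homophily restricted to the minority class is zero.
minority_edges = [(u, v) for u, v in edges if 1 in (labels[u], labels[v])]
assert edge_homophily(minority_edges, labels) == 0.0
```

This is the regime the paper targets: average homophily looks benign while the class of interest sits almost entirely on cross-label edges.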
+ Abstract:
+ Explainable artificial intelligence provides an opportunity to improve prediction accuracy over standard linear models using 'black box' machine learning (ML) models while still revealing insights into a complex outcome such as all-cause mortality. We propose the IMPACT (Interpretable Machine learning Prediction of All-Cause morTality) framework that implements and explains complex, non-linear ML models in epidemiological research, by combining a tree ensemble mortality prediction model and an explainability method. We use 133 variables from NHANES 1999-2014 datasets (number of samples: n = 47,261) to predict all-cause mortality. To explain our model, we extract local (i.e., per-sample) explanations to verify well-studied mortality risk factors, and make new discoveries. We present major factors for predicting k-year mortality (k = 1, 3, 5) across different age groups and their individualized impact on mortality prediction. Moreover, we highlight interactions between risk factors associated with mortality prediction, which leads to findings that linear models do not reveal. We demonstrate that compared with traditional linear models, tree-based models have unique strengths such as: (1) improving prediction power, (2) making no distribution assumptions, (3) capturing non-linear relationships and important thresholds, (4) identifying feature interactions, and (5) detecting different non-linear relationships between models. Given the popularity of complex ML models in prognostic research, combining these models with explainability methods has implications for further applications of ML in medical fields. To our knowledge, this is the first study that combines complex ML models and state-of-the-art feature attributions to explain mortality prediction, which enables us to achieve higher prediction accuracy and gain new insights into the effect of risk factors on mortality.
+ Outcomes-Driven Clinical Phenotyping in Patients with Cardiogenic Shock for Risk Modeling and Comparative Treatment Effectiveness
+
+
+ Nathan C. Hurley (Texas A&M University); Alyssa Berkowitz (Yale University); Frederick Masoudi (University of Colorado School of Medicine); Joseph Ross and Nihar Desai (Yale University); Nilay Shah (Mayo Clinic); Sanket Dhruva (UCSF School of Medicine); Bobak J. Mortazavi (Texas A&M University)
+
+ Abstract:
+ Cardiogenic shock is a deadly and complicated illness. Despite extensive research into treating cardiogenic shock, mortality remains high and has not decreased over time. Patients suffering from cardiogenic shock are highly heterogeneous, and developing an understanding of phenotypes among these patients is crucial for understanding this disease and the appropriate treatments for individual patients. In this work, we develop a deep mixture of experts approach to jointly find phenotypes among patients with cardiogenic shock while simultaneously estimating their risk of in-hospital mortality. Although trained with information regarding treatment and outcomes, after training, the proposed model is decomposable into a network that clusters patients into phenotypes from information available prior to treatment. This model is validated on a synthetic dataset and then applied to a cohort of 28,304 patients with cardiogenic shock. The full model predicts in-hospital mortality on this cohort with an AUROC of 0.85 ± 0.01. The model discovers five phenotypes among the population, finding statistically different mortality rates among them and among treatment choices within those groups. This approach allows for grouping patients in clinical clusters with different rates of device utilization and different risk of mortality. This approach is suitable for jointly finding phenotypes within a clinical population and in modeling risk among that population.
+ Abstract:
+ Severe infectious diseases such as the novel coronavirus (COVID-19) pose a huge threat to public health. Stringent control measures, such as school closures and stay-at-home orders, while having significant effects, also bring huge economic losses. In the face of an emerging infectious disease, a crucial question for policymakers is how to make the trade-off and implement the appropriate interventions in a timely manner, in the presence of huge uncertainty. In this work, we propose a Multi-Objective Model-based Reinforcement Learning framework to facilitate data-driven decision making and minimize the long-term overall cost. Specifically, at each decision point, a Bayesian epidemiological model is first learned as the environment model, and then the proposed model-based multi-objective planning algorithm is applied to find a set of Pareto-optimal policies. This framework, combined with the prediction bands for each policy, provides a real-time decision support tool for policymakers. The application is demonstrated with the spread of COVID-19 in China.
+ Abstract:
+ With the growing amount of text in health data, there have been rapid advances in large pre-trained models that can be applied to a wide variety of biomedical tasks with minimal task-specific modifications. Emphasizing the cost of these models, which renders technical replication challenging, this paper summarizes experiments conducted in replicating BioBERT and further pre-training and careful fine-tuning in the biomedical domain. We also investigate the effectiveness of domain-specific and domain-agnostic pre-trained models across downstream biomedical NLP tasks. Our finding confirms that pre-trained models can be impactful in some downstream NLP tasks (QA and NER) in the biomedical domain; however, this improvement may not justify the high cost of domain-specific pre-training.
+ Knowledge Graph-based Question Answering with Electronic Health Records
+
+
+ Junwoo Park and Youngwoo Cho (Korea Advanced Institute of Science and Technology (KAIST)); Haneol Lee (Yonsei University); Jaegul Choo and Edward Choi (Korea Advanced Institute of Science and Technology (KAIST))
+
+ Abstract:
+ Question Answering (QA) is a widely-used framework for developing and evaluating an intelligent machine. In this light, QA on Electronic Health Records (EHR), namely EHR QA, can work as a crucial milestone towards developing an intelligent agent in healthcare. EHR data are typically stored in a relational database, which can also be converted to a directed acyclic graph, allowing two approaches for EHR QA: Table-based QA and Knowledge Graph-based QA. We hypothesize that the graph-based approach is more suitable for EHR QA, as graphs can represent relations between entities and values more naturally compared to tables, which essentially require JOIN operations. In this paper, we propose a graph-based EHR QA where natural language queries are converted to SPARQL instead of SQL. To validate our hypothesis, we create four EHR QA datasets (graph-based vs. table-based, and simplified database schema vs. original database schema), based on the table-based dataset MIMICSQL. We test both a simple Seq2Seq model and a state-of-the-art EHR QA model on all datasets, where the graph-based datasets facilitated up to 34% higher accuracy than the table-based dataset without any modification to the model architectures. Finally, all datasets will be open-sourced to encourage further EHR QA research in both directions.
+ Abstract:
+ There is an increased adoption of electronic health record (EHR) systems by a variety of hospitals and medical centers. This provides an opportunity to leverage automated computer systems in assisting healthcare workers. One of the least utilized yet richest sources of patient information is the unstructured clinical text. In this work, we develop a chart-aware temporal attention network for learning patient representations from clinical notes. We introduce a novel representation where each note is considered a single unit, like a sentence, and composed of attention-weighted words. The notes in turn are aggregated into a patient representation using a second weighting unit, note attention. Unlike standard attention computations, which focus only on the content of the note, we incorporate the chart time of each note as a constraint for attention calculation. This allows our model to focus on notes closer to the prediction time. Using the MIMIC-III dataset, we empirically show that our patient representation and attention calculation achieve the best performance in comparison with various state-of-the-art baselines for one-year mortality prediction and 30-day hospital readmission. Moreover, the attention weights can be used to offer transparency into our model's predictions.
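One simple way to realize the chart-time constraint described above is to subtract a decay term, proportional to how long before the prediction time a note was charted, from each note's content score before the softmax. This is a hypothetical sketch of the idea (the function name, decay form, and parameter are assumptions, not the paper's exact formulation):

```python
import numpy as np

def chart_time_attention(scores, hours_before_prediction, decay=0.1):
    """Softmax attention over notes where each content score is
    down-weighted by the note's age relative to prediction time."""
    logits = scores - decay * hours_before_prediction
    exp = np.exp(logits - logits.max())   # stable softmax
    return exp / exp.sum()

# Three notes with identical content scores but different chart times.
scores = np.array([1.0, 1.0, 1.0])
hours = np.array([48.0, 24.0, 1.0])
weights = chart_time_attention(scores, hours)
assert np.isclose(weights.sum(), 1.0)
# With equal content, the most recent note receives the largest weight.
assert weights.argmax() == 2
```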
+ Abstract:
+ Survival analysis is a challenging variation of regression modeling because of the presence of censoring, where the outcome measurement is only partially known, due to, for example, loss to follow up. Such problems come up frequently in medical applications, making survival analysis a key endeavor in biostatistics and machine learning for healthcare, with Cox regression models being amongst the most commonly employed models. We describe a new approach for survival analysis regression models, based on learning mixtures of Cox regressions to model individual survival distributions. We propose an approximation to the Expectation Maximization algorithm for this model that makes hard assignments to mixture groups to make optimization efficient. Within each group, we fit the hazard ratios using deep neural networks, and the baseline hazard for each mixture component non-parametrically. We perform experiments on multiple real-world datasets, and look at the mortality rates of patients across ethnicity and gender. We emphasize the importance of calibration in healthcare settings and demonstrate that our approach outperforms classical and modern survival analysis baselines, both in terms of discriminative performance and calibration, with large gains in performance on the minority demographics.
+ Algorithmic fairness and the science of health disparities
+
+ Rumi Chunara / New York University
+
Abstract: It has been shown that equalizing health disparities can avert more deaths than the number of lives saved by medical advances alone in the same time frame. Moreover, without a simultaneous focus on innovations and equity, advances in health for one group can occur at the cost of added challenges for another. In this talk I will introduce the science of health disparities and juxtapose it with the machine learning subfield of algorithmic fairness. Given the key foci and principles of health equity and health disparities within public and population health, I will show examples of how machine learning and principles of public and population health can be synergized for using data to advance the science of health disparities and sustainable health of entire populations.
+
Bio: Dr. Rumi Chunara is an Associate Professor at New York University, jointly appointed at the Tandon School of Engineering (in Computer Science) and the School of Global Public Health (in Biostatistics/Epidemiology). Her PhD is from the Harvard-MIT Division of Health Sciences and Technology and her BSc from Caltech. Her research group focuses on developing computational and statistical approaches for acquiring, integrating and using data to improve population and public health. She is an MIT TR35, NSF CAREER, Bill & Melinda Gates Foundation Grand Challenges, Facebook Research and Max Planck Sabbatical award winner.
+
+ Machine Learning for Human Genetics: A Multi-Scale View on Complex Traits and Disease
+
+ Lorin Crawford / Microsoft Research New England; Brown University
+
Abstract: A common goal in genome-wide association (GWA) studies is to characterize the relationship between genotypic and phenotypic variation. Linear models are widely used tools in GWA analyses, in part, because they provide significance measures which detail how individual single nucleotide polymorphisms (SNPs) are statistically associated with a trait or disease of interest. However, traditional linear regression largely ignores non-additive genetic variation, and the univariate SNP-level mapping approach has been shown to be underpowered and challenging to interpret for certain trait architectures. While machine learning (ML) methods such as neural networks are well known to account for complex data structures, these same algorithms have also been criticized as "black box" since they do not naturally carry out statistical hypothesis testing like classic linear models. This limitation has prevented ML approaches from being used for association mapping tasks in GWA applications. In this talk, we present flexible and scalable classes of Bayesian feedforward models which provide interpretable probabilistic summaries such as posterior inclusion probabilities and credible sets, which allow researchers to simultaneously perform (i) fine-mapping with SNPs and (ii) enrichment analyses with SNP-sets on complex traits. While analyzing real data assayed in diverse self-identified human ancestries from the UK Biobank, the Biobank Japan, and the PAGE consortium, we demonstrate that interpretable ML has the power to increase the return on investment in multi-ancestry biobanks. Furthermore, we highlight that by prioritizing biological mechanism we can identify associations that are robust across ancestries---suggesting that ML can play a key role in making personalized medicine a reality for all.
+
Bio: Lorin Crawford is a Senior Researcher at Microsoft Research New England. He also holds a position as the RGSS Assistant Professor of Biostatistics at Brown University. His scientific research interests involve the development of novel and efficient computational methodologies to address complex problems in statistical genetics, cancer pharmacology, and radiomics (e.g., cancer imaging). Dr. Crawford has an extensive background in modeling massive data sets of high-throughput molecular information as it pertains to functional genomics and cellular-based biological processes. His most recent work has earned him a place on Forbes 30 Under 30 list, The Root 100 Most Influential African Americans list, and recognition as an Alfred P. Sloan Research Fellow and a David & Lucile Packard Foundation Fellowship for Science and Engineering. Before joining Brown, Dr. Crawford received his PhD from the Department of Statistical Science at Duke University and received his Bachelor of Science degree in Mathematics from Clark Atlanta University.
+
+ Understanding Heterogeneity as a Route to Understanding Health
+
+ Danielle Belgrave / DeepMind
+
Abstract: Machine learning presents an opportunity to understand the patient journey over high-dimensional data in the clinical context. This is aligned with one of the foundational issues of machine learning for healthcare: how do you represent a patient state? Improving state representations allows us to (i) visualise/cluster deteriorating patients; (ii) understand the patient journey, and thus the heterogeneous pathways to improvement or clinical deterioration, encompassing different data modalities; and so (iii) more quickly identify situations for intervention. In this talk, I present motivating examples of understanding heterogeneity as a route towards understanding health and personalising healthcare interventions.
+
Bio: Danielle Belgrave is a Senior Staff Research Scientist at DeepMind. Prior to joining DeepMind she worked in the Healthcare Intelligence group at Microsoft Research and was a tenured research fellow at Imperial College London. Her research focuses on integrating medical domain knowledge, machine learning and causal modelling frameworks to understand health. She obtained a BSc in Mathematics and Statistics from London School of Economics, an MSc in Statistics from University College London and a PhD in the area of machine learning in health applications from the University of Manchester.
+
+ Data Science against COVID-19
+
+ Nuria Oliver / ELLIS
+
Abstract: In my talk, I will describe the work that I have been doing since March 2020, leading a multi-disciplinary team of 20+ volunteer scientists working very closely with the Presidency of the Valencian Government in Spain on four areas: (1) human mobility modeling; (2) computational epidemiological models (metapopulation, individual and LSTM-based models); (3) predictive models; and (4) a large-scale online citizen survey called the COVID19impactsurvey (https://covid19impactsurvey.org) with over 700,000 answers worldwide. This survey has enabled us to shed light on the impact that the pandemic is having on people's lives. I will present the results obtained in each of these four areas, including winning the 500K XPRIZE Pandemic Response Challenge and obtaining a best paper award at ECML-PKDD 2021. I will share the lessons learned in this very special initiative of collaboration between the civil society at large (through the survey), the scientific community (through the Expert Group) and a public administration (through the Commissioner at the Presidency level). For those interested in knowing more, WIRED magazine published an extensive article describing our story: https://www.wired.co.uk/article/valencia-ai-covid-data.
+
Bio: Nuria Oliver is Co-founder and Vice-president of ELLIS (The European Laboratory for Learning and Intelligent Systems), Co-founder and Director of the ELLIS Unit Alicante, Chief Data Scientist at Data-Pop Alliance and Chief Scientific Advisor to the Vodafone Institute. Nuria earned her PhD from MIT. She is a Fellow of the ACM, IEEE and EurAI. She is the youngest member (and fourth female) in the Spanish Royal Academy of Engineering. She is also the only Spanish scientist at SIGCHI Academy. She has over 25 years of research experience in human-centric AI and is the author of over 180 widely cited scientific articles as well as an inventor of 40+ patents and a public speaker. Her work is regularly featured in the media and has received numerous recognitions, including the Spanish National Computer Science Award, the MIT TR100 (today TR35), Young Innovator Award (first Spanish scientist to receive this award); the 2020 Data Scientist of the Year by ESRI, the 2021 King Jaume I award in New Technologies and the 2021 Abie Technology Leadership Award. In March of 2020, she was appointed Commissioner to the President of the Valencian Government on AI Strategy and Data Science against COVID-19. In that role, she has recently co-led ValenciaIA4COVID, the winning team of the 500k XPRIZE Pandemic Response Challenge. Their work was featured in WIRED, among other media.
+
+ Machine Learning in Public Health: are we there yet?
+
+ Jessica Tenenbaum / North Carolina Department of Health and Human Services; Duke University School of Medicine
+
Abstract: Spoiler alert: No. And yes, it is much, much further. Public health has not traditionally been a data-driven field. The good news is that this has been changing in recent years, accelerated significantly by the COVID epidemic. But public health and human services organizations have many more fundamental things to worry about before we will have the luxury of considering what machine learning can enable. These fundamentals include data-related facets such as electronic data capture and exchange, data quality, data governance, information technology infrastructure, and data management best practices. In addition, data literacy, workforce development, and compensation that is a fraction of what 'quants' can earn in industry are also major stumbling blocks toward advanced analytics in public health. At the start of the COVID pandemic, many communicable diseases were reported by fax machine and then hand-entered into a database. Although there was significant interest in predictive modeling to project hospital capacity out in the future, even the most sophisticated models were of limited use to policy makers beyond basic trends and observations from the front lines. The most notable exception, where AI is in fact proving useful in public health, is in the use of 'robotic process automation' (RPA) as a band-aid for poorly designed systems that require mindless human intervention. These tools serve as workarounds for systems that lack interoperability by emulating human users to do the grunt work of data entry and wrangling. This talk will be a reality check from the trenches of state government on the heels of the COVID-19 pandemic.
+
Bio: Dr. Tenenbaum serves as the Chief Data Officer (CDO) for DHHS, where she oversees data strategy across the Department enabling the use of information to inform and evaluate policy and improve the health and well-being of residents of North Carolina. Prior to taking on the role of CDO, Dr. Tenenbaum was a founding faculty member of the Division of Translational Biomedical Informatics within Duke University's Department of Biostatistics and Bioinformatics where her research focused on informatics methods to enable precision medicine, particularly in mental health. She is also interested in ethical, legal, and social issues around big data and precision medicine. Nationally, Dr. Tenenbaum has served as Associate Editor for the Journal of Biomedical Informatics and as an elected member of the Board of Directors for the American Medical Informatics Association (AMIA). She currently serves on the Board of Scientific Counselors for the National Library of Medicine. After earning her bachelor's degree in biology from Harvard, Dr. Tenenbaum was a Program Manager at Microsoft Corporation in Redmond, WA for six years before pursuing a PhD in biomedical informatics at Stanford University. Dr. Tenenbaum is a strong promoter and advocate of young women interested in STEM (science, technology, engineering, and math) careers.
+
+ Reducing bias in machine learning systems: Understanding drivers of pain
+
+ Jure Leskovec / Stanford University
+
Abstract: AI systems tend to amplify biases and disparities. When we feed them data that reflects our biases, they mimic them---from antisemitic chatbots to racially biased software. In this talk I am going to discuss two examples of how AI can help us reduce biases and disparities. First, I am going to explain how we can use AI to understand why underserved populations experience higher levels of pain. This is true even after controlling for the objective severity of diseases like osteoarthritis, as graded by human physicians using medical images, which raises the possibility that underserved patients' pain stems from factors external to the knee, such as stress. We develop a deep learning approach to measure the severity of osteoarthritis by using knee X-rays to predict patients' experienced pain, and show that this approach dramatically reduces unexplained racial disparities in pain.
+
Bio: Jure Leskovec is an associate professor of Computer Science at Stanford University, the Chief Scientist at Pinterest, and an Investigator at the Chan Zuckerberg Biohub. He co-founded a machine learning startup Kosei, which was later acquired by Pinterest. Leskovec's research area is machine learning and data science for complex, richly-labeled relational structures, graphs, and networks for systems at all scales, from interactions of proteins in a cell to interactions between humans in a society. Applications include commonsense reasoning, recommender systems, social network analysis, computational social science, and computational biology with an emphasis on drug discovery. This research has won several awards including a Lagrange Prize, Microsoft Research Faculty Fellowship, the Alfred P. Sloan Fellowship, and numerous best paper and test of time awards. It has also been featured in popular press outlets such as the New York Times and the Wall Street Journal. Leskovec received his bachelor's degree in computer science from University of Ljubljana, Slovenia, PhD in machine learning from Carnegie Mellon University and postdoctoral training at Cornell University. You can follow him on Twitter at @jure.
+ Abstract:
+ You’ve created an awesome model that predicts with near 100 percent accuracy. Now what? In this tutorial, we will give insight into the implementation, deployment, integration, and evaluation steps following the building of a clinical model. Specifically, we will discuss each step in the context of informing design choices as you build a model. For example, aggressive feature selection is a necessary step toward integration, since real-time data streams for every feature a machine learning model consumes may not be accessible or feasible. We will use our implementation and evaluation of a COVID-19 adverse-event model at our institution as a representative case study. This case study will demonstrate the full lifecycle of a clinical model, how we transition from a model to affecting patient outcomes, and the socio-technical challenges involved in success.
+
+ Bio:
+ Yindalon Aphinyanaphongs, MD, PhD (Predictive Analytics Team Lead) is a physician scientist in the Center for Healthcare Innovation and Delivery Science in the Department of Population Health at NYU Langone Health in New York City. Academically, he is an assistant professor and his lab focuses on novel applications of machine learning to clinical problems and the science behind successful translation of predictive models into clinical practice to drive value. Operationally, he is the Director of Operational Data Science and Machine Learning at NYU Langone Health. In this role, he leads a Predictive Analytics Unit composed of data scientists and engineers that build, evaluate, benchmark, and deploy predictive algorithms into the clinical enterprise.
+
+ Abstract:
+ The growth of availability and variety of healthcare data sources has provided unique opportunities for data integration and evidence synthesis, which can potentially accelerate knowledge discovery and enable better clinical decision-making. However, many practical and technical challenges, such as data privacy, high-dimensionality and heterogeneity across different datasets, remain to be addressed. In this talk, I will introduce several methods for the effective and efficient integration of electronic health records and other healthcare datasets. Specifically, we develop communication-efficient distributed algorithms for jointly analyzing multiple datasets without the need of sharing patient-level data. Our algorithms can account for heterogeneity across different datasets. We provide theoretical guarantees for the performance of our algorithms, and examples of implementing the algorithms to real-world clinical research networks.
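To give a flavor of the "no patient-level sharing" idea, here is a hedged sketch (not Dr. Duan's algorithms): each site fits a model locally and shares only its coefficient vector and sample size, which a coordinating center combines by weighted averaging. The communication-efficient methods in the talk improve on this one-shot average, for example by also exchanging gradient information and accounting for cross-site heterogeneity.

```python
import numpy as np

def site_fit(X, y):
    """Local OLS fit at one site; only (coefficients, n) leave the site."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, len(y)

def one_shot_combine(site_results):
    """Sample-size-weighted average of site estimates; no rows are shared."""
    betas = np.array([b for b, _ in site_results])
    ns = np.array([n for _, n in site_results], dtype=float)
    return (betas * ns[:, None]).sum(axis=0) / ns.sum()

# Simulated network of three hospitals of different sizes sharing one
# true data-generating model (all numbers here are illustrative).
rng = np.random.default_rng(0)
true_beta = np.array([1.0, -2.0, 0.5])
sites = []
for n in (200, 500, 300):
    X = rng.normal(size=(n, 3))
    y = X @ true_beta + rng.normal(scale=0.5, size=n)
    sites.append(site_fit(X, y))
beta_hat = one_shot_combine(sites)
print(beta_hat)  # close to the true coefficients
```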
+
+ Bio:
+ Dr. Duan is an Assistant Professor of Biostatistics at the Harvard T.H. Chan School of Public Health. She received her Ph.D. in Biostatistics in May 2020 from the University of Pennsylvania. Her research interests focus on three distinct areas: methods for integrating evidence from different data sources, identifying signals from high dimensional data, and accounting for suboptimality of real-world data, such as missing data and measurement errors.
+
+ Abstract:
+ Digital health technologies provide promising ways to deliver interventions outside of clinical settings. Wearable sensors and mobile phones provide real-time data streams that provide information about an individual’s current health including both internal (e.g., mood) and external (e.g., location) contexts. This tutorial discusses the algorithms underlying mobile health clinical trials. Specifically, we introduce the micro-randomized trial (MRT), an experimental design for optimizing real-time interventions. We define the causal excursion effect and discuss reasons why this effect is often considered the primary causal effect of interest in MRT analysis. We introduce statistical methods for primary and secondary analyses of MRT data. Attendees will have access to synthetic digital health experimental data to better understand online learning and experimentation algorithms, the systems underlying real-time delivery of treatment, and their evaluation using collected data.
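The defining feature of an MRT is that each participant is re-randomized at every decision point with a known probability. The toy simulation below illustrates only that structure, with made-up numbers; it is not the tutorial's estimators, which generalize this contrast to time-varying randomization probabilities and effect moderators.

```python
import numpy as np

# Toy micro-randomized trial: each participant is randomized to a
# treatment (A=1, e.g., an activity notification) with probability p at
# every decision point, and a proximal outcome Y (e.g., step count in
# the following hour) is observed after each one.
rng = np.random.default_rng(7)
n, T, p = 100, 50, 0.4            # participants, decision points, P(A=1)
effect = 2.0                      # assumed true proximal treatment effect
A = rng.binomial(1, p, size=(n, T))         # sequential randomizations
baseline = rng.normal(30, 5, size=(n, 1))   # person-specific level
Y = baseline + effect * A + rng.normal(0, 3, size=(n, T))

# Because A is randomized independently of everything else, a plain
# difference in proximal-outcome means is unbiased for the proximal
# effect at these decision points.
est = Y[A == 1].mean() - Y[A == 0].mean()
print(round(est, 2))  # near the true effect of 2.0
```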
+
+ Bio:
+ Walter Dempsey is an Assistant Professor of Biostatistics and an Assistant Research Professor in the d3lab located in the Institute of Social Research. My research focuses on Statistical Methods for Digital and Mobile Health. My current work involves three complementary research themes: (1) experimental design and data analytic methods to inform multi-stage decision making in health; (2) statistical modeling of complex longitudinal and survival data; and (3) statistical modeling of complex relational structures such as interaction networks.
+
+ Abstract:
+ Does increasing the dosage of a drug treatment cause adverse reactions in patients? This is a causal question: did increased drug dosage cause some patients to have an adverse reaction, or would they have had the reaction anyway due to other factors? A classical approach to studying this causal question from observational data involves applying causal inference techniques to observed measurements of all the relevant clinical variables. However, there is a growing recognition that abundant text data, such as medical records, physicians' notes, or even forum posts from online medical communities, provide a rich source of information for causal inference. In this tutorial, I'll introduce causal inference and highlight the unique challenges that high-dimensional and noisy text data pose. Then, I'll use two text applications involving online forums and consumer complaints to motivate recent approaches that extend natural language processing (NLP) methods in service of causal inference. I'll discuss some new assumptions we need to introduce to bridge the gap between noisy text data and valid causal inference. I'll conclude by summarizing open research questions at the intersection of causal inference and text analysis.
+
+ Bio:
+ Dhanya Sridhar is an assistant professor at the University of Montreal and a core academic member at Mila - Quebec AI Institute. She holds a Canada CIFAR AI Chair. She was a postdoctoral researcher at Columbia University and completed her PhD at the University of California, Santa Cruz. Her research interests are at the intersection of causality and machine learning, focusing on applications to text and social network data.
+
+ Abstract:
+ Data visualization is essential for analyzing biomedical and public health data and communicating the findings to key stakeholders. However, the presence of a data visualization is not enough; the choices we make when visualizing data are equally important in establishing its understandability and impact. This tutorial will discuss strategies for visualizing data and evaluating its impact with an appropriate target audience. The aim is to build an intuition for developing and assessing visualizations by drawing on theories of visualization theories together with examples from prior research and ongoing attempts to visualize the present pandemic.
+
+ Bio:
+ Ana Crisan is currently a senior research scientist at Tableau, a Salesforce company. She conducts interdisciplinary research that integrates techniques and methods from machine learning, human computer interaction, and data visualization. Her research focuses on the intersection of Data Science and Data Visualization, especially toward the way humans can collaboratively work together with ML/AI systems through visual interfaces. She completed her Ph.D. in Computer Science at the University of British Columbia, under the joint supervision of Dr. Tamara Muzner and Dr. Jennifer L. Gardy. Prior to that, she was a research scientist at the British Columbia Centre for Disease Control and Decipher Biosciences, where she conducted research on machine learning and data visualization research toward applications in infectious disease and cancer genomics, respectively. Her research has appeared in publications of the ACM (CHI), IEEE (TVCG, CG&A), Bioinformatics, and Nature.
+
Changing patient trajectory: A case study exploring implementation and deployment of clinical machine learning models

Yindalon Aphinyanaphongs
Abstract: You’ve created an awesome model that predicts with near 100 percent accuracy. Now what? In this tutorial, we will give insight into the implementation, deployment, integration, and evaluation steps that follow the building of a clinical model. Specifically, we will discuss how each step should inform design choices as you build a model. For example, aggressive feature selection is often a necessary step toward integration, because real-time data streams for all the data points a machine learning model consumes may not be accessible or feasible. We will use our implementation and evaluation of a COVID-19 adverse event model at our institution as a representative case study. This case study will demonstrate the full lifecycle of a clinical model: how we transition from a model to affecting patient outcomes, and the socio-technical challenges that determine success.

Bio: Yindalon Aphinyanaphongs, MD, PhD (Predictive Analytics Team Lead) is a physician-scientist in the Center for Healthcare Innovation and Delivery Science in the Department of Population Health at NYU Langone Health in New York City. Academically, he is an assistant professor whose lab focuses on novel applications of machine learning to clinical problems and the science behind successful translation of predictive models into clinical practice to drive value. Operationally, he is the Director of Operational Data Science and Machine Learning at NYU Langone Health. In this role, he leads a Predictive Analytics Unit composed of data scientists and engineers who build, evaluate, benchmark, and deploy predictive algorithms into the clinical enterprise.

Distributed Statistical Learning and Inference with Electronic Health Records Data

Rui Duan

Abstract: The growing availability and variety of healthcare data sources provide unique opportunities for data integration and evidence synthesis, which can potentially accelerate knowledge discovery and enable better clinical decision-making. However, many practical and technical challenges, such as data privacy and the high dimensionality and heterogeneity of different datasets, remain to be addressed. In this talk, I will introduce several methods for the effective and efficient integration of electronic health records and other healthcare datasets. Specifically, we develop communication-efficient distributed algorithms for jointly analyzing multiple datasets without the need to share patient-level data. Our algorithms can account for heterogeneity across different datasets. I will provide theoretical guarantees for the performance of our algorithms, along with examples of applying them to real-world clinical research networks.

Bio: Dr. Duan is an Assistant Professor of Biostatistics at the Harvard T.H. Chan School of Public Health. She received her Ph.D. in Biostatistics in May 2020 from the University of Pennsylvania. Her research interests focus on three distinct areas: methods for integrating evidence from different data sources, identifying signals from high-dimensional data, and accounting for the suboptimality of real-world data, such as missing data and measurement errors.
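To give a flavor of the communication-efficient idea, here is a generic sketch (under a simple linear-model assumption, not the specific algorithms from the talk): each site transmits only aggregate sufficient statistics, so patient-level records never leave the site, yet the pooled estimate matches a fit on the combined data.

```python
import numpy as np

rng = np.random.default_rng(0)
beta_true = np.array([1.0, -2.0, 0.5])

def site_summaries(n):
    """Compute X'X and X'y locally; the rows (patient-level data) stay on site."""
    X = rng.normal(size=(n, 3))
    y = X @ beta_true + rng.normal(scale=0.1, size=n)
    return X.T @ X, X.T @ y

# Three hypothetical sites with different sample sizes.
summaries = [site_summaries(n) for n in (200, 500, 1000)]

# The coordinating center sums the aggregates and solves the pooled
# normal equations, which reproduces the combined-data least-squares fit.
XtX = sum(s[0] for s in summaries)
Xty = sum(s[1] for s in summaries)
beta_hat = np.linalg.solve(XtX, Xty)
```

For linear models this single round of communication is exact; handling heterogeneity across sites, as the talk's methods do, requires more than this toy example shows.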

Challenges in Developing Online Learning and Experimentation Algorithms in Digital Health

Walter Dempsey

Abstract: Digital health technologies provide promising ways to deliver interventions outside of clinical settings. Wearable sensors and mobile phones offer real-time data streams with information about an individual’s current health, including both internal (e.g., mood) and external (e.g., location) contexts. This tutorial discusses the algorithms underlying mobile health clinical trials. Specifically, we introduce the micro-randomized trial (MRT), an experimental design for optimizing real-time interventions. We define the causal excursion effect and discuss why this effect is often considered the primary causal effect of interest in MRT analysis. We then introduce statistical methods for primary and secondary analyses of MRTs. Attendees will have access to synthetic digital health experimental data to better understand online learning and experimentation algorithms, the systems underlying real-time delivery of treatment, and their evaluation using collected data.

Bio: Walter Dempsey is an Assistant Professor of Biostatistics and an Assistant Research Professor in the d3lab, located in the Institute for Social Research. His research focuses on statistical methods for digital and mobile health. His current work involves three complementary research themes: (1) experimental design and data-analytic methods to inform multi-stage decision making in health; (2) statistical modeling of complex longitudinal and survival data; and (3) statistical modeling of complex relational structures such as interaction networks.
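The MRT design can be pictured with a small simulation. This is an illustrative sketch only: the constant randomization probability, absence of moderators, and all numbers below are assumptions made here, not content from the tutorial.

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, n_points = 100, 50  # participants and decision points (illustrative sizes)
p_treat = 0.5                # known randomization probability at each decision point
true_effect = 0.3            # assumed proximal effect of the intervention

# In an MRT, each participant is independently re-randomized at every
# decision point, rather than once at study entry.
A = rng.binomial(1, p_treat, size=(n_users, n_points))
baseline = rng.normal(size=(n_users, 1))  # person-level heterogeneity
Y = baseline + true_effect * A + rng.normal(size=(n_users, n_points))

# With a constant, known randomization probability and no moderators, a
# fully marginal excursion-effect estimate reduces to a simple
# treated-minus-untreated difference in mean proximal outcomes.
effect_hat = Y[A == 1].mean() - Y[A == 0].mean()
```

Time-varying randomization probabilities and moderated effects, which the tutorial's methods address, would require inverse-probability weighting rather than this plain difference in means.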

Causal Inference from Text Data

Dhanya Sridhar

Abstract: Does increasing the dosage of a drug treatment cause adverse reactions in patients? This is a causal question: did increased drug dosage cause some patients to have an adverse reaction, or would they have had the reaction anyway due to other factors? A classical approach to studying this causal question from observational data involves applying causal inference techniques to observed measurements of all the relevant clinical variables. However, there is a growing recognition that abundant text data, such as medical records, physicians' notes, or even forum posts from online medical communities, provide a rich source of information for causal inference. In this tutorial, I'll introduce causal inference and highlight the unique challenges that high-dimensional and noisy text data pose. Then, I'll use two text applications involving online forums and consumer complaints to motivate recent approaches that extend natural language processing (NLP) methods in service of causal inference. I'll discuss some new assumptions we need to introduce to bridge the gap between noisy text data and valid causal inference. I'll conclude by summarizing open research questions at the intersection of causal inference and text analysis.
Bio: Dhanya Sridhar is an assistant professor at the University of Montreal and a core academic member at Mila - Quebec AI Institute. She holds a Canada CIFAR AI Chair. She was a postdoctoral researcher at Columbia University and completed her PhD at the University of California, Santa Cruz. Her research interests are at the intersection of causality and machine learning, focusing on applications to text and social network data.
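A minimal numeric sketch of the core problem the tutorial tackles: a confounder that lives in text biases the naive estimate, and adjusting for it recovers the causal effect. Here the text is collapsed to a single score z, an illustrative stand-in for an NLP-derived feature; none of this is a method from the tutorial itself.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# z stands in for a confounder recoverable from text, e.g., severity
# language in a clinical note (an illustrative assumption).
z = rng.normal(size=n)
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-2.0 * z)))  # dosage increase depends on z
y = 1.0 * t + 2.0 * z + rng.normal(size=n)           # true causal effect of t is 1.0

naive = y[t == 1].mean() - y[t == 0].mean()          # confounded, far from 1.0

# Regression adjustment on the text-derived score recovers the effect,
# assuming z captures all confounding (the key assumption the tutorial
# interrogates for real text data).
X = np.column_stack([np.ones(n), t, z])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
adjusted = coef[1]
```

The hard part in practice, and the subject of the tutorial, is whether noisy, high-dimensional text actually licenses this kind of adjustment.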

'Are log scales endemic yet?' Strategies for visualizing biomedical and public health data

Ana Crisan

Abstract: Data visualization is essential for analyzing biomedical and public health data and communicating the findings to key stakeholders. However, the presence of a data visualization is not enough; the choices we make when visualizing data are equally important in establishing its understandability and impact. This tutorial will discuss strategies for visualizing data and evaluating its impact with an appropriate target audience. The aim is to build an intuition for developing and assessing visualizations by drawing on visualization theory together with examples from prior research and ongoing attempts to visualize the present pandemic.

Bio: Ana Crisan is currently a senior research scientist at Tableau, a Salesforce company. She conducts interdisciplinary research that integrates techniques and methods from machine learning, human-computer interaction, and data visualization. Her research focuses on the intersection of data science and data visualization, especially the ways humans can collaboratively work together with ML/AI systems through visual interfaces. She completed her Ph.D. in Computer Science at the University of British Columbia, under the joint supervision of Dr. Tamara Munzner and Dr. Jennifer L. Gardy. Prior to that, she was a research scientist at the British Columbia Centre for Disease Control and Decipher Biosciences, where she conducted machine learning and data visualization research toward applications in infectious disease and cancer genomics, respectively. Her research has appeared in publications of the ACM (CHI), IEEE (TVCG, CG&A), Bioinformatics, and Nature.

Network studies: As many databases as possible or enough to answer the question quickly?

Christopher Chute / Johns Hopkins University

Bio: Dr. Chute is the Bloomberg Distinguished Professor of Health Informatics, Professor of Medicine, Public Health, and Nursing at Johns Hopkins University, and Chief Research Information Officer for Johns Hopkins Medicine. He is also Section Head of Biomedical Informatics and Data Science and Deputy Director of the Institute for Clinical and Translational Research. He received his undergraduate and medical training at Brown University, internal medicine residency at Dartmouth, and doctoral training in Epidemiology and Biostatistics at Harvard. He is Board Certified in Internal Medicine and Clinical Informatics, and an elected Fellow of the American College of Physicians, the American College of Epidemiology, HL7, the American Medical Informatics Association, and the American College of Medical Informatics (ACMI), as well as a Founding Fellow of the International Academy of Health Sciences Informatics; he was president of ACMI in 2017-18. He is an elected member of the Association of American Physicians. His career has focused on how we can represent clinical information to support analyses and inferencing, including comparative effectiveness analyses, decision support, best evidence discovery, and translational research. He has had a deep interest in the semantic consistency of health data, harmonized information models, and ontology. His current research focuses on translating basic science information to clinical practice, how we classify dysfunctional phenotypes (disease), and the harmonization and rendering of real-world clinical data, including electronic health records, to support data inferencing. He became founding Chair of Biomedical Informatics at Mayo Clinic in 1988, retiring from Mayo in 2014, where he remains an emeritus Professor of Biomedical Informatics. He is presently PI on a spectrum of high-profile informatics grants from NIH spanning translational science, including serving as co-lead of the National COVID Cohort Collaborative (N3C). He has been active in many HIT standards efforts, and has chaired ISO Technical Committee 215 on Health Informatics and the World Health Organization (WHO) revision of the International Classification of Diseases (ICD-11).

Network studies: As many databases as possible or enough to answer the question quickly?

Robert Platt / McGill University

Bio: Robert Platt is Professor in the Departments of Epidemiology, Biostatistics, and Occupational Health, and of Pediatrics, at McGill University. He holds the Albert Boehringer I endowed chair in Pharmacoepidemiology, and is Principal Investigator of the Canadian Network for Observational Drug Effect Studies (CNODES). His research focuses on improving statistical methods for the study of medications using administrative data, with a substantive focus on medications in pregnancy. Dr. Platt is an editor-in-chief of Statistics in Medicine and is on the editorial boards of the American Journal of Epidemiology and Pharmacoepidemiology and Drug Safety. He has published over 400 articles, one book and several book chapters on biostatistics and epidemiology.

Data Heterogeneity: More Heterogeneous Data or Less Homogeneous Data?

Tianxi Cai / Harvard Medical School

Bio: Tianxi Cai is the John Rock Professor of Translational Data Science at Harvard, with joint appointments in the Biostatistics Department and the Department of Biomedical Informatics. She directs the Translational Data Science Center for a Learning Health System at Harvard Medical School and co-directs the Applied Bioinformatics Core at VA MAVERIC. She is a leading figure in developing analytical tools for mining multi-institutional EHR data, generating real-world evidence, and predictive modeling with large-scale biomedical data. Tianxi received her Doctor of Science in Biostatistics at Harvard and was an assistant professor at the University of Washington before returning to Harvard as a faculty member in 2002.

Data Heterogeneity: More Heterogeneous Data or Less Homogeneous Data?

Yong Chen / University of Pennsylvania

Bio: Dr. Yong Chen is Professor of Biostatistics in the Department of Biostatistics, Epidemiology, and Informatics at the University of Pennsylvania (Penn). He directs the Computing, Inference and Learning Lab at Penn, which focuses on integrating fundamental statistical principles into quantitative methods for tackling key challenges in modern biomedical data. Dr. Chen is an expert in the synthesis of evidence from multiple data sources, including systematic review and meta-analysis, distributed algorithms, and data integration, with applications to comparative effectiveness studies, health policy, and precision medicine. He has published over 170 peer-reviewed papers in a wide spectrum of methodological and clinical areas. During the pandemic, Dr. Chen has served as Director of the Biostatistics Core for Pediatric PASC of the RECOVER COVID initiative, a national multi-center real-world-data study on Post-Acute Sequelae of SARS-CoV-2 infection (PASC) involving more than 13 million patients across more than 10 health systems. He is an elected Fellow of the American Statistical Association and the American Medical Informatics Association, and an elected Member of the International Statistical Institute and the Society for Research Synthesis Methodology.

Differential Privacy vs. Synthetic Data

Khaled El Emam / University of Ottawa

Bio: Dr. Khaled El Emam is the Canada Research Chair (Tier 1) in Medical AI at the University of Ottawa, where he is a Professor in the School of Epidemiology and Public Health. He is also a Senior Scientist at the Children’s Hospital of Eastern Ontario Research Institute and Director of the multi-disciplinary Electronic Health Information Laboratory, conducting research on privacy-enhancing technologies to enable the sharing of health data for secondary purposes, including synthetic data generation and de-identification methods. Khaled is a co-founder of Replica Analytics, a company that develops synthetic data generation technology, which was recently acquired by Aetion. As an entrepreneur, Khaled founded or co-founded six product and services companies involved with data management and data analytics, some with successful exits. Prior to his academic roles, he was a Senior Research Officer at the National Research Council of Canada. He also served as the head of the Quantitative Methods Group at the Fraunhofer Institute in Kaiserslautern, Germany. He participates in a number of committees, including the European Medicines Agency Technical Anonymization Group, the Panel on Research Ethics advising on the TCPS, and the Strategic Advisory Council of the Office of the Information and Privacy Commissioner of Ontario, and is co-editor-in-chief of the JMIR AI journal. In 2003 and 2004, he was ranked as the top systems and software engineering scholar worldwide by the Journal of Systems and Software, based on his research on measurement and quality evaluation and improvement. He held the Canada Research Chair in Electronic Health Information at the University of Ottawa from 2005 to 2015. Khaled has a PhD from the Department of Electrical and Electronics Engineering.

Differential Privacy vs. Synthetic Data

Li Xiong / Emory University

Bio: Li Xiong is a Samuel Candler Dobbs Professor of Computer Science and Professor of Biomedical Informatics at Emory University. She held a Winship Distinguished Research Professorship from 2015 to 2018. She has a Ph.D. from the Georgia Institute of Technology, an MS from Johns Hopkins University, and a BS from the University of Science and Technology of China. She and her research lab, Assured Information Management and Sharing (AIMS), conduct research on algorithms and methods at the intersection of data management, machine learning, and data privacy and security, with a recent focus on privacy-enhancing and robust machine learning. She has published over 170 papers and received six best paper or runner-up awards. She serves or has served as an associate editor for IEEE TKDE, IEEE TDSC, and VLDBJ; general co-chair for ACM CIKM 2022; program co-chair for IEEE BigData 2020 and ACM SIGSPATIAL 2018 and 2020; program vice-chair for ACM SIGMOD 2022 and 2024 and IEEE ICDE 2020 and 2023; and VLDB Sponsorship Ambassador. Her research is supported by federal agencies including NSF, NIH, AFOSR, and PCORI, and by industry awards from Google, IBM, Cisco, AT&T, and the Woodrow Wilson Foundation. She is an IEEE Fellow.
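As one concrete anchor for the differential-privacy side of this debate, here is a textbook sketch of the Laplace mechanism, not tied to either speaker's systems; the count of 120 and the privacy budget are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def private_count(true_count, epsilon):
    """Laplace mechanism: a counting query changes by at most 1 when one
    record is added or removed (L1 sensitivity 1), so adding
    Laplace(1/epsilon) noise yields epsilon-differential privacy."""
    return true_count + rng.laplace(scale=1.0 / epsilon)

true_count = 120  # e.g., patients matching a cohort query (illustrative)
eps = 1.0
releases = np.array([private_count(true_count, eps) for _ in range(10_000)])

# The privacy-accuracy trade-off: releases are unbiased, with noise
# standard deviation sqrt(2)/epsilon.
mean_release = releases.mean()
spread = releases.std()
```

Halving epsilon doubles the noise scale, which is the basic accuracy cost of stronger privacy that synthetic-data approaches trade off differently.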

Bio:
Karandeep Singh, MD, MMSc, is an Assistant Professor of Learning Health Sciences, Internal Medicine, Urology, and Information at the University of Michigan. He directs the Machine Learning for Learning Health Systems (ML4LHS) Lab, which focuses on translational issues related to the implementation of machine learning (ML) models within health systems. He serves as an Associate Chief Medical Information Officer for Artificial Intelligence for Michigan Medicine and is the Associate Director for Implementation for U-M Precision Health, a Presidential Initiative focused on bringing research discoveries to the bedside, with a focus on prediction models and genomics data. He chairs the Michigan Medicine Clinical Intelligence Committee, which oversees the governance of machine learning models across the health system. He teaches a health data science course for graduate and doctoral students, and provides clinical care for people with kidney disease. He completed his internal medicine residency at UCLA Medical Center, where he served as chief resident, and a nephrology fellowship in the combined Brigham and Women’s Hospital/Massachusetts General Hospital program in Boston, MA. He completed his medical education at the University of Michigan Medical School and holds a master’s degree in medical sciences in Biomedical Informatics from Harvard Medical School. He is board certified in internal medicine, nephrology, and clinical informatics.

Bio:
Dr. Nigam Shah is Professor of Medicine at Stanford University, and Chief Data Scientist for Stanford Health Care. His research group analyzes multiple types of health data (EHR, Claims, Wearables, Weblogs, and Patient blogs), to answer clinical questions, generate insights, and build predictive models for the learning health system. At Stanford Healthcare, he leads artificial intelligence and data science efforts for advancing the scientific understanding of disease, improving the practice of clinical medicine and orchestrating the delivery of health care. Dr. Shah is an inventor on eight patents and patent applications, has authored over 200 scientific publications and has co-founded three companies. Dr. Shah was elected into the American College of Medical Informatics (ACMI) in 2015 and was inducted into the American Society for Clinical Investigation (ASCI) in 2016. He holds an MBBS from Baroda Medical College, India, a PhD from Penn State University and completed postdoctoral training at Stanford University.
+
+ Invited Talk on Research and Top Recent Papers from 2020-2022
+
+ Suchi Saria / Johns Hopkins University & Bayesian Health
+
Bio: Suchi Saria, PhD, holds the John C. Malone endowed chair and is the Director of the Machine Learning, AI and Healthcare Lab at Johns Hopkins. She is also the Founder and CEO of Bayesian Health. Her research has pioneered the development of next generation diagnostic and treatment planning tools that use statistical machine learning methods to individualize care. She has written several of the seminal papers in the field of ML and its use for improving patient care and has given over 300 invited keynotes and talks to organizations including the NAM, NAS, and NIH. Dr. Saria has served as an advisor to multiple Fortune 500 companies and her work has been funded by leading organizations including the NIH, FDA, NSF, DARPA, and CDC. Dr. Saria has been featured by The Atlantic, Smithsonian Magazine, Bloomberg News, Wall Street Journal, and PBS NOVA, to name a few. She has won several awards for excellence in AI and care delivery. For example, for her academic work, she’s been recognized as IEEE’s “AI’s 10 to Watch”, Sloan Fellow, MIT Tech Review’s “35 Under 35”, National Academy of Medicine’s list of “Emerging Leaders in Health and Medicine”, and DARPA’s Faculty Award. For her work in industry bringing AI to healthcare, she’s been recognized as World Economic Forum’s 100 Brilliant Minds Under 40, Rock Health’s “Top 50 in Digital Health”, Modern Healthcare’s Top 25 Innovators, The Armstrong Award for Excellence in Quality and Safety and Society of Critical Care Medicine’s Annual Scientific Award.
+
+ Invited Talk on Recent Deployments and Real-world Impact
+
+ Karandeep Singh / University of Michigan
+
Bio: Karandeep Singh, MD, MMSc, is an Assistant Professor of Learning Health Sciences, Internal Medicine, Urology, and Information at the University of Michigan. He directs the Machine Learning for Learning Health Systems (ML4LHS) Lab, which focuses on translational issues related to the implementation of machine learning (ML) models within health systems. He serves as an Associate Chief Medical Information Officer for Artificial Intelligence for Michigan Medicine and is the Associate Director for Implementation for U-M Precision Health, a Presidential Initiative focused on bringing research discoveries to the bedside, with a focus on prediction models and genomics data. He chairs the Michigan Medicine Clinical Intelligence Committee, which oversees the governance of machine learning models across the health system. He teaches a health data science course for graduate and doctoral students, and provides clinical care for people with kidney disease. He completed his internal medicine residency at UCLA Medical Center, where he served as chief resident, and a nephrology fellowship in the combined Brigham and Women’s Hospital/Massachusetts General Hospital program in Boston, MA. He completed his medical education at the University of Michigan Medical School and holds a master’s degree in medical sciences in Biomedical Informatics from Harvard Medical School. He is board certified in internal medicine, nephrology, and clinical informatics.
+
+ Invited Talk on Under-explored Research Challenges and Opportunities
+
+ Nigam Shah / Stanford University
+
Bio: Dr. Nigam Shah is Professor of Medicine at Stanford University, and Chief Data Scientist for Stanford Health Care. His research group analyzes multiple types of health data (EHR, Claims, Wearables, Weblogs, and Patient blogs), to answer clinical questions, generate insights, and build predictive models for the learning health system. At Stanford Healthcare, he leads artificial intelligence and data science efforts for advancing the scientific understanding of disease, improving the practice of clinical medicine and orchestrating the delivery of health care. Dr. Shah is an inventor on eight patents and patent applications, has authored over 200 scientific publications and has co-founded three companies. Dr. Shah was elected into the American College of Medical Informatics (ACMI) in 2015 and was inducted into the American Society for Clinical Investigation (ASCI) in 2016. He holds an MBBS from Baroda Medical College, India, a PhD from Penn State University and completed postdoctoral training at Stanford University.
+
+ Generalizability in Machine Learning for Health: Critical for Robustness, or a Distraction from Specific Validation?
+
+ Moderator: Isaac Kohane / Harvard University
+
Bio: Isaac “Zak” Kohane, MD, PhD, is the inaugural chair of Harvard Medical School’s Department of Biomedical Informatics, whose mission is to develop the methods, tools, and infrastructure required for a new generation of scientists and care providers to move biomedicine rapidly forward by taking advantage of the insight and precision offered by big data. Kohane develops and applies computational techniques to address disease at multiple scales, from whole health care systems to the functional genomics of neurodevelopment. He also has worked on AI applications in medicine since the 1990s, including automated ventilator control, pediatric growth monitoring, detection of domestic abuse, diagnosing autism from multimodal data, and most recently assisting clinicians using whole genome sequence and clinical histories to diagnose rare or unknown disease patients. His most urgent question is how to enable doctors to be most effective and enjoy their profession when they enter into a substantial symbiosis with machine intelligence. He is a member of the National Academy of Medicine, the American Society for Clinical Investigation and the American College of Medical Informatics.
+
+ Generalizability in Machine Learning for Health: Critical for Robustness, or a Distraction from Specific Validation?
+
+ Leo Celi / MIT
+
Bio: Leo focuses on scaling clinical research to be more inclusive through open access data and software, particularly for limited resource settings; identifying biases in the data to prevent them from being encrypted in models and algorithms; and redesigning research using the principles of team science and the hive learning strategy.
+
+ Generalizability in Machine Learning for Health: Critical for Robustness, or a Distraction from Specific Validation?
+
+ Jason Fries / Stanford University
+
Bio: Jason Fries is a research scientist at the Shah Lab at Stanford University. His work is centered on enabling domain experts to easily construct and modify machine learning models, particularly in the field of medicine where expert-labeled training data are hard to acquire. His research interests include weakly supervised machine learning, foundation models for medicine, and data-centric AI.
+
+ Generalizability in Machine Learning for Health: Critical for Robustness, or a Distraction from Specific Validation?
+
+ Lauren Oakden-Rayner / University of Adelaide
+
Bio: Lauren Oakden-Rayner is a radiologist and Senior Research Fellow at the Australian Institute for Machine Learning, University of Adelaide. Her research primarily focuses on medical AI safety, specifically addressing the issues of model robustness, generalization, evaluation, and fairness. Lauren is also involved in supervising students and working on various medical AI projects, reviewing MOOCs on her blog, and advocating for diversity in her group and Institute.
+
+ Generalizability in Machine Learning for Health: Critical for Robustness, or a Distraction from Specific Validation?
+
+ Maia Hightower / University of Chicago Medicine
+
Bio: Maia Hightower, MD, MBA, MPH, is an accomplished healthcare IT executive and internist. She currently serves as the Executive Vice President and Chief Digital & Technology Officer at the University of Chicago Medicine and the CEO and co-founder of Equality AI, a startup aimed at achieving health equity through responsible AI and machine-learning operations. Previously, she was the chief medical information officer and associate chief medical officer at University of Utah Health and served in similar roles at University of Iowa Health Care and Stanford Health Care. Dr. Hightower's work has focused on leveraging digital technology to address health inequities and promoting diversity and inclusion within healthcare IT systems. Her leadership in the field has earned her widespread recognition.
+
+ Sharing Health Data in an Age of Generative AI: Risks, Limitations, and Solutions
+
+ Moderator: Marzyeh Ghassemi / MIT
+
Bio: Marzyeh Ghassemi is an assistant professor and the Hermann L. F. von Helmholtz Professor with appointments in the Department of Electrical Engineering and Computer Science and the Institute for Medical Engineering & Science at MIT. Ghassemi’s research interests span representation learning, behavioral ML, healthcare ML, and healthy ML. One of her focuses is on real-world applications of machine learning, such as turning diverse clinical data into cohesive information with the ability to predict patient needs. Ghassemi has received BS degrees in computer science and electrical engineering from New Mexico State University, an MSc degree in biomedical engineering from Oxford University, and a PhD in computer science from MIT.
+
+ Sharing Health Data in an Age of Generative AI: Risks, Limitations, and Solutions
+
+ Ziad Obermeyer / UC Berkeley
+
Bio: Ziad Obermeyer is Associate Professor and Blue Cross of California Distinguished Professor at UC Berkeley, where he works at the intersection of machine learning and health. He is a Chan Zuckerberg Biohub Investigator, a Faculty Research Fellow at the National Bureau of Economic Research, and was named an Emerging Leader by the National Academy of Medicine. Previously, he was Assistant Professor at Harvard Medical School, and continues to practice emergency medicine in underserved communities.
+
+ Sharing Health Data in an Age of Generative AI: Risks, Limitations, and Solutions
+
+ John Halamka / Mayo Clinic
+
Bio: Dr. Halamka is an emergency medicine physician, medical informatics expert and president of the Mayo Clinic Platform, which is focused on transforming health care by leveraging artificial intelligence, connected health care devices and a network of partners. Dr. Halamka has been developing and implementing health care information strategy and policy for more than 25 years. Previously, he was executive director of the Health Technology Exploration Center for Beth Israel Lahey Health, chief information officer at Beth Israel Deaconess Medical Center, and International Healthcare Innovation Professor at Harvard Medical School. He is a member of the National Academy of Medicine.
+
+ Sharing Health Data in an Age of Generative AI: Risks, Limitations, and Solutions
+
+ Elaine Nsoesie / Boston University
+
Bio: Elaine Nsoesie is an Associate Professor at Boston University's School of Public Health and a leading voice in the use of data and technology to advance health equity. She leads the Racial Data Tracker project at Boston University's Center for Antiracist Research and serves as a Senior Advisor to the Artificial Intelligence/Machine Learning Consortium to Advance Health Equity and Researcher Diversity (AIM-AHEAD) program at the National Institutes of Health. Dr. Nsoesie has published extensively on the use of data from social media, search engines, and cell phones for public health surveillance and is dedicated to increasing representation of underrepresented communities in data science. She completed her PhD in Computational Epidemiology at Virginia Tech and has held postdoctoral positions at Harvard Medical School and Boston Children's Hospital.
+
+ Sharing Health Data in an Age of Generative AI: Risks, Limitations, and Solutions
+
+ Khaled El Emam / University of Ottawa
+
Bio: Dr. Khaled El Emam is the Canada Research Chair (Tier 1) in Medical AI at the University of Ottawa, where he is a Professor in the School of Epidemiology and Public Health. He is also a Senior Scientist at the Children’s Hospital of Eastern Ontario Research Institute and Director of the multi-disciplinary Electronic Health Information Laboratory, conducting research on privacy enhancing technologies to enable the sharing of health data for secondary purposes, including synthetic data generation and de-identification methods. Khaled is a co-founder of Replica Analytics, a company that develops synthetic data generation technology, which was recently acquired by Aetion. As an entrepreneur, Khaled founded or co-founded six product and services companies involved with data management and data analytics, with some having successful exits. Prior to his academic roles, he was a Senior Research Officer at the National Research Council of Canada. He also served as the head of the Quantitative Methods Group at the Fraunhofer Institute in Kaiserslautern, Germany. He serves on a number of committees, including the European Medicines Agency Technical Anonymization Group, the Panel on Research Ethics advising on the TCPS, and the Strategic Advisory Council of the Office of the Information and Privacy Commissioner of Ontario, and is co-editor-in-chief of the JMIR AI journal. In 2003 and 2004, he was ranked as the top systems and software engineering scholar worldwide by the Journal of Systems and Software based on his research on measurement and quality evaluation and improvement. He held the Canada Research Chair in Electronic Health Information at the University of Ottawa from 2005 to 2015. Khaled has a PhD from the Department of Electrical and Electronics Engineering, King's College, at the University of London, England.
+
+ Machine Learning for Healthcare in the Era of ChatGPT
+
+ Moderator: Byron Wallace / Northeastern University
+
Bio: Byron Wallace is the Sy and Laurie Sternberg Interdisciplinary Associate Professor and Director of the BS in Data Science program at Northeastern University in the Khoury College of Computer Sciences. His research is primarily in natural language processing (NLP) methods, with an emphasis on their application in healthcare and the challenges inherent to this domain.
+
+ Machine Learning for Healthcare in the Era of ChatGPT
+
+ Tristan Naumann / Microsoft Research
+
Bio: Tristan Naumann is a Principal Researcher in Microsoft Research’s Health Futures, working on problems related to clinical and biomedical natural language processing (NLP). His research focuses on exploring relationships in complex, unstructured healthcare data using natural language processing and unsupervised learning techniques. He is currently serving as General Chair of NeurIPS and co-organizer of the Clinical NLP workshop at ACL. Previously, he has served as General Chair and Program Chair of the AHLI Conference on Health, Inference, and Learning (CHIL) and Machine Learning for Health (ML4H). His work has appeared in KDD, AAAI, AMIA, JMIR, MLHC, ACM HEALTH, Cell Patterns, Science Translational Medicine, and Nature Translational Psychiatry.
+
+ Machine Learning for Healthcare in the Era of ChatGPT
+
+ Karandeep Singh / University of Michigan
+
Bio: Karandeep Singh, MD, MMSc, is an Assistant Professor of Learning Health Sciences, Internal Medicine, Urology, and Information at the University of Michigan. He directs the Machine Learning for Learning Health Systems (ML4LHS) Lab, which focuses on translational issues related to the implementation of machine learning (ML) models within health systems. He serves as an Associate Chief Medical Information Officer for Artificial Intelligence for Michigan Medicine and is the Associate Director for Implementation for U-M Precision Health, a Presidential Initiative focused on bringing research discoveries to the bedside, with a focus on prediction models and genomics data. He chairs the Michigan Medicine Clinical Intelligence Committee, which oversees the governance of machine learning models across the health system. He teaches a health data science course for graduate and doctoral students, and provides clinical care for people with kidney disease. He completed his internal medicine residency at UCLA Medical Center, where he served as chief resident, and a nephrology fellowship in the combined Brigham and Women’s Hospital/Massachusetts General Hospital program in Boston, MA. He completed his medical education at the University of Michigan Medical School and holds a master’s degree in medical sciences in Biomedical Informatics from Harvard Medical School. He is board certified in internal medicine, nephrology, and clinical informatics.
+
+ Machine Learning for Healthcare in the Era of ChatGPT
+
+ Nigam Shah / Stanford University
+
Bio: Dr. Nigam Shah is Professor of Medicine at Stanford University, and Chief Data Scientist for Stanford Health Care. His research group analyzes multiple types of health data (EHR, Claims, Wearables, Weblogs, and Patient blogs), to answer clinical questions, generate insights, and build predictive models for the learning health system. At Stanford Healthcare, he leads artificial intelligence and data science efforts for advancing the scientific understanding of disease, improving the practice of clinical medicine and orchestrating the delivery of health care. Dr. Shah is an inventor on eight patents and patent applications, has authored over 200 scientific publications and has co-founded three companies. Dr. Shah was elected into the American College of Medical Informatics (ACMI) in 2015 and was inducted into the American Society for Clinical Investigation (ASCI) in 2016. He holds an MBBS from Baroda Medical College, India, a PhD from Penn State University and completed postdoctoral training at Stanford University.
+
+ Machine Learning for Healthcare in the Era of ChatGPT
+
+
+
+
+
+ Saadia Gabriel / MIT
+
+
+
+
+
+
+
+
+
+
+
Bio: Saadia Gabriel is currently an MIT CSAIL Postdoctoral Fellow. She is also an incoming NYU Faculty Fellow and will start as an Assistant Professor at UCLA in 2024. She completed her PhD at the University of Washington, where she was advised by Prof. Yejin Choi and Prof. Franziska Roesner. Her research revolves around natural language processing and machine learning, with a particular focus on building systems for understanding how social commonsense manifests in text (i.e. how people typically behave in social scenarios), as well as mitigating the spread of false or harmful text (e.g. Covid-19 misinformation). Her work has been covered by a wide range of media outlets like Forbes and TechCrunch. It has also received a 2019 ACL best short paper nomination, a 2019 IROS RoboCup best paper nomination and won a best paper award at the 2020 WeCNLP summit.
+
+ Abstract:
+ Healthcare datasets often include patient-reported values, such as mood, symptoms, and meals, which can be subject to varying levels of human error. Improving the accuracy of patient-reported data could help in several downstream tasks, such as remote patient monitoring. In this study, we propose a novel denoising autoencoder (DAE) approach to denoise patient-reported data, drawing inspiration from recent work in computer vision. Our approach is based on the observation that noisy patient-reported data are often collected alongside higher fidelity data collected from wearable sensors. We leverage these auxiliary data to improve the accuracy of the patient-reported data. Our approach combines key ideas from DAEs with co-teaching to iteratively filter and learn from clean patient-reported samples. Applied to the task of recovering carbohydrate values for blood glucose management in diabetes, our approach reduces noise (MSE) in patient-reported carbohydrates from 72g² (95% CI: 54-93) to 18g² (13-25), outperforming the best baseline (33g² (27-43)). Notably, our approach achieves strong performance with only access to patient-reported target values, making it applicable to many settings where ground truth data may be unavailable.
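The co-teaching component of this approach can be illustrated with a minimal sketch (not the authors' implementation; the loss values and keep ratio below are hypothetical): two networks each keep their lowest-loss samples and hand them to their peer for the next update, so samples with implausibly large error, likely misreported values, are filtered out.

```python
import numpy as np

def coteach_select(loss_a, loss_b, keep_ratio):
    """Co-teaching selection step: each network keeps its lowest-loss
    (clean-looking) samples and passes them to its peer for training."""
    k = int(keep_ratio * len(loss_a))
    idx_for_b = np.argsort(loss_a)[:k]  # samples net A trusts -> train net B
    idx_for_a = np.argsort(loss_b)[:k]  # samples net B trusts -> train net A
    return idx_for_a, idx_for_b

# toy example: six reported carbohydrate values, two of them heavily misreported
# (indices 2 and 4 have large losses under both networks)
loss_a = np.array([0.1, 0.2, 5.0, 0.3, 4.0, 0.2])
loss_b = np.array([0.2, 0.1, 4.5, 0.2, 5.5, 0.3])
idx_a, idx_b = coteach_select(loss_a, loss_b, keep_ratio=2 / 3)
```

Both selections exclude the two suspect samples, which is the filtering behavior the abstract relies on.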
+
+ Abstract:
+ Learning multi-view data is an emerging problem in machine learning research, and nonnegative matrix factorization (NMF) is a popular dimensionality-reduction method for integrating information from multiple views. These views often provide not only consensus but also complementary information. However, most multi-view NMF algorithms assign equal weight to each view or tune the weight empirically via line search, which is infeasible without prior knowledge of the views or computationally expensive. In this paper, we propose a weighted multi-view NMF (WM-NMF) algorithm. In particular, we aim to address a critical technical gap: learning both a view-specific weight and an observation-specific reconstruction weight to quantify each view's information content. The introduced weighting scheme can alleviate the adverse effects of unnecessary views and enlarge the positive effects of important views by assigning them smaller and larger weights, respectively. Experimental results confirm the effectiveness and advantages of the proposed algorithm in terms of achieving better clustering performance and dealing with noisy data compared to existing algorithms.
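The flavor of view weighting described above can be sketched as follows (an illustrative toy, not the WM-NMF update rules; the softmax-over-negative-errors form and the least-squares stand-in for the NMF factor update are assumptions): a view that reconstructs poorly from the shared factors receives a smaller weight.

```python
import numpy as np

rng = np.random.default_rng(0)

# two toy "views" of the same 20 samples: view 1 is generated from shared
# latent factors H, view 2 is pure noise (an uninformative view)
H = rng.random((20, 3))
view1 = H @ rng.random((3, 8)) + 0.01 * rng.random((20, 8))
view2 = rng.random((20, 8))

def view_weights(views, H, basis, gamma=1.0):
    """Illustrative weighting: views with lower reconstruction error
    receive larger weights (softmax over scaled negative errors)."""
    errs = np.array([np.linalg.norm(V - H @ W) for V, W in zip(views, basis)])
    w = np.exp(-gamma * errs / errs.max())
    return w / w.sum()

# least-squares basis per view, standing in for the NMF factor-update step
# (nonnegativity is not enforced in this sketch)
basis = [np.linalg.lstsq(H, V, rcond=None)[0] for V in (view1, view2)]
w = view_weights([view1, view2], H, basis)
```

The informative view ends up with the larger weight, matching the scheme's intent of down-weighting noisy views.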
+
+ Abstract:
+ Imbalanced token distributions naturally exist in text documents, leading neural language models to overfit on frequent tokens. Token imbalance may dampen the robustness of radiology report generators, as complex medical terms appear less frequently but carry more medical information. In this study, we demonstrate how current state-of-the-art models fail to generate infrequent tokens on two standard benchmark datasets (IU X-RAY and MIMIC-CXR) for radiology report generation. To address this challenge, we propose the Token Imbalance Adapter (TIMER), which aims to improve generation robustness on infrequent tokens. The model adapts to token imbalance through an unlikelihood loss and dynamically optimizes the generation process to augment infrequent tokens. We compare our approach with multiple state-of-the-art methods on the two benchmarks. Experiments demonstrate the effectiveness of our approach in enhancing model robustness both overall and on infrequent tokens. Our ablation analysis shows that our reinforcement learning method has a major effect in adapting to token imbalance for radiology report generation.
+
+ Federated Multilingual Models for Medical Transcript Analysis
+
+
+ Andre Manoel* (Microsoft), Mirian Del Carmen Hipolito Garcia (Microsoft), Tal Baumel (Microsoft), Shize Su (Microsoft), Jialei Chen (Microsoft), Robert Sim (Microsoft), Dan Miller (Airbnb), Danny Karmon (Google), Dimitrios Dimitriadis (Amazon)
+
+ Abstract:
+ Federated Learning (FL) is a machine learning approach that allows the model trainer to access more data samples by training across multiple decentralized data sources while enforcing data access constraints. Such trained models can achieve significantly higher performance than what can be achieved when trained on a single data source. In an FL setting, none of the training data is ever transmitted to any central location; i.e., sensitive data remains local and private. These characteristics make FL perfectly suited for applications in healthcare, where a variety of compliance constraints restrict how data may be handled. Despite these apparent benefits in compliance and privacy, certain scenarios such as heterogeneity of the local data distributions pose significant challenges for FL. Such challenges are even more pronounced in a multilingual setting. This paper presents an FL system for pre-training a large-scale multilingual model suitable for fine-tuning on downstream tasks such as medical entity tagging. Our work represents one of the first such production-scale systems, capable of training across multiple highly heterogeneous data providers, and achieving levels of accuracy that could not be otherwise achieved by using central training with public data only. We also show that the global model performance can be further improved by a local training step.
+
+ Abstract:
+ Rare life events significantly impact mental health, and their detection in behavioral studies is a crucial step towards health-based interventions. We envision that mobile sensing data can be used to detect these anomalies. However, the human-centered nature of the problem, combined with the infrequency and uniqueness of these events, makes it challenging for unsupervised machine learning methods. In this paper, we first investigate Granger causality between life events and human behavior using sensing data. Next, we propose a multi-task framework with an unsupervised autoencoder to capture irregular behavior and an auxiliary sequence predictor that identifies transitions in workplace performance to contextualize events. We perform experiments using data from a mobile sensing study comprising N=126 information workers from multiple industries, spanning 10,106 days with 198 rare events (<2%). Through personalized inference, we detect the exact day of a rare event with an F1 of 0.34, demonstrating that our method outperforms several baselines. Finally, we discuss the implications of our work in the context of real-world deployment.
+
+ Virus2Vec: Viral Sequence Classification Using Machine Learning
+
+
+ Sarwan Ali* (Georgia State University), Babatunde Bello (Georgia State University), Prakash Chourasia (Georgia State University), Ria Thazhe Punathil (Georgia State University), Pin-Yu Chen (IBM Research), Imdad Ullah Khan (Lahore University of Management Sciences), Murray Patterson (Georgia State University)
+
+ Abstract:
+ Understanding the host-specificity of different families of viruses sheds light on the origin of, e.g., SARS-CoV-2, rabies, and other such zoonotic pathogens in humans. It enables epidemiologists, medical professionals, and policymakers to curb existing epidemics and prevent future ones promptly. In the family Coronaviridae (of which SARS-CoV-2 is a member), it is well-known that the spike protein is the point of contact between the virus and the host cell membrane. On the other hand, the two traditional mammalian orders, Carnivora (carnivores) and Chiroptera (bats), are recognized to be responsible for maintaining and spreading the Rabies Lyssavirus (RABV). We propose Virus2Vec, a feature-vector representation for viral (nucleotide or amino acid) sequences that enables vector-space-based machine learning models to identify viral hosts. Virus2Vec generates numerical feature vectors for unaligned sequences, allowing us to forgo the computationally expensive sequence alignment step in the pipeline. Virus2Vec leverages the power of both the minimizer and the position weight matrix (PWM) to generate compact feature vectors. Using several classifiers, we empirically evaluate Virus2Vec on real-world spike sequences of Coronaviridae and rabies virus sequence data to predict the host (identifying the reservoirs of infection). Our results demonstrate that Virus2Vec outperforms the predictive accuracies of baseline and state-of-the-art methods.
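The minimizer component mentioned above can be sketched in a few lines (an illustrative toy, not the Virus2Vec pipeline; the k and w values and the input fragment are arbitrary): for every window of w consecutive k-mers, only the lexicographically smallest k-mer is kept, yielding a compact summary of the sequence.

```python
def minimizers(seq, k=3, w=4):
    """Return the set of (k, w)-minimizers of a sequence: for each window
    of w consecutive k-mers, keep the lexicographically smallest one."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    mins = set()
    for i in range(len(kmers) - w + 1):
        mins.add(min(kmers[i:i + w]))
    return mins

# toy amino-acid fragment (illustrative, not a real spike sequence)
fragment = "MFVFLVLLPLVSSQCV"
ms = minimizers(fragment)
```

The 14 overlapping 3-mers of the fragment collapse to just 5 minimizers, which is the compression that lets alignment be skipped.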
+
+ Abstract:
+ We propose a general framework for visualizing any intermediate embedding representation used by any neural survival analysis model. Our framework is based on so-called anchor directions in an embedding space. We show how to estimate these anchor directions using clustering or, alternatively, using user-supplied ``concepts'' defined by collections of raw inputs (e.g., feature vectors all from female patients could encode the concept ``female''). For tabular data, we present visualization strategies that reveal how anchor directions relate to raw clinical features and to survival time distributions. We then show how these visualization ideas extend to handling raw inputs that are images. Our framework is built on looking at angles between vectors in an embedding space, where there could be ``information loss'' by ignoring magnitude information. We show how this loss results in a ``clumping'' artifact that appears in our visualizations, and how to reduce this information loss in practice.
+
+ Towards the Practical Utility of Federated Learning in the Medical Domain
+
+
+ Hyeonji Hwang* (KAIST), Seongjun Yang (KRAFTON), Daeyoung Kim (KAIST), Radhika Dua (Google Research), Jong-Yeup Kim(Konyang University), Eunho Yang (KAIST) , Edward Choi (KAIST)
+
+ Abstract:
+ Federated learning (FL) is an active area of research. One of the most suitable areas for adopting FL is the medical domain, where patient privacy must be respected. Previous research, however, does not provide a practical guide to applying FL in the medical domain. We propose empirical benchmarks and experimental settings for three representative medical datasets with different modalities: longitudinal electronic health records, skin cancer images, and electrocardiogram signals. Likely users of FL, such as medical institutions and IT companies, can take these benchmarks as guides for adopting FL and minimize their trial and error. For each dataset, each client's data come from a different source to preserve real-world heterogeneity. We evaluate six FL algorithms designed to address data heterogeneity among clients, and a hybrid algorithm combining the strengths of two representative FL algorithms. Based on experimental results from the three modalities, we discover that simple FL algorithms tend to outperform more sophisticated ones, while the hybrid algorithm consistently shows good, if not the best, performance. We also find that frequent global model updates lead to better performance under a fixed training iteration budget. As the number of participating clients increases, higher costs are incurred due to the additional IT administrators and GPUs required, but performance consistently improves. We expect future users will refer to these empirical benchmarks to design FL experiments in the medical domain in light of their clinical tasks and obtain stronger performance at lower cost.
+
+ Abstract:
+ Most machine learning models for predicting clinical outcomes are developed using historical data. Yet, even if these models are deployed in the near future, dataset shift over time may result in less than ideal performance. To capture this phenomenon, we consider a task---that is, an outcome to be predicted at a particular time point---to be non-stationary if a historical model is no longer optimal for predicting that outcome. We build an algorithm to test for temporal shift either at the population level or within a discovered sub-population. Then, we construct a meta-algorithm to perform a retrospective scan for temporal shift on a large collection of tasks. Our algorithms enable us to perform, to our knowledge, the first comprehensive evaluation of temporal shift in healthcare. We create 1,010 tasks by evaluating 242 healthcare outcomes for temporal shift from 2015 to 2020 on a health insurance claims dataset. 9.7% of the tasks show temporal shifts at the population level, and 93.0% have some sub-population affected by shifts. We dive into case studies to understand the clinical implications. Our analysis highlights the widespread prevalence of temporal shifts in healthcare.
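The notion of non-stationarity used here can be caricatured with a synthetic sketch (not the paper's test; the model class, drift, and threshold are all illustrative assumptions): a task is flagged when a model fit on historical data is clearly sub-optimal on recent data.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit(X, y):
    """Ordinary least squares, standing in for an arbitrary model class."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

# synthetic task whose true relationship drifts between the two periods
X_old, X_new = rng.normal(size=(500, 2)), rng.normal(size=(500, 2))
y_old = X_old @ np.array([1.0, 0.0]) + 0.1 * rng.normal(size=500)
y_new = X_new @ np.array([0.0, 1.0]) + 0.1 * rng.normal(size=500)

w_hist, w_recent = fit(X_old, y_old), fit(X_new, y_new)

# flag the task as non-stationary when the historical model is no longer
# close to optimal on recent data (threshold arbitrary for illustration)
gap = mse(w_hist, X_new, y_new) - mse(w_recent, X_new, y_new)
shifted = gap > 0.05
```

On this drifted toy task the historical model's excess error is large and the task is flagged.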
+
+ Semantic match: Debugging feature attribution methods in XAI for healthcare
+
+
+ Giovanni Cinà* (Amsterdam University Medical Center), Tabea E. Röber (University of Amsterdam), Rob Goedhart (University of Amsterdam), Ş. İlker Birbil (University of Amsterdam)
+
+ Abstract:
+ The recent spike in certified Artificial Intelligence tools for healthcare has renewed the debate around the adoption of this technology. One thread of such debate concerns Explainable AI and its promise to render AI devices more transparent and trustworthy. A few voices active in the medical AI space have expressed concerns about the reliability of Explainable AI techniques and especially feature attribution methods, questioning their use and inclusion in guidelines and standards. We characterize the problem as a lack of semantic match between explanations and human understanding. To understand when feature importance can be used reliably, we introduce a distinction between feature importance of low- and high-level features. We argue that for data types where low-level features come endowed with a clear semantics, such as tabular data like Electronic Health Records, semantic match can be obtained, and thus feature attribution methods can still be employed in a meaningful and useful way. For high-level features, we sketch a procedure to test whether semantic match has been achieved.
+
+ Modeling Multivariate Biosignals With Graph Neural Networks and Structured State Space Models
+
+
+ Siyi Tang* (Stanford University), Jared A. Dunnmon (Stanford University), Liangqiong Qu (University of Hong Kong), Khaled K. Saab (Stanford University), Tina Baykaner (Stanford University), Christopher Lee-Messer (Stanford University), Daniel L. Rubin (Stanford University)
+
+ Abstract:
+ Multivariate biosignals are prevalent in many medical domains, such as electroencephalography, polysomnography, and electrocardiography. Modeling spatiotemporal dependencies in multivariate biosignals is challenging due to (1) long-range temporal dependencies and (2) complex spatial correlations between the electrodes. To address these challenges, we propose representing multivariate biosignals as time-dependent graphs and introduce GRAPHS4MER, a general graph neural network (GNN) architecture that improves performance on biosignal classification tasks by modeling spatiotemporal dependencies in biosignals. Specifically, (1) we leverage the Structured State Space architecture, a state-of-the-art deep sequence model, to capture long-range temporal dependencies in biosignals and (2) we propose a graph structure learning layer in GRAPHS4MER to learn dynamically evolving graph structures in the data. We evaluate our proposed model on three distinct biosignal classification tasks and show that GRAPHS4MER consistently improves over existing models, including (1) seizure detection from electroencephalographic signals, outperforming a previous GNN with self-supervised pre-training by 3.1 points in AUROC; (2) sleep staging from polysomnographic signals, a 4.1-point improvement in macro-F1 score compared to existing sleep staging models; and (3) 12-lead electrocardiogram classification, outperforming previous state-of-the-art models by 2.7 points in macro-F1 score.
+
+ Neural Fine-Gray: Monotonic neural networks for competing risks
+
+
+ Vincent Jeanselme* (University of Cambridge), Chang Ho Yoon (University of Oxford), Brian Tom (University of Cambridge), Jessica Barrett (University of Cambridge)
+
+ Abstract:
+ Time-to-event modelling, known as survival analysis, differs from standard regression as it addresses censoring in patients who do not experience the event of interest. Despite competitive performance in tackling this problem, machine learning methods often ignore other competing risks that preclude the event of interest. This practice biases the survival estimates. Extensions to address this challenge often rely on parametric assumptions or numerical estimations, leading to sub-optimal survival approximations. This paper leverages constrained monotonic neural networks to model each competing survival distribution. This modelling choice ensures exact likelihood maximisation at a reduced computational cost by using automatic differentiation. The effectiveness of the solution is demonstrated on one synthetic and three medical datasets. Finally, we discuss the implications of considering competing risks when developing risk scores for medical practice.
+
+ Multi-modal Pre-training for Medical Vision-language Understanding and Generation: An Empirical Study with A New Benchmark
+
+
+ Li Xu* (Hong Kong Polytechnic University), Bo Liu (Hong Kong Polytechnic University), Ameer Hamza Khan (Hong Kong Polytechnic University), Lu Fan (Hong Kong Polytechnic University), Xiao-Ming Wu (Hong Kong Polytechnic University)
+
+ Abstract:
+ With the availability of large-scale, comprehensive, and general-purpose vision-language (VL) datasets such as MSCOCO, vision-language pre-training (VLP) has become an active area of research and proven to be effective for various VL tasks such as visual question answering. However, studies on VLP in the medical domain have so far been scant. To provide a comprehensive perspective on VLP for medical VL tasks, we conduct a thorough experimental analysis to study key factors that may affect the performance of VLP with a unified vision-language Transformer. To support sound and quick pre-training decisions, we propose RadioGraphy Captions (RGC), a high-quality, multi-modality radiographic dataset containing 18,434 image-caption pairs collected from the open-access online database MedPix. RGC can be used as a pre-training dataset or a new benchmark for medical report generation and medical image-text retrieval. By utilizing RGC and other available datasets for pre-training, we develop several key insights that can guide future medical VLP research and new strong baselines for various medical VL tasks.
+
+ SRDA: Mobile Sensing based Fluid Overload Detection for End Stage Kidney Disease Patients using Sensor Relation Dual Autoencoder
+
+
+ Mingyue Tang (University of Virginia), Jiechao Gao* (University of Virginia), Guimin Dong (Amazon), Carl Yang (Emory University), Brad Campbell (University of Virginia), Brendan Bowman (University of Virginia), Jamie Marie Zoellner (University of Virginia), Emaad Abdel-Rahman (University of Virginia), Mehdi Boukhechba (The Janssen Pharmaceutical Companies of Johnson & Johnson)
+
+ Abstract:
+ Chronic kidney disease (CKD) is a life-threatening and prevalent disease. CKD patients, especially end-stage kidney disease (ESKD) patients on hemodialysis, suffer from kidney failure and are unable to remove excessive fluid, causing fluid overload and multiple morbidities, including death. Current solutions for fluid overload monitoring, such as ultrasonography and biomarker assessment, are cumbersome, discontinuous, and can only be performed in the clinic. In this paper, we propose SRDA, a latent graph learning powered fluid overload detection system based on a Sensor Relation Dual Autoencoder, to detect excessive fluid consumption of ESKD patients based on passively collected bio-behavioral data from smartwatch sensors. Experiments using real-world mobile sensing data indicate that SRDA outperforms the state-of-the-art baselines in both F1 score and recall, and demonstrate the potential of ubiquitous sensing for ESKD fluid intake management.
+
+ Abstract:
+ Conflict of interest (COI) disclosure statements provide rich information to support transparency and reduce bias in research. We introduce a novel task of identifying relationships between sponsoring entities and the research studies they sponsor from the disclosure statement. This task is challenging due to the complexity of recognizing all potential relationship patterns and the hierarchical nature of first identifying entities and then extracting their relationships to the study. To overcome these challenges, we construct a new annotated dataset and propose a Question Answering-based method to recognize entities and extract relationships. Our method has demonstrated robustness in handling diverse relationship patterns and remains effective even when trained on a low-resource dataset.
+
+ Revisiting Machine-Learning based Drug Repurposing: Drug Indications Are Not a Right Prediction Target
+
+
+ Siun Kim* (Seoul National University), Jung-Hyun Won (Seoul National University), David Seung U Lee (Seoul National University), Renqian Luo (Microsoft Research), Lijun Wu (Microsoft Research), Yingce Xia (Microsoft Research), Tao Qin (Microsoft Research), Howard Lee (Seoul National University)
+
+ Abstract:
+ In this paper, we challenge the utility of approved drug indications as a prediction target for machine learning in drug repurposing (DR) studies. Our research highlights two major limitations of this approach: 1) the presence of strong confounding between drug indications and drug characteristics data, which results in shortcut learning, and 2) inappropriate normalization of indications in existing drug-disease association (DDA) datasets, which leads to an overestimation of model performance. We show that the collection patterns of drug characteristics data were similar within drugs of the same category and the Anatomical Therapeutic Chemical (ATC) classification of drugs could be predicted by using the data collection patterns. Furthermore, we confirm that the performance of existing DR models is significantly degraded in the realistic evaluation setting we proposed in this study. We provide realistic data split information for two benchmark datasets, Fdataset and deepDR dataset.
+
+ Bayesian Active Questionnaire Design for Cause-of-Death Assignment Using Verbal Autopsies
+
+
+ Toshiya Yoshida* (University of California Santa Cruz), Trinity Shuxian Fan (University of Washington), Tyler McCormick (University of Washington), Zhenke Wu (University of Michigan), Zehang Richard Li (University of California Santa Cruz)
+
+ Abstract:
+ Only about one-third of the deaths worldwide are assigned a medically-certified cause, and understanding the causes of deaths occurring outside of medical facilities is logistically and financially challenging. Verbal autopsy (VA) is a routinely used tool to collect information on cause of death in such settings. VA is a survey-based method where a structured questionnaire is administered to family members or caregivers of a recently deceased person, and the collected information is used to infer the cause of death. As VA becomes an increasingly routine tool for cause-of-death data collection, the lengthy questionnaire has become a major challenge to the implementation and scale-up of VA interviews, as they are costly and time-consuming to conduct. In this paper, we propose a novel active questionnaire design approach that optimizes the order of the questions dynamically to achieve accurate cause-of-death assignment with the smallest number of questions. We propose a fully Bayesian strategy for adaptive question selection that is compatible with any existing probabilistic cause-of-death assignment method. We also develop an early stopping criterion that fully accounts for the uncertainty in the model parameters. In addition, we propose a penalized score to account for constraints and preferences of existing question structures. We evaluate the performance of our active designs using both synthetic and real data, demonstrating that the proposed strategy achieves accurate cause-of-death assignment using considerably fewer questions than the traditional static VA survey instruments.
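The adaptive question-selection idea can be sketched with a greedy expected-information-gain rule (an illustrative simplification of the fully Bayesian strategy; the causes, questions, and probabilities below are hypothetical, and parameter uncertainty is ignored): ask next whichever unasked question is expected to reduce the entropy of the cause-of-death posterior the most.

```python
import math

def entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)

def posterior(prior, likelihoods, answers):
    """Bayes update of cause probabilities given yes/no answers so far."""
    post = list(prior)
    for q, a in answers.items():
        for c in range(len(post)):
            post[c] *= likelihoods[q][c] if a else (1 - likelihoods[q][c])
    z = sum(post)
    return [x / z for x in post]

def next_question(prior, likelihoods, asked):
    """Pick the unasked question with the largest expected entropy drop."""
    best_q, best_gain = None, -1.0
    for q in likelihoods:
        if q in asked:
            continue
        p_yes = sum(p * likelihoods[q][c] for c, p in enumerate(prior))
        gain = entropy(prior) - (
            p_yes * entropy(posterior(prior, likelihoods, {q: True}))
            + (1 - p_yes) * entropy(posterior(prior, likelihoods, {q: False}))
        )
        if gain > best_gain:
            best_q, best_gain = q, gain
    return best_q

# two candidate causes; "fever" discriminates between them, "headache" does not
prior = [0.5, 0.5]
likelihoods = {"fever": [0.9, 0.1], "headache": [0.5, 0.5]}
q = next_question(prior, likelihoods, asked=set())
```

The discriminative question is selected first, which is the mechanism that lets the adaptive design terminate with fewer questions than a static instrument.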
+
+ Machine Learning for Arterial Blood Pressure Prediction
+
+
+ Jessica Zheng (MIT), Hanrui Wang* (MIT), Anand Chandrasekhar (MIT), Aaron Aguirre (Massachusetts General Hospital and Harvard Medical School), Song Han (MIT), Hae-Seung Lee (MIT), Charles G. Sodini (MIT)
+
+ Abstract:
+ High blood pressure is a major risk factor for cardiovascular disease, necessitating accurate blood pressure (BP) measurement. Clinicians measure BP with an invasive arterial catheter or via a non-invasive arm or finger cuff. However, the former can cause discomfort to the patient and is unsuitable outside the Intensive Care Unit (ICU), while cuff-based devices, despite being non-invasive, fail to provide continuous measurement and measure from peripheral blood vessels whose BP waveforms differ significantly from those proximal to the heart. Hence, there is an urgent need to develop a measurement protocol for converting easily measured non-invasive data into accurate BP values. Addressing this gap, we propose a non-invasive approach to predict BP from arterial area and blood flow velocity signals measured with a Philips ultrasound transducer (XL-143) applied to large arteries close to the heart. We developed the protocol and collected data from 72 subjects. The shape of the BP waveform (relative BP) can be calculated theoretically from these signals; however, there is no established theory for obtaining absolute BP values. To tackle this challenge, we further employ data-driven machine learning models to predict the Mean Arterial Blood Pressure (MAP), from which the absolute BP can be derived. Our study investigates various machine learning algorithms to optimize prediction accuracy. We find that LSTM, Transformer, and 1D-CNN algorithms using the BP shape and blood flow velocity waveforms as inputs achieve an average standard deviation of the prediction error of 8.6, 8.7, and 8.8 mmHg, respectively, without anthropometric data such as age, sex, heart rate, height, and weight. Furthermore, the 1D-CNN model achieves 7.9 mmHg when anthropometric data are added as inputs, improving upon an anthropometric-only model at 9.5 mmHg.
This machine learning-based approach, capable of converting ultrasound data into MAP values, presents a promising software tool for physicians in clinical decision-making regarding blood pressure management.
+
+
+ Abstract:
+ Missing values are a fundamental problem in data science. Many datasets have missing values that must be properly handled because the way missing values are treated can have a large impact on the resulting machine learning model. In medical applications, the consequences may affect healthcare decisions. There are many methods in the literature for dealing with missing values, including state-of-the-art methods which often depend on black-box models for imputation. In this work, we show how recent advances in interpretable machine learning provide a new perspective for understanding and tackling the missing value problem. We propose methods based on high-accuracy glass-box Explainable Boosting Machines (EBMs) that can help users (1) gain new insights on missingness mechanisms and better understand the causes of missingness, and (2) detect -- or even alleviate -- potential risks introduced by imputation algorithms. Experiments on real-world medical datasets illustrate the effectiveness of the proposed methods.
+
+ Explaining a machine learning decision to physicians via counterfactuals
+
+
+ Supriya Nagesh* (Amazon), Nina Mishra (Amazon), Yonatan Naamad (Amazon), James M Rehg (Georgia Institute of Technology), Mehul A Shah (Aryn), Alexei Wagner (Harvard University)
+
+ Abstract:
+ Machine learning models perform well on several healthcare tasks and can help reduce the burden on the healthcare system. However, the lack of explainability is a major roadblock to their adoption in hospitals. How can the decision of an ML model be explained to a physician? The explanations considered in this paper are counterfactuals (CFs), hypothetical scenarios that would have resulted in the opposite outcome. Specifically, time-series CFs are investigated, inspired by the way physicians converse and reason out decisions: `I would have given the patient a vasopressor if their blood pressure were lower and falling'. Key properties of CFs that are particularly meaningful in clinical settings are outlined: physiological plausibility, relevance to the task, and sparse perturbations. Past work on CF generation does not satisfy these properties, in particular plausibility: realistic time-series CFs are not generated. A variational autoencoder (VAE)-based approach is proposed that captures these desired properties. The method produces CFs that improve on prior approaches quantitatively (more plausible CFs, as evaluated by their likelihood w.r.t. the original data distribution, and 100x faster CF generation) and qualitatively (2x more plausible and relevant), as evaluated by three physicians.
+
+ Abstract:
+ Noisy training labels can hurt model performance. Most approaches that aim to address label noise assume label noise is independent of the input features. In practice, however, label noise is often feature- or instance-dependent, and therefore biased (i.e., some instances are more likely to be mislabeled than others). For example, in clinical care, female patients are more likely to be under-diagnosed for cardiovascular disease compared to male patients. Approaches that ignore this dependence can produce models with poor discriminative performance, and in many healthcare settings, can exacerbate issues around health disparities. In light of these limitations, we propose a two-stage approach to learn in the presence of instance-dependent label noise. Our approach utilizes alignment points, a small subset of data for which we know the observed and ground truth labels. On several tasks, our approach leads to consistent improvements over the state-of-the-art in discriminative performance (AUROC) while mitigating bias (area under the equalized odds curve, AUEOC). For example, when predicting acute respiratory failure onset on the MIMIC-III dataset, our approach achieves a harmonic mean (AUROC and AUEOC) of 0.84 (SD [standard deviation] 0.01) while that of the next best baseline is 0.81 (SD 0.01). Overall, our approach improves accuracy while mitigating potential bias compared to existing approaches in the presence of instance-dependent label noise.
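+ The role of the alignment points can be illustrated with a toy first stage that estimates how often observed labels disagree with ground truth within each group (a sketch only; the paper's estimator handles fully instance-dependent noise, not just group-dependent rates):

```python
def estimate_flip_rates(alignment_points):
    """Fraction of mislabeled examples per group, computed from a small set
    of (group, observed_label, true_label) triples. Group-dependent rates
    like these are exactly what feature-independent noise models miss."""
    counts = {}
    for group, observed, true in alignment_points:
        n, flips = counts.get(group, (0, 0))
        counts[group] = (n + 1, flips + (observed != true))
    return {group: flips / n for group, (n, flips) in counts.items()}
```

+ A second stage could then reweight or correct training labels using these rates before fitting the final model.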
+
+ Fair Admission Risk Prediction with Proportional Multicalibration
+
+
+ William La Cava* (Boston Children's Hospital and Harvard Medical School), Elle Lett (Boston Children's Hospital and Harvard Medical School), Guangya Wan (Boston Children's Hospital and Harvard Medical School)
+
+ Abstract:
+ Fair calibration is a widely desirable fairness criterion in risk prediction contexts. One way to measure and achieve fair calibration is with multicalibration, which constrains calibration error among flexibly-defined subpopulations while maintaining overall calibration. However, multicalibrated models can exhibit a higher percent calibration error among groups with lower base rates than among groups with higher base rates. As a result, a decision-maker may learn to trust or distrust model predictions for specific groups. To alleviate this, we propose proportional multicalibration (PMC), a criterion that constrains the percent calibration error among groups and within prediction bins. We prove that satisfying proportional multicalibration bounds a model's multicalibration as well as its differential calibration, a fairness criterion that directly measures how closely a model approximates sufficiency. Therefore, proportionally calibrated models limit the ability of decision-makers to distinguish between model performance on different patient groups, which may make the models more trustworthy in practice. We provide an efficient algorithm for post-processing risk prediction models for proportional multicalibration and evaluate it empirically. We conduct simulation studies and investigate a real-world application of PMC post-processing to the prediction of emergency department patient admissions. We observe that proportional multicalibration is a promising criterion for controlling simultaneous measures of calibration fairness of a model over intersectional groups with virtually no cost in terms of classification performance.
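+ The quantity being constrained can be made concrete with a small helper that reports, for each (group, prediction-bin) cell, the calibration error as a fraction of the cell's observed positive rate (a hypothetical illustration of the criterion; the paper's post-processing algorithm is not reproduced here):

```python
def proportional_calibration_errors(y_true, y_prob, groups, n_bins=5):
    """Per (group, bin) cell: |mean prediction - observed positive rate|
    divided by the observed positive rate, i.e., the 'percent' calibration
    error that proportional multicalibration constrains."""
    cells = {}
    for y, p, g in zip(y_true, y_prob, groups):
        b = min(int(p * n_bins), n_bins - 1)   # equal-width probability bins
        ys, ps = cells.setdefault((g, b), ([], []))
        ys.append(y)
        ps.append(p)
    errors = {}
    for key, (ys, ps) in cells.items():
        base_rate = sum(ys) / len(ys)
        if base_rate > 0:   # percent error undefined for empty base rates
            errors[key] = abs(sum(ps) / len(ps) - base_rate) / base_rate
    return errors
```

+ Multicalibration bounds the absolute errors per cell; proportional multicalibration additionally keeps these ratios small, so low-base-rate groups are not disproportionately miscalibrated.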
+
+ Collecting data when missingness is unknown: a method for improving model performance given under-reporting in patient populations
+
+
+ Kevin Wu* (Stanford University and Optum Labs), Dominik Dahlem (Optum Labs), Christopher Hane (Optum Labs), Eran Halperin (Optum Labs), James Zou (Stanford University)
+
+ Abstract:
+ Machine learning models for healthcare commonly use binary indicator variables to represent the diagnosis of specific health conditions in medical records. However, in populations with significant under-reporting, the absence of a recorded diagnosis does not rule out the presence of a condition, making it difficult to distinguish between negative and missing values. This effect, which we refer to as latent missingness, may lead to model degradation and perpetuate existing biases in healthcare. To address this issue, we propose that healthcare providers and payers allocate a budget towards data collection (e.g., subsidies for check-ups or lab tests). However, given finite resources, only a subset of data points can be collected. Additionally, most models cannot be re-trained after deployment. In this paper, we propose a method for efficient data collection in order to maximize a fixed model's performance on a given population. Through simulated and real-world data, we demonstrate the potential value of targeted data collection to address model degradation.
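+ One way to make the budgeting concrete is to rank patients by estimated false-negative risk, the chance they have the condition times the chance it went unrecorded, and spend the budget on the top of the list. The scoring rule below is a hypothetical stand-in; the paper selects points so as to maximize a fixed model's performance directly:

```python
def allocate_collection_budget(patients, budget, p_condition, p_unrecorded):
    """Spend a finite collection budget (e.g., subsidized lab tests) on the
    patients most likely to be false negatives due to under-reporting.
    `p_condition` and `p_unrecorded` are caller-supplied probability
    estimates per patient."""
    def false_negative_risk(patient):
        return p_condition(patient) * p_unrecorded(patient)
    return sorted(patients, key=false_negative_risk, reverse=True)[:budget]
```

+ Collecting ground truth for these patients converts ambiguous "no recorded diagnosis" entries into informative labels without retraining the deployed model.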
+
+ Abstract:
+ Detailed mobile sensing data from phones and fitness trackers offer an opportunity to quantify previously unmeasurable behavioral changes to improve individual health and accelerate responses to emerging diseases. Unlike in natural language processing and computer vision, deep learning has yet to broadly impact this domain, in which the majority of research and clinical applications still rely on manually defined features or even forgo predictive modeling altogether due to insufficient accuracy. This is due to unique challenges in the behavioral health domain, including very small datasets (~10^1 participants), which frequently contain missing data, consist of long time series with critical long-range dependencies (length ~10^4), and extreme class imbalances (>10^3:1). Here, we describe a neural architecture for multivariate time series classification designed to address these unique domain challenges. Our proposed behavioral representation learning approach combines novel tasks for self-supervised pretraining and transfer learning to address data scarcity, and captures long-range dependencies across long-history time series through transformer self-attention following convolutional neural network-based dimensionality reduction. We propose an evaluation framework aimed at reflecting expected real-world performance in plausible deployment scenarios. Concretely, we demonstrate (1) performance improvements over baselines of up to 0.15 ROC AUC across five influenza-related prediction tasks, (2) transfer learning-induced performance improvements including a 16% relative increase in PR AUC in small data scenarios, and (3) the potential of transfer learning in novel disease scenarios through an exploratory case study of zero-shot COVID-19 prediction in an independent data set. Finally, we discuss potential implications for medical surveillance testing.
+
+ Clinical Relevance Score for Guided Trauma Injury Pattern Discovery with Weakly Supervised β-VAE
+
+
+ Qixuan Jin* (Massachusetts Institute of Technology), Jacobien Oosterhoff (Delft University of Technology), Yepeng Huang (Harvard School of Public Health), Marzyeh Ghassemi (Massachusetts Institute of Technology), Gabriel A. Brat (Beth Israel Deaconess Medical Center and Harvard Medical School)
+
+ Abstract:
+ Given the complexity of trauma presentations, particularly in those involving multiple areas of the body, overlooked injuries are common during the initial assessment by a clinician. We are motivated to develop an automated trauma pattern discovery framework for comprehensive identification of injury patterns which may eventually support diagnostic decision-making. We analyze 1,162,399 patients from the Trauma Quality Improvement Program with a disentangled variational autoencoder, weakly supervised by a latent-space classifier of auxiliary features. We also develop a novel scoring metric that serves as a proxy for clinical intuition in extracting clusters with clinically meaningful injury patterns. We validate the extracted clusters with clinical experts, and explore the patient characteristics of selected groupings. Our metric is able to perform model selection and effectively filter clusters for clinically-validated relevance.
+
+ Abstract:
+ Machine learning (ML) models deployed in healthcare systems must face data drawn from continually evolving environments. However, researchers proposing such models typically evaluate them in a time-agnostic manner, splitting datasets according to patients sampled randomly throughout the entire study time period. This work proposes the Evaluation on Medical Datasets Over Time (EMDOT) framework, which evaluates the performance of a model class across time. Inspired by the concept of backtesting, EMDOT simulates possible training procedures that practitioners might have been able to execute at each point in time and evaluates the resulting models on all future time points. Evaluating both linear and more complex models on six distinct medical data sources (tabular and imaging), we show that, depending on the dataset, using all historical data is ideal in many cases, whereas using a window of the most recent data is advantageous in others. In datasets where models suffer from sudden degradations in performance, we investigate plausible explanations for these shocks. We release the EMDOT package to facilitate further work on deployment-oriented evaluation over time.
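+ The backtesting idea can be sketched in a few lines: at each time point, train on the history available then (all of it, or only a recent window) and score on every future period. The function below is an illustrative sketch, not the released EMDOT package's API:

```python
def backtest(records, train, evaluate, window=None):
    """Simulate deployment over time. `records` maps each time period to its
    dataset; `train` fits a model on a list of datasets; `evaluate` scores a
    model on one dataset. Returns {(train_time, eval_time): score}."""
    periods = sorted(records)
    results = {}
    for i, t in enumerate(periods[:-1]):
        start = 0 if window is None else max(0, i + 1 - window)
        model = train([records[p] for p in periods[start:i + 1]])
        for future in periods[i + 1:]:
            results[(t, future)] = evaluate(model, records[future])
    return results
```

+ Comparing the all-history (`window=None`) and windowed variants per dataset mirrors the paper's central comparison.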
+
+ Rediscovery of CNN's Versatility for Text-based Encoding of Raw Electronic Health Records
+
+
+ Eunbyeol Cho* (KAIST), Min Jae Lee (KAIST), Kyunghoon Hur (KAIST), Jiyoun Kim (KAIST), Jinsung Yoon (Google Cloud AI Research), Edward Choi (KAIST)
+
+ Abstract:
+ Making the most of the abundant information in electronic health records (EHR) is rapidly becoming an important topic in the medical domain. Recent work presented a promising framework that embeds entire features in raw EHR data regardless of their form and medical code standards. The framework, however, focuses only on encoding EHR with minimal preprocessing and fails to consider how to learn efficient EHR representations in terms of computation and memory usage. In this paper, we search for a versatile encoder that not only reduces the large data into a manageable size but also preserves the core information about patients needed to perform diverse clinical tasks. We find that a hierarchically structured convolutional neural network (CNN) often outperforms the state-of-the-art model on diverse tasks such as reconstruction, prediction, and generation, even with fewer parameters and less training time. Moreover, it turns out that making use of the inherent hierarchy of EHR data can boost the performance of any backbone model and clinical task. Through extensive experiments, we present concrete evidence to generalize our research findings into real-world practice, and we give clear guidelines for building the encoder based on the findings captured while exploring numerous settings.
+
diff --git a/2023/proceeding_P29.html b/2023/proceeding_P29.html
new file mode 100644
index 000000000..69ce3a5b8
--- /dev/null
+++ b/2023/proceeding_P29.html
@@ -0,0 +1,514 @@
+ Homekit2020: A Benchmark for Time Series Classification on a Large Mobile Sensing Dataset with Laboratory Tested Ground Truth of Influenza Infections
+
+
+ Mike A Merrill (University of Washington), Esteban Safranchik* (University of Washington), Arinbjörn Kolbeinsson (Evidation Health), Piyusha Gade (Evidation Health), Ernesto Ramirez (Evidation Health), Ludwig Schmidt (University of Washington), Luca Foschini (Sage Bionetworks), Tim Althoff (University of Washington)
+
+ Abstract:
+ Despite increased interest in wearables as tools for detecting various health conditions, there are as yet no large public benchmarks for such mobile sensing data. The few datasets that are available contain data from no more than dozens of individuals, lack high-resolution raw data, or do not include dataloaders for easy integration into machine learning pipelines. Here, we present Homekit2020: the first large-scale public benchmark for time series classification of wearable sensor data. Our dataset contains over 14 million hours of minute-level multimodal Fitbit data, symptom reports, and ground-truth laboratory PCR influenza test results, along with an evaluation framework that mimics realistic model deployments and efficiently characterizes statistical uncertainty in model selection in the presence of extreme class imbalance. Furthermore, we implement and evaluate nine neural and non-neural time series classification models on our benchmark across 450 total training runs in order to establish state-of-the-art performance.
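+ Characterizing statistical uncertainty in model selection typically comes down to resampling the evaluation set; a generic bootstrap confidence interval, with a guard for resamples that lose the rare class entirely, might look like the following (a sketch of the general technique, not the benchmark's own evaluation framework):

```python
import random

def bootstrap_ci(y_true, y_score, metric, n_boot=1000, seed=0):
    """Approximate 95% bootstrap interval for a metric, resampling whole
    examples with replacement. Resamples containing only one class are
    skipped, a situation that becomes common under extreme class imbalance."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        labels = [y_true[i] for i in idx]
        if len(set(labels)) < 2:
            continue
        stats.append(metric(labels, [y_score[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * len(stats))], stats[int(0.975 * len(stats))]
```

+ Overlapping intervals between two candidate models signal that the selection is not statistically meaningful at the available sample size.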
+
+ Abstract:
+ The human brain is the central hub of the neurobiological system, controlling behavior and cognition in complex ways. Recent advances in neuroscience and neuroimaging analysis have shown a growing interest in the interactions between brain regions of interest (ROIs) and their impact on neural development and disorder diagnosis. As a powerful deep model for analyzing graph-structured data, Graph Neural Networks (GNNs) have been applied for brain network analysis. However, training deep models requires large amounts of labeled data, which is often scarce in brain network datasets due to the complexities of data acquisition and sharing restrictions. To make the most out of available training data, we propose PTGB, a GNN pre-training framework that captures intrinsic brain network structures, regardless of clinical outcomes, and is easily adaptable to various downstream tasks. PTGB comprises two key components: (1) an unsupervised pre-training technique designed specifically for brain networks, which enables learning from large-scale datasets without task-specific labels; (2) a data-driven parcellation atlas mapping pipeline that facilitates knowledge transfer across datasets with different ROI systems. Extensive evaluations using various GNN models have demonstrated the robust and superior performance of PTGB compared to baseline methods.
+
+ Eric Lehman* (MIT and Xyla), Evan Hernandez (MIT and Xyla), Diwakar Mahajan (IBM Research), Jonas Wulff (Xyla), Micah J. Smith (Xyla), Zachary Ziegler (Xyla), Daniel Nadler (Xyla), Peter Szolovits (MIT), Alistair Johnson (The Hospital for Sick Children), Emily Alsentzer (Brigham and Women's Hospital and Harvard Medical School)
+
+ Abstract:
+ Although recent advances in scaling large language models (LLMs) have resulted in improvements on many NLP tasks, it remains unclear whether these models trained primarily with general web text are the right tool in highly specialized, safety critical domains such as clinical text. Recent results have suggested that LLMs encode a surprising amount of medical knowledge. This raises an important question regarding the utility of smaller domain-specific language models. With the success of general-domain LLMs, is there still a need for specialized clinical models? To investigate this question, we conduct an extensive empirical analysis of 12 language models, ranging from 220M to 175B parameters, measuring their performance on 3 different clinical tasks that test their ability to parse and reason over electronic health records. As part of our experiments, we train T5-Base and T5-Large models from scratch on clinical notes from MIMIC III and IV to directly investigate the efficiency of clinical tokens. We show that relatively small specialized clinical models substantially outperform all in-context learning approaches, even when finetuned on limited annotated data. Further, we find that pretraining on clinical tokens allows for smaller, more parameter-efficient models that either match or outperform much larger language models trained on general text. We release the code and the models used under the PhysioNet Credentialed Health Data license and data use agreement.
+
+ Abstract:
+ Electrodermal activity (EDA) is a biosignal that contains valuable information for monitoring health conditions related to sympathetic nervous system activity. Analyzing ambulatory EDA data is challenging because EDA measurements tend to be noisy and sparsely labeled. To address this problem, we present the first study of contrastive learning tailored to the EDA signal. We propose a novel set of EDA-specific data augmentations and use them to generate positive examples for unsupervised contrastive learning. We evaluate our proposed approach on the downstream task of stress detection and find that it outperforms baselines when used both for fine-tuning and for transfer learning, especially in regimes of high label sparsity. Through a set of ablation experiments, we verify that our EDA-specific augmentations add considerable value beyond those considered in prior work.
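+ Generating a positive pair for contrastive learning amounts to applying two independent random augmentations to the same signal window. The generic amplitude-scaling and jitter transforms below are stand-ins for the paper's EDA-specific augmentation set, which is not reproduced here:

```python
import random

def augment(signal, rng, scale=0.2, jitter=0.05):
    """One random view of a signal window: global amplitude scaling plus
    additive Gaussian jitter (generic time-series augmentations)."""
    s = 1.0 + rng.uniform(-scale, scale)
    return [x * s + rng.gauss(0.0, jitter) for x in signal]

def positive_pair(signal, seed=0):
    """Two independently augmented views of the same window; a contrastive
    loss (e.g., NT-Xent) then pulls their embeddings together."""
    rng = random.Random(seed)
    return augment(signal, rng), augment(signal, rng)
```

+ Views of different windows serve as negatives, so the encoder learns features invariant to the chosen augmentations but discriminative across windows.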
+
+ Understanding and Predicting the Effect of Environmental Factors on People with Type 2 Diabetes
+
+
+ Kailas Vodrahalli* (Stanford University), Gregory D. Lyng (Optum AI Labs), Brian L. Hill (Optum AI Labs), Kimmo Karkkainen (Optum AI Labs), Jeffrey Hertzberg (Optum AI Labs), James Zou (Stanford University), Eran Halperin (Optum AI Labs)
+
+ Abstract:
+ Type 2 diabetes mellitus (T2D) affects over 530 million people globally and is often difficult to manage, leading to serious health complications. Continuous glucose monitoring (CGM) can help people with T2D to monitor and manage the disease. CGM devices sample an individual's glucose level at frequent intervals, enabling sophisticated characterization of an individual's health. In this work, we leverage a large dataset of CGM data (5,447 individuals and 940,663 days of data) paired with health records and activity data to investigate how glucose levels in people with T2D are affected by external factors like weather conditions, extreme weather events, and temporal events including local holidays. We find temperature (p=2.37x10^-8, n=3561), holidays (p=2.23x10^-46, n=4079), and weekends (p=7.64x10^-124, n=5429) each have a significant effect on standard glycemic metrics at a population level. Moreover, we show that we can predict whether an individual will be significantly affected by a (potentially unobserved) external event using only demographic information and a few days of CGM and activity data. Using random forest classifiers, we can predict whether an individual will be more negatively affected than a typical individual with T2D by a given external factor with respect to a given glycemic metric. We find performance (measured as ROC-AUC) is consistently above chance (across classifiers, median ROC-AUC=0.63). Performance is highest for classifiers predicting the effect on time-in-range (median ROC-AUC=0.70). These are important findings because they may enable better patient care management with day-to-day risk assessments based on external factors, as well as improve algorithm development by reducing train- and test-time bias due to external factors.
+
+ Andre Manoel* (Microsoft); Mirian Del Carmen Hipolito Garcia (Microsoft); Tal Baumel (Microsoft); Shize Su (Microsoft); Jialei Chen (Microsoft); Robert Sim (Microsoft); Dan Miller (Airbnb); Danny Karmon (Google); Dimitrios Dimitriadis (Amazon)
+
+ Sarwan Ali* (Georgia State University); Babatunde Bello (Georgia State University); Prakash Chourasia (Georgia State University); Ria Thazhe Punathil (Georgia State University); Pin-Yu Chen (IBM Research); Imdad Ullah Khan (Lahore University of Management Sciences); Murray Patterson (Georgia State University)
+
+ Hyeonji Hwang* (KAIST); Seongjun Yang (KRAFTON); Daeyoung Kim (KAIST); Radhika Dua (Google Research); Jong-Yeup Kim(Konyang University); Eunho Yang (KAIST) ; Edward Choi (KAIST)
+
+ Giovanni Cinà* (Amsterdam University Medical Center); Tabea E. Röber (University of Amsterdam); Rob Goedhart (University of Amsterdam); Ş. İlker Birbil (University of Amsterdam)
+
+ Siyi Tang* (Stanford University); Jared A. Dunnmon (Stanford University); Liangqiong Qu (University of Hong Kong); Khaled K. Saab (Stanford University); Tina Baykaner (Stanford University); Christopher Lee-Messer (Stanford University); Daniel L. Rubin (Stanford University)
+
+ Vincent Jeanselme* (University of Cambridge); Chang Ho Yoon (University of Oxford); Brian Tom (University of Cambridge); Jessica Barrett (University of Cambridge)
+
+ Li Xu* (Hong Kong Polytechnic University); Bo Liu (Hong Kong Polytechnic University); Ameer Hamza Khan (Hong Kong Polytechnic University); Lu Fan (Hong Kong Polytechnic University); Xiao-Ming Wu (Hong Kong Polytechnic University)
+
+ Mingyue Tang (University of Virginia); Jiechao Gao* (University of Virginia); Guimin Dong (Amazon); Carl Yang (Emory University); Brad Campbell (University of Virginia); Brendan Bowman (University of Virginia); Jamie Marie Zoellner (University of Virginia); Emaad Abdel-Rahman (University of Virginia); Mehdi Boukhechba (The Janssen Pharmaceutical Companies of Johnson & Johnson)
+
+ Siun Kim* (Seoul National University); Jung-Hyun Won (Seoul National University); David Seung U Lee (Seoul National University); Renqian Luo (Microsoft Research); Lijun Wu (Microsoft Research); Yingce Xia (Microsoft Research); Tao Qin (Microsoft Research); Howard Lee (Seoul National University)
+
+ Toshiya Yoshida* (University of California Santa Cruz); Trinity Shuxian Fan (University of Washington); Tyler McCormick (University of Washington); Zhenke Wu (University of Michigan); Zehang Richard Li (University of California Santa Cruz)
+
+ Jessica Zheng (MIT); Hanrui Wang* (MIT); Anand Chandrasekhar (MIT); Aaron Aguirre (Massachusetts General Hospital and Harvard Medical School); Song Han (MIT); Hae-Seung Lee (MIT); Charles G. Sodini (MIT)
+
+
+ Abstract:
+ Biological sequences, like DNA and protein sequences, encode genetic information essential to life. In recent times, deep learning techniques have transformed biomedical research and applications by modeling the intricate patterns in these sequences. Successful models like AlphaFold and Enformer have paved the way for accurate end-to-end prediction of complex molecular phenotypes from sequences. Such models have profound impact on biomedical research and applications, ranging from understanding basic biology to facilitating drug discovery. This talk will provide an overview of the current techniques and status of biological sequences modeling. Additionally, specific applications of such models in genetics and immunology will be discussed.
+
+ Bio:
+ Jun Cheng is a Senior Research Scientist at DeepMind. His research focuses on developing machine learning methods to better understand the genetic code and disease mechanisms. Before that, he was a scientist at NEC Labs Europe, where he worked on personalized cancer vaccines. His work has been published in venues such as Genome Biology, Bioinformatics, and Nature Biotechnology. He received his PhD in computational biology from the Technical University of Munich.
+
+ Abstract:
+ Artificial intelligence could fundamentally transform clinical workflows in image-based diagnostics and population screening, promising more objective, accurate and effective analysis of medical images. A major hurdle for using medical imaging AI in clinical practice, however, is the assurance whether it is safe for patients and continues to be safe after deployment. Differences in patient populations and changes in the data acquisition pose challenges to today's AI algorithms. In this talk we will discuss AI safeguards from the perspective of robustness, reliability, and fairness. We will explore approaches for automatic failure detection, monitoring of performance, and analysis of bias, aiming to ensure the safe and ethical use of medical imaging AI.
+
+ Bio:
+ Ben Glocker is Professor in Machine Learning for Imaging and Kheiron Medical Technologies / Royal Academy of Engineering Research Chair in Safe Deployment of Medical Imaging AI. He co-leads the Biomedical Image Analysis Group, leads the HeartFlow-Imperial Research Team, and is Head of ML Research at Kheiron. His research is at the intersection of medical imaging and artificial intelligence aiming to build safe and ethical computational tools for improving image-based detection and diagnosis of disease.
+
+ Abstract:
+ Digital traces, such as social media data, supported with advances in the artificial intelligence (AI) and machine learning (ML) fields, are increasingly being used to understand the mental health of individuals, communities, and populations. However, such algorithms do not exist in a vacuum -- there is an intertwined relationship between what an algorithm does and the world it exists in. Consequently, with algorithmic approaches offering promise to change the status quo in mental health for the first time since the mid-20th century, interdisciplinary collaborations are paramount. But what are some paradigms of engagement for AI/ML researchers that augment existing algorithmic capabilities while minimizing the risk of harm? Adopting a social ecological lens, this talk will describe experiences from working with different stakeholders in research initiatives relating to digital mental health, including healthcare providers, grassroots advocacy and public health organizations, and people with the lived experience of mental illness. The talk hopes to present some lessons learned by way of these engagements, and to reflect on a path forward that empowers us to go beyond technical innovations to envisioning contributions that center humans' needs, expectations, values, and voices within those technical artifacts.
+
+ Bio:
+ Munmun De Choudhury is an Associate Professor of Interactive Computing at Georgia Tech. Dr. De Choudhury is best known for laying the foundation of a new line of research that develops computational techniques towards understanding and improving mental health outcomes, through ethical analysis of social media data. To do this work, she adopts a highly interdisciplinary approach, combining social computing, machine learning, and natural language analysis with insights and theories from the social, behavioral, and health sciences. Dr. De Choudhury has been recognized with the 2023 SIGCHI Societal Impact Award, the 2022 Web Science Trust Test-of-Time Award, the 2021 ACM-W Rising Star Award, the 2019 Complex Systems Society Junior Scientific Award, numerous best paper and honorable mention awards from the ACM and AAAI, and features and coverage in the popular press, including The New York Times, NPR, and the BBC. Earlier, Dr. De Choudhury was a faculty associate with the Berkman Klein Center for Internet and Society at Harvard, a postdoc at Microsoft Research, and obtained her PhD in Computer Science from Arizona State University.
+
+ Abstract:
+ Our society remains profoundly unequal. This talk discusses how data science and machine learning can be used to combat inequality in health care and public health by presenting several vignettes from domains like medical testing and cancer risk prediction.
+
+ Bio:
+ Emma Pierson is an assistant professor of computer science at the Jacobs Technion-Cornell Institute at Cornell Tech and the Technion, and a computer science field member at Cornell University. She holds a secondary joint appointment as an Assistant Professor of Population Health Sciences at Weill Cornell Medical College. She develops data science and machine learning methods to study inequality and healthcare. Her work has been recognized by best paper, poster, and talk awards, an NSF CAREER award, a Rhodes Scholarship, Hertz Fellowship, Rising Star in EECS, MIT Technology Review 35 Innovators Under 35, and Forbes 30 Under 30 in Science. Her research has been published at venues including ICML, KDD, WWW, Nature, and Nature Medicine, and she has also written for The New York Times, FiveThirtyEight, Wired, and various other publications.
+
+ Bio:
+ Dina Katabi is the Thuan and Nicole Pham Professor of Electrical Engineering and Computer Science at MIT. She is also the director of MIT's Center for Wireless Networks and Mobile Computing, a member of the National Academy of Engineering, and a recipient of the MacArthur Genius Award. Professor Katabi received her PhD and MS from MIT in 2003 and 1999, and her Bachelor of Science from Damascus University in 1995. Katabi's research focuses on innovations in digital health, applied machine learning, and wireless sensors and networks. Her research has been recognized with the ACM Prize in Computing, the ACM Grace Murray Hopper Award, two SIGCOMM Test-of-Time Awards, the Faculty Research Innovation Fellowship, a Sloan Fellowship, the NBX Career Development chair, and the NSF CAREER award. Her students twice received the ACM Best Doctoral Dissertation Award in Computer Science and Engineering. Further, her work was recognized by the IEEE William R. Bennett prize, three ACM SIGCOMM Best Paper awards, an NSDI Best Paper award, and a TR10 award. Several start-ups have been spun out of Katabi's lab, such as PiCharging and Emerald.
+
+ Abstract:
+ The traditional knowledge-based approaches to question answering might seem irrelevant now that neural QA systems, particularly large language models (LLMs), show almost human performance in question answering. Knowing what was successful in the past and which elements are essential to getting the right answers, however, is needed to inform further developments in the neural approaches and to help address the known shortcomings of LLMs. This talk, therefore, will provide an overview of the approaches to biomedical question answering as they evolved. It will cover the information needs of various stakeholders and the resources created to address these information needs through question answering.
+
+ Bio:
+ Dina Demner-Fushman, MD, PhD is a Tenure Track Investigator in the Computational Health Research Branch at LHNCBC. She specializes in artificial intelligence and natural language processing, with a focus on information extraction and textual data analysis, EMR data analysis, and image and text retrieval for clinical decision support and education. Dr. Demner-Fushman's research aims to improve healthcare through the development of computational methods that can process and analyze clinical data more effectively. Her research led to the current iteration of the MEDLINE resource, which helps people navigate a plethora of NLM resources, as well as Open-i, which helps finding biomedical images.
+
+ Abstract:
+ Artificial intelligence tools have been touted as having performance "on par" with board certified dermatologists. However, these published claims have not translated to real world practice. In this talk, I will discuss the opportunities and challenges for AI in dermatology.
+
+ Bio:
+ Dr. Roxana Daneshjou received her undergraduate degree at Rice University in Bioengineering, where she was recognized as a Goldwater Scholar for her research. She completed her MD/PhD at Stanford, where she worked in the lab of Dr. Russ Altman. During this time, she was a Howard Hughes Medical Institute Medical Scholar and a Paul and Daisy Soros Fellowship for New Americans Fellow. She completed dermatology residency at Stanford in the research track and now practices dermatology as a Clinical Scholar in Stanford's Department of Dermatology while also conducting artificial intelligence research with Dr. James Zou as a postdoc in Biomedical Data Science. She is an incoming assistant professor of biomedical data science and dermatology at Stanford in Fall of 2023. Her research interests are in developing diverse datasets and fair algorithms for applications in precision medicine.
+
+ Biomedical Question Answering Yesterday, Today, and Tomorrow
+
+
+
+
+
+ Dina Demner-Fushman / National Institutes of Health
+
+
+
+
+
+
+
+
+
+
Abstract: Traditional knowledge-based approaches to question answering might seem irrelevant now that neural QA systems, particularly Large Language Models (LLMs), show near-human performance in question answering. Knowing what was successful in the past, and which elements are essential to getting the right answers, is nevertheless needed to inform further developments in the neural approaches and to help address the known shortcomings of LLMs. This talk will therefore provide an overview of the approaches to biomedical question answering as they have evolved. It will cover the information needs of various stakeholders and the resources created to address those needs through question answering.
+
+
+
Bio: Dina Demner-Fushman, MD, PhD is a Tenure Track Investigator in the Computational Health Research Branch at LHNCBC. She specializes in artificial intelligence and natural language processing, with a focus on information extraction and textual data analysis, EMR data analysis, and image and text retrieval for clinical decision support and education. Dr. Demner-Fushman's research aims to improve healthcare through the development of computational methods that can process and analyze clinical data more effectively. Her research led to the current iteration of the MEDLINE resource, which helps people navigate a plethora of NLM resources, as well as Open-i, which helps users find biomedical images.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Safe Deployment of Medical Imaging AI
+
+
+
+
+
+ Ben Glocker / Imperial College London
+
+
+
+
+
+
+
+
+
+
Abstract: Artificial intelligence could fundamentally transform clinical workflows in image-based diagnostics and population screening, promising more objective, accurate and effective analysis of medical images. A major hurdle for using medical imaging AI in clinical practice, however, is providing assurance that it is safe for patients and remains safe after deployment. Differences in patient populations and changes in data acquisition pose challenges to today's AI algorithms. In this talk we will discuss AI safeguards from the perspective of robustness, reliability, and fairness. We will explore approaches for automatic failure detection, monitoring of performance, and analysis of bias, aiming to ensure the safe and ethical use of medical imaging AI.
+
+
+
Bio: Ben Glocker is Professor in Machine Learning for Imaging and Kheiron Medical Technologies / Royal Academy of Engineering Research Chair in Safe Deployment of Medical Imaging AI. He co-leads the Biomedical Image Analysis Group, leads the HeartFlow-Imperial Research Team, and is Head of ML Research at Kheiron. His research is at the intersection of medical imaging and artificial intelligence aiming to build safe and ethical computational tools for improving image-based detection and diagnosis of disease.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Biological Sequence Modeling in Research and Applications
+
+
+
+
+
+ Jun Cheng / DeepMind
+
+
+
+
+
+
+
+
+
+
Abstract: Biological sequences, like DNA and protein sequences, encode genetic information essential to life. In recent years, deep learning techniques have transformed biomedical research and applications by modeling the intricate patterns in these sequences. Successful models like AlphaFold and Enformer have paved the way for accurate end-to-end prediction of complex molecular phenotypes from sequences. Such models have a profound impact on biomedical research and applications, ranging from understanding basic biology to facilitating drug discovery. This talk will provide an overview of the current techniques and status of biological sequence modeling. Additionally, specific applications of such models in genetics and immunology will be discussed.
+
+
+
Bio: Jun Cheng is a Senior Research Scientist at DeepMind. His research focuses on developing machine learning methods to better understand the genetic code and disease mechanisms. Before that, he was a scientist at NEC Labs Europe, where he worked on personalized cancer vaccines. His work has been published in venues such as Genome Biology, Bioinformatics, and Nature Biotechnology. He received his PhD in computational biology from the Technical University of Munich.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Bridging Machine Learning and Collaborative Action Research: A Tale Engaging with Diverse Stakeholders in Digital Mental Health
+
+
+
+
+
+ Munmun De Choudhury / Georgia Tech
+
+
+
+
+
+
+
+
+
+
Abstract: Digital traces, such as social media data, supported with advances in the artificial intelligence (AI) and machine learning (ML) fields, are increasingly being used to understand the mental health of individuals, communities, and populations. However, such algorithms do not exist in a vacuum -- there is an intertwined relationship between what an algorithm does and the world it exists in. Consequently, with algorithmic approaches offering promise to change the status quo in mental health for the first time since the mid-20th century, interdisciplinary collaborations are paramount. But what are some paradigms of engagement for AI/ML researchers that augment existing algorithmic capabilities while minimizing the risk of harm? Adopting a social ecological lens, this talk will describe the experiences from working with different stakeholders in research initiatives relating to digital mental health – including with healthcare providers, grassroots advocacy and public health organizations, and people with the lived experience of mental illness. The talk hopes to present some lessons learned by way of these engagements, and to reflect on a path forward that empowers us to go beyond technical innovations to envisioning contributions that center humans’ needs, expectations, values, and voices within those technical artifacts.
+
+
+
Bio: Munmun De Choudhury is an Associate Professor of Interactive Computing at Georgia Tech. Dr. De Choudhury is best known for laying the foundation of a new line of research that develops computational techniques towards understanding and improving mental health outcomes, through ethical analysis of social media data. To do this work, she adopts a highly interdisciplinary approach, combining social computing, machine learning, and natural language analysis with insights and theories from the social, behavioral, and health sciences. Dr. De Choudhury has been recognized with the 2023 SIGCHI Societal Impact Award, the 2022 Web Science Trust Test-of-Time Award, the 2021 ACM-W Rising Star Award, the 2019 Complex Systems Society – Junior Scientific Award, numerous best paper and honorable mention awards from the ACM and AAAI, and features and coverage in the popular press, including The New York Times, NPR, and the BBC. Earlier, Dr. De Choudhury was a faculty associate with the Berkman Klein Center for Internet and Society at Harvard, a postdoc at Microsoft Research, and obtained her PhD in Computer Science from Arizona State University.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ A Healthcare Platform Powered by ML and Radio Waves
+
+
+
+
+
+ Dina Katabi / MIT
+
+
+
+
+
+
+
+
+
+
Abstract: TBD
+
+
+
Bio: Dina Katabi is the Thuan and Nicole Pham Professor of Electrical Engineering and Computer Science at MIT. She is also the director of MIT’s Center for Wireless Networks and Mobile Computing, a member of the National Academy of Engineering, and a recipient of the MacArthur Genius Award. Professor Katabi received her PhD and MS from MIT in 2003 and 1999, and her Bachelor of Science from Damascus University in 1995. Katabi's research focuses on innovations in digital health, applied machine learning, and wireless sensors and networks. Her research has been recognized with the ACM Prize in Computing, the ACM Grace Murray Hopper Award, two SIGCOMM Test-of-Time Awards, the Faculty Research Innovation Fellowship, a Sloan Fellowship, the NBX Career Development chair, and the NSF CAREER award. Her students twice received the ACM Best Doctoral Dissertation Award in Computer Science and Engineering. Further, her work was recognized by the IEEE William R. Bennett prize, three ACM SIGCOMM Best Paper awards, an NSDI Best Paper award, and a TR10 award. Several start-ups have been spun out of Katabi's lab, such as PiCharging and Emerald.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Skin in the Game: The State of AI in Dermatology
+
+
+
+
+
+ Roxana Daneshjou / Stanford University
+
+
+
+
+
+
+
+
+
+
Abstract: Artificial intelligence tools have been touted as having performance "on par" with board certified dermatologists. However, these published claims have not translated to real world practice. In this talk, I will discuss the opportunities and challenges for AI in dermatology.
+
+
+
Bio: Dr. Roxana Daneshjou received her undergraduate degree at Rice University in Bioengineering, where she was recognized as a Goldwater Scholar for her research. She completed her MD/PhD at Stanford, where she worked in the lab of Dr. Russ Altman. During this time, she was a Howard Hughes Medical Institute Medical Scholar and a Paul and Daisy Soros Fellowship for New Americans Fellow. She completed dermatology residency at Stanford in the research track and now practices dermatology as a Clinical Scholar in Stanford's Department of Dermatology while also conducting artificial intelligence research with Dr. James Zou as a postdoc in Biomedical Data Science. She is an incoming assistant professor of biomedical data science and dermatology at Stanford in Fall of 2023. Her research interests are in developing diverse datasets and fair algorithms for applications in precision medicine.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Using Machine Learning to Increase Equity in Healthcare and Public Health
+
+
+
+
+
+ Emma Pierson / Cornell Tech
+
+
+
+
+
+
+
+
+
+
Abstract: Our society remains profoundly unequal. This talk discusses how data science and machine learning can be used to combat inequality in health care and public health by presenting several vignettes from domains like medical testing and cancer risk prediction.
+
+
+
Bio: Emma Pierson is an assistant professor of computer science at the Jacobs Technion-Cornell Institute at Cornell Tech and the Technion, and a computer science field member at Cornell University. She holds a secondary joint appointment as an Assistant Professor of Population Health Sciences at Weill Cornell Medical College. She develops data science and machine learning methods to study inequality and healthcare. Her work has been recognized by best paper, poster, and talk awards, an NSF CAREER award, a Rhodes Scholarship, Hertz Fellowship, Rising Star in EECS, MIT Technology Review 35 Innovators Under 35, and Forbes 30 Under 30 in Science. Her research has been published at venues including ICML, KDD, WWW, Nature, and Nature Medicine, and she has also written for The New York Times, FiveThirtyEight, Wired, and various other publications.
Bobak J Mortazavi
+Michael C Hughes
+Elena Sizikova
+
Area Chairs
+
Kirk Roberts
+Neel Dey
+Sarah Tan
+Yuyin Zhou
+Sana Tonekaboni
+Jean Feng
+Jessica Dafflon
+Weimin Zhou
+Daniel Moyer
+Ayah Zirikly
+Martine De Cock
+Zepeng Frazier Huo
+Shengpu Tang
+Stephen R Pfohl
+Prince Ebenezer Adjei
+Vivek Kumar Singh
+Hossein Azizpour
+
Reviewers
+
Zhe Huang
+Intae Moon
+Daeun Kyung
+Tong Xia
+Kyle Heuton
+Emma Charlotte Rocheteau
+Jiacheng Zhu
+Jun Yu
+Charles B. Delahunt
+Erika Bondareva
+Wenbin Zhang
+Lei Lu
+Stefan Hegselmann
+Luna Zhang
+Stephanie Hyland
+Bret Nestor
+Chaoqi Yang
+Hyungyung Lee
+Vinod Kumar Chauhan
+Purity Mugambi
+Rahmatollah Beheshti
+Stefan Feuerriegel
+Julien Le Kernec
+Karan Bhanot
+Jielin Qiu
+Peniel N Argaw
+Ran Xu
+Helen Zhou
+Thomas Hartvigsen
+Elizabeth Healey
+Tal El Hay
+Ethan Harvey
+Karla Paniagua
+Shubhranshu Shekhar
+Jieshi Chen
+Wisdom Oluchi Ikezogwo
+Xuhai Xu
+Roozbeh Yousefzadeh
+Ismael Villanueva-Miranda
+Zhenbang Wu
+Srivamshi Pittala
+Aparna Balagopalan
+Frank Rudzicz
+Sicong Huang
+Shayan Fazeli
+Gyubok Lee
+Yanchao Tan
+Shahriar Noroozizadeh
+Afrah Shafquat
+Megan Coffee
+Abdullah Mamun
+Arinbjörn Kolbeinsson
+Xiaobin Shen
+Preetish Rath
+Chuizheng Meng
+Vincent Jeanselme
+Shalini Saini
+Yao Su
+Mehul Motani
+Vasundhara Agrawal
+Lorenzo A. Rossi
+Jungwoo Oh
+Jong Hak Moon
+Mehak Gupta
+Sandeep Angara
+Pablo Moreno-Muñoz
+Salvatore Tedesco
+Matthew M. Engelhard
+Adrienne Pichon
+Houliang Zhou
+Alex Fedorov
+Seongsu Bae
+Kyunghoon Hur
+Jichan Chung
+Antonio-José Sánchez-Salmerón
+Lucía Prieto Santamaría
+Cédric Wemmert
+Ioakeim Perros
+Rudraksh Tuwani
+Walter Gerych
+Alexander Woyczyk
+Declan O'Loughlin
+Esma Yildirim
+Carlo Sansone
+MohammadAli Shaeri
+Yuan Zhao
+Marta Avalos
+Wangzhi Dai
+Xiao Fan
+Elliot Creager
+Chang Hu
+Max Homilius
+Emmanuel Klu
+Heike Leutheuser
+Melissa Danielle McCradden
+Aniruddh Raghu
+Ajinkya K Mulay
+Jessilyn Dunn
+Jing Wang
+Jiajun Xu
+Chenwei Wu
+Bowen Song
+Kenny Moise
+
+
+
+
+
+
+
+
+
+
+
+
CHIL Sponsors
+
Thank you to our 2024 sponsors: Gordon and Betty Moore Foundation (Gold), Department of Health Outcomes and Biomedical Informatics at the University of Florida College of Medicine (Gold), Apple (Silver), Genentech (Silver), Google (Silver), The Mount Sinai Hospital (Silver), Computational Precision Health Program at UCSF / UC Berkeley (Silver), UF Health (Silver), Chase Center at University of Pennsylvania (Silver), Department of Biostatistics at University of Pennsylvania (Silver), Department of Biostatistics at Columbia University (Bronze), Health Data Science (Bronze), and the Department of Surgery at University of Minnesota (Bronze)!
Financial support ensures that CHIL remains accessible to a broad set of participants by offsetting the expenses involved in participation. We follow best practices in other conferences to maintain a transparent and appropriate relationship with our funders:
+
+
The substance and structure of the conference are determined independently by the program committees.
+
All papers are chosen through a rigorous, mutually anonymous peer review process, where authors disclose conflicts of interest.
+
All sources of financial support are acknowledged.
+
Benefits are publicly disclosed below.
+
Corporate sponsors cannot specify how contributions are spent.
+
+
+
+
+
2024 Sponsorship Levels
+
Sponsorship of the annual AHLI Conference on Health, Inference and Learning (CHIL) contributes to furthering research and interdisciplinary dialogue around machine learning and health. We deeply appreciate any amount of support your company or foundation can provide.
+
+
Diamond ($20,000 USD)
+
+
Prominent display of company logo on our website
+
Verbal acknowledgment of contribution in the opening and closing remarks of the conference
+
Access to the contact information and CVs of CHIL 2024 attendees who opt in to career opportunities
+
Dedicated time during the lunch break to present a 20-minute talk on the company's research or development in machine learning and health
+
Present demo during the poster session
+
Free registration for up to ten (10) representatives from your organization
+
Free company booth at the venue
+
+
Gold ($10,000 USD)
+
+
Prominent display of company logo on our website
+
Verbal acknowledgment of contribution in the opening and closing remarks of the conference
+
Present demo during the poster session
+
Free registration for up to five (5) representatives from your organization
+
Free company booth at the venue
+
+
Silver ($5,000 USD)
+
+
Prominent display of company logo on our website
+
Verbal acknowledgment of contribution in the opening and closing remarks of the conference
+
Free registration for up to two (2) representatives from your organization
+
+
Bronze ($2,000 USD)
+
+
Prominent display of company logo on our website
+
Free registration for one (1) representative from your organization
The AHLI Conference on Health, Inference, and Learning (CHIL) solicits work across a variety of disciplines at the intersection of machine learning and healthcare. CHIL 2024 invites submissions focused on artificial intelligence and machine learning (AI/ML) techniques that address challenges in health, which we view broadly as including clinical healthcare, public health, population health, and beyond.
+
Specifically, authors are invited to submit 8-10 page papers (with unlimited pages for references) to one of 3 possible tracks: Models and Methods, Applications and Practice, or Impact and Society. Each track is described in detail below. Authors will select exactly one primary track when they register each submission, in addition to one or more sub-disciplines. Appropriate track and sub-discipline selection will ensure that each submission is reviewed by a knowledgeable set of reviewers. Track Chairs will oversee the reviewing process. In case you are not sure which track your submission fits under, feel free to contact the Track or Proceedings Chairs for clarification. The Proceedings Chairs reserve the right to move submissions between tracks if they believe that a submission has been misclassified.
+
Important Dates (all times are anywhere on Earth, AoE)
+
+
Submissions due: Feb 16, 2024
+
Bidding opens for reviewers: Feb 17, 2024
+
Bidding closes for reviewers: Tue Feb 20, 2024
+
Papers assigned to reviewers: Wed Feb 21, 2024
+
Reviews due: Wed Mar 6, 2024
+
Author response period: Mar 12-19, 2024
+
Author / reviewer discussion period: Mar 19-26, 2024
+
Decision notification: Apr 3, 2024
+
CHIL conference: June 27-28, 2024
+
+
+
+
Submission Tracks
+
+
Track 1 - Models and Methods: Algorithms, Inference, and Estimation
+
Track 2 - Applications and Practice: Investigation, Evaluation, Interpretations, and Deployment
+
Track 3 - Impact and Society: Policy, Public Health, Social Outcomes, and Economics
+
+
+
+
+
+
Evaluation
+
Works submitted to CHIL will be reviewed by at least 3 reviewers. Reviewers will be asked to primarily judge the work according to the following criteria:
+
Relevance: Is the submission relevant to health, broadly construed? Does the problem addressed fall into the domains of machine learning and healthcare?
+
Quality: Is the submission technically sound? Are claims well supported by theoretical analysis or experimental results? Are the authors careful and honest about evaluating both the strengths and weaknesses of their work? Is the work complete rather than a work in progress?
+
Originality: Are the tasks, methods and results novel? Is it clear how this work differs from previous contributions? Is related work adequately cited to provide context? Does the submission contribute unique data, unique conclusions about existing data, or a unique theoretical or experimental approach?
+
Clarity: Is the submission clearly written? Is it well-organized? Does it adequately provide enough information for readers to reproduce experiments or results?
+
Significance: Is the contribution of the work important? Are other researchers or practitioners likely to use the ideas or build on them? Does the work advance the state of the art in a demonstrable way?
+
Final decisions will be made by Track and Proceedings Chairs, taking into account reviewer comments, ratings of confidence and expertise, and our own editorial judgment. Reviewers will be able to recommend that submissions change tracks or flag submissions for ethical issues, relevance and suitability concerns.
Submitted papers must be 8-10 pages (including all figures and tables). Unlimited additional pages can be used for references and additional supplementary materials (e.g. appendices). Reviewers will not be required to read the supplementary materials.
+
Authors are required to use the LaTeX template: Overleaf
+
+
+
+
+
Required Sections
+
Similar to last year, two sections will be required: 1) Data and Code Availability, and 2) Institutional Review Board (IRB).
+
Data and Code Availability: This initial paragraph is required. Briefly state what data you use (including citations if appropriate) and whether the data are available to other researchers. If you are not sharing code, you must explicitly state that you are not making your code available. If you are making your code available, then at the time of submission for review, please include your code as supplemental material or as a code repository link; in either case, your code must be anonymized. If your paper is accepted, then you should de-anonymize your code for the camera-ready version of the paper. If you do not include this data and code availability statement for your paper, or you provide code that is not anonymized at the time of submission, then your paper will be desk-rejected. Later sections (e.g., your experiments) may refer back to this initial statement if helpful, to avoid restating what data you use.
+
Institutional Review Board (IRB): This endmatter section is required. If your research requires IRB approval or has been designated by your IRB as Not Human Subject Research, then for the camera-ready version of the paper, you must provide IRB information (and at the time of submission for review, you can say that this IRB information will be provided if the paper is accepted). If your research does not require IRB approval, then you must state this to be the case. This section does not count toward the paper page limit.
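
A minimal sketch of what these two required statements might look like in a submission is below. The dataset name and wording are purely illustrative, and the exact sectioning commands depend on the official CHIL Overleaf template:

```latex
% Illustrative only -- adapt to the macros defined in the CHIL template.

% Required initial paragraph of the paper body:
\paragraph{Data and Code Availability}
This paper uses the (hypothetical) ExampleEHR dataset, which is publicly
available to credentialed researchers. Our code is included as anonymized
supplementary material and will be de-anonymized upon acceptance.

% Required endmatter section (does not count toward the page limit):
\section*{Institutional Review Board (IRB)}
This research does not require IRB approval because it analyzes only
de-identified, publicly available data.
```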
+
Archival Submissions
+
Submissions to the main conference are considered archival and will appear in the published proceedings of the conference, if accepted. Author notification of acceptance will be provided by the listed date under Important Dates.
+
+
+
Preprint Submission Policy
+
Submissions to preprint servers (such as arXiv or medRxiv) are allowed while the papers are under review. While reviewers will be encouraged not to search for the papers, you accept that uploading the paper may make your identity known.
+
Peer Review
+
The review process is mutually anonymous (aka “double blind”). Your submitted paper, as well as any supporting text or revisions provided during the discussion period, should be completely anonymized (including links to code repositories such as Github). Please do not include any identifying information, and refrain from citing the authors’ own prior work in anything other than third-person. Violations of this anonymity policy at any stage before final manuscript acceptance decisions may result in rejection without further review.
+
Conference organizers and reviewers are required to maintain confidentiality of submitted material. Upon acceptance, the titles, authorship, and abstracts of papers will be released prior to the conference.
+
You may not submit papers that are identical, or substantially similar to versions that are currently under review at another conference or journal, have been previously published, or have been accepted for publication. Submissions to the main conference are considered archival and will appear in the published proceedings of the conference if accepted.
+
An exception to this rule is extensions of papers that have previously appeared in non-archival venues, such as workshops, arXiv, or similar venues without formal proceedings. These works may be submitted as-is or in an extended form, though they must follow our manuscript formatting guidelines. CHIL also welcomes full paper submissions that extend previously published short papers or abstracts, so long as the previously published version does not exceed 4 pages in length. Note that the submission should not cite the earlier workshop paper or report, so as to preserve anonymity in the submitted manuscript.
+
Upon submission, authors will select one or more relevant sub-discipline(s). Peer reviewers for a paper will be experts in the sub-discipline(s) selected upon its submission.
+
Note: Senior Area Chairs (SACs) are prohibited from submitting manuscripts to their respective track. Area Chairs (ACs) who plan to submit papers to the track they were assigned to must notify the Track Senior Area Chair within 24 hours of submission.
+
Open Access
+
CHIL is committed to open science and ensuring our proceedings are freely available.
+
Responsible and Ethical Research
+
Computer software submissions should include an anonymized code link or code attached as supplementary material, licensing information, and provide documentation to facilitate use and reproducibility (e.g., package versions, README, intended use, and execution examples that facilitate execution by other researchers).
+
Submissions that include analysis on public datasets need to include appropriate citations and data sequestration protocols, including train/validation/test splits, where appropriate. Submissions that include analysis of non-public datasets need to additionally include information about data source, collection sites, subject demographics and subgroups statistics, data acquisition protocols, informed consent, IRB and any other information supporting evidence of adherence to data collection and release protocols.
+
Authors should discuss ethical implications and responsible uses of their work.
+
+
+
+
+
+
+
Reviewing for CHIL
+
Reviewing is a critical service in any research community, and we highly respect the expertise and contributions of our reviewers. Every submission deserves thoughtful, constructive feedback that:
+
+
Selects quality work to be highlighted at CHIL; and
+
Helps authors improve their work, either for CHIL or a future venue.
+
+
To deliver high-quality reviews, you are expected to participate in four phases of review: Bidding; Assignment; Review; Discussion. This guide is here to help you through each of these steps. Your insights and feedback make a big difference in our community and in the field of healthcare and machine learning.
+
Timeline
+
The four phases of review are summarized below:
+
+
Bidding
+
Skim abstracts
+
Suggest >10 submissions that you feel qualified to review
+
Time commitment: ~1 hour
+
+
+
Assignment
+
Skim your assigned papers and immediately report:
+
Major formatting issues
+
Anonymity or Conflict of Interest issues
+
Papers that you are not comfortable reviewing
+
+
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~10 minutes per paper
+
+
+
Review
+
Deliver a thoughtful, timely review for each assigned paper
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~2-5 hours per paper
+
+
+
Discussion
+
Provide comments that respond to author feedback, other reviewers, and chairs
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~1-2 hours per paper
+
+
+
+
Phase 1: Bidding
+
After the submission deadline, you will be invited to "bid" for your preferred papers in OpenReview, based on titles and abstracts. Bidding instructions will be provided via email. Please bid promptly and generously!
+
Phase 2: Assignment
+
After the bidding period closes, you will be formally assigned 2-5 papers to review. We ask you to promptly skim your papers to ensure:
+
+
no violations of required formatting rules (page limits, margins, etc)
+
no violations of anonymity (author names, institution names, github links, etc)
+
that you have sufficient expertise to review the paper
+
+
If you feel that you cannot offer an informed opinion about the quality of the paper due to expertise mismatch, please write to your assigned Area Chair on OpenReview. Area Chairs will do their best to ensure that each submission has the most competent reviewers available in the pool.
+
Phase 3: Review
+
You will be asked to complete thoughtful, constructive reviews for all assigned papers. Please ensure that your reviews are completed before the deadline, and sooner if possible. For each paper, you will fill out a form on OpenReview, similar to the form below. To help us to ensure consistency and quality, all reviews are subject to internal checks that may be manual or automated.
+
Review format
+
+
Summary of the paper
+
Summarize *your* understanding of the paper. Stick to the facts: ideally, the authors should agree with everything written here.
+
+
+
Strengths
+
Identify the promising aspects of the work.
+
+
+
Weaknesses
+
Every paper that does not meet the bar for publication is the scaffolding upon which a better research idea can be built. If you believe the work is insufficient, help the authors see where they can take their work and how.
+
If you are asking for more experiments, clearly explain why and outline what new information the experiment might offer.
+
+
+
Questions for the authors
+
Communicate what additional information would help you to evaluate the study.
+
Be explicit about how responses to your questions might change your score for the paper. Prioritize questions that could lead to large score changes.
+
+
+
+
Emergency reviewing
+
We will likely be seeking emergency reviewers for papers that do not receive all reviews by the deadline. Emergency reviewers will be sent a maximum of 3 papers and will need to write their reviews in a short time frame. Emergency review sign-up will be indicated in the reviewer sign-up form.
+
General advice for preparing reviews
+
Please strive to be timely, polite, and constructive, submitting reviews that you yourself would be happy to receive as an author. Be sure to review the paper, not the authors.
+
When making statements, use phrases like “the paper proposes” rather than “the authors propose”. This makes your review less personal and separates critiques of the submission from critiques of the authors.
+
External resources
+
If you would like feedback on a review, we recommend asking a mentor or colleague. When doing so, take care not breach confidentiality. Some helpful resources include:
Track specific advice for preparing reviews for a CHIL submission
+
+
Track 1: it is acceptable for a paper to use synthetic data to evaluate a proposed method. Not every paper must touch real health data, though all methods should be primarily motivated by health applications and the realism of the synthetic data is fair to critique
+
Track 2: the contribution of this track should be either more focused on solving a carefully motivated problem grounded in applications or on deployments or datasets that enable exploration and evaluation of applications
+
Track 3: meaningful contributions to this track can include a broader scope of contribution beyond algorithmic development. Innovative and impactful use of existing techniques is encouraged
+
+
Phase 4: Discussion
+
During the discussion period, you will be expected to participate in discussions on OpenReview by reading the authors’ responses and comments from other reviewers, adding additional comments from your perspective, and updating your review accordingly.
+
We expect brief but thoughtful engagement from all reviewers here. For some papers, this would involve several iterations of feedback-response. A simplistic response of “I have read the authors’ response and I chose to keep my score unchanged” is not sufficient, because it does not provide detailed reasoning about what weaknesses are still salient and why the response is not sufficient. Please engage meaningfully!
+
Track Chairs will work with reviewers to try to reach a consensus decision about each paper. In the event that consensus is not reached, Track Chairs make final decisions about acceptance.
+
+
+
+
Models and Methods: Algorithms, Inference, and Estimation
+
+
Advances in machine learning are critical for a better understanding of health. This track seeks technical contributions in modeling, inference, and estimation in health-focused or health-inspired settings. We welcome submissions that develop novel methods and algorithms, introduce relevant machine learning tasks, identify challenges with prevalent approaches, or learn from multiple sources of data (e.g. non-clinical and clinical data).
+
Our focus on health is broadly construed, including clinical healthcare, public health, and population health. While submissions should be primarily motivated by problems relevant to health, the contributions themselves are not required to be directly applied to real health data. For example, authors may use synthetic datasets to demonstrate properties of their proposed algorithms.
+
We welcome submissions from many perspectives, including but not limited to supervised learning, unsupervised learning, reinforcement learning, causal inference, representation learning, survival analysis, domain adaptation or generalization, interpretability, robustness, and algorithmic fairness. All kinds of health-relevant data types are in scope, including tabular health records, time series, text, images, videos, knowledge graphs, and more. We welcome all kinds of methodologies, from deep learning to probabilistic modeling to rigorous theory and beyond.
Applications and Practice: Investigation, Evaluation, Interpretation, and Deployment
+
+
The goal of this track is to highlight works applying robust methods, models, or practices to identify, characterize, audit, evaluate, or benchmark ML approaches to healthcare problems. We also welcome unique deployments and datasets used to empirically evaluate these systems; such work is necessary and important to advancing practice. Whereas the goal of Track 1 is to select papers that show significant algorithmic novelty, submit your work here if the contribution describes an innovative emerging or established application of ML in healthcare. Areas of interest include but are not limited to:
+
+
Datasets and simulation frameworks for addressing gaps in ML healthcare applications
+
Tools and platforms that facilitate integration of AI algorithms and deployment for healthcare applications
+
Innovative ML-based approaches to solving a practical problem grounded in a healthcare application
+
Surveys, benchmarks, evaluations and best practices of using ML in healthcare
+
Emerging applications of AI in healthcare
+
+
Introducing a new method is by no means prohibited for this track, but the focus should be on how the proposed ideas address a practical limitation (e.g., robustness, computational scalability, improved performance). We encourage submissions both in more traditional clinical areas (e.g., electronic health records (EHR), medical image analysis) and in emerging fields (e.g., remote and telehealth medicine, integration of omics).
Impact and Society: Policy, Public Health, and Social Outcomes
+
+
Algorithms do not exist in a vacuum: instead, they often explicitly aim for important social outcomes. This track considers issues at the intersection of algorithms and the societies they seek to impact, specifically for health. Submissions could include methodological contributions such as algorithmic development and performance evaluation for policy and public health applications, large-scale or challenging data collection, combining clinical and non-clinical data, as well as detecting and measuring bias. Submissions could also include impact-oriented research such as determining how algorithmic systems for health may introduce, exacerbate, or reduce inequities and inequalities, discrimination, and unjust outcomes, as well as evaluating the economic implications of these systems. We invite submissions tackling the responsible design of AI applications for healthcare and public health. System design for the implementation of such applications at scale is also welcome, which often requires balancing various tradeoffs in decision-making. Submissions related to understanding barriers to the deployment and adoption of algorithmic systems for societal-level health applications are also of interest. In addressing these problems, insights from social sciences, law, clinical medicine, and the humanities can be crucial.
Thank you to our 2024 sponsors: Gordon and Betty Moore Foundation (Gold), Department of Health Outcomes and Biomedical Informatics at UFlorida College of Medicine (Gold), Apple (Silver), Genentech (Silver), Google (Silver), The Mount Sinai Hospital (Silver), Computational Precision Health Program at UCSF / UC Berkeley (Silver), UF Health (Silver), Chase Center at University of Pennsylvania (Silver), Department of Biostatistics at University of Pennsylvania (Silver), Department of Biostatistics at Columbia University (Bronze), Health Data Science (Bronze), and the Department of Surgery at University of Minnesota (Bronze)!
The AHLI Conference on Health, Inference, and Learning (CHIL) solicits work across a variety of disciplines at the intersection of machine learning and healthcare. CHIL 2024 invites submissions focused on artificial intelligence and machine learning (AI/ML) techniques that address challenges in health, which we view broadly as including clinical healthcare, public health, population health, and beyond.
+
Specifically, authors are invited to submit 8-10 page papers (with unlimited pages for references) to one of 3 possible tracks: Models and Methods, Applications and Practice, or Impact and Society. Each track is described in detail below. Authors will select exactly one primary track when they register each submission, in addition to one or more sub-disciplines. Appropriate track and sub-discipline selection will ensure that each submission is reviewed by a knowledgeable set of reviewers. Track Chairs will oversee the reviewing process. In case you are not sure which track your submission fits under, feel free to contact the Track or Proceedings Chairs for clarification. The Proceedings Chairs reserve the right to move submissions between tracks if they believe that a submission has been misclassified.
+
Important Dates (all times are anywhere on Earth, AoE)
+
+
Submissions due: Feb 16, 2024
+
Bidding opens for reviewers: Feb 17, 2024
+
Bidding closes for reviewers: Feb 20, 2024
+
Papers assigned to reviewers: Feb 21, 2024
+
Reviews due: Mar 6, 2024
+
Author response period: Mar 12-19, 2024
+
Author / reviewer discussion period: Mar 19-26, 2024
+
Decision notification: Apr 3, 2024
+
CHIL conference: June 27-28, 2024
+
+
+
+
Submission Tracks
+
+
Track 1 - Models and Methods: Algorithms, Inference, and Estimation
+
Track 2 - Applications and Practice: Investigation, Evaluation, Interpretation, and Deployment
+
Track 3 - Impact and Society: Policy, Public Health, Social Outcomes, and Economics
+
+
+
+
+
+
Evaluation
+
Works submitted to CHIL will be reviewed by at least 3 reviewers. Reviewers will be asked to primarily judge the work according to the following criteria:
+
Relevance: Is the submission relevant to health, broadly construed? Does the problem addressed fall into the domains of machine learning and healthcare?
+
Quality: Is the submission technically sound? Are claims well supported by theoretical analysis or experimental results? Are the authors careful and honest about evaluating both the strengths and weaknesses of their work? Is the work complete rather than a work in progress?
+
Originality: Are the tasks, methods and results novel? Is it clear how this work differs from previous contributions? Is related work adequately cited to provide context? Does the submission contribute unique data, unique conclusions about existing data, or a unique theoretical or experimental approach?
+
Clarity: Is the submission clearly written? Is it well-organized? Does it provide enough information for readers to reproduce experiments or results?
+
Significance: Is the contribution of the work important? Are other researchers or practitioners likely to use the ideas or build on them? Does the work advance the state of the art in a demonstrable way?
+
Final decisions will be made by Track and Proceedings Chairs, taking into account reviewer comments, ratings of confidence and expertise, and our own editorial judgment. Reviewers will be able to recommend that submissions change tracks, or to flag submissions for ethical issues or concerns about relevance and suitability.
Submitted papers must be 8-10 pages (including all figures and tables). Unlimited additional pages can be used for references and additional supplementary materials (e.g. appendices). Reviewers will not be required to read the supplementary materials.
+
Authors are required to use the LaTeX template: Overleaf
+
+
+
+
+
Required Sections
+
Similar to last year, two sections will be required: 1) Data and Code Availability, and 2) Institutional Review Board (IRB).
+
Data and Code Availability: This initial paragraph is required. Briefly state what data you use (including citations if appropriate) and whether the data are available to other researchers. If you are not sharing code, you must explicitly state that you are not making your code available. If you are making your code available, then at the time of submission for review, please include your code as supplemental material or as a code repository link; in either case, your code must be anonymized. If your paper is accepted, then you should de-anonymize your code for the camera-ready version of the paper. If you do not include this data and code availability statement for your paper, or you provide code that is not anonymized at the time of submission, then your paper will be desk-rejected. Later sections describing your experiments can refer back to this initial data and code availability statement if helpful (e.g., to avoid restating what data you use).
+
Institutional Review Board (IRB): This endmatter section is required. If your research requires IRB approval or has been designated by your IRB as Not Human Subject Research, then for the camera-ready version of the paper, you must provide IRB information (and at the time of submission for review, you can say that this IRB information will be provided if the paper is accepted). If your research does not require IRB approval, then you must state this to be the case. This section does not count toward the paper page limit.
+
Archival Submissions
+
Submissions to the main conference are considered archival and will appear in the published proceedings of the conference, if accepted. Author notification of acceptance will be provided by the listed date under Important Dates.
+
+
+
Preprint Submission Policy
+
Submissions to preprint servers (such as arXiv or medRxiv) are allowed while the papers are under review. While reviewers will be encouraged not to search for the papers, you accept that uploading the paper may make your identity known.
+
Peer Review
+
The review process is mutually anonymous (aka “double blind”). Your submitted paper, as well as any supporting text or revisions provided during the discussion period, should be completely anonymized (including links to code repositories such as Github). Please do not include any identifying information, and refrain from citing the authors’ own prior work in anything other than third-person. Violations of this anonymity policy at any stage before final manuscript acceptance decisions may result in rejection without further review.
+
Conference organizers and reviewers are required to maintain confidentiality of submitted material. Upon acceptance, the titles, authorship, and abstracts of papers will be released prior to the conference.
+
You may not submit papers that are identical or substantially similar to versions that are currently under review at another conference or journal, have been previously published, or have been accepted for publication. Submissions to the main conference are considered archival and will appear in the published proceedings of the conference if accepted.
+
An exception to this rule is extensions of papers that have previously appeared in non-archival venues, such as workshops, arXiv, or similar venues without formal proceedings. These works may be submitted as-is or in an extended form, though they must follow our manuscript formatting guidelines. CHIL also welcomes full paper submissions that extend previously published short papers or abstracts, so long as the previously published version does not exceed 4 pages in length. Note that the submission should not cite the earlier workshop paper or report, so as to preserve anonymity in the submitted manuscript.
+
Upon submission, authors will select one or more relevant sub-discipline(s). Peer reviewers for a paper will be experts in the sub-discipline(s) selected upon its submission.
+
Note: Senior Area Chairs (SACs) are prohibited from submitting manuscripts to their respective track. Area Chairs (ACs) who plan to submit papers to the track they were assigned to need to notify the Track SAC within 24 hours of submission.
+
Open Access
+
CHIL is committed to open science and ensuring our proceedings are freely available.
+
Responsible and Ethical Research
+
Computer software submissions should include an anonymized code link or code attached as supplementary material, licensing information, and documentation that facilitates use and reproducibility (e.g., package versions, a README, intended use, and execution examples that allow other researchers to run the code).
+
Submissions that include analysis on public datasets need to include appropriate citations and data sequestration protocols, including train/validation/test splits, where appropriate. Submissions that include analysis of non-public datasets must additionally include information about the data source, collection sites, subject demographics and subgroup statistics, data acquisition protocols, informed consent, IRB approval, and any other information supporting evidence of adherence to data collection and release protocols.
+
Authors should discuss ethical implications and responsible uses of their work.
+
+
+
+
+
+
+
Reviewing for CHIL
+
Reviewing is a critical service in any research community, and we highly respect the expertise and contributions of our reviewers. Every submission deserves thoughtful, constructive feedback that:
+
+
Selects quality work to be highlighted at CHIL; and
+
Helps authors improve their work, either for CHIL or a future venue.
+
+
To deliver high-quality reviews, you are expected to participate in four phases of review: Bidding; Assignment; Review; Discussion. This guide is here to help you through each of these steps. Your insights and feedback make a big difference in our community and in the field of healthcare and machine learning.
+
Timeline
+
The four phases of review and their approximate time commitments are summarized below:
+
+
Bidding
+
Skim abstracts
+
Suggest >10 submissions that you feel qualified to review
+
Time commitment: ~1 hour
+
+
+
Assignment
+
Skim your assigned papers and immediately report:
+
Major formatting issues
+
Anonymity or Conflict of Interest issues
+
Papers that you are not comfortable reviewing
+
+
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~10 minutes per paper
+
+
+
Review
+
Deliver a thoughtful, timely review for each assigned paper
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~2-5 hours per paper
+
+
+
Discussion
+
Provide comments that respond to author feedback, other reviewers, and chairs
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~1-2 hours per paper
+
+
+
+
Phase 1: Bidding
+
After the submission deadline, you will be invited to "bid" for your preferred papers in OpenReview, based on titles and abstracts. Bidding instructions will be provided via email. Please bid promptly and generously!
+
Phase 2: Assignment
+
After the bidding period closes, you will be formally assigned 2-5 papers to review. We ask you to promptly skim your papers to ensure:
+
+
no violations of required formatting rules (page limits, margins, etc)
+
no violations of anonymity (author names, institution names, github links, etc)
+
that you have sufficient expertise to review the paper
+
+
If you feel that you cannot offer an informed opinion about the quality of the paper due to expertise mismatch, please write to your assigned Area Chair on OpenReview. Area Chairs will do their best to ensure that each submission has the most competent reviewers available in the pool.
+
Phase 3: Review
+
You will be asked to complete thoughtful, constructive reviews for all assigned papers. Please ensure that your reviews are completed before the deadline, and sooner if possible. For each paper, you will fill out a form on OpenReview, similar to the form below. To help us to ensure consistency and quality, all reviews are subject to internal checks that may be manual or automated.
+
Review format
+
+
Summary of the paper
+
Summarize *your* understanding of the paper. Stick to the facts: ideally, the authors should agree with everything written here.
+
+
+
Strengths
+
Identify the promising aspects of the work.
+
+
+
Weaknesses
+
Every paper that does not meet the bar for publication is the scaffolding upon which a better research idea can be built. If you believe the work is insufficient, help the authors see where they can take their work and how.
+
If you are asking for more experiments, clearly explain why and outline what new information the experiment might offer.
+
+
+
Questions for the authors
+
Communicate what additional information would help you to evaluate the study.
+
Be explicit about how responses to your questions might change your score for the paper. Prioritize questions that might lead to big potential score changes.
+
+
+
+
Emergency reviewing
+
We will likely be seeking emergency reviewers for papers that do not receive all reviews by the deadline. Emergency reviewers will be sent a maximum of 3 papers and will need to write their reviews in a short time frame. Emergency review sign-up will be indicated in the reviewer sign-up form.
+
General advice for preparing reviews
+
Please strive to be timely, polite, and constructive, submitting reviews that you yourself would be happy to receive as an author. Be sure to review the paper, not the authors.
+
When making statements, use phrases like “the paper proposes” rather than “the authors propose”. This makes your review less personal and separates critiques of the submission from critiques of the authors.
+
External resources
+
If you would like feedback on a review, we recommend asking a mentor or colleague. When doing so, take care not to breach confidentiality.
Track-specific advice for preparing reviews for a CHIL submission
+
+
Track 1: it is acceptable for a paper to use synthetic data to evaluate a proposed method. Not every paper must touch real health data, though all methods should be primarily motivated by health applications, and the realism of the synthetic data is fair to critique.
+
Track 2: contributions to this track should focus either on solving a carefully motivated problem grounded in applications, or on deployments or datasets that enable the exploration and evaluation of applications.
+
Track 3: meaningful contributions to this track can have a broader scope than algorithmic development. Innovative and impactful use of existing techniques is encouraged.
+
+
Phase 4: Discussion
+
During the discussion period, you will be expected to participate in discussions on OpenReview by reading the authors’ responses and comments from other reviewers, adding additional comments from your perspective, and updating your review accordingly.
+
We expect brief but thoughtful engagement from all reviewers here. For some papers, this may involve several iterations of feedback and response. A simplistic response of “I have read the authors’ response and I choose to keep my score unchanged” is not sufficient, because it does not explain which weaknesses remain salient and why the response does not address them. Please engage meaningfully!
+
Track Chairs will work with reviewers to try to reach a consensus decision about each paper. In the event that consensus is not reached, Track Chairs make final decisions about acceptance.
+
+
+
+
+
An exception to this rule is extensions of workshop papers that have previously appeared in non-archival venues, such as workshops, arXiv, or similar without formal proceedings. These works may be submitted as-is or in an extended form, though they must follow our manuscript formatting guidelines. CHIL also welcomes full paper submissions that extend previously published short papers or abstracts, so long as the previously published version does not exceed 4 pages in length. Note that the submission should not cite the workshop/report and preserve anonymity in the submitted manuscript.
+
Upon submission, authors will select one or more relevant sub-discipline(s). Peer reviewers for a paper will be experts in the sub-discipline(s) selected upon its submission.
+
Note: Senior Area Chairs (AC) are prohibited from submitting manuscripts to their respective track. Area Chairs (AC) who plan to submit papers to the track they were assigned to need to notify the Track Senior Area Chair (AC) within 24 hours of submission.
+
Open Access
+
CHIL is committed to open science and ensuring our proceedings are freely available.
+
Responsible and Ethical Research
+
Computer software submissions should include an anonymized code link or code attached as supplementary material, licensing information, and provide documentation to facilitate use and reproducibility (e.g., package versions, README, intended use, and execution examples that facilitate execution by other researchers).
+
Submissions that include analysis on public datasets need to include appropriate citations and data sequestration protocols, including train/validation/test splits, where appropriate. Submissions that include analysis of non-public datasets need to additionally include information about data source, collection sites, subject demographics and subgroups statistics, data acquisition protocols, informed consent, IRB and any other information supporting evidence of adherence to data collection and release protocols.
+
Authors should discuss ethical implications and responsible uses of their work.
+
+
+
+
+
+
+
Reviewing for CHIL
+
Reviewing is a critical service in any research community, and we highly respect the expertise and contributions of our reviewers. Every submission deserves thoughtful, constructive feedback that:
+
+
Selects quality work to be highlighted at CHIL; and
+
Helps authors improve their work, either for CHIL or a future venue.
+
+
To deliver high-quality reviews, you are expected to participate in four phases of review: Bidding; Assignment; Review; Discussion. This guide is here to help you through each of these steps. Your insights and feedback make a big difference in our community and in the field of healthcare and machine learning.
+
Timeline
+
To deliver high-quality reviews, you are expected to participate in the four phases of review:
+
+
Bidding
+
Skim abstracts
+
Suggest >10 submissions that you feel qualified to review
+
Time commitment: ~1 hour
+
+
+
Assignment
+
Skim your assigned papers and immediately report:
+
Major formatting issues
+
Anonymity or Conflict of Interest issues
+
Papers that you are not comfortable reviewing
+
+
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~10 minutes per paper
+
+
+
Review
+
Deliver a thoughtful, timely review for each assigned paper
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~2-5 hours per paper
+
+
+
Discussion
+
Provide comments that respond to author feedback, other reviewers, and chairs
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~1-2 hours per paper
+
+
+
+
Phase 1: Bidding
+
After the submission deadline, you will be invited to "bid" for your preferred papers in OpenReview, based on titles and abstracts. Bidding instructions will be provided via email. Please bid promptly and generously!
+
Phase 2: Assignment
+
After the bidding period closes, you will be formally assigned 2-5 papers to review. We ask you to promptly skim your papers to ensure:
+
+
no violations of required formatting rules (page limits, margins, etc)
+
no violations of anonymity (author names, institution names, github links, etc)
+
that you have sufficient expertise to review the paper
+
+
If you feel that you cannot offer an informed opinion about the quality of the paper due to expertise mismatch, please write to your assigned Area Chair on OpenReview. Area Chairs will do their best to ensure that each submission has the most competent reviewers available in the pool.
+
Phase 3: Review
+
You will be asked to complete thoughtful, constructive reviews for all assigned papers. Please ensure that your reviews are completed before the deadline, and sooner if possible. For each paper, you will fill out a form on OpenReview, similar to the form below. To help us to ensure consistency and quality, all reviews are subject to internal checks that may be manual or automated.
+
Review format
+
+
Summary of the paper
+
Summarize *your* understanding of the paper. Stick to the facts: ideally, the authors should agree with everything written here.
+
+
+
Strengths
+
Identify the promising aspects of the work.
+
+
+
Weaknesses
+
Every paper that does not meet the bar for publication is the scaffolding upon which a better research idea can be built. If you believe the work is insufficient, help the authors see where they can take their work and how.
+
If you are asking for more experiments, clearly explain why and outline what new information the experiment might offer.
+
+
+
Questions for the authors
+
Communicate what additional information would help you to evaluate the study.
+
Be explicit about how responses your questions might change your score for the paper. Prioritize questions that might lead to big potential score changes.
+
+
+
+
Emergency reviewing
+
We will likely be seeking emergency reviewers for papers that do not receive all reviews by the deadline. Emergency reviewers will be sent a maximum of 3 papers and will need to write their reviews in a short time frame. Emergency review sign-up will be indicated in the reviewer sign-up form.
+
General advice for preparing reviews
+
Please strive to be timely, polite, and constructive, submitting reviews that you yourself would be happy to receive as an author. Be sure to review the paper, not the authors.
+
When making statements, use phrases like “the paper proposes” rather than “the authors propose”. This makes your review less personal and separates critiques of the submission from critiques of the authors.
+
External resources
+
If you would like feedback on a review, we recommend asking a mentor or colleague. When doing so, take care not breach confidentiality. Some helpful resources include:
Track specific advice for preparing reviews for a CHIL submission
+
+
Track 1: it is acceptable for a paper to use synthetic data to evaluate a proposed method. Not every paper must touch real health data, though all methods should be primarily motivated by health applications and the realism of the synthetic data is fair to critique
+
Track 2: the contribution of this track should be either more focused on solving a carefully motivated problem grounded in applications or on deployments or datasets that enable exploration and evaluation of applications
+
Track 3: meaningful contributions to this track can include a broader scope of contribution beyond algorithmic development. Innovative and impactful use of existing techniques is encouraged
+
+
Phase 4: Discussion
+
During the discussion period, you will be expected to participate in discussions on OpenReview by reading the authors’ responses and comments from other reviewers, adding additional comments from your perspective, and updating your review accordingly.
+
We expect brief but thoughtful engagement from all reviewers here. For some papers, this would involve several iterations of feedback-response. A simplistic response of “I have read the authors’ response and I chose to keep my score unchanged” is not sufficient, because it does not provide detailed reasoning about what weaknesses are still salient and why the response is not sufficient. Please engage meaningfully!
+
Track Chairs will work with reviewers to try to reach a consensus decision about each paper. In the event that consensus is not reached, Track Chairs make final decisions about acceptance.
+
+
+
+
Models and Methods: Algorithms, Inference, and Estimation
+
+
Advances in machine learning are critical for a better understanding of health. This track seeks technical contributions in modeling, inference, and estimation in health-focused or health-inspired settings. We welcome submissions that develop novel methods and algorithms, introduce relevant machine learning tasks, identify challenges with prevalent approaches, or learn from multiple sources of data (e.g. non-clinical and clinical data).
+
Our focus on health is broadly construed, including clinical healthcare, public health, and population health. While submissions should be primarily motivated by problems relevant to health, the contributions themselves are not required to be directly applied to real health data. For example, authors may use synthetic datasets to demonstrate properties of their proposed algorithms.
+
We welcome submissions from many perspectives, including but not limited to supervised learning, unsupervised learning, reinforcement learning, causal inference, representation learning, survival analysis, domain adaptation or generalization, interpretability, robustness, and algorithmic fairness. All kinds of health-relevant data types are in scope, including tabular health records, time series, text, images, videos, knowledge graphs, and more. We welcome all kinds of methodologies, from deep learning to probabilistic modeling to rigorous theory and beyond.
Applications and Practice: Investigation, Evaluation, Interpretation, and Deployment
+
+
The goal of this track is to highlight works applying robust methods, models, or practices to identify, characterize, audit, evaluate, or benchmark ML approaches to healthcare problems. Additionally, we welcome unique deployments and datasets used to empirically evaluate these systems are necessary and important to advancing practice. Whereas the goal of Track 1 is to select papers that show significant algorithmic novelty, submit your work here if the contribution is describing an emerging or established innovative application of ML in healthcare. Areas of interest include but are not limited to:
+
+
Datasets and simulation frameworks for addressing gaps in ML healthcare applications
+
Tools and platforms that facilitate integration of AI algorithms and deployment for healthcare applications
+
Innovative ML-based approaches to solving a practical problems grounded in a healthcare application
+
Surveys, benchmarks, evaluations and best practices of using ML in healthcare
+
Emerging applications of AI in healthcare
+
+
Introducing a new method is not prohibited by any means for this track, but the focus should be on the extent of how the proposed ideas contribute to addressing a practical limitation (e.g., robustness, computational scalability, improved performance). We encourage submissions in both more traditional clinical areas (e.g., electronic health records (EHR), medical image analysis), as well as in emerging fields (e.g., remote and telehealth medicine, integration of omics).
Impact and Society: Policy, Public Health, and Social Outcomes
+
+
Algorithms do not exist in a vacuum: instead, they often explicitly aim for important social outcomes. This track considers issues at the intersection of algorithms and the societies they seek to impact, specifically for health. Submissions could include methodological contributions such as algorithmic development and performance evaluation for policy and public health applications, large-scale or challenging data collection, combining clinical and non-clinical data, as well as detecting and measuring bias. Submissions could also include impact-oriented research such as determining how algorithmic systems for health may introduce, exacerbate, or reduce inequities and inequalities, discrimination, and unjust outcomes, as well as evaluating the economic implications of these systems. We invite submissions tackling the responsible design of AI applications for healthcare and public health. System design for the implementation of such applications at scale is also welcome, which often requires balancing various tradeoffs in decision-making. Submissions related to understanding barriers to the deployment and adoption of algorithmic systems for societal-level health applications are also of interest. In addressing these problems, insights from social sciences, law, clinical medicine, and the humanities can be crucial.
Thank you to our 2024 sponsors: Gordon and Betty Moore Foundation (Gold), Department of Health Outcomes and Biomedical Informatics at UFlorida College of Medicine (Gold), Apple (Silver), Genentech (Silver), Google (Silver), The Mount Sinai Hospital (Silver), Computational Precision Health Program at UCSF / UC Berkeley (Silver), UF Health (Silver), Chase Center at University of Pennsylvania (Silver), Department of Biostatistics at University of Pennsylvania (Silver), Department of Biostatistics at Columbia University (Bronze), Health Data Science (Bronze), and the Department of Surgery at University of Minnesota (Bronze)!
The AHLI Conference on Health, Inference, and Learning (CHIL) solicits work across a variety of disciplines at the intersection of machine learning and healthcare. CHIL 2024 invites submissions focused on artificial intelligence and machine learning (AI/ML) techniques that address challenges in health, which we view broadly as including clinical healthcare, public health, population health, and beyond.
+
Specifically, authors are invited to submit 8-10 page papers (with unlimited pages for references) to one of 3 possible tracks: Models and Methods, Applications and Practice, or Impact and Society. Each track is described in detail below. Authors will select exactly one primary track when they register each submission, in addition to one or more sub-disciplines. Appropriate track and sub-discipline selection will ensure that each submission is reviewed by a knowledgeable set of reviewers. Track Chairs will oversee the reviewing process. In case you are not sure which track your submission fits under, feel free to contact the Track or Proceedings Chairs for clarification. The Proceedings Chairs reserve the right to move submissions between tracks if they believe that a submission has been misclassified.
+
Important Dates (all times are anywhere on Earth, AoE)
+
+
Submissions due: Feb 16, 2024
+
Bidding opens for reviewers: Feb 17, 2024
+
Bidding closes for reviewers: Feb 20, 2024
+
Papers assigned to reviewers: Feb 21, 2024
+
Reviews due: Mar 6, 2024
+
Author response period: Mar 12-19, 2024
+
Author / reviewer discussion period: Mar 19-26, 2024
+
Decision notification: Apr 3, 2024
+
CHIL conference: June 27-28, 2024
+
+
+
+
Submission Tracks
+
+
Track 1 - Models and Methods: Algorithms, Inference, and Estimation
+
Track 2 - Applications and Practice: Investigation, Evaluation, Interpretation, and Deployment
+
Track 3 - Impact and Society: Policy, Public Health, Social Outcomes, and Economics
+
+
+
+
+
+
Evaluation
+
Works submitted to CHIL will be reviewed by at least 3 reviewers. Reviewers will be asked to primarily judge the work according to the following criteria:
+
Relevance: Is the submission relevant to health, broadly construed? Does the problem addressed fall into the domains of machine learning and healthcare?
+
Quality: Is the submission technically sound? Are claims well supported by theoretical analysis or experimental results? Are the authors careful and honest about evaluating both the strengths and weaknesses of their work? Is the work complete rather than a work in progress?
+
Originality: Are the tasks, methods and results novel? Is it clear how this work differs from previous contributions? Is related work adequately cited to provide context? Does the submission contribute unique data, unique conclusions about existing data, or a unique theoretical or experimental approach?
+
Clarity: Is the submission clearly written? Is it well-organized? Does it adequately provide enough information for readers to reproduce experiments or results?
+
Significance: Is the contribution of the work important? Are other researchers or practitioners likely to use the ideas or build on them? Does the work advance the state of the art in a demonstrable way?
+
Final decisions will be made by Track and Proceedings Chairs, taking into account reviewer comments, ratings of confidence and expertise, and our own editorial judgment. Reviewers will be able to recommend that submissions change tracks or flag submissions for ethical issues, relevance and suitability concerns.
Submitted papers must be 8-10 pages (including all figures and tables). Unlimited additional pages can be used for references and additional supplementary materials (e.g. appendices). Reviewers will not be required to read the supplementary materials.
+
Authors are required to use the LaTeX template: Overleaf
+
+
+
+
+
Required Sections
+
Similar to last year, two sections will be required: 1) Data and Code Availability, and 2) Institutional Review Board (IRB).
+
Data and Code Availability: This initial paragraph is required. Briefly state what data you use (including citations if appropriate) and whether the data are available to other researchers. If you are not sharing code, you must explicitly state that you are not making your code available. If you are making your code available, then at the time of submission for review, please include your code as supplemental material or as a code repository link; in either case, your code must be anonymized. If your paper is accepted, then you should de-anonymize your code for the camera-ready version of the paper. If you do not include this data and code availability statement for your paper, or you provide code that is not anonymized at the time of submission, then your paper will be desk-rejected. Your experiments section may later refer back to this initial data and code availability statement if helpful (e.g., to avoid restating what data you use).
+
Institutional Review Board (IRB): This endmatter section is required. If your research requires IRB approval or has been designated by your IRB as Not Human Subject Research, then for the camera-ready version of the paper, you must provide IRB information (at the time of submission for review, you can state that this IRB information will be provided if the paper is accepted). If your research does not require IRB approval, then you must state that this is the case. This section does not count toward the paper page limit.
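The two required sections can be sketched in a LaTeX skeleton. This is an illustrative sketch only: the document class, section commands, and bibliography style below are generic placeholders, and actual submissions must use the official CHIL Overleaf template linked above.

```latex
\documentclass{article} % placeholder; use the official CHIL Overleaf template class

\begin{document}

\title{Anonymized Submission Title}
\maketitle

% Required initial paragraph
\paragraph{Data and Code Availability} State what data you use (with
citations) and whether the data are available to other researchers.
% Any code link here must be anonymized at submission time.

\section{Introduction}
% ... main content: 8-10 pages including all figures and tables ...

% Required endmatter (does not count toward the page limit)
\paragraph{Institutional Review Board (IRB)} State IRB approval status,
or state that the research does not require IRB approval.

% References use unlimited additional pages
\bibliographystyle{plainnat} % placeholder style
% \bibliography{refs}

\end{document}
```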
+
Archival Submissions
+
Submissions to the main conference are considered archival and will appear in the published proceedings of the conference, if accepted. Author notification of acceptance will be provided by the listed date under Important Dates.
+
+
+
Preprint Submission Policy
+
Submissions to preprint servers (such as ArXiv or MedRxiv) are allowed while the papers are under review. While reviewers will be encouraged not to search for the papers, you accept that uploading the paper may make your identity known.
+
Peer Review
+
The review process is mutually anonymous (aka “double blind”). Your submitted paper, as well as any supporting text or revisions provided during the discussion period, should be completely anonymized (including links to code repositories such as GitHub). Please do not include any identifying information, and refer to your own prior work only in the third person. Violations of this anonymity policy at any stage before final manuscript acceptance decisions may result in rejection without further review.
+
Conference organizers and reviewers are required to maintain confidentiality of submitted material. Upon acceptance, the titles, authorship, and abstracts of papers will be released prior to the conference.
+
You may not submit papers that are identical or substantially similar to versions that are currently under review at another conference or journal, have been previously published, or have been accepted for publication. Submissions to the main conference are considered archival and will appear in the published proceedings of the conference if accepted.
+
An exception to this rule is extensions of papers that have previously appeared in non-archival venues, such as workshops, arXiv, or similar venues without formal proceedings. These works may be submitted as-is or in extended form, though they must follow our manuscript formatting guidelines. CHIL also welcomes full paper submissions that extend previously published short papers or abstracts, so long as the previously published version does not exceed 4 pages in length. Note that the submission should not cite the earlier workshop paper or report, in order to preserve anonymity in the submitted manuscript.
+
Upon submission, authors will select one or more relevant sub-discipline(s). Peer reviewers for a paper will be experts in the sub-discipline(s) selected upon its submission.
+
Note: Senior Area Chairs (SACs) are prohibited from submitting manuscripts to their respective track. Area Chairs (ACs) who plan to submit papers to the track they were assigned to must notify the Track Senior Area Chair within 24 hours of submission.
+
Open Access
+
CHIL is committed to open science and ensuring our proceedings are freely available.
+
Responsible and Ethical Research
+
Computer software submissions should include an anonymized code link or code attached as supplementary material, licensing information, and provide documentation to facilitate use and reproducibility (e.g., package versions, README, intended use, and execution examples that facilitate execution by other researchers).
+
Submissions that include analysis on public datasets need to include appropriate citations and data sequestration protocols, including train/validation/test splits, where appropriate. Submissions that include analysis of non-public datasets need to additionally include information about data source, collection sites, subject demographics and subgroups statistics, data acquisition protocols, informed consent, IRB and any other information supporting evidence of adherence to data collection and release protocols.
+
Authors should discuss ethical implications and responsible uses of their work.
+
+
+
+
+
+
+
Reviewing for CHIL
+
Reviewing is a critical service in any research community, and we highly respect the expertise and contributions of our reviewers. Every submission deserves thoughtful, constructive feedback that:
+
+
Selects quality work to be highlighted at CHIL; and
+
Helps authors improve their work, either for CHIL or a future venue.
+
+
To deliver high-quality reviews, you are expected to participate in four phases of review: Bidding; Assignment; Review; Discussion. This guide is here to help you through each of these steps. Your insights and feedback make a big difference in our community and in the field of healthcare and machine learning.
+
Timeline
+
To deliver high-quality reviews, you are expected to participate in the four phases of review:
+
+
Bidding
+
Skim abstracts
+
Suggest >10 submissions that you feel qualified to review
+
Time commitment: ~1 hour
+
+
+
Assignment
+
Skim your assigned papers and immediately report:
+
Major formatting issues
+
Anonymity or Conflict of Interest issues
+
Papers that you are not comfortable reviewing
+
+
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~10 minutes per paper
+
+
+
Review
+
Deliver a thoughtful, timely review for each assigned paper
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~2-5 hours per paper
+
+
+
Discussion
+
Provide comments that respond to author feedback, other reviewers, and chairs
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~1-2 hours per paper
+
+
+
+
Phase 1: Bidding
+
After the submission deadline, you will be invited to "bid" for your preferred papers in OpenReview, based on titles and abstracts. Bidding instructions will be provided via email. Please bid promptly and generously!
+
Phase 2: Assignment
+
After the bidding period closes, you will be formally assigned 2-5 papers to review. We ask you to promptly skim your papers to ensure:
+
+
no violations of required formatting rules (page limits, margins, etc.)
+
no violations of anonymity (author names, institution names, GitHub links, etc.)
+
that you have sufficient expertise to review the paper
+
+
If you feel that you cannot offer an informed opinion about the quality of the paper due to expertise mismatch, please write to your assigned Area Chair on OpenReview. Area Chairs will do their best to ensure that each submission has the most competent reviewers available in the pool.
+
Phase 3: Review
+
You will be asked to complete thoughtful, constructive reviews for all assigned papers. Please ensure that your reviews are completed before the deadline, and sooner if possible. For each paper, you will fill out a form on OpenReview, similar to the form below. To help us to ensure consistency and quality, all reviews are subject to internal checks that may be manual or automated.
+
Review format
+
+
Summary of the paper
+
Summarize *your* understanding of the paper. Stick to the facts: ideally, the authors should agree with everything written here.
+
+
+
Strengths
+
Identify the promising aspects of the work.
+
+
+
Weaknesses
+
Every paper that does not meet the bar for publication is the scaffolding upon which a better research idea can be built. If you believe the work is insufficient, help the authors see where they can take their work and how.
+
If you are asking for more experiments, clearly explain why and outline what new information the experiment might offer.
+
+
+
Questions for the authors
+
Communicate what additional information would help you to evaluate the study.
+
Be explicit about how responses to your questions might change your score for the paper. Prioritize questions that could lead to the largest potential score changes.
+
+
+
+
Emergency reviewing
+
We will likely be seeking emergency reviewers for papers that do not receive all reviews by the deadline. Emergency reviewers will be sent a maximum of 3 papers and will need to write their reviews in a short time frame. Emergency review sign-up will be indicated in the reviewer sign-up form.
+
General advice for preparing reviews
+
Please strive to be timely, polite, and constructive, submitting reviews that you yourself would be happy to receive as an author. Be sure to review the paper, not the authors.
+
When making statements, use phrases like “the paper proposes” rather than “the authors propose”. This makes your review less personal and separates critiques of the submission from critiques of the authors.
+
External resources
+
If you would like feedback on a review, we recommend asking a mentor or colleague. When doing so, take care not to breach confidentiality. Some helpful resources include:
Track-specific advice for preparing reviews for a CHIL submission
+
+
Track 1: it is acceptable for a paper to use synthetic data to evaluate a proposed method. Not every paper must touch real health data, though all methods should be primarily motivated by health applications, and the realism of the synthetic data is fair game for critique.
+
Track 2: contributions to this track should focus either on solving a carefully motivated problem grounded in applications, or on deployments or datasets that enable the exploration and evaluation of applications.
+
Track 3: meaningful contributions to this track can extend beyond algorithmic development; innovative and impactful use of existing techniques is encouraged.
+
+
Phase 4: Discussion
+
During the discussion period, you will be expected to participate in discussions on OpenReview by reading the authors’ responses and comments from other reviewers, adding additional comments from your perspective, and updating your review accordingly.
+
We expect brief but thoughtful engagement from all reviewers here. For some papers, this will involve several iterations of feedback and response. A simplistic response of “I have read the authors’ response and I chose to keep my score unchanged” is not sufficient, because it does not explain which weaknesses remain salient and why the response does not address them. Please engage meaningfully!
+
Track Chairs will work with reviewers to try to reach a consensus decision about each paper. In the event that consensus is not reached, Track Chairs make final decisions about acceptance.
+
+
+
+
Models and Methods: Algorithms, Inference, and Estimation
+
+
Advances in machine learning are critical for a better understanding of health. This track seeks technical contributions in modeling, inference, and estimation in health-focused or health-inspired settings. We welcome submissions that develop novel methods and algorithms, introduce relevant machine learning tasks, identify challenges with prevalent approaches, or learn from multiple sources of data (e.g. non-clinical and clinical data).
+
Our focus on health is broadly construed, including clinical healthcare, public health, and population health. While submissions should be primarily motivated by problems relevant to health, the contributions themselves are not required to be directly applied to real health data. For example, authors may use synthetic datasets to demonstrate properties of their proposed algorithms.
+
We welcome submissions from many perspectives, including but not limited to supervised learning, unsupervised learning, reinforcement learning, causal inference, representation learning, survival analysis, domain adaptation or generalization, interpretability, robustness, and algorithmic fairness. All kinds of health-relevant data types are in scope, including tabular health records, time series, text, images, videos, knowledge graphs, and more. We welcome all kinds of methodologies, from deep learning to probabilistic modeling to rigorous theory and beyond.
Applications and Practice: Investigation, Evaluation, Interpretation, and Deployment
+
+
The goal of this track is to highlight works applying robust methods, models, or practices to identify, characterize, audit, evaluate, or benchmark ML approaches to healthcare problems. Additionally, we welcome unique deployments and datasets used to empirically evaluate these systems, as these are necessary and important to advancing practice. Whereas the goal of Track 1 is to select papers that show significant algorithmic novelty, submit your work here if the contribution describes an emerging or established innovative application of ML in healthcare. Areas of interest include but are not limited to:
+
+
Datasets and simulation frameworks for addressing gaps in ML healthcare applications
+
Tools and platforms that facilitate integration of AI algorithms and deployment for healthcare applications
+
Innovative ML-based approaches to solving practical problems grounded in a healthcare application
+
Surveys, benchmarks, evaluations and best practices of using ML in healthcare
+
Emerging applications of AI in healthcare
+
+
Introducing a new method is by no means prohibited in this track, but the focus should be on how the proposed ideas contribute to addressing a practical limitation (e.g., robustness, computational scalability, improved performance). We encourage submissions in both more traditional clinical areas (e.g., electronic health records (EHR), medical image analysis) and in emerging fields (e.g., remote and telehealth medicine, integration of omics).
Impact and Society: Policy, Public Health, and Social Outcomes
+
+
Algorithms do not exist in a vacuum: instead, they often explicitly aim for important social outcomes. This track considers issues at the intersection of algorithms and the societies they seek to impact, specifically for health.
+
Submissions could include methodological contributions such as algorithmic development and performance evaluation for policy and public health applications, large-scale or challenging data collection, combining clinical and non-clinical data, as well as detecting and measuring bias. Submissions could also include impact-oriented research such as determining how algorithmic systems for health may introduce, exacerbate, or reduce inequities and inequalities, discrimination, and unjust outcomes, as well as evaluating the economic implications of these systems.
+
We invite submissions tackling the responsible design of AI applications for healthcare and public health. System design for the implementation of such applications at scale is also welcome, which often requires balancing various tradeoffs in decision-making. Submissions related to understanding barriers to the deployment and adoption of algorithmic systems for societal-level health applications are also of interest. In addressing these problems, insights from social sciences, law, clinical medicine, and the humanities can be crucial.
Thank you to our 2024 sponsors: Gordon and Betty Moore Foundation (Gold), Department of Health Outcomes and Biomedical Informatics at UFlorida College of Medicine (Gold), Apple (Silver), Genentech (Silver), Google (Silver), The Mount Sinai Hospital (Silver), Computational Precision Health Program at UCSF / UC Berkeley (Silver), UF Health (Silver), Chase Center at University of Pennsylvania (Silver), Department of Biostatistics at University of Pennsylvania (Silver), Department of Biostatistics at Columbia University (Bronze), Health Data Science (Bronze), and the Department of Surgery at University of Minnesota (Bronze)!
+ Panel: Health Economics and Behavior
+ David Meltzer, MD, PhD, University of Chicago, Walter Dempsey, PhD, University of Michigan, F. Perry Wilson, MD, Yale School of Medicine. Moderated by Kyra Gan, PhD, Cornell University
+
+
+ Panel: Real Deployments, and How to Find Them
+ Girish Nadkarni, MD, MPH, Mount Sinai, Roy Perlis, MD, Harvard University, Ashley Beecy, MD, NewYork-Presbyterian. Moderated by Leo Celi, PhD, Massachusetts Institute of Technology
+
+
+
+
+
3:30pm - 3:55pm
+
Doctoral Symposium Lightning Talks
+
+
+
3:55pm - 4:05pm
+
Closing Remarks
+
+
+
4:05pm - 5:20pm
+
Doctoral Symposium Poster Session
+
+
+
+
+
+
+
+
+
+
Time
+
Title
+
Location
+
+
+
+
+
8:30am - 9:00am
+
Check-in and Refreshments
+
CSAIL 1st Floor Lobby
+
+
+
9:00am - 9:05am
+
Opening Remarks
+
CSAIL Kirsch Auditorium
+
+
+
+
Session IX: The State of Machine Learning for Health: Where Are We Now, and Where Do We Go?
Machine Learning for Healthcare in the Era of ChatGPT
+ Karandeep Singh, MD, University of Michigan, Nigam Shah, MBBS, PhD, Stanford University, Saadia Gabriel, PhD, MIT, and Tristan Naumann, PhD, Microsoft Research. Moderated by Byron Wallace, PhD, Northeastern University.
+
Session XI: Research Roundtables
+ Bridging the gap between the business of value-based care and the research of health AI, Yubin Park, PhD, ApolloMed
+ Auditing Algorithm Performance and Equity, Alistair Johnson, DPhil, Hospital for Sick Children
+ Data privacy: Interactive or Non-interactive?, Khaled El Emam, PhD, University of Ottawa and Li Xiong, PhD, Emory University
+ Data Heterogeneity: More Heterogeneous Data or Less Homogeneous Data?, Tianxi Cai, ScD, Harvard Medical School and Yong Chen, PhD, University of Pennsylvania
+ Network Studies: As Many Databases as Possible or Enough to Answer the Question Quickly?, Christopher Chute, MD, Johns Hopkins University and Robert Platt, PhD, McGill University
+
CSAIL 4th floor Star and Kiva Rooms
+
+
+
3:00pm - 3:30pm
+
Networking Break
+
CSAIL 4th Floor Lobby
+
+
+
3:30pm - 4:45pm
+
+ Session XII: Doctoral Symposium
+
CSAIL 4th Floor Lobby
+
+
+
4:45pm - 5:15pm
+
+ Session XIII: I Can’t Believe It’s Not Better Lightning Talks
+ David Bellamy, Bhawesh Kumar, Cindy Wang, Andrew Beam: Can pre-trained Transformers beat simple baselines on lab data?
+ Hiba Ahsan, Silvio Amir, Byron Wallace: On the Difficulty of Disentangling Race in Representations of Clinical Notes
+ Olga Demler: On intransitivity of win ratio and area under the receiver operating characteristics curve
+ Wouter van Amsterdam, Rajesh Ranganath: My prediction model is super accurate so it will be useful for treatment decision making, right? Wrong!
+ Yuan Zhao, David Benkeser, Russell Kempker: Doubly Robust Approaches for Estimating Treatment Effect in Observational Studies are not Better than G-Computation
+
+
CSAIL Kirsch Auditorium
+
+
+
5:15pm - 5:20pm
+
Closing Remarks
+
CSAIL Kirsch Auditorium
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
The AHLI CHIL 2024 Doctoral Symposium is an opportunity for PhD students to broadcast their research and get feedback on their directions from CHIL attendees and leaders in the field. Participants will present their ongoing and/or future doctoral dissertation work as a poster, encouraging discussion of ideas with senior leaders and CHIL participants. There is an additional opportunity to be selected for a 5 minute lightning presentation. Our main CFP can give an indication of the areas that are covered in CHIL. In addition to the lightning talks and poster session, participants will have facilitated opportunities to connect with and meet established researchers one-on-one throughout the conference. The Doctoral Symposium will be held on June 28, 2024 in New York City, NY, United States. It is an in-person event.
+
Important Dates
+
+
Application due: March 15, 2024 11:59 PM EST
+
Notification: April 5, 2024
+
Doctoral Symposium: June 28, 2024 (New York City, NY, USA)
+
+
Application
+
We welcome applications from Ph.D. students in computer science, data science, medical/health informatics, and other related fields. Successful candidates may be either senior students with concrete dissertation plans or junior students whose plans are still taking shape and who may benefit from feedback from other participants. There will be prizes for the best presentations.
+
+
+
+
+
+
+
+
+
+
+
+
The AHLI Conference on Health, Inference, and Learning (CHIL) solicits work across a variety of disciplines, including machine learning, statistics, epidemiology, health policy, operations, and economics. CHIL 2023 invites abstracts for 5-minute lightning talks on the topic “I Can’t Believe It’s Not Better,” highlighting failures of machine learning in health that still surprise. Specifically, authors are invited to submit 1-page abstracts (with unlimited pages for references).
+Unconference Chairs will oversee the reviewing process. If you are not sure if your submission fits the call, please reach out to the conference organizers (info@chilconference.org) for advice.
+
Important Dates
+
+
Submissions due: Mar 15, 2023 11:59 PM EDT (extended from Feb 28, 2023 11:59 PM EST)
+
Author notification: April 3
+
CHIL unconference: June 24
+
+
Evaluation
+
Works submitted to CHIL will be reviewed by at least 2 reviewers. Reviewers will be asked to primarily judge the work according to the following criteria:
+
Relevance: All submissions to CHIL are expected to be relevant to health. Concretely, this means that the problem is well-placed into the relevant themes for the conference. Reviewers will be able to recommend that submissions change tracks or flag submissions that are not suitable for the venue as a whole.
+
Quality: Is the submission technically sound? Are claims well supported by theoretical analysis or experimental results? Are the authors careful and honest about evaluating both the strengths and weaknesses of their work?
+
Originality: Are the tasks or methods new? Is it clear how this work differs from previous contributions? Is related work adequately cited?
+
Clarity: Is the submission clearly written? Does it adequately inform the reader?
+
Significance: Are the results important? Are others (researchers or practitioners) likely to use the ideas or build on them? Does the work highlight a failure that is truly surprising?
+
Final decisions will be made by Unconference Chairs, taking into account reviewer comments, ratings of confidence and expertise, and our own editorial judgment.
+
Submission Format and Guidelines
+
Submission Site
+
Submissions should be made via the following Google Form: https://forms.gle/gL5KNaV6fTKHPQJP8. At least one author of each accepted paper is required to register for, attend, and present the work at the conference.
+
Length and Formatting
+
Abstracts are allowed a maximum of 1 page using 12 point font, single-spacing, and 1-inch margins. Submissions are allowed unlimited pages for references. The abstract should be submitted as a PDF.
+
Archival Submissions
+
Submissions to the Unconference are considered archival and will appear in an arXiv volume if accepted. Author notification of acceptance will be provided in early April 2023.
+
Peer Review
+
The review process is mutually anonymous. Please submit completely anonymized drafts. Please do not include any identifying information, and refrain from citing the authors’ own prior work in anything other than third-person. Violations to this policy may result in rejection without review.
+
Conference organizers and reviewers are required to maintain confidentiality of submitted material. Upon acceptance, the titles, authorship, and abstracts of papers will be released prior to the conference.
+
Open Access
+
CHIL is committed to open science and ensuring our proceedings are freely available.
+
+
+
+
+
+
+
+
+
+
+
+
The AHLI Conference on Health, Inference, and Learning (CHIL) solicits work across a variety of disciplines at the intersection of machine learning and healthcare. CHIL 2024 invites submissions focused on artificial intelligence and machine learning (AI/ML) techniques that address challenges in health, which we view broadly as including clinical healthcare, public health, population health, and beyond.
+
Specifically, authors are invited to submit 8-10 page papers (with unlimited pages for references) to one of 3 possible tracks: Models and Methods, Applications and Practice, or Impact and Society. Each track is described in detail below. Authors will select exactly one primary track when they register each submission, in addition to one or more sub-disciplines. Appropriate track and sub-discipline selection will ensure that each submission is reviewed by a knowledgeable set of reviewers. Track Chairs will oversee the reviewing process. In case you are not sure which track your submission fits under, feel free to contact the Track or Proceedings Chairs for clarification. The Proceedings Chairs reserve the right to move submissions between tracks if they believe that a submission has been misclassified.
+
Important Dates (all times are anywhere on Earth, AoE)
+
+
Submissions due: Feb 16, 2024
+
Bidding opens for reviewers: Feb 17, 2024
+
Bidding closes for reviewers: Tue Feb 20, 2024
+
Papers assigned to reviewers: Wed Feb 21, 2024
+
Reviews due: Wed Mar 6, 2024
+
Author response period: Mar 12-19, 2024
+
Author / reviewer discussion period: Mar 19-26, 2024
+
Decision notification: Apr 3, 2024
+
CHIL conference: June 27-28, 2024
+
+
+
+
Submission Tracks
+
+
Track 1 - Models and Methods: Algorithms, Inference, and Estimation
+
Track 2 - Applications and Practice: Investigation, Evaluation, Interpretation, and Deployment
+
Track 3 - Impact and Society: Policy, Public Health, Social Outcomes, and Economics
+
+
+
+
+
+
Evaluation
+
Works submitted to CHIL will be reviewed by at least 3 reviewers. Reviewers will be asked to primarily judge the work according to the following criteria:
+
Relevance: Is the submission relevant to health, broadly construed? Does the problem addressed fall into the domains of machine learning and healthcare?
+
Quality: Is the submission technically sound? Are claims well supported by theoretical analysis or experimental results? Are the authors careful and honest about evaluating both the strengths and weaknesses of their work? Is the work complete rather than a work in progress?
+
Originality: Are the tasks, methods and results novel? Is it clear how this work differs from previous contributions? Is related work adequately cited to provide context? Does the submission contribute unique data, unique conclusions about existing data, or a unique theoretical or experimental approach?
+
Clarity: Is the submission clearly written? Is it well-organized? Does it adequately provide enough information for readers to reproduce experiments or results?
+
Significance: Is the contribution of the work important? Are other researchers or practitioners likely to use the ideas or build on them? Does the work advance the state of the art in a demonstrable way?
+
Final decisions will be made by Track and Proceedings Chairs, taking into account reviewer comments, ratings of confidence and expertise, and our own editorial judgment. Reviewers will be able to recommend that submissions change tracks, or to flag submissions for ethical issues or for relevance and suitability concerns.
Submitted papers must be 8-10 pages (including all figures and tables). Unlimited additional pages can be used for references and additional supplementary materials (e.g. appendices). Reviewers will not be required to read the supplementary materials.
+
Authors are required to use the LaTeX template: Overleaf
+
+
+
+
+
Required Sections
+
Similar to last year, two sections will be required: 1) Data and Code Availability, and 2) Institutional Review Board (IRB).
+
Data and Code Availability: This initial paragraph is required. Briefly state what data you use (including citations if appropriate) and whether the data are available to other researchers. If you are not sharing code, you must explicitly state that you are not making your code available. If you are making your code available, then at the time of submission for review, please include your code as supplemental material or as a code repository link; in either case, your code must be anonymized. If your paper is accepted, then you should de-anonymize your code for the camera-ready version of the paper. If you do not include this data and code availability statement for your paper, or you provide code that is not anonymized at the time of submission, then your paper will be desk-rejected. Your experiments section may later refer back to this initial statement where helpful (e.g., to avoid restating what data you use).
+
Institutional Review Board (IRB): This endmatter section is required. If your research requires IRB approval or has been designated by your IRB as Not Human Subject Research, then for the camera-ready version of the paper, you must provide IRB information (at the time of submission for review, you can state that this IRB information will be provided if the paper is accepted). If your research does not require IRB approval, then you must state this to be the case. This section does not count toward the paper page limit.
+
Archival Submissions
+
Submissions to the main conference are considered archival and will appear in the published proceedings of the conference, if accepted. Author notification of acceptance will be provided by the listed date under Important Dates.
+
+
+
Preprint Submission Policy
+
Submissions to preprint servers (such as ArXiv or MedRxiv) are allowed while the papers are under review. While reviewers will be encouraged not to search for the papers, you accept that uploading the paper may make your identity known.
+
Peer Review
+
The review process is mutually anonymous (aka “double blind”). Your submitted paper, as well as any supporting text or revisions provided during the discussion period, should be completely anonymized (including links to code repositories such as Github). Please do not include any identifying information, and refrain from citing the authors’ own prior work in anything other than third-person. Violations of this anonymity policy at any stage before final manuscript acceptance decisions may result in rejection without further review.
+
Conference organizers and reviewers are required to maintain confidentiality of submitted material. Upon acceptance, the titles, authorship, and abstracts of papers will be released prior to the conference.
+
You may not submit papers that are identical, or substantially similar to versions that are currently under review at another conference or journal, have been previously published, or have been accepted for publication. Submissions to the main conference are considered archival and will appear in the published proceedings of the conference if accepted.
+
An exception to this rule is extensions of papers that have previously appeared in non-archival venues, such as workshops, arXiv, or similar venues without formal proceedings. These works may be submitted as-is or in an extended form, though they must follow our manuscript formatting guidelines. CHIL also welcomes full paper submissions that extend previously published short papers or abstracts, so long as the previously published version does not exceed 4 pages in length. Note that the submission should not cite the prior workshop paper or report, in order to preserve anonymity in the submitted manuscript.
+
Upon submission, authors will select one or more relevant sub-discipline(s). Peer reviewers for a paper will be experts in the sub-discipline(s) selected upon its submission.
+
Note: Senior Area Chairs (SACs) are prohibited from submitting manuscripts to their respective track. Area Chairs (ACs) who plan to submit papers to the track they were assigned to must notify the Track Senior Area Chair within 24 hours of submission.
+
Open Access
+
CHIL is committed to open science and ensuring our proceedings are freely available.
+
Responsible and Ethical Research
+
Computer software submissions should include an anonymized code link or code attached as supplementary material, licensing information, and documentation that facilitates use and reproducibility (e.g., package versions, a README, intended use, and usage examples that allow other researchers to run the code).
+
Submissions that include analysis on public datasets need to include appropriate citations and data sequestration protocols, including train/validation/test splits, where appropriate. Submissions that include analysis of non-public datasets need to additionally include information about data source, collection sites, subject demographics and subgroups statistics, data acquisition protocols, informed consent, IRB and any other information supporting evidence of adherence to data collection and release protocols.
+
Authors should discuss ethical implications and responsible uses of their work.
+
+
+
+
+
+
+
Reviewing for CHIL
+
Reviewing is a critical service in any research community, and we highly respect the expertise and contributions of our reviewers. Every submission deserves thoughtful, constructive feedback that:
+
+
Selects quality work to be highlighted at CHIL; and
+
Helps authors improve their work, either for CHIL or a future venue.
+
+
To deliver high-quality reviews, you are expected to participate in four phases of review: Bidding; Assignment; Review; Discussion. This guide is here to help you through each of these steps. Your insights and feedback make a big difference in our community and in the field of healthcare and machine learning.
+
Timeline
+
An overview of the four phases and their expected time commitments:
+
+
Bidding
+
Skim abstracts
+
Suggest >10 submissions that you feel qualified to review
+
Time commitment: ~1 hour
+
+
+
Assignment
+
Skim your assigned papers and immediately report:
+
Major formatting issues
+
Anonymity or Conflict of Interest issues
+
Papers that you are not comfortable reviewing
+
+
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~10 minutes per paper
+
+
+
Review
+
Deliver a thoughtful, timely review for each assigned paper
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~2-5 hours per paper
+
+
+
Discussion
+
Provide comments that respond to author feedback, other reviewers, and chairs
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~1-2 hours per paper
+
+
+
+
Phase 1: Bidding
+
After the submission deadline, you will be invited to "bid" for your preferred papers in OpenReview, based on titles and abstracts. Bidding instructions will be provided via email. Please bid promptly and generously!
+
Phase 2: Assignment
+
After the bidding period closes, you will be formally assigned 2-5 papers to review. We ask you to promptly skim your papers to ensure:
+
+
no violations of required formatting rules (page limits, margins, etc)
+
no violations of anonymity (author names, institution names, github links, etc)
+
that you have sufficient expertise to review the paper
+
+
If you feel that you cannot offer an informed opinion about the quality of the paper due to expertise mismatch, please write to your assigned Area Chair on OpenReview. Area Chairs will do their best to ensure that each submission has the most competent reviewers available in the pool.
+
Phase 3: Review
+
You will be asked to complete thoughtful, constructive reviews for all assigned papers. Please ensure that your reviews are completed before the deadline, and sooner if possible. For each paper, you will fill out a form on OpenReview, similar to the form below. To help us to ensure consistency and quality, all reviews are subject to internal checks that may be manual or automated.
+
Review format
+
+
Summary of the paper
+
Summarize *your* understanding of the paper. Stick to the facts: ideally, the authors should agree with everything written here.
+
+
+
Strengths
+
Identify the promising aspects of the work.
+
+
+
Weaknesses
+
Every paper that does not meet the bar for publication is the scaffolding upon which a better research idea can be built. If you believe the work is insufficient, help the authors see where they can take their work and how.
+
If you are asking for more experiments, clearly explain why and outline what new information the experiment might offer.
+
+
+
Questions for the authors
+
Communicate what additional information would help you to evaluate the study.
+
Be explicit about how responses to your questions might change your score for the paper. Prioritize questions that could lead to large score changes.
+
+
+
+
Emergency reviewing
+
We will likely be seeking emergency reviewers for papers that do not receive all reviews by the deadline. Emergency reviewers will be sent a maximum of 3 papers and will need to write their reviews in a short time frame. Emergency review sign-up will be indicated in the reviewer sign-up form.
+
General advice for preparing reviews
+
Please strive to be timely, polite, and constructive, submitting reviews that you yourself would be happy to receive as an author. Be sure to review the paper, not the authors.
+
When making statements, use phrases like “the paper proposes” rather than “the authors propose”. This makes your review less personal and separates critiques of the submission from critiques of the authors.
+
External resources
+
If you would like feedback on a review, we recommend asking a mentor or colleague. When doing so, take care not to breach confidentiality. Some helpful resources include:
Track-specific advice for preparing reviews for a CHIL submission
+
+
Track 1: it is acceptable for a paper to use synthetic data to evaluate a proposed method. Not every paper must touch real health data, though all methods should be primarily motivated by health applications, and the realism of the synthetic data is fair to critique
+
Track 2: contributions to this track should focus either on solving a carefully motivated problem grounded in applications, or on deployments or datasets that enable exploration and evaluation of applications
+
Track 3: meaningful contributions to this track can include a broader scope of contribution beyond algorithmic development. Innovative and impactful use of existing techniques is encouraged
+
+
Phase 4: Discussion
+
During the discussion period, you will be expected to participate in discussions on OpenReview by reading the authors’ responses and comments from other reviewers, adding additional comments from your perspective, and updating your review accordingly.
+
+
+
Datasets and simulation frameworks for addressing gaps in ML healthcare applications
+
Tools and platforms that facilitate integration of AI algorithms and deployment for healthcare applications
+
Innovative ML-based approaches to solving a practical problems grounded in a healthcare application
+
Surveys, benchmarks, evaluations and best practices of using ML in healthcare
+
Emerging applications of AI in healthcare
+
+
Introducing a new method is not prohibited by any means for this track, but the focus should be on the extent of how the proposed ideas contribute to addressing a practical limitation (e.g., robustness, computational scalability, improved performance). We encourage submissions in both more traditional clinical areas (e.g., electronic health records (EHR), medical image analysis), as well as in emerging fields (e.g., remote and telehealth medicine, integration of omics).
Impact and Society: Policy, Public Health, and Social Outcomes
+
+
Algorithms do not exist in a vacuum: instead, they often explicitly aim for important social outcomes. This track considers issues at the intersection of algorithms and the societies they seek to impact, specifically for health. Submissions could include methodological contributions such as algorithmic development and performance evaluation for policy and public health applications, large-scale or challenging data collection, combining clinical and non-clinical data, as well as detecting and measuring bias. Submissions could also include impact-oriented research such as determining how algorithmic systems for health may introduce, exacerbate, or reduce inequities and inequalities, discrimination, and unjust outcomes, as well as evaluating the economic implications of these systems. We invite submissions tackling the responsible design of AI applications for healthcare and public health. System design for the implementation of such applications at scale is also welcome, which often requires balancing various tradeoffs in decision-making. Submissions related to understanding barriers to the deployment and adoption of algorithmic systems for societal-level health applications are also of interest. In addressing these problems, insights from social sciences, law, clinical medicine, and the humanities can be crucial.
Thank you to our 2024 sponsors: Gordon and Betty Moore Foundation (Gold), Department of Health Outcomes and Biomedical Informatics at UFlorida College of Medicine (Gold), Apple (Silver), Genentech (Silver), Google (Silver), The Mount Sinai Hospital (Silver), Computational Precision Health Program at UCSF / UC Berkeley (Silver), UF Health (Silver), Chase Center at University of Pennsylvania (Silver), Department of Biostatistics at University of Pennsylvania (Silver), Department of Biostatistics at Columbia University (Bronze), Health Data Science (Bronze), and the Department of Surgery at University of Minnesota (Bronze)!
The AHLI Conference on Health, Inference, and Learning (CHIL) solicits work across a variety of disciplines at the intersection of machine learning and healthcare. CHIL 2024 invites submissions focused on artificial intelligence and machine learning (AI/ML) techniques that address challenges in health, which we view broadly as including clinical healthcare, public health, population health, and beyond.
+
Specifically, authors are invited to submit 8-10 page papers (with unlimited pages for references) to one of 3 possible tracks: Models and Methods, Applications and Practice, or Impact and Society. Each track is described in detail below. Authors will select exactly one primary track when they register each submission, in addition to one or more sub-disciplines. Appropriate track and sub-discipline selection will ensure that each submission is reviewed by a knowledgeable set of reviewers. Track Chairs will oversee the reviewing process. In case you are not sure which track your submission fits under, feel free to contact the Track or Proceedings Chairs for clarification. The Proceedings Chairs reserve the right to move submissions between tracks if they believe that a submission has been misclassified.
+
Important Dates (all times are anywhere on Earth, AoE)
+
+
Submissions due: Feb 16, 2024
+
Bidding opens for reviewers: Feb 17, 2024
+
Bidding closes for reviewers: Feb 20, 2024
+
Papers assigned to reviewers: Feb 21, 2024
+
Reviews due: Mar 6, 2024
+
Author response period: Mar 12-19, 2024
+
Author / reviewer discussion period: Mar 19-26, 2024
+
Decision notification: Apr 3, 2024
+
CHIL conference: June 27-28, 2024
+
+
+
+
Submission Tracks
+
+
Track 1 - Models and Methods: Algorithms, Inference, and Estimation
+
Track 2 - Applications and Practice: Investigation, Evaluation, Interpretation, and Deployment
+
Track 3 - Impact and Society: Policy, Public Health, Social Outcomes, and Economics
+
+
+
+
+
+
Evaluation
+
Works submitted to CHIL will be reviewed by at least 3 reviewers. Reviewers will be asked to primarily judge the work according to the following criteria:
+
Relevance: Is the submission relevant to health, broadly construed? Does the problem addressed fall into the domains of machine learning and healthcare?
+
Quality: Is the submission technically sound? Are claims well supported by theoretical analysis or experimental results? Are the authors careful and honest about evaluating both the strengths and weaknesses of their work? Is the work complete rather than a work in progress?
+
Originality: Are the tasks, methods and results novel? Is it clear how this work differs from previous contributions? Is related work adequately cited to provide context? Does the submission contribute unique data, unique conclusions about existing data, or a unique theoretical or experimental approach?
+
Clarity: Is the submission clearly written? Is it well-organized? Does it adequately provide enough information for readers to reproduce experiments or results?
+
Significance: Is the contribution of the work important? Are other researchers or practitioners likely to use the ideas or build on them? Does the work advance the state of the art in a demonstrable way?
+
Final decisions will be made by Track and Proceedings Chairs, taking into account reviewer comments, ratings of confidence and expertise, and our own editorial judgment. Reviewers will be able to recommend that submissions change tracks or flag submissions for ethical issues, relevance and suitability concerns.
Submitted papers must be 8-10 pages (including all figures and tables). Unlimited additional pages can be used for references and additional supplementary materials (e.g. appendices). Reviewers will not be required to read the supplementary materials.
+
Authors are required to use the LaTeX template: Overleaf
+
+
+
+
+
Required Sections
+
Similar to last year, two sections will be required: 1) Data and Code Availability, and 2) Institutional Review Board (IRB).
+
Data and Code Availability: This initial paragraph is required. Briefly state what data you use (including citations if appropriate) and whether the data are available to other researchers. If you are not sharing code, you must explicitly state that you are not making your code available. If you are making your code available, then at the time of submission for review, please include your code as supplemental material or as a code repository link; in either case, your code must be anonymized. If your paper is accepted, then you should de-anonymize your code for the camera-ready version of the paper. If you do not include this data and code availability statement, or you provide code that is not anonymized at the time of submission, then your paper will be desk-rejected. Later descriptions of your experiments can refer back to this initial statement if helpful (e.g., to avoid restating what data you use).
+
Institutional Review Board (IRB): This endmatter section is required. If your research requires IRB approval or has been designated by your IRB as Not Human Subject Research, then for the camera-ready version of the paper, you must provide IRB information (at the time of submission for review, you can say that this IRB information will be provided if the paper is accepted). If your research does not require IRB approval, then you must state this to be the case. This section does not count toward the paper page limit.
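For illustration, the two required sections might be drafted as follows. This is a hypothetical sketch using generic LaTeX sectioning commands; the official Overleaf template may define its own macros, which take precedence.

```latex
% Hypothetical sketch only -- follow the official CHIL template's macros.
% Initial paragraph of the paper body:
\paragraph{Data and Code Availability}
This paper uses the XYZ dataset (citation), which is available to other
researchers. Our code is included as anonymized supplementary material
and will be de-anonymized in the camera-ready version.

% Endmatter, after the main text (does not count toward the page limit):
\section*{Institutional Review Board (IRB)}
This research does not require IRB approval.
```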
+
Archival Submissions
+
Submissions to the main conference are considered archival and will appear in the published proceedings of the conference, if accepted. Author notification of acceptance will be provided by the listed date under Important Dates.
+
+
+
Preprint Submission Policy
+
Submissions to preprint servers (such as arXiv or medRxiv) are allowed while the papers are under review. While reviewers will be encouraged not to search for the papers, you accept that uploading a preprint may make your identity known.
+
Peer Review
+
The review process is mutually anonymous (aka “double blind”). Your submitted paper, as well as any supporting text or revisions provided during the discussion period, should be completely anonymized (including links to code repositories such as Github). Please do not include any identifying information, and refrain from citing the authors’ own prior work in anything other than the third person. Violations of this anonymity policy at any stage before final manuscript acceptance decisions may result in rejection without further review.
+
Conference organizers and reviewers are required to maintain confidentiality of submitted material. Upon acceptance, the titles, authorship, and abstracts of papers will be released prior to the conference.
+
You may not submit papers that are identical, or substantially similar to versions that are currently under review at another conference or journal, have been previously published, or have been accepted for publication. Submissions to the main conference are considered archival and will appear in the published proceedings of the conference if accepted.
+
An exception to this rule is extensions of papers that have previously appeared in non-archival venues, such as workshops, arXiv, or similar venues without formal proceedings. These works may be submitted as-is or in an extended form, though they must follow our manuscript formatting guidelines. CHIL also welcomes full paper submissions that extend previously published short papers or abstracts, so long as the previously published version does not exceed 4 pages in length. Note that the submission should not cite the earlier workshop paper or report, so as to preserve anonymity in the submitted manuscript.
+
Upon submission, authors will select one or more relevant sub-discipline(s). Peer reviewers for a paper will be experts in the sub-discipline(s) selected upon its submission.
+
Note: Senior Area Chairs (SACs) are prohibited from submitting manuscripts to their respective track. Area Chairs (ACs) who plan to submit papers to the track they were assigned to must notify the track's Senior Area Chair within 24 hours of submission.
+
Open Access
+
CHIL is committed to open science and ensuring our proceedings are freely available.
+
Responsible and Ethical Research
+
Computer software submissions should include an anonymized code link or code attached as supplementary material, licensing information, and documentation that facilitates use and reproducibility (e.g., package versions, a README, intended use, and execution examples that help other researchers run the code).
+
Submissions that include analysis on public datasets need to include appropriate citations and data sequestration protocols, including train/validation/test splits, where appropriate. Submissions that include analysis of non-public datasets need to additionally include information about data source, collection sites, subject demographics and subgroups statistics, data acquisition protocols, informed consent, IRB and any other information supporting evidence of adherence to data collection and release protocols.
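As one concrete illustration of the data sequestration point above, a seeded, recorded train/validation/test split makes the partition reproducible by other researchers. This is an illustrative sketch only (the function name and split fractions are our own choices); adapt it to your dataset and framework.

```python
import random

def split_indices(n, seed=0, frac_train=0.8, frac_val=0.1):
    """Deterministic train/validation/test split over n example indices.

    Fixing the seed (and reporting it) is one simple way to make a
    sequestration protocol reproducible. Illustrative sketch only.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seeded shuffle => same split every run
    n_train = int(n * frac_train)
    n_val = int(n * frac_val)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

train, val, heldout = split_indices(100)
# The three parts are disjoint and together cover all 100 indices.
assert set(train) | set(val) | set(heldout) == set(range(100))
```

Reporting the seed and fractions alongside the code lets reviewers and readers reconstruct exactly the same held-out set.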
+
Authors should discuss ethical implications and responsible uses of their work.
+
+
+
+
+
+
+
Reviewing for CHIL
+
Reviewing is a critical service in any research community, and we highly respect the expertise and contributions of our reviewers. Every submission deserves thoughtful, constructive feedback that:
+
+
Selects quality work to be highlighted at CHIL; and
+
Helps authors improve their work, either for CHIL or a future venue.
+
+
To deliver high-quality reviews, you are expected to participate in four phases of review: Bidding; Assignment; Review; Discussion. This guide is here to help you through each of these steps. Your insights and feedback make a big difference in our community and in the field of healthcare and machine learning.
+
Timeline
+
To deliver high-quality reviews, you are expected to participate in the four phases of review:
+
+
Bidding
+
Skim abstracts
+
Suggest >10 submissions that you feel qualified to review
+
Time commitment: ~1 hour
+
+
+
Assignment
+
Skim your assigned papers and immediately report:
+
Major formatting issues
+
Anonymity or Conflict of Interest issues
+
Papers that you are not comfortable reviewing
+
+
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~10 minutes per paper
+
+
+
Review
+
Deliver a thoughtful, timely review for each assigned paper
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~2-5 hours per paper
+
+
+
Discussion
+
Provide comments that respond to author feedback, other reviewers, and chairs
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~1-2 hours per paper
+
+
+
+
Phase 1: Bidding
+
After the submission deadline, you will be invited to "bid" for your preferred papers in OpenReview, based on titles and abstracts. Bidding instructions will be provided via email. Please bid promptly and generously!
+
Phase 2: Assignment
+
After the bidding period closes, you will be formally assigned 2-5 papers to review. We ask you to promptly skim your papers to ensure:
+
+
no violations of required formatting rules (page limits, margins, etc)
+
no violations of anonymity (author names, institution names, github links, etc)
+
that you have sufficient expertise to review the paper
+
+
If you feel that you cannot offer an informed opinion about the quality of the paper due to expertise mismatch, please write to your assigned Area Chair on OpenReview. Area Chairs will do their best to ensure that each submission has the most competent reviewers available in the pool.
+
Phase 3: Review
+
You will be asked to complete thoughtful, constructive reviews for all assigned papers. Please ensure that your reviews are completed before the deadline, and sooner if possible. For each paper, you will fill out a form on OpenReview, similar to the form below. To help us to ensure consistency and quality, all reviews are subject to internal checks that may be manual or automated.
+
Review format
+
+
Summary of the paper
+
Summarize *your* understanding of the paper. Stick to the facts: ideally, the authors should agree with everything written here.
+
+
+
Strengths
+
Identify the promising aspects of the work.
+
+
+
Weaknesses
+
Every paper that does not meet the bar for publication is the scaffolding upon which a better research idea can be built. If you believe the work is insufficient, help the authors see where they can take their work and how.
+
If you are asking for more experiments, clearly explain why and outline what new information the experiment might offer.
+
+
+
Questions for the authors
+
Communicate what additional information would help you to evaluate the study.
+
Be explicit about how responses to your questions might change your score for the paper. Prioritize questions that could lead to large score changes.
+
+
+
+
Emergency reviewing
+
We will likely be seeking emergency reviewers for papers that do not receive all reviews by the deadline. Emergency reviewers will be sent a maximum of 3 papers and will need to write their reviews in a short time frame. Emergency review sign-up will be indicated in the reviewer sign-up form.
+
General advice for preparing reviews
+
Please strive to be timely, polite, and constructive, submitting reviews that you yourself would be happy to receive as an author. Be sure to review the paper, not the authors.
+
When making statements, use phrases like “the paper proposes” rather than “the authors propose”. This makes your review less personal and separates critiques of the submission from critiques of the authors.
+
External resources
+
If you would like feedback on a review, we recommend asking a mentor or colleague. When doing so, take care not to breach confidentiality. Some helpful resources include:
Track specific advice for preparing reviews for a CHIL submission
+
+
Track 1: it is acceptable for a paper to use synthetic data to evaluate a proposed method. Not every paper must use real health data, though all methods should be primarily motivated by health applications, and the realism of the synthetic data is fair to critique.
+
Track 2: contributions to this track should focus either on solving a carefully motivated problem grounded in applications, or on deployments or datasets that enable the exploration and evaluation of applications.
+
Track 3: meaningful contributions to this track can have a broader scope than algorithmic development; innovative and impactful use of existing techniques is encouraged.
+
+
Phase 4: Discussion
+
During the discussion period, you will be expected to participate in discussions on OpenReview by reading the authors’ responses and comments from other reviewers, adding additional comments from your perspective, and updating your review accordingly.
+
We expect brief but thoughtful engagement from all reviewers here. For some papers, this would involve several iterations of feedback-response. A simplistic response of “I have read the authors’ response and I chose to keep my score unchanged” is not sufficient, because it does not provide detailed reasoning about what weaknesses are still salient and why the response is not sufficient. Please engage meaningfully!
+
Track Chairs will work with reviewers to try to reach a consensus decision about each paper. In the event that consensus is not reached, Track Chairs make final decisions about acceptance.
+
+
+
+
Models and Methods: Algorithms, Inference, and Estimation
+
+
Advances in machine learning are critical for a better understanding of health. This track seeks technical contributions in modeling, inference, and estimation in health-focused or health-inspired settings. We welcome submissions that develop novel methods and algorithms, introduce relevant machine learning tasks, identify challenges with prevalent approaches, or learn from multiple sources of data (e.g. non-clinical and clinical data).
+
Our focus on health is broadly construed, including clinical healthcare, public health, and population health. While submissions should be primarily motivated by problems relevant to health, the contributions themselves are not required to be directly applied to real health data. For example, authors may use synthetic datasets to demonstrate properties of their proposed algorithms.
+
We welcome submissions from many perspectives, including but not limited to supervised learning, unsupervised learning, reinforcement learning, causal inference, representation learning, survival analysis, domain adaptation or generalization, interpretability, robustness, and algorithmic fairness. All kinds of health-relevant data types are in scope, including tabular health records, time series, text, images, videos, knowledge graphs, and more. We welcome all kinds of methodologies, from deep learning to probabilistic modeling to rigorous theory and beyond.
Applications and Practice: Investigation, Evaluation, Interpretation, and Deployment
+
+
The goal of this track is to highlight work that applies robust methods, models, or practices to identify, characterize, audit, evaluate, or benchmark ML approaches to healthcare problems. We also welcome the unique deployments and datasets used to empirically evaluate these systems, which are necessary and important to advancing practice. Whereas the goal of Track 1 is to select papers that show significant algorithmic novelty, submit your work here if the contribution describes an emerging or established innovative application of ML in healthcare. Areas of interest include but are not limited to:
+
+
Datasets and simulation frameworks for addressing gaps in ML healthcare applications
+
Tools and platforms that facilitate integration of AI algorithms and deployment for healthcare applications
+
Innovative ML-based approaches to solving practical problems grounded in a healthcare application
+
Surveys, benchmarks, evaluations and best practices of using ML in healthcare
+
Emerging applications of AI in healthcare
+
+
Introducing a new method is by no means prohibited in this track, but the focus should be on how the proposed ideas contribute to addressing a practical limitation (e.g., robustness, computational scalability, improved performance). We encourage submissions both in more traditional clinical areas (e.g., electronic health records (EHR), medical image analysis) and in emerging fields (e.g., remote and telehealth medicine, integration of omics).
Impact and Society: Policy, Public Health, and Social Outcomes
+
+
Algorithms do not exist in a vacuum: instead, they often explicitly aim for important social outcomes. This track considers issues at the intersection of algorithms and the societies they seek to impact, specifically for health. Submissions could include methodological contributions such as algorithmic development and performance evaluation for policy and public health applications, large-scale or challenging data collection, combining clinical and non-clinical data, as well as detecting and measuring bias. Submissions could also include impact-oriented research such as determining how algorithmic systems for health may introduce, exacerbate, or reduce inequities and inequalities, discrimination, and unjust outcomes, as well as evaluating the economic implications of these systems. We invite submissions tackling the responsible design of AI applications for healthcare and public health. System design for the implementation of such applications at scale is also welcome, which often requires balancing various tradeoffs in decision-making. Submissions related to understanding barriers to the deployment and adoption of algorithmic systems for societal-level health applications are also of interest. In addressing these problems, insights from social sciences, law, clinical medicine, and the humanities can be crucial.
This Code of Conduct is largely identical to the NeurIPS and ML4H codes of conduct.
+
+
+
We, the participants involved with the Conference on Health, Inference, and Learning (CHIL), come together for the open exchange of ideas, the freedom of thought and expression, and for respectful scientific debate which is central to the goals of this Conference. This requires a community and an environment that recognizes and respects the inherent worth of every person.
+
RESPONSIBILITY
+
All participants, organizers, reviewers, speakers, media, sponsors, and volunteers (referred to as “Participants” collectively throughout this document) at our Conference and Conference-sponsored social events are required to agree with this Code of Conduct both during an event and on official communication channels, including social media.
+
Sponsors are equally subject to this Code of Conduct. In particular, sponsors should not use images, activities, or other materials that are of a sexual, racial, or otherwise offensive nature. This code applies both to official sponsors as well as any organization that uses the Conference name as branding as part of its activities at or around the Conference.
+
Organizers will enforce this Code, and it is expected that all Participants will cooperate to help ensure a safe and inclusive environment for everyone.
+
POLICY
+
The conference commits itself to providing an experience for all Participants that is free from the following:
+
+
Harassment, bullying, and discrimination which includes but is not limited to:
+
Offensive comments related to age, race, religion, creed, color, gender (including transgender status, gender identity, and gender expression), sexual orientation, medical condition, physical or intellectual disability, pregnancy, national origin, or ancestry.
+
intimidation, personal attacks, harassment, and unnecessary disruption of talks or other conference events.
+
+
+
Inappropriate or unprofessional behavior that interferes with another's full participation including:
+
sexual harassment, stalking, following, harassing photography or recording, inappropriate physical contact, unwelcome attention, public vulgar exchanges, derogatory name-calling, and diminutive characterizations
+
Use of images, activities, or other materials that are of a sexual, racial, or otherwise offensive nature that may create an inappropriate or toxic environment.
+
Disorderly, boisterous, or disruptive conduct including fighting, coercion, theft, damage to property, or any mistreatment or non-businesslike behavior towards participants.
+
Zoom bombing or any virtual activity that is not related to the topic of discussion and that detracts from the topic or purpose of the program. This includes remarks in chat areas deemed inappropriate by presenters/monitors/event leaders.
+
False claims or accusations related to CHIL business, or online comments made as if representing CHIL without advance approval.
+
+
+
Scientific misconduct including fabrication, falsification, or plagiarism of paper submissions or research presentations, including demos, exhibits or posters.
+
+
This Code of Conduct applies to the actual meeting sites and conference venues where CHIL business is being conducted, including physical and virtual venues and official virtual engagement platforms, including video, virtual streaming, and chat-based interactions. CHIL is not responsible for non-sponsored activity or behavior that may occur at non-sponsored locations such as hotels, restaurants, or physical, virtual, or other locations not otherwise deemed a sanctioned space for CHIL-sponsored events. Nonetheless, any issues brought to the Conference organizers will be considered. However, CHIL cannot actively monitor voluntary social media platforms and cannot follow up on every interaction occurring between individuals who voluntarily engage in argument and altercation outside CHIL-sponsored events, virtual or otherwise.
+
ACTION
+
If a Participant engages in any inappropriate behavior as defined herein, the Conference organizers may take any action deemed appropriate, including: a formal or informal warning to the offender, expulsion from the conference with no refund, barring the offender or their organization from participation in future conferences, reporting the incident to the offender’s local institution or funding agencies, or reporting the incident to local authorities or law enforcement. A response of "just joking" is not acceptable. If action is taken, an appeals process will be made available. There will be no retaliation against any Participant who brings a complaint or submits an incident report in good faith or who honestly assists in investigating such a complaint. All issues brought forth to the onsite Conference organizers during the course of a Conference will be immediately investigated.
+
COMPLAINT REPORTING
+
The Conference encourages all Participants to immediately report any incidents of discrimination, harassment, unprofessional conduct, and/or retaliation so that complaints can be quickly and fairly resolved. All complaints will be handled as confidentially as possible and information will be disclosed only as it is necessary to complete the investigation and bring to resolution. There will be no retaliation against any Participant who brings a complaint or submits an incident report in good faith or who honestly assists in investigating such a complaint. All issues brought forth to the Conference organizers during the course of a Conference will be immediately investigated.
+
If you have concerns related to your participation or interactions at the Conference or Conference-sanctioned events, observe someone else experiencing difficulties, or have any other concerns you wish to share, please contact the organizers at support@ahli.cc. Complaints and violations will be researched and investigated as appropriate. Reports made during the Conference will be responded to within 24 hours; those made at other times, within two weeks. We are prepared and eager to help Participants contact relevant help services, to escort them to a safe location, or to otherwise assist those experiencing harassment of any sort so that they feel safe for the duration of the Conference.
+
Specifically, the following list covers in more detail characteristics that could be the subject of prejudice, harassment, or discrimination: physical, cultural, or linguistic characteristics associated with a national origin group; marriage to or association with persons of a national origin group; tribal affiliation; membership in or association with an organization identified with or seeking to promote the interests of a national origin group; attendance or participation in schools, churches, temples, mosques, or other religious institutions generally used by persons of a national origin group; and any name associated with a national origin group. Further examples include genotype, marital status or registered domestic partner status, breastfeeding, military or veteran status, politics, technology choices, and age.
+
+
+
+
+
+
+
+
+
+
+
+
+
CHIL Sponsors
+
Thank you to our 2024 sponsors: Gordon and Betty Moore Foundation (Gold), Department of Health Outcomes and Biomedical Informatics at UFlorida College of Medicine (Gold), Apple (Silver), Genentech (Silver), Google (Silver), The Mount Sinai Hospital (Silver), Computational Precision Health Program at UCSF / UC Berkeley (Silver), UF Health (Silver), Chase Center at University of Pennsylvania (Silver), Department of Biostatistics at University of Pennsylvania (Silver), Department of Biostatistics at Columbia University (Bronze), Health Data Science (Bronze), and the Department of Surgery at University of Minnesota (Bronze)!
These Community Guidelines exist to shape and guide programs supported by the Association for Health Learning and Inference (AHLI) and the growing community it engages. These guidelines are intended to help create a space where community members can feel comfortable and supported, while still allowing for different, and sometimes conflicting, points of view. These guidelines serve a broader purpose than the code of conduct.
+
These guidelines summarize a thorough, thoughtful, and living internal policy that the AHLI Board of Directors has developed to promote a safe learning and collaborative environment. These guidelines apply to all AHLI programs including scientific meetings and informal community events. Specific persons affected by these guidelines may include, but are not limited to:
+
+
Program volunteers including organizing committee members
+
Program speakers, panelists, moderators, and other invited individuals
+
Program participants and attendees
+
+
Principles
+
+
+
Openness. We encourage open communication and critique in our intellectual community. Strong intellectual disagreement should not be confused with aggressive behavior.
+
+
+
Respect. A community where people feel uncomfortable or threatened by tone or targeting is not one where we can have the most rigorous discussion possible. During the course of academic debate and discussion, we expect our community to do their best to act in an empathetic fashion, understanding that civility is important in all cultures. We emphasize that critique of ideas or contributions should not turn into personal attacks on persons or identities.
+
+
+
Responsibility. We recognize that mistakes will be made in the course of human engagement and we commit to take responsibility for them. This commitment includes listening carefully and respectfully to all complaints, and working to understand how to improve.
+
+
+
The AHLI Conference on Health, Inference, and Learning (CHIL) solicits work across a variety of disciplines at the intersection of machine learning and healthcare. CHIL 2024 invites submissions focused on artificial intelligence and machine learning (AI/ML) techniques that address challenges in health, which we view broadly as including clinical healthcare, public health, population health, and beyond.
+
Specifically, authors are invited to submit 8-10 page papers (with unlimited pages for references) to one of 3 possible tracks: Models and Methods, Applications and Practice, or Impact and Society. Each track is described in detail below. Authors will select exactly one primary track when they register each submission, in addition to one or more sub-disciplines. Appropriate track and sub-discipline selection will ensure that each submission is reviewed by a knowledgeable set of reviewers. Track Chairs will oversee the reviewing process. In case you are not sure which track your submission fits under, feel free to contact the Track or Proceedings Chairs for clarification. The Proceedings Chairs reserve the right to move submissions between tracks if they believe that a submission has been misclassified.
+
Important Dates (all times are anywhere on Earth, AoE)
+
+
Submissions due: Feb 16, 2024
+
Bidding opens for reviewers: Feb 17, 2024
+
Bidding closes for reviewers: Feb 20, 2024
+
Papers assigned to reviewers: Feb 21, 2024
+
Reviews due: Mar 6, 2024
+
Author response period: Mar 12-19, 2024
+
Author / reviewer discussion period: Mar 19-26, 2024
+
Decision notification: Apr 3, 2024
+
CHIL conference: June 27-28, 2024
+
+
+
+
Submission Tracks
+
+
Track 1 - Models and Methods: Algorithms, Inference, and Estimation
+
Track 2 - Applications and Practice: Investigation, Evaluation, Interpretations, and Deployment
+
Track 3 - Impact and Society: Policy, Public Health, Social Outcomes, and Economics
+
+
+
+
+
+
Evaluation
+
Works submitted to CHIL will be reviewed by at least 3 reviewers. Reviewers will be asked to primarily judge the work according to the following criteria:
+
Relevance: Is the submission relevant to health, broadly construed? Does the problem addressed fall into the domains of machine learning and healthcare?
+
Quality: Is the submission technically sound? Are claims well supported by theoretical analysis or experimental results? Are the authors careful and honest about evaluating both the strengths and weaknesses of their work? Is the work complete rather than a work in progress?
+
Originality: Are the tasks, methods and results novel? Is it clear how this work differs from previous contributions? Is related work adequately cited to provide context? Does the submission contribute unique data, unique conclusions about existing data, or a unique theoretical or experimental approach?
+
Clarity: Is the submission clearly written? Is it well-organized? Does it adequately provide enough information for readers to reproduce experiments or results?
+
Significance: Is the contribution of the work important? Are other researchers or practitioners likely to use the ideas or build on them? Does the work advance the state of the art in a demonstrable way?
+
Final decisions will be made by Track and Proceedings Chairs, taking into account reviewer comments, ratings of confidence and expertise, and our own editorial judgment. Reviewers will be able to recommend that submissions change tracks or flag submissions for ethical issues, relevance and suitability concerns.
Submitted papers must be 8-10 pages (including all figures and tables). Unlimited additional pages can be used for references and additional supplementary materials (e.g. appendices). Reviewers will not be required to read the supplementary materials.
+
Authors are required to use the LaTeX template: Overleaf
+
+
+
+
+
Required Sections
+
Similar to last year, two sections will be required: 1) Data and Code Availability, and 2) Institutional Review Board (IRB).
+
Data and Code Availability: This initial paragraph is required. Briefly state what data you use (including citations if appropriate) and whether the data are available to other researchers. If you are not sharing code, you must explicitly state that you are not making your code available. If you are making your code available, then at the time of submission for review, please include your code as supplemental material or as a code repository link; in either case, your code must be anonymized. If your paper is accepted, then you should de-anonymize your code for the camera-ready version of the paper. If you do not include this data and code availability statement for your paper, or you provide code that is not anonymized at the time of submission, then your paper will be desk-rejected. Your experiments later could refer to this initial data and code availability statement if it is helpful (e.g., to avoid restating what data you use).
+
Institutional Review Board (IRB): This endmatter section is required. If your research requires IRB approval or has been designated by your IRB as Not Human Subject Research, then for the camera-ready version of the paper, you must provide IRB information (at the time of submission for review, you can say that this IRB information will be provided if the paper is accepted). If your research does not require IRB approval, then you must state this to be the case. This section does not count toward the paper page limit.
+
Archival Submissions
+
Submissions to the main conference are considered archival and will appear in the published proceedings of the conference, if accepted. Author notification of acceptance will be provided by the listed date under Important Dates.
+
+
+
Preprint Submission Policy
+
Submissions to preprint servers (such as arXiv or medRxiv) are allowed while the papers are under review. While reviewers will be encouraged not to search for the papers, you accept that uploading a preprint may make your identity known.
+
Peer Review
+
The review process is mutually anonymous (aka “double blind”). Your submitted paper, as well as any supporting text or revisions provided during the discussion period, should be completely anonymized (including links to code repositories such as Github). Please do not include any identifying information, and refrain from citing the authors’ own prior work in anything other than third-person. Violations of this anonymity policy at any stage before final manuscript acceptance decisions may result in rejection without further review.
+
Conference organizers and reviewers are required to maintain confidentiality of submitted material. Upon acceptance, the titles, authorship, and abstracts of papers will be released prior to the conference.
+
You may not submit papers that are identical, or substantially similar to versions that are currently under review at another conference or journal, have been previously published, or have been accepted for publication. Submissions to the main conference are considered archival and will appear in the published proceedings of the conference if accepted.
+
An exception to this rule is extensions of papers that have previously appeared in non-archival venues, such as workshops, arXiv, or similar venues without formal proceedings. These works may be submitted as-is or in an extended form, though they must follow our manuscript formatting guidelines. CHIL also welcomes full paper submissions that extend previously published short papers or abstracts, so long as the previously published version does not exceed 4 pages in length. Note that the submission should not cite the earlier workshop paper or report, so as to preserve anonymity in the submitted manuscript.
+
Upon submission, authors will select one or more relevant sub-discipline(s). Peer reviewers for a paper will be experts in the sub-discipline(s) selected upon its submission.
+
Note: Senior Area Chairs (SACs) are prohibited from submitting manuscripts to their respective track. Area Chairs (ACs) who plan to submit papers to the track they were assigned must notify the Track Senior Area Chair within 24 hours of submission.
+
Open Access
+
CHIL is committed to open science and ensuring our proceedings are freely available.
+
Responsible and Ethical Research
+
Computer software submissions should include an anonymized code link or code attached as supplementary material, licensing information, and provide documentation to facilitate use and reproducibility (e.g., package versions, README, intended use, and execution examples that facilitate execution by other researchers).
+
Submissions that include analysis on public datasets need to include appropriate citations and data sequestration protocols, including train/validation/test splits, where appropriate. Submissions that include analysis of non-public datasets need to additionally include information about data source, collection sites, subject demographics and subgroups statistics, data acquisition protocols, informed consent, IRB and any other information supporting evidence of adherence to data collection and release protocols.
+
Authors should discuss ethical implications and responsible uses of their work.
+
+
+
+
+
+
+
Reviewing for CHIL
+
Reviewing is a critical service in any research community, and we highly respect the expertise and contributions of our reviewers. Every submission deserves thoughtful, constructive feedback that:
+
+
Selects quality work to be highlighted at CHIL; and
+
Helps authors improve their work, either for CHIL or a future venue.
+
+
To deliver high-quality reviews, you are expected to participate in four phases of review: Bidding; Assignment; Review; Discussion. This guide is here to help you through each of these steps. Your insights and feedback make a big difference in our community and in the field of healthcare and machine learning.
+
Timeline
+
To deliver high-quality reviews, you are expected to participate in the four phases of review:
+
+
Bidding
+
Skim abstracts
+
Suggest >10 submissions that you feel qualified to review
+
Time commitment: ~1 hour
+
+
+
Assignment
+
Skim your assigned papers and immediately report:
+
Major formatting issues
+
Anonymity or Conflict of Interest issues
+
Papers that you are not comfortable reviewing
+
+
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~10 minutes per paper
+
+
+
Review
+
Deliver a thoughtful, timely review for each assigned paper
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~2-5 hours per paper
+
+
+
Discussion
+
Provide comments that respond to author feedback, other reviewers, and chairs
+
Workload: 2-5 papers per reviewer
+
Time commitment: ~1-2 hours per paper
+
+
+
+
Phase 1: Bidding
+
After the submission deadline, you will be invited to "bid" for your preferred papers in OpenReview, based on titles and abstracts. Bidding instructions will be provided via email. Please bid promptly and generously!
+
Phase 2: Assignment
+
After the bidding period closes, you will be formally assigned 2-5 papers to review. We ask you to promptly skim your papers to ensure:
+
+
no violations of required formatting rules (page limits, margins, etc)
+
no violations of anonymity (author names, institution names, github links, etc)
+
that you have sufficient expertise to review the paper
+
+
If you feel that you cannot offer an informed opinion about the quality of the paper due to expertise mismatch, please write to your assigned Area Chair on OpenReview. Area Chairs will do their best to ensure that each submission has the most competent reviewers available in the pool.
+
Phase 3: Review
+
You will be asked to complete thoughtful, constructive reviews for all assigned papers. Please ensure that your reviews are completed before the deadline, and sooner if possible. For each paper, you will fill out a form on OpenReview, similar to the form below. To help us to ensure consistency and quality, all reviews are subject to internal checks that may be manual or automated.
+
Review format
+
+
Summary of the paper
+
Summarize *your* understanding of the paper. Stick to the facts: ideally, the authors should agree with everything written here.
+
+
+
Strengths
+
Identify the promising aspects of the work.
+
+
+
Weaknesses
+
Every paper that does not meet the bar for publication is the scaffolding upon which a better research idea can be built. If you believe the work is insufficient, help the authors see where they can take their work and how.
+
If you are asking for more experiments, clearly explain why and outline what new information the experiment might offer.
+
+
+
Questions for the authors
+
Communicate what additional information would help you to evaluate the study.
+
Be explicit about how responses to your questions might change your score for the paper. Prioritize questions that might lead to big potential score changes.
+
+
+
+
Emergency reviewing
+
We will likely be seeking emergency reviewers for papers that do not receive all reviews by the deadline. Emergency reviewers will be sent a maximum of 3 papers and will need to write their reviews in a short time frame. Emergency review sign-up will be indicated in the reviewer sign-up form.
+
General advice for preparing reviews
+
Please strive to be timely, polite, and constructive, submitting reviews that you yourself would be happy to receive as an author. Be sure to review the paper, not the authors.
+
When making statements, use phrases like “the paper proposes” rather than “the authors propose”. This makes your review less personal and separates critiques of the submission from critiques of the authors.
+
External resources
+
If you would like feedback on a review, we recommend asking a mentor or colleague. When doing so, take care not to breach confidentiality.
Track specific advice for preparing reviews for a CHIL submission
+
+
Track 1: it is acceptable for a paper to use synthetic data to evaluate a proposed method. Not every paper must touch real health data, though all methods should be primarily motivated by health applications, and the realism of the synthetic data is fair to critique.
+
Track 2: contributions to this track should focus either on solving a carefully motivated problem grounded in applications, or on deployments or datasets that enable exploration and evaluation of applications.
+
Track 3: meaningful contributions to this track can span a broader scope than algorithmic development. Innovative and impactful use of existing techniques is encouraged.
+
+
Phase 4: Discussion
+
During the discussion period, you will be expected to participate in discussions on OpenReview by reading the authors’ responses and comments from other reviewers, adding additional comments from your perspective, and updating your review accordingly.
+
We expect brief but thoughtful engagement from all reviewers here. For some papers, this may involve several iterations of feedback and response. A simplistic response of “I have read the authors’ response and I chose to keep my score unchanged” is not sufficient, because it does not explain which weaknesses remain salient and why the response does not address them. Please engage meaningfully!
+
Track Chairs will work with reviewers to try to reach a consensus decision about each paper. In the event that consensus is not reached, Track Chairs make final decisions about acceptance.
+
+
+
+
Models and Methods: Algorithms, Inference, and Estimation
+
+
Advances in machine learning are critical for a better understanding of health. This track seeks technical contributions in modeling, inference, and estimation in health-focused or health-inspired settings. We welcome submissions that develop novel methods and algorithms, introduce relevant machine learning tasks, identify challenges with prevalent approaches, or learn from multiple sources of data (e.g. non-clinical and clinical data).
+
Our focus on health is broadly construed, including clinical healthcare, public health, and population health. While submissions should be primarily motivated by problems relevant to health, the contributions themselves are not required to be directly applied to real health data. For example, authors may use synthetic datasets to demonstrate properties of their proposed algorithms.
+
We welcome submissions from many perspectives, including but not limited to supervised learning, unsupervised learning, reinforcement learning, causal inference, representation learning, survival analysis, domain adaptation or generalization, interpretability, robustness, and algorithmic fairness. All kinds of health-relevant data types are in scope, including tabular health records, time series, text, images, videos, knowledge graphs, and more. We welcome all kinds of methodologies, from deep learning to probabilistic modeling to rigorous theory and beyond.
Applications and Practice: Investigation, Evaluation, Interpretation, and Deployment
+
+
The goal of this track is to highlight works applying robust methods, models, or practices to identify, characterize, audit, evaluate, or benchmark ML approaches to healthcare problems. We also welcome the unique deployments and datasets used to empirically evaluate these systems, which are necessary and important to advancing practice. Whereas the goal of Track 1 is to select papers that show significant algorithmic novelty, submit your work here if the contribution describes an emerging or established innovative application of ML in healthcare. Areas of interest include but are not limited to:
+
+
Datasets and simulation frameworks for addressing gaps in ML healthcare applications
+
Tools and platforms that facilitate integration of AI algorithms and deployment for healthcare applications
+
Innovative ML-based approaches to solving a practical problem grounded in a healthcare application
+
Surveys, benchmarks, evaluations and best practices of using ML in healthcare
+
Emerging applications of AI in healthcare
+
+
Introducing a new method is not prohibited by any means for this track, but the focus should be on how the proposed ideas address a practical limitation (e.g., robustness, computational scalability, improved performance). We encourage submissions both in traditional clinical areas (e.g., electronic health records (EHR), medical image analysis) and in emerging fields (e.g., remote and telehealth medicine, integration of omics).
Impact and Society: Policy, Public Health, and Social Outcomes
+
+
Algorithms do not exist in a vacuum: instead, they often explicitly aim for important social outcomes. This track considers issues at the intersection of algorithms and the societies they seek to impact, specifically for health. Submissions could include methodological contributions such as algorithmic development and performance evaluation for policy and public health applications, large-scale or challenging data collection, combining clinical and non-clinical data, as well as detecting and measuring bias. Submissions could also include impact-oriented research such as determining how algorithmic systems for health may introduce, exacerbate, or reduce inequities and inequalities, discrimination, and unjust outcomes, as well as evaluating the economic implications of these systems. We invite submissions tackling the responsible design of AI applications for healthcare and public health. System design for the implementation of such applications at scale is also welcome, which often requires balancing various tradeoffs in decision-making. Submissions related to understanding barriers to the deployment and adoption of algorithmic systems for societal-level health applications are also of interest. In addressing these problems, insights from social sciences, law, clinical medicine, and the humanities can be crucial.
Financial support ensures that CHIL remains accessible to a broad set of participants by offsetting the expenses involved in participation. We follow best practices in other conferences to maintain a transparent and appropriate relationship with our funders:
+
+
The substance and structure of the conference are determined independently by the program committees.
+
All papers are chosen through a rigorous, mutually anonymous peer review process, where authors disclose conflicts of interest.
+
All sources of financial support are acknowledged.
+
Benefits are publicly disclosed below.
+
Corporate sponsors cannot specify how contributions are spent.
+
+
+
+
+
2024 Sponsorship Levels
+
Sponsorship of the annual AHLI Conference on Health, Inference and Learning (CHIL) contributes to furthering research and interdisciplinary dialogue around machine learning and health. We deeply appreciate any amount of support your company or foundation can provide.
+
+
Diamond ($20,000 USD)
+
+
Prominent display of company logo on our website
+
Verbal acknowledgment of contribution in the opening and closing remarks of the conference
+
Access to the contact information and CVs of CHIL 2024 attendees who opt in to career opportunities
+
Dedicated time during the lunch break to present a 20-minute talk on the company's machine learning and health research or development
+
Present demo during the poster session
+
Free registration for up to ten (10) representatives from your organization
+
Free company booth at the venue
+
+
Gold ($10,000 USD)
+
+
Prominent display of company logo on our website
+
Verbal acknowledgment of contribution in the opening and closing remarks of the conference
+
Present demo during the poster session
+
Free registration for up to five (5) representatives from your organization
+
Free company booth at the venue
+
+
Silver ($5,000 USD)
+
+
Prominent display of company logo on our website
+
Verbal acknowledgment of contribution in the opening and closing remarks of the conference
+
Free registration for up to two (2) representatives from your organization
+
+
Bronze ($2,000 USD)
+
+
Prominent display of company logo on our website
+
Free registration for one (1) representative from your organization
The Conference on Health, Inference, and Learning (CHIL) targets a cross-disciplinary representation of clinicians and researchers (from industry and academia) in machine learning, health policy, causality, fairness, and other related areas.
+
The conference is designed to spark insight-driven discussions on new and emerging ideas that may lead to collaboration.
+
CHIL 2024 features an incredible lineup of speakers including:
+ Panel: Health Economics and Behavior
+ David Meltzer, MD, PhD, University of Chicago, Walter Dempsey, PhD, University of Michigan, F. Perry Wilson, MD, Yale School of Medicine. Moderated by Kyra Gan, PhD, Cornell University
+
+
+ Panel: Real Deployments, and How to Find Them
+ Girish Nadkarni, MD, MPH, Mount Sinai, Roy Perlis, MD, Harvard University, Ashley Beecy, MD, NewYork-Presbyterian. Moderated by Leo Celi, PhD, Massachusetts Institute of Technology
+
+
+
+
+
3:30pm - 3:55pm
+
Doctoral Symposium Lightning Talks
+
+
+
3:55pm - 4:05pm
+
Closing Remarks
+
+
+
4:05pm - 5:20pm
+
Doctoral Symposium Poster Session
+
+
+
+
+
+
+
+
+
+
Time
+
Title
+
Location
+
+
+
+
+
8:30am - 9:00am
+
Check-in and Refreshments
+
CSAIL 1st Floor Lobby
+
+
+
9:00am - 9:05am
+
Opening Remarks
+
CSAIL Kirsch Auditorium
+
+
+
+
Session IX: The State of Machine Learning for Health: Where Are We Now, and Where Do We Go?
Machine Learning for Healthcare in the Era of ChatGPT
+ Karandeep Singh, MD, University of Michigan, Nigam Shah, MBBS, PhD, Stanford University, Saadia Gabriel, PhD, MIT, and Tristan Naumann, PhD, Microsoft Research. Moderated by Byron Wallace, PhD, Northeastern University.
+
Session XI: Research Roundtables
+ Bridging the gap between the business of value-based care and the research of health AI, Yubin Park, PhD, ApolloMed
+ Auditing Algorithm Performance and Equity, Alistair Johnson, DPhil, Hospital for Sick Children
+ Data privacy: Interactive or Non-interactive?, Khaled El Emam, PhD, University of Ottawa and Li Xiong, PhD, Emory University
+ Data Heterogeneity: More Heterogeneous Data or Less Homogeneous Data?, Tianxi Cai, ScD, Harvard Medical School and Yong Chen, PhD, University of Pennsylvania
+ Network Studies: As Many Databases as Possible or Enough to Answer the Question Quickly?, Christopher Chute, MD, Johns Hopkins University and Robert Platt, PhD, McGill University
+
CSAIL 4th floor Star and Kiva Rooms
+
+
+
3:00pm - 3:30pm
+
Networking Break
+
CSAIL 4th Floor Lobby
+
+
+
3:30pm - 4:45pm
+
+ Session XII: Doctoral Symposium
+
CSAIL 4th Floor Lobby
+
+
+
4:45pm - 5:15pm
+
+ Session XIII: I Can’t Believe It’s Not Better Lightning Talks
+ David Bellamy, Bhawesh Kumar, Cindy Wang, Andrew Beam : Can pre-trained Transformers beat simple baselines on lab data?
+ Hiba Ahsan, Silvio Amir, Byron Wallace : On the Difficulty of Disentangling Race in Representations of Clinical Notes
+ Olga Demler : On intransitivity of win ratio and area under the receiver operating characteristics curve
+ Wouter van Amsterdam, Rajesh Ranganath : My prediction model is super accurate so it will be useful for treatment decision making, right? Wrong!
+ Yuan Zhao, David Benkeser, Russell Kempker : Doubly Robust Approaches for Estimating Treatment Effect in Observational Studies are not Better than G-Computation
+
+
CSAIL Kirsch Auditorium
+
+
+
5:15pm - 5:20pm
+
Closing Remarks
+
CSAIL Kirsch Auditorium
+
+
+
By joining this conference, you are granting the CHIL organizing committee permission to use your image (screenshots and/or video recording) for use in media publications including videos, email blasts, social media, brochures, newsletters, websites, and general publications.
+
+
+
+
+
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/papers.json b/papers.json
new file mode 100644
index 000000000..826653477
--- /dev/null
+++ b/papers.json
@@ -0,0 +1 @@
+[{"TLDR":"Spiking Neural Networks (SNNs) operate with asynchronous discrete events (or spikes) which can potentially lead to higher energy-efficiency in neuromorphic hardware implementations. Many works have shown that an SNN for inference can be formed by copying the weights from a trained Artificial Neural Network (ANN) and setting the firing threshold for each layer as the maximum input received in that layer. These type of converted SNNs require a large number of time steps to achieve competitive accuracy which diminishes the energy savings. The number of time steps can be reduced by training SNNs with spike-based backpropagation from scratch, but that is computationally expensive and slow. To address these challenges, we present a computationally-efficient training technique for deep SNNs. We propose a hybrid training methodology: 1) take a converted SNN and use its weights and thresholds as an initialization step for spike-based backpropagation, and 2) perform incremental spike-timing dependent backpropagation (STDB) on this carefully initialized network to obtain an SNN that converges within few epochs and requires fewer time steps for input processing. STDB is performed with a novel surrogate gradient function defined using neuron's spike time. The weight update is proportional to the difference in spike timing between the current time step and the most recent time step the neuron generated an output spike. The SNNs trained with our hybrid conversion-and-STDB training perform at $10{\\times}{-}25{\\times}$ fewer number of time steps and achieve similar accuracy compared to purely converted SNNs. The proposed training methodology converges in less than $20$ epochs of spike-based backpropagation for most standard image classification datasets, thereby greatly reducing the training complexity compared to training SNNs from scratch. We perform experiments on CIFAR-10, CIFAR-100 and ImageNet datasets for both VGG and ResNet architectures. We achieve top-1 accuracy of $65.19\\%$ for ImageNet dataset on SNN with $250$ time steps, which is $10{\\times}$ faster compared to converted SNNs with similar accuracy.","UID":"B1xSperKvH","abstract":"Spiking Neural Networks (SNNs) operate with asynchronous discrete events (or spikes) which can potentially lead to higher energy-efficiency in neuromorphic hardware implementations. Many works have shown that an SNN for inference can be formed by copying the weights from a trained Artificial Neural Network (ANN) and setting the firing threshold for each layer as the maximum input received in that layer. These type of converted SNNs require a large number of time steps to achieve competitive accuracy which diminishes the energy savings. The number of time steps can be reduced by training SNNs with spike-based backpropagation from scratch, but that is computationally expensive and slow. To address these challenges, we present a computationally-efficient training technique for deep SNNs. We propose a hybrid training methodology: 1) take a converted SNN and use its weights and thresholds as an initialization step for spike-based backpropagation, and 2) perform incremental spike-timing dependent backpropagation (STDB) on this carefully initialized network to obtain an SNN that converges within few epochs and requires fewer time steps for input processing. STDB is performed with a novel surrogate gradient function defined using neuron's spike time. The weight update is proportional to the difference in spike timing between the current time step and the most recent time step the neuron generated an output spike. The SNNs trained with our hybrid conversion-and-STDB training perform at $10{\\times}{-}25{\\times}$ fewer number of time steps and achieve similar accuracy compared to purely converted SNNs. The proposed training methodology converges in less than $20$ epochs of spike-based backpropagation for most standard image classification datasets, thereby greatly reducing the training complexity compared to training SNNs from scratch. We perform experiments on CIFAR-10, CIFAR-100 and ImageNet datasets for both VGG and ResNet architectures. We achieve top-1 accuracy of $65.19\\%$ for ImageNet dataset on SNN with $250$ time steps, which is $10{\\times}$ faster compared to converted SNNs with similar accuracy.","authors":["Nitin Rathi","Gopalakrishnan Srinivasan","Priyadarshini Panda","Kaushik Roy"],"code_link":"https://github.com/Mini-Conf/Mini-Conf","forum":"B1xSperKvH","keywords":["imagenet"],"link":"https://arxiv.org/abs/2007.12238","pdf_url":"","recs":[],"sessions":["Tues Session 1","Mon Session 1"],"title":"Enabling Deep Spiking Neural Networks with Hybrid Conversion and Spike Timing Dependent Backpropagation"}]
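The `papers.json` record above follows the MiniConf schema (`UID`, `title`, `authors`, `keywords`, `sessions`, and so on). A minimal sketch of how such a file might be consumed, assuming a record shaped like the one above (the helper function is illustrative, not part of MiniConf):

```python
import json

# Minimal record following the schema shown above (most fields truncated).
sample = json.loads("""
[{"UID": "B1xSperKvH",
  "title": "Enabling Deep Spiking Neural Networks with Hybrid Conversion and Spike Timing Dependent Backpropagation",
  "authors": ["Nitin Rathi", "Gopalakrishnan Srinivasan", "Priyadarshini Panda", "Kaushik Roy"],
  "keywords": ["imagenet"],
  "sessions": ["Tues Session 1", "Mon Session 1"]}]
""")

def papers_in_session(papers, session):
    """Return the titles of all papers scheduled in a given session."""
    return [p["title"] for p in papers if session in p.get("sessions", [])]

titles = papers_in_session(sample, "Mon Session 1")
```

In practice the full file would be read with `json.load(open("papers.json"))`; note that JSON strings may not contain raw newlines, so multi-paragraph abstracts must be escaped as `\n`.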
diff --git a/poster_B1xSperKvH.html b/poster_B1xSperKvH.html
new file mode 100644
index 000000000..d3bf89667
--- /dev/null
+++ b/poster_B1xSperKvH.html
@@ -0,0 +1,727 @@
+ CHIL
+
+ : Enabling Deep Spiking Neural Networks with Hybrid Conversion and Spike Timing Dependent Backpropagation
+
+ Enabling Deep Spiking Neural Networks with Hybrid Conversion and Spike Timing Dependent Backpropagation
+
+ Abstract:
+ Spiking Neural Networks (SNNs) operate with asynchronous discrete events (or spikes) which can potentially lead to higher energy-efficiency in neuromorphic hardware implementations. Many works have shown that an SNN for inference can be formed by copying the weights from a trained Artificial Neural Network (ANN) and setting the firing threshold for each layer as the maximum input received in that layer. These type of converted SNNs require a large number of time steps to achieve competitive accuracy which diminishes the energy savings. The number of time steps can be reduced by training SNNs with spike-based backpropagation from scratch, but that is computationally expensive and slow. To address these challenges, we present a computationally-efficient training technique for deep SNNs. We propose a hybrid training methodology: 1) take a converted SNN and use its weights and thresholds as an initialization step for spike-based backpropagation, and 2) perform incremental spike-timing dependent backpropagation (STDB) on this carefully initialized network to obtain an SNN that converges within few epochs and requires fewer time steps for input processing. STDB is performed with a novel surrogate gradient function defined using neuron's spike time. The weight update is proportional to the difference in spike timing between the current time step and the most recent time step the neuron generated an output spike. The SNNs trained with our hybrid conversion-and-STDB training perform at $10{\times}{-}25{\times}$ fewer number of time steps and achieve similar accuracy compared to purely converted SNNs. The proposed training methodology converges in less than $20$ epochs of spike-based backpropagation for most standard image classification datasets, thereby greatly reducing the training complexity compared to training SNNs from scratch. We perform experiments on CIFAR-10, CIFAR-100 and ImageNet datasets for both VGG and ResNet architectures. 
We achieve top-1 accuracy of $65.19\%$ for ImageNet dataset on SNN with $250$ time steps, which is $10{\times}$ faster compared to converted SNNs with similar accuracy.
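The abstract describes a surrogate gradient that decays with the time elapsed since the neuron's most recent output spike. A toy sketch of that general shape (the exponential form and the constants `alpha` and `beta` are illustrative placeholders, not the paper's exact function or values):

```python
import math

def stdb_surrogate(t, last_spike_t, alpha=0.3, beta=0.01):
    """Surrogate gradient that shrinks as the gap between the current
    time step t and the neuron's last output spike grows.
    alpha/beta are toy constants, not the paper's."""
    dt = t - last_spike_t
    return alpha * math.exp(-beta * dt)

# The gradient is largest right after a spike and fades as dt grows,
# so weight updates favor recently active neurons.
g_recent = stdb_surrogate(t=105, last_spike_t=100)
g_stale = stdb_surrogate(t=200, last_spike_t=100)
```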
+
+
+
+
+
+
+
+ Add content for posters. This could be a video, embedded pdf, chat room ....
+
+
+
+
+
+
+
Example Chat
Example SlidesLive
+
+
+
+
+
+
+
+
+
Example Poster
+ Abstract:
+ Understanding the irregular electrical activity of atrial fibrillation (AFib) has been a key challenge in electrocardiography. For serious cases of AFib, catheter ablations are performed to collect intracardiac electrograms (EGMs). EGMs offer intricately detailed and localized electrical activity of the heart and are an ideal modality for interpretable cardiac studies. Recent advancements in artificial intelligence (AI) have allowed some works to utilize deep learning frameworks to interpret EGMs during AFib. Additionally, language models (LMs) have shown exceptional performance in being able to generalize to unseen domains, especially in healthcare. In this study, we are the first to leverage pretrained LMs for finetuning of EGM interpolation and AFib classification via masked language modeling. We formulate the EGM as a textual sequence and present competitive performances on AFib classification compared against other representations. Lastly, we provide a comprehensive interpretability study to provide a multi-perspective intuition of the model's behavior, which could greatly benefit clinical use.
+
+ Abstract:
+ Drug synergy arises when the combined impact of two drugs exceeds the sum of their individual effects. While single-drug effects on cell lines are well documented, the scarcity of data on drug synergy, considering the vast array of potential drug combinations, prompts a growing interest in computational approaches for predicting synergies in untested drug pairs. We introduce a Graph Neural Network (GNN) based model for drug synergy prediction, which utilizes drug chemical structures and cell line gene expression data. We extract data from the largest available drug combination database (DrugComb) and generate multiple synergy scores (commonly used in the literature) to create seven datasets that serve as a reliable benchmark with high confidence. In contrast to conventional models relying on pre-computed chemical features, our GNN-based approach learns task-specific drug representations directly from the graph structure of the drugs, providing superior performance in predicting drug synergies. Our work suggests that learning task-specific drug representations and leveraging a diverse dataset is a promising approach to advancing our understanding of drug-drug interaction and synergy.
+
+ Abstract:
+ Personalization in healthcare helps to translate clinical data into more effective disease management. In practice, this is achieved by subgrouping, whereby clusters with similar patient characteristics are identified and then receive customized treatment plans with the goal of targeting subgroup-specific disease dynamics. In this paper, we propose a novel mixture hidden Markov model for subgrouping patient trajectories from chronic diseases. Our model is interpretable and carefully designed to capture different trajectory phases of chronic diseases (i.e., 'severe', 'moderate', and 'mild') through tailored latent states. We demonstrate our subgrouping framework based on a longitudinal study across 847 patients with non-specific low back pain. Here, our subgrouping framework identifies 8 subgroups. Further, we show that our subgrouping framework outperforms common baselines in terms of cluster validity indices. Finally, we discuss the applicability of the model to other chronic and long-lasting diseases. For healthcare practitioners, this presents the opportunity for treatment plans tailored to the specific needs of patient subgroups.
+
+ Abstract:
+ Synthetic medical data generation has opened up new possibilities in the healthcare domain, offering a powerful tool for simulating clinical scenarios, enhancing diagnostic and treatment quality, gaining granular medical knowledge, and accelerating the development of unbiased algorithms. In this context, we present a novel approach called ViewXGen, designed to overcome the limitations of existing methods that rely on general domain pipelines using only radiology reports to generate frontal-view chest X-rays. Our approach takes into consideration the diverse view positions found in the dataset, enabling the generation of chest X-rays with specific views, which marks a significant advancement in the field. To achieve this, we introduce a set of specially designed tokens for each view position, tailoring the generation process to the user's preferences. Furthermore, we leverage multi-view chest X-rays as input, incorporating valuable information from different views within the same study. This integration rectifies potential errors and contributes to faithfully capturing abnormal findings in chest X-ray generation. To validate the effectiveness of our approach, we conducted statistical analyses, evaluating its performance in a clinical efficacy metric on the MIMIC-CXR dataset. Also, human evaluation demonstrates the remarkable capabilities of ViewXGen, particularly in producing realistic view-specific X-rays that closely resemble the original images.
+
+ Abstract:
+ Machine learning applications hold promise to aid clinicians in a wide range of clinical tasks, from diagnosis to prognosis, treatment, and patient monitoring. These potential applications are accompanied by a surge of ethical concerns surrounding the use of Machine Learning (ML) models in healthcare, especially regarding fairness and non-discrimination. While there is an increasing number of regulatory policies to ensure the ethical and safe integration of such systems, the translation from policies to practices remains an open challenge. Algorithmic frameworks, aiming to bridge this gap, should be tailored to the application to enable the translation from fundamental human-right principles into accurate statistical analysis, capturing the inherent complexity and risks associated with the system. In this work, we propose a set of fairness impartial checks especially adapted to ML early-warning systems in the medical context, comprising on top of standard fairness metrics, an analysis of clinical outcomes, and a screening of potential sources of bias in the pipeline. Our analysis is further fortified by the inclusion of event-based and prevalence-corrected metrics, as well as statistical tests to measure biases. Additionally, we emphasize the importance of considering subgroups beyond the conventional demographic attributes. Finally, to facilitate operationalization, we present an open-source tool FAMEWS to generate comprehensive fairness reports. These reports address the diverse needs and interests of the stakeholders involved in integrating ML into medical practice. The use of FAMEWS has the potential to reveal critical insights that might otherwise remain obscured. This can lead to improved model design, which in turn may translate into enhanced health outcomes.
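One of the standard checks a fairness report of this kind includes is a per-group comparison of error rates such as the true-positive rate. A minimal sketch of that check (this is not the FAMEWS implementation; names and data are illustrative):

```python
def group_tpr(y_true, y_pred, groups):
    """True-positive rate per subgroup -- a basic equalized-odds-style
    check; large gaps between groups flag a potential bias."""
    stats = {}
    for g in set(groups):
        tp = sum(1 for yt, yp, gg in zip(y_true, y_pred, groups)
                 if gg == g and yt == 1 and yp == 1)
        pos = sum(1 for yt, gg in zip(y_true, groups) if gg == g and yt == 1)
        stats[g] = tp / pos if pos else float("nan")
    return stats

# Toy labels/predictions for two subgroups "a" and "b".
y_true = [1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
groups = ["a", "a", "a", "b", "b", "b"]
tprs = group_tpr(y_true, y_pred, groups)
```

A full report like the one described would add prevalence correction, event-based metrics, and statistical tests on these gaps.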
+
+ Abstract:
+ Medical image segmentation typically requires numerous dense annotations in the target domain to train models, which is time-consuming and labor-intensive. To mitigate this burden, unsupervised domain adaptation has been developed to train models with good generalisation performance on the target domain by leveraging a label-rich source domain and the unlabeled target domain data. In this paper, we introduce a novel Dynamic Prototype Contrastive Learning (DPCL) framework for cross-domain medical image segmentation with unlabeled target domains, which dynamically updates cross-domain global prototypes and excavates implicit discrepancy information in a contrastive manner. DPCL enhances the discriminative capability of the segmentation model while learning cross-domain global feature representations. In particular, DPCL introduces a novel cross-domain prototype evolution module through dynamic updating and evolutionary strategies. This module generates evolved cross-domain prototypes, facilitating the progressive transformation from the source domain to the target domain and acquiring global cross-domain guidance knowledge. Moreover, a cross-domain embedding contrastive module is devised to establish contrastive relationships in the embedding space. This captures both homogeneous and heterogeneous information within the same category and among different categories, enhancing the discriminative capability of the segmentation model. Experimental results demonstrate that the proposed DPCL is effective and outperforms the state-of-the-art methods.
+
+ Abstract:
+ In healthcare applications, there is a growing need to develop machine learning models that use data from a single source, such as that from a wrist wearable device, to monitor physical activities, assess health risks, and provide immediate health recommendations or interventions. However, the limitation of using single-source data often compromises the model's accuracy, as it fails to capture the full scope of human activities. While a more comprehensive dataset can be gathered in a lab setting using multiple sensors attached to various body parts, this approach is not practical for everyday use due to the impracticality of wearing multiple sensors. To address this challenge, we introduce a transfer learning framework that optimizes machine learning models for everyday applications by leveraging multi-source data collected in a laboratory setting. We introduce a novel metric to leverage the inherent relationship between these multiple data sources, as they are all paired to capture aspects of the same physical activity. Through numerical experiments, our framework outperforms existing methods in classification accuracy and robustness to noise, offering a promising avenue for the enhancement of daily activity monitoring.
+
+ Abstract:
+ Effective collaboration across medical institutions presents a significant challenge, primarily due to the imperative of maintaining patient privacy. Optimal machine learning models in healthcare demand access to extensive, high-quality data to achieve generality and robustness. Yet, typically, medical institutions are restricted to data within their networks, limiting the scope and diversity of information. This limitation becomes particularly acute when encountering patient cases with rare or unique characteristics, leading to potential distribution shifts in the data. To address these challenges, our work introduces a framework designed to enhance existing clinical foundation models, Private Synthetic Hypercube Augmentation (PriSHA). We leverage generative models to produce synthetic data, generated from diverse sources, as a means to augment these models while adhering to strict privacy standards. This approach promises to broaden the dataset's scope and improve model performance without compromising patient confidentiality. To the best of our knowledge, our framework is the first framework to address distribution shifts through the use of synthetic privacy-preserving tabular data augmentation.
+
+ Abstract:
+ Promoting healthy lifestyle behaviors remains a major public health concern, particularly due to their crucial role in preventing chronic conditions such as cancer, heart disease, and type 2 diabetes. Mobile health applications present a promising avenue for low-cost, scalable health behavior change promotion. Researchers are increasingly exploring adaptive algorithms that personalize interventions to each person's unique context. However, in empirical studies, mobile health applications often suffer from small effect sizes and low adherence rates, particularly in comparison to human coaching. Tailoring advice to a person's unique goals, preferences, and life circumstances is a critical component of health coaching that has been underutilized in adaptive algorithms for mobile health interventions. To address this, we introduce a new Thompson sampling algorithm that can accommodate personalized reward functions (i.e., goals, preferences, and constraints), while also leveraging data sharing across individuals to more quickly be able to provide effective recommendations. We prove that our modification incurs only a constant penalty on cumulative regret while preserving the sample complexity benefits of data sharing. We present empirical results on synthetic and semi-synthetic physical activity simulators, where in the latter we conducted an online survey to solicit preference data relating to physical activity, which we use to construct realistic reward models that leverages historical data from another study. Our algorithm achieves substantial performance improvements compared to baselines that do not share data or do not optimize for individualized rewards.
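For context, plain Beta-Bernoulli Thompson sampling, the baseline the abstract builds on, can be sketched as follows (this omits the paper's personalized reward functions and cross-user data sharing; it is the textbook algorithm, not the authors' variant):

```python
import random

def thompson_select(successes, failures, rng=random):
    """Standard Beta-Bernoulli Thompson sampling: draw one sample from
    each arm's Beta posterior and play the arm with the largest draw."""
    samples = [rng.betavariate(s + 1, f + 1)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=lambda i: samples[i])

random.seed(0)
# Arm 1 has a far better empirical success rate (40/42 vs 2/42),
# so its posterior samples dominate and it is chosen almost always.
picks = [thompson_select([2, 40], [40, 2]) for _ in range(200)]
```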
+
+ Abstract:
+ This study demonstrates the first in-hospital adaptation of a cloud-based AI, similar to ChatGPT, into a secure model for analyzing radiology reports, prioritizing patient data privacy. By employing a unique sentence-level knowledge distillation method through contrastive learning, we achieve over 95% accuracy in detecting anomalies. The model also accurately flags uncertainties in its predictions, enhancing its reliability and interpretability for physicians with certainty indicators. Despite limitations in data privacy during the training phase, such as requiring de-identification or IRB permission, our study is significant in addressing this issue in the inference phase (once the local model is trained), without the need for human annotation throughout the entire process. These advancements represent a new direction for developing secure and efficient AI tools for healthcare with minimal supervision, paving the way for a promising future of in-hospital AI applications.
+
+ Abstract:
+ Most past work in multiple instance learning (MIL), which maps groups of instances to classification labels, has focused on settings in which the order of instances does not contain information. In this paper, we define MIL with \textit{absolute} position information: tasks in which instances of importance remain in similar positions across bags. Such problems arise, for example, in MIL with medical images in which there exists a common global alignment across images (e.g., in chest x-rays the heart is in a similar location). We also evaluate the performance of existing MIL methods on a set of new benchmark tasks and two real data tasks with varying amounts of absolute position information. We find that, despite being less computationally efficient than other approaches, transformer-based MIL methods are more accurate at classifying tasks with absolute position information. Thus, we investigate the ability of positional encodings, a mechanism typically only used in transformers, to improve the accuracy of other MIL approaches. Applied to the task of identifying pathological findings in chest x-rays, when augmented with positional encodings, standard MIL approaches perform significantly better than without (AUROC of 0.799, 95\% CI: [0.791, 0.806] vs. 0.782, 95\% CI: [0.774, 0.789]) and on-par with transformer-based methods (AUROC of 0.797, 95\% CI: [0.790, 0.804]) while being 10 times faster. Our results suggest that one can efficiently and accurately classify MIL data with standard approaches by simply including positional encodings.
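The positional encodings referred to are typically the sinusoidal ones from the transformer literature, which can be added to any instance embedding. A minimal sketch (length and dimension are arbitrary; this is not the paper's exact setup):

```python
import math

def sinusoidal_encoding(num_positions, dim):
    """Sinusoidal positional encodings in the Vaswani et al. style:
    even dimensions get sin, odd dimensions get cos, with wavelengths
    spread geometrically up to 10000."""
    pe = [[0.0] * dim for _ in range(num_positions)]
    for pos in range(num_positions):
        for i in range(0, dim, 2):
            angle = pos / (10000 ** (i / dim))
            pe[pos][i] = math.sin(angle)
            if i + 1 < dim:
                pe[pos][i + 1] = math.cos(angle)
    return pe

# One encoding row per instance position in the bag.
pe = sinusoidal_encoding(num_positions=4, dim=8)
```

In an MIL pipeline these rows would simply be added to (or concatenated with) each instance's feature vector before pooling.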
+
+ Abstract:
+ This paper presents FlowCyt, the first comprehensive benchmark for multi-class single-cell classification in flow cytometry data. The dataset comprises bone marrow samples from 30 patients, with each cell characterized by twelve markers. Ground truth labels identify five hematological cell types: T lymphocytes, B lymphocytes, Monocytes, Mast cells, and Hematopoietic Stem/Progenitor Cells (HSPCs). Experiments utilize supervised inductive learning and semi-supervised transductive learning on up to 1 million cells per patient. Baseline methods include Gaussian Mixture Models, XGBoost, Random Forests, Deep Neural Networks, and Graph Neural Networks (GNNs). GNNs demonstrate superior performance by exploiting spatial relationships in graph-encoded data. The benchmark allows standardized evaluation of clinically relevant classification tasks, along with exploratory analyses to gain insights into hematological cell phenotypes. This represents the first public flow cytometry benchmark with a richly annotated, heterogeneous dataset. It will empower the development and rigorous assessment of novel methodologies for single-cell analysis.
+
+ Abstract:
+ Patients often face difficulties in understanding their hospitalizations, while healthcare workers have limited resources to provide explanations. In this work, we investigate the potential of large language models to generate patient summaries based on doctors' notes and study the effect of training data on the faithfulness and quality of the generated summaries. To this end, we develop a rigorous labeling protocol for hallucinations, and have two medical experts annotate 100 real-world summaries and 100 generated summaries. We show that fine-tuning on hallucination-free data effectively reduces hallucinations from 2.60 to 1.55 per summary for Llama 2, while preserving relevant information. Although the effect is still present, it is much smaller for GPT-4 when prompted with five examples (0.70 to 0.40). We also conduct a qualitative evaluation using hallucination-free and improved training data. GPT-4 shows very good results even in the zero-shot setting. We find that common quantitative metrics do not correlate well with faithfulness and quality. Finally, we test GPT-4 for automatic hallucination detection, which yields promising results.
+
+ Abstract:
+ Atrial fibrillation (AF), a common cardiac arrhythmia, significantly increases the risk of stroke, heart disease, and mortality. Photoplethysmography (PPG) offers a promising solution for continuous AF monitoring, due to its cost efficiency and integration into wearable devices. Nonetheless, PPG signals are susceptible to corruption from motion artifacts and other factors often encountered in ambulatory settings. Conventional approaches typically discard corrupted segments or attempt to reconstruct original signals, allowing for the use of standard machine learning techniques. However, this reduces dataset size and introduces biases, compromising prediction accuracy and the effectiveness of continuous monitoring. We propose a novel deep learning model, Signal Quality Weighted Fusion of Attentional Convolution and Recurrent Neural Network (SQUWA), designed to learn how to retain accurate predictions from partially corrupted PPG. Specifically, SQUWA innovatively integrates an attention mechanism that directly considers signal quality during the learning process, dynamically adjusting the weights of time series segments based on their quality. This approach enhances the influence of higher-quality segments while reducing that of lower-quality ones, effectively utilizing partially corrupted segments. It represents a departure from conventional methods that exclude such segments, enabling the utilization of a broader range of data, with less disruption when monitoring AF risk and more accurate estimation of AF burden. Moreover, SQUWA utilizes variable-sized convolutional kernels to capture complex PPG signal patterns across different resolutions for enhanced learning. Our extensive experiments show that SQUWA outperforms existing PPG-based models, achieving the highest AUCPR of 0.89 with label noise mitigation. This also exceeds the 0.86 AUCPR of models trained using both electrocardiogram (ECG) and PPG data.
+
+ Addressing wearable sleep tracking inequity: a new dataset and novel methods for a population with sleep disorders
+
+
+ Will Ke Wang, Jiamu Yang, Leeor Hershkovich, Hayoung Jeong, Bill Chen, Karnika Singh, Ali R Roghanizad, Md Mobashir Hasan Shandhi, Andrew R Spector, Jessilyn Dunn
+
+ Abstract:
+ Sleep is crucial for health, and recent advances in wearable technology and machine learning offer promising methods for monitoring sleep outside the clinical setting. However, sleep tracking using wearables is challenging, particularly for those with irregular sleep patterns or sleep disorders. In this study, we introduce a dataset collected from 100 patients from [redacted for anonymity] Sleep Disorders Center who wore the Empatica E4 smartwatch during an overnight sleep study with concurrent clinical-grade polysomnography (PSG) recording. This dataset encompasses diverse demographics and medical conditions. We further introduce a new methodology that addresses the limitations of existing modeling methods when applied to patients with sleep disorders. Namely, we address the inability of existing base models to account for 1) temporal relationships while leveraging relatively small data, by introducing an LSTM post-processing method, and 2) group-wise characteristics that impact classification task performance (i.e., random effects), by ensembling mixed-effects boosted tree models. This approach was highly successful for sleep onset and wakefulness detection in this sleep-disordered population, achieving an F1 score of 0.823 ± 0.019, an AUROC of 0.926 ± 0.016, and a Cohen's kappa of 0.695 ± 0.025. Overall, we demonstrate the utility of both the data that we collected and our unique approach to addressing the existing gap in wearable-based sleep tracking for sleep-disordered populations.
+
+ Abstract:
+ The rapid development of wearable biomedical systems now enables real-time monitoring of electroencephalography (EEG) signals. Acquisition of these signals relies on electrodes. These systems must meet the design challenge of selecting an optimal set of electrodes that balances performance and usability constraints. The search for the optimal subset of electrodes from a larger set is a problem with combinatorial complexity. While existing research has primarily focused on search strategies that only explore limited combinations, our methodology proposes a computationally efficient way to explore all combinations. To avoid the computational burden associated with training the model for each combination, we leverage an innovative approach inspired by few-shot learning. Remarkably, this strategy covers all the wearable electrode combinations while significantly reducing training time compared to retraining the network on each possible combination. In the context of an epileptic seizure detection task, the proposed method achieves an AUC value of 0.917 with configurations using eight electrodes. This performance matches that of prior research but is achieved in significantly less time, transforming a process that would span months into a matter of hours on a single GPU device. Our work allows comprehensive exploration of electrode configurations in wearable biomedical device design, yielding insights that enhance performance and real-world feasibility.
+
+ Abstract:
+ Deep learning models have achieved promising results in breast cancer classification, yet their 'black-box' nature raises interpretability concerns. This research addresses the crucial need to gain insights into the decision-making process of convolutional neural networks (CNNs) for mammogram classification, specifically focusing on the underlying reasons for the CNN's predictions of breast cancer. For CNNs trained on the Mammographic Image Analysis Society (MIAS) dataset, we compared the post-hoc interpretability techniques LIME, Grad-CAM, and Kernel SHAP in terms of explanatory depth and computational efficiency. The results of this analysis indicate that Grad-CAM, in particular, provides comprehensive insights into the behavior of the CNN, revealing distinctive patterns in normal, benign, and malignant breast tissue. We discuss the implications of the current findings for the use of machine learning models and interpretation techniques in clinical practice.
+
+ Abstract:
+ We introduce a novel hierarchical Bayesian estimator for permutation entropy (PermEn), designed to improve the accuracy of entropy assessments of biomedical time series signal sets, particularly for short-duration signals. Unlike existing methods that require a substantial number of observations or impose restrictive priors, our approach uses a non-centered, Wasserstein distance optimized hierarchical prior, enabling efficient full Markov Chain Monte Carlo inference and a broader spectrum of PermEn priors. Comparative evaluations with synthetic and secondary benchmark data demonstrate our estimator's enhanced performance, including a significant reduction in estimation error (13.33-63.67%), posterior variance (8.16-47.77%), and reference prior distance error (47-60.83%, p ≤ 2.42 × 10⁻¹⁰) against current state-of-the-art methods. Applied to oxygen uptake signals from cardiopulmonary exercise testing, our method revealed a previously unreported entropy difference between obese and lean subjects (mean difference: 1.732%; 94% CI [2.34%, 1.11%], p ≤ 1/20000), with more precise credible intervals (16-24% improvement). This entropy disparity becomes statistically non-significant in participants completing over 7.5 minutes of testing, suggesting potential insights into physiological complexity, exercise tolerance, and obesity. Our estimator thus not only refines the estimation of PermEn in biomedical signals but also underscores entropy's potential value as a health biomarker, opening avenues for further physiological and biomedical exploration.
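+ For context, the quantity being estimated can be illustrated with the standard frequentist plug-in computation of permutation entropy (the classic Bandt-Pompe ordinal-pattern approach; the paper's hierarchical Bayesian estimator refines this, and the sketch below is not the authors' code):

```python
import math
from collections import Counter

def permutation_entropy(x, order=3, delay=1, normalize=True):
    """Plug-in permutation entropy of a 1-D sequence.

    Each length-`order` window is mapped to its ordinal pattern (the
    permutation that sorts it); entropy is computed over the empirical
    pattern distribution, optionally normalized to [0, 1] by log(order!).
    """
    n = len(x) - (order - 1) * delay
    patterns = Counter(
        tuple(sorted(range(order), key=lambda k: x[i + k * delay]))
        for i in range(n)
    )
    h = -sum((c / n) * math.log(c / n) for c in patterns.values())
    return h / math.log(math.factorial(order)) if normalize else h
```

A strictly monotonic series yields a single ordinal pattern and hence zero entropy; more irregular series approach the normalized maximum of 1.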
+
+ Abstract:
+ In this work, we address the challenge of limited data availability common in healthcare settings by using clinician (ophthalmologist) gaze data on optical coherence tomography (OCT) report images as they diagnose glaucoma, a top cause of irreversible blindness world-wide. We directly learn gaze representations with our 'GazeFormerMD' model to generate pseudo-labels using a novel multi-task objective, combining triplet and cross-entropy losses. We use these pseudo-labels for weakly supervised contrastive learning (WSupCon) to detect glaucoma from a partially-labeled dataset of OCT report images. Our natural-language-inspired region-based-encoding GazeFormerMD model pseudo-labels, trained using our multi-task objective, enable downstream glaucoma detection accuracy via WSupCon exceeding 91% even with only 70% labeled training data. Furthermore, a model pre-trained with GazeFormerMD-generated pseudo-labels and used for linear evaluation on an unseen OCT-report dataset achieved comparable performance to a fully-supervised, trained-from-scratch model while using only 25% labeled data.
+
+ Abstract:
+ This study assesses deep learning models for audio classification in a clinical setting with the constraint of small datasets reflecting real-world prospective data collection. We analyze CNNs, including DenseNet and ConvNeXt, alongside transformer models like ViT, SWIN, and AST, and compare them against pre-trained audio models such as YAMNet and VGGish. Our method highlights the benefits of pre-training on large datasets before fine-tuning on specific clinical data. We prospectively collected two first-of-their-kind patient audio datasets from stroke patients. We investigated various preprocessing techniques, finding that RGB and grayscale spectrogram transformations affect model performance differently based on the priors they learn from pre-training. Our findings indicate CNNs can match or exceed transformer models in small dataset contexts, with DenseNet-Contrastive and AST models showing notable performance. This study highlights the significance of incremental marginal gains through model selection, pre-training, and preprocessing in sound classification; this offers valuable insights for clinical diagnostics that rely on audio classification.
+
+ Abstract:
+ Event-based models (EBMs) provide an important platform for modeling disease progression. This work extends previous EBM approaches to larger sets of biomarkers while simultaneously modeling heterogeneity in disease progression trajectories. We develop and validate s-SuStaIn, a method for scalable event-based modeling of disease progression subtypes using large numbers of features. s-SuStaIn is typically an order of magnitude faster than its predecessor (SuStaIn). Moreover, we perform a case study with s-SuStaIn using open-access cross-sectional Alzheimer's Disease Neuroimaging Initiative (ADNI) data to stage AD patients into four subtypes based on dynamic disease progression. s-SuStaIn shows that the inferred subtypes and stages predict progression to AD among MCI subjects. The subtypes show differences in AD incidence rates and reveal clinically meaningful progression trajectories when mapped to a brain atlas.
+
+ Abstract:
+ Wearable sensors enable health researchers to continuously collect data pertaining to the physiological state of individuals in real-world settings. However, such data can be subject to extensive missingness due to a complex combination of factors. In this work, we study the problem of imputation of missing step count data, one of the most ubiquitous forms of wearable sensor data. We construct a novel and large scale data set consisting of a training set with over 3 million hourly step count observations and a test set with over 2.5 million hourly step count observations. We propose a domain knowledge-informed sparse self-attention model for this task that captures the temporal multi-scale nature of step-count data. We assess the performance of the model relative to baselines based on different missing rates and ground-truth step counts. Finally, we conduct ablation studies to verify our specific model designs.
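+ As a point of reference for this imputation task (a naive baseline, not the paper's sparse self-attention model; all names here are illustrative), missing hourly counts can be filled with the mean observed count for that hour of day:

```python
from collections import defaultdict

def impute_hourly_steps(records):
    """records: list of (hour_of_day, steps) pairs where steps may be None.

    Missing values are filled with the mean observed count for that hour,
    falling back to the overall mean for hours with no observations. This
    exploits the daily periodicity of step counts, the simplest of the
    multi-scale temporal structures a learned model can capture.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for hour, steps in records:
        if steps is not None:
            sums[hour] += steps
            counts[hour] += 1
    overall = sum(sums.values()) / max(sum(counts.values()), 1)
    return [
        (h, s if s is not None
         else (sums[h] / counts[h] if counts[h] else overall))
        for h, s in records
    ]
```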
+
+ Abstract:
+ While the pace of development of AI has rapidly progressed in recent years, the implementation of safe and effective regulatory frameworks has lagged behind. In particular, the adaptive nature of AI models presents unique challenges to regulators as updating a model can improve its performance but also introduce safety risks. In the US, the Food and Drug Administration (FDA) has been a forerunner in regulating and approving hundreds of AI medical devices. To better understand how AI is updated and its regulatory considerations, we systematically analyze the frequency and nature of updates in FDA-approved AI medical devices. We find that less than 2% of all devices have been updated by being re-trained on new data. Meanwhile, nearly a quarter of devices have received updates in the form of new functionality and marketing claims. As an illustrative case study, we analyze pneumothorax detection models and find that while model performance can degrade by as much as 0.18 AUC when evaluated on new sites, re-training on site-specific data can mitigate this performance drop, recovering up to 0.23 AUC. However, we also observed significant degradation on the original site after re-training using data from new sites, highlighting a challenge with the current one-model-fits-all approach to regulatory approvals. Our analysis provides an in-depth look at the current state of FDA-approved AI device updates and insights for future regulatory policies toward model updating and adaptive AI.
+
+ Abstract:
+ Unstructured data in Electronic Health Records (EHRs) often contains critical information, complementary to imaging, that could inform radiologists' diagnoses. But the large volume of notes often associated with patients together with time constraints renders manually identifying relevant evidence practically infeasible. In this work we propose and evaluate a zero-shot strategy for using LLMs as a mechanism to efficiently retrieve and summarize unstructured evidence in patient EHR relevant to a given query. Our method entails tasking an LLM to infer whether a patient has, or is at risk of, a particular condition on the basis of associated notes; if so, we ask the model to summarize the supporting evidence. Under expert evaluation, we find that this LLM-based approach provides outputs consistently preferred to a pre-LLM information retrieval baseline. Manual evaluation is expensive, so we also propose and validate a method using an LLM to evaluate (other) LLM outputs for this task, allowing us to scale up evaluation. Our findings indicate the promise of LLMs as interfaces to EHR, but also highlight the outstanding challenge posed by "hallucinations". In this setting, however, we show that model confidence in outputs strongly correlates with faithful summaries, offering a practical means to limit confabulations.
+
+ Abstract:
+ Approximately two-thirds of survivors of childhood acute lymphoblastic leukemia (ALL) develop late adverse effects post-treatment. Prior studies explored prediction models for personalized follow-up, but none to date has integrated neural networks. In this work, we propose the Error Passing Network (EPN), a graph-based method that leverages relationships between samples to propagate residuals and adjust predictions of any machine learning model. We tested our approach to estimate patients' VO2 peak, a reliable indicator of their cardiac health. We used the EPN in conjunction with several baseline models and observed up to 12.16% improvement in the mean average percentage error compared to the last established equation predicting VO2 peak in childhood ALL survivors. Along with this performance improvement, our final model is more efficient, as it relies only on clinical variables that patients can self-report, removing the need to perform a resource-intensive physical test.
+
+ Abstract:
+ Large language models (LLMs) are capable of many natural language tasks, yet they are far from perfect. In health applications, grounding and interpreting domain-specific and non-linguistic data is important. This paper investigates the capacity of LLMs to make inferences about health based on contextual information (e.g., user demographics, health knowledge) and physiological data (e.g., resting heart rate, sleep minutes). We present a comprehensive evaluation of 12 publicly accessible state-of-the-art LLMs with prompting and fine-tuning techniques on four public health datasets (PMData, LifeSnaps, GLOBEM, and AW_FB). Our experiments cover 10 consumer health prediction tasks in mental health, activity, metabolic, and sleep assessment. Our fine-tuned model, HealthAlpaca, exhibits performance comparable to much larger models (GPT-3.5, GPT-4, and Gemini-Pro), achieving the best or second-best performance in 7 out of 10 tasks. Ablation studies highlight the effectiveness of context-enhancement strategies: notably, context enhancement can yield up to a 23.8% improvement in performance. While constructing contextually rich prompts (combining user context, health knowledge, and temporal information) exhibits synergistic improvement, the inclusion of health knowledge context in prompts significantly enhances overall performance.
+
+ Abstract:
+ This study advances Early Event Prediction (EEP) in healthcare through Dynamic Survival Analysis (DSA), offering a novel approach by integrating risk localization into alarm policies to enhance clinical event metrics. By adapting and evaluating DSA models against traditional EEP benchmarks, our research demonstrates their ability to match EEP models on a time-step level and significantly improve event-level metrics through a new alarm prioritization scheme (up to 11% AuPRC difference). This approach represents a significant step forward in predictive healthcare, providing a more nuanced and actionable framework for early event prediction and management.
+
+ Abstract:
+ Changing clinical algorithms to remove race adjustment has been proposed and implemented for multiple health conditions. Removing race adjustment from estimated glomerular filtration rate (eGFR) equations may reduce disparities in chronic kidney disease (CKD), but has not been studied in clinical practice after implementation. Here, we assessed whether implementing an eGFR equation (CKD-EPI 2021) without adjustment for Black or African American race modified quarterly rates of nephrology referrals and visits within a single healthcare system, Stanford Health Care (SHC). Our cohort study analyzed 547,194 adult patients aged 21 and older who had at least one recorded serum creatinine or serum cystatin C between January 1, 2019 and September 1, 2023. During the study period, implementation of CKD-EPI 2021 did not modify rates of quarterly nephrology referrals in those documented as Black or African American or in the overall cohort. After adjusting for capacity at SHC nephrology clinics, estimated rates of nephrology referrals and visits with CKD-EPI 2021 were 34 [95% CI 29, 39] and 188 [175, 201] per 10,000 patients documented as Black or African American. If race adjustment had not been removed, estimated rates were nearly identical: 38 [95% CI: 28, 53] and 189 [165, 218] per 10,000 patients. Changes to the eGFR equation are likely insufficient to achieve health equity in CKD care decision-making as many other structural inequities remain.
+
+ Abstract:
+ Clustering can be used in medical imaging research to identify different domains within a specific dataset, aiding in a better understanding of subgroups or strata that may not have been annotated. Moreover, in digital pathology, clustering can be used to effectively sample image patches from whole slide images (WSI). In this work, we conduct a comparative analysis of three deep clustering algorithms -- a simple two-step approach applying K-means onto a learned feature space, an end-to-end deep clustering method (DEC), and a Graph Convolutional Network (GCN) based method -- in application to a digital pathology dataset of endometrial biopsy WSIs. For consistency, all methods use the same Autoencoder (AE) architecture backbone that extracts features from image patches. The GCN-based model, specifically, stands out as a deep clustering algorithm that considers spatial contextual information in predicting clusters. Our study highlights the computation of graphs for WSIs and emphasizes the impact of these graphs on the formation of clusters. The main finding of our research indicates that GCN-based deep clustering demonstrates heightened spatial awareness compared to the other methods, resulting in higher cluster agreement with previous clinical annotations of WSIs.
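+ The first of the three compared approaches, the two-step scheme, reduces to running K-means over a learned feature space. A minimal from-scratch sketch (assuming autoencoder embeddings have already been extracted from the image patches; not the authors' implementation):

```python
import numpy as np

def kmeans(feats, k, iters=50, seed=0):
    """Plain K-means over a 2-D feature array (n_samples, dim).

    In the two-step deep-clustering scheme, `feats` would be the
    autoencoder embeddings of WSI image patches.
    """
    rng = np.random.default_rng(seed)
    # Initialize centers at k distinct random data points.
    centers = feats[rng.choice(len(feats), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean).
        labels = np.argmin(((feats[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            if (labels == j).any():
                centers[j] = feats[labels == j].mean(0)
    return labels, centers
```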
+
+ Abstract:
+ Non-adherence to medication is a complex behavioral issue that costs hundreds of billions of dollars annually in the United States alone. Existing solutions to improve medication adherence are limited in their effectiveness and require significant user involvement. To address this, we propose a minimally invasive mobile health system called DoseMate, which can provide quantifiable adherence data while imposing minimal user burden. To classify the motion time series that defines pill-taking, we adopt transfer-learning and data-augmentation techniques that use captured pill-taking gestures along with other open datasets representing negative labels of other wrist motions. The paper also provides a design methodology that generalizes to other systems and describes a first-of-its-kind, in-the-wild, unobtrusively obtained dataset that contains unrestricted pill-related motion data from a diverse set of users.
+
+ Abstract:
+ This work introduces a novel approach to model regularization and explanation in Vision Transformers (ViTs), particularly beneficial for small-scale but high-dimensional data regimes, such as in healthcare. We introduce stochastic embedded feature selection in the context of echocardiography video analysis, specifically focusing on the EchoNet-Dynamic dataset for the prediction of the Left Ventricular Ejection Fraction (LVEF). Our proposed method, termed Gumbel Video Vision-Transformers (G-ViTs), augments Video Vision-Transformers (V-ViTs), a performant transformer architecture for videos with Concrete Autoencoders (CAEs), a common dataset-level feature selection technique, to enhance V-ViT's generalization and interpretability. The key contribution lies in the incorporation of stochastic token selection individually for each video frame during training. Such token selection regularizes the training of V-ViT, improves its interpretability, and is achieved by differentiable sampling of categoricals using the Gumbel-Softmax distribution. Our experiments on EchoNet-Dynamic demonstrate a consistent and notable regularization effect. The G-ViT model outperforms both a random selection baseline and standard V-ViT. The G-ViT is also compared against recent works on EchoNet-Dynamic where it exhibits state-of-the-art performance among end-to-end learned methods. Finally, we explore model explainability by visualizing selected patches, providing insights into how the G-ViT utilizes regions known to be crucial for LVEF prediction for humans. This proposed approach, therefore, extends beyond regularization, offering enhanced interpretability for ViTs.
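+ The sampling primitive underlying this token selection, differentiable sampling of categoricals with the Gumbel-Softmax (Concrete) distribution, can be sketched in isolation (a generic illustration, not the G-ViT module itself):

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Relaxed one-hot sample from a categorical via the Gumbel-Softmax trick.

    Adding Gumbel(0, 1) noise to the logits and taking a softmax with
    temperature `tau` gives a differentiable surrogate for argmax
    sampling; lower tau pushes the output closer to a hard one-hot.
    """
    if rng is None:
        rng = np.random.default_rng()
    g = -np.log(-np.log(rng.uniform(size=len(logits))))  # Gumbel(0, 1) noise
    y = (np.asarray(logits, dtype=float) + g) / tau
    y -= y.max()  # numerical stability
    return np.exp(y) / np.exp(y).sum()
```

During training, a framework with autodiff would backpropagate through this relaxation; at a low temperature the sample concentrates on the highest-logit token (here, a patch) almost every draw.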
+
+ Abstract:
+ Over the last decade, there has been significant progress in the field of interactive virtual rehabilitation. Physical therapy (PT) stands as a highly effective approach for addressing physical impairments. However, patient motivation and progress tracking in rehabilitation outcomes have been a challenge. This work aims to address this gap by proposing a computational approach that uses machine learning to objectively measure reaching-task outcomes from an upper-limb virtual therapy user study. In this study, we use virtual reality to perform several tracing tasks while collecting motion and movement data using a KinArm robot and a custom-made wearable sleeve sensor. We introduce a two-step machine learning architecture to predict the motion intention of participants: the first step uses gaze to predict the reaching-task segments to which the participant-marked points belonged, while the second step employs a Long Short-Term Memory (LSTM) model to predict directional movements based on resistance-change values from the wearable sensor and the KinArm robot used to support the participant. We specifically propose transposing our raw resistance data to the time domain, which significantly improves the accuracy of our models. To evaluate the effectiveness of our model, we compared different classification techniques with various data configurations. The results show that our proposed computational method is highly accurate at predicting participants' intended movements, demonstrating the great promise of using multimodal data, including eye tracking and resistance change, to objectively measure performance and intention in virtual rehabilitation settings.
+
+ A cross-study analysis of wearable datasets and the generalizability of acute illness monitoring models
+
+
+ Patrick Kasl, Severine Soltani, Lauryn Keeler Bruce, Varun Kumar Viswanath, Wendy Hartogensis, Amarnath Gupta, Ilkay Altintas, Stephan Dilchert, Frederick M. Hecht, Ashley Mason, Benjamin L. Smarr
+
+ Abstract:
+ Large-scale wearable datasets are increasingly being used for biomedical research and to develop machine learning (ML) models for longitudinal health monitoring applications. However, it is largely unknown whether biases in these datasets lead to findings that do not generalize. Here, we present the first comparison of the data underlying multiple longitudinal, wearable-device-based datasets. We examine participant-level resting heart rate (HR) from four studies, each with thousands of wearable device users. We demonstrate that multiple regression, a community standard statistical approach, leads to conflicting conclusions about important demographic variables (age vs resting HR) and significant intra- and inter-dataset differences in HR. We then directly test the cross-dataset generalizability of a commonly used ML model trained for three existing day-level monitoring tasks: prediction of testing positive for a respiratory virus, flu symptoms, and fever symptoms. Regardless of task, most models showed relative performance loss on external datasets; most of this performance change can be attributed to concept shift between datasets. These findings suggest that research using large-scale, pre-existing wearable datasets might face bias and generalizability challenges similar to research in more established biomedical and ML disciplines. We hope that the findings from this study will encourage discussion in the wearable-ML community around standards that anticipate and account for challenges in dataset bias and model generalizability.
+
+ Abstract:
+ Electronic Health Records (EHRs) contain rich patient information and are crucial for clinical research and practice. In recent years, deep learning models have been applied to EHRs, but they often rely on massive features, which may not be readily available for all patients. We propose HTP-Star, which leverages hypergraph structures with a pretrain-then-finetune framework for modeling EHR data, enabling seamless integration of additional features. Additionally, we design two techniques, namely (1) Smoothness-inducing Regularization and (2) Group-balanced Reweighting, to enhance the model's robustness during finetuning. Through experiments conducted on two real EHR datasets, we demonstrate that HTP-Star consistently outperforms various baselines while striking a balance between patients with basic and extra features.
+
+ Abstract:
+ Explainability and privacy are the top concerns in machine learning (ML) for medical applications. In this paper, we propose a novel method, Domain-Aware Symbolic Regression with Homomorphic Encryption (DASR-HE), that addresses both concerns simultaneously by: (i) producing domain-aware, intuitive and explainable models that do not require the end-user to possess ML expertise and (ii) training only on securely encrypted data without access to actual data values or model parameters. DASR-HE is based on Symbolic Regression (SR), which is a first-class ML approach that produces simple and concise equations for regression, requiring no ML expertise to interpret. In our work, we improve the performance of SR algorithms by using existing domain-specific medical equations to augment the search space of equations, decreasing the search complexity and producing equations that are similar in structure to those used in practice. To preserve the privacy of the medical data, we enable our algorithm to learn on data that is homomorphically encrypted (HE), meaning that arithmetic operations can be done in the encrypted space. This makes HE suitable for machine learning algorithms to learn models without access to the actual data values or model parameters. We evaluate DASR-HE on three medical tasks, namely predicting glomerular filtration rate, endotracheal tube (ETT) internal diameter and ETT depth and find that DASR-HE outperforms existing medical equations, other SR ML algorithms and other explainable ML algorithms.
+
+ Abstract:
+ Limited access to health data remains a challenge for developing machine learning (ML) models. Health data is difficult to share due to privacy concerns and often does not have ground truth. Simulated data is often used for evaluating algorithms, as it can be shared freely and generated with ground truth. However, for simulated data to be used as an alternative to real data, algorithmic performance must be similar to that of real data. Existing simulation approaches are either black boxes or rely solely on expert knowledge, which may be incomplete. These methods generate data that often overstates performance, as they do not simulate many of the properties that make real data challenging. Nonstationarity, where a system's properties or parameters change over time, is pervasive in health data with changing health status of patients, standards of care, and populations. This makes ML challenging and can lead to reduced model generalizability, yet there have not been ways to systematically simulate realistic nonstationary data. This paper introduces a modular approach for learning dataset-specific models of nonstationarity in real data and augmenting simulated data with these properties to generate realistic synthetic datasets. We show that our simulation approach brings performance closer to that of real data in stress classification and glucose forecasting in people with diabetes.
+
+ Abstract:
+ Fatigue is one of the most prevalent symptoms of chronic diseases, such as Multiple Sclerosis, Alzheimer’s, and Parkinson’s. Recently, researchers have explored unobtrusive and continuous ways of fatigue monitoring using mobile and wearable devices. However, data quality and limited labeled data availability in the wearable health domain pose significant challenges to progress in the field. In this work, we perform a systematic evaluation of self-supervised learning (SSL) tasks for fatigue recognition using wearable sensor data. To establish our benchmark, we use Homekit2020, which is a large-scale dataset collected using Fitbit devices in everyday life settings. Our results show that the majority of the SSL tasks outperform fully supervised baselines for fatigue recognition, even in limited labeled data scenarios. In particular, the domain features and multi-task learning achieve AUROCs of 0.7371 and 0.7323, which are higher than the other SSL tasks and supervised learning baselines. In most of the pre-training tasks, the performance is higher when using at least one data augmentation that reflects the potentially low quality of wearable data (e.g., missing data). Our findings open up promising opportunities for continuous assessment of fatigue in real settings and can be used to guide the design and development of health monitoring systems.
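+ One augmentation of the kind alluded to, random masking that mimics missing wearable data, can be sketched generically (an illustrative example, not the benchmark's code; names are assumptions):

```python
import numpy as np

def random_mask_augment(x, mask_frac=0.2, rng=None):
    """Augmentation mimicking missing wearable data: zero out a random
    fraction of time steps and return both the corrupted view and the
    boolean mask of which positions were dropped."""
    if rng is None:
        rng = np.random.default_rng()
    x = np.asarray(x, dtype=float).copy()
    n_mask = int(round(mask_frac * len(x)))
    idx = rng.choice(len(x), n_mask, replace=False)
    x[idx] = 0.0
    mask = np.zeros(len(x), dtype=bool)
    mask[idx] = True
    return x, mask
```

In an SSL pre-training loop, the corrupted view would be the model input and the mask would indicate which positions to reconstruct or contrast against the clean series.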
+
+ Abstract:
+ Modern kidney placement incorporates several intelligent recommendation systems which exhibit social discrimination due to biases inherited from training data. Although initial attempts were made in the literature to study algorithmic fairness in kidney placement, these methods replace true outcomes with surgeons' decisions due to the long delays involved in recording such outcomes reliably. However, the replacement of true outcomes with surgeons' decisions disregards expert stakeholders' biases as well as social opinions of other stakeholders who do not possess medical expertise. This paper alleviates the latter concern and designs a novel fairness feedback survey to evaluate an acceptance rate predictor (ARP) that predicts a kidney's acceptance rate in a given kidney-match pair. The survey is launched on Prolific, a crowdsourcing platform, and public opinions are collected from 85 anonymous crowd participants. A novel social fairness preference learning algorithm is proposed based on minimizing social feedback regret computed using a novel logit-based fairness feedback model. The proposed model and learning algorithm are both validated using simulation experiments as well as Prolific data. Public preferences towards group fairness notions in the context of kidney placement have been estimated and discussed in detail. The specific ARP tested in the Prolific survey has been deemed fair by the participants.
+
+ Abstract:
+ Representation learning of brain activity is a key step toward unleashing machine learning models for use in the diagnosis of neurological diseases and disorders. Diagnosing different neurological diseases/disorders, however, might require paying more attention to either the spatial or the temporal resolution of brain activity. Accordingly, a generalized brain activity learner requires the ability to learn from both resolutions. Most existing studies, however, use domain knowledge to design brain encoders, and so are limited to a single neuroimaging modality (e.g., EEG or fMRI) and its single resolution. Furthermore, their architecture designs either: (1) use a self-attention mechanism with quadratic time with respect to input size, limiting scalability; (2) are purely based on message-passing graph neural networks, missing long-range dependencies and temporal resolution; and/or (3) encode brain activity in each unit of the brain (e.g., voxel) separately, missing the dependencies among brain regions. In this study, we present BrainMamba, an attention-free, scalable, and powerful framework to learn from multivariate brain activity time series. BrainMamba uses two modules: (i) a novel multivariate time series encoder that leverages an MLP to fuse information across variates and a Selective Structured State Space (S4) architecture to encode each time series; and (ii) a novel graph learning framework that leverages message-passing neural networks along with the S4 architecture to selectively choose important brain regions. Our experiments on 7 real-world datasets spanning 3 modalities show that BrainMamba attains outstanding performance and outperforms all baselines on different downstream tasks.
+
+ Will Ke Wang; Jiamu Yang; Leeor Hershkovich; Hayoung Jeong; Bill Chen; Karnika Singh; Ali R Roghanizad; Md Mobashir Hasan Shandhi; Andrew R Spector; Jessilyn Dunn
+
+ Patrick Kasl; Severine Soltani; Lauryn Keeler Bruce; Varun Kumar Viswanath; Wendy Hartogensis; Amarnath Gupta; Ilkay Altintas; Stephan Dilchert; Frederick M. Hecht; Ashley Mason; Benjamin L. Smarr
+
Thank you to our 2024 sponsors: Gordon and Betty Moore Foundation (Gold), Department of Health Outcomes and Biomedical Informatics at UFlorida College of Medicine (Gold), Apple (Silver), Genentech (Silver), Google (Silver), The Mount Sinai Hospital (Silver), Computational Precision Health Program at UCSF / UC Berkeley (Silver), UF Health (Silver), Chase Center at University of Pennsylvania (Silver), Department of Biostatistics at University of Pennsylvania (Silver), Department of Biostatistics at Columbia University (Bronze), Health Data Science (Bronze), and the Department of Surgery at University of Minnesota (Bronze)!
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/serve_archive.json b/serve_archive.json
new file mode 100644
index 000000000..17d194b96
--- /dev/null
+++ b/serve_archive.json
@@ -0,0 +1 @@
+{"2020":{"highlights":"ACM CHIL 2020 was held virtually on July 23rd and 24th. It featured 23 research talks from accepted papers, 23 workshop spotlights, and 15 participants in the doctoral symposium.\n\n### Keynotes\n\n- Yoshua Bengio - _Machine Learning Challenges in the Fight for Social Good - the Covid-19 Case_\n- Elaine Nsoesie - _Digital Platforms & Public Health in Africa_\n- Sherri Rose - _Machine Learning in Health Care: Too Important to Be a Toy Example_\n- Ruslan Salakhutdinov - _Incorporating Domain Knowledge into Deep Learning Models_\n- Nigam Shah - _A framework for shaping the future of AI in healthcare_\n\n### Tutorials\n\n- A Tour of Survival Analysis, from Classical to Modern - George H. Chen, Jeremy C. Weiss\n- Population and public health: challenges and opportunities - Vishwali Mhasawade, Yuan Zhao, Rumi Chunara\n- Public Health Datasets for Deep Learning: Challenges and Opportunities - Ziad Obermeyer, Katy Haynes, Amy Pitelka, Josh Risley, Katie Lin\n- State of the Art Deep Learning in Medical Imaging - Joseph Paul Cohen\n- Analyzing critical care data, from speculation to publication, starring MIMIC-IV (part 1) - Alistair Johnson\n\n\n### Papers\n\nProceedings from the 2020 ACM CHIL Conference are available at the ACM Digital Library: https://dl.acm.org/doi/proceedings/10.1145/3368555\n\n### Governing Board\n\n###### **General Chair**\n- Dr. Marzyeh Ghassemi of the University of Toronto and the Vector Institute\n###### **Logistics Chair**\n- Tasmie Sarker of the University of Toronto and the Vector Institute\n###### **Program Chair**\n- Dr. Tristan Naumann of Microsoft Research Seattle\n- Dr. Danielle Belgrave of Microsoft Research Cambridge, UK\n- Dr. Adrian Dalca of MIT and Harvard Medical School\n###### **Proceedings Chair**\n- Dr. Brett Beaulieu-Jones of Harvard Medical School\n- Sam Finlayson of Harvard University and MIT\n- Emily Alsentzer of Harvard University and MIT\n###### **Communications Chair**\n- Dr. 
Stephanie Hyland of Microsoft Research Cambridge, UK\n- Dr. Shalmali Joshi of the Vector Institute\n- Bret Nestor of the University of Toronto and the Vector Institute\n###### **Finance Chair**\n- Dr. Joyce Ho of Emory University\n- Dr. Laura Rosella of the University of Toronto\n###### **Tutorial Chair**\n- Ahmed Nasir of Trillium Health Partners\n- Dr. Andrew Beam of Harvard University\n- Irene Chen of MIT\n###### **Consortium Chair**\n- Dr. Leo Celi of MIT\n- Matthew McDermott of MIT\n###### **Virtual Chair**\n- Dr. Tom Pollard of MIT\n- Dr. Alistair Johnson of MIT\n\n### Executive Committee\n- Dr. Marzyeh Ghassemi of University of Toronto, Vector Institute\n- Dr. Tristan Naumann of Microsoft Research Seattle\n- Dr. Joyce Ho of Emory University\n- Dr. Leo Celi of MIT\n- Dr. Shalmali Joshi of the Vector Institute\n- Dr. Andrew Beam of Harvard University\n- Dr. Ziad Obermeyer of University of California, Berkeley\n- Dr. Oluwasanmi Koyejo of University of Illinois at Urbana-Champaign\n- Dr. Avi Goldfarb of Rotman School of Management, University of Toronto\n- Dr. Laura Rosella of Dalla Lana School of Public Health, University of Toronto\n- Dr. Adrian Dalca of MIT and Harvard Medical School\n- Dr. Rajesh Ranganath of NYU\n- Irene Chen of MIT\n- Matthew McDermott of MIT\n- Dr. Katherine Heller at Duke University\n- Dr. Uri Shalit of Technion\n- Dr. Stephanie Hyland of Microsoft Research Cambridge, UK\n- Dr. Danielle Belgrave of Microsoft Research Cambridge, UK\n- Dr. Shakir Mohamed of DeepMind\n- Dr. Alistair Johnson of MIT\n- Dr. Tom Pollard of MIT\n- Dr. Alan Karthikesalingam of Google Health UK\n\n### Steering Committee\n- Dr. Nigam Shah of Stanford University\n- Dr. Stephen Friend of Oxford University\n- Dr. Samantha Kleinberg of Stevens Institute of Technology\n- Dr. Anna Goldenberg of The Hospital for Sick Children Research Institute\n- Dr. Lucila Ohno-Machado of University of California, San Diego\n- Dr. 
Noemie Elhadad of Columbia University\n\n### Sponsors\nWe thank the Association for Computing Machinery (ACM) for sponsoring CHIL 2020, as well as the following organizations for supporting the event:\n\n- Google\n- Health[at]Scale\n- Layer6\n- Apple\n- CIFAR\n- Imagia\n- Microsoft\n- Sun Life Financial\n- Creative Destruction Lab\n- Vector Institute\n","proceedings":[{"UID":"20P01","abstract":"A key impediment to reinforcement learning (RL) in real applications with limited, batch data is in defining a reward function that reflects what we implicitly know about reasonable behaviour for a task and allows for robust off-policy evaluation. In this work, we develop a method to identify an admissible set of reward functions for policies that (a) do not deviate too far in performance from prior behaviour, and (b) can be evaluated with high confidence, given only a collection of past trajectories. Together, these ensure that we avoid proposing unreasonable policies in high-risk settings. We demonstrate our approach to reward design on synthetic domains as well as in a critical care context, to guide the design of a reward function that consolidates clinical objectives to learn a policy for weaning patients from mechanical ventilation.","authors":"Niranjani Prasad|Barbara Engelhardt|Finale Doshi-Velez","doi_link":"http://dx.doi.org/10.1145/3368555.3384450","slideslive_active_date":"","slideslive_id":"38931912","title":"Defining admissible rewards for high-confidence policy evaluation in batch reinforcement learning"},{"UID":"20P02","abstract":"The abundance of modern health data provides many opportunities for the use of machine learning techniques to build better statistical models to improve clinical decision making. Predicting time-to-event distributions, also known as survival analysis, plays a key role in many clinical applications. 
We introduce a variational time-to-event prediction model, named Variational Survival Inference (VSI), which builds upon recent advances in distribution learning techniques and deep neural networks. VSI addresses the challenges of non-parametric distribution estimation by (i) relaxing the restrictive modeling assumptions made in classical models, and (ii) efficiently handling the censored observations, i.e., events that occur outside the observation window, all within the variational framework. To validate the effectiveness of our approach, an extensive set of experiments on both synthetic and real-world datasets is carried out, showing improved performance relative to competing solutions.","authors":"Zidi Xiu|Chenyang Tao|Ricardo Henao","doi_link":"http://dx.doi.org/10.1145/3368555.3384454","slideslive_active_date":"","slideslive_id":"38931913","title":"Variational learning of individual survival distributions"},{"UID":"20P03","abstract":"The dearth of prescribing guidelines for physicians is one key driver of the current opioid epidemic in the United States. In this work, we analyze medical and pharmaceutical claims data to draw insights on characteristics of patients who are more prone to adverse outcomes after an initial synthetic opioid prescription. Toward this end, we propose a generative model that allows discovery from observational data of subgroups that demonstrate an enhanced or diminished causal effect due to treatment. Our approach models these sub-populations as a mixture distribution, using sparsity to enhance interpretability, while jointly learning nonlinear predictors of the potential outcomes to better adjust for confounding. The approach leads to human interpretable insights on discovered subgroups, improving the practical utility for decision support.","authors":"Chirag Nagpal|Dennis Wei|Bhanukiran Vinzamuri|Monica Shekhar|Sara E. Berger|Subhro Das|Kush R. 
Varshney","doi_link":"http://dx.doi.org/10.1145/3368555.3384456","slideslive_active_date":"","slideslive_id":"38931914","title":"Interpretable subgroup discovery in treatment effect estimation with application to opioid prescribing guidelines"},{"UID":"20P04","abstract":"Adverse drug reactions (ADRs) are detrimental and unexpected clinical incidents caused by drug intake. The increasing availability of massive quantities of longitudinal event data such as electronic health records (EHRs) has redefined ADR discovery as a big data analytics problem, where data-hungry deep neural networks are especially suitable because of the abundance of the data. To this end, we introduce neural self-controlled case series (NSCCS), a deep learning framework for ADR discovery from EHRs. NSCCS rigorously follows a self-controlled case series design to adjust implicitly and efficiently for individual heterogeneity. In this way, NSCCS is robust to time-invariant confounding issues and thus more capable of identifying associations that reflect the underlying mechanism between various types of drugs and adverse conditions. We apply NSCCS to a large-scale real-world EHR dataset and empirically demonstrate its superior performance with comprehensive experiments on a benchmark ADR discovery task.","authors":"Wei Zhang|Zhaobin Kuang|Peggy Peissig|David Page","doi_link":"http://dx.doi.org/10.1145/3368555.3384459","slideslive_active_date":"","slideslive_id":"38931915","title":"Adverse drug reaction discovery from electronic health records with deep neural networks"},{"UID":"20P05","abstract":"Real-world predictive models in healthcare should be evaluated in terms of discrimination, the ability to differentiate between high and low risk events, and calibration, or the accuracy of the risk estimates. Unfortunately, calibration is often neglected and only discrimination is analyzed. Calibration is crucial for personalized medicine as they play an increasing role in the decision making process. 
Since random forest is a popular model for many healthcare applications, we propose CaliForest, a new calibrated random forest. Unlike existing calibration methodologies, CaliForest utilizes the out-of-bag samples to avoid the explicit construction of a calibration set. We evaluated CaliForest on two risk prediction tasks obtained from the publicly-available MIMIC-III database. Evaluation on these binary prediction tasks demonstrates that CaliForest can achieve the same discriminative power as random forest while obtaining a better-calibrated model evaluated across six different metrics. CaliForest will be published on the standard Python software repository and the code will be openly available on Github.","authors":"Yubin Park|Joyce C. Ho","doi_link":"http://dx.doi.org/10.1145/3368555.3384461","slideslive_active_date":"","slideslive_id":"38931916","title":"CaliForest: calibrated random forest for health data"},{"UID":"20P06","abstract":"Retinal effusions and cysts caused by the leakage of damaged macular vessels and choroid neovascularization are symptoms of many ophthalmic diseases. Optical coherence tomography (OCT), which provides clear 10-layer cross-sectional images of the retina, is widely used to screen various ophthalmic diseases. A large number of researchers have carried out relevant studies on deep learning technology to realize the semantic segmentation of lesion areas, such as effusion on OCT images, and achieved good results. However, in this field, problems of the low contrast of the lesion area and unevenness of lesion size limit the accuracy of the deep learning semantic segmentation model. In this paper, we propose a boundary multi-scale multi-task OCT segmentation network (BMM-Net) for these two challenges to segment the retinal edema area, subretinal fluid, and pigment epithelial detachment in OCT images. 
We propose a boundary extraction module, a multi-scale information perception module, and a classification module to capture accurate position and semantic information and collaboratively extract meaningful features. We train and verify on the AI Challenger competition dataset. The average Dice coefficient of the three lesion areas is 3.058% higher than the most commonly used model in the field of medical image segmentation and reaches 0.8222.","authors":"Ruru Zhang|Jiawen He|Shenda Shi|Haihong E|Zhonghong Ou|Meina Song","doi_link":"http://dx.doi.org/10.1145/3368555.3384447","slideslive_active_date":"","slideslive_id":"38931917","title":"BMM-Net: automatic segmentation of edema in optical coherence tomography based on boundary detection and multi-scale network"},{"UID":"20P07","abstract":"Conventional survival analysis approaches estimate risk scores or individualized time-to-event distributions conditioned on covariates. In practice, there is often great population-level phenotypic heterogeneity, resulting from (unknown) subpopulations with diverse risk profiles or survival distributions. As a result, there is an unmet need in survival analysis for identifying subpopulations with distinct risk profiles, while jointly accounting for accurate individualized time-to-event predictions. An approach that addresses this need is likely to improve the characterization of individual outcomes by leveraging regularities in subpopulations, thus accounting for population-level heterogeneity. In this paper, we propose a Bayesian nonparametrics approach that represents observations (subjects) in a clustered latent space, and encourages accurate time-to-event predictions and clusters (subpopulations) with distinct risk profiles. 
Experiments on real-world datasets show consistent improvements in predictive performance and interpretability relative to existing state-of-the-art survival analysis models.","authors":"Paidamoyo Chapfuwa|Chunyuan Li|Nikhil Mehta|Lawrence Carin|Ricardo Henao","doi_link":"http://dx.doi.org/10.1145/3368555.3384465","slideslive_active_date":"","slideslive_id":"38931918","title":"Survival cluster analysis"},{"UID":"20P08","abstract":"While deep learning has shown promise in the domain of disease classification from medical images, models based on state-of-the-art convolutional neural network architectures often exhibit performance loss due to dataset shift. Models trained using data from one hospital system achieve high predictive performance when tested on data from the same hospital, but perform significantly worse when they are tested in different hospital systems. Furthermore, even within a given hospital system, deep learning models have been shown to depend on hospital- and patient-level confounders rather than meaningful pathology to make classifications. In order for these models to be safely deployed, we would like to ensure that they do not use confounding variables to make their classification, and that they will work well even when tested on images from hospitals that were not included in the training data. We attempt to address this problem in the context of pneumonia classification from chest radiographs. We propose an approach based on adversarial optimization, which allows us to learn more robust models that do not depend on confounders. Specifically, we demonstrate improved out-of-hospital generalization performance of a pneumonia classifier by training a model that is invariant to the view position of chest radiographs (anterior-posterior vs. posterior-anterior). 
Our approach leads to better predictive performance on external hospital data than both a standard baseline and previously proposed methods to handle confounding, and also suggests a method for identifying models that may rely on confounders.","authors":"Joseph D. Janizek|Gabriel Erion|Alex J. DeGrave|Su-In Lee","doi_link":"http://dx.doi.org/10.1145/3368555.3384458","slideslive_active_date":"","slideslive_id":"38931919","title":"An adversarial approach for the robust classification of pneumonia from chest radiographs"},{"UID":"20P09","abstract":"Much work aims to explain a model's prediction on a static input. We consider explanations in a temporal setting where a stateful dynamical model produces a sequence of risk estimates given an input at each time step. When the estimated risk increases, the goal of the explanation is to attribute the increase to a few relevant inputs from the past.While our formal setup and techniques are general, we carry out an in-depth case study in a clinical setting. The goal here is to alert a clinician when a patient's risk of deterioration rises. The clinician then has to decide whether to intervene and adjust the treatment. Given a potentially long sequence of new events since she last saw the patient, a concise explanation helps her to quickly triage the alert.We develop methods to lift static attribution techniques to the dynamical setting, where we identify and address challenges specific to dynamics. 
We then experimentally assess the utility of different explanations of clinical alerts through expert evaluation.","authors":"Michaela Hardt|Alvin Rajkomar|Gerardo Flores|Andrew Dai|Michael Howell|Greg Corrado|Claire Cui|Moritz Hardt","doi_link":"http://dx.doi.org/10.1145/3368555.3384460","slideslive_active_date":"","slideslive_id":"38931920","title":"Explaining an increase in predicted risk for clinical alerts"},{"UID":"20P10","abstract":"We introduce SparseVM, a method that registers clinical-quality 3D MR scans both faster and more accurately than previously possible. Deformable alignment, or registration, of clinical scans is a fundamental task for many clinical neuroscience studies. However, most registration algorithms are designed for high-resolution research-quality scans. In contrast to research-quality scans, clinical scans are often sparse, missing up to 86% of the slices available in research-quality scans. Existing methods for registering these sparse images are either inaccurate or extremely slow. We present a learning-based registration method, SparseVM, that is more accurate and orders of magnitude faster than the most accurate clinical registration methods. To our knowledge, it is the first method to use deep learning specifically tailored to registering clinical images. We demonstrate our method on a clinically-acquired MRI dataset of stroke patients and on a simulated sparse MRI dataset. Our code is available as part of the VoxelMorph package at http://voxelmorph.mit.edu.","authors":"Kathleen Lewis|Natalia S. Rost|John Guttag|Adrian V. Dalca","doi_link":"http://dx.doi.org/10.1145/3368555.3384462","slideslive_active_date":"","slideslive_id":"38931921","title":"Fast learning-based registration of sparse 3D clinical images"},{"UID":"20P11","abstract":"Necrotizing enterocolitis (NEC) is a life-threatening intestinal disease that primarily affects preterm infants during their first weeks after birth. 
Mortality rates associated with NEC are 15-30%, and surviving infants are susceptible to multiple serious, long-term complications. The disease is sporadic and, with currently available tools, unpredictable. We are creating an early warning system that uses stool microbiome features, combined with clinical and demographic information, to identify infants at high risk of developing NEC. Our approach uses a multiple instance learning, neural network-based system that could be used to generate daily or weekly NEC predictions for premature infants. The approach was selected to effectively utilize sparse and weakly annotated datasets characteristic of stool microbiome analysis. Here we describe initial validation of our system, using clinical and microbiome data from a nested case-control study of 161 preterm infants. We show receiver-operator curve areas above 0.9, with 75% of dominant predictive samples for NEC-affected infants identified at least 24 hours prior to disease onset. Our results pave the way for development of a real-time early warning system for NEC using a limited set of basic clinical and demographic details combined with stool microbiome data.","authors":"Thomas Hooven|Yun Chao Lin|Ansaf Salleb-Aouissi","doi_link":"http://dx.doi.org/10.1145/3368555.3384466","slideslive_active_date":"","slideslive_id":"38931922","title":"Multiple instance learning for predicting necrotizing enterocolitis in premature infants using microbiome data"},{"UID":"20P12","abstract":"In this work, we examine the extent to which embeddings may encode marginalized populations differently, and how this may lead to a perpetuation of biases and worsened performance on clinical tasks. We pretrain deep embedding models (BERT) on medical notes from the MIMIC-III hospital dataset, and quantify potential disparities using two approaches. 
First, we identify dangerous latent relationships that are captured by the contextual word embeddings using a fill-in-the-blank method with text from real clinical notes and a log probability bias score quantification. Second, we evaluate performance gaps across different definitions of fairness on over 50 downstream clinical prediction tasks that include detection of acute and chronic conditions. We find that classifiers trained from BERT representations exhibit statistically significant differences in performance, often favoring the majority group with regards to gender, language, ethnicity, and insurance status. Finally, we explore shortcomings of using adversarial debiasing to obfuscate subgroup information in contextual word embeddings, and recommend best practices for such deep embedding models in clinical settings.","authors":"Haoran Zhang|Amy X. Lu|Mohamed Abdalla|Matthew McDermott|Marzyeh Ghassemi","doi_link":"http://dx.doi.org/10.1145/3368555.3384448","slideslive_active_date":"","slideslive_id":"38931923","title":"Hurtful words: quantifying biases in clinical contextual word embeddings"},{"UID":"20P13","abstract":"Single-cell RNA sequencing (scRNA-seq) has revolutionized biological discovery, providing an unbiased picture of cellular heterogeneity in tissues. While scRNA-seq has been used extensively to provide insight into health and disease, it has not been used for disease prediction or diagnostics. Graph Attention Networks have proven to be versatile for a wide range of tasks by learning from both original features and graph structures. Here we present a graph attention model for predicting disease state from single-cell data on a large dataset of Multiple Sclerosis (MS) patients. MS is a disease of the central nervous system that is difficult to diagnose. We train our model on single-cell data obtained from blood and cerebrospinal fluid (CSF) for a cohort of seven MS patients and six healthy adults (HA), resulting in 66,667 individual cells. 
We achieve 92% accuracy in predicting MS, outperforming other state-of-the-art methods such as a graph convolutional network, random forest, and multi-layer perceptron. Further, we use the learned graph attention model to get insight into the features (cell types and genes) that are important for this prediction. The graph attention model also allow us to infer a new feature space for the cells that emphasizes the difference between the two conditions. Finally we use the attention weights to learn a new low-dimensional embedding which we visualize with PHATE and UMAP. To the best of our knowledge, this is the first effort to use graph attention, and deep learning in general, to predict disease state from single-cell data. We envision applying this method to single-cell data for other diseases.","authors":"Neal Ravindra|Arijit Sehanobish|Jenna L. Pappalardo|David A. Hafler|David van Dijk","doi_link":"http://dx.doi.org/10.1145/3368555.3384449","slideslive_active_date":"","slideslive_id":"38931924","title":"Disease state prediction from single-cell data using graph attention networks"},{"UID":"20P14","abstract":"The International Classification of Disease (ICD) is a widely used diagnostic ontology for the classification of health disorders and a valuable resource for healthcare analytics. However, ICD is an evolving ontology and subject to periodic revisions (e.g. ICD-9-CM to ICD-10-CM) resulting in the absence of complete cross-walks between versions. While clinical experts can create custom mappings across ICD versions, this process is both time-consuming and costly. We propose an automated solution that facilitates interoperability without sacrificing accuracy.Our solution leverages the SNOMED-CT ontology whereby medical concepts are organised in a directed acyclic graph. We use this to map ICD-9-CM to ICD-10-CM by associating codes to clinical concepts in the SNOMED graph using a nearest neighbors search in combination with natural language processing. 
To assess the impact of our method, the performance of a gradient boosted tree (XGBoost) developed to classify patients with Exocrine Pancreatic Insufficiency (EPI) disorder was compared when using features constructed by our solution versus clinically-driven methods. This dataset comprised 23,204 EPI patients and 277,324 non-EPI patients with data spanning from October 2011 to April 2017. Our algorithm generated clinical predictors with comparable stability across the ICD-9-CM to ICD-10-CM transition point when compared to ICD-9-CM/ICD-10-CM mappings generated by clinical experts. Preliminary modeling results showed highly similar performance for models based on the SNOMED mapping vs clinically defined mapping (71% precision at 20% recall for both models). Overall, the framework does not compromise on accuracy at the individual code level or at the model-level while obviating the need for time-consuming manual mapping.","authors":"Shaun Gupta|Frederik Dieleman|Patrick Long|Orla Doyle|Nadejda Leavitt","doi_link":"http://dx.doi.org/10.1145/3368555.3384453","slideslive_active_date":"","slideslive_id":"38931925","title":"Using SNOMED to automate clinical concept mapping"},{"UID":"20P15","abstract":"Systematic review (SR) is an essential process to identify, evaluate, and summarize the findings of all relevant individual studies concerning health-related questions. However, conducting a SR is labor-intensive, as identifying relevant studies is a daunting process that entails multiple researchers screening thousands of articles for relevance. In this paper, we propose MMiDaS-AE, a Multi-modal Missing Data aware Stacked Autoencoder, for semi-automating screening for SRs. We use a multi-modal view that exploits three representations, of: 1) documents, 2) topics, and 3) citation networks. Documents that contain similar words will be nearby in the document embedding space. 
Models can also exploit the relationship between documents and the associated SR MeSH terms to capture article relevancy. Finally, related works will likely share the same citations, and thus closely related articles would, intuitively, be trained to be close to each other in the embedding space. However, using all three learned representations as features directly result in an unwieldy number of parameters. Thus, motivated by recent work on multi-modal auto-encoders, we adopt a multi-modal stacked autoencoder that can learn a shared representation encoding all three representations in a compressed space. However, in practice one or more of these modalities may be missing for an article (e.g., if we cannot recover citation information). Therefore, we propose to learn to impute the shared representation even when specific inputs are missing. We find this new model significantly improves performance on a dataset consisting of 15 SRs compared to existing approaches.","authors":"Eric W. Lee|Byron C. Wallace|Karla I. Galaviz|Joyce C. Ho","doi_link":"http://dx.doi.org/10.1145/3368555.3384463","slideslive_active_date":"","slideslive_id":"38931926","title":"MMiDaS-AE: multi-modal missing data aware stacked autoencoder for biomedical abstract screening"},{"UID":"20P16","abstract":"Machine learning models for medical image analysis often suffer from poor performance on important subsets of a population that are not identified during training or testing. For example, overall performance of a cancer detection model may be high, but the model may still consistently miss a rare but aggressive cancer subtype. We refer to this problem as hidden stratification, and observe that it results from incompletely describing the meaningful variation in a dataset. While hidden stratification can substantially reduce the clinical efficacy of machine learning models, its effects remain difficult to measure. 
In this work, we assess the utility of several possible techniques for measuring hidden stratification effects, and characterize these effects both via synthetic experiments on the CIFAR-100 benchmark dataset and on multiple real-world medical imaging datasets. Using these measurement techniques, we find evidence that hidden stratification can occur in unidentified imaging subsets with low prevalence, low label quality, subtle distinguishing features, or spurious correlates, and that it can result in relative performance differences of over 20% on clinically important subsets. Finally, we discuss the clinical implications of our findings, and suggest that evaluation of hidden stratification should be a critical component of any machine learning deployment in medical imaging.","authors":"Luke Oakden-Rayner|Jared Dunnmon|Gustavo Carneiro|Christopher Re","doi_link":"http://dx.doi.org/10.1145/3368555.3384468","slideslive_active_date":"","slideslive_id":"38931927","title":"Hidden stratification causes clinically meaningful failures in machine learning for medical imaging"},{"UID":"20P17","abstract":"Automated assessment of rehabilitation exercises using machine learning has a potential to improve current rehabilitation practices. However, it is challenging to completely replicate therapist's decision making on the assessment of patients with various physical conditions. This paper describes an interactive machine learning approach that iteratively integrates a data-driven model with expert's knowledge to assess the quality of rehabilitation exercises. Among a large set of kinematic features of the exercise motions, our approach identifies the most salient features for assessment using reinforcement learning and generates a user-specific analysis to elicit feature relevance from a therapist for personalized rehabilitation assessment. While accommodating therapist's feedback on feature relevance, our approach can tune a generic assessment model into a personalized model. 
Specifically, our approach improves the performance of predicting assessments from 0.8279 to 0.9116 in average F1-score across three upper-limb rehabilitation exercises (p < 0.01). Our work demonstrates that machine learning models with feature selection can generate kinematic feature-based analysis as explanations of a model's predictions to elicit an expert's knowledge of assessment, and how machine learning models can be augmented with an expert's knowledge for personalized rehabilitation assessment.","authors":"Min Hun Lee|Daniel P. Siewiorek|Asim Smailagic|Alexandre Bernardino|Sergi Berm\u00fadez i Badia","doi_link":"http://dx.doi.org/10.1145/3368555.3384452","slideslive_active_date":"","slideslive_id":"38931928","title":"Interactive hybrid approach to combine machine and human intelligence for personalized rehabilitation assessment"},{"UID":"20P18","abstract":"Accurately extracting medical entities from social media is challenging because people use informal language with different expressions for the same concept, and they also make spelling mistakes. Previous work either focused on specific diseases (e.g., depression) or drugs (e.g., opioids) or, if working with a wide set of medical entities, only tackled individual and small-scale benchmark datasets (e.g., AskaPatient). In this work, we first demonstrated how to accurately extract a wide variety of medical entities such as symptoms, diseases, and drug names on three benchmark datasets from varied social media sources, and then also validated this approach on a large-scale Reddit dataset. We first implemented a deep-learning method using contextual embeddings that, on two existing benchmark datasets, one containing annotated AskaPatient posts (CADEC) and the other containing annotated tweets (Micromed), outperformed existing state-of-the-art methods. 
Second, we created an additional benchmark dataset by annotating medical entities in 2K Reddit posts (made publicly available under the name of MedRed) and showed that our method also performs well on this new dataset. Finally, to demonstrate that our method accurately extracts a wide variety of medical entities on a large scale, we applied the model pre-trained on MedRed to half a million Reddit posts. The posts came from disease-specific subreddits, so we could categorise them into 18 diseases based on the subreddit. We then trained a machine-learning classifier to predict the post's category solely from the extracted medical entities. The average F1 score across categories was 0.87. These results open up new cost-effective opportunities for modeling, tracking and even predicting health behavior at scale.","authors":"Sanja Scepanovic|Enrique Martin-Lopez|Daniele Quercia|Khan Baykaner","doi_link":"http://dx.doi.org/10.1145/3368555.3384467","slideslive_active_date":"","slideslive_id":"38931929","title":"Extracting medical entities from social media"},{"UID":"20P19","abstract":"While machine learning is rapidly being developed and deployed in health settings such as influenza prediction, there are critical challenges in using data from one environment to predict in another due to variability in features. Even within disease labels there can be differences (e.g. \"fever\" may mean something different reported in a doctor's office versus in an online app). Moreover, models are often built on passive, observational data which contain different distributions of population subgroups (e.g. men or women). Thus, there are two forms of instability between environments in this observational transport problem. We first harness substantive knowledge from health research to conceptualize the underlying causal structure of this problem in a health outcome prediction task. 
Based on sources of stability in the model and the task, we posit that we can combine environment and population information in a novel population-aware hierarchical Bayesian domain adaptation framework that harnesses multiple invariant components through population attributes when needed. We study the conditions under which invariant learning fails, leading to reliance on the environment-specific attributes. Experimental results for an influenza prediction task on four datasets gathered from different contexts show the model can improve prediction in the case of largely unlabelled target data from a new environment and different constituent population, by harnessing both environment and population invariant information. This work represents a novel, principled way to address a critical challenge by blending domain (health) knowledge and algorithmic innovation. The proposed approach will have significant impact in many social settings wherein who the data comes from, and how it was generated, matters.","authors":"Vishwali Mhasawade|Nabeel Abdur Rehman|Rumi Chunara","doi_link":"http://dx.doi.org/10.1145/3368555.3384451","slideslive_active_date":"","slideslive_id":"38931930","title":"Population-aware hierarchical Bayesian domain adaptation via multi-component invariant learning"},{"UID":"20P20","abstract":"Phenotyping electronic health records (EHR) focuses on defining meaningful patient groups (e.g., heart failure group and diabetes group) and identifying the temporal evolution of patients in those groups. Tensor factorization has been an effective tool for phenotyping. Most of the existing works assume either a static patient representation with aggregate data or only model temporal data. However, real EHR data contain both temporal (e.g., longitudinal clinical visits) and static information (e.g., patient demographics), which are difficult to model simultaneously. 
In this paper, we propose Temporal And Static TEnsor factorization (TASTE) that jointly models both static and temporal information to extract phenotypes. TASTE combines the PARAFAC2 model with non-negative matrix factorization to model a temporal and a static tensor. To fit the proposed model, we transform the original problem into simpler ones which are optimally solved in an alternating fashion. For each of the sub-problems, our proposed mathematical re-formulations lead to efficient sub-problem solvers. Comprehensive experiments on large EHR data from a heart failure (HF) study confirmed that TASTE is up to 14\u00d7 faster than several baselines and the resulting phenotypes were confirmed to be clinically meaningful by a cardiologist. Using 60 phenotypes extracted by TASTE, a simple logistic regression can achieve the same level of area under the curve (AUC) for HF prediction compared to a deep learning model using recurrent neural networks (RNN) with 345 features.","authors":"Ardavan Afshar|Ioakeim Perros|Haesun Park|Christopher deFilippi|Xiaowei Yan|Walter Stewart|Joyce Ho|Jimeng Sun","doi_link":"http://dx.doi.org/10.1145/3368555.3384464","slideslive_active_date":"","slideslive_id":"38931931","title":"TASTE: temporal and static tensor factorization for phenotyping electronic health records"},{"UID":"20P21","abstract":"In medicine, both ethical and monetary costs of incorrect predictions can be significant, and the complexity of the problems often necessitates increasingly complex models. Recent work has shown that changing just the random seed is enough for otherwise well-tuned deep neural networks to vary in their individual predicted probabilities. In light of this, we investigate the role of model uncertainty methods in the medical domain. Using RNN ensembles and various Bayesian RNNs, we show that population-level metrics, such as AUC-PR, AUC-ROC, log-likelihood, and calibration error, do not capture model uncertainty. 
Meanwhile, the presence of significant variability in patient-specific predictions and optimal decisions motivates the need for capturing model uncertainty. Understanding the uncertainty for individual patients is an area with clear clinical impact, such as determining when a model decision is likely to be brittle. We further show that RNNs with only Bayesian embeddings can be a more efficient way to capture model uncertainty compared to ensembles, and we analyze how model uncertainty is impacted across individual input features and patient subgroups.","authors":"Michael W. Dusenberry|Dustin Tran|Edward Choi|Jonas Kemp|Jeremy Nixon|Ghassen Jerfel|Katherine Heller|Andrew M. Dai","doi_link":"http://dx.doi.org/10.1145/3368555.3384457","slideslive_active_date":"","slideslive_id":"38931932","title":"Analyzing the role of model uncertainty for electronic health records"},{"UID":"20P22","abstract":"The ability of caregivers and investigators to share patient data is fundamental to many areas of clinical practice and biomedical research. Prior to sharing, it is often necessary to remove identifiers such as names, contact details, and dates in order to protect patient privacy. Deidentification, the process of removing identifiers, is challenging, however. High-quality annotated data for developing models is scarce; many target identifiers are highly heterogeneous (for example, there are uncountable variations of patient names); and in practice anything less than perfect sensitivity may be considered a failure. Consequently, software for adequately deidentifying clinical data is not widely available. As a result, patient data is often withheld when sharing would be beneficial, and identifiable patient data is often divulged when a deidentified version would suffice. In recent years, advances in machine learning methods have led to rapid performance improvements in natural language processing tasks, in particular with the advent of large-scale pretrained language models. 
In this paper we develop and evaluate an approach for deidentification of clinical notes based on a bidirectional transformer model. We propose human interpretable evaluation measures and demonstrate state-of-the-art performance against modern baseline models. Finally, we highlight current challenges in deidentification, including the absence of clear annotation guidelines, lack of portability of models, and paucity of training data. Code to develop our model is open source and simple to install, allowing for broad reuse.","authors":"Alistair E. W. Johnson|Lucas Bulgarelli|Tom J. Pollard","doi_link":"http://dx.doi.org/10.1145/3368555.3384455","slideslive_active_date":"","slideslive_id":"38931933","title":"Deidentification of free-text medical records using pre-trained bidirectional transformers"},{"UID":"20P23","abstract":"Researchers in machine learning for healthcare face challenges to progress and reproducibility due to a lack of standardized processing frameworks for public datasets. We present MIMIC-Extract, an open source pipeline for transforming the raw electronic health record (EHR) data of critical care patients from the publicly-available MIMIC-III database into data structures that are directly usable in common time-series prediction pipelines. MIMIC-Extract addresses three challenges in making complex EHR data accessible to the broader machine learning community. First, MIMIC-Extract transforms raw vital sign and laboratory measurements into usable hourly time series, performing essential steps such as unit conversion, outlier handling, and aggregation of semantically similar features to reduce missingness and improve robustness. Second, MIMIC-Extract makes prediction of clinically-relevant targets possible, including outcomes such as mortality and length-of-stay as well as comprehensive hourly intervention signals for ventilators, vasopressors, and fluid therapies. 
Finally, the pipeline emphasizes reproducibility and extensibility to future research questions. We demonstrate the pipeline's effectiveness by developing several benchmark tasks for outcome and intervention forecasting and assessing the performance of competitive models.","authors":"Shirly Wang|Matthew B. A. McDermott|Geeticka Chauhan|Marzyeh Ghassemi|Michael C. Hughes|Tristan Naumann","doi_link":"http://dx.doi.org/10.1145/3368555.3384469","slideslive_active_date":"","slideslive_id":"38931934","title":"MIMIC-Extract: a data extraction, preprocessing, and representation pipeline for MIMIC-III"}],"speakers":[{"UID":"20K01","abstract":"This talk outlines two Mila projects aimed at fighting the Covid-19 pandemic, which are part of Mila's AI for Humanity mission. The first one is about discovering antivirals, either via repurposing several existing drugs using graph neural networks or via discovering new drug-like molecules using reinforcement learning and docking simulations to search in the molecular space. The second project is about using machine learning to provide early warning signals to people who are contagious -- especially if they don't realize that they are -- by exchanging information between phones of users who have had dangerous contacts with each other. This extends digital contact tracing by incorporating information about symptoms, medical condition and behavior (like wearing a mask) and relies on a sophisticated epidemiological model at the individual level in which we can simulate different individual-level and society-level strategies.","bio":"Yoshua Bengio is Professor in the Computer Science and Operations Research departments at U. Montreal, founder and scientific director of Mila and of IVADO. 
He is a Fellow of the Royal Society of London and of the Royal Society of Canada, has received a Canada Research Chair and a Canada CIFAR AI Chair, and is a recipient of the 2018 Turing Award for pioneering deep learning. He is an officer of the Order of Canada, a member of the NeurIPS advisory board, co-founder and member of the board of the ICLR conference, and program director of the CIFAR program on Learning in Machines and Brains. His goal is to contribute to uncovering the principles giving rise to intelligence through learning, as well as to favour the development of AI for the benefit of all.","image":"static/images/speakers/yoshua_bengio.png","institution":"University of Montreal","slideslive_active_date":"","slideslive_id":"38931907","speaker":"Yoshua Bengio","title":"Machine Learning Challenges in the Fight for Social Good - the Covid-19 Case","website":""},{"UID":"20K02","abstract":"Countries in Sub-Saharan Africa are facing a double burden of infectious and noncommunicable diseases. The burden of noncommunicable diseases such as diabetes and hypertension is expected to continue increasing. Digital data and tools that can be used to study the patterns of health and disease in populations offer opportunities for improving public health. Digital platforms such as social media, search engines, and internet forums have been widely accepted in Sub-Saharan Africa for health information seeking and sharing. These tools can be used to improve public health in Sub-Saharan Africa in three ways: (1) monitoring health information seeking and providing health education, (2) monitoring risk factors, and (3) monitoring disease incidence. However, in order for these tools to be effective, it is important to consider and incorporate into analytical processes the distinct social, cultural, and economic context in Sub-Saharan African countries.","bio":"Dr. Nsoesie is an Assistant Professor of Global Health at Boston University (BU) School of Public Health. 
She is also a BU Data Science Faculty Fellow as part of the BU Data Science Initiative at the Hariri Institute for Computing and a Data and Innovation Fellow at The Directorate of Science, Technology and Innovation (DSTI) in the Office of the President in Sierra Leone. Dr. Nsoesie applies data science methodologies to global health problems, using digital data and technology to improve health, particularly in the realm of surveillance of chronic and infectious diseases. She has worked with local public health departments in the United States and international organizations. She completed her postdoctoral studies at Harvard Medical School, and her PhD in Computational Epidemiology from the Genetics, Bioinformatics and Computational Biology program at Virginia Tech. She also has an MS in Statistics and a BS in Mathematics. She is the founder of Reth\u00e9 \u2013 an initiative focused on providing scientific writing tools and resources to student communities in Africa in order to increase representation in scientific publications. She has written for NPR, The Conversation, Public Health Post and Quartz. Dr. Nsoesie was born and raised in Cameroon.","image":"static/images/speakers/elaine_nsoesie.jpg","institution":"Boston University","slideslive_active_date":"","slideslive_id":"38931908","speaker":"Elaine Nsoesie","title":"Digital Platforms & Public Health in Africa","website":""},{"UID":"20K03","abstract":"The massive size of the health care sector makes data science applications in this space particularly salient for social policy. An overarching theme of this keynote is that developing machine learning methodology tailored to specific substantive health problems and the associated electronic health data is critical given the stakes involved, rather than eschewing complexity in simplified scenarios that may no longer represent an actual real-world problem.","bio":"Sherri Rose, Ph.D. 
is an Associate Professor of Health Care Policy at Harvard Medical School and Co-Director of the Health Policy Data Science Lab. Her research in health policy focuses on risk adjustment, comparative effectiveness, and health program evaluation. Dr. Rose coauthored the first book on machine learning for causal inference and has published work across fields, including in Biometrics, JASA, PMLR, Journal of Health Economics, and NEJM. She currently serves as co-editor of the journal Biostatistics and is Chair-Elect of the American Statistical Association\u2019s Biometrics Section. Her honors include the ISPOR Bernie J. O\u2019Brien New Investigator Award for exceptional early career work in health economics and outcomes research and an NIH Director\u2019s New Innovator Award to develop machine learning estimators for generalizability in health policy.","image":"static/images/speakers/sherri_rose.png","institution":"Harvard Medical School","slideslive_active_date":"","slideslive_id":"38931910","speaker":"Sherri Rose","title":"Machine Learning in Health Care: Too Important to Be a Toy Example","website":""},{"UID":"20K04","abstract":"Details to be confirmed.","bio":"Dr. Ruslan Salakhutdinov is a UPMC professor of Computer Science at Carnegie Mellon University. He has served as an area chair for NIPS, ICML, CVPR, and ICLR. He holds a PhD from University of Toronto and completed postdoctoral training at Massachusetts Institute of Technology.","image":"static/images/speakers/ruslan_salakhutdinov.png","institution":"Carnegie Mellon University","slideslive_active_date":"","slideslive_id":"38931911","speaker":"Ruslan Salakhutdinov","title":"Incorporating Domain Knowledge into Deep Learning Models","website":""},{"UID":"20K05","abstract":"In this session we will explore strategies for, and issues involved in, bringing Artificial Intelligence (AI) technologies to the clinic, safely and ethically. 
We will discuss the characteristics of a sound data strategy for powering a machine learning (ML) health system. The session introduces a framework for analyzing the utility of ML models in healthcare and discusses the implicit assumptions in aligning incentives for AI-guided healthcare actions.","bio":"Dr. Nigam Shah is Associate Professor of Medicine (Biomedical Informatics) at Stanford University, and serves as the Associate CIO for Data Science for Stanford Health Care. Dr. Shah\u2019s research focuses on combining machine learning and prior knowledge in medical ontologies to enable the learning health system. Dr. Shah was elected into the American College of Medical Informatics (ACMI) in 2015 and was inducted into the American Society for Clinical Investigation (ASCI) in 2016. He holds an MBBS from Baroda Medical College, India, a PhD from Penn State University and completed postdoctoral training at Stanford University.","image":"static/images/speakers/nigam_shah.png","institution":"Stanford University","slideslive_active_date":"","slideslive_id":"38931909","speaker":"Nigam Shah","title":"A framework for shaping the future of AI in healthcare","website":""}],"symposiums":[{"UID":"20S01","abstract":"Serena Jeblee's (University of Toronto, Expected Aug 2020) research focuses on clinical natural language processing (NLP), with a special focus on the extraction of a normalized Cause of Death (CoD) from verbal autopsy reports. This research would be especially impactful in low to middle income countries, where verbal autopsy reports are common, and physical autopsies or medically certified causes of death are less common. Serena approaches this problem by first extracting a temporally ordered list of symptoms from the verbal autopsy report, then using these to construct a more accurate assessment of the overall CoD diagnosis. 
Serena's other work focuses on additional clinical NLP tasks, including automatic extraction of pertinent information from provider-patient dialogs.","authors":"Serena Jeblee","slideslive_active_date":"","slideslive_id":"38931967","title":"Doctoral Symposium Talk: Serena Jeblee"},{"UID":"20S02","abstract":"Dr. Serifat Folorunso's (University of Ibadan, 2019) research focuses on augmented survival data analysis using modified generalized gamma mixture cure models (GGMCMs) for cancer research. In particular, Dr. Folorunso's work examines generalizing traditional GGMCMs to better account for the acute-asymmetry in survival data by using a gamma link function. Dr. Folorunso's model demonstrated superior performance to a traditional GGMCM as well as other kinds of survival mixture-cure models on an ovarian cancer dataset from University College Hospital, Ibadan. Dr. Folorunso has also investigated additional aspects of survival models, as well as social determinants and impacts of neonatal health.","authors":"Serifat Folorunso","slideslive_active_date":"","slideslive_id":"38931968","title":"Doctoral Symposium Talk: Serifat Folorunso"},{"UID":"20S03","abstract":"Dr. Savannah Bergquist's (Harvard University, 2019) research focuses on accounting for missing not at random (MNAR) data in health contexts, specifically insurance plan payment policies and lung cancer staging from insurance claims data. In the former analyses, Dr. Bergquist's work uses missingness-sensitive ML methods to examine the contribution of various current practices to problematic incentives in Medicare plan payment policies, and to suggest improvement. In the latter research, Dr. Bergquist focuses on predicting a clinically meaningful lung cancer staging system using classification models. Dr. 
Bergquist also has examined other aspects of health insurance plan design and analysis.","authors":"Savannah Bergquist","slideslive_active_date":"","slideslive_id":"38931970","title":"Doctoral Symposium Talk: Savannah Bergquist"},{"UID":"20S04","abstract":"Paidamoyo Chapfuwa's (Duke University, Expected 2021) research focuses on bringing modern machine learning approaches to survival analysis, i.e., causal inference, generative modeling, and Bayesian nonparametrics. In particular, Paidamoyo's work examines generative methods for high-performance (accurate, calibrated, uncertainty-aware predictions) survival models. Moreover, her work introduces an adversarial distribution matching approach and a novel covariate-conditional Kaplan-Meier estimator, accounting for the predictive uncertainty in survival model calibration. In addition, her work also enables an interpretable time-to-event driven clustering method using a Bayesian nonparametric stick-breaking representation of the Dirichlet Process that represents patients in a clustered latent space. Recently, Paidamoyo\u2019s work has explored a unified framework for individualized treatment effect estimation for survival outcomes from observational data.","authors":"Paidamoyo Chapfuwa","slideslive_active_date":"","slideslive_id":"38931971","title":"Doctoral Symposium Talk: Paidamoyo Chapfuwa"},{"UID":"20S05","abstract":"Primoz Kocbek's (University of Maribor, Expected 2021) research focuses on interpretability and the use of synthetic data in machine learning models processing electronic health record (EHR) data. In particular, Primoz's research examined and provided a more nuanced analysis of the kinds of interpretability enabled by various kinds of models, including classifications of models as providing local vs. global or model-dependent vs. model-agnostic interpretability. 
Primoz also hopes to extend his research in the future with the use of synthetic data as additional structure to data, primarily leveraging the natural graph structure of some subsets of EHR data to improve predictive power.","authors":"Primoz Kocbek","slideslive_active_date":"","slideslive_id":"38931972","title":"Doctoral Symposium Talk: Primoz Kocbek"},{"UID":"20S06","abstract":"Jill Furzer's (University of Toronto, Expected 2020) research focuses on combining ensemble learning methods with an economics causal inference tool-kit to predict mental health risk in childhood, assess drivers of marginal misdiagnosis, and understand long-term socioeconomic implications of missed, late or low-value diagnoses. Jill compares classic regression with regularized regression and gradient boosted trees to estimate latent mental health risk in childhood in a nationally representative longitudinal health survey dataset, and further examines how sensitive these models are to protected subgroup information, including gender, rural vs. urban, and socioeconomic status. Jill's past research has further focused on modelling the cost-effectiveness of various pediatric oncology screening guidelines and treatments.","authors":"Jill Furzer","slideslive_active_date":"","slideslive_id":"38931974","title":"Doctoral Symposium Talk: Jill Furzer"},{"UID":"20S07","abstract":"Dr. Hasna Njah's (University of Sfax, 2019) research focuses on learning Bayesian networks (BNs) for health applications in the context of high-dimensional data. In particular, Dr. Njah's research proposes a new kind of BN, called a Bayesian Network Abstraction (BNA) framework, which uses latent variables to ameliorate the computational and optimization difficulties imposed by high-dimensional data. 
The BNA framework first uses dependency-based feature clustering algorithms to cluster input variables, followed by learning to summarize each cluster in a separate latent variable, thereby realizing the entire network in a hierarchical clustering & summarization BN, with the overall system learned using the greedy equilibrium criteria and hierarchical expectation maximization. In other work, Dr. Njah has focused on applying BNs to protein-protein interaction data and gene regulatory networks. ","authors":"Hasna Njah","slideslive_active_date":"","slideslive_id":"38931976","title":"Doctoral Symposium Talk: Hasna Njah"},{"UID":"20S08","abstract":"Vinyas Harish's (University of Toronto, Expected MD/PhD 2025) research focuses on the ways in which machine learning can complement traditional epidemiological perspectives and methods applied at the population and clinical levels, with an emphasis on promoting health systems resilience in the context of emergencies. Vinyas explores these topics in several ways, including a qualitative study on the ethics of private sector ML4H collaborations with stakeholders across technical, ethics/governance, and clinical domains, an examination of the utility of pandemic preparedness indices through cluster analysis, and the high-resolution prediction of COVID-19 transmission using mobility data and environmental covariates. 
Historically, Vinyas has also examined medical device safety and feasibility testing as well as the efficacy of novel methods for teaching clinicians image-guided procedures.","authors":"Vinyas Harish","slideslive_active_date":"","slideslive_id":"38931977","title":"Doctoral Symposium Talk: Vinyas Harish"},{"UID":"20S09","abstract":"Haohan Wang's (Carnegie Mellon University, Expected 2021) research focuses on the systematic development of trustworthy machine learning (ML) systems that can be deployed to answer biomedical questions in real-world scenarios, consistently responding over significant variations of the data. In particular, Haohan's work focuses on improving robustness of ML models to dataset shift, specifically towards the application of early prediction of Alzheimer's disease from genetic and imaging data. Haohan's methods focus on using a nuanced understanding of the data generative process in order to better account for expected distributional shifts, yielding more robust and interpretable models of Alzheimer's diagnosis. In other work, Haohan has also investigated the use of ML methods on genomic and transcriptomic data for biomedical applications.","authors":"Haohan Wang","slideslive_active_date":"","slideslive_id":"38931978","title":"Doctoral Symposium Talk: Haohan Wang"},{"UID":"20S10","abstract":"Mamadou Lamine MBOUP's (University of Thies, Expected 2022) research focuses on using ML methods over ultrasound data to perform early diagnosis and identification of liver damage within chronic liver disease patients and to classify said patients according to their severity. Especially in areas where chronic liver diseases, such as hepatitis, are prevalent, and liver cirrhosis and cancer are a significant health burden on the community, using ML methods to perform early diagnosis of these syndromes based on a low-cost modality like ultrasound would be extremely impactful. 
Mamadou's work investigates using supervised and unsupervised classical and deep learning methods to solve this problem, using data from a cohort of patients at the Aristide Le Dantec University Hospital Center. In past work, Mamadou has investigated algorithms for image compression, as well as investigated other health tasks in the cancer area.","authors":"Mamadou Lamine MBOUP","slideslive_active_date":"","slideslive_id":"38931979","title":"Doctoral Symposium Talk: Mamadou Lamine MBOUP"},{"UID":"20S11","abstract":"Tulika Kakati's (Tezpur University, Expected 2020) research focuses on gene expression analysis using ML to identify biomarkers across disease state and the cell cycle. Tulika's work has used novel clustering methods and identification of border genes for co-expression analysis, as well as developing novel deep learning approaches to the identification of differentially expressed genes via DEGnet, validating all models across a number of gene expression datasets. Tulika has also investigated improving the computational efficiency of these methods via distributed computing, specifically with regards to the application of their clustering algorithms.","authors":"Tulika Kakati","slideslive_active_date":"","slideslive_id":"38931980","title":"Doctoral Symposium Talk: Tulika Kakati"},{"UID":"20S12","abstract":"Nirvana Nursimulu's (University of Toronto, Expected 2021) research focuses on methods for computationally analyzing metabolic networks, with applications towards understanding pathogen growth in pursuit of drug development. Nirvana has examined the enzyme annotation problem, specifically focusing on producing methods that yield fewer false positives than traditional similarity search metrics while considering full sequence diversity within enzyme classes. 
In addition, Nirvana has developed an automated pipeline for enzyme annotation and reconstruction of a metabolic model, focusing on increasing model coverage in order to yield more realistic simulations. In other work, Nirvana has also investigated more traditional microbiology across various pathogens.","authors":"Nirvana Nursimulu","slideslive_active_date":"","slideslive_id":"38931981","title":"Doctoral Symposium Talk: Nirvana Nursimulu"},{"UID":"20S13","abstract":"Rohit Bhattacharya's (Johns Hopkins University, Expected 2021) research focuses on the development of causal methods that correct for understudied but ubiquitous sources of bias that arise during the course of data analyses, including data dependence, non-ignorable missingness, and model misspecification, in the study of infectious diseases. Rohit approaches these problems by developing novel graphical modeling techniques that can detect and correct for such sources of bias while providing the investigator with clear and interpretable representations of the underlying data dependence or missingness process. In dealing with model misspecification, Rohit has recently developed algorithms that yield doubly robust and efficient semi-parametric estimators for a wide class of causal graphical models, despite the presence of unmeasured confounders. In other work, Rohit has performed several investigations in oncology applications.","authors":"Rohit Bhattacharya","slideslive_active_date":"","slideslive_id":"38931983","title":"Doctoral Symposium Talk: Rohit Bhattacharya"},{"UID":"20S14","abstract":"Kaspar M\u00e4rtens's (University of Oxford, Expected 2020) research focuses on enabling feature-level interpretability in non-linear latent variable models via a synthesis of statistical and machine learning techniques. 
In particular, Kaspar designs novel latent variable, non-linear dimensionality reduction models that allow for feature-level interpretability, focusing primarily on Gaussian process latent variable models (GPLVMs) and variational autoencoders (VAEs), specifically augmenting these models with ideas from classical statistics, such as the functional analysis of variance (ANOVA) decomposition or probabilistic clustering algorithms. The result of these works is a class of models for flexible non-linear dimensionality reduction together with explainability, providing a mechanism to gain insights into what the model has learnt in terms of the observed features. In other work, Kaspar has examined genomic problems and applications of MCMC sampling.","authors":"Kaspar M\u00e4rtens","slideslive_active_date":"","slideslive_id":"38931984","title":"Doctoral Symposium Talk: Kaspar M\u00e4rtens"},{"UID":"20S15","abstract":"Luis Oala's (Fraunhofer Heinrich Hertz Institute, Expected 2021) research focuses on gaining a better understanding of the vulnerabilities of deep neural networks and finding tests to make these vulnerabilities visible, primarily through the lens of uncertainty quantification. Together with his research group, Luis has developed an effective and modular alarm system for image reconstruction DNNs. The alarm system, called Interval Neural Networks, allows for high-resolution error heatmaps during inference for use cases such as CT image reconstruction. 
As co-chair of the Working Group on Data and AI Solution Assessment Methods in the ITU/WHO Focus Group on AI4H (FG-AI4H), he also leads a group of interdisciplinary experts working towards a standardized assessment framework for the evaluation of health AIs.","authors":"Luis Oala","slideslive_active_date":"","slideslive_id":"38931985","title":"Doctoral Symposium Talk: Luis Oala"}],"tutorials":[{"UID":"20T01","abstract":"Survival analysis is used for predicting time-to-event outcomes, such as how long a patient will stay in the hospital, or when the recurrence of a tumor will likely happen. This tutorial aims to go over the basics of survival analysis, how it is used in healthcare, and some of its recent methodological advances from the ML community. We will also discuss open challenges. NOTE: This tutorial has a corresponding notebook: https://sites.google.com/view/chil-survival.","authors":"George H. Chen|Jeremy C. Weiss","bio":"","rocketchat_id":"","slideslive_active_date":"","slideslive_id":"38931962","title":"A Tour of Survival Analysis, from Classical to Modern"},{"UID":"20T02","abstract":"In this tutorial, we will describe population and public health and their essential role in a comprehensive strategy to improve health. We will illustrate state of the art data and modeling approaches in population and public health. 
In doing so, we will identify overlaps with and open questions relevant to machine learning, causal inference and fairness.","authors":"Vishwali Mhasawade|Yuan Zhao|Rumi Chunara","bio":"","rocketchat_id":"","slideslive_active_date":"","slideslive_id":"38931964","title":"Population and public health: challenges and opportunities"},{"UID":"20T03","abstract":"With today's publicly available, de-identified clinical datasets, it is possible to ask questions like, \u201cCan an algorithm read an electrocardiogram as well as a cardiologist can?\u201d However, other kinds of questions like, \u201cDoes this ECG relate to a later cardiac arrest?\u201d can\u2019t be answered with the limited public data available to us today. Research using private datasets gives us reason to be optimistic, but progress will be slow unless suitable de-identified datasets become open, allowing researchers to efficiently collaborate and compete. Learn about an effort underway at the University of Chicago, led by Ziad Obermeyer, Sendhil Mullainathan, and their team, to provide a secure and public \u201cImageNet for clinical data\u201d that balances the concerns of patients, healthcare institutions, and researchers.","authors":"Ziad Obermeyer|Katy Haynes|Amy Pitelka|Josh Risley|Katie Lin","bio":"","rocketchat_id":"","slideslive_active_date":"","slideslive_id":"38931961","title":"Public Health Datasets for Deep Learning: Challenges and Opportunities"},{"UID":"20T04","abstract":"This tutorial will be styled as a graduate lecture about medical imaging with deep learning. This will cover the background of popular medical image domains (chest X-ray and histology) as well as methods to tackle multi-modality/view, segmentation, and counting tasks. These methods will be covered in terms of architecture and objective function design. We will also discuss incorrect feature attribution and approaches to mitigate the issue. 
Prerequisites: basic knowledge of computer vision (CNNs) and machine learning (regression, gradient descent).","authors":"Joseph Paul Cohen","bio":"","rocketchat_id":"","slideslive_active_date":"","slideslive_id":"38931963","title":"State of the Art Deep Learning in Medical Imaging"},{"UID":"20T05","abstract":"Despite a wealth of data, only a small fraction of decisions in critical care are evidence based. In this tutorial we will start with the conception of an idea, solidify the hypothesis, operationalize the concepts involved, and execute the study in a reproducible and communicable fashion. We will run our study on MIMIC-IV, an update to MIMIC-III, and cover some of the exciting additions in the new database. This tutorial will be interactive and result in a study performed end-to-end in a Jupyter notebook. Technical expertise is not required, as we will form groups based on skill level.","authors":"Alistair Johnson","bio":"","rocketchat_id":"","slideslive_active_date":"","slideslive_id":"38931965","title":"Analyzing critical care data, from speculation to publication, starring MIMIC-IV (Part 1)"},{"UID":"20T06","abstract":"Despite a wealth of data, only a small fraction of decisions in critical care are evidence based. In this tutorial we will start with the conception of an idea, solidify the hypothesis, operationalize the concepts involved, and execute the study in a reproducible and communicable fashion. We will run our study on MIMIC-IV, an update to MIMIC-III, and cover some of the exciting additions in the new database. This tutorial will be interactive and result in a study performed end-to-end in a Jupyter notebook. 
Technical expertise is not required, as we will form groups based on skill level.","authors":"Alistair Johnson","bio":"","rocketchat_id":"","slideslive_active_date":"","slideslive_id":"38932058","title":"Analyzing critical care data, from speculation to publication, starring MIMIC-IV (Part 2)"}],"workshops":[{"UID":"20W01","abstract":"Small datasets form a significant portion of releasable data in high sensitivity domains such as healthcare. But providing differential privacy for small dataset release is a hard task, where current state-of-the-art methods suffer from severe utility loss. As a solution, we propose DPRP (Differentially Private Data Release via Random Projections), a reconstruction based approach for releasing differentially private small datasets. DPRP has several key advantages over the state-of-the-art. Using seven diverse real-life clinical datasets, we show that DPRP outperforms the current state-of-the-art on a variety of tasks, under varying conditions, and for all privacy budgets.","authors":"Lovedeep Gondara|Ke Wang","slideslive_id":"38931935","title":"Differentially Private \"Small\" Dataset Release Using Random Projections"},{"UID":"20W03","abstract":"Reinforcement Learning (RL) has recently been applied to several problems in healthcare, with a particular focus on offline learning from observational data. RL relies on the use of latent states that embed sequential observations in such a way that the embedding is sufficient to approximately predict the next observation, but the appropriate construction of such states in healthcare settings is an open question, as the variation in steady-state human physiology is poorly understood. In this work, we evaluate several information encoding schemes for offline RL using data from electronic health records (EHR). 
We use observations from septic patients in the MIMIC-III intensive care unit dataset, and evaluate the predictive performance of four embedding approaches in two tasks: predicting the next observation, and predicting a ``k-step'' look ahead or roll out. Our experiments highlight that the best performing state representation learning approaches utilize higher dimension recurrent neural architectures, and demonstrate the benefit of incorporating additional context with the state representation when predicting the next observation.","authors":"Taylor Killian|Jayakumar Subramanian|Mehdi Fatemi|Marzyeh Ghassemi","slideslive_id":"38931937","title":"Learning Representations for Prediction of Next Patient State"},{"UID":"20W04","abstract":"Capturing the inter-dependencies among multiple types of clinically-critical events is essential not only to accurate future event prediction, but also to better treatment planning. In this work, we propose a deep latent state-space generative model to capture the interactions among different types of correlated clinical events (e.g., kidney failure, mortality) by explicitly modeling the temporal dynamics of patients' latent states. Based on these learned patient states, we further develop a new general discrete-time formulation of the hazard rate function to estimate the survival distribution of patients with significantly improved accuracy. Extensive evaluations over real EMR data show that our proposed model compares favorably to various state-of-the-art baselines. Further, our method also uncovers meaningful insights about the latent correlation among mortality and different types of organ failures.","authors":"Yuan Xue|Denny Zhou|Nan Du|Andrew M. 
Dai|Zhen Xu|Kun Zhang|Claire Cui","slideslive_id":"38931938","title":"Deep State-Space Generative Model For Correlated Time-to-Event Predictions"},{"UID":"20W05","abstract":"Survival function estimation is used in many disciplines, but it is most common in medical analytics in the form of the Kaplan-Meier estimator. Sensitive data (patient records) is used in the estimation without any explicit control on the information leakage, which is a significant privacy concern. We propose a first differentially private estimator of the survival function and show that it can be easily extended to provide differentially private confidence intervals and test statistics without spending any extra privacy budget. We further provide extensions for differentially private estimation of the competing risk cumulative incidence function, Nelson-Aalen's estimator for the hazard function, etc. Using eleven real-life clinical datasets, we provide empirical evidence that our proposed method provides good utility while simultaneously providing strong privacy guarantees.","authors":"Lovedeep Gondara|Ke Wang","slideslive_id":"38931939","title":"Differentially Private Survival Function Estimation"},{"UID":"20W07","abstract":"As machine learning has become increasingly applied to medical imaging data, noise in training labels has emerged as an important challenge. Variability in diagnosis of medical images is well established; in addition, variability in training and attention to task among medical labelers may exacerbate this issue. Methods for identifying and mitigating the impact of low quality labels have been studied, but are not well characterized in medical imaging tasks. For instance, Noisy Cross-Validation splits the training data into halves, and has been shown to identify low-quality labels in computer vision tasks; but it has not been applied to medical imaging tasks specifically. 
In addition, there may be concerns around label imbalance for medical image sets, where relevant pathology may be rare. In this work we introduce Stratified Noisy Cross-Validation (SNCV), an extension of noisy cross-validation. SNCV allows us to measure confidence in model prediction and assign a quality score to each example; supports label stratification to handle class imbalance; and identifies likely low-quality labels to analyse the causes. In contrast to noisy cross-validation, sample selection for SNCV occurs after training two models, not during training, which simplifies application of the method. We assess performance of SNCV on diagnosis of glaucoma suspect risk (GSR) from retinal fundus photographs, a clinically important yet nuanced labeling task. Using training data from a previously-published deep learning model, we compute a continuous quality score (QS) for each training example. We relabel 1,277 low-QS examples using a trained glaucoma specialist; the new labels agree with the SNCV prediction over the initial label >85% of the time, indicating that low-QS examples mostly reflect labeler errors. We then quantify the impact of training with only high-QS labels, showing that strong model performance may be obtained with many fewer examples. By applying the method to a randomly sub-sampled training dataset, we show that our method can reduce labeling burden by approximately 50% while achieving model performance non-inferior to using the full dataset on multiple held-out test sets.","authors":"Joy Hsu|Sonia Phene|Akinori Mitani|Jieying Luo|Naama Hammel|Jonathan Krause|Rory Sayres","slideslive_id":"38931941","title":"Improving medical annotation quality to decrease labeling burden using stratified noisy cross-validation"},{"UID":"20W08","abstract":"Modeling disease progression is an active area of research. Many computational methods for progression modeling have been developed, but mostly at population levels. 
In this paper, we formulate a personalized disease progression modeling problem as a multi-task regression problem where the estimation of progression scores at different time points is defined as a learning task. We introduce a Personalized Progression Modeling (PPM) scheme as a novel way to estimate personalized trajectories of disease by jointly discovering clusters of similar patients while estimating disease progression scores. The approach is formulated as an optimization problem that can be solved using existing optimization techniques. We present efficient algorithms for the PPM scheme, together with experimental results on both synthetic and real world healthcare data proving its analytical efficacy over 4 other baseline methods representing the current state of the art. On synthetic data, we show that our algorithm achieves over 40% accuracy improvement over all the baselines. On the healthcare application, PPM has a 4% accuracy improvement on average over the state-of-the-art baseline in predicting the viral infection progression. These results highlight significant modeling performance gains obtained with PPM.","authors":"Mohamed Ghalwash|Daby Sow","slideslive_id":"38931942","title":"A Multi-Task Learning Approach to Personalized Progression Modeling"},{"UID":"20W09","abstract":"Clinical notes in electronic health records contain highly heterogeneous writing styles, including non-standard terminology or abbreviations. Using these notes in predictive modeling has traditionally required preprocessing (e.g. taking frequent terms or topic modeling) that removes much of the richness of the source data. 
We propose a pretrained hierarchical recurrent neural network model that parses minimally processed clinical notes in an intuitive fashion, and show that it improves performance for discharge diagnosis classification tasks on the Medical Information Mart for Intensive Care III (MIMIC-III) dataset, compared to models that conduct no pretraining or that treat the notes as an unordered collection of terms. We also apply an attribution technique to examples to identify the words that the model uses to make its prediction, and show the importance of the words\u2019 nearby context.","authors":"Jonas Kemp|Alvin Rajkomar|Andrew M. Dai","slideslive_id":"38931943","title":"Improved Patient Classification with Hierarchical Language Model Pretraining over Clinical Notes"},{"UID":"20W10","abstract":"Industrial equipment, devices and patients typically undergo change from a healthy state to an unhealthy state. We develop a novel approach to detect unhealthy entities and also discover the time of change to enable deeper investigation into the cause for change. In the absence of an engineering or medical intervention, health degradation only happens in one direction --- healthy to unhealthy. Our transductive learning framework leverages this chronology of observations for learning a superior model with minimal supervision. Temporal Transduction is achieved by incorporating chronological constraints in the conventional max-margin classifier --- Support Vector Machines (SVM). We utilize stochastic gradient descent to solve the resulting optimization problem. 
Our experiments on publicly available benchmark datasets demonstrate the effectiveness of our approach in accurately detecting unhealthy entities with less supervision as compared to other strong baselines --- conventional and transductive SVM.","authors":"Abhay Harpale","slideslive_id":"38931944","title":"Health change detection using temporal transductive learning"},{"UID":"20W11","abstract":"In many Machine Learning applications, it is important to reduce the set of features used in training. This is especially important when different attributes have different acquisition costs, e.g., various blood tests. Cost-sensitive feature selection methods aim to select a subset of attributes that yields a performant Machine Learning model while keeping the total cost low. In this paper, we propose a Bayesian Optimization approach to this task. We explore the different subsets of available features by optimizing an evaluation function that weights the model's performance and total feature cost. We evaluate the proposed method on different UCI datasets, as well as a real-life one, and compare it to diverse feature selection approaches. Our results demonstrate that the Bayesian optimization cost-sensitive feature selection (BOCFS) can select a low-cost subset of informative features, therefore generating highly effective classifiers, and achieving state-of-the-art performance in some datasets.","authors":"Lucca G. Zenobio|Thiago N. C. Cardoso|Andrea Kauffmann|Augusto Antunes","slideslive_id":"38931945","title":"Cost-Sensitive Feature Selection Using Bayesian Optimization"},{"UID":"20W12","abstract":"With the increase in popularity of deep learning models for natural language processing (NLP) tasks in the field of Pharmacovigilance, more specifically for the identification of Adverse Drug Reactions (ADRs), there is an inherent need for large-scale social-media datasets aimed at such tasks. 
Most researchers allocate large amounts of time to crawling Twitter or buy expensive pre-curated datasets that are then manually annotated by humans; these approaches do not scale well as more and more data flows into Twitter. In this work we re-purpose a publicly available archived dataset of more than 9.4 billion Tweets with the objective of creating a very large dataset of drug usage-related tweets. Using existing manually curated datasets from the literature, we then validate our filtered tweets for relevance using machine learning methods, with the end result being a publicly available dataset of 1,181,993 tweets for public use. We provide all code and detailed procedure on how to extract this dataset and the selected tweet ids for researchers to use.","authors":"Ramya Tekumalla|Juan M Banda","slideslive_id":"38931946","title":"A large-scale Twitter dataset for drug safety applications mined from publicly existing resources"},{"UID":"20W13","abstract":"Clinical notes contain information about patients beyond structured data such as lab values or medications. However, clinical notes have been underused relative to structured data, because notes are high-dimensional and sparse. We aim to develop and evaluate a continuous representation of clinical notes. Given this representation, our goal is to predict 30-day hospital readmission at various timepoints of admission, including early stages and at discharge. We apply bidirectional encoder representations from transformers (BERT) to clinical text. Publicly-released BERT parameters are trained on standard corpora such as Wikipedia and BookCorpus, which differ from clinical text. We therefore pre-train BERT using clinical notes and fine-tune the network for the task of predicting hospital readmission. This defines ClinicalBERT. ClinicalBERT uncovers high-quality relationships between medical concepts, as judged by physicians. 
ClinicalBERT outperforms various baselines on 30-day hospital readmission prediction using both discharge summaries and the first few days of notes in the intensive care unit on various clinically-motivated metrics. The attention weights of ClinicalBERT can also be used to interpret predictions. To facilitate research, we open-source model parameters, and scripts for training and evaluation. ClinicalBERT is a flexible framework to represent clinical notes. It improves on previous clinical text processing methods and with little engineering can be adapted to other clinical predictive tasks.","authors":"Kexin Huang|Jaan Altosaar|Rajesh Ranganath","slideslive_id":"38931947","title":"ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission"},{"UID":"20W14","abstract":"Problem lists are intended to provide clinicians with a relevant summary of patient medical issues and are embedded in many electronic health record systems. Despite their importance, problem lists are often cluttered with resolved or currently irrelevant conditions. In this work, we develop a novel end-to-end framework to first extract problem lists from clinical notes and subsequently use the extracted problems to predict patient outcomes. This framework is both more performant and more interpretable than existing models used within the domain, achieving an AU-ROC of 0.710 for bounceback readmission and 0.869 for in-hospital mortality occurring after ICU discharge. We identify risk factors for both readmission and mortality outcomes and demonstrate that our framework can be used to develop dynamic problem lists that present clinical problems along with their quantitative importance. 
This allows clinicians to both easily identify the relevant problems and gain insight into the factors driving the model\u2019s prediction.","authors":"Justin Lovelace|Nathan Hurley|Adrian Haimovich|Bobak Mortazavi","slideslive_id":"38931948","title":"Mining Dynamic Problem Lists from Clinical Notes for the Interpretable Prediction of Adverse Outcomes"},{"UID":"20W15","abstract":"Deep learning is increasingly common in healthcare, yet transfer learning for physiological signals (e.g., temperature, heart rate, etc.) is under-explored. Here, we present a straightforward, yet performant framework for transferring knowledge about physiological signals. Our framework is called PHASE (PHysiologicAl Signal Embeddings). It i) learns deep embeddings of physiological signals and ii) predicts adverse outcomes based on the embeddings. PHASE is the first instance of deep transfer learning in a cross-hospital, cross-department setting for physiological signals. We show that PHASE's per-signal (one for each signal) LSTM embedding functions confer a number of benefits including improved performance, successful transference between hospitals, and lower computational cost.","authors":"Hugh Chen|Scott Lundberg|Gabe Erion|Jerry H. Kim|Su-In Lee","slideslive_id":"38931949","title":"Deep Transfer Learning for Physiological Signals"},{"UID":"20W16","abstract":"Machine-learned diagnosis models have shown promise as medical aides but are trained under a closed-set assumption, i.e. that models will only encounter conditions on which they have been trained. However, it is practically infeasible to obtain sufficient training data for every human condition, and once deployed such models will invariably face previously unseen conditions. We frame machine-learned diagnosis as an open-set learning problem, and study how state-of-the-art approaches compare. 
Further, we extend our study to a setting where training data is distributed across several healthcare sites that do not allow data pooling, and experiment with different strategies of building open-set diagnostic ensembles. Across both settings, we observe consistent gains from explicitly modeling unseen conditions, but find the optimal training strategy to vary across settings.","authors":"Viraj Prabhu|Anitha Kannan|Geoffrey J. Tso|Namit Katariya|Manish Chablani|David Sontag|Xavier Amatriain","slideslive_id":"38931950","title":"Open Set Medical Diagnosis"},{"UID":"20W17","abstract":"This paper aims to evaluate the suitability of current deep learning methods for clinical workflows, focusing on dermatology. Although deep learning methods have attempted to reach dermatologist-level accuracy in several individual conditions, they have not been rigorously tested on common clinical complaints. Most projects involve data acquired in well-controlled laboratory conditions. This may not reflect regular clinical evaluation, where the corresponding image quality is not always ideal. We test the robustness of deep learning methods by simulating non-ideal characteristics on user-submitted images of ten classes of diseases. Assessing via imitated conditions, we have found the overall accuracy to drop and individual predictions to change significantly in many cases despite robust training.","authors":"Sourav Mishra|Subhajit Chaudhury|Hideaki Imaizumi|Toshihiko Yamasaki","slideslive_id":"38931951","title":"Assessing Robustness of Deep Learning Methods in Dermatological Workflow"},{"UID":"20W20","abstract":"Sentiment analysis is a well-researched field of machine learning and natural language processing generally concerned with determining the degree of positive or negative polarity in free text. Traditionally, such methods have focused on analyzing user opinions directed towards external entities such as products, news, or movies. 
However, less attention has been paid towards understanding the sentiment of human emotion in the form of internalized thoughts and expressions of self-reflection. Given the rise of public social media platforms and private online therapy services, the opportunity for designing accurate tools to quantify emotional states is at an all-time high. Based upon findings in psychological research, in this work we propose a new type of sentiment analysis task more appropriate for assessing the valence of human emotion. Rather than assessing text on a single polarity axis ranging from positive to negative, we analyze self-expressive thoughts using a two-dimensional assignment scheme with four sentiment categories: positive, negative, both positive and negative, and neither positive nor negative. This work details the collection of a novel annotated dataset of real-world mental health therapy logs and compares several machine learning methodologies for the accurate classification of emotional valence. We found superior performance using deep transfer learning approaches, and in particular, best results were obtained using the recent breakthrough method of BERT (Bidirectional Encoder Representations from Transformers). Based on these results, it is clear that transfer learning has the potential for greatly improving the accuracy of classifiers in the mental health domain, where labeled data is often scarce. 
Additionally, we argue that representing emotional sentiment on decoupled valence axes via four classification labels is an appropriate modification of traditional sentiment analysis for mental health tasks.","authors":"Benjamin Shickel|Martin Heesacker|Sherry Benton|Parisa Rashidi","slideslive_id":"38931954","title":"Automated Emotional Valence Prediction in Mental Health Text via Deep Transfer Learning"},{"UID":"20W21","abstract":"Electronic Health Records (EHRs) are commonly used by the machine learning community for research on problems specifically related to health care and medicine. EHRs have the advantages that they can be easily distributed and contain many features useful for e.g. classification problems. What makes EHR data sets different from typical machine learning data sets is that they are often very sparse, due to their high dimensionality, and often contain heterogeneous data types. Furthermore, the data sets deal with sensitive information, which limits the distribution of any models learned using them, due to privacy concerns. In this work, we explore using Generative Adversarial Networks to generate synthetic, heterogeneous EHRs with the goal of using these synthetic records in place of existing data sets. We will further explore applying differential privacy (DP) preserving optimization in order to produce differentially private synthetic EHR data sets, which provide rigorous privacy guarantees, and are therefore more easily shareable. The performance (measured by AUROC, AUPRC and accuracy) of our model's synthetic, heterogeneous data is very close to the original data set (within 6.4%) for the non-DP model when tested in a binary classification task. Although incurring a 20% performance penalty, the DP synthetic data is still useful for machine learning tasks. 
We additionally perform a sub-population analysis and find that our model does not introduce any bias into the synthetic EHR data compared to the baseline in either male/female populations, or the 0-18, 19-50 and 51+ age groups in terms of classification performance.","authors":"Kieran Chin-Cheong|Thomas M. Sutter|Julia E. Vogt","slideslive_id":"38931955","title":"Generation of Differentially Private Heterogeneous Synthetic Electronic Health Records using GANs"},{"UID":"20W22","abstract":"Intensive Care Unit Electronic Health Records (ICU EHRs) store multimodal data about patients including clinical notes, sparse and irregularly sampled physiological time series, lab results, and more. To date, most methods designed to learn predictive models from ICU EHR data have focused on a single modality. In this paper, we leverage the recently proposed interpolation-prediction deep learning architecture as a basis for exploring how physiological time series data and clinical notes can be integrated into a unified mortality prediction model. We study both early and late fusion approaches, and demonstrate how the relative predictive value of clinical text and physiological data change over time. Our results show that a late fusion approach can provide a statistically significant improvement in mortality prediction performance over using individual modalities in isolation.","authors":"Satya Narayan Shukla|Benjamin Marlin","slideslive_id":"38931956","title":"Integrating Physiological Time Series and Clinical Notes with Deep Learning for Improved ICU Mortality Prediction"},{"UID":"20W23","abstract":"Although there have been several recent advances in the application of deep learning algorithms to chest x-ray interpretation, we identify three major challenges for the translation of chest x-ray algorithms to the clinical setting. 
We examine the performance of the top 10 performing models on the CheXpert challenge leaderboard on three tasks: (1) TB detection, (2) pathology detection on photos of chest x-rays, and (3) pathology detection on data from an external institution. First, we find that the top 10 chest x-ray models on the CheXpert competition achieve an average AUC of 0.851 on the task of detecting TB on two public TB datasets without fine-tuning or including the TB labels in training data. Second, we find that the average performance of the models on photos of x-rays (AUC = 0.916) is similar to their performance on the original chest x-ray images (AUC = 0.924). Third, we find that the models tested on an external dataset either perform comparably to or exceed the average performance of radiologists. We believe that our investigation will inform rapid translation of deep learning algorithms to safe and effective clinical decision support tools that can be validated prospectively with large impact studies and clinical trials.","authors":"Pranav Rajpurkar|Anirudh Joshi|Phil Chen|Anuj Pareek|Amir Kiani|Matthew Lungren|Andrew Ng|Jeremy Irvin","slideslive_id":"38931957","title":"CheXpedition: Investigating Generalization Challenges for Translation of Chest X-Ray Algorithms to the Clinical Setting"},{"UID":"20W24","abstract":"Documenting patients' interactions with health providers and institutions requires summarizing highly complex data. Medical coding reduces the dimensionality of this problem to a set of manually assigned codes that are used to bill, track patient health, and summarize a patient encounter. Incorrect coding, however, can lead to significant financial, legal, and health costs to clinics and patients. To address this, we build several deep learning models -- including transfer learning of state-of-the-art BERT models -- to predict medical codes on a novel dataset of 39,000 patient encounters. 
We also show through several labeling experiments that model performance is robust to subjectivity in the labels, and find that our models outperform a clinic's coding when judged against charts corrected and relabeled by an expert.","authors":"Mehmet Seflek|Wesam Elshamy|Abboud Chaballout|Ali Madani","slideslive_id":"38931958","title":"Automated Medical Coding using BERT: Benchmarking Deep Learning in the Face of Subjective Labels"},{"UID":"20W25","abstract":"Representation learning is a commonly touted goal in machine learning for healthcare, and for good reason. If we could learn a numerical encoding of clinical data which is reflective of underlying physiological similarity, this would have significant benefits both in research and application. However, many works pursuing representation learning systems evaluate only according to traditional, single-task performance metrics, and fail to assess whether or not the representations they produce actually contain generalizable signals capturing this underlying notion of similarity. In this work, we design an evaluation procedure specifically for representation learning systems, and use it to analyze the value of large-scale multi-task representation learners. We find mixed results: multi-task representations are commonly helpful across a battery of prediction tasks and models, even while ensemble performance is often improved by removing tasks from the trained ensemble, and the learned representations demonstrate no ability to cluster.","authors":"Matthew McDermott|Bret Nestor|Wancong Zhang|Peter Szolovits|Anna Goldenberg|Marzyeh Ghassemi","slideslive_id":"38931959","title":"Distracted Multi-task Learning: Addressing Negative Transfer with Fine-tuning on EHR Time-series Data"},{"UID":"20W26","abstract":"In the last few years, the FDA has begun to recognize De Novo pathways (new approval processes) for approving AI as medical devices. 
A major concern with this is that the review process does not adequately test for biases in these models. There are many ways in which biases can arise in data, including during data collection, training, and model deployment. In this paper, we adopt a framework for categorizing the types of bias in datasets in a fine-grained way, which enables informed, targeted interventions for each issue. From there, we propose policy recommendations to the FDA and NIH to promote the deployment of more equitable AI diagnostic systems.","authors":"Julie R Vaughn|Avital Baral|Mayukha Vadari|William Boag","slideslive_id":"38931960","title":"Dataset Bias in Diagnostic AI systems: Guidelines for Dataset Collection and Usage"},{"UID":"20W27","abstract":"Electronic records contain sequences of events, some of which take place all at once in a single visit, and others that are dispersed over multiple visits, each with a different timestamp. We postulate that fine temporal detail, e.g., whether a series of blood tests is completed at once or in rapid succession, should not alter predictions based on this data. Motivated by this intuition, we propose models for analyzing sequences of multivariate clinical time series data that are invariant to this temporal clustering. We propose an efficient data augmentation technique that exploits the postulated temporal-clustering invariance to regularize deep neural networks optimized for several clinical prediction tasks. We introduce two techniques to temporally coarsen (downsample) irregular time series: (i) grouping the data points based on regularly-spaced timestamps; and (ii) clustering them, yielding irregularly-paced timestamps. Moreover, we propose a MultiResolution network with Shared Weights (MRSW), improving predictive accuracy by combining predictions based on input sequences transformed by different coarsening operators. 
Our experiments show that MRSW improves the mAP on the benchmark mortality prediction task from 51.53% to 53.92%.","authors":"Mohammad Taha Bahadori|Zachary Lipton","slideslive_id":"38931987","title":"Temporal-Clustering Invariance in Irregular Healthcare Time Series"},{"UID":"20W28","abstract":"In survival analysis, deep learning approaches have recently been proposed for estimating an individual's probability of survival over some time horizon. Such approaches can capture complex non-linear relationships, without relying on restrictive assumptions regarding the specific form of the relationship between an individual's characteristics and their underlying survival process. To date, however, these methods have focused primarily on optimizing discriminative performance, and have ignored model calibration. Well-calibrated survival curves present realistic and meaningful probabilistic estimates of the true underlying survival process for an individual. However, due to the lack of ground-truth regarding the underlying stochastic process of survival for an individual, optimizing for and measuring calibration in survival analysis is an inherently difficult task. In this work, we i) propose a new loss function, for training deep nonparametric survival analysis models, that maximizes discriminative performance, subject to good calibration, and ii) present a calibration metric for survival analysis that facilitates model comparison. Through experiments on two publicly available clinical datasets, we show that our proposed approach achieves the same discriminative performance as state-of-the-art methods, while leading to over a 60% reduction in calibration error.","authors":"Fahad Kamran|Jenna Wiens","slideslive_id":"38931988","title":"Calibrated Deep Nonparametric Survival Analysis"}]},"2021":{"highlights":"
Day 1
Day 2
\n\n### Governing Board\n\n###### **General Chair**\n- Dr. Marzyeh Ghassemi of University of Toronto and Vector Institute\n###### **Program Chairs**\n- Dr. Tristan Naumann of Microsoft Research Seattle\n- Dr. Emma Pierson of Stanford University and Microsoft Research\n###### **Proceedings Chairs**\n- Emily Alsentzer of Harvard University and MIT\n- Matthew McDermott of MIT\n- Dr. George Chen of Carnegie Mellon University\n###### **Track Chairs**\n- ###### **Models and Methods**\n * Dr. Mike Hughes of Tufts University\n * Dr. Shalmali Joshi of Harvard University\n * Dr. Rajesh Ranganath of New York University\n * Dr. Rahul Krishnan of Microsoft Research, University of Toronto and Vector Institute\n- ###### **Applications and Practice**\n * Dr. Andrew Beam of Harvard University\n * Dr. Tom Pollard of MIT\n * Dr. Bobak Mortazavi of Texas A&M University\n * Dr. Uri Shalit of Technion - Israel Institute of Technology\n- ###### **Impact and Society**\n * Dr. Alistair Johnson of The Hospital for Sick Children\n * Dr. Rumi Chunara of New York University\n * Dr. George Chen of Carnegie Mellon University\n###### **Communications Chairs**\n- Dr. Stephanie Hyland of Microsoft Research Cambridge\n- Dr. Sanja \u0160\u0107epanovi\u0107 of Bell Labs Cambridge\n- Dr. Sanmi Koyejo of University of Illinois at Urbana-Champaign and Google Research\n###### **Finance Chairs**\n- Dr. Joyce Ho of Emory University\n- Dr. Brett Beaulieu-Jones of Harvard Medical School\n###### **Tutorial Chairs**\n- Irene Chen of MIT\n- Dr. Jessica Gronsbell of University of Toronto\n###### **Virtual Chairs**\n- Dr. Stephanie Hyland of Microsoft Research Cambridge\n- Dr. Tom Pollard of MIT\n###### **Logistics Chair**\n- Tasmie Sarker of University of Toronto\n\n### Steering Committee\n\n- Dr. Yindalon Aphinyanaphongs of NYU\n- Dr. Leo Celi of MIT\n- Dr. Nigam Shah of Stanford University\n- Dr. Stephen Friend of Oxford University\n- Dr. Alan Karthikesalingam of Google Health UK\n- Dr. 
Ziad Obermeyer of University of California, Berkeley\n- Dr. Samantha Kleinberg of Stevens Institute of Technology\n- Dr. Anna Goldenberg of The Hospital for Sick Children Research Institute\n- Dr. Lucila Ohno-Machado of University of California, San Diego\n- Dr. Noemie Elhadad of Columbia University\n- Dr. Katherine Heller at Google Research\n- Dr. Laura Rosella of Dalla Lana School of Public Health, University of Toronto\n- Dr. Shakir Mohamed of DeepMind\n\n### Sponsors\nWe thank the Association for Computing Machinery (ACM) for sponsoring CHIL 2021, as well as the following organizations for supporting the event:\n\n- Google\n- Health[at]Scale\n- Layer6\n- Creative Destruction Lab\n- Vector Institute\n","proceedings":[{"UID":"21P01","abstract":"Electronic Health Records (EHR) are high-dimensional data with implicit connections among thousands of medical concepts. These connections, for instance, the co-occurrence of diseases and lab-disease correlations can be informative when only a subset of these variables is documented by the clinician. A feasible approach to improving the representation learning of EHR data is to associate relevant medical concepts and utilize these connections. Existing medical ontologies can be the reference for EHR structures, but they place numerous constraints on the data source. Recent progress on graph neural networks (GNN) enables end-to-end learning of topological structures for non-grid or non-sequential data. However, there are problems to be addressed on how to learn the medical graph adaptively and how to understand the effect of the medical graph on representation learning. In this paper, we propose a variationally regularized encoder-decoder graph network that achieves more robustness in graph structure learning by regularizing node representations. Our model outperforms the existing graph and non-graph based methods in various EHR predictive tasks based on both public data and real-world clinical data. 
Beyond the improvements in empirical performance, we provide an interpretation of the effect of variational regularization compared to a standard graph neural network, using singular value analysis.","authors":"Weicheng Zhu (New York University) | Narges Razavian (NYU Grossman School of Medicine)","doi_link":"https://doi.org/10.1145/3450439.3451855","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954722","title":"Variationally Regularized Graph-based Representation Learning for Electronic Health Records"},{"UID":"21P02","abstract":"Set classification is the task of predicting a single label from a set comprising multiple instances. The examples we consider are pathology slides represented by sets of patches and medical text represented by sets of word embeddings. State-of-the-art methods, such as transformers, typically use attention mechanisms to learn representations of set-data by modeling interactions between instances of the set. These methods, however, have complex heuristic architectures comprising multiple heads and layers. The complexity of attention architectures hampers their training when only a small number of labeled sets is available, as is often the case in medical applications. To address this problem, we present a kernel-based representation learning framework that associates learning affinity kernels with learning representations from attention architectures. We show that learning a combination of the sum and the product of kernels is equivalent to learning representations from multi-head multi-layer attention architectures. From our framework, we devise a simplified attention architecture which we term \\emph{affinitention} (affinity-attention) nets. We demonstrate the application of affinitention nets to the classification of the Set-Cifar10 dataset, thyroid malignancy prediction from pathology slides, as well as patient text message-triage. 
We show that affinitention nets provide competitive results compared to heuristic attention architectures and outperform other competing methods.","authors":"David Dov, Serge Assaad, Shijing Si, and Rui Wang (Duke University) | Hongteng Xu (Renmin University of China) | Shahar Ziv Kovalsky (UNC at Chapel Hill) | Jonathan Bell and Danielle Elliott Range (Duke University Hospital) | Jonathan Cohen (Kaplan Medical Center) | Ricardo Henao and Lawrence Carin (Duke University)","doi_link":"https://doi.org/10.1145/3450439.3451856","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954723","title":"Affinitention Nets: Kernel Perspective on Attention Architectures for Set Classification with Applications to Medical Text and Images"},{"UID":"21P03","abstract":"Machine Learning, and in particular Federated Machine Learning, opens new perspectives in terms of medical research and patient care. Although Federated Machine Learning improves over centralized Machine Learning in terms of privacy, it does not provide provable privacy guarantees. Furthermore, Federated Machine Learning is quite expensive in terms of bandwidth consumption, as it requires participant nodes to regularly exchange large updates. This paper proposes a bandwidth-efficient, privacy-preserving Federated Learning approach that provides theoretical privacy guarantees based on Differential Privacy. We experimentally evaluate our proposal for in-hospital mortality prediction using a real dataset containing Electronic Health Records of about one million patients. Our results suggest that strong and provable patient-level privacy can be enforced at the expense of only a moderate loss of prediction accuracy.","authors":"Raouf Kerkouche (Privatics team, Univ. Grenoble Alpes, Inria, 38000 Grenoble, France) | Gergely \u00c1cs (Crysys Lab, BME-HIT) | Claude Castelluccia (Privatics team, Univ. Grenoble Alpes, Inria, 38000 Grenoble, France) | Pierre Genev\u00e8s (Tyrex team Univ. 
Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG 38000 Grenoble, France)","doi_link":"https://doi.org/10.1145/3450439.3451859","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954724","title":"Privacy-Preserving and Bandwidth-Efficient Federated Learning: An Application to In-Hospital Mortality Prediction"},{"UID":"21P04","abstract":"Recurrent Neural Networks (RNNs) are often used for sequential modeling of adverse outcomes in electronic health records (EHRs) due to their ability to encode past clinical states. These deep, recurrent architectures have displayed increased performance compared to other modeling approaches in a number of tasks, fueling the interest in deploying deep models in clinical settings. One of the key elements in ensuring safe model deployment and building user trust is model explainability. Testing with Concept Activation Vectors (TCAV) has recently been introduced as a way of providing human-understandable explanations by comparing high-level concepts to the network's gradients. While the technique has shown promising results in real-world imaging applications, it has not been applied to structured temporal inputs. To enable an application of TCAV to sequential predictions in the EHR, we propose an extension of the method to time series data. 
We evaluate the proposed approach on an open EHR benchmark from the intensive care unit, as well as synthetic data where we are able to better isolate individual effects.","authors":"Diana Mincu (Google Research) | Eric Loreaux (Google Health) | Shaobo Hou (DeepMind) | Sebastien Baur, Ivan Protsyuk, and Martin G Seneviratne (Google Health) | Anne Mottram and Nenad Tomasev (DeepMind) | Alan Karthikesalingam (Google Health) | Jessica Schrouff (Google Research)","doi_link":"https://doi.org/10.1145/3450439.3451858","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954725","title":"Concept-based Model Explanations for Electronic Health Records"},{"UID":"21P05","abstract":"Sharing data is critical to generate large data sets required for the training of machine learning models. Trustworthy machine learning requires incentives, guarantees of data quality, and information privacy. Applying recent advancements in data valuation methods for machine learning can help to enable these. In this work, we analyze the suitability of three different data valuation methods for medical image classification tasks, specifically pleural effusion, on an extensive data set of chest x-ray scans. Our results reveal that a heuristic for calculating the Shapley valuation scheme based on a k-nearest neighbor classifier can successfully value large quantities of data instances. We also demonstrate possible applications for incentivizing data sharing, the efficient detection of mislabeled data, and summarizing data sets to exclude private information. 
Thereby, this work contributes to developing modern data infrastructures for trustworthy machine learning in health care.","authors":"Konstantin D Pandl, Fabian Feiland, Scott Thiebes, and Ali Sunyaev (Karlsruhe Institute of Technology)","doi_link":"https://doi.org/10.1145/3450439.3451861","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954726","title":"Trustworthy Machine Learning for Health Care: Scalable Data Valuation with the Shapley Value"},{"UID":"21P06","abstract":"The pressure of ever-increasing patient demand and budget restrictions make hospital bed management a daily challenge for clinical staff. Most critical is the efficient allocation of resource-heavy Intensive Care Unit (ICU) beds to the patients who need life support. Central to solving this problem is knowing for how long the current set of ICU patients are likely to stay in the unit. In this work, we propose a new deep learning model based on the combination of temporal convolution and pointwise (1x1) convolution, to solve the length of stay prediction task on the eICU and MIMIC-IV critical care datasets. The model - which we refer to as Temporal Pointwise Convolution (TPC) - is specifically designed to mitigate common challenges with Electronic Health Records, such as skewness, irregular sampling and missing data. In doing so, we have achieved significant performance benefits of 18-68% (metric and dataset dependent) over the commonly used Long-Short Term Memory (LSTM) network, and the multi-head self-attention network known as the Transformer. 
By adding mortality prediction as a side-task, we can improve performance further still, resulting in a mean absolute deviation of 1.55 days (eICU) and 2.28 days (MIMIC-IV) on predicting remaining length of stay.","authors":"Emma Rocheteau and Pietro Li\u00f2 (University of Cambridge) | Stephanie Hyland (Microsoft Research)","doi_link":"https://doi.org/10.1145/3450439.3451860","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954727","title":"Temporal Pointwise Convolutional Networks for Length of Stay Prediction in the Intensive Care Unit"},{"UID":"21P07","abstract":"Wearable devices such as smartwatches are becoming increasingly popular tools for objectively monitoring physical activity in free-living conditions. To date, research has primarily focused on the purely supervised task of human activity recognition, demonstrating limited success in inferring high-level health outcomes from low-level signals. Here, we present a novel _self-supervised_ representation learning method using activity and heart rate (HR) signals without semantic labels. With a deep neural network, we set HR responses as the _supervisory signal_ for the activity data, leveraging their underlying physiological relationship. In addition, we propose a custom quantile loss function that accounts for the long-tailed HR distribution present in the general population. We evaluate our model in the largest free-living combined-sensing dataset (comprising >280k hours of wrist accelerometer & wearable ECG data). Our contributions are two-fold: i) the pre-training task creates a model that can accurately forecast HR based only on cheap activity sensors, and ii) we leverage the information captured through this task by proposing a simple method to aggregate the learnt latent representations (embeddings) from the window-level to user-level. 
Notably, we show that the embeddings can generalize in various downstream tasks through transfer learning with linear classifiers, capturing physiologically meaningful, personalized information. For instance, they can be used to predict variables associated with individuals' health, fitness and demographic characteristics (AUC >70), outperforming unsupervised autoencoders and common bio-markers. Overall, we propose the first multimodal self-supervised method for behavioral and physiological data with implications for large-scale health and lifestyle monitoring.
Code:https://github.com/sdimi/Step2heart","authors":"Dimitris Spathis, Ignacio Pozuelo, Soren Brage, Nicholas J. Wareham, and Cecilia Mascolo (University of Cambridge)","doi_link":"https://doi.org/10.1145/3450439.3451863","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954728","title":"Self-supervised Transfer Learning of Physiological Representations from Free-living Wearable Data"},{"UID":"21P08","abstract":"In several crucial applications, domain knowledge is encoded by a system of ordinary differential equations (ODE), often stemming from underlying physical and biological processes. A motivating example is intensive care unit patients: the dynamics of vital physiological functions, such as the cardiovascular system with its associated variables (heart rate, cardiac contractility and output and vascular resistance) can be approximately described by a known system of ODEs. Typically, some of the ODE variables are directly observed (heart rate and blood pressure for example) while some are unobserved (cardiac contractility, output and vascular resistance), and in addition many other variables are observed but not modeled by the ODE, for example body temperature. Importantly, the unobserved ODE variables are ``known-unknowns'': We know they exist and their functional dynamics, but cannot measure them directly, nor do we know the function tying them to all observed measurements. As is often the case in medicine, and specifically the cardiovascular system, estimating these known-unknowns is highly valuable and they serve as targets for therapeutic manipulations. Under this scenario we wish to learn the parameters of the ODE generating each observed time-series, and extrapolate the future of the ODE variables and the observations. We address this task with a variational autoencoder incorporating the known ODE function, called GOKU-net for Generative ODE modeling with Known Unknowns. 
We first validate our method on videos of single and double pendulums with unknown length or mass; we then apply it to a model of the cardiovascular system. We show that modeling the known-unknowns allows us to successfully discover clinically meaningful unobserved system parameters, leads to much better extrapolation, and enables learning using much smaller training sets.","authors":"Ori Linial and Neta Ravid (Technion) | Danny Eytan (Technion, Rambam) | Uri Shalit (Technion)","doi_link":"https://doi.org/10.1145/3450439.3451866","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954729","title":"Generative ODE Modeling with Known Unknowns"},{"UID":"21P09","abstract":"The impact of machine learning models on healthcare will depend on the degree of trust that healthcare professionals place in the predictions made by these models. In this paper, we present a method to provide people with clinical expertise with domain-relevant evidence about why a prediction should be trusted. We first design a probabilistic model that relates meaningful latent concepts to prediction targets and observed data. Inference of latent variables in this model corresponds to both making a prediction $\\textit{and}$ providing supporting evidence for that prediction. We present a two-step process to efficiently approximate inference: (i) estimating model parameters using variational learning, and (ii) approximating $\\textit{maximum a posteriori}$ estimation of latent variables in the model using a neural network trained with an objective derived from the probabilistic model. We demonstrate the method on the task of predicting mortality risk for cardiovascular patients. 
Specifically, using electrocardiogram and tabular data as input, we show that our approach provides appropriate domain-relevant supporting evidence for accurate predictions.","authors":"Aniruddh Raghu and John Guttag (Massachusetts Institute of Technology) | Katherine Young (Harvard Medical School) | Eugene Pomerantsev (Massachusetts General Hospital) | Adrian V. Dalca (Harvard Medical School & MIT) | Collin M. Stultz (Massachusetts Institute of Technology)","doi_link":"https://doi.org/10.1145/3450439.3451869","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954730","title":"Learning to Predict with Supporting Evidence: Applications to Clinical Risk Prediction"},{"UID":"21P10","abstract":"Automatic extraction of medical conditions from free-text radiology reports is critical for supervising computer vision models to interpret medical images. In this work, we show that radiologists labeling reports significantly disagree with radiologists labeling corresponding chest X-ray images, which reduces the quality of report labels as proxies for image labels. We develop and evaluate methods to produce labels from radiology reports that have better agreement with radiologists labeling images. Our best performing method, called VisualCheXbert, uses a biomedically-pretrained BERT model to directly map from a radiology report to the image labels, with a supervisory signal determined by a computer vision model trained to detect medical conditions from chest X-ray images. We find that VisualCheXbert outperforms an approach using an existing radiology report labeler by an average F1 score of 0.14 (95% CI 0.12, 0.17). 
We also find that VisualCheXbert better agrees with radiologists labeling chest X-ray images than do radiologists labeling the corresponding radiology reports by an average F1 score across several medical conditions of between 0.12 (95% CI 0.09, 0.15) and 0.21 (95% CI 0.18, 0.24).","authors":"Saahil Jain and Akshay Smit (Stanford University) | Steven QH Truong, Chanh DT Nguyen, and Minh-Thanh Huynh (VinBrain) | Mudit Jain (unaffiliated) | Victoria A. Young, Andrew Y. Ng, Matthew P. Lungren, and Pranav Rajpurkar (Stanford University)","doi_link":"https://doi.org/10.1145/3450439.3451862","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954731","title":"VisualCheXbert: Addressing the Discrepancy Between Radiology Report Labels and Image Labels"},{"UID":"21P11","abstract":"Deep learning methods for chest X-ray interpretation typically rely on pretrained models developed for ImageNet. This paradigm assumes that better ImageNet architectures perform better on chest X-ray tasks and that ImageNet-pretrained weights provide a performance boost over random initialization. In this work, we compare the transfer performance and parameter efficiency of 16 popular convolutional architectures on a large chest X-ray dataset (CheXpert) to investigate these assumptions. First, we find no relationship between ImageNet performance and CheXpert performance for both models without pretraining and models with pretraining. Second, we find that, for models without pretraining, the choice of model family influences performance more than size within a family for medical imaging tasks. Third, we observe that ImageNet pretraining yields a statistically significant boost in performance across architectures, with a higher boost for smaller architectures. 
Fourth, we examine whether ImageNet architectures are unnecessarily large for CheXpert by truncating final blocks from pretrained models, and find that we can make models 3.25x more parameter-efficient on average without a statistically significant drop in performance. Our work contributes new experimental evidence about the relation of ImageNet to chest x-ray interpretation performance.","authors":"Alexander Ke, William Ellsworth, Oishi Banerjee, Andrew Y. Ng, and Pranav Rajpurkar (Stanford University)","doi_link":"https://doi.org/10.1145/3450439.3451867","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954732","title":"CheXtransfer: Performance and Parameter Efficiency of ImageNet Models for Chest X-Ray Interpretation"},{"UID":"21P12","abstract":"Recent advances in training deep learning models have demonstrated the potential to provide accurate chest X-ray interpretation and increase access to radiology expertise. However, poor generalization due to data distribution shifts in clinical settings is a key barrier to implementation. In this study, we measured the diagnostic performance for 8 different chest X-ray models when applied to (1) smartphone photos of chest X-rays and (2) external datasets without any finetuning. All models were developed by different groups and submitted to the CheXpert challenge, and re-applied to test datasets without further tuning. We found that (1) on photos of chest X-rays, all 8 models experienced a statistically significant drop in task performance, but only 3 performed significantly worse than radiologists on average, and (2) on the external set, none of the models performed statistically significantly worse than radiologists, and five models performed statistically significantly better than radiologists. Our results demonstrate that some chest X-ray models, under clinically relevant distribution shifts, were comparable to radiologists while other models were not. 
Future work should investigate aspects of model training procedures and dataset collection that influence generalization in the presence of data distribution shifts.","authors":"Pranav Rajpurkar, Anirudh Joshi, Anuj Pareek, Andrew Y. Ng, and Matthew P. Lungren (Stanford University)","doi_link":"https://doi.org/10.1145/3450439.3451876","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954733","title":"CheXternal: Generalization of Deep Learning Models for Chest X-ray Interpretation to Photos of Chest X-rays and External Clinical Settings"},{"UID":"21P13","abstract":"Balanced representation learning methods have been applied successfully to counterfactual inference from observational data. However, approaches that account for survival outcomes are relatively limited. Survival data are frequently encountered across diverse medical applications, \\textit{i.e.}, drug development, risk profiling, and clinical trials, and such data are also relevant in fields like manufacturing (\\textit{e.g.}, for equipment monitoring). When the outcome of interest is a time-to-event, special precautions for handling censored events need to be taken, as ignoring censored outcomes may lead to biased estimates. We propose a theoretically grounded unified framework for counterfactual inference applicable to survival outcomes. Further, we formulate a nonparametric hazard ratio metric for evaluating average and individualized treatment effects. 
Experimental results on real-world and semi-synthetic datasets, the latter of which we introduce, demonstrate that the proposed approach significantly outperforms competitive alternatives in both survival-outcome prediction and treatment-effect estimation.","authors":"Paidamoyo Chapfuwa, Serge Assaad, Shuxi Zeng, Michael Pencina, Lawrence Carin, and Ricardo Henao (Duke University)","doi_link":"https://doi.org/10.1145/3450439.3451875","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954734","title":"Enabling Counterfactual Survival Analysis with Balanced Representations"},{"UID":"21P14","abstract":"Generating a novel and optimized molecule with desired chemical properties is an essential part of the drug discovery process. Failure to meet one of the required properties can frequently lead to failure in a clinical test which is costly. In addition, optimizing these multiple properties is a challenging task because the optimization of one property is prone to changing other properties. In this paper, we pose this multi-property optimization problem as a sequence translation process and propose a new optimized molecule generator model based on the Transformer with two constraint networks: property prediction and similarity prediction. We further improve the model by incorporating score predictions from these constraint networks in a modified beam search algorithm. The experiments demonstrate that our proposed model outperforms state-of-the-art models by a significant margin for optimizing multiple properties simultaneously.","authors":"Bonggun Shin and Sungsoo Park (Deargen Inc.) | JinYeong Bak (SungKyunKwan University) | Joyce C. 
Ho (Emory University)","doi_link":"https://doi.org/10.1145/3450439.3451879","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954735","title":"Controlled Molecule Generator for Optimizing Multiple Chemical Properties"},{"UID":"21P15","abstract":"There are large individual differences in physiological processes, making designing personalized health sensing algorithms challenging. Existing machine learning systems struggle to generalize well to unseen subjects or contexts and can often contain problematic biases. Video-based physiological measurement is not an exception. Therefore, learning personalized or customized models from a small number of unlabeled samples is very attractive as it would allow fast calibrations to improve generalization and help correct biases. In this paper, we present a novel meta-learning approach called MetaPhys for personalized video-based cardiac measurement. Our method uses only 18 seconds of video for customization and works effectively in both supervised and unsupervised settings. We evaluate our proposed approach on two benchmark datasets and demonstrate superior performance in cross-dataset evaluation with substantial reductions (42% to 44%) in errors compared with state-of-the-art approaches. We also demonstrate that our proposed method significantly helps reduce bias across skin types.","authors":"Xin Liu and Ziheng Jiang (University of Washington) | Josh Fromm (OctoML) | Xuhai Xu and Shwetak Patel (University of Washington) | Daniel McDuff (Microsoft Research)","doi_link":"https://doi.org/10.1145/3450439.3451870","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954736","title":"MetaPhys: Few-Shot Adaptation for Non-Contact Physiological Measurement"},{"UID":"21P16","abstract":"Machine learning algorithms in healthcare have the potential to continually learn from real-world data generated during healthcare delivery and adapt to dataset shifts. 
As such, regulatory bodies like the US FDA have begun discussions on how to autonomously approve modifications to algorithms. Current proposals evaluate algorithmic modifications via hypothesis testing. However, these methods are only able to define and control the online error rate if the data is stationary over time, which is unlikely to hold in practice. In this manuscript, we investigate designing approval policies for modifications to ML algorithms in the presence of distributional shifts. Our key observation is that the approval policy that is most efficient at identifying and approving beneficial modifications varies across different problem settings. So rather than selecting a fixed approval policy a priori, we propose learning the best approval policy by searching over a family of approval strategies. We define a family of strategies that range in their level of optimism when approving modifications. This family includes the pessimistic strategy that, in fact, rescinds approval, which is necessary when no version of the ML algorithm performs well. We use the exponentially weighted averaging forecaster (EWAF) to learn the most appropriate strategy and derive tighter regret bounds assuming the distributional shifts are bounded. In simulation studies and empirical analyses, we find that wrapping approval strategies within the EWAF algorithm is a simple yet effective strategy that can help protect against distributional shifts without significantly slowing down approval of beneficial modifications.","authors":"Jean Feng (University of California, San Francisco)","doi_link":"https://doi.org/10.1145/3450439.3451864","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954737","title":"Learning to Safely Approve Updates to Machine Learning Algorithms"},{"UID":"21P17","abstract":"The black-box nature of deep networks makes the explanation for \"why\" they make certain predictions extremely challenging. 
Saliency maps are one of the most widely-used local explanation tools to alleviate this problem. One of the primary approaches for generating saliency maps is by optimizing a mask over the input dimensions so that the output of the network is influenced the most by the masking. However, prior work only studies such influence by removing evidence from the input. In this paper, we present iGOS++, a framework to generate saliency maps that are optimized for altering the output of the black-box system by either removing or preserving only a small fraction of the input. Additionally, we propose to add a bilateral total variation term to the optimization that improves the continuity of the saliency map especially under high resolution and with thin object parts. The evaluation results from comparing iGOS++ against state-of-the-art saliency map methods show significant improvement in locating salient regions that are directly interpretable by humans. We utilized iGOS++ in the task of classifying COVID-19 cases from x-ray images and discovered that sometimes the CNN network is overfitted to the characters printed on the x-ray images when performing classification. Fixing this issue by data cleansing significantly improved the precision and recall of the classifier.","authors":"Saeed Khorram, Tyler Lawson, and Fuxin Li (Oregon State University)","doi_link":"https://doi.org/10.1145/3450439.3451865","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954738","title":"iGOS++: Integrated Gradient Optimized Saliency by Bilateral Perturbations"},{"UID":"21P18","abstract":"Despite the large number of patients in Electronic Health Records (EHRs), the subset of usable data for modeling outcomes of specific phenotypes are often imbalanced and of modest size. This can be attributed to the uneven coverage of medical concepts in EHRs. 
We propose OMTL, an Ontology-driven Multi-Task Learning framework designed to overcome such data limitations. The key contribution of our work is the effective use of knowledge from a predefined, well-established medical relationship graph (ontology) to construct a novel deep learning network architecture that mirrors this ontology. This enables common representations to be shared across related phenotypes, which was found to improve learning performance. The proposed OMTL naturally allows for multi-task learning of different phenotypes on distinct predictive tasks. These phenotypes are tied together by their semantic relationship according to the external medical ontology. Using the publicly available MIMIC-III database, we evaluate OMTL and demonstrate its efficacy on several real patient outcome prediction tasks against state-of-the-art multi-task learning schemes. Across six experiments, the proposed approach improves the area under the ROC curve by 9% and the area under the precision-recall curve by 8%.","authors":"Mohamed Ghalwash, Zijun Yao, Prithwish Chakraborty, James Codella, and Daby Sow (IBM Research)","doi_link":"https://doi.org/10.1145/3450439.3451881","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954748","title":"Phenotypical Ontology Driven Framework for Multi-Task Learning"},{"UID":"21P19","abstract":"A single gene can encode different protein versions through a process called alternative splicing. Since proteins play major roles in cellular functions, aberrant splicing profiles can result in a variety of diseases, including cancers. Alternative splicing is determined by the gene's primary sequence and other regulatory factors such as RNA-binding protein levels. With these as input, we formulate the prediction of RNA splicing as a regression task and build a new training dataset (CAPD) to benchmark learned models. 
We propose the discrete compositional energy network (DCEN), which leverages the hierarchical relationships between splice sites, junctions, and transcripts to approach this task. In the case of alternative splicing prediction, DCEN models mRNA transcript probabilities through the energy values of their constituent splice junctions. These transcript probabilities are subsequently mapped to relative abundance values of key nucleotides and trained with ground-truth experimental measurements. Through our experiments on CAPD, we show that DCEN outperforms baselines and ablation variants.","authors":"Alvin Chan, Anna Korsakova, Yew-Soon Ong, Fernaldo Richtia Winnerdy, Kah Wai Lim, and Anh Tuan Phan (Nanyang Technological University)","doi_link":"https://doi.org/10.1145/3450439.3451857","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954739","title":"RNA Alternative Splicing Prediction with Discrete Compositional Energy Network"},{"UID":"21P20","abstract":"Colorectal cancer recurrence is a major clinical problem - around 30-40% of patients who are treated with curative-intent surgery will experience cancer relapse. Proactive prognostication is critical for early detection and treatment of recurrence. However, the common clinical approach to monitoring recurrence through testing for carcinoembryonic antigen (CEA) does not offer strong prognostic performance. In our paper, we study a series of machine and deep learning architectures that exploit heterogeneous healthcare data to predict colorectal cancer recurrence. In particular, we demonstrate three different approaches to extract and integrate features from multiple modalities including longitudinal as well as tabular clinical data. 
Our best model employs a hybrid architecture that takes in multi-modal inputs and comprises: 1) a Transformer model carefully modified to extract high-quality features from time-series data, and 2) a Multi-Layered Perceptron (MLP) that learns tabular data features, followed by feature integration and classification for prediction of recurrence. It achieves an AUROC score of 0.95, as well as precision, sensitivity, and specificity scores of 0.83, 0.80, and 0.96, respectively, surpassing the performance of all known published results based on CEA, as well as most commercially available diagnostic assays. Our results could lead to better post-operative management and follow-up of colorectal cancer patients.","authors":"Danliang Ho (National University of Singapore) | Iain Bee Huat Tan (National Cancer Center Singapore) | Mehul Motani (National University of Singapore)","doi_link":"https://doi.org/10.1145/3450439.3451868","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954740","title":"Predictive Models for Colorectal Cancer Recurrence Using Multi-modal Healthcare Data"},{"UID":"21P21","abstract":"Melanoma is the most common form of cancer in the world. Early diagnosis of the disease and an accurate estimation of its size and shape are crucial in preventing its spread to other body parts. Manual segmentation of these lesions by a radiologist, however, is time-consuming and error-prone. It is clinically desirable to have an automatic tool to detect malignant skin lesions from dermoscopic skin images. We propose a novel end-to-end convolutional neural network (CNN) for precise and robust skin lesion localization and segmentation. The proposed network has 3 sub-encoders branching out from the main encoder. The 3 sub-encoders are inspired by Coordinate Convolution, Hourglass, and Octave Convolutional blocks: each sub-encoder summarizes different patterns and yet collectively aims to achieve a precise segmentation. 
We trained our segmentation model just on the ISIC 2018 dataset. To demonstrate the generalizability of our model, we evaluated our model on the ISIC 2018 and unseen datasets including ISIC 2017 and PH². Our approach showed an average 5% improvement in performance over different datasets while having less than half the number of parameters when compared to other state-of-the-art segmentation models.","authors":"Shreshth Saini (Indian Institute of Technology Jodhpur) | Jeon Young Seok and Mengling Feng (Saw Swee Hock School of Public Health, National University Health System, National University of Singapore)","doi_link":"https://doi.org/10.1145/3450439.3451873","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954741","title":"B-SegNet: Branched-SegMentor Network For Skin Lesion Segmentation"},{"UID":"21P22","abstract":"In medicine, comorbidities refer to the presence of multiple, co-occurring diseases. Due to their co-occurring nature, the course of one comorbidity is often highly dependent on the course of the other disease and, hence, treatments can have significant spill-over effects. Despite the prevalence of comorbidities among patients, a comprehensive statistical framework for modeling the longitudinal dynamics of comorbidities is missing. In this paper, we propose a probabilistic model for analyzing comorbidity dynamics over time in patients. Specifically, we develop a coupled hidden Markov model with a personalized, non-homogeneous transition mechanism, named Comorbidity-HMM. The specification of our Comorbidity-HMM is informed by clinical research: (1) It accounts for different disease states (i.e., acute, stable) in the disease progression by introducing latent states that are of clinical meaning. (2) It models a coupling among the trajectories from comorbidities to capture co-evolution dynamics. (3) It considers between-patient heterogeneity (e.g., risk factors, treatments) in the transition mechanism. 
Based on our model, we define a spill-over effect that measures the indirect effect of treatments on patient trajectories through coupling (i.e., through comorbidity co-evolution). We evaluated our proposed Comorbidity-HMM on 675 health trajectories, investigating the joint progression of diabetes mellitus and chronic liver disease. Compared to alternative models without coupling, we find that our Comorbidity-HMM achieves a superior fit. Further, we quantify the spill-over effect, that is, to what extent diabetes treatments are associated with a change in chronic liver disease from an acute to a stable disease state. Thus, our model is of direct relevance for both treatment planning and clinical research in the context of comorbidities.","authors":"Basil Maag, Stefan Feuerriegel, and Mathias Kraus (ETH Zurich) | Maytal Saar-Tsechansky (University of Texas at Austin) | Thomas Zueger (1) Inselspital, Bern University Hospital, University of Bern; 2) ETH Zurich)","doi_link":"https://doi.org/10.1145/3450439.3451871","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954742","title":"Modeling Longitudinal Dynamics of Comorbidities"},{"UID":"21P23","abstract":"Generating interpretable visualizations of multivariate time series in the intensive care unit is of great practical importance. Clinicians seek to condense complex clinical observations into intuitively understandable critical illness patterns, like failures of different organ systems. They would greatly benefit from a low-dimensional representation in which the trajectories of the patients' pathology become apparent and relevant health features are highlighted. To this end, we propose to use the latent topological structure of Self-Organizing Maps (SOMs) to achieve an interpretable latent representation of ICU time series and combine it with recent advances in deep clustering. 
Specifically, we (a) present a novel way to fit SOMs with probabilistic cluster assignments (PSOM), (b) propose a new deep architecture for probabilistic clustering (DPSOM) using a VAE, and (c) extend our architecture to cluster and forecast clinical states in time series (T-DPSOM). We show that our model achieves superior clustering performance compared to state-of-the-art SOM-based clustering methods while maintaining the favorable visualization properties of SOMs. On the eICU data-set, we demonstrate that T-DPSOM provides interpretable visualizations of patient state trajectories and uncertainty estimation. We show that our method rediscovers well-known clinical patient characteristics, such as a dynamic variant of the Acute Physiology And Chronic Health Evaluation (APACHE) score. Moreover, we illustrate how it can disentangle individual organ dysfunctions on disjoint regions of the two-dimensional SOM map.","authors":"Laura Manduchi, Matthias H\u00fcser, Martin Faltys, Julia Vogt, Gunnar R\u00e4tsch, and Vincent Fortuin (ETH Z\u00fcrich)","doi_link":"https://doi.org/10.1145/3450439.3451872","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954743","title":"T-DPSOM - An Interpretable Clustering Method for Unsupervised Learning of Patient Health States"},{"UID":"21P24","abstract":"Wearable technology opens opportunities to reduce sedentary behavior; however, commercially available devices do not provide tailored coaching strategies. Just-In-Time Adaptive Interventions (JITAI) provide such a framework; however most JITAI are conceptual to date. We conduct a study to evaluate just-in-time nudges in free-living conditions in terms of receptiveness and nudge impact. We first quantify baseline behavioral patterns in context using features such as location and step count, and assess differences in individual responses. 
We show there is a strong inverse relationship between average daily step counts and time spent being sedentary, indicating that steps are steadily taken throughout the day, rather than in large bursts. Interestingly, the effect of nudges delivered at the workplace is larger in terms of step count than that of nudges delivered at home. We develop Random Forest models to learn nudge receptiveness using both individualized and contextualized data. We show that step count is the least important predictor of nudge receptiveness, while location is the most important. Furthermore, we compare the developed models with a commercially available smart coach using post-hoc analysis. The results show that using the contextualized and individualized information significantly outperforms non-JITAI approaches to determine nudge receptiveness.","authors":"Matthew Saponaro, Ajith Vemuri, Greg Dominick, and Keith Decker (University of Delaware)","doi_link":"https://doi.org/10.1145/3450439.3451874","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954744","title":"Contextualization and Individualization for Just-in-Time Adaptive Interventions to Reduce Sedentary Behavior"},{"UID":"21P25","abstract":"Pre-training (PT) has been used successfully in many areas of machine learning. One area where PT could be extremely impactful is electronic health record (EHR) data. Successful PT strategies on this modality could improve model performance in data-scarce contexts such as modeling for rare diseases or allowing smaller hospitals to benefit from data from larger health systems. While many PT strategies have been explored in other domains, much less exploration has occurred for EHR data. One reason for this may be the lack of standardized benchmarks suitable for developing and testing PT algorithms. 
In this work, we establish a PT benchmark dataset for EHR timeseries data, defining cohorts, a diverse set of fine-tuning tasks, and PT-focused evaluation regimes across two public EHR datasets: MIMIC-III and eICU. This benchmark fills an essential gap in the field by enabling robust iteration on PT strategies for this modality. To show the value of this benchmark and provide baselines for further research, we also profile two simple PT algorithms: a self-supervised, masked imputation system and a weakly-supervised, multi-task system. We find that PT strategies (in particular weakly-supervised PT methods) can offer significant gains over traditional learning in few-shot settings, especially on tasks with strong class imbalance. Our full benchmark and code are publicly available at https://github.com/mmcdermott/comprehensive_MTL_EHR.","authors":"Matthew McDermott (Massachusetts Institute of Technology) | Bret Nestor (University of Toronto) | Evan Kim (Massachusetts Institute of Technology) | Wancong Zhang (New York University) | Anna Goldenberg (Hospital for Sick Children, University of Toronto, Vector Institute) | Peter Szolovits (MIT) | Marzyeh Ghassemi (University of Toronto | Vector Institute for Artificial Intelligence)","doi_link":"https://doi.org/10.1145/3450439.3451877","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954745","title":"A Comprehensive EHR Timeseries Pre-training Benchmark"},{"UID":"21P26","abstract":"Clinical machine learning models have been found to significantly degrade in performance in hospitals or regions not seen during training. Recent developments in domain generalization offer a promising solution to this problem by creating models that learn invariances which hold across environments. In this work, we benchmark the performance of eight domain generalization methods on clinical time series and medical imaging data. 
We introduce a framework to induce practical confounding and sampling bias to stress-test these methods over existing non-healthcare benchmarks. We find, consistent with prior work, that current domain generalization methods do not achieve significant gains in out-of-distribution performance over empirical risk minimization on real-world medical imaging data. However, we do find a subset of realistic confounding scenarios where significant performance gains are observed. We characterize these scenarios in detail, and recommend best practices for domain generalization in the clinical setting.","authors":"Haoran Zhang (University of Toronto | Vector Institute) | Natalie Dullerud (University of Toronto, Vector Institute) | Laleh Seyyed-Kalantari (University of Toronto) | Quaid Morris (Memorial Sloan Kettering Cancer Center) | Shalmali Joshi (Harvard University) | Marzyeh Ghassemi (University of Toronto | Vector Institute for Artificial Intelligence)","doi_link":"https://doi.org/10.1145/3450439.3451878","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954746","title":"An Empirical Framework for Domain Generalization In Clinical Settings"},{"UID":"21P27","abstract":"Early detection of influenza-like symptoms can prevent widespread flu viruses and enable timely treatments, particularly in the post-pandemic era. Mobile sensing leverages an increasingly diverse set of embedded sensors to capture fine-grained information of human behaviors and ambient contexts and can serve as a promising solution for influenza-like symptom recognition. Traditionally, handcrafted and high level features of mobile sensing data are extracted by using handcrafted feature engineering and Convolutional/Recurrent Neural Network respectively. 
However, in this work, we use graph representations to encode the dynamics of state transitions and internal dependencies in human behaviors, apply graph embeddings to automatically extract topological and spatial features from the graph input, and propose an end-to-end Graph Neural Network model with multi-channel mobile sensing input for influenza-like symptom recognition based on people's daily mobility, social interactions, and physical activities. Using data generated from 448 participants, we show that Graph Neural Networks (GNNs) with GraphSAGE convolutional layers significantly outperform baseline models with handcrafted features. Furthermore, we use a GNN interpretability method to generate insights (important nodes, graph structure) for symptom recognition. To the best of our knowledge, this is the first work that applies graph representations and graph neural networks to mobile sensing data for graph-based human behavior modeling.","authors":"Guimin Dong, Lihua Cai, Debajyoti Datta, Shashwat Kumar, Laura E. Barnes, and Mehdi Boukhechba (University of Virginia)","doi_link":"https://doi.org/10.1145/3450439.3451880","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954747","title":"Influenza-like Symptom Recognition using Mobile Sensing and Graph Neural Networks"}],"speakers":[{"UID":"21K01","abstract":"To date, all available therapeutics have been designed by human experts, with no help from AI tools. This reliance on human knowledge and dependence on large-scale experimentation result in prohibitive development costs and high failure rates. Recent developments in machine learning algorithms for molecular modeling aim to transform this field. In my talk, I will present state-of-the-art approaches for property prediction and de-novo molecular generation, describing their use in drug design. 
In addition, I will highlight unsolved algorithmic questions in this field, including confidence estimation, pretraining, and deficiencies in learned molecular representations.","bio":"Regina Barzilay is a professor in the Department of Electrical Engineering and Computer Science and a member of the Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology. She is an AI faculty lead for the Jameel Clinic, the MIT center for machine learning in health. Her research interests are in natural language processing and applications of deep learning to chemistry and oncology. She is a recipient of various awards including the NSF CAREER Award, the MIT Technology Review TR-35 Award, a Microsoft Faculty Fellowship, and several Best Paper Awards at NAACL and ACL. In 2017, she received a MacArthur fellowship, an ACL fellowship, and an AAAI fellowship. In 2020, she was awarded the AAAI Squirrel AI Award for Artificial Intelligence for the Benefit of Humanity. She received her Ph.D. in Computer Science from Columbia University, and spent a year as a postdoc at Cornell University. She received her undergraduate degree from Ben-Gurion University of the Negev, Israel.","image":"static/images/speakers/r-barzilay.jpg","institution":"MIT Computer Science & Artificial Intelligence Lab","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954711","speaker":"Regina Barzilay","title":"AI for Drug Discovery: Challenges and Opportunities","website":"https://www.regina.csail.mit.edu"},{"UID":"21K02","abstract":"Improved healthcare delivery and patient outcomes are the ultimate goals of many AI applications in healthcare. However, relatively few machine learning models have been translated to clinical practice so far, and among those, even fewer have undergone a randomized controlled trial (RCT) to assess their impact. 
This talk will highlight aspects of the clinical translational process, beyond retrospective modeling, that impact design, development, validation, and regulation of machine learning models in healthcare. In particular, this talk focuses on our recent study of predicting favorable outcomes in hospitalized COVID-19 patients. The resulting model, which was deployed and prospectively validated at NYU Langone, underwent an RCT, and was eventually shared with other institutions. I will discuss challenges around integrating our model into the EHR system and their implications, the efficacy and safety results of our RCT, and practical insights about sharing models across clinics. We will end the talk by reviewing results of a survey of over 195 clinical users who interacted with this model, summarizing when and how the model was most helpful.","bio":"Narges Razavian is an assistant professor at NYU Langone Health, Center for Healthcare Innovation and Delivery Sciences, and Predictive Analytics Unit. Her lab focuses on various applications of machine learning and AI for medicine with a clinical translation outlook, working with medical images, clinical notes, and electronic health records. Before NYU Langone, she was a postdoc at the CILVR lab in the NYU Courant CS department. She received her PhD from the Computational Biology group at CMU.","image":"static/images/speakers/n-razavian.jpg","institution":"New York University Langone Medical Center","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954712","speaker":"Narges Razavian","title":"Machine Learning in Healthcare: From Modeling to Clinical Impact","website":"http://razavian.net/"},{"UID":"21K03","abstract":"In early March 2020, Mark joined an interdisciplinary team to launch the Pandemic Response Network. 
Over the subsequent months, he helped build and launch programs to support health workers, university students and staff, small businesses, K-12 public schools, and historically marginalized communities through the COVID-19 pandemic. With a strong background in design and implementation of high-tech health innovations, Mark worked alongside public health practitioners and community leaders to repeatedly execute the last mile implementation of critical COVID-19 programs, including symptom monitoring in the workplace, rapid antigen testing in schools, and pop-up vaccination events in churches. The portfolio of programs rapidly shifted health care capabilities and expertise out of hospitals and clinics into community settings that were poorly supported by existing public health infrastructure. The experience forced Mark and his team to approach technology design with a new set of assumptions and led to the development of completely novel data streams and technology systems. In his talk, Mark distills insights and learnings from the front lines of the COVID-19 response and highlights important implications and opportunities for the field of machine learning and artificial intelligence in health care.","bio":"Mark Sendak, MD, MPP is the Population Health & Data Science Lead at the Duke Institute for Health Innovation (DIHI), where he leads interdisciplinary teams of data scientists, clinicians, and machine learning experts to build technologies that solve real clinical problems. He has built tools to support Duke Health's Accountable Care Organization, COVID-19 Pandemic Response Network, and hospital network. Together with his team, he has integrated dozens of data-driven technologies into clinical operations and is a co-inventor of software to scale machine learning applications. He leads the DIHI Clinical Research & Innovation scholarship, which equips medical students with the business and data science skills required to lead health care innovation efforts. 
His work has been published in technical venues such as the Machine Learning for Healthcare Proceedings and Fairness, Accountability, and Transparency in Machine Learning Proceedings and clinical journals such as PLOS Medicine, Nature Medicine, and JAMA Open. He has served as an expert advisor to the American Medical Association, AARP, and National Academies of Medicine on matters related to machine learning, innovation, and policy. He obtained his MD and Master of Public Policy at Duke University as a Dean's Tuition Scholar and his Bachelor of Science in Mathematics from UCLA.","image":"static/images/speakers/m-sendak.jpg","institution":"Duke Institute for Health Innovation","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954713","speaker":"Mark Sendak","title":"Holding a Hammer When There are no Nails - Rapid Iteration to Build COVID-19 Support Programs for Historically Marginalized Communities","website":"https://dihi.org/team-member/mark-sendak-md/"},{"UID":"21K04","abstract":"Machine learning in healthcare could have transformative impact for patients, caregivers, and health systems, but the potential benefits remain challenging to realise at scale. Along the path from the development of a model to the realisation of clinical and health-economic impact are a number of challenges and learnings that might be transferable across a range of applications. This talk surveys some recent progress at Google Health and shares learnings from their team in moving from early research to product development; from product development to deployment; and from deployment to early measures of clinical impact.","bio":"Dr. Alan Karthikesalingam is a surgeon-scientist who leads the healthcare machine learning research group at Google Health in London (and formerly for healthcare at DeepMind). 
He led DeepMind and Google\u2019s teams in four landmark studies in Nature and Nature Medicine, focusing on AI for breast cancer screening with Cancer Research UK, AI for the recognition and prediction of blinding eye diseases with the world\u2019s largest eye hospital (Moorfields), and medical records research with the US Department of Veterans Affairs, developing AI early warning systems for common causes of patient deterioration, like acute kidney injury. 
He is leading work on how machine learning approaches can best promote AI safety as the team takes forward its early research into products for clinical care. Alan continues to practice clinically and supervise PhD students as a lecturer in the vascular surgery department of Imperial College, London.","image":"static/images/speakers/a-karthikesalingam.jpg","institution":"Google Health - London","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954714","speaker":" Alan Karthikesalingam","title":"Lessons on the Path from Code to Clinic - Some Common Myths in Machine Learning for Healthcare","website":""},{"UID":"21K05","abstract":"In medicine, the integration of artificial intelligence (AI) and machine learning (ML) tools could lead to a paradigm shift in which human-AI collaboration becomes integrated in medical decision-making. Despite many years of enthusiasm towards these technologies, the majority of tools fail once they are deployed in the real-world, often due to failures in workflow integration and interface design. In this talk, I will share research using methods in human-computer interaction (HCI) to design and evaluate machine learning tools for real-world clinical use. Results from this work suggest that trends in explainable AI may be inappropriate for clinical environments. I will discuss paths towards designing these tools for real-world medical systems, and describe how we are using collaborations across medicine, data science, and HCI to create machine learning tools for complex medical decisions.","bio":"Dr. Maia Jacobs is an assistant professor at Northwestern University in Computer Science and Preventive Medicine. Her research contributes to the fields of Computer Science, Human-Computer Interaction (HCI), and Health Informatics through the design and evaluation of novel computing approaches that provide individuals with timely, relevant, and actionable health information. 
Recent projects include the design and deployment of mobile tools to increase health information access in rural communities, evaluating the influence of AI interface design on expert decision making, and co-designing intelligent decision support tools with clinicians. Her research has been funded by the National Science Foundation, the National Cancer Institute, and the Harvard Data Science Institute and has resulted in the deployment of tools currently being used by healthcare systems and patients around the country. She completed her PhD in Human Centered Computing at Georgia Institute of Technology and was a postdoctoral fellow in the Center for Research on Computation and Society at Harvard University. Jacobs\u2019 work was awarded the iSchools Doctoral Dissertation Award, the Georgia Institute of Technology College of Computing Dissertation Award, and was recognized in the 2016 report to the President of the United States from the President's Cancer Panel, which focused on improving cancer-related outcomes.","image":"static/images/speakers/m-jacobs.png","institution":"Northwestern University","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954715","speaker":"Maia Jacobs","title":"Bringing AI to the Bedside with User Centered Design","website":"http://maiajacobs.com/"},{"UID":"21K06","abstract":"The wide adoption of electronic health records (EHR) systems has led to the availability of large clinical datasets available for precision medicine research. EHR data, linked with bio-repository, is a valuable new source for deriving real-word, data-driven prediction models of disease risk and treatment response. Yet, they also bring analytical difficulties. Precise information on clinical outcomes is not readily available and requires labor intensive manual chart review. Synthesizing information across healthcare systems is also challenging due to heterogeneity and privacy. 
In this talk, I\u2019ll discuss analytical approaches for mining EHR data with a focus on denoising, scalability, and transportability. These methods will be illustrated using EHR data from multiple healthcare centers.","bio":"Dr. Tianxi Cai is the John Rock Professor of Population and Translational Data Science jointly appointed in the Department of Biostatistics at the Harvard T.H. Chan School of Public Health (HSPH) and the Department of Biomedical Informatics (DBMI), Harvard Medical School, where she directs the Translational Data Science Center for a learning healthcare system. Her recent research has focused on developing interpretable and robust statistical and machine learning methods for deriving precision medicine strategies and, more broadly, for mining large-scale biomedical data including electronic health records data.","image":"static/images/speakers/t-cai.jpg","institution":"Harvard Medical School","slideslive_active_date":"2021-04-14T23:59:00.00","slideslive_id":"38954716","speaker":"Tianxi Cai","title":"Precision Medicine with Imprecise EHR Data","website":"https://celehs.hms.harvard.edu/tcai/"}],"tutorials":[{"UID":"21T01","abstract":"Causal inference is an important topic in healthcare because a causal relationship between an exposure and a health outcome may suggest an intervention to improve the health outcome. In this tutorial, we provide an introduction to the field of causal inference. We will cover several fundamental topics in causal inference, including the potential outcome framework, structural equation modeling, propensity score modeling, and instrumental variable analysis. Methods will be illustrated using real clinical examples.","authors":"Linbo Wang","bio":"\n Linbo Wang is an assistant professor in the Department of Statistical Sciences, University of Toronto. He is also an Affiliate Assistant Professor in the Department of Statistics, University of Washington, and a faculty affiliate at Vector Institute.
His research interest is centered around causality and its interaction with statistics and machine learning. Prior to these roles, he was a postdoc at Harvard T.H. Chan School of Public Health. He obtained his Ph.D. from the University of Washington.","rocketchat_id":"","slideslive_active_date":"2021-03-28T23:59:00.00","slideslive_id":"38954717","title":"Causal Inference in Clinical Research: From Theory to Practice"},{"UID":"21T02","abstract":"Mobile health (mHealth) technologies are providing new promising ways to deliver interventions in both clinical and non-clinical settings. Wearable sensors and smartphones collect real-time data streams that provide information about an individual\u2019s current health including both internal (e.g., mood, blood sugar level) and external (e.g., social, location) contexts. Both wearables and smartphones can be used to deliver interventions. mHealth interventions are in current use across a vast number of health-related fields including medication adherence, physical activity, weight loss, mental illness and addictions. This tutorial discusses the micro-randomized trial (MRT), an experimental trial design for use in optimizing real time delivery of sequences of treatment, with an emphasis on mHealth. We introduce the MRT design using HeartSteps, a physical activity study, as an example. We define the causal excursion effect and discuss reasons why this effect is often considered the primary causal effect of interest in MRT analysis. We introduce statistical methods for primary and secondary analyses for MRT with continuous binary outcomes. We discuss the sample size considerations for designing MRTs.","authors":"Tianchen Qian","bio":"\n Tianchen Qian is an Assistant Professor in the Department of Statistics at University of California, Irvine. He completed his PhD at the Johns Hopkins University and was a postdoctoral fellow at Harvard University. 
His research is focused on the experimental design and statistical analysis methods for developing mobile health interventions. In particular, he has developed causal inference methods for analyzing micro-randomized trial data and sample size calculation approaches for designing micro-randomized trials.\n
Tianchen Qian, Ph.D., Assistant Professor, Department of Statistics, Donald Bren School of Information and Computer Sciences, UC Irvine | Email: t.qian@uci.edu | Website: https://sites.google.com/view/tianchen-qian","rocketchat_id":"","slideslive_active_date":"2021-03-28T23:59:00.00","slideslive_id":"38954718","title":"Experimental Design and Causal Inference Methods For Micro-Randomized Trials: A Framework for Developing Mobile Health Interventions"},{"UID":"21T03","abstract":"Offline reinforcement learning (offline RL), a.k.a. batch-mode reinforcement learning, involves learning a policy from potentially suboptimal data. In contrast to imitation learning, offline RL does not rely on expert demonstrations, but rather seeks to surpass the average performance of the agents that generated the data. Methodologies such as the gathering of new experience fall short in offline settings, requiring reassessment of fundamental learning paradigms. In this tutorial I aim to provide the necessary background and challenges of this exciting area of research, from off policy evaluation through bandits to deep reinforcement learning.","authors":"Guy Tennenholtz","bio":"\n Guy Tennenholtz is a fourth-year Ph.D. student at the Technion University, advised by Prof. Shie Mannor. His research interests lie in the field of reinforcement learning, and specifically, how offline data can be leveraged to build better agents. Problems of large action spaces, partial observability, confounding bias, and uncertainty are only some of the problems he is actively researching. 
In his spare time Guy also enjoys creating mobile games, with the vision of incorporating AI into both the game development process and gameplay.","rocketchat_id":"","slideslive_active_date":"2021-03-28T23:59:00.00","slideslive_id":"38954719","title":"Offline Reinforcement Learning"},{"UID":"21T04","abstract":"As machine learning black boxes are increasingly being deployed in domains such as healthcare and criminal justice, there is growing emphasis on building tools and techniques for explaining these black boxes in a post hoc manner. Such explanations are being leveraged by domain experts to diagnose systematic errors and underlying biases of black boxes. However, recent research has shed light on the vulnerabilities of popular post hoc explanation techniques. In this tutorial, I will provide a brief overview of post hoc explanation methods with special emphasis on feature attribution methods such as LIME and SHAP. I will then discuss recent research which demonstrates that these methods are brittle, unstable, and are vulnerable to a variety of adversarial attacks. Lastly, I will present two solutions to address some of the vulnerabilities of these methods \u2013 (i) a generic framework based on adversarial training that is designed to make post hoc explanations more stable and robust to shifts in the underlying data, and (ii) a Bayesian framework that captures the uncertainty associated with post hoc explanations and in turn allows us to generate reliable explanations which satisfy user specified levels of confidence. Overall, this tutorial will provide a bird\u2019s eye view of the state-of-the-art in the burgeoning field of explainable machine learning.","authors":"Hima Lakkaraju","bio":"\n Hima Lakkaraju is an Assistant Professor at Harvard University focusing on explainability, fairness, and robustness of machine learning models. 
She has also been working with various domain experts in criminal justice and healthcare to understand the real world implications of explainable and fair ML. Hima has recently been named one of the 35 innovators under 35 by MIT Tech Review, and has received best paper awards at SIAM International Conference on Data Mining (SDM) and INFORMS. She has given invited workshop talks at ICML, NeurIPS, AAAI, and CVPR, and her research has also been covered by various popular media outlets including the New York Times, MIT Tech Review, TIME, and Forbes. For more information, please visit: https://himalakkaraju.github.io","rocketchat_id":"","slideslive_active_date":"2021-03-28T23:59:00.00","slideslive_id":"38954720","title":"Explainable ML: Understanding the Limits and Pushing the Boundaries"},{"UID":"21T05","abstract":"Phenotyping is the process of identifying a patient\u2019s health state based on the information in their electronic health records. In this tutorial, we will discuss why phenotyping is a challenging problem from both a practical and methodological perspective. We will focus primarily on the the challenges in obtaining annotated phenotype information from patient records and present statistical learning methods that leverage unlabeled examples to improve model estimation and evaluation to reduce the annotation burden.","authors":"Jesse Gronsbell | Chuan Hong | Molei Liu | Clara-Lea Bonzel | Aaron Sonabend","bio":"\n Jesse Gronsbell is an Assistant Professor in the Department of Statistical Sciences at the University of Toronto. Prior to joining U of T, Jesse spent a couple of years as a data scientist in the Mental Health Research and Development Group at Alphabet's Verily Life Sciences. Her primary interest is in the development of statistical methods for modern digital data sources such as electronic health records and mobile health data.\n
Chuan Hong is an instructor in biomedical informatics from the Department of Biomedical Informatics (DBMI) at Harvard Medical School. She received her PhD in Biostatistics from the University of Texas Health Science Center at Houston. Her doctoral research focused on meta-analysis and DNA methylation detection. At DBMI, Chuan's research interests lie in developing statistical and computational methods for biomarker evaluation, predictive modeling, and precision medicine with biomedical data. In particular, she is interested in combining electronic medical records with biorepositories and relevant resources to improve phenotyping accuracy, detect novel biomarkers, and monitor disease progression in clinical research.\n
Molei Liu is a 4th year PhD candidate in the Biostatistics department at Harvard T.H. Chan School of Public Health. He received a Bachelor's degree in Statistics from Peking University. Molei has been working in areas including high dimensional statistics, distributed learning, semi-supervised learning, semi-parametric inference, and model-X inference. He has also been working on methods for phenome-wide association studies (PheWAS) using electronic health records data.\n
Clara-Lea Bonzel is a research assistant at the Department of Biomedical Informatics at Harvard Medical School. She is mainly interested in personalized medicine using phenomic and genomic data, and model selection and evaluation. Clara-Lea received her master's degree in Applied Mathematics and Financial Engineering from the Swiss Federal Institute of Technology (EPFL).\n
Aaron Sonabend is a PhD candidate in the Biostatistics department at Harvard T.H. Chan School of Public Health. He is primarily focused on developing robust reinforcement learning and natural language processing methods for contexts with sampling bias, partially observed rewards, or strong distribution shifts. He is interested in healthcare and biomedical applications, such as finding optimal sequential treatment regimes for complex diseases, and phenotyping using electronic health records. Aaron holds Bachelor's degrees in Applied Mathematics and Economics from the National Autonomous Technological Institute of Mexico (ITAM).","rocketchat_id":"","slideslive_active_date":"2021-03-28T23:59:00.00","slideslive_id":"38954721","title":"Semi-supervised Phenotyping with Electronic Health Records"}],"workshops":[{"UID":"21WS01","abstract":"In many real-world environments, the details of decision-making processes are not fully known, e.g., how oncologists decide on specific radiation therapy treatment plans for cancer patients, how clinicians decide on medication dosages for different patients, or how hypertension patients choose their diet to control their illness. While conventional machine learning and statistical methods can be used to better understand such processes, they often fail to provide meaningful insights into the unknown parameters when the problem's setting is heavily constrained. Similarly, conventional constrained inference models, such as inverse optimization, are not well equipped for data-driven problems. In this study, we develop a novel methodology (called MLIO) that combines machine learning and inverse optimization techniques to recover the utility functions of a black-box decision-making process. Our method can be applied to settings where different types of data are required to capture the problem. MLIO is specifically developed with data-intensive medical decision-making environments in mind. 
We evaluate our approach in the context of personalized diet recommendations for patients, building on a large dataset of historical daily food intakes of patients from NHANES. MLIO considers these prior dietary behaviors in addition to complementary data (e.g., demographics and preexisting conditions) to recover the underlying criteria that the patients had in mind when deciding on their food choices. Once the underlying criteria are known, an optimization model can be used to find personalized diet recommendations that adhere to patients' behavior while meeting all required dietary constraints.","authors":"Farzin Ahmadi, Tinglong Dai, and Kimia Ghobadi (Johns Hopkins University)","title":"Emulating Human Decision-Making Under Multiple Constraints"},{"UID":"21WS02","abstract":"Deep neural networks have increasingly been used as an auxiliary tool in healthcare applications, due to their ability to improve performance of several diagnosis tasks. However, these methods are not widely adopted in clinical settings due to the practical limitations in the reliability, generalizability, and interpretability of deep learning based systems. As a result, methods have been developed that impose additional constraints during network training to gain more control as well as improve interpretabilty, facilitating their acceptance in healthcare community. In this work, we investigate the benefit of using Orthogonal Spheres (OS) constraint for classification of COVID-19 cases from chest X-ray images. The OS constraint can be written as a simple orthonormality term which is used in conjunction with the standard cross-entropy loss during classification network training. Previous studies have demonstrated significant benefits in applying such constraints to deep learning models. 
Our findings corroborate these observations, indicating that the orthonormality loss function effectively produces improved semantic localization via GradCAM visualizations, enhanced classification performance, and reduced model calibration error. Our approach achieves an improvement in accuracy of 1.6% and 4.8% for two- and three-class classification, respectively; similar results are found for models with data augmentation applied. In addition to these findings, our work also presents a new application of the OS regularizer in healthcare, increasing the post-hoc interpretability and performance of deep learning models for COVID-19 classification to facilitate adoption of these methods in clinical settings. We also identify limitations of our strategy that can be explored in future research.","authors":"Ella Y. Wang (BASIS Chandler); Anirudh Som (SRI International); Ankita Shukla, Hongjun Choi, and Pavan Turaga (ASU)","title":"Interpretable COVID-19 Chest X-Ray Classification via Orthogonality Constraint"},{"UID":"21WS03","abstract":"Meta-analysis is a systematic approach for understanding a phenomenon by analyzing the results of many previously published experimental studies related to the same treatment and outcome measurement. It is an important tool for medical researchers and clinicians to derive reliable conclusions regarding the overall effect of treatments and interventions (e.g., drugs) on a certain outcome (e.g., the severity of a disease). Unfortunately, conventional meta-analysis involves great human effort, i.e., it is constructed by hand and is extremely time-consuming and labor-intensive, rendering a process that is inefficient in practice and vulnerable to human bias. To overcome these challenges, we work toward automating meta-analysis with a focus on controlling for the potential biases. 
Automating meta-analysis consists of two major steps: (1) extracting information from scientific publications written in natural language, which is different from, and noisier than, what humans typically extract when conducting a meta-analysis; and (2) modeling meta-analysis, from a novel causal-inference perspective, to control for the potential biases and summarize the treatment effect from the outputs of the first step. Since sufficient prior work exists for the first step, this study focuses on the second step. The core contribution of this work is a multiple causal inference algorithm tailored to the potentially noisy and biased information automatically extracted by current natural language processing systems. Empirical evaluations on both synthetic and semi-synthetic data show that the proposed approach for automated meta-analysis yields high-quality performance.","authors":"Lu Cheng (Arizona State University); Dmitriy Katz-Rogozhnikov, Kush R. Varshney, and Ioana Baldini (IBM Research)","title":"Automated Meta-Analysis in Medical Research: A Causal Learning Perspective"},{"UID":"21WS04","abstract":"Attention is a powerful concept in computer vision. End-to-end networks that learn to focus selectively on regions of an image or video often perform strongly. However, other image regions, while not necessarily containing the signal of interest, may contain useful context. We present an approach that exploits the idea that statistics of noise may be shared between the regions that contain the signal of interest and those that do not. Our technique uses the inverse of an attention mask to generate a noise estimate that is then used to denoise temporal observations. We apply this to the task of camera-based physiological measurement. A convolutional attention network is used to learn which regions of a video contain the physiological signal and generate a preliminary estimate. 
A noise estimate is obtained by using the pixel intensities in the inverse regions of the learned attention mask; this, in turn, is used to refine the estimate of the physiological signal. We perform experiments on two large benchmark datasets and show that this approach produces state-of-the-art results, increasing the signal-to-noise ratio by up to 5.8 dB, reducing heart rate and breathing rate estimation error by as much as 30%, recovering subtle waveform dynamics, and generalizing from RGB to NIR videos without retraining.","authors":"Ewa Nowara (RICE UNIVERSITY); Daniel McDuff (Microsoft Research); Ashok Veeraraghavan (RICE UNIVERSITY)","title":"The Benefit of Distraction: Denoising Remote Vitals Measurements using Inverse Attention"},{"UID":"21WS05","abstract":"Electronic health records (EHRs) provide an abundance of data for clinical outcomes modeling. The prevalence of EHR data has enabled a number of studies using a variety of machine learning algorithms to predict potential adverse events. However, these studies do not account for the heterogeneity present in EHR data, including various lengths of stay, various frequencies of vitals captured in invasive versus non-invasive fashion, and various repetitions (or lack thereof) of laboratory examinations. Therefore, studies limit the types of features extracted or the domain considered to provide a more homogeneous training set to machine learning models. The heterogeneity in this data represents important risk differences in each patient. In this work, we examine such data in an intensive care unit (ICU) setting, where the length of stay and the frequency of data gathered may vary significantly based upon the severity of the patient's condition. Therefore, it is unreasonable to use the same model for patients first entering the ICU versus those that have been there for above-average lengths of stay. 
Developing multiple individual models to account for different patient cohorts, different lengths of stay, and different sources for key vital sign data may be tedious and may not account well for rare cases. We address this challenge by developing a dynamic model, based upon meta-learning, to adapt to data heterogeneity and generate predictions of various outcomes across the different lengths of data. We compare this technique against a set of benchmarks on a publicly available ICU dataset (MIMIC-III) and demonstrate improved model performance by accounting for data heterogeneity.","authors":"Lida Zhang (Texas A&M University); Xiaohan Chen, Tianlong Chen, and Zhangyang Wang (University of Texas at Austin); Bobak J. Mortazavi (Texas A&M University)","title":"DynEHR: Dynamic Adaptation of Models with Data Heterogeneity in Electronic Health Records"},{"UID":"21WS06","abstract":"Machine Learning (ML) is widely used to automatically extract meaningful information from Electronic Health Records (EHR) to support operational, clinical, and financial decision making. However, ML models require a large number of annotated examples to provide satisfactory results, which is not possible in most healthcare scenarios due to the high cost of clinician-labeled data. Active Learning (AL) is a process of selecting the most informative instances to be labeled by an expert to further train a supervised algorithm. We demonstrate the effectiveness of AL in multi-label text classification in the clinical domain. In this context, we apply a set of well-known AL methods to help automatically assign ICD-9 codes on the MIMIC-III dataset. Our results show that the selection of informative instances provides satisfactory classification with a significantly reduced training set (8.3% of the total instances). 
We conclude that AL methods can significantly reduce the manual annotation cost while preserving model performance.","authors":"Martha Ferreira (Dalhousie University); Michal Malyska and Nicola Sahar (Semantic Health); Riccardo Miotto (Icahn School of Medicine at Mount Sinai); Fernando Paulovich (Dalhousie University); Evangelos Milios (Dalhousie University, Faculty of Computer Science)","title":"Active Learning for Medical Code Assignment"},{"UID":"21WS07","abstract":"Assessment of COVID-19 pandemic predictions indicates that differential equation-based epidemic spreading models are less than satisfactory in the contemporary world of intense human connectivity. Network-based simulations are more apt for studying the contagion dynamics due to their ability to model heterogeneity of human interactions. However, the quality of predictions in network-based models depends on how well the underlying wire-frame approximates the real social contact network of the population. In this paper, we propose a framework to create a modular wire-frame to mimic the social contact network of a geography by lacing it with demographic information. The proposed inter-connected network sports a small-world topology, accommodates density variations in the geography, and emulates human interactions in family, social, and work spaces. The resulting wire-frame is a generic and potent instrument for urban planners, demographers, economists, and social scientists to simulate different "what-if" scenarios and predict epidemic variables. The basic frame can be laden with any economic, social, or urban data that can potentially shape human connectance. 
We present a preliminary study of the impact of variations in contact patterns due to density and demography on the epidemic variables.","authors":"Kirti Jain (Department of Computer Science, University of Delhi, Delhi, India); Sharanjit Kaur (Acharya Narendra Dev College, University of Delhi, Delhi, India); Vasudha Bhatnagar (Department of Computer Science, University of Delhi, Delhi, India)","title":"Framing Social Contact Networks for Contagion Dynamics"},{"UID":"21WS08","abstract":"Shaping an epidemic with an adaptive contact restriction policy that balances the disease and socioeconomic impact has been the holy grail during the COVID-19 pandemic. Most of the existing work on epidemiological models focuses on scenario-based forecasting via simulation but techniques for explicit control of epidemics via an analytical framework are largely missing. In this paper, we consider the problem of determining the optimal control policy for transmission rate assuming SIR dynamics, which is the most widely used epidemiological paradigm. We first demonstrate that the SIR model with infectious patients and susceptible contacts (i.e., product of transmission rate and susceptible population) interpreted as predators and prey respectively reduces to a Lotka-Volterra (LV) predator-prey model. The modified SIR system (LVSIR) has a stable equilibrium point, an 'energy' conservation property, and exhibits bounded cyclic behaviour similar to an LV system. This mapping permits a theoretical analysis of the control problem supporting some of the recent simulation-based studies that point to the benefits of periodic interventions. We use a control-Lyapunov approach to design adaptive control policies (CoSIR) to nudge the SIR model to the desired equilibrium that permits ready extensions to richer compartmental models. 
We also describe a practical implementation of this transmission control method by approximating the ideal control with a finite but time-varying set of restriction levels. We provide experimental results comparing against periodic lockdowns in a few different geographical regions (India, Mexico, the Netherlands) to demonstrate the efficacy of this approach.","authors":"Harsh Maheshwari and Shreyas Shetty (Flipkart Internet Private Ltd.); Nayana Bannur (Wadhwani AI); Srujana Merugu (Independent)","title":"CoSIR: Managing an Epidemic via Optimal Adaptive Control of Transmission Rate Policy"},{"UID":"21WS09","abstract":"A major obstacle to the integration of deep learning models for chest x-ray interpretation into clinical settings is the lack of understanding of their failure modes. In this work, we first investigate whether there are clinical subgroups that chest x-ray models are likely to misclassify. We find that older patients and patients with a lung lesion or pneumothorax finding have a higher probability of being misclassified on some diseases. Second, we develop misclassification predictors on chest x-ray models using their outputs and clinical features. We find that our best performing misclassification identifier achieves an AUROC close to 0.9 for most diseases. Third, employing our misclassification identifiers, we develop a corrective algorithm to selectively flip model predictions that have high likelihood of misclassification at inference time. We observe F1 improvement on the prediction of Consolidation (0.008, 95%CI[0.005, 0.010]) and Edema (0.003, 95%CI[0.001, 0.006]). By carrying out our investigation on ten distinct and high-performing chest x-ray models, we are able to derive insights across model architectures and offer a generalizable framework applicable to other medical imaging tasks.","authors":"Emma Chen, Andy Kim, Rayan Krishnan, Andrew Y. 
Ng, and Pranav Rajpurkar (Stanford University)","title":"CheXbreak: Misclassification Identification for Deep Learning Models Interpreting Chest X-rays"},{"UID":"21WS10","abstract":"Contrastive learning is a form of self-supervision that can leverage unlabeled data to produce pretrained models. While contrastive learning has demonstrated promising results on natural image classification tasks, its application to medical imaging tasks like chest X-ray interpretation has been limited. In this work, we propose MoCo-CXR, which is an adaptation of the contrastive learning method Momentum Contrast (MoCo), to produce models with better representations and initializations for the detection of pathologies in chest X-rays. In detecting pleural effusion, we find that linear models trained on MoCo-CXR-pretrained representations outperform those without MoCo-CXR-pretrained representations, indicating that MoCo-CXR-pretrained representations are of higher-quality. End-to-end fine-tuning experiments reveal that a model initialized via MoCo-CXR-pretraining outperforms its non-MoCo-CXR-pretrained counterpart. We find that MoCo-CXR-pretraining provides the most benefit with limited labeled training data. Finally, we demonstrate similar results on a target Tuberculosis dataset unseen during pretraining, indicating that MoCo-CXR-pretraining endows models with representations and transferability that can be applied across chest X-ray datasets and tasks.","authors":"Hari Sowrirajan, Jingbo Yang, Andrew Ng, and Pranav Rajpurkar (Stanford University)","title":"MoCo-CXR: MoCo Pretraining Improves Representation and Transferability of Chest X-ray Models"},{"UID":"21WS11","abstract":"Inertial Measurement Unit (IMU) sensors are becoming increasingly ubiquitous in everyday devices such as smartphones, fitness watches, etc. 
As a result, the array of health-related applications that tap onto this data has been growing, as well as the importance of designing accurate prediction models for tasks such as human activity recognition (HAR). However, one important task that has received little attention is the prediction of an individual's heart rate when undergoing a physical activity using IMU data. This could be used, for example, to determine which activities are safe for a person without having him/her actually perform them. We propose a neural architecture for this task composed of convolutional and LSTM layers, similarly to the state-of-the-art techniques for the closely related task of HAR. However, our model includes a convolutional network that extracts, based on sensor data from a previously executed activity, a physical conditioning embedding (PCE) of the individual to be used as the LSTM's initial hidden state. We evaluate the proposed model, dubbed PCE-LSTM, when predicting the heart rate of 23 subjects performing a variety of physical activities from IMU-sensor data available in public datasets (PAMAP2, PPG-DaLiA). For comparison, we use as baselines the only model specifically proposed for this task, and an adapted state-of-the-art model for HAR. PCE-LSTM yields over 10% lower mean absolute error. We demonstrate empirically that this error reduction is in part due to the use of the PCE. 
Last, we use the two datasets (PPG-DaLiA, WESAD) to show that PCE-LSTM can also be successfully applied when photoplethysmography (PPG) sensors are available to rectify heart rate measurement errors caused by movement, outperforming the state-of-the-art deep learning baselines by more than 30%.","authors":"Davi Pedrosa de Aguiar, Ot\u00e1vio Augusto Silva, and Fabricio Murai (Universidade Federal de Minas Gerais)","title":"Encoding physical conditioning from inertial sensors for multi-step heart rate estimation"},{"UID":"21WS12","abstract":"The COVID-19 pandemic has been ravaging the world we know since its emergence. Computer-Aided Diagnosis (CAD) systems with high precision and reliability can play a vital role in the battle against COVID-19. Most of the existing works in the literature focus on developing sophisticated methods that yield high detection performance yet do not address the issue of predictive uncertainty. Uncertainty estimation has been explored heavily in the literature for Deep Neural Networks; however, little work has focused on this issue in COVID-19 detection. In this work, we explore the efficacy of state-of-the-art (SOTA) uncertainty estimation methods on COVID-19 detection. We propose to augment the best-performing method with a feature denoising algorithm to gain a higher Positive Predictive Value (PPV) on COVID-positive cases. Through extensive experimentation, we identify the most lightweight and easy-to-deploy uncertainty estimation framework that can effectively identify the confusing COVID-19 cases for expert analysis while performing comparably to the existing resource-heavy uncertainty estimation methods. 
In collaboration with medical professionals, we further validate the results to ensure the viability of the framework in clinical practice.","authors":"Krishanu Sarker (Georgia State University); Sharbani Pandit (Georgia Institute of Technology); Anupam Sarker (Institute of Epidemiology, Disease Control and Research); Saeid Belkasim and Shihao Ji (Georgia State University)","title":"Towards Reliable and Trustworthy Computer-Aided Diagnosis Predictions: Diagnosing COVID-19 from X-Ray Images"},{"UID":"21WS13","abstract":"We systematically evaluate the performance of deep learning models in the presence of diseases not labeled for or present during training. First, we evaluate whether deep learning models trained on a subset of diseases (seen diseases) can detect the presence of any one of a larger set of diseases. We find that models tend to falsely classify diseases outside of the subset (unseen diseases) as \"no disease\". Second, we evaluate whether models trained on seen diseases can detect seen diseases when co-occurring with diseases outside the subset (unseen diseases). We find that models are still able to detect seen diseases even when co-occurring with unseen diseases. Third, we evaluate whether feature representations learned by models may be used to detect the presence of unseen diseases given a small labeled set of unseen diseases. We find that the penultimate layer provides useful features for unseen disease detection. Our results can inform the safe clinical deployment of deep learning models trained on a non-exhaustive set of disease classes.","authors":"Siyu Shi (Department of Medicine, School of Medicine, Stanford University); Ishaan Malhi, Kevin Tran, Andrew Y. 
Ng, and Pranav Rajpurkar (Department of Computer Science, Stanford University)","title":"CheXseen: Unseen Disease Detection for Deep Learning Interpretation of Chest X-rays"},{"UID":"21WS14","abstract":"We explore the application of graph neural networks (GNNs) to the problem of estimating exposure to an infectious pathogen and probability of transmission. Specifically, given a dataset in which a subset of patients are known to be infected and information in the form of a graph about who has interacted with whom, we aim to directly estimate transmission dynamics, i.e., what types of interactions (e.g., length and number) lead to transmission events. While GNNs have proven capable of learning meaningful representations from graph data, they commonly assume tasks with high homophily (i.e., nodes that share an edge look similar). Recently, researchers have proposed techniques for addressing problems with low homophily (e.g., adding residual connections to GNNs). In our problem setting, homophily is high on average, since the majority of patients do not become infected; however, homophily is low with respect to the minority class. In this paper, we characterize this setting as particularly challenging for GNNs. Given the asymmetry in homophily between classes, we hypothesize that solutions designed to address low homophily on average will not suffice and instead propose a solution based on attention. Applied to both real-world and synthetic network data, we test this hypothesis and explore the ability of GNNs to learn complex transmission dynamics directly from network data. Overall, attention proves to be an effective mechanism for addressing low homophily in the minority class (AUROC with 95\\% CI: GCN 0.684 (0.659,0.710) vs. 
GAT 0.715 (0.688,0.742)) and such a data-driven approach can outperform approaches based on potentially flawed expert knowledge.","authors":"Jeeheh Oh (University of Michigan, Ann Arbor); Jenna Wiens (University of Michigan)","title":"A Data-Driven Approach to Estimating Infectious Disease Transmission from Graphs: A Case of Class Imbalance Driven Low Homophily"},{"UID":"21WS15","abstract":"Explainable artificial intelligence provides an opportunity to improve prediction accuracy over standard linear models using 'black box' machine learning (ML) models while still revealing insights into a complex outcome such as all-cause mortality. We propose the IMPACT (Interpretable Machine learning Prediction of All-Cause morTality) framework that implements and explains complex, non-linear ML models in epidemiological research, by combining a tree ensemble mortality prediction model and an explainability method. We use 133 variables from NHANES 1999-2014 datasets (number of samples: 47,261) to predict all-cause mortality. To explain our model, we extract local (i.e., per-sample) explanations to verify well-studied mortality risk factors, and make new discoveries. We present major factors for predicting 1-, 3-, and 5-year mortality across different age groups and their individualized impact on mortality prediction. Moreover, we highlight interactions between risk factors associated with mortality prediction, which leads to findings that linear models do not reveal. We demonstrate that compared with traditional linear models, tree-based models have unique strengths such as: (1) improving prediction power, (2) making no distribution assumptions, (3) capturing non-linear relationships and important thresholds, (4) identifying feature interactions, and (5) detecting different non-linear relationships between models. 
Given the popularity of complex ML models in prognostic research, combining these models with explainability methods has implications for further applications of ML in medical fields. To our knowledge, this is the first study that combines complex ML models and state-of-the-art feature attributions to explain mortality prediction, which enables us to achieve higher prediction accuracy and gain new insights into the effect of risk factors on mortality.","authors":"Wei Qiu, Hugh Chen, Ayse Berceste Dincer, and Su-In Lee (Paul G. Allen School of Computer Science and Engineering, University of Washington)","title":"Interpretable Machine Learning Prediction of All-cause Mortality"},{"UID":"21WS16","abstract":"Cardiogenic shock is a deadly and complicated illness. Despite extensive research into treating cardiogenic shock, mortality remains high and has not decreased over time. Patients suffering from cardiogenic shock are highly heterogeneous, and developing an understanding of phenotypes among these patients is crucial for understanding this disease and the appropriate treatments for individual patients. In this work, we develop a deep mixture of experts approach to jointly find phenotypes among patients with cardiogenic shock while simultaneously estimating their risk of in-hospital mortality. Although trained with information regarding treatment and outcomes, after training, the proposed model is decomposable into a network that clusters patients into phenotypes from information available prior to treatment. This model is validated on a synthetic dataset and then applied to a cohort of 28,304 patients with cardiogenic shock. The full model predicts in-hospital mortality on this cohort with an AUROC of 0.85 \u00b1 0.01. The model discovers five phenotypes among the population, finding statistically different mortality rates among them and among treatment choices within those groups. 
This approach allows for grouping patients in clinical clusters with different rates of device utilization and different risks of mortality. This approach is suitable for jointly finding phenotypes within a clinical population and for modeling risk among that population.","authors":"Nathan C. Hurley (Texas A&M University); Alyssa Berkowitz (Yale University); Frederick Masoudi (University of Colorado School of Medicine); Joseph Ross and Nihar Desai (Yale University); Nilay Shah (Mayo Clinic); Sanket Dhruva (UCSF School of Medicine); Bobak J. Mortazavi (Texas A&M University)","title":"Outcomes-Driven Clinical Phenotyping in Patients with Cardiogenic Shock for Risk Modeling and Comparative Treatment Effectiveness"},{"UID":"21WS17","abstract":"Severe infectious diseases such as the novel coronavirus (COVID-19) pose a huge threat to public health. Stringent control measures, such as school closures and stay-at-home orders, while having significant effects, also bring huge economic losses. In the face of an emerging infectious disease, a crucial question for policymakers is how to make the trade-off and implement the appropriate interventions in a timely manner, in the face of huge uncertainty. In this work, we propose a Multi-Objective Model-based Reinforcement Learning framework to facilitate data-driven decision making and minimize the long-term overall cost. Specifically, at each decision point, a Bayesian epidemiological model is first learned as the environment model, and then the proposed model-based multi-objective planning algorithm is applied to find a set of Pareto-optimal policies. This framework, combined with the prediction bands for each policy, provides a real-time decision support tool for policymakers. 
The application is demonstrated with the spread of COVID-19 in China.","authors":"Runzhe Wan, Xinyu Zhang, and Rui Song (North Carolina State University)","title":"Multi-Objective Model-based Reinforcement Learning for Infectious Disease Control"},{"UID":"21WS18","abstract":"With the growing amount of text in health data, there have been rapid advances in large pre-trained models that can be applied to a wide variety of biomedical tasks with minimal task-specific modifications. Emphasizing the cost of these models, which renders technical replication challenging, this paper summarizes experiments conducted in replicating BioBERT and further pre-training and careful fine-tuning in the biomedical domain. We also investigate the effectiveness of domain-specific and domain-agnostic pre-trained models across downstream biomedical NLP tasks. Our finding confirms that pre-trained models can be impactful in some downstream NLP tasks (QA and NER) in the biomedical domain; however, this improvement may not justify the high cost of domain-specific pre-training.","authors":"Paul Grouchi (Untether AI); Shobhit Jain (Manulife); Michael Liu (Tealbook); Kuhan Wang (CIBC); Max Tian (Adeptmind); Nidhi Arora (Intact); Hillary Ngai (University of Toronto); Faiza Khan Khattak (Manulife); Elham Dolatabadi and Sedef Akinli Kocak (Vector Institute)","title":"An Experimental Evaluation of Transformer-based Language Models in the Biomedical Domain"},{"UID":"21WS19","abstract":"Question Answering (QA) is a widely-used framework for developing and evaluating an intelligent machine. In this light, QA on Electronic Health Records (EHR), namely EHR QA, can work as a crucial milestone towards developing an intelligent agent in healthcare. EHR data are typically stored in a relational database, which can also be converted to a directed acyclic graph, allowing two approaches for EHR QA: Table-based QA and Knowledge Graph-based QA. 
We hypothesize that the graph-based approach is more suitable for EHR QA as graphs can represent relations between entities and values more naturally compared to tables, which essentially require JOIN operations. In this paper, we propose a graph-based EHR QA where natural language queries are converted to SPARQL instead of SQL. To validate our hypothesis, we create four EHR QA datasets (graph-based vs. table-based, and simplified vs. original database schema), based on a table-based dataset MIMICSQL. We test both a simple Seq2Seq model and a state-of-the-art EHR QA model on all datasets, where the graph-based datasets facilitated up to 34% higher accuracy than the table-based dataset without any modification to the model architectures. Finally, all datasets will be open-sourced to encourage further EHR QA research in both directions.","authors":"Junwoo Park and Youngwoo Cho (Korea Advanced Institute of Science and Technology (KAIST)); Haneol Lee (Yonsei University); Jaegul Choo and Edward Choi (Korea Advanced Institute of Science and Technology (KAIST))","title":"Knowledge Graph-based Question Answering with Electronic Health Records"},{"UID":"21WS20","abstract":"There is an increased adoption of electronic health record (EHR) systems by a variety of hospitals and medical centers. This provides an opportunity to leverage automated computer systems in assisting healthcare workers. One of the least utilized yet richest sources of patient information is the unstructured clinical text. In this work, we develop CATAN, a chart-aware temporal attention network for learning patient representations from clinical notes. We introduce a novel representation where each note is considered a single unit, like a sentence, and composed of attention-weighted words. The notes in turn are aggregated into a patient representation using a second weighting unit, note attention. 
Unlike standard attention computations which focus only on the content of the note, we incorporate the chart-time for each note as a constraint for attention calculation. This allows our model to focus on notes closer to the prediction time. Using the MIMIC-III dataset, we empirically show that our patient representation and attention calculation achieves the best performance in comparison with various state-of-the-art baselines for one-year mortality prediction and 30-day hospital readmission. Moreover, the attention weights can be used to offer transparency into our model's predictions.","authors":" Zelalem Gero and Joyce Ho (Emory University)","title":"CATAN: Chart-aware temporal attention network for clinical text classification"},{"UID":"21WS21","abstract":"Survival analysis is a challenging variation of regression modeling because of the presence of censoring, where the outcome measurement is only partially known, due to, for example, loss to follow up. Such problems come up frequently in medical applications, making survival analysis a key endeavor in biostatistics and machine learning for healthcare, with Cox regression models being amongst the most commonly employed models. We describe a new approach for survival analysis regression models, based on learning mixtures of Cox regressions to model individual survival distributions. We propose an approximation to the Expectation Maximization algorithm for this model that does hard assignments to mixture groups to make optimization efficient. In each group assignment, we fit the hazard ratios within each group using deep neural networks, and the baseline hazard for each mixture component non-parametrically. We perform experiments on multiple real world datasets, and look at the mortality rates of patients across ethnicity and gender. 
We emphasize the importance of calibration in healthcare settings and demonstrate that our approach outperforms classical and modern survival analysis baselines, both in terms of discriminative performance and calibration, with large gains in performance on the minority demographics.","authors":"Chirag Nagpal (Carnegie Mellon University); Steve Yadlowsky; Negar Rostamzadeh; and Katherine Heller (Google Brain)","title":"Deep Cox Mixtures for Survival Regression"}]},"2022":{"highlights":"
Day 1

Day 2

\n\n\n### Governing Board\n###### **General Chairs**\n- Dr. Tristan Naumann of Microsoft Research\n- Dr. Joyce Ho of Emory University\n###### **Program Chairs**\n- Dr. Sherri Rose of Stanford University\n- Matthew McDermott of MIT\n###### **Proceedings Chairs**\n- Dr. George Chen of Carnegie Mellon University\n- Dr. Tom Pollard of MIT\n- Gerardo Flores of Google\n###### **Track Chairs**\n- ###### **Models and Methods**\n * Dr. Rahul Krishnan of University of Toronto and Vector Institute\n * Dr. Shalmali Joshi of Harvard University\n * Dr. Michael Hughes of Tufts University\n * Dr. Yuyin Zhou of Stanford\n * Dr. Uri Shalit of Technion\n- ###### **Applications and Practice**\n * Dr. Alistair Johnson of The Hospital for Sick Children\n * Dr. Judy Gichoya of Emory University\n * Emma Rocheteau of University of Cambridge\n * Dr. Lifang He of Lehigh University\n- ###### **Impact and Society**\n * Dr. Bobak Mortazavi of Texas A&M University \n * Dr. Stephen Pfohl of Stanford University\n * Dr. Farzan Sasangohar of Texas A&M University\n###### **Communications Chairs**\n- Dr. Sanja \u0160\u0107epanovi\u0107 of Bell Labs Cambridge\n- Emily Alsentzer of Harvard University and MIT\n- Dr. Ayah Zirikly of Johns Hopkins University\n###### **Finance Chairs**\n- Dr. Brett Beaulieu-Jones of Harvard Medical School\n- Dr. Ahmed Alaa of Harvard University and MIT\n- Tasmie Sarker of Association for Health Learning and Inference\n###### **Tutorial Chairs**\n- Dr. Jessica Gronsbell of University of Toronto\n- Harvineet Singh of New York University\n###### **Virtual Chairs**\n- Dr. Stephanie Hyland of Microsoft Research Cambridge\n- Dr. Ioakeim Perros of HEALTH[at]SCALE\n- Brian Gow of MIT\n###### **Logistics Chair**\n- Tasmie Sarker of Association for Health Learning and Inference\n\n\n### Steering Committee\n- Dr. Yindalon Aphinyanaphongs of NYU\n- Dr. Leo Celi of MIT\n- Dr. Nigam Shah of Stanford University\n- Dr. Stephen Friend of Oxford University\n- Dr. 
Alan Karthikesalingam of Google Health UK\n- Dr. Ziad Obermeyer of University of California, Berkeley\n- Dr. Samantha Kleinberg of Stevens Institute of Technology\n- Dr. Anna Goldenberg of The Hospital for Sick Children Research Institute\n- Dr. Lucila Ohno-Machado of University of California, San Diego\n- Dr. Noemie Elhadad of Columbia University\n- Dr. Katherine Heller of Google Research\n- Dr. Laura Rosella of Dalla Lana School of Public Health, University of Toronto\n- Dr. Shakir Mohamed of DeepMind\n\n### Sponsors\nThank you to our 2022 sponsors: HEALTH[at]SCALE (Silver), the Vector Institute (Silver) and Microsoft (Bronze)!\n","proceedings":[],"speakers":[{"UID":"S01","abstract":"It has been shown that equalizing health disparities can avert more deaths than the number of lives saved by medical advances alone in the same time frame. Moreover, without a simultaneous focus on innovations and equity, advances in health for one group can occur at the cost of added challenges for another. In this talk I will introduce the science of health disparities and juxtapose it with the machine learning subfield of algorithmic fairness. Given the key foci and principles of health equity and health disparities within public and population health, I will show examples of how machine learning and principles of public and population health can be synergized for using data to advance the science of health disparities and sustainable health of entire populations.","bio":"Dr. Rumi Chunara is an Associate Professor at New York University, jointly appointed at the Tandon School of Engineering (in Computer Science) and the School of Global Public Health (in Biostatistics/Epidemiology). Her PhD is from the Harvard-MIT Division of Health Sciences and Technology and her BSc from Caltech. Her research group focuses on developing computational and statistical approaches for acquiring, integrating and using data to improve population and public health. 
She is an MIT TR35, NSF Career, Bill & Melinda Gates Foundation Grand Challenges, Facebook Research and Max Planck Sabbatical award winner.","image":"static/images/speakers/rumi_chunara.jpg","institution":"New York University","slideslive_active_date":"","slideslive_id":"","speaker":"Rumi Chunara","title":"Algorithmic fairness and the science of health disparities","website":""},{"UID":"S02","abstract":"A common goal in genome-wide association (GWA) studies is to characterize the relationship between genotypic and phenotypic variation. Linear models are widely used tools in GWA analyses, in part, because they provide significance measures which detail how individual single nucleotide polymorphisms (SNPs) are statistically associated with a trait or disease of interest. However, traditional linear regression largely ignores non-additive genetic variation, and the univariate SNP-level mapping approach has been shown to be underpowered and challenging to interpret for certain trait architectures. While machine learning (ML) methods such as neural networks are well known to account for complex data structures, these same algorithms have also been criticized as \u201cblack box\u201d since they do not naturally carry out statistical hypothesis testing like classic linear models. This limitation has prevented ML approaches from being used for association mapping tasks in GWA applications. In this talk, we present flexible and scalable classes of Bayesian feedforward models which provide interpretable probabilistic summaries such as posterior inclusion probabilities and credible sets which allows researchers to simultaneously perform (i) fine-mapping with SNPs and (ii) enrichment analyses with SNP-sets on complex traits. 
While analyzing real data assayed in diverse self-identified human ancestries from the UK Biobank, the Biobank Japan, and the PAGE consortium we demonstrate that interpretable ML has the power to increase the return on investment in multi-ancestry biobanks. Furthermore, we highlight that by prioritizing biological mechanism we can identify associations that are robust across ancestries---suggesting that ML can play a key role in making personalized medicine a reality for all.","bio":"Lorin Crawford is a Senior Researcher at Microsoft Research New England. He also holds a position as the RGSS Assistant Professor of Biostatistics at Brown University. His scientific research interests involve the development of novel and efficient computational methodologies to address complex problems in statistical genetics, cancer pharmacology, and radiomics (e.g., cancer imaging). Dr. Crawford has an extensive background in modeling massive data sets of high-throughput molecular information as it pertains to functional genomics and cellular-based biological processes. His most recent work has earned him a place on Forbes 30 Under 30 list, The Root 100 Most Influential African Americans list, and recognition as an Alfred P. Sloan Research Fellow and a David & Lucile Packard Foundation Fellowship for Science and Engineering. Before joining Brown, Dr. Crawford received his PhD from the Department of Statistical Science at Duke University and received his Bachelor of Science degree in Mathematics from Clark Atlanta University.","image":"static/images/speakers/lorin_crawford.jpg","institution":"Microsoft Research New England; Brown University","slideslive_active_date":"","slideslive_id":"","speaker":"Lorin Crawford","title":"Machine Learning for Human Genetics: A Multi-Scale View on Complex Traits and Disease","website":""},{"UID":"S03","abstract":"Machine learning presents an opportunity to understand the patient journey over high dimensional data in the clinical context. 
This is aligned with one of the foundational issues of machine learning for healthcare: how to represent a patient state. Improving state representations allows us to (i) visualise/cluster deteriorating patients, (ii) understand the patient journey and thus heterogeneous pathways to improvement or clinical deterioration, which encompasses different data modalities; and thus (iii) more quickly identify situations for intervention. In this talk, I present motivating examples of understanding heterogeneity as a route towards understanding health and personalising healthcare interventions.","bio":"Danielle Belgrave is a Senior Staff Research Scientist at DeepMind. Prior to joining DeepMind she worked in the Healthcare Intelligence group at Microsoft Research and was a tenured research fellow at Imperial College London. Her research focuses on integrating medical domain knowledge, machine learning and causal modelling frameworks to understand health. She obtained a BSc in Mathematics and Statistics from London School of Economics, an MSc in Statistics from University College London and a PhD in the area of machine learning in health applications from the University of Manchester.","image":"static/images/speakers/danielle_belgrave.jpg","institution":"DeepMind","slideslive_active_date":"","slideslive_id":"","speaker":"Danielle Belgrave","title":"Understanding Heterogeneity as a Route to Understanding Health","website":""},{"UID":"S04","abstract":"In my talk, I will describe the work that I have been doing since March 2020, leading a multi-disciplinary team of 20+ volunteer scientists working very closely with the Presidency of the Valencian Government in Spain on 4 large areas: (1) human mobility modeling; (2) computational epidemiological models (both metapopulation, individual and LSTM-based models); (3) predictive models; and (4) a large-scale, online citizen survey called the COVID19impactsurvey (https://covid19impactsurvey.org) with over 700,000 answers worldwide. 
This survey has enabled us to shed light on the impact that the pandemic is having on people's lives. I will present the results obtained in each of these four areas, including winning the 500K XPRIZE Pandemic Response Challenge and obtaining a best paper award at ECML-PKDD 2021. I will share the lessons learned in this very special initiative of collaboration between the civil society at large (through the survey), the scientific community (through the Expert Group) and a public administration (through the Commissioner at the Presidency level). For those interested in knowing more, WIRED magazine published an extensive article describing our story: https://www.wired.co.uk/article/valencia-ai-covid-data.","bio":"Nuria Oliver is Co-founder and Vice-president of ELLIS (The European Laboratory for Learning and Intelligent Systems), Co-founder and Director of the ELLIS Unit Alicante, Chief Data Scientist at Data-Pop Alliance and Chief Scientific Advisor to the Vodafone Institute. Nuria earned her PhD from MIT. She is a Fellow of the ACM, IEEE and EurAI. She is the youngest member (and fourth female) in the Spanish Royal Academy of Engineering. She is also the only Spanish scientist at SIGCHI Academy. She has over 25 years of research experience in human-centric AI and is the author of over 180 widely cited scientific articles as well as an inventor of 40+ patents and a public speaker. Her work is regularly featured in the media and has received numerous recognitions, including the Spanish National Computer Science Award, the MIT TR100 (today TR35), Young Innovator Award (first Spanish scientist to receive this award); the 2020 Data Scientist of the Year by ESRI, the 2021 King Jaume I award in New Technologies and the 2021 Abie Technology Leadership Award. In March of 2020, she was appointed Commissioner to the President of the Valencian Government on AI Strategy and Data Science against COVID-19. 
In that role, she has recently co-led ValenciaIA4COVID, the winning team of the 500k XPRIZE Pandemic Response Challenge. Their work was featured in WIRED, among other media.","image":"static/images/speakers/nuria_oliver.jpg","institution":"ELLIS","slideslive_active_date":"","slideslive_id":"","speaker":"Nuria Oliver","title":"Data Science against COVID-19","website":""},{"UID":"S05","abstract":"Spoiler alert: No. And yes, it is much, much further. Public health has not traditionally been a data-driven field. The good news is that this has been changing in recent years, accelerated significantly by the COVID epidemic. But public health and human services organizations have many more fundamental things to worry about before we will have the luxury of considering what machine learning can enable. These fundamentals include data-related facets such as electronic data capture and exchange, data quality, data governance, information technology infrastructure, and data management best practices. In addition, data literacy, workforce development, and compensation that is a fraction of what 'quants' can earn in industry are also major stumbling blocks toward advanced analytics in public health. At the start of the COVID pandemic, many communicable diseases were reported by fax machine and then hand-entered into a database. Although there was significant interest in predictive modeling to project hospital capacity out in the future, even the most sophisticated models were of limited use to policy makers beyond basic trends and observations from the front lines. The most notable exception, where AI is in fact proving useful in public health, is in the use of 'robotic process automation' (RPA) as a band-aid for poorly designed systems that require mindless human intervention. These tools serve as workarounds for systems that lack interoperability by emulating human users to do the grunt work of data entry and wrangling. 
This talk will be a reality check from the trenches of state government on the heels of the COVID-19 pandemic.","bio":"Dr. Tenenbaum serves as the Chief Data Officer (CDO) for DHHS, where she oversees data strategy across the Department enabling the use of information to inform and evaluate policy and improve the health and well-being of residents of North Carolina. Prior to taking on the role of CDO, Dr. Tenenbaum was a founding faculty member of the Division of Translational Biomedical Informatics within Duke University's Department of Biostatistics and Bioinformatics where her research focused on informatics methods to enable precision medicine, particularly in mental health. She is also interested in ethical, legal, and social issues around big data and precision medicine. Nationally, Dr. Tenenbaum has served as Associate Editor for the Journal of Biomedical Informatics and as an elected member of the Board of Directors for the American Medical Informatics Association (AMIA). She currently serves on the Board of Scientific Counselors for the National Library of Medicine. After earning her bachelor's degree in biology from Harvard, Dr. Tenenbaum was a Program Manager at Microsoft Corporation in Redmond, WA for six years before pursuing a PhD in biomedical informatics at Stanford University. Dr. Tenenbaum is a strong promoter and advocate of young women interested in STEM (science, technology, engineering, and math) careers.","image":"static/images/speakers/jessica_tenenbaum.jpg","institution":"North Carolina Department of Health and Human Services; Duke University School of Medicine","slideslive_active_date":"","slideslive_id":"","speaker":"Jessica Tenenbaum","title":"Machine Learning in Public Health: are we there yet?","website":""},{"UID":"S06","abstract":"AI systems tend to amplify biases and disparities. When we feed them data that reflects our biases, they mimic them---from antisemitic chatbots to racially biased software. 
In this talk I am going to discuss two examples of how AI can help us reduce biases and disparities. First I am going to explain how we can use AI to understand why underserved populations experience higher levels of pain. This is true even after controlling for the objective severity of diseases like osteoarthritis, as graded by human physicians using medical images, which raises the possibility that underserved patients\u2019 pain stems from factors external to the knee, such as stress. We develop a deep learning approach to measure the severity of osteoarthritis by using knee X-rays to predict patients\u2019 experienced pain, and show that this approach dramatically reduces unexplained racial disparities in pain.","bio":"Jure Leskovec is an associate professor of Computer Science at Stanford University, the Chief Scientist at Pinterest, and an Investigator at the Chan Zuckerberg Biohub. He co-founded a machine learning startup Kosei, which was later acquired by Pinterest. Leskovec's research area is machine learning and data science for complex, richly-labeled relational structures, graphs, and networks for systems at all scales, from interactions of proteins in a cell to interactions between humans in a society. Applications include commonsense reasoning, recommender systems, social network analysis, computational social science, and computational biology with an emphasis on drug discovery. This research has won several awards including a Lagrange Prize, Microsoft Research Faculty Fellowship, the Alfred P. Sloan Fellowship, and numerous best paper and test of time awards. It has also been featured in popular press outlets such as the New York Times and the Wall Street Journal. Leskovec received his bachelor's degree in computer science from University of Ljubljana, Slovenia, PhD in machine learning from Carnegie Mellon University and postdoctoral training at Cornell University. 
You can follow him on Twitter at @jure.","image":"static/images/speakers/jure_leskovec.jpg","institution":"Stanford University","slideslive_active_date":"","slideslive_id":"","speaker":"Jure Leskovec","title":"Reducing bias in machine learning systems: Understanding drivers of pain","website":""}],"tutorials":[{"UID":"T01","abstract":"You\u2019ve created an awesome model that predicts with near 100 percent accuracy. Now what? In this tutorial, we will give insight into the implementation, deployment, integration, and evaluation steps following the building of a clinical model. Specifically, we will discuss each step in the context of informing design choices as you build a model. For example, aggressive feature selection is a necessary step toward integration, as real-time data streams of all the data points a machine learning model may consume may not be accessible or feasible. We will use our implementation and evaluation of a Covid-19 adverse event model at our institution as a representative case study. This case study will demonstrate the full lifecycle of a clinical model, how we transition from a model to affecting patient outcomes, and the socio-technical challenges for success.","authors":"Yindalon Aphinyanaphongs","bio":"Yindalon Aphinyanaphongs, MD, PhD (Predictive Analytics Team Lead) is a physician scientist in the Center for Healthcare Innovation and Delivery Science in the Department of Population Health at NYU Langone Health in New York City. Academically, he is an assistant professor and his lab focuses on novel applications of machine learning to clinical problems and the science behind successful translation of predictive models into clinical practice to drive value. Operationally, he is the Director of Operational Data Science and Machine Learning at NYU Langone Health. 
In this role, he leads a Predictive Analytics Unit composed of data scientists and engineers that build, evaluate, benchmark, and deploy predictive algorithms into the clinical enterprise.","image":"static/images/speakers/yindalon_aphinyanaphongs.jpg","rocketchat_id":"","slideslive_active_date":"2022-03-28T23:59:00.00","slideslive_id":"","title":"Changing patient trajectory: A case study exploring implementation and deployment of clinical machine learning models"},{"UID":"T02","abstract":"The growth of availability and variety of healthcare data sources has provided unique opportunities for data integration and evidence synthesis, which can potentially accelerate knowledge discovery and enable better clinical decision-making. However, many practical and technical challenges, such as data privacy, high-dimensionality and heterogeneity across different datasets, remain to be addressed. In this talk, I will introduce several methods for the effective and efficient integration of electronic health records and other healthcare datasets. Specifically, we develop communication-efficient distributed algorithms for jointly analyzing multiple datasets without the need of sharing patient-level data. Our algorithms can account for heterogeneity across different datasets. We provide theoretical guarantees for the performance of our algorithms, and examples of implementing the algorithms to real-world clinical research networks.","authors":"Rui Duan","bio":"Dr. Duan is an Assistant Professor of Biostatistics at the Harvard T.H. Chan School of Public Health. She received her Ph.D. in Biostatistics in May 2020 from the University of Pennsylvania. 
Her research interests focus on three distinct areas: methods for integrating evidence from different data sources, identifying signals from high dimensional data, and accounting for suboptimality of real-world data, such as missing data and measurement errors.","image":"static/images/speakers/rui_duan.jpg","rocketchat_id":"","slideslive_active_date":"2022-03-28T23:59:00.00","slideslive_id":"","title":"Distributed Statistical Learning and Inference with Electronic Health Records Data"},{"UID":"T03","abstract":"Digital health technologies provide promising ways to deliver interventions outside of clinical settings. Wearable sensors and mobile phones provide real-time data streams that carry information about an individual\u2019s current health including both internal (e.g., mood) and external (e.g., location) contexts. This tutorial discusses the algorithms underlying mobile health clinical trials. Specifically, we introduce the micro-randomized trial (MRT), an experimental design for optimizing real time interventions. We define the causal excursion effect and discuss reasons why this effect is often considered the primary causal effect of interest in MRT analysis. We introduce statistical methods for primary and secondary analyses for MRT. Attendees will have access to synthetic digital health experimental data to better understand online learning and experimentation algorithms, the systems underlying real time delivery of treatment, and their evaluation using collected data.","authors":"Walter Dempsey","bio":"Walter Dempsey is an Assistant Professor of Biostatistics and an Assistant Research Professor in the d3lab located in the Institute of Social Research. His research focuses on Statistical Methods for Digital and Mobile Health. 
His current work involves three complementary research themes: (1) experimental design and data analytic methods to inform multi-stage decision making in health; (2) statistical modeling of complex longitudinal and survival data; and (3) statistical modeling of complex relational structures such as interaction networks.","image":"static/images/speakers/walter_dempsey.jpg","rocketchat_id":"","slideslive_active_date":"2022-03-28T23:59:00.00","slideslive_id":"","title":"Challenges in Developing Online Learning and Experimentation Algorithms in Digital Health"},{"UID":"T04","abstract":"Does increasing the dosage of a drug treatment cause adverse reactions in patients? This is a causal question: did increased drug dosage cause some patients to have an adverse reaction, or would they have had the reaction anyway due to other factors? A classical approach to studying this causal question from observational data involves applying causal inference techniques to observed measurements of all the relevant clinical variables. However, there is a growing recognition that abundant text data, such as medical records, physicians' notes, or even forum posts from online medical communities, provide a rich source of information for causal inference. In this tutorial, I'll introduce causal inference and highlight the unique challenges that high-dimensional and noisy text data pose. Then, I'll use two text applications involving online forums and consumer complaints to motivate recent approaches that extend natural language processing (NLP) methods in service of causal inference. I'll discuss some new assumptions we need to introduce to bridge the gap between noisy text data and valid causal inference. I'll conclude by summarizing open research questions at the intersection of causal inference and text analysis.","authors":"Dhanya Sridhar","bio":"Dhanya Sridhar is an assistant professor at the University of Montreal and a core academic member at Mila - Quebec AI Institute. 
She holds a Canada CIFAR AI Chair. She was a postdoctoral researcher at Columbia University and completed her PhD at the University of California, Santa Cruz. Her research interests are at the intersection of causality and machine learning, focusing on applications to text and social network data.","image":"static/images/speakers/dhanya_sridhar.jpg","rocketchat_id":"","slideslive_active_date":"2022-03-28T23:59:00.00","slideslive_id":"","title":"Causal Inference from Text Data"},{"UID":"T05","abstract":"Data visualization is essential for analyzing biomedical and public health data and communicating the findings to key stakeholders. However, the presence of a data visualization is not enough; the choices we make when visualizing data are equally important in establishing its understandability and impact. This tutorial will discuss strategies for visualizing data and evaluating its impact with an appropriate target audience. The aim is to build an intuition for developing and assessing visualizations by drawing on visualization theory together with examples from prior research and ongoing attempts to visualize the present pandemic.","authors":"Ana Crisan","bio":"Ana Crisan is currently a senior research scientist at Tableau, a Salesforce company. She conducts interdisciplinary research that integrates techniques and methods from machine learning, human computer interaction, and data visualization. Her research focuses on the intersection of Data Science and Data Visualization, especially toward the way humans can collaboratively work together with ML/AI systems through visual interfaces. She completed her Ph.D. in Computer Science at the University of British Columbia, under the joint supervision of Dr. Tamara Munzner and Dr. Jennifer L. Gardy. 
Prior to that, she was a research scientist at the British Columbia Centre for Disease Control and Decipher Biosciences, where she conducted research on machine learning and data visualization toward applications in infectious disease and cancer genomics, respectively. Her research has appeared in publications of the ACM (CHI), IEEE (TVCG, CG&A), Bioinformatics, and Nature.","image":"static/images/speakers/ana_crisan.jpg","rocketchat_id":"","slideslive_active_date":"2022-03-28T23:59:00.00","slideslive_id":"","title":"'Are log scales endemic yet?' Strategies for visualizing biomedical and public health data"}]},"2023":{"debates":[{"UID":"D01","abstract":"","bio":"Dr. Chute is the Bloomberg Distinguished Professor of Health Informatics, Professor of Medicine, Public Health, and Nursing at Johns Hopkins University, and Chief Research Information Officer for Johns Hopkins Medicine. He is also Section Head of Biomedical Informatics and Data Science and Deputy Director of the Institute for Clinical and Translational Research. He received his undergraduate and medical training at Brown University, internal medicine residency at Dartmouth, and doctoral training in Epidemiology and Biostatistics at Harvard. He is Board Certified in Internal Medicine and Clinical Informatics, and an elected Fellow of the American College of Physicians, the American College of Epidemiology, HL7, the American Medical Informatics Association, and the American College of Medical Informatics (ACMI), as well as a Founding Fellow of the International Academy of Health Sciences Informatics; he was president of ACMI 2017-18. He is an elected member of the Association of American Physicians. His career has focused on how we can represent clinical information to support analyses and inferencing, including comparative effectiveness analyses, decision support, best evidence discovery, and translational research. 
He has had a deep interest in the semantic consistency of health data, harmonized information models, and ontology. His current research focuses on translating basic science information to clinical practice, how we classify dysfunctional phenotypes (disease), and the harmonization and rendering of real-world clinical data including electronic health records to support data inferencing. He became founding Chair of Biomedical Informatics at Mayo Clinic in 1988, retiring from Mayo in 2014, where he remains an emeritus Professor of Biomedical Informatics. He is presently PI on a spectrum of high-profile informatics grants from NIH spanning translational science including co-lead on the National COVID Cohort Collaborative (N3C). He has been active on many HIT standards efforts and chaired ISO Technical Committee 215 on Health Informatics and chaired the World Health Organization (WHO) International Classification of Disease Revision (ICD-11).","image":"static/images/speakers/christopher_chute.jpg","institution":"Johns Hopkins University","slideslive_active_date":"","slideslive_id":"","speaker":"Christopher Chute","title":"Network studies: As many databases as possible or enough to answer the question quickly?"},{"UID":"D02","abstract":"","bio":"Robert Platt is Professor in the Departments of Epidemiology, Biostatistics, and Occupational Health, and of Pediatrics, at McGill University. He holds the Albert Boehringer I endowed chair in Pharmacoepidemiology, and is Principal Investigator of the Canadian Network for Observational Drug Effect Studies (CNODES). His research focuses on improving statistical methods for the study of medications using administrative data, with a substantive focus on medications in pregnancy. Dr. Platt is an editor-in-chief of Statistics in Medicine and is on the editorial boards of the American Journal of Epidemiology and Pharmacoepidemiology and Drug Safety. 
He has published over 400 articles, one book and several book chapters on biostatistics and epidemiology.","image":"static/images/speakers/robert_platt.jpg","institution":"McGill University","slideslive_active_date":"","slideslive_id":"","speaker":"Robert Platt","title":"Network studies: As many databases as possible or enough to answer the question quickly?"},{"UID":"D03","abstract":"","bio":"Tianxi Cai is John Rock Professor of Translational Data Science at Harvard, with joint appointments in the Biostatistics Department and the Department of Biomedical Informatics. She directs the Translational Data Science Center for a Learning Health System at Harvard Medical School and co-directs the Applied Bioinformatics Core at VA MAVERIC. She is a major player in developing analytical tools for mining multi-institutional EHR data, real world evidence, and predictive modeling with large scale biomedical data. Tianxi received her Doctor of Science in Biostatistics at Harvard and was an assistant professor at the University of Washington before returning to Harvard as a faculty member in 2002.","image":"static/images/speakers/t-cai.jpg","institution":"Harvard Medical School","slideslive_active_date":"","slideslive_id":"","speaker":"Tianxi Cai","title":"Data Heterogeneity: More Heterogeneous Data or Less Homogeneous Data?"},{"UID":"D04","abstract":"","bio":"Dr. Yong Chen is Professor of Biostatistics at the Department of Biostatistics, Epidemiology, and Informatics at the University of Pennsylvania (Penn). He directs a Computing, Inference and Learning Lab at University of Pennsylvania, which focuses on integrating fundamental principles and wisdoms of statistics into quantitative methods for tackling key challenges in modern biomedical data. Dr. 
Chen is an expert in synthesis of evidence from multiple data sources, including systematic review and meta-analysis, distributed algorithms, and data integration, with applications to comparative effectiveness studies, health policy, and precision medicine. He has published over 170 peer-reviewed papers in a wide spectrum of methodological and clinical areas. During the pandemic, Dr. Chen is serving as Director of the Biostatistics Core for Pediatric PASC of the RECOVER COVID initiative, which is a national multi-center RWD-based study on Post-Acute Sequelae of SARS CoV-2 infection (PASC), involving more than 13 million patients across more than 10 health systems. He is an elected fellow of the American Statistical Association, the American Medical Informatics Association, Elected Member of the International Statistical Institute, and Elected Member of the Society for Research Synthesis Methodology.","image":"static/images/speakers/yong_chen.png","institution":"University of Pennsylvania","slideslive_active_date":"","slideslive_id":"","speaker":"Yong Chen","title":"Data Heterogeneity: More Heterogeneous Data or Less Homogeneous Data?"},{"UID":"D05","abstract":"","bio":"Dr. Khaled El Emam is the Canada Research Chair (Tier 1) in Medical AI at the University of Ottawa, where he is a Professor in the School of Epidemiology and Public Health. He is also a Senior Scientist at the Children\u2019s Hospital of Eastern Ontario Research Institute and Director of the multi-disciplinary Electronic Health Information Laboratory, conducting research on privacy enhancing technologies to enable the sharing of health data for secondary purposes, including synthetic data generation and de-identification methods. Khaled is a co-founder of Replica Analytics, a company that develops synthetic data generation technology, which was recently acquired by Aetion. 
As an entrepreneur, Khaled founded or co-founded six product and services companies involved with data management and data analytics, with some having successful exits. Prior to his academic roles, he was a Senior Research Officer at the National Research Council of Canada. He also served as the head of the Quantitative Methods Group at the Fraunhofer Institute in Kaiserslautern, Germany. He participates in a number of committees, including the European Medicines Agency Technical Anonymization Group, the Panel on Research Ethics advising on the TCPS, and the Strategic Advisory Council of the Office of the Information and Privacy Commissioner of Ontario, and is also co-editor-in-chief of the JMIR AI journal. In 2003 and 2004, he was ranked as the top systems and software engineering scholar worldwide by the Journal of Systems and Software based on his research on measurement and quality evaluation and improvement. He held the Canada Research Chair in Electronic Health Information at the University of Ottawa from 2005 to 2015. Khaled has a PhD from the Department of Electrical and Electronics Engineering.","image":"static/images/speakers/khaled_el_emam.png","institution":"University of Ottawa","slideslive_active_date":"","slideslive_id":"","speaker":"Khaled El Emam","title":"Differential Privacy vs. Synthetic Data"},{"UID":"D06","abstract":"","bio":"Li Xiong is a Samuel Candler Dobbs Professor of Computer Science and Professor of Biomedical Informatics at Emory University. She held a Winship Distinguished Research Professorship from 2015-2018. She has a Ph.D. from Georgia Institute of Technology, an MS from Johns Hopkins University, and a BS from the University of Science and Technology of China. She and her research lab, Assured Information Management and Sharing (AIMS), conduct research on algorithms and methods at the intersection of data management, machine learning, and data privacy and security, with a recent focus on privacy-enhancing and robust machine learning. 
She has published over 170 papers and received six best paper or runner-up awards. She has served and serves as associate editor for IEEE TKDE, IEEE TDSC, and VLDBJ, general co-chair for ACM CIKM 2022, program co-chair for IEEE BigData 2020 and ACM SIGSPATIAL 2018, 2020, program vice-chair for ACM SIGMOD 2024, 2022, and IEEE ICDE 2023, 2020, and VLDB Sponsorship Ambassador. Her research is supported by federal agencies including NSF, NIH, AFOSR, PCORI, and industry awards including Google, IBM, Cisco, AT&T, and Woodrow Wilson Foundation. She is an IEEE Fellow.","image":"static/images/speakers/li_xiong.png","institution":"Emory University","slideslive_active_date":"","slideslive_id":"","speaker":"Li Xiong","title":"Differential Privacy vs. Synthetic Data"}],"highlights":"
\n\n\n##### **General Chairs**\n- Joyce Ho of Emory University\n- Andrew Beam of Harvard University\n##### **Program Chairs**\n- Matthew McDermott of Harvard University\n- Emily Alsentzer of Harvard University & Brigham and Women's Hospital\n##### **Track Chairs**\n- ##### **Track 1**\n * Mike Hughes of Tufts University (Track Lead)\n * Yuyin Zhou of University of California, Santa Cruz\n * Rahul Krishnan of University of Toronto & Vector Institute\n * Jean Feng of University of California, San Francisco\n * Samantha Kleinberg of Stevens Institute of Technology\n- ##### **Tracks 1 & 2**\n * Elena Sizikova of Food and Drug Administration\n- ##### **Track 2**\n * Lifang He of Lehigh University (Track Lead)\n * Tom Pollard of MIT\n * Carl Yang of Emory University\n * Yu Zhang of Lehigh University\n- ##### **Track 3**\n * Sanja \u0160\u0107epanovi\u0107 of Nokia Bell Labs (Track Lead)\n * Stephen Pfohl of Google\n * Dimitris Spathis of Nokia Bell Labs & University of Cambridge\n##### **Proceedings Chair**\n- Bobak Mortazavi of Texas A&M University\n##### **Technology Chairs**\n- Huan He of Harvard University\n- Jiayu Yao of Harvard University & Gladstone Institutes\n##### **Virtual Chair**\n- Brian Gow of MIT\n##### **Communications Chairs**\n- Ioakeim Perros of HEALTH[at]SCALE\n- Anil Palepu of Harvard University & MIT\n##### **Logistics Chairs**\n- Monica Munnangi of Northeastern University\n- Tasmie Sarker of Association for Health Learning and Inference\n##### **Unconference Chairs**\n- Jessica Gronsbell of University of Toronto\n- Rui Duan of Harvard University\n##### **Finance Chairs**\n- Edward Choi of KAIST\n- Harvineet Singh of New York University\n##### **Doctoral Symposium Chair**\n- Tom Hartvigsen of MIT\n\n\n### Sponsors\nThank you to our 2023 sponsors:\n\n- ApolloMed\n- Dandelion Health\n- Genentech\n- Apple\n- Sage BioNetworks\n- AITRICS\n","invited":[{"UID":"I01","abstract":"","bio":"Suchi Saria, PhD, holds the John C. 
Malone endowed chair and is the Director of the Machine Learning, AI and Healthcare Lab at Johns Hopkins. She is also the Founder and CEO of Bayesian Health. Her research has pioneered the development of next generation diagnostic and treatment planning tools that use statistical machine learning methods to individualize care. She has written several of the seminal papers in the field of ML and its use for improving patient care and has given over 300 invited keynotes and talks to organizations including the NAM, NAS, and NIH. Dr. Saria has served as an advisor to multiple Fortune 500 companies and her work has been funded by leading organizations including the NIH, FDA, NSF, DARPA and CDC. Dr. Saria\u2019s work has been featured by the Atlantic, Smithsonian Magazine, Bloomberg News, Wall Street Journal, and PBS NOVA to name a few. She has won several awards for excellence in AI and care delivery. For example, for her academic work, she\u2019s been recognized as IEEE\u2019s \u201cAI\u2019s 10 to Watch\u201d, Sloan Fellow, MIT Tech Review\u2019s \u201c35 Under 35\u201d, National Academy of Medicine\u2019s list of \u201cEmerging Leaders in Health and Medicine\u201d, and DARPA\u2019s Faculty Award. 
For her work in industry bringing AI to healthcare, she\u2019s been recognized as World Economic Forum\u2019s 100 Brilliant Minds Under 40, Rock Health\u2019s \u201cTop 50 in Digital Health\u201d, Modern Healthcare\u2019s Top 25 Innovators, The Armstrong Award for Excellence in Quality and Safety and Society of Critical Care Medicine\u2019s Annual Scientific Award.","image":"static/images/speakers/suchi_saria.jpg","institution":"Johns Hopkins University & Bayesian Health","slideslive_active_date":"","slideslive_id":"","speaker":"Suchi Saria","title":"Invited Talk on Research and Top Recent Papers from 2020-2022"},{"UID":"I02","abstract":"","bio":"Karandeep Singh, MD, MMSc, is an Assistant Professor of Learning Health Sciences, Internal Medicine, Urology, and Information at the University of Michigan. He directs the Machine Learning for Learning Health Systems (ML4LHS) Lab, which focuses on translational issues related to the implementation of machine learning (ML) models within health systems. He serves as an Associate Chief Medical Information Officer for Artificial Intelligence for Michigan Medicine and is the Associate Director for Implementation for U-M Precision Health, a Presidential Initiative focused on bringing research discoveries to the bedside, with a focus on prediction models and genomics data. He chairs the Michigan Medicine Clinical Intelligence Committee, which oversees the governance of machine learning models across the health system. He teaches a health data science course for graduate and doctoral students, and provides clinical care for people with kidney disease. He completed his internal medicine residency at UCLA Medical Center, where he served as chief resident, and a nephrology fellowship in the combined Brigham and Women\u2019s Hospital/Massachusetts General Hospital program in Boston, MA. 
He completed his medical education at the University of Michigan Medical School and holds a master\u2019s degree in medical sciences in Biomedical Informatics from Harvard Medical School. He is board certified in internal medicine, nephrology, and clinical informatics.","image":"static/images/speakers/karandeep_singh.jpg","institution":"University of Michigan","slideslive_active_date":"","slideslive_id":"","speaker":"Karandeep Singh","title":"Invited Talk on Recent Deployments and Real-world Impact"},{"UID":"I03","abstract":"","bio":"Dr. Nigam Shah is Professor of Medicine at Stanford University, and Chief Data Scientist for Stanford Health Care. His research group analyzes multiple types of health data (EHR, Claims, Wearables, Weblogs, and Patient blogs), to answer clinical questions, generate insights, and build predictive models for the learning health system. At Stanford Healthcare, he leads artificial intelligence and data science efforts for advancing the scientific understanding of disease, improving the practice of clinical medicine and orchestrating the delivery of health care. Dr. Shah is an inventor on eight patents and patent applications, has authored over 200 scientific publications and has co-founded three companies. Dr. Shah was elected into the American College of Medical Informatics (ACMI) in 2015 and was inducted into the American Society for Clinical Investigation (ASCI) in 2016. 
He holds an MBBS from Baroda Medical College, India, a PhD from Penn State University and completed postdoctoral training at Stanford University.","image":"static/images/speakers/nigam_shah.png","institution":"Stanford University","slideslive_active_date":"","slideslive_id":"","speaker":"Nigam Shah","title":"Invited Talk on Under-explored Research Challenges and Opportunities"}],"panels":[{"UID":"P01","abstract":"","bio":"Isaac \u201cZak\u201d Kohane, MD, PhD, is the inaugural chair of Harvard Medical School\u2019s Department of Biomedical Informatics, whose mission is to develop the methods, tools, and infrastructure required for a new generation of scientists and care providers to move biomedicine rapidly forward by taking advantage of the insight and precision offered by big data. Kohane develops and applies computational techniques to address disease at multiple scales, from whole health care systems to the functional genomics of neurodevelopment. He also has worked on AI applications in medicine since the 1990\u2019s, including automated ventilator control, pediatric growth monitoring, detection of domestic abuse, diagnosing autism from multimodal data and most recently assisting clinicians using whole genome sequence and clinical histories to diagnose rare or unknown disease patients. His most urgent question is how to enable doctors to be most effective and enjoy their profession when they enter into a substantial symbiosis with machine intelligence. 
He is a member of the National Academy of Medicine, the American Society for Clinical Investigation and the American College of Medical Informatics.","image":"static/images/speakers/isaac_kohane.jpg","institution":"Harvard University","slideslive_active_date":"","slideslive_id":"","speaker":"Moderator: Isaac Kohane","title":"Generalizability in Machine Learning for Health: Critical for Robustness, or a Distraction from Specific Validation?"},{"UID":"P02","abstract":"","bio":"Leo focuses on scaling clinical research to be more inclusive through open access data and software, particularly for limited resource settings; identifying bias in the data to prevent them from being encrypted in models and algorithms; and redesigning research using the principles of team science and the hive learning strategy.","image":"static/images/speakers/leo_celi.png","institution":"MIT","slideslive_active_date":"","slideslive_id":"","speaker":"Leo Celi","title":"Generalizability in Machine Learning for Health: Critical for Robustness, or a Distraction from Specific Validation?"},{"UID":"P03","abstract":"","bio":"Jason Fries is a research scientist at the Shah Lab at Stanford University. His work is centered on enabling domain experts to easily construct and modify machine learning models, particularly in the field of medicine where expert-labeled training data are hard to acquire. His research interests include weakly supervised machine learning, foundation models for medicine, and data-centric AI.","image":"static/images/speakers/jason_fries.jpg","institution":"Stanford University","slideslive_active_date":"","slideslive_id":"","speaker":"Jason Fries","title":"Generalizability in Machine Learning for Health: Critical for Robustness, or a Distraction from Specific Validation?"},{"UID":"P04","abstract":"","bio":"Lauren Oakden-Rayner is a radiologist and Senior Research Fellow at the Australian Institute for Machine Learning, University of Adelaide. 
Her research primarily focuses on medical AI safety, specifically addressing the issues of model robustness, generalization, evaluation, and fairness. Lauren is also involved in supervising students and working on various medical AI projects, reviewing MOOCs on her blog, and advocating for diversity in her group and Institute.","image":"static/images/speakers/lauren_oakden_rayner.jpg","institution":"University of Adelaide","slideslive_active_date":null,"slideslive_id":"","speaker":"Lauren Oakden-Rayner","title":"Generalizability in Machine Learning for Health: Critical for Robustness, or a Distraction from Specific Validation?"},{"UID":"P05","abstract":"","bio":"Maia Hightower, MD, MBA, MPH, is an accomplished healthcare IT executive and internist. She currently serves as the executive Vice President and Chief Digital & Technology Officer at the University of Chicago Medicine and the CEO and co-founder of Equality AI, a startup aimed at achieving health equity through responsible AI and machine-learning operations. Previously, she was the chief medical information officer and associate chief medical officer at University of Utah Health and served in similar roles at University of Iowa Health Care and Stanford Health Care. Dr. Hightower's work has focused on leveraging digital technology to address health inequities and promoting diversity and inclusion within healthcare IT systems. Her leadership in the field has earned her widespread recognition.","image":"static/images/speakers/maia_hightower.jpg","institution":"University of Chicago Medicine","slideslive_active_date":"","slideslive_id":"","speaker":"Maia Hightower","title":"Generalizability in Machine Learning for Health: Critical for Robustness, or a Distraction from Specific Validation?"},{"UID":"P06","abstract":"","bio":"Marzyeh Ghassemi is an assistant professor and the Hermann L. F. 
von Helmholtz Professor with appointments in the Department of Electrical Engineering and Computer Science and the Institute for Medical Engineering & Science at MIT. Ghassemi\u2019s research interests span representation learning, behavioral ML, healthcare ML, and healthy ML. One of her focuses is on real-world applications of machine learning, such as turning diverse clinical data into cohesive information with the ability to predict patient needs. Ghassemi has received BS degrees in computer science and electrical engineering from New Mexico State University, an MSc degree in biomedical engineering from Oxford University, and a PhD in computer science from MIT.","image":"static/images/speakers/marzyeh_ghassemi.jpeg","institution":"MIT","slideslive_active_date":null,"slideslive_id":"","speaker":"Moderator: Marzyeh Ghassemi","title":"Sharing Health Data in an Age of Generative AI: Risks, Limitations, and Solutions"},{"UID":"P07","abstract":"","bio":"Ziad Obermeyer is Associate Professor and Blue Cross of California Distinguished Professor at UC Berkeley, where he works at the intersection of machine learning and health. He is a Chan Zuckerberg Biohub Investigator, a Faculty Research Fellow at the National Bureau of Economic Research, and was named an Emerging Leader by the National Academy of Medicine. Previously, he was Assistant Professor at Harvard Medical School, and continues to practice emergency medicine in underserved communities.","image":"static/images/speakers/ziad_obermeyer.jpg","institution":"UC Berkeley","slideslive_active_date":"","slideslive_id":"","speaker":"Ziad Obermeyer","title":"Sharing Health Data in an Age of Generative AI: Risks, Limitations, and Solutions"},{"UID":"P08","abstract":"","bio":"Dr. 
Halamka is an emergency medicine physician, medical informatics expert and president of the Mayo Clinic Platform, which is focused on transforming health care by leveraging artificial intelligence, connected health care devices and a network of partners. Dr. Halamka has been developing and implementing health care information strategy and policy for more than 25 years. Previously, he was executive director of the Health Technology Exploration Center for Beth Israel Lahey Health, chief information officer at Beth Israel Deaconess Medical Center, and International Healthcare Innovation Professor at Harvard Medical School. He is a member of the National Academy of Medicine.","image":"static/images/speakers/john_halamka.jpg","institution":"Mayo Clinic","slideslive_active_date":"","slideslive_id":"","speaker":"John Halamka","title":"Sharing Health Data in an Age of Generative AI: Risks, Limitations, and Solutions"},{"UID":"P09","abstract":"","bio":"Elaine Nsoesie is an Associate Professor at Boston University's School of Public Health and a leading voice in the use of data and technology to advance health equity. She leads the Racial Data Tracker project at Boston University's Center for Antiracist Research and serves as a Senior Advisor to the Artificial Intelligence/Machine Learning Consortium to Advance Health Equity and Researcher Diversity (AIM-AHEAD) program at the National Institutes of Health. Dr. Nsoesie has published extensively on the use of data from social media, search engines, and cell phones for public health surveillance and is dedicated to increasing representation of underrepresented communities in data science. 
She completed her PhD in Computational Epidemiology from Virginia Tech and has held postdoctoral positions at Harvard Medical School and Boston Children's Hospital.","image":"static/images/speakers/elaine_nsoesie.jpg","institution":"Boston University","slideslive_active_date":"","slideslive_id":"","speaker":"Elaine Nsoesie","title":"Sharing Health Data in an Age of Generative AI: Risks, Limitations, and Solutions"},{"UID":"P10","abstract":"","bio":"Dr. Khaled El Emam is the Canada Research Chair (Tier 1) in Medical AI at the University of Ottawa, where he is a Professor in the School of Epidemiology and Public Health. He is also a Senior Scientist at the Children\u2019s Hospital of Eastern Ontario Research Institute and Director of the multi-disciplinary Electronic Health Information Laboratory, conducting research on privacy enhancing technologies to enable the sharing of health data for secondary purposes, including synthetic data generation and de-identification methods. Khaled is a co-founder of Replica Analytics, a company that develops synthetic data generation technology, which was recently acquired by Aetion. As an entrepreneur, Khaled founded or co-founded six product and services companies involved with data management and data analytics, with some having successful exits. Prior to his academic roles, he was a Senior Research Officer at the National Research Council of Canada. He also served as the head of the Quantitative Methods Group at the Fraunhofer Institute in Kaiserslautern, Germany. He participates in a number of committees, including the European Medicines Agency Technical Anonymization Group, the Panel on Research Ethics advising on the TCPS, and the Strategic Advisory Council of the Office of the Information and Privacy Commissioner of Ontario, and is also co-editor-in-chief of the JMIR AI journal. 
In 2003 and 2004, he was ranked as the top systems and software engineering scholar worldwide by the Journal of Systems and Software based on his research on measurement and quality evaluation and improvement. He held the Canada Research Chair in Electronic Health Information at the University of Ottawa from 2005 to 2015. Khaled has a PhD from the Department of Electrical and Electronics Engineering, King's College, at the University of London, England.","image":"static/images/speakers/khaled_el_emam.png","institution":"University of Ottawa","slideslive_active_date":"","slideslive_id":"","speaker":"Khaled El Emam","title":"Sharing Health Data in an Age of Generative AI: Risks, Limitations, and Solutions"},{"UID":"P11","abstract":"","bio":"Byron Wallace is the Sy and Laurie Sternberg Interdisciplinary Associate Professor and Director of the BS in Data Science program at Northeastern University in the Khoury College of Computer Sciences. His research is primarily in natural language processing (NLP) methods, with an emphasis on their application in healthcare and the challenges inherent to this domain.","image":"static/images/speakers/byron_wallace.jpg","institution":"Northeastern University","slideslive_active_date":"","slideslive_id":"","speaker":"Moderator: Byron Wallace","title":"Machine Learning for Healthcare in the Era of ChatGPT"},{"UID":"P12","abstract":"","bio":"Tristan Naumann is a Principal Researcher in Microsoft Research\u2019s Health Futures working on problems related to clinical and biomedical natural language processing (NLP). His research focuses on exploring relationships in complex, unstructured healthcare data using natural language processing and unsupervised learning techniques. He is currently serving as General Chair of NeurIPS and co-organizer of the Clinical NLP workshop at ACL. 
Previously, he has served as General Chair and Program Chair of the AHLI Conference on Health, Inference, and Learning (CHIL) and Machine Learning for Health (ML4H). His work has appeared in KDD, AAAI, AMIA, JMIR, MLHC, ACM HEALTH, Cell Patterns, Science Translational Medicine, and Nature Translational Psychiatry.","image":"static/images/speakers/tristan_naumann.jpeg","institution":"Microsoft Research","slideslive_active_date":"","slideslive_id":"","speaker":"Tristan Naumann","title":"Machine Learning for Healthcare in the Era of ChatGPT"},{"UID":"P13","abstract":"","bio":"Karandeep Singh, MD, MMSc, is an Assistant Professor of Learning Health Sciences, Internal Medicine, Urology, and Information at the University of Michigan. He directs the Machine Learning for Learning Health Systems (ML4LHS) Lab, which focuses on translational issues related to the implementation of machine learning (ML) models within health systems. He serves as an Associate Chief Medical Information Officer for Artificial Intelligence for Michigan Medicine and is the Associate Director for Implementation for U-M Precision Health, a Presidential Initiative focused on bringing research discoveries to the bedside, with a focus on prediction models and genomics data. He chairs the Michigan Medicine Clinical Intelligence Committee, which oversees the governance of machine learning models across the health system. He teaches a health data science course for graduate and doctoral students, and provides clinical care for people with kidney disease. He completed his internal medicine residency at UCLA Medical Center, where he served as chief resident, and a nephrology fellowship in the combined Brigham and Women\u2019s Hospital/Massachusetts General Hospital program in Boston, MA. He completed his medical education at the University of Michigan Medical School and holds a master\u2019s degree in medical sciences in Biomedical Informatics from Harvard Medical School. 
He is board certified in internal medicine, nephrology, and clinical informatics.","image":"static/images/speakers/karandeep_singh.jpg","institution":"University of Michigan","slideslive_active_date":"","slideslive_id":"","speaker":"Karandeep Singh","title":"Machine Learning for Healthcare in the Era of ChatGPT"},{"UID":"P14","abstract":"","bio":"Dr. Nigam Shah is Professor of Medicine at Stanford University, and Chief Data Scientist for Stanford Health Care. His research group analyzes multiple types of health data (EHR, Claims, Wearables, Weblogs, and Patient blogs), to answer clinical questions, generate insights, and build predictive models for the learning health system. At Stanford Healthcare, he leads artificial intelligence and data science efforts for advancing the scientific understanding of disease, improving the practice of clinical medicine and orchestrating the delivery of health care. Dr. Shah is an inventor on eight patents and patent applications, has authored over 200 scientific publications and has co-founded three companies. Dr. Shah was elected into the American College of Medical Informatics (ACMI) in 2015 and was inducted into the American Society for Clinical Investigation (ASCI) in 2016. He holds an MBBS from Baroda Medical College, India, a PhD from Penn State University and completed postdoctoral training at Stanford University.","image":"static/images/speakers/nigam_shah.png","institution":"Stanford University","slideslive_active_date":"","slideslive_id":"","speaker":"Nigam Shah","title":"Machine Learning for Healthcare in the Era of ChatGPT"},{"UID":"P15","abstract":"","bio":"Saadia Gabriel is currently a MIT CSAIL Postdoctoral Fellow. She is also an incoming NYU Faculty Fellow and will start as an Assistant Professor at UCLA in 2024. She completed her PhD at the University of Washington, where she was advised by Prof. Yejin Choi and Prof. Franziska Roesner. 
Her research revolves around natural language processing and machine learning, with a particular focus on building systems for understanding how social commonsense manifests in text (i.e. how do people typically behave in social scenarios), as well as mitigating spread of false or harmful text (e.g. Covid-19 misinformation). Her work has been covered by a wide range of media outlets like Forbes and TechCrunch. It has also received a 2019 ACL best short paper nomination, a 2019 IROS RoboCup best paper nomination and won a best paper award at the 2020 WeCNLP summit. \n","image":"static/images/speakers/saadia_gabriel.jpg","institution":"MIT","slideslive_active_date":"","slideslive_id":"","speaker":"Saadia Gabriel","title":"Machine Learning for Healthcare in the Era of ChatGPT"}],"proceedings":[{"UID":"P01","abstract":"Healthcare datasets often include patient-reported values, such as mood, symptoms, and meals, which can be subject to varying levels of human error. Improving the accuracy of patient-reported data could help in several downstream tasks, such as remote patient monitoring. In this study, we propose a novel denoising autoencoder (DAE) approach to denoise patient-reported data, drawing inspiration from recent work in computer vision. Our approach is based on the observation that noisy patient-reported data are often collected alongside higher fidelity data collected from wearable sensors. We leverage these auxiliary data to improve the accuracy of the patient-reported data. Our approach combines key ideas from DAEs with co-teaching to iteratively filter and learn from clean patient-reported samples. Applied to the task of recovering carbohydrate values for blood glucose management in diabetes, our approach reduces noise (MSE) in patient-reported carbohydrates from 72g\u00b2 (95% CI: 54-93) to 18g\u00b2 (13-25), outperforming the best baseline (33g\u00b2 (27-43)). 
Notably, our approach achieves strong performance with only access to patient-reported target values, making it applicable to many settings where ground truth data may be unavailable.","authors":"Harry Rubin-Falcone* (University of Michigan)|Joyce Lee (University of Michigan)|Jenna Wiens (University of Michigan)","session":"B","title":"Denoising Autoencoders for Learning from Noisy Patient-Reported Data"},{"UID":"P02","abstract":"Learning multi-view data is an emerging problem in machine learning research, and nonnegative matrix factorization (NMF) is a popular dimensionality-reduction method for integrating information from multiple views. These views often provide not only consensus but also complementary information. However, most multi-view NMF algorithms assign equal weight to each view or tune the weight via line search empirically, which can be infeasible without any prior knowledge of the views or computationally expensive. In this paper, we propose a weighted multi-view NMF (WM-NMF) algorithm. In particular, we aim to address the critical technical gap, which is to learn both view-specific weight and observation-specific reconstruction weight to quantify each view\u2019s information content. The introduced weighting scheme can alleviate unnecessary views' adverse effects and enlarge the positive effects of the important views by assigning smaller and larger weights, respectively. Experimental results confirm the effectiveness and advantages of the proposed algorithm in terms of achieving better clustering performance and dealing with the noisy data compared to the existing algorithms.","authors":"Shuo Shuo Liu* (Pennsylvania State University)|Lin Lin (Duke University)","session":"A","title":"Adaptive Weighted Multi-View Clustering"},{"UID":"P03","abstract":"Imbalanced token distributions naturally exist in text documents, leading neural language models to overfit on frequent tokens. 
The token imbalance may dampen the robustness of radiology report generators, as complex medical terms appear less frequently but reflect more medical information. In this study, we demonstrate how current state-of-the-art models fail to generate infrequent tokens on two standard benchmark datasets (IU X-RAY and MIMIC-CXR) of radiology report generation. To solve the challenge, we propose the Token Imbalance Adapter (TIMER), aiming to improve generation robustness on infrequent tokens. The model automatically addresses token imbalance via an unlikelihood loss and dynamically optimizes the generation process to augment infrequent tokens. We compare our approach with multiple state-of-the-art methods on the two benchmarks. Experiments demonstrate the effectiveness of our approach in enhancing model robustness overall and on infrequent tokens. Our ablation analysis shows that our reinforcement learning method has a major effect in adapting token imbalance for radiology report generation.","authors":"Yuexin Wu* (University of Memphis)|I-Chan Huang (St. Jude Children's Research Hospital)|Xiaolei Huang (University of Memphis)","session":"A","title":"Token Imbalance Adaptation for Radiology Report Generation"},{"UID":"P04","abstract":"Federated Learning (FL) is a machine learning approach that allows the model trainer to access more data samples by training across multiple decentralized data sources while enforcing data access constraints. Such trained models can achieve significantly higher performance than is possible when training on a single data source. In an FL setting, none of the training data is ever transmitted to any central location; i.e. sensitive data remains local and private. These characteristics make FL perfectly suited for applications in healthcare, where a variety of compliance constraints restrict how data may be handled. 
Despite these apparent benefits in compliance and privacy, certain scenarios such as heterogeneity of the local data distributions pose significant challenges for FL. Such challenges are even more pronounced in a multilingual setting. This paper presents an FL system for pre-training a large-scale multilingual model suitable for fine-tuning on downstream tasks such as medical entity tagging. Our work represents one of the first such production-scale systems, capable of training across multiple highly heterogeneous data providers, and achieving levels of accuracy that could not be otherwise achieved by using central training with public data only. We also show that the global model performance can be further improved by a local training step.","authors":"Andre Manoel* (Microsoft)|Mirian Del Carmen Hipolito Garcia (Microsoft)|Tal Baumel (Microsoft)|Shize Su (Microsoft)|Jialei Chen (Microsoft)|Robert Sim (Microsoft)|Dan Miller (Airbnb)|Danny Karmon (Google)|Dimitrios Dimitriadis (Amazon)","session":"B","title":"Federated Multilingual Models for Medical Transcript Analysis"},{"UID":"P05","abstract":"Rare life events significantly impact mental health, and their detection in behavioral studies is a crucial step towards health-based interventions. We envision that mobile sensing data can be used to detect these anomalies. However, the human-centered nature of the problem, combined with the infrequency and uniqueness of these events makes it challenging for unsupervised machine learning methods. In this paper, we first investigate Granger causality between life events and human behavior using sensing data. Next, we propose a multi-task framework with an unsupervised autoencoder to capture irregular behavior, and an auxiliary sequence predictor that identifies transitions in workplace performance to contextualize events. 
We perform experiments using data from a mobile sensing study comprising N=126 information workers from multiple industries, spanning 10106 days with 198 rare events (<2%). Through personalized inference, we detect the exact day of a rare event with an F1 of 0.34, demonstrating that our method outperforms several baselines. Finally, we discuss the implications of our work from the context of real-world deployment.","authors":"Arvind Pillai* (Dartmouth College)|Subigya Nepal (Dartmouth College)|Andrew Campbell (Dartmouth College)","session":"B","title":"Rare Life Event Detection via Mobile Sensing Using Multi-Task Learning"},{"UID":"P06","abstract":"Understanding the host-specificity of different families of viruses sheds light on the origin of, e.g., SARS-CoV-2, rabies, and other such zoonotic pathogens in humans. It enables epidemiologists, medical professionals, and policymakers to curb existing epidemics and prevent future ones promptly. In the family Coronaviridae (of which SARS-CoV-2 is a member), it is well-known that the spike protein is the point of contact between the virus and the host cell membrane. On the other hand, the two traditional mammalian orders, Carnivora (carnivores) and Chiroptera (bats) are recognized to be responsible for maintaining and spreading the Rabies Lyssavirus (RABV). We propose Virus2Vec, a feature-vector representation for viral (nucleotide or amino acid) sequences that enable vector-space-based machine learning models to identify viral hosts. Virus2Vec generates numerical feature vectors for unaligned sequences, allowing us to forego the computationally expensive sequence alignment step from the pipeline. Virus2Vec leverages the power of both the minimizer and position weight matrix (PWM) to generate compact feature vectors. Using several classifiers, we empirically evaluate Virus2Vec on real-world spike sequences of Coronaviridae and rabies virus sequence data to predict the host (identifying the reservoirs of infection). 
Our results demonstrate that Virus2Vec outperforms the predictive accuracies of baseline and state-of-the-art methods.","authors":"Sarwan Ali* (Georgia State University)|Babatunde Bello (Georgia State University)|Prakash Chourasia (Georgia State University)|Ria Thazhe Punathil (Georgia State University)|Pin-Yu Chen (IBM Research)|Imdad Ullah Khan (Lahore University of Management Sciences)|Murray Patterson (Georgia State University)","session":"A","title":"Virus2Vec: Viral Sequence Classification Using Machine Learning"},{"UID":"P07","abstract":"We propose a general framework for visualizing any intermediate embedding representation used by any neural survival analysis model. Our framework is based on so-called anchor directions in an embedding space. We show how to estimate these anchor directions using clustering or, alternatively, using user-supplied ``concepts'' defined by collections of raw inputs (e.g., feature vectors all from female patients could encode the concept ``female''). For tabular data, we present visualization strategies that reveal how anchor directions relate to raw clinical features and to survival time distributions. We then show how these visualization ideas extend to handling raw inputs that are images. Our framework is built on looking at angles between vectors in an embedding space, where there could be ``information loss'' by ignoring magnitude information. We show how this loss results in a ``clumping'' artifact that appears in our visualizations, and how to reduce this information loss in practice.","authors":"George H. Chen* (Carnegie Mellon University)","session":"B","title":"A General Framework for Visualizing Embedding Spaces of Neural Survival Analysis Models Based on Angular Information"},{"UID":"P08","abstract":"Federated learning (FL) is an active area of research. One of the most suitable areas for adopting FL is the medical domain, where patient privacy must be respected. 
Previous research, however, does not provide a practical guide to applying FL in the medical domain. We propose empirical benchmarks and experimental settings for three representative medical datasets with different modalities: longitudinal electronic health records, skin cancer images, and electrocardiogram signals. Likely users of FL, such as medical institutions and IT companies, can take these benchmarks as guides for adopting FL and minimize their trial and error. For each dataset, each client's data comes from a different source to preserve real-world heterogeneity. We evaluate six FL algorithms designed for addressing data heterogeneity among clients, and a hybrid algorithm combining the strengths of two representative FL algorithms. Based on experiment results from three modalities, we discover that simple FL algorithms tend to outperform more sophisticated ones, while the hybrid algorithm consistently shows good, if not the best, performance. We also find that a frequent global model update leads to better performance under a fixed training iteration budget. As the number of participating clients increases, higher cost is incurred due to the need for more IT administrators and GPUs, but the performance consistently increases. We expect future users will refer to these empirical benchmarks to design FL experiments in the medical domain considering their clinical tasks and obtain stronger performance with lower costs.","authors":"Hyeonji Hwang* (KAIST)|Seongjun Yang (KRAFTON)|Daeyoung Kim (KAIST)|Radhika Dua (Google Research)|Jong-Yeup Kim (Konyang University)|Eunho Yang (KAIST)|Edward Choi (KAIST)","session":"A","title":"Towards the Practical Utility of Federated Learning in the Medical Domain"},{"UID":"P09","abstract":"Most machine learning models for predicting clinical outcomes are developed using historical data. Yet, even if these models are deployed in the near future, dataset shift over time may result in less than ideal performance. 
To capture this phenomenon, we consider a task---that is, an outcome to be predicted at a particular time point---to be non-stationary if a historical model is no longer optimal for predicting that outcome. We build an algorithm to test for temporal shift either at the population level or within a discovered sub-population. Then, we construct a meta-algorithm to perform a retrospective scan for temporal shift on a large collection of tasks. Our algorithms enable us to perform the first comprehensive evaluation of temporal shift in healthcare to our knowledge. We create 1,010 tasks by evaluating 242 healthcare outcomes for temporal shift from 2015 to 2020 on a health insurance claims dataset. 9.7% of the tasks show temporal shifts at the population level, and 93.0% have some sub-population affected by shifts. We dive into case studies to understand the clinical implications. Our analysis highlights the widespread prevalence of temporal shifts in healthcare.","authors":"Christina X Ji (MIT CSAIL and IMES)|Ahmed Alaa (UC Berkeley and UCSF)|David Sontag (MIT CSAIL and IMES)","session":"A","title":"Large-Scale Study of Temporal Shift in Health Insurance Claims"},{"UID":"P10","abstract":"The recent spike in certified Artificial Intelligence tools for healthcare has renewed the debate around adoption of this technology. One thread of such debate concerns Explainable AI and its promise to render AI devices more transparent and trustworthy. A few voices active in the medical AI space have expressed concerns on the reliability of Explainable AI techniques and especially feature attribution methods, questioning their use and inclusion in guidelines and standards. We characterize the problem as a lack of semantic match between explanations and human understanding. To understand when feature importance can be used reliably, we introduce a distinction between feature importance of low- and high-level features. 
We argue that for data types where low-level features come endowed with a clear semantics, such as tabular data like Electronic Health Records, semantic match can be obtained, and thus feature attribution methods can still be employed in a meaningful and useful way. For high-level features, we sketch a procedure to test whether semantic match has been achieved.","authors":"Giovanni Cin\u00e0* (Amsterdam University Medical Center)|Tabea E. R\u00f6ber (University of Amsterdam)|Rob Goedhart (University of Amsterdam)|\u015e. \u0130lker Birbil (University of Amsterdam)","session":"B","title":"Semantic match: Debugging feature attribution methods in XAI for healthcare"},{"UID":"P11","abstract":"Multivariate biosignals are prevalent in many medical domains, such as electroencephalography, polysomnography, and electrocardiography. Modeling spatiotemporal dependencies in multivariate biosignals is challenging due to (1) long-range temporal dependencies and (2) complex spatial correlations between the electrodes. To address these challenges, we propose representing multivariate biosignals as time-dependent graphs and introduce GRAPHS4MER, a general graph neural network (GNN) architecture that improves performance on biosignal classification tasks by modeling spatiotemporal dependencies in biosignals. Specifically, (1) we leverage the Structured State Space architecture, a state-of-the-art deep sequence model, to capture long-range temporal dependencies in biosignals and (2) we propose a graph structure learning layer in GRAPHS4MER to learn dynamically evolving graph structures in the data. 
We evaluate our proposed model on three distinct biosignal classification tasks and show that GRAPHS4MER consistently improves over existing models, including (1) seizure detection from electroencephalographic signals, outperforming a previous GNN with self-supervised pre-training by 3.1 points in AUROC; (2) sleep staging from polysomnographic signals, a 4.1-point improvement in macro-F1 score compared to existing sleep staging models; and (3) 12-lead electrocardiogram classification, outperforming previous state-of-the-art models by 2.7 points in macro-F1 score.","authors":"Siyi Tang* (Stanford University)|Jared A. Dunnmon (Stanford University)|Liangqiong Qu (University of Hong Kong)|Khaled K. Saab (Stanford University)|Tina Baykaner (Stanford University)|Christopher Lee-Messer (Stanford University)|Daniel L. Rubin (Stanford University)","session":"B","title":"Modeling Multivariate Biosignals With Graph Neural Networks and Structured State Space Models"},{"UID":"P12","abstract":"Time-to-event modelling, known as survival analysis, differs from standard regression as it addresses censoring in patients who do not experience the event of interest. Despite competitive performances in tackling this problem, machine learning methods often ignore other competing risks that preclude the event of interest. This practice biases the survival estimation. Extensions to address this challenge often rely on parametric assumptions or numerical estimations leading to sub-optimal survival approximations. This paper leverages constrained monotonic neural networks to model each competing survival distribution. This modelling choice ensures the exact likelihood maximisation at a reduced computational cost by using automatic differentiation. The effectiveness of the solution is demonstrated on one synthetic and three medical datasets. 
Finally, we discuss the implications of considering competing risks when developing risk scores for medical practice.","authors":"Vincent Jeanselme* (University of Cambridge)|Chang Ho Yoon (University of Oxford)|Brian Tom (University of Cambridge)|Jessica Barrett (University of Cambridge)","session":"A","title":"Neural Fine-Gray: Monotonic neural networks for competing risks"},{"UID":"P13","abstract":"With the availability of large-scale, comprehensive, and general-purpose vision-language (VL) datasets such as MSCOCO, vision-language pre-training (VLP) has become an active area of research and proven to be effective for various VL tasks such as visual-question answering. However, studies on VLP in the medical domain have so far been scanty. To provide a comprehensive perspective on VLP for medical VL tasks, we conduct a thorough experimental analysis to study key factors that may affect the performance of VLP with a unified vision-language Transformer. To allow making sound and quick pre-training decisions, we propose RadioGraphy Captions (RGC), a high-quality, multi-modality radiographic dataset containing 18,434 image-caption pairs collected from an open-access online database MedPix. RGC can be used as a pre-training dataset or a new benchmark for medical report generation and medical image-text retrieval. 
By utilizing RGC and other available datasets for pre-training, we develop several key insights that can guide future medical VLP research and new strong baselines for various medical VL tasks.","authors":"Li Xu* (Hong Kong Polytechnic University)|Bo Liu (Hong Kong Polytechnic University)|Ameer Hamza Khan (Hong Kong Polytechnic University)|Lu Fan (Hong Kong Polytechnic University)|Xiao-Ming Wu (Hong Kong Polytechnic University)","session":"A","title":"Multi-modal Pre-training for Medical Vision-language Understanding and Generation: An Empirical Study with A New Benchmark"},{"UID":"P14","abstract":"Chronic kidney disease (CKD) is a life-threatening and prevalent disease. CKD patients, especially end-stage kidney disease (ESKD) patients on hemodialysis, suffer from kidney failure and are unable to remove excessive fluid, causing fluid overload and multiple morbidities including death. Current solutions for fluid overload monitoring such as ultrasonography and biomarker assessment are cumbersome, discontinuous, and can only be performed in the clinic. In this paper, we propose SRDA, a latent graph learning powered fluid overload detection system based on Sensor Relation Dual Autoencoder to detect excessive fluid consumption of ESKD patients based on passively collected bio-behavioral data from smartwatch sensors. 
Experiments using real-world mobile sensing data indicate that SRDA outperforms the state-of-the-art baselines in both F1 score and recall, and demonstrate the potential of ubiquitous sensing for ESKD fluid intake management.","authors":"Mingyue Tang (University of Virginia)|Jiechao Gao* (University of Virginia)|Guimin Dong (Amazon)|Carl Yang (Emory University)|Brad Campbell (University of Virginia)|Brendan Bowman (University of Virginia)|Jamie Marie Zoellner (University of Virginia)|Emaad Abdel-Rahman (University of Virginia)|Mehdi Boukhechba (The Janssen Pharmaceutical Companies of Johnson & Johnson)","session":"B","title":"SRDA: Mobile Sensing based Fluid Overload Detection for End Stage Kidney Disease Patients using Sensor Relation Dual Autoencoder"},{"UID":"P15","abstract":"Conflict of interest (COI) disclosure statements provide rich information to support transparency and reduce bias in research. We introduce a novel task to identify relationships between sponsoring entities and the research studies they sponsor from the disclosure statement. This task is challenging due to the complexity of recognizing all potential relationship patterns and the hierarchical nature of identifying entities first and then extracting their relationships to the study. To overcome these challenges, we construct a new annotated dataset and propose a Question Answering-based method to recognize entities and extract relationships. Our method demonstrates robustness in handling diverse relationship patterns, and it remains effective even when trained on a low-resource dataset.","authors":"Hardy* (Universitas Mikroskil)|Derek Ruths (McGill University)|Nicholas B King (McGill University)","session":"A","title":"Who Controlled the Evidence? 
Question Answering for Disclosure Information Retrieval"},{"UID":"P16","abstract":"In this paper, we challenge the utility of approved drug indications as a prediction target for machine learning in drug repurposing (DR) studies. Our research highlights two major limitations of this approach: 1) the presence of strong confounding between drug indications and drug characteristics data, which results in shortcut learning, and 2) inappropriate normalization of indications in existing drug-disease association (DDA) datasets, which leads to an overestimation of model performance. We show that the collection patterns of drug characteristics data were similar within drugs of the same category and the Anatomical Therapeutic Chemical (ATC) classification of drugs could be predicted by using the data collection patterns. Furthermore, we confirm that the performance of existing DR models is significantly degraded in the realistic evaluation setting we proposed in this study. We provide realistic data split information for two benchmark datasets, Fdataset and deepDR dataset.","authors":"Siun Kim* (Seoul National University)|Jung-Hyun Won (Seoul National University)|David Seung U Lee (Seoul National University)|Renqian Luo (Microsoft Research)|Lijun Wu (Microsoft Research)|Yingce Xia (Microsoft Research)|Tao Qin (Microsoft Research)|Howard Lee (Seoul National University)","session":"A","title":"Revisiting Machine-Learning based Drug Repurposing: Drug Indications Are Not a Right Prediction Target"},{"UID":"P17","abstract":"Only about one-third of the deaths worldwide are assigned a medically-certified cause, and understanding the causes of deaths occurring outside of medical facilities is logistically and financially challenging. Verbal autopsy (VA) is a routinely used tool to collect information on cause of death in such settings. 
VA is a survey-based method where a structured questionnaire is administered to family members or caregivers of a recently deceased person, and the collected information is used to infer the cause of death. As VA becomes an increasingly routine tool for cause-of-death data collection, the lengthy questionnaire has become a major challenge to the implementation and scale-up of VA interviews as they are costly and time-consuming to conduct. In this paper, we propose a novel active questionnaire design approach that optimizes the order of the questions dynamically to achieve accurate cause-of-death assignment with the smallest number of questions. We propose a fully Bayesian strategy for adaptive question selection that is compatible with any existing probabilistic cause-of-death assignment method. We also develop an early stopping criterion that fully accounts for the uncertainty in the model parameters. We further propose a penalized score to account for constraints and preferences of existing question structures. We evaluate the performance of our active designs using both synthetic and real data, demonstrating that the proposed strategy achieves accurate cause-of-death assignment using considerably fewer questions than the traditional static VA survey instruments.","authors":"Toshiya Yoshida* (University of California Santa Cruz)|Trinity Shuxian Fan (University of Washington)|Tyler McCormick (University of Washington)|Zhenke Wu (University of Michigan)|Zehang Richard Li (University of California Santa Cruz)","session":"A","title":"Bayesian Active Questionnaire Design for Cause-of-Death Assignment Using Verbal Autopsies"},{"UID":"P18","abstract":"High blood pressure is a major risk factor for cardiovascular disease, necessitating accurate blood pressure (BP) measurement. Clinicians measure BP with an invasive arterial catheter or via a non-invasive arm or finger cuff. 
However, the former can cause discomfort to the patient and is unsuitable outside the Intensive Care Unit (ICU). Cuff-based devices, despite being non-invasive, fail to provide continuous measurement, and they measure from peripheral blood vessels whose BP waveforms differ significantly from those proximal to the heart. Hence, there is an urgent need to develop a measurement protocol for converting easily measured non-invasive data into accurate BP values. Addressing this gap, we propose a non-invasive approach to predict BP from arterial area and blood flow velocity signals measured from a Philips ultrasound transducer (XL-143) applied to large arteries close to the heart. We developed the protocol and collected data from 72 subjects. The shape of BP (relative BP) can be theoretically calculated from these waveforms; however, there is no established theory to obtain absolute BP values. To tackle this challenge, we further employ data-driven machine learning models to predict the Mean Arterial Blood Pressure (MAP), from which the absolute BP can be derived. Our study investigates various machine learning algorithms to optimize the prediction accuracy. We find that LSTM, Transformer, and 1D-CNN algorithms using the blood pressure shape and blood flow velocity waveforms as inputs can achieve 8.6, 8.7, and 8.8 mmHg average standard deviation of the prediction error, respectively, without anthropometric data such as age, sex, heart rate, height, and weight. Furthermore, the 1D-CNN model can achieve 7.9 mmHg when anthropometric data is added as inputs, improving upon an anthropometric-only model of 9.5 mmHg. 
This machine learning-based approach, capable of converting ultrasound data into MAP values, presents a promising software tool for physicians in clinical decision-making regarding blood pressure management.","authors":"Jessica Zheng (MIT)|Hanrui Wang* (MIT)|Anand Chandrasekhar (MIT)|Aaron Aguirre (Massachusetts General Hospital and Harvard Medical School)|Song Han (MIT)|Hae-Seung Lee (MIT)|Charles G. Sodini (MIT)","session":"A","title":"Machine Learning for Arterial Blood Pressure Prediction"},{"UID":"P19","abstract":"High blood pressure is a major risk factor for cardiovascular disease, necessitating accurate blood pressure (BP) measurement. Clinicians measure BP with an invasive arterial catheter or via a non-invasive arm or finger cuff. However, the former can cause discomfort to the patient and is unsuitable outside the Intensive Care Unit (ICU). Cuff-based devices, despite being non-invasive, fail to provide continuous measurement, and they measure from peripheral blood vessels whose BP waveforms differ significantly from those proximal to the heart. Hence, there is an urgent need to develop a measurement protocol for converting easily measured non-invasive data into accurate BP values. Addressing this gap, we propose a non-invasive approach to predict BP from arterial area and blood flow velocity signals measured from a Philips ultrasound transducer (XL-143) applied to large arteries close to the heart. We developed the protocol and collected data from 72 subjects. The shape of BP (relative BP) can be theoretically calculated from these waveforms; however, there is no established theory to obtain absolute BP values. To tackle this challenge, we further employ data-driven machine learning models to predict the Mean Arterial Blood Pressure (MAP), from which the absolute BP can be derived. Our study investigates various machine learning algorithms to optimize the prediction accuracy. 
We find that LSTM, Transformer, and 1D-CNN algorithms using the blood pressure shape and blood flow velocity waveforms as inputs can achieve 8.6, 8.7, and 8.8 mmHg average standard deviation of the prediction error, respectively, without anthropometric data such as age, sex, heart rate, height, and weight. Furthermore, the 1D-CNN model can achieve 7.9 mmHg when anthropometric data is added as inputs, improving upon an anthropometric-only model of 9.5 mmHg. This machine learning-based approach, capable of converting ultrasound data into MAP values, presents a promising software tool for physicians in clinical decision-making regarding blood pressure management.","authors":"Iman Deznabi* (University of Massachusetts, Amherst)|Madalina Fiterau (University of Massachusetts, Amherst)","session":"B","title":"MultiWave: Multiresolution Deep Architectures through Wavelet Decomposition for Multivariate Time Series Prediction"},{"UID":"P20","abstract":"Missing values are a fundamental problem in data science. Many datasets have missing values that must be properly handled because the way missing values are treated can have a large impact on the resulting machine learning model. In medical applications, the consequences may affect healthcare decisions. There are many methods in the literature for dealing with missing values, including state-of-the-art methods which often depend on black-box models for imputation. In this work, we show how recent advances in interpretable machine learning provide a new perspective for understanding and tackling the missing value problem. We propose methods based on high-accuracy glass-box Explainable Boosting Machines (EBMs) that can help users (1) gain new insights on missingness mechanisms and better understand the causes of missingness, and (2) detect -- or even alleviate -- potential risks introduced by imputation algorithms. 
Experiments on real-world medical datasets illustrate the effectiveness of the proposed methods.","authors":"Zhi Chen* (Duke University)|Sarah Tan (Cornell University)|Urszula Chajewska (Microsoft Research)|Cynthia Rudin (Duke University)|Rich Caruana (Microsoft Research)","session":"B","title":"Missing Values and Imputation in Healthcare Data: Can Interpretable Machine Learning Help?"},{"UID":"P21","abstract":"Machine learning models perform well on several healthcare tasks and can help reduce the burden on the healthcare system. However, the lack of explainability is a major roadblock to their adoption in hospitals. How can the decision of an ML model be explained to a physician? The explanations considered in this paper are counterfactuals (CFs), hypothetical scenarios that would have resulted in the opposite outcome. Specifically, time-series CFs are investigated, inspired by the way physicians converse and reason out decisions: `I would have given the patient a vasopressor if their blood pressure was lower and falling'. Key properties of CFs that are particularly meaningful in clinical settings are outlined: physiological plausibility, relevance to the task, and sparse perturbations. Past work on CF generation does not satisfy these properties; in particular, it falls short on plausibility, as it does not generate realistic time-series CFs. A variational autoencoder (VAE)-based approach is proposed that captures these desired properties. 
The method produces CFs that improve on prior approaches quantitatively (more plausible CFs as evaluated by their likelihood w.r.t. the original data distribution, and 100x faster at generating CFs) and qualitatively (2x more plausible and relevant) as evaluated by three physicians.","authors":"Supriya Nagesh* (Amazon)|Nina Mishra (Amazon)|Yonatan Naamad (Amazon)|James M Rehg (Georgia Institute of Technology)|Mehul A Shah (Aryn)|Alexei Wagner (Harvard University)","session":"B","title":"Explaining a machine learning decision to physicians via counterfactuals"},{"UID":"P22","abstract":"Noisy training labels can hurt model performance. Most approaches that aim to address label noise assume label noise is independent of the input features. In practice, however, label noise is often feature- or instance-dependent, and therefore biased (i.e., some instances are more likely to be mislabeled than others). E.g., in clinical care, female patients are more likely to be under-diagnosed for cardiovascular disease compared to male patients. Approaches that ignore this dependence can produce models with poor discriminative performance, and in many healthcare settings, can exacerbate issues around health disparities. In light of these limitations, we propose a two-stage approach to learn in the presence of instance-dependent label noise. Our approach utilizes alignment points, a small subset of data for which we know the observed and ground truth labels. On several tasks, our approach leads to consistent improvements over the state-of-the-art in discriminative performance (AUROC) while mitigating bias (area under the equalized odds curve, AUEOC). For example, when predicting acute respiratory failure onset on the MIMIC-III dataset, our approach achieves a harmonic mean (AUROC and AUEOC) of 0.84 (SD [standard deviation] 0.01) while that of the next best baseline is 0.81 (SD 0.01). 
Overall, our approach improves accuracy while mitigating potential bias compared to existing approaches in the presence of instance-dependent label noise.","authors":"Donna Tjandra* (University of Michigan)|Jenna Wiens (University of Michigan)","session":"A","title":"Leveraging an Alignment Set in Tackling Instance-Dependent Label Noise"},{"UID":"P23","abstract":"Fair calibration is a widely desirable fairness criterion in risk prediction contexts. One way to measure and achieve fair calibration is with multicalibration. Multicalibration constrains calibration error among flexibly-defined subpopulations while maintaining overall calibration. However, multicalibrated models can exhibit a higher percent calibration error among groups with lower base rates than groups with higher base rates. As a result, it is possible for a decision-maker to learn to trust or distrust model predictions for specific groups. To alleviate this, we propose proportional multicalibration, a criterion that constrains the percent calibration error among groups and within prediction bins. We prove that satisfying proportional multicalibration bounds a model's multicalibration as well as its differential calibration, a fairness criterion that directly measures how closely a model approximates sufficiency. Therefore, proportionally calibrated models limit the ability of decision makers to distinguish between model performance on different patient groups, which may make the models more trustworthy in practice. We provide an efficient algorithm for post-processing risk prediction models for proportional multicalibration and evaluate it empirically. We conduct simulation studies and investigate a real-world application of PMC-postprocessing to prediction of emergency department patient admissions. 
We observe that proportional multicalibration is a promising criterion for controlling simultaneous measures of calibration fairness of a model over intersectional groups with virtually no cost in terms of classification performance.","authors":"William La Cava* (Boston Children's Hospital and Harvard Medical School)|Elle Lett (Boston Children's Hospital and Harvard Medical School)|Guangya Wan (Boston Children's Hospital and Harvard Medical School)","session":"B","title":"Fair Admission Risk Prediction with Proportional Multicalibration"},{"UID":"P24","abstract":"Machine learning models for healthcare commonly use binary indicator variables to represent the diagnosis of specific health conditions in medical records. However, in populations with significant under-reporting, the absence of a recorded diagnosis does not rule out the presence of a condition, making it difficult to distinguish between negative and missing values. This effect, which we refer to as latent missingness, may lead to model degradation and perpetuate existing biases in healthcare. To address this issue, we propose that healthcare providers and payers allocate a budget towards data collection (e.g., subsidies for check-ups or lab tests). However, given finite resources, only a subset of data points can be collected. Additionally, most models cannot be re-trained after deployment. In this paper, we propose a method for efficient data collection in order to maximize a fixed model's performance on a given population. 
Through simulated and real-world data, we demonstrate the potential value of targeted data collection to address model degradation.","authors":"Kevin Wu* (Stanford University and Optum Labs)|Dominik Dahlem (Optum Labs)|Christopher Hane (Optum Labs)|Eran Halperin (Optum Labs)|James Zou (Stanford University)","session":"A","title":"Collecting data when missingness is unknown: a method for improving model performance given under-reporting in patient populations"},{"UID":"P25","abstract":"Detailed mobile sensing data from phones and fitness trackers offer an opportunity to quantify previously unmeasurable behavioral changes to improve individual health and accelerate responses to emerging diseases. Unlike in natural language processing and computer vision, deep learning has yet to broadly impact this domain, in which the majority of research and clinical applications still rely on manually defined features or even forgo predictive modeling altogether due to insufficient accuracy. This is due to unique challenges in the behavioral health domain, including very small datasets (~10^1 participants), which frequently contain missing data, consist of long time series with critical long-range dependencies (length < 10^4), and extreme class imbalances (>10^3:1). Here, we describe a neural architecture for multivariate time series classification designed to address these unique domain challenges. Our proposed behavioral representation learning approach combines novel tasks for self-supervised pretraining and transfer learning to address data scarcity, and captures long-range dependencies across long-history time series through transformer self-attention following convolutional neural network-based dimensionality reduction. We propose an evaluation framework aimed at reflecting expected real-world performance in plausible deployment scenarios. 
Concretely, we demonstrate (1) performance improvements over baselines of up to 0.15 ROC AUC across five influenza-related prediction tasks, (2) transfer learning-induced performance improvements including a 16% relative increase in PR AUC in small data scenarios, and (3) the potential of transfer learning in novel disease scenarios through an exploratory case study of zero-shot COVID-19 prediction in an independent data set. Finally, we discuss potential implications for medical surveillance testing.","authors":"Mike A Merrill* (University of Washington)|Tim Althoff (University of Washington)","session":"A","title":"Self-Supervised Pretraining and Transfer Learning Enable Flu and COVID-19 Predictions in Small Mobile Sensing Datasets"},{"UID":"P26","abstract":"Given the complexity of trauma presentations, particularly in those involving multiple areas of the body, overlooked injuries are common during the initial assessment by a clinician. We are motivated to develop an automated trauma pattern discovery framework for comprehensive identification of injury patterns which may eventually support diagnostic decision-making. We analyze 1,162,399 patients from the Trauma Quality Improvement Program with a disentangled variational autoencoder, weakly supervised by a latent-space classifier of auxiliary features. We also develop a novel scoring metric that serves as a proxy for clinical intuition in extracting clusters with clinically meaningful injury patterns. We validate the extracted clusters with clinical experts, and explore the patient characteristics of selected groupings. Our metric is able to perform model selection and effectively filter clusters for clinically-validated relevance.","authors":"Qixuan Jin* (Massachusetts Institute of Technology)|Jacobien Oosterhoff (Delft University of Technology)|Yepeng Huang (Harvard School of Public Health)|Marzyeh Ghassemi (Massachusetts Institute of Technology)|Gabriel A. 
Brat (Beth Israel Deaconess Medical Center and Harvard Medical School)","session":"B","title":"Clinical Relevance Score for Guided Trauma Injury Pattern Discovery with Weakly Supervised \u03b2-VAE"},{"UID":"P27","abstract":"Machine learning (ML) models deployed in healthcare systems must face data drawn from continually evolving environments. However, researchers proposing such models typically evaluate them in a time-agnostic manner, splitting datasets according to patients sampled randomly throughout the entire study time period. This work proposes the Evaluation on Medical Datasets Over Time (EMDOT) framework, which evaluates the performance of a model class across time. Inspired by the concept of backtesting, EMDOT simulates possible training procedures that practitioners might have been able to execute at each point in time and evaluates the resulting models on all future time points. Evaluating both linear and more complex models on six distinct medical data sources (tabular and imaging), we show how, depending on the dataset, using all historical data may be ideal in many cases, whereas using a window of the most recent data could be advantageous in others. In datasets where models suffer from sudden degradations in performance, we investigate plausible explanations for these shocks. We release the EMDOT package to help facilitate further work in deployment-oriented evaluation over time.","authors":"Helen Zhou* (Carnegie Mellon University)|Yuwen Chen (Carnegie Mellon University)|Zachary Chase Lipton (Carnegie Mellon University)","session":"B","title":"Evaluating Model Performance in Medical Datasets Over Time"},{"UID":"P28","abstract":"Making full use of the abundant information in electronic health records (EHR) is rapidly becoming an important topic in the medical domain. Recent work presented a promising framework that embeds entire features in raw EHR data regardless of its form and medical code standards. 
The framework, however, only focuses on encoding EHR with minimal preprocessing and fails to consider how to learn efficient EHR representations in terms of computation and memory usage. In this paper, we search for a versatile encoder that not only reduces the large data into a manageable size but also preserves the core information of patients needed to perform diverse clinical tasks. We find that a hierarchically structured Convolutional Neural Network (CNN) often outperforms the state-of-the-art model on diverse tasks such as reconstruction, prediction, and generation, even with fewer parameters and less training time. Moreover, it turns out that making use of the inherent hierarchy of EHR data can boost the performance of any backbone model and clinical task performed. Through extensive experiments, we present concrete evidence to generalize our research findings into real-world practice. We give a clear guideline on building the encoder based on the research findings captured while exploring numerous settings.","authors":"Eunbyeol Cho* (KAIST)|Min Jae Lee (KAIST)|Kyunghoon Hur (KAIST)|Jiyoun Kim (KAIST)|Jinsung Yoon (Google Cloud AI Research)|Edward Choi (KAIST)","session":"A","title":"Rediscovery of CNN's Versatility for Text-based Encoding of Raw Electronic Health Records"},{"UID":"P29","abstract":"Despite increased interest in wearables as tools for detecting various health conditions, there are as of yet no large public benchmarks for such mobile sensing data. The few datasets that are available do not contain data from more than dozens of individuals, do not contain high-resolution raw data, or do not include dataloaders for easy integration into machine learning pipelines. Here, we present Homekit2020: the first large-scale public benchmark for time series classification of wearable sensor data. 
Our dataset contains over 14 million hours of minute-level multimodal Fitbit data, symptom reports, and ground-truth laboratory PCR influenza test results, along with an evaluation framework that mimics realistic model deployments and efficiently characterizes statistical uncertainty in model selection in the presence of extreme class imbalance. Furthermore, we implement and evaluate nine neural and non-neural time series classification models on our benchmark across 450 total training runs in order to establish state of the art performance.","authors":"Mike A Merrill (University of Washington)|Esteban Safranchik* (University of Washington)|Arinbj\u00f6rn Kolbeinsson (Evidation Health)|Piyusha Gade (Evidation Health)|Ernesto Ramirez (Evidation Health)|Ludwig Schmidt (University of Washington)|Luca Foschini (Sage Bionetworks)|Tim Althoff (University of Washington)","session":"B","title":"Homekit2020: A Benchmark for Time Series Classification on a Large Mobile Sensing Dataset with Laboratory Tested Ground Truth of Influenza Infections"},{"UID":"P30","abstract":"The human brain is the central hub of the neurobiological system, controlling behavior and cognition in complex ways. Recent advances in neuroscience and neuroimaging analysis have shown a growing interest in the interactions between brain regions of interest (ROIs) and their impact on neural development and disorder diagnosis. As a powerful deep model for analyzing graph-structured data, Graph Neural Networks (GNNs) have been applied for brain network analysis. However, training deep models requires large amounts of labeled data, which is often scarce in brain network datasets due to the complexities of data acquisition and sharing restrictions. To make the most out of available training data, we propose PTGB, a GNN pre-training framework that captures intrinsic brain network structures, regardless of clinical outcomes, and is easily adaptable to various downstream tasks. 
PTGB comprises two key components: (1) an unsupervised pre-training technique designed specifically for brain networks, which enables learning from large-scale datasets without task-specific labels; (2) a data-driven parcellation atlas mapping pipeline that facilitates knowledge transfer across datasets with different ROI systems. Extensive evaluations using various GNN models have demonstrated the robust and superior performance of PTGB compared to baseline methods.","authors":"Yi Yang* (Emory University)|Hejie Cui (Emory University)|Carl Yang (Emory University)","session":"B","title":"PTGB: Pre-Train Graph Neural Networks for Brain Network Analysis"},{"UID":"P31","abstract":"Although recent advances in scaling large language models (LLMs) have resulted in improvements on many NLP tasks, it remains unclear whether these models trained primarily with general web text are the right tool in highly specialized, safety critical domains such as clinical text. Recent results have suggested that LLMs encode a surprising amount of medical knowledge. This raises an important question regarding the utility of smaller domain-specific language models. With the success of general-domain LLMs, is there still a need for specialized clinical models? To investigate this question, we conduct an extensive empirical analysis of 12 language models, ranging from 220M to 175B parameters, measuring their performance on 3 different clinical tasks that test their ability to parse and reason over electronic health records. As part of our experiments, we train T5-Base and T5-Large models from scratch on clinical notes from MIMIC III and IV to directly investigate the efficiency of clinical tokens. We show that relatively small specialized clinical models substantially outperform all in-context learning approaches, even when finetuned on limited annotated data. 
Further, we find that pretraining on clinical tokens allows for smaller, more parameter-efficient models that either match or outperform much larger language models trained on general text. We release the code and the models used under the PhysioNet Credentialed Health Data license and data use agreement.","authors":"Eric Lehman* (MIT and Xyla)|Evan Hernandez (MIT and Xyla)|Diwakar Mahajan (IBM Research)|Jonas Wulff (Xyla)|Micah J. Smith (Xyla)|Zachary Ziegler (Xyla)|Daniel Nadler (Xyla)|Peter Szolovits (MIT)|Alistair Johnson (The Hospital for Sick Children)|Emily Alsentzer (Brigham and Women's Hospital and Harvard Medical School)","session":"A","title":"Do We Still Need Clinical Language Models?"},{"UID":"P32","abstract":"Electrodermal activity (EDA) is a biosignal that contains valuable information for monitoring health conditions related to sympathetic nervous system activity. Analyzing ambulatory EDA data is challenging because EDA measurements tend to be noisy and sparsely labeled. To address this problem, we present the first study of contrastive learning that examines approaches that are tailored to the EDA signal. We present a novel set of data augmentations that are tailored to EDA, and use them to generate positive examples for unsupervised contrastive learning. We evaluate our proposed approach on the downstream task of stress detection. We find that it outperforms baselines when used both for fine-tuning and for transfer learning, especially in regimes of high label sparsity. 
We verify that our novel EDA-specific augmentations add considerable value beyond those considered in prior work through a set of ablation experiments.","authors":"Katie Matton* (MIT CSAIL and MIT Media Lab)|Robert A Lewis* (MIT Media Lab)|John Guttag (MIT CSAIL)|Rosalind Picard (MIT Media Lab)","session":"B","title":"Contrastive Learning of Electrodermal Activity Representations for Stress Detection"},{"UID":"P33","abstract":"Type 2 diabetes mellitus (T2D) affects over 530 million people globally and is often difficult to manage, leading to serious health complications. Continuous glucose monitoring (CGM) can help people with T2D to monitor and manage the disease. CGM devices sample an individual's glucose level at frequent intervals, enabling sophisticated characterization of an individual's health. In this work, we leverage a large dataset of CGM data (5,447 individuals and 940,663 days of data) paired with health records and activity data to investigate how glucose levels in people with T2D are affected by external factors like weather conditions, extreme weather events, and temporal events including local holidays. We find temperature (p=2.37x10^-8, n=3561), holidays (p=2.23x10^-46, n=4079), and weekends (p=7.64x10^-124, n=5429) each have a significant effect on standard glycemic metrics at a population level. Moreover, we show that we can predict whether an individual will be significantly affected by a (potentially unobserved) external event using only demographic information and a few days of CGM and activity data. Using random forest classifiers, we can predict whether an individual will be more negatively affected than a typical individual with T2D by a given external factor with respect to a given glycemic metric. We find performance (measured as ROC-AUC) is consistently above chance (across classifiers, median ROC-AUC=0.63). Performance is highest for classifiers predicting the effect of time-in-range (median ROC-AUC=0.70). 
These are important findings because they may enable better patient care management with day-to-day risk assessments based on external factors as well as improve algorithm development by reducing train- and test-time bias due to external factors.","authors":"Kailas Vodrahalli* (Stanford University)|Gregory D. Lyng (Optum AI Labs)|Brian L. Hill (Optum AI Labs)|Kimmo Karkkainen (Optum AI Labs)|Jeffrey Hertzberg (Optum AI Labs)|James Zou (Stanford University)|Eran Halperin (Optum AI Labs)","session":"B","title":"Understanding and Predicting the Effect of Environmental Factors on People with Type 2 Diabetes"}],"speakers":[{"UID":"S06","abstract":"The traditional knowledge-based approaches to question answering might seem irrelevant now that Neural QA, particularly Large Language Models show almost human performance in question answering. Knowing what was successful in the past and which elements are essential to getting the right answers, however, is needed to inform further developments in the neural approaches and help address the known shortcomings of LLMs. This talk, therefore, will provide an overview of the approaches to biomedical question answering as they were evolving. It will cover information needs of various stakeholders and the resources created to address these information needs through Question Answering.","bio":"Dina Demner-Fushman, MD, PhD is a Tenure Track Investigator in the Computational Health Research Branch at LHNCBC. She specializes in artificial intelligence and natural language processing, with a focus on information extraction and textual data analysis, EMR data analysis, and image and text retrieval for clinical decision support and education. Dr. Demner-Fushman's research aims to improve healthcare through the development of computational methods that can process and analyze clinical data more effectively. 
Her research led to the current iteration of the MEDLINE resource, which helps people navigate a plethora of NLM resources, as well as Open-i, which helps users find biomedical images.","image":"static/images/speakers/dina_demner_fushman.jpg","institution":"National Institutes of Health","slideslive_active_date":"","slideslive_id":"","speaker":"Dina Demner-Fushman","title":"Biomedical Question Answering Yesterday, Today, and Tomorrow"},{"UID":"S02","abstract":"Artificial intelligence could fundamentally transform clinical workflows in image-based diagnostics and population screening, promising more objective, accurate and effective analysis of medical images. A major hurdle for using medical imaging AI in clinical practice, however, is the assurance that it is safe for patients and continues to be safe after deployment. Differences in patient populations and changes in the data acquisition pose challenges to today's AI algorithms. In this talk we will discuss AI safeguards from the perspective of robustness, reliability, and fairness. We will explore approaches for automatic failure detection, monitoring of performance, and analysis of bias, aiming to ensure the safe and ethical use of medical imaging AI.","bio":"Ben Glocker is Professor in Machine Learning for Imaging and Kheiron Medical Technologies / Royal Academy of Engineering Research Chair in Safe Deployment of Medical Imaging AI. He co-leads the Biomedical Image Analysis Group, leads the HeartFlow-Imperial Research Team, and is Head of ML Research at Kheiron. 
His research is at the intersection of medical imaging and artificial intelligence aiming to build safe and ethical computational tools for improving image-based detection and diagnosis of disease.","image":"static/images/speakers/ben_glocker.jpg","institution":"Imperial College London","slideslive_active_date":"","slideslive_id":"","speaker":"Ben Glocker","title":"Safe Deployment of Medical Imaging AI"},{"UID":"S01","abstract":"Biological sequences, like DNA and protein sequences, encode genetic information essential to life. In recent times, deep learning techniques have transformed biomedical research and applications by modeling the intricate patterns in these sequences. Successful models like AlphaFold and Enformer have paved the way for accurate end-to-end prediction of complex molecular phenotypes from sequences. Such models have profound impact on biomedical research and applications, ranging from understanding basic biology to facilitating drug discovery. This talk will provide an overview of the current techniques and status of biological sequences modeling. Additionally, specific applications of such models in genetics and immunology will be discussed.","bio":"Jun Cheng is a Senior Research Scientist at DeepMind. His research focused on developing machine learning methods to better understand the genetic code and disease mechanisms. Before that, he was a scientist at NEC Labs Europe, where he worked on personalized cancer vaccines. His work has been published in venues such as Genome Biology, Bioinformatics, and Nature Biotechnology. 
He received his PhD in computational biology from the Technical University of Munich.","image":"static/images/speakers/jun_cheng.jpg","institution":"DeepMind","slideslive_active_date":"","slideslive_id":"","speaker":"Jun Cheng","title":"Biological Sequence Modeling in Research and Applications"},{"UID":"S03","abstract":"Digital traces, such as social media data, supported with advances in the artificial intelligence (AI) and machine learning (ML) fields, are increasingly being used to understand the mental health of individuals, communities, and populations. However, such algorithms do not exist in a vacuum -- there is an intertwined relationship between what an algorithm does and the world it exists in. Consequently, with algorithmic approaches offering promise to change the status quo in mental health for the first time since mid-20th century, interdisciplinary collaborations are paramount. But what are some paradigms of engagement for AL/ML researchers that augment existing algorithmic capabilities while minimizing the risk of harm? Adopting a social ecological lens, this talk will describe the experiences from working with different stakeholders in research initiatives relating to digital mental health \u2013 including with healthcare providers, grassroots advocacy and public health organizations, and people with the lived experience of mental illness. The talk hopes to present some lessons learned by way of these engagements, and to reflect on a path forward that empowers us to go beyond technical innovations to envisioning contributions that center humans\u2019 needs, expectations, values, and voices within those technical artifacts.","bio":"Munmun De Choudhury is an Associate Professor of Interactive Computing at Georgia Tech. Dr. De Choudhury is best known for laying the foundation of a new line of research that develops computational techniques towards understanding and improving mental health outcomes, through ethical analysis of social media data. 
To do this work, she adopts a highly interdisciplinary approach, combining social computing, machine learning, and natural language analysis with insights and theories from the social, behavioral, and health sciences. Dr. De Choudhury has been recognized with the 2023 SIGCHI Societal Impact Award, the 2022 Web Science Trust Test-of-Time Award, the 2021 ACM-W Rising Star Award, the 2019 Complex Systems Society \u2013 Junior Scientific Award, numerous best paper and honorable mention awards from the ACM and AAAI, and features and coverage in popular press like the New York Times, the NPR, and the BBC. Earlier, Dr. De Choudhury was a faculty associate with the Berkman Klein Center for Internet and Society at Harvard, a postdoc at Microsoft Research, and obtained her PhD in Computer Science from Arizona State University.","image":"static/images/speakers/munmun_de_choudhury.jpg","institution":"Georgia Tech","slideslive_active_date":"","slideslive_id":"","speaker":"Munmun De Choudhury","title":"Bridging Machine Learning and Collaborative Action Research: A Tale Engaging with Diverse Stakeholders in Digital Mental Health"},{"UID":"S05","abstract":"TBD","bio":"Dina Katabi is the Thuan and Nicole Pham Professor of Electrical Engineering and Computer Science at MIT. She is also the director of the MIT\u2019s Center for Wireless Networks and Mobile Computing, a member of the National Academy of Engineering, and a recipient of the MacArthur Genius Award. Professor Katabi received her PhD and MS from MIT in 2003 and 1999, and her Bachelor of Science from Damascus University in 1995. Katabi's research focuses on innovations in digital health, applied machine learning and wireless sensors and networks. Her research has been recognized with ACM Prize in Computing, the ACM Grace Murray Hopper Award, two SIGCOMM Test-of-Time Awards, the Faculty Research Innovation Fellowship, a Sloan Fellowship, the NBX Career Development chair, and the NSF CAREER award. 
Her students received the ACM Best Doctoral Dissertation Award in Computer Science and Engineering twice. Further, her work was recognized by the IEEE William R. Bennett prize, three ACM SIGCOMM Best Paper awards, an NSDI Best Paper award and a TR10 award. Several start-ups have been spun out of Katabi's lab, such as PiCharging and Emerald.","image":"static/images/speakers/dina_katabi.jpg","institution":"MIT","slideslive_active_date":"","slideslive_id":"","speaker":"Dina Katabi","title":"A Healthcare Platform Powered by ML and Radio Waves"},{"UID":"S07","abstract":"Artificial intelligence tools have been touted as having performance \"on par\" with board certified dermatologists. However, these published claims have not translated to real world practice. In this talk, I will discuss the opportunities and challenges for AI in dermatology.","bio":"Dr. Roxana Daneshjou received her undergraduate degree at Rice University in Bioengineering, where she was recognized as a Goldwater Scholar for her research. She completed her MD/PhD at Stanford, where she worked in the lab of Dr. Russ Altman. During this time, she was a Howard Hughes Medical Institute Medical Scholar and a Paul and Daisy Soros Fellowship for New Americans Fellow. She completed dermatology residency at Stanford in the research track and now practices dermatology as a Clinical Scholar in Stanford's Department of Dermatology while also conducting artificial intelligence research with Dr. James Zou as a postdoc in Biomedical Data Science. She is an incoming assistant professor of biomedical data science and dermatology at Stanford in Fall of 2023. 
Her research interests are in developing diverse datasets and fair algorithms for applications in precision medicine.","image":"static/images/speakers/roxana_daneshjou.jpg","institution":"Stanford University","slideslive_active_date":"","slideslive_id":"","speaker":"Roxana Daneshjou","title":"Skin in the Game: The State of AI in Dermatology"},{"UID":"S04","abstract":"Our society remains profoundly unequal. This talk discusses how data science and machine learning can be used to combat inequality in health care and public health by presenting several vignettes from domains like medical testing and cancer risk prediction.","bio":"Emma Pierson is an assistant professor of computer science at the Jacobs Technion-Cornell Institute at Cornell Tech and the Technion, and a computer science field member at Cornell University. She holds a secondary joint appointment as an Assistant Professor of Population Health Sciences at Weill Cornell Medical College. She develops data science and machine learning methods to study inequality and healthcare. Her work has been recognized by best paper, poster, and talk awards, an NSF CAREER award, a Rhodes Scholarship, Hertz Fellowship, Rising Star in EECS, MIT Technology Review 35 Innovators Under 35, and Forbes 30 Under 30 in Science. Her research has been published at venues including ICML, KDD, WWW, Nature, and Nature Medicine, and she has also written for The New York Times, FiveThirtyEight, Wired, and various other publications.","image":"static/images/speakers/emma_pierson.jpeg","institution":"Cornell Tech","slideslive_active_date":"","slideslive_id":"","speaker":"Emma Pierson","title":"Using Machine Learning to Increase Equity in Healthcare and Public Health"}]},"has_data":true,"has_summary":{"2020":true,"2021":true,"2022":true,"2023":true},"years_list":["2020","2021","2022","2023"]}
diff --git a/serve_committee.json b/serve_committee.json
new file mode 100644
index 000000000..1536ca53d
--- /dev/null
+++ b/serve_committee.json
@@ -0,0 +1 @@
+{"committee":[{"aff":"MIT","img":"marzyeh-ghassemi.jpg","name":"Marzyeh Ghassemi","role":"General Chair","url":"https://healthyml.org/marzyeh/"},{"aff":"Harvard Medical School","img":"matthew-mcdermott.jpg","name":"Matthew McDermott","role":"General Chair","url":"https://dbmi.hms.harvard.edu/people/matthew-mcdermott"},{"aff":"Weill Cornell Medicine","img":"fei-wang.jpg","name":"Fei Wang","role":"Program Chair","url":"https://wcm-wanglab.github.io/"},{"aff":"UC Berkeley","img":"irene-chen.jpg","name":"Irene Chen","role":"Program Chair","url":"https://irenechen.net/"},{"aff":"Carnegie Mellon University","img":"george-chen.jpg","name":"George Chen","role":"Program Sub-Chair, Research Roundtables","url":"https://www.andrew.cmu.edu/user/georgech/"},{"aff":"University of British Columbia","img":"xiaoxiao-li.jpg","name":"Xiaoxiao Li","role":"Program Sub-Chair, Research Roundtables","url":"https://bmiai.ubc.ca/people/xiaoxiao-li"},{"aff":"Duke University","img":"monica-agrawal.jpg","name":"Monica Agrawal","role":"Program Sub-Chair, Doctoral Symposium","url":"https://www.linkedin.com/in/monica-agrawal-bb73a874/"},{"aff":"University of Cambridge","img":"emma-rocheteau.jpg","name":"Emma Rocheteau","role":"Program Sub-Chair, Doctoral Symposium","url":"https://emmarocheteau.com/"},{"aff":"MIT","img":"tom-pollard.jpg","name":"Tom Pollard","role":"Proceedings Chair","url":"https://people.csail.mit.edu/tpollard/"},{"aff":"KAIST","img":"edward-choi.jpg","name":"Edward Choi","role":"Proceedings Chair","url":"https://mp2893.com/"},{"aff":"University of Pennsylvania","img":"pankhuri-singhal.jpg","name":"Pankhuri Singhal","role":"Proceedings Sub-Chair, Workflow","url":"https://pankhurisinghal.com/"},{"aff":"Tufts University","img":"michael-c-hughes.jpg","name":"Michael C. 
Hughes","role":"Senior Area Chair, Track 1","url":"https://www.michaelchughes.com/"},{"aff":"FDA","img":"elena-sizikova.jpg","name":"Elena Sizikova","role":"Senior Area Chair, Track 2","url":"https://esizikova.github.io/"},{"aff":"Texas A&M University","img":"bobak-mortazavi.jpg","name":"Bobak Mortazavi","role":"Senior Area Chair, Track 3","url":"https://stmilab.github.io/"},{"aff":"Weill Cornell Medicine","img":"zehra-abedi.jpg","name":"Zehra Abedi","role":"Logistics Chair","url":"https://aidh.weill.cornell.edu"},{"aff":"Weill Cornell Medicine","img":"chengxi-zang.jpg","name":"Chengxi Zang","role":"Logistics Sub-Chair, Local","url":"https://www.calvinzang.com/"},{"aff":"Columbia University","img":"kaveri-thakoor.jpg","name":"Kaveri Thakoor","role":"Logistics Sub-Chair, Local","url":"https://www.ai4vslab.org/"},{"aff":"Columbia University","img":"roshan-kenia.jpg","name":"Roshan Kenia","role":"Logistics Sub-Chair, Local","url":"https://www.linkedin.com/in/roshan-kenia/"},{"aff":"MIT","img":"elizabeth-healey.jpg","name":"Elizabeth Healey","role":"Logistics Sub-Chair, Volunteer Coordinator","url":"https://www.linkedin.com/in/elizabeth-healey-408368115"},{"aff":"UC Berkeley and UCSF","img":"ahmed-alaa.jpg","name":"Ahmed Alaa","role":"Finance Chair","url":"https://ahmedmalaa.github.io/"},{"aff":"Georgia Tech","img":"kai-wang.jpg","name":"Kai Wang","role":"Finance Chair","url":"https://guaguakai.com/"},{"aff":"Northeastern University","img":"monica-munnangi.jpg","name":"Monica Munnangi","role":"Comms Chair","url":"https://monicamunnangi.github.io/"},{"aff":"MIT, CSAIL","img":"jiacheng-zhu.jpg","name":"Jiacheng Zhu","role":"Comms Sub-Chair","url":"https://jiachengzhuml.github.io/"},{"aff":"MIT","img":"brian-gow.jpg","name":"Brian Gow","role":"Comms Sub-Chair","url":"https://www.linkedin.com/in/gowbrian/"},{"aff":"Northeastern University","img":"koyena-pal.jpg","name":"Koyena Pal","role":"Comms Sub-Chair","url":"https://koyenapal.github.io/"}]}
diff --git a/serve_config.json b/serve_config.json
new file mode 100644
index 000000000..64dc6722a
--- /dev/null
+++ b/serve_config.json
@@ -0,0 +1 @@
+{"analytics":"UA-","auth0_authorize_endpoint":"https://chil.us.auth0.com/authorize","auth0_client_id":"idrCis50v8uMru0k189CZY3LXVK9tUnY","auth0_domain":"chil.us.auth0.com","background_image":"/static/images/main.jpg","calendar":{"colors":{"---":"#bed972","qa":"#4FCAEB","talk":"#ccc"},"sunday_saturday":false},"chat_server":"chat.chilconference.org","citation_date":"June 2024","countdown_date":"June, 2024","date":"June 27 - 28, 2024 New York City, NY, United States","default_chat_channel":"general","default_poster_pdf":"/static/images/GLTR_poster.pdf","default_presentation_id":38954751,"logo":{"alt_text":"CHIL logo","height":"50px","image":null,"width":"auto"},"name":"CHIL 2024","organization":"CHIL Organization Committee","page_title":{"prefix":"CHIL","separator":": "},"proceedings_title":"Proceedings of CHIL 2024","registration":{"link_text":"Registration opens February 2023","url":"register.html"},"site_title":"CHIL 2024","tagline":"Conference on Health, Inference, and Learning","use_auth0":true}
diff --git a/serve_debates.json b/serve_debates.json
new file mode 100644
index 000000000..170ee1f92
--- /dev/null
+++ b/serve_debates.json
@@ -0,0 +1 @@
+[{"UID":"D01","abstract":"","bio":"Dr. Chute is the Bloomberg Distinguished Professor of Health Informatics, Professor of Medicine, Public Health, and Nursing at Johns Hopkins University, and Chief Research Information Officer for Johns Hopkins Medicine. He is also Section Head of Biomedical Informatics and Data Science and Deputy Director of the Institute for Clinical and Translational Research. He received his undergraduate and medical training at Brown University, internal medicine residency at Dartmouth, and doctoral training in Epidemiology and Biostatistics at Harvard. He is Board Certified in Internal Medicine and Clinical Informatics, and an elected Fellow of the American College of Physicians, the American College of Epidemiology, HL7, the American Medical Informatics Association, and the American College of Medical Informatics (ACMI), as well as a Founding Fellow of the International Academy of Health Sciences Informatics; he was president of ACMI 2017-18. He is an elected member of the Association of American Physicians. His career has focused on how we can represent clinical information to support analyses and inferencing, including comparative effectiveness analyses, decision support, best evidence discovery, and translational research. He has had a deep interest in the semantic consistency of health data, harmonized information models, and ontology. His current research focuses on translating basic science information to clinical practice, how we classify dysfunctional phenotypes (disease), and the harmonization and rendering of real-world clinical data including electronic health records to support data inferencing. He became founding Chair of Biomedical Informatics at Mayo Clinic in 1988, retiring from Mayo in 2014, where he remains an emeritus Professor of Biomedical Informatics. 
He is presently PI on a spectrum of high-profile informatics grants from NIH spanning translational science including co-lead on the National COVID Cohort Collaborative (N3C). He has been active on many HIT standards efforts and chaired ISO Technical Committee 215 on Health Informatics and chaired the World Health Organization (WHO) International Classification of Disease Revision (ICD-11).","image":"static/images/speakers/christopher_chute.jpg","institution":"Johns Hopkins University","slideslive_active_date":"","slideslive_id":"","speaker":"Christopher Chute","title":"Network studies: As many databases as possible or enough to answer the question quickly?"},{"UID":"D02","abstract":"","bio":"Robert Platt is Professor in the Departments of Epidemiology, Biostatistics, and Occupational Health, and of Pediatrics, at McGill University. He holds the Albert Boehringer I endowed chair in Pharmacoepidemiology, and is Principal Investigator of the Canadian Network for Observational Drug Effect Studies (CNODES). His research focuses on improving statistical methods for the study of medications using administrative data, with a substantive focus on medications in pregnancy. Dr. Platt is an editor-in-chief of Statistics in Medicine and is on the editorial boards of the American Journal of Epidemiology and Pharmacoepidemiology and Drug Safety. He has published over 400 articles, one book and several book chapters on biostatistics and epidemiology.","image":"static/images/speakers/robert_platt.jpg","institution":"McGill University","slideslive_active_date":"","slideslive_id":"","speaker":"Robert Platt","title":"Network studies: As many databases as possible or enough to answer the question quickly?"},{"UID":"D03","abstract":"","bio":"Tianxi Cai is John Rock Professor of Translational Data Science at Harvard, with joint appointments in the Biostatistics Department and the Department of Biomedical Informatics. 
She directs the Translational Data Science Center for a Learning Health System at Harvard Medical School and co-directs the Applied Bioinformatics Core at VA MAVERIC. She is a major player in developing analytical tools for mining multi-institutional EHR data, real world evidence, and predictive modeling with large scale biomedical data. Tianxi received her Doctor of Science in Biostatistics at Harvard and was an assistant professor at the University of Washington before returning to Harvard as a faculty member in 2002.","image":"static/images/speakers/t-cai.jpg","institution":"Harvard Medical School","slideslive_active_date":"","slideslive_id":"","speaker":"Tianxi Cai","title":"Data Heterogeneity: More Heterogeneous Data or Less Homogeneous Data?"},{"UID":"D04","abstract":"","bio":"Dr. Yong Chen is Professor of Biostatistics at the Department of Biostatistics, Epidemiology, and Informatics at the University of Pennsylvania (Penn). He directs a Computing, Inference and Learning Lab at the University of Pennsylvania, which focuses on integrating fundamental principles and the wisdom of statistics into quantitative methods for tackling key challenges in modern biomedical data. Dr. Chen is an expert in synthesis of evidence from multiple data sources, including systematic review and meta-analysis, distributed algorithms, and data integration, with applications to comparative effectiveness studies, health policy, and precision medicine. He has published over 170 peer-reviewed papers in a wide spectrum of methodological and clinical areas. During the pandemic, Dr. Chen has been serving as Director of the Biostatistics Core for Pediatric PASC of the RECOVER COVID initiative, which is a national multi-center RWD-based study on Post-Acute Sequelae of SARS CoV-2 infection (PASC), involving more than 13 million patients across more than 10 health systems. 
He is an elected fellow of the American Statistical Association and the American Medical Informatics Association, an elected member of the International Statistical Institute, and an elected member of the Society for Research Synthesis Methodology.","image":"static/images/speakers/yong_chen.png","institution":"University of Pennsylvania","slideslive_active_date":"","slideslive_id":"","speaker":"Yong Chen","title":"Data Heterogeneity: More Heterogeneous Data or Less Homogeneous Data?"},{"UID":"D05","abstract":"","bio":"Dr. Khaled El Emam is the Canada Research Chair (Tier 1) in Medical AI at the University of Ottawa, where he is a Professor in the School of Epidemiology and Public Health. He is also a Senior Scientist at the Children\u2019s Hospital of Eastern Ontario Research Institute and Director of the multi-disciplinary Electronic Health Information Laboratory, conducting research on privacy enhancing technologies to enable the sharing of health data for secondary purposes, including synthetic data generation and de-identification methods. Khaled is a co-founder of Replica Analytics, a company that develops synthetic data generation technology, which was recently acquired by Aetion. As an entrepreneur, Khaled founded or co-founded six product and services companies involved with data management and data analytics, with some having successful exits. Prior to his academic roles, he was a Senior Research Officer at the National Research Council of Canada. He also served as the head of the Quantitative Methods Group at the Fraunhofer Institute in Kaiserslautern, Germany. He participates in a number of committees, including the European Medicines Agency Technical Anonymization Group, the Panel on Research Ethics advising on the TCPS, and the Strategic Advisory Council of the Office of the Information and Privacy Commissioner of Ontario, and is co-editor-in-chief of the JMIR AI journal. 
In 2003 and 2004, he was ranked as the top systems and software engineering scholar worldwide by the Journal of Systems and Software based on his research on measurement and quality evaluation and improvement. He held the Canada Research Chair in Electronic Health Information at the University of Ottawa from 2005 to 2015. Khaled has a PhD from the Department of Electrical and Electronics.","image":"static/images/speakers/khaled_el_emam.png","institution":"University of Ottawa","slideslive_active_date":"","slideslive_id":"","speaker":"Khaled El Emam","title":"Differential Privacy vs. Synthetic Data"},{"UID":"D06","abstract":"","bio":"Li Xiong is a Samuel Candler Dobbs Professor of Computer Science and Professor of Biomedical Informatics at Emory University. She held a Winship Distinguished Research Professorship from 2015-2018. She has a Ph.D. from Georgia Institute of Technology, an MS from Johns Hopkins University, and a BS from the University of Science and Technology of China. She and her research lab, Assured Information Management and Sharing (AIMS), conduct research on algorithms and methods at the intersection of data management, machine learning, and data privacy and security, with a recent focus on privacy-enhancing and robust machine learning. She has published over 170 papers and received six best paper or runner-up awards. She has served and serves as associate editor for IEEE TKDE, IEEE TDSC, and VLDBJ, general co-chair for ACM CIKM 2022, program co-chair for IEEE BigData 2020 and ACM SIGSPATIAL 2018, 2020, program vice-chair for ACM SIGMOD 2024, 2022, and IEEE ICDE 2023, 2020, and VLDB Sponsorship Ambassador. Her research is supported by federal agencies including NSF, NIH, AFOSR, PCORI, and industry awards including Google, IBM, Cisco, AT&T, and Woodrow Wilson Foundation. 
She is an IEEE fellow.","image":"static/images/speakers/li_xiong.png","institution":"Emory University","slideslive_active_date":"","slideslive_id":"","speaker":"Li Xiong","title":"Differential Privacy vs. Synthetic Data"}]
diff --git a/serve_faq.json b/serve_faq.json
new file mode 100644
index 000000000..62b2ede1e
--- /dev/null
+++ b/serve_faq.json
@@ -0,0 +1 @@
+{"FAQ":[{"QA":[{"Answer":"Answer","Question":"Question?"}],"Section":"Section Title"}]}
diff --git a/serve_highlighted.json b/serve_highlighted.json
new file mode 100644
index 000000000..bcfc14149
--- /dev/null
+++ b/serve_highlighted.json
@@ -0,0 +1 @@
+[{"UID":"B1xSperKvH","session":"1"}]
diff --git a/serve_invited.json b/serve_invited.json
new file mode 100644
index 000000000..d90f805e2
--- /dev/null
+++ b/serve_invited.json
@@ -0,0 +1 @@
+[{"UID":"I01","abstract":"","bio":"Suchi Saria, PhD, holds the John C. Malone endowed chair and is the Director of the Machine Learning, AI and Healthcare Lab at Johns Hopkins. She is also the Founder and CEO of Bayesian Health. Her research has pioneered the development of next generation diagnostic and treatment planning tools that use statistical machine learning methods to individualize care. She has written several of the seminal papers in the field of ML and its use for improving patient care and has given over 300 invited keynotes and talks to organizations including the NAM, NAS, and NIH. Dr. Saria has served as an advisor to multiple Fortune 500 companies and her work has been funded by leading organizations including the NIH, FDA, NSF, DARPA and CDC. Dr. Saria\u2019s work has been featured by the Atlantic, Smithsonian Magazine, Bloomberg News, Wall Street Journal, and PBS NOVA to name a few. She has won several awards for excellence in AI and care delivery. For example, for her academic work, she\u2019s been recognized as IEEE\u2019s \u201cAI\u2019s 10 to Watch\u201d, Sloan Fellow, MIT Tech Review\u2019s \u201c35 Under 35\u201d, National Academy of Medicine\u2019s list of \u201cEmerging Leaders in Health and Medicine\u201d, and DARPA\u2019s Faculty Award. 
For her work in industry bringing AI to healthcare, she\u2019s been recognized as World Economic Forum\u2019s 100 Brilliant Minds Under 40, Rock Health\u2019s \u201cTop 50 in Digital Health\u201d, Modern Healthcare\u2019s Top 25 Innovators, The Armstrong Award for Excellence in Quality and Safety and Society of Critical Care Medicine\u2019s Annual Scientific Award.","image":"static/images/speakers/suchi_saria.jpg","institution":"Johns Hopkins University & Bayesian Health","slideslive_active_date":"","slideslive_id":"","speaker":"Suchi Saria","title":"Invited Talk on Research and Top Recent Papers from 2020-2022"},{"UID":"I02","abstract":"","bio":"Karandeep Singh, MD, MMSc, is an Assistant Professor of Learning Health Sciences, Internal Medicine, Urology, and Information at the University of Michigan. He directs the Machine Learning for Learning Health Systems (ML4LHS) Lab, which focuses on translational issues related to the implementation of machine learning (ML) models within health systems. He serves as an Associate Chief Medical Information Officer for Artificial Intelligence for Michigan Medicine and is the Associate Director for Implementation for U-M Precision Health, a Presidential Initiative focused on bringing research discoveries to the bedside, with a focus on prediction models and genomics data. He chairs the Michigan Medicine Clinical Intelligence Committee, which oversees the governance of machine learning models across the health system. He teaches a health data science course for graduate and doctoral students, and provides clinical care for people with kidney disease. He completed his internal medicine residency at UCLA Medical Center, where he served as chief resident, and a nephrology fellowship in the combined Brigham and Women\u2019s Hospital/Massachusetts General Hospital program in Boston, MA. 
He completed his medical education at the University of Michigan Medical School and holds a master\u2019s degree in medical sciences in Biomedical Informatics from Harvard Medical School. He is board certified in internal medicine, nephrology, and clinical informatics.","image":"static/images/speakers/karandeep_singh.jpg","institution":"University of Michigan","slideslive_active_date":"","slideslive_id":"","speaker":"Karandeep Singh","title":"Invited Talk on Recent Deployments and Real-world Impact"},{"UID":"I03","abstract":"","bio":"Dr. Nigam Shah is Professor of Medicine at Stanford University, and Chief Data Scientist for Stanford Health Care. His research group analyzes multiple types of health data (EHR, Claims, Wearables, Weblogs, and Patient blogs), to answer clinical questions, generate insights, and build predictive models for the learning health system. At Stanford Healthcare, he leads artificial intelligence and data science efforts for advancing the scientific understanding of disease, improving the practice of clinical medicine and orchestrating the delivery of health care. Dr. Shah is an inventor on eight patents and patent applications, has authored over 200 scientific publications and has co-founded three companies. Dr. Shah was elected into the American College of Medical Informatics (ACMI) in 2015 and was inducted into the American Society for Clinical Investigation (ASCI) in 2016. He holds an MBBS from Baroda Medical College, India, a PhD from Penn State University and completed postdoctoral training at Stanford University.","image":"static/images/speakers/nigam_shah.png","institution":"Stanford University","slideslive_active_date":"","slideslive_id":"","speaker":"Nigam Shah","title":"Invited Talk on Under-explored Research Challenges and Opportunities"}]
diff --git a/serve_main_calendar.json b/serve_main_calendar.json
new file mode 100644
index 000000000..b74bbd731
--- /dev/null
+++ b/serve_main_calendar.json
@@ -0,0 +1 @@
+[{"calendarId":"qa","category":"time","end":"2020-04-29T11:30:00+00:00","link":"https://iclr.cc/virtual/speaker_5.html","location":"https://iclr.cc/virtual/speaker_5.html","start":"2020-04-29T11:00:00+00:00","title":"Live Q&A: Mihaela van der Schaar"}]
diff --git a/serve_ml_health.json b/serve_ml_health.json
new file mode 100644
index 000000000..5210ffe1e
--- /dev/null
+++ b/serve_ml_health.json
@@ -0,0 +1 @@
+[{"UID":"M01","abstract":"","bio":"Isaac (Zak) Kohane, MD, PhD is the inaugural Chair of the Department of Biomedical Informatics and the Marion V. Nelson Professor of Biomedical Informatics at Harvard Medical School. He develops and applies computational techniques to address disease at multiple scales: from whole healthcare systems as \u201cliving laboratories\u201d to the functional genomics of neurodevelopment with a focus on autism. Kohane earned his MD/PhD from Boston University and then completed his post-doctoral work at Boston Children\u2019s Hospital, where he has since worked as a pediatric endocrinologist. Kohane has published several hundred papers in the medical literature and authored the widely-used books Microarrays for an Integrative Genomics (2003) and The AI Revolution in Medicine: GPT-4 and Beyond (2023). He is also Editor-in-Chief of NEJM AI.","image":"static/images/speakers/zak_kohane.png","institution":"Harvard Medical School","slideslive_active_date":"","slideslive_id":"","speaker":"Isaac (Zak) Kohane","title":""},{"UID":"M02","abstract":"","bio":"Kyunghyun Cho is a professor of computer science and data science at New York University and a senior director of frontier research at the Prescient Design team within Genentech Research & Early Development (gRED). He is also a CIFAR Fellow of Learning in Machines & Brains and an Associate Member of the National Academy of Engineering of Korea. He served as a (co-)Program Chair of ICLR 2020, NeurIPS 2022 and ICML 2022. He is also a founding co-Editor-in-Chief of the Transactions on Machine Learning Research (TMLR). He was a research scientist at Facebook AI Research from June 2017 to May 2020 and a postdoctoral fellow at University of Montreal until Summer 2015 under the supervision of Prof. Yoshua Bengio. 
He received the Samsung Ho-Am Prize in Engineering in 2021.","image":"static/images/speakers/kyunghyun_cho.jpeg","institution":"New York University","slideslive_active_date":"","slideslive_id":"","speaker":"Kyunghyun Cho","title":""},{"UID":"M03","abstract":"","bio":"Leo Anthony Celi has practiced medicine in three continents, giving him broad perspectives in healthcare delivery. As clinical research director and principal research scientist at the MIT Laboratory of Computational Physiology (LCP), he brings together clinicians and data scientists to support research using data routinely collected in the intensive care unit (ICU). His group built and maintains the Medical Information Mart for Intensive Care (MIMIC) database. This public-access database has been meticulously de-identified and is freely shared online with the research community. It is an unparalleled research resource; over 2000 investigators from more than 30 countries have free access to the clinical data under a data use agreement. In 2016, LCP partnered with Philips eICU Research Institute to host the eICU database with more than 2 million ICU patients admitted across the United States. The goal is to scale the database globally and build an international collaborative research community around health data analytics.\n
\nLeo founded and co-directs Sana, a cross-disciplinary organization based at the Institute for Medical Engineering and Science at MIT, whose objective is to leverage information technology to improve health outcomes in low- and middle-income countries. At its core is an open-source mobile tele-health platform that allows for capture, transmission and archiving of complex medical data (e.g. images, videos, physiologic signals such as ECG, EEG and oto-acoustic emission responses), in addition to patient demographic and clinical information. Sana is the inaugural recipient of both the mHealth (Mobile Health) Alliance Award from the United Nations Foundation and the Wireless Innovation Award from the Vodafone Foundation in 2010. The software has since been implemented around the globe including India, Kenya, Lebanon, Haiti, Mongolia, Uganda, Brazil, Ethiopia, Argentina, and South Africa.\n
\n
\nHe is one of the course directors for HST.936\u2014global health informatics to improve quality of care, and HST.953\u2014secondary analysis of electronic health records, both at MIT. He is an editor of the textbook for each course, both released under an open access license. The textbook Secondary Analysis of Electronic Health Records came out in October 2016 and was downloaded over 48,000 times in the first two months of publication. The course \u201cGlobal Health Informatics to Improve Quality of Care\u201d was launched under MITx in February 2017.\n
\nLeo was featured as a designer in the Smithsonian Museum National Design Triennial \u201cWhy Design Now?\u201d held at the Cooper-Hewitt Museum in New York City in 2010 for his work in global health informatics. He was also selected as one of 12 external reviewers for the National Academy of Medicine 2014 report \u201cInvesting in Global Health Systems: Sustaining gains, transforming lives\u201d.","image":"static/images/speakers/leo_celi.png","institution":"Massachusetts Institute of Technology","slideslive_active_date":"","slideslive_id":"","speaker":"Leo Celi","title":""}]
diff --git a/serve_panels.json b/serve_panels.json
new file mode 100644
index 000000000..08d2323c3
--- /dev/null
+++ b/serve_panels.json
@@ -0,0 +1 @@
+[{"UID":"P01","abstract":"","bio":"David O. Meltzer is Chief of the Section of Hospital Medicine, Director of the Center for Health and the Social Sciences, and Chair of the Committee on Clinical and Translational Science at the University of Chicago, where he is Professor in the Department of Medicine, and affiliated faculty at the University of Chicago Harris School of Public Policy and the Department of Economics. Dr. Meltzer\u2019s research explores problems in health economics and public policy with a focus on the theoretical foundations of medical cost-effectiveness analysis and the cost and quality of hospital care. He is currently leading a Centers for Medicaid and Medicare Innovation Challenge award to study the effects of improved continuity in the doctor patient relationship between the inpatient and outpatient setting on the costs and outcomes of care for frequently hospitalized Medicare patients. He led the formation of the Chicago Learning Effectiveness Advancement Research Network (Chicago LEARN) that helped pioneer collaboration of Chicago-Area academic medical centers in hospital-based comparative effectiveness research and the recent support of the Chicago Area Patient Centered Outcomes Research Network (CAPriCORN) by the Patient Centered Outcomes Research Institute (PCORI).\n
\nMeltzer received his MD and PhD in economics from the University of Chicago and completed his residency in internal medicine at Brigham and Women\u2019s Hospital in Boston. Meltzer is the recipient of numerous awards, including the Lee Lusted Prize of the Society for Medical Decision Making, the Health Care Research Award of the National Institute for Health Care Management, and the Eugene Garfield Award from Research America. Meltzer is a research associate of the National Bureau of Economic Research, elected member of the American Society for Clinical Investigation, and past president of the Society for Medical Decision Making. He has served on several IOM panels, including one examining U.S. organ allocation policy and the recent panel on the Learning Health Care System that produced Best Care at Lower Cost. He also has served on the DHHS Secretary\u2019s Advisory Committee on Healthy People 2020, the Patient Centered Outcomes Research Institute (PCORI) Methodology Committee, as a Council Member of the National Institute for General Medical Studies, and as a health economics advisor for the Congressional Budget Office.
","image":"static/images/speakers/david_meltzer.jpg","institution":"University of Chicago","slideslive_active_date":"","slideslive_id":"","speaker":"David O. Meltzer","title":""},{"UID":"P02","abstract":"","bio":"Dr. Dempsey is an Assistant Professor of Biostatistics and an Assistant Research Professor in the d3lab located in the Institute of Social Research at the University of Michigan. His research focuses on Statistical Methods for Digital and Mobile Health. His current work involves three complementary research themes: (1) experimental design and data analytic methods to inform multi-stage decision making in health; (2) statistical modeling of complex longitudinal and survival data; and (3) statistical modeling of complex relational structures such as interaction networks. Prior to joining the University of Michigan, he was a postdoctoral fellow in the Department of Statistics at Harvard University. His fellowship was in the Statistical Reinforcement Learning Lab under the supervision of Susan Murphy. He received his PhD in Statistics at the University of Chicago under the supervision of Peter McCullagh.","image":"static/images/speakers/walter_dempsey.jpg","institution":"University of Michigan","slideslive_active_date":"","slideslive_id":"","speaker":"Walter Dempsey","title":""},{"UID":"P03","abstract":"","bio":"F. Perry Wilson, MD, MSCE, is a nephrologist who treats patients in Yale New Haven Hospital who have kidney issues or who develop them while hospitalized for another problem. He is also an epidemiologist and a prolific researcher focused on studying ways to improve patient care. An associate professor at Yale School of Medicine, Dr. Wilson is director of the Yale Clinical and Translational Research Accelerator and codirector of the Yale Section of Nephrology\u2019s Human Genetics and Clinical Research Core. 
He is the creator of the popular online course \u201cUnderstanding Medical Research: Your Facebook Friend Is Wrong\u201d on the Coursera platform.","image":"static/images/speakers/f_perry_wilson.png","institution":"Yale School of Medicine","slideslive_active_date":"","slideslive_id":"","speaker":"F. Perry Wilson","title":""},{"UID":"P04","abstract":"","bio":"Kyra Gan is an Assistant Professor in the School of Operations Research and Information Engineering and Cornell Tech at Cornell University. Her research interests include adaptive/online algorithm design in personalized treatment (including micro-randomized trials and N-of-1 trials) under constraint settings, computerized/automated inference methods (e.g., targeted learning with RKHS), robust causal discovery in medical data, and fairness in organ transplants. More broadly, she is interested in bridging the gap between research and practice in healthcare.\n
\nPrior to Cornell Tech, she was a postdoctoral fellow at the Statistical Reinforcement Lab at Harvard University. She received her Ph.D. in Operations Research in 2022 from Carnegie Mellon University at the Tepper School of Business. She received her B.A.s in Mathematics and Economics from Smith College in 2017. She is a recipient of the 2021 Pierskalla Best Paper Award and the 2021 CHOW Best Student Paper Award in the Category of Operations Research and Management Science.
","image":"static/images/speakers/kyra_gan.png","institution":"Cornell University","slideslive_active_date":"","slideslive_id":"","speaker":"Kyra Gan","title":""},{"UID":"P05","abstract":"","bio":"Girish N. Nadkarni, MD, MPH, is the Irene and Dr. Arthur M. Fishberg Professor of Medicine at the Icahn School of Medicine at Mount Sinai. As an expert physician-scientist, Dr. Nadkarni bridges the gap between comprehensive clinical care and innovative research. He is the System Chief of the Division of Data Driven and Digital Medicine (D3M), the Co-Director of the Mount Sinai Clinical Intelligence Center (MSCIC) and the Director of Charles Bronfman Institute for Personalized Medicine.\n
\nBefore completing his medical degree at one of the top-ranked medical colleges in India, Dr. Nadkarni received training in mathematics. He then received a master\u2019s degree in public health at the Johns Hopkins Bloomberg School of Public Health, and then was a research associate at the Johns Hopkins Medical Institute. Dr. Nadkarni completed his residency in internal medicine and his clinical fellowship in nephrology at the Icahn School of Medicine at Mount Sinai. He then completed a research fellowship in personalized medicine and informatics.\n
\nDr. Nadkarni has authored more than 240 peer-reviewed scientific publications, including articles in the New England Journal of Medicine, the Journal of the American Medical Association, the Annals of Internal Medicine and Nature Medicine. Dr. Nadkarni is the principal or co-investigator for several grants funded by the National Institutes of Health focusing on informatics, data science, and precision medicine. He is also one of the multiple principal investigators of the NIH RECOVER consortium focusing on the long-term sequelae of COVID-19. He has several patents and is also the scientific co-founder of investor-backed companies\u2014one of which, Renalytix, is listed on NASDAQ. In recognition of his work as an active clinician and investigator, he has received several awards and honors, including the Dr. Harold and Golden Lamport Research Award, the Deal of the Year award from Mount Sinai Innovation Partners, the Carl Nacht Memorial Lecture, and the Rising Star Award from ANIO.","image":"static/images/speakers/girish_nadkarni.png","institution":"Mount Sinai","slideslive_active_date":"","slideslive_id":"","speaker":"Girish N. Nadkarni","title":""},{"UID":"P06","abstract":"","bio":"Roy Perlis, MD MSc is Associate Chief for Research in the Department of Psychiatry and Director of the Center for Quantitative Health at Massachusetts General Hospital. He is Professor of Psychiatry at Harvard Medical School and Associate Editor at JAMA's open-access journal, JAMA Network - Open. Dr. Perlis graduated from Brown University, Harvard Medical School and Harvard School of Public Health, and completed his residency, chief residency, and clinical/research fellowship at MGH before joining the faculty. Dr. Perlis's research is focused on identifying predictors of treatment response in brain diseases, and using these biomarkers to develop novel treatments. Dr. 
Perlis has authored more than 350 articles reporting original research, in journals including Nature Genetics, Nature Neuroscience, JAMA, NEJM, the British Medical Journal, and the American Journal of Psychiatry. His research has been supported by awards from NIMH, NHGRI, NHLBI, NICHD, NCCIH, and NSF, among others. In 2010 Dr. Perlis was awarded the Depression and Bipolar Support Alliance's Klerman Award; he now serves as a scientific advisor to the DBSA. ","image":"static/images/speakers/roy_perlis.jpeg","institution":"Massachusetts General Hospital","slideslive_active_date":"","slideslive_id":"","speaker":"Roy Perlis","title":""},{"UID":"P07","abstract":"","bio":"Dr. Ashley Beecy is the Medical Director of Artificial Intelligence (AI) Operations at NewYork-Presbyterian (NYP). She is a core member of NYP\u2019s AI leadership team partnering with clinical, administrative and research leaders across the enterprise to drive digital transformation and deliver on NYP\u2019s data and AI strategy. Dr. Beecy provides leadership in key areas including the governance, processes, and infrastructure to ensure the responsible and agile deployment of AI. She is responsible for NYP\u2019s largest enterprise-wide AI initiative in collaboration with Cornell Tech and Cornell University. She is a thought leader and serves as a subject matter expert on multiple national AI collaboratives.","image":"static/images/speakers/ashley_beecy.jpeg","institution":"NewYork-Presbyterian","slideslive_active_date":"","slideslive_id":"","speaker":"Ashley Beecy","title":""},{"UID":"P08","abstract":"","bio":"Leo Anthony Celi has practiced medicine in three continents, giving him broad perspectives in healthcare delivery. As clinical research director and principal research scientist at the MIT Laboratory of Computational Physiology (LCP), he brings together clinicians and data scientists to support research using data routinely collected in the intensive care unit (ICU). 
His group built and maintains the Medical Information Mart for Intensive Care (MIMIC) database. This public-access database has been meticulously de-identified and is freely shared online with the research community. It is an unparalleled research resource; over 2000 investigators from more than 30 countries have free access to the clinical data under a data use agreement. In 2016, LCP partnered with Philips eICU Research Institute to host the eICU database with more than 2 million ICU patients admitted across the United States. The goal is to scale the database globally and build an international collaborative research community around health data analytics.\n
\nLeo founded and co-directs Sana, a cross-disciplinary organization based at the Institute for Medical Engineering and Science at MIT, whose objective is to leverage information technology to improve health outcomes in low- and middle-income countries. At its core is an open-source mobile tele-health platform that allows for capture, transmission and archiving of complex medical data (e.g. images, videos, physiologic signals such as ECG, EEG and oto-acoustic emission responses), in addition to patient demographic and clinical information. Sana is the inaugural recipient of both the mHealth (Mobile Health) Alliance Award from the United Nations Foundation and the Wireless Innovation Award from the Vodafone Foundation in 2010. The software has since been implemented around the globe including India, Kenya, Lebanon, Haiti, Mongolia, Uganda, Brazil, Ethiopia, Argentina, and South Africa.\n
\n
\nHe is one of the course directors for HST.936\u2014global health informatics to improve quality of care, and HST.953\u2014secondary analysis of electronic health records, both at MIT. He is an editor of the textbook for each course, both released under an open access license. The textbook Secondary Analysis of Electronic Health Records came out in October 2016 and was downloaded over 48,000 times in the first two months of publication. The course \u201cGlobal Health Informatics to Improve Quality of Care\u201d was launched under MITx in February 2017.\n
\nLeo was featured as a designer in the Smithsonian Museum National Design Triennial \u201cWhy Design Now?\u201d held at the Cooper-Hewitt Museum in New York City in 2010 for his work in global health informatics. He was also selected as one of 12 external reviewers for the National Academy of Medicine 2014 report \u201cInvesting in Global Health Systems: Sustaining gains, transforming lives\u201d.","image":"static/images/speakers/leo_celi.png","institution":"Massachusetts Institute of Technology","slideslive_active_date":"","slideslive_id":"","speaker":"Leo Celi","title":""}]
diff --git a/serve_papers.json b/serve_papers.json
new file mode 100644
index 000000000..4fe3a7ab3
--- /dev/null
+++ b/serve_papers.json
@@ -0,0 +1 @@
+[{"UID":"B1xSperKvH","abstract":"Spiking Neural Networks (SNNs) operate with asynchronous discrete events (or spikes) which can potentially lead to higher energy-efficiency in neuromorphic hardware implementations. Many works have shown that an SNN for inference can be formed by copying the weights from a trained Artificial Neural Network (ANN) and setting the firing threshold for each layer as the maximum input received in that layer. These types of converted SNNs require a large number of time steps to achieve competitive accuracy, which diminishes the energy savings. The number of time steps can be reduced by training SNNs with spike-based backpropagation from scratch, but that is computationally expensive and slow. To address these challenges, we present a computationally-efficient training technique for deep SNNs. We propose a hybrid training methodology: 1) take a converted SNN and use its weights and thresholds as an initialization step for spike-based backpropagation, and 2) perform incremental spike-timing dependent backpropagation (STDB) on this carefully initialized network to obtain an SNN that converges within a few epochs and requires fewer time steps for input processing. STDB is performed with a novel surrogate gradient function defined using the neuron's spike time. The weight update is proportional to the difference in spike timing between the current time step and the most recent time step the neuron generated an output spike. The SNNs trained with our hybrid conversion-and-STDB training require $10{\times}{-}25{\times}$ fewer time steps and achieve accuracy similar to purely converted SNNs. The proposed training methodology converges in fewer than $20$ epochs of spike-based backpropagation for most standard image classification datasets, thereby greatly reducing the training complexity compared to training SNNs from scratch. We perform experiments on CIFAR-10, CIFAR-100 and ImageNet datasets for both VGG and ResNet architectures. 
We achieve a top-1 accuracy of $65.19\%$ on the ImageNet dataset with an SNN using $250$ time steps, which is $10{\times}$ faster than converted SNNs of similar accuracy.","authors":"Nitin Rathi|Gopalakrishnan Srinivasan|Priyadarshini Panda|Kaushik Roy","keywords":"imagenet","sessions":"Tues Session 1|Mon Session 1","title":"Enabling Deep Spiking Neural Networks with Hybrid Conversion and Spike Timing Dependent Backpropagation"}]
diff --git a/serve_papers_projection.json b/serve_papers_projection.json
new file mode 100644
index 000000000..1fb904582
--- /dev/null
+++ b/serve_papers_projection.json
@@ -0,0 +1 @@
+[{"id":"B1xSperKvH","pos":[3.0959818363189697,-11.557245254516602]}]
diff --git a/serve_proceedings.json b/serve_proceedings.json
new file mode 100644
index 000000000..79bf89c66
--- /dev/null
+++ b/serve_proceedings.json
@@ -0,0 +1 @@
+[{"UID":"P005","abstract":"Understanding the irregular electrical activity of atrial fibrillation (AFib) has been a key challenge in electrocardiography. For serious cases of AFib, catheter ablations are performed to collect intracardiac electrograms (EGMs). EGMs offer intricately detailed and localized electrical activity of the heart and are an ideal modality for interpretable cardiac studies. Recent advancements in artificial intelligence (AI) have allowed some works to utilize deep learning frameworks to interpret EGMs during AFib. Additionally, language models (LMs) have shown exceptional performance in being able to generalize to unseen domains, especially in healthcare. In this study, we are the first to leverage pretrained LMs for finetuning of EGM interpolation and AFib classification via masked language modeling. We formulate the EGM as a textual sequence and present competitive performance on AFib classification compared against other representations. Lastly, we provide a comprehensive interpretability study to provide a multi-perspective intuition of the model's behavior, which could greatly benefit clinical use.","authors":"William Han|Diana Guadalupe Gomez|Avi Alok|Chaojing Duan|Michael A. Rosenberg|Douglas J Weber|Emerson Liu|Ding Zhao","doi_link":"static/proceedings/2024/han24.pdf","session":"A","title":"Interpretation of Intracardiac Electrograms Through Textual Representations"},{"UID":"P007","abstract":"Drug synergy arises when the combined impact of two drugs exceeds the sum of their individual effects. While single-drug effects on cell lines are well documented, the scarcity of data on drug synergy, considering the vast array of potential drug combinations, prompts a growing interest in computational approaches for predicting synergies in untested drug pairs. We introduce a Graph Neural Network (GNN) based model for drug synergy prediction, which utilizes drug chemical structures and cell line gene expression data. 
We extract data from the largest available drug combination database (DrugComb) and generate multiple synergy scores (commonly used in the literature) to create seven datasets that serve as a reliable benchmark with high confidence. In contrast to conventional models relying on pre-computed chemical features, our GNN-based approach learns task-specific drug representations directly from the graph structure of the drugs, providing superior performance in predicting drug synergies. Our work suggests that learning task-specific drug representations and leveraging a diverse dataset is a promising approach to advancing our understanding of drug-drug interaction and synergy.","authors":"Kyriakos Schwarz|Pliego Mendieta Alicia|Amina Mollaysa|Planas-Paz Lara|Chantal Pauli|Ahmed Allam|Michael Krauthammer","doi_link":"static/proceedings/2024/schwarz24.pdf","session":"A","title":"DDOS: A GRAPH NEURAL NETWORK BASED DRUG SYNERGY PREDICTION ALGORITHM"},{"UID":"P023","abstract":"In healthcare applications, there is a growing need to develop machine learning models that use data from a single source, such as that from a wrist wearable device, to monitor physical activities, assess health risks, and provide immediate health recommendations or interventions. However, the limitation of using single-source data often compromises the model's accuracy, as it fails to capture the full scope of human activities. While a more comprehensive dataset can be gathered in a lab setting using multiple sensors attached to various body parts, this approach is not practical for everyday use due to the impracticality of wearing multiple sensors. To address this challenge, we introduce a transfer learning framework that optimizes machine learning models for everyday applications by leveraging multi-source data collected in a laboratory setting. 
We introduce a novel metric to leverage the inherent relationship between these multiple data sources, as they are all paired to capture aspects of the same physical activity. Through numerical experiments, our framework outperforms existing methods in classification accuracy and robustness to noise, offering a promising avenue for the enhancement of daily activity monitoring.","authors":"Haoting Zhang|Donglin Zhan|Yunduan Lin|Jinghai He|Qing Zhu|Zuo-Jun Shen|Zeyu Zheng","doi_link":"static/proceedings/2024/zhang24.pdf","session":"A","title":"Daily Physical Activity Monitoring---Adaptive Learning from Multi-source Motion Sensor Data"},{"UID":"P029","abstract":"Effective collaboration across medical institutions presents a significant challenge, primarily due to the imperative of maintaining patient privacy. Optimal machine learning models in healthcare demand access to extensive, high-quality data to achieve generality and robustness. Yet, typically, medical institutions are restricted to data within their networks, limiting the scope and diversity of information. This limitation becomes particularly acute when encountering patient cases with rare or unique characteristics, leading to potential distribution shifts in the data. To address these challenges, our work introduces a framework designed to enhance existing clinical foundation models, Private Synthetic Hypercube Augmentation (PriSHA). We leverage generative models to produce synthetic data, generated from diverse sources, as a means to augment these models while adhering to strict privacy standards. This approach promises to broaden the dataset's scope and improve model performance without compromising patient confidentiality. 
To the best of our knowledge, our framework is the first to address distribution shifts through the use of synthetic privacy-preserving tabular data augmentation.","authors":"Shinpei Nakamura Sakai|Dennis Shung|Jasjeet S Sekhon","doi_link":"static/proceedings/2024/sakai24.pdf","session":"A","title":"Enhancing Collaborative Medical Outcomes through Private Synthetic Hypercube Augmentation: PriSHA"},{"UID":"P034","abstract":"This study demonstrates the first in-hospital adaptation of a cloud-based AI, similar to ChatGPT, into a secure model for analyzing radiology reports, prioritizing patient data privacy. By employing a unique sentence-level knowledge distillation method through contrastive learning, we achieve over 95% accuracy in detecting anomalies. The model also accurately flags uncertainties in its predictions, enhancing its reliability and interpretability for physicians with certainty indicators. Despite limitations in data privacy during the training phase, such as requiring de-identification or IRB permission, our study is significant in addressing this issue in the inference phase (once the local model is trained), without the need for human annotation throughout the entire process. These advancements represent a new direction for developing secure and efficient AI tools for healthcare with minimal supervision, paving the way for a promising future of in-hospital AI applications.","authors":"Kyungsu Kim|Junhyun Park|Saul Langarica|Adham Mahmoud Alkhadrawi|Synho Do","doi_link":"static/proceedings/2024/kim24a.pdf","session":"A","title":"Integrating ChatGPT into Secure Hospital Networks: A Case Study on Improving Radiology Report Analysis"},{"UID":"P035","abstract":"Most past work in multiple instance learning (MIL), which maps groups of instances to classification labels, has focused on settings in which the order of instances does not contain information. 
In this paper, we define MIL with \textit{absolute} position information: tasks in which instances of importance remain in similar positions across bags. Such problems arise, for example, in MIL with medical images in which there exists a common global alignment across images (e.g., in chest x-rays the heart is in a similar location). We also evaluate the performance of existing MIL methods on a set of new benchmark tasks and two real data tasks with varying amounts of absolute position information. We find that, despite being less computationally efficient than other approaches, transformer-based MIL methods are more accurate at classifying tasks with absolute position information. Thus, we investigate the ability of positional encodings, a mechanism typically only used in transformers, to improve the accuracy of other MIL approaches. Applied to the task of identifying pathological findings in chest x-rays, when augmented with positional encodings, standard MIL approaches perform significantly better than without (AUROC of 0.799, 95\% CI: [0.791, 0.806] vs. 0.782, 95\% CI: [0.774, 0.789]) and on par with transformer-based methods (AUROC of 0.797, 95\% CI: [0.790, 0.804]) while being 10 times faster. Our results suggest that one can efficiently and accurately classify MIL data with standard approaches by simply including positional encodings.","authors":"Meera Krishnamoorthy|Jenna Wiens","doi_link":"static/proceedings/2024/krishnamoorthy24.pdf","session":"A","title":"Multiple Instance Learning with Absolute Position Information"},{"UID":"P041","abstract":"Atrial fibrillation (AF), a common cardiac arrhythmia, significantly increases the risk of stroke, heart disease, and mortality. Photoplethysmography (PPG) offers a promising solution for continuous AF monitoring, due to its cost efficiency and integration into wearable devices. 
Nonetheless, PPG signals are susceptible to corruption from motion artifacts and other factors often encountered in ambulatory settings. Conventional approaches typically discard corrupted segments or attempt to reconstruct original signals, allowing for the use of standard machine learning techniques. However, this reduces dataset size and introduces biases, compromising prediction accuracy and the effectiveness of continuous monitoring. We propose a novel deep learning model, Signal Quality Weighted Fusion of Attentional Convolution and Recurrent Neural Network (SQUWA), designed to learn how to retain accurate predictions from partially corrupted PPG. Specifically, SQUWA innovatively integrates an attention mechanism that directly considers signal quality during the learning process, dynamically adjusting the weights of time series segments based on their quality. This approach enhances the influence of higher-quality segments while reducing that of lower-quality ones, effectively utilizing partially corrupted segments. This approach represents a departure from the conventional methods that exclude such segments, enabling the utilization of a broader range of data, which has great implications for less disruption when monitoring of AF risks and more accurate estimation of AF burdens. Moreover, SQUWA utilizes variable-sized convolutional kernels to capture complex PPG signal patterns across different resolutions for enhanced learning. Our extensive experiments show that SQUWA outperform existing PPG-based models, achieving the highest AUCPR of 0.89 with label noise mitigation. 
This also exceeds the 0.86 AUCPR of models trained using both electrocardiogram (ECG) and PPG data.","authors":"Runze Yan|Cheng Ding|Ran Xiao|Alex Fedorov|Randall J Lee|Fadi Nahab|Xiao Hu","doi_link":"static/proceedings/2024/yan24.pdf","session":"A","title":"SQUWA: Signal Quality Aware DNN Architecture for Enhanced Accuracy in Atrial Fibrillation Detection from Noisy PPG Signals"},{"UID":"P052","abstract":"We introduce a novel hierarchical Bayesian estimator for permutation entropy (PermEn), designed to improve the accuracy of entropy assessments of biomedical time series signal sets, particularly for short-duration signals. Unlike existing methods that require a substantial number of observations or impose restrictive priors, our approach uses a non-centered, Wasserstein distance optimized hierarchical prior, enabling efficient full Markov Chain Monte Carlo inference and a broader spectrum of PermEn priors. Comparative evaluations with synthetic and secondary benchmark data demonstrate our estimator's enhanced performance, including a significant reduction in estimation error (13.33-63.67\\%), posterior variance (8.16-47.77\\%), and reference prior distance error (47-60.83\\%, $p \\leq 2.42 \\times 10^{-10}$) against current state-of-the-art methods. Applied to oxygen uptake signals from cardiopulmonary exercise testing, our method revealed a previously unreported entropy difference between obese and lean subjects (mean difference: 1.732\\%; 94\\% CI [2.34\\%, 1.11\\%], $p \\leq \\frac{1}{20000}$), with more precise credible intervals (16-24\\% improvement). This entropy disparity becomes statistically non-significant in participants completing over 7.5 minutes of testing, suggesting potential insights into physiological complexity, exercise tolerance, and obesity. 
Our estimator thus not only refines the estimation of PermEn in biomedical signals but also underscores entropy's potential value as a health biomarker, opening avenues for further physiological and biomedical exploration.","authors":"Zachary Blanks|Donald E. Brown|Marc A. Adams|Siddhartha S. Angadi","doi_link":"static/proceedings/2024/blanks24.pdf","session":"A","title":"An Improved Bayesian Permutation Entropy Estimator with Wasserstein-Optimized Hierarchical Priors"},{"UID":"P061","abstract":"Wearable sensors enable health researchers to continuously collect data pertaining to the physiological state of individuals in real-world settings. However, such data can be subject to extensive missingness due to a complex combination of factors. In this work, we study the problem of imputation of missing step count data, one of the most ubiquitous forms of wearable sensor data. We construct a novel and large scale data set consisting of a training set with over 3 million hourly step count observations and a test set with over 2.5 million hourly step count observations. We propose a domain knowledge-informed sparse self-attention model for this task that captures the temporal multi-scale nature of step-count data. We assess the performance of the model relative to baselines based on different missing rates and ground-truth step counts. Finally, we conduct ablation studies to verify our specific model designs.","authors":"Hui Wei|Maxwell A Xu|Colin Samplawski|James Matthew Rehg|Santosh Kumar|Benjamin Marlin","doi_link":"static/proceedings/2024/wei24.pdf","session":"A","title":"Temporally Multi-Scale Sparse Self-Attention for Physical Activity Data Imputation"},{"UID":"P095","abstract":"This work introduces a novel approach to model regularization and explanation in Vision Transformers (ViTs), particularly beneficial for small-scale but high-dimensional data regimes, such as in healthcare. 
We introduce stochastic embedded feature selection in the context of echocardiography video analysis, specifically focusing on the EchoNet-Dynamic dataset for the prediction of the Left Ventricular Ejection Fraction (LVEF). Our proposed method, termed Gumbel Video Vision-Transformers (G-ViTs), augments Video Vision-Transformers (V-ViTs), a performant transformer architecture for videos, with Concrete Autoencoders (CAEs), a common dataset-level feature selection technique, to enhance V-ViT's generalization and interpretability. The key contribution lies in the incorporation of stochastic token selection individually for each video frame during training. Such token selection regularizes the training of V-ViT, improves its interpretability, and is achieved by differentiable sampling of categoricals using the Gumbel-Softmax distribution. Our experiments on EchoNet-Dynamic demonstrate a consistent and notable regularization effect. The G-ViT model outperforms both a random selection baseline and standard V-ViT. The G-ViT is also compared against recent works on EchoNet-Dynamic where it exhibits state-of-the-art performance among end-to-end learned methods. Finally, we explore model explainability by visualizing selected patches, providing insights into how the G-ViT utilizes regions that humans know to be crucial for LVEF prediction. This proposed approach, therefore, extends beyond regularization, offering enhanced interpretability for ViTs.","authors":"Alfred Nilsson|Hossein Azizpour","doi_link":"static/proceedings/2024/nilsson24.pdf","session":"A","title":"Regularizing and Interpreting Vision Transformer by Patch Selection on Echocardiography Data"},{"UID":"P096","abstract":"Over the last decade, there has been significant progress in the field of interactive virtual rehabilitation. Physical therapy (PT) stands as a highly effective approach for addressing physical impairments. 
However, patient motivation and progress tracking in rehabilitation outcomes have been a challenge. This work aims to address this gap by proposing a computational approach that uses machine learning to objectively measure reaching task outcomes from an upper limb virtual therapy user study. In this study, we use virtual reality to perform several tracing tasks while collecting motion and movement data using a KinArm robot and a custom-made wearable sleeve sensor. We introduce a two-step machine learning architecture to predict the motion intention of participants: The first step predicts, using gaze, the reaching task segments to which the participant-marked points belonged, while the second step employs a Long Short-Term Memory (LSTM) model to predict directional movements based on resistance change values from the wearable sensor and the KinArm robot used to support the participant. We specifically propose transposing our raw resistance data to the time domain, which significantly improves the accuracy of our models. To evaluate the effectiveness of our model, we compared different classification techniques with various data configurations. The results show that our proposed computational method predicts participants' intended movements with high accuracy, demonstrating the promise of using multimodal data, including eye-tracking and resistance change, to objectively measure performance and intention in virtual rehabilitation settings.","authors":"Pavan Uttej Ravva|Pinar Kullu|Mohammad Fahim Abrar|Roghayeh Leila Barmaki","doi_link":"static/proceedings/2024/ravva24.pdf","session":"A","title":"A Machine Learning Approach for Predicting Upper Limb Motion Intentions with Multimodal Data"},{"UID":"P100","abstract":"Electronic Health Records (EHRs) contain rich patient information and are crucial for clinical research and practice. 
In recent years, deep learning models have been applied to EHRs, but they often rely on massive features, which may not be readily available for all patients. We propose HTP-Star, which leverages hypergraph structures with a pretrain-then-finetune framework for modeling EHR data, enabling seamless integration of additional features. Additionally, we design two techniques, namely (1) \\emph{Smoothness-inducing Regularization} and (2) \\emph{Group-balanced Reweighting}, to enhance the model's robustness during finetuning. Through experiments conducted on two real EHR datasets, we demonstrate that HTP-Star consistently outperforms various baselines while striking a balance between patients with basic and extra features.","authors":"Ran Xu|Yiwen Lu|Chang Liu|Yong Chen|Yan Sun|Xiao Hu|Joyce C. Ho|Carl Yang","doi_link":"static/proceedings/2024/xu24.pdf","session":"A","title":"From Basic to Extra Features: Hypergraph Transformer Pretrain-then-Finetuning for Balanced Clinical Predictions on EHR"},{"UID":"P102","abstract":"Explainability and privacy are the top concerns in machine learning (ML) for medical applications. In this paper, we propose a novel method, Domain-Aware Symbolic Regression with Homomorphic Encryption (DASR-HE), that addresses both concerns simultaneously by: (i) producing domain-aware, intuitive and explainable models that do not require the end-user to possess ML expertise and (ii) training only on securely encrypted data without access to actual data values or model parameters. DASR-HE is based on Symbolic Regression (SR), which is a first-class ML approach that produces simple and concise equations for regression, requiring no ML expertise to interpret. In our work, we improve the performance of SR algorithms by using existing domain-specific medical equations to augment the search space of equations, decreasing the search complexity and producing equations that are similar in structure to those used in practice. 
To preserve the privacy of the medical data, we enable our algorithm to learn on data that is homomorphically encrypted (HE), meaning that arithmetic operations can be done in the encrypted space. This makes HE suitable for machine learning algorithms to learn models without access to the actual data values or model parameters. We evaluate DASR-HE on three medical tasks, namely predicting glomerular filtration rate, endotracheal tube (ETT) internal diameter and ETT depth and find that DASR-HE outperforms existing medical equations, other SR ML algorithms and other explainable ML algorithms.","authors":"Kei Sen Fong|Mehul Motani","doi_link":"static/proceedings/2024/fong24.pdf","session":"A","title":"Explainable and Privacy-Preserving Machine Learning via Domain-Aware Symbolic Regression"},{"UID":"P104","abstract":"Limited access to health data remains a challenge for developing machine learning (ML) models. Health data is difficult to share due to privacy concerns and often does not have ground truth. Simulated data is often used for evaluating algorithms, as it can be shared freely and generated with ground truth. However, for simulated data to be used as an alternative to real data, algorithmic performance must be similar to that of real data. Existing simulation approaches are either black boxes or rely solely on expert knowledge, which may be incomplete. These methods generate data that often overstates performance, as they do not simulate many of the properties that make real data challenging. Nonstationarity, where a system's properties or parameters change over time, is pervasive in health data with changing health status of patients, standards of care, and populations. This makes ML challenging and can lead to reduced model generalizability, yet there have not been ways to systematically simulate realistic nonstationary data. 
This paper introduces a modular approach for learning dataset-specific models of nonstationarity in real data and augmenting simulated data with these properties to generate realistic synthetic datasets. We show that our simulation approach brings performance closer to that of real data in stress classification and glucose forecasting in people with diabetes.","authors":"Adedolapo Aishat Toye|Louis Gomez|Samantha Kleinberg","doi_link":"static/proceedings/2024/toye24.pdf","session":"A","title":"Simulation of Health Time Series with Nonstationarity"},{"UID":"P123","abstract":"Representation learning of brain activity is a key step toward unleashing machine learning models for use in the diagnosis of neurological diseases/disorders. Diagnosis of different neurological diseases/disorders, however, might require paying more attention to either spatial or temporal resolutions of brain activity. Accordingly, a generalized brain activity learner requires the ability to learn from both resolutions. Most existing studies, however, use domain knowledge to design brain encoders, and so are limited to a single neuroimage modality (e.g., EEG or fMRI) and its single resolution. Furthermore, their architecture design either: (1) uses a self-attention mechanism with quadratic time with respect to input size, limiting its scalability, (2) is purely based on message-passing graph neural networks, missing long-range dependencies and temporal resolution, and/or (3) encodes brain activity in each unit of the brain (e.g., voxel) separately, missing dependencies among brain regions. In this study, we present BrainMamba, an attention-free, scalable, and powerful framework to learn brain activity multivariate timeseries. BrainMamba uses two modules: (i) A novel multivariate timeseries encoder that leverages an MLP to fuse information across variates and a Selective Structured State Space (S4) architecture to encode each timeseries. 
(ii) A novel graph learning framework that leverages message-passing neural networks along with an S4 architecture to selectively choose important brain regions. Our experiments on 7 real-world datasets with 3 modalities show that BrainMamba attains outstanding performance and outperforms all baselines in different downstream tasks.","authors":"Ali Behrouz|Farnoosh Hashemi","doi_link":"static/proceedings/2024/behrouz24.pdf","session":"A","title":"Brain-Mamba: Encoding Brain Activity via Selective State Space Models"},{"UID":"P008","abstract":"Personalization in healthcare helps to translate clinical data into more effective disease management. In practice, this is achieved by subgrouping, whereby clusters with similar patient characteristics are identified and then receive customized treatment plans with the goal of targeting subgroup-specific disease dynamics. In this paper, we propose a novel mixture hidden Markov model for subgrouping patient trajectories with chronic diseases. Our model is interpretable and carefully designed to capture different trajectory phases of chronic diseases (i.e., 'severe', 'moderate', and 'mild') through tailored latent states. We demonstrate our subgrouping framework based on a longitudinal study across 847 patients with non-specific low back pain. Here, our subgrouping framework identifies 8 subgroups. Further, we show that our subgrouping framework outperforms common baselines in terms of cluster validity indices. Finally, we discuss the applicability of the model to other chronic and long-lasting diseases. 
For healthcare practitioners, this presents the opportunity for treatment plans tailored to the specific needs of patient subgroups.","authors":"Christof Friedrich Naumzik|Alice Kongsted|Werner Vach|Stefan Feuerriegel","doi_link":"static/proceedings/2024/naumzik24.pdf","session":"B","title":"Data-driven subgrouping of patient trajectories with chronic diseases: Evidence from low back pain"},{"UID":"P013","abstract":"Synthetic medical data generation has opened up new possibilities in the healthcare domain, offering a powerful tool for simulating clinical scenarios, enhancing diagnostic and treatment quality, gaining granular medical knowledge, and accelerating the development of unbiased algorithms. In this context, we present a novel approach called ViewXGen, designed to overcome the limitations of existing methods that rely on general domain pipelines using only radiology reports to generate frontal-view chest X-rays. Our approach takes into consideration the diverse view positions found in the dataset, enabling the generation of chest X-rays with specific views, which marks a significant advancement in the field. To achieve this, we introduce a set of specially designed tokens for each view position, tailoring the generation process to the user's preferences. Furthermore, we leverage multi-view chest X-rays as input, incorporating valuable information from different views within the same study. This integration rectifies potential errors and contributes to faithfully capturing abnormal findings in chest X-ray generation. To validate the effectiveness of our approach, we conducted statistical analyses, evaluating its performance in a clinical efficacy metric on the MIMIC-CXR dataset. 
Also, human evaluation demonstrates the remarkable capabilities of ViewXGen, particularly in producing realistic view-specific X-rays that closely resemble the original images.","authors":"Hyungyung Lee|Da Young Lee|Wonjae Kim|Jin-Hwa Kim|Tackeun Kim|Jihang Kim|Leonard Sunwoo|Edward Choi","doi_link":"static/proceedings/2024/lee24.pdf","session":"B","title":"Vision-Language Generative Model for View-Specific Chest X-ray Generation"},{"UID":"P018","abstract":"Machine learning applications hold promise to aid clinicians in a wide range of clinical tasks, from diagnosis to prognosis, treatment, and patient monitoring. These potential applications are accompanied by a surge of ethical concerns surrounding the use of Machine Learning (ML) models in healthcare, especially regarding fairness and non-discrimination. While there is an increasing number of regulatory policies to ensure the ethical and safe integration of such systems, the translation from policies to practices remains an open challenge. Algorithmic frameworks, aiming to bridge this gap, should be tailored to the application to enable the translation from fundamental human-right principles into accurate statistical analysis, capturing the inherent complexity and risks associated with the system. In this work, we propose a set of impartial fairness checks specifically adapted to ML early-warning systems in the medical context, comprising, on top of standard fairness metrics, an analysis of clinical outcomes and a screening of potential sources of bias in the pipeline. Our analysis is further fortified by the inclusion of event-based and prevalence-corrected metrics, as well as statistical tests to measure biases. Additionally, we emphasize the importance of considering subgroups beyond the conventional demographic attributes. Finally, to facilitate operationalization, we present an open-source tool, FAMEWS, to generate comprehensive fairness reports. 
These reports address the diverse needs and interests of the stakeholders involved in integrating ML into medical practice. The use of FAMEWS has the potential to reveal critical insights that might otherwise remain obscured. This can lead to improved model design, which in turn may translate into enhanced health outcomes.","authors":"Marine Hoche|Olga Mineeva|Manuel Burger|Alessandro Blasimme|Gunnar Ratsch","doi_link":"static/proceedings/2024/hoche24.pdf","session":"B","title":"FAMEWS: a Fairness Auditing tool for Medical Early-Warning Systems"},{"UID":"P022","abstract":"Medical image segmentation typically requires numerous dense annotations in the target domain to train models, which is time-consuming and labor-intensive. To mitigate this burden, unsupervised domain adaptation has been developed to train models with good generalisation performance on the target domain by leveraging a label-rich source domain and the unlabeled target domain data. In this paper, we introduce a novel Dynamic Prototype Contrastive Learning (DPCL) framework for cross-domain medical image segmentation with unlabeled target domains, which dynamically updates crossdomain global prototypes and excavates implicit discrepancy information in a contrastive manner. DPCL enhances the discriminative capability of the segmentation model while learning cross-domain global feature representations. In particular, DPCL introduces a novel crossdomain prototype evolution module through dynamic updating and evolutionary strategies. This module generates evolved cross-domain prototypes, facilitating the progressive transformation from the source domain to the target domain and acquiring global cross-domain guidance knowledge. Moreover, a cross-domain embedding contrastive module is devised to establish contrastive relationships in the embedding space. 
This captures both homogeneous and heterogeneous information within the same category and among different categories, enhancing the discriminative capability of the segmentation model. Experimental results demonstrate that the proposed DPCL is effective and outperforms the state-of-the-art methods.","authors":"Qing En|Yuhong Guo","doi_link":"static/proceedings/2024/en24.pdf","session":"B","title":"Unsupervised Domain Adaptation for Medical Image Segmentation with Dynamic Prototype-based Contrastive Learning"},{"UID":"P036","abstract":"This paper presents FlowCyt, the first comprehensive benchmark for multi-class single-cell classification in flow cytometry data. The dataset comprises bone marrow samples from 30 patients, with each cell characterized by twelve markers. Ground truth labels identify five hematological cell types: T lymphocytes, B lymphocytes, Monocytes, Mast cells, and Hematopoietic Stem/Progenitor Cells (HSPCs). Experiments utilize supervised inductive learning and semi-supervised transductive learning on up to 1 million cells per patient. Baseline methods include Gaussian Mixture Models, XGBoost, Random Forests, Deep Neural Networks, and Graph Neural Networks (GNNs). GNNs demonstrate superior performance by exploiting spatial relationships in graph-encoded data. The benchmark allows standardized evaluation of clinically relevant classification tasks, along with exploratory analyses to gain insights into hematological cell phenotypes. This represents the first public flow cytometry benchmark with a richly annotated, heterogeneous dataset. 
It will empower the development and rigorous assessment of novel methodologies for single-cell analysis.","authors":"Lorenzo Bini|Fatemeh Nassajian Mojarrad|Margarita Liarou|Thomas Matthes|Stephane Marchand-Maillet","doi_link":"static/proceedings/2024/bini24.pdf","session":"B","title":"FlowCyt: A Comparative Study of Deep Learning Approaches for Multi-Class Classification in Flow Cytometry Benchmarking"},{"UID":"P037","abstract":"Patients often face difficulties in understanding their hospitalizations, while healthcare workers have limited resources to provide explanations. In this work, we investigate the potential of large language models to generate patient summaries based on doctors' notes and study the effect of training data on the faithfulness and quality of the generated summaries. To this end, we develop a rigorous labeling protocol for hallucinations, and have two medical experts annotate 100 real-world summaries and 100 generated summaries. We show that fine-tuning on hallucination-free data effectively reduces hallucinations from 2.60 to 1.55 per summary for Llama 2, while preserving relevant information. Although the effect is still present, it is much smaller for GPT-4 when prompted with five examples (0.70 to 0.40). We also conduct a qualitative evaluation using hallucination-free and improved training data. GPT-4 shows very good results even in the zero-shot setting. We find that common quantitative metrics do not correlate well with faithfulness and quality. 
Finally, we test GPT-4 for automatic hallucination detection, which yields promising results.","authors":"Stefan Hegselmann|Zejiang Shen|Florian Gierse|Monica Agrawal|David Sontag|Xiaoyi Jiang","doi_link":"static/proceedings/2024/hegselmann24.pdf","session":"B","title":"A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models"},{"UID":"P045","abstract":"Sleep is crucial for health, and recent advances in wearable technology and machine learning offer promising methods for monitoring sleep outside the clinical setting. However, sleep tracking using wearables is challenging, particularly for those with irregular sleep patterns or sleep disorders. In this study, we introduce a dataset collected from 100 patients from [redacted for anonymity] Sleep Disorders Center who wore the Empatica E4 smartwatch during an overnight sleep study with concurrent clinical-grade polysomnography (PSG) recording. This dataset encompasses diverse demographics and medical conditions. We further introduce a new methodology that addresses the limitations of existing modeling methods when applied on patients with sleep disorders. Namely, we address the inability for existing base models to account for 1) temporal relationships while leveraging relatively small data by introducing a LSTM post-processing method, and 2) group-wise characteristics that impact classification task performance (i.e., random effects) by ensembling mixed-effects boosted tree models. This approach was highly successful for sleep onset and wakefulness detection in this sleep disordered population, achieving an F1 score of 0.823 \u00b1 0.019, an AUROC of 0.926 \u00b1 0.016, and a 0.695 \u00b1 0.025 Cohen's Kappa. 
Overall, we demonstrate the utility of both the data that we collected, as well as our unique approach to address the existing gap in wearable-based sleep tracking in sleep disordered populations.","authors":"Will Ke Wang|Jiamu Yang|Leeor Hershkovich|Hayoung Jeong|Bill Chen|Karnika Singh|Ali R Roghanizad|Md Mobashir Hasan Shandhi|Andrew R Spector|Jessilyn Dunn","doi_link":"static/proceedings/2024/wang24.pdf","session":"B","title":"Addressing wearable sleep tracking inequity: a new dataset and novel methods for a population with sleep disorders"},{"UID":"P048","abstract":"The rapid development of wearable biomedical systems now enables real-time monitoring of electroencephalography (EEG) signals. Acquisition of these signals relies on electrodes. These systems must meet the design challenge of selecting an optimal set of electrodes that balances performance and usability constraints. The search for the optimal subset of electrodes from a larger set is a problem with combinatorial complexity. While existing research has primarily focused on search strategies that only explore limited combinations, our methodology proposes a computationally efficient way to explore all combinations. To avoid the computational burden associated with training the model for each combination, we leverage an innovative approach inspired by few-shot learning. Remarkably, this strategy covers all the wearable electrode combinations while significantly reducing training time compared to retraining the network on each possible combination. In the context of an epileptic seizure detection task, the proposed method achieves an AUC value of 0.917 with configurations using eight electrodes. This performance matches that of prior research but is achieved in significantly less time, transforming a process that would span months into a matter of hours on a single GPU device. 
Our work allows comprehensive exploration of electrode configurations in wearable biomedical device design, yielding insights that enhance performance and real-world feasibility.","authors":"Alireza Amirshahi|Jonathan Dan|Jose Angel Miranda|Amir Aminifar|David Atienza","doi_link":"static/proceedings/2024/amirshahi24.pdf","session":"B","title":"FETCH: A Fast and Efficient Technique for Channel Selection in EEG Wearable Systems"},{"UID":"P050","abstract":"Deep learning models have achieved promising results in breast cancer classification, yet their 'black-box' nature raises interpretability concerns. This research addresses the crucial need to gain insights into the decision-making process of convolutional neural networks (CNNs) for mammogram classification, specifically focusing on the underlying reasons for the CNN's predictions of breast cancer. For CNNs trained on the Mammographic Image Analysis Society (MIAS) dataset, we compared the post-hoc interpretability techniques LIME, Grad-CAM, and Kernel SHAP in terms of explanatory depth and computational efficiency. The results of this analysis indicate that Grad-CAM, in particular, provides comprehensive insights into the behavior of the CNN, revealing distinctive patterns in normal, benign, and malignant breast tissue. We discuss the implications of the current findings for the use of machine learning models and interpretation techniques in clinical practice.","authors":"Ann-Kristin Balve|Peter Hendrix","doi_link":"static/proceedings/2024/balve.pdf","session":"B","title":"Interpretable breast cancer classification using CNNs on mammographic images"},{"UID":"P055","abstract":"In this work, we address the challenge of limited data availability common in healthcare settings by using clinician (ophthalmologist) gaze data on optical coherence tomography (OCT) report images as they diagnose glaucoma, a top cause of irreversible blindness world-wide. 
We directly learn gaze representations with our 'GazeFormerMD' model to generate pseudo-labels using a novel multi-task objective, combining triplet and cross-entropy losses. We use these pseudo-labels for weakly supervised contrastive learning (WSupCon) to detect glaucoma from a partially-labeled dataset of OCT report images. Pseudo-labels from our natural-language-inspired, region-based-encoding GazeFormerMD model, trained using our multi-task objective, enable downstream glaucoma detection accuracy via WSupCon exceeding 91% even with only 70% labeled training data. Furthermore, a model pre-trained with GazeFormerMD-generated pseudo-labels and used for linear evaluation on an unseen OCT-report dataset achieved comparable performance to a fully-supervised, trained-from-scratch model while using only 25% labeled data.","authors":"Wai Tak Lau|Ye Tian|Roshan Kenia|Saanvi Aima|Kaveri A. Thakoor","doi_link":"static/proceedings/2024/lau24.pdf","session":"B","title":"Using Expert Gaze for Self-Supervised and Supervised Contrastive Learning of Glaucoma from OCT Data"},{"UID":"P058","abstract":"This study assesses deep learning models for audio classification in a clinical setting with the constraint of small datasets reflecting real-world prospective data collection. We analyze CNNs, including DenseNet and ConvNeXt, alongside transformer models like ViT, SWIN, and AST, and compare them against pre-trained audio models such as YAMNet and VGGish. Our method highlights the benefits of pre-training on large datasets before fine-tuning on specific clinical data. We prospectively collected two first-of-their-kind patient audio datasets from stroke patients. We investigated various preprocessing techniques, finding that RGB and grayscale spectrogram transformations affect model performance differently based on the priors they learn from pre-training. 
Our findings indicate CNNs can match or exceed transformer models in small dataset contexts, with DenseNet-Contrastive and AST models showing notable performance. This study highlights the significance of incremental marginal gains through model selection, pre-training, and preprocessing in sound classification; this offers valuable insights for clinical diagnostics that rely on audio classification.","authors":"Hamza Mahdi|Eptehal Nashnoush|Rami Saab|Arjun Balachandar|Rishit Dagli|Lucas Perri|Houman Khosravani","doi_link":"static/proceedings/2024/mahdi24.pdf","session":"B","title":"Tuning In: Comparative Analysis of Audio Classifier Performance in Clinical Settings with Limited Data"},{"UID":"P059","abstract":"Event-based models (EBM) provide an important platform for modeling disease progression. This work successfully extends previous EBM approaches to work with larger sets of biomarkers while simultaneously modeling heterogeneity in disease progression trajectories. We develop and validate the s-SuStaIn method for scalable event-based modeling of disease progression subtypes using large numbers of features. s-SuStaIn is typically an order of magnitude faster than its predecessor (SuStaIn). Moreover, we perform a case study with s-SuStaIn using open access cross-sectional Alzheimer's Disease Neuroimaging Initiative (ADNI) data to stage AD patients into four subtypes based on dynamic disease progression. s-SuStaIn shows that the inferred subtypes and stages predict progression to AD among MCI subjects. The subtypes show differences in AD incidence rates and reveal clinically meaningful progression trajectories when mapped to a brain atlas.","authors":"Raghav Tandon|James J Lah|Cassie S. 
Mitchell","doi_link":"static/proceedings/2024/tandon24.pdf","session":"B","title":"s-SuStaIn : Scaling subtype and stage inference via simultaneous clustering of subjects and biomarkers"},{"UID":"P069","abstract":"While the pace of development of AI has rapidly progressed in recent years, the implementation of safe and effective regulatory frameworks has lagged behind. In particular, the adaptive nature of AI models presents unique challenges to regulators as updating a model can improve its performance but also introduce safety risks. In the US, the Food and Drug Administration (FDA) has been a forerunner in regulating and approving hundreds of AI medical devices. To better understand how AI is updated and its regulatory considerations, we systematically analyze the frequency and nature of updates in FDA-approved AI medical devices. We find that less than 2% of all devices have been updated by being re-trained on new data. Meanwhile, nearly a quarter of devices have received updates in the form of new functionality and marketing claims. As an illustrative case study, we analyze pneumothorax detection models and find that while model performance can degrade by as much as 0.18 AUC when evaluated on new sites, re-training on site-specific data can mitigate this performance drop, recovering up to 0.23 AUC. However, we also observed significant degradation on the original site after re-training using data from new sites, highlighting a challenge with the current one-model-fits-all approach to regulatory approvals. Our analysis provides an in-depth look at the current state of FDA-approved AI device updates and insights for future regulatory policies toward model updating and adaptive AI.","authors":"Kevin Wu|Eric Wu|Kit Rodolfa|Daniel E. 
Ho|James Zou","doi_link":"static/proceedings/2024/wu24.pdf","session":"B","title":"Regulating AI Adaptation: An Analysis of AI Medical Device Updates"},{"UID":"P076","abstract":"Unstructured data in Electronic Health Records (EHRs) often contains critical information---complementary to imaging---that could inform radiologists' diagnoses. But the large volume of notes often associated with patients together with time constraints renders manually identifying relevant evidence practically infeasible. In this work we propose and evaluate a zero-shot strategy for using LLMs as a mechanism to efficiently retrieve and summarize unstructured evidence in patient EHR relevant to a given query. Our method entails tasking an LLM to infer whether a patient has, or is at risk of, a particular condition on the basis of associated notes; if so, we ask the model to summarize the supporting evidence. Under expert evaluation, we find that this LLM-based approach provides outputs consistently preferred to a pre-LLM information retrieval baseline. Manual evaluation is expensive, so we also propose and validate a method using an LLM to evaluate (other) LLM outputs for this task, allowing us to scale up evaluation. Our findings indicate the promise of LLMs as interfaces to EHR, but also highlight the outstanding challenge posed by ``hallucinations''. In this setting, however, we show that model confidence in outputs strongly correlates with faithful summaries, offering a practical means to limit confabulations.","authors":"Hiba Ahsan|Denis Jered McInerney|Jisoo Kim|Christopher A Potter|Geoffrey Young|Silvio Amir|Byron C Wallace","doi_link":"static/proceedings/2024/ahsan24.pdf","session":"B","title":"Retrieving Evidence from EHRs with LLMs: Possibilities and Challenges"},{"UID":"P079","abstract":"Approximately two-thirds of survivors of childhood acute lymphoblastic leukemia (ALL) cancer develop late adverse effects post-treatment. 
Prior studies explored prediction models for personalized follow-up, but to date none has integrated neural networks. In this work, we propose the Error Passing Network (EPN), a graph-based method that leverages relationships between samples to propagate residuals and adjust predictions of any machine learning model. We tested our approach to estimate patients' VO$_2$ peak, a reliable indicator of their cardiac health. We used the EPN in conjunction with several baseline models and observed up to $12.16$% improvement in the mean average percentage error compared to the last established equation predicting VO$_2$ peak in childhood ALL survivors. Along with this performance improvement, our final model is more efficient, as it relies only on clinical variables that can be self-reported by patients, removing the need to perform a resource-consuming physical test.","authors":"Nicolas Raymond|Hakima Laribi|Maxime Caru|Mehdi Mitiche|Valerie Marcil|Maja Krajinovic|Daniel Curnier|Daniel Sinnett|Martin Valli\u00e8res","doi_link":"static/proceedings/2024/raymond24.pdf","session":"B","title":"Development of Error Passing Network for Optimizing the Prediction of VO$_2$ peak in Childhood Acute Leukemia Survivors"},{"UID":"P086","abstract":"Large language models (LLMs) are capable of many natural language tasks, yet they are far from perfect. In health applications, grounding and interpreting domain-specific and non-linguistic data is important. This paper investigates the capacity of LLMs to make inferences about health based on contextual information (e.g. user demographics, health knowledge) and physiological data (e.g. resting heart rate, sleep minutes). We present a comprehensive evaluation of 12 publicly accessible state-of-the-art LLMs with prompting and fine-tuning techniques on four public health datasets (PMData, LifeSnaps, GLOBEM and AW\\_FB). 
Our experiments cover 10 consumer health prediction tasks in mental health, activity, metabolic, and sleep assessment. Our fine-tuned model, HealthAlpaca, exhibits comparable performance to much larger models (GPT-3.5, GPT-4 and Gemini-Pro), achieving the best or second best performance in 7 out of 10 tasks. Ablation studies highlight the effectiveness of context enhancement strategies. Notably, we observe that our context enhancement can yield up to 23.8\\% improvement in performance. While constructing contextually rich prompts (combining user context, health knowledge and temporal information) exhibits synergistic improvement, the inclusion of health knowledge context in prompts significantly enhances overall performance.","authors":"Yubin Kim|Xuhai Xu|Daniel McDuff|Cynthia Breazeal|Hae Won Park","doi_link":"static/proceedings/2024/kim24b24.pdf","session":"B","title":"Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data"},{"UID":"P087","abstract":"This study advances Early Event Prediction (EEP) in healthcare through Dynamic Survival Analysis (DSA), offering a novel approach by integrating risk localization into alarm policies to enhance clinical event metrics. By adapting and evaluating DSA models against traditional EEP benchmarks, our research demonstrates their ability to match EEP models on a time-step level and significantly improve event-level metrics through a new alarm prioritization scheme (up to 11% AuPRC difference). 
This approach represents a significant step forward in predictive healthcare, providing a more nuanced and actionable framework for early event prediction and management.","authors":"Hugo Y\u00e8che|Manuel Burger|Dinara Veshchezerova|Gunnar Ratsch","doi_link":"static/proceedings/2024/yeche24.pdf","session":"B","title":"Dynamic Survival Analysis for Early Event Prediction"},{"UID":"P093","abstract":"Clustering can be used in medical imaging research to identify different domains within a specific dataset, aiding in a better understanding of subgroups or strata that may not have been annotated. Moreover, in digital pathology, clustering can be used to effectively sample image patches from whole slide images (WSI). In this work, we conduct a comparative analysis of three deep clustering algorithms -- a simple two-step approach applying K-means onto a learned feature space, an end-to-end deep clustering method (DEC), and a Graph Convolutional Network (GCN) based method -- in application to a digital pathology dataset of endometrial biopsy WSIs. For consistency, all methods use the same Autoencoder (AE) architecture backbone that extracts features from image patches. The GCN-based model, specifically, stands out as a deep clustering algorithm that considers spatial contextual information in predicting clusters. Our study highlights the computation of graphs for WSIs and emphasizes the impact of these graphs on the formation of clusters. 
The main finding of our research indicates that GCN-based deep clustering demonstrates heightened spatial awareness compared to the other methods, resulting in higher cluster agreement with previous clinical annotations of WSIs.","authors":"Mariia Sidulova|Seyed Kahaki|Ian Hagemann|Alexej Gossmann","doi_link":"static/proceedings/2024/sidulova24.pdf","session":"B","title":"Contextual unsupervised deep clustering in digital pathology"},{"UID":"P094","abstract":"Non-adherence to medication is a complex behavioral issue that costs hundreds of billions of dollars annually in the United States alone. Existing solutions to improve medication adherence are limited in their effectiveness and require significant user involvement. To address this, a minimally invasive mobile health system called DoseMate is proposed, which can provide quantifiable adherence data and imposes minimal user burden. To classify a motion time-series that defines pill-taking, we adopt transfer-learning and data augmentation based techniques that use captured pill-taking gestures along with other open datasets that represent negative labels of other wrist motions. The paper also provides a design methodology that generalizes to other systems and describes a first-of-its-kind, in-the-wild, unobtrusively obtained dataset that contains unrestricted pill-related motion data from a diverse set of users.","authors":"Antoine Nzeyimana|Anthony Campbell|James M Scanlan|Joanne D Stekler|Jenna Marquard|Barry G Saver|Jeremy Gummeson","doi_link":"static/proceedings/2024/nzeyimana24.pdf","session":"B","title":"DoseMate: A Real-world Evaluation of Machine Learning Classification of Pill Taking Using Wrist-worn Motion Sensors"},{"UID":"P119","abstract":"Fatigue is one of the most prevalent symptoms of chronic diseases, such as Multiple Sclerosis, Alzheimer\u2019s, and Parkinson\u2019s. Recently, researchers have explored unobtrusive and continuous ways of fatigue monitoring using mobile and wearable devices. 
However, data quality and limited labeled data availability in the wearable health domain pose significant challenges to progress in the field. In this work, we perform a systematic evaluation of self-supervised learning (SSL) tasks for fatigue recognition using wearable sensor data. To establish our benchmark, we use Homekit2020, which is a large-scale dataset collected using Fitbit devices in everyday life settings. Our results show that the majority of the SSL tasks outperform fully supervised baselines for fatigue recognition, even in limited labeled data scenarios. In particular, the domain features and multi-task learning achieve 0.7371 and 0.7323 AUROC, which are higher than the other SSL tasks and supervised learning baselines. In most of the pre-training tasks, the performance is higher when using at least one data augmentation that reflects the potentially low quality of wearable data (e.g., missing data). Our findings open up promising opportunities for continuous assessment of fatigue in real settings and can be used to guide the design and development of health monitoring systems.","authors":"Tam\u00e1s Visy|Rita Kuznetsova|Christian Holz|Shkurta Gashi","doi_link":"static/proceedings/2024/visy24.pdf","session":"B","title":"Systematic Evaluation of Self-Supervised Learning Approaches for Wearable-Based Fatigue Recognition"},{"UID":"P031","abstract":"Promoting healthy lifestyle behaviors remains a major public health concern, particularly due to their crucial role in preventing chronic conditions such as cancer, heart disease, and type 2 diabetes. Mobile health applications present a promising avenue for low-cost, scalable health behavior change promotion. Researchers are increasingly exploring adaptive algorithms that personalize interventions to each person's unique context. However, in empirical studies, mobile health applications often suffer from small effect sizes and low adherence rates, particularly in comparison to human coaching. 
Tailoring advice to a person's unique goals, preferences, and life circumstances is a critical component of health coaching that has been underutilized in adaptive algorithms for mobile health interventions. To address this, we introduce a new Thompson sampling algorithm that can accommodate personalized reward functions (i.e., goals, preferences, and constraints), while also leveraging data sharing across individuals to provide effective recommendations more quickly. We prove that our modification incurs only a constant penalty on cumulative regret while preserving the sample complexity benefits of data sharing. We present empirical results on synthetic and semi-synthetic physical activity simulators, where in the latter we conducted an online survey to solicit preference data relating to physical activity, which we use to construct realistic reward models that leverage historical data from another study. Our algorithm achieves substantial performance improvements compared to baselines that do not share data or do not optimize for individualized rewards.","authors":"Aishwarya Mandyam|Matthew J\u00f6rke|William Denton|Barbara E Engelhardt|Emma Brunskill","doi_link":"static/proceedings/2024/mandyam24.pdf","session":"C","title":"Adaptive Interventions with User-Defined Goals for Health Behavior Change"},{"UID":"P090","abstract":"Changing clinical algorithms to remove race adjustment has been proposed and implemented for multiple health conditions. Removing race adjustment from estimated glomerular filtration rate (eGFR) equations may reduce disparities in chronic kidney disease (CKD), but has not been studied in clinical practice after implementation. Here, we assessed whether implementing an eGFR equation (CKD-EPI 2021) without adjustment for Black or African American race modified quarterly rates of nephrology referrals and visits within a single healthcare system, Stanford Health Care (SHC). 
Our cohort study analyzed 547,194 adult patients aged 21 and older who had at least one recorded serum creatinine or serum cystatin C between January 1, 2019 and September 1, 2023. During the study period, implementation of CKD-EPI 2021 did not modify rates of quarterly nephrology referrals in those documented as Black or African American or in the overall cohort. After adjusting for capacity at SHC nephrology clinics, estimated rates of nephrology referrals and visits with CKD-EPI 2021 were 34 [95\\% CI 29, 39] and 188 [175, 201] per 10,000 patients documented as Black or African American. If race adjustment had not been removed, estimated rates were nearly identical: 38 [95\\% CI: 28, 53] and 189 [165, 218] per 10,000 patients. Changes to the eGFR equation are likely insufficient to achieve health equity in CKD care decision-making as many other structural inequities remain.","authors":"Marika M. Cusick|Glenn M. Chertow|Douglas K. Owens|Michelle Y. Williams|Sherri Rose","doi_link":"static/proceedings/2024/cusick24.pdf","session":"C","title":"Algorithmic changes are not enough: Evaluating the removal of race adjustment from the eGFR equation"},{"UID":"P097","abstract":"Large-scale wearable datasets are increasingly being used for biomedical research and to develop machine learning (ML) models for longitudinal health monitoring applications. However, it is largely unknown whether biases in these datasets lead to findings that do not generalize. Here, we present the first comparison of the data underlying multiple longitudinal, wearable-device-based datasets. We examine participant-level resting heart rate (HR) from four studies, each with thousands of wearable device users. We demonstrate that multiple regression, a community standard statistical approach, leads to conflicting conclusions about important demographic variables (age vs resting HR) and significant intra- and inter-dataset differences in HR. 
We then directly test the cross-dataset generalizability of a commonly used ML model trained for three existing day-level monitoring tasks: prediction of testing positive for a respiratory virus, flu symptoms, and fever symptoms. Regardless of task, most models showed relative performance loss on external datasets; most of this performance change can be attributed to concept shift between datasets. These findings suggest that research using large-scale, pre-existing wearable datasets might face bias and generalizability challenges similar to research in more established biomedical and ML disciplines. We hope that the findings from this study will encourage discussion in the wearable-ML community around standards that anticipate and account for challenges in dataset bias and model generalizability.","authors":"Patrick Kasl|Severine Soltani|Lauryn Keeler Bruce|Varun Kumar Viswanath|Wendy Hartogensis|Amarnath Gupta|Ilkay Altintas|Stephan Dilchert|Frederick M. Hecht|Ashley Mason|Benjamin L. Smarr","doi_link":"static/proceedings/2024/kasl24.pdf","session":"C","title":"A cross-study analysis of wearable datasets and the generalizability of acute illness monitoring models"},{"UID":"P120","abstract":"Modern kidney placement incorporates several intelligent recommendation systems which exhibit social discrimination due to biases inherited from training data. Although initial attempts were made in the literature to study algorithmic fairness in kidney placement, these methods replace true outcomes with surgeons' decisions due to the long delays involved in recording such outcomes reliably. However, the replacement of true outcomes with surgeons' decisions disregards expert stakeholders' biases as well as social opinions of other stakeholders who do not possess medical expertise. 
This paper alleviates the latter concern and designs a novel fairness feedback survey to evaluate an acceptance rate predictor (ARP) that predicts a kidney's acceptance rate in a given kidney-match pair. The survey is launched on Prolific, a crowdsourcing platform, and public opinions are collected from 85 anonymous crowd participants. A novel social fairness preference learning algorithm is proposed based on minimizing social feedback regret computed using a novel logit-based fairness feedback model. The proposed model and learning algorithm are both validated using simulation experiments as well as Prolific data. Public preferences towards group fairness notions in the context of kidney placement have been estimated and discussed in detail. The specific ARP tested in the Prolific survey has been deemed fair by the participants.","authors":"Mukund Telukunta|Sukruth Rao|Gabriella Stickney|Venkata Sriram Siddhardh Nadendla|Casey Canfield","doi_link":"static/proceedings/2024/telukunta24.pdf","session":"C","title":"Learning Social Fairness Preferences from Non-Expert Stakeholder Opinions in Kidney Placement"}]
diff --git a/serve_roundtables.json b/serve_roundtables.json
new file mode 100644
index 000000000..68c794c29
--- /dev/null
+++ b/serve_roundtables.json
@@ -0,0 +1 @@
+[{"UID":"R01","abstract":"Value-Based Care (VBC) is getting its momentum. The Centers for Medicare and Medicaid Services (CMS) is pushing to have all Medicare fee-for-service beneficiaries under a care relationship with accountability for quality and total cost of care by 2030. However, the business of VBC is more complex and is different from other businesses as it needs to satisfy three-part aims simultaneously; they are 1) better care for individuals, 2) better health for populations, and 3) lower cost. Meeting all three aims is challenging, and the details and implications of these aims are not well-known for healthcare machine learning researchers. Therefore, we want to pick a few papers from this and past years' CHIL proceedings. Then, we would like to brainstorm and discuss how those ideas in the papers can be deployed in practice, what are the barriers to the deployment/sales, what are the hidden or visible incentives for adopting such ideas, how the government and policymakers should incentivize to achieve the three-part aims of CMS while encouraging the adoption of such technologies.","authors":"Yubin Park","bio":"Yubin Park, Ph.D., is Chief Data and\u00a0Analytics Officer at Apollo Medical Holdings, Inc. (ApolloMed, NASDAQ: AMEH). He oversees value-based care analytics, remote patient monitoring, and partnerships with third-party data vendors in his current position. Yubin started his career by founding a healthcare analytics start-up after obtaining his Ph.D. degree in Machine Learning at the University of Texas at Austin in 2014. His first start-up, Accordion Health, provided an AI-driven Risk Adjustment and Quality analytics platform to Medicare Advantage plans. In 2017, Evolent Health (NYSE: EVH) acquired his company, and there, he led various clinical transformation/innovation projects. In 2020, he then founded his second start-up, Orma Health. 
The company built a virtual care and analytics platform for payers and providers in value-based care, e.g., Direct Contracting Entities and Accountable Care Organizations. At Orma, he worked with risk-bearing primary care and specialty groups of many sizes, helping them connect with patients through virtual care technologies. ApolloMed acquired Orma Health in 2022.","image":"static/images/speakers/yubin_park.png","rocketchat_id":"","slideslive_active_date":"2023-03-28T:23:59:00.00","slideslive_id":"","title":"Bridging the gap between the business of value-based care and the research of health AI"},{"UID":"R02","abstract":"Machine learning algorithms should be easy to evaluate for performance and equity: they generate quantitative predictions that can be compared to their intended target, both in the general population and in under-served groups. But the scarcity of data means that, for most algorithms, we have no idea how they perform, and how much bias they contain. Concretely, there is no way for algorithm developers or potential users to answer the simple question: does this algorithm do what it\u2019s supposed to do? This roundtable will focus on the opportunities and challenges of auditing algorithm performance and equity.","authors":"Alistair Johnson","bio":"Dr. Johnson is a Scientist at the Hospital for Sick Children. He received his Bachelor of Biomedical and Electrical Engineering at McMaster University and successfully read for a DPhil at the University of Oxford. Dr. Johnson is most well-known for his work on the MIMIC-III Clinical database, a publicly available critical care database used by over 30,000 researchers around the world. 
His research focuses on the development of new database structures tailored for healthcare and machine learning algorithms for natural language processing, particularly focusing on the deidentification of free-text clinical notes.","image":"static/images/speakers/alistair_johnson.jpg","rocketchat_id":"","slideslive_active_date":"2023-03-28T:23:59:00.00","slideslive_id":"","title":"Auditing Algorithm Performance and Equity"}]
diff --git a/serve_schedule.json b/serve_schedule.json
new file mode 100644
index 000000000..c09c3715f
--- /dev/null
+++ b/serve_schedule.json
@@ -0,0 +1 @@
+{"friday":[{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Joyce Ho","time":"12:00 - 12:10 PM","title":"Opening Remarks ","type":"Opening"},{"doclink":null,"link":null,"linkactive":false,"linktext":"Zoom","moderator":"Harvineet Singh","speaker":"Walter Dempsey","time":"12:10 - 1:10 PM","title":"Challenges in developing online learning and experimentation algorithms in digital health","type":"Tutorial"},{"doclink":null,"link":null,"linkactive":false,"linktext":"Zoom","moderator":"Jesse Gronsbell","speaker":"Rui Duan","time":"12:10 - 1:10 PM","title":"Distributed statistical learning and inference with electronic health records data","type":"Tutorial"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Daeyoung Kim","time":"1:10 PM","title":"Uncertainty-aware text-to-program for question answering on structured electronic health records","type":"Spotlight 66"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Addison Weatherhead","time":"1:17 PM","title":"Learning unsupervised representations for ICU timeseries","type":"Spotlight 67"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Sana Tonekaboni","time":"1:24 PM","title":"How to validate machine learning models prior to deployment: Silent trial protocol for evaluation of real-time models at the ICU","type":"Spotlight 62"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Kyunghoon Hur","time":"1:31 PM","title":"Unifying heterogeneous electronic health records systems via text-based code embedding","type":"Spotlight 71"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Haoran Zhang","time":"1:38 PM","title":"Improving the fairness of chest X-ray classifiers","type":"Spotlight 
79"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Juyong Kim","time":"1:45 PM","title":"Context-sensitive spelling correction of clinical text with conditional independence model","type":"Spotlight 81"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Ankit Pal","time":"1:52 PM","title":"MedMCQA : A large-scale multi-subject multi-choice dataset for medical domain question answering","type":"Spotlight 4"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Sungjin Park","time":"1:59 PM","title":"Graph-text multi-modal pre-training for medical representation learning","type":"Spotlight 11"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Danielle Belgrave","time":"2:10 - 2:30 PM","title":"Understanding heterogeneity as a route to understanding health","type":"Keynote"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Jessica Tenenbaum","time":"2:30 - 2:50 PM","title":"Machine learning in public health: Are we there yet?","type":"Keynote"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"Sherri Rose","speaker":null,"time":"2:50 - 3:10 PM","title":"Panel Q&A with Danielle Belgrave and Jessica Tenenbaum","type":"Q&A"},{"doclink":null,"link":null,"linkactive":false,"linktext":null,"moderator":"--","speaker":null,"time":"3:10 PM","title":"Sponsored Break","type":"Break"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Aniruddh Raghu","time":"3:20 PM","title":"Data augmentation for electrocardiograms","type":"Spotlight 31"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Kwanhyung Lee","time":"3:27 PM","title":"Real-time 
seizure detection using EEG: A comprehensive comparison of recent approaches under a realistic setting","type":"Spotlight 58"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Jungwoo Oh","time":"3:34 PM","title":"Lead-agnostic self-supervised learning for local and global representations of electrocardiogram","type":"Spotlight 69"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Jiacheng Zhu","time":"3:41 PM","title":"PhysioMTL: Personalizing physiological patterns using optimal transport multi-task regression","type":"Spotlight 54"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Subhrajit Roy","time":"3:48 PM","title":"Disability prediction in multiple sclerosis using performance outcome measures and demographic data","type":"Spotlight 34"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Ramtin Keramati","time":"3:55 PM","title":"Identification of subgroups with similar benefits in off-policy policy evaluation","type":"Spotlight 45"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Shadi Rahimian","time":"4:02 PM","title":"Practical challenges in differentially-private federated survival analysis of medical data","type":"Spotlight 68"},{"doclink":null,"link":null,"linkactive":false,"linktext":"Gather","moderator":"--","speaker":null,"time":"4:10 PM","title":"Break ","type":"Break"},{"doclink":null,"link":null,"linkactive":false,"linktext":"Gather","moderator":"Stephanie Hyland","speaker":null,"time":"4:20 - 5:20 PM","title":"Poster Session","type":"Poster"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Tristan Naumann","time":"5:20 - 5:30 PM","title":"Closing 
Remarks","type":"Closing"}],"thursday":[{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Tristan Naumann","time":"12:00 - 12:10 PM","title":"Welcome and Opening Remarks","type":"Opening"},{"doclink":null,"link":null,"linkactive":false,"linktext":"Zoom","moderator":"Bobak Mortazavi","speaker":"Yindalon Aphinyanaphongs","time":"12:10 - 1:10 PM","title":"Changing patient trajectory: A case study exploring implementation and deployment of clinical machine learning models","type":"Tutorial"},{"doclink":null,"link":null,"linkactive":false,"linktext":"Zoom","moderator":"Harvineet Singh","speaker":"Dhanya Sridhar","time":"12:10 - 1:10 PM","title":"Causal Inference from text data","type":"Tutorial"},{"doclink":null,"link":null,"linkactive":false,"linktext":"Zoom","moderator":"Jesse Gronsbell","speaker":"Anamaria Crisan","time":"12:10 - 1:10 PM","title":"'Are log scales endemic yet?' Strategies for visualizing biomedical and public health data","type":"Tutorial"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Rumi Chunara","time":"1:10 - 1:30 PM","title":"Algorithmic fairness and the science of health disparities","type":"Keynote"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Nuria Oliver","time":"1:30 - 1:50 PM","title":"Data science against COVID-19","type":"Keynote"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"Matthew McDermott","speaker":null,"time":"1:50 - 2:10 PM","title":"Panel Q&A with Rumi Chunara and Nuria Oliver","type":"Q&A"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Taylor Killian","time":"2:10 PM","title":"Counterfactually guided policy transfer in clinical settings","type":"Spotlight 
06"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Florian Pfiseterer","time":"2:17 PM","title":"Evaluating domain generalization for survival analysis in clinical studies","type":"Spotlight 08"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Tal El Hay","time":"2:24 PM","title":"Estimating model performance on external datasets from their limited statistical characteristics","type":"Spotlight 29"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Xiaolei Huang","time":"2:31 PM","title":"Enriching unsupervised user embedding via medical concepts","type":"Spotlight 10"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Zhixuan Chu","time":"2:38 PM","title":"Multi-task adversarial learning for treatment effect estimation in basket trials","type":"Spotlight 26"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Vincent Jeanselme","time":"2:45 PM","title":"Neural Survival Clustering: Non parametric mixture of neural networks for survival clustering","type":"Spotlight 28"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Newton Mwai Kinyanjui","time":"2:52 PM","title":"ADCB: An Alzheimer's disease simulator for benchmarking observational estimators of causal effects","type":"Spotlight 30"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Mehdi Fatemi","time":"2:59 PM","title":"Semi-Markov offline reinforcement learning for healthcare","type":"Spotlight 41"},{"doclink":null,"link":null,"linkactive":false,"linktext":"Gather","moderator":"--","speaker":null,"time":"3:05 PM","title":"Sponsored 
Break","type":"Break"},{"doclink":null,"link":null,"linkactive":false,"linktext":"Gather","moderator":"Bobak Mortazavi","speaker":"Rosa Arriaga","time":"3:30 - 4:30 PM","title":"Human centered AI for health and wellness","type":"Research Roundtable"},{"doclink":null,"link":null,"linkactive":false,"linktext":"Gather","moderator":"Stephen Pfohl","speaker":"Leo Celi","time":"3:30 - 4:30 PM","title":"Responsible AI for health","type":"Research Roundtable"},{"doclink":null,"link":null,"linkactive":false,"linktext":"Gather","moderator":"Sanja \u0160\u0107epanovi\u0107","speaker":"Esra Suel","time":"3:30 - 4:30 PM","title":"Social and environmental determinants of health","type":"Research Roundtable"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Lorin Crawford","time":"4:30 - 4:50 PM","title":"Machine learning for human genetics: A multi-scale view on complex traits and disease","type":"Keynote"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"--","speaker":"Jure Leskovec","time":"4:50 - 5:10 PM","title":"Reducing bias in machine learning systems: Understanding drivers of pain","type":"Keynote"},{"doclink":null,"link":"/live.html","linkactive":false,"linktext":"Livestream","moderator":"Rahul Krishnan","speaker":null,"time":"5:10 - 5:30 PM","title":"Panel Q&A with Lorin Crawford and Jure Leskovec","type":"Q&A"}]}
diff --git a/serve_speakers.json b/serve_speakers.json
new file mode 100644
index 000000000..b6f066249
--- /dev/null
+++ b/serve_speakers.json
@@ -0,0 +1 @@
+[{"UID":"S01","abstract":"","bio":"Hamsa Bastani is an Associate Professor of Operations, Information, and Decisions at the Wharton School, University of Pennsylvania. Her research focuses on developing novel machine learning algorithms for data-driven decision-making, with applications to healthcare operations, social good, and revenue management. Her work has received several recognitions, including the Wagner Prize for Excellence in Practice (2021), the Pierskalla Award for the best paper in healthcare (2016, 2019, 2021), the Behavioral OM Best Paper Award (2021), as well as first place in the George Nicholson and MSOM student paper competitions (2016). She previously completed her PhD at Stanford University, and spent a year as a Herman Goldstine postdoctoral fellow at IBM Research.","image":"static/images/speakers/hamsa-bastani.jpeg","institution":"University of Pennsylvania","slideslive_active_date":"","slideslive_id":"","speaker":"Hamsa Bastani","title":""},{"UID":"S02","abstract":"","bio":"Samantha Kleinberg is an Associate Professor in the Computer Science department at Stevens Institute of Technology. After completing her PhD in Computer Science in 2010 at NYU, she spent two years as a postdoctoral Computing Innovation Fellow at Columbia University, in the Department of Biomedical Informatics. Before that she was an undergraduate at NYU in Computer Science and Physics, and more recently spent a year on sabbatical in the psychology department of University College London. Dr. Kleinberg has written an academic book, Causality, Probability, and Time, and another for a wider audience, Why: A Guide To Finding and Using Causes. 
She is the editor of Time and Causality Across the Sciences.","image":"static/images/speakers/samantha_kleinberg.jpg","institution":"Stevens Institute of Technology","slideslive_active_date":"","slideslive_id":"","speaker":"Samantha Kleinberg","title":""},{"UID":"S03","abstract":"","bio":"Deborah Raji is a Mozilla fellow and CS PhD student at University of California, Berkeley, who is interested in questions on algorithmic auditing and evaluation. In the past, she worked closely with the Algorithmic Justice League initiative to highlight bias in deployed AI products. She has also worked with Google\u02bcs Ethical AI team and been a research fellow at the Partnership on AI and AI Now Institute at New York University working on various projects to operationalize ethical considerations in ML engineering practice. Recently, she was named to Forbes 30 Under 30 and MIT Tech Review 35 Under 35 Innovators.","image":"static/images/speakers/deb_raji.jpg","institution":"University of California, Berkeley","slideslive_active_date":"","slideslive_id":"","speaker":"Deborah Raji","title":""},{"UID":"S04","abstract":"","bio":"Sanmi (Oluwasanmi) Koyejo is an Assistant Professor in the Department of Computer Science at Stanford University. Koyejo was previously an Associate Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. Koyejo's research interests are in developing the principles and practice of trustworthy machine learning, focusing on applications to neuroscience and healthcare. Koyejo completed a Ph.D. at the University of Texas at Austin, and postdoctoral research at Stanford University. 
Koyejo has been the recipient of several awards, including a best paper award from the Conference on Uncertainty in Artificial Intelligence, a Skip Ellis Early Career Award, a Sloan Fellowship, a Terman faculty fellowship, an NSF CAREER award, a Kavli Fellowship, an IJCAI early career spotlight, and a trainee award from the Organization for Human Brain Mapping. Koyejo spends time at Google as a part of the Brain team, serves on the Neural Information Processing Systems Foundation Board, the Association for Health Learning and Inference Board, and as president of the Black in AI organization.","image":"static/images/speakers/sanmi_koyejo.jpg","institution":"Stanford University","slideslive_active_date":"","slideslive_id":"","speaker":"Sanmi (Oluwasanmi) Koyejo","title":""},{"UID":"S05","abstract":"","bio":"Nils Gehlenborg is an Associate Professor of Biomedical Informatics at Harvard Medical School. The goal of Gehlenborg\u2019s research is to improve human health by developing computational techniques and interfaces that enable scientists and clinicians to efficiently interact with biomedical data. He received his PhD from the University of Cambridge and was a predoctoral fellow at the European Bioinformatics Institute (EMBL-EBI). Gehlenborg is a co-founder and former general chair of BioVis, the Symposium on Biological Data Visualization, and co-founder of VIZBI, the annual workshop on Visualizing Biological Data. Occasionally, he contributes to the \u201cPoints of View\u201d data visualization column in Nature Methods.","image":"static/images/speakers/nils_gehlenborg.jpeg","institution":"Harvard Medical School","slideslive_active_date":"","slideslive_id":"","speaker":"Nils Gehlenborg","title":""},{"UID":"S06","abstract":"","bio":"No\u00e9mie Elhadad is Associate Professor and Chair of the Department of Biomedical Informatics at Columbia University Vagelos College of Physicians and Surgeons. 
She is affiliated with Columbia\u2019s Department of Computer Science and the Columbia Data Science Institute. Dr. Elhadad\u2019s research lies at the intersection of artificial intelligence, human-centered computing, and medicine, with a focus on developing novel machine-learning methods. She creates methods and tools to support patients and clinicians in their information needs, with particular focus on ensuring that AI systems of the future are fair and just. She obtained her PhD in 2006 in Computer Science, focusing on multi-document, patient-specific text summarization of the clinical literature. She was on the Computer Science faculty at The City College of New York and the CUNY graduate center starting in 2006 before joining the Department of Biomedical Informatics at Columbia in 2007. Dr. Elhadad served as Chair of the Health Analytics Center at the Columbia Data Science Institute from 2013 to 2016.","image":"static/images/speakers/noemie_elhadad.jpeg","institution":"Columbia University","slideslive_active_date":"","slideslive_id":"","speaker":"No\u00e9mie Elhadad","title":""}]
diff --git a/serve_sponsors.json b/serve_sponsors.json
new file mode 100644
index 000000000..8b1b7108a
--- /dev/null
+++ b/serve_sponsors.json
@@ -0,0 +1 @@
+[{"UID":"S01","image":"static/images/sponsors/moore-logo.jpg","level":"gold","name":"Gordon and Betty Moore Foundation","website":""},{"UID":"S02","image":"static/images/sponsors/CTSI-logo.png","level":"gold","name":"UFlorida CTSI Department of Biomedical Informatics","website":""},{"UID":"S03","image":"static/images/sponsors/apple-logo.jpg","level":"silver","name":"Apple","website":""},{"UID":"S04","image":"static/images/sponsors/genentech-logo.png","level":"silver","name":"Genentech","website":"https://www.gene.com/scientists/our-scientists/prescient-design"},{"UID":"S05","image":"static/images/sponsors/google-logo.png","level":"silver","name":"Google","website":"https://health.google/health-research/"},{"UID":"S06","image":"static/images/sponsors/mount-sinai.png","level":"silver","name":"The Mount Sinai Hospital","website":""},{"UID":"S07","image":"static/images/sponsors/CPH_UCSF_logo.png","level":"silver","name":"Computational Precision Health Program at UCSF / UC Berkeley","website":""},{"UID":"S08","image":"static/images/sponsors/uflorida-health.gif","level":"silver","name":"University of Florida Health","website":""},{"UID":"S09","image":"static/images/sponsors/UPenn-Chase-Center-logo.png","level":"silver","name":"Chase Center at University of Pennsylvania","website":"https://chase.med.upenn.edu/"},{"UID":"S10","image":"static/images/sponsors/Columbia-logo.png","level":"bronze","name":"Columbia University (Biostats Dept.)","website":""},{"UID":"S11","image":"static/images/sponsors/Health-Data-Science-logo.jpg","level":"bronze","name":"Health Data Science","website":""},{"UID":"S12","image":"static/images/sponsors/UMinn_CHS_logo.png","level":"bronze","name":"Department of Surgery at University of Minnesota","website":"https://med.umn.edu/surgery/divisions/computational-health-sciences"}]
diff --git a/serve_symposiums.json b/serve_symposiums.json
new file mode 100644
index 000000000..fe51488c7
--- /dev/null
+++ b/serve_symposiums.json
@@ -0,0 +1 @@
+[]
diff --git a/serve_tutorials.json b/serve_tutorials.json
new file mode 100644
index 000000000..25a2ac904
--- /dev/null
+++ b/serve_tutorials.json
@@ -0,0 +1 @@
+[{"UID":"T01","abstract":"","authors":"Katie Link","bio":"Katie Link is the Healthcare Solutions Product Manager at NVIDIA, where she helps enable healthcare companies and researchers to solve real-world healthcare challenges with large language models (LLMs) and other advanced technologies. Prior to NVIDIA, she led healthcare and life sciences applications of artificial intelligence as a Machine Learning Engineer at Hugging Face, an open source AI startup. She is currently based in New York City and is on leave as a medical student at the Icahn School of Medicine at Mount Sinai. While in medical school, she led artificial intelligence research at NYU Langone Hospital, creating the largest open dataset of magnetic resonance imaging (MRI) for brain metastases and developing novel deep learning algorithms for tracking cancer progression. In her spare time, she also works on AI education initiatives for medical trainees and physicians. Prior to medical school, she was an AI Resident at Google X. She holds a bachelor\u2019s degree in Neuroscience with a minor in Computer Science from Johns Hopkins University.","image":"static/images/speakers/katie_link.jpeg","rocketchat_id":"","slideslive_active_date":"2022-03-28T23:59:00.00","slideslive_id":"","title":""}]
diff --git a/serve_workshops.json b/serve_workshops.json
new file mode 100644
index 000000000..b0f7afcee
--- /dev/null
+++ b/serve_workshops.json
@@ -0,0 +1 @@
+[{"UID":"WS01","abstract":"In many real-world environments, the details of decision-making processes are not fully known, e.g., how oncologists decide on specific radiation therapy treatment plans for cancer patients, how clinicians decide on medication dosages for different patients, or how hypertension patients choose their diet to control their illness. While conventional machine learning and statistical methods can be used to better understand such processes, they often fail to provide meaningful insights into the unknown parameters when the problem's setting is heavily constrained. Similarly, conventional constrained inference models, such as inverse optimization, are not well equipped for data-driven problems. In this study, we develop a novel methodology (called MLIO) that combines machine learning and inverse optimization techniques to recover the utility functions of a black-box decision-making process. Our method can be applied to settings where different types of data are required to capture the problem. MLIO is specifically developed with data-intensive medical decision-making environments in mind. We evaluate our approach in the context of personalized diet recommendations for patients, building on a large dataset of historical daily food intakes of patients from NHANES. MLIO considers these prior dietary behaviors in addition to complementary data (e.g., demographics and preexisting conditions) to recover the underlying criteria that the patients had in mind when deciding on their food choices. 
Once the underlying criteria are known, an optimization model can be used to find personalized diet recommendations that adhere to patients' behavior while meeting all required dietary constraints.","authors":"Farzin Ahmadi, Tinglong Dai, and Kimia Ghobadi (Johns Hopkins University)","title":"Emulating Human Decision-Making Under Multiple Constraints"},{"UID":"WS02","abstract":"Deep neural networks have increasingly been used as an auxiliary tool in healthcare applications, due to their ability to improve performance of several diagnosis tasks. However, these methods are not widely adopted in clinical settings due to the practical limitations in the reliability, generalizability, and interpretability of deep learning based systems. As a result, methods have been developed that impose additional constraints during network training to gain more control as well as improve interpretability, facilitating their acceptance in the healthcare community. In this work, we investigate the benefit of using the Orthogonal Spheres (OS) constraint for classification of COVID-19 cases from chest X-ray images. The OS constraint can be written as a simple orthonormality term which is used in conjunction with the standard cross-entropy loss during classification network training. Previous studies have demonstrated significant benefits in applying such constraints to deep learning models. Our findings corroborate these observations, indicating that the orthonormality loss function effectively produces improved semantic localization via GradCAM visualizations, enhanced classification performance, and reduced model calibration error. Our approach achieves an improvement in accuracy of 1.6% and 4.8% for two- and three-class classification, respectively; similar results are found for models with data augmentation applied. 
In addition to these findings, our work also presents a new application of the OS regularizer in healthcare, increasing the post-hoc interpretability and performance of deep learning models for COVID-19 classification to facilitate adoption of these methods in clinical settings. We also identify the limitations of our strategy that can be explored in future research.","authors":"Ella Y. Wang (BASIS Chandler); Anirudh Som (SRI International); Ankita Shukla, Hongjun Choi, and Pavan Turaga (ASU)","title":"Interpretable COVID-19 Chest X-Ray Classification via Orthogonality Constraint"},{"UID":"WS03","abstract":"Meta-analysis is a systematic approach for understanding a phenomenon by analyzing the results of many previously published experimental studies related to the same treatment and outcome measurement. It is an important tool for medical researchers and clinicians to derive reliable conclusions regarding the overall effect of treatments and interventions (e.g., drugs) on a certain outcome (e.g., the severity of a disease). Unfortunately, conventional meta-analysis involves great human effort, i.e., it is constructed by hand and is extremely time-consuming and labor-intensive, rendering a process that is inefficient in practice and vulnerable to human bias. To overcome these challenges, we work toward automating meta-analysis with a focus on controlling for the potential biases. Automating meta-analysis consists of two major steps: (1) extracting information from scientific publications written in natural language, which is different and noisier than what humans typically extract when conducting a meta-analysis; and (2) modeling meta-analysis, from a novel causal-inference perspective, to control for the potential biases and summarize the treatment effect from the outputs of the first step. Since sufficient prior work exists for the first step, this study focuses on the second step. 
The core contribution of this work is a multiple causal inference algorithm tailored to the potentially noisy and biased information automatically extracted by current natural language processing systems. Empirical evaluations on both synthetic and semi-synthetic data show that the proposed approach for automated meta-analysis yields high-quality performance.","authors":"Lu Cheng (Arizona State University); Dmitriy Katz-Rogozhnikov, Kush R. Varshney, and Ioana Baldini (IBM Research)","title":"Automated Meta-Analysis in Medical Research: A Causal Learning Perspective"},{"UID":"WS04","abstract":"Attention is a powerful concept in computer vision. End-to-end networks that learn to focus selectively on regions of an image or video often perform strongly. However, other image regions, while not necessarily containing the signal of interest, may contain useful context. We present an approach that exploits the idea that statistics of noise may be shared between the regions that contain the signal of interest and those that do not. Our technique uses the inverse of an attention mask to generate a noise estimate that is then used to denoise temporal observations. We apply this to the task of camera-based physiological measurement. A convolutional attention network is used to learn which regions of a video contain the physiological signal and generate a preliminary estimate. A noise estimate is obtained by using the pixel intensities in the inverse regions of the learned attention mask, which in turn is used to refine the estimate of the physiological signal. 
We perform experiments on two large benchmark datasets and show that this approach produces state-of-the-art results, increasing the signal-to-noise ratio by up to 5.8 dB, reducing heart rate and breathing rate estimation error by as much as 30%, recovering subtle waveform dynamics, and generalizing from RGB to NIR videos without retraining.","authors":"Ewa Nowara (RICE UNIVERSITY); Daniel McDuff (Microsoft Research); Ashok Veeraraghavan (RICE UNIVERSITY)","title":"The Benefit of Distraction: Denoising Remote Vitals Measurements using Inverse Attention"},{"UID":"WS05","abstract":"Electronic health records (EHRs) provide an abundance of data for clinical outcomes modeling. The prevalence of EHR data has enabled a number of studies using a variety of machine learning algorithms to predict potential adverse events. However, these studies do not account for the heterogeneity present in EHR data, including various lengths of stay, various frequencies of vitals captured in invasive versus non-invasive fashion, and various repetitions (or lack thereof) of laboratory examinations. Therefore, studies limit the types of features extracted or the domain considered to provide a more homogeneous training set to machine learning models. The heterogeneity in this data represents important risk differences in each patient. In this work, we examine such data in an intensive care unit (ICU) setting, where the length of stay and the frequency of data gathered may vary significantly based upon the severity of patient condition. Therefore, it is unreasonable to use the same model for patients first entering the ICU versus those that have been there for above-average lengths of stay. Developing multiple individual models to account for different patient cohorts, different lengths of stay, and different sources for key vital sign data may be tedious and not account for rare cases well. 
We address this challenge by developing a dynamic model, based upon meta-learning, to adapt to data heterogeneity and generate predictions of various outcomes across the different lengths of data. We compare this technique against a set of benchmarks on a publicly available ICU dataset (MIMIC-III) and demonstrate improved model performance by accounting for data heterogeneity.","authors":"Lida Zhang (Texas A&M University); Xiaohan Chen, Tianlong Chen, and Zhangyang Wang (University of Texas at Austin); Bobak J. Mortazavi (Texas A&M University)","title":"DynEHR: Dynamic Adaptation of Models with Data Heterogeneity in Electronic Health Records"},{"UID":"WS06","abstract":"Machine Learning (ML) is widely used to automatically extract meaningful information from Electronic Health Records (EHR) to support operational, clinical, and financial decision making. However, ML models require a large number of annotated examples to provide satisfactory results, which is not possible in most healthcare scenarios due to the high cost of clinician-labeled data. Active Learning (AL) is a process of selecting the most informative instances to be labeled by an expert to further train a supervised algorithm. We demonstrate the effectiveness of AL in multi-label text classification in the clinical domain. In this context, we apply a set of well-known AL methods to help automatically assign ICD-9 codes on the MIMIC-III dataset. Our results show that the selection of informative instances provides satisfactory classification with a significantly reduced training set (8.3% of the total instances). 
We conclude that AL methods can significantly reduce the manual annotation cost while preserving model performance.","authors":"Martha Ferreira (Dalhousie University); Michal Malyska and Nicola Sahar (Semantic Health); Riccardo Miotto (Icahn School of Medicine at Mount Sinai); Fernando Paulovich (Dalhousie University); Evangelos Milios (Dalhousie University, Faculty of Computer Science)","title":"Active Learning for Medical Code Assignment"},{"UID":"WS07","abstract":"Assessment of COVID-19 pandemic predictions indicates that differential equation-based epidemic spreading models are less than satisfactory in the contemporary world of intense human connectivity. Network-based simulations are more apt for studying the contagion dynamics due to their ability to model heterogeneity of human interactions. However, the quality of predictions in network-based models depends on how well the underlying wire-frame approximates the real social contact network of the population. In this paper, we propose a framework to create a modular wire-frame to mimic the social contact network of a geography by lacing it with demographic information. The proposed inter-connected network sports small-world topology, accommodates density variations in the geography, and emulates human interactions in family, social, and work spaces. The resulting wire-frame is a generic and potent instrument for urban planners, demographers, economists, and social scientists to simulate different \"what-if\" scenarios and predict epidemic variables. The basic frame can be laden with any economic, social, or urban data that can potentially shape human connectance. 
We present a preliminary study of the impact of variations in contact patterns due to density and demography on the epidemic variables.","authors":"Kirti Jain (Department of Computer Science, University of Delhi, Delhi, India); Sharanjit Kaur (Acharya Narendra Dev College, University of Delhi, Delhi, India); Vasudha Bhatnagar (Department of Computer Science, University of Delhi, Delhi, India)","title":"Framing Social Contact Networks for Contagion Dynamics"},{"UID":"WS08","abstract":"Shaping an epidemic with an adaptive contact restriction policy that balances the disease and socioeconomic impact has been the holy grail during the COVID-19 pandemic. Most of the existing work on epidemiological models focuses on scenario-based forecasting via simulation but techniques for explicit control of epidemics via an analytical framework are largely missing. In this paper, we consider the problem of determining the optimal control policy for transmission rate assuming SIR dynamics, which is the most widely used epidemiological paradigm. We first demonstrate that the SIR model with infectious patients and susceptible contacts (i.e., product of transmission rate and susceptible population) interpreted as predators and prey respectively reduces to a Lotka-Volterra (LV) predator-prey model. The modified SIR system (LVSIR) has a stable equilibrium point, an 'energy' conservation property, and exhibits bounded cyclic behaviour similar to an LV system. This mapping permits a theoretical analysis of the control problem supporting some of the recent simulation-based studies that point to the benefits of periodic interventions. We use a control-Lyapunov approach to design adaptive control policies (CoSIR) to nudge the SIR model to the desired equilibrium that permits ready extensions to richer compartmental models. 
We also describe a practical implementation of this transmission control method by approximating the ideal control with a finite but time-varying set of restriction levels. We provide experimental results comparing with periodic lockdowns on a few different geographical regions (India, Mexico, Netherlands) to demonstrate the efficacy of this approach.","authors":"Harsh Maheshwari and Shreyas Shetty (Flipkart Internet Private Ltd.); Nayana Bannur (Wadhwani AI); Srujana Merugu (Independent)","title":"CoSIR: Managing an Epidemic via Optimal Adaptive Control of Transmission Rate Policy"},{"UID":"WS09","abstract":"A major obstacle to the integration of deep learning models for chest x-ray interpretation into clinical settings is the lack of understanding of their failure modes. In this work, we first investigate whether there are clinical subgroups that chest x-ray models are likely to misclassify. We find that older patients and patients with a lung lesion or pneumothorax finding have a higher probability of being misclassified on some diseases. Second, we develop misclassification predictors on chest x-ray models using their outputs and clinical features. We find that our best-performing misclassification identifier achieves an AUROC close to 0.9 for most diseases. Third, employing our misclassification identifiers, we develop a corrective algorithm to selectively flip model predictions that have a high likelihood of misclassification at inference time. We observe F1 improvement on the prediction of Consolidation (0.008, 95%CI[0.005, 0.010]) and Edema (0.003, 95%CI[0.001, 0.006]). By carrying out our investigation on ten distinct and high-performing chest x-ray models, we are able to derive insights across model architectures and offer a generalizable framework applicable to other medical imaging tasks.","authors":"Emma Chen, Andy Kim, Rayan Krishnan, Andrew Y. 
Ng, and Pranav Rajpurkar (Stanford University)","title":"CheXbreak: Misclassification Identification for Deep Learning Models Interpreting Chest X-rays"},{"UID":"WS10","abstract":"Contrastive learning is a form of self-supervision that can leverage unlabeled data to produce pretrained models. While contrastive learning has demonstrated promising results on natural image classification tasks, its application to medical imaging tasks like chest X-ray interpretation has been limited. In this work, we propose MoCo-CXR, which is an adaptation of the contrastive learning method Momentum Contrast (MoCo), to produce models with better representations and initializations for the detection of pathologies in chest X-rays. In detecting pleural effusion, we find that linear models trained on MoCo-CXR-pretrained representations outperform those without MoCo-CXR-pretrained representations, indicating that MoCo-CXR-pretrained representations are of higher quality. End-to-end fine-tuning experiments reveal that a model initialized via MoCo-CXR-pretraining outperforms its non-MoCo-CXR-pretrained counterpart. We find that MoCo-CXR-pretraining provides the most benefit with limited labeled training data. Finally, we demonstrate similar results on a target Tuberculosis dataset unseen during pretraining, indicating that MoCo-CXR-pretraining endows models with representations and transferability that can be applied across chest X-ray datasets and tasks.","authors":"Hari Sowrirajan, Jingbo Yang, Andrew Ng, and Pranav Rajpurkar (Stanford University)","title":"MoCo-CXR: MoCo Pretraining Improves Representation and Transferability of Chest X-ray Models"},{"UID":"WS11","abstract":"Inertial Measurement Unit (IMU) sensors are becoming increasingly ubiquitous in everyday devices such as smartphones, fitness watches, etc. 
As a result, the array of health-related applications that tap into this data has been growing, as well as the importance of designing accurate prediction models for tasks such as human activity recognition (HAR). However, one important task that has received little attention is the prediction of an individual's heart rate when undergoing a physical activity using IMU data. This could be used, for example, to determine which activities are safe for a person without having him/her actually perform them. We propose a neural architecture for this task composed of convolutional and LSTM layers, similar to the state-of-the-art techniques for the closely related task of HAR. However, our model includes a convolutional network that extracts, based on sensor data from a previously executed activity, a physical conditioning embedding (PCE) of the individual to be used as the LSTM's initial hidden state. We evaluate the proposed model, dubbed PCE-LSTM, when predicting the heart rate of 23 subjects performing a variety of physical activities from IMU-sensor data available in public datasets (PAMAP2, PPG-DaLiA). For comparison, we use as baselines the only model specifically proposed for this task, and an adapted state-of-the-art model for HAR. PCE-LSTM yields over 10% lower mean absolute error. We demonstrate empirically that this error reduction is in part due to the use of the PCE. 
Last, we use the two datasets (PPG-DaLiA, WESAD) to show that PCE-LSTM can also be successfully applied when photoplethysmography (PPG) sensors are available to rectify heart rate measurement errors caused by movement, outperforming the state-of-the-art deep learning baselines by more than 30%.","authors":"Davi Pedrosa de Aguiar, Ot\u00e1vio Augusto Silva, and Fabricio Murai (Universidade Federal de Minas Gerais)","title":"Encoding physical conditioning from inertial sensors for multi-step heart rate estimation"},{"UID":"WS12","abstract":"The COVID-19 pandemic has been ravaging the world we know since its emergence. Computer-Aided Diagnosis (CAD) systems with high precision and reliability can play a vital role in the battle against COVID-19. Most of the existing works in the literature focus on developing sophisticated methods yielding high detection performance yet not addressing the issue of predictive uncertainty. Uncertainty estimation has been explored heavily in the literature for Deep Neural Networks; however, not much work has focused on this issue in COVID-19 detection. In this work, we explore the efficacy of state-of-the-art (SOTA) uncertainty estimation methods on COVID-19 detection. We propose to augment the best-performing method with a feature denoising algorithm to gain higher Positive Predictive Value (PPV) on COVID-positive cases. Through extensive experimentation, we identify the most lightweight and easy-to-deploy uncertainty estimation framework that can effectively identify the confusing COVID-19 cases for expert analysis while performing comparably with the existing resource-heavy uncertainty estimation methods. 
In collaboration with medical professionals, we further validate the results to ensure the viability of the framework in clinical practice.","authors":"Krishanu Sarker (Georgia State University); Sharbani Pandit (Georgia Institute of Technology); Anupam Sarker (Institute of Epidemiology, Disease Control and Research); Saeid Belkasim and Shihao Ji (Georgia State University)","title":"Towards Reliable and Trustworthy Computer-Aided Diagnosis Predictions: Diagnosing COVID-19 from X-Ray Images"},{"UID":"WS13","abstract":"We systematically evaluate the performance of deep learning models in the presence of diseases not labeled for or present during training. First, we evaluate whether deep learning models trained on a subset of diseases (seen diseases) can detect the presence of any one of a larger set of diseases. We find that models tend to falsely classify diseases outside of the subset (unseen diseases) as \"no disease\". Second, we evaluate whether models trained on seen diseases can detect seen diseases when co-occurring with diseases outside the subset (unseen diseases). We find that models are still able to detect seen diseases even when co-occurring with unseen diseases. Third, we evaluate whether feature representations learned by models may be used to detect the presence of unseen diseases given a small labeled set of unseen diseases. We find that the penultimate layer provides useful features for unseen disease detection. Our results can inform the safe clinical deployment of deep learning models trained on a non-exhaustive set of disease classes.","authors":"Siyu Shi (Department of Medicine, School of Medicine, Stanford University); Ishaan Malhi, Kevin Tran, Andrew Y. 
Ng, and Pranav Rajpurkar (Department of Computer Science, Stanford University)","title":"CheXseen: Unseen Disease Detection for Deep Learning Interpretation of Chest X-rays"},{"UID":"WS14","abstract":"We explore the application of graph neural networks (GNNs) to the problem of estimating exposure to an infectious pathogen and probability of transmission. Specifically, given a dataset in which a subset of patients are known to be infected and information in the form of a graph about who has interacted with whom, we aim to directly estimate transmission dynamics, i.e., what types of interactions (e.g., length and number) lead to transmission events. While GNNs have proven capable of learning meaningful representations from graph data, they commonly assume tasks with high homophily (i.e., nodes that share an edge look similar). Recently, researchers have proposed techniques for addressing problems with low homophily (e.g., adding residual connections to GNNs). In our problem setting, homophily is high on average, since the majority of patients do not become infected, but it remains low with respect to the minority class. In this paper, we characterize this setting as particularly challenging for GNNs. Given the asymmetry in homophily between classes, we hypothesize that solutions designed to address low homophily on average will not suffice and instead propose a solution based on attention. Applied to both real-world and synthetic network data, we test this hypothesis and explore the ability of GNNs to learn complex transmission dynamics directly from network data. Overall, attention proves to be an effective mechanism for addressing low homophily in the minority class (AUROC with 95\% CI: GCN 0.684 (0.659,0.710) vs.
GAT 0.715 (0.688,0.742)) and such a data-driven approach can outperform approaches based on potentially flawed expert knowledge.","authors":"Jeeheh Oh (University of Michigan, Ann Arbor); Jenna Wiens (University of Michigan)","title":"A Data-Driven Approach to Estimating Infectious Disease Transmission from Graphs: A Case of Class Imbalance Driven Low Homophily"},{"UID":"WS15","abstract":"Explainable artificial intelligence provides an opportunity to improve prediction accuracy over standard linear models using 'black box' machine learning (ML) models while still revealing insights into a complex outcome such as all-cause mortality. We propose the IMPACT (Interpretable Machine learning Prediction of All-Cause morTality) framework that implements and explains complex, non-linear ML models in epidemiological research, by combining a tree ensemble mortality prediction model and an explainability method. We use 133 variables from NHANES 1999-2014 datasets (number of samples: n = 47,261) to predict all-cause mortality. To explain our model, we extract local (i.e., per-sample) explanations to verify well-studied mortality risk factors, and make new discoveries. We present major factors for predicting k-year mortality (k = 1, 3, 5) across different age groups and their individualized impact on mortality prediction. Moreover, we highlight interactions between risk factors associated with mortality prediction, which leads to findings that linear models do not reveal. We demonstrate that compared with traditional linear models, tree-based models have unique strengths such as: (1) improving prediction power, (2) making no distribution assumptions, (3) capturing non-linear relationships and important thresholds, (4) identifying feature interactions, and (5) detecting different non-linear relationships between models.
Given the popularity of complex ML models in prognostic research, combining these models with explainability methods has implications for further applications of ML in medical fields. To our knowledge, this is the first study that combines complex ML models and state-of-the-art feature attributions to explain mortality prediction, which enables us to achieve higher prediction accuracy and gain new insights into the effect of risk factors on mortality.","authors":"Wei Qiu, Hugh Chen, Ayse Berceste Dincer, and Su-In Lee (Paul G. Allen School of Computer Science and Engineering, University of Washington)","title":"Interpretable Machine Learning Prediction of All-cause Mortality"},{"UID":"WS16","abstract":"Cardiogenic shock is a deadly and complicated illness. Despite extensive research into treating cardiogenic shock, mortality remains high and has not decreased over time. Patients suffering from cardiogenic shock are highly heterogeneous, and developing an understanding of phenotypes among these patients is crucial for understanding this disease and the appropriate treatments for individual patients. In this work, we develop a deep mixture of experts approach to jointly find phenotypes among patients with cardiogenic shock while simultaneously estimating their risk of in-hospital mortality. Although trained with information regarding treatment and outcomes, after training, the proposed model is decomposable into a network that clusters patients into phenotypes from information available prior to treatment. This model is validated on a synthetic dataset and then applied to a cohort of 28,304 patients with cardiogenic shock. The full model predicts in-hospital mortality on this cohort with an AUROC of 0.85 \u00b1 0.01. The model discovers five phenotypes among the population, finding statistically different mortality rates among them and among treatment choices within those groups. 
This approach allows for grouping patients in clinical clusters with different rates of device utilization and different risks of mortality. This approach is suitable for jointly finding phenotypes within a clinical population and for modeling risk among that population.","authors":"Nathan C. Hurley (Texas A&M University); Alyssa Berkowitz (Yale University); Frederick Masoudi (University of Colorado School of Medicine); Joseph Ross and Nihar Desai (Yale University); Nilay Shah (Mayo Clinic); Sanket Dhruva (UCSF School of Medicine); Bobak J. Mortazavi (Texas A&M University)","title":"Outcomes-Driven Clinical Phenotyping in Patients with Cardiogenic Shock for Risk Modeling and Comparative Treatment Effectiveness"},{"UID":"WS17","abstract":"Severe infectious diseases such as the novel coronavirus (COVID-19) pose a huge threat to public health. Stringent control measures, such as school closures and stay-at-home orders, while having significant effects, also bring huge economic losses. In the face of an emerging infectious disease, a crucial question for policymakers is how to make this trade-off and implement the appropriate interventions in a timely manner, in the presence of huge uncertainty. In this work, we propose a Multi-Objective Model-based Reinforcement Learning framework to facilitate data-driven decision making and minimize the long-term overall cost. Specifically, at each decision point, a Bayesian epidemiological model is first learned as the environment model, and then the proposed model-based multi-objective planning algorithm is applied to find a set of Pareto-optimal policies. This framework, combined with the prediction bands for each policy, provides a real-time decision support tool for policymakers.
The application is demonstrated with the spread of COVID-19 in China.","authors":"Runzhe Wan, Xinyu Zhang, and Rui Song (North Carolina State University)","title":"Multi-Objective Model-based Reinforcement Learning for Infectious Disease Control"},{"UID":"WS18","abstract":"With the growing amount of text in health data, there have been rapid advances in large pre-trained models that can be applied to a wide variety of biomedical tasks with minimal task-specific modifications. Emphasizing the cost of these models, which renders technical replication challenging, this paper summarizes experiments conducted in replicating BioBERT and further pre-training and careful fine-tuning in the biomedical domain. We also investigate the effectiveness of domain-specific and domain-agnostic pre-trained models across downstream biomedical NLP tasks. Our finding confirms that pre-trained models can be impactful in some downstream NLP tasks (QA and NER) in the biomedical domain; however, this improvement may not justify the high cost of domain-specific pre-training.","authors":"Paul Grouchi (Untether AI); Shobhit Jain (Manulife); Michael Liu (Tealbook); Kuhan Wang (CIBC); Max Tian (Adeptmind); Nidhi Arora (Intact); Hillary Ngai (University of Toronto); Faiza Khan Khattak (Manulife); Elham Dolatabadi and Sedef Akinli Kocak (Vector Institute)","title":"An Experimental Evaluation of Transformer-based Language Models in the Biomedical Domain"},{"UID":"WS19","abstract":"Question Answering (QA) is a widely-used framework for developing and evaluating an intelligent machine. In this light, QA on Electronic Health Records (EHR), namely EHR QA, can work as a crucial milestone towards developing an intelligent agent in healthcare. EHR data are typically stored in a relational database, which can also be converted to a directed acyclic graph, allowing two approaches for EHR QA: Table-based QA and Knowledge Graph-based QA.
We hypothesize that the graph-based approach is more suitable for EHR QA as graphs can represent relations between entities and values more naturally compared to tables, which essentially require JOIN operations. In this paper, we propose a graph-based EHR QA where natural language queries are converted to SPARQL instead of SQL. To validate our hypothesis, we create four EHR QA datasets (graph-based vs. table-based, and simplified database schema vs. original database schema), based on a table-based dataset MIMICSQL. We test both a simple Seq2Seq model and a state-of-the-art EHR QA model on all datasets, where the graph-based datasets facilitated up to 34% higher accuracy than the table-based dataset without any modification to the model architectures. Finally, all datasets will be open-sourced to encourage further EHR QA research in both directions.","authors":"Junwoo Park and Youngwoo Cho (Korea Advanced Institute of Science and Technology (KAIST)); Haneol Lee (Yonsei University); Jaegul Choo and Edward Choi (Korea Advanced Institute of Science and Technology (KAIST))","title":"Knowledge Graph-based Question Answering with Electronic Health Records"},{"UID":"WS20","abstract":"There is an increased adoption of electronic health record (EHR) systems by a variety of hospitals and medical centers. This provides an opportunity to leverage automated computer systems in assisting healthcare workers. One of the least utilized yet richest sources of patient information is unstructured clinical text. In this work, we develop CATAN, a chart-aware temporal attention network for learning patient representations from clinical notes. We introduce a novel representation where each note is considered a single unit, like a sentence, and composed of attention-weighted words. The notes in turn are aggregated into a patient representation using a second weighting unit, note attention.
Unlike standard attention computations which focus only on the content of the note, we incorporate the chart-time for each note as a constraint for attention calculation. This allows our model to focus on notes closer to the prediction time. Using the MIMIC-III dataset, we empirically show that our patient representation and attention calculation achieves the best performance in comparison with various state-of-the-art baselines for one-year mortality prediction and 30-day hospital readmission. Moreover, the attention weights can be used to offer transparency into our model's predictions.","authors":" Zelalem Gero and Joyce Ho (Emory University)","title":"CATAN: Chart-aware temporal attention network for clinical text classification"},{"UID":"WS21","abstract":"Survival analysis is a challenging variation of regression modeling because of the presence of censoring, where the outcome measurement is only partially known, due to, for example, loss to follow up. Such problems come up frequently in medical applications, making survival analysis a key endeavor in biostatistics and machine learning for healthcare, with Cox regression models being amongst the most commonly employed models. We describe a new approach for survival analysis regression models, based on learning mixtures of Cox regressions to model individual survival distributions. We propose an approximation to the Expectation Maximization algorithm for this model that does hard assignments to mixture groups to make optimization efficient. In each group assignment, we fit the hazard ratios within each group using deep neural networks, and the baseline hazard for each mixture component non-parametrically. We perform experiments on multiple real world datasets, and look at the mortality rates of patients across ethnicity and gender. 
We emphasize the importance of calibration in healthcare settings and demonstrate that our approach outperforms classical and modern survival analysis baselines, both in terms of discriminative performance and calibration, with large gains in performance on the minority demographics.","authors":"Chirag Nagpal (Carnegie Mellon University); Steve Yadlowsky; Negar Rostamzadeh; and Katherine Heller (Google Brain)","title":"Deep Cox Mixtures for Survival Regression"}]
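The chart-aware attention idea in the CATAN abstract above — constraining the attention weights by each note's chart time so that notes closer to the prediction time receive more weight — can be sketched with a simple additive time-decay softmax. This is an illustrative sketch only, not the authors' implementation; the additive form and the `decay` hyperparameter are assumptions.

```python
import math

def time_aware_attention(content_scores, hours_before_prediction, decay=0.1):
    """Softmax attention where each note's content-based logit is penalized
    in proportion to how long before the prediction time it was charted,
    so more recent notes receive higher weight. (Illustrative sketch; the
    additive penalty and `decay` value are assumptions.)"""
    logits = [s - decay * h for s, h in zip(content_scores, hours_before_prediction)]
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

# Three notes with equal content relevance, charted 48h, 12h, and 1h
# before the prediction time: the most recent note dominates.
weights = time_aware_attention([1.0, 1.0, 1.0], [48.0, 12.0, 1.0])
```

With equal content scores the weighting is driven entirely by recency, which is the constraint the abstract describes; in the full model the content scores would themselves be learned from the note text.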
diff --git a/speaker_S01.html b/speaker_S01.html
new file mode 100644
index 000000000..bf94314ad
--- /dev/null
+++ b/speaker_S01.html
@@ -0,0 +1,514 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ CHIL
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Hamsa Bastani / University of Pennsylvania
+
+
+
+
+
+
+
+
+
+
+
Bio: Hamsa Bastani is an Associate Professor of Operations, Information, and Decisions at the Wharton School, University of Pennsylvania. Her research focuses on developing novel machine learning algorithms for data-driven decision-making, with applications to healthcare operations, social good, and revenue management. Her work has received several recognitions, including the Wagner Prize for Excellence in Practice (2021), the Pierskalla Award for the best paper in healthcare (2016, 2019, 2021), the Behavioral OM Best Paper Award (2021), as well as first place in the George Nicholson and MSOM student paper competitions (2016). She previously completed her PhD at Stanford University, and spent a year as a Herman Goldstine postdoctoral fellow at IBM Research.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Samantha Kleinberg / Stevens Institute of Technology
+
+
+
+
+
+
+
+
+
+
+
Bio: Samantha Kleinberg is an Associate Professor in the Computer Science department at Stevens Institute of Technology. After completing her PhD in Computer Science in 2010 at NYU, she spent two years as a postdoctoral Computing Innovation Fellow at Columbia University, in the Department of Biomedical Informatics. Before that she was an undergraduate at NYU in Computer Science and Physics, and more recently spent a year on sabbatical in the psychology department of University College London. Dr. Kleinberg has written an academic book, Causality, Probability, and Time, and another for a wider audience, Why: A Guide To Finding and Using Causes. She is the editor of Time and Causality Across the Sciences.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Deborah Raji / University of California, Berkeley
+
+
+
+
+
+
+
+
+
+
+
Bio: Deborah Raji is a Mozilla fellow and CS PhD student at University of California, Berkeley, who is interested in questions on algorithmic auditing and evaluation. In the past, she worked closely with the Algorithmic Justice League initiative to highlight bias in deployed AI products. She has also worked with Googleʼs Ethical AI team and been a research fellow at the Partnership on AI and AI Now Institute at New York University working on various projects to operationalize ethical considerations in ML engineering practice. Recently, she was named to Forbes 30 Under 30 and MIT Tech Review 35 Under 35 Innovators.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Sanmi (Oluwasanmi) Koyejo / Stanford University
+
+
+
+
+
+
+
+
+
+
+
Bio: Sanmi (Oluwasanmi) Koyejo is an Assistant Professor in the Department of Computer Science at Stanford University. Koyejo was previously an Associate Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. Koyejo's research interests are in developing the principles and practice of trustworthy machine learning, focusing on applications to neuroscience and healthcare. Koyejo completed a Ph.D. at the University of Texas at Austin, and postdoctoral research at Stanford University. Koyejo has been the recipient of several awards, including a best paper award from the conference on uncertainty in artificial intelligence, a Skip Ellis Early Career Award, a Sloan Fellowship, a Terman faculty fellowship, an NSF CAREER award, a Kavli Fellowship, an IJCAI early career spotlight, and a trainee award from the Organization for Human Brain Mapping. Koyejo spends time at Google as a part of the Brain team, serves on the Neural Information Processing Systems Foundation Board, the Association for Health Learning and Inference Board, and as president of the Black in AI organization.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Nils Gehlenborg / Harvard Medical School
+
+
+
+
+
+
+
+
+
+
+
Bio: Nils Gehlenborg is an Associate Professor of Biomedical Informatics At Harvard Medical School. The goal of Gehlenborg’s research is to improve human health by developing computational techniques and interfaces that enable scientists and clinicians to efficiently interact with biomedical data. He received his PhD from the University of Cambridge and was a predoctoral fellow at the European Bioinformatics Institute (EMBL-EBI). Gehlenborg is a co-founder and former general chair of BioVis, the Symposium on Biological Data Visualization, and co-founder of VIZBI, the annual workshop on Visualizing Biological Data. Occasionally, he contributes to the “Points of View” data visualization column in Nature Methods.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Noémie Elhadad / Columbia University
+
+
+
+
+
+
+
+
+
+
+
Bio: Noémie Elhadad is Associate Professor and Chair of the Department of Biomedical Informatics at Columbia University Vagelos College of Physicians and Surgeons. She is affiliated with Columbia’s Department of Computer Science and the Columbia Data Science Institute. Dr. Elhadad’s research lies at the intersection of artificial intelligence, human-centered computing, and medicine, with a focus on developing novel machine-learning methods. She creates methods and tools to support patients and clinicians in their information needs, with particular focus on ensuring that AI systems of the future are fair and just. She obtained her PhD in 2006 in Computer Science, focusing on multi-document, patient-specific text summarization of the clinical literature. She was on the Computer Science faculty at The City College of New York and the CUNY graduate center starting in 2006 before joining the Department of Biomedical Informatics at Columbia in 2007. Dr. Elhadad served as Chair of the Health Analytics Center at the Columbia Data Science Institute from 2013 to 2016.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ David O. Meltzer / University of Chicago
+
+
+
+
+
+
+
+
+
+
+
Bio: David O. Meltzer is Chief of the Section of Hospital Medicine, Director of the Center for Health and the Social Sciences, and Chair of the Committee on Clinical and Translational Science at the University of Chicago, where he is Professor in the Department of Medicine, and affiliated faculty at the University of Chicago Harris School of Public Policy and the Department of Economics. Dr. Meltzer’s research explores problems in health economics and public policy with a focus on the theoretical foundations of medical cost-effectiveness analysis and the cost and quality of hospital care. He is currently leading a Centers for Medicaid and Medicare Innovation Challenge award to study the effects of improved continuity in the doctor-patient relationship between the inpatient and outpatient settings on the costs and outcomes of care for frequently hospitalized Medicare patients. He led the formation of the Chicago Learning Effectiveness Advancement Research Network (Chicago LEARN), which helped pioneer collaboration among Chicago-area academic medical centers in hospital-based comparative effectiveness research, and the recent support of the Chicago Area Patient Centered Outcomes Research Network (CAPriCORN) by the Patient Centered Outcomes Research Institute (PCORI).
+
+Meltzer received his MD and PhD in economics from the University of Chicago and completed his residency in internal medicine at Brigham and Women’s Hospital in Boston. Meltzer is the recipient of numerous awards, including the Lee Lusted Prize of the Society for Medical Decision Making, the Health Care Research Award of the National Institute for Health Care Management, and the Eugene Garfield Award from Research America. Meltzer is a research associate of the National Bureau of Economic Research, an elected member of the American Society for Clinical Investigation, and past president of the Society for Medical Decision Making. He has served on several IOM panels, including one examining U.S. organ allocation policy and the recent panel on the Learning Health Care System that produced Best Care at Lower Cost. He also has served on the DHHS Secretary’s Advisory Committee on Healthy People 2020, the Patient Centered Outcomes Research Institute (PCORI) Methodology Committee, as a Council Member of the National Institute for General Medical Studies, and as a health economics advisor for the Congressional Budget Office.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Walter Dempsey / University of Michigan
+
+
+
+
+
+
+
+
+
+
+
Bio: Dr. Dempsey is an Assistant Professor of Biostatistics and an Assistant Research Professor in the d3lab located in the Institute of Social Research at the University of Michigan. His research focuses on statistical methods for digital and mobile health. His current work involves three complementary research themes: (1) experimental design and data analytic methods to inform multi-stage decision making in health; (2) statistical modeling of complex longitudinal and survival data; and (3) statistical modeling of complex relational structures such as interaction networks. Prior to joining, he was a postdoctoral fellow in the Department of Statistics at Harvard University. His fellowship was in the Statistical Reinforcement Learning Lab under the supervision of Susan Murphy. He received his PhD in Statistics at the University of Chicago under the supervision of Peter McCullagh.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ F. Perry Wilson / Yale School of Medicine
+
+
+
+
+
+
+
+
+
+
+
Bio: F. Perry Wilson, MD, MSCE, is a nephrologist who treats patients at Yale New Haven Hospital who have kidney issues or who develop them while hospitalized for another problem. He is also an epidemiologist and a prolific researcher focused on studying ways to improve patient care. An associate professor at Yale School of Medicine, Dr. Wilson is director of the Yale Clinical and Translational Research Accelerator and co-director of the Yale Section of Nephrology’s Human Genetics and Clinical Research Core. He is the creator of the popular online course "Understanding Medical Research: Your Facebook Friend Is Wrong" on the Coursera platform.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Kyra Gan / Cornell University
+
+
+
+
+
+
+
+
+
+
+
Bio: Kyra Gan is an Assistant Professor in the School of Operations Research and Information Engineering and Cornell Tech at Cornell University. Her research interests include adaptive/online algorithm design for personalized treatment (including micro-randomized trials and N-of-1 trials) in constrained settings, computerized/automated inference methods (e.g., targeted learning with RKHS), robust causal discovery in medical data, and fairness in organ transplants. More broadly, she is interested in bridging the gap between research and practice in healthcare.
+
+Prior to Cornell Tech, she was a postdoctoral fellow at the Statistical Reinforcement Lab at Harvard University. She received her Ph.D. in Operations Research in 2022 from Carnegie Mellon University at the Tepper School of Business. She received her B.A.s in Mathematics and Economics from Smith College in 2017. She is a recipient of the 2021 Pierskalla Best Paper Award and the 2021 CHOW Best Student Paper Award in the Category of Operations Research and Management Science.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Girish N. Nadkarni / Mount Sinai
+
+
+
+
+
+
+
+
+
+
+
Bio: Girish N. Nadkarni, MD, MPH, is the Irene and Dr. Arthur M. Fishberg Professor of Medicine at the Icahn School of Medicine at Mount Sinai. As an expert physician-scientist, Dr. Nadkarni bridges the gap between comprehensive clinical care and innovative research. He is the System Chief of the Division of Data Driven and Digital Medicine (D3M), the Co-Director of the Mount Sinai Clinical Intelligence Center (MSCIC), and the Director of the Charles Bronfman Institute for Personalized Medicine.
+
+Before completing his medical degree at one of the top-ranked medical colleges in India, Dr. Nadkarni received training in mathematics. He then received a master’s degree in public health at the Johns Hopkins Bloomberg School of Public Health, and then was a research associate at the Johns Hopkins Medical Institute. Dr. Nadkarni completed his residency in internal medicine and his clinical fellowship in nephrology at the Icahn School of Medicine at Mount Sinai. He then completed a research fellowship in personalized medicine and informatics.
+
+Dr. Nadkarni has authored more than 240 peer-reviewed scientific publications, including articles in the New England Journal of Medicine, the Journal of the American Medical Association, the Annals of Internal Medicine and Nature Medicine. Dr. Nadkarni is the principal or co-investigator for several grants funded by the National Institutes of Health focusing on informatics, data science, and precision medicine. He is also one of the multiple principal investigators of the NIH RECOVER consortium focusing on the long-term sequelae of COVID-19. He has several patents and is also the scientific co-founder of investor-backed companies—one of which, Renalytix, is listed on NASDAQ. In recognition of his work as an active clinician and investigator, he has received several awards and honors, including the Dr. Harold and Golden Lamport Research Award, the Deal of the Year award from Mount Sinai Innovation Partners, the Carl Nacht Memorial Lecture, and the Rising Star Award from ANIO.
+
+ Roy Perlis / Massachusetts General Hospital
+
Bio: Roy Perlis, MD MSc is Associate Chief for Research in the Department of Psychiatry and Director of the Center for Quantitative Health at Massachusetts General Hospital. He is Professor of Psychiatry at Harvard Medical School and Associate Editor at JAMA’s open-access journal, JAMA Network Open. Dr. Perlis graduated from Brown University, Harvard Medical School and Harvard School of Public Health, and completed his residency, chief residency, and clinical/research fellowship at MGH before joining the faculty. Dr. Perlis’s research is focused on identifying predictors of treatment response in brain diseases, and using these biomarkers to develop novel treatments. Dr. Perlis has authored more than 350 articles reporting original research, in journals including Nature Genetics, Nature Neuroscience, JAMA, NEJM, the British Medical Journal, and the American Journal of Psychiatry. His research has been supported by awards from NIMH, NHGRI, NHLBI, NICHD, NCCIH, and NSF, among others. In 2010 Dr. Perlis was awarded the Depression and Bipolar Support Alliance’s Klerman Award; he now serves as a scientific advisor to the DBSA.
+
+ Ashley Beecy / NewYork-Presbyterian
+
Bio: Dr. Ashley Beecy is the Medical Director of Artificial Intelligence (AI) Operations at NewYork-Presbyterian (NYP). She is a core member of NYP’s AI leadership team partnering with clinical, administrative and research leaders across the enterprise to drive digital transformation and deliver on NYP’s data and AI strategy. Dr. Beecy provides leadership in key areas including the governance, processes, and infrastructure to ensure the responsible and agile deployment of AI. She is responsible for NYP’s largest enterprise-wide AI initiative in collaboration with Cornell Tech and Cornell University. She is a thought leader and serves as a subject matter expert on multiple national AI collaboratives.
+
+ Leo Celi / Massachusetts Institute of Technology
+
Bio: Leo Anthony Celi has practiced medicine on three continents, giving him broad perspectives in healthcare delivery. As clinical research director and principal research scientist at the MIT Laboratory of Computational Physiology (LCP), he brings together clinicians and data scientists to support research using data routinely collected in the intensive care unit (ICU). His group built and maintains the Medical Information Mart for Intensive Care (MIMIC) database. This public-access database has been meticulously de-identified and is freely shared online with the research community. It is an unparalleled research resource; over 2000 investigators from more than 30 countries have free access to the clinical data under a data use agreement. In 2016, LCP partnered with Philips eICU Research Institute to host the eICU database with more than 2 million ICU patients admitted across the United States. The goal is to scale the database globally and build an international collaborative research community around health data analytics.
+
+Leo founded and co-directs Sana, a cross-disciplinary organization based at the Institute for Medical Engineering and Science at MIT, whose objective is to leverage information technology to improve health outcomes in low- and middle-income countries. At its core is an open-source mobile tele-health platform that allows for capture, transmission and archiving of complex medical data (e.g. images, videos, physiologic signals such as ECG, EEG and oto-acoustic emission responses), in addition to patient demographic and clinical information. Sana is the inaugural recipient of both the mHealth (Mobile Health) Alliance Award from the United Nations Foundation and the Wireless Innovation Award from the Vodafone Foundation in 2010. The software has since been implemented around the globe including India, Kenya, Lebanon, Haiti, Mongolia, Uganda, Brazil, Ethiopia, Argentina, and South Africa.
+
+
+He is one of the course directors for HST.936—global health informatics to improve quality of care, and HST.953—secondary analysis of electronic health records, both at MIT. He is an editor of the textbook for each course, both released under an open access license. The textbook Secondary Analysis of Electronic Health Records came out in October 2016 and was downloaded over 48,000 times in the first two months of publication. The course “Global Health Informatics to Improve Quality of Care” was launched under MITx in February 2017.
+
+Leo was featured as a designer in the Smithsonian Museum National Design Triennial “Why Design Now?” held at the Cooper-Hewitt Museum in New York City in 2010 for his work in global health informatics. He was also selected as one of 12 external reviewers for the National Academy of Medicine 2014 report “Investing in Global Health Systems: Sustaining gains, transforming lives”.
+
+ Isaac (Zak) Kohane / Harvard Medical School
+
Bio: Isaac (Zak) Kohane, MD, PhD is the inaugural Chair of the Department of Biomedical Informatics and the Marion V. Nelson Professor of Biomedical Informatics at Harvard Medical School. He develops and applies computational techniques to address disease at multiple scales: from whole healthcare systems as “living laboratories” to the functional genomics of neurodevelopment with a focus on autism. Kohane earned his MD/PhD from Boston University and then completed his post-doctoral work at Boston Children’s Hospital, where he has since worked as a pediatric endocrinologist. Kohane has published several hundred papers in the medical literature and authored the widely-used books Microarrays for an Integrative Genomics (2003) and The AI Revolution in Medicine: GPT-4 and Beyond (2023). He is also Editor-in-Chief of NEJM AI.
+
+ Kyunghyun Cho / New York University
+
Bio: Kyunghyun Cho is a professor of computer science and data science at New York University and a senior director of frontier research at the Prescient Design team within Genentech Research & Early Development (gRED). He is also a CIFAR Fellow of Learning in Machines & Brains and an Associate Member of the National Academy of Engineering of Korea. He served as a (co-)Program Chair of ICLR 2020, NeurIPS 2022 and ICML 2022. He is also a founding co-Editor-in-Chief of the Transactions on Machine Learning Research (TMLR). He was a research scientist at Facebook AI Research from June 2017 to May 2020 and a postdoctoral fellow at University of Montreal until Summer 2015 under the supervision of Prof. Yoshua Bengio. He received the Samsung Ho-Am Prize in Engineering in 2021.
+
+ Leo Celi / Massachusetts Institute of Technology
+
+
+ Invited Talk on Research and Top Recent Papers from 2020-2022
+
+ Suchi Saria / Johns Hopkins University & Bayesian Health
+
Bio: Suchi Saria, PhD, holds the John C. Malone endowed chair and is the Director of the Machine Learning, AI and Healthcare Lab at Johns Hopkins. She is also the Founder and CEO of Bayesian Health. Her research has pioneered the development of next-generation diagnostic and treatment planning tools that use statistical machine learning methods to individualize care. She has written several of the seminal papers in the field of ML and its use for improving patient care and has given over 300 invited keynotes and talks to organizations including the NAM, NAS, and NIH. Dr. Saria has served as an advisor to multiple Fortune 500 companies and her work has been funded by leading organizations including the NIH, FDA, NSF, DARPA and CDC. Dr. Saria has been featured by the Atlantic, Smithsonian Magazine, Bloomberg News, Wall Street Journal, and PBS NOVA, to name a few. She has won several awards for excellence in AI and care delivery. For example, for her academic work, she has been recognized as one of IEEE’s “AI’s 10 to Watch”, a Sloan Fellow, MIT Tech Review’s “35 Under 35”, a member of the National Academy of Medicine’s list of “Emerging Leaders in Health and Medicine”, and with DARPA’s Faculty Award. For her work in industry bringing AI to healthcare, she has been recognized as one of the World Economic Forum’s 100 Brilliant Minds Under 40 and Rock Health’s “Top 50 in Digital Health”, one of Modern Healthcare’s Top 25 Innovators, and with the Armstrong Award for Excellence in Quality and Safety and the Society of Critical Care Medicine’s Annual Scientific Award.
+
+ Invited Talk on Recent Deployments and Real-world Impact
+
+ Karandeep Singh / University of Michigan
+
Bio: Karandeep Singh, MD, MMSc, is an Assistant Professor of Learning Health Sciences, Internal Medicine, Urology, and Information at the University of Michigan. He directs the Machine Learning for Learning Health Systems (ML4LHS) Lab, which focuses on translational issues related to the implementation of machine learning (ML) models within health systems. He serves as an Associate Chief Medical Information Officer for Artificial Intelligence for Michigan Medicine and is the Associate Director for Implementation for U-M Precision Health, a Presidential Initiative focused on bringing research discoveries to the bedside, with a focus on prediction models and genomics data. He chairs the Michigan Medicine Clinical Intelligence Committee, which oversees the governance of machine learning models across the health system. He teaches a health data science course for graduate and doctoral students, and provides clinical care for people with kidney disease. He completed his internal medicine residency at UCLA Medical Center, where he served as chief resident, and a nephrology fellowship in the combined Brigham and Women’s Hospital/Massachusetts General Hospital program in Boston, MA. He completed his medical education at the University of Michigan Medical School and holds a master’s degree in medical sciences in Biomedical Informatics from Harvard Medical School. He is board certified in internal medicine, nephrology, and clinical informatics.
+
+ Invited Talk on Under-explored Research Challenges and Opportunities
+
+ Nigam Shah / Stanford University
+
Bio: Dr. Nigam Shah is Professor of Medicine at Stanford University, and Chief Data Scientist for Stanford Health Care. His research group analyzes multiple types of health data (EHR, Claims, Wearables, Weblogs, and Patient blogs), to answer clinical questions, generate insights, and build predictive models for the learning health system. At Stanford Healthcare, he leads artificial intelligence and data science efforts for advancing the scientific understanding of disease, improving the practice of clinical medicine and orchestrating the delivery of health care. Dr. Shah is an inventor on eight patents and patent applications, has authored over 200 scientific publications and has co-founded three companies. Dr. Shah was elected into the American College of Medical Informatics (ACMI) in 2015 and was inducted into the American Society for Clinical Investigation (ASCI) in 2016. He holds an MBBS from Baroda Medical College, India, a PhD from Penn State University and completed postdoctoral training at Stanford University.
+
+ Network studies: As many databases as possible or enough to answer the question quickly?
+
+ Christopher Chute / Johns Hopkins University
+
Bio: Dr. Chute is the Bloomberg Distinguished Professor of Health Informatics, Professor of Medicine, Public Health, and Nursing at Johns Hopkins University, and Chief Research Information Officer for Johns Hopkins Medicine. He is also Section Head of Biomedical Informatics and Data Science and Deputy Director of the Institute for Clinical and Translational Research. He received his undergraduate and medical training at Brown University, internal medicine residency at Dartmouth, and doctoral training in Epidemiology and Biostatistics at Harvard. He is Board Certified in Internal Medicine and Clinical Informatics, and an elected Fellow of the American College of Physicians, the American College of Epidemiology, HL7, the American Medical Informatics Association, and the American College of Medical Informatics (ACMI), as well as a Founding Fellow of the International Academy of Health Sciences Informatics; he was president of ACMI in 2017-18. He is an elected member of the Association of American Physicians. His career has focused on how we can represent clinical information to support analyses and inferencing, including comparative effectiveness analyses, decision support, best evidence discovery, and translational research. He has had a deep interest in the semantic consistency of health data, harmonized information models, and ontology. His current research focuses on translating basic science information to clinical practice, how we classify dysfunctional phenotypes (disease), and the harmonization and rendering of real-world clinical data, including electronic health records, to support data inferencing. He became founding Chair of Biomedical Informatics at Mayo Clinic in 1988, retiring from Mayo in 2014, where he remains an emeritus Professor of Biomedical Informatics. He is presently PI on a spectrum of high-profile informatics grants from NIH spanning translational science, including serving as co-lead of the National COVID Cohort Collaborative (N3C).
He has been active on many HIT standards efforts and chaired ISO Technical Committee 215 on Health Informatics and chaired the World Health Organization (WHO) International Classification of Disease Revision (ICD-11).
+
+ Network studies: As many databases as possible or enough to answer the question quickly?
+
+ Robert Platt / McGill University
+
Bio: Robert Platt is Professor in the Departments of Epidemiology, Biostatistics, and Occupational Health, and of Pediatrics, at McGill University. He holds the Albert Boehringer I endowed chair in Pharmacoepidemiology, and is Principal Investigator of the Canadian Network for Observational Drug Effect Studies (CNODES). His research focuses on improving statistical methods for the study of medications using administrative data, with a substantive focus on medications in pregnancy. Dr. Platt is an editor-in-chief of Statistics in Medicine and is on the editorial boards of the American Journal of Epidemiology and Pharmacoepidemiology and Drug Safety. He has published over 400 articles, one book and several book chapters on biostatistics and epidemiology.
+
+ Data Heterogeneity: More Heterogeneous Data or Less Homogeneous Data?
+
+ Tianxi Cai / Harvard Medical School
+
Bio: Tianxi Cai is John Rock Professor of Translational Data Science at Harvard, with joint appointments in the Biostatistics Department and the Department of Biomedical Informatics. She directs the Translational Data Science Center for a Learning Health System at Harvard Medical School and co-directs the Applied Bioinformatics Core at VA MAVERIC. She is a leading figure in developing analytical tools for mining multi-institutional EHR data and real-world evidence, and for predictive modeling with large-scale biomedical data. Tianxi received her Doctor of Science in Biostatistics at Harvard and was an assistant professor at the University of Washington before returning to Harvard as a faculty member in 2002.
+
+ Data Heterogeneity: More Heterogeneous Data or Less Homogeneous Data?
+
+ Yong Chen / University of Pennsylvania
+
Bio: Dr. Yong Chen is Professor of Biostatistics in the Department of Biostatistics, Epidemiology, and Informatics at the University of Pennsylvania (Penn). He directs the Computing, Inference and Learning Lab at the University of Pennsylvania, which focuses on integrating fundamental principles and wisdom of statistics into quantitative methods for tackling key challenges in modern biomedical data. Dr. Chen is an expert in synthesis of evidence from multiple data sources, including systematic review and meta-analysis, distributed algorithms, and data integration, with applications to comparative effectiveness studies, health policy, and precision medicine. He has published over 170 peer-reviewed papers in a wide spectrum of methodological and clinical areas. During the pandemic, Dr. Chen has served as Director of the Biostatistics Core for Pediatric PASC of the RECOVER COVID initiative, a national multi-center RWD-based study on Post-Acute Sequelae of SARS-CoV-2 infection (PASC) involving more than 13 million patients across more than 10 health systems. He is an elected fellow of the American Statistical Association and the American Medical Informatics Association, an Elected Member of the International Statistical Institute, and an Elected Member of the Society for Research Synthesis Methodology.
+
+ Differential Privacy vs. Synthetic Data
+
+ Khaled El Emam / University of Ottawa
+
Bio: Dr. Khaled El Emam is the Canada Research Chair (Tier 1) in Medical AI at the University of Ottawa, where he is a Professor in the School of Epidemiology and Public Health. He is also a Senior Scientist at the Children’s Hospital of Eastern Ontario Research Institute and Director of the multi-disciplinary Electronic Health Information Laboratory, conducting research on privacy enhancing technologies to enable the sharing of health data for secondary purposes, including synthetic data generation and de-identification methods. Khaled is a co-founder of Replica Analytics, a company that develops synthetic data generation technology, which was recently acquired by Aetion. As an entrepreneur, Khaled founded or co-founded six product and services companies involved with data management and data analytics, some with successful exits. Prior to his academic roles, he was a Senior Research Officer at the National Research Council of Canada. He also served as the head of the Quantitative Methods Group at the Fraunhofer Institute in Kaiserslautern, Germany. He participates in a number of committees, including the European Medicines Agency Technical Anonymization Group, the Panel on Research Ethics advising on the TCPS, and the Strategic Advisory Council of the Office of the Information and Privacy Commissioner of Ontario, and is co-editor-in-chief of the JMIR AI journal. In 2003 and 2004, he was ranked as the top systems and software engineering scholar worldwide by the Journal of Systems and Software based on his research on measurement and quality evaluation and improvement. He held the Canada Research Chair in Electronic Health Information at the University of Ottawa from 2005 to 2015. Khaled has a PhD from the Department of Electrical and Electronics Engineering.
+
+ Differential Privacy vs. Synthetic Data
+
+ Li Xiong / Emory University
+
Bio: Li Xiong is a Samuel Candler Dobbs Professor of Computer Science and Professor of Biomedical Informatics at Emory University. She held a Winship Distinguished Research Professorship from 2015-2018. She has a Ph.D. from Georgia Institute of Technology, an MS from Johns Hopkins University, and a BS from the University of Science and Technology of China. She and her research lab, Assured Information Management and Sharing (AIMS), conduct research on algorithms and methods at the intersection of data management, machine learning, and data privacy and security, with a recent focus on privacy-enhancing and robust machine learning. She has published over 170 papers and received six best paper or runner-up awards. She has served and serves as associate editor for IEEE TKDE, IEEE TDSC, and VLDBJ, general co-chair for ACM CIKM 2022, program co-chair for IEEE BigData 2020 and ACM SIGSPATIAL 2018, 2020, program vice-chair for ACM SIGMOD 2024, 2022, and IEEE ICDE 2023, 2020, and VLDB Sponsorship Ambassador. Her research is supported by federal agencies including NSF, NIH, AFOSR, PCORI, and industry awards including Google, IBM, Cisco, AT&T, and the Woodrow Wilson Foundation. She is an IEEE fellow.
+
+ Katie Link
+
Bio: Katie Link is the Healthcare Solutions Product Manager at NVIDIA, where she helps enable healthcare companies and researchers to solve real-world healthcare challenges with large language models (LLMs) and other advanced technologies. Prior to NVIDIA, she led healthcare and life sciences applications of artificial intelligence as a Machine Learning Engineer at Hugging Face, an open source AI startup. She is currently based in New York City and is on leave as a medical student at the Icahn School of Medicine at Mount Sinai. While in medical school, she led artificial intelligence research at NYU Langone Hospital, creating the largest open dataset of magnetic resonance imaging (MRI) for brain metastases and developing novel deep learning algorithms for tracking cancer progression. In her spare time, she also works on AI education initiatives for medical trainees and physicians. Prior to medical school, she was an AI Resident at Google X. She holds a bachelor’s degree in Neuroscience with a minor in Computer Science from Johns Hopkins University.
+
+ Bridging the gap between the business of value-based care and the research of health AI
+
+ Discussion leader: Yubin Park
+
Value-Based Care (VBC) is gaining momentum. The Centers for Medicare and Medicaid Services (CMS) is pushing to have all Medicare fee-for-service beneficiaries in a care relationship with accountability for quality and total cost of care by 2030. However, the business of VBC is more complex than, and different from, other businesses because it must satisfy three aims simultaneously: 1) better care for individuals, 2) better health for populations, and 3) lower cost. Meeting all three aims is challenging, and their details and implications are not well known to healthcare machine learning researchers. Therefore, we want to pick a few papers from this and past years' CHIL proceedings, then brainstorm and discuss how the ideas in those papers can be deployed in practice, what the barriers to deployment and sales are, what hidden or visible incentives exist for adopting such ideas, and how government and policymakers should structure incentives to achieve the three-part aims of CMS while encouraging the adoption of such technologies.
+
Bio: Yubin Park, Ph.D., is Chief Data and Analytics Officer at Apollo Medical Holdings, Inc. (ApolloMed, NASDAQ: AMEH). In his current position, he oversees value-based care analytics, remote patient monitoring, and partnerships with third-party data vendors. Yubin started his career by founding a healthcare analytics start-up after obtaining his Ph.D. degree in Machine Learning at the University of Texas at Austin in 2014. His first start-up, Accordion Health, provided an AI-driven Risk Adjustment and Quality analytics platform to Medicare Advantage plans. In 2017, Evolent Health (NYSE: EVH) acquired his company, and there, he led various clinical transformation/innovation projects. In 2020, he founded his second start-up, Orma Health. The company built a virtual care and analytics platform for payers and providers in value-based care, e.g., Direct Contracting Entities and Accountable Care Organizations. At Orma, he worked with risk-bearing primary care and specialty groups of many sizes, helping them connect with patients through virtual care technologies. ApolloMed acquired Orma Health in 2022.
+
+ Auditing Algorithm Performance and Equity
+
+ Discussion leader: Alistair Johnson
+
Machine learning algorithms should be easy to evaluate for performance and equity: they generate quantitative predictions that can be compared to their intended target, both in the general population and in under-served groups. But the scarcity of data means that, for most algorithms, we have no idea how they perform, and how much bias they contain. Concretely, there is no way for algorithm developers or potential users to answer the simple question: does this algorithm do what it’s supposed to do? This roundtable will focus on the opportunities and challenges of auditing algorithm performance and equity.
+
Bio: Dr. Johnson is a Scientist at the Hospital for Sick Children. He received his Bachelor of Biomedical and Electrical Engineering at McMaster University and successfully read for a DPhil at the University of Oxford. Dr. Johnson is most well-known for his work on the MIMIC-III Clinical database, a publicly available critical care database used by over 30,000 researchers around the world. His research focuses on the development of new database structures tailored for healthcare and machine learning algorithms for natural language processing, particularly focusing on the deidentification of free-text clinical notes.
+
Thursday, 4:30pm - 6:00pm - Poster Session A (Workshop)
+ Lida Zhang (Texas A&M University); Xiaohan Chen, Tianlong Chen, and Zhangyang Wang (University of Texas at Austin); Bobak J. Mortazavi (Texas A&M University)
+
+ Martha Ferreira (Dalhousie University); Michal Malyska and Nicola Sahar (Semantic Health); Riccardo Miotto (Icahn School of Medicine at Mount Sinai); Fernando Paulovich (Dalhousie University); Evangelos Milios (Dalhousie University, Faculty of Computer Science)
+
+ Kirti Jain (Department of Computer Science, University of Delhi, Delhi, India); Sharanjit Kaur (Acharya Narendra Dev College, University of Delhi, Delhi, India); Vasudha Bhatnagar (Department of Computer Science, University of Delhi, Delhi, India)
+
+ Krishanu Sarker (Georgia State University); Sharbani Pandit (Georgia Institute of Technology); Anupam Sarker (Institute of Epidemiology, Disease Control and Research); Saeid Belkasim and Shihao Ji (Georgia State University)
+
+ Siyu Shi (Department of Medicine, School of Medicine, Stanford University); Ishaan Malhi, Kevin Tran, Andrew Y. Ng, and Pranav Rajpurkar (Department of Computer Science, Stanford University)
+
+ Nathan C. Hurley (Texas A&M University); Alyssa Berkowitz (Yale University); Frederick Masoudi (University of Colorado School of Medicine); Joseph Ross and Nihar Desai (Yale University); Nilay Shah (Mayo Clinic); Sanket Dhruva (UCSF School of Medicine); Bobak J. Mortazavi (Texas A&M University)
+
+ Junwoo Park and Youngwoo Cho (Korea Advanced Institute of Science and Technology (KAIST)); Haneol Lee (Yonsei University); Jaegul Choo and Edward Choi (Korea Advanced Institute of Science and Technology (KAIST))
+
+ Will Ke Wang, Jiamu Yang, Leeor Hershkovich, Hayoung Jeong, Bill Chen, Karnika Singh, Ali R Roghanizad, Md Mobashir Hasan Shandhi, Andrew R Spector, Jessilyn Dunn
+
+ Patrick Kasl, Severine Soltani, Lauryn Keeler Bruce, Varun Kumar Viswanath, Wendy Hartogensis, Amarnath Gupta, Ilkay Altintas, Stephan Dilchert, Frederick M. Hecht, Ashley Mason, Benjamin L. Smarr
+
Thank you to our 2024 sponsors: Gordon and Betty Moore Foundation (Gold), Department of Health Outcomes and Biomedical Informatics at the University of Florida College of Medicine (Gold), Apple (Silver), Genentech (Silver), Google (Silver), The Mount Sinai Hospital (Silver), Computational Precision Health Program at UCSF / UC Berkeley (Silver), UF Health (Silver), Chase Center at University of Pennsylvania (Silver), Department of Biostatistics at University of Pennsylvania (Silver), Department of Biostatistics at Columbia University (Bronze), Health Data Science (Bronze), and the Department of Surgery at University of Minnesota (Bronze)!
Financial support ensures that CHIL remains accessible to a broad set of participants by offsetting the expenses involved in participation. We follow best practices from other conferences to maintain a transparent and appropriate relationship with our funders:
+
+
The substance and structure of the conference are determined independently by the program committees.
+
All papers are chosen through a rigorous, mutually anonymous peer review process, where authors disclose conflicts of interest.
+
All sources of financial support are acknowledged.
+
Benefits are publicly disclosed below.
+
Corporate sponsors cannot specify how contributions are spent.
+
+
+
+
+
2024 Sponsorship Levels
+
Sponsorship of the annual AHLI Conference on Health, Inference and Learning (CHIL) contributes to furthering research and interdisciplinary dialogue around machine learning and health. We deeply appreciate any amount of support your company or foundation can provide.
+
+
Diamond ($20,000 USD)
+
+
Prominent display of company logo on our website
+
Verbal acknowledgment of contribution in the opening and closing remarks of the conference
+
Access to the contact information and CVs of CHIL 2024 attendees who opt in to sharing them for career opportunities
+
Dedicated time during the lunch break to present a 20-minute talk on your company's research or development in machine learning and health
+
Present a demo during the poster session
+
Free registration for up to ten (10) representatives from your organization
+
Free company booth at the venue
+
+
Gold ($10,000 USD)
+
+
Prominent display of company logo on our website
+
Verbal acknowledgment of contribution in the opening and closing remarks of the conference
+
Present a demo during the poster session
+
Free registration for up to five (5) representatives from your organization
+
Free company booth at the venue
+
+
Silver ($5,000 USD)
+
+
Prominent display of company logo on our website
+
Verbal acknowledgment of contribution in the opening and closing remarks of the conference
+
Free registration for up to two (2) representatives from your organization
+
+
Bronze ($2,000 USD)
+
+
Prominent display of company logo on our website
+
Free registration for one (1) representative from your organization
\ No newline at end of file
diff --git a/static/.DS_Store b/static/.DS_Store
new file mode 100644
index 000000000..9a4a5340c
Binary files /dev/null and b/static/.DS_Store differ
diff --git a/static/css/countdown.css b/static/css/countdown.css
new file mode 100644
index 000000000..c8c711f82
--- /dev/null
+++ b/static/css/countdown.css
@@ -0,0 +1,185 @@
+/* COUNTDOWN FLIPPER */
+.countdown-flipper {
+ margin: 0 auto;
+ width: 355px;
+}
+
+.countdown-flipper .wrapper {
+ height: 130px;
+ margin-bottom: 20px;
+}
+
+.countdown-flipper .time {
+ border-radius: 5px;
+ box-shadow: 0 0 10px 0 rgba(0,0,0,0.5);
+ display: inline-block;
+ text-align: center;
+ position: relative;
+ height: 77px;
+ width: 65px;
+
+ -webkit-perspective: 479px;
+ -moz-perspective: 479px;
+ -ms-perspective: 479px;
+ -o-perspective: 479px;
+ perspective: 479px;
+
+ -webkit-backface-visibility: hidden;
+ -moz-backface-visibility: hidden;
+ -ms-backface-visibility: hidden;
+ -o-backface-visibility: hidden;
+ backface-visibility: hidden;
+
+ -webkit-transform: translateZ(0);
+ -moz-transform: translateZ(0);
+ -ms-transform: translateZ(0);
+ -o-transform: translateZ(0);
+ transform: translateZ(0);
+
+ -webkit-transform: translate3d(0,0,0);
+ -moz-transform: translate3d(0,0,0);
+ -ms-transform: translate3d(0,0,0);
+ -o-transform: translate3d(0,0,0);
+ transform: translate3d(0,0,0);
+}
+
+.countdown-flipper .count {
+ background: #202020;
+ color: #f8f8f8;
+ display: block;
+ font-family: 'Oswald', sans-serif;
+ font-size: 2em;
+ line-height: 2.4em;
+ overflow: hidden;
+ position: absolute;
+ text-align: center;
+ text-shadow: 0 0 10px rgba(0, 0, 0, 0.8);
+ top: 0;
+ width: 100%;
+
+ -webkit-transform: translateZ(0);
+ -moz-transform: translateZ(0);
+ -ms-transform: translateZ(0);
+ -o-transform: translateZ(0);
+ transform: translateZ(0);
+
+ -webkit-transform-style: flat;
+ -moz-transform-style: flat;
+ -ms-transform-style: flat;
+ -o-transform-style: flat;
+ transform-style: flat;
+}
+
+.countdown-flipper .count.top {
+ border-top: 1px solid rgba(255,255,255,0.2);
+ border-bottom: 1px solid rgba(255,255,255,0.1);
+ border-radius: 5px 5px 0 0;
+ height: 50%;
+
+ -webkit-transform-origin: 50% 100%;
+ -moz-transform-origin: 50% 100%;
+ -ms-transform-origin: 50% 100%;
+ -o-transform-origin: 50% 100%;
+ transform-origin: 50% 100%;
+}
+
+.countdown-flipper .count.bottom {
+ background-image: linear-gradient(rgba(255,255,255,0.1), transparent);
+ background-image: -webkit-linear-gradient(rgba(255,255,255,0.1), transparent);
+ background-image: -moz-linear-gradient(rgba(255,255,255,0.1), transparent);
+ background-image: -ms-linear-gradient(rgba(255,255,255,0.1), transparent);
+ background-image: -o-linear-gradient(rgba(255,255,255,0.1), transparent);
+ border-top: 1px solid #000;
+ border-bottom: 1px solid #000;
+ border-radius: 0 0 5px 5px;
+ line-height: 0;
+ height: 50%;
+ top: 50%;
+
+ -webkit-transform-origin: 50% 0;
+ -moz-transform-origin: 50% 0;
+ -ms-transform-origin: 50% 0;
+ -o-transform-origin: 50% 0;
+ transform-origin: 50% 0;
+}
+
+.countdown-flipper .label {
+  font-size: medium;
+ margin-top: 5px;
+ display: block;
+ position: absolute;
+ top: 95px;
+ width: 100%;
+ text-shadow: 1px 1px 1px #000;
+}
+
+/* Animation start */
+.countdown-flipper .count.curr.top {
+ -webkit-transform: rotateX(0deg);
+ -moz-transform: rotateX(0deg);
+ -ms-transform: rotateX(0deg);
+ -o-transform: rotateX(0deg);
+ transform: rotateX(0deg);
+ z-index: 3;
+}
+
+.countdown-flipper .count.next.bottom {
+ -webkit-transform: rotateX(90deg);
+ -moz-transform: rotateX(90deg);
+ -ms-transform: rotateX(90deg);
+ -o-transform: rotateX(90deg);
+ transform: rotateX(90deg);
+ z-index: 2;
+}
+
+/* Animation end */
+.countdown-flipper .flip .count.curr.top {
+ -webkit-transition: all 250ms ease-in-out;
+ -moz-transition: all 250ms ease-in-out;
+ -ms-transition: all 250ms ease-in-out;
+ -o-transition: all 250ms ease-in-out;
+ transition: all 250ms ease-in-out;
+ -webkit-transform: rotateX(-90deg);
+ -moz-transform: rotateX(-90deg);
+ -ms-transform: rotateX(-90deg);
+ -o-transform: rotateX(-90deg);
+ transform: rotateX(-90deg);
+}
+
+.countdown-flipper .flip .count.next.bottom {
+ -webkit-transition: all 250ms ease-in-out 250ms;
+ -moz-transition: all 250ms ease-in-out 250ms;
+ -ms-transition: all 250ms ease-in-out 250ms;
+ -o-transition: all 250ms ease-in-out 250ms;
+ transition: all 250ms ease-in-out 250ms;
+ -webkit-transform: rotateX(0deg);
+ -moz-transform: rotateX(0deg);
+ -ms-transform: rotateX(0deg);
+ -o-transform: rotateX(0deg);
+ transform: rotateX(0deg);
+}
+
+@media screen and (max-width: 48em) {
+ .countdown-flipper {
+ width: 100%;
+ }
+
+ .countdown-flipper .countdown-container {
+ height: 100px;
+ }
+
+ .countdown-flipper .time {
+ height: 70px;
+ width: 48px;
+ }
+
+ .countdown-flipper .count {
+ font-size: 1.5em;
+ line-height: 70px;
+ }
+
+ .countdown-flipper .label {
+ font-size: 0.8em;
+ top: 72px;
+ }
+}
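Note: countdown.css styles the flip animation via the `.curr`/`.next`/`.flip` classes, but the script that computes the digits is not part of this diff. A minimal sketch of that computation — the function name and call pattern are assumptions for illustration, not code from this repo:

```typescript
// Split a remaining duration (ms) into the day/hour/minute/second
// groups that the .countdown-flipper markup displays.
function countdownParts(msRemaining: number): {
  days: number; hours: number; minutes: number; seconds: number;
} {
  // Clamp at zero so an expired deadline shows 00:00:00:00.
  const total = Math.max(0, Math.floor(msRemaining / 1000));
  return {
    days: Math.floor(total / 86400),
    hours: Math.floor((total % 86400) / 3600),
    minutes: Math.floor((total % 3600) / 60),
    seconds: total % 60,
  };
}

// e.g. re-render once per second:
//   countdownParts(deadline.getTime() - Date.now())
```

A driver would compare successive outputs and toggle the `.flip` class on any `.time` wrapper whose digit changed, letting the 250ms `rotateX` transitions above animate the change.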
diff --git a/static/css/dropdownmenu.css b/static/css/dropdownmenu.css
new file mode 100644
index 000000000..1ec72cfe4
--- /dev/null
+++ b/static/css/dropdownmenu.css
@@ -0,0 +1,62 @@
+
+.btco-hover-menu a , .navbar > li > a {
+ text-transform: capitalize;
+ padding: 10px 15px;
+}
+.btco-hover-menu .active a,
+.btco-hover-menu .active a:focus,
+.btco-hover-menu .active a:hover,
+.btco-hover-menu li a:hover,
+.btco-hover-menu li a:focus ,
+.navbar>.show>a, .navbar>.show>a:focus, .navbar>.show>a:hover{
+ color: #000;
+ background: transparent;
+ outline: 0;
+}
+
+/*submenu style start from here*/
+.dropdown-menu {
+ padding: 0px 0;
+ margin: 0 0 0;
+  border: 0px solid transparent !important;
+ border: 0px solid rgba(0,0,0,.15);
+ border-radius: 0px;
+ -webkit-box-shadow: none !important;
+ box-shadow: none !important;
+
+}
+
+/*first level*/
+.btco-hover-menu .collapse ul > li:hover > a{background: #f5f5f5;}
+.btco-hover-menu .collapse ul ul > li:hover > a, .navbar .show .dropdown-menu > li > a:focus, .navbar .show .dropdown-menu > li > a:hover{background: #fff;}
+/*second level*/
+.btco-hover-menu .collapse ul ul ul > li:hover > a{background: #fff;}
+
+/*third level*/
+.btco-hover-menu .collapse ul ul, .btco-hover-menu .collapse ul ul.dropdown-menu{background:#f5f5f5;}
+.btco-hover-menu .collapse ul ul ul, .btco-hover-menu .collapse ul ul ul.dropdown-menu{background:#f5f5f5}
+.btco-hover-menu .collapse ul ul ul ul, .btco-hover-menu .collapse ul ul ul ul.dropdown-menu{background:#f5f5f5}
+
+/*Drop-down menu work on hover*/
+.btco-hover-menu{background: none;margin: 0;padding: 0;min-height:20px}
+
+@media only screen and (max-width: 960px) {
+ .btco-hover-menu .show > .dropdown-toggle::after{
+ transform: rotate(-90deg);
+ }
+}
+@media only screen and (min-width: 960px) {
+
+ .btco-hover-menu .collapse ul li{position:relative;}
+ .btco-hover-menu .collapse ul li:hover> ul{display:block}
+ .btco-hover-menu .collapse ul ul{position:absolute;top:100%;left:0;min-width:250px;display:none}
+ /*******/
+ .btco-hover-menu .collapse ul ul li{position:relative}
+ .btco-hover-menu .collapse ul ul li:hover> ul{display:block}
+ .btco-hover-menu .collapse ul ul ul{position:absolute;top:0;left:100%;min-width:250px;display:none}
+ /*******/
+ .btco-hover-menu .collapse ul ul ul li{position:relative}
+ .btco-hover-menu .collapse ul ul ul li:hover ul{display:block}
+ .btco-hover-menu .collapse ul ul ul ul{position:absolute;top:0;left:-100%;min-width:250px;display:none;z-index:1}
+
+}
diff --git a/static/css/fa_regular.css b/static/css/fa_regular.css
new file mode 100644
index 000000000..50fd9e146
--- /dev/null
+++ b/static/css/fa_regular.css
@@ -0,0 +1,21 @@
+/*!
+ * Font Awesome Free 5.14.0 by @fontawesome - https://fontawesome.com
+ * License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License)
+ */
+@font-face {
+ font-family: "Font Awesome 5 Free";
+ font-style: normal;
+ font-weight: 400;
+ font-display: block;
+ src: url("../webfonts/fa-regular-400.eot");
+ src: url("../webfonts/fa-regular-400.eot?#iefix") format("embedded-opentype"),
+ url("../webfonts/fa-regular-400.woff2") format("woff2"),
+ url("../webfonts/fa-regular-400.woff") format("woff"),
+ url("../webfonts/fa-regular-400.ttf") format("truetype"),
+ url("../webfonts/fa-regular-400.svg#fontawesome") format("svg");
+}
+
+.far {
+ font-family: "Font Awesome 5 Free";
+ font-weight: 400;
+}
diff --git a/static/css/fa_solid.css b/static/css/fa_solid.css
new file mode 100644
index 000000000..62998fd87
--- /dev/null
+++ b/static/css/fa_solid.css
@@ -0,0 +1,22 @@
+/*!
+ * Font Awesome Free 5.14.0 by @fontawesome - https://fontawesome.com
+ * License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License)
+ */
+@font-face {
+ font-family: "Font Awesome 5 Free";
+ font-style: normal;
+ font-weight: 900;
+ font-display: block;
+ src: url("../webfonts/fa-solid-900.eot");
+ src: url("../webfonts/fa-solid-900.eot?#iefix") format("embedded-opentype"),
+ url("../webfonts/fa-solid-900.woff2") format("woff2"),
+ url("../webfonts/fa-solid-900.woff") format("woff"),
+ url("../webfonts/fa-solid-900.ttf") format("truetype"),
+ url("../webfonts/fa-solid-900.svg#fontawesome") format("svg");
+}
+
+.fa,
+.fas {
+ font-family: "Font Awesome 5 Free";
+ font-weight: 900;
+}
diff --git a/static/css/fonts/4UaZrEtFpBI4f1ZSIK9d4LjJ4lM3OwRmPg.ttf b/static/css/fonts/4UaZrEtFpBI4f1ZSIK9d4LjJ4lM3OwRmPg.ttf
new file mode 100644
index 000000000..8a858a823
Binary files /dev/null and b/static/css/fonts/4UaZrEtFpBI4f1ZSIK9d4LjJ4lM3OwRmPg.ttf differ
diff --git a/static/css/fonts/Cuprum.css b/static/css/fonts/Cuprum.css
new file mode 100644
index 000000000..868d47c85
--- /dev/null
+++ b/static/css/fonts/Cuprum.css
@@ -0,0 +1,7 @@
+@font-face {
+ font-family: "Cuprum";
+ font-style: normal;
+ font-weight: 400;
+ src: local("Cuprum Regular"), local("Cuprum-Regular"),
+ url(dg4k_pLmvrkcOkBNJutH.ttf) format("truetype");
+}
diff --git a/static/css/fonts/Exo.css b/static/css/fonts/Exo.css
new file mode 100644
index 000000000..9dbc4c905
--- /dev/null
+++ b/static/css/fonts/Exo.css
@@ -0,0 +1,6 @@
+@font-face {
+ font-family: "Exo";
+ font-style: normal;
+ font-weight: 400;
+ src: url(4UaZrEtFpBI4f1ZSIK9d4LjJ4lM3OwRmPg.ttf) format("truetype");
+}
diff --git a/static/css/fonts/Lato.css b/static/css/fonts/Lato.css
new file mode 100644
index 000000000..9d0060cd8
--- /dev/null
+++ b/static/css/fonts/Lato.css
@@ -0,0 +1,16 @@
+@font-face {
+ font-family: "Lato";
+ font-style: normal;
+ font-weight: 400;
+ font-display: swap;
+ src: local("Lato Regular"), local("Lato-Regular"),
+ url(S6uyw4BMUTPHjx4wWw.ttf) format("truetype");
+}
+@font-face {
+ font-family: "Lato";
+ font-style: normal;
+ font-weight: 900;
+ font-display: swap;
+ src: local("Lato Black"), local("Lato-Black"),
+ url(S6u9w4BMUTPHh50XSwiPHA.ttf) format("truetype");
+}
diff --git a/static/css/fonts/Oswald.css b/static/css/fonts/Oswald.css
new file mode 100644
index 000000000..e76874dae
--- /dev/null
+++ b/static/css/fonts/Oswald.css
@@ -0,0 +1,8 @@
+@font-face {
+ font-family: "Oswald";
+ font-style: normal;
+ font-weight: 400;
+ font-display: swap;
+ src: local("Oswald Regular"), local("Oswald-Regular"),
+ url(oswald-regular.ttf) format("truetype");
+}
diff --git a/static/css/fonts/S6u9w4BMUTPHh50XSwiPHA.ttf b/static/css/fonts/S6u9w4BMUTPHh50XSwiPHA.ttf
new file mode 100644
index 000000000..0582f02f2
Binary files /dev/null and b/static/css/fonts/S6u9w4BMUTPHh50XSwiPHA.ttf differ
diff --git a/static/css/fonts/S6uyw4BMUTPHjx4wWw.ttf b/static/css/fonts/S6uyw4BMUTPHjx4wWw.ttf
new file mode 100644
index 000000000..3c2d417ea
Binary files /dev/null and b/static/css/fonts/S6uyw4BMUTPHjx4wWw.ttf differ
diff --git a/static/css/fonts/dg4k_pLmvrkcOkBNJutH.ttf b/static/css/fonts/dg4k_pLmvrkcOkBNJutH.ttf
new file mode 100644
index 000000000..c2a09d176
Binary files /dev/null and b/static/css/fonts/dg4k_pLmvrkcOkBNJutH.ttf differ
diff --git a/static/css/fonts/oswald-regular.ttf b/static/css/fonts/oswald-regular.ttf
new file mode 100644
index 000000000..2492c44a2
Binary files /dev/null and b/static/css/fonts/oswald-regular.ttf differ
diff --git a/static/css/global.css b/static/css/global.css
new file mode 100644
index 000000000..978bed449
--- /dev/null
+++ b/static/css/global.css
@@ -0,0 +1,408 @@
+html,
+body {
+ height: 100%;
+}
+
+body a {
+ color: #2294e0;
+}
+
+body a:hover {
+ color: #2294e0;
+ text-decoration: none;
+}
+
+h2,
+h3 {
+ font-family: "Lato";
+}
+
+label.text-muted,
+.nav-link {
+ font-family: "Lato";
+}
+
+body {
+ font-family: "Lato", sans-serif;
+}
+
+button#btn-login-toggle {
+ border: none;
+ background: none;
+ color: rgba(0,0,0,.5);
+ padding: 0 0.5rem;
+ height: 40px;
+ vertical-align: middle;
+}
+
+button#btn-login-toggle:hover {
+ color: rgba(0,0,0,.7);
+}
+
+button#btn-login-toggle:focus {
+ outline: none;
+}
+
+.hidden,
+.gated-content {
+ display: none;
+}
+
+.invisible {
+ visibility: hidden;
+}
+
+.container a:hover {
+ text-decoration: none;
+}
+
+.container.content-wrapper {
+ /* 100% minus height to keep footer
+ at bottom, on pages with little content */
+ min-height: calc(100% - 298px);
+}
+
+.jumbotron {
+ font-family: "Lato", sans-serif;
+ background-color: rgba(236, 241, 246, 1);
+}
+
+.jumbotron + .container.content-wrapper {
+ min-height: 0;
+}
+
+.page-header {
+ padding: 10px 0;
+}
+
+.btn-group {
+ background-color: white;
+}
+
+.btn {
+ background-color: white;
+}
+
+#main-nav {
+ padding-top: 10px;
+ padding-bottom: 10px;
+ position: relative
+}
+
+.header {
+  font-family: "Montserrat";
+}
+
+.pp-card {
+ font-family: "Exo";
+ box-shadow: 0 0 14px 0 rgba(204, 204, 204, 1);
+ margin-bottom: 1em;
+ display: block;
+ overflow: hidden;
+}
+
+.pp-card-header {
+ border: 4px solid #eee;
+ font-family: "Exo";
+ height: 340px;
+ padding-top: 10px;
+ padding-left: 15px;
+ padding-right: 15px;
+}
+
+.pp-mode-mini .pp-card-header {
+ height: 175px;
+ border: 1px solid #eee;
+}
+
+.pp-card-footer {
+ position: absolute;
+ width: 100%;
+ bottom: 3px;
+}
+
+.pp-mode-mini .pp-card-header .card-title {
+ font-size: 17px;
+}
+
+.cards_img {
+ margin-top: 20px;
+ border-radius: 2px;
+ max-height: 130px;
+ object-fit: scale-down;
+}
+
+body .nav-pills .nav-link.active {
+ background-color: #bed972;
+}
+
+.container {
+ margin-top: 10px;
+}
+
+.card {
+ font-family: "Exo";
+}
+
+.card-header {
+ font-family: "Exo";
+}
+
+.card-header .card-title {
+ font-family: "Exo";
+}
+
+.cards.myCard {
+ display: flex;
+ flex-flow: wrap;
+ margin-left: 0;
+ box-sizing: border-box;
+}
+
+.cards {
+ margin-top: 10px;
+}
+
+.card-title {
+ margin-top: 10px;
+ font-size: 20px;
+}
+
+h2.card-title {
+ margin-top: 10px;
+ font-size: 30px;
+}
+
+h3.card-subtitle.text-muted {
+ margin-top: 10px;
+ font-size: 20px;
+}
+
+.card-subtitle.text-muted {
+ text-align: center;
+ font-size: 13px;
+}
+
+.main-title {
+ font-weight: 700;
+ color: #2294e0;
+ font-family: "Exo";
+}
+
+.myAccordion {
+ box-shadow: 0 0 14px 0 rgba(0, 0, 0, 0.1);
+ border-radius: 10px;
+ margin-bottom: 18px;
+ padding-left: 15px;
+ padding-bottom: 10px;
+ padding-right: 15px;
+ padding-top: 10px;
+ background-color: rgba(255, 255, 255, 1);
+}
+
+.sponsorLogo {
+ width: 300px;
+ display: block;
+ margin-left: auto;
+ margin-right: auto;
+ margin-top: auto;
+ margin-bottom: auto;
+}
+
+.slp iframe {
+ background: #fff;
+ margin: 5px;
+ border: 1px solid rgba(0, 0, 0, 0.125);
+ border-radius: 0.25rem;
+}
+
+.carousel-inner .carousel-item.active,
+.carousel-inner .carousel-item-next,
+.carousel-inner .carousel-item-prev {
+ display: flex;
+}
+
+/* .carousel-inner .carousel-item-right.active, */
+
+/* .carousel-inner .carousel-item-next { */
+
+/* transform: translateX(25%); */
+
+/* } */
+
+/* .carousel-inner .carousel-item-left.active, */
+
+/* .carousel-inner .carousel-item-prev { */
+
+/* transform: translateX(-25%); */
+
+/* } */
+
+/* .carousel-inner .carousel-item-right, */
+
+/* .carousel-inner .carousel-item-left{ */
+
+/* transform: translateX(0); */
+
+/* } */
+
+.carousel-control-prev-icon {
+ filter: invert(1);
+ transform: translateX(-200%);
+}
+
+div.anchor {
+ display: block;
+ position: relative;
+ top: -250px;
+ visibility: hidden;
+}
+
+.carousel-control-next-icon {
+ filter: invert(1);
+ transform: translateX(200%);
+}
+
+.border {
+ background: #fff;
+ padding: 10px;
+ border: 4px solid #eee;
+ box-shadow: rgb(204, 204, 204) 2px 2px 14px 0;
+}
+
+tr.fc-list-item {
+ cursor: pointer;
+}
+
+.details {
+ font-family: "Lato", sans-serif;
+}
+
+#abstractExample.collapse:not(.show) {
+ display: block;
+
+ /* height = lineheight * no of lines to display */
+ height: 4.5em;
+ overflow: hidden;
+}
+
+#abstractExample.collapsing {
+ height: 4.5em;
+}
+
+#absShow.collapsed::after {
+ content: "+ Show More";
+}
+
+#absShow:not(.collapsed)::after {
+ content: "- Show Less";
+}
+
+body .carousel-control-next,
+body .carousel-control-prev {
+ width: 5%;
+}
+
+.icon_video,
+.icon_cal {
+ fill: #6c757d;
+}
+
+.icon_video:hover,
+.icon_cal:hover {
+ fill: #000;
+}
+
+.checkbox-paper {
+ /*opacity: 0.2;*/
+ color: #fcfcfc;
+ font-size: 14pt;
+ font-weight: bold;
+ cursor: pointer;
+ transition-duration: 100ms;
+}
+
+.checkbox-bookmark {
+ color: #ddd;
+ font-size: 18pt;
+ font-weight: bold;
+ cursor: pointer;
+ transition-duration: 100ms;
+}
+
+.checkbox-paper.selected {
+ opacity: 1;
+ color: #6c757d;
+}
+
+.checkbox-bookmark.selected {
+ opacity: 1;
+ color: #bed972;
+}
+
+.image-container {
+ display: flex;
+ flex-wrap: wrap;
+ justify-content: center;
+}
+
+.image-wrapper {
+ flex: 1 0 70%;
+ max-width: 900px;
+ margin-bottom: 20px;
+}
+
+.wider-image-wrapper {
+ flex: 1 0 85%;
+}
+
+.custom-img {
+ width: 100%;
+ height: auto;
+ margin-bottom: 10px;
+}
+
+/*@supports (-webkit-text-stroke: 1px gray) {*/
+/* .checkbox-paper {*/
+/* opacity: 0.2;*/
+/* -webkit-text-stroke: 1px gray;*/
+/* -webkit-text-fill-color: white;*/
+/* }*/
+
+/* .checkbox-paper.selected {*/
+/* opacity: 1;*/
+/* -webkit-text-fill-color: #bed972;*/
+/* }*/
+/*}*/
+
+@media only screen and (max-width: 480px) {
+ .navbar-brand .logo {
+ width: 100px;
+ height: auto;
+ }
+
+ .navbar .container {
+ flex-direction: row-reverse;
+ }
+
+ .navbar-collapse {
+ padding: 0 16px;
+ }
+
+ .navbar-toggler {
+ margin-left: 16px;
+ }
+}
+
+@media only screen and (max-width: 320px) {
+ .navbar-brand.logo-wrapper {
+ width: 100%;
+ margin-right: 0;
+ margin-bottom: 10px;
+ text-align: center;
+ }
+
+ .navbar-brand .logo {
+ width: 50%;
+ }
+}
diff --git a/static/css/lazy_load.css b/static/css/lazy_load.css
new file mode 100644
index 000000000..dd48417ae
--- /dev/null
+++ b/static/css/lazy_load.css
@@ -0,0 +1,3 @@
+.lazy-load-img {
+ min-height: 100px;
+}
diff --git a/static/css/main.css b/static/css/main.css
new file mode 100644
index 000000000..b020c8ddd
--- /dev/null
+++ b/static/css/main.css
@@ -0,0 +1,25 @@
+/* TYPOGRAPHY */
+@import "fonts/Exo.css";
+@import "fonts/Cuprum.css";
+@import "fonts/Lato.css";
+@import "fonts/Oswald.css";
+
+/* GLOBAL STYLES */
+@import "global.css";
+
+/* COMPONENTS */
+@import "countdown.css";
+@import "dropdownmenu.css";
+
+/* PAGES */
+@import "pages/index.css";
+@import "pages/call-for-papers.css";
+@import "pages/sponsor.css";
+@import "pages/schedule.css";
+@import "pages/live.css";
+@import "pages/speaker.css";
+@import "pages/tutorial.css";
+@import "pages/roundtable.css";
+@import "pages/proceeding.css";
+@import "pages/symposium.css";
+@import "pages/committee.css";
diff --git a/static/css/pages/call-for-papers.css b/static/css/pages/call-for-papers.css
new file mode 100644
index 000000000..2805186cc
--- /dev/null
+++ b/static/css/pages/call-for-papers.css
@@ -0,0 +1,15 @@
+/* CALL FOR PAPERS CSS */
+
+a.btn.btn-primary.btn-lg.active {
+ background-color: #278eb4;
+ border-color: #278eb4;
+}
+
+a.btn.btn-primary.btn-lg.active:hover {
+ background-color: #399dbe;
+ border-color: #399dbe;
+}
+
+.alert-text {
+ color: #d20008;
+}
diff --git a/static/css/pages/committee.css b/static/css/pages/committee.css
new file mode 100644
index 000000000..5e2a23ef9
--- /dev/null
+++ b/static/css/pages/committee.css
@@ -0,0 +1,81 @@
+/* COMMITTEE CSS */
+
+.committee-row {
+ column-gap: 30px;
+ row-gap: 25px;
+ justify-content: center;
+}
+
+.committee iframe {
+ border: none;
+}
+
+.committee-info {
+ margin: 60px auto;
+}
+
+.committee-member {
+ display: flex;
+ max-width: 260px;
+}
+
+.committee-member a {
+ color: #000;
+}
+
+.committee-member__card {
+ flex-direction: row;
+ border: 1px solid #eee;
+ border-radius: 15px;
+ padding: 10px;
+ background-color: #fff;
+ display: flex;
+ align-items: flex-start;
+ column-gap: 20px;
+ height: 100%;
+ box-shadow: 1px 1px 9px -6px rgba(0, 0, 0, 0.5);
+}
+
+.committee-member__image {
+ position: relative;
+ max-height: 110px;
+ overflow: hidden;
+ border-radius: 15px;
+ box-shadow: 1px 1px 16px -6px rgba(0, 0, 0, 0.75);
+}
+
+.committee-member__card img {
+ max-width: 130px;
+ transform: translate(-15%, 0%);
+}
+
+.committee-member__info {
+ display: flex;
+ flex-direction: column;
+ justify-content: flex-end;
+ flex-basis: 75%;
+}
+
+
+.committee-member__name {
+ font-size: 17px;
+}
+
+.committee-member__role {
+ font-style: italic;
+}
+
+.committee-member__info p {
+ font-size: 14px;
+ margin-bottom: 0rem;
+ color: #888;
+}
+
+@media only screen and (max-width: 480px) {
+ .committee-row {
+ justify-content: center;
+ }
+}
diff --git a/static/css/pages/index.css b/static/css/pages/index.css
new file mode 100644
index 000000000..619c61be9
--- /dev/null
+++ b/static/css/pages/index.css
@@ -0,0 +1,55 @@
+/* INDEX CSS */
+.timeline-table {
+ border-collapse: collapse;
+}
+
+.timeline-table td,
+.timeline-table th {
+ position: relative;
+}
+
+.timeline-table tr.strikeout td:before,
+.timeline-table tr.strikeout th:before {
+ content: " ";
+ position: absolute;
+ top: 50%;
+ left: 0;
+ border-bottom: 1px solid #444;
+ width: 100%;
+}
+
+.main-page-speaker-group-row {
+ justify-content: center;
+}
+
+.main-page-speaker-block-title {
+ font-size: 1.85em;
+ font-family: "Exo";
+ color: #4e5459;
+}
+
+.main-page-speaker-name {
+ font-size: 1.2em;
+ font-family: "Exo";
+ color: #6c757d;
+}
+
+.main-page-speaker-affiliation {
+ font-size: 1em;
+ font-family: "Exo";
+ color: #6c757d;
+}
+
+.main-page-state-of-ml-talk-title {
+ font-size: 1.25em;
+ font-family: "Exo";
+ color: #4e5459;
+}
+
+.main-page-speaker-highlight {
+ text-align: center;
+}
+
+.main-page-speaker-group {
+ box-shadow: 5px 5px 10px rgba(0, 0, 0, 0.5);
+}
diff --git a/static/css/pages/live.css b/static/css/pages/live.css
new file mode 100644
index 000000000..db7c5d08d
--- /dev/null
+++ b/static/css/pages/live.css
@@ -0,0 +1,17 @@
+/* LIVE CSS */
+
+.live iframe {
+ border: none;
+}
+
+.live .table-tutorials {
+ margin-bottom: 60px;
+}
+
+.live .table-tutorials td {
+ width: 33.33%;
+}
+
+.live .table-tutorials a {
+ font-size: 1.2rem;
+}
diff --git a/static/css/pages/proceeding.css b/static/css/pages/proceeding.css
new file mode 100644
index 000000000..fafa0a0bd
--- /dev/null
+++ b/static/css/pages/proceeding.css
@@ -0,0 +1,5 @@
+/* PROCEEDING DETAIL CSS */
+
+.proceeding iframe {
+ border: none;
+}
diff --git a/static/css/pages/roundtable.css b/static/css/pages/roundtable.css
new file mode 100644
index 000000000..9d46776ca
--- /dev/null
+++ b/static/css/pages/roundtable.css
@@ -0,0 +1,5 @@
+/* ROUNDTABLE DETAIL CSS */
+
+.roundtable iframe {
+ border: none;
+}
diff --git a/static/css/pages/schedule.css b/static/css/pages/schedule.css
new file mode 100644
index 000000000..aba49a5c3
--- /dev/null
+++ b/static/css/pages/schedule.css
@@ -0,0 +1,61 @@
+/* SCHEDULE CSS */
+
+.schedule-table .table-primary {
+ background-color: #666;
+ color: #fff;
+}
+
+.schedule-table .table-secondary {
+ background-color: #d9d9d9;
+}
+
+.schedule-table .table-secondary.time {
+ min-width: 85px;
+}
+
+.schedule-table .method {
+ width: 110px;
+}
+
+.schedule-table .table-keynote {
+ background-color: #83a5a7;
+}
+
+.schedule-table .table-keynote-secondary {
+ background-color: #c7dee0;
+}
+
+.schedule-table .table-proceedings {
+ background-color: #b4a7d6;
+}
+
+.schedule-table .table-proceedings-secondary {
+ background-color: #d9d2e9;
+}
+
+.schedule-table .table-tutorial {
+ background-color: #bfc375;
+}
+
+.schedule-table .table-tutorial-secondary {
+ background-color: #e1e3b7;
+}
+
+.schedule-table .table-roundtable {
+ background-color: #bfc375;
+}
+
+.schedule-table .table-poster {
+ background-color: #f9cb9c;
+}
+
+
+.table-bordered tbody>tr>td.keynote {
+ border-left: hidden!important;
+}
+
+
+em {
+ font-size: 0.88rem; /* 14px */
+ color: #999;
+}
diff --git a/static/css/pages/speaker.css b/static/css/pages/speaker.css
new file mode 100644
index 000000000..8a386942d
--- /dev/null
+++ b/static/css/pages/speaker.css
@@ -0,0 +1,5 @@
+/* SPEAKER DETAIL CSS */
+
+.speaker iframe {
+ border: none;
+}
diff --git a/static/css/pages/sponsor.css b/static/css/pages/sponsor.css
new file mode 100644
index 000000000..c8c9272a2
--- /dev/null
+++ b/static/css/pages/sponsor.css
@@ -0,0 +1,26 @@
+/* SPONSOR CSS */
+.sponsors {
+ margin-top: 20px;
+ margin-bottom: 20px;
+}
+
+.pp-card.sponsor .pp-card-header {
+ height: 130px;
+ display: flex;
+ align-items: center;
+ justify-content: center;
+ padding: 15px;
+}
+
+.pp-card.sponsor.silver .pp-card-header {
+ height: 120px;
+}
+
+.pp-card.sponsor.bronze .pp-card-header {
+ height: 100px;
+}
+
+.pp-card.sponsor img {
+ max-height: 100%;
+ max-width: 100%;
+}
diff --git a/static/css/pages/symposium.css b/static/css/pages/symposium.css
new file mode 100644
index 000000000..bd9dfa794
--- /dev/null
+++ b/static/css/pages/symposium.css
@@ -0,0 +1,5 @@
+/* PROCEEDING DETAIL CSS */
+
+.symposium iframe {
+ border: none;
+}
diff --git a/static/css/pages/tutorial.css b/static/css/pages/tutorial.css
new file mode 100644
index 000000000..2ee420ce9
--- /dev/null
+++ b/static/css/pages/tutorial.css
@@ -0,0 +1,5 @@
+/* TUTORIAL DETAIL CSS */
+
+.tutorial iframe {
+ border: none;
+}
diff --git a/static/css/paper_vis.css b/static/css/paper_vis.css
new file mode 100644
index 000000000..930037550
--- /dev/null
+++ b/static/css/paper_vis.css
@@ -0,0 +1,77 @@
+.dot {
+ fill: rgb(43, 113, 255);
+ fill-opacity: 0.5;
+ transition-duration: 0.2s;
+}
+
+.dot:hover {
+ fill-opacity: 1;
+}
+
+.dot.highlight {
+ fill-opacity: 1;
+ fill: rgb(255, 54, 72);
+}
+.dot.non-highlight {
+ fill: rgba(128, 128, 128, 0.62);
+}
+
+.dot.rect_selected {
+ /*fill: rgb(169, 208, 62);*/
+ stroke: #333;
+ stroke-width: 1;
+ stroke-opacity: 1;
+}
+
+.dot.highlight_sel {
+ fill-opacity: 1;
+ fill: rgb(255, 54, 72);
+}
+
+.dot.read {
+ fill: gray;
+}
+.dot.bookmarked {
+ fill: #bed972;
+ fill-opacity: 1;
+}
+
+.tt-title {
+ font-weight: bold;
+}
+
+.topWords {
+ display: inline-block;
+ padding: 2px 4px;
+ margin: 2px;
+ background: #666666;
+ color: white;
+ border-radius: 2px;
+}
+
+.p_title {
+ font-weight: bold;
+}
+.p_authors {
+ font-size: small;
+}
+.sel_paper {
+ background: white;
+ padding: 5px;
+ margin: 5px 0;
+ border-radius: 5px;
+ cursor: pointer;
+}
+.sel_paper:hover {
+ background: lightgray;
+}
+
+.results {
+ width: 200px;
+}
+
+@media (max-width: 767.98px) {
+ .results {
+ width: 90%;
+ }
+}
diff --git a/static/css/typeahead.css b/static/css/typeahead.css
new file mode 100644
index 000000000..d62d4d392
--- /dev/null
+++ b/static/css/typeahead.css
@@ -0,0 +1,56 @@
+.tt-query,
+.tt-hint {
+ width: 396px;
+ height: 30px;
+ padding: 8px 12px;
+ font-size: 24px;
+ line-height: 30px;
+ border: 2px solid #ccc;
+ border-radius: 8px;
+ outline: none;
+}
+
+.tt-query {
+ box-shadow: inset 0 1px 1px rgba(0, 0, 0, 0.075);
+}
+
+.tt-hint {
+ color: #999;
+}
+
+.tt-menu {
+  width: 422px;
+  max-height: 150px;
+  margin-top: 12px;
+  padding: 8px 0;
+  overflow-y: auto;
+  background-color: #fff;
+  border: 1px solid #ccc;
+  border: 1px solid rgba(0, 0, 0, 0.2);
+  border-radius: 8px;
+  box-shadow: 0 5px 10px rgba(0, 0, 0, 0.2);
+}
+
+.tt-suggestion {
+ padding: 3px 20px;
+ font-size: 18px;
+ line-height: 24px;
+}
+
+.tt-suggestion.tt-cursor {
+ color: #fff;
+ background-color: #0097cf;
+}
+
+.tt-suggestion p {
+ margin: 0;
+}
+
+.tt-suggestion:hover {
+ cursor: pointer;
+}
+
+/*.twitter-typeahead, .tt-hint, .tt-input, .tt-menu { width: 90%; }*/
diff --git a/static/images/CSAIL.jpg b/static/images/CSAIL.jpg
new file mode 100644
index 000000000..1c2b5c71c
Binary files /dev/null and b/static/images/CSAIL.jpg differ
diff --git a/static/images/GLTR_poster.pdf b/static/images/GLTR_poster.pdf
new file mode 100644
index 000000000..2cb13d724
Binary files /dev/null and b/static/images/GLTR_poster.pdf differ
diff --git a/static/images/MiniConf.png b/static/images/MiniConf.png
new file mode 100644
index 000000000..7b73b1a62
Binary files /dev/null and b/static/images/MiniConf.png differ
diff --git a/static/images/attend/travel_2024_image1.png b/static/images/attend/travel_2024_image1.png
new file mode 100644
index 000000000..9983fb9b0
Binary files /dev/null and b/static/images/attend/travel_2024_image1.png differ
diff --git a/static/images/attend/travel_2024_image2.png b/static/images/attend/travel_2024_image2.png
new file mode 100644
index 000000000..363b76d3f
Binary files /dev/null and b/static/images/attend/travel_2024_image2.png differ
diff --git a/static/images/attend/travel_2024_image3.png b/static/images/attend/travel_2024_image3.png
new file mode 100644
index 000000000..e1d7b3af3
Binary files /dev/null and b/static/images/attend/travel_2024_image3.png differ
diff --git a/static/images/committee/ahmed-alaa.jpg b/static/images/committee/ahmed-alaa.jpg
new file mode 100644
index 000000000..ff62e7969
Binary files /dev/null and b/static/images/committee/ahmed-alaa.jpg differ
diff --git a/static/images/committee/anon-person.jpg b/static/images/committee/anon-person.jpg
new file mode 100644
index 000000000..ee14510e1
Binary files /dev/null and b/static/images/committee/anon-person.jpg differ
diff --git a/static/images/committee/bobak-mortazavi.jpg b/static/images/committee/bobak-mortazavi.jpg
new file mode 100644
index 000000000..4cdcffceb
Binary files /dev/null and b/static/images/committee/bobak-mortazavi.jpg differ
diff --git a/static/images/committee/brian-gow.jpg b/static/images/committee/brian-gow.jpg
new file mode 100644
index 000000000..1a4aae5f5
Binary files /dev/null and b/static/images/committee/brian-gow.jpg differ
diff --git a/static/images/committee/chengxi-zang.jpg b/static/images/committee/chengxi-zang.jpg
new file mode 100644
index 000000000..861c0e1b8
Binary files /dev/null and b/static/images/committee/chengxi-zang.jpg differ
diff --git a/static/images/committee/edward-choi.jpg b/static/images/committee/edward-choi.jpg
new file mode 100644
index 000000000..326f9a66e
Binary files /dev/null and b/static/images/committee/edward-choi.jpg differ
diff --git a/static/images/committee/elena-sizikova.jpg b/static/images/committee/elena-sizikova.jpg
new file mode 100644
index 000000000..7619341ee
Binary files /dev/null and b/static/images/committee/elena-sizikova.jpg differ
diff --git a/static/images/committee/elizabeth-healey.jpg b/static/images/committee/elizabeth-healey.jpg
new file mode 100644
index 000000000..ab3b8c261
Binary files /dev/null and b/static/images/committee/elizabeth-healey.jpg differ
diff --git a/static/images/committee/emma-rocheteau.jpg b/static/images/committee/emma-rocheteau.jpg
new file mode 100644
index 000000000..408f16163
Binary files /dev/null and b/static/images/committee/emma-rocheteau.jpg differ
diff --git a/static/images/committee/fei-wang.jpg b/static/images/committee/fei-wang.jpg
new file mode 100644
index 000000000..e79fe46b8
Binary files /dev/null and b/static/images/committee/fei-wang.jpg differ
diff --git a/static/images/committee/george-chen.jpg b/static/images/committee/george-chen.jpg
new file mode 100644
index 000000000..dab549ff5
Binary files /dev/null and b/static/images/committee/george-chen.jpg differ
diff --git a/static/images/committee/irene-chen.jpg b/static/images/committee/irene-chen.jpg
new file mode 100644
index 000000000..d82435b10
Binary files /dev/null and b/static/images/committee/irene-chen.jpg differ
diff --git a/static/images/committee/jiacheng-zhu.jpg b/static/images/committee/jiacheng-zhu.jpg
new file mode 100644
index 000000000..f0ba251e9
Binary files /dev/null and b/static/images/committee/jiacheng-zhu.jpg differ
diff --git a/static/images/committee/kai-wang.jpg b/static/images/committee/kai-wang.jpg
new file mode 100644
index 000000000..f2a520f0c
Binary files /dev/null and b/static/images/committee/kai-wang.jpg differ
diff --git a/static/images/committee/kaveri-thakoor.jpg b/static/images/committee/kaveri-thakoor.jpg
new file mode 100644
index 000000000..4d4c2a813
Binary files /dev/null and b/static/images/committee/kaveri-thakoor.jpg differ
diff --git a/static/images/committee/koyena-pal.jpg b/static/images/committee/koyena-pal.jpg
new file mode 100644
index 000000000..b8d6f7f4b
Binary files /dev/null and b/static/images/committee/koyena-pal.jpg differ
diff --git a/static/images/committee/marzyeh-ghassemi.jpg b/static/images/committee/marzyeh-ghassemi.jpg
new file mode 100644
index 000000000..2be228e4b
Binary files /dev/null and b/static/images/committee/marzyeh-ghassemi.jpg differ
diff --git a/static/images/committee/matthew-mcdermott.jpg b/static/images/committee/matthew-mcdermott.jpg
new file mode 100644
index 000000000..8ebadbd5e
Binary files /dev/null and b/static/images/committee/matthew-mcdermott.jpg differ
diff --git a/static/images/committee/michael-c-hughes.jpg b/static/images/committee/michael-c-hughes.jpg
new file mode 100644
index 000000000..377f386d7
Binary files /dev/null and b/static/images/committee/michael-c-hughes.jpg differ
diff --git a/static/images/committee/monica-agrawal.jpg b/static/images/committee/monica-agrawal.jpg
new file mode 100644
index 000000000..c62bd580d
Binary files /dev/null and b/static/images/committee/monica-agrawal.jpg differ
diff --git a/static/images/committee/monica-munnangi.jpg b/static/images/committee/monica-munnangi.jpg
new file mode 100644
index 000000000..b7c27fee3
Binary files /dev/null and b/static/images/committee/monica-munnangi.jpg differ
diff --git a/static/images/committee/pankhuri-singhal.jpg b/static/images/committee/pankhuri-singhal.jpg
new file mode 100644
index 000000000..742840316
Binary files /dev/null and b/static/images/committee/pankhuri-singhal.jpg differ
diff --git a/static/images/committee/roshan-kenia.jpg b/static/images/committee/roshan-kenia.jpg
new file mode 100644
index 000000000..7c8f58d03
Binary files /dev/null and b/static/images/committee/roshan-kenia.jpg differ
diff --git a/static/images/committee/tom-pollard.jpg b/static/images/committee/tom-pollard.jpg
new file mode 100644
index 000000000..a5993caba
Binary files /dev/null and b/static/images/committee/tom-pollard.jpg differ
diff --git a/static/images/committee/xiaoxiao-li.jpg b/static/images/committee/xiaoxiao-li.jpg
new file mode 100644
index 000000000..a412dc698
Binary files /dev/null and b/static/images/committee/xiaoxiao-li.jpg differ
diff --git a/static/images/committee/zehra-abedi.jpg b/static/images/committee/zehra-abedi.jpg
new file mode 100644
index 000000000..ef746fc7a
Binary files /dev/null and b/static/images/committee/zehra-abedi.jpg differ
diff --git a/static/images/favicon.png b/static/images/favicon.png
new file mode 100644
index 000000000..babd44203
Binary files /dev/null and b/static/images/favicon.png differ
diff --git a/static/images/gather-avatar.png b/static/images/gather-avatar.png
new file mode 100644
index 000000000..cf9d299e9
Binary files /dev/null and b/static/images/gather-avatar.png differ
diff --git a/static/images/main.jpg b/static/images/main.jpg
new file mode 100644
index 000000000..dd628c333
Binary files /dev/null and b/static/images/main.jpg differ
diff --git a/static/images/poster_session.jpg b/static/images/poster_session.jpg
new file mode 100644
index 000000000..1dc6ed8b0
Binary files /dev/null and b/static/images/poster_session.jpg differ
diff --git a/static/images/speakers/a-karthikesalingam.jpg b/static/images/speakers/a-karthikesalingam.jpg
new file mode 100644
index 000000000..f7bd540a0
Binary files /dev/null and b/static/images/speakers/a-karthikesalingam.jpg differ
diff --git a/static/images/speakers/alistair_johnson.jpg b/static/images/speakers/alistair_johnson.jpg
new file mode 100644
index 000000000..23765a96b
Binary files /dev/null and b/static/images/speakers/alistair_johnson.jpg differ
diff --git a/static/images/speakers/ana_crisan.jpg b/static/images/speakers/ana_crisan.jpg
new file mode 100644
index 000000000..3ae48d5eb
Binary files /dev/null and b/static/images/speakers/ana_crisan.jpg differ
diff --git a/static/images/speakers/ashley_beecy.jpeg b/static/images/speakers/ashley_beecy.jpeg
new file mode 100644
index 000000000..f97e3aa78
Binary files /dev/null and b/static/images/speakers/ashley_beecy.jpeg differ
diff --git a/static/images/speakers/ben_glocker.jpg b/static/images/speakers/ben_glocker.jpg
new file mode 100644
index 000000000..b07864995
Binary files /dev/null and b/static/images/speakers/ben_glocker.jpg differ
diff --git a/static/images/speakers/blank_headshot.jpg b/static/images/speakers/blank_headshot.jpg
new file mode 100644
index 000000000..2dd0e8d2f
Binary files /dev/null and b/static/images/speakers/blank_headshot.jpg differ
diff --git a/static/images/speakers/byron_wallace.jpg b/static/images/speakers/byron_wallace.jpg
new file mode 100644
index 000000000..df8f4ed05
Binary files /dev/null and b/static/images/speakers/byron_wallace.jpg differ
diff --git a/static/images/speakers/christopher_chute.jpg b/static/images/speakers/christopher_chute.jpg
new file mode 100644
index 000000000..bcd0f933a
Binary files /dev/null and b/static/images/speakers/christopher_chute.jpg differ
diff --git a/static/images/speakers/danielle_belgrave.jpg b/static/images/speakers/danielle_belgrave.jpg
new file mode 100644
index 000000000..76dd7e81f
Binary files /dev/null and b/static/images/speakers/danielle_belgrave.jpg differ
diff --git a/static/images/speakers/david_meltzer.jpg b/static/images/speakers/david_meltzer.jpg
new file mode 100644
index 000000000..64e614da3
Binary files /dev/null and b/static/images/speakers/david_meltzer.jpg differ
diff --git a/static/images/speakers/deb_raji-square.jpg b/static/images/speakers/deb_raji-square.jpg
new file mode 100644
index 000000000..7ac9c02d2
Binary files /dev/null and b/static/images/speakers/deb_raji-square.jpg differ
diff --git a/static/images/speakers/deb_raji.jpg b/static/images/speakers/deb_raji.jpg
new file mode 100644
index 000000000..da8741aa2
Binary files /dev/null and b/static/images/speakers/deb_raji.jpg differ
diff --git a/static/images/speakers/dhanya_sridhar.jpg b/static/images/speakers/dhanya_sridhar.jpg
new file mode 100644
index 000000000..454805880
Binary files /dev/null and b/static/images/speakers/dhanya_sridhar.jpg differ
diff --git a/static/images/speakers/dina_demner_fushman.jpg b/static/images/speakers/dina_demner_fushman.jpg
new file mode 100644
index 000000000..e90a29c1a
Binary files /dev/null and b/static/images/speakers/dina_demner_fushman.jpg differ
diff --git a/static/images/speakers/dina_katabi.jpg b/static/images/speakers/dina_katabi.jpg
new file mode 100644
index 000000000..d465f2b47
Binary files /dev/null and b/static/images/speakers/dina_katabi.jpg differ
diff --git a/static/images/speakers/elaine_nsoesie.jpg b/static/images/speakers/elaine_nsoesie.jpg
new file mode 100644
index 000000000..4c695b89a
Binary files /dev/null and b/static/images/speakers/elaine_nsoesie.jpg differ
diff --git a/static/images/speakers/emma_pierson.jpeg b/static/images/speakers/emma_pierson.jpeg
new file mode 100644
index 000000000..83b49c6d2
Binary files /dev/null and b/static/images/speakers/emma_pierson.jpeg differ
diff --git a/static/images/speakers/esra_suel.jpeg b/static/images/speakers/esra_suel.jpeg
new file mode 100644
index 000000000..7679ff7cd
Binary files /dev/null and b/static/images/speakers/esra_suel.jpeg differ
diff --git a/static/images/speakers/f_perry_wilson.png b/static/images/speakers/f_perry_wilson.png
new file mode 100644
index 000000000..d04817ba3
Binary files /dev/null and b/static/images/speakers/f_perry_wilson.png differ
diff --git a/static/images/speakers/girish_nadkarni.png b/static/images/speakers/girish_nadkarni.png
new file mode 100644
index 000000000..5eb484b4e
Binary files /dev/null and b/static/images/speakers/girish_nadkarni.png differ
diff --git a/static/images/speakers/hamsa-bastani-square.jpeg b/static/images/speakers/hamsa-bastani-square.jpeg
new file mode 100644
index 000000000..b034d1448
Binary files /dev/null and b/static/images/speakers/hamsa-bastani-square.jpeg differ
diff --git a/static/images/speakers/hamsa-bastani.jpeg b/static/images/speakers/hamsa-bastani.jpeg
new file mode 100644
index 000000000..fc4f5abdd
Binary files /dev/null and b/static/images/speakers/hamsa-bastani.jpeg differ
diff --git a/static/images/speakers/isaac_kohane.jpg b/static/images/speakers/isaac_kohane.jpg
new file mode 100644
index 000000000..8575be804
Binary files /dev/null and b/static/images/speakers/isaac_kohane.jpg differ
diff --git a/static/images/speakers/jason_fries.jpg b/static/images/speakers/jason_fries.jpg
new file mode 100644
index 000000000..e5dde3e52
Binary files /dev/null and b/static/images/speakers/jason_fries.jpg differ
diff --git a/static/images/speakers/jessica_tenenbaum.jpg b/static/images/speakers/jessica_tenenbaum.jpg
new file mode 100644
index 000000000..b1e8e7984
Binary files /dev/null and b/static/images/speakers/jessica_tenenbaum.jpg differ
diff --git a/static/images/speakers/john_halamka.jpg b/static/images/speakers/john_halamka.jpg
new file mode 100644
index 000000000..e60159108
Binary files /dev/null and b/static/images/speakers/john_halamka.jpg differ
diff --git a/static/images/speakers/jun_cheng.jpg b/static/images/speakers/jun_cheng.jpg
new file mode 100644
index 000000000..2833bf3e4
Binary files /dev/null and b/static/images/speakers/jun_cheng.jpg differ
diff --git a/static/images/speakers/jure_leskovec.jpg b/static/images/speakers/jure_leskovec.jpg
new file mode 100644
index 000000000..546affca9
Binary files /dev/null and b/static/images/speakers/jure_leskovec.jpg differ
diff --git a/static/images/speakers/karandeep_singh.jpg b/static/images/speakers/karandeep_singh.jpg
new file mode 100644
index 000000000..2b886149f
Binary files /dev/null and b/static/images/speakers/karandeep_singh.jpg differ
diff --git a/static/images/speakers/katie_link.jpeg b/static/images/speakers/katie_link.jpeg
new file mode 100644
index 000000000..507031888
Binary files /dev/null and b/static/images/speakers/katie_link.jpeg differ
diff --git a/static/images/speakers/khaled_el_emam.png b/static/images/speakers/khaled_el_emam.png
new file mode 100644
index 000000000..d89f74100
Binary files /dev/null and b/static/images/speakers/khaled_el_emam.png differ
diff --git a/static/images/speakers/kyra_gan.png b/static/images/speakers/kyra_gan.png
new file mode 100644
index 000000000..c8c6e40c3
Binary files /dev/null and b/static/images/speakers/kyra_gan.png differ
diff --git a/static/images/speakers/kyunghyun_cho.jpeg b/static/images/speakers/kyunghyun_cho.jpeg
new file mode 100644
index 000000000..40b1d8f50
Binary files /dev/null and b/static/images/speakers/kyunghyun_cho.jpeg differ
diff --git a/static/images/speakers/lauren_oakden_rayner.jpg b/static/images/speakers/lauren_oakden_rayner.jpg
new file mode 100644
index 000000000..3f15d8485
Binary files /dev/null and b/static/images/speakers/lauren_oakden_rayner.jpg differ
diff --git a/static/images/speakers/leo_celi-square.png b/static/images/speakers/leo_celi-square.png
new file mode 100644
index 000000000..ef2551566
Binary files /dev/null and b/static/images/speakers/leo_celi-square.png differ
diff --git a/static/images/speakers/leo_celi.jpeg b/static/images/speakers/leo_celi.jpeg
new file mode 100644
index 000000000..9006a71db
Binary files /dev/null and b/static/images/speakers/leo_celi.jpeg differ
diff --git a/static/images/speakers/leo_celi.png b/static/images/speakers/leo_celi.png
new file mode 100644
index 000000000..6078e3a0b
Binary files /dev/null and b/static/images/speakers/leo_celi.png differ
diff --git a/static/images/speakers/li_xiong.png b/static/images/speakers/li_xiong.png
new file mode 100644
index 000000000..45ed367f1
Binary files /dev/null and b/static/images/speakers/li_xiong.png differ
diff --git a/static/images/speakers/lorin_crawford.jpg b/static/images/speakers/lorin_crawford.jpg
new file mode 100644
index 000000000..e0cedbcb1
Binary files /dev/null and b/static/images/speakers/lorin_crawford.jpg differ
diff --git a/static/images/speakers/m-jacobs.png b/static/images/speakers/m-jacobs.png
new file mode 100644
index 000000000..5ede53d34
Binary files /dev/null and b/static/images/speakers/m-jacobs.png differ
diff --git a/static/images/speakers/m-sendak.jpg b/static/images/speakers/m-sendak.jpg
new file mode 100644
index 000000000..48ec8a157
Binary files /dev/null and b/static/images/speakers/m-sendak.jpg differ
diff --git a/static/images/speakers/maia_hightower.jpg b/static/images/speakers/maia_hightower.jpg
new file mode 100644
index 000000000..269286472
Binary files /dev/null and b/static/images/speakers/maia_hightower.jpg differ
diff --git a/static/images/speakers/marzyeh_ghassemi.jpeg b/static/images/speakers/marzyeh_ghassemi.jpeg
new file mode 100644
index 000000000..3472f9fe9
Binary files /dev/null and b/static/images/speakers/marzyeh_ghassemi.jpeg differ
diff --git a/static/images/speakers/munmun_de_choudhury.jpg b/static/images/speakers/munmun_de_choudhury.jpg
new file mode 100644
index 000000000..fd16bba18
Binary files /dev/null and b/static/images/speakers/munmun_de_choudhury.jpg differ
diff --git a/static/images/speakers/n-razavian.jpg b/static/images/speakers/n-razavian.jpg
new file mode 100644
index 000000000..7e6d62d03
Binary files /dev/null and b/static/images/speakers/n-razavian.jpg differ
diff --git a/static/images/speakers/nigam_shah.png b/static/images/speakers/nigam_shah.png
new file mode 100644
index 000000000..42d0b9d53
Binary files /dev/null and b/static/images/speakers/nigam_shah.png differ
diff --git a/static/images/speakers/nils_gehlenborg-square.jpeg b/static/images/speakers/nils_gehlenborg-square.jpeg
new file mode 100644
index 000000000..40168ff67
Binary files /dev/null and b/static/images/speakers/nils_gehlenborg-square.jpeg differ
diff --git a/static/images/speakers/nils_gehlenborg.jpeg b/static/images/speakers/nils_gehlenborg.jpeg
new file mode 100644
index 000000000..4a094e9d7
Binary files /dev/null and b/static/images/speakers/nils_gehlenborg.jpeg differ
diff --git a/static/images/speakers/noemie_elhadad-square.jpeg b/static/images/speakers/noemie_elhadad-square.jpeg
new file mode 100644
index 000000000..97c2d534c
Binary files /dev/null and b/static/images/speakers/noemie_elhadad-square.jpeg differ
diff --git a/static/images/speakers/noemie_elhadad.jpeg b/static/images/speakers/noemie_elhadad.jpeg
new file mode 100644
index 000000000..2d3759a3d
Binary files /dev/null and b/static/images/speakers/noemie_elhadad.jpeg differ
diff --git a/static/images/speakers/nuria_oliver.jpg b/static/images/speakers/nuria_oliver.jpg
new file mode 100644
index 000000000..bae619bd3
Binary files /dev/null and b/static/images/speakers/nuria_oliver.jpg differ
diff --git a/static/images/speakers/r-barzilay.jpg b/static/images/speakers/r-barzilay.jpg
new file mode 100644
index 000000000..b2c3af9b6
Binary files /dev/null and b/static/images/speakers/r-barzilay.jpg differ
diff --git a/static/images/speakers/robert_platt.jpg b/static/images/speakers/robert_platt.jpg
new file mode 100644
index 000000000..ba9682bfa
Binary files /dev/null and b/static/images/speakers/robert_platt.jpg differ
diff --git a/static/images/speakers/rosa_arriaga.png b/static/images/speakers/rosa_arriaga.png
new file mode 100644
index 000000000..b19410afa
Binary files /dev/null and b/static/images/speakers/rosa_arriaga.png differ
diff --git a/static/images/speakers/roxana_daneshjou.jpg b/static/images/speakers/roxana_daneshjou.jpg
new file mode 100644
index 000000000..bb800af56
Binary files /dev/null and b/static/images/speakers/roxana_daneshjou.jpg differ
diff --git a/static/images/speakers/roy_perlis.jpeg b/static/images/speakers/roy_perlis.jpeg
new file mode 100644
index 000000000..9d009f44e
Binary files /dev/null and b/static/images/speakers/roy_perlis.jpeg differ
diff --git a/static/images/speakers/rui_duan.jpg b/static/images/speakers/rui_duan.jpg
new file mode 100644
index 000000000..09b14ac77
Binary files /dev/null and b/static/images/speakers/rui_duan.jpg differ
diff --git a/static/images/speakers/rumi_chunara.jpg b/static/images/speakers/rumi_chunara.jpg
new file mode 100644
index 000000000..e2a54c496
Binary files /dev/null and b/static/images/speakers/rumi_chunara.jpg differ
diff --git a/static/images/speakers/ruslan_salakhutdinov.png b/static/images/speakers/ruslan_salakhutdinov.png
new file mode 100644
index 000000000..d0a8524a3
Binary files /dev/null and b/static/images/speakers/ruslan_salakhutdinov.png differ
diff --git a/static/images/speakers/saadia_gabriel.jpg b/static/images/speakers/saadia_gabriel.jpg
new file mode 100644
index 000000000..8c403be47
Binary files /dev/null and b/static/images/speakers/saadia_gabriel.jpg differ
diff --git a/static/images/speakers/samantha_kleinberg-square.jpg b/static/images/speakers/samantha_kleinberg-square.jpg
new file mode 100644
index 000000000..aafd63ffb
Binary files /dev/null and b/static/images/speakers/samantha_kleinberg-square.jpg differ
diff --git a/static/images/speakers/samantha_kleinberg.jpg b/static/images/speakers/samantha_kleinberg.jpg
new file mode 100644
index 000000000..71598fb51
Binary files /dev/null and b/static/images/speakers/samantha_kleinberg.jpg differ
diff --git a/static/images/speakers/sanmi_koyejo-square.jpg b/static/images/speakers/sanmi_koyejo-square.jpg
new file mode 100644
index 000000000..bc94dcb77
Binary files /dev/null and b/static/images/speakers/sanmi_koyejo-square.jpg differ
diff --git a/static/images/speakers/sanmi_koyejo.jpg b/static/images/speakers/sanmi_koyejo.jpg
new file mode 100644
index 000000000..7ddb26ebd
Binary files /dev/null and b/static/images/speakers/sanmi_koyejo.jpg differ
diff --git a/static/images/speakers/sherri_rose.png b/static/images/speakers/sherri_rose.png
new file mode 100644
index 000000000..4c8549800
Binary files /dev/null and b/static/images/speakers/sherri_rose.png differ
diff --git a/static/images/speakers/suchi_saria.jpg b/static/images/speakers/suchi_saria.jpg
new file mode 100644
index 000000000..f1f1207c4
Binary files /dev/null and b/static/images/speakers/suchi_saria.jpg differ
diff --git a/static/images/speakers/t-cai.jpg b/static/images/speakers/t-cai.jpg
new file mode 100644
index 000000000..98908c46a
Binary files /dev/null and b/static/images/speakers/t-cai.jpg differ
diff --git a/static/images/speakers/tanzeem_choudhury.png b/static/images/speakers/tanzeem_choudhury.png
new file mode 100644
index 000000000..95291d0df
Binary files /dev/null and b/static/images/speakers/tanzeem_choudhury.png differ
diff --git a/static/images/speakers/tristan_naumann.jpeg b/static/images/speakers/tristan_naumann.jpeg
new file mode 100644
index 000000000..a2bf711b1
Binary files /dev/null and b/static/images/speakers/tristan_naumann.jpeg differ
diff --git a/static/images/speakers/walter_dempsey.jpg b/static/images/speakers/walter_dempsey.jpg
new file mode 100644
index 000000000..d1744be5a
Binary files /dev/null and b/static/images/speakers/walter_dempsey.jpg differ
diff --git a/static/images/speakers/yindalon_aphinyanaphongs.jpg b/static/images/speakers/yindalon_aphinyanaphongs.jpg
new file mode 100644
index 000000000..8b47a7b23
Binary files /dev/null and b/static/images/speakers/yindalon_aphinyanaphongs.jpg differ
diff --git a/static/images/speakers/yong_chen.png b/static/images/speakers/yong_chen.png
new file mode 100644
index 000000000..659cddfdf
Binary files /dev/null and b/static/images/speakers/yong_chen.png differ
diff --git a/static/images/speakers/yoshua_bengio.png b/static/images/speakers/yoshua_bengio.png
new file mode 100644
index 000000000..dba857b19
Binary files /dev/null and b/static/images/speakers/yoshua_bengio.png differ
diff --git a/static/images/speakers/yubin_park.png b/static/images/speakers/yubin_park.png
new file mode 100644
index 000000000..f3554e86c
Binary files /dev/null and b/static/images/speakers/yubin_park.png differ
diff --git a/static/images/speakers/zak_kohane-square.png b/static/images/speakers/zak_kohane-square.png
new file mode 100644
index 000000000..b18cdc57b
Binary files /dev/null and b/static/images/speakers/zak_kohane-square.png differ
diff --git a/static/images/speakers/zak_kohane.png b/static/images/speakers/zak_kohane.png
new file mode 100644
index 000000000..511cc9443
Binary files /dev/null and b/static/images/speakers/zak_kohane.png differ
diff --git a/static/images/speakers/ziad_obermeyer.jpg b/static/images/speakers/ziad_obermeyer.jpg
new file mode 100644
index 000000000..6d8115200
Binary files /dev/null and b/static/images/speakers/ziad_obermeyer.jpg differ
diff --git a/static/images/sponsors/AITRICS.png b/static/images/sponsors/AITRICS.png
new file mode 100644
index 000000000..c04c69d07
Binary files /dev/null and b/static/images/sponsors/AITRICS.png differ
diff --git a/static/images/sponsors/CPH_UCSF_logo.png b/static/images/sponsors/CPH_UCSF_logo.png
new file mode 100644
index 000000000..594594ded
Binary files /dev/null and b/static/images/sponsors/CPH_UCSF_logo.png differ
diff --git a/static/images/sponsors/CTSI-logo.png b/static/images/sponsors/CTSI-logo.png
new file mode 100644
index 000000000..d4c9fa206
Binary files /dev/null and b/static/images/sponsors/CTSI-logo.png differ
diff --git a/static/images/sponsors/Columbia-logo.png b/static/images/sponsors/Columbia-logo.png
new file mode 100644
index 000000000..e09e13acb
Binary files /dev/null and b/static/images/sponsors/Columbia-logo.png differ
diff --git a/static/images/sponsors/Genentech.jpg b/static/images/sponsors/Genentech.jpg
new file mode 100644
index 000000000..075a320e9
Binary files /dev/null and b/static/images/sponsors/Genentech.jpg differ
diff --git a/static/images/sponsors/Health-Data-Science-logo.jpg b/static/images/sponsors/Health-Data-Science-logo.jpg
new file mode 100644
index 000000000..141de69b0
Binary files /dev/null and b/static/images/sponsors/Health-Data-Science-logo.jpg differ
diff --git a/static/images/sponsors/Sage-Bionetworks.png b/static/images/sponsors/Sage-Bionetworks.png
new file mode 100644
index 000000000..2f7dd36aa
Binary files /dev/null and b/static/images/sponsors/Sage-Bionetworks.png differ
diff --git a/static/images/sponsors/UMinn_CHS_logo.png b/static/images/sponsors/UMinn_CHS_logo.png
new file mode 100644
index 000000000..85fab5924
Binary files /dev/null and b/static/images/sponsors/UMinn_CHS_logo.png differ
diff --git a/static/images/sponsors/UPenn-Chase-Center-logo.png b/static/images/sponsors/UPenn-Chase-Center-logo.png
new file mode 100644
index 000000000..d226421d0
Binary files /dev/null and b/static/images/sponsors/UPenn-Chase-Center-logo.png differ
diff --git a/static/images/sponsors/apollomed.png b/static/images/sponsors/apollomed.png
new file mode 100644
index 000000000..9fc65ec92
Binary files /dev/null and b/static/images/sponsors/apollomed.png differ
diff --git a/static/images/sponsors/apple-logo.jpg b/static/images/sponsors/apple-logo.jpg
new file mode 100644
index 000000000..3aa1bcde3
Binary files /dev/null and b/static/images/sponsors/apple-logo.jpg differ
diff --git a/static/images/sponsors/creative-destruction-lab.png b/static/images/sponsors/creative-destruction-lab.png
new file mode 100644
index 000000000..a2dcfaac0
Binary files /dev/null and b/static/images/sponsors/creative-destruction-lab.png differ
diff --git a/static/images/sponsors/dandelion.png b/static/images/sponsors/dandelion.png
new file mode 100644
index 000000000..05c3e9f80
Binary files /dev/null and b/static/images/sponsors/dandelion.png differ
diff --git a/static/images/sponsors/genentech-logo.png b/static/images/sponsors/genentech-logo.png
new file mode 100644
index 000000000..239a620f3
Binary files /dev/null and b/static/images/sponsors/genentech-logo.png differ
diff --git a/static/images/sponsors/google-logo.png b/static/images/sponsors/google-logo.png
new file mode 100644
index 000000000..7eba33576
Binary files /dev/null and b/static/images/sponsors/google-logo.png differ
diff --git a/static/images/sponsors/google.png b/static/images/sponsors/google.png
new file mode 100644
index 000000000..a008c8d1e
Binary files /dev/null and b/static/images/sponsors/google.png differ
diff --git a/static/images/sponsors/health-at-scale.png b/static/images/sponsors/health-at-scale.png
new file mode 100644
index 000000000..8ee69ae7e
Binary files /dev/null and b/static/images/sponsors/health-at-scale.png differ
diff --git a/static/images/sponsors/layer6-ai.png b/static/images/sponsors/layer6-ai.png
new file mode 100644
index 000000000..888ff5617
Binary files /dev/null and b/static/images/sponsors/layer6-ai.png differ
diff --git a/static/images/sponsors/microsoft.png b/static/images/sponsors/microsoft.png
new file mode 100644
index 000000000..564416bfe
Binary files /dev/null and b/static/images/sponsors/microsoft.png differ
diff --git a/static/images/sponsors/moore-logo.jpg b/static/images/sponsors/moore-logo.jpg
new file mode 100644
index 000000000..72059babd
Binary files /dev/null and b/static/images/sponsors/moore-logo.jpg differ
diff --git a/static/images/sponsors/mount-sinai.png b/static/images/sponsors/mount-sinai.png
new file mode 100644
index 000000000..8a5ddd6b6
Binary files /dev/null and b/static/images/sponsors/mount-sinai.png differ
diff --git a/static/images/sponsors/uflorida-health.gif b/static/images/sponsors/uflorida-health.gif
new file mode 100644
index 000000000..3eb384fdb
Binary files /dev/null and b/static/images/sponsors/uflorida-health.gif differ
diff --git a/static/images/sponsors/vector-institute.png b/static/images/sponsors/vector-institute.png
new file mode 100644
index 000000000..025a9e154
Binary files /dev/null and b/static/images/sponsors/vector-institute.png differ
diff --git a/static/images/venue.png b/static/images/venue.png
new file mode 100644
index 000000000..ab73e5c8c
Binary files /dev/null and b/static/images/venue.png differ
diff --git a/static/js/data/api.js b/static/js/data/api.js
new file mode 100644
index 000000000..f8a67c29c
--- /dev/null
+++ b/static/js/data/api.js
@@ -0,0 +1,99 @@
+/* eslint-disable no-underscore-dangle */
+class API {
+ /**
+ * fetch and cache the config object
+ * @return {Promise<Object>}
+ */
+ static getConfig() {
+ if (API.configCache == null) {
+ API.configCache = $.get("serve_config.json");
+ }
+ return API.configCache;
+ }
+
+ static getCalendar() {
+ return $.get("serve_main_calendar.json");
+ }
+
+ static getPapers() {
+ if (API.paperCache == null) {
+ API.paperCache = $.get("papers.json");
+ }
+ return API.paperCache;
+ }
+
+ static getPapersAndProjection() {
+ return Promise.all([
+ API.getPapers(),
+ $.get("serve_papers_projection.json"),
+ ]);
+ }
+
+ /**
+ * lazily create and cache a persistent store; not needed with your own store backend
+ * @see API.storeIDs
+ * @return {Persistor}
+ */
+ static getStore(storeID) {
+ if (!(storeID in API._storeCaches)) {
+ API._storeCaches[storeID] = new Persistor(
+ `miniconf-${API.getConfig().name}-${storeID}`
+ );
+ }
+ return API._storeCaches[storeID];
+ }
+
+ /**
+ * get marks for all papers of a specific type
+ * @see API.storeIDs
+ * @param storeID
+ * @return {Promise