Tuesday, September 3 |
07:30 - 09:00 |
Breakfast (Restaurant Hotel Hacienda Los Laureles) |
09:00 - 09:45 |
Trevor Campbell: autoMALA: Locally adaptive Metropolis-adjusted Langevin algorithm ↓ The Metropolis-adjusted Langevin Algorithm (MALA) is a widely used Markov chain Monte Carlo (MCMC) algorithm for Bayesian posterior inference. Like many MCMC algorithms, MALA has a “step size” parameter that must be tuned in order to obtain satisfactory performance. However, finding an adequate step size for an arbitrary target distribution can be a difficult task, and there may not even be a single step size that works well throughout the whole distribution. To resolve this issue we introduce autoMALA, a new Markov chain Monte Carlo algorithm based on MALA that automatically sets its step size at each iteration based on the local geometry of the target distribution. We prove that autoMALA has the correct invariant distribution, despite continual automatic adjustments of the step size. Our experiments demonstrate that autoMALA is competitive with related state-of-the-art MCMC methods, in terms of the number of log density evaluations per effective sample, and it outperforms state-of-the-art samplers on targets with varying geometries. Furthermore, we find that autoMALA tends to find step sizes comparable to optimally-tuned MALA when a fixed step size suffices for the whole domain. (Conference Room San Felipe) |
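The abstract above centers on MALA's step-size parameter. As a point of reference, here is a minimal fixed-step MALA sketch in Python; this is not the autoMALA algorithm itself (which adapts the step size locally at each iteration), and the target, step size, and function names are illustrative:

```python
import numpy as np

def mala_step(x, log_density, grad_log_density, step_size, rng):
    """One Metropolis-adjusted Langevin step with a fixed step size.

    Proposal: x' = x + (eps^2 / 2) * grad log pi(x) + eps * N(0, I),
    followed by a Metropolis-Hastings accept/reject correction.
    """
    eps = step_size
    mean_fwd = x + 0.5 * eps**2 * grad_log_density(x)
    x_prop = mean_fwd + eps * rng.standard_normal(x.shape)
    # Log densities of the (asymmetric) Gaussian proposals; shared
    # normalizing constants cancel in the acceptance ratio.
    mean_bwd = x_prop + 0.5 * eps**2 * grad_log_density(x_prop)
    log_q_fwd = -np.sum((x_prop - mean_fwd) ** 2) / (2 * eps**2)
    log_q_bwd = -np.sum((x - mean_bwd) ** 2) / (2 * eps**2)
    log_alpha = (log_density(x_prop) - log_density(x)
                 + log_q_bwd - log_q_fwd)
    if np.log(rng.uniform()) < log_alpha:
        return x_prop, True
    return x, False

# Usage: sample a standard 2-D Gaussian target with a hand-tuned step size.
rng = np.random.default_rng(0)
log_pi = lambda x: -0.5 * np.sum(x ** 2)
grad_log_pi = lambda x: -x
x = np.zeros(2)
samples = []
for _ in range(5000):
    x, _ = mala_step(x, log_pi, grad_log_pi, step_size=1.0, rng=rng)
    samples.append(x)
samples = np.array(samples)
```

The hand-picked `step_size=1.0` works here only because this target's geometry is homogeneous; autoMALA's contribution is removing this tuning burden, including on targets where no single step size suffices.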
09:45 - 10:15 |
Panayiota Touloupou: Scalable inference for epidemic models with individual level data ↓ As individual level epidemiological and pathogen genetic data become available in ever increasing quantities, the task of analyzing such data becomes more and more challenging. Inferences for this type of data are complicated by the fact that the data is usually incomplete, in the sense that the times of acquiring and clearing infection are not directly observed, making the evaluation of the model likelihood intractable. A solution to this problem can be given in the Bayesian framework with unobserved data being imputed within Markov chain Monte Carlo (MCMC) algorithms at the cost of considerable extra computational effort.
Motivated by this demand, we develop a novel method for updating individual-level infection states within MCMC algorithms that respects the dependence structure inherent in epidemic data. We apply our new methodology to an epidemic of Escherichia coli O157:H7 in feedlot cattle in which eight competing strains were identified using genetic typing methods. We show that surprisingly little genetic data is needed to produce a probabilistic reconstruction of the epidemic trajectories, despite some possibility of misclassification in the genetic typing. We believe that this complex model, which captures the interactions between strains, could not have been fitted using existing methodologies. (Online - CMO) |
10:15 - 10:45 |
Bernardo Flores López: Predictive coresets ↓ Coresets are a family of methods, rooted in information geometry, that reduce the size of a dataset while retaining similar learning performance for a given learning algorithm. Traditionally this has been done by finding a set of sparse weights that minimize the KL divergence between the likelihood based on the original dataset and the likelihood based on the weighted data. This approach has the disadvantage of being ill-defined for nonparametric models, where the likelihood is often intractable. We propose an alternative construction based on matching the unknown predictive distributions over the unseen data under a generalized posterior, which gives a robust estimator amenable to nonparametric priors. The performance of our method is evaluated on sRNA-Seq data, a good example of high-dimensional data where classical estimation algorithms fail to scale. (Conference Room San Felipe) |
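For readers unfamiliar with the classical construction the abstract contrasts against, the following toy Python sketch illustrates the weighted-likelihood idea behind traditional coresets: a small weighted subset whose weighted log-likelihood tracks the full-data log-likelihood. Here the weights are naive uniform ones and the model is a unit-variance Gaussian; this is not the predictive construction proposed in the talk:

```python
import numpy as np

def weighted_loglik(theta, x, w):
    """Weighted Gaussian log-likelihood: sum_i w_i * log N(x_i; theta, 1)."""
    return np.sum(w * (-0.5 * (x - theta) ** 2 - 0.5 * np.log(2 * np.pi)))

rng = np.random.default_rng(1)
x = rng.normal(loc=2.0, scale=1.0, size=10_000)

# Naive "coreset": m points with uniform weights n/m, so the total weight
# matches the full dataset. Coreset methods instead optimize sparse weights,
# e.g. to minimize a KL divergence between the two likelihoods.
m = 100
idx = rng.choice(len(x), size=m, replace=False)
w = np.full(m, len(x) / m)

theta = 2.0
full = weighted_loglik(theta, x, np.ones(len(x)))
core = weighted_loglik(theta, x[idx], w)
```

Even uniform weights track the full-data log-likelihood to within a few percent at a single parameter value; optimized weights aim to close that gap uniformly over the parameter, which is what the KL-based construction formalizes and what breaks down when the likelihood itself is intractable.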
10:45 - 11:15 |
Round Table & Coffee Break (Conference Room San Felipe) |
11:15 - 11:45 |
Giovanni Rebaudo: Understanding partially exchangeable nonparametric priors for discrete structure ↓ The Bayesian approach to inference is based on a coherent probabilistic framework that naturally leads to principled uncertainty quantification and prediction. Via posterior distributions, Bayesian nonparametric models make inference on parameters belonging to infinite-dimensional spaces, such as the space of probability distributions. The development of Bayesian nonparametrics has been triggered by the Dirichlet process, a nonparametric prior that allows one to learn the law of the observations through closed-form expressions. Still, its learning mechanism is often too simplistic and many generalizations have been proposed to increase its flexibility, a popular one being the class of normalized completely random measures. Here we investigate a simple yet fundamental matter: will a different prior actually guarantee a different learning outcome? To this end, we develop a new distance between completely random measures based on optimal transport, which provides an original framework for quantifying the similarity between posterior distributions (merging of opinions). Our findings provide neat and interpretable insights on the impact of popular Bayesian nonparametric priors, avoiding the usual restrictive assumptions on the data-generating process. This is joint work with Hugo Lavenant. (Online - CMO) |
11:45 - 12:15 |
Sameer Deshpande: Scalable smoothing in high-dimensions with BART ↓ Bayesian Additive Regression Trees (BART) is an easy-to-use and highly effective nonparametric regression model that approximates unknown functions with a sum of binary regression trees (i.e., piecewise-constant step functions). Consequently, BART is fundamentally limited in its ability to estimate smooth functions. Initial attempts to overcome this limitation replaced the constant output in each leaf of a tree with a realization of a Gaussian Process (GP). While these elaborations are conceptually elegant, most implementations thereof are computationally prohibitive, displaying a nearly cubic per-iteration complexity.
We propose a version of BART built with trees that output linear combinations of ridge functions; that is, our trees return linear combinations of compositions between affine transforms of the inputs and a (potentially non-linear) activation function. We develop a new MCMC sampler that updates trees in linear time. Our proposed model includes a random Fourier feature-inspired approximation to treed GPs as a special case. More generally, our proposed model can be viewed as an ensemble of local neural networks, which combines the representational flexibility of neural networks with the uncertainty quantification and computational tractability of BART. (Conference Room San Felipe) |
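The ridge-function leaves described above can be sketched as follows. This illustrative Python snippet (the names and dimensions are ours, not the authors') shows the output of a single leaf: a linear combination of an activation function applied to affine transforms of the input. With a cosine activation and random frequencies, this form resembles the random Fourier feature approximation mentioned in the abstract:

```python
import numpy as np

def ridge_leaf(x, W, b, beta, activation=np.tanh):
    """Leaf output: sum_k beta_k * activation(W_k . x + b_k),
    i.e., a linear combination of ridge functions of the input x."""
    return beta @ activation(W @ x + b)

# Illustrative dimensions: 3-D input, 4 ridge functions per leaf.
rng = np.random.default_rng(0)
d, K = 3, 4
W = rng.normal(size=(K, d))   # affine transforms of the inputs
b = rng.normal(size=K)
beta = rng.normal(size=K)     # leaf-level linear combination

x = rng.normal(size=d)
out = ridge_leaf(x, W, b, beta)              # neural-network-style leaf
out_rff = ridge_leaf(x, W, b, beta, np.cos)  # random-Fourier-feature flavor
```

Viewed this way, each tree partitions the input space and fits a small one-layer network per cell, which is the "ensemble of local neural networks" reading offered in the abstract.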
12:15 - 12:45 |
Marta Catalano: Merging rate of opinions via optimal transport on random measures ↓ Species sampling models provide a general framework for random discrete distributions that are tailored for exchangeable data. However, they fall short when used for modeling heterogeneous data collected from related sources or distinct experimental conditions. To address this, partial exchangeability serves as the ideal probabilistic framework. While numerous models exist for partially exchangeable observations, a unifying framework, like species sampling models, is currently missing for this setting. Thus, we introduce multivariate species sampling models, a general class of models characterized by their partially exchangeable partition probability function. They encompass existing nonparametric models for partially exchangeable data, highlighting their core distributional properties. Our results allow the study of the induced dependence structure and facilitate the development of new models. This is a joint work with Beatrice Franzolini, Antonio Lijoi, and Igor Pruenster. (Online - CMO) |
12:50 - 14:15 |
Lunch (Restaurant Hotel Hacienda Los Laureles) |
14:15 - 15:00 |
François-Xavier Briol: Robust and Conjugate Gaussian Process Regression ↓ To enable closed-form conditioning, a common assumption in Gaussian process (GP) regression is independent and identically distributed Gaussian observation noise. This strong and simplistic assumption is often violated in practice, which leads to unreliable inferences and uncertainty quantification. Unfortunately, existing methods for robustifying GPs break closed-form conditioning, which makes them less attractive to practitioners and significantly more computationally expensive. In this paper, we demonstrate how to perform provably robust and conjugate Gaussian process (RCGP) regression at virtually no additional cost using generalised Bayesian inference. RCGP is particularly versatile as it enables exact closed-form conjugate updates in all settings where standard GPs admit them. To demonstrate its strong empirical performance, we deploy RCGP for problems ranging from Bayesian optimisation to sparse variational Gaussian processes. (Online - CMO) |
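For context, the closed-form conditioning that RCGP preserves is the standard conjugate GP update under i.i.d. Gaussian noise, sketched below in Python. This is a textbook Cholesky-based implementation, not the RCGP weighting itself; the kernel and data are illustrative:

```python
import numpy as np

def gp_posterior(X_train, y_train, X_test, kernel, noise_var):
    """Closed-form GP posterior mean and covariance at test inputs,
    assuming i.i.d. Gaussian observation noise with variance noise_var."""
    K = kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_s = kernel(X_train, X_test)
    K_ss = kernel(X_test, X_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v
    return mean, cov

def rbf(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

# Usage: condition on five noisy-sine observations, predict at x = 0.5.
X = np.linspace(0.0, 1.0, 5)[:, None]
y = np.sin(2 * np.pi * X[:, 0])
X_test = np.array([[0.5]])
mean, cov = gp_posterior(X, y, X_test, rbf, noise_var=1e-4)
```

The appeal of this update is its cost: one Cholesky factorization plus a few triangular solves. Robustified GPs typically sacrifice exactly this; RCGP's contribution is downweighting outlying observations while keeping the same computational profile.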
15:00 - 15:30 |
Eli Weinstein: Nonparametrically-perturbed parametric Bayesian models: robustness, efficiency and approximations ↓ Parametric Bayesian modeling offers a powerful and flexible toolbox for scientific data analysis. Yet it often faces a basic challenge: the model, however detailed, may still be wrong, and this can make inferences untrustworthy. In this project we study nonparametrically perturbed parametric (NPP) Bayesian models, in which a parametric Bayesian model is relaxed via a nonparametric distortion of its likelihood. In particular, we analyze the properties of NPP models when the target of inference is the true data distribution or some functional of it, as is often the case in causal inference. We show that NPP models offer the robustness of nonparametric models while retaining the data efficiency of parametric models, achieving fast convergence when the parametric model is close to the truth. We then develop a practical generalized Bayes procedure which inherits the key properties of NPP models, at much less computational cost. Overall, we argue that NPP models offer a robust, efficient and black-box approach to Bayesian inference in general and causal Bayesian inference in particular. (Conference Room San Felipe) |
15:30 - 16:00 |
Lorenzo Capello: Scalable Bayesian inference for Coalescent Models ↓ The observed sequence variation at a locus informs about the evolutionary history of the sample and past population size dynamics. The Kingman coalescent (and its extensions) is commonly used in a generative model of molecular sequence variation to infer evolutionary parameters. However, it is well understood that inference under this model does not scale well with sample size. In the talk, we will discuss a few attempts to tackle this issue. The first attempt is a lower-resolution coalescent model: here, we aim at scalable inference via a model with a drastically smaller state space. A second line of research pursues a different inference algorithm: here, we leverage advances in approximate Bayesian inference from the last decade and customize them to this specific setting. (Conference Room San Felipe) |
16:00 - 16:30 |
Round Table & Coffee Break (Conference Room San Felipe) |
16:30 - 17:00 |
Georgia Papadogeorgou: Spatial causal inference in the presence of unmeasured confounding and interference ↓ We discuss concepts from causal inference and spatial statistics, presenting novel insights for causal inference in spatial data analysis, and establishing how tools from spatial statistics can be used to draw causal inferences. We introduce spatial causal graphs to highlight that spatial confounding and interference can be entangled, in that investigating the presence of one can lead to wrongful conclusions in the presence of the other. Moreover, we show that spatial dependence in the exposure variable can render standard analyses invalid, which can lead to erroneous conclusions. To remedy these issues, we propose a Bayesian parametric approach based on tools commonly used in spatial statistics. This approach simultaneously accounts for interference and mitigates bias resulting from local and neighborhood unmeasured spatial confounding. From a Bayesian perspective, we show that incorporating an exposure model is necessary, and we theoretically prove that all model parameters are identifiable, even in the presence of unmeasured confounding. (Online - CMO) |
17:00 - 17:15 |
Falco Joannes Bargagli Stoffi: Confounder-Dependent Bayesian Mixture Model: Characterizing Heterogeneity of Causal Effects ↓ Several epidemiological studies have provided evidence that long-term exposure to fine particulate matter (PM2.5) increases the mortality rate. Furthermore, some population characteristics (e.g., age, race, and socioeconomic status) might play a crucial role in understanding vulnerability to air pollution. To inform policy, it is necessary to identify groups of the population that are more or less vulnerable to air pollution. In the causal inference literature, the Group Average Treatment Effect (GATE) is a distinctive facet of the conditional average treatment effect. This widely employed metric serves to characterize the heterogeneity of a treatment effect based on some population characteristics. In this paper, we introduce a novel Confounder-Dependent Bayesian Mixture Model (CDBMM) to characterize causal effect heterogeneity. More specifically, our method leverages the flexibility of the dependent Dirichlet process to model the distribution of the potential outcomes conditionally on the covariates and the treatment levels, thus enabling us to: (i) identify heterogeneous and mutually exclusive population groups defined by similar GATEs in a data-driven way, and (ii) estimate and characterize the causal effects within each of the identified groups. Through simulations, we demonstrate the effectiveness of our method in uncovering key insights about treatment effect heterogeneity. We apply our method to claims data from Medicare enrollees in Texas, and find six mutually exclusive groups where the causal effects of PM2.5 on the mortality rate are heterogeneous. (Online - CMO) |
17:15 - 17:45 |
Dafne Zorzetto: Bayesian Nonparametrics for Principal Stratification with Continuous Post-Treatment Variables ↓ Principal stratification provides a causal inference framework that allows adjustment for confounded post-treatment variables when comparing treatments. While the literature has mainly focused on binary post-treatment variables, principal stratification with continuous post-treatment variables is gaining increasing attention, with several emerging challenges to be carefully considered. Characterizing the latent principal strata presents a significant challenge that directly impacts the selection of models and the estimation of the principal causal effect. This challenge is further complicated in observational studies where the treatment is not randomly assigned to the units. We develop a novel approach for principal stratification with continuous post-treatment variables leveraging a data-driven method. Our approach exploits Bayesian nonparametric priors for detecting the principal strata, defines novel principal causal effects, and provides a full quantification of the principal strata membership uncertainty. More specifically, we introduce the Confounders-Aware Shared-atoms BAyesian mixture (CASBAH), where the dependent Dirichlet process with shared atoms across treatment levels allows us to adjust for the confounding bias and share information between treatment levels while estimating the principal strata membership. Through Monte Carlo simulations, we show that the proposed methodology has excellent performance in characterizing the latent principal strata and estimating the effects of treatment on post-treatment variables and outcomes. Our proposed method is applied to a case study where we estimate the causal effects of U.S. national air quality regulations on pollution levels and health outcomes. (Conference Room San Felipe) |
19:00 - 21:00 |
Dinner (Restaurant Hotel Hacienda Los Laureles) |