Fifth IMS-ISBA joint meeting: MCMSki IV
Chamonix Mont-Blanc, 6-8 January 2014
Program:
Plenary speakers
Invited talks
Contributed talks
*See Christian Robert's dedicated WordPress blog for poster session abstracts.
Plenary I: Chris Holmes, University of Oxford, UK
 Title: Computational challenges arising in modern biomedical research
 Abstract: Biomedical datasets continue to increase in size and diversity. Biostatisticians now routinely work with data collected on 1,000s of individuals measuring 1,000,000s of genetic or molecular covariates linked to environmental factors and multiple longitudinal outcomes (response variables). Such data present considerable challenges to computational statistical modeling. We will discuss recent work on methods that address some of these hurdles using approximate models and modern computing including the use of graphics cards (GPUs) for Monte Carlo simulation. One key question is the sensitivity of scientific conclusions and decisions to model and computational approximations. We will discuss how MC samples can be used within a formal framework to aid in this respect.
Plenary II: Michele Parrinello, University of Lugano, Switzerland and ETH Zurich, Switzerland
 Title: Advanced Sampling Methods in Physics and Chemistry
 Abstract: We introduce the well-tempered ensemble (WTE), which is the biased ensemble sampled by well-tempered metadynamics when the energy is used as the collective variable. WTE can be designed so as to have approximately the same average energy as the canonical ensemble but much larger fluctuations. These two properties lead to an extremely fast exploration of phase space. An even greater efficiency is obtained when WTE is combined with parallel tempering. It is shown that this new ensemble has very useful properties and can accelerate sampling considerably.
Plenary III: Andrew Gelman, Columbia University, USA
 Title: Can we use Bayesian methods to resolve the current crisis of statistically significant research findings that don't hold up? [slides]
 Abstract: In recent years, psychology and medicine have been rocked by scandals of research fraud. At the same time, there is a growing awareness of serious flaws in the general practices of statistics for scientific research, to the extent that top journals routinely publish claims that are implausible and cannot be replicated. All this is occurring despite (or perhaps because of?) statistical tools such as Type 1 error control that are supposed to restrict the rate of unreliable claims. We consider ways in which prior information and Bayesian methods might help resolve these problems.
Invited 1 - Approximate Bayesian Computation
 Organizer: Christian Robert
 Speakers:
 Richard Everitt, University of Reading, UK
 Title: Evidence estimation for Markov random fields: a triply intractable problem [slides]
 Abstract: Markov random field models are used widely in computer science, statistical physics, spatial statistics and network analysis. However, Bayesian analysis of these models using standard Monte Carlo methods is not possible due to an intractable likelihood function. Several methods have been developed that permit exact, or close to exact, simulation from the posterior distribution. However, estimating the marginal likelihood and Bayes' factors for these models remains challenging in general. This talk will describe new methods for estimating Bayes' factors that use simulation to circumvent the evaluation of the intractable likelihood, and compare them to standard ABC methods.
 Oliver Ratmann, Imperial College London, UK and Duke University, USA
 Title: Statistical modelling of summary values leads to accurate Approximate Bayesian Computations [slides]
 Abstract: Approximate Bayesian Computations (ABC) are considered to be noisy. We present a statistical framework for accurate ABC parameter inference that rests on well-established results from indirect inference and decision theory. This framework guarantees that ABC estimates the mode of the true posterior density exactly and that the Kullback-Leibler divergence of the ABC approximation to the true posterior density is minimal, provided that verifiable conditions are met. Our approach requires appropriate statistical modelling of the distribution of "summary values" - data points on a summary level - from which the choice of summary statistics follows implicitly. This places elementary statistical modelling at the heart of ABC analyses, which we illustrate on several examples.
 Jean-Michel Marin, Université Montpellier II, France
 Title: Approximate Bayesian Computation inferences about population history using large molecular datasets
 Abstract: One prospect of current biology is that molecular data will help us to reveal the complex demographic processes that have acted on natural populations. The extensive availability of various molecular markers and increased computer power have promoted the development of inferential methods. Among these methods, the Approximate Bayesian Computation (ABC) method is increasingly used to make inferences from large datasets for complex models in various research fields, including population and evolutionary biology. In this talk, we will explain why ABC methods have to be adapted when analyzing large molecular datasets and will present some progress concerning Single Nucleotide Polymorphism (SNP) data.
Invited 2 - Scaling and optimisation of MCMC algorithms
 Organizer: Gareth Roberts
 Speakers:
 Tony Lelievre, Ecole des Ponts ParisTech, France
 Title: Optimal scaling of the transient phase of Metropolis-Hastings algorithms [slides]
 Abstract: We consider the Random Walk Metropolis algorithm on R^n with Gaussian proposals, when the target probability measure is the n-fold product of a one-dimensional law. It is well-known that, in the limit as n goes to infinity, starting at equilibrium and for an appropriate scaling of the variance and of the timescale as a function of the dimension n, a diffusive limit is obtained for each component of the Markov chain. We generalize this result to the case where the initial distribution is not the target probability measure. The diffusive limit obtained is the solution to a stochastic differential equation that is nonlinear in the sense of McKean. We prove convergence to equilibrium for this equation using entropy techniques. We discuss practical counterparts in order to optimize the variance of the proposal distribution and so accelerate convergence to equilibrium. Our analysis confirms the interest of the constant acceptance rate strategy (with acceptance rate between 1/4 and 1/3).
This is a joint work with B. Jourdain and B. Miasojedow.
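The constant-acceptance-rate strategy can be illustrated with a small sketch. This is our own toy code (not the speakers' implementation): a one-dimensional random walk Metropolis chain on a standard normal target, showing how the empirical acceptance rate responds to the proposal scale, the quantity one would tune toward the 1/4-1/3 band mentioned above.

```python
import math, random

def rwm_acceptance(log_target, x0, step, n_iters, rng=random.Random(0)):
    """1-D random walk Metropolis; returns the empirical acceptance rate."""
    x, lp = x0, log_target(x0)
    accepts = 0
    for _ in range(n_iters):
        y = x + step * rng.gauss(0.0, 1.0)          # Gaussian proposal
        lq = log_target(y)
        if math.log(rng.random()) < lq - lp:        # Metropolis accept/reject
            x, lp, accepts = y, lq, accepts + 1
    return accepts / n_iters

# Standard normal target: small steps are almost always accepted, large steps
# rarely; tuning aims at an intermediate acceptance band such as 1/4-1/3.
log_std_normal = lambda x: -0.5 * x * x
acc_small = rwm_acceptance(log_std_normal, 0.0, 0.1, 20000)
acc_large = rwm_acceptance(log_std_normal, 0.0, 10.0, 20000)
```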
 Jochen Voss, University of Leeds, UK
 Title: The Rate of Convergence for Approximate Bayesian Computation
 Abstract: Approximate Bayesian Computation (ABC) is a popular computational method for likelihood-free Bayesian inference. The term "likelihood-free" refers to problems where the likelihood is intractable to compute or estimate directly, but where it is possible to generate simulated data X relatively easily given a candidate set of parameters θ simulated from a prior distribution. Parameters which generate simulated data within some tolerance δ of the observed data x* are regarded as plausible, and a collection of such θ is used to estimate the posterior distribution θ | X = x*. Suitable choice of δ is vital for ABC methods to return good approximations to the posterior in reasonable computational time.
While ABC methods are widely used in practice, particularly in population genetics, study of the mathematical properties of ABC estimators is still in its infancy. We prove that ABC estimates converge to the exact solution under very weak assumptions and, under slightly stronger assumptions, quantify the rate of this convergence. Our results can be used to guide the choice of the tolerance parameter δ.
(Joint work with Stuart Barber and Mark Webster)
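The δ-tolerance mechanism described in the abstract can be sketched as basic ABC rejection; the toy inference problem below (a normal mean inferred from a sample mean, uniform prior) is our illustration, not the authors' code:

```python
import random, statistics

def abc_rejection(observed, prior_sample, simulate, distance, delta, n_props,
                  rng=random.Random(1)):
    """Basic ABC rejection: keep parameters whose simulated data fall within
    a tolerance delta of the observation."""
    kept = []
    for _ in range(n_props):
        theta = prior_sample(rng)
        if distance(simulate(theta, rng), observed) <= delta:
            kept.append(theta)
    return kept

# Toy problem: infer a normal mean theta from an observed sample mean of 1.0,
# with a Uniform(-5, 5) prior; the summary statistic is the sample mean itself.
obs = 1.0
prior = lambda rng: rng.uniform(-5.0, 5.0)
sim = lambda theta, rng: statistics.fmean(rng.gauss(theta, 1.0) for _ in range(50))
accepted = abc_rejection(obs, prior, sim, lambda a, b: abs(a - b),
                         delta=0.3, n_props=5000)
post_mean = statistics.fmean(accepted)
```

Shrinking delta sharpens the approximation but lowers the acceptance rate, which is exactly the trade-off the convergence-rate results above quantify.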
 Alex Thiéry, University of Warwick, UK
 Title: Tuning of pseudo-marginal MCMC [slides]
 Abstract: Pseudo-marginal MCMC methods have opened new horizons in Bayesian computational statistics; they underlie several algorithms for inference in state-space models, doubly intractable distributions, and energy models. When employed to tackle complex target distributions, they are nevertheless still difficult to use properly. We analyse their asymptotic behaviour through scaling arguments and obtain general, widely applicable rules of thumb for tuning these pseudo-marginal MCMC algorithms.
 Chris Sherlock, Lancaster University, UK
 Title: Optimising pseudo-marginal random walk Metropolis algorithms.
 Abstract: We examine the pseudo-marginal random walk Metropolis algorithm,
where evaluations of the target density for the accept/reject
probability are estimated rather than computed precisely. We use a
result for the speed of a limiting diffusion and of the limiting
expected squared jump distance to examine the overall efficiency of
the algorithm, in terms of both speed of mixing and computational
time. Assuming the additive noise is Gaussian and is inversely
proportional to the number of unbiased estimates that are used, we
show that the algorithm is optimally efficient when the variance of
the noise is approximately 3.283 and the acceptance rate is
approximately 7.001%. The theory is illustrated with a short
simulation study using the particle marginal random walk Metropolis.
We also consider alternative assumptions, and discover that the
optimal proposal scaling is insensitive to the form of the noise
distribution, and that in some cases the algorithm is optimal when
just a single unbiased estimate is used.
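A minimal sketch of the Gaussian-noise model used above, with our own arbitrary constants: the log-target is only available up to additive noise, and the noisy value at the current state is recycled between iterations, which is what makes the chain pseudo-marginal rather than ordinary MCMC with fresh noise:

```python
import math, random

def pm_rwm(log_target, noise_sd, x0, step, n_iters, rng=random.Random(2)):
    """Pseudo-marginal random walk Metropolis with additive Gaussian noise on
    the log-target; the noisy estimate at the current state is held fixed."""
    def noisy(x):
        # The -noise_sd^2/2 correction makes exp(estimate) unbiased for
        # exp(log_target), the standard Gaussian noise model.
        return log_target(x) - 0.5 * noise_sd ** 2 + noise_sd * rng.gauss(0.0, 1.0)
    x, lp = x0, noisy(x0)
    accepts = 0
    for _ in range(n_iters):
        y = x + step * rng.gauss(0.0, 1.0)
        lq = noisy(y)
        if math.log(rng.random()) < lq - lp:
            x, lp, accepts = y, lq, accepts + 1   # recycle the noisy estimate
    return accepts / n_iters

log_t = lambda x: -0.5 * x * x
acc_exact = pm_rwm(log_t, 0.0, 0.0, 2.4, 20000)
acc_noisy = pm_rwm(log_t, 1.8, 0.0, 2.4, 20000)  # noise sd ~ sqrt(3.283), the cited optimum
```

Noise lowers the acceptance rate because the chain sticks at states whose estimate happened to be luckily high, the effect the optimal-tuning results above balance against the cost of producing more accurate estimates.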
Invited 3 - MCMC for Bayesian nonparametrics
 Organizers: Antonietta Mira and Antonio Lijoi
 Speakers:
 Stefano Favaro, Università degli Studi di Torino, Italy
 Title: Marginalized samplers for normalized random measure mixture models
 Abstract: Random probability measures play a fundamental role in Bayesian nonparametrics as their distributions act as nonparametric priors. While the Dirichlet process is the most notable example of a nonparametric prior, in the last decades other proposals of random probability measures have appeared in the literature. Among these, the so-called normalized random measures (NRMs) certainly stand out for being extremely flexible in the context of mixture modeling. Previously, posterior simulation for NRMs has been either of marginal type, based on the system of induced predictive distributions, or of conditional type, based on slice sampling. These sampling methods can be inefficient, requiring either significant numerical integrations or suitable approximate truncation schemes. In this talk, we present novel marginalized samplers for NRMs. These samplers do not require numerical integrations or approximate truncation schemes and are simple to implement. One sampler is a direct generalization of Neal's well-regarded Algorithm 8, while another is based on a trans-dimensional approach and is significantly more efficient.
 Yee Whye Teh, University of Oxford, UK
 Title: SMC inference algorithms for Bayesian nonparametric trees and hierarchies
 Abstract: Bayesian nonparametrics allows us to learn complex models with elegant properties. The simplest and most extensively studied Bayesian nonparametric model is the Dirichlet Process and its underlying combinatorial stochastic process, the Chinese Restaurant Process (CRP), which is an exchangeable random partition, and has been popularly used for clustering via DP mixture models.
More recently, there has been increasing interest in Bayesian nonparametric models for more complex structures. In this talk we discuss Bayesian nonparametric models for trees and hierarchies, based on both fragmentation and coalescent processes. While mathematically elegant, one difficulty with these models is that MCMC algorithms for inference can be quite complicated and slow. We discuss algorithms based on sequential Monte Carlo instead, where the structure of the algorithm follows closely the generative structure of the fragmentation and coalescent processes, with each particle being constructed in a top-down manner in the case of fragmentation processes, and bottom-up in the case of coalescent processes.
This talk is based on work done with: Daniel Roy, Hal Daume III, Dilan Gorur, Charles Blundell and Balaji Lakshminarayanan.
 Ryan Adams, Harvard University, USA
 Title: Bayesian nonparametrics and the parallelization of Bayesian computations
 Abstract: One of the key challenges in scaling Bayesian computation is the development of algorithms that run across tens or hundreds of cores. Markov chain Monte Carlo is a particularly salient example of this challenge, as MCMC is intrinsically sequential. In this talk I will discuss some of my work on developing parallel algorithms for MCMC, and for Bayesian nonparametric models in particular. I will discuss ways to minimize communication overhead while still maintaining the technical conditions that ensure the Markov chain converges to the desired target.
Invited 4 - Bayesian Microsimulation
 Organizer: Brad Carlin
 Speakers:
 Laura Hatfield, Harvard Medical School, USA
 Title: Microsimulation of Medicare beneficiaries' supplemental health insurance and health care spending
 Abstract: Dynamic microsimulations model a population of units that may interact with one another, are organized into groups, have diverse traits that transition according to probabilistic rules, and undergo birth and death processes. We augment a well-established model of retirees in the United States, adding their Medicare and supplemental health insurance coverage and health care spending. Key economic features of our simulation include income effects of premiums and out-of-pocket cost-sharing, trends in benefit generosity and employer contributions, technology growth as a driver of spending, and consumer choice of supplemental coverage. In this talk, I will focus on two models for transitions among coverage types over time. The goal is to build a system in which selection forces lead to realistic assortment of individuals into insurance types according to their characteristics. It must be flexible enough to allow experiments with novel policies and changing trends over time, yet simple enough to be tuned to fit observed data, a process called calibration. I will highlight Bayesian contributions to calibration and compare the two models in terms of usefulness for policy experiments.
 Chris Jackson, MRC Biostatistics Unit, Cambridge, UK
 Title: Computational Issues in Health Economic Models and Measures of Decision Uncertainty
 Abstract: Health economic models are used to compare the costs and benefits of different treatments, and to guide decisions about which should be funded in a national healthcare system. Expected long-term quality-adjusted lifetimes and costs are estimated under stochastic models, typically discrete-time state-transition models. The parameters come from a synthesis of all available evidence, usually from several different sources. Parameter uncertainty is addressed in a Bayesian framework using Monte Carlo simulation or MCMC. I will discuss computational issues in these models, with an application to the choice of diagnostic tests for coronary artery disease.
Firstly, realistically complex model structures may be computationally expensive. For example, if the Markov assumption holds, we can compute expected outcomes for a patient cohort in closed form given the parameter values. But often it will not hold, as the probability of disease progression might depend on how long a person has had the disease. Then we would either have to simulate individual patient histories and calculate Monte Carlo expectations, or include many more states to represent the time-dependence.
A second issue is that treatment decisions from these models are often uncertain. Therefore we want to prioritise what further data should be gathered to reduce uncertainty. This can be quantified in decision-theoretic terms by the expected value of information (EVI). Estimates of EVI are subject to particularly large Monte Carlo error. In addition, estimating the expected value of learning particular parameters may involve a third nested simulation loop, on top of the loops needed to represent parameter uncertainty and patient heterogeneity.
I will review and demonstrate currently available methods for computing expected outcomes and measures of EVI more efficiently. For example, replacing the model with an emulator based on Gaussian process priors can greatly improve efficiency with negligible loss of accuracy.
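The nested-simulation structure mentioned above can be made concrete with a minimal Monte Carlo sketch of the expected value of perfect information (EVPI); the two-treatment example and its numbers are hypothetical, not from the talk:

```python
import random, statistics

def evpi(net_benefit, sample_theta, n_sims, n_decisions, rng=random.Random(3)):
    """Monte Carlo expected value of perfect information:
    E_theta[max_d NB(d, theta)] - max_d E_theta[NB(d, theta)]."""
    draws = [sample_theta(rng) for _ in range(n_sims)]
    nb = [[net_benefit(d, th) for d in range(n_decisions)] for th in draws]
    e_max = statistics.fmean(max(row) for row in nb)          # decide after learning theta
    max_e = max(statistics.fmean(row[d] for row in nb)        # decide before learning theta
                for d in range(n_decisions))
    return e_max - max_e

# Hypothetical two-treatment example: treatment 0 is the status quo with net
# benefit 0; treatment 1's net benefit is an uncertain effect theta ~ N(0.1, 1).
nb_fn = lambda d, theta: 0.0 if d == 0 else theta
value = evpi(nb_fn, lambda rng: rng.gauss(0.1, 1.0), 20000, 2)
```

Estimating the value of learning only a subset of parameters replaces the inner `max` with a conditional expectation, which is where the third nested simulation loop mentioned in the abstract arises.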
 Yolanda Hagar, University of Colorado, USA
 Title: A Bayesian microsimulation approach to health economic evaluation of treatment algorithms in schizophrenia.
 Abstract: The goal of this work is to obtain the posterior predictive distribution of transition probabilities between schizophrenia symptom severity states over time, for patients undergoing different treatment algorithms. This is done by:
(i) employing a Bayesian metaanalysis of published clinical trials and observational studies to estimate the posterior distribution of parameters which guide changes in Positive and Negative Syndrome Scale (PANSS) scores over time, and under various treatments;
(ii) propagating the variability from the posterior distributions of these parameters through a microsimulation model that is formulated based on schizophrenia progression.
The results show detailed differences among haloperidol, risperidone and olanzapine in controlling various severity levels of positive, negative and joint symptoms over time. Risperidone seems best at controlling severe positive symptoms while olanzapine seems worst in that regard during the first quarter of drug treatment; however, olanzapine seems best at controlling severe negative symptoms across all four quarters of treatment while haloperidol is the worst in this regard. These details may further serve to better estimate quality of life of patients and aid in resource utilization decisions in treating schizophrenic patients via realistic multi-drug algorithms. In addition, consistent estimation of uncertainty in the time-profile parameters has important implications for the practice of cost-effectiveness analyses, and for future health policy for schizophrenia treatment algorithms.
Invited 5 - Convergence of MCMC and adaptive MCMC I
 Organizers: Gersende Fort and Jeff Rosenthal
 Speakers:
 Galin Jones, University of Minnesota, USA
 Title: Markov Chain Monte Carlo with Linchpin Variables
 Abstract: Many high-dimensional posteriors can be factored into a product of a conditional density which is easy to sample directly and a low-dimensional marginal density. If it is possible to make a draw from the marginal, then a simple sequential sampling algorithm can be used to make a perfect draw from the joint target density. We show that in many common Bayesian models it is possible to make essentially perfect draws from the marginal and hence also from the joint posterior. This applies to versions of the Bayesian linear mixed model and Bayesian probit regression model, among others. When the marginal is difficult to sample from we propose to use a Metropolis-Hastings step on the marginal followed by a draw from the conditional distribution. We show that the resulting Markov chain is reversible and that its convergence rate is the same as that of the subchain where the Metropolis-Hastings step is being performed. We use this to construct several examples of uniformly ergodic Markov chains, offering a qualitative improvement over the corresponding Gibbs sampler, which has only a geometrically ergodic convergence rate.
This is joint work with Felipe Acosta.
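A minimal sketch of the linchpin recipe on a toy example of our own: for a bivariate Gaussian, draw the linchpin variable v from its exact marginal and then u from the exact conditional, yielding perfect joint draws with no Markov chain at all:

```python
import random, statistics

def linchpin_draws(n, rho=0.8, rng=random.Random(4)):
    """Exact joint draws from a standard bivariate normal with correlation rho:
    draw the linchpin variable v from its marginal, then u from u | v."""
    out = []
    for _ in range(n):
        v = rng.gauss(0.0, 1.0)                            # marginal draw
        u = rng.gauss(rho * v, (1.0 - rho * rho) ** 0.5)   # conditional draw
        out.append((u, v))
    return out

pairs = linchpin_draws(50000)
us = [u for u, _ in pairs]
vs = [v for _, v in pairs]
mu, mv = statistics.fmean(us), statistics.fmean(vs)
cov = statistics.fmean((u - mu) * (v - mv) for u, v in pairs)
emp_rho = cov / (statistics.pstdev(us) * statistics.pstdev(vs))
```

When the marginal of v cannot be sampled exactly, the abstract's proposal replaces the first line of the loop with a Metropolis-Hastings step on v alone.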
 Jim Hobert
 Title: The Polya-Gamma Gibbs Sampler for Bayesian Logistic Regression is Uniformly Ergodic
 Abstract: One of the most widely used data augmentation algorithms is Albert & Chib's (1993) algorithm for Bayesian probit regression. Polson, Scott & Windle (2013) recently introduced an analogous algorithm for Bayesian logistic regression. I will describe this new algorithm, which is based on missing data from the so-called Polya-Gamma distribution, and I will present a result showing that the underlying Markov chain is uniformly ergodic.
This is joint work with Hee Min Choi.
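The augmentation scheme can be sketched for an intercept-only logistic model. This is a rough illustration of our own: the PG(1, c) draws below truncate the infinite sum-of-exponentials representation of the Polya-Gamma distribution, whereas Polson, Scott & Windle sample it exactly.

```python
import math, random

def rpg1(c, rng, terms=100):
    """Approximate PG(1, c) draw via the truncated sum-of-gammas representation
    omega = (1 / (2 pi^2)) * sum_k g_k / ((k - 1/2)^2 + c^2 / (4 pi^2))."""
    return sum(
        rng.expovariate(1.0) / ((k - 0.5) ** 2 + (c / (2 * math.pi)) ** 2)
        for k in range(1, terms + 1)
    ) / (2 * math.pi ** 2)

def pg_gibbs_intercept(y, n_iters, prior_var=100.0, rng=random.Random(5)):
    """Gibbs sampler for P(y_i = 1) = logistic(beta), beta ~ N(0, prior_var):
    alternate omega_i | beta ~ PG(1, |beta|) with a conjugate Gaussian draw of beta."""
    kappa = sum(yi - 0.5 for yi in y)
    beta, draws = 0.0, []
    for _ in range(n_iters):
        omega_sum = sum(rpg1(abs(beta), rng) for _ in y)
        v = 1.0 / (omega_sum + 1.0 / prior_var)      # conditional variance of beta
        beta = rng.gauss(v * kappa, math.sqrt(v))    # conditional mean is v * kappa
        draws.append(beta)
    return draws

# 60 successes in 100 trials: the posterior concentrates near logit(0.6) ~ 0.405.
samples = pg_gibbs_intercept([1] * 60 + [0] * 40, 300)
post_mean = sum(samples[50:]) / len(samples[50:])
```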
 Krys Łatuszyński, University of Warwick, UK
 Title: Solidarity of Gibbs Samplers: the spectral gap
 Abstract: We show a solidarity principle for the spectral gap of Gibbs samplers. In particular, it turns out that if any one of the random scan or d! deterministic scans has a spectral gap, then all of them do.
Joint work with Blazej Miasojedow (Warsaw).
 Eric Moulines, Institut Télécom / Télécom ParisTech (ENST), France
 Title: On the particle Gibbs algorithm and some variants
 Abstract: The particle Gibbs (PG) sampler was introduced by Andrieu, Doucet and Holenstein (2010) as a way to use a particle filter (PF) as a proposal in a Markov chain Monte Carlo (MCMC) scheme. The resulting algorithm was shown to be an efficient tool for joint Bayesian parameter and state inference in nonlinear, non-Gaussian state-space models.
We have established that this algorithm is uniformly ergodic under rather general assumptions, that
we will carefully review and discuss.
Despite this encouraging result, the mixing of the PG kernel can be poor, especially when there is severe degeneracy in the PF. Hence, the success of the PG sampler relies on the, often unrealistic, assumption that we can implement a PF without suffering from any considerable degeneracy. We show that the mixing can be improved by adding a backward simulation step to the PG sampler.
Here, we investigate this further, derive an explicit PG sampler with backward simulation (denoted PGBSi) and show that this indeed is a valid MCMC method. Several illustrations will be presented.
Joint work with F. Lindsten (Linköping Univ., Sweden).
Invited 6 - Convergence of MCMC and adaptive MCMC II
 Organizers: Gersende Fort and Jeff Rosenthal
 Speakers:
 Gareth Roberts, University of Warwick, UK
 Title: From Peskun Ordering to Optimal Simulated Tempering [slides]
 Abstract: The problem of optimal temperature choices for simulated tempering surprisingly has close connections to the optimal scaling problem for Metropolis algorithms. This talk will consider the simulated tempering problem in high dimensions, which leads to a functional optimisation problem whose solution can easily be found pointwise, and which is characterised by requiring temperature moves to have acceptance probability 0.234. The proof requires a continuous-time Peskun ordering argument which seems likely to be useful elsewhere.
This is joint work with Jeffrey Rosenthal and Yves Atchade.
 Radu Craiu, University of Toronto, Canada
 Title: When Interaction Meets Adaption: A Bouncy Multiple-Try Metropolis
 Abstract: Adaptive MCMC (AMCMC) algorithms offer the attractive feature of automatic tuning. In order to validate an adaptive scheme one needs to demonstrate diminishing adaptation and containment (or some alternative versions of these). While the former can be handled reasonably well in many cases, the latter is much harder to prove for many AMCMC samplers, including adaptive Multiple-Try Metropolis samplers. To bypass this theoretical difficulty we propose the use of auxiliary adaptive chains in the context of the interacting Multiple-Try Metropolis algorithm. We also discuss strategies to reduce the computational load involved in implementations for large data.
 Matti Vihola, University of Jyväskylä, Finland
 Title: Stability of some controlled Markov chains
 Abstract: Stability analysis of adaptive Markov chain Monte Carlo (MCMC) algorithms is often based on algorithmspecific features. This talk presents a generic approach to verify the stability of a 'controlled'
Markov chain, such as the adaptive MCMC process. The approach is based on a compound Lyapunov function involving a function on both the state and the adaptation parameter.
This is joint work with C. Andrieu and V. B. Tadic
Invited 7 - Advances in Sequential Monte Carlo methods
 Organizer: Christophe Andrieu
 Speakers:
 Pierre Jacob, National University of Singapore, SG
 Title: Path storage in the particle filter [slides]
 Abstract: This talk considers the problem of storing all the paths generated by a particle filter. I will present a theoretical result bounding the expected memory cost and an efficient algorithm to realise this. The theoretical result and the algorithm are illustrated with numerical experiments.
Joint work with Lawrence Murray and Sylvain Rubenthaler.
 Nick Whiteley, University of Bristol, UK
 Title: On the role of interaction in sequential Monte Carlo algorithms
 Abstract: We introduce a general form of sequential Monte Carlo algorithm defined in terms of a parameterized resampling mechanism. We find that a suitably generalized notion of the Effective Sample Size (ESS), widely used to monitor algorithm degeneracy, appears naturally in a study of its convergence properties. We are then able to phrase sufficient conditions for timeuniform convergence in terms of algorithmic control of the ESS, in turn achievable by adaptively modulating the interaction between particles. This leads us to suggest novel algorithms which are, in senses to be made precise, provably stable and yet designed to avoid the degree of interaction which hinders parallelization of standard algorithms. As a byproduct we prove timeuniform convergence of the popular adaptive resampling particle filter.
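The Effective Sample Size referred to above is, in its standard form, a simple function of the particle weights; a one-line sketch:

```python
def effective_sample_size(weights):
    """ESS = (sum w)^2 / sum w^2: equals N for equal weights and approaches 1
    when a single particle carries essentially all of the weight."""
    s = sum(weights)
    return s * s / sum(w * w for w in weights)

ess_equal = effective_sample_size([1.0] * 100)           # no degeneracy
ess_degenerate = effective_sample_size([1.0] + [1e-12] * 99)  # near-total degeneracy
```

Adaptive resampling triggers a resampling (interaction) step only when the ESS falls below a threshold, which is the mechanism whose algorithmic control the talk's convergence conditions are phrased in terms of.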
 Adam Johansen, University of Warwick, UK
 Title: Monte Carlo Approximation of Monte Carlo Filters [slides]
 Abstract: We will discuss the use of exact approximation within Monte Carlo (particle) filters to allow the approximation of idealised algorithms, and present some illustrative examples.
 Anthony Lee, University of Warwick, UK
 Title: Uniform Ergodicity of the Iterated Conditional SMC and Geometric Ergodicity of Particle Gibbs samplers
 Abstract: We establish quantitative bounds for rates of convergence and asymptotic variances for iterated conditional sequential Monte Carlo (i-cSMC) Markov chains and associated particle Gibbs samplers. Our main finding is that the essential boundedness of potential functions associated with the i-cSMC algorithm provides necessary and sufficient conditions for the uniform ergodicity of the i-cSMC Markov chain, as well as quantitative bounds on its (uniformly geometric) rate of convergence.
This complements more straightforward results for the particle independent Metropolis-Hastings (PIMH) algorithm. Our results for i-cSMC imply that the rate of convergence can be improved arbitrarily by increasing N, the number of particles in the algorithm, and that in the presence of mixing assumptions, the rate of convergence can be kept constant by increasing N linearly with the time horizon. Neither of these phenomena is observed for the PIMH algorithm. We translate the sufficiency of the boundedness condition for i-cSMC into sufficient conditions for the particle Gibbs Markov chain to be geometrically ergodic and quantitative bounds on its geometric rate of convergence.
These results complement recently discovered, and related, conditions for the particle marginal Metropolis-Hastings (PMMH) Markov chain.
This is joint work with Christophe Andrieu and Matti Vihola.
Invited 8 - Convergence Rates of Markov Chains
 Organizer: Dawn Woodard
 Speakers:
 Kshitij Khare, University of Florida, USA
 Title: Convergence for some multivariate Markov chains with polynomial eigenfunctions
 Abstract: In this talk, we will present examples of multivariate Markov chains for which the eigenfunctions turn out to be well-known orthogonal polynomials. This knowledge can be used to derive exact rates of convergence for these Markov chains. The examples include the multivariate normal autoregressive process and simple models in population genetics. We will then consider some generalizations of the above Markov chains for which the stationary distribution is completely unknown. We derive upper bounds on the total variation distance to stationarity by developing coupling techniques for multivariate state spaces. The talk is based on joint work with Hua Zhou and Nabanita Mukherjee.
 Dawn Woodard, Cornell University, USA
 Title: Efficiency of Markov Chain Monte Carlo for Parametric Statistical Models
 Abstract: We analyze the efficiency of Markov chain Monte Carlo (MCMC) methods used in Bayesian computation. While convergence diagnosis is used to choose how long to run a Markov chain, it can be inaccurate and does not provide insight regarding how the efficiency scales with the size of the dataset or other quantities of interest. We characterize the number of iterations of the Markov chain (the running time) sufficient to ensure that the approximate Bayes estimator obtained by MCMC preserves the property of asymptotic efficiency. We show that in many situations where the likelihood satisfies local asymptotic normality, the running time grows linearly in the number of observations n.
 Natesh Pillai, Harvard University, USA
 Title: Finite sample properties of adaptive Markov chains via curvature.
 Abstract: In this talk, we discuss a new way of using coupling methods to obtain finite sample properties for some adaptive Markov chains. Although there has been previous work establishing conditions for their ergodicity, not much was known theoretically about their finite sample properties. Using a variant of the discrete Ricci curvature for Markov kernels introduced by Ollivier, we derive concentration inequalities and finite sample bounds for a class of adaptive Markov chains. We then apply this theory to two examples. In the first, we provide the first rigorous proofs that the finite sample properties obtained from an equi-energy sampler are superior to those obtained from related parallel tempering and MCMC samplers. In the second, we analyze a simple adaptive version of the usual random walk on Z_n and show that the mixing time improves from O(n^2) to O(n log n).
 Discussant: Gersende Fort, LTCI, CNRS - Télécom ParisTech, France
Invited 9 - Recent Developments in Software for MCMC: Round Table Session
 Organizer: Luke Bornn
This invited panel features four leading researchers working on
software development for Bayesian computation. Each panelist will
highlight their particular software, including its history,
development, and relative strengths and weaknesses. Looking forward,
panelists will discuss and debate the future of Bayesian computation
and software development, including challenges, opportunities and
bottlenecks. Emphasis throughout will be on simplifying and automating
the implementation of Monte Carlo methods, with an eye towards
scalability to larger and more complex models and data.
 Speakers:
Prop 1 - Approximate Inference
 Proposed by Daniel Simpson
 Speakers:
 Nicolas Chopin (CREST, France) [webpage]
 Title: Sequential Quasi-Monte Carlo [slides]
 Abstract: We develop a new class of algorithms, SQMC (Sequential Quasi-Monte Carlo), as a variant of SMC (Sequential Monte Carlo) based on low-discrepancy points. The complexity of SQMC is O(N log N), where N is the number of simulations at each iteration, and its error rate is smaller than the Monte Carlo rate O(N^{-1/2}). The only requirement to implement SQMC is the ability to write the simulation of particle x_t^n given x_{t-1}^n as a deterministic function of x_{t-1}^n and uniform variates. We show that SQMC is amenable to the same extensions as standard SMC, such as forward smoothing, backward smoothing, unbiased likelihood evaluation, and so on. In particular, SQMC may replace SMC within a PMCMC (particle Markov chain Monte Carlo) algorithm. We establish several convergence results. We provide numerical evidence in several difficult scenarios that SQMC significantly outperforms SMC in terms of approximation error (joint work with Mathieu Gerber).
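The gap between the Monte Carlo rate and a low-discrepancy rate is easy to see on a toy integral. The sketch below uses a base-2 van der Corput sequence as the low-discrepancy point set; this is our illustration only (SQMC itself relies on more elaborate constructions):

```python
import random

def van_der_corput(n, base=2):
    """First n points of the base-b van der Corput low-discrepancy sequence."""
    pts = []
    for i in range(1, n + 1):
        x, denom, k = 0.0, 1.0, i
        while k:
            k, rem = divmod(k, base)   # peel off digits of i in the given base
            denom *= base
            x += rem / denom           # mirror them about the radix point
        pts.append(x)
    return pts

# Estimate the integral of x^2 over [0, 1] (true value 1/3); the
# low-discrepancy estimate is typically far more accurate than iid uniforms.
n = 4096
qmc_est = sum(x * x for x in van_der_corput(n)) / n
rng = random.Random(6)
mc_est = sum(rng.random() ** 2 for _ in range(n)) / n
```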
 Thiago G. Martins (NTNU, Norway) [webpage]
 Title: Bayesian flexible models with INLA: from computational to prior issues
 Abstract: In this talk we present our approach to extend INLA to a class of latent models where components of the latent field can have near-Gaussian distributions, which we define to be distributions that correct the Gaussian for skewness and/or kurtosis, allowing us extra modeling flexibility within the fast and accurate INLA framework. However, by leaving the realm of Gaussian distributions, the choice of prior distributions for the hyperparameters becomes even more challenging, especially for distributions that correct the Gaussian for both skewness and kurtosis, where independent priors might not be an option. We present a novel approach to specifying prior distributions in this setting that allows the user to provide the desired degree of flexibility compatible with the data at hand.
 Clare McGrory (Univ. of Queensland)
 Title: Variational Bayes for Applications Involving Large Datasets
 Abstract: Variational Bayes is a deterministic approach to Bayesian inference. The time-efficiency of variational Bayes-based approaches makes them attractive in applications, particularly when datasets are large. In recent years, new variational Bayes algorithms have been developed and their use explored in various modelling settings. We look at some areas where variational Bayes has been shown to be valuable and discuss the implications of taking an approximate approach rather than performing a Markov chain Monte Carlo analysis.
 Discussant: Håvard Rue
Prop 2 - Inference and Computation for High-dimensional Sparse Graphical Models (top)
 Proposed by Guido Consonni (Università Cattolica del
Sacro Cuore, Milan)
 Speakers:
 Alex Lenkoski
(Norwegian Computing Center) [webpage]
 Title: A Direct Sampler for G-Wishart Variates and Error Dressing Electricity Spot Price Forecasts
 Abstract: The G-Wishart distribution is the conjugate prior for precision matrices that encode the conditional independencies of a Gaussian graphical model. While the distribution has received considerable attention, posterior inference has proven computationally challenging, in part due to the lack of a direct sampler. We rectify this situation. The existence of a direct sampler offers a host of new possibilities for the use of G-Wishart variates. We discuss one such development by outlining a new methodology for error dressing deterministic forecasts of Nord Pool spot electricity prices using published bid/ask curves. This enables the construction of joint predictive distributions that appropriately characterize the potential for occasional extreme prices in spot markets.
 Donatello Telesca (University of California at
Los Angeles)
[webpage]
 Title: Graphical model determination
based on nonlocal priors
 Abstract: We discuss the use of
nonlocal priors and associated computational challenges
in graphical model determination. We present some new
results based on mixture representations and discuss
related posterior simulation algorithms.
 Hao Wang (Univ of South Carolina) [webpage]
 Title: Scaling it Up: Stochastic
Graphical Model Determination under Spike and Slab Prior
Distributions
 Abstract: Gaussian covariance graph
models and Gaussian concentration graph models are two
classes of models useful for uncovering latent dependence
structures among multivariate variables. In the Bayesian
literature, graphs are often induced by priors over the
space of positive definite matrices with fixed zeros, but
these methods present daunting computational burdens in
large problems. Motivated by the superior computational
efficiency of continuous shrinkage priors for linear
regression models, I propose a new framework for graphical
model determination that is based on continuous spike and
slab priors and uses latent variables to identify graphs.
I discuss model specification, computation, and inference
for both covariance graph models and concentration graph
models. The new approach produces reliable estimates of
graphs and efficiently handles problems with hundreds of
variables.
 Discussant: Francesco Stingo (University of Texas MD Anderson Cancer Center, Houston)
Prop 3 - Probabilistic advances for Monte Carlo methods (top)
 Organizer: Vivekananda Roy
 Speakers:
 James Flegal (University of California, Riverside, USA) [webpage]
 Title: Relative fixedwidth stopping
rules for Markov chain Monte Carlo simulations
 Abstract: Markov chain Monte Carlo
(MCMC) simulations are commonly employed for estimating
features of a target distribution. A fundamental challenge
in MCMC simulations is determining when the simulation
should stop. We consider a sequential stopping rule that
terminates the simulation when the width of a confidence
interval is sufficiently small relative to the size of the
target parameter. Specifically, we consider relative
magnitude and relative standard deviation stopping rules in
the context of MCMC. In each setting, we develop sufficient
conditions to ensure asymptotic validity, that is, conditions
to ensure the simulation will terminate with probability one
and the resulting confidence intervals have the proper
coverage probability. Finally, we investigate the finite
sample properties for estimating expectations and quantiles
through a variety of examples.
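A relative stopping rule of the kind described can be sketched in a few lines. The toy below is our own simplification (not the authors' implementation): the chain is extended until the batch-means confidence interval for the mean is narrow relative to the sample standard deviation, and the target, proposal scale, and thresholds are illustrative assumptions.

```python
import numpy as np

def batch_means_se(x):
    """Monte Carlo standard error of the mean via non-overlapping batch means."""
    n = len(x)
    b = int(np.floor(np.sqrt(n)))            # batch size ~ sqrt(n)
    a = n // b                               # number of batches
    means = x[: a * b].reshape(a, b).mean(axis=1)
    return np.sqrt(b * means.var(ddof=1) / n)

def run_until_stopped(step, x0, eps=0.05, z=1.96, min_iter=1000, block=1000):
    """Relative standard-deviation stopping rule: stop when the CI
    half-width is at most eps times the sample standard deviation."""
    chain = [x0]
    while True:
        for _ in range(block):
            chain.append(step(chain[-1]))
        x = np.asarray(chain)
        if len(x) >= min_iter:
            halfwidth = z * batch_means_se(x)
            if halfwidth <= eps * x.std(ddof=1):   # relative criterion
                return x

# Example: Gaussian random-walk Metropolis targeting N(0, 1).
rng = np.random.default_rng(1)
def rw_step(x):
    y = x + rng.normal(scale=2.0)
    return y if np.log(rng.random()) < 0.5 * (x * x - y * y) else x

chain = run_until_stopped(rw_step, 0.0)
print(len(chain), chain.mean())
```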
 Radu Herbei (Ohio State University, USA) [webpage]
 Title: Exact MCMC using
approximations and Bernoulli factories
 Abstract: With the ever increasing
complexity of models used in modern science, there is a need
for new computing strategies. Classical MCMC algorithms
(Metropolis-Hastings, Gibbs) have difficulty handling very
highdimensional state spaces and models where likelihood
evaluation is impossible. In this work we study a collection
of models for which the likelihood cannot be evaluated
exactly; however, it can be estimated unbiasedly in an
efficient way via distributed computing. Such models
include, but are not limited to, cases where the data are
discrete noisy observations from a class of diffusion
processes or partial measurements of a solution to a partial
differential equation. In each case, an exact MCMC algorithm
targeting the correct posterior distribution can be obtained
either via the "auxiliary variable trick" or by using a
Bernoulli factory to advance the current state. We explore
the advantages and disadvantages of such MCMC algorithms and
show how they can be used in applications from oceanography
and phylogenetics.
 Sumeetpal Singh (University of Cambridge, UK)
[webpage]
 Title: Properties of the Particle Gibbs
Sampler [slides]
 Abstract: The particle Gibbs sampler is a Markov chain algorithm which operates on the extended space of the auxiliary variables generated by an interacting particle system. In particular, it samples from the discrete variables that determine the particle genealogy. We establish the ergodicity of the particle Gibbs Markov kernel, for any length of time, under certain assumptions.
We discuss several algorithmic variations, either proposed
in the literature or original. For some of these variations,
we are able to prove that they strictly dominate the
original algorithm in terms of efficiency, while for the
others, we provide counterexamples that they do not.
 Jimmy Olsson (Lund University, Sweden)
[webpage]
 Title: Partial ordering of inhomogeneous Markov chains with applications to
Markov Chain Monte Carlo methods [slides]
 Abstract: In this talk we will discuss the asymptotic variance of sample path averages for inhomogeneous Markov chains that evolve alternatingly according to two different π-reversible Markov transition kernels. More specifically, we define a partial ordering over the pairs of π-reversible Markov kernels, which allows us to compare directly the asymptotic variances for the inhomogeneous Markov chains associated with each pair. As an important application we use this result for comparing different data-augmentation-type Metropolis-Hastings algorithms. In particular, we compare some pseudo-marginal algorithms and propose a novel exact algorithm, referred to as the random refreshment algorithm, which is more efficient, in terms of asymptotic variance, than the Grouped Independence Metropolis-Hastings algorithm and has a computational complexity that does not exceed that of the Monte Carlo Within Metropolis algorithm. Finally, we provide a theoretical justification of the Carlin and Chib algorithm used in model selection. This is joint work with Florian Maire and Randal Douc.
Prop 4 - Bayesian computation in Neurosciences (top)
 Organizers: Nicolas Chopin and Simon Barthelmé
 Speakers:
 Tim Johnson (Univ.
of Michigan, USA) [webpage]
 Title: A GPU Implementation of a Spatial
GLMM: Assessing Spatially Varying Coefficients of Multiple
Sclerosis Lesions [slides]
 Abstract: In this talk I present a parallel programming approach for the estimation of spatially varying coefficients in a spatial GLMM analyzing Multiple Sclerosis lesion images. The model is a spatial GLMM of binary image data with subject-specific covariates. The spatial coefficients for these covariates are spatially dependent and are a priori modeled using a multivariate conditional autoregressive model. The large size of these images and coefficient maps, along with spatial dependence between voxels, makes this problem challenging and extremely computationally intense. Code parallelization and GPU implementation result in an executable that is 50 times faster than a serial implementation.
 Emily Fox (University of Washington, USA) [webpage]
 Title: Gaussian Processes on the Brain: Heteroscedasticity, Nonstationarity, and Long-range Dependencies
 Abstract: In this talk, we focus on a set of modeling challenges associated with Magnetoencephalography (MEG) recordings of brain activity: (i) the time series are high dimensional with long-range dependencies, (ii) the recordings are extremely noisy, and (iii) gathering multiple trials for a given stimulus is costly. Our goal then is to harness shared structure both within and between trials. Correlations between sensors arise based on spatial proximity, but also from coactivation patterns of brain activity that change with time. Capturing this unknown and changing correlation structure is crucial in effectively sharing information. Motivated by the structure of our high-dimensional time series, we propose a Bayesian nonparametric dynamic latent factor model based on a sparse combination of Gaussian processes (GPs). Key to the formulation is a time-varying mapping from the lower-dimensional embedding of the dynamics to the full observation space, thus capturing time-varying correlations between sensors. Finally, we turn to another challenge: in addition to long-range dependencies, there are abrupt changes in the MEG signal. We propose a multiresolution GP that hierarchically couples GPs over a random nested partition. Long-range dependencies are captured by the top-level GP while the partition points define the abrupt changes in the time series. The inherent conjugacy of the GPs allows for efficient inference of the hierarchical partition, for which we employ graph-theoretic techniques. Portions of this talk are joint work with David Dunson, Alona Fyshe, and Tom Mitchell.
 Yan Zhou (Univ. of
Warwick, UK) [webpage]
 Title: Sequential Monte Carlo methods
for Bayesian Model selection of PET image data
[slides]
 Abstract: Positron Emission Tomography (PET) is a widely used 3D medical imaging technique for in vivo study of human brains. Previously, the massive data size limited its analysis to computationally cheap optimization methods. Bayesian modeling has been proposed and successfully applied to PET in the recent literature. Though relatively simple models can be used to model the data, some computational challenges remain. First, at the voxel level there are about a quarter million data sets whose posteriors need to be simulated, and thus fast Monte Carlo algorithms are demanded. Second, the data vary significantly across the 3D space and thus robust Monte Carlo methods are needed; any method that requires human tuning is infeasible. Third, the data are very noisy and each data set has a limited size, so model identification requires very accurate estimates of the Bayes factor. In this talk, we will illustrate how SMC can be used to analyze PET data, and in particular how some adaptive techniques can be used to improve the results while lowering the computational cost. It will be shown that within the SMC framework it is possible to obtain self-tuning, robust and automatic Bayesian model selection results for PET data at relatively low computational cost.
 Liam Paninski (Columbia Univ., USA) [webpage]
 Title: Applications of exact
Hamiltonian Monte Carlo methods
 Abstract: We present a Hamiltonian
Monte Carlo algorithm to sample from multivariate Gaussian
distributions in which the target space is constrained by
linear and quadratic inequalities or products thereof. The
Hamiltonian equations of motion can be integrated exactly
and there are no parameters to tune. The algorithm mixes
faster and is more efficient than Gibbs sampling. The
runtime depends on the number and shape of the constraints
but the algorithm is highly parallelizable. In many cases,
we can exploit special structure in the covariance matrices
of the untruncated Gaussian to further speed up the runtime.
A simple extension of the algorithm permits sampling from
distributions whose logdensity is piecewise quadratic, as
in the "Bayesian Lasso" model. We are currently
investigating an application that involves an auxiliary
variable approach to sampling from binary vectors. We
illustrate the usefulness of these ideas in several
neuroscience applications.
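The "no parameters to tune" point comes from the fact that for a Gaussian potential the Hamiltonian equations have a closed-form solution. The sketch below (our own, for the *untruncated* case only) shows that idea: in whitened coordinates the dynamics are a harmonic oscillator. The constrained sampler of the talk additionally computes the exact times at which the trajectory hits each linear or quadratic wall and reflects the momentum there; that bookkeeping is omitted.

```python
import numpy as np

def exact_hmc_gaussian(mu, Sigma, n_samples, t=np.pi / 2, seed=0):
    """Exact-trajectory HMC for an untruncated Gaussian target N(mu, Sigma)."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(Sigma)          # Sigma = L @ L.T
    d = len(mu)
    z = np.zeros(d)                        # whitened position
    out = np.empty((n_samples, d))
    for i in range(n_samples):
        p = rng.standard_normal(d)         # fresh momentum each iteration
        # Closed-form solution of dz/dt = p, dp/dt = -z (harmonic oscillator),
        # so the trajectory is integrated exactly -- no step size to tune.
        z = z * np.cos(t) + p * np.sin(t)
        out[i] = mu + L @ z                # map back to the original space
    return out

mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6], [0.6, 1.0]])
samples = exact_hmc_gaussian(mu, Sigma, 20000)
print(samples.mean(axis=0))               # close to mu
```

With the default travel time t = π/2 each draw is in fact independent of the last; shorter times give a correlated but still exact chain.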
Prop 5 - Advances in Monte Carlo
motivated by applications (top)
 Organizer: Robin Ryder (Univ. Dauphine, France)
 Speakers:
 Alexis Muir-Watt (Oxford, UK)
 Title: Monte Carlo inference for
partial orders from linear extensions
 Abstract: A new model for dynamic rankings parameterized by partial orders is introduced. As an application, rankings represent an underlying social order among 12th-century bishops. A partial order on a set P corresponds to a transitively closed, directed acyclic graph or DAG h(P) with vertices in P. Such orders generalize orders defined by partitioning the elements of P and ranking the elements of the partition. In the following, an unobserved partial order h(P) evolves according to a stochastic process in time, and observations are random linear extensions of suborders of h(P). Following Mogapi et al. (2010), we specify a model for random partial orders with base measure the random k-dimensional orders (Winkler 1985) and a parameter controlling the typical depth of a random partial order. We extend the static model to a Hidden Markov Model in which the orders evolve in time. The partial order h(P) evolves according to a hidden Markov process which has the static model as its equilibrium: singleton events reorder individual nodes while changepoint events resample the entire partial order. The process is observed by taking random linear extensions from suborders of h(P) at a sequence of sampling times, which are uncertain (up to an interval). The posterior distribution for the unobserved process and parameters, which is determined by the HMM, is doubly intractable. Despite a high variance in estimating the posterior density, the Particle MCMC approach of Andrieu et al. (2010) is used for Monte Carlo based inference.
 Simon Barthelme
(Geneva, Switzerland) [webpage]
 Title: MCMC techniques for functional
ANOVA in Gaussian processes
 Abstract: Functional ANOVA (fANOVA) was developed by Ramsay and Silverman (2005) as an exploratory tool to understand variability among sets of functions. The core of the technique is to build a hierarchical model for functions, in which each level adds a functional perturbation to a group average. Remarkably, such hierarchical structures also come up in a latent form in a range of models used in neuroscience and spatial statistics: so-called "doubly stochastic point processes", for example the log-Gaussian Cox process of Møller et al. (1998). One way of characterising fANOVA is to frame it as a hierarchical Gaussian process prior on groups of functions. This formulation has the advantage of theoretical elegance, but out-of-the-box MCMC samplers can be extremely slow on realistic datasets. We will show how recent results obtained by Heersink & Furrer (2011) on quasi-Kronecker matrices can be used to speed up MCMC sampling for fANOVA problems. We will focus more particularly on applications to neuroscience, especially spike-train analysis.
 Lawrence Murray
(Perth, Australia) [webpage]
 Title: Environmental applications of
particle Markov chain Monte Carlo methods
 Abstract: I'll report on progress in
applying particle Markov chain Monte Carlo (PMCMC) methods
for state and parameter estimation in environmental domains,
including marine biogeochemistry, soil carbon modelling and
hurricane tracking. State-space models in these areas often
derive from a tradition of deterministic process modelling
to capture the physical, chemical and biological
understanding of the system under study. They are then
augmented with stochastic or statistical components to
capture uncertainty in the process model itself, its
parameters, inputs, initial conditions and observation.
PMCMC has some advantages for inference in such a context:
it imposes few constraints on model development, it remains
true to a model specification without introducing
approximations, and is highly amenable to parallelisation on
modern computing hardware. But PMCMC also has some
drawbacks: it is better suited to fast-mixing models, and
can be computationally expensive. I'll discuss these issues,
with examples, and attempt to draw out some general lessons
from the experience.
 Rémi Bardenet
(Oxford, UK) [webpage]
 Title: When cosmic particles switch labels: Adaptive Metropolis with online relabeling, motivated by the data analysis of the Pierre Auger experiment
 Abstract: The Pierre Auger experiment is a giant cosmic ray observatory located in Argentina. Cosmic rays are charged particles that travel through the universe at very high energies. When one of these particles hits our atmosphere, it generates a cascade of particles that strikes the surface of the Earth over several square kilometers. Auger has a 3000 km^2 array of 1600+ detectors gathering data from these cascades. The objective of the data analysis is to infer the parameters of the original incoming particles. In this talk, we first derive a model of part of the detection process, which involves elementary particles called muons. The resulting model is invariant to permutations of these muons, thus making MCMC inference prone to label-switching, similarly to what happens with MCMC in mixture models. In addition, our model is high dimensional and involves many correlated variables, which motivates the use of adaptive MCMC, such as the adaptive Metropolis (AM) algorithm of Haario et al. (Bernoulli, 2001). However, running AM on our model requires solving the label-switching problem online. Building on previous approaches, we present AMOR, a variant of AM that jointly learns an optimal proposal and an optimal relabeling of the marginal chains. We present applications and state convergence results for AMOR, including a law of large numbers, and demonstrate interesting links between relabeling and vector quantization.
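The adaptive Metropolis building block referenced in the abstract can be sketched as follows. This is a minimal Haario-style AM on a toy target, entirely our own construction (AMOR's online relabeling step is not shown): the Gaussian proposal covariance is the scaled running covariance of the chain.

```python
import numpy as np

def adaptive_metropolis(log_target, x0, n_iter, seed=0):
    """Simplest Haario-style AM: the proposal covariance is the scaled
    running covariance of the chain (adaptation is never switched off,
    so validity rests on the usual diminishing-adaptation arguments)."""
    rng = np.random.default_rng(seed)
    d = len(x0)
    sd = 2.4 ** 2 / d                      # scaling suggested by Haario et al.
    eps = 1e-6                             # keeps the covariance positive definite
    x = np.asarray(x0, float)
    lp = log_target(x)
    mean, cov = x.copy(), np.eye(d)
    chain = np.empty((n_iter, d))
    for i in range(n_iter):
        # Fixed proposal during a short burn-in, adapted covariance afterwards.
        C = sd * (cov + eps * np.eye(d)) if i > 500 else 0.1 * np.eye(d)
        y = rng.multivariate_normal(x, C)
        lpy = log_target(y)
        if np.log(rng.random()) < lpy - lp:
            x, lp = y, lpy
        chain[i] = x
        delta = x - mean                   # recursive mean/covariance update
        mean = mean + delta / (i + 2)
        cov = cov + (np.outer(delta, x - mean) - cov) / (i + 2)
    return chain

# Toy target: a strongly correlated bivariate Gaussian.
prec = np.linalg.inv(np.array([[1.0, 0.9], [0.9, 1.0]]))
log_target = lambda x: -0.5 * x @ prec @ x
chain = adaptive_metropolis(log_target, np.zeros(2), 20000)
print(chain[5000:].mean(axis=0))
```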
Prop 6 - Monte Carlo methods in network analysis (top)
 Organizer: Nial Friel (University College Dublin)
 Speakers:
 David Hunter
(Penn State University, USA) [webpage]
 Title: Improving SimulationBased
Algorithms for Fitting ERGMs
 Abstract: Markov chain Monte Carlo
methods can be used to approximate the intractable
normalizing constants that arise in likelihood calculations
for many exponential family random graph models for
networks. However, in practice, the resulting approximations
degrade as parameter values move away from the value used to
define the Markov chain, even in cases where the chain
produces perfectly efficient samples. We introduce a new
approximation method along with a novel method of moving
toward a maximum likelihood estimator (MLE) from an
arbitrary starting parameter value in a series of steps
based on alternating between the canonical exponential
family parameterization and the meanvalue parameterization.
This technique enables us to find an approximate MLE in
many cases where this was previously not possible. We
illustrate these methods on a model for a transcriptional
regulation network for E. coli, an example where previous
attempts to approximate an MLE had failed, and a model for a
wellknown social network dataset involving friendships
among workers in a tailor shop. These methods are
implemented in the publicly available ergm package for R.
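For context, the simulation-based approximation whose degradation the abstract describes is the classical importance-sampling (Geyer-Thompson) estimate of the log normalizing-constant ratio for an exponential family. The sketch below is our toy version of that standard estimator, not the new method of the talk; the statistics `g` are hypothetical standardized values, whereas a real ERGM fit would use sufficient statistics of graphs simulated at theta0.

```python
import numpy as np

# Hypothetical standardized sufficient statistics g(y_i) (e.g. edge and
# triangle counts) of m graphs simulated from the model at theta0.
rng = np.random.default_rng(0)
g = rng.normal(size=(1000, 2))

def log_normalizer_ratio(theta, theta0):
    """Importance-sampling estimate:
    log Z(theta)/Z(theta0) ~ log (1/m) sum_i exp((theta - theta0) . g(y_i))."""
    a = g @ (np.asarray(theta) - np.asarray(theta0))
    m = a.max()
    return m + np.log(np.mean(np.exp(a - m)))   # log-sum-exp for stability

ratio = log_normalizer_ratio([0.1, 0.0], [0.0, 0.0])
print(ratio)
```

The estimate is accurate near theta0 but its variance explodes as theta moves away, which is exactly the degradation the new method targets.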
 Adrian Raftery
(University of Washington, USA) [webpage]
 Title: Fast Inference for Model-Based Clustering of Networks Using an Approximate Case-Control Likelihood
 Abstract: The model-based clustering latent space network model represents relational data visually and takes account of several basic network properties. Due to the structure of its likelihood function, the computational cost is of order O(N^2), where N is the number of nodes. This makes it infeasible for large networks. We propose an approximation of the log likelihood function. We adapt the case-control idea from epidemiology and construct an approximate case-control log likelihood which is an unbiased estimator of the full log likelihood. Replacing the full likelihood by the case-control likelihood in the MCMC estimation of the latent space model reduces the computational time from O(N^2) to O(N), making it feasible for large networks. We evaluate its performance using simulated and real data. We fit the model to a large protein-protein interaction data set using the case-control likelihood and use the model-fitted link probabilities to identify false positive links. This is joint work with Xiaoyue Niu, Peter Hoff and Ka Yee Yeung.
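The case-control idea is simple to sketch: keep every linked pair ("cases"), subsample the non-links ("controls"), and up-weight the control contribution so the estimator is unbiased for the full log likelihood. The code below is our own simplified logistic latent-distance toy, with hypothetical helper names, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def logistic_loglik(pairs, z, alpha):
    """Log-likelihood contribution of (i, j, y_ij) triples under a
    latent-distance model: logit P(y_ij = 1) = alpha - ||z_i - z_j||."""
    i, j, y = pairs[:, 0].astype(int), pairs[:, 1].astype(int), pairs[:, 2]
    eta = alpha - np.linalg.norm(z[i] - z[j], axis=1)
    return np.sum(y * eta - np.log1p(np.exp(eta)))

def case_control_loglik(links, nonlinks, z, alpha, m):
    """All links plus m subsampled non-links, re-weighted for unbiasedness."""
    idx = rng.choice(len(nonlinks), size=m, replace=False)
    weight = len(nonlinks) / m
    return (logistic_loglik(links, z, alpha)
            + weight * logistic_loglik(nonlinks[idx], z, alpha))

# Simulate a small latent-space network.
N, alpha = 30, 1.0
z = rng.normal(size=(N, 2))
pairs = np.array([(i, j) for i in range(N) for j in range(i + 1, N)])
eta = alpha - np.linalg.norm(z[pairs[:, 0]] - z[pairs[:, 1]], axis=1)
y = rng.random(len(pairs)) < 1 / (1 + np.exp(-eta))
triples = np.column_stack([pairs, y.astype(float)])
links, nonlinks = triples[y], triples[~y]

full = logistic_loglik(triples, z, alpha)
est = np.mean([case_control_loglik(links, nonlinks, z, alpha, m=40)
               for _ in range(300)])
print(full, est)   # the case-control estimator is unbiased for `full`
```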
 Ernst Wit
(University of Groningen, NL) [webpage]
 Title: Network inference via birth-death MCMC
 Abstract: We propose a new Bayesian methodology for model determination in Gaussian graphical models for both decomposable and non-decomposable cases. The proposed methodology is a transdimensional MCMC approach, which makes use of a birth-death process. In particular, the birth-death process updates the graph by adding a new edge in a birth event or deleting an edge in a death event. It is easy to implement and computationally feasible for large graphical models. Unlike frequentist approaches, our method gives a principled and, in practice, sensible approach for model selection, as we show in a cell signaling example. We illustrate the efficiency of the proposed methodology on simulated and real datasets. Moreover, we implemented the proposed methodology in an R package called BDgraph.
Prop 7 - Bayesian Inference for Multivariate Dynamic Panel Data Models (top)
 Organizer: Robert Kohn
 Speakers:
 Sally Wood
(Melbourne Business School, Australia)[webpage]
 Title: Flexible Models for Longitudinal Data
 Abstract: A method is presented for flexibly modelling longitudinal data. Flexibility is achieved in two ways. First by assuming the regression coefficients of random effects models are generated from a timevarying prior distribution and second by allowing the manner in which the prior evolves over time to vary with individual time series. The frequentist properties of the approach are examined and the method is applied to modelling the performance trajectories of individuals in psychological experiments.
 Robert Kohn
(Australian School of Business, Australia) [webpage]
 Title: Estimating Dynamic Panel
Mixture Models.
 Abstract: A method is presented for
estimating dynamic panel data models based on mixtures. The
methods are developed for both discrete and continuous data
or a combination of both and applied to data in health and
finance. The approach is Bayesian and the estimation is
carried out using newly developed particle methods by the
author.
 Silvia Cagnone
(University of Bologna, Italy) [webpage]
 Title: An adaptive Gauss-Hermite quadrature method for likelihood evaluation of latent autoregressive models for panel data and time series
 Abstract: We propose to use the adaptive Gauss-Hermite (AGH) numerical quadrature approximation to solve the integrals involved in the estimation of a class of dynamic latent variable models for time series and longitudinal data. In particular, we consider models based on continuous time-varying latent variables which are modeled by an autoregressive process of order 1, AR(1). Two examples of such models are the Stochastic Volatility models for the analysis of financial time series and the Limited Dependent Variable models for the analysis of panel data. A comparison between the performance of AGH methods and alternative approximation methods proposed in the literature is carried out by simulation. Applications to real data are also illustrated.
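The adaptive Gauss-Hermite idea fits in a few lines: rather than placing the Hermite nodes at location 0 and scale 1, one centres and rescales them at a location and scale adapted to the integrand (typically its mode and curvature). The sketch below is our toy, with the helper name `agh_integrate` and the test integrand chosen by us because it has a closed form.

```python
import numpy as np

def agh_integrate(g, mu_hat, sigma_hat, n_nodes=10):
    """Approximate the integral of g(u) du with Gauss-Hermite nodes adapted
    to location mu_hat and scale sigma_hat.  Substituting u = mu + sigma*x,
    the integral equals sigma * sum_i w_i * exp(x_i^2/2) * g(mu + sigma*x_i),
    where (x_i, w_i) are probabilists' Hermite nodes/weights."""
    x, w = np.polynomial.hermite_e.hermegauss(n_nodes)  # weight exp(-x^2/2)
    u = mu_hat + sigma_hat * x
    return sigma_hat * np.sum(w * np.exp(0.5 * x ** 2) * g(u))

# Check on a case with a closed form:
# integral of N(y; u, 1) * N(u; 0, 1) du = N(y; 0, 2).
norm_pdf = lambda v, m, s2: np.exp(-0.5 * (v - m) ** 2 / s2) / np.sqrt(2 * np.pi * s2)
y = 1.3
g = lambda u: norm_pdf(y, u, 1.0) * norm_pdf(u, 0.0, 1.0)
# Adapted location/scale: the posterior mode y/2 and sd sqrt(1/2).
approx = agh_integrate(g, mu_hat=y / 2, sigma_hat=np.sqrt(0.5))
exact = norm_pdf(y, 0.0, 2.0)
print(approx, exact)
```

Because the nodes sit exactly where the integrand has mass, a handful of nodes suffices; unadapted quadrature would need far more.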
Prop 8 - Bayesian statistics and population genetics (top)
 Organizers: Michael Blum
and Olivier François
 Speakers:
 Jukka Corander
(Helsinki University, Finland) [webpage]
 Title: Bayesian
inference about population genealogies using diffusion
approximations to allele frequency distributions
 Abstract: Genotyped
individuals from several sample populations are frequently
used for inferring the underlying genetic population
structure, where a number of divergent subpopulations can
be present. Statistical inference about the genealogy of
the subpopulations can be made by modeling the stochastic
changes in allele frequencies caused by demographic
processes over time. Recently, significant advances have
been made concerning this inference problem by
characterizing the changes in allele frequency using
diffusion-based approximations and Bayesian hierarchical
models. A particularly attractive feature of such models
is that the sufficient statistics are equal to the
observed allele counts over the loci per population, and
consequently, computational complexity is not a function
of the number of genotyped individuals, unlike in
coalescent models. We show how the neutral Wright-Fisher
and infinite alleles models can be fitted to genotype data
with a combination of analytical integration, Laplace
approximations and adaptive Monte Carlo algorithms. A
number of possible generalizations and alternative
inference approaches will also be discussed.
 Daniel Lawson
(Bristol University, UK) [webpage]
 Title: "All the genomes in the world": Scalable Bayesian Computation using emulation
 Abstract: As the size of datasets grows, the majority of interesting models become inaccessible because they scale quadratically or worse. For some problems fast algorithms exist that converge to the desired model of interest, but this is rarely true: we often really want to use a complex model, in this case model-based clustering in genetics. Beyond simply discarding data, how do we make the model run? We describe a framework in which statistical emulators can be substituted for part of the likelihood. By careful construction of (a) a decision framework to decide which data to compute the full likelihood for, (b) the choice of a subquadratic-cost emulator, and (c) integration with the full model, we show that there are conditions under which the emulated Bayesian model can be consistent with the full model, and that the full model is recovered as the amount of emulation decreases. We specify the details of the framework for models of general similarity matrices, and give an example of a Bayesian clustering model for genetics data. This allows us in principle to cluster "all the genomes in the world" at subquadratic computational cost, and we describe a tempered MCMC-like algorithm to find the maximum a posteriori state that can be implemented on parallel architectures.
 Barbara Engelhardt (Duke University, USA)
[webpage]
 Title: Parameter
estimation in Bayesian matrix factorization models for
highdimensional genomic data
 Abstract: Matrix
factorization models are a workhorse of genomics: much of
the difficulty of highdimensional data is uncovering
lowdimensional structure while accounting for technical
and biological noise. Although the structure and specific
distributions underlying these models vary across
application and purpose, the dimensionality and complexity
of the data are both large. In this work, we consider
latent factor models for capturing latent population
structure in genomic samples, with a focus on the
interpretation of the estimated factors with respect to
characteristics of sample ancestry. We consider various
ways to improve parameter estimation in these models, with
the goal of making the application of structured models
possible and also useful.
Prop 9 - Pseudo-marginal and particle MCMC methods (top)
 Organizer: M. Vihola
 Speakers:
 C. Andrieu (Univ.
Bristol, UK) [webpage]
 Title: Some properties of algorithms for inference with noisy likelihoods
 Abstract: As statistical models become
ever more complex, evaluating the likelihood function has
become a challenge when carrying out statistical inference.
In recent years various methods which rely on noisy
estimates of the likelihood have been proposed in order to
circumvent this problem. In fact these methods share a
common structure and, perhaps surprisingly, lead to correct
inference in that they do not introduce additional
approximations compared to methods which use exact
values of the likelihood; one often uses the term "exact
approximations" to refer to some of the associated
algorithms. There is naturally a price to pay for using such
approximations.
In the presentation we will review these methods and discuss
some of the theoretical properties underpinning the
associated algorithms and their implications on the design
and expected performance of the algorithms.
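The common structure alluded to is the pseudo-marginal construction: replace the likelihood by a non-negative unbiased estimator and recycle the estimate attached to the current state, which leaves the exact posterior invariant. A minimal sketch (our toy; the data, the artificial lognormal noise with mean one, and all constants are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, size=50)        # toy data, true mean 2

def loglik(theta):
    # Gaussian log-likelihood, up to an additive constant.
    return np.sum(-0.5 * (data - theta) ** 2)

def loglik_hat(theta, noise_sd=0.5):
    # Exact log-likelihood plus lognormal noise with mean one on the
    # likelihood scale (E[exp(noise)] = 1), so exp(loglik_hat) is a
    # non-negative unbiased estimator of the likelihood.
    return loglik(theta) + noise_sd * rng.standard_normal() - 0.5 * noise_sd ** 2

theta, lhat, chain = 0.0, loglik_hat(0.0), []
for _ in range(20000):
    prop = theta + 0.3 * rng.standard_normal()
    lhat_prop = loglik_hat(prop)                  # fresh estimate for the proposal
    if np.log(rng.random()) < lhat_prop - lhat:   # flat prior for simplicity
        theta, lhat = prop, lhat_prop             # recycle the accepted estimate
    chain.append(theta)
print(np.mean(chain[5000:]))                      # near the posterior mean, ~2
```

The price mentioned in the abstract is visible here too: noisier estimators make the chain sticky, since an unusually lucky estimate at the current state is hard to beat.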
 G. Karagiannis (PNNL, USA)
[webpage]
 Title: Annealed Importance Sampling
Reversible Jump MCMC algorithms
 Abstract: Reversible jump Markov chain Monte Carlo (RJMCMC) algorithms are an extension of standard MCMC methodology that allows sampling from transdimensional distributions. In practice, their efficient implementation remains a challenge due to the difficulty in constructing efficient proposal moves. We present a new algorithm that allows for an efficient implementation of RJMCMC; we call this algorithm "Annealed Importance Sampling Reversible Jump". The proposed algorithm can be thought of as an exact approximation of idealized RJ algorithms which, in a Bayesian model selection problem, would sample the model labels only, but cannot be implemented. The methodology relies on the idea of bridging different models with artificial intermediate models, whose role is to introduce smooth inter-model transitions and improve performance. We demonstrate the good performance of the proposed algorithm on standard model selection problems and show that, in spite of the additional computational effort, the approach is highly competitive computationally.
 G. Nicholls (Oxford
Univ., UK)
[webpage]
 Title: Approximate-likelihood MCMC is close to the Penalty Method algorithm
 Abstract: We
consider Metropolis Hastings MCMC in cases where the log of
the ratio of target distributions is replaced by an
estimator. The estimator is based on m samples from an
independent online Monte Carlo simulation. Under some
conditions on the distribution of the estimator the process
resembles Metropolis Hastings MCMC with a randomized
transition kernel. When this is the case there is a
correction to the estimated acceptance probability which
ensures that the target distribution remains the equilibrium
distribution. The simplest versions of the Penalty Method of
Ceperley and Dewing (1999), the Universal Algorithm of Ball
et al. (2003) and the Single Variable Exchange algorithm of
Murray et al. (2006) are special cases. In many applications
of interest the correction terms cannot be computed. We
consider approximate versions of the algorithms. We show,
using a coupling argument, that on average O(m) of the
samples realized by a simulation approximating a randomized
chain of length n are exactly the same as those of a coupled
(exact) randomized chain. We define a distance between
MCMC algorithms based on their coupling separation times.
The naïve algorithm is separated by order m^(-1/2) from the standard exact algorithm, but order 1/m from the exact penalty method MCMC algorithm.
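For intuition on the correction term mentioned above: when the log-ratio estimator is Gaussian with known variance sigma^2, the Ceperley and Dewing penalty subtracts sigma^2/2 inside the acceptance step, and the target remains the equilibrium distribution. A minimal sketch, with a standard normal as a stand-in target and the estimator noise injected artificially:

```python
import numpy as np

rng = np.random.default_rng(1)

def log_target(x):                 # stand-in target: N(0, 1)
    return -0.5 * x**2

sigma = 0.5                        # known std. dev. of the noisy log-ratio
x, chain = 0.0, []
for _ in range(20000):
    prop = x + rng.normal(0.0, 1.0)
    # noisy estimate of the log acceptance ratio (noise added for illustration)
    est = log_target(prop) - log_target(x) + rng.normal(0.0, sigma)
    # penalty correction: subtract sigma^2 / 2 so the target stays invariant
    if np.log(rng.uniform()) < est - 0.5 * sigma**2:
        x = prop
    chain.append(x)

chain = np.asarray(chain)
mean_est, var_est = chain[2000:].mean(), chain[2000:].var()
```

Without the penalty term, the injected noise inflates the acceptance rate and biases the chain toward a flatter distribution; with it, the sample mean and variance match N(0, 1).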
 Fredrik Lindsten (Linköping Univ., Sweden) [webpage]
 Title: Particle Gibbs using ancestor
sampling [slides]
 Abstract: In this talk we will
introduce particle Gibbs with ancestor sampling (PGAS),
which is a relatively new member of the family of particle
MCMC methods. Similarly to the particle Gibbs with backward
simulation (PGBS) procedure, we use backward sampling to
(considerably) improve the mixing of the PG kernel. Instead
of using separate forward and backward sweeps as in PGBS,
however, the ancestor sampling allows us to achieve the same
effect in a single forward sweep. We will also show that
PGAS successfully solves the problem of inferring a Wiener
model (linear dynamical system followed by a static
nonlinearity), with very few assumptions on the model. Finally, we note that PGAS opens up interesting possibilities for inference in non-Markovian models.
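The ancestor-sampling step that distinguishes PGAS from PGBS can be written in a few lines: the reference particle's ancestor at time t is drawn among the time t-1 particles with probabilities proportional to their importance weights times the transition density into the reference state. A sketch with a hypothetical AR(1) transition; the particle values and weights below are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

def ancestor_weights(log_w_prev, x_prev, x_ref, log_f):
    # w_{t-1}^i * f(x_ref | x_{t-1}^i), normalised: the ancestor-sampling
    # probabilities for the reference trajectory's state at time t
    logits = log_w_prev + log_f(x_ref, x_prev)
    logits -= logits.max()               # stabilise before exponentiating
    w = np.exp(logits)
    return w / w.sum()

# hypothetical AR(1) transition: x_t = 0.9 x_{t-1} + N(0, 0.5^2)
def log_f(x_new, x_old):
    return -0.5 * ((x_new - 0.9 * x_old) / 0.5) ** 2

x_prev = np.array([-2.0, 0.0, 2.0])   # made-up particles at time t-1
log_w_prev = np.zeros(3)              # equal log-weights
probs = ancestor_weights(log_w_prev, x_prev, x_ref=1.8, log_f=log_f)
ancestor = rng.choice(3, p=probs)     # new ancestor index for the reference
```

Resampling the ancestor this way breaks the reference trajectory into pieces within a single forward sweep, which is the source of the improved mixing relative to plain particle Gibbs.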
Prop 10  Computational and Methodological Challenges in evidence synthesis and multi-step (modular) models (top)
 Organizers: Prof. Nicky Best (Imperial College London, UK) and
Prof. Sylvia Richardson (MRC Biostatistics Unit and Univ. of
Cambridge, UK)
Bayesian
graphical models offer a very flexible and coherent framework for
building complex joint models that link together several
submodels. The submodels can represent different features of the
global model, such as a measurement error or missing data or bias
component linked to an analysis model of interest, or different
sources of information (data sets or studies) in a meta-analysis
or evidence synthesis. More generally, whenever inputs into a
model of interest are themselves unknown or uncertain, we may wish
to build one or more submodels to predict these and propagate
uncertainty to the main model of interest. In principle, the full
joint posterior distribution of such models can be estimated using
an appropriate flavour of MCMC. In practice, however, the samplers
can run into convergence and mixing problems, particularly if the
different submodels provide conflicting information about certain
parameters, or unknown parameters in one submodel are confounded
with unknown parameters in another, leading to lack of
identifiability. There may also be conceptual difficulties with
the joint model implied by linking together several submodels in
a Bayesian graphical model. For example, we may have more
confidence in some submodels than others (e.g. there may be a
sound scientific basis for some submodels whereas others may be
more speculative); the likelihood contribution from one submodel
may dominate all the others (due to imbalance in the quantity
and/or quality of the data sources informing different
submodels); or there may be good scientific reasons to wish to
keep estimation of submodels separate, yet still allow
propagation of uncertainty between them. In this session, speakers
will discuss some scenarios in which conventional MCMC estimation
of the joint posterior distribution of a “modular” Bayesian model
creates practical or conceptual problems, and present various
alternative computational strategies designed to approximate full
Bayesian inference in such circumstances.
 Speakers and provisional titles
 Martyn Plummer
(Infections and Cancer Epidemiology Group, IARC, Lyon)
 Title: Cuts in graphical models [slides]
 Abstract: The WinBUGS cut function and its associated modified Gibbs sampling algorithm are designed to compartmentalize information flow in graphical models. In particular, cuts prevent "feedback" of information in models where the data collection comes in two phases, such as measurement error models and PK/PD models. Outside of its implementation in WinBUGS, the cut function has been reinvented several times, typically in an attempt to overcome convergence problems associated with Gibbs sampling.
I will show that the cut function in its current form does not work correctly: the limiting distribution depends on the update method being used. However, it can be modified with an additional Metropolis-Hastings acceptance step, and the result is equivalent to using multiple imputation. The connection to multiple imputation gives some insight into how cuts can be used for consistent estimation.
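To make the "cut" idea concrete, here is a toy two-module example in the multiple-imputation reading: the first module's parameter is imputed from its own posterior only, with no feedback from the second module's data. This is a naive sketch of the cut distribution on a conjugate normal example, not Plummer's corrected sampler; all models and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# Module 1: y1 ~ N(phi, 1); under a flat prior, p(phi | y1) = N(ybar1, 1/n).
y1 = rng.normal(2.0, 1.0, size=50)
phi_draws = rng.normal(y1.mean(), 1.0 / np.sqrt(len(y1)), size=200)
# The cut: phi is imputed from module 1 alone; y2 never feeds back into phi.

# Module 2: y2 ~ N(phi + theta, 1); given phi, theta is conjugate normal.
y2 = rng.normal(2.0 + 1.0, 1.0, size=50)

# Multiple-imputation reading of the cut distribution: for each imputed phi,
# draw theta from p(theta | phi, y2), then pool the draws.
theta_draws = np.array([
    rng.normal(y2.mean() - phi, 1.0 / np.sqrt(len(y2)))
    for phi in phi_draws
])
theta_mean = theta_draws.mean()   # should sit near the true value 1.0
```

A naive Gibbs sampler that alternated phi and theta updates would instead let y2 pull phi away from its module-1 posterior, which is exactly the feedback the cut is meant to block.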
 David Lunn (MRC
Biostatistics Unit, UK) [webpage]
 Title: Two-stage approaches to fully Bayesian hierarchical modelling [slides]
 Abstract: We present a novel and efficient MCMC method to facilitate the analysis of Bayesian hierarchical models in two stages. Thus we benefit from the convenience and flexibility of a two-stage approach, but the full hierarchical model, with feedback/shrinkage/borrowing-of-strength and no approximations, is fitted. The first stage of our method estimates independent posterior distributions for the units under investigation, e.g. patients or clinical studies. These are then used as "proposal distributions" for the relevant parameters in the full hierarchical model during stage two (in which no likelihood evaluations are required). We identify three situations in which such an approach is particularly effective: (i) when unit-specific analyses are complex and/or time-consuming; (ii) when there are several/numerous models or parameters of interest (e.g. covariate selection); and (iii) when the parameters of interest are complex functions of the 'natural' parameters. The two-stage Bayesian approach closely reproduces a one-stage analysis when it can be undertaken, but can also be easily carried out when a one-stage approach is difficult or impossible. The method is implemented in the freely available BUGS software, and we illustrate its use and performance with insulin-kinetic data from pregnant women with type 1 diabetes. We also explore the potential role of such methods in general evidence synthesis.
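The reason no likelihood evaluations are needed in stage two: when the stage-one posterior (obtained under a flat or known stage-one prior) is used as an independence proposal, the likelihood terms cancel from the Metropolis-Hastings ratio, leaving only prior and proposal factors. A minimal normal-normal sketch under that assumption; all numbers are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(4)

def log_norm(x, m, s):               # normal log-density up to a constant
    return -0.5 * ((x - m) / s) ** 2 - np.log(s)

# Stage 1: suppose one unit's posterior under a flat prior was summarised
# (e.g. from an earlier, possibly expensive, per-unit run) as N(m1, s1^2).
m1, s1 = 2.0, 0.3

# Stage 2: hierarchical prior theta ~ N(mu, tau^2).  Re-sample the unit's
# theta using the stage-1 posterior as an independence proposal; because the
# proposal equals the unit's likelihood here, likelihood and proposal terms
# cancel and the acceptance ratio involves only the hierarchical prior.
mu, tau = 1.0, 1.0
theta, draws = m1, []
for _ in range(20000):
    prop = rng.normal(m1, s1)
    log_alpha = log_norm(prop, mu, tau) - log_norm(theta, mu, tau)
    if np.log(rng.uniform()) < log_alpha:
        theta = prop
    draws.append(theta)
draws = np.asarray(draws)

# Exact full-model conditional is the usual normal-normal combination:
exact_prec = 1 / s1**2 + 1 / tau**2
exact_mean = (m1 / s1**2 + mu / tau**2) / exact_prec
```

The chain's mean and spread match the exact full-model conditional, showing that feedback from the hierarchical prior is recovered even though the unit's data are never revisited.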
 Christopher Paciorek
[webpage] (Department of Statistics,
University of California, Berkeley) and Perry de Valpine
(Department of Environmental Science, Policy, and
Management, University of California, Berkeley) [webpage]
 Title: Extensible software for fitting hierarchical models:
using the NIMBLE platform to integrate disparate sources
of global health data [slides]
 Abstract: Hierarchical modeling has become ubiquitous in statistics, while MCMC has become the default approach to fitting such models. At the same time, the literature has seen an explosion in techniques (MCMC-based and otherwise) for fitting, assessing, and comparing models. There has also been exploration of a variety of practical techniques for combining information, including modular model-building, empirical Bayes, and cutting feedback. We argue that further progress in using hierarchical models in applications and exploitation of the wealth of algorithms requires a new software strategy that allows users to easily explore models using a variety of algorithms, while allowing developers to easily disseminate new algorithms. We present a new R-based software platform, called NIMBLE, that is under development for this purpose and uses a BUGS-compatible language for model specification and a new R-like language for algorithm specification. We conclude with discussion of how the platform can help enable model exploration in the context of evidence synthesis and modular modeling, using an example from the area of paleoecology.
Prop 11  Computational methods for image analysis (top)
 Organizer: Matthew Moore
 Speakers:
 Lionel Cucala (Univ. Montpellier,
France) [webpage]
 Title: Bayesian inference on a mixture
model with spatial dependence
 Abstract: We introduce a new technique to select the number of labels of a mixture model with spatial dependence. It consists of an estimate of the Integrated Completed Likelihood based on a Laplace approximation, together with a new technique to handle the intractable normalizing constant of the hidden Potts model. Our proposal is applied to a real satellite image.
 Mark Huber (Claremont McKenna College,
USA) [webpage]
 Title: Perfect simulation for image
analysis [slides]
 Abstract: In this talk I will discuss perfect simulation for discrete and continuous auto-normal models for image analysis. For the continuous auto-normal model, monotonic CFTP can be shown to always converge quickly, while for discrete models the rate of convergence depends sharply on the influence of the prior. Perfect simulation can also be used with Swendsen-Wang-type chains, and partially recursive acceptance/rejection can be effective for a nontrivial class of models.
 Matthew Moore (Queensland
Univ. of Technology, Australia)
 Title: Precomputation for ABC
 Abstract: The existing algorithms for approximate Bayesian computation (ABC) assume that it is feasible to simulate pseudo-data from the model. However, the computational cost of these simulations can be prohibitive for high-dimensional data. An important example is the Potts model, which is commonly used in image analysis. The dimension of the state vector in this model is equal to the size of the data, which can be millions of pixels. In this talk I will show that the scalability of ABC-SMC can be improved by performing a precomputation step before model fitting. The output of this precomputation can be reused across multiple datasets.
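As a cartoon of the precomputation idea (a generic sketch, not the Potts-model construction of the talk): if the mapping from parameter to expected summary statistic is simulated once on a grid, later ABC runs can replace fresh simulations with table lookups, and the same table serves any observed dataset. The model, grid, and tolerance below are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy model with summary S(theta) = theta^2 + noise.  Pretend each simulation
# is expensive, so we precompute E[S] on a grid once, before seeing any data.
grid = np.linspace(0.0, 2.0, 41)
pre = np.array([(t**2 + rng.normal(0, 0.1, size=200)).mean() for t in grid])

def surrogate_summary(theta):
    # cheap stand-in for the simulator: interpolate the precomputed table
    return np.interp(theta, grid, pre)

# ABC rejection against an observed summary, reusing the table
s_obs = 1.0                                      # corresponds to theta near 1
theta_prior = rng.uniform(0.0, 2.0, size=100000)
keep = np.abs(surrogate_summary(theta_prior) - s_obs) < 0.05
posterior = theta_prior[keep]
```

A second dataset with a different s_obs would reuse `pre` unchanged, which is where the cross-dataset saving comes from.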
Prop 12
 Applications of MCMC (top)
 Organizer: Radu Craiu
 Speakers:
 Roberto Casarin, University Ca' Foscari, Venice, Italy
 Title: Beta-Product Dependent Pitman-Yor Processes for Bayesian Inference
 Abstract:
Multiple time series data may exhibit clustering over time, and the clustering effect may change across different series. This paper is motivated by the Bayesian nonparametric modelling of the dependence between clustering effects in multiple time series analysis. We follow a Dirichlet process mixture approach and define a new class of multivariate dependent Pitman-Yor processes (DPY). The proposed DPY are represented in terms of vectors of stick-breaking processes which determine dependent clustering structures in the time series. We follow a hierarchical specification of the DPY base measure to account for various degrees of information pooling across the series. We discuss some theoretical properties of the DPY and use them to define Bayesian nonparametric repeated measurement and vector autoregressive models. We provide efficient Markov chain Monte Carlo algorithms for posterior computation of the proposed models and illustrate the effectiveness of the method with a simulation study and an application to the United States and the European Union business cycles.
 Samuel
Wong, University of Florida, USA
 Title: Sequential
Monte Carlo methods in protein folding
 Abstract: Predicting the native structure of a protein from its amino-acid sequence is a long-standing problem. A significant
bottleneck of computational prediction is the lack of
efficient sampling algorithms to explore the configuration
space of a protein. In this talk we will introduce a
sequential Monte Carlo method to address this challenge:
fragment regrowth via energy-guided sequential sampling
(FRESS). The FRESS algorithm combines statistical learning
(namely, learning from the protein data bank) with
sequential sampling to guide the computation, resulting in a
fast and effective exploration of the configurations. We
will illustrate the FRESS algorithm with both lattice protein models and real proteins.
 Yuguo Chen, University of Illinois Urbana-Champaign, USA
 Title: Augmented Particle
Filters for State Space Models
 Abstract: We
describe a new particle filtering algorithm, called the
augmented particle filter (APF), for online filtering
problems in state space models. The APF combines
information from both the observation equation and the
state equation, and the state space is augmented to
facilitate the weight computation. Theoretical
justification of the
APF is provided, and the connection between the APF and
the optimal particle filter in some special state space
models is investigated. We apply the APF to a target
tracking problem and the Lorenz model to demonstrate the
effectiveness of the method.
Prop 13  Innovative Bayesian Computing in Astrophysics (top)
 Organizer: David A. van Dyk
Sophisticated Bayesian methods and computational techniques are becoming ever more important for solving statistical challenges in astrophysics and cosmology. This session describes a number of specially designed Bayesian models that have proven useful in astronomy, and how problems in astronomy have served as springboards for the development of new general Bayesian computational methods.
 Speakers:
 Title: Characterizing the Population of Extrasolar Planetary Systems with Kepler and Hierarchical Bayes
 Abstract: NASA's Kepler Mission was designed to search for small planets, including those in the habitable zone of sun-like stars. From 2009 to 2013, a specially designed 0.95-meter diameter telescope in solar orbit observed over 160,000 stars nearly continuously, once every 1 or 30 minutes. By measuring the decrease in brightness when a planet passes in front of its host star, Kepler has identified over 3400 strong planet candidates, most with sizes between that of Earth and Neptune. Kepler has revolutionized our knowledge of small planets, but it has also raised several new statistical challenges, particularly in regard to characterizing the intrinsic population of extrasolar planetary systems in light of a variety of detection limitations and biases.
We present results of Bayesian hierarchical models applied to characterize the true distributions of physical and orbital properties of exoplanets. Starting with simple population models, we compare approximations to the posterior distribution generated using multiple algorithms, including Markov chain Monte Carlo and Approximate Bayesian Computation. We discuss the implications for accuracy and performance when applying hierarchical Bayes to more realistic, complex and higherdimensional models of the planet population.
This research builds on collaborations between astronomers and statisticians forged during a three-week workshop on "Modern Statistical and Computational Methods for Analysis of Kepler Data" at SAMSI in June 2013.
 Title: Bayesian Exoplanet Hunting with NASA's Kepler Mission
 Abstract: In this talk we introduce NASA's Kepler mission, and its stated goal of searching for "Habitable Planets" outside our solar system (known as "habitable exoplanets"). To detect planets, Kepler uses tremendously precise photometry to monitor a large number of stars. Time series of these stars show a signature "dip" when a planet passes in front of the star, allowing statistical procedures to estimate properties of the exoplanet. We introduce a wavelet-based Bayesian model for detecting and modeling exoplanets in the presence of large instrumental and astrophysical "noise", and describe the challenges of the resulting MCMC algorithms. Simulation studies and preliminary results will be provided to illustrate the performance of the method.
 Title: Novel Bayesian approaches to supernova type Ia cosmology [slides]
 Abstract: Supernovae of type Ia are a special type of stellar explosion, whose intrinsic brightness can be standardised by exploiting empirically discovered correlations. This allows astronomers to use the observed apparent brightness to reconstruct their distance, which in turn depends on the expansion history of the Universe. The goal is to infer
cosmological parameters of interest such as the dark matter density in
the Universe and the dark energy density (and its time evolution).
In this talk I will present the statistical challenges that this
problem poses, and some of the Bayesian methods that have been
recently developed to meet them. I will discuss the use of highly
structured hierarchical methods to infer cosmological parameters from
the output of light curve fitters. I will also present some recent
ideas about replacing the entire inference chain with a fully Bayesian
approach, and how this can also be extended to automatic
classification of various supernova types. Computational challenges (and solutions) will also be presented.
 Title: Cosmological Parameter Estimation [slides]
 Abstract: Cosmological parameter estimation is a critical part of modern cosmology, but it is computationally challenging. We will review how Markov chain Monte Carlo methods have been introduced to solve this problem. Furthermore, we will discuss how ensemble sampling algorithms can be efficiently parallelized, and point out why this is relevant for analyzing the increasingly complex datasets of current and future observations.
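The ensemble samplers alluded to here are typified by the Goodman and Weare affine-invariant "stretch move" (the basis of the emcee package widely used in cosmology): each walker proposes a move along the line through a randomly chosen complementary walker, and a z^(d-1) factor in the acceptance keeps the target invariant. A serial sketch on a 2-D Gaussian stand-in target; production codes parallelize by updating one half of the ensemble against the other.

```python
import numpy as np

rng = np.random.default_rng(6)

def log_target(x):                        # stand-in posterior: N(0, I) in 2-D
    return -0.5 * np.sum(x**2, axis=-1)

n_walkers, dim, a = 50, 2, 2.0
walkers = rng.normal(size=(n_walkers, dim))
samples = []
for _ in range(2000):
    for j in range(n_walkers):
        k = rng.integers(n_walkers - 1)   # complementary walker, k != j
        if k >= j:
            k += 1
        # stretch factor z with density g(z) proportional to 1/sqrt(z) on [1/a, a]
        z = ((a - 1.0) * rng.uniform() + 1.0) ** 2 / a
        prop = walkers[k] + z * (walkers[j] - walkers[k])
        log_alpha = (dim - 1) * np.log(z) + log_target(prop) - log_target(walkers[j])
        if np.log(rng.uniform()) < log_alpha:
            walkers[j] = prop
    samples.append(walkers.copy())

samples = np.asarray(samples)[500:].reshape(-1, dim)   # drop burn-in, flatten
```

Because the proposal is built from the ensemble itself, the sampler is invariant under affine transformations of the target, which removes the need for hand-tuned proposal scales in strongly correlated parameter spaces.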
Prop 14  Differential geometry for
Monte Carlo algorithms (top)
 Organizer: Mark Girolami
Adopting
the tools of differential geometry provides the means to develop
and analyse new MCMC methods that exploit local and global
information about the underlying statistical model. Such
analysis has already highlighted the fundamental geometric
principles of Hybrid / Hamiltonian Monte Carlo by identifying
the equivalence between proposal mechanisms based on discrete
Hamiltonian flows and local geodesic flows on manifolds.
Such geodesics and optimal flows indicate a deeper connection with optimal transport theory in the design of MCMC methods, with recent work by Marzouk and Reich hinting at these connections.
The emerging field of differential geometric MCMC is now
starting to exploit the many exotic structures available to
address open problems in MCMC, such as efficient sampling in hierarchical Bayesian models and in distributions themselves defined on manifolds, for example the Dirichlet and Bingham distributions.
This session will provide an opportunity to present advances in
these two new areas of MCMC research and explore the deep
connections between them both.
The invited speakers have been carefully chosen to provide a
spread of background and experience but ultimately to provide an
opportunity to explore these emerging themes and their
interconnections.
 Speakers:
 Sebastian
Reich, University of Potsdam, Germany
 Title: Particle filters for infinitedimensional systems: combining localization and optimal transportation [slides]
 Abstract: Particle filters, or sequential Monte Carlo methods, are powerful tools for adjusting model state to data. However, they suffer from the curse of dimensionality and have not yet found widespread
application in the context of spatiotemporal evolution models. On the
other hand, the ensemble Kalman filter with its simple Gaussian
approximation has successfully been applied to such models using the
concept of localization. Localization allows one to account for a
spatial decay of correlation in a filter algorithm. In my talk, I will
propose novel particle filter implementations which are suitable for
localization and, as the ensemble Kalman filter, fit into the broad
class of linear transform filters. In the case of a particle filter this transformation is determined by ideas from optimal transportation, while in the case of the ensemble Kalman filter one essentially relies on the linear Kalman update formulas. This common framework also allows
for a mixture of particle and ensemble Kalman filters.
Numerical results will be provided for the Lorenz-96 model, which is a crude model of nonlinear advection.
 Youssef Marzouk, Massachusetts Institute of Technology, USA
 Title: Bayesian inference with optimal transport maps
 Abstract: We present a new approach to Bayesian inference that entirely avoids Markov chain Monte Carlo simulation, by constructing a deterministic map that pushes forward the prior measure (or another reference measure) to the posterior measure. Existence and uniqueness of a suitable measurepreserving map is established by formulating the problem in the context of optimal transport theory. We discuss various means of explicitly parameterizing the map and computing it efficiently through solution of a stochastic optimization problem; in particular, we use a sample average approximation approach that exploits gradient information from the likelihood function when available. The resulting scheme overcomes many computational bottlenecks associated with Markov chain Monte Carlo; advantages include analytical expressions for posterior moments, clear convergence criteria for posterior approximation, the ability to generate arbitrary numbers of independent samples, and automatic evaluation of the marginal likelihood to facilitate model comparison. We evaluate the accuracy and performance of the scheme on a wide range of statistical models, including hierarchical models, highdimensional models arising in spatial statistics, and parameter inference in partial differential equations.
 Simon Byrne, University of Cambridge, UK
 Title: Geodesic Hamiltonian Monte Carlo on Manifolds
 Abstract: Statistical problems often involve probability distributions on nonEuclidean manifolds. For instance, the field of directional statistics utilises distributions over circles, spheres and tori. Many dimensionreduction methods utilise orthogonal matrices, which form a natural manifold known as a Stiefel manifold. Unfortunately, it is often difficult to construct methods for independent sampling from such distributions, as the normalisation constants are often intractable, which means that standard approaches such as rejection sampling cannot be easily implemented. As a result, Markov chain Monte Carlo (MCMC) methods are often used, however even simple methods such as Gibbs sampling and random walk Metropolis require complicated reparametrisations and need to be specifically adapted to each distributional family of interest.
I will demonstrate how the geodesic structure of the manifold (such as "great circle" rotations on spheres) can be exploited to construct efficient methods for sampling from such distributions via a Hamiltonian Monte Carlo (HMC) scheme. These methods are very flexible and straightforward to implement, requiring only the ability to evaluate the unnormalised logdensity and its gradients.
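As an illustration of the geodesic idea (a sketch consistent with, but not taken from, the talk): on the unit sphere the geodesic flow is an exact great-circle rotation, so the HMC leapfrog can alternate tangent-space gradient kicks with exact geodesic moves. Below this is applied to a von Mises-Fisher density, whose normalising constant is never needed; the concentration and mean direction are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

mu = np.array([0.0, 0.0, 1.0])       # mean direction
kappa = 10.0                         # concentration

def log_p(x):                        # von Mises-Fisher on S^2, unnormalised
    return kappa * mu @ x

def grad(x):                         # gradient of log_p (constant here)
    return kappa * mu

def project(x, v):                   # project v onto the tangent space at x
    return v - (x @ v) * x

def geodesic_flow(x, v, t):          # exact great-circle motion on the sphere
    s = np.linalg.norm(v)
    if s < 1e-12:
        return x, v
    xn = x * np.cos(s * t) + (v / s) * np.sin(s * t)
    vn = -x * s * np.sin(s * t) + v * np.cos(s * t)
    return xn, vn

eps, L = 0.1, 10
x = np.array([1.0, 0.0, 0.0])
samples = []
for _ in range(5000):
    v = project(x, rng.normal(size=3))          # tangent-space momentum
    x0, h0 = x, -log_p(x) + 0.5 * v @ v
    for _ in range(L):                          # geodesic leapfrog integrator
        v = v + 0.5 * eps * project(x, grad(x))
        x, v = geodesic_flow(x, v, eps)
        v = v + 0.5 * eps * project(x, grad(x))
    h1 = -log_p(x) + 0.5 * v @ v
    if not np.log(rng.uniform()) < h0 - h1:
        x = x0                                  # reject: stay at the old point
    samples.append(x)

samples = np.asarray(samples)
```

Every sample lies exactly on the sphere by construction, and the sample mean of the component along mu approaches the known value coth(kappa) - 1/kappa, about 0.9 for kappa = 10.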
 Michael
Betancourt, Massachusetts Institute of Technology, USA
 Title: Optimal Tuning of Numerical Integrators for Hamiltonian Monte Carlo
 Abstract: Leveraging techniques from differential geometry, Hamiltonian Monte Carlo generates Markov chains that explore the target distribution extremely efficiently, even in highdimensions.
Implementing Hamiltonian Monte Carlo in practice, however, requires a numerical integrator and a step size on which the performance of the algorithm depends.
Building on the work of Beskos et al., I show how the step-size-dependent cost of an HMC transition can be bounded both below and above, and how these bounds can be computed for a wide class of numerical integrators.
Prop 15  Sampling and data
assimilation for large models (top)
 Organizer: Heikki Haario
While
Monte Carlo methods are becoming routine in moderately low
dimensions, models with high dimensional unknowns or high
CPU demands still pose a serious challenge. This session presents ways to circumvent the 'curse of dimensionality' of standard
MCMC methods: one may formulate the sampling directly in
infinite dimensional function spaces, avoid MCMC by optimal
maps, or resort to optimization algorithms with randomized data
and prior. Applications include high dimensional inverse
problems as well as state estimation of large dynamical models.
 Speakers:
 Kody Law, University of Warwick, UK
 Title: Dimension-independent likelihood-informed MCMC samplers
 Abstract: Many Bayesian inference problems require exploring the posterior distribution of high-dimensional parameters, which in principle can be described as functions. Formulating algorithms which are defined on function space yields dimension-independent algorithms. By exploiting the intrinsic low dimensionality of the likelihood function, we introduce a newly developed suite of proposals for the Metropolis-Hastings MCMC algorithm that can adapt to the complex structure of the posterior distribution, yet are defined on function space. I will present numerical examples indicating the efficiency of these dimension-independent likelihood-informed samplers. I will also present some applications of function-space samplers to problems relevant to numerical weather prediction and subsurface reconstruction.
 Patrick Conrad, Massachusetts Institute of Technology, USA
 Title: Asymptotically Exact MCMC Algorithms for Computationally Expensive Models via Local Approximations
 Abstract: We construct a new framework for accelerating MCMC algorithms for sampling from posterior distributions in the context of computationally intensive models. We proceed by constructing local surrogates of the forward model within the Metropolis-Hastings kernel, borrowing ideas from deterministic approximation theory, optimization, and experimental design. Our work builds upon previous work in surrogate-based inference by exploiting useful convergence characteristics of local surrogates. We prove the ergodicity of our approximate Markov chain and show that asymptotically it samples from the exact posterior density of interest. We describe variations of the algorithm that construct either local polynomial approximations or Gaussian process regressors, thus spanning two important classes of surrogate models. Numerical experiments demonstrate significant reductions in the number of forward model evaluations used in representative ODE or PDE inference problems, in both real and synthetic data examples.
This is joint work with Youssef Marzouk, Natesh Pillai, and Aaron Smith.
 Antti Solonen, Lappeenranta University of Technology, FI
 Title: Optimizationbased sampling and dimension reduction for nonlinear inverse problems
 Abstract: High-dimensional nonlinear inverse problems pose a challenge for MCMC samplers. In this talk, we present two ways to improve sampling efficiency for nonlinear forward models with Gaussian likelihood and
prior. First, we present an optimization-based sampling approach, where candidate samples are generated by randomly perturbing the data and the prior, and repeatedly solving for the corresponding MAP estimate. We derive
the probability density for this candidate generating mechanism, and
use it as a proposal density in Metropolis and importance sampling
schemes. Secondly, we discuss how the dimension in MCMC sampling can be
reduced by applying the nonlinear model only in directions where the
likelihood dominates the prior. We demonstrate the efficiency of these
approaches with various numerical examples, including inverse
diffusion, atmospheric remote sensing and electrical impedance
tomography.
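The perturb-and-optimize idea is easiest to see in the linear-Gaussian case, where re-solving the MAP problem with randomly perturbed data and prior draw yields exact posterior samples; for nonlinear models the same mechanism instead supplies the proposal density described in the abstract. A minimal linear sketch, with all dimensions and noise levels hypothetical:

```python
import numpy as np

rng = np.random.default_rng(8)

# Linear inverse problem y = A x + e, e ~ N(0, s^2 I), prior x ~ N(0, I).
n, d, s = 30, 5, 0.5
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
y = A @ x_true + rng.normal(0, s, size=n)

# Posterior is Gaussian: precision P = A^T A / s^2 + I,
# mean = P^{-1} A^T y / s^2.
P = A.T @ A / s**2 + np.eye(d)
post_mean = np.linalg.solve(P, A.T @ y / s**2)

# Perturb-and-optimize: perturb data and prior draw, re-solve the MAP
# normal equations; in the linear-Gaussian case each solve is an exact
# posterior sample.
draws = []
for _ in range(4000):
    y_pert = y + rng.normal(0, s, size=n)   # perturbed data
    x0_pert = rng.normal(size=d)            # draw from the prior
    rhs = A.T @ y_pert / s**2 + x0_pert
    draws.append(np.linalg.solve(P, rhs))
draws = np.asarray(draws)
```

The empirical mean and covariance of the draws match the analytic posterior mean and P^{-1}, confirming exactness in this special case; the nonlinear setting requires the Metropolis or importance-weight correction described above.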
Prop 16  Sequential Monte Carlo for
Static Learning (top)
 Organizer: Robert B. Gramacy
Sequential Monte Carlo (SMC) is primarily a tool for simulation-based inference
of time series and state space models, often in a Bayesian
context. This session outlines how many of the strengths of
SMC can be ported to static modeling frameworks (i.e.,
independent data modeling). Examples include the design of
experiments, optimization under uncertainty, big data problems
and online learning, variable selection and input sensitivity
analysis. A theme underlying each of these is that certain
drawbacks of the typical SMC framework, like large MC error in big data contexts, can be avoided explicitly or even leveraged (spinning a bug as a feature) due to the specific nature of the application at hand. Other SMC strengths, like embarrassing
parallelism or a natural tempering of the posterior
distribution, are played up in a big way to get better MC
properties compared to popular static inference techniques
like MCMC. In a nutshell, these talks aim to demonstrate the
power of dynamic thinking in an otherwise static environment.
 Speakers:
 Chris
Drovandi, Queensland University of Technology, AU
 Title: Sequential Monte Carlo Algorithms for Bayesian Sequential Design [slides]
 Abstract: Here I present sequential Monte Carlo (SMC) algorithms for solving
sequential Bayesian decision problems in the presence of parameter
and/or model uncertainty. The algorithm is computationally more
efficient than Markov chain Monte Carlo approaches and thus allows
investigation of the simulation properties of various utility
functions in a timely fashion. Furthermore, it is well known that SMC
provides convenient estimators of otherwise tricky quantities, such as
the model evidence, which allows for the fast estimation of popular
Bayesian utility functions for parameter estimation and model
discrimination. Extensions to sequential design algorithms for random
effects models and better ways for handling continuous design spaces
will also be discussed. This is joint work with Dr James McGree and
Professor Tony Pettitt of the Queensland University of Technology.
This research is supported by an Australian Research Council Discovery
Grant.
 Christoforos Anagnostopoulos, Imperial College London, UK
 Title: Information-theoretic data discarding for Dynamic Trees on Data Streams
 Abstract: Online inference is often the only computationally tractable method for analysing massive datasets. In such contexts, better exploration of the model space can be afforded by introducing a state-space formulation. This talk is focused on precisely such a proposal: an SMC algorithm for dynamic regression and classification trees, applied to static data. We introduce information-theoretic heuristics for data discarding that ensure the algorithm is truly online, in the sense that the computational requirements of processing an additional datapoint are constant in the size of the data already seen. We discuss the effect such heuristics have on the long-term behaviour of the SMC algorithm.
 Luke
Bornn, Harvard University, USA
 Title: Efficient Prior Sensitivity Analysis and Cross-validation
 Abstract: Prior sensitivity analysis and crossvalidation are important tools in
Bayesian statistics. However, due to the computational expense of
implementing existing methods, these techniques are rarely used. In
this talk I will show how it is possible to use sequential Monte Carlo
methods to create an efficient and automated algorithm to perform
these tasks. I will apply the algorithm to the computation of
regularization path plots and to assess the sensitivity of the tuning parameter in g-prior model selection, then demonstrate the algorithm
in a crossvalidation context and use it to select the shrinkage
parameter in Bayesian regression.
 Matt Pratola, Ohio State University, USA
 Title: Efficient Metropolis-Hastings Proposal Mechanisms for Bayesian Regression Tree Models
 Abstract: Bayesian regression trees are flexible nonparametric models that are well suited to many modern statistical regression problems. Many such tree models have been proposed, from the simple single-tree model to more complex tree ensembles. Their nonparametric formulation allows one to model datasets exhibiting complex nonlinear relationships between the model predictors and observations. However, the mixing behavior of the Markov chain Monte Carlo (MCMC) sampler is sometimes poor, frequently suffering from local-mode stickiness.
This is because existing Metropolis-Hastings proposals do not allow for efficient traversal of the model space. We develop novel Metropolis-Hastings proposals that account for the topological structure of regression trees. The first is a rule perturbation proposal, while the second we call tree rotation. The perturbation proposal can be seen as an efficient variation of the change proposal found in the existing literature. The novel tree rotation proposal requires only local changes to the regression tree structure, yet it efficiently traverses disparate regions of the model space along contours of equal probability. We implement these samplers for the Bayesian Additive Regression Tree (BART) model and demonstrate their effectiveness on a prediction problem from computer experiments and a computer model calibration problem involving CO2 emissions data.
