29 Jan 2016
This post by Andrew Gelman first appeared on Statistical Modeling, Causal Inference, and Social Science
Ben Goodrich writes:
The rstanarm R package, which has been mentioned several times on stan-users,
is now available in binary form on CRAN mirrors (unless you are using an old version of R and / or an old version of OSX).
It is an R package that comes with a few precompiled Stan models — which are called by R wrapper functions that have
the same syntax as popular model-fitting functions in R such as glm() — and some supporting R functions for working
with posterior predictive distributions.
The files in its
demo/ subdirectory, which can be called via the demo() function,
show how you can fit essentially all of the models in Gelman and Hill’s textbook
and rstanarm already offers more (although not strictly a superset of the) functionality in the arm R package.
The rstanarm package can be installed in the usual way with
install.packages("rstanarm")
which does not technically require the computer to have a C++ compiler if you are on Windows / Mac
(unless you want to build it from source, which might provide a slight boost to the execution speed).
The vignettes explain in detail how to use each of the model fitting functions in rstanarm.
However, the vignettes on the CRAN website
do not currently show the generated images, so call
browseVignettes("rstanarm") to view them locally. The help("rstanarm-package") and
help("priors") pages are also essential for understanding
what rstanarm does and how it works.
Briefly, there are several model-fitting functions:
* stan_lm() and stan_aov(), which just calls stan_lm(), use the same likelihood as lm() and aov() respectively but add regularizing priors on the coefficients.
* stan_polr() uses the same likelihood as MASS::polr() and adds regularizing priors on the coefficients and, indirectly, on the cutpoints. The stan_polr() function can also handle binary outcomes and can do scobit likelihoods.
* stan_glm() and stan_glm.nb() use the same likelihood(s) as glm() and MASS::glm.nb() respectively and provide a few options for priors.
* stan_lmer(), stan_glmer(), stan_glmer.nb() and stan_gamm4() use the same likelihoods as lme4::lmer(), lme4::glmer(), lme4::glmer.nb() and gamm4::gamm4() respectively and basically call stan_glm() but add regularizing priors on the covariance matrices that comprise the blocks of the block-diagonal covariance matrix of the group-specific parameters. The stan_[g]lmer() functions accept all the same formulas as lme4::[g]lmer() (and indeed use lme4’s formula parser), and stan_gamm4() accepts all the same formulas as gamm4::gamm4(), which can / should include smooth additive terms such as splines.
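To make the shared syntax concrete, here is a minimal sketch that fits the same logistic regression with glm() and with stan_glm(). The built-in mtcars data, the formula, the prior scale of 2.5, and the chains / iter / seed settings are all illustrative assumptions, not recommendations from this post.

```r
library(rstanarm)

# Classical fit with base R's glm(): maximum likelihood, no priors.
fit_glm <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))

# Same formula, data, and family, but estimated by MCMC with a
# regularizing normal prior on the slope. The prior scale (2.5), the
# number of chains, and the seed are assumptions chosen for illustration.
fit_stan <- stan_glm(am ~ wt, data = mtcars,
                     family = binomial(link = "logit"),
                     prior = normal(0, 2.5),
                     chains = 2, iter = 1000, seed = 123, refresh = 0)

coef(fit_glm)   # maximum likelihood estimates
coef(fit_stan)  # posterior medians
```

The only changes relative to the glm() call are the function name and the optional prior and sampler arguments, which is the point of the wrapper design.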
If the objective is merely to obtain and interpret results and one of the model-fitting functions in rstanarm is adequate
for your needs, then you should almost always use it. The Stan programs in the rstanarm package are better tested,
have incorporated a lot of tricks and reparameterizations to be numerically stable, and have more options than what
most Stan users would implement on their own. Also, all the model-fitting functions in rstanarm are integrated with
posterior_predict(), pp_check(), and loo(), which are somewhat tedious to implement on your own. Conversely,
if you want to learn how to write Stan programs, there is no substitute for practice, but the Stan programs in
rstanarm are not particularly well-suited for a beginner to learn from because of all their
tricks / reparameterizations / options.
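As a rough sketch of what that integration looks like in practice, the following fits a small linear model and then calls rstanarm's post-estimation helpers directly on the fitted object. The model (mpg ~ wt + hp on mtcars) and the sampler settings are assumptions made for the example.

```r
library(rstanarm)

# A small illustrative model; formula and sampler settings are assumptions.
fit <- stan_glm(mpg ~ wt + hp, data = mtcars,
                chains = 2, iter = 1000, seed = 123, refresh = 0)

# Draws from the posterior predictive distribution:
# one row per posterior draw, one column per observation.
yrep <- posterior_predict(fit)
dim(yrep)

# Graphical posterior predictive check (observed y against replicated y).
pp_check(fit)

# Approximate leave-one-out cross-validation.
loo(fit)
```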
Feel free to file bugs and feature requests on the issue tracker of the rstanarm GitHub repository.
If you would like to make a pull request to add a model-fitting function to rstanarm, there is a pretty well-established path in the
code for how to do that but it is spread out over a bunch of different files. It is probably easier to contribute to rstanarm,
but some developers may be interested in distributing their own CRAN packages that come with precompiled Stan programs that are
focused on something besides applied regression modeling in the social sciences.
The Makefile and cleanup scripts in the rstanarm package show how this can be accomplished (which took weeks to figure out),
but it is easiest to get started by calling
rstan::rstan_package_skeleton(), which sets up the package structure and copies
some stuff from the rstanarm GitHub repository.
On behalf of Jonah who wrote half the code in rstanarm and the rest of the Stan Development Team who wrote the math library
and estimation algorithms used by rstanarm, we hope rstanarm is useful to you.
Also, Leon Shernoff pointed us to this post by Wayne Folta, delightfully titled “R Users Will Now Inevitably Become Bayesians,” introducing two new R packages for fitting Stan models: rstanarm and brms. Here’s Folta:
There are several reasons why everyone isn’t using Bayesian methods for regression modeling. One reason is that Bayesian modeling requires more thought . . . A second reason is that MCMC sampling . . . can be slow compared to closed-form or MLE procedures. A third reason is that existing Bayesian solutions have either been highly-specialized (and thus inflexible),
or have required knowing how to use a generalized tool like BUGS, JAGS, or Stan.
This third reason has recently been shattered in the R world by not one but two packages:
brms and rstanarm. Interestingly, both of these packages are elegant front ends to Stan, via
rstan and shinystan. . . . You can install
both packages from CRAN . . .
He illustrates with an example:
mm <- stan_glm (mpg ~ ., data=mtcars, prior=normal (0, 8))
mm #===> Results
stan_glm(formula = mpg ~ ., data = mtcars, prior = normal(0,
    8))

            Median MAD_SD
(Intercept) 11.7   19.1
cyl         -0.1    1.1
disp         0.0    0.0
hp           0.0    0.0
drat         0.8    1.7
wt          -3.7    2.0
qsec         0.8    0.8
vs           0.3    2.1
am           2.5    2.2
gear         0.7    1.5
carb        -0.2    0.9
sigma        2.7    0.4

Sample avg. posterior predictive
distribution of y (X = xbar):
         Median MAD_SD
mean_PPD 20.1    0.7
Note the more sparse output, which Gelman promotes. You can get more detail with
summary (br), and you can also use
shinystan to look at most everything that a Bayesian regression can give you. We can look at the values and CIs of the coefficients with
plot (mm), and we can compare posterior sample distributions with the actual distribution with:
pp_check (mm, "dist", nreps=30):
This is all great. I’m looking forward to never having to use lm, glm, etc. again. I like being able to put in priors (or, if desired, no priors) as a matter of course, to switch between mle/penalized mle and full Bayes at will, to get simulation-based uncertainty intervals for any quantities of interest, and to be able to build out my models as needed.
Original post at http://andrewgelman.com/2016/01/14/rstanarm-and-more/
11 Aug 2015
ShinyStan provides immediate, informative, customizable visual and
numerical summaries of model parameters and convergence diagnostics for
MCMC simulations. The ShinyStan graphical user interface is available
via the shinystan R package. Try the online demo.
Installing the shinystan R package
Install from CRAN:
install.packages("shinystan")
If this fails, try adding the arguments
Install from GitHub (requires devtools package):
devtools::install_github("stan-dev/shinystan", build_vignettes = TRUE)
To take advantage of all the features in the shinystan package, it is also
recommended to install the shinyapps
package. You can do this by running
Applied Bayesian data analysis is primarily implemented through the MCMC
algorithms offered by various software packages. When analyzing a posterior sample
obtained by one of these algorithms the first step is to check for signs that
the chains have converged to the target distribution and also for signs that
the algorithm might require tuning or might be ill-suited for the given model.
There may also be theoretical problems or practical inefficiencies with the
specification of the model.
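For readers who have not computed a convergence diagnostic by hand, here is a minimal base R sketch of the classic potential scale reduction factor (Rhat) for one parameter. It is a simplified textbook version written for illustration; the function name rhat_basic is hypothetical, the draws are simulated rather than taken from a real sampler, and the Rhat reported by Stan uses a refined split-chain variant.

```r
# Simulated stand-in for MCMC output: 1000 iterations (rows) x 4 chains (columns).
set.seed(1)
draws <- matrix(rnorm(4000), nrow = 1000, ncol = 4)

# Hypothetical helper: basic (non-split) potential scale reduction factor.
rhat_basic <- function(draws) {
  n <- nrow(draws)                      # iterations per chain
  B <- n * var(colMeans(draws))         # between-chain variance
  W <- mean(apply(draws, 2, var))       # average within-chain variance
  var_plus <- (n - 1) / n * W + B / n   # pooled estimate of the posterior variance
  sqrt(var_plus / W)
}

rhat_basic(draws)   # close to 1 when the chains agree

# Shift one chain to mimic chains that have not converged to the same distribution.
bad <- draws
bad[, 1] <- bad[, 1] + 3
rhat_basic(bad)     # clearly above 1
```

Values near 1 are consistent with the chains sampling the same distribution; values well above 1 indicate that the between-chain variance is large relative to the within-chain variance.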
ShinyStan provides interactive plots and tables helpful for analyzing a
posterior sample, with particular attention to identifying potential problems
with the performance of the MCMC algorithm or the specification of the model.
ShinyStan is powered by RStudio’s Shiny web application framework and works with
the output of MCMC programs written in any programming language (and has extended
functionality for models fit using RStan
and the No-U-Turn sampler).
Saving and deploying (sharing)
The shinystan package allows you to store the basic components of an entire
project (code, posterior samples, graphs, tables, notes) in a single object.
Users can save many of the plots as ggplot2 objects for further customization
and easy integration in reports or post-processing for publication.
The new version of shinystan also provides the
deploy_shinystan() function, which lets you easily deploy your own ShinyStan apps online using RStudio’s
ShinyApps service for any of
your models. Each of your apps (each of your models) will have a unique url
and is compatible with Safari, Firefox, Chrome, and most other browsers.
The shinystan R package and ShinyStan interface are open source licensed under
the GNU General Public License, version 3 (GPLv3).
19 Mar 2015
This post first appeared on Statistical Modeling, Causal Inference, and Social Science
As a project for Andrew's Statistical Communication and Graphics graduate course at Columbia, a few of us (Michael Andreae, Yuanjun Gao, Dongying Song, and I) had the goal of giving RStan's print and plot functions a makeover. We ended up getting a bit carried away and instead we designed a graphical user interface for interactively exploring virtually any Bayesian model fit using a Markov chain Monte Carlo algorithm.
The result is shinyStan, a package for R and an app powered by Shiny. The full version of shinyStan v1.0.0 can be downloaded as an R package from the Stan Development Team GitHub page here, and we have a demo up online here. If you're not an R user, we're working on a full online version of shinyStan too.
For me, there are two primary motivations behind shinyStan:
1) Interactive visual model exploration
- Immediate, informative, customizable visual and numerical summaries of model parameters and convergence diagnostics for MCMC simulations.
- Good defaults with many opportunities for customization.
2) Convenient saving and sharing
- Store the basic components of an entire project (code, posterior samples, graphs, tables, notes) in a single object.
- Export graphics into R session as ggplot2 objects for further customization and easy integration in reports or post-processing for publication.
There’s also a third thing that has me excited at the moment. That online demo I mentioned above… well, since you’ll be able to upload your own data soon enough and even add your own plots if we haven’t included something you want, imagine an interactive library of your models hosted online. I’m imagining something like this except, you know, finite, useful, and for statistical models instead of books. (Quite possibly with fewer paradoxes too.) So it won’t be anything like Borges’ library, but I couldn’t resist the chance to give him a shout-out.
Finally, for those of you who haven’t converted to Stan quite yet, shinyStan is agnostic when it comes to inputs, which is to say that you don’t need to use Stan to use shinyStan (though we like it when you do). If you’re a Jags or Bugs user, or if you write your own MCMC algorithms, as long as you have some simulations in an array, matrix, mcmc.list, etc., you can take advantage of shinyStan.
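A minimal sketch of that workflow, assuming the function names in the current CRAN release of the shinystan package (as.shinystan() and launch_shinystan(); the v1.0.0 release described here may have spelled them differently). The simulated array and the parameter names alpha, beta, and sigma are made up for the example and stand in for draws from any sampler.

```r
library(shinystan)

# Fake "MCMC output" from some non-Stan sampler:
# a 3-D array of iterations x chains x parameters.
set.seed(1)
sims <- array(rnorm(500 * 2 * 3), dim = c(500, 2, 3),
              dimnames = list(NULL, NULL, c("alpha", "beta", "sigma")))

# Convert the array to a shinystan object.
sso <- as.shinystan(sims)

# Open the interactive app in a browser (commented out so the
# example can run non-interactively).
# launch_shinystan(sso)
```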
If you haven’t stopped reading yet and want a more detailed list of features, release notes are below. But why read the notes when you can try it out right now! And if you do try it out, we’d love your feedback.
How to Get It
Interactive and customizable plots:
* Parameter estimates
* Traceplots for individual or multiple parameters
* Autocorrelation for individual or multiple parameters
* Bivariate scatterplots
* (New) Trivariate scatter plots (using three.js library)
* (New) Distributions of Rhat, effective sample size / total sample size, Monte Carlo error / posterior sd
Customizable tables (via jQuery DataTables)
* Posterior summary statistics (can now search table with regular expressions for easier filtering)
* Average, max and min of sampler parameters (for NUTS and HMC algorithms)
* In addition to stanfit objects, you can also use arrays of simulations and mcmc.lists with shinyStan
* Model code is viewable in the shinyStan GUI
* Save notes about your model
* Save plots as ggplot2 objects (i.e. not just the image but an object that can be edited with functions from the ggplot2 package)
* Glossaries with definitions of terms used in the tables
* Generate new quantities as a function of one or two existing parameters
Coming soon to your local shinyStan:
* Graphical posterior predictive checks
* shinyStan online
* Deploy a shinyStan app for each of your models online to shinyapps.io (Start an online library of your models)
* Add your own custom plots to a shinystan object so you can really store everything in one object