Stan (software)

{{Short description|Probabilistic programming language for Bayesian inference}}

{{Other uses|Stan (disambiguation)}}

{{Infobox software

| name = Stan

| logo = Stan (programming) logo.png

| author = Stan Development Team

| released = {{Start date|2012|8|30}}

| discontinued =

| latest release version = {{wikidata|property|edit|reference|P348}}

| latest release date = {{start date and age|{{wikidata|qualifier|P348|P577}}}}

| programming language = C++

| operating system = Unix-like, Microsoft Windows, Mac OS X

| platform = Intel x86 - 32-bit, x64

| genre = Statistical package

| license = New BSD License

| website = {{URL|https://mc-stan.org/}}

}}

Stan is a probabilistic programming language for statistical inference written in C++.<ref>Stan Development Team. 2015. [https://github.com/stan-dev/stan/releases/download/v2.9.0/stan-reference-2.9.0.pdf Stan Modeling Language User's Guide and Reference Manual, Version 2.9.0]</ref> The Stan language is used to specify a (Bayesian) statistical model with an imperative program calculating the log probability density function.

Stan is licensed under the New BSD License. Stan is named in honour of Stanislaw Ulam, pioneer of the Monte Carlo method.

Stan was created by a development team consisting of 52 members{{Cite news|url=http://mc-stan.org/about/team/|title=Development Team|work=stan-dev.github.io|access-date=2024-11-21|language=en-US}} that includes Andrew Gelman, Bob Carpenter, Daniel Lee, Ben Goodrich, and others.

Example

A simple linear regression model can be described as y_n = \alpha + \beta x_n + \epsilon_n, where \epsilon_n \sim \text{normal}(0, \sigma). This can also be expressed as y_n \sim \text{normal}(\alpha + \beta x_n, \sigma). The latter form can be written in Stan as follows:

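data {
  int<lower=0> N;        // number of observations
  vector[N] x;           // predictor
  vector[N] y;           // outcome
}
parameters {
  real alpha;            // intercept
  real beta;             // slope
  real<lower=0> sigma;   // error scale, must be positive
}
model {
  y ~ normal(alpha + beta * x, sigma);   // vectorized likelihood
}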
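The log density that such a model block defines can be illustrated outside Stan. The following Python sketch is only an illustration, not Stan's implementation: the function names and the toy data are hypothetical, and it ignores both the additive constants that Stan's `~` statement may drop and the change of variables Stan applies to constrained parameters.

```python
import math

def normal_lpdf(y, mu, sigma):
    """Log density of a normal distribution with mean mu and standard deviation sigma."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def regression_log_density(alpha, beta, sigma, x, y):
    """Sum of per-observation log densities for y_n ~ normal(alpha + beta * x_n, sigma)."""
    if sigma <= 0:
        return -math.inf  # outside the support of the scale parameter
    return sum(normal_lpdf(y_n, alpha + beta * x_n, sigma)
               for x_n, y_n in zip(x, y))

# Hypothetical toy data lying exactly on the line y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]

exact = regression_log_density(1.0, 2.0, 1.0, x, y)  # parameters that fit the data
worse = regression_log_density(0.0, 2.0, 1.0, x, y)  # intercept off by one
print(exact, worse)
```

Parameter values that fit the data better receive a higher log density, which is the quantity the inference algorithms explore.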

Interfaces

The Stan language itself can be accessed through several interfaces:

In addition, higher-level interfaces are provided with packages using Stan as backend, primarily in the R language:{{cite web |last1=Gabry |first1=Jonah |title=The current state of the Stan ecosystem in R |url=https://statmodeling.stat.columbia.edu/2018/04/24/current-state-stan-ecosystem-r/ |website=Statistical Modeling, Causal Inference, and Social Science |accessdate=25 August 2020 |ref=stanecosystem}}

  • rstanarm provides a drop-in replacement for frequentist models provided by base R and lme4 using the R formula syntax;
  • brms{{Cite web|url=https://cran.r-project.org/web/packages/brms/index.html|title = BRMS: Bayesian Regression Models using 'Stan'|date = 23 August 2021}} provides a wide array of linear and nonlinear models using the R formula syntax;
  • prophet provides automated procedures for time series forecasting.

Algorithms

Stan implements gradient-based Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference, stochastic, gradient-based variational Bayesian methods for approximate Bayesian inference, and gradient-based optimization for penalized maximum likelihood estimation.

  • MCMC algorithms:
  • Hamiltonian Monte Carlo (HMC)
  • No-U-Turn sampler{{cite journal
| last1 = Hoffman | first1 = Matthew D.
| last2 = Gelman | first2 = Andrew
| title = The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo
| journal = Journal of Machine Learning Research
| date = April 2014
| volume = 15
| pages = 1593–1623
| url = http://jmlr.org/papers/v15/hoffman14a.html
}} (NUTS), a variant of HMC and Stan's default MCMC engine

  • Variational inference algorithms:
  • Automatic Differentiation Variational Inference{{cite journal

|last1 = Kucukelbir|first1 = Alp|last2 = Ranganath|first2 = Rajesh|last3 = Blei|first3 = David M.|title = Automatic Variational Inference in Stan|date = June 2015|volume = 1506|issue = 3431|arxiv = 1506.03431|bibcode = 2015arXiv150603431K}}

  • Pathfinder: Parallel quasi-Newton variational inference{{cite journal | last1=Zhang | first1=Lu | last2=Carpenter | first2=Bob | last3=Gelman | first3=Andrew | last4=Vehtari | first4=Aki | year=2022 | title=Pathfinder: Parallel quasi-Newton variational inference | journal=Journal of Machine Learning Research | volume=23 | issue=306 | pages=1–49 }}
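As a rough illustration of how a gradient-based MCMC method uses the log density and its gradient, the following Python sketch implements plain Hamiltonian Monte Carlo with a fixed number of leapfrog steps for a one-dimensional target. It is not NUTS and not Stan's implementation: there is no adaptation of step size or path length, and the function names, default settings, and standard normal target are assumptions chosen for the example.

```python
import math
import random

def leapfrog(q, p, grad_log_density, step_size, num_steps):
    """Approximate Hamiltonian dynamics with the leapfrog integrator (1-D)."""
    p = p + 0.5 * step_size * grad_log_density(q)      # initial half step for momentum
    for i in range(num_steps):
        q = q + step_size * p                          # full step for position
        if i < num_steps - 1:
            p = p + step_size * grad_log_density(q)    # full step for momentum
    p = p + 0.5 * step_size * grad_log_density(q)      # final half step for momentum
    return q, p

def hmc_sample(log_density, grad_log_density, q0, num_draws,
               step_size=0.2, num_steps=10, seed=0):
    """Draw from a 1-D target using HMC with a Metropolis accept/reject step."""
    rng = random.Random(seed)
    draws, q = [], q0
    for _ in range(num_draws):
        p = rng.gauss(0.0, 1.0)                        # resample the momentum
        h_current = -log_density(q) + 0.5 * p * p      # potential + kinetic energy
        q_new, p_new = leapfrog(q, p, grad_log_density, step_size, num_steps)
        h_new = -log_density(q_new) + 0.5 * p_new * p_new
        # Accept with probability min(1, exp(h_current - h_new)).
        if rng.random() < math.exp(min(0.0, h_current - h_new)):
            q = q_new
        draws.append(q)
    return draws

# Illustrative target: a standard normal distribution.
def std_normal_lpdf(q):
    return -0.5 * q * q

def std_normal_grad(q):
    return -q

draws = hmc_sample(std_normal_lpdf, std_normal_grad, q0=0.0, num_draws=4000)
mean = sum(draws) / len(draws)
variance = sum((d - mean) ** 2 for d in draws) / len(draws)
print(mean, variance)
```

NUTS differs from this sketch mainly in choosing the trajectory length adaptively instead of fixing `num_steps` in advance.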

Automatic differentiation

Stan implements reverse-mode automatic differentiation to calculate gradients of the model, which are required by HMC, NUTS, L-BFGS, BFGS, and variational inference. The automatic differentiation within Stan can also be used outside of the probabilistic programming language.
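The idea behind reverse-mode automatic differentiation can be sketched in a few lines of Python. The example below is a toy illustration under stated assumptions, not Stan's C++ library or its API: the `Var` class, the `log` helper, and the `backward` function are hypothetical names, and only addition, multiplication, and the logarithm are supported.

```python
import math

class Var:
    """A scalar node in a computation graph that records how it was computed."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # tuples of (parent node, local partial derivative)
        self.grad = 0.0          # filled in by backward()

    def __add__(self, other):
        other = _lift(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))
    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))
    __rmul__ = __mul__

def _lift(x):
    """Wrap plain numbers so they can take part in the graph."""
    return x if isinstance(x, Var) else Var(float(x))

def log(x):
    """Natural logarithm with its local derivative 1/x."""
    return Var(math.log(x.value), ((x, 1.0 / x.value),))

def backward(output):
    """Propagate derivatives from the output back to every input (the reverse sweep)."""
    order, seen = [], set()
    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)       # inputs are appended before the nodes that use them
    visit(output)
    output.grad = 1.0
    for node in reversed(order): # each node is processed after all nodes that use it
        for parent, local in node.parents:
            parent.grad += local * node.grad   # chain rule, summed over all uses

# f(x, y) = x * y + log(x), so df/dx = y + 1/x and df/dy = x.
x, y = Var(2.0), Var(3.0)
f = x * y + log(x)
backward(f)
print(f.value, x.grad, y.grad)
```

A single reverse sweep yields the derivative with respect to every input, which is why the reverse mode suits models with many parameters and a single scalar log density.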

Usage

Stan is used in fields including social science,<ref>Goodrich, Benjamin King, Wawro, Gregory and Katznelson, Ira, Designing Quantitative Historical Social Inquiry: An Introduction to Stan (2012). APSA 2012 Annual Meeting Paper. Available at {{SSRN|2105531}}</ref> pharmaceutical statistics,{{cite journal |last1=Natanegara|first1= Fanni |last2=Neuenschwander|first2= Beat |last3=Seaman|first3= John W. |last4=Kinnersley| first4=Nelson |last5=Heilmann| first5=Cory R. |last6=Ohlssen|first6=David |last7=Rochester|first7= George| title=The current state of Bayesian methods in medical product development: survey results and recommendations from the DIA Bayesian Scientific Working Group| journal=Pharmaceutical Statistics| issn=1539-1612| doi=10.1002/pst.1595|pmid= 24027093 | year=2013 |pages=3–12| volume=13 |issue=1|s2cid= 19738522 }} market research,{{cite web |last1=Feit |first1=Elea |title=Using Stan to Estimate Hierarchical Bayes Models |date=15 May 2017 |url=https://discourse.mc-stan.org/t/tutorial-on-using-stan-in-marketing-research/554 |accessdate=19 March 2019}} and medical imaging.{{cite journal | last1=Gordon | first1=GSD | first2=J |last2=Joseph | first3=MP | last3= Alcolea | first4=T | last4= Sawyer | first5=AJ | last5=Macfaden | first6=C | last6=Williams | first7=CRM | last7=Fitzpatrick | first8 = PH | last8 = Jones | first9 = M | last9 = di Pietro | first10=RC | last10=Fitzgerald | first11=TD | last11= Wilkinson | first12 = SE | last12 = Bohndiek | title = Quantitative phase and polarization imaging through an optical fiber applied to detection of early esophageal tumorigenesis | journal=Journal of Biomedical Optics | arxiv=1811.03977 | year=2019 | volume=24 | issue=12 | pages=1–13 | doi=10.1117/1.JBO.24.12.126004 | pmid=31840442 | pmc=7006047 | bibcode=2019JBO....24l6004G }}

See also

  • PyMC, a probabilistic programming language in Python
  • ArviZ, a Python library for exploratory analysis of Bayesian models

References

{{reflist|30em}}

Further reading

  • {{cite journal | first1 = Bob| last1 = Carpenter|

first2 = Andrew| last2 = Gelman|

first3 = Matthew| last3 = Hoffman|

first4 = Daniel| last4 = Lee|

first5 = Ben| last5 = Goodrich|

first6 = Michael| last6 = Betancourt|

first7 = Marcus| last7 = Brubaker|

first8 = Jiqiang| last8 = Guo|

first9 = Peter| last9 = Li|

first10 = Allen| last10 = Riddell|

title = Stan: A Probabilistic Programming Language|

journal = Journal of Statistical Software|

volume = 76|

number = 1|

year = 2017|

issn = 1548-7660|

pages = 1–32|

doi = 10.18637/jss.v076.i01| pmid = 36568334| pmc = 9788645|

doi-access = free}}

  • Gelman, Andrew, Daniel Lee, and Jiqiang Guo (2015). [http://www.stat.columbia.edu/~gelman/research/published/stan_jebs_2.pdf Stan: A probabilistic programming language for Bayesian inference and optimization], Journal of Educational and Behavioral Statistics.
  • Hoffman, Matthew D., Bob Carpenter, and Andrew Gelman (2012). [http://probabilistic-programming.org/wiki/NIPS*2012_Workshop/Schedule#talk-hoffman Stan, scalable software for Bayesian modeling] {{Webarchive|url=https://web.archive.org/web/20150121055338/http://probabilistic-programming.org/wiki/NIPS*2012_Workshop/Schedule#talk-hoffman |date=2015-01-21 }}, Proceedings of the NIPS Workshop on Probabilistic Programming.