├── .gitignore
├── README.rst
├── SciPy Lecture 1.pdf
├── SciPy Lecture 4.pdf
├── bioassay.py
├── cov.py
├── mean.py
├── obs.py
├── triangular.py
├── truncated_metropolis.py
└── weibull.py
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
*.pyc
--------------------------------------------------------------------------------
/README.rst:
--------------------------------------------------------------------------------
An Introduction to Bayesian Statistical Modeling using PyMC
===========================================================

PyMC is a Python module that implements Bayesian statistical models and fitting algorithms, including Markov chain Monte Carlo. Its flexibility and extensibility make it applicable to a large suite of problems across all quantitative disciplines. This hands-on tutorial will introduce users to the key components of PyMC and how to employ them to construct, fit and diagnose models. Though some familiarity with statistics is assumed, the tutorial will begin with a brief overview of Bayesian inference, including an introduction to Markov chain Monte Carlo.

Installing PyMC
---------------

PyMC is known to run on Mac OS X, Linux and Windows, but in theory should be
able to work on just about any platform for which Python, a Fortran compiler
and the NumPy module are available. However, installing some extra
dependencies can greatly improve PyMC's performance and versatility.
The following describes the required and optional dependencies and takes you
through the installation process.

Dependencies
------------

PyMC requires some prerequisite packages to be present on the system.
Fortunately, there are currently only a few dependencies, and all are
freely available online.

* `Python`_ version 2.5 or 2.6.

* `NumPy`_ (1.4 or newer): The fundamental scientific programming package; it provides a
  multidimensional array type and many useful functions for numerical analysis.

* `Matplotlib (optional)`_ : 2D plotting library which produces publication-quality
  figures in a variety of image formats and interactive environments.

* `pyTables (optional)`_ : Package for managing hierarchical datasets, designed
  to cope efficiently and easily with extremely large amounts of data.
  Requires the `HDF5`_ library.

* `pydot (optional)`_ : Python interface to Graphviz's Dot language; it allows
  PyMC to create both directed and non-directed graphical representations of models.
  Requires the `Graphviz`_ library.

* `SciPy (optional)`_ : Library of algorithms for mathematics, science
  and engineering.

* `IPython (optional)`_ : An enhanced interactive Python shell and an
  architecture for interactive parallel computing.

* `nose (optional)`_ : A test discovery-based unittest extension (required
  to run the test suite).


There are prebuilt distributions that include all required dependencies. For
Mac OS X users, we recommend the `MacPython`_ distribution or the
`Enthought Python Distribution`_ on OS X 10.5 (Leopard), or the Python 2.6.1 that
ships with OS X 10.6 (Snow Leopard). Windows users should download and install the
`Enthought Python Distribution`_, which comes bundled with these prerequisites.
Note that, depending on how current these distributions are, some packages
may need to be updated manually.

For Mac OS X 10.6 (Snow Leopard) users, a script for installing all the key dependencies, as well as a recent build of PyMC, can be downloaded from the `SciPy Superpack page`_.

If instead of installing the prebuilt binaries you prefer (or have) to build
``pymc`` yourself, make sure you have a Fortran and a C compiler. There are free
compilers (gfortran, gcc) available on all platforms. Other compilers have not been
tested with PyMC but may work nonetheless.


.. _`Python`: http://www.python.org/

.. _`NumPy`: http://www.scipy.org/NumPy

.. _`Matplotlib (optional)`: http://matplotlib.sourceforge.net/

.. _`MacPython`: http://www.activestate.com/Products/ActivePython/

.. _`Enthought Python Distribution`: http://www.enthought.com/products/epddownload.php

.. _`SciPy (optional)`: http://www.scipy.org/

.. _`IPython (optional)`: http://ipython.scipy.org/

.. _`pyTables (optional)`: http://www.pytables.org/moin

.. _`HDF5`: http://www.hdfgroup.org/HDF5/

.. _`pydot (optional)`: http://code.google.com/p/pydot/

.. _`Graphviz`: http://www.graphviz.org/

.. _`nose (optional)`: http://somethingaboutorange.com/mrl/projects/nose/

.. _`SciPy Superpack page`: http://stronginference.com/scipy-superpack/

Compiling the source code
-------------------------

You can check out the latest development source of the code from the `GitHub`_
repository::

    git clone git://github.com/pymc-devs/pymc.git pymc

Then move into the ``pymc`` directory and follow the platform-specific instructions.

Though this code is technically development source, it contains important bug fixes and features absent from the previous release (2.1) and is relatively stable. Hence, we recommend using the latest development code if possible. A new release is in the works, but will not be complete prior to SciPy 2011.

Windows
~~~~~~~

One way to compile PyMC on Windows is to install `MinGW`_ and `MSYS`_. MinGW is
the GNU Compiler Collection (GCC) augmented with Windows-specific headers and
libraries. MSYS is a POSIX-like console (bash) with UNIX command line tools.
Download the `Automated MinGW Installer`_ and double-click on it to launch
the installation process. You will be asked to select which
components are to be installed: make sure the g77 compiler is selected and
proceed with the instructions. Then download and install `MSYS-1.0.exe`_,
launch it and again follow the on-screen instructions.

Once this is done, launch the MSYS console, change into the PyMC directory and
type::

    python setup.py install

This will build the C and Fortran extensions and copy the libraries and Python
modules into the ``C:/Python26/Lib/site-packages/pymc`` directory.

.. _`GitHub`: http://github.com

.. _`MinGW`: http://www.mingw.org/

.. _`MSYS`: http://www.mingw.org/wiki/MSYS

.. _`Automated MinGW Installer`: http://sourceforge.net/projects/mingw/files/

.. _`MSYS-1.0.exe`: http://downloads.sourceforge.net/mingw/MSYS-1.0.11.exe

Mac OS X or Linux
~~~~~~~~~~~~~~~~~

In a terminal, type::

    python setup.py config_fc --fcompiler gnu95 build
    python setup.py install

The above assumes that you have gfortran installed and available. The
``sudo`` command may be required to install PyMC into the Python ``site-packages``
directory if it has restricted privileges.

In addition, the python2.6-dev package may be required to install PyMC on Linux systems. On Ubuntu or Debian, we have had success by installing the following prior to building PyMC::

    sudo apt-get install ipython python-setuptools python-dev python-nose \
        python-tk python-numpy python-matplotlib python-scipy python-networkx \
        gfortran libatlas-base-dev


Running the test suite
----------------------

``pymc`` comes with a set of tests that verify that the critical components
of the code work as expected. To run these tests, users must have `nose`_
installed. The tests are launched from a Python shell::

    import pymc
    pymc.test()

If any tests fail, messages detailing the nature of the failures will appear.
Should this happen (it shouldn't), please report the problem on the
`issue tracker`_, specifying the PyMC version you are using and your environment.

.. _`nose`: http://somethingaboutorange.com/mrl/projects/nose/

.. _`issue tracker`: http://github.com/pymc-devs/pymc/issues


Code for BDA Project Template
-----------------------------

Here is `a template`_ for a project to do Bayesian data analysis with PyMC.

.. _`a template`: https://github.com/aflaxman/pymc-project-template

Code for the Human Development Index vs Total Fertility Rate example
--------------------------------------------------------------------

Code to `replicate examples`_ from the tutorial.

.. _`replicate examples`: https://github.com/aflaxman/pymc-example-tfr-hdi
--------------------------------------------------------------------------------
/SciPy Lecture 1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/fonnesbeck/pymc_tutorial/46b12d84569b517f19aec2961bb052a94de06a51/SciPy Lecture 1.pdf
--------------------------------------------------------------------------------
/SciPy Lecture 4.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/fonnesbeck/pymc_tutorial/46b12d84569b517f19aec2961bb052a94de06a51/SciPy Lecture 4.pdf
--------------------------------------------------------------------------------
/bioassay.py:
--------------------------------------------------------------------------------
from pymc import *
from numpy import array

# Dose-response data: deaths out of n animals at each dose
n = [5]*4
dose = [-.86, -.3, -.05, .73]
response = [0, 1, 3, 5]

# Vague priors on the logistic regression coefficients
alpha = Normal('alpha', mu=0, tau=0.01)
beta = Normal('beta', mu=0, tau=0.01)

# Probability of death at each dose, via the inverse-logit link
theta = Lambda('theta', lambda a=alpha, b=beta: invlogit(a + b*array(dose)))

@observed
def deaths(value=response, n=n, p=theta):
    """deaths ~ binomial(n, p)"""
    return binomial_like(value, n, p)
--------------------------------------------------------------------------------
/cov.py:
--------------------------------------------------------------------------------
from pymc.gp import *
from pymc.gp.cov_funs import matern
from numpy import *

# A Matern covariance function; the commented-out lines show alternative
# covariance representations
C = Covariance(eval_fun=matern.euclidean, diff_degree=1.4, amp=.4, scale=1.)
# C = Covariance(eval_fun=matern.euclidean, diff_degree=1.4, amp=.4, scale=1., rank_limit=100)
# C = FullRankCovariance(eval_fun=matern.euclidean, diff_degree=1.4, amp=.4, scale=1.)
# C = NearlyFullRankCovariance(eval_fun=matern.euclidean, diff_degree=1.4, amp=.4, scale=1.)

#### - Plot - ####
if __name__ == '__main__':
    from pylab import *

    x = arange(-1., 1., .01)
    clf()

    # Plot the covariance function
    subplot(1, 2, 1)
    contourf(x, x, C(x, x).view(ndarray), origin='lower', extent=(-1., 1., -1., 1.), cmap=cm.bone)
    xlabel('x')
    ylabel('y')
    title('C(x,y)')
    axis('tight')
    colorbar()

    # Plot a slice of the covariance function
    subplot(1, 2, 2)
    plot(x, C(x, 0).view(ndarray).ravel(), 'k-')
    xlabel('x')
    ylabel('C(x,0)')
    title('A slice of C')

    # show()
--------------------------------------------------------------------------------
/mean.py:
--------------------------------------------------------------------------------
from pymc.gp import *

# Quadratic prior mean function for the GP
def quadfun(x, a, b, c):
    return (a * x ** 2 + b * x + c)

M = Mean(quadfun, a=1., b=.5, c=2.)
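
#### - Plot - ####
# Usage sketch (an addition to the original file): evaluate and plot the prior
# mean on a mesh, mirroring the plotting blocks in cov.py and obs.py. It
# assumes pymc.gp Mean objects are callable on a mesh, as in the GP examples.
if __name__ == '__main__':
    from pylab import *

    x = arange(-1., 1., .01)
    clf()

    plot(x, asarray(M(x)).ravel(), 'k-')

    xlabel('x')
    ylabel('M(x)')
    title('The quadratic prior mean function')
    axis('tight')

    # show()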
--------------------------------------------------------------------------------
/obs.py:
--------------------------------------------------------------------------------
# Import the mean and covariance
from mean import M
from cov import C
from pymc.gp import *
from numpy import *

# Impose observations on the GP
o = array([-.5, .5])
V = array([.002, .002])
data = array([3.1, 2.9])
observe(M, C, obs_mesh=o, obs_V=V, obs_vals=data)

# Generate realizations
f_list = [Realization(M, C) for i in range(3)]

x = arange(-1., 1., .01)

#### - Plot - ####
if __name__ == '__main__':
    from pylab import *

    x = arange(-1., 1., .01)
    clf()

    plot_envelope(M, C, mesh=x)
    for f in f_list:
        plot(x, f(x))

    xlabel('x')
    ylabel('f(x)')
    title('Three realizations of the observed GP')
    axis('tight')

    # show()
--------------------------------------------------------------------------------
/triangular.py:
--------------------------------------------------------------------------------
from numpy import log, random, sqrt, zeros, atleast_1d, inf

def triangular_like(x, mode, minval, maxval):
    """Log-likelihood of the triangular distribution"""

    x = atleast_1d(x)

    # Check for support
    if any(x < minval) or any(x > maxval):
        return -inf

    # Log-likelihood of values left of the mode
    like = sum(log(2*(x[x <= mode] - minval)) - log(mode - minval) - log(maxval - minval))

    # Log-likelihood of values right of the mode
    like += sum(log(2*(maxval - x[x > mode])) - log(maxval - minval) - log(maxval - mode))

    return like

def rtriangular(mode, minval, maxval, size=1):
    """Generate triangular random numbers by inverse-CDF transformation"""

    # Uniform random numbers
    z = atleast_1d(random.random(size))

    # Threshold for the transformation
    threshold = (mode - minval)/(maxval - minval)

    # Transform the uniforms
    u = atleast_1d(zeros(size))
    u[z <= threshold] = minval + sqrt(z[z <= threshold]*(maxval - minval)*(mode - minval))
    u[z > threshold] = maxval - sqrt((1 - z[z > threshold])*(maxval - minval)*(maxval - mode))

    return u
--------------------------------------------------------------------------------
/truncated_metropolis.py:
--------------------------------------------------------------------------------
import pymc


class TruncatedMetropolis(pymc.Metropolis):
    """Metropolis step method with truncated-normal proposals."""

    def __init__(self, stochastic, low_bound, up_bound, *args, **kwargs):
        self.low_bound = low_bound
        self.up_bound = up_bound
        pymc.Metropolis.__init__(self, stochastic, *args, **kwargs)

    # The propose method generates proposal values
    def propose(self):
        tau = 1./(self.adaptive_scale_factor * self.proposal_sd)**2
        self.stochastic.value = \
            pymc.rtruncnorm(self.stochastic.value, tau, self.low_bound, self.up_bound)

    # The Hastings factor accounts for the asymmetric proposal distribution
    def hastings_factor(self):
        tau = 1./(self.adaptive_scale_factor * self.proposal_sd)**2
        cur_val = self.stochastic.value
        last_val = self.stochastic.last_value

        lp_for = pymc.truncnorm_like(cur_val, last_val, tau,
                                     self.low_bound, self.up_bound)
        lp_bak = pymc.truncnorm_like(last_val, cur_val, tau,
                                     self.low_bound, self.up_bound)

        if self.verbose > 1:
            print self._id + ': Hastings factor %f' % (lp_bak - lp_for)
        return lp_bak - lp_for
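
# Usage sketch (an addition to the original file): register the custom step
# method with an MCMC sampler. The model below is hypothetical and purely
# illustrative; it assumes PyMC 2's standard MCMC.use_step_method API.
if __name__ == '__main__':
    from numpy import random

    # A bounded mean parameter with some synthetic normal data
    mu = pymc.Uniform('mu', lower=0., upper=10., value=5.)
    y = pymc.Normal('y', mu=mu, tau=1., value=random.normal(5., 1., 20),
                    observed=True)

    M = pymc.MCMC([mu, y])
    # Sample mu with truncated-normal proposals restricted to its (0, 10) support
    M.use_step_method(TruncatedMetropolis, mu, 0., 10.)
    M.sample(5000, burn=1000)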
--------------------------------------------------------------------------------
/weibull.py:
--------------------------------------------------------------------------------
import pymc

# Some fake data
alpha = 3
beta = 5
N = 100
dataset = pymc.rweibull(alpha, beta, N)

# Model
a = pymc.Uniform('a', lower=0, upper=10, value=5, doc='Weibull alpha parameter')
b = pymc.Uniform('b', lower=0, upper=10, value=5, doc='Weibull beta parameter')
like = pymc.Weibull('like', alpha=a, beta=b, value=dataset, observed=True)
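
# Fitting sketch (an addition to the original file): fit the model above by
# MCMC. It assumes PyMC 2's standard MCMC API; the sampling settings are
# illustrative only.
if __name__ == '__main__':
    M = pymc.MCMC([a, b, like])
    M.sample(10000, burn=5000)

    # Posterior means of the Weibull parameters
    print M.stats()['a']['mean'], M.stats()['b']['mean']
--------------------------------------------------------------------------------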