# 15.4. Computing exact probabilities and manipulating random variables

*This is one of the 100+ free recipes of the IPython Cookbook, Second Edition, by Cyrille Rossant, a guide to numerical computing and data science in the Jupyter Notebook. The ebook and printed book are available for purchase at Packt Publishing.*

▶ *Text on GitHub with a CC-BY-NC-ND license*

▶ *Code on GitHub with a MIT license*


SymPy includes a module named `stats` that lets us create and manipulate random variables. This is useful when we work with probabilistic or statistical models; we can compute symbolic expectations, variances, probabilities, and densities of random variables.

## How to do it...

**1.** Let's import SymPy and the `stats` module:

```
from sympy import *
from sympy.stats import *
init_printing()
```

**2.** Let's roll two dice, `X` and `Y`, with six faces each:

```
X, Y = Die('X', 6), Die('Y', 6)
```

**3.** We can compute probabilities defined by equalities (with the `Eq` operator) or inequalities:

```
P(Eq(X, 3))
```

```
P(X > 3)
```
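As a sanity check, the two probabilities above can be recomputed by hand from the die's probability mass function. This is only a sketch: it assumes that `density(X).dict` exposes the mass function as a dictionary, and the names `pmf`, `p_eq`, and `p_gt` are illustrative.

```python
from sympy import Eq, Integer, Rational
from sympy.stats import Die, P, density

X = Die('X', 6)

# Mass function of a fair die: each face maps to probability 1/6.
pmf = density(X).dict

# P(X = 3) is a single entry of the mass function.
p_eq = pmf[Integer(3)]

# P(X > 3) is the total mass of the faces 4, 5 and 6.
p_gt = sum(prob for face, prob in pmf.items() if face > 3)

print(p_eq, P(Eq(X, 3)))  # both values should agree
print(p_gt, P(X > 3))     # both values should agree
```

Both hand-computed values should match what `P` returns, namely 1/6 and 1/2.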

**4.** Conditions can also involve multiple random variables:

```
P(X > Y)
```
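The value of `P(X > Y)` can be cross-checked without SymPy by enumerating the 36 equally likely outcomes of two fair dice. The sketch below uses only the standard library; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (x, y) outcomes of two six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Count the outcomes in which the first die beats the second.
favorable = sum(1 for x, y in outcomes if x > y)

p_x_gt_y = Fraction(favorable, len(outcomes))
print(p_x_gt_y)  # 5/12
```

There are 15 favorable outcomes out of 36, which gives 5/12.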

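**5.** We can compute conditional probabilities by passing the condition as the second argument of `P` (here, the probability that `X + Y > 6` given that `X < 5`):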

```
P(X + Y > 6, X < 5)
```
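A conditional probability \(P(A \mid B)\) equals \(P(A \cap B) / P(B)\), which for equally likely outcomes amounts to counting inside the restricted sample space. The following standard-library sketch recomputes the probability of this step that way; the names `given` and `both` are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (x, y) outcomes of two six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Restrict the sample space to the outcomes satisfying the condition X < 5.
given = [(x, y) for x, y in outcomes if x < 5]

# Among those, keep the outcomes where the event X + Y > 6 also holds.
both = [(x, y) for x, y in given if x + y > 6]

p_conditional = Fraction(len(both), len(given))
print(p_conditional)  # 5/12
```

Of the 24 outcomes with `X < 5`, 10 also satisfy `X + Y > 6`, so the conditional probability is 10/24 = 5/12.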

**6.** We can also work with arbitrary discrete or continuous random variables:

```
Z = Normal('Z', 0, 1) # Gaussian variable
```

```
P(Z > pi)
```
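For a standard normal variable, the tail probability has the closed form \(P(Z > a) = \tfrac{1}{2}\,\mathrm{erfc}(a / \sqrt{2})\). As a rough numerical cross-check of the symbolic result, this formula can be evaluated with the standard library alone; the helper `normal_tail` below is a hypothetical name introduced for illustration.

```python
import math

def normal_tail(a):
    """Return P(Z > a) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(a / math.sqrt(2))

p_tail = normal_tail(math.pi)
print(p_tail)  # a small tail probability, below 0.001
```

Evaluating the SymPy result numerically (for example with `.evalf()`) should give the same value up to floating-point precision.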

**7.** We can compute expectations and variances:

```
E(Z**2), variance(Z**2)
```
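These two quantities can also be derived directly from the moments of the standard normal density, since \(\mathrm{Var}(Z^2) = E[Z^4] - (E[Z^2])^2\). The sketch below writes the density by hand and integrates it with SymPy; it is meant as a cross-check, not as a description of what `E` and `variance` do internally.

```python
from sympy import exp, integrate, oo, pi, simplify, sqrt, symbols

x = symbols('x', real=True)

# Density of the standard normal distribution, written out by hand.
pdf = exp(-x**2 / 2) / sqrt(2 * pi)

# Second and fourth moments as integrals against the density.
second_moment = simplify(integrate(x**2 * pdf, (x, -oo, oo)))
fourth_moment = simplify(integrate(x**4 * pdf, (x, -oo, oo)))

# Var(Z**2) = E[Z**4] - E[Z**2]**2
var_z2 = simplify(fourth_moment - second_moment**2)

print(second_moment, var_z2)
```

The second moment is 1 and the fourth moment is 3, so the variance of \(Z^2\) is 2.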

**8.** We can also compute densities:

```
f = density(Z)
```

```
var('x')
f(x)
```
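The expression returned for `f(x)` should be the familiar Gaussian density \(\frac{1}{\sqrt{2\pi}} e^{-x^2/2}\), possibly written in a different but equivalent form. A minimal sketch to confirm this, and to check that the density integrates to 1, might look as follows (the name `expected` is illustrative):

```python
from sympy import exp, integrate, oo, pi, simplify, sqrt, symbols
from sympy.stats import Normal, density

x = symbols('x', real=True)
Z = Normal('Z', 0, 1)
f = density(Z)

# Textbook form of the standard normal density.
expected = exp(-x**2 / 2) / sqrt(2 * pi)

# The difference simplifies to zero if both expressions are equivalent.
difference = simplify(f(x) - expected)

# A probability density must integrate to 1 over the real line.
total_mass = simplify(integrate(f(x), (x, -oo, oo)))

print(difference, total_mass)
```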

**9.** We can plot these densities:

```
%matplotlib inline
plot(f(x), (x, -6, 6))
```

## How it works...

SymPy's `stats` module contains many functions to define random variables with classical laws (binomial, exponential, and so on), discrete or continuous. It computes exact probabilistic quantities by leveraging SymPy's integration algorithms: for a continuous variable, a probability is the integral of the density over the region of interest (for a discrete variable, the integral becomes a sum). For example, \(P(Z > \pi)\) is:

```
Eq(Integral(f(x), (x, pi, oo)),
   simplify(integrate(f(x), (x, pi, oo))))
```

Note that the equality condition is written using the `Eq` operator rather than the more standard `==` Python syntax. This is a general feature in SymPy: `==` tests whether two Python objects are equal and immediately returns a Boolean, whereas `Eq` represents a mathematical equality between symbolic expressions.
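The difference between the two can be seen in a short sketch. The expressions below are illustrative and not taken from the recipe; they assume the usual SymPy behavior in which `==` compares the structure of two expressions rather than their mathematical value.

```python
from sympy import Eq, solve, symbols
from sympy.stats import Die

x = symbols('x')
X = Die('X', 6)

# `==` compares expression structure and returns a plain Python bool.
structurally_equal = ((x + 1)**2 == x**2 + 2*x + 1)  # False: different structure
die_equals_three = (X == 3)                          # False: a symbol is not the number 3

# `Eq` builds a symbolic equation that other functions can work with.
equation = Eq(x**2, 4)
solutions = solve(equation, x)

print(structurally_equal, die_equals_three)
print(equation, solutions)
```

This is why `P(Eq(X, 3))` is used in the recipe: `X == 3` would be evaluated to `False` before `P` ever sees it.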

## There's more...

Here are a few references:

- SymPy stats module documentation at http://docs.sympy.org/latest/modules/stats.html
- Probability lectures on Awesome Math, at https://github.com/rossant/awesome-math/#probability-theory
- Statistics lectures on Awesome Math, at https://github.com/rossant/awesome-math/#statistics