Plot 95 Confidence Interval Python
Plot 95% confidence interval errorbar python pandas dataframes. June 17, 2017, at 9:31 PM. I want to show 95% confidence interval with Python pandas, matpolib. But I stucked, because for usual.std I would do smth like this. I often see QQ-plots with a confidence interval: For my application, I have a QQ plot of test p-values against a uniform distribution. I want to add the 95% CI of the observed p-values on the plot, but I don't know how to compute them. In other words, I am not sure: That I correctly understand what the CI represents on QQ plots. Compute the 95% confidence interval using np.percentile and passing in two arguments: The array bsreplicates, and the list of percentiles - in this case 2.5 and 97.5. Print the confidence interval. Plot a histogram of your bootstrap replicates. This has been done for you, so hit 'Submit Answer' to see the plot! Sns.barplot(cycles, ci=95, capsize=0.1) plt.xlabel('Cycles until failure (95% CI)') 0 200 400 600 800 1000 1200 1400 Cycles until failure, with 95% confidence interval DatafromBirnbaumandSaunders(1958) 21/47.
A confidence interval is a range of values that is likely to contain a population parameter with a certain level of confidence.
This tutorial explains how to plot a confidence interval for a dataset in Python using the seaborn visualization library.
Plotting Confidence Intervals Using lineplot()
The first way to plot a confidence interval is by using the lineplot() function, which connects all of the data points in a dataset with a line and displays a confidence band around each point:
By default, the lineplot() function uses a 95% confidence interval but can specify the confidence level to use with the ci command.
The smaller the confidence level, the more narrow the confidence interval will be around the line. For example, here’s what an 80% confidence interval looks like for the exact same dataset:
Plotting Confidence Intervals Using regplot()
You can also plot confidence intervals by using the regplot() function, which displays a scatterplot of a dataset with confidence bands around the estimated regression line:
Similar to lineplot(), the regplot() function uses a 95% confidence interval by default but can specify the confidence level to use with the ci command.
Again, the smaller the confidence level the more narrow the confidence interval will be around the regression line. For example, here’s what an 80% confidence interval looks like for the exact same dataset:
Additional Resources
What are Confidence Intervals?
How to Calculate Confidence Intervals in Python
LOWESS (or also referred to as LOESS for locally-weighted scatterplot smoothing) is a non-parametric regression method for smoothing data. But how do we get uncertainties on the curve?
The “non-parametric”-ness of the method refers to the fact that unlike linear or non-linear regression, the model can’t be parameterised – we can’t write the model as the sum of several parametric terms such as with linear regression:
$$ y = beta_0 + beta_1 x $$
Instead, LOWESS attempts to fit a linear model for each observation based on nearby (or local) observations. This makes the method more versatile than fitting some n-order polynomial to wiggly data but of course, means that we do not have global parameters such as $ beta_0 $ and $beta_1$ with which to test hypotheses. To start looking at LOWESS methods, let’s create some nonsense data which has some interesting nonmonotonic response and plot it.
Python package statsmodels has an efficient LOWESS smoother built-in which provides the obvious choice for doing a lowess smoother in python:
So that looks fairly sensible, sm_lowess
has picked out the main structure in the data but not overfitted the curve to any of the noise in the data.
Of course, judging the quality of the fit is difficult because we don’t really have an idea of the uncertainty. Ideally, we’d have a LOWESS method which provides us a confidence interval on the fit of the model so we can see which individual observations are within the fit and those outside of the fitted curve.
One obvious way to get a confidence interval around the LOWESS fit is to use bootstrapping to provide an estimate of the spread of the curve. We’ll sample from the observations to change the distribution of points for each fit each time:
So smooth
samples 50% of the observations and fits the LOWESS model. Also because statsmodels doest not provide the solution on an interpolated, and we’re randomly sampling each, the solution is interpolated to the same 1d grid each time specified with xgrid
. Let’s run smooth
100 times and plot each lowess solution:
We can then use the individual fits to provide the mean $mu$ and standard error $sigma$ of the LOWESS model:
The 95% confidence interval (shaded blue) seems fairly sensible - the uncertainty increases when observations nearby have a large spread (at around x=2) but also at the edges of the plot where the number of observations tends towards zero (at the very edge we only have observations from the left or right to do the smoothing).
Similarly, because bootstrapping provides draws from the posterior of the LOWESS smooth we can create a true confidence interval from any percentiles:
Notice the similarity in the $mu + 1.96sigma$ confidence interval and the percentile-based 95% confidence interval. Actually, under certain assumptions about the errors in the data, these should be the same. This is because LOWESS smoothers essentially fit a unique linear regression for every data point by including nearby data points to estimate the slope and intercept. The correlation in nearby data points helps ensure that we get a smooth curve fit. We can understand this a bit more clearly by estimating the curve locally for a couple of observations with linear regression:
Extending this principle we can get something that looks a bit like the curve from earlier:
Instead of just selecting the 5 nearest data points and fitting a simple linear regression, LOWESS weights the points based on the proximity of neighbouring points. This is what specifies LOWESS smoothing as just another kernel smoother method in which a function $F$ is used to weight observations based on proximity. For LOWESS, the most commonly used function is the tricube:
Fitting the LOWESS curve
It’s helpful (well necessary) to delve into the LOWESS mathematics at this point. The observations (a collection of x and y points we’ll say) are composed of $x$ and $y$ vectors:
$$ x = [x_1 cdots x_N] $$
$$ y = [y_1 cdots y_N] $$
For observation $x_i$, the LOWESS fitting starts with calculating the distance between $x_i$ and each other observation $x_j$ provides:
$$ d_{ij} = [y_{i0} cdots y_{iN}] $$
With these distances, the tricube weighting function is used to up-weight observations which are more local:
$$ w_i = (1 - d_{ij} ^3)^3 $$
With $w$ we solve a standard weighted least squares system:
$$ hat{beta} = left(mathbf{X}^T mathbf{W} mathbf{X} right)^{-1} mathbf{X}^T mathbf{W} y $$
where $mathbf{X}$ is specified as a first order model:
$$begin{equation}mathbf{X} =begin{bmatrix}1 & x_1
vdots & vdots
1 & x_Nend{bmatrix}end{equation}$$
which is a linear model with an intercept (1 values) and the slope. It’s possible to use higher order polynomials of course such as $x^2$ but we’ll stick with the simplest case here. $mathbf{W}$ is just a diagonal matrix formed from the weights $w_i$:
$$ mathbf{W} = begin{bmatrix}w_1 & &
& ddots &
& & w_Nend{bmatrix} $$
Solving the system provides $hat{beta}$ - the intercept $beta_0$ and slope $beta_1$. The LOWESS smoothed observation $hat{y}_{sm}$ is fitted/predicted from the row of the system corresponding to $i$:
$$ hat{y}_{sm} = mathbf{X}_i hat{beta} $$
So algorithmically we loop over each observation $i$ update the $mathbf{W}$ with the new weights and resolve the least squares system above to update $hat{beta}$ and then predict each $hat{y}_{sm}$.
In code this relatively simple to implement for one observation $i$:
Confidence intervals for LOWESS
Because the LOWESS smoother for any individual prediction is essentially weighted linear least squares, the propagation of uncertainty principles are well understood. With any linear model (with normal errors in the observations $y$) the uncertainty in the parameters $var(hat{beta})$ are provided from the posterior covariance matrix:
$$sigma^2left(mathbf{X}^T mathbf{X}right)^{-1}$$
The diagonal of this matrix provides $var(hat{beta})$:
$$ var(hat{beta}) = text{diag} left[ sigma^2 left(mathbf{X}^T mathbf{X}right)^{-1} right] $$
$sigma^2$ is important - it is our estimate of the uncertainty in the initial observations. For this approach to work, we are assuming that the observations have an error model of the form:
Confidence Interval Matplotlib
$$ y = y_text{true} + mathcal{N}(0, sigma^2) $$
The observational error $sigma^2$ provides an indication of the spread in the observations away from our model of the observations:
$$ hat{y}_{sm} = f(x) + epsilon $$
where $f(x)$ is the LOWESS smoothing model and $epsilon$ is $mathcal{N}(0, sigma^2)$ – the error arising in the prediction because of the observations. In some cases, we know what $sigma^2$ is, but often we don’t. One option to estimate $sigma^2$ is to assume that our model is perfect and the residuals between the model and observations $(hat{y}_{sm}-y)$ provides a good estimate. The root mean square error (RMSE) provides this estimate of $sigma$:
$$ sigma = sqrt{frac{sum_i^{N}( hat{y}_{sm} - y_i )^2}{N}} $$
With an estimate of $sigma$ we can then estimate $var(hat{beta})$ correctly and provide a confidence interval based on the assumption that the uncertainty in the parameters is normally distributed. For example a 95 confidence interval on the slope parameter $hat{beta_1}$ is:
$$ text{CI}_{0.95} = hat{beta_1} pm 1.96 sqrt{var(hat{beta_1}) } $$
So we’ve now got a way to get the confidence interval in parameters $hat{beta}$ from the variance $var(hat{beta})$ but we really want the confidence interval for the fitted curve $hat{y}_{sm}$.
To get this remember that $hat{y}_{sm}$ is provided by:
$$ hat{y}_{sm} = mathbf{X}_i hat{beta} $$
To get the variance or uncertainty in the prediction $var(hat{y}_{sm})$ we apply standard least squares uncertainty propagation:
Plot 95 Confidence Interval Python
$$ var(hat{y}_{sm}) = var(mathbf{X}_i hat{beta}) = mathbf{X}_i^T sigma^2 left(mathbf{X}^T mathbf{X}right)^{-1} mathbf{X}_i$$
Confidence Interval In Python
In code this relatively simple to implement also:
Wrapping it all up into a lowess
function that loops over each observation and fits the curve and uncertainties: