Non-parametric distributions

A parametric distribution can be completely characterized by a small set of parameters. The most common parametric distribution in statistics is the normal distribution, yet there are many others. Parametric distributions are nice to work with since they can be stated concisely and their properties are well understood. However, there may be situations where your data is not well modeled by one of the common parametric distributions.

The two most common ways to display non-parametric data are the histogram and the box plot. These allow you to see how the data is distributed without making any assumptions about its underlying form. These graphs can be used to get a feel for the central tendency, dispersion, and modes of the data. Furthermore, you can visually identify outliers.

One downside of a histogram is that it only gives frequency estimates for a fixed set of ranges (the bins). What if you would like to compute probability estimates over arbitrary ranges? This is straightforward for parametric distributions. For example, for normally distributed data you can use the mean and standard deviation to compute a probability estimate over any range you like (see z-scores). For non-parametric data you can use a technique known as kernel density estimation to estimate a continuous distribution from the sample.
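To make the parametric case concrete, here is a minimal sketch of a range probability for normally distributed data. The function names and the example parameters (mean 100, standard deviation 10) are hypothetical and chosen only for illustration; the normal CDF is built from the standard library's `math.erf`.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative probability P(X <= x) for a normal distribution."""
    z = (x - mean) / sd  # z-score of x
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_range_probability(lo, hi, mean, sd):
    """P(lo <= X <= hi): difference of the CDF at the two endpoints."""
    return normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)

# Hypothetical parameters, not taken from any data set in this article.
p = normal_range_probability(90, 110, mean=100, sd=10)
print(round(p, 4))  # about 0.6827, i.e. within one standard deviation of the mean
```

Only two numbers (the mean and the standard deviation) are needed to answer the question for any range, which is what makes the parametric case so convenient.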

Example

You are conducting a study on people's desire to visit South America. You ask a group of people to give you a number from 1 to 9, with 1 meaning "I have no desire to visit South America", and 9 meaning "I'd love to visit South America". Assume you collect the following responses:

7, 3, 2, 1, 7, 3, 4, 5, 7, 6, 2, 2, 1, 3, 7, 2, 6, 8, 2, 7, 2, 2, 1, 
3, 5, 8, 2, 6, 7, 8, 6, 2, 8, 7, 9, 2, 7, 5, 1, 8, 8, 2, 3, 7, 3, 8

If you use this data to generate a histogram (e.g. using the calculator on this page), you will notice that there are two peaks: one at 2 and one at 7. This is known as a bimodal distribution since it has two modes. One interpretation is that your sample group is polarized: your subjects tend to feel strongly one way or the other. You could model this distribution using a mixture of Gaussians, which models each subpopulation individually and then combines the results. However, computing the parameters for this distribution is relatively tricky (it is typically done using a technique known as expectation–maximization). An easier approach is to estimate a non-parametric model using kernel density estimation.
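If you would like to check the two peaks without the calculator, the following sketch tallies the responses and prints a rough text histogram. It uses one bin per score, which is an assumption made here for simplicity; the variable names are illustrative.

```python
from collections import Counter

responses = [
    7, 3, 2, 1, 7, 3, 4, 5, 7, 6, 2, 2, 1, 3, 7, 2, 6, 8, 2, 7, 2, 2, 1,
    3, 5, 8, 2, 6, 7, 8, 6, 2, 8, 7, 9, 2, 7, 5, 1, 8, 8, 2, 3, 7, 3, 8,
]

counts = Counter(responses)  # score -> number of responses

# One row per score, one '#' per response.
for score in range(1, 10):
    print(f"{score}: {'#' * counts[score]} ({counts[score]})")

# A score is a local peak if it has more responses than both neighbours.
# Counter returns 0 for the missing scores 0 and 10.
peaks = [s for s in range(1, 10)
         if counts[s] > counts[s - 1] and counts[s] > counts[s + 1]]
print(peaks)  # [2, 7]
```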

Try copying and pasting this data into the calculator below and using the result to compute the probability of obtaining a score in the range 3 to 6.
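For comparison, here is a minimal sketch of the same calculation done by hand. It assumes a Gaussian kernel and a bandwidth from the normal reference rule of thumb, h = 1.06 * s * n^(-1/5); the calculator on this page may use a different kernel or bandwidth, so its answer need not match. The sketch also treats the integer scores as points on a continuous scale and reads "3 to 6" as the interval [3, 6].

```python
import math
import statistics

responses = [
    7, 3, 2, 1, 7, 3, 4, 5, 7, 6, 2, 2, 1, 3, 7, 2, 6, 8, 2, 7, 2, 2, 1,
    3, 5, 8, 2, 6, 7, 8, 6, 2, 8, 7, 9, 2, 7, 5, 1, 8, 8, 2, 3, 7, 3, 8,
]

def std_normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kde_density(x, data, h):
    """Gaussian kernel density estimate at x: one bump of width h per data point."""
    coef = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))
    return coef * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

def kde_range_probability(lo, hi, data, h):
    """Probability the Gaussian KDE assigns to [lo, hi].

    Each kernel is a normal density centred on a data point, so its
    integral over the range is a difference of normal CDFs.
    """
    total = sum(std_normal_cdf((hi - xi) / h) - std_normal_cdf((lo - xi) / h)
                for xi in data)
    return total / len(data)

n = len(responses)
# Assumed bandwidth: normal reference rule of thumb (not necessarily the calculator's).
h = 1.06 * statistics.stdev(responses) * n ** (-1 / 5)

p = kde_range_probability(3, 6, responses, h)
print(f"bandwidth = {h:.3f}, P(3 <= score <= 6) = {p:.3f}")
```

The result depends on the bandwidth: a smaller h follows the individual scores more closely, while a larger h smooths the two peaks together.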
