*μ*:

*σ*:

Mixture of normal distributions – an animation

General.
When studying data from a practical situation, it is easy to suspect that the data might consist of a mixture of two or more
normal distributions (%mix*), especially if a normal distribution is expected.

Perhaps the data consist of items produced by a number of sources, machines, spindles, etc. Even if the items are produced
from the same drawing, there is variation. The process might also be 'drifting' and thus produce data with a constantly changing mean or variation.

There are a number of practical statistical tools, such as the *histogram*, the *probability plot* (%Hist*), the *cusum plot*
(%Diagn*, %SimDiagn*), etc., to see graphically whether the data are non-normal (non-symmetrical). Note that the data may come from a non-symmetrical
distribution (typically time measurements), in which case the idea of a mixture of normal distributions is wrong.

Formal analysis.
A formal analysis demands extra data columns showing which machine, spindle, etc. was used. A common way
is then to perform a so-called t-test (%t-test*).
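As a rough illustration of such a test, the sketch below computes a two-sample t statistic from measurements grouped by machine. It uses Welch's variant (unequal variances), which is an assumption made here and not necessarily what the %t-test macro does. The data values and the names `welch_t`, `machine_a` and `machine_b` are invented for the example.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for two samples."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    se2_a = variance(sample_a) / na
    se2_b = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Invented illustration data: one list of measurements per machine.
machine_a = [40.1, 39.8, 41.2, 40.5, 39.9]
machine_b = [45.2, 44.7, 46.1, 45.5, 44.9]

t_stat, dof = welch_t(machine_a, machine_b)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A large absolute t value relative to the t distribution with `dof` degrees of freedom suggests that the two machines have different means. The p-value is left out because the standard library has no t distribution.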

(It is actually possible, at least in theory, to use the measurements only. By a complicated mathematical method the five
parameters, equivalent to the five sliders to the left ([Change parameters]), can be estimated. However, this demands a large
number of data values and will most likely produce results with a large uncertainty.)
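One commonly used method of this kind is the EM algorithm (expectation-maximisation). Whether this is the method the text has in mind is an assumption. The minimal sketch below estimates the five parameters from simulated measurements only. The function names, the starting values and the simulated data are all chosen for illustration.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_two_normals(data, iterations=100):
    """Estimate (p, mu1, sigma1, mu2, sigma2) of a two-component normal mixture."""
    data = sorted(data)
    n = len(data)
    half = n // 2
    # Crude starting values: means of the lower and upper halves of the data.
    mu1 = sum(data[:half]) / half
    mu2 = sum(data[half:]) / (n - half)
    sigma1 = sigma2 = (data[-1] - data[0]) / 4.0
    p = 0.5
    for _ in range(iterations):
        # E-step: probability that each value belongs to distribution 1.
        resp = []
        for x in data:
            a = p * normal_pdf(x, mu1, sigma1)
            b = (1.0 - p) * normal_pdf(x, mu2, sigma2)
            total = a + b
            resp.append(a / total if total > 0.0 else 0.5)
        # M-step: weighted means, standard deviations and proportion.
        w1 = sum(resp)
        w2 = n - w1
        mu1 = sum(r * x for r, x in zip(resp, data)) / w1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / w2
        sigma1 = math.sqrt(sum(r * (x - mu1) ** 2 for r, x in zip(resp, data)) / w1)
        sigma2 = math.sqrt(sum((1.0 - r) * (x - mu2) ** 2 for r, x in zip(resp, data)) / w2)
        p = w1 / n
    return p, mu1, sigma1, mu2, sigma2


# Simulated data: 500 values from each of two well-separated distributions.
rng = random.Random(1)
values = [rng.gauss(20.0, 4.0) for _ in range(500)]
values += [rng.gauss(40.0, 4.0) for _ in range(500)]

p_hat, mu1_hat, s1_hat, mu2_hat, s2_hat = em_two_normals(values)
print(p_hat, mu1_hat, s1_hat, mu2_hat, s2_hat)
```

The simulated distributions are five sigma apart, which makes the task easy. With strongly overlapping distributions the estimates become much less stable, in line with the warning above.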

(*A number of macros for Minitab can be obtained on request via www.ing-stat.se)

••••

*μ* distr 1

*μ* distr 2

*σ* distr 1

*σ* distr 2

Proportion distr 1

Exercise 1 – change the parameters

Change the parameters via the sliders and note that the distribution changes accordingly. If necessary, change the min or max values of the X-axis.

Make sure that the two μ-parameters have the same value and also set the two σ-parameters to equal values. Change the proportion slider
and notice that the two total parameters (μ and σ of the mixture) do not change.
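The reason can be seen by inserting μ1 = μ2 = μ and σ1 = σ2 = σ into the formulas for the expected value and the standard deviation of the mixture. The proportion *p* then cancels out:

$${\mu}_{\mathrm{tot}}=p\cdot \mu+(1-p)\cdot \mu=\mu$$

$${\sigma}_{\mathrm{tot}}=\sqrt{p\cdot [{\sigma}^{2}+(\mu-\mu{)}^{2}]+(1-p)\cdot [{\sigma}^{2}+(\mu-\mu{)}^{2}]}=\sqrt{{\sigma}^{2}}=\sigma$$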

Exercise 2 – one sigma difference

Change the two μ-parameters to 40 and 45, respectively. Change the two σ-parameters to 5. Set the proportion to 0.5.
Thus the difference in mean is 45 - 40 = 5, i.e. one standard deviation. Notice that the resulting distribution does not visually reveal
this rather large difference. (To find this difference, other data and methods are needed.) Decrease the first μ-parameter and note
that a difference of nearly two sigma is needed before it becomes visible.
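The visual impression can be checked numerically by counting the peaks of the mixed curve. The sketch below evaluates the mixture pdf on a grid and counts local maxima. The grid limits, the step size and the function names are assumptions made for the illustration.

```python
import math


def mixture_pdf(x, p, mu1, sigma1, mu2, sigma2):
    def pdf(mu, sigma):
        z = (x - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return p * pdf(mu1, sigma1) + (1.0 - p) * pdf(mu2, sigma2)


def count_peaks(p, mu1, sigma1, mu2, sigma2, lo=0.0, hi=90.0, step=0.1):
    """Count local maxima of the mixture pdf on an evenly spaced grid."""
    n = int(round((hi - lo) / step))
    heights = [mixture_pdf(lo + i * step, p, mu1, sigma1, mu2, sigma2)
               for i in range(n + 1)]
    peaks = 0
    for i in range(1, n):
        # Higher than the left neighbour and not lower than the right one.
        if heights[i] > heights[i - 1] and heights[i] >= heights[i + 1]:
            peaks += 1
    return peaks


# Settings of this exercise: means one sigma apart.
one_sigma = count_peaks(0.5, 40.0, 5.0, 45.0, 5.0)
# First mean decreased to 30: means three sigma apart.
three_sigma = count_peaks(0.5, 30.0, 5.0, 45.0, 5.0)
print(one_sigma, three_sigma)
```

With a one sigma difference the curve has a single peak, so the histogram of such data looks like one ordinary distribution. With a three sigma difference two peaks appear.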

Exercise 3 – plus/minus three sigma

Change the two μ-parameters to 20 and 40, respectively. Change the two σ-parameters to 4. Set the proportion to 0.5.
This produces a distribution with two distinct peaks where the mean is 30.00 and sigma is 10.77. Notice that the rule of thumb of
plus/minus three sigma covers practically all of the distribution.
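The figures in this exercise can be reproduced with a few lines of Python. The sketch below computes the total mean and sigma for these slider settings and the share of the distribution that lies within plus/minus three sigma. The variable names are chosen for the illustration.

```python
import math
from statistics import NormalDist

# Slider settings of Exercise 3.
p, mu1, mu2, sigma = 0.5, 20.0, 40.0, 4.0

mu_tot = p * mu1 + (1.0 - p) * mu2
sigma_tot = math.sqrt(p * (sigma ** 2 + (mu_tot - mu1) ** 2)
                      + (1.0 - p) * (sigma ** 2 + (mu_tot - mu2) ** 2))

# Limits of the plus/minus three sigma interval around the total mean.
lower = mu_tot - 3.0 * sigma_tot
upper = mu_tot + 3.0 * sigma_tot

# Probability inside the interval, weighted over the two distributions.
d1, d2 = NormalDist(mu1, sigma), NormalDist(mu2, sigma)
coverage = (p * (d1.cdf(upper) - d1.cdf(lower))
            + (1.0 - p) * (d2.cdf(upper) - d2.cdf(lower)))

print(round(mu_tot, 2), round(sigma_tot, 2), coverage)
```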

••••

After reading the 'info'-fields and performing the exercises, it should be clear that a mixture of distributions can be difficult to detect.

Usually there is a need for other variables that indicate e.g. machine or similar.

If the data contain a sudden change in mean, this can sometimes be found by e.g. SQC methods or other types of time series analysis.

••••

**The expected value** where *p* is the proportion of the first normal distribution (0 < *p* < 1):

$${\mu}_{\mathrm{tot}}=p\cdot {\mu}_{1}+(1-p)\cdot {\mu}_{2}$$

**The standard deviation**:

$${\sigma}_{\mathrm{tot}}=\sqrt{p\cdot [{\sigma}_{1}^{2}+({\mu}_{\mathrm{tot}}-{\mu}_{1}{)}^{2}]+(1-p)\cdot [{\sigma}_{2}^{2}+({\mu}_{\mathrm{tot}}-{\mu}_{2}{)}^{2}]}$$

**The pdf_{tot}** is the 'height' of the mixed distribution at every value of x:

$${\text{pdf}}_{\mathrm{tot}}=p\cdot {\text{pdf}}_{1}+(1-p)\cdot {\text{pdf}}_{2}$$
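The three formulas translate directly into code. The following is a minimal sketch for a mixture of two normal distributions. The function names are chosen for the illustration and are not taken from the animation itself.

```python
import math


def normal_pdf(x, mu, sigma):
    """Height of a normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_mean(p, mu1, mu2):
    """Expected value of the mixture; p is the proportion of distribution 1."""
    return p * mu1 + (1.0 - p) * mu2


def mixture_sd(p, mu1, sigma1, mu2, sigma2):
    """Standard deviation of the mixture."""
    mu_tot = mixture_mean(p, mu1, mu2)
    var_tot = (p * (sigma1 ** 2 + (mu_tot - mu1) ** 2)
               + (1.0 - p) * (sigma2 ** 2 + (mu_tot - mu2) ** 2))
    return math.sqrt(var_tot)


def mixture_pdf(x, p, mu1, sigma1, mu2, sigma2):
    """Height of the mixed distribution at x."""
    return (p * normal_pdf(x, mu1, sigma1)
            + (1.0 - p) * normal_pdf(x, mu2, sigma2))


# Example call with means 20 and 40, both sigmas 4, proportion 0.5.
print(mixture_mean(0.5, 20.0, 40.0))
print(mixture_sd(0.5, 20.0, 4.0, 40.0, 4.0))
print(mixture_pdf(30.0, 0.5, 20.0, 4.0, 40.0, 4.0))
```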

••••

The blue line is the resulting mixed distribution and the area under the curve is the probability.
The total area is 1.
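That the total area is 1 for any setting of the sliders can be checked by numerical integration. The sketch below uses the trapezoidal rule. The integration limits (eight sigma beyond the outermost mean), the number of steps and the parameter values are arbitrary choices for the illustration.

```python
from statistics import NormalDist


def mixture_area(p, mu1, sigma1, mu2, sigma2, steps=20000):
    """Approximate the area under the mixture pdf with the trapezoidal rule."""
    d1, d2 = NormalDist(mu1, sigma1), NormalDist(mu2, sigma2)
    lo = min(mu1 - 8.0 * sigma1, mu2 - 8.0 * sigma2)
    hi = max(mu1 + 8.0 * sigma1, mu2 + 8.0 * sigma2)
    width = (hi - lo) / steps

    def height(x):
        return p * d1.pdf(x) + (1.0 - p) * d2.pdf(x)

    # End points count half, interior points count fully.
    total = 0.5 * (height(lo) + height(hi))
    for i in range(1, steps):
        total += height(lo + i * width)
    return total * width


area = mixture_area(0.3, 20.0, 4.0, 40.0, 4.0)
print(area)
```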

The expected value is indicated on the X-axis as one red vertical bar with the value attached to it.
The small red lines indicate 1, 2, and 3 sigma from the expected value.

The X-axis can be changed by clicking and changing the min or max values for a better fit.

Use the button [Ordinary normal] to learn more about the normal distribution.

••••

**Exercises.** A number of exercises to further illuminate certain features of mixtures of variables.

**Some conclusions.** A summary of the main ideas and problems with mixtures of variables.

**Formulas.** There are three main formulas that are used for the mixed result: *the expected value*,
*the standard deviation* and *the probability distribution*. These formulas are valid for mixtures of all distributions, not only the normal.

**Change parameters.** It is possible to change the parameters for the mixed distribution. This is done
using five sliders.

**Mixed Poisson.** The button leads to a page showing a mixture of Poisson distributions.

**Mixed normal.** The button leads to a page showing a mixture of normal distributions.

**Ordinary Poisson.** The button leads to a page showing all basic features of a Poisson distribution.

**Ordinary normal.** The button leads to a page showing all basic features of a normal distribution.

**μ.** The theoretical mean of the mixture of distributions.

••••

The range of the sliders cannot be changed. The four top sliders move 0.1 every time the right
or left arrow key is pressed. The bottom slider moves 0.01 every time the right or left arrow key is pressed.

**μ distr 1:** The theoretical mean of the first normal distribution.

••••