Of all the distributions we meet in textbooks of statistics or biostatistics, the normal distribution is the most common. In routine problem solving we often test a hypothesis with statistical tests such as the t-test, z-test, or ANOVA (F-test). A preliminary assumption of these parametric tests is that the data are normally distributed. Even the correlation coefficient (Karl Pearson's correlation, which is used for quantitative variables) and regression models rest on normality assumptions (for regression, normality of the residuals). Enough with the terminology: you may wish to know what the normal distribution is and how it behaves. Here is what you are looking for. I have presented the most applicable properties of the normal distribution, the curve of which is called the normal curve.
1. The normal distribution is a continuous probability distribution. (Continuous variables = interval/ratio scale)
2. When plotted, the distribution gives a characteristic curve known as the Bell Curve, Gaussian Curve or simply the Normal Curve.
Figure: Gaussian or Bell Curve
3. The curve is defined by two parameters: the mean (μ) and the standard deviation (σ). So it is not uncommon for some people to call it a 'bi-parametric curve'. Watch out, however: the term 'bi-parametric distribution' is often used for the binomial distribution (parameters n and p). Be careful not to confuse the two terms.
4. The curve is unimodal, i.e. it always has a single mode. The highest point of the curve is at the mode.
5. The curve is asymptotic, i.e. its tails approach the x-axis but never touch it. It extends continuously from −∞ to +∞.
6. The curve is perfectly symmetrical about the line X = μ. That means each half of the curve is the mirror image of the other. Consequently, the skewness of the bell curve is 0.
7. The mean, median and mode all lie at X = μ. Hence, in a normal distribution, mean = median = mode.
8. Kurtosis of the normal curve = 3 (i.e. excess kurtosis = 0; the curve is mesokurtic).
9. The population is most densely clustered around the mean. So, an element picked at random is most likely to fall near X = μ, where the probability density reaches its maximum, given by the equation f(μ) = 1 / (σ√(2π)).
10. Z-score: the difference between any point and the mean, divided by the standard deviation, gives us the z-score: z = (X − μ) / σ. So, the z-score tells us how many standard deviations the point lies from the mean. A normal random variable is said to be a Standard Normal Variate (SNV) when μ = 0 and σ = 1; the z-score of a normal variable X is such a variate.
11. In a normal distribution, the mean deviation (MD) can also be estimated from the standard deviation (σ): MD = σ√(2/π) ≈ 0.8σ, i.e. about 4/5 of σ.
12. In hypothesis testing, the normal curve plays a vital role, whether through the p-value or through the z- or t-test. The concepts of confidence level and level of significance come from the normal curve. If you report a sample mean of 200 ± 10 at a 95% confidence level (which corresponds to a 5% level of significance), it means that if the sampling were repeated many times and the interval built the same way each time, about 95 out of 100 such intervals would contain the population mean. Conversely, only about 5% of them would miss it. In testing terms, if the test statistic falls in the region beyond the central 95% (the extreme tails that together make up 0.05 of the area under the normal curve), then the null hypothesis is rejected.
13. The curve predicted by the central limit theorem is also the normal curve. The theorem states that as the sample size grows, the distribution of the sample means approaches a bell curve centred at X = μ, whatever the shape of the population (provided its variance is finite). In other words, the average of the means of a very large number of samples is a close approximation of the population mean in question.
14. The total probability, i.e. the total area under the curve, is 1.
In other words, the probability density function (PDF) of the random variable X is a function such that the area under its curve between any two points a and b equals the probability that X falls between a and b. Thus the total area under the density curve over the entire range of possible values of the random variable is always 1.
15. The normal PDF (Probability Density Function) is given by the equation f(x) = (1 / (σ√(2π))) · e^(−(x − μ)² / (2σ²)), for −∞ < x < +∞.
16. Area under the bell curve:
a) The area to the left of the vertical line X = μ contains 50% of the total population, and the area to the right the remaining 50%.
b) The region within one standard deviation on either side of the line X = μ (μ ± 1σ) contains about 68% of the total population.
c) The region within two standard deviations on either side of the line X = μ (μ ± 2σ) contains about 95% of the total population.
d) The region within three standard deviations on either side of the line X = μ (μ ± 3σ) contains about 99.7% of the total population.
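If you like to check such properties numerically, here is a minimal Python sketch using only the standard library. It writes out the normal density f(x) = (1 / (σ√(2π))) · e^(−(x − μ)² / (2σ²)), then evaluates the peak height, a z-score, the mean deviation, the areas within 1, 2 and 3 standard deviations, and a small central limit theorem simulation. The values μ = 100 and σ = 15, the helper names (normal_pdf, z_score, area_within, sample_means) and the uniform population used in the simulation are assumptions chosen only for illustration; they do not come from any textbook problem.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev

# Illustrative parameters (assumed for this sketch): mean 100, SD 15.
MU, SIGMA = 100.0, 15.0
dist = NormalDist(mu=MU, sigma=SIGMA)


def normal_pdf(x, mu, sigma):
    """Normal density written out term by term from its formula."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


def area_within(k, d):
    """Area under the curve between mu - k*sigma and mu + k*sigma."""
    return d.cdf(d.mean + k * d.stdev) - d.cdf(d.mean - k * d.stdev)


def sample_means(n_samples, sample_size, rng):
    """Means of repeated samples drawn from a uniform(0, 1) population."""
    return [
        fmean(rng.random() for _ in range(sample_size))
        for _ in range(n_samples)
    ]


# Property 9: the curve is highest at x = mu, with height 1 / (sigma * sqrt(2*pi)).
print("peak height:", normal_pdf(MU, MU, SIGMA))

# Property 10: z-score of the value 130 under the assumed parameters.
print("z-score of 130:", z_score(130.0, MU, SIGMA))

# Property 11: mean deviation expressed through the standard deviation.
print("mean deviation:", SIGMA * math.sqrt(2.0 / math.pi))

# Property 16: areas within 1, 2 and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(f"area within {k} SD:", round(area_within(k, dist), 4))

# Property 13: means of many samples from a non-normal population
# cluster around the population mean (0.5 for uniform(0, 1)).
means = sample_means(2000, 30, random.Random(0))
print("average of sample means:", fmean(means))
print("spread of sample means:", pstdev(means))
```

The areas are computed from the cumulative distribution function rather than by integrating the density, which keeps the sketch short. The simulation deliberately draws from a flat (uniform) population to show that the sample means still pile up around the population mean.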
Sayonara!!! Best of luck for the exams!......