What is a confidence interval?
A confidence interval, in statistics, refers to the probability that a population parameter will fall between two defined values for a certain proportion of the time. Confidence intervals measure the degree of uncertainty or certainty in a sampling method. A confidence interval can be constructed at any confidence level, the most common being 95% or 99%.
Confidence interval and confidence level are related but are not exactly the same.
Understanding the confidence interval
Statisticians use confidence intervals to measure uncertainty. For example, a researcher randomly selects different samples from the same population and calculates a confidence interval for each sample. The resulting datasets are all different; some intervals include the true population parameter and others do not.
A confidence interval is a range of values that would likely contain an unknown population parameter. A confidence level refers to the percentage probability, or certainty, that the confidence interval would contain the true population parameter if you drew random samples many times. Or, in everyday language: “We are 99% certain (the confidence level) that most of these intervals (the confidence intervals) contain the true population parameter.”
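The repeated-sampling idea above can be checked with a small simulation. The sketch below assumes a hypothetical normal population (the mean, standard deviation, and sample size are illustrative choices, not taken from the article), draws many samples, builds a z-based 95% interval from each, and counts how often the interval captures the true mean.

```python
import random
import statistics

# Hypothetical population: normally distributed values.
# TRUE_MEAN and TRUE_SD are illustrative assumptions.
random.seed(42)
TRUE_MEAN, TRUE_SD = 74.0, 4.0
Z_95 = 1.96  # two-sided z-score for a 95% confidence level

def confidence_interval(sample, z=Z_95):
    """Return (lower, upper) of a z-based interval for the sample mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sem = statistics.stdev(sample) / n ** 0.5  # standard error of the mean
    return mean - z * sem, mean + z * sem

# Draw many samples; count how many intervals capture the true mean.
trials = 1000
hits = 0
for _ in range(trials):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(50)]
    lo, hi = confidence_interval(sample)
    if lo <= TRUE_MEAN <= hi:
        hits += 1

print(f"{hits / trials:.0%} of intervals contain the true mean")
```

With a 95% confidence level, roughly 95 out of every 100 such intervals should contain the true mean; the exact count varies from run to run.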
Key points to remember
- A confidence interval calculates the probability that a population parameter falls between two defined values.
- Confidence intervals measure the degree of uncertainty or certainty in a sampling method.
- Most often, confidence intervals reflect 95% or 99% confidence levels.
Calculation of a confidence interval
Suppose a group of researchers studies the heights of high school basketball players. The researchers take a random sample of the population and establish a mean height of 74 inches. The 74-inch mean is a point estimate of the population mean. A point estimate by itself is of limited use because it does not reveal the uncertainty associated with the estimate; you do not know how far this sample mean of 74 inches may be from the population mean. What is missing is the degree of uncertainty in this single sample.
Confidence intervals provide more information than point estimates. By establishing a 95% confidence interval using the mean and standard deviation of the sample, and assuming a normal distribution as represented by the bell curve, the researchers arrive at an upper and lower limit that contains the true mean 95% of the time. Suppose the interval is between 72 inches and 76 inches. If the researchers take 100 random samples from the population of high school basketball players as a whole, the mean should fall between 72 and 76 inches in about 95 of those samples.
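The calculation described above can be sketched in a few lines: the interval is the sample mean plus or minus a z-score times the standard error. The heights below are made-up illustrative numbers, not data from the study.

```python
import statistics

# Illustrative sample of player heights in inches (made-up numbers).
heights = [71, 73, 74, 72, 76, 75, 74, 73, 77, 74, 72, 75]

n = len(heights)
mean = statistics.fmean(heights)
sem = statistics.stdev(heights) / n ** 0.5  # standard error of the mean
z = 1.96  # two-sided z-score for 95% confidence, assuming normality

lower, upper = mean - z * sem, mean + z * sem
print(f"mean = {mean:.1f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

For small samples, statisticians would normally use a t-score rather than the z-score of 1.96; the z-based version is kept here only to match the bell-curve reasoning in the text.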
If the researchers want even greater confidence, they can widen the interval to 99% confidence. This necessarily creates a wider range, as it must accommodate a greater share of possible sample means. If they set the 99% confidence interval between 70 inches and 78 inches, they can expect about 99 of 100 samples evaluated to yield a mean between these numbers. A 90% confidence level means that we would expect 90% of the interval estimates to include the population parameter. Likewise, a 99% confidence level means that 99% of the intervals would include the parameter.
Common misconceptions about the confidence interval
The biggest misconception about confidence intervals is that they represent the percentage of data in a given sample that falls between the upper and lower limits. For example, the 99% confidence interval of 70 to 78 inches above could be misread as indicating that 99% of the data in a random sample falls between these figures. That is incorrect, although a separate statistical method exists for making such a determination: compute the sample mean and standard deviation and use the bell curve directly, where (under normality) about 99% of individual values lie within roughly 2.58 standard deviations of the mean.