Significance Test

The significance test is the process researchers use to decide whether the null hypothesis should be rejected in favor of the alternative research hypothesis, or retained.

The test involves comparing observed values with the values expected under theory. It establishes whether there is a genuine relationship between the variables, or whether pure chance could have produced the observed results.

In everyday language, 'significance' means that something is extremely important. For example, an office manager might state that a new computer system has 'significantly' improved the efficiency of their staff. Here the term is used loosely, expressing a belief that the new system is an important factor in improving workflow.

Probabilities

Science is much stricter in its definition of significance, and uses it as a benchmark. For most scientific research, a statistical significance test quantifies how unlikely it is that the results arose by chance, allowing a rejection of the null hypothesis (H0) when that probability is small enough. There are a number of strict conventions governing the accepted level of significance.

The easiest way to look at this is by using a percentage. For example, a weather forecaster, after inputting data into a computer, might state that there is an 80% chance of rain the next day, rather than saying that it will definitely rain. They have an 80% level of confidence in their predictions.

Reversing the Principle

There are two commonly accepted levels of confidence in the results of significance tests. The first is the 95% level, meaning the researcher accepts only a 5% chance that the null hypothesis has been wrongly rejected. The second is the stricter 99% level, used less often, which leaves only a 1% chance of such an error and indicates very strong support for rejecting the null in favor of the alternative hypothesis.

Biology, and other imprecise sciences, tend to accept the 95% benchmark, due to the variety of unknown variables influencing results. Physical sciences, with robust systems of measurement, usually demand the higher level of significance.

Scientists usually turn these figures around. Instead of quoting 95%, they quote 5%, meaning that there is only a 5% chance that the results were produced by chance alone. If the test meets this threshold, they reject the null hypothesis and support the alternative. Falsifiability ensures that the alternative hypothesis is never completely proven; the test only establishes that the null can be rejected.
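
As a rough illustration, the sketch below runs an independent-samples t-test in Python and applies the 5% rule described above. The group data, variable names, and the choice of a t-test are purely illustrative assumptions, not part of the original article.

```python
# A minimal sketch of a two-sample significance test (hypothetical data).
from scipy import stats

# Hypothetical measurements from a control group and a treatment group.
control = [14.2, 15.1, 13.8, 14.9, 15.4, 14.6, 15.0, 14.3]
treatment = [15.9, 16.4, 15.2, 16.8, 15.7, 16.1, 16.5, 15.8]

# An independent-samples t-test returns the test statistic and a P value.
t_stat, p_value = stats.ttest_ind(control, treatment)

alpha = 0.05  # the conventional 95% confidence level, expressed as a probability
print(f"P = {p_value:.4f}")

if p_value < alpha:
    print("Reject the null hypothesis (H0) at the 5% level.")
else:
    print("Fail to reject the null hypothesis (H0).")
```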

Statistical Tests

For reasons of mathematical convenience and convention, scientists rarely express these results as percentages. A statistical significance test normally produces a value between 0 and 1: on the scale used here, a value close to 0 indicates no detectable correlation between the variables, while a value close to 1 indicates that chance is very unlikely to be involved. Values of exactly 0 or 1 are, of course, unattainable and merely mark the extreme ends of the scale.

For example, a statistical test might yield a score of 0.825. This indicates that there is a 17.5% probability that any link between the variables was due to chance; the score is simply one minus the probability, P, that chance alone produced the result. Thus the P < 0.05 level, or a score of > 0.95, corresponds to a 95% level of confidence in the results: there is less than a 5% chance that the result was a coincidence (assuming a one-tailed test). The P < 0.01, or score > 0.99, level leaves only a 1% chance that randomness produced the results.
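
A tiny worked example of that conversion, using the 0.825 figure from the paragraph above (the threshold checks simply restate the two conventional levels):

```python
# The article's 0.825 example: score = 1 - P, so P = 1 - score.
score = 0.825
p_value = 1 - score
print(round(p_value, 3))              # 0.175, i.e. a 17.5% probability of chance alone

# The two conventional thresholds, expressed both ways:
print(p_value < 0.05, score > 0.95)   # False False -> not significant at the 5% level
print(p_value < 0.01, score > 0.99)   # False False -> not significant at the 1% level
```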

If your P value is less than the chosen threshold, 0.05 or 0.01, you should reject the null hypothesis in favor of the alternative. If P is greater than the threshold, you should not reject the null.

Conclusion

In practice, students of science rarely achieve these levels of confidence with their significance tests. If, for example, an experiment generates a P value of 0.1, the results are not significant. They do suggest, however, that refining the hypothesis or the research design may yield better results. For example, if the experiment used small sample groups, a larger study may produce a higher level of confidence.
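
The simulation below sketches this effect with made-up population parameters: both studies sample from populations with the same small real difference, but only the larger study tends to push the P value below 0.05. The numbers are illustrative assumptions, not data from any real experiment.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)

def significant_fraction(n, repeats=500, alpha=0.05):
    """Fraction of simulated studies with n subjects per group that reach P < alpha."""
    hits = 0
    for _ in range(repeats):
        group_a = rng.normal(loc=10.0, scale=1.0, size=n)
        group_b = rng.normal(loc=10.3, scale=1.0, size=n)   # small real effect
        if stats.ttest_ind(group_a, group_b).pvalue < alpha:
            hits += 1
    return hits / repeats

print("n = 10 :", significant_fraction(10))    # typically around 0.1 - rarely significant
print("n = 200:", significant_fraction(200))   # typically around 0.85 - usually significant
```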

On the other hand, if P were 0.5, giving a 50% probability that the results were due to chance, then the experiment or the hypothesis is fatally flawed, and it would be preferable to pursue another line of research. These confidence levels are arbitrary, but they give a sound basis for what counts as accepted evidence. Whilst Type I and Type II errors are still possible, continued replication of the experiment greatly reduces the chance of false positives and false negatives.
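
The short simulation below, again with assumed parameters, illustrates why replication helps: when the null hypothesis is actually true, roughly 5% of single experiments come out "significant" by chance (Type I errors), whereas requiring an independent replication to also reach significance makes such false positives far rarer.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=42)

def one_experiment(n=30):
    # Both groups come from the same population, so H0 is actually true.
    a = rng.normal(0, 1, n)
    b = rng.normal(0, 1, n)
    return stats.ttest_ind(a, b).pvalue < 0.05

trials = 5000
single = sum(one_experiment() for _ in range(trials)) / trials
replicated = sum(one_experiment() and one_experiment() for _ in range(trials)) / trials

print(f"False positive rate, single experiment : {single:.4f}")      # around 0.05
print(f"False positive rate, with replication  : {replicated:.4f}")  # around 0.0025
```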
