Statistics Tutorial

This statistics tutorial is a guide to help you understand key concepts of statistics and how these concepts relate to the scientific method and research.

Scientists frequently use statistics to analyze [3] their results. Why do researchers use statistics? [4] Statistics can help us understand a phenomenon by confirming or rejecting a hypothesis. It is vital to how we acquire knowledge and to most scientific theories.

You don't need to be a scientist though; anyone wanting to learn about how researchers can get help from statistics may want to read this statistics tutorial for the scientific method.

What is Statistics? [5]

Research Data

This section of the statistics tutorial is about understanding how data is acquired and used.

The results of a science investigation often contain much more data or information than the researcher needs. This data-material, or information, is called raw data.

To be able to analyze the data sensibly, the raw data is processed [6] into "output data [7]". There are many methods to process the data, but basically the scientist organizes and summarizes the raw data into a more sensible chunk of data. Any type of organized information may be called a "data set [8]".

Then, researchers may apply different statistical methods to analyze and understand the data better (and more accurately). Depending on the research, the scientist may also want to use statistics descriptively [9] or for exploratory research.

What is great about raw data is that you can go back and check things if you suspect something different is going on than you originally thought. This happens after you have analyzed the meaning of the results.

The raw data can give you ideas for new hypotheses, since you get a better view of what is going on. You can also control the variables which might influence the conclusion [10] (e.g. third variables [11]). In statistics, a parameter [12] is any numerical quantity that characterizes a given population or some aspect of it.

Central Tendency and Normal Distribution

This part of the statistics tutorial will help you understand distribution, central tendency and how it relates to data sets [8].

Much data from the real world is normally distributed [13], that is, it follows a frequency curve, or frequency distribution [14], in which the most frequent values lie near the middle. Many experiments rely on assumptions of a normal distribution [15]. This is one reason why researchers very often measure the central tendency [16] in statistical research, such as the mean [17] (arithmetic mean [18] or geometric mean [19]), median [20] or mode [21].
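As a small illustration of these measures, the sketch below uses Python's standard `statistics` module. The sample values are made up for the example (hypothetical reaction times) and are not taken from any study.

```python
import statistics

# Hypothetical sample: reaction times in milliseconds (illustrative only).
sample = [210, 220, 220, 230, 240, 250, 310]

arithmetic_mean = statistics.mean(sample)            # sum divided by count
geometric_mean = statistics.geometric_mean(sample)   # n-th root of the product (positive data only)
median = statistics.median(sample)                   # middle value of the sorted data
mode = statistics.mode(sample)                       # most frequent value

print(arithmetic_mean, round(geometric_mean, 1), median, mode)
```

Note how the single large value (310) pulls the arithmetic mean above the median, which hints that this small sample is skewed to the right.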

The central tendency may give a fairly good idea about the nature of the data (mean, median and mode show the "middle value"), especially when combined with measurements on how the data is distributed. Scientists normally calculate the standard deviation [22] to measure how the data is distributed.

But there are various methods to measure how data is distributed: variance [23], standard deviation [24], standard error of the mean [25], standard error of the estimate or "range [26]" (which states the extremities in the data).
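The measures of spread listed above can be computed in a few lines. This is a minimal sketch with made-up numbers; it assumes the data is a sample (so the variance uses the n - 1 denominator), which may not fit every situation.

```python
import math
import statistics

# Hypothetical measurements (illustrative only).
sample = [4.1, 4.8, 5.0, 5.2, 5.9]

variance = statistics.variance(sample)        # sample variance (n - 1 denominator)
std_dev = statistics.stdev(sample)            # sample standard deviation = sqrt(variance)
sem = std_dev / math.sqrt(len(sample))        # standard error of the mean
data_range = max(sample) - min(sample)        # distance between the extremes

print(round(variance, 3), round(std_dev, 3), round(sem, 3), round(data_range, 1))
```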

To create the graph of the normal distribution for something, you'll normally use the arithmetic mean [18] of a "big enough sample [27]" and you will have to calculate the standard deviation.

However, the sampling distribution [28] will not be normally distributed if the distribution is skewed (naturally) or has outliers [29] (often rare outcomes or measurement errors) messing up the data. One example of a distribution which is not normally distributed is the F-distribution [30], which is skewed to the right.

So, often researchers double check that their results are normally distributed using range, median and mode. If the distribution is not normally distributed, this will influence which statistical test/method to choose for the analysis.

Other Tools

Quartile [31]
Trimean [32]
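A short sketch of both tools, using made-up data. Several conventions exist for computing quartiles; the "inclusive" method used here is just one of them, so other software may give slightly different values. The trimean is taken as (Q1 + 2 x median + Q3) / 4.

```python
import statistics

# Hypothetical, already sorted data (illustrative only).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Quartiles split the sorted data into four equally sized parts.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Trimean: a weighted average of the median and the two outer quartiles.
trimean = (q1 + 2 * q2 + q3) / 4

print(q1, q2, q3, trimean)
```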

Hypothesis Testing - Statistics Tutorial

How do we know whether a hypothesis is correct or not?

Why use statistics to determine this? [4]

Using statistics in research involves a lot more than making use of statistical formulas or getting to know statistical software.

Making use of statistics in research basically involves:

Learning basic statistics [33]
Understanding the relationship between probability and statistics [34]
Comprehension of the two major branches in statistics [35]: descriptive statistics [9] and inferential statistics [36].
Knowledge of how statistics relates to the scientific method [4].

Statistics in research is not just about formulas and calculation. (Many wrong conclusions have been drawn from not understanding basic statistical concepts.)

Statistical inference helps us to draw conclusions [10] from samples of a population.

When conducting experiments [37], a critical part is to test hypotheses [38] against each other. Thus, it is an important part of the statistics tutorial for the scientific method.

Hypothesis testing is conducted by formulating an alternative hypothesis [39] which is tested against the null hypothesis [40], the common view. The hypotheses are tested statistically [41] against each other.

The researcher can work out a confidence interval [42], which defines the limits within which a result is regarded as supporting the null hypothesis and beyond which the alternative research hypothesis [39] is supported.
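To make the idea concrete, here is a minimal sketch of a confidence interval for a sample mean. It assumes a normal approximation (a z-based interval), which is only reasonable for fairly large samples; for small samples a t-based interval would usually be preferred. The function name and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate (z-based) confidence interval for the mean of a sample."""
    center = mean(sample)
    sem = stdev(sample) / math.sqrt(len(sample))       # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)     # about 1.96 for 95 %
    margin = z * sem                                   # margin of error
    return center - margin, center + margin

# Hypothetical measurements (illustrative only).
scores = [98, 102, 101, 97, 105, 99, 100, 103, 96, 104]
low, high = mean_confidence_interval(scores)
print(round(low, 2), round(high, 2))
```

A higher confidence level gives a wider interval, since you demand more certainty that the interval covers the true mean.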

This means that not all differences between the experimental group and the control group [43] can be accepted as supporting the alternative hypothesis - the result needs to differ significantly statistically [44] for the researcher to accept the alternative hypothesis. This is done using a significance test [45] (another article [46]).

Caution though, data dredging [47], data snooping or fishing for data without later testing your hypothesis in a controlled experiment may lead you to conclude on cause and effect [48] even though there is no relationship to the truth [49].

Depending on the hypothesis, you will have to choose between one-tailed and two-tailed tests.
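The difference between the two kinds of test can be sketched with a z statistic. The example assumes the test statistic follows a standard normal distribution under the null hypothesis and that the one-tailed alternative predicts a positive effect; the function name is hypothetical.

```python
from statistics import NormalDist

def z_test_p_values(z):
    """Return (one-tailed, two-tailed) p-values for a z statistic."""
    std_normal = NormalDist()
    one_tailed = 1 - std_normal.cdf(z)              # only large positive z counts as evidence
    two_tailed = 2 * (1 - std_normal.cdf(abs(z)))   # deviations in either direction count
    return one_tailed, two_tailed

one_tailed, two_tailed = z_test_p_values(1.96)
print(round(one_tailed, 3), round(two_tailed, 3))
```

For a positive statistic, the one-tailed p-value is half the two-tailed one, which is why the direction of the hypothesis should be decided before looking at the data.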

Sometimes the control group is replaced with experimental probability [50] - often if the research treats a phenomenon which is ethically problematic [51], economically too costly or overly time-consuming, then the true experimental design [52] is replaced by a quasi-experimental approach [53].

Often there is a publication bias when the researcher finds the alternative hypothesis correct, rather than having a "null result", concluding that the null hypothesis provides the best explanation.

If applied correctly, statistics can be used to understand cause and effect between research variables [54].

It may also help identify third variables, although statistics can also be used to manipulate and cover up third variables [11] if the person presenting the numbers does not have honest intentions (or sufficient knowledge) with their results.

Misuse of statistics is a common phenomenon, and will probably continue as long as people have intentions about trying to influence others. Proper statistical treatment [55] of experimental data can thus help avoid unethical use of statistics. Philosophy of statistics [56] involves justifying proper use of statistics, ensuring statistical validity [57] and establishing the ethics in statistics [58].

Here is another great statistics tutorial [59] which integrates statistics and the scientific method.

Reliability and Experimental Error

Statistical tests make use of data from samples. These results are then generalized [60] to the general population. How can we know that they reflect the correct conclusion [10]?

Contrary to what some might believe, errors in research [61] are an essential part of significance testing [46]. Ironically, the possibility of a research error is what makes the research scientific in the first place. If a hypothesis cannot be falsified [62] (e.g. the hypothesis has circular logic), it is not testable [63], and thus not scientific, by definition.

If a hypothesis is testable, it must be open to the possibility of being wrong. Statistically this opens up the possibility of getting experimental errors [64] in your results due to random errors or other problems with the research. Experimental errors may also be broken down into Type-I error and Type-II error [61]. ROC Curves [65] are used to show the trade-off between the true positive rate (sensitivity) and the false positive rate.

A power analysis [66] of a statistical test can determine how large a sample a test will need in order to have an acceptable probability of rejecting a false null hypothesis [40] at a chosen significance level (the threshold for the p-value [67]).
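As a rough sketch of such a calculation, the function below estimates the sample size per group for a two-sided comparison of two group means. It relies on the normal approximation n = 2 * ((z_alpha + z_power) / d)^2, where d is the standardized effect size; an exact calculation based on the t-distribution gives slightly larger numbers. The function name and default values are assumptions for illustration.

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)   # critical value for the significance level
    z_power = std_normal.inv_cdf(power)           # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# A medium effect (d = 0.5) needs far fewer participants than a small one (d = 0.2).
print(sample_size_per_group(0.5), sample_size_per_group(0.2))
```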

The margin of error [68] is related to the confidence interval [42] and the relationship between statistical significance, sample size and expected results [69]. The effect size [70] estimates the strength of the relationship between two variables in a population. It may help determine the sample size [27] needed to generalize [60] the results to the whole population.
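One common effect size measure for the difference between two groups is Cohen's d, the difference in means divided by the pooled standard deviation. The tutorial does not name a specific measure, so treat this choice as an assumption; the data is made up.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

# Hypothetical experimental and control scores (illustrative only).
print(cohens_d([2, 4, 6], [1, 3, 5]))
```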

Replicating [71] the research of others is also essential to understand if the results of the research were a result which can be generalized or just due to a random "outlier experiment". Replication can help identify both random errors [72] and systematic errors [73] (test validity [74]).

Cronbach's Alpha [75] is used to measure the internal consistency or reliability [76] of a test score.
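A minimal sketch of the usual formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The data layout (one list of scores per item) and the example numbers are assumptions made for illustration.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per test item, each with one score per respondent."""
    k = len(items)
    item_variance_sum = sum(variance(item) for item in items)
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical questionnaire: 3 items answered by 5 respondents.
items = [
    [3, 4, 5, 2, 4],
    [3, 5, 4, 2, 5],
    [2, 4, 5, 3, 4],
]
print(round(cronbach_alpha(items), 3))
```

Values close to 1 indicate that the items vary together, which is what internal consistency means here.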

Replicating the experiment/research helps establish the reliability of the results statistically [77].

If the results have outliers, what you often see is a regression towards the mean [78], which can leave no statistically significant difference between the experimental and control group.

Statistical Tests

Here we will introduce a few statistical tests/methods commonly used by researchers.

Relationship Between Variables

The relationship between variables [79] is very important to scientists. This will help them to understand the nature of what they are studying. A linear relationship [80] is when two variables vary proportionally, that is, if one variable goes up, the other variable will also go up with the same ratio. A non-linear relationship [81] is when variables do not vary proportionally. Correlation [82] is a way to express the relationship between two data sets or between two variables.

Measurement scales [83] are used to classify, categorize and (if applicable) quantify variables.

Pearson correlation coefficient [84] (or Pearson Product-Moment Correlation) will only express the linear relationship between two variables. Spearman rho [85] is based on ranks, so it captures monotonic relationships (linear or not) and is mostly used when dealing with ordinal variables. Kendall's tau (τ) coefficient is also rank-based and can be used to measure monotonic nonlinear relationships.
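The difference between Pearson and Spearman can be seen on a relationship that is perfectly monotonic but not linear. The sketch below implements both by hand (Spearman as Pearson applied to ranks, with tied values sharing an average rank); the helper names and data are made up for the example.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]          # monotonic but not linear
print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```

Pearson stays below 1 because the points do not lie on a straight line, while Spearman reaches 1 because the ordering of the two variables agrees perfectly.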

Partial Correlation [86] (and Multiple Correlation) may be used when controlling for a third variable [11].

Predictions

The goal of predictions is to understand causes. Correlation does not necessarily mean causation [87]. With linear regression, you often measure a manipulated variable [88].

What is the difference between correlation and linear regression [89]? Basically, a correlational study [90] looks at the strength of the relationship between the variables [91] whereas linear regression is about the best fit line in a graph.
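The "best fit line" is usually found with ordinary least squares, which picks the slope and intercept that minimize the squared vertical distances to the points. A minimal sketch with made-up data (the function name is hypothetical):

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical data: manipulated variable x, measured outcome y.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
predicted = slope * 4 + intercept    # prediction for a new x value
print(slope, intercept, predicted)
```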

Regression analysis and other modeling tools

Linear Regression [92]
Multiple Regression [93]
A Path Analysis is an extension of the regression model.
A Factor Analysis [94] attempts to uncover underlying factors of something.
The Meta-Analysis [95] frequently makes use of effect size [70].

Bayesian Probability [96] is a way of predicting the likelihood of future events in an iterative way, updating the estimate as new evidence arrives, rather than measuring first and only then getting results/predictions.
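The updating idea can be sketched with Bayes' theorem: a prior probability is revised each time a piece of evidence is observed. The scenario (a hypothesis with a 1 % prior, and evidence that is much more likely when the hypothesis is true) and all numbers are invented for illustration.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior probability of a hypothesis after observing one piece of evidence."""
    numerator = p_evidence_if_true * prior
    evidence = numerator + p_evidence_if_false * (1 - prior)
    return numerator / evidence

# Start from a 1 % prior; the evidence occurs 90 % of the time if the
# hypothesis is true and 5 % of the time if it is false (made-up values).
belief = 0.01
for _ in range(2):                       # observe the same kind of evidence twice
    belief = bayes_update(belief, 0.90, 0.05)
    print(round(belief, 3))
```

Each observation moves the estimate further, so the posterior from one step becomes the prior for the next.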

Testing Hypotheses Statistically

Student's t-test [97] is a test which can indicate whether the null hypothesis [40] should be rejected or not. In research it is often used to test differences between two groups (e.g. between a control group [43] and an experimental group).

The t-test assumes that the data is more or less normally distributed and that the variance is equal (this can be tested by the F-test [98]).
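For two independent groups, the test statistic under these assumptions can be sketched as follows. The code computes only the t statistic and the degrees of freedom using a pooled variance (the equal-variance assumption); turning them into a p-value requires the t-distribution, for example from a statistics table or a library. The function name and data are hypothetical.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for two independent samples, equal variances assumed."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Hypothetical experimental and control measurements (illustrative only).
t, df = pooled_t_statistic([2, 4, 6], [1, 3, 5])
print(round(t, 3), df)
```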

Student's t-test [99]:

Independent One-Sample T-Test [100]
Independent Two-Sample T-Test [101]
Dependent T-Test for Paired Samples [102]

Wilcoxon Signed Rank Test [103] may be used for non-parametric data.

A Z-Test [104] is similar to a t-test, but will usually not be used on sample sizes below 30.

A Chi-Square [105] can be used if the data is qualitative rather than quantitative.
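As an illustration, the sketch below computes the chi-square statistic for a table of counts (a test of independence): each observed count is compared with the count expected if rows and columns were unrelated. Only the statistic and degrees of freedom are computed; the p-value would come from the chi-square distribution. The table values and function name are made up.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical counts: rows = group (treatment, control), columns = outcome (yes, no).
statistic, df = chi_square_statistic([[10, 20], [20, 10]])
print(round(statistic, 3), df)
```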

Comparing More Than Two Groups

An ANOVA [106], or Analysis of Variance, is used to test whether the means of groups differ, by comparing the variability between the groups with the variability within the groups. Unlike the t-test, Analysis of Variance can also be applied to more than two groups. The F-distribution [30] can be used to calculate p-values [67] for the ANOVA.
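A minimal sketch of the F statistic for a one-way ANOVA: the mean square between groups divided by the mean square within groups. A large F suggests that the group means differ by more than the scatter inside the groups would explain. The data and the function name are invented, and the p-value lookup in the F-distribution is left out.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of groups (lists of numbers)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([value for g in groups for value in g])
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the values around their own group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Three hypothetical groups (illustrative only).
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```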

Analysis of Variance

Nonparametric Statistics

Some common methods using nonparametric statistics [111]:

Cohen's Kappa [112]
Mann-Whitney U-test [113]
Spearman's Rank Correlation Coefficient [85]

Other Important Terms in Statistics

Discrete Variables [114]