AI-Therapy creates online self-help programs using the latest evidence-based treatments, such as cognitive behavioural therapy.
At the heart of research lies a question. For example, consider the following scenario: you just went for a run in the park, and you feel great. Naturally, you might ask yourself, "Does exercise make people happy?" If you are asking a question that you don't know the answer to, research is necessary to resolve it. This research can take many forms, from a literature review to performing an experiment. A technique known as statistical hypothesis testing is often used in psychology to determine a likely answer to a research question.
With hypothesis testing, the research question is formulated as two competing hypotheses: the null hypothesis and the alternative hypothesis. The null hypothesis is the default position that the effect you are looking for does not exist, and the alternative hypothesis is that your prediction is correct. The goal of hypothesis testing is to collect evidence and reject the null hypothesis if it appears unlikely to be true. In other words, if we reject the null hypothesis there is some experimental support for the alternative hypothesis (although it is important to keep in mind that we have not proved the alternative hypothesis is true).
Here are the hypotheses for our example:
Hypotheses can have a direction. In particular, a directional hypothesis not only states that an effect exists, but also states the direction of the effect. In the terminology of hypothesis testing, this is known as the number of tails of the hypothesis:
Due to naturally occurring variability, two separate measurements (even of the same phenomenon) will almost always give different results. For example, assume I measure my happiness after a run on Monday, and I measure it again after a run on Wednesday. It would not be surprising if the results were different each time, since there are many factors that impact mood. Therefore, the goal of hypothesis testing is not to see if there is any difference between sets of measurements (there almost always will be), but rather to see if the differences are unlikely to be due to random variation. If so, we can say that our result is statistically significant. The general procedure is as follows:
The goal of hypothesis testing is to select either the null hypothesis or the alternative hypothesis. However, no matter how careful you are with your experimental design, there is always a non-zero probability that you will come to the incorrect conclusion. There are two possible errors, depending on which hypothesis is actually true:
Which type of error is worse? Obviously, the impact of an error depends on many factors. However, generally speaking, a type I error (rejecting a null hypothesis that is actually true) is worse, since the trial is more likely to be published and instigate change. For example, if you are testing the efficacy of a new psychoactive drug, a type I error may result in the drug being released to the public. This is potentially dangerous, as you are exposing people to the risk of side effects for a drug that doesn't work.
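To make the idea of a type I error concrete, here is a minimal simulation sketch. It assumes a simple two-tailed z-test with a known standard deviation and a significance threshold of 0.05; neither the test nor the threshold comes from the text above, and the function names are hypothetical. Every simulated experiment is drawn from a population where the null hypothesis is true, so each rejection is a type I error.

```python
import random
from statistics import NormalDist, mean

def z_test_p_value(sample, sigma=1.0):
    """Two-tailed p-value for H0: the population mean is 0 (sigma assumed known)."""
    n = len(sample)
    z = mean(sample) / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(trials=2000, n=25, alpha=0.05, seed=0):
    """Fraction of experiments that reject H0 even though H0 is true."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    rejections = 0
    for _ in range(trials):
        # The null hypothesis holds by construction: the true mean is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_p_value(sample) < alpha:
            rejections += 1
    return rejections / trials

print(f"type I error rate: {false_positive_rate():.3f}")
```

Under these assumptions the printed rate should land close to the chosen threshold, which illustrates that the threshold is effectively the type I error rate you are willing to accept.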
With experimental research, the general strategy is to manipulate one aspect of the trial (the independent variable) and measure the impact on another aspect of the trial (the dependent variable). There are two primary methods of data collection:
The distinction between independent groups and same subject designs is important since different statistical tests are used for hypothesis testing. In general, same subject research designs have more statistical power since there are fewer sources of variation in the experiment. Note that a randomized controlled trial (RCT), the gold standard of clinical trials, combines both design types by having pre and post measures for both a control and a treatment group.
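The difference in statistical power between the two designs can be sketched with a small calculation. The happiness scores below are invented for illustration, and the two helper functions are hypothetical names; they compute a Welch-style t statistic (treating the measurements as two unrelated groups) and a paired t statistic (using each person's own change).

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical happiness scores (not from the source): the same six
# people measured before and after a run.
before = [4, 6, 5, 7, 3, 8]
after = [5, 7, 6, 9, 4, 9]

def independent_t(a, b):
    """Welch-style t statistic, treating a and b as unrelated groups."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se

def paired_t(a, b):
    """Paired t statistic, computed on each person's own change."""
    diffs = [y - x for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

print(f"independent-groups t: {independent_t(before, after):.2f}")
print(f"same-subject t:       {paired_t(before, after):.2f}")
```

With these made-up numbers the same-subject statistic is much larger, because the large differences between people cancel out once each person is compared only with themselves.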
The statistical tests on the following pages can be categorized as either parametric or non-parametric. Parametric tests make certain assumptions about the nature of the underlying data, while non-parametric tests are more general. Parametric tests tend to have more statistical power than their non-parametric counterparts, so should be used when applicable. However, if their assumptions are violated, they may give incorrect or misleading results.
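As an illustration of what a non-parametric test can look like, here is a minimal sketch of a two-tailed sign test for paired differences. The sign test is chosen here only as a simple example and is not necessarily one of the tests covered on the following pages; the function name is hypothetical. It uses only the sign of each difference, so it makes no assumption about the shape of the underlying distribution.

```python
from math import comb

def sign_test_p_value(differences):
    """Two-tailed sign test p-value for H0: positive and negative
    differences are equally likely. Zero differences are dropped."""
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    if n == 0:
        return 1.0  # no information either way
    positives = sum(1 for d in nonzero if d > 0)
    k = min(positives, n - positives)
    # Probability of a split at least this lopsided in one direction,
    # under a fair-coin (binomial, p = 0.5) model.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical paired changes in happiness score for six people.
print(sign_test_p_value([1, 1, 1, 2, 1, 1]))
```

Because the test ignores the size of each difference, it discards information that a parametric test would use, which is one way to see why non-parametric tests tend to have less statistical power.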
The choice between parametric and non-parametric tests depends on the intrinsic nature of the data, and is therefore outside the control of the experimenter. For this reason, you should always examine your data and conduct tests to verify the assumptions where appropriate.
The most common assumption of parametric tests is normality. Typically, this assumption applies to the sampling distribution rather than to the underlying data. This is good news, since it is usually satisfied for sufficiently large data sets (e.g. N > 30) due to the central limit theorem. In general, normality of the underlying data is sufficient, but not necessary, for the use of parametric tests.
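The role of the central limit theorem can be illustrated with a short simulation. This is a sketch under assumptions that are not in the text: the underlying data are drawn from an exponential distribution (which is strongly skewed), and skewness is used as a rough indicator of how far a distribution is from normal (a normal distribution has skewness 0). The function names are hypothetical.

```python
import random
from statistics import mean, pstdev

def skewness(values):
    """Sample skewness: roughly 0 for symmetric, bell-shaped data."""
    m = mean(values)
    s = pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

def simulate(n=30, repeats=5000, seed=1):
    """Return (skewness of raw data, skewness of means of samples of size n)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    raw = [rng.expovariate(1.0) for _ in range(repeats)]
    sample_means = [
        mean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(repeats)
    ]
    return skewness(raw), skewness(sample_means)

raw_skew, mean_skew = simulate()
print(f"skewness of raw data:     {raw_skew:.2f}")
print(f"skewness of sample means: {mean_skew:.2f}")
```

Even though the raw data are far from normal, the distribution of sample means is much closer to symmetric, which is what allows parametric tests to be used on reasonably large samples.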