The use of single-subject designs in psychology


A single-subject design is an experimental design in which one participant serves as both the control and the treatment condition. It is important to distinguish a single-subject design from a case study. Case studies also analyse one participant, but they are descriptive, often qualitative methods that are mainly used for forming hypotheses. A quantitative single-subject design is experimental, so it can be used to establish cause and effect.

A series of observations of one participant is made over time during various phases. In most cases an ABAB design is used, in which phases alternate between a baseline phase (A) and a treatment phase (B). A related design, the alternating treatments design discussed by Barlow and Hayes (1979), can be used to compare different treatments within one participant. In the baseline phase observations are made when there is no treatment, so this phase serves as the control condition for the participant. Once a treatment has been administered the treatment phase begins, and many observations are made to establish the effect of the treatment in comparison to the baseline phase. The treatment is then removed, returning the participant to the baseline phase. The aim of this second baseline phase is to establish whether the effects seen in the treatment phase were due to the intervention or to a confounding variable. If the effect was due to the intervention, the second baseline phase should be equivalent to the first, because the treatment has been removed. This does not always happen, for example when a treatment has long-lasting effects on a patient. After the second baseline phase the treatment is administered again, and this phase is compared with the first treatment phase to check that it was the treatment causing the effects seen there. The image below, taken from Horner, Carr, Halle, McGee, Odom and Wolery (2005), shows an example of an ABAB design.

Single-subject designs are not usually analysed using the traditional statistical methods used in many other kinds of research. Instead, the data are frequently presented graphically for visual inspection. The graph is assessed on 2 main features, its level and its trend. The level is the magnitude of the participant’s responses; when the level is consistent the points lie along an approximately horizontal line. A trend is present when the differences from one measurement to the next are consistent in direction and magnitude, which appears on a graph as points clustering along a sloping line. Both features are described in terms of stability, that is, how consistent the level or trend is within a phase.
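To make level and trend a little more concrete, here is a minimal Python sketch. It assumes that level can be summarised as the mean of a phase and trend as the least-squares slope across sessions; those are my simplifications rather than anything prescribed by the sources above, and the session counts are made-up numbers for a hypothetical ABAB study.

```python
from statistics import mean

def level(observations):
    """Level: the average magnitude of the responses within one phase."""
    return mean(observations)

def trend(observations):
    """Trend: least-squares slope of the responses across sessions 0, 1, 2, ..."""
    sessions = list(range(len(observations)))
    x_bar = mean(sessions)
    y_bar = mean(observations)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sessions, observations))
    sxx = sum((x - x_bar) ** 2 for x in sessions)
    return sxy / sxx

# Hypothetical counts of a target behaviour per session (invented for illustration).
phases = {
    "A1 (baseline)": [8, 9, 8, 9, 8],
    "B1 (treatment)": [6, 5, 4, 3, 2],
    "A2 (baseline)": [7, 8, 8, 9, 8],
    "B2 (treatment)": [5, 4, 3, 3, 2],
}

for name, obs in phases.items():
    print(f"{name}: level = {level(obs):.1f}, trend = {trend(obs):+.2f} per session")
```

A stable baseline should show a trend close to zero, while an effective treatment phase would show a change in level, a change in trend, or both. The numbers only support what you see on the graph; they do not replace visual inspection.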

A main practical use of the single-subject design is the N of 1 trial. This is a clinical trial in which a single patient serves as their own control. It can be run in a similar way to the ABAB design described above, or adapted to compare different forms of a drug or to test a drug against a placebo. These trials are flexible towards the individual, and the rate of success for each individual is reported to be much higher than with traditional group methods of testing (see Kravitz et al., 2008). However, the N of 1 trial is not used very often today, even though it has the potential to be much more effective than other methods of choosing a treatment. It is very costly and time-consuming to carry out a long-term, detailed examination of each individual’s reactions to certain drugs in order to provide the appropriate care. On the other hand, it could be argued that drugs are often prescribed on the basis of group research when an individual might benefit more from a different drug that is cheaper to produce, or might not need the drug for as long as the average patient, so money could also be saved. Kravitz et al. (2008) argue that it is unjust that methods such as X-rays are used for diagnostic precision while clinicians will not readily use the N of 1 trial to increase therapeutic precision and to facilitate modern clinical care.

Single-subject designs have several advantages, the main one being that they have the potential to increase the success of treatments for individuals. Cause and effect can be established using this method, and only 1 participant is needed, which reduces the need for the standardized treatments used in group research. The design is also helpful for observing long-lasting effects, which would not usually be tested in group research. However, there are also many limitations. As the research uses only one participant it is difficult to generalise, because individual differences could have a major impact on the effectiveness of the intervention. Multiple observations can also lead to sensitization and carry-over effects, which can alter the measurements of behaviour. Reliance on a graph for interpretation is a further limitation, as the reading of the graph can vary between individuals and large, immediate changes may be needed before any effect is perceived. However, it is possible to carry out statistical analysis alongside the visual representation.

So, to conclude, single-subject designs can be very useful, especially in the clinical field, for finding cause-and-effect relationships using few participants and for establishing longer-term results than are usually investigated with traditional group methods. However, it is hard to generalise these results, which limits the impact of the research on a larger population.

For more information see Gravetter & Forzano (2009), Research methods for the behavioural sciences, chapter 14.

The last blog and comments for the semester. Yay!



This week’s blog is here:

My comments for this week can be found here:

Thankfully these are the last set of blog and comments for this semester. Yay. =]

Merry Christmas and have a good new year!

Is Zimbardo’s Stanford Prison Experiment really that unethical?


Ethics are very important when conducting research with both animal subjects and human participants. They are used to ensure the safety of the participants and to establish sensible boundaries for research. Psychologists must abide by the Code of Ethics and Conduct (BPS, 2009), which consists of 4 main principles: respect, competence, responsibility, and integrity, each of which has its own set of values. There are 6 values often considered the most important.

Informed consent- Participants must be fully informed about what is expected of them during the study and what will happen. This information must be understood, not just presented, before voluntary consent is obtained. Issues can arise when deception is used.

Right to withdraw- Participants should be aware that they can leave the study at any point with no penalty and request that their data is not used.

Confidentiality- Participants’ data must be kept confidential and only accessible to investigators involved with the study.

Deception- Researchers can withhold certain information from participants when this is required to maintain the integrity of the study. Any deception must be justified and explained to participants in the debriefing.

Debrief- Participants must be informed about the nature of the study, and if deception has been used this must be fully explained in order to ensure that participants leave in the same psychological state as they arrived.

Protection from harm- Researchers must eliminate potential risks to maintain the physical and psychological well-being of the participants. The effects of age, disability, religion etc. must be taken into account.

The Stanford Prison Experiment (SPE) is considered very controversial, as the results were really valuable to the progression of psychological research, but there are also many criticisms about it, mainly concerning ethics.  This study is considered the 8th most unethical psychological study carried out (Listverse, 2008).  But is the study really that unethical, or are the results just so shocking that we want to find a way to disregard the research? Using the criteria described above I shall explain why this experiment is not as unethical as many people believe.

Informed consent- Each participant was given an information sheet about the study and told that if selected as a prisoner they would have their usual rights taken away and would be living in minimally acceptable conditions. After completing a consent form they also had a preliminary interview, in which participants who had anxiety issues or similar were encouraged not to participate because of the effects of the study. Informed consent was therefore obtained.

Right to withdraw- Participants were informed that they could leave the study if they wished to do so. However, as they were acting in a prison environment, they were told that it would not appear as if they could easily leave, so withdrawal had to be done through established procedures. Participants forgot this while taking part: the ringleader of the prisoners attempted to leave but misinterpreted what he was told, and so believed that none of them could leave. Participants were therefore initially aware of their right to withdraw, even if they did forget this during the study.

Confidentiality- All information from the study was coded to ensure confidentiality. Participants were also asked to complete a release form for their video footage to be used. Prisoners were also known by their ID numbers during the experiment, so they remained anonymous to other members and to those who view the video footage. This means that confidentiality was maintained.

Deception- There was no deception used, as participants were informed that their usual rights would be taken away and told about the conditions they would be living in. They were also informed about the nature of the study and told the rules that they must abide by. In hindsight it may be considered that participants did not know everything that would happen in the study. However, the researchers did not anticipate the type of behaviour that occurred, so they were not withholding information or deceiving participants.

Debrief- All participants were given a full debrief after the study, in which they were told what was expected to be found and why the study was finished early. Their psychological state was also analysed.

Protection from harm- A preliminary interview and assessments were carried out to establish who was not fit to participate in the study, thus reducing potential risks. Not all risks in the study could be reduced, as they were not anticipated. Participants’ psychological state was tested at the end of the study and on many occasions since, and it has been found that none have suffered from long-term trauma. Therefore, although participants were exposed to distress during the study, it was temporary rather than permanent, so may be considered acceptable.

The SPE was proposed to a human review committee to evaluate the ethics of the study before it began, and the experiment gained approval. 2 years after the study was conducted the APA evaluated its ethics and concluded that ethical guidelines had been followed.

We must take into consideration that ethical guidelines were not as thorough when the study was conducted, and it is experiments like the SPE that have led us to believe that humans are more fragile than previously expected. However, even when today’s ethical guidelines are applied, it can be suggested that the study was not as unethical as many people make it out to be, and that it does follow the BPS guidelines. It could be suggested that participants were not fully protected from harm, but as this distress was only temporary, do the ends justify the means? If it wasn’t for experiments like these we would not have the strict ethical guidelines we have today!

For those of you who are particularly interested in the SPE the video below is a BBC documentary about the study which includes talking to Zimbardo and the participants from the study. You can also find a lot of information about the study on this website.

Blog and Comments for weeks 8/9 =]



My blog is here:

My comments this week can be found here:

What is validity and why is it important in research?


Validity is described as the degree to which a research study measures what it intends to measure.  There are two main types of validity, internal and external.  Internal validity refers to the validity of the measurement and test itself, whereas external validity refers to the ability to generalise the findings to the target population.  Both are very important in analysing the appropriateness, meaningfulness and usefulness of a research study.  However, here I will focus on the validity of the measurement technique (i.e. internal validity).

The 4 main types of validity

There are 4 main types of validity used when assessing internal validity.  Each type views validity from a different perspective and evaluates different relationships between measurements.

Face validity- This refers to whether a technique looks as if it should measure the variable it intends to measure. For example, a method in which a participant is required to click a button as soon as a stimulus appears, and the time taken is measured, appears to have face validity for measuring reaction time. An example of analysing research for face validity by Hardesty and Bearden (2004) can be found here.

Concurrent validity- This compares the results from a new measurement technique to those of a more established technique that claims to measure the same variable, to see if they are related. Two measurements will often behave in the same way without necessarily measuring the same variable, so this kind of validity must be examined thoroughly. An example and some weaknesses associated with this type of validity can be found here (Shuttleworth, 2009).

Predictive validity- This is when the results obtained from measuring a construct can be accurately used to predict behaviour. There are obvious limitations to this, as behaviour cannot be predicted in great depth, but this validity helps predict basic trends to a certain degree. A meta-analysis by van IJzendoorn (1995) examines the predictive validity of the Adult Attachment Interview.

Construct validity- This is whether the measurements of a variable in a study behave in exactly the same way as the variable itself. Assessing it involves examining past research regarding different aspects of the same variable. The use of construct validity in psychology is examined by Cronbach and Meehl (1955) here.

*Definitions taken from Research Methods for the Behavioural Sciences by Gravetter and Forzano (2009)*

A research study will often have one or more of these types of validity, but perhaps not all of them, so caution should be taken. For example, using weight as a measure of height has concurrent validity, as weight generally increases as height increases. However, it lacks construct validity, as weight fluctuates with food deprivation whereas height does not.

What are the threats to Internal Validity?

Factors that can affect internal validity come in many forms, and it is important that these are controlled for as much as possible during research to reduce their impact on validity. The term history refers to effects that are not related to the treatment but may result in a change of performance over time. This could refer to events in the participant’s life that have led to a change in their mood etc. Instrumental bias refers to a change in the measuring instrument over time which may change the results. This is often evident in behavioural observations, where the practice and experience of the experimenter influence their ability to notice certain things and change their standards. A main threat to internal validity is testing effects. Participants can often become tired or bored during an experiment, and previous tests may influence their performance. Task order is often counterbalanced in experimental studies, so that participants receive the tasks in different orders, to reduce the impact of these effects on validity.
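As a rough illustration of counterbalancing, the Python sketch below gives every possible task order to an equal number of participants (full counterbalancing). The task labels, the participant IDs and the choice of 12 participants are all hypothetical, chosen only so that the numbers divide evenly.

```python
from collections import Counter
from itertools import permutations

tasks = ["task A", "task B", "task C"]              # hypothetical task labels
orders = list(permutations(tasks))                  # all 6 possible task orders
participants = [f"P{i:02d}" for i in range(1, 13)]  # 12 hypothetical participant IDs

# Cycle through the orders so each one is used by the same number of participants.
assignment = {p: orders[i % len(orders)] for i, p in enumerate(participants)}

# Check the balance: how often does each task appear in each position?
position_counts = Counter(
    (task, position)
    for order in assignment.values()
    for position, task in enumerate(order, start=1)
)

for participant, order in assignment.items():
    print(participant, "->", ", ".join(order))
```

With 12 participants and 6 orders, each task ends up in each position equally often, so fatigue and practice effects are spread evenly across tasks rather than removed. In a real study you would probably also randomise which participant gets which order.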

So why is validity important?

If the results of a study are not deemed to be valid then they are meaningless to our study.  If it does not measure what we want it to measure then the results cannot be used to answer the research question, which is the main aim of the study.  These results cannot then be used to generalise any findings and become a waste of time and effort.  It is important to remember that just because a study is valid in one instance it does not mean that it is valid for measuring something else.

Validity’s relationship to Reliability

It is important to ensure that validity and reliability do not get confused.  Reliability is the consistency of results when the experiment is replicated under the same conditions, which is very different to validity.  These two evaluations of research studies are independent factors, therefore a study can be reliable without being valid, and vice versa, as demonstrated here (this resource also provides more information on types of validity and threats).  However, a good study will be both reliable and valid.

So to conclude, validity is very important in a research study to ensure that our results can be used effectively, and variables that may threaten validity should be controlled as much as possible.

For Simon, weeks 4/5


My blog for this week is here:

Comments I have made this week are:

Outliers: the good, the bad, and the ugly


Outliers are the really annoying data points that all researchers hope they won’t have in their data, although they would be lucky to manage this.  And why is this?  Because they’re basically just a pain and can threaten the validity of our data if treated incorrectly, or not at all.

Before going into the details of why it is important to pay attention to outliers, let’s first look at what one is. The basic definition of an outlier is that it is an extreme value that does not follow the norm, or the pattern of the majority of our data. For example, the graph shown below shows a set of data which has a strong correlation or pattern, but one data point is positioned away from the rest of the points, so it is the outlier.

To start with the good news: outliers can sometimes be helpful (I know this is very hard to believe). Once outliers have been identified they can be looked at more closely, which can lead to some unexpected knowledge and can show more about individuals that do not fit the ‘norm’. They can also be used to reveal errors within the research model. For example, if there is a form of measurement error, such as the data being recorded incorrectly, or a participant has not understood the instructions, then these are possible causes of outliers that the researcher could modify their study to exclude in the future.

However, now we come to the bad news. Outliers are more often than not seen as a problem rather than a help. Not only can they suggest that the data were taken from a different population than the intended one, thus affecting external validity, but they can also cause problems with analysis. An outlier can distort results, such as by dragging the mean in a certain direction, and can lead to faulty conclusions being made.

Detecting these outliers can be a very complicated task and can sometimes require assumptions to be made about the parameters of the data and the expected distribution. For example, outliers are often detected as values that lie outside a certain range, but how is this range calculated? Sometimes it is based on the expected standard deviation, and sometimes on the interquartile range, allowing a multiple of it beyond the lower and upper quartiles. A box plot shows this second method visually. Other visual aids, such as a scatter graph, as shown earlier, can also help identify outliers. Outliers are particularly problematic in categorical data, where they are more complicated to deal with. I won’t go into that here, but this article explains more about one method of dealing with them, called wavelet transforms.
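The interquartile range approach can be sketched in a few lines of Python. The multiple of 1.5 used here is the common box plot convention rather than anything fixed by the method, and the scores are made-up numbers. Note also that different software calculates quartiles slightly differently, so the exact cut-offs can vary a little between packages.

```python
from statistics import quantiles

def iqr_outliers(data, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = quantiles(data, n=4)   # lower quartile, median, upper quartile
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]

# Hypothetical test scores with one suspiciously large value.
scores = [12, 14, 15, 15, 16, 17, 18, 19, 45]
print(iqr_outliers(scores))
```

Raising `k` makes the rule stricter about what counts as an outlier, so fewer points get flagged.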

Now it’s time to brace yourselves as we come to the ugly part: working out what to do with these outliers. There are many methods of doing this, which include both leaving them in the data and taking them out. An important thing to do before modifying your data is to work out how influential a data point is on the data as a whole. This can be done by using the adjusted predicted value. This involves removing the suspected point from the data, fitting a new model to the remaining data, and using that model to predict what the removed data point should be. If the difference between the predicted value and the actual value is large then the point is influential and can affect your results. For more info on this and other methods please see ‘Discovering Statistics using SPSS’ by Andy Field, chapter 5.
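Here is a minimal Python sketch of the adjusted predicted value idea, assuming the model is a simple straight-line regression with one predictor. The function names and the data are my own invention for illustration, not taken from Field or SPSS.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def adjusted_prediction_gap(xs, ys, i):
    """Fit the line without point i, then compare its prediction for point i
    with the value that was actually observed."""
    xs_rest = xs[:i] + xs[i + 1:]
    ys_rest = ys[:i] + ys[i + 1:]
    slope, intercept = fit_line(xs_rest, ys_rest)
    adjusted_predicted = slope * xs[i] + intercept
    return ys[i] - adjusted_predicted

# Hypothetical data: five points on a straight line plus one suspicious point.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8, 10, 30]

for i in range(len(xs)):
    gap = adjusted_prediction_gap(xs, ys, i)
    print(f"point {i}: observed minus adjusted prediction = {gap:+.2f}")
```

A point whose gap is much larger than the others is a candidate for being influential. How large is "large" is a judgement call, which is why standardised versions of this difference are usually used in practice.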

Extreme values can be set aside by some methods of analysis, such as when using z-scores. This involves keeping the middle 95% or 99% of the data (depending on the researcher’s preference), which the ‘normal population’ should fall within. Any extreme values then fall in the remaining percentage that is excluded from the calculations, so they have little impact on the results. Another way in which the impact of outliers can be reduced is by using accommodation or robust methods. These use non-parametric statistics, such as the median, and can use simple estimation tasks, so that the impact of extreme values is lessened. Transformation is another method that can be used to work with outliers. This involves taking the logs of all of the data and using these instead of the raw data, as this reduces the skew. If none of these methods can appropriately be used to accommodate outliers then it could be justifiable to remove them from the data altogether, but this is only done as a last resort.
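The three approaches in the paragraph above can be sketched in Python as follows. The cut-off of 1.96 assumes roughly normal data and corresponds to keeping the middle 95%, and the scores are made-up numbers.

```python
from math import log
from statistics import mean, median, stdev

def z_scores(data):
    """Express each value as a number of standard deviations from the mean."""
    m, s = mean(data), stdev(data)
    return [(x - m) / s for x in data]

# Hypothetical scores with one extreme value.
data = [12, 14, 15, 15, 16, 17, 18, 19, 45]

# 1. z-scores: keep only values within the middle 95% of a normal distribution.
cutoff = 1.96
kept = [x for x, z in zip(data, z_scores(data)) if abs(z) <= cutoff]
print("kept after z-score screening:", kept)

# 2. Robust statistics: the median is pulled around far less than the mean.
print("mean:", mean(data), "median:", median(data))

# 3. Transformation: taking logs pulls the extreme value closer to the rest.
logged = [log(x) for x in data]
print("largest raw value / median:", max(data) / median(data))
print("largest logged value / median:", max(logged) / median(logged))
```

One caution is that an extreme value also inflates the mean and standard deviation that its own z-score is calculated from, so in small samples z-score screening can miss outliers that the interquartile range method would catch.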

So now that we’ve looked at identifying and dealing with outliers, I think we can safely say it’s okay to really hate them and start panicking about the prospect of having these in our dataset.  However, I hope I have also made it clear that just because your data has outliers it does not mean it’s the end of the world as they can be managed, and provided they are dealt with correctly your data can still provide valid results.

For more information about the whole process of dealing with outliers see this article here.