
Outliers: the good, the bad, and the ugly


Outliers are the really annoying data points every researcher hopes to avoid, though few are lucky enough to manage it.  And why is this?  Because they are basically just a pain to deal with and, if treated incorrectly, or not treated at all, can threaten the validity of our results.

Before going into the details of why it is important to pay attention to outliers, let's first look at what one is.  The basic definition of an outlier is an extreme value that does not follow the norm, or the pattern of the majority of our data.  For example, the graph shown below shows a set of data with a strong correlation or pattern, but one data point is positioned well away from the rest of the points: that point is the outlier.

To start with the good news: sometimes outliers can be helpful (I know this is very hard to believe).  Once outliers have been identified they can be looked at more closely, which can lead to some unexpected knowledge and show more about the individuals who do not fit the 'norm'.  They can also be used to reveal errors within the research design.  For example, measurement error, such as data being recorded incorrectly, or a participant misunderstanding the instructions, is a possible cause of outliers that the researcher could modify their study to exclude in the future.

However, now we come to the bad news.  Outliers are more often seen as a problem than a help.  Not only do they suggest that some of the data came from a different population than the intended one, affecting external validity, but they can also cause problems with analysis.  An outlier can distort results, for instance by dragging the mean in a particular direction, and can lead to faulty conclusions being drawn.

Detecting these outliers can be a complicated task and sometimes requires assumptions to be made about the parameters of the data and its expected distribution.  For example, outliers are often flagged as values that lie outside a certain range, but how is this range calculated?  Sometimes it is based on the expected standard deviation, or by calculating the interquartile range and allowing a multiple of it at either end of the spectrum; this is the method a box plot uses as well.  Other visual aids, such as a scatter graph like the one shown earlier, can also help identify outliers.  Outliers in categorical data are particularly problematic as they are more complicated to deal with; I won't go into that here, but this article explains more about one method of dealing with them called wavelet transforms.
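As a rough sketch of the interquartile-range method described above (the same fences a box plot draws), using some made-up scores rather than real data:

```python
import statistics

def iqr_outliers(data, k=1.5):
    """Flag values beyond k * IQR outside the quartiles (Tukey's fences)."""
    q1, _, q3 = statistics.quantiles(data, n=4)   # sample quartiles
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]

# hypothetical test scores; 41 sits well away from the rest
scores = [12, 14, 13, 15, 14, 13, 16, 15, 14, 41]
print(iqr_outliers(scores))  # [41]
```

The multiplier k = 1.5 is the conventional box-plot choice; a larger k flags only more extreme points.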

Now it's time to brace yourselves as we come to the ugly part: working out what to do with these outliers.  There are many ways of doing this, which include both leaving them in the data and taking them out.  An important thing to do before modifying your data is to work out how influential a point is on the analysis as a whole.  This can be done using the adjusted predicted value: remove the suspected point from the data, fit a new model without it, and use that model to predict what the data point should be.  If the difference between the predicted value and the actual value is large, then the point is influential and can affect your results. For more info on this and other methods please see 'Discovering Statistics Using SPSS' by Andy Field, chapter 5.
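A minimal sketch of that leave-one-out idea, with a hand-rolled least-squares line and invented numbers (the helper names here are my own, not Field's):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def adjusted_predicted_value(xs, ys, i):
    """Refit the model with point i removed, then predict that point."""
    a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
    return a + b * xs[i]

xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 30.0]   # the last point looks suspect
gap = abs(ys[5] - adjusted_predicted_value(xs, ys, 5))
print(gap)  # a large gap suggests an influential point
```

Here the model fitted without the suspect point predicts roughly 12 for it, while the observed value is 30, so the point is highly influential.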

The influence of extreme values can also be reduced by some methods of analysis, such as trimming based on z-scores.  This involves keeping the middle 95% or 99% of the data (depending on the researcher's preference), which the 'normal population' should fall within; any extreme values fall in the remaining percentage excluded from the calculations, so have little impact on the results.  Another way the impact of outliers can be reduced is by using accommodation, or robust, methods.  These use non-parametric statistics, such as the median, so that the impact of extreme values is lessened.  Transformation is another method that can be used: taking the logs of all of the data and analysing these instead of the raw values, which reduces the skew.  If none of these methods can appropriately accommodate the outliers, then it could be justifiable to remove them from the data altogether, but this is only done as a last resort.
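A rough sketch of two of these options, using invented reaction times: a z-score cut-off, and the median as a robust alternative to the mean:

```python
import statistics

def zscore_outliers(data, cutoff=1.96):
    """Values more than `cutoff` SDs from the mean (1.96 keeps the middle ~95%)."""
    mean, sd = statistics.mean(data), statistics.stdev(data)
    return [x for x in data if abs(x - mean) / sd > cutoff]

# hypothetical reaction times in ms; one participant nodded off
times = [310, 295, 330, 305, 320, 315, 300, 1250]
print(zscore_outliers(times))     # [1250]
print(statistics.mean(times))     # dragged up to ~428 by the outlier
print(statistics.median(times))   # 312.5 -- barely moved by it
```

The gap between the mean and the median shows exactly the distortion the post describes: one extreme value shifts the mean by over 100 ms but leaves the median almost untouched.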

So now that we've looked at identifying and dealing with outliers, I think we can safely say it's okay to really hate them and to start panicking at the prospect of finding them in our dataset.  However, I hope I have also made it clear that outliers in your data are not the end of the world: they can be managed, and provided they are dealt with correctly your data can still provide valid results.

For more information about the whole process of dealing with outliers see this article here.


Is the use of scientific method necessary in psychological research?


The debate regarding whether psychology is a science has been ongoing for many years, as those who remember the module from last year will recall.  Many psychologists believe that it is the use of the scientific method in research that classifies the discipline as a science.  However, what does the scientific method involve, and is this the only or best way that psychology should be studied?  Evidence suggests otherwise.

There are various ways that the scientific method can be defined, whether it is through the use of structured experimentation, enforcing the principle of parsimony, or Popper’s use of falsification.  The general definition taught to students like ourselves is that the scientific method involves forming an objective hypothesis, testing this using structured empirical methods, and evaluating the data to either support, reject or modify the hypothesis (Gravetter & Forzano, 2008).  Popper believed that a hypothesis can never be proven correct, as not every case could possibly be tested, so in order for a hypothesis to be scientific it must have the potential to be falsified.

In psychology both qualitative and quantitative research methods are used; however, many people question whether qualitative research violates the rules of the scientific method.  Based on the features described above, I am inclined to say it does not.  Although qualitative research does not usually manipulate variables or run experiments in laboratory settings, that does not make it any less structured or objective than quantitative methods.  There are also many similarities between these two types of research, such as the selection of samples, the requirement of validity, and the use of general and specific hypotheses, as described in more detail by Willig, chapter 9. As it can be hard to quantify characteristics of behaviour, and the processes of research share many similarities, I believe qualitative research does follow the scientific method.

So is the scientific method the only way in which we should do research in psychology? No, it is not.  Many important discoveries have been made by pure observation rather than by testing a hypothesis.  For example, B. F. Skinner began investigating behavioural processes without a hypothesis: he simply observed the behaviour to see what would happen.  Throughout the study he kept noticing new things about the behaviour of his rats, so he kept modifying his techniques and starting new experiments as he made new discoveries.  This investigation led to the invention of the Skinner box and to the principles of reinforcement.  The research described was not done using the scientific method, as there was no objective hypothesis or structured experimentation, but it did lead to important discoveries that have had a huge impact on psychology. Skinner describes his investigations in more detail here.

Equally, the use of the scientific method does not guarantee that a study will be free of credibility issues; phrenology is a good example.  In the late 1700s Franz Gall hypothesised that different parts of the brain influence different functions, and that he could investigate this by measuring bumps on the outside of the skull and connecting their size to the development of different functions.  His work helped popularise the idea of cerebral localisation and has influenced much cognitive investigation since.  However, his method of identifying these areas was not valid, so even when the scientific method is employed, it does not ensure investigations will be carried out correctly. This article provides more information about the history and theories of phrenology and Gall.

Lewin developed a model of research similar to the scientific method that involves the active participation of the researcher, who applies their results in the real world; it is known as the Action Research Model.  He believed that the scientific method works best when a limited number of variables are being investigated, but that because behaviour is so complex, with so many influences acting upon it, the scientific method alone could not investigate the uniqueness of individual characteristics to the desired quality.  The similarities and differences between the action research model and the scientific method are analysed in this article.

So has our need to be classified as a science restricted our views and forced us to reject other methods of research that may be just as effective, if not more so, than the scientific method?  Or could these methods be used alongside the scientific method to optimise our understanding even more? Is it possible that the need for high objectivity and a hypothesis has focused us too narrowly on the attributes and behaviours we set out to investigate, causing us to miss unique mannerisms and patterns in characteristics we hadn't initially planned to study?  These are all questions that need to be considered thoroughly when contemplating whether the scientific method is the best research approach for psychology.

Do you need statistics to understand your data?


Statistics come in many shapes and forms, whether descriptive, such as the mean, which provides a simple summary of the data; inferential, which are more complex; or visual representations such as graphs.  All of these types of statistics are used within psychology to enhance our understanding of the raw data in a research study by transforming it into a more manageable form.

It is a common misconception that statistics are never used to analyse qualitative data because the data are non-numerical.  This is not always the case.  Often, qualitative data are collected via observation and used to write a report based on the findings, leaving them in non-numerical form, as in case studies; a case study that had a huge impact on the field of psychology is Freud's analysis of Anna O.  On many other occasions, however, qualitative data are collected at the nominal level, and are therefore categorical, and are then transformed into quantitative data to make them easier to analyse.  This could be done by creating a statistical table using the frequencies or by creating a pie chart, as demonstrated by Mendenhall, Beaver & Beaver, page 12.
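As a tiny sketch of that quantifying step, with made-up questionnaire responses rather than real data:

```python
from collections import Counter

# hypothetical categorical (nominal) responses from a questionnaire
responses = ["agree", "agree", "neutral", "disagree", "agree", "neutral"]
frequencies = Counter(responses)
print(frequencies)  # counts ready for a frequency table or pie chart
```

Once the categories are counted, the "qualitative" data can be tabulated, charted, or tested just like any other frequency data.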

Often, experimental error and variability within the data can obscure meaningful differences.  Just looking at the data without performing any statistical procedure, it can be very difficult to identify a pattern or consistent differences between treatment conditions.  For example, those who attended the Personality & Individual Differences lecture this week will recognise the two graphs shown below for the attentional blink.  It is very hard to identify a pattern on the graph showing all the individual results, but after a statistical procedure to compute the mean values, the pattern is much clearer.  Sometimes large differences can be identified without using statistics, but these mathematical procedures can help determine the extent of the difference more accurately.
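The same point in miniature, with invented accuracy scores for two conditions (not the lecture data): the individual numbers overlap and look noisy, but the condition means make the difference obvious:

```python
import statistics

# hypothetical accuracy scores for two conditions; individual
# scores overlap, but the means separate cleanly
control = [0.61, 0.48, 0.70, 0.52, 0.66]
blink   = [0.42, 0.57, 0.38, 0.53, 0.44]
print(statistics.mean(control), statistics.mean(blink))
```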


However, although the use of statistics can help us understand data and identify patterns we might not naturally notice ourselves, it can also lead to confusion.  Sometimes, after using a statistical technique, your results may look too good to be true, and they often are.  Frequently, statistically significant treatment effects are not as meaningful in the real world as they initially appear.  For example, studies have suggested that consuming 20-30g of alcohol a day, a moderate amount of wine, could reduce the risk of heart disease by around 40% (Renaud & de Lorgeril, 1992).  This statistic looks amazing and encourages us to follow the suggestion and drink the recommended amount of wine per day.  However, if in reality your susceptibility to heart disease is only 5%, then this relative reduction is not really that significant practically, and the other health risks you may intensify by increasing your alcohol consumption may not compare favourably.
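To see why, here is a quick back-of-the-envelope calculation, assuming a hypothetical 5% baseline risk:

```python
baseline_risk = 0.05        # assumed 5% chance of heart disease
relative_reduction = 0.40   # the reported 40% relative reduction

new_risk = baseline_risk * (1 - relative_reduction)
absolute_reduction = baseline_risk - new_risk
print(round(new_risk, 3), round(absolute_reduction, 3))  # 0.03 0.02
```

A headline-grabbing 40% relative reduction shrinks to a 2-percentage-point absolute change, which is the figure that actually matters for an individual deciding whether to change their behaviour.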

The aim of a researcher is not only to find evidence to support a hypothesis, but also to understand why the effect happens.  Although statistics can be used to establish cause-and-effect relationships, they do not suggest anything about why these relationships exist.  Statistics help to describe what the data show, but they provide no insight into what this means in reality, what the finding can lead to, or why the data look the way they do.  Therefore, they can be helpful in many circumstances for analysing data and can lead to the recognition of findings that might not have been discovered otherwise; however, they do not provide a complete picture of the findings and are not, on their own, sufficient for a full understanding of your data.