Evaluating Research

In an essay question (but not in a short answer question), you are expected to evaluate the research studies and theories that you discuss. This is assessed in Criterion D (Critical Thinking), which is worth 6 of the essay's 22 marks. In my experience as an IB Psychology teacher, evaluating research effectively is one of the most challenging skills of the course - and often what separates truly exceptional Psychology students from the rest. In fact, when I studied Psychology at university, I wasn't expected to evaluate research until I made it to grad school! Fortunately, effective evaluation is a skill that can be developed, and this post will show you how.
Think Critically

Read the following evaluations written by two different students, who are both discussing Rosenzweig and Bennett's study on brain plasticity.  (If you aren't familiar with the study, you can read a summary of it on this page).

What are the similarities and differences in how the two students evaluate the study?

If you were an examiner, which student would you give the higher grade?  Why?

Phil's evaluation 

"Rosenzweig and Bennett's study has low ecological validity because it was a laboratory experiment, low generalizability because it was carried out on rats, and low ethics because it was an animal experiment"

Sarah's evaluation

"A strength of Rosenzweig and Bennett's study is its careful control of extraneous variables.  By randomly assigning rats to one of the two environments (enriched cage with toys, or deprived cage with no toys), and keeping all other variables constant, this study demonstrates a causal relationship between the rats' environment and brain development"


Which student do you think did a better job?  On the one hand, Phil's answer had more individual points - he discussed ethics, generalizability, and the research method.  On the other hand, his points weren't very well explained or justified.  He didn't explain why a laboratory experiment might have low ecological validity, for instance, or why a study on rats might not generalize to humans.  Sarah's answer was focused on one point - the careful control of variables in a laboratory experiment - but this point was fully explained and supported.  That's why an examiner would almost certainly give Sarah a higher grade for critical thinking.

The moral of the story is this:  when it comes to critical thinking, the quality of your analysis matters far more than the number of evaluation points you make.  Between quality and quantity, quality wins every time.
Recap: Evaluation Questions

One of the goals of IB Psychology is to encourage you to think critically about research.  That means whenever you hear about an exciting new study on TV or online, you will approach the findings with a little healthy skepticism - asking questions about how the study was carried out, what the findings of the study really tell us, and how certain the evidence truly is.  There is a whole section of this website devoted to Critical Thinking (use the menu at the top of this page to have a look), but here is a brief recap of critical thinking questions to ask when evaluating research:

Critical Thinking questions for research studies (GRAVE)

  • Generalizability - Who participated in the research study?  What were the characteristics of the sample, in terms of age, gender, culture, education level, and so on?  Is the sample a good representation of the entire target population?  To what extent can the findings from the sample be generalized to others?

  • Reliability - Is it possible to replicate the study?  Why or why not?  Has the study been replicated, and are the results consistent with the original findings?  If the study was replicated today with different participants, would you expect the results to be the same?

  • Research method - What research method was used in this study?  What are the strengths and limitations of this research method? (For more information, have a look at the Research Methods section of this website)

  • Applicability - Does this study provide important research evidence in support of a theory?  If the study focused on biological determinants of behavior, could social factors also play a role? (And vice versa) If so, how?  Are the results of this study useful or relevant for understanding people's behavior in real life?  How?

  • Alternative explanations - Are there any other ways to explain the findings?  If the study compared results between two groups of participants (e.g. American and Japanese), are there any other differences between the groups that might provide an alternative explanation of the results?

  • Validity - Does the study demonstrate a "cause and effect" relationship?  If so, between which variables?  Does the study have high ecological validity, meaning that the results from the study can be applied to understanding behavior in real life situations?

  • Ethics - What ethical issues were raised in this study?  How did the researchers try to address these issues?  Would a modern day ethics committee approve this study?  Why or why not?

Critical Thinking questions for theories (PEAR)

  • Predict - Does this theory make predictions about how people will behave in a given situation?  Have these predictions been tested?

  • Explain - What does the theory explain about human behavior?  How useful are these explanations in real life?

  • Applicability - Can this theory be applied to help people or have a real-world impact?  If so, how?

  • Assumptions / biases - What assumptions is the theory based on?  What biases are inherent in the theory?  (For example, evolutionary theories of behavior are based on the assumption that genes play a significant role in causing us to act the way we do, and these genes are selected by the process of natural selection)

  • Research evidence  - How much research has been carried out in support of this theory? How strong is the research evidence?

However, remember that quality of evaluation is more important than quantity.  You don't need to address all (or even most) of these questions for any particular study.  Instead, address the issues that you think are most relevant.  If a study has issues with its choice of sample, for instance, you should concentrate on discussing generalizability.  In general, you should aim to discuss two or three evaluation points for each study or theory, including at least one strength and one limitation.
Evaluating with PEEL

When you compared Phil's and Sarah's evaluations of Rosenzweig and Bennett's study, you saw that reciting a large number of evaluation points - without explaining or justifying them - is not a good approach.  After all, not everyone might agree that your points are correct.  If you think a study has low ecological validity, for instance, you need to explain why you have come to that conclusion.  What is it about the procedure of the study that results in poor ecological validity?  A useful trick for making sure you fully justify each point is to remember PEEL.
  • Point:  Begin by making your point. (Example - A limitation of Rosenzweig and Bennett's study is that the experiment was performed on rats, and so the findings may not fully generalize to humans)

  • Evidence / explanation:  Give an explanation of the point you have made.  Refer to details of the study to provide evidence that your point is correct.  (Example - Although rats and humans share many brain regions in common, there are, of course, important differences between species.  Furthermore, while a cage full of toys may be a stimulating environment for a rat, it is unclear what sorts of environments are the most stimulating for humans)

  • Link: What do the strengths and limitations of the study tell us about the topic in general?  How does this relate to the question?  (Example - Rosenzweig and Bennett's study provides important evidence of how the environment can shape the brains of rats; however, more research should be carried out to investigate neuroplasticity in humans)
Try it Out

In the grid below, you'll see fully developed evaluation points for four studies.  However, the text boxes in the grid have been mixed up.  See if you can match each study to the relevant evaluation point, explanation, and link.  (You might want to print the grid off, cut out the squares, and then match them)
TOK Link

In Psychology, it is customary to say that a particular study supports a theory, not that a study "proves" a theory.  Can you think of why?

The simple answer is that no one study can ever provide any final, definite answers.  All studies involve a limited number of participants, and measure behavior under a specific set of conditions.  It is entirely possible that a different group of participants - or a slightly different research design - could produce very different results.

That's why the science of Psychology is an ongoing process - as more and more research is carried out, involving different participants and research methods, a better understanding of behavior emerges.  Sometimes results seem to contradict each other, and then Psychologists might have to go back to their original theory and make modifications (or scrap it entirely!)

When you evaluate research, you are asking important questions about the significance of a research study within the broader pursuit of knowledge in Psychology.  Every study you learn about in this course provides important clues to understanding behavior - but every study has limitations that must also be taken into account.

Exam Tip

In published Psychological research papers, a common convention is to discuss areas of further research that could address some of the limitations of the study or further advance knowledge.  It is a great idea to end your evaluation of a study (or theory) by doing the same. 

For example, if a study has been conducted on animals, you might say that an area of further research is to see whether similar results can be obtained in humans.  Or, if a study has limited ecological validity, you might say that an area of further research is to see if similar results can be obtained in more natural, realistic settings.

Discussing how further research could address the limitations of a study demonstrates great critical thinking skills, and will be sure to impress any examiner!

  • I can identify strengths and limitations of research studies and theories

  • I know how to fully explain and justify each evaluation point 

  • I understand how to link my evaluation to the topic, discussing the need for further research to address any limitations in the current evidence
Quiz Yourself!

1.  In the acronym GRAVE, what two terms does the "R" stand for?

(a) Reliability and replication

(b) Reliability and research method

(c) Replication and research evidence

(d) Replication and relapse

2.  A study that measures behavior in an artificial environment, which may not be relevant for understanding real world behavior, has low ____

(a) Internal validity

(b) Reliability

(c) Generalizability

(d) Ecological validity

3.  How many evaluation points should you aim to discuss for each study?

(a) At least one

(b) At least two

(c) At least three

(d) At least four

4.  A study was conducted on 42 undergraduate students studying at New York University.  What might be a limitation of this study?

(a) Low ecological validity

(b) Low generalizability

(c) Low reliability

(d) Low internal validity

5.  A study was conducted on 42 undergraduate students studying at New York University.  What might be a suggestion for further research to address the limitations of this study?

(a) Replicate the study in other universities

(b) Replicate the study with an even mix of male and female participants

(c) Replicate the study with different age groups and education levels

(d) Replicate the study with graduate students at New York University


Answers: 1 - B, 2 - D, 3 - B, 4 - B, 5 - C