Yale Center for Teaching and Learning

Considerations when Interpreting Research Results

When interpreting research results, it is important to consider factors that could affect internal validity, external validity, and measurement error.

Internal validity 

The internal validity of a study refers to whether differences in the measured dependent variable are due to the manipulation of the independent variable, and not to other confounding variables or outside factors. There are thirteen classic threats to internal validity.

  • History– something unexpected happens while treatment occurs, which causes a change
    •  Example: A participant has a death in the family that impacts their performance.
  • Maturation– something expected happens while treatment occurs, which causes a change
    •  Example: A participant graduates high school, which impacts their level of social anxiety.
  • Selection– the sample is biased or unrepresentative of the population
    • Example: Only students from one gender participate in the study.
  • Mortality– participants systematically drop out of the research project
    • Example: All students who struggled with math on the pretest quit participating in the study before it ends.
  • Testing– tests cause participants to learn or make new inquiries
    • Example: A student struggles with a question about climate change on the pre-test and does independent research before the conclusion of the study.
  • Instrumentation– not all measurement instruments are equal
    • Example: Students in one section of an introductory course scored better on the final paper compared to other sections because of different grading practices.
  • Regression toward the mean– individuals who score near the upper or lower limits of a test will score closer to the average on subsequent tests
    •  Example: A student who scored 100% on the pre-test receives 95% on the post-test.
  • Contamination/Diffusion– individuals in the control group and treatment group talk about treatment methods, which results in some treatment practices also occurring in the control group
    • Example: Teachers in the treatment group, who received training on how to utilize active learning practices, tell teachers in the control group about their training.
  • Resentful demoralization– individuals in the control group want to be in the treatment group
    •  Example: Teachers in the control group who want to engage in collaborative lesson planning (the focus of the study) put in less effort in their classrooms to encourage the administration to switch to that method.
  • John Henry effect – new treatments or methods cause individuals in the control group to try and outperform the innovation
    •  Example: Teachers in the control group who do not want to engage in collaborative lesson planning (the focus of the study) put in extra effort in their classrooms to justify their current methods and prevent any systematic change.
  • Hawthorne effect – behavior changes when individuals realize they are being observed
    • Example: Individuals wash their hands less when they are alone.
  • Pygmalion effect– researcher unintentionally elicits the desired effect
    •  Example: A researcher smiles when a participant gets close to making the right decision.
  • Treatment fidelity– the treatment method is not fully or appropriately implemented
    • Example: Teachers who are asked to use clickers regularly in their class only use them during one lecture the entire semester.
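Regression toward the mean can be demonstrated with a small simulation. The sketch below (illustrative only; the numbers are hypothetical and not drawn from any study) models each student's observed score as a stable ability plus random test-day noise, then shows that the top pre-test scorers, as a group, score closer to the class average on the post-test even though nothing about their ability changed.

```python
import random

random.seed(0)  # fixed seed so the simulation is repeatable

# Hypothetical cohort of 1,000 students: each has a stable "true
# ability" plus independent test-day noise on two administrations.
students = []
for _ in range(1000):
    ability = random.gauss(70, 10)        # stable component
    pre = ability + random.gauss(0, 8)    # pre-test  = ability + noise
    post = ability + random.gauss(0, 8)   # post-test = ability + new noise
    students.append((pre, post))

# Take the top 10% of pre-test scorers and compare group averages.
students.sort(key=lambda s: s[0], reverse=True)
top = students[:100]
avg_pre = sum(s[0] for s in top) / len(top)
avg_post = sum(s[1] for s in top) / len(top)

print(f"Top scorers' pre-test mean:    {avg_pre:.1f}")
print(f"Same students' post-test mean: {avg_post:.1f}")
```

Because part of each high pre-test score is lucky noise that does not repeat, the same students' post-test mean falls back toward the population mean of 70, which is why a 100% pre-test score is often followed by a slightly lower post-test score.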

External validity

The external validity of a study refers to whether the observed measurements are generalizable to other participant groups, settings, cultures, or time periods.[1] For example, results from a case study on one person may not extend to other people from different geographic locations, genders, ethnicities, etc. Sample size, sampling methods, analysis methods, and assignment methods (if there are multiple conditions) all influence external validity.

Measurement error 

The first two considerations concern the validity of the study and its design, while measurement error concerns the specific scores or analysis procedures within the study.

  • Reliability- Reliability refers to how well a score represents an individual’s ability and, within education, ensures that assessments accurately measure student knowledge. Because reliability refers specifically to scores, a full test or rubric cannot be described as reliable or unreliable.
  • Validity- Validity in educational measurement is the process of providing evidence that supports the interpretation of test scores and conclusions that are derived from various data analyses.
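One common way to estimate the reliability of a set of scores is a test–retest correlation: administer the same assessment twice and correlate the two sets of scores. The sketch below uses hypothetical quiz scores (the data are invented for illustration) and a hand-rolled Pearson correlation; values near 1.0 suggest the scores rank students consistently across administrations.

```python
# Hypothetical data: the same 8 students take the same quiz twice.
# Reliable scores should rank students similarly both times.
test_1 = [55, 62, 70, 74, 78, 83, 88, 95]
test_2 = [58, 60, 72, 71, 80, 85, 86, 97]

def pearson(x, y):
    """Pearson correlation coefficient between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

r = pearson(test_1, test_2)
print(f"Test-retest reliability estimate: r = {r:.2f}")
```

Note that this estimates the reliability of these particular scores with these particular students; consistent with the point above, it does not certify the test itself as "reliable" for all uses.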



[1] Onwuegbuzie, A. J. (2000). Expanding the framework of internal and external validity in quantitative research. Presented at the Association for the Advancement of Educational Research.