Internal validity of research studies

Ph.D. Topics: Research and Evaluation Methods

What is validity? If I say “that’s a valid argument” to you, it means your facts and your logic seem reasonable to me. In research methods, we talk about validity because we want to make statements about the world; we want to make knowledge claims. We want these claims to be valid, meaning they should be well-grounded in logic and fact so that we can trust in them.

Much of scientific research is concerned with making claims about causality. In education research, for example, we want to know what causes students’ math achievement to be high or low. Is it what their teacher does? Their raw brain power? How hard they work? And so forth. Obviously, the answer is that it’s many things, but to the extent possible we want to isolate the factors that are under our control (teaching method, curriculum, school culture) and find the ones that will result in the highest math achievement.

Internal validity = Extent to which you can infer causality

The internal validity of a research study is the extent to which you can make a causal claim based on the study; it is the validity of the causal inference you make. Different research designs provide stronger or weaker internal validity. For example, well-designed randomized experiments are generally considered to provide the strongest internal validity. Studies in which treatments are assigned randomly to intact groups (e.g., classrooms), often called cluster-randomized designs, can also have strong internal validity.
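To make the point about random assignment concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the `simulate` function, the "ability" confounder, the true treatment effect of 5 points, and all of the distributions. It compares the estimated effect when students are assigned to a new teaching method by coin flip with the estimate when higher-ability students are more likely to opt in.

```python
import random
from statistics import mean


def simulate(n, randomized, true_effect=5.0, seed=0):
    """Return the treated-minus-control difference in mean math scores.

    All quantities here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        # Unobserved confounder: raises scores regardless of treatment.
        ability = rng.gauss(0, 10)
        if randomized:
            # Coin-flip assignment: unrelated to ability.
            gets_treatment = rng.random() < 0.5
        else:
            # Self-selection: higher-ability students opt in more often.
            gets_treatment = ability + rng.gauss(0, 10) > 0
        score = 50 + ability + rng.gauss(0, 5)
        if gets_treatment:
            score += true_effect
            treated.append(score)
        else:
            control.append(score)
    return mean(treated) - mean(control)


randomized_estimate = simulate(20000, randomized=True)
self_selected_estimate = simulate(20000, randomized=False)
print(f"randomized estimate:    {randomized_estimate:.2f}")
print(f"self-selected estimate: {self_selected_estimate:.2f}")
```

Under these assumptions, the randomized estimate should land close to the true effect of 5, while the self-selected estimate should be much larger, because it mixes the treatment effect with the ability difference between the groups. That mixing is exactly the kind of threat to internal validity that random assignment is meant to rule out.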


psychometrics, statistics

Attitude towards math vs. confidence, liking, usefulness of math in TIMSS

Kadijevich, D. (2006). Developing trustworthy TIMSS Background Measures: A case study on mathematics attitude. The Teaching of Mathematics IX(2), 41-51.

Abstract. This study, which used a sample of 197,707 students from 46 countries that participated in the TIMSS 2003 project in eighth grade, examined whether, for a large number of the TIMSS countries, trustworthy TIMSS measures of several dimensions of mathematics attitude can be developed. By focusing on self-confidence in learning mathematics, usefulness of mathematics, and liking mathematics, it was found that both factor validity and reliability of the measures of these three dimensions derived from the raw data was only attained for the students from the United States. However, when scores concerning the utilized attitudinal statements of all subjects were transformed into Guttman’s image form scores, the factor validity and reliability of the three measures utilizing such transformed data was attained for thirty-three countries (N = 137,346). It was found that for all these thirty-three countries mathematics attitude was mostly saturated by either usefulness of mathematics or self-confidence in learning mathematics. A higher mathematics achievement was found for countries where mathematics attitude was mostly saturated by self-confidence in learning mathematics.

It’s not mentioned in the abstract, but if you combine all the mathematics attitude items into one grand attitude-towards-math scale, you get decent internal consistency reliability (alpha above .70) for almost all countries.
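For reference, here is a minimal sketch of how that kind of reliability coefficient can be computed, assuming the "alpha" in question is Cronbach's alpha. The `cronbach_alpha` function and the six rows of 4-point Likert responses are hypothetical, invented purely for illustration; they are not TIMSS data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent must answer all k items")
    items = list(zip(*rows))                      # one tuple per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 6 students x 4 attitude items on a 1-4 scale.
responses = [
    [4, 4, 3, 4],
    [2, 1, 2, 2],
    [3, 3, 3, 2],
    [1, 2, 1, 1],
    [4, 3, 4, 4],
    [2, 2, 3, 2],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}, meets .70 rule of thumb: {alpha >= 0.70}")
```

Alpha rises when the items covary strongly relative to their individual variances, which is why pooling many positively correlated attitude items into one scale can push the coefficient above .70 even when the shorter subscales fall short.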

This makes me think maybe I ought to use an overall “attitude towards math” score in the next iteration of my model. Or I could try that Guttman transformation, but it doesn’t make any sense to me yet, so I will need to understand what’s going on with it first. How can it eliminate measurement error?

Also, it’s interesting that attitude towards math is mostly saturated by either self-confidence (something intrinsic) or usefulness (something extrinsic), and that countries where the intrinsic one dominates show higher math achievement.