A. Continuum of designs.
C. Steps in a design.
II. Continuum of designs.
A. Observation: Go to the world and carefully observe what happens. Little or no idea of what to expect, little or no experimenter involvement. Can conclude that certain things tend to happen after or in association with certain other things.
B. Correlation: Similar to observation, but you have some idea of what to expect, and you're observing to investigate the relationship between two things. For example: Do smoking and cancer tend to occur together? Look for instances of smoking, and see if you also observe cancer. CORRELATION DOES NOT IMPLY CAUSATION. Even if two things are very strongly related, you still can't say that one caused the other. Why not?
1. No random assignment. You took what was available, so you had no control over participants’ history, experience, etc., and these things might be relevant.
2. Interpretive ambiguity. Could be any of these three:
a. A causes B (smoking causes cancer).
b. B causes A (a predisposition to cancer makes you more likely to smoke).
c. Something else causes both (poverty gives you cancer and makes you smoke).
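To make option c concrete, here is a minimal simulation sketch. Everything in it is made up for illustration: the variable names, the sample size, and the assumption that A and B are each just "third variable plus noise." It is not data about smoking or cancer. The point is only that two things can be noticeably correlated even though neither one appears in the other's recipe.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000                # hypothetical number of people observed

# Hypothetical third variable C; A and B each depend on C plus their own noise.
# Neither A nor B appears in the other's formula, so neither causes the other.
c = [rng.gauss(0, 1) for _ in range(n)]
a = [ci + rng.gauss(0, 1) for ci in c]
b = [ci + rng.gauss(0, 1) for ci in c]

r_ab = pearson_r(a, b)

# Remove C's contribution from each variable and correlate what is left.
a_without_c = [ai - ci for ai, ci in zip(a, c)]
b_without_c = [bi - ci for bi, ci in zip(b, c)]
r_leftover = pearson_r(a_without_c, b_without_c)

print(f"r(A, B) = {r_ab:.2f}")
print(f"r(A, B) with C removed = {r_leftover:.2f}")
```

Under these assumptions the first correlation should come out near .5 and the second near 0. In real correlation research you usually can't subtract out the third variable this neatly, because you often don't know what it is.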
In a way, determining causality is the goal of research (e.g. if we know what causes depression we'll know how to treat it). Why do people do correlation research if correlation does not imply causation? There are three good reasons to do correlation research:
1. Ethics. If I wanted to investigate the relationship between self-esteem and body image, I could randomly assign one group to be low in self-esteem, but that would require me to lower their self-esteem. This would violate ethical principles. Similarly, it would not be possible to manipulate childhood sexual abuse to investigate its effect on adult psychological well-being. To investigate these questions, we would have to do correlational research.
For a real-world example, consider recent research investigating the relationship between serum testosterone levels and aggression. Brooks and Reddon (1996) compared violent and nonviolent youthful offenders and found that violent offenders tended to have higher levels of testosterone. Berman, Gladue, and Taylor (1993) found a significant correlation (r = .42) between testosterone levels and aggression (the strength of an electric shock selected for a competitor). Ethically, it would be difficult to investigate this relationship by experimentally manipulating testosterone levels. Even if elevated testosterone levels weren't potentially harmful, the potential aggression that might be committed would be.
2. Generalizability. A good experiment requires careful control by the experimenter. Achieving this control can reduce the naturalness of the task. The Stroop task allows us to investigate whether or not reading is an automatic process. To get good control over the situation, participants see words one at a time and respond with the color of the word. You can see that this is a very artificial reading situation. If an experimenter wants the conditions of the research to closely match those in the real world, a correlation design may be chosen.
Consider research by Becker, Abel, Blanchard, Murphy, and Coleman (1978). They were interested in the relationship between comfort level in a situation and sexual aggressiveness in men who were sexually aggressive. They found that higher levels of discomfort were associated with more inappropriate behaviors. It would be difficult to randomly assign a man to be sexually aggressive to investigate this relationship. A person who was pretending (or was made) to be sexually aggressive is unlikely to respond the same way as a man who really is sexually aggressive. So, to make sure the research generalizes to real sexual aggressives, real sexual aggressives were used as the participants.
3. Feasibility. Experiments can be difficult and expensive to conduct. The corresponding correlation research might be simpler because participant populations are readily available. In those cases, researchers do correlation research to fully explore the relationship between two variables. Once the relationship is understood, it is then possible to do an experiment to determine causality. Later in the semester we'll look at some research on the relationship between judged control and depression that used this approach.
C. Surveys: Take even more control. You can pinpoint exactly what you want to know, and ask exactly that question. You're still just observing something that's already happened (my attitude towards Jesse Helms existed before the survey measured it) so you can't make causal statements.
MTSU recently decided to change the time between classes from 10 minutes to 15 minutes. To make that decision, the school wanted to know exactly how many students were having a problem getting from one class to another in 10 minutes. Research Methods students undertook a survey of the campus to find out. The results were that 74% of students, at some time in the past, had two back-to-back classes scheduled in such a way that they couldn't get from one to the other in 10 minutes. 40% of the respondents had classes like that during the semester the survey was taken. Interestingly, less than half of the respondents wanted to increase the time between classes if it meant lengthening the school day (starting earlier or ending later).
D. “True” Experiments: Simple description: Take two groups of people, give one group a treatment, do nothing to the other group, look to see what effect the treatment had relative to the no treatment group (did forcing one group of people to smoke increase the incidence of cancer in that group relative to a group that didn't smoke at all?). You can make causal conclusions (A caused B). Properties:
1. Manipulate something.
2. Measure something.
*3. Random assignment.
Natural (in the real world) vs. artificial (in the lab).
Exploratory (know little about what to expect, general question “What happens if?”) vs. hypothesis testing (know what to expect, explicit question).
A. Parts of the experiment.
1. Hypothesis: Explicit statement of the question of interest in a way that makes it testable. Usually an if-then statement. Example: “What effect does studying have on exam performance?” --> “If you study, then exam performance will increase.”
2. Independent variable (a.k.a. factor) (IV): What you manipulate. Example: Above, presence of studying (or amount of studying).
a. Levels: IV's are made up of levels. Each level is an amount of whatever the variable is. For example, if you're varying the amount of study time, then the levels are the various amounts of time (1 hr., 2 hr., 3 hr.). Levels are arranged along a single dimension. You'll always have at least two levels, or what you have is a constant and not a variable.
b. Restriction of range: When you don't have enough levels to adequately cover the range of the effect produced by the IV. For example, the relationship between arousal and performance is hump-shaped. At low levels of arousal, performance is poor. At high levels, performance is also poor. In the middle, performance is good. If you only have two levels of arousal, it isn't possible to capture this entire relationship. OR: When you don't have adequate spacing amongst the levels to cover the range. Example: I have people in the high group study 1 hr. and 1 min. and people in the low group study 1 hr. The levels are so close together as to make it impossible to detect any effect.
c. Treatment: In a simple (two-group) experiment, one group gets some manipulation or treatment. This group is called the treatment group (they study for the exam).
d. Control: The other group in the simple experiment gets no treatment or a placebo (a treatment that mimics the procedures used in the real treatment group, but should produce no effect). This is the control group (they don't study).
3. Dependent variable (DV): What you measure in your experiment. In our example, that's exam performance.
4. Participant variables: Things that masquerade as IV's, but are actually only characteristics of the participants, and not variables that you (the experimenter) manipulate. The most common of these is sex.
5. Controlled variables: Sources of variation that can potentially affect your experiment that you control to make them constant. For example, if you have one group of people who study, and one who doesn't, you might make sure they all get the same amount of sleep, or take the exam at the same time, because amount of sleep and exam time could both affect exam performance in addition to study time. By controlling these, they're constant for both groups, and you don't have to worry about them.
6. Confounding variables: Things that a) covary with the variable of interest (IV) and b) could reasonably be expected to have caused the change in the DV. They're things you should have controlled, but overlooked. At the end of the experiment, you don't know if changes were due to the IV or to the confound.
7. Operational definitions: Explicit statements of what you're doing in the experiment, in terms of operations performed to carry them out. Particularly associated with levels and DVs. So, it's useless to say participants studied “a little” or “a lot” for the purpose of evaluating research. But, 1 hr. vs. 0 hr. is simple and clear.
8. Population: The people (or whatever) that you're trying to study. You usually want to make some statement about them (like “for MTSU seniors, studying two hours a night increases exam performance 20%”). You usually can't measure every member of the population.
9. Sample: The people (or whatever) that you actually measure in your experiment. These are drawn from the population using some procedure. To the extent that this procedure is good (we'll talk about “good” in a minute), your results from the sample will apply to the population.
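The restriction of range problem from 2b can be sketched with a few lines of arithmetic. The `performance` formula below is an assumption invented for illustration (a simple hump peaking at a middle arousal level), not a real model of arousal and performance, and the arousal scale of 1 to 9 is also hypothetical.

```python
def performance(arousal):
    """Hypothetical hump-shaped relationship: best at arousal 5, worse on both sides."""
    return 100 - 4 * (arousal - 5) ** 2

# Two levels only: one low, one high.
two_levels = [2, 8]
two_level_scores = [performance(a) for a in two_levels]

# Five levels spread across the whole range.
five_levels = [1, 3, 5, 7, 9]
five_level_scores = [performance(a) for a in five_levels]

print("Two levels: ", dict(zip(two_levels, two_level_scores)))
print("Five levels:", dict(zip(five_levels, five_level_scores)))
```

With only the two levels, both groups get the same score, so it looks like arousal has no effect at all. With five levels the hump shows up. Which levels you pick is a design decision, and a poor choice can hide a real effect.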
B. Stuff about the quality of the experiment.
1. Reliability: How consistently you measure whatever you're measuring. Related to repeatability. Does your measuring instrument give the same answer every time that you use it (given no actual changes in the state of whatever you're measuring)? For example, if I get a different reading on my scale every time I get on it, in the absence of real weight changes, then my scale is unreliable. You need reliable measuring instruments to draw any conclusions from your results.
2. Validity:
a. Internal: Whether you measure what you intend to measure. You don't want your scale to tell you how tall you are. You don't want an IQ test to measure emotional state. If your measuring instrument is not actually tapping what it's intended to tap, it's not valid.
b. External (Generalizability): How well the results of your experiment generalize to the real world or the population you think you're studying. Alternatively, do the results of my experiment mean anything outside the context of my experiment?
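One rough way to put a number on the reliability idea from the scale example under B.1 is to weigh the same thing several times and look at how much the readings spread out. The readings below are invented for illustration, and using the standard deviation as the index is an assumption of this sketch, not the only way to assess reliability.

```python
import statistics

# Hypothetical repeated readings (in pounds) of the same person on two scales,
# with no real weight change between readings.
reliable_scale = [150.1, 149.9, 150.0]
unreliable_scale = [142.0, 158.0, 150.0]

# Spread of the repeated readings: smaller means more consistent.
reliable_spread = statistics.stdev(reliable_scale)
unreliable_spread = statistics.stdev(unreliable_scale)

print(f"Reliable scale spread:   {reliable_spread:.1f}")
print(f"Unreliable scale spread: {unreliable_spread:.1f}")
```

Both scales average about 150, but only the first one gives you roughly the same answer every time. Note that consistency alone doesn't tell you the scale is measuring the right thing.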
C. Control of the experiment.
1. Random sampling: When every member of the population of interest has an equal chance of being included in your experiment. This has an impact on generalizability. To the extent that your sample is representative of the population, your results hold for that population. A good sampling procedure uses random sampling.
2. Random assignment: Every participant in your experiment has an equal chance of being in any condition in your experiment. You want to equate the various groups prior to doing the experiment. It's impossible to explicitly control for every possible difference, but random assignment tends to balance those differences across groups (the larger the groups, the better this works). This has an impact on internal validity. Not equating groups can be a serious threat to your measurement.
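Random sampling and random assignment are two separate steps, and a short sketch may help keep them apart. The population of 200 numbered students, the sample size of 20, the condition names, and the helper functions `random_sample` and `random_assign` are all hypothetical, chosen to match the studying example.

```python
import random

def random_sample(population, n, rng):
    """Draw n members so every member of the population has an equal chance."""
    return rng.sample(population, n)

def random_assign(participants, conditions, rng):
    """Shuffle the participants, then deal them out to the conditions in turn."""
    shuffled = list(participants)  # copy so the original list is left alone
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups

rng = random.Random(42)  # fixed seed so the run is repeatable

# Step 1 (random sampling): who gets into the experiment at all.
population = [f"student_{i}" for i in range(1, 201)]
sample = random_sample(population, 20, rng)

# Step 2 (random assignment): which condition each sampled person lands in.
groups = random_assign(sample, ["study", "no_study"], rng)

for condition, members in groups.items():
    print(condition, len(members), members[:3])
```

Dealing from a shuffled list keeps the group sizes equal, which is a design choice in this sketch. Flipping a coin for each person is also random assignment, but it can leave you with lopsided groups.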
IV. Steps in a design.
A. Ask a question. Done.
B. Make a hypothesis. Done.
C. Collect observations. Next.
D. Analyze statistically. In progress.
E. Conclude. What does it all mean? Later.