Methods, Notes 13 -- Strong inference
A. What is strong inference?
B. Digression to discuss causation.
C. How does this apply to real-world research?
D. How does it all tie together?
II. What is strong inference? (Platt, 1964) Until
now we've pretended that experiments happen in isolation. In reality,
that is almost never the case. The experimental enterprise is cumulative.
New experiments build on old experiments; research is conducted as part of
a larger program of investigation. What we'll do now is look at ways
in which series of experiments can be combined to test theories.
Our first example of this will be strong inference. At its core, strong
inference works like statistical hypothesis testing. You set up a group
of mutually exclusive and exhaustive hypotheses and then attempt to rule out
all but one of those hypotheses. The difference is that we're getting
our hypotheses from theories that exist to explain the phenomenon of interest.
We can have as many hypotheses as there are theories.
A. The steps in strong inference:
1. Devise multiple hypotheses. You can do this based on your
own intuitions (not so hot) or based on a careful review of the literature
(much better). The goal is to gather together all proposed answers
to your question of interest. Why do this?
a. It protects us against our tendency to try to confirm our first idea.
Instead, we're forced to think about all of the possibilities.
b. If our “pet” hypothesis isn't supported by the data, we can still
make some sort of statement (instead of saying “I guess we were wrong,”
we can say “the data supported this hypothesis”).
This step is the most important. If you fail to generate all of the alternatives,
you have no guarantee that your data will be consistent with any of the
hypotheses, and you'll have a harder time convincing people that you're correct.
As an example, imagine that I have a hypothesis that you didn't consider,
and it generates the same predictions as the hypothesis that was consistent
with your data. You want to say your hypothesis is correct, but I can argue
that, based on your data, mine is just as good, and you're out of luck.
2. Design an experiment to test between these alternatives.
Ideally, the outcome of one experiment will be consistent with only one of
your alternatives and inconsistent with all the rest. Sometimes, it's
not possible to do it all in one experiment. In this case, you'll carry
out a series of experiments.
Note the importance of predictions. As you design the experiment you
need to constantly check to be sure that each hypothesis makes different
predictions (like no main effects but an interaction vs. one main effect
and no interaction). If two hypotheses make the same prediction,
then no outcome can separate them, and you're wasting your time.
3. Carry out your experiment. This step involves most of the
first part of this course. If you run a clean experiment, your conclusions
will carry more weight. If your experiment is full of confounds, your
conclusions are worthless.
Sometimes, you repeat these steps several times to get to just one hypothesis.
This might happen if you can't force all of the hypotheses to make different
predictions, or if you realize after the experiment that a slightly different
version of one of the hypotheses might still be OK, and you want to refine
it even more. It's kind of like tournament bowling: The two lowest
contenders meet. The winner advances to meet the next highest contender.
The winner of that round advances...
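A minimal sketch in Python of the elimination logic in these steps. The hypothesis labels (H1 to H3) and the predicted outcomes are hypothetical, chosen only for illustration; they do not come from Platt (1964). The sketch checks whether the hypotheses make distinct predictions, then carries the survivors of each experiment into the next one.

```python
from typing import Dict, List, Tuple

Predictions = Dict[str, str]  # hypothesis label -> predicted outcome


def all_distinct(predictions: Predictions) -> bool:
    """Check that no two hypotheses predict the same outcome."""
    return len(set(predictions.values())) == len(predictions)


def eliminate(predictions: Predictions, observed: str) -> List[str]:
    """Return the hypotheses whose prediction matches the observed outcome."""
    return [h for h, predicted in predictions.items() if predicted == observed]


def run_series(rounds: List[Tuple[Predictions, str]]) -> List[str]:
    """Apply a series of experiments, carrying survivors forward each round."""
    survivors: List[str] = []
    for i, (predictions, observed) in enumerate(rounds):
        if i > 0:
            # Only hypotheses that survived earlier rounds stay in play.
            predictions = {h: p for h, p in predictions.items() if h in survivors}
        survivors = eliminate(predictions, observed)
    return survivors


# Hypothetical round 1: H2 and H3 predict the same pattern, so this
# experiment alone cannot separate them.
round_1 = {
    "H1": "interaction, no main effects",
    "H2": "one main effect, no interaction",
    "H3": "one main effect, no interaction",
}
# Hypothetical round 2: a follow-up design on which H2 and H3 disagree.
round_2 = {"H2": "effect of A", "H3": "effect of B"}

print(all_distinct(round_1))
print(run_series([(round_1, "one main effect, no interaction"),
                  (round_2, "effect of B")]))
```

When `all_distinct` returns False for a design, at least two hypotheses cannot be separated by that experiment, which is the situation that calls for a second round.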
B. Characteristics of strong inference research:
1. Consideration of multiple hypotheses. This helps to save
you from your natural bias to always do confirmation research. It also
means that this kind of research is rarely done in an exploratory context
(if you don't know anything about a phenomenon, how can you test theories
to explain it?).
2. Organization of experiments. There's a particular structure
to the sequence of experiments and this organization is largely responsible
for the power of the technique.
3. Increment by exclusion. We gain in knowledge by ruling out
alternatives. It seems paradoxical, but by eliminating possibilities
we have a much better estimate of what the correct answer is.
III. Digression to discuss causation. When we talk about
A causing B, there are two kinds of possible causal relationships:
A. Necessary cause: If A is necessary to cause B, then when
A is absent you won't find B.
B. Sufficient cause: If A is sufficient to cause B, then when
A and A alone is present you'll find B.
C. These two are independent. The combinations:
1. Neither necessary nor sufficient: sunlight is NNNS for photosynthesis
(artificial light also works, and light alone isn't enough).
2. Necessary but not sufficient: light is NBNS for vision.
3. Sufficient but not necessary: watering the grass is SBNN
for the ground becoming wet.
4. Necessary and sufficient: temperatures below 0°C are
NAS for water to freeze.
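A minimal sketch in Python of how the two definitions could be checked against a set of observations. The observation lists are hypothetical toy data made up for illustration. Note one simplifying assumption: the sufficiency check only asks whether B always appears when A is present; it does not verify the "A and A alone" part of the definition, which takes experimental control over everything else.

```python
from typing import List, Tuple

Observation = Tuple[bool, bool]  # (A present?, B present?)


def is_necessary(observations: List[Observation]) -> bool:
    """A is necessary for B if B never shows up without A."""
    return all(a for a, b in observations if b)


def is_sufficient(observations: List[Observation]) -> bool:
    """A is sufficient for B if B shows up every time A is present."""
    return all(b for a, b in observations if a)


def classify(observations: List[Observation]) -> str:
    """Name the combination of necessity and sufficiency."""
    necessary = is_necessary(observations)
    sufficient = is_sufficient(observations)
    if necessary and sufficient:
        return "necessary and sufficient"
    if necessary:
        return "necessary but not sufficient"
    if sufficient:
        return "sufficient but not necessary"
    return "neither necessary nor sufficient"


# Hypothetical observations: (light present, person can see).
light_and_vision = [(True, True), (True, False), (False, False)]
# Hypothetical observations: (grass watered, ground wet); the second
# case stands for rain wetting the ground without any watering.
watering_and_wet = [(True, True), (False, True), (False, False)]

print(classify(light_and_vision))
print(classify(watering_and_wet))
```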
D. These are related to the logic of hypothesis testing:
1. The two valid forms (modus ponens and modus tollens) presuppose a
sufficient relationship. For example, take "if it's a lemon, then it's sour."
Using modus ponens I say "here's a lemon, is it sour?" Having a lemon is
sufficient to cause sour; I don't need anything else with the lemon. You can
try modus tollens on your own.
2. The two invalid forms (affirming the consequent and denying the antecedent)
presuppose a necessary relationship. For example, take "if it's a lemon, then
it's sour." Affirming the consequent, I say "here's something sour, is it a
lemon?" I'm assuming that the only way something could be sour is if it's a
lemon, or that lemons are necessary for sour. Try denying the antecedent on your own.
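A minimal sketch in Python that checks the four argument forms by brute force over a truth table. Here p stands for "it's a lemon" and q for "it's sour"; the function names are illustrative. A form counts as valid when the conclusion is true in every row where all of the premises are true.

```python
from itertools import product
from typing import Callable, List

Statement = Callable[[bool, bool], bool]  # takes (p, q), returns a truth value


def conditional(p: bool, q: bool) -> bool:
    """The shared premise 'if p, then q'."""
    return (not p) or q


def is_valid(premises: List[Statement], conclusion: Statement) -> bool:
    """Valid if the conclusion holds in every row where all premises hold."""
    return all(
        conclusion(p, q)
        for p, q in product([True, False], repeat=2)
        if all(premise(p, q) for premise in premises)
    )


forms = {
    "modus ponens": ([conditional, lambda p, q: p], lambda p, q: q),
    "modus tollens": ([conditional, lambda p, q: not q], lambda p, q: not p),
    "affirming the consequent": ([conditional, lambda p, q: q], lambda p, q: p),
    "denying the antecedent": ([conditional, lambda p, q: not p], lambda p, q: not q),
}

for name, (premises, conclusion) in forms.items():
    print(f"{name}: {'valid' if is_valid(premises, conclusion) else 'invalid'}")
```

Both invalid forms fail on the same row: p false and q true, something sour that is not a lemon.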
IV. How does this apply to real-world research? Consider
obstacle detection by the blind. Several explanations have been proposed for
how blind people can move around in unfamiliar environments without bumping
into stuff. The goal of the experiments is to see which of these is correct:
1. ESP: Blind people have honed a sixth sense that we all possess
but that sighted people ignore or can't use properly.
2. Facial vision (cutaneous feedback): Blind people are able
to sense changes in air currents moving around their faces and use this information
to avoid obstacles.
3. Auditory: Blind people pay more attention to sounds of things
and use this information to avoid obstacles.
1. Experiments 1 - 3: The goal is to establish the method.
Take two blind participants and two blindfolded sighted participants, position
them in a hall, and tell them to approach the wall. Measure three things:
Distance perception (D-P): How far away can they detect the wall?
Close perception (C-P): How close can they get without hitting the
wall? Number of collisions (Coll): How many times do they hit
the wall en route to getting close to it 25 times? The experimenters
used two basic conditions: hard shoes on a hard floor and socks on carpet.
Experiment 1 established that blind participants were good obstacle detectors,
and that sighted participants weren't as good (but practice led to improvement).
This is illustrated in the “I. Wall as obstacle” row in the table.
Experiment 2 refined the methodology by letting carpet runners guide the
participants and using a movable screen as an obstacle.
Experiment 3 made one last change by using thicker carpeting.
The results of these experiments do three things. First, they show
that blind participants locate obstacles very well. So, there's something
to investigate with additional experiments. Second, they establish a
nice methodology for investigating obstacle detection. Third, they show
that sighted participants can become nearly as good as blind participants
at detecting obstacles with just a little practice. This pretty much
eliminates the ESP hypothesis (it's unlikely that some latent sense could
be activated so quickly).
2. Experiment 4: Is cutaneous feedback necessary? This is the
first step in chipping away at one of the hypotheses. Repeat Experiment
3, but put a heavy felt hood over the participants' heads and heavy gloves
on their hands. Now, with no air currents reaching the skin, do the task. You can
see the data in the “IV. Screen (felt head cover)” row of the table.
Participants were a bit worse, but not as bad as they should be if they must
have air currents to detect obstacles. Conclusion: Cutaneous feedback
isn't necessary for obstacle detection.
3. Experiments 5 - 6: But, air currents might still be sufficient.
In other words, both remaining hypotheses could be correct. To truly eliminate
cutaneous feedback as an option we need to know: Is cutaneous feedback
sufficient? To answer this, mask all sounds, and provide only cutaneous
feedback. In Experiment 5, this was done with big ear muffs; in Experiment
6, it was done with a masking tone. Results: Without auditory
feedback, participants couldn't do it at all. So, cutaneous feedback
is not sufficient. At this point, it's eliminated. At the same
time, we can say that auditory feedback is necessary (without it, no detection).
4. Experiment 7: To polish it up: Is auditory feedback
sufficient? Put the participant in a sound-proof room with headphones
that play the sounds from the hall. Have the experimenter approach
the wall; otherwise, the task is the same. Now, all the participant
gets is hearing. The experimenter is actually doing the walking.
Result: Auditory feedback is sufficient. This is the death blow for the
other two hypotheses. Auditory feedback is necessary and sufficient; the others
are neither necessary nor sufficient.
V. How does it all tie together? Let's make a chart
of the progress made in the series of experiments:
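A minimal sketch in Python of one way to lay that progress out, using only the conclusions stated in the notes above. The short labels and the layout are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# (experiments, hypothesis, conclusion), as stated in the notes.
findings: List[Tuple[str, str, str]] = [
    ("Exps 1-3", "ESP", "unlikely (sighted participants improved quickly)"),
    ("Exp 4", "Cutaneous", "not necessary"),
    ("Exps 5-6", "Cutaneous", "not sufficient"),
    ("Exps 5-6", "Auditory", "necessary"),
    ("Exp 7", "Auditory", "sufficient"),
]


def progress_chart(rows: List[Tuple[str, str, str]]) -> str:
    """Format the findings as a plain-text chart, one row per conclusion."""
    lines = [f"{'Experiments':<12}{'Hypothesis':<12}Conclusion"]
    for experiments, hypothesis, conclusion in rows:
        lines.append(f"{experiments:<12}{hypothesis:<12}{conclusion}")
    return "\n".join(lines)


def by_hypothesis(rows: List[Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """Group the conclusions under the hypothesis they bear on."""
    summary: Dict[str, List[str]] = {}
    for _, hypothesis, conclusion in rows:
        summary.setdefault(hypothesis, []).append(conclusion)
    return summary


print(progress_chart(findings))
print(by_hypothesis(findings)["Auditory"])
```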
You can see how we started with multiple hypotheses (the first characteristic),
performed an organized series of experiments (the second characteristic),
and incremented knowledge by excluding alternatives (the third characteristic).
The steps are also all present (generate multiple hypotheses, design experiments,
experiment and revise). Notice how we repeated the steps several times
to get a final answer.
One last note: The wise student will use this technique for research
projects, because you're always guaranteed something to say after the results
are in. Also, introduction and discussion sections from this type of
research pretty much write themselves.