Research Methods, Notes 3 -- Hypotheses
A. Ideal Hypotheses.
B. Logic of Hypothesis Testing.
C. Induction and deduction.
II. Ideal Hypotheses.
A. Gremlin example: "What makes this watch go?"
a gremlin. No matter how someone wants to test it, bat the
away. Should suggest:
B. Properties of the ideal hypothesis:
1. Produces testable implications. Some procedure can be
followed to verify whether or not the implications occur. Two kinds:
a. testable in practice: We can do the experiment today
to see if the implications are true of the world.
b. testable in principle: In a perfect world (with some
technology we don't have at present) we could test the implications.
Example: 100 years ago, hypotheses about the surface of the moon
were testable in principle, but now they're testable in practice.
2. Incompatible with certain outcomes: At least one of
the potential observations I can collect will be incompatible with my
hypothesis. This is where the Gremlin example fails (no observation
could rule it out). Without this property, the hypothesis is essentially
untestable (it's not really a test if, no matter what you find, it still
works).
3. Simplicity: The hypothesis is based on few or no
assumptions. It should involve only one unknown, and the purpose
of the experiment is to answer a question about that unknown.
a. Untested assumptions could cause the hypothesis to fail because
of the assumptions, even when the hypothesis really is correct.
One example of this comes from the medical literature. A group
of people in New Guinea called the Fore were dying of a mysterious
disease called Kuru. People with the disease would lose motor control.
The questions were: What's the disease agent and how is it transmitted?
A genetic cause was originally suspected. The disease clustered in
women and children, and tended to run in families. A disease agent
was ruled out because viruses and bacteria produce symptoms like fever
that indicate that an infection has taken place. People with Kuru
showed no such symptoms.
An alternative possibility for the spread of Kuru was cannibalism.
This alternative was ruled out because men didn't seem to get Kuru.
It was assumed that both men and women participated in cannibalism.
There are two untested assumptions. The first is that disease
agents cause symptoms. The second is that everyone participates in
cannibalism. Both assumptions are wrong, and caused the correct hypothesis
to be ruled out. Cannibalism first. Men usually hunted and
did not share the pigs they caught with the women and children. So,
the women and children were severely protein deprived. The women
were responsible for dressing bodies for burial. Around the turn
of the century, some woman discovered that human flesh was tasty.
Eventually, the women would eat entire corpses. Choice bits were
parceled out according to rank and kinship. The only restriction
was corpses of people who died of obvious disease. Since the Fore
thought Kuru was the result of sorcery, the women generally ate people
who died of Kuru. Steaming the brains in bamboo tubes was
the best method of transmission. (See Goodfield, J. Quest for the
Killers. New York: Hill and Wang.) There's also an article on the
from the Straight
The point: An untested assumption caused a correct hypothesis
(cannibalism) to be rejected. Fortunately, cannibalism stopped
and Kuru is now almost extinct (fewer than six cases per year).
b. Untested assumptions can also cause the hypothesis to be supported
strictly because of the assumptions and not because it's correct.
Some examples of hypotheses that are not simple and what's wrong with
them:
1) "If people enjoy football then they are more likely to have
violent personalities." Problem: "More likely." If the hypothesis
fails, you can hide behind the phrase; if it succeeds, that could be due
to the definition of "more likely." More simply: Adding "more likely"
makes the hypothesis harder to test and harder to falsify (you hurt the
other two properties by including it).
2) "If people see the full moon then their brains will secrete
a chemical that will make them more violent." Problem: Is it
the chemical secretion that you're testing, or the violence? What
if they're not more violent? Does that mean you conclude chemicals weren't
secreted? If they are more violent, does that mean chemicals were
secreted? Figure out if the moon causes the secretion of brain chemicals,
then test if those chemicals cause violence.
C. Generating hypotheses: Keep these principles in mind
when generating hypotheses. The attitude is to look for situations
where the hypothesis can fail, and to make it produce clear and precise
predictions about the world. Note we're already seeing the influence
of the falsification approach. Part of the reason we want to have
our hypothesis be incompatible with certain outcomes is so we can set up
situations where those outcomes are likely in an attempt to falsify the
hypothesis. Only one counter-example rules out a hypothesis, but
it takes an infinite (or nearly infinite) number of positive instances
to prove it true.
III. Logic of hypothesis testing.
A. Let's ease into this slowly. Consider the following
example. You're on the vice squad, and you go into a party with the
following rule in mind: "Anyone who's under 21 can't be drinking
alcohol." Here are some people at the party:
Bob: 18 yrs. old
Carol: drinking a Coke
Jerry: 43 yrs. old
Emily: drinking a beer
Which two people do you check to see if the rule above is being followed
(check meaning look at their age or drink, depending on which you don't
know)?
If you think Bob and Emily, you're exactly correct. This same
situation will underlie all of the logic of hypothesis testing.
Here's how it works: We have a hypothesis and we want to test
it. We can set up a situation and perform a test, and we want this
process to yield the most information about the hypothesis. Otherwise,
we're just wasting our time. We'll consider each situation above
in turn (we'll turn the rule into the hypothesis "If you're under 21
you can't be drinking alcohol" and pretend we're doing an experiment to
test it):
Bob: A situation where someone who is 18 is drinking something. He's
under 21, so if we look at his drink and it's not alcohol, the hypothesis
is supported. The hypothesis would be false if he is drinking alcohol.
Carol: A situation where someone is having a Coke. We can look at her
age, but there's no point. Our hypothesis doesn't say anything about
the ages of people not drinking alcohol.
Jerry: A situation where someone is 43 years old. We can look at what
he's drinking, but there's no point. Our hypothesis is only about
people under 21; it says nothing about people over 21, so checking him
won't tell us about our hypothesis.
Emily: A situation where someone is having beer. This can disconfirm the
hypothesis. If Emily's under 21, then the hypothesis is false.
If she's over 21, the hypothesis is supported.
B. Now we'll make it more formal. Some definitions:
The "if" part of a hypothesis is called the antecedent (p)
The "then" part of a hypothesis is called the consequent (q)
So, for our example:
p = "you're under 21"
q = "you can't be drinking alcohol"
A hypothesis "If p then q" leads to four situations we could set up:
1. present p, look for q (Bob)
2. present not q, look for not p (Emily)
3. present q, look for p (Carol)
4. present not p, look for not q (Jerry)
We'll consider each situation in turn with its formal name. To
check your comprehension, make sure you understand how the names map
onto the four situations above.
1. Present p, look for q.
This is called modus ponens. It is also called confirmatory reasoning.
It's confirmatory because we're looking for instances where the hypothesis
is correct. (We're selecting people under 21 to see if they're
drinking). We present p, look for q, present p, look for q, present
p, look for q...
This can disconfirm if we present p and find not q. So, if we
select someone under 21 and they're drinking, the hypothesis is false.
Several problems with this approach (it's perfectly valid, logically, but
these problems have to do with how fast we can accumulate information):
a. We tend to be biased towards confirmation. The situations
we set up will necessarily support the hypothesis. This is a characteristic of
the humans doing science who like their hypotheses and want to find
support for them.
One place where confirmation biases come up is stereotypes. People
with stereotypes tend to note instances that support the stereotype, and
discount instances that do not support it. Another example can be
had from reasoning experiments. Here's a brain teaser: I have
a rule that generated this set of things: a fire truck and an ambulance.
You generate additional examples, and I'll tell you whether or not my rule
would have put them on the list. When you think you know the rule
(100% certain), tell it to me. (My rule was "vehicles.") Most
people think "emergency vehicles," which is too specific. Then, they
test by thinking up more emergency vehicles. The appropriate test
would be a non-emergency vehicle. That way, you can prove that your
first guess is wrong.
As an example of this, consider the hypothesis (now defunct) that brain
size is an index of intelligence. The implication of this for the
researchers who studied it was that different races would have different
brain sizes, and so could be ranked in terms of intelligence. The
idea was wrong, but confirmation biases played a role in perpetuating it
a lot longer than the data would have supported. The lecture has
more on this. The controversy is from Gould, S. J.
The Mismeasure of Man. New York: W. W. Norton and Co.
How might coincidences be a result of confirmation bias? Check
this article from Skeptical
For bible code debate examples presented in class, check here
b. We end up wading through a lot of crap because of the volume
of research you do showing all of the contexts in which the hypothesis
holds. Each time you support it I can say "yes, but how about this situation?"
and you have to support it again in that situation...
c. No matter how many times you support your hypothesis there's
always that element of doubt. You've shown me 500,000,000 people
under 21 not drinking alcohol, but that very next person you check could
be the one that rules out the hypothesis. No matter how many cases
you confirm, we'll never know for sure if the counter-example is just waiting
to be found.
d. Why it isn't much use in research. If your hypothesis
is something like "If my theory is true then I will find data to support
it," you can't know if the theory is true to do the test in the first
place. Finding data to support it is affirming the consequent.
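The fire truck brain teaser under (a) can be sketched in code. This is only an illustration: the item list, the functions `true_rule`, `guessed_rule`, and `guess_survives`, and the probes are all made up for the example and are not part of the original exercise. It shows why probes chosen to fit the guess can never reveal that the guess is too narrow.

```python
# Hypothetical items: name -> (is a vehicle, is an emergency vehicle)
ITEMS = {
    "fire truck": (True, True),
    "ambulance": (True, True),
    "police car": (True, True),
    "taxi": (True, False),
    "bicycle": (True, False),
    "lemon": (False, False),
}

def true_rule(item: str) -> bool:
    """The hidden rule from the brain teaser: 'vehicles'."""
    return ITEMS[item][0]

def guessed_rule(item: str) -> bool:
    """The guess most people make: 'emergency vehicles'."""
    return ITEMS[item][1]

def guess_survives(probes: list[str]) -> bool:
    """The guess survives if it agrees with the feedback on every probe."""
    return all(guessed_rule(item) == true_rule(item) for item in probes)

confirmatory = ["fire truck", "ambulance", "police car"]  # all fit the guess
disconfirmatory = ["taxi"]                                # the guess says "no"

print(guess_survives(confirmatory))     # True: the wrong guess is never challenged
print(guess_survives(disconfirmatory))  # False: one probe exposes the guess
```

Three confirmations leave the wrong guess standing; a single non-emergency vehicle knocks it down.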
2. Present not q, look for not p.
This is called modus tollens. It's also called disconfirmatory
reasoning. We're looking for situations where the hypothesis will
fail (find people who are drinking and see if they're under 21).
This has a good chance of disconfirming. If we find one drinker
under 21, then the hypothesis is ruled out.
This approach has a big advantage over modus ponens. Using modus
tollens we only have to find one case where the hypothesis fails and we're
through. If we use modus ponens we have to find too many to count
where it succeeds.
3. Present q, look for p.
This is called affirming the consequent. It's NOT VALID.
Look closely at the hypothesis. It says "when you see p you'll see
q." It does not say "when you see q you will see p." But, that's
the test you're performing here. No matter what you find (present
q and get p or not p), nothing happens to the hypothesis. So, you're
not learning anything here.
Aside: The name makes sense. The consequent is the second
part of the hypothesis, and you're affirming it. So, don't try to
memorize these as arbitrary strings, think about the meaning.
4. Present not p, look for not q.
This is called denying the antecedent. It's NOT VALID.
Again, the hypothesis says "when you see p you'll see q." It does
not say "when you see not p you'll see not q." You might see q without
p; there's nothing in the hypothesis to rule that out. So, looking
at a case where p is missing is pointless. Again, note that the name makes sense.
Here's a table that might help make this all clear. (mind your
p's and q's)
Test                      Present  Find    Result
Modus ponens              p        q       hypothesis is supported
                                   not q   hypothesis is not supported
Modus tollens             not q    not p   hypothesis is supported
                                   p       hypothesis is not supported
Affirming the consequent  q        p       hypothesis is not affected
                                   not p   hypothesis is not affected
Denying the antecedent    not p    not q   hypothesis is not affected
                                   q       hypothesis is not affected
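The table can also be worked out mechanically. The sketch below is a minimal illustration, not part of the original notes: it treats "if p then q" as failing only when p is true and q is false, and asks, for each of the four tests, whether any possible observation would contradict the hypothesis. The names `implies`, `TESTS`, and `can_falsify` are made up for this example.

```python
def implies(p: bool, q: bool) -> bool:
    """'If p then q' fails only when p is true and q is false."""
    return (not p) or q

# Each test fixes ("presents") one variable and leaves the other to be observed.
# (name, variable presented, value presented)
TESTS = [
    ("modus ponens", "p", True),              # present p, look for q
    ("modus tollens", "q", False),            # present not q, look for not p
    ("affirming the consequent", "q", True),  # present q, look for p
    ("denying the antecedent", "p", False),   # present not p, look for not q
]

def can_falsify(variable: str, value: bool) -> bool:
    """True if some observation of the other variable contradicts 'if p then q'."""
    for observed in (True, False):
        p, q = (value, observed) if variable == "p" else (observed, value)
        if not implies(p, q):
            return True
    return False

for name, variable, value in TESTS:
    verdict = "can rule out the hypothesis" if can_falsify(variable, value) else "no information"
    print(f"{name}: {verdict}")
```

Only the first two tests have an outcome that is incompatible with the hypothesis, which is why the other two leave it unaffected no matter what you find.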
Let's have another example: If it's a lemon, then it's sour.
p = "it's a lemon"
q = "it's sour"
1. present something sour, see if it's a lemon: affirming
the consequent, no information. The hypothesis doesn't say anything
about sour things, it says something about lemons.
2. present a lemon, see if it's sour: modus ponens.
3. present something not a lemon (ex. a tree), see if it's not
sour: denying the antecedent. We didn't say anything about
things not lemons, only lemons.
4. present something not sour, see if it's not a lemon: modus tollens.
Note the relation to the ideal hypothesis: The two that are not valid
set up a situation where every outcome is consistent with the hypothesis.
No matter what happens, you can't rule out the hypothesis.
C. You may be asking yourself two questions about now:
1. How am I supposed to remember all of this? Cram two
things in your head:
a. present p = modus ponens.
b. present not q = modus tollens.
Whenever you get a problem, label p and q in the hypothesis, then label
all the p's, q's, not p's, and not q's in the questions. Then, using
a and b above, figure out what you've got. Note: If it's not
one of the two you've memorized, the names are informative enough to identify
them. If you see not p being presented, that's "not the antecedent"
or "denying the antecedent."
2. What am I supposed to take from all this? Some situations
are more informative than others. It's to your advantage to set up
experimental situations that yield useful information. Also,
if you seek to rule out hypotheses, you can get away with even less work.
It takes only one counter-example and you're finished, but you can never
provide enough confirmations.
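The asymmetry between confirming and ruling out can be illustrated with a short sketch. The data and the function name `first_counterexample` are hypothetical; each observation is an (age, drinking alcohol) pair for the party rule.

```python
from typing import Optional

Observation = tuple[int, bool]  # (age, is drinking alcohol)

def first_counterexample(observations: list[Observation]) -> Optional[int]:
    """Return the index of the first under-21 drinker, or None if there is none."""
    for index, (age, drinking) in enumerate(observations):
        if age < 21 and drinking:
            return index
    return None

# 1,000 confirming cases, then the same cases followed by one under-21 drinker.
confirming = [(18, False)] * 1000
observations = confirming + [(19, True)]

print(first_counterexample(confirming))    # None: still not proof the rule holds
print(first_counterexample(observations))  # 1000: one case is enough to reject it
```

However long the confirming list gets, the result stays "none found so far"; the single counter-example settles the question.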
V. Induction and deduction. Now we'll step back
a level and look at the bigger picture. Where do all of these hypotheses
come from, and where are all of these experiments going?
A. Induction: Induce a rule from a set of specific instances.
So, I get a lemon, it's sour, get another lemon, it's sour, [repeat ad
infinitum]. After a while, I can induce the rule "if it's a lemon
then it's sour." Note: this is closely tied to modus ponens.
Confirmatory reasoning research puts out all of these examples from which
I can induce the rule. It's also tied to exploratory research.
The reason I'm doing these experiments is I don't know the rule (I don't
know enough to have a rule). So, I try a few contexts and get some
results, and then induce the rule.
B. Deduction: Once I have a rule, I can use logic to derive
predictions from that rule that I can then test. "If it's a lemon
then it's sour" --> "if I get something that's not sour, then it can't
be a lemon." I can test this prediction. Note how much that
sounds like modus tollens. Once we have a hypothesis, we can try
to make it fail (or identify more precisely the conditions under which
it holds). This is related to hypothesis testing research.
When I know enough I can use deduction to derive hypotheses to test.
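The deduction used here, from "if it's a lemon then it's sour" to "if it's not sour then it's not a lemon," is the contrapositive. A quick check over all four truth assignments (a sketch with made-up function names) suggests why the derived prediction is a fair test: the two statements always agree, while the converse, "if it's sour then it's a lemon," does not.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """'If a then b' fails only when a is true and b is false."""
    return (not a) or b

def hypothesis(lemon: bool, sour: bool) -> bool:
    return implies(lemon, sour)          # if lemon then sour

def contrapositive(lemon: bool, sour: bool) -> bool:
    return implies(not sour, not lemon)  # if not sour then not lemon

def converse(lemon: bool, sour: bool) -> bool:
    return implies(sour, lemon)          # if sour then lemon

cases = list(product([True, False], repeat=2))
same_as_contrapositive = all(hypothesis(l, s) == contrapositive(l, s) for l, s in cases)
same_as_converse = all(hypothesis(l, s) == converse(l, s) for l, s in cases)

print(same_as_contrapositive)  # True
print(same_as_converse)        # False
```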
C. The sequence is induction first to get hypotheses, and then
deduction to test them. Here's a picture:
Here's a rough example. I observe several people complaining
about grades, and they all say the professor made the test too hard.
From this data I induce the rule "people attribute things to external sources."
That becomes my theory. Using deduction, I predict that people with
good grades will also credit the professor. I collect some more data.
The people with good grades say it's because they studied. This
leads me to induce a new theory "people attribute bad things to external
sources, and good things to internal sources." I predict that this
will be true for self and other. I test this by asking people at
a drive-through line why it's taking so long. People waiting blame
the delay on the cars in front of them ("people order such weird stuff
at the drive-through"). But, when they're at the window, they blame
it on the workers in the restaurant ("these people are so slow").
I use the data to revise the theory again "people attribute good things
to internal sources and attribute bad things to external sources for self,
but internal sources for others." We could keep going on, but you
get the idea. Data leads to a theory, the theory leads to a prediction,
the prediction leads to more data, ...
VI. Wrap-up. The final spin on hypothesis stuff:
A. Some situations are more informative than others.
B. Falsification is better because it only has to happen once.
C. There are sometimes limits on the kinds of causal statements
you can make.
D. There are different ways to propose hypotheses that are related
to how much you know and what kinds of experiments you do.