Note: A lot of the demonstrations for this unit were derived from Reed's textbook or the instructor's manual by Reed and Pusateri.
I. Goals.
A. Where we are/themes.
B. Kinds of categorization.
C. Theories of categorization.
II. Where we are/themes. We're finally moving into higher cognition. We're drifting away from describing parts (representation) and looking at process instead. Note that this won't be perfect, since representation will still determine process. But we're going that way. This week's topic is categorization: how you partition the continuous experience of the world into discrete things. Categorizing does four things for you:
A. Reduces complexity. Instead of having to treat every experience as unique, we can put things in classes and save some work. For example, if you get stung by a bee, you can avoid getting stung by a wasp if you can see the similarities.
B. Allows identification. Think about this: How would you know what something was if you didn't categorize it?
C. Reduces the need for constant learning. You can deal with classes of things instead of tons of individuals. That can help, but as we'll see with person perception, it can also lead to stereotypes.
D. Allows for action. If I know something belongs to the
category of things that are likely to eat me, I can run away from it.
What we'll do in this unit is look at kinds of categories and how you learn them. Then, we'll consider theories to explain categorization. We'll wrap up by looking at some implications of categorizing.
III. Kinds of categorization. Categories can be defined in a number of ways. We will look at these in some detail. How you define a category will impact how it is learned.
A. Logical categories. A category can be defined according to some rule. For example, a dog might be defined by the conjunction of two features (has 4 legs and barks). So, anything satisfying the rule is called a dog. There are four kinds of logical rules, each sketched in code after the list:
1. Conjunction: Join two features with "and." For example, has 4 legs and barks.
2. Disjunction: Join two features with "or." If either feature is present, then it's a member. For example, we might define a mammal as warm-blooded or gives live birth. That lets in platypuses, which would otherwise be out.
3. Conditional: If...then... For example, to decide if something is a mammal you might say "if it lays eggs, then it is warm-blooded." Again, this lets in platypuses, plus everything that doesn't lay eggs.
4. Biconditional: If and only if...then... For mammals, we might say "if and only if it lays eggs, then it is warm-blooded." This lets in platypuses, but rules out anything that's warm-blooded but doesn't lay eggs. In other words, this is a lousy rule for mammals, but I couldn't think of another one.
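To make the rules concrete, here is a minimal Python sketch (mine, not from the lecture) of the four rules as membership tests. The feature names and the dictionary encoding are assumptions for illustration.

def conjunction(item):
    # Member if and only if it is large AND red.
    return item["size"] == "large" and item["color"] == "red"

def disjunction(item):
    # Member if it is a square OR it is green.
    return item["shape"] == "square" or item["color"] == "green"

def conditional(item):
    # "If blue, then circle": true unless it is blue and not a circle.
    return item["color"] != "blue" or item["shape"] == "circle"

def biconditional(item):
    # "Triangle if and only if blue": member if both or neither.
    return (item["shape"] == "triangle") == (item["color"] == "blue")

# A small blue square: fails the conjunction, conditional, and
# biconditional rules, but passes the disjunction rule.
item = {"shape": "square", "size": "small", "color": "blue"}
print(conjunction(item), disjunction(item), conditional(item), biconditional(item))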
Let's try the demonstration to make sure we all understand.
Demonstration: I'm going to show you circles, squares, and triangles. They can be large or small, and they can be red, green, or blue. I will tell you a rule, we'll look at each example, and the goal is to prove you understand the rule by classifying them all correctly. For example, I might have the rule "blue and large." If I show you a small, blue square, you should say "no." If I show you a large, blue square, you should say "yes."
If you're following along with the notes, don't read the answers; pay attention to the lecture at this point.
Conjunction: Large and red. (Numbers 7, 16, 17.)
Disjunction: Square or green. (Numbers 1, 2, 5, 6, 7, 8, 9, 12, 15, 18.)
Conditional: If it's blue, then circle. (Numbers 1, 3, 4, 5, 6, 7, 9, 11, 12, 13, 15, 16, 17, 18.)
Biconditional: If and only if it's a triangle, then blue. (Numbers 5, 6, 7, 10, 12, 13, 14, 15, 16, 18.)
The Appendix has the images for this demonstration.
The rules vary in complexity. Conjunction is the simplest; biconditional is the hardest. If you look at people learning these rules, that's what you see. What does all this mean? The more complex the rule that determines category membership, the harder it is to learn.
Criticism: Real categories don't work like this. They have continuous and probabilistic features, and there might not be a rule that tells things apart. For example, think about "game." What's the rule that defines games? If I'm on top of my game, I should be able to shoot down any rule you propose.
B. Natural categories. Look at real-world categorization. You'll notice a few properties.
1. Continuous variables. For example, color varies continuously, yet many different shades all get grouped into one category (for example, yellow).
2. Graded membership. Which is a "better" mammal, a whale or a bear? Which is a better even number, 4 or 106?
3. Hierarchical organization.
a. There is a superordinate level. This is for general classes of things, like furniture. At this level, the things in the category can be pretty dissimilar.
b. There is a basic level. For example, chairs. The things in the category are all pretty similar.
c. There is a subordinate level. For example, living room chairs. These things are even more similar.
The basic level seems to be the one where people prefer to work (Rosch). Why? It has to do with features. At the superordinate level, there is little feature overlap. What relates chairs to refrigerators, other than being furniture? At the subordinate level, there is too much overlap. Living room chairs are all very similar, so they are hard to distinguish from one another. In fact, telling things apart at this level takes some expertise. Depending on which one of these is your area, you can see this if we think about subordinate categories of trees, insects, or psychologists. I could do very well telling apart the subordinate category "cognitive psychologists." I would do pretty badly on "beetles."
At the basic level, the amount of feature overlap is just right. Chairs share a number of common features, so there is overlap. But it's still easy to tell chairs from trees, so there's not too much overlap.
Evidence for basic level categories? Three sorts.
1. List features. I present you with some categories. List all of the features of these categories that you possibly can. Previous research indicates that superordinate categories only get a few features (for clothing, only two). Basic-level categories get a lot (pants got six). Subordinate categories only add a few more (Levi's only added one). Let's demonstrate that here.
Demonstration: Here are some categories. List all of the features of these categories that you possibly can. The features you list should be things that all members of the category share.
Superordinate:  Animals    Furniture
Basic:          Birds      Shelves
Subordinate:    Sparrows   Bookshelves
What this demonstrates is that, at a feature level, basic is where all of the action is. That's where the most similar and dissimilar features come into play. Note that "basic" is a little arbitrary, but counting features gives us a method for determining it.
2. Identify category members. If I ask you to verify that things are in a category, and measure the time, I get differences between the levels. So, I might ask if a picture is a living-room chair, a chair, or furniture. If I then show you a living-room chair, the answer to all three is "yes," but the fastest responses are at the basic level. Somehow, you start there and then compare more generally for superordinate or more specifically for subordinate.
3. Typicality. Some things are better members of the category than others. If you ask people to rate typicality, you get pretty consistent patterns. Rosch and Mervis (1975) had people make these ratings. Let's see what we get.
Demonstration: I've got members of two categories. I want you to put a 1 by the thing that's most typical of the category, up to a 5 by the thing that's the worst example of the category.
Vehicles:  _____ Car     _____ Elevator   _____ Sled       _____ Tractor   _____ Train
Clothes:   _____ Jacket  _____ Mittens    _____ Necklace   _____ Pajamas   _____ Pants
What does this mean? The basic level is where typicality comes from. The typical members share a lot of features with other members of the basic category and differ from members of other basic categories. The atypical ones don't.
To test this, compute a measure of family resemblance. For each item in a category, list all of its features. Then, for each feature of an item, count the other members of the category that share the feature. Add them all up, and you get resemblance. For example, if you're looking at animals, and 4 legs was a feature of dogs, then you could count the other members of the category animals that had 4 legs. If there were 20, then the score would be 20. Then, if barks was the next feature, and four animals barked, you would add in four to get 24, and so forth. The final step is to correlate family resemblance and typicality. They should be related based on what we've been saying. They are (r's between .84 and .94, which are really high).
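Here is a minimal Python sketch of the family-resemblance computation just described. The feature sets and typicality ratings are made up for illustration; they are not Rosch and Mervis's actual data.

import statistics  # statistics.correlation needs Python 3.10+

features = {
    "dog":   {"4 legs", "fur", "barks"},
    "cat":   {"4 legs", "fur", "climbs"},
    "horse": {"4 legs", "fur", "mane"},
    "snake": {"scales", "slithers"},
}

def family_resemblance(item):
    # For each feature of the item, count the other members that share
    # it, then sum the counts (the 20 + 4 = 24 procedure from the text).
    return sum(
        sum(feature in feats for name, feats in features.items() if name != item)
        for feature in features[item]
    )

# Hypothetical typicality ratings (higher = more typical of "animal").
typicality = {"dog": 6.5, "cat": 6.0, "horse": 5.5, "snake": 2.0}

scores  = [family_resemblance(a) for a in features]
ratings = [typicality[a] for a in features]
print(statistics.correlation(scores, ratings))  # a high positive r, as in the text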
C. Goal-directed categories. There are some categories that have varying membership. The defining characteristic is related to a particular goal. For example, the category of things to rescue from the house in a fire, or things to eat on a diet. The organization of these categories is around ideals. For example, if your ideal vacation is a cruise, then membership in the category of good things to do on a vacation will be based on that. These categories throw off everything that I've said previously (they're not logical and they don't follow the rules of natural categories). What do they tell us about cognition?
IV. Theories of categorization. These are all going to boil down to some kind of feature analysis. You might think back to pattern recognition and refresh your memory as to the pros and cons of using features.
A. Classify by comparing to specific examples. Basically, compare everything you have in memory to the thing you're trying to classify, and go with the category that contains the thing that provides the best match. This sounds insane, but I'll present a model later that does something like this. The simple version could lead to a lot of mistakes. For example, you might call a whale a fish instead of a mammal because of its neighbors in the two categories.
B. Classify by looking at the average distance. Compare to everything in memory, compute a distance from each thing, average the distances for the things in each category, and go with the category that has the lowest average distance. Again, it's a bit wacky.
C. Feature counting. Compare to everything in memory, feature by feature. Count the number of matches. Go with the category with the most matches.
D. Prototypes. Extract an average for each category. That average is called a prototype. It's kind of like the ideal member. Then, compare the new thing to the prototype and see whether it matches. (All four strategies are sketched in code below.)
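Here is a minimal Python sketch (mine, not from any particular paper) contrasting strategies A through D on toy two-feature items, with Euclidean distance standing in for dissimilarity.

import math
from collections import defaultdict

# Toy memory: (feature tuple, category label) pairs. All values are
# made up for illustration.
memory = [((1, 1), "A"), ((2, 1), "A"), ((8, 9), "B"), ((9, 8), "B")]

def nearest_exemplar(item):
    # A. Go with the category of the single best-matching thing in memory.
    return min(memory, key=lambda m: math.dist(item, m[0]))[1]

def average_distance(item):
    # B. Go with the category whose members have the lowest mean distance.
    d = defaultdict(list)
    for ex, cat in memory:
        d[cat].append(math.dist(item, ex))
    return min(d, key=lambda c: sum(d[c]) / len(d[c]))

def feature_count(item):
    # C. Go with the category whose members match the most features exactly.
    counts = defaultdict(int)
    for ex, cat in memory:
        counts[cat] += sum(a == b for a, b in zip(item, ex))
    return max(counts, key=counts.get)

def prototype(item):
    # D. Average each category into a prototype; compare only to those.
    groups = defaultdict(list)
    for ex, cat in memory:
        groups[cat].append(ex)
    protos = {c: tuple(sum(v) / len(v) for v in zip(*exs))
              for c, exs in groups.items()}
    return min(protos, key=lambda c: math.dist(item, protos[c]))

print(nearest_exemplar((2, 2)), average_distance((2, 2)),
      feature_count((2, 1)), prototype((2, 2)))   # all "A"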
Evidence: Posner and Keele (1968). Make a prototype out of dots, like a triangle. Then, create category members by moving the dots around. Move them a little bit, a medium amount, or a lot. The amount of movement leads to variability (differences among category members). The more you move the dots, the more variability there is. Then, have people learn to classify the examples (don't show the prototype). Two things happen:
1. The prototype is recognized as a member of its category, even though it was never seen before. Prototype recognition is usually the best, indicating that people have developed a representation of it, even though they never saw it.
2. The amount of variability in training really matters. If you trained with a highly variable set, you can categorize examples you never saw before. This is true even if the new examples are highly variable. But people trained with low variability are poor at classifying new examples that are highly variable.
So, categorizing appears to be based on knowing the prototype plus some estimate of how variable the category is.
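Here is a minimal Python sketch of this style of stimulus construction: jitter each dot of a prototype, with bigger jitter giving a more variable category. The jitter values are placeholders, not Posner and Keele's actual distortion levels.

import random

def make_exemplar(prototype_dots, jitter):
    # Move every dot by a random amount; bigger jitter means more
    # variable category members.
    return [(x + random.gauss(0, jitter), y + random.gauss(0, jitter))
            for (x, y) in prototype_dots]

prototype_dots = [(0, 0), (10, 0), (5, 8)]        # a triangle of dots
low_variability  = [make_exemplar(prototype_dots, 0.5) for _ in range(10)]
high_variability = [make_exemplar(prototype_dots, 3.0) for _ in range(10)]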
CogLab: We'll look at the results of our prototype demonstration.
Let's look at a more developed prototype model, one that has a lot of the parts of an ideal theory: Hintzman's trace model.
1. We need to define two terms.
Schema: A configuration of typical knowledge. This can be a frame (a representation of an object: a house has walls, roof, windows, door, etc.) or a script (a representation of a typical event: going to a restaurant involves sitting down, ordering, eating, etc.).
Prototype: The idealized representation of a concept. This is the perfect exemplar of a concept (exemplar: an example of an item in a particular category or concept). For example, the prototypical bird is a robin (has wings, flies, sings, lays eggs, nests in trees, eats insects, etc.). A non-prototypical bird would be a penguin or a chicken. They have some of the properties, but not all of them.
2. Traditionally, people assume that schemas and prototypes are abstracted as a result of experience. Some executive process looks through all of the birds you've seen and figures out what the ideal bird is, then represents that as a prototype (this may be an automatic process). Experimental evidence suggests this (see above).
In a typical experiment, I make a random dot pattern. This is my prototype. Then, I vary a few dots to make exemplars of the category represented by that prototype. I might make three prototypes, with three exemplars of the first prototype, six of the second, and nine of the third. Then, I have people classify the exemplars (they never get to see the prototypes). I say "Which category, 1, 2, or 3, does this belong to?" At first, people miss them all, since they don't know anything about the categories. We keep going, and I keep giving feedback, until they can do all 18 perfectly. Then I test to see if they "abstracted" the prototype. Evidence that they have (as described by Hintzman):
a. Prototype classification (ask people to assign the prototype to a category) is more stable over time than classification of the exemplars they actually saw. This suggests that the prototype was abstracted and is in memory.
b. Old exemplars are classified better than new ones.
c. Classification of unseen stimuli is best for prototypes, followed by low-level distortions, followed by high-level distortions. This suggests that a prototype is in memory, and classification is made by comparing to it.
d. Transfer is better with bigger categories (more exemplars shown). This suggests that more variability in the input enables you to do a better job of zeroing in on the prototype.
3. Hintzman's critique:
a. The concepts of schema and prototype are very fuzzy and vague. They don't really have any predictive power. We need something more specific.
b. A multiple-trace model accounts for these findings, plus others.
c. We shouldn't postulate processes (prototype abstraction) or representations (multiple memory stores) if we don't have to have them.
4. The model:
a. Memory contains a trace of every experience. The trace is made up of primitive features (what these are is not too well specified, but they're low level, and few in number relative to the number of traces in memory).
b. A retrieval works by sending a probe through all of these traces based on what's in working memory (what's currently activated or conscious out of all of the traces). An echo comes back.
c. The echo has intensity (overall similarity to the probe). This is akin to familiarity. It also has content (the sum of the contributions of all of the traces). The content is akin to the memory. For example, I see an exemplar and need to know what category it belongs to. Some features are the exemplar, some are its category name. I have the exemplar in working memory, send it through all of my traces, and get back an echo. The content of the name part of the echo is my response (as in "Category 1").
Note that this memory is content addressable. You don't have to know where in memory something is. Instead, traces are activated based on their similarity to the trace in working memory (they're activated based on their content). A sketch of the echo computation appears below.
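Here is a minimal Python sketch of the echo computation, in the spirit of Hintzman's MINERVA 2. The cubed-similarity activation follows Hintzman (1986); the tiny feature vectors and name-slot encoding are my own illustration.

def echo(probe, traces):
    # Features are +1/-1, with 0 meaning "unknown" in the probe.
    n = len(probe)
    intensity = 0.0
    content = [0.0] * n
    for trace in traces:
        # Similarity of this trace to the probe, then a nonlinear
        # activation that strongly favors close matches.
        similarity = sum(p * t for p, t in zip(probe, trace)) / n
        activation = similarity ** 3
        intensity += activation
        # Every trace contributes to the echo's content in proportion
        # to its activation; there is no prototype stored anywhere.
        for j, t in enumerate(trace):
            content[j] += activation * t
    return intensity, content

# First four slots are the exemplar's features; the last two code the
# category name. The probe leaves the name slots at 0 (unknown).
traces = [
    [ 1,  1, -1,  1,  1, -1],   # category-1 exemplar
    [ 1, -1, -1,  1,  1, -1],   # category-1 exemplar
    [-1,  1,  1, -1, -1,  1],   # category-2 exemplar
]
probe = [1, 1, -1, 1, 0, 0]
intensity, content = echo(probe, traces)
print(content[4:])  # name slots lean toward category 1's code (+1, -1)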
5. Does it work?
a. A basic simulation shows that you can get it to recall something that looks like a prototype even with no prototype in memory (the echo is more correct than any of the exemplars). So, it looks like it's abstracting, but it really isn't.
b. Comparison to the classic human experiment yields almost exactly the same pattern of results.
So, prototype models were winning, but when you get down to the nuts and bolts, there isn't really any need for a prototype to get evidence of one. Similarity is sufficient; the math will make it work out like you need it to.
E. Fuzzy-set theory. A person named Zadeh has proposed that classic logic is too restrictive for categorization. In classic logic, only two truth values are allowed, 0 (false) and 1 (true). This is fine for a robin as a member of the category bird (true). It's also OK for a robin as a member of the category fish (false). But a lot of ordinary categorization doesn't work like that.
One place this shows up is in something called linguistic hedges. I might say "loosely speaking, a whale is a fish." The hedge is "loosely speaking," and it limits the extent to which I'm putting whales in the fish category. Or, consider "technically speaking, a whale is a mammal." Again, the hedge illustrates graded membership. A whale is a mammal, but it's not a very good example. A whale is not a fish, but it's pretty close.
Fuzzy logic allows for graded truth values. A whale can be a mammal to degree 0.7. In that case, a whale is a lot like a mammal, but it's not all the way in the category. This solves some serious problems for categorization. Consider the category "tall men." Who is in the category? The boundary between tall and short is fuzzy, so we need fuzzy logic to account for it. A man who is four feet tall is in the category "tall men" to degree 0.0. A man who is eight feet tall is in the category to degree 1.0. Someone who is six feet tall is in between. This also allows us to escape some paradoxes. If you take a man with a full head of hair and pluck one hair, is he bald? Now pluck another. And so on. Plucking one hair at a time doesn't seem to ever lead to calling someone bald, but at some point we'll be calling a rather bald guy "not bald." If we had fuzzy values, we could say "bald to degree 0.6."
How does this fare when you try to test it? First, we need to know a bit more logic. There are two big relations: conjunction (and) and disjunction (or). Traditional logic has rules for handling these. You usually use a truth table. For AND:
A        T  T  F  F
B        T  F  T  F
A AND B  T  F  F  F

For OR:

A        T  T  F  F
B        T  F  T  F
A OR B   T  T  T  F
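For comparison, here is a minimal Python sketch of how fuzzy logic handles the same two relations. In Zadeh's system, conjunction is typically taken as the minimum of the two membership values and disjunction as the maximum; with only 0s and 1s, these reproduce the classical tables above.

def fuzzy_and(a, b):
    # Zadeh's standard fuzzy conjunction: the weaker claim wins.
    return min(a, b)

def fuzzy_or(a, b):
    # Zadeh's standard fuzzy disjunction: the stronger claim wins.
    return max(a, b)

print(fuzzy_and(1, 0), fuzzy_or(1, 0))          # 0 1, matching the tables
print(fuzzy_and(0.7, 0.4), fuzzy_or(0.7, 0.4))  # 0.4 0.7, with graded values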