Psychology, Notes 4 -- Short Term Memory
A. Where we are/themes.
B. Two kinds of memory.
C. Properties of short-term memory.
D. Working memory.
II. Where we are/themes.
A. Here are some situations:
1. You can still remember details of your tenth birthday party
(which you don't need), but you have trouble remembering a definition
long enough to write it down. Why?
2. Pizza I: You look up the phone number of a pizza
place and someone asks you a question before you can make the
call. When you go to dial, the number is gone. Why?
3. You're trying to get the lunch order straight. Three
people tell you what they don't want on their hamburger, but you can
only remember part of the information. Why?
4. Pizza II: Why can't you remember a number and talk to
someone, but you can remember the number and look around the room?
What do these questions have in common? Short-term memory.
We're going to discuss why there seem to be two kinds of memories, some
that only last a short time and some that last forever. Then we'll
look at the properties of the brief memory store.
B. Where we are. Remember, we're working our way through
this box model of the mind. We've talked about the sensory register
and pattern recognition. The register holds information briefly,
and pattern recognition figures out what the information is. Last
time we looked at the "filter" and "selection" components. Both of
these are controlled by attention (it allows you to filter out some
information and select from what's left). Now we'll look at brief memories.
Short-term memory: A brief memory store with a limited capacity that
allows you to hold information as you process it.
1. One short-term memory or many? The evidence indicates
that there are three or four. We'll see how that comes about.
2. Is short-term memory really different from long-term
memory? Most of this evidence will come later, but it's an important theme.
III. Two kinds of memory. The basic division is
into long-term memory and short-term memory (Atkinson and Shiffrin).
A long philosophical tradition supports the phenomenological experience
of having some memories that only last a tiny little while and others
that last forever. The short ones are assumed to be in short-term
store, the long ones in long-term store. Why do we think there
are two stores?
A. The classic serial position curve generated from free-recall
experiments.
Demonstration: Free recall and the serial position curve.
Here's a list of words to memorize: roof, latch, hot, shirt, clog,
court, slot, hand, dirt, table, apple, radio, clock, phone, box,
picture. Now, write down all of the words you remember, in any
order.
Complete word counts on the overhead graph.
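One way to build the graph from the class's recall sheets is to tally, for each list position, how many people wrote down that word. This is only a minimal sketch: the function name and the three sample sheets are made up for illustration and are not class data.

```python
# Study list from the demonstration, in presentation order.
STUDY_LIST = [
    "roof", "latch", "hot", "shirt", "clog", "court", "slot", "hand",
    "dirt", "table", "apple", "radio", "clock", "phone", "box", "picture",
]


def serial_position_counts(study_list, recall_sheets):
    """Return, for each list position, how many sheets contain that word."""
    counts = []
    for word in study_list:
        # A word counts once per sheet, no matter where it was written down
        # (free recall: output order does not matter).
        counts.append(sum(1 for sheet in recall_sheets if word in set(sheet)))
    return counts


# Hypothetical recall sheets (illustration only, not real data).
sheets = [
    ["picture", "box", "phone", "roof", "latch"],
    ["box", "picture", "roof", "hot", "table"],
    ["phone", "picture", "roof", "latch", "clock"],
]

counts = serial_position_counts(STUDY_LIST, sheets)
for position, (word, n) in enumerate(zip(STUDY_LIST, counts), start=1):
    print(f"{position:2d} {word:8s} {'#' * n}")
```

Plotting the counts against list position gives the serial position curve. In these made-up sheets the counts are highest for the first and last words and lowest in the middle.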
Glanzer and Cunitz tested a two-store explanation for this curve.
The first part (primacy) is due to long-term memory and the last part
(recency) is due to short-term memory. What's the test?
1. Show that it's long-term store at the beginning: Some
variable should affect long-term but not short-term store.
Spacing. The reason stuff gets into long-term is because you say
it over and over to yourself between words. If we increase the time
between words, you'll have more time to rehearse, so primacy should get
bigger. Recency should be unaffected. Sure enough.
2. Show that recency is short-term store: What should hurt
recency is stopping rehearsal for a moment before recall (recency is the
contents of short-term store; if we make you wait a while, those contents
will go away, so there will be no recency). Add a 30-second
counting-backwards interval before recall. Sure enough.
B. Neuropsychological evidence. The idea is to look for
patients with brain damage who lose one kind of memory, but not the other.
HM is a famous patient with damage to his hippocampus. He still remembers
details of his life before his surgery (long-term memory) but nothing
after. He can hold new information briefly, but it never makes it into
long-term memory. Other
patients can lose their long-term memory but still do short-term memory
tasks (retrograde amnesiacs). This condition usually fades, but it shows that
separation is possible. The thinking goes like this: If
damage can cause differential effects, then there must be two stores.
This explains why birthday parties last (long-term memory) but definitions
don't (short-term memory).
IV. Properties of short-term memory. Now that we're
convinced short-term memory is separate, what are its properties?
Five things to look at: (1) How long does it last?, (2) What makes it
go away?, (3) What's its capacity?, (4) What's the representation like
(what's the code)?, and (5) How do you search it (remember from it)?
A. Peterson and Peterson (1959) look at duration. Give
a person a three-letter list (G, P, K). Have them count backwards
by 3's for some interval, then ask for recall. The counting can go
on from 3 to 18 seconds. After 18 seconds, short-term store is empty
(note Glanzer and Cunitz's 30-second interval was motivated by this).
This explains Pizza I. If you get distracted, you lose the information
in short-term memory.
B. What is forgetting? It could be a kind of decay (stuff
just fades over time). The analogy is to rusting. If you leave
metal exposed to the elements, it rusts. The longer it's there the
worse it gets. The problem is this: There's a mechanism for
rusting; time alone doesn't do it. Similarly, there should be a
mechanism for forgetting.
What is it? Interference.
Why do we think it's interference? Waugh and Norman (1965).
Manipulate two things. Rate: Faster should cause better memory
if it's decay (less time passes, so less decay). But, if it's interference,
what difference does rate make? Amount of items interfering before
the test: Should make memory worse if there's more interference.
But, if decay explains it, then it won't matter how many items there are.
Hypothetical Demonstration: Here are some lists of random digits.
The list will be read at a slow rate (approx. 1 per second) or fast (approx.
4 per second). You hear the list, then a target item, and say what
came after the target item in the list.
|# Int Items
9 5 8 4 7 9 8 1 4 3 3 4 8 7 6 9
5 5 9 9 6 8 5 4 2 6 1 2 8 5 6 7
5 2 4 5 5 9 7 3 9 4 9 4 1 2 3 5
7 7 3 1 4 6 8 6 8 3 7 1 2 8 9 7
2 1 9 6 7 8 9 5 4 6 7 5 6 9 2 3
1 9 3 2 8 4 4 3 2 9 9 4 6 2 5 1
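The logic of the two manipulations can be sketched in code. For any probe trial, the number of items that come after the target (interference) depends only on where the target sits in the list, while the time since the target (decay) also depends on the presentation rate. This is a minimal sketch: the function name and the example list are hypothetical and are not taken from the lists above.

```python
def probe_trial(digits, probe, rate_per_sec):
    """Describe one probe-digit trial.

    digits: the list as presented; probe: the target digit, assumed to
    occur exactly once before the end of the list.
    Returns (correct_answer, items_after_target, seconds_since_target).
    """
    target_index = digits.index(probe)
    correct_answer = digits[target_index + 1]
    # Everything presented after the target can interfere with it.
    items_after_target = len(digits) - 1 - target_index
    # Elapsed time depends on how fast those items were read.
    seconds_since_target = items_after_target / rate_per_sec
    return correct_answer, items_after_target, seconds_since_target


example = [7, 2, 9, 4, 1, 6, 3, 8]  # hypothetical list
slow = probe_trial(example, probe=9, rate_per_sec=1)
fast = probe_trial(example, probe=9, rate_per_sec=4)
print("slow:", slow)
print("fast:", fast)
```

The two trials have the same number of interfering items but different elapsed times. Decay predicts better memory on the fast trial; interference predicts no difference between them.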
There are two kinds of interference.
a. Retroactive: Something you learn now messes up something
you used to know. This is the classic type of interference.
It's what Peterson and Peterson were trying to manipulate in their experiment.
That's also what Waugh and Norman were investigating. You know the
digit after the test digit if it comes right away. As you hear more
digits, interference happens, and you get worse.
b. Proactive: Something you learned before messes up what
you're trying to learn now. The classic experiment is to present
one kind of information to be memorized (fruits). Do this for three
lists. Memory gets worse for each list. Why? All of the
earlier items are interfering with what you're trying to learn now.
Then, for the fourth list change to new material (professions). Memory
for the new material will rebound to be like the first list. This
is called release from proactive interference.
Demonstration: There are four sets of words below.
The task is a Peterson and Peterson task (see list, count backwards by
3's, recall). The counting should go for 18 seconds for all four
lists. The experimental group gets the professions for list four.
The control group gets more fruits for list four. Otherwise, both
groups are the same. The experimental group should show release from
proactive interference on list four.
The words and data are taken from an experiment by Wickens, Dalezman,
and Eggemeier (1976). The point is that the more the fourth list differed
from the original lists, the more people remembered it.
It turns out that proactive interference can explain all of the forgetting
in the Peterson and Peterson experiment (Keppel and Underwood).
All of the similar lists confused participants. Test: Give
a bunch of people just one list and a delay. With just one list,
the function is flat (people can remember nearly 100% even after 18 seconds).
The decline was due to proactive interference from all of the letter lists
that had been learned previously.
C. Miller (1956) looked at capacity. The task to measure
capacity is simple. You present a list of information to someone
and ask them to recall it. The amount they can remember is called
their memory span. If you present digits, then you're measuring digit
span. If you present letters, then you're measuring letter span.
You can measure span for words, pictures, binary numbers, etc.
CogLab: We will look at our data from a memory span task.
If you're like most people, five is easy, seven is not too bad, and
nine is really hard or impossible. The overall capacity boils down
to 7 ± 2. For words, it's around five, for binary numbers
Note, this isn't really your capacity. If you can chunk the information,
you can get in a lot more. How you make chunks is based on long-term
memory. For example, a track athlete was able to get an enormous
digit span by chunking the numbers according to a track meet (a good time
for the mile, a good time for the 100 yard dash, etc.). The more
you know about something, the better your chunking will be, and the more
you'll be able to hold in short-term memory.
Example: Chase and Simon (1973) looked at chunking of chess positions
by chess masters. When the boards represented real game positions,
masters remembered a lot more pieces, but still only about eight chunks.
This isn't outside the range of normal spans. What the masters have
is about 10,000 to 100,000 chunks. That's why they remember more.
However, when the board arrangement was random, chunking wouldn't work,
and masters didn't have very impressive memory.
Here's a chunking demonstration.
Demonstration: Try to remember these letters (a test of letter span).
Just a list
A M J Y K F C V D A S R T L E
Chunkable with rhythm
A M J Y K F C V D
A S R T L E
Chunkable with knowledge
Y M C A J F K T V
L S D E R A
Fifteen letters is way beyond your span, but you can easily remember
the last list anyway, even though it's the same letters.
So, when you're trying to remember a lunch order, it's hard because
you have such a limited capacity. If you can chunk the orders, it
should be a lot easier.
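The effect of chunking on load can be sketched by counting how many units a letter string breaks into, given what the person already knows. This is a rough illustration, not a model of memory: the greedy matching rule is an assumption, and the set of known abbreviations (YMCA, JFK, TV, LSD, ERA) is my reading of the grouping intended for the fifteen demonstration letters.

```python
def count_chunks(letters, known_chunks):
    """Greedy left-to-right chunking: at each position take the longest
    known chunk that matches, otherwise a single letter counts as a chunk."""
    by_length = sorted(known_chunks, key=len, reverse=True)
    chunks = []
    i = 0
    while i < len(letters):
        for chunk in by_length:
            if letters.startswith(chunk, i):
                chunks.append(chunk)
                i += len(chunk)
                break
        else:
            # No known chunk starts here, so the letter stands alone.
            chunks.append(letters[i])
            i += 1
    return chunks


# Stand-in for what a person knows (long-term memory); an assumption.
KNOWN = {"YMCA", "JFK", "TV", "LSD", "ERA"}

scrambled = count_chunks("AMJYKFCVDASRTLE", KNOWN)
ordered = count_chunks("YMCAJFKTVLSDERA", KNOWN)
print(len(scrambled), scrambled)
print(len(ordered), ordered)
```

The same fifteen letters come out as fifteen units in the scrambled order and five units in the familiar order, which is within a normal span.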
D. What's the code in short-term memory? Conrad (1964)
used lists of letters to determine that the code in short-term memory is
auditory. There were two lists: BCPTV and FMNSX. Each
one is confusable (acoustically) within itself, but not with the other
list. People's confusions confirmed this. Wickelgren (1965)
also looked at confusions. He would present a list mixing letters and digits.
People's confusions tended to be auditory in nature (for example, B or
3 could replace the Z, but not R or 4).
The code can also be visual. Posner and Keele (1967) presented
pairs like A-a, A-A, a-A, and a-a. There were also different pairs
(a-B, A-b, etc.). People had to make a same-different judgment.
If they had less than 1.5 seconds between letters, judgments for physically
similar letters were faster. So, early coding was in terms of visual
features. After 1.5 seconds, the visual code was replaced with an
auditory code, and physical similarity didn't matter anymore.
Finally, the code can be semantic. Release from proactive interference
can only happen if people are sensitive to the meaning of the items on
the list.
Acoustic recoding (like with Posner and Keele's letters) occurs in
reading as well. If you interfere with this process, it messes up
certain kinds of information that people get from reading. We'll
see more of this when we talk about working memory.
E. Sternberg looked at search of short-term memory. This
is probably the most important experiment in cognitive psychology in terms
of methodology. We'll have to take a minute to discuss how he
did his work, then we'll look at what he found out about short-term memory.
Sternberg used something called the additive factors method.
What is the additive factors method? Set up a situation where you
have some number of hypothesized stages, then manipulate variables that
should influence each of the stages. You can learn about the processing
in each stage by looking at the effect of manipulating variables that
affect it.
Sternberg's task is memory search. The gist is: you memorize
a set of items (the size varies from one item to six items), you're presented
with an item, you say whether or not that item is in the set. The
items are digits from 0 to 9, half the time the answer is yes, half the
time the answer is no. The search always takes place in short-term memory.
What are the stages identified by Sternberg? They start after
the set has been memorized. The test stimulus has just been presented.
The stages are:
1. Stimulus encoding: Time to get the test stimulus into memory.
2. Comparison: Time to check the test stimulus against
the list in memory.
3. Decision: Time to decide if the response is yes or no.
4. Response organization: Time to translate the decision
into a response and get it ready to come out.
What variables affect each stage?
1. Encoding: Manipulate the quality of the stimulus.
2. Comparison: Manipulate the number of items in the set.
3. Decision: Manipulate response type (yes or no).
4. Response organization: Manipulate response probability
(how likely a yes response is over the course of the experiment).
How about an example? Let's look at the comparison stage.
This is the one Sternberg was most interested in investigating. How
do you compare test items to what's in memory? He'll vary the number
of items in the memory set and whether the response is yes or no.
Sternberg came up with a lot of possible search strategies; we'll look at
three (the simplest versions of each):
1. You search in parallel. This means that set size won't
matter, because you search all positions simultaneously (assuming all comparisons
take place at the same rate). This prediction appears in overhead
6. The lines are both flat because you really only have to do one
search for each set size.
2. You search serially, and stop when you find the item (serial,
self-terminating). For situations where the test item is in the
set, you should stop on average after (n + 1) / 2 searches (where n is
the set size). In other words, you'll search halfway through the
set on average before finding the test item and stopping the search.
When the item isn't in the set then you have to search the whole thing
every time, and you'll always stop after checking n items. This
is illustrated in overhead 6. Note that the slope for the yes trials
is half that of the slope for the no trials (reflecting the fact that you
stop about halfway through for yes trials).
3. You search serially and exhaustively (serial, exhaustive).
In other words, whether the response is yes or no you always search to
the end of the set before stopping. In this case you check n items
for both yes and no trials, and they should have the same slope.
This prediction is illustrated in overhead 6.
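The three predictions can be written out as the expected number of comparisons per trial. This is a minimal sketch of the simplest version of each strategy as described in these notes; the function and model names are my own labels, and it assumes the target is equally likely to be in any position of the set.

```python
from fractions import Fraction


def comparisons(model, set_size, target_present):
    """Expected number of comparisons for one trial under each model."""
    n = set_size
    if model == "parallel":
        # All positions are checked at once, so it costs one "search".
        return Fraction(1)
    if model == "serial_exhaustive":
        # Always check every item, whether or not the target is found.
        return Fraction(n)
    if model == "serial_self_terminating":
        if not target_present:
            return Fraction(n)  # nothing to find, so check everything
        # Target equally likely in positions 1..n; stop when it is found.
        return Fraction(sum(range(1, n + 1)), n)  # equals (n + 1) / 2
    raise ValueError(f"unknown model: {model}")


for model in ("parallel", "serial_self_terminating", "serial_exhaustive"):
    yes = [comparisons(model, n, True) for n in range(1, 7)]
    no = [comparisons(model, n, False) for n in range(1, 7)]
    print(model, "yes:", [str(c) for c in yes], "no:", [str(c) for c in no])
```

Reading across set sizes one to six: the parallel values never change, the self-terminating yes values grow half as fast as the no values, and the exhaustive values grow at the same rate for yes and no.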
The results were that the slopes for the two response types were the
same; responding no took a little longer than responding yes. The
lines can be captured with the following function:
RT = 38n + 397 (in msec)
What does this tell us about search? Several things:
1. We know that the search process is serial and exhaustive.
2. We know how long each comparison takes (38 msec). This is
because the slope is 38 msec. For each item, that's how much time is added.
3. We know that the processing time for stages 1, 3, and 4 is
397 msec total. That's because the y-intercept for the yes line is
397 msec. Since that's the theoretical time if there were no items
(no stage 2) it must be the total time for the other three stages.
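To make the decomposition concrete, here is the fitted line split into its two parts for each set size. The 38 and 397 come from the function above; the constant and function names are just labels chosen for this sketch.

```python
SLOPE_MS = 38       # time per comparison (stage 2), from the fitted line
INTERCEPT_MS = 397  # stages 1, 3, and 4 combined, from the fitted line


def predicted_rt(set_size):
    """Predicted reaction time in msec for a memory set of the given size."""
    return SLOPE_MS * set_size + INTERCEPT_MS


for n in range(1, 7):
    comparison_time = SLOPE_MS * n
    print(f"set size {n}: {comparison_time} msec comparing "
          f"+ {INTERCEPT_MS} msec other stages = {predicted_rt(n)} msec")
```

Each extra item in the set adds 38 msec, while the 397 msec for the other three stages stays fixed.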
CogLab: We tried a Sternberg Search task; we can look at our data.
We can also use this method to look at encoding of the stimulus in
short-term memory. Manipulate the quality of the stimulus.
If it's poor quality, it should take longer to encode, so the line for
degraded should be shifted up relative to intact. If degrading the
stimulus affects search, then the slopes should change.
What happened? The degraded line was shifted up, but it had the
same slope. (See your text for a discussion of this experiment.)
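The additive factors logic can be checked by fitting a line to each condition and comparing slopes and intercepts. This is a sketch on made-up numbers: the 70 msec encoding cost is hypothetical, and the reaction times are generated from the 38n + 397 line rather than taken from the experiment.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


set_sizes = [1, 2, 3, 4, 5, 6]
ENCODING_COST_MS = 70  # hypothetical extra encoding time when degraded

# Made-up RTs: intact follows 38n + 397; degraded adds a constant.
intact = [38 * n + 397 for n in set_sizes]
degraded = [rt + ENCODING_COST_MS for rt in intact]

intact_slope, intact_intercept = fit_line(set_sizes, intact)
degraded_slope, degraded_intercept = fit_line(set_sizes, degraded)
print(f"intact:   slope {intact_slope:.1f}, intercept {intact_intercept:.1f}")
print(f"degraded: slope {degraded_slope:.1f}, intercept {degraded_intercept:.1f}")
```

A variable that only affects encoding moves the intercept and leaves the slope alone. If degrading the stimulus had slowed each comparison instead, the fitted slopes would differ.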
V. Working memory. We talked about short-term memory.
The problem with the 1970's version of short-term memory was the static
nature of the memory. Short-term memory is not just for holding stuff
for a short time. There was some awareness of this (Sternberg looked
at searching the contents), but it was still mostly about storage.
The new version of short-term memory replaces storage with storage +
processing.
You hold stuff while you do stuff to it. To distinguish that from
short-term memory, we now call the old short-term memory box working memory.
A. The first thing we'll do is develop a new measure that combines
storage and processing (digit span is just storage). The task is
to read sentences and hold the last word of each in memory. We start with
sets of two. If you can do three sets of two, we go to sets of three.
If you can do three of these, we go to four, etc. Most people do
three or four. Five is possible, six is very rare. The test
is called "reading span." This span correlates well with measures
like SAT score, GPA, comprehension when reading, etc. So it is tapping
into something interesting.
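The scoring procedure described above can be sketched as a small function. It assumes the span is the largest set size at which all three sets were recalled, and that failing even the sets of two scores one less than the starting size; the notes don't say how that case is scored, so that part is a guess. The function name and the sample reader are hypothetical.

```python
def reading_span(results, start_size=2, sets_per_size=3):
    """Score a reading span test.

    results maps set size -> list of booleans, one per set, True if every
    sentence-final word in that set was recalled. Testing moves up one
    size only when all sets at the current size are passed.
    """
    span = start_size - 1  # assumed score if even the smallest sets fail
    size = start_size
    while size in results:
        outcomes = results[size][:sets_per_size]
        if len(outcomes) < sets_per_size or not all(outcomes):
            break
        span = size
        size += 1
    return span


# Hypothetical reader: passes sets of two and three, fails at four.
reader = {2: [True, True, True], 3: [True, True, True], 4: [True, False, True]}
print(reading_span(reader))
```

Because each set requires reading the sentences (processing) while holding the final words (storage), the score reflects both at once, unlike digit span.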
CogLab: Operation Span is
another way to look at working memory capacity. We will look at
our data from that task.
Now that we have a new definition and a way to measure it, what is the
structure of working memory? That's the research program of a person named
Baddeley. Let's look at that in the next section.
B. You can divide working memory into the boxes below:
Each box represents a kind of information that is stored/processed
and each has its own special capacity (kind of like a mental resource).
The capacities can be shared (some) when one gets filled up, but, generally
speaking, when a box gets full, if you want to put more in it, you have
to take something else out of it. The three kinds:
1. Executive: Some capacity is devoted to controlling the
overall processing. Some process has to put stuff in, take stuff
out, assign stuff to boxes, etc. This is the executive.
2. Articulatory loop: A repeating loop that holds auditory
information. If you try to remember a phone number from looking
up to dialing, it's going around in the loop.
3. Visuo-spatial sketchpad: A space for holding image-like
information and spatial relationships.
Note how these three kinds of store/process resemble our discussion
of codes earlier. Think about how this new version of short-term
memory might clear up other mysteries from before.
C. Evidence. Two sorts: Evidence that a kind of store
exists and has certain properties and evidence that the stores are separate.
1. Evidence for stores:
a. Articulatory loop:
1) Conrad showed that letters that are visually confusable but
sound different (B, K, R) are easier to remember than letters that look
different but sound the same (D, C, E). So, when people memorize
lists of letters it seems to be based on the sounds of the letters.
2) Articulatory suppression affects memorizing verbal information.
If you have to repeat "the, the, the..." out loud while memorizing a list
of letters, it is really hard.
3) The length of words affects how many you can recall.
As the words get longer, fewer fit into the repetition loop, and you can't
remember as many.
b. Visuo-spatial sketchpad:
1) When people are asked to mentally scan pictures, the time
it takes to scan reflects real-world sizes/distances. For example,
if I tell you to picture a rabbit by an elephant and then ask about the
rabbit's eyelash, it takes longer to respond than if the rabbit is by a
fly. The reason is the size of the rabbit in the two images.
Or, if I show you a map of an island with some buildings on it (hotel,
souvenir shop) and then I ask you to mentally scan from one building to
the other (go from the hotel to the souvenir shop), the farther apart they
are on the map, the longer it takes you to scan.
2) Shepard has demonstrated that when people are asked to mentally
rotate objects, the farther they have to rotate them, the longer it takes.
3) If you're asked to imagine walking home carrying a cannonball,
it takes longer than walking home carrying a balloon.
2. Evidence for separation. Comes from a dual task methodology.
Ask people to do two things at once. The basic logic is this:
If the two tasks use the same capacity, it will be harder than if they
use different capacities.
a. Loop vs. sketchpad: Take these four tasks: articulatory
suppression (hurts articulatory loop), moving your finger in a
back-and-forth pattern (go back and forth from left to right and back up a
row; hurts visuo-spatial sketchpad), remember a verbal list (articulatory
loop), remember the pattern of dots in an array (visuo-spatial sketchpad).
Arrange the tasks as shown below:
If articulatory loop and visuo-spatial sketchpad are separate, then
the combinations in I and IV should be hard, II and III should be easier.
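The dual-task prediction boils down to one rule: a pairing is hard when both tasks load the same store. This is a minimal sketch; the task labels and the "hard"/"easy" coding are my own shorthand, and it does not assume which pairing carries which Roman numeral.

```python
# Which store each task is assumed to load, following the notes.
STORE = {
    "articulatory suppression": "loop",
    "finger pattern": "sketchpad",
    "verbal list": "loop",
    "dot pattern": "sketchpad",
}


def predicted_difficulty(memory_task, concurrent_task):
    """Two tasks that load the same store should interfere (hard)."""
    same_store = STORE[memory_task] == STORE[concurrent_task]
    return "hard" if same_store else "easy"


for memory_task in ("verbal list", "dot pattern"):
    for concurrent_task in ("articulatory suppression", "finger pattern"):
        print(f"{memory_task:12s} + {concurrent_task:24s} -> "
              f"{predicted_difficulty(memory_task, concurrent_task)}")
```

If the loop and the sketchpad were really one store, every pairing would be predicted to be hard, so the mixed pattern is what supports separation.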
If you add a task that is supposed to reduce the role of the executive
(like producing a string of random letters), that interferes with executive
tasks (decision making). You can incorporate this into the scheme
above to separate the third component.
Sum up: Working memory seems to break into these three kinds
of capacity. The capacity is not just storage, but processing too.
Now we can finally answer Pizza II. You can look around and remember
a phone number because looking is visuo-spatial and the number is auditory.
You can't talk to someone because those tasks are both auditory.
Cognitive Psychology Notes 4