Langston, Research Methods, Notes 7 -- Observation Research

I.  Goals.
A.  Kinds of observation.
B.  Data.
C.  What to do with the data.
II.  Kinds of observation.
These are arranged based on the amount of intervention by the observer (from none to total).
A.  None:  Naturalistic observation.  To the extent that it's possible, the observer is invisible in the situation.  This approach is most useful when you know very little about the phenomenon being investigated.  It is used to:
1.  Suggest hypotheses for more controlled research.  Once you get some idea of what's happening in a situation, you can begin to figure out precise relationships.
2.  Verify work from laboratory experiments.  Make sure that results from experiments do apply to real world situations.
Plus (+):  Can investigate things ethical considerations would otherwise forbid you from studying.  Because the observer is invisible, it also frees you from the problems that observing can cause:  Hawthorne effects (people behave differently when they're being observed), demand characteristics (people try to figure out what the researcher wants in a situation and then do their best to make it happen), and other kinds of reactivity (like nervousness, embarrassment, anger) that could influence the results.
Minus (-):  Lack of control limits conclusions.
B.  Some:  Participant observation.  The observer joins the group under study and plays a part in the group's interactions.  This can be:
1.  Disguised:  Nobody in the group knows an observer is present.
2.  Undisguised:  Everyone knows.
+:  Can gain access to situations that would otherwise be closed to you.  For example, if you wanted to study behavior in a cult that only allowed members to participate in its rites, you'd have to join to see them.  Also, you can better introspect about what it feels like to be in the situation because you've experienced it (helps with perspective taking).
-:  Reactivity:  People will behave differently with an observer present.  Even if the observation is disguised, the observer, just by being in the group, might unconsciously direct the group's behavior (e.g. road rage).  Also, you lose objectivity.  By thinking like a group member you might develop sympathies for the group or antipathies to it that change the observations you collect.
C.  A lot:  Structured observation.  You set up the context in which the observation occurs, and then let it happen naturally.  For example, to observe mother-child interaction, you might ask the mother to read a story to the child.  The particular event observed is structured, but the behavior of the participants is “natural.”  Piaget used this to assess children's cognitive development (think about tasks like conservation of volume).
+:  You can cause infrequent events.  It might take a long time to see how a mother reads to her child if you just wait for it, but this allows us to make it happen.  Also, you can test the limits of a person's abilities.  If Piaget found that a child could conserve volume, he would move on to a new task to find out just what each child could do.
-:  Not as natural.
D.  Total:  Field experiments.  Set up the antecedents to an event to completely control the situation.  Participants are usually not aware that they're being observed, even though the observer is controlling the situation.  Studies of bystander apathy usually work in this way.  Some confederates of the observer will pretend to be in trouble, setting up a situation, and then the participant's amount of helping is observed.
+:  Control is good.  If you're careful, you can even make causal statements.
-:  You sacrifice some of the naturalness of this kind of research.  All of the intervention could be affecting what's observed.  You'll see this problem (trading naturalness for control) over and over.
III.  Data.
A.  What to record:  Make a narrative record, which is a faithful reproduction of everything that happened (video or audio tape if you can get it).  Why record it all?  Because you usually don't know what's important.  If you're selective, you might overlook the most important thing.  You can always condense the record once you have it.
It helps if you follow a research protocol.  This is a specific description of how the measurements are to be made.  This helps to eliminate random or systematic errors in recording the data.
The record should cover:
1.  Setting:  What is the environment around the observation?  Include anything that could influence the participant's behavior.
2.  Participants:  Who's there?  Record all of their characteristics that might be relevant.
3.  Events:  What happens?
4.  Behaviors:  This is your main data.  What do the people being observed do in response to the events that take place?
Two points about these records:
1.  Record them immediately.  You should avoid relying on memory if at all possible.
2.  Avoid interpretation.  We don't want personal biases to get in the way of the data.  For example, if you see one person hit another person, you don't want to write down anything about their emotional states, just the fact that someone got hit.
Recording the data brings up a related issue:  What kind of data have you got?  Remember the measurement scales when you're deciding what to do with your data.
1.  Nominal:  The data are names or labels for categories.  Frequency analyses (mode, chi-square).
2.  Ordinal:  The order matters.  Order analyses (median, rank-based non-parametric statistics).
3.  Interval:  The intervals between the numbers are equal.  All math but ratio comparisons.
4.  Ratio:  The point that is zero on the scale really has none of the thing being measured.  All math.
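As a rough illustration of how the scale limits the analysis, here is a minimal Python sketch.  All of the data values and variable names are made up for the example; only the pairing of scale and statistic comes from the list above.

```python
from statistics import mean, median, mode

# All values below are hypothetical, made up for illustration only.
marker_type = ["books", "bag", "books", "furniture", "books"]  # nominal
helpfulness_rank = [1, 2, 2, 3, 5]                             # ordinal
temperature_f = [68, 70, 72]                                   # interval
seconds_at_table = [30, 60]                                    # ratio

# Nominal: we can only count how often each category occurs.
most_common_marker = mode(marker_type)

# Ordinal: order is meaningful, so the median makes sense.
middle_rank = median(helpfulness_rank)

# Interval: equal intervals, so adding and averaging make sense.
average_temp = mean(temperature_f)

# Ratio: a true zero, so "twice as long" is a meaningful statement.
times_longer = seconds_at_table[1] / seconds_at_table[0]

print(most_common_marker, middle_rank, average_temp, times_longer)
```

Note that nothing stops the computer from averaging the ranks or the category codes; deciding whether the result means anything is up to you.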
B.  Sampling:  What do we sample?
1.  Behavior:  We look for particular behaviors.
a.  Time:  We observe for a set amount of time, and record the number of behaviors in that interval.  This could cause us to miss behaviors or catch some in the middle, but it makes some kinds of observation practical (you can't watch someone 24 hours a day).
b.  Event:  We record all instances of a particular event, regardless of when they happen.  This requires a precise definition of the beginning and ending of an event.
2.  Situation:  If we want to study drinking in bars, we observe that situation.  But, if we want to observe drinking in general, we'd want to sample numerous situations (bars, parties, picnics, etc.).
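The difference between time sampling and event sampling of behavior can be sketched in a few lines of Python.  The event times, the observation windows, and the function name `time_sample` are all hypothetical, chosen only to show how time sampling can miss behaviors that event sampling would catch.

```python
# Hypothetical record: minutes (from the start of the session) at which
# the target behavior occurred.  Made up for illustration only.
event_times = [3, 12, 14, 31, 47, 52, 58]

def time_sample(times, windows):
    """Count the behaviors that fall inside each (start, end) window."""
    return [sum(start <= t < end for t in times) for start, end in windows]

# Time sampling: observe only during two 10-minute windows.
windows = [(0, 10), (30, 40)]
counts_per_window = time_sample(event_times, windows)

# Event sampling: record every instance, whenever it happens.
total_events = len(event_times)

print(counts_per_window, sum(counts_per_window), total_events)
```

In this made-up session the two windows catch only 2 of the 7 behaviors, which is the price paid for not having to watch the whole hour.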
IV.  What to do with the data.
A.  Descriptive statistics.  All of the usual descriptive statistics apply, as long as they suit the measurement scale of the data.
B.  Chi-square (contingency tables).  You can compare distributions of groups of people observed.  For example, I might record which of three territorial markers men and women use and see whether the two sexes use those markers in the same way.  Note that for this kind of analysis both variables are usually nominal.  Here are some sample data (fo = observed frequency; percentages are of each row total):
fo       Books spread out   Book bag on chair   Moves furniture   Total
Men      10 (40%)           2 (8%)              13 (52%)          25 (100%)
Women    10 (40%)           11 (44%)            4 (16%)           25 (100%)
Total    20 (40%)           13 (26%)            17 (34%)          50 (100%)
So, 10 of the men I observed (40%) expressed territoriality by spreading out their books.  Only two men put their book bag in a chair to express territoriality.  It looks like the type of territoriality isn't independent of gender.  Men seem more likely to move the furniture and women seem more likely to put a bag in a chair.  A chi-square test of these data supports that conclusion, X^2(2) = 10.98, p < .05.  This is called a chi-square (X^2) test of independence (are the two distributions independent?).
How is chi-square computed?  Easy as pie.  Make a table like this:
fo    fe     fo-fe   (fo-fe)^2   (fo-fe)^2/fe
10    10     0       0           0
2     6.5    -4.5    20.25       3.11
13    8.5    4.5     20.25       2.38
10    10     0       0           0
11    6.5    4.5     20.25       3.11
4     8.5    -4.5    20.25       2.38
                     X^2 =       10.98
fo is the observed frequency in each cell, fe is the expected frequency.  Where does expected frequency come from?  A simple formula.  For each cell in the observed frequency table, take (row total * column total) / grand total.  For example, for women moving furniture, it's (25 * 17) / 50 = 8.5.  The full set of expected frequencies is:
fe       Books spread out   Book bag on chair   Moves furniture
Men      10                 6.5                 8.5
Women    10                 6.5                 8.5
Basically, we want to look at how the observed and expected frequencies differ.  To get chi-square, add up the last column of the computation table, the (fo - fe)^2 / fe values.  The more the observed and expected frequencies differ, the bigger the number will be, and the more likely it is that the two variables are not independent.  To tell if the differences are significant, look up chi-square in a table with the right df [(number of rows - 1) * (number of columns - 1)].  Here that's (2 - 1) * (3 - 1) = 2.
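The computation described above can be sketched in Python.  This is a minimal illustration, not a replacement for a statistics package; the function name `chi_square_independence` is made up here, and the observed counts are the territorial-marker data from these notes.

```python
# Observed frequencies (fo) from the territorial-marker example.
# Columns: books spread out, book bag on chair, moves furniture.
observed_table = [
    [10, 2, 13],   # men
    [10, 11, 4],   # women
]

def chi_square_independence(table):
    """Return (chi-square, df, expected frequencies) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi_sq = 0.0
    expected = []
    for r, row in enumerate(table):
        expected_row = []
        for c, fo in enumerate(row):
            # fe = (row total * column total) / grand total
            fe = row_totals[r] * col_totals[c] / grand_total
            expected_row.append(fe)
            chi_sq += (fo - fe) ** 2 / fe
        expected.append(expected_row)

    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi_sq, df, expected

chi_sq, df, expected = chi_square_independence(observed_table)
print(expected)
print(round(chi_sq, 2), df)
```

Carried out without rounding the individual cells, the sum comes to about 11.00 rather than 10.98; the small difference comes from adding up cell values that were already rounded to two decimals.  Either way the value is well above 5.99, the critical value for df = 2 at the .05 level.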
If my theory predicts a distribution, I can use chi-square to see if the observed distribution fits the prediction.  Let's say I'm interested in students' opinions about parking on campus.  I predict that 30% will feel that it's too hard to park but have no preferred solution, 20% will feel that parking will improve if Freshmen can't park on campus, 45% would like a parking garage, and 5% would be willing to pay for a garage if one is built (students could only endorse one of these opinions).  I measure opinion and find 12% think finding a spot is too hard but have no preferred solution, 30% want to prevent Freshmen from parking, 47% want a garage, and 11% will pay for a garage.  I can use the predictions as expected values to compute a chi-square, with df = number of categories - 1.  If it's small, then there's no evidence against my theory.  If it's large, then my theory may be wrong.  This is called a goodness of fit test (how do the data fit with the model?).  For the data above, treating the percentages as counts out of 100 students, X^2(3) = 23.09, which is significant, so the data do not fit my prediction.
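Here is a minimal Python sketch of the goodness of fit computation for the parking example.  It assumes a sample of 100 students, so that each percentage can be used directly as a count; the notes do not state the sample size, but that assumption reproduces the reported value.  The function name `chi_square_goodness_of_fit` is made up for the example.

```python
def chi_square_goodness_of_fit(observed, expected):
    """Sum (fo - fe)^2 / fe over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# Assumption: N = 100 students, so percentages equal counts.
# Categories: too hard (no solution), no Freshman parking,
#             want a garage, would pay for a garage.
predicted_counts = [30, 20, 45, 5]    # fe, from the theory
observed_counts = [12, 30, 47, 11]    # fo, from the survey

fit_chi_sq = chi_square_goodness_of_fit(observed_counts, predicted_counts)
fit_df = len(observed_counts) - 1     # number of categories - 1

print(round(fit_chi_sq, 2), fit_df)
```

The result, about 23.09 with df = 3, is far above 7.81, the critical value for df = 3 at the .05 level.  Note that chi-square has to be computed on counts, so with a different sample size the percentages would first have to be converted to frequencies.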

Research Methods Notes 7
Will Langston
