Langston,
Research Methods, Notes 7 -- Observation Research
I. Goals.
A. Kinds of observation.
B. Data.
C. What to do with the data.
II. Kinds of observation.
These are arranged based on the amount of intervention by the observer
(from none to total).
A. None: Naturalistic observation. To the extent
that it's possible, the observer is invisible in the situation. Usually,
you use this when you know very little about the phenomenon being
investigated. Naturalistic observation is used to:
1. Suggest hypotheses for more controlled research. Once
you get some idea of what's happening in a situation, you can begin to
figure out precise relationships.
2. Verify work from laboratory experiments. Make sure that
results from experiments do apply to real world situations.
Plus (+): Can investigate things ethical considerations would
otherwise forbid you from studying. Also frees you from Hawthorne
effects (people behave differently when they know they're being observed).
Visible observation can cause demand characteristics: People try to figure
out what the researcher wants in a situation and then do their best to
make it happen. It can also cause other kinds of reactivity
(like nervousness, embarrassment, anger) that could influence the results.
An invisible observer avoids these problems.
Minus (-): Lack of control limits the conclusions you can draw.
B. Some: Participant observation. The observer joins
the group under study and plays a part in the group's interactions.
This can be:
1. Disguised: Nobody in the group knows an observer is
present.
2. Undisguised: Everyone knows.
+: Can gain access to situations that would otherwise be closed
to you. For example, if you wanted to study behavior in a cult that
only allowed members to participate in its rites, you'd have to join to
see them. Also, you can better introspect about what it feels like
to be in the situation because you've experienced it (helps with perspective
taking).
-: Reactivity: People may behave differently with an observer
present. Even if the observer is disguised, by being in the group, the
observer might unconsciously direct the group's behavior (e.g. road rage).
Also, you lose objectivity. By thinking like a group member you might
develop sympathies for them or antipathies to them that change the observations
you collect.
C. A lot: Structured observation. You set up the
context in which the observation occurs, and then let it happen naturally.
For example, to observe mother-child interaction, you might ask the mother
to read a story to the child. The particular event observed is structured,
but the behavior of the participants is “natural.” Piaget used this
to assess children's cognitive development (think about tasks like conservation
of volume).
+: You can cause infrequent events. It might take a long
time to see how a mother reads to her child if you just wait for it, but
structured observation lets you make it happen. Also, you can test the
limits of
a person's abilities. If Piaget found that a child could conserve
volume, he would move on to a new task to find out just what each child
could do.
-: Not as natural.
D. Total: Field experiments. Set up the antecedents
to an event to completely control the situation. Participants are
usually not aware that they're being observed, even though the observer
is controlling the situation. Studies of bystander apathy usually
work in this way. Some confederates of the observer will pretend
to be in trouble, setting up a situation, and then the participant's amount
of helping is observed.
+: Control is good. If you're careful, you can even make
causal statements.
-: You sacrifice some of the naturalness of this kind of research.
All of the intervention could be affecting what's observed. You'll
see this problem (trading naturalness for control) over and over.
III. Data.
A. What to record: Make a narrative record, which is a
faithful reproduction of everything that happened (video or audio tape
if you can get it). Why record it all? Because you usually
don't know what's important. If you're selective, you might overlook
the most important thing. You can always condense the record once
you have it.
It helps if you follow a research protocol. This is a specific
description of how the measurements are to be made. This helps to
eliminate random or systematic errors in recording the data.
What to record?
1. Setting: What is the environment around the observation?
Include anything that could influence the participant's behavior.
2. Participants: Who's there? Record any characteristics
of the participants that might be relevant.
3. Events: What happens?
4. Behaviors: This is your main data. What do the
people being observed do in response to the events that take place?
Two points about these records:
1. Record them immediately. You should avoid relying on
memory if at all possible.
2. Avoid interpretation. We don't want personal biases
to get in the way of the data. For example, if you see one person
hit another person, you don't want to write down anything about their emotional
states, just the fact that someone got hit.
Recording the data brings up a related issue: What kind of data
have you got? Remember the measurement scales when you're deciding
what to do with your data.
1. Nominal: The data are names or labels for categories.
Frequency analyses (mode, chi-square).
2. Ordinal: The order matters. Order analyses (median,
rank-based non-parametric statistics).
3. Interval: The intervals between the numbers are equal.
All math except ratio comparisons (you can't say one value is "twice"
another).
4. Ratio: The point that is zero on the scale really has
none of the thing being measured. All math, including ratios.
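As a rough sketch of how the scale limits the analysis, here is one
summary per scale in Python. All of the data values are made up for
illustration; none of them come from these notes.

```python
from statistics import mean, median, mode

# Hypothetical observations, invented purely for illustration.
markers = ["books", "bag", "furniture", "books", "books"]  # nominal: labels
ranks = [1, 3, 2, 5, 4, 2, 2]                              # ordinal: order only
temps_f = [68.0, 70.0, 75.0]                               # interval: degrees F
latencies_s = [2.0, 4.0, 6.0]                              # ratio: seconds

nominal_summary = mode(markers)      # most frequent category
ordinal_summary = median(ranks)      # middle rank
interval_summary = mean(temps_f)     # means are fine with equal intervals
# Ratio comparisons only make sense with a true zero point.
ratio_comparison = latencies_s[1] / latencies_s[0]

print(nominal_summary, ordinal_summary, interval_summary, ratio_comparison)
```

Note that taking a mean of the labels in `markers` wouldn't even run,
which is the point: the scale decides which statistics make sense.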
B. Sampling: What do we sample?
1. Behavior: We look for particular behaviors.
a. Time: We observe for a set amount of time, and record
the number of behaviors in that interval. This could cause us to
miss behaviors or catch some in the middle, but it makes some kinds of
observation practical (you can't watch someone 24 hours a day).
b. Event: We record all instances of a particular event,
regardless of when they happen. This requires a precise definition
of the beginning and ending of an event.
2. Situation: If we want to study drinking in bars, we
observe that situation. But, if we want to observe drinking in general,
we'd want to sample numerous situations (bars, parties, picnics, etc.).
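The difference between time sampling and event sampling can be sketched
in a few lines of Python. The timestamps, the observation windows, and
the function name `time_sample` are all hypothetical, chosen only to
show how time sampling can miss behaviors that event sampling would catch.

```python
# Hypothetical times (minutes into a one-hour session) at which the
# target behavior occurred. Invented for illustration.
event_times = [1.5, 3.0, 12.2, 14.8, 29.0, 41.7, 44.1, 58.3]

def time_sample(times, windows):
    """Count behaviors that fall inside each window [start, end)."""
    return [sum(start <= t < end for t in times) for start, end in windows]

# Time sampling: watch for three 5-minute windows only.
windows = [(0, 5), (20, 25), (40, 45)]
counts = time_sample(event_times, windows)

# Event sampling: record every instance, whenever it happens.
event_count = len(event_times)

missed = event_count - sum(counts)
print(counts, event_count, missed)
```

With these made-up numbers, the three windows catch half of the
behaviors, but they only require 15 minutes of watching instead of 60.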
IV. What to do with the data.
A. Descriptive statistics. All of the usual descriptive statistics
apply (keeping the measurement scale of your data in mind).
B. Chi square (contingency tables). You can compare distributions
of groups of people observed. For example, I might observe three
territorial markers in men and women and see if the two sexes are the same
in their use of those territorial markers. Note that for this kind
of analysis both variables are usually nominal. Here are some sample
data:
| fo    | Books spread out | Book bag on chair | Moves furniture | Total     |
| Men   | 10 (40%)         | 2 (8%)            | 13 (52%)        | 25 (100%) |
| Women | 10 (40%)         | 11 (44%)          | 4 (16%)         | 25 (100%) |
| Total | 20 (40%)         | 13 (26%)          | 17 (34%)        | 50 (100%) |
So, 10 of the men I observed (40%) expressed territoriality by spreading
out their books. Only two men put their book bag in a chair to express
territoriality. It looks like the type of territoriality isn't independent
of gender. Men seem more likely to move the furniture and women seem
more likely to put a bag in a chair. A chi-square test of these data
supports that conclusion, X^2(2) = 10.98, p < .05. This is called
a chi-square (X^2) test of independence (are the two distributions independent?).
How is chi-square computed? Easy as pie. Make a table like
this:
| fo | fe  | fo-fe | (fo-fe)^2 | (fo-fe)^2/fe |
| 10 | 10  | 0     | 0         | 0            |
| 2  | 6.5 | -4.5  | 20.25     | 3.11         |
| 13 | 8.5 | 4.5   | 20.25     | 2.38         |
| 10 | 10  | 0     | 0         | 0            |
| 11 | 6.5 | 4.5   | 20.25     | 3.11         |
| 4  | 8.5 | -4.5  | 20.25     | 2.38         |
|    |     |       |           | X^2 = 10.98  |
fo is the observed frequency in each condition, fe is the expected frequency.
Where does expected frequency come from? A simple formula.
For each cell in the table above, take the (row total * the column total)/the
grand total. Fill that into the expected frequency table below.
For example, for women moving furniture, it's (25 * 17) / 50 = 8.5.
I filled that number in below.
| fe    | Books spread out | Book bag on chair | Moves furniture |
| Men   | 10               | 6.5               | 8.5             |
| Women | 10               | 6.5               | 8.5             |
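Written out as formulas, the expected frequency rule and the chi-square
sum look like this. The symbols f_o and f_e are just the fo and fe from
the tables; the notation is a sketch, not something from the original notes.

```latex
\[
f_e = \frac{(\text{row total}) \times (\text{column total})}{\text{grand total}},
\qquad
f_e(\text{Women, moves furniture}) = \frac{25 \times 17}{50} = 8.5
\]
\[
\chi^2 = \sum_{\text{cells}} \frac{(f_o - f_e)^2}{f_e}
\]
```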
We want to look at how much the observed and expected frequencies differ.
To get chi-square, add up the last column of the computation table. The
more the observed and expected frequencies differ, the bigger the number
will be, and the more likely it is that the two variables are not
independent. To tell if the differences are significant, look up
chi-square in a table
with the right df [(number of rows - 1) * (number of columns - 1)].
Here, df = (2 - 1) * (3 - 1) = 2.
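The whole computation can be sketched in plain Python using the
territorial marker counts from the notes. The function name
`chi_square_independence` is made up for this example, and the critical
value 5.99 is the standard table value for df = 2 at the .05 level.

```python
# Observed frequencies from the notes (columns: books spread out,
# book bag on chair, moves furniture).
observed = {
    "Men": [10, 2, 13],
    "Women": [10, 11, 4],
}

def chi_square_independence(table):
    """Return (chi-square, df, expected frequencies) for a contingency table."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    # fe = (row total * column total) / grand total, for every cell
    expected = [[rt * ct / grand_total for ct in col_totals]
                for rt in row_totals]
    # chi-square = sum over all cells of (fo - fe)^2 / fe
    chi2 = sum((fo - fe) ** 2 / fe
               for obs_row, exp_row in zip(rows, expected)
               for fo, fe in zip(obs_row, exp_row))
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

chi2, df, expected = chi_square_independence(observed)
CRITICAL_05_DF2 = 5.99  # table value for df = 2, alpha = .05
print(f"X^2({df}) = {chi2:.2f}, significant: {chi2 > CRITICAL_05_DF2}")
```

Without rounding along the way, the sum comes out to about 11.00. The
10.98 in the table comes from adding up cell values that were already
cut to two decimals. Either way, the value is well past 5.99.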
If my theory predicts a distribution, I can use chi-square to see if
the observed distribution fits the prediction. Let's say I'm interested
in students' opinions about parking on campus. I predict that 30%
will feel that it's too hard to park but have no preferred solution, 20%
will feel that parking will improve if Freshmen can't park on campus, 45%
would like a parking garage, and 5% would be willing to pay for a garage
if one is built (students could only endorse one of these opinions).
I measure opinion and find 12% think it's too hard to park but have
no preferred solution, 30% want to prevent Freshmen from parking, 47% want
a garage, and 11% will pay for a garage. I can use the predictions
as expected values to compute a chi-square. If it's small, then there's
no evidence against my theory. If it's large, then my theory may
be wrong. This is called a goodness of fit test (how well do the data
fit the model?). Here df is the number of categories - 1. For the data
above, X^2(3) = 23.09, p < .05, so the observed distribution does not fit
my prediction.
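Here is a sketch of the goodness of fit computation in Python. The notes
give only percentages, so this assumes 100 students were surveyed (that
sample size is an assumption, but it does reproduce the 23.09 reported
above). The critical value 7.81 is the standard table value for df = 3
at the .05 level.

```python
# ASSUMPTION: N = 100 students, so each percentage is also a count.
categories = ["too hard, no solution", "no Freshman parking",
              "want a garage", "will pay for a garage"]
expected_counts = [30, 20, 45, 5]   # predicted by the theory
observed_counts = [12, 30, 47, 11]  # what was measured

# chi-square = sum over categories of (fo - fe)^2 / fe
contributions = [(fo - fe) ** 2 / fe
                 for fo, fe in zip(observed_counts, expected_counts)]
chi2_fit = sum(contributions)
df_fit = len(categories) - 1

CRITICAL_05_DF3 = 7.81  # table value for df = 3, alpha = .05
for name, c in zip(categories, contributions):
    print(f"{name}: {c:.2f}")
print(f"X^2({df_fit}) = {chi2_fit:.2f}, significant: {chi2_fit > CRITICAL_05_DF3}")
```

Printing the per-category contributions shows where the misfit comes
from: most of the chi-square is from the first and last categories, while
the garage prediction (45% vs. 47%) contributes almost nothing.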
Will Langston