Coding and Entering Data

Coding is the development and use of a language for transferring data from the instrument used in data collection to a "codebook," or directly to the computer, in a form suited to data analysis and reporting results. For the purpose of this course, this means deciding how to represent the data you collected on questionnaires so that the SPSS program can analyze them effectively. Since numbers usually work best for representing the different responses made by the people you surveyed, the task is to find a way to code every response with a number.

Entering data is then done by putting the appropriate number for each datum (the singular of data) into an appropriate place in a data file. The first computer assignment requires you to build a data file which contains coded responses from at least fifty of the people who completed your questionnaire.
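As a rough illustration of what "an appropriate place in a data file" means, the sketch below lays out coded responses with one row per respondent and one column per variable. It is written in Python rather than SPSS, and the variable names (id, att1, beh1, gender, age) and the coded values are hypothetical, chosen only for the example.

```python
import csv
import io

# Hypothetical variable names and coded values, for illustration only.
variable_names = ["id", "att1", "beh1", "gender", "age"]
coded_rows = [
    [1, 2, 1, 1, 22],  # respondent 1
    [2, 5, 2, 2, 34],  # respondent 2
]

def build_data_file(names, rows):
    """Return CSV text: a header row, then one line per respondent."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(names)
    writer.writerows(rows)
    return buffer.getvalue()

data_file = build_data_file(variable_names, coded_rows)
print(data_file)
```

Each datum sits where its respondent's row meets its variable's column, which is the same layout you will see in a data entry grid.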

There are some concepts which students need to be very clear about before they begin the coding process. The most important is the difference between variables and values.

Variables are "logical groupings of attributes" (Rubin and Babbie). Values are the attributes thus grouped. So the variable gender has two attributes or values: female and male (the way it is usually constructed in our society). The variable age has many possible values: 18, 19, 20, 21, 34, 67, etc. It is important to remember that a variable needs more than one value. If all your respondents are social work majors, then major is not a variable. Don't waste your time coding and entering those data.
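The "more than one value" rule can be checked mechanically. The following is a minimal sketch in Python, using three made-up respondents; the function name is_variable and the sample answers are assumptions for illustration, not part of the course materials.

```python
# Hypothetical responses, for illustration only.
responses = [
    {"gender": "female", "age": 19, "major": "social work"},
    {"gender": "male", "age": 34, "major": "social work"},
    {"gender": "female", "age": 21, "major": "social work"},
]

def is_variable(rows, name):
    """Return True when the item took more than one distinct value."""
    return len({row[name] for row in rows}) > 1

# Keep only the items that actually vary across respondents.
worth_coding = [name for name in responses[0] if is_variable(responses, name)]
print(worth_coding)  # ['gender', 'age']
```

Here major drops out because every respondent gave the same answer, so there is nothing to analyze.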

So each variable which you measured in your survey needs to be coded. If you start with an attitude item, such as the first item on the demonstration questionnaire, "My heart goes out to people in wheelchairs," you need to have a code for each of the possible values: strongly agree, agree, undecided, disagree, and strongly disagree. One way to do this, which I think is the easiest way, is to start with one and use as many numbers as you need, so the coding becomes: 1 for "strongly agree," 2 for "agree," 3 for "undecided," 4 for "disagree," and 5 for "strongly disagree." These codes are entirely arbitrary. You can reverse them or use any numbering scheme you want, but you need to be sure that you remember what the codes mean.
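The coding scheme just described amounts to a lookup table from each value to its number. The sketch below expresses it in Python; the names AGREEMENT_CODES, code_response, and reverse_codes are hypothetical, and the reversal shown is only one way to flip the scheme.

```python
# The five values and their codes, as given in the text.
AGREEMENT_CODES = {
    "strongly agree": 1,
    "agree": 2,
    "undecided": 3,
    "disagree": 4,
    "strongly disagree": 5,
}

def code_response(answer, codes):
    """Look up the numeric code for one written response."""
    return codes[answer.strip().lower()]

def reverse_codes(codes):
    """Flip the scheme so the first value gets the highest code."""
    top = max(codes.values())
    low = min(codes.values())
    return {value: top + low - code for value, code in codes.items()}

print(code_response("Agree", AGREEMENT_CODES))                 # 2
print(code_response("Agree", reverse_codes(AGREEMENT_CODES)))  # 4
```

Either scheme works for analysis, provided the one you used is written down.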

A good way to remember is to prepare a codesheet. This is best done by using a blank copy of the questionnaire which you used to collect data. It will give you a place to write variable names, which are explained on the handout for the first computer assignment. Also, write down the numbers you use as codes for each value of every variable.

Variables that have the same values will be coded identically. Since all of the attitude items on the demonstration questionnaire have the same choices, they would all be coded the same: 1 for "strongly agree," 2 for "agree," 3 for "undecided," 4 for "disagree," and 5 for "strongly disagree." The next group of items, the behavior variables, are all Yes/No questions. These can be coded 1 for "Yes" and 2 for "No." It is OK for 1 to be the code for "strongly agree" on one variable and to mean "Yes" on another. Later we will tell the computer what everything means.
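One way to picture this is a codesheet that points each variable at its coding scheme, so that items with the same choices share one scheme. The Python sketch below assumes hypothetical variable names (att1, att2, beh1); your own names come from the handout for the first computer assignment.

```python
AGREEMENT = {
    "strongly agree": 1,
    "agree": 2,
    "undecided": 3,
    "disagree": 4,
    "strongly disagree": 5,
}
YES_NO = {"yes": 1, "no": 2}

# Hypothetical variable names; attitude items share one scheme,
# behavior items share another.
CODEBOOK = {"att1": AGREEMENT, "att2": AGREEMENT, "beh1": YES_NO}

def code_questionnaire(answers):
    """Translate one respondent's written answers into numeric codes."""
    return {name: CODEBOOK[name][answer.strip().lower()]
            for name, answer in answers.items()}

row = code_questionnaire(
    {"att1": "Strongly agree", "att2": "Disagree", "beh1": "Yes"}
)
print(row)  # {'att1': 1, 'att2': 4, 'beh1': 1}
```

Note that att1 and beh1 both come out as 1, yet the codesheet keeps their meanings apart.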

It is usually the case that the values for each demographic variable will be coded differently, so that gender has the codes 1 for "Female" and 2 for "Male" (they can be in the other order if that's the way they are on your instrument); marital status would be: 1 for "Single," 2 for "Married," etc. A variable such as age, which is already represented by numbers, uses those numbers as the code, so, for example, the written-in age of "22" is coded 22.

Other open-ended responses, such as major, are a little more complicated. Often we just code the first response we get as 1, the second as 2, and so on. However, you may want to reserve the low numbers for frequently occurring values. For example, if you have several social work majors in your sample, use 1 for "social work" since it is easy to remember. Open-ended items that have more individualistic responses, such as "What else do you think about this topic?" need to be coded more creatively. Remember: write things down! Keep a record of these codes on your codesheet so that you can remember what they mean.
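The two numbering strategies for an open-ended item like major can be sketched as follows. This is an illustrative Python example with made-up responses; the function names are hypothetical, and it assumes answers that differ only in capitalization or spacing should count as the same value.

```python
from collections import Counter

def codes_by_first_appearance(responses):
    """Code the first distinct response as 1, the second as 2, and so on."""
    codes = {}
    for answer in responses:
        key = answer.strip().lower()
        if key not in codes:
            codes[key] = len(codes) + 1
    return codes

def codes_by_frequency(responses):
    """Give the lowest codes to the most frequently occurring values."""
    counts = Counter(answer.strip().lower() for answer in responses)
    return {value: rank
            for rank, (value, _) in enumerate(counts.most_common(), start=1)}

# Hypothetical written-in majors.
majors = ["Psychology", "Social Work", "social work", "Nursing", "Social work"]

print(codes_by_first_appearance(majors))
# {'psychology': 1, 'social work': 2, 'nursing': 3}
print(codes_by_frequency(majors))
# {'social work': 1, 'psychology': 2, 'nursing': 3}
```

Whichever strategy you use, the resulting table of values and codes is exactly what belongs on your codesheet.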

Once you are clear about the coding process, you are ready to begin data entry. The first computer assignment will help you with this. Be sure to ask in class if you have any questions as you begin that assignment.