analyzed the data using a test designed for a measurement variable, those two sleepy
isopods would cause the average time for males to be much greater than for females, and
the difference might look statistically significant. When converted to ranks and analyzed
using a non-parametric test, the last and next-to-last isopods would have much less
influence on the overall result, and you would be less likely to get a misleadingly
“significant” result if there really isn’t a difference between males and females.
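The effect of converting to ranks can be sketched in a few lines of Python. This is only an illustration: the times below are hypothetical, not data from the isopod experiment, and the helper average_ranks is a name invented for this sketch. It ranks all the observations together, giving tied values the average of their ranks.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end of the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

# Hypothetical times in seconds; two "sleepy" males have very long times.
male_times = [12.0, 15.0, 19.0, 300.0, 420.0]
female_times = [11.0, 14.0, 16.0, 18.0, 21.0]

# Rank both sexes together, then split the ranks back into the two groups.
all_ranks = average_ranks(male_times + female_times)
male_ranks = all_ranks[:len(male_times)]
female_ranks = all_ranks[len(male_times):]

print("mean time:", mean(male_times), mean(female_times))
print("mean rank:", mean(male_ranks), mean(female_ranks))
```

With these made-up numbers, the two slow males pull the male mean time far above the female mean, but after ranking they are simply the 9th and 10th observations, so the mean ranks of the two sexes are much closer together.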
Some variables are impossible to measure objectively with instruments, so people are
asked to give a subjective rating. For example, pain is often measured by asking a person
to put a mark on a 10-cm scale, where 0 cm is “no pain” and 10 cm is “worst possible
pain.” This is not a ranked variable; it is a measurement variable, even though the
“measuring” is done by the person’s brain. For the purpose of statistics, the important
thing is that it is measured on an “interval scale”; ideally, the difference between pain
rated 2 and 3 is the same as the difference between pain rated 7 and 8. Pain would be a
ranked variable if the pains at different times were compared with each other; for
example, if someone kept a pain diary and then at the end of the week said “Tuesday was
the worst pain, Thursday was second worst, Wednesday was third, etc....” These rankings
are not an interval scale; the difference between Tuesday and Thursday may be much
bigger, or much smaller, than the difference between Thursday and Wednesday.
Just as with measurement variables, if a ranked variable has a very small number of possible
values, it would be better to treat it as a nominal variable. For
example, if you make a honeybee sting people on one arm and a yellowjacket sting people
on the other arm, then ask them “Was the honeybee sting the most painful or the second
most painful?”, you are asking them for the rank of each sting. But you should treat the
data as a nominal variable, one which has three values (“honeybee is worse” or
“yellowjacket is worse” or “subject is so mad at your stupid, painful experiment that they
refuse to answer”).
Categorizing
It is possible to convert a measurement variable to a nominal variable, dividing
individuals up into two or more classes based on ranges of the variable. For example, if
you are studying the relationship between levels of HDL (the “good cholesterol”) and
blood pressure, you could measure the HDL level, then divide people into two groups,
“low HDL” (less than 40 mg/dl) and “normal HDL” (40 or more mg/dl) and compare the
mean blood pressures of the two groups, using a nice simple two-sample t–test.
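As a rough sketch of what dichotomizing looks like in practice, the Python below splits people at the 40 mg/dl cutoff and compares the mean blood pressure of the two groups. The (HDL, blood pressure) pairs are hypothetical values made up for illustration, and the function name dichotomize is an assumption of this sketch; the t–test itself is left out.

```python
from statistics import mean

LOW_HDL_CUTOFF = 40  # mg/dl; "low HDL" is anything below this value

def dichotomize(hdl_mg_dl, cutoff=LOW_HDL_CUTOFF):
    """Turn a measured HDL level into one of two nominal categories."""
    return "low HDL" if hdl_mg_dl < cutoff else "normal HDL"

# Hypothetical (HDL in mg/dl, systolic blood pressure in mm Hg) pairs.
people = [(28, 142), (35, 138), (39, 131), (40, 129),
          (52, 124), (61, 118), (75, 121)]

# Collect the blood pressures of each HDL category.
groups = {"low HDL": [], "normal HDL": []}
for hdl, bp in people:
    groups[dichotomize(hdl)].append(bp)

group_means = {name: mean(bps) for name, bps in groups.items()}
print(group_means)
```

Note that after the split, a person with an HDL of 28 and a person with an HDL of 39 are treated as identical, while 39 and 40 land in different groups.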
Converting measurement variables to nominal variables (“dichotomizing” if you split
into two groups, “categorizing” in general) is common in epidemiology, psychology, and
some other fields. However, there are several problems with categorizing measurement
variables (MacCallum et al. 2002). One problem is that you’d be discarding a lot of
information; in our blood pressure example, you’d be lumping together everyone with
HDL from 0 to 39 mg/dl into one group. This reduces your statistical power, decreasing
your chances of finding a relationship between the two variables if there really is one.
Another problem is that it would be easy to consciously or subconsciously choose the
dividing line (“cutpoint”) between low and normal HDL that gave an “interesting” result.
For example, if you did the experiment thinking that low HDL caused high blood
pressure, and a couple of people with HDL between 40 and 45 happened to have high
blood pressure, you might put the dividing line between low and normal at 45 mg/dl.
This would be cheating, because it would increase the chance of getting a “significant”
difference if there really isn’t one.
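A small simulation can make the cutpoint problem concrete. The sketch below rests on several assumptions that are not in the text: HDL and blood pressure are drawn from independent normal distributions with invented means and standard deviations (so there is no real relationship), the candidate cutpoints are every whole number from 35 to 45 mg/dl, and |t| > 2.0 is used as a rough stand-in for "significant at about P = 0.05." It compares how often a fixed cutpoint of 40 gives a "significant" result with how often the best-looking cutpoint does.

```python
import random

def welch_t(a, b):
    """Welch's t statistic for two samples; 0.0 if a group is too small to test."""
    if len(a) < 2 or len(b) < 2:
        return 0.0
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    se = (var_a / len(a) + var_b / len(b)) ** 0.5
    return (mean_a - mean_b) / se if se > 0 else 0.0

def t_at_cutpoint(hdl, bp, cut):
    """Split blood pressures into low/normal HDL groups at `cut` and compare them."""
    low = [p for h, p in zip(hdl, bp) if h < cut]
    normal = [p for h, p in zip(hdl, bp) if h >= cut]
    return welch_t(low, normal)

def false_positive_rates(n_sims=1000, n_people=40, seed=1):
    """Return (rate with cutpoint fixed at 40, rate with the best of 35..45)."""
    rng = random.Random(seed)
    cutpoints = range(35, 46)  # assumed candidates; includes the fixed value 40
    threshold = 2.0            # assumed rough critical value, not an exact one
    fixed_hits = shopped_hits = 0
    for _ in range(n_sims):
        # Assumed distributions; blood pressure is generated independently of HDL.
        hdl = [rng.gauss(50, 12) for _ in range(n_people)]
        bp = [rng.gauss(120, 15) for _ in range(n_people)]
        fixed = abs(t_at_cutpoint(hdl, bp, 40))
        shopped = max(abs(t_at_cutpoint(hdl, bp, c)) for c in cutpoints)
        if fixed > threshold:
            fixed_hits += 1
        if shopped > threshold:
            shopped_hits += 1
    return fixed_hits / n_sims, shopped_hits / n_sims

fixed_rate, shopped_rate = false_positive_rates()
print("fixed cutpoint:", fixed_rate, " best of 35-45:", shopped_rate)
```

Because 40 is one of the candidates, picking the most extreme cutpoint can never give fewer "significant" results than always using 40; the exact rates depend on the assumed distributions and the random seed.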
To illustrate the problem with categorizing, let’s say you wanted to know whether tall
basketball players weigh more than short players. Here’s data for the 2012-2013 men’s
basketball team at Morgan State University: