Thursday, April 26, 2007

Actually Useful Statistical Tests

I always get a kick out of any dire analysis that includes words similar to, “If these trends continue …,” or “Trends indicate …,” because such claims are usually based on very short-term views of the issue at stake. Of course, it doesn’t have to be couched in those precise phrases. It comes in many forms, including, “Conservative estimates show …,” “This means that in X number of years …,” etc. I’ve learned to take almost all such material with a grain of salt, knowing that I need a lot more information before I can determine whether it is useful or not.

There was a great deal of religious growth in former Soviet bloc countries after the collapse of the Soviet Union. Along with other religions, the LDS Church experienced an increased rate of growth in those heady years, leading some to ‘conservatively estimate’ that membership would reach into the hundreds of millions within a couple of decades based on that trend. LDS hobbyist researcher David G. Stewart, Jr., MD lays out what actually happened in this lengthy report (released this year), which notes that “Annual LDS growth has progressively declined from over 5 percent in the late 1980s to less than 3 percent from 2000 to 2005.” Guess what? The trends didn’t continue, so the ‘conservative estimates’ were wildly off base.
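To see why compounding a short-run growth rate forward is such a fragile exercise, here is a minimal Python sketch. The starting membership and the growth rates are illustrative placeholders of my own, not figures taken from Stewart’s report; the point is only how quickly a constant-rate projection diverges from a slightly lower one.

```python
# Illustrative sketch only: starting membership and rates are made-up
# placeholders, not figures from the report cited above.
def project(membership, annual_rate, years):
    """Project membership forward assuming a constant annual growth rate."""
    for _ in range(years):
        membership *= 1 + annual_rate
    return membership

start = 7_000_000                          # hypothetical starting membership
optimistic = project(start, 0.05, 20)      # "if the late-1980s trend continues"
slower = project(start, 0.03, 20)          # closer to the more recent rate

print(f"At 5% per year for 20 years: {optimistic:,.0f}")
print(f"At 3% per year for 20 years: {slower:,.0f}")
# The gap between the two projections widens every single year, which is why
# extrapolating a short-term growth rate produces such brittle forecasts.
```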

This same tactic is employed in the field of finance. The whole premise of the widely followed Morningstar Report is to insinuate that money should be piled into funds that are currently performing well, because if current trends continue, you will make a heck of a lot of money over the next X number of years. Of course, short-term trends rarely do continue. A one-year (or even five-year or ten-year) look at a fund’s performance says something about how the fund is managed, but it does not necessarily mean that such performance will continue. In fact, every prospectus includes a disclaimer to that effect.

Media organizations just love to report fantastical prognostications based on extrapolations of short-term trends. They do it with respect to crime, environmental issues, economics, and just about everything else. And it doesn’t matter if they’re wildly off base, because nobody ever calls them on it. The forecast horizon is usually far enough down the road that nobody will remember by the time it arrives. (They customarily work in multiples of five or ten years.) And even if somebody does remember, it is a simple matter to ride roughshod over that information with whatever is the latest sensational forecast.

Organizations that raise funds commonly use the trends tactic to stir up enough emotional response to get someone to send cash. Our political debates are constantly infused with this tactic. In other words, the use of this deceptive tactic is widespread.

My four-year-old has experienced a phenomenal growth rate since birth. If these trends continue, she will be over 27 feet tall by age 50. In this case, the data used to draw the conclusion were insufficient, and the conclusion rested on the faulty assumption that her growth rate would remain constant over her first 50 years of life.
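The arithmetic behind that quip is nothing more than a straight-line extrapolation. Here is a minimal sketch; the heights are hypothetical placeholders (not my daughter’s actual measurements), so the number it prints lands in the same absurd ballpark rather than reproducing the 27-foot figure exactly.

```python
# Illustrative sketch only: the heights below are hypothetical placeholders.
birth_height_in = 20        # assumed height at birth, in inches
age4_height_in = 41         # assumed height at age four, in inches

# "If these trends continue": assume the average growth rate from birth to
# age four holds for the entire first 50 years of life.
rate_per_year = (age4_height_in - birth_height_in) / 4
height_at_50 = birth_height_in + rate_per_year * 50

print(f"Average growth so far: {rate_per_year:.2f} inches/year")
print(f"Naive projection at age 50: {height_at_50 / 12:.1f} feet")
# The arithmetic is fine; the assumption of a constant growth rate is not.
```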

Whenever I am exposed to some forecast, I realize that I need a lot more information before I can determine how useful the forecast is to me. I need to understand the relevance, the breadth, and the depth of the data, as well as how the data were derived and what assumptions were applied.

I found in my college statistics course that a common tactic is to work backwards: the conclusion is developed first, and then data are selected to support it. But incorrect conclusions are also common when working in the proper direction, because relevant data are often excluded. Another common practice is to put blinders on, thereby ignoring other possible conclusions that could be equally valid.

Perhaps the most useful thing I learned in my college statistics courses came as advice from my professor. He suggested that almost every statistic cited in a sound bite is riddled with inaccuracies. Why? Because news organizations understand how they make money. If they are not sufficiently brief, people tune out. The brevity required to sustain ratings does not permit a discussion of the complexities involved in almost every study. But brevity alone does not increase ratings; you need sensationalism for that. So media organizations tend to pull out one or two factors that are likely to get people to pay attention, or they rewrite the conclusion to give it more sensational appeal.

My professor once gave an example of what he was talking about. On the way to work years ago, he heard a report that Utah and Hawaii had the slimmest populations in the U.S. Having lived in both places (as well as a handful of other locations), he couldn’t see how this could possibly be true. Being in the statistics business, he was able to get the whole study. It turned out to be a telephone survey of a small number of people in each state. The researchers did not validate any of the information provided over the phone. One could just as easily have concluded that the study merely showed that Utahns and Hawaiians are more likely to lie about their weight than people in other states. Several other possible conclusions exist. The breadth and depth of the data were insufficient to support the conclusion, and the collection method was faulty.

Unfortunately, we usually don’t have time to research all of the findings we are constantly bombarded with. There are many different statistical test methods. I remember only a few of them from college statistics (like Student’s t-test). However, I do remember the Sixth Grade Test and the Smell Test.
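For the curious, here is a minimal sketch of the one formal test named above, a two-sample Student’s t-test, using SciPy. The sample values are made-up numbers purely for illustration.

```python
# Minimal two-sample Student's t-test using SciPy; the data are made up.
from scipy import stats

group_a = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3]
group_b = [12.6, 12.9, 12.4, 13.0, 12.7, 12.8]

t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value suggests the group means differ, but it says nothing about
# how the samples were collected -- which is exactly where studies like the
# weight-by-state telephone survey fall apart.
```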

The Sixth Grade Test basically states that if an average sixth grader could easily tell you the answer, the question doesn’t need to be studied. A study was once undertaken to determine whether college students moving between buildings in a rainstorm got wetter if they walked or if they ran. Ask an average sixth grader whether they would run or walk in such a situation; most would run. Sure enough, the study concluded that the students who ran were drier than the students who walked. This study didn’t pass the Sixth Grade Test. It didn’t need to be done.

The Smell Test isn’t as simple. It requires experience and personal research. Over time, one can begin to get a feel for the level of accuracy of a reported statistic. The colloquial version is that you begin to be able to tell whether it “smells” bad or not. The weight by state study just smelled bad. Anything that smells bad warrants skepticism. Of course, you have to be willing to be honest with yourself. Did a reported statistic smell OK simply because it supported your personal philosophy? Are you fooling yourself that something rotten smells OK?

We should be interested in getting at the actual truth of matters. This means using painful objectivity in considering information. Very often, information presented to us is incomplete, inaccurate, or even deliberately skewed. We need to learn to blow off a lot of this junk. And when something is really important, we should make an effort to validate it before we catalog it as truth.

1 comment:

Charles D said...

Trends indicate that these kinds of blog posts will receive favorable responses. :>)

You are so right about this, particularly with the media. They are quick to find a "trend" and much more inclined to simply report it than do the research to determine whether it's true. This applies also to anonymous sources.

How many articles or news segments have you read or heard that refer only to "administration sources," "Pentagon sources," etc.? These aren't individuals who are risking their jobs to speak out; they are messengers of the policy who think it's better not to use their names. The media should not assume such statements are true without investigation, especially when they are self-serving.