Friday, November 28, 2008

Judging the Value of Marketing Data

Last week’s post on ranking demand generation vendors highlighted a fundamental challenge in marketing measurement: the data you want often isn’t available. So a great deal of marketing measurement comes down to deciding which of the available data best suits your needs, and ultimately whether that data is better than nothing.

It’s probably obvious why using bad data can be worse than doing nothing, but in case this is read by, say, a creature from Mars: we humans tend to assume others are telling the truth unless we have a specific reason to question them. This innate optimism is probably a good thing for society as a whole. But it also means we’ll use bad data to make decisions which we would approach more cautiously if we had no data at all.

But how do you judge a piece of data? Here is a list of criteria presented in my book The MPM Toolkit, due in late January.

· Existence. Ok, this is pretty basic, but the information does have to exist. Let’s avoid the deeper philosophical issues and just say that data exists if it is recorded somewhere, or can be derived from something that’s recorded. So the color of your customers’ eyes only exists as data if you’ve stored it on their records or can look it up somewhere else. If the data doesn’t exist, you may be able to capture it. Then you have to compare the cost of capturing it with its value. But that’s a topic for another day.

· Accessibility. Can you actually access the data? To get back to last week’s post, we’d love to know the revenue of each demand generation vendor. This data certainly exists in their accounting systems, but they haven’t shared it with us so we can’t use it. Again, it’s often possible to gain access to information if you’re willing to pay the price, and you must once more compare the price with the value. In fact, the price / value tradeoff will apply to every factor in this list, so I won’t bother to mention it from here on out.

· Coverage. What portion of the universe is covered by the data? In the case of demand generation vendors, the number of blog posts was a poor measure of market attention because the available sources clearly didn’t capture all the posts. In itself, this isn’t necessarily a fatal flaw, since a fair sample could still give a useful relative ranking. But we can’t judge whether the coverage was a fair sample because we don’t know why it was incomplete. This is a critical issue when assessing whether, or more precisely how, to use incomplete data. (In the demand generation case, the very small numbers of blog posts added another issue, which is that the statistical noise of a few random posts could distort the results. This is also something to consider, although hopefully most of your marketing data deals with larger quantities.)

· Accuracy. Data may not have been accurate to begin with or it may be outdated. Data can be inaccurate because someone purposely provided false information or because the collection mechanism is inherently flawed. Survey replies can have both problems: people lie for various reasons and they may not actually know the correct answers. Even seemingly objective data can be incorrect: a simple temperature reading may be inaccurate because the thermometer was miscalibrated, someone read it wrong, or the scale was Celsius rather than Fahrenheit. Errors can also be introduced after the data is captured, such as incorrect conversions (e.g., inflation adjustments used to create “constant dollar” values) or incorrect aggregation (e.g., customer value statistics that do not associate transactions with the correct customers). In our demand generation example, statistics on search volume were highly inaccurate because the counts for some terms included results that were clearly irrelevant. As with other factors listed here, you need to determine the level of accuracy that’s required for your specific purpose and assess whether the particular source is adequate.

· Consistency. Individually accurate items can be collectively incorrect. To continue with the thermometer example, readings from some stations may be in Celsius and others in Fahrenheit, or readings from a single station may have changed from Fahrenheit to Celsius over time. This particular difference would be obvious to anyone examining the data, although it could easily be overlooked in a large data set that combined information from many sources. Other inconsistencies are much more subtle, such as changes in wording of survey questions or the collection mechanism (e.g., media consumption diaries vs. automated “people meters”). As with coverage, it’s important to understand any bias introduced by these factors. In our demand generation analysis, we used several different techniques to measure Web traffic, and it appeared that these yielded inconsistent results for sites with different traffic levels.

· Timeliness. The primary issue with timeliness is how quickly data becomes available. In the past, it often took weeks or months to gather marketing information. Today, data in general moves much more quickly, although some information still takes months to assemble. There is a danger that quickly available data will overwhelm higher-quality data that appears later. For example, the initial response rate to a promotion is immediately available, but the value of those responses can only be measured over time. Decisions based only on gross response often turn out to be incorrect once the later performance is included in the analysis. Still, timely data can be extremely important when it can lead to adjustments that improve results, such as moving funds from one promotion to another. Online marketing in particular often allows for such reactions because changes can be made in hours or minutes, rather than the weeks and months needed for traditional marketing programs.
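
To make the timeliness point concrete, here is a minimal Python sketch of how a ranking based on initial response rate can flip once later value is counted. The two promotions, their figures, and the names Promotion, response_rate and revenue_per_piece are all hypothetical, invented only for illustration; they are not data from last week's post.

```python
from dataclasses import dataclass


@dataclass
class Promotion:
    """Hypothetical promotion record; every figure below is made up."""
    name: str
    pieces_sent: int
    responses: int        # available almost immediately
    later_revenue: float  # revenue from those responders, known only months later

    @property
    def response_rate(self) -> float:
        # Gross response: the quick measure.
        return self.responses / self.pieces_sent

    @property
    def revenue_per_piece(self) -> float:
        # Value of the responses: the slower, higher-quality measure.
        return self.later_revenue / self.pieces_sent


promotions = [
    Promotion("A", pieces_sent=10_000, responses=300, later_revenue=6_000.0),
    Promotion("B", pieces_sent=10_000, responses=200, later_revenue=9_000.0),
]

# Rank the same promotions by the early measure and by the later one.
best_by_response = max(promotions, key=lambda p: p.response_rate)
best_by_value = max(promotions, key=lambda p: p.revenue_per_piece)

print("Best by initial response rate:", best_by_response.name)
print("Best by later revenue per piece:", best_by_value.name)
```

With these invented numbers, promotion A wins on response rate while promotion B wins on revenue per piece, so a decision made on the early data alone would have moved funds in the wrong direction.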

I haven’t listed cost as a separate consideration only because there are often incremental investments that can be made to change a data element’s existence, accessibility, coverage, etc. Those investments would change its value as well. But you will ultimately still need to assess the total cost and value of a particular element, and then compare it with the cost and value of other elements that could serve a similar purpose. This assessment will often be fairly informal, as it was in last week’s blog post. But you still need to do it: while an unexamined life may or may not be worth living, unexamined marketing data will get you in trouble for sure.
