March 30, 2007

Data Classification: Brains or Brawn

Elements of data classification may apply strongly to your data mining program. I recommend building consolidated classification coding systems for attribute, interest, and funding categories. For example, engineering graduates may have an engineering interest, which rolls up into a science interest, which in turn rolls up into the constituency pool for science and technology. When a person notes an interest in engineering on a survey, attends an engineering event, or gives to engineering, that person joins this pool as well.

By "smart-coding" your entire system into these categories, you will multiply the number of independent characteristics available for predictive modeling. Similar work might be done for occupations and industries. The manual mapping is the most difficult step in these classification projects.
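
To make the roll-up idea concrete, here is a minimal sketch in Python. The category codes, the parent links, and the function names are all hypothetical and invented for illustration; a real coding system would come out of the manual mapping work described above.

```python
# Hypothetical parent links: each code points to the broader category it rolls up into.
PARENT = {
    "engineering": "science",
    "science": "science_and_technology",
}


def roll_up(code):
    """Return the code followed by every broader category above it."""
    chain = [code]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def interest_pools(signals):
    """Collect every category a constituent belongs to, given (source, code) signals."""
    pools = set()
    for _source, code in signals:
        pools.update(roll_up(code))
    return pools


# A degree and an event attendance both point to engineering (illustrative data).
signals = [("degree", "engineering"), ("event", "engineering")]
print(sorted(interest_pools(signals)))
```

Each category in the resulting set can then serve as its own yes/no characteristic in a model, which is where the multiplication of available variables comes from.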

On a deeper level, here is an article on data classification for the techies on the list.

The current state of data classification is largely a byproduct of historical, hierarchical storage management (HSM) implementations where data age is the primary classification criterion. Early visions of classifying data based on business value never fully came to fruition because it required a manual, brute force approach and was too hard to automate. Age-based classification enabled automation processes to be more easily applied to data classification initiatives and became the de facto standard.


Read More

Data Mining for Airfares

A University of Washington professor used data mining to predict when to buy airfares. As these tools become increasingly accessible to people, I expect we will see the pricing sophistication of companies increase. Similarly, in fundraising, we might use data from our database to understand when a person is most likely to give the largest amounts, make certain types of gifts, or be willing to volunteer.

In 2003, Etzioni and colleagues published a paper showing that they could predict the fluctuation in airline-ticket prices surprisingly well. By sifting through the history of more than 12,000 airfares for nonstop flights from Seattle to Washington, D.C., and from Los Angeles to Boston, the researchers could predict with 62 percent accuracy whether or not those ticket prices would rise or fall in the future.

Read More

The Next Wave of Business Analytics

Here is a great overview of the current state and future of analytics in the market.

Although Albert Einstein said, "Not everything that counts can be counted and not everything that can be counted counts," organizations in all industries are collecting and storing an increasing amount of data generated by internal transactional systems as well as external content sources. The challenge of what to measure and how to agree on key performance indicators (KPIs) is a point of frustration for both IT and business. However, most organizations are willing to err on the side of caution and deal with more data rather than discarding it.

Read More

Data Miners Can Be Funny... Well, they can try.

On a light-hearted note, here is the Data Mining Limericks contest at KDnuggets, from a few years back. Below is one of my favorites, by Ross Bettinger:

There once was a data miner
Who claimed, "I'm a Forty-Niner."
His main obsession
Was logistic regression
But neural networks predicted finer.

Read More

March 15, 2007

Lifetime Value and Fundraising

Most prospecting strategies concentrate on the capacity rating. This is generally an amount a person can give in an ideal scenario if your organization is their top philanthropic priority. This amount is generally compared to gift officer yield rates and/or target ask amounts to project portfolio performance.

In the for-profit arena, lifetime value is the preferred metric for rating customers. From an annual giving perspective, it makes sense to consider lifetime value in segmentation. However, if an annual giving director's only goal is the participation rate, is that director likely to risk overall participation for the sake of acquiring donors with high lifetime value? Taking that risk is likely preferable for the big picture.

Customer lifetime value is a way of measuring how much your customers are worth to you, over the length of time that they remain your customers. The lifetime for customers will vary from industry to industry, and from brand to brand. The lifetime of customers should come to an end when their contribution ceases to be profitable unless steps are taken to revitalize them.

Here is an article with a methodology for determining lifetime value.
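
For a rough feel of the arithmetic, here is a minimal sketch of one common way to estimate lifetime value. It is not the methodology from the linked article. It assumes a constant annual gift, a constant retention rate, a fixed discount rate, and a fixed horizon, and the function name and figures are hypothetical.

```python
def lifetime_value(annual_gift, retention_rate, discount_rate, years):
    """Sum the expected, discounted gifts over a fixed horizon."""
    total = 0.0
    for year in range(years):
        survival = retention_rate ** year        # chance the donor is still giving in this year
        discount = (1 + discount_rate) ** year   # value of money received in this year
        total += annual_gift * survival / discount
    return total


# Illustrative donor: $100 a year, 60% retention, 5% discount rate, 10-year horizon.
print(round(lifetime_value(100.0, 0.6, 0.05, 10), 2))
```

Even this simple version shows why retention matters so much: raising the retention rate increases every later term in the sum.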

You've Got Data: Now What Do You Do With It?

This article from one of my favorite web sites, CRMguru, presents a four-step process for making the most of your data. I am glad it points out that predictive models don't predict customers; rather, they identify the groups most likely to be customers. Similarly, in fundraising, predictive modeling will identify and prioritize your prospects, bringing efficiency to both your high-touch prospect research and your broad-based solicitation strategies.

For example, Toyota took customer data from previous repurchase campaigns and developed simple statistical models to predict which customers are more likely to repurchase in the future. The models identified the characteristics of customers who repurchased before and used this insight to identify, say, the 30 percent of customers most likely to repurchase in any given month in the future. The models were used to target customers in a number of repurchase campaigns to great success: The campaigns doubled the repurchase rate but at only one half of the volume (and cost) of mailings, a net 400 percent increase in campaign ROI.

Read More
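
The targeting step in the excerpt, picking the share of customers most likely to respond, can be sketched in a few lines of Python. This is only an illustration and not Toyota's method: the IDs and scores are made up, the scores are assumed to come from some model you have already built, and the function name is hypothetical.

```python
def top_fraction(scores, fraction):
    """Return the IDs with the highest scores, keeping `fraction` of them (rounded down, at least one)."""
    keep = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep]


# Hypothetical model scores for ten prospects.
scores = {
    "d01": 0.91, "d02": 0.15, "d03": 0.62, "d04": 0.08, "d05": 0.77,
    "d06": 0.33, "d07": 0.49, "d08": 0.21, "d09": 0.85, "d10": 0.05,
}
print(top_fraction(scores, 0.30))  # ['d01', 'd09', 'd05']
```

The same ranking works for a mailing list or a prospect research queue; only the cutoff changes.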

Data Mining and Video Games

This article about a video game data mining tool reminds me of a KDD conference session I attended five or six years ago. In that session, a data miner had teamed up with a video game designer on a project. They set out to see whether they could predict click-through and customer behavior based on how players maneuvered their way through the game. Their results were compelling. We are just scratching the surface of understanding the predictability and interrelationships of human behavior.

You may have heard of Emergent Technologies Gamebryo engine, but you have probably never heard of their data mining tool called Metrics. It allows you to instrument and visually see data sets in your game letting you figure out problems hopefully before they are a problem.

Read More

March 2, 2007

Is Predictive Analytics at the Tipping Point?

It has been my mantra that predictive analytics is within your reach. Although the science comes out of the academic community, the products are designed by experts, and the development was ushered in by math geeks, it is time for the user to step up. The tools are increasingly user-friendly, and you know your data better than anyone.

It seems we are at the point where the leading fundraising organizations are rapidly building data mining programs. Given the increasing size of campaign totals, I can understand why. The tools are there. The necessity is clear.

The organizations that have built predictive analytics programs are likely more efficiently identifying prospects, applying effective segmentation strategies, understanding their constituents, making smart data enhancement decisions, and raising more money.

The business world agrees:

Business intelligence and decision support have been utilized by organizations for several decades. Their deployment continues to spread in both breadth and depth within many of these organizations while gaining new converts in others. However, data mining, and its application as predictive analytics, has often been characterized as a technology that could only be utilized by highly skilled technical practitioners with strong statistical backgrounds... This may have been true in the past, but is arguably no longer the situation today.

Read More

The Averaged American - Book Review

The Averaged American: Surveys, Citizens, and the Making of a Mass Public by Sarah E. Igo traces the history of market and survey research. BusinessWeek provides this solid book review. I have not yet read this book, but it is on my short list.

Polling, once considered a scandalous invasion of privacy, is now an accepted practice. More than 20% of Americans were polled at least once in the past year. As Igo aptly concludes, "we will continue to live in a world shaped by, and perceived through, survey data."

Read More

Data Mining with Java

Here is an introduction to data mining using Java. It provides some interesting descriptions, tables, and examples that will apply to you non-Java folks as well.

This article, an excerpt from Java Data Mining: Strategy, Standard, and Practice by Mark F. Hornick, Erik Marcade, Sunil Venkayala (Morgan Kaufmann, 2007), introduces data mining concepts for those new to data mining, and will familiarize data mining experts with data mining terminology and capabilities specific to the Java Data Mining API (JDM).

Read More

Is Data Mining Fraught with Peril?

This fun piece from ABC News describes the use and abuse of data mining. Often, the abuse surfaces when the analytics professional is more interested in desired results than in natural results. If there is no clear business understanding, and the model evaluation does not circle back to the original premise, errors will result.

In fundraising, I see many people wanting unique and interesting factors to be the "key" for predicting giving. Instead, our goals should be identifying prospects, prioritizing prospects, predicting behaviors, segmenting our lists, and so on. In prospect identification, data mining is typically followed by prospect research. This hand verification will catch Type I errors (false positives). However, the deployment of models to annual giving does not have a safety net. I recommend testing a model as you might test a survey or a direct mail piece to control for this potential for error.
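
One way to run such a test is to mail a model-selected group and a randomly selected control group, then compare their response rates. The sketch below uses a standard two-proportion z-test for that comparison. The counts are hypothetical, the function name is my own, and the test is only one reasonable choice among several.

```python
import math


def two_proportion_z(success_a, n_a, success_b, n_b):
    """z statistic for the difference between two response rates, using a pooled standard error."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se


# Hypothetical results: 60 gifts from 1,000 model-selected names, 40 gifts from 1,000 random names.
z = two_proportion_z(60, 1000, 40, 1000)
print(round(z, 2))
```

As a rule of thumb, a z value beyond roughly 1.96 in either direction suggests the difference is unlikely to be chance at the usual 5 percent level, assuming the two groups were mailed the same piece at the same time.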

I believe the science of data mining is strong and evolving, but I definitely recommend reading the ABC article to introduce some of the "cons."

That's been a good thing in some ways, because it has helped researchers spot trends in everything from politics to the stock market to long range weather patterns. But it's probably also why you get advertisements for stuff you don't want, and why sometimes it rains when it's supposed to be sunny. According to Austin and his colleagues, data mining is fraught with peril.

Read More
