The 7 Deadly Sins of Data Mining and How To Avoid Them

This post was kindly contributed by The SAS Training Post - go there to comment and to read the full post.

Our M2010 Data Mining Conference keynote speaker, Dick De Veaux of Williams College, just finished his entertaining and informative presentation. He thoughtfully noted that our location (Las Vegas) is very appropriate for the subject of his presentation.

Are you guilty of any of these data mining sins? Luckily, Dick also presented the seven virtues of data mining to help absolve us of our sinful ways.

Seven Deadly Sins of Data Mining
1. Not asking the right questions.
2. Not fully understanding the problem.
3. Underestimating data preparation.
4. Ignoring what’s not there.
5. Falling in love with your models.
6. Going it alone.
7. Using bad data.

Seven Virtues of Data Mining
1. Define the problem.
2. Prepare the data; use domain knowledge.
3. Be open to new methods and models. Keep the toolbox open.
4. Be aware of missing data; create dummy variables.
5. Work in teams.
6. Ensure data quality.
7. Use models, not just associations.
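
The post does not say how the talk recommended building dummy variables for missing data, so the following is only a minimal sketch of one common reading of virtue 4: keep a 0/1 flag recording that a value was absent, then fill the gap so the record can still be modeled. The function name `add_missing_indicator`, the `income` column, and the choice of mean imputation are all illustrative assumptions, not anything taken from the presentation.

```python
from statistics import mean


def add_missing_indicator(rows, column):
    """Return copies of rows with a 0/1 '<column>_missing' flag.

    Missing values (None) are filled with the mean of the observed values.
    Mean imputation is an assumption made for this sketch only.
    """
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed) if observed else None
    prepared = []
    for r in rows:
        new_row = dict(r)  # copy so the input rows are left untouched
        is_missing = r[column] is None
        new_row[f"{column}_missing"] = 1 if is_missing else 0
        if is_missing:
            new_row[column] = fill
        prepared.append(new_row)
    return prepared


# Hypothetical example data: one customer has no recorded income.
customers = [
    {"id": 1, "income": 52000.0},
    {"id": 2, "income": None},
    {"id": 3, "income": 61000.0},
]
prepared = add_missing_indicator(customers, "income")
```

Keeping the flag as its own column lets a model learn whether the fact that a value is missing is itself informative, which is one way of not "ignoring what's not there".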
