The 7 Deadly Sins of Data Mining and How To Avoid Them

October 25, 2010

This post was kindly contributed by The SAS Training Post - go there to comment and to read the full post.

Our M2010 Data Mining Conference keynote speaker, Dick De Veaux from Williams College, just finished his entertaining and informative presentation. He noted that our location (Las Vegas) is very appropriate for the subject of his presentation.

Are you guilty of any of these data mining sins? Luckily, Dick also presented the seven virtues of data mining to help absolve us of our sinful ways.

Seven Deadly Sins of Data Mining
1. Not asking the right questions.
2. Not fully understanding the problem.
3. Underestimating data preparation.
4. Ignoring what’s not there.
5. Falling in love with your models.
6. Going it alone.
7. Using bad data.

Seven Virtues of Data Mining
1. Define the problem.
2. Prepare the data, use domain knowledge.
3. Be open to new methods and models. Keep the toolbox open.
4. Be aware of missing data, create dummy variables.
5. Work in teams.
6. Ensure data quality.
7. Use models, not just associations.
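
Virtue 4 can be read as a suggestion to flag missing values with indicator (dummy) variables rather than silently dropping them. Below is a minimal Python sketch of that reading, not code from the talk; the function name `add_missing_indicators`, the `_missing` suffix, the fill value, and the sample records are all hypothetical.

```python
def add_missing_indicators(rows, columns, fill_value=0.0):
    """Return copies of rows with a 0/1 '<col>_missing' flag per column.

    A value counts as missing when the key is absent or the value is None.
    Missing values are replaced by fill_value so the row stays usable.
    """
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not modified
        for col in columns:
            is_missing = row.get(col) is None
            new_row[col + "_missing"] = 1 if is_missing else 0
            if is_missing:
                new_row[col] = fill_value
        out.append(new_row)
    return out


# Hypothetical sample data: one complete record, two with missing income.
customers = [
    {"id": 1, "income": 52000.0},
    {"id": 2, "income": None},
    {"id": 3},
]
flagged = add_missing_indicators(customers, ["income"])
```

Keeping the flag alongside the filled value lets a model pick up on cases where the fact that a value is missing is itself informative.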


