The SAS Training Post is born, and introductions are in order. I’m Catherine Truxillo, but everyone except my mom calls me Cat. I have been a statistician in the Education division at SAS since 2000. I’ve been blogging a little longer than that, a…
Book review: SAS and R by Ken Kleinman and Nicholas J. Horton
There are many books that teach you to use SAS or that teach you to use R. There is at least one book that teaches R to people who know SAS or SPSS (R for SAS and SPSS Users by Robert Muenchen, and it’s very good). Most of those try to teach you …
Random Sample Selection
Last week my manager asked me to randomly pick 10% of the observations from a large data set and then create a listing so that the data management programmers can QC the data. I want to share some thoughts here … how easy and simple it is to do random sampling. …
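As a rough illustration of the task described in this excerpt, here is a minimal Python sketch of drawing a 10% simple random sample without replacement. It is not the post's own SAS code: the function name `sample_fraction`, the toy `observations` list, and the seed value are all assumptions made up for this example.

```python
import random


def sample_fraction(rows, fraction=0.10, seed=12345):
    """Return a simple random sample (without replacement) of `fraction` of rows.

    `sample_fraction` and its parameters are hypothetical names for illustration.
    """
    rng = random.Random(seed)           # fixed seed so the QC listing is reproducible
    k = round(len(rows) * fraction)     # number of observations to keep
    picked = sorted(rng.sample(range(len(rows)), k))  # keep the original row order
    return [rows[i] for i in picked]


# Toy stand-in for the large data set (assumption: 1,000 rows with an id field).
observations = [{"id": i} for i in range(1, 1001)]
qc_listing = sample_fraction(observations)
print(len(qc_listing), "rows selected for QC")
```

Fixing the seed is a design choice worth keeping in any version of this: the programmers doing the QC can regenerate exactly the same listing later.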
It’s the added value that counts
Welcome to SAS Training Post, the official blog of SAS Training & Certification! My name is Michele Reister and I am the social media manager for SAS Education. This blog will be a channel to provide you with value-add educational content t…
K-Nearest Neighbor in SAS
K-Nearest-Neighbor, aka KNN, is a widely used data mining tool and is often called a memory-based/case-based/instance-based method because no model is fit. A good introduction to KNN can be found at [1] or on Wikipedia.
Typically, the KNN algorithm relies on a soph…
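To make the "no model is fit" point concrete, the following is a minimal brute-force sketch of KNN classification in Python, not the SAS approach the post goes on to describe. The function name `knn_predict` and the toy points and labels are assumptions for illustration only.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Nothing is fitted in advance: the training data itself is the "model".
    """
    # Euclidean distance from the query to every stored training point.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Majority vote among the k closest neighbours.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy data (assumption): two well-separated clusters.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (5.1, 5.0)))
```

This version scans every training point for each query, which is fine for small data; larger problems usually need a spatial index or similar search structure.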
Next Project: Regularized Logistic Regression
L1 Regularized Logistic Regression effectively handles a large number of predictors and performs variable selection at the same time. [1] indicates that L1 RLR can be implemented via the IRLS-LARS algorithm. You can tweak PROC GLMSELECT in v9.2 for this.
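The variable-selection effect of the L1 penalty can be seen in a small sketch. Note that this is not the IRLS-LARS algorithm mentioned above, nor a PROC GLMSELECT recipe: it swaps in plain proximal gradient descent (soft-thresholding after each gradient step), written in Python. The function names, the penalty `lam`, the step size, and the toy data are all assumptions chosen for illustration.

```python
import math


def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold`; small values become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, lam=0.1, step=0.1, n_iter=2000):
    """Minimise mean logistic loss + lam * ||w||_1 by proximal gradient descent.

    The intercept b is not penalised.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        # Residuals: predicted probability minus observed 0/1 outcome.
        errors = [
            sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
            for row, yi in zip(X, y)
        ]
        grad_b = sum(errors) / n
        grad_w = [sum(e * row[j] for e, row in zip(errors, X)) / n for j in range(p)]
        b -= step * grad_b
        # Gradient step on the smooth loss, then the L1 proximal (shrinkage) step.
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
    return b, w


# Toy data (assumption): the first column drives y, the second is pure noise.
X = [[-2.0, 1.0], [-1.5, -1.0], [-1.0, -1.0], [-0.5, 1.0],
     [0.5, 1.0], [1.0, -1.0], [1.5, -1.0], [2.0, 1.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

b, w = fit_l1_logistic(X, y)
print("intercept:", b, "weights:", w)
```

On this toy data the noise column's weight is driven to exactly zero while the informative column keeps a positive weight, which is the "selection" behaviour the excerpt refers to.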
L2 R…