Hello, and Welcome

The SAS Training Post is born, and introductions are in order. I’m Catherine Truxillo, but everyone except my mom calls me Cat. I have been a statistician in the Education division at SAS since 2000. I’ve been blogging a little longer than that, a…

Random Sample Selection

Last week my manager asked me to randomly pick 10% of the observations from a large data set and then create a listing so that the data management programmers can QC the data. I want to share some thoughts here … how easy and simple it is to do random sampling. …
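To make the idea concrete, here is a minimal sketch of drawing a 10% simple random sample without replacement. It is written in plain Python rather than SAS, purely as an illustration; the function name `sample_fraction`, the seed value, and the toy `records` list are all assumptions and do not come from the original post.

```python
import random

def sample_fraction(rows, fraction=0.10, seed=12345):
    """Return a simple random sample (without replacement) of `fraction` of rows.

    Hypothetical helper for illustration; a fixed seed makes the draw repeatable.
    """
    rng = random.Random(seed)
    k = round(len(rows) * fraction)
    # Draw k distinct positions, then sort them so the listing keeps the
    # original record order, which makes a QC review easier to follow.
    picked = sorted(rng.sample(range(len(rows)), k))
    return [rows[i] for i in picked]

# Toy stand-in for a large data set: 1,000 records with an id each.
records = [{"id": i} for i in range(1, 1001)]
qc_listing = sample_fraction(records)
print(len(qc_listing))
```

Fixing the seed is a design choice: the same 10% can be regenerated later if the QC listing has to be reproduced.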

It’s the added value that counts

Welcome to SAS Training Post, the official blog of SAS Training & Certification! My name is Michele Reister and I am the social media manager for SAS Education. This blog will be a channel to provide you with value-add educational content t…

K-Nearest Neighbor in SAS

K-Nearest-Neighbor, aka KNN, is a widely used data mining tool and is often called a memory-based, case-based, or instance-based method because no model is fit. A good introduction to KNN can be found at [1] or on Wikipedia.
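The "no model is fit" point can be seen in a small sketch: prediction just looks up the k closest stored training cases and takes a majority vote. This is plain Python for illustration, not SAS and not the approach described in the post; `knn_predict`, the Euclidean distance choice, and the toy data are assumptions.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Hypothetical helper: uses Euclidean distance and a brute-force scan.
    """
    # There is no training step; the stored cases themselves are the "model".
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy data: two well-separated clusters labeled "a" and "b".
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_X, train_y, (1.1, 1.0)))
print(knn_predict(train_X, train_y, (5.0, 5.1)))
```

The brute-force scan compares the query against every stored case, which is fine for a toy example but scales poorly on large data.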

Typically, the KNN algorithm relies on a soph…

Next Project: Regularized Logistic Regression

L1 Regularized Logistic Regression (L1 RLR) effectively handles a large number of predictors and performs variable selection at the same time. [1] indicates that L1 RLR can be implemented via the IRLS-LARS algorithm. You can tweak PROC GLMSELECT in v9.2 for this.

L2 R…