PROC-X: an online (unofficial) journal about the SAS(R) software - written by bloggers

Random Sample Selection

by sarath • May 9, 2010

Last week my manager asked me to randomly pick 10% of the observations from a large data set and then create a listing so that the data management programmers can QC the data. I want to share some thoughts here … how easy and simple it is to do random sampling. …

SAS

It’s the added value that counts

by michele reister • May 7, 2010

Welcome to SAS Training Post, the official blog of SAS Training & Certification! My name is Michele Reister and I am the social media manager for SAS Education. This blog will be a channel to provide you with value-add educational content t…

SAS

K-Nearest Neighbor in SAS

by L X • May 5, 2010

K-Nearest-Neighbor, aka KNN, is a widely used data mining tool and is often called a memory-based, case-based, or instance-based method because no model is fit. A good introduction to KNN can be found at [1] or on Wikipedia.

Typically, the KNN algorithm relies on a soph…

SAS

Next Project: Regularized Logistic Regression

by L X • May 5, 2010

L1 Regularized Logistic Regression (L1 RLR) effectively handles a large number of predictors and performs variable selection at the same time. [1] indicates that L1 RLR can be implemented via the IRLS-LARS algorithm. You can tweak PROC GLMSELECT in v9.2 for this.

L2 R…

SAS

Repeating a line of data

by ken kleinman • May 4, 2010

Repeating a line of a data set for each line in another
Suppose you want to access the same information in every line of a data set, and that this information is data-dependent. For example, suppose you want to add the 25th, 50th, and 75th per…

SAS

Hey Look: There’s Log In

by renee harper • May 3, 2010

We are working on some changes to support.sas.com that I’ll talk about here over the next few weeks. We tried to squeak a few in before leaving Cary for SAS Global Forum. If you didn’t see these on the demo floor, let me point them out now. Thes…