Tag: predictive modeling

Implement Boost Algorithm in SAS

by L X • March 29, 2010

Boost algorithms are proven to be very effective data mining tools, whether used stand-alone or as a building block to handle nonlinearity, etc. An implementation of the Boost algorithm in SAS is not easy to find, although it is not difficult to wr…

An efficient macro for Stump – two terminal nodes tree

by L X • February 17, 2010

In this post, I share an improved SAS macro for the single partition split algorithm in Chapter 2 of “Pharmaceutical Statistics Using SAS: A Practical Guide” by Alex Dmitrienko, Christy Chuang-Stein, and Ralph B. D’Agostino.
The single part…

SAS implementation of Kernel PCA

by L X • February 6, 2010

The kernel method is a very useful technique in data mining that is applicable to any algorithm relying on inner products [1]. The key is to apply an appropriate kernel function in place of the inner product in the original data space.
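To make the idea concrete, here is a minimal sketch in Python (not the SAS code from the post). It assumes a degree-2 polynomial kernel on 2-D points, chosen only for illustration, and checks that the kernel evaluated in the original space matches an ordinary inner product in an explicit feature space. The names `poly2_kernel`, `poly2_features`, and `gram_matrix` are hypothetical helpers.

```python
import math

def dot(x, y):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def poly2_kernel(x, y):
    """Degree-2 polynomial kernel (an assumed example): k(x, y) = (x . y)^2."""
    return dot(x, y) ** 2

def poly2_features(x):
    """Explicit feature map for 2-D input whose inner product equals poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def gram_matrix(points, kernel):
    """Matrix of pairwise kernel values; kernel PCA works on a matrix like this."""
    return [[kernel(p, q) for q in points] for p in points]

# Illustrative data, not taken from the post.
points = [(1.0, 2.0), (3.0, -1.0), (0.5, 0.5)]

# Kernel applied to the original 2-D points.
K = gram_matrix(points, poly2_kernel)

# Plain inner products after mapping every point to the 3-D feature space.
K_explicit = gram_matrix([poly2_features(p) for p in points], dot)
```

Both matrices agree up to floating-point error, which is why an algorithm written purely in terms of inner products can work in the feature space without ever constructing it.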

I show here SAS/STAT+BASE ex…

Partial Least Square

by L X • February 2, 2010

In some predictive modelling projects, we may have variables for which most observations share the same value, while the remaining small percentage are populated with meaningful values. For example, 90% of observations have values=0 but the rest 10% ha…

Implementing Gap statistic for clustering number estimation

by L X • January 22, 2010

The gap statistic is a method used to estimate the most likely number of clusters in a partition clustering, notably k-means clustering. This measure was originated by Trevor Hastie, Robert Tibshirani, and Guenther Walther, all from Stanford U…

Tensor in SAS

by L X • December 10, 2009

A tensor, a mathematical concept for higher-order arrays, is a very useful tool in modern data mining applications. HOSVD, the counterpart of SVD for higher-order arrays, is at the heart of modern applications, such as face recognition and clustering, segmentation…

AUC calculation using Wilcoxon Rank Sum Test

by L X • October 23, 2009

Accurately calculate AUC (Area Under the Curve) in SAS for data rank-ordered by a binary classifier

In order to calculate AUC for a given SAS data set that is already rank ordered by a binary classifier (such as linear logistic regression), where we h…
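As a rough illustration of the relationship named in the title, the sketch below computes AUC from the Wilcoxon rank sum in Python rather than SAS. It is not the post's code. It assumes labels coded 1 for events and 0 for non-events, assigns average ranks to tied scores, and uses AUC = (R1 - n1(n1 + 1)/2) / (n1 * n0), where R1 is the rank sum of the events. The function names `midranks` and `auc_wilcoxon` are hypothetical.

```python
def midranks(values):
    """Return 1-based ranks of values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def auc_wilcoxon(scores, labels):
    """AUC from the rank sum of the positive class (labels assumed to be 0/1)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    ranks = midranks(scores)
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    # Mann-Whitney U statistic for the positive class.
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)

# Illustrative scores and labels, not taken from the post.
example_auc = auc_wilcoxon([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

Using average ranks means a tied event/non-event pair contributes one half, which matches the usual trapezoidal AUC.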
