Posts Tagged ‘data mining’

Flexibility of SAS Enterprise Miner

August 24, 2015

Do you use an array of tools to perform predictive analytics on your data? Is your current tool not flexible enough to accommodate some of your requirements? SAS Enterprise Miner may be your solution. With a growing number of data mining applications, having a tool which can do a variety of analysis...

Read more »

Posted in SAS | Comments Off on Flexibility of SAS Enterprise Miner

Use recursion and gradient ascent to solve logistic regression in Python

May 21, 2014

This post was kindly contributed by DATA ANALYSIS - go there to comment and to read the full post. In his book Machine Learning in Action, Peter Harrington provides a solution for parameter estimation of logistic regression. I use pandas and ggplot to implement a recursive alternative. Compared with the iterative method, the recursion costs more space but may bring...
Read more »
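
The excerpt above does not show the author's code, so the following is only a minimal sketch of the general idea: gradient ascent on the logistic log-likelihood, with each update step expressed as one recursive call. It uses plain Python rather than pandas and ggplot, and the function names, the toy data, and the values of alpha and steps are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(rows, labels, weights):
    """Bernoulli log-likelihood of the labels under the current weights."""
    total = 0.0
    for x, y in zip(rows, labels):
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total


def gradient_ascent(rows, labels, weights, alpha=0.1, steps=200):
    """Perform one averaged gradient-ascent update per recursive call."""
    if steps == 0:  # base case: no updates left
        return weights
    errors = [
        y - sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        for x, y in zip(rows, labels)
    ]
    n = len(rows)
    gradient = [
        sum(e * x[j] for e, x in zip(errors, rows)) / n
        for j in range(len(weights))
    ]
    updated = [w + alpha * g for w, g in zip(weights, gradient)]
    return gradient_ascent(rows, labels, updated, alpha, steps - 1)


# Hypothetical toy data: an intercept column plus one feature.
rows = [[1.0, -2.0], [1.0, -1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
labels = [0, 0, 1, 0, 1, 1]
start = [0.0, 0.0]
fitted = gradient_ascent(rows, labels, start)
```

Each call adds a stack frame, which is the extra space cost the excerpt mentions; steps has to stay well below Python's recursion limit (1000 by default).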

Posted in SAS | Comments Off on Use recursion and gradient ascent to solve logistic regression in Python

PROC PLS and multicollinearity

December 10, 2013

Multicollinearity and its consequences

Multicollinearity usually poses significant challenges to a regression model, whether it is fitted by the normal equation or by gradient descent.

1. Invertible SSCP for normal equation

According to the normal equation, the coefficients can be obtained by \hat{\beta}=(X'X)^{-1}X'y. If the SSCP matrix X'X turns out to be singular and non-invertible due to multicollinearity, then the coefficients are...
Read more »
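
As a small illustration of the singular-SSCP case, the sketch below builds X'X for two hypothetical two-column design matrices and compares their determinants. The data and helper names are assumptions; the post itself works in SAS with PROC PLS, which is not reproduced here.

```python
def sscp(X):
    """Return the SSCP matrix X'X for a design matrix given as a list of rows."""
    k = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Hypothetical data: the second column is exactly twice the first.
collinear = [[a, 2 * a] for a in [1, 2, 3, 4]]
# Hypothetical data: the two columns are not linearly dependent.
independent = [[1, 2], [2, 1], [3, 5], [4, 3]]

det_collinear = det2(sscp(collinear))      # 30*120 - 60*60
det_independent = det2(sscp(independent))  # 30*39 - 31*31
```

With perfectly collinear columns the determinant is zero, so (X'X)^{-1} does not exist. With nearly collinear columns the determinant is merely close to zero, and the inverse exists but is numerically unstable.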

Posted in SAS | Comments Off on PROC PLS and multicollinearity

Use R in Hadoop by streaming

December 9, 2013

It seems that the combination of R and Hadoop is a must-have toolkit for people working with both statistics and large data sets.

An aggregation example

The Hadoop version used here is Cloudera’s CDH4, and the underlying Linux OS is CentOS 6. The data used is a simulated sales data set from a training...
Read more »
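
The post writes its streaming scripts in R, and they are not shown in this excerpt. Purely to illustrate the shape of a streaming aggregation, here is a sketch in Python instead: a mapper that emits tab-separated key/value lines and a reducer that sums values per key over key-sorted input. The record layout (a key followed by a numeric amount, comma-separated) is an assumption, not the layout of the post's sales data.

```python
from itertools import groupby


def mapper(lines):
    """Emit 'key<TAB>value' for each comma-separated input record."""
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) < 2:  # skip blank or malformed records
            continue
        yield f"{fields[0]}\t{fields[1]}"


def reducer(sorted_lines):
    """Sum the values for each key; the input must already be sorted by key."""
    pairs = (line.rstrip("\n").split("\t") for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(float(value) for _, value in group)}"


# Local simulation of map -> sort -> reduce on hypothetical records.
records = ["east,10.0", "west,5.5", "", "east,2.5"]
shuffled = sorted(mapper(records))
totals = list(reducer(shuffled))
```

In a real streaming job the two functions would live in separate scripts that read standard input and write standard output, and Hadoop would perform the sort between them.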

Posted in SAS | Comments Off on Use R in Hadoop by streaming

A cheat sheet for linear regression validation

November 29, 2013

The cheat sheet is available here. I have benefited a lot from the UCLA SAS tutorial, especially the chapter on regression diagnostics. However, the content on the webpage seems to be outdated. The great thing about PROC REG is that it creates...
Read more »

Posted in SAS | Comments Off on A cheat sheet for linear regression validation

Kernel selection in PROC SVM

November 21, 2013

The support vector machine (SVM) is a classification or regression method that owes its flexibility to its many kernels. To apply an SVM, we usually need to specify a kernel, a regularization parameter c, and some kernel parameters such as gamma. Besides the selectio...
Read more »
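
To make the role of the kernel and of gamma concrete, here is a minimal sketch of two common kernel functions written in plain Python. It is not PROC SVM syntax, and the function names and default gamma are assumptions; the RBF form exp(-gamma * ||x - z||^2) is the standard definition.

```python
import math


def linear_kernel(x, z):
    """Plain dot product: no kernel parameters to tune."""
    return sum(a * b for a, b in zip(x, z))


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


# A larger gamma makes similarity decay faster with distance.
near = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)
far = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=5.0)
```

Because gamma controls how quickly similarity falls off, it is normally tuned together with the regularization parameter rather than on its own.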

Posted in SAS | Comments Off on Kernel selection in PROC SVM

When ROC fails logistic regression for rare-event data

November 13, 2013

The ROC curve, summarized by the AUC, is widely used in logistic regression and other classification methods for model comparison and feature selection; it measures the trade-off between sensitivity and specificity. The paper by Gary King warns of the dangers of using...
Read more »
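
The quantities named above can be computed directly. The sketch below is a toy illustration rather than a reproduction of the paper's argument: it uses hypothetical labels and scores with two events among ten cases, computes the AUC by pairwise comparison, and then evaluates sensitivity and specificity at a 0.5 cutoff.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Classify as positive when score >= threshold, then return (sens, spec)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical rare-event data: 2 events among 10 cases, all scores below 0.5.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.10, 0.20, 0.15, 0.30, 0.25, 0.05, 0.35, 0.20]

area = auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, 0.5)
```

On this toy data the ranking looks reasonable, yet the 0.5 cutoff flags no events at all, because every predicted probability is small when events are rare.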

Posted in SAS | Comments Off on When ROC fails logistic regression for rare-event data

A SAS macro that exports data to MongoDB

August 29, 2013

MongoDB is possibly the most popular NoSQL data store. To bypass schemas and constraints, I find it quite convenient to use MongoDB as a buffer alongside the current RDBMS. Also, it is straightforward to use MongoDB and other tools (MEAN) to build s...
Read more »

Posted in SAS | Comments Off on A SAS macro that exports data to MongoDB


Welcome!

SAS-X.com offers news and tutorials about the various SAS® software packages, contributed by bloggers. You are welcome to subscribe to e-mail updates, or add your SAS-blog to the site.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.