# Posts Tagged ‘data mining’

## SAS introduces the blended classroom

March 23, 2018

We all have different learning styles. Some learn best by seeing and doing; others by listening to lectures in a traditional classroom; still others simply by diving in and asking questions along the way. Traditional face-to-face classroom instruction,...

Posted in SAS

## Flexibility of SAS Enterprise Miner

August 24, 2015

Do you use an array of tools to perform predictive analytics on your data? Is your current tool not flexible enough to accommodate some of your requirements? SAS Enterprise Miner may be your solution. With a growing number of data mining applications, having a tool that can do a variety of analysis...

Posted in SAS

## Use recursion and gradient ascent to solve logistic regression in Python

May 21, 2014

This post was kindly contributed by DATA ANALYSIS - go there to comment and to read the full post. In his book Machine Learning in Action, Peter Harrington provides a solution for parameter estimation of logistic regression. I use pandas and ggplot to realize a recursive alternative. Compared with the iterative method, the recursion costs more space but may bring...
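As a rough illustration of the idea, here is a minimal pure-Python sketch of gradient ascent on the logistic log-likelihood written recursively. It is not the code from the full post (which uses pandas and ggplot); the function names, the toy data, the learning rate `alpha`, and the step count are all assumptions made for this example.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def gradient_ascent(X, y, weights, alpha=0.1, steps=200):
    """Apply one gradient-ascent update per recursive call.

    Every call adds a stack frame, which is the extra space cost of
    recursion; `steps` must stay below Python's recursion limit.
    """
    if steps == 0:
        return weights
    # Gradient of the log-likelihood: sum over rows of (y_i - p_i) * x_i
    grad = [0.0] * len(weights)
    for row, label in zip(X, y):
        p = sigmoid(sum(w * x for w, x in zip(weights, row)))
        for j, x in enumerate(row):
            grad[j] += (label - p) * x
    updated = [w + alpha * g for w, g in zip(weights, grad)]
    return gradient_ascent(X, y, updated, alpha, steps - 1)

# Made-up toy data: the first column is the intercept term.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.0], [1.0, 2.5], [1.0, 3.5], [1.0, 4.0]]
y = [0, 0, 1, 0, 1, 1]
weights = gradient_ascent(X, y, [0.0, 0.0])
```

An iterative loop would compute the same updates with constant stack usage; the recursive form trades that space for a more declarative shape.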

Posted in SAS

## PROC PLS and multicollinearity

December 10, 2013

### Multicollinearity and its consequences

Multicollinearity usually poses significant challenges for a regression model fitted by either the normal equation or gradient descent.

#### 1. Invertible SSCP for normal equation

According to the normal equation, the coefficients can be obtained by $\hat{\beta} = (X^{T}X)^{-1}X^{T}y$, where $X^{T}X$ is the SSCP matrix. If the SSCP turns out to be singular and non-invertible due to multicollinearity, then the coefficients are...
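To make the singularity concrete, the sketch below builds the SSCP (sums of squares and cross-products) matrix $X^{T}X$ for two small, made-up design matrices and compares their determinants. The data and helper names are hypothetical and the example is restricted to two predictors so a 2x2 determinant suffices; it only illustrates the algebra, not anything PROC PLS does.

```python
def sscp(X):
    """Return X^T X for a matrix X given as a list of rows."""
    cols = list(zip(*X))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Hypothetical predictors: the second column is exactly twice the first.
X_collinear = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
# Hypothetical predictors with no exact linear dependence.
X_independent = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0]]

det_collinear = det2(sscp(X_collinear))      # zero: SSCP cannot be inverted
det_independent = det2(sscp(X_independent))  # non-zero: SSCP is invertible
```

With perfect collinearity the determinant is exactly zero; with near-collinearity it is merely close to zero, and the inverse becomes numerically unstable rather than undefined.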

Posted in SAS

## Use R in Hadoop by streaming

December 9, 2013

It seems that the combination of R and Hadoop is a must-have toolkit for people working with both statistics and large data sets.

### An aggregation example

The Hadoop version used here is Cloudera’s CDH4, and the underlying Linux OS is CentOS 6. The data used is a simulated sales data set from a training...

Posted in SAS

## A cheat sheet for linear regression validation

November 29, 2013

The link to the cheat sheet is here. I have benefited a lot from the UCLA SAS tutorial, especially the chapter on regression diagnostics. However, the content on the webpage seems to be outdated. The great thing about PROC REG is that it creates...

Posted in SAS

## Kernel selection in PROC SVM

November 21, 2013

The support vector machine (SVM) is a classification or regression method that owes its flexibility to the many kernels it can use. To apply an SVM, we typically need to specify a kernel, a regularization parameter c, and some kernel parameters such as gamma. Besides the selectio...
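For readers unfamiliar with the kernel choices, here is a minimal pure-Python sketch of three common textbook kernels. These are generic definitions, not the PROC SVM interface; the function names, default parameter values, and sample vectors are assumptions for illustration only.

```python
import math

def linear_kernel(u, v):
    """Plain dot product: no kernel parameters to tune."""
    return sum(a * b for a, b in zip(u, v))

def polynomial_kernel(u, v, degree=2, coef0=1.0):
    """(u.v + coef0) ** degree: adds interaction terms up to `degree`."""
    return (linear_kernel(u, v) + coef0) ** degree

def rbf_kernel(u, v, gamma=0.5):
    """exp(-gamma * ||u - v||^2): larger gamma makes similarity decay faster."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

# Made-up sample vectors.
u, v = [1.0, 2.0], [2.0, 0.0]
k_linear = linear_kernel(u, v)
k_poly = polynomial_kernel(u, v)
k_rbf_narrow = rbf_kernel(u, v, gamma=2.0)
k_rbf_wide = rbf_kernel(u, v, gamma=0.1)
```

The regularization parameter c is separate from the kernel: it controls how heavily margin violations are penalized, while gamma and degree shape the similarity measure itself.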

Posted in SAS

## When ROC fails logistic regression for rare-event data

November 13, 2013

ROC or AUC, which measures the trade-off between sensitivity and specificity, is widely used in logistic regression and other classification methods for model comparison and feature selection. The paper by Gary King warns of the dangers of using...
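The sketch below shows how AUC, sensitivity, and specificity are computed, using a made-up rare-event sample. It does not reproduce the argument of the paper mentioned above; it only illustrates one well-known property, namely that AUC depends on the ranking of scores and not on their absolute level. All data and function names are hypothetical.

```python
def auc(scores, labels):
    """Share of (event, non-event) pairs where the event scores higher.

    Ties count as half a win, matching the rank-based definition of AUC.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(scores, labels, threshold):
    """Classify as an event when score >= threshold, then count outcomes."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= threshold)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < threshold)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < threshold)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Made-up rare-event sample: 2 events among 12 cases, all scores small.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.20, 0.085,
          0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10]

model_auc = auc(scores, labels)
sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
```

In this sample the events rank above most non-events, so the AUC is high, yet no case reaches the conventional 0.5 cutoff and the sensitivity at that threshold is zero.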

Posted in SAS

## Welcome!

SAS-X.com offers news and tutorials about the various SAS® software packages, contributed by bloggers. You are welcome to subscribe to e-mail updates, or add your SAS-blog to the site.