Posts Tagged ‘data analysis’

Use a bar chart to visualize pairwise correlations

August 16, 2017

Visualizing the correlations between variables often provides insight into how the variables are related. I've previously written about how to use a heat map to visualize a correlation matrix in SAS/IML, and Chris Hemedinger showed how to use Base SAS to visualize correlations between variables. Recently, a SAS programmer asked how


What is rank correlation?

August 14, 2017

When someone refers to the correlation between two variables, they are probably referring to the Pearson correlation, which is the standard statistic that is taught in elementary statistics courses. Elementary courses do not usually mention that there are other measures of correlation. Why would anyone want a different estimate of


Robust principal component analysis in SAS

August 9, 2017

Recently, I was asked whether SAS can perform a principal component analysis (PCA) that is robust to the presence of outliers in the data. A PCA requires a data matrix, an estimate for the center of the data, and an estimate for the variance/covariance of the variables. Classically, these estimates


Dimension reduction: Guidelines for retaining principal components

August 2, 2017

Last week I blogged about the broken-stick problem in probability, which reminded me that the broken-stick model is one of the many techniques that have been proposed for choosing the number of principal components to retain during a principal component analysis. Recall that for a principal component analysis (PCA) of


A quantile definition for skewness

July 19, 2017

Skewness is a measure of the asymmetry of a univariate distribution. I have previously shown how to compute the skewness for data distributions in SAS. The previous article computes Pearson's definition of skewness, which is based on the standardized third central moment of the data. Moment-based statistics are sensitive to


3 ways to visualize prediction regions for classification problems

July 17, 2017

An important problem in machine learning is the "classification problem." In this supervised learning problem, you build a statistical model that predicts a set of categorical outcomes (responses) based on a set of input features (explanatory variables). You do this by training the model on data for which the outcomes


Test for the equality of two proportions in SAS

July 5, 2017

A SAS customer asked how to use SAS to conduct a Z test for the equality of two proportions. He was directed to the SAS Usage Note "Testing the equality of two or more proportions from independent samples." The note says to "specify the CHISQ option in the TABLES statement


The average bootstrap sample omits 36.8% of the data

June 28, 2017

Suppose you roll six identical six-sided dice. Chances are that you will see at least one repeated number. The probability that you will see six unique numbers is very small: only 6! / 6^6 ≈ 0.015. This example can be generalized. If you draw a random sample with replacement from
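The dice calculation in this teaser is easy to check numerically. The sketch below is in Python rather than SAS, and the function names are hypothetical ones chosen for illustration. It assumes the standard bootstrap setup: a sample of size n drawn with replacement from n items. Under that assumption, each item is left out of the sample with probability (1 - 1/n)^n, which approaches 1/e ≈ 0.368 as n grows; that limit is presumably the source of the 36.8% in the title.

```python
import math
import random


def prob_all_unique(n: int) -> float:
    """Probability that n draws with replacement from n items are all distinct: n! / n^n."""
    return math.factorial(n) / n**n


def expected_fraction_omitted(n: int) -> float:
    """Expected fraction of the n items that never appear in a resample of size n."""
    return (1 - 1 / n) ** n


def simulate_fraction_omitted(n: int, reps: int, seed: int = 1) -> float:
    """Monte Carlo estimate of the average fraction of items omitted from a resample."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    total = 0.0
    for _ in range(reps):
        # Indices of the items that were selected at least once
        seen = {rng.randrange(n) for _ in range(n)}
        total += (n - len(seen)) / n
    return total / reps


if __name__ == "__main__":
    print(f"P(six unique faces)        = {prob_all_unique(6):.4f}")
    print(f"Expected omitted, n = 1000 = {expected_fraction_omitted(1000):.4f}")
    print(f"Simulated omitted, n = 100 = {simulate_fraction_omitted(100, 2000):.4f}")
    print(f"1/e                        = {math.exp(-1):.4f}")
```

The simulation is only a sanity check on the closed-form expression; for small n the omitted fraction sits slightly below 1/e.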


Welcome!

SAS-X.com offers news and tutorials about the various SAS® software packages, contributed by bloggers. You are welcome to subscribe to e-mail updates, or add your SAS-blog to the site.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.