Tag: Statistics

Top 10 tips and tricks about PROC SQL

Interestingly, a traffic analysis of my tiny blog shows that the most searched keyword is PROC SQL. A possible reason: nowadays everybody knows SQL, more or less, so someone can do some parts of the SAS job with PROC SQL wit…

10 keywords taken out from SAS Global Forum 2012

1. In-memory
SAS is famous for hitting the hard disk at every operation, which is a proven strategy to save memory. To speed up the processing of ‘Big Data’, SAS on the server side will aggregate memory, load data into memory and then deal w…

Stored Processes: SAS’s voice on Business Intelligence

Every day I write SAS scripts to extract, transform and load data from various sources, which is a step before the database, and also pull data out for analysis such as aggregation and regression in SAS, which is a step after the database. According t…

Correlations of three variables

Question
There is an interesting question in statistics:
“There are 3 random variables X, Y and Z. The correlation between X and Y is 0.8 and the correlation between X and Z is 0.8. What are the maximum and minimum correlations between Y and Z?…
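
One way to reason about this question (a sketch of the standard argument, not necessarily the approach taken in the full post) is that a valid 3×3 correlation matrix must be positive semidefinite, so its determinant cannot be negative. Solving that condition for the unknown correlation gives the range r_xy·r_xz ± sqrt((1 − r_xy²)(1 − r_xz²)). The function name `corr_bounds` below is hypothetical and only for illustration.

```python
import math

def corr_bounds(r_xy, r_xz):
    """Feasible range of corr(Y, Z) given corr(X, Y) and corr(X, Z).

    Derived from requiring the 3x3 correlation matrix to have a
    non-negative determinant:
        1 + 2*r_xy*r_xz*r_yz - r_xy**2 - r_xz**2 - r_yz**2 >= 0
    which is a quadratic inequality in r_yz.
    """
    center = r_xy * r_xz
    half_width = math.sqrt((1 - r_xy ** 2) * (1 - r_xz ** 2))
    return center - half_width, center + half_width

# The case from the question: both known correlations are 0.8.
low, high = corr_bounds(0.8, 0.8)
print(round(low, 2), round(high, 2))
```

With both known correlations at 0.8, the center is 0.64 and the half-width is 0.36, so under this argument the correlation between Y and Z lies between 0.28 and 1.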

The 7-year question

I have been at SAS for 7 years and up until 10 days ago, I had never been asked this question. Since then, I’ve been asked four times, so now must be the time to answer it! Question: Can we simply use a linear regression model to predict the response …

Digital Life and Personal Data Analysis

This picture (by newsobserver.com) was taken at the Wake County Library Book Fair at the North Carolina Fairground, where I also went to pick up some books. I was and still am a big fan of books (mostly paper books). But when I was standing among the 450,000 used books at the book fair, I felt depressed and reluctantly […]

US Post Beats Menu Cost

One interesting observation from my first few months in the US: there is no price printed on the new postage stamps! It is interesting because, as a former student of economics, I think the US postal system made a nice attempt to beat the so-called “menu cost”, a concept credited to Harvard economist Mankiw. […]

Multicollinearity and the solutions

In his book, Rudolf Freund described a confounding phenomenon that arises when fitting a linear regression. The small data set below has three variables: a dependent variable (y) and two independent variables (x1 and x2). Using x2 to fit y alone, the estimat…