Tag: Statistics

Statistical Notes (4): Dragon’s Teeth and Fleas: Hypothesis Testing in Plain English

Statisticians aren’t the problem for data science. The real problem is too many posers. –Cathy O’Neil You actually do need to understand how to invert a matrix at some point in your life if you want to be a data scientist. –Cathy O’Neil I was asked on several different occasions to explain hypothesis […]

Statistical Notes (3): Confidence Intervals for Binomial Proportion Using SAS

A guy notices a bunch of targets scattered over a barn wall, and in the center of each, in the "bulls-eye," is a bullet hole. "Wow," he says to the farmer, "that’s pretty good shooting. How’d you do it?" "Oh," says the farmer, "it was easy. I painted the targets after I shot the holes." […]

Statistical Notes (2): Equivalence Testing and TOST (Two One-Sided Test)

Programmers Need to Learn Statistics Or I Will Kill Them All –Zed A. Shaw In an equivalence testing example against lognormal data, a TOST (Two One-Sided Test) option is used in the SAS TTEST procedure: proc ttest data=auc dist=lognormal tost(0.8, 1.25); paired TestAUC*RefAUC; run; And the output: Since the 90% (why not 95%? see below) limit of […]

Statistical Notes (1): Geometric Mean and Geometric Mean Ratio

Programmers Need to Learn Statistics Or I Will Kill Them All –Zed A. Shaw I just read that since SAS 9.2, the TTEST procedure also natively supports equivalence testing by simply adding a TOST (two one-sided tests) option. In an example, the TTEST procedure reports a geometric mean of 0.9412, which is the geometric mean of a ratio, […]
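As a rough illustration of what "the geometric mean of a ratio" means for paired data, here is a minimal Python sketch. The AUC values are made up for the example and are not the data behind the 0.9412 figure; the variable names are likewise hypothetical. It only shows that the geometric mean of the paired ratios, the ratio of the two geometric means, and the exponentiated mean log difference are the same number.

```python
import math
from statistics import geometric_mean

# Hypothetical paired AUC values (test vs. reference); not taken from the post.
test_auc = [103.0, 95.0, 110.0, 88.0, 99.0]
ref_auc = [108.0, 101.0, 112.0, 97.0, 104.0]

# Per-subject ratio of test to reference.
ratios = [t / r for t, r in zip(test_auc, ref_auc)]

# 1) Geometric mean of the paired ratios.
gm_of_ratios = geometric_mean(ratios)

# 2) Ratio of the two geometric means.
ratio_of_gms = geometric_mean(test_auc) / geometric_mean(ref_auc)

# 3) Back-transformed mean of the log differences.
log_diffs = [math.log(t) - math.log(r) for t, r in zip(test_auc, ref_auc)]
exp_mean_log_diff = math.exp(sum(log_diffs) / len(log_diffs))

# All three agree up to floating-point rounding.
print(gm_of_ratios, ratio_of_gms, exp_mean_log_diff)
```

The third form is the one that connects to a t test on log-transformed data: the test works with the mean log difference, and exponentiating it gives the geometric mean ratio.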

SAS Beats R on July 2012 TIOBE Rankings

The TIOBE Community Programming Index ranks the popularity of programming languages, but from a programming language perspective rather than as analytical software (http://www.tiobe.com). It extracts measurements from blogs, entries in Wikipedia, books on Amazon, search engine results, etc. and combines them into a single index. …

100% stacked bar chart in SAS’s SGPLOT

A 100% stacked bar chart is useful for comparing the relative frequencies of an m x n table where frequencies in m are very different. While this is easy to do in Excel, SAS requires an extra step, which you …

Why R is Hard to Learn

The open source R software for analytics has a reputation for being hard to learn. It certainly can be, especially for people who are already familiar with similar packages such as SAS, SPSS or Stata. Training and documentation that leverages …