Really useful R package: sas7bdat

This post was kindly contributed by SAS and R - go there to comment and to read the full post.

For SAS users, one hassle in trying things in R, let alone migrating, is the difficulty of getting data out of SAS and into R. In our book (section 1.2.2) and in a blog entry we’ve covered getting data out of SAS native data sets. Unfortunately, for all of these methods, you need a working, licensed version of SAS.

However Matt Shotwell has reverse-engineered the sas7bdat file format. This means that you can now read a SAS data set without a working copy of SAS. This is a wonderful thing, and in fact SAS Institute ought to have provided this ability long ago. The package is experimental, but it worked fine for two small data sets. Matt tells me that as of 7/2011, the package only works for sas7bdat files generated on 32-bit Windows systems.

R
Install the package sas7bdat. The use the read.sas7bdat() function.


library(sas7bdat)
helpfromSAS = read.sas7bdat("http://www.math.smith.edu
/sasr/datasets/help.sas7bdat")

(Note that newlines are not allowed in the URL in practice, but formatting for the blog required it.)


> is.data.frame(helpfromSAS)
[1] TRUE
> summary(helpfromSAS$MCS)
Min. 1st Qu. Median Mean 3rd Qu. Max.
6.763 21.680 28.600 31.680 40.940 62.180
> with(helpfromSAS, summary(SUBSTANCE))
alcohol cocaine heroin
177 152 124

It’s unclear why all the variable names are all capitalized. That didn’t happen in another trial, so it must be something about the way the help.sas7bdat data set was constructed.

This post was kindly contributed by SAS and R - go there to comment and to read the full post.