Translate SAS’s sas7bdat format to SQLite and Pandas

This post was kindly contributed by DATA ANALYSIS - go there to comment and to read the full post.

Thanks Jared Hobbs’ sas7bdat package, Python can read SAS’s data sets quickly and precisely. And it will be great to have a few extension functions to enhance this package with SQLite and Pandas.
The good things to transfer SAS libraries to SQLite:
  1. Size reduction:
    SAS’s sas7bdat format is verbose. So far successfully loaded 40GB SAS data to SQLite with 85% reduction of disk usage.
  2. Save the cost to buy SAS/ACCESS
    SAS/ACCESS costs around $8,000 a year for a server, while SQLite is accessible for most common softwares.
The good things to transfer SAS data set to Pandas:
  1. Pandas’ powerful Excel interface:
    Write very large Excel file quickly as long as memory can hold data.
  2. Validation of statistics
    Pandas works well with statsmodels and scikit-learn. Easy to validate  SAS’s outputs. 

This post was kindly contributed by DATA ANALYSIS - go there to comment and to read the full post.