Using SAS and ODS PACKAGE to create ZIP files

This post was kindly contributed by The SAS Dummy - go there to comment and to read the full post.

SAS users are big data consumers and big data creators. Often, we have to deal with large data files (or many smaller files), and that means ZIP compression. Compression tools such as gzip, 7-Zip, and WinZip are ubiquitous, but they aren’t always convenient to use from within a SAS program. To use an external ZIP utility you must issue a shell command via the X statement or the SYSTASK statement, and that’s not always possible within today’s complex SAS environments.

Fortunately, SAS can read and write ZIP files directly. Ever since SAS 9.2, we’ve been able to create ZIP files with ODS PACKAGE. Beginning with SAS 9.4, we can read ZIP content by using FILENAME ZIP.

In this post, I’ll review how to create ZIP files using ODS PACKAGE. I’ll cover reading ZIP files with FILENAME ZIP in a future post.

Let’s pretend that I’m working for a government agency, and that part of my job is to crunch some government data and publish it for the public. Of course, I’m using SAS for the analysis, but I need to publish the data in a non-proprietary format such as CSV. (It seems unbelievable, I know, but not every citizen is lucky enough to have access to SAS.)

First, I’ll set up the output directory for this project. Since the ZIP file will contain a couple of files, including a subfolder, I want to mirror that structure here. The FEXIST and FDELETE functions will delete an existing ZIP file (perhaps left over from the last time I ran the process). The DLCREATEDIR option will create a “data” subfolder as needed. All of these mechanisms interact with the file system, but do not require XCMD privileges. This means that they’ll work in SAS Enterprise Guide and stored processes.

%let projectDir = c:\projects\sgf2013\filenamezip;
 
/* Clean slate: delete the ZIP file left over from a previous run, if any */
filename newfile "&projectDir./carstats.zip";
data _null_;
  if (fexist('newfile')) then
    rc = fdelete('newfile'); /* rc = 0 means the file was deleted */
run;
filename newfile clear;
 
/* Create folder if it doesn't exist */
options dlcreatedir;
libname out "&projectDir./data";
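
If you are not sure whether your own session allows shell commands, a quick check like the following can tell you. This is only a sketch: the macro name checkXcmd is made up for illustration, and it simply reports the current value of the XCMD system option in the log.

```sas
/* Hypothetical helper: report whether shell commands are allowed */
%macro checkXcmd;
  /* GETOPTION returns XCMD or NOXCMD for this option */
  %if %sysfunc(getoption(XCMD)) = XCMD %then
    %put NOTE: XCMD is enabled, so shell commands are allowed.;
  %else
    %put NOTE: NOXCMD is in effect, so external ZIP tools cannot be called.;
%mend checkXcmd;
%checkXcmd
```

With NOXCMD in effect (the usual case in SAS Enterprise Guide and stored processes), the FEXIST, FDELETE, and DLCREATEDIR steps above still work, because they do not go through the shell.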

Next, I need to create the content to include in the ZIP file. In this scenario, I’m crunching some heavy-duty numbers about Cars data, and then putting the results into a CSV file. Then I’m creating a README file in RTF format; the document contains a simple data dictionary plus instructions (such as they are) for using the data. I used the TEXT= option of the ODS RTF statement to throw in some ad-hoc text among the SAS output.

/* Create some data: percentiles of MSRP, saved to OUT.PCT */
filename newcsv "&projectDir./data/pct.csv";
proc means noprint data=sashelp.cars;
  var msrp;
  output out=out.pct median=p50 p95=p95 p99=p99;
run;
/* Write the results to the CSV file */
ods csv file=newcsv;
proc print data=out.pct;
  format _all_; /* clear the formats */
run;
ods csv close;
 
/* Create an informative document about this package */
filename rm "&projectDir./readme.rtf";
ods rtf(readme) 
  file="&projectDir./readme.rtf" style=Printer;
ods rtf(readme) 
  text="These are some instructions for what to do next";
proc datasets lib=out nolist;
contents data=pct;
quit;
ods rtf(readme) close;

Finally, I’m going to take those results and package them in a ZIP file. The ODS PACKAGE mechanism was originally designed to share results from a SAS stored process. By default, it adds a PackageMetaData entry that a consuming SAS application could use to interpret the result. In this case we don’t need this entry; the NOPF option suppresses it.

Notice that I specify the PATH= option to place the CSV file in the “data” folder within the archive. As soon as the ODS PACKAGE CLOSE statement executes, the ZIP file is created.

/* Creating a ZIP file with ODS PACKAGE */
ods package(newzip) open nopf;
ods package(newzip) add file=newcsv path="data/";
ods package(newzip) add file=rm;
ods package(newzip) publish archive 
  properties(
   archive_name="carstats.zip" 
   archive_path="&projectDir."
  );
ods package(newzip) close;
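
One way to confirm what ended up in the archive, without leaving SAS, is to list its members. The sketch below assumes SAS 9.4 (it relies on FILENAME ZIP) and that the &projectDir macro variable from earlier is still defined; the fileref name inzip is arbitrary. It opens the ZIP file as if it were a directory and writes each member name to the log.

```sas
/* Assumes SAS 9.4 or later: FILENAME ZIP is not available before that */
filename inzip zip "&projectDir./carstats.zip";

data _null_;
  length memname $200;
  fid = dopen("inzip");          /* treat the archive like a directory */
  if fid = 0 then do;
    put "ERROR: could not open the ZIP file.";
    stop;
  end;
  do i = 1 to dnum(fid);         /* one iteration per archive member */
    memname = dread(fid, i);
    put memname=;
  end;
  rc = dclose(fid);
run;

filename inzip clear;
```

If the package was built as shown, the log should include an entry for the README file and one for the CSV file under the data folder.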

[Screen shot: the carstats.zip file opened in WinZip]

That’s it! I can add any file that I want to the ZIP archive; I’m not restricted to files that were created by SAS. This makes it easy to use SAS to update data archives automatically on a regular schedule, and to create user-friendly packages for the people who consume our data.
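
To illustrate adding a file that SAS did not create, here is a variation on the packaging step. It is a sketch under one assumption: that a file named license.txt (a made-up name) already exists in the project folder. The filerefs newcsv and rm are the ones assigned earlier in the program.

```sas
/* Hypothetical: license.txt was created outside of SAS */
filename lic "&projectDir./license.txt";

ods package(newzip) open nopf;
ods package(newzip) add file=newcsv path="data/";
ods package(newzip) add file=rm;
ods package(newzip) add file=lic;   /* any existing file can be added */
ods package(newzip) publish archive
  properties(
   archive_name="carstats.zip"
   archive_path="&projectDir."
  );
ods package(newzip) close;

filename lic clear;
```

Because no PATH= option is given for it, the extra file lands at the top level of the archive, next to the README.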

Note: People often ask whether ODS PACKAGE (or FILENAME ZIP) supports password-protected (encrypted) ZIP files. The answer is no. If that’s a requirement, you’ll need to use an external tool such as 7-Zip.
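
For completeness, here is a rough sketch of what the external route might look like on Windows. It rests on several assumptions: XCMD is enabled for the session, the 7-Zip command-line program (7z) is on the system PATH, and the password and output file name are placeholders chosen for illustration. It uses a FILENAME PIPE so that the messages from 7-Zip are echoed to the SAS log.

```sas
/* Assumptions: XCMD is enabled and 7z is on the PATH.  */
/* The password and archive name below are placeholders. */
%let zipPassword = ChangeMe123;

filename zipcmd pipe
  "7z a -tzip -p&zipPassword. ""&projectDir.\carstats_secure.zip"" ""&projectDir.\readme.rtf""";

/* Run the command and copy its output to the SAS log */
data _null_;
  infile zipcmd;
  input;
  put _infile_;
run;

filename zipcmd clear;
```

Keep in mind that the password appears in the program text (and possibly in the log), so this approach suits convenience more than serious security.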

Download the complete program (SAS 9.3 or later): createZipODSPackage.sas
