CAS-Action! Advanced Frequency Tables – Part 4

This post was kindly contributed by SAS Users - go there to comment and to read the full post.

Welcome back to my SAS Users blog series CAS Action! – a series on fundamentals. The previous posts show how to use the simple.freq CAS action to generate, save and group simple frequency tables. In this post I will show you how to use the freqTab.freqTab CAS action to generate more advanced one-way frequency and crosstabulation tables.

In this example, I will use the CAS language (CASL) to execute the freqTab CAS action. Instead of using CASL, I could execute the same action with Python, R and more with some slight changes to the syntax for the specific language. Refer to the documentation for syntax in other languages.

Load the demonstration data into memory

I’ll start by executing the loadTable action to load the WARRANTY_CLAIMS_0117.sashdat file from the Samples caslib into memory. By default the Samples caslib should be available in your SAS Viya environment. I’ll load the table to the Casuser caslib and then I’ll clean up the CAS table by renaming and dropping columns to make the table easier to use. For more information how to rename columns check out my previous post. Lastly I’ll execute the fetch action to preview 5 rows.

proc cas;
   * Specify the input/output CAS table *;
   casTbl = {name = "WARRANTY_CLAIMS", caslib = "casuser"};
 
   * Load the CAS table into memory *;
    table.loadtable / 
        path = "WARRANTY_CLAIMS_0117.sashdat", caslib = "samples",
        casOut = casTbl + {replace=TRUE};
 
* Rename columns with the labels. Spaces replaced with underscores *;
 
   *Store the results of the columnInfo action in a dictionary *;
   table.columnInfo result=cr / table = casTbl;
 
   * Loop over the columnInfo result table and create a list of dictionaries *;
   listElementCounter = 0;
   do columnMetadata over cr.ColumnInfo;
	listElementCounter = listElementCounter + 1;
	convertColLabel = tranwrd(columnMetadata['Label'],' ','_');
	renameColumns[listElementCounter] = {name = columnMetadata['Column'], rename = convertColLabel, label=""};
   end;
 
   * Rename columns *;
   keepColumns = {'Campaign_Type', 'Platform','Trim_Level','Make','Model_Year','Engine_Model',
                  'Vehicle_Assembly_Plant','Claim_Repair_Start_Date', 'Claim_Repair_End_Date'};
   table.alterTable / 
	name = casTbl['Name'], caslib = casTbl['caslib'], 
	columns=renameColumns,
	keep = keepColumns;
 
   * Preview CAS table *;
   table.fetch / table = casTbl, to = 5;
quit;

The results above show a preview of the warranty_claims CAS table.

One-way frequency tables

To create more advanced one-way frequency tables, use the freqTab.freqTab CAS action. In the freqTab action, use the table parameter to specify the CAS table and the tabulate parameter to specify the column, or columns, to analyze. The tabulate parameter is extremely flexible and provides a variety of ways to analyze your data. In this example, I’ll specify the warranty_claims CAS table and the Campaign_Type and Make columns as a list in the tabulate parameter.

proc cas;
   * CAS table reference *;
   casTbl = {name = "WARRANTY_CLAIMS", caslib = "casuser"};
 
   * One-way frequency tables *;
   freqTab.freqTab / 
       table = casTbl,
       tabulate = {'Campaign_Type','Make'};
quit;

Results

The results above show the freqTab action generates a variety additional information compared to the freq action. Information including the number of observations used, variable level and timing. The freqTab action frequency tables contain the expected total frequency of each value; moreover, it also includes the total percentage, cumulative frequency and cumulative percent.

Two-way crosstabulation tables

Instead of producing one-way frequency tables, you can also create two-way crosstabulation tables. One way is to continue to add elements in the list in the tabulate parameter. Here a one-way frequency table will be created for Campaign_Type and Make as seen before. Then, I’ll add a dictionary in the list. The vars key specifies the Make column, and the cross key specifies the Campaign_Type and Model_Year columns. The columns in the cross key are paired with those specified in vars. This example will produce a two-way crosstabulation between Make by Campaign_Type and Make by Model_Year.

proc cas;
   * CAS table reference *;
   casTbl = {name = "WARRANTY_CLAIMS", caslib = "casuser"};
 
   * One-way frequency and two-way crosstabulation tables *;
   freqTab.freqTab / 
      table = casTbl,
      tabulate = {
 		'Campaign_Type',
		'Make',
              	{vars = 'Make', cross = {'Campaign_Type', 'Model_Year'}}
      };
quit;

Partial results

The results above show the freqTab action returns the one-way frequency tables for Campaign_Type and Make as shown earlier and displays the two-way crosstabulation between Make by Campaign_Type and Make by Model_Year.

While this is great, what if I want to avoid the total row and display the Make for each row in the two-way crosstabulation as a single table?

Two-way crosstabulation as a single table

To present the results as a single table and remove the total row, add the tabDisplay parameter with the value list. In the code, I’ll remove the one-way frequencies and add the tabDisplay parameter.

proc cas;
   * CAS table reference *;
   casTbl = {name = "WARRANTY_CLAIMS", caslib = "casuser"};
 
   * Two-way crosstabulation as a single table *;
   freqTab.freqTab / 
       table = casTbl,
       tabulate = {
              	{vars = 'Make', cross = {'Campaign_Type', 'Model_Year'}}
       }, 
       tabDisplay='list';
quit;

Partial results

The results show each two-way crosstabulation as a single table and the total row is removed.

Next, what if you want to create a three-way crosstabulation table? Well this is a bit tricky.

Three-way crosstabulation

To produce a three-way crosstabulation, specify the columns as a list within the vars parameter, with the tabDisplay parameter equal to list. If you do not specify the tabDisplay parameter, the freqTab action will return three two-way crosstabulation tables. Each table will be created for each distinct group of the first column specified in the list. This example will create a crosstabulation of Model_Year by Campaign_Type by Make.

proc cas;
   * CAS table reference *;
   casTbl = {name = "WARRANTY_CLAIMS", caslib = "casuser"};
 
   * Three-way crosstabulation table *;
   freqTab.freqTab / 
       table = casTbl,
       tabulate = {
		{vars={'Model_Year','Campaign_Type','Make'}}
       }, 
       tabDisplay='list';
quit;

Partial results

Notice in the results above a three-way crosstabulation table was created. The order of the columns is created by the order of the columns specified in the list.

Summary

The freqTab.freqTab CAS action provides a variety of ways to create frequency and crosstabulation tables in the distributed CAS server. There are a variety of parameters you can add to modify it to meet your objectives. We’ve only scratched the surface!

Additional resources

freqTab CAS action
SAS® Cloud Analytic Services: CASL Programmer’s Guide 
CAS Action! – a series on fundamentals
Getting Started with Python Integration to SAS® Viya® – Index
SAS® Cloud Analytic Services: Fundamentals

CAS-Action! Advanced Frequency Tables – Part 4 was published on SAS Users.

This post was kindly contributed by SAS Users - go there to comment and to read the full post.