A fresh helping of hash: the SHA256 function in SAS 9.4m1

This post was kindly contributed by The SAS Dummy - go there to comment and to read the full post.

For several releases, SAS has supported a cryptographic hash function called MD5, or “message digest”. In SAS 9.4 Maintenance 1, the new SHA256 function can serve the same purpose with a stronger algorithm.

The job of a hash function is to take some input (of any type and of any size) and distill it to a fixed-length series of bytes that should, for practical purposes, be unique to that input. As a practical example, systems use this to check the integrity of file downloads. After downloading, you compute the hash of the file you received and verify that it matches the published hash.
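To make the "published versus actual" check concrete, here's a minimal sketch using the SHA256 function that this post is about. It hashes a short string rather than a real download, and the "published" value is the standard SHA-256 test vector for the string "abc". The variable names are just for illustration.

```sas
data _null_;
  /* "Published" digest: the standard SHA-256 test vector for "abc" */
  published = 'BA7816BF8F01CFEA414140DE5DAE2223B00361A396177A9CB410FF61F20015AD';
  /* Compute the digest ourselves and render it as 64 hex characters */
  actual = put(sha256('abc'), $hex64.);
  if actual = published then put 'Digest matches';
  else put 'Digest does NOT match';
run;
```

Change a single character of the input and the comparison fails, which is exactly the property that makes the check useful.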

Sometimes a hash is used to track changes in records within a database. You first calculate a hash value for each data record based on all of the fields. Periodically, you recheck those calculations. If a hash value changes for a data record, you know that some part of that record has changed since the last time you looked.
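Here's a rough sketch of that record-tracking idea with the SHA256 function. It assumes SASHELP.CLASS as a stand-in for your real table, and the data set names (snapshot1, snapshot2) and the row_hash variable are made up for this example. The fields are joined with a delimiter before hashing so that values can't run together, and because CATX skips blank character values, this version assumes the fields are populated.

```sas
/* First look: one digest per record, built from all fields */
data work.snapshot1;
  set sashelp.class;
  length row_hash $32;
  format row_hash $hex64.;
  row_hash = sha256(catx('|', name, sex, age, height, weight));
run;

/* Later look: simulate a change to one record, then rehash */
data work.snapshot2;
  set sashelp.class;
  if name = 'Alfred' then weight = weight + 5;
  length row_hash $32;
  format row_hash $hex64.;
  row_hash = sha256(catx('|', name, sex, age, height, weight));
run;

/* Records whose digest differs between the two looks */
proc sql;
  select n.name
  from work.snapshot1 as o
       inner join work.snapshot2 as n
       on o.name = n.name
  where o.row_hash ne n.row_hash;
quit;
```

Only the record that was altered should show up in the report, since every other record hashes to the same value both times.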

Here’s another common use: storing passwords in a database. Because you can’t (theoretically) reverse the hash process, you can use a hash function to verify that a supplied password is the same as a value you’ve stored, without having to store the original clear-text version of the password. It’s not the same as encryption, because there is no decryption method that would compromise the original supplied password value.
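The comparison itself is simple. This is a bare sketch with hypothetical variable names and a made-up password; in a real system the stored value would be read from a table, and you would normally add a per-user salt, which is left out here.

```sas
data _null_;
  length stored_hash supplied_hash $32;
  /* Hypothetical stored digest; only this value is kept, never the password */
  stored_hash   = sha256('correct horse battery staple');
  /* Digest of whatever the user typed at login */
  supplied_hash = sha256('correct horse battery staple');
  if supplied_hash = stored_hash then put 'Password accepted';
  else put 'Password rejected';
run;
```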

MD5 has known vulnerabilities, especially with regard to uniqueness. A malicious person can use a relatively low-powered computer to construct two different inputs that produce an identical MD5 hash (a "collision"), thus compromising the algorithm’s usefulness wherever you rely on a hash being unique to its input.

Enter the SHA256 algorithm. It’s the same idea as MD5, but without the known vulnerabilities. Here’s a program example:

data _null_;
  /* $hex64. prints the 32-byte digest as 64 hexadecimal characters */
  format hash $hex64.;
  hash = sha256("SHA256 is part of SAS 9.4m1!");
  put hash;
run;

Output (formatted as hexadecimal so as to be easier on the eyes than 256 ones-and-zeros):

876CF270E81BA3E6219F9518AD9CBE303D8EEC734D4B5966F8D4FD9E89449C6C

As the name implies, it produces a value that is 256 bits (32 bytes) in size, as compared to 128 bits from MD5. Here’s a useful article that compares the effectiveness of hash algorithms.
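If you want to see the size difference for yourself, this small sketch hashes the same message with both functions. The variable names are arbitrary; MD5 returns 16 bytes and SHA256 returns 32, so the hex formats are sized at twice those lengths.

```sas
data _null_;
  length md5_hash $16 sha_hash $32;
  format md5_hash $hex32. sha_hash $hex64.;
  msg = "SHA256 is part of SAS 9.4m1!";
  md5_hash = md5(msg);     /* 128-bit digest */
  sha_hash = sha256(msg);  /* 256-bit digest */
  put md5_hash=;
  put sha_hash=;
run;
```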

The SHA256 function was added in SAS 9.4 Maintenance 1. If you’ve been wanting to hash your data in SAS, but you’ve been pooh-poohing the MD5 function, well, now is your chance!

