This post was kindly contributed by The SAS Dummy - go there to comment and to read the full post.
“Code golf” is a fun programming pastime that challenges you to solve a problem with the least code possible. As in regular golf, the goal is to use the fewest “strokes” to hit the mark. Here’s a recent challenge that was sent to me via Twitter.
@cjdinger @SASJedi got a fun puzzle for you guys, we've been discussing at my office.
You have a character var with the string "000112010302". What's the least about of code that can be written to determine what is the highest number (3) in the string?— Wes (@SigurWes) July 17, 2018
While I feel that I can solve nearly any problem (that I can understand) using SAS, my knowledge of the SAS language is quite limited when compared to that of many experts. And so, I reached out to the SAS Support Communities for help on this one.
The answers were quick, creative, and diverse. I’ll share a few of them here.
The winner, in terms of concision, came from FreelanceReinhard. He supplied a macro-function one-liner:
%sysfunc(findc(123456789,00112010302,b));
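With this entry, FreelanceReinhard defied the natural algorithmic instinct to treat this as a numerical digit-comparison problem and instead approached it as a simple pattern-matching problem. The highest digit comes from a finite set (0..9). The FINDC function searches the string 123456789 for any character that appears in the target string and returns the position of the match; because each digit sits at the position equal to its value, that position is the answer. The b modifier tells FINDC to search backwards, from ‘9’ down to ‘1’, so the first match is the highest digit. If the target contains only zeros, FINDC finds nothing and returns 0, which is still the right answer.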
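To see the backward search at work, here is a minimal DATA step sketch. Only the first test string comes from the challenge; the other two are hypothetical values added to show the all-zeros case and a string containing a 9.

```sas
data _null_;
   length str $12;
   /* '0000' and '90817' are made-up test values, not from the original post */
   do str = '00112010302', '0000', '90817';
      /* Search '123456789' from right to left for any character found in str. */
      /* The position of the match equals the digit itself; 0 means no match.  */
      pos = findc('123456789', str, 'b');
      put str= pos=;   /* pos should be 3, then 0, then 9 */
   end;
run;
```

The trailing blanks that pad str to 12 characters are harmless here, because the searched string 123456789 contains no blanks to match.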
In a similar vein, novinosrin’s approach uses the COMPRESS function to keep only those digits of the pattern 9876543210 that occur in the string, still in descending order, and then applies the FIRST function to return the top value.
a=first(compress('9876543210','00112010302','k'));
The COMPRESS function is often used to eliminate matching characters from a string, but the k directive inverts the action to keep only the matching characters instead.
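Splitting the one-liner into two steps makes the intermediate result visible. This is only an illustrative sketch; the variable name kept is an assumption introduced here, not part of novinosrin's code.

```sas
data _null_;
   str  = '00112010302';
   /* With the k modifier, COMPRESS keeps the characters of the first    */
   /* argument that appear in str, in their original (descending) order. */
   kept = compress('9876543210', str, 'k');   /* 3210 */
   /* FIRST returns the first character of its argument. */
   a    = first(kept);                        /* 3 */
   put kept= a=;
run;
```

Because the pattern is written from 9 down to 0, whatever survives the filter starts with the highest digit present in the string.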
If you wanted to use the more traditional approach of looping through values, comparing, and keeping just the maximum value, then you can hardly do better than the code offered by hashman.
do j = 1 to length (str) ; d = d <> input (char (str, j), 1.) ; end ;
Experienced SAS programmers will remember that in DATA step expressions the <> operator is shorthand for MAX (as opposed to “not equal,” as some of us learned in Pascal or SQL). “MAX” might be clearer to read, but it requires an additional character. (Likewise, “><” is shorthand for the MIN operator in SAS.)
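A short sketch of the two operators, with made-up operands, may help. It also shows why the loop works even though d is never initialized: when a missing value takes part in the comparison, it sorts lower than any number, so the first pass yields the first digit.

```sas
data _null_;
   x = 3 <> 7;      /* MAX operator in a DATA step expression: 7 */
   y = 3 >< 7;      /* MIN operator: 3                            */
   z = max(3, 7);   /* MAX function, same result as x: 7          */
   m = . <> 0;      /* missing sorts lowest, so the result is 0   */
   put x= y= z= m=;
run;
```

Note that in WHERE expressions and in PROC SQL, <> keeps its “not equal” meaning, which is one reason the spelled-out MAX is safer in code that others will maintain.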
AhmedAl_Attar offered the most dangerous approach, using memory manipulation techniques to populate members of an array:
array ct [20] $1 _temporary_; call pokelong (str,addrlong(ct[1]),length(str)); c=max(of ct{*});
CALL POKELONG and ADDRLONG are documented along with several cautions due to the risk of overwriting something important in your process or system memory. But, they are fast-acting.
And finally, I knew that there would be an elegant matrix-based approach in SAS/IML. ChanceTGardener offered the first variant, and then Rick Wicklin echoed it shortly after.
proc iml; str='000112010302'; maximum=max((substr(str,1:length(str),1))); print maximum; quit;
Code golf does not always produce the most readable, maintainable code. But puzzles like these encourage us to explore new features and nuanced behaviors of our favorite programming language, and thus broaden our understanding of how SAS really works.
Appendix: Code for featured solutions
Want to experiment with these different approaches? Here’s a SAS program that combines all of them. Think you can do better (or different)? Visit the communities topic and chime in.
data max;
  str = '00112010302';

  /* novinosrin's approach */
  a=first(compress('9876543210',str,'k'));

  /* FreelanceReinhard's approach */
  b=findc('123456789',str,-9);

  /* AhmedAl_Attar's approach using POKELONG */
  array ct [20] $1 _temporary_;
  call pokelong (str,addrlong(ct[1]),length(str));
  c=max(of ct{*});

  /* loop approach from hashman */
  /* remember that <> is MAX */
  do j = 1 to length (str) ;
    d = d <> input (char (str, j), 1.) ;
  end ;
  drop j;
run;

/* FreelanceReinhard's approach in a one-liner macro function */
%let str=00112010302;
%put max=%sysfunc(findc(123456789,&str.,b));

/* IML approach from ChanceTGardener */
/* Requires SAS/IML to run */
proc iml;
  str='000112010302';
  maximum=max((substr(str,1:length(str),1)));
  print maximum;
quit;
The post SAS code golf: find the max digit in a string of digits appeared first on The SAS Dummy.