To be and not to be – the uncertainty principle in SAS

This post was kindly contributed by SAS Users - go there to comment and to read the full post.

Uncertainty Principle blackboard

If I were to say that we live in uncertain times, that would probably be an understatement. Therefore, I won’t say that. Oops, I already did. Or did I?

For centuries, people around the world have been busy scratching their heads in search of a meaningful answer to Shakespeare’s profoundly elementary question: “To be or not to be?”

Have we succeeded? Sure. And in pursuit of even further greatness, we have progressed beyond the simple binary choice. Thanks to human ingenuity, it is now possible to have it all: to be and not to be.

But doesn’t this contradict human logic? Not at all, according to the Heisenberg uncertainty principle – a cornerstone of quantum mechanics asserting a fundamental limit to the certainty of knowledge.

According to the uncertainty principle, it is not possible to determine both the momentum and position of particles (bosons, electrons, quarks, etc.) simultaneously. Here is the famous formula:

where
Δx = uncertainty in position.
Δp = uncertainty in momentum.
h = Planck’s constant (a rare and precious number equal to 6.62607015×10−34 representing how much the energy of a photon increases, when the frequency of its electromagnetic wave increases by 1).
4π = π π π π (4 pi’s; no mathematical formula of any scientific significance can do without at least one of them!)

In addition, every particle or quantum entity may be defined as either a particle or a wave depending on how you feel about it according to the wave-particle duality principle. But let’s not let the dual meaning inconvenience us. Let’s just call them matters, or things for simplicity.

Then we can formulate the uncertainty principle in plain and clear terms:

Since it is impossible to know whether the position of a thing is X or not X, then that thing can be in position X and not be in position X simultaneously. Thus “to be and not to be”.

Capeesh?

There is an abundance of examples of the uncertainty principle in SAS software. Let’s consider several of them.

History of the present and present of the history

Some of you may remember SAS version 7.0. It’s remarkable in a way that it was the shortest-lived SAS version that lasted roughly one year. It was released in October 1998 and was replaced by SAS 8.0 in November 1999. There were no 7.1 or 7.2 sub-versions, only 7.0.

But (and this is a big BUT), have you noticed that even today the latest SAS products (9.4 and Viya) use the following version 7 file extensions?

  • .sas7bdat – SAS data set
  • .sas7bvew – SAS view
  • .sas7bndx – SAS index
  • .sas7bcat – SAS catalog
  • .sas7bmdb – SAS multi-dimensional database file
  • .sas7bdmd – SAS data mining database file

. . . and this is just a partial list.

When you define a SAS library with v9 engine
libname AAA v9 'c:\temp';

SAS log will indicate:
NOTE: Libref AAA was successfully assigned as follows:
Engine:        V9
Physical Name: c:\temp

Notice how it’s SAS Engine V9, but SAS datasets created with it have .sas7bdat extensions.

Where do you think that digit “7” came from? Obviously, even almost two decades after version 7.0’s demise it is still alive and kicking. How can you explain that other than by the uncertainty principle: “it is while it is not”!

Transience and permanence

Let’s take another example. How long have you known the fact that in order to create a permanent SAS data set you need to specify its name as a two-level name, e.g. LIBREF.DATASETNAME, while for temporary data sets you can specify a one-level name, e.g. DATASETNAME, or you can use a two-level name where the first level is WORK to explicitly signify the temporary library. Now, equipped with that “settled science” knowledge, what do you think the following code will create, a temporary or a permanent data set?

options user='c:\temp';
 
data MYDATA;
   x = 22371;
run;

Just run this code and check your c:\temp folder to make sure that data set MYDATA is permanent. Credit for this shortcut goes to the option user= . Now we can say that to create a permanent data set we can use a two-level name or one-level name, which makes it indistinguishable from temporary data sets.

To bring this uncertainty to an even higher level, you can drop MYDATA name altogether and still create a permanent data set:

options user='c:\temp';
 
data;
   x = 22371;
run;

SAS Log will show:
NOTE: The data set USER.DATA1 has 1 observations and 1 variables.

Isn’t an ultimate proof of the “to be and not to be” principle (sponsored by DATAn naming convention)!

In addition, you can create a data set by defining its physical pathname without even relying on SAS data set names, whether one or two-level:

data "c:\temp\aaa";
   x = 22371;
   format x date9.;
run;

This code runs perfectly fine, creating a SAS data set as a file named aaa.sas7bdat in the c:\temp folder.

And I am not even talking  about the NOWORKTERM option (well, I am now) which preserves all the SAS files and directory of the temporary WORK library at the termination of a SAS session, which essentially makes temporary SAS files permanent.

As you can see even “well settled science” crumbles right in front of your eyes under the certainty of the uncertainty principle.

Uncertainty principle: Final Exam

And now, ladies and gentlemen, you will have to pass your final exam to receive an official April Fools diploma from SAS University.

Problem to solve

You know that every SAS data step creates automatic variables, _N_ and _ERROR_, which are available during the data step execution. Is it possible to save those automatic variables on the output data set?

In other words, will the following code create 3 variables on the output data set ABC?

data ABC (keep=MODEL _N_ _ERROR_);
   set SASHELP.CARS(keep=MODEL);
run;

If you answered “No” you get 1 credit. If you answered “Yes” you get 0 credit. But that’s only if you answered the second question (I assume you noticed that I asked two questions in a row). If your “Yes”/ ”No” answer relates to the first question your credits are in reverse.

Bonus for creativity

However, if you not only answered “Yes” to the first question, but also provided a “how-to” code example, you get a bonus in the amount of 10 credits. Here is your bonus for creativity:

data BBC;
   set SASHELP.CARS(keep=MODEL);
   x = _n_;
   e = _error_;
   rename x=_n_ e=_error_;
run;

You still have to run this code to make sure it creates data set BBC with 3 variables: MODEL, _N_, and _ERROR_ in order to get your 10 credits vested.

Problem solved = problem created

And lastly, the final curiosity test and exercise where you find out about SAS’ no-nonsense solution in the face of uncertainty. What happens in the following data step when the SAS-created automatic data step variables, _N_ and _ERROR_, collide with the same-name variables brought in by the previously created BBC data set?

data CBC;
   set BBC;
run;

After you complete this test/exercise and find out the answer, you can grab your diploma below and proudly brag about it and display it anywhere.

SAS Institute diploma

WAIT! Before you leave, please do not forget to provide your answers, questions, code examples, and comments below.

More April Fools’ Day SAS articles

April 1, 2020: Theory of relativity in SAS programming
April 1, 2019: Dividing by zero with SAS
April 1, 2018: SAS discovers a new planet in the Solar System
April 1, 2017: SAS code to prove Fermat’s Last Theorem

To be and not to be – the uncertainty principle in SAS was published on SAS Users.

This post was kindly contributed by SAS Users - go there to comment and to read the full post.