Who is Alfred?

This post was kindly contributed by From a Logical Point of View » SAS - go there to comment and to read the full post.

Tell me something about Alfred: male or female? Age? Height and weight?

Oracle Database (version 9 and below) had a well-known default demo account, SCOTT, with the password TIGER (Tiger was the name of a cat belonging to a real person, Bruce Scott). The account holds a few tables named DEPT, EMP, BONUS and SALGRADE. Almost every Oracle DBA learned SQL on these tables, and the joke goes that at DBA meetings people warm up by asking “how about Smith?” And you should know that in the database, Smith is a clerk and his boss is Ford (whose boss is Jones)!

At the top of this post I raised a similar question for SAS programmers: who is Alfred? Don’t give a quick answer like “Alfred who?” Actually, as a SAS programmer, you should already know Alfred very well:

proc print data=sashelp.class;
    where name="Alfred";
run;

As a clinical SAS programmer, I play with data and get acquainted with the data and the subjects, and then the subjects are no longer just “subjects”. They have identities, and Alfred is a 14-year-old boy. I have this habit mostly because in the clinical world data are very expensive (unlike the massive transaction data in the financial industry) and deserve more care.

I dare say that “class” is the most famous SAS dataset in the sashelp library, and therefore in the SAS world. The first dataset used in a demo is almost always this “class”. I just did a quick Google search: “sas sashelp.class” returns about 44,400 results. I hope you can find another SAS dataset to beat it.

Alfred in “class” popped into my mind because today I did find a strong candidate. In SAS 9.2 (and 9.3), the sashelp library has a new member, Iris. Yes, it is the “Fisher Iris Flower Data”, which can safely be considered the most famous and most used dataset in machine learning and data mining papers and statistical applications. Currently it has only 859 hits in Google, but I think the number will climb with the wider use of SAS 9.2 and above, and to back up my prediction, I will definitely play with the Iris data in the near future!
