This post was kindly contributed by The SAS Dummy - go there to comment and to read the full post. |
Rick Wicklin and I are engaged in an arms race of birthday-related blog posts. To recap:
- Rick analyzed national data about births in the USA and what time of year they are most likely to occur.
- I responded by analyzing the birthdays of my Facebook friends.
- Rick responded by analyzing yet another sample of birthdays to resolve a controversy over the birthday “seasons”.
Now I have no choice but to respond again. This isn’t my fault. I didn’t start this.
Today, I’m going to take the data that Rick supplied and attempt to answer the deep scientific question, “What is our (zodiac) sign?”
You might as well ask the ever-reliable desktop scientist, the Magic 8-Ball: “Reply hazy, try again.” As you can see from the PROC FREQ output below, there doesn’t appear to be a clear dominant horoscope that might be influencing our collective fate.
Attributes of Virgo (meticulous and reliable) and Taurus (warm-hearted and loving) are definitely reflected in our corporate culture, but I’m not sure that these signs are concentrated enough within our population to account for it. Perhaps the most useful thing to come out of this exercise is my user-written SAS format that maps birthdates to signs of the zodiac. Here’s the complete program, which you ought to be able to run as-is in SAS 9.2 or later, or in SAS Enterprise Guide.
    filename bdays url "http://blogs.sas.com/content/iml/files/2011/09/SASBirthdays.csv"
      /* behind a corporate firewall? don't forget the PROXY= option here */
      ;

    /* SAS format for zodiac signs in a given year */
    proc format lib=work;
      value sign
        '21Mar2000'd - '19Apr2000'd = 'Aries'
        '20Apr2000'd - '20May2000'd = 'Taurus'
        '21May2000'd - '20Jun2000'd = 'Gemini'
        '21Jun2000'd - '22Jul2000'd = 'Cancer'
        '23Jul2000'd - '22Aug2000'd = 'Leo'
        '23Aug2000'd - '22Sep2000'd = 'Virgo'
        '23Sep2000'd - '22Oct2000'd = 'Libra'
        '23Oct2000'd - '21Nov2000'd = 'Scorpio'
        '22Nov2000'd - '21Dec2000'd = 'Sagittarius'
        /* split Capricorn to make two valid ranges */
        /* that don't span the calendar boundary */
        '22Dec2000'd - '31Dec2000'd = 'Capricorn'
        '01Jan2000'd - '19Jan2000'd = 'Capricorn'
        '20Jan2000'd - '18Feb2000'd = 'Aquarius'
        '19Feb2000'd - '20Mar2000'd = 'Pisces'
        other = 'Unknown';
    run;

    data bdays;
      infile bdays dsd firstobs=2;
      input mon day;
      length birthdate 8 birthsign 8;
      format birthdate date5.;
      format birthsign sign.;
      label birthsign="Zodiac sign";
      /* make sure we pick a leap year, so 29Feb is valid */
      birthdate = mdy(mon,day,2000);
      birthsign = birthdate;
    run;

    ods graphics on / height=400 width=800;
    title "What's our sign?";
    ods noproctitle;
    proc freq data=bdays order=data;
      tables birthsign / plots=freqplot(scale=percent);
    run;

    /* clear the filename */
    filename bdays;
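Because the format's ranges are all anchored to the year 2000, a full date of birth from any other year would fall into the 'Unknown' bucket unless it is first re-anchored, just as the DATA step above does with `mdy(mon,day,2000)`. Here is a minimal sketch of that lookup for a single date. It assumes the `sign` format from the program above already exists in the WORK library; the variable names `dob`, `anchored`, and `zodiac` and the sample date are made up for illustration.

```sas
/* Sketch only: assumes PROC FORMAT above has already created the sign format */
data _null_;
  dob = '14Feb1985'd;  /* hypothetical date of birth, for illustration */
  /* re-anchor to leap year 2000 so the value lands inside the format's ranges */
  anchored = mdy(month(dob), day(dob), 2000);
  /* PUT applies the user-written format and returns the label as text */
  zodiac = put(anchored, sign.);
  put dob= date9. zodiac=;  /* writes both values to the SAS log */
run;
```

With the ranges as defined, February 14 falls between 20Jan and 18Feb, so this should resolve to Aquarius. Using 2000 as the anchor year matters for the same reason noted in the original program: it is a leap year, so a 29Feb birthday still produces a valid date.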
I’m sure that the question has occurred to you: what about my Facebook friends? How are they distributed among the stars? Of those that report their birthdays, here is how they fall:
As you can see, there really isn’t a dominant sign among them. However, Aquarius might be a bit underrepresented. That’s a shame, because I could probably use more honest and loyal people among my friends.