# Data-driven simulation

September 27, 2017
In a large simulation study, it can be convenient to have a "control file" that contains the parameters for the study. My recent article about how to simulate multivariate normal clusters demonstrates a simple example of this technique. The simulation in that article uses an input data set that contains

# The ‘virtual wall’ along the US/Mexican border

September 25, 2017
Recent news reports indicate that the number of illegal immigrants apprehended along the US/Mexican border has dropped significantly since Trump took office. So, even though Trump hasn't had time to build a physical wall, perhaps there's a virtual wall...

# Simulate multivariate normal data in SAS by using PROC SIMNORMAL

September 25, 2017
My article about Fisher's transformation of the Pearson correlation contained a simulation. The simulation uses the RANDNORMAL function in SAS/IML software to simulate multivariate normal data. If you are a SAS programmer who does not have access to SAS/IML software, you can use the SIMNORMAL procedure in SAS/STAT software to

# ggformula: another option for teaching graphics in R to beginners

September 21, 2017
A previous entry (http://sas-and-r.blogspot.com/2017/07/options-for-teaching-r-to-beginners.html) describes an approach to teaching graphics in R that also “get students doing powerful things quickly”, as David Robinson suggested.

In this guest blog entry, Randall Pruim offers an alternative way based on a different formula interface. Here's Randall:

For a number of years I and...

# Amusement park attendance (could Wikipedia be wrong?!?)

September 20, 2017
How do the North American amusement parks compare in popularity? If this question were to come up during a lunch discussion, I bet someone would pull out their smartphone and go to Wikipedia for the answer. But is Wikipedia the definitive answer - how can we tell if Wikipedia is wrong?

# Fisher’s transformation of the correlation coefficient

September 20, 2017
Pearson's correlation measures the linear association between two variables. Because the correlation is bounded between -1 and 1, the sampling distribution for highly correlated variables is highly skewed. Even for bivariate normal data, the skewness makes it challenging to estimate confidence intervals for the correlation, to run one-sample hypothesis tests ("Is

# The path of zip codes

September 18, 2017
Toe bone connected to the foot bone, Foot bone connected to the leg bone, Leg bone connected to the knee bone,... — American Spiritual, "Dem Bones"

Last week I read an interesting article on Robert Kosara's data visualization blog. Kosara connected the geographic centers of the US zip codes in

# WUSS 2017: The Papers

September 16, 2017
The Western Users of SAS Software 2017 conference is coming to Long Beach, CA, September 20-22.  I have been to a lot of SAS conferences, but WUSS is always my favorite because it is big enough for me to learn a lot, but small enough to be really friendly. If you come I hope...

September 15, 2017
A while back, I wrote about the proliferation of interfaces for writing SAS programs.  I am reposting that blog here (with a few changes) because a lot of SAS users still don’t understand that they have a choice. These days SAS programmers have more choices than ever before about how to run SAS.  They...

