The flu seems to be especially potent this year. “How potent is it,” you might ask? … Well, let’s plot some data on a map to help find out! Here in the US, the Centers for Disease Control and Prevention (CDC) compiles data about diseases, and provide…
A better way to view flu data
Have you, or someone you know, gotten the flu this year? Word on the street is that this year’s flu might be particularly bad, and the data seem to be corroborating that so far. You don’t want to take my word for it? — well then, let’s have a look […..
Jedi SAS Tricks – The Double vs. Decimal Dilemma
I ran across an interesting conundrum the other day. The intent was to compare the value of X to a list of desired values, and if X matched one of the values in the list, set Flag to 1, otherwise set Flag to 0. I wrote and executed this test program, …
Type I error rates in two-sample t-test by simulation
What do you do when analyzing data is fun, but you don’t have any new data? You make it up.
This simulation tests the type I error rates of the two-sample t-test in R and SAS. It demonstrates efficient methods for simulation, and it reminds the reader not to take the result of any single hypothesis test as gospel truth. That is, there is always a risk of a false positive (or false negative), so determining truth requires more than one research study.
A type I error is a false positive. That is, it happens when a hypothesis test rejects the null hypothesis when in fact it is true. In this simulation the null hypothesis is true by design, though in the real world we cannot be sure the null hypothesis is true. This is why we write that we “fail to reject the null hypothesis” rather than “we accept it.” If the hypothesis tests in this simulation made no errors, we would never reject the null hypothesis, but by design a test rejects a true null hypothesis with probability alpha, the significance level. The de facto standard for alpha is 0.05.
R
First, we run a simulation in R by repeatedly comparing randomly generated sets of normally distributed values using the two-sample t-test. Notice the simulation is vectorized: there are no “for” loops to clutter the code and slow the simulation.
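The original code for this step is not reproduced in this excerpt, so what follows is only a minimal sketch of the idea, not the author's program. The number of simulations, the sample size per group, and the seed are assumptions chosen for illustration. It draws both samples from the same standard normal distribution (so the null hypothesis is true by design), collects the p-values with replicate() instead of an explicit “for” loop, and reports the share of tests that reject at alpha.

```r
# Minimal sketch: estimate the type I error rate of the two-sample t-test.
# n_sims, n_per_group, and the seed are illustrative assumptions.
set.seed(2718)

n_sims      <- 10000  # number of simulated experiments (assumed)
n_per_group <- 30     # observations in each sample (assumed)
alpha       <- 0.05   # significance level

# Both samples come from N(0, 1), so the null hypothesis is true by design.
# replicate() repeats the expression and returns a vector of p-values.
p_values <- replicate(
  n_sims,
  t.test(rnorm(n_per_group), rnorm(n_per_group))$p.value
)

# Proportion of tests that (falsely) reject the null hypothesis.
type1_rate <- mean(p_values < alpha)
print(type1_rate)
```

If the test behaves as designed, the printed rate should land close to alpha, which means roughly one simulated experiment in twenty produces a false positive even though there is no real difference between the groups.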
For more posts like this, see Heuristic Andrew.
North Korea’s 117 missile tests
Data doesn’t always have to be ‘big data’ to be interesting. For example, I recently ran across a small, but very interesting, database containing all of North Korea’s missile tests. The data was a bit difficult to digest in the formats provided, so I decided to try my hand at […]
The post North Korea’s 117 missile tests appeared first on SAS Learning Post.
How to test PROC HTTP and the JSON library engine
Using SAS with REST APIs is fun and rewarding, but it’s also complicated. When you’re dealing with web services, credentials, data parsing and security, there are a lot of things that can go wrong. It’s useful to have a simple program that verifies that the “basic plumbing” is working before […]
The post How to test PROC HTTP and the JSON library engine appeared first on The SAS Dummy.