In many occasions, fast access into a lookup table to find desired value is necessary. In computer science, linked list, associative array, and hash table are widely used to construct the relationship between values and keys. Hash function, like value…
Tag: proc sql
Self-matching and its applications
Programming is all about data structure and algorithm. For example, value comparison needs to find right data structure and iteration method. To fulfill this purpose, the first thing is to load the variable with a key-value like data structure, follow…
Top 10 most powerful functions for Proc SQL
Proc SQL is actually not a standard SAS procedure but a distinctive subsystem with all features from SQL (structured query language). Equipped with it, SAS upgrades to a full-fledging relational database management system. In addition, Proc SQL always …
Five reasons to use the SAS DATA Step or PROC SQL
When I joined SAS Institute I had no knowledge of SAS. I studied, researched and practiced. The questions I asked would make even the most seasoned programmer blush a beetroot red! Those were an intense nine months—I’ll tell you more in a lat…
5D visualiztion: from SAS to Google Motion Chart
Three dimensions are usually regarded as the maximum for data presentation. With the opening of ODS from SAS 9.2 and its graph template language, 3D graphing is no longer a perplexing problem for SAS programmers. However, nowadays magnificent amount of data with multi-dimension structure needs more vivid and simpler way to be displayed.
The emerging of Google Motion Chart now provides a sound solution to visualize data in a more than three dimensions scenario. This web-based analytical technology originated from Dr. Hans Rosling’s innovation. Dr. Rosling and his Gapminder foundation invented a technology to demonstrate the relationship among multiple dimensions by animated bubbles. They developed a lot of bubble plots in Gapminder’s website to discover knowledge form a bulk of public information, especially for regional/national comparison. It soon attracted Google’s attention. In 2008 after an agreement between Dr. Rosling and Google’s two founders, Google launched its Motion Chart gadget. People could create motion chart by using Google Docs, an online alternative to Microsoft’s Office.
The combination between SAS and Google Motion Chart shows a handy and cheap way for up-to-five-dimension data visualization. For Motion Chart, it supports five variables all together in a plot. Commonly the data structure requires time(animation), var1(X axis), var2(Y axis), var3(color) and var4(bubble size). The correlation from var1 to var4 is expected: usually the bubbles with changing color and size tend to move along the diagonal line. Overall 5d visualization can be rendered within such a single plot. In this example, a SAS help dataset ‘SASHELP.SHOES’ is used. The data set has several regions to compare each other. Logged return money is Y-axis, while logged sale money is X-axis. A series of virtual time is given to each region, with inventory as bubble size and the store number as color. By SAS, the data structure in Motion Chart can be prepared quickly. Thus, once the CSV file is uploaded to Google Docs, a motion chart is ready to be published in any webpage. OK, it’s time to sit and discover some interesting tendency…
Reference: 1.’Show me–New ways of visualising data’. The Economist. Feb 25th 2010.
2.‘Making data dance’. The Economist. Dec 11st 2010.
3. Google Docs online help center. 2010.
*********(1) Extract data from SASHELP.SHOES***********;
proc sql;
create table test as
select region, Sales, Inventory, Returns, Stores
from sashelp.shoes
order by region , sales desc
;quit;
********(2) Create a random variable for time************;
data test1;
do i=1 by 1 until (last.region);
set test;
by region;
time=today()-i+1;
mytime=put(time, mmddyy8.);
drop i;
output;
end;
run;
********(3) Transform some variables with log**********;
proc sql;
create table test2 as
select region, mytime, log(sales) as logsales, log(returns) as logreturn, Stores as storenum, Inventory
from test1
order by region, mytime
;quit;
********(4) Export data as CSV***************;
proc export data=test2 outfile='C:\Users\Yanyi\Desktop\test.csv' replace;
run;
*******(5) Upload CSV to Google Docs************;
******(6) Create Google Motion Chart manually**********;
**********END*********TEST PASSED 12DEC2010****************************;
The efficiency of five SAS methods in multi-dataset merging
Introduction: Merging two or multiple datasets is essential for many ‘data people’. Yes, it is a dirty and routine job. Everyone wants to get it done quick and accurate. Actually, SAS has many ways to tackle this job[3]. In two competing papers fro…
My answer to a complex transpose question from SAS-L
The question is raised from http://www.listserv.uga.edu/cgi-bin/wa?A2=ind1005d&L=sas-l&F=&S=&P=11947proc sort data=indat out=one; by subjid vist;run;proc transpose data=one out=two ; var result; by subjid vist; run;proc transpose data=o…
How to do data cleaning
1) How to use SAS to merge base and look-up tables ? pro and con?1. array 2. sort-sort-merge;3. proc sql; 4. proc format; 5. hash objectCoding efficiency: 3>2>4>1>5I/O resource: 5>4>1>3>2flexibility: 3>>others2) What are the common methods for large…