Last week I wrote about how our test cases should be considered an asset and added to an ever-growing library of regression tests. I had a few correspondents ask how this could be the case when their test cases would only work with specific data; the s…
Just been to the SAS Professionals road show in the new SAS office in London
- SAS has significantly hardened the security
- SAS has introduced two new scripting languages: FedSQL and DS2. The latter, DS2, is something every self-respecting SAS programmer should know. It harks back to the AF object-oriented SCL (oh, the good old days), so SAS dinosaurs like myself will feel at home. Its power, according to the presenters, is the ability to truly harness parallel programming and to code objects that are truly portable to other environments. We are just facing a case where we could have benefited from the latter feature: we created an amazing solution and now the client wants the beautiful data steps dumbed down to SQL. In the new world we can just hand over the DS2 and it will work as is in, say, Oracle.
- The IT-oriented people will be thrilled with the new embedded web server (save some licensing money there) and the shiny new SAS Environment Manager
- SAS Contextual Analysis
- Recommendation engine
SAS Proc Groovy in Action: JSON File Processing
Last year I took a bite of the new SAS Proc Groovy to read JSON data, since there was no direct “proc import”, “infile” or “libname” way to play with JSON. Here is a nice example from the official SAS blog, by Falko Schulz, where Proc Groovy is used to parse a Twitter JSON file: How […]
An alternative way to use SAS and Hadoop together
The challenges for SAS in Hadoop
Pull data through MySQL and Sqoop
On the Cluster
purchases.txt is a tab-delimited text file from a training course at Udacity. The data transfer work should be carried out on any data node of the Hadoop cluster.
MySQL
insert operations.
# Check the head of the text file that is imported on Hadoop
hadoop fs -cat myinput/purchases.txt | head -5
# Set up the database and table
mysql --user=mysql-username --password=mysql-pwd
create database test1;
use test1;
create table purchases (date varchar(10), time varchar(10), store varchar(20), item varchar(20), price decimal(7,2), method varchar(20));
Sqoop
# Use Sqoop to run MapReduce and export the tab delimited
# text file under specified directory to MySQL
sqoop export --username mysql-username --password mysql-pwd \
--export-dir myinput \
--input-fields-terminated-by '\t' \
--input-lines-terminated-by '\n' \
--connect jdbc:mysql://localhost/test1 \
--table purchases
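The export maps each tab-separated field onto one of the six columns of the purchases table, so a malformed row can make the job fail. A minimal sketch of a sanity check, assuming awk is available on the node; count_bad_rows is a hypothetical helper name and the sample rows are made up, not taken from purchases.txt:

```shell
# Hypothetical helper: print the number of input lines that do not have
# exactly six tab-separated fields (the column count of the purchases table).
# It reads from stdin, so it can sit at the end of any pipe.
count_bad_rows() {
  awk -F '\t' 'NF != 6 { bad++ } END { print bad + 0 }'
}

# Made-up sample: one well-formed row and one row with only three fields.
printf '2012-01-01\t09:00\tSan Jose\tMusic\t12.50\tVisa\n2012-01-01\t09:01\tbroken\n' | count_bad_rows
```

On the cluster the same helper could be fed from HDFS with hadoop fs -cat myinput/purchases.txt | count_bad_rows; a result of 0 would suggest every row has the expected six fields.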
On the client
proc sql;
connect to mysql (user=mysql-username password=mysql-pwd server=mysqlserv database=test1 port=11021);
select * from connection to mysql
(select * from purchases limit 10000);
disconnect from mysql;
quit;
Learn R and/or Data Management from Home January or April
If you want to learn R, or improve your current R skills, join me for two workshops that I’m offering through Revolution Analytics in January and April. If you already know another analytics package, the workshop, Intro to R for …
Test Cases, an Investment
It never ceases to frustrate and disappoint me when I hear people talking of test cases as use-once, throwaway artefacts. Any team worth its salt will be building a library of tests and will see that library as an asset and something worth investing in…