Using SAS to call Twitter

This post was kindly contributed by Key Happenings at support.sas.com - go there to comment and to read the full post.

Contributed by Richard Foley, Product Manager, SAS

Twitter, a microblogging platform, has become all the rage. Companies are using Twitter to inform and market to customers and the world. People use it as a way to keep in touch and let others quickly know what they are thinking and where they are. Visit Twitter at twitter.com.

Wouldn’t it be cool to use SAS as your Twitter information hub? You could post Twitter updates, called tweets, from SAS, or have SAS query Twitter and load the results into SAS data sets for further analysis.

Twitter’s API uses Web Services to allow clients, such as SAS, access to the Twitter functions. Typically you find two different types of Web Services, a SOAP style Web Service and a RESTful style Web Service. (I won’t go into the differences, but have provided links for reference.) SAS 9.2 has two new procedures to handle these services:

  • PROC SOAP for SOAP style Web Services
  • PROC HTTP for RESTful style Web Services.

Twitter has a developer API Wiki that describes the various operations available to the public along with the parameters required when invoking these methods. To follow along with this example, you’ll need a Twitter username and SAS 9.2.

According to the Twitter API documentation, many of the API methods are REST style services that require HTTP Basic Authentication. Therefore, we use PROC HTTP, which supports HTTP Basic Authentication via the two procedure options webusername and webpassword. (HTTP Basic Authentication sends your username and password with each request, Base64-encoded but not encrypted. If you’re concerned about securing your account, you shouldn’t use HTTP Basic Authentication; the credentials can be intercepted and reused, giving someone else access to your account.)
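To make the security caveat concrete, here is a minimal sketch of what ends up in a Basic Authentication header. It assumes your SAS release provides the $BASE64X format; the credentials are the same placeholders used in this post, and the variable names are only for illustration.

```sas
/* Sketch only: shows how Basic Authentication encodes credentials. */
/* Assumes the $BASE64X format is available in your SAS release.    */
data _null_;
  length creds $ 80 encoded $ 120;
  /* Placeholder credentials in the user:password form */
  creds   = 'mytwitterusername:mytwitterpassword';
  /* Base64 is an encoding, not encryption: it is trivially reversible */
  encoded = put(trim(creds), $base64x120.);
  /* The request header has the form: Authorization: Basic <encoded> */
  put 'Authorization: Basic ' encoded;
run;
```

Anyone who captures that header can decode it and recover the password, which is why the caution above matters.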

Refer to the Twitter API to find out how to update our status. We see that we need to call the URL http://twitter.com/statuses/update.format where format is either xml or json. We want XML in this case, so we’ll use the URL http://twitter.com/statuses/update.xml. The method or HTTP verb that’s required for this particular operation is POST because we’re updating data (many other Twitter API functions only retrieve data and therefore use a GET operation). The one parameter we really need is the status parameter that contains our update. So let’s look at our SAS code to see how this is done.

A SAS developer, Zach Marshall, was the first to use the Twitter API, and here is the code he developed for tweeting on Twitter.


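/* Request body: the file that holds the status update to send */
filename twtIn "\\sas\status_update.txt";
/* Temporary file that receives Twitter's XML response */
filename twtOut temp;

/* Site-specific settings; the quotes are part of each macro value */
%let proxyhst="myproxy host";
%let twUser="mytwitterusername";
%let twPass="mytwitterpassword";

/* POST the update using HTTP Basic Authentication */
proc http
  in=twtIn
  out=twtOut
  url="http://twitter.com/statuses/update.xml"
  method="post"
  proxyhost=&proxyhst
  proxyport=80
  webusername=&twUser
  webpassword=&twPass;
run;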

where \\sas\status_update.txt is the file containing the text you will be sending to Twitter.
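If you would rather generate that file from SAS than create it by hand, a DATA step can write it. This is a sketch under one assumption: that the request body is the URL-encoded status parameter (status=...), as the API description of the status parameter suggests. The message text and the body variable are made up for illustration.

```sas
/* Sketch only: assumes the POST body is the form-encoded status parameter */
filename twtIn "\\sas\status_update.txt";

data _null_;
  file twtIn;
  length body $ 500;
  /* URLENCODE escapes spaces and reserved characters in the message */
  body = cats('status=', urlencode('Tweeting from SAS with PROC HTTP'));
  put body;
run;
```

Run this before the PROC HTTP step so that twtIn points at a freshly written request body.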

Let’s use Zach’s code as a template and go one step further: query Twitter and put the information into SAS. Going back to the Twitter API, we see that we need to call the Twitter Search Service. The Search Service sends results back in JSON format or ATOM format. ATOM is an XML format, so we request ATOM.

For this example, let’s look for everyone talking about SAS on Twitter. The query is #SAS, and the Twitter API documentation says the # must be URL encoded; the URL encoding for the # sign is %23. We pass a second parameter, page, which limits the amount of information we bring down. (I could put in a loop to get more tweets, but that is not necessary for this example.)
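For readers who do want more than one page, the loop could look roughly like the macro below. Treat it as a sketch, not tested against the live service: the macro name getTweets, the pages parameter, and the SASTweets_n.xml output names are invented for illustration, and it assumes the macro variables proxyhst, twUser and twPass are already defined as in Zach’s code. It builds the query string with the URLENCODE function in a DATA step because in-stream data (CARDS4) cannot be used inside a macro.

```sas
/* Sketch only: fetch several result pages, one output file per page. */
/* getTweets, pages and the SASTweets_n.xml names are hypothetical.   */
%macro getTweets(pages=3);
  %local page;
  %do page = 1 %to &pages;
    filename REQUEST temp;
    filename PageOut "\\sas\http\SASTweets_&page..xml";

    /* Build the query string, e.g. q=%23sas&page=1 */
    data _null_;
      file REQUEST;
      length query $ 200;
      /* Single quotes keep the macro processor away from '&page=' */
      query = cats('q=', urlencode('#sas'), '&page=', "&page");
      put query;
    run;

    proc http
      in=REQUEST
      out=PageOut
      url="http://search.twitter.com/search.atom"
      method="get"
      proxyhost=&proxyhst
      proxyport=80
      webusername=&twUser
      webpassword=&twPass;
    run;
  %end;
%mend getTweets;

%getTweets(pages=3)
```

Each page lands in its own XML file, so the files can be read one at a time with the same XMLMap.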


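/* Write the URL-encoded query string to a temporary file */
filename REQUEST temp;
DATA _NULL_;
FILE request;
INPUT;
PUT _INFILE_;
CARDS4;
q=%23sas&page=1
;;;;

/* File that receives the ATOM (XML) search results */
filename SchOut "\\sas\http\SASTweets.xml";

/* Call the Search Service; reuses &proxyhst, &twUser and &twPass set earlier */
proc http
in=REQUEST
out=SchOut
url="http://search.twitter.com/search.atom"
method="get"
proxyhost=&proxyhst
proxyport=80
webusername=&twUser
webpassword=&twPass;
run;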

Now I will use the XML libname engine with a SAS XMLMAP to bring the information into SAS.


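/* XMLMap that describes how the ATOM feed maps to SAS columns */
filename SXLEMAP '\\sas\http\xmlMap\TwitterSearch.map';
/* Read the search results saved by PROC HTTP through the XML engine */
libname twSearch xml '\\sas\http\SASTweets.xml' xmlmap=SXLEMAP access=READONLY;
/* Permanent library for the resulting SAS data set */
libname Searches '\\sas\http\dataSet';
DATA Searches.entry;
SET twSearch.entry;
run;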
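Once that step has run, a quick check confirms the tweets arrived. This sketch assumes only that the Searches.entry data set was created; the column names depend on your XMLMap, so it lists them rather than guessing at them.

```sas
/* Sketch only: inspect the data set created from the search results */
proc contents data=Searches.entry;
run;

/* Print the first few tweets to verify the XMLMap did its job */
proc print data=Searches.entry(obs=10);
run;
```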

The SAS XMLMap, TwitterSearch, used in this code, was created using SAS XMLMapper. The map XML can be copied into a file named TwitterSearch.map. Or better yet, create your own SAS XMLMap to customize how the tweets are loaded into SAS to meet your needs.

Next up: parsing the tweets to look for topics related to SAS. Then I’ll mine the tweets and see what new and interesting friends I can make on Twitter.

I would love to hear what fun and interesting things you have done with PROC SOAP and PROC HTTP.
