In my last post, I mentioned Hadoop, the open source implementation of Google’s MapReduce for parallelized processing of big data. Over the long National Holiday, I read the original Google paper, MapReduce: Simplified Data Processing on Large Clusters by Jeffrey Dean and Sanjay Ghemawat, and learned that the terms “map” and “reduce” were basically...

Read more »

Tags: array, data mining, Database, Hadoop, Lisp, List, MapReduce, python, SAS

Posted in SAS | Comments Off on Map and Reduce in MapReduce: a SAS Illustration
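
As a rough illustration of the two terms, here is a minimal word-count sketch in Python rather than SAS. The MapReduce paper describes its abstraction as inspired by the map and reduce primitives found in Lisp and other functional languages. The input strings and the helper names `map_phase` and `reduce_phase` are made up for this example and do not come from the post.

```python
from functools import reduce

# Hypothetical input: a few short "documents" to count words in.
documents = ["map and reduce", "reduce the map", "map"]


def map_phase(doc):
    """Emit one (word, 1) pair per word, as a mapper would."""
    return [(word, 1) for word in doc.split()]


def reduce_phase(counts, pair):
    """Fold one (word, count) pair into the running totals."""
    word, n = pair
    counts[word] = counts.get(word, 0) + n
    return counts


# "Map": apply the same function to every element independently.
mapped = [pair for doc in documents for pair in map_phase(doc)]

# "Reduce": combine all emitted pairs into a single result.
word_counts = reduce(reduce_phase, mapped, {})
print(word_counts)
```

Because each call to `map_phase` touches only its own document, the map step could run on separate machines; only the reduce step needs to see pairs from everywhere.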

The array is probably the only number-indexed data type in SAS. SAS programmers use it mainly for batch-processing multiple variables. For example, longitudinal summation can be achieved by specifying a one-dimensional array and then adding all array elem...

Read more »

Tags: array, finance, proc fcmp

Posted in SAS | Comments Off on Array 2.0: matrix-friendly array in Proc Fcmp

The truth is that SAS has limited financial functions. Fortunately, SAS Institute finally added some option pricing functions to the base module of SAS 9.2, such as the Black-Scholes put/call functions, the Garman-Kohlhagen put/call functions, etc. Thu...

Read more »

Tags: array, finance, function, proc fcmp, simulation

Posted in SAS | Comments Off on Proc Fcmp(4): Binomial tree vs. Black-Scholes model
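
For readers who want to see what these closed-form pricing functions compute, here is a minimal Python sketch of the textbook formulas. It is not the SAS implementation. The function name `black_scholes` and the sample inputs are assumptions for illustration. Setting `foreign_rate` to zero gives Black-Scholes, and a nonzero value gives the Garman-Kohlhagen currency variant.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes(spot, strike, rate, vol, maturity, foreign_rate=0.0):
    """Return (call, put) prices for a European option.

    All rates are continuously compounded; maturity is in years.
    """
    d1 = (log(spot / strike)
          + (rate - foreign_rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt(maturity))
    d2 = d1 - vol * sqrt(maturity)
    disc_dom = exp(-rate * maturity)          # domestic discount factor
    disc_for = exp(-foreign_rate * maturity)  # foreign discount factor
    call = spot * disc_for * norm_cdf(d1) - strike * disc_dom * norm_cdf(d2)
    put = strike * disc_dom * norm_cdf(-d2) - spot * disc_for * norm_cdf(-d1)
    return call, put


# Hypothetical inputs: at-the-money option, 5% rate, 20% volatility, 1 year.
call, put = black_scholes(100.0, 100.0, 0.05, 0.20, 1.0)
print(round(call, 4), round(put, 4))
```

A quick sanity check on any implementation is put-call parity: the call minus the put should equal the discounted spot minus the discounted strike.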

Problems: Quote for six-month American-style euro currency options on plain vanilla, Max and Max^0.5. Exchange rate: S_0 = $1.3721/euro. Six-month continuously compounded inter-bank rates: r = 0.4472%, r_f = 1.2840%. Assumptions: The exchange r...

Read more »

Tags: array, finance, function, macro, proc fcmp

Posted in SAS | Comments Off on Proc Fcmp(2): a subroutine for Binomial-CRR model
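
The general shape of a Cox-Ross-Rubinstein (CRR) tree for an American currency option can be sketched in Python as below. This is not the post's Proc Fcmp subroutine. The spot rate, both interest rates, and the six-month maturity come from the excerpt; the strike (1.37), the volatility (10%), and the step count are hypothetical placeholders, since the excerpt is cut off before stating the assumptions. Only the plain vanilla payoff is covered.

```python
from math import exp, sqrt


def crr_price(spot, strike, rate, foreign_rate, vol, maturity, steps,
              kind="call", american=True):
    """Price a plain vanilla currency option on a CRR binomial tree."""
    dt = maturity / steps
    up = exp(vol * sqrt(dt))
    down = 1.0 / up
    # Risk-neutral up probability; the foreign rate acts like a dividend yield.
    p = (exp((rate - foreign_rate) * dt) - down) / (up - down)
    disc = exp(-rate * dt)
    sign = 1.0 if kind == "call" else -1.0

    def payoff(s):
        return max(sign * (s - strike), 0.0)

    # Terminal payoffs; index j is the number of up moves.
    values = [payoff(spot * up ** j * down ** (steps - j))
              for j in range(steps + 1)]

    # Roll back through the tree, checking early exercise at each node.
    for i in range(steps - 1, -1, -1):
        for j in range(i + 1):
            node = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            if american:
                node = max(node, payoff(spot * up ** j * down ** (i - j)))
            values[j] = node
    return values[0]


# S_0, r, r_f and T are from the excerpt; strike and vol are assumptions.
american_call = crr_price(1.3721, 1.37, 0.004472, 0.012840, 0.10, 0.5, 200)
european_call = crr_price(1.3721, 1.37, 0.004472, 0.012840, 0.10, 0.5, 200,
                          american=False)
print(american_call, european_call)
```

Because the foreign rate exceeds the domestic rate here, early exercise of the call can be worthwhile, so the American price should come out at or above the European one.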

I have a data set of sales data by day. Unfortunately, the names of the columns represent the dates. In order to work with the data, I need to transform the data set so each day represents an observation. The data set looks something like this: store _0...

Read more »

Tags: array, character, function, informat

Posted in SAS | Comments Off on Data Steps 2010-07-19 22:25:00
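
The reshaping described in this excerpt (one column per date becoming one observation per store and day) can be sketched in Python as follows. The excerpt is cut off before showing the real column names, so the `_YYYYMMDD` naming, the sales figures, and the function name `wide_to_long` are all hypothetical stand-ins.

```python
from datetime import datetime

# Hypothetical wide layout: one row per store, one column per day.
wide_rows = [
    {"store": "A", "_20100701": 120.0, "_20100702": 95.5},
    {"store": "B", "_20100701": 80.0, "_20100702": 101.25},
]


def wide_to_long(rows, id_col="store", date_format="_%Y%m%d"):
    """Turn each date-named column into its own (store, date, sales) row."""
    long_rows = []
    for row in rows:
        for name, value in row.items():
            if name == id_col:
                continue
            # Parse the column name itself into a real date value.
            day = datetime.strptime(name, date_format).date()
            long_rows.append({id_col: row[id_col], "date": day, "sales": value})
    return long_rows


long_rows = wide_to_long(wide_rows)
for record in long_rows:
    print(record)
```

The key step is parsing the column name into a date, which is the same job an informat does when reading a character value in SAS.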

In this post, I present an improved SAS macro for the single partition split algorithm in Chapter 2 of "Pharmaceutical Statistics Using SAS: A Practical Guide" by Alex Dmitrienko, Christy Chuang-Stein, and Ralph B. D'Agostino.
The single part...

Read more »

Tags: array, Boost Algorithms, data mining, Gini Index, predictive modeling

Posted in SAS | Comments Off on An efficient macro for Stump – two terminal nodes tree
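
To make the idea of a stump concrete, here is a minimal Python sketch of a single split chosen by weighted Gini impurity on one numeric predictor with a 0/1 response. It is not the post's macro and not the book's algorithm, and the names `gini` and `best_split` are made up here. It favors clarity over speed: it recomputes each side's impurity from scratch, where an efficient version would keep running class counts.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(x, y):
    """Return (threshold, weighted_gini) for the best single split on x.

    The threshold is None if no split improves on the unsplit impurity.
    """
    pairs = sorted(zip(x, y))
    n = len(pairs)
    best = (None, gini(y))
    for i in range(1, n):
        # Only split between two distinct predictor values.
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
            best = (threshold, impurity)
    return best


# Hypothetical data: the classes separate cleanly between 2 and 3.
print(best_split([1, 2, 3, 4], [0, 0, 1, 1]))
```

The two terminal nodes are simply the observations below and above the returned threshold.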

The gap statistic is a method used to estimate the most likely number of clusters in a partition clustering, notably k-means clustering. This measure was introduced by Trevor Hastie, Robert Tibshirani, and Guenther Walther, all from Stanford U...

Read more »

Tags: array, Gap Statistic, K-means Clustering, predictive modeling, SVD

Posted in SAS | Comments Off on Implementing Gap statistic for clustering number estimation
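
For reference, the standard definition of the gap statistic can be written as below. The notation is the usual one from the literature and is not taken from the post: C_r is cluster r with n_r points, d_{ii'} is the distance between points i and i', and E*_n is the expectation under a reference (null) distribution estimated from B simulated data sets.

```latex
% Within-cluster dispersion for a partition into k clusters
D_r = \sum_{i, i' \in C_r} d_{ii'}, \qquad
W_k = \sum_{r=1}^{k} \frac{1}{2 n_r} D_r

% Gap statistic: expected log-dispersion under the reference minus observed
\mathrm{Gap}_n(k) = E^{*}_{n}\left[\log W_k\right] - \log W_k

% Selection rule, with s_k the simulation-adjusted standard deviation
\hat{k} = \min \left\{ k : \mathrm{Gap}(k) \ge \mathrm{Gap}(k+1) - s_{k+1} \right\},
\qquad s_k = \mathrm{sd}_k \sqrt{1 + 1/B}
```

In words, the estimate is the smallest k at which adding one more cluster no longer increases the gap by more than its simulation error.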

The max() function makes it easy to find the maximum value in a SAS array. Given an array like: array x x1-x10; maxValue = max(of x[*]); Pretty slick, eh? Remember, it doesn't return the position of the max element, just the max value. This can be pret...

Read more »

Tags: array, function

Posted in SAS | Comments Off on Finding the Max Value In An Array
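
The distinction between the maximum value and its position can be shown with a small Python sketch (not SAS). The sample values are made up, and the position is reported 1-based to match how SAS arrays are indexed by default.

```python
# Hypothetical values standing in for x1-x10 (shortened to five elements).
values = [3.5, 9.0, 1.25, 9.0, 4.0]

# The maximum value alone, analogous to what max() returns.
max_value = max(values)

# The position of the first element equal to that maximum, 1-based.
max_position = values.index(max_value) + 1

print(max_value, max_position)
```

Note that when the maximum occurs more than once, this reports only the first position.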