This post was kindly contributed by SAS ANALYSIS - go there to comment and to read the full post. |
I recently discovered 10 questions most daunting in R development, and I am trying to find the answers below.
1. Data structure — How many data structures R has? How do you build a binary search tree in R?
2. Sorting — How many sorting algorithms are available? Show me an example in R.
3. Low level — How do you build a R function powered by C?
4. String — How do you implement string operation in R?
5. Vectorization — If you want to do Monte Carlo simulation by R, how do you improve the efficiency?
6. Function — How do you take function as argument of another function? What is the apply() function family?
7. Threading — How do you do multi-threading in R?
8. Memory limit and database — What is the memory limit of R? How do you avoid it? How do you use SQL in R?
9. Testing — How do you do testing and debugging in R?
10. Software development — How do you develop a package? How do you do version control?
Q1. Data structure — How many data structures R has? How do you build a binary search tree in R?
My answer: R mainly has 5 data structures.
Homogeneous(contain the same type of objects): vector –> matrix –> array
Heterogeneous(allow different type of objects): list –> data frame
A binary search tree requires several actions: implement a tree, insert nodes and delete nodes. We should create individual routines in R.
In R, a matrix is the ideal data structure to contain the linked elements. Then a list should be used to wrap the matrix and pass the arguments.
For insert-node routine, there is the pseudocode for it. The key point here is to use recursion in R’s function. Norman Matloff gave a complete example at page 180 of his book.
insert(X, node){
if(node = NULL)
node = new binaryNode(X,NULL,NULL)
return
}
if(X = node:data)
return
else if(X < node:data)
insert(X, node:leftChild)
else // X > node:data
insert(X, node:rightChild)
}
Q2. Sorting — How many sorting algorithms are available? Show me an example in R.
My answer: there are mainly 5 kinds of sorting algorithms:
Bubble Sort – O(n^2);
Selection Sort – O(n^2)
Merge Sort – O(n log n)
Quick Sort – from O(n^2) to O(n log n)
Bucket Sort – O(n + m)
R has a native sort function sort(), which is written by C and uses Quick Sort.
There is an example of Quick Sort in R by recursion
qs <- function(x) {
if (length(x) <= 1) return(x)
seed <- x[1]
rest <- x[-1]
sv1 <- rest[rest < seed]
sv2 <- rest[rest >= seed]
sv1 <- qs(sv1)
sv2 <- qs(sv2)
return(c(sv1,seed,sv2))
}
This post was kindly contributed by SAS ANALYSIS - go there to comment and to read the full post. |