10 toughest interview questions for R developers (1)

This post was kindly contributed by SAS ANALYSIS - go there to comment and to read the full post.

I recently came across 10 of the most daunting questions in R development, and I try to answer them below.

1. Data structure — How many data structures does R have? How do you build a binary search tree in R?
2. Sorting — How many sorting algorithms are available? Show me an example in R.
3. Low level — How do you build an R function powered by C?
4. String — How do you implement string operations in R?
5. Vectorization — If you want to run a Monte Carlo simulation in R, how do you improve its efficiency?
6. Function — How do you pass a function as an argument to another function? What is the apply() function family?
7. Threading — How do you do multi-threading in R?
8. Memory limit and database — What is the memory limit of R? How do you avoid it? How do you use SQL in R?
9. Testing — How do you do testing and debugging in R?
10. Software development — How do you develop a package? How do you do version control?

Q1. Data structure — How many data structures does R have? How do you build a binary search tree in R?
My answer: R has five main data structures.
Homogeneous (all elements share one type): vector -> matrix -> array
Heterogeneous (elements may differ in type): list -> data frame
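
As a quick illustration of the five structures, here is a minimal sketch using only base R. The variable names are arbitrary and chosen for this example.

```r
v  <- c(1.5, 2, 3)                                    # vector: one type, one dimension
m  <- matrix(1:6, nrow = 2)                           # matrix: one type, two dimensions
a  <- array(1:24, dim = c(2, 3, 4))                   # array: one type, n dimensions
l  <- list(id = 1L, name = "a", ok = TRUE)            # list: elements may differ in type
df <- data.frame(id = 1:3, name = c("a", "b", "c"))   # data frame: a list of equal-length columns

# Mixing types in a homogeneous structure coerces everything to one type.
mixed <- c(1, "a", TRUE)
typeof(mixed)   # "character"
```

A data frame is itself a list under the hood, which is why is.list(df) returns TRUE.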

A binary search tree needs several operations: creating the tree, inserting nodes and deleting nodes. Each should be written as its own routine in R.

In R, a matrix is a convenient structure to hold the linked elements: each row can represent one node, storing its value and the row indices of its left and right children. A list then wraps the matrix so that the whole tree can be passed to, and returned from, functions.

The pseudocode for the insert-node routine is given below. The key point is to use recursion in the R function. Norman Matloff gives a complete example on page 180 of his book.


insert(X, node) {
  if (node == NULL) {
    node = new binaryNode(X, NULL, NULL)
    return
  }
  if (X == node:data)
    return
  else if (X < node:data)
    insert(X, node:leftChild)
  else // X > node:data
    insert(X, node:rightChild)
}
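
The pseudocode can be turned into R along the following lines. This is only a sketch of the matrix-wrapped-in-a-list idea, not Matloff's code: the names new_tree, bst_insert and bst_inorder, the column names and the fixed capacity are all assumptions made for this example. Because R passes arguments by value, the insert routine returns the updated tree instead of modifying it in place.

```r
# Hypothetical layout: one row per node; columns hold the row index of the
# left child, the row index of the right child, and the stored value.
# NA means "no child".
new_tree <- function(value, capacity = 16) {
  mat <- matrix(NA_real_, nrow = capacity, ncol = 3,
                dimnames = list(NULL, c("left", "right", "value")))
  mat[1, "value"] <- value
  list(mat = mat, nxt = 2)                  # nxt: index of the next free row
}

# Recursive insert, starting the search at row `node` (the root is row 1).
bst_insert <- function(tree, x, node = 1) {
  cur <- tree$mat[node, "value"]
  if (x == cur) return(tree)                # value already present
  side  <- if (x < cur) "left" else "right"
  child <- tree$mat[node, side]
  if (is.na(child)) {                       # free slot: attach the new node here
    stopifnot(tree$nxt <= nrow(tree$mat))   # sketch only: capacity is fixed
    tree$mat[node, side] <- tree$nxt
    tree$mat[tree$nxt, "value"] <- x
    tree$nxt <- tree$nxt + 1
    return(tree)
  }
  bst_insert(tree, x, child)                # otherwise descend into the subtree
}

# In-order traversal returns the stored values in ascending order.
bst_inorder <- function(tree, node = 1) {
  if (is.na(node)) return(NULL)
  c(bst_inorder(tree, tree$mat[node, "left"]),
    tree$mat[node, "value"],
    bst_inorder(tree, tree$mat[node, "right"]))
}

tree <- new_tree(8)
for (v in c(5, 12, 3, 5, 10)) tree <- bst_insert(tree, v)
inorder <- as.numeric(bst_inorder(tree))
```

Walking the tree in order is a simple way to check that the inserts kept the search-tree property; the duplicate 5 is ignored, as in the pseudocode.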

Q2. Sorting — How many sorting algorithms are available? Show me an example in R.
My answer: five commonly cited sorting algorithms are:
Bubble Sort – O(n^2)
Selection Sort – O(n^2)
Merge Sort – O(n log n)
Quick Sort – O(n log n) on average, O(n^2) in the worst case
Bucket Sort – O(n + m) on average, where m is the number of buckets

R has a native sort function, sort(), which is implemented in C. Quick Sort is one of the algorithms it offers, selected with method = "quick"; it is not the default.
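
A short sketch of calling the built-in sorter follows. The vector x is made up for this example; the method argument is part of base R's sort(), and "quick" is intended for numeric input.

```r
x <- c(3.2, 1.5, 4.8, 1.1)

asc   <- sort(x)                      # ascending, default method
desc  <- sort(x, decreasing = TRUE)   # descending
quick <- sort(x, method = "quick")    # explicitly request quicksort
idx   <- order(x)                     # permutation that sorts x: 4 2 1 3
```

order() is often more useful than sort() in practice, since the returned indices can reorder the rows of a data frame.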

Here is an example of Quick Sort in R, written with recursion:


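qs <- function(x) {
 if (length(x) <= 1) return(x)   # base case: nothing left to sort
 seed <- x[1]                    # pivot: the first element
 rest <- x[-1]
 sv1 <- rest[rest < seed]        # elements smaller than the pivot
 sv2 <- rest[rest >= seed]       # elements greater than or equal to the pivot
 sv1 <- qs(sv1)                  # sort each partition recursively
 sv2 <- qs(sv2)
 return(c(sv1, seed, sv2))
}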
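
For comparison, Merge Sort from the list above can be written in the same recursive style. This is a minimal sketch for numeric vectors without NA values; merge_sorted and merge_sort are names invented for this example.

```r
# Merge two already-sorted numeric vectors into one sorted vector.
merge_sorted <- function(a, b) {
  out <- numeric(length(a) + length(b))
  i <- 1; j <- 1; k <- 1
  while (i <= length(a) && j <= length(b)) {
    if (a[i] <= b[j]) {
      out[k] <- a[i]; i <- i + 1
    } else {
      out[k] <- b[j]; j <- j + 1
    }
    k <- k + 1
  }
  # At most one input still has elements left; copy them over.
  if (i <= length(a)) out[k:length(out)] <- a[i:length(a)]
  if (j <= length(b)) out[k:length(out)] <- b[j:length(b)]
  out
}

# Split in half, sort each half recursively, then merge.
merge_sort <- function(x) {
  n <- length(x)
  if (n <= 1) return(x)
  mid <- n %/% 2
  merge_sorted(merge_sort(x[1:mid]), merge_sort(x[(mid + 1):n]))
}

y <- c(3, 1, 4, 1, 5, 9, 2, 6)
sorted_y <- merge_sort(y)
```

Unlike the Quick Sort above, whose first-element pivot degrades to O(n^2) on already-sorted input, this version always splits the input in half.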
