Thursday, April 19, 2012

R: running from a Java process

After trying Multiple Linear Regression in a sandbox, let's try some integration.
In this post we will concentrate on how to install R and run it from a regular Java process; in the next post we will plug R into Hadoop MapReduce.

R is a programming language and software environment written largely in C and Fortran, so interaction with Java requires a JNI layer. That layer is provided by the Java/R Interface (JRI) project [1], which ships platform-specific .so files.
To prepare the environment, we need both R and JRI installed and configured. On Ubuntu this takes the following two commands:
sudo apt-get install r-base r-recommended r-base-dev
sudo apt-get install r-cran-rjava

For other platforms, follow the steps from [5] to install R and from [6] to install JRI.

To make the .so files visible to Java processes, we need to update LD_LIBRARY_PATH and pass -Djava.library.path to the JVM. The reasoning behind this configuration is covered in more depth in [2] and [3]. On Ubuntu, the script will look like:
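
A minimal sketch of such a launcher, assuming the default install locations of the Ubuntu packages above: /usr/lib/R for R, and its site-library/rJava/jri subdirectory for JRI. Both paths and the run_with_r helper are assumptions made for illustration; `R RHOME` prints the actual R home directory, so check against it before relying on these values.

```shell
#!/bin/sh
# Assumed locations for the Ubuntu r-base and r-cran-rjava packages; verify locally.
R_HOME="${R_HOME:-/usr/lib/R}"                         # `R RHOME` prints the real value
JRI_DIR="${JRI_DIR:-$R_HOME/site-library/rJava/jri}"   # expected to hold libjri.so and JRI.jar

# libjri.so links against libR.so, so the dynamic loader must see both directories.
LD_LIBRARY_PATH="$R_HOME/lib:$JRI_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export R_HOME LD_LIBRARY_PATH

# Hypothetical helper: starts a JVM that can locate the JRI native library.
#   $1 = application classpath, $2 = fully qualified main class
run_with_r() {
    app_classpath="$1"
    main_class="$2"
    java -Djava.library.path="$JRI_DIR" \
         -cp "$JRI_DIR/JRI.jar:$app_classpath" \
         "$main_class"
}
```

After sourcing the file, a call such as `run_with_r target/classes com.example.RDemo` (both arguments are placeholders) would start the application. LD_LIBRARY_PATH is needed by the operating system's loader to resolve libR.so, while -Djava.library.path tells the JVM where to find libjri.so itself.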

With the environment configured, we can now turn to the code:

[1] Java/R Interface

[2] Talking R through Java

[3] java.library.path and LD_LIBRARY_PATH

[4] How to convert a data frame column to numeric type?

[5] CRAN mirrors: choose your favourite location and follow the R installation instructions

[6] rJava package on CRAN
