Why Use Docker with R? A DevOps Perspective

There have been several blog posts going around about why one would use Docker with R.
In this post I’ll try to add a DevOps point of view and explain how containerizing
R is used in the context of the OpenCPU system for building and deploying R servers.

1: Easy Development

The flagship of the OpenCPU system is the OpenCPU server:
a mature and powerful Linux stack for embedding R in systems and applications.
Because OpenCPU is completely open source, we can build and ship it on DockerHub. A ready-to-go Linux server with both OpenCPU and RStudio
can be started with the following command (use port 8004 or 80):

docker run -t -p 8004:8004 opencpu/rstudio

Now simply open http://localhost:8004/ocpu/ and
http://localhost:8004/rstudio/ in your browser!
Log in to RStudio with user opencpu (password: opencpu) to build or install apps.
See the readme for more info.
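As a quick sanity check that the OpenCPU API is responding, you can call an R function over HTTP. The snippet below is a minimal sketch following the standard OpenCPU API conventions (stats::rnorm is just an arbitrary example function):

# Call stats::rnorm over HTTP; the /json postfix returns the result directly as JSON
curl http://localhost:8004/ocpu/library/stats/R/rnorm/json -d "n=5"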

Docker makes it easy to get started with OpenCPU. The container gives you the full
flexibility of a Linux box, without the need to install anything on your system.
You can install packages or apps via rstudio server, or use docker exec to open a
root shell on the running server:

# Lookup the container ID
docker ps

# Drop a shell
docker exec -i -t eec1cdae3228 /bin/bash

From the shell you can install additional software in the server, customize the apache2 httpd
config (auth, proxies, etc), tweak R options, optimize performance by preloading data or
packages, etc.
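For example, a minimal sketch of such customizations from the root shell (the library and package names are placeholders; the reload command assumes the stock apache2 setup in the image):

# Install a system library and an R package inside the running container
apt-get update && apt-get install -y libxml2-dev
R -e 'install.packages("xml2")'

# Gracefully reload apache2 so the OpenCPU workers pick up the changes
apachectl graceful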

2: Shipping and Deployment via DockerHub

The most powerful use of Docker is shipping and deploying applications via DockerHub. To create a fully standalone
application container, simply use a standard opencpu image
and add your app.

For the purpose of this blog post I have wrapped up some of the example apps as Docker containers by adding a very simple Dockerfile to each repository. For example, the nabel app has a Dockerfile that contains the following:

FROM opencpu/base

RUN R -e 'devtools::install_github("rwebapps/nabel")'

It takes the standard opencpu/base
image and then installs the nabel app from the GitHub repository.
The result is a completely isolated, standalone application. The application can be
started by anyone using, e.g.:

docker run -d -p 8004:8004 rwebapps/nabel

The -d flag daemonizes the container, and -p 8004:8004 maps the application to port 8004 on the host.
Obviously you can tweak the Dockerfile to install whatever extra software or settings you need
for your application, as sketched below.
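A sketch of such a tweaked Dockerfile might look as follows (the system libraries and the commented-out config path are illustrative assumptions, not part of the actual nabel setup):

FROM opencpu/base

# Install system libraries that the application packages link against
RUN apt-get update && apt-get install -y libxml2-dev libssl-dev

# Install the application package from GitHub
RUN R -e 'devtools::install_github("rwebapps/nabel")'

# Optionally ship a custom server configuration (path is an assumption)
# COPY server.conf /etc/opencpu/server.conf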

Containerized deployment shows the true power of Docker: it allows for shipping fully
self-contained applications that work out of the box, without installing any software or
relying on paid hosting services. If you do prefer professional hosting, there are
many companies that will gladly host Docker applications for you on scalable infrastructure.

3: Cross Platform Building

There is a third way Docker is used for OpenCPU. At each release we build
the opencpu-server installation package for half a dozen operating systems, which
are published on http://ift.tt/2ur2Jdq.
This process has been fully automated using DockerHub: a set of images builds the
entire stack from source, and DockerHub automatically rebuilds these images when a
new release is published on GitHub. All that is left to do is run a script
which pulls down the images and copies the opencpu-server binaries to the archive server.
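Such a script might look roughly like the sketch below; the image name and file paths here are illustrative assumptions, not the actual release script:

# Pull the freshly built image (image name and paths are illustrative)
docker pull opencpu/ubuntu-16.04

# Create a stopped container from it and copy the built packages to the archive
mkdir -p archive
docker create --name ocpu-build opencpu/ubuntu-16.04
docker cp ocpu-build:/opencpu-server.deb ./archive/
docker rm ocpu-build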
