Shiny Server on AWS

The Shiny web framework for R is great, and it is one of my most frequently used packages. I’ve used it to develop exploratory data analysis and visualization tools for my coworkers. When I first started developing these apps I would send instructions to my coworkers explaining how to install R, RStudio, and the packages they needed, and how to run the app from GitHub. It was easy enough, albeit very tedious. Most of the first users were beta testers checking the functionality and providing feature requests. Most importantly, all of these users were in the same office as me, which allowed them to grab the data off of our in-house server. I have several colleagues who work in remote offices, and some of these offices aren’t networked to these servers. I wouldn’t be surprised if this is the case for many small state agencies in rural states. Another potential issue is that some of the data is sensitive and must be protected. To solve these issues I’ve turned to Amazon Web Services (AWS) to deploy Shiny apps and host our data in the cloud.

A few caveats. I am not a SysAdmin or DevOps engineer, and I am not trained in any other computer science related field. I work for a small state agency (Department of Wildlife) and we don’t have people with those skills (unfortunately). I’ve learned all of this with the help of several online resources (I’ll link to those). Depending on the sensitivity of your data and methods there may be easier options. AWS isn’t your only option; Digital Ocean, Heroku, Google, and Microsoft all have similar offerings. Your mileage may vary. As always, any input is greatly appreciated.

AWS

AWS is a huge offering of 55 (at least) services for computing, storage and management in the cloud. I started using AWS at the recommendation of my supervisor, who hosts a few ESRI related products on AWS. Getting started is complicated; however, AWS is extremely well documented and is as intuitive as possible.

This post will cover how to set up an EC2 instance from scratch and install R, Shiny Server, and other helpful libraries and packages. I relied on Dean Attali’s post about deploying RStudio and Shiny Server on Digital Ocean. AWS has a one-year free tier for many of their products. Sign up for an account to get started. Warning: I am using a Mac, so any command line stuff might be different for Windows users. I’ll link to resources when I can find them.

aws dashboard

AWS home dashboard. All the services are under the Services tab in the menu bar. Recently used services show up under shortcuts. Check your region at the top right.

Regions

You can launch many AWS services in different regions. Hosting apps or databases in multiple regions is a feature of AWS that helps with latency, fault tolerance and scaling. AWS defaults to your nearest region. I recommend you stick with the default region. If you can’t find an instance or service you’ve launched, check that you’re in the correct region. Some regions are more expensive than others.

Elastic Compute Cloud (EC2)

EC2 is the bread and butter of AWS. It is a virtual computing environment that allows you to run applications in the cloud. You can launch instances of these computing environments with different operating systems, programs and packages quickly and easily.

Configure

To get started, find the EC2 icon and click it to go to the EC2 Dashboard. I followed this tutorial from Amazon to configure my first several instances. I highly recommend you follow along with (and bookmark) this page as it explains the details of each option.

The configuration settings I’ve used are as follows.

  1. AMI: Ubuntu Server 14.04 - I use an Ubuntu Server because I’m familiar with Ubuntu and how to use the command line interface. Ubuntu plays nicely with many of the tutorials and resources I’ve found on the web for setting up Shiny Server.
  2. Instance Type: r3.large - R works in memory, so I need the most RAM for the buck; r3 instances are all memory optimized and have the cheapest RAM.
  3. Configure: all default - Skip this page if you want.
  4. Storage: 16 GB - Default is 8 GB; I chose more because I can (no good reason really).
  5. Tag: ShinyApplication - Tags can be helpful for organizing your AWS services.
  6. Security Group: http-ssh-anywhere - This part of the configuration tells your instance who is allowed to talk to and access it. These are the rules I use so that I can connect remotely (SSH) and serve web content over the HTTP ports.
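
| Type | Protocol | Port Range | Source              | Info                                      |
|------|----------|------------|---------------------|-------------------------------------------|
| SSH  | TCP      | 22         | Anywhere: 0.0.0.0/0 | remote login                              |
| HTTP | TCP      | 80         | Anywhere: 0.0.0.0/0 | Use nginx to password protect and reroute |
| HTTP | TCP      | 3838       | Anywhere: 0.0.0.0/0 | Default shiny port                        |
| HTTP | TCP      | 8787       | Anywhere: 0.0.0.0/0 | Default RStudio port                      |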

Before launching you’ll be prompted to create a key pair and download its private key file. Be sure to save this in a secure and easily accessible location. Review your choices and launch (at the bottom of the page). After a few minutes of setup and initialization your instance is ready to use. Be sure to make a note of the Public IP and Public DNS. You’ll use these to connect to the instance. I’ll use the example IP 11.22.33.44 (whenever you see this replace it with your public IP address), and the placeholder publicDNS for your public DNS.

Connect

To connect to your instance, launch your terminal and use the following template. The first command makes your private key file readable only by you (it only needs to be run the first time you use your SSH key). The second connects to your instance. Replace the name and path with your key’s name and location, and replace publicDNS with your public DNS. You may be prompted to confirm that you want to proceed.

$ chmod 400 ~/.ssh/mykey.pem
$ ssh -i ~/.ssh/mykey.pem ubuntu@publicDNS

You should now be connected to your server. The commands are similar to the terminal commands you use on a Mac. pwd returns your current directory, ls -a lists all the files in the current directory, and cd changes to the directory you specify. Here is a good bash cheat sheet.
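
Typing the full ssh command gets old. One option (nothing in this post depends on it) is to add a Host entry to your SSH client config so a short alias works. Below is a minimal sketch; the function name add_ssh_host and the alias shiny are made up for illustration, and it assumes the default ubuntu user from the Ubuntu AMI.

```shell
# Append a Host entry to an SSH client config file.
# Usage: add_ssh_host ALIAS HOSTNAME KEYFILE [CONFIG_FILE]
# (add_ssh_host is a hypothetical helper, not part of ssh itself)
add_ssh_host() {
    local alias_name="$1" host_name="$2" key_file="$3"
    local config_file="${4:-$HOME/.ssh/config}"

    # Skip if an entry for this alias already exists.
    if [ -f "$config_file" ] && grep -qxF "Host $alias_name" "$config_file"; then
        echo "Host $alias_name already present in $config_file"
        return 0
    fi

    mkdir -p "$(dirname "$config_file")"
    cat >> "$config_file" <<EOF

Host $alias_name
    HostName $host_name
    User ubuntu
    IdentityFile $key_file
EOF
    # Keep the config file private to the current user.
    chmod 600 "$config_file"
}

# Example (substitute your own public DNS and key path):
# add_ssh_host shiny publicDNS ~/.ssh/mykey.pem
# ssh shiny
```

After the entry is added, ssh shiny should behave like the longer ssh -i command, as long as the public DNS of the instance hasn’t changed.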

Disconnect

Type exit into your terminal and you’ll be logged out of your EC2 instance.

Install R, then some other things

First, we need to update the Linux package lists. Then we can start installing packages. Reconnect to your instance (if you’ve logged out).

$ sudo apt-get update
$ sudo apt-get -y install nginx

As a quick example, I’ve installed the nginx web server on this instance. Type your public IP into the URL bar of a browser and you should get an nginx welcome screen. It is a nice bit of validation that things are working as they should be. We will be using nginx to password protect our shiny application.

nginx default response

The default nginx page returned after installing on the server.

Installing R

Installing the latest version of R isn’t as easy as a plain apt-get install. A source and a key need to be added first (the source goes in the sources.list file), then the latest version of R can be installed. The first command adds the RStudio CRAN mirror as a source, and the next two grab the key used to authenticate the installation.

$ sudo sh -c 'echo "deb http://cran.rstudio.com/bin/linux/ubuntu trusty/" >> /etc/apt/sources.list'

# public keys
$ gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
$ gpg -a --export E084DAB9 | sudo apt-key add -

# installing R
$ sudo apt-get update
$ sudo apt-get -y install r-base

# run R after install
$ R
> sessionInfo()
> quit()

Warning: Be certain to type everything correctly (copy and paste), for the first command. If you receive errors while installing R, navigate to your /etc/apt/sources.list and use sudo nano to make any changes, or delete then retry.
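
Because the echo command above appends with >>, running it a second time leaves a duplicate line in sources.list. Here is a minimal sketch of a safer append, assuming a helper name (add_apt_source) that I made up: it only adds the line if an identical line isn’t already in the file, and it falls back to sudo tee when the file isn’t writable by the current user.

```shell
# Append an apt source line only if an identical line is not already present.
# Usage: add_apt_source "deb ..." [SOURCES_FILE]
# (add_apt_source is a hypothetical helper name)
add_apt_source() {
    local line="$1"
    local file="${2:-/etc/apt/sources.list}"

    if grep -qxF -- "$line" "$file" 2>/dev/null; then
        # -x matches the whole line, -F treats it as a fixed string
        echo "already present: $line"
    elif [ -w "$file" ]; then
        echo "$line" >> "$file"
    else
        # Not writable as the current user: append as root instead.
        echo "$line" | sudo tee -a "$file" > /dev/null
    fi
}

# Example, using the same source line as above:
# add_apt_source "deb http://cran.rstudio.com/bin/linux/ubuntu trusty/"
```

Running it twice is harmless, which makes it a little friendlier than deleting lines by hand in nano and retrying.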

Install other things & Shiny Server, part 1

There are a few external libraries that are required to use Shiny Server. Let’s install those and the R package shiny, then we will move on to installing Shiny Server. Note: shiny installs 8 dependencies, so it may take a while.

$ sudo apt-get -y install gdebi-core
# install the shiny package (more on this later)
$ sudo su - -c "R -e \"install.packages('shiny', repos='http://cran.rstudio.com/')\""
$ wget https://download3.rstudio.org/ubuntu-12.04/x86_64/shiny-server-1.4.2.786-amd64.deb
$ sudo gdebi shiny-server-1.4.2.786-amd64.deb

Shiny Server is served on port 3838. Once these commands have completed go to 11.22.33.44:3838 and you should see a default index.html page with a shiny app and an rmarkdown document. The rmarkdown app is likely showing an error. This is because rmarkdown isn’t installed on the server. We will fix that shortly (or not, if you won’t need rmarkdown). For more information on administering Shiny Server check the docs.

default shiny server

The default shiny server `index.html` page. On the right are a `shiny` app and an `rmarkdown` document. I haven't installed `rmarkdown` yet, so it is showing an error.

Read the text on this page to learn where Shiny Server is installed on the server, and where important files are located. For instance, the index.html is located at /srv/shiny-server/index.html. This can be modified or replaced with a different file. The sample apps are located at /srv/shiny-server/sample-apps. You can upload your own apps here (or git clone/git pull). Any app in this directory will be added to the page 11.22.33.44:3838/sample-apps/.

BONUS: RStudio Server (not required, but useful)

I do all my R work in RStudio Server. The GUI is intuitive and easy to use. If you aren’t already using it, I highly recommend it. Working in R on the server is super easy with RStudio installed. As a bonus, you can use it if you’re at an offsite meeting and don’t have your computer, or if R isn’t installed on the machine you’re using. It’ll have all the packages you need (if not, installing them is as easy as install.packages, with some caveats). The system requirements are already in place, so this should be easy. Again, check the admin guide for more information. Warning: RStudio defaults to port 8787, so be sure to allow traffic to this port in your security group.

$ wget https://download2.rstudio.org/rstudio-server-0.99.902-amd64.deb
$ sudo gdebi rstudio-server-0.99.902-amd64.deb

rstudio login

If the installation worked navigate to `11.22.33.44:8787` to get to RStudio. Remember to allow traffic to port 8787 on your server.

Other things, part 2

It is unlikely that a base R installation will cut it for the analyses and apps that will be on the server. There are several external libraries that are required by many R packages. Below is a list of many of those libraries, along with the command to install them on the server. To find out if any packages you need on the server require external libraries, check the package’s CRAN page. There is a field called SystemRequirements; if anything is listed there, find an appropriate installation for Ubuntu. Warning: it is possible that some of these libraries may fail to install. This is likely due to a bug in the source code. Check Debian/Ubuntu forums for help.

$ sudo apt-get install -y \
    pandoc \
    pandoc-citeproc \
    libssl-dev \
    libcurl4-gnutls-dev \
    libcairo2-dev \
    libgdal-dev \
    libgeos-dev \
    libproj-dev \
    libxml2-dev \
    libxt-dev \
    libv8-dev \
    git

gdal, geos, proj are all used for geographic analysis. They are required for many functions in the rgdal, rgeos, sp, maptools packages. v8 is a javascript library that is required for some of the htmltools and htmlwidgets that I use in some applications, or converting sp objects to geojson objects.

Installing R packages is a little different on the server than on a local machine. For packages to be available for all users and roles, install them with sudo su - -c "R ...". The command below installs rmarkdown and several useful packages. The example rmarkdown document on the server should work now since rmarkdown is installed. Warning: if you are using an instance with little memory, some packages may exhaust the memory and fail to compile.

$ sudo su - -c "R -e \"install.packages('rmarkdown', repos='http://cran.rstudio.com/')\""
# the hadley verse, other packages
$ sudo su - -c "R -e \"install.packages(c('ggplot2', 'dplyr', 'sp', 'rgdal', 'rgeos'), repos='http://cran.rstudio.com/')\""
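
The nested quotes in these commands are easy to get wrong when the package list changes. Here is a small sketch of a helper that builds the command string for any list of packages. The name r_install_cmd is one I made up, and it only prints the command (it doesn’t run anything), so you can check it before pasting it into the terminal.

```shell
# Print the system-wide install command for one or more CRAN packages.
# Usage: r_install_cmd PKG [PKG ...]
# (r_install_cmd is a hypothetical helper; it prints, it does not install)
r_install_cmd() {
    local repo="http://cran.rstudio.com/"
    local pkgs="" p

    if [ "$#" -eq 0 ]; then
        echo "usage: r_install_cmd PKG [PKG ...]" >&2
        return 1
    fi

    # Build a comma separated list of single-quoted package names.
    for p in "$@"; do
        pkgs="${pkgs:+$pkgs, }'$p'"
    done

    # \\" in the format string prints \" so the inner quotes stay escaped.
    printf 'sudo su - -c "R -e \\"install.packages(c(%s), repos=%s)\\""\n' \
        "$pkgs" "'$repo'"
}

# Example:
# r_install_cmd ggplot2 dplyr
```

The printed line has the same shape as the commands above, with the package names you passed in filled into c(...).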

Password protecting your apps

The open source version of Shiny Server doesn’t have password protection as a feature; Shiny Server Pro does. If you have the resources or your application is business critical I highly recommend going with Shiny Server Pro. I’ve used this post as a template for password protecting my application. I’ll be using nginx. You can use Apache to password protect the server as well.

$ sudo apt-get install apache2-utils

# stop nginx and shiny-server
$ sudo service nginx stop
$ sudo stop shiny-server

There are some server configuration files that need to be changed for the password protection to take effect. First change the nginx config file located at /etc/nginx/sites-available/default, then change the shiny-server config file at /etc/shiny-server/shiny-server.conf. Use nano to open and change the config files. Add the following lines. Comment out any other server config settings in each file (use the hash #).

# add to nginx config
$ sudo nano /etc/nginx/sites-available/default

server {
    listen 80;

    location / {
        proxy_pass http://127.0.0.1:3838/;
        proxy_redirect http://127.0.0.1:3838/ $scheme://$host/;
        # note, the following 3 lines added 2016-07-11, see updates section
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        auth_basic "Username and Password are required";
        auth_basic_user_file /etc/nginx/.htpasswd;
    }
}

# add to shiny-server config
$ sudo nano /etc/shiny-server/shiny-server.conf

server{
    listen 3838 127.0.0.1;

    location / {
      site_dir /srv/shiny-server;
      log_dir /var/log/shiny-server;
      directory_index on;
    }
}

nano takes some getting used to. Be sure to use sudo so you have permission to edit these files. The only change the shiny-server.conf file needs is 127.0.0.1 added to the listen line; that’s it.
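
If you’d rather not make the shiny-server.conf change by hand, here is a minimal sketch that does it with sed. It assumes the config still has the stock listen 3838; line (check yours first), and bind_shiny_to_localhost is a name I made up. Editing the real file under /etc needs root, so run it from a root shell (sudo -s) or try it on a copy first.

```shell
# Restrict Shiny Server to the loopback interface by rewriting its listen line.
# Usage: bind_shiny_to_localhost [CONF_FILE]
# (hypothetical helper; assumes the file contains a line like "listen 3838;")
bind_shiny_to_localhost() {
    local conf="${1:-/etc/shiny-server/shiny-server.conf}"

    # -i.bak edits in place and saves the pre-edit file as CONF_FILE.bak.
    # A line that already ends in "3838 127.0.0.1;" does not match, so
    # running this twice does not add the address twice.
    sed -i.bak -E \
        's/^([[:space:]]*listen[[:space:]]+3838)[[:space:]]*;/\1 127.0.0.1;/' \
        "$conf"

    # Show the resulting listen line so the change can be checked by eye.
    grep -n 'listen' "$conf"
}

# Example, on a copy of the config:
# cp /etc/shiny-server/shiny-server.conf /tmp/shiny-server.conf
# bind_shiny_to_localhost /tmp/shiny-server.conf
```

Note that the .bak file is rewritten on every run, so it holds the file as it was just before the most recent run.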

Now create users and passwords with htpasswd. You’ll first create a user, then be prompted to enter and confirm a password. Restart nginx and shiny-server once users are added.

$ cd /etc/nginx
$ sudo htpasswd -c /etc/nginx/.htpasswd exampleuser

# restart servers
$ sudo start shiny-server
$ sudo service nginx start

Now your shiny app is proxied from port 3838 to port 80, the default HTTP port. Navigate to 11.22.33.44 to get to the login screen. You can’t get around authentication by going to 11.22.33.44:3838, because Shiny Server now only listens on 127.0.0.1.
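
One thing to watch when adding more users: the -c flag tells htpasswd to create the password file, and if the file already exists it is overwritten, which wipes out the users you added before. Below is a minimal sketch of a wrapper that only passes -c the first time. The name add_nginx_user is made up, and since the file lives under /etc/nginx it needs to be run from a root shell (sudo -s).

```shell
# Add (or update) a user in an htpasswd file without clobbering existing users.
# Usage: add_nginx_user USERNAME [HTPASSWD_FILE]
# (add_nginx_user is a hypothetical wrapper around the real htpasswd command)
add_nginx_user() {
    local user="$1"
    local file="${2:-/etc/nginx/.htpasswd}"

    if [ -f "$file" ]; then
        # File exists: omit -c so existing entries are kept.
        htpasswd "$file" "$user"
    else
        # First user: -c creates the file.
        htpasswd -c "$file" "$user"
    fi
}

# Example (you will be prompted for each password):
# add_nginx_user exampleuser
# add_nginx_user seconduser
```

nginx reads the password file on each request as far as I can tell, but restarting nginx after adding users, as described above, doesn’t hurt.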

nginx login screen

nginx authentication for the Shiny app.

In order to completely secure the app it is highly recommended that it also be secured with SSL. This is a little more work and involves signing an SSL cert; however, this is beyond the scope of this post right now. Refer to this post about securing with SSL. A future post will cover SSL in more detail.

Use GitHub to deploy apps

GitHub can be used to add shiny apps to the /srv/shiny-server directory. This is a great alternative to secure copy (scp). Dean’s post recommends creating a new git repo on the server, then pushing all your shiny apps to that repo on GitHub from your computer and pulling the repo to the server. I’ve set mine up a little differently. I clone, then pull, individual shiny apps to the /srv/shiny-server/sample-apps directory on my instance. This may be more work, but each shiny app is its own repo on GitHub, which is nice for managing issues and development. Check out Dean’s shiny server repo on GitHub; it is a very useful example.
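
The clone-then-pull routine can be sketched as a small function. This is a minimal sketch under a few assumptions: deploy_app is a name I made up, the app directory is named after the repo, and the user running it has write access to the apps directory (otherwise run it from a root shell).

```shell
# Clone a Shiny app repo into the apps directory, or update it if already cloned.
# Usage: deploy_app REPO_URL [APPS_DIR]
# (deploy_app is a hypothetical helper built on plain git clone / git pull)
deploy_app() {
    local repo_url="$1"
    local apps_dir="${2:-/srv/shiny-server/sample-apps}"
    local name

    # Directory name is the repo name without a trailing .git
    name="$(basename "$repo_url" .git)"

    if [ -d "$apps_dir/$name/.git" ]; then
        # Already cloned: fast-forward only, so the pull fails instead of
        # creating a merge commit if the server copy has diverged.
        git -C "$apps_dir/$name" pull --ff-only
    else
        git clone "$repo_url" "$apps_dir/$name"
    fi
}

# Example (the repo URL is a placeholder):
# deploy_app https://github.com/youruser/yourapp.git
```

With the default directory, the app would then be served under 11.22.33.44:3838/sample-apps/ in a folder named after the repo.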

scp data

If you have any flat file data that needs to be loaded for your shiny app you can include it in your GitHub repo. However, if the data is too large to store in GitHub you can secure copy the data to your server with scp. There may be some trial and error required, all dependent on file permissions on your server. If you are logged in as the ubuntu user you should be able to write to directories owned by that user. It may be a good idea to save this procedure in a text file on your computer (to save typing). I put the data in the /home/ubuntu folder. You can move it wherever you need to from this folder. Be sure that your shiny app is pointing to the data on the server.

$ scp -i ~/.ssh/mykey.pem ~/documents/your_data.csv ubuntu@publicDNS:~

That’s all she wrote!

Hopefully you have Shiny Server and RStudio up and running now. It is a bit of work, but it is worth it. This should get you going; I tried to include everything I could think of. Please email questions, comments, or improvements to me; I’ll gladly accept any. In a future post I’ll talk about using Docker to deploy shiny apps on AWS; I think Docker will be my final deployment method for apps for work.

My complete server set up

Here is all the set up code to spin up shiny server to run my apps.

# connect to instance
$ chmod 400 ~/.ssh/MyFirstKey.pem
$ ssh -i ~/.ssh/MyFirstKey.pem ubuntu@publicDNS

# installing system libraries
$ sudo apt-get update
$ sudo apt-get -y install \
    nginx \
    gdebi-core \
    apache2-utils \
    pandoc \
    pandoc-citeproc \
    libssl-dev \
    libcurl4-gnutls-dev \
    libcairo2-dev \
    libgdal-dev \
    libgeos-dev \
    libproj-dev \
    libxml2-dev \
    libxt-dev \
    libv8-dev

# install R, Shiny Server, and RStudio Server
$ sudo sh -c 'echo "deb http://cran.rstudio.com/bin/linux/ubuntu trusty/" >> /etc/apt/sources.list'
$ gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
$ gpg -a --export E084DAB9 | sudo apt-key add -
$ sudo apt-get update
$ sudo apt-get -y install r-base
$ wget https://download3.rstudio.org/ubuntu-12.04/x86_64/shiny-server-1.4.2.786-amd64.deb
$ sudo gdebi shiny-server-1.4.2.786-amd64.deb
$ wget https://download2.rstudio.org/rstudio-server-0.99.902-amd64.deb
$ sudo gdebi rstudio-server-0.99.902-amd64.deb

# install R libraries
## shiny related libraries
$ sudo su - -c "R -e \"install.packages(c('shiny', 'rmarkdown', 'shinydashboard', 'shinyjs'), repos='http://cran.rstudio.com/')\""
## hadleyverse
$ sudo su - -c "R -e \"install.packages(c('ggplot2', 'dplyr', 'tidyr', 'readr', 'lazyeval', 'stringr', 'ggthemes', 'ggExtra', 'magrittr', 'viridis', 'gridExtra', 'lubridate', 'fasttime', 'data.table'), repos='http://cran.rstudio.com/')\""
## spatial analysis
$ sudo su - -c "R -e \"install.packages(c('sp', 'rgdal', 'rgeos', 'adehabitatHR', 'geojsonio', 'maptools'), repos='http://cran.rstudio.com/')\""
## htmlwidgets
$ sudo su - -c "R -e \"install.packages(c('leaflet', 'highcharter', 'DT'), repos='http://cran.rstudio.com/')\""

# password protecting the apps
$ sudo service nginx stop
$ sudo stop shiny-server

## nginx config changes
$ sudo nano /etc/nginx/sites-available/default
server {
    listen 80;

    location / {
        proxy_pass http://127.0.0.1:3838/;
        proxy_redirect http://127.0.0.1:3838/ $scheme://$host/;
        # note, the following 3 lines added 2016-07-11, see updates section
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        auth_basic "Username and Password are required";
        auth_basic_user_file /etc/nginx/.htpasswd;
    }
}

## shiny-server config changes
$ sudo nano /etc/shiny-server/shiny-server.conf

server{
    listen 3838 127.0.0.1;

    location / {
      site_dir /srv/shiny-server;
      log_dir /var/log/shiny-server;
      directory_index on;
    }
}

## create passwords
$ cd /etc/nginx
$ sudo htpasswd -c /etc/nginx/.htpasswd exampleuser

$ sudo start shiny-server
$ sudo service nginx start

Resources

How to get your very own RStudio Server and Shiny Server with DigitalOcean - Dean Attali
Setup Shiny Server on Ubuntu 14.04 - Huidong Tian
Running R on AWS - Markus Schmidberger, AWS Blog
Logging Process - Huidong Tian
Deploying Your Very Own Shiny Server - Morgan Benton
Running Shiny Server with a Proxy - Ian Pylvainen, RStudio Support

Updates

2016-07-09

One package I need to use frequently is RSQLServer. The version on CRAN depends on dplyr 0.4.3; dplyr 0.5.0 has major backend updates that break the RSQLServer installation. To install version 0.4.3 use the code below. Alternatively, I could use devtools to install the development version of RSQLServer, but I would rather install the stable version.

$ sudo su - -c "R -e \"install.packages('https://cran.rstudio.com/src/contrib/Archive/dplyr/dplyr_0.4.3.tar.gz', repos=NULL, type='source')\""

2016-07-11

My app was crashing, apparently due to websocket issues that come up when running apps behind nginx. This issue is fairly common and decently documented on GitHub. The solution was to add a few lines to the default nginx config file at /etc/nginx/sites-available/default. I’ve added those changes to the post.

proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";

2016-07-16

Finally got around to including my own index.html file to act as a landing page for shiny apps on my server. I’ve brushed up on Bootstrap and decided to use it for the landing page. It is a work in progress. It’s on GitHub.

shiny server index.html

The `index.html` landing page for shiny-server. Notice the obvious use of Bootstrap.