Chapter 7 Connecting to Data in Production

7.1 The config package

The config package allows the connection code in R to reference an external file that defines values based on the environment. This process makes it easy to specify values to use for a connection locally and values to use after deployment.

For this example R code:

library(DBI)
library(odbc)
library(config)

dw <- get("datawarehouse")

con <- dbConnect(
   Driver = dw$driver,
   Server = dw$server,
   UID    = dw$uid,
   PWD    = dw$pwd,
   Port   = dw$port,
   Database = dw$database
)

config.yml might look like this:

default:
  datawarehouse:
    driver: 'Postgres' 
    server: 'mydb-test.company.com'
    uid: !expr keyring::key_get("db-credentials")[1,2]'
    pwd: !expr keyring::key_get("db-credentials")[1,2]'
    port: 5432
    database: 'regional-sales-sample'
    
rsconnect:
  datawarehouse:
    driver: 'PostgresPro'
    server: 'mydb-prod.company.com'
    uid: !expr Sys.getenv("DBUSER")
    pwd: !expr Sys.getenv("DBPWD")
    port: 5432
    database: 'regional-sales-full'

The config package determines the active configuration by looking at the R_CONFIG_ACTIVE environment variable. By default, RStudio Connect sets R_CONFIG_ACTIVE to the value rsconnect. In the config file above, the default datawarehouse values would be used locally and the datawarehouse values defined in the rsconnect section would be used on RStudio Connect. Administrators can optionally customize the name of the active configuration used in Connect.

7.2 Environment Variables

When developing content for RStudio Connect, you should never place secrets (keys, tokens, passwords, etc.) in the code itself. Best practices dictate that this kind of sensitive information should be protected through the use of environment variables or another method of configuration such as the config package.

The Vars panel makes it easy to define environment variables which are then exposed to the processes executing your content. Note that there is no way to define environment variables prior to publishing content. If your content code relies on environment variables, publish it in an initial ‘broken’ state, then add the environment variables through this pane before sharing or testing the content.

Environment Variables in RStudio Connect

Environment Variables in RStudio Connect

Click on the Add Environment Variable button, then provide a name and value for your environment variable. For security reasons, once you add a variable, the value will be obscured and cannot be edited. You can always delete a variable and create a new one with the same name.

7.3 Activity: Databases

First: Read This!

Discussion:

Data Management

  • How does your organization connect to data?
  • What data are we exposing to viewers? Is it appropriate for all viewers?
  • What else could we manage with the config package?

Deliverable: Running, Deployed App

  • Update config.yml and Redeploy
  • Edit the Environment Variables in RStudio Connect

Revisit the Deployment Checklist

Database Checklist

Database Checklist