Chapter 9 Plot Caching

Blog: Shiny 1.2.0: Plot caching - November 18, 2018

The Shiny 1.2.0 package release introduced Plot Caching, an important new tool for improving performance and scalability in Shiny applications.

In a nutshell: The term “caching” means that when time-consuming operations are performed, we can save (cache) the results so that the next time that operation is requested, instead of re-running that calculation, we instead go fetch the previously cached result. When applied appropriately, this “fetch” operation should take less time that the original calculation and improve application performace (and user experience) overall.

Shiny’s reactive expressions do some amount of caching by default, and you can use more explicit techniques to cache various data operations. Examples include use of the memoise package, or manually saving intermediate data frames to disk as CSV or RDS.

Plots are a very common and (potentially) expensive to compute type of output object in Shiny applications, which makes them a great candidate for caching. In theory, you could use renderImage to accomplish this, but because Shiny’s renderPlot function contains a lot of complex infrastructure code, it’s actually quite a difficult task.

Shiny v1.2.0 introduces a new function, renderCachedPlot, that makes it much easiter to add plot caching to your application.

9.1 When to use plot caching

A shiny app is a good candidate for plot caching if:

  1. The app has plot outputs that are time-consuming to generate
  2. These plots are a significant fraction of the total amount of time the app spends thinking
  3. Most users are likely to request the same few plots

9.1.1 Using renderCachedPlot

renderChachedPlot()

renderChachedPlot()

The following example is a simple, but computationally expensive, plot output:

output$plot <- renderPlot({
  ggplot(diamonds, aes(carat, price, color = !!input$color_by)) +
    geom_point()
})

The diamonds dataset has 53,940 rows. This plot might take roughly 1580 milliseconds (1.58 seconds) to generate depending on the resources available. For a high traffic Shiny application in production, 1.58 seconds is likely slower than ideal.

Plot caching can be implemented in two steps:

  1. Change renderPlot to renderCachedPlot
  2. Provide a suitable cacheKeyExpr. This is an expression that Shiny will use to determine which invocations of renderCachedPlot should be considered equivalent to each other. In the example case, two plots with different input$color_by values can’t be considered the “same” plot, so the cacheKeyExpr needs to be input$color_by

The example plot code would be updated like this:

output$plot <- renderCachedPlot({
  ggplot(diamonds, aes(carat, price, color = !!input$color_by)) +
    geom_point()
}, cacheKeyExpr = { input$color_by })

With these code changes, the first time a plot with a particular input$color_by is requested, it will take the normal amount of time. But the next time it is requested, it will be almost instant, as the previously rendered plot will be reused.

9.2 Activity: Plot cache benchmarking

First: Update your app code to use renderCachedPlot

Discussion: caching + shinyloadtest

  • How will introducing renderCachedPlot affect our load test experience?

What is the performance comparison between tests?

Deliverable: Re-run Load Test

  • Redeploy the version of the app with renderCachedPlot
  • Re-run the load test and compare the outputs (continue to follow along with runloadtest.R)

9.3 Extended Topics

Plot Caching on RStudio Connect

Shiny can store cached plots in memory, on disk, or with another backend like Redis. There are also a number of options for limiting the size of the cache. Applications deployed on RStudio Connect should use a disk cache and specify a subdirectory of the application directory as the location for the cache. To do so, add this code to the top of your application:

library(shiny)
shinyOptions(cache = diskCache("./cache"))

This option ensures that cached plots will be saved and used across the multiple R processes that RStudio Connect runs in support of an application. In addition, this configuration results in the cache being deleted and reset when new versions of the application are deployed.