Chapter 9 Plot Caching
Blog: Shiny 1.2.0: Plot caching - November 18, 2018
The Shiny 1.2.0 package release introduced Plot Caching, an important new tool for improving performance and scalability in Shiny applications.
In a nutshell: The term “caching” means that when a time-consuming operation is performed, we save (cache) the result so that the next time the operation is requested, we fetch the previously cached result instead of re-running the calculation. When applied appropriately, this “fetch” operation should take less time than the original calculation and improve application performance (and user experience) overall.
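The idea can be sketched in a few lines of base R. This is only a toy illustration, not how Shiny implements caching: the names cache, slow_square and cached_square are hypothetical, and Sys.sleep() stands in for a time-consuming operation.

```r
# Toy in-memory cache: an environment maps keys to previously computed results.
cache <- new.env()
n_computed <- 0  # counts how often the slow path actually runs

slow_square <- function(x) {
  n_computed <<- n_computed + 1
  Sys.sleep(0.2)  # stand-in for a time-consuming operation
  x^2
}

cached_square <- function(x) {
  key <- as.character(x)
  if (!exists(key, envir = cache, inherits = FALSE)) {
    # Cache miss: run the slow calculation and save the result
    assign(key, slow_square(x), envir = cache)
  }
  # Return the saved result (on a hit, the slow calculation is skipped)
  get(key, envir = cache, inherits = FALSE)
}

first  <- cached_square(4)  # computed, takes about 0.2 seconds
second <- cached_square(4)  # fetched from the cache
```

The second call returns the same value without running slow_square() again, which is the whole benefit of a cache hit.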
Shiny’s reactive expressions do some amount of caching by default, and you can use more explicit techniques to cache various data operations. Examples include use of the memoise package, or manually saving intermediate data frames to disk as CSV or RDS.
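As a rough sketch of the manual approach, the helper below saves a result as RDS the first time it is computed and reads it back afterwards. The helper name cached_rds and the use of mtcars as a stand-in for slow data preparation are assumptions made for illustration. Only base R is used.

```r
# Hypothetical helper: compute a value once, then reuse the copy saved on disk.
cached_rds <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))  # cache hit: read the saved object
  }
  result <- compute()      # cache miss: run the expensive step
  saveRDS(result, path)
  result
}

cache_file <- tempfile(fileext = ".rds")

# First call computes the summary and writes it to cache_file
summary_df <- cached_rds(cache_file, function() {
  aggregate(mpg ~ cyl, data = mtcars, FUN = mean)  # stand-in for slow data prep
})

# Second call never runs its compute function because the file already exists
summary_again <- cached_rds(cache_file, function() stop("should not run"))
```

A cache like this never expires on its own, so the file has to be deleted whenever the underlying data changes.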
Plots are a very common type of output in Shiny applications and are (potentially) expensive to compute, which makes them a great candidate for caching. In theory, you could use renderImage to accomplish this, but because Shiny’s renderPlot function contains a lot of complex infrastructure code, it’s actually quite a difficult task.
Shiny v1.2.0 introduces a new function, renderCachedPlot, that makes it much easier to add plot caching to your application.
9.1 When to use plot caching
A Shiny app is a good candidate for plot caching if:
- The app has plot outputs that are time-consuming to generate
- These plots are a significant fraction of the total amount of time the app spends thinking
- Most users are likely to request the same few plots
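To see why the last point matters, here is a back-of-the-envelope model of the average render time as a function of how often a request hits the cache. The function name and every number below are assumptions chosen for illustration, not measurements.

```r
# Hypothetical model: average render time for a given cache hit rate.
expected_render_ms <- function(hit_rate, miss_ms, hit_ms) {
  hit_rate * hit_ms + (1 - hit_rate) * miss_ms
}

# Assumed numbers: a miss costs 1500 ms, a hit costs 10 ms
mostly_repeated <- expected_render_ms(hit_rate = 0.9, miss_ms = 1500, hit_ms = 10)
mostly_unique   <- expected_render_ms(hit_rate = 0.1, miss_ms = 1500, hit_ms = 10)
```

Under these assumptions the average drops to 159 ms when 90% of requests repeat an earlier plot, but stays at 1351 ms when only 10% do, so caching pays off mainly when users ask for the same few plots.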
9.1.1 Using renderCachedPlot
The following example is a simple, but computationally expensive, plot output:
# Scatter plot of all diamonds, colored by the user-selected variable
output$plot <- renderPlot({
  ggplot(diamonds, aes(carat, price, color = !!input$color_by)) +
    geom_point()
})
The diamonds dataset has 53,940 rows. This plot might take roughly 1580 milliseconds (1.58 seconds) to generate depending on the resources available. For a high-traffic Shiny application in production, 1.58 seconds is likely slower than ideal.
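To get a rough number for your own machine, you can time the drawing step outside of Shiny. The helper below is a sketch: time_plot_ms is a hypothetical name, it draws to a temporary PDF file rather than the PNG device Shiny uses, and the base-graphics scatter plot is only a stand-in with the same number of points as diamonds. Treat the result as an estimate.

```r
# Hypothetical helper: time how long a plotting function takes to draw.
time_plot_ms <- function(draw) {
  tmp <- tempfile(fileext = ".pdf")
  grDevices::pdf(file = tmp)  # draw to a throwaway file
  on.exit({
    grDevices::dev.off()
    unlink(tmp)
  })
  elapsed <- system.time(draw())[["elapsed"]]
  elapsed * 1000  # seconds to milliseconds
}

# Base-graphics stand-in with as many points as the diamonds dataset
set.seed(1)
n <- 53940
ms <- time_plot_ms(function() plot(runif(n), runif(n)))
```

To time a ggplot2 plot, wrap it in print() inside the function, for example time_plot_ms(function() print(p)), because a ggplot object is only drawn when it is printed.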
Plot caching can be implemented in two steps:
- Change renderPlot to renderCachedPlot.
- Provide a suitable cacheKeyExpr. This is an expression that Shiny will use to determine which invocations of renderCachedPlot should be considered equivalent to each other. In the example case, two plots with different input$color_by values can’t be considered the “same” plot, so the cacheKeyExpr needs to be input$color_by.
The example plot code would be updated like this:
# Same plot as before, now cached per value of input$color_by
output$plot <- renderCachedPlot({
  ggplot(diamonds, aes(carat, price, color = !!input$color_by)) +
    geom_point()
}, cacheKeyExpr = { input$color_by })
With these code changes, the first time a plot with a particular input$color_by is requested, it will take the normal amount of time. But the next time it is requested, it will be almost instant, as the previously rendered plot will be reused.
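For context, the snippet can be dropped into a complete app along these lines. This is a minimal sketch, not the workshop app: the page layout, the input label, and the choice of cut, color and clarity as selectable columns are assumptions. varSelectInput is used because the !! in the plot code expects the input to be a symbol rather than a string.

```r
library(shiny)
library(ggplot2)

ui <- fluidPage(
  # varSelectInput returns the chosen column as a symbol, which !! can unquote
  varSelectInput(
    "color_by", "Color by:",
    data = diamonds[, c("cut", "color", "clarity")]  # assumed set of choices
  ),
  plotOutput("plot")
)

server <- function(input, output, session) {
  output$plot <- renderCachedPlot({
    ggplot(diamonds, aes(carat, price, color = !!input$color_by)) +
      geom_point()
  }, cacheKeyExpr = { input$color_by })  # one cached plot per selected column
}

app <- shinyApp(ui, server)
# Launch interactively with: runApp(app)
```

If a plot depends on more than one input, the cache key has to include all of them, for example by returning a list of the relevant inputs from cacheKeyExpr.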
9.2 Activity: Plot cache benchmarking
First: Update your app code to use renderCachedPlot
Discussion: caching + shinyloadtest
- How will introducing renderCachedPlot affect our load test experience?
- What is the performance comparison between tests?
Deliverable: Re-run Load Test
- Redeploy the version of the app with renderCachedPlot
- Re-run the load test and compare the outputs (continue to follow along with runloadtest.R)
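One way to compare the two runs side by side is to load both result directories into a single shinyloadtest report. The sketch below is not the content of runloadtest.R: the function name compare_load_tests, its arguments, and the run labels are hypothetical, and it assumes each load test wrote its results to its own directory.

```r
# Hypothetical helper: build one report that contrasts two load test runs.
compare_load_tests <- function(baseline_dir, cached_dir,
                               report_file = "report.html") {
  # Each named argument becomes a labelled run in the report
  runs <- shinyloadtest::load_runs(
    "renderPlot" = baseline_dir,
    "renderCachedPlot" = cached_dir
  )
  shinyloadtest::shinyloadtest_report(runs, report_file, open_browser = FALSE)
  invisible(runs)
}

# Example call with placeholder directory names:
# compare_load_tests("run_baseline", "run_cached")
```

In the resulting report, look at how the time spent on the plot output differs between the two runs.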
9.3 Extended Topics
Plot Caching on RStudio Connect
Shiny can store cached plots in memory, on disk, or with another backend like Redis. There are also a number of options for limiting the size of the cache. Applications deployed on RStudio Connect should use a disk cache and specify a subdirectory of the application directory as the location for the cache. To do so, add this code to the top of your application:
library(shiny)
shinyOptions(cache = diskCache("./cache"))
This option ensures that cached plots will be saved and used across the multiple R processes that RStudio Connect runs in support of an application. In addition, this configuration results in the cache being deleted and reset when new versions of the application are deployed.
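The size limits mentioned above can be set when the cache is created. The values below are assumptions for illustration and should be tuned for your own application.

```r
library(shiny)

# Assumed limits for illustration only
shinyOptions(cache = diskCache(
  "./cache",
  max_size = 50 * 1024^2,  # evict entries once the cache exceeds about 50 MB
  max_age = 24 * 60 * 60   # evict entries older than one day (in seconds)
))
```

Newer Shiny releases deprecate diskCache() in favor of cachem::cache_disk(), which takes similar arguments, so check the documentation for the Shiny version you deploy with.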