(tie-dye graphs)
Long time series datasets can be difficult to plot and examine using static figures. Interactive plots made with the dygraphs package are a great tool, but they are not currently easy to create from tidy data. This package provides methods for dygraph() that make it easier to plot tidy time series. Currently there are methods for data frames and tibbles, grouped tibbles, and tsibbles.
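For comparison, here is a minimal sketch of the manual route that plain dygraphs requires for long-format data: reshape to one column per series, convert to an xts object, then plot. The data frame `tidy_levels`, the lake names, and the values are made up for illustration, and this is not necessarily how tydygraphs performs the conversion internally.

```r
library(xts)
library(dygraphs)

# Hypothetical long-format (tidy) data; names and values are illustrative only
tidy_levels <- data.frame(
  lake = rep(c("lake_a", "lake_b"), each = 3),
  measurement_date = rep(as.Date(c("2000-01-01", "2000-02-01", "2000-03-01")),
                         times = 2),
  water_level_m = c(10.1, 10.2, 10.3, 20.1, 20.2, 20.3)
)

# Reshape to wide format: one row per date, one column per lake
wide_levels <- reshape(tidy_levels,
                       idvar = "measurement_date",
                       timevar = "lake",
                       direction = "wide")

# Build an xts object indexed by date from the series columns
series_columns <- setdiff(names(wide_levels), "measurement_date")
levels_xts <- xts(as.matrix(wide_levels[, series_columns]),
                  order.by = wide_levels$measurement_date)

# Plot with the original dygraphs function
manual_plot <- dygraphs::dygraph(levels_xts)
manual_plot
```

Every extra grouping variable or time column adds more reshaping of this kind, which is the bookkeeping the dygraph() methods in this package are meant to take over.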
Currently tydygraphs is only available on GitHub. You can install the current development version using remotes or by downloading and building from source:
# Install using remotes
remotes::install_github("jpshanno/tydygraphs")
# or from source
tydygraphs_source <- file.path(tempdir(), "tydygraphs-master.zip")
download.file("https://github.com/jpshanno/tydygraphs/archive/master.zip",
              tydygraphs_source)
unzip(tydygraphs_source,
      exdir = dirname(tydygraphs_source))
# Strip the ".zip" extension to get the path of the unpacked source directory
install.packages(sub("\\.zip$", "", tydygraphs_source),
                 repos = NULL,
                 type = "source")
The dataset great_lakes_hydro included in tydygraphs demonstrates the problems with examining long-term time series datasets and the benefits of using dygraphs. The dataset consists of monthly water levels, precipitation, evaporation, discharge (outflow), and runoff (inflow) for Lake Superior and the combined Lakes Michigan and Huron. The data were downloaded from NOAA's Great Lakes Dashboard Project (Smith et al. 2016).
library(tydygraphs)
#> Loading required package: dygraphs
#>
#> Attaching package: 'tydygraphs'
#> The following object is masked from 'package:dygraphs':
#>
#> dygraph
great_lakes_hydro
#> # A tibble: 1,535 x 7
#>    lake  measurement_date water_level_m precip_mm evaporation_mm
#>    <chr> <date>                   <dbl>     <dbl>          <dbl>
#>  1 Supe… 1950-01-01              183.42        95         128.15
#>  2 Supe… 1950-02-01             183.362      39.1           49.4
#>  3 Supe… 1950-03-01             183.316      48.2          38.01
#>  4 Supe… 1950-04-01             183.347      72.9           25.4
#>  5 Supe… 1950-05-01             183.539      85.3          -0.51
#>  6 Supe… 1950-06-01             183.725      95.5          -4.86
#>  7 Supe… 1950-07-01             183.801        82          -5.26
#>  8 Supe… 1950-08-01             183.841      81.8           1.66
#>  9 Supe… 1950-09-01              183.81      56.1          24.13
#> 10 Supe… 1950-10-01             183.795      61.6          42.62
#> # … with 1,525 more rows, and 2 more variables: discharge_mm <dbl>,
#> #   runoff_mm <dbl>
Let's start simple and show how we can use the function for a single time series by filtering the data and supplying a single variable to dygraph(). If nothing is supplied for time, the first column that represents a time index (here measurement_date) is used.
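The default described above can be pictured with a small base R sketch. The helper `first_time_column()` is hypothetical and only illustrates the idea of picking the first Date or POSIXct column; the package's actual detection logic may differ.

```r
# Hypothetical helper: name of the first Date or POSIXct column in a data frame
first_time_column <- function(data) {
  is_time <- vapply(data,
                    function(column) inherits(column, c("Date", "POSIXct")),
                    logical(1))
  if (!any(is_time)) {
    stop("No Date or POSIXct column found; supply the time column explicitly.")
  }
  names(data)[which(is_time)[1]]
}

# Two rows shaped like great_lakes_hydro, using values from the printout above
example_data <- data.frame(
  lake = c("Superior", "Superior"),
  measurement_date = as.Date(c("1950-01-01", "1950-02-01")),
  water_level_m = c(183.42, 183.362)
)

first_time_column(example_data)
#> [1] "measurement_date"
```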
# dplyr supplies filter() here and group_by() in a later example
library(dplyr)
great_lakes_hydro %>%
  filter(lake == "Superior") %>%
  dygraph(water_level_m)
#> Registered S3 method overwritten by 'xts':
#> method from
#> as.zoo.xts zoo
As you may have noticed in the data, all of the other variables are expressed in millimeters, but because tydygraphs is built on top of dygraphs we can easily add a second y-axis to look at precipitation in the same plot. This plot also makes it clear why a tool like dygraphs is great for long time series: with the entire dataset displayed, the plot is nearly useless for exploratory analysis or diagnostics, but we can now zoom in on periods of interest.
two_variables <-
  great_lakes_hydro %>%
  filter(lake == "Superior") %>%
  dygraph(precip_mm,
          water_level_m) %>%
  dySeries("precip_mm",
           axis = "y2")
two_variables
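Because the two axes carry different units, labeling them helps readers. The following self-contained sketch uses plain dygraphs with an xts object built from the first five Lake Superior rows printed earlier; the object names `superior` and `labeled_plot` are only illustrative. The same dyAxis() calls should also work when piped after the tydygraphs plot, assuming it returns a standard dygraphs widget.

```r
library(xts)
library(dygraphs)

# First five Lake Superior rows from the printout, as an xts object
superior <- xts(
  cbind(water_level_m = c(183.42, 183.362, 183.316, 183.347, 183.539),
        precip_mm = c(95, 39.1, 48.2, 72.9, 85.3)),
  order.by = seq(as.Date("1950-01-01"), by = "month", length.out = 5)
)

# Put precipitation on the secondary axis and label both axes with their units
labeled_plot <- dygraphs::dygraph(superior) %>%
  dySeries("precip_mm", axis = "y2") %>%
  dyAxis("y", label = "Water level (m)") %>%
  dyAxis("y2", label = "Precipitation (mm)", independentTicks = TRUE)
labeled_plot
```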
If we wanted to use this plot in something like a Shiny application or a blog post, we could build on it further using the dygraphs customizations. The chunk below makes precipitation into a step plot, adds a border around the water level series, and starts the plot zoomed in on a period of interest.
There are lots of ways to customize dygraphs, and I highly recommend looking at the dygraphs documentation.
# Start with the plot from above
two_variables %>%
  # Modify precipitation to make a filled step plot with no border by adjusting
  # the strokePattern
  dySeries("precip_mm",
           axis = "y2",
           stepPlot = TRUE,
           fillGraph = TRUE,
           color = "dodgerblue",
           strokePattern = c(0, 1)) %>%
  # Modify water level to add a white border around the line to ensure it stands
  # out from precipitation
  dySeries("water_level_m",
           strokeBorderWidth = 2,
           color = "darkblue") %>%
  dyRangeSelector(dateWindow = c("2000-01-01", "2010-01-01")) %>%
  dyLegend(labelsSeparateLines = TRUE)
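For the Shiny use case mentioned above, a minimal sketch could look like the following. It uses dygraphOutput() and renderDygraph() from dygraphs with a small xts object built from the printed Lake Superior values; the names `superior_levels`, `ui`, `server`, and `app` are illustrative. Swapping in a tydygraphs plot inside renderDygraph() should work the same way, assuming it returns a standard dygraphs widget.

```r
library(shiny)
library(dygraphs)
library(xts)

# First five Lake Superior water levels from the printout above
superior_levels <- xts(
  cbind(water_level_m = c(183.42, 183.362, 183.316, 183.347, 183.539)),
  order.by = seq(as.Date("1950-01-01"), by = "month", length.out = 5)
)

# Page with a single dygraph output slot
ui <- fluidPage(
  titlePanel("Lake Superior water levels"),
  dygraphOutput("water_levels")
)

# Render the interactive plot with a range selector for zooming
server <- function(input, output, session) {
  output$water_levels <- renderDygraph({
    dygraphs::dygraph(superior_levels) %>%
      dyRangeSelector()
  })
}

app <- shinyApp(ui = ui, server = server)
# In an interactive session, launch the app with: runApp(app)
```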
The final example shows how we can plot our tidy data by passing a grouped tibble to dygraph().
great_lakes_hydro %>%
  group_by(lake) %>%
  dygraph(water_level_m)
Smith, Joeseph P., Timothy S. Hunter, Anne H. Clites, Craig A. Stow, Tad Slawecki, Glenn C. Muhr, and Andrew D. Gronewold. 2016. "An Expandable Web-Based Platform for Visually Analyzing Basin-Scale Hydro-Climate Time Series Data." Environmental Modelling & Software 78 (April). Elsevier BV: 97–105. https://doi.org/10.1016/j.envsoft.2015.12.005.