Chapter 3 reporting-googlesheets

Goal: Create a daily percent to goal report

As described previously, a common practice is to export aggregated code from Civis into a Google Sheet. In this section we will work with Google Sheets to create a report that can be used to schedule automatic reporting.

To use the googlesheets package you must first install it (install.packages("googlesheets")). To load a spreadsheet use the gs_title() or gs_url() functions. The former takes the name of the sheet and the latter uses the url of it. Once this line is ran Google will open and require you to authenticate. This will create .httr-oath file in your working directory. This contains your authorization token which will be used later for automating this workflow.

library(googlesheets)

# register the sheet
sheet <- gs_url("https://docs.google.com/spreadsheets/d/1tROfDCP8meClFSDupPJViEQVIZ-Fsn36U9on7qSERzs/edit?usp=sharing")
## Sheet-identifying info appears to be a browser URL.
## googlesheets will attempt to extract sheet key from the URL.
## Putative key: 1tROfDCP8meClFSDupPJViEQVIZ-Fsn36U9on7qSERzs
## Auto-refreshing stale OAuth token.
## Sheet successfully identified: "R 4 Progressive Campaigns"
# read the `weekly_canvas` tab
weekly_canvass <- gs_read(sheet, "weekly_canvass")
## Accessing worksheet titled 'weekly_canvass'.
## Parsed with column specification:
## cols(
##   turf_code = col_character(),
##   week = col_double(),
##   n_pledged = col_double(),
##   vol_yes = col_double()
## )
# read the `goals` tab
goals <- gs_read(sheet, "goals")
## Accessing worksheet titled 'goals'.
## Parsed with column specification:
## cols(
##   week = col_double(),
##   region = col_character(),
##   goal = col_double()
## )
weekly_canvass
## # A tibble: 66 x 4
##    turf_code  week n_pledged vol_yes
##    <chr>     <dbl>     <dbl>   <dbl>
##  1 A             1       114      41
##  2 A             2       109      28
##  3 A             3        96      33
##  4 A             4       121      34
##  5 A             5        95      32
##  6 A             6        99      29
##  7 A             7        78      28
##  8 A             8       111      29
##  9 A             9       119      34
## 10 A            10       119      27
## # … with 56 more rows
goals
## # A tibble: 312 x 3
##     week region  goal
##    <dbl> <chr>  <dbl>
##  1     1 A        120
##  2     1 B        130
##  3     1 C         70
##  4     1 D        120
##  5     1 E         90
##  6     1 F        130
##  7     2 A        110
##  8     2 B         90
##  9     2 C        130
## 10     2 D         70
## # … with 302 more rows

Notice that this code creates two tibbles in memory. Now everything else is the same. We will work within the context of an R Markdown document (intro to R Markdown).

We want to create a simple report which will be emailed out to the organizing director each morning with the most up to date weekly PTG.

Since this is a weekly report, we want to filter to the current week. In this case, the most recent data is from the 11th week of the month. We will filter both the weekly_canvass and goals tibbles to these weeks and then join. Note that it is important to filter before joining as joining can be computationally intensive. We want to reduce the data before joining it.

current_week <- weekly_canvass %>% 
  filter(week == 11) %>% 
  left_join(filter(goals, week == 11),
            by = c("turf_code" = "region", "week")) %>% 
  mutate(ptg = n_pledged / goal)

current_week
## # A tibble: 6 x 6
##   turf_code  week n_pledged vol_yes  goal   ptg
##   <chr>     <dbl>     <dbl>   <dbl> <dbl> <dbl>
## 1 A            11        76      20    90 0.844
## 2 B            11        51      20    50 1.02 
## 3 C            11        87      17   130 0.669
## 4 D            11        83      31    60 1.38 
## 5 E            11        55      20    70 0.786
## 6 F            11        88      29    70 1.26

While it nice to have each region’s PTG, it’s also useful to have the entire program weekly PTG. We can create an aggregate table and bind it to the existing table.

totals <- current_week %>% 
  bind_rows(
    current_week %>% 
      summarise(n_pledged = sum(n_pledged),
                goal = sum(goal),
                ptg = n_pledged / goal,
                vol_yes = sum(vol_yes),
                turf_code = "Total", 
                week = mean(week)) 
  )

totals
## # A tibble: 7 x 6
##   turf_code  week n_pledged vol_yes  goal   ptg
##   <chr>     <dbl>     <dbl>   <dbl> <dbl> <dbl>
## 1 A            11        76      20    90 0.844
## 2 B            11        51      20    50 1.02 
## 3 C            11        87      17   130 0.669
## 4 D            11        83      31    60 1.38 
## 5 E            11        55      20    70 0.786
## 6 F            11        88      29    70 1.26 
## 7 Total        11       440     137   470 0.936
totals %>% 
  ggplot(aes(turf_code, ptg)) +
  geom_col() +
  geom_hline(yintercept = 1, lty = 2, alpha = .4) +
  theme_minimal() +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(title = "Weekly PTG", 
       y = "% to goal",
       x = "Turf Code")

It is important that a clean table is presented alongside the chart. For this we will use knitr::kable().

totals %>% 
  mutate(ptg = ptg * 100) %>% 
  select(`Turf Code` = turf_code,
         `# Pledged` = n_pledged, 
         Goal = goal, 
         `% to Goal` = ptg) %>% 
  knitr::kable(digits = 2)
Turf Code # Pledged Goal % to Goal
A 76 90 84.44
B 51 50 102.00
C 87 130 66.92
D 83 60 138.33
E 55 70 78.57
F 88 70 125.71
Total 440 470 93.62