-
Notifications
You must be signed in to change notification settings - Fork 3
who is visiting this content (table view) #38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| source("renv/activate.R") |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| .posit/ | ||
| app_cache/ |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,156 @@ | ||
| library(shiny) | ||
| library(bslib) | ||
| library(shinyjs) | ||
| library(connectapi) | ||
| library(dplyr) | ||
| library(glue) | ||
| library(lubridate) | ||
| library(tidyr) | ||
|
|
||
| shinyOptions( | ||
| cache = cachem::cache_disk("./app_cache/cache/", max_age = 60 * 60 * 8) | ||
| ) | ||
|
|
||
| source("get_usage.R") | ||
|
|
||
| ui <- page_fillable( | ||
| useShinyjs(), | ||
| theme = bs_theme(version = 5), | ||
|
|
||
| card( | ||
| card_header("Who is Visiting This Content?"), | ||
| layout_sidebar( | ||
| sidebar = sidebar( | ||
| title = "No Filters Yet", | ||
| open = FALSE, | ||
|
|
||
| actionButton("clear_cache", "Clear Cache", icon = icon("refresh")) | ||
| ), | ||
|
|
||
| textInput( | ||
| "content_guid", | ||
| "Content GUID" | ||
| ), | ||
|
|
||
| h4( | ||
| id = "guid_input_msg", | ||
| "Please enter a content GUID" | ||
| ), | ||
|
|
||
| textOutput("summary_message"), | ||
|
|
||
| tabsetPanel( | ||
| id = "content_visit_tables", | ||
| tabPanel( | ||
| "List of Visits", | ||
| tableOutput("all_visits") | ||
| ), | ||
| tabPanel( | ||
| "Aggregated Visits", | ||
| tableOutput("aggregated_visits") | ||
| ) | ||
| ) | ||
| ) | ||
| ) | ||
| ) | ||
|
|
||
| server <- function(input, output, session) { | ||
| # Cache invalidation button ---- | ||
| cache <- cachem::cache_disk("./app_cache/cache/") | ||
| observeEvent(input$clear_cache, { | ||
| print("Cache cleared!") | ||
| cache$reset() # Clears all cached data | ||
| session$reload() # Reload the app to ensure fresh data | ||
| }) | ||
|
|
||
| observe({ | ||
| if (nchar(input$content_guid) == 0) { | ||
| show("guid_input_msg") | ||
| hide("content_visit_tables") | ||
| } else { | ||
| hide("guid_input_msg") | ||
| show("content_visit_tables") | ||
| } | ||
| }) | ||
|
Comment on lines
+66
to
+74
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this would go here, but what do you think about adding in a parser that lets someone paste in a full URL that includes the GUID? This is a bit of an extension / scope creep so if we want to make a separate issue for it and do that in a follow on that's totally fine. But the UX of needing to select just the GUID form the URL (or get to it in the settings panel) could be better. We had good success with the publisher where the field that accepts GUID also accepts the URL (and, in fact, in the publisher it's the GUID alone that is the hidden easter egg that it just works, we though that folks would be more confident / able to / comfortable with grabbing the URL)
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was trying to keep this as minimal as possible, figuring it wouldn't be the final UI for selecting a piece of content. Perhaps it would be fine to add a bit of extra logic to extract the GUID if a URL is pasted in. OTOH, perhaps even having this hide / show logic and a "paste in a content guid" message is too much code to pour into something that won't be a final UI. I can really see both arguments. I'd rather not put more effort in here till a later edition, at the very least. |
||
|
|
||
| # Loading and processing data ---- | ||
| client <- connect() | ||
|
|
||
| # Default dates. "This week" is best "common sense" best represented by six | ||
| # days ago thru the end of today. Without these, content takes too long to | ||
| # display on some servers. | ||
| date_range <- reactive({ | ||
| list( | ||
| from_date = today() - ddays(6), | ||
| to_date = today() | ||
| ) | ||
| }) | ||
|
Comment on lines
+79
to
+87
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a great default + kick the can of "which is the best period? Should I do forever? etc" for when we get to the filtering edition later 💯 |
||
|
|
||
| content <- reactive({ | ||
| # Grab the entire content data frame here and filter it using the pasted-in | ||
| # GUID to obtain content title and other metadata, rather than making a | ||
| # request to `v1/content/{GUID}`. If this were a prod, standalone dashboard, | ||
| # might be better to call that endpoint. | ||
| get_content(client) | ||
| }) |> bindCache("static_key") | ||
|
|
||
| user_names <- reactive({ | ||
| get_users(client) |> | ||
| mutate(full_name = paste(first_name, last_name)) |> | ||
| select(user_guid = guid, full_name, username) | ||
| }) |> bindCache("static_key") | ||
|
|
||
| usage_data <- reactive({ | ||
| get_usage( | ||
| client, | ||
| from = date_range()$from_date, | ||
| to = date_range()$to_date + hours(23) + minutes(59) + seconds(59) | ||
| ) | ||
| }) |> bindCache(date_range()$from_date, date_range()$to_date) | ||
|
|
||
| # Compute data | ||
| all_visits_data <- reactive({ | ||
| usage_data() |> | ||
| filter(content_guid == input$content_guid) |> | ||
| left_join(user_names(), by = "user_guid") |> | ||
| replace_na(list(full_name = "[Anonymous]")) |> | ||
| arrange(desc(timestamp)) |> | ||
| select(timestamp, full_name, username) | ||
| }) |> bindCache(date_range()$from_date, date_range()$to_date, input$content_guid) | ||
|
|
||
| aggregated_visits_data <- reactive({ | ||
| usage_data() |> | ||
| filter(content_guid == input$content_guid) |> | ||
| group_by(user_guid) |> | ||
| summarize(n_visits = n()) |> | ||
| left_join(user_names(), by = "user_guid") |> | ||
| replace_na(list(full_name = "[Anonymous]")) |> | ||
| arrange(desc(n_visits)) |> | ||
| select(n_visits, full_name, username) | ||
| }) |> bindCache(date_range()$from_date, date_range()$to_date, input$content_guid) | ||
|
Comment on lines
+121
to
+130
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This might very well be another issue/PR later, since it wasn't in the original and is conceptually a bit of an addition, but: If I'm remembering / understanding correctly items that have multiple URL routes and that folks navigate through will have a bunch of lines in the firehosue endpoint that are really one session of visits (to use the correct word to describe it, but we have overloaded that slightly with our old API endpoint! I'm meaning the abstract / general use not specifically what comes from our shiny sessions endpoint). What do you think about adding in a collapse here too? Something like "entries within N minutes of each other are considered one visit"? We definitely wouldn't want only one measure or the other — both are valuable.
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, this is definitely a priority! Following our discussion What do you think about separating this investigation out as its own issue? Kind of a spike on computing the right metric? It'll involve some research and experimentation, and possibly discussion with @marcosnav.
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I created https://github.com/posit-dev/connect/issues/30515 about figuring out hits vs sessions |
||
|
|
||
| summary_message <- reactive({ | ||
| content_title <- content() |> | ||
| filter(guid == input$content_guid) |> | ||
| pull(title) | ||
| hits <- all_visits_data() | ||
| glue( | ||
| "Content '{content_title}' had {nrow(hits)} between ", | ||
| "{min(hits$timestamp)} and {max(hits$timestamp)}." | ||
| ) | ||
| }) | ||
|
|
||
|
|
||
| output$summary_message <- renderText(summary_message()) | ||
| output$all_visits <- renderTable( | ||
| all_visits_data() |> | ||
| transmute(timestamp = format(timestamp, "%Y-%m-%d %H:%M:%S"), full_name, username) |> | ||
| rename("Time" = timestamp, "Full Name" = full_name, "Username" = username) | ||
| ) | ||
| output$aggregated_visits <- renderTable( | ||
| aggregated_visits_data() |> | ||
| rename("Total Visits" = n_visits, "Full Name" = full_name, "Username" = username) | ||
| ) | ||
| } | ||
|
|
||
| shinyApp(ui, server) | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,60 @@ | ||
| library(connectapi) | ||
|
|
||
| # This file contains functions that ultimately will more likely be part of | ||
| # connectapi. As such, I'm not using dplyr or pipes here. | ||
|
|
||
| NA_datetime_ <- vctrs::new_datetime(NA_real_, tzone = "UTC") | ||
| NA_list_ <- list(list()) | ||
|
|
||
| usage_dtype <- tibble::tibble( | ||
| "id" = NA_integer_, | ||
| "user_guid" = NA_character_, | ||
| "content_guid" = NA_character_, | ||
| "timestamp" = NA_datetime_, | ||
| "data" = NA_list_ | ||
| ) | ||
|
|
||
| # A rough implementation of how a new firehose usage function would work in | ||
| # `connectapi`. | ||
| get_usage_firehose <- function(client, from = NULL, to = NULL) { | ||
| usage_raw <- client$GET( | ||
| connectapi:::unversioned_url("instrumentation", "content", "hits"), | ||
| query = list( | ||
| from = from, | ||
| to = to | ||
| ) | ||
| ) | ||
|
|
||
| # FIXME for connectapi: This is slow, it's where most of the slowness is with | ||
| # the new endpoint. | ||
| usage_parsed <- connectapi:::parse_connectapi_typed(usage_raw, usage_dtype) | ||
|
Comment on lines
+28
to
+30
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we already have an issue in connectapi for this? Also if it's getting in our way (by being slow) here, do we even need to enforce typing like this here?
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is now an issue to track it: posit-dev/connectapi#383 :) It's not horrendously slow, but it is definitely at least worth investigating as a And as for typing enforcement — I guess that depends on what you mean by "need". It's nice: Another way I'm thinking about it is… if I just coded from scratch all the things However, if, say, the base
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I can't quite tell from this part of your comment;
Is
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wasn't sure when I wrote the comment (I wanted to avoid opening cans of worms while I was responding to your comments). It isn't much faster: > length(usage_raw)
[1] 485794
> system.time(connectapi:::parse_connectapi_typed(usage_raw, usage_dtype))
user system elapsed
28.245 1.488 29.798
> system.time(connectapi:::parse_connectapi(usage_raw))
user system elapsed
17.602 0.330 18.128I'm also going to post this info to the |
||
|
|
||
| usage_parsed[c("user_guid", "content_guid", "timestamp")] | ||
| } | ||
|
|
||
| get_usage_legacy <- function(client, from = NULL, to = NULL) { | ||
| shiny_usage <- get_usage_shiny(client, limit = Inf, from = from, to = to) | ||
| shiny_usage_cols <- shiny_usage[c("user_guid", "content_guid")] | ||
| shiny_usage_cols$timestamp <- shiny_usage$started | ||
|
|
||
| static_usage <- get_usage_static(client, limit = Inf, from = from, to = to) | ||
| static_usage_cols <- static_usage[c("user_guid", "content_guid")] | ||
| static_usage_cols$timestamp <- static_usage$time | ||
|
|
||
| bind_rows(shiny_usage_cols, static_usage_cols) | ||
| } | ||
|
|
||
| get_usage <- function(client, from = NULL, to = NULL) { | ||
| from <- format(from, "%Y-%m-%dT%H:%M:%SZ") | ||
| to <- format(to, "%Y-%m-%dT%H:%M:%SZ") | ||
| tryCatch( | ||
| { | ||
| print("Trying firehose usage endpoint.") | ||
| get_usage_firehose(client, from, to) | ||
| }, | ||
| error = function(e) { | ||
| print("Could not use firehose endpoint; trying legacy usage endpoints.") | ||
| get_usage_legacy(client, from, to) | ||
| } | ||
| ) | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, requests for this dashboard respond pretty fast, do you think managing cache is still needed?
It'll be nice if this one can be simplified by removing cache operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one will eventually be a sub-view of the other dashboard, and at that point it'll just probably get its data passed in from that dashboard's cache.
For now, since it needs to load the firehouse, and there is some time taken e.g. on Dogfood for processing the ~500000 records returned, caching is helpful for demo purposes.
That's from memory -- I can definitely try removing the cache to see if it performs sufficiently well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It takes 30 seconds to process the firehose endpoint in
connectapi. There is an issue I opened to see if that performance cost can be reduced, but for now, caching after that processing step makes the app responsive faster after loading.