-
Notifications
You must be signed in to change notification settings - Fork 3
who is visiting this content (table view) #38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
who is visiting this content (table view) #38
Conversation
💯 right about this, drafting error when we (I, I think in this case) was breaking things down. |
| observe({ | ||
| if (nchar(input$content_guid) == 0) { | ||
| show("guid_input_msg") | ||
| hide("content_visit_tables") | ||
| } else { | ||
| hide("guid_input_msg") | ||
| show("content_visit_tables") | ||
| } | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would go here, but what do you think about adding in a parser that lets someone paste in a full URL that includes the GUID? This is a bit of an extension / scope creep so if we want to make a separate issue for it and do that in a follow on that's totally fine. But the UX of needing to select just the GUID form the URL (or get to it in the settings panel) could be better. We had good success with the publisher where the field that accepts GUID also accepts the URL (and, in fact, in the publisher it's the GUID alone that is the hidden easter egg that it just works, we though that folks would be more confident / able to / comfortable with grabbing the URL)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to keep this as minimal as possible, figuring it wouldn't be the final UI for selecting a piece of content.
Perhaps it would be fine to add a bit of extra logic to extract the GUID if a URL is pasted in. OTOH, perhaps even having this hide / show logic and a "paste in a content guid" message is too much code to pour into something that won't be a final UI. I can really see both arguments. I'd rather not put more effort in here till a later edition, at the very least.
| # FIXME for connectapi: This is slow, it's where most of the slowness is with | ||
| # the new endpoint. | ||
| usage_parsed <- connectapi:::parse_connectapi_typed(usage_raw, usage_dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we already have an issue in connectapi for this? Also if it's getting in our way (by being slow) here, do we even need to enforce typing like this here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is now an issue to track it: posit-dev/connectapi#383 :)
It's not horrendously slow, but it is definitely at least worth investigating as a connectapi issue.
And as for typing enforcement — I guess that depends on what you mean by "need". It's nice: parse_connectapi_typed() "automatically" handles deserializing the JSON from Connect into the correct R types and structure, abstracting away a lot of stuff that's fiddly in R, and there's value in that.
Another way I'm thinking about it is… if I just coded from scratch all the things parse_connectapi_typed does to shape the data into the shape I want, maybe be just as slow. And if that were the case, then I'd want to keep it, because it makes life a lot simpler in a few ways.
However, if, say, the base parse_connectapi() function, which doesn't do any type enforcement, was much faster, than it'd probably be better to use that and handle type conversion in a different way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't quite tell from this part of your comment;
However, if, say, the base
parse_connectapi()function, which doesn't do any type enforcement, was much faster, than it'd probably be better to use that and handle type conversion in a different way.
Is parse_conenctapi() much faster in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure when I wrote the comment (I wanted to avoid opening cans of worms while I was responding to your comments).
It isn't much faster:
> length(usage_raw)
[1] 485794
> system.time(connectapi:::parse_connectapi_typed(usage_raw, usage_dtype))
user system elapsed
28.245 1.488 29.798
> system.time(connectapi:::parse_connectapi(usage_raw))
user system elapsed
17.602 0.330 18.128I'm also going to post this info to the connectapi issue so that it doesn't get lost.
| # Default dates. "This week" is best "common sense" best represented by six | ||
| # days ago thru the end of today. Without these, content takes too long to | ||
| # display on some servers. | ||
| date_range <- reactive({ | ||
| list( | ||
| from_date = today() - ddays(6), | ||
| to_date = today() | ||
| ) | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great default + kick the can of "which is the best period? Should I do forever? etc" for when we get to the filtering edition later 💯
| aggregated_visits_data <- reactive({ | ||
| usage_data() |> | ||
| filter(content_guid == input$content_guid) |> | ||
| group_by(user_guid) |> | ||
| summarize(n_visits = n()) |> | ||
| left_join(user_names(), by = "user_guid") |> | ||
| replace_na(list(full_name = "[Anonymous]")) |> | ||
| arrange(desc(n_visits)) |> | ||
| select(n_visits, full_name, username) | ||
| }) |> bindCache(date_range()$from_date, date_range()$to_date, input$content_guid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might very well be another issue/PR later, since it wasn't in the original and is conceptually a bit of an addition, but:
If I'm remembering / understanding correctly items that have multiple URL routes and that folks navigate through will have a bunch of lines in the firehosue endpoint that are really one session of visits (to use the correct word to describe it, but we have overloaded that slightly with our old API endpoint! I'm meaning the abstract / general use not specifically what comes from our shiny sessions endpoint). What do you think about adding in a collapse here too? Something like "entries within N minutes of each other are considered one visit"? We definitely wouldn't want only one measure or the other — both are valuable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is definitely a priority!
Following our discussion What do you think about separating this investigation out as its own issue? Kind of a spike on computing the right metric? It'll involve some research and experimentation, and possibly discussion with @marcosnav.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created https://github.com/posit-dev/connect/issues/30515 about figuring out hits vs sessions
marcosnav
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far looks good to me, some minor feedback from running it.
- Table headers,
timestamp->Time,full_name->Full Name,n_visits->Total Visits - Values on the timestamp column to be human friendly
| shinyOptions( | ||
| cache = cachem::cache_disk("./app_cache/cache/", max_age = 60 * 60 * 8) | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, requests for this dashboard respond pretty fast, do you think managing cache is still needed?
It'll be nice if this one can be simplified by removing cache operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one will eventually be a sub-view of the other dashboard, and at that point it'll just probably get its data passed in from that dashboard's cache.
For now, since it needs to load the firehouse, and there is some time taken e.g. on Dogfood for processing the ~500000 records returned, caching is helpful for demo purposes.
That's from memory -- I can definitely try removing the cache to see if it performs sufficiently well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It takes 30 seconds to process the firehose endpoint in connectapi. There is an issue I opened to see if that performance cost can be reduced, but for now, caching after that processing step makes the app responsive faster after loading.
Thanks for taking a look! As part of the feedback from the review session, we're gonna investigate changing the table display framework we're using. That will change how we perform date formatting and header renaming, so in a discussion with @jonkeane I decided to skip adding those for this version. |
|
As I got two comments about it, I added column names and date formatting to this version, rather than waiting till a later iteration. |
marcosnav
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for addressing previous comments
The actual issue describing the "table view" goes into more detail than just a table view — I'm guessing that's actually referring to later issues in the series (https://github.com/posit-dev/connect/issues/30286).
Dogfood: https://rsc.radixu.com/who-is-visiting-this-content/
connect.posit.it: https://connect.posit.it/who-is-visiting-this-content/
Most basic possible approach: paste in a content GUID and get a table of individual visits and a table of total activity, sorted by descending date and number of visits respectively. Shows a summary sentence of total visits.
Notes: