Skip to content

Conversation

@dsweber2
Copy link
Contributor

@dsweber2 dsweber2 commented Oct 4, 2024

Checklist

Please:

  • Make sure this PR is against "dev", not "main".
  • Request a review from one of the current epipredict main reviewers:
    dajmcdon.
  • Make sure to bump the version number in DESCRIPTION and NEWS.md.
    Always increment the patch version number (the third number), unless you are
    making a release PR from dev to main, in which case increment the minor
    version number (the second number).
  • Describe changes made in NEWS.md, making sure breaking changes
    (backwards-incompatible changes to the documented interface) are noted.
    Collect the changes under the next release number (e.g. if you are on
    0.7.2, then write your changes under the 0.8 heading).
  • Consider pinning the epiprocess version in the DESCRIPTION file if
    • You anticipate breaking changes in epiprocess soon
    • You want to co-develop features in epipredict and epiprocess

Change explanations for reviewer

In the process of using both grf_quantile for trees and actually using step_adjust_latency in the context of the flusion dataset, I ran into a number of issues, which this addresses (actually separated by commit, so they're fairly atomic).
For step_adjust_latency:

  • epi_keys_checked had a bug when there was more than one key
  • the epi_df was incorrectly being reconstructed during the bake step because of epiprocess0.9
  • I had some key values (a different source) which stopped in 2020, but was still useful to include in the dataset. To ignore this for latency adjustment purposes, I added the keys_to_ignore parameter, which is a named list, where the name is a key column name from epi_keys_checked, and the values are the key values to ignore. In my use case, this is list(source = c("flusurv"))

For grf_quantile, I just ran into quantiles that were out of order, filed as #407. For now I'm just dropping the order requirement, and sorting the values. We may want to investigate why it's producing out of order coefficients, or add a check to see if it's actully meaningfully out of order

@dsweber2 dsweber2 requested a review from dajmcdon as a code owner October 4, 2024 20:12
Copy link
Contributor

@dajmcdon dajmcdon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an issue with the quantile sorting. I think we should track it down and make this change elsewhere, rather than in the quantile object constructor. I suspect an upstream bug.

Comment on lines 25 to 29
# if (is.unsorted(values, na.rm = TRUE)) {
# cli_abort("`values[order(quantile_levels)]` produces unsorted quantiles.")
# }

new_rcrd(list(values = values, quantile_levels = quantile_levels),
new_rcrd(list(values = values[order(quantile_levels)], quantile_levels = quantile_levels),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer not to do this in the constructor. And I'm concerned that there's something else happening. Does grf do something unexpected to the quantile_levels?

The predictions are already sorted:

out <- lapply(vctrs::vec_chop(x), function(x) sort(drop(x)))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#407, there's something about it being not quite right about how it's sorting very small values. I don't really have time to track down why it's wrong just yet, this is mostly just a patch to be able to move forward with forecasting using grf.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, also where would be a better place to do the reordering? the same function but not inline (say on line 27?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the bug is specific to grf_quantiles(), so it should be done there instead of in new_dist_quantiles().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lapply(vctrs::vec_chop(x), function(x) sort(drop(x)))
I'm actually quite confused what this is trying to accomplish that sort(x) wouldn't do. It has the wrong behavior on the problematic example

x <- c(-8.609792e-02, -6.965615e-02, -5.696382e-02, -4.377567e-02, -3.561048e-02,
 -2.770179e-02, -2.030610e-02, -1.367990e-02, -8.079690e-03, -3.761972e-03,
 -9.832426e-04,  3.469447e-17,  0.000000e+00,  0.000000e+00,  0.000000e+00,
  0.000000e+00,  0.000000e+00,  0.000000e+00,  0.000000e+00,  0.000000e+00,
  0.000000e+00,  0.000000e+00,  0.000000e+00)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you suspect a bug in process_qrf_preds()? Can you provide a reproducible example?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's what that code does:

set.seed(12345)
x <- matrix(rnorm(20), 5, 4) # 5 x 4 matrix, unsorted
lapply(vctrs::vec_chop(x), function(x) sort(drop(x))) 
#> [[1]]
#> [1] -1.8179560 -0.1162478  0.5855288  0.8168998
#> 
#> [[2]]
#> [1] -0.8863575  0.6300986  0.7094660  1.8173120
#> 
#> [[3]]
#> [1] -0.3315776 -0.2761841 -0.1093033  0.3706279
#> 
#> [[4]]
#> [1] -0.4534972 -0.2841597  0.5202165  1.1207127
#> 
#> [[5]]
#> [1] -0.9193220 -0.7505320  0.2987237  0.6058875

Now, each row (list element) is sorted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably; since it's outputting this vector,

x <- c(-8.609792e-02, -6.965615e-02, -5.696382e-02, -4.377567e-02, -3.561048e-02,
 -2.770179e-02, -2.030610e-02, -1.367990e-02, -8.079690e-03, -3.761972e-03,
 -9.832426e-04,  3.469447e-17,  0.000000e+00,  0.000000e+00,  0.000000e+00,
  0.000000e+00,  0.000000e+00,  0.000000e+00,  0.000000e+00,  0.000000e+00,
  0.000000e+00,  0.000000e+00,  0.000000e+00)

having this as a row should probably reproduce the error.

@dsweber2 dsweber2 merged commit a415100 into dev Oct 9, 2024
2 checks passed
@dshemetov dshemetov deleted the prodFixes branch October 10, 2024 18:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants