-
Notifications
You must be signed in to change notification settings - Fork 189
time intervals coerce isostrings #1326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it’s fine to do this for the built-in time intervals, but we shouldn’t do this implicit coercion if someone provides an interval implementation, including one of the D3 time interval implementations. The implicit coercion should be part of the interval implementation, and we can do that for the named time intervals we provide. (The magic of invoking the interval implementation and seeing if it returns a Date is too much, and some intervals may want to have different semantics for input coercion.)
Edit: I mean, I see that we already have “magic” detection for time thresholds in the bin transform. But I think that predates having named time intervals, and it’d probably be better if we didn’t do that, somehow.
I had the same hesitation but went the other way:
And, it's well defined in the README: “Time intervals are intervals that are also functions that return a Date instance when called with no arguments.” |
src/time.js
Outdated
@@ -41,14 +42,22 @@ const utcIntervals = new Map([ | |||
["sunday", utcSunday] | |||
]); | |||
|
|||
function isoInterval(interval) { | |||
return Object.assign((t) => interval(t), {...interval, floor: (t) => interval.floor(coerceDate(t))}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels awkward to me. Like, isTimeInterval is the way it is because that was a quick and dirty way to test for a D3 time interval. But now we have to recreate that representation—even though we don’t need it—to satisfy the isTimeInterval check. And this is only adding coercion for one of the possible methods (floor) on the time interval, not the main function, not interval.offset and interval.range, etc. I’m not sure what fix I recommend yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the other methods, offset, ceil, range?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can give them the same treatment, but in practice don't use them on user-provided data. Their input is in all cases already coerced.
(Agree it's awkward!)
6a2c5f3
to
355d869
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d still like to try to find a different way to do this. I think in many cases, like with binning, we’ve already done the coercion by the time the interval is applied, so this coercion is redundant. Would it be possible to instead do the coercion outside the interval, before the interval is applied, when the scale interval is a time interval?
Edit: For what it’s worth, the performance of the bin1m isn’t affected by this branch, at least. So performance doesn’t appear to be an issue. Let me think some more.
src/time.js
Outdated
@@ -41,14 +42,22 @@ const utcIntervals = new Map([ | |||
["sunday", utcSunday] | |||
]); | |||
|
|||
function isoInterval(interval) { | |||
return Object.assign((t) => interval(t), {...interval, floor: (t) => interval.floor(coerceDate(t))}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the other methods, offset, ceil, range?
closes #1182 (Note: I considered doing this in d3-time, but it would mean adding d3-time-format as a dependency of d3-time.) Rebase on main Co-authored-by: Mike Bostock <[email protected]>
2d64d00
to
0a57694
Compare
@@ -490,3 +490,11 @@ export function named(things) { | |||
export function maybeNamed(things) { | |||
return isIterable(things) ? named(things) : things; | |||
} | |||
|
|||
export function isTimeInterval(t) { | |||
return isInterval(t) && typeof t === "function" && t() instanceof Date; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This disagrees with how we’ve defined intervals in TypeScript, where they are objects rather than decorated functions:
Lines 19 to 22 in 820144e
export interface IntervalImplementation<T> { | |
floor(value: T): T; | |
offset(value: T, offset?: number): T; | |
} |
And this means that you can use a {floor, offset, range?} object in Plot, which works, except when we try to do this time interval test.
Also, I don’t think we actually need this restriction, because D3 doesn’t require that time intervals are functions, either. It instead checks for the range method (and then blindly invokes the floor and offset methods):
https://github.com/d3/d3-time/blob/9e8dc940f38f78d7588aad68a54a25b1f0c2d97b/src/ticks.js#L38
https://github.com/d3/d3-scale/blob/2b7db622b1a224d9ea19ff15c4cc8cbb3b25f4a4/src/time.js#L58
I added that duck testing in #572, but I think it would be better to replace that duck testing with something consistent with our new TypeScript declarations and with D3. Probably we should invoke the floor method and see if it returns a Date? But that’s a little messy too, since that method requires an argument, and it’s unclear whether we should pass it a Date or a number… All D3 time intervals support coercion (as does Plot’s internal number interval), but should we expect users to implement coercion in their own intervals? I mean, isn’t the point of this PR is that they don’t? 🤔
I’ve surveyed the code base to see where we might need coercion for intervals. The (implicit) interval transform uses intervals in maybeIntervalK and maybeIntervalMidK. Both of these would benefit from an implicitly-coercing interval. (Though it would also be possible to coerce all the values prior to invoking the interval.) plot/src/transforms/interval.js Line 31 in 820144e
plot/src/transforms/interval.js Line 50 in 820144e
A scale transform can be specified as an interval as described in #1182. This happens in applyScaleTransform when the interval.floor is promoted to a transform. This would benefit from an implicitly-coercing interval, too. Lines 365 to 367 in 820144e
The axisMark can also use the scale interval to compute ticks. This would not benefit from an implicitly-coercing interval: the scale should already have coerced values at this point, and we wouldn’t want the interval to use different coercion than the scale. Lines 520 to 521 in 820144e
Ordinal scales use intervals to compute the domain. This would not benefit from an implicitly-coercing interval: we would need to do the coercion earlier on the scale’s associated channel values, so that the extent is computed correctly and so that the scale’s imputed domain matches the channel values. Lines 100 to 101 in 820144e
The contour mark can use an interval to compute thresholds. The majority of the time the contour mark will be used with numeric data, but it is theoretically possible to compute contours of arbitrary values, including temporal data. So, this might benefit from implicit coercion. However, it again would not work to perform that coercion within the interval; we would need to coerce all the contour values, not just coerce when computing ticks. Line 191 in 820144e
The bin mark can likewise use an interval to compute thresholds. It already performs coercion and would not benefit from additional implicit coercion inside the interval. Line 232 in 820144e
To summarize the effect of implicitly-coercing intervals:
My recommendation therefore is, instead of implicitly-coercing time intervals, that we coerce values as appropriate prior to using the interval similar to what the bin transform already does. In the case of ordinal scales, we’ll also need to coerce to numbers, not just dates, so that we compute the extent correctly. |
I have revised this PR's description with the TODO. |
@@ -28,7 +29,9 @@ function maybeIntervalK(k, maybeInsetK, options, trivial) { | |||
let D1, V1; | |||
function transform(data) { | |||
if (V1 !== undefined && data === D1) return V1; // memoize | |||
return (V1 = map(valueof((D1 = data), value), (v) => interval.floor(v))); | |||
let V = valueof((D1 = data), value); | |||
if (isTimeInterval(interval)) V = map(V, coerceDate, Float64Array); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (isTimeInterval(interval)) V = map(V, coerceDate, Float64Array); | |
if (isTimeInterval(interval)) V = coerceDates(V); |
I think we want the slower coerceDates
here. This code using Float64Array is faster, but we can only use it in the bin transform because we know it’s not exposed outside the bin transform. I don’t think that’s true here; we want to pass the interval Date instances, not numbers.
Also, given that we can’t use a typed array here, it will probably be faster to fold the coerceDate
call into the interval.floor(v)
call before to avoid one additional array copy in map
.
const i = maybeInterval(interval, type); | ||
if (isTimeInterval(i)) return (channel.value = channel.value.map((d) => i.floor(coerceDate(d)))), undefined; | ||
transform = i?.floor; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be cleaner to refactor this into a helper like so:
function maybeIntervalTransform(interval, type) {
interval = maybeInterval(interval, type);
return isTimeInterval(interval) ? (d) => interval.floor(coerceDate(d)) : interval?.floor;
}
And then replace the default like so:
transform = percent ? (x) => x * 100 : maybeIntervalTransform(interval, type)
In other words, I think it’s preferable to consolidate the code paths so that the interval is still promoted to a transform. And also we’d want to use map
here instead of channel.value.map
because channel.value
could be a typed array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also do we want to coerce to numbers for number intervals? Should we further assume that intervals only apply to numbers and dates and not other types? 🤔
The discussion below ends up with the recommendation that if we want to do this, we should coerce isostrings to dates in the channels, rather than wrap the interval object.
todo/done:
closes #1182
(Note: I considered doing this in d3-time, but it would mean adding d3-time-format as a dependency of d3-time.)