Skip to content

Conversation

@hendrikmuhs
Copy link

FEATURE BRANCH PR

implement support for multiple sources and aggregations

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core

@hendrikmuhs
Copy link
Author

retest this please

@hendrikmuhs hendrikmuhs force-pushed the feature/fib-multi-aggs branch from ed77c6a to cdcfba8 Compare October 17, 2018 07:14
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we are needlessly creating a leaky/incomplete abstraction. We should return a stream of objects that knows how to transform themselves into an XContentObject instead of creating a HashMap and then requiring the caller to know how to do the transformation (even thought the caller is us). This will afford us a nice separation of duties and an OK abstraction, though definitely not perfect as aggregations are so flexible.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate on what is leaky/incomplete?

Using a map is intentional, will add some reasoning as documentation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that all of this building could be self contained. As it stands, we have multiple functions passing around not well defined objects and just happen to know the inner workings of it. Maybe that is just how aggregations are, but seems to me a very easy way to introduce latent bugs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To answer the questions:

We have no schema for the output, it could be anything, having that said, this is an internal helper class, not exposed to the outside world. Before it goes out XContent::map ensures it is proper.

Regarding using maps as intermediate format: I checked for a more high-level construct and it turns out, there isn't one. The concept of a document in ingest is Map<String, Object>, so I choose the same here. Eventually we want to support running post-pivot ingest transformations. So this allows us to easily hook in ingest. (LBNL the map makes testing easier, but that isn't the primary reason.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above, I think whatever object we return should know how to take a builder and add itself appropriately.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this class is only ever going to contain static utility methods it's usual to make it final and add a private no-op constructor.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The declaration and initial assignment could all be one statement.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this ever did happen it's going to be frustrating as it doesn't say what the unsupported aggregation was. I think the message should include the type of the aggregation. Also, you could assert we don't get here. (The CI runs with assertions enabled but production runs with assertions disabled.)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Arrays.asList with 1 element shows up as a warning in IntelliJ. You can avoid this by using Collections.singletonList for 1 element lists.

@hendrikmuhs hendrikmuhs force-pushed the feature/fib-multi-aggs branch from 4af0d68 to f331658 Compare October 18, 2018 08:30
Copy link

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hendrikmuhs hendrikmuhs merged commit ad8cc92 into elastic:feature/fib Oct 19, 2018
@hendrikmuhs
Copy link
Author

Merging this to get some progress, I expect this part to be revisited, not the final version.

Note on CI: Broken as the new module is unknown and causes some tests to fail, so no green CI at the moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>feature :ml Machine learning

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants