Skip to content

Conversation

@martijnvg
Copy link
Member

@martijnvg martijnvg commented Nov 1, 2016

This adds a better extension point for custom cluster state parts by only allowing adding custom cluster state parts via the ClusterPlugin interface.

This is a WIP and would be great to get feedback.

  • By piggybacking on the named writeable infrastructure we can serialize custom cluster state parts for the internal protocol. This seems to workout nice as the serialization code doesn't need to know about custom cluster metadata implementations.
  • For the xcontent serialization there is no such thing as named writaebles, so I passed down a registry holding prototypes for the custom cluster state parts on all the places we need to do de-serialization. We can maybe consider adding something like named writable on top of XContentParser? However passing down the registry doesn't seem to be too bad.
  • Should custom index metadata be removed? It isn't used in any place. I think it should, but in a follow up change.
  • This change needs tests.

PR for #20888

@martijnvg martijnvg added WIP :Core/Infra/Plugins Plugin API and infrastructure labels Nov 1, 2016
@martijnvg martijnvg force-pushed the custom_cluster_metadata_plugin branch 4 times, most recently from fab869c to aaae041 Compare November 3, 2016 09:59
@martijnvg martijnvg force-pushed the custom_cluster_metadata_plugin branch from aaae041 to 5e70b64 Compare November 7, 2016 12:55
Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a quick scan. I left a few "why" style things but I think this is the right approach here.

I like the idea of making a NamedWriteableRegistry-like thing for xcontent but it isn't clear to me that we know how to get it to apply well across the board. I don't want us to make some nice general abstraction and only use it for cluster state. If we made such a thing it'd be an empty victory if we didn't move queries and aggs to it.

So I think the right thing to do is to get this in and then use the lessons learned to open a followup for the more generalized abstraction. A migration of that size deserves to be done in many PRs with more time for review. Something like this we want sooner I think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks properly BWC to me. ++

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this should be a parameter to the source method. That way it isn't so hidden. It can be @Nullable and we can have javadoc explaining what it is for and when it is needed. I'm not sure if this source method is part of the public Java API or not because we really don't mark things so both everything is and nothing is. Anyway, I think it'd be fine to add a nullable parameter there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the reason you are considering a "named writeable to xcontent" so we don't have to have these registry objects? I think that is a fine idea! I think some things won't work as they stand now but could be fixed. See things like QueryParseContext.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A fine idea that we can work on later, just to clear.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this should be a parameter to the source method.

Initially I didn't want change any method signatures, but it isn't a big deal after all as like you said just null can be provided. To make it even better we can add an overloaded version of the method that just accepts source.

Is the reason you are considering a "named writeable to xcontent" so we don't have to have these registry objects?

Yes, here and also in other places too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should throw an exception if the proto doesn't resolve to anything.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You made a Safe method for this already I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, we should do that here. I'll use the Safe variants for that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comments as class above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add javadoc?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the author of the parameter list meant to do one per line.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd remove this note. We expect that only custom cluster state named writeables are used here but we support any named writeables. While we don't use this now we don't actively prevent using it and a plugin might use it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you meant to remove the note but didn't? I'm not sure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs javadoc eventually.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/cRegister/pRegistry/`?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

@nik9000
Copy link
Member

nik9000 commented Nov 7, 2016

And, yes, this looks like it maintains wire format and cluster state backwards compatibility to me. I wonder if we should add more stuff to qa:rolling-upgrade or some other test.

@martijnvg
Copy link
Member Author

I wonder if we should add more stuff to qa:rolling-upgrade or some other test.

I'm not sure if this requires its own qa module or can be included in the qa:rolling-upgrade module. It would require a custom plugin and that would make the existing qa module more difficult as it tries to test other things too. I guess the best way to create rolling upgrade test for custom cluster state parts is to build a 5.0.x data dir and a 5.1.x plugin that is able to read it then the test can after upgrading check some assertions.

@nik9000
Copy link
Member

nik9000 commented Nov 7, 2016

I guess the best way to create rolling upgrade test for custom cluster state parts is to build a 5.0.x data dir and a 5.1.x plugin that is able to read it then the test can after upgrading check some assertions.

OldIndexBackwardsCompatibilityIT does the whole "old index version" thing. Maybe that is a good thing to build on?

@martijnvg
Copy link
Member Author

OldIndexBackwardsCompatibilityIT does the whole "old index version" thing. Maybe that is a good thing to build on?

I think that using OldIndexBackwardsCompatibilityIT would work.

@martijnvg martijnvg force-pushed the custom_cluster_metadata_plugin branch from 5e70b64 to 96ad8b8 Compare November 9, 2016 11:49
@martijnvg martijnvg added review and removed WIP labels Nov 9, 2016
@martijnvg
Copy link
Member Author

@nik9000 I've updated the PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead of supporting a null registry we can provide an empty one instead? That feels like it'd be cleaner.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of this null support either. I'd prefer sending an empty registry over null.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you meant to remove the note but didn't? I'm not sure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you fix this javadoc? Those @param annotations are fairly useless.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some javadoc to explain what this does and some docs to explain why we need @ESIntegTestCase.ClusterScope and @LuceneTestCase.SuppressFileSystems?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does the file come from? I think we need to commit the script that generates it so we can make it when we release the next version, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how to automate this in a script, since it involves as plugin that specifies custom index metadata.

This file contains the code that creates the data directory and contains the plugin written on the 5.0.0 codebase: https://gist.github.com/martijnvg/ff48df064ad60ee983ec4fd7c7471a4c

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just chatted with Nik and I'll remove the custom plugin in test that is used the test custom index metadata. Then the test is simpler and it can be inlined in OldIndexBackwardsCompatibilityIT class and allows scripting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This'll fail the build.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ouch, silly IDE

@nik9000
Copy link
Member

nik9000 commented Nov 9, 2016

I think we're getting a longer and longer list of things that we need to run on upgrade. Maybe we need to start a running list somewhere....

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a test starting a node that doesn't have the plugin installed? The node should start, I think. Though I don't know what it should do with the data that it no longer understands. I think we have a setting like required_plugins to guard against accidental removal of a plugin so we can assume that any removal of the plugin is intentional. I think....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll add that test.

I agree that the required_plugins setting is an extra guard, but if that setting is forgotten there is nothing stopping the node from starting up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah! I think it is right to start the node though, right? You have to be able to uninstall plugins if you want, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true, but that wasn't your intention, you can lose the custom part in the cluster state.

…by only allowing adding custom cluster, metadata and index metadata via the ClusterPlugin interface.

* By piggybacking on the named writeable infrastructure when can serialize custom cluster metadata for the internal protocol. This seems to workout nice as the serialization code doesn't need to know about custom cluster metadata implementations.
* For the xcontent serialization there is no such thing as named writaebles, so I passed down a registry holding prototypes for custom metadate on all the places we need to de-serialize custom metadata. We can maybe consider adding something like named writable on top of XContentParser? However passing down the registry doesn't seem to be too bad.
* Custom index metadata doesn't work yet. I think we can remove it as it isn't used in any place. Not sure what others think about that, but that can be done in another PR.
* This change needs tests.
@martijnvg martijnvg force-pushed the custom_cluster_metadata_plugin branch from 96ad8b8 to e7c31be Compare November 10, 2016 11:18
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is existing behavior but it's bad... forgetting to install a plugin immediately implies the deletion of it's state (and solving it will be tricky). Can we add this warn log and open an issue to deal with it? (maybe the plugin manager to should remove custom meta when uninstalling. not sure)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened #21471

DiscoverySettings discoverySettings,
ClusterName clusterName) {
ClusterName clusterName,
CustomPrototypeRegistry registry) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should pass these registries all over the place? maybe it's better to give a cluster diff and cluster state deserializers ? It doesn't feel like PublishClusterStateAction should care about the registry.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason I needed to provide CustomPrototypeRegistry here is because in the handleIncomingClusterStateRequest(...) method (line 380-385) it creates a new StreamInput instance and have to wrap that with a NamedWriteableAwareStreamInput instance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be missing something - but do we need both the NamedWriteable in the stream and the explicit registry when reading a diff? Diff<ClusterState> diff = lastSeenClusterState.readDiffFrom(in, registry);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but do we need both the NamedWriteable in the stream

Yes, otherwise nodes that receive a new cluster state don't know how the read the custom cluster state parts. (in the synchronized block line 388 till 407)

and the explicit registry when reading a diff?

Ideally we wouldn't need to pass the registry to the diff, but with the way it reads the custom diff map, I didn't see another of doing that without breaking bwc. (see here, here and here)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martijnvg and I discussed this and Martijn is work on centralizing this code so all deserialization logic will be in one place. He's also looking into how it should be done if bwc was not a concern so we know how this should look going forward

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bleskes Like we discussed last friday I tried to stop using named writeables for cluster state related stuff and use the registry directly (so that all code will follow the same deserialization logic), but I ran into an issue that I don't know how to solve without creating a big mess. There are a couple of places where we serialize cluster related stuff in the action framework:

  • ClusterRerouteResponse#readFrom(...)
  • ClusterStateResponse#readFrom(...)
  • CreateIndexRequest#readFrom(...)
  • PutIndexTemplateRequest#readFrom(...)

In these readFrom() methods I don't have access to the registry. I don't see a way of passing down the registry without changing the Streamable interface.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bleskes now that it seems that #21552 doesn't need this change, maybe we should work on adding named writable support to XContentParser and DiffableValueSerializer, so that this gets in a clean and consistent way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

// sometimes the request comes in before the local node processed that cluster state
// in such cases we can load it from disk
metaData = IndexMetaData.FORMAT.loadLatestState(logger, nodeEnv.indexPaths(shardId.getIndex()));
metaData = metaStateService.getIndexMetadatFormat().loadLatestState(logger, nodeEnv.indexPaths(shardId.getIndex()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just make this a public method on metaStateService? (i.e., loadIndexLatestState) it already has nodeEnv and everything it needs. And now that I check it - it already has one :D org.elasticsearch.gateway.MetaStateService#loadIndexState

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 good point :)

* @return Custom cluster state parts used to store temporal information in the cluster state that doesn't need to
* be persisted
*/
default Collection<ClusterState.Custom> getCustomClusterState() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nite: this needs a better names - getCustomParts? or getClusterStateCostums?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for getClusterStateCostums()

…x metadata in create index request from `getIndexMetadataPrototypeSafe()` back to `getIndexMetadataPrototype()`

as there is some logic in the request that demotes custom index metadata to index settings.
@martijnvg martijnvg removed the v5.1.1 label Nov 21, 2016
@martijnvg
Copy link
Member Author

Superseded by #22336

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Core/Infra/Plugins Plugin API and infrastructure >enhancement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants