@original-brownbear
Contributor

This is motivated by a number of recent SDHs that had these transport actions
queue up on the management pool. These were not the reason for the blockage on
the management queue, but they are often sent at a high rate by Beats in the same
scenarios that see a high rate of stats requests from Beats.
Moving them off of the management pool at least makes sure that we don't get Beats
retrying them over and over on slowness and generally saves some resources by
avoiding ctx switches and having these requests live for longer than necessary.

There's no point in running this on the management pool. It should have
already been fast enough for SAME with the exception of reading the public key
from disk maybe. Made it so the public key is just a constant and doesn't have
to be read+deserialized over and over and also cached the verified property for
a `License` instance so it should never have to be computed in practice anyway.
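
For readers outside the codebase, the two optimizations described above (deserialize the key once into a constant, and cache the verification result per instance) can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `LicenseKeySketch`, `SignedLicense`, and the key pair generated at startup are all hypothetical stand-ins, and the real change reads the key bytes that ship with the build.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;
import java.security.spec.X509EncodedKeySpec;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names come from the Elasticsearch source.
final class LicenseKeySketch {
    // Demo-only key pair so the sketch runs; a real build would ship only public key bytes.
    private static final KeyPair DEMO_PAIR = generateDemoPair();

    // Deserialized once during class initialization, not on every verification.
    static final PublicKey PUBLIC_KEY;
    static {
        try {
            byte[] encoded = DEMO_PAIR.getPublic().getEncoded(); // stand-in for the bundled key bytes
            PUBLIC_KEY = KeyFactory.getInstance("RSA").generatePublic(new X509EncodedKeySpec(encoded));
        } catch (GeneralSecurityException e) {
            throw new AssertionError("bundled key must deserialize correctly", e);
        }
    }

    private static KeyPair generateDemoPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static final class SignedLicense {
        private final byte[] payload;
        private final byte[] signature;
        // Counts how often the expensive signature check actually ran (for demonstration only).
        final AtomicInteger signatureChecks = new AtomicInteger();
        // null until the first check; a racing thread may at worst repeat the check once.
        private volatile Boolean verified;

        SignedLicense(byte[] payload, byte[] signature) {
            this.payload = payload.clone();
            this.signature = signature.clone();
        }

        // Demo helper that signs a payload with the demo private key.
        static SignedLicense sign(byte[] payload) {
            try {
                Signature signer = Signature.getInstance("SHA256withRSA");
                signer.initSign(DEMO_PAIR.getPrivate());
                signer.update(payload);
                return new SignedLicense(payload, signer.sign());
            } catch (GeneralSecurityException e) {
                throw new IllegalStateException(e);
            }
        }

        // Returns the cached result after the first call.
        boolean isVerified() {
            Boolean cached = verified;
            if (cached == null) {
                cached = checkSignature();
                verified = cached;
            }
            return cached;
        }

        private boolean checkSignature() {
            signatureChecks.incrementAndGet();
            try {
                Signature verifier = Signature.getInstance("SHA256withRSA");
                verifier.initVerify(PUBLIC_KEY);
                verifier.update(payload);
                return verifier.verify(signature);
            } catch (GeneralSecurityException e) {
                return false;
            }
        }
    }

    public static void main(String[] args) {
        SignedLicense license = SignedLicense.sign("demo license".getBytes(StandardCharsets.UTF_8));
        System.out.println(license.isVerified());
        System.out.println(license.isVerified());
        System.out.println("signature checks: " + license.signatureChecks.get());
    }
}
```

With both pieces in place, a repeated `isVerified()` call is a single volatile read, which is the kind of work that is cheap enough to run on the calling thread rather than on a dedicated pool.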

Relates to #77466 to some degree, since large cluster state => slow stats tasks on MANAGEMENT => these tasks queue up.

@original-brownbear added the >non-issue, :Security/License, v8.0.0, and v8.1.0 labels on Nov 24, 2021
@elasticmachine added the Team:Security label on Nov 24, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-security (Team:Security)

Member

@ywangd left a comment

LGTM

Loading the public key into a static variable means that the node will fail to launch if anything goes wrong in that process. This might be a behaviour change in some cases. But I don't think it's an issue, since it should not happen at all under normal conditions. If the public key does get corrupted, it is probably better to fail earlier and louder. It also won't "hot-reload" the public key anymore. But I don't think we intended to support that anyway.
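
As a rough illustration of the fail-early behaviour described above, the sketch below shows what the JVM does when a static initializer throws. It demonstrates general Java class-initialization semantics only; `FailFastSketch` and `BrokenKeyHolder` are hypothetical names, and how a node actually reacts at startup depends on where the class is first touched.

```java
// Hypothetical sketch of JVM class-initialization failure; not Elasticsearch code.
final class FailFastSketch {
    // Holder whose static constant can never be initialized.
    static final class BrokenKeyHolder {
        static final byte[] KEY = load();

        private static byte[] load() {
            throw new AssertionError("key file is part of the source and must deserialize correctly");
        }
    }

    // Touches the constant and reports which throwable type surfaced, if any.
    static String touch() {
        try {
            return "loaded " + BrokenKeyHolder.KEY.length + " bytes";
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // The first access runs the static initializer, which throws.
        System.out.println(touch());
        // Later accesses find the class in an erroneous state and are never retried.
        System.out.println(touch());
    }
}
```

An `Error` thrown from a static initializer propagates unwrapped on first use, and every later use of the class fails with `NoClassDefFoundError`. That is why there is no second chance to load the key without restarting the process.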

throw new IllegalStateException(ex);
PUBLIC_KEY = CryptUtils.readPublicKey(out.toByteArray());
} catch (IOException e) {
throw new AssertionError("key file is part of the source and must deserialize correctly", e);
Member

Nit: The error message is rather cryptic from an end user's perspective and is not really actionable. Should we change it to something along the lines of: "The public key file for license verification seems to be corrupted. Please ensure the integrity of the installation files."?

Contributor Author

The thing is, this isn't even an outright file in the packaged release. The user can't run into this unless they compiled their own version (checksums and general consistency checking on the jar/zip make it impossible to selectively corrupt this) => I'd just stick with this message, which explains what's going on here for "us" when we read the code. The user won't ever see it.

@original-brownbear
Contributor Author

Thanks Yang!

@original-brownbear merged commit f3b5299 into elastic:master on Nov 29, 2021
@original-brownbear deleted the license-management-pool branch on November 29, 2021 08:38
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Nov 29, 2021
@elasticsearchmachine
Collaborator

💚 Backport successful

Branch: 8.0

original-brownbear added a commit that referenced this pull request Nov 29, 2021