Skip to content

Conversation

@pejovica
Copy link

@pejovica pejovica commented May 30, 2023

The current code for cgroup support in the JDK has large and expensive dependencies: it uses NIO, streams, and regular expressions. This leads to unnecessary class loading and slows down startup, especially when the code is executed early during an application startup. This is especially a problem for GraalVM, which executes this code during VM startup.

This PR reduces the dependencies:

  • NIO is replaced with regular java.io for file access.
  • Streams are replaced with loops (a side effect of this is that files are read in full whereas previously they could be read up to a certain point, e.g., until a match is found).
  • Regular expressions are replaced with manual tokenization (and for usages of String.split, the "regex" is changed to single characters for which String.split has a fast-path implementation that avoids the regular expression engine).

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8309191: Reduce JDK dependencies of cgroup support (Bug - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/14216/head:pull/14216
$ git checkout pull/14216

Update a local copy of the PR:
$ git checkout pull/14216
$ git pull https://git.openjdk.org/jdk.git pull/14216/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 14216

View PR using the GUI difftool:
$ git pr show -t 14216

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/14216.diff

Webrev

Link to Webrev Comment

@bridgekeeper bridgekeeper bot added the oca Needs verification of OCA signatory status label May 30, 2023
@bridgekeeper
Copy link

bridgekeeper bot commented May 30, 2023

Hi @pejovica, welcome to this OpenJDK project and thanks for contributing!

We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing /signed in a comment in this pull request.

If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user pejovica" as summary for the issue.

If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing /covered in a comment in this pull request.

@openjdk
Copy link

openjdk bot commented May 30, 2023

@pejovica The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@jerboaa
Copy link
Contributor

jerboaa commented May 30, 2023

@pejovica Please enable GHA testing on your fork. Once enabled, please merge latest master into your branch so as to trigger a GHA run. Thanks!

@pejovica
Copy link
Author

/covered

@bridgekeeper bridgekeeper bot added the oca-verify Needs verification of OCA signatory status label May 30, 2023
@bridgekeeper
Copy link

bridgekeeper bot commented May 30, 2023

Thank you! Please allow for a few business days to verify that your employer has signed the OCA. Also, please note that pull requests that are pending an OCA check will not usually be evaluated, so your patience is appreciated!

@pejovica
Copy link
Author

@jerboaa I enabled GḪA testing. Other than a couple of Windows errors (which seem unrelated), everything else seems to be fine.

@jerboaa
Copy link
Contributor

jerboaa commented May 31, 2023

@pejovica pejovica changed the title Reduce JDK dependencies of cgroup support 8309191: Reduce JDK dependencies of cgroup support May 31, 2023
@pejovica
Copy link
Author

@pejovica There are some jcheck failures. See: https://github.com/openjdk/jdk/pull/14216/checks?check_run_id=13870116111

@jerboaa One failure is due to a lack of reviewers, so would you be able to do a review? As for the rest, I've added an issue reference, so that's fixed, and I guess I'll have to wait for OCA verification.

@jerboaa
Copy link
Contributor

jerboaa commented May 31, 2023

I guess I'll have to wait for OCA verification.

Yes.

One failure is due to a lack of reviewers, so would you be able to do a review?

Yes, I'll try to do a review later today or tomorrow.

Thanks!

@pejovica
Copy link
Author

Yes, I'll try to do a review later today or tomorrow.

Awesome, thanks!

@bridgekeeper bridgekeeper bot removed oca Needs verification of OCA signatory status oca-verify Needs verification of OCA signatory status labels May 31, 2023
@openjdk openjdk bot added the rfr Pull request is ready for review label May 31, 2023
@mlbridge
Copy link

mlbridge bot commented May 31, 2023

Webrevs

@dholmes-ora
Copy link
Member

@pejovica what testing have you done in relation to these changes? We run our container tests in tier5 - have you tested that? Thanks.

@AlanBateman
Copy link
Contributor

This seems a real backward step. I think some finer grain analysis is needed to work through specific issues, e.g. maybe startup with the regex usage and report back on how much that helps.

Copy link
Contributor

@jerboaa jerboaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned about the hard-coding of delimiter values and the added accidential complexity in order to avoid the Regex engine. Note that this test fails due to the delimiter hard-coding:

jdk/internal/platform/cgroup/TestCgroupSubsystemFactory.java

This change seems hard to maintain. How would you ensure this won't regress?

*/
static CgroupInfo fromCgroupsLine(String line) {
String[] tokens = line.split("\\s+");
String[] tokens = line.split("\t");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change, we now hard-code the expected delimiter and, thus, depend on what the kernel does. Do we have sufficient evidence this hasn't changed/won't change in the future?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, the delimiter hasn't changed since the file was introduced, and judging by the kernel mailing list (e.g., see the following), I don't think it will change any time soon.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced this is a good change. Going by that mailing list thread, it suggests that people considered changing it. If one of those attempts were successful, it would have broken this code. It makes it rather fragile. The issue, with container detection code going wrong is that you most likely never notice. Translating this to GraalVM means that the native image, would a) detect the wrong version in use or b) fail detection and use host values. In both cases the application will likely misbehave in a container setup with resource limits applied and you won't (easily) know why. So even though it's unlikely to be a problem, there is a chance it could be and it's asking for trouble for no good reason.

Therefore, being conservative about delimiters makes sense here. My choice in this case would be more robust code over relying on external factors. YMMV.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so how about something like the following, would that be more acceptable?

Suggested change
String[] tokens = line.split("\t");
String[] tokens = Collections.list(new StringTokenizer(line, " \t")).toArray(new String[0]);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StringTokenizer() is a legacy class and is discouraged in new code. So not ideal either.

Comment on lines +346 to +347
// loop over space-separated tokens
for (int tOrdinal = 1, tStart = 0, tEnd = line.indexOf(' '); tEnd != -1; tOrdinal++, tStart = tEnd + 1, tEnd = line.indexOf(' ', tStart)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, this now also hard-codes the delimiter: A single space. If we really want this custom parser, please add a unit test for it and extract it to a separate class.

A hypothetical line like the following would confuse the parser, setting ordinal to wrong values:

36   35 98:0 /mnt1 /mnt2 rw,noatime master:1 - cgroup /dev/root rw,errors=continue

We'd have: mountRoot = 98:0, mountPath = /mnt1, fsType = cgroup since it expects single space separated values. Seems fragile.

CgroupSubsystem.LONG_RETVAL_UNLIMITED);
}

public static long convertHierachicalLimitLine(String line) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-existing: typo convertHierarchicalLimitLine


public static long convertHierachicalLimitLine(String line) {
String[] tokens = line.split("\\s");
String[] tokens = line.split(" ");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, assumes single space ( ) delimited entries in memory.stat. I'm not sure we should hard-code this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This delimiter also hasn't changed since the memory.stat file was introduced, and since cgroup v1 is in maintenance mode I'd expect it to stay that way.

}
// $MAX $PERIOD
String[] tokens = cpuMaxRaw.split("\\s+");
String[] tokens = cpuMaxRaw.split(" ");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return Long.valueOf(0);
}
String[] tokens = line.split("\\s+");
String[] tokens = line.split(" ");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +94 to +97
private static void warn(String msg) {
Logger logger = System.getLogger("jdk.internal.platform");
logger.log(Level.DEBUG, msg);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fine and uncontroversial. Suggested name change log over warn. Perhaps apply this as a separate change?

@jerboaa
Copy link
Contributor

jerboaa commented Jun 1, 2023

With my Mandrel hat on I support this change if it helps reducing duplication on the native-image side. It appears, though, we need to find a way that's supportable long-term. It's easy to introduce a new change to this code which accidentally drags in some of those (unwanted) dependencies again.

@AlanBateman
Copy link
Contributor

It's easy to introduce a new change to this code which accidentally drags in some of those (unwanted) dependencies again.

I do not like the changes proposed here, they are all crying out for several round cleanups and modernization. One thing that would help is to split this up into a series of changes that could be evaluated, e.g. the use of regex may be a significant part of this so maybe start with that, report back, then work through the iterations to make it clean and maintainable.

@pejovica
Copy link
Author

pejovica commented Jun 5, 2023

@pejovica what testing have you done in relation to these changes? We run our container tests in tier5 - have you tested that? Thanks.

@dholmes-ora Thanks for the pointer, I'll post back when I run them.

@christianwimmer
Copy link
Member

It's easy to introduce a new change to this code which accidentally drags in some of those (unwanted) dependencies again.

We (as the Native Image team) are OK with this. Our testing will detect that pretty quickly, and then the new code can be fixed.

@adinn
Copy link
Contributor

adinn commented Jun 20, 2023

We (as the Native Image team) are OK with this. Our testing will detect that pretty quickly, and then the new code can be fixed.

That may well be the case. However, until all the concerns raised by OpenJDK reviewers who have looked at this PR are addressed to their satisfaction it would not be appropriate to merge this patch.

n.b. That does not automatically mean the course of action the reviewers have recommended has to be followed. A resolution needs to be negotiated according to the merits and risks of the change. However, regarding that negotiation, I'll observe that the (repeated) request to break this change down in several steps appears to me to be motivated by the desire to ensure that the merits of the change are maximized (no unnecessary loss of important functionality) and the risks minimized (no unnecessary perturbation of the current implementation) -- which is not an unusual way for OpenJDK reviewers to proceed.

@bridgekeeper
Copy link

bridgekeeper bot commented Jul 18, 2023

@pejovica This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper
Copy link

bridgekeeper bot commented Aug 15, 2023

@pejovica This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this Aug 15, 2023
@dougxc
Copy link
Member

dougxc commented Aug 16, 2023

I'm concerned about the hard-coding of delimiter values and the added accidential complexity in order to avoid the Regex engine. Note that this test fails due to the delimiter hard-coding:

jdk/internal/platform/cgroup/TestCgroupSubsystemFactory.java

This change seems hard to maintain. How would you ensure this won't regress?

There's seems to be a lot of usage of sscanf in https://github.com/openjdk/jdk/blob/master/src/hotspot/os/linux/cgroupSubsystem_linux.cpp. Maybe I'm misreading that code, but doesn't it also hard code assumptions about the file format(s)?

@jerboaa
Copy link
Contributor

jerboaa commented Aug 16, 2023

Not as far as I'm aware (it can deal with tabs vs. spaces differences as well as multiple spaces).

FWIW, I've done this change as a PoC a while ago and it seems sufficient to use the JDK's metrics impl in native-image (barring some perf numbers; If somebody provides me with pointers, happy to provide those too).

@dougxc
Copy link
Member

dougxc commented Aug 16, 2023

Ok, thanks. It's obviously been too long since I used sscanf ;-)

@jerboaa
Copy link
Contributor

jerboaa commented Aug 24, 2023

#15416 alternative PR with a more limited scope.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core-libs [email protected] rfr Pull request is ready for review

Development

Successfully merging this pull request may close these issues.

7 participants