8309191: Reduce JDK dependencies of cgroup support #14216

pejovica · 2023-05-30T13:03:27Z

The current code for cgroup support in the JDK has large and expensive dependencies: it uses NIO, streams, and regular expressions. This leads to unnecessary class loading and slows down startup, especially when the code is executed early during application startup. This is a particular problem for GraalVM, which executes this code during VM startup.

This PR reduces the dependencies:

NIO is replaced with regular java.io for file access.
Streams are replaced with loops (a side effect of this is that files are read in full whereas previously they could be read up to a certain point, e.g., until a match is found).
Regular expressions are replaced with manual tokenization (and for usages of String.split, the "regex" is changed to single characters for which String.split has a fast-path implementation that avoids the regular expression engine).
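
The three replacements listed above can be sketched roughly as follows. This is a minimal illustration and not code from the PR: the class `PlainIoSketch` and its methods are hypothetical names, and the sample assumes the tab-separated layout of /proc/cgroups (controller name in the first column). It reads lines through java.io with a plain loop and splits on a single non-metacharacter delimiter, the case for which `String.split` has the fast path mentioned above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not taken from the PR.
final class PlainIoSketch {

    // Reads every line with a plain loop instead of a stream pipeline.
    // Note that the whole input is consumed, as the PR description points out.
    static List<String> readAllLines(Reader source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Returns the tab-separated fields of the first line whose first field
    // equals 'name', or null if there is no such line.
    static String[] findController(List<String> lines, String name) {
        for (String line : lines) {
            String[] tokens = line.split("\t"); // single-character delimiter
            if (tokens.length > 0 && tokens[0].equals(name)) {
                return tokens;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumed sample content in the style of /proc/cgroups.
        String sample = "#subsys_name\thierarchy\tnum_cgroups\tenabled\n"
                + "cpu\t2\t1\t1\n"
                + "memory\t3\t5\t1\n";
        List<String> lines = readAllLines(new StringReader(sample));
        System.out.println(String.join(",", findController(lines, "memory")));
    }
}
```

For a real file, a `java.io.FileReader` could be passed in place of the `StringReader`; the trade-off discussed in the review below is that the hard-coded tab only works as long as the kernel keeps emitting exactly that delimiter.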

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8309191: Reduce JDK dependencies of cgroup support (Bug - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/14216/head:pull/14216
$ git checkout pull/14216

Update a local copy of the PR:
$ git checkout pull/14216
$ git pull https://git.openjdk.org/jdk.git pull/14216/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 14216

View PR using the GUI difftool:
$ git pr show -t 14216

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/14216.diff

Webrev

Link to Webrev Comment

bridgekeeper · 2023-05-30T13:04:49Z

Hi @pejovica, welcome to this OpenJDK project and thanks for contributing!

We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing /signed in a comment in this pull request.

If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user pejovica" as summary for the issue.

If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing /covered in a comment in this pull request.

openjdk · 2023-05-30T13:07:31Z

@pejovica The following label will be automatically applied to this pull request:

core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

jerboaa · 2023-05-30T13:48:49Z

@pejovica Please enable GHA testing on your fork. Once enabled, please merge latest master into your branch so as to trigger a GHA run. Thanks!

pejovica · 2023-05-30T13:57:57Z

/covered

bridgekeeper · 2023-05-30T13:59:32Z

Thank you! Please allow for a few business days to verify that your employer has signed the OCA. Also, please note that pull requests that are pending an OCA check will not usually be evaluated, so your patience is appreciated!

pejovica · 2023-05-31T09:38:08Z

@jerboaa I enabled GHA testing. Other than a couple of Windows errors (which seem unrelated), everything else seems to be fine.

jerboaa · 2023-05-31T10:02:34Z

@pejovica There are some jcheck failures. See:
https://github.com/openjdk/jdk/pull/14216/checks?check_run_id=13870116111

pejovica · 2023-05-31T12:21:13Z

@pejovica There are some jcheck failures. See: https://github.com/openjdk/jdk/pull/14216/checks?check_run_id=13870116111

@jerboaa One failure is due to a lack of reviewers, so would you be able to do a review? As for the rest, I've added an issue reference, so that's fixed, and I guess I'll have to wait for OCA verification.

jerboaa · 2023-05-31T12:54:57Z

I guess I'll have to wait for OCA verification.

Yes.

One failure is due to a lack of reviewers, so would you be able to do a review?

Yes, I'll try to do a review later today or tomorrow.

Thanks!

pejovica · 2023-05-31T13:49:46Z

Yes, I'll try to do a review later today or tomorrow.

Awesome, thanks!

mlbridge · 2023-05-31T22:02:30Z

Webrevs

00: Full (4fc3af29)

dholmes-ora · 2023-06-01T03:14:40Z

@pejovica what testing have you done in relation to these changes? We run our container tests in tier5 - have you tested that? Thanks.

AlanBateman · 2023-06-01T06:15:52Z

This seems a real backward step. I think some finer-grained analysis is needed to work through specific issues, e.g. maybe start with the regex usage and report back on how much that helps.

jerboaa

I'm concerned about the hard-coding of delimiter values and the added accidental complexity in order to avoid the Regex engine. Note that this test fails due to the delimiter hard-coding:

jdk/internal/platform/cgroup/TestCgroupSubsystemFactory.java

This change seems hard to maintain. How would you ensure this won't regress?

jerboaa · 2023-06-01T08:50:33Z

src/java.base/linux/classes/jdk/internal/platform/CgroupInfo.java

     */
    static CgroupInfo fromCgroupsLine(String line) {
-        String[] tokens = line.split("\\s+");
+        String[] tokens = line.split("\t");


With this change, we now hard-code the expected delimiter and, thus, depend on what the kernel does. Do we have sufficient evidence this hasn't changed/won't change in the future?

As far as I can tell, the delimiter hasn't changed since the file was introduced, and judging by the kernel mailing list (e.g., see the following), I don't think it will change any time soon.

I'm not convinced this is a good change. That mailing list thread suggests that people have considered changing it. If one of those attempts had been successful, it would have broken this code, which makes it rather fragile. The issue with container detection code going wrong is that you most likely never notice. Translated to GraalVM, this means the native image would either a) detect the wrong version in use or b) fail detection and use host values. In both cases the application will likely misbehave in a container setup with resource limits applied, and you won't (easily) know why. So even though it's unlikely to be a problem, there is a chance it could be, and it's asking for trouble for no good reason.

Therefore, being conservative about delimiters makes sense here. My choice in this case would be more robust code over relying on external factors. YMMV.

Okay, so how about something like the following, would that be more acceptable?

Suggested change

String[] tokens = line.split("\t");

String[] tokens = Collections.list(new StringTokenizer(line, " \t")).toArray(new String[0]);

StringTokenizer() is a legacy class and is discouraged in new code. So not ideal either.
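
One way to reconcile the two concerns in this thread (no regex engine, but no hard-coded single delimiter either) might be a small hand-written tokenizer. The following is only a sketch under that assumption; `WhitespaceTokenizerSketch` and `tokenize` are hypothetical names and were not proposed in the PR. It treats any run of spaces and tabs as one separator, so it uses neither `StringTokenizer` nor `java.util.regex`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical regex-free tokenizer; illustration only.
final class WhitespaceTokenizerSketch {

    // Splits 'line' on runs of spaces and/or tabs. Leading and trailing
    // delimiters produce no empty tokens.
    static String[] tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        int start = -1; // start index of the current token, -1 between tokens
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            boolean delimiter = (c == ' ' || c == '\t');
            if (delimiter) {
                if (start >= 0) {
                    tokens.add(line.substring(start, i));
                    start = -1;
                }
            } else if (start < 0) {
                start = i;
            }
        }
        if (start >= 0) {
            tokens.add(line.substring(start)); // last token runs to end of line
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Tab-separated and irregularly spaced input yield the same tokens.
        System.out.println(String.join("|", tokenize("memory\t3\t5\t1")));
        System.out.println(String.join("|", tokenize("memory   3 \t5  1")));
    }
}
```

This is not a drop-in replacement for `split("\\s+")`: the regex version also treats other whitespace characters (such as newlines) as delimiters and returns a leading empty string when the line starts with whitespace, whereas this sketch handles only space and tab and never emits empty tokens.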

jerboaa · 2023-06-01T09:36:00Z

src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java

+            // loop over space-separated tokens
+            for (int tOrdinal = 1, tStart = 0, tEnd = line.indexOf(' '); tEnd != -1; tOrdinal++, tStart = tEnd + 1, tEnd = line.indexOf(' ', tStart)) {


AFAIK, this now also hard-codes the delimiter: A single space. If we really want this custom parser, please add a unit test for it and extract it to a separate class.

A hypothetical line like the following would confuse the parser, setting ordinal to wrong values:

36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - cgroup /dev/root rw,errors=continue

We'd have: mountRoot = 98:0, mountPath = /mnt1, fsType = cgroup since it expects single space separated values. Seems fragile.
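
The shift described above can be reproduced with a small sketch. Everything here is an assumption made for illustration: the class `MountInfoSketch` is hypothetical, the sample line carries a doubled space after the first field (spacing that a strict single-space parser would not expect), and `split("\\s+")` merely stands in for any whitespace-tolerant parser; it is not the pattern the JDK actually uses. The field positions follow the documented /proc/self/mountinfo layout, where the root is the fourth field, the mount point the fifth, and the filesystem type follows the `-` separator.

```java
// Hypothetical demonstration of delimiter sensitivity; illustration only.
final class MountInfoSketch {

    // Sample line with an assumed doubled space after the mount ID.
    static final String LINE =
            "36  35 98:0 /mnt1 /mnt2 rw,noatime master:1 - cgroup /dev/root rw,errors=continue";

    // Strict parsing: every single space is a field boundary.
    static String[] splitSingleSpace(String line) {
        return line.split(" ");
    }

    // Tolerant parsing: any run of whitespace is one field boundary.
    static String[] splitWhitespaceRuns(String line) {
        return line.trim().split("\\s+");
    }

    // Returns {mountRoot, mountPath, fsType} by position, or null if the
    // "-" separator is not found after the six fixed leading fields.
    static String[] extract(String[] t) {
        int sep = -1;
        for (int i = 6; i < t.length; i++) {
            if (t[i].equals("-")) {
                sep = i;
                break;
            }
        }
        if (sep < 0 || sep + 1 >= t.length) {
            return null;
        }
        return new String[] { t[3], t[4], t[sep + 1] };
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", extract(splitSingleSpace(LINE))));
        System.out.println(String.join(" ", extract(splitWhitespaceRuns(LINE))));
    }
}
```

With the strict split, the empty token produced by the doubled space pushes every later field one position to the right, so the device number lands in the mount-root slot; the tolerant split keeps the fields aligned.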

jerboaa · 2023-06-01T09:40:15Z

src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1SubsystemController.java

                                                                  CgroupSubsystem.LONG_RETVAL_UNLIMITED);
    }

    public static long convertHierachicalLimitLine(String line) {


Pre-existing typo: this should be convertHierarchicalLimitLine.

jerboaa · 2023-06-01T09:47:29Z

src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1SubsystemController.java


    public static long convertHierachicalLimitLine(String line) {
-        String[] tokens = line.split("\\s");
+        String[] tokens = line.split(" ");


Again, assumes single space ( ) delimited entries in memory.stat. I'm not sure we should hard-code this.

This delimiter also hasn't changed since the memory.stat file was introduced, and since cgroup v1 is in maintenance mode I'd expect it to stay that way.

jerboaa · 2023-06-01T10:00:16Z

src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java

        }
        // $MAX $PERIOD
-        String[] tokens = cpuMaxRaw.split("\\s+");
+        String[] tokens = cpuMaxRaw.split(" ");


This seems OK. According to https://docs.kernel.org/admin-guide/cgroup-v2.html#format
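
For reference, the `$MAX $PERIOD` format of cpu.max mentioned in the code comment can be parsed along these lines. This is a hedged sketch, not the JDK's implementation: `CpuMaxSketch`, its methods, and the `UNLIMITED` sentinel value are hypothetical, and the sketch assumes the kernel documentation's format in which the first value is either a number or the literal `max`.

```java
// Hypothetical parser for the cgroup v2 cpu.max file; illustration only.
final class CpuMaxSketch {

    // Assumed sentinel for "no limit"; the real JDK constant may differ.
    static final long UNLIMITED = -1L;

    // Returns the quota ($MAX), or UNLIMITED for "max" or malformed input.
    static long quota(String cpuMaxRaw) {
        return field(cpuMaxRaw, 0);
    }

    // Returns the period ($PERIOD), or UNLIMITED for malformed input.
    static long period(String cpuMaxRaw) {
        return field(cpuMaxRaw, 1);
    }

    private static long field(String cpuMaxRaw, int index) {
        if (cpuMaxRaw == null) {
            return UNLIMITED;
        }
        // Single-space delimiter, as in the change under review.
        String[] tokens = cpuMaxRaw.trim().split(" ");
        if (tokens.length != 2 || tokens[index].equals("max")) {
            return UNLIMITED;
        }
        try {
            return Long.parseLong(tokens[index]);
        } catch (NumberFormatException e) {
            return UNLIMITED;
        }
    }

    public static void main(String[] args) {
        System.out.println(quota("max 100000"));
        System.out.println(quota("50000 100000"));
        System.out.println(period("50000 100000"));
    }
}
```

Treating malformed input as unlimited is a design choice of this sketch; as noted earlier in the review, silently falling back in container detection code can hide problems, so real code might prefer to log such cases.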

jerboaa · 2023-06-01T10:00:27Z

src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java

            return Long.valueOf(0);
        }
-        String[] tokens = line.split("\\s+");
+        String[] tokens = line.split(" ");


This seems OK. According to https://docs.kernel.org/admin-guide/cgroup-v2.html#format

jerboaa · 2023-06-01T10:02:25Z

src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java

+    private static void warn(String msg) {
+        Logger logger = System.getLogger("jdk.internal.platform");
+        logger.log(Level.DEBUG, msg);
+    }


This seems fine and uncontroversial. Suggested name change log over warn. Perhaps apply this as a separate change?

jerboaa · 2023-06-01T12:01:15Z

With my Mandrel hat on, I support this change if it helps reduce duplication on the native-image side. It appears, though, that we need to find a way that's supportable long-term. It's easy to introduce a new change to this code which accidentally drags in some of those (unwanted) dependencies again.

AlanBateman · 2023-06-01T12:11:29Z

It's easy to introduce a new change to this code which accidentally drags in some of those (unwanted) dependencies again.

I do not like the changes proposed here; they are all crying out for several rounds of cleanup and modernization. One thing that would help is to split this up into a series of changes that could be evaluated, e.g. the use of regex may be a significant part of this, so maybe start with that, report back, then work through the iterations to make it clean and maintainable.

pejovica · 2023-06-05T09:01:24Z

@pejovica what testing have you done in relation to these changes? We run our container tests in tier5 - have you tested that? Thanks.

@dholmes-ora Thanks for the pointer, I'll post back when I run them.

christianwimmer · 2023-06-13T15:48:31Z

It's easy to introduce a new change to this code which accidentally drags in some of those (unwanted) dependencies again.

We (as the Native Image team) are OK with this. Our testing will detect that pretty quickly, and then the new code can be fixed.

adinn · 2023-06-20T09:07:27Z

We (as the Native Image team) are OK with this. Our testing will detect that pretty quickly, and then the new code can be fixed.

That may well be the case. However, until all the concerns raised by OpenJDK reviewers who have looked at this PR are addressed to their satisfaction it would not be appropriate to merge this patch.

n.b. That does not automatically mean the course of action the reviewers have recommended has to be followed. A resolution needs to be negotiated according to the merits and risks of the change. However, regarding that negotiation, I'll observe that the (repeated) request to break this change down in several steps appears to me to be motivated by the desire to ensure that the merits of the change are maximized (no unnecessary loss of important functionality) and the risks minimized (no unnecessary perturbation of the current implementation) -- which is not an unusual way for OpenJDK reviewers to proceed.

bridgekeeper · 2023-07-18T12:18:41Z

@pejovica This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper · 2023-08-15T17:22:04Z

@pejovica This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

dougxc · 2023-08-16T11:06:51Z

I'm concerned about the hard-coding of delimiter values and the added accidental complexity in order to avoid the Regex engine. Note that this test fails due to the delimiter hard-coding:
jdk/internal/platform/cgroup/TestCgroupSubsystemFactory.java
This change seems hard to maintain. How would you ensure this won't regress?

There seems to be a lot of usage of sscanf in https://github.com/openjdk/jdk/blob/master/src/hotspot/os/linux/cgroupSubsystem_linux.cpp. Maybe I'm misreading that code, but doesn't it also hard-code assumptions about the file format(s)?

jerboaa · 2023-08-16T12:29:38Z

Not as far as I'm aware (it can deal with tabs vs. spaces differences as well as multiple spaces).

FWIW, I did this change as a PoC a while ago and it seems sufficient to use the JDK's metrics impl in native-image (barring some perf numbers; if somebody provides me with pointers, I'm happy to provide those too).

dougxc · 2023-08-16T13:49:32Z

Ok, thanks. It's obviously been too long since I used sscanf ;-)

jerboaa · 2023-08-24T15:10:56Z

#15416 alternative PR with a more limited scope.

pejovica added 5 commits May 29, 2023 14:42

Factor out logging from CgroupSubsystemFactory.create

76cc3e8

Use simple patterns to parse cgroup files

edbacaf

Reimplement mountinfo parsing without using regex

dd9d8e5

Use java.io for reading cgroup files

51bd7d9

Use simple loops to process cgroup files

51904e0

bridgekeeper bot added the oca Needs verification of OCA signatory status label May 30, 2023

openjdk bot added the core-libs label May 30, 2023

bridgekeeper bot added the oca-verify Needs verification of OCA signatory status label May 30, 2023

Merge branch 'master' into ap/cgroup-tweaks

4fc3af2

pejovica changed the title ~~Reduce JDK dependencies of cgroup support~~ 8309191: Reduce JDK dependencies of cgroup support May 31, 2023

bridgekeeper bot removed oca Needs verification of OCA signatory status oca-verify Needs verification of OCA signatory status labels May 31, 2023

openjdk bot added the rfr Pull request is ready for review label May 31, 2023

jerboaa reviewed Jun 1, 2023


bridgekeeper bot closed this Aug 15, 2023

