Skip to content

Conversation

@liach
Copy link
Member

@liach liach commented Apr 22, 2023

Add a method internalName to ClassDesc, and unifies handling of string representation of a class constant in CONSTANT_Class_info via ofInternalName and internalName APIs, documented in ClassDesc itself. In particular, ofInternalName now accepts arrays.

The motivation of this API is that avoiding frequent String creations via caching (enabled by this new API, will be in a separate patch) would speed up Classfile API's writing of simple class files by 1/3. See https://mail.openjdk.org/pipermail/classfile-api-dev/2023-April/000296.html for more context.

This API is futureproof: for Valhalla's Q-types, it will return their string representation in CONSTANT_Class_info, which is most likely their full descriptor string.

Javadoc: https://cr.openjdk.org/~liach/8306697/java.base/java/lang/constant/ClassDesc.html


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change requires CSR request JDK-8306700 to be approved

Issues

  • JDK-8306697: Add method to obtain String for CONSTANT_Class_info in ClassDesc
  • JDK-8306700: Add method to obtain String for CONSTANT_Class_info in ClassDesc (CSR)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/13598/head:pull/13598
$ git checkout pull/13598

Update a local copy of the PR:
$ git checkout pull/13598
$ git pull https://git.openjdk.org/jdk.git pull/13598/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 13598

View PR using the GUI difftool:
$ git pr show -t 13598

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/13598.diff

Webrev

Link to Webrev Comment

@liach
Copy link
Member Author

liach commented Apr 22, 2023

/csr needed

@bridgekeeper
Copy link

bridgekeeper bot commented Apr 22, 2023

👋 Welcome back liach! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added rfr Pull request is ready for review csr Pull request needs approved CSR before integration labels Apr 22, 2023
@openjdk
Copy link

openjdk bot commented Apr 22, 2023

@liach has indicated that a compatibility and specification (CSR) request is needed for this pull request.

@liach please create a CSR request for issue JDK-8306697 with the correct fix version. This pull request cannot be integrated until the CSR request is approved.

@openjdk
Copy link

openjdk bot commented Apr 22, 2023

@liach The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@mlbridge
Copy link

mlbridge bot commented Apr 22, 2023

Webrevs

@liach
Copy link
Member Author

liach commented Apr 22, 2023

After taking a look at https://bugs.openjdk.org/browse/JDK-8278863 #9201, I think we should take a bold step and make ofInternalName handle all possible CONSTANT_Class_info values:

  1. Internal names for classes and interfaces have no merit on their own, and has no usages besides CONSTANT_Class_info
  2. In the current form, shall future developments like Valhalla add a new form of value to CONSTANT_Class_info, users won't be able to reliably convert a CONSTANT_Class_info structure into a ClassDesc, adding more maintenance burdens.

The decision for ofInternalName to accept only class and interface internal names was an ad-hoc decision that was questioned by CSR lead in CSR review as well.

* @see ClassDesc#ofDescriptor(String)
* @see ClassDesc#internalName()
* @see <a href="#constant-class-info">The {@code CONSTANT_Class_info} Structure</a>
* @since 20

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have @revised, as array descriptors are not allowed as input to this method in JDK 20.

Suggested change
* @since 20
* @since 20
* @revised 21

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no guideline on using revised. Also, I don't think we will declare it revised if it starts accepting Valhalla Q-types.

@liach
Copy link
Member Author

liach commented Apr 25, 2023

@mlchung Sorry to bother, but since you are frequently working on the Constant API and the caching allowed by this patch allows the Classfile API to speed up significantly, can you review this patch and its associated CSR?

In particular, I wish that you can review in these perspectives:

  1. Whether I should add a boolean hasInternalName() to check if the given class can be encoded in a CONSTANT_Class_info; currently it's equivalent to !isPrimitive(), but it might change when, for example, valhalla arrives.
  2. Whether I should use a @revised tag on ofInternalName.

@mlchung
Copy link
Member

mlchung commented Apr 25, 2023

You don't have to use @revised on the spec change. @revised was used in 9 as it was useful to generate special pages for all the API changes by Project Jigsaw.

@mlchung
Copy link
Member

mlchung commented Apr 26, 2023

I'm not working on the Constant API area while I'm happy to help. FYI I'll be on vacation for 3 weeks from May 1.

@mlchung
Copy link
Member

mlchung commented Apr 26, 2023

An array does not have an internal name. Changing ofInternalName to accept arrays seems not right. What is the issue of calling ClassDesc.ofDescriptor for arrays?

Have you considered internalName to return an Optional?

@liach
Copy link
Member Author

liach commented Apr 26, 2023

What is the issue of calling ClassDesc.ofDescriptor for arrays?

Users would have to check the obtained string from a CONSTANT_Class_info and call different factory methods, which feels a bit... weird to me, like calling a == null ? Optional.empty() : Optional.of(a) instead of Optional.ofNullable(a), and CONSTANT_Class_info can be seen as an API provided by the class file as it's defined in the JVMS.

Have you considered internalName to return an Optional?

That might be feasible. I initially fear that this may hurt performance-sensitive operations, but now I think as long as we can make isArray() or isClassOrInterface() constant-foldable (I don't think String::startsWith is optimized that far, so might need to check charAt(0)), we can still make obtaining the internal name fast.

public Optional<String> internalName() {
    if (!isClassOrInterface()) // assuming constant-foldable
        return Optional.of(descriptorString);
    var internalName = this.internalName; // caching
    if (internalName != null)
        return Optional.of(internalName);
    return Optional.of(this.internalName = descriptorString.substring(1, descriptorString.length() - 1));
}

@mlchung
Copy link
Member

mlchung commented Apr 27, 2023

Only the class name of a class or interface is encoded in the internal form. I expect the internalName method should do like this:

public Optional<String> internalName() {
    if (!isClassOrInterface())
        return Optional.empty();
    return Optional.of(internalName);  // it can be stored as a final field assigned during instantiation
}

What you are thinking is a method to return some string for any ClassDesc. If that's desirable, that should be a different method name.

@liach
Copy link
Member Author

liach commented Apr 27, 2023

I am aiming for an API to interact with the string in CONSTANT_Class_info than internal names, as I fail to see how internal names have any usage besides being a part of CONSTANT_Class_info when descriptors are present.

Should I use the ofClassInfoString toClassInfoString names instead, since everyone is confused on internal names? And this will avoid touching the preexisting ofInternalName.

@mlchung
Copy link
Member

mlchung commented Apr 27, 2023

I want to see what @asotona thinks. I also want to see some examples how ClassFile API implementation uses this new API.

@asotona
Copy link
Member

asotona commented Apr 27, 2023

I support this addition. Classfile API contains utility method obtaining internal name from ClassDesc and it is one of the core pieces.
I don't have any exact measured numbers in my hands now, however String conversions are significant CPU consumers in our benchmarks.
In #13671 we are targeting to support conversion-less paths for class building and transformations (by passing the original ClassDesc instances through the process). External utility for conversions from ClassDesc to internal name does not allow to effectively cache the names and so every repeated need of internal name of the same ClassDesc instance will produce a new substring.


@Override
public String internalName() {
return isArray() ? descriptorString() : descriptorString().substring(1, descriptorString().length() - 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to cache the internal name to avoid repeated substring-ing with each call.
ClassDesc instances are frequently used as static constants serving for generations or transformations of many classes, methods, instructions.
It is highly expected that if this method is invoked at least once on the particular instance, it will be invoked many times on the same instance.

@asotona
Copy link
Member

asotona commented Apr 27, 2023

One comment to the topic of if internal name should include array descriptors.

It should not include array descriptors if it is called internal name, however even with excluded arrays the method will help a lot by caching the internal names and avoiding repeated substringing.

Or it can include also array descriptors if it is named differently ;)

Classfile API will benefit from both ways.

@liach
Copy link
Member Author

liach commented Apr 27, 2023

For API design, I have 3 choices for the accessor method:

  1. Return Optional only for internal names for classes or interfaces; Optional.empty() for primitives and arrays
  2. Return Optional suitable for CONSTANT_Class_info; Optional.empty() for primitives
  3. Return String suitable for CONSTANT_Class_info; throw for primitives

Which route should I take? I personally reject 1 as internal names are meaningless if they are used alone without being part of CONSTANT_Class_info. For 2 and 3, the main problem is the overhead of Optional vs whether there are better ways to detect if a class cannot be encoded in CONSTANT_Class_info (for now it's only primitives, but valhalla might change that if non-null types are loadable but cannot be in CONSTANT_Class_info)

On second thought, I think I will go option 2; it will be in line with Class.describeConstable(); though the current check is based on Class.isHidden(), criteria may change when Valhalla is out, and describeConstable() is better suited for future API evolutions.

@mlchung
Copy link
Member

mlchung commented Apr 27, 2023

  1. Return a string representation of internal name for a class or interface and a descriptor string for a primitive type and array. The method could be called classDescString. The caller can test if this ClassDesc is not a primitive if it wants the string for a class or interface or an array.

@mlchung
Copy link
Member

mlchung commented Apr 27, 2023

With option 4, ConstantPoolBuilder::classEntry can simply become:

    default ClassEntry classEntry(ClassDesc classDesc) {
        return classEntry(utf8Entry(classDesc.classDescString()));
    }

vs option 2:

    default ClassEntry classEntry(ClassDesc classDesc) {
        return classEntry(utf8Entry(classDesc.isPrimitive()
                    ? classDesc.descriptorString()
                    : classDesc.internalName().get()));
    }

@liach
Copy link
Member Author

liach commented Apr 27, 2023

  1. Return a string representation of internal name for a class or interface and a descriptor string for a primitive type and array. The method could be called classDescString. The caller can test if this ClassDesc is not a primitive if it wants the string for a class or interface or an array.

I think I will still resort to an Optional.

Pros of Optional:

  1. Clearly indicates to users that some ClassDesc cannot be converted to a CONSTANT_Class_info string, than failing at primitive classes at runtime (8304031: Classfile API cannot encode Primitive Class as Condy #12996)
  2. Anticipates valhalla changes; if non-null types in Valhalla has descriptors but cannot be encoded in CONSTANT_Class_info, then users that check Optional will not break, as opposed to checking isPrimitive (similar story with Lookup::hasPrivateAccess)

Cons of Optional:

  1. Needs wrap and unwrap, not beautiful in code and has no pattern matching
    • The unwrap performance overhead will be gone in Valhalla with value classes

So, I most likely will add ClassDesc.ofClassDescString() and ClassDesc.classDescString() returning Optional<String> to interact with CONSTANT_Class_info.

With this new model, we will have:

    default ClassEntry classEntry(ClassDesc classDesc) {
        return classEntry(utf8Entry(classDesc.classDescString().orElseThrow(() -> new IllegalArgumentException(classDesc + " cannot be encoded as ClassEntry"))));
    }

which will take care of non-primitive cases from valhalla.

@mlchung
Copy link
Member

mlchung commented Apr 27, 2023

ClassDesc.ofClassDescString("I") can't differentiate if it's a primitive or an internal name of class I.

@liach
Copy link
Member Author

liach commented Apr 27, 2023

ClassDesc.ofClassDescString("I") can't differentiate if it's a primitive or an internal name of class I.

It is always treated as an internal name when it is without a [ prefix. if class I in the unnamed package is stored in a CONSTANT_Class_info, then it is class(utf8("I")) in the constant pool, or LI; as a descriptor. I will add an API note informing users not to pass non-CONSTANT_Class_info strings into this factory.

@mlchung
Copy link
Member

mlchung commented Apr 27, 2023

I consulted with @briangoetz. The internal name should probably be cached in the ClassFile API implementation.

@briangoetz
Copy link
Contributor

briangoetz commented Apr 27, 2023 via email

@liach
Copy link
Member Author

liach commented Apr 27, 2023

Sounds good. I will close the pull request and the JBS issue.

@liach liach closed this Apr 27, 2023
@asotona
Copy link
Member

asotona commented Apr 28, 2023

With the upcoming refactor to make parse/build instance methods, there is a logical place and lifetime for caches.

@briangoetz
This is a different level of caches than the discussed ClassHierarchyResolver cache.

Most of the class generation code starts with static ClassDesc constants (or references to ConstantDescs).
Internal caching of class internal name within each single ClassDesc (instead of repeated sub-stringing by an external utility method) has visible performance impact. From each of the static ClassDesc instances there are thousands of calls of their conversions to internal class names and the conversion always create a new sub-string instance.
I don't agree with closing this PR, its function is complementary to the other performance improvements.

@asotona
Copy link
Member

asotona commented Apr 28, 2023

I benchmarked merged this PR with #13671 and benefits of internal names caching in ClassDesc are insignificant (~2% performance boost).
So I'm taking back my disagreement with closing this PR.

@liach liach deleted the feature/classdesc-internalname branch May 7, 2023 00:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core-libs [email protected] csr Pull request needs approved CSR before integration rfr Pull request is ready for review

Development

Successfully merging this pull request may close these issues.

5 participants