@jonathan-gibbons
Contributor

@jonathan-gibbons jonathan-gibbons commented Nov 10, 2022

Please review a "somewhat automated" change to insert @spec tags into doc comments, as appropriate, to leverage the recent new javadoc feature to generate a new page listing the references to all external specifications listed in the @spec tags.

"Somewhat automated" means that I wrote and used a temporary utility to scan doc comments looking for HTML links to selected sites, such as ietf.org, unicode.org, w3.org. These links may be in the main description of a doc comment, or in @see tags. For each link, the URL is examined, and "normalized", and inserted into the doc comment with a new @spec tag, giving the link and title for the spec.

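For illustration only, the scanning step described above might look roughly like the following sketch. The temporary utility itself is not shown in this PR, so the class name, the regular expression, and the way hosts are matched are all assumptions.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; this is not the utility used to generate the PR.
final class SpecLinkScanner {
    // The three sites named in the description; the full list is not given.
    static final Set<String> SPEC_SITES = Set.of("ietf.org", "unicode.org", "w3.org");

    // Matches <a href="...">, which may appear in the main description or in @see tags.
    private static final Pattern LINK =
            Pattern.compile("<a\\s+href=\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // Returns the href of each link whose host is one of the selected sites.
    static List<String> findSpecLinks(String docComment) {
        List<String> found = new ArrayList<>();
        Matcher m = LINK.matcher(docComment);
        while (m.find()) {
            String host;
            try {
                host = URI.create(m.group(1)).getHost();
            } catch (IllegalArgumentException e) {
                continue; // e.g. hrefs that start with {@docRoot} are not valid URIs
            }
            if (host != null && isSpecSite(host)) {
                found.add(m.group(1));
            }
        }
        return found;
    }

    // True for the site itself and for any of its subdomains.
    static boolean isSpecSite(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        return SPEC_SITES.stream().anyMatch(s -> h.equals(s) || h.endsWith("." + s));
    }

    public static void main(String[] args) {
        String comment =
                " * See <a href=\"http://www.w3.org/TR/xpath\">XPath</a> and the\n"
              + " * <a href=\"{@docRoot}/../specs/jar/jar.html\">JAR File Specification</a>.\n"
              + " * @see <a href=\"https://tools.ietf.org/html/rfc1951\">DEFLATE specification</a>\n";
        System.out.println(findSpecLinks(comment));
    }
}
```

Links that are relative or that use {@docRoot} are simply skipped in this sketch.
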
"Normalized" means...

  • Use https: where possible (includes pretty much all cases)
  • Use a single consistent host name for all URLs coming from the same spec site (i.e. don't use different aliases for the same site)
  • Point to the root page of a multi-page spec
  • Use a consistent form of the spec, preferring HTML over plain text where both are available (this mostly applies to IETF specs)
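
As a rough sketch of the first two rules (https and a single host name per site), something like the following could work. The alias table and class name are assumptions, dropping the #fragment is only a simplified stand-in for "point to the root page", and the HTML-over-plain-text preference is not modelled at all.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of the "normalize" step; the alias table is an assumption.
final class SpecUrlNormalizer {
    // One canonical host per spec site, so that aliases collapse to a single name.
    static final Map<String, String> CANONICAL_HOST = Map.of(
            "w3.org", "www.w3.org",
            "ietf.org", "www.ietf.org",
            "unicode.org", "www.unicode.org");

    static String normalize(String url) {
        URI uri = URI.create(url);
        if (uri.getHost() == null) {
            return url; // relative link: nothing to normalize in this sketch
        }
        String host = uri.getHost().toLowerCase(Locale.ROOT);
        host = CANONICAL_HOST.getOrDefault(host, host);
        String path = uri.getPath() == null ? "" : uri.getPath();
        // Use https, and drop any query or #fragment so the URL names the page itself.
        return "https://" + host + path;
    }

    public static void main(String[] args) {
        System.out.println(normalize("http://www.w3.org/TR/xslt#output"));
        System.out.println(normalize("http://w3.org/TR/xpath"));
    }
}
```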

In addition, a "standard" title is determined for all specs, determined either from the content of the (main) spec page or from site index pages.

The net effect is (or should be) that all the changes are to just add new @spec tags, based on the links found in each doc comment. There should be no other changes to the doc comments, or to the implementation of any classes and interfaces.

That being said, the utility I wrote does have additional abilities, such as updating the links that it finds (e.g. changing them to use https:). Those features are not being used here, but they could be used in follow-up PRs if component teams so desire. I did notice while working on this overall feature that many of our links do point to "outdated" pages, some with eye-catching notices declaring that the spec has been superseded. Determining how, when and where to update such links is beyond the scope of this PR.

Going forward, it is to be hoped that component teams will maintain the underlying links, and the URLs in @spec tags, such that if references to external specifications are updated, this will include updating the @spec tags.

To see the effect of all these new @spec tags, see http://cr.openjdk.java.net/~jjg/8296546/api.00/

In particular, see the new External Specifications page, which you can also find via the new link near the top of the Index pages.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/11073/head:pull/11073
$ git checkout pull/11073

Update a local copy of the PR:
$ git checkout pull/11073
$ git pull https://git.openjdk.org/jdk pull/11073/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 11073

View PR using the GUI difftool:
$ git pr show -t 11073

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/11073.diff

@jonathan-gibbons jonathan-gibbons marked this pull request as draft November 10, 2022 01:10
@bridgekeeper

bridgekeeper bot commented Nov 10, 2022

👋 Welcome back jjg! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@jonathan-gibbons jonathan-gibbons marked this pull request as ready for review November 10, 2022 01:14
@openjdk

openjdk bot commented Nov 10, 2022

@jonathan-gibbons The following labels will be automatically applied to this pull request:

  • client
  • compiler
  • core-libs
  • i18n
  • javadoc
  • net
  • nio
  • security
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@mlbridge

mlbridge bot commented Nov 10, 2022

Webrevs

@dfuch
Member

dfuch commented Nov 10, 2022

Hi Jon,

When referencing an RFC, it might be good to keep the RFC number in the text link. For instance I see that java.net.URL now has this:

http://cr.openjdk.java.net/~jjg/8296546/api.00/java.base/java/net/URL.html

External Specifications
Format for Literal IPv6 Addresses in URL's, Uniform Resource Identifier (URI): Generic Syntax, Uniform Resource Identifiers (URI): Generic Syntax

You will see that two of the RFC links have the same text but link to different RFCs, which I am finding confusing.
Also I do hope it's clear that if a specification is referenced it doesn't mean it's being implemented.

@AlanBateman
Contributor

When referencing an RFC, it might be good to keep the RFC number in the text link. For instance I see that java.net.URL now has this:

I agree and also to add that some RFCs have commas in their titles, the same separator used when there is more than one specification linked. Here's an example:

http://cr.openjdk.java.net/~jjg/8296546/api.00/java.base/java/nio/channels/MulticastChannel.html

@AlanBateman
Contributor

AlanBateman commented Nov 10, 2022

I'm trying to understand what "fix-ups" will be needed if the automated patch is applied. In some cases, it looks like the same spec will be linked from "See also" and "External Specifications", e.g.
http://cr.openjdk.java.net/~jjg/8296546/api.00/java.base/java/net/StandardSocketOptions.html#TCP_NODELAY
so the @see ref can be dropped.

In other cases we will have inline refs and the same URL in the @spec. This may be okay for the short term but maybe there is a way to inline @spec to avoid the duplication?

There will probably be a bit of cleanup to reflow some lines, e.g. StandardSocketOptions.java, as excessively long lines are problematic for side-by-side diffs.

@wangweij
Contributor

@AJ1062910

Did you change 420 files?

@mlbridge

mlbridge bot commented Nov 10, 2022

Mailing list message from Michael StJohns on security-dev:

Daniel et al -

Please avoid using ietf.org as the cite location for RFCs

The preferred cite for RFCs is generally via
www.rfc-editor.org/info/rfc<number> - that will get you to the info
page, but with links to pdf, html, and a clean .txt.

Cf https://www.rfc-editor.org/info/rfc4180 - "Cite this RFC"? -

Shafranovich, Y., "Common Format and MIME Type for Comma-Separated Values (CSV) Files", RFC 4180, DOI 10.17487/RFC4180, October 2005, <https://www.rfc-editor.org/info/rfc4180>.

Note that the most stable cite might be the DOI cite, but the above is
going to be fairly useful for a long time to come.
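
To make the suggested form concrete, here is a small sketch that builds the rfc-editor.org info URL from an RFC number, and that tries to recover the number from an existing RFC link. The class and method names are hypothetical, and the URL pattern it recognizes is an assumption based only on the link shapes that appear in this PR.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of the PR or of any JDK tool.
final class RfcCitation {
    // Assumes the URL ends in ".../rfc<number>", optionally followed by .html or .txt.
    private static final Pattern RFC_IN_URL =
            Pattern.compile("/rfc(\\d+)(?:\\.html|\\.txt)?$");

    // Builds the info-page URL in the form recommended above.
    static String infoUrl(int rfcNumber) {
        if (rfcNumber <= 0) {
            throw new IllegalArgumentException("RFC number must be positive: " + rfcNumber);
        }
        return "https://www.rfc-editor.org/info/rfc" + rfcNumber;
    }

    // Rewrites an existing RFC link to its info page, if the number can be found.
    static Optional<String> toInfoUrl(String url) {
        Matcher m = RFC_IN_URL.matcher(url);
        return m.find()
                ? Optional.of(infoUrl(Integer.parseInt(m.group(1))))
                : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(infoUrl(4180));
        System.out.println(toInfoUrl("https://www.ietf.org/rfc/rfc1951.html"));
    }
}
```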

Mike

On 11/10/2022 6:34 AM, Daniel Fuchs wrote:

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/security-dev/attachments/20221110/3423e68e/attachment-0001.htm>

@seanjmullan
Member

When referencing an RFC, it might be good to keep the RFC number in the text link.

+1.

@jonathan-gibbons jonathan-gibbons changed the title from "JDK-8296547: Add @spec tags to API" to "JDK-8296546: Add @spec tags to API" Nov 10, 2022
@jonathan-gibbons
Contributor Author

Hi Jon,

When referencing an RFC, it might be good to keep the RFC number in the text link. For instance I see that java.net.URL now has this:

http://cr.openjdk.java.net/~jjg/8296546/api.00/java.base/java/net/URL.html

External Specifications Format for Literal IPv6 Addresses in URL's, Uniform Resource Identifier (URI): Generic Syntax, Uniform Resource Identifiers (URI): Generic Syntax

You will see that two of the RFC links have the same text but link to different RFCs, which I am finding confusing. Also I do hope it's clear that if a specification is referenced it doesn't mean it's being implemented.

On keeping RFC in the title, I'll go with the team preference. I note that not all spec authorities have such a well-defined naming/numbering scheme, so it does make the summary page a bit inconsistent. Also, the entries under "R" dominate the list, which may not be what you want.

On the same text but linking to different RFCs: that's tantamount to a bug somewhere. The spec for @spec dictates that the URLs and titles should be in 1-1 correspondence, and this is supposed to be enforced in the doclet. In other words, specs should have unique titles, and any title should only be used for one spec.
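
A 1-1 correspondence check of this kind could be sketched as below. This is not the doclet's actual code: the class name and the example URLs are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a URL/title consistency check; not the doclet's code.
final class SpecTitleChecker {
    private final Map<String, String> titleByUrl = new HashMap<>();
    private final Map<String, String> urlByTitle = new HashMap<>();

    // Records one @spec tag; returns false if it conflicts with an earlier one.
    boolean add(String url, String title) {
        String knownTitle = titleByUrl.putIfAbsent(url, title);
        String knownUrl = urlByTitle.putIfAbsent(title, url);
        boolean sameTitle = knownTitle == null || knownTitle.equals(title);
        boolean sameUrl = knownUrl == null || knownUrl.equals(url);
        return sameTitle && sameUrl;
    }

    public static void main(String[] args) {
        SpecTitleChecker checker = new SpecTitleChecker();
        // Placeholder URLs and titles, not taken from the JDK sources.
        System.out.println(checker.add("https://example.org/spec-a", "Spec A"));
        System.out.println(checker.add("https://example.org/spec-b", "Spec A"));
    }
}
```

The second call reuses a title for a different URL, which is the situation reported for java.net.URL.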

@jonathan-gibbons
Contributor Author

jonathan-gibbons commented Nov 10, 2022

When referencing an RFC, it might be good to keep the RFC number in the text link. For instance I see that java.net.URL now has this:

I agree and also to add that some RFCs have commas in their titles, the same separator used when there is more than one specification linked. Here's an example:

http://cr.openjdk.java.net/~jjg/8296546/api.00/java.base/java/nio/channels/MulticastChannel.html

I can change the doclet to use a bulleted list when any spec titles contain a comma.

@jonathan-gibbons
Contributor Author

I'm trying to understand what "fix-ups" will be needed if the automated patch is applied. In some cases, it looks the same spec will be linked from "See also" and "External Specifications", e.g. http://cr.openjdk.java.net/~jjg/8296546/api.00/java.base/java/net/StandardSocketOptions.html#TCP_NODELAY so the @see ref can be dropped.

In other cases we will have inline refs and the same URL in the @spec. This may be okay for the short term but maybe there is a way to inline @spec to avoid the duplication?

There will probably be a bit of cleanup to reflow some lines, e.g. StandardSocketOptions.java, as excessively long lines are problematic for side-by-side diffs.

The utility I mentioned has the (optional) ability to remove @see links when the text of the link exactly matches that used by the @spec tag. Unfortunately, the text is typically not exactly the same, and would require manual analysis to see if the @see tag can be removed.

When inline references are used, the wording is very rarely the primary title of the spec: it is more likely to be a word or phrase that makes sense in the context of the enclosing sentence.

History: version 1 of this feature tried replacing inline links and @see tags with a bi-modal @spec tag. The results were "not good", especially in the generated external-specs page. Version 2 used a side file to provide the definitive title for each spec, but that was deemed to be too much of a maintenance issue. This is version 3, in which we've eliminated the side-file in favor of duplicating the title in each @spec tag.

@jonathan-gibbons
Contributor Author

/issue add JDK-8296546

@openjdk openjdk bot changed the title from "JDK-8296546: Add @spec tags to API" to "8296546: Add @spec tags to API" Nov 10, 2022
@openjdk

openjdk bot commented Nov 10, 2022

@jonathan-gibbons This issue is referenced in the PR title - it will now be updated.

@jonathan-gibbons
Contributor Author

Did you change 420 files?

I ran a custom utility that edited these files, yes.

* use instances for synchronization, or unpredictable behavior may
* occur. For example, in a future release, synchronization may fail.
*
* @spec https://www.unicode.org/reports/tr27 Unicode 3.1.0
Member

This should probably be removed, as the original link (explaining U+n notation) is broken.

Contributor Author

@naotoj The edits are driven by a script, using info about existing links in the same doc comment. If you don't think this reference is appropriate, it would be better to either remove the existing link (and I'll regenerate this patch) or else this patch goes through and you fix up both the existing link and the @spec tag afterwards.

Member

Either way is fine with me. I will fix it up if you choose the latter.

Comment on lines +29 to +30
* @spec jdwp/jdwp-spec.html Java Debug Wire Protocol
* @spec jdwp/jdwp-transport.html Java Debug Wire Protocol Transport Interface (jdwpTransport)
Contributor

http://cr.openjdk.java.net/~jjg/8296546/api.00/jdk.jdwp.agent/module-summary.html

The end result here is not very clean. You have the same two specs being referred to just a few lines apart, and the hyperlink titles are not even close to being the same, even though the links are the same. Maybe the "@see" section should be removed.

@@ -104,6 +104,7 @@
* </blockquote>
*
*
* @spec jpda/jpda.html Java Platform Debugger Architecture
Contributor

http://cr.openjdk.java.net/~jjg/8296546/api.00/jdk.jdi/module-summary.html

@spec and @see sections end up one right after the other with the same content, except the @see section has the preferred hyperlink title. Suggest you remove the @see section and also update @spec hyperlink title to include "(JPDA)", or update the actual title in the jpda.html doc so it includes "(JPDA)" in it and then rerun your tool.

@@ -102,6 +102,7 @@ public Connection() {}
* @throws java.io.IOException
* If the length of the packet (as indictaed by the first
* 4 bytes) is less than 11 bytes, or an I/O error occurs.
* @spec jdwp/jdwp-spec.html Java Debug Wire Protocol
Contributor

http://cr.openjdk.java.net/~jjg/8296546/api.00/jdk.jdi/com/sun/jdi/connect/spi/Connection.html#readPacket()

Within this javadoc page the jdwp-spec.html references are titled "JDWP Specification", but these @spec references are titled "Java Debug Wire Protocol". I suggest making them more consistent. There is one more case below and this same issue also applies to TransportService.java. Perhaps the title in jdwp-spec.html should be updated. I think "Java Debug Wire Protocol (JDWP) Specification" would be good.

@@ -76,6 +76,8 @@
* method is used to accept a connection initiated by a
* target VM.
*
* @spec jdwp/jdwp-spec.html Java Debug Wire Protocol
Contributor

See above comment for Connection.java.

@dfuch
Member

dfuch commented Nov 11, 2022

On the same text but linking to different RFCs: that's tantamount to a bug somewhere. The spec for @spec dictates that the URLs and titles should be in 1-1 correspondence, and this is supposed to be enforced in the doclet. In other words, specs should have unique titles, and any title should only be used for one spec.

It's not uncommon for a newer version of an RFC to change its number but keep its title. I see that the links in the class level API documentation both have the RFC number in their link text. Somehow that was stripped by your tool - possibly because it tried to extract some meta information from the linked page itself?

Contributor

@LanceAndersen LanceAndersen left a comment

Hi Jon,

I only looked at the jar-specific updates, but there are some leftover duplications.

It would probably be easier for the reviewers and for you if the PR could be broken out by areas into separate PRs.

* Manifest and Signature Specification</a> - The manifest format specification.
* </ul>
*
* @spec jar/jar.html JAR File Specification
Contributor

Line 43 should be removed

* <a href="{@docRoot}/../specs/jar/jar.html">
* Manifest format specification</a>.
*
* @spec jar/jar.html JAR File Specification
Contributor

Line 44 should be removed

* and will be UTF8-encoded when written to the output stream. See the
* <a href="{@docRoot}/../specs/jar/jar.html">JAR File Specification</a>
* for more information about valid attribute names and values.
* @spec jar/jar.html JAR File Specification
Contributor

Line 448 should be removed

* <p>This map and its views have a predictable iteration order, namely the
* order that keys were inserted into the map, as with {@link LinkedHashMap}.
*
* @spec jar/jar.html JAR File Specification
Contributor

Line 52 should be removed

* <li>Adler-32 checksum is described in RFC 1950 (above)
* </ul>
*
* @spec https://www.ietf.org/rfc/rfc1951.html DEFLATE Compressed Data Format Specification version 1.3
Contributor

The above references should be removed as they duplicate the @spec tags

@dfuch
Member

dfuch commented Nov 11, 2022

It would probably be easier for the reviewers and for you if the PR could be broken out by areas into separate PRs

Leaving out the non-public and non-exported classes would also reduce the PR size.

@dfuch
Member

dfuch commented Nov 23, 2022

Thanks for adding the RFC NNNN prefix to the RFC link. What is the purpose of editing non exported classes though, like those in the sun.net subpackages?

@jonathan-gibbons
Contributor Author

Thanks for adding the RFC NNNN prefix to the RFC link. What is the purpose of editing non exported classes though, like those in the sun.net subpackages?

That was not intentional, and is a result of the scripted edit. I will look to revert those changes and/or change the tooling to ignore those packages.

@dfuch
Member

dfuch commented Nov 23, 2022

The java.base/net/, java.http/, java.naming/ changes look reasonable to me - though like Alan I wonder if it wouldn't be better to have an inline {@spec } tag - similar to {@systemProperty }, rather than repeating all the references outside of the context where they were cited. This probably also calls for a review of these references by maintainers of the various areas - as some of them might need some updating - e.g. linking to rfceditor as was previously suggested, and double checking whether all of them still make sense. Not something to be conducted within this PR though.

@jonathan-gibbons
Contributor Author

The java.base/net/, java.http/, java.naming/ changes look reasonable to me - though like Alan I wonder if it wouldn't be better to have an inline {@spec } tag - similar to {@systemProperty }, rather than repeating all the references outside of the context where they were cited. This probably also calls for a review of these references by maintainers of the various areas - as some of them might need some updating - e.g. linking to rfceditor as was previously suggested, and double checking whether all of them still make sense. Not something to be conducted within this PR though.

Believe me, I tried very hard to design and use an inline {@spec} tag but such a tag effectively needs a normative external file to indicate the root of a multi-page spec, and the definitive title, since inline tags either do not or are unlikely to contain such information.

The general history of this work is:

  • version 1: bimodal tag with no external file -- the content of the summary page was effectively rubbish
  • version 2: bimodal tag with an external file -- in discussion with @jddarcy and CSR, we decided that was too much of a non-standard maintenance load
  • version 3: new tag, with no external file needed -- as you see here

* @spec https://www.w3.org/TR/xml11 Extensible Markup Language (XML) 1.1 (Second Edition)
* @spec https://www.w3.org/TR/REC-xml-names Namespaces in XML 1.0 (Third Edition)
* @spec https://www.w3.org/TR/xml-names11 Namespaces in XML 1.1 (Second Edition)
* @spec https://www.w3.org/TR/xmlschema-1 XML Schema Part 1: Structures Second Edition
Member

Hi Jon,

I would agree with what Alan said earlier that the @see ref can be dropped. This particular class (XMLConstants.java [1]) is a good example for that argument: in the resulting javadoc, 5 specs were listed in the "External Specifications" section, 6 in "See Also:", and then they were listed again for each field. That's a lot of duplicates. Adding to the confusion, the @spec and @see references were not always the same, e.g. the @spec for XML 1.0 points to the fifth edition while the @see points to the second.

A minor comment is that the @spec tags were rendered on one line while the @see refs were rendered as a list. I find the latter easier to read.

[1] http://cr.openjdk.java.net/~jjg/8296546/api.00/java.xml/javax/xml/XMLConstants.html

Contributor Author

@jonathan-gibbons jonathan-gibbons Dec 1, 2022

The presentation of lists of @spec tags was fixed separately (JDK-8297802), and is incorporated into the latest docs that demo this work. The same algorithm is now used for both @see and @spec tags ... if the links are short and do not contain commas, they will be displayed as an inline list; otherwise, they will be displayed in a bulleted list.
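
The rule described here can be sketched as follows. The length threshold and all names are assumptions, since the thread does not say what counts as "short"; the real implementation is the one delivered under JDK-8297802 and may differ.

```java
import java.util.List;

// Hypothetical sketch of the inline-versus-bulleted decision; not the doclet's code.
final class SpecListLayout {
    // Assumed threshold; the thread does not say what counts as "short".
    static final int MAX_INLINE_LENGTH = 30;

    // Inline only if every label is short and free of commas.
    static boolean useInlineList(List<String> labels) {
        return labels.stream()
                .allMatch(l -> l.length() <= MAX_INLINE_LENGTH && !l.contains(","));
    }

    // Renders either a comma-separated inline list or an HTML bulleted list.
    static String render(List<String> labels) {
        if (useInlineList(labels)) {
            return String.join(", ", labels);
        }
        StringBuilder sb = new StringBuilder("<ul>\n");
        for (String label : labels) {
            sb.append("  <li>").append(label).append("</li>\n");
        }
        return sb.append("</ul>").toString();
    }

    public static void main(String[] args) {
        System.out.println(render(List.of("Java Debug Wire Protocol", "JAR File Specification")));
        System.out.println(render(List.of(
                "RFC 1951: DEFLATE Compressed Data Format Specification version 1.3")));
    }
}
```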

Member

@JoeWang-Java JoeWang-Java left a comment

Specs for XSLT and XPath (many occurrences) need to point to specific version (e.g. 1.0) rather than the "cover page" (this is an issue in the original javadoc).

* <p>All the fields in this class are read-only.</p>
*
* @spec https://www.w3.org/TR/xslt xslt cover page - W3C
* @see <a href="http://www.w3.org/TR/xslt#output">
Member

The pages for XSLT and XPath at W3C are organized differently from the days when this javadoc was created. The "latest version" now points to the "cover page". Could you change the spec to the following?
https://www.w3.org/TR/1999/REC-xslt-19991116 XSL Transformations (XSLT) Version 1.0

The @spec points to the general spec while the @see points to a specific section (a similar situation to other classes in the package). If we want to keep the @see ref here, it would be:
https://www.w3.org/TR/1999/REC-xslt-19991116#output

* @spec https://www.w3.org/TR/xpath xpath cover page - W3C
* @author Norman Walsh
* @author Jeff Suttor
* @see <a href="http://www.w3.org/TR/xpath">XML Path Language (XPath) Version 1.0</a>
Member

Similar situation as XSLT above, the latest version now points to "cover page". For this javadoc then, it needs to be:
https://www.w3.org/TR/1999/REC-xpath-19991116/ XML Path Language (XPath) Version 1.0

Unlike XSLT, the original @see ref also points to the spec generally (not a specific section), so we could drop it and keep just the @spec ref.

* </dl>
*
* @spec jdwp/jdwp-spec.html Java Debug Wire Protocol
* @spec jni/index.html Java Native Interface Specification
Contributor

@AlanBateman AlanBateman Nov 24, 2022

One thing that bothers me a bit here is that the JNI and JDWP specs will be listed as "External Specifications" in the generated javadoc. This heading is appropriate for RFCs and other standards that we reference but seems misleading for specifications that are part of Java SE. Also, if the existing table is removed then we lose the word "Optional". Have there been any other concerns about this?

* <li><a href="doc-files/FocusSpec.html">The AWT Focus Subsystem</a>
* <li><a href="doc-files/Modality.html">The AWT Modality</a>
* <li><a href="{@docRoot}/../specs/AWT_Native_Interface.html">
* The Java AWT Native Interface (JAWT)</a>
Contributor

Why only 1 of these 3 ?

* </ul>
*
* @spec AWT_Native_Interface.html The Java AWT Native Interface Specification and Guide
* @since 1.0
Contributor

I wonder if links to HTML that we include in the javadoc should really be treated in the same manner as references to externally defined specifications?
But I also wonder why only the native_interface spec was added and not the other two?

*
* @spec https://www.ietf.org/rfc/rfc1951.html RFC 1951: DEFLATE Compressed Data Format Specification version 1.3
* @see #TAG_COMPRESSION
* @see <a href="https://tools.ietf.org/html/rfc1951">DEFLATE specification</a>
Contributor

Does having @spec and @see mean we have two clickable links to the same place adjacent to each other?

@mlbridge

mlbridge bot commented Nov 30, 2022

Mailing list message from Michael StJohns on compiler-dev:

Hi -

I need to repeat again. Please avoid using www.ietf.org as the URL base
for referencing RFCs. The appropriate location is www.rfc-editor.org
and is going to be more stable in the long run than any reference to an
RFC that runs through the IETF's website. These two websites have
different purposes, and the structure of the IETF website has changed at
least once recently and may change again relatively (~5 years) soon.

The most general and correct form for referencing RFCs is
"https://www.rfc-editor.org/info/rfc<number>". That will get you to the
front page with pointers to all of the current semi-canonical versions
of the spec (e.g. text, pdf-a, html, and xml).

Mike

On 11/28/2022 6:27 PM, Phil Race wrote:

@mlbridge

mlbridge bot commented Nov 30, 2022

Mailing list message from Daniel Fuchs on core-libs-dev:

We hear you Mike :-)

I have logged https://bugs.openjdk.org/browse/JDK-8297755 to update the
links in java.net / java.net.http API documentation.

best regards,

-- daniel

On 29/11/2022 03:14, Michael StJohns wrote:

@mlbridge

mlbridge bot commented Dec 1, 2022

Mailing list message from Jonathan Gibbons on compiler-dev:

On 11/10/22 8:18 PM, Chris Plummer wrote:

On Thu, 10 Nov 2022 01:10:13 GMT, Jonathan Gibbons <jjg at openjdk.org> wrote:

Please review a "somewhat automated" change to insert `@spec` tags into doc comments, as appropriate, to leverage the recent new javadoc feature to generate a new page listing the references to all external specifications listed in the `@spec` tags.

"Somewhat automated" means that I wrote and used a temporary utility to scan doc comments looking for HTML links to selected sites, such as `ietf.org`, `unicode.org`, `w3.org`. These links may be in the main description of a doc comment, or in `@see` tags. For each link, the URL is examined, and "normalized", and inserted into the doc comment with a new `@spec` tag, giving the link and title for the spec.

"Normalized" means...
* Use `https:` where possible (includes pretty much all cases)
* Use a single consistent host name for all URLs coming from the same spec site (i.e. don't use different aliases for the same site)
* Point to the root page of a multi-page spec
* Use a consistent form of the spec, preferring HTML over plain text where both are available (this mostly applies to IETF specs)

In addition, a "standard" title is determined for all specs, determined either from the content of the (main) spec page or from site index pages.

The net effect is (or should be) that **all** the changes are to just **add** new `@spec` tags, based on the links found in each doc comment. There should be no other changes to the doc comments, or to the implementation of any classes and interfaces.

That being said, the utility I wrote does have additional abilities, to update the links that it finds (e.g. changing to use `https:` etc,) but those features are _not_ being used here, but could be used in followup PRs if component teams so desired. I did notice while working on this overall feature that many of our links do point to "outdated" pages, some with eye-catching notices declaring that the spec has been superseded. Determining how, when and where to update such links is beyond the scope of this PR.

Going forward, it is to be hoped that component teams will maintain the underlying links, and the URLs in `@spec` tags, such that if references to external specifications are updated, this will include updating the `@spec` tags.

To see the effect of all these new `@spec` tags, see http://cr.openjdk.java.net/~jjg/8296546/api.00/

In particular, see the new [External Specifications](http://cr.openjdk.java.net/~jjg/8296546/api.00/external-specs.html) page, which you can also find via the new link near the top of the [Index](http://cr.openjdk.java.net/~jjg/8296546/api.00/index-files/index-1.html) pages.
src/jdk.jdi/share/classes/com/sun/jdi/connect/spi/Connection.java line 105:

103: * If the length of the packet (as indictaed by the first
104: * 4 bytes) is less than 11 bytes, or an I/O error occurs.
105: * @spec jdwp/jdwp-spec.html Java Debug Wire Protocol
http://cr.openjdk.java.net/~jjg/8296546/api.00/jdk.jdi/com/sun/jdi/connect/spi/Connection.html#readPacket()

Within this javadoc page the jdwp-spec.html references are titled "JDWP Specification", but these `@spec` references are titled "Java Debug Wire Protocol". I suggest making them more consistent. There is one more case below and this same issue also applies to TransportService.java. Perhaps the title in jdwp-spec.html should be updated. I think "Java Debug Wire Protocol (JDWP) Specification" would be good.

src/jdk.jdi/share/classes/com/sun/jdi/connect/spi/TransportService.java line 79:

77: * target VM.
78: *
79: * @spec jdwp/jdwp-spec.html Java Debug Wire Protocol
See above comment for Connection.java.

src/jdk.jdi/share/classes/module-info.java line 107:

105: *
106: *
107: * @spec jpda/jpda.html Java Platform Debugger Architecture
http://cr.openjdk.java.net/~jjg/8296546/api.00/jdk.jdi/module-summary.html

`@spec` and `@see` sections end up one right after the other with the same content, except the `@see` section has the preferred hyperlink title. Suggest you remove the `@see` section and also update `@spec` hyperlink title to include "(JPDA)", or update the actual title in the jpda.html doc so it includes "(JPDA)" in it and then rerun your tool.

src/jdk.jdwp.agent/share/classes/module-info.java line 30:

28: *
29: * @spec jdwp/jdwp-spec.html Java Debug Wire Protocol
30: * @spec jdwp/jdwp-transport.html Java Debug Wire Protocol Transport Interface (jdwpTransport)
http://cr.openjdk.java.net/~jjg/8296546/api.00/jdk.jdwp.agent/module-summary.html

The end result here is not very clean. You have the same two specs being referred to just a few lines apart, and the hyperlink titles are not even close to being the same, even though the links are the same. Maybe the `@see` section should be removed.

Chris,

The general problem is that we're starting from an inconsistent code base.

Like you, I believe we should strive for consistency, especially between
the title of the document and the label used in any URLs that point
directly at the document.

This patch is generated by tooling that understands specific families of
specs, including sites like `ietf.org` and the sibling `specs`
directory. I can update the title used in any `@spec` tag for any
document in the sibling specs directory, but it is out of scope for this
work to change the title within any specific document. That would need
to be done in separate Enhancements, preferably by the teams owning the
relevant documents.

As for `@see` tags, the tooling currently has the ability to remove a
`@see` tag if both the URL and title match, although I have not enabled
that option in the work so far. It might be reasonable to
automatically remove the `@see` tag if just the URL matches, meaning it
points to the same page, with no #fragment identifier.

-- Jon
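
The two `@see`-removal policies described above (URL and title both match, versus URL alone matches with no `#fragment`) can be sketched as a small predicate. This is only an illustration and not the tooling used for this PR: the class name `SeeTagCheck`, the method `isRedundant`, the `requireTitleMatch` flag, and the `#section` fragment in the usage example are all hypothetical. The titles are the two quoted in the review above.

```java
import java.net.URI;

// Hypothetical helper; not part of the tooling used for this PR.
final class SeeTagCheck {

    /**
     * Returns true if an @see link could be dropped in favor of an @spec tag.
     * With requireTitleMatch, both URL and title must match (the stricter policy);
     * without it, a matching URL with no #fragment is enough.
     */
    static boolean isRedundant(String specUrl, String specTitle,
                               String seeUrl, String seeTitle,
                               boolean requireTitleMatch) {
        URI spec = URI.create(specUrl);
        URI see = URI.create(seeUrl);
        // A #fragment means the @see points inside the spec, so keep it.
        if (see.getFragment() != null) {
            return false;
        }
        if (!spec.equals(see)) {
            return false;
        }
        return !requireTitleMatch
                || normalize(specTitle).equals(normalize(seeTitle));
    }

    // Collapse runs of whitespace so line-wrapped titles compare equal.
    private static String normalize(String title) {
        return title.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String url = "jdwp/jdwp-spec.html";
        // Same page, different titles: redundant only under the looser policy.
        System.out.println(isRedundant(url, "Java Debug Wire Protocol",
                url, "JDWP Specification", false));
        System.out.println(isRedundant(url, "Java Debug Wire Protocol",
                url, "JDWP Specification", true));
        // Hypothetical fragment: the @see points inside the spec and is kept.
        System.out.println(isRedundant(url, "Java Debug Wire Protocol",
                url + "#section", "Java Debug Wire Protocol", false));
    }
}
```

Note that a plain URL comparison only helps once both links use the same host and form; an `@see` that still points at a different host for the same document would need normalizing first.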

mlbridge bot commented Dec 2, 2022

Mailing list message from Jonathan Gibbons on core-libs-dev:

Mike,

Thank you for the additional info.

In general, the intent of this patch is to leverage the existing links
in the doc comments, but given that there is now an intent to update
those links as well, I have incorporated the change into the latest update.

-- Jon

On 11/28/22 7:14 PM, Michael StJohns wrote:

Hi -

I need to repeat again. Please avoid using www.ietf.org as the URL
base for referencing RFCs. The appropriate location is
www.rfc-editor.org and is going to be more stable in the long run than
any reference to an RFC that runs through the IETF's website. These
two websites have different purposes, and the structure of the IETF
website has changed at least once recently and may change again
relatively (~5 years) soon.

The most general and correct form for referencing RFCs is
"https://www.rfc-editor.org/info/rfc<number>". That will get you to
the front page with pointers to all of the current semi-canonical
versions of the spec (e.g. text, pdf-a, html, and xml).

Mike
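
A minimal sketch of rewriting RFC links into the `https://www.rfc-editor.org/info/rfc<number>` form recommended above. The class `RfcLinks`, the method `toInfoUrl`, and the regular expression are assumptions made for illustration; the pattern covers only the URL shapes that appear in this thread plus a couple of similar ones, and is not the pattern used by the PR's tooling. Links carrying a `#fragment` are deliberately left alone, since the info page is a front page and cannot preserve a pointer into the document.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the accepted URL shapes are an assumption.
final class RfcLinks {

    // Matches e.g. http://www.ietf.org/rfc/rfc919.txt,
    // https://tools.ietf.org/html/rfc1951 and
    // https://www.rfc-editor.org/info/rfc919 (no #fragment allowed).
    private static final Pattern RFC_URL = Pattern.compile(
            "https?://(?:www\\.|tools\\.|datatracker\\.)?"
            + "(?:ietf|rfc-editor)\\.org/.*?rfc(\\d+)(?:\\.(?:txt|html))?/?",
            Pattern.CASE_INSENSITIVE);

    /** Returns the rfc-editor.org info URL, or empty if the link is not recognized. */
    static Optional<String> toInfoUrl(String url) {
        Matcher m = RFC_URL.matcher(url.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of("https://www.rfc-editor.org/info/rfc" + m.group(1));
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://www.ietf.org/rfc/rfc919.txt",
            "https://tools.ietf.org/html/rfc1951",
            "https://www.w3.org/TR/xml/"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + toInfoUrl(s).orElse("(unchanged)"));
        }
    }
}
```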

On 11/28/2022 6:27 PM, Phil Race wrote:

mlbridge bot commented Dec 2, 2022

Mailing list message from Jonathan Gibbons on core-libs-dev:

On 11/28/22 3:27 PM, Phil Race wrote:

On Wed, 23 Nov 2022 18:57:03 GMT, Jonathan Gibbons <jjg at openjdk.org> wrote:

Jonathan Gibbons has updated the pull request incrementally with one additional commit since the last revision:

Remove updates from unexported files
src/java.desktop/share/classes/java/awt/package-info.java line 58:

56: * <li><a href="doc-files/Modality.html">The AWT Modality</a>
57: * <li><a href="{@docRoot}/../specs/AWT_Native_Interface.html">
58: * The Java AWT Native Interface (JAWT)</a>
Why only 1 of these 3?

Only one is a link outside of the overall api/ documentation hierarchy
(i.e. the one whose URL starts with {@docRoot}/../specs/), which is the
focus of the `@spec` tag. The other two links (only one shown in your
email) both point to documentation within the same package.

src/java.desktop/share/classes/java/awt/package-info.java line 62:

60: *
61: * @spec AWT_Native_Interface.html The Java AWT Native Interface Specification and Guide
62: * @since 1.0
I wonder if links to HTML we include in the javadoc should really be treated in the same manner as references to externally defined specifications?
But I also wonder why only the native_interface spec was added and not the other two?

The patch is generated by running a tool that detects existing links to
either the sibling `specs` directory or to well-known hosts that provide
specifications used by JDK. It would be a feature-enhancement of the
`@spec` tag to also accept "stand-alone" HTML files within the `api/`
hierarchy of pages.
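
The selection rule described here (links into the sibling `specs` directory, or to well-known spec hosts, but not to files inside the `api/` hierarchy) can be sketched roughly as follows. Everything in this block is an assumption for illustration: the class `SpecLinkFilter`, the method `isSpecLink`, and the host list, which contains only hosts named in this PR rather than the tool's real list.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;

// Hypothetical filter; not the tool that generated this patch.
final class SpecLinkFilter {

    // Assumed host list: only hosts mentioned in this PR's description.
    private static final Set<String> SPEC_HOSTS =
            Set.of("ietf.org", "unicode.org", "w3.org");

    // Links into the sibling specs directory start with this prefix.
    private static final String SPECS_PREFIX = "{@docRoot}/../specs/";

    /** Returns true if the href would qualify for an @spec tag under this sketch. */
    static boolean isSpecLink(String href) {
        if (href.startsWith(SPECS_PREFIX)) {
            return true;
        }
        if (href.startsWith("{@docRoot}")) {
            return false; // some other page inside the api/ hierarchy
        }
        URI uri;
        try {
            uri = new URI(href);
        } catch (URISyntaxException e) {
            return false;
        }
        String host = uri.getHost();
        if (host == null) {
            return false; // relative link, e.g. doc-files/Modality.html
        }
        host = host.toLowerCase(Locale.ROOT);
        for (String known : SPEC_HOSTS) {
            // Accept the host itself and any subdomain of it.
            if (host.equals(known) || host.endsWith("." + known)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isSpecLink("{@docRoot}/../specs/AWT_Native_Interface.html"));
        System.out.println(isSpecLink("doc-files/Modality.html"));
        System.out.println(isSpecLink("https://tools.ietf.org/html/rfc1951"));
    }
}
```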

src/java.desktop/share/classes/javax/imageio/plugins/tiff/BaselineTIFFTagSet.java line 226:

224: * @spec https://www.ietf.org/rfc/rfc1951.html RFC 1951: DEFLATE Compressed Data Format Specification version 1.3
225: * @see #TAG_COMPRESSION
226: * @see <a href="https://tools.ietf.org/html/rfc1951">DEFLATE specification</a>
Does having `@spec` and `@see` mean we have two clickable links to the same place adjacent to each other?

At this time yes, although the tooling does currently allow `@see` tags
to be removed if the URL _and title_ match that used for the `@spec`
tag. Not all `@see` tags to a spec should be removed, since some may
point to places within a spec, perhaps using a `#fragment` identifier,
or to a sub-page within a multi-page spec. It is my expectation that we
may want to do a manual pass over the doc comments to examine places
where there may be duplication, such that the `@see` tag can be updated
and/or removed. That manual pass might also include updating to more
normative URLs ... see the separate email discussion in the PR comments
about changing `ietf.org` to `rfc-editor.org`. Any such manual work
would need to be done in conjunction with the relevant component teams.

@dfuch dfuch left a comment

I have reviewed again the java.net, java.net.http, jdk.httpserver, java.naming, and javax.management changes - and I spotted a few places where the @spec duplicates an @see (noted below). I believe the duplicated @see should be removed now.

Comment on lines +60 to 62
* @spec https://www.rfc-editor.org/info/rfc919 RFC 919: Broadcasting Internet Datagrams
* @see <a href="http://www.ietf.org/rfc/rfc919.txt">RFC&nbsp;929:
* Broadcasting Internet Datagrams</a>

This @see line should now be removed since it's referencing the exact same document.

Comment on lines +81 to 83
* @spec https://www.rfc-editor.org/info/rfc1122 RFC 1122: Requirements for Internet Hosts - Communication Layers
* @see <a href="http://www.ietf.org/rfc/rfc1122.txt">RFC&nbsp;1122
* Requirements for Internet Hosts -- Communication Layers</a>

Same remark here: please remove the @see

Comment on lines 153 to 154
* @see <a href="http://www.ietf.org/rfc/rfc1323.txt">RFC&nbsp;1323: TCP
* Extensions for High Performance</a>

Remove the @see

Comment on lines +185 to 186
* @spec https://www.rfc-editor.org/info/rfc793 RFC 793: Transmission Control Protocol
* @see <a href="http://www.ietf.org/rfc/rfc793.txt">RFC&nbsp;793: Transmission

Remove the @see

Comment on lines +375 to 377
* @spec https://www.rfc-editor.org/info/rfc1122 RFC 1122: Requirements for Internet Hosts - Communication Layers
* @see <a href="http://www.ietf.org/rfc/rfc1122.txt">RFC&nbsp;1122:
* Requirements for Internet Hosts -- Communication Layers</a>

Remove the @see

Comment on lines +118 to 122
* @spec https://www.rfc-editor.org/info/rfc2609 RFC 2609: Service Templates and Service: Schemes
* @spec https://www.rfc-editor.org/info/rfc3111 RFC 3111: Service Location Protocol Modifications for IPv6
* @see <a
* href="http://www.ietf.org/rfc/rfc2609.txt">RFC 2609,
* "Service Templates and <code>Service:</code> Schemes"</a>

The two @see should now be removed

bridgekeeper bot commented Dec 30, 2022

@jonathan-gibbons This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper bot commented Jan 27, 2023

@jonathan-gibbons This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.
