@jonpryor
Contributor

Context: #623
Context: #623 (comment)

DO NOT MERGE UNTIL AFTER PR #623 IS MERGED.

Update `class-parse --docspath=PATH` so that if `PATH` contains
`<javadoc/>` elements, as produced by `tools/java-source-utils` (PR #623),
then those `<javadoc/>` elements will be inserted into the generated
API description.

The intent is to eventually allow `generator` to emit the `<javadoc/>`
data as C# XML Documentation, allowing a pipeline of:

    java -jar java-source-utils.jar path/to/android.jar --output-javadoc android.xml
    mono class-parse.exe --docspath=android.xml -o api.xml …
    mono generator.exe api.xml …

Context: https://github.com/javaparser/javaparser/tree/javaparser-parent-3.16.1

There are two parts of the current `.jar` binding toolchain which are
painful and could be improved:

 1. Parameter names
 2. Documentation extraction

Parameter names (1) are important because they become the names of
event members as part of ["event-ification"][0].  As such they are
semantically important, and the default behavior of "p0" makes for a
terrible user experience.

*If* the `.class` files in the `.jar` file are built with
`javac -parameters` (4273e5c), then the `.class` file will contain
parameter names and we're good.  However, this may not be the case.

If the `.class` files are built with `javac -g`, then we'll try to
deduce parameter names from debug info, but that's also problematic.

What else can be used to provide parameter names?

It is not unusual for Java libraries to provide "source `.jar`" files,
e.g. Android provides `android-stubs-src.jar` files, and other
libraries may provide a `*-sources.jar` file.  The contents of these
files are *Java source code*.  These files are used by Android IDEs to
provide documentation for the Java library.  They contain classes,
methods, parameter names, and associated Javadoc documentation.

What they are *not* guaranteed to do is *compile*.  As such, we can't
compile them ourselves with `javac -parameters` and then process the
`.class` files, as they may refer to unresolvable types.

"Interestingly", we *already* have some tooling to deal with this:
`tools/param-name-importer` uses a custom Irony grammar to parse
the Android SDK `*-stubs-src.jar` files to grab parameter names.
However, this tooling is *too strict*; throw arbitrary Java
source code at it, and it quickly fails.

Which brings us to documentation (2): we have a [javadoc2mdoc][1] tool
which will parse Javadoc HTML documentation and convert it into
[**mdoc**(5)][2] documentation, which can be later turned into
[XML documentation comments][3] files by way of
[**mdoc export-msxdoc**(1)][4], but this tool is (1) painful to
maintain, because it processes Javadoc *HTML*, and
(2) *requires Javadoc HTML*.

Google hasn't updated their downloadable Javadoc `.zip` file since
API-24 (2016-October).  API-30 is currently stable.

If we want newer docs, we either need to scrape the
developer.android.com/reference website to use with the existing
tooling, or...  we need to be able to read the Javadoc comments within
the `*-stubs-src.jar` files provided with the Android SDK.
(Note: Android SDK docs are Apache 2; file format conversion is fine.)

We thus have two use-cases for which parsing Java source code would
be useful.

As luck would have it, there's a decent Apache 2-licensed Java project
which supports parsing Java source code: [JavaParser][5].

Add a new `tools/java-source-utils` program which will parse Java
source code to produce two artifacts: parameter names and
consolidated Javadoc documentation:

	$ java -jar java-source-utils.jar --help
	java-source-utils [-v] [<-a|--aar> AAR]* [<-j|--jar> JAR]* [<-s|--source> DIRS]*
		[--bootclasspath CLASSPATH]
		[<-P|--output-params> OUT.params.txt] [<-D|--output-javadoc> OUT.xml] FILES

Provide `--output-params OUT.params.txt`, and the specified file will
be created which follows the file format laid out in
[`JavaParameterNamesLoader.cs`][6]:

	package java.lang
	;---------------------------------------
	  class Object
	    wait(long timeout)
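
To illustrate the shape of this format, here is a minimal reader sketch in Python. It is not `JavaParameterNamesLoader.cs`; it handles only the constructs shown in this description. Treating lines that start with `;` as separators, accepting an `interface` keyword, and tolerating a trailing `;` on method lines are assumptions, and `parse_params` is a hypothetical name. Generic types containing commas are not handled.

```python
import re

# name(type name, type name) -- a method entry under the current type.
# A trailing ';' is tolerated because one example in this PR uses it.
METHOD_RE = re.compile(r"^(?P<name>[^\s(]+)\((?P<params>.*)\);?$")


def parse_params(text):
    """Return {(package, type, method, (param types...)): [param names]}."""
    entries = {}
    package = type_name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank line or separator/comment line
        if line.startswith("package "):
            package, type_name = line.split(None, 1)[1], None
        elif line.startswith(("class ", "interface ")):
            type_name = line.split(None, 1)[1]
        else:
            match = METHOD_RE.match(line)
            if match is None or type_name is None:
                raise ValueError(f"unexpected line: {raw!r}")
            # Each parameter is "<type> <name>"; split on the last whitespace.
            pairs = [p.strip().rsplit(None, 1)
                     for p in match.group("params").split(",") if p.strip()]
            types = tuple(t for t, _ in pairs)
            entries[(package, type_name, match.group("name"), types)] = \
                [n for _, n in pairs]
    return entries


SAMPLE = """\
package java.lang
;---------------------------------------
  class Object
    wait(long timeout)
"""

names = parse_params(SAMPLE)
```

Keying on the parameter types keeps overloads of the same method name apart.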

Provide `--output-javadoc OUT.xml`, and the resulting file will be a
`class-parse`-like XML file which uses `//@jni-signature` as the "key"
and a child `<javadoc/>` element to contain documentation, e.g.:

	<api api-source="java-source-utils">
	  <package name="java.lang">
	    <class name="Object" jni-signature="Ljava/lang/Object;">
	      <javadoc>…</javadoc>
	      <constructor jni-signature="()V">
	        <javadoc>…</javadoc>
	      </constructor>
	      <method name="wait" jni-signature="(J)V" jni-returns="V" returns="void">
	        <parameter name="timeout" jni-type="J" type="long" />
	        <javadoc>…</javadoc>
	      </method>
	    </class>
	  </package>
	</api>

This should make it possible to update the Xamarin.Android API
documentation without resorting to web scraping (and updating the code
to deal with whatever new HTML dialects are now used).

If neither `--output-params` nor `--output-javadoc` is used, then
`--output-javadoc` is assumed, writing to stdout.

The XML file *also* contains parameter name information, so that one
file can be the "source of truth" for parameter names and
documentation.
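
As a sketch of what "one source of truth" could mean for a consumer, the following Python reads parameter names straight out of the XML. It assumes the element and attribute names shown in the example above; `parameter_names` and the key layout are hypothetical, and the Javadoc bodies are placeholders.

```python
import xml.etree.ElementTree as ET


def parameter_names(docs_xml):
    """Map (class signature, member tag, name, member signature) -> names."""
    names = {}
    root = ET.fromstring(docs_xml)
    for package in root.iter("package"):
        for type_elem in package:
            class_sig = type_elem.get("jni-signature")
            for member in type_elem:
                if member.tag not in ("constructor", "method"):
                    continue  # skips the type's own <javadoc/> child
                key = (class_sig, member.tag, member.get("name"),
                       member.get("jni-signature"))
                names[key] = [p.get("name")
                              for p in member.findall("parameter")]
    return names


DOCS_XML = """<api api-source="java-source-utils">
  <package name="java.lang">
    <class name="Object" jni-signature="Ljava/lang/Object;">
      <javadoc>placeholder</javadoc>
      <constructor jni-signature="()V">
        <javadoc>placeholder</javadoc>
      </constructor>
      <method name="wait" jni-signature="(J)V" jni-returns="V" returns="void">
        <parameter name="timeout" jni-type="J" type="long" />
        <javadoc>placeholder</javadoc>
      </method>
    </class>
  </package>
</api>"""

params = parameter_names(DOCS_XML)
```

Constructors carry no `name` attribute in the example, so their key holds `None` in that slot.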

`FILES` can be:

  * Java source code in a `.java` file; or
  * A file with a `.jar` or `.zip` extension, which will be extracted
    into a temp directory and all `.java` files within the directory
    will be processed; or
  * A directory tree, and all `.java` files will be processed.
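
The three input kinds above can be sketched as a single dispatch function. This is an illustrative Python sketch, not the Java code in `tools/java-source-utils`; `collect_java_files` and its `temp_root` parameter are hypothetical names, and cleanup of the extraction directory is left to the caller.

```python
import tempfile
import zipfile
from pathlib import Path

ARCHIVE_SUFFIXES = {".jar", ".zip"}


def collect_java_files(path, temp_root=None):
    """Return the .java files named by one FILES argument."""
    path = Path(path)
    if path.is_dir():
        # A directory tree: every .java file below it.
        return sorted(path.rglob("*.java"))
    suffix = path.suffix.lower()
    if suffix in ARCHIVE_SUFFIXES:
        # An archive: extract to a temp directory, then walk that directory.
        dest = Path(tempfile.mkdtemp(dir=temp_root))
        with zipfile.ZipFile(path) as archive:
            archive.extractall(dest)
        return sorted(dest.rglob("*.java"))
    if suffix == ".java":
        # A single source file.
        return [path]
    raise ValueError(f"unsupported input: {path}")
```

Checking for a directory first means a directory whose name happens to end in `.jar` is still walked rather than opened as an archive.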

If a single file references other types, the "root" directory containing
those types may need to be specified via `--source DIR`:

	$ java -jar "bin/Debug/java-source-utils.jar" -v \
	  -s $HOME/android-toolchain/sdk/platforms/_t \
	  $HOME/android-toolchain/sdk/platforms/_t/android/app/Activity.java \
	  -P android.params.txt -D android.xml >o.txt 2>&1

TODO:

In some scenarios, types won't be resolvable.  What should output be?

We don't want to *require* that everything be resolvable -- it's painful, and
possibly impossible, e.g. w/ internal types -- so instead we should "demark"
the unresolvable types.

`.params.txt` output will use `.*` as a type prefix, e.g.

	method(.*UnresolvableType foo, int bar);

`docs.xml` will output `L.*UnresolvableType;`.

Fix JavaParameterNamesLoader.cs to support the above.
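
A consumer would need to recognize both marker forms. The sketch below, in Python, shows one way to detect them; it assumes the `.*` prefix and `L.*Name;` spelling proposed in this TODO (which may still change), and the function names and the example signature are hypothetical.

```python
import re

UNRESOLVED_PREFIX = ".*"
# Matches the object-type form proposed above: L.*UnresolvableType;
_UNRESOLVED_JNI = re.compile(r"L\.\*([^;]+);")


def is_unresolved_type(type_name):
    """True for a .params.txt type such as '.*UnresolvableType'."""
    return type_name.startswith(UNRESOLVED_PREFIX)


def unresolved_in_signature(jni_signature):
    """List the unresolved type names inside a JNI signature."""
    return _UNRESOLVED_JNI.findall(jni_signature)


# Hypothetical signature for method(.*UnresolvableType foo, int bar),
# assuming a void return type.
found = unresolved_in_signature("(L.*UnresolvableType;I)V")
```

Because `.` and `*` cannot appear at the start of a resolved JNI class name, the marker cannot collide with a real type.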

[0]: https://docs.microsoft.com/en-us/xamarin/android/internals/api-design#events-and-listeners
[1]: https://github.com/xamarin/xamarin-android/tree/d48cf04f9749664bf48fc16bcb920d5d941cccab/tools/javadoc2mdoc
[2]: http://docs.go-mono.com/?link=man%3amdoc(5)
[3]: https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/xmldoc/
[4]: http://docs.go-mono.com/?link=man%3amdoc-export-msxdoc(1)
[5]: https://javaparser.org
[6]: https://github.com/xamarin/java.interop/blob/93df5a200e7b6f1b5add451aff66bbcb24293720/src/Xamarin.Android.Tools.Bytecode/JavaParameterNamesLoader.cs#L45-L68
@jonpryor jonpryor force-pushed the jonp-class-parse-merge-javadocs branch from 6534f6f to cee9974 Compare July 30, 2020 20:17
@jonpryor jonpryor marked this pull request as draft July 30, 2020 20:19
@jpobst
Contributor

jpobst commented Jul 31, 2020

I wonder if we should try to do the "api documentation" part of this directly in generator instead of class-parse.

Doing it in class-parse means we have to read/write/persist it through all of the various xml files that our pipeline consists of:

api.xml.class-parse -> api.xml -> api.xml.fixed

I suspect the resulting xml is quite large.

On the other hand if it runs after we apply metadata it's probably going to make matching them harder/impossible.

Is the intention to run this on every CI build? Once or for each API level? How long does it take to run on Mono.Android.dll?

@jonpryor
Contributor Author

@jpobst wrote:

> I wonder if we should try to do the "api documentation" part of this directly in generator instead of class-parse.

Probably! Half the point here is to figure out how this should best be structured. I've implemented it so that there's only "one" `api.xml` file, and -- with PR #685 -- `api.xml.fixed` contains the `<javadoc/>` elements.

Just because it "works" that way doesn't mean it's correct to do it that way.

> I suspect the resulting xml is quite large.

Indeed. For API-29 -- because my API-30 doesn't have any Javadoc comments! -- `java -jar java-source-utils.jar path/to/android.jar --output-javadoc android.xml` emits a 38MB file. After `generator` invocation, with PR #685, `api.xml.fixed`, `api.xml.adjusted`, and `api.xml.adjusted.fixed` are 48MB in size.

> On the other hand if it runs after we apply metadata it's probably going to make matching them harder/impossible.

The way PR #684 works is by using `//@jni-signature` as the way to link things together. That should be a fairly reliable way to match things, hopefully, except for the TODO in PR #623 regarding "what about unresolvable types?", which I forgot to take into consideration. :-/

> Is the intention to run this on every CI build?

Yes.

> Once or for each API level?

Ideally, we'd run `java-source-utils.jar` against only one API level, API-30, and reuse those docs for all API level builds.

This might not actually be possible because of parameter name changes between API levels, e.g. using the `android-javadocs.xml` from API-29 w/ API-21 will almost certainly change parameter names, so this requires more thought.

Additionally, my currently installed API-30 contains no Javadoc information. (Perhaps a more recent API-30 will contain docs?) Thus, for now, we'd need to use the API-29 docs when emitting API-30 files.

> How long does it take to run on Mono.Android.dll?

I'm not quite sure what you're asking here. A `java-source-utils.jar` invocation for API-29's `platforms/android-29/android-stubs-src.jar` takes ~90 seconds to run. It's not very long, but it's certainly noticeable.

@jpobst
Contributor

jpobst commented Jul 31, 2020

That is actually much faster and smaller than I was expecting. I was expecting it to take ~5 minutes and generate ~500 MB files. :)

When we integrate it into the Mono.Android.dll build, let's flag it as something the CI opts into so that it doesn't run by default. There are days where I build Mono.Android.dll locally 30+ times and it takes long enough already.

@jonpryor
Contributor Author

> There are days where I build Mono.Android.dll locally 30+ times and it takes long enough already.

That's a statement that also confuses me. The way I'm envisioning this -- subject to "meeting reality"! -- is that we'd generate an `android-javadocs.xml` file "once", à la `bin/BuildDebug/api/api-29.xml.in` & co, which would be merged into `api.xml`. I don't see this as significantly altering the actual build time, but it's certainly something to investigate…

@jonpryor
Contributor Author

jonpryor commented Aug 3, 2020

Superseded by PR #687: We'll add a `generator --with-javadoc-xml` option, and have `generator` "merge" the documentation with the members at code emit time, instead of having `class-parse` do the merge.

Bonus: it means PR #685 is superseded, and it doesn't need to update the api.xml verification code in src/Xamarin.Android.Tools.ApiXmlAdjuster!

@jonpryor jonpryor closed this Aug 3, 2020
@github-actions github-actions bot locked and limited conversation to collaborators Apr 13, 2024