mono_dl_fallback_register & __Internal interact poorly #20295

@jonpryor

Context: dotnet/android#5031
Context: https://devdiv.visualstudio.com/DevDiv/_build/results?buildId=3998131&view=logs&j=db00894d-3ef4-5d97-073c-254fbd613a41&t=379a81d4-3138-5f28-ec7b-ce4074947b64
Context: dotnet/java-interop@2cf8ac9

Xamarin.Android uses mono_dl_fallback_register() so that it can properly load native libraries and resolve symbols from those native libraries on Android.

A recent set of changes -- dotnet/java-interop#691 and dotnet/android#5031 -- has demonstrated that the fallback callbacks don't behave the way I expected, and the actual behavior is rather confusing.

Steps to Reproduce

  1. Download and extract mono-dlsym-cb.zip

  2. Build it on macOS:

     make 
    
  3. Run it:

     cp App.exe.config.in App.exe.config
     make run
    

Current Behavior

Unhandled Exception:
System.EntryPointNotFoundException: xa_intern_foo assembly:<unknown assembly> type:<unknown type> member:(null)
  at (wrapper managed-to-native) App.xa_intern_foo()
  at App.Main () [0x00001] in <bda945d4a1d14202bdc8e32002d2d281>:0 

Expected Behavior

It works?

On which platforms did you notice this

  • macOS
  • Linux
  • Windows

Version Used:

$ mono --version
Mono JIT compiler version 6.12.0.43 (2020-02/d90665a422e Fri Mar 13 11:52:42 EDT 2020)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com

Discussion

The repro builds mono-dlsym.cc, which does two things:

  1. Calls mono_dl_fallback_register(), providing callbacks for the load_func, symbol_func, and close_func parameters.
  2. Calls mono_main(), because it's easier that way.

The provided callbacks in turn print out that they've been called.

The repro also builds libxa-intern.dylib, which contains a single export xa_intern_foo(), and App.exe, which has a P/Invoke to xa_intern_foo():

#if INTERNAL
    const string LibName = "__Internal";
#else
    const string LibName = "xa-intern";
#endif
    [DllImport (LibName)]
    static extern void xa_intern_foo ();

For "normal" execution, take the repro steps above but don't create App.exe.config:

$ rm App.exe.config
$ make clean all
$ ./mono-dlsym App.exe
# jonp: _load_lib: name=/Users/jon/Documents/Developer/tmp/mono-dlsym-cb/App.exe.dylib
# jonp:   flags= RTLD_LOCAL RTLD_LAZY
# jonp: _load_lib: name=/Library/Frameworks/Mono.framework/Versions/6.12.0/lib/mono/aot-cache/amd64/App.exe.dylib
# jonp:   flags= RTLD_LOCAL RTLD_LAZY
# jonp: _load_lib: name=/Users/jon/Documents/Developer/tmp/mono-dlsym-cb/App.exe.dylib
# jonp:   flags= RTLD_LOCAL RTLD_LAZY
# jonp: _load_lib: name=/Library/Frameworks/Mono.framework/Versions/6.12.0/lib/mono/aot-cache/amd64/App.exe.dylib
# jonp:   flags= RTLD_LOCAL RTLD_LAZY
# jonp: _load_lib: name=/Users/jon/Documents/Developer/tmp/mono-dlsym-cb/xa-intern
# jonp:   flags= RTLD_LOCAL RTLD_LAZY
xa-intern: xa_intern_foo

This doesn't crash, and the P/Invoke succeeds, but it still doesn't work entirely "as expected": only the load_func callback is invoked; the symbol_func callback is never called.

I find this behavior entirely unexpected: why have a symbol_func callback if it's never invoked?

If we build with make clean all USE_INTERNAL=1 or with App.exe.config in place -- either/both of which will cause the P/Invoke to load __Internal instead of libxa-intern.dylib -- then execution fails with the System.EntryPointNotFoundException:

% ./mono-dlsym App.exe
…

Unhandled Exception:
System.EntryPointNotFoundException: xa_intern_foo assembly:<unknown assembly> type:<unknown type> member:(null)
  at (wrapper managed-to-native) App.xa_intern_foo()
  at App.Main () [0x00001] in <08da46aa0c6a45daa3b6057aab83724c>:0 

Again, _get_sym() -- the callback for symbol_func -- is never executed.

Why isn't the symbol callback ever executed?

Furthermore, experience with Xamarin.Android suggests that when resolving P/Invokes to __Internal, load_func should be invoked with a library name of NULL. That isn't happening in this app.

I do not understand why the request to load __Internal is never passed to load_func/_load_lib().

The Xamarin.Android angle

With dotnet/java-interop#691 and dotnet/android#5031 I'm trying to move away from dlopen() in the load_func callback on Windows, and instead use the being-added java_interop_load_library() and java_interop_get_symbol_address() functions. On CI, this change causes the macOS Designer integration tests to fail unless I load the requested libraries as RTLD_GLOBAL. As seen with the test app attached to this issue, only java_interop_load_library() is invoked on macOS; java_interop_get_symbol_address() never is.

__Internal enters the picture because of the function that causes the failure:

	Renderer >> 4 [monodroid] Calling into managed runtime init
	Renderer (error) >>
	Renderer (error) >> Unhandled Exception:
	Renderer (error) >> System.EntryPointNotFoundException: java_interop_jnienv_get_java_vm assembly:<unknown assembly> type:<unknown type> member:(null)
	Renderer (error) >> at (wrapper managed-to-native) Java.Interop.NativeMethods.java_interop_jnienv_get_java_vm(intptr,intptr&)
	Renderer (error) >> at Java.Interop.JniEnvironment+References.GetJavaVM (System.IntPtr jnienv, System.IntPtr& vm) [0x00000] in <0f003a4904fd44d0a8cc6a63962ab40b>:0
	Renderer (error) >> at Java.Interop.JniEnvironmentInfo.set_EnvironmentPointer (System.IntPtr value) [0x00037] in <0f003a4904fd44d0a8cc6a63962ab40b>:0
	Renderer (error) >> at Java.Interop.JniEnvironmentInfo..ctor (System.IntPtr environmentPointer, Java.Interop.JniRuntime runtime) [0x00006] in <0f003a4904fd44d0a8cc6a63962ab40b>:0
	Renderer (error) >> at Java.Interop.JniRuntime..ctor (Java.Interop.JniRuntime+CreationOptions options) [0x0017b] in <0f003a4904fd44d0a8cc6a63962ab40b>:0
	Renderer (error) >> at Android.Runtime.AndroidRuntime..ctor (System.IntPtr jnienv, System.IntPtr vm, System.Boolean allocNewObjectSupported, System.IntPtr classLoader, System.IntPtr classLoader_loadClass, System.Boolean jniAddNativeMethodRegistrationAttributePresent) [0x00000] in /Users/builder/azdo/_work/4/s/xamarin-android/src/Mono.Android/Android.Runtime/AndroidRuntime.cs:25

java_interop_jnienv_get_java_vm is from java-interop, which we remap to __Internal via config.xml. As we in turn remap __Internal to libxa-internal-api.dylib, we should be attempting to resolve libxa-internal-api.dylib!java_interop_jnienv_get_java_vm. I know libxa-internal-api.dylib is being loaded (printfs!), but Xamarin.Android is never called to resolve java_interop_jnienv_get_java_vm. Consequently, unless libxa-internal-api.dylib is loaded as RTLD_GLOBAL, no symbols can be resolved, and things crash.
