Skip to content

Conversation

@lateralusX
Copy link
Member

@lateralusX lateralusX commented Jul 18, 2023

.net8 adds support for dotnet-gcdump on Mono, meaning dotnet-gcdump will be used targeting mobile platforms. Currently dotnet-gcdump doesn't have support needed since it can only connect using pid or process name and has logic to make sure the process connected to has the same pid as the -p argument passed to dotnet-gcdump.

On mobile platforms, runtime is running on device and communicates using TCP/IP (over loopback interface, using Android adb port forwarding or usbmux on iOS). All logic related to communicate with devices on different platforms is handled by dotnet-dsrouter, that expose the same IPC channels as a normal runtime would do, meaning most diagnostic tools can connect to dotnet-dsrouter that will route all communication to/from device. Tools like dotnet-trace can leverage --diagnostic-port=,connect instead of pid to connect to a named IPC channel on dotnet-dsrouter (routed to a TCP/IP port on device). It is also possible to connect directly towards dotnet-dsrouters pid since it will masquerade like a regular runtime, but it will not alter the pid passed in EventPipe messages, meaning that dotnet.gcdump's pid checks currently breaks that scenario.

This PR extends diagnostic tooling support for mobile and dotnet-dsrouter scenarios.

  • Add support for --diagnostic-port argument in dotnet-gcdump, but only support connect mode, since listen mode (reverse diagnostic server) is mainly for startup tracing where GC dump is not full supported.
  • Add support for new command, convert, in dotnet-gcdump that can take a nettrace file as input and convert it into a gcdump file. Can be useful if GC dumps have been captured by tools like dotnet-trace, meaning that dotnet-gcdump will be able to convert it into a gcdump.
  • Add ability to tell diagnostic tools currently supporting --diagnostic-port command that process in -p is a dsrouter process. This will make the tools construct a default IPC channel used by dsrouter based on the -p parameter, simplify the connect scenarios against dsrouter from tools a lot, since it can use the default IPC channel setup by dsrouter.
  • Always setup default IPC server channel in dsrouter if no specific ipcs argument has been set.
  • Fix dsrouter ctrl-c shutdown issue + additional error logging for diagnostic.

Since support for dotnet-gcdump is included in .net8 runtime P7 it would be good to get these changes included in a diagnostic release available around the same time as .net8 p7.

@lateralusX lateralusX force-pushed the lateralusX/extend-dotnet-gcdump branch from 08ca6e1 to 56b77df Compare July 19, 2023 10:15
@lateralusX lateralusX force-pushed the lateralusX/extend-dotnet-gcdump branch from 56b77df to f4a5528 Compare July 19, 2023 15:12
@lateralusX lateralusX marked this pull request as ready for review July 19, 2023 16:10
@lateralusX lateralusX requested a review from a team as a code owner July 19, 2023 16:10
@lateralusX
Copy link
Member Author

One option that could be looked into is instead of having the user manually specify the --dsrouter argument to tools like gcdump, trace, counter ect to connect to the dsrouter default IPC channel (when using -p), we could have the tools first evaluate if the process has setup the dsrouter specific IPC channel and pick that over the default, that way, there is no risk that users forgets to use --dsrouter and end up using the default IPC channel of dsrouter itself. Will look into change using that pattern for this PR.

@lateralusX lateralusX force-pushed the lateralusX/extend-dotnet-gcdump branch from 7abd780 to ddedef5 Compare July 20, 2023 15:15
@lateralusX
Copy link
Member Author

lateralusX commented Jul 20, 2023

Dropped the need for --dsrouter argument initially added in this PR, instead I change the default heuristics currently identifying default runtime IPC diagnostic channel. Regular dotnet runtime processes will create its default IPC channel using dotnet-diagnostics-{pid} on Windows and dotnet-diagnostic-{pid}-{time}-socket on other platforms. In dotnet-dsrouter case it will also create a default IPC channel (if configured that way) named dotnet-diagnostics-dsrouter-{pid} on Windows and dotnet-diagnostic-dsrouter-{pid}-{time}-socket on other platforms. Tooling will now check if both IPC channels exists and if they do and dsrouter's IPC channel is newer, tooling will pick that one as default IPC listener, in all other cases it will pick the same default as today.

This greatly simplify the default dotnet-dsrouter use case since dotnet-dsrouter can now be started without a named IPC channel and all diagnostic tooling can connect to the routed IPC channel using -p <pid> argument only. Previously, dotnet-dsrouter needs to be started using a custom IPC channel and then only diagnostic tooling handling --diagnostic-port argument would be capable of communicating with the router IPC channel.

@lateralusX lateralusX force-pushed the lateralusX/extend-dotnet-gcdump branch 2 times, most recently from 935e858 to 6405f16 Compare July 21, 2023 12:40
@lateralusX
Copy link
Member Author

Looks like the -p scenario in dotnet-gcdump got lost when dropping the use of --dsrouter argument, will need to re-add that to the PR or dotnet-gcdump will ignore dumps since the pid of attached process (dotnet-dsrouter) won't match the pid in returned EventPipe events (pid of mobile process).

@lateralusX lateralusX force-pushed the lateralusX/extend-dotnet-gcdump branch from a2edbf2 to fd5c7e9 Compare July 26, 2023 09:02
@lateralusX
Copy link
Member Author

Successfully retrieving gcdump's from app running on iOS simulator using dotnet-dsrouter and dotnet-gcdump:

image

Copy link
Member

@tommcdon tommcdon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks like a great step forward for mono diagnostic tooling. I scanned the code changes and nothing jumped out to me. I trust that the right level of validation was done against the code change, and so I am approving.
We should ensure we follow-up with documentation for the new command-line args and support, along with a new walkthrough.

@lateralusX lateralusX merged commit 88dab75 into dotnet:main Aug 29, 2023
ghost pushed a commit that referenced this pull request Aug 29, 2023
dotnet-dsrouter can be a little hard to configure for mobile use cases
since it needs a number of arguments, both to setup its local IPC
client|server and the corresponding TCP client|server and what arguments
to use depends on what mobile platform and scenario the user runs.

There are currently a number of different scenarios described in
different sources of documentation:

Runtime docs:


https://github.com/dotnet/runtime/blob/main/docs/design/mono/diagnostics-tracing.md

iOS SDK docs:

https://github.com/xamarin/xamarin-macios/wiki/Profiling

Android SDK docs:


https://github.com/xamarin/xamarin-android/blob/main/Documentation/guides/tracing.md

They all fall into a number of predefined scenarios and a number of
"default" parameters that should be used.

This PR creates 4 new commands in dotnet-dsrouter to explicitly use the
defaults described in the documentation, ios-sim, ios, android-emu,
android. They all fallback to default IPC server listener for dsrouter
and in order to make that simpler in use with diagnostic tooling,
changes done in #4081 is also
needed to simplify the process.

So lets take an example form the docs, running an app on iOS simulator.
Before this PR the following dsrouter command needs to be run:

```
dotnet-dsrouter client-server -ipcc ~/my-sim-port -tcps 127.0.0.1:9000

launch app with DOTNET_DiagnosticPorts=127.0.0.1:9000

dotnet-trace collect --diagnostic-port ~/my-sim-port --format speedscope
```

With this PR (and #4081) the
above will look like:

```
dotnet-dsrouter ios-sim

launch app with DOTNET_DiagnosticPorts=127.0.0.1:9000

dotnet-trace collect -p:<dsrouter pid> --format speedscope
```

dontet-dsrouter will output both its pid as well as what
DOTNET_DiagnosticPorts that can be used to connect to it on startup.

Using a different mobile platform/scenario, like android emulator is
pretty much identical, just switch ios-sim to android-emu.
dotnet-dsrouter will output the exact DOTNET_DiagnosticPorts that should
be used with any of the configured scenarios.
@jonathanpeppers
Copy link
Member

@lateralusX can you share the example commands you ran for: #4081 (comment)

I was trying this on an Android device:

> adb reverse tcp:9000 tcp:9001
> adb shell setprop debug.mono.profile '127.0.0.1:9000'
> dotnet-dsrouter client-server -tcps 127.0.0.1:9001 -ipcc /tmp/maui-app --verbose debug
(this works if I connect dotnet-trace)
> dotnet-gcdump collect --diagnostic-port /tmp/maui-app
--diagnostic-port is only supporting connect mode.

dotnet-trace collect --diagnostic-port /tmp/maui-app works in my example, but I feel like I'm missing something here for dotnet-gcdump to work.

I also tried to use the simplified dotnet-dsrouter android --verbose debug, but wasn't successful.

My versions are:

> dotnet tool list -g
Package Id                         Version                       Commands
--------------------------------------------------------------------------------------
dotnet-dsrouter                    7.0.447801                    dotnet-dsrouter
dotnet-gcdump                      7.0.447801                    dotnet-gcdump
dotnet-trace                       7.0.447801                    dotnet-trace

@tranb3r
Copy link

tranb3r commented Oct 13, 2023

Here is what I'm doing (and it works)
I hope it'll help.

> adb shell setprop debug.mono.profile '127.0.0.1:9000,suspend'
> dotnet-dsrouter server-server -tcps 127.0.0.1:9000 -ipcs /tmp/maui-app --verbose debug --forward-port Android
> dotnet-trace collect --diagnostic-port /tmp/maui-app,connect --format speedscope
> dotnet-gcdump collect --diagnostic-port /tmp/maui-app,connect -o memory.gcdump -v

Versions:

Package Id                Version         Commands
---------------------------------------------------------
dotnet-dsrouter           7.0.442301      dotnet-dsrouter
dotnet-gcdump             7.0.447201      dotnet-gcdump
dotnet-trace              7.0.442301      dotnet-trace

@jonathanpeppers
Copy link
Member

@tranb3r thank this got me working on an Android emulator. I think the ,connect is what I was missing on dotnet-gcdump collect --diagnostic-port /tmp/maui-app,connect.

@lateralusX I'm still not able to get some combination of the above on an Android device. Normally, an extra adb reverse tcp:9000 tcp:9001 call is all I need for that to work.

@jonathanpeppers
Copy link
Member

@lateralusX another issue I'm seeing is, I have a type like this:

public partial class MyPage
{
	~MyPage() => Console.WriteLine("~MyPage");

The message never prints, which indicates a leak I want to fix. But the dump does not contain these objects.

When I saw this scenario in .NET 7, I was able to diagnose with these instructions. And used Filip Navara's tool to convert to .gcdump.

@lateralusX
Copy link
Member Author

lateralusX commented Oct 16, 2023

@tranb3r thank this got me working on an Android emulator. I think the ,connect is what I was missing on dotnet-gcdump collect --diagnostic-port /tmp/maui-app,connect.

@lateralusX I'm still not able to get some combination of the above on an Android device. Normally, an extra adb reverse tcp:9000 tcp:9001 call is all I need for that to work.

The simplest way to get going with updated dsrouter is to use the pre-configured profiles, so if you just run dotnet-dsrouter android-emu it will be setup in connect mode, and you can use the pid for dsrouter as the -p when running tools like dotnet-trace or dotnet-gcdump. If you run dotnet-dsrouter android, that will try to setup the default port reverse, but it will need to find the adb tooling, so tool needs to either find it using path, or you setup the ANDROID_SDK_ROOT env variable when launching dotnet-dsrouter to point to the root location of your android sdk. If it has issues to setup the port reverse it will be logged in the console.

It is also possible to run the tools using full set of parameters as previously done, so all previous configurations should still work as they used to do. You can also run the port reverse command as a separate process, in that case you don't need to pass the --forward-port command to dotnet-dsrouter.

@jonathanpeppers So to summarize, if you run the later version of dotnet-dsrouter supporting the preconfigured profiles, you should be able to run like this:

Android emulator:

dotnet-dsrouter android-emu
dotnet-trace collect -p <pid of running dotnet-dsrouter>

Android device:

dotnet-dsrouter android
dotnet-trace collect -p <pid of running dotnet-dsrouter>

If you need details on the mode you run and how you should setup the env variable, dotnet-dsrouter (7.0.451301) supports --info argument that will give you additional details.

Also note that the preconfigured android profile uses tcp:9000 tcp:9000 for its port reverse so it doesn't use different ports as I believe you used in the past since its not really needed. But as long as you use the default profiles all that should be preconfigured.

@jonathanpeppers Please let me know if the above doesn't work for you and if possible supply the -v trace output from dotnet-dsrouter when running.

@lateralusX
Copy link
Member Author

lateralusX commented Oct 16, 2023

@lateralusX another issue I'm seeing is, I have a type like this:

public partial class MyPage
{
	~MyPage() => Console.WriteLine("~MyPage");

The message never prints, which indicates a leak I want to fix. But the dump does not contain these objects.

When I saw this scenario in .NET 7, I was able to diagnose with these instructions. And used Filip Navara's tool to convert to .gcdump.

When looking at the linked issue (dotnet/maui#17726) it looks like you have the same behavior in both .net7 and .net8, you only have one instance of Page2 in the dump, correct? So the dumps looks similar in that case, the Page2 instance has actually been GC:ed as reported in the dump?

@jonathanpeppers
Copy link
Member

jonathanpeppers commented Oct 16, 2023

@lateralusX for the "missing objects" / finalizer issue, I'll file an issue on dotnet/runtime, since it appears to happen in both .NET 7 and 8.

For the new commands, is there a doc somewhere on these?

dotnet-dsrouter android-emu
dotnet-trace collect -p <pid of running dotnet-dsrouter>

dotnet-dsrouter android
dotnet-trace collect -p <pid of running dotnet-dsrouter>

I will test them out today, and update the READMEs in xamarin-android, but do we need customer-facing docs somewhere?

@lateralusX
Copy link
Member Author

lateralusX commented Oct 16, 2023

@lateralusX for the "missing objects" / finalizer issue, I'll file an issue on dotnet/runtime, since it appears to happen in both .NET 7 and 8.

For the new commands, is there a doc somewhere on these?

dotnet-dsrouter android-emu
dotnet-trace collect -p <pid of running dotnet-dsrouter>

dotnet-dsrouter android
dotnet-trace collect -p <pid of running dotnet-dsrouter>

I will test them out today, and update the READMEs in xamarin-android, but do we need customer-facing docs somewhere?

Haven't got around adding any new documentation yet, except what was in the original PR's adding the commands into dotnet-dsrouter, #4090.

jonathanpeppers added a commit to jonathanpeppers/xamarin-android that referenced this pull request Oct 17, 2023
Context: dotnet/diagnostics#4081
Context: dotnet/diagnostics#4090
Context: dotnet/runtime#88634
Context: dotnet/diagnostics#4337

Using the latest `dotnet-trace` tooling:

    > dotnet tool list -g
    Package Id                         Version                       Commands
    --------------------------------------------------------------------------------------
    dotnet-dsrouter                    7.0.447801                    dotnet-dsrouter
    dotnet-gcdump                      7.0.447801                    dotnet-gcdump
    dotnet-trace                       7.0.447801                    dotnet-trace

There is a simplified way to profile Android apps.

For an Android emulator:

    > dotnet-dsrouter android-emu
    Start an application on android emulator with one of the following environment variables set:
    DOTNET_DiagnosticPorts=10.0.2.2:9000,nosuspend,connect
    DOTNET_DiagnosticPorts=10.0.2.2:9000,suspend,connect
    > adb shell setprop debug.mono.profile '10.0.2.2:9000,suspend,connect'
    > dotnet-trace ps
    3248  dotnet-dsrouter
    > dotnet-trace collect -p 3248
    ...
    [00:00:00:09]   Recording trace 3.2522   (MB)
    Press <Enter> or <Ctrl+C> to exit...

For an Android device you will eventually be able to do `dotnet-dsrouter
android`, except we found one issue.

Until dotnet/diagnostics#4337 is resolved, we can instead do:

    > adb reverse tcp:9000 tcp:9001
    > dotnet-dsrouter server-server -tcps 127.0.0.1:9001
    > adb shell setprop debug.mono.profile '127.0.0.1:9000,suspend,connect'
    > dotnet-trace ps
    > dotnet-trace collect -p 3248

We can also now use `dotnet-gcdump`! Instead of `dotnet-trace collect
-p`, you can simply do:

    > dotnet-gcdump collect -p 3248

In both cases these tools *know* that the process ID is a
`dotnet-dsrouter` process and to do the right thing.
jonathanpeppers added a commit to dotnet/android that referenced this pull request Oct 18, 2023
Context: dotnet/diagnostics#4081
Context: dotnet/diagnostics#4090
Context: dotnet/runtime#88634
Context: dotnet/diagnostics#4337

Using the latest `dotnet-trace` tooling:

    > dotnet tool list -g
    Package Id                         Version                       Commands
    --------------------------------------------------------------------------------------
    dotnet-dsrouter                    7.0.447801                    dotnet-dsrouter
    dotnet-gcdump                      7.0.447801                    dotnet-gcdump
    dotnet-trace                       7.0.447801                    dotnet-trace

There is a simplified way to profile Android apps.

For an Android emulator:

    > dotnet-dsrouter android-emu
    Start an application on android emulator with one of the following environment variables set:
    DOTNET_DiagnosticPorts=10.0.2.2:9000,nosuspend,connect
    DOTNET_DiagnosticPorts=10.0.2.2:9000,suspend,connect
    > adb shell setprop debug.mono.profile '10.0.2.2:9000,suspend,connect'
    > dotnet-trace ps
    3248  dotnet-dsrouter
    > dotnet-trace collect -p 3248
    ...
    [00:00:00:09]   Recording trace 3.2522   (MB)
    Press <Enter> or <Ctrl+C> to exit...

For an Android device you will eventually be able to do `dotnet-dsrouter
android`, except we found one issue.

Until dotnet/diagnostics#4337 is resolved, we can instead do:

    > adb reverse tcp:9000 tcp:9001
    > dotnet-dsrouter server-server -tcps 127.0.0.1:9001
    > adb shell setprop debug.mono.profile '127.0.0.1:9000,suspend,connect'
    > dotnet-trace ps
    > dotnet-trace collect -p 3248

We can also now use `dotnet-gcdump`! Instead of `dotnet-trace collect
-p`, you can simply do:

    > dotnet-gcdump collect -p 3248

In both cases these tools *know* that the process ID is a
`dotnet-dsrouter` process and to do the right thing.
@github-actions github-actions bot locked and limited conversation to collaborators Jan 8, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants