From a60b85dbda488e3d19df6f457cd5beae3fde6682 Mon Sep 17 00:00:00 2001 From: Michael Sharp Date: Tue, 23 Feb 2021 11:29:26 -0800 Subject: [PATCH 01/20] added initial design doc for Cross Platform --- docs/code/CrossPlatform.md | 157 +++++++++++++++++++++++++++++++++++++ 1 file changed, 157 insertions(+) create mode 100644 docs/code/CrossPlatform.md diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md new file mode 100644 index 0000000000..fb06ebc614 --- /dev/null +++ b/docs/code/CrossPlatform.md @@ -0,0 +1,157 @@ +# Cross-Platform and Architecture design doc + +### Table of contents +- [Cross-Platform and Architecture design doc](#cross-platform-and-architecture-design-doc) + - [1. Why cross-platform/architecture](#1-why-cross-platformarchitecture) + - [2. Current status](#2-current-status) + - [2.1 Problems](#21-problems) + - [2.2 Build](#22-build) + - [2.3 Managed Code](#23-managed-code) + - [2.4 Native Projects](#24-native-projects) + - [2.5 3rd Party Dependencies](#25-3rd-party-dependencies) + - [3 Possible Solutions](#3-possible-solutions) + - [3.1 Fully Remove Intel MKL](#31-fully-remove-intel-mkl) + - [3.2 Find Replacements as needed](#32-find-replacements-as-needed) + - [3.3 Software fallbacks](#33-software-fallbacks) + - [3.4 Hybrid Between Finding Replacements and Software Fallbacks](#34-hybrid-between-finding-replacements-and-software-fallbacks) + - [My Suggestion](#my-suggestion) + - [NET Core 5](#net-core-5) + - [3rd Party Dependencies](#3rd-party-dependencies) + - [Helix](#helix) + - [Mobile Support](#mobile-support) + - [Support Grid](#support-grid) + +## 1. Why cross-platform/architecture +ML.NET is an open-source machine learning framework which makes machine learning accessible to .NET developers with the same code that powers machine learning across many Microsoft products, including Power BI, Windows Defender, and Azure. + +ML.NET allows .NET developers to develop/train their own models and infuse custom machine learning into their applications using .NET. + +Currently, while .NET is able to run on many different platforms and architectures, ML.NET is only able to run on Windows, Mac, and Linux, either x86 or x64. This excludes many architectures such as Arm, Arm64, M1, and web assembly, places that .NET is currently able to run. + +The goal is to enable ML.NET to run everywhere that .NET itself is able to run. + +## 2. Current status +### 2.1 Problems +ML.NET has a hard dependency on Intel MKL. It performs many optimized math functions and enables several transformers/trainers to be run natively for improved performance. The problem is that Intel MKL can only run on x86/x64 machines and is the main blocker for expanding to other architectures. + +ML.NET also has dependencies on things that either don't build on other architectures or have to be compiled by the user if its wanted. For example: + - LightGBM + - TensorFlow + - OnnxRuntime + +I will go over these in more depth below. + +### 2.2 Build +Since ML.NET has a hard dependency on Intel MKL, the build process assumes it's always there. For example, the build process will try and copy dlls without checking if they exist. The build process will need to be modified so that it doesn't fail when it can't find these files. It does the same copy check for our own Native dlls, so this will need to be fixed for those as well. + +### 2.3 Managed Code +Since ML.NET has a hard dependency on Intel MKL, the managed code imports the dlls without checking whether or not they exist. 
If the dlls don't exist you get a hard failure. For example, the base test class imports and sets up Intel MKL even if the test itself does not need it. This same situation applies to our own native dlls. When they aren't present, the imports fail. We will need to make ML.NET correctly handle the dll imports. + +### 2.4 Native Projects +ML.NET has 6 native projects. They are: + - CpuMathNative + - CpuMathNative relies heavily on IntelMKL. Using NetCore 3.1, however, there is a full software fallback. + - FastTreeNative + - FastTreeNative has a software fallback using a C# compiler flag. This flag is hardcoded to always use the native code. + - LdaNative + - Builds successfully without native dependencies. + - MatrixFactorizationNative + - Uses libmf. Currently, we are hardcoding the "USESSE" flag which requires Intel MKL. Removing this flag allows MatrixFactorizationNative to build for Arm. + - MklProxyNative + - Wrapper for Intel MKL. When Intel MKL is not present, this is not needed. However, the build is hardcoded to always compile this code. + - SymSgdNative + - Only uses 4 Intel MKL methods. For 2 of them, I can't find direct replacements on other architectures. + +Of these 6, only LdaNative builds successfully for other architectures without changing anything. + +### 2.5 3rd Party Dependencies +As mentioned above, there are several 3rd party packages that don't have support for non x86/x64 machines. + - LightGBM. LightGBM doesn't offer packages for non x86/x64. I was able to build the code for Arm64, but we would either have to build it ourselves, convince them to create more packages for us, or annotate that this doesn't work on non x86/64 machines. + - TensorFlow. The full version of TensorFlow only runs on x86/x64. There is a [lite](https://www.tensorflow.org/lite/guide/build_arm64) version that supports Arm64, and you can install it directly with python, but this isn't the full version so not all models will run. We would also have to verify if the C# library we use to interface with TensorFlow will work with the lite version. + - OnnxRuntime. OnnxRuntime doesn't have prebuilt packages for more than just x86/x64. It does support Arm, but we have to build it ourselves. This is the same situation as with LightGBM. + + +## 3 Possible Solutions +We have several possible options we can use to resolve this: + - Fully remove Intel MKL as a dependency and use software that supports all platforms/architectures. + - Continue to use Intel MKL for what it supports, and find a replacement for all platforms/architectures that don't support it. + - Create software fallbacks, so that if Intel MKL is not found, then ML.NET will run fully in managed code. + - Hybrid approach of replacement code and software fallback. This is the approach I recommend. + +None of these approaches resolve the 3rd party dependency issues. These solutions only deal with first party ML.NET code itself. + +I have lots more info about our dependency on Intel MKL in another document if required. + +### 3.1 Fully Remove Intel MKL +This is the most complicated solution and provides the least amount of short term benefit. Since x86/x64 run fine and gain performance benefits with Intel MKL, it doesn't make sense to spend the time to fully remove it. This is a possible solution, but not one that I would recommend, so I am not going to give more details on it unless we explicitly decide to go this route. 
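
Before comparing the remaining options, it may help to make the "behaves correctly when the binary is missing" requirement from section 2.3 concrete. The sketch below probes for a native library up front instead of letting the first `DllImport` throw a `DllNotFoundException` deep inside a trainer, and takes a managed path when the probe fails. This is only an illustration: the class, library, and entry-point names are hypothetical and do not come from the ML.NET codebase, and `NativeLibrary.TryLoad` assumes .NET Core 3.0 or newer.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical names throughout -- this is not ML.NET's actual CpuMath code.
internal static class CpuMathDispatcher
{
    private const string NativeLib = "CpuMathNative";

    // Probe for the native binary once, instead of letting the first
    // DllImport hard-fail at the point of use.
    private static readonly bool s_nativeAvailable =
        NativeLibrary.TryLoad(NativeLib, typeof(CpuMathDispatcher).Assembly,
                              searchPath: null, out _);

    [DllImport(NativeLib, EntryPoint = "DotProduct")]
    private static extern float DotProductNative(ref float a, ref float b, int length);

    public static float DotProduct(float[] a, float[] b)
    {
        if (s_nativeAvailable && a.Length > 0)
            return DotProductNative(ref a[0], ref b[0], a.Length);

        // Managed fallback keeps the same call working on architectures
        // where the native binary was never built.
        float sum = 0;
        for (int i = 0; i < a.Length; i++)
            sum += a[i] * b[i];
        return sum;
    }
}
```

The same probe result can also drive a descriptive error message for components that have no managed fallback, rather than surfacing a raw missing-DLL failure.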
+ +### 3.2 Find Replacements as needed +This will still allow us to gain the benefits of Intel MKL on architectures that support it but will also keep the benefits of native code in the other places. The downside is that we would have to build the native code for, potentially, a lot of different architectures. + +At a high level, this solution would require us to: + - Fix the build so it's not hardcoded to look for Intel MKL. + - Fix the managed code so that if it can't find Intel MKL, the code behaves correctly. + - All the native code would need to have the Intel MKL dependency change to another library or be re-written by us. + - CpuMathNative would need to be re-written to not use Intel MKL. + - FastTreeNative would need to be re-written to not use Intel MKL. + - LdaNative builds just find without Intel MKL, so no replacements would need to be found. + - MatrixFactorizationNative builds with the "USESSE" flag that requires Intel MKL. We can conditionally enable/disable that flag and no other work would be required. + - MklProxyNative is only a wrapper for Intel MKL and can be ignored on non x86/x64 platforms. We will need to modify the build to exclude it as needed. + - SymSgdNative would need to be re-written. There are only 4 methods that would need to be changed. + +I was unable to find all the replacements we would need. We would end up having to write many native methods ourselves with this approach. + +### 3.3 Software fallbacks +This will truly allow ML.NET to run anywhere that .NET runs. The only downside is the speed of execution, and the time to rewrite the existing native code we have. If we restrict new architectures to .NET core 3.1 or newer, we will have an easier time with the software fallbacks as some of this code has already been written. This solution will also require a lot of code rewrite from native code to managed code. + +At a high level, this solution would require us to: + - Fix the build so it's not hardcoded to look for Intel MKL or any native binaries. + - Fix the managed code so that if it can't find the native binaries, the code behaves correctly and performs the software fallback. + - CpuMathNative has software fallbacks already in place for .NET core 3.1, so no work is needed. + - FastTreeNative has a flag for a software fallback. We would need to conditionally enable this. Alternatively, we could change the C# code so the software fallback is always enabled for the cases when it can't find the native binaries. + - LdaNative would need to be re-written. + - MatrixFactorizationNative would need to be re-written. + - MklProxyNative can be ignored. It is not needed with software fallbacks. We will need to modify the build to exclude it as needed. + - SymSgdNative would need to be re-written. + +### 3.4 Hybrid Between Finding Replacements and Software Fallbacks +Since some of the native code already has replacements and some of the code already has software fallbacks, we can leverage this work by doing a hybrid between the prior 2 solutions. + +At a high level, this solution would require us to: + - Fix the build so it's not hardcoded to look for Intel MKL or any native binaries. + - Fix the managed code so that if it can't find Intel MKL or any native binaries, the code behaves correctly. This includes software fallbacks and/or description error messages. + - CpuMathNative has software fallbacks already in place for .NET core 3.1, so no work is needed. + - FastTreeNative has a flag for a software fallback. We would need to conditionally enable this. 
Alternatively, we could change the C# code so the software fallback is always enabled for the cases when it can't find the native binaries. + - LdaNative builds just fine for Arm, so no work would be required. + - MatrixFactorizationNative builds with the "USESSE" flag that requires Intel MKL. We can conditionally enable/disable that flag and no other work would be required. + - MklProxyNative can be removed/ignored for Arm builds. We will need to modify the build to exclude it as needed. + - SymSgdNative needs to be either re-written in managed code, or re-write 4 Intel MKL methods. The 4 methods are just dealing with vector manipulation and shouldn't be hard to do. + +## My Suggestion +My suggestion would be to start with the hybrid approach. It will require the least amount of work to get ML.NET running elsewhere, while still being able to support a large majority of devices out of the gate. This solution will still limit the platforms we can run-on to Arm64 devices, but we can do a generic Arm64 compile, so it should work for all Arm64 v8 devices. It would be a good idea to have full software fallbacks so that truly anywhere .NET Core runs, ML.NET will run, such as Web Assembly and when .NET 6 comes out on mobile as well. + +### NET Core 5 +I think we should also add support for .NET Core 5 during this process so that we gain access to the Arm64 intrinsics. However, since .Net Core 5 goes out of support before .NET core 3.1 I don't think .NET Core 5 should be a huge focus. + +### 3rd Party Dependencies +I think initially we should annotate that they don't work on non x86/x64 devices. This includes logging an error when they try and run an unsupported 3rd party dependency, and then failing gracefully with a helpful and descriptive error. The user should be able to compile the 3rd party dependency, for the ones that support it, and have ML.NET still be able to pick it up and run it if it exists. OnnxRuntime is something that we will probably want, but we can look more into this as we get requests for it in the future. + +### Helix +In order to fully test everything we need to, we would also need to change how we test to use the Helix testing servers. Currently, Helix doesn't have the capability to test Apple's new M1 code, but that is in the works. + +### Mobile Support +.NET Core 6 will allow us to run nativly on mobile. Since we are making these changes before .NET 6 is released, I propose we don't include that work as of yet. As long as we handle the native binaries correctly and make sure ML.NET provides descriptive error methods, we should be able to have mobile support as soon as .NET Core 6 releases for everything that currently has a software fallback. + +### Support Grid +This is what I propose for the support grid. Since .NET Core 2.1 is end-of-life this year, I am putting much less emphasis on it. Since .NET Core 5 will be out of support before .NET Core 3.1 will be, I am putting less CI emphasis on .NET Core 5. 
+ +| Platform | Architecture | Intel MKL | .NET Framework | .NET Core 2.1 | .NET Core 3.1 | .NET Core 5 | +| ---------| -------------| --------- | -------------- | ------------- | ------------- | ----------- | +| Windows | x64 | Yes | Yes | Yes, no CI | Yes | Yes | +| Windows | x86 | Yes | Yes, no CI | Yes, no CI | Yes, no CI | Yes, no CI | +| Mac | x64 | Yes | No | Yes, no CI | Yes | Yes, no CI | +| Mac | Arm64 | No | No | No | Yes | Yes, no CI | +| Ios | Arm64 | No | No | No | No | No | +| Linux | x64 | Yes | No | Yes, no CI | Yes | Yes, no CI | +| Linux | Arm64 | No | No | No | Yes | Yes | +| Android | Arm64 | No | No | No | No | No | \ No newline at end of file From 30f40ca36a6538be2c932740b28d5ad5e999a33d Mon Sep 17 00:00:00 2001 From: Michael Sharp Date: Mon, 15 Mar 2021 23:20:52 -0700 Subject: [PATCH 02/20] updates based on PR comments --- docs/code/CrossPlatform.md | 106 ++++++++++++++++++++++--------------- 1 file changed, 64 insertions(+), 42 deletions(-) diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md index fb06ebc614..d65f2a88d9 100644 --- a/docs/code/CrossPlatform.md +++ b/docs/code/CrossPlatform.md @@ -10,12 +10,12 @@ - [2.4 Native Projects](#24-native-projects) - [2.5 3rd Party Dependencies](#25-3rd-party-dependencies) - [3 Possible Solutions](#3-possible-solutions) - - [3.1 Fully Remove Intel MKL](#31-fully-remove-intel-mkl) - - [3.2 Find Replacements as needed](#32-find-replacements-as-needed) - - [3.3 Software fallbacks](#33-software-fallbacks) + - [3.1 Eliminate native components](#31-eliminate-native-components) + - [3.2 Rewrite native components to work on other platforms](#32-rewrite-native-components-to-work-on-other-platforms) + - [3.3 Rewrite native components to be only managed code](#33-rewrite-native-components-to-be-only-managed-code) - [3.4 Hybrid Between Finding Replacements and Software Fallbacks](#34-hybrid-between-finding-replacements-and-software-fallbacks) - [My Suggestion](#my-suggestion) - - [NET Core 5](#net-core-5) + - [Improving managed fallback through intrinsics](#improving-managed-fallback-through-intrinsics) - [3rd Party Dependencies](#3rd-party-dependencies) - [Helix](#helix) - [Mobile Support](#mobile-support) @@ -31,8 +31,14 @@ Currently, while .NET is able to run on many different platforms and architectur The goal is to enable ML.NET to run everywhere that .NET itself is able to run. ## 2. Current status +There are several problems complicating us from moving to a fully cross-platform solution. At a high level thes include: +1. We only build and test for a subset of platforms that .NET supports. +2. Native components must be explicitly built for additional architectures which we wish to support. This limits our ability to support new platforms without doing work. +3. Building our native components for new platforms faces challenges due to lack of support for those components dependencies. This limits our ability to support the current set of platforms. +4. Some of our external dependencies have limited support for .NET's supported platforms. + ### 2.1 Problems -ML.NET has a hard dependency on Intel MKL. It performs many optimized math functions and enables several transformers/trainers to be run natively for improved performance. The problem is that Intel MKL can only run on x86/x64 machines and is the main blocker for expanding to other architectures. +ML.NET has a hard dependency on x86/x64. Some of the dependency is on Intel MKL, while other parts depend on x86/x64 SIMD instructions. 
To make things easier I will refer to these as just the x86/x64 dependencies. This is to perform many optimized math functions and enables several transformers/trainers to be run natively for improved performance. The problem is that these dependencies can only run on x86/x64 machines and are the main blockers for expanding to other architectures. While you can run the managed code on other architectures, there is no good way to know which parts will run and which ones wont. This includes the build process as well, which currently has these same hard dependencies and building on non x86/x64 machines is not supported. ML.NET also has dependencies on things that either don't build on other architectures or have to be compiled by the user if its wanted. For example: - LightGBM @@ -42,25 +48,40 @@ ML.NET also has dependencies on things that either don't build on other architec I will go over these in more depth below. ### 2.2 Build -Since ML.NET has a hard dependency on Intel MKL, the build process assumes it's always there. For example, the build process will try and copy dlls without checking if they exist. The build process will need to be modified so that it doesn't fail when it can't find these files. It does the same copy check for our own Native dlls, so this will need to be fixed for those as well. +Since ML.NET has a hard dependency on x86/x64, the build process assumes it's running there. For example, the build process will try and copy native dlls without checking if they exist because it assumes the build for them succeeded or that they are available. The build process will need to be modified so that it doesn't fail when it can't find these files. It does the same copy for our own Native dlls, so this will need to be fixed for those as well. ### 2.3 Managed Code -Since ML.NET has a hard dependency on Intel MKL, the managed code imports the dlls without checking whether or not they exist. If the dlls don't exist you get a hard failure. For example, the base test class imports and sets up Intel MKL even if the test itself does not need it. This same situation applies to our own native dlls. When they aren't present, the imports fail. We will need to make ML.NET correctly handle the dll imports. +Since ML.NET has a hard dependency on x86/x64, the managed code imports dlls without checking whether or not they exist. If the dlls don't exist you get a hard failure. For example, if certain columns are active the `MulticlassClassificationScorer` will call `CalculateIntermediateVariablesNative` which is loaded from `CpuMathNative`, but all of this is done without any checks to see if the DLL actually exists. The tests also run into this probleb. The base test class imports and sets up Intel MKL even if the test itself does not need it. ### 2.4 Native Projects ML.NET has 6 native projects. They are: - CpuMathNative - - CpuMathNative relies heavily on IntelMKL. Using NetCore 3.1, however, there is a full software fallback. + - Partial managed fallback when using NetCore 3.1. + - A large amount of work would be required to port the native code to other platforms. We would have to change all the SIMD instructions for each platform. + - A small amount of work required for a full managed fallback. + - This was created before we had hardware intrinsics so we used native code for performance. The managed fallback uses x86/x64 intrinsics where possible. - FastTreeNative - - FastTreeNative has a software fallback using a C# compiler flag. This flag is hardcoded to always use the native code. 
+ - Full managed fallback by changing a C# compiler flag. This flag is hardcoded to always use the native code. + - Small amount of work required to change build process and verify it's correct. + - This was created before we had hardware intrinsics so we used native code for performance. The managed fallback does not use hardware intrinsics, so this will be slower than the native solution. - LdaNative - - Builds successfully without native dependencies. + - No managed fallback, but builds successfully on non x86/x64 without changes. + - Large amount of work to have a managed fallback. This solution is about 3000 lines of code not including any code from the dependencies. + - This was created before we had hardware intrinsics so we used native code for performance. - MatrixFactorizationNative - - Uses libmf. Currently, we are hardcoding the "USESSE" flag which requires Intel MKL. Removing this flag allows MatrixFactorizationNative to build for Arm. + - No sotware fallback. + - Currently we are hardcoding the "USESSE" flag which requires x86/x64 SIMD commands. Removing this flag allows MatrixFactorizationNative to build for other platforms. + - Small amount of work required to change the build process and verify its working. + - Large/Xlarge amount of work to have a managed fallback. This uses libmf, so we would have to not only port our code, but we would have to understand which parts of libmf are being used and port those as well. + - This was created before we had hardware intrinsics so we used native code for performance as well as take advantage of libmf. - MklProxyNative - Wrapper for Intel MKL. When Intel MKL is not present, this is not needed. However, the build is hardcoded to always compile this code. + - Small amount of work required to change the build process to exclude this as needed. - SymSgdNative - - Only uses 4 Intel MKL methods. For 2 of them, I can't find direct replacements on other architectures. + - No managed fallback. + - Medium amount of work required to have a managed fallback. Only about 500 lines of code plus having to implmement 4 vector operations. + - Small amount of work required port to other platforms. Only uses 4 Intel MKL methods which we can replace. I was only able to find 2 direct replacements on other architectures, but we could easily write our own for either those 2 or even all 4. + - This was created before we had hardware intrinsics so we used native code for performance as well as take advantage of IntelMKL. Of these 6, only LdaNative builds successfully for other architectures without changing anything. @@ -68,46 +89,46 @@ Of these 6, only LdaNative builds successfully for other architectures without c As mentioned above, there are several 3rd party packages that don't have support for non x86/x64 machines. - LightGBM. LightGBM doesn't offer packages for non x86/x64. I was able to build the code for Arm64, but we would either have to build it ourselves, convince them to create more packages for us, or annotate that this doesn't work on non x86/64 machines. - TensorFlow. The full version of TensorFlow only runs on x86/x64. There is a [lite](https://www.tensorflow.org/lite/guide/build_arm64) version that supports Arm64, and you can install it directly with python, but this isn't the full version so not all models will run. We would also have to verify if the C# library we use to interface with TensorFlow will work with the lite version. - - OnnxRuntime. OnnxRuntime doesn't have prebuilt packages for more than just x86/x64. 
It does support Arm, but we have to build it ourselves. This is the same situation as with LightGBM. + - OnnxRuntime. OnnxRuntime doesn't have prebuilt packages for more than just x86/x64. It does support Arm, but we have to build it ourselves or get the OnnxRuntime team to package Arm assemblies. This is the same situation as with LightGBM. ## 3 Possible Solutions We have several possible options we can use to resolve this: - - Fully remove Intel MKL as a dependency and use software that supports all platforms/architectures. - - Continue to use Intel MKL for what it supports, and find a replacement for all platforms/architectures that don't support it. - - Create software fallbacks, so that if Intel MKL is not found, then ML.NET will run fully in managed code. + - Eliminate ML.NET native components and implement all functionality in managed code. + - Keep ML.NET native components and rewrite them to a avoid problematic dependencies completely. + - Keep ML.NET native components and ifdef/rewrite to avoid problematic dependencies on platforms/architectures only where they are not supported. - Hybrid approach of replacement code and software fallback. This is the approach I recommend. None of these approaches resolve the 3rd party dependency issues. These solutions only deal with first party ML.NET code itself. -I have lots more info about our dependency on Intel MKL in another document if required. +I have lots more info about our dependency on x86/x64 in another document if required. -### 3.1 Fully Remove Intel MKL -This is the most complicated solution and provides the least amount of short term benefit. Since x86/x64 run fine and gain performance benefits with Intel MKL, it doesn't make sense to spend the time to fully remove it. This is a possible solution, but not one that I would recommend, so I am not going to give more details on it unless we explicitly decide to go this route. +### 3.1 Eliminate native components +This is the most complicated solution and provides the least amount of short term benefit. Since x86/x64 run fine and gain performance benefits with these components, it doesn't make sense to spend the time to fully remove it. This is a possible solution, but not one that I would recommend, so I am not going to give more details on it unless we explicitly decide to go this route. -### 3.2 Find Replacements as needed -This will still allow us to gain the benefits of Intel MKL on architectures that support it but will also keep the benefits of native code in the other places. The downside is that we would have to build the native code for, potentially, a lot of different architectures. +### 3.2 Rewrite native components to work on other platforms +This will still allow us to gain the benefits of the X86/x64 SIMD instructions and Intel MKL on architectures that support it but will also keep the benefits of native code in the other places. The downside is that we would have to build the native code for, potentially, a lot of different architectures. At a high level, this solution would require us to: - - Fix the build so it's not hardcoded to look for Intel MKL. - - Fix the managed code so that if it can't find Intel MKL, the code behaves correctly. - - All the native code would need to have the Intel MKL dependency change to another library or be re-written by us. - - CpuMathNative would need to be re-written to not use Intel MKL. - - FastTreeNative would need to be re-written to not use Intel MKL. 
- - LdaNative builds just find without Intel MKL, so no replacements would need to be found. - - MatrixFactorizationNative builds with the "USESSE" flag that requires Intel MKL. We can conditionally enable/disable that flag and no other work would be required. + - Fix the build so it's not hardcoded to look for specific native dependencies. + - Fix the managed code so that if it can't find the native components the code behaves correctly. + - All the native code would need to have the x86/x64 dependencies change to other libraries or be re-written by us. + - CpuMathNative would need to be re-written to not use x86/x64 dependencies. + - FastTreeNative would need to be re-written to not use x86/x64 dependencies. + - LdaNative builds just find without x86/x64 dependencies, so no replacements would need to be found. + - MatrixFactorizationNative builds with the "USESSE" flag that requires x86/x64. We can conditionally enable/disable that flag and no other work would be required. - MklProxyNative is only a wrapper for Intel MKL and can be ignored on non x86/x64 platforms. We will need to modify the build to exclude it as needed. - - SymSgdNative would need to be re-written. There are only 4 methods that would need to be changed. + - SymSgdNative would need to be re-written to not use Intel MKL. There are only 4 methods that would need to be changed. I was unable to find all the replacements we would need. We would end up having to write many native methods ourselves with this approach. -### 3.3 Software fallbacks +### 3.3 Rewrite native components to be only managed code This will truly allow ML.NET to run anywhere that .NET runs. The only downside is the speed of execution, and the time to rewrite the existing native code we have. If we restrict new architectures to .NET core 3.1 or newer, we will have an easier time with the software fallbacks as some of this code has already been written. This solution will also require a lot of code rewrite from native code to managed code. At a high level, this solution would require us to: - - Fix the build so it's not hardcoded to look for Intel MKL or any native binaries. + - Fix the build so it's not hardcoded to look for any native dependencies or binaries. - Fix the managed code so that if it can't find the native binaries, the code behaves correctly and performs the software fallback. - - CpuMathNative has software fallbacks already in place for .NET core 3.1, so no work is needed. + - CpuMathNative mostly has software fallbacks already in place for .NET core 3.1, so only a little work is needed. - FastTreeNative has a flag for a software fallback. We would need to conditionally enable this. Alternatively, we could change the C# code so the software fallback is always enabled for the cases when it can't find the native binaries. - LdaNative would need to be re-written. - MatrixFactorizationNative would need to be re-written. @@ -118,20 +139,20 @@ At a high level, this solution would require us to: Since some of the native code already has replacements and some of the code already has software fallbacks, we can leverage this work by doing a hybrid between the prior 2 solutions. At a high level, this solution would require us to: - - Fix the build so it's not hardcoded to look for Intel MKL or any native binaries. - - Fix the managed code so that if it can't find Intel MKL or any native binaries, the code behaves correctly. This includes software fallbacks and/or description error messages. 
- - CpuMathNative has software fallbacks already in place for .NET core 3.1, so no work is needed. + - Fix the build so it's not hardcoded to look for any native dependencies or binaries. + - Fix the managed code so that if it can't find the native binaries, the code behaves correctly and performs the software fallback. This includes software fallbacks and/or description error messages. + - CpuMathNative mostly has software fallbacks already in place for .NET core 3.1, so only a little work is needed. - FastTreeNative has a flag for a software fallback. We would need to conditionally enable this. Alternatively, we could change the C# code so the software fallback is always enabled for the cases when it can't find the native binaries. - - LdaNative builds just fine for Arm, so no work would be required. - - MatrixFactorizationNative builds with the "USESSE" flag that requires Intel MKL. We can conditionally enable/disable that flag and no other work would be required. - - MklProxyNative can be removed/ignored for Arm builds. We will need to modify the build to exclude it as needed. + - LdaNative builds just find without x86/x64 dependencies, so no replacements would need to be found. + - MatrixFactorizationNative builds with the "USESSE" flag that requires x86/x64. We can conditionally enable/disable that flag and no other work would be required. + - MklProxyNative is only a wrapper for Intel MKL and can be ignored on non x86/x64 platforms. We will need to modify the build to exclude it as needed. - SymSgdNative needs to be either re-written in managed code, or re-write 4 Intel MKL methods. The 4 methods are just dealing with vector manipulation and shouldn't be hard to do. ## My Suggestion -My suggestion would be to start with the hybrid approach. It will require the least amount of work to get ML.NET running elsewhere, while still being able to support a large majority of devices out of the gate. This solution will still limit the platforms we can run-on to Arm64 devices, but we can do a generic Arm64 compile, so it should work for all Arm64 v8 devices. It would be a good idea to have full software fallbacks so that truly anywhere .NET Core runs, ML.NET will run, such as Web Assembly and when .NET 6 comes out on mobile as well. +My suggestion would be to start with the hybrid approach. It will require the least amount of work to get ML.NET running elsewhere, while still being able to support a large majority of devices out of the gate. This solution will still limit the platforms we can run-on to what we build the native components for, initially Arm64 devices, but we can do a generic Arm64 compile so it should work for all Arm64 v8 devices. The goal is to eventually have a general purpose implementation which can work everywhere .NET does and accelerated components to increase performance where possible, such as running on Web Assembly and when .NET 6 comes out on mobile as well. -### NET Core 5 -I think we should also add support for .NET Core 5 during this process so that we gain access to the Arm64 intrinsics. However, since .Net Core 5 goes out of support before .NET core 3.1 I don't think .NET Core 5 should be a huge focus. +### Improving managed fallback through intrinsics +We should also target .NET 5 so that we gain access to the Arm64 intrinsics. Rather than implementing special-purpose native libraries to take advantage of architecture-specific instructions we should instead enhance performance ensuring our managed implementation leverage intrinsics. 
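
As a rough illustration of that direction, a single managed routine can branch on the `IsSupported` checks so the same code uses SSE on x86/x64, AdvSimd on Arm64 (the Arm intrinsics APIs ship starting with .NET 5), and a plain scalar loop everywhere else; the JIT treats `IsSupported` as a constant, so each architecture only keeps its own path. This is a minimal sketch, not ML.NET's actual CpuMath code, and the type and method names are made up for the example.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.Arm;
using System.Runtime.Intrinsics.X86;

internal static class ManagedCpuMath
{
    public static unsafe float Sum(ReadOnlySpan<float> values)
    {
        fixed (float* p = values)
        {
            int i = 0;
            Vector128<float> acc = Vector128<float>.Zero;

            if (Sse.IsSupported)
            {
                // x86/x64 path: 4 floats per iteration.
                for (; i <= values.Length - 4; i += 4)
                    acc = Sse.Add(acc, Sse.LoadVector128(p + i));
            }
            else if (AdvSimd.IsSupported)
            {
                // Arm64 path, available since .NET 5.
                for (; i <= values.Length - 4; i += 4)
                    acc = AdvSimd.Add(acc, AdvSimd.LoadVector128(p + i));
            }

            float sum = acc.GetElement(0) + acc.GetElement(1)
                      + acc.GetElement(2) + acc.GetElement(3);

            // Scalar tail, and the whole loop when neither ISA is available.
            for (; i < values.Length; i++)
                sum += p[i];

            return sum;
        }
    }
}
```
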
### 3rd Party Dependencies I think initially we should annotate that they don't work on non x86/x64 devices. This includes logging an error when they try and run an unsupported 3rd party dependency, and then failing gracefully with a helpful and descriptive error. The user should be able to compile the 3rd party dependency, for the ones that support it, and have ML.NET still be able to pick it up and run it if it exists. OnnxRuntime is something that we will probably want, but we can look more into this as we get requests for it in the future. @@ -140,7 +161,7 @@ I think initially we should annotate that they don't work on non x86/x64 devices In order to fully test everything we need to, we would also need to change how we test to use the Helix testing servers. Currently, Helix doesn't have the capability to test Apple's new M1 code, but that is in the works. ### Mobile Support -.NET Core 6 will allow us to run nativly on mobile. Since we are making these changes before .NET 6 is released, I propose we don't include that work as of yet. As long as we handle the native binaries correctly and make sure ML.NET provides descriptive error methods, we should be able to have mobile support as soon as .NET Core 6 releases for everything that currently has a software fallback. +.NET Core 6 will allow us to run nativly on mobile. Since we are making these changes before .NET 6 is released, I propose we don't include that work as of yet. As long as we handle the native binaries correctly and make sure ML.NET provides descriptive error methods, we should be able to have mobile support as soon as .NET Core 6 releases for everything that currently has a software fallback. Since the native projects we are proposing to keep build for Arm64, they should work on mobile as well. ### Support Grid This is what I propose for the support grid. Since .NET Core 2.1 is end-of-life this year, I am putting much less emphasis on it. Since .NET Core 5 will be out of support before .NET Core 3.1 will be, I am putting less CI emphasis on .NET Core 5. @@ -152,6 +173,7 @@ This is what I propose for the support grid. Since .NET Core 2.1 is end-of-life | Mac | x64 | Yes | No | Yes, no CI | Yes | Yes, no CI | | Mac | Arm64 | No | No | No | Yes | Yes, no CI | | Ios | Arm64 | No | No | No | No | No | +| Ios | x64 | No | No | No | No | No | | Linux | x64 | Yes | No | Yes, no CI | Yes | Yes, no CI | | Linux | Arm64 | No | No | No | Yes | Yes | | Android | Arm64 | No | No | No | No | No | \ No newline at end of file From 7bec81b4d2988198f4a773d2a02cdc5d8eb7c0d6 Mon Sep 17 00:00:00 2001 From: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com> Date: Thu, 18 Mar 2021 12:07:35 -0700 Subject: [PATCH 03/20] Update docs/code/CrossPlatform.md Co-authored-by: Stephen Toub --- docs/code/CrossPlatform.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md index d65f2a88d9..1a4ee854a2 100644 --- a/docs/code/CrossPlatform.md +++ b/docs/code/CrossPlatform.md @@ -69,7 +69,7 @@ ML.NET has 6 native projects. They are: - Large amount of work to have a managed fallback. This solution is about 3000 lines of code not including any code from the dependencies. - This was created before we had hardware intrinsics so we used native code for performance. - MatrixFactorizationNative - - No sotware fallback. + - No software fallback. - Currently we are hardcoding the "USESSE" flag which requires x86/x64 SIMD commands. 
Removing this flag allows MatrixFactorizationNative to build for other platforms. - Small amount of work required to change the build process and verify its working. - Large/Xlarge amount of work to have a managed fallback. This uses libmf, so we would have to not only port our code, but we would have to understand which parts of libmf are being used and port those as well. @@ -176,4 +176,4 @@ This is what I propose for the support grid. Since .NET Core 2.1 is end-of-life | Ios | x64 | No | No | No | No | No | | Linux | x64 | Yes | No | Yes, no CI | Yes | Yes, no CI | | Linux | Arm64 | No | No | No | Yes | Yes | -| Android | Arm64 | No | No | No | No | No | \ No newline at end of file +| Android | Arm64 | No | No | No | No | No | From 84ecf799e0124968dabf2c88bd898bfa7e149528 Mon Sep 17 00:00:00 2001 From: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com> Date: Thu, 18 Mar 2021 12:46:07 -0700 Subject: [PATCH 04/20] Update docs/code/CrossPlatform.md Co-authored-by: Justin Ormont --- docs/code/CrossPlatform.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md index 1a4ee854a2..794634b5b1 100644 --- a/docs/code/CrossPlatform.md +++ b/docs/code/CrossPlatform.md @@ -48,10 +48,10 @@ ML.NET also has dependencies on things that either don't build on other architec I will go over these in more depth below. ### 2.2 Build -Since ML.NET has a hard dependency on x86/x64, the build process assumes it's running there. For example, the build process will try and copy native dlls without checking if they exist because it assumes the build for them succeeded or that they are available. The build process will need to be modified so that it doesn't fail when it can't find these files. It does the same copy for our own Native dlls, so this will need to be fixed for those as well. +Since ML.NET has a hard dependency on x86/x64, the build process assumes it's running there. For example, the build process will try and copy native DLLs without checking if they exist because it assumes the build for them succeeded or that they are available. The build process will need to be modified so that it doesn't fail when it can't find these files. It does the same copy for our own Native DLLs, so this will need to be fixed for those as well. ### 2.3 Managed Code -Since ML.NET has a hard dependency on x86/x64, the managed code imports dlls without checking whether or not they exist. If the dlls don't exist you get a hard failure. For example, if certain columns are active the `MulticlassClassificationScorer` will call `CalculateIntermediateVariablesNative` which is loaded from `CpuMathNative`, but all of this is done without any checks to see if the DLL actually exists. The tests also run into this probleb. The base test class imports and sets up Intel MKL even if the test itself does not need it. +Since ML.NET has a hard dependency on x86/x64, the managed code imports DLLs without checking whether or not they exist. If the DLLs don't exist you get a hard failure. For example, if certain columns are active, the `MulticlassClassificationScorer` will call `CalculateIntermediateVariablesNative` which is loaded from `CpuMathNative`, but all of this is done without any checks to see if the DLL actually exists. The tests also run into this problem, for instance, the base test class imports and sets up Intel MKL even if the test itself does not need it. ### 2.4 Native Projects ML.NET has 6 native projects. 
They are: From 1f3041022b9d42d04e43a145fb0d11a72824f7ab Mon Sep 17 00:00:00 2001 From: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com> Date: Thu, 18 Mar 2021 12:46:26 -0700 Subject: [PATCH 05/20] Update docs/code/CrossPlatform.md Co-authored-by: Justin Ormont --- docs/code/CrossPlatform.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md index 794634b5b1..b4466f41c2 100644 --- a/docs/code/CrossPlatform.md +++ b/docs/code/CrossPlatform.md @@ -40,7 +40,7 @@ There are several problems complicating us from moving to a fully cross-platform ### 2.1 Problems ML.NET has a hard dependency on x86/x64. Some of the dependency is on Intel MKL, while other parts depend on x86/x64 SIMD instructions. To make things easier I will refer to these as just the x86/x64 dependencies. This is to perform many optimized math functions and enables several transformers/trainers to be run natively for improved performance. The problem is that these dependencies can only run on x86/x64 machines and are the main blockers for expanding to other architectures. While you can run the managed code on other architectures, there is no good way to know which parts will run and which ones wont. This includes the build process as well, which currently has these same hard dependencies and building on non x86/x64 machines is not supported. -ML.NET also has dependencies on things that either don't build on other architectures or have to be compiled by the user if its wanted. For example: +ML.NET also has dependencies on things that either don't build on other architectures or have to be compiled by the user if it's wanted. For example: - LightGBM - TensorFlow - OnnxRuntime From f381c5d3a316d875fab144086d7e947ae4d53603 Mon Sep 17 00:00:00 2001 From: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com> Date: Thu, 18 Mar 2021 12:46:43 -0700 Subject: [PATCH 06/20] Update docs/code/CrossPlatform.md Co-authored-by: Justin Ormont --- docs/code/CrossPlatform.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md index b4466f41c2..c924059237 100644 --- a/docs/code/CrossPlatform.md +++ b/docs/code/CrossPlatform.md @@ -31,7 +31,7 @@ Currently, while .NET is able to run on many different platforms and architectur The goal is to enable ML.NET to run everywhere that .NET itself is able to run. ## 2. Current status -There are several problems complicating us from moving to a fully cross-platform solution. At a high level thes include: +There are several problems complicating us from moving to a fully cross-platform solution. At a high level these include: 1. We only build and test for a subset of platforms that .NET supports. 2. Native components must be explicitly built for additional architectures which we wish to support. This limits our ability to support new platforms without doing work. 3. Building our native components for new platforms faces challenges due to lack of support for those components dependencies. This limits our ability to support the current set of platforms. 
From 16f474f5ade69fa54d9c969067b715232aa53f1b Mon Sep 17 00:00:00 2001 From: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com> Date: Thu, 18 Mar 2021 12:46:54 -0700 Subject: [PATCH 07/20] Update docs/code/CrossPlatform.md Co-authored-by: Justin Ormont --- docs/code/CrossPlatform.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md index c924059237..c4ad716f10 100644 --- a/docs/code/CrossPlatform.md +++ b/docs/code/CrossPlatform.md @@ -43,7 +43,7 @@ ML.NET has a hard dependency on x86/x64. Some of the dependency is on Intel MKL, ML.NET also has dependencies on things that either don't build on other architectures or have to be compiled by the user if it's wanted. For example: - LightGBM - TensorFlow - - OnnxRuntime + - ONNX Runtime I will go over these in more depth below. From b1eff29b92f8c53952e98c6ca5b4553eea6a5b0b Mon Sep 17 00:00:00 2001 From: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com> Date: Thu, 18 Mar 2021 12:47:02 -0700 Subject: [PATCH 08/20] Update docs/code/CrossPlatform.md Co-authored-by: Justin Ormont --- docs/code/CrossPlatform.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md index c4ad716f10..e061ef43d5 100644 --- a/docs/code/CrossPlatform.md +++ b/docs/code/CrossPlatform.md @@ -71,7 +71,7 @@ ML.NET has 6 native projects. They are: - MatrixFactorizationNative - No software fallback. - Currently we are hardcoding the "USESSE" flag which requires x86/x64 SIMD commands. Removing this flag allows MatrixFactorizationNative to build for other platforms. - - Small amount of work required to change the build process and verify its working. + - Small amount of work required to change the build process and verify it's working. - Large/Xlarge amount of work to have a managed fallback. This uses libmf, so we would have to not only port our code, but we would have to understand which parts of libmf are being used and port those as well. - This was created before we had hardware intrinsics so we used native code for performance as well as take advantage of libmf. - MklProxyNative From acefea54f781d149a6e55cfd48afd11780b31676 Mon Sep 17 00:00:00 2001 From: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com> Date: Thu, 18 Mar 2021 12:47:14 -0700 Subject: [PATCH 09/20] Update docs/code/CrossPlatform.md Co-authored-by: Justin Ormont --- docs/code/CrossPlatform.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md index e061ef43d5..962d6d9a80 100644 --- a/docs/code/CrossPlatform.md +++ b/docs/code/CrossPlatform.md @@ -79,7 +79,7 @@ ML.NET has 6 native projects. They are: - Small amount of work required to change the build process to exclude this as needed. - SymSgdNative - No managed fallback. - - Medium amount of work required to have a managed fallback. Only about 500 lines of code plus having to implmement 4 vector operations. + - Medium amount of work required to have a managed fallback. Only about 500 lines of code plus having to implement 4 vector operations. - Small amount of work required port to other platforms. Only uses 4 Intel MKL methods which we can replace. I was only able to find 2 direct replacements on other architectures, but we could easily write our own for either those 2 or even all 4. 
- This was created before we had hardware intrinsics so we used native code for performance as well as take advantage of IntelMKL. From 6e043073a6019745c0605897e220cf7ebd8808d4 Mon Sep 17 00:00:00 2001 From: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com> Date: Thu, 18 Mar 2021 12:47:28 -0700 Subject: [PATCH 10/20] Update docs/code/CrossPlatform.md Co-authored-by: Justin Ormont --- docs/code/CrossPlatform.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md index 962d6d9a80..67ae581914 100644 --- a/docs/code/CrossPlatform.md +++ b/docs/code/CrossPlatform.md @@ -95,7 +95,7 @@ As mentioned above, there are several 3rd party packages that don't have support ## 3 Possible Solutions We have several possible options we can use to resolve this: - Eliminate ML.NET native components and implement all functionality in managed code. - - Keep ML.NET native components and rewrite them to a avoid problematic dependencies completely. + - Keep ML.NET native components and rewrite them to avoid problematic dependencies completely. - Keep ML.NET native components and ifdef/rewrite to avoid problematic dependencies on platforms/architectures only where they are not supported. - Hybrid approach of replacement code and software fallback. This is the approach I recommend. From 88056e64ec59a5c3adec51b7387e134a8b6efe2a Mon Sep 17 00:00:00 2001 From: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com> Date: Thu, 18 Mar 2021 12:47:40 -0700 Subject: [PATCH 11/20] Update docs/code/CrossPlatform.md Co-authored-by: Justin Ormont --- docs/code/CrossPlatform.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md index 67ae581914..8e2056fe5f 100644 --- a/docs/code/CrossPlatform.md +++ b/docs/code/CrossPlatform.md @@ -161,7 +161,7 @@ I think initially we should annotate that they don't work on non x86/x64 devices In order to fully test everything we need to, we would also need to change how we test to use the Helix testing servers. Currently, Helix doesn't have the capability to test Apple's new M1 code, but that is in the works. ### Mobile Support -.NET Core 6 will allow us to run nativly on mobile. Since we are making these changes before .NET 6 is released, I propose we don't include that work as of yet. As long as we handle the native binaries correctly and make sure ML.NET provides descriptive error methods, we should be able to have mobile support as soon as .NET Core 6 releases for everything that currently has a software fallback. Since the native projects we are proposing to keep build for Arm64, they should work on mobile as well. +.NET Core 6 will allow us to run natively on mobile. Since we are making these changes before .NET 6 is released, I propose we don't include that work as of yet. As long as we handle the native binaries correctly and make sure ML.NET provides descriptive error methods, we should be able to have mobile support as soon as .NET Core 6 releases for everything that currently has a software fallback. Since the native projects we are proposing to keep build for Arm64, they should work on mobile as well. ### Support Grid This is what I propose for the support grid. Since .NET Core 2.1 is end-of-life this year, I am putting much less emphasis on it. Since .NET Core 5 will be out of support before .NET Core 3.1 will be, I am putting less CI emphasis on .NET Core 5. 
From 94e319f9c89fd2f79e22a7fd04b3d38c9097e028 Mon Sep 17 00:00:00 2001
From: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com>
Date: Thu, 18 Mar 2021 12:47:51 -0700
Subject: [PATCH 12/20] Update docs/code/CrossPlatform.md

Co-authored-by: Justin Ormont
---
 docs/code/CrossPlatform.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md
index 8e2056fe5f..9719228dac 100644
--- a/docs/code/CrossPlatform.md
+++ b/docs/code/CrossPlatform.md
@@ -89,7 +89,7 @@ Of these 6, only LdaNative builds successfully for other architectures without c
 As mentioned above, there are several 3rd party packages that don't have support for non x86/x64 machines.
  - LightGBM. LightGBM doesn't offer packages for non x86/x64. I was able to build the code for Arm64, but we would either have to build it ourselves, convince them to create more packages for us, or annotate that this doesn't work on non x86/64 machines.
  - TensorFlow. The full version of TensorFlow only runs on x86/x64. There is a [lite](https://www.tensorflow.org/lite/guide/build_arm64) version that supports Arm64, and you can install it directly with python, but this isn't the full version so not all models will run. We would also have to verify if the C# library we use to interface with TensorFlow will work with the lite version.
- - OnnxRuntime. OnnxRuntime doesn't have prebuilt packages for more than just x86/x64. It does support Arm, but we have to build it ourselves or get the OnnxRuntime team to package Arm assemblies. This is the same situation as with LightGBM.
+ - ONNX Runtime. ONNX Runtime doesn't have prebuilt packages for more than just x86/x64. It does support Arm, but we have to build it ourselves or get the ONNX Runtime team to package Arm assemblies. This is the same situation as with LightGBM.


 ## 3 Possible Solutions

From 4408d9a0a87393addb16542cd6fcb9f54324d14f Mon Sep 17 00:00:00 2001
From: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com>
Date: Thu, 18 Mar 2021 12:47:58 -0700
Subject: [PATCH 13/20] Update docs/code/CrossPlatform.md

Co-authored-by: Justin Ormont
---
 docs/code/CrossPlatform.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md
index 9719228dac..f71b6d5797 100644
--- a/docs/code/CrossPlatform.md
+++ b/docs/code/CrossPlatform.md
@@ -155,7 +155,7 @@ My suggestion would be to start with the hybrid approach. It will require the le
 We should also target .NET 5 so that we gain access to the Arm64 intrinsics. Rather than implementing special-purpose native libraries to take advantage of architecture-specific instructions we should instead enhance performance by ensuring our managed implementations leverage intrinsics.

 ### 3rd Party Dependencies
-I think initially we should annotate that they don't work on non x86/x64 devices. This includes logging an error when they try and run an unsupported 3rd party dependency, and then failing gracefully with a helpful and descriptive error. The user should be able to compile the 3rd party dependency, for the ones that support it, and have ML.NET still be able to pick it up and run it if it exists. OnnxRuntime is something that we will probably want, but we can look more into this as we get requests for it in the future.
+I think initially we should annotate that they don't work on non x86/x64 devices. This includes logging an error when they try and run an unsupported 3rd party dependency, and then failing gracefully with a helpful and descriptive error. The user should be able to compile the 3rd party dependency, for the ones that support it, and have ML.NET still be able to pick it up and run it if it exists. ONNX Runtime is something that we will probably want, but we can look more into this as we get requests for it in the future.

 ### Helix
 In order to fully test everything we need to, we would also need to change how we test to use the Helix testing servers. Currently, Helix doesn't have the capability to test Apple's new M1 code, but that is in the works.

From 4082cd800ed9ba2f34ce81463107b85462d17ef3 Mon Sep 17 00:00:00 2001
From: Michael Sharp
Date: Mon, 22 Mar 2021 09:30:41 -0700
Subject: [PATCH 14/20] updates about helix

---
 docs/code/CrossPlatform.md | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md
index f71b6d5797..36677f7870 100644
--- a/docs/code/CrossPlatform.md
+++ b/docs/code/CrossPlatform.md
@@ -10,10 +10,9 @@
   - [2.4 Native Projects](#24-native-projects)
   - [2.5 3rd Party Dependencies](#25-3rd-party-dependencies)
   - [3 Possible Solutions](#3-possible-solutions)
-  - [3.1 Eliminate native components](#31-eliminate-native-components)
-  - [3.2 Rewrite native components to work on other platforms](#32-rewrite-native-components-to-work-on-other-platforms)
-  - [3.3 Rewrite native components to be only managed code](#33-rewrite-native-components-to-be-only-managed-code)
-  - [3.4 Hybrid Between Finding Replacements and Software Fallbacks](#34-hybrid-between-finding-replacements-and-software-fallbacks)
+  - [3.1 Rewrite native components to work on other platforms](#31-rewrite-native-components-to-work-on-other-platforms)
+  - [3.2 Rewrite native components to be only managed code](#32-rewrite-native-components-to-be-only-managed-code)
+  - [3.3 Hybrid Between Finding Replacements and Software Fallbacks](#33-hybrid-between-finding-replacements-and-software-fallbacks)
   - [My Suggestion](#my-suggestion)
   - [Improving managed fallback through intrinsics](#improving-managed-fallback-through-intrinsics)
   - [3rd Party Dependencies](#3rd-party-dependencies)
@@ -36,6 +35,7 @@ There are several problems complicating us from moving to a fully cross-platform
 2. Native components must be explicitly built for additional architectures which we wish to support. This limits our ability to support new platforms without doing work.
 3. Building our native components for new platforms faces challenges due to lack of support for those components dependencies. This limits our ability to support the current set of platforms.
 4. Some of our external dependencies have limited support for .NET's supported platforms.
+5. Some things we use internally are optomized for x86/x64 and won't work well on other platforms. For example various components are parallelized but current webassembly targets are single threaded. It's likely some changes will be necessary to various algorithms to work well in these environments.

 ### 2.1 Problems
 ML.NET has a hard dependency on x86/x64. Some of the dependency is on Intel MKL, while other parts depend on x86/x64 SIMD instructions. To make things easier I will refer to these as just the x86/x64 dependencies. They are used to perform many optimized math functions and enable several transformers/trainers to be run natively for improved performance. The problem is that these dependencies can only run on x86/x64 machines and are the main blockers for expanding to other architectures. While you can run the managed code on other architectures, there is no good way to know which parts will run and which ones won't. This includes the build process as well, which currently has these same hard dependencies; building on non x86/x64 machines is not supported.
@@ -59,11 +59,11 @@ ML.NET has 6 native projects. They are:
 - CpuMathNative
   - Partial managed fallback when using NetCore 3.1.
   - A large amount of work would be required to port the native code to other platforms. We would have to change all the SIMD instructions for each platform.
   - A small amount of work required for a full managed fallback.
-  - This was created before we had hardware intrinsics so we used native code for performance. The managed fallback uses x86/x64 intrinsics where possible.
+  - This was created before we had hardware intrinsics so we used native code for performance. The managed fallback uses x86/x64 intrinsics if the hardware supports it, otherwise it has a plain managed fallback without intrinsics if needed.
 - FastTreeNative
   - Full managed fallback by changing a C# compiler flag. This flag is hardcoded to always use the native code.
   - Small amount of work required to change build process and verify it's correct.
-  - This was created before we had hardware intrinsics so we used native code for performance. The managed fallback does not use hardware intrinsics, so this will be slower than the native solution.
+  - This was created before we had hardware intrinsics so we used native code for performance. The managed fallback does not currently use hardware intrinsics, so this will be slower than the native solution. Hardware intrinsics could be added to improve performance.
 - LdaNative
   - No managed fallback, but builds successfully on non x86/x64 without changes.
   - Large amount of work to have a managed fallback. This solution is about 3000 lines of code not including any code from the dependencies.
@@ -94,7 +94,6 @@ As mentioned above, there are several 3rd party packages that don't have support
 ## 3 Possible Solutions
 We have several possible options we can use to resolve this:
- - Eliminate ML.NET native components and implement all functionality in managed code.
 - Keep ML.NET native components and rewrite them to avoid problematic dependencies completely.
 - Keep ML.NET native components and ifdef/rewrite to avoid problematic dependencies on platforms/architectures only where they are not supported.
 - Hybrid approach of replacement code and software fallback. This is the approach I recommend.

 None of these approaches resolve the 3rd party dependency issues. These solutions only deal with first party ML.NET code itself.

 I have lots more info about our dependency on x86/x64 in another document if required.

-### 3.1 Eliminate native components
-This is the most complicated solution and provides the least amount of short term benefit. Since x86/x64 run fine and gain performance benefits with these components, it doesn't make sense to spend the time to fully remove it. This is a possible solution, but not one that I would recommend, so I am not going to give more details on it unless we explicitly decide to go this route.
-
-### 3.2 Rewrite native components to work on other platforms
+### 3.1 Rewrite native components to work on other platforms
 This will still allow us to gain the benefits of the X86/x64 SIMD instructions and Intel MKL on architectures that support it but will also keep the benefits of native code in the other places. The downside is that we would have to build the native code for, potentially, a lot of different architectures.

 At a high level, this solution would require us to:
@@ -122,7 +118,7 @@ At a high level, this solution would require us to:

 I was unable to find all the replacements we would need. We would end up having to write many native methods ourselves with this approach.

-### 3.3 Rewrite native components to be only managed code
+### 3.2 Rewrite native components to be only managed code
 This will truly allow ML.NET to run anywhere that .NET runs. The only downside is the speed of execution, and the time to rewrite the existing native code we have. If we restrict new architectures to .NET core 3.1 or newer, we will have an easier time with the software fallbacks as some of this code has already been written. This solution will also require a lot of code rewrite from native code to managed code.

 At a high level, this solution would require us to:
@@ -135,7 +131,7 @@ At a high level, this solution would require us to:
 - MklProxyNative can be ignored. It is not needed with software fallbacks. We will need to modify the build to exclude it as needed.
 - SymSgdNative would need to be re-written.

-### 3.4 Hybrid Between Finding Replacements and Software Fallbacks
+### 3.3 Hybrid Between Finding Replacements and Software Fallbacks
 Since some of the native code already has replacements and some of the code already has software fallbacks, we can leverage this work by doing a hybrid between the prior 2 solutions.

 At a high level, this solution would require us to:
@@ -149,7 +145,7 @@ At a high level, this solution would require us to:
 - SymSgdNative needs to be either re-written in managed code, or re-write 4 Intel MKL methods. The 4 methods are just dealing with vector manipulation and shouldn't be hard to do.

 ## My Suggestion
-My suggestion would be to start with the hybrid approach. It will require the least amount of work to get ML.NET running elsewhere, while still being able to support a large majority of devices out of the gate. This solution will still limit the platforms we can run-on to what we build the native components for, initially Arm64 devices, but we can do a generic Arm64 compile so it should work for all Arm64 v8 devices. The goal is to eventually have a general purpose implementation which can work everywhere .NET does and accelerated components to increase performance where possible, such as running on Web Assembly and when .NET 6 comes out on mobile as well.
+My suggestion would be to start with the hybrid approach. It will require the least amount of work to get ML.NET running elsewhere, while still being able to support a large majority of devices out of the gate. This solution will still limit the platforms we can run on to what we build the native components for, initially Arm64 devices, but we can do a generic Arm64 compile so it should work for all Arm64 v8 devices. The goal is to eventually have a managed implementation which can work everywhere .NET does and accelerated components to increase performance where possible. This could include native components in some cases or hardware intrinsics in others.
 ### Improving managed fallback through intrinsics
 We should also target .NET 5 so that we gain access to the Arm64 intrinsics. Rather than implementing special-purpose native libraries to take advantage of architecture-specific instructions we should instead enhance performance by ensuring our managed implementations leverage intrinsics.
@@ -158,7 +154,7 @@ We should also target .NET 5 so that we gain access to the Arm64 intrinsics. Rat
 I think initially we should annotate that they don't work on non x86/x64 devices. This includes logging an error when they try and run an unsupported 3rd party dependency, and then failing gracefully with a helpful and descriptive error. The user should be able to compile the 3rd party dependency, for the ones that support it, and have ML.NET still be able to pick it up and run it if it exists. ONNX Runtime is something that we will probably want, but we can look more into this as we get requests for it in the future.

 ### Helix
-In order to fully test everything we need to, we would also need to change how we test to use the Helix testing servers. Currently, Helix doesn't have the capability to test Apple's new M1 code, but that is in the works.
+In order to fully test everything we need to, we would also need to change how we test to use the Helix testing servers. Currently, Helix doesn't have the capability to test Apple's new M1 code, but that is in the works. We will be building once for each architecture/platform combination, and then fan out and submit one job for each framework version we want to test. We will also be cross-targeting builds. For example we can build on normal Linux to target Arm Linux and then run those tests using Helix.

 ### Mobile Support
 .NET 6 will allow us to run natively on mobile. Since we are making these changes before .NET 6 is released, I propose we don't include that work as of yet. As long as we handle the native binaries correctly and make sure ML.NET provides descriptive error messages, we should be able to have mobile support as soon as .NET 6 releases for everything that currently has a software fallback. Since the native projects we are proposing to keep build for Arm64, they should work on mobile as well.

From ad3ca70acb4d8a120305c08cd3bade6b26b95782 Mon Sep 17 00:00:00 2001
From: Michael Sharp
Date: Mon, 22 Mar 2021 09:31:12 -0700
Subject: [PATCH 15/20] updates about helix

---
 docs/code/CrossPlatform.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md
index 36677f7870..758e8ec979 100644
--- a/docs/code/CrossPlatform.md
+++ b/docs/code/CrossPlatform.md
@@ -154,7 +154,7 @@ We should also target .NET 5 so that we gain access to the Arm64 intrinsics. Rat
 I think initially we should annotate that they don't work on non x86/x64 devices. This includes logging an error when they try and run an unsupported 3rd party dependency, and then failing gracefully with a helpful and descriptive error. The user should be able to compile the 3rd party dependency, for the ones that support it, and have ML.NET still be able to pick it up and run it if it exists. ONNX Runtime is something that we will probably want, but we can look more into this as we get requests for it in the future.

 ### Helix
-In order to fully test everything we need to, we would also need to change how we test to use the Helix testing servers. Currently, Helix doesn't have the capability to test Apple's new M1 code, but that is in the works. We will be building once for each architecture/platform combination, and then fan out and submit one job for each framework version we want to test. We will also be cross-targeting builds. For example we can build on normal Linux to target Arm Linux and then run those tests using Helix.
+In order to fully test everything we need to, we would also need to change how we test to use the Helix testing servers. Currently, Helix doesn't have the capability to test Apple's new M1 code, but that is in the works. We will be building once for each architecture/platform combination, and then fan out and submit one job for each framework version we want to test. We will also be cross-targeting builds. For example we can build on normal Linux to target Arm Linux and then run those tests using Helix. It's estimated to be about a medium amount of work to make the changes required to use Helix.

 ### Mobile Support
 .NET 6 will allow us to run natively on mobile. Since we are making these changes before .NET 6 is released, I propose we don't include that work as of yet. As long as we handle the native binaries correctly and make sure ML.NET provides descriptive error messages, we should be able to have mobile support as soon as .NET 6 releases for everything that currently has a software fallback. Since the native projects we are proposing to keep build for Arm64, they should work on mobile as well.

From 87c44f295b074294205ca59fab3eda7a2b7dcba4 Mon Sep 17 00:00:00 2001
From: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com>
Date: Tue, 30 Mar 2021 01:40:00 -0700
Subject: [PATCH 16/20] Update docs/code/CrossPlatform.md

Co-authored-by: Dan Moseley
---
 docs/code/CrossPlatform.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md
index 758e8ec979..2b61cc2f93 100644
--- a/docs/code/CrossPlatform.md
+++ b/docs/code/CrossPlatform.md
@@ -25,7 +25,7 @@ ML.NET is an open-source machine learning framework which makes machine learning

 ML.NET allows .NET developers to develop/train their own models and infuse custom machine learning into their applications using .NET.

-Currently, while .NET is able to run on many different platforms and architectures, ML.NET is only able to run on Windows, Mac, and Linux, either x86 or x64. This excludes many architectures such as Arm, Arm64, M1, and web assembly, places that .NET is currently able to run.
+Currently, while .NET is able to run on many different platforms and architectures, ML.NET is only able to run on Windows, Mac, and Linux, either x86 or x64. This excludes many architectures such as Arm, Arm64, Apple Silicon, and web assembly (WASM), places that .NET is currently able to run.

 The goal is to enable ML.NET to run everywhere that .NET itself is able to run.

From b5c901f20ed599ad5371a611f8c67c7fcf7697f1 Mon Sep 17 00:00:00 2001
From: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com>
Date: Tue, 30 Mar 2021 01:44:49 -0700
Subject: [PATCH 17/20] Update docs/code/CrossPlatform.md

Co-authored-by: Dan Moseley
---
 docs/code/CrossPlatform.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md
index 2b61cc2f93..0ef4abe186 100644
--- a/docs/code/CrossPlatform.md
+++ b/docs/code/CrossPlatform.md
@@ -35,7 +35,7 @@ There are several problems complicating us from moving to a fully cross-platform
 2. Native components must be explicitly built for additional architectures which we wish to support. This limits our ability to support new platforms without doing work.
 3. Building our native components for new platforms faces challenges due to lack of support for those components dependencies. This limits our ability to support the current set of platforms.
 4. Some of our external dependencies have limited support for .NET's supported platforms.
-5. Some things we use internally are optomized for x86/x64 and won't work well on other platforms. For example various components are parallelized but current webassembly targets are single threaded. It's likely some changes will be necessary to various algorithms to work well in these environments.
+5. Some things we use internally are optimized for x86/x64 and won't work well on other platforms. For example various components are parallelized but current web assembly targets are single threaded. It's likely some changes will be necessary to various algorithms to work well in these environments.

 ### 2.1 Problems
 ML.NET has a hard dependency on x86/x64. Some of the dependency is on Intel MKL, while other parts depend on x86/x64 SIMD instructions. To make things easier I will refer to these as just the x86/x64 dependencies. They are used to perform many optimized math functions and enable several transformers/trainers to be run natively for improved performance. The problem is that these dependencies can only run on x86/x64 machines and are the main blockers for expanding to other architectures. While you can run the managed code on other architectures, there is no good way to know which parts will run and which ones won't. This includes the build process as well, which currently has these same hard dependencies; building on non x86/x64 machines is not supported.

From 88e1508c7b070ea9d1a3ac78801311b16cf0bd42 Mon Sep 17 00:00:00 2001
From: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com>
Date: Tue, 30 Mar 2021 01:45:53 -0700
Subject: [PATCH 18/20] Update docs/code/CrossPlatform.md

Co-authored-by: Dan Moseley
---
 docs/code/CrossPlatform.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md
index 0ef4abe186..2c456fc690 100644
--- a/docs/code/CrossPlatform.md
+++ b/docs/code/CrossPlatform.md
@@ -56,7 +56,7 @@ Since ML.NET has a hard dependency on x86/x64, the managed code imports DLLs wit
 ### 2.4 Native Projects
 ML.NET has 6 native projects. They are:
 - CpuMathNative
-  - Partial managed fallback when using .NET Core 3.1.
+  - Partial managed fallback when using .NET Core 3.1 or later.
   - A large amount of work would be required to port the native code to other platforms. We would have to change all the SIMD instructions for each platform.
   - A small amount of work required for a full managed fallback.
   - This was created before we had hardware intrinsics so we used native code for performance. The managed fallback uses x86/x64 intrinsics if the hardware supports it, otherwise it has a plain managed fallback without intrinsics if needed.
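
The CpuMathNative notes above describe a managed fallback that uses x86/x64 intrinsics when the hardware supports them and plain managed code otherwise. As a rough illustration of that pattern, here is a minimal sketch, not ML.NET's actual CpuMath code: the class and method names are assumptions made up for this example, and only the `IsSupported`-guarded structure is the point. A helper targeting .NET 5 can branch on `Sse.IsSupported`/`AdvSimd.IsSupported` and still run everywhere:

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.Arm;
using System.Runtime.Intrinsics.X86;

// Illustrative sketch only: sum a span of floats using x86/x64 SSE when available,
// Arm64 AdvSimd when available, and a plain managed loop on everything else.
public static class CpuMathFallbackSketch
{
    public static float Sum(ReadOnlySpan<float> values)
    {
        int i = 0;
        float total = 0f;

        if (Sse.IsSupported || AdvSimd.IsSupported)
        {
            var acc = Vector128<float>.Zero;
            for (; i <= values.Length - 4; i += 4)
            {
                // Vector128.Create keeps the sketch free of unsafe pointer loads.
                var v = Vector128.Create(values[i], values[i + 1], values[i + 2], values[i + 3]);
                acc = Sse.IsSupported ? Sse.Add(acc, v) : AdvSimd.Add(acc, v);
            }

            // Horizontal reduce of the accumulator.
            for (int j = 0; j < 4; j++)
                total += acc.GetElement(j);
        }

        // Scalar path handles the remainder and any architecture without SIMD support.
        for (; i < values.Length; i++)
            total += values[i];

        return total;
    }
}
```

Because `IsSupported` is treated as a constant by the JIT, the unused branches are removed on each architecture, which is why a single managed implementation can stand in for several per-architecture native builds without giving up most of the performance.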
From 48d62ff2cfb635fff09cb58708f46a322b179b43 Mon Sep 17 00:00:00 2001
From: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com>
Date: Tue, 30 Mar 2021 02:06:36 -0700
Subject: [PATCH 19/20] Update docs/code/CrossPlatform.md

Co-authored-by: Dan Moseley
---
 docs/code/CrossPlatform.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md
index 2c456fc690..26f7c9dfc4 100644
--- a/docs/code/CrossPlatform.md
+++ b/docs/code/CrossPlatform.md
@@ -124,7 +124,7 @@ This will truly allow ML.NET to run anywhere that .NET runs. The only downside i
 At a high level, this solution would require us to:
 - Fix the build so it's not hardcoded to look for any native dependencies or binaries.
 - Fix the managed code so that if it can't find the native binaries, the code behaves correctly and performs the software fallback.
-  - CpuMathNative mostly has software fallbacks already in place for .NET core 3.1, so only a little work is needed.
+  - CpuMathNative mostly has software fallbacks already in place for .NET Core 3.1 and later, so only a little work is needed.
   - FastTreeNative has a flag for a software fallback. We would need to conditionally enable this. Alternatively, we could change the C# code so the software fallback is always enabled for the cases when it can't find the native binaries.
   - LdaNative would need to be re-written.
   - MatrixFactorizationNative would need to be re-written.

From ef6915f7696fda46462fbed9237bc82937484a12 Mon Sep 17 00:00:00 2001
From: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com>
Date: Tue, 30 Mar 2021 02:06:55 -0700
Subject: [PATCH 20/20] Update docs/code/CrossPlatform.md

Co-authored-by: Dan Moseley
---
 docs/code/CrossPlatform.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/code/CrossPlatform.md b/docs/code/CrossPlatform.md
index 26f7c9dfc4..7816919138 100644
--- a/docs/code/CrossPlatform.md
+++ b/docs/code/CrossPlatform.md
@@ -137,7 +137,7 @@ Since some of the native code already has replacements and some of the code alre
 At a high level, this solution would require us to:
 - Fix the build so it's not hardcoded to look for any native dependencies or binaries.
 - Fix the managed code so that if it can't find the native binaries, the code behaves correctly and performs the software fallback. This includes software fallbacks and/or descriptive error messages.
-  - CpuMathNative mostly has software fallbacks already in place for .NET core 3.1, so only a little work is needed.
+  - CpuMathNative mostly has software fallbacks already in place for .NET Core 3.1 and later, so only a little work is needed.
   - FastTreeNative has a flag for a software fallback. We would need to conditionally enable this. Alternatively, we could change the C# code so the software fallback is always enabled for the cases when it can't find the native binaries.
   - LdaNative builds just fine without x86/x64 dependencies, so no replacements would need to be found.
  - MatrixFactorizationNative builds with the "USESSE" flag that requires x86/x64. We can conditionally enable/disable that flag and no other work would be required.
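
As a companion to the "fix the managed code so it behaves correctly when it can't find the native binaries" items in the last two patches, the sketch below shows one way that probing could look in managed code. It is an illustrative example under stated assumptions, not ML.NET's actual import layer: the `SumupFloat` entry point and the method shapes are made up for this sketch, and only the pattern of checking with `NativeLibrary.TryLoad` and routing to a managed fallback is the point.

```csharp
using System;
using System.Runtime.InteropServices;

// Illustrative sketch only: probe for the FastTreeNative binary once, and route to a
// managed fallback when it is missing instead of letting a DllImport throw
// DllNotFoundException at an arbitrary point later on.
public static class FastTreeNativeProbe
{
    // Hypothetical entry point name, used only for this example.
    [DllImport("FastTreeNative", EntryPoint = "SumupFloat")]
    private static extern double SumupNative(float[] values, int count);

    private static readonly bool s_nativeAvailable =
        NativeLibrary.TryLoad("FastTreeNative", typeof(FastTreeNativeProbe).Assembly, null, out _);

    public static double Sumup(float[] values)
    {
        if (s_nativeAvailable)
            return SumupNative(values, values.Length);

        // Managed fallback keeps the component usable on platforms where the native
        // binary was never built (for example Arm or WASM), at reduced speed.
        double total = 0;
        foreach (float v in values)
            total += v;
        return total;
    }
}
```

The same probe result could also drive the descriptive error messages mentioned above for components that have no managed fallback at all, so users on unsupported platforms get a clear failure instead of a raw missing-DLL exception.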