|  | 
|  | 1 | +<!-- markdown-toc start - Don't edit this section. Run M-x markdown-toc-refresh-toc --> | 
|  | 2 | +**Table of Contents** | 
|  | 3 | + | 
|  | 4 | +- [Assembly Store format and purpose](#assembly-store-format-and-purpose) | 
|  | 5 | +    - [Rationale](#rationale) | 
|  | 6 | +- [Store kinds and locations](#store-kinds-and-locations) | 
|  | 7 | +- [Store format](#store-format) | 
|  | 8 | +    - [Common header](#common-header) | 
|  | 9 | +    - [Assembly descriptor table](#assembly-descriptor-table) | 
|  | 10 | +    - [Index store](#index-store) | 
|  | 11 | +        - [Hash table format](#hash-table-format) | 
|  | 12 | + | 
|  | 13 | +<!-- markdown-toc end --> | 
|  | 14 | + | 
|  | 15 | +# Assembly Store format and purpose | 
|  | 16 | + | 
|  | 17 | +Assembly stores are binary files which contain the managed | 
|  | 18 | +assemblies, their debug data (optionally) and the associated config | 
|  | 19 | +file (optionally).  They are placed inside the Android APK/AAB | 
|  | 20 | +archives, replacing individual assemblies/pdb/config files. | 
|  | 21 | + | 
|  | 22 | +Assembly stores are an optional form of assembly storage in the | 
|  | 23 | +archive.  They can be used in all build configurations **except** when | 
|  | 24 | +Fast Deployment is in effect (in which case assemblies aren't placed | 
|  | 25 | +in the archives at all; they are instead synchronized from the host to | 
|  | 26 | +the device/emulator filesystem). | 
|  | 27 | + | 
|  | 28 | +## Rationale | 
|  | 29 | + | 
|  | 30 | +During native startup, the Xamarin.Android runtime looks inside the | 
|  | 31 | +application APK file for the managed assemblies (and their associated | 
|  | 32 | +pdb and config files, if applicable) in order to map them (using the | 
|  | 33 | +`mmap(2)` call) into memory so that they can be given to the Mono | 
|  | 34 | +runtime when it requests that a given assembly be loaded.  The reason for | 
|  | 35 | +the memory mapping is that, as far as Android is concerned, managed | 
|  | 36 | +assembly files are just data/resources and, thus, aren't extracted to | 
|  | 37 | +the filesystem.  As a result, Mono wouldn't be able to find the | 
|  | 38 | +assemblies by scanning the filesystem - the host application | 
|  | 39 | +(Xamarin.Android) must give it a hand in finding them. | 
|  | 40 | + | 
|  | 41 | +Applications can contain hundreds of assemblies (for instance a Hello | 
|  | 42 | +World MAUI application currently contains over 120 assemblies) and | 
|  | 43 | +each of them would have to be mmapped at startup, together with its | 
|  | 44 | +pdb and config files, if found.  This not only costs time (each `mmap` | 
|  | 45 | +invocation is a system call) but it also makes the assembly discovery | 
|  | 46 | +an O(n) algorithm, which takes more time as more assemblies are added | 
|  | 47 | +to the APK/AAB archive. | 
|  | 48 | + | 
|  | 49 | +An assembly store, however, needs to be mapped only once and any | 
|  | 50 | +further operations are merely pointer arithmetic, making the process | 
|  | 51 | +not only faster but also reducing the algorithm complexity to O(1). | 
|  | 52 | + | 
|  | 53 | +# Store kinds and locations | 
|  | 54 | + | 
|  | 55 | +Each application will contain at least one assembly store, holding the | 
|  | 56 | +assemblies that are architecture-agnostic, and any number of | 
|  | 57 | +architecture-specific stores.  dotnet ships with a handful of | 
|  | 58 | +assemblies that **are** architecture-specific - those assemblies are | 
|  | 59 | +placed in an architecture-specific store, one per architecture | 
|  | 60 | +supported by and enabled for the application.  At execution time, | 
|  | 61 | +the Xamarin.Android runtime will always map the architecture-agnostic | 
|  | 62 | +store and one, and **only** one, of the architecture-specific stores. | 
|  | 63 | + | 
|  | 64 | +Stores are placed in the same location in the APK/AAB archive where the | 
|  | 65 | +individual assemblies traditionally live, the `assemblies/` (for APK) | 
|  | 66 | +and `base/root/assemblies/` (for AAB) folders. | 
|  | 67 | + | 
|  | 68 | +The architecture-agnostic store is always named `assemblies.blob` while | 
|  | 69 | +the architecture-specific one is called `assemblies.[ARCH].blob`. | 
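The naming and location rules above can be sketched in a few lines of C++. The helper names `store_file_name` and `store_archive_path` are hypothetical, and the architecture string used below is only an illustration, since this document doesn't list the actual `[ARCH]` values.

```cpp
#include <string>

// Hypothetical helper: builds the store file name for a given architecture.
// An empty `arch` selects the architecture-agnostic store.
std::string store_file_name(const std::string& arch = "")
{
    if (arch.empty()) {
        return "assemblies.blob";
    }
    return "assemblies." + arch + ".blob";
}

// Hypothetical helper: path of a store inside an APK or an AAB archive,
// following the `assemblies/` and `base/root/assemblies/` convention.
std::string store_archive_path(bool is_aab, const std::string& arch = "")
{
    const std::string dir = is_aab ? "base/root/assemblies/" : "assemblies/";
    return dir + store_file_name(arch);
}
```

For example, `store_archive_path(false)` yields `assemblies/assemblies.blob`.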
|  | 70 | + | 
|  | 71 | +Each APK in the application (e.g. the future Feature APKs) **may** | 
|  | 72 | +contain the above two assembly store files (some APKs may contain only | 
|  | 73 | +resources, others may contain only native libraries, etc.). | 
|  | 74 | + | 
|  | 75 | +Currently, Xamarin.Android applications will produce only one set of | 
|  | 76 | +stores but when Xamarin.Android adds support for Android Features, each | 
|  | 77 | +feature APK will contain its own set of stores.  All of the APKs will | 
|  | 78 | +follow the location, format and naming conventions described above. | 
|  | 79 | + | 
|  | 80 | +# Store format | 
|  | 81 | + | 
|  | 82 | +Each store is a structured binary file, using little-endian byte order | 
|  | 83 | +and aligned to a byte boundary.  Each store consists of a header, an | 
|  | 84 | +assembly descriptor table and, optionally (see below), two tables with | 
|  | 85 | +assembly name hashes.  All the stores are assigned a unique ID, with | 
|  | 86 | +the store having ID equal to `0` being the [Index store](#index-store). | 
|  | 87 | + | 
|  | 88 | +Assemblies are stored as adjacent byte streams: | 
|  | 89 | + | 
|  | 90 | + - **Image data** | 
|  | 91 | +   Required to be present for all assemblies, contains the actual | 
|  | 92 | +   assembly PE image. | 
|  | 93 | + - **Debug data** | 
|  | 94 | +   Optional. Contains the assembly's PDB or MDB debug data. | 
|  | 95 | + - **Config data** | 
|  | 96 | +   Optional. Contains the assembly's .config file. Config data | 
|  | 97 | +   **must** be terminated with a `NUL` character (`0`); this makes | 
|  | 98 | +   the runtime code slightly more efficient. | 
|  | 99 | + | 
|  | 100 | +All the structures described here are defined in the | 
|  | 101 | +[`xamarin-app.hh`](../../src/monodroid/jni/xamarin-app.hh) file. | 
|  | 102 | +Should there be any difference between this document and the | 
|  | 103 | +structures in the header file, the header file is the one that | 
|  | 104 | +should be trusted. | 
|  | 105 | + | 
|  | 106 | +## Common header | 
|  | 107 | + | 
|  | 108 | +All kinds of stores share the following header format: | 
|  | 109 | + | 
|  | 110 | +    struct AssemblyStoreHeader | 
|  | 111 | +    { | 
|  | 112 | +        uint32_t magic; | 
|  | 113 | +        uint32_t version; | 
|  | 114 | +        uint32_t local_entry_count; | 
|  | 115 | +        uint32_t global_entry_count; | 
|  | 116 | +        uint32_t store_id; | 
|  | 117 | +    }; | 
|  | 118 | + | 
|  | 119 | +Individual fields have the following meanings: | 
|  | 120 | + | 
|  | 121 | + - `magic`: has the value of 0x41424158 (the bytes `XABA` when stored in little-endian order) | 
|  | 122 | + - `version`: a value increased every time the assembly store format changes. | 
|  | 123 | + - `local_entry_count`: number of assemblies stored in this assembly | 
|  | 124 | +   store (also the number of entries in the assembly descriptor | 
|  | 125 | +   table, see below) | 
|  | 126 | + - `global_entry_count`: number of entries in the index store's (see | 
|  | 127 | +   below) hash tables and, thus, the number of assemblies stored in | 
|  | 128 | +   **all** of the assembly stores across **all** of the application's | 
|  | 129 | +   APK files.  All the other assembly stores have `0` in this field | 
|  | 130 | +   since they do **not** have the hash tables. | 
|  | 131 | + - `store_id`: a unique ID of this store. | 
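As an illustration of how the header fields might be consumed, here is a minimal sketch that copies the header out of a mapped store and checks the magic value. The function `read_store_header` and the constant `ASSEMBLY_STORE_MAGIC` are hypothetical names, and the code assumes a little-endian host so the on-disk fields can be copied verbatim. It isn't the runtime's actual implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct AssemblyStoreHeader
{
    uint32_t magic;
    uint32_t version;
    uint32_t local_entry_count;
    uint32_t global_entry_count;
    uint32_t store_id;
};

// 0x41424158 is laid out as the bytes 'X','A','B','A' in a little-endian file.
constexpr uint32_t ASSEMBLY_STORE_MAGIC = 0x41424158;

// Hypothetical helper: copies the header from the start of a mapped store.
// Returns false if the buffer is too small or the magic value doesn't match.
// Assumes a little-endian host (no byte swapping is performed).
bool read_store_header(const uint8_t* data, size_t size, AssemblyStoreHeader& out)
{
    if (data == nullptr || size < sizeof(AssemblyStoreHeader)) {
        return false;
    }
    std::memcpy(&out, data, sizeof(out));
    return out.magic == ASSEMBLY_STORE_MAGIC;
}
```

Copying with `memcpy` instead of casting the mapped pointer avoids any alignment assumptions about where the store was mapped.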
|  | 132 | +  | 
|  | 133 | +## Assembly descriptor table | 
|  | 134 | + | 
|  | 135 | +Each store header is followed by a table of | 
|  | 136 | +`AssemblyStoreHeader.local_entry_count` entries, each entry | 
|  | 137 | +defined by the following structure: | 
|  | 138 | + | 
|  | 139 | +    struct AssemblyStoreAssemblyDescriptor | 
|  | 140 | +    { | 
|  | 141 | +        uint32_t data_offset; | 
|  | 142 | +        uint32_t data_size; | 
|  | 143 | +        uint32_t debug_data_offset; | 
|  | 144 | +        uint32_t debug_data_size; | 
|  | 145 | +        uint32_t config_data_offset; | 
|  | 146 | +        uint32_t config_data_size; | 
|  | 147 | +    }; | 
|  | 148 | + | 
|  | 149 | +Only the `data_offset` and `data_size` fields must have a non-zero | 
|  | 150 | +value, other fields describe optional data and can be set to `0`.  | 
|  | 151 | + | 
|  | 152 | +Individual fields have the following meanings: | 
|  | 153 | + | 
|  | 154 | +  - `data_offset`: offset of the assembly image data from the | 
|  | 155 | +    beginning of the store file | 
|  | 156 | +  - `data_size`: number of bytes of the image data | 
|  | 157 | +  - `debug_data_offset`: offset of the assembly's debug data from the | 
|  | 158 | +    beginning of the store file. A value of `0` indicates there's no | 
|  | 159 | +    debug data for this assembly. | 
|  | 160 | +  - `debug_data_size`: number of bytes of debug data. Can be `0` only | 
|  | 161 | +    if `debug_data_offset` is `0` | 
|  | 162 | +  - `config_data_offset`: offset of the assembly's config file data | 
|  | 163 | +    from the beginning of the store file. A value of `0` indicates | 
|  | 164 | +    there's no config file data for this assembly. | 
|  | 165 | +  - `config_data_size`: number of bytes of config file data. Can be | 
|  | 166 | +    `0` only if `config_data_offset` is `0` | 
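The descriptor fields map directly onto the "pointer arithmetic" mentioned in the rationale. The sketch below resolves one descriptor against the base address of a mapped store. `AssemblyView`, `stream_at` and `resolve_assembly` are hypothetical names introduced for illustration, and the bounds checks are an addition of this sketch, not something this document requires.

```cpp
#include <cstddef>
#include <cstdint>

struct AssemblyStoreAssemblyDescriptor
{
    uint32_t data_offset;
    uint32_t data_size;
    uint32_t debug_data_offset;
    uint32_t debug_data_size;
    uint32_t config_data_offset;
    uint32_t config_data_size;
};

// Hypothetical view of one assembly's streams inside a mapped store.
struct AssemblyView
{
    const uint8_t* image;  size_t image_size;
    const uint8_t* debug;  size_t debug_size;   // nullptr/0 when absent
    const uint8_t* config; size_t config_size;  // nullptr/0 when absent
};

// Returns base + offset, or nullptr when the stream is absent (offset 0)
// or doesn't fit inside the mapped store.
static const uint8_t* stream_at(const uint8_t* base, size_t store_size,
                                uint32_t offset, uint32_t size)
{
    if (offset == 0 || offset > store_size || size > store_size - offset) {
        return nullptr;
    }
    return base + offset;
}

// Fills `out` from the descriptor; fails if the mandatory image data is missing.
bool resolve_assembly(const uint8_t* base, size_t store_size,
                      const AssemblyStoreAssemblyDescriptor& d, AssemblyView& out)
{
    out.image = stream_at(base, store_size, d.data_offset, d.data_size);
    if (out.image == nullptr || d.data_size == 0) {
        return false;
    }
    out.image_size = d.data_size;
    out.debug = stream_at(base, store_size, d.debug_data_offset, d.debug_data_size);
    out.debug_size = out.debug != nullptr ? d.debug_data_size : 0;
    out.config = stream_at(base, store_size, d.config_data_offset, d.config_data_size);
    out.config_size = out.config != nullptr ? d.config_data_size : 0;
    return true;
}
```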
|  | 167 | + | 
|  | 168 | +## Index store | 
|  | 169 | + | 
|  | 170 | +Each application will contain exactly one store with a global index - | 
|  | 171 | +two tables with assembly name hashes.  All the other stores **do not** | 
|  | 172 | +contain these tables.  Two hash tables are necessary because hashes | 
|  | 173 | +for 32-bit and 64-bit devices are different. | 
|  | 174 | + | 
|  | 175 | +The hash tables follow the [Assembly descriptor | 
|  | 176 | +table](#assembly-descriptor-table) and precede the individual assembly | 
|  | 177 | +streams. | 
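The ordering of the header, descriptor table, hash tables and assembly streams can be turned into offsets. The sketch below assumes the structures are packed with no padding (consistent with "aligned to a byte boundary"), giving sizes of 20, 24 and 20 bytes derived from the field widths. Both `StoreLayout` and `compute_layout` are hypothetical names, and the sketch makes no claim about which of the two hash tables comes first.

```cpp
#include <cstdint>

// Sizes derived from the field widths, assuming packed structures:
constexpr uint64_t HEADER_SIZE     = 5 * 4;      // five uint32_t fields
constexpr uint64_t DESCRIPTOR_SIZE = 6 * 4;      // six uint32_t fields
constexpr uint64_t HASH_ENTRY_SIZE = 8 + 3 * 4;  // 64-bit union + three uint32_t

// Hypothetical summary of where each region of a store begins.
struct StoreLayout
{
    uint64_t descriptors_offset; // start of the assembly descriptor table
    uint64_t index_offset;       // start of the hash tables (index store only)
    uint64_t streams_offset;     // earliest offset at which assembly streams can begin
};

// `global_entry_count` is 0 for every store except the index store, in which
// case the hash tables occupy no space and the streams follow the descriptors.
StoreLayout compute_layout(uint32_t local_entry_count, uint32_t global_entry_count)
{
    StoreLayout layout;
    layout.descriptors_offset = HEADER_SIZE;
    layout.index_offset = layout.descriptors_offset + local_entry_count * DESCRIPTOR_SIZE;
    // Two tables (32-bit and 64-bit hashes), each with global_entry_count entries.
    layout.streams_offset = layout.index_offset + 2 * global_entry_count * HASH_ENTRY_SIZE;
    return layout;
}
```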
|  | 178 | + | 
|  | 179 | +Placing the hash tables in a single index store, while "wasting" a | 
|  | 180 | +certain amount of memory (since 32-bit devices won't use the 64-bit | 
|  | 181 | +table and vice versa), makes for a simpler and faster runtime | 
|  | 182 | +implementation, and the amount of memory wasted isn't big (1000 | 
|  | 183 | +assemblies need two tables which are 8kb long each, the unused one | 
|  | 184 | +being the amount of memory wasted). | 
|  | 185 | + | 
|  | 186 | +### Hash table format | 
|  | 187 | + | 
|  | 188 | +Both tables share the same format, despite the hashes themselves being | 
|  | 189 | +of different sizes.  This is done to make handling of the tables | 
|  | 190 | +easier on the runtime. | 
|  | 191 | + | 
|  | 192 | +Each entry contains, among other fields, the assembly name hash.  In | 
|  | 193 | +case of satellite assemblies, the assembly culture (e.g. `en/` or | 
|  | 194 | +`fr/`) is treated as part of the assembly name, thus resulting in a | 
|  | 195 | +unique hash. The hash value is obtained using the | 
|  | 196 | +[xxHash](https://cyan4973.github.io/xxHash/) algorithm and is | 
|  | 197 | +calculated **without** including the `.dll` extension.  This is done | 
|  | 198 | +for runtime efficiency as the vast majority of Mono requests to load | 
|  | 199 | +an assembly do not include the `.dll` suffix, thus saving us the time of | 
|  | 200 | +appending it in order to generate the hash for index lookup. | 
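To make the naming rule concrete, the sketch below derives the string that would be fed to the hash function: the assembly name with any culture prefix kept and a trailing `.dll` removed. The helper `index_name` is a hypothetical name, the suffix match is assumed to be case-sensitive, and the xxHash call itself is left out to keep the example free of external dependencies.

```cpp
#include <string>

// Hypothetical helper: returns the name that would be hashed for the index.
// The culture prefix of a satellite assembly (e.g. "fr/") stays part of the
// name; only a trailing ".dll" is dropped.
std::string index_name(const std::string& assembly_name)
{
    static const std::string suffix = ".dll";
    if (assembly_name.size() >= suffix.size() &&
        assembly_name.compare(assembly_name.size() - suffix.size(),
                              suffix.size(), suffix) == 0) {
        return assembly_name.substr(0, assembly_name.size() - suffix.size());
    }
    return assembly_name;
}
```

With this rule, `Mono.Android.dll` and `Mono.Android` both map to the same index name, while `fr/MyApp.resources.dll` and `en/MyApp.resources.dll` remain distinct.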
|  | 201 | + | 
|  | 202 | +Each entry is represented by the following structure: | 
|  | 203 | + | 
|  | 204 | +    struct AssemblyStoreHashEntry | 
|  | 205 | +    { | 
|  | 206 | +        union { | 
|  | 207 | +            uint64_t hash64; | 
|  | 208 | +            uint32_t hash32; | 
|  | 209 | +        }; | 
|  | 210 | +        uint32_t mapping_index; | 
|  | 211 | +        uint32_t local_store_index; | 
|  | 212 | +        uint32_t store_id; | 
|  | 213 | +    }; | 
|  | 214 | + | 
|  | 215 | +Individual fields have the following meanings: | 
|  | 216 | + | 
|  | 217 | + - `hash64`/`hash32`: the 64-bit or 32-bit hash of the assembly's name | 
|  | 218 | +   **without** the `.dll` suffix | 
|  | 219 | + - `mapping_index`: index into a compile-time generated array of | 
|  | 220 | +   assembly data pointers.  This is a global index, unique across | 
|  | 221 | +   **all** the APK files comprising the application. | 
|  | 222 | + - `local_store_index`: index into the containing assembly store's [Assembly descriptor table](#assembly-descriptor-table), | 
|  | 223 | +   pointing at the entry describing the assembly. | 
|  | 224 | + - `store_id`: ID of the assembly store containing the assembly | 
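One way a lookup over such a table could work is sketched below. It assumes the entries are sorted in ascending order by hash so that a binary search applies. That ordering is an assumption of this sketch and isn't stated in this document. The function name `find_assembly` is likewise hypothetical, and only the 64-bit table is shown.

```cpp
#include <cstddef>
#include <cstdint>

struct AssemblyStoreHashEntry
{
    union {
        uint64_t hash64;
        uint32_t hash32;
    };
    uint32_t mapping_index;
    uint32_t local_store_index;
    uint32_t store_id;
};

// Hypothetical lookup: binary search for `hash` in the 64-bit table.
// Assumes `table` is sorted ascending by hash64 (an assumption of this sketch).
// Returns nullptr when no entry matches.
const AssemblyStoreHashEntry* find_assembly(const AssemblyStoreHashEntry* table,
                                            size_t count, uint64_t hash)
{
    size_t lo = 0;
    size_t hi = count;
    while (lo < hi) {
        const size_t mid = lo + (hi - lo) / 2;
        if (table[mid].hash64 < hash) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    if (lo < count && table[lo].hash64 == hash) {
        return &table[lo];
    }
    return nullptr;
}
```

A matching entry then identifies the store through `store_id` and the descriptor within that store through `local_store_index`.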