Skip to content

Conversation

@BretJohnson
Copy link

Update the BuildApk logic so that:

  • Resources are added last, not first. That way when an APK is first built, all the resources are at the end, after native libraries, etc.
  • Modified resources are moved from their current position in the zip to the end of the APK. We make libzip do that by deleting the entry before we re-add it, so that it doesn't overwrite it's current index.

For a typical dev inner loop scenario, assemblies are outside the APK (with fast deploy) and the APK contains just native libraries and resources. Of these, the native libraries tend not to change, while resources can change frequently as the customer updates their UI. By keeping frequently changing parts of the APK at the end, we minimize the amount of bytes that need to get updated by delta install. Delta install (and Apply Changes) updates any changed item in the APK and everything after that item (since the offsets of everything after change too when the item changed in size).

Android Studio follows a similar approach, where updated parts of the APK are moved to the end. Though Android Studio does that somewhat differently, using zipflinger, which unlike us allows gaps in the APK, so it minimizes items moving around even more. Perhaps one day we'll use zipflinger too, with that exact same algorithm, but for now we approximate it with the above, while sticking with libzip.

Bret Johnson added 3 commits May 7, 2020 00:35
Fix to delete the resource with the output APK zip index, not the input APK
Updated logging, including separate message for added vs updated resources
Copy link
Member

@jonathanpeppers jonathanpeppers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main thing missing here is some tests, can we expand upon this test and make it verify the zip file comes out as we expect after an incremental build?

https://github.com/xamarin/xamarin-android/blob/7dab4ee432dcd1e51c52dfe579023fccaf09c471/src/Xamarin.Android.Build.Tasks/Tests/Xamarin.Android.Build.Tests/IncrementalBuildTest.cs#L1084-L1106

It might also be worth a test doing add/update/delete of an .axml file.

@BretJohnson
Copy link
Author

@jonathanpeppers and @dellis1972 - I added tests here. Can you both please take a look, especially at the the TODO item in the test (which may or may not be a bug).
Otherwise, I believe this PR is good to merge, when you both are comfortable. Thanks.

Copy link
Member

@jonathanpeppers jonathanpeppers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I restarted Windows Build and Smoke Test, hopefully that will be green when finished.

}
}

HashSet<ulong> deletedEntries = new HashSet<ulong> ();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a ulong instead of a long, when .EntryCount is long?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grendello - do you know if there's a rationale for that? maybe EntryCount can be -1 in some case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I remember, the rationale for EntryCount being long was some sort of compatibility with BCL's ZipArchive (which doesn't have EntryCount but has a collection which has a signed integer Count property). At the same time, I always avoid using signed integers whenever a size or an index is concerned - on the notion that we have no negative indices and we have no negative sizes (or at least - they make no sense). Thus using an unsigned integer limits the size of the error domain (index can be too big, but can't be too small - one less thing that can go wrong)

// dev scenario where the user is editing a few resources but most of the APK contents (e.g. the native libs)
// don't change and we want to keep their byte offset in the APK fixed. Delta install need not update APK
// contents that don't change and don't move.
apk.Archive.DeleteEntry ((ulong) entryIndexInOutput);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grendello: I don't understand the semantics here. What is entryIndexInOutput. It comes from (apk.Archive.ContainsEntry(entry.FullName, out entryIndexInOutput), but what is it, really?

An "index in the zip file", presumably?

Furthermore, why doesn't apk.Archive.DeleteEntry() invalidate any previously obtained entryIndexInOutput values? Is entryIndexInOutput guaranteed to always increase, even when "previous" entries in the zip have been removed?

This implies that previously obtained entryIndexInOutput values won't change, otherwise it wouldn't work: https://github.com/xamarin/xamarin-android/pull/4659/files#diff-6502ef3162df5eefc5426881ce336007R131

But if that's the case, when do the entryIndexInOutput values change? When the file is saved?

Doesn't ZipArchiveEx implicitly close and re-open the underlying zip "occasionally"? See e.g. 296d854.

https://github.com/xamarin/xamarin-android/blob/3f438e46d7b166a3a3ef54c9ffafb5f426760468/src/Xamarin.Android.Build.Tasks/Utilities/ZipArchiveEx.cs#L30-L38
https://github.com/xamarin/xamarin-android/blob/3f438e46d7b166a3a3ef54c9ffafb5f426760468/src/Xamarin.Android.Build.Tasks/Utilities/ZipArchiveEx.cs#L72-L75
I really don't fully understand the intended semantics here, but what I think I understand, I'm not sure I'm comfortable with.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Answering some of those questions:

Yes the index matches the "index in the zip" and always increases (until the zip is written out). The index is managed by the underlying C libzip code. This is basically the algorithm, from what I observed: When the zip is opened, the index matches the position in the zip file (0 is at the beginning, the max value at the end). While the zip is being modified in memory by libzip, new entries are assigned an index of "previousMaxIndex + 1", so the index always increases and deleted entries keep their current index number (so there's a hole). And then when the zip is written & closed, entries are written in index order with deleted entries skipped, so when the file is reopened the index numbers can be different than before (since deleted entries have their index number reused by whatever non-deleted entry came after - no more holes).

FYI, I was using unzip -Zv <zipfile> to dump the index of the zip and see what order things are in. The index number reported by unzip -Zv is the same as managed by libzip, except that unzip reports indexes as being 1 based while libzip has them as 0 based. But conceptually they are the same, indicating the relative order of entries in the zip.

And yes, Flush can change the index numbers, but fortunately it's not called here.

Anyway, all that said, it would probably be cleaner to update LibZipSharp so that the enumerator can cope with deletions, skipping them instead of throwing an exception. Adding a ReadEntry overload that doesn't throw on deleted items and then calling it from https://github.com/xamarin/LibZipSharp/blob/a0973d4a861bf683cf225bc67111b249999673ce/ZipEntryEnumerator.cs#L59 is probably the best way to do that.

@grendello - would you want to take a look at making that change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonpryor what @BretJohnson described is what happens. Basically, libzip maintains only an in-memory image of the zip archive which, until flushed, won't cause problems with indexes being changed underneath.

@BretJohnson I'd be happy to review a PR, frankly, at this moment in time... :) So unless it can wait, I'm not the right person to do it right now... :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BretJohnson wrote:

And yes, Flush can change the index numbers, but fortunately it's not called here.

ZipArchiveEx.AddFiles() calls Flush(). Adding a file may call Flush().

Which means for the entire "scope" between deletedEntries = new HashSet… and apk.FixupWindowsPathSeparators(deletedEntries…), apk.Archive.Add*() must not be called.

It isn't currently called, but this constraint is not at all obvious, and thus could be easily broken in the future.

At minimum that "scope" should be moved to a new method, with adequate documentation that no entries can be added to apk.Archive.

"Better" would be to remove the need for deletedEntries in the first place, which would require libZipSharp changes, but that's not necessarily a bad thing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grendello @jonpryor - I updated LibZipSharp to handle this. Can you please review that PR, here: dotnet/android-libzipsharp#53

After we merge that we'll bump libZipSharp in this PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this workaround code, with the LibZipSharp update to now skip deleted entries when enumerating. And I bumped xamarin-android to use the new LibZipSharp version 1.0.12, which isn't (yet) published.

@dellis1972 - Can you publish the new LibZipSharp release, from the latest LibZipSharp commit. Again, it should be 1.0.12 (as I bumped the version on the LibZipSharp side). That should be the last thing needed here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1.0.13 of LibZipSharp has been published

Bret Johnson added 3 commits May 15, 2020 16:08
For this code to all work properly it requires the latest LibZipSharp, 1.0.12
With that, our workaround is no longer required to skip deleted entries when enumerating (which would otherwise throw an exception).
For this code to all work properly it requires the latest LibZipSharp, 1.0.12, which is bumped here.
With that, our workaround is no longer required to skip deleted entries when enumerating (which would otherwise throw an exception).
@jonpryor
Copy link
Contributor

It feels like part of the rationale to this PR is "speed up inner dev loop." It's thus troubling that many of the Build_*_Change tests are failing: https://devdiv.visualstudio.com/DevDiv/_build/results?buildId=3736767&view=ms.vss-test-web.build-test-results-tab&runId=13277742&paneView=debug&resultId=100090

Exceeded expected time of 9400ms, actual 9730ms
Exceeded expected time of 9800ms, actual 10760ms
Exceeded expected time of 9500ms, actual 11180ms
Exceeded expected time of 4150ms, actual 4580ms
Exceeded expected time of 10250ms, actual 11360ms

Some of these tests are seeing over 1.5s in increases!

We're also seeing lots of native crashes on the build machine, e.g. Xamarin.Android.Build.Tests.BuildTest.BuildAfterAddingNuget:

=================================================================
	Native Crash Reporting
=================================================================
Got a segv while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries 
used by your application.
=================================================================
…

=================================================================
	Managed Stacktrace:
=================================================================
	  at <unknown> <0xffffffff>
	  at Xamarin.Tools.Zip.Native:zip_stat_index <0x000b3>
	  at Xamarin.Tools.Zip.ZipArchive:ReadEntry <0x0009a>
	  at Xamarin.Tools.Zip.ZipArchive:ReadEntry <0x00052>
	  at Xamarin.Tools.Zip.ZipArchive:AddStream <0x001ea>
	  at Xamarin.Tools.Zip.ZipArchive:AddFile <0x00272>
	  at Xamarin.Android.Tasks.BuildApk:AddFileToArchiveIfNewer <0x0018a>
	  at Xamarin.Android.Tasks.BuildApk:AddAssemblies <0x003aa>
	  at Xamarin.Android.Tasks.BuildApk:ExecuteWithAbi <0x00892>
	  at Xamarin.Android.Tasks.BuildApk:RunTask <0x00262>

This is very troubling.

Windows doesn't necessarily fare any better; Xamarin.Android.Build.Tests.BuildTest.BuildBasicApplicationReleaseFSharp:

error XADJL7000: System.BadImageFormatException: An attempt was made to load a program with an incorrect format. (Exception from HRESULT: 0x8007000B)
error XADJL7000:    at Xamarin.Tools.Zip.Native.zip_open(IntPtr path, OpenFlags flags, ErrorCode& errorp)
error XADJL7000:    at Xamarin.Tools.Zip.Native.zip_open(String path, OpenFlags flags, ErrorCode& errorp) in /Users/runner/runners/2.168.2/work/1/s/Native.cs:line 84
error XADJL7000:    at Xamarin.Tools.Zip.ZipArchive.Open(String path, OpenFlags flags) in /Users/runner/runners/2.168.2/work/1/s/ZipArchive.cs:line 158
error XADJL7000:    at Xamarin.Tools.Zip.ZipArchive.Open(String path, FileMode mode, String defaultExtractionDir, Boolean strictConsistencyChecks, IPlatformOptions options) in /Users/runner/runners/2.168.2/work/1/s/ZipArchive.cs:line 227
error XADJL7000:    at Xamarin.Android.Tools.Files.ZipAny(String filename, Func`2 filter)
error XADJL7000:    at Xamarin.Android.Tasks.DetermineJavaLibrariesToCompile.HasClassFiles(String jar)
error XADJL7000:    at Xamarin.Android.Tasks.DetermineJavaLibrariesToCompile.RunTask()
error XADJL7000:    at Xamarin.Android.Tasks.AndroidTask.Execute()

Something is FUBAR here.

}
}

if (apkInputPathExists) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this entire block not be refactored into a separate method? The levels of indentation here is getting…high.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@BretJohnson
Copy link
Author

As for the crashes (at least on Windows), that should be fixed by the LibZipSharp update to use the right 32 bit libzip.dll. See https://xamarinhq.slack.com/archives/C03CEGRUW/p1589925476196100

As for the timings, let's see where we're at with the fix above.
I also wonder if bumping the version of libzip may have affected the timings. cc @grendello
We should dig more & maybe time that part separately.

…tead for now.

It seems simpler/better to keep the LibZipSharp change separate, as it also involves moving to a newer version of libzip and (potentially) switching the compiler used on Windows.  LibZipSharp can be updated  when Markek & Dean feel comfortable, with a separate PR. See more context here https://xamarinhq.slack.com/archives/C03CEGRUW/p1590002325342000
@BretJohnson
Copy link
Author

I switched back to using our workaround & not moving to a new LibZipSharp (not yet), since it seems better to keep those separate.
See more context here: https://xamarinhq.slack.com/archives/C03CEGRUW/p1590002325342000

We'll see how the CI tests look now...

@jonpryor
Copy link
Contributor

The BCL tests failed, and the error is bizarro, so I'm re-running the Test stage…

}

var ms = new MemoryStream ();
entry.Extract (ms);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this advisable or a good idea?

@grendello: is there a "good" libZipSharp method which can copy data between .zip files without decompressing them?

My fear is that entry may refer to e.g. a 2GB resource (.mp4 file?), at which point this entry.Extract(ms) invocation is going to need to contain said 2GB of data, which very probably isn't going to happen, which will in turn cause "obscure" corner cases in .zip handling.

Maybe the "proper" thing to do is check entry.Size and use an intermediate file if it's "too big", though that has implications with Windows Defender not liking file writes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g. zip_fopen_index can be used to open compressed data and copy it around as needed (you need to pass OperationFlags.Compressed to do that, here are the upstream docs)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which is only publicly exposed via ZipEntry.Extract(), so there's no public way to do this at present.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was never needed, so there was no reason to make it public (or, rather, wrapped in a method to call it)

if (apk.Archive.ContainsEntry (entry.FullName, out entryIndexInOutput)) {
ZipEntry e = apk.Archive.ReadEntry (entry.FullName);
// check the CRC values as the ModifiedDate is always 01/01/1980 in the aapt generated file.
if (entry.CRC == e.CRC) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grendello: what's the "likelihood" of CRC collisions? This is "only" a 32-bit value. Should we be relying on it for change "identicality" checks?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ZIP purposes it's not likely that we'd have collisions. Adler-32 (used to calculate the CRC) is weak for short messages, however, so there is a possibility of collision on small files. How big a likelihood? Depends on the data set... I wouldn't use it alone to see if the data has changed or not, but combined with the file name check I think it's good enough.

@jonpryor
Copy link
Contributor

jonpryor commented Jun 8, 2020

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Base automatically changed from master to main March 5, 2021 23:08
@dellis1972 dellis1972 marked this pull request as draft March 13, 2025 08:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants