Fix corrupt data when deleting entries from zip files. (#103)
Fixes #102
The initial report showed the following error:
```
Unhandled exception. System.IO.IOException: An attempt was made to move the position before the beginning of the stream.
at System.IO.MemoryStream.Seek(Int64 offset, SeekOrigin loc)
at Xamarin.Tools.Zip.ZipArchive.stream_callback(IntPtr state, IntPtr data, UInt64 len, SourceCommand cmd)
```
The really odd thing was that this only happened if the example files
were added to the archive in a certain order. This pointed to some kind
of memory corruption or stream offset issue.
Digging into the documentation for [zip_source_function](https://libzip.org/documentation/zip_source_function.html#DESCRIPTION), we can
begin to see what the problem is. Our current implementation of this
user callback treats the `ZIP_SOURCE_BEGIN_WRITE`, `ZIP_SOURCE_COMMIT_WRITE`,
`ZIP_SOURCE_TELL_WRITE` and `ZIP_SOURCE_SEEK_WRITE` SourceCommands
in the same way as their non-`WRITE` counterparts. That is, when we get
a `ZIP_SOURCE_SEEK_WRITE` we seek in the current archive `Stream` to
the appropriate location. Similarly, `ZIP_SOURCE_BEGIN_WRITE` seeks to
the start of the current archive `Stream`. This implementation is incorrect.
The documentation for `ZIP_SOURCE_BEGIN_WRITE` is as follows:
```
Prepare the source for writing. Use this to create any temporary file(s).
```
This suggests that a new temporary file is expected to be created. Also,
if we look at the following descriptions:
`ZIP_SOURCE_TELL`: Return the current read offset in the source, like ftell(3).
`ZIP_SOURCE_TELL_WRITE`: Return the current write offset in the source, like ftell(3).
We can see that there are supposed to be two different offsets: one for
reading and one for writing. This leads us to the reason why this problem
was occurring. Because both the `READ` and `WRITE` commands were operating on the
exact same `Stream`, we were getting into a position where the original
data was being overwritten while we were deleting an entry.
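As a toy illustration of the shared-offset problem (not the actual libzip call sequence), the sketch below copies 4-byte records while skipping one. The names `SharedStreamDemo`, `CopyWithoutRecord` and `Archive` are hypothetical and exist only for this example. It assumes nothing about the zip format: it only shows that a single `Stream` has a single position, so a write moves the cursor that the next read depends on.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical demo type; not part of Xamarin.Tools.Zip.
public static class SharedStreamDemo
{
    // Copies every 4-byte record except record `skipIndex` from `source`
    // to `destination` and returns the destination contents as text.
    public static string CopyWithoutRecord(MemoryStream source, MemoryStream destination, int skipIndex)
    {
        var record = new byte[4];
        source.Position = 0;
        destination.Position = 0;

        for (int index = 0; source.Read(record, 0, record.Length) == record.Length; index++)
        {
            if (index == skipIndex)
                continue; // "delete" this record by not copying it

            // When source and destination are the same object, this write
            // lands at the read cursor and also moves it forward.
            destination.Write(record, 0, record.Length);
        }

        // Trim the destination to what was written.
        destination.SetLength(destination.Position);
        return Encoding.ASCII.GetString(destination.ToArray());
    }

    // Builds an expandable in-memory "archive" from ASCII text.
    public static MemoryStream Archive(string text)
    {
        var stream = new MemoryStream();
        var bytes = Encoding.ASCII.GetBytes(text);
        stream.Write(bytes, 0, bytes.Length);
        return stream;
    }
}
```

With `var s = SharedStreamDemo.Archive("AAAABBBBCCCC")`, calling `CopyWithoutRecord(s, s, 1)` returns `AAAAAAAACCCC`, because the write overwrote `BBBB` and skipped the read cursor past it. Copying into a fresh `MemoryStream` instead returns the expected `AAAACCCC`.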
What we should have been doing was creating a temp `Stream` when we got
the `ZIP_SOURCE_BEGIN_WRITE` SourceCommand, then writing all the required
changes to this temp `Stream` before finally overwriting the
original `Stream` when we get the `ZIP_SOURCE_COMMIT_WRITE` SourceCommand.
This ensures that the original archive data remains intact until
all the processing is done.
So rather than passing in a `GCHandle` to the `Stream` directly, a new
class has been introduced. The `CallbackContext` class will be used to
pass data between the managed and unmanaged code. It contains
properties for the `Source` stream as well as the `Destination` stream. The
`Source` will always be used for reading; the `Destination` (if present)
will be used for writing.
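A minimal sketch of how such a context object could travel through an opaque `IntPtr`, assuming the standard `GCHandle` round trip. Only the `CallbackContext` name and its `Source` and `Destination` properties come from this change; the exact class shape and the `CallbackContextDemo` helper with its `ToState` and `FromState` methods are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

// Assumed shape: the real class may differ in members and accessibility.
public sealed class CallbackContext
{
    public Stream Source { get; set; }       // always used for reading
    public Stream Destination { get; set; }  // used for writing, if present
}

// Hypothetical helper; not part of Xamarin.Tools.Zip.
public static class CallbackContextDemo
{
    // Pins nothing, but keeps `context` alive and yields a pointer-sized
    // token that can be handed to native code as the callback state.
    public static IntPtr ToState(CallbackContext context, out GCHandle handle)
    {
        handle = GCHandle.Alloc(context, GCHandleType.Normal);
        return GCHandle.ToIntPtr(handle);
    }

    // Recovers the managed object inside the callback.
    public static CallbackContext FromState(IntPtr state)
    {
        return GCHandle.FromIntPtr(state).Target as CallbackContext;
    }
}
```

The caller is expected to call `handle.Free()` once native code no longer holds the state pointer, otherwise the context is never collected.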
Now on `ZIP_SOURCE_BEGIN_WRITE` we create a new temp file stream which
we will use to create the updated zip file. This new stream is stored
in the `CallbackContext.Destination` property so that all the other `WRITE`
based commands will work on it. When we get the `ZIP_SOURCE_COMMIT_WRITE`
SourceCommand, we copy the data to the `CallbackContext.Source` stream and
then dispose of the `CallbackContext.Destination`.
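The begin/write/commit flow described above can be sketched roughly as follows. This is a simplified model, not the callback from this change: `WriteCommand` and `WriteSketch` are hypothetical names, a `MemoryStream` stands in for the temp file stream, and truncating `Source` before the copy is an assumption about how a shorter archive would replace a longer one.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for the write-related source commands.
public enum WriteCommand { BeginWrite, Write, CommitWrite }

// Hypothetical model of the context plus command handling.
public sealed class WriteSketch
{
    public Stream Source { get; }
    public Stream Destination { get; private set; }

    public WriteSketch(Stream source)
    {
        Source = source;
    }

    public void Handle(WriteCommand command, byte[] data = null)
    {
        switch (command)
        {
            case WriteCommand.BeginWrite:
                // The change uses a temp file stream here; a MemoryStream
                // keeps this sketch self-contained.
                Destination = new MemoryStream();
                break;

            case WriteCommand.Write:
                // All writes go to the temporary stream, never to Source.
                Destination.Write(data, 0, data.Length);
                break;

            case WriteCommand.CommitWrite:
                // Only now is the original archive replaced.
                Destination.Position = 0;
                Source.SetLength(0); // assumption: drop the old contents first
                Source.Position = 0;
                Destination.CopyTo(Source);
                Destination.Dispose();
                Destination = null;
                break;
        }
    }
}
```

Until `CommitWrite` is handled, `Source` still holds the original bytes, so reads issued during the rewrite see unmodified data.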