Skip to content

improve already downloaded contribution file check #2865

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

ryantm
Copy link

@ryantm ryantm commented Mar 31, 2015

When downloading a file, it's possible that a corrupted archive could be
bigger, smaller, or the same size as a non-corrupted archive, so it is
not good to rely on the archive file size. Instead, the checksum can be
used to verify if the already downloaded file is viable for reuse.

When downloading a file, it's possible that a corrupted archive could be
bigger, smaller, or the same size as a non-corrupted archive, so it is
not good to rely on the archive file size. Instead, the checksum can be
used to verify if the already downloaded file is viable for reuse.
@cmaglie
Copy link
Member

cmaglie commented Apr 1, 2015

With this patch you remove the chance to resume downloads.

it's possible that a corrupted archive could be bigger, smaller, or the same size as a non-corrupted archive

Not exactly, from the index.json we know size and checksum in advance, so we have three possibilities:

  • if the downloaded file size is less than expected the file is certainly corrupted there is no reason to spend CPU time to calculate the checksum. But it's perfectly possible that the download may have been interrupted, in particular for large archives: in this case the best thing to do is to try to resume the download using the existing file as a partial download.
  • If the downloaded size matches the expected size, the file integrity must be checked with CRC. If the CRC fails the file is certainly corrupted and should be downloaded again.
  • If the downloaded size is more than expected then the file is certainly corrupted and should be downloaded again: no need to check CRC in this case.

@ryantm
Copy link
Author

ryantm commented Apr 1, 2015

@cmaglie Thank you for reviewing my PR. I agree that we can get a bit of performance by checking the size, so I will update it to do that.

I do not understand how my changes remove the chance to resume downloads, if that chance previously existed. I didn't realize it had the feature of resuming downloads though, so that makes the name alreadyDownloadedFileViable wrong, but alreadyDownloadedFileViable should still return false if the file size is smaller than the contribution size.

Does the code currently allow for resuming partially completed downloads? How does the download resuming work?

@cmaglie
Copy link
Member

cmaglie commented Apr 2, 2015

Hi @ryantm,

yeah, sorry, indeed your PR doesn't break the resume I was misleaded by the fact that alreadyDownloadedFileViable always tests the checksum.

The download(...) method is the one that resume the download if outputFile already exists:
https://github.com/arduino/Arduino/blob/master/arduino-core/src/cc/arduino/utils/network/FileDownloader.java#L153

Now going back to your PR it seems to do:

  1. check if the downloaded file is available:
  • looks if the file exists
  • checks CRC
  1. if the file is not available download it

  2. check the CRC

so, in case the file is already downloaded the CRC is checked twice. I guess that the alreadyDownloadedFileViable should check only if the file exists and it's size to avoid checking the checksum twice.

Also I'll rename alreadyDownloadedFileViable to alreadyDownloadedFileAvailable (I don't know if it's a typo).

How does it sounds?

…rformance

Renamed alreadyDownloadedFileViable to fileAlreadyDownloaded to better
reflect what is going on with the download

Added file length check to improve performance

Reordered the logic of DownloadableContributionsDownloader.download to
avoid doing multiple checksums in the case of a good file
@ryantm
Copy link
Author

ryantm commented Apr 6, 2015

@cmaglie Please take a look at my latest commit 7ebecc2. I believe I addressed your concerns and improvement suggestions.

@cmaglie
Copy link
Member

cmaglie commented Apr 10, 2015

Honestly I'm missing the purpose of this PR: what is the problem are you trying to solve exactly?

@cmaglie cmaglie added feature request A request to make an enhancement (not a bug fix) Waiting for feedback More information must be provided before we can proceed Component: IDE user interface The Arduino IDE's user interface labels Apr 15, 2015
@ffissore ffissore self-assigned this May 12, 2015
@ffissore
Copy link
Contributor

Closing for lack of feedback

@ffissore ffissore closed this May 26, 2015
@ffissore ffissore added this to the Release 1.6.5 milestone May 26, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: IDE user interface The Arduino IDE's user interface feature request A request to make an enhancement (not a bug fix) Waiting for feedback More information must be provided before we can proceed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants