another fix opencv/opencv#20575 -- dnn model downloader cleans invalid files #901
@diablodale I added a "rename" feature in the second commit. This is useful for further investigation of problems (e.g., to tell apart the empty-file and "captcha" request cases). BTW, ResNet50 downloads unique content from OneDrive, so new and new files are created.
testdata/dnn/download_models.py
    return self.verify()
    candidate_verify = self.verify()
    if not candidate_verify:
        if os.path.exists(self.filename):
It would be better to extract this block of code into a new class method, e.g. handle_bad_download:
    if not self.verify():
        self.handle_bad_download()
        return False
    return True
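A minimal sketch of how that suggestion might look in context. The `Model` class, its constructor, the `finish_download` method name, and the use of SHA-1 are assumptions made for illustration only; they are not taken from download_models.py. Only `verify`, `handle_bad_download`, `filename`, and `sha_actual` appear in this thread.

```python
import hashlib
import os


class Model:
    """Hypothetical stand-in for the downloader's model class."""

    def __init__(self, filename, sha):
        self.filename = filename
        self.sha = sha          # expected hex digest (SHA-1 assumed here)
        self.sha_actual = None  # filled in by verify()

    def verify(self):
        # A missing file can never match the expected hash.
        if not os.path.exists(self.filename):
            return False
        digest = hashlib.sha1()
        with open(self.filename, 'rb') as f:
            # Hash in 1 MiB chunks so large model files are not read at once.
            for chunk in iter(lambda: f.read(1 << 20), b''):
                digest.update(chunk)
        self.sha_actual = digest.hexdigest()
        return self.sha_actual == self.sha

    def handle_bad_download(self):
        # All cleanup policy for a failed download lives in one place.
        if os.path.exists(self.filename):
            print('  deleting invalid file ' + self.filename)
            os.remove(self.filename)

    def finish_download(self):
        # The calling code shrinks to the shape proposed in the review.
        if not self.verify():
            self.handle_bad_download()
            return False
        return True
```

Keeping the cleanup in one method means a later policy change (delete versus rename) touches a single function rather than the download path.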
testdata/dnn/download_models.py
    os.remove(self.filename)
    try:
        if hasattr(self, 'sha_actual') and self.sha_actual:
            rename_target = self.filename + '.' + str(self.sha_actual)
I suggest using a single invalid name instead of ever-changing invalid names. Changing names make cleanup difficult to manage, keep increasing drive usage, etc. 👎 A single name means simpler code too.
A download's sha_actual can easily change on every attempt. This results in an unbounded number of invalid files. This is a problem I suggest we don't create. (Similar to log files that don't manage themselves.) If some developer out there needs per-download-attempt-backup-files, they can rename the file themselves.
I do not think there is any useful mainstream scenario that needs per-download-attempt-backup-files.
A single xxxx.invalid for the most recent attempt is all that is needed. It should be replaced whenever a new attempt is invalid AND...removed when an attempt succeeds!
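The single-name policy described above could be sketched roughly as follows. The helper names `quarantine_invalid` and `clear_invalid` and the `INVALID_SUFFIX` constant are hypothetical, and the sketch assumes Python 3 (for `os.replace`); it is not the code that was merged.

```python
import os

INVALID_SUFFIX = '.invalid'  # hypothetical fixed suffix for the bad file


def quarantine_invalid(filename):
    """Keep only the most recent invalid download under one fixed name."""
    target = filename + INVALID_SUFFIX
    # os.replace overwrites an existing target, so repeated failures
    # never accumulate more than one invalid file per model.
    os.replace(filename, target)
    print('  renaming invalid file to ' + target)
    return target


def clear_invalid(filename):
    """Remove a leftover invalid file once a download has succeeded."""
    target = filename + INVALID_SUFFIX
    if os.path.exists(target):
        os.remove(target)
```

With this shape, a failed attempt would call `quarantine_invalid(self.filename)` and a successful one would call `clear_invalid(self.filename)`, so disk usage stays bounded regardless of how often the downloaded content changes.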
Ok, then what needs to change in the downloader? If I take the resnet50 URL from download_models.py and copy it to my browser, it continues to fail.
Updated to address review comments.
This can be fixed by new public working link for this model (current link is dead - this happens with "external" resources).
Oops. I misunderstood your original comment, "...ResNet50 downloads unique content from OneDrive, so new and new files are created." Do your original comment and "...can be fixed by new public working link..." mean the same thing? If not, please help me understand your original comment better.
testdata/dnn/download_models.py
        os.remove(rename_target)
    finally:
        os.rename(self.filename, rename_target)
        print(' downloaded content is renamed to ' + rename_target)
I recommend clearly stating that this is an error condition and which action is taken. The current wording is vague. For example, change
"downloaded content is renamed to"
to
"renaming invalid file to".
That way, the messages for the two error conditions each state their action:
"renaming invalid file to"
"deleting invalid file"
Changed
OneDrive returns a stub message in response to HTTP requests for missing or unavailable files. This stub differs between requests (so its hash sum differs too).
The current link doesn't work. Another valid link should help here (but we don't have such a link).
👍
Another fix for opencv/opencv#20575. The downloader will now delete invalid (bad hash) files. This prevents later failures when OpenCV code attempts to use the invalid files.
Tested on 3.4. The approach should also work in 4.x.