-
Notifications
You must be signed in to change notification settings - Fork 1.6k
G-API: Adding a Python script to download onnx models #831
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Legal/copyright information is required about the used image. |
Until downloaded these LFS files EXIST and store lfs metadata. Git repository itself should go into dedicated "download_cache" directory. What is about repository commit revision? |
| # and "lfs" folder will exist during download | ||
| lfs_cache_path = CACHE_DIR + '.git/lfs' | ||
| if os.path.exists(lfs_cache_path): | ||
| rmdir_with_data(lfs_cache_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alalek, it is questionable step. Cache directory gets some data after calling git lfs pull. I remove this folder and don't know. Should cache directory have constant size (180 MB) or should contain received data (500 mb)? This (without deleting lfs folder) avoids downloading models after deleting onnx_models folder.
|
@smirnov-alexey, can you review this? |
smirnov-alexey
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me
OrestChura
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments, most are just questions to double-check
OrestChura
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comment cavils last, but I'm all good with the script. Thanks!
|
@alalek, does script have correct behavior (check description)? If yes, can we merge it? |
|
@alalek can we merge this thing? I believe it is required for a more correct ONNX models coverage we do now. |
alalek
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good!
| def clear_pulled_files(): | ||
| # "lfs" directory contains some data | ||
| # Remove this folder then "onnx_models_cache" will have standard size - 180 MB | ||
| # If we remove "onnx_models" folder then models will be downloaded again | ||
| # and "lfs" folder will exist during download | ||
| lfs_cache_path = CACHE_DIR + '.git/lfs' | ||
| if os.path.exists(lfs_cache_path): | ||
| rmdir_with_data(lfs_cache_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to clean '.git' internals? Does it break anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but without cleaning models won't be downloaded again and will be moved from cache. And for models update you should remove cache directory.
It is main question about behavior. Do we need update models from GH when we remove models directory (NOT cache)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't clear the cache now.
| os.system('git clone --recursive https://github.com/onnx/models.git onnx_models_cache') | ||
| else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
git clone
Need to fetch+checkout the latest commits from the upstream repository if folder already exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added git pull of master branch.
|
|
||
| CUR_DIR = os.getcwd() | ||
| # This directory contains result of "git clone" | ||
| CACHE_DIR = CUR_DIR + '/onnx_models_cache/' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'/onnx_models_cache/'
Please use '/.cache/onnx_models/'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
8287c09 to
cf77691
Compare
cf77691 to
4e325ce
Compare
|
@alalek, Could this be merged if all comments are applied correctly? |
Added python script which downloads models from https://github.com/onnx/models
Behavior (UPDATED):
onnx_models_cacheandonnx_models;onnx_models_cachecontains result of git clone (small files);onnx_modelscontains large ONNX models which will be used for something;onnx_modelsfolder and skip download if model already exists and pass sha check;onnx_modelsand call script again, then models will be downloaded again;onnx_models_cacheand removes lfs folder fromonnx_models_cachefor that;failedModels.Also you can call
download_onnx_models.pywithmodel_name/print,model_namewill be downloaded in this case;Added new directories: