Skip to content

Conversation

kba
Copy link
Contributor

@kba kba commented May 31, 2020

Companion to cisocrgroup/ocrd_cis#49

pyrnn models are generally distributed gzipped. If loading gzipped models is inefficient, I can also adapt OCR-D/ocrd_all#103 to gunzip models after download.

@bertsky
Copy link
Owner

bertsky commented Jun 2, 2020

Is this really necessary? In recognize, there's get_model (which tries to append .gz), and then in load_object there's ocropus_find_file (which also tries to append .gz). In short, whenever the file can be found with ocrolib, it can also be found without the .gz suffix.

Now, I understand you pushed for cisocrgroup/ocrd_cis#49, which in turn requires the change here. But as I mentioned in my review there, I don't think that was warranted itself.

It does not hurt to use the full file names, though. Maybe this is better regardless of the change in ocrd_cis. With the full file name, the makefile could also have rules to check / download the model files.

@kba kba marked this pull request as draft June 3, 2020 12:34
@kba
Copy link
Contributor Author

kba commented Jun 3, 2020

It should not be necessary but I failed to get ocropy to pick up the models in OCROPUS_DATA, so I tried on different levels to fix it. I need to properly debug this later, the easiest "fix" was to gunzip the models or use explicit .gz extensions. I should have used the Draft PR feature or an issue.

@bertsky
Copy link
Owner

bertsky commented Jun 3, 2020

It should not be necessary but I failed to get ocropy to pick up the models in OCROPUS_DATA, so I tried on different levels to fix it. I need to properly debug this later, the easiest "fix" was to gunzip the models or use explicit .gz extensions.

Here's a summary of exactly where ocrd_cis is currently looking. Hope that helps.

I am still inclined to switch to the .gz filenames in the makefile regardless.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants