bertsky commented May 23, 2024

No description provided.

bertsky commented May 24, 2024

@hnesk as you can see I am trying to get CI working again, but there are lots of problems in core ...

One thing I keep stumbling over here is the following test. I don't quite understand how it could ever have worked, or why it is included here:

def test_missing_image(self):
    path = TEST_BASE_PATH / 'example/workspaces/kant_aufklaerung_1784_missing_image/mets.xml'
    uri = path.as_uri()
    doc = Document.load(uri)
    page = doc.page_for_id('PHYS_0017', 'OCR-D-GT-PAGE')
    image, info, exif = page.get_image(feature_selector='', feature_filter='binarized')
    # Assert no exceptions happened and no image returned
    self.assertIsNone(image)

Basically, this must fail, because Workspace.image_from_page always gets you the last annotated image version (derived or original) that satisfies the constraints. In this workspace, that of course turns out to be the image that has neither a URL nor a copy in the file system.

Do you remember how this was intended?

EDIT

Now I understand. This does raise an exception in core, but that exception is supposed to be caught in ocrd_browser, so the actual function returns None. It does; only the pytest output is confusing, because the whole stack trace is shown prior to "ok".
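To illustrate why a passing test can still print a full stack trace, here is a minimal sketch of the catch-and-log pattern. It is not the actual ocrd_browser code: `get_image_or_none`, `broken_loader` and the logger name are hypothetical names made up for this example.

```python
import logging

# Hypothetical logger name, chosen only for this sketch.
log = logging.getLogger("sketch.page")


def get_image_or_none(loader):
    """Call loader() and return its result, or None if it raises."""
    try:
        return loader()
    except Exception:
        # log.exception() writes the message plus the full traceback,
        # which is why the trace appears in the test output even
        # though nothing propagates to the caller.
        log.exception("could not load image")
        return None


def broken_loader():
    # Stands in for a lookup of an image that has no URL or local copy.
    raise FileNotFoundError("image has neither a URL nor a local copy")


image = get_image_or_none(broken_loader)
```

With the default logging setup the traceback goes to stderr, so a test asserting that `image` is None passes while the output still looks like a failure.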

Also, I think I found the cause of the other failures: some of your test datasets still contained the old LOCTYPE="URL" notation, which is now reserved for true URLs, while local paths must use LOCTYPE="OTHER" OTHERLOCTYPE="FILE".
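For reference, the notation change could be applied to a METS file roughly as follows. This is a sketch under assumptions, not a tool from core: `fix_loctypes`, `is_remote` and the sample document are hypothetical, and treating only http/https references as "true URLs" is my simplification.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def is_remote(href):
    # Assumption: only http(s) references count as true URLs here.
    return urlparse(href).scheme in ("http", "https")


def fix_loctypes(root):
    """Rewrite LOCTYPE="URL" on local paths; return the number of changes."""
    changed = 0
    for flocat in root.iter(f"{{{METS_NS}}}FLocat"):
        href = flocat.get(f"{{{XLINK_NS}}}href", "")
        if flocat.get("LOCTYPE") == "URL" and not is_remote(href):
            flocat.set("LOCTYPE", "OTHER")
            flocat.set("OTHERLOCTYPE", "FILE")
            changed += 1
    return changed


# Made-up sample with one local path and one remote reference.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                       xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec><mets:fileGrp USE="OCR-D-IMG">
    <mets:file ID="a"><mets:FLocat LOCTYPE="URL" xlink:href="OCR-D-IMG/a.tif"/></mets:file>
    <mets:file ID="b"><mets:FLocat LOCTYPE="URL" xlink:href="https://example.org/b.tif"/></mets:file>
  </mets:fileGrp></mets:fileSec>
</mets:mets>"""

root = ET.fromstring(SAMPLE)
changed = fix_loctypes(root)
flocats = list(root.iter(f"{{{METS_NS}}}FLocat"))
```

Only the local path is rewritten; the remote reference keeps LOCTYPE="URL".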

Let's hope this fixes CI 🤞
