@mickael-menu mickael-menu commented Jun 14, 2024

Normalize URLs on-the-fly when comparing

URLs are now normalized when they are compared, following these rules:

  • The scheme is lowercased.
  • Paths percent-encode only the characters that require it.
  • Relative path segments (e.g. ..) are resolved.

The reason for this change is that some EPUBs percent-encode characters that do not require it (such as unreserved characters), which breaks locating a Link from its HREF.

For example, both of these are valid relative URLs that point to the same resource:

  • Text/Ces_choses_qu'on_laisse.xhtml
  • Text/Ces_choses_qu%27on_laisse.xhtml
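
The three rules above can be illustrated with a minimal, language-agnostic sketch. It is written in Python with the standard library only and is not the toolkit's implementation; the names `normalize_url` and `same_url`, and the exact set of characters left unencoded, are assumptions made for this example.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit

# Assumption: characters left unencoded inside a path segment, in addition to
# the unreserved set that quote() never encodes (letters, digits, "-._~").
# ":" is deliberately excluded so that a relative path can never be mistaken
# for a scheme.
_SEGMENT_SAFE = "!$&'()*+,;=@"


def normalize_url(url: str) -> str:
    """Return a normalized form of `url`, intended for comparison only."""
    parts = urlsplit(url)
    absolute = parts.path.startswith("/")
    raw_path = parts.path[1:] if absolute else parts.path

    resolved = []
    for segment in raw_path.split("/"):
        decoded = unquote(segment)
        if decoded == ".":
            continue  # "." refers to the current directory
        if decoded == "..":
            if resolved and resolved[-1] != "..":
                resolved.pop()  # step back one directory
            elif not absolute:
                resolved.append("..")  # keep a leading ".." in relative paths
            continue
        # Decode everything, then re-encode only what requires it.
        resolved.append(quote(decoded, safe=_SEGMENT_SAFE))

    path = ("/" if absolute else "") + "/".join(resolved)
    # Query and fragment are passed through unchanged in this sketch.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc, path, parts.query, parts.fragment)
    )


def same_url(a: str, b: str) -> bool:
    """Compare two URLs after normalizing both on the fly."""
    return normalize_url(a) == normalize_url(b)


# The two HREFs from the example above compare as equal.
print(same_url(
    "Text/Ces_choses_qu'on_laisse.xhtml",
    "Text/Ces_choses_qu%27on_laisse.xhtml",
))
```

Normalizing at comparison time, rather than when the URL is parsed, leaves the original HREF untouched, so it can still be used as written to fetch the resource.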

URL and Media type in Link and Locator objects

Locator and Link objects now use AnyURL and MediaType instead of strings.

@mickael-menu mickael-menu marked this pull request as ready for review June 14, 2024 09:47
@mickael-menu mickael-menu deleted the url-normalization branch June 17, 2024 13:38