
idna to_unicode() API has degraded in 1.0 #938

@djc


I work on a domain search engine that processes many domains. To speed up this path, I had previously carefully optimized Idna::to_unicode() to avoid allocations where possible (for example, in #653). However, the 1.0 release (apart from bringing in 25 new transitive dependencies, which IMO is not great by itself for such a low-level crate) proposes that I use domain_to_unicode(), which is a little simpler but definitely doesn't let me avoid per-conversion allocations. Its documentation then says:

This function exists for backward-compatibility. Consider using Uts46::to_user_interface or Uts46::to_unicode.

In turn, Uts46::to_unicode() documents itself as:

Most applications probably shouldn’t use this method and should be using Uts46::to_user_interface instead.

Meanwhile the interface for to_user_interface() is:

pub fn to_user_interface<'a, OutputUnicode: FnMut(&[char], &[char], bool) -> bool>(
    &self,
    domain_name: &'a [u8],
    ascii_deny_list: AsciiDenyList,
    hyphens: Hyphens,
    output_as_unicode: OutputUnicode,
) -> (Cow<'a, str>, Result<(), crate::Errors>)

IMO this is a pretty unreasonable interface to suggest that "most applications" should use. If a more complex API is needed, it seems clear that the previous approach (building a Config with a builder pattern, then building an instance with an internal buffer that can be reused across calls) allowed for more idiomatic and more performant operations.
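
To make the API shape I mean concrete, here is a minimal sketch of the pattern using only the standard library. Every name in it (Config, Converter, convert, the lowercase option) is hypothetical and is not the idna crate's actual API, and ASCII lowercasing stands in for the real IDNA processing. The only point is that the caller-owned output String and the internal scratch buffer are both reused across calls.

```rust
/// Hypothetical configuration, set up with a builder pattern.
/// (Illustration only; not the idna crate's Config.)
#[derive(Clone, Copy, Default)]
pub struct Config {
    lowercase: bool,
}

impl Config {
    /// Builder-style setter: consumes and returns the config.
    pub fn lowercase(mut self, value: bool) -> Self {
        self.lowercase = value;
        self
    }

    /// Builds a converter that owns a reusable internal buffer.
    pub fn build(self) -> Converter {
        Converter {
            config: self,
            scratch: String::new(),
        }
    }
}

/// Hypothetical converter holding per-instance scratch space.
pub struct Converter {
    config: Config,
    scratch: String,
}

impl Converter {
    /// Writes the converted domain into `out`.
    /// Both `out` and the internal scratch buffer are cleared rather than
    /// reallocated, so repeated calls stop allocating once the buffers
    /// have grown to fit the longest input seen so far.
    pub fn convert(&mut self, domain: &str, out: &mut String) {
        out.clear();
        for (i, label) in domain.split('.').enumerate() {
            if i > 0 {
                out.push('.');
            }
            // Stand-in for real per-label processing.
            self.scratch.clear();
            self.scratch.push_str(label);
            if self.config.lowercase {
                self.scratch.make_ascii_lowercase();
            }
            out.push_str(&self.scratch);
        }
    }
}
```

A caller would build one Converter via Config::default().lowercase(true).build(), keep a single String around, and pass it to convert() for every domain in the batch.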
