Should combining characters return -1 or 0? #1

@jquast

Thanks, that's just the kind of feedback I was looking for.

I chose to return -1 because that's what libc wcwidth(3) returns on my OS X and travis-ci.org's Linux systems (unless I'm doing it wrong; see bin/wcwidth-libc-comparator.py).

Although there are a few cases where libc returns 1 and wcwidth.py returns -1, there aren't any cases of libc returning 0.

Matching values:

libc,ours=-1,-1 [--o͔o--] name=COMBINING LEFT ARROWHEAD BELOW val=852 http://codepoints.net/U+354

libc,ours=-1,-1 [--o᷇o--] name=COMBINING ACUTE-MACRON val=7623 http://codepoints.net/U+1DC7

libc 1 vs. wcwidth.py -1:

libc,ours=1,-1 [--o֭o--] name=HEBREW ACCENT DEHI val=1453 http://codepoints.net/U+5AD

libc,ours=1,-1 [--oฺo--] name=THAI CHARACTER PHINTHU val=3642 http://codepoints.net/U+E3A
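For anyone who wants to reproduce the libc column above, here is a minimal sketch of calling libc's wcwidth(3) through ctypes. It is not the contents of bin/wcwidth-libc-comparator.py; the helper name `libc_wcwidth` is hypothetical, and it assumes a POSIX system where `ctypes.CDLL(None)` exposes the C library's symbols. The values returned depend on the platform and the active locale, so no expected output is shown.

```python
import ctypes
import locale
import unicodedata

# wcwidth(3) consults the current locale; try to adopt the user's locale
# and fall back to the default "C" locale if that is not possible.
try:
    locale.setlocale(locale.LC_ALL, "")
except locale.Error:
    pass

# Assumption: on POSIX systems, CDLL(None) gives access to libc symbols.
_libc = ctypes.CDLL(None)
_libc.wcwidth.argtypes = [ctypes.c_wchar]
_libc.wcwidth.restype = ctypes.c_int


def libc_wcwidth(codepoint):
    """Return what the system libc reports for a single code point."""
    return _libc.wcwidth(chr(codepoint))


# The four code points listed in the comparison above.
for cp in (0x354, 0x1DC7, 0x5AD, 0xE3A):
    print(f"libc={libc_wcwidth(cp)} name={unicodedata.name(chr(cp))} val={cp}")
```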

Anyway, you may be right. I was hoping for some feedback.

My thinking was: if somebody wants to know the printable width of a string and it contains a combining character, then until I fully understand the effect of combining characters on a full string in wcswidth, I should return -1, as if to say "indeterminate".
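To make the trade-off concrete, here is a rough sketch of how the two choices play out at the string level. The functions `toy_wcwidth` and `toy_wcswidth` are hypothetical stand-ins, not the code in wcwidth.py, and detecting combining characters via `unicodedata.combining` (the canonical combining class) is only an approximation of the tables a real implementation would use.

```python
import unicodedata


def toy_wcwidth(ch, combining_width=-1):
    """Toy per-character width; combining_width selects the -1 or 0 policy."""
    if ch == "\x00":
        return 0  # the null character occupies no columns
    if unicodedata.combining(ch):
        return combining_width  # the value under discussion: -1 or 0
    if unicodedata.category(ch) == "Cc":
        return -1  # control characters are not printable
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2  # wide and fullwidth characters take two columns
    return 1


def toy_wcswidth(text, combining_width=-1):
    """Sum the widths, giving up with -1 as soon as any character reports -1."""
    total = 0
    for ch in text:
        width = toy_wcwidth(ch, combining_width)
        if width < 0:
            return -1
        total += width
    return total


sample = "o\u0354o"  # 'o', COMBINING LEFT ARROWHEAD BELOW, 'o'
print(toy_wcswidth(sample, combining_width=-1))  # -1 ("indeterminate")
print(toy_wcswidth(sample, combining_width=0))   # 2
```

With -1, a single combining character makes the whole string's width indeterminate; with 0, the string measures as its two base characters.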

The few consumers of wcwidth.c that I examined often return 0 in such cases. One example:

https://github.com/sickill/libtsm/blob/master/src/tsm_unicode.c#L393

Anyway, feedback appreciated. I'll open a bug. Reading http://pubs.opengroup.org/onlinepubs/009696699/functions/wcwidth.html, it seems wcwidth should return -1 for anything but printable wide characters and the null character (for which it returns 0).

jq

On May 5, 2014, at 12:24 AM, wrote:

Hi Mr Quast,

I was wondering if there was a reason behind your choice to return -1 for combining characters when in the original C code it returned 0.

Regards,
