-
Notifications
You must be signed in to change notification settings - Fork 455
Use character columns instead of UTF-8 columns in the diagnostics printer #2512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…nter Fixes a crash and improves the printing of errors on lines that contain multi-byte UTF-8 characters. Fixes swiftlang#2507 rdar://123409748
15f11e8
to
7676255
Compare
@swift-ci Please test |
/// For example the 👨👩👧👦 character is considered as a single character, not 25 bytes. | ||
/// | ||
/// Both the input and the output column are 1-based. | ||
func characterColumn(ofUtf8Column utf8Column: Int) -> Int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume you don't think this should be part of the location converter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I think it’s quite specific to this use case.
@swift-ci Please test Windows |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not so sure if considering emojis as one character is correct. Are they two-character symbols?
I believe in this test:
func testEmojiInSourceCode() {
let source = """
let 👨👩👧👦 = ;
"""
let expectedOutput = """
1 │ let 👨👩👧👦 = ;
│ ╰─ error: expected expression in variable
"""
assertStringsEqualWithDiff(annotate(source: source), expectedOutput)
}
the diagnostic should point to ;
not the space between =
and ;
. I can not test it right now, but I think if we change the variable name to regular characters like so:
func testEmojiInSourceCode() {
let source = """
let e = ;
"""
let expectedOutput = """
1 │ let e = ;
│ ╰─ error: expected expression in variable
"""
assertStringsEqualWithDiff(annotate(source: source), expectedOutput)
}
the annotation will be pointing to semi column.
The problem is that, as far as I know, it depends on the font that renders the font. A font can render an emoji as a single-width character, a double width character, anything in between or however it likes, really. And as far as I know there’s no Unicode specification for it and my thought that assuming that 👨👩👧👦 is rendered as a single character is closer to the truth than assuming that it gets rendered as 25 characters. |
This is mainly to do with terminal emulator behaviour rather than the font (though the font can certainly make things worse if it so desires); generally speaking terminal emulators try to match the behaviour of the |
Fixes a crash and improves the printing of errors on lines that contain multi-byte UTF-8 characters.
Fixes #2507
rdar://123409748