Skip to content

Conversation

@eric-saintetienne
Copy link
Contributor

@eric-saintetienne eric-saintetienne commented Feb 23, 2024

Hi guys,

In this changelist I'm suggesting to improve upon the existing Format Options parsing to make it a little more flexible, closer actually to the defaults that C programmers are used to and that most languages implement. To me, it follows the following principles from the Zen:

  • Incremental improvements,
  • Edge cases matter,
  • Minimize energy spent on coding style,
  • Together we serve end users.

Padding with 0 is very common for hexadecimal output: in many languages 04x left-pads with zeroes, but in Zig {x:04} left-pads with blanks. However for most fill characters like zero, adding an alignment specifier is actually not necessary (it's only necessary for 1-9 and .)

The new behaviour is non-breaking (e.g current unit tests pass) and feels more natural and closer to how other languages handle zero-padding in their format options:

    std.debug.print("{:04}",   .{42}); // "0042"
    std.debug.print("{:10}",   .{42}); // "        42"
    std.debug.print("{:0>3}",  .{1});  // "001"
    std.debug.print("{:1<03}", .{9});  // "911"
    std.debug.print("{:>03}" , .{1});  // "  1"
    std.debug.print("{:5^03}", .{1});  // "515"
    std.debug.print("{:_4}",   .{42}); // "__42"
    std.debug.print("{: 4}",   .{42}); // "  42"
    std.debug.print("{d:.3}", .{1.2345});  // "1.234"

I've also added new unit tests for the new cases.

@eric-saintetienne eric-saintetienne changed the title format: fix default character when no alignment format: fix default character when no alignment provided Feb 23, 2024
@eric-saintetienne eric-saintetienne force-pushed the fmt.defaultchar branch 17 times, most recently from 1fee440 to 837e7b8 Compare February 24, 2024 09:11
@eric-saintetienne eric-saintetienne changed the title format: fix default character when no alignment provided format: do not force user to provide an alignment field when it's not necessary Feb 24, 2024
@eric-saintetienne eric-saintetienne force-pushed the fmt.defaultchar branch 2 times, most recently from e8e480e to 39f5dc5 Compare March 6, 2024 09:56
@eric-saintetienne eric-saintetienne force-pushed the fmt.defaultchar branch 2 times, most recently from b321b41 to 379b4d6 Compare March 9, 2024 09:49
@eric-saintetienne eric-saintetienne force-pushed the fmt.defaultchar branch 2 times, most recently from 4842adb to f6a251f Compare March 17, 2024 09:12
@eric-saintetienne eric-saintetienne force-pushed the fmt.defaultchar branch 2 times, most recently from 6caaba7 to d21a252 Compare March 28, 2024 12:22
@eric-saintetienne eric-saintetienne force-pushed the fmt.defaultchar branch 2 times, most recently from e07ad0d to de1784b Compare April 13, 2024 11:13
@eric-saintetienne
Copy link
Contributor Author

@andrewrk any idea when you guys will come round to reviewing this PR?

Thanks.

@andrewrk could you please advise? Thanks.

@Vexu
Copy link
Member

Vexu commented Jul 15, 2024

There are open proposals related to what you are doing here (#19488, #20152), how compatible are your changes with these proposals?

@eric-saintetienne
Copy link
Contributor Author

eric-saintetienne commented Jul 15, 2024

There are open proposals related to what you are doing here (#19488, #20152), how compatible are your changes with these proposals?

On one hand this pull-request introduce a light and surgical fix for a very common case in a non-breaking way: get 02x (for example) pad with zeros and not using spaces.
On the other hand the two pull-requests you refer to are newer and have the much larger scope of changing formatting, which is a heavy breaking proposal.

If I understand correctly, #19488 suggests using python format which would supersede this PR as then 04x would work as I expect it to. And #20152 would also supersede this PR, albeit using a less usual formatting style.

This PR has been up for review for a few months now, it's ripe for being merged in because it's a non-breaking change implementing a step towards either of these two PR (which will take much longer to get ready). With this PR, users could start using zero padding (e.g. 08x) today without having to wait for an overhaul of the formatting system.

@Vexu Vexu enabled auto-merge (squash) July 15, 2024 17:55
auto-merge was automatically disabled July 16, 2024 23:17

Head branch was pushed to by a user without write access

When no alignment is specified, the character that should be used is the
fill character that is otherwise provided, not space.

This is closer to the default that C programmers (and other languages)
use: "04x" fills with zeroes (in zig as of today x:04 fills with spaces)

Test:

    const std = @import("std");
    const expectFmt = std.testing.expectFmt;

    test "fmt.defaultchar.no-alignment" {

        // as of today the following test passes:
        try expectFmt("0x00ff", "0x{x:0>4}", .{255});

        // as of today the following test fails (returns "0x  ff" instead)
        try expectFmt("0x00ff", "0x{x:04}", .{255});
    }
@eric-saintetienne
Copy link
Contributor Author

Not sure why github has closed the PR, I've reopened it. Github also had automatically requested a review because I've pushed to my branch changes that are in master, and github thought I modified the files ¯_(ツ)_/¯

In any case, I've started from a clean master and pushed a subset of the initial commits I had, because allowing any character as fill char broke standard lib tests (somewhere we test that only 0 is allowed for some reason)

So the pipeline should now pass, the review is not necessary because only fmt.zig is actually modified.

@eric-saintetienne
Copy link
Contributor Author

@Vexu would you mind triggering the pipeline for the latest? It should be passing now.

Thanks.

@Vexu Vexu enabled auto-merge (squash) July 17, 2024 09:13
@Vexu Vexu merged commit c3faae6 into ziglang:master Jul 17, 2024
@eric-saintetienne eric-saintetienne deleted the fmt.defaultchar branch July 19, 2024 12:05
igor84 pushed a commit to igor84/zig that referenced this pull request Aug 11, 2024
… necessary (ziglang#19049)

* format: fix default character when no alignment

When no alignment is specified, the character that should be used is the
fill character that is otherwise provided, not space.

This is closer to the default that C programmers (and other languages)
use: "04x" fills with zeroes (in zig as of today x:04 fills with spaces)

Test:

    const std = @import("std");
    const expectFmt = std.testing.expectFmt;

    test "fmt.defaultchar.no-alignment" {

        // as of today the following test passes:
        try expectFmt("0x00ff", "0x{x:0>4}", .{255});

        // as of today the following test fails (returns "0x  ff" instead)
        try expectFmt("0x00ff", "0x{x:04}", .{255});
    }

* non breaking improvement of string formatting

* improved comment

* simplify the code a little

* small improvement around how  characters identified as valid are consumed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants