[doc] Clarify map's special treatment of braces #23373

Open
@rapidcow

Where

perlfunc#map, and possibly perlref#Curly-Brackets

Description

The block in map BLOCK always parses a pair of curly braces at the very start of the block as an anon hash, which is peculiar, because normally Perl takes a guess when a pair of braces appears on its own. For example, these look like a hash to Perl:

use strict;
use warnings;
use Data::Dumper;
use feature qw(say);
$Data::Dumper::Terse = 1;
$Data::Dumper::Indent = 0;

sub print_block (&) {
    local $_ = 'a';
    my $block = shift;
    my @ret = $block->( file => 'index.html', mode => 0100644 );
    say Dumper (@ret == 1 ? @ret : \@ret);
}

print_block { { alpha => 'numeric' } };  # {'alpha' => 'numeric'}
print_block { { 123 => 456 } };          # {'123' => 456}
print_block { { 3x3 => 9x9 } };          # {'333' => '999999999'}
print_block { { "$_" => "$_.png" } };    # {'a' => 'a.png'}
print_block { +{ @_ } };                 # {'file' => 'index.html','mode' => 33188}

whereas these look like a nested block:

print_block { { -alpha => 'numeric' } }; # ['-alpha','numeric']
print_block { { +123 => 456 } };         # [123,456]
print_block { { 3 x 3 => 9x9 } };        # ['333','999999999']
print_block { { $_ => "$_.png" } };      # ['a','a.png']
print_block { { @_ } };                  # ['file','index.html','mode',33188]

But all of these, when placed inside map's block, are parsed as a hash!

my @argument = { file => 'index.html', mode => 0100644 };

say Dumper [ map { { alpha => 'numeric' } } 1 ];  # [{'alpha' => 'numeric'}]
say Dumper [ map { { -alpha => 'numeric' } } 1 ]; # [{'-alpha' => 'numeric'}]
say Dumper [ map { { 123 => 456 } } 1 ];          # [{'123' => 456}]
say Dumper [ map { { +123 => 456 } } 1 ];         # [{'123' => 456}]
say Dumper [ map { { 3x3 => 9x9 } } 1 ];          # [{'333' => '999999999'}]
say Dumper [ map { { 3 x 3 => 9x9 } } 1 ];        # [{'333' => '999999999'}]
say Dumper [ map { { "$_" => "$_.png" } } 'a' ];  # [{'a' => 'a.png'}]
say Dumper [ map { { $_ => "$_.png" } } 'a' ];    # [{'a' => 'a.png'}]
say Dumper [ map { +{ %$_ } } @argument ];   # [{'file' => 'index.html','mode' => 33188}]
say Dumper [ map { { %$_ } } @argument ];    # [{'file' => 'index.html','mode' => 33188}]

The exception is when { isn't at the start of map's block. Namely:

say Dumper [ map {; { %$_ } } @argument ];   # ['file','index.html','mode',33188]

is basically equivalent to map { %$_ } @argument, minus the scope. By the way, this is probably also the only way to write this strange construct, since it becomes impossible to convince Perl that this is a block otherwise once it sees the first curly brace:

say Dumper [ map { { %$_; } } @argument ];   # syntax error: ';' in anon hash
say Dumper [ map { {; %$_ } } @argument ];   # even this is not early enough!
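
For comparison, here is a minimal sketch of the spellings that make the intent explicit. The @records data and the variable names are made up for illustration; the forms themselves are the ones shown above plus the map EXPR, LIST form from perlfunc#map.

```perl
use strict;
use warnings;

# Hypothetical sample data: one hash reference, as in @argument above.
my @records = ( { file => 'index.html', mode => 0100644 } );

# Unary plus marks the inner braces as an anon hash, so each element
# becomes a fresh (shallow) copy of the original hash.
my @copies_block = map { +{ %$_ } } @records;   # map BLOCK LIST
my @copies_expr  = map +{ %$_ }, @records;      # map EXPR, LIST (note the comma)

# A leading ';' forces the outer braces to be a BLOCK, and the inner
# braces are then a bare block, so the hash is flattened into a list.
my @flattened = map {; { %$_ } } @records;

printf "%s %s %d\n",
    ref($copies_block[0]), ref($copies_expr[0]), scalar(@flattened);
# prints "HASH HASH 4"
```

Writing +{ ... } whenever a hash is meant, and {; ... } whenever a nested block is meant, avoids relying on the guess in either direction.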

On a tangentially related note, Deparse happens to make this very mistake when trying to represent the nested block within a map:

$ perl -MO=Deparse <<'EOF'
use feature qw(say);
use Data::Dumper;
say Dumper [ map {; { %$_ } } @argument ];
EOF
use feature 'say';
use Data::Dumper;
say Dumper([map({{  ### XXX: does not work!
    %$_;            ### "{;{" needed.
}} @argument)]);

This is different from known bugs of Deparse as far as I can tell; but it seems more addressable than the others.

(Tested on perl 5, version 40, subversion 2 (v5.40.2) built for x86_64-linux-thread-multi; though I think this has been around for a long time, as this seems more like a subtle quirk than a bug.)

Either way, here is what I think can be done:

  1. At the bottom of perlfunc#map, document how a pair of inner braces at the start of the block is always parsed as an anon hash, unlike in other BLOCKs such as eval BLOCK, do BLOCK, and anon subs. (The existing narrative and examples explain the way Perl disambiguates map EXPR and map BLOCK when the outer braces could represent map +{ hash } or map {; BLOCK }.)

    One rationale is that, for me personally, it can be tempting to write map {; EXPR } before realizing that EXPR is no longer subject to the special treatment when EXPR is meant to be a hash (and, very shortly after, realizing that there was special treatment in the first place!), hence the placement right below the existing examples.

  2. On a related note, we should probably explain how map ( ... ) LIST doesn't force the EXPR form much like how map { ... } LIST doesn't always mean BLOCK (which is documented). (I read this a while ago, which I think is a valid concern even today.) I'm not sure where it would be a good place to put this though, since grep, sort, and presumably many others are also like this. (On the other hand, it does force the &-prototype to accept an anon sub and not a block (I think?); that could be one potential source of confusion, i.e. something worth comparing the built-in prototypes to.)

    There is a slightly related "looks-like-a-function rule", but that is likely not the appropriate term, as the parentheses do behave like those on a function. For instance, if I wrote

    my @list = ( map( { uc($_) } 'a', 'b', 'c' ), 'd' );
    #               ^                          ^

    then the parentheses mark the end of the map function, and 'd' doesn't get consumed by the capitalization. So parentheses do serve to delimit arguments; they just don't serve to disambiguate BLOCK from EXPR.

  3. (Optionally) Elaborate on the way Perl disambiguates hashes and blocks in perlref#Curly-Brackets. This could be more challenging, since perlref warns that something like { @_ } is inherently ambiguous. But a rough description of the heuristics that seem to be currently in use (as I tried to crudely illustrate in the examples above), or some guarantee that something like { bareword => $what . $ever } is always parsed as a hash, would be nice. Also, when it comes to explicitly forcing a hash, I find that ({ hash => ref }) looks more symmetrical than +{ hash => ref } (and nicer, IMHO). Maybe we could add that as an alternative way of forcing a hash, both in perlref and perlfunc#map? (TIMTOWTDI, right? :)

    And speaking of guarantees, I think it would be nice if we test this (if we were to make any guarantee)? I found some tests for map in t/op/grep.t, but they are definitely not about this...

  4. (Very optional) Document this strange case that Deparse gets wrong where there is a nested scope within a map BLOCK (if not fixing it right away)? This should most certainly be addressed in a new issue, but it is something to consider while we're at it.
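
Two of the points in the list above can be checked with a short script. This is only a sketch: the variable names are made up for illustration, and it shows observed behavior rather than anything the documentation currently guarantees.

```perl
use strict;
use warnings;

# Parentheses end map's argument list, so 'd' is left untouched.
my @with_parens = ( map( { uc($_) } 'a', 'b', 'c' ), 'd' );

# Without them, map consumes every remaining element.
my @without_parens = ( map { uc($_) } 'a', 'b', 'c', 'd' );

# Wrapping the inner braces in parentheses puts them in term position,
# so they build an anon hash, just as +{ ... } does.
my @hashes = map { ({ $_ => "$_.png" }) } 'a', 'b';

print "@with_parens\n";                              # A B C d
print "@without_parens\n";                           # A B C D
print scalar(@hashes), " ", ref($hashes[0]), "\n";   # 2 HASH
```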

Also, I can definitely make a pull request, since I have a rough idea of what goes where, as I explained above (except for the tests: I was thinking of t/comp/ but, to be perfectly honest, I have no idea). But I want to make sure that I'm not missing anything obvious, and that this is a good idea at all.

(I have skimmed the GitHub issues for duplicates, and #23110 is the closest I can find (in a similar spirit of making Perl map less silently wrong, though that is about map EXPR, not an anon hash inside map BLOCK). And just for the record, all of this was something I stumbled across while trying to copy an array of hashes, which I know I should have asked about on a forum first... but I'm terrible at doing that, so I decided that I wanted to improve the documentation directly (if possible). Here were my attempts at asking on PerlMonks (rejected draft) and Reddit.)
