Skip to content

Conversation

@redgluten
Copy link
Contributor

@redgluten redgluten commented Aug 23, 2020

This is inspired by the similar method in Ruby’s Enumerable. It allows breaking up a collection into chunks based on evaluating a callback rather than just a given size.

An example application is for implementing Run-length encoding where you need to group together consecutive elements of equal nature:

$chunks = collect(str_split('AABBCCCD'))->chunkWhile(fn ($prev, $curr) => $prev === $curr));
// $chunks->toArray() is [ ['A', 'A'], ['B', 'B'], ['C', 'C', 'C'], ['D'] ]

$chunks->reduce(function($sequence, $chunk) {
    $size = $chunk->count();
    $element = $chunk->first();
    return $size > 1 ? "$sequence$size$element" : "$sequence$element";
}, '');
// returns '2A2B3CD'

I’ll gladly write a documentation PR as well if this gets merged.
Please be kind, this is my first code PR to Laravel 😍, I took the time to read the contribution guide and take inspiration from other tests while writing mine but I could have missed something. Collections is my favourite part of the framework, this was a very nice learning experience anyhow. 👍

@redgluten
Copy link
Contributor Author

(updated the P.R description with code that actually makes sense 😅)

@GrahamCampbell GrahamCampbell changed the title Add chunkWhile() collection method [8.x] Add chunkWhile() collection method Aug 23, 2020
@driesvints
Copy link
Member

Ping @JosephSilber

@taylorotwell taylorotwell merged commit 4fd2793 into laravel:master Aug 24, 2020
@JosephSilber
Copy link
Contributor

I actually tried adding this functionality a while ago, but it didn't work out (the naming was quite confusing). I've been thinking of taking another shot at this, so thanks @redgluten for submitting this. Looks great 👍

Here are some thoughts:

  1. We don't need a separate implementation for the eager collection. It can defer to the lazy collection. With code as involved as this, it's better not to have two implementations.

  2. All methods on the lazy collection need to be tested to ensure they're actually lazy.

  3. We always maintain the original collection's keys. This PR resets all keys.

  4. Although Ruby doesn't do this, I think it makes sense to pass the whole chunk (thus far) to the callback. Also, we should pass the current key to the callback.

In short, what I'm proposing is something like this (haven't tested it):

public function chunkWhile(callable $callback)
{
    return new static(function () use ($callback) {
        $iterator = $this->getIterator();

        $chunk = new Collection();

        if ($iterator->valid()) {
            $chunk[$iterator->key()] = $iterator->current();

            $iterator->next();
        }

        while ($iterator->valid()) {
            if (! $callback($iterator->current(), $iterator->key(), $chunk)) {
                yield new static($chunk);

                $chunk = new Collection();
            }

            $chunk[$iterator->key()] = $iterator->current();

            $iterator->next();
        }

        if ($chunk->isNotEmpty()) {
            yield new static($chunk);
        }
    });
}

@redgluten if you want to submit a follow-up PR, go for it 👍

Otherwise I'll cook something up a little later, or tomorrow.

@taylorotwell
Copy link
Member

Yeah, let's make the proposed changes.

@redgluten
Copy link
Contributor Author

@JosephSilber Yeah I stumbled onto your PR while finishing up mine yesterday, seemed to me like this more general approach might be valuable.

Very cool that we can defer to lazy collections. 🔥

Your proposed changes make total sense. The only concern I see is that using the chunk makes the common operation of comparing the current and previous elements a bit more wordy unless I’m missing some trick?

It’s quite late on my side of the world to get a crack at it now feel free to make your changes if you have the time otherwise I’ll get back to it later on. 👍

@JosephSilber
Copy link
Contributor

@redgluten your original example would have to be converted to this:

$chunks = collect(str_split('AABBCCCD'))
    ->chunkWhile(fn ($char, $key, $chars) => $chars->last() === $char));

A tiny bit more verbose, but totally worth it for the additional flexibility.

@redgluten
Copy link
Contributor Author

Oh yeah at that point the chunk’s last element is the previous 😅 that’s what you get from reading code before going to bed late at night I guess. 🙃

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants