Conversation

@arzt

@arzt arzt commented Apr 26, 2017

What changes were proposed in this pull request?

Omit rounding of backpressure rate. Effects:

  • no batch with a large number of records is created when the rate from the PID estimator is one
  • the number of records per batch and partition is more fine-grained, improving backpressure accuracy (see the sketch below)
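
A minimal illustration of the rounding effect, using hypothetical names and values rather than the actual Spark code:

```scala
// Hypothetical illustration only; names and values are not the actual Spark code.
// A partition's share of the estimated rate is usually fractional; rounding it
// early collapses that information for the whole batch.
val rate = 1.0      // msgs/sec reported by the PID estimator
val lagRatio = 0.4  // this partition's share of the total lag

val rounded = Math.round(lagRatio * rate) // 0L: the fractional limit is lost
val precise = lagRatio * rate             // 0.4: kept until the final per-batch conversion
```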

How was this patch tested?

This was tested by running:

  • mvn test -pl external/kafka-0-8
  • mvn test -pl external/kafka-0-10
  • a streaming application which was suffering from the issue

@JasonMWhite

The contribution is my original work and I license the work to the project under the project’s open source license

@JasonMWhite
Contributor

Code looks sound. Could you add or modify a test to illustrate/verify?

Member

@srowen srowen left a comment


CC @koeninger
The thing is, it seems like rates are intentionally not floating point here, but I don't know the history of it.

Member


Given your change, I think double becomes more reasonable than float.

@koeninger
Contributor

How do you read 0.1 of a kafka message for a given partition of a given batch?

Ultimately the floor for a rate limit, assuming one is set, needs to be 1 message per partition per batch, not a fraction, which is why it's a long.

If you want to delay that conversion by keeping it as a double as long as possible, that makes sense, but the lines like

(secsPerBatch * limit).toLong

probably need attention too.
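
To make the concern concrete, here is a minimal sketch of that truncation; `secsPerBatch` and `limit` come from the quoted line, while the values are made up:

```scala
// Made-up values; this only illustrates the truncation concern with the quoted line.
val secsPerBatch = 0.5 // sub-second batch interval
val limit = 1.2        // per-partition rate limit in messages per second

val maxMessages = (secsPerBatch * limit).toLong // (0.6).toLong == 0
// A bare .toLong lets a small but non-zero limit collapse to zero messages,
// so the point where the Double becomes a Long needs a floor (e.g. at least 1).
```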

@arzt
Author

arzt commented Apr 27, 2017

Thanks for your valuable feedback. I added tests as suggested by @JasonMWhite and used toDouble. @koeninger the estimated rate is per second, summed over all partitions, isn't it? The batch time is usually longer. So even values less than 1 but greater than 0 for backpressureRate can make sense for one partition. The cast to long is only needed when the absolute number of messages is computed, and even that number can be zero for some partitions, e.g. when there is no lag. I hope I am not confused here. There is also a test covering this.
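
A rough sketch of the arithmetic described above, with illustrative names and values rather than the actual implementation:

```scala
// Illustrative arithmetic only; not the actual DirectKafkaInputDStream code.
// A total rate of 3 msgs/sec, split across partitions in proportion to their lag,
// yields per-partition rates below 1 that are still meaningful once multiplied
// by a batch interval longer than one second.
val totalRate = 3.0                                    // msgs/sec from the rate estimator
val lagPerPartition = Map(0 -> 90L, 1 -> 9L, 2 -> 1L)  // partition -> current lag
val totalLag = lagPerPartition.values.sum.toDouble

val backpressureRate = lagPerPartition.map { case (p, lag) =>
  p -> totalRate * lag / totalLag                      // partition 2: 0.03 msgs/sec
}
val batchSeconds = 10.0
val messagesThisBatch = backpressureRate.map { case (p, r) =>
  p -> r * batchSeconds                                // partition 2: 0.3 messages
}
```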

@arzt
Author

arzt commented Apr 27, 2017

To run tests or debug using IntelliJ:
mvn test -DforkMode=never -pl external/kafka-0-8 "-Dsuites=org.apache.spark.streaming.kafka.DirectKafkaStreamSuite maxMessagesPerPartition"

@koeninger
Contributor

@arzt It's entirely possible to have batch times less than a second, and I'm not sure I agree that the absolute number of messages allowable for a partition should ever be zero.

So to put this another way, right now effectiveRateLimitPerPartition is a Map[TopicPartition, Long], which matches the return value of the function maxMessagesPerPartition.

You're wanting to change effectiveRateLimitPerPartition to a Map[TopicPartition, Double], which is probably a good idea, and should fix the bug around treating a very small rate limit as no limit.

But it still needs to be converted to Map[TopicPartition, Long] before returning. Calling .toLong is probably not the right thing to do there, because 0.99 will get truncated to 0.

I think one message per partition per batch is the minimum reasonable rate limit, otherwise particular partitions may not make progress. The relative lag calculation might take care of that in future batches, but it still seems questionable, even if it's a corner case.
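
A hedged sketch of the conversion being suggested, keeping the limits as Double until the last step and flooring at one message per partition (`TopicPartition` is the kafka-0-10 type; kafka-0-8 uses `TopicAndPartition`):

```scala
import org.apache.kafka.common.TopicPartition

// Sketch only: keep the per-partition limits as Double for as long as possible
// and convert to Long, with a minimum of 1, only when producing per-batch counts.
def toMaxMessagesPerPartition(
    effectiveRateLimitPerPartition: Map[TopicPartition, Double],
    secsPerBatch: Double): Map[TopicPartition, Long] =
  effectiveRateLimitPerPartition.map { case (tp, limit) =>
    // math.max(..., 1L) keeps every partition progressing; a bare .toLong
    // would turn 0.99 into 0 and starve that partition for the batch.
    tp -> math.max((secsPerBatch * limit).toLong, 1L)
  }
```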

@arzt
Author

arzt commented Apr 27, 2017

@koeninger I agree that assuming a long batch duration is wrong; I am not sure whether it even matters.
But what if for one partition there is no lag in the current batch? Then fetching 1 message for this partition from Kafka, as you suggest, would fail. So here zero makes sense in my eyes. This is also the old behaviour when rate > 1 and lag == 0.
Further, I think that truncating 0.99 to 0 messages per partition is also the right thing to do, as one cannot be sure that there is one message available if (secsPerBatch * limit) < 1.0. And, as you say, in a future batch it is very likely to become greater than 1.0.
Do you agree?

@koeninger
Contributor

koeninger commented Apr 27, 2017 via email

@JasonMWhite
Contributor

JasonMWhite commented Apr 27, 2017

I think @koeninger's suggestion is valid. effectiveRateLimitPerPartition is the upper bound on the number of messages per partition per second, and maxMessagesPerPartition sets an upper bound on the number of messages to be retrieved per partition per batch window.

Making effectiveRateLimitPerPartition a float will allow it to properly handle rates of < 1 message/partition/s, so this is definitely a good idea. maxMessagesPerPartition must still be an integer, as you can't retrieve partial messages. All agreed there.

Setting maxMessagesPerPartition to have a minimum of 1 message per partition per batch window is a good safe value to allow progress in all cases. If there isn't 1 message to retrieve, clamp will prevent it from attempting to retrieve an invalid message.
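
For illustration, a small sketch of that interplay (rate-based upper bound, a floor of one message, and clamping to the latest available offset); this is not the actual clamp implementation:

```scala
// Sketch of the interplay described above; not the actual clamp implementation.
// The rate-based bound limits how far a partition advances in one batch window,
// a floor of 1 keeps it progressing, and clamping to the latest available offset
// keeps the request within what Kafka actually has.
def untilOffset(currentOffset: Long, latestOffset: Long, maxMessages: Long): Long = {
  val floored = math.max(maxMessages, 1L)          // at least one message per batch window
  math.min(currentOffset + floored, latestOffset)  // never request past the latest offset
}
// With no lag (latestOffset == currentOffset) the clamp yields the current offset,
// so nothing invalid is fetched even though the floor is 1.
```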

@arzt
Author

arzt commented Apr 27, 2017

I changed the max messages per partition to be at least 1. Agreed?

Contributor


The actual result should be deterministic; why not check the correct value instead of just not None?

Author


After omitting the case of zero messages per topic, one of the tests is redundant. I removed it from each DirectKafkaStreamSuite.

@JasonMWhite
Contributor

Tests have some fairly repetitive code, but not sure if that's a problem or not. Looks good to me.

@koeninger
Contributor

LGTM pending Jason's comments on tests

@felixcheung
Member

Jenkins, ok to test

@SparkQA

SparkQA commented Apr 28, 2017

Test build #76256 has finished for PR 17774 at commit d4a7867.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 28, 2017

Test build #76263 has finished for PR 17774 at commit c98b9a4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@arzt
Author

arzt commented May 2, 2017

Sorry for being inactive. All good with this?

@JasonMWhite
Contributor

LGTM

@arzt
Author

arzt commented May 9, 2017

@felixcheung will this be merged?

@felixcheung
Member

@brkyvz @zsxwing

@arzt
Author

arzt commented May 29, 2017

It's been a while. What can I do to draw some attention to this request? Is this issue not relevant enough? Thanks for reconsidering, @felixcheung @brkyvz @zsxwing

@arzt arzt force-pushed the kafka-back-pressure branch from c98b9a4 to 16b9aaf Compare June 26, 2017 07:44
@SparkQA

SparkQA commented Jun 26, 2017

Test build #78618 has finished for PR 17774 at commit 16b9aaf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Member

I think @tdas @zsxwing should comment if this is the right direction...

@arzt arzt force-pushed the kafka-back-pressure branch from 16b9aaf to 29fe32c Compare August 3, 2017 07:19
@arzt arzt force-pushed the kafka-back-pressure branch from 29fe32c to fddf5e5 Compare November 2, 2017 15:29
@arzt arzt force-pushed the kafka-back-pressure branch from fddf5e5 to 1acbe4c Compare November 6, 2017 12:57
@pptaszynski

I am looking forward to this one being merged. We are suffering quite badly from the issue it resolves; it effectively makes backpressure not work for us at all.

@koeninger
Contributor

LGTM
@tdas @zsxwing absent any objections from you in the next couple of days, I'll merge this

@felixcheung
Member

Jenkins, ok to test

Member

@felixcheung felixcheung left a comment


pending tests

@SparkQA

SparkQA commented Mar 14, 2018

Test build #88226 has finished for PR 17774 at commit 1acbe4c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@asfgit asfgit closed this in dffeac3 Mar 16, 2018
@koeninger
Contributor

Merged to master.
Thanks @arzt!

mstewart141 pushed a commit to mstewart141/spark that referenced this pull request Mar 24, 2018
… with large number of records


Author: Sebastian Arzt <[email protected]>

Closes apache#17774 from arzt/kafka-back-pressure.