Conversation

@YuvalItzchakov commented Aug 3, 2018

What changes were proposed in this pull request?

This small fix adds a consumer.release() call to KafkaSourceRDD for the case where we've retrieved offsets from Kafka but the fromOffset is equal to the lastOffset, meaning there is no new data to read for a particular topic partition. Until now, we would just return an empty iterator without closing the consumer, which causes a file descriptor (FD) leak.
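Roughly, the change has the following shape inside KafkaSourceRDD.compute. This is a simplified sketch: helper names such as getOrCreateConsumer, resolveRange, and readRecords are illustrative placeholders and may not match the exact identifiers in the source; the point is the consumer.release() call on the empty branch.

```scala
// Illustrative sketch only: helper names and types below are placeholders,
// not the exact identifiers used in spark-sql-kafka-0-10.
override def compute(split: Partition, context: TaskContext): Iterator[ConsumerRecord[Array[Byte], Array[Byte]]] = {
  val partition = split.asInstanceOf[KafkaSourceRDDPartition]
  val consumer = getOrCreateConsumer(partition)              // cached Kafka consumer (hypothetical helper)
  val range = resolveRange(consumer, partition.offsetRange)  // resolves fromOffset / lastOffset from Kafka

  if (range.fromOffset == range.lastOffset) {
    // No new data for this TopicPartition: release the consumer before
    // returning the empty iterator, so its file descriptor is not leaked.
    consumer.release()
    Iterator.empty
  } else {
    // ...existing read path over [fromOffset, lastOffset)...
    readRecords(consumer, range)
  }
}
```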

If accepted, this pull request should be merged into master as well.

How was this patch tested?

Haven't run any specific tests; would love help on how to test methods running inside RDD.compute.

@YuvalItzchakov YuvalItzchakov changed the title SPARK-24987 - Fix Kafka consumer leak when no new offsets for TopicPartition [SPARK-24987][SS] - Fix Kafka consumer leak when no new offsets for TopicPartition Aug 3, 2018
@koeninger
Contributor

Jenkins, ok to test

SparkQA commented Aug 3, 2018

Test build #94133 has finished for PR 21983 at commit e5db69f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@YuvalItzchakov
Author

Should I create a separate PR for the master branch?

@felixcheung
Member

@YuvalItzchakov you should open the PR against master - it can be picked into the release branch (e.g. 2.3) when merged.

@YuvalItzchakov
Author

@felixcheung Thanks, will do.
