Skip to content

Conversation

@eyalzit
Copy link
Contributor

@eyalzit eyalzit commented Feb 19, 2017

What changes were proposed in this pull request?

currently if multiple streaming queries listeners exists, when a QueryTerminatedEvent is triggered, only one of the listeners will be invoked while the rest of the listeners will ignore the event.
this is caused since the the streaming queries listeners bus holds a set of running queries ids and when a termination event is triggered, after the first listeners is handling the event, the terminated query id is being removed from the set.
in this PR, the query id will be removed from the set only after all the listeners handles the event

How was this patch tested?

a test with multiple listeners has been added to StreamingQueryListenerSuite

postToAll(s)
case t: QueryTerminatedEvent =>
// run all the listeners synchronized before removing the id from the list
postToAll(t)
Copy link
Member

@zsxwing zsxwing Feb 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will post QueryTerminatedEvent to all listeners directly in the current thread. Hence, the listener may see QueryProgressEvent after QueryTerminatedEvent.

You can override postToAll. It's fine to remove final from the postToAll method.

Copy link
Member

@zsxwing zsxwing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix looks good. Just left some comments.

/**
* Post the event to all registered listeners. The `postToAll` caller should guarantee calling
* `postToAll` in the same thread for all events. also remove the query id after all listeners
* process the QueryTerminatedEvent
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't copy the comments from the parent class. You can just document it as:

Override the parent `postToAll` to remove query from`activeQueryRunIds` after all listeners process `QueryTerminatedEvent`. (SPARK-19594)

}
}

testQuietly("multiple listeners, check trigger events are generated correctly") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to add a regression test rather than copying the above test. Most of this test is testing the same thing in "single listener, check trigger events are generated correctly". It doesn't make sense. How about this:

  test("SPARK-19594: all of listeners should receive QueryTerminatedEvent") {
    val df = MemoryStream[Int].toDS().as[Long]
    val listeners = (1 to 5).map(_ => new EventCollector)
    try {
      testStream(df, OutputMode.Append)(
        StartStream(),
        StopStream,
        AssertOnQuery { query =>
          eventually(Timeout(streamingTimeout)) {
            listeners.foreach(listener => assert(listener.terminationEvent !== null))
            listeners.foreach(listener => assert(listener.terminationEvent.id === query.id))
            listeners.foreach(listener => assert(listener.terminationEvent.runId === query.runId))
            listeners.foreach(listener => assert(listener.terminationEvent.exception === None))
          }
          listeners.foreach(listener => listener.checkAsyncErrors())
          listeners.foreach(listener => listener.reset())
          true
        }
      )
    } finally {
      listeners.foreach(spark.streams.removeListener)
    }
  }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with that (with a small fix)
need to add:
listeners.foreach(listener => spark.streams.addListener(listener))

@zsxwing
Copy link
Member

zsxwing commented Feb 26, 2017

ok to test

@SparkQA
Copy link

SparkQA commented Feb 26, 2017

Test build #73496 has finished for PR 16991 at commit 896f5b3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Copy link
Member

zsxwing commented Feb 26, 2017

LGTM. Merging to master and 2.1. Thanks!

asfgit pushed a commit that referenced this pull request Feb 26, 2017
…andle QueryTerminatedEvent if more then one listeners exists

## What changes were proposed in this pull request?

currently if multiple streaming queries listeners exists, when a QueryTerminatedEvent is triggered, only one of the listeners will be invoked while the rest of the listeners will ignore the event.
this is caused since the the streaming queries listeners bus holds a set of running queries ids and when a termination event is triggered, after the first listeners is handling the event, the terminated query id is being removed from the set.
in this PR, the query id will be removed from the set only after all the listeners handles the event

## How was this patch tested?

a test with multiple listeners has been added to StreamingQueryListenerSuite

Author: Eyal Zituny <[email protected]>

Closes #16991 from eyalzit/master.

(cherry picked from commit 9f8e392)
Signed-off-by: Shixiong Zhu <[email protected]>
@asfgit asfgit closed this in 9f8e392 Feb 26, 2017
Yunni pushed a commit to Yunni/spark that referenced this pull request Feb 27, 2017
…andle QueryTerminatedEvent if more then one listeners exists

## What changes were proposed in this pull request?

currently if multiple streaming queries listeners exists, when a QueryTerminatedEvent is triggered, only one of the listeners will be invoked while the rest of the listeners will ignore the event.
this is caused since the the streaming queries listeners bus holds a set of running queries ids and when a termination event is triggered, after the first listeners is handling the event, the terminated query id is being removed from the set.
in this PR, the query id will be removed from the set only after all the listeners handles the event

## How was this patch tested?

a test with multiple listeners has been added to StreamingQueryListenerSuite

Author: Eyal Zituny <[email protected]>

Closes apache#16991 from eyalzit/master.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants