@zsxwing
Member

@zsxwing zsxwing commented Jul 1, 2015

Replace Akka Serialization with Spark Serializer and add unit tests.

@zsxwing
Member Author

zsxwing commented Jul 1, 2015

/cc @rxin

@zsxwing
Member Author

zsxwing commented Jul 1, 2015

This is the last PR for SPARK-6602

@SparkQA

SparkQA commented Jul 1, 2015

Test build #36265 has finished for PR 7159 at commit ecec410.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class StandaloneRecoveryModeFactory(conf: SparkConf, serializer: Serializer)

@zsxwing zsxwing changed the title [SPARK-6602][Core]Replace Akka Serialization with Spark Serializer [SPARK-6602][Core][WIP]Replace Akka Serialization with Spark Serializer Jul 1, 2015
@SparkQA

SparkQA commented Jul 1, 2015

Test build #36264 has finished for PR 7159 at commit ff034d0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class StandaloneRecoveryModeFactory(conf: SparkConf, serializer: Serializer)
    • case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging

Contributor

what is this thing used for?

Member Author

org.apache.curator.test.TestingServer is from this artifact. An embedded ZooKeeper server for testing.

Should this be in test scope?

Member Author

Yes. Good catch. I was thinking of that but forgot to add it here.

@zsxwing zsxwing changed the title [SPARK-6602][Core][WIP]Replace Akka Serialization with Spark Serializer [SPARK-6602][Core]Replace Akka Serialization with Spark Serializer Jul 2, 2015
Member Author

Add this new method to RpcEnv for RpcEndpointRef deserialization

@SparkQA

SparkQA commented Jul 2, 2015

Test build #36398 has finished for PR 7159 at commit be3edb0.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class StandaloneRecoveryModeFactory(conf: SparkConf, serializer: Serializer)

@SparkQA

SparkQA commented Jul 2, 2015

Test build #36400 has finished for PR 7159 at commit 433115c.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class StandaloneRecoveryModeFactory(conf: SparkConf, serializer: Serializer)

@zsxwing
Member Author

zsxwing commented Jul 2, 2015

retest this please

@rxin
Contributor

rxin commented Jul 2, 2015

cc @andrewor14 can you review this? Thanks.

@SparkQA

SparkQA commented Jul 2, 2015

Test build #36408 has finished for PR 7159 at commit 433115c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class StandaloneRecoveryModeFactory(conf: SparkConf, serializer: Serializer)

@SparkQA

SparkQA commented Jul 3, 2015

Test build #36502 has finished for PR 7159 at commit 9ef4af9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class StandaloneRecoveryModeFactory(conf: SparkConf, serializer: Serializer)

@SparkQA

SparkQA commented Jul 4, 2015

Test build #36519 has finished for PR 7159 at commit 73251c6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class StandaloneRecoveryModeFactory(conf: SparkConf, serializer: Serializer)

@zsxwing
Member Author

zsxwing commented Jul 13, 2015

ping @andrewor14

@andrewor14
Contributor

retest this please

Contributor

can you explain why this is necessary?

Contributor

I see, because now we no longer pass akka's Serialization, which has information about the actor system, into PersistenceEngine, so here we ensure that we're using the actor system's serializer.

But more generally, since we always serialize with JavaSerializer in the new code, why can't we always deserialize with the same thing? I just find it a little strange that we have to pass a closure into this method.

Member Author

Spark's JavaSerializer is used to deserialize the objects. However, it has no actor system in its current context, so I need to use Akka's JavaSerializer.currentSystem to put the current actor system into a thread-local variable during deserialization.

Contributor

but why do we need the actor system to deserialize it? Can't we just deserialize it with JavaSerializer? @rxin

Member Author

> but why do we need the actor system to deserialize it? Can't we just deserialize it with JavaSerializer?

Oh, that's because WorkerInfo and ApplicationInfo contain a reference to RpcEndpointRef.
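
For readers following the exchange above, here is a minimal Java sketch of the general mechanism being discussed. It is not Spark or Akka code. `FakeSystem`, `EndpointRef`, `CURRENT_SYSTEM`, and the simplified `WorkerInfo` are hypothetical stand-ins invented for illustration. The sketch assumes an endpoint reference that writes out only its address and must be re-attached to a live system when it is read back, which is the role the thread describes for Akka's JavaSerializer.currentSystem.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ContextualDeserialization {

    // Hypothetical stand-in for an actor system (the runtime that owns live endpoints).
    static final class FakeSystem {
        final String name;
        FakeSystem(String name) { this.name = name; }
    }

    // Hypothetical thread-local holder for "the system deserialization should bind to".
    static final ThreadLocal<FakeSystem> CURRENT_SYSTEM = new ThreadLocal<>();

    // Hypothetical stand-in for an endpoint reference: only the address is serialized.
    static final class EndpointRef implements Serializable {
        private static final long serialVersionUID = 1L;
        final String address;
        final transient FakeSystem system; // never written to the stream

        EndpointRef(String address, FakeSystem system) {
            this.address = address;
            this.system = system;
        }

        // Called by Java serialization after the fields are read; re-binds the
        // reference to whatever system is in scope on the deserializing thread.
        private Object readResolve() {
            FakeSystem current = CURRENT_SYSTEM.get();
            if (current == null) {
                throw new IllegalStateException("no system in scope for " + address);
            }
            return new EndpointRef(address, current);
        }
    }

    // Simplified stand-in for a persisted record that embeds an endpoint reference.
    static final class WorkerInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        final String id;
        final EndpointRef endpoint;

        WorkerInfo(String id, EndpointRef endpoint) {
            this.id = id;
            this.endpoint = endpoint;
        }
    }

    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalArgumentException("value could not be serialized", e);
        }
        return bytes.toByteArray();
    }

    // Plain deserialization: fails for EndpointRef unless a system is already in scope.
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalArgumentException("payload could not be read", e);
        }
    }

    // Deserialization with the system placed in the thread-local for the duration of the read.
    static Object deserializeWith(FakeSystem system, byte[] data) {
        FakeSystem previous = CURRENT_SYSTEM.get();
        CURRENT_SYSTEM.set(system);
        try {
            return deserialize(data);
        } finally {
            if (previous == null) CURRENT_SYSTEM.remove(); else CURRENT_SYSTEM.set(previous);
        }
    }

    public static void main(String[] args) {
        FakeSystem before = new FakeSystem("master-before-restart");
        FakeSystem after = new FakeSystem("master-after-restart");
        byte[] persisted = serialize(new WorkerInfo("worker-1", new EndpointRef("host:7078", before)));
        WorkerInfo restored = (WorkerInfo) deserializeWith(after, persisted);
        System.out.println(restored.id + " -> " + restored.endpoint.address
                + " bound to " + restored.endpoint.system.name);
    }
}
```

In this sketch the serializer itself stays generic. The caller decides which system is in scope while the bytes are read, which is one way to see why a plain deserialize call is not enough once a record embeds an endpoint reference.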

@andrewor14
Contributor

I left one question, but LGTM otherwise.

@SparkQA

SparkQA commented Jul 14, 2015

Test build #37265 has finished for PR 7159 at commit 73251c6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor

/ping @jlewandowski, just want to give you a heads-up about the binary-incompatible change to RecoveryModeFactory. I don't think that we can avoid this change, but I know that you're one of the few users / implementors of custom recovery modes and thought you'd want early notice.

@SparkQA

SparkQA commented Jul 15, 2015

Test build #37292 has finished for PR 7159 at commit fc0fca3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class StandaloneRecoveryModeFactory(conf: SparkConf, serializer: Serializer)

@andrewor14
Contributor

Hi @zsxwing this LGTM. Feel free to merge it.

@rxin
Contributor

rxin commented Jul 15, 2015

Thanks - I've merged it.

@asfgit asfgit closed this in b9a922e Jul 15, 2015
@zsxwing zsxwing deleted the remove-akka-serialization branch July 16, 2015 00:58
@jacek-lewandowski
Contributor

@JoshRosen this shouldn't be a problem for us. Thanks for pinging me anyway.
