Skip to content

Conversation

@JoshRosen
Copy link
Contributor

This pull request enables Unsafe mode by default in Spark SQL. In order to do this, we had to fix a number of small issues:

List of fixed blockers:

@JoshRosen
Copy link
Contributor Author

Found an interesting problem where the "exception on memory leak" configuration can lead to task failure exceptions being masked. Going to push a fix to this branch, which I may end up spinning off into a separate PR depending on how things go.

@SparkQA
Copy link

SparkQA commented Jul 21, 2015

Test build #37925 has finished for PR 7564 at commit ce55b32.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Copy link
Contributor Author

Looks like there’s a problem with using ScalaUDFs in the unsafe GeneratedAggregate path. It looks like there’s a case where we end up using a BoundReference expression to extract fields before passing them to the UDF. This ends up calling the generic Row.apply(), which returns data of the wrong type when called on an UnsafeRow (specifically, the failing test tries to get a Long column but ends up getting a byte array back).

One way to fix this would be to implement code generation for ScalaUDF so that the field access expressions are code generated. Edit: this is covered by https://issues.apache.org/jira/browse/SPARK-9162.

@SparkQA
Copy link

SparkQA commented Jul 21, 2015

Test build #37928 timed out for PR 7564 at commit a49ca2a after a configured wait of 175m.

@SparkQA
Copy link

SparkQA commented Jul 21, 2015

Test build #37931 timed out for PR 7564 at commit e872840 after a configured wait of 175m.

@JoshRosen
Copy link
Contributor Author

In this latest round of tests, it looks like HiveThriftBinaryServerSuite failed a test:

[info] - test multiple session *** FAILED *** (2 seconds, 488 milliseconds)
[info]   java.sql.SQLException: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: sort, tree:
[info] UnsafeExternalSort [KEY#270 DESC], true, 0
[info]  ConvertToSafe
[info]   Exchange rangepartitioning(KEY#270 DESC)
[info]    UnsafeExternalSort [key#270 DESC], true, 0
[info]     Exchange rangepartitioning(key#270 DESC)
[info]      HiveTableScan [key#270], (MetastoreRelation default, test_map, None)
[info]   at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:275)
[info]   at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:355)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftBinaryServerSuite$$anonfun$8$$anonfun$apply$mcV$sp$13.apply(HiveThriftServer2Suites.scala:327)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftBinaryServerSuite$$anonfun$8$$anonfun$apply$mcV$sp$13.apply(HiveThriftServer2Suites.scala:324)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftJdbcTest$$anonfun$withMultipleConnectionJdbcStatement$1.apply(HiveThriftServer2Suites.scala:438)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftJdbcTest$$anonfun$withMultipleConnectionJdbcStatement$1.apply(HiveThriftServer2Suites.scala:438)
[info]   at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
[info]   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftJdbcTest.withMultipleConnectionJdbcStatement(HiveThriftServer2Suites.scala:438)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftBinaryServerSuite$$anonfun$8.apply$mcV$sp(HiveThriftServer2Suites.scala:209)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftBinaryServerSuite$$anonfun$8.apply(HiveThriftServer2Suites.scala:204)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftBinaryServerSuite$$anonfun$8.apply(HiveThriftServer2Suites.scala:204)
[info]   at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
[info]   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
[info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
[info]   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
[info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:42)
[info]   at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
[info]   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
[info]   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
[info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
[info]   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
[info]   at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
[info]   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
[info]   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
[info]   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
[info]   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
[info]   at scala.collection.immutable.List.foreach(List.scala:318)
[info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
[info]   at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396)
[info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483)
[info]   at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208)
[info]   at org.scalatest.FunSuite.runTests(FunSuite.scala:1555)
[info]   at org.scalatest.Suite$class.run(Suite.scala:1424)
[info]   at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555)
[info]   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
[info]   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
[info]   at org.scalatest.SuperEngine.runImpl(Engine.scala:545)
[info]   at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Test.org$scalatest$BeforeAndAfterAll$$super$run(HiveThriftServer2Suites.scala:450)
[info]   at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:257)
[info]   at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:256)
[info]   at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2Test.run(HiveThriftServer2Suites.scala:450)
[info]   at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462)
[info]   at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671)
[info]   at sbt.ForkMain$Run$2.call(ForkMain.java:294)
[info]   at sbt.ForkMain$Run$2.call(ForkMain.java:284)
[info]   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
[info]   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[info]   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[info]   at java.lang.Thread.run(Thread.java:745)

It looks like there may also be a planner bug that occurs when unsafe is enabled while codegen is disabled:

org.apache.spark.sql.hive.execution.HiveWindowFunctionQueryWithoutCodeGenSuite. testMultipleRangeWindows
 Failed to execute query using catalyst: Error: assertion failed: UnsafeExternalSort requires code generation to be enabled java.lang.AssertionError: assertion failed: UnsafeExternalSort requires code generation to be enabled  at scala.Predef$.assert(Predef.scala:179)  at org.apache.spark.sql.execution.UnsafeExternalSort$$anonfun$doExecute$7.apply(basicOperators.scala:281)  at org.apache.spark.sql.execution.UnsafeExternalSort$$anonfun$doExecute$7.apply(basicOperators.scala:280)  at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:48)  at 

This is probably easy to fix by adding configuration validation which does not allow unsafe to be enabled unless codegen is also enabled, then updating the relevant non-code-gen suites to also disable unsafe. We can also modify the planner to guard against this by also checking whether codegen is enabled when figuring out whether an unsafe sort operator can be used.

Finally, it looks like a number of the HiveCompatibilitySuite tests are failing due to the minimum buffer size issue, which I'll try to fix today.

@JoshRosen JoshRosen force-pushed the unsafe-by-default branch from e872840 to e71ddbc Compare July 21, 2015 19:00
@SparkQA
Copy link

SparkQA commented Jul 21, 2015

Test build #37973 timed out for PR 7564 at commit e71ddbc after a configured wait of 175m.

@JoshRosen
Copy link
Contributor Author

I lowered some buffers to 4MB so that the majority of HiveCompatibilitySuite runs and have found a few other test failures; see updated ticket description for more details.

@SparkQA
Copy link

SparkQA commented Jul 22, 2015

Test build #37987 timed out for PR 7564 at commit e42fb7d after a configured wait of 175m.

@SparkQA
Copy link

SparkQA commented Jul 22, 2015

Test build #37990 timed out for PR 7564 at commit 5464206 after a configured wait of 175m.

@JoshRosen JoshRosen force-pushed the unsafe-by-default branch from 5464206 to 4fcae4a Compare July 26, 2015 21:11
@SparkQA
Copy link

SparkQA commented Jul 27, 2015

Test build #38482 timed out for PR 7564 at commit 4fcae4a after a configured wait of 175m.

@SparkQA
Copy link

SparkQA commented Jul 27, 2015

Test build #38496 has finished for PR 7564 at commit f4ac642.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen JoshRosen force-pushed the unsafe-by-default branch 2 times, most recently from 88aa54a to c38bcdd Compare July 27, 2015 06:41
@SparkQA
Copy link

SparkQA commented Jul 27, 2015

Test build #38514 timed out for PR 7564 at commit 88aa54a after a configured wait of 175m.

@SparkQA
Copy link

SparkQA commented Jul 27, 2015

Test build #38519 timed out for PR 7564 at commit c38bcdd after a configured wait of 175m.

@JoshRosen JoshRosen force-pushed the unsafe-by-default branch from c38bcdd to 393a3eb Compare July 27, 2015 16:36
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be unnecessary now that we support generic getters in UnsafeRows.

@SparkQA
Copy link

SparkQA commented Jul 27, 2015

Test build #38563 timed out for PR 7564 at commit 393a3eb after a configured wait of 175m.

@SparkQA
Copy link

SparkQA commented Jul 28, 2015

Test build #38638 has finished for PR 7564 at commit fa1c3e2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Copy link
Contributor Author

Lots of changes are temporarily merged in here while I test on top of Reynold's struct type patch.

@JoshRosen JoshRosen force-pushed the unsafe-by-default branch from a2ccefa to 8946cb9 Compare July 28, 2015 06:09
@SparkQA
Copy link

SparkQA commented Jul 28, 2015

Test build #38667 has finished for PR 7564 at commit 8946cb9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • public static class StructWriter
    • abstract class InternalRow extends Serializable with SpecializedGetters
    • case class CreateStructUnsafe(children: Seq[Expression]) extends Expression
    • case class CreateNamedStructUnsafe(children: Seq[Expression]) extends Expression
    • case class LastDay(startDate: Expression) extends UnaryExpression with ImplicitCastInputTypes
    • case class NextDay(startDate: Expression, dayOfWeek: Expression)
    • case class TungstenProject(projectList: Seq[NamedExpression], child: SparkPlan) extends UnaryNode

@SparkQA
Copy link

SparkQA commented Jul 28, 2015

Test build #38677 has finished for PR 7564 at commit 7f463f8.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class RFormula(override val uid: String) extends Estimator[RFormulaModel] with RFormulaBase
    • public static class StructWriter
    • abstract class InternalRow extends Serializable with SpecializedGetters
    • case class CreateStructUnsafe(children: Seq[Expression]) extends Expression
    • case class CreateNamedStructUnsafe(children: Seq[Expression]) extends Expression
    • case class LastDay(startDate: Expression) extends UnaryExpression with ImplicitCastInputTypes
    • case class NextDay(startDate: Expression, dayOfWeek: Expression)
    • case class TungstenProject(projectList: Seq[NamedExpression], child: SparkPlan) extends UnaryNode

@JoshRosen JoshRosen force-pushed the unsafe-by-default branch from b4ab1c0 to 445b4db Compare July 28, 2015 19:52
@SparkQA
Copy link

SparkQA commented Jul 28, 2015

Test build #38757 has finished for PR 7564 at commit 445b4db.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shaneknapp
Copy link
Contributor

jenkins, test this please

@SparkQA
Copy link

SparkQA commented Jul 28, 2015

Test build #38761 has finished for PR 7564 at commit 445b4db.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen JoshRosen force-pushed the unsafe-by-default branch from 4f92106 to 5d0b2d3 Compare July 29, 2015 23:50
@JoshRosen JoshRosen changed the title [SPARK-8850] [SQL] [WIP] Enable Unsafe mode by default [SPARK-8850] [SQL] Enable Unsafe mode by default Jul 29, 2015
@SparkQA
Copy link

SparkQA commented Jul 30, 2015

Test build #38896 has finished for PR 7564 at commit 7652648.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Copy link
Contributor Author

Nooo! This failed a PySpark test because I need to configure a lower default page size in those tests.

@SparkQA
Copy link

SparkQA commented Jul 30, 2015

Test build #38910 has finished for PR 7564 at commit 4f92106.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Copy link
Contributor Author

Jenkins, retest this please.

@JoshRosen
Copy link
Contributor Author

(Re-testing because a flaky MLlib test may have caused a spurious failure and I want to have the Python tests run...)

@SparkQA
Copy link

SparkQA commented Jul 30, 2015

Test build #38940 has finished for PR 7564 at commit d6986de.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 30, 2015

Test build #38945 has finished for PR 7564 at commit 963f567.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 30, 2015

Test build #1231 has finished for PR 7564 at commit 963f567.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 30, 2015

Test build #38965 has finished for PR 7564 at commit 963f567.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 30, 2015

Test build #38992 has finished for PR 7564 at commit f4cc859.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class TungstenSort(

@JoshRosen
Copy link
Contributor Author

Jenkins, retest this please.

(Just preemptively in case the first build hits a flaky streaming test)

@SparkQA
Copy link

SparkQA commented Jul 30, 2015

Test build #39056 has finished for PR 7564 at commit 83c0c56.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Copy link
Contributor Author

Jenkins retest this please

@rxin
Copy link
Contributor

rxin commented Jul 30, 2015

I've merged this!

@asfgit asfgit closed this in 520ec0f Jul 30, 2015
@SparkQA
Copy link

SparkQA commented Jul 30, 2015

Test build #1236 has finished for PR 7564 at commit 83c0c56.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 30, 2015

Test build #39073 has finished for PR 7564 at commit 83c0c56.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • public static final class FloatPrefixComparator extends PrefixComparator
    • case class QRDecomposition[UType, VType](Q: UType, R: VType)
    • case class UnsafeExternalSort(

@JoshRosen JoshRosen deleted the unsafe-by-default branch October 16, 2015 21:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants