@gatorsmile
Member

What changes were proposed in this pull request?

  /**
   * Certain optimizations should not be applied if UDF is not deterministic.
   * Deterministic UDF returns same result each time it is invoked with a
   * particular input. This determinism just needs to hold within the context of
   * a query.
   *
   * @return true if the UDF is deterministic
   */
  boolean deterministic() default true;

Based on the definition of UDFType, when a Hive UDF's children are non-deterministic, the Hive UDF itself is also non-deterministic.
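To illustrate the rule, here is a minimal sketch of how determinism can propagate from child expressions to a UDF call. The names `DeterminismSketch`, `Expr`, `Leaf`, and `UdfCall` are hypothetical and exist only for this example; they are not Spark's or Hive's actual classes, and the boolean passed to `UdfCall` merely stands in for the value of `@UDFType(deterministic = ...)`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class DeterminismSketch {

    /** Hypothetical, simplified expression node; not Spark's or Hive's real API. */
    public interface Expr {
        boolean deterministic();
    }

    /** A leaf with a fixed flag, e.g. a column (true) or a random function (false). */
    public static final class Leaf implements Expr {
        private final boolean deterministic;

        public Leaf(boolean deterministic) {
            this.deterministic = deterministic;
        }

        @Override
        public boolean deterministic() {
            return deterministic;
        }
    }

    /** A UDF call; its own flag stands in for the UDFType annotation value. */
    public static final class UdfCall implements Expr {
        private final boolean udfTypeDeterministic;
        private final List<Expr> children;

        public UdfCall(boolean udfTypeDeterministic, List<Expr> children) {
            this.udfTypeDeterministic = udfTypeDeterministic;
            this.children = children;
        }

        @Override
        public boolean deterministic() {
            // Deterministic only if the UDF itself is and every child is too.
            return udfTypeDeterministic
                && children.stream().allMatch(Expr::deterministic);
        }
    }

    public static void main(String[] args) {
        Expr column = new Leaf(true);
        Expr random = new Leaf(false);

        Expr overColumn = new UdfCall(true, Collections.singletonList(column));
        Expr overRandom = new UdfCall(true, Arrays.asList(column, random));

        System.out.println(overColumn.deterministic()); // true
        System.out.println(overRandom.deterministic()); // false
    }
}
```

In this sketch, a UDF annotated as deterministic still reports `false` once any of its inputs is non-deterministic, which is the behavior this PR describes.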

How was this patch tested?

Added test cases.

Row(null, null, 110.0, null, null, 10.0) :: Nil)
}

test("non-deterministic children expressions of UDAF") {
Member Author

This is just to improve the test case coverage.

))
}

test("non-deterministic children expressions of UDAF") {
Member Author

This is just to improve the test case coverage.

@SparkQA

SparkQA commented Apr 14, 2017

Test build #75793 has started for PR 17635 at commit b593be1.

@SparkQA

SparkQA commented Apr 14, 2017

Test build #75791 has finished for PR 17635 at commit 5cb4206.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

test("non-deterministic children expressions of UDAF") {
withTempView("view1") {
spark.range(1).selectExpr("id as x", "id as y").createTempView("view1")
withUserDefinedFunction("testUDAFPercentile" -> true, "testMock" -> true) {
Member

@viirya Apr 14, 2017

testMock? Do we use it?

@viirya
Member

viirya commented Apr 14, 2017

LGTM except a minor comment on the test.

@SparkQA

SparkQA commented Apr 15, 2017

Test build #75824 has finished for PR 17635 at commit 508a43d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

cc @cloud-fan

@dongjoon-hyun
Member

+1

@cloud-fan
Contributor

LGTM, merging to master! @gatorsmile shall we backport this PR?

@asfgit closed this in e090f3c on Apr 16, 2017

@gatorsmile
Member Author

gatorsmile commented Apr 16, 2017

Maybe, yes. Will do it later. Thank you!

asfgit pushed a commit that referenced this pull request Apr 17, 2017
…acts the determinism of Hive UDF

### What changes were proposed in this pull request?

This PR is to backport #17635 to Spark 2.1

---
```JAVA
  /**
   * Certain optimizations should not be applied if UDF is not deterministic.
   * Deterministic UDF returns same result each time it is invoked with a
   * particular input. This determinism just needs to hold within the context of
   * a query.
   *
   * return true if the UDF is deterministic
   */
  boolean deterministic() default true;
```

Based on the definition of [UDFType](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFType.java#L42-L50), when Hive UDF's children are non-deterministic, Hive UDF is also non-deterministic.

### How was this patch tested?
Added test cases.

Author: Xiao Li <[email protected]>

Closes #17652 from gatorsmile/backport-17635.
peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
…minism of Hive UDF

### What changes were proposed in this pull request?
```JAVA
  /**
   * Certain optimizations should not be applied if UDF is not deterministic.
   * Deterministic UDF returns same result each time it is invoked with a
   * particular input. This determinism just needs to hold within the context of
   * a query.
   *
   * return true if the UDF is deterministic
   */
  boolean deterministic() default true;
```

Based on the definition of [UDFType](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFType.java#L42-L50), when Hive UDF's children are non-deterministic, Hive UDF is also non-deterministic.

### How was this patch tested?
Added test cases.

Author: Xiao Li <[email protected]>

Closes apache#17635 from gatorsmile/udfDeterministic.