Skip to content

Conversation

@mccheah
Copy link

@mccheah mccheah commented Apr 9, 2019

But don't close the partition stream returned by the partition writer.

But don't close the partition stream returned by the partition writer.
Copy link
Collaborator

@ifilonenko ifilonenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New contract of stream closing

@ifilonenko ifilonenko merged commit f4f6772 into bloomberg:shuffle-writer-default-impl-unsafe Apr 9, 2019
ifilonenko added a commit that referenced this pull request Apr 16, 2019
* initial push

* closing input channel error

* added certain fixes to make all tests except transferTo pass

* small fix to make transferTo properly work

* Always close the serializer stream. (#10)

* But don't close the partition stream returned by the partition writer.

* fix issues that are blocking tests

* style

* further style

* resolve comments

* remove unecessary file creation

* remove .toStream

* hot fix to remove log statement

* fix contract
ifilonenko pushed a commit that referenced this pull request Oct 21, 2019
…nput of UDF as double in the failed test in udf-aggregate_part1.sql

## What changes were proposed in this pull request?

It still can be flaky on certain environments due to float limitation described at apache#25110 . See apache#25110 (comment)

- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/6584/testReport/org.apache.spark.sql/SQLQueryTestSuite/udf_pgSQL_udf_aggregates_part1_sql___Regular_Python_UDF/

```
Expected "700000000000[6] 1", but got "700000000000[5] 1" Result did not match for query apache-spark-on-k8s#33
SELECT CAST(avg(udf(CAST(x AS DOUBLE))) AS long), CAST(udf(var_pop(CAST(x AS DOUBLE))) AS decimal(10,3))
FROM (VALUES (7000000000005), (7000000000007)) v(x)
```

Here;s what's going on: apache#25110 (comment)

```
scala> Seq("7000000000004.999", "7000000000006.999").toDF().selectExpr("CAST(avg(value) AS long)").show()
+--------------------------+
|CAST(avg(value) AS BIGINT)|
+--------------------------+
|             7000000000005|
+--------------------------+
```

Therefore, this PR just avoid to cast in the specific test.

This is a temp fix. We need more robust way to avoid such cases.

## How was this patch tested?

It passes with Maven in my local before/after this PR. I believe the problem seems similarly the Python or OS installed in the machine. I should test this against PR builder with `test-maven` for sure..

Closes apache#25128 from HyukjinKwon/SPARK-28270-2.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants