@HyukjinKwon
Member

What changes were proposed in this pull request?

This PR proposes to clean up UnivocityParser.

Why are the changes needed?

It slightly improves performance by avoiding unnecessary Array concatenation and creation.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Manually ran the existing tests.

@HyukjinKwon
Member Author

cc @cloud-fan and @MaxGekk. It's rather minor, but I wanted to clean this up.

@SparkQA

SparkQA commented Jan 20, 2020

Test build #117081 has finished for PR 27287 at commit e2cf12a.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 20, 2020

Test build #117087 has finished for PR 27287 at commit c2c5ab0.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 20, 2020

Test build #117077 has finished for PR 27287 at commit bde0b64.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

retest this please

@MaxGekk
Member

MaxGekk commented Jan 20, 2020

Something is going wrong in #27287 (comment). This exception shouldn't happen:

Caused by: java.lang.IllegalStateException: Number of logging event reached the limit: 100
	at org.apache.spark.SparkFunSuite$LogAppender.append(SparkFunSuite.scala:197)
	at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)

It was added by #27166, and the limit shouldn't be reachable.

The test doesn't register any log appenders:

test("SPARK-23786: enforce inferred schema") {
  val expectedSchema = new StructType().add("_c0", DoubleType).add("_c1", StringType)
  val withHeader = spark.read
    .option("inferSchema", true)
    .option("enforceSchema", false)
    .option("header", true)
    .csv(Seq("_c0,_c1", "1.0,a").toDS())
  assert(withHeader.schema == expectedSchema)
  checkAnswer(withHeader, Seq(Row(1.0, "a")))

  // Ignore the inferSchema flag if an user sets a schema
  val schema = new StructType().add("colA", DoubleType).add("colB", StringType)
  val ds = spark.read
    .option("inferSchema", true)
    .option("enforceSchema", false)
    .option("header", true)
    .schema(schema)
    .csv(Seq("colA,colB", "1.0,a").toDS())
  assert(ds.schema == schema)
  checkAnswer(ds, Seq(Row(1.0, "a")))

  val exception = intercept[IllegalArgumentException] {
    spark.read
      .option("inferSchema", true)
      .option("enforceSchema", false)
      .option("header", true)
      .schema(schema)
      .csv(Seq("col1,col2", "1.0,a").toDS())
  }
  assert(exception.getMessage.contains("CSV header does not conform to the schema"))
}

but some appender throws exceptions.

@MaxGekk
Member

MaxGekk commented Jan 20, 2020

I propose assigning names to log appenders and printing the name in the exception, so we will know which one wasn't removed.
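
A minimal sketch of what that proposal could look like, written as plain Scala with no log4j dependency. It is not Spark's actual `LogAppender` in `SparkFunSuite`: the class `NamedLogAppender`, its `name` and `maxEvents` parameters, and the helper `overflowMessage` are hypothetical names used only for illustration. The limit of 100 mirrors the number in the exception above.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a test log appender; NOT Spark's real LogAppender.
// The `name` parameter is the assumption being illustrated here.
final class NamedLogAppender(val name: String, maxEvents: Int = 100) {
  private val events = ArrayBuffer.empty[String]

  // Records an event, or fails with a message that identifies this appender.
  def append(event: String): Unit = {
    if (events.size >= maxEvents) {
      throw new IllegalStateException(
        s"Number of logging events reached the limit $maxEvents in appender '$name'")
    }
    events += event
  }

  def loggedEvents: Seq[String] = events.toSeq
}

object NamedLogAppenderDemo {
  // Appends `count` events and returns the overflow message, if one was thrown.
  def overflowMessage(appender: NamedLogAppender, count: Int): Option[String] =
    try {
      (1 to count).foreach(i => appender.append(s"event $i"))
      None
    } catch {
      case e: IllegalStateException => Some(e.getMessage)
    }
}
```

With a name in the message, a leftover appender from an unrelated suite would identify itself in the stack trace instead of surfacing as an anonymous failure in a test that registers no appenders.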

@SparkQA

SparkQA commented Jan 20, 2020

Test build #117097 has finished for PR 27287 at commit c2c5ab0.

  • This patch fails from timeout after a configured wait of 400m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Jan 20, 2020

Jenkins, retest this please

@SparkQA

SparkQA commented Jan 20, 2020

Test build #117121 has finished for PR 27287 at commit c2c5ab0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 20, 2020

Test build #117124 has finished for PR 27287 at commit c2c5ab0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

Merged to master.
