@@ -50,7 +50,7 @@ class DefaultSource extends HadoopFsRelationProvider with DataSourceRegister {
partitionColumns: Option[StructType],
parameters: Map[String, String]): HadoopFsRelation = {
dataSchema.foreach(verifySchema)
- new TextRelation(None, partitionColumns, paths)(sqlContext)
+ new TextRelation(None, dataSchema, partitionColumns, paths)(sqlContext)
}

override def shortName(): String = "text"
@@ -70,15 +70,16 @@ class DefaultSource extends HadoopFsRelationProvider with DataSourceRegister {

private[sql] class TextRelation(
val maybePartitionSpec: Option[PartitionSpec],
+ val textSchema: Option[StructType],
override val userDefinedPartitionColumns: Option[StructType],
override val paths: Array[String] = Array.empty[String],
parameters: Map[String, String] = Map.empty[String, String])
(@transient val sqlContext: SQLContext)
extends HadoopFsRelation(maybePartitionSpec, parameters) {

- /** Data schema is always a single column, named "value". */
- override def dataSchema: StructType = new StructType().add("value", StringType)

+ /** Data schema is always a single column, named "value" if original Data source has no schema. */
+ override def dataSchema: StructType =
+ textSchema.getOrElse(new StructType().add("value", StringType))
Contributor

Should we make sure that textSchema is a struct type with exactly one string field?

Contributor Author

@cloud-fan DefaultSource.scala is the only place that creates a TextRelation, and it verifies that the schema has exactly one field, of type string, before doing so. So I think it is fine not to verify again here. What do you think?

Contributor

Oh, then it's fine.

/** This is an internal data source that outputs internal row format. */
override val needConversion: Boolean = false
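
For readers following the thread above, here is a minimal sketch of the check-then-fallback behavior being discussed. It is not the code from this PR: `TextSchemaSketch` and its `verifySchema` are hypothetical stand-ins written for illustration, the error messages are made up, and `require` is used in place of whatever exception the real `DefaultSource.verifySchema` throws. It assumes `spark-sql` is on the classpath.

```scala
import org.apache.spark.sql.types.{StringType, StructType}

// Hypothetical illustration only; names and messages are not taken from the PR.
object TextSchemaSketch {
  /** Fallback used when no schema is supplied: one string column named "value". */
  val defaultSchema: StructType = new StructType().add("value", StringType)

  /** Stand-in for the check run before a TextRelation is created. */
  def verifySchema(schema: StructType): Unit = {
    // The text format can hold exactly one column per line.
    require(schema.size == 1,
      s"text data source expects a single column, got ${schema.size}")
    // That column must be a string; the column name itself is not constrained.
    val tpe = schema.head.dataType
    require(tpe == StringType,
      s"text data source expects a string column, got ${tpe.simpleString}")
  }

  /** Mirrors textSchema.getOrElse(...) from the diff, after validating the input. */
  def dataSchema(textSchema: Option[StructType]): StructType = {
    textSchema.foreach(verifySchema)
    textSchema.getOrElse(defaultSchema)
  }
}
```

With this sketch, `TextSchemaSketch.dataSchema(None)` falls back to the `value` column, while `TextSchemaSketch.dataSchema(Some(new StructType().add("adwrasdf", StringType)))` keeps the caller's column name. Because the check runs in one place before construction, the relation itself does not need to repeat it, which is the point made in the reply above.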

@@ -33,8 +33,8 @@ class TextSuite extends QueryTest with SharedSQLContext {
verifyFrame(sqlContext.read.text(testFile))
}

- test("writing") {
- val df = sqlContext.read.text(testFile)
+ test("SPARK-12562 verify write.text() can handle column name beyond `value`") {
+ val df = sqlContext.read.text(testFile).withColumnRenamed("value", "adwrasdf")
Member

This will fail, because the later call to verifyFrame checks that the DataFrame read back in has the schema new StructType().add("value", StringType). You could update verifyFrame to check only that it has a single StringType column.

Contributor Author

After write.text(), the text file on disk does not carry the column name the way a JSON file does. When the file is read back and passed to verifyFrame, the column is always named value.

Member

Yes, that is right.


val tempFile = Utils.createTempDir()
tempFile.delete()
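
The round trip discussed in the thread above can be sketched as follows. This is an illustration, not the test from this PR: it assumes a Spark version that provides `SparkSession` (the PR itself uses `sqlContext`) and that already accepts a single string column of any name in `write.text()`. `TextRoundTripSketch` and `columnNamesAfterRoundTrip` are hypothetical names.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Hypothetical illustration only; not part of TextSuite.
object TextRoundTripSketch {
  /** Writes a one-column string DataFrame as text, reads it back,
    * and returns the column names seen on the read side. */
  def columnNamesAfterRoundTrip(spark: SparkSession, columnName: String): Seq[String] = {
    import spark.implicits._
    // write.text() needs a path that does not exist yet,
    // so use a child of a freshly created temp directory.
    val out = Files.createTempDirectory("text-round-trip").resolve("out").toString
    Seq("first line", "second line").toDF(columnName).write.text(out)
    // Plain text files store only the line contents, not the column name.
    spark.read.text(out).schema.fieldNames.toSeq
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("text-round-trip")
      .getOrCreate()
    try {
      println(columnNamesAfterRoundTrip(spark, "adwrasdf"))
    } finally {
      spark.stop()
    }
  }
}
```

If the behavior described in the thread holds, the returned names contain only `value` regardless of the name used on the write side, which is why verifyFrame can keep checking for `value` after the rename.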