@HyukjinKwon

What changes were proposed in this pull request?

This PR proposes dropping the "obj" prefix from the error message when the name is None.
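As a rough illustration of the idea (not the actual patch), the prefix can be built conditionally so that nothing is printed when no name is available. The helper name `format_error` and its signature are hypothetical and exist only for this sketch.

```python
def format_error(data_type, obj, name=None):
    """Hypothetical helper: build the message, omitting the prefix when name is None."""
    # Only emit "<name>: " when a name is actually known.
    prefix = "" if name is None else "%s: " % name
    return "%s%s can not accept object %r in type %s" % (
        prefix, data_type, obj, type(obj))


# With a name the message starts with "a: "; without one it starts with the type.
print(format_error("IntegerType", "a", name="a"))
print(format_error("StructType", "a"))
```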

How was this patch tested?

Manually tested and checked via

sh run-tests --modules pyspark-sql --python-executable python2.7

Also, manually tested as below:

from pyspark.sql.types import *

schema = StructType([StructField("a", IntegerType())])
spark.createDataFrame([["a"]], schema)
...
TypeError: a: IntegerType can not accept object 'a' in type <type 'str'>


schema = IntegerType()
spark.createDataFrame(["a"], IntegerType())
...
TypeError: value: IntegerType can not accept object 'a' in type <type 'str'>


schema = StructType([StructField("a", ArrayType(IntegerType()))])
spark.createDataFrame(["a"], schema)
...
TypeError: StructType can not accept object 'a' in type <type 'str'>


schema = StructType([StructField("a", ArrayType(IntegerType()))])
spark.createDataFrame([["a"]], schema)
...
TypeError: a: ArrayType(IntegerType,true) can not accept object 'a' in type <type 'str'>


schema = StructType([StructField("a", ArrayType(IntegerType()))])
spark.createDataFrame([[["a"]]], schema)
...
TypeError: a[0]: IntegerType can not accept object 'a' in type <type 'str'>


schema = StructType([StructField("a", ArrayType(StructType([StructField("a", IntegerType())])))])
spark.createDataFrame([[["b"]]], schema)
...
TypeError: a[0]: StructType can not accept object 'b' in type <type 'str'>


schema = StructType([StructField("a", MapType(IntegerType(), IntegerType()))])
spark.createDataFrame([{"a": "1"}], schema)
...
TypeError: a: MapType(IntegerType,IntegerType,true) can not accept object '1' in type <type 'str'>


schema = StructType([StructField("a", MapType(IntegerType(), IntegerType()))])
spark.createDataFrame([[{"1": "aa"}]], schema)
...
TypeError: a[1](key): IntegerType can not accept object '1' in type <type 'str'>


schema = StructType([StructField("a", MapType(IntegerType(), IntegerType()))])
spark.createDataFrame([[{1: "aa"}]], schema)
...
TypeError: a[1]: IntegerType can not accept object 'aa' in type <type 'str'>


schema = StructType([StructField("a", MapType(IntegerType(), StructType([StructField("a", IntegerType())])))])
spark.createDataFrame([[{1: ["1"]}]], schema)
...
TypeError: a[1].a: IntegerType can not accept object '1' in type <type 'str'>


schema = StructType([StructField("a", StructType([StructField("b", IntegerType())]))])
spark.createDataFrame([[["1"]]], schema)
...
TypeError: a.b: IntegerType can not accept object '1' in type <type 'str'>

schema = StructType([StructField("a", StructType([StructField("b", StructType([StructField("c", IntegerType())]))]))])
spark.createDataFrame([[[["1"]]]], schema)
...
TypeError: a.b.c: IntegerType can not accept object '1' in type <type 'str'>


schema = StructType([StructField("a", StructType([StructField("b", StructType([StructField("c", StructType([StructField("b", IntegerType())]))]))]))])
spark.createDataFrame([[[[["1"]]]]], schema)
...
TypeError: a.b.c.b: IntegerType can not accept object '1' in type <type 'str'>

@HyukjinKwon

(It would be even nicer if any nits I may have introduced were cleaned up!)

@HyukjinKwon

I also tested nested array and map types, as below, to be sure:

from pyspark.sql.types import *

schema = StructType([StructField("a", ArrayType(ArrayType(IntegerType())))])
spark.createDataFrame([[[["a"]]]], schema)
...
TypeError: a[0][0]: IntegerType can not accept object 'a' in type <type 'str'>


schema = StructType([StructField("a", MapType(StringType(), MapType(IntegerType(), IntegerType())))])
spark.createDataFrame([[{"a": {"a": "b"}}]], schema)
...
TypeError: a[a][a](key): IntegerType can not accept object 'a' in type <type 'str'>


schema = StructType([StructField("a", MapType(StringType(), MapType(IntegerType(), IntegerType())))])
spark.createDataFrame([[{"a": {1: "b"}}]], schema)
TypeError: a[a][1]: IntegerType can not accept object 'b' in type <type 'str'>
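The path prefixes in these messages (`a[0][0]`, `a[a][1]`, `a.b.c`) can be reproduced with a small recursive walk that passes the accumulated name down to each child. The following is a toy sketch under stated assumptions, not the code in this PR: the schema is modelled as nested tuples rather than `pyspark.sql.types` objects, and `verify` and `_join` are hypothetical names.

```python
def _join(name, suffix, sep=""):
    # Assumption for this sketch: a missing parent name contributes nothing.
    return suffix if name is None else name + sep + suffix


def verify(schema, obj, name=None):
    """Toy verifier over tuple schemas:
    ("int",), ("array", elem), ("map", key, value), ("struct", [(field, type), ...])
    """
    kind = schema[0]
    if kind == "int":
        if not isinstance(obj, int):
            prefix = "" if name is None else name + ": "
            raise TypeError("%sint can not accept object %r" % (prefix, obj))
    elif kind == "array":
        # Elements are addressed by index: name[0], name[1], ...
        for i, item in enumerate(obj):
            verify(schema[1], item, _join(name, "[%d]" % i))
    elif kind == "map":
        # Keys and values share the same path; keys get a "(key)" marker.
        for k, v in obj.items():
            verify(schema[1], k, _join(name, "[%s](key)" % k))
            verify(schema[2], v, _join(name, "[%s]" % k))
    elif kind == "struct":
        # Fields are addressed with dots: name.field
        for (field, field_type), value in zip(schema[1], obj):
            verify(field_type, value, _join(name, field, sep="."))
    else:
        raise ValueError("unknown schema kind: %r" % (kind,))


def error_message(schema, obj):
    """Return the TypeError text raised by verify, or None if obj is accepted."""
    try:
        verify(schema, obj)
    except TypeError as e:
        return str(e)
    return None


nested_array = ("struct", [("a", ("array", ("array", ("int",))))])
print(error_message(nested_array, [[["a"]]]))
```

Because each level only appends its own suffix to the name it receives, arbitrarily deep nesting needs no special handling.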

@HyukjinKwon

Let's focus our efforts on PR 18521.

@HyukjinKwon HyukjinKwon closed this Jul 4, 2017
@HyukjinKwon HyukjinKwon deleted the format-message branch January 2, 2018 03:41