From 103906fb23b5212858e89e9a090693b6fb2c6307 Mon Sep 17 00:00:00 2001
From: Harsh Sharma
Date: Mon, 20 Feb 2017 12:21:55 +0530
Subject: [PATCH 1/5] Updated the SQL programming guide to explain about the Encoding operation

---
 docs/sql-programming-guide.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index 235f5ecc40c9f..87146a1559a07 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -297,6 +297,9 @@ reflection and become the names of the columns. Case classes can also be nested
 types such as `Seq`s or `Array`s. This RDD can be implicitly converted to a DataFrame and then be
 registered as a table. Tables can be used in subsequent SQL statements.
 
+Spark Encoders are used to convert a JVM object to Spark SQL representation. When we want to make a datase, Spark requires an encoder which takes the form Encoder[T] where T is the type we want to be encoded. When we try to create dataset with a custom type of object, then may result into java.lang.UnsupportedOperationException: No Encoder found for Object-Name.
+To overcome this problem, we use the kryo encoder. It generally tells spark sql to encode our custom object, so that the operation could find this encoded object.
+
 {% include_example schema_inferring scala/org/apache/spark/examples/sql/SparkSQLExample.scala %}

From 9c8f63cc6f5876e4a9fec4279898f1c0e702068f Mon Sep 17 00:00:00 2001
From: Harsh Sharma
Date: Mon, 20 Feb 2017 18:25:51 +0530
Subject: [PATCH 2/5] Updated the docs to match the voice of my updates in SQL Programming Guide

---
 docs/sql-programming-guide.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index 87146a1559a07..391aac9632cbd 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -297,8 +297,8 @@ reflection and become the names of the columns. Case classes can also be nested
 types such as `Seq`s or `Array`s. This RDD can be implicitly converted to a DataFrame and then be
 registered as a table. Tables can be used in subsequent SQL statements.
 
-Spark Encoders are used to convert a JVM object to Spark SQL representation. When we want to make a datase, Spark requires an encoder which takes the form Encoder[T] where T is the type we want to be encoded. When we try to create dataset with a custom type of object, then may result into java.lang.UnsupportedOperationException: No Encoder found for Object-Name.
-To overcome this problem, we use the kryo encoder. It generally tells spark sql to encode our custom object, so that the operation could find this encoded object.
+Spark Encoders are used to convert a JVM object to Spark SQL representation. To create dataset, spark requires an encoder which takes the form of Encoder[T] where T is the type which has to be encoded. Creation of a dataset with a custom type of object, may result into java.lang.UnsupportedOperationException: No Encoder found for Object-Name.
+To overcome this problem, we use the kryo encoder. It generally tells Spark SQL to encode our custom object, so that the operation could find this encoded object.
 
 {% include_example schema_inferring scala/org/apache/spark/examples/sql/SparkSQLExample.scala %}

From 7a539a7bd17b963a08707ba7c541120dd31f5694 Mon Sep 17 00:00:00 2001
From: Harsh Sharma
Date: Tue, 21 Feb 2017 11:10:36 +0530
Subject: [PATCH 3/5] Modified the content and replaced the code block inside back-ticks instead of bold tags

---
 docs/sql-programming-guide.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index 391aac9632cbd..6d5caad21ec76 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -297,8 +297,8 @@ reflection and become the names of the columns. Case classes can also be nested
 types such as `Seq`s or `Array`s. This RDD can be implicitly converted to a DataFrame and then be
 registered as a table. Tables can be used in subsequent SQL statements.
 
-Spark Encoders are used to convert a JVM object to Spark SQL representation. To create dataset, spark requires an encoder which takes the form of Encoder[T] where T is the type which has to be encoded. Creation of a dataset with a custom type of object, may result into java.lang.UnsupportedOperationException: No Encoder found for Object-Name.
-To overcome this problem, we use the kryo encoder. It generally tells Spark SQL to encode our custom object, so that the operation could find this encoded object.
+Spark Encoders are used to convert a JVM object to Spark SQL representation. To create dataset, Spark requires an encoder which takes the form of Encoder[T] where T is the type which has to be encoded. Creation of a dataset with a custom type of object, may result into `java.lang.UnsupportedOperationException: No Encoder found for Object-Name`.
+To overcome this problem, the kryo encoder is used. It generally tells Spark SQL to encode a custom object, so that the operation could find the encoded object.
 
 {% include_example schema_inferring scala/org/apache/spark/examples/sql/SparkSQLExample.scala %}

From d49ae50187bbcf6690d85963e62e63e918df9c3b Mon Sep 17 00:00:00 2001
From: Harsh Sharma
Date: Tue, 21 Feb 2017 15:14:17 +0530
Subject: [PATCH 4/5] Updated the content to provide object name

---
 docs/sql-programming-guide.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index 6d5caad21ec76..52fd7bf82c588 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -297,8 +297,8 @@ reflection and become the names of the columns. Case classes can also be nested
 types such as `Seq`s or `Array`s. This RDD can be implicitly converted to a DataFrame and then be
 registered as a table. Tables can be used in subsequent SQL statements.
 
-Spark Encoders are used to convert a JVM object to Spark SQL representation. To create dataset, Spark requires an encoder which takes the form of Encoder[T] where T is the type which has to be encoded. Creation of a dataset with a custom type of object, may result into `java.lang.UnsupportedOperationException: No Encoder found for Object-Name`.
-To overcome this problem, the kryo encoder is used. It generally tells Spark SQL to encode a custom object, so that the operation could find the encoded object.
+Spark Encoders are used to convert a JVM object to Spark SQL representation. To create a Dataset, Spark requires an encoder which takes the form of `Encoder[T]` where `T` is the type which has to be encoded.
+Considering an object of class `DemoObj(id: Int, name: String)` as a type of the Dataset to be created, may result into java.lang.UnsupportedOperationException: No Encoder found for DemoObj. To overcome this problem, the kryo encoder is used. It generally tells Spark SQL to encode object of DemoObj, so that the operation could find the encoded DemoObj.
 
 {% include_example schema_inferring scala/org/apache/spark/examples/sql/SparkSQLExample.scala %}

From c2fd0ad9109f2d0947cfdf213f79dc8f50251a03 Mon Sep 17 00:00:00 2001
From: Harsh Sharma
Date: Tue, 21 Feb 2017 15:15:20 +0530
Subject: [PATCH 5/5] Updated the content to provide object name

---
 docs/sql-programming-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index 52fd7bf82c588..5ba4af2ed1aa8 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -298,7 +298,7 @@ types such as `Seq`s or `Array`s. This RDD can be implicitly converted to a Data
 registered as a table. Tables can be used in subsequent SQL statements.
 
 Spark Encoders are used to convert a JVM object to Spark SQL representation. To create a Dataset, Spark requires an encoder which takes the form of `Encoder[T]` where `T` is the type which has to be encoded.
-Considering an object of class `DemoObj(id: Int, name: String)` as a type of the Dataset to be created, may result into java.lang.UnsupportedOperationException: No Encoder found for DemoObj. To overcome this problem, the kryo encoder is used. It generally tells Spark SQL to encode object of DemoObj, so that the operation could find the encoded DemoObj.
+Considering an object of class `DemoObj(id: Int, name: String)` as the type of the Dataset to be created may result in `java.lang.UnsupportedOperationException: No Encoder found for DemoObj`. To overcome this problem, the kryo encoder is used. It tells Spark SQL to encode `DemoObj` objects with Kryo, so that the operation can find an encoder for `DemoObj`.
 
 {% include_example schema_inferring scala/org/apache/spark/examples/sql/SparkSQLExample.scala %}
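
For reference, a minimal self-contained sketch of the workaround the patch text describes, assuming a Spark 2.x `SparkSession`. `Encoders.kryo`, `createDataset`, and the `Encoder[T]` trait are the actual Spark SQL API; the `DemoObj` class (mirroring the name used in the patch) and the `KryoEncoderExample` wrapper are hypothetical, for illustration only:

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

// Hypothetical class with no built-in Spark SQL encoder, mirroring the
// DemoObj(id: Int, name: String) example in the patch text. A plain class
// (not a case class) is used so that no encoder can be derived implicitly.
class DemoObj(val id: Int, val name: String)

object KryoEncoderExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("KryoEncoderExample")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Spark SQL has no built-in encoder for arbitrary classes like DemoObj;
    // depending on how the encoder is requested, this surfaces either as a
    // compile-time "unable to find encoder" error or as the runtime
    // java.lang.UnsupportedOperationException: No Encoder found for DemoObj
    // described in the patch text. Encoders.kryo supplies an Encoder[DemoObj]
    // that serializes the object with Kryo instead.
    implicit val demoObjEncoder: Encoder[DemoObj] = Encoders.kryo[DemoObj]

    val ds = spark.createDataset(Seq(new DemoObj(1, "a"), new DemoObj(2, "b")))

    // A Kryo-encoded Dataset is stored as a single binary column named "value".
    ds.printSchema()
    println(ds.map(_.name).first()) // prints "a"

    spark.stop()
  }
}
```

One trade-off worth noting alongside the workaround: because the Kryo encoder stores the whole object as one binary blob, Spark SQL cannot see individual fields, so column pruning and predicate pushdown do not apply to Kryo-encoded data.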