docs/sparkr.md (17 additions, 7 deletions)
@@ -37,17 +37,27 @@ sc <- sparkR.init()
sqlContext <- sparkRSQL.init(sc)
{% endhighlight %}

If you are creating a `SparkContext` instead of using the `sparkR` shell or `spark-submit`, you can
also specify certain Spark driver properties. Normally these
[Application properties](configuration.html#application-properties) and [Runtime Environment](configuration.html#runtime-environment) properties cannot be set programmatically once the
driver JVM process has started; in this case, SparkR takes care of it for you. To set
them, pass them as you would other configuration properties in the `sparkEnvir` argument of `sparkR.init`.
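For example, a minimal sketch might look like the following; the master, app name, and memory setting are placeholder values, not taken from this change:

<div data-lang="r" markdown="1">
{% highlight r %}
# Hypothetical example: request 2g of driver memory at startup
sc <- sparkR.init(master = "local[*]", appName = "SparkR",
                  sparkEnvir = list(spark.driver.memory = "2g"))
{% endhighlight %}
</div>
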
With a `SQLContext`, applications can create `DataFrame`s from a local R data frame, from a [Hive table](sql-programming-guide.html#hive-tables), or from other [data sources](sql-programming-guide.html#data-sources).

### From local data frames
The simplest way to create a data frame is to convert a local R data frame into a SparkR DataFrame. Specifically, we can use `createDataFrame` and pass in the local R data frame to create a SparkR DataFrame. As an example, the following creates a `DataFrame` based on the `faithful` dataset from R.

<div data-lang="r" markdown="1">
{% highlight r %}
df <- createDataFrame(sqlContext, faithful)
# Displays the content of the DataFrame to stdout
head(df)
@@ -96,7 +106,7 @@ printSchema(people)
</div>

The data sources API can also be used to save out DataFrames into multiple file formats. For example, we can save the DataFrame from the previous example
to a Parquet file using `write.df`.

<div data-lang="r" markdown="1">
{% highlight r %}
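# Sketch of the save call elided by this diff; the output path is a
# placeholder, and `people` is the DataFrame from the previous example
write.df(people, path = "people.parquet", source = "parquet", mode = "overwrite")
{% endhighlight %}
</div>
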
@@ -139,7 +149,7 @@ Here we include some basic examples and a complete list can be found in the [API
SparkR data frames support a number of commonly used functions to aggregate data after grouping. For example, we can compute a histogram of the `waiting` time in the `faithful` dataset as shown below.
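The sketch here is one possibility, assuming the `df` created from `faithful` earlier; the `count` column name is our own choice:

<div data-lang="r" markdown="1">
{% highlight r %}
# Count how many times each waiting time appears
head(summarize(groupBy(df, df$waiting), count = n(df$waiting)))
{% endhighlight %}
</div>
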
SparkR also provides a number of functions that can be directly applied to columns for data processing and during aggregation. The example below shows the use of basic arithmetic functions.
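A minimal sketch, assuming the same `df` as above; the derived column name is our own:

<div data-lang="r" markdown="1">
{% highlight r %}
# Scale the waiting column by 60 and assign the result
# to a new column in the same DataFrame
df$waiting_secs <- df$waiting * 60
head(df)
{% endhighlight %}
</div>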