-Here, we combined the [`flatMap`](programming-guide.html#transformations), [`map`](programming-guide.html#transformations) and [`reduceByKey`](programming-guide.html#transformations) transformations to compute the per-word counts in the file as an RDD of (String, Int) pairs. To collect the word counts in our shell, we can use the [`collect`](programming-guide.html#actions) action:
+Here, we combined the [`flatMap`](programming-guide.html#transformations), [`map`](programming-guide.html#transformations), and [`reduceByKey`](programming-guide.html#transformations) transformations to compute the per-word counts in the file as an RDD of (String, Int) pairs. To collect the word counts in our shell, we can use the [`collect`](programming-guide.html#actions) action:
 
 {% highlight scala %}
 scala> wordCounts.collect()
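For context, the pipeline this paragraph describes looks roughly as follows in the Scala shell. This is a sketch based on the surrounding quick-start text; it assumes `textFile` is the RDD created earlier in the guide with `sc.textFile(...)`.

{% highlight scala %}
// Build (word, count) pairs: split each line into words, pair every word with 1,
// then sum the counts per word.
val wordCounts = textFile.flatMap(line => line.split(" "))
                         .map(word => (word, 1))
                         .reduceByKey((a, b) => a + b)

// collect() is an action: it returns the (String, Int) pairs to the driver as an Array.
wordCounts.collect().foreach(println)
{% endhighlight %}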
@@ -163,7 +163,7 @@ One common data flow pattern is MapReduce, as popularized by Hadoop. Spark can i
-Here, we combined the [`flatMap`](programming-guide.html#transformations), [`map`](programming-guide.html#transformations) and [`reduceByKey`](programming-guide.html#transformations) transformations to compute the per-word counts in the file as an RDD of (string, int) pairs. To collect the word counts in our shell, we can use the [`collect`](programming-guide.html#actions) action:
+Here, we combined the [`flatMap`](programming-guide.html#transformations), [`map`](programming-guide.html#transformations), and [`reduceByKey`](programming-guide.html#transformations) transformations to compute the per-word counts in the file as an RDD of (string, int) pairs. To collect the word counts in our shell, we can use the [`collect`](programming-guide.html#actions) action:
 
 {% highlight python %}
 >>> wordCounts.collect()
@@ -217,13 +217,13 @@ a cluster, as described in the [programming guide](programming-guide.html#initia
 </div>
 
 # Self-Contained Applications
-Now say we wanted to write a self-contained application using the Spark API. We will walk through a
-simple application in both Scala (with SBT), Java (with Maven), and Python.
+Suppose we wish to write a self-contained application using the Spark API. We will walk through a
+simple application in Scala (with SBT), Java (with Maven), and Python.
 
 <div class="codetabs">
 <div data-lang="scala" markdown="1">
 
-We'll create a very simple Spark application in Scala. So simple, in fact, that it's
+We'll create a very simple Spark application in Scala--so simple, in fact, that it's
 named `SimpleApp.scala`:
 
 {% highlight scala %}
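The `SimpleApp.scala` listing itself falls outside the changed lines in this hunk. As a rough sketch of the kind of application the text refers to (the `YOUR_SPARK_HOME/README.md` path is a placeholder, and the app simply counts lines containing "a" and "b"):

{% highlight scala %}
/* SimpleApp.scala -- a minimal self-contained Spark application */
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "YOUR_SPARK_HOME/README.md" // should be some file on your system
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
{% endhighlight %}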
@@ -258,8 +258,8 @@ We pass the SparkContext constructor a
 object which contains information about our
 application.
 
-Our application depends on the Spark API, so we'll also include an sbt configuration file,
-`simple.sbt` which explains that Spark is a dependency. This file also adds a repository that
+Our application depends on the Spark API, so we'll also include an SBT configuration file,
+`simple.sbt`, which explains that Spark is a dependency. This file also adds a repository that
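The rest of the sentence and the `simple.sbt` listing are not part of the lines shown here. For illustration, an SBT build file of the kind the text describes looks roughly like this; the Scala and Spark version numbers and the resolver URL below are placeholders, not taken from this commit:

{% highlight scala %}
// simple.sbt -- declares the Spark dependency and an extra resolver
name := "Simple Project"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0"

// The extra repository mentioned in the text; the URL here is illustrative.
resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
{% endhighlight %}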