@@ -1813,7 +1813,7 @@ To run a Spark Streaming application, you need to have the following.
 + *Mesos* - [Marathon](https://github.com/mesosphere/marathon) has been used to achieve this
   with Mesos.
 
-- *[Since Spark 1.2] Configuring write ahead logs* - Since Spark 1.2,
+- *Configuring write ahead logs* - Since Spark 1.2,
   we have introduced _write ahead logs_ for achieving strong
   fault-tolerance guarantees. If enabled, all the data received from a receiver gets written into
   a write ahead log in the configuration checkpoint directory. This prevents data loss on driver
@@ -1828,6 +1828,17 @@ To run a Spark Streaming application, you need to have the following.
   stored in a replicated storage system. This can be done by setting the storage level for the
   input stream to `StorageLevel.MEMORY_AND_DISK_SER`.
 
+- *Setting the max receiving rate* - If the cluster resources are not large enough for the streaming
+  application to process data as fast as it is being received, the receivers can be rate limited
+  by setting a maximum rate limit in terms of records / sec.
+  See the [configuration parameters](configuration.html#spark-streaming)
+  `spark.streaming.receiver.maxRate` for receivers and `spark.streaming.kafka.maxRatePerPartition`
+  for the Direct Kafka approach. In Spark 1.5, we have introduced a feature called *backpressure* that
+  eliminates the need to set this rate limit, as Spark Streaming automatically figures out the
+  rate limits and dynamically adjusts them if the processing conditions change. This backpressure
+  can be enabled by setting the [configuration parameter](configuration.html#spark-streaming)
+  `spark.streaming.backpressure.enabled` to `true`.
+
 ### Upgrading Application Code
 {:.no_toc}
 
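Taken together, the deployment settings this diff discusses could be collected in a `spark-defaults.conf` fragment along the following lines. The property names are the ones the text references on the configuration page; the rate values are placeholders to be tuned per workload, not recommendations:

```
# Enable write ahead logs for received data (a checkpoint directory must be configured)
spark.streaming.receiver.writeAheadLog.enable  true

# Cap each receiver at 100 records/sec (placeholder value)
spark.streaming.receiver.maxRate               100

# Cap each Kafka partition when using the Direct Kafka approach (placeholder value)
spark.streaming.kafka.maxRatePerPartition      100

# Or, on Spark 1.5+, let backpressure set and adjust the rate limits dynamically
spark.streaming.backpressure.enabled           true
```

With backpressure enabled, the two explicit `maxRate` caps act only as upper bounds, so a deployment would typically rely on one approach or the other.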