Skip to content

Commit cb7a90a

Browse files
yanboliangjkbradley
authored andcommitted
[SPARK-14298][ML][MLLIB] LDA should support disable checkpoint
## What changes were proposed in this pull request? In the doc of [```checkpointInterval```](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala#L241), we told users that they can disable checkpoint by setting ```checkpointInterval = -1```. But we did not handle this situation for LDA actually, we should fix this bug. ## How was this patch tested? Existing tests. cc jkbradley Author: Yanbo Liang <[email protected]> Closes #12089 from yanboliang/spark-14298. (cherry picked from commit 56af8e8) Signed-off-by: Joseph K. Bradley <[email protected]>
1 parent 1e61ff4 commit cb7a90a

File tree

2 files changed

+6
-3
lines changed

2 files changed

+6
-3
lines changed

mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicCheckpointer.scala

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -51,7 +51,8 @@ import org.apache.spark.storage.StorageLevel
5151
* - This class removes checkpoint files once later Datasets have been checkpointed.
5252
* However, references to the older Datasets will still return isCheckpointed = true.
5353
*
54-
* @param checkpointInterval Datasets will be checkpointed at this interval
54+
* @param checkpointInterval Datasets will be checkpointed at this interval.
55+
* If this interval was set as -1, then checkpointing will be disabled.
5556
* @param sc SparkContext for the Datasets given to this checkpointer
5657
* @tparam T Dataset type, such as RDD[Double]
5758
*/
@@ -88,7 +89,8 @@ private[mllib] abstract class PeriodicCheckpointer[T](
8889
updateCount += 1
8990

9091
// Handle checkpointing (after persisting)
91-
if ((updateCount % checkpointInterval) == 0 && sc.getCheckpointDir.nonEmpty) {
92+
if (checkpointInterval != -1 && (updateCount % checkpointInterval) == 0
93+
&& sc.getCheckpointDir.nonEmpty) {
9294
// Add new checkpoint before removing old checkpoints.
9395
checkpoint(newData)
9496
checkpointQueue.enqueue(newData)

mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicGraphCheckpointer.scala

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -69,7 +69,8 @@ import org.apache.spark.storage.StorageLevel
6969
* // checkpointed: graph4
7070
* }}}
7171
*
72-
* @param checkpointInterval Graphs will be checkpointed at this interval
72+
* @param checkpointInterval Graphs will be checkpointed at this interval.
73+
* If this interval was set as -1, then checkpointing will be disabled.
7374
* @tparam VD Vertex descriptor type
7475
* @tparam ED Edge descriptor type
7576
*

0 commit comments

Comments
 (0)