Skip to content

Commit f6f9d02

Browse files
guojcpwendell
authored andcommitted
Add timeout for fetch file
Currently, when fetch a file, the connection's connect timeout and read timeout is based on the default jvm setting, in this change, I change it to use spark.worker.timeout. This can be usefull, when the connection status between worker is not perfect. And prevent prematurely remove task set. Author: Jiacheng Guo <[email protected]> Closes #98 from guojc/master and squashes the following commits: abfe698 [Jiacheng Guo] add space according request 2a37c34 [Jiacheng Guo] Add timeout for fetch file Currently, when fetch a file, the connection's connect timeout and read timeout is based on the default jvm setting, in this change, I change it to use spark.worker.timeout. This can be usefull, when the connection status between worker is not perfect. And prevent prematurely remove task set.
1 parent 52834d7 commit f6f9d02

File tree

2 files changed

+13
-0
lines changed

2 files changed

+13
-0
lines changed

core/src/main/scala/org/apache/spark/util/Utils.scala

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -278,6 +278,10 @@ private[spark] object Utils extends Logging {
278278
uc = new URL(url).openConnection()
279279
}
280280

281+
val timeout = conf.getInt("spark.files.fetchTimeout", 60) * 1000
282+
uc.setConnectTimeout(timeout)
283+
uc.setReadTimeout(timeout)
284+
uc.connect()
281285
val in = uc.getInputStream();
282286
val out = new FileOutputStream(tempFile)
283287
Utils.copyStream(in, out, true)

docs/configuration.md

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -523,6 +523,15 @@ Apart from these, the following properties are also available, and may be useful
523523
<td>
524524
Whether to overwrite files added through SparkContext.addFile() when the target file exists and its contents do not match those of the source.
525525
</td>
526+
</tr>
527+
<tr>
528+
<td>spark.files.fetchTimeout</td>
529+
<td>false</td>
530+
<td>
531+
Communication timeout to use when fetching files added through SparkContext.addFile() from
532+
the driver.
533+
</td>
534+
</tr>
526535
<tr>
527536
<td>spark.authenticate</td>
528537
<td>false</td>

0 commit comments

Comments
 (0)