Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 0 additions & 1 deletion docs/content.zh/docs/deployment/elastic_scaling.md
Original file line number Diff line number Diff line change
Expand Up @@ -146,7 +146,6 @@ Adaptive 调度器可以通过[所有在名字包含 `adaptive-scheduler` 的配
- **不支持[本地恢复]({{< ref "docs/ops/state/large_state_tuning">}}#task-local-recovery)**:本地恢复是将 Task 调度到状态尽可能的被重用的机器上的功能。不支持这个功能意味着 Adaptive 调度器需要每次从 Checkpoint 的存储中下载整个 State。
- **不支持部分故障恢复**: 部分故障恢复意味着调度器可以只重启失败 Job 其中某一部分(在 Flink 的内部结构中被称之为 Region)而不是重启整个 Job。这个限制只会影响那些独立并行(Embarrassingly Parallel)Job的恢复时长,默认的调度器可以重启失败的部分,然而 Adaptive 将需要重启整个 Job。
- **与 Flink Web UI 的集成受限**: Adaptive 调度器会在 Job 的生命周期中改变它的并行度。Web UI 上只显示 Job 当前的并行度。
- **Job 的指标受限**: 除了 `numRestarts` 外,`Job` 作用域下所有的 [可用性]({{< ref "docs/ops/metrics" >}}#availability) 和 [Checkpoint]({{< ref "docs/ops/metrics" >}}#checkpointing) 指标都不准确。
- **空闲 Slot**: 如果 Slot 共享组的最大并行度不相等,提供给 Adaptive 调度器所使用的的 Slot 可能不会被使用。
- 扩缩容事件会触发 Job 和 Task 重启,Task 重试的次数也会增加。

Expand Down
8 changes: 0 additions & 8 deletions docs/content.zh/docs/ops/metrics.md
Original file line number Diff line number Diff line change
Expand Up @@ -1051,10 +1051,6 @@ Metrics related to data exchange between task executors using netty network comm

### Availability

{{< hint warning >}}
If [Reactive Mode]({{< ref "docs/deployment/elastic_scaling" >}}#reactive-mode) is enabled then these metrics, except `numRestarts`, do not work correctly.
{{< /hint >}}

<table class="table table-bordered">
<thead>
<tr>
Expand Down Expand Up @@ -1103,10 +1099,6 @@ If [Reactive Mode]({{< ref "docs/deployment/elastic_scaling" >}}#reactive-mode)
{
### Checkpointing

{{< hint warning >}}
If [Reactive Mode]({{< ref "docs/deployment/elastic_scaling" >}}#reactive-mode) is enabled then the checkpointing metrics with the `Job` scope do not work correctly.
{{< /hint >}}

Note that for failed checkpoints, metrics are updated on a best efforts basis and may be not accurate.
<table class="table table-bordered">
<thead>
Expand Down
1 change: 0 additions & 1 deletion docs/content/docs/deployment/elastic_scaling.md
Original file line number Diff line number Diff line change
Expand Up @@ -148,7 +148,6 @@ The behavior of Adaptive Scheduler is configured by [all configuration options c
- **No support for [local recovery]({{< ref "docs/ops/state/large_state_tuning">}}#task-local-recovery)**: Local recovery is a feature that schedules tasks to machines so that the state on that machine gets re-used if possible. The lack of this feature means that Adaptive Scheduler will always need to download the entire state from the checkpoint storage.
- **No support for partial failover**: Partial failover means that the scheduler is able to restart parts ("regions" in Flink's internals) of a failed job, instead of the entire job. This limitation impacts only recovery time of embarrassingly parallel jobs: Flink's default scheduler can restart failed parts, while Adaptive Scheduler will restart the entire job.
- **Limited integration with Flink's Web UI**: Adaptive Scheduler allows that a job's parallelism can change over its lifetime. The web UI only shows the current parallelism the job.
- **Limited Job metrics**: With the exception of `numRestarts` all [availability]({{< ref "docs/ops/metrics" >}}#availability) and [checkpointing]({{< ref "docs/ops/metrics" >}}#checkpointing) metrics with the `Job` scope are not working correctly.
- **Unused slots**: If the max parallelism for slot sharing groups is not equal, slots offered to Adaptive Scheduler might be unused.
- Scaling events trigger job and task restarts, which will increase the number of Task attempts.

Expand Down
8 changes: 0 additions & 8 deletions docs/content/docs/ops/metrics.md
Original file line number Diff line number Diff line change
Expand Up @@ -1051,10 +1051,6 @@ Metrics related to data exchange between task executors using netty network comm

### Availability

{{< hint warning >}}
If [Reactive Mode]({{< ref "docs/deployment/elastic_scaling" >}}#reactive-mode) is enabled then these metrics, except `numRestarts`, do not work correctly.
{{< /hint >}}

<table class="table table-bordered">
<thead>
<tr>
Expand Down Expand Up @@ -1102,10 +1098,6 @@ If [Reactive Mode]({{< ref "docs/deployment/elastic_scaling" >}}#reactive-mode)

### Checkpointing

{{< hint warning >}}
If [Reactive Mode]({{< ref "docs/deployment/elastic_scaling" >}}#reactive-mode) is enabled then checkpointing metrics with the `Job` scope do not work correctly.
{{< /hint >}}

Note that for failed checkpoints, metrics are updated on a best efforts basis and may be not accurate.
<table class="table table-bordered">
<thead>
Expand Down