Skip to content

Commit 2be52ef

Browse files
committed
Circuit break the number of inline scripts compiled per minute
When compiling many dynamically changing scripts, parameterized scripts (<https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-scripting-using.html#prefer-params>) should be preferred. This enforces a limit to the number of scripts that can be compiled within a minute. A new dynamic setting is added - `script.max_compilations_per_minute`, which defaults to 15. If more dynamic scripts are sent, a user will get the following exception: ```json { "error" : { "root_cause" : [ { "type" : "circuit_breaking_exception", "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead", "bytes_wanted" : 0, "bytes_limit" : 0 } ], "type" : "search_phase_execution_exception", "reason" : "all shards failed", "phase" : "query", "grouped" : true, "failed_shards" : [ { "shard" : 0, "index" : "i", "node" : "a5V1eXcZRYiIk8lecjZ4Jw", "reason" : { "type" : "general_script_exception", "reason" : "Failed to compile inline script [\"aaaaaaaaaaaaaaaa\"] using lang [painless]", "caused_by" : { "type" : "circuit_breaking_exception", "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead", "bytes_wanted" : 0, "bytes_limit" : 0 } } } ], "caused_by" : { "type" : "general_script_exception", "reason" : "Failed to compile inline script [\"aaaaaaaaaaaaaaaa\"] using lang [painless]", "caused_by" : { "type" : "circuit_breaking_exception", "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead", "bytes_wanted" : 0, "bytes_limit" : 0 } } }, "status" : 500 } ``` This also fixes a bug in `ScriptService` where requests being executed concurrently on a single node could cause a script to be compiled multiple times (many in the case of a powerful node with many shards) due to no synchronization between checking the cache and compiling the script. There is now synchronization so that a script being compiled will only be compiled once regardless of the number of concurrent searches on a node. Relates to #19396
1 parent a91bb29 commit 2be52ef

File tree

19 files changed

+249
-25
lines changed

19 files changed

+249
-25
lines changed

core/src/main/java/org/elasticsearch/common/settings/ClusterSettings.java

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -300,6 +300,7 @@ public void apply(Settings value, Settings current, Settings previous) {
300300
ScriptService.SCRIPT_CACHE_EXPIRE_SETTING,
301301
ScriptService.SCRIPT_AUTO_RELOAD_ENABLED_SETTING,
302302
ScriptService.SCRIPT_MAX_SIZE_IN_BYTES,
303+
ScriptService.SCRIPT_MAX_COMPILATIONS_PER_MINUTE,
303304
IndicesService.INDICES_CACHE_CLEAN_INTERVAL_SETTING,
304305
IndicesFieldDataCache.INDICES_FIELDDATA_CACHE_SIZE_KEY,
305306
IndicesRequestCache.INDICES_CACHE_QUERY_SIZE,

core/src/main/java/org/elasticsearch/node/Node.java

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -294,6 +294,7 @@ protected Node(final Environment environment, Collection<Class<? extends Plugin>
294294
// this is as early as we can validate settings at this point. we already pass them to ScriptModule as well as ThreadPool
295295
// so we might be late here already
296296
final SettingsModule settingsModule = new SettingsModule(this.settings, additionalSettings, additionalSettingsFilter);
297+
scriptModule.registerClusterSettingsListeners(settingsModule.getClusterSettings());
297298
resourcesToClose.add(resourceWatcherService);
298299
final NetworkService networkService = new NetworkService(settings,
299300
getCustomNameResolvers(pluginsService.filterPlugins(DiscoveryPlugin.class)));

core/src/main/java/org/elasticsearch/script/ScriptModule.java

Lines changed: 13 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,7 @@
1919

2020
package org.elasticsearch.script;
2121

22+
import org.elasticsearch.common.settings.ClusterSettings;
2223
import org.elasticsearch.common.settings.Setting;
2324
import org.elasticsearch.common.settings.Settings;
2425
import org.elasticsearch.env.Environment;
@@ -45,8 +46,8 @@ public class ScriptModule {
4546
* Build from {@linkplain ScriptPlugin}s. Convenient for normal use but not great for tests. See
4647
* {@link ScriptModule#ScriptModule(Settings, Environment, ResourceWatcherService, List, List)} for easier use in tests.
4748
*/
48-
public static ScriptModule create(Settings settings, Environment environment, ResourceWatcherService resourceWatcherService,
49-
List<ScriptPlugin> scriptPlugins) {
49+
public static ScriptModule create(Settings settings, Environment environment,
50+
ResourceWatcherService resourceWatcherService, List<ScriptPlugin> scriptPlugins) {
5051
Map<String, NativeScriptFactory> factoryMap = scriptPlugins.stream().flatMap(x -> x.getNativeScripts().stream())
5152
.collect(Collectors.toMap(NativeScriptFactory::getName, Function.identity()));
5253
NativeScriptEngineService nativeScriptEngineService = new NativeScriptEngineService(settings, factoryMap);
@@ -61,8 +62,9 @@ public static ScriptModule create(Settings settings, Environment environment, Re
6162
/**
6263
* Build {@linkplain ScriptEngineService} and {@linkplain ScriptContext.Plugin}.
6364
*/
64-
public ScriptModule(Settings settings, Environment environment, ResourceWatcherService resourceWatcherService,
65-
List<ScriptEngineService> scriptEngineServices, List<ScriptContext.Plugin> customScriptContexts) {
65+
public ScriptModule(Settings settings, Environment environment,
66+
ResourceWatcherService resourceWatcherService, List<ScriptEngineService> scriptEngineServices,
67+
List<ScriptContext.Plugin> customScriptContexts) {
6668
ScriptContextRegistry scriptContextRegistry = new ScriptContextRegistry(customScriptContexts);
6769
ScriptEngineRegistry scriptEngineRegistry = new ScriptEngineRegistry(scriptEngineServices);
6870
scriptSettings = new ScriptSettings(scriptEngineRegistry, scriptContextRegistry);
@@ -87,4 +89,11 @@ public List<Setting<?>> getSettings() {
8789
public ScriptService getScriptService() {
8890
return scriptService;
8991
}
92+
93+
/**
94+
* Allow the script service to register any settings update handlers on the cluster settings
95+
*/
96+
public void registerClusterSettingsListeners(ClusterSettings clusterSettings) {
97+
scriptService.registerClusterSettingsListeners(clusterSettings);
98+
}
9099
}

core/src/main/java/org/elasticsearch/script/ScriptService.java

Lines changed: 89 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -36,6 +36,7 @@
3636
import org.elasticsearch.common.ParseField;
3737
import org.elasticsearch.common.ParseFieldMatcher;
3838
import org.elasticsearch.common.Strings;
39+
import org.elasticsearch.common.breaker.CircuitBreakingException;
3940
import org.elasticsearch.common.bytes.BytesReference;
4041
import org.elasticsearch.common.cache.Cache;
4142
import org.elasticsearch.common.cache.CacheBuilder;
@@ -47,6 +48,7 @@
4748
import org.elasticsearch.common.io.stream.StreamInput;
4849
import org.elasticsearch.common.io.stream.StreamOutput;
4950
import org.elasticsearch.common.logging.LoggerMessageFormat;
51+
import org.elasticsearch.common.settings.ClusterSettings;
5052
import org.elasticsearch.common.settings.Setting;
5153
import org.elasticsearch.common.settings.Setting.Property;
5254
import org.elasticsearch.common.settings.Settings;
@@ -86,6 +88,8 @@ public class ScriptService extends AbstractComponent implements Closeable, Clust
8688
Setting.boolSetting("script.auto_reload_enabled", true, Property.NodeScope);
8789
public static final Setting<Integer> SCRIPT_MAX_SIZE_IN_BYTES =
8890
Setting.intSetting("script.max_size_in_bytes", 65535, Property.NodeScope);
91+
public static final Setting<Integer> SCRIPT_MAX_COMPILATIONS_PER_MINUTE =
92+
Setting.intSetting("script.max_compilations_per_minute", 15, 0, Property.Dynamic, Property.NodeScope);
8993

9094
private final String defaultLang;
9195

@@ -106,6 +110,11 @@ public class ScriptService extends AbstractComponent implements Closeable, Clust
106110

107111
private ClusterState clusterState;
108112

113+
private int totalCompilesPerMinute;
114+
private long lastInlineCompileTime;
115+
private double scriptsPerMinCounter;
116+
private double compilesAllowedPerNano;
117+
109118
public ScriptService(Settings settings, Environment env,
110119
ResourceWatcherService resourceWatcherService, ScriptEngineRegistry scriptEngineRegistry,
111120
ScriptContextRegistry scriptContextRegistry, ScriptSettings scriptSettings) throws IOException {
@@ -165,6 +174,13 @@ public ScriptService(Settings settings, Environment env,
165174
// automatic reload is disable just load scripts once
166175
fileWatcher.init();
167176
}
177+
178+
this.lastInlineCompileTime = System.nanoTime();
179+
this.setMaxCompilationsPerMinute(SCRIPT_MAX_COMPILATIONS_PER_MINUTE.get(settings));
180+
}
181+
182+
void registerClusterSettingsListeners(ClusterSettings clusterSettings) {
183+
clusterSettings.addSettingsUpdateConsumer(SCRIPT_MAX_COMPILATIONS_PER_MINUTE, this::setMaxCompilationsPerMinute);
168184
}
169185

170186
@Override
@@ -188,7 +204,12 @@ private ScriptEngineService getScriptEngineServiceForFileExt(String fileExtensio
188204
return scriptEngineService;
189205
}
190206

191-
207+
void setMaxCompilationsPerMinute(Integer newMaxPerMinute) {
208+
this.totalCompilesPerMinute = newMaxPerMinute;
209+
// Reset the counter to allow new compilations
210+
this.scriptsPerMinCounter = totalCompilesPerMinute;
211+
this.compilesAllowedPerNano = ((double) totalCompilesPerMinute) / TimeValue.timeValueMinutes(1).nanos();
212+
}
192213

193214
/**
194215
* Checks if a script can be executed and compiles it if needed, or returns the previously compiled and cached script.
@@ -224,6 +245,38 @@ public CompiledScript compile(Script script, ScriptContext scriptContext, Map<St
224245
return compileInternal(script, params);
225246
}
226247

248+
/**
249+
* Check whether there have been too many compilations within the last minute, throwing a circuit breaking exception if so.
250+
* This is a variant of the token bucket algorithm: https://en.wikipedia.org/wiki/Token_bucket
251+
*
252+
* It can be thought of as a bucket with water, every time the bucket is checked, water is added proportional to the amount of time that
253+
* elapsed since the last time it was checked. If there is enough water, some is removed and the request is allowed. If there is not
254+
* enough water the request is denied. Just like a normal bucket, if water is added that overflows the bucket, the extra water/capacity
255+
* is discarded - there can never be more water in the bucket than the size of the bucket.
256+
*/
257+
void checkCompilationLimit() {
258+
long now = System.nanoTime();
259+
long timePassed = now - lastInlineCompileTime;
260+
lastInlineCompileTime = now;
261+
262+
scriptsPerMinCounter += ((double) timePassed) * compilesAllowedPerNano;
263+
264+
// It's been over the time limit anyway, readjust the bucket to be level
265+
if (scriptsPerMinCounter > totalCompilesPerMinute) {
266+
scriptsPerMinCounter = totalCompilesPerMinute;
267+
}
268+
269+
// If there is enough tokens in the bucket, allow the request and decrease the tokens by 1
270+
if (scriptsPerMinCounter >= 1) {
271+
scriptsPerMinCounter -= 1.0;
272+
} else {
273+
// Otherwise reject the request
274+
throw new CircuitBreakingException("[script] Too many dynamic script compilations within one minute, max: [" +
275+
totalCompilesPerMinute + "/min]; please use on-disk, indexed, or scripts with parameters instead; " +
276+
"this limit can be changed by the [" + SCRIPT_MAX_COMPILATIONS_PER_MINUTE.getKey() + "] setting");
277+
}
278+
}
279+
227280
/**
228281
* Compiles a script straight-away, or returns the previously compiled and cached script,
229282
* without checking if it can be executed based on settings.
@@ -271,28 +324,44 @@ CompiledScript compileInternal(Script script, Map<String, String> params) {
271324
CacheKey cacheKey = new CacheKey(scriptEngineService, type == ScriptType.INLINE ? null : name, code, params);
272325
CompiledScript compiledScript = cache.get(cacheKey);
273326

274-
if (compiledScript == null) {
275-
//Either an un-cached inline script or indexed script
276-
//If the script type is inline the name will be the same as the code for identification in exceptions
277-
try {
278-
// but give the script engine the chance to be better, give it separate name + source code
279-
// for the inline case, then its anonymous: null.
280-
String actualName = (type == ScriptType.INLINE) ? null : name;
281-
compiledScript = new CompiledScript(type, name, lang, scriptEngineService.compile(actualName, code, params));
282-
} catch (ScriptException good) {
283-
// TODO: remove this try-catch completely, when all script engines have good exceptions!
284-
throw good; // its already good
285-
} catch (Exception exception) {
286-
throw new GeneralScriptException("Failed to compile " + type + " script [" + name + "] using lang [" + lang + "]", exception);
327+
if (compiledScript != null) {
328+
return compiledScript;
329+
}
330+
331+
// Synchronize so we don't compile scripts many times during multiple shards all compiling a script
332+
synchronized (this) {
333+
// Retrieve it again in case it has been put by a different thread
334+
compiledScript = cache.get(cacheKey);
335+
336+
if (compiledScript == null) {
337+
try {
338+
// Either an un-cached inline script or indexed script
339+
// If the script type is inline the name will be the same as the code for identification in exceptions
340+
341+
// but give the script engine the chance to be better, give it separate name + source code
342+
// for the inline case, then its anonymous: null.
343+
String actualName = (type == ScriptType.INLINE) ? null : name;
344+
if (logger.isTraceEnabled()) {
345+
logger.trace("compiling script, type: [{}], lang: [{}], params: [{}]", type, lang, params);
346+
}
347+
// Check whether too many compilations have happened
348+
checkCompilationLimit();
349+
compiledScript = new CompiledScript(type, name, lang, scriptEngineService.compile(actualName, code, params));
350+
} catch (ScriptException good) {
351+
// TODO: remove this try-catch completely, when all script engines have good exceptions!
352+
throw good; // its already good
353+
} catch (Exception exception) {
354+
throw new GeneralScriptException("Failed to compile " + type + " script [" + name + "] using lang [" + lang + "]", exception);
355+
}
356+
357+
// Since the cache key is the script content itself we don't need to
358+
// invalidate/check the cache if an indexed script changes.
359+
scriptMetrics.onCompilation();
360+
cache.put(cacheKey, compiledScript);
287361
}
288362

289-
//Since the cache key is the script content itself we don't need to
290-
//invalidate/check the cache if an indexed script changes.
291-
scriptMetrics.onCompilation();
292-
cache.put(cacheKey, compiledScript);
363+
return compiledScript;
293364
}
294-
295-
return compiledScript;
296365
}
297366

298367
private String validateScriptLanguage(String scriptLang) {

core/src/test/java/org/elasticsearch/script/ScriptServiceTests.java

Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,7 @@
2626
import org.elasticsearch.cluster.ClusterState;
2727
import org.elasticsearch.cluster.metadata.MetaData;
2828
import org.elasticsearch.common.Nullable;
29+
import org.elasticsearch.common.breaker.CircuitBreakingException;
2930
import org.elasticsearch.common.bytes.BytesArray;
3031
import org.elasticsearch.common.bytes.BytesReference;
3132
import org.elasticsearch.common.io.Streams;
@@ -80,6 +81,7 @@ public void setup() throws IOException {
8081
baseSettings = Settings.builder()
8182
.put(Environment.PATH_HOME_SETTING.getKey(), createTempDir().toString())
8283
.put(Environment.PATH_CONF_SETTING.getKey(), genericConfigFolder)
84+
.put(ScriptService.SCRIPT_MAX_COMPILATIONS_PER_MINUTE.getKey(), 10000)
8385
.build();
8486
resourceWatcherService = new ResourceWatcherService(baseSettings, null);
8587
scriptEngineService = new TestEngineService();
@@ -123,6 +125,30 @@ String getScriptFromClusterState(String scriptLang, String id) {
123125
};
124126
}
125127

128+
public void testCompilationCircuitBreaking() throws Exception {
129+
buildScriptService(Settings.EMPTY);
130+
scriptService.setMaxCompilationsPerMinute(1);
131+
scriptService.checkCompilationLimit(); // should pass
132+
expectThrows(CircuitBreakingException.class, () -> scriptService.checkCompilationLimit());
133+
scriptService.setMaxCompilationsPerMinute(2);
134+
scriptService.checkCompilationLimit(); // should pass
135+
scriptService.checkCompilationLimit(); // should pass
136+
expectThrows(CircuitBreakingException.class, () -> scriptService.checkCompilationLimit());
137+
int count = randomIntBetween(5, 50);
138+
scriptService.setMaxCompilationsPerMinute(count);
139+
for (int i = 0; i < count; i++) {
140+
scriptService.checkCompilationLimit(); // should pass
141+
}
142+
expectThrows(CircuitBreakingException.class, () -> scriptService.checkCompilationLimit());
143+
scriptService.setMaxCompilationsPerMinute(0);
144+
expectThrows(CircuitBreakingException.class, () -> scriptService.checkCompilationLimit());
145+
scriptService.setMaxCompilationsPerMinute(Integer.MAX_VALUE);
146+
int largeLimit = randomIntBetween(1000, 10000);
147+
for (int i = 0; i < largeLimit; i++) {
148+
scriptService.checkCompilationLimit();
149+
}
150+
}
151+
126152
public void testNotSupportedDisableDynamicSetting() throws IOException {
127153
try {
128154
buildScriptService(Settings.builder().put(ScriptService.DISABLE_DYNAMIC_SCRIPTING_SETTING, randomUnicodeOfLength(randomIntBetween(1, 10))).build());

docs/build.gradle

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,7 @@ integTest {
2323
cluster {
2424
setting 'script.inline', 'true'
2525
setting 'script.stored', 'true'
26+
setting 'script.max_compilations_per_minute', '1000'
2627
Closure configFile = {
2728
extraConfigFile it, "src/test/cluster/config/$it"
2829
}

docs/reference/modules/indices/circuit_breaker.asciidoc

Lines changed: 14 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -72,5 +72,18 @@ memory on a node. The memory usage is based on the content length of the request
7272
A constant that all in flight requests estimations are multiplied with to determine a
7373
final estimation. Defaults to 1
7474

75-
[[http-circuit-breaker]]
75+
[[script-compilation-circuit-breaker]]
7676
[float]
77+
==== Script compilation circuit breaker
78+
79+
Slightly different than the previous memory-based circuit breaker, the script
80+
compilation circuit breaker limits the number of inline script compilations
81+
within a period of time.
82+
83+
See the "prefer-parameters" section of the <<modules-scripting-using,scripting>>
84+
documentation for more information.
85+
86+
`script.max_compilations_per_minute`::
87+
88+
Limit for the number of unique dynamic scripts within a minute that are
89+
allowed to be compiled. Defaults to 15.

docs/reference/modules/scripting/using.asciidoc

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -103,6 +103,12 @@ Instead, pass it in as a named parameter:
103103
The first version has to be recompiled every time the multiplier changes. The
104104
second version is only compiled once.
105105
106+
If you compile too many unique scripts within a small amount of time,
107+
Elasticsearch will reject the new dynamic scripts with a
108+
`circuit_breaking_exception` error. By default, up to 15 inline scripts per
109+
minute will be compiled. You can change this setting dynamically by setting
110+
`script.max_compilations_per_minute`.
111+
106112
========================================
107113

108114

modules/lang-expression/build.gradle

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -35,3 +35,8 @@ dependencyLicenses {
3535
mapping from: /asm-.*/, to: 'asm'
3636
}
3737

38+
integTest {
39+
cluster {
40+
setting 'script.max_compilations_per_minute', '1000'
41+
}
42+
}

modules/lang-groovy/build.gradle

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -30,6 +30,7 @@ integTest {
3030
cluster {
3131
setting 'script.inline', 'true'
3232
setting 'script.stored', 'true'
33+
setting 'script.max_compilations_per_minute', '1000'
3334
}
3435
}
3536

0 commit comments

Comments
 (0)