-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Description
Consider a pipeline like the following:
PUT _ingest/pipeline/test-pipeline-1
{
"processors" : [
{
"script": {
"lang": "painless",
"params": {
"list": [1,2,3,4]
},
"source": "ctx.list = params.get('list');"
}
},
{
"append": {
"field": "list",
"value": 10
}
}
]
}
And then repeatedly invoke the _simulate API against that pipeline:
POST /_ingest/pipeline/test-pipeline-1/_simulate
{
"docs": [
{
"_index": "index",
"_id": "id",
"_source": {
"message": "hello"
}
}
]
}
Each invocation will result in growing a new '10' because the script is leaking a reference to the underlying mutable 'list' parameter:
{
"docs" : [
{
"doc" : {
"_index" : "index",
"_id" : "id",
"_version" : "-3",
"_source" : {
"message" : "hello",
"list" : [
1,
2,
3,
4,
10,
10,
10,
10,
10,
10,
10 # weeeeeeeeeeeeeeeeeeeeeee
]
},
"_ingest" : {
"timestamp" : "2022-11-28T17:46:35.03053Z"
}
}
}
]
}
Fixing this in the calling code is relatively easy, the pipeline should be rewritten as:
PUT _ingest/pipeline/test-pipeline-1
{
"processors" : [
{
"script": {
"lang": "painless",
"params": {
"list": [1,2,3,4]
},
"source": "ctx.list = new ArrayList(params.get('list'));"
}
},
{
"append": {
"field": "list",
"value": 10
}
}
]
}
Now the list parameter is defensively copied and so we're no longer banging on an unintentionally static object.
Note: the params object itself is unmodifiable, you can't invent/inject a new parameter at runtime via script, however it's only shallowly immutable -- the tree of maps and lists that it represents is made of mutable maps and lists.
IMHO we should update ScriptProcessor to either return a shared deeply immutable param tree, or we should duplicate the params tree on each invocation in which case it's still mutable but no longer shared.
The current behavior is harmless enough when ingest pipelines merely leak the mutable reference, but it can cause rare and hard to track down race-condition bugs when the pipelines actually do mutate the shared mutable reference.