Description
POST /_ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "community_id": {}
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "source": {
          "ip": "1.2.3.4",
          "port": 1122
        },
        "destination": {
          "ip": "5.6.7.8",
          "port": 3344
        },
        "network": {
          "transport": "TCP"
        }
      }
    }
  ]
}
This results in 1:wCb3OG7yAFWelaUydu0D+125CLM=, which matches the first row of the 'Default settings' table.
Similarly:
POST /_ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "community_id": {}
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "source": {
          "ip": "5.6.7.8",
          "port": 0
        },
        "destination": {
          "ip": "1.2.3.4",
          "port": 0
        },
        "network": {
          "iana_number": 1
        }
      }
    }
  ]
}
This results in 1:crodRHL2FEsHjbv3UkRrfbs4bZ0=, which matches a row in the middle of the table.
So we're all good on those.
However, changing the iana_number to 17 (UDP) in that last example gives the following error:
{
  "docs" : [
    {
      "error" : {
        "root_cause" : [
          {
            "type" : "illegal_argument_exception",
            "reason" : "invalid source port [0]"
          }
        ],
        "type" : "illegal_argument_exception",
        "reason" : "invalid source port [0]"
      }
    }
  ]
}
Placeholder editorial comment: as I write this ticket, I'm not certain whether it's correct for us to do this kind of validation at this point. Is it up to us at this stage to tell users that the data is bad (is it even bad?), or should we just process it?
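For cross-checking values independently of the processor, here is a minimal Python sketch of Community ID v1 for port-based protocols, as I understand the published spec: SHA-1 over the seed, the two ordered endpoints, the protocol number, a padding byte, and the two ports, then base64 with a "1:" prefix. The function name community_id_v1 and its parameters are made up for illustration, and the ICMP type/code mapping is deliberately left out, so this only applies to the TCP and UDP cases. It is not the Elasticsearch or Beats implementation.

```python
import base64
import hashlib
import ipaddress
import struct


def community_id_v1(src_ip, src_port, dst_ip, dst_port, proto, seed=0):
    """Hypothetical helper: Community ID v1 for TCP (6) / UDP (17) flows.

    Assumes the v1 layout: seed (2 bytes), source address, destination
    address, protocol (1 byte), padding (1 byte), source port, destination
    port (2 bytes each), all in network byte order.
    """
    src = ipaddress.ip_address(src_ip).packed
    dst = ipaddress.ip_address(dst_ip).packed

    # Order the endpoints (address first, then port) so that both
    # directions of the same flow produce the same hash.
    if (src, src_port) > (dst, dst_port):
        src, src_port, dst, dst_port = dst, dst_port, src, src_port

    data = (
        struct.pack("!H", seed)
        + src
        + dst
        + struct.pack("!BB", proto, 0)
        + struct.pack("!HH", src_port, dst_port)
    )
    digest = hashlib.sha1(data).digest()
    return "1:" + base64.b64encode(digest).decode("ascii")


# The TCP flow from the first request above.
tcp_id = community_id_v1("1.2.3.4", 1122, "5.6.7.8", 3344, 6)
# The failing case: UDP (17) with both ports set to 0.
udp_zero_id = community_id_v1("5.6.7.8", 0, "1.2.3.4", 0, 17)
print(tcp_id)
print(udp_zero_id)
```

The first printed value can be compared against the TCP result reported above. The second call shows only that the hash input is mechanically well defined when both ports are 0; whether the spec considers that input valid is exactly the open question here.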
Another case where we deviate from the documented spec is captured in #105247 (comment). I haven't debugged that issue well enough to say why our value differs from the specified value, however.
My recollection is that the Elasticsearch implementation is a port of the Beats implementation, so if we're wrong in these ways then they're probably wrong in the same ways. Once we figure out where we're wrong, we should raise that with them, too. At the very least it makes sense to add a few more cases to their test suite to confirm that they handle these cases correctly, but if they have the same issues we do, then those issues should also be fixed so that our behaviors stay in sync. (See #55685 / #66534.)
An annoyance here is that fixing the broken behavior is probably a breaking change: if you've been relying on the community ids that we've been providing, and those values have been consistently wrong, then once we fix the behavior you'll get new values that, while correct, differ from what you'd seen before. 🤷