@eyalkoren
Contributor

Closes #97032

This adds the ability to set a boolean require_data_stream parameter on several APIs.
For document indexing and update, the flag also affects auto-creation of the underlying index: if set to true, an index is created only if a matching index template is found and that template contains a data stream template.

@eyalkoren eyalkoren added >feature :Data Management/Data streams Data streams and their lifecycles v8.12.0 labels Nov 7, 2023
@eyalkoren eyalkoren self-assigned this Nov 7, 2023
@elasticsearchmachine elasticsearchmachine added the external-contributor Pull request authored by a developer outside the Elasticsearch team label Nov 7, 2023
@elasticsearchmachine
Collaborator

Hi @eyalkoren, I've created a changelog YAML for you.

axw added a commit to axw/apm-server that referenced this pull request Nov 17, 2023
We'll switch to setting require_data_stream when
it's available:
elastic/elasticsearch#101872
Member

@dakrone dakrone left a comment

This looks like a good general direction to me. I think we should check whether writing to an alias that points to a data stream still works with the require_data_stream flag (and, potentially, whether it should). I left a few minor comments since this is still a draft PR.

@felixbarny felixbarny assigned dakrone and unassigned eyalkoren Dec 8, 2023
@dakrone dakrone changed the title [WIP] Adding require_data_stream feature Add require_data_stream feature Dec 11, 2023
@dakrone dakrone marked this pull request as ready for review December 11, 2023 15:40
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Dec 11, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@dakrone dakrone removed the request for review from jbaiera December 11, 2023 15:41
@dakrone dakrone requested a review from jbaiera January 9, 2024 23:11
@dakrone
Member

dakrone commented Jan 9, 2024

Alright, I've pulled out the update support, and made this flag mean "if require_data_stream is set, the indexing operation needs to be targeting a data stream, or a data stream that will be created by a template". I've also added more tests for this. Hopefully it's clearer and not too bad to review.
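
The rule described in this comment can be read as a small decision function. The sketch below is a toy model in Python, not the Elasticsearch implementation: the names `Template`, `indexing_allowed`, and the `target_kind` values are hypothetical and exist only for illustration. It assumes a target that already exists as a regular index is rejected when the flag is set, and it leaves out aliases, since the thread does not settle how they behave.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Template:
    """Hypothetical stand-in for the index template matching the target name."""
    has_data_stream: bool  # True if the template contains a data stream template


def indexing_allowed(require_data_stream: bool,
                     target_kind: str,
                     matching_template: Optional[Template]) -> bool:
    """Toy model of the flag's meaning as described in the comment above.

    target_kind is one of (illustrative values, not Elasticsearch terms):
      "data_stream" - the target already exists as a data stream
      "index"       - the target already exists as a regular index
      "missing"     - the target does not exist yet and would be auto-created
    """
    if not require_data_stream:
        # The flag is opt-in; without it, behaviour is unchanged.
        return True
    if target_kind == "data_stream":
        # The operation already targets a data stream.
        return True
    if target_kind == "missing":
        # Allowed only if auto-creation would produce a data stream.
        return matching_template is not None and matching_template.has_data_stream
    # Assumption: an existing regular index never satisfies the flag.
    return False


if __name__ == "__main__":
    print(indexing_allowed(True, "missing", Template(has_data_stream=True)))
    print(indexing_allowed(True, "missing", None))
```

Under this reading, a write with the flag set and no matching data stream template fails instead of silently creating a regular index, which is the behaviour the linked issue asks for.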

Member

@jbaiera jbaiera left a comment

LGTM! 🚢

@dakrone
Member

dakrone commented Jan 17, 2024

@elasticmachine update branch

@dakrone dakrone merged commit 6f4e293 into elastic:main Jan 18, 2024
@axw
Member

axw commented Jan 22, 2024

🎉 thank you @dakrone and @eyalkoren!

@leandrojmp
Contributor

leandrojmp commented Mar 26, 2024

Hello @dakrone and @eyalkoren

Does this change have any impact when using the Logstash elasticsearch output to index into data streams that have a custom naming pattern?

We are planning the upgrade to 8.13 and are checking the release notes. We have Logstash writing to a custom data stream, and since Logstash does not support custom data stream names, we set data_stream => false, point the index option at the data stream alias name, and use create as the action.

Our logstash outputs are like this:

output {
    elasticsearch {
        hosts => ["HOSTS"]
        index => "data-stream-name"
        action => "create"
        http_compression => true
        data_stream => false
        manage_template => false
        ilm_enabled => false
        cacert => 'ca.crt'
        user => 'USER'
        password => 'PASSWORD'
    }
}

I'm not sure whether this change will break this setup.

@axw
Member

axw commented Mar 27, 2024

@leandrojmp this is an opt-in feature, so it will have no effect on your use case. You would need to reconfigure Logstash to pass ?require_data_stream=true in the _bulk API requests for it to have any effect.
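
As a rough illustration of what opting in looks like on the wire, the Python sketch below builds (but does not send) a _bulk request with ?require_data_stream=true appended. The helper name `build_bulk_request`, the host, and the sample document are assumptions made for the example; only the query parameter and the _bulk endpoint come from this thread.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_bulk_request(base_url: str, target: str, docs: list) -> Request:
    """Build a _bulk request that opts in to require_data_stream.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    # The opt-in flag discussed in this PR, passed as a query parameter.
    query = urlencode({"require_data_stream": "true"})

    lines = []
    for doc in docs:
        # "create" mirrors the action used in the Logstash output above.
        lines.append(json.dumps({"create": {"_index": target}}))
        lines.append(json.dumps(doc))
    # The bulk body is newline-delimited JSON and must end with a newline.
    body = ("\n".join(lines) + "\n").encode("utf-8")

    return Request(
        url=f"{base_url}/_bulk?{query}",
        data=body,
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


if __name__ == "__main__":
    # Assumed host and sample document, for illustration only.
    req = build_bulk_request(
        "http://localhost:9200",
        "data-stream-name",
        [{"@timestamp": "2024-03-27T00:00:00Z", "message": "hello"}],
    )
    print(req.full_url)
```

A client that never adds the parameter, such as the Logstash output shown earlier, sends the same request without the query string and is therefore unaffected.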

@dmytro-freger-mesalvo

Hi,

I have this in my exporter:

  elasticsearch:
     endpoints: ["https://elasticsearch-es-http:9200"]
     logs_index: "logs-test"
     logs_dynamic_index:
       enabled: false
     user: "elastic"
     password: "ILJ0lVi91D91z4gfe1202YZO"
     tls:
       insecure: false
       ca_file: "/etc/otel/certs/tls.crt"
       insecure_skip_verify: false

How can I set require_data_stream to false?

thank you

@eyalkoren
Contributor Author

@dmytro-freger-mesalvo I am not sure this is supported. Why are you trying to set it to false?
@carsonip @lahsivjar - can you assist with this?

@carsonip
Member

Hi @dmytro-freger-mesalvo , do you mind either

I'll be able to assist you further.

Labels

:Data Management/Data streams Data streams and their lifecycles external-contributor Pull request authored by a developer outside the Elasticsearch team >feature Team:Data Management Meta label for data/management team v8.13.0

Development

Successfully merging this pull request may close these issues.

Option to prevent auto-creating index with no matching index template