Optimize indexing by replacing the synchronized lock in TranslogWriter.add with Disruptor in async mode #45371

When indexing large volumes of data at high speed, the synchronized lock in TranslogWriter.add wastes a lot of time. Here are some situations in which threads compete for that lock:

- many other threads writing to the same index shard
- a flush triggered by IndexService.AsyncTranslogFSync
- a flush triggered when the translog reaches flush_threshold_size
- a flush triggered when rolling the generation

For the user, even when the translog durability strategy is set to **async**, a lot of performance is still lost (about 50% in my situation) compared with no translog writing. So I tried using a RingBuffer (Disruptor) to make adding to the translog **lock free** for the writing threads.
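To illustrate the idea, here is a minimal sketch of a lock-free hand-off from indexing threads to a single translog consumer. It is not the actual patch and does not use the Disruptor API: `AsyncTranslogSketch` and its methods are hypothetical names, and `java.util.concurrent.ConcurrentLinkedQueue` stands in for the Disruptor RingBuffer. Unlike a ring buffer, this queue is unbounded, so it applies no backpressure when the consumer falls behind.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: neither Elasticsearch's TranslogWriter nor the Disruptor API.
final class AsyncTranslogSketch {
    // Lock-free multi-producer queue standing in for the Disruptor RingBuffer.
    private final ConcurrentLinkedQueue<byte[]> pending = new ConcurrentLinkedQueue<>();
    // Only touched by the single consumer thread, so it needs no lock.
    private final List<byte[]> written = new ArrayList<>();
    private final AtomicBoolean running = new AtomicBoolean(true);
    private final Thread consumer = new Thread(this::drainLoop, "translog-consumer");

    AsyncTranslogSketch() {
        consumer.start();
    }

    // Called by indexing threads: enqueue and return without taking a monitor.
    void add(byte[] operation) {
        pending.offer(operation);
    }

    // Single consumer: the only place where operations would reach the file.
    private void drainLoop() {
        while (running.get() || !pending.isEmpty()) {
            byte[] op = pending.poll();
            if (op == null) {
                Thread.onSpinWait(); // nothing queued yet, spin briefly
                continue;
            }
            written.add(op); // real code would append to the translog file here
        }
    }

    // Call after all producers have finished: drains the queue and
    // returns the number of operations the consumer handled.
    int close() throws InterruptedException {
        running.set(false);
        consumer.join();
        return written.size();
    }
}
```

With this shape, flushes and generation rolls would also run on the consumer thread, so indexing threads never wait on them. The trade-off is that an operation is only durable after the consumer has written it, which fits the async durability mode only.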

Test scenario:
- machine: CPU = 24 cores, memory = 64 GB
- doc count: 10 million
- field count per doc: 400+
- translog size per doc: about 3 KB (34 GB of translog generated for 10 million docs)

Elasticsearch node config and index config:
    
    # large values to reduce the number of full flushes
    index.translog.flush_threshold_size: 30G (default 512M)
    index.translog.sync_interval: 60s (default 5s)

    indices.memory.index_buffer_size: 20% (default 10%)

Elapsed indexing time for each scenario:

- translog enabled with async durability: 18 minutes
- translog disabled (source code changed): 12 minutes
- translog written asynchronously using Disruptor (source code changed): 12 minutes
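As a quick check on these numbers, the sketch below computes the extra elapsed time and the average throughput from the figures above (10 million docs, 18 and 12 minutes). The class and method names are made up for illustration, and reading the "about 50%" figure as extra elapsed time is an assumption.

```java
// Hypothetical helper, only for the arithmetic on the measurements above.
final class TranslogOverhead {
    // Extra elapsed time of the slow run relative to the fast run, in percent.
    static double extraTimePercent(double slowMinutes, double fastMinutes) {
        return (slowMinutes - fastMinutes) / fastMinutes * 100.0;
    }

    // Average indexing throughput in docs per second.
    static double docsPerSecond(long docs, double minutes) {
        return docs / (minutes * 60.0);
    }
}
```

`TranslogOverhead.extraTimePercent(18, 12)` gives 50.0, and `docsPerSecond(10_000_000L, ...)` gives roughly 9,259 docs/s for the 18-minute run against roughly 13,889 docs/s for the 12-minute runs.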

Reference: https://lmax-exchange.github.io/disruptor/

