Skip to content

Race condition in async removal after destroy #266

@georgeclaghorn

Description

@georgeclaghorn

With enqueue: true, an after_destroy callback is added which enqueues a MeiliSearch::Rails::MSJob intended to remove the destroyed record from the Meilisearch index:

elsif respond_to?(:after_destroy)
after_destroy { |searchable| searchable.ms_enqueue_remove_from_index!(ms_synchronous?) }

def ms_enqueue_remove_from_index!(synchronous)
if meilisearch_options[:enqueue]
unless self.class.send(:ms_indexing_disabled?, meilisearch_options)
meilisearch_options[:enqueue].call(self, true)
end
else
ms_remove_from_index!(synchronous || ms_synchronous?)
end
end

if options[:enqueue]
raise ArgumentError, 'Cannot use a enqueue if the `synchronous` option is set' if options[:synchronous]
proc = if options[:enqueue] == true
proc do |record, remove|
MSJob.perform_later(record, remove ? 'ms_remove_from_index!' : 'ms_index!')
end
elsif options[:enqueue].respond_to?(:call)
options[:enqueue]
elsif options[:enqueue].is_a?(Symbol)
proc { |record, remove| send(options[:enqueue], record, remove) }
else
raise ArgumentError, "Invalid `enqueue` option: #{options[:enqueue]}"
end
meilisearch_options[:enqueue] = proc do |record, remove|
proc.call(record, remove) unless ms_without_auto_index_scope
end
end

Line 372 above passes the destroyed record as an argument to the job. Active Job transparently serializes the record to a Global ID on enqueue and looks up the record from the Global ID when the job is performed.

The enqueued background job may be performed after the transaction wrapping the record’s destruction has committed. In this case, Active Job won’t be able to find the record in the DB from its Global ID, and the MSJob will fail with an ActiveJob::DeserializationError.

From a quick look, it seems that the plugin’s tests don’t catch this because they only assert that a job is enqueued with the correct arguments. They might have caught it if they exercised the end-to-end destroy→enqueue→perform flow using ActiveJob::TestHelper’s perform_enqueued_jobs or similar.

Suggested fix:

  1. Replace the after_destroy with an after_destroy_commit callback.
  2. Update the job to take a model and record ID as arguments. (Alternatively, add a new job to avoid breaking changes to the existing background job’s signature.)
  3. Attempt to look up the record. If it exists, index it. If it doesn't, delete it from the index.
    def perform(model, id)
      if record = model.unscoped.find_by(id: id)
        record.ms_index!
      else
        model.ms_remove_document_from_index_by_id!(id)
      end
    end

Further reading: elasticsearch-model’s documentation includes an example Sidekiq job implementing an alternative approach (taking an operation name argument rather than using the record’s existence to decide what to do).

History:

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions