MDEV-36845 InnoDB: Failing assertion: tail.trx_no <= last_trx_no #4446

vlad-lesin · 2025-11-17T16:16:52Z

The bug scenario is as follows. Before the server is killed, transaction A starts writing an undo log in undo segment U of rollback segment (rseg) R and writes its trx_id into the undo log header. A new trx_id is then assigned to transaction B, but B has not started its undo log yet. Transaction A then commits and writes its trx_no into its undo log header. After that, transaction B starts writing its undo log into the same undo segment U. Undo segment U therefore contains the following undo logs:

... undo log 1...
... undo log 2...
...
undo log A, trx_id: L, trx_no: M, ...
undo log B, trx_id: N, trx_no: 0, ...

Where L < N < M.
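
To make the ordering concrete, here is a minimal sketch of how L < N < M arises when trx_id and trx_no are drawn from one monotonically increasing counter. `TrxCounter`, `Scenario` and `run_scenario` are hypothetical names used only for illustration; they are not InnoDB code.

```cpp
#include <cstdint>

// Hypothetical stand-in for the server-wide transaction id/no counter.
struct TrxCounter {
  std::uint64_t next_value = 1;
  std::uint64_t allocate() { return next_value++; }
};

// The three values from the scenario, named as in the description.
struct Scenario {
  std::uint64_t L;  // trx_id of transaction A
  std::uint64_t N;  // trx_id of transaction B
  std::uint64_t M;  // trx_no of transaction A, assigned at commit
};

// Replays the order of events: A starts, B gets its id, A commits.
Scenario run_scenario(TrxCounter& counter) {
  Scenario s{};
  s.L = counter.allocate();  // A starts writing undo: trx_id = L
  s.N = counter.allocate();  // B is assigned trx_id = N, no undo log yet
  s.M = counter.allocate();  // A commits: trx_no = M
  return s;
}
```

Because A commits only after B has received its trx_id, A's trx_no is necessarily larger than B's trx_id in this model.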

Then the server is killed.

On recovery, the maximum trx_no is extracted from each rseg, and the maximum trx_no among all rsegs plus one becomes the new value of the server-wide transaction id/no counter.

For each undo segment of each rseg, we read the last undo log header. If the last undo log is committed, we read trx_no from the header; otherwise we treat its trx_id as trx_no. The maximum over all undo segments of the current rseg is treated as the maximum trx_no of that rseg.
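
The recovery rule described above can be sketched roughly as follows. This is a simplified model, not the code in trx0undo.cc: `UndoLogHeader`, `UndoSegment`, `Rseg` and the `*_old` functions are hypothetical names, and treating `trx_no == 0` as "not committed" is an assumption taken from the description rather than from the real on-disk state flags.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical, simplified undo log header; not the InnoDB on-disk layout.
struct UndoLogHeader {
  std::uint64_t trx_id;  // written when the transaction starts its undo log
  std::uint64_t trx_no;  // written at commit; 0 while not committed (assumption)
};

// An undo segment is modeled as its undo log headers, oldest first.
using UndoSegment = std::vector<UndoLogHeader>;
// A rollback segment is modeled as a list of undo segments.
using Rseg = std::vector<UndoSegment>;

// Rule before the fix: only the last undo log of the segment is inspected.
std::uint64_t segment_max_trx_no_old(const UndoSegment& seg) {
  if (seg.empty()) return 0;
  const UndoLogHeader& last = seg.back();
  return last.trx_no != 0 ? last.trx_no : last.trx_id;
}

// Maximum over all undo segments of one rseg.
std::uint64_t rseg_max_trx_no_old(const Rseg& rseg) {
  std::uint64_t max_no = 0;
  for (const UndoSegment& seg : rseg)
    max_no = std::max(max_no, segment_max_trx_no_old(seg));
  return max_no;
}

// New counter value after recovery: maximum over all rsegs, plus one.
std::uint64_t recovered_counter_old(const std::vector<Rseg>& rsegs) {
  std::uint64_t max_no = 0;
  for (const Rseg& rseg : rsegs)
    max_no = std::max(max_no, rseg_max_trx_no_old(rseg));
  return max_no + 1;
}
```

With a segment holding A (trx_id 10, trx_no 30) followed by B (trx_id 20, trx_no 0), this model yields 20 for the segment and 21 for the recovered counter, which is below A's trx_no of 30.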

In the case above, the undo log of transaction B is not committed and its trx_no is 0, so we read its trx_id N and treat it as trx_no. But N < M. If U is the last modified undo segment in rseg R, and N is the maximum trx_no among all rsegs, then after recovery some transaction C can be committed with a trx_no_C such that N < trx_no_C <= M.

During purge, we store the trx_no of the last parsed undo log of a committed transaction in purge_sys.tail.trx_no. So if the last parsed undo log is that of transaction A (transaction B was rolled back on recovery, and its undo log was removed from undo segment U), then purge_sys.tail.trx_no = M. Then, if some other transaction C with trx_no_C <= M is committed and purged, we hit the "tail.trx_no <= last_trx_no" assertion failure in purge_sys_t::choose_next_log(). The purge queue is a min-heap of (trx_no, trx_sys.rseg_array index) pairs keyed by trx_no, so the trx_no of the last parsed undo log of a committed transaction must never be greater than the last trx_no of the rseg at the top of the queue.
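
A rough model of the violated invariant, assuming the purge queue behaves like a `std::priority_queue` ordered as a min-heap. `PurgeElem`, `PurgeQueue` and `next_log_is_in_order` are hypothetical names; the real check lives in purge_sys_t::choose_next_log() and is not reproduced here.

```cpp
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// (trx_no, rseg index) pair; std::greater makes the smallest trx_no the top.
using PurgeElem = std::pair<std::uint64_t, unsigned>;
using PurgeQueue = std::priority_queue<PurgeElem, std::vector<PurgeElem>,
                                       std::greater<PurgeElem>>;

// Returns true when the invariant behind the assertion holds: the trx_no of
// the last parsed committed undo log (tail_trx_no) must not exceed the
// trx_no at the top of the queue.
bool next_log_is_in_order(const PurgeQueue& queue, std::uint64_t tail_trx_no) {
  if (queue.empty()) return true;
  return tail_trx_no <= queue.top().first;
}
```

In this model, a tail of M = 30 combined with a queued element whose trx_no is 25 makes the check return false, which corresponds to the assertion failure described above.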

The fix is to read, during recovery, the trx_no of the previous-to-last undo log in an undo segment if the last undo log in that segment is not committed, and to set trx_no = max(trx_id of the last undo log, trx_no of the previous-to-last undo log).

We can do this because we need the maximum value of trx_no or trx_id in the undo segment, and that maximum is either the trx_id of the last undo log or the trx_no of the previous-to-last undo log: an undo segment can be assigned to only one transaction at a time, and the undo logs in an undo segment are ordered by trx_id.
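
The corrected rule can be sketched as below, under the same simplifying assumptions as the description: an undo segment is a list of headers, oldest first, and `trx_no == 0` marks an uncommitted log. `UndoLogHeader`, `UndoSegment` and `segment_max_trx_no_fixed` are hypothetical names, not the functions changed in trx0undo.cc.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified undo log header; not the InnoDB on-disk layout.
struct UndoLogHeader {
  std::uint64_t trx_id;  // written when the transaction starts its undo log
  std::uint64_t trx_no;  // written at commit; 0 while not committed (assumption)
};

// An undo segment is modeled as its undo log headers, oldest first.
using UndoSegment = std::vector<UndoLogHeader>;

// Rule after the fix: if the last undo log is not committed, also consider
// the trx_no of the previous-to-last undo log.
std::uint64_t segment_max_trx_no_fixed(const UndoSegment& seg) {
  if (seg.empty()) return 0;
  const UndoLogHeader& last = seg.back();
  if (last.trx_no != 0) return last.trx_no;  // committed: trx_no is the maximum
  std::uint64_t result = last.trx_id;        // uncommitted: start from trx_id
  const std::size_t n = seg.size();
  if (n >= 2) result = std::max(result, seg[n - 2].trx_no);
  return result;
}
```

For a segment holding A (trx_id 10, trx_no 30) followed by B (trx_id 20, trx_no 0), this returns 30 rather than 20, so the recovered counter in this model can no longer fall at or below A's trx_no.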

Reviewed by Marko Mäkelä.

CLAassistant · 2025-11-17T16:16:58Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

dr-m

For some reason, the test case and related instrumentation would not reproduce the problem in the 10.6 branch, even though I believe that this bug should have existed starting with 947efe1 in 10.3.

Anyway, because the bug is quite rare and a possible work-around exists (start up the server with innodb_force_recovery=2, shut down, and upgrade to 10.11 or later), I think that it is OK not to fix this in the oldest maintained branch (10.6).

mysql-test/suite/innodb/t/max_trx_no_recovery.test

storage/innobase/trx/trx0undo.cc

vlad-lesin requested a review from dr-m November 17, 2025 16:16

dr-m approved these changes Nov 18, 2025

vlad-lesin force-pushed the 10.11-MDEV-36845 branch from fd3006d to 9c60174 Compare November 19, 2025 07:47

vlad-lesin merged commit 9c60174 into 10.11 Nov 19, 2025
16 of 17 checks passed

vlad-lesin deleted the 10.11-MDEV-36845 branch November 19, 2025 08:48

svoj added the MariaDB Corporation label Nov 19, 2025
