From: "yebin (H)" <yebin10@huawei.com>
To: liubaolin <liubaolin12138@163.com>, <tytso@mit.edu>, <jack@suse.com>
Cc: <linux-ext4@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<wangguanyu@vivo.com>, Baolin Liu <liubaolin@kylinos.cn>
Subject: Re: [PATCH v1] jbd2: check transaction state before stopping handle
Date: Fri, 6 Mar 2026 09:20:53 +0800 [thread overview]
Message-ID: <69AA2BF5.3000403@huawei.com> (raw)
In-Reply-To: <e52f326a-8da2-457f-8aee-5729373b9582@163.com>
On 2026/3/5 21:11, liubaolin wrote:
>> Dear maintainers and community,
>>
>> Our customer reported a kernel crash issue. The kernel crashes in
>> stop_this_handle() with:
>> J_ASSERT(atomic_read(&transaction->t_updates) > 0);
>>
>> Crash stack:
>> Call trace:
>> stop_this_handle+0x148/0x158
>> jbd2_journal_stop+0x198/0x388
>> __ext4_journal_stop+0x70/0xf0
>> ext4_create+0x12c/0x188
>> ...
>>
>> From the vmcore dump, I found that handle->h_transaction->t_updates is
>> 0 and handle->h_transaction->t_state is T_FINISHED. This means the
>> handle is still bound to a transaction that has already completed.
>>
>> In the JBD2 commit process, once a transaction enters T_LOCKED state,
>> the commit thread waits for all associated handles to complete
>> (t_updates becomes 0). After T_LOCKED completes, the transaction
>> transitions to T_FLUSH, T_COMMIT, and eventually T_FINISHED. At this
>> point, t_updates is legitimately 0, and there should be no active
>> handles bound to this transaction.
>>
>> This appears to be a low-probability race condition that occurs under
>> high concurrency scenarios. The crash happened during ext4_create(),
I'm curious about what race condition causes this?
>> and at some rare timing point during the execution, a handle ended up
>> bound to a transaction that had already completed. The symptom is
>> clear: a handle is bound to a T_FINISHED transaction with t_updates == 0.
>>
>> To prevent this, I added a transaction state check in both
>> jbd2_journal_stop() and jbd2__journal_restart() before calling
>> stop_this_handle().
>> If the transaction is not in T_RUNNING or T_LOCKED state (i.e., it's
>> in T_FLUSH or later states), I clear handle->h_transaction, clear
>> current->journal_info if needed, restore the memalloc_nofs context,
>> and skip calling stop_this_handle(). This is a defensive check to
>> handle the edge case where a handle is bound to a transaction in an
>> invalid state.
>>
If this state occurs, there must be something wrong somewhere.
Assertions should also be added to detect such anomalies. I think
directly skipping the assertion check is not a good way to solve the
problem.
>> Please let me know if you have any questions or concerns.
>>
>> Best regards,
>> Baolin Liu
>
>
>
> 在 2026/3/5 20:57, Baolin Liu 写道:
>>
>> Add others
>>
>>
>>
>>
>>
>> At 2026-03-05 20:54:02, "Baolin Liu" <liubaolin12138@163.com> wrote:
>>> From: Baolin Liu <liubaolin@kylinos.cn>
>>>
>>> When a transaction enters T_FLUSH or later states,
>>> handle->h_transaction may still point to it.
>>> If jbd2_journal_stop() or jbd2__journal_restart() is called,
>>> stop_this_handle() checks t_updates > 0, but t_updates is
>>> already 0 for these states, causing a kernel BUG.
>>>
>>> Fix by checking transaction->t_state in jbd2_journal_stop()
>>> and jbd2__journal_restart() before calling stop_this_handle().
>>> If the transaction is not in T_RUNNING or T_LOCKED state,
>>> clear handle->h_transaction and skip stop_this_handle().
>>>
>>> Crash stack:
>>> Call trace:
>>> stop_this_handle+0x148/0x158
>>> jbd2_journal_stop+0x198/0x388
>>> __ext4_journal_stop+0x70/0xf0
>>> ext4_create+0x12c/0x188
>>> lookup_open+0x214/0x6d8
>>> do_last+0x364/0x878
>>> path_openat+0x6c/0x280
>>> do_filp_open+0x70/0xe8
>>> do_sys_open+0x178/0x200
>>> sys_openat+0x3c/0x50
>>> el0_svc_naked+0x44/0x48
>>>
>>> Signed-off-by: Baolin Liu <liubaolin@kylinos.cn>
>>> ---
>>> fs/jbd2/transaction.c | 25 +++++++++++++++++++++++--
>>> 1 file changed, 23 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
>>> index dca4b5d8aaaa..3779382dbb80 100644
>>> --- a/fs/jbd2/transaction.c
>>> +++ b/fs/jbd2/transaction.c
>>> @@ -772,14 +772,25 @@ int jbd2__journal_restart(handle_t *handle, int
>>> nblocks, int revoke_records,
>>> journal = transaction->t_journal;
>>> tid = transaction->t_tid;
>>>
>>> + jbd2_debug(2, "restarting handle %p\n", handle);
>>> +
>>> + /* Check if transaction is in invalid state */
>>> + if (transaction->t_state != T_RUNNING &&
>>> + transaction->t_state != T_LOCKED) {
>>> + if (current->journal_info == handle)
>>> + current->journal_info = NULL;
>>> + handle->h_transaction = NULL;
>>> + memalloc_nofs_restore(handle->saved_alloc_context);
>>> + goto skip_stop;
>>> + }
>>> +
>>> /*
>>> * First unlink the handle from its current transaction, and
>>> start the
>>> * commit on that.
>>> */
>>> - jbd2_debug(2, "restarting handle %p\n", handle);
>>> stop_this_handle(handle);
>>> handle->h_transaction = NULL;
>>> -
>>> +skip_stop:
>>> /*
>>> * TODO: If we use READ_ONCE / WRITE_ONCE for j_commit_request we
>>> can
>>> * get rid of pointless j_state_lock traffic like this.
>>> @@ -1856,6 +1867,16 @@ int jbd2_journal_stop(handle_t *handle)
>>> memalloc_nofs_restore(handle->saved_alloc_context);
>>> goto free_and_exit;
>>> }
>>> + /* Check if transaction is in invalid state */
>>> + if (transaction->t_state != T_RUNNING &&
>>> + transaction->t_state != T_LOCKED) {
>>> + if (current->journal_info == handle)
>>> + current->journal_info = NULL;
>>> + handle->h_transaction = NULL;
>>> + memalloc_nofs_restore(handle->saved_alloc_context);
>>> + goto free_and_exit;
>>> + }
>>> +
>>> journal = transaction->t_journal;
>>> tid = transaction->t_tid;
>>>
>>> --
>>> 2.39.2
>
>
>
> .
>
next prev parent reply other threads:[~2026-03-06 1:20 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-05 12:54 [PATCH v1] jbd2: check transaction state before stopping handle Baolin Liu
[not found] ` <257c6f1e.a166.19cbe12f387.Coremail.liubaolin12138@163.com>
2026-03-05 13:08 ` Jan Kara
2026-03-06 9:19 ` liubaolin
2026-03-05 13:11 ` liubaolin
2026-03-06 1:20 ` yebin (H) [this message]
2026-03-06 10:00 ` liubaolin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=69AA2BF5.3000403@huawei.com \
--to=yebin10@huawei.com \
--cc=jack@suse.com \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=liubaolin12138@163.com \
--cc=liubaolin@kylinos.cn \
--cc=tytso@mit.edu \
--cc=wangguanyu@vivo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.