From: Sasha Levin <sashal@kernel.org>
To: stable@vger.kernel.org
Cc: Brian Foster <bfoster@redhat.com>,
Baokun Li <libaokun1@huawei.com>, Theodore Ts'o <tytso@mit.edu>,
stable@kernel.org, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.15.y] ext4: fix dirtyclusters double decrement on fs shutdown
Date: Wed, 25 Feb 2026 09:52:32 -0500
Message-ID: <20260225145232.546260-1-sashal@kernel.org>
In-Reply-To: <2026022422-deviancy-scanning-ca65@gregkh>

From: Brian Foster <bfoster@redhat.com>

[ Upstream commit 94a8cea54cd935c54fa2fba70354757c0fc245e3 ]

fstests test generic/388 occasionally reproduces a warning in
ext4_put_super() associated with the dirty clusters count:

  WARNING: CPU: 7 PID: 76064 at fs/ext4/super.c:1324 ext4_put_super+0x48c/0x590 [ext4]

Tracing the failure shows that the warning fires due to an
s_dirtyclusters_counter value of -1. IOW, this appears to be a
spurious decrement as opposed to some sort of leak. Further tracing
of the dirty cluster count deltas and an LLM scan of the resulting
output identified the cause as a double decrement in the error path
between ext4_mb_mark_diskspace_used() and the caller
ext4_mb_new_blocks().

First, note that generic/388 is a shutdown vs. fsstress test and so
produces a random set of operations and shutdown injections. In the
problematic case, the shutdown triggers an error return from the
ext4_handle_dirty_metadata() call(s) made from
ext4_mb_mark_context(). The changed value is non-zero at this point,
so ext4_mb_mark_diskspace_used() does not exit after the error
bubbles up from ext4_mb_mark_context(). Instead, the former
decrements both cluster counters and returns the error up to
ext4_mb_new_blocks(). The latter falls into the !ar->len out path
which decrements the dirty clusters counter a second time, creating
the inconsistency.

To avoid this problem and simplify ownership of the cluster
reservation in this codepath, lift the counter reduction to a single
place in the caller. This makes it more clear that
ext4_mb_new_blocks() is responsible for acquiring cluster
reservation (via ext4_claim_free_clusters()) in the !delalloc case
as well as releasing it, regardless of whether it ends up consumed
or returned due to failure.

Fixes: 0087d9fb3f29 ("ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc")
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Link: https://patch.msgid.link/20260113171905.118284-1-bfoster@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
[ Drop mballoc-test changes ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
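Note (not part of the commit): the double decrement described in the
commit message can be illustrated with a small userspace model. This is
only a sketch under stated assumptions, not the kernel code: a plain
long stands in for the percpu s_dirtyclusters_counter, and struct
model_sb, mark_used_old(), mark_used_new(), alloc_old() and alloc_new()
are hypothetical names invented for illustration. The -5 return value
models an -EIO style failure after shutdown.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the superblock counter; not a kernel type. */
struct model_sb {
	long dirty_clusters;	/* models s_dirtyclusters_counter */
};

/*
 * Old callee behaviour (model): the reservation is dropped here even
 * when the function goes on to return an error.
 */
static int mark_used_old(struct model_sb *sb, long reserv, bool journal_error)
{
	sb->dirty_clusters -= reserv;		/* first decrement */
	return journal_error ? -5 : 0;		/* -5 models -EIO */
}

/* Old caller (model): releases the reservation again when len == 0. */
static long alloc_old(struct model_sb *sb, long want, bool journal_error)
{
	long reserv = want, len = want;

	sb->dirty_clusters += reserv;	/* models ext4_claim_free_clusters() */
	if (mark_used_old(sb, reserv, journal_error))
		len = 0;		/* allocation failed, nothing returned */
	if (!len)
		sb->dirty_clusters -= reserv;	/* second decrement on error */
	return len;
}

/* New callee (model): no longer touches the dirty clusters count. */
static int mark_used_new(bool journal_error)
{
	return journal_error ? -5 : 0;
}

/* New caller (model): acquires and releases the reservation in one place. */
static long alloc_new(struct model_sb *sb, long want, bool journal_error)
{
	long reserv = want, len = want;

	sb->dirty_clusters += reserv;
	if (mark_used_new(journal_error))
		len = 0;
	if (reserv)
		sb->dirty_clusters -= reserv;	/* single release, success or failure */
	return len;
}
```

In this model, calling alloc_old() with journal_error set leaves the
counter one reservation below where it started, while alloc_new()
returns it to its starting value on both the success and error paths.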
fs/ext4/mballoc.c | 21 +++++----------------
1 file changed, 5 insertions(+), 16 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 0fd921ada8973..e16f64daf1334 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3836,8 +3836,7 @@ void ext4_exit_mballoc(void)
* Returns 0 if success or error code
*/
static noinline_for_stack int
-ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
- handle_t *handle, unsigned int reserv_clstrs)
+ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac, handle_t *handle)
{
struct buffer_head *bitmap_bh = NULL;
struct ext4_group_desc *gdp;
@@ -3925,13 +3924,6 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
ext4_unlock_group(sb, ac->ac_b_ex.fe_group);
percpu_counter_sub(&sbi->s_freeclusters_counter, ac->ac_b_ex.fe_len);
- /*
- * Now reduce the dirty block count also. Should not go negative
- */
- if (!(ac->ac_flags & EXT4_MB_DELALLOC_RESERVED))
- /* release all the reserved blocks if non delalloc */
- percpu_counter_sub(&sbi->s_dirtyclusters_counter,
- reserv_clstrs);
if (sbi->s_log_groups_per_flex) {
ext4_group_t flex_group = ext4_flex_group(sbi,
@@ -5803,7 +5795,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
ext4_mb_pa_free(ac);
}
if (likely(ac->ac_status == AC_STATUS_FOUND)) {
- *errp = ext4_mb_mark_diskspace_used(ac, handle, reserv_clstrs);
+ *errp = ext4_mb_mark_diskspace_used(ac, handle);
if (*errp) {
ext4_discard_allocated_blocks(ac);
goto errout;
@@ -5835,12 +5827,9 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
kmem_cache_free(ext4_ac_cachep, ac);
if (inquota && ar->len < inquota)
dquot_free_block(ar->inode, EXT4_C2B(sbi, inquota - ar->len));
- if (!ar->len) {
- if ((ar->flags & EXT4_MB_DELALLOC_RESERVED) == 0)
- /* release all the reserved blocks if non delalloc */
- percpu_counter_sub(&sbi->s_dirtyclusters_counter,
- reserv_clstrs);
- }
+ /* release any reserved blocks */
+ if (reserv_clstrs)
+ percpu_counter_sub(&sbi->s_dirtyclusters_counter, reserv_clstrs);
trace_ext4_allocate_blocks(ar, (unsigned long long)block);
--
2.51.0