From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id B92E1271464
	for ; Tue, 24 Feb 2026 21:45:08 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1771969508; cv=none;
	b=UJlWDLsRz47+zc4e2VrLpFPxy0Fb/M+laErXu4xc0ImtOXa+D8/7FLvRBCUI57m6N7yHJq0yFL6sipAFJWP56P3pSZFzqwX+XR8n0/C4o8jJ+V2pIKt6My3NGeIaAsuoONMNFpraCn7dAsKrr7idxWDYJ5pbggd0W1ltE0Dcu28=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1771969508; c=relaxed/simple;
	bh=kLLQz4hrDVBThYhsBrbslZwAfVghh2Xvg3qVMG0yYgI=;
	h=Subject:To:Cc:From:Date:Message-ID:MIME-Version:Content-Type;
	b=S0Vkz7CtmfTQrigzuZIGXLLk4DWzKZfnPLIySuFUdsj4UFKRr9GbayAuBbKFdMRqhuZ4pHQ8HeXkrfnn9VKX86eV5ghhhEaH6rW1qPsC42jFLlKlFyA2S4n+h38ab3X0ixfsyJbtiLkuT50BStPom3+qiVov8DI4C+uW3biCQPc=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org;
	dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=zmRh0l+3;
	arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="zmRh0l+3"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5DD2DC116D0;
	Tue, 24 Feb 2026 21:45:08 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org;
	s=korg; t=1771969508;
	bh=kLLQz4hrDVBThYhsBrbslZwAfVghh2Xvg3qVMG0yYgI=;
	h=Subject:To:Cc:From:Date:From;
	b=zmRh0l+3MW3BZzrzlRjqqOZo8YXcZq62ZG7lewiQth4G1eJLMEUV4Qzqy2tBZw4Xd
	 dylEtn1KOHzPABmKBnKnxowSiWBTe3DME5ZgzDmmcoz89Z768lQBK9qoCkRWEmJubw
	 AQ66d14DF2EaNpXVyu62VPudY7rweDuhpq2/crok=
Subject: FAILED: patch "[PATCH] ext4: fix e4b bitmap inconsistency reports" failed to
 apply to 6.1-stable tree
To: sunyongjian1@huawei.com,jack@suse.cz,libaokun1@huawei.com,tytso@mit.edu,yi.zhang@huawei.com
Cc:
From:
Date: Tue, 24 Feb 2026 13:45:01 -0800
Message-ID: <2026022401-arose-calm-3e73@gregkh>
Precedence: bulk
X-Mailing-List: stable@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset=ANSI_X3.4-1968
Content-Transfer-Encoding: 8bit


The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to .

To reproduce the conflict and resubmit, you may use the following commands:

git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x bdc56a9c46b2a99c12313122b9352b619a2e719e
#
git commit -s
git send-email --to '' --in-reply-to '2026022401-arose-calm-3e73@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..

Possible dependencies:



thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

>From bdc56a9c46b2a99c12313122b9352b619a2e719e Mon Sep 17 00:00:00 2001
From: Yongjian Sun
Date: Tue, 6 Jan 2026 17:08:20 +0800
Subject: [PATCH] ext4: fix e4b bitmap inconsistency reports

A bitmap inconsistency issue was observed during stress tests under
mixed huge-page workloads. Ext4 reported multiple e4b bitmap check
failures like:

ext4_mb_complex_scan_group:2508: group 350, 8179 free clusters as
per group info. But got 8192 blocks

Analysis and experimentation confirmed that the issue is caused by a
race condition between page migration and bitmap modification. Although
this timing window is extremely narrow, it is still hit in practice:

folio_lock                        ext4_mb_load_buddy
__migrate_folio
  check ref count
  folio_mc_copy
                                  __filemap_get_folio
                                    folio_try_get(folio)
                                  ......
                                  mb_mark_used
                                  ext4_mb_unload_buddy
__folio_migrate_mapping
  folio_ref_freeze
folio_unlock

The root cause of this issue is that the fast path of load_buddy only
increments the folio's reference count, which is insufficient to prevent
concurrent folio migration. We observed that the folio migration process
acquires the folio lock. Therefore, we can determine whether to take the
fast path in load_buddy by checking the lock status. If the folio is
locked, we opt for the slow path (which acquires the lock) to close this
concurrency window.

Additionally, this change addresses the following issues:

When the DOUBLE_CHECK macro is enabled to inspect bitmap-related
issues, the following error may be triggered:

corruption in group 324 at byte 784(6272): f in copy != ff on
disk/prealloc

Analysis reveals that this is a false positive. There is a specific race
window where the bitmap and the group descriptor become momentarily
inconsistent, leading to this error report:

ext4_mb_load_buddy                   ext4_mb_load_buddy
  __filemap_get_folio(create|lock)
    folio_lock
  ext4_mb_init_cache
    folio_mark_uptodate
                                     __filemap_get_folio(no lock)
                                     ......
                                     mb_mark_used
                                       mb_mark_used_double
    mb_cmp_bitmaps
                                       mb_set_bits(e4b->bd_bitmap)
  folio_unlock

The original logic assumed that since mb_cmp_bitmaps is called when the
bitmap is newly loaded from disk, the folio lock would be sufficient to
prevent concurrent access. However, this overlooks a specific race
condition: if another process attempts to load buddy and finds the folio
is already in an uptodate state, it will immediately begin using it without
holding folio lock.
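
[Editorial illustration, not part of the original commit.] The decision
described above can be modelled in plain userspace C. This is only a rough
sketch under stated assumptions: struct folio_state and the functions
needs_slow_path_old(), needs_slow_path_new() and demo_checks() are
hypothetical names invented here. They stand in for IS_ERR(),
folio_test_uptodate() and folio_test_locked() and do not call any kernel
API. The sketch only shows which folio states the old and new conditions
send to the locked slow path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the folio state inspected on the fast path. */
struct folio_state {
	bool lookup_failed;	/* models IS_ERR(folio) */
	bool uptodate;		/* models folio_test_uptodate(folio) */
	bool locked;		/* models folio_test_locked(folio) */
};

/* Condition before the patch: only missing or uninitialized folios retry. */
static bool needs_slow_path_old(const struct folio_state *s)
{
	return s->lookup_failed || !s->uptodate;
}

/* Condition after the patch: a locked folio also falls back to the slow path. */
static bool needs_slow_path_new(const struct folio_state *s)
{
	return s->lookup_failed || !s->uptodate || s->locked;
}

/* Walks through the case from the commit message: uptodate but locked. */
static void demo_checks(void)
{
	struct folio_state migrating = {
		.lookup_failed = false,
		.uptodate = true,
		.locked = true,		/* e.g. held by a concurrent migration */
	};

	/* The old check would have used the folio without taking the lock. */
	assert(!needs_slow_path_old(&migrating));
	/* The new check retries with the lock held instead. */
	assert(needs_slow_path_new(&migrating));
}
```

In this model an uptodate, unlocked folio still takes the fast path under
both conditions, so the common case is unchanged. Only the locked case is
diverted to the slow path.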

Signed-off-by: Yongjian Sun
Reviewed-by: Zhang Yi
Reviewed-by: Baokun Li
Reviewed-by: Jan Kara
Link: https://patch.msgid.link/20260106090820.836242-1-sunyongjian@huaweicloud.com
Signed-off-by: Theodore Ts'o
Cc: stable@kernel.org

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 56d50fd3310b..de4cacb740b3 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -1706,16 +1706,17 @@ ext4_mb_load_buddy_gfp(struct super_block *sb, ext4_group_t group,
 
 	/* Avoid locking the folio in the fast path ... */
 	folio = __filemap_get_folio(inode->i_mapping, pnum, FGP_ACCESSED, 0);
-	if (IS_ERR(folio) || !folio_test_uptodate(folio)) {
+	if (IS_ERR(folio) || !folio_test_uptodate(folio) || folio_test_locked(folio)) {
+		/*
+		 * folio_test_locked is employed to detect ongoing folio
+		 * migrations, since concurrent migrations can lead to
+		 * bitmap inconsistency. And if we are not uptodate that
+		 * implies somebody just created the folio but is yet to
+		 * initialize it. We can drop the folio reference and
+		 * try to get the folio with lock in both cases to avoid
+		 * concurrency.
+		 */
 		if (!IS_ERR(folio))
-			/*
-			 * drop the folio reference and try
-			 * to get the folio with lock. If we
-			 * are not uptodate that implies
-			 * somebody just created the folio but
-			 * is yet to initialize it. So
-			 * wait for it to initialize.
-			 */
 			folio_put(folio);
 		folio = __filemap_get_folio(inode->i_mapping, pnum,
 				FGP_LOCK | FGP_ACCESSED | FGP_CREAT, gfp);
@@ -1764,7 +1765,7 @@ ext4_mb_load_buddy_gfp(struct super_block *sb, ext4_group_t group,
 
 	/* we need another folio for the buddy */
 	folio = __filemap_get_folio(inode->i_mapping, pnum, FGP_ACCESSED, 0);
-	if (IS_ERR(folio) || !folio_test_uptodate(folio)) {
+	if (IS_ERR(folio) || !folio_test_uptodate(folio) || folio_test_locked(folio)) {
 		if (!IS_ERR(folio))
 			folio_put(folio);
 		folio = __filemap_get_folio(inode->i_mapping, pnum,