Re: [PATCH 6.1.y] btrfs: check folio mapping after unlock in relocate_one_folio()

From: Greg KH <greg@kroah.com>
To: Zhaoyang Li <lizy04@hust.edu.cn>
Cc: stable@vger.kernel.org, dzm91@hust.edu.cn,
	Boris Burkov <boris@bur.io>, Qu Wenruo <wqu@suse.com>,
	David Sterba <dsterba@suse.com>
Subject: Re: [PATCH 6.1.y] btrfs: check folio mapping after unlock in relocate_one_folio()
Date: Tue, 20 May 2025 12:53:26 +0200
Message-ID: <2025052013-june-visiting-ab7d@gregkh>
In-Reply-To: <20250513032523.377137-1-lizy04@hust.edu.cn>

On Tue, May 13, 2025 at 11:25:23AM +0800, Zhaoyang Li wrote:
> From: Boris Burkov <boris@bur.io>
> 
> [ Upstream commit 3e74859ee35edc33a022c3f3971df066ea0ca6b9 ]
> 
> When we call btrfs_read_folio() to bring a folio uptodate, we unlock the
> folio. The result of that is that a different thread can modify the
> mapping (like remove it with invalidate) before we call folio_lock().
> This results in an invalid page and we need to try again.
> 
> In particular, if we are relocating concurrently with aborting a
> transaction, this can result in a crash like the following:
> 
>   BUG: kernel NULL pointer dereference, address: 0000000000000000
>   PGD 0 P4D 0
>   Oops: 0000 [#1] SMP
>   CPU: 76 PID: 1411631 Comm: kworker/u322:5
>   Workqueue: events_unbound btrfs_reclaim_bgs_work
>   RIP: 0010:set_page_extent_mapped+0x20/0xb0
>   RSP: 0018:ffffc900516a7be8 EFLAGS: 00010246
>   RAX: ffffea009e851d08 RBX: ffffea009e0b1880 RCX: 0000000000000000
>   RDX: 0000000000000000 RSI: ffffc900516a7b90 RDI: ffffea009e0b1880
>   RBP: 0000000003573000 R08: 0000000000000001 R09: ffff88c07fd2f3f0
>   R10: 0000000000000000 R11: 0000194754b575be R12: 0000000003572000
>   R13: 0000000003572fff R14: 0000000000100cca R15: 0000000005582fff
>   FS:  0000000000000000(0000) GS:ffff88c07fd00000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 0000000000000000 CR3: 000000407d00f002 CR4: 00000000007706f0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   PKRU: 55555554
>   Call Trace:
>   <TASK>
>   ? __die+0x78/0xc0
>   ? page_fault_oops+0x2a8/0x3a0
>   ? __switch_to+0x133/0x530
>   ? wq_worker_running+0xa/0x40
>   ? exc_page_fault+0x63/0x130
>   ? asm_exc_page_fault+0x22/0x30
>   ? set_page_extent_mapped+0x20/0xb0
>   relocate_file_extent_cluster+0x1a7/0x940
>   relocate_data_extent+0xaf/0x120
>   relocate_block_group+0x20f/0x480
>   btrfs_relocate_block_group+0x152/0x320
>   btrfs_relocate_chunk+0x3d/0x120
>   btrfs_reclaim_bgs_work+0x2ae/0x4e0
>   process_scheduled_works+0x184/0x370
>   worker_thread+0xc6/0x3e0
>   ? blk_add_timer+0xb0/0xb0
>   kthread+0xae/0xe0
>   ? flush_tlb_kernel_range+0x90/0x90
>   ret_from_fork+0x2f/0x40
>   ? flush_tlb_kernel_range+0x90/0x90
>   ret_from_fork_asm+0x11/0x20
>   </TASK>
> 
> This occurs because cleanup_one_transaction() calls
> destroy_delalloc_inodes() which calls invalidate_inode_pages2() which
> takes the folio lock before setting mapping to NULL. We fail to check
> this, and subsequently call set_page_extent_mapped(), which assumes that
> mapping != NULL (in fact it asserts that in debug mode).
> 
> Note that the "fixes" patch here is not the one that introduced the
> race (the very first iteration of this code from 2009) but a more recent
> change that made this particular crash happen in practice.
> 
> Fixes: e7f1326cc24e ("btrfs: set page extent mapped after read_folio in relocate_one_page")
> CC: stable@vger.kernel.org # 6.1+
> Reviewed-by: Qu Wenruo <wqu@suse.com>
> Signed-off-by: Boris Burkov <boris@bur.io>
> Signed-off-by: David Sterba <dsterba@suse.com>
> Signed-off-by: Zhaoyang Li <lizy04@hust.edu.cn>

You forgot to backport to 6.6.y first :(
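
For readers following the race described in the quoted commit message, here is
a minimal userspace sketch of the "re-check the mapping after re-locking, and
retry if it changed" pattern. It is not the kernel code and not the patch
itself: every model_* name, the fixed four-slot cache, and the
invalidations_left counter that stands in for the concurrent invalidation are
assumptions made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_mapping { int id; };

struct model_folio {
	struct model_mapping *mapping;	/* NULL once invalidated */
	bool locked;
	bool uptodate;
};

struct model_cache {
	struct model_mapping mapping;
	struct model_folio slots[4];
	int next_slot;
	int invalidations_left;	/* times the "other thread" wins the race */
};

/* Hand out a fresh, locked folio that is attached to the cache's mapping. */
static struct model_folio *model_get_locked_folio(struct model_cache *c)
{
	struct model_folio *f = &c->slots[c->next_slot++];

	f->mapping = &c->mapping;
	f->locked = true;
	f->uptodate = false;
	return f;
}

/*
 * Model of the read step: the folio is unlocked while the read runs, so an
 * invalidation can detach it before the caller takes the lock again.
 */
static void model_read_and_relock(struct model_cache *c, struct model_folio *f)
{
	f->locked = false;
	if (c->invalidations_left > 0) {
		c->invalidations_left--;
		f->mapping = NULL;	/* simulated concurrent invalidate */
	}
	f->uptodate = true;
	f->locked = true;
}

/* Like the function in the oops, this step assumes mapping != NULL. */
static void model_set_extent_mapped(struct model_folio *f)
{
	assert(f->locked);
	assert(f->mapping != NULL);
}

/* Returns the number of attempts needed, or -1 if the model ran out of slots. */
static int model_relocate_one(struct model_cache *c)
{
	int attempts = 0;

	for (;;) {
		struct model_folio *f;

		if (c->next_slot >= 4)
			return -1;
		f = model_get_locked_folio(c);
		attempts++;
		model_read_and_relock(c, f);

		/* The check the commit message says was missing. */
		if (f->mapping != &c->mapping) {
			f->locked = false;
			continue;	/* folio was invalidated: try again */
		}

		model_set_extent_mapped(f);
		f->locked = false;
		return attempts;
	}
}
```

With invalidations_left set to 2, model_relocate_one() discards two detached
folios and succeeds on the third attempt. Without the mapping check, the first
detached folio would reach model_set_extent_mapped() and trip its assertion,
which is the model's stand-in for the NULL dereference in the trace.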

Thread overview: 6+ messages
2024-12-30  8:39 FAILED: patch "[PATCH] btrfs: check folio mapping after unlock in" failed to apply to 6.1-stable tree gregkh
2025-05-13  3:25 ` [PATCH 6.1.y] btrfs: check folio mapping after unlock in relocate_one_folio() Zhaoyang Li
2025-05-13 18:50   ` Sasha Levin
2025-05-20 10:53   ` Greg KH [this message]
2025-05-21  1:50 ` Zhaoyang Li
2025-05-22  2:03   ` Sasha Levin
