Re: [PATCH 6.1.y] btrfs: check folio mapping after unlock in relocate_one_folio()

public inbox for stable@vger.kernel.org

From: Greg KH <greg@kroah.com>
To: Zhaoyang Li <lizy04@hust.edu.cn>
Cc: stable@vger.kernel.org, dzm91@hust.edu.cn,
	Boris Burkov <boris@bur.io>, Qu Wenruo <wqu@suse.com>,
	David Sterba <dsterba@suse.com>
Subject: Re: [PATCH 6.1.y] btrfs: check folio mapping after unlock in relocate_one_folio()
Date: Tue, 20 May 2025 12:53:26 +0200
Message-ID: <2025052013-june-visiting-ab7d@gregkh>
In-Reply-To: <20250513032523.377137-1-lizy04@hust.edu.cn>

On Tue, May 13, 2025 at 11:25:23AM +0800, Zhaoyang Li wrote:
> From: Boris Burkov <boris@bur.io>
> 
> [ Upstream commit 3e74859ee35edc33a022c3f3971df066ea0ca6b9 ]
> 
> When we call btrfs_read_folio() to bring a folio uptodate, we unlock the
> folio. The result of that is that a different thread can modify the
> mapping (like remove it with invalidate) before we call folio_lock().
> This results in an invalid page and we need to try again.
> 
> In particular, if we are relocating concurrently with aborting a
> transaction, this can result in a crash like the following:
> 
>   BUG: kernel NULL pointer dereference, address: 0000000000000000
>   PGD 0 P4D 0
>   Oops: 0000 [#1] SMP
>   CPU: 76 PID: 1411631 Comm: kworker/u322:5
>   Workqueue: events_unbound btrfs_reclaim_bgs_work
>   RIP: 0010:set_page_extent_mapped+0x20/0xb0
>   RSP: 0018:ffffc900516a7be8 EFLAGS: 00010246
>   RAX: ffffea009e851d08 RBX: ffffea009e0b1880 RCX: 0000000000000000
>   RDX: 0000000000000000 RSI: ffffc900516a7b90 RDI: ffffea009e0b1880
>   RBP: 0000000003573000 R08: 0000000000000001 R09: ffff88c07fd2f3f0
>   R10: 0000000000000000 R11: 0000194754b575be R12: 0000000003572000
>   R13: 0000000003572fff R14: 0000000000100cca R15: 0000000005582fff
>   FS:  0000000000000000(0000) GS:ffff88c07fd00000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 0000000000000000 CR3: 000000407d00f002 CR4: 00000000007706f0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   PKRU: 55555554
>   Call Trace:
>   <TASK>
>   ? __die+0x78/0xc0
>   ? page_fault_oops+0x2a8/0x3a0
>   ? __switch_to+0x133/0x530
>   ? wq_worker_running+0xa/0x40
>   ? exc_page_fault+0x63/0x130
>   ? asm_exc_page_fault+0x22/0x30
>   ? set_page_extent_mapped+0x20/0xb0
>   relocate_file_extent_cluster+0x1a7/0x940
>   relocate_data_extent+0xaf/0x120
>   relocate_block_group+0x20f/0x480
>   btrfs_relocate_block_group+0x152/0x320
>   btrfs_relocate_chunk+0x3d/0x120
>   btrfs_reclaim_bgs_work+0x2ae/0x4e0
>   process_scheduled_works+0x184/0x370
>   worker_thread+0xc6/0x3e0
>   ? blk_add_timer+0xb0/0xb0
>   kthread+0xae/0xe0
>   ? flush_tlb_kernel_range+0x90/0x90
>   ret_from_fork+0x2f/0x40
>   ? flush_tlb_kernel_range+0x90/0x90
>   ret_from_fork_asm+0x11/0x20
>   </TASK>
> 
> This occurs because cleanup_one_transaction() calls
> destroy_delalloc_inodes() which calls invalidate_inode_pages2() which
> takes the folio_lock before setting mapping to NULL. We fail to check
> this, and subsequently call set_page_extent_mapped(), which assumes that
> mapping != NULL (in fact it asserts that in debug mode).
> 
> Note that the "fixes" patch here is not the one that introduced the
> race (the very first iteration of this code from 2009) but a more recent
> change that made this particular crash happen in practice.
> 
> Fixes: e7f1326cc24e ("btrfs: set page extent mapped after read_folio in relocate_one_page")
> CC: stable@vger.kernel.org # 6.1+
> Reviewed-by: Qu Wenruo <wqu@suse.com>
> Signed-off-by: Boris Burkov <boris@bur.io>
> Signed-off-by: David Sterba <dsterba@suse.com>
> Signed-off-by: Zhaoyang Li <lizy04@hust.edu.cn>

You forgot to backport to 6.6.y first :(
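
For anyone reading this in the archive, the race described in the quoted
commit message can be modeled in userspace. The sketch below is not the
kernel patch: every type and helper in it (struct folio as defined here,
the model_* functions, pending_invalidations) is a hypothetical stand-in,
and the "concurrent" invalidation is simulated inside the read helper
rather than done by a second thread. It only illustrates the pattern the
commit message describes: the read returns with the folio unlocked, so
after re-taking the lock the caller has to check that the folio still
belongs to the expected mapping and retry if it does not.

```c
/*
 * Userspace model of "re-lock, re-check mapping, retry".
 * All names are illustrative assumptions, not kernel APIs.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mapping { int id; };

struct folio {
	struct mapping *mapping;	/* NULL once the folio is invalidated */
	bool locked;
	bool uptodate;
};

/* How many upcoming reads should be hit by a simulated invalidation. */
static int pending_invalidations;

static void model_folio_lock(struct folio *f)
{
	assert(!f->locked);
	f->locked = true;
}

static void model_folio_unlock(struct folio *f)
{
	assert(f->locked);
	f->locked = false;
}

/* Models looking the folio up in the page cache; returns it locked. */
static void model_get_locked_folio(struct folio *f, struct mapping *m)
{
	f->mapping = m;
	f->uptodate = false;
	f->locked = false;
	model_folio_lock(f);
}

/* Models a read that brings the folio uptodate and drops the lock. */
static void model_read_folio(struct folio *f)
{
	f->uptodate = true;
	model_folio_unlock(f);
	if (pending_invalidations > 0) {
		/* Simulated other thread: detach the folio while unlocked. */
		pending_invalidations--;
		f->mapping = NULL;
		f->uptodate = false;
	}
}

/*
 * Returns the number of attempts needed. On return the folio is locked,
 * uptodate, and still attached to mapping m.
 */
static int model_relocate_one_folio(struct folio *f, struct mapping *m)
{
	int attempts = 0;

	for (;;) {
		attempts++;
		model_get_locked_folio(f, m);
		if (!f->uptodate) {
			model_read_folio(f);	/* returns unlocked */
			model_folio_lock(f);	/* re-take the lock */
			if (f->mapping != m) {
				/* Invalidated in the window: start over. */
				model_folio_unlock(f);
				continue;
			}
		}
		return attempts;
	}
}
```

Without the mapping check after the second lock, the model would carry on
with a folio whose mapping is NULL, which corresponds to the NULL
dereference in the quoted trace.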


Thread overview: 6+ messages
2024-12-30  8:39 FAILED: patch "[PATCH] btrfs: check folio mapping after unlock in" failed to apply to 6.1-stable tree gregkh
2025-05-13  3:25 ` [PATCH 6.1.y] btrfs: check folio mapping after unlock in relocate_one_folio() Zhaoyang Li
2025-05-13 18:50   ` Sasha Levin
2025-05-20 10:53   ` Greg KH [this message]
2025-05-21  1:50 ` Zhaoyang Li
2025-05-22  2:03   ` Sasha Levin
