Linux kernel -stable discussions
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: Leah Rumancik <leah.rumancik@gmail.com>
Cc: stable@vger.kernel.org, Miaohe Lin <linmiaohe@huawei.com>,
	Thorvald Natvig <thorvald@google.com>,
	Jane Chu <jane.chu@oracle.com>,
	Christian Brauner <brauner@kernel.org>,
	Heiko Carstens <hca@linux.ibm.com>,
	Kent Overstreet <kent.overstreet@linux.dev>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Mateusz Guzik <mjguzik@gmail.com>,
	Matthew Wilcox <willy@infradead.org>,
	Muchun Song <muchun.song@linux.dev>,
	Oleg Nesterov <oleg@redhat.com>,
	Peng Zhang <zhangpeng.00@bytedance.com>,
	Tycho Andersen <tandersen@netflix.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: [PATCH 6.6] fork: defer linking file vma until vma is fully initialized
Date: Tue, 2 Jul 2024 08:27:49 -0400	[thread overview]
Message-ID: <ZoPyReRlZ6ViNH-6@sashalap> (raw)
In-Reply-To: <20240702042948.2629267-1-leah.rumancik@gmail.com>

On Mon, Jul 01, 2024 at 09:29:48PM -0700, Leah Rumancik wrote:
>From: Miaohe Lin <linmiaohe@huawei.com>
>
>commit 35e351780fa9d8240dd6f7e4f245f9ea37e96c19 upstream.
>
>Thorvald reported a WARNING [1]. And the root cause is below race:
>
> CPU 1					CPU 2
> fork					hugetlbfs_fallocate
>  dup_mmap				 hugetlbfs_punch_hole
>   i_mmap_lock_write(mapping);
>   vma_interval_tree_insert_after -- Child vma is visible through i_mmap tree.
>   i_mmap_unlock_write(mapping);
>   hugetlb_dup_vma_private -- Clear vma_lock outside i_mmap_rwsem!
>					 i_mmap_lock_write(mapping);
>   					 hugetlb_vmdelete_list
>					  vma_interval_tree_foreach
>					   hugetlb_vma_trylock_write -- Vma_lock is cleared.
>   tmp->vm_ops->open -- Alloc new vma_lock outside i_mmap_rwsem!
>					   hugetlb_vma_unlock_write -- Vma_lock is assigned!!!
>					 i_mmap_unlock_write(mapping);
>
>hugetlb_dup_vma_private() and hugetlb_vm_op_open() are called outside
>i_mmap_rwsem lock while vma lock can be used in the same time.  Fix this
>by deferring linking file vma until vma is fully initialized.  Those vmas
>should be initialized first before they can be used.
>
>Backport notes:
>
>The first backport attempt (cec11fa2e) was reverted (dd782da4707). This is
>the new backport of the original fix (35e351780fa9).
>
>35e351780f ("fork: defer linking file vma until vma is fully initialized")
>fixed a hugetlb locking race by moving a bunch of intialization code to earlier
>in the function. The call to open() was included in the move but the call to
>copy_page_range was not, effectively inverting their relative ordering. This
>created an issue for the vfio code which assumes copy_page_range happens before
>the call to open() - vfio's open zaps the vma so that the fault handler is
>invoked later, but when we inverted the ordering, copy_page_range can set up
>mappings post-zap which would prevent the fault handler from being invoked
>later. This patch moves the call to copy_page_range to earlier than the call to
>open() to restore the original ordering of the two functions while keeping the
>fix for hugetlb intact.
>
>Commit aac6db75a9 made several changes to vfio_pci_core.c, including
>removing the vfio-pci custom open function. This resolves the issue on
>the main branch and so we only need to apply these changes when
>backporting to stable branches.
>
>35e351780f ("fork: defer linking file vma until vma is fully initialized")-> v6.9-rc5
>aac6db75a9 ("vfio/pci: Use unmap_mapping_range()") -> v6.10-rc4

Is there a strong reason not to take the commit above instead? That way:

a. We stay aligned with upstream, not needed a custom backport.
b. We avoid similar issues in the future.

-- 
Thanks,
Sasha

  parent reply	other threads:[~2024-07-02 15:10 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-02  4:29 [PATCH 6.6] fork: defer linking file vma until vma is fully initialized Leah Rumancik
2024-07-02  7:35 ` Miaohe Lin
2024-07-02 12:27 ` Sasha Levin [this message]
2024-07-08 18:28   ` Leah Rumancik
2024-07-15 20:35     ` Axel Rasmussen
2024-07-15 22:21       ` Alex Williamson
2024-07-16  1:06         ` Axel Rasmussen
2024-07-16 16:08           ` Alex Williamson
2024-07-16 16:58             ` Axel Rasmussen
2024-07-16 17:25               ` Alex Williamson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZoPyReRlZ6ViNH-6@sashalap \
    --to=sashal@kernel.org \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=brauner@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=hca@linux.ibm.com \
    --cc=jane.chu@oracle.com \
    --cc=kent.overstreet@linux.dev \
    --cc=leah.rumancik@gmail.com \
    --cc=linmiaohe@huawei.com \
    --cc=mjguzik@gmail.com \
    --cc=muchun.song@linux.dev \
    --cc=oleg@redhat.com \
    --cc=stable@vger.kernel.org \
    --cc=tandersen@netflix.com \
    --cc=thorvald@google.com \
    --cc=willy@infradead.org \
    --cc=zhangpeng.00@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox