From: Sasha Levin <sashal@kernel.org>
To: Leah Rumancik <leah.rumancik@gmail.com>
Cc: stable@vger.kernel.org, Miaohe Lin <linmiaohe@huawei.com>,
Thorvald Natvig <thorvald@google.com>,
Jane Chu <jane.chu@oracle.com>,
Christian Brauner <brauner@kernel.org>,
Heiko Carstens <hca@linux.ibm.com>,
Kent Overstreet <kent.overstreet@linux.dev>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Mateusz Guzik <mjguzik@gmail.com>,
Matthew Wilcox <willy@infradead.org>,
Muchun Song <muchun.song@linux.dev>,
Oleg Nesterov <oleg@redhat.com>,
Peng Zhang <zhangpeng.00@bytedance.com>,
Tycho Andersen <tandersen@netflix.com>,
Andrew Morton <akpm@linux-foundation.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: [PATCH 6.6] fork: defer linking file vma until vma is fully initialized
Date: Tue, 2 Jul 2024 08:27:49 -0400 [thread overview]
Message-ID: <ZoPyReRlZ6ViNH-6@sashalap> (raw)
In-Reply-To: <20240702042948.2629267-1-leah.rumancik@gmail.com>
On Mon, Jul 01, 2024 at 09:29:48PM -0700, Leah Rumancik wrote:
>From: Miaohe Lin <linmiaohe@huawei.com>
>
>commit 35e351780fa9d8240dd6f7e4f245f9ea37e96c19 upstream.
>
>Thorvald reported a WARNING [1]. And the root cause is below race:
>
> CPU 1 CPU 2
> fork hugetlbfs_fallocate
> dup_mmap hugetlbfs_punch_hole
> i_mmap_lock_write(mapping);
> vma_interval_tree_insert_after -- Child vma is visible through i_mmap tree.
> i_mmap_unlock_write(mapping);
> hugetlb_dup_vma_private -- Clear vma_lock outside i_mmap_rwsem!
> i_mmap_lock_write(mapping);
> hugetlb_vmdelete_list
> vma_interval_tree_foreach
> hugetlb_vma_trylock_write -- Vma_lock is cleared.
> tmp->vm_ops->open -- Alloc new vma_lock outside i_mmap_rwsem!
> hugetlb_vma_unlock_write -- Vma_lock is assigned!!!
> i_mmap_unlock_write(mapping);
>
>hugetlb_dup_vma_private() and hugetlb_vm_op_open() are called outside
>i_mmap_rwsem lock while vma lock can be used in the same time. Fix this
>by deferring linking file vma until vma is fully initialized. Those vmas
>should be initialized first before they can be used.
>
>Backport notes:
>
>The first backport attempt (cec11fa2e) was reverted (dd782da4707). This is
>the new backport of the original fix (35e351780fa9).
>
>35e351780f ("fork: defer linking file vma until vma is fully initialized")
>fixed a hugetlb locking race by moving a bunch of intialization code to earlier
>in the function. The call to open() was included in the move but the call to
>copy_page_range was not, effectively inverting their relative ordering. This
>created an issue for the vfio code which assumes copy_page_range happens before
>the call to open() - vfio's open zaps the vma so that the fault handler is
>invoked later, but when we inverted the ordering, copy_page_range can set up
>mappings post-zap which would prevent the fault handler from being invoked
>later. This patch moves the call to copy_page_range to earlier than the call to
>open() to restore the original ordering of the two functions while keeping the
>fix for hugetlb intact.
>
>Commit aac6db75a9 made several changes to vfio_pci_core.c, including
>removing the vfio-pci custom open function. This resolves the issue on
>the main branch and so we only need to apply these changes when
>backporting to stable branches.
>
>35e351780f ("fork: defer linking file vma until vma is fully initialized")-> v6.9-rc5
>aac6db75a9 ("vfio/pci: Use unmap_mapping_range()") -> v6.10-rc4
Is there a strong reason not to take the commit above instead? That way:
a. We stay aligned with upstream, not needed a custom backport.
b. We avoid similar issues in the future.
--
Thanks,
Sasha
next prev parent reply other threads:[~2024-07-02 15:10 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-02 4:29 [PATCH 6.6] fork: defer linking file vma until vma is fully initialized Leah Rumancik
2024-07-02 7:35 ` Miaohe Lin
2024-07-02 12:27 ` Sasha Levin [this message]
2024-07-08 18:28 ` Leah Rumancik
2024-07-15 20:35 ` Axel Rasmussen
2024-07-15 22:21 ` Alex Williamson
2024-07-16 1:06 ` Axel Rasmussen
2024-07-16 16:08 ` Alex Williamson
2024-07-16 16:58 ` Axel Rasmussen
2024-07-16 17:25 ` Alex Williamson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZoPyReRlZ6ViNH-6@sashalap \
--to=sashal@kernel.org \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=brauner@kernel.org \
--cc=gregkh@linuxfoundation.org \
--cc=hca@linux.ibm.com \
--cc=jane.chu@oracle.com \
--cc=kent.overstreet@linux.dev \
--cc=leah.rumancik@gmail.com \
--cc=linmiaohe@huawei.com \
--cc=mjguzik@gmail.com \
--cc=muchun.song@linux.dev \
--cc=oleg@redhat.com \
--cc=stable@vger.kernel.org \
--cc=tandersen@netflix.com \
--cc=thorvald@google.com \
--cc=willy@infradead.org \
--cc=zhangpeng.00@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox