From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>,
Jianhui Zhou <jianhuizzzzz@gmail.com>,
Muchun Song <muchun.song@linux.dev>,
Oscar Salvador <osalvador@suse.de>,
Mike Rapoport <rppt@kernel.org>
Cc: jane.chu@oracle.com, Peter Xu <peterx@redhat.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Mike Kravetz <mike.kravetz@oracle.com>,
SeongJae Park <sj@kernel.org>, Hugh Dickins <hughd@google.com>,
Sidhartha Kumar <sidhartha.kumar@oracle.com>,
Jonas Zhou <jonaszhou@zhaoxin.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
stable@vger.kernel.org,
syzbot+f525fd79634858f478e7@syzkaller.appspotmail.com
Subject: Re: [PATCH v4] mm/userfaultfd: fix hugetlb fault mutex hash calculation
Date: Wed, 25 Mar 2026 09:49:09 +0100
Message-ID: <1075f7a0-232f-4268-94b3-573d11c4203f@kernel.org>
In-Reply-To: <20260324170311.dc5b54fe0765f2e680e3cc90@linux-foundation.org>
On 3/25/26 01:03, Andrew Morton wrote:
> On Wed, 11 Mar 2026 18:54:26 +0800 Jianhui Zhou <jianhuizzzzz@gmail.com> wrote:
>
>> On Tue, Mar 10, 2026 at 12:47:07PM -0700, jane.chu@oracle.com wrote:
>>> Just wondering whether making the shift explicit here instead of
>>> introducing another hugetlb helper might be sufficient?
>>>
>>> idx >>= huge_page_order(hstate_vma(vma));
>>
>> That would work for hugetlb VMAs since both (address - vm_start) and
>> vm_pgoff are guaranteed to be huge page aligned. However, David
>> suggested introducing hugetlb_linear_page_index() to provide a cleaner
>> API that mirrors linear_page_index(), so I kept this approach.
>>
>
> Thanks.
>
> Would anyone like to review this cc:stable patch for us?
I would hope the hugetlb+userfaultfd submaintainers could have a
detailed look! Moving them to "To:"
One reason this might not be getting more attention is that the new
revision was posted as a reply to an old revision, which is an
anti-pattern :)
>
>
> From: Jianhui Zhou <jianhuizzzzz@gmail.com>
> Subject: mm/userfaultfd: fix hugetlb fault mutex hash calculation
> Date: Tue, 10 Mar 2026 19:05:26 +0800
>
> In mfill_atomic_hugetlb(), linear_page_index() is used to calculate the
> page index for hugetlb_fault_mutex_hash(). However, linear_page_index()
> returns the index in PAGE_SIZE units, while hugetlb_fault_mutex_hash()
> expects the index in huge page units. This mismatch means that different
> addresses within the same huge page can produce different hash values,
> leading to the use of different mutexes for the same huge page. This can
> cause races between faulting threads, which can corrupt the reservation
> map and trigger the BUG_ON in resv_map_release().
>
> Fix this by introducing hugetlb_linear_page_index(), which returns the
> page index in huge page granularity, and using it in place of
> linear_page_index().
>
> Link: https://lkml.kernel.org/r/20260310110526.335749-1-jianhuizzzzz@gmail.com
> Fixes: a08c7193e4f1 ("mm/filemap: remove hugetlb special casing in filemap.c")
> Signed-off-by: Jianhui Zhou <jianhuizzzzz@gmail.com>
> Reported-by: syzbot+f525fd79634858f478e7@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=f525fd79634858f478e7
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: David Hildenbrand <david@kernel.org>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: JonasZhou <JonasZhou@zhaoxin.com>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: Muchun Song <muchun.song@linux.dev>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: SeongJae Park <sj@kernel.org>
> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
> include/linux/hugetlb.h | 17 +++++++++++++++++
> mm/userfaultfd.c | 2 +-
> 2 files changed, 18 insertions(+), 1 deletion(-)
>
> --- a/include/linux/hugetlb.h~mm-userfaultfd-fix-hugetlb-fault-mutex-hash-calculation
> +++ a/include/linux/hugetlb.h
> @@ -796,6 +796,23 @@ static inline unsigned huge_page_shift(s
>  	return h->order + PAGE_SHIFT;
>  }
>  
> +/**
> + * hugetlb_linear_page_index() - linear_page_index() but in hugetlb
> + *				 page size granularity.
> + * @vma: the hugetlb VMA
> + * @address: the virtual address within the VMA
> + *
> + * Return: the page offset within the mapping in huge page units.
> + */
> +static inline pgoff_t hugetlb_linear_page_index(struct vm_area_struct *vma,
> +		unsigned long address)
> +{
> +	struct hstate *h = hstate_vma(vma);
> +
> +	return ((address - vma->vm_start) >> huge_page_shift(h)) +
> +		(vma->vm_pgoff >> huge_page_order(h));
> +}
> +
>  static inline bool order_is_gigantic(unsigned int order)
>  {
>  	return order > MAX_PAGE_ORDER;
> --- a/mm/userfaultfd.c~mm-userfaultfd-fix-hugetlb-fault-mutex-hash-calculation
> +++ a/mm/userfaultfd.c
> @@ -573,7 +573,7 @@ retry:
>  		 * in the case of shared pmds. fault mutex prevents
>  		 * races with other faulting threads.
>  		 */
> -		idx = linear_page_index(dst_vma, dst_addr);
> +		idx = hugetlb_linear_page_index(dst_vma, dst_addr);
>  		mapping = dst_vma->vm_file->f_mapping;
>  		hash = hugetlb_fault_mutex_hash(mapping, idx);
>  		mutex_lock(&hugetlb_fault_mutex_table[hash]);
> _
>
Let's take a look at other hugetlb_fault_mutex_hash() users:
* remove_inode_hugepages(): uses folio->index >> huge_page_order(h)
-> hugetlb granularity
* hugetlbfs_fallocate(): start/index is in hugetlb granularity
-> hugetlb granularity
* memfd_alloc_folio(): idx >>= huge_page_order(h);
-> hugetlb granularity
* hugetlb_wp(): uses vma_hugecache_offset()
-> hugetlb granularity
* hugetlb_handle_userfault(): uses vmf->pgoff, which hugetlb_fault()
sets to vma_hugecache_offset()
-> hugetlb granularity
* hugetlb_no_page(): similarly uses vmf->pgoff
-> hugetlb granularity
* hugetlb_fault(): similarly uses vmf->pgoff
-> hugetlb granularity
So this change looks good to me.
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
But it raises the question:
(1) Should we convert all of that to just operate on the ordinary index,
such that we don't even need hugetlb_linear_page_index()? That would be
an add-on patch.
(2) Alternatively, could we replace all users of vma_hugecache_offset()
by the much cleaner hugetlb_linear_page_index() ?
In general, I think we should look into making idx/vmf->pgoff
consistent with the remainder of MM, and converting all hugetlb code to
do that.
Any takers?
--
Cheers,
David