From: "Kiryl Shutsemau (Meta)" <kas@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Lorenzo Stoakes <ljs@kernel.org>, Mike Rapoport <rppt@kernel.org>,
David Hildenbrand <david@kernel.org>,
"Kiryl Shutsemau (Meta)" <kas@kernel.org>,
stable@vger.kernel.org,
Sashiko AI review <sashiko-bot@kernel.org>,
Peter Xu <peterx@redhat.com>,
Mike Kravetz <mike.kravetz@oracle.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Jerome Glisse <jglisse@redhat.com>
Subject: [PATCH 5/6] userfaultfd: gate must_wait writability check on pte_present()
Date: Fri, 29 May 2026 18:23:29 +0100 [thread overview]
Message-ID: <20260529172331.356655-6-kas@kernel.org> (raw)
In-Reply-To: <20260529172331.356655-1-kas@kernel.org>
userfaultfd_must_wait() and userfaultfd_huge_must_wait() read the PTE
without taking the page table lock and then apply pte_write() /
huge_pte_write() to it. Those accessors decode bits from the present
encoding only; on a swap or migration entry they read the offset bits
that happen to share the same position and return an undefined result.
The intent of the check is "is this fault still WP-blocked?". A
non-marker swap entry means the page is in transit -- the userfault
context the original fault delivered against is no longer the same,
and the swap-in or migration completion path will re-deliver a fresh
fault if userspace still needs to handle it. Worst case under the
current code the garbage write bit says "wait", and the thread stays
asleep until a UFFDIO_WAKE that may never arrive.
Gate the writability check on pte_present() so the lockless re-check
only inspects present-PTE bits when the entry is actually present.
The non-present, non-marker case returns "don't wait" and lets the
fault path retry.
Fixes: 369cd2121be4 ("userfaultfd: hugetlbfs: userfaultfd_huge_must_wait for hugepmd ranges")
Fixes: 63b2d4174c4a ("userfaultfd: wp: add the writeprotect API to userfaultfd ioctl")
Cc: stable@vger.kernel.org
Reported-by: Sashiko AI review <sashiko-bot@kernel.org>
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
---
mm/userfaultfd.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 35b206cc9aa6..f6d2a1c67019 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -2535,6 +2535,15 @@ static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx,
/* UFFD PTE markers require userspace to resolve the fault. */
if (pte_is_uffd_marker(pte))
return true;
+ /*
+ * Concurrent migration may have replaced the present PTE with a
+ * non-marker swap entry between fault delivery and this lockless
+ * re-check. huge_pte_write() on a swap entry decodes random offset
+ * bits, so gate it on pte_present(). The migration completion path
+ * will re-deliver the fault if it still needs userspace.
+ */
+ if (!pte_present(pte))
+ return false;
/*
* If VMA has UFFD WP faults enabled and WP fault, wait for userspace to
* resolve the fault.
@@ -2621,6 +2630,17 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
/* UFFD PTE markers require userspace to resolve the fault. */
if (pte_is_uffd_marker(ptent))
goto out;
+ /*
+ * Concurrent swap-out / migration may have replaced the present PTE
+ * with a non-marker swap entry between fault delivery and this
+ * lockless re-check. pte_write() on a swap entry decodes random
+ * offset bits, so gate it on pte_present(). The page-in path will
+ * re-deliver the fault if it still needs userspace.
+ */
+ if (!pte_present(ptent)) {
+ ret = false;
+ goto out;
+ }
/*
* If VMA has UFFD WP faults enabled and WP fault, wait for userspace to
* resolve the fault.
--
2.54.0
next prev parent reply other threads:[~2026-05-29 17:24 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20260529172331.356655-1-kas@kernel.org>
2026-05-29 17:23 ` [PATCH 1/6] fs/proc/task_mmu: fix make_uffd_wp_huge_pte() prot-update race Kiryl Shutsemau (Meta)
2026-05-29 17:23 ` [PATCH 2/6] fs/proc/task_mmu: use huge_page_size() in pagemap_scan_hugetlb_entry() Kiryl Shutsemau (Meta)
2026-05-29 17:23 ` [PATCH 3/6] fs/proc/task_mmu: fix hugetlb self-deadlock in pagemap_scan_pte_hole() Kiryl Shutsemau (Meta)
2026-05-29 17:23 ` [PATCH 4/6] mm/huge_memory: preserve pmd_swp_uffd_wp on device-private PMD downgrade Kiryl Shutsemau (Meta)
2026-05-29 17:23 ` Kiryl Shutsemau (Meta) [this message]
2026-05-29 17:23 ` [PATCH 6/6] userfaultfd: build __VMA_UFFD_FLAGS from config-gated masks Kiryl Shutsemau (Meta)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260529172331.356655-6-kas@kernel.org \
--to=kas@kernel.org \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=david@kernel.org \
--cc=jglisse@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ljs@kernel.org \
--cc=mike.kravetz@oracle.com \
--cc=peterx@redhat.com \
--cc=rppt@kernel.org \
--cc=sashiko-bot@kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox