From: sashiko-bot@kernel.org
To: "Kiryl Shutsemau" <kirill@shutemov.name>
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH v7 10/15] mm/userfaultfd: add RWP fault delivery and expose UFFDIO_REGISTER_MODE_RWP
Date: Mon, 29 Jun 2026 12:42:20 +0000
Message-ID: <20260629124221.119C31F000E9@smtp.kernel.org>
In-Reply-To: <20260629120749.566063-11-kirill@shutemov.name>
Thank you for your contribution! Sashiko AI review found 4 potential issues to consider:
- [Critical] RWP fault resolution creates invalid clean+writable PTEs.
- [High] Ambiguous PTE encoding between UFFD_RWP and NUMA hinting on UFFD_WP pages causes spurious RWP faults and WP security bypass.
- [High] FOLL_FORCE accesses (e.g., via /proc/pid/mem) silently bypass UFFD_RWP tracking.
- [High] UFFDIO_RWPROTECT skips unmapped PTEs, leaving them completely unprotected upon subsequent population.
--
commit 0937077addf68b8c5ba299257ab75ab0a674a8cd
Author: Kiryl Shutsemau (Meta) <kas@kernel.org>
mm/userfaultfd: add RWP fault delivery and expose UFFDIO_REGISTER_MODE_RWP
This patch wires up the fault delivery side of read-write protection (RWP)
tracking for userfaultfd and exposes the UFFDIO_REGISTER_MODE_RWP interface.
It updates page fault handlers to intercept PROT_NONE PTEs with the uffd bit
set and route them to userfaultfd, falling back to NUMA hinting otherwise.
> diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
> index d65e1833f08d6..8d5d6ca60b671 100644
> --- a/include/linux/userfaultfd_k.h
> +++ b/include/linux/userfaultfd_k.h
[ ... ]
> @@ -236,6 +236,18 @@ static inline bool userfaultfd_huge_pmd_wp(struct vm_area_struct *vma,
>  	return userfaultfd_wp(vma) && pmd_uffd(pmd);
>  }
>
> +static inline bool userfaultfd_pte_rwp(struct vm_area_struct *vma,
> +				       pte_t pte)
> +{
> +	return userfaultfd_rwp(vma) && pte_uffd(pte);
> +}
[Severity: High]
Could FOLL_FORCE accesses bypass UFFD_RWP tracking entirely?
When a VMA is made inaccessible (e.g., via mprotect(PROT_NONE)), GUP checks
accessibility in gup_can_follow_protnone(). Because FOLL_FORCE overrides VMA
accessibility restrictions, GUP successfully extracts the page.
When verifying write permissions, can_follow_write_pte() checks
userfaultfd_pte_wp() but appears to omit checking userfaultfd_pte_rwp():
mm/gup.c:can_follow_write_pte() {
    ...
    return !userfaultfd_pte_wp(vma, pte);
}
Does this allow local privileged processes to read or write RWP-protected
memory without triggering the userfaultfd handler?
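To make the concern concrete, here is a minimal userspace model, not kernel
code. The gupm_* types and helper names are hypothetical stand-ins for pte_t,
the VMA flags, and the final check in can_follow_write_pte(); the other tests
the real function performs are omitted. The second helper shows one possible
shape of an added RWP check, offered only as an assumption for discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: stand-ins for pte_t and the VMA uffd flags. */
struct gupm_pte {
	bool protnone;	/* PROT_NONE encoding */
	bool uffd;	/* _PAGE_UFFD marker bit */
};

struct gupm_vma {
	bool uffd_wp;	/* VM_UFFD_WP set */
	bool uffd_rwp;	/* VM_UFFD_RWP set */
};

/* Same shape as userfaultfd_pte_wp(): VMA mode plus marker bit. */
static inline bool gupm_pte_wp(const struct gupm_vma *vma, struct gupm_pte pte)
{
	return vma->uffd_wp && pte.uffd;
}

/* Same shape as userfaultfd_pte_rwp() in the hunk above. */
static inline bool gupm_pte_rwp(const struct gupm_vma *vma, struct gupm_pte pte)
{
	return vma->uffd_rwp && pte.uffd;
}

/* Mirrors the quoted return statement: only the WP marker blocks the write. */
bool gupm_can_follow_write_current(const struct gupm_vma *vma,
				   struct gupm_pte pte)
{
	assert(vma != NULL);
	return !gupm_pte_wp(vma, pte);
}

/* Hypothetical variant: also refuse when the PTE is RWP-armed. */
bool gupm_can_follow_write_with_rwp(const struct gupm_vma *vma,
				    struct gupm_pte pte)
{
	assert(vma != NULL);
	return !gupm_pte_wp(vma, pte) && !gupm_pte_rwp(vma, pte);
}
```

With an RWP-only VMA and a PTE carrying the uffd marker, the first helper
returns true and the variant returns false. Whether the GUP changes earlier in
the series already close this path is not something the model establishes.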
[ ... ]
> diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
> index 7b78aa3b5318c..d803e76d47adc 100644
> --- a/include/uapi/linux/userfaultfd.h
> +++ b/include/uapi/linux/userfaultfd.h
[ ... ]
> @@ -54,13 +56,15 @@
>  	 (__u64)1 << _UFFDIO_MOVE | \
>  	 (__u64)1 << _UFFDIO_WRITEPROTECT | \
>  	 (__u64)1 << _UFFDIO_CONTINUE | \
> -	 (__u64)1 << _UFFDIO_POISON)
> +	 (__u64)1 << _UFFDIO_POISON | \
> +	 (__u64)1 << _UFFDIO_RWPROTECT)
[Severity: High]
Does UFFDIO_RWPROTECT skip unmapped PTEs when it arms a range?
During UFFDIO_RWPROTECT, change_pte_range() iterates over the VMA's page
tables. When it encounters pte_none entries, it skips them unless uffd_wp is
true:
mm/mprotect.c:change_pte_range() {
    ...
    } else if (pte_none(oldpte)) {
        /*
         * Nobody plays with any none ptes besides
         * userfaultfd when applying the protections.
         */
        if (likely(!uffd_wp))
            continue;
        ...
}
Because RWP has no marker for unmapped pages, the entries remain empty. When
a page fault subsequently occurs, do_anonymous_page() populates the page using
the default VMA permissions without PROT_NONE or the _PAGE_UFFD bit.
Could this leave previously unmapped or zapped pages within an RWP-protected
range completely unprotected?
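The sequence described above can be sketched as a small userspace model. This
is not kernel code: the rngm_* names, the four-state enum, and the idea that a
marker entry would come back armed on first touch are all assumptions made for
illustration, loosely following how the quoted branch is gated on uffd_wp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-entry states for the model. */
enum rngm_state {
	RNGM_NONE,	/* pte_none(): nothing mapped */
	RNGM_PRESENT,	/* mapped with default VMA permissions */
	RNGM_ARMED,	/* protnone plus the uffd marker */
	RNGM_MARKER,	/* non-present marker entry (assumed, not in the patch) */
};

/*
 * Model of the protection walk: present entries are armed, none entries
 * are skipped unless install_marker is set, the way uffd_wp gates the
 * quoted branch.
 */
void rngm_protect_range(enum rngm_state *ptes, size_t n, bool install_marker)
{
	assert(ptes != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (ptes[i] == RNGM_PRESENT)
			ptes[i] = RNGM_ARMED;
		else if (ptes[i] == RNGM_NONE && install_marker)
			ptes[i] = RNGM_MARKER;
		/* anything else is left untouched */
	}
}

/*
 * Model of a later first-touch fault: a none entry is populated with
 * default permissions; a marker entry is assumed to come back armed.
 */
enum rngm_state rngm_fault_in(enum rngm_state pte)
{
	if (pte == RNGM_NONE)
		return RNGM_PRESENT;
	if (pte == RNGM_MARKER)
		return RNGM_ARMED;
	return pte;
}
```

In the model, a none entry that was skipped during the walk ends up present
and unarmed after the first fault, while one that received a marker ends up
armed. How the real series would represent such a marker is left open here.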
[ ... ]
> diff --git a/mm/memory.c b/mm/memory.c
> index 978ead28c6d15..16b7596625abd 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
[ ... ]
> @@ -6126,6 +6126,16 @@ static void numa_rebuild_large_mapping(struct vm_fault *vmf, struct vm_area_stru
>  		if (!pte_present(ptent) || !pte_protnone(ptent))
>  			continue;
>
> +		/*
> +		 * RWP-armed PTEs are also protnone but carry _PAGE_UFFD as a
> +		 * marker. Leave them alone -- rewriting to vm_page_prot would
> +		 * stop the RWP trap. Gate on userfaultfd_rwp(vma) too:
> +		 * NUMA balancing preserves _PAGE_UFFD on UFFD_WP-marked PTEs
> +		 * when applying PROT_NONE, and those still need rebuilding.
> +		 */
> +		if (userfaultfd_rwp(vma) && pte_uffd(ptent))
> +			continue;
[Severity: High]
Does this logic correctly differentiate between RWP-armed and NUMA-hinted pages?
When a VMA has both UFFD_WP and UFFD_RWP enabled, NUMA balancing modifies PTEs
to PROT_NONE while preserving the _PAGE_UFFD bit (used by UFFD_WP). Because
UFFD_RWP uses the same PROT_NONE + _PAGE_UFFD encoding, it appears a
NUMA-hinted WP page would be falsely identified as an RWP page here.
Could this skip rebuilding NUMA mappings for WP pages when the VMA also has
RWP enabled?
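The collision can be shown with a two-bit userspace model. This is a sketch,
not kernel code: the encm_* types and helpers are hypothetical, and the model
assumes only what the patch comment states, namely that RWP arming is protnone
plus the uffd bit and that NUMA hinting preserves the uffd bit on WP-marked
PTEs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: the two bits that matter for this question. */
struct encm_pte {
	bool protnone;	/* PROT_NONE encoding */
	bool uffd;	/* _PAGE_UFFD marker bit */
};

struct encm_vma {
	bool uffd_wp;	/* VM_UFFD_WP set */
	bool uffd_rwp;	/* VM_UFFD_RWP set */
};

/* RWP arming as described in the patch comment: protnone plus the marker. */
struct encm_pte encm_arm_rwp(void)
{
	struct encm_pte pte = { .protnone = true, .uffd = true };
	return pte;
}

/* NUMA hinting: protnone is set, an existing marker bit is preserved. */
struct encm_pte encm_numa_hint(struct encm_pte pte)
{
	pte.protnone = true;
	return pte;
}

/* Same predicate shape as the skip test added to the rebuild loop. */
bool encm_looks_rwp(const struct encm_vma *vma, struct encm_pte pte)
{
	assert(vma != NULL);
	return vma->uffd_rwp && pte.uffd;
}
```

Starting from a WP-marked PTE (uffd set, protnone clear), encm_numa_hint()
yields the same two bits as encm_arm_rwp(). On a VMA with both modes enabled
the predicate accepts both; on a WP-only VMA it rejects the hinted PTE, which
matches the gating the patch comment describes.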
> +
>  		if (pfn_folio(pte_pfn(ptent)) != folio)
>  			continue;
>
[ ... ]
> @@ -6141,6 +6151,12 @@ static void numa_rebuild_large_mapping(struct vm_fault *vmf, struct vm_area_stru
>  	}
>  }
>
> +static vm_fault_t do_uffd_rwp(struct vm_fault *vmf)
> +{
> +	pte_unmap(vmf->pte);
> +	return handle_userfault(vmf, VM_UFFD_RWP);
> +}
[Severity: Critical]
Is it possible for do_uffd_rwp() to create an invalid clean and writable PTE?
When an RWP fault is resolved on a private anonymous exclusive page,
can_change_pte_writable() can return true. The PTE is then upgraded to writable
using pte_mkwrite():
mm/memory.c:do_uffd_rwp() {
    ...
    if (!pte_write(pte) &&
        vma_wants_manual_pte_write_upgrade(vmf->vma) &&
        can_change_pte_writable(vmf->vma, vmf->address, pte))
            pte = pte_mkwrite(pte, vmf->vma);
    ...
}
It seems this fails to pair pte_mkwrite() with pte_mkdirty(). According to
the MM subsystem invariants, a clean and writable PTE is invalid. On
architectures that trap invalid PTE combinations, could this cause a kernel
crash or silent data loss when data is dropped instead of being written back?
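Taking the invariant exactly as stated in this comment (writable implies
dirty), the difference between the two upgrade styles can be modeled in
userspace. This is a sketch under that assumption only; it does not establish
that the kernel enforces such an invariant on any given architecture, and the
invm_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: the two PTE bits the comment is about. */
struct invm_pte {
	bool write;
	bool dirty;
};

/* The invariant as the comment states it: writable implies dirty. */
bool invm_pte_valid(struct invm_pte pte)
{
	return !pte.write || pte.dirty;
}

/* Upgrade as in the excerpt: only the write bit is set. */
struct invm_pte invm_upgrade_write_only(struct invm_pte pte)
{
	pte.write = true;
	return pte;
}

/* Hypothetical paired upgrade: dirty is set together with write. */
struct invm_pte invm_upgrade_paired(struct invm_pte pte)
{
	pte.dirty = true;
	pte.write = true;
	assert(invm_pte_valid(pte));
	return pte;
}
```

Starting from a clean, read-only entry, the write-only upgrade produces a
state that invm_pte_valid() rejects, and the paired upgrade produces one it
accepts.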
> +
> static vm_fault_t do_numa_page(struct vm_fault *vmf)
[ ... ]
> @@ -6416,8 +6432,16 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
>  	if (!pte_present(vmf->orig_pte))
>  		return do_swap_page(vmf);
>
> -	if (pte_protnone(vmf->orig_pte) && vma_is_accessible(vmf->vma))
> +	if (pte_protnone(vmf->orig_pte) && vma_is_accessible(vmf->vma)) {
> +		/*
> +		 * RWP-protected PTEs are protnone plus the uffd bit. On a
> +		 * VM_UFFD_RWP VMA, a protnone PTE without the uffd bit is
> +		 * NUMA hinting and must still fall through to do_numa_page().
> +		 */
> +		if (userfaultfd_pte_rwp(vmf->vma, vmf->orig_pte))
> +			return do_uffd_rwp(vmf);
[Severity: High]
Could this intercept NUMA hinting faults on UFFD_WP pages and route them
incorrectly to the RWP handler?
As noted above, NUMA hinting preserves the _PAGE_UFFD bit. If a VMA has both
UFFD_WP and UFFD_RWP enabled, userfaultfd_pte_rwp() will return true for a
NUMA-hinted UFFD_WP page because it relies on the same _PAGE_UFFD encoding.
This routes the fault to do_uffd_rwp() instead of the NUMA handler, which
clears the UFFD bit and destroys the WP protection. Could this silently
allow unauthorized writes to write-protected memory?
>  		return do_numa_page(vmf);
> +	}
>
>  	spin_lock(vmf->ptl);
>  	entry = vmf->orig_pte;
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260629120749.566063-1-kirill@shutemov.name?part=10