From: sashiko-bot@kernel.org
To: "Kiryl Shutsemau" <kirill@shutemov.name>
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH v7 10/15] mm/userfaultfd: add RWP fault delivery and expose UFFDIO_REGISTER_MODE_RWP
Date: Mon, 29 Jun 2026 12:42:20 +0000
Message-ID: <20260629124221.119C31F000E9@smtp.kernel.org>
In-Reply-To: <20260629120749.566063-11-kirill@shutemov.name>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [Critical] RWP fault resolution creates invalid clean+writable PTEs.
- [High] Ambiguous PTE encoding between UFFD_RWP and NUMA hinting on UFFD_WP pages causes spurious RWP faults and WP security bypass.
- [High] FOLL_FORCE accesses (e.g., via /proc/pid/mem) silently bypass UFFD_RWP tracking.
- [High] UFFDIO_RWPROTECT skips unmapped PTEs, leaving them completely unprotected upon subsequent population.
--

commit 0937077addf68b8c5ba299257ab75ab0a674a8cd
Author: Kiryl Shutsemau (Meta) <kas@kernel.org>

    mm/userfaultfd: add RWP fault delivery and expose UFFDIO_REGISTER_MODE_RWP

    This patch wires up the fault delivery side of read-write protection (RWP)
    tracking for userfaultfd and exposes the UFFDIO_REGISTER_MODE_RWP interface.
    It updates page fault handlers to intercept PROT_NONE PTEs with the uffd bit
    set and route them to userfaultfd, falling back to NUMA hinting otherwise.

> diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
> index d65e1833f08d6..8d5d6ca60b671 100644
> --- a/include/linux/userfaultfd_k.h
> +++ b/include/linux/userfaultfd_k.h
[ ... ]
> @@ -236,6 +236,18 @@ static inline bool userfaultfd_huge_pmd_wp(struct vm_area_struct *vma,
>  	return userfaultfd_wp(vma) && pmd_uffd(pmd);
>  }
>  
> +static inline bool userfaultfd_pte_rwp(struct vm_area_struct *vma,
> +				       pte_t pte)
> +{
> +	return userfaultfd_rwp(vma) && pte_uffd(pte);
> +}

[Severity: High]
Could FOLL_FORCE accesses bypass UFFD_RWP tracking entirely?

When a VMA is made inaccessible (e.g., via mprotect(PROT_NONE)), GUP checks
accessibility in gup_can_follow_protnone(). Because FOLL_FORCE overrides VMA
accessibility restrictions, GUP successfully extracts the page.

When verifying write permissions, can_follow_write_pte() checks
userfaultfd_pte_wp() but appears to omit checking userfaultfd_pte_rwp():

mm/gup.c:can_follow_write_pte() {
    ...
    return !userfaultfd_pte_wp(vma, pte);
}

Does this allow local privileged processes to read or write RWP-protected
memory without triggering the userfaultfd handler?
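
The concern can be illustrated with a small userspace model. This is a
sketch, not kernel code: the structs, flag values, and every *_model name
are hypothetical stand-ins, and can_follow_write_wp_and_rwp() is only one
possible shape for a stricter check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values, chosen for this toy model only. */
enum { VMA_WP_MODEL = 1u << 0, VMA_RWP_MODEL = 1u << 1 };
enum { PTE_UFFD_MODEL = 1u << 0 };

struct vma_model { unsigned int flags; };
struct pte_model { unsigned int bits; };

/* Stand-in for userfaultfd_pte_wp(): WP-registered VMA and uffd bit set. */
static bool pte_wp_model(const struct vma_model *vma, struct pte_model pte)
{
	return (vma->flags & VMA_WP_MODEL) && (pte.bits & PTE_UFFD_MODEL);
}

/* Stand-in for userfaultfd_pte_rwp(): RWP-registered VMA and uffd bit set. */
static bool pte_rwp_model(const struct vma_model *vma, struct pte_model pte)
{
	return (vma->flags & VMA_RWP_MODEL) && (pte.bits & PTE_UFFD_MODEL);
}

/* Mirrors the quoted check: only the WP condition blocks a forced write. */
static bool can_follow_write_wp_only(const struct vma_model *vma,
				     struct pte_model pte)
{
	return !pte_wp_model(vma, pte);
}

/* Hypothetical stricter variant that also refuses RWP-armed entries. */
static bool can_follow_write_wp_and_rwp(const struct vma_model *vma,
					struct pte_model pte)
{
	return !pte_wp_model(vma, pte) && !pte_rwp_model(vma, pte);
}
```

In this model, an armed PTE on an RWP-only VMA passes the first check and is
refused by the second, which is the gap the question above is asking about.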

[ ... ]
> diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
> index 7b78aa3b5318c..d803e76d47adc 100644
> --- a/include/uapi/linux/userfaultfd.h
> +++ b/include/uapi/linux/userfaultfd.h
[ ... ]
> @@ -54,13 +56,15 @@
>  	 (__u64)1 << _UFFDIO_MOVE |		\
>  	 (__u64)1 << _UFFDIO_WRITEPROTECT |	\
>  	 (__u64)1 << _UFFDIO_CONTINUE |		\
> -	 (__u64)1 << _UFFDIO_POISON)
> +	 (__u64)1 << _UFFDIO_POISON |		\
> +	 (__u64)1 << _UFFDIO_RWPROTECT)

[Severity: High]
Does UFFDIO_RWPROTECT skip unmapped PTEs when applying protection?

During UFFDIO_RWPROTECT, change_pte_range() iterates over the VMA's page
tables. When it encounters pte_none entries, it skips them unless uffd_wp is
true:

mm/mprotect.c:change_pte_range() {
    ...
    } else if (pte_none(oldpte)) {
        /*
         * Nobody plays with any none ptes besides
         * userfaultfd when applying the protections.
         */
        if (likely(!uffd_wp))
            continue;
    ...
}

Because RWP has no marker for unmapped pages, the entries remain empty. When
a page fault subsequently occurs, do_anonymous_page() populates the page using
the default VMA permissions without PROT_NONE or the _PAGE_UFFD bit.

Could this leave previously unmapped or zapped pages within an RWP-protected
range completely unprotected?
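
A toy userspace model of the sequence described above, under stated
assumptions: the enum, the array standing in for a page table, and both
function names are hypothetical, and the mark_none parameter only loosely
plays the role that uffd_wp plays in the quoted loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Three states a slot can be in within this toy page table. */
enum pte_state_model { PTE_NONE_MODEL, PTE_MAPPED_MODEL, PTE_ARMED_MODEL };

/*
 * Walk the range and arm each slot. Empty slots are skipped unless the
 * caller asks for them to be marked too. Returns the number of armed slots.
 */
static size_t protect_range_model(enum pte_state_model *ptes, size_t n,
				  bool mark_none)
{
	size_t armed = 0;

	for (size_t i = 0; i < n; i++) {
		if (ptes[i] == PTE_NONE_MODEL && !mark_none)
			continue;	/* nothing is recorded for this slot */
		ptes[i] = PTE_ARMED_MODEL;
		armed++;
	}
	return armed;
}

/* A later fault on an empty slot maps it with default permissions. */
static void fault_in_model(enum pte_state_model *pte)
{
	if (*pte == PTE_NONE_MODEL)
		*pte = PTE_MAPPED_MODEL;
}
```

With mark_none false, a slot that was empty during the walk ends up mapped
but never armed after fault_in_model(); with mark_none true it stays armed.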

[ ... ]
> diff --git a/mm/memory.c b/mm/memory.c
> index 978ead28c6d15..16b7596625abd 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
[ ... ]
> @@ -6126,6 +6126,16 @@ static void numa_rebuild_large_mapping(struct vm_fault *vmf, struct vm_area_stru
>  		if (!pte_present(ptent) || !pte_protnone(ptent))
>  			continue;
>  
> +		/*
> +		 * RWP-armed PTEs are also protnone but carry _PAGE_UFFD as a
> +		 * marker. Leave them alone -- rewriting to vm_page_prot would
> +		 * stop the RWP trap. Gate on userfaultfd_rwp(vma) too:
> +		 * NUMA balancing preserves _PAGE_UFFD on UFFD_WP-marked PTEs
> +		 * when applying PROT_NONE, and those still need rebuilding.
> +		 */
> +		if (userfaultfd_rwp(vma) && pte_uffd(ptent))
> +			continue;

[Severity: High]
Does this logic correctly differentiate between RWP and NUMA-hinted pages?

When a VMA has both UFFD_WP and UFFD_RWP enabled, NUMA balancing modifies PTEs
to PROT_NONE while preserving the _PAGE_UFFD bit (used by UFFD_WP). Because
UFFD_RWP uses the same PROT_NONE + _PAGE_UFFD encoding, it appears a
NUMA-hinted WP page would be falsely identified as an RWP page here.

Could this skip rebuilding NUMA mappings for WP pages when the VMA also has
RWP enabled?
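
The suspected ambiguity can be shown with a bit-level userspace model. It
is a sketch only: the ENC_* flag values and the *_model helpers are
hypothetical, and they encode nothing beyond what the paragraphs above
state (RWP arming sets protnone plus the uffd bit; NUMA hinting adds
protnone and keeps the uffd bit).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values, chosen for this toy model only. */
enum { ENC_VMA_WP = 1u << 0, ENC_VMA_RWP = 1u << 1 };
enum { ENC_PROTNONE = 1u << 0, ENC_UFFD = 1u << 1 };

/* RWP arming as described: protnone plus the uffd bit. */
static unsigned int arm_rwp_model(unsigned int pte)
{
	return pte | ENC_PROTNONE | ENC_UFFD;
}

/* NUMA hinting as described: protnone is added, the uffd bit is kept. */
static unsigned int numa_hint_model(unsigned int pte)
{
	return pte | ENC_PROTNONE;
}

/* Stand-in for the check in the hunk: RWP VMA, protnone, uffd bit set. */
static bool looks_like_rwp_model(unsigned int vma_flags, unsigned int pte)
{
	return (vma_flags & ENC_VMA_RWP) &&
	       (pte & ENC_PROTNONE) && (pte & ENC_UFFD);
}
```

In this model a WP-marked PTE that has been NUMA-hinted is bit-for-bit
identical to an RWP-armed PTE, so on a VMA with both modes enabled the
check cannot tell them apart.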

> +
>  		if (pfn_folio(pte_pfn(ptent)) != folio)
>  			continue;
>  
[ ... ]
> @@ -6141,6 +6151,12 @@ static void numa_rebuild_large_mapping(struct vm_fault *vmf, struct vm_area_stru
>  	}
>  }
>  
> +static vm_fault_t do_uffd_rwp(struct vm_fault *vmf)
> +{
> +	pte_unmap(vmf->pte);
> +	return handle_userfault(vmf, VM_UFFD_RWP);
> +}

[Severity: Critical]
Is it possible for do_uffd_rwp() to create an invalid clean and writable PTE?

When an RWP fault is resolved on a private anonymous exclusive page,
can_change_pte_writable() can return true. The PTE is then upgraded to writable
using pte_mkwrite():

mm/memory.c:do_uffd_rwp() {
    ...
    if (!pte_write(pte) &&
        vma_wants_manual_pte_write_upgrade(vmf->vma) &&
        can_change_pte_writable(vmf->vma, vmf->address, pte))
            pte = pte_mkwrite(pte, vmf->vma);
    ...
}

It seems this fails to pair pte_mkwrite() with pte_mkdirty(). According to
the MM subsystem invariants, a clean and writable PTE is invalid. On
architectures that trap invalid PTE combinations, could this cause a kernel
crash or silent data loss when data is dropped instead of being written back?
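
For illustration, a minimal userspace model of the state this question
describes. The bit values and the *_model helpers are hypothetical and do
not correspond to any architecture's PTE layout; the sketch only shows what
"writable but not dirty" means and how pairing the two updates avoids it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, chosen for this toy model only. */
enum { PTE_WRITE_BIT_MODEL = 1u << 0, PTE_DIRTY_BIT_MODEL = 1u << 1 };

/* Upgrade to writable without touching the dirty bit. */
static unsigned int mkwrite_only_model(unsigned int pte)
{
	return pte | PTE_WRITE_BIT_MODEL;
}

/* Upgrade to writable and mark dirty in the same step. */
static unsigned int mkwrite_dirty_model(unsigned int pte)
{
	return pte | PTE_WRITE_BIT_MODEL | PTE_DIRTY_BIT_MODEL;
}

/* True for the combination flagged above: writable but not dirty. */
static bool clean_and_writable_model(unsigned int pte)
{
	return (pte & PTE_WRITE_BIT_MODEL) && !(pte & PTE_DIRTY_BIT_MODEL);
}
```

Starting from a clean read-only entry, mkwrite_only_model() produces the
flagged combination and mkwrite_dirty_model() does not.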

> +
>  static vm_fault_t do_numa_page(struct vm_fault *vmf)
[ ... ]
> @@ -6416,8 +6432,16 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
>  	if (!pte_present(vmf->orig_pte))
>  		return do_swap_page(vmf);
>  
> -	if (pte_protnone(vmf->orig_pte) && vma_is_accessible(vmf->vma))
> +	if (pte_protnone(vmf->orig_pte) && vma_is_accessible(vmf->vma)) {
> +		/*
> +		 * RWP-protected PTEs are protnone plus the uffd bit. On a
> +		 * VM_UFFD_RWP VMA, a protnone PTE without the uffd bit is
> +		 * NUMA hinting and must still fall through to do_numa_page().
> +		 */
> +		if (userfaultfd_pte_rwp(vmf->vma, vmf->orig_pte))
> +			return do_uffd_rwp(vmf);

[Severity: High]
Could this intercept NUMA hinting faults on UFFD_WP pages and route them
incorrectly to the RWP handler?

As noted above, NUMA hinting preserves the _PAGE_UFFD bit. If a VMA has both
UFFD_WP and UFFD_RWP enabled, userfaultfd_pte_rwp() will return true for a
NUMA-hinted UFFD_WP page because it relies on the same _PAGE_UFFD encoding.

This routes the fault to do_uffd_rwp() instead of the NUMA handler. Resolving
it as an RWP fault clears the UFFD bit and destroys the WP protection. Could
this silently allow unauthorized writes to write-protected memory?

>  		return do_numa_page(vmf);
> +	}
>  
>  	spin_lock(vmf->ptl);
>  	entry = vmf->orig_pte;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260629120749.566063-1-kirill@shutemov.name?part=10
