The Linux Kernel Mailing List
 help / color / mirror / Atom feed
From: Kiryl Shutsemau <kas@kernel.org>
To: akpm@linux-foundation.org, rppt@kernel.org, peterx@redhat.com,
	 david@kernel.org
Cc: ljs@kernel.org, surenb@google.com, vbabka@kernel.org,
	 Liam.Howlett@oracle.com, ziy@nvidia.com, corbet@lwn.net,
	skhan@linuxfoundation.org,  seanjc@google.com,
	pbonzini@redhat.com, jthoughton@google.com, aarcange@redhat.com,
	 sj@kernel.org, usama.arif@linux.dev, linux-mm@kvack.org,
	 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kselftest@vger.kernel.org,  kvm@vger.kernel.org,
	kernel-team@meta.com
Subject: Addressing Sashiko AI review
Date: Fri, 3 Jul 2026 17:43:15 +0100	[thread overview]
Message-ID: <akfmUXjqaBQnxqYX@thinkstation> (raw)
In-Reply-To: <20260703133615.1039465-1-kirill@shutemov.name>

I looked through Sashiko review and queued changes for v9 (see incremental
diff below). The v9 branch is available at:

https://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git uffd/v9

- 05/15: uffd_disable_fault_around() did not cover RWP, so one
  missing-page fault on shmem pre-installed up to 15 cached neighbours
  as accessible ptes and PAGEMAP_SCAN reported them as
  PAGE_IS_ACCESSED. Fault-around is now disabled for RWP.

- 06/15: change_huge_pud() rejects MM_CP_UFFD_RWP_ALL next to the
  existing MM_CP_UFFD_WP_ALL check. Defensive only.

- 09/15: the PROT_NONE accessibility check lived in
  vma_can_userfault(), which unregister also calls, so RWP register ->
  mprotect(PROT_NONE) -> UFFDIO_UNREGISTER failed with EINVAL. The
  check is now register-time only.

- 13/15: userfaultfd_register() read ctx->features with a plain load,
  racing the WRITE_ONCE() in userfaultfd_set_mode(). Benign (the
  tested bit is not toggleable), now uses userfaultfd_features().

- 15/15: the documentation wrongly claimed UFFDIO_API returns the
  supported feature bitmask on EINVAL; the kernel returns the
  structure zeroed. Corrected.

The finding against 03/15 is a pre-existing bug that the rename only
made visible: copy_hugetlb_page_range() clears the uffd-wp bit of
hwpoison/migration entries with the present-pte accessor, corrupting
the swap offset. Fix posted separately:

  https://lore.kernel.org/all/20260703161833.57416-1-kirill@shutemov.name/

I believe the remaining findings are either pre-existing or false
positives. Ping me if you think something is worth discussing.

diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
index 69836e39aa0b..783d969f0e28 100644
--- a/Documentation/admin-guide/mm/userfaultfd.rst
+++ b/Documentation/admin-guide/mm/userfaultfd.rst
@@ -348,9 +348,9 @@ without ``CONFIG_USERFAULTFD_RWP``, and architectures whose ptes cannot
 carry the uffd bit at runtime (e.g. riscv without the ``SVRSW60T59B``
 extension). Requesting an unsupported feature in
 ``uffdio_api.features`` makes ``UFFDIO_API`` fail with ``EINVAL`` and
-leaves the userfaultfd context uninitialized; the bitmask returned in
-``uffdio_api.features`` then advertises the features the kernel does
-support. The recommended probe sequence is therefore to open a
+leaves the userfaultfd context uninitialized; the structure is returned
+zeroed, so the error path cannot be used to discover what the kernel
+supports. The recommended probe sequence is therefore to open a
 throwaway userfaultfd, call ``UFFDIO_API`` once with ``features = 0``,
 inspect the returned bitmask, close that fd, then open the real one
 and call ``UFFDIO_API`` again with only the supported features set.
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index ca718e8590c5..bfbd6a59909f 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -171,7 +171,8 @@ static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma,
 /*
  * Never enable huge pmd sharing on some uffd registered vmas:
  *
- * - VM_UFFD_WP VMAs, because write protect information is per pgtable entry.
+ * - VM_UFFD_WP and VM_UFFD_RWP VMAs, because the write protect / access
+ *   tracking information is per pgtable entry.
  *
  * - VM_UFFD_MINOR VMAs, because otherwise we would never get minor faults for
  *   VMAs which share huge pmds. (If you have two mappings to the same
@@ -182,21 +183,25 @@ static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma,
 static inline bool uffd_disable_huge_pmd_share(struct vm_area_struct *vma)
 {
 	return vma_test_any_mask(vma,
-		mk_vma_flags_from_masks(VMA_UFFD_WP, VMA_UFFD_MINOR,
-					VMA_UFFD_RWP));
+		mk_vma_flags_from_masks(VMA_UFFD_WP, VMA_UFFD_RWP,
+					VMA_UFFD_MINOR));
 }
 
 /*
- * Don't do fault around for either WP or MINOR registered uffd range.  For
+ * Don't do fault around for WP, RWP or MINOR registered uffd range.  For
  * MINOR registered range, fault around will be a total disaster and ptes can
  * be installed without notifications; for WP it should mostly be fine as long
  * as the fault around checks for pte_none() before the installation, however
- * to be super safe we just forbid it.
+ * to be super safe we just forbid it; for RWP, pre-faulted neighbours would
+ * be indistinguishable from accessed pages in PAGEMAP_SCAN (PAGE_IS_ACCESSED)
+ * and pollute the tracked working set, so each page must be populated by its
+ * own fault.
  */
 static inline bool uffd_disable_fault_around(struct vm_area_struct *vma)
 {
 	return vma_test_any_mask(vma,
-		mk_vma_flags_from_masks(VMA_UFFD_WP, VMA_UFFD_MINOR));
+		mk_vma_flags_from_masks(VMA_UFFD_WP, VMA_UFFD_RWP,
+					VMA_UFFD_MINOR));
 }
 
 static inline bool userfaultfd_missing(struct vm_area_struct *vma)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index c67261f04731..4150f1bdea0c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2766,10 +2766,10 @@ int change_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		return 1;
 
 	/*
-	 * Huge entries on userfault-wp only works with anonymous, while we
-	 * don't have anonymous PUDs yet.
+	 * Huge entries on userfault-wp or userfault-rwp only work with
+	 * anonymous, while we don't have anonymous PUDs yet.
 	 */
-	if (WARN_ON_ONCE(cp_flags & MM_CP_UFFD_WP_ALL))
+	if (WARN_ON_ONCE(cp_flags & (MM_CP_UFFD_WP_ALL | MM_CP_UFFD_RWP_ALL)))
 		return 1;
 
 	ptl = __pud_trans_huge_lock(pudp, vma);
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index a7659b0a1147..c96cfe0871dd 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -2221,15 +2221,6 @@ static bool vma_can_userfault(struct vm_area_struct *vma, vm_flags_t vm_flags,
 	    !vma_is_anonymous(vma))
 		return false;
 
-	/*
-	 * RWP uses protnone as an access-tracking marker. PROT_NONE VMAs
-	 * have vm_page_prot == PAGE_NONE, so RWP resolution can't make a
-	 * page accessible -- the next access would fault again. Reject up
-	 * front instead of letting FOLL_FORCE loop on protnone+uffd PTEs.
-	 */
-	if ((vm_flags & VM_UFFD_RWP) && !vma_is_accessible(vma))
-		return false;
-
 	return ops->can_userfault(vma, vm_flags);
 }
 
@@ -3758,7 +3749,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	if (uffdio_register.mode & UFFDIO_REGISTER_MODE_RWP) {
 		if (!pgtable_supports_uffd() || VM_UFFD_RWP == VM_NONE)
 			goto out;
-		if (!(ctx->features & UFFD_FEATURE_RWP))
+		if (!(userfaultfd_features(ctx) & UFFD_FEATURE_RWP))
 			goto out;
 		vm_flags |= VM_UFFD_RWP;
 	}
@@ -3825,6 +3816,17 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 		if (!vma_can_userfault(cur, vm_flags, wp_async))
 			goto out_unlock;
 
+		/*
+		 * RWP uses protnone as an access-tracking marker. PROT_NONE
+		 * VMAs have vm_page_prot == PAGE_NONE, so RWP resolution
+		 * cannot make a page accessible again. Reject at register
+		 * time only: a VMA that later becomes inaccessible via
+		 * mprotect() must still be unregisterable, so this is not
+		 * part of vma_can_userfault().
+		 */
+		if ((vm_flags & VM_UFFD_RWP) && !vma_is_accessible(cur))
+			goto out_unlock;
+
 		/*
 		 * UFFDIO_COPY will fill file holes even without
 		 * PROT_WRITE. This check enforces that if this is a
-- 
  Kiryl Shutsemau / Kirill A. Shutemov

      parent reply	other threads:[~2026-07-03 16:43 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-07-03 13:35 [PATCH v8 00/15] userfaultfd: working set tracking for VM guest memory Kiryl Shutsemau
2026-07-03 13:36 ` [PATCH v8 01/15] mm: decouple protnone helpers from CONFIG_NUMA_BALANCING Kiryl Shutsemau
2026-07-03 13:36 ` [PATCH v8 02/15] mm: rename uffd-wp PTE bit macros to uffd Kiryl Shutsemau
2026-07-03 13:36 ` [PATCH v8 03/15] mm: rename uffd-wp PTE accessors " Kiryl Shutsemau
2026-07-03 13:36 ` [PATCH v8 04/15] userfaultfd: test uffd VMA flags through the vma_flags_t API Kiryl Shutsemau
2026-07-03 13:36 ` [PATCH v8 05/15] mm: add VM_UFFD_RWP VMA flag Kiryl Shutsemau
2026-07-03 13:36 ` [PATCH v8 06/15] mm: add MM_CP_UFFD_RWP change_protection() flag Kiryl Shutsemau
2026-07-03 13:36 ` [PATCH v8 07/15] mm: preserve RWP marker across PTE rewrites Kiryl Shutsemau
2026-07-03 13:36 ` [PATCH v8 08/15] mm: handle VM_UFFD_RWP in khugepaged, rmap, and GUP Kiryl Shutsemau
2026-07-03 13:36 ` [PATCH v8 09/15] userfaultfd: add UFFDIO_REGISTER_MODE_RWP and UFFDIO_RWPROTECT plumbing Kiryl Shutsemau
2026-07-03 13:36 ` [PATCH v8 10/15] mm/userfaultfd: add RWP fault delivery and expose UFFDIO_REGISTER_MODE_RWP Kiryl Shutsemau
2026-07-03 13:36 ` [PATCH v8 11/15] mm/pagemap: add PAGE_IS_ACCESSED for RWP tracking Kiryl Shutsemau
2026-07-03 13:36 ` [PATCH v8 12/15] userfaultfd: add UFFD_FEATURE_RWP_ASYNC for async fault resolution Kiryl Shutsemau
2026-07-03 13:36 ` [PATCH v8 13/15] userfaultfd: add UFFDIO_SET_MODE for runtime sync/async toggle Kiryl Shutsemau
2026-07-03 13:36 ` [PATCH v8 14/15] selftests/mm: add userfaultfd RWP tests Kiryl Shutsemau
2026-07-03 13:36 ` [PATCH v8 15/15] Documentation/userfaultfd: document RWP working set tracking Kiryl Shutsemau
2026-07-03 16:43 ` Kiryl Shutsemau [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=akfmUXjqaBQnxqYX@thinkstation \
    --to=kas@kernel.org \
    --cc=Liam.Howlett@oracle.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=corbet@lwn.net \
    --cc=david@kernel.org \
    --cc=jthoughton@google.com \
    --cc=kernel-team@meta.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ljs@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=rppt@kernel.org \
    --cc=seanjc@google.com \
    --cc=sj@kernel.org \
    --cc=skhan@linuxfoundation.org \
    --cc=surenb@google.com \
    --cc=usama.arif@linux.dev \
    --cc=vbabka@kernel.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox