linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/4] mm: some optimizations for prot numa
@ 2025-10-20  6:18 Kefeng Wang
  2025-10-20  6:18 ` [PATCH v4 1/4] mm: mprotect: always skip dma pinned folio in prot_numa_skip() Kefeng Wang
                   ` (3 more replies)
  0 siblings, 4 replies; 32+ messages in thread
From: Kefeng Wang @ 2025-10-20  6:18 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, linux-mm
  Cc: Zi Yan, Baolin Wang, Ryan Roberts, Dev Jain, Barry Song,
	Lance Yang, Liam.Howlett, Kefeng Wang

v4:
- update changelog for all patches, per Lorenzo, David
- split the rename/refactor into a new patch, per Lorenzo
- collect ACK/RB

v3:
- use "&&" instead of "&" in patch2
- rename folio_skip_prot_numa() to folio_needs_prot_numa()
  and add kerneldoc
- add RB

v2:
- update comments for pte_protnone() in patch2
- rename prot_numa_skip() to folio_skip_prot_numa() and
  cleanup it a bit
- add RB/ACK


Kefeng Wang (4):
  mm: mprotect: always skip dma pinned folio in prot_numa_skip()
  mm: mprotect: avoid unnecessary struct page accessing if
    pte_protnone()
  mm: mprotect: convert to folio_needs_prot_numa()
  mm: huge_memory: use folio_needs_prot_numa() for pmd folio

 mm/huge_memory.c | 22 ++++++---------
 mm/internal.h    |  3 ++
 mm/mprotect.c    | 71 ++++++++++++++++++++++++------------------------
 3 files changed, 47 insertions(+), 49 deletions(-)

-- 
2.27.0



^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v4 1/4] mm: mprotect: always skip dma pinned folio in prot_numa_skip()
  2025-10-20  6:18 [PATCH v4 0/4] mm: some optimizations for prot numa Kefeng Wang
@ 2025-10-20  6:18 ` Kefeng Wang
  2025-10-21  3:06   ` Barry Song
  2025-10-20  6:18 ` [PATCH v4 2/4] mm: mprotect: avoid unnecessary struct page accessing if pte_protnone() Kefeng Wang
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 32+ messages in thread
From: Kefeng Wang @ 2025-10-20  6:18 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, linux-mm
  Cc: Zi Yan, Baolin Wang, Ryan Roberts, Dev Jain, Barry Song,
	Lance Yang, Liam.Howlett, Kefeng Wang, Sidhartha Kumar

If the folio (even not CoW folio) is dma pinned, it can't be migrated
due to the elevated reference count. So always skip a pinned folio
to avoid wasting cycles when folios are migrated.

Acked-by: Zi Yan <ziy@nvidia.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 mm/mprotect.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 988c366137d5..056986d9076a 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -136,9 +136,12 @@ static bool prot_numa_skip(struct vm_area_struct *vma, unsigned long addr,
 	if (folio_is_zone_device(folio) || folio_test_ksm(folio))
 		goto skip;
 
-	/* Also skip shared copy-on-write pages */
-	if (is_cow_mapping(vma->vm_flags) &&
-	    (folio_maybe_dma_pinned(folio) || folio_maybe_mapped_shared(folio)))
+	/* Also skip shared copy-on-write folios */
+	if (is_cow_mapping(vma->vm_flags) && folio_maybe_mapped_shared(folio))
+		goto skip;
+
+	/* Folios are pinned and can't be migrated */
+	if (folio_maybe_dma_pinned(folio))
 		goto skip;
 
 	/*
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v4 2/4] mm: mprotect: avoid unnecessary struct page accessing if pte_protnone()
  2025-10-20  6:18 [PATCH v4 0/4] mm: some optimizations for prot numa Kefeng Wang
  2025-10-20  6:18 ` [PATCH v4 1/4] mm: mprotect: always skip dma pinned folio in prot_numa_skip() Kefeng Wang
@ 2025-10-20  6:18 ` Kefeng Wang
  2025-10-20 12:53   ` Lorenzo Stoakes
  2025-10-20 13:08   ` David Hildenbrand
  2025-10-20  6:18 ` [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa() Kefeng Wang
  2025-10-20  6:18 ` [PATCH v4 4/4] mm: huge_memory: use folio_needs_prot_numa() for pmd folio Kefeng Wang
  3 siblings, 2 replies; 32+ messages in thread
From: Kefeng Wang @ 2025-10-20  6:18 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, linux-mm
  Cc: Zi Yan, Baolin Wang, Ryan Roberts, Dev Jain, Barry Song,
	Lance Yang, Liam.Howlett, Kefeng Wang, Sidhartha Kumar

If the pte_protnone() is true, we could avoid unnecessary struct page
accessing and reduce cache footprint when scanning page tables for prot
numa, there was a similar change before, see more commit a818f5363a0e
("autonuma: reduce cache footprint when scanning page tables").

Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 mm/mprotect.c | 29 +++++++++++++----------------
 1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 056986d9076a..6236d120c8e6 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -118,18 +118,13 @@ static int mprotect_folio_pte_batch(struct folio *folio, pte_t *ptep,
 	return folio_pte_batch_flags(folio, NULL, ptep, &pte, max_nr_ptes, flags);
 }
 
-static bool prot_numa_skip(struct vm_area_struct *vma, unsigned long addr,
-			   pte_t oldpte, pte_t *pte, int target_node,
-			   struct folio *folio)
+static bool prot_numa_skip(struct vm_area_struct *vma, int target_node,
+		struct folio *folio)
 {
 	bool ret = true;
 	bool toptier;
 	int nid;
 
-	/* Avoid TLB flush if possible */
-	if (pte_protnone(oldpte))
-		goto skip;
-
 	if (!folio)
 		goto skip;
 
@@ -307,23 +302,25 @@ static long change_pte_range(struct mmu_gather *tlb,
 			struct page *page;
 			pte_t ptent;
 
+			/* Already in the desired state. */
+			if (prot_numa && pte_protnone(oldpte))
+				continue;
+
 			page = vm_normal_page(vma, addr, oldpte);
 			if (page)
 				folio = page_folio(page);
+
 			/*
 			 * Avoid trapping faults against the zero or KSM
 			 * pages. See similar comment in change_huge_pmd.
 			 */
-			if (prot_numa) {
-				int ret = prot_numa_skip(vma, addr, oldpte, pte,
-							 target_node, folio);
-				if (ret) {
+			if (prot_numa &&
+			    prot_numa_skip(vma, target_node, folio)) {
 
-					/* determine batch to skip */
-					nr_ptes = mprotect_folio_pte_batch(folio,
-						  pte, oldpte, max_nr_ptes, /* flags = */ 0);
-					continue;
-				}
+				/* determine batch to skip */
+				nr_ptes = mprotect_folio_pte_batch(folio,
+					  pte, oldpte, max_nr_ptes, /* flags = */ 0);
+				continue;
 			}
 
 			nr_ptes = mprotect_folio_pte_batch(folio, pte, oldpte, max_nr_ptes, flags);
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-20  6:18 [PATCH v4 0/4] mm: some optimizations for prot numa Kefeng Wang
  2025-10-20  6:18 ` [PATCH v4 1/4] mm: mprotect: always skip dma pinned folio in prot_numa_skip() Kefeng Wang
  2025-10-20  6:18 ` [PATCH v4 2/4] mm: mprotect: avoid unnecessary struct page accessing if pte_protnone() Kefeng Wang
@ 2025-10-20  6:18 ` Kefeng Wang
  2025-10-20 13:09   ` David Hildenbrand
  2025-10-20 13:10   ` Lorenzo Stoakes
  2025-10-20  6:18 ` [PATCH v4 4/4] mm: huge_memory: use folio_needs_prot_numa() for pmd folio Kefeng Wang
  3 siblings, 2 replies; 32+ messages in thread
From: Kefeng Wang @ 2025-10-20  6:18 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, linux-mm
  Cc: Zi Yan, Baolin Wang, Ryan Roberts, Dev Jain, Barry Song,
	Lance Yang, Liam.Howlett, Kefeng Wang

The prot_numa_skip() naming is not good since it updates the folio
access time except checking whether to skip prot NUMA, so rename it
to folio_needs_prot_numa(), and cleanup it a bit, remove ret by
directly return value instead of goto style, also make it non-static
function so that it can be reused.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 mm/internal.h |  3 +++
 mm/mprotect.c | 43 ++++++++++++++++++++++---------------------
 2 files changed, 25 insertions(+), 21 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 6691d3ea55af..b521b5177d3c 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1403,6 +1403,9 @@ int numa_migrate_check(struct folio *folio, struct vm_fault *vmf,
 		      unsigned long addr, int *flags, bool writable,
 		      int *last_cpupid);
 
+bool folio_needs_prot_numa(struct folio *folio, struct vm_area_struct *vma,
+		int target_node);
+
 void free_zone_device_folio(struct folio *folio);
 int migrate_device_coherent_folio(struct folio *folio);
 
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 6236d120c8e6..1369ba6f6294 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -118,26 +118,30 @@ static int mprotect_folio_pte_batch(struct folio *folio, pte_t *ptep,
 	return folio_pte_batch_flags(folio, NULL, ptep, &pte, max_nr_ptes, flags);
 }
 
-static bool prot_numa_skip(struct vm_area_struct *vma, int target_node,
-		struct folio *folio)
+/**
+ * folio_needs_prot_numa() - Whether the folio needs prot numa
+ * @folio: The folio.
+ * @vma: The VMA mapping.
+ * @target_node: The numa node being accessed.
+ *
+ * Return: True if folio needs prot numa and the access time of
+ *	   folio is adjusted. False otherwise.
+ */
+bool folio_needs_prot_numa(struct folio *folio, struct vm_area_struct *vma,
+		int target_node)
 {
-	bool ret = true;
-	bool toptier;
 	int nid;
 
-	if (!folio)
-		goto skip;
-
-	if (folio_is_zone_device(folio) || folio_test_ksm(folio))
-		goto skip;
+	if (!folio || folio_is_zone_device(folio) || folio_test_ksm(folio))
+		return false;
 
 	/* Also skip shared copy-on-write folios */
 	if (is_cow_mapping(vma->vm_flags) && folio_maybe_mapped_shared(folio))
-		goto skip;
+		return false;
 
 	/* Folios are pinned and can't be migrated */
 	if (folio_maybe_dma_pinned(folio))
-		goto skip;
+		return false;
 
 	/*
 	 * While migration can move some dirty pages,
@@ -145,7 +149,7 @@ static bool prot_numa_skip(struct vm_area_struct *vma, int target_node,
 	 * context.
 	 */
 	if (folio_is_file_lru(folio) && folio_test_dirty(folio))
-		goto skip;
+		return false;
 
 	/*
 	 * Don't mess with PTEs if page is already on the node
@@ -153,23 +157,20 @@ static bool prot_numa_skip(struct vm_area_struct *vma, int target_node,
 	 */
 	nid = folio_nid(folio);
 	if (target_node == nid)
-		goto skip;
-
-	toptier = node_is_toptier(nid);
+		return false;
 
 	/*
 	 * Skip scanning top tier node if normal numa
 	 * balancing is disabled
 	 */
-	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_NORMAL) && toptier)
-		goto skip;
+	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_NORMAL) &&
+	    node_is_toptier(nid))
+		return false;
 
-	ret = false;
 	if (folio_use_access_time(folio))
 		folio_xchg_access_time(folio, jiffies_to_msecs(jiffies));
 
-skip:
-	return ret;
+	return true;
 }
 
 /* Set nr_ptes number of ptes, starting from idx */
@@ -315,7 +316,7 @@ static long change_pte_range(struct mmu_gather *tlb,
 			 * pages. See similar comment in change_huge_pmd.
 			 */
 			if (prot_numa &&
-			    prot_numa_skip(vma, target_node, folio)) {
+			    !folio_needs_prot_numa(folio, vma, target_node)) {
 
 				/* determine batch to skip */
 				nr_ptes = mprotect_folio_pte_batch(folio,
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v4 4/4] mm: huge_memory: use folio_needs_prot_numa() for pmd folio
  2025-10-20  6:18 [PATCH v4 0/4] mm: some optimizations for prot numa Kefeng Wang
                   ` (2 preceding siblings ...)
  2025-10-20  6:18 ` [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa() Kefeng Wang
@ 2025-10-20  6:18 ` Kefeng Wang
  2025-10-20 13:11   ` David Hildenbrand
  2025-10-20 13:15   ` Lorenzo Stoakes
  3 siblings, 2 replies; 32+ messages in thread
From: Kefeng Wang @ 2025-10-20  6:18 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, linux-mm
  Cc: Zi Yan, Baolin Wang, Ryan Roberts, Dev Jain, Barry Song,
	Lance Yang, Liam.Howlett, Kefeng Wang, Sidhartha Kumar

The folio_needs_prot_numa() checks whether to need prot numa, which
skips unsuitable folio, i.e. zone device, shared folios(ksm, CoW),
non-movable dma pinned, dirty file folio and already numa affinity's
folios, the policy should be applied to pmd folio too, which helps
to avoid unnecessary pmd change and folio migration attempts.

Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 mm/huge_memory.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2764613a9b3d..121c92f5c486 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2477,8 +2477,8 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 #endif
 
 	if (prot_numa) {
-		struct folio *folio;
-		bool toptier;
+		int target_node = NUMA_NO_NODE;
+
 		/*
 		 * Avoid trapping faults against the zero page. The read-only
 		 * data is likely to be read-cached on the local CPU and
@@ -2490,19 +2490,13 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		if (pmd_protnone(*pmd))
 			goto unlock;
 
-		folio = pmd_folio(*pmd);
-		toptier = node_is_toptier(folio_nid(folio));
-		/*
-		 * Skip scanning top tier node if normal numa
-		 * balancing is disabled
-		 */
-		if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_NORMAL) &&
-		    toptier)
-			goto unlock;
+		/* Get target node for single threaded private VMAs */
+		if (!(vma->vm_flags & VM_SHARED) &&
+		    atomic_read(&vma->vm_mm->mm_users) == 1)
+			target_node = numa_node_id();
 
-		if (folio_use_access_time(folio))
-			folio_xchg_access_time(folio,
-					       jiffies_to_msecs(jiffies));
+		if (!folio_needs_prot_numa(pmd_folio(*pmd), vma, target_node))
+			goto unlock;
 	}
 	/*
 	 * In case prot_numa, we are under mmap_read_lock(mm). It's critical
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 2/4] mm: mprotect: avoid unnecessary struct page accessing if pte_protnone()
  2025-10-20  6:18 ` [PATCH v4 2/4] mm: mprotect: avoid unnecessary struct page accessing if pte_protnone() Kefeng Wang
@ 2025-10-20 12:53   ` Lorenzo Stoakes
  2025-10-20 13:08   ` David Hildenbrand
  1 sibling, 0 replies; 32+ messages in thread
From: Lorenzo Stoakes @ 2025-10-20 12:53 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: Andrew Morton, David Hildenbrand, linux-mm, Zi Yan, Baolin Wang,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Liam.Howlett,
	Sidhartha Kumar

On Mon, Oct 20, 2025 at 02:18:43PM +0800, Kefeng Wang wrote:
> If the pte_protnone() is true, we could avoid unnecessary struct page
> accessing and reduce cache footprint when scanning page tables for prot
> numa, there was a similar change before, see more commit a818f5363a0e
> ("autonuma: reduce cache footprint when scanning page tables").
>
> Acked-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: Dev Jain <dev.jain@arm.com>
> Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

Thanks for updating the commit message, this LGTM so:

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

> ---
>  mm/mprotect.c | 29 +++++++++++++----------------
>  1 file changed, 13 insertions(+), 16 deletions(-)
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 056986d9076a..6236d120c8e6 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -118,18 +118,13 @@ static int mprotect_folio_pte_batch(struct folio *folio, pte_t *ptep,
>  	return folio_pte_batch_flags(folio, NULL, ptep, &pte, max_nr_ptes, flags);
>  }
>
> -static bool prot_numa_skip(struct vm_area_struct *vma, unsigned long addr,
> -			   pte_t oldpte, pte_t *pte, int target_node,
> -			   struct folio *folio)
> +static bool prot_numa_skip(struct vm_area_struct *vma, int target_node,
> +		struct folio *folio)
>  {
>  	bool ret = true;
>  	bool toptier;
>  	int nid;
>
> -	/* Avoid TLB flush if possible */
> -	if (pte_protnone(oldpte))
> -		goto skip;
> -
>  	if (!folio)
>  		goto skip;
>
> @@ -307,23 +302,25 @@ static long change_pte_range(struct mmu_gather *tlb,
>  			struct page *page;
>  			pte_t ptent;
>
> +			/* Already in the desired state. */
> +			if (prot_numa && pte_protnone(oldpte))
> +				continue;
> +
>  			page = vm_normal_page(vma, addr, oldpte);
>  			if (page)
>  				folio = page_folio(page);
> +
>  			/*
>  			 * Avoid trapping faults against the zero or KSM
>  			 * pages. See similar comment in change_huge_pmd.
>  			 */
> -			if (prot_numa) {
> -				int ret = prot_numa_skip(vma, addr, oldpte, pte,
> -							 target_node, folio);
> -				if (ret) {
> +			if (prot_numa &&
> +			    prot_numa_skip(vma, target_node, folio)) {
>
> -					/* determine batch to skip */
> -					nr_ptes = mprotect_folio_pte_batch(folio,
> -						  pte, oldpte, max_nr_ptes, /* flags = */ 0);
> -					continue;
> -				}
> +				/* determine batch to skip */
> +				nr_ptes = mprotect_folio_pte_batch(folio,
> +					  pte, oldpte, max_nr_ptes, /* flags = */ 0);
> +				continue;
>  			}
>
>  			nr_ptes = mprotect_folio_pte_batch(folio, pte, oldpte, max_nr_ptes, flags);
> --
> 2.27.0
>
>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 2/4] mm: mprotect: avoid unnecessary struct page accessing if pte_protnone()
  2025-10-20  6:18 ` [PATCH v4 2/4] mm: mprotect: avoid unnecessary struct page accessing if pte_protnone() Kefeng Wang
  2025-10-20 12:53   ` Lorenzo Stoakes
@ 2025-10-20 13:08   ` David Hildenbrand
  1 sibling, 0 replies; 32+ messages in thread
From: David Hildenbrand @ 2025-10-20 13:08 UTC (permalink / raw)
  To: Kefeng Wang, Andrew Morton, Lorenzo Stoakes, linux-mm
  Cc: Zi Yan, Baolin Wang, Ryan Roberts, Dev Jain, Barry Song,
	Lance Yang, Liam.Howlett, Sidhartha Kumar

On 20.10.25 08:18, Kefeng Wang wrote:
> If the pte_protnone() is true, we could avoid unnecessary struct page
> accessing and reduce cache footprint when scanning page tables for prot
> numa, there was a similar change before, see more commit a818f5363a0e
> ("autonuma: reduce cache footprint when scanning page tables").
> 
> Acked-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: Dev Jain <dev.jain@arm.com>
> Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers

David / dhildenb



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-20  6:18 ` [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa() Kefeng Wang
@ 2025-10-20 13:09   ` David Hildenbrand
  2025-10-20 13:10   ` Lorenzo Stoakes
  1 sibling, 0 replies; 32+ messages in thread
From: David Hildenbrand @ 2025-10-20 13:09 UTC (permalink / raw)
  To: Kefeng Wang, Andrew Morton, Lorenzo Stoakes, linux-mm
  Cc: Zi Yan, Baolin Wang, Ryan Roberts, Dev Jain, Barry Song,
	Lance Yang, Liam.Howlett

On 20.10.25 08:18, Kefeng Wang wrote:
> The prot_numa_skip() naming is not good since it updates the folio
> access time except checking whether to skip prot NUMA, so rename it
> to folio_needs_prot_numa(), and cleanup it a bit, remove ret by
> directly return value instead of goto style, also make it non-static
> function so that it can be reused.
> 
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers

David / dhildenb



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-20  6:18 ` [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa() Kefeng Wang
  2025-10-20 13:09   ` David Hildenbrand
@ 2025-10-20 13:10   ` Lorenzo Stoakes
  2025-10-20 15:14     ` Kefeng Wang
  1 sibling, 1 reply; 32+ messages in thread
From: Lorenzo Stoakes @ 2025-10-20 13:10 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: Andrew Morton, David Hildenbrand, linux-mm, Zi Yan, Baolin Wang,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Liam.Howlett

On Mon, Oct 20, 2025 at 02:18:44PM +0800, Kefeng Wang wrote:
> The prot_numa_skip() naming is not good since it updates the folio
> access time except checking whether to skip prot NUMA, so rename it
> to folio_needs_prot_numa(), and cleanup it a bit, remove ret by

Hmm not sure 'folio_needs_prot_numa()' is any better in terms of indicating
that you're updating the access time to be honest.

Also it seems to suggest that you're determining whether a mapping of the
folio should be made a NUMA hint by the folio alone rather than the reality
that the mapping is being considered for NUMA hinting and you're checking
to see if you actually have to do it.

prot_numa_hint_needed() seems better to me?

folio_xxx() stuff all seems to be derived from the folio alone.

> directly return value instead of goto style, also make it non-static
> function so that it can be reused.

It's worth saying you plan to reuse it in change_huge_pmd() in the next
commit, otherwise 'so it can be reused' seems like like premature not
optimisation but abstract perhaps?

>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

this functionally LGTM so With the naming, comment nits addressed:

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

> ---
>  mm/internal.h |  3 +++
>  mm/mprotect.c | 43 ++++++++++++++++++++++---------------------
>  2 files changed, 25 insertions(+), 21 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 6691d3ea55af..b521b5177d3c 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1403,6 +1403,9 @@ int numa_migrate_check(struct folio *folio, struct vm_fault *vmf,
>  		      unsigned long addr, int *flags, bool writable,
>  		      int *last_cpupid);
>
> +bool folio_needs_prot_numa(struct folio *folio, struct vm_area_struct *vma,
> +		int target_node);
> +
>  void free_zone_device_folio(struct folio *folio);
>  int migrate_device_coherent_folio(struct folio *folio);
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 6236d120c8e6..1369ba6f6294 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -118,26 +118,30 @@ static int mprotect_folio_pte_batch(struct folio *folio, pte_t *ptep,
>  	return folio_pte_batch_flags(folio, NULL, ptep, &pte, max_nr_ptes, flags);
>  }
>
> -static bool prot_numa_skip(struct vm_area_struct *vma, int target_node,
> -		struct folio *folio)
> +/**
> + * folio_needs_prot_numa() - Whether the folio needs prot numa
> + * @folio: The folio.
> + * @vma: The VMA mapping.
> + * @target_node: The numa node being accessed.
> + *
> + * Return: True if folio needs prot numa and the access time of
> + *	   folio is adjusted. False otherwise.

Yeah this comment isn't really helpful :)

You're basically putting the function name into a longer form.

You should mention NUMA hinting, that we are checking to see if the folio
actually indicates that we need to make the mapping one which causes a NUMA
hinting fault, as there are cases where it's simply unnecessary.

You should also explain that you're checking a folio whose mapping is
already being considered for being made NUMA hintable.

As right now it isn't clear that you're not somehow checking for prot numa
based on the folio alone :)

> + */
> +bool folio_needs_prot_numa(struct folio *folio, struct vm_area_struct *vma,
> +		int target_node)
>  {
> -	bool ret = true;
> -	bool toptier;
>  	int nid;
>
> -	if (!folio)
> -		goto skip;
> -
> -	if (folio_is_zone_device(folio) || folio_test_ksm(folio))
> -		goto skip;
> +	if (!folio || folio_is_zone_device(folio) || folio_test_ksm(folio))
> +		return false;
>
>  	/* Also skip shared copy-on-write folios */
>  	if (is_cow_mapping(vma->vm_flags) && folio_maybe_mapped_shared(folio))
> -		goto skip;
> +		return false;
>
>  	/* Folios are pinned and can't be migrated */
>  	if (folio_maybe_dma_pinned(folio))
> -		goto skip;
> +		return false;
>
>  	/*
>  	 * While migration can move some dirty pages,
> @@ -145,7 +149,7 @@ static bool prot_numa_skip(struct vm_area_struct *vma, int target_node,
>  	 * context.
>  	 */
>  	if (folio_is_file_lru(folio) && folio_test_dirty(folio))
> -		goto skip;
> +		return false;
>
>  	/*
>  	 * Don't mess with PTEs if page is already on the node
> @@ -153,23 +157,20 @@ static bool prot_numa_skip(struct vm_area_struct *vma, int target_node,
>  	 */
>  	nid = folio_nid(folio);
>  	if (target_node == nid)
> -		goto skip;
> -
> -	toptier = node_is_toptier(nid);
> +		return false;
>
>  	/*
>  	 * Skip scanning top tier node if normal numa
>  	 * balancing is disabled
>  	 */
> -	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_NORMAL) && toptier)
> -		goto skip;
> +	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_NORMAL) &&
> +	    node_is_toptier(nid))
> +		return false;
>
> -	ret = false;
>  	if (folio_use_access_time(folio))
>  		folio_xchg_access_time(folio, jiffies_to_msecs(jiffies));
>
> -skip:
> -	return ret;
> +	return true;
>  }
>
>  /* Set nr_ptes number of ptes, starting from idx */
> @@ -315,7 +316,7 @@ static long change_pte_range(struct mmu_gather *tlb,
>  			 * pages. See similar comment in change_huge_pmd.
>  			 */
>  			if (prot_numa &&
> -			    prot_numa_skip(vma, target_node, folio)) {
> +			    !folio_needs_prot_numa(folio, vma, target_node)) {
>
>  				/* determine batch to skip */
>  				nr_ptes = mprotect_folio_pte_batch(folio,
> --
> 2.27.0
>
>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/4] mm: huge_memory: use folio_needs_prot_numa() for pmd folio
  2025-10-20  6:18 ` [PATCH v4 4/4] mm: huge_memory: use folio_needs_prot_numa() for pmd folio Kefeng Wang
@ 2025-10-20 13:11   ` David Hildenbrand
  2025-10-20 13:15   ` Lorenzo Stoakes
  1 sibling, 0 replies; 32+ messages in thread
From: David Hildenbrand @ 2025-10-20 13:11 UTC (permalink / raw)
  To: Kefeng Wang, Andrew Morton, Lorenzo Stoakes, linux-mm
  Cc: Zi Yan, Baolin Wang, Ryan Roberts, Dev Jain, Barry Song,
	Lance Yang, Liam.Howlett, Sidhartha Kumar

On 20.10.25 08:18, Kefeng Wang wrote:
> The folio_needs_prot_numa() checks whether to need prot numa, which
> skips unsuitable folio, i.e. zone device, shared folios(ksm, CoW),

ksm only applies to small folios.

> non-movable dma pinned, dirty file folio and already numa affinity's

"...and folios that already have the expected node affinity ... " ?

> folios, the policy should be applied to pmd folio too, which helps
> to avoid unnecessary pmd change and folio migration attempts.
> 
> Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---


Acked-by: David Hildenbrand <david@redhat.com>


-- 
Cheers

David / dhildenb



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/4] mm: huge_memory: use folio_needs_prot_numa() for pmd folio
  2025-10-20  6:18 ` [PATCH v4 4/4] mm: huge_memory: use folio_needs_prot_numa() for pmd folio Kefeng Wang
  2025-10-20 13:11   ` David Hildenbrand
@ 2025-10-20 13:15   ` Lorenzo Stoakes
  2025-10-20 13:23     ` David Hildenbrand
  1 sibling, 1 reply; 32+ messages in thread
From: Lorenzo Stoakes @ 2025-10-20 13:15 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: Andrew Morton, David Hildenbrand, linux-mm, Zi Yan, Baolin Wang,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Liam.Howlett,
	Sidhartha Kumar

On Mon, Oct 20, 2025 at 02:18:45PM +0800, Kefeng Wang wrote:
> The folio_needs_prot_numa() checks whether to need prot numa, which
> skips unsuitable folio, i.e. zone device, shared folios(ksm, CoW),
> non-movable dma pinned, dirty file folio and already numa affinity's
> folios, the policy should be applied to pmd folio too, which helps
> to avoid unnecessary pmd change and folio migration attempts.
>
> Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

This is a nice change thanks, and splitting this up is useful.

With the duplication addressed below, this LGTM so:

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

> ---
>  mm/huge_memory.c | 22 ++++++++--------------
>  1 file changed, 8 insertions(+), 14 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 2764613a9b3d..121c92f5c486 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2477,8 +2477,8 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  #endif
>
>  	if (prot_numa) {
> -		struct folio *folio;
> -		bool toptier;
> +		int target_node = NUMA_NO_NODE;
> +
>  		/*
>  		 * Avoid trapping faults against the zero page. The read-only
>  		 * data is likely to be read-cached on the local CPU and
> @@ -2490,19 +2490,13 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  		if (pmd_protnone(*pmd))
>  			goto unlock;
>
> -		folio = pmd_folio(*pmd);
> -		toptier = node_is_toptier(folio_nid(folio));
> -		/*
> -		 * Skip scanning top tier node if normal numa
> -		 * balancing is disabled
> -		 */
> -		if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_NORMAL) &&
> -		    toptier)
> -			goto unlock;
> +		/* Get target node for single threaded private VMAs */
> +		if (!(vma->vm_flags & VM_SHARED) &&
> +		    atomic_read(&vma->vm_mm->mm_users) == 1)
> +			target_node = numa_node_id();

This is duplicated in both callers, and only used by folio_needs_prot_numa(),
why not abstract this to the function also?

>
> -		if (folio_use_access_time(folio))
> -			folio_xchg_access_time(folio,
> -					       jiffies_to_msecs(jiffies));
> +		if (!folio_needs_prot_numa(pmd_folio(*pmd), vma, target_node))
> +			goto unlock;
>  	}
>  	/*
>  	 * In case prot_numa, we are under mmap_read_lock(mm). It's critical
> --
> 2.27.0
>
>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/4] mm: huge_memory: use folio_needs_prot_numa() for pmd folio
  2025-10-20 13:15   ` Lorenzo Stoakes
@ 2025-10-20 13:23     ` David Hildenbrand
  2025-10-20 15:18       ` Kefeng Wang
  0 siblings, 1 reply; 32+ messages in thread
From: David Hildenbrand @ 2025-10-20 13:23 UTC (permalink / raw)
  To: Lorenzo Stoakes, Kefeng Wang
  Cc: Andrew Morton, linux-mm, Zi Yan, Baolin Wang, Ryan Roberts,
	Dev Jain, Barry Song, Lance Yang, Liam.Howlett, Sidhartha Kumar


>>   		/*
>>   		 * Avoid trapping faults against the zero page. The read-only
>>   		 * data is likely to be read-cached on the local CPU and
>> @@ -2490,19 +2490,13 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>>   		if (pmd_protnone(*pmd))
>>   			goto unlock;
>>
>> -		folio = pmd_folio(*pmd);
>> -		toptier = node_is_toptier(folio_nid(folio));
>> -		/*
>> -		 * Skip scanning top tier node if normal numa
>> -		 * balancing is disabled
>> -		 */
>> -		if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_NORMAL) &&
>> -		    toptier)
>> -			goto unlock;
>> +		/* Get target node for single threaded private VMAs */
>> +		if (!(vma->vm_flags & VM_SHARED) &&
>> +		    atomic_read(&vma->vm_mm->mm_users) == 1)
>> +			target_node = numa_node_id();
> 
> This is duplicated in both callers, and only used by folio_needs_prot_numa(),
> why not abstract this to the function also?

There was a discussion on that in v3 I think where I asked the same 
question.

-- 
Cheers

David / dhildenb



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-20 13:10   ` Lorenzo Stoakes
@ 2025-10-20 15:14     ` Kefeng Wang
  2025-10-20 17:34       ` David Hildenbrand
  0 siblings, 1 reply; 32+ messages in thread
From: Kefeng Wang @ 2025-10-20 15:14 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Andrew Morton, David Hildenbrand, linux-mm, Zi Yan, Baolin Wang,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Liam.Howlett



On 2025/10/20 21:10, Lorenzo Stoakes wrote:
> On Mon, Oct 20, 2025 at 02:18:44PM +0800, Kefeng Wang wrote:
>> The prot_numa_skip() naming is not good since it updates the folio
>> access time except checking whether to skip prot NUMA, so rename it
>> to folio_needs_prot_numa(), and cleanup it a bit, remove ret by
> 
> Hmm not sure 'folio_needs_prot_numa()' is any better in terms of indicating
> that you're updating the access time to be honest.
> 
> Also it seems to suggest that you're determining whether a mapping of the
> folio should be made a NUMA hint by the folio alone rather than the reality
> that the mapping is being considered for NUMA hinting and you're checking
> to see if you actually have to do it.
> 
> prot_numa_hint_needed() seems better to me?
> 
> folio_xxx() stuff all seems to be derived from the folio alone.

Naming is hard, David suggested it with folio_ prefix in v2,
the most check are derived from folio, but I could update it if
we like the new prot_numa_hint_needed().

> 
>> directly return value instead of goto style, also make it non-static
>> function so that it can be reused.
> 
> It's worth saying you plan to reuse it in change_huge_pmd() in the next
> commit, otherwise 'so it can be reused' seems like like premature not
> optimisation but abstract perhaps?

Will add change_huge_pmd() part in the message.

> 
>>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> 
> this functionally LGTM so With the naming, comment nits addressed:
> 
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> 
>> ---
>>   mm/internal.h |  3 +++
>>   mm/mprotect.c | 43 ++++++++++++++++++++++---------------------
>>   2 files changed, 25 insertions(+), 21 deletions(-)
>>
>> diff --git a/mm/internal.h b/mm/internal.h
>> index 6691d3ea55af..b521b5177d3c 100644
>> --- a/mm/internal.h
>> +++ b/mm/internal.h
>> @@ -1403,6 +1403,9 @@ int numa_migrate_check(struct folio *folio, struct vm_fault *vmf,
>>   		      unsigned long addr, int *flags, bool writable,
>>   		      int *last_cpupid);
>>
>> +bool folio_needs_prot_numa(struct folio *folio, struct vm_area_struct *vma,
>> +		int target_node);
>> +
>>   void free_zone_device_folio(struct folio *folio);
>>   int migrate_device_coherent_folio(struct folio *folio);
>>
>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index 6236d120c8e6..1369ba6f6294 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -118,26 +118,30 @@ static int mprotect_folio_pte_batch(struct folio *folio, pte_t *ptep,
>>   	return folio_pte_batch_flags(folio, NULL, ptep, &pte, max_nr_ptes, flags);
>>   }
>>
>> -static bool prot_numa_skip(struct vm_area_struct *vma, int target_node,
>> -		struct folio *folio)
>> +/**
>> + * folio_needs_prot_numa() - Whether the folio needs prot numa
>> + * @folio: The folio.
>> + * @vma: The VMA mapping.
>> + * @target_node: The numa node being accessed.
>> + *
>> + * Return: True if folio needs prot numa and the access time of
>> + *	   folio is adjusted. False otherwise.
> 
> Yeah this comment isn't really helpful :)
> 
> You're basically putting the function name into a longer form.
> 
> You should mention NUMA hinting, that we are checking to see if the folio
> actually indicates that we need to make the mapping one which causes a NUMA
> hinting fault, as there are cases where it's simply unnecessary.
> 
> You should also explain that you're checking a folio whose mapping is
> already being considered for being made NUMA hintable.
> 
> As right now it isn't clear that you're not somehow checking for prot numa
> based on the folio alone :)

Ignore the naming,borrowing from your suggestion,

/**
  * folio_needs_prot_numa() - check whether the folio needs prot numa
  * @folio: The folio whose mapping considered for being made NUMA hintable
  * @vma: The VMA mapping.
  * @target_node: The numa node being accessed.
  *
  * This function checks to see if the folio actually indicates that we
  * need to make the mapping one which causes a NUMA hinting fault,
  * as there are cases where it's simply unnecessary, and the folio's
  * access time is adjusted for memory tiering if prot numa needed. 

  *
  * Return: True if the mapping of the folio need to be changed, false 
otherwise.
  */

Please forgive my pool English.



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/4] mm: huge_memory: use folio_needs_prot_numa() for pmd folio
  2025-10-20 13:23     ` David Hildenbrand
@ 2025-10-20 15:18       ` Kefeng Wang
  2025-10-21 13:37         ` Kefeng Wang
  2025-10-21 15:29         ` Lorenzo Stoakes
  0 siblings, 2 replies; 32+ messages in thread
From: Kefeng Wang @ 2025-10-20 15:18 UTC (permalink / raw)
  To: David Hildenbrand, Lorenzo Stoakes
  Cc: Andrew Morton, linux-mm, Zi Yan, Baolin Wang, Ryan Roberts,
	Dev Jain, Barry Song, Lance Yang, Liam.Howlett, Sidhartha Kumar



On 2025/10/20 21:23, David Hildenbrand wrote:
> 
>>>           /*
>>>            * Avoid trapping faults against the zero page. The read-only
>>>            * data is likely to be read-cached on the local CPU and
>>> @@ -2490,19 +2490,13 @@ int change_huge_pmd(struct mmu_gather *tlb, 
>>> struct vm_area_struct *vma,
>>>           if (pmd_protnone(*pmd))
>>>               goto unlock;
>>>
>>> -        folio = pmd_folio(*pmd);
>>> -        toptier = node_is_toptier(folio_nid(folio));
>>> -        /*
>>> -         * Skip scanning top tier node if normal numa
>>> -         * balancing is disabled
>>> -         */
>>> -        if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_NORMAL) &&
>>> -            toptier)
>>> -            goto unlock;
>>> +        /* Get target node for single threaded private VMAs */
>>> +        if (!(vma->vm_flags & VM_SHARED) &&
>>> +            atomic_read(&vma->vm_mm->mm_users) == 1)
>>> +            target_node = numa_node_id();
>>
>> This is duplicated in both callers, and only used by 
>> folio_needs_prot_numa(),
>> why not abstract this to the function also?
> 
> There was a discussion on that in v3 I think where I asked the same 
> question.
> 

Yes, it is in v1, for pte, we could avoid 512 times check and the
numa_node_id(), so we leave it as is.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-20 15:14     ` Kefeng Wang
@ 2025-10-20 17:34       ` David Hildenbrand
  2025-10-20 17:49         ` Lorenzo Stoakes
  0 siblings, 1 reply; 32+ messages in thread
From: David Hildenbrand @ 2025-10-20 17:34 UTC (permalink / raw)
  To: Kefeng Wang, Lorenzo Stoakes
  Cc: Andrew Morton, linux-mm, Zi Yan, Baolin Wang, Ryan Roberts,
	Dev Jain, Barry Song, Lance Yang, Liam.Howlett

On 20.10.25 17:14, Kefeng Wang wrote:
> 
> 
> On 2025/10/20 21:10, Lorenzo Stoakes wrote:
>> On Mon, Oct 20, 2025 at 02:18:44PM +0800, Kefeng Wang wrote:
>>> The prot_numa_skip() naming is not good since it updates the folio
>>> access time except checking whether to skip prot NUMA, so rename it
>>> to folio_needs_prot_numa(), and cleanup it a bit, remove ret by
>>
>> Hmm not sure 'folio_needs_prot_numa()' is any better in terms of indicating
>> that you're updating the access time to be honest.
>>
>> Also it seems to suggest that you're determining whether a mapping of the
>> folio should be made a NUMA hint by the folio alone rather than the reality
>> that the mapping is being considered for NUMA hinting and you're checking
>> to see if you actually have to do it.

... because the folio doesn't need prot_none protection for NUMA hitning? :)

>>
>> prot_numa_hint_needed() seems better to me?
>>
>> folio_xxx() stuff all seems to be derived from the folio alone.
> 
> Naming is hard, David suggested it with folio_ prefix in v2,
> the most check are derived from folio, but I could update it if
> we like the new prot_numa_hint_needed().

I prefer what we have here, but I will not fight for it.

-- 
Cheers

David / dhildenb



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-20 17:34       ` David Hildenbrand
@ 2025-10-20 17:49         ` Lorenzo Stoakes
  2025-10-20 18:12           ` David Hildenbrand
  0 siblings, 1 reply; 32+ messages in thread
From: Lorenzo Stoakes @ 2025-10-20 17:49 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Kefeng Wang, Andrew Morton, linux-mm, Zi Yan, Baolin Wang,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Liam.Howlett

On Mon, Oct 20, 2025 at 07:34:51PM +0200, David Hildenbrand wrote:
> On 20.10.25 17:14, Kefeng Wang wrote:
> >
> >
> > On 2025/10/20 21:10, Lorenzo Stoakes wrote:
> > > On Mon, Oct 20, 2025 at 02:18:44PM +0800, Kefeng Wang wrote:
> > > > The prot_numa_skip() naming is not good since it updates the folio
> > > > access time except checking whether to skip prot NUMA, so rename it
> > > > to folio_needs_prot_numa(), and cleanup it a bit, remove ret by
> > >
> > > Hmm not sure 'folio_needs_prot_numa()' is any better in terms of indicating
> > > that you're updating the access time to be honest.
> > >
> > > Also it seems to suggest that you're determining whether a mapping of the
> > > folio should be made a NUMA hint by the folio alone rather than the reality
> > > that the mapping is being considered for NUMA hinting and you're checking
> > > to see if you actually have to do it.
>
> ... because the folio doesn't need prot_none protection for NUMA hitning? :)

folio_xxx() impliles to me that the folio independently has proeprty 'xxx'.

So it's like saying 'here's a folio, does it NEED prot numa?' right?

But that's not what we're doing, we're actually kinda doing the opposite, we're
saying 'ok we want NUMA hinting, but do we actually need it?'

The sematics are hard yes.

>
> > >
> > > prot_numa_hint_needed() seems better to me?
> > >
> > > folio_xxx() stuff all seems to be derived from the folio alone.
> >
> > Naming is hard, David suggested it with folio_ prefix in v2,
> > the most check are derived from folio, but I could update it if
> > we like the new prot_numa_hint_needed().
>
> I prefer what we have here, but I will not fight for it.

I also dislike the 'folio' in the name. We're not only basing it on folio, but
alsoo vma, targe_node.

It's hard to put all that across in the name, but I'm personally not a big fan
of the original for these reasons.

>
> --
> Cheers
>
> David / dhildenb
>
>

Cheers, Lorenzo


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-20 17:49         ` Lorenzo Stoakes
@ 2025-10-20 18:12           ` David Hildenbrand
  2025-10-20 18:56             ` Lorenzo Stoakes
  0 siblings, 1 reply; 32+ messages in thread
From: David Hildenbrand @ 2025-10-20 18:12 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Kefeng Wang, Andrew Morton, linux-mm, Zi Yan, Baolin Wang,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Liam.Howlett

On 20.10.25 19:49, Lorenzo Stoakes wrote:
> On Mon, Oct 20, 2025 at 07:34:51PM +0200, David Hildenbrand wrote:
>> On 20.10.25 17:14, Kefeng Wang wrote:
>>>
>>>
>>> On 2025/10/20 21:10, Lorenzo Stoakes wrote:
>>>> On Mon, Oct 20, 2025 at 02:18:44PM +0800, Kefeng Wang wrote:
>>>>> The prot_numa_skip() naming is not good since it updates the folio
>>>>> access time except checking whether to skip prot NUMA, so rename it
>>>>> to folio_needs_prot_numa(), and cleanup it a bit, remove ret by
>>>>
>>>> Hmm not sure 'folio_needs_prot_numa()' is any better in terms of indicating
>>>> that you're updating the access time to be honest.
>>>>
>>>> Also it seems to suggest that you're determining whether a mapping of the
>>>> folio should be made a NUMA hint by the folio alone rather than the reality
>>>> that the mapping is being considered for NUMA hinting and you're checking
>>>> to see if you actually have to do it.
>>
>> ... because the folio doesn't need prot_none protection for NUMA hitning? :)
> 
> folio_xxx() impliles to me that the folio independently has proeprty 'xxx'.
> 

I would agree if it would be a folio_has_* or folio_test_*.

To me the folio is really just the main entity we are querying 
information about. Not the VMA, not the target node, but the folio.

> So it's like saying 'here's a folio, does it NEED prot numa?' right?

I'd say: "here is a folio, does it need numa protection in this vma". So 
I still don't understand your point, unfortunately.

And keep disliking prot_numa_hint_needed() ;)

But again, I won't fight for it if you have strong opinions on it.

It would be better if we could have discussed that as part of v2 to 
avoid having Kefeng go back and forth. Maybe v3+v4 were sent out a bit 
too quickly

-- 
Cheers

David / dhildenb



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-20 18:12           ` David Hildenbrand
@ 2025-10-20 18:56             ` Lorenzo Stoakes
  2025-10-21  8:41               ` Kefeng Wang
  0 siblings, 1 reply; 32+ messages in thread
From: Lorenzo Stoakes @ 2025-10-20 18:56 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Kefeng Wang, Andrew Morton, linux-mm, Zi Yan, Baolin Wang,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Liam.Howlett

On Mon, Oct 20, 2025 at 08:12:08PM +0200, David Hildenbrand wrote:
> On 20.10.25 19:49, Lorenzo Stoakes wrote:
> > On Mon, Oct 20, 2025 at 07:34:51PM +0200, David Hildenbrand wrote:
> > > On 20.10.25 17:14, Kefeng Wang wrote:
> > > >
> > > >
> > > > On 2025/10/20 21:10, Lorenzo Stoakes wrote:
> > > > > On Mon, Oct 20, 2025 at 02:18:44PM +0800, Kefeng Wang wrote:
> > > > > > The prot_numa_skip() naming is not good since it updates the folio
> > > > > > access time except checking whether to skip prot NUMA, so rename it
> > > > > > to folio_needs_prot_numa(), and cleanup it a bit, remove ret by
> > > > >
> > > > > Hmm not sure 'folio_needs_prot_numa()' is any better in terms of indicating
> > > > > that you're updating the access time to be honest.
> > > > >
> > > > > Also it seems to suggest that you're determining whether a mapping of the
> > > > > folio should be made a NUMA hint by the folio alone rather than the reality
> > > > > that the mapping is being considered for NUMA hinting and you're checking
> > > > > to see if you actually have to do it.
> > >
> > > ... because the folio doesn't need prot_none protection for NUMA hitning? :)
> >
> > folio_xxx() impliles to me that the folio independently has proeprty 'xxx'.
> >
>
> I would agree if it would be a folio_has_* or folio_test_*.

folio_is_zone_device()
folio_is_zone_movable()
folio_needs_release()
folio_needs_cow_for_dma()
folio_memcg_kmem()
folio_memcg_charged()
folio_pgoff()
folio_pos()
folio_contains()
folio_mapcount()

etc. etc. etc.

All properties of the folio in particular, not folio_has_*() or folio_test_*().

I mean the pattern is established in the kernel. Including folio_needs_*()!

I honestly wonder if the original formulation was correct - check for exemptions.

So something like folio_exempt_from_prot_numa()?

Or this way around folio_suitable_prot_numa()?

I think this is an English thing. 'folio needs prot numa' reads as 'this folio
_needs_ prot numa' right?

Oh - folio_can_map_prot_numa()? Something like this?

>
> To me the folio is really just the main entity we are querying information
> about. Not the VMA, not the target node, but the folio.

OK, I guess folio_needs_cow_for_dma() does the same thing with MMF_HAS_PINNED.

>
> > So it's like saying 'here's a folio, does it NEED prot numa?' right?
>
> I'd say: "here is a folio, does it need numa protection in this vma". So I
> still don't understand your point, unfortunately.

Again I think it's an English thing. As I said above.

>
> And keep disliking prot_numa_hint_needed() ;)

Well you are very particular about naming :)

I similarly dislike folio_needs_prot_numa()...

>
> But again, I won't fight for it if you have strong opinions on it.

It's not the biggest issue in the world. Equally if you insist I won't lose
sleep over this...

>
> It would be better if we could have discussed that as part of v2 to avoid
> having Kefeng go back and forth. Maybe v3+v4 were sent out a bit too quickly

At v2 I had just got back from holiday (+ being sick) and had a >1,000 mail
backlog, sorry.

>
> --
> Cheers
>
> David / dhildenb
>
>

Cheers, Lorenzo


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 1/4] mm: mprotect: always skip dma pinned folio in prot_numa_skip()
  2025-10-20  6:18 ` [PATCH v4 1/4] mm: mprotect: always skip dma pinned folio in prot_numa_skip() Kefeng Wang
@ 2025-10-21  3:06   ` Barry Song
  0 siblings, 0 replies; 32+ messages in thread
From: Barry Song @ 2025-10-21  3:06 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, linux-mm,
	Zi Yan, Baolin Wang, Ryan Roberts, Dev Jain, Lance Yang,
	Liam.Howlett, Sidhartha Kumar

On Tue, Oct 21, 2025 at 2:06 PM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
> If the folio (even not CoW folio) is dma pinned, it can't be migrated
> due to the elevated reference count. So always skip a pinned folio
> to avoid wasting cycles when folios are migrated.
>
> Acked-by: Zi Yan <ziy@nvidia.com>
> Acked-by: David Hildenbrand <david@redhat.com>
> Reviewed-by: Dev Jain <dev.jain@arm.com>
> Reviewed-by: Lance Yang <lance.yang@linux.dev>
> Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

LGTM,
Reviewed-by: Barry Song <baohua@kernel.org>

Thanks
Barry


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-20 18:56             ` Lorenzo Stoakes
@ 2025-10-21  8:41               ` Kefeng Wang
  2025-10-21  9:13                 ` David Hildenbrand
  0 siblings, 1 reply; 32+ messages in thread
From: Kefeng Wang @ 2025-10-21  8:41 UTC (permalink / raw)
  To: Lorenzo Stoakes, David Hildenbrand
  Cc: Andrew Morton, linux-mm, Zi Yan, Baolin Wang, Ryan Roberts,
	Dev Jain, Barry Song, Lance Yang, Liam.Howlett



On 2025/10/21 2:56, Lorenzo Stoakes wrote:
> On Mon, Oct 20, 2025 at 08:12:08PM +0200, David Hildenbrand wrote:
>> On 20.10.25 19:49, Lorenzo Stoakes wrote:
>>> On Mon, Oct 20, 2025 at 07:34:51PM +0200, David Hildenbrand wrote:
>>>> On 20.10.25 17:14, Kefeng Wang wrote:
>>>>>
>>>>>
>>>>> On 2025/10/20 21:10, Lorenzo Stoakes wrote:
>>>>>> On Mon, Oct 20, 2025 at 02:18:44PM +0800, Kefeng Wang wrote:
>>>>>>> The prot_numa_skip() naming is not good since it updates the folio
>>>>>>> access time except checking whether to skip prot NUMA, so rename it
>>>>>>> to folio_needs_prot_numa(), and cleanup it a bit, remove ret by
>>>>>>
>>>>>> Hmm not sure 'folio_needs_prot_numa()' is any better in terms of indicating
>>>>>> that you're updating the access time to be honest.
>>>>>>
>>>>>> Also it seems to suggest that you're determining whether a mapping of the
>>>>>> folio should be made a NUMA hint by the folio alone rather than the reality
>>>>>> that the mapping is being considered for NUMA hinting and you're checking
>>>>>> to see if you actually have to do it.
>>>>
>>>> ... because the folio doesn't need prot_none protection for NUMA hitning? :)
>>>
>>> folio_xxx() impliles to me that the folio independently has proeprty 'xxx'.
>>>
>>
>> I would agree if it would be a folio_has_* or folio_test_*.
> 
> folio_is_zone_device()
> folio_is_zone_movable()
> folio_needs_release()
> folio_needs_cow_for_dma()
> folio_memcg_kmem()
> folio_memcg_charged()
> folio_pgoff()
> folio_pos()
> folio_contains()
> folio_mapcount()
> 
> etc. etc. etc.
> 
> All properties of the folio in particular, not folio_has_*() or folio_test_*().
> 
> I mean the pattern is established in the kernel. Including folio_needs_*()!
> 
> I honestly wonder if the original formulation was correct - check for exemptions.
> 
> So something like folio_exempt_from_prot_numa()?
> 
> Or this way around folio_suitable_prot_numa()?
> 
> I think this is an English thing. 'folio needs prot numa' reads as 'this folio
> _needs_ prot numa' right?
> 
> Oh - folio_can_map_prot_numa()? Something like this?
> 
>>
>> To me the folio is really just the main entity we are querying information
>> about. Not the VMA, not the target node, but the folio.
> 
> OK, I guess folio_needs_cow_for_dma() does the same thing with MMF_HAS_PINNED.
> 
>>
>>> So it's like saying 'here's a folio, does it NEED prot numa?' right?
>>
>> I'd say: "here is a folio, does it need numa protection in this vma". So I
>> still don't understand your point, unfortunately.
> 
> Again I think it's an English thing. As I said above.
> 
>>
>> And keep disliking prot_numa_hint_needed() ;)
> 
> Well you are very particular about naming :)
> 
> I similarly dislike folio_needs_prot_numa()...
> 

Maybe folio_needs_protnone_mapping()? if there are no better options,
we'll leave it as is.

>>
>> But again, I won't fight for it if you have strong opinions on it.
> 
> It's not the biggest issue in the world. Equally if you insist I won't lose
> sleep over this...
> 
>>
>> It would be better if we could have discussed that as part of v2 to avoid
>> having Kefeng go back and forth. Maybe v3+v4 were sent out a bit too quickly
> 
> At v2 I had just got back from holiday (+ being sick) and had a >1,000 mail
> backlog, sorry.
> 
>>
>> --
>> Cheers
>>
>> David / dhildenb
>>
>>
> 
> Cheers, Lorenzo
> 



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-21  8:41               ` Kefeng Wang
@ 2025-10-21  9:13                 ` David Hildenbrand
  2025-10-21  9:25                   ` Lorenzo Stoakes
  0 siblings, 1 reply; 32+ messages in thread
From: David Hildenbrand @ 2025-10-21  9:13 UTC (permalink / raw)
  To: Kefeng Wang, Lorenzo Stoakes
  Cc: Andrew Morton, linux-mm, Zi Yan, Baolin Wang, Ryan Roberts,
	Dev Jain, Barry Song, Lance Yang, Liam.Howlett

On 21.10.25 10:41, Kefeng Wang wrote:
> 
> 
> On 2025/10/21 2:56, Lorenzo Stoakes wrote:
>> On Mon, Oct 20, 2025 at 08:12:08PM +0200, David Hildenbrand wrote:
>>> On 20.10.25 19:49, Lorenzo Stoakes wrote:
>>>> On Mon, Oct 20, 2025 at 07:34:51PM +0200, David Hildenbrand wrote:
>>>>> On 20.10.25 17:14, Kefeng Wang wrote:
>>>>>>
>>>>>>
>>>>>> On 2025/10/20 21:10, Lorenzo Stoakes wrote:
>>>>>>> On Mon, Oct 20, 2025 at 02:18:44PM +0800, Kefeng Wang wrote:
>>>>>>>> The prot_numa_skip() naming is not good since it updates the folio
>>>>>>>> access time except checking whether to skip prot NUMA, so rename it
>>>>>>>> to folio_needs_prot_numa(), and cleanup it a bit, remove ret by
>>>>>>>
>>>>>>> Hmm not sure 'folio_needs_prot_numa()' is any better in terms of indicating
>>>>>>> that you're updating the access time to be honest.
>>>>>>>
>>>>>>> Also it seems to suggest that you're determining whether a mapping of the
>>>>>>> folio should be made a NUMA hint by the folio alone rather than the reality
>>>>>>> that the mapping is being considered for NUMA hinting and you're checking
>>>>>>> to see if you actually have to do it.
>>>>>
>>>>> ... because the folio doesn't need prot_none protection for NUMA hitning? :)
>>>>
>>>> folio_xxx() impliles to me that the folio independently has proeprty 'xxx'.
>>>>
>>>
>>> I would agree if it would be a folio_has_* or folio_test_*.
>>
>> folio_is_zone_device()
>> folio_is_zone_movable()
>> folio_needs_release()
>> folio_needs_cow_for_dma()
>> folio_memcg_kmem()
>> folio_memcg_charged()
>> folio_pgoff()
>> folio_pos()
>> folio_contains()
>> folio_mapcount()
>>
>> etc. etc. etc.
>>
>> All properties of the folio in particular, not folio_has_*() or folio_test_*().
>>
>> I mean the pattern is established in the kernel. Including folio_needs_*()!
>>
>> I honestly wonder if the original formulation was correct - check for exemptions.
>>
>> So something like folio_exempt_from_prot_numa()?
>>
>> Or this way around folio_suitable_prot_numa()?
>>
>> I think this is an English thing. 'folio needs prot numa' reads as 'this folio
>> _needs_ prot numa' right?
>>
>> Oh - folio_can_map_prot_numa()? Something like this?
>>
>>>
>>> To me the folio is really just the main entity we are querying information
>>> about. Not the VMA, not the target node, but the folio.
>>
>> OK, I guess folio_needs_cow_for_dma() does the same thing with MMF_HAS_PINNED.
>>
>>>
>>>> So it's like saying 'here's a folio, does it NEED prot numa?' right?
>>>
>>> I'd say: "here is a folio, does it need numa protection in this vma". So I
>>> still don't understand your point, unfortunately.
>>
>> Again I think it's an English thing. As I said above.
>>
>>>
>>> And keep disliking prot_numa_hint_needed() ;)
>>
>> Well you are very particular about naming :)
>>
>> I similarly dislike folio_needs_prot_numa()...
>>
> 
> Maybe folio_needs_protnone_mapping()?

I think prot_numa should be in the name one way or the other, because we 
also have ordinary PROT_NONE unrealted to NUMA faults.

-- 
Cheers

David / dhildenb



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-21  9:13                 ` David Hildenbrand
@ 2025-10-21  9:25                   ` Lorenzo Stoakes
  2025-10-21  9:45                     ` Lorenzo Stoakes
  0 siblings, 1 reply; 32+ messages in thread
From: Lorenzo Stoakes @ 2025-10-21  9:25 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Kefeng Wang, Andrew Morton, linux-mm, Zi Yan, Baolin Wang,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Liam.Howlett

On Tue, Oct 21, 2025 at 11:13:32AM +0200, David Hildenbrand wrote:
> On 21.10.25 10:41, Kefeng Wang wrote:
> > Maybe folio_needs_protnone_mapping()?
>
> I think prot_numa should be in the name one way or the other, because we
> also have ordinary PROT_NONE unrealted to NUMA faults.

folio_can_map_prot_numa() semes to me to be the best choice.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-21  9:25                   ` Lorenzo Stoakes
@ 2025-10-21  9:45                     ` Lorenzo Stoakes
  2025-10-21 12:54                       ` Kefeng Wang
  0 siblings, 1 reply; 32+ messages in thread
From: Lorenzo Stoakes @ 2025-10-21  9:45 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Kefeng Wang, Andrew Morton, linux-mm, Zi Yan, Baolin Wang,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Liam.Howlett

On Tue, Oct 21, 2025 at 10:25:08AM +0100, Lorenzo Stoakes wrote:
> On Tue, Oct 21, 2025 at 11:13:32AM +0200, David Hildenbrand wrote:
> > On 21.10.25 10:41, Kefeng Wang wrote:
> > > Maybe folio_needs_protnone_mapping()?
> >
> > I think prot_numa should be in the name one way or the other, because we
> > also have ordinary PROT_NONE unrealted to NUMA faults.
>
> folio_can_map_prot_numa() semes to me to be the best choice.
>

I don't want to hold up the series so if you guys feel strongly, leave it
as-is. But this reads better to me.

Apologies for this noise Kefeng this is hardly the most important thing in
the world :)

Cheers, Lorenzo


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-21  9:45                     ` Lorenzo Stoakes
@ 2025-10-21 12:54                       ` Kefeng Wang
  2025-10-21 14:56                         ` Lorenzo Stoakes
  0 siblings, 1 reply; 32+ messages in thread
From: Kefeng Wang @ 2025-10-21 12:54 UTC (permalink / raw)
  To: Lorenzo Stoakes, David Hildenbrand
  Cc: Andrew Morton, linux-mm, Zi Yan, Baolin Wang, Ryan Roberts,
	Dev Jain, Barry Song, Lance Yang, Liam.Howlett



On 2025/10/21 17:45, Lorenzo Stoakes wrote:
> On Tue, Oct 21, 2025 at 10:25:08AM +0100, Lorenzo Stoakes wrote:
>> On Tue, Oct 21, 2025 at 11:13:32AM +0200, David Hildenbrand wrote:
>>> On 21.10.25 10:41, Kefeng Wang wrote:
>>>> Maybe folio_needs_protnone_mapping()?
>>>
>>> I think prot_numa should be in the name one way or the other, because we
>>> also have ordinary PROT_NONE unrealted to NUMA faults.
>>
>> folio_can_map_prot_numa() semes to me to be the best choice.
>>
> 
> I don't want to hold up the series so if you guys feel strongly, leave it
> as-is. But this reads better to me.

Let's keep things as-is for now.

> 
> Apologies for this noise Kefeng this is hardly the most important thing in
> the world :)
> 

Never mind,thanks for all the reviews and suggestions :)

> Cheers, Lorenzo
> 



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/4] mm: huge_memory: use folio_needs_prot_numa() for pmd folio
  2025-10-20 15:18       ` Kefeng Wang
@ 2025-10-21 13:37         ` Kefeng Wang
  2025-10-21 15:35           ` Lorenzo Stoakes
  2025-10-21 15:29         ` Lorenzo Stoakes
  1 sibling, 1 reply; 32+ messages in thread
From: Kefeng Wang @ 2025-10-21 13:37 UTC (permalink / raw)
  To: David Hildenbrand, Lorenzo Stoakes
  Cc: Andrew Morton, linux-mm, Zi Yan, Baolin Wang, Ryan Roberts,
	Dev Jain, Barry Song, Lance Yang, Liam.Howlett, Sidhartha Kumar



On 2025/10/20 23:18, Kefeng Wang wrote:
> 
> 
> On 2025/10/20 21:23, David Hildenbrand wrote:
>>
>>>>           /*
>>>>            * Avoid trapping faults against the zero page. The read-only
>>>>            * data is likely to be read-cached on the local CPU and
>>>> @@ -2490,19 +2490,13 @@ int change_huge_pmd(struct mmu_gather *tlb, 
>>>> struct vm_area_struct *vma,
>>>>           if (pmd_protnone(*pmd))
>>>>               goto unlock;
>>>>
>>>> -        folio = pmd_folio(*pmd);
>>>> -        toptier = node_is_toptier(folio_nid(folio));
>>>> -        /*
>>>> -         * Skip scanning top tier node if normal numa
>>>> -         * balancing is disabled
>>>> -         */
>>>> -        if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_NORMAL) &&
>>>> -            toptier)
>>>> -            goto unlock;
>>>> +        /* Get target node for single threaded private VMAs */
>>>> +        if (!(vma->vm_flags & VM_SHARED) &&
>>>> +            atomic_read(&vma->vm_mm->mm_users) == 1)
>>>> +            target_node = numa_node_id();
>>>
>>> This is duplicated in both callers, and only used by 
>>> folio_needs_prot_numa(),
>>> why not abstract this to the function also?
>>
>> There was a discussion on that in v3 I think where I asked the same 
>> question.
>>
> 
> Yes, it is in v1, for pte, we could avoid 512 times check and the
> numa_node_id(), so we leave it as is.
> 

If no objection, I won't change this and will send a new version with

1) kerneldoc refresh for folio_need_prot_numa, per Lorenzo
2) update the changelog of this one, per David

Thanks


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-21 12:54                       ` Kefeng Wang
@ 2025-10-21 14:56                         ` Lorenzo Stoakes
  2025-10-21 15:01                           ` David Hildenbrand
  0 siblings, 1 reply; 32+ messages in thread
From: Lorenzo Stoakes @ 2025-10-21 14:56 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: David Hildenbrand, Andrew Morton, linux-mm, Zi Yan, Baolin Wang,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Liam.Howlett

On Tue, Oct 21, 2025 at 08:54:08PM +0800, Kefeng Wang wrote:
>
>
> On 2025/10/21 17:45, Lorenzo Stoakes wrote:
> > On Tue, Oct 21, 2025 at 10:25:08AM +0100, Lorenzo Stoakes wrote:
> > > On Tue, Oct 21, 2025 at 11:13:32AM +0200, David Hildenbrand wrote:
> > > > On 21.10.25 10:41, Kefeng Wang wrote:
> > > > > Maybe folio_needs_protnone_mapping()?
> > > >
> > > > I think prot_numa should be in the name one way or the other, because we
> > > > also have ordinary PROT_NONE unrealted to NUMA faults.
> > >
> > > folio_can_map_prot_numa() semes to me to be the best choice.
> > >
> >
> > I don't want to hold up the series so if you guys feel strongly, leave it
> > as-is. But this reads better to me.
>
> Let's keep things as-is for now.

It'd be nice to have an argument as to why (I did say if you guys had
strong opinions...), but never mind.

I had other feedback you need to address please, so you'll need a respin
anyway.

Thanks, Lorenzo


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-21 14:56                         ` Lorenzo Stoakes
@ 2025-10-21 15:01                           ` David Hildenbrand
  2025-10-21 16:36                             ` Lorenzo Stoakes
  0 siblings, 1 reply; 32+ messages in thread
From: David Hildenbrand @ 2025-10-21 15:01 UTC (permalink / raw)
  To: Lorenzo Stoakes, Kefeng Wang
  Cc: Andrew Morton, linux-mm, Zi Yan, Baolin Wang, Ryan Roberts,
	Dev Jain, Barry Song, Lance Yang, Liam.Howlett

On 21.10.25 16:56, Lorenzo Stoakes wrote:
> On Tue, Oct 21, 2025 at 08:54:08PM +0800, Kefeng Wang wrote:
>>
>>
>> On 2025/10/21 17:45, Lorenzo Stoakes wrote:
>>> On Tue, Oct 21, 2025 at 10:25:08AM +0100, Lorenzo Stoakes wrote:
>>>> On Tue, Oct 21, 2025 at 11:13:32AM +0200, David Hildenbrand wrote:
>>>>> On 21.10.25 10:41, Kefeng Wang wrote:
>>>>>> Maybe folio_needs_protnone_mapping()?
>>>>>
>>>>> I think prot_numa should be in the name one way or the other, because we
>>>>> also have ordinary PROT_NONE unrealted to NUMA faults.
>>>>
>>>> folio_can_map_prot_numa() semes to me to be the best choice.
>>>>
>>>
>>> I don't want to hold up the series so if you guys feel strongly, leave it
>>> as-is. But this reads better to me.
>>
>> Let's keep things as-is for now.
> 
> It'd be nice to have an argument as to why (I did say if you guys had
> strong opinions...), but never mind.
> 

folio_can_map_prot_numa() works for me, so we can just use that and call 
it a day :)

-- 
Cheers

David / dhildenb



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/4] mm: huge_memory: use folio_needs_prot_numa() for pmd folio
  2025-10-20 15:18       ` Kefeng Wang
  2025-10-21 13:37         ` Kefeng Wang
@ 2025-10-21 15:29         ` Lorenzo Stoakes
  2025-10-22  1:33           ` Kefeng Wang
  1 sibling, 1 reply; 32+ messages in thread
From: Lorenzo Stoakes @ 2025-10-21 15:29 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: David Hildenbrand, Andrew Morton, linux-mm, Zi Yan, Baolin Wang,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Liam.Howlett,
	Sidhartha Kumar

On Mon, Oct 20, 2025 at 11:18:27PM +0800, Kefeng Wang wrote:
>
>
> On 2025/10/20 21:23, David Hildenbrand wrote:
> >
> > > >           /*
> > > >            * Avoid trapping faults against the zero page. The read-only
> > > >            * data is likely to be read-cached on the local CPU and
> > > > @@ -2490,19 +2490,13 @@ int change_huge_pmd(struct mmu_gather
> > > > *tlb, struct vm_area_struct *vma,
> > > >           if (pmd_protnone(*pmd))
> > > >               goto unlock;
> > > >
> > > > -        folio = pmd_folio(*pmd);
> > > > -        toptier = node_is_toptier(folio_nid(folio));
> > > > -        /*
> > > > -         * Skip scanning top tier node if normal numa
> > > > -         * balancing is disabled
> > > > -         */
> > > > -        if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_NORMAL) &&
> > > > -            toptier)
> > > > -            goto unlock;
> > > > +        /* Get target node for single threaded private VMAs */
> > > > +        if (!(vma->vm_flags & VM_SHARED) &&
> > > > +            atomic_read(&vma->vm_mm->mm_users) == 1)
> > > > +            target_node = numa_node_id();
> > >
> > > This is duplicated in both callers, and only used by
> > > folio_needs_prot_numa(),
> > > why not abstract this to the function also?
> >
> > There was a discussion on that in v3 I think where I asked the same
> > question.
> >
>
> Yes, it is in v1, for pte, we could avoid 512 times check and the
> numa_node_id(), so we leave it as is.
>

OK so what you're saying is that in change_pte_range() you only need do the
check once (per PMD) rather than for each PTE entry.

It might be worth having this as a separate helper function as it sucks to
duplicate that.

I actually really don't like that we pass in a 'target node' but only when it's
single-threaded. That's a bit silly.

E.g.:

static inline bool vma_is_single_threaded_private(struct vm_area_struct *vma)
{
	if (vma->vm_flags & VM_SHARED)
		return false;

	return atomic_read(&vma->vm_mm->mm_users) == 1;
}

Then you can pass in a boolean and change:

	/*
	 * Don't mess with PTEs if page is already on the node
	 * a single-threaded process is running on.
	 */
	nid = folio_nid(folio);
	if (target_node == nid)
		return false;

To:

	/*
	 * Don't mess with PTEs if page is already on the node
	 * a single-threaded process is running on.
	 */
	if (is_private_single_threaded && folio_nid(folio) == numa_node_id())
		return false;

Which makes a lot more sense than passing in NUMA_NO_NODE for shared VMAs.

Now we're sharing this I really don't know why the function is still in
mprotect.c by the way? Shouldn't it be in mempolicy.c + have a stub
function for !CONFIG_NUMA_BALANCING?

Thanks


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/4] mm: huge_memory: use folio_needs_prot_numa() for pmd folio
  2025-10-21 13:37         ` Kefeng Wang
@ 2025-10-21 15:35           ` Lorenzo Stoakes
  0 siblings, 0 replies; 32+ messages in thread
From: Lorenzo Stoakes @ 2025-10-21 15:35 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: David Hildenbrand, Andrew Morton, linux-mm, Zi Yan, Baolin Wang,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Liam.Howlett,
	Sidhartha Kumar

On Tue, Oct 21, 2025 at 09:37:02PM +0800, Kefeng Wang wrote:
>
>
> On 2025/10/20 23:18, Kefeng Wang wrote:
> >
> >
> > On 2025/10/20 21:23, David Hildenbrand wrote:
> > >
> > > > >           /*
> > > > >            * Avoid trapping faults against the zero page. The read-only
> > > > >            * data is likely to be read-cached on the local CPU and
> > > > > @@ -2490,19 +2490,13 @@ int change_huge_pmd(struct
> > > > > mmu_gather *tlb, struct vm_area_struct *vma,
> > > > >           if (pmd_protnone(*pmd))
> > > > >               goto unlock;
> > > > >
> > > > > -        folio = pmd_folio(*pmd);
> > > > > -        toptier = node_is_toptier(folio_nid(folio));
> > > > > -        /*
> > > > > -         * Skip scanning top tier node if normal numa
> > > > > -         * balancing is disabled
> > > > > -         */
> > > > > -        if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_NORMAL) &&
> > > > > -            toptier)
> > > > > -            goto unlock;
> > > > > +        /* Get target node for single threaded private VMAs */
> > > > > +        if (!(vma->vm_flags & VM_SHARED) &&
> > > > > +            atomic_read(&vma->vm_mm->mm_users) == 1)
> > > > > +            target_node = numa_node_id();
> > > >
> > > > This is duplicated in both callers, and only used by
> > > > folio_needs_prot_numa(),
> > > > why not abstract this to the function also?
> > >
> > > There was a discussion on that in v3 I think where I asked the same
> > > question.
> > >
> >
> > Yes, it is in v1, for pte, we could avoid 512 times check and the
> > numa_node_id(), so we leave it as is.
> >
>
> If no objection, I won't change this and will send a new version with

See feedback to your response.

In general - please do leave time for reviewers to get back to you, the
review load in THP is very high and we can't always respond right away.

>
> 1) kerneldoc refresh for folio_need_prot_numa, per Lorenzo

OK you said in the other thread you weren't going to do anything, I guess a
miscommunication as you are saying here that you _are_ going to address at
least this part of my feedback.

We're going to need a long and clear kerneldoc there to clarify things.

> 2) update the changelog of this one, per David

See also my feedback here.

>
> Thanks
>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-21 15:01                           ` David Hildenbrand
@ 2025-10-21 16:36                             ` Lorenzo Stoakes
  2025-10-22  0:51                               ` Kefeng Wang
  0 siblings, 1 reply; 32+ messages in thread
From: Lorenzo Stoakes @ 2025-10-21 16:36 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Kefeng Wang, Andrew Morton, linux-mm, Zi Yan, Baolin Wang,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Liam.Howlett

On Tue, Oct 21, 2025 at 05:01:30PM +0200, David Hildenbrand wrote:
> On 21.10.25 16:56, Lorenzo Stoakes wrote:
> > On Tue, Oct 21, 2025 at 08:54:08PM +0800, Kefeng Wang wrote:
> > >
> > >
> > > On 2025/10/21 17:45, Lorenzo Stoakes wrote:
> > > > On Tue, Oct 21, 2025 at 10:25:08AM +0100, Lorenzo Stoakes wrote:
> > > > > On Tue, Oct 21, 2025 at 11:13:32AM +0200, David Hildenbrand wrote:
> > > > > > On 21.10.25 10:41, Kefeng Wang wrote:
> > > > > > > Maybe folio_needs_protnone_mapping()?
> > > > > >
> > > > > > I think prot_numa should be in the name one way or the other, because we
> > > > > > also have ordinary PROT_NONE unrealted to NUMA faults.
> > > > >
> > > > > folio_can_map_prot_numa() semes to me to be the best choice.
> > > > >
> > > >
> > > > I don't want to hold up the series so if you guys feel strongly, leave it
> > > > as-is. But this reads better to me.
> > >
> > > Let's keep things as-is for now.
> >
> > It'd be nice to have an argument as to why (I did say if you guys had
> > strong opinions...), but never mind.
> >
>
> folio_can_map_prot_numa() works for me, so we can just use that and call it
> a day :)

Thanks!

Kefeng - Since we need a respin anyway, and if you don't have any specific,
strong, objection to this, can we just change the function name to this please?

Cheers, Lorenzo


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa()
  2025-10-21 16:36                             ` Lorenzo Stoakes
@ 2025-10-22  0:51                               ` Kefeng Wang
  0 siblings, 0 replies; 32+ messages in thread
From: Kefeng Wang @ 2025-10-22  0:51 UTC (permalink / raw)
  To: Lorenzo Stoakes, David Hildenbrand
  Cc: Andrew Morton, linux-mm, Zi Yan, Baolin Wang, Ryan Roberts,
	Dev Jain, Barry Song, Lance Yang, Liam.Howlett



On 2025/10/22 0:36, Lorenzo Stoakes wrote:
> On Tue, Oct 21, 2025 at 05:01:30PM +0200, David Hildenbrand wrote:
>> On 21.10.25 16:56, Lorenzo Stoakes wrote:
>>> On Tue, Oct 21, 2025 at 08:54:08PM +0800, Kefeng Wang wrote:
>>>>
>>>>
>>>> On 2025/10/21 17:45, Lorenzo Stoakes wrote:
>>>>> On Tue, Oct 21, 2025 at 10:25:08AM +0100, Lorenzo Stoakes wrote:
>>>>>> On Tue, Oct 21, 2025 at 11:13:32AM +0200, David Hildenbrand wrote:
>>>>>>> On 21.10.25 10:41, Kefeng Wang wrote:
>>>>>>>> Maybe folio_needs_protnone_mapping()?
>>>>>>>
>>>>>>> I think prot_numa should be in the name one way or the other, because we
>>>>>>> also have ordinary PROT_NONE unrealted to NUMA faults.
>>>>>>
>>>>>> folio_can_map_prot_numa() semes to me to be the best choice.
>>>>>>
>>>>>
>>>>> I don't want to hold up the series so if you guys feel strongly, leave it
>>>>> as-is. But this reads better to me.
>>>>
>>>> Let's keep things as-is for now.
>>>
>>> It'd be nice to have an argument as to why (I did say if you guys had
>>> strong opinions...), but never mind.
>>>
>>
>> folio_can_map_prot_numa() works for me, so we can just use that and call it
>> a day :)
> 
> Thanks!
> 
> Kefeng - Since we need a respin anyway, and if you don't have any specific,
> strong, objection to this, can we just change the function name to this please?
> 

Sure, will adjust the naming.

Thanks

> Cheers, Lorenzo
> 



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/4] mm: huge_memory: use folio_needs_prot_numa() for pmd folio
  2025-10-21 15:29         ` Lorenzo Stoakes
@ 2025-10-22  1:33           ` Kefeng Wang
  0 siblings, 0 replies; 32+ messages in thread
From: Kefeng Wang @ 2025-10-22  1:33 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: David Hildenbrand, Andrew Morton, linux-mm, Zi Yan, Baolin Wang,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Liam.Howlett,
	Sidhartha Kumar



On 2025/10/21 23:29, Lorenzo Stoakes wrote:
> On Mon, Oct 20, 2025 at 11:18:27PM +0800, Kefeng Wang wrote:
>>
>>
>> On 2025/10/20 21:23, David Hildenbrand wrote:
>>>
>>>>>            /*
>>>>>             * Avoid trapping faults against the zero page. The read-only
>>>>>             * data is likely to be read-cached on the local CPU and
>>>>> @@ -2490,19 +2490,13 @@ int change_huge_pmd(struct mmu_gather
>>>>> *tlb, struct vm_area_struct *vma,
>>>>>            if (pmd_protnone(*pmd))
>>>>>                goto unlock;
>>>>>
>>>>> -        folio = pmd_folio(*pmd);
>>>>> -        toptier = node_is_toptier(folio_nid(folio));
>>>>> -        /*
>>>>> -         * Skip scanning top tier node if normal numa
>>>>> -         * balancing is disabled
>>>>> -         */
>>>>> -        if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_NORMAL) &&
>>>>> -            toptier)
>>>>> -            goto unlock;
>>>>> +        /* Get target node for single threaded private VMAs */
>>>>> +        if (!(vma->vm_flags & VM_SHARED) &&
>>>>> +            atomic_read(&vma->vm_mm->mm_users) == 1)
>>>>> +            target_node = numa_node_id();
>>>>
>>>> This is duplicated in both callers, and only used by
>>>> folio_needs_prot_numa(),
>>>> why not abstract this to the function also?
>>>
>>> There was a discussion on that in v3 I think where I asked the same
>>> question.
>>>
>>
>> Yes, it is in v1, for pte, we could avoid 512 times check and the
>> numa_node_id(), so we leave it as is.
>>
> 
> OK so what you're saying is that in change_pte_range() you only need do the
> check once (per PMD) rather than for each PTE entry.
> 
> It might be worth having this as a separate helper function as it sucks to
> duplicate that.
> 
> I actually really don't like that we pass in a 'target node' but only when it's
> single-threaded. That's a bit silly.
> 
> E.g.:
> 
> static inline bool vma_is_single_threaded_private(struct vm_area_struct *vma)
> {
> 	if (vma->vm_flags & VM_SHARED)
> 		return false;
> 
> 	return atomic_read(&vma->vm_mm->mm_users) == 1;
> }
> 
> Then you can pass in a boolean and change:
> 
> 	/*
> 	 * Don't mess with PTEs if page is already on the node
> 	 * a single-threaded process is running on.
> 	 */
> 	nid = folio_nid(folio);
> 	if (target_node == nid)
> 		return false;
> 
> To:
> 
> 	/*
> 	 * Don't mess with PTEs if page is already on the node
> 	 * a single-threaded process is running on.
> 	 */
> 	if (is_private_single_threaded && folio_nid(folio) == numa_node_id())
> 		return false;
> 
> Which makes a lot more sense than passing in NUMA_NO_NODE for shared VMAs.
> 

OK, will add the helper to remove the duplication.

> Now we're sharing this I really don't know why the function is still in
> mprotect.c by the way? Shouldn't it be in mempolicy.c + have a stub
> function for !CONFIG_NUMA_BALANCING?

Do you mean that moving folio_can_map_numa_prot() near change_prot_numa()
function in mempolicy.c?  if so, I will change it.





^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2025-10-22  1:34 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-20  6:18 [PATCH v4 0/4] mm: some optimizations for prot numa Kefeng Wang
2025-10-20  6:18 ` [PATCH v4 1/4] mm: mprotect: always skip dma pinned folio in prot_numa_skip() Kefeng Wang
2025-10-21  3:06   ` Barry Song
2025-10-20  6:18 ` [PATCH v4 2/4] mm: mprotect: avoid unnecessary struct page accessing if pte_protnone() Kefeng Wang
2025-10-20 12:53   ` Lorenzo Stoakes
2025-10-20 13:08   ` David Hildenbrand
2025-10-20  6:18 ` [PATCH v4 3/4] mm: mprotect: convert to folio_needs_prot_numa() Kefeng Wang
2025-10-20 13:09   ` David Hildenbrand
2025-10-20 13:10   ` Lorenzo Stoakes
2025-10-20 15:14     ` Kefeng Wang
2025-10-20 17:34       ` David Hildenbrand
2025-10-20 17:49         ` Lorenzo Stoakes
2025-10-20 18:12           ` David Hildenbrand
2025-10-20 18:56             ` Lorenzo Stoakes
2025-10-21  8:41               ` Kefeng Wang
2025-10-21  9:13                 ` David Hildenbrand
2025-10-21  9:25                   ` Lorenzo Stoakes
2025-10-21  9:45                     ` Lorenzo Stoakes
2025-10-21 12:54                       ` Kefeng Wang
2025-10-21 14:56                         ` Lorenzo Stoakes
2025-10-21 15:01                           ` David Hildenbrand
2025-10-21 16:36                             ` Lorenzo Stoakes
2025-10-22  0:51                               ` Kefeng Wang
2025-10-20  6:18 ` [PATCH v4 4/4] mm: huge_memory: use folio_needs_prot_numa() for pmd folio Kefeng Wang
2025-10-20 13:11   ` David Hildenbrand
2025-10-20 13:15   ` Lorenzo Stoakes
2025-10-20 13:23     ` David Hildenbrand
2025-10-20 15:18       ` Kefeng Wang
2025-10-21 13:37         ` Kefeng Wang
2025-10-21 15:35           ` Lorenzo Stoakes
2025-10-21 15:29         ` Lorenzo Stoakes
2025-10-22  1:33           ` Kefeng Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).