linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
@ 2025-09-03  5:46 Dev Jain
  2025-09-03  5:46 ` [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO Dev Jain
                   ` (6 more replies)
  0 siblings, 7 replies; 24+ messages in thread
From: Dev Jain @ 2025-09-03  5:46 UTC (permalink / raw)
  To: akpm, david, kas, willy, hughd
  Cc: ziy, baolin.wang, lorenzo.stoakes, Liam.Howlett, npache,
	ryan.roberts, baohua, linux-mm, linux-kernel, Dev Jain

Currently khugepaged does not collapse a region which does not have a
single writable page. This is wasteful since non-writable VMAs mapped by
the application won't benefit from THP collapse. Therefore, remove this
restriction and allow khugepaged to collapse a VMA with arbitrary
protections.

Along with this, currently MADV_COLLAPSE does not perform a collapse on a
non-writable VMA, and this restriction is nowhere to be found on the
manpage - the restriction itself sounds wrong to me since the user knows
the protection of the memory it has mapped, so collapsing read-only
memory via madvise() should be a choice of the user which shouldn't
be overridden by the kernel.

On an arm64 machine, an average of 5% improvement is seen on some mmtests
benchmarks, particularly hackbench, with a maximum improvement of 12%.

Signed-off-by: Dev Jain <dev.jain@arm.com>
---
RFC->v1:
Drop writable references from tracepoints

RFC:
https://lore.kernel.org/all/20250901074817.73012-1-dev.jain@arm.com/

I can see performance improvements on mmtests run on an arm64 machine,
compared with 6.17-rc2. (I) denotes a statistically significant
improvement and (R) a statistically significant regression (please ignore
the numbers in the middle column):

+------------------------------------+----------------------------------------------------------+-----------------------+--------------------------+
| mmtests/hackbench                  | process-pipes-1 (seconds)                                |                 0.145 |                   -0.06% |
|                                    | process-pipes-4 (seconds)                                |                0.4335 |                   -0.27% |
|                                    | process-pipes-7 (seconds)                                |                 0.823 |              (I) -12.13% |
|                                    | process-pipes-12 (seconds)                               |    1.3538333333333334 |               (I) -5.32% |
|                                    | process-pipes-21 (seconds)                               |    1.8971666666666664 |               (I) -2.87% |
|                                    | process-pipes-30 (seconds)                               |    2.5023333333333335 |               (I) -3.39% |
|                                    | process-pipes-48 (seconds)                               |                3.4305 |               (I) -5.65% |
|                                    | process-pipes-79 (seconds)                               |     4.245833333333334 |               (I) -6.74% |
|                                    | process-pipes-110 (seconds)                              |     5.114833333333333 |               (I) -6.26% |
|                                    | process-pipes-141 (seconds)                              |                6.1885 |               (I) -4.99% |
|                                    | process-pipes-172 (seconds)                              |     7.231833333333334 |               (I) -4.45% |
|                                    | process-pipes-203 (seconds)                              |     8.393166666666668 |               (I) -3.65% |
|                                    | process-pipes-234 (seconds)                              |     9.487499999999999 |               (I) -3.45% |
|                                    | process-pipes-256 (seconds)                              |    10.316166666666666 |               (I) -3.47% |
|                                    | process-sockets-1 (seconds)                              |                 0.289 |                    2.13% |
|                                    | process-sockets-4 (seconds)                              |    0.7596666666666666 |                    1.02% |
|                                    | process-sockets-7 (seconds)                              |    1.1663333333333334 |                   -0.26% |
|                                    | process-sockets-12 (seconds)                             |    1.8641666666666665 |                   -1.24% |
|                                    | process-sockets-21 (seconds)                             |    3.0773333333333333 |                    0.01% |
|                                    | process-sockets-30 (seconds)                             |                4.2405 |                   -0.15% |
|                                    | process-sockets-48 (seconds)                             |     6.459666666666666 |                    0.15% |
|                                    | process-sockets-79 (seconds)                             |    10.156833333333333 |                    1.45% |
|                                    | process-sockets-110 (seconds)                            |    14.317833333333333 |                   -1.64% |
|                                    | process-sockets-141 (seconds)                            |               20.8735 |               (I) -4.27% |
|                                    | process-sockets-172 (seconds)                            |    26.205333333333332 |                    0.30% |
|                                    | process-sockets-203 (seconds)                            |    31.298000000000002 |                   -1.71% |
|                                    | process-sockets-234 (seconds)                            |    36.104000000000006 |                   -1.94% |
|                                    | process-sockets-256 (seconds)                            |     39.44016666666667 |                   -0.71% |
|                                    | thread-pipes-1 (seconds)                                 |   0.17550000000000002 |                    0.66% |
|                                    | thread-pipes-4 (seconds)                                 |   0.44716666666666666 |                    1.66% |
|                                    | thread-pipes-7 (seconds)                                 |                0.7345 |                   -0.17% |
|                                    | thread-pipes-12 (seconds)                                |     1.405833333333333 |               (I) -4.12% |
|                                    | thread-pipes-21 (seconds)                                |    2.0113333333333334 |               (I) -2.13% |
|                                    | thread-pipes-30 (seconds)                                |    2.6648333333333336 |               (I) -3.78% |
|                                    | thread-pipes-48 (seconds)                                |    3.6341666666666668 |               (I) -5.77% |
|                                    | thread-pipes-79 (seconds)                                |                4.4085 |               (I) -5.31% |
|                                    | thread-pipes-110 (seconds)                               |     5.374666666666666 |               (I) -6.12% |
|                                    | thread-pipes-141 (seconds)                               |     6.385666666666666 |               (I) -4.00% |
|                                    | thread-pipes-172 (seconds)                               |     7.403000000000001 |               (I) -3.01% |
|                                    | thread-pipes-203 (seconds)                               |     8.570333333333332 |               (I) -2.62% |
|                                    | thread-pipes-234 (seconds)                               |     9.719166666666666 |               (I) -2.00% |
|                                    | thread-pipes-256 (seconds)                               |    10.552833333333334 |               (I) -2.30% |
|                                    | thread-sockets-1 (seconds)                               |                0.3065 |                (R) 2.39% |
+------------------------------------+----------------------------------------------------------+-----------------------+--------------------------+

+------------------------------------+----------------------------------------------------------+-----------------------+--------------------------+
| mmtests/sysbench-mutex             | sysbenchmutex-1 (usec)                                   |    194.38333333333333 |                   -0.02% |
|                                    | sysbenchmutex-4 (usec)                                   |               200.875 |                   -0.02% |
|                                    | sysbenchmutex-7 (usec)                                   |    201.23000000000002 |                    0.00% |
|                                    | sysbenchmutex-12 (usec)                                  |    201.77666666666664 |                    0.12% |
|                                    | sysbenchmutex-21 (usec)                                  |                203.03 |                   -0.40% |
|                                    | sysbenchmutex-30 (usec)                                  |               203.285 |                    0.08% |
|                                    | sysbenchmutex-48 (usec)                                  |    231.30000000000004 |                    2.59% |
|                                    | sysbenchmutex-79 (usec)                                  |               362.075 |                   -0.80% |
|                                    | sysbenchmutex-110 (usec)                                 |     516.8233333333334 |                   -3.87% |
|                                    | sysbenchmutex-128 (usec)                                 |     593.3533333333334 |               (I) -4.46% |
+------------------------------------+----------------------------------------------------------+-----------------------+--------------------------+
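
To make the MADV_COLLAPSE case concrete, a minimal userspace sketch (not
part of this patch) could look like the following. The names align_up()
and collapse_readonly_region() are made up for illustration, and the
2 MiB region size assumes PMD-sized THPs on 4K base pages. Depending on
the THP sysfs settings the region may already be backed by a huge page
when madvise() is called, so treat this as a probe rather than a test.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25	/* asm-generic uapi value, for older libc headers */
#endif

/* Assumption: PMD-sized THPs are 2 MiB (e.g. 4K base pages). */
#define REGION_SIZE ((size_t)2 << 20)

/* Round addr up to the next multiple of align (align must be a power of two). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Populate an anonymous region, make it read-only, then ask the kernel to
 * collapse it. Returns 0 if madvise(MADV_COLLAPSE) succeeded, the errno
 * value if it failed, or -1 if the setup (mmap/mprotect) failed.
 */
int collapse_readonly_region(void)
{
	/* Over-allocate so that a PMD-aligned window fits inside the mapping. */
	size_t len = 2 * REGION_SIZE;
	char *raw, *aligned;
	int ret = -1;

	raw = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return -1;

	aligned = (char *)align_up((uintptr_t)raw, REGION_SIZE);

	/* Fault the pages in while the mapping is still writable. */
	memset(aligned, 0x5a, REGION_SIZE);

	/* From here on no PTE in the window is writable. */
	if (mprotect(aligned, REGION_SIZE, PROT_READ) == 0) {
		if (madvise(aligned, REGION_SIZE, MADV_COLLAPSE) == 0)
			ret = 0;
		else
			ret = errno;	/* save before munmap() can clobber it */
	}

	munmap(raw, len);
	return ret;
}
```

Calling collapse_readonly_region() on kernels with and without this
patch and comparing the return value is one way to see the change from
userspace; the exact errno on failure is not something this sketch
relies on.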

 mm/khugepaged.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 4ec324a4c1fe..a0f1df2a7ae6 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -676,9 +676,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 			writable = true;
 	}
 
-	if (unlikely(!writable)) {
-		result = SCAN_PAGE_RO;
-	} else if (unlikely(cc->is_khugepaged && !referenced)) {
+	if (unlikely(cc->is_khugepaged && !referenced)) {
 		result = SCAN_LACK_REFERENCED_PAGE;
 	} else {
 		result = SCAN_SUCCEED;
@@ -1421,9 +1419,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 		     mmu_notifier_test_young(vma->vm_mm, _address)))
 			referenced++;
 	}
-	if (!writable) {
-		result = SCAN_PAGE_RO;
-	} else if (cc->is_khugepaged &&
+	if (cc->is_khugepaged &&
 		   (!referenced ||
 		    (unmapped && referenced < HPAGE_PMD_NR / 2))) {
 		result = SCAN_LACK_REFERENCED_PAGE;
@@ -2830,7 +2826,6 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
 		case SCAN_PMD_NULL:
 		case SCAN_PTE_NON_PRESENT:
 		case SCAN_PTE_UFFD_WP:
-		case SCAN_PAGE_RO:
 		case SCAN_LACK_REFERENCED_PAGE:
 		case SCAN_PAGE_NULL:
 		case SCAN_PAGE_COUNT:
-- 
2.30.2
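
For readers skimming the hunks above, the final check that remains after
this patch can be modelled in plain userspace C as below. This is only a
sketch: model_isolate() and model_scan_pmd() are hypothetical names,
HPAGE_PMD_NR is assumed to be 512 (4K pages, 2M PMD), and the success
branch of the scan_pmd check is not visible in the hunk and is assumed
here. The per-PTE checks that run earlier in both loops are not modelled.

```c
#include <stdbool.h>

/* Assumption: 4K base pages and a 2M PMD, i.e. 512 PTEs per PMD. */
#define HPAGE_PMD_NR 512

/* Only the two outcomes relevant to the final check are modelled. */
enum model_result {
	MODEL_SCAN_SUCCEED,
	MODEL_SCAN_LACK_REFERENCED_PAGE,
};

/*
 * Final check of __collapse_huge_page_isolate() after the patch:
 * the writable test is gone, only the referenced test remains.
 */
enum model_result model_isolate(bool is_khugepaged, int referenced)
{
	if (is_khugepaged && !referenced)
		return MODEL_SCAN_LACK_REFERENCED_PAGE;
	return MODEL_SCAN_SUCCEED;
}

/*
 * Final check of hpage_collapse_scan_pmd() after the patch. The else
 * branch is an assumption; the hunk only shows the failure case.
 */
enum model_result model_scan_pmd(bool is_khugepaged, int referenced,
				 int unmapped)
{
	if (is_khugepaged &&
	    (!referenced ||
	     (unmapped && referenced < HPAGE_PMD_NR / 2)))
		return MODEL_SCAN_LACK_REFERENCED_PAGE;
	return MODEL_SCAN_SUCCEED;
}
```

Note that with is_khugepaged false (the MADV_COLLAPSE path) neither
function can fail any more, which matches dropping SCAN_PAGE_RO from the
switch in madvise_collapse().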


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO
  2025-09-03  5:46 [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs Dev Jain
@ 2025-09-03  5:46 ` Dev Jain
  2025-09-03  6:53   ` David Hildenbrand
                     ` (5 more replies)
  2025-09-03  6:52 ` [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs David Hildenbrand
                   ` (5 subsequent siblings)
  6 siblings, 6 replies; 24+ messages in thread
From: Dev Jain @ 2025-09-03  5:46 UTC (permalink / raw)
  To: akpm, david, kas, willy, hughd
  Cc: ziy, baolin.wang, lorenzo.stoakes, Liam.Howlett, npache,
	ryan.roberts, baohua, linux-mm, linux-kernel, Dev Jain

Now that all actionable outcomes from checking pte_write() are gone,
drop the related references.

Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 include/trace/events/huge_memory.h | 19 ++++++-------------
 mm/khugepaged.c                    | 14 +++-----------
 2 files changed, 9 insertions(+), 24 deletions(-)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index 2305df6cb485..dd94d14a2427 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -19,7 +19,6 @@
 	EM( SCAN_PTE_NON_PRESENT,	"pte_non_present")		\
 	EM( SCAN_PTE_UFFD_WP,		"pte_uffd_wp")			\
 	EM( SCAN_PTE_MAPPED_HUGEPAGE,	"pte_mapped_hugepage")		\
-	EM( SCAN_PAGE_RO,		"no_writable_page")		\
 	EM( SCAN_LACK_REFERENCED_PAGE,	"lack_referenced_page")		\
 	EM( SCAN_PAGE_NULL,		"page_null")			\
 	EM( SCAN_SCAN_ABORT,		"scan_aborted")			\
@@ -55,15 +54,14 @@ SCAN_STATUS
 
 TRACE_EVENT(mm_khugepaged_scan_pmd,
 
-	TP_PROTO(struct mm_struct *mm, struct folio *folio, bool writable,
+	TP_PROTO(struct mm_struct *mm, struct folio *folio,
 		 int referenced, int none_or_zero, int status, int unmapped),
 
-	TP_ARGS(mm, folio, writable, referenced, none_or_zero, status, unmapped),
+	TP_ARGS(mm, folio, referenced, none_or_zero, status, unmapped),
 
 	TP_STRUCT__entry(
 		__field(struct mm_struct *, mm)
 		__field(unsigned long, pfn)
-		__field(bool, writable)
 		__field(int, referenced)
 		__field(int, none_or_zero)
 		__field(int, status)
@@ -73,17 +71,15 @@ TRACE_EVENT(mm_khugepaged_scan_pmd,
 	TP_fast_assign(
 		__entry->mm = mm;
 		__entry->pfn = folio ? folio_pfn(folio) : -1;
-		__entry->writable = writable;
 		__entry->referenced = referenced;
 		__entry->none_or_zero = none_or_zero;
 		__entry->status = status;
 		__entry->unmapped = unmapped;
 	),
 
-	TP_printk("mm=%p, scan_pfn=0x%lx, writable=%d, referenced=%d, none_or_zero=%d, status=%s, unmapped=%d",
+	TP_printk("mm=%p, scan_pfn=0x%lx, referenced=%d, none_or_zero=%d, status=%s, unmapped=%d",
 		__entry->mm,
 		__entry->pfn,
-		__entry->writable,
 		__entry->referenced,
 		__entry->none_or_zero,
 		__print_symbolic(__entry->status, SCAN_STATUS),
@@ -117,15 +113,14 @@ TRACE_EVENT(mm_collapse_huge_page,
 TRACE_EVENT(mm_collapse_huge_page_isolate,
 
 	TP_PROTO(struct folio *folio, int none_or_zero,
-		 int referenced, bool  writable, int status),
+		 int referenced, int status),
 
-	TP_ARGS(folio, none_or_zero, referenced, writable, status),
+	TP_ARGS(folio, none_or_zero, referenced, status),
 
 	TP_STRUCT__entry(
 		__field(unsigned long, pfn)
 		__field(int, none_or_zero)
 		__field(int, referenced)
-		__field(bool, writable)
 		__field(int, status)
 	),
 
@@ -133,15 +128,13 @@ TRACE_EVENT(mm_collapse_huge_page_isolate,
 		__entry->pfn = folio ? folio_pfn(folio) : -1;
 		__entry->none_or_zero = none_or_zero;
 		__entry->referenced = referenced;
-		__entry->writable = writable;
 		__entry->status = status;
 	),
 
-	TP_printk("scan_pfn=0x%lx, none_or_zero=%d, referenced=%d, writable=%d, status=%s",
+	TP_printk("scan_pfn=0x%lx, none_or_zero=%d, referenced=%d, status=%s",
 		__entry->pfn,
 		__entry->none_or_zero,
 		__entry->referenced,
-		__entry->writable,
 		__print_symbolic(__entry->status, SCAN_STATUS))
 );
 
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a0f1df2a7ae6..af5f5c80fe4e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -39,7 +39,6 @@ enum scan_result {
 	SCAN_PTE_NON_PRESENT,
 	SCAN_PTE_UFFD_WP,
 	SCAN_PTE_MAPPED_HUGEPAGE,
-	SCAN_PAGE_RO,
 	SCAN_LACK_REFERENCED_PAGE,
 	SCAN_PAGE_NULL,
 	SCAN_SCAN_ABORT,
@@ -557,7 +556,6 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 	struct folio *folio = NULL;
 	pte_t *_pte;
 	int none_or_zero = 0, shared = 0, result = SCAN_FAIL, referenced = 0;
-	bool writable = false;
 
 	for (_pte = pte; _pte < pte + HPAGE_PMD_NR;
 	     _pte++, address += PAGE_SIZE) {
@@ -671,9 +669,6 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 		     folio_test_referenced(folio) || mmu_notifier_test_young(vma->vm_mm,
 								     address)))
 			referenced++;
-
-		if (pte_write(pteval))
-			writable = true;
 	}
 
 	if (unlikely(cc->is_khugepaged && !referenced)) {
@@ -681,13 +676,13 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 	} else {
 		result = SCAN_SUCCEED;
 		trace_mm_collapse_huge_page_isolate(folio, none_or_zero,
-						    referenced, writable, result);
+						    referenced, result);
 		return result;
 	}
 out:
 	release_pte_pages(pte, _pte, compound_pagelist);
 	trace_mm_collapse_huge_page_isolate(folio, none_or_zero,
-					    referenced, writable, result);
+					    referenced, result);
 	return result;
 }
 
@@ -1280,7 +1275,6 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 	unsigned long _address;
 	spinlock_t *ptl;
 	int node = NUMA_NO_NODE, unmapped = 0;
-	bool writable = false;
 
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 
@@ -1344,8 +1338,6 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 			result = SCAN_PTE_UFFD_WP;
 			goto out_unmap;
 		}
-		if (pte_write(pteval))
-			writable = true;
 
 		page = vm_normal_page(vma, _address, pteval);
 		if (unlikely(!page) || unlikely(is_zone_device_page(page))) {
@@ -1435,7 +1427,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 		*mmap_locked = false;
 	}
 out:
-	trace_mm_khugepaged_scan_pmd(mm, folio, writable, referenced,
+	trace_mm_khugepaged_scan_pmd(mm, folio, referenced,
 				     none_or_zero, result, unmapped);
 	return result;
 }
-- 
2.30.2
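
Scripts that parse the mm_khugepaged_scan_pmd tracepoint will no longer
see a writable= field after this patch. A rough sketch of a parser for
the new TP_printk() format follows. struct scan_pmd_event and
parse_scan_pmd() are hypothetical names, and the mm= value is assumed to
be printed as plain hex digits, which may not hold on every
configuration.

```c
#include <stdio.h>

/* Fields of one mm_khugepaged_scan_pmd event, in the new format. */
struct scan_pmd_event {
	unsigned long mm;	/* printed with %p, parsed here as hex */
	unsigned long pfn;
	int referenced;
	int none_or_zero;
	char status[32];	/* symbolic name from SCAN_STATUS */
	int unmapped;
};

/*
 * Parse the text produced by the new TP_printk() format string.
 * Returns 0 when all six fields were found, -1 otherwise (for
 * example when the line still carries the old writable= field).
 */
int parse_scan_pmd(const char *line, struct scan_pmd_event *ev)
{
	int n = sscanf(line,
		       "mm=%lx, scan_pfn=0x%lx, referenced=%d, "
		       "none_or_zero=%d, status=%31[^,], unmapped=%d",
		       &ev->mm, &ev->pfn, &ev->referenced,
		       &ev->none_or_zero, ev->status, &ev->unmapped);

	return n == 6 ? 0 : -1;
}
```

A line in the old format stops matching at writable=, so the function
returns -1 for it instead of silently shifting the fields.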



* Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
  2025-09-03  5:46 [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs Dev Jain
  2025-09-03  5:46 ` [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO Dev Jain
@ 2025-09-03  6:52 ` David Hildenbrand
  2025-09-03  8:08 ` Wei Yang
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 24+ messages in thread
From: David Hildenbrand @ 2025-09-03  6:52 UTC (permalink / raw)
  To: Dev Jain, akpm, kas, willy, hughd
  Cc: ziy, baolin.wang, lorenzo.stoakes, Liam.Howlett, npache,
	ryan.roberts, baohua, linux-mm, linux-kernel

On 03.09.25 07:46, Dev Jain wrote:
> Currently khugepaged does not collapse a region which does not have a
> single writable page. This is wasteful since non-writable VMAs mapped by
> the application won't benefit from THP collapse. Therefore, remove this
> restriction and allow khugepaged to collapse a VMA with arbitrary
> protections.
> 
> Along with this, currently MADV_COLLAPSE does not perform a collapse on a
> non-writable VMA, and this restriction is nowhere to be found on the
> manpage - the restriction itself sounds wrong to me since the user knows
> the protection of the memory it has mapped, so collapsing read-only
> memory via madvise() should be a choice of the user which shouldn't
> be overridden by the kernel.
> 
> On an arm64 machine, an average of 5% improvement is seen on some mmtests
> benchmarks, particularly hackbench, with a maximum improvement of 12%.
> 
> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers

David / dhildenb



* Re: [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO
  2025-09-03  5:46 ` [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO Dev Jain
@ 2025-09-03  6:53   ` David Hildenbrand
  2025-09-03  9:04   ` Kiryl Shutsemau
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 24+ messages in thread
From: David Hildenbrand @ 2025-09-03  6:53 UTC (permalink / raw)
  To: Dev Jain, akpm, kas, willy, hughd
  Cc: ziy, baolin.wang, lorenzo.stoakes, Liam.Howlett, npache,
	ryan.roberts, baohua, linux-mm, linux-kernel

On 03.09.25 07:46, Dev Jain wrote:
> Now that all actionable outcomes from checking pte_write() are gone,
> drop the related references.
> 
> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers

David / dhildenb



* Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
  2025-09-03  5:46 [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs Dev Jain
  2025-09-03  5:46 ` [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO Dev Jain
  2025-09-03  6:52 ` [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs David Hildenbrand
@ 2025-09-03  8:08 ` Wei Yang
  2025-09-03  8:13   ` David Hildenbrand
                     ` (2 more replies)
  2025-09-03  9:03 ` Kiryl Shutsemau
                   ` (3 subsequent siblings)
  6 siblings, 3 replies; 24+ messages in thread
From: Wei Yang @ 2025-09-03  8:08 UTC (permalink / raw)
  To: Dev Jain
  Cc: akpm, david, kas, willy, hughd, ziy, baolin.wang, lorenzo.stoakes,
	Liam.Howlett, npache, ryan.roberts, baohua, linux-mm,
	linux-kernel

On Wed, Sep 03, 2025 at 11:16:34AM +0530, Dev Jain wrote:
>Currently khugepaged does not collapse a region which does not have a
>single writable page. This is wasteful since non-writable VMAs mapped by
>the application won't benefit from THP collapse. Therefore, remove this
>restriction and allow khugepaged to collapse a VMA with arbitrary
>protections.
>
>Along with this, currently MADV_COLLAPSE does not perform a collapse on a
>non-writable VMA, and this restriction is nowhere to be found on the
>manpage - the restriction itself sounds wrong to me since the user knows
>the protection of the memory it has mapped, so collapsing read-only
>memory via madvise() should be a choice of the user which shouldn't
>be overridden by the kernel.
>
>On an arm64 machine, an average of 5% improvement is seen on some mmtests
>benchmarks, particularly hackbench, with a maximum improvement of 12%.
>
>Signed-off-by: Dev Jain <dev.jain@arm.com>
>---
[...]
> mm/khugepaged.c | 9 ++-------
> 1 file changed, 2 insertions(+), 7 deletions(-)
>
>diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>index 4ec324a4c1fe..a0f1df2a7ae6 100644
>--- a/mm/khugepaged.c
>+++ b/mm/khugepaged.c
>@@ -676,9 +676,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
> 			writable = true;
> 	}
> 
>-	if (unlikely(!writable)) {
>-		result = SCAN_PAGE_RO;
>-	} else if (unlikely(cc->is_khugepaged && !referenced)) {

Would this cause more memory usage in the system?

For example, one application would fork itself many times. Its executable area
is read only, so all of them share one copy in memory.

Now we may collapse the range and create one copy for each process.

Ok, we have max_ptes_shared, while if some ptes are none, could it still do
collapse?

Maybe this is not realistic, just curious.

>+	if (unlikely(cc->is_khugepaged && !referenced)) {
> 		result = SCAN_LACK_REFERENCED_PAGE;
> 	} else {
> 		result = SCAN_SUCCEED;
>@@ -1421,9 +1419,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
> 		     mmu_notifier_test_young(vma->vm_mm, _address)))
> 			referenced++;
> 	}
>-	if (!writable) {
>-		result = SCAN_PAGE_RO;
>-	} else if (cc->is_khugepaged &&
>+	if (cc->is_khugepaged &&
> 		   (!referenced ||
> 		    (unmapped && referenced < HPAGE_PMD_NR / 2))) {
> 		result = SCAN_LACK_REFERENCED_PAGE;
>@@ -2830,7 +2826,6 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
> 		case SCAN_PMD_NULL:
> 		case SCAN_PTE_NON_PRESENT:
> 		case SCAN_PTE_UFFD_WP:
>-		case SCAN_PAGE_RO:
> 		case SCAN_LACK_REFERENCED_PAGE:
> 		case SCAN_PAGE_NULL:
> 		case SCAN_PAGE_COUNT:
>-- 
>2.30.2
>

-- 
Wei Yang
Help you, Help me


* Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
  2025-09-03  8:08 ` Wei Yang
@ 2025-09-03  8:13   ` David Hildenbrand
  2025-09-03  8:30     ` Wei Yang
  2025-09-03  9:06   ` Dev Jain
  2025-09-03  9:15   ` Dev Jain
  2 siblings, 1 reply; 24+ messages in thread
From: David Hildenbrand @ 2025-09-03  8:13 UTC (permalink / raw)
  To: Wei Yang, Dev Jain
  Cc: akpm, kas, willy, hughd, ziy, baolin.wang, lorenzo.stoakes,
	Liam.Howlett, npache, ryan.roberts, baohua, linux-mm,
	linux-kernel

On 03.09.25 10:08, Wei Yang wrote:
> On Wed, Sep 03, 2025 at 11:16:34AM +0530, Dev Jain wrote:
>> Currently khugepaged does not collapse a region which does not have a
>> single writable page. This is wasteful since non-writable VMAs mapped by
>> the application won't benefit from THP collapse. Therefore, remove this
>> restriction and allow khugepaged to collapse a VMA with arbitrary
>> protections.
>>
>> Along with this, currently MADV_COLLAPSE does not perform a collapse on a
>> non-writable VMA, and this restriction is nowhere to be found on the
>> manpage - the restriction itself sounds wrong to me since the user knows
>> the protection of the memory it has mapped, so collapsing read-only
>> memory via madvise() should be a choice of the user which shouldn't
>> be overridden by the kernel.
>>
>> On an arm64 machine, an average of 5% improvement is seen on some mmtests
>> benchmarks, particularly hackbench, with a maximum improvement of 12%.
>>
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
> [...]
>> mm/khugepaged.c | 9 ++-------
>> 1 file changed, 2 insertions(+), 7 deletions(-)
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 4ec324a4c1fe..a0f1df2a7ae6 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -676,9 +676,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>> 			writable = true;
>> 	}
>>
>> -	if (unlikely(!writable)) {
>> -		result = SCAN_PAGE_RO;
>> -	} else if (unlikely(cc->is_khugepaged && !referenced)) {
> 
> Would this cause more memory usage in the system?
> 
> For example, one application would fork itself many times. Its executable area
> is read only, so all of them share one copy in memory.
> 
> Now we may collapse the range and create one copy for each process.
> 
> Ok, we have max_ptes_shared, while if some ptes are none, could it still do
> collapse?

The max_ptes_shared check should handle that, so I don't immediately see 
a problem with that.

When I thought about the "why is this writable check there" in the past, 
I thought that maybe it was "smarter" to use THP where people are 
actually using that memory for writing (writing heap etc).

But I can understand that some pure r/o users exist that can benefit as
well.
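
For anyone who wants to check the limits mentioned above on a running
system, here is a small sketch that reads the khugepaged tunables from
sysfs. The helper names read_long_from_file() and
read_khugepaged_tunable() are made up for this example; the sysfs
directory is the usual khugepaged one, and the files may be absent if
THP is not built in.

```c
#include <stddef.h>
#include <stdio.h>

#define KHUGEPAGED_SYSFS "/sys/kernel/mm/transparent_hugepage/khugepaged/"

/* Read one decimal integer from a file. Returns 0 on success, -1 on failure. */
int read_long_from_file(const char *path, long *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%ld", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/*
 * Read a khugepaged tunable such as "max_ptes_shared" or
 * "max_ptes_none". Returns 0 on success, -1 if the file is missing,
 * unreadable, or the name does not fit in the path buffer.
 */
int read_khugepaged_tunable(const char *name, long *value)
{
	char path[256];
	int n = snprintf(path, sizeof(path), KHUGEPAGED_SYSFS "%s", name);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	return read_long_from_file(path, value);
}
```

Reading max_ptes_shared and max_ptes_none this way shows the limits
khugepaged applies before it collapses a range whose PTEs are shared or
empty.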

-- 
Cheers

David / dhildenb



* Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
  2025-09-03  8:13   ` David Hildenbrand
@ 2025-09-03  8:30     ` Wei Yang
  0 siblings, 0 replies; 24+ messages in thread
From: Wei Yang @ 2025-09-03  8:30 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Wei Yang, Dev Jain, akpm, kas, willy, hughd, ziy, baolin.wang,
	lorenzo.stoakes, Liam.Howlett, npache, ryan.roberts, baohua,
	linux-mm, linux-kernel

On Wed, Sep 03, 2025 at 10:13:35AM +0200, David Hildenbrand wrote:
>On 03.09.25 10:08, Wei Yang wrote:
>> On Wed, Sep 03, 2025 at 11:16:34AM +0530, Dev Jain wrote:
>> > Currently khugepaged does not collapse a region which does not have a
>> > single writable page. This is wasteful since non-writable VMAs mapped by
>> > the application won't benefit from THP collapse. Therefore, remove this
>> > restriction and allow khugepaged to collapse a VMA with arbitrary
>> > protections.
>> > 
>> > Along with this, currently MADV_COLLAPSE does not perform a collapse on a
>> > non-writable VMA, and this restriction is nowhere to be found on the
>> > manpage - the restriction itself sounds wrong to me since the user knows
>> > the protection of the memory it has mapped, so collapsing read-only
>> > memory via madvise() should be a choice of the user which shouldn't
>> > be overridden by the kernel.
>> > 
>> > On an arm64 machine, an average of 5% improvement is seen on some mmtests
>> > benchmarks, particularly hackbench, with a maximum improvement of 12%.
>> > 
>> > Signed-off-by: Dev Jain <dev.jain@arm.com>
>> > ---
>> [...]
>> > mm/khugepaged.c | 9 ++-------
>> > 1 file changed, 2 insertions(+), 7 deletions(-)
>> > 
>> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> > index 4ec324a4c1fe..a0f1df2a7ae6 100644
>> > --- a/mm/khugepaged.c
>> > +++ b/mm/khugepaged.c
>> > @@ -676,9 +676,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>> > 			writable = true;
>> > 	}
>> > 
>> > -	if (unlikely(!writable)) {
>> > -		result = SCAN_PAGE_RO;
>> > -	} else if (unlikely(cc->is_khugepaged && !referenced)) {
>> 
>> Would this cause more memory usage in the system?
>> 
>> For example, one application would fork itself many times. Its executable area
>> is read only, so all of them share one copy in memory.
>> 
>> Now we may collapse the range and create one copy for each process.
>> 
>> Ok, we have max_ptes_shared, while if some ptes are none, could it still do
>> collapse?
>
>The max_ptes_shared check should handle that, so I don't immediately see a
>problem with that.
>

It seems reasonable, so

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>

>When I thought about the "why is this writable check there" in the past, I
>thought that maybe it was "smarter" to use THP where people are actually
>using that memory for writing (writing heap etc).
>
>But I can understand that some pure r/o users exist that can benefit as
>well.
>
>-- 
>Cheers
>
>David / dhildenb

-- 
Wei Yang
Help you, Help me


* Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
  2025-09-03  5:46 [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs Dev Jain
                   ` (2 preceding siblings ...)
  2025-09-03  8:08 ` Wei Yang
@ 2025-09-03  9:03 ` Kiryl Shutsemau
  2025-09-03 15:46 ` Zi Yan
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 24+ messages in thread
From: Kiryl Shutsemau @ 2025-09-03  9:03 UTC (permalink / raw)
  To: Dev Jain
  Cc: akpm, david, willy, hughd, ziy, baolin.wang, lorenzo.stoakes,
	Liam.Howlett, npache, ryan.roberts, baohua, linux-mm,
	linux-kernel

On Wed, Sep 03, 2025 at 11:16:34AM +0530, Dev Jain wrote:
> Currently khugepaged does not collapse a region which does not have a
> single writable page. This is wasteful since non-writable VMAs mapped by
> the application won't benefit from THP collapse. Therefore, remove this
> restriction and allow khugepaged to collapse a VMA with arbitrary
> protections.
> 
> Along with this, currently MADV_COLLAPSE does not perform a collapse on a
> non-writable VMA, and this restriction is nowhere to be found on the
> manpage - the restriction itself sounds wrong to me since the user knows
> the protection of the memory it has mapped, so collapsing read-only
> memory via madvise() should be a choice of the user which shouldn't
> be overridden by the kernel.
> 
> On an arm64 machine, an average of 5% improvement is seen on some mmtests
> benchmarks, particularly hackbench, with a maximum improvement of 12%.
> 
> Signed-off-by: Dev Jain <dev.jain@arm.com>

Reviewed-by: Kiryl Shutsemau <kas@kernel.org>

-- 
  Kiryl Shutsemau / Kirill A. Shutemov


* Re: [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO
  2025-09-03  5:46 ` [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO Dev Jain
  2025-09-03  6:53   ` David Hildenbrand
@ 2025-09-03  9:04   ` Kiryl Shutsemau
  2025-09-03 13:26   ` Lorenzo Stoakes
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 24+ messages in thread
From: Kiryl Shutsemau @ 2025-09-03  9:04 UTC (permalink / raw)
  To: Dev Jain
  Cc: akpm, david, willy, hughd, ziy, baolin.wang, lorenzo.stoakes,
	Liam.Howlett, npache, ryan.roberts, baohua, linux-mm,
	linux-kernel

On Wed, Sep 03, 2025 at 11:16:35AM +0530, Dev Jain wrote:
> Now that all actionable outcomes from checking pte_write() are gone,
> drop the related references.
> 
> Signed-off-by: Dev Jain <dev.jain@arm.com>

Reviewed-by: Kiryl Shutsemau <kas@kernel.org>

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

* Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
  2025-09-03  8:08 ` Wei Yang
  2025-09-03  8:13   ` David Hildenbrand
@ 2025-09-03  9:06   ` Dev Jain
  2025-09-03  9:15   ` Dev Jain
  2 siblings, 0 replies; 24+ messages in thread
From: Dev Jain @ 2025-09-03  9:06 UTC (permalink / raw)
  To: Wei Yang
  Cc: akpm, david, kas, willy, hughd, ziy, baolin.wang, lorenzo.stoakes,
	Liam.Howlett, npache, ryan.roberts, baohua, linux-mm,
	linux-kernel


On 03/09/25 1:38 pm, Wei Yang wrote:
> On Wed, Sep 03, 2025 at 11:16:34AM +0530, Dev Jain wrote:
>> Currently khugepaged does not collapse a region which does not have a
>> single writable page. This is wasteful since non-writable VMAs mapped by
>> the application won't benefit from THP collapse. Therefore, remove this
>> restriction and allow khugepaged to collapse a VMA with arbitrary
>> protections.
>>
>> Along with this, currently MADV_COLLAPSE does not perform a collapse on a
>> non-writable VMA, and this restriction is nowhere to be found on the
>> manpage - the restriction itself sounds wrong to me since the user knows
>> the protection of the memory it has mapped, so collapsing read-only
>> memory via madvise() should be a choice of the user which shouldn't
>> be overriden by the kernel.
>>
>> On an arm64 machine, an average of 5% improvement is seen on some mmtests
>> benchmarks, particularly hackbench, with a maximum improvement of 12%.
>>
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
> [...]
>> mm/khugepaged.c | 9 ++-------
>> 1 file changed, 2 insertions(+), 7 deletions(-)
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 4ec324a4c1fe..a0f1df2a7ae6 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -676,9 +676,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>> 			writable = true;
>> 	}
>>
>> -	if (unlikely(!writable)) {
>> -		result = SCAN_PAGE_RO;
>> -	} else if (unlikely(cc->is_khugepaged && !referenced)) {
> Would this cause more memory usage in system?
>
> For example, one application would fork itself many times. It executable area
> is read only, so all of them share one copy in memory.
>
> Now we may collapse the range and create one copy for each process.

I forgot to add "anonymous VMAs" in the patch description - for the case you
describe, the VMA will be a shmem or file VMA, which this patch does not affect.

Andrew, could you please change the first line of the patch description from
"Currently khugepaged does not collapse a region" to "Currently khugepaged does not collapse an anonymous region"?
Thanks.
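
For anyone wanting to try the anonymous case from userspace, a minimal
sketch follows. It populates a private anonymous range, drops write
permission with mprotect(), and then requests MADV_COLLAPSE. The helper
name try_collapse_readonly(), the 2 MiB huge page size, and the fallback
definition of MADV_COLLAPSE are assumptions for illustration and are not
part of the patch. Whether the call succeeds depends on the running kernel
and its THP settings; if the range was already faulted in as a huge page,
the call succeeds trivially.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* value from the Linux uapi headers (since 6.1) */
#endif

#define HPAGE_SIZE (2UL << 20) /* assumption: 2 MiB PMD-sized huge pages */

/*
 * Hypothetical helper. Returns 0 if the region was set up and madvise()
 * was attempted, -1 on a setup failure. *collapse_errno receives 0 if
 * madvise() reported success, otherwise the errno it returned.
 */
int try_collapse_readonly(int *collapse_errno)
{
	assert(collapse_errno != NULL);

	/* Over-allocate so a huge-page-aligned window fits inside. */
	size_t len = 2 * HPAGE_SIZE;
	char *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;

	char *aligned = (char *)(((uintptr_t)map + HPAGE_SIZE - 1) &
				 ~(uintptr_t)(HPAGE_SIZE - 1));

	/* Touch every byte so each base page in the window is populated. */
	memset(aligned, 0x5a, HPAGE_SIZE);

	/* Drop write permission: no PTE in the window is writable now. */
	if (mprotect(aligned, HPAGE_SIZE, PROT_READ) != 0) {
		munmap(map, len);
		return -1;
	}

	errno = 0;
	int ret = madvise(aligned, HPAGE_SIZE, MADV_COLLAPSE);
	*collapse_errno = (ret == 0) ? 0 : errno;

	munmap(map, len);
	return 0;
}
```

A caller would inspect *collapse_errno afterwards: 0 means madvise()
reported success, anything else is the errno it returned on this kernel.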

>
> Ok, we have max_ptes_shared, while if some ptes are none, could it still do
> collapse?
>
> Maybe this is not realistic, just curious.
>
>> +	if (unlikely(cc->is_khugepaged && !referenced)) {
>> 		result = SCAN_LACK_REFERENCED_PAGE;
>> 	} else {
>> 		result = SCAN_SUCCEED;
>> @@ -1421,9 +1419,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
>> 		     mmu_notifier_test_young(vma->vm_mm, _address)))
>> 			referenced++;
>> 	}
>> -	if (!writable) {
>> -		result = SCAN_PAGE_RO;
>> -	} else if (cc->is_khugepaged &&
>> +	if (cc->is_khugepaged &&
>> 		   (!referenced ||
>> 		    (unmapped && referenced < HPAGE_PMD_NR / 2))) {
>> 		result = SCAN_LACK_REFERENCED_PAGE;
>> @@ -2830,7 +2826,6 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
>> 		case SCAN_PMD_NULL:
>> 		case SCAN_PTE_NON_PRESENT:
>> 		case SCAN_PTE_UFFD_WP:
>> -		case SCAN_PAGE_RO:
>> 		case SCAN_LACK_REFERENCED_PAGE:
>> 		case SCAN_PAGE_NULL:
>> 		case SCAN_PAGE_COUNT:
>> -- 
>> 2.30.2
>>

* Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
  2025-09-03  8:08 ` Wei Yang
  2025-09-03  8:13   ` David Hildenbrand
  2025-09-03  9:06   ` Dev Jain
@ 2025-09-03  9:15   ` Dev Jain
  2025-09-03  9:18     ` Dev Jain
  2025-09-03 13:11     ` Wei Yang
  2 siblings, 2 replies; 24+ messages in thread
From: Dev Jain @ 2025-09-03  9:15 UTC (permalink / raw)
  To: Wei Yang
  Cc: akpm, david, kas, willy, hughd, ziy, baolin.wang, lorenzo.stoakes,
	Liam.Howlett, npache, ryan.roberts, baohua, linux-mm,
	linux-kernel


On 03/09/25 1:38 pm, Wei Yang wrote:
> On Wed, Sep 03, 2025 at 11:16:34AM +0530, Dev Jain wrote:
>> Currently khugepaged does not collapse a region which does not have a
>> single writable page. This is wasteful since non-writable VMAs mapped by
>> the application won't benefit from THP collapse. Therefore, remove this
>> restriction and allow khugepaged to collapse a VMA with arbitrary
>> protections.
>>
>> Along with this, currently MADV_COLLAPSE does not perform a collapse on a
>> non-writable VMA, and this restriction is nowhere to be found on the
>> manpage - the restriction itself sounds wrong to me since the user knows
>> the protection of the memory it has mapped, so collapsing read-only
>> memory via madvise() should be a choice of the user which shouldn't
>> be overriden by the kernel.
>>
>> On an arm64 machine, an average of 5% improvement is seen on some mmtests
>> benchmarks, particularly hackbench, with a maximum improvement of 12%.
>>
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
> [...]
>> mm/khugepaged.c | 9 ++-------
>> 1 file changed, 2 insertions(+), 7 deletions(-)
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 4ec324a4c1fe..a0f1df2a7ae6 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -676,9 +676,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>> 			writable = true;
>> 	}
>>
>> -	if (unlikely(!writable)) {
>> -		result = SCAN_PAGE_RO;
>> -	} else if (unlikely(cc->is_khugepaged && !referenced)) {
> Would this cause more memory usage in system?
>
> For example, one application would fork itself many times. It executable area
> is read only, so all of them share one copy in memory.
>
> Now we may collapse the range and create one copy for each process.
>
> Ok, we have max_ptes_shared, while if some ptes are none, could it still do
> collapse?
>
> Maybe this is not realistic, just curious.

Misunderstood your concern - you mean to say that a parent forks and the children's
VMAs are read-only, pointing to the pages which were mapped by the parent. Hmm.

>
>> +	if (unlikely(cc->is_khugepaged && !referenced)) {
>> 		result = SCAN_LACK_REFERENCED_PAGE;
>> 	} else {
>> 		result = SCAN_SUCCEED;
>> @@ -1421,9 +1419,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
>> 		     mmu_notifier_test_young(vma->vm_mm, _address)))
>> 			referenced++;
>> 	}
>> -	if (!writable) {
>> -		result = SCAN_PAGE_RO;
>> -	} else if (cc->is_khugepaged &&
>> +	if (cc->is_khugepaged &&
>> 		   (!referenced ||
>> 		    (unmapped && referenced < HPAGE_PMD_NR / 2))) {
>> 		result = SCAN_LACK_REFERENCED_PAGE;
>> @@ -2830,7 +2826,6 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
>> 		case SCAN_PMD_NULL:
>> 		case SCAN_PTE_NON_PRESENT:
>> 		case SCAN_PTE_UFFD_WP:
>> -		case SCAN_PAGE_RO:
>> 		case SCAN_LACK_REFERENCED_PAGE:
>> 		case SCAN_PAGE_NULL:
>> 		case SCAN_PAGE_COUNT:
>> -- 
>> 2.30.2
>>

* Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
  2025-09-03  9:15   ` Dev Jain
@ 2025-09-03  9:18     ` Dev Jain
  2025-09-03  9:22       ` David Hildenbrand
  2025-09-03 13:11     ` Wei Yang
  1 sibling, 1 reply; 24+ messages in thread
From: Dev Jain @ 2025-09-03  9:18 UTC (permalink / raw)
  To: Wei Yang
  Cc: akpm, david, kas, willy, hughd, ziy, baolin.wang, lorenzo.stoakes,
	Liam.Howlett, npache, ryan.roberts, baohua, linux-mm,
	linux-kernel


On 03/09/25 2:45 pm, Dev Jain wrote:
>
> On 03/09/25 1:38 pm, Wei Yang wrote:
>> On Wed, Sep 03, 2025 at 11:16:34AM +0530, Dev Jain wrote:
>>> Currently khugepaged does not collapse a region which does not have a
>>> single writable page. This is wasteful since non-writable VMAs 
>>> mapped by
>>> the application won't benefit from THP collapse. Therefore, remove this
>>> restriction and allow khugepaged to collapse a VMA with arbitrary
>>> protections.
>>>
>>> Along with this, currently MADV_COLLAPSE does not perform a collapse 
>>> on a
>>> non-writable VMA, and this restriction is nowhere to be found on the
>>> manpage - the restriction itself sounds wrong to me since the user 
>>> knows
>>> the protection of the memory it has mapped, so collapsing read-only
>>> memory via madvise() should be a choice of the user which shouldn't
>>> be overriden by the kernel.
>>>
>>> On an arm64 machine, an average of 5% improvement is seen on some 
>>> mmtests
>>> benchmarks, particularly hackbench, with a maximum improvement of 12%.
>>>
>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>> ---
>> [...]
>>> mm/khugepaged.c | 9 ++-------
>>> 1 file changed, 2 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>> index 4ec324a4c1fe..a0f1df2a7ae6 100644
>>> --- a/mm/khugepaged.c
>>> +++ b/mm/khugepaged.c
>>> @@ -676,9 +676,7 @@ static int __collapse_huge_page_isolate(struct 
>>> vm_area_struct *vma,
>>>             writable = true;
>>>     }
>>>
>>> -    if (unlikely(!writable)) {
>>> -        result = SCAN_PAGE_RO;
>>> -    } else if (unlikely(cc->is_khugepaged && !referenced)) {
>> Would this cause more memory usage in system?
>>
>> For example, one application would fork itself many times. It 
>> executable area
>> is read only, so all of them share one copy in memory.
>>
>> Now we may collapse the range and create one copy for each process.
>>
>> Ok, we have max_ptes_shared, while if some ptes are none, could it 
>> still do
>> collapse?
>>
>> Maybe this is not realistic, just curious.
>
> Misunderstood your concern - you mean to say that a parent forks and 
> the children
> VMAs are read-only pointing to the pages which were mapped by parent. 
> Hmm.

I meant to say, writable VMAs with wrprotected ptes. Maybe after this patch,
people can finally make some real use of the max_ptes_shared tunable :)


>
>>
>>> +    if (unlikely(cc->is_khugepaged && !referenced)) {
>>>         result = SCAN_LACK_REFERENCED_PAGE;
>>>     } else {
>>>         result = SCAN_SUCCEED;
>>> @@ -1421,9 +1419,7 @@ static int hpage_collapse_scan_pmd(struct 
>>> mm_struct *mm,
>>>              mmu_notifier_test_young(vma->vm_mm, _address)))
>>>             referenced++;
>>>     }
>>> -    if (!writable) {
>>> -        result = SCAN_PAGE_RO;
>>> -    } else if (cc->is_khugepaged &&
>>> +    if (cc->is_khugepaged &&
>>>            (!referenced ||
>>>             (unmapped && referenced < HPAGE_PMD_NR / 2))) {
>>>         result = SCAN_LACK_REFERENCED_PAGE;
>>> @@ -2830,7 +2826,6 @@ int madvise_collapse(struct vm_area_struct 
>>> *vma, unsigned long start,
>>>         case SCAN_PMD_NULL:
>>>         case SCAN_PTE_NON_PRESENT:
>>>         case SCAN_PTE_UFFD_WP:
>>> -        case SCAN_PAGE_RO:
>>>         case SCAN_LACK_REFERENCED_PAGE:
>>>         case SCAN_PAGE_NULL:
>>>         case SCAN_PAGE_COUNT:
>>> -- 
>>> 2.30.2
>>>
>

* Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
  2025-09-03  9:18     ` Dev Jain
@ 2025-09-03  9:22       ` David Hildenbrand
  2025-09-03 18:25         ` Lorenzo Stoakes
  0 siblings, 1 reply; 24+ messages in thread
From: David Hildenbrand @ 2025-09-03  9:22 UTC (permalink / raw)
  To: Dev Jain, Wei Yang
  Cc: akpm, kas, willy, hughd, ziy, baolin.wang, lorenzo.stoakes,
	Liam.Howlett, npache, ryan.roberts, baohua, linux-mm,
	linux-kernel

On 03.09.25 11:18, Dev Jain wrote:
> 
> On 03/09/25 2:45 pm, Dev Jain wrote:
>>
>> On 03/09/25 1:38 pm, Wei Yang wrote:
>>> On Wed, Sep 03, 2025 at 11:16:34AM +0530, Dev Jain wrote:
>>>> Currently khugepaged does not collapse a region which does not have a
>>>> single writable page. This is wasteful since non-writable VMAs
>>>> mapped by
>>>> the application won't benefit from THP collapse. Therefore, remove this
>>>> restriction and allow khugepaged to collapse a VMA with arbitrary
>>>> protections.
>>>>
>>>> Along with this, currently MADV_COLLAPSE does not perform a collapse
>>>> on a
>>>> non-writable VMA, and this restriction is nowhere to be found on the
>>>> manpage - the restriction itself sounds wrong to me since the user
>>>> knows
>>>> the protection of the memory it has mapped, so collapsing read-only
>>>> memory via madvise() should be a choice of the user which shouldn't
>>>> be overriden by the kernel.
>>>>
>>>> On an arm64 machine, an average of 5% improvement is seen on some
>>>> mmtests
>>>> benchmarks, particularly hackbench, with a maximum improvement of 12%.
>>>>
>>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>>> ---
>>> [...]
>>>> mm/khugepaged.c | 9 ++-------
>>>> 1 file changed, 2 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>>> index 4ec324a4c1fe..a0f1df2a7ae6 100644
>>>> --- a/mm/khugepaged.c
>>>> +++ b/mm/khugepaged.c
>>>> @@ -676,9 +676,7 @@ static int __collapse_huge_page_isolate(struct
>>>> vm_area_struct *vma,
>>>>              writable = true;
>>>>      }
>>>>
>>>> -    if (unlikely(!writable)) {
>>>> -        result = SCAN_PAGE_RO;
>>>> -    } else if (unlikely(cc->is_khugepaged && !referenced)) {
>>> Would this cause more memory usage in system?
>>>
>>> For example, one application would fork itself many times. It
>>> executable area
>>> is read only, so all of them share one copy in memory.
>>>
>>> Now we may collapse the range and create one copy for each process.
>>>
>>> Ok, we have max_ptes_shared, while if some ptes are none, could it
>>> still do
>>> collapse?
>>>
>>> Maybe this is not realistic, just curious.
>>
>> Misunderstood your concern - you mean to say that a parent forks and
>> the children
>> VMAs are read-only pointing to the pages which were mapped by parent.
>> Hmm.
> 
> I meant to say, writable VMAs with wrprotected ptes. Maybe after this
> patch, people
> 
> can finally make some real use of the max_ptes_shared tunable :)

I hope not, because it should be burned with fire, lol :)

-- 
Cheers

David / dhildenb


* Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
  2025-09-03  9:15   ` Dev Jain
  2025-09-03  9:18     ` Dev Jain
@ 2025-09-03 13:11     ` Wei Yang
  1 sibling, 0 replies; 24+ messages in thread
From: Wei Yang @ 2025-09-03 13:11 UTC (permalink / raw)
  To: Dev Jain
  Cc: Wei Yang, akpm, david, kas, willy, hughd, ziy, baolin.wang,
	lorenzo.stoakes, Liam.Howlett, npache, ryan.roberts, baohua,
	linux-mm, linux-kernel

On Wed, Sep 03, 2025 at 02:45:28PM +0530, Dev Jain wrote:
>
>On 03/09/25 1:38 pm, Wei Yang wrote:
>> On Wed, Sep 03, 2025 at 11:16:34AM +0530, Dev Jain wrote:
>> > Currently khugepaged does not collapse a region which does not have a
>> > single writable page. This is wasteful since non-writable VMAs mapped by
>> > the application won't benefit from THP collapse. Therefore, remove this
>> > restriction and allow khugepaged to collapse a VMA with arbitrary
>> > protections.
>> > 
>> > Along with this, currently MADV_COLLAPSE does not perform a collapse on a
>> > non-writable VMA, and this restriction is nowhere to be found on the
>> > manpage - the restriction itself sounds wrong to me since the user knows
>> > the protection of the memory it has mapped, so collapsing read-only
>> > memory via madvise() should be a choice of the user which shouldn't
>> > be overriden by the kernel.
>> > 
>> > On an arm64 machine, an average of 5% improvement is seen on some mmtests
>> > benchmarks, particularly hackbench, with a maximum improvement of 12%.
>> > 
>> > Signed-off-by: Dev Jain <dev.jain@arm.com>
>> > ---
>> [...]
>> > mm/khugepaged.c | 9 ++-------
>> > 1 file changed, 2 insertions(+), 7 deletions(-)
>> > 
>> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> > index 4ec324a4c1fe..a0f1df2a7ae6 100644
>> > --- a/mm/khugepaged.c
>> > +++ b/mm/khugepaged.c
>> > @@ -676,9 +676,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>> > 			writable = true;
>> > 	}
>> > 
>> > -	if (unlikely(!writable)) {
>> > -		result = SCAN_PAGE_RO;
>> > -	} else if (unlikely(cc->is_khugepaged && !referenced)) {
>> Would this cause more memory usage in system?
>> 
>> For example, one application would fork itself many times. It executable area
>> is read only, so all of them share one copy in memory.
>> 
>> Now we may collapse the range and create one copy for each process.
>> 
>> Ok, we have max_ptes_shared, while if some ptes are none, could it still do
>> collapse?
>> 
>> Maybe this is not realistic, just curious.
>
>Misunderstood your concern - you mean to say that a parent forks and the children
>VMAs are read-only pointing to the pages which were mapped by parent. Hmm.
>

This is one of the cases I had in mind, though what I described above is a
file-backed VMA.

Since pages are mapped in both parent and child, we would count shared ptes
during the scan. So max_ptes_shared would decide whether to collapse or not.

As for tuning max_ptes_shared, it is magic to me... Probably there is no
optimal value for all scenarios. And if it does gain much performance after
collapse, maybe it is the application author's responsibility to use hugetlb?
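
The interaction between shared and none PTEs asked about above can be
sketched with a toy model. This is not the kernel code: model_scan(),
struct limits and the pte_state values are hypothetical names,
HPAGE_PMD_NR = 512 assumes 4 KiB base pages with a 2 MiB PMD, and the
model only mirrors the idea that khugepaged (but not MADV_COLLAPSE) gives
up once the count of none or shared PTEs exceeds the matching tunable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HPAGE_PMD_NR 512 /* assumption: 4 KiB base pages, 2 MiB PMD */

/* Hypothetical per-PTE classification used only by this model. */
enum pte_state { PTE_STATE_NONE, PTE_STATE_PRIVATE, PTE_STATE_SHARED };

enum model_result {
	MODEL_SUCCEED,
	MODEL_EXCEED_NONE_PTE,
	MODEL_EXCEED_SHARED_PTE,
};

/* Stand-ins for the khugepaged max_ptes_none / max_ptes_shared tunables. */
struct limits {
	unsigned int max_ptes_none;
	unsigned int max_ptes_shared;
};

enum model_result model_scan(const enum pte_state *ptes, size_t nr,
			     const struct limits *lim, bool is_khugepaged)
{
	unsigned int none = 0, shared = 0;

	assert(ptes != NULL && lim != NULL);

	for (size_t i = 0; i < nr; i++) {
		switch (ptes[i]) {
		case PTE_STATE_NONE:
			/* Limits are only enforced for khugepaged scans. */
			if (++none > lim->max_ptes_none && is_khugepaged)
				return MODEL_EXCEED_NONE_PTE;
			break;
		case PTE_STATE_SHARED:
			if (++shared > lim->max_ptes_shared && is_khugepaged)
				return MODEL_EXCEED_SHARED_PTE;
			break;
		case PTE_STATE_PRIVATE:
			break;
		}
	}
	return MODEL_SUCCEED;
}
```

With example limits of 511 none and 256 shared PTEs, a range holding 100
shared PTEs and 412 none PTEs passes the model's check, so in this model
the answer to "could it still do collapse" is yes, as long as both counts
stay within their limits.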

-- 
Wei Yang
Help you, Help me

* Re: [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO
  2025-09-03  5:46 ` [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO Dev Jain
  2025-09-03  6:53   ` David Hildenbrand
  2025-09-03  9:04   ` Kiryl Shutsemau
@ 2025-09-03 13:26   ` Lorenzo Stoakes
  2025-09-03 14:33     ` David Hildenbrand
  2025-09-03 15:47   ` Zi Yan
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Lorenzo Stoakes @ 2025-09-03 13:26 UTC (permalink / raw)
  To: Dev Jain
  Cc: akpm, david, kas, willy, hughd, ziy, baolin.wang, Liam.Howlett,
	npache, ryan.roberts, baohua, linux-mm, linux-kernel

I know it's a small thing, and subjective, but really would prefer a cover
letter to a 2/2 replying to a 1/2 :P

But others may disagree, just a small trivial plea :>)

Cheers, Lorenzo

* Re: [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO
  2025-09-03 13:26   ` Lorenzo Stoakes
@ 2025-09-03 14:33     ` David Hildenbrand
  0 siblings, 0 replies; 24+ messages in thread
From: David Hildenbrand @ 2025-09-03 14:33 UTC (permalink / raw)
  To: Lorenzo Stoakes, Dev Jain
  Cc: akpm, kas, willy, hughd, ziy, baolin.wang, Liam.Howlett, npache,
	ryan.roberts, baohua, linux-mm, linux-kernel

On 03.09.25 15:26, Lorenzo Stoakes wrote:
> I know it's a small thing, and subjective, but really would prefer a cover
> letter to a 2/2 replying to a 1/2 :P
> 
> But others may disagree, just a small trivial plea :>)

yes, I was assuming that I didn't get CCed on it. :)

Anything with more than one patch should have a cover letter.

-- 
Cheers

David / dhildenb


* Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
  2025-09-03  5:46 [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs Dev Jain
                   ` (3 preceding siblings ...)
  2025-09-03  9:03 ` Kiryl Shutsemau
@ 2025-09-03 15:46 ` Zi Yan
  2025-09-03 20:34 ` Lorenzo Stoakes
  2025-09-04  6:11 ` Baolin Wang
  6 siblings, 0 replies; 24+ messages in thread
From: Zi Yan @ 2025-09-03 15:46 UTC (permalink / raw)
  To: Dev Jain
  Cc: akpm, david, kas, willy, hughd, baolin.wang, lorenzo.stoakes,
	Liam.Howlett, npache, ryan.roberts, baohua, linux-mm,
	linux-kernel

On 3 Sep 2025, at 1:46, Dev Jain wrote:

> Currently khugepaged does not collapse a region which does not have a
> single writable page. This is wasteful since non-writable VMAs mapped by
> the application won't benefit from THP collapse. Therefore, remove this
> restriction and allow khugepaged to collapse a VMA with arbitrary
> protections.
>
> Along with this, currently MADV_COLLAPSE does not perform a collapse on a
> non-writable VMA, and this restriction is nowhere to be found on the
> manpage - the restriction itself sounds wrong to me since the user knows
> the protection of the memory it has mapped, so collapsing read-only
> memory via madvise() should be a choice of the user which shouldn't
> be overriden by the kernel.
>
> On an arm64 machine, an average of 5% improvement is seen on some mmtests
> benchmarks, particularly hackbench, with a maximum improvement of 12%.
>
> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---
> RFC->v1:
> Drop writable references from tracepoints
>
> RFC:
> https://lore.kernel.org/all/20250901074817.73012-1-dev.jain@arm.com/
>
> I can see performance improvements on mmtests run on an arm64 machine
> comparing with 6.17-rc2. (I) denotes statistically significant improvement,
> (R) denotes statistically significant regression (Please ignore the
> numbers in the middle column):
>
> +------------------------------------+----------------------------------------------------------+-----------------------+--------------------------+
> | mmtests/hackbench                  | process-pipes-1 (seconds)                                |                 0.145 |                   -0.06% |
> |                                    | process-pipes-4 (seconds)                                |                0.4335 |                   -0.27% |
> |                                    | process-pipes-7 (seconds)                                |                 0.823 |              (I) -12.13% |
> |                                    | process-pipes-12 (seconds)                               |    1.3538333333333334 |               (I) -5.32% |
> |                                    | process-pipes-21 (seconds)                               |    1.8971666666666664 |               (I) -2.87% |
> |                                    | process-pipes-30 (seconds)                               |    2.5023333333333335 |               (I) -3.39% |
> |                                    | process-pipes-48 (seconds)                               |                3.4305 |               (I) -5.65% |
> |                                    | process-pipes-79 (seconds)                               |     4.245833333333334 |               (I) -6.74% |
> |                                    | process-pipes-110 (seconds)                              |     5.114833333333333 |               (I) -6.26% |
> |                                    | process-pipes-141 (seconds)                              |                6.1885 |               (I) -4.99% |
> |                                    | process-pipes-172 (seconds)                              |     7.231833333333334 |               (I) -4.45% |
> |                                    | process-pipes-203 (seconds)                              |     8.393166666666668 |               (I) -3.65% |
> |                                    | process-pipes-234 (seconds)                              |     9.487499999999999 |               (I) -3.45% |
> |                                    | process-pipes-256 (seconds)                              |    10.316166666666666 |               (I) -3.47% |
> |                                    | process-sockets-1 (seconds)                              |                 0.289 |                    2.13% |
> |                                    | process-sockets-4 (seconds)                              |    0.7596666666666666 |                    1.02% |
> |                                    | process-sockets-7 (seconds)                              |    1.1663333333333334 |                   -0.26% |
> |                                    | process-sockets-12 (seconds)                             |    1.8641666666666665 |                   -1.24% |
> |                                    | process-sockets-21 (seconds)                             |    3.0773333333333333 |                    0.01% |
> |                                    | process-sockets-30 (seconds)                             |                4.2405 |                   -0.15% |
> |                                    | process-sockets-48 (seconds)                             |     6.459666666666666 |                    0.15% |
> |                                    | process-sockets-79 (seconds)                             |    10.156833333333333 |                    1.45% |
> |                                    | process-sockets-110 (seconds)                            |    14.317833333333333 |                   -1.64% |
> |                                    | process-sockets-141 (seconds)                            |               20.8735 |               (I) -4.27% |
> |                                    | process-sockets-172 (seconds)                            |    26.205333333333332 |                    0.30% |
> |                                    | process-sockets-203 (seconds)                            |    31.298000000000002 |                   -1.71% |
> |                                    | process-sockets-234 (seconds)                            |    36.104000000000006 |                   -1.94% |
> |                                    | process-sockets-256 (seconds)                            |     39.44016666666667 |                   -0.71% |
> |                                    | thread-pipes-1 (seconds)                                 |   0.17550000000000002 |                    0.66% |
> |                                    | thread-pipes-4 (seconds)                                 |   0.44716666666666666 |                    1.66% |
> |                                    | thread-pipes-7 (seconds)                                 |                0.7345 |                   -0.17% |
> |                                    | thread-pipes-12 (seconds)                                |     1.405833333333333 |               (I) -4.12% |
> |                                    | thread-pipes-21 (seconds)                                |    2.0113333333333334 |               (I) -2.13% |
> |                                    | thread-pipes-30 (seconds)                                |    2.6648333333333336 |               (I) -3.78% |
> |                                    | thread-pipes-48 (seconds)                                |    3.6341666666666668 |               (I) -5.77% |
> |                                    | thread-pipes-79 (seconds)                                |                4.4085 |               (I) -5.31% |
> |                                    | thread-pipes-110 (seconds)                               |     5.374666666666666 |               (I) -6.12% |
> |                                    | thread-pipes-141 (seconds)                               |     6.385666666666666 |               (I) -4.00% |
> |                                    | thread-pipes-172 (seconds)                               |     7.403000000000001 |               (I) -3.01% |
> |                                    | thread-pipes-203 (seconds)                               |     8.570333333333332 |               (I) -2.62% |
> |                                    | thread-pipes-234 (seconds)                               |     9.719166666666666 |               (I) -2.00% |
> |                                    | thread-pipes-256 (seconds)                               |    10.552833333333334 |               (I) -2.30% |
> |                                    | thread-sockets-1 (seconds)                               |                0.3065 |                (R) 2.39% |
> +------------------------------------+----------------------------------------------------------+-----------------------+--------------------------+
>
> +------------------------------------+----------------------------------------------------------+-----------------------+--------------------------+
> | mmtests/sysbench-mutex             | sysbenchmutex-1 (usec)                                   |    194.38333333333333 |                   -0.02% |
> |                                    | sysbenchmutex-4 (usec)                                   |               200.875 |                   -0.02% |
> |                                    | sysbenchmutex-7 (usec)                                   |    201.23000000000002 |                    0.00% |
> |                                    | sysbenchmutex-12 (usec)                                  |    201.77666666666664 |                    0.12% |
> |                                    | sysbenchmutex-21 (usec)                                  |                203.03 |                   -0.40% |
> |                                    | sysbenchmutex-30 (usec)                                  |               203.285 |                    0.08% |
> |                                    | sysbenchmutex-48 (usec)                                  |    231.30000000000004 |                    2.59% |
> |                                    | sysbenchmutex-79 (usec)                                  |               362.075 |                   -0.80% |
> |                                    | sysbenchmutex-110 (usec)                                 |     516.8233333333334 |                   -3.87% |
> |                                    | sysbenchmutex-128 (usec)                                 |     593.3533333333334 |               (I) -4.46% |
> +------------------------------------+----------------------------------------------------------+-----------------------+--------------------------+
>
>  mm/khugepaged.c | 9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)
>

Acked-by: Zi Yan <ziy@nvidia.com>

Best Regards,
Yan, Zi

* Re: [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO
  2025-09-03  5:46 ` [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO Dev Jain
                     ` (2 preceding siblings ...)
  2025-09-03 13:26   ` Lorenzo Stoakes
@ 2025-09-03 15:47   ` Zi Yan
  2025-09-03 20:35   ` Lorenzo Stoakes
  2025-09-04  6:12   ` Baolin Wang
  5 siblings, 0 replies; 24+ messages in thread
From: Zi Yan @ 2025-09-03 15:47 UTC (permalink / raw)
  To: Dev Jain
  Cc: akpm, david, kas, willy, hughd, baolin.wang, lorenzo.stoakes,
	Liam.Howlett, npache, ryan.roberts, baohua, linux-mm,
	linux-kernel

On 3 Sep 2025, at 1:46, Dev Jain wrote:

> Now that all actionable outcomes from checking pte_write() are gone,
> drop the related references.
>
> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---
>  include/trace/events/huge_memory.h | 19 ++++++-------------
>  mm/khugepaged.c                    | 14 +++-----------
>  2 files changed, 9 insertions(+), 24 deletions(-)
>

Acked-by: Zi Yan <ziy@nvidia.com>

Best Regards,
Yan, Zi

* Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
  2025-09-03  9:22       ` David Hildenbrand
@ 2025-09-03 18:25         ` Lorenzo Stoakes
  2025-09-04  3:56           ` Dev Jain
  0 siblings, 1 reply; 24+ messages in thread
From: Lorenzo Stoakes @ 2025-09-03 18:25 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Dev Jain, Wei Yang, akpm, kas, willy, hughd, ziy, baolin.wang,
	Liam.Howlett, npache, ryan.roberts, baohua, linux-mm,
	linux-kernel

On Wed, Sep 03, 2025 at 11:22:09AM +0200, David Hildenbrand wrote:
> On 03.09.25 11:18, Dev Jain wrote:
> > I meant to say, writable VMAs with wrprotected ptes. Maybe after this
> > patch, people

So, I think you really need to update the commit message here to reflect
that!

And of course, do a cover letter :P

The test results would probably work better as a cover letter as will be
put in the first commit by Andrew in any case.

> >
> > can finally make some real use of the max_ptes_shared tunable :)
>
> I hope not, because it should be burned with fire, lol :)

+1 :)

>
> --
> Cheers
>
> David / dhildenb
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
  2025-09-03  5:46 [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs Dev Jain
                   ` (4 preceding siblings ...)
  2025-09-03 15:46 ` Zi Yan
@ 2025-09-03 20:34 ` Lorenzo Stoakes
  2025-09-04  6:11 ` Baolin Wang
  6 siblings, 0 replies; 24+ messages in thread
From: Lorenzo Stoakes @ 2025-09-03 20:34 UTC (permalink / raw)
  To: Dev Jain
  Cc: akpm, david, kas, willy, hughd, ziy, baolin.wang, Liam.Howlett,
	npache, ryan.roberts, baohua, linux-mm, linux-kernel

On Wed, Sep 03, 2025 at 11:16:34AM +0530, Dev Jain wrote:
> Currently khugepaged does not collapse a region which does not have a
> single writable page. This is wasteful since non-writable VMAs mapped by

As discussed elsewhere in the thread, you really need to clarify that you
mean the PTE is writable. This is far too vague otherwise.

> the application won't benefit from THP collapse. Therefore, remove this
> restriction and allow khugepaged to collapse a VMA with arbitrary
> protections.

The history of this is weird: it looks like we were super conservative
at first, and then introduced this 'at least one PTE writable' thing in
commit 10359213d05a ("mm: incorporate read-only pages into transparent huge
pages"), but that commit doesn't really explain why you even need (at
least) one writable page.

Perhaps a pre-PAE thing... (David?) we already do the refcount stuff
though, so it's hard to understand.

It seems the main case for anon where it'd matter is swapped in pages
read-faulting for a R/W mapping (as read-faulting R/W mappings would just
get you the zero page which vm_normal_page() would exclude anyway).

But not sure why we'd be reticent to collapse those anyway... you'd just
change the R/W bit on the PMD instead of the PTE?

Yeah it's bizarre.

I can't really see why your change shouldn't be done...


>
> Along with this, currently MADV_COLLAPSE does not perform a collapse on a
> non-writable VMA, and this restriction is nowhere to be found on the

> manpage - the restriction itself sounds wrong to me since the user knows

I'm not sure why a man page would talk about PTE scanning implementation
details?

But I guess, as you say, you're thinking specifically of a read-only VMA that
naturally has read-only PTEs as a result...

> the protection of the memory it has mapped, so collapsing read-only
> memory via madvise() should be a choice of the user which shouldn't
> be overriden by the kernel.

NIT: overriden -> overridden.
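
For illustration, the userspace case described in the quoted paragraph can
be sketched as below. This is a minimal sketch, not taken from the patch: the
helper name collapse_readonly_region() and the SKETCH_PMD_SIZE constant are
hypothetical, and a 2 MiB PMD size is assumed. On a kernel that still has the
writable check, the madvise() call is expected to be refused for such a
region; with the check removed it may succeed, subject to the remaining scan
conditions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* value from the Linux UAPI headers (Linux 6.1+) */
#endif

/* Assumption for this sketch: 2 MiB PMD size (e.g. 4 KiB base pages). */
#define SKETCH_PMD_SIZE (2UL << 20)

/*
 * Populate one PMD-aligned anonymous region, make it read-only, then
 * request a collapse. Returns 0 if madvise() succeeded, the positive
 * errno value if madvise() failed, or -1 if the setup itself failed.
 */
static int collapse_readonly_region(void)
{
	size_t len = 2 * SKETCH_PMD_SIZE; /* over-allocate so we can align */
	char *raw = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return -1;

	/* Round up to the next PMD boundary inside the mapping. */
	uintptr_t aligned = ((uintptr_t)raw + SKETCH_PMD_SIZE - 1) &
			    ~(uintptr_t)(SKETCH_PMD_SIZE - 1);
	char *buf = (char *)aligned;
	int ret = 0;

	/* Touch every page so the region is backed by real memory. */
	memset(buf, 1, SKETCH_PMD_SIZE);

	/* Drop write permission: the PTEs are now write-protected. */
	if (mprotect(buf, SKETCH_PMD_SIZE, PROT_READ) != 0) {
		munmap(raw, len);
		return -1;
	}

	if (madvise(buf, SKETCH_PMD_SIZE, MADV_COLLAPSE) != 0)
		ret = errno;

	munmap(raw, len);
	return ret;
}
```

If THP is set to "always", the memset() may already fault in a huge page, in
which case the collapse request has nothing left to do; the sketch does not
try to distinguish that case.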

>
> On an arm64 machine, an average of 5% improvement is seen on some mmtests
> benchmarks, particularly hackbench, with a maximum improvement of 12%.

Nice!

Is this on a bare metal machine, or a VM? I think it's important to clarify
details like this.

Please state precisely what you tested this on.

>
> Signed-off-by: Dev Jain <dev.jain@arm.com>

Can't find any problem with this, and doesn't really seem like it'd be
problematic so:

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

> ---
> RFC->v1:
> Drop writable references from tracepoints
>
> RFC:
> https://lore.kernel.org/all/20250901074817.73012-1-dev.jain@arm.com/
>
> I can see performance improvements on mmtests run on an arm64 machine
> comparing with 6.17-rc2. (I) denotes statistically significant improvement,
> (R) denotes statistically significant regression (Please ignore the
> numbers in the middle column):

Let's drop the numbers in the middle column then please. This is going into the
commit log, so let's not put extraneous information there.

>
> +------------------------------------+----------------------------------------------------------+-----------------------+--------------------------+
> | mmtests/hackbench                  | process-pipes-1 (seconds)                                |                 0.145 |                   -0.06% |
> |                                    | process-pipes-4 (seconds)                                |                0.4335 |                   -0.27% |
> |                                    | process-pipes-7 (seconds)                                |                 0.823 |              (I) -12.13% |
> |                                    | process-pipes-12 (seconds)                               |    1.3538333333333334 |               (I) -5.32% |
> |                                    | process-pipes-21 (seconds)                               |    1.8971666666666664 |               (I) -2.87% |
> |                                    | process-pipes-30 (seconds)                               |    2.5023333333333335 |               (I) -3.39% |
> |                                    | process-pipes-48 (seconds)                               |                3.4305 |               (I) -5.65% |
> |                                    | process-pipes-79 (seconds)                               |     4.245833333333334 |               (I) -6.74% |
> |                                    | process-pipes-110 (seconds)                              |     5.114833333333333 |               (I) -6.26% |
> |                                    | process-pipes-141 (seconds)                              |                6.1885 |               (I) -4.99% |
> |                                    | process-pipes-172 (seconds)                              |     7.231833333333334 |               (I) -4.45% |
> |                                    | process-pipes-203 (seconds)                              |     8.393166666666668 |               (I) -3.65% |
> |                                    | process-pipes-234 (seconds)                              |     9.487499999999999 |               (I) -3.45% |
> |                                    | process-pipes-256 (seconds)                              |    10.316166666666666 |               (I) -3.47% |
> |                                    | process-sockets-1 (seconds)                              |                 0.289 |                    2.13% |
> |                                    | process-sockets-4 (seconds)                              |    0.7596666666666666 |                    1.02% |
> |                                    | process-sockets-7 (seconds)                              |    1.1663333333333334 |                   -0.26% |
> |                                    | process-sockets-12 (seconds)                             |    1.8641666666666665 |                   -1.24% |
> |                                    | process-sockets-21 (seconds)                             |    3.0773333333333333 |                    0.01% |
> |                                    | process-sockets-30 (seconds)                             |                4.2405 |                   -0.15% |
> |                                    | process-sockets-48 (seconds)                             |     6.459666666666666 |                    0.15% |
> |                                    | process-sockets-79 (seconds)                             |    10.156833333333333 |                    1.45% |
> |                                    | process-sockets-110 (seconds)                            |    14.317833333333333 |                   -1.64% |
> |                                    | process-sockets-141 (seconds)                            |               20.8735 |               (I) -4.27% |
> |                                    | process-sockets-172 (seconds)                            |    26.205333333333332 |                    0.30% |
> |                                    | process-sockets-203 (seconds)                            |    31.298000000000002 |                   -1.71% |
> |                                    | process-sockets-234 (seconds)                            |    36.104000000000006 |                   -1.94% |
> |                                    | process-sockets-256 (seconds)                            |     39.44016666666667 |                   -0.71% |
> |                                    | thread-pipes-1 (seconds)                                 |   0.17550000000000002 |                    0.66% |
> |                                    | thread-pipes-4 (seconds)                                 |   0.44716666666666666 |                    1.66% |
> |                                    | thread-pipes-7 (seconds)                                 |                0.7345 |                   -0.17% |
> |                                    | thread-pipes-12 (seconds)                                |     1.405833333333333 |               (I) -4.12% |
> |                                    | thread-pipes-21 (seconds)                                |    2.0113333333333334 |               (I) -2.13% |
> |                                    | thread-pipes-30 (seconds)                                |    2.6648333333333336 |               (I) -3.78% |
> |                                    | thread-pipes-48 (seconds)                                |    3.6341666666666668 |               (I) -5.77% |
> |                                    | thread-pipes-79 (seconds)                                |                4.4085 |               (I) -5.31% |
> |                                    | thread-pipes-110 (seconds)                               |     5.374666666666666 |               (I) -6.12% |
> |                                    | thread-pipes-141 (seconds)                               |     6.385666666666666 |               (I) -4.00% |
> |                                    | thread-pipes-172 (seconds)                               |     7.403000000000001 |               (I) -3.01% |
> |                                    | thread-pipes-203 (seconds)                               |     8.570333333333332 |               (I) -2.62% |
> |                                    | thread-pipes-234 (seconds)                               |     9.719166666666666 |               (I) -2.00% |
> |                                    | thread-pipes-256 (seconds)                               |    10.552833333333334 |               (I) -2.30% |
> |                                    | thread-sockets-1 (seconds)                               |                0.3065 |                (R) 2.39% |
> +------------------------------------+----------------------------------------------------------+-----------------------+--------------------------+
>
> +------------------------------------+----------------------------------------------------------+-----------------------+--------------------------+
> | mmtests/sysbench-mutex             | sysbenchmutex-1 (usec)                                   |    194.38333333333333 |                   -0.02% |
> |                                    | sysbenchmutex-4 (usec)                                   |               200.875 |                   -0.02% |
> |                                    | sysbenchmutex-7 (usec)                                   |    201.23000000000002 |                    0.00% |
> |                                    | sysbenchmutex-12 (usec)                                  |    201.77666666666664 |                    0.12% |
> |                                    | sysbenchmutex-21 (usec)                                  |                203.03 |                   -0.40% |
> |                                    | sysbenchmutex-30 (usec)                                  |               203.285 |                    0.08% |
> |                                    | sysbenchmutex-48 (usec)                                  |    231.30000000000004 |                    2.59% |
> |                                    | sysbenchmutex-79 (usec)                                  |               362.075 |                   -0.80% |
> |                                    | sysbenchmutex-110 (usec)                                 |     516.8233333333334 |                   -3.87% |
> |                                    | sysbenchmutex-128 (usec)                                 |     593.3533333333334 |               (I) -4.46% |
> +------------------------------------+----------------------------------------------------------+-----------------------+--------------------------+

This is nice, but is clearly hugely exceeding the column width we should have in commit messages.

Let me use emacs's nice features to make life easy for you :) -

+-------------------------+--------------------------------+---------------+
| mmtests/hackbench       | process-pipes-1 (seconds)      |        -0.06% |
|                         | process-pipes-4 (seconds)      |        -0.27% |
|                         | process-pipes-7 (seconds)      |   (I) -12.13% |
|                         | process-pipes-12 (seconds)     |    (I) -5.32% |
|                         | process-pipes-21 (seconds)     |    (I) -2.87% |
|                         | process-pipes-30 (seconds)     |    (I) -3.39% |
|                         | process-pipes-48 (seconds)     |    (I) -5.65% |
|                         | process-pipes-79 (seconds)     |    (I) -6.74% |
|                         | process-pipes-110 (seconds)    |    (I) -6.26% |
|                         | process-pipes-141 (seconds)    |    (I) -4.99% |
|                         | process-pipes-172 (seconds)    |    (I) -4.45% |
|                         | process-pipes-203 (seconds)    |    (I) -3.65% |
|                         | process-pipes-234 (seconds)    |    (I) -3.45% |
|                         | process-pipes-256 (seconds)    |    (I) -3.47% |
|                         | process-sockets-1 (seconds)    |         2.13% |
|                         | process-sockets-4 (seconds)    |         1.02% |
|                         | process-sockets-7 (seconds)    |        -0.26% |
|                         | process-sockets-12 (seconds)   |        -1.24% |
|                         | process-sockets-21 (seconds)   |         0.01% |
|                         | process-sockets-30 (seconds)   |        -0.15% |
|                         | process-sockets-48 (seconds)   |         0.15% |
|                         | process-sockets-79 (seconds)   |         1.45% |
|                         | process-sockets-110 (seconds)  |        -1.64% |
|                         | process-sockets-141 (seconds)  |    (I) -4.27% |
|                         | process-sockets-172 (seconds)  |         0.30% |
|                         | process-sockets-203 (seconds)  |        -1.71% |
|                         | process-sockets-234 (seconds)  |        -1.94% |
|                         | process-sockets-256 (seconds)  |        -0.71% |
|                         | thread-pipes-1 (seconds)       |         0.66% |
|                         | thread-pipes-4 (seconds)       |         1.66% |
|                         | thread-pipes-7 (seconds)       |        -0.17% |
|                         | thread-pipes-12 (seconds)      |    (I) -4.12% |
|                         | thread-pipes-21 (seconds)      |    (I) -2.13% |
|                         | thread-pipes-30 (seconds)      |    (I) -3.78% |
|                         | thread-pipes-48 (seconds)      |    (I) -5.77% |
|                         | thread-pipes-79 (seconds)      |    (I) -5.31% |
|                         | thread-pipes-110 (seconds)     |    (I) -6.12% |
|                         | thread-pipes-141 (seconds)     |    (I) -4.00% |
|                         | thread-pipes-172 (seconds)     |    (I) -3.01% |
|                         | thread-pipes-203 (seconds)     |    (I) -2.62% |
|                         | thread-pipes-234 (seconds)     |    (I) -2.00% |
|                         | thread-pipes-256 (seconds)     |    (I) -2.30% |
|                         | thread-sockets-1 (seconds)     |     (R) 2.39% |
+-------------------------+--------------------------------+---------------+

+-------------------------+--------------------------------+---------------+
| mmtests/sysbench-mutex  | sysbenchmutex-1 (usec)         |        -0.02% |
|                         | sysbenchmutex-4 (usec)         |        -0.02% |
|                         | sysbenchmutex-7 (usec)         |         0.00% |
|                         | sysbenchmutex-12 (usec)        |         0.12% |
|                         | sysbenchmutex-21 (usec)        |        -0.40% |
|                         | sysbenchmutex-30 (usec)        |         0.08% |
|                         | sysbenchmutex-48 (usec)        |         2.59% |
|                         | sysbenchmutex-79 (usec)        |        -0.80% |
|                         | sysbenchmutex-110 (usec)       |        -3.87% |
|                         | sysbenchmutex-128 (usec)       |    (I) -4.46% |
+-------------------------+--------------------------------+---------------+


>
>  mm/khugepaged.c | 9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 4ec324a4c1fe..a0f1df2a7ae6 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -676,9 +676,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  			writable = true;
>  	}
>
> -	if (unlikely(!writable)) {
> -		result = SCAN_PAGE_RO;
> -	} else if (unlikely(cc->is_khugepaged && !referenced)) {
> +	if (unlikely(cc->is_khugepaged && !referenced)) {
>  		result = SCAN_LACK_REFERENCED_PAGE;
>  	} else {
>  		result = SCAN_SUCCEED;
> @@ -1421,9 +1419,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
>  		     mmu_notifier_test_young(vma->vm_mm, _address)))
>  			referenced++;
>  	}
> -	if (!writable) {
> -		result = SCAN_PAGE_RO;
> -	} else if (cc->is_khugepaged &&
> +	if (cc->is_khugepaged &&
>  		   (!referenced ||
>  		    (unmapped && referenced < HPAGE_PMD_NR / 2))) {
>  		result = SCAN_LACK_REFERENCED_PAGE;
> @@ -2830,7 +2826,6 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
>  		case SCAN_PMD_NULL:
>  		case SCAN_PTE_NON_PRESENT:
>  		case SCAN_PTE_UFFD_WP:
> -		case SCAN_PAGE_RO:
>  		case SCAN_LACK_REFERENCED_PAGE:
>  		case SCAN_PAGE_NULL:
>  		case SCAN_PAGE_COUNT:
> --
> 2.30.2
>

I guess you delay the final cleanup so you can combine it with tracepoint
removal in next patch, not really sure why they're separate but meh not a
big deal.
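
For illustration only, the effect of the diff above on the final scan
decision can be modelled in plain userspace C. This is a simplified sketch
and not kernel code: the model_* names are hypothetical stand-ins, the
require_writable flag selects between the pre-patch and post-patch
behaviour, and the extra "unmapped" condition in hpage_collapse_scan_pmd()
is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's scan results (not the real enum). */
enum model_scan_result {
	MODEL_SCAN_SUCCEED,
	MODEL_SCAN_PAGE_RO,
	MODEL_SCAN_LACK_REFERENCED_PAGE,
};

/* Hypothetical per-PTE summary; the kernel inspects real pte_t values. */
struct model_pte {
	bool writable;   /* models pte_write() */
	bool referenced; /* models the young/referenced checks */
};

/*
 * Model of the decision made after the PTE loop.
 * require_writable == true  models the behaviour before the patch,
 * require_writable == false models the behaviour after it.
 */
static enum model_scan_result
model_scan(const struct model_pte *ptes, size_t nr, bool is_khugepaged,
	   bool require_writable)
{
	bool writable = false;
	int referenced = 0;

	for (size_t i = 0; i < nr; i++) {
		if (ptes[i].writable)
			writable = true;
		if (ptes[i].referenced)
			referenced++;
	}

	/* Pre-patch only: refuse a range with no writable PTE at all. */
	if (require_writable && !writable)
		return MODEL_SCAN_PAGE_RO;
	/* khugepaged still wants at least one referenced page. */
	if (is_khugepaged && !referenced)
		return MODEL_SCAN_LACK_REFERENCED_PAGE;
	return MODEL_SCAN_SUCCEED;
}
```

In this model a range of read-only but referenced PTEs yields
MODEL_SCAN_PAGE_RO with require_writable set and MODEL_SCAN_SUCCEED without
it, which is the behavioural change the patch is after.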

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO
  2025-09-03  5:46 ` [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO Dev Jain
                     ` (3 preceding siblings ...)
  2025-09-03 15:47   ` Zi Yan
@ 2025-09-03 20:35   ` Lorenzo Stoakes
  2025-09-04  6:12   ` Baolin Wang
  5 siblings, 0 replies; 24+ messages in thread
From: Lorenzo Stoakes @ 2025-09-03 20:35 UTC (permalink / raw)
  To: Dev Jain
  Cc: akpm, david, kas, willy, hughd, ziy, baolin.wang, Liam.Howlett,
	npache, ryan.roberts, baohua, linux-mm, linux-kernel

On Wed, Sep 03, 2025 at 11:16:35AM +0530, Dev Jain wrote:
> Now that all actionable outcomes from checking pte_write() are gone,
> drop the related references.
>
> Signed-off-by: Dev Jain <dev.jain@arm.com>

LGTM so:

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

> ---
>  include/trace/events/huge_memory.h | 19 ++++++-------------
>  mm/khugepaged.c                    | 14 +++-----------
>  2 files changed, 9 insertions(+), 24 deletions(-)
>
> diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
> index 2305df6cb485..dd94d14a2427 100644
> --- a/include/trace/events/huge_memory.h
> +++ b/include/trace/events/huge_memory.h
> @@ -19,7 +19,6 @@
>  	EM( SCAN_PTE_NON_PRESENT,	"pte_non_present")		\
>  	EM( SCAN_PTE_UFFD_WP,		"pte_uffd_wp")			\
>  	EM( SCAN_PTE_MAPPED_HUGEPAGE,	"pte_mapped_hugepage")		\
> -	EM( SCAN_PAGE_RO,		"no_writable_page")		\
>  	EM( SCAN_LACK_REFERENCED_PAGE,	"lack_referenced_page")		\
>  	EM( SCAN_PAGE_NULL,		"page_null")			\
>  	EM( SCAN_SCAN_ABORT,		"scan_aborted")			\
> @@ -55,15 +54,14 @@ SCAN_STATUS
>
>  TRACE_EVENT(mm_khugepaged_scan_pmd,
>
> -	TP_PROTO(struct mm_struct *mm, struct folio *folio, bool writable,
> +	TP_PROTO(struct mm_struct *mm, struct folio *folio,
>  		 int referenced, int none_or_zero, int status, int unmapped),
>
> -	TP_ARGS(mm, folio, writable, referenced, none_or_zero, status, unmapped),
> +	TP_ARGS(mm, folio, referenced, none_or_zero, status, unmapped),
>
>  	TP_STRUCT__entry(
>  		__field(struct mm_struct *, mm)
>  		__field(unsigned long, pfn)
> -		__field(bool, writable)
>  		__field(int, referenced)
>  		__field(int, none_or_zero)
>  		__field(int, status)
> @@ -73,17 +71,15 @@ TRACE_EVENT(mm_khugepaged_scan_pmd,
>  	TP_fast_assign(
>  		__entry->mm = mm;
>  		__entry->pfn = folio ? folio_pfn(folio) : -1;
> -		__entry->writable = writable;
>  		__entry->referenced = referenced;
>  		__entry->none_or_zero = none_or_zero;
>  		__entry->status = status;
>  		__entry->unmapped = unmapped;
>  	),
>
> -	TP_printk("mm=%p, scan_pfn=0x%lx, writable=%d, referenced=%d, none_or_zero=%d, status=%s, unmapped=%d",
> +	TP_printk("mm=%p, scan_pfn=0x%lx, referenced=%d, none_or_zero=%d, status=%s, unmapped=%d",
>  		__entry->mm,
>  		__entry->pfn,
> -		__entry->writable,
>  		__entry->referenced,
>  		__entry->none_or_zero,
>  		__print_symbolic(__entry->status, SCAN_STATUS),
> @@ -117,15 +113,14 @@ TRACE_EVENT(mm_collapse_huge_page,
>  TRACE_EVENT(mm_collapse_huge_page_isolate,
>
>  	TP_PROTO(struct folio *folio, int none_or_zero,
> -		 int referenced, bool  writable, int status),
> +		 int referenced, int status),
>
> -	TP_ARGS(folio, none_or_zero, referenced, writable, status),
> +	TP_ARGS(folio, none_or_zero, referenced, status),
>
>  	TP_STRUCT__entry(
>  		__field(unsigned long, pfn)
>  		__field(int, none_or_zero)
>  		__field(int, referenced)
> -		__field(bool, writable)
>  		__field(int, status)
>  	),
>
> @@ -133,15 +128,13 @@ TRACE_EVENT(mm_collapse_huge_page_isolate,
>  		__entry->pfn = folio ? folio_pfn(folio) : -1;
>  		__entry->none_or_zero = none_or_zero;
>  		__entry->referenced = referenced;
> -		__entry->writable = writable;
>  		__entry->status = status;
>  	),
>
> -	TP_printk("scan_pfn=0x%lx, none_or_zero=%d, referenced=%d, writable=%d, status=%s",
> +	TP_printk("scan_pfn=0x%lx, none_or_zero=%d, referenced=%d, status=%s",
>  		__entry->pfn,
>  		__entry->none_or_zero,
>  		__entry->referenced,
> -		__entry->writable,
>  		__print_symbolic(__entry->status, SCAN_STATUS))
>  );
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index a0f1df2a7ae6..af5f5c80fe4e 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -39,7 +39,6 @@ enum scan_result {
>  	SCAN_PTE_NON_PRESENT,
>  	SCAN_PTE_UFFD_WP,
>  	SCAN_PTE_MAPPED_HUGEPAGE,
> -	SCAN_PAGE_RO,
>  	SCAN_LACK_REFERENCED_PAGE,
>  	SCAN_PAGE_NULL,
>  	SCAN_SCAN_ABORT,
> @@ -557,7 +556,6 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  	struct folio *folio = NULL;
>  	pte_t *_pte;
>  	int none_or_zero = 0, shared = 0, result = SCAN_FAIL, referenced = 0;
> -	bool writable = false;
>
>  	for (_pte = pte; _pte < pte + HPAGE_PMD_NR;
>  	     _pte++, address += PAGE_SIZE) {
> @@ -671,9 +669,6 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  		     folio_test_referenced(folio) || mmu_notifier_test_young(vma->vm_mm,
>  								     address)))
>  			referenced++;
> -
> -		if (pte_write(pteval))
> -			writable = true;
>  	}
>
>  	if (unlikely(cc->is_khugepaged && !referenced)) {
> @@ -681,13 +676,13 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  	} else {
>  		result = SCAN_SUCCEED;
>  		trace_mm_collapse_huge_page_isolate(folio, none_or_zero,
> -						    referenced, writable, result);
> +						    referenced, result);
>  		return result;
>  	}
>  out:
>  	release_pte_pages(pte, _pte, compound_pagelist);
>  	trace_mm_collapse_huge_page_isolate(folio, none_or_zero,
> -					    referenced, writable, result);
> +					    referenced, result);
>  	return result;
>  }
>
> @@ -1280,7 +1275,6 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
>  	unsigned long _address;
>  	spinlock_t *ptl;
>  	int node = NUMA_NO_NODE, unmapped = 0;
> -	bool writable = false;
>
>  	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>
> @@ -1344,8 +1338,6 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
>  			result = SCAN_PTE_UFFD_WP;
>  			goto out_unmap;
>  		}
> -		if (pte_write(pteval))
> -			writable = true;
>
>  		page = vm_normal_page(vma, _address, pteval);
>  		if (unlikely(!page) || unlikely(is_zone_device_page(page))) {
> @@ -1435,7 +1427,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
>  		*mmap_locked = false;
>  	}
>  out:
> -	trace_mm_khugepaged_scan_pmd(mm, folio, writable, referenced,
> +	trace_mm_khugepaged_scan_pmd(mm, folio, referenced,
>  				     none_or_zero, result, unmapped);
>  	return result;
>  }
> --
> 2.30.2
>
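
Since the patch above drops the writable= field from the
mm_khugepaged_scan_pmd TP_printk() format, userspace that parses this
event's text output would need to follow. A minimal sketch of a parser for
the new field list is shown below; struct scan_pmd_event and
parse_scan_pmd() are hypothetical names, and the sketch assumes the field
text appears in a trace line exactly as in the new format string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields printed after the patch. */
struct scan_pmd_event {
	unsigned long pfn;
	int referenced;
	int none_or_zero;
	char status[32];
	int unmapped;
};

/*
 * Parse the post-patch field list, starting at "scan_pfn=".
 * Returns 0 on success, or -1 if the line does not match, which
 * includes lines in the old format that still carry "writable=".
 */
static int parse_scan_pmd(const char *line, struct scan_pmd_event *ev)
{
	const char *p = strstr(line, "scan_pfn=");
	int n;

	if (!p)
		return -1;

	n = sscanf(p,
		   "scan_pfn=0x%lx, referenced=%d, none_or_zero=%d, "
		   "status=%31[^,], unmapped=%d",
		   &ev->pfn, &ev->referenced, &ev->none_or_zero,
		   ev->status, &ev->unmapped);

	return n == 5 ? 0 : -1;
}
```

A line in the old format stops matching at the writable= field, so the
function reports -1 instead of silently misreading the remaining fields.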

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
  2025-09-03 18:25         ` Lorenzo Stoakes
@ 2025-09-04  3:56           ` Dev Jain
  0 siblings, 0 replies; 24+ messages in thread
From: Dev Jain @ 2025-09-04  3:56 UTC (permalink / raw)
  To: Lorenzo Stoakes, David Hildenbrand
  Cc: Wei Yang, akpm, kas, willy, hughd, ziy, baolin.wang, Liam.Howlett,
	npache, ryan.roberts, baohua, linux-mm, linux-kernel


On 03/09/25 11:55 pm, Lorenzo Stoakes wrote:
> On Wed, Sep 03, 2025 at 11:22:09AM +0200, David Hildenbrand wrote:
>> On 03.09.25 11:18, Dev Jain wrote:
>>> I meant to say, writable VMAs with wrprotected ptes. Maybe after this
>>> patch, people
> So, I think you really need to update the commit message here to reflect
> that!
>
> And of course, do a cover letter :P
>
> The test results would probably work better as a cover letter as will be
> put in the first commit by Andrew in any case.

Alright, if that is the consensus then I will soon post v2 with a cover letter.

>
>>> can finally make some real use of the max_ptes_shared tunable :)
>> I hope not, because it should be burned with fire, lol :)
> +1 :)
>
>> --
>> Cheers
>>
>> David / dhildenb
>>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
  2025-09-03  5:46 [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs Dev Jain
                   ` (5 preceding siblings ...)
  2025-09-03 20:34 ` Lorenzo Stoakes
@ 2025-09-04  6:11 ` Baolin Wang
  6 siblings, 0 replies; 24+ messages in thread
From: Baolin Wang @ 2025-09-04  6:11 UTC (permalink / raw)
  To: Dev Jain, akpm, david, kas, willy, hughd
  Cc: ziy, lorenzo.stoakes, Liam.Howlett, npache, ryan.roberts, baohua,
	linux-mm, linux-kernel



On 2025/9/3 13:46, Dev Jain wrote:
> Currently khugepaged does not collapse a region which does not have a
> single writable page. This is wasteful since non-writable VMAs mapped by
> the application won't benefit from THP collapse. Therefore, remove this
> restriction and allow khugepaged to collapse a VMA with arbitrary
> protections.
> 
> Along with this, currently MADV_COLLAPSE does not perform a collapse on a
> non-writable VMA, and this restriction is nowhere to be found on the
> manpage - the restriction itself sounds wrong to me since the user knows
> the protection of the memory it has mapped, so collapsing read-only
> memory via madvise() should be a choice of the user which shouldn't
> be overriden by the kernel.
> 
> On an arm64 machine, an average of 5% improvement is seen on some mmtests
> benchmarks, particularly hackbench, with a maximum improvement of 12%.

I also wondered about the writable check before, but never dug into the 
history. The result looks nice.

> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---

LGTM.
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO
  2025-09-03  5:46 ` [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO Dev Jain
                     ` (4 preceding siblings ...)
  2025-09-03 20:35   ` Lorenzo Stoakes
@ 2025-09-04  6:12   ` Baolin Wang
  5 siblings, 0 replies; 24+ messages in thread
From: Baolin Wang @ 2025-09-04  6:12 UTC (permalink / raw)
  To: Dev Jain, akpm, david, kas, willy, hughd
  Cc: ziy, lorenzo.stoakes, Liam.Howlett, npache, ryan.roberts, baohua,
	linux-mm, linux-kernel



On 2025/9/3 13:46, Dev Jain wrote:
> Now that all actionable outcomes from checking pte_write() are gone,
> drop the related references.
> 
> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---

LGTM.
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2025-09-04  6:13 UTC | newest]

Thread overview: 24+ messages
2025-09-03  5:46 [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs Dev Jain
2025-09-03  5:46 ` [PATCH 2/2] mm: Drop all references of writable and SCAN_PAGE_RO Dev Jain
2025-09-03  6:53   ` David Hildenbrand
2025-09-03  9:04   ` Kiryl Shutsemau
2025-09-03 13:26   ` Lorenzo Stoakes
2025-09-03 14:33     ` David Hildenbrand
2025-09-03 15:47   ` Zi Yan
2025-09-03 20:35   ` Lorenzo Stoakes
2025-09-04  6:12   ` Baolin Wang
2025-09-03  6:52 ` [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs David Hildenbrand
2025-09-03  8:08 ` Wei Yang
2025-09-03  8:13   ` David Hildenbrand
2025-09-03  8:30     ` Wei Yang
2025-09-03  9:06   ` Dev Jain
2025-09-03  9:15   ` Dev Jain
2025-09-03  9:18     ` Dev Jain
2025-09-03  9:22       ` David Hildenbrand
2025-09-03 18:25         ` Lorenzo Stoakes
2025-09-04  3:56           ` Dev Jain
2025-09-03 13:11     ` Wei Yang
2025-09-03  9:03 ` Kiryl Shutsemau
2025-09-03 15:46 ` Zi Yan
2025-09-03 20:34 ` Lorenzo Stoakes
2025-09-04  6:11 ` Baolin Wang
