From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: david@kernel.org, catalin.marinas@arm.com, will@kernel.org,
	lorenzo.stoakes@oracle.com, ryan.roberts@arm.com,
	Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
	surenb@google.com, mhocko@suse.com, riel@surriel.com,
	harry.yoo@oracle.com, jannh@google.com, willy@infradead.org,
	baohua@kernel.org, dev.jain@arm.com, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6 0/5] support batch checking of references and unmapping for large folios
Date: Tue, 10 Feb 2026 10:01:02 +0800	[thread overview]
Message-ID: <de69443f-8e0f-4d83-9f29-0b349ff682c8@linux.alibaba.com> (raw)
In-Reply-To: <20260209175316.2ef64ee244599765a74a6975@linux-foundation.org>



On 2/10/26 9:53 AM, Andrew Morton wrote:
> On Mon,  9 Feb 2026 22:07:23 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> 
>> Currently, folio_referenced_one() always checks the young flag for each PTE
>> sequentially, which is inefficient for large folios. This inefficiency is
>> especially noticeable when reclaiming clean file-backed large folios, where
>> folio_referenced() is observed as a significant performance hotspot.
>>
>> Moreover, the Arm architecture, which supports contiguous PTEs, already has an
>> optimization to clear the young flags for PTEs within a contiguous range.
>> However, this is not sufficient: we can extend it to operate on the entire
>> large folio, which may exceed the contiguous range (CONT_PTE_SIZE).
>>
>> Similar to folio_referenced_one(), we can also apply batched unmapping for large
>> file folios to optimize file folio reclamation. By supporting batched checking
>> of the young flags, batched TLB flushing, and batched unmapping, I observed
>> significant performance improvements in my tests of file folio reclamation.
>> Please check the performance data in the commit message of each patch.
>>
> 
> Thanks, I updated mm.git to this version.  Below is how v6 altered
> mm.git.
> 
> I notice that this fix:
> 
> https://lore.kernel.org/all/de141225-a0c1-41fd-b3e1-bcab09827ddd@linux.alibaba.com/T/#u
> 
> was not carried forward.  Was this deliberate?

Yes. After discussing with David[1], we believe the original patch is 
correct, so the 'fix' is unnecessary.

[1] https://lore.kernel.org/all/280ae63e-d66e-438f-8045-6c870420fe76@linux.alibaba.com/

The following diff looks good to me. Thanks.

> Also, regarding the 80-column tricks in folio_referenced_one(): we're
> allowed to do this ;)
> 
> 
> 				unsigned long end_addr;
> 				unsigned int max_nr;
> 
> 				end_addr = pmd_addr_end(address, vma->vm_end);
> 				max_nr = (end_addr - address) >> PAGE_SHIFT;
> 
> 
> 
> 
>   arch/arm64/include/asm/pgtable.h |    2 +-
>   include/linux/pgtable.h          |   16 ++++++++++------
>   mm/rmap.c                        |    9 +++------
>   3 files changed, 14 insertions(+), 13 deletions(-)
> 
> --- a/arch/arm64/include/asm/pgtable.h~b
> +++ a/arch/arm64/include/asm/pgtable.h
> @@ -1843,7 +1843,7 @@ static inline int clear_flush_young_ptes
>   					 unsigned long addr, pte_t *ptep,
>   					 unsigned int nr)
>   {
> -	if (likely(nr == 1 && !pte_valid_cont(__ptep_get(ptep))))
> +	if (likely(nr == 1 && !pte_cont(__ptep_get(ptep))))
>   		return __ptep_clear_flush_young(vma, addr, ptep);
>   
>   	return contpte_clear_flush_young_ptes(vma, addr, ptep, nr);
> --- a/include/linux/pgtable.h~b
> +++ a/include/linux/pgtable.h
> @@ -1070,8 +1070,8 @@ static inline void wrprotect_ptes(struct
>   
>   #ifndef clear_flush_young_ptes
>   /**
> - * clear_flush_young_ptes - Clear the access bit and perform a TLB flush for PTEs
> - *			    that map consecutive pages of the same folio.
> + * clear_flush_young_ptes - Mark PTEs that map consecutive pages of the same
> + *			    folio as old and flush the TLB.
>    * @vma: The virtual memory area the pages are mapped into.
>    * @addr: Address the first page is mapped at.
>    * @ptep: Page table pointer for the first entry.
> @@ -1087,13 +1087,17 @@ static inline void wrprotect_ptes(struct
>    * pages that belong to the same folio.  The PTEs are all in the same PMD.
>    */
>   static inline int clear_flush_young_ptes(struct vm_area_struct *vma,
> -					 unsigned long addr, pte_t *ptep,
> -					 unsigned int nr)
> +		unsigned long addr, pte_t *ptep, unsigned int nr)
>   {
> -	int i, young = 0;
> +	int young = 0;
>   
> -	for (i = 0; i < nr; ++i, ++ptep, addr += PAGE_SIZE)
> +	for (;;) {
>   		young |= ptep_clear_flush_young(vma, addr, ptep);
> +		if (--nr == 0)
> +			break;
> +		ptep++;
> +		addr += PAGE_SIZE;
> +	}
>   
>   	return young;
>   }
> --- a/mm/rmap.c~b
> +++ a/mm/rmap.c
> @@ -963,10 +963,8 @@ static bool folio_referenced_one(struct
>   				referenced++;
>   		} else if (pvmw.pte) {
>   			if (folio_test_large(folio)) {
> -				unsigned long end_addr =
> -					pmd_addr_end(address, vma->vm_end);
> -				unsigned int max_nr =
> -					(end_addr - address) >> PAGE_SHIFT;
> +				unsigned long end_addr = pmd_addr_end(address, vma->vm_end);
> +				unsigned int max_nr = (end_addr - address) >> PAGE_SHIFT;
>   				pte_t pteval = ptep_get(pvmw.pte);
>   
>   				nr = folio_pte_batch(folio, pvmw.pte,
> @@ -974,8 +972,7 @@ static bool folio_referenced_one(struct
>   			}
>   
>   			ptes += nr;
> -			if (clear_flush_young_ptes_notify(vma, address,
> -						pvmw.pte, nr))
> +			if (clear_flush_young_ptes_notify(vma, address, pvmw.pte, nr))
>   				referenced++;
>   			/* Skip the batched PTEs */
>   			pvmw.pte += nr - 1;
> _



