All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lance Yang <lance.yang@linux.dev>
To: Vernon Yang <vernon2gm@gmail.com>
Cc: lorenzo.stoakes@oracle.com, ziy@nvidia.com, dev.jain@arm.com,
	baohua@kernel.org, richard.weiyang@gmail.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Vernon Yang <yanglincheng@kylinos.cn>,
	akpm@linux-foundation.org, david@kernel.org
Subject: Re: [PATCH v3 5/6] mm: khugepaged: skip lazy-free folios at scanning
Date: Sun, 4 Jan 2026 20:10:17 +0800	[thread overview]
Message-ID: <9c82ffaa-5f62-4110-80cc-00f0c46e90fb@linux.dev> (raw)
In-Reply-To: <20260104054112.4541-6-yanglincheng@kylinos.cn>



On 2026/1/4 13:41, Vernon Yang wrote:
> For example, create three task: hot1 -> cold -> hot2. After all three
> task are created, each allocate memory 128MB. the hot1/hot2 task
> continuously access 128 MB memory, while the cold task only accesses
> its memory briefly andthen call madvise(MADV_FREE). However, khugepaged
> still prioritizes scanning the cold task and only scans the hot2 task
> after completing the scan of the cold task.
> 
> So if the user has explicitly informed us via MADV_FREE that this memory
> will be freed, it is appropriate for khugepaged to skip it only, thereby
> avoiding unnecessary scan and collapse operations to reducing CPU
> wastage.
> 
> Here are the performance test results:
> (Throughput bigger is better, other smaller is better)
> 
> Testing on x86_64 machine:
> 
> | task hot2           | without patch | with patch    |  delta  |
> |---------------------|---------------|---------------|---------|
> | total accesses time |  3.14 sec     |  2.93 sec     | -6.69%  |
> | cycles per access   |  4.96         |  2.21         | -55.44% |
> | Throughput          |  104.38 M/sec |  111.89 M/sec | +7.19%  |
> | dTLB-load-misses    |  284814532    |  69597236     | -75.56% |
> 
> Testing on qemu-system-x86_64 -enable-kvm:
> 
> | task hot2           | without patch | with patch    |  delta  |
> |---------------------|---------------|---------------|---------|
> | total accesses time |  3.35 sec     |  2.96 sec     | -11.64% |
> | cycles per access   |  7.29         |  2.07         | -71.60% |
> | Throughput          |  97.67 M/sec  |  110.77 M/sec | +13.41% |
> | dTLB-load-misses    |  241600871    |  3216108      | -98.67% |
> 
> Signed-off-by: Vernon Yang <yanglincheng@kylinos.cn>
> ---
>   include/trace/events/huge_memory.h | 1 +
>   mm/khugepaged.c                    | 6 ++++++
>   2 files changed, 7 insertions(+)
> 
> diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
> index 01225dd27ad5..e99d5f71f2a4 100644
> --- a/include/trace/events/huge_memory.h
> +++ b/include/trace/events/huge_memory.h
> @@ -25,6 +25,7 @@
>   	EM( SCAN_PAGE_LRU,		"page_not_in_lru")		\
>   	EM( SCAN_PAGE_LOCK,		"page_locked")			\
>   	EM( SCAN_PAGE_ANON,		"page_not_anon")		\
> +	EM( SCAN_PAGE_LAZYFREE,		"page_lazyfree")		\
>   	EM( SCAN_PAGE_COMPOUND,		"page_compound")		\
>   	EM( SCAN_ANY_PROCESS,		"no_process_for_page")		\
>   	EM( SCAN_VMA_NULL,		"vma_null")			\
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 30786c706c4a..1ca034a5f653 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -45,6 +45,7 @@ enum scan_result {
>   	SCAN_PAGE_LRU,
>   	SCAN_PAGE_LOCK,
>   	SCAN_PAGE_ANON,
> +	SCAN_PAGE_LAZYFREE,
>   	SCAN_PAGE_COMPOUND,
>   	SCAN_ANY_PROCESS,
>   	SCAN_VMA_NULL,
> @@ -1337,6 +1338,11 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
>   		}
>   		folio = page_folio(page);
>   
> +		if (folio_is_lazyfree(folio)) {
> +			result = SCAN_PAGE_LAZYFREE;
> +			goto out_unmap;
> +		}

That's a bit tricky ... I don't think we need to handle MADV_FREE pages
differently :)

MADV_FREE pages are likely cold memory, but what if there are just
a few MADV_FREE pages in a hot memory region? Skipping the entire
region would be unfortunate ...

Also, even if we skip these pages now, after they are reclaimed, they
become pte_none. Then khugepaged will try to collapse them anyway
(based on khugepaged_max_ptes_none). So skipping them just delays
things, it does not really change the final result ;)

Thanks,
Lance

> +
>   		if (!folio_test_anon(folio)) {
>   			result = SCAN_PAGE_ANON;
>   			goto out_unmap;



  reply	other threads:[~2026-01-04 12:10 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-04  5:41 [PATCH v3 0/6] Improve khugepaged scan logic Vernon Yang
2026-01-04  5:41 ` [PATCH v3 1/6] mm: khugepaged: add trace_mm_khugepaged_scan event Vernon Yang
2026-01-04  5:41 ` [PATCH v3 2/6] mm: khugepaged: refine scan progress number Vernon Yang
2026-01-05 16:49   ` David Hildenbrand (Red Hat)
2026-01-06  5:55     ` Vernon Yang
2026-01-14 11:18       ` David Hildenbrand (Red Hat)
2026-01-04  5:41 ` [PATCH v3 3/6] mm: khugepaged: just skip when the memory has been collapsed Vernon Yang
2026-01-04  5:41 ` [PATCH v3 4/6] mm: add folio_is_lazyfree helper Vernon Yang
2026-01-04 11:42   ` Lance Yang
2026-01-05  2:09     ` Vernon Yang
2026-01-04  5:41 ` [PATCH v3 5/6] mm: khugepaged: skip lazy-free folios at scanning Vernon Yang
2026-01-04 12:10   ` Lance Yang [this message]
2026-01-05  1:48     ` Vernon Yang
2026-01-05  2:51       ` Lance Yang
2026-01-05  3:12         ` Vernon Yang
2026-01-05  3:35           ` Lance Yang
2026-01-05 12:30             ` Vernon Yang
2026-01-06 10:33               ` Barry Song
2026-01-07  8:36                 ` Vernon Yang
2026-01-04  5:41 ` [PATCH v3 6/6] mm: khugepaged: set to next mm direct when mm has MMF_DISABLE_THP_COMPLETELY Vernon Yang
2026-01-04 12:20   ` Lance Yang
2026-01-05  0:31     ` Wei Yang
2026-01-05  2:09       ` Lance Yang
2026-01-05  2:06     ` Vernon Yang
2026-01-05  2:20       ` Lance Yang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9c82ffaa-5f62-4110-80cc-00f0c46e90fb@linux.dev \
    --to=lance.yang@linux.dev \
    --cc=akpm@linux-foundation.org \
    --cc=baohua@kernel.org \
    --cc=david@kernel.org \
    --cc=dev.jain@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=richard.weiyang@gmail.com \
    --cc=vernon2gm@gmail.com \
    --cc=yanglincheng@kylinos.cn \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.