From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Vernon Yang <vernon2gm@gmail.com>, akpm@linux-foundation.org
Cc: lorenzo.stoakes@oracle.com, ziy@nvidia.com, dev.jain@arm.com,
	baohua@kernel.org, lance.yang@linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Vernon Yang <yanglincheng@kylinos.cn>
Subject: Re: [PATCH mm-new v4 3/6] mm: khugepaged: just skip when the memory has been collapsed
Date: Wed, 14 Jan 2026 12:40:50 +0100
Message-ID: <9924fc6c-e3ad-422a-ad60-756efccba0aa@kernel.org>
In-Reply-To: <20260111121909.8410-4-yanglincheng@kylinos.cn>

On 1/11/26 13:19, Vernon Yang wrote:
> The following data was traced with bpftrace on a desktop system. After
> the system had been left idle for 10 minutes after booting, many
> SCAN_PMD_MAPPED or SCAN_NO_PTE_TABLE results were observed during a full
> scan by khugepaged.
> 
> @scan_pmd_status[1]: 1           ## SCAN_SUCCEED
> @scan_pmd_status[6]: 2           ## SCAN_EXCEED_SHARED_PTE
> @scan_pmd_status[3]: 142         ## SCAN_PMD_MAPPED
> @scan_pmd_status[2]: 178         ## SCAN_NO_PTE_TABLE
> total progress size: 674 MB
> Total time         : 419 seconds ## includes khugepaged_scan_sleep_millisecs
> 
> The khugepaged_scan list holds every task that supports collapsing into
> hugepages; as long as a task is not destroyed, khugepaged will not remove
> it from the khugepaged_scan list. As a result, a task may already have
> collapsed all of its memory regions into hugepages, yet khugepaged keeps
> scanning it, which wastes CPU time to no effect. Because of
> khugepaged_scan_sleep_millisecs (default 10s), scanning a large number of
> such tasks incurs a long wait, so the tasks that can actually benefit are
> scanned late.
> 
> After applying this patch, when the memory is either SCAN_PMD_MAPPED or
> SCAN_NO_PTE_TABLE, just skip it, as follows:
> 
> @scan_pmd_status[6]: 2
> @scan_pmd_status[3]: 147
> @scan_pmd_status[2]: 173
> total progress size: 45 MB
> Total time         : 20 seconds
> 
> Signed-off-by: Vernon Yang <yanglincheng@kylinos.cn>
> ---
>   mm/khugepaged.c | 17 +++++++++++------
>   1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 5c6015ac7b5e..6df2857d94c6 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -68,7 +68,10 @@ enum scan_result {
>   static struct task_struct *khugepaged_thread __read_mostly;
>   static DEFINE_MUTEX(khugepaged_mutex);
>   
> -/* default scan 8*HPAGE_PMD_NR ptes (or vmas) every 10 second */
> +/*
> + * default scan 8*HPAGE_PMD_NR ptes, pmd_mapped, no_pte_table or vmas
> + * every 10 second.
> + */
>   static unsigned int khugepaged_pages_to_scan __read_mostly;
>   static unsigned int khugepaged_pages_collapsed;
>   static unsigned int khugepaged_full_scans;
> @@ -1267,7 +1270,7 @@ static enum scan_result hpage_collapse_scan_pmd(struct mm_struct *mm,
>   	result = find_pmd_or_thp_or_none(mm, start_addr, &pmd);
>   	if (result != SCAN_SUCCEED) {
>   		if (cur_progress)
> -			*cur_progress = HPAGE_PMD_NR;
> +			*cur_progress = 1;
>   		goto out;
>   	}
>   
> @@ -1276,7 +1279,7 @@ static enum scan_result hpage_collapse_scan_pmd(struct mm_struct *mm,
>   	pte = pte_offset_map_lock(mm, pmd, start_addr, &ptl);
>   	if (!pte) {
>   		if (cur_progress)
> -			*cur_progress = HPAGE_PMD_NR;
> +			*cur_progress = 1;
>   		result = SCAN_NO_PTE_TABLE;
>   		goto out;
>   	}

The above checks are clear.
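
For illustration only (not part of the patch): a minimal userspace sketch of
why charging 1 instead of HPAGE_PMD_NR for a skipped range matters. It
assumes HPAGE_PMD_NR is 512 (x86-64 with 4K base pages) and the default
budget of 8*HPAGE_PMD_NR named in the comment above; the helper name
skipped_ranges_per_pass() is hypothetical.

```c
/* Assumed values: 512 base pages per PMD (x86-64, 4K pages) and the
 * default per-pass budget of 8 * HPAGE_PMD_NR from the patched comment. */
#define HPAGE_PMD_NR  512u
#define PAGES_TO_SCAN (8u * HPAGE_PMD_NR)

/*
 * Hypothetical helper: how many skipped PMD ranges (SCAN_PMD_MAPPED or
 * SCAN_NO_PTE_TABLE) fit into one scan pass when each skipped range is
 * charged cost_per_skip units of progress.
 */
static unsigned int skipped_ranges_per_pass(unsigned int cost_per_skip)
{
	return PAGES_TO_SCAN / cost_per_skip;
}
```

Under these assumptions, skipped_ranges_per_pass(HPAGE_PMD_NR) gives 8 ranges
per pass with the old accounting, while skipped_ranges_per_pass(1) gives 4096
with the new one, so far fewer sleep intervals are spent on ranges that cannot
be collapsed.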

> @@ -2347,9 +2350,6 @@ static enum scan_result hpage_collapse_scan_file(struct mm_struct *mm, unsigned
>   			continue;
>   		}
>   
> -		if (cur_progress)
> -			*cur_progress += folio_nr_pages(folio);
> -

This is all a bit hairy.

Assume we found a single 4k folio in the xarray, but then collapse a 2M THP.

Is the progress really "1" ?

What about shmem swap entries (xa_is_value)?

So I think the whole file path needs more thought.
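
To make the concern concrete, here is a toy userspace model (not kernel
code) of per-folio progress accounting. The types and the function
progress_per_folio() are hypothetical, and it assumes that swap value
entries add nothing to progress, which is exactly the kind of case that
would need to be decided.

```c
#include <stddef.h>

/* Toy stand-ins for what a file scan may find in the page cache. */
enum entry_kind { ENTRY_FOLIO, ENTRY_SWAP_VALUE };

struct toy_entry {
	enum entry_kind kind;
	unsigned int nr_pages;	/* pages covered by a folio; unused for swap values */
};

/*
 * Per-folio accounting: only folios add progress, each by its page count.
 * Assumption for this sketch: swap value entries are skipped without
 * being charged.
 */
static unsigned int progress_per_folio(const struct toy_entry *entries, size_t n)
{
	unsigned int progress = 0;

	for (size_t i = 0; i < n; i++)
		if (entries[i].kind == ENTRY_FOLIO)
			progress += entries[i].nr_pages;
	return progress;
}
```

In this model, a range holding a single 4k folio reports a progress of 1 even
if the whole 2M range is then collapsed, and a range holding only swap entries
reports 0.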

-- 
Cheers

David

