From: Nico Pache <npache@redhat.com>
To: Wei Yang <richard.weiyang@gmail.com>
Cc: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	 linux-mm@kvack.org, linux-doc@vger.kernel.org, david@redhat.com,
	 ziy@nvidia.com, baolin.wang@linux.alibaba.com,
	lorenzo.stoakes@oracle.com,  Liam.Howlett@oracle.com,
	ryan.roberts@arm.com, dev.jain@arm.com,  corbet@lwn.net,
	rostedt@goodmis.org, mhiramat@kernel.org,
	 mathieu.desnoyers@efficios.com, akpm@linux-foundation.org,
	baohua@kernel.org,  willy@infradead.org, peterx@redhat.com,
	wangkefeng.wang@huawei.com,  usamaarif642@gmail.com,
	sunnanyong@huawei.com, vishal.moola@gmail.com,
	 thomas.hellstrom@linux.intel.com, yang@os.amperecomputing.com,
	kas@kernel.org,  aarcange@redhat.com, raquini@redhat.com,
	anshuman.khandual@arm.com,  catalin.marinas@arm.com,
	tiwai@suse.de, will@kernel.org,  dave.hansen@linux.intel.com,
	jack@suse.cz, cl@gentwo.org, jglisse@google.com,
	 surenb@google.com, zokeefe@google.com, hannes@cmpxchg.org,
	 rientjes@google.com, mhocko@suse.com, rdunlap@infradead.org,
	hughd@google.com,  lance.yang@linux.dev, vbabka@suse.cz,
	rppt@kernel.org, jannh@google.com,  pfalcato@suse.de
Subject: Re: [PATCH v12 mm-new 13/15] khugepaged: avoid unnecessary mTHP collapse attempts
Date: Mon, 17 Nov 2025 11:16:53 -0700
Message-ID: <CAA1CXcA9zaGqLPHnJWH=fKPUjb02dV+rKgfmsRZOGdeukiC8eg@mail.gmail.com>
In-Reply-To: <20251109024013.fzt7xxpmxwi75xgr@master>

On Sat, Nov 8, 2025 at 7:40 PM Wei Yang <richard.weiyang@gmail.com> wrote:
>
> On Wed, Oct 22, 2025 at 12:37:15PM -0600, Nico Pache wrote:
> >There are cases where, if an attempted collapse fails, all subsequent
> >orders are guaranteed to also fail. Avoid these collapse attempts by
> >bailing out early.
> >
> >Signed-off-by: Nico Pache <npache@redhat.com>
> >---
> > mm/khugepaged.c | 31 ++++++++++++++++++++++++++++++-
> > 1 file changed, 30 insertions(+), 1 deletion(-)
> >
> >diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> >index e2319bfd0065..54f5c7888e46 100644
> >--- a/mm/khugepaged.c
> >+++ b/mm/khugepaged.c
> >@@ -1431,10 +1431,39 @@ static int collapse_scan_bitmap(struct mm_struct *mm, unsigned long address,
> >                       ret = collapse_huge_page(mm, address, referenced,
> >                                                unmapped, cc, mmap_locked,
> >                                                order, offset);
> >-                      if (ret == SCAN_SUCCEED) {
> >+
> >+                      /*
> >+                       * Analyze failure reason to determine next action:
> >+                       * - goto next_order: try smaller orders in same region
> >+                       * - continue: try other regions at same order
> >+                       * - break: stop all attempts (system-wide failure)
> >+                       */
> >+                      switch (ret) {
> >+                      /* Cases were we should continue to the next region */
> >+                      case SCAN_SUCCEED:
> >                               collapsed += 1UL << order;
> >+                              fallthrough;
> >+                      case SCAN_PTE_MAPPED_HUGEPAGE:
> >                               continue;
> >+                      /* Cases were lower orders might still succeed */
> >+                      case SCAN_LACK_REFERENCED_PAGE:
> >+                      case SCAN_EXCEED_NONE_PTE:
> >+                      case SCAN_EXCEED_SWAP_PTE:
> >+                      case SCAN_EXCEED_SHARED_PTE:
> >+                      case SCAN_PAGE_LOCK:
> >+                      case SCAN_PAGE_COUNT:
> >+                      case SCAN_PAGE_LRU:
> >+                      case SCAN_PAGE_NULL:
> >+                      case SCAN_DEL_PAGE_LRU:
> >+                      case SCAN_PTE_NON_PRESENT:
> >+                      case SCAN_PTE_UFFD_WP:
> >+                      case SCAN_ALLOC_HUGE_PAGE_FAIL:
> >+                              goto next_order;
> >+                      /* All other cases should stop collapse attempts */
> >+                      default:
> >+                              break;
> >                       }
> >+                      break;
>
> One question here:

Hi Wei Yang,

Sorry, I forgot to get back to this email.

>
> Suppose we have iterated several orders and not collapse successfully yet. So
> the mthp_bitmap_stack[] would look like this:
>
> [8 7 6 6]
>        ^
>        |

We always pop before pushing, so it would go:

[9]
pop
if (collapse fails)
[8 8]
Let's say we pop and successfully collapse an order 8
[8]
Then we fail the other order 8
[7 7]
Now if we fail the first order 7
[7 6 6]
I believe we are now in the state you wanted to describe.
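
To make the pop-before-push sequence concrete, here is a minimal userspace sketch. It is not the khugepaged code: `struct order_stack`, `push()`, `pop()`, `step()` and `format_stack()` are hypothetical names, the stack holds orders only (no bitmap offsets), and it assumes a failed attempt pushes both halves at the next lower order while a successful one pushes nothing.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 16

/* Hypothetical stand-in for mthp_bitmap_stack[]: stores orders only. */
struct order_stack {
	int orders[MAX_DEPTH];
	int top;	/* number of entries currently on the stack */
};

static void push(struct order_stack *s, int order)
{
	assert(s->top < MAX_DEPTH);
	s->orders[s->top++] = order;
}

static int pop(struct order_stack *s)
{
	assert(s->top > 0);
	return s->orders[--s->top];
}

/*
 * One iteration: pop the top entry and "attempt" a collapse at that order.
 * On failure, push the two halves at order - 1 (assumed behaviour of the
 * next_order path); on success, push nothing.
 */
static void step(struct order_stack *s, int collapse_ok)
{
	int order = pop(s);

	if (!collapse_ok && order > 0) {
		push(s, order - 1);
		push(s, order - 1);
	}
}

/* Render the stack bottom-to-top, e.g. "[7 6 6]". */
static void format_stack(const struct order_stack *s, char *buf, size_t len)
{
	size_t used = (size_t)snprintf(buf, len, "[");

	for (int i = 0; i < s->top && used < len; i++)
		used += (size_t)snprintf(buf + used, len - used,
					 i ? " %d" : "%d", s->orders[i]);
	if (used < len)
		snprintf(buf + used, len - used, "]");
}
```

Starting from a single order 9 entry and calling `step()` with fail, succeed, fail, fail walks through [8 8], [8], [7 7] and ends at [7 6 6]. The sketch stops splitting at order 0 only to keep it short; the real minimum mTHP order is higher.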

>
> Now we found this one pass the threshold check, but it fails with other
> result.

OK, let's say we pass the threshold checks, but the collapse fails for
any of the reasons listed under
/* Cases were lower orders might still succeed */
In this case we would continue to order 5 (or lower). Once we are done
with this branch of the tree, we go back to the other order 6 collapse,
and eventually the order 7.

>
> Current code looks it would give up at all, but we may still have a chance to
> collapse the above 3 range?

For cases under /* All other cases should stop collapse attempts */,
yes, we would bail out and skip some collapses. I tried to think about
all the cases where we would still want to continue trying, vs. cases
where the system is probably out of resources or hitting some major
failure, and we should just break out (as the others will probably fail
too).

But this is also why I separated this patch out on its own. I was
hoping to have some more focus on the different cases, and make sure I
handled them in the best possible way. So I really appreciate the
question :)

* I did some digging through old messages to find this *

I believe these are the remaining cases. If these are hit, I figured
it's better to abort.

/* cases where we must stop collapse attempts */
case SCAN_CGROUP_CHARGE_FAIL:
case SCAN_COPY_MC:
case SCAN_ADDRESS_RANGE:
case SCAN_PMD_NULL:
case SCAN_ANY_PROCESS:
case SCAN_VMA_NULL:
case SCAN_VMA_CHECK:
case SCAN_SCAN_ABORT:
case SCAN_PMD_NONE:
case SCAN_PAGE_ANON:
case SCAN_PMD_MAPPED:
case SCAN_FAIL:

Please let me know if you think we should move these to either the
`continue` or `next order` cases.
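
The three outcomes discussed above (continue, next order, break) can be sketched as a standalone classifier. This is only an illustration under stated assumptions: the enum below is a local subset that reuses names from the patch with arbitrary values, and `enum next_action`, `classify()` and `attempts_until_break()` are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Local, hypothetical subset of result codes. The names come from the
 * patch; the numeric values here are arbitrary.
 */
enum scan_result {
	SCAN_FAIL,
	SCAN_SUCCEED,
	SCAN_PTE_MAPPED_HUGEPAGE,
	SCAN_EXCEED_NONE_PTE,
	SCAN_PAGE_LOCK,
	SCAN_ALLOC_HUGE_PAGE_FAIL,
	SCAN_CGROUP_CHARGE_FAIL,
	SCAN_COPY_MC,
	SCAN_VMA_NULL,
};

enum next_action {
	ACT_CONTINUE,	/* try other regions at the same order */
	ACT_NEXT_ORDER,	/* try smaller orders in the same region */
	ACT_BREAK,	/* stop all attempts */
};

static enum next_action classify(enum scan_result ret)
{
	switch (ret) {
	case SCAN_SUCCEED:
	case SCAN_PTE_MAPPED_HUGEPAGE:
		return ACT_CONTINUE;
	case SCAN_EXCEED_NONE_PTE:
	case SCAN_PAGE_LOCK:
	case SCAN_ALLOC_HUGE_PAGE_FAIL:
		return ACT_NEXT_ORDER;
	default:
		/* Everything not listed above is treated as an abort. */
		return ACT_BREAK;
	}
}

/*
 * Given the results of a sequence of attempts, return how many attempts
 * run before the loop stops, counting the aborting attempt itself.
 */
static size_t attempts_until_break(const enum scan_result *results, size_t n)
{
	size_t i;

	assert(results != NULL);
	for (i = 0; i < n; i++)
		if (classify(results[i]) == ACT_BREAK)
			return i + 1;
	return n;
}
```

With the sequence SCAN_SUCCEED, SCAN_PAGE_LOCK, SCAN_CGROUP_CHARGE_FAIL, SCAN_SUCCEED, the third attempt is the last one made; the fourth is skipped. Moving a code from the abort group to one of the other two groups would only change which `case` label it sits under.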

Cheers,
-- Nico

>
> >               }
> >
> > next_order:
> >--
> >2.51.0
>
> --
> Wei Yang
> Help you, Help me
>

