From: Yu Zhao <yuzhao@google.com>
To: Zi Yan <ziy@nvidia.com>
Cc: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
	 Jonathan Corbet <corbet@lwn.net>
Subject: Re: [Chapter Two] THP shattering: the reverse of collapsing
Date: Sat, 2 Mar 2024 20:17:50 -0500
Message-ID: <CAOUHufaO==bcto32Ub7iH=224K9hsVWxAsdR8cHOpbFR3qKMBQ@mail.gmail.com>
In-Reply-To: <A7637877-6850-4465-9B61-6F441AAFF2F9@nvidia.com>

On Thu, Feb 29, 2024 at 4:55 PM Zi Yan <ziy@nvidia.com> wrote:
>
> On 29 Feb 2024, at 13:34, Yu Zhao wrote:
>
> > In contrast to split, shatter migrates occupied pages in a partially
> > mapped THP to a bunch of base folios. IOW, unlike split done in place,
> > shatter is the exact opposite of collapse.
> >
> > The advantage of shattering is that it keeps the original THP intact.
>
> Why keep the THP intact? To prevent the THP from fragmentation, since
> the shattered part will not be returned to buddy allocator for reuse?

There might be some confusion here: there is no "shattered part" -- the
entire THP becomes free after shattering (the occupied part is moved
to a bunch of 4KB pages).
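
To make the difference concrete, here is a toy model in Python. It is
only a sketch, not kernel code: the function names and the
list-of-booleans representation of a THP are assumptions made up for
illustration. It shows that split leaves the free subpages scattered as
4KB pages, while shatter leaves one whole free THP behind.

```python
# Toy model only (hypothetical names, not kernel APIs).
# A "THP" is a list of 512 slots; True means the 4KB subpage is occupied.
SUBPAGES_PER_2MB_THP = 512  # 2MB / 4KB


def split_in_place(thp):
    """Split: each subpage becomes an independent 4KB page where it sits.

    Returns (occupied 4KB pages, free 4KB pages, intact free THPs).
    """
    occupied = sum(thp)
    free = len(thp) - occupied
    return occupied, free, 0


def shatter(thp):
    """Shatter: copy occupied subpages to newly allocated 4KB pages.

    The original THP is then entirely free and still contiguous.
    Returns (occupied 4KB pages, free 4KB pages, intact free THPs).
    """
    occupied = sum(thp)
    return occupied, 0, 1


# A THP with only its first quarter still mapped.
thp = [True] * 128 + [False] * (SUBPAGES_PER_2MB_THP - 128)
print("split:  ", split_in_place(thp))
print("shatter:", shatter(thp))
```

In both cases 128 base pages still hold data; the difference is whether
the remaining memory survives as one 2MB unit or as 384 loose 4KB pages.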

> I agree with the idea of shattering, but keeping THP intact might
> give us trouble for 1GB THP case when PMD mapping is created after
> shattering. How to update mapcount for a PMD mapping in the middle of
> a 1GB folio? I used head[0], head[512], ... as the PMD mapping head
> page, but that is ugly. For mTHPs, there is no such problem since
> only PTE mappings are involved.

If we don't consider the copying cost during shattering, it can work
for 1GB THPs as it does for 2MB THPs.

> It might be better to just split the THP and move free pages to a
> do-not-use free list until the rest are freed too

The main reason we do shattering is, using a crude analogy, that a
million dollars in $10,000 bills (yes, they exist) is worth a lot more
than the same amount in pennies. You can carry the former in your
pocket, but the latter weighs at least 250 tons. So if we split, we
lose money.

1GB THP is one of the important *end goals* for TAO. But I don't want
to go into details since we need to focus on the first few steps at
the current stage.

The problem with shattering for 1GB is the copying cost -- if we
shatter a half-mapped 1GB THP, we'd have to copy 512MB of data, which
is unacceptable. 1GB THP requires something we call "THP fungibility"
(see the epilogue) -- we do split in place, but we also "collapse" in
place (called THP recovery, i.e., MADV_RECOVERY).

Shattering is for 2MB THPs only.
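
The copying cost above is simple arithmetic, sketched below in Python.
The helper name is hypothetical, and the model assumes that every
still-mapped 4KB subpage is copied exactly once and nothing else is.

```python
BASE_PAGE = 4 * 1024   # 4KB base page
MiB = 1024 * 1024


def shatter_copy_bytes(thp_bytes, mapped_subpages):
    """Bytes copied if every still-mapped 4KB subpage is migrated out."""
    total_subpages = thp_bytes // BASE_PAGE
    if not 0 <= mapped_subpages <= total_subpages:
        raise ValueError("mapped_subpages out of range")
    return mapped_subpages * BASE_PAGE


# Half-mapped 2MB THP: 256 of 512 subpages are still mapped.
half_of_2mb = shatter_copy_bytes(2 * MiB, 256)
# Half-mapped 1GB THP: 131072 of 262144 subpages are still mapped.
half_of_1gb = shatter_copy_bytes(1024 * MiB, 131072)

print(half_of_2mb // MiB, "MB vs", half_of_1gb // MiB, "MB")
# prints: 1 MB vs 512 MB
```

The worst case for a 2MB THP is bounded by 2MB, whereas a half-mapped
1GB THP already costs 512MB, which is the gap described above.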


> if the zone enforces
> a minimal order that is larger than the free pages.
>
> > The cost of copying during the migration is not a side effect, but
> > rather by design, since splitting is considered a discouraged
> > behavior. In retail terms, the return of a purchase is charged with a
> > restocking fee and the original goods can be resold.
> >
> > THPs from ZONE_NOMERGE can only be shattered, since they cannot be
> > split or merged. THPs from ZONE_NOSPLIT can be shattered or split (the
> > latter requires [1]), if they are above the minimum order.
> >
> > [1] https://lore.kernel.org/20240226205534.1603748-1-zi.yan@sent.com/
> >
>
>
> --
> Best Regards,
> Yan, Zi

