linux-mm.kvack.org archive mirror
From: Barry Song <21cnbao@gmail.com>
To: yuzhao@google.com
Cc: corbet@lwn.net, linux-mm@kvack.org, lsf-pc@lists.linux-foundation.org
Subject: Re: [LSF/MM/BPF TOPIC] TAO: THP Allocator Optimizations
Date: Tue,  5 Mar 2024 21:37:43 +1300
Message-ID: <20240305083743.24950-1-21cnbao@gmail.com>
In-Reply-To: <20240229183436.4110845-1-yuzhao@google.com>

> TAO is an umbrella project aiming at a better economy of physical
> contiguity viewed as a valuable resource. A few examples are:
> 1. A multi-tenant system can have guaranteed THP coverage while
>    hosting abusers/misusers of the resource.
> 2. Abusers/misusers, e.g., workloads excessively requesting and then
>    splitting THPs, should be punished if necessary.
> 3. Good citizens should be awarded with, e.g., lower allocation
>    latency and less cost of metadata (struct page).

I think TAO, or a similar optimization in the buddy allocator, is
essential to the success of mTHP.

Ryan's recent mTHP work can bring multi-size large folios to a wide
range of products for which THP might be too large.

The pain point is that the buddy allocator on a real device with
limited memory can become seriously fragmented after the device has
been running for some time.

We (OPPO) have actually brought up mTHP-like features on millions of
phones, even on the 5.4, 5.10, 5.15 and 6.1 kernels, with 64KiB large
folios that leverage ARM64's CONT-PTE. The open source code for
kernel 6.1 is available at [1]. We found that the success rate of
64KiB allocation could be very low after running monkey[2] on phones
for one hour.
 
The data below was collected during the second hour of uptime (from
60 to 120 minutes), after the phone had already been running for one
hour. Without a TAO-like optimization to the existing buddy
allocator, 64KiB large folio allocation falls back to small folios
at a rate of 92.35% in do_anonymous_page().

thp_do_anon_pages_fallback / (thp_do_anon_pages + thp_do_anon_pages_fallback)
25807330 / 27944523 =  0.9235

In do_anonymous_page(), thp_do_anon_pages_fallback counts the times
we tried to allocate a 64KiB folio, failed, and used small folios
instead; thp_do_anon_pages counts the times we tried to allocate a
64KiB folio and succeeded.
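
As a minimal sketch, the fallback ratio can be recomputed from the
two counters. The success count used here is not reported directly
in this mail; it is derived by subtracting the fallback count from
the denominator shown above, and the helper name fallback_rate() is
purely illustrative.

```python
def fallback_rate(success: int, fallback: int) -> float:
    """Fraction of 64KiB attempts that fell back to small folios."""
    total = success + fallback
    if total == 0:
        return 0.0  # no attempts recorded, so nothing fell back
    return fallback / total


# Figures from the mail: fallback count and total number of attempts.
FALLBACK = 25807330
TOTAL = 27944523

# Assumption: the success counter is the total minus the fallbacks,
# which follows from the formula but is not printed in the mail.
SUCCESS = TOTAL - FALLBACK

rate = fallback_rate(SUCCESS, FALLBACK)
print(f"fallback rate: {rate:.4f}")
```

Put differently, only about one 64KiB attempt in thirteen succeeded
during the measured hour.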

So this number means mTHP loses the vast majority of its value on a
fragmented system, and a phone is always fragmented.

This has actually pushed us to implement a similar optimization that
avoids splitting 64KiB blocks and rewards 64KiB allocation with lower
latency. Our implementation differs from TAO: rather than adding new
zones, we add migrate types that mark some pageblocks as dedicated
to mTHP allocation, and we avoid splitting those pageblocks into
lower orders except in some corner cases. This has significantly
improved the success rate of 64KiB large folio allocation, decreased
its latency, and helped large folios finally ship in real products.
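
The effect of reserving memory for 64KiB allocations can be shown
with a toy model. This is only a sketch under stated assumptions and
not the implementation referenced in [1]: memory is a flat list of
64KiB slots, base pages are assumed to be 4KiB, small allocations
use a deliberately pessimistic policy that always splits the
emptiest eligible slot, and the corner cases in which dedicated
memory may be split are left out. ToyAllocator and every other name
in it are hypothetical.

```python
PAGES_PER_SLOT = 16  # 64KiB slot / 4KiB base page (assumed page size)


class ToyAllocator:
    """Toy model, not kernel code: memory is a list of 64KiB slots."""

    def __init__(self, ordinary_slots: int, dedicated_slots: int) -> None:
        total = ordinary_slots + dedicated_slots
        self.free = [PAGES_PER_SLOT] * total  # free base pages per slot
        # True marks a slot that stands in for memory inside a
        # pageblock reserved for 64KiB allocations.
        self.dedicated = [False] * ordinary_slots + [True] * dedicated_slots

    def alloc_small(self) -> bool:
        """Take one base page, never from a dedicated slot."""
        candidates = [i for i, n in enumerate(self.free)
                      if n > 0 and not self.dedicated[i]]
        if not candidates:
            return False
        # Pessimistic policy: split the slot with the most free pages.
        victim = max(candidates, key=lambda i: self.free[i])
        self.free[victim] -= 1
        return True

    def alloc_64k(self) -> bool:
        """Take one fully free slot, preferring dedicated slots."""
        intact = [i for i, n in enumerate(self.free) if n == PAGES_PER_SLOT]
        if not intact:
            return False  # the caller would fall back to small folios
        intact.sort(key=lambda i: not self.dedicated[i])
        self.free[intact[0]] = 0
        return True


def successes_after_churn(alloc: ToyAllocator, small_allocs: int,
                          large_attempts: int) -> int:
    """Fragment memory with small allocations, then try 64KiB ones."""
    for _ in range(small_allocs):
        alloc.alloc_small()
    return sum(alloc.alloc_64k() for _ in range(large_attempts))


# Same total memory (96 slots) and same churn in both runs.
baseline = successes_after_churn(ToyAllocator(96, 0), 96, 40)
with_dedicated = successes_after_churn(ToyAllocator(64, 32), 96, 40)
print(baseline, with_dedicated)
```

In this model the baseline run has no intact 64KiB slot left after
the small allocations, while the run with dedicated slots can still
serve as many 64KiB requests as it has reserved slots. The real
trade-off, such as how much memory to reserve and when to allow
splitting, is not captured here.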

[1] https://github.com/OnePlusOSS/android_kernel_oneplus_sm8650/blob/oneplus/sm8650_u_14.0.0_oneplus12/
[2] https://developer.android.com/studio/test/other-testing-tools/monkey

> 4. Better interoperability with userspace memory allocators when
>    transacting the resource.
> 
> This project puts the same emphasis on the established use case for
> servers and the emerging use case for clients so that client workloads
> like Android and ChromeOS can leverage the recent multi-sized THPs
> [1][2].

> Chapter One introduces the cornerstone of TAO: an abstraction called
> policy (virtual) zones, which are overlayed on the physical zones.
> This is in line with item 1 above.
> 
> A new door is open after Chapter One. The following two chapters
> discuss the reverse of THP collapsing, called THP shattering, and THP
> HVO, which brings the hugeTLB feature [3] to THP. They are in line
> with items 2 & 3 above.
> 
> Advanced use cases are discussed in Epilogue, since they require the
> cooperation of userspace memory allocators. This is in line with item
> 4 above.
> 
> [1] https://lwn.net/Articles/932386/
> [2] https://lwn.net/Articles/937239/
> [3] https://www.kernel.org/doc/html/next/mm/vmemmap_dedup.html
> 
> Yu Zhao (4):
>   THP zones: the use cases of policy zones
>   THP shattering: the reverse of collapsing
>   THP HVO: bring the hugeTLB feature to THP
>   Profile-Guided Heap Optimization and THP fungibility

Thanks
Barry



