From: SeongJae Park <sj@kernel.org>
To: gutierrez.asier@huawei-partners.com
Cc: SeongJae Park <sj@kernel.org>,
	artem.kuzin@huawei.com, stepanov.anatoly@huawei.com,
	wangkefeng.wang@huawei.com, yanquanmin1@huawei.com,
	zuoze1@huawei.com, damon@lists.linux.dev,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v3 0/4] mm/damon: Introduce a huge page collapsing mechanism using auto tuning
Date: Thu,  4 Jun 2026 18:34:05 -0700
Message-ID: <20260605013406.83441-1-sj@kernel.org>
In-Reply-To: <20260604150338.501128-1-gutierrez.asier@huawei-partners.com>

Hello Asier,


Thank you for revising this great patch series!

On Thu, 4 Jun 2026 15:03:33 +0000 <gutierrez.asier@huawei-partners.com> wrote:

> From: Asier Gutierrez <gutierrez.asier@huawei-partners.com>
> 
> Overview
> ========
> 
> This patch set introduces a new autotuning which allows to collapse
> hot regions into hugepages.
> 
> Motivation
> ==========
> 
> Since TLB is a bottleneck for many systems[1], a way to optimize TLB
> misses (or hits) is to use huge pages. Unfortunately, using "always"
> in THP leads to memory fragmentation and memory waste. For this reason,
> most application guides and system administrators suggest to disable THP.
> 
> Currently DAMON has DAMOS_HUGEPAGE, DAMOS_NONHUGEPAGE and DAMOS_COLLAPSE.
> However, there is no way to tune the settings. It will collapse all the
> hot regions that meet the access pattern. If the server is a bare metal
> database or big data server, this will also lead to eventual fragmentation.
> 
> Additionally, currently THP is set globally. Ideally, there should be a
> way to control which tasks can use huge pages.

We can do process-level control using prctl(PR_SET_THP_DISABLE) [1], can't we?
I think the last sentence above would be better reworded or simply dropped.
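
For reference, below is a minimal sketch of that per-process control.  It is
written in Python via ctypes only for brevity.  The option numbers are those
defined in <linux/prctl.h>; the helper names (_prctl, set_thp_disable,
thp_disabled) are made up for illustration, and it works only on Linux.

```python
import ctypes
import os

# Option numbers from <linux/prctl.h>.
PR_SET_THP_DISABLE = 41
PR_GET_THP_DISABLE = 42


def _prctl(option, arg2=0):
    """Call prctl(2) with the remaining arguments set to zero (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    ret = libc.prctl(ctypes.c_int(option), ctypes.c_ulong(arg2),
                     ctypes.c_ulong(0), ctypes.c_ulong(0), ctypes.c_ulong(0))
    if ret < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return ret


def set_thp_disable(disable):
    """Set or clear the "THP disable" flag of the calling process."""
    _prctl(PR_SET_THP_DISABLE, 1 if disable else 0)


def thp_disabled():
    """Return True if the "THP disable" flag is currently set."""
    return _prctl(PR_GET_THP_DISABLE) != 0
```

The flag is inherited by children created via fork(2) and preserved across
execve(2), so a small wrapper can set it before executing an unmodified
workload.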

> 
> Solution
> ========
> 
> DAMON has now a way to autotune some of the variables and adjust quotas
> automatically, so that DAMON is fired only under the right circumstances.
> It would be nice to have something similar, but for huge pages.
> 
> A new autotuning quota goal[2], damos_get_used_hugepage_mem_bp, is
> introduced, which checks the huge page consumption to total anonymous

In the previous revision I suggested
s/damos_get_used_hugepage_mem_bp/damos_hugepage_mem_bp/ and you agreed.  It
seems this was forgotten?

> memory consumption. This new quota mechanism reuses current autotuning
> architecture.
> 
> A new module is introduced to demonstrate the use of huge pages

Let's clarify that it is a sample module.  That is,
s/A new module/A new sample module/ ?

> collapse autotuning. The goal is to collapse hot regions of a given
> process into huge pages. The module launches a kdamond thread for a
> certain task provided by the user through monitored_pid module argument.

Following the pattern of other vaddr-based sample modules, what about
s/monitored_pid/target_pid/ ?

As I also commented on the third patch of this series, it apparently follows
the pattern of non-sample modules rather than that of the sample modules.
Could you please rewrite it in a simpler way?

> Hugepage goal autotuning will automatically adjust the aggressiveness
> of hot region collapses.
> 
> This module also has a user autotuning knob which allows the user to
> adjust the aggressiveness of page collapsing.
> 
> Benchmarks
> ==========
> 
> Huge page collapse autotuning was tested in a physicial machine with
> MariaDB 10.5.29 and sysbench as the benchmark framework.
> 
> The hugepage module was set up in the following way:
> 
> # echo 1000 > min_age
> # echo 1000 > quota_percentage_hugepage
> # echo $(pidof mariadbd) > monitored_pid
> # echo on > enabled
> 
> The goal was to achieve 5% of the total memory used as hugepage.

Is there any reason for setting it to 5%?

> 
> The table below shows the memory consumption over time. Gaps in the
> timestamp means that no changes in the hugepage consumption happened
> over that period of time.
> 
> +-----------+----------------+----------------+----------------------+
> | timestamp | total mem used | huge page used | percentage hugepage  |
> +-----------+----------------+----------------+----------------------+
> | 0         | 4721188        | 0              | 0%                   |
> | 28        | 4216848        | 4              | 0%                   |
> | 37        | 4189912        | 38912          | 1%                   |
> | 39        | 4195188        | 47104          | 1%                   |
> | 55        | 4111612        | 51200          | 1%                   |
> | 59        | 4137012        | 53248          | 1%                   |
> | 60        | 4137052        | 55296          | 1%                   |
> | 61        | 4156832        | 57344          | 1%                   |
> | 62        | 4136920        | 59392          | 1%                   |
> | 64        | 4109872        | 61440          | 1%                   |
> | 65        | 4119108        | 63488          | 2%                   |
> | 66        | 4145532        | 65536          | 2%                   |
> | 67        | 4134544        | 67584          | 2%                   |
> | 68        | 4158244        | 126976         | 3%                   |
> | 69        | 4124276        | 204800         | 5%                   |
> | 70        | 4100680        | 333824         | 8%                   |
> | 71        | 4095540        | 462848         | 11%                  |
> +-----------+----------------+----------------+----------------------+

What is the timestamp unit?  Seconds?

What is the unit of the memory usage?  Bytes?  Kilobytes?
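
Whatever the unit turns out to be, the percentage column is only the ratio of
the two memory columns, so it can be cross-checked without knowing the unit.
Below is a minimal sketch, assuming both columns use the same unit.  It
reports basis points only to echo the "_bp" suffix of the goal name; the
helper name hugepage_bp is hypothetical and this is not the kernel code.

```python
def hugepage_bp(hugepage_used, total_used):
    """Return hugepage_used / total_used in basis points (1 bp = 0.01%)."""
    if total_used == 0:
        return 0
    return hugepage_used * 10000 // total_used


# (timestamp, total mem used, huge page used), copied from the quoted table.
rows = [(37, 4189912, 38912), (69, 4124276, 204800), (71, 4095540, 462848)]
for ts, total, huge in rows:
    bp = hugepage_bp(huge, total)
    print(ts, bp, f"{bp / 100:.2f}%")
```

For the three rows this gives 92, 496 and 1130 basis points, which agrees
with the rounded 1%, 5% and 11% in the table.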

I also remember you mentioned you would compare the numbers for more setups,
including the module-disabled case (baseline) and the THP-disabled case.  I
think "THP disabled" was my typo.  I probably meant the "THP enabled" case.

Is that still on your TODO list?

Given this series is adding a relatively small change (assuming the sample
module will be simplified), I wouldn't strictly request all such tests.  I'm
just curious about your plan.

> 
> Performance:
> Baseline -> 18,162.45 transactions per second
> Hugepage autotune -> 18,211.82 transactions per second

So, about 0.27% improvement.  I think it is not bad for this simple approach.

Could you further elaborate on how the performance was measured?  When did the
transactions-per-second measurement start, and when did it stop?  Are the
numbers averages (means)?  Or something else?
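
For reference, the relative change between the two quoted numbers can be
computed as below.  This is only a sketch of the arithmetic; the function and
variable names are made up for illustration, and it says nothing about how
the numbers themselves were collected.

```python
def relative_change_pct(baseline, new):
    """Return the change from baseline to new, as a percentage of baseline."""
    return (new - baseline) / baseline * 100.0


baseline_tps = 18162.45   # "Baseline" from the quoted results
autotune_tps = 18211.82   # "Hugepage autotune" from the quoted results
print(f"{relative_change_pct(baseline_tps, autotune_tps):.2f}%")  # prints 0.27%
```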

> 
> 
> Eventually, the amount of huge pages reached 20%. This is consistent
> with how quota goals autotuning work. We are more aggresive when the
> quota is less than 10%, and less aggresive when the quota is higher.
> At some point, the aggressiveness just fades and no more collapses
> occur.

Could you share longer-term hugepage utilization data that shows it converging
to 20% and not increasing further after that?

Also, have you tried the temporal quota tuner?

> 
> TODO
> ====
> - Support page splitting for cold hugepages.

This is future work that is out of the scope of this series, right?  I think
it would be better to clarify that.  In the previous revision, I was reading
this as a TODO for a future revision of this patch series.

Also, do you have specific changes you want to make to this series before it is
merged, or before dropping the RFC tag?

> 
> Patches Sequence
> ================
> Patch 1 -> Introduce DAMOS_QUOTA_HUGEPAGE and autotuning
> Patch 2 -> damon_modules_new_vaddr_ctx_target
> Patch 3 -> Module that demonstrates how to use DAMOS_QUOTA_HUGEPAGE
>            and the new VADDR ctx creation
> Patch 4 -> Documentation

As I commented on each patch, patch 1 looks good except for a few trivial
things.  Patch 2 seems unnecessary.  I hope patch 3 will be much simplified and
rewritten following the sample modules' pattern.  Patch 4 seems too much for a
sample module.

> 
> Changes from previous versions
> ==============================
> RFC 2[3] -> RFC 3
>   - Module moved to samples
>   - Change autotune to monitor total memory and hugepage
>   - Added performnace benchmarks to the cover letter
>   - Bail out gracefully when trying to start disable
>     the module after the monitored task exited. This 
>     issue was discovered by sashiko [4]
>   - Fixed typos and added quota_sz to the documentation
>     discovered by sashiko [5]
> RFC 1[6] -> RFC 2
>   - Rebased into mm-new
>   - Use DAMOS_COLLAPSE instead of DAMOS_HUGEPAGE
>   - Fixed an issue that returned silently an error when the PID
>     didn't exist in the system.[7]

Thank you for continuing this great work, Asier.

> 
> [1] https://dl.acm.org/doi/pdf/10.1145/3307650.3322227
> [2] https://lore.kernel.org/e67f05ad-dbb9-45e6-ba30-b167a99ac67d@huawei-partners.com
> [3] https://lore.kernel.org/20260522145518.158910-1-gutierrez.asier@huawei-partners.com
> [4] https://lore.kernel.org/20260522171210.900B11F00A3D@smtp.kernel.org
> [5] https://lore.kernel.org/20260522171633.AAF5B1F000E9@smtp.kernel.org
> [6] https://lore.kernel.org/20260430134139.2446417-1-gutierrez.asier@huawei-partners.com
> [7] https://lore.kernel.org/all/20260430154338.E22E6C2BCB3@smtp.kernel.org/


Thanks,
SJ

[...]

