Re: [RFC PATCH v2 2/2] mm/damon/paddr: Allow multiple migrate targets

From: SeongJae Park <sj@kernel.org>
To: Bijan Tabatabai <bijan311@gmail.com>
Cc: SeongJae Park <sj@kernel.org>,
	damon@lists.linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	david@redhat.com, ziy@nvidia.com, matthew.brost@intel.com,
	joshua.hahnjy@gmail.com, rakie.kim@sk.com, byungchul@sk.com,
	gourry@gourry.net, ying.huang@linux.alibaba.com,
	apopple@nvidia.com, bijantabatab@micron.com,
	venkataravis@micron.com, emirakhur@micron.com,
	ajayjoshi@micron.com, vtavarespetr@micron.com
Subject: Re: [RFC PATCH v2 2/2] mm/damon/paddr: Allow multiple migrate targets
Date: Mon, 23 Jun 2025 17:34:08 -0700
Message-ID: <20250624003408.47807-1-sj@kernel.org>
In-Reply-To: <CAMvvPS4CNzc7gSF8Z+6ogB212V+GDJyW9PXrrrP+wMyDNfXKqg@mail.gmail.com>

On Mon, 23 Jun 2025 18:15:00 -0500 Bijan Tabatabai <bijan311@gmail.com> wrote:

[...]
> Hi SeongJae,
> 
> I really appreciate your detailed response.
> The quota auto-tuning helps, but I feel like it's still not exactly
> what I want. For example, I think a quota goal that stops migration
> based on the memory usage balance gets quite a bit more complicated
> when instead of interleaving all data, we are just interleaving *hot*
> data. I haven't looked at it extensively, but I imagine it wouldn't be
> easy to identify how much data is hot in the paddr setting,

I don't think so, and I don't see why you think so.  Could you please
elaborate?

> especially
> because the regions can contain a significant amount of unallocated
> data.

In that case, the unallocated memory shouldn't be accessed at all, so the
region will just look cold to DAMON.

> Also, if the interleave weights changed, for example, from 11:9
> to 10:10, it would be preferable if only 5% of data is migrated;
> however, with the round robin approach, 50% would be. Finally, and I
> forgot to mention this in my last message, the round-robin approach
> does away with any notion of spatial locality, which does help the
> effectiveness of interleaving [1].

We could use probabilistic interleaving, if this is the problem?
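
As a rough illustration of the 11:9 to 10:10 example quoted above, the
following Python sketch compares how many pages change node under offset-based
weighted interleaving and under a round-robin target assignment.  It is a toy
model under stated assumptions, not code from the patch: offset_node(),
round_robin_targets() and moved_fraction() are hypothetical helpers, pages are
identified by a plain index, and the round-robin case assumes the scan order is
unrelated to the current placement (modelled here as a seeded shuffle).

```python
import random


def offset_node(page_index, weights):
    """Map a page offset to a node index by weighted interleaving (toy model)."""
    slot = page_index % sum(weights)
    for node, weight in enumerate(weights):
        if slot < weight:
            return node
        slot -= weight
    raise ValueError("weights must be positive integers")


def round_robin_targets(num_pages, weights, scan_order):
    """Hand out targets in weighted round-robin order as pages are scanned."""
    cycle = [node for node, weight in enumerate(weights) for _ in range(weight)]
    targets = [None] * num_pages
    for count, page in enumerate(scan_order):
        targets[page] = cycle[count % len(cycle)]
    return targets


def moved_fraction(before, after):
    """Fraction of pages whose node differs between two placements."""
    return sum(b != a for b, a in zip(before, after)) / len(before)


NUM_PAGES = 20_000          # assumption: arbitrary size, a multiple of 20
old_weights, new_weights = [11, 9], [10, 10]

# Starting point: pages placed by offset under the old 11:9 weights.
placement = [offset_node(i, old_weights) for i in range(NUM_PAGES)]

# Offset-based re-placement under the new 10:10 weights, applied twice.
offset_after = [offset_node(i, new_weights) for i in range(NUM_PAGES)]
offset_again = [offset_node(i, new_weights) for i in range(NUM_PAGES)]

# Round-robin re-placement; scan order is assumed unrelated to placement.
scan_order = list(range(NUM_PAGES))
random.Random(0).shuffle(scan_order)
rr_after = round_robin_targets(NUM_PAGES, new_weights, scan_order)

offset_moved = moved_fraction(placement, offset_after)
steady_moved = moved_fraction(offset_after, offset_again)
rr_moved = moved_fraction(placement, rr_after)

print(f"offset-based: {offset_moved:.1%} moved, then {steady_moved:.1%}")
print(f"round-robin:  {rr_moved:.1%} moved")
```

Under these assumptions the offset-based mapping moves one page out of every 20
(5%) and nothing on a second pass, while the round-robin assignment moves
roughly half of the pages.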

> I don't think anything done with
> quotas can get around that.

I don't think I fully understand your point, sorry.  Could you elaborate more
on your concern?

> I wonder if there's an elegant way to
> specify whether to use rmap or not, but my initial feeling is that
> might just add complication to the code and interface for not enough
> benefit.

Agreed.  Please note that I'm open to adding an interface for this behavior if
the benefit is clear.  Another option could be adding non-rmap migration first
(if it shows some benefit), and adding rmap support later once its additional
benefit is confirmed.

> 
> Maybe, as you suggest later on, this is an indication that my use case
> is a better fit for a vaddr scheme. I'll get into that more below.
> 
> > > Using the VMA offset to determine where a page
> > > should be placed avoids this problem because it gives a folio a single
> > > node it can be in for a given set of interleave weights. This means
> > > that in steady state, no folios will be migrated.
> >
> > This makes sense for this use case.  But I don't think this makes same sense
> > for possible other use cases, like memory tiering on systems having multiple
> > NUMA nodes of same tier.
> 
> I see where you're coming from. I think the crux of this difference is
> that in my use case, the set of nodes we are monitoring is the same as
> the set of nodes we are migrating to, while in the use case you
> describe, the set of nodes being monitored is disjoint from the set of
> migration target nodes.

I understand and agree with this difference.

> I think this in particular makes ping ponging
> more of a problem for my use case, compared to promotion/demotion
> schemes.

But again, I'm failing to understand this, sorry.  Could you elaborate a bit
more?

> 
> > If you really need this virtual address space based
> > deterministic behavior, it would make more sense to use virtual address spaces
> > monitoring (damon-vaddr).
> 
> Maybe it does make sense for me to implement vaddr versions of the
> migrate actions for my use case.

Yes, that could also be an option.

> One thing that gives me pause about
> this, is that, from what I understand, it would be harder to have
> vaddr schemes apply to processes that start after damon begins. I
> think to do that, one would have to detect when a process starts, and
> then do a damon tune to upgrade the targets list? It would be nice if,
> say, you could specify a cgroup as a vaddr target and track all
> processes in that cgroup, but that would be a different patchset for
> another day.

I agree that could be a future thing to do.  Note that the DAMON user-space
tool implements a similar feature[1].
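
For illustration only, the user-space side of "detect new processes, then
update the targets list" could be sketched in Python as below.  This makes no
claim about how damo implements its feature.  It only relies on the cgroup v2
cgroup.procs file, which lists one PID per line; read_cgroup_pids() and
diff_targets() are hypothetical helpers, and the step that actually updates
the DAMON targets is deliberately left out.

```python
import tempfile
from pathlib import Path


def read_cgroup_pids(cgroup_dir):
    """Return the set of PIDs listed in <cgroup_dir>/cgroup.procs."""
    text = Path(cgroup_dir, "cgroup.procs").read_text()
    return {int(pid) for pid in text.split()}


def diff_targets(current_targets, cgroup_pids):
    """Return (pids_to_add, pids_to_remove) so the targets match the cgroup."""
    return cgroup_pids - current_targets, current_targets - cgroup_pids


# Demo against a fake cgroup directory, so no real cgroup is needed.
with tempfile.TemporaryDirectory() as fake_cgroup:
    Path(fake_cgroup, "cgroup.procs").write_text("101\n202\n303\n")
    to_add, to_remove = diff_targets({101, 999}, read_cgroup_pids(fake_cgroup))

print("add:", sorted(to_add), "remove:", sorted(to_remove))
```

A monitoring script would call these periodically on a real cgroup directory
and feed the two sets into whatever mechanism it uses to update the targets.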

> 
> But, using vaddr has other benefits, like the sampling would take into
> account the locality of the accesses. There are also ways to make
> vaddr sampling more efficient by using higher levels of the page
> tables, that I don't think apply to paddr schemes [2]. I believe the
> authors of [2] said they submitted their patches to the kernel, but I
> don't know if it has been upstreamed (sorry about derailing the
> conversation slightly).

Thank you for the reminder.  It was a nice finding and approach[2], but
unfortunately it wasn't upstreamed.  I now realize that the monitoring intervals
auto-tuning[3] idea was partly motivated by that nice discussion, though.

[1] https://github.com/damonitor/damo/blob/next/release_note#L33
[2] https://lore.kernel.org/damon/20240318132848.82686-1-aravinda.prasad@intel.com/
[3] https://lkml.kernel.org/r/20250303221726.484227-1-sj@kernel.org


Thanks,
SJ

[...]

> 
> [1] https://elixir.bootlin.com/linux/v6.16-rc3/source/mm/mempolicy.c#L213
> [2] https://www.usenix.org/conference/atc24/presentation/nair
