linux-mm.kvack.org archive mirror
* [PATCH] mm/damon/vaddr: Skip isolating folios already in destination nid
@ 2025-07-25 16:33 Bijan Tabatabai
  2025-07-25 19:36 ` SeongJae Park
  2025-07-28  6:16 ` Raghavendra K T
  0 siblings, 2 replies; 4+ messages in thread
From: Bijan Tabatabai @ 2025-07-25 16:33 UTC (permalink / raw)
  To: linux-mm, linux-kernel, damon; +Cc: sj, akpm, Bijan Tabatabai

From: Bijan Tabatabai <bijantabatab@micron.com>

damos_va_migrate_dests_add() determines the node a folio should be in
based on the struct damos_migrate_dests associated with the migration
scheme and adds the folio to the linked list corresponding to that node
so it can be migrated later. Currently, folios are isolated and added to
the list even if they are already in the node they should be in.

While using DAMON's weighted interleaving more heavily, I've found
that the overhead of needlessly adding these folios to the migration
lists can be quite high. The overhead comes from isolating folios and
placing them on the migration lists inside damos_va_migrate_dests_add(),
as well as from handling those folios in damon_migrate_pages(). This
patch eliminates that overhead by simply not adding folios that are
already in their intended location to the migration lists.
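
For reference, here is a rough sketch of the flow in question. This is
simplified and not the exact upstream code; 'target' stands in for the
pseudo-random value used for the weighted choice, and the function name
is made up for illustration:

/*
 * Sketch: pick a destination index by weighted interleaving, then
 * isolate the folio onto that node's migration list. Before this
 * patch, isolation happened even when folio_nid(folio) was already
 * dests->node_id_arr[i].
 */
static void sketch_dests_add(struct folio *folio, unsigned long target,
		struct damos_migrate_dests *dests,
		struct list_head *migration_lists)
{
	unsigned int i;

	/* Walk the weights until target falls within dest i's share. */
	for (i = 0; i < dests->nr_dests; i++) {
		if (target < dests->weight_arr[i])
			break;
		target -= dests->weight_arr[i];
	}

	if (!folio_isolate_lru(folio))
		return;
	list_add(&folio->lru, &migration_lists[i]);
}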

To show the benefit of this patch, we start the test workload and
attach a DAMON instance to it with a migrate_hot scheme that has a
single dest field sending data to the local node. Since data is
allocated on the local node by default, this measures only the
overhead of the scheme, not the cost of migrating pages.
I tested with two workloads: the embedding reduction workload used in
[1] and a microbenchmark that allocates 20GB of data and then sleeps,
which has a memory footprint similar to that of the embedding
reduction workload.
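
As a stand-in for that microbenchmark (the exact program isn't
included here, so treat this as an assumed equivalent):

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	size_t len = 20UL << 30;	/* 20GB */
	char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	memset(buf, 1, len);	/* fault pages in, local node by default */
	pause();		/* sleep while DAMON monitors and acts */
	return 0;
}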

The time spent in damos_va_migrate_dests_add() and damon_migrate_pages()
in each aggregation interval is shown below.

Before this patch:
                       damos_va_migrate_dests_add damon_migrate_pages
microbenchmark                   ~2ms                      ~3ms
embedding reduction              ~1s                       ~3s

After this patch:
                       damos_va_migrate_dests_add damon_migrate_pages
microbenchmark                    0us                      ~40us
embedding reduction               0us                      ~100us

I did not do an in-depth analysis of why the overhead is so much
larger for the embedding reduction workload than for the
microbenchmark. However, I assume it is because the embedding
reduction workload oversaturates the bandwidth of the local memory
node, increasing memory access latency and in turn making the pointer
chasing involved in iterating through a linked list much slower.
Regardless, this patch results in a significant speedup.

[1] https://lore.kernel.org/damon/20250709005952.17776-1-bijan311@gmail.com/

Signed-off-by: Bijan Tabatabai <bijantabatab@micron.com>
---
Sorry I missed this in the original patchset!

 mm/damon/vaddr.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
index 7f5dc9c221a0..4404c2ab0583 100644
--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -711,6 +711,10 @@ static void damos_va_migrate_dests_add(struct folio *folio,
 		target -= dests->weight_arr[i];
 	}
 
+	/* If the folio is already in the right node, don't do anything */
+	if (folio_nid(folio) == dests->node_id_arr[i])
+		return;
+
 isolate:
 	if (!folio_isolate_lru(folio))
 		return;
-- 
2.43.5




* Re: [PATCH] mm/damon/vaddr: Skip isolating folios already in destination nid
  2025-07-25 16:33 [PATCH] mm/damon/vaddr: Skip isolating folios already in destination nid Bijan Tabatabai
@ 2025-07-25 19:36 ` SeongJae Park
  2025-07-25 19:39   ` SeongJae Park
  2025-07-28  6:16 ` Raghavendra K T
  1 sibling, 1 reply; 4+ messages in thread
From: SeongJae Park @ 2025-07-25 19:36 UTC (permalink / raw)
  To: Bijan Tabatabai
  Cc: SeongJae Park, linux-mm, linux-kernel, damon, akpm,
	Bijan Tabatabai

On Fri, 25 Jul 2025 11:33:00 -0500 Bijan Tabatabai <bijan311@gmail.com> wrote:

> From: Bijan Tabatabai <bijantabatab@micron.com>
> 
> damos_va_migrate_dests_add() determines the node a folio should be in
> based on the struct damos_migrate_dests associated with the migration
> scheme and adds the folio to the linked list corresponding to that node
> so it can be migrated later. Currently, folios are isolated and added to
> the list even if they are already in the node they should be in.
> 
> While using DAMON's weighted interleaving more heavily, I've found
> that the overhead of needlessly adding these folios to the migration
> lists can be quite high. The overhead comes from isolating folios and
> placing them on the migration lists inside damos_va_migrate_dests_add(),
> as well as from handling those folios in damon_migrate_pages(). This
> patch eliminates that overhead by simply not adding folios that are
> already in their intended location to the migration lists.
[...]
> Signed-off-by: Bijan Tabatabai <bijantabatab@micron.com>

Reviewed-by: SeongJae Park <sj@kernel.org>

> ---
> Sorry I missed this in the original patchset!

No worry!

Andrew, could we squash this into commit 19c1dc15c859 ("mm/damon/vaddr: use
damos->migrate_dests in migrate_{hot,cold}") on mm-stable?  I think this is
just a simple fixup.

> 
> [...]


Thanks,
SJ



* Re: [PATCH] mm/damon/vaddr: Skip isolating folios already in destination nid
  2025-07-25 19:36 ` SeongJae Park
@ 2025-07-25 19:39   ` SeongJae Park
  0 siblings, 0 replies; 4+ messages in thread
From: SeongJae Park @ 2025-07-25 19:39 UTC (permalink / raw)
  To: Andrew Morton
  Cc: SeongJae Park, Bijan Tabatabai, linux-mm, linux-kernel, damon,
	Bijan Tabatabai

+ Andrew

On Fri, 25 Jul 2025 12:36:37 -0700 SeongJae Park <sj@kernel.org> wrote:

> On Fri, 25 Jul 2025 11:33:00 -0500 Bijan Tabatabai <bijan311@gmail.com> wrote:
> 
> [...]
> > Signed-off-by: Bijan Tabatabai <bijantabatab@micron.com>
> 
> Reviewed-by: SeongJae Park <sj@kernel.org>
> 
> > ---
> > Sorry I missed this in the original patchset!
> 
> No worry!
> 
> Andrew, could we squash this into commit 19c1dc15c859 ("mm/damon/vaddr: use
> damos->migrate_dests in migrate_{hot,cold}") on mm-stable?  I think this is
> just a simple fixup.
> 
> > 
> >  mm/damon/vaddr.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
> > index 7f5dc9c221a0..4404c2ab0583 100644
> > --- a/mm/damon/vaddr.c
> > +++ b/mm/damon/vaddr.c
> > @@ -711,6 +711,10 @@ static void damos_va_migrate_dests_add(struct folio *folio,
> >  		target -= dests->weight_arr[i];
> >  	}
> >  
> > +	/* If the folio is already in the right node, don't do anything */
> > +	if (folio_nid(folio) == dests->node_id_arr[i])
> > +		return;
> > +
> >  isolate:
> >  	if (!folio_isolate_lru(folio))
> >  		return;
> > -- 
> > 2.43.5
> 
> 
> Thanks,
> SJ
> 



* Re: [PATCH] mm/damon/vaddr: Skip isolating folios already in destination nid
  2025-07-25 16:33 [PATCH] mm/damon/vaddr: Skip isolating folios already in destination nid Bijan Tabatabai
  2025-07-25 19:36 ` SeongJae Park
@ 2025-07-28  6:16 ` Raghavendra K T
  1 sibling, 0 replies; 4+ messages in thread
From: Raghavendra K T @ 2025-07-28  6:16 UTC (permalink / raw)
  To: Bijan Tabatabai, linux-mm, linux-kernel, damon; +Cc: sj, akpm, Bijan Tabatabai

On 7/25/2025 10:03 PM, Bijan Tabatabai wrote:
> From: Bijan Tabatabai <bijantabatab@micron.com>
> 
[...]
> diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
> index 7f5dc9c221a0..4404c2ab0583 100644
> --- a/mm/damon/vaddr.c
> +++ b/mm/damon/vaddr.c
> @@ -711,6 +711,10 @@ static void damos_va_migrate_dests_add(struct folio *folio,
>   		target -= dests->weight_arr[i];
>   	}
>   
> +	/* If the folio is already in the right node, don't do anything */
> +	if (folio_nid(folio) == dests->node_id_arr[i])
> +		return;
> +

I have seen good improvements from similar changes in PTE A-bit
scan-based patches.

Feel free to add

Reviewed-by: Raghavendra K T <raghavendra.kt@amd.com>

>   isolate:
>   	if (!folio_isolate_lru(folio))
>   		return;




end of thread

Thread overview: 4+ messages
2025-07-25 16:33 [PATCH] mm/damon/vaddr: Skip isolating folios already in destination nid Bijan Tabatabai
2025-07-25 19:36 ` SeongJae Park
2025-07-25 19:39   ` SeongJae Park
2025-07-28  6:16 ` Raghavendra K T
