public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
* [LSF/MM/BPF TOPIC] Strategies for memory deallocation/movement for Dynamic Capacity Pooling
@ 2026-03-30  7:59 Hannes Reinecke
  2026-04-13 15:43 ` Jonathan Cameron
  0 siblings, 1 reply; 4+ messages in thread
From: Hannes Reinecke @ 2026-03-30  7:59 UTC (permalink / raw)
  To: lsf-pc, linux-cxl, linux-fsdevel

Hi all,

during discussions with our partners about implementing dynamic
capacity devices (DCD) on CXL, the question came up whether we can
somehow 'steer' which memory pages to move.
The problem is that for dynamic capacity devices we have a certain
freedom in which memory pages to move or deallocate, so ideally there
would be a strategy for selecting those pages.
Should it be per application/cgroup?
Does it make sense to move individual pages from one application/cgroup,
or would it be better to move all pages from that application/cgroup?
Should we implement something (e.g. via madvise()) to allow applications
to influence the policy?
If so, what would that be?

So there are quite a few things to discuss; however, I'm not sure whether
this is too arcane a topic and should rather be directed at venues like
LPC. But I'll let the PC decide.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Strategies for memory deallocation/movement for Dynamic Capacity Pooling
  2026-03-30  7:59 [LSF/MM/BPF TOPIC] Strategies for memory deallocation/movement for Dynamic Capacity Pooling Hannes Reinecke
@ 2026-04-13 15:43 ` Jonathan Cameron
  2026-04-13 21:10   ` Gregory Price
  0 siblings, 1 reply; 4+ messages in thread
From: Jonathan Cameron @ 2026-04-13 15:43 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: lsf-pc, linux-cxl, linux-fsdevel

On Mon, 30 Mar 2026 09:59:56 +0200
Hannes Reinecke <hare@suse.de> wrote:

> Hi all,
> 
> during discussions with our partners about implementing dynamic
> capacity devices (DCD) on CXL, the question came up whether we can
> somehow 'steer' which memory pages to move.
> The problem is that for dynamic capacity devices we have a certain
> freedom in which memory pages to move or deallocate, so ideally there
> would be a strategy for selecting those pages.

Hi Hannes,

Can you talk through your use model a little bit more?
I'm guessing this is about untagged DCD being used in a virtio-mem
like way?  Hence you want to clear out a range of DPA space so that
you can do a partial release?

I may have completely missed what you are targeting, though, so an
example would be great.

> Should it be per application/cgroup?
> Does it make sense to move individual pages from one application/cgroup,
> or would it be better to move all pages from that application/cgroup?
> Should we implement something (e.g. via madvise()) to allow applications
> to influence the policy?
> If so, what would that be?
> 
> So there are quite a few things to discuss; however, I'm not sure whether
> this is too arcane a topic and should rather be directed at venues like
> LPC. But I'll let the PC decide.

Superficially feels a bit arcane, particularly as we are currently
kicking untagged memory into the long grass as there are too many
open questions on how to present it at all (e.g. related to Gregory's
recent work on private nodes).  On recent CXL sync calls the proposal
has been to do tagged memory first and only support allocation of
all memory with a given tag in one go and full release.

Anyhow, sounds like the sort of thing I'm always keen to discuss
but I'm not going to be at LSFMM this year.

Jonathan

> 
> Cheers,
> 
> Hannes


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Strategies for memory deallocation/movement for Dynamic Capacity Pooling
  2026-04-13 15:43 ` Jonathan Cameron
@ 2026-04-13 21:10   ` Gregory Price
  2026-04-14  7:08     ` Hannes Reinecke
  0 siblings, 1 reply; 4+ messages in thread
From: Gregory Price @ 2026-04-13 21:10 UTC (permalink / raw)
  To: Jonathan Cameron; +Cc: Hannes Reinecke, lsf-pc, linux-cxl, linux-fsdevel

On Mon, Apr 13, 2026 at 04:43:59PM +0100, Jonathan Cameron wrote:
> > 
> > So there are quite a few things to discuss; however, I'm not sure whether
> > this is too arcane a topic and should rather be directed at venues like
> > LPC. But I'll let the PC decide.
> 
> Superficially feels a bit arcane, particularly as we are currently
> kicking untagged memory into the long grass as there are too many
> open questions on how to present it at all (e.g. related to Gregory's
> recent work on private nodes).  On recent CXL sync calls the proposal
> has been to do tagged memory first and only support allocation of
> all memory with a given tag in one go and full release.
>

General consensus after the last few months seems to be:

"While technically possible, untagged memory is a bad idea for $REASONS"

I do not think the private node case changes this; if anything, it only
changes where the capacity ends up.

~Gregory

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Strategies for memory deallocation/movement for Dynamic Capacity Pooling
  2026-04-13 21:10   ` Gregory Price
@ 2026-04-14  7:08     ` Hannes Reinecke
  0 siblings, 0 replies; 4+ messages in thread
From: Hannes Reinecke @ 2026-04-14  7:08 UTC (permalink / raw)
  To: Gregory Price, Jonathan Cameron
  Cc: lsf-pc, linux-cxl, linux-fsdevel, linux-mm

On 4/13/26 23:10, Gregory Price wrote:
> On Mon, Apr 13, 2026 at 04:43:59PM +0100, Jonathan Cameron wrote:
>>>
>>> So there are quite a few things to discuss; however, I'm not sure whether
>>> this is too arcane a topic and should rather be directed at venues like
>>> LPC. But I'll let the PC decide.
>>
>> Superficially feels a bit arcane, particularly as we are currently
>> kicking untagged memory into the long grass as there are too many
>> open questions on how to present it at all (e.g. related to Gregory's
>> recent work on private nodes).  On recent CXL sync calls the proposal
>> has been to do tagged memory first and only support allocation of
>> all memory with a given tag in one go and full release.
>>
> 
> General consensus after the last few months seems to be:
> 
> "While technically possible, untagged memory is a bad idea for $REASONS"
> 
> I do not think the private node case changes this; if anything, it only
> changes where the capacity ends up.
> 
The thing is, there will be things like CXL switches, and with them we'll
get CXL memory behind the switch, making it possible to reshuffle memory
'behind the back' of the application.
While the situation is similar to the current memory hotplug case
(and, in fact, the mechanism on the host side will be the same, I guess),
the difference is that we now have a bit more flexibility.

The reason why one would want to reshuffle memory behind a CXL switch
is to deallocate memory from one machine and reassign it to another.
But as the request is just for 'memory' (not 'this particular CXL card
holding _that_ memory'), the admin gets to decide _which_ of the memory
areas assigned to machine A should be moved to machine B.
But how?

And that basically is the question: can we give the admin/orchestration
a better idea of which memory blocks should be preferred for
reassignment?
I'm sure there are applications with a pretty flexible memory allocation
strategy which, with some prodding, would be happy to relinquish memory.
But I'm equally sure there are applications which react extremely
allergically to memory being pulled out from underneath them.
And then there are 'modern' applications which also don't like that,
but for them it really doesn't matter, as one can simply restart them.

So it would be cool if we could address this, as then the admin/
orchestration could make a far better choice of which memory area to
reassign.
And it might even help in other scenarios (VM ballooning?), too.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-04-14  7:08 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-03-30  7:59 [LSF/MM/BPF TOPIC] Strategies for memory deallocation/movement for Dynamic Capacity Pooling Hannes Reinecke
2026-04-13 15:43 ` Jonathan Cameron
2026-04-13 21:10   ` Gregory Price
2026-04-14  7:08     ` Hannes Reinecke

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox