linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Hannes Reinecke <hare@suse.de>, Oscar Salvador <osalvador@suse.de>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Michal Hocko <mhocko@suse.com>, Hannes Reinecke <hare@kernel.org>
Subject: Re: [RFC] Disable auto_movable_ratio for selfhosted memmap
Date: Mon, 28 Jul 2025 11:42:58 +0200
Message-ID: <9e152d8d-4b39-4a6c-93be-694a28686c07@redhat.com>
In-Reply-To: <273a3376-c45a-4d41-85b4-9c4f3428f268@suse.de>

On 28.07.25 11:28, Hannes Reinecke wrote:
> On 7/28/25 10:44, David Hildenbrand wrote:
>> On 28.07.25 10:15, Oscar Salvador wrote:
>>> Hi,
> [ .. ]
>>>
>>> One way to tackle this would be update the ratio every time a new CXL
>>> card gets inserted, but this seems suboptimal.
>>> Another way is that since CXL memory works with selfhosted memmap, we
>>> could relax
>>> the check when 'auto-movable' and only look at the ratio if we aren't
>>> working with selfhosted memmap.
>>
>> The memmap is only a small piece of unmovable data we require late at
>> runtime (a bigger factor is user space page tables actually mapping that
>> memory). The zone ratio we have configured in the kernel dates back to
>> the highmem times, where such ratios were considered safe. Maybe there
>> are better defaults for the ratios today, but it really depends on the
>> workload.
>>
> Point is, the ratio is accounted for the _entire_ memory.
> Which means that you have to _know_ how much memory you are going to
> plug in prior to plugging that in.
> So to make that correct one would need to update the ratio prior to
> plug in one module, check if that succeeded, update the ratio, plug
> in the next module, check that, etc.

I am confused. We know how big a DIMM is at the time we plug it. I 
assume you are talking about CXL?

Can you describe what that workflow would look like with tools like daxctl?

(what is a "module"? A DIMM?)
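
To make the accounting under discussion concrete, here is a minimal sketch. 
It assumes the auto-movable policy admits new memory into ZONE_MOVABLE only 
while the MOVABLE pages stay within auto_movable_ratio percent of the 
kernel-zone pages. The function name and all sizes are made up for 
illustration, and the real check in the kernel has more details (memory 
groups, the per-node ratio) that this model ignores.

```python
def can_online_movable(kernel_pages, movable_pages, new_pages, ratio_percent=301):
    """Simplified model: True if onlining new_pages to ZONE_MOVABLE keeps
    MOVABLE:KERNEL within ratio_percent (301 is the kernel's default)."""
    return movable_pages + new_pages <= (ratio_percent * kernel_pages) // 100


PAGES_PER_GIB = (1 << 30) // 4096  # assumption: 4 KiB base pages

# Hypothetical machine: 64 GiB of boot memory, all of it in kernel zones.
kernel = 64 * PAGES_PER_GIB
movable = 0

# Plug two hypothetical 128 GiB devices, one after the other.
results = []
for _ in range(2):
    ok = can_online_movable(kernel, movable, 128 * PAGES_PER_GIB)
    results.append(ok)
    if ok:
        movable += 128 * PAGES_PER_GIB
```

In this model the first device fits into ZONE_MOVABLE and the second one 
does not, unless the ratio is raised before it is plugged.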

> 
>> One could find ways of subtracting the selfhosted part, to account it
>> differently in the kernel, but the memmap is not the only consumer that
>> affects the ratio.
>>
>> I mean, the memmap is roughly 1.6%, I don't think that really makes a
>> difference for you, does it? Can you share some real-life examples?
>>
>>
>> I have a colleague working on one of my old prototypes (memoryhotplugd)
>> for replacing udev rules.
>>
>> The idea there is, to detect that CXL memory is getting hotplugged and
>> keep it offline. Because user space hotplugging that memory (daxctl)
>> will explicitly online it to the proper zone.
>>
>> Things like virtio-mem, DIMMs etc can happily use the auto-movable
>> behavior. But the auto-movable behavior doesn't quite make sense if (a)
>> you want everything movable and (b) daxctl already expects to online the
>> memory itself, usually to ZONE_MOVABLE.
>>
>> So I think this is mostly a user-space problem to solve.
>>
> Hmm.
> Yes, and no.
> 
> While CXL memory is hotpluggable (it's a PCI device, after all),
> it won't be hotplugged on a regular basis.

I've been told that with dynamic memory pooling it is supposed to get 
much more dynamic.

> So the current use-case I'm aware of is that the system will be
> configured once, and then it will be expected to come up in the
> very same state after reboot.
> As such a daemon is a bit of an overkill, as the number of events
> it would need to listen to is in the very low single-digit range.

I am mostly concerned with all the use cases that existed before CXL (in 
particular, virtio-mem, standby memory on s390x, DIMMs) where you see 
memory hotplug way more frequently and would also want to handle things 
such as memory onlining failing in some environments more gracefully 
(e.g., by retrying).

What I realized is that
(1) udev rules are not a good fit for all use cases
(2) auto-onlining in the kernel is not a good fit for all use cases

The goal of the daemon will be to configure auto-onlining in the kernel 
where possible (e.g., only virtio-mem, only CXL), but fall back to manual 
onlining in case mixtures might be possible (CXL and virtio-mem etc). I 
expect the latter to be rare, but sometimes we can't make a fully 
reliable decision of what might get hotplugged in the future ...
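
As a rough illustration of that decision, here is a hypothetical sketch, not 
the actual daemon code. The function name, the source labels and the 
per-source choices are assumptions. Only the sysfs file and its accepted 
values (offline, online, online_kernel, online_movable) are existing kernel 
interface. The function only computes a plan and writes nothing.

```python
# Existing sysfs knob for the kernel's default onlining behavior.
AUTO_ONLINE_BLOCKS = "/sys/devices/system/memory/auto_online_blocks"


def plan_onlining(expected_sources):
    """Pick kernel auto-onlining when exactly one kind of hotplugged memory
    is expected; otherwise keep blocks offline and online them manually."""
    kinds = set(expected_sources)
    if len(kinds) != 1:
        # Mixture possible (or nothing known): user space decides per block.
        return {"auto_online_blocks": "offline", "manual": True}
    if kinds == {"cxl"}:
        # Assumption: CXL-only setups want everything in ZONE_MOVABLE.
        return {"auto_online_blocks": "online_movable", "manual": False}
    # Assumption: e.g. virtio-mem only; let the kernel's online policy
    # (such as auto-movable) select the zone.
    return {"auto_online_blocks": "online", "manual": False}
```

For example, plan_onlining(["cxl", "virtio-mem"]) selects "offline" plus 
manual onlining, while plan_onlining(["cxl"]) selects "online_movable".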

-- 
Cheers,

David / dhildenb



