From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Gregory Price <gourry@gourry.net>,
	linux-mm@kvack.org, nvdimm@lists.linux.dev
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com,
	linux-cxl@vger.kernel.org, linux-kselftest@vger.kernel.org,
	djbw@kernel.org, vishal.l.verma@intel.com, dave.jiang@intel.com,
	akpm@linux-foundation.org, ljs@kernel.org, liam@infradead.org,
	vbabka@kernel.org, rppt@kernel.org, surenb@google.com,
	mhocko@suse.com, osalvador@suse.de, shuah@kernel.org,
	alison.schofield@intel.com,
	Smita.KoralahalliChannabasappa@amd.com, ira.weiny@intel.com,
	apopple@nvidia.com, Hannes Reinecke <hare@suse.de>
Subject: Re: [PATCH v4 8/9] dax/kmem: add sysfs interface for atomic hotplug
Date: Tue, 9 Jun 2026 12:26:07 +0200
Message-ID: <1cb0514b-c753-411e-8ff8-80fa29837441@kernel.org>
In-Reply-To: <20260605211911.2160954-9-gourry@gourry.net>

On 6/5/26 23:19, Gregory Price wrote:
> The dax kmem driver currently onlines memory automatically during
> probe using the system's default online policy but provides no way
> to control or query the entire region state at runtime.
> 
> Additionally, there is no atomic mechanism to offline and remove
> the entire set of memory blocks together.  Instead, this is presently
> done in two steps: (offline all, remove all).  This creates a race
> condition where external entities can operate directly on the blocks
> and cause hot-unplug to fail.
> 
> Add a new 'hotplug' sysfs attribute that allows userspace to control
> and query the entire memory region state.  The writable states mirror
> the per-block /sys/devices/system/memory/memoryX/state ABI:
>   - "unplugged": memory blocks are not present
>   - "online": memory is online, zone chosen by the kernel
>   - "online_kernel": memory is online in ZONE_NORMAL
>   - "online_movable": memory is online in ZONE_MOVABLE
> 
> The "unplugged" state is new and only applies to kmem/hotplug.
> 
> Valid transitions:
>   - unplugged                               -> online[_kernel|_movable]
>   - online | online_kernel | online_movable -> unplugged
>   - offline                                 -> unplugged
> 
> A device can only be onlined from "unplugged", so it must be returned
> there before being onlined into a different state.
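
The transition rules quoted above can be modelled as a small predicate. What
follows is only a userspace sketch under stated assumptions, not code from the
patch: enum region_state, region_state_is_online() and
region_transition_valid() are hypothetical names, and the enum values do not
correspond to the kernel's MMOP_* or DAX_KMEM_UNPLUGGED constants. It also
assumes that unplugged -> unplugged is rejected, since that transition is not
in the list; the actual driver may treat it differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the five reported states. */
enum region_state {
	REGION_UNPLUGGED,
	REGION_OFFLINE,
	REGION_ONLINE,
	REGION_ONLINE_KERNEL,
	REGION_ONLINE_MOVABLE,
	REGION_STATE_COUNT,
};

/* Keep the model in sync with the five states named in the commit message. */
static_assert(REGION_STATE_COUNT == 5, "unexpected number of region states");

static bool region_state_is_online(enum region_state s)
{
	return s == REGION_ONLINE ||
	       s == REGION_ONLINE_KERNEL ||
	       s == REGION_ONLINE_MOVABLE;
}

/* Return true if writing 'to' is allowed while the region is in 'from'. */
static bool region_transition_valid(enum region_state from,
				    enum region_state to)
{
	/* "offline" is reportable but never writable. */
	if (to == REGION_OFFLINE)
		return false;

	/* Online and offline regions may go back to unplugged. */
	if (to == REGION_UNPLUGGED)
		return from != REGION_UNPLUGGED; /* assumption: no self-transition */

	/* Onlining is only allowed from unplugged. */
	return region_state_is_online(to) && from == REGION_UNPLUGGED;
}
```

With this model, switching an online region to a different online type is
refused until it has been unplugged, and an "offline" region can only move to
"unplugged".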
> 
> For backwards compatibility the memory blocks are always created at
> probe: existing tools expect them to be present once the kmem driver
> binds.  When the configured policy (mhp_get_default_online_type())
> selects an online state the blocks are onlined into that policy's zone;
> when the policy is offline the blocks are created but left offline and
> the device reports the state "offline".
> 
> "offline" is therefore a reportable state but is not writable: it only
> arises from the legacy auto_online_blocks=offline policy. Onlining such
> a device through this attribute requires unplugging it first.
> 
> The "offline" state may be deprecated later if the memory block ABI
> changes and userland migrates to using the region-wide hotplug.
> 
> Unplug is atomic across the whole device: dax_kmem_do_hotremove()
> collects every added range and offlines/removes them in one operation
> via offline_and_remove_memory_ranges().  Either all ranges are removed
> and the device becomes "unplugged", or offlining is rolled back and the
> device is left fully online, so the reported 'hotplug' state always
> matches reality.
> 
> Unbind Note:
>   We used to call remove_memory() during unbind, which would fire a
>   BUG() if any of the memory blocks were online at that time.  We lift
>   this into a WARN in the cleanup routine and don't attempt hotremove
>   if ->state is not DAX_KMEM_UNPLUGGED or MMOP_OFFLINE.  Memory that is
>   merely offline (the legacy "offline" state) is removed on unbind as
>   before; only online memory is left pinned.
> 
>   The resources are still leaked but this prevents deadlock on unbind
>   if a memory region happens to be impossible to hotremove.
> 
> Inconsistency Note:
> 
>   Since memory blocks can still be modified individually, the hotplug
>   attribute can become out of sync with the state of the system if
>   userland software mixes and matches the use of memory_block ABI and
>   kmem/hotplug ABI.  It is suggested to use one or the other.
> 
> Suggested-by: Hannes Reinecke <hare@suse.de>
> Suggested-by: David Hildenbrand <david@kernel.org>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---

[...]

>  
> +static int dax_kmem_parse_state(const char *buf)
> +{
> +	if (sysfs_streq(buf, "unplugged"))
> +		return DAX_KMEM_UNPLUGGED;
> +	if (sysfs_streq(buf, "online"))
> +		return MMOP_ONLINE;
> +	if (sysfs_streq(buf, "online_kernel"))
> +		return MMOP_ONLINE_KERNEL;
> +	if (sysfs_streq(buf, "online_movable"))
> +		return MMOP_ONLINE_MOVABLE;
> +	return -EINVAL;

Should we try making use of mhp_online_type_from_str()/online_type_to_str()
[possibly a nicer exported function for the latter] to avoid duplicating this ...

> +}
> +
> +static ssize_t hotplug_show(struct device *dev,
> +			    struct device_attribute *attr, char *buf)
> +{
> +	struct dax_kmem_data *data = dev_get_drvdata(dev);
> +	const char *state_str;
> +
> +	if (!data)
> +		return -ENXIO;
> +
> +	switch (data->state) {
> +	case DAX_KMEM_UNPLUGGED:
> +		state_str = "unplugged";
> +		break;
> +	case MMOP_OFFLINE:
> +		state_str = "offline";
> +		break;
> +	case MMOP_ONLINE:
> +		state_str = "online";
> +		break;
> +	case MMOP_ONLINE_KERNEL:
> +		state_str = "online_kernel";
> +		break;
> +	case MMOP_ONLINE_MOVABLE:
> +		state_str = "online_movable";
> +		break;

...

and this?
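
To illustrate the kind of de-duplication meant here, a single name table can
serve both the parse and the show direction. This is a userspace sketch only:
it does not call mhp_online_type_from_str() or the kernel's string table, and
enum hp_state, hp_state_names, hp_state_to_str() and hp_state_from_str() are
hypothetical names invented for the example. The trailing-newline handling
loosely mimics what sysfs_streq() tolerates. Note that the lookup also
recognizes "offline", so a store handler built on it would still need to
reject that value separately.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state enum; values do not match the kernel's constants. */
enum hp_state {
	HP_UNPLUGGED,
	HP_OFFLINE,
	HP_ONLINE,
	HP_ONLINE_KERNEL,
	HP_ONLINE_MOVABLE,
	HP_STATE_COUNT,
};

/* One table used for both string -> state and state -> string. */
static const char *const hp_state_names[] = {
	[HP_UNPLUGGED]      = "unplugged",
	[HP_OFFLINE]        = "offline",
	[HP_ONLINE]         = "online",
	[HP_ONLINE_KERNEL]  = "online_kernel",
	[HP_ONLINE_MOVABLE] = "online_movable",
};

static_assert(sizeof(hp_state_names) / sizeof(hp_state_names[0]) ==
	      HP_STATE_COUNT, "name table out of sync with enum hp_state");

/* Return the name for 'state', or NULL if it is out of range. */
static const char *hp_state_to_str(int state)
{
	if (state < 0 || state >= HP_STATE_COUNT)
		return NULL;
	return hp_state_names[state];
}

/* Return the state named by 'buf', or -EINVAL if it is not recognized. */
static int hp_state_from_str(const char *buf)
{
	size_t len = strlen(buf);
	int i;

	/* Ignore one trailing newline, as a sysfs write usually carries one. */
	if (len > 0 && buf[len - 1] == '\n')
		len--;

	for (i = 0; i < HP_STATE_COUNT; i++) {
		const char *name = hp_state_names[i];

		/* Compare full length so "online" does not match "online_kernel". */
		if (strlen(name) == len && strncmp(buf, name, len) == 0)
			return i;
	}
	return -EINVAL;
}
```

Adding a state then means touching the enum and the table only; the
static_assert catches a missing table entry at build time.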


[sorry if we already discussed this]

-- 
Cheers,

David
