Linux CXL
 help / color / mirror / Atom feed
From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Gregory Price <gourry@gourry.net>, linux-cxl@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com,
	dave@stgolabs.net, jonathan.cameron@huawei.com,
	dave.jiang@intel.com, alison.schofield@intel.com,
	vishal.l.verma@intel.com, ira.weiny@intel.com,
	dan.j.williams@intel.com
Subject: Re: [PATCH 2/6] cxl: add sysram_region memory controller
Date: Mon, 12 Jan 2026 21:00:54 +0100	[thread overview]
Message-ID: <3d5ccbb3-a083-4a5c-8c97-2db2adbc5446@kernel.org> (raw)
In-Reply-To: <20260112163514.2551809-3-gourry@gourry.net>

On 1/12/26 17:35, Gregory Price wrote:
> Add a sysram memctrl that directly hotplugs memory without needing to
> route through DAX.  This simplifies the sysram usecase considerably.
> 
> The sysram memctl adds new sysfs controls when registered:
> 	region/memctrl/[hotplug, hotunplug, state]
> 
> hotplug:   controller attempts to hotplug the memory region

Why disconnect the hotplug from the online state?

echo online_movable > hotplug ?

Then we can just have something like add_and_online_memory() in the core.

> hotunplug: controller attempts to offline and hotunplug the memory region
> state:     [online,online_normal,offline]
>     online       : controller onlines blocks in ZONE_MOVABLE

I don't like this incosistency regarding the remainder of common hotplug 
toggles.

We should use exactly the same values with exactly the same semantics. 
Yes, user-space tooling should be thaught to pass in online_movable :)

>     online_normal: controller onlines blocks in ZONE_NORMAL
>     offline      : controller attempts to offline the memory blocks

Why is that required? ideally we'd start with hotplug vs. hotunplug and 
leave manual onlining/offlining out of this interface for now.

> 
> Hotplug note - by default the controller will hotplug the blocks, but
> leave them offline (unless MHP auto-online in Kconfig is enabled).
> 
> Setting state to "online_normal" may prevent future hot-unplug of sysram
> regions, and unbinding a memory region with memory online in ZONE_NORMAL
> may result in the device being removed but the memory remaining online.
> 
> This can result in future management functions failing (such as adding a
> new region).  This is why "online_normal" is explicit, and the default
> online zone is ZONE_MOVABLE.
> 
> Cc: David Hildenbrand <david@kernel.org>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
>   drivers/cxl/core/core.h                  |   2 +
>   drivers/cxl/core/memctrl/Makefile        |   1 +
>   drivers/cxl/core/memctrl/memctrl.c       |   2 +
>   drivers/cxl/core/memctrl/sysram_region.c | 358 +++++++++++++++++++++++
>   drivers/cxl/core/region.c                |   5 +
>   drivers/cxl/cxl.h                        |   6 +-
>   6 files changed, 372 insertions(+), 2 deletions(-)
>   create mode 100644 drivers/cxl/core/memctrl/sysram_region.c
> 
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 1156a4bd0080..18cb84950500 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -31,6 +31,8 @@ int cxl_decoder_detach(struct cxl_region *cxlr,
>   		       struct cxl_endpoint_decoder *cxled, int pos,
>   		       enum cxl_detach_mode mode);
>   
> +int devm_cxl_add_sysram_region(struct cxl_region *cxlr);
> +
>   #define CXL_REGION_ATTR(x) (&dev_attr_##x.attr)
>   #define CXL_REGION_TYPE(x) (&cxl_region_type)
>   #define SET_CXL_REGION_ATTR(x) (&dev_attr_##x.attr),
> diff --git a/drivers/cxl/core/memctrl/Makefile b/drivers/cxl/core/memctrl/Makefile
> index 8165aad5a52a..1c52c7d75570 100644
> --- a/drivers/cxl/core/memctrl/Makefile
> +++ b/drivers/cxl/core/memctrl/Makefile
> @@ -2,3 +2,4 @@
>   
>   cxl_core-$(CONFIG_CXL_REGION) += memctrl/memctrl.o
>   cxl_core-$(CONFIG_CXL_REGION) += memctrl/dax_region.o
> +cxl_core-$(CONFIG_CXL_REGION) += memctrl/sysram_region.o
> diff --git a/drivers/cxl/core/memctrl/memctrl.c b/drivers/cxl/core/memctrl/memctrl.c
> index 24e0e14b39c7..40ffb59353bb 100644
> --- a/drivers/cxl/core/memctrl/memctrl.c
> +++ b/drivers/cxl/core/memctrl/memctrl.c
> @@ -34,6 +34,8 @@ int cxl_enable_memctrl(struct cxl_region *cxlr)
>   		return devm_cxl_add_dax_region(cxlr);
>   	case CXL_MEMCTRL_DAX:
>   		return devm_cxl_add_dax_region(cxlr);
> +	case CXL_MEMCTRL_SYSRAM:
> +		return devm_cxl_add_sysram_region(cxlr);
>   	default:
>   		return -EINVAL;
>   	}
> diff --git a/drivers/cxl/core/memctrl/sysram_region.c b/drivers/cxl/core/memctrl/sysram_region.c
> new file mode 100644
> index 000000000000..a7570c8a54e1
> --- /dev/null
> +++ b/drivers/cxl/core/memctrl/sysram_region.c
> @@ -0,0 +1,358 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2026 Meta Inc. All rights reserved. */
> +#include <linux/memremap.h>
> +#include <linux/memory.h>
> +#include <linux/module.h>
> +#include <linux/device.h>
> +#include <linux/slab.h>
> +#include <linux/mm.h>
> +#include <linux/memory-tiers.h>
> +#include <linux/memory_hotplug.h>
> +#include <linux/string_helpers.h>
> +#include <linux/sched/signal.h>
> +#include <cxlmem.h>
> +#include <cxl.h>
> +#include "../core.h"
> +
> +/* If HMAT was unavailable, assign a default distance. */
> +#define MEMTIER_DEFAULT_CXL_ADISTANCE	(MEMTIER_ADISTANCE_DRAM * 5)
> +
> +static const char *sysram_name = "System RAM (CXL)";
> +
> +struct cxl_sysram_data {
> +	const char *res_name;
> +	int mgid;
> +	struct resource *res;
> +};
> +
> +static DEFINE_MUTEX(cxl_memory_type_lock);
> +static LIST_HEAD(cxl_memory_types);
> +
> +static struct cxl_region *to_cxl_region(struct device *dev)
> +{
> +	if (dev->type != &cxl_region_type)
> +		return NULL;
> +	return container_of(dev, struct cxl_region, dev);
> +}
> +
> +static struct memory_dev_type *cxl_find_alloc_memory_type(int adist)
> +{
> +	guard(mutex)(&cxl_memory_type_lock);
> +	return mt_find_alloc_memory_type(adist, &cxl_memory_types);
> +}
> +
> +static void __maybe_unused cxl_put_memory_types(void)
> +{
> +	guard(mutex)(&cxl_memory_type_lock);
> +	mt_put_memory_types(&cxl_memory_types);
> +}
> +
> +static int cxl_sysram_range(struct cxl_region *cxlr, struct range *r)
> +{
> +	struct cxl_region_params *p = &cxlr->params;
> +
> +	if (!p->res)
> +		return -ENODEV;
> +
> +	/* memory-block align the hotplug range */
> +	r->start = ALIGN(p->res->start, memory_block_size_bytes());
> +	r->end = ALIGN_DOWN(p->res->end + 1, memory_block_size_bytes()) - 1;
> +	if (r->start >= r->end) {
> +		r->start = p->res->start;
> +		r->end = p->res->end;
> +		return -ENOSPC;
> +	}
> +	return 0;
> +}
> +
> +static ssize_t hotunplug_store(struct device *dev,
> +			       struct device_attribute *attr,
> +			       const char *buf, size_t len)
> +{
> +	struct cxl_region *cxlr = to_cxl_region(dev);
> +	struct range range;
> +	int rc;
> +
> +	if (!cxlr)
> +		return -ENODEV;
> +
> +	rc = cxl_sysram_range(cxlr, &range);
> +	if (rc)
> +		return rc;
> +
> +	rc = offline_and_remove_memory(range.start, range_len(&range));
> +
> +	if (rc)
> +		return rc;
> +
> +	return len;
> +}
> +static DEVICE_ATTR_WO(hotunplug);
> +
> +struct online_memory_cb_arg {
> +	int online_type;
> +	int rc;
> +};
> +
> +static int online_memory_block_cb(struct memory_block *mem, void *arg)
> +{
> +	struct online_memory_cb_arg *cb_arg = arg;
> +
> +	if (signal_pending(current))
> +		return -EINTR;
> +
> +	cond_resched();
> +
> +	if (mem->state == MEM_ONLINE)
> +		return 0;
> +
> +	mem->online_type = cb_arg->online_type;
> +	cb_arg->rc = device_online(&mem->dev);
> +
> +	return cb_arg->rc;
> +}
> +
> +static int offline_memory_block_cb(struct memory_block *mem, void *arg)
> +{
> +	int *rc = arg;
> +
> +	if (signal_pending(current))
> +		return -EINTR;
> +
> +	cond_resched();
> +
> +	if (mem->state == MEM_OFFLINE)
> +		return 0;
> +
> +	*rc = device_offline(&mem->dev);
> +
> +	return *rc;
> +}
> +
> +static ssize_t state_store(struct device *dev,
> +			   struct device_attribute *attr,
> +			   const char *buf, size_t len)
> +{
> +	struct cxl_region *cxlr = to_cxl_region(dev);
> +	struct online_memory_cb_arg cb_arg;
> +	struct range range;
> +	int rc;
> +
> +	if (!cxlr)
> +		return -ENODEV;
> +
> +	rc = cxl_sysram_range(cxlr, &range);
> +	if (rc)
> +		return rc;
> +
> +	rc = lock_device_hotplug_sysfs();
> +	if (rc)
> +		return rc;
> +
> +	if (sysfs_streq(buf, "online")) {
> +		cb_arg.online_type = MMOP_ONLINE_MOVABLE;
> +		cb_arg.rc = 0;
> +		rc = walk_memory_blocks(range.start, range_len(&range),
> +					&cb_arg, online_memory_block_cb);
> +		if (!rc)
> +			rc = cb_arg.rc;
> +	} else if (sysfs_streq(buf, "online_normal")) {
> +		cb_arg.online_type = MMOP_ONLINE;
> +		cb_arg.rc = 0;
> +		rc = walk_memory_blocks(range.start, range_len(&range),
> +					&cb_arg, online_memory_block_cb);
> +		if (!rc)
> +			rc = cb_arg.rc;
> +	} else if (sysfs_streq(buf, "offline")) {
> +		int offline_rc = 0;
> +
> +		rc = walk_memory_blocks(range.start, range_len(&range),
> +					&offline_rc, offline_memory_block_cb);
> +		if (!rc)
> +			rc = offline_rc;

Let's expose this functionality through some common-code helpers. I 
really don't want more code doing this non-obvious device_offline() etc 
dance.

walk_memory_blocks() should become a core-mm helper. Maybe we can also 
cleanup drivers/acpi/acpi_memhotplug.c in that regard.

Hopefully we can then also reuse these helpers in ppc code (see 
dlpar_add_lmb() and dlpar_remove_lmb() that do something similar, but 
grab the device hotplug lock themselves as they want to perform some 
additional operations).

-- 
Cheers

David

  reply	other threads:[~2026-01-12 20:01 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20260113093758epcas5p10cc9749a657b8e4d32db75b8b973b67d@epcas5p1.samsung.com>
2026-01-12 16:35 ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Gregory Price
2026-01-12 16:35   ` [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl Gregory Price
2026-01-12 20:59     ` dan.j.williams
2026-01-12 22:25       ` Gregory Price
2026-01-13 18:00       ` Dave Jiang
2026-01-13 20:07         ` Gregory Price
2026-01-14 16:36         ` dan.j.williams
2026-01-12 21:10     ` Cheatham, Benjamin
2026-01-12 22:34       ` Gregory Price
2026-01-14 17:18     ` Jonathan Cameron
2026-01-14 18:25       ` Gregory Price
2026-01-14 18:36         ` Jonathan Cameron
2026-01-12 16:35   ` [PATCH 2/6] cxl: add sysram_region memory controller Gregory Price
2026-01-12 20:00     ` David Hildenbrand (Red Hat) [this message]
2026-01-12 22:43       ` Gregory Price
2026-01-12 21:10     ` dan.j.williams
2026-01-12 22:47       ` Gregory Price
2026-01-12 21:10     ` Cheatham, Benjamin
2026-01-12 22:55       ` Gregory Price
2026-01-13 22:34         ` Cheatham, Benjamin
2026-01-12 16:35   ` [PATCH 3/6] cxl/core/region: move pmem memctrl logic into memctrl/pmem_region Gregory Price
2026-01-12 21:10     ` Cheatham, Benjamin
2026-01-12 22:58       ` Gregory Price
2026-01-13  9:12         ` Neeraj Kumar
2026-01-12 16:35   ` [PATCH 4/6] cxl: add CONFIG_CXL_REGION_CTRL_AUTO_* build config options Gregory Price
2026-01-12 21:10     ` Cheatham, Benjamin
2026-01-12 23:05       ` Gregory Price
2026-01-13  4:31         ` dan.j.williams
2026-01-13 13:55           ` Gregory Price
2026-01-12 16:35   ` [PATCH 5/6] cxl: add CXL_REGION_SYSRAM_DEFAULT_* build options Gregory Price
2026-01-12 21:11     ` Cheatham, Benjamin
2026-01-12 23:07       ` Gregory Price
2026-01-12 16:35   ` [PATCH 6/6] cxl/sysram: disallow onlining in ZONE_NORMAL if state is movable only Gregory Price
2026-01-12 21:11     ` Cheatham, Benjamin
2026-01-12 23:14       ` Gregory Price
2026-01-13 22:35         ` Cheatham, Benjamin
2026-01-13  9:37   ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Neeraj Kumar
2026-01-13 13:33     ` Gregory Price
2026-01-15 18:43   ` Alejandro Lucero Palau
2026-01-15 18:56     ` Gregory Price

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3d5ccbb3-a083-4a5c-8c97-2db2adbc5446@kernel.org \
    --to=david@kernel.org \
    --cc=alison.schofield@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=gourry@gourry.net \
    --cc=ira.weiny@intel.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=kernel-team@meta.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox