All of lore.kernel.org
 help / color / mirror / Atom feed
From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Gregory Price <gourry@gourry.net>, linux-cxl@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com,
	dave@stgolabs.net, jonathan.cameron@huawei.com,
	dave.jiang@intel.com, alison.schofield@intel.com,
	vishal.l.verma@intel.com, ira.weiny@intel.com,
	dan.j.williams@intel.com
Subject: Re: [PATCH 2/6] cxl: add sysram_region memory controller
Date: Mon, 12 Jan 2026 21:00:54 +0100	[thread overview]
Message-ID: <3d5ccbb3-a083-4a5c-8c97-2db2adbc5446@kernel.org> (raw)
In-Reply-To: <20260112163514.2551809-3-gourry@gourry.net>

On 1/12/26 17:35, Gregory Price wrote:
> Add a sysram memctrl that directly hotplugs memory without needing to
> route through DAX.  This simplifies the sysram usecase considerably.
> 
> The sysram memctl adds new sysfs controls when registered:
> 	region/memctrl/[hotplug, hotunplug, state]
> 
> hotplug:   controller attempts to hotplug the memory region

Why disconnect the hotplug from the online state?

echo online_movable > hotplug ?

Then we can just have something like add_and_online_memory() in the core.

> hotunplug: controller attempts to offline and hotunplug the memory region
> state:     [online,online_normal,offline]
>     online       : controller onlines blocks in ZONE_MOVABLE

I don't like this incosistency regarding the remainder of common hotplug 
toggles.

We should use exactly the same values with exactly the same semantics. 
Yes, user-space tooling should be thaught to pass in online_movable :)

>     online_normal: controller onlines blocks in ZONE_NORMAL
>     offline      : controller attempts to offline the memory blocks

Why is that required? ideally we'd start with hotplug vs. hotunplug and 
leave manual onlining/offlining out of this interface for now.

> 
> Hotplug note - by default the controller will hotplug the blocks, but
> leave them offline (unless MHP auto-online in Kconfig is enabled).
> 
> Setting state to "online_normal" may prevent future hot-unplug of sysram
> regions, and unbinding a memory region with memory online in ZONE_NORMAL
> may result in the device being removed but the memory remaining online.
> 
> This can result in future management functions failing (such as adding a
> new region).  This is why "online_normal" is explicit, and the default
> online zone is ZONE_MOVABLE.
> 
> Cc: David Hildenbrand <david@kernel.org>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
>   drivers/cxl/core/core.h                  |   2 +
>   drivers/cxl/core/memctrl/Makefile        |   1 +
>   drivers/cxl/core/memctrl/memctrl.c       |   2 +
>   drivers/cxl/core/memctrl/sysram_region.c | 358 +++++++++++++++++++++++
>   drivers/cxl/core/region.c                |   5 +
>   drivers/cxl/cxl.h                        |   6 +-
>   6 files changed, 372 insertions(+), 2 deletions(-)
>   create mode 100644 drivers/cxl/core/memctrl/sysram_region.c
> 
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 1156a4bd0080..18cb84950500 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -31,6 +31,8 @@ int cxl_decoder_detach(struct cxl_region *cxlr,
>   		       struct cxl_endpoint_decoder *cxled, int pos,
>   		       enum cxl_detach_mode mode);
>   
> +int devm_cxl_add_sysram_region(struct cxl_region *cxlr);
> +
>   #define CXL_REGION_ATTR(x) (&dev_attr_##x.attr)
>   #define CXL_REGION_TYPE(x) (&cxl_region_type)
>   #define SET_CXL_REGION_ATTR(x) (&dev_attr_##x.attr),
> diff --git a/drivers/cxl/core/memctrl/Makefile b/drivers/cxl/core/memctrl/Makefile
> index 8165aad5a52a..1c52c7d75570 100644
> --- a/drivers/cxl/core/memctrl/Makefile
> +++ b/drivers/cxl/core/memctrl/Makefile
> @@ -2,3 +2,4 @@
>   
>   cxl_core-$(CONFIG_CXL_REGION) += memctrl/memctrl.o
>   cxl_core-$(CONFIG_CXL_REGION) += memctrl/dax_region.o
> +cxl_core-$(CONFIG_CXL_REGION) += memctrl/sysram_region.o
> diff --git a/drivers/cxl/core/memctrl/memctrl.c b/drivers/cxl/core/memctrl/memctrl.c
> index 24e0e14b39c7..40ffb59353bb 100644
> --- a/drivers/cxl/core/memctrl/memctrl.c
> +++ b/drivers/cxl/core/memctrl/memctrl.c
> @@ -34,6 +34,8 @@ int cxl_enable_memctrl(struct cxl_region *cxlr)
>   		return devm_cxl_add_dax_region(cxlr);
>   	case CXL_MEMCTRL_DAX:
>   		return devm_cxl_add_dax_region(cxlr);
> +	case CXL_MEMCTRL_SYSRAM:
> +		return devm_cxl_add_sysram_region(cxlr);
>   	default:
>   		return -EINVAL;
>   	}
> diff --git a/drivers/cxl/core/memctrl/sysram_region.c b/drivers/cxl/core/memctrl/sysram_region.c
> new file mode 100644
> index 000000000000..a7570c8a54e1
> --- /dev/null
> +++ b/drivers/cxl/core/memctrl/sysram_region.c
> @@ -0,0 +1,358 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2026 Meta Inc. All rights reserved. */
> +#include <linux/memremap.h>
> +#include <linux/memory.h>
> +#include <linux/module.h>
> +#include <linux/device.h>
> +#include <linux/slab.h>
> +#include <linux/mm.h>
> +#include <linux/memory-tiers.h>
> +#include <linux/memory_hotplug.h>
> +#include <linux/string_helpers.h>
> +#include <linux/sched/signal.h>
> +#include <cxlmem.h>
> +#include <cxl.h>
> +#include "../core.h"
> +
> +/* If HMAT was unavailable, assign a default distance. */
> +#define MEMTIER_DEFAULT_CXL_ADISTANCE	(MEMTIER_ADISTANCE_DRAM * 5)
> +
> +static const char *sysram_name = "System RAM (CXL)";
> +
> +struct cxl_sysram_data {
> +	const char *res_name;
> +	int mgid;
> +	struct resource *res;
> +};
> +
> +static DEFINE_MUTEX(cxl_memory_type_lock);
> +static LIST_HEAD(cxl_memory_types);
> +
> +static struct cxl_region *to_cxl_region(struct device *dev)
> +{
> +	if (dev->type != &cxl_region_type)
> +		return NULL;
> +	return container_of(dev, struct cxl_region, dev);
> +}
> +
> +static struct memory_dev_type *cxl_find_alloc_memory_type(int adist)
> +{
> +	guard(mutex)(&cxl_memory_type_lock);
> +	return mt_find_alloc_memory_type(adist, &cxl_memory_types);
> +}
> +
> +static void __maybe_unused cxl_put_memory_types(void)
> +{
> +	guard(mutex)(&cxl_memory_type_lock);
> +	mt_put_memory_types(&cxl_memory_types);
> +}
> +
> +static int cxl_sysram_range(struct cxl_region *cxlr, struct range *r)
> +{
> +	struct cxl_region_params *p = &cxlr->params;
> +
> +	if (!p->res)
> +		return -ENODEV;
> +
> +	/* memory-block align the hotplug range */
> +	r->start = ALIGN(p->res->start, memory_block_size_bytes());
> +	r->end = ALIGN_DOWN(p->res->end + 1, memory_block_size_bytes()) - 1;
> +	if (r->start >= r->end) {
> +		r->start = p->res->start;
> +		r->end = p->res->end;
> +		return -ENOSPC;
> +	}
> +	return 0;
> +}
> +
> +static ssize_t hotunplug_store(struct device *dev,
> +			       struct device_attribute *attr,
> +			       const char *buf, size_t len)
> +{
> +	struct cxl_region *cxlr = to_cxl_region(dev);
> +	struct range range;
> +	int rc;
> +
> +	if (!cxlr)
> +		return -ENODEV;
> +
> +	rc = cxl_sysram_range(cxlr, &range);
> +	if (rc)
> +		return rc;
> +
> +	rc = offline_and_remove_memory(range.start, range_len(&range));
> +
> +	if (rc)
> +		return rc;
> +
> +	return len;
> +}
> +static DEVICE_ATTR_WO(hotunplug);
> +
> +struct online_memory_cb_arg {
> +	int online_type;
> +	int rc;
> +};
> +
> +static int online_memory_block_cb(struct memory_block *mem, void *arg)
> +{
> +	struct online_memory_cb_arg *cb_arg = arg;
> +
> +	if (signal_pending(current))
> +		return -EINTR;
> +
> +	cond_resched();
> +
> +	if (mem->state == MEM_ONLINE)
> +		return 0;
> +
> +	mem->online_type = cb_arg->online_type;
> +	cb_arg->rc = device_online(&mem->dev);
> +
> +	return cb_arg->rc;
> +}
> +
> +static int offline_memory_block_cb(struct memory_block *mem, void *arg)
> +{
> +	int *rc = arg;
> +
> +	if (signal_pending(current))
> +		return -EINTR;
> +
> +	cond_resched();
> +
> +	if (mem->state == MEM_OFFLINE)
> +		return 0;
> +
> +	*rc = device_offline(&mem->dev);
> +
> +	return *rc;
> +}
> +
> +static ssize_t state_store(struct device *dev,
> +			   struct device_attribute *attr,
> +			   const char *buf, size_t len)
> +{
> +	struct cxl_region *cxlr = to_cxl_region(dev);
> +	struct online_memory_cb_arg cb_arg;
> +	struct range range;
> +	int rc;
> +
> +	if (!cxlr)
> +		return -ENODEV;
> +
> +	rc = cxl_sysram_range(cxlr, &range);
> +	if (rc)
> +		return rc;
> +
> +	rc = lock_device_hotplug_sysfs();
> +	if (rc)
> +		return rc;
> +
> +	if (sysfs_streq(buf, "online")) {
> +		cb_arg.online_type = MMOP_ONLINE_MOVABLE;
> +		cb_arg.rc = 0;
> +		rc = walk_memory_blocks(range.start, range_len(&range),
> +					&cb_arg, online_memory_block_cb);
> +		if (!rc)
> +			rc = cb_arg.rc;
> +	} else if (sysfs_streq(buf, "online_normal")) {
> +		cb_arg.online_type = MMOP_ONLINE;
> +		cb_arg.rc = 0;
> +		rc = walk_memory_blocks(range.start, range_len(&range),
> +					&cb_arg, online_memory_block_cb);
> +		if (!rc)
> +			rc = cb_arg.rc;
> +	} else if (sysfs_streq(buf, "offline")) {
> +		int offline_rc = 0;
> +
> +		rc = walk_memory_blocks(range.start, range_len(&range),
> +					&offline_rc, offline_memory_block_cb);
> +		if (!rc)
> +			rc = offline_rc;

Let's expose this functionality through some common-code helpers. I 
really don't want more code doing this non-obvious device_offline() etc 
dance.

walk_memory_blocks() should become a core-mm helper. Maybe we can also 
cleanup drivers/acpi/acpi_memhotplug.c in that regard.

Hopefully we can then also reuse these helpers in ppc code (see 
dlpar_add_lmb() and dlpar_remove_lmb() that do something similar, but 
grab the device hotplug lock themselves as they want to perform some 
additional operations).

-- 
Cheers

David

  reply	other threads:[~2026-01-12 20:01 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20260113093758epcas5p10cc9749a657b8e4d32db75b8b973b67d@epcas5p1.samsung.com>
2026-01-12 16:35 ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Gregory Price
2026-01-12 16:35   ` [PATCH 1/6] drivers/cxl: add cxl_memctrl_mode and region->memctrl Gregory Price
2026-01-12 20:59     ` dan.j.williams
2026-01-12 22:25       ` Gregory Price
2026-01-13 18:00       ` Dave Jiang
2026-01-13 20:07         ` Gregory Price
2026-01-14 16:36         ` dan.j.williams
2026-01-12 21:10     ` Cheatham, Benjamin
2026-01-12 22:34       ` Gregory Price
2026-01-14 17:18     ` Jonathan Cameron
2026-01-14 18:25       ` Gregory Price
2026-01-14 18:36         ` Jonathan Cameron
2026-01-12 16:35   ` [PATCH 2/6] cxl: add sysram_region memory controller Gregory Price
2026-01-12 20:00     ` David Hildenbrand (Red Hat) [this message]
2026-01-12 22:43       ` Gregory Price
2026-01-12 21:10     ` dan.j.williams
2026-01-12 22:47       ` Gregory Price
2026-01-12 21:10     ` Cheatham, Benjamin
2026-01-12 22:55       ` Gregory Price
2026-01-13 22:34         ` Cheatham, Benjamin
2026-01-12 16:35   ` [PATCH 3/6] cxl/core/region: move pmem memctrl logic into memctrl/pmem_region Gregory Price
2026-01-12 21:10     ` Cheatham, Benjamin
2026-01-12 22:58       ` Gregory Price
2026-01-13  9:12         ` Neeraj Kumar
2026-01-12 16:35   ` [PATCH 4/6] cxl: add CONFIG_CXL_REGION_CTRL_AUTO_* build config options Gregory Price
2026-01-12 21:10     ` Cheatham, Benjamin
2026-01-12 23:05       ` Gregory Price
2026-01-13  4:31         ` dan.j.williams
2026-01-13 13:55           ` Gregory Price
2026-01-12 16:35   ` [PATCH 5/6] cxl: add CXL_REGION_SYSRAM_DEFAULT_* build options Gregory Price
2026-01-12 21:11     ` Cheatham, Benjamin
2026-01-12 23:07       ` Gregory Price
2026-01-12 16:35   ` [PATCH 6/6] cxl/sysram: disallow onlining in ZONE_NORMAL if state is movable only Gregory Price
2026-01-12 21:11     ` Cheatham, Benjamin
2026-01-12 23:14       ` Gregory Price
2026-01-13 22:35         ` Cheatham, Benjamin
2026-01-13  9:37   ` [PATCH 0/6] CXL: Introduce memory controller abstraction and sysram controller Neeraj Kumar
2026-01-13 13:33     ` Gregory Price
2026-01-15 18:43   ` Alejandro Lucero Palau
2026-01-15 18:56     ` Gregory Price

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3d5ccbb3-a083-4a5c-8c97-2db2adbc5446@kernel.org \
    --to=david@kernel.org \
    --cc=alison.schofield@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=gourry@gourry.net \
    --cc=ira.weiny@intel.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=kernel-team@meta.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.