From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Ben Widawsky <ben.widawsky@intel.com>
Cc: <linux-cxl@vger.kernel.org>,
Alison Schofield <alison.schofield@intel.com>,
Dan Williams <dan.j.williams@intel.com>,
"Ira Weiny" <ira.weiny@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>
Subject: Re: [PATCH 1/6] cxl/region: Add region creation ABI
Date: Fri, 18 Jun 2021 10:13:11 +0100 [thread overview]
Message-ID: <20210618101311.00005f78@Huawei.com> (raw)
In-Reply-To: <20210617173655.430424-2-ben.widawsky@intel.com>
On Thu, 17 Jun 2021 10:36:50 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:
> Regions are created as a child of the decoder that encompasses an
> address space with constraints. Regions only exist for persistent
> capacities.
>
> When regions are created, the number of desired interleave ways must be
> known. To enable this, the sysfs attribute will take the desired ways as
> input. This interface intentionally allows creation of
> impossible-to-enable regions based on interleave constraints in the
> topology. The reasoning is to create new regions through the kernel
> interfaces which may become possible on reboot under a variety of
> circumstances.
>
> As an example, creating a x1 region with:
> echo 1 > /sys/bus/cxl/devices/decoder1.0/create_region
>
> Will yield /sys/bus/cxl/devices/decoder1.0/region1.0:0
>
> That region may then be deleted with:
> echo region1.0:0 > /sys/bus/cxl/devices/decoder1.0/delete_region
>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Hi Ben,
Some comments inline. The sysfs interface is getting a little
clever for my liking....
> ---
> Documentation/ABI/testing/sysfs-bus-cxl | 21 +++
> .../driver-api/cxl/memory-devices.rst | 11 ++
> drivers/cxl/Makefile | 3 +-
> drivers/cxl/core.c | 71 +++++++++
> drivers/cxl/cxl.h | 11 ++
> drivers/cxl/region.c | 147 ++++++++++++++++++
> drivers/cxl/region.h | 43 +++++
> 7 files changed, 306 insertions(+), 1 deletion(-)
> create mode 100644 drivers/cxl/region.c
> create mode 100644 drivers/cxl/region.h
>
> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> index 0b6a2e6e8fbb..115a25d2899d 100644
> --- a/Documentation/ABI/testing/sysfs-bus-cxl
> +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> @@ -127,3 +127,24 @@ Description:
> memory (type-3). The 'target_type' attribute indicates the
> current setting which may dynamically change based on what
> memory regions are activated in this decode hierarchy.
> +
> +What: /sys/bus/cxl/devices/decoderX.Y/create_region
> +Date: June, 2021
> +KernelVersion: v5.14
> +Contact: linux-cxl@vger.kernel.org
> +Description:
> + Creates a new CXL region of N interleaved ways. Writing a value
> + of '2' will create a new uninitialized region with 2x interleave
> + that will be mapped by the CXL decoderX.Y. Reading from this
> + node will return the last created region. Regions must be
> + subsequently configured and bound to a region driver before they
> + can be used.
I don't like sysfs attributes that return entirely different looking (and seemingly
unrelated) values from what you write to them.
I could sort of get behind writing 1 and returning the region (as in the RFC)
on basis that felt like 'create' and 'what did I create'.
Doing '2' and getting back anything other than '2' feels wrong
Maybe, would feel better if the name made it more explicit that the thing took
interleave values?
create_region_with_interleave perhaps?
Not ideal. Sysfs always a bit clunky for this stuff and configfs would mean
a split interface which is never ideal.
It's a bit horrible, but maybe a more intuitive interface would be to make the
targetX visibility dynamic and hence make interleave an attribute of the region?
sysfs_update_group() is rarely used, but I think the intent is to support
this sort of case.
I've not fully thought through the effects of that though so might
well be missing something.
Also, add to the docs what can be expected if there is no 'last created'
or it's been removed again.
> +
> +What: /sys/bus/cxl/devices/decoderX.Y/delete_region
> +Date: June, 2021
> +KernelVersion: v5.14
> +Contact: linux-cxl@vger.kernel.org
> +Description:
> + Deletes the named region. A region must be unbound from the
> + region driver before being deleted. The attributes expects a
> + region in the form "regionX.Y:Z".
> diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
> index 487ce4f41d77..c7f59a8c94db 100644
> --- a/Documentation/driver-api/cxl/memory-devices.rst
> +++ b/Documentation/driver-api/cxl/memory-devices.rst
> @@ -39,6 +39,17 @@ CXL Core
> .. kernel-doc:: drivers/cxl/core.c
> :doc: cxl core
>
> +CXL Regions
> +-----------
> +.. kernel-doc:: drivers/cxl/region.c
> + :doc: cxl region
> +
> +.. kernel-doc:: drivers/cxl/region.h
> + :identifiers:
> +
> +.. kernel-doc:: drivers/cxl/region.c
> + :identifiers:
> +
> External Interfaces
> ===================
>
> diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> index 32954059b37b..c3151198c041 100644
> --- a/drivers/cxl/Makefile
> +++ b/drivers/cxl/Makefile
> @@ -1,6 +1,6 @@
> # SPDX-License-Identifier: GPL-2.0
> obj-$(CONFIG_CXL_BUS) += cxl_core.o
> -obj-$(CONFIG_CXL_MEM) += cxl_pci.o
> +obj-$(CONFIG_CXL_MEM) += cxl_pci.o cxl_region.o
> obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
> obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
>
> @@ -9,3 +9,4 @@ cxl_core-y := core.o
> cxl_pci-y := pci.o
> cxl_acpi-y := acpi.o
> cxl_pmem-y := pmem.o
> +cxl_region-y := region.o
> diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
> index a2e4d54fc7bc..d8d7ca85e110 100644
> --- a/drivers/cxl/core.c
> +++ b/drivers/cxl/core.c
> @@ -6,6 +6,7 @@
> #include <linux/pci.h>
> #include <linux/slab.h>
> #include <linux/idr.h>
> +#include "region.h"
> #include "cxl.h"
> #include "mem.h"
>
> @@ -120,7 +121,68 @@ static ssize_t target_list_show(struct device *dev,
> }
> static DEVICE_ATTR_RO(target_list);
>
> +static ssize_t create_region_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct cxl_decoder *cxld = to_cxl_decoder(dev);
> + int rc;
> +
> + device_lock(dev);
> + rc = sprintf(buf, "%s\n",
> + cxld->youngest ? dev_name(&cxld->youngest->dev) : "");
An alternative is to be cynical and adjust the interface a touch.
Lets say it always returns the name of the last created region. If that's
true, just copy the name at creation time. Then no need to care if the
pointer is live or not.
> + device_unlock(dev);
> +
> + return rc;
> +}
> +
> +static ssize_t create_region_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct cxl_decoder *cxld = to_cxl_decoder(dev);
> + struct cxl_region *region;
> + ssize_t rc;
> + int val;
> +
> + rc = kstrtoint(buf, 0, &val);
> + if (rc)
> + return rc;
> + if (val < 0 || val > 16)
> + return -EINVAL;
> +
> + region = cxl_alloc_region(cxld, val);
> + if (IS_ERR(region))
> + return PTR_ERR(region);
> +
> + rc = cxl_add_region(cxld, region);
> + if (rc) {
> + cxl_free_region(cxld, region);
> + return rc;
> + }
> +
> + cxld->youngest = region;
> + return len;
> +}
> +static DEVICE_ATTR_RW(create_region);
> +
> +static ssize_t delete_region_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct cxl_decoder *cxld = to_cxl_decoder(dev);
> + int rc;
> +
> + rc = cxl_delete_region(cxld, buf);
> + if (rc)
> + return rc;
> +
> + return len;
> +}
> +static DEVICE_ATTR_WO(delete_region);
> +
> static struct attribute *cxl_decoder_base_attrs[] = {
> + &dev_attr_create_region.attr,
> + &dev_attr_delete_region.attr,
> &dev_attr_start.attr,
> &dev_attr_size.attr,
> &dev_attr_locked.attr,
> @@ -171,7 +233,13 @@ static void cxl_decoder_release(struct device *dev)
> {
> struct cxl_decoder *cxld = to_cxl_decoder(dev);
> struct cxl_port *port = to_cxl_port(dev->parent);
> + struct cxl_region *region;
>
> + list_for_each_entry(region, &cxld->regions, list)
> + cxl_delete_region(cxld, dev_name(®ion->dev));
> +
> + dev_WARN_ONCE(dev, !ida_is_empty(&cxld->region_ida),
> + "Lost track of a region");
> ida_free(&port->decoder_ida, cxld->id);
> kfree(cxld);
> }
> @@ -483,8 +551,11 @@ cxl_decoder_alloc(struct cxl_port *port, int nr_targets, resource_size_t base,
> .interleave_ways = interleave_ways,
> .interleave_granularity = interleave_granularity,
> .target_type = type,
> + .regions = LIST_HEAD_INIT(cxld->regions),
> };
>
> + ida_init(&cxld->region_ida);
> +
> /* handle implied target_list */
> if (interleave_ways == 1)
> cxld->target[0] =
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index b6bda39a59e3..8b27a07d7d0f 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -190,6 +190,9 @@ enum cxl_decoder_type {
> * @interleave_granularity: data stride per dport
> * @target_type: accelerator vs expander (type2 vs type3) selector
> * @flags: memory type capabilities and locking
> + * @region_ida: allocator for region ids.
> + * @regions: List of regions mapped (may be disabled) by this decoder.
> + * @youngest: Last region created for this decoder.
> * @target: active ordered target list in current decoder configuration
> */
> struct cxl_decoder {
> @@ -200,6 +203,9 @@ struct cxl_decoder {
> int interleave_granularity;
> enum cxl_decoder_type target_type;
> unsigned long flags;
> + struct ida region_ida;
> + struct list_head regions;
> + struct cxl_region *youngest;
> struct cxl_dport *target[];
> };
>
> @@ -262,6 +268,11 @@ struct cxl_dport {
> struct list_head list;
> };
>
> +struct cxl_region *cxl_alloc_region(struct cxl_decoder *cxld,
> + int interleave_ways);
> +void cxl_free_region(struct cxl_decoder *cxld, struct cxl_region *region);
> +int cxl_add_region(struct cxl_decoder *cxld, struct cxl_region *region);
> +int cxl_delete_region(struct cxl_decoder *cxld, const char *region);
> struct cxl_port *to_cxl_port(struct device *dev);
> struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
> resource_size_t component_reg_phys,
> diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
> new file mode 100644
> index 000000000000..391467e864a2
> --- /dev/null
> +++ b/drivers/cxl/region.c
> @@ -0,0 +1,147 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> +#include <linux/io-64-nonatomic-lo-hi.h>
> +#include <linux/device.h>
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <linux/idr.h>
> +#include "region.h"
> +#include "cxl.h"
> +#include "mem.h"
> +
> +/**
> + * DOC: cxl region
> + *
> + * A CXL region encompasses a chunk of host physical address space that may be
> + * consumed by a single device (x1 interleave aka linear) or across multiple
> + * devices (xN interleaved). A region is a child device of a &struct
> + * cxl_decoder. There may be multiple active regions under a single &struct
> + * cxl_decoder. The common case for multiple regions would be several linear,
> + * contiguous regions under a single decoder. Generally, there will be a 1:1
> + * relationship between decoder and region when the region is interleaved.
> + */
> +
> +static void cxl_region_release(struct device *dev);
> +
> +const struct device_type cxl_region_type = {
> + .name = "cxl_region",
> + .release = cxl_region_release,
> +};
> +
> +void cxl_free_region(struct cxl_decoder *cxld, struct cxl_region *region)
> +{
> + ida_free(&cxld->region_ida, region->id);
> + kfree(region);
> +}
> +
> +static void cxl_region_release(struct device *dev)
> +{
> + struct cxl_decoder *cxld = to_cxl_decoder(dev->parent);
> + struct cxl_region *region = to_cxl_region(dev);
> +
> + cxl_free_region(cxld, region);
> +}
> +
> +struct cxl_region *cxl_alloc_region(struct cxl_decoder *cxld,
> + int interleave_ways)
> +{
> + struct cxl_region *region;
> + int rc;
> +
> + region = kzalloc(struct_size(region, targets, interleave_ways),
> + GFP_KERNEL);
> + if (!region)
> + return ERR_PTR(-ENOMEM);
> +
> + region->eniw = interleave_ways;
> +
> + rc = ida_alloc(&cxld->region_ida, GFP_KERNEL);
> + if (rc < 0) {
> + dev_err(&cxld->dev, "Couldn't get a new id\n");
> + kfree(region);
> + return ERR_PTR(rc);
> + }
> + region->id = rc;
> +
> + return region;
> +}
> +
> +/**
> + * cxl_add_region - Adds a region to a decoder
> + * @cxld: Parent decoder.
> + * @region: Region to be added to the decoder.
> + *
> + * This is the second step of region initialization. Regions exist within an
> + * address space which is mapped by a @cxld, and that @cxld enforces constraints
> + * upon the region as it is configured. Regions may be added to a @cxld but not
> + * activated and therefore it is possible to have more regions in a @cxld than
> + * there are interleave ways in the @cxld. Regions exist only for persistent
> + * capacities.
> + *
> + * Return: zero if the region was added to the @cxld, else returns negative
> + * error code.
> + */
> +int cxl_add_region(struct cxl_decoder *cxld, struct cxl_region *region)
> +{
> + struct cxl_port *port = to_cxl_port(cxld->dev.parent);
> + struct device *dev = ®ion->dev;
> + int rc;
> +
> + device_initialize(dev);
> + dev->parent = &cxld->dev;
> + device_set_pm_not_required(dev);
> + dev->bus = &cxl_bus_type;
> + dev->type = &cxl_region_type;
> + rc = dev_set_name(dev, "region%d.%d:%d", port->id, cxld->id, region->id);
> + if (rc)
> + goto err;
> +
> + rc = device_add(dev);
> + if (rc)
> + goto err;
> +
> + dev_dbg(dev, "Added %s to %s\n", dev_name(dev), dev_name(&cxld->dev));
> +
> + return 0;
> +
> +err:
> + put_device(dev);
> + return rc;
> +}
> +
> +static struct cxl_region *
> +cxl_find_region_by_name(struct cxl_decoder *cxld, const char *name)
> +{
> + struct device *region_dev;
> +
> + region_dev = device_find_child_by_name(&cxld->dev, name);
> + if (!region_dev)
> + return ERR_PTR(-ENOENT);
> +
> + return to_cxl_region(region_dev);
> +}
> +
> +int cxl_delete_region(struct cxl_decoder *cxld, const char *region_name)
> +{
> + struct cxl_region *region;
> +
> + device_lock(&cxld->dev);
> +
> + region = cxl_find_region_by_name(cxld, region_name);
> + if (IS_ERR(region)) {
> + device_unlock(&cxld->dev);
> + return PTR_ERR(region);
> + }
> +
> + dev_dbg(&cxld->dev, "Requested removal of %s from %s\n",
> + dev_name(®ion->dev), dev_name(&cxld->dev));
> +
> + cmpxchg(&cxld->youngest, region, NULL);
Why does this need to be atomic? I think the other side of anything
that might make use of this isn't atomic, except because you are holding
the device_lock for cxld->dev
> +
> + device_unregister(®ion->dev);
> + device_unlock(&cxld->dev);
> +
> + put_device(®ion->dev);
> +
> + return 0;
> +}
> diff --git a/drivers/cxl/region.h b/drivers/cxl/region.h
> new file mode 100644
> index 000000000000..7a87d229e38a
> --- /dev/null
> +++ b/drivers/cxl/region.h
> @@ -0,0 +1,43 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/* Copyright(c) 2021 Intel Corporation. */
> +#ifndef __CXL_REGION_H__
> +#define __CXL_REGION_H__
> +
> +#include <linux/uuid.h>
> +
> +extern const struct device_type cxl_region_type;
> +
> +/**
> + * struct cxl_region - CXL region
> + * @dev: This region's device.
> + * @id: This regions id. Id is globally unique across all regions.
> + * @res: Address space consumed by this region.
> + * @requested_size: Size of the region determined from LSA or userspace.
> + * @uuid: The UUID for this region.
> + * @list: Node in decoders region list.
> + * @eniw: Number of interleave ways this region is configured for.
> + * @targets: The memory devices comprising the region.
> + */
> +struct cxl_region {
> + struct device dev;
> + int id;
> + struct resource *res;
> + u64 requested_size;
> + uuid_t uuid;
> + struct list_head list;
> + int eniw;
> + struct cxl_memdev *targets[];
> +};
> +
> +static inline struct cxl_region *to_cxl_region(struct device *dev)
> +{
> + if (dev_WARN_ONCE(dev, dev->type != &cxl_region_type,
> + "not a cxl_region device\n"))
> + return NULL;
> +
> + return container_of(dev, struct cxl_region, dev);
> +}
> +
> +bool cxl_is_region_configured(struct cxl_region *region);
> +
> +#endif
next prev parent reply other threads:[~2021-06-18 9:13 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-17 17:36 [PATCH 0/6] Region creation Ben Widawsky
2021-06-17 17:36 ` [PATCH 1/6] cxl/region: Add region creation ABI Ben Widawsky
2021-06-17 21:11 ` [PATCH v2 " Ben Widawsky
2021-06-18 9:13 ` Jonathan Cameron [this message]
2021-06-18 15:07 ` [PATCH " Ben Widawsky
2021-06-18 16:39 ` Dan Williams
2021-06-17 17:36 ` [PATCH 2/6] cxl: Move cxl_memdev conversion helper to mem.h Ben Widawsky
2021-06-18 9:13 ` Jonathan Cameron
2021-06-18 15:00 ` Dan Williams
2021-06-17 17:36 ` [PATCH 3/6] cxl/region: Introduce concept of region configuration Ben Widawsky
2021-06-18 11:22 ` Jonathan Cameron
2021-06-18 15:25 ` Ben Widawsky
2021-06-18 15:44 ` Jonathan Cameron
2021-06-17 17:36 ` [PATCH 4/6] cxl/region: Introduce a cxl_region driver Ben Widawsky
2021-06-17 21:13 ` [PATCH v2 " Ben Widawsky
2021-06-18 11:49 ` Jonathan Cameron
2021-06-17 17:36 ` [PATCH 5/6] cxl/core: Convert decoder range to resource Ben Widawsky
2021-06-18 11:52 ` Jonathan Cameron
2021-06-17 17:36 ` [PATCH 6/6] cxl/region: Handle region's address space allocation Ben Widawsky
2021-06-18 13:35 ` Jonathan Cameron
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210618101311.00005f78@Huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=alison.schofield@intel.com \
--cc=ben.widawsky@intel.com \
--cc=dan.j.williams@intel.com \
--cc=ira.weiny@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox