From: Dave Jiang <dave.jiang@intel.com>
To: <ira.weiny@intel.com>, Navneet Singh <navneet.singh@intel.com>,
Fan Ni <fan.ni@samsung.com>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
"Dan Williams" <dan.j.williams@intel.com>,
<linux-cxl@vger.kernel.org>
Subject: Re: [PATCH 4/5] cxl/mem: Add support to handle DCD add and release capacity events.
Date: Thu, 15 Jun 2023 09:58:48 -0700
Message-ID: <c8d5594f-846b-2075-2a65-1378de9bfdfb@intel.com>
In-Reply-To: <20230604-dcd-type2-upstream-v1-4-71b6341bae54@intel.com>
On 6/14/23 12:16, ira.weiny@intel.com wrote:
> From: Navneet Singh <navneet.singh@intel.com>
>
> A dynamic capacity device utilizes events to signal the host about the
> changes to the allocation of DC blocks. The device communicates the
> state of these blocks of dynamic capacity through an extent list that
> describes the starting DPA and length of all blocks the host can access.
>
> Based on the dynamic capacity add or release event type,
> dynamic memory represented by the extents are either added
> or removed as devdax device.
>
> Process the dynamic capacity add and release events.
>
> Signed-off-by: Navneet Singh <navneet.singh@intel.com>
>
> ---
> [iweiny: Remove invalid comment]
> ---
> drivers/cxl/core/mbox.c | 345 +++++++++++++++++++++++++++++++++++++++++++++-
> drivers/cxl/core/region.c | 214 +++++++++++++++++++++++++++-
> drivers/cxl/core/trace.h | 3 +-
> drivers/cxl/cxl.h | 4 +-
> drivers/cxl/cxlmem.h | 76 ++++++++++
> drivers/cxl/pci.c | 10 +-
> drivers/dax/bus.c | 11 +-
> drivers/dax/bus.h | 5 +-
> 8 files changed, 652 insertions(+), 16 deletions(-)
Rather large patch. Can this be split into separate patches for the add and
release events?
>
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index c5b696737c87..db9295216de5 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -767,6 +767,14 @@ static const uuid_t log_uuid[] = {
> [VENDOR_DEBUG_UUID] = DEFINE_CXL_VENDOR_DEBUG_UUID,
> };
>
> +/* See CXL 3.0 8.2.9.2.1.5 */
> +enum dc_event {
> + ADD_CAPACITY,
> + RELEASE_CAPACITY,
> + FORCED_CAPACITY_RELEASE,
> + REGION_CONFIGURATION_UPDATED,
> +};
> +
> /**
> * cxl_enumerate_cmds() - Enumerate commands for a device.
> * @mds: The driver data for the operation
> @@ -852,6 +860,14 @@ static const uuid_t mem_mod_event_uuid =
> UUID_INIT(0xfe927475, 0xdd59, 0x4339,
> 0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74);
>
> +/*
> + * Dynamic Capacity Event Record
> + * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
> + */
> +static const uuid_t dc_event_uuid =
> + UUID_INIT(0xca95afa7, 0xf183, 0x4018, 0x8c,
> + 0x2f, 0x95, 0x26, 0x8e, 0x10, 0x1a, 0x2a);
> +
> static void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> enum cxl_event_log_type type,
> struct cxl_event_record_raw *record)
> @@ -945,6 +961,188 @@ static int cxl_clear_event_record(struct cxl_memdev_state *mds,
> return rc;
> }
>
> +static int cxl_send_dc_cap_response(struct cxl_memdev_state *mds,
> + struct cxl_mbox_dc_response *res,
> + int extent_cnt, int opcode)
> +{
> + struct cxl_mbox_cmd mbox_cmd;
> + int rc, size;
> +
> + size = struct_size(res, extent_list, extent_cnt);
> + res->extent_list_size = cpu_to_le32(extent_cnt);
> +
> + mbox_cmd = (struct cxl_mbox_cmd) {
> + .opcode = opcode,
> + .size_in = size,
> + .payload_in = res,
> + };
> +
> + rc = cxl_internal_send_cmd(mds, &mbox_cmd);
> +
> + return rc;
> +
> +}
Return cxl_internal_send_cmd() directly
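i.e. something like this (compile-untested sketch, same fields as above):

```c
	mbox_cmd = (struct cxl_mbox_cmd) {
		.opcode = opcode,
		.size_in = size,
		.payload_in = res,
	};

	/* no need to store rc only to return it */
	return cxl_internal_send_cmd(mds, &mbox_cmd);
```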
> +
> +static int cxl_prepare_ext_list(struct cxl_mbox_dc_response **res,
'res' always triggers "resource" in my head. Maybe spell it out, or use
'resp'?
> + int *n, struct range *extent)
Maybe a more descriptive parameter name than 'n'?
> +{
> + struct cxl_mbox_dc_response *dc_res;
Same comment as 'res'
> + unsigned int size;
> +
> + if (!extent)
> + size = struct_size(dc_res, extent_list, 0);
> + else
> + size = struct_size(dc_res, extent_list, *n + 1);
> +
> + dc_res = krealloc(*res, size, GFP_KERNEL);
> + if (!dc_res)
> + return -ENOMEM;
> +
> + if (extent) {
> + dc_res->extent_list[*n].dpa_start = cpu_to_le64(extent->start);
> + memset(dc_res->extent_list[*n].reserved, 0, 8);
SZ_8 perhaps?
> + dc_res->extent_list[*n].length =
> + cpu_to_le64(range_len(extent));
> + (*n)++;
> + }
> +
> + *res = dc_res;
> + return 0;
> +}
> +/**
> + * cxl_handle_dcd_event_records() - Read DCD event records.
> + * @mds: The memory device state
> + *
> + * Returns 0 if enumerate completed successfully.
Please also add "errno on failure."
> + *
> + * CXL devices can generate DCD events to add or remove extents in the list.
> + */
> +static int cxl_handle_dcd_event_records(struct cxl_memdev_state *mds,
> + struct cxl_event_record_raw *rec)
> +{
> + struct cxl_mbox_dc_response *dc_res = NULL;
> + struct device *dev = mds->cxlds.dev;
> + uuid_t *id = &rec->hdr.id;
> + struct dcd_event_dyn_cap *record =
> + (struct dcd_event_dyn_cap *)rec;
> + int extent_cnt = 0, rc = 0;
> + struct cxl_dc_extent_data *extent;
> + struct range alloc_range, rel_range;
> + resource_size_t dpa, size;
> +
> + if (!uuid_equal(id, &dc_event_uuid))
> + return -EINVAL;
> +
> + switch (record->data.event_type) {
> + case ADD_CAPACITY:
> + extent = devm_kzalloc(dev, sizeof(*extent), GFP_ATOMIC);
> + if (!extent)
> + return -ENOMEM;
> +
> + extent->dpa_start = le64_to_cpu(record->data.extent.start_dpa);
> + extent->length = le64_to_cpu(record->data.extent.length);
> + memcpy(extent->tag, record->data.extent.tag,
> + sizeof(record->data.extent.tag));
A general comment: the alignment of continuation lines in this patch is
inconsistent.
> + extent->shared_extent_seq =
> + le16_to_cpu(record->data.extent.shared_extn_seq);
> + dev_dbg(dev, "Add DC extent DPA:0x%llx LEN:%llx\n",
> + extent->dpa_start, extent->length);
> + alloc_range = (struct range) {
> + .start = extent->dpa_start,
> + .end = extent->dpa_start + extent->length - 1,
> + };
> +
> + rc = cxl_add_dc_extent(mds, &alloc_range);
> + if (rc < 0) {
> + dev_dbg(dev, "unconsumed DC extent DPA:0x%llx LEN:%llx\n",
> + extent->dpa_start, extent->length);
> + rc = cxl_prepare_ext_list(&dc_res, &extent_cnt, NULL);
> + if (rc < 0) {
> + dev_err(dev, "Couldn't create extent list %d\n",
> + rc);
> + devm_kfree(dev, extent);
> + return rc;
> + }
> +
> + rc = cxl_send_dc_cap_response(mds, dc_res,
> + extent_cnt, CXL_MBOX_OP_ADD_DC_RESPONSE);
> + if (rc < 0) {
> + devm_kfree(dev, extent);
> + goto out;
Please be consistent about direct returns vs. a single error-out path.
Mixing the two makes the code harder to read.
> + }
> +
> + kfree(dc_res);
> + devm_kfree(dev, extent);
> +
> + return 0;
> + }
> +
> + rc = xa_insert(&mds->dc_extent_list, extent->dpa_start, extent,
> + GFP_KERNEL);
> + if (rc < 0)
> + goto out;
> +
> + mds->num_dc_extents++;
> + rc = cxl_prepare_ext_list(&dc_res, &extent_cnt, &alloc_range);
> + if (rc < 0) {
> + dev_err(dev, "Couldn't create extent list %d\n", rc);
> + return rc;
> + }
> +
> + rc = cxl_send_dc_cap_response(mds, dc_res, extent_cnt,
> + CXL_MBOX_OP_ADD_DC_RESPONSE);
> + if (rc < 0)
> + goto out;
> +
> + break;
> +
> + case RELEASE_CAPACITY:
> + dpa = le64_to_cpu(record->data.extent.start_dpa);
> + size = le64_to_cpu(record->data.extent.length);
> + dev_dbg(dev, "Release DC extents DPA:0x%llx LEN:%llx\n",
> + dpa, size);
> + extent = xa_load(&mds->dc_extent_list, dpa);
> + if (!extent) {
> + dev_err(dev, "No extent found with DPA:0x%llx\n", dpa);
> + return -EINVAL;
> + }
> +
> + rel_range = (struct range) {
> + .start = dpa,
> + .end = dpa + size - 1,
> + };
> +
> + rc = cxl_release_dc_extent(mds, &rel_range);
> + if (rc < 0) {
> + dev_dbg(dev, "withhold DC extent DPA:0x%llx LEN:%llx\n",
> + dpa, size);
> + return 0;
> + }
> +
> + xa_erase(&mds->dc_extent_list, dpa);
> + devm_kfree(dev, extent);
> + mds->num_dc_extents--;
> + rc = cxl_prepare_ext_list(&dc_res, &extent_cnt, &rel_range);
> + if (rc < 0) {
> + dev_err(dev, "Couldn't create extent list %d\n", rc);
> + return rc;
> + }
> +
> + rc = cxl_send_dc_cap_response(mds, dc_res, extent_cnt,
> + CXL_MBOX_OP_RELEASE_DC);
> + if (rc < 0)
> + goto out;
> +
> + break;
> +
> + default:
> + return -EINVAL;
> + }
> +out:
> + kfree(dc_res);
> + return rc;
> +}
> +
> static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
> enum cxl_event_log_type type)
> {
> @@ -982,9 +1180,17 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
> if (!nr_rec)
> break;
>
> - for (i = 0; i < nr_rec; i++)
> + for (i = 0; i < nr_rec; i++) {
> cxl_event_trace_record(cxlmd, type,
> &payload->records[i]);
> + if (type == CXL_EVENT_TYPE_DCD) {
> + rc = cxl_handle_dcd_event_records(mds,
> + &payload->records[i]);
> + if (rc)
> + dev_err_ratelimited(dev,
> + "dcd event failed: %d\n", rc);
> + }
> + }
>
> if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
> trace_cxl_overflow(cxlmd, type, payload);
> @@ -1024,6 +1230,8 @@ void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status)
> cxl_mem_get_records_log(mds, CXL_EVENT_TYPE_WARN);
> if (status & CXLDEV_EVENT_STATUS_INFO)
> cxl_mem_get_records_log(mds, CXL_EVENT_TYPE_INFO);
> + if (status & CXLDEV_EVENT_STATUS_DCD)
> + cxl_mem_get_records_log(mds, CXL_EVENT_TYPE_DCD);
> }
> EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
>
> @@ -1244,6 +1452,140 @@ int cxl_dev_dynamic_capacity_identify(struct cxl_memdev_state *mds)
> }
> EXPORT_SYMBOL_NS_GPL(cxl_dev_dynamic_capacity_identify, CXL);
>
> +int cxl_dev_get_dc_extent_cnt(struct cxl_memdev_state *mds,
> + unsigned int *extent_gen_num)
> +{
> + struct device *dev = mds->cxlds.dev;
> + struct cxl_mbox_dc_extents *dc_extents;
> + struct cxl_mbox_get_dc_extent get_dc_extent;
> + unsigned int total_extent_cnt;
> + struct cxl_mbox_cmd mbox_cmd;
> + int rc;
> +
> + /* Check GET_DC_EXTENT_LIST is supported by device */
> + if (!test_bit(CXL_DCD_ENABLED_GET_EXTENT_LIST, mds->dcd_cmds)) {
> + dev_dbg(dev, "unsupported cmd : get dyn cap extent list\n");
> + return 0;
> + }
> +
> + dc_extents = kvmalloc(mds->payload_size, GFP_KERNEL);
> + if (!dc_extents)
> + return -ENOMEM;
> +
> + get_dc_extent = (struct cxl_mbox_get_dc_extent) {
> + .extent_cnt = 0,
> + .start_extent_index = 0,
> + };
> +
> + mbox_cmd = (struct cxl_mbox_cmd) {
> + .opcode = CXL_MBOX_OP_GET_DC_EXTENT_LIST,
> + .payload_in = &get_dc_extent,
> + .size_in = sizeof(get_dc_extent),
> + .size_out = mds->payload_size,
> + .payload_out = dc_extents,
> + .min_out = 1,
> + };
> + rc = cxl_internal_send_cmd(mds, &mbox_cmd);
> + if (rc < 0)
> + goto out;
> +
> + total_extent_cnt = le32_to_cpu(dc_extents->total_extent_cnt);
> + *extent_gen_num = le32_to_cpu(dc_extents->extent_list_num);
> + dev_dbg(dev, "Total extent count :%d Extent list Generation Num: %d\n",
> + total_extent_cnt, *extent_gen_num);
> +out:
> +
> + kvfree(dc_extents);
> + if (rc < 0)
> + return rc;
> +
> + return total_extent_cnt;
> +
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_dev_get_dc_extent_cnt, CXL);
> +
> +int cxl_dev_get_dc_extents(struct cxl_memdev_state *mds,
> + unsigned int index, unsigned int cnt)
> +{
> + /* See CXL 3.0 Table 125 dynamic capacity config Output Payload */
> + struct device *dev = mds->cxlds.dev;
> + struct cxl_mbox_dc_extents *dc_extents;
> + struct cxl_mbox_get_dc_extent get_dc_extent;
> + unsigned int extent_gen_num, available_extents, total_extent_cnt;
> + int rc;
> + struct cxl_dc_extent_data *extent;
> + struct cxl_mbox_cmd mbox_cmd;
> + struct range alloc_range;
> +
> + /* Check GET_DC_EXTENT_LIST is supported by device */
> + if (!test_bit(CXL_DCD_ENABLED_GET_EXTENT_LIST, mds->dcd_cmds)) {
> + dev_dbg(dev, "unsupported cmd : get dyn cap extent list\n");
> + return 0;
> + }
> +
> + dc_extents = kvmalloc(mds->payload_size, GFP_KERNEL);
> + if (!dc_extents)
> + return -ENOMEM;
> + get_dc_extent = (struct cxl_mbox_get_dc_extent) {
> + .extent_cnt = cnt,
> + .start_extent_index = index,
> + };
> +
> + mbox_cmd = (struct cxl_mbox_cmd) {
> + .opcode = CXL_MBOX_OP_GET_DC_EXTENT_LIST,
> + .payload_in = &get_dc_extent,
> + .size_in = sizeof(get_dc_extent),
> + .size_out = mds->payload_size,
> + .payload_out = dc_extents,
> + .min_out = 1,
> + };
> + rc = cxl_internal_send_cmd(mds, &mbox_cmd);
> + if (rc < 0)
> + goto out;
> +
> + available_extents = le32_to_cpu(dc_extents->ret_extent_cnt);
> + total_extent_cnt = le32_to_cpu(dc_extents->total_extent_cnt);
> + extent_gen_num = le32_to_cpu(dc_extents->extent_list_num);
> + dev_dbg(dev, "No Total extent count :%d Extent list Generation Num:%d\n",
> + total_extent_cnt, extent_gen_num);
> +
> +
> + for (int i = 0; i < available_extents ; i++) {
> + extent = devm_kzalloc(dev, sizeof(*extent), GFP_KERNEL);
> + if (!extent) {
> + rc = -ENOMEM;
> + goto out;
> + }
> + extent->dpa_start = le64_to_cpu(dc_extents->extent[i].start_dpa);
> + extent->length = le64_to_cpu(dc_extents->extent[i].length);
> + memcpy(extent->tag, dc_extents->extent[i].tag,
> + sizeof(dc_extents->extent[i].tag));
> + extent->shared_extent_seq =
> + le16_to_cpu(dc_extents->extent[i].shared_extn_seq);
> + dev_dbg(dev, "dynamic capacity extent[%d] DPA:0x%llx LEN:%llx\n",
> + i, extent->dpa_start, extent->length);
> +
> + alloc_range = (struct range){
> + .start = extent->dpa_start,
> + .end = extent->dpa_start + extent->length - 1,
> + };
> +
> + rc = cxl_add_dc_extent(mds, &alloc_range);
> + if (rc < 0)
> + goto out;
> + rc = xa_insert(&mds->dc_extent_list, extent->dpa_start, extent,
> + GFP_KERNEL);
> + }
> +
> +out:
> + kvfree(dc_extents);
> + if (rc < 0)
> + return rc;
> +
> + return available_extents;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_dev_get_dc_extents, CXL);
> +
> static int add_dpa_res(struct device *dev, struct resource *parent,
> struct resource *res, resource_size_t start,
> resource_size_t size, const char *type)
> @@ -1452,6 +1794,7 @@ struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev)
> mutex_init(&mds->event.log_lock);
> mds->cxlds.dev = dev;
> mds->cxlds.type = CXL_DEVTYPE_CLASSMEM;
> + xa_init(&mds->dc_extent_list);
>
> return mds;
> }
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 144232c8305e..ba45c1c3b0a9 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1,6 +1,7 @@
> // SPDX-License-Identifier: GPL-2.0-only
> /* Copyright(c) 2022 Intel Corporation. All rights reserved. */
> #include <linux/memregion.h>
> +#include <linux/interrupt.h>
> #include <linux/genalloc.h>
> #include <linux/device.h>
> #include <linux/module.h>
> @@ -11,6 +12,8 @@
> #include <cxlmem.h>
> #include <cxl.h>
> #include "core.h"
> +#include "../../dax/bus.h"
> +#include "../../dax/dax-private.h"
>
> /**
> * DOC: cxl core region
> @@ -166,6 +169,38 @@ static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
> return 0;
> }
>
> +static int cxl_region_manage_dc(struct cxl_region *cxlr)
> +{
> + struct cxl_region_params *p = &cxlr->params;
> + unsigned int extent_gen_num;
> + int i, rc;
> +
> + /* Designed for Non Interleaving flow with the assumption one
The comment text needs to start on the next line, i.e. the usual
multi-line style:
/*
* comment
*/
> + * cxl_region will map the complete device DC region's DPA range
> + */
> + for (i = 0; i < p->nr_targets; i++) {
> + struct cxl_endpoint_decoder *cxled = p->targets[i];
> + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> + struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
> +
> + rc = cxl_dev_get_dc_extent_cnt(mds, &extent_gen_num);
> + if (rc < 0)
Needs {} since the else if branch has them
> + goto err;
I would just do 'return rc' here; no need for the goto. The else if ()
below would then become a plain if ().
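Roughly (untested, just to illustrate the flow I have in mind):

```c
		rc = cxl_dev_get_dc_extent_cnt(mds, &extent_gen_num);
		if (rc < 0)
			return rc;

		if (rc > 1) {
			rc = cxl_dev_get_dc_extents(mds, rc, 0);
			if (rc < 0)
				return rc;
			mds->num_dc_extents = rc;
			mds->dc_extents_index = rc - 1;
		}
```

The err: label at the bottom then goes away entirely.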
> + else if (rc > 1) {
> + rc = cxl_dev_get_dc_extents(mds, rc, 0);
> + if (rc < 0)
> + goto err;
> + mds->num_dc_extents = rc;
> + mds->dc_extents_index = rc - 1;
> + }
> + mds->dc_list_gen_num = extent_gen_num;
> + dev_dbg(mds->cxlds.dev, "No of preallocated extents :%d\n", rc);
> + }
> + return 0;
> +err:
> + return rc;
> +}
> +
> static int commit_decoder(struct cxl_decoder *cxld)
> {
> struct cxl_switch_decoder *cxlsd = NULL;
> @@ -2865,11 +2900,14 @@ static int devm_cxl_add_dc_region(struct cxl_region *cxlr)
> return PTR_ERR(cxlr_dax);
>
> cxlr_dc = kzalloc(sizeof(*cxlr_dc), GFP_KERNEL);
> - if (!cxlr_dc) {
> - rc = -ENOMEM;
> - goto err;
> - }
> + if (!cxlr_dc)
> + return -ENOMEM;
>
> + rc = request_module("dax_cxl");
> + if (rc) {
> + dev_err(dev, "failed to load dax-ctl module\n");
> + goto load_err;
> + }
> dev = &cxlr_dax->dev;
> rc = dev_set_name(dev, "dax_region%d", cxlr->id);
> if (rc)
> @@ -2891,10 +2929,24 @@ static int devm_cxl_add_dc_region(struct cxl_region *cxlr)
> xa_init(&cxlr_dc->dax_dev_list);
> cxlr->cxlr_dc = cxlr_dc;
> rc = devm_add_action_or_reset(&cxlr->dev, cxl_dc_region_release, cxlr);
> - if (!rc)
> - return 0;
> + if (rc)
> + goto err;
> +
> + if (!dev->driver) {
> + dev_err(dev, "%s Driver not attached\n", dev_name(dev));
> + rc = -ENXIO;
> + goto err;
> + }
> +
> + rc = cxl_region_manage_dc(cxlr);
> + if (rc)
> + goto err;
> +
> + return 0;
> +
> err:
> put_device(dev);
> +load_err:
> kfree(cxlr_dc);
> return rc;
> }
> @@ -3076,6 +3128,156 @@ struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
> }
> EXPORT_SYMBOL_NS_GPL(cxl_create_region, CXL);
>
> +static int match_ep_decoder_by_range(struct device *dev, void *data)
> +{
> + struct cxl_endpoint_decoder *cxled;
> + struct range *dpa_range = data;
> +
> + if (!is_endpoint_decoder(dev))
> + return 0;
> +
> + cxled = to_cxl_endpoint_decoder(dev);
> + if (!cxled->cxld.region)
> + return 0;
> +
> + if (cxled->dpa_res->start <= dpa_range->start &&
> + cxled->dpa_res->end >= dpa_range->end)
> + return 1;
> +
> + return 0;
Return the comparison result directly
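i.e. (untested):

```c
	return cxled->dpa_res->start <= dpa_range->start &&
	       cxled->dpa_res->end >= dpa_range->end;
```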
> +}
> +
> +int cxl_release_dc_extent(struct cxl_memdev_state *mds,
> + struct range *rel_range)
> +{
> + struct cxl_memdev *cxlmd = mds->cxlds.cxlmd;
> + struct cxl_endpoint_decoder *cxled;
> + struct cxl_dc_region *cxlr_dc;
> + struct dax_region *dax_region;
> + resource_size_t dpa_offset;
> + struct cxl_region *cxlr;
> + struct range hpa_range;
> + struct dev_dax *dev_dax;
> + resource_size_t hpa;
> + struct device *dev;
> + int ranges, rc = 0;
> +
> + /*
> + * Find the cxl endpoind decoder with which has the extent dpa range and
> + * get the cxl_region, dax_region refrences.
> + */
> + dev = device_find_child(&cxlmd->endpoint->dev, rel_range,
> + match_ep_decoder_by_range);
> + if (!dev) {
> + dev_err(mds->cxlds.dev, "%pr not mapped\n", rel_range);
> + return PTR_ERR(dev);
'dev' would be NULL here, so PTR_ERR() won't work. Maybe 'return -ENODEV'?
It seems device_find_child() doesn't return an ERR_PTR().
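Something like (untested):

```c
	if (!dev) {
		dev_err(mds->cxlds.dev, "%pr not mapped\n", rel_range);
		return -ENODEV;
	}
```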
> + }
> +
> + cxled = to_cxl_endpoint_decoder(dev);
> + hpa_range = cxled->cxld.hpa_range;
> + cxlr = cxled->cxld.region;
> + cxlr_dc = cxlr->cxlr_dc;
> +
> + /* DPA to HPA translation */
> + if (cxled->cxld.interleave_ways == 1) {
> + dpa_offset = rel_range->start - cxled->dpa_res->start;
> + hpa = hpa_range.start + dpa_offset;
> + } else {
> + dev_err(mds->cxlds.dev, "Interleaving DC not supported\n");
> + return -EINVAL;
> + }
Check for != 1 first and return; then you don't need the else branch.
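i.e. (untested):

```c
	if (cxled->cxld.interleave_ways != 1) {
		dev_err(mds->cxlds.dev, "Interleaving DC not supported\n");
		return -EINVAL;
	}

	/* DPA to HPA translation */
	dpa_offset = rel_range->start - cxled->dpa_res->start;
	hpa = hpa_range.start + dpa_offset;
```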
> +
> + dev_dax = xa_load(&cxlr_dc->dax_dev_list, hpa);
> + if (!dev_dax)
> + return -EINVAL;
> +
> + dax_region = dev_dax->region;
> + ranges = dev_dax->nr_range;
> +
> + while (ranges) {
> + int i = ranges - 1;
> + struct dax_mapping *mapping = dev_dax->ranges[i].mapping;
> +
> + devm_release_action(dax_region->dev, unregister_dax_mapping,
> + &mapping->dev);
> + ranges--;
> + }
> +
> + dev_dbg(mds->cxlds.dev, "removing devdax device:%s\n",
> + dev_name(&dev_dax->dev));
> + devm_release_action(dax_region->dev, unregister_dev_dax,
> + &dev_dax->dev);
> + xa_erase(&cxlr_dc->dax_dev_list, hpa);
> +
> + return rc;
> +}
> +
> +int cxl_add_dc_extent(struct cxl_memdev_state *mds, struct range *alloc_range)
> +{
> + struct cxl_memdev *cxlmd = mds->cxlds.cxlmd;
> + struct cxl_endpoint_decoder *cxled;
> + struct cxl_dax_region *cxlr_dax;
> + struct cxl_dc_region *cxlr_dc;
> + struct dax_region *dax_region;
> + resource_size_t dpa_offset;
> + struct dev_dax_data data;
> + struct dev_dax *dev_dax;
> + struct cxl_region *cxlr;
> + struct range hpa_range;
> + resource_size_t hpa;
> + struct device *dev;
> + int rc;
> +
> + /*
> + * Find the cxl endpoind decoder with which has the extent dpa range and
> + * get the cxl_region, dax_region refrences.
> + */
> + dev = device_find_child(&cxlmd->endpoint->dev, alloc_range,
> + match_ep_decoder_by_range);
> + if (!dev) {
> + dev_err(mds->cxlds.dev, "%pr not mapped\n", alloc_range);
There are a lot of extra spaces after the ','.
> + return PTR_ERR(dev);
Same comment as earlier.
> + }
> +
> + cxled = to_cxl_endpoint_decoder(dev);
> + hpa_range = cxled->cxld.hpa_range;
> + cxlr = cxled->cxld.region;
> + cxlr_dc = cxlr->cxlr_dc;
> + cxlr_dax = cxlr_dc->cxlr_dax;
> + dax_region = dev_get_drvdata(&cxlr_dax->dev);
> +
> + /* DPA to HPA translation */
> + if (cxled->cxld.interleave_ways == 1) {
> + dpa_offset = alloc_range->start - cxled->dpa_res->start;
> + hpa = hpa_range.start + dpa_offset;
> + } else {
> + dev_err(mds->cxlds.dev, "Interleaving DC not supported\n");
> + return -EINVAL;
> + }
Same as above: check for != 1 first and return; no need for the else.
> +
> + data = (struct dev_dax_data) {
> + .dax_region = dax_region,
> + .id = -1,
Magic id number; maybe document what -1 means here?
> + .size = 0,
> + };
> +
> + dev_dax = devm_create_dev_dax(&data);
> + if (IS_ERR(dev_dax))
> + return PTR_ERR(dev_dax);
> +
> + if (IS_ALIGNED(range_len(alloc_range), max_t(unsigned long,
> + dev_dax->align, memremap_compat_align()))) {
> + rc = alloc_dev_dax_range(dev_dax, hpa,
> + range_len(alloc_range));
> + if (rc)
> + return rc;
> + }
> +
> + rc = xa_insert(&cxlr_dc->dax_dev_list, hpa, dev_dax, GFP_KERNEL);
> +
> + return rc;
> +}
> +
> /* Establish an empty region covering the given HPA range */
> static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
> struct cxl_endpoint_decoder *cxled)
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index a0b5819bc70b..e11651255780 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -122,7 +122,8 @@ TRACE_EVENT(cxl_aer_correctable_error,
> { CXL_EVENT_TYPE_INFO, "Informational" }, \
> { CXL_EVENT_TYPE_WARN, "Warning" }, \
> { CXL_EVENT_TYPE_FAIL, "Failure" }, \
> - { CXL_EVENT_TYPE_FATAL, "Fatal" })
> + { CXL_EVENT_TYPE_FATAL, "Fatal" }, \
> + { CXL_EVENT_TYPE_DCD, "DCD" })
>
> TRACE_EVENT(cxl_overflow,
>
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 7ac1237938b7..60c436b7ebb1 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -163,11 +163,13 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
> #define CXLDEV_EVENT_STATUS_WARN BIT(1)
> #define CXLDEV_EVENT_STATUS_FAIL BIT(2)
> #define CXLDEV_EVENT_STATUS_FATAL BIT(3)
> +#define CXLDEV_EVENT_STATUS_DCD BIT(4)
>
> #define CXLDEV_EVENT_STATUS_ALL (CXLDEV_EVENT_STATUS_INFO | \
> CXLDEV_EVENT_STATUS_WARN | \
> CXLDEV_EVENT_STATUS_FAIL | \
> - CXLDEV_EVENT_STATUS_FATAL)
> + CXLDEV_EVENT_STATUS_FATAL| \
> + CXLDEV_EVENT_STATUS_DCD)
>
> /* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
> #define CXLDEV_EVENT_INT_MODE_MASK GENMASK(1, 0)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 9c0b2fa72bdd..0440b5c04ef6 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -5,6 +5,7 @@
> #include <uapi/linux/cxl_mem.h>
> #include <linux/cdev.h>
> #include <linux/uuid.h>
> +#include <linux/xarray.h>
> #include "cxl.h"
>
> /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
> @@ -226,6 +227,7 @@ struct cxl_event_interrupt_policy {
> u8 warn_settings;
> u8 failure_settings;
> u8 fatal_settings;
> + u8 dyncap_settings;
> } __packed;
>
> /**
> @@ -296,6 +298,13 @@ enum cxl_devtype {
> #define CXL_MAX_DC_REGION 8
> #define CXL_DC_REGION_SRTLEN 8
>
> +struct cxl_dc_extent_data {
> + u64 dpa_start;
> + u64 length;
> + u8 tag[16];
> + u16 shared_extent_seq;
> +};
> +
> /**
> * struct cxl_dev_state - The driver device state
> *
> @@ -406,6 +415,11 @@ struct cxl_memdev_state {
> u8 flags;
> } dc_region[CXL_MAX_DC_REGION];
>
> + u32 dc_list_gen_num;
> + u32 dc_extents_index;
> + struct xarray dc_extent_list;
> + u32 num_dc_extents;
> +
> size_t dc_event_log_size;
> struct cxl_event_state event;
> struct cxl_poison_state poison;
> @@ -470,6 +484,17 @@ enum cxl_opcode {
> UUID_INIT(0xe1819d9, 0x11a9, 0x400c, 0x81, 0x1f, 0xd6, 0x07, 0x19, \
> 0x40, 0x3d, 0x86)
>
> +
> +struct cxl_mbox_dc_response {
> + __le32 extent_list_size;
> + u8 reserved[4];
> + struct updated_extent_list {
> + __le64 dpa_start;
> + __le64 length;
> + u8 reserved[8];
> + } __packed extent_list[];
> +} __packed;
> +
> struct cxl_mbox_get_supported_logs {
> __le16 entries;
> u8 rsvd[6];
> @@ -555,6 +580,7 @@ enum cxl_event_log_type {
> CXL_EVENT_TYPE_WARN,
> CXL_EVENT_TYPE_FAIL,
> CXL_EVENT_TYPE_FATAL,
> + CXL_EVENT_TYPE_DCD,
> CXL_EVENT_TYPE_MAX
> };
>
> @@ -639,6 +665,35 @@ struct cxl_event_mem_module {
> u8 reserved[0x3d];
> } __packed;
>
> +/*
> + * Dynamic Capacity Event Record
> + * CXL rev 3.0 section 8.2.9.2.1.5; Table 8-47
> + */
> +
> +#define CXL_EVENT_DC_TAG_SIZE 0x10
> +struct cxl_dc_extent {
> + __le64 start_dpa;
> + __le64 length;
> + u8 tag[CXL_EVENT_DC_TAG_SIZE];
> + __le16 shared_extn_seq;
> + u8 reserved[6];
> +} __packed;
> +
> +struct dcd_record_data {
> + u8 event_type;
> + u8 reserved;
> + __le16 host_id;
> + u8 region_index;
> + u8 reserved1[3];
> + struct cxl_dc_extent extent;
> + u8 reserved2[32];
> +} __packed;
> +
> +struct dcd_event_dyn_cap {
> + struct cxl_event_record_hdr hdr;
> + struct dcd_record_data data;
> +} __packed;
> +
> struct cxl_mbox_get_partition_info {
> __le64 active_volatile_cap;
> __le64 active_persistent_cap;
> @@ -684,6 +739,19 @@ struct cxl_mbox_dynamic_capacity {
> #define CXL_SET_PARTITION_IMMEDIATE_FLAG BIT(0)
> #define CXL_DYNAMIC_CAPACITY_SANITIZE_ON_RELEASE_FLAG BIT(0)
>
> +struct cxl_mbox_get_dc_extent {
> + __le32 extent_cnt;
> + __le32 start_extent_index;
> +} __packed;
> +
> +struct cxl_mbox_dc_extents {
> + __le32 ret_extent_cnt;
> + __le32 total_extent_cnt;
> + __le32 extent_list_num;
> + u8 rsvd[4];
> + struct cxl_dc_extent extent[];
> +} __packed;
> +
> /* Set Timestamp CXL 3.0 Spec 8.2.9.4.2 */
> struct cxl_mbox_set_timestamp_in {
> __le64 timestamp;
> @@ -826,6 +894,14 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
> int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
> int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
>
> +/* FIXME why not have these be static in mbox.c? */
> +int cxl_add_dc_extent(struct cxl_memdev_state *mds, struct range *alloc_range);
> +int cxl_release_dc_extent(struct cxl_memdev_state *mds, struct range *rel_range);
> +int cxl_dev_get_dc_extent_cnt(struct cxl_memdev_state *mds,
> + unsigned int *extent_gen_num);
> +int cxl_dev_get_dc_extents(struct cxl_memdev_state *mds, unsigned int cnt,
> + unsigned int index);
> +
> #ifdef CONFIG_CXL_SUSPEND
> void cxl_mem_active_inc(void);
> void cxl_mem_active_dec(void);
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index ac1a41bc083d..558ffbcb9b34 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -522,8 +522,8 @@ static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
> return irq;
>
> return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
> - IRQF_SHARED | IRQF_ONESHOT, NULL,
> - dev_id);
> + IRQF_SHARED | IRQF_ONESHOT, NULL,
> + dev_id);
Stray edit?
> }
>
> static int cxl_event_get_int_policy(struct cxl_memdev_state *mds,
> @@ -555,6 +555,7 @@ static int cxl_event_config_msgnums(struct cxl_memdev_state *mds,
> .warn_settings = CXL_INT_MSI_MSIX,
> .failure_settings = CXL_INT_MSI_MSIX,
> .fatal_settings = CXL_INT_MSI_MSIX,
> + .dyncap_settings = CXL_INT_MSI_MSIX,
> };
>
> mbox_cmd = (struct cxl_mbox_cmd) {
> @@ -608,6 +609,11 @@ static int cxl_event_irqsetup(struct cxl_memdev_state *mds)
> return rc;
> }
>
> + rc = cxl_event_req_irq(cxlds, policy.dyncap_settings);
> + if (rc) {
> + dev_err(cxlds->dev, "Failed to get interrupt for event dc log\n");
> + return rc;
> + }
> return 0;
> }
>
> diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
> index 227800053309..b2b27033f589 100644
> --- a/drivers/dax/bus.c
> +++ b/drivers/dax/bus.c
> @@ -434,7 +434,7 @@ static void free_dev_dax_ranges(struct dev_dax *dev_dax)
> trim_dev_dax_range(dev_dax);
> }
>
> -static void unregister_dev_dax(void *dev)
> +void unregister_dev_dax(void *dev)
> {
> struct dev_dax *dev_dax = to_dev_dax(dev);
>
> @@ -445,6 +445,7 @@ static void unregister_dev_dax(void *dev)
> free_dev_dax_ranges(dev_dax);
> put_device(dev);
> }
> +EXPORT_SYMBOL_GPL(unregister_dev_dax);
>
> /* a return value >= 0 indicates this invocation invalidated the id */
> static int __free_dev_dax_id(struct dev_dax *dev_dax)
> @@ -641,7 +642,7 @@ static void dax_mapping_release(struct device *dev)
> kfree(mapping);
> }
>
> -static void unregister_dax_mapping(void *data)
> +void unregister_dax_mapping(void *data)
> {
> struct device *dev = data;
> struct dax_mapping *mapping = to_dax_mapping(dev);
> @@ -658,7 +659,7 @@ static void unregister_dax_mapping(void *data)
> device_del(dev);
> put_device(dev);
> }
> -
> +EXPORT_SYMBOL_GPL(unregister_dax_mapping);
> static struct dev_dax_range *get_dax_range(struct device *dev)
> {
> struct dax_mapping *mapping = to_dax_mapping(dev);
> @@ -793,7 +794,7 @@ static int devm_register_dax_mapping(struct dev_dax *dev_dax, int range_id)
> return 0;
> }
>
> -static int alloc_dev_dax_range(struct dev_dax *dev_dax, u64 start,
> +int alloc_dev_dax_range(struct dev_dax *dev_dax, u64 start,
> resource_size_t size)
> {
> struct dax_region *dax_region = dev_dax->region;
> @@ -853,6 +854,8 @@ static int alloc_dev_dax_range(struct dev_dax *dev_dax, u64 start,
>
> return rc;
> }
> +EXPORT_SYMBOL_GPL(alloc_dev_dax_range);
> +
>
> static int adjust_dev_dax_range(struct dev_dax *dev_dax, struct resource *res, resource_size_t size)
> {
Split the dax bus changes out as a prep patch?
> diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h
> index 8cd79ab34292..aa8418c7aead 100644
> --- a/drivers/dax/bus.h
> +++ b/drivers/dax/bus.h
> @@ -47,8 +47,11 @@ int __dax_driver_register(struct dax_device_driver *dax_drv,
> __dax_driver_register(driver, THIS_MODULE, KBUILD_MODNAME)
> void dax_driver_unregister(struct dax_device_driver *dax_drv);
> void kill_dev_dax(struct dev_dax *dev_dax);
> +void unregister_dev_dax(void *dev);
> +void unregister_dax_mapping(void *data);
> bool static_dev_dax(struct dev_dax *dev_dax);
> -
> +int alloc_dev_dax_range(struct dev_dax *dev_dax, u64 start,
> + resource_size_t size);
> /*
> * While run_dax() is potentially a generic operation that could be
> * defined in include/linux/dax.h we don't want to grow any users
>