From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: <shiju.jose@huawei.com>
Cc: <linux-edac@vger.kernel.org>, <linux-cxl@vger.kernel.org>,
<linux-acpi@vger.kernel.org>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>, <bp@alien8.de>,
<tony.luck@intel.com>, <rafael@kernel.org>, <lenb@kernel.org>,
<mchehab@kernel.org>, <dan.j.williams@intel.com>,
<dave@stgolabs.net>, <dave.jiang@intel.com>,
<alison.schofield@intel.com>, <vishal.l.verma@intel.com>,
<ira.weiny@intel.com>, <david@redhat.com>,
<Vilas.Sridharan@amd.com>, <leo.duran@amd.com>,
<Yazen.Ghannam@amd.com>, <rientjes@google.com>,
<jiaqiyan@google.com>, <Jon.Grimm@amd.com>,
<dave.hansen@linux.intel.com>, <naoya.horiguchi@nec.com>,
<james.morse@arm.com>, <jthoughton@google.com>,
<somasundaram.a@hpe.com>, <erdemaktas@google.com>,
<pgonda@google.com>, <duenwen@google.com>, <gthelen@google.com>,
<wschwartz@amperecomputing.com>, <dferguson@amperecomputing.com>,
<wbs@os.amperecomputing.com>, <nifan.cxl@gmail.com>,
<tanxiaofei@huawei.com>, <prime.zeng@hisilicon.com>,
<roberto.sassu@huawei.com>, <kangkang.shen@futurewei.com>,
<wanghuiqiang@huawei.com>, <linuxarm@huawei.com>
Subject: Re: [PATCH v13 11/18] cxl/memfeature: Add CXL memory device ECS control feature
Date: Mon, 14 Oct 2024 16:40:55 +0100
Message-ID: <20241014164055.00002019@Huawei.com>
In-Reply-To: <20241009124120.1124-12-shiju.jose@huawei.com>
On Wed, 9 Oct 2024 13:41:12 +0100
<shiju.jose@huawei.com> wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
>
> CXL spec 3.1 section 8.2.9.9.11.2 describes the DDR5 ECS (Error Check
> Scrub) control feature.
> The Error Check Scrub (ECS) is a feature defined in JEDEC DDR5 SDRAM
> Specification (JESD79-5) and allows the DRAM to internally read, correct
> single-bit errors, and write back corrected data bits to the DRAM array
> while providing transparency to error counts.
I never understood 'transparency to error counts'.
Maybe, from a software point of view, it means
'while reporting error counts to the host'.
Can anyone else figure out what that text from the CXL spec
means? (I'm guessing it was cut and pasted from the JEDEC spec.)
>
> The ECS control allows the requester to change the log entry type, the ECS
> threshold count provided that the request is within the definition
> specified in DDR5 mode registers, change mode between codeword mode and
> row count mode, and reset the ECS counter.
>
> Register with EDAC device driver, which gets the ECS attr descriptors
> from the EDAC ECS and expose sysfs ECS control attributes to userspace.
> For example ECS control for the memory media FRU 0 in CXL mem0 device is
> in /sys/bus/edac/devices/cxl_mem0/ecs_fruX/
fru0?
>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
A few little things inline. In general this looks good to me.
> ---
> drivers/cxl/core/memfeature.c | 467 +++++++++++++++++++++++++++++++++-
> 1 file changed, 461 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/cxl/core/memfeature.c b/drivers/cxl/core/memfeature.c
> index 84d6e887a4fa..567406566c77 100644
> --- a/drivers/cxl/core/memfeature.c
> +++ b/drivers/cxl/core/memfeature.c
> @@ -19,7 +19,7 @@
> #include <cxl.h>
> #include <linux/edac.h>
>
> -#define CXL_DEV_NUM_RAS_FEATURES 1
> +#define CXL_DEV_NUM_RAS_FEATURES 2
> #define CXL_DEV_HOUR_IN_SECS 3600
>
> #define CXL_SCRUB_NAME_LEN 128
> @@ -309,6 +309,420 @@ static const struct edac_scrub_ops cxl_ps_scrub_ops = {
> .set_cycle_duration = cxl_patrol_scrub_write_scrub_cycle,
> };
>
> +/* CXL DDR5 ECS control definitions */
> +static const uuid_t cxl_ecs_uuid =
> + UUID_INIT(0xe5b13f22, 0x2328, 0x4a14, 0xb8, 0xba, 0xb9, 0x69, 0x1e, \
Why the \?
> + 0x89, 0x33, 0x86);
> +
> +
> +#define CXL_ECS_LOG_ENTRY_TYPE_MASK GENMASK(1, 0)
> +#define CXL_ECS_REALTIME_REPORT_CAP_MASK BIT(0)
> +#define CXL_ECS_THRESHOLD_COUNT_MASK GENMASK(2, 0)
> +#define CXL_ECS_MODE_MASK BIT(3)
That name is a little generic. Maybe CXL_ECS_COUNT_MODE_MASK ?
> +#define CXL_ECS_RESET_COUNTER_MASK BIT(4)
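To make the field layout behind these masks concrete, here is a minimal userspace sketch of packing and unpacking them. It assumes the threshold count, count mode and reset counter fields share a single configuration byte, which the masks suggest but this patch excerpt does not show. The helper names ecs_pack_config(), ecs_config_threshold() and ecs_config_mode() are hypothetical, and the masks are written as plain constants in place of the kernel's GENMASK()/BIT().

```c
#include <stdint.h>

/* Plain-constant equivalents of the masks in the quoted patch. */
#define CXL_ECS_THRESHOLD_COUNT_MASK	0x07u	/* GENMASK(2, 0) */
#define CXL_ECS_MODE_MASK		0x08u	/* BIT(3) */
#define CXL_ECS_RESET_COUNTER_MASK	0x10u	/* BIT(4) */

/* Hypothetical helper: build one config byte from the three fields. */
static uint8_t ecs_pack_config(uint8_t threshold, uint8_t mode, uint8_t reset)
{
	uint8_t cfg = 0;

	cfg |= threshold & CXL_ECS_THRESHOLD_COUNT_MASK;
	cfg |= (uint8_t)((mode & 0x1u) << 3);	/* lands on CXL_ECS_MODE_MASK */
	cfg |= (uint8_t)((reset & 0x1u) << 4);	/* lands on CXL_ECS_RESET_COUNTER_MASK */

	return cfg;
}

/* Hypothetical helper: extract the threshold encoding (bits 2:0). */
static uint8_t ecs_config_threshold(uint8_t cfg)
{
	return cfg & CXL_ECS_THRESHOLD_COUNT_MASK;
}

/* Hypothetical helper: extract the count mode (bit 3). */
static uint8_t ecs_config_mode(uint8_t cfg)
{
	return (cfg & CXL_ECS_MODE_MASK) >> 3;
}
```

In the driver itself FIELD_PREP() and FIELD_GET() with the existing masks would do the same job without the hand-written shifts.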
> +
> +static const u16 ecs_supp_threshold[] = { 0, 0, 0, 256, 1024, 4096 };
> +
> +enum {
> + ECS_LOG_ENTRY_TYPE_DRAM = 0x0,
> + ECS_LOG_ENTRY_TYPE_MEM_MEDIA_FRU = 0x1,
> +};
> +
> +enum {
> + ECS_THRESHOLD_256 = 3,
> + ECS_THRESHOLD_1024 = 4,
> + ECS_THRESHOLD_4096 = 5,
> +};
Perhaps move this above the ecs_supp_threshold array and use
static const u16 ecs_supp_threshold[] = {
[ECS_THRESHOLD_256] = 256,
[ECS_THRESHOLD_1024] = 1024,
[ECS_THRESHOLD_4096] = 4096,
};
which will fill the zeros in for you. You don't care about them anyway
as they are undefined values.
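As a standalone illustration of the designated-initializer form suggested above, the following userspace sketch uses uint16_t in place of the kernel's u16. The ECS_NUM_THRESHOLDS macro and the ecs_threshold_count() lookup helper are hypothetical and not part of the patch; they are only there to show that the unnamed slots come out as zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodings of the ECS threshold count field, as in the quoted patch. */
enum {
	ECS_THRESHOLD_256 = 3,
	ECS_THRESHOLD_1024 = 4,
	ECS_THRESHOLD_4096 = 5,
};

/*
 * Designated initializers: the array is sized by the largest index,
 * and the slots that are not named (0..2) are zero-initialized.
 */
static const uint16_t ecs_supp_threshold[] = {
	[ECS_THRESHOLD_256] = 256,
	[ECS_THRESHOLD_1024] = 1024,
	[ECS_THRESHOLD_4096] = 4096,
};

#define ECS_NUM_THRESHOLDS \
	(sizeof(ecs_supp_threshold) / sizeof(ecs_supp_threshold[0]))

/* Hypothetical helper: map a field encoding to a count, 0 if undefined. */
static uint16_t ecs_threshold_count(unsigned int idx)
{
	if (idx >= ECS_NUM_THRESHOLDS)
		return 0;

	return ecs_supp_threshold[idx];
}
```

The enum has to be defined before the array for this to compile, hence the suggestion to move it up.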
> +
> +enum cxl_ecs_mode {
> + ECS_MODE_COUNTS_ROWS = 0,
> + ECS_MODE_COUNTS_CODEWORDS = 1,
> +};
> +
> +static int cxl_mem_ecs_set_attrs(struct device *dev, void *drv_data, int fru_id,
> + struct cxl_ecs_params *params, u8 param_type)
> +{
> + struct cxl_ecs_context *cxl_ecs_ctx = drv_data;
> + struct cxl_memdev *cxlmd = cxl_ecs_ctx->cxlmd;
> + struct cxl_dev_state *cxlds = cxlmd->cxlds;
> + struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> + struct cxl_ecs_fru_rd_attrs *fru_rd_attrs;
> + struct cxl_ecs_fru_wr_attrs *fru_wr_attrs;
> + size_t rd_data_size, wr_data_size;
> + u16 num_media_frus, count;
> + size_t data_size;
> + int ret;
> +
> + num_media_frus = cxl_ecs_ctx->num_media_frus;
> + rd_data_size = cxl_ecs_ctx->get_feat_size;
> + wr_data_size = cxl_ecs_ctx->set_feat_size;
> + struct cxl_ecs_rd_attrs *rd_attrs __free(kfree) =
> + kmalloc(rd_data_size, GFP_KERNEL);
> + if (!rd_attrs)
> + return -ENOMEM;
> +
> + data_size = cxl_get_feature(mds, cxl_ecs_uuid,
> + CXL_GET_FEAT_SEL_CURRENT_VALUE,
> + rd_attrs, rd_data_size);
> + if (!data_size)
> + return -EIO;
Add a blank line here, as the next line isn't part of this allocate /
check-errors block.
> + struct cxl_ecs_wr_attrs *wr_attrs __free(kfree) =
> + kmalloc(wr_data_size, GFP_KERNEL);
> + if (!wr_attrs)
> + return -ENOMEM;
> +
> + /* Fill writable attributes from the current attributes read
CXL uses
/*
* Fill writable
style for multiline comments.
> + * for all the media FRUs.
> + */
> +static int cxl_ecs_get_mode_counts_rows(struct device *dev, void *drv_data,
> + int fru_id, u32 *val)
> +{
> + struct cxl_ecs_params params;
> + int ret;
> +
> + ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
> + if (ret)
> + return ret;
> +
> + if (params.mode == ECS_MODE_COUNTS_ROWS)
> + *val = 1;
> + else
> + *val = 0;
> +
> + return 0;
> +}
> +
> +static int cxl_ecs_get_mode_counts_codewords(struct device *dev, void *drv_data,
> + int fru_id, u32 *val)
> +{
> + struct cxl_ecs_params params;
> + int ret;
This form is pretty common. Maybe worth some macros like
you have in the edac side of things?
> +
> + ret = cxl_mem_ecs_get_attrs(dev, drv_data, fru_id, &params);
> + if (ret)
> + return ret;
> +
> + if (params.mode == ECS_MODE_COUNTS_CODEWORDS)
> + *val = 1;
> + else
> + *val = 0;
> +
> + return 0;
> +}
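The macro idea mentioned above for the two mode getters could look roughly like the sketch below. This is a userspace approximation under stated assumptions, not the driver code: the struct device argument is dropped, struct cxl_ecs_params is reduced to the one field used here, cxl_mem_ecs_get_attrs() is a stub that reads the mode from drv_data, and the macro name CXL_ECS_GET_MODE is hypothetical.

```c
#include <stdint.h>

enum cxl_ecs_mode {
	ECS_MODE_COUNTS_ROWS = 0,
	ECS_MODE_COUNTS_CODEWORDS = 1,
};

/* Stand-in for the driver's parameter struct; only the field used here. */
struct cxl_ecs_params {
	uint8_t mode;
};

/*
 * Stub for illustration only: the real function reads the feature from
 * the device. Here drv_data simply points at the current mode value.
 */
static int cxl_mem_ecs_get_attrs(void *drv_data, int fru_id,
				 struct cxl_ecs_params *params)
{
	(void)fru_id;
	params->mode = *(uint8_t *)drv_data;
	return 0;
}

/* Generate one getter per mode; *val is 1 if that mode is active, else 0. */
#define CXL_ECS_GET_MODE(name, expected_mode)				\
static int cxl_ecs_get_mode_##name(void *drv_data, int fru_id,		\
				   uint32_t *val)			\
{									\
	struct cxl_ecs_params params;					\
	int ret;							\
									\
	ret = cxl_mem_ecs_get_attrs(drv_data, fru_id, &params);		\
	if (ret)							\
		return ret;						\
									\
	*val = (params.mode == (expected_mode));			\
	return 0;							\
}

CXL_ECS_GET_MODE(counts_rows, ECS_MODE_COUNTS_ROWS)
CXL_ECS_GET_MODE(counts_codewords, ECS_MODE_COUNTS_CODEWORDS)
```

The trade-off is the usual one: less repetition, but the generated function names no longer show up when grepping the source.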