linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Shiju Jose <shiju.jose@huawei.com>
To: Borislav Petkov <bp@alien8.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"rafael@kernel.org" <rafael@kernel.org>,
	"lenb@kernel.org" <lenb@kernel.org>,
	"mchehab@kernel.org" <mchehab@kernel.org>,
	"dan.j.williams@intel.com" <dan.j.williams@intel.com>,
	"dave@stgolabs.net" <dave@stgolabs.net>,
	"Jonathan Cameron" <jonathan.cameron@huawei.com>,
	"dave.jiang@intel.com" <dave.jiang@intel.com>,
	"alison.schofield@intel.com" <alison.schofield@intel.com>,
	"vishal.l.verma@intel.com" <vishal.l.verma@intel.com>,
	"ira.weiny@intel.com" <ira.weiny@intel.com>,
	"david@redhat.com" <david@redhat.com>,
	"Vilas.Sridharan@amd.com" <Vilas.Sridharan@amd.com>,
	"leo.duran@amd.com" <leo.duran@amd.com>,
	"Yazen.Ghannam@amd.com" <Yazen.Ghannam@amd.com>,
	"rientjes@google.com" <rientjes@google.com>,
	"jiaqiyan@google.com" <jiaqiyan@google.com>,
	"Jon.Grimm@amd.com" <Jon.Grimm@amd.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"naoya.horiguchi@nec.com" <naoya.horiguchi@nec.com>,
	"james.morse@arm.com" <james.morse@arm.com>,
	"jthoughton@google.com" <jthoughton@google.com>,
	"somasundaram.a@hpe.com" <somasundaram.a@hpe.com>,
	"erdemaktas@google.com" <erdemaktas@google.com>,
	"pgonda@google.com" <pgonda@google.com>,
	"duenwen@google.com" <duenwen@google.com>,
	"gthelen@google.com" <gthelen@google.com>,
	"wschwartz@amperecomputing.com" <wschwartz@amperecomputing.com>,
	"dferguson@amperecomputing.com" <dferguson@amperecomputing.com>,
	"wbs@os.amperecomputing.com" <wbs@os.amperecomputing.com>,
	"nifan.cxl@gmail.com" <nifan.cxl@gmail.com>,
	tanxiaofei <tanxiaofei@huawei.com>,
	"Zengtao (B)" <prime.zeng@hisilicon.com>,
	"Roberto Sassu" <roberto.sassu@huawei.com>,
	"kangkang.shen@futurewei.com" <kangkang.shen@futurewei.com>,
	wanghuiqiang <wanghuiqiang@huawei.com>,
	Linuxarm <linuxarm@huawei.com>
Subject: RE: [PATCH v13 02/18] EDAC: Add scrub control feature
Date: Wed, 23 Oct 2024 16:04:05 +0000	[thread overview]
Message-ID: <4ee36d03a2894606a571b37f440da36f@huawei.com> (raw)
In-Reply-To: <20241022190454.GIZxf3VkmLVR-JLeUc@fat_crate.local>


Thanks for the feedbacks.

>-----Original Message-----
>From: Borislav Petkov <bp@alien8.de>
>Sent: 22 October 2024 20:05
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: linux-edac@vger.kernel.org; linux-cxl@vger.kernel.org; linux-
>acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org;
>tony.luck@intel.com; rafael@kernel.org; lenb@kernel.org;
>mchehab@kernel.org; dan.j.williams@intel.com; dave@stgolabs.net; Jonathan
>Cameron <jonathan.cameron@huawei.com>; dave.jiang@intel.com;
>alison.schofield@intel.com; vishal.l.verma@intel.com; ira.weiny@intel.com;
>david@redhat.com; Vilas.Sridharan@amd.com; leo.duran@amd.com;
>Yazen.Ghannam@amd.com; rientjes@google.com; jiaqiyan@google.com;
>Jon.Grimm@amd.com; dave.hansen@linux.intel.com;
>naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com;
>somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com;
>duenwen@google.com; gthelen@google.com;
>wschwartz@amperecomputing.com; dferguson@amperecomputing.com;
>wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei
><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto
>Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com;
>wanghuiqiang <wanghuiqiang@huawei.com>; Linuxarm
><linuxarm@huawei.com>
>Subject: Re: [PATCH v13 02/18] EDAC: Add scrub control feature
>
>On Wed, Oct 09, 2024 at 01:41:03PM +0100, shiju.jose@huawei.com wrote:
>> diff --git a/Documentation/ABI/testing/sysfs-edac-scrub
>> b/Documentation/ABI/testing/sysfs-edac-scrub
>> new file mode 100644
>> index 000000000000..b4f8f0bba17b
>> --- /dev/null
>> +++ b/Documentation/ABI/testing/sysfs-edac-scrub
>> @@ -0,0 +1,69 @@
>> +What:		/sys/bus/edac/devices/<dev-name>/scrubX
>> +Date:		Oct 2024
>> +KernelVersion:	6.12
>> +Contact:	linux-edac@vger.kernel.org
>> +Description:
>> +		The sysfs EDAC bus devices /<dev-name>/scrubX subdirectory
>> +		belongs to an instance of memory scrub control feature,
>> +		where <dev-name> directory corresponds to a device/memory
>> +		region registered with the EDAC device driver for the
>> +		scrub control feature.
>> +		The sysfs scrub attr nodes would be present only if the
>> +		client driver has implemented the corresponding attr
>> +		callback function and passed in ops to the EDAC RAS feature
>> +		driver during registration.
>> +
>> +What:		/sys/bus/edac/devices/<dev-
>name>/scrubX/addr_range_base
>> +Date:		Oct 2024
>> +KernelVersion:	6.12
>> +Contact:	linux-edac@vger.kernel.org
>> +Description:
>> +		(RW) The base of the address range of the memory region
>> +		to be scrubbed (on-demand scrubbing).
>
>Why does this sound more complicated than it is?
>
>Why isn't this simply "addr" and the next one "size"?
>
>On-demand scrubbing should scrub at address "addr" and "size" bytes. Can't get
>any simpler than that.
Sure. Will modify to addr and size.
>
>> +
>> +What:		/sys/bus/edac/devices/<dev-
>name>/scrubX/addr_range_size
>> +Date:		Oct 2024
>> +KernelVersion:	6.12
>> +Contact:	linux-edac@vger.kernel.org
>> +Description:
>> +		(RW) The size of the address range of the memory region
>> +		to be scrubbed (on-demand scrubbing).
>> +
>> +What:		/sys/bus/edac/devices/<dev-
>name>/scrubX/enable_background
>> +Date:		Oct 2024
>> +KernelVersion:	6.12
>> +Contact:	linux-edac@vger.kernel.org
>> +Description:
>> +		(RW) Start/Stop background(patrol) scrubbing if supported.
>> +
>> +What:		/sys/bus/edac/devices/<dev-
>name>/scrubX/enable_on_demand
>> +Date:		Oct 2024
>> +KernelVersion:	6.12
>> +Contact:	linux-edac@vger.kernel.org
>> +Description:
>> +		(RW) Start/Stop on-demand scrubbing the memory region
>> +		if supported.
>
>Why do you need a separate "enable" flag?
>
>Why can't it be: "writing into "addr" starts the on-demand scrubbing"?
If  'enable' attribute is removed , then there is an ordering with setting address + size.
Also user space can't check whether scrubbing is enabled or not. 
>
>> +
>> +What:		/sys/bus/edac/devices/<dev-
>name>/scrubX/min_cycle_duration
>> +Date:		Oct 2024
>> +KernelVersion:	6.12
>> +Contact:	linux-edac@vger.kernel.org
>> +Description:
>> +		(RO) Supported minimum scrub cycle duration in seconds
>> +		by the memory scrubber.
>> +
>> +What:		/sys/bus/edac/devices/<dev-
>name>/scrubX/max_cycle_duration
>> +Date:		Oct 2024
>> +KernelVersion:	6.12
>> +Contact:	linux-edac@vger.kernel.org
>> +Description:
>> +		(RO) Supported maximum scrub cycle duration in seconds
>> +		by the memory scrubber.
>> +
>> +What:		/sys/bus/edac/devices/<dev-
>name>/scrubX/current_cycle_duration
>> +Date:		Oct 2024
>> +KernelVersion:	6.12
>> +Contact:	linux-edac@vger.kernel.org
>> +Description:
>> +		(RW) The current scrub cycle duration in seconds and must be
>> +		within the supported range by the memory scrubber.
>
>What are those three good for and why are they exposed?
Scrub has an overhead when running and that may want to be reduced by
just taking longer to do it. 
Min and max scrub cycle duration returns the range of scrub rate
supported by the device.
>
>> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile index
>> 4edfb83ffbee..a96a74de8b9e 100644
>> --- a/drivers/edac/Makefile
>> +++ b/drivers/edac/Makefile
>> @@ -10,6 +10,7 @@ obj-$(CONFIG_EDAC)			:= edac_core.o
>>
>>  edac_core-y	:= edac_mc.o edac_device.o edac_mc_sysfs.o
>>  edac_core-y	+= edac_module.o edac_device_sysfs.o wq.o
>> +edac_core-y	+= scrub.o
>>
>>  edac_core-$(CONFIG_EDAC_DEBUG)		+= debugfs.o
>>
>> diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
>> index 0b8aa8150239..0c9da55d18bc 100644
>> --- a/drivers/edac/edac_device.c
>> +++ b/drivers/edac/edac_device.c
>> @@ -576,6 +576,7 @@ static void edac_dev_release(struct device *dev)
>> {
>>  	struct edac_dev_feat_ctx *ctx = container_of(dev, struct
>> edac_dev_feat_ctx, dev);
>>
>> +	kfree(ctx->scrub);
>>  	kfree(ctx->dev.groups);
>>  	kfree(ctx);
>>  }
>> @@ -610,7 +611,9 @@ int edac_dev_register(struct device *parent, char
>*name,
>>  		      const struct edac_dev_feature *ras_features)  {
>>  	const struct attribute_group **ras_attr_groups;
>> +	struct edac_dev_data *dev_data;
>>  	struct edac_dev_feat_ctx *ctx;
>> +	int scrub_cnt = 0, scrub_inst = 0;
>>  	int attr_gcnt = 0;
>>  	int ret, feat;
>
>The EDAC tree preferred ordering of variable declarations at the beginning of a
>function is reverse fir tree order::
I used the reverse tree ordering, but missed here. I will correct. 
>
>	struct long_struct_name *descriptive_name;
>	unsigned long foo, bar;
>	unsigned int tmp;
>	int ret;
>
>The above is faster to parse than the reverse ordering::
>
>	int ret;
>	unsigned int tmp;
>	unsigned long foo, bar;
>	struct long_struct_name *descriptive_name;
>
>And even more so than random ordering::
>
>	unsigned long foo, bar;
>	int ret;
>	struct long_struct_name *descriptive_name;
>	unsigned int tmp;
>
>>
>> @@ -620,7 +623,10 @@ int edac_dev_register(struct device *parent, char
>*name,
>>  	/* Double parse to make space for attributes */
>>  	for (feat = 0; feat < num_features; feat++) {
>>  		switch (ras_features[feat].ft_type) {
>> -		/* Add feature specific code */
>> +		case RAS_FEAT_SCRUB:
>> +			attr_gcnt++;
>> +			scrub_cnt++;
>> +			break;
>>  		default:
>>  			return -EINVAL;
>>  		}
>> @@ -639,13 +645,36 @@ int edac_dev_register(struct device *parent, char
>*name,
>>  		goto ctx_free;
>>  	}
>>
>> +	if (scrub_cnt) {
>> +		ctx->scrub = kcalloc(scrub_cnt, sizeof(*ctx->scrub),
>GFP_KERNEL);
>> +		if (!ctx->scrub) {
>> +			ret = -ENOMEM;
>> +			goto groups_free;
>> +		}
>> +	}
>> +
>>  	attr_gcnt = 0;
>>  	for (feat = 0; feat < num_features; feat++, ras_features++) {
>>  		switch (ras_features->ft_type) {
>> -		/* Add feature specific code */
>> +		case RAS_FEAT_SCRUB:
>> +			if (!ras_features->scrub_ops)
>> +				continue;
>
>Continue?
>
>I think fail. What is a scrub feature good for if it doesn't have ops?
Here continue to check any other feature (for eg. ECS, memory repair or another scrub instance) listed
by the parent device in the ras_features[].   
>
>> +			if (scrub_inst != ras_features->instance)
>> +				goto data_mem_free;
>> +			dev_data = &ctx->scrub[scrub_inst];
>> +			dev_data->instance = scrub_inst;
>> +			dev_data->scrub_ops = ras_features->scrub_ops;
>> +			dev_data->private = ras_features->ctx;
>> +			ret = edac_scrub_get_desc(parent,
>&ras_attr_groups[attr_gcnt],
>> +						  ras_features->instance);
>> +			if (ret)
>> +				goto data_mem_free;
>> +			scrub_inst++;
>> +			attr_gcnt++;
>> +			break;
>>  		default:
>>  			ret = -EINVAL;
>> -			goto groups_free;
>> +			goto data_mem_free;
>>  		}
>>  	}
>>
>
>...
>
>--
>Regards/Gruss,
>    Boris.
>
>https://people.kernel.org/tglx/notes-about-netiquette

Thanks,
Shiju


  reply	other threads:[~2024-10-23 16:04 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-09 12:41 [PATCH v13 00/18] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
2024-10-09 12:41 ` [PATCH v13 01/18] EDAC: Add support for EDAC device features control shiju.jose
2024-10-14 14:18   ` Jonathan Cameron
2024-10-17  8:37     ` Shiju Jose
2024-10-16 10:58   ` Borislav Petkov
2024-10-17  8:37     ` Shiju Jose
2024-10-09 12:41 ` [PATCH v13 02/18] EDAC: Add scrub control feature shiju.jose
2024-10-14 14:26   ` Jonathan Cameron
2024-10-22 19:04   ` Borislav Petkov
2024-10-23 16:04     ` Shiju Jose [this message]
2024-10-23 16:16       ` Borislav Petkov
2024-10-09 12:41 ` [PATCH v13 03/18] EDAC: Add ECS " shiju.jose
2024-10-14 14:33   ` Jonathan Cameron
2024-10-09 12:41 ` [PATCH v13 04/18] cxl: move cxl headers to new include/cxl/ directory shiju.jose
2024-10-14 14:34   ` Jonathan Cameron
2024-10-09 12:41 ` [PATCH v13 05/18] cxl: Move mailbox related bits to the same context shiju.jose
2024-10-14 14:42   ` Jonathan Cameron
2024-10-09 12:41 ` [PATCH v13 06/18] cxl: Convert cxl_internal_send_cmd() to use 'struct cxl_mailbox' as input shiju.jose
2024-10-09 12:41 ` [PATCH v13 07/18] cxl: Add Get Supported Features command for kernel usage shiju.jose
2024-10-14 15:05   ` Jonathan Cameron
2024-10-09 12:41 ` [PATCH v13 08/18] cxl/mbox: Add GET_FEATURE mailbox command shiju.jose
2024-10-14 15:08   ` Jonathan Cameron
2024-10-09 12:41 ` [PATCH v13 09/18] cxl/mbox: Add SET_FEATURE " shiju.jose
2024-10-14 15:12   ` Jonathan Cameron
2024-10-09 12:41 ` [PATCH v13 10/18] cxl/memfeature: Add CXL memory device patrol scrub control feature shiju.jose
2024-10-14 15:28   ` Jonathan Cameron
2024-10-14 18:02   ` Fan Ni
2024-10-15 16:32     ` Shiju Jose
2024-10-09 12:41 ` [PATCH v13 11/18] cxl/memfeature: Add CXL memory device ECS " shiju.jose
2024-10-14 15:40   ` Jonathan Cameron
2024-10-09 12:41 ` [PATCH v13 12/18] platform: Add __free() based cleanup function for platform_device_put shiju.jose
2024-10-14 15:43   ` Jonathan Cameron
2024-10-14 16:00     ` Greg KH
2024-10-14 16:04       ` Greg KH
2024-10-14 17:16         ` Jonathan Cameron
2024-10-14 18:06           ` Rafael J. Wysocki
2024-10-15  9:10             ` Jonathan Cameron
2024-10-15  9:40               ` Jonathan Cameron
2024-10-15 10:17                 ` Greg KH
2024-10-15 13:32                   ` Rafael J. Wysocki
2024-10-15 14:19                     ` Jonathan Cameron
2024-10-15 15:35                       ` Rafael J. Wysocki
2024-10-16  9:00                         ` Jonathan Cameron
2024-10-15 13:34                   ` Jonathan Cameron
2024-10-15 13:37                     ` Rafael J. Wysocki
2024-10-09 12:41 ` [PATCH v13 13/18] ACPI:RAS2: Add ACPI RAS2 driver shiju.jose
2024-10-14 15:49   ` Jonathan Cameron
2024-10-09 12:41 ` [PATCH v13 14/18] ras: mem: Add memory " shiju.jose
2024-10-09 12:41 ` [PATCH v13 15/18] EDAC: Add memory repair control feature shiju.jose
2024-10-14 16:23   ` Jonathan Cameron
2024-10-14 16:39     ` Shiju Jose
2024-10-14 17:02       ` Jonathan Cameron
2024-10-09 12:41 ` [PATCH v13 16/18] cxl/mbox: Add support for PERFORM_MAINTENANCE mailbox command shiju.jose
2024-10-14 16:26   ` Jonathan Cameron
2024-10-09 12:41 ` [PATCH v13 17/18] cxl/memfeature: Add CXL memory device PPR control feature shiju.jose
2024-10-14 16:38   ` Jonathan Cameron
2024-10-09 12:41 ` [PATCH v13 18/18] cxl/memfeature: Add CXL memory device memory sparing " shiju.jose
2024-10-14 17:00   ` Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4ee36d03a2894606a571b37f440da36f@huawei.com \
    --to=shiju.jose@huawei.com \
    --cc=Jon.Grimm@amd.com \
    --cc=Vilas.Sridharan@amd.com \
    --cc=Yazen.Ghannam@amd.com \
    --cc=alison.schofield@intel.com \
    --cc=bp@alien8.de \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=david@redhat.com \
    --cc=dferguson@amperecomputing.com \
    --cc=duenwen@google.com \
    --cc=erdemaktas@google.com \
    --cc=gthelen@google.com \
    --cc=ira.weiny@intel.com \
    --cc=james.morse@arm.com \
    --cc=jiaqiyan@google.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=jthoughton@google.com \
    --cc=kangkang.shen@futurewei.com \
    --cc=lenb@kernel.org \
    --cc=leo.duran@amd.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxarm@huawei.com \
    --cc=mchehab@kernel.org \
    --cc=naoya.horiguchi@nec.com \
    --cc=nifan.cxl@gmail.com \
    --cc=pgonda@google.com \
    --cc=prime.zeng@hisilicon.com \
    --cc=rafael@kernel.org \
    --cc=rientjes@google.com \
    --cc=roberto.sassu@huawei.com \
    --cc=somasundaram.a@hpe.com \
    --cc=tanxiaofei@huawei.com \
    --cc=tony.luck@intel.com \
    --cc=vishal.l.verma@intel.com \
    --cc=wanghuiqiang@huawei.com \
    --cc=wbs@os.amperecomputing.com \
    --cc=wschwartz@amperecomputing.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).