linux-acpi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Shiju Jose <shiju.jose@huawei.com>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"rafael@kernel.org" <rafael@kernel.org>,
	"lenb@kernel.org" <lenb@kernel.org>,
	"mchehab@kernel.org" <mchehab@kernel.org>,
	"dan.j.williams@intel.com" <dan.j.williams@intel.com>,
	"dave@stgolabs.net" <dave@stgolabs.net>,
	"dave.jiang@intel.com" <dave.jiang@intel.com>,
	"alison.schofield@intel.com" <alison.schofield@intel.com>,
	"vishal.l.verma@intel.com" <vishal.l.verma@intel.com>,
	"ira.weiny@intel.com" <ira.weiny@intel.com>,
	"david@redhat.com" <david@redhat.com>,
	"Vilas.Sridharan@amd.com" <Vilas.Sridharan@amd.com>,
	"leo.duran@amd.com" <leo.duran@amd.com>,
	"Yazen.Ghannam@amd.com" <Yazen.Ghannam@amd.com>,
	"rientjes@google.com" <rientjes@google.com>,
	"jiaqiyan@google.com" <jiaqiyan@google.com>,
	"Jon.Grimm@amd.com" <Jon.Grimm@amd.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"naoya.horiguchi@nec.com" <naoya.horiguchi@nec.com>,
	"james.morse@arm.com" <james.morse@arm.com>,
	"jthoughton@google.com" <jthoughton@google.com>,
	"somasundaram.a@hpe.com" <somasundaram.a@hpe.com>,
	"erdemaktas@google.com" <erdemaktas@google.com>,
	"pgonda@google.com" <pgonda@google.com>,
	"duenwen@google.com" <duenwen@google.com>,
	"gthelen@google.com" <gthelen@google.com>,
	"wschwartz@amperecomputing.com" <wschwartz@amperecomputing.com>,
	"dferguson@amperecomputing.com" <dferguson@amperecomputing.com>,
	"wbs@os.amperecomputing.com" <wbs@os.amperecomputing.com>,
	"nifan.cxl@gmail.com" <nifan.cxl@gmail.com>,
	tanxiaofei <tanxiaofei@huawei.com>,
	"Zengtao (B)" <prime.zeng@hisilicon.com>,
	"Roberto Sassu" <roberto.sassu@huawei.com>,
	"kangkang.shen@futurewei.com" <kangkang.shen@futurewei.com>,
	wanghuiqiang <wanghuiqiang@huawei.com>,
	Linuxarm <linuxarm@huawei.com>
Subject: Re: [PATCH v18 04/19] EDAC: Add memory repair control feature
Date: Thu, 9 Jan 2025 16:01:59 +0000	[thread overview]
Message-ID: <20250109160159.00002add@huawei.com> (raw)
In-Reply-To: <20250109151854.GCZ3_o3rf6S24qUbtB@fat_crate.local>

On Thu, 9 Jan 2025 16:18:54 +0100
Borislav Petkov <bp@alien8.de> wrote:

> On Thu, Jan 09, 2025 at 02:24:33PM +0000, Jonathan Cameron wrote:
> > To my thinking that would fail the test of being an intuitive interface.
> > To issue a repair command requires that multiple attributes be configured
> > before triggering the actual repair.
> > 
> > Think of it as setting the coordinates of the repair in a high dimensional
> > space.  
> 
> Why?

Ok. To me the fact it's not a single write was relevant. Seems not
in your mental model of how this works.  For me a single write
that you cannot query back is fine, setting lots of parameters and
being unable to query any of them less so.  I guess you disagree.
In interests of progress I'm not going to argue further. No one is
going to use this interface by hand anyway so the lost of useability
I'm seeing doesn't matter a lot.

> 
> You can write every attribute in its separate file and have a "commit" or
> "start" file which does that.

That's what we have.

> 
> Or you can designate a file which starts the process. This is how I'm
> injecting errors on x86:
> 
> see readme_msg here: arch/x86/kernel/cpu/mce/inject.c
> 
> More specifically:
> 
> "flags:\t Injection type to be performed. Writing to this file will trigger a\n"
> "\t real machine check, an APIC interrupt or invoke the error decoder routines\n"
> "\t for AMD processors.\n"
> 
> So you set everything else, and as the last step you set the injection type
> *and* you also trigger it with this one write.

Agreed. I'm not sure of the relevance though. This is how it works and
there is no proposal to change that.  

What I was trying to argue was for an interface that let you set all the
coordinates and read back what they were before hitting go. 

> 
> > Sure. In this case the addition of min/max was perhaps a wrong response to
> > your request for a way to those ranges rather than just rejecting a write
> > of something out of range as earlier version did.
> > 
> > We can revisit in future if range discovery becomes necessary.  Personally
> > I don't think it is given we are only taking these actions in response error
> > records that give us precisely what to write and hence are always in range.  
> 
> My goal here was to make this user-friendly. Because you need some way of
> knowing what valid ranges are and in order to trigger the repair, if it needs
> to happen for a range.

In at least the CXL case I'm fairly sure most of them are not discoverable.
Until you see errors you have no idea what the memory topology is.

> 
> Or, you can teach the repair logic to ignore invalid ranges and "clamp" things
> to whatever makes sense.

For that you'd need to have a path to read back what happened.

> 
> Again, I'm looking at it from the usability perspective. I haven't actually
> needed this scrub+repair functionality yet to know whether the UI makes sense.
> So yeah, collecting some feedback from real-life use cases would probably give
> you a lot better understanding of how that UI should be designed... perhaps
> you won't ever need the ranges, whow knows.
> 
> So yes, preemptively designing stuff like that "in the dark" is kinda hard.
> :-)

The discoverability is unnecessary for any known usecase.

Ok. Then can we just drop the range discoverability entirely or we go with
your suggestion and do not support read back of what has been
requested but instead have the reads return a range if known or "" /
return -EONOTSUPP if simply not known?


I can live with that though to me we are heading in the direction of
a less intuitive interface to save a small number of additional files.

Jonathan

> 


  reply	other threads:[~2025-01-09 16:02 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-06 12:09 [PATCH v18 00/19] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
2025-01-06 12:09 ` [PATCH v18 01/19] EDAC: Add support for EDAC device features control shiju.jose
2025-01-06 13:37   ` Borislav Petkov
2025-01-06 14:48     ` Shiju Jose
2025-01-13 15:06   ` Mauro Carvalho Chehab
2025-01-14  9:55     ` Jonathan Cameron
2025-01-14 10:08     ` Shiju Jose
2025-01-14 11:33       ` Mauro Carvalho Chehab
2025-01-30 19:18   ` Daniel Ferguson
2025-01-06 12:09 ` [PATCH v18 02/19] EDAC: Add scrub control feature shiju.jose
2025-01-06 15:57   ` Borislav Petkov
2025-01-06 19:34     ` Shiju Jose
2025-01-07  7:32       ` Borislav Petkov
2025-01-07  9:23         ` Shiju Jose
2025-01-08 15:47         ` Shiju Jose
2025-01-13 15:50   ` Mauro Carvalho Chehab
2025-01-30 19:18   ` Daniel Ferguson
2025-01-06 12:09 ` [PATCH v18 03/19] EDAC: Add ECS " shiju.jose
2025-01-13 16:09   ` Mauro Carvalho Chehab
2025-01-06 12:10 ` [PATCH v18 04/19] EDAC: Add memory repair " shiju.jose
2025-01-09  9:19   ` Borislav Petkov
2025-01-09 11:00     ` Shiju Jose
2025-01-09 12:32       ` Borislav Petkov
2025-01-09 14:24         ` Jonathan Cameron
2025-01-09 15:18           ` Borislav Petkov
2025-01-09 16:01             ` Jonathan Cameron [this message]
2025-01-09 16:19               ` Borislav Petkov
2025-01-09 18:34                 ` Jonathan Cameron
2025-01-09 23:51                   ` Dan Williams
2025-01-10 11:01                     ` Jonathan Cameron
2025-01-10 22:49                       ` Dan Williams
2025-01-13 11:40                         ` Jonathan Cameron
2025-01-14 19:35                           ` Dan Williams
2025-01-15 10:07                             ` Jonathan Cameron
2025-01-15 11:35                             ` Mauro Carvalho Chehab
2025-01-11 17:12                   ` Borislav Petkov
2025-01-13 11:07                     ` Jonathan Cameron
2025-01-21 16:16                       ` Borislav Petkov
2025-01-21 18:16                         ` Jonathan Cameron
2025-01-22 19:09                           ` Borislav Petkov
2025-02-06 13:39                             ` Jonathan Cameron
2025-02-17 13:23                               ` Borislav Petkov
2025-02-18 16:51                                 ` Jonathan Cameron
2025-02-19 18:45                                   ` Borislav Petkov
2025-02-20 12:19                                     ` Jonathan Cameron
2025-01-14 13:10                   ` Mauro Carvalho Chehab
2025-01-14 12:57               ` Mauro Carvalho Chehab
2025-01-14 12:38           ` Mauro Carvalho Chehab
2025-01-14 13:05             ` Jonathan Cameron
2025-01-14 14:39               ` Mauro Carvalho Chehab
2025-01-14 11:47   ` Mauro Carvalho Chehab
2025-01-14 12:31     ` Shiju Jose
2025-01-14 14:26       ` Mauro Carvalho Chehab
2025-01-14 13:47   ` Mauro Carvalho Chehab
2025-01-14 14:30     ` Shiju Jose
2025-01-15 12:03       ` Mauro Carvalho Chehab
2025-01-06 12:10 ` [PATCH v18 05/19] ACPI:RAS2: Add ACPI RAS2 driver shiju.jose
2025-01-21 23:01   ` Daniel Ferguson
2025-01-22 15:38     ` Shiju Jose
2025-01-30 19:19   ` Daniel Ferguson
2025-01-06 12:10 ` [PATCH v18 06/19] ras: mem: Add memory " shiju.jose
2025-01-21 23:01   ` Daniel Ferguson
2025-01-30 19:19   ` Daniel Ferguson
2025-01-06 12:10 ` [PATCH v18 07/19] cxl: Refactor user ioctl command path from mds to mailbox shiju.jose
2025-01-06 12:10 ` [PATCH v18 08/19] cxl: Add skeletal features driver shiju.jose
2025-01-06 12:10 ` [PATCH v18 09/19] cxl: Enumerate feature commands shiju.jose
2025-01-06 12:10 ` [PATCH v18 10/19] cxl: Add Get Supported Features command for kernel usage shiju.jose
2025-01-06 12:10 ` [PATCH v18 11/19] cxl: Add features driver attribute to emit number of features supported shiju.jose
2025-01-06 12:10 ` [PATCH v18 12/19] cxl/mbox: Add GET_FEATURE mailbox command shiju.jose
2025-01-06 12:10 ` [PATCH v18 13/19] cxl/mbox: Add SET_FEATURE " shiju.jose
2025-01-06 12:10 ` [PATCH v18 14/19] cxl: Setup exclusive CXL features that are reserved for the kernel shiju.jose
2025-01-06 12:10 ` [PATCH v18 15/19] cxl/memfeature: Add CXL memory device patrol scrub control feature shiju.jose
2025-01-24 20:38   ` Dan Williams
2025-01-27 10:06     ` Jonathan Cameron
2025-01-27 12:53     ` Shiju Jose
2025-01-27 23:17       ` Dan Williams
2025-01-29 12:28         ` Shiju Jose
2025-01-06 12:10 ` [PATCH v18 16/19] cxl/memfeature: Add CXL memory device ECS " shiju.jose
2025-01-06 12:10 ` [PATCH v18 17/19] cxl/mbox: Add support for PERFORM_MAINTENANCE mailbox command shiju.jose
2025-01-06 12:10 ` [PATCH v18 18/19] cxl/memfeature: Add CXL memory device soft PPR control feature shiju.jose
2025-01-06 12:10 ` [PATCH v18 19/19] cxl/memfeature: Add CXL memory device memory sparing " shiju.jose
2025-01-13 14:46 ` [PATCH v18 00/19] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers Mauro Carvalho Chehab
2025-01-13 15:36   ` Jonathan Cameron
2025-01-14 14:06     ` Mauro Carvalho Chehab
2025-01-13 18:15   ` Shiju Jose
2025-01-30 19:18 ` Daniel Ferguson
2025-02-03  9:25   ` Shiju Jose

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250109160159.00002add@huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=Jon.Grimm@amd.com \
    --cc=Vilas.Sridharan@amd.com \
    --cc=Yazen.Ghannam@amd.com \
    --cc=alison.schofield@intel.com \
    --cc=bp@alien8.de \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=david@redhat.com \
    --cc=dferguson@amperecomputing.com \
    --cc=duenwen@google.com \
    --cc=erdemaktas@google.com \
    --cc=gthelen@google.com \
    --cc=ira.weiny@intel.com \
    --cc=james.morse@arm.com \
    --cc=jiaqiyan@google.com \
    --cc=jthoughton@google.com \
    --cc=kangkang.shen@futurewei.com \
    --cc=lenb@kernel.org \
    --cc=leo.duran@amd.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxarm@huawei.com \
    --cc=mchehab@kernel.org \
    --cc=naoya.horiguchi@nec.com \
    --cc=nifan.cxl@gmail.com \
    --cc=pgonda@google.com \
    --cc=prime.zeng@hisilicon.com \
    --cc=rafael@kernel.org \
    --cc=rientjes@google.com \
    --cc=roberto.sassu@huawei.com \
    --cc=shiju.jose@huawei.com \
    --cc=somasundaram.a@hpe.com \
    --cc=tanxiaofei@huawei.com \
    --cc=tony.luck@intel.com \
    --cc=vishal.l.verma@intel.com \
    --cc=wanghuiqiang@huawei.com \
    --cc=wbs@os.amperecomputing.com \
    --cc=wschwartz@amperecomputing.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).