From: Alison Schofield <alison.schofield@intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
Dave Jiang <dave.jiang@intel.com>,
Ben Widawsky <bwidawsk@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
linux-cxl@vger.kernel.org,
Jonathan Cameron <Jonathan.Cameron@huawei.com>
Subject: Re: [PATCH v12 4/6] cxl/region: Provide region info to the cxl_poison trace event
Date: Wed, 12 Apr 2023 11:39:29 -0700
Message-ID: <ZDb64ZQpNwZemQeg@aschofie-mobl2>
In-Reply-To: <643647e5502ab_417e29445@dwillia2-xfh.jf.intel.com.notmuch>
On Tue, Apr 11, 2023 at 10:55:49PM -0700, Dan Williams wrote:
> alison.schofield@ wrote:
> > From: Alison Schofield <alison.schofield@intel.com>
> >
> > User space may need to know which region, if any, maps the poison
> > address(es) logged in a cxl_poison trace event. Since the mapping
> > of DPAs (device physical addresses) to a region can change, the
> > kernel must provide this information at the time the poison list
> > is read. The event informs user space that at event <timestamp>
> > this <region> mapped to this <DPA>, which is poisoned.
> >
> > The cxl_poison trace event is already wired up to log the region
> > name and uuid if it receives param 'struct cxl_region'.
> >
> > In order to provide that cxl_region, add another method for gathering
> > poison - by committed endpoint decoder mappings. This method is only
> > available with CONFIG_CXL_REGION and is only used if a region actually
> > maps the memdev where poison is being read. After the region driver
> > reads the poison list for all the mapped resources, control returns
> > to the memdev driver, where poison is read for any remaining unmapped
> > resources.
> >
> > Mixed mode decoders are not currently supported in Linux. Add a debug
> > message to the poison request path. That will serve as an alert that
> > poison list retrieval needs to add support for mixed mode.
> >
> > The default method remains: read the poison by memdev resource.
> >
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> > Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> > ---
> > drivers/cxl/core/core.h | 11 +++++++
> > drivers/cxl/core/memdev.c | 62 +++++++++++++++++++++++++++++++++++++-
> > drivers/cxl/core/region.c | 63 +++++++++++++++++++++++++++++++++++++++
> > 3 files changed, 135 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> > index e888e293943e..57bd22e01a0b 100644
> > --- a/drivers/cxl/core/core.h
> > +++ b/drivers/cxl/core/core.h
> > @@ -25,7 +25,12 @@ void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled);
> > #define CXL_DAX_REGION_TYPE(x) (&cxl_dax_region_type)
> > int cxl_region_init(void);
> > void cxl_region_exit(void);
> > +int cxl_get_poison_by_endpoint(struct device *dev, void *data);
> > #else
> > +static inline int cxl_get_poison_by_endpoint(struct device *dev, void *data)
> > +{
> > + return 0;
> > +}
>
> For a public function the lack of type safety jumps out at me... more
> below:
>
> > static inline void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled)
> > {
> > }
> > @@ -68,4 +73,10 @@ enum cxl_poison_trace_type {
> > CXL_POISON_TRACE_LIST,
> > };
> >
> > +struct cxl_trigger_poison_context {
> > + struct cxl_port *port;
> > + enum cxl_decoder_mode mode;
> > + u64 offset;
> > +};
> > +
> > #endif /* __CXL_CORE_H__ */
> > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> > index 297d87ebaca6..f26b5b6cda10 100644
> > --- a/drivers/cxl/core/memdev.c
> > +++ b/drivers/cxl/core/memdev.c
> > @@ -106,6 +106,47 @@ static ssize_t numa_node_show(struct device *dev, struct device_attribute *attr,
> > }
> > static DEVICE_ATTR_RO(numa_node);
> >
> > +static int cxl_get_poison_unmapped(struct cxl_memdev *cxlmd,
> > + struct cxl_trigger_poison_context *ctx)
> > +{
> > + struct cxl_dev_state *cxlds = cxlmd->cxlds;
> > + u64 offset, length;
> > + int rc = 0;
> > +
> > + /*
> > + * Collect poison for the remaining unmapped resources
> > + * after poison is collected by committed endpoints.
> > + *
> > + * Knowing that PMEM must always follow RAM, get poison
> > + * for unmapped resources based on the last decoder's mode:
> > + * ram: scan remains of ram range, then any pmem range
> > + * pmem: scan remains of pmem range
> > + */
> > +
> > + if (ctx->mode == CXL_DECODER_RAM) {
> > + offset = ctx->offset;
> > + length = resource_size(&cxlds->ram_res) - offset;
> > + rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
> > + if (rc == -EFAULT)
> > + rc = 0;
> > + if (rc)
> > + return rc;
> > + }
> > + if (ctx->mode == CXL_DECODER_PMEM) {
> > + offset = ctx->offset;
> > + length = resource_size(&cxlds->dpa_res) - offset;
> > + if (!length)
> > + return 0;
> > + } else if (resource_size(&cxlds->pmem_res)) {
> > + offset = cxlds->pmem_res.start;
> > + length = resource_size(&cxlds->pmem_res);
> > + } else {
> > + return 0;
> > + }
> > +
> > + return cxl_mem_get_poison(cxlmd, offset, length, NULL);
> > +}
> > +
> > static int cxl_get_poison_by_memdev(struct cxl_memdev *cxlmd)
> > {
> > struct cxl_dev_state *cxlds = cxlmd->cxlds;
> > @@ -139,14 +180,33 @@ ssize_t cxl_trigger_poison_list(struct device *dev,
> > const char *buf, size_t len)
> > {
> > struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> > + struct cxl_trigger_poison_context ctx;
> > + struct cxl_port *port;
> > bool trigger;
> > ssize_t rc;
> >
> > if (kstrtobool(buf, &trigger) || !trigger)
> > return -EINVAL;
> >
> > + port = dev_get_drvdata(&cxlmd->dev);
> > + if (!port || !is_cxl_endpoint(port))
> > + return -EINVAL;
> > +
> > down_read(&cxl_dpa_rwsem);
> > - rc = cxl_get_poison_by_memdev(cxlmd);
> > + if (port->commit_end == -1) {
> > + /* No regions mapped to this memdev */
> > + rc = cxl_get_poison_by_memdev(cxlmd);
> > + } else {
> > + /* Regions mapped, collect poison by endpoint */
> > + ctx = (struct cxl_trigger_poison_context) {
> > + .port = port,
> > + };
> > + rc = device_for_each_child(&port->dev, &ctx,
> > + cxl_get_poison_by_endpoint);
> > + if (rc == 1)
> > + rc = cxl_get_poison_unmapped(cxlmd, &ctx);
>
> Ah, cxl_get_poison_by_endpoint() is a function pointer to
> device_for_each_child(), that really feels like a detail that's private
> to the implementation.
>
> I would reorganize this to something like:
>
>     if (port->commit_end == -1) {
>             /* No regions mapped to this memdev */
>             rc = cxl_get_poison_by_memdev(cxlmd);
>     } else {
>             /* Regions mapped, collect poison by endpoint */
>             rc = cxl_get_poison_by_endpoint(endpoint);
>     }
>
> ...and then internal to cxl_get_poison_by_endpoint() do:
>
>     ctx = (struct cxl_trigger_poison_context) {
>             .port = endpoint,
>     };
>     rc = device_for_each_child(&port->dev, &ctx,
>                                poison_by_decoder);
>     if (rc == 1)
>             rc = cxl_get_poison_unmapped(cxlmd, &ctx);
>
> ...then the header file reads more calmly with the added type-safety.
I'll take another stab at this.

A couple of versions back, I pulled cxl_get_poison_unmapped() out of
cxl_get_poison_by_endpoint() since it was not really the work of the
region driver. (Ira called that out in a review.)
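
One possible shape for that rework, as a rough userspace sketch rather than
the actual patch: the untyped void * callback stays static in the
implementation file, the exported function takes a typed endpoint, and the
end of the scanned range is handed back so the memdev side can do the
unmapped scan itself. Everything here is a hypothetical stand-in (struct
decoder, struct endpoint, struct poison_context, for_each_decoder(), and
the scanned_end out-parameter); none of it is the real CXL core API, and
the "rc == 1" convention from the patch is not modeled.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types, not the real definitions. */
struct decoder {
	int committed;                 /* nonzero if the decoder maps a region */
	unsigned long long dpa_start;  /* first DPA covered by the decoder */
	unsigned long long dpa_len;    /* bytes of DPA covered by the decoder */
};

struct endpoint {
	struct decoder *decoders;
	size_t nr_decoders;
};

/* Private context: never appears in the header. */
struct poison_context {
	struct endpoint *port;
	unsigned long long offset;     /* end of the last mapped range seen */
};

/* Mock of a device_for_each_child()-style iterator: stops on nonzero. */
static int for_each_decoder(struct endpoint *ep, void *data,
			    int (*fn)(struct decoder *d, void *data))
{
	for (size_t i = 0; i < ep->nr_decoders; i++) {
		int rc = fn(&ep->decoders[i], data);

		if (rc)
			return rc;
	}
	return 0;
}

/* The untyped callback is static, so the void * never leaks to callers. */
static int poison_by_decoder(struct decoder *d, void *arg)
{
	struct poison_context *ctx = arg;

	if (!d->committed || !d->dpa_len)
		return 0;  /* nothing mapped here, keep iterating */

	/* A real implementation would read the poison list for this range. */
	ctx->offset = d->dpa_start + d->dpa_len;
	return 0;
}

/* Typed public entry point: the only symbol a header would declare. */
int get_poison_by_endpoint(struct endpoint *ep, unsigned long long *scanned_end)
{
	struct poison_context ctx = { .port = ep };
	int rc = for_each_decoder(ep, &ctx, poison_by_decoder);

	if (rc)
		return rc;

	*scanned_end = ctx.offset;  /* caller scans whatever lies beyond this */
	return 0;
}
```

With this split, the caller would pass scanned_end on to its own
unmapped-range scan, which keeps that step out of the region side.
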
Thanks,
Alison
>
> > + }
> > +
> > up_read(&cxl_dpa_rwsem);
> >
> > return rc ? rc : len;
> > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > index f29028148806..4c4d3a6d631d 100644
> > --- a/drivers/cxl/core/region.c
> > +++ b/drivers/cxl/core/region.c
> > @@ -2213,6 +2213,69 @@ struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev)
> > }
> > EXPORT_SYMBOL_NS_GPL(to_cxl_pmem_region, CXL);
> >
> > +int cxl_get_poison_by_endpoint(struct device *dev, void *arg)
> > +{
> > + struct cxl_trigger_poison_context *ctx = arg;
> > + struct cxl_endpoint_decoder *cxled;
> > + struct cxl_port *port = ctx->port;
> > + struct cxl_memdev *cxlmd;
> > + u64 offset, length;
> > + int rc = 0;
> > +
> > + down_read(&cxl_region_rwsem);
> > +
> > + if (!is_endpoint_decoder(dev))
> > + goto out;
> > +
> > + cxled = to_cxl_endpoint_decoder(dev);
> > + if (!cxled->dpa_res || !resource_size(cxled->dpa_res))
> > + goto out;
> > +
> > + /*
> > + * Regions are only created with single mode decoders: pmem or ram.
> > + * Linux does not currently support mixed mode decoders. This means
> > + * that reading poison per endpoint decoder adheres to the spec
> > + * requirement that poison reads of pmem and ram must be separated.
> > + * CXL 3.0 Spec 8.2.9.8.4.1
> > + *
> > + * Watch for future support of mixed with a dev_dbg() msg.
>
> This sentence can go, mixed will never* be supported.
>
> * at least until the vendor that manages to ship such a thing comes and
> explains why the kernel needs to work around that awkwardness.
>
Got it!
> > + */
snip
>