* [PATCH 0/3] cxl/region: Cache management and region decode reset fixes
From: Dan Williams @ 2023-06-17  1:24 UTC
To: linux-cxl; +Cc: Jonathan Cameron, Vikram Sethi
A recent conversation [1] highlighted that the current location of cache
management could cause problems when reconfiguring persistent memory
devices, or accelerators in the future.

As that touches paths near cxl_region_decode_reset(), it reminded me of a
pending bug in that area around handling reset failures [2]. Further
details are in the patches.
[1]: http://lore.kernel.org/r/BYAPR12MB33364B5EB908BF7239BB996BBD53A@BYAPR12MB3336.namprd12.prod.outlook.com
[2]: http://lore.kernel.org/r/20230316171441.0000205b@Huawei.com
---
Dan Williams (3):
cxl/region: Move cache invalidation before region teardown, and before setup
cxl/region: Flag partially torn down regions as unusable
cxl/region: Fix state transitions after reset failure
drivers/cxl/core/region.c | 102 ++++++++++++++++++++++++++++-----------------
drivers/cxl/cxl.h | 16 ++++---
2 files changed, 72 insertions(+), 46 deletions(-)
* [PATCH 1/3] cxl/region: Move cache invalidation before region teardown, and before setup
From: Dan Williams @ 2023-06-17  1:24 UTC
To: linux-cxl; +Cc: Jonathan Cameron, Vikram Sethi

Vikram raised a concern with the theoretical case of a CPU sending
MemClnEvict to a device that is not prepared to receive it. MemClnEvict is
a message sent after a CPU has taken ownership of a cacheline from
accelerator memory (HDM-DB). In the case of hotplug or HDM decoder
reconfiguration it is possible that the CPU is holding old contents for a
new device that has taken over the physical address range being cached by
the CPU.

To avoid this scenario, invalidate caches prior to tearing down an HDM
decoder configuration.

Now, this poses another problem: it is possible for something to speculate
into that space while the decode configuration is still up, so to close
that gap also invalidate prior to establishing new contents behind a given
physical address range.

With this change the cache invalidation is now explicit and need not be
checked in cxl_region_probe(), and that obviates the need for
CXL_REGION_F_INCOHERENT.
Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Fixes: d18bc74aced6 ("cxl/region: Manage CPU caches relative to DPA invalidation events")
Reported-by: Vikram Sethi <vsethi@nvidia.com>
Closes: http://lore.kernel.org/r/BYAPR12MB33364B5EB908BF7239BB996BBD53A@BYAPR12MB3336.namprd12.prod.outlook.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c | 66 +++++++++++++++++++++++++--------------------
 drivers/cxl/cxl.h         |  8 +----
 2 files changed, 38 insertions(+), 36 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 013f3656e661..d48c5916f2e9 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -125,10 +125,38 @@ static struct cxl_region_ref *cxl_rr_load(struct cxl_port *port,
 	return xa_load(&port->regions, (unsigned long)cxlr);
 }
 
+static int cxl_region_invalidate_memregion(struct cxl_region *cxlr)
+{
+	if (!cpu_cache_has_invalidate_memregion()) {
+		if (IS_ENABLED(CONFIG_CXL_REGION_INVALIDATION_TEST)) {
+			dev_warn_once(
+				&cxlr->dev,
+				"Bypassing cpu_cache_invalidate_memregion() for testing!\n");
+			return 0;
+		} else {
+			dev_err(&cxlr->dev,
+				"Failed to synchronize CPU cache state\n");
+			return -ENXIO;
+		}
+	}
+
+	cpu_cache_invalidate_memregion(IORES_DESC_CXL);
+	return 0;
+}
+
 static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
 {
 	struct cxl_region_params *p = &cxlr->params;
-	int i;
+	int i, rc = 0;
+
+	/*
+	 * Before region teardown attempt to flush, and if the flush
+	 * fails cancel the region teardown for data consistency
+	 * concerns
+	 */
+	rc = cxl_region_invalidate_memregion(cxlr);
+	if (rc)
+		return rc;
 
 	for (i = count - 1; i >= 0; i--) {
 		struct cxl_endpoint_decoder *cxled = p->targets[i];
@@ -136,7 +164,6 @@ static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
 		struct cxl_port *iter = cxled_to_port(cxled);
 		struct cxl_dev_state *cxlds = cxlmd->cxlds;
 		struct cxl_ep *ep;
-		int rc = 0;
 
 		if (cxlds->rcd)
 			goto endpoint_reset;
@@ -256,6 +283,14 @@ static ssize_t commit_store(struct device *dev, struct device_attribute *attr,
 		goto out;
 	}
 
+	/*
+	 * Invalidate caches before region setup to drop any speculative
+	 * consumption of this address space
+	 */
+	rc = cxl_region_invalidate_memregion(cxlr);
+	if (rc)
+		return rc;
+
 	if (commit)
 		rc = cxl_region_decode_commit(cxlr);
 	else {
@@ -1686,7 +1721,6 @@ static int cxl_region_attach(struct cxl_region *cxlr,
 		if (rc)
 			goto err_decrement;
 		p->state = CXL_CONFIG_ACTIVE;
-		set_bit(CXL_REGION_F_INCOHERENT, &cxlr->flags);
 	}
 
 	cxled->cxld.interleave_ways = p->interleave_ways;
@@ -2815,30 +2849,6 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, CXL);
 
-static int cxl_region_invalidate_memregion(struct cxl_region *cxlr)
-{
-	if (!test_bit(CXL_REGION_F_INCOHERENT, &cxlr->flags))
-		return 0;
-
-	if (!cpu_cache_has_invalidate_memregion()) {
-		if (IS_ENABLED(CONFIG_CXL_REGION_INVALIDATION_TEST)) {
-			dev_warn_once(
-				&cxlr->dev,
-				"Bypassing cpu_cache_invalidate_memregion() for testing!\n");
-			clear_bit(CXL_REGION_F_INCOHERENT, &cxlr->flags);
-			return 0;
-		} else {
-			dev_err(&cxlr->dev,
-				"Failed to synchronize CPU cache state\n");
-			return -ENXIO;
-		}
-	}
-
-	cpu_cache_invalidate_memregion(IORES_DESC_CXL);
-	clear_bit(CXL_REGION_F_INCOHERENT, &cxlr->flags);
-	return 0;
-}
-
 static int is_system_ram(struct resource *res, void *arg)
 {
 	struct cxl_region *cxlr = arg;
@@ -2866,8 +2876,6 @@ static int cxl_region_probe(struct device *dev)
 		goto out;
 	}
 
-	rc = cxl_region_invalidate_memregion(cxlr);
-
 	/*
 	 * From this point on any path that changes the region's state away from
 	 * CXL_CONFIG_COMMIT is also responsible for releasing the driver.
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index f0c428cb9a71..187abd8ba673 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -462,18 +462,12 @@ struct cxl_region_params {
 	int nr_targets;
 };
 
-/*
- * Flag whether this region needs to have its HPA span synchronized with
- * CPU cache state at region activation time.
- */
-#define CXL_REGION_F_INCOHERENT 0
-
 /*
  * Indicate whether this region has been assembled by autodetection or
  * userspace assembly. Prevent endpoint decoders outside of automatic
  * detection from being added to the region.
  */
-#define CXL_REGION_F_AUTO 1
+#define CXL_REGION_F_AUTO 0
 
 /**
  * struct cxl_region - CXL region
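The invalidate-before-teardown ordering above can be sketched as a small
user-space model. Everything here is a hypothetical stand-in for the kernel
helpers named in the patch (the globals and function names are assumptions
for illustration, not the real API); the point is only that a failed flush
cancels the teardown and leaves the decoders committed:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for platform/kernel state */
static bool cache_flush_supported = true; /* cpu_cache_has_invalidate_memregion() */
static int caches_invalidated;            /* times the HPA span was flushed */
static int decoders_committed = 1;        /* decoders still programmed */

/* Models cxl_region_invalidate_memregion(): error out when the
 * platform cannot flush, otherwise record one invalidation. */
static int invalidate_memregion(void)
{
	if (!cache_flush_supported)
		return -ENXIO;
	caches_invalidated++;
	return 0;
}

/* Models cxl_region_decode_reset() after the patch: invalidate
 * *before* touching any decoder; a failed flush cancels the teardown
 * so no stale cacheline can be written back to a future device. */
static int region_decode_reset(void)
{
	int rc = invalidate_memregion();

	if (rc)
		return rc; /* decoders left committed, data stays consistent */
	decoders_committed = 0;
	return 0;
}
```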
* Re: [PATCH 1/3] cxl/region: Move cache invalidation before region teardown, and before setup
From: Dave Jiang @ 2023-06-21  0:00 UTC
To: Dan Williams, linux-cxl; +Cc: Jonathan Cameron, Vikram Sethi

On 6/16/23 18:24, Dan Williams wrote:
> Vikram raised a concern with the theoretical case of a CPU sending
> MemClnEvict to a device that is not prepared to receive. MemClnEvict is
> a message that is sent after a CPU has taken ownership of a cacheline
> from accelerator memory (HDM-DB). In the case of hotplug or HDM decoder
> reconfiguration it is possible that the CPU is holding old contents for
> a new device that has taken over the physical address range being cached
> by the CPU.
>
> To avoid this scenario, invalidate caches prior to tearing down an HDM
> decoder configuration.
> [..]
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
* Re: [PATCH 1/3] cxl/region: Move cache invalidation before region teardown, and before setup
From: Jonathan Cameron @ 2023-06-22  9:00 UTC
To: Dan Williams, Vikram Sethi; +Cc: linux-cxl

On Fri, 16 Jun 2023 18:24:28 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Vikram raised a concern with the theoretical case of a CPU sending
> MemClnEvict to a device that is not prepared to receive.
> [..]
> With this change the cache invalidation is now explicit and need not be
> checked in cxl_region_probe(), and that obviates the need for
> CXL_REGION_F_INCOHERENT.

As far as I can tell (I feel my brain slowly frying when considering the
speculation opportunities) this should do the job.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

p.s. it sucks - architecture folk, we need less destructive ways of doing
this...
* [PATCH 2/3] cxl/region: Flag partially torn down regions as unusable
From: Dan Williams @ 2023-06-17  1:24 UTC
To: linux-cxl; +Cc: Jonathan Cameron

cxl_region_decode_reset() walks all the decoders associated with a given
region and disables them. Due to decoder ordering rules it is possible
that a switch in the topology notices that a given decoder cannot be shut
down before another region with a higher HPA is shut down first. That can
leave the region in a partially committed state.

Capture that state in a new CXL_REGION_F_NEEDS_RESET flag and require
that a successful cxl_region_decode_reset() attempt must be completed
before cxl_region_probe() accepts the region.

This is a corollary for the bug that Jonathan identified in "CXL/region:
commit reset of out of order region appears to succeed." [1].
Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Link: http://lore.kernel.org/r/20230316171441.0000205b@Huawei.com [1]
Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c | 12 ++++++++++++
 drivers/cxl/cxl.h         |  8 ++++++++
 2 files changed, 20 insertions(+)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index d48c5916f2e9..31f498f0fb3a 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -182,14 +182,19 @@ static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
 			rc = cxld->reset(cxld);
 			if (rc)
 				return rc;
+			set_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags);
 		}
 
 endpoint_reset:
 		rc = cxled->cxld.reset(&cxled->cxld);
 		if (rc)
 			return rc;
+		set_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags);
 	}
 
+	/* all decoders associated with this region have been torn down */
+	clear_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags);
+
 	return 0;
 }
 
@@ -2876,6 +2881,13 @@ static int cxl_region_probe(struct device *dev)
 		goto out;
 	}
 
+	if (test_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags)) {
+		dev_err(&cxlr->dev,
+			"failed to activate, re-commit region and retry\n");
+		rc = -ENXIO;
+		goto out;
+	}
+
 	/*
 	 * From this point on any path that changes the region's state away from
 	 * CXL_CONFIG_COMMIT is also responsible for releasing the driver.
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 187abd8ba673..cd7bf6697a51 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -469,6 +469,14 @@ struct cxl_region_params {
  */
 #define CXL_REGION_F_AUTO 0
 
+/*
+ * Require that a committed region successfully complete a teardown once
+ * any of its associated decoders have been torn down. This maintains
+ * the commit state for the region since there are committed decoders,
+ * but blocks cxl_region_probe().
+ */
+#define CXL_REGION_F_NEEDS_RESET 1
+
 /**
  * struct cxl_region - CXL region
 * @dev: This region's device
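The flag protocol this patch adds can likewise be modeled in a few lines of
user-space C. The bit operations and the injected per-decoder results below
are illustrative stand-ins (assumed names, not the kernel API); the point is
that the flag is set as soon as the first decoder goes down and is cleared
only after every decoder has been reset cleanly:

```c
#include <assert.h>
#include <errno.h>

#define F_NEEDS_RESET (1ul << 1) /* stand-in for CXL_REGION_F_NEEDS_RESET */

static unsigned long region_flags;
static int decoder_reset_rc[3]; /* injected result per cxld->reset() */

/* Models cxl_region_decode_reset(): decoders reset from the last
 * target down; any successful reset flags the region, a later failure
 * leaves the flag set (partially torn down), and only a clean full
 * pass clears it. */
static int region_decode_reset(int count)
{
	for (int i = count - 1; i >= 0; i--) {
		if (decoder_reset_rc[i])
			return decoder_reset_rc[i]; /* partially torn down */
		region_flags |= F_NEEDS_RESET;
	}
	/* all decoders associated with this region have been torn down */
	region_flags &= ~F_NEEDS_RESET;
	return 0;
}

/* Models the new check in cxl_region_probe() */
static int region_probe(void)
{
	return (region_flags & F_NEEDS_RESET) ? -ENXIO : 0;
}
```

With a failure injected mid-walk, the region stays flagged and probe refuses
it until a full reset pass succeeds.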
* Re: [PATCH 2/3] cxl/region: Flag partially torn down regions as unusable
From: Dave Jiang @ 2023-06-21  0:04 UTC
To: Dan Williams, linux-cxl; +Cc: Jonathan Cameron

On 6/16/23 18:24, Dan Williams wrote:
> cxl_region_decode_reset() walks all the decoders associated with a given
> region and disables them. Due to decoder ordering rules it is possible
> that a switch in the topology notices that a given decoder can not be
> shutdown before another region with a higher HPA is shutdown first. That
> can leave the region in a partially committed state.
> [..]
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
* Re: [PATCH 2/3] cxl/region: Flag partially torn down regions as unusable
From: Jonathan Cameron @ 2023-06-22  9:14 UTC
To: Dan Williams; +Cc: linux-cxl

On Fri, 16 Jun 2023 18:24:34 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> cxl_region_decode_reset() walks all the decoders associated with a given
> region and disables them.
> [..]
> This is a corollary for the bug that Jonathan identified in "CXL/region
> : commit reset of out of order region appears to succeed." [1].

I'm sure I understood this flow better a while back... I think this is
right. The subtlety that had me for a while was a failure of the first
cxld->reset(), but if that fails we can assume that all decoders are still
in place, so I think we are fine.

I wonder if a better approach (though heavy for a fix?) is a prepass that
checks it is possible to tear the region down, followed by actually doing
it (all under appropriate locking so state doesn't change on us). That way
there are no intermediate half-torn-down states to worry about in the
state machine.

From a spec side of things I've never been convinced the ordering rules
make sense. Some hardware / firmware simplification at the cost of nasty
software limitations.

Jonathan
* [PATCH 3/3] cxl/region: Fix state transitions after reset failure
From: Dan Williams @ 2023-06-17  1:24 UTC
To: linux-cxl; +Cc: Jonathan Cameron

Jonathan reports that failed attempts to reset a region (tear down its
HDM decoder configuration) mistakenly advance the state of the region to
"not committed". Revert to the previous state of the region on reset
failure so that the reset can be re-attempted.

Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Closes: http://lore.kernel.org/r/20230316171441.0000205b@Huawei.com
Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 31f498f0fb3a..38db377e13f1 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -296,9 +296,11 @@ static ssize_t commit_store(struct device *dev, struct device_attribute *attr,
 	if (rc)
 		return rc;
 
-	if (commit)
+	if (commit) {
 		rc = cxl_region_decode_commit(cxlr);
-	else {
+		if (rc == 0)
+			p->state = CXL_CONFIG_COMMIT;
+	} else {
 		p->state = CXL_CONFIG_RESET_PENDING;
 		up_write(&cxl_region_rwsem);
 		device_release_driver(&cxlr->dev);
@@ -308,18 +310,20 @@ static ssize_t commit_store(struct device *dev, struct device_attribute *attr,
 		 * The lock was dropped, so need to revalidate that the reset is
 		 * still pending.
 		 */
-		if (p->state == CXL_CONFIG_RESET_PENDING)
+		if (p->state == CXL_CONFIG_RESET_PENDING) {
 			rc = cxl_region_decode_reset(cxlr, p->interleave_ways);
+			/*
+			 * Revert to committed since there may still be active
+			 * decoders associated with this region, or move forward
+			 * to active to mark the reset successful
+			 */
+			if (rc)
+				p->state = CXL_CONFIG_COMMIT;
+			else
+				p->state = CXL_CONFIG_ACTIVE;
+		}
 	}
 
-	if (rc)
-		goto out;
-
-	if (commit)
-		p->state = CXL_CONFIG_COMMIT;
-	else if (p->state == CXL_CONFIG_RESET_PENDING)
-		p->state = CXL_CONFIG_ACTIVE;
-
 out:
 	up_write(&cxl_region_rwsem);
* Re: [PATCH 3/3] cxl/region: Fix state transitions after reset failure
  2023-06-17  1:24 ` [PATCH 3/3] cxl/region: Fix state transitions after reset failure Dan Williams
@ 2023-06-21  0:06   ` Dave Jiang
  2023-06-22  9:18   ` Jonathan Cameron

  1 sibling, 0 replies; 11+ messages in thread
From: Dave Jiang @ 2023-06-21  0:06 UTC (permalink / raw)
To: Dan Williams, linux-cxl; +Cc: Jonathan Cameron

On 6/16/23 18:24, Dan Williams wrote:
> Jonathan reports that failed attempts to reset a region (teardown its
> HDM decoder configuration) mistakenly advance the state of the region
> to "not committed". Revert to the previous state of the region on reset
> failure so that the reset can be re-attempted.
>
> Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Closes: http://lore.kernel.org/r/20230316171441.0000205b@Huawei.com
> Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>

> ---
>  drivers/cxl/core/region.c |   26 +++++++++++++++-----------
>  1 file changed, 15 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 31f498f0fb3a..38db377e13f1 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -296,9 +296,11 @@ static ssize_t commit_store(struct device *dev, struct device_attribute *attr,
>  	if (rc)
>  		return rc;
>
> -	if (commit)
> +	if (commit) {
>  		rc = cxl_region_decode_commit(cxlr);
> -	else {
> +		if (rc == 0)
> +			p->state = CXL_CONFIG_COMMIT;
> +	} else {
>  		p->state = CXL_CONFIG_RESET_PENDING;
>  		up_write(&cxl_region_rwsem);
>  		device_release_driver(&cxlr->dev);
> @@ -308,18 +310,20 @@ static ssize_t commit_store(struct device *dev, struct device_attribute *attr,
>  		 * The lock was dropped, so need to revalidate that the reset is
>  		 * still pending.
>  		 */
> -		if (p->state == CXL_CONFIG_RESET_PENDING)
> +		if (p->state == CXL_CONFIG_RESET_PENDING) {
>  			rc = cxl_region_decode_reset(cxlr, p->interleave_ways);
> +			/*
> +			 * Revert to committed since there may still be active
> +			 * decoders associated with this region, or move forward
> +			 * to active to mark the reset successful
> +			 */
> +			if (rc)
> +				p->state = CXL_CONFIG_COMMIT;
> +			else
> +				p->state = CXL_CONFIG_ACTIVE;
> +		}
>  	}
>
> -	if (rc)
> -		goto out;
> -
> -	if (commit)
> -		p->state = CXL_CONFIG_COMMIT;
> -	else if (p->state == CXL_CONFIG_RESET_PENDING)
> -		p->state = CXL_CONFIG_ACTIVE;
> -
>  out:
>  	up_write(&cxl_region_rwsem);
>
* Re: [PATCH 3/3] cxl/region: Fix state transitions after reset failure
  2023-06-17  1:24 ` [PATCH 3/3] cxl/region: Fix state transitions after reset failure Dan Williams
  2023-06-21  0:06   ` Dave Jiang
@ 2023-06-22  9:18   ` Jonathan Cameron
  2023-06-25 20:42     ` Dan Williams

  1 sibling, 1 reply; 11+ messages in thread
From: Jonathan Cameron @ 2023-06-22  9:18 UTC (permalink / raw)
To: Dan Williams; +Cc: linux-cxl

On Fri, 16 Jun 2023 18:24:39 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Jonathan reports that failed attempts to reset a region (teardown its
> HDM decoder configuration) mistakenly advance the state of the region
> to "not committed". Revert to the previous state of the region on reset
> failure so that the reset can be re-attempted.
>
> Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Closes: http://lore.kernel.org/r/20230316171441.0000205b@Huawei.com
> Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

LGTM - though maybe even nicer if we can be pretty sure this will
succeed before trying it.. (same comment as previous patch)

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/cxl/core/region.c |   26 +++++++++++++++-----------
>  1 file changed, 15 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 31f498f0fb3a..38db377e13f1 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -296,9 +296,11 @@ static ssize_t commit_store(struct device *dev, struct device_attribute *attr,
>  	if (rc)
>  		return rc;
>
> -	if (commit)
> +	if (commit) {
>  		rc = cxl_region_decode_commit(cxlr);
> -	else {
> +		if (rc == 0)
> +			p->state = CXL_CONFIG_COMMIT;
> +	} else {
>  		p->state = CXL_CONFIG_RESET_PENDING;
>  		up_write(&cxl_region_rwsem);
>  		device_release_driver(&cxlr->dev);
> @@ -308,18 +310,20 @@ static ssize_t commit_store(struct device *dev, struct device_attribute *attr,
>  		 * The lock was dropped, so need to revalidate that the reset is
>  		 * still pending.
>  		 */
> -		if (p->state == CXL_CONFIG_RESET_PENDING)
> +		if (p->state == CXL_CONFIG_RESET_PENDING) {
>  			rc = cxl_region_decode_reset(cxlr, p->interleave_ways);
> +			/*
> +			 * Revert to committed since there may still be active
> +			 * decoders associated with this region, or move forward
> +			 * to active to mark the reset successful
> +			 */
> +			if (rc)
> +				p->state = CXL_CONFIG_COMMIT;
> +			else
> +				p->state = CXL_CONFIG_ACTIVE;
> +		}
>  	}
>
> -	if (rc)
> -		goto out;
> -
> -	if (commit)
> -		p->state = CXL_CONFIG_COMMIT;
> -	else if (p->state == CXL_CONFIG_RESET_PENDING)
> -		p->state = CXL_CONFIG_ACTIVE;
> -
>  out:
>  	up_write(&cxl_region_rwsem);
>
* Re: [PATCH 3/3] cxl/region: Fix state transitions after reset failure
  2023-06-22  9:18 ` Jonathan Cameron
@ 2023-06-25 20:42   ` Dan Williams
  0 siblings, 0 replies; 11+ messages in thread
From: Dan Williams @ 2023-06-25 20:42 UTC (permalink / raw)
To: Jonathan Cameron, Dan Williams; +Cc: linux-cxl

Jonathan Cameron wrote:
> On Fri, 16 Jun 2023 18:24:39 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
>
> > Jonathan reports that failed attempts to reset a region (teardown its
> > HDM decoder configuration) mistakenly advance the state of the region
> > to "not committed". Revert to the previous state of the region on reset
> > failure so that the reset can be re-attempted.
> >
> > Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> > Closes: http://lore.kernel.org/r/20230316171441.0000205b@Huawei.com
> > Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> LGTM - though maybe even nicer if we can be pretty sure this will
> succeed before trying it.. (same comment as previous patch)

I had the same reaction, but satisfied myself that this is something
that userspace can manage. I.e. tooling can effectively predict when the
kernel will complain about this ordering situation and prevent it. In
other words, the only way this happens in practice is if userspace makes
a mistake.

It is already the case that partially committed decoders need to be
tolerated by the platform since setup and teardown are not atomic. So I
think 'cxl destroy-region' is where this follow-on smarts belongs.

> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Thanks for the collaboration as always.
end of thread, other threads:[~2023-06-25 20:42 UTC | newest]

Thread overview: 11+ messages:
2023-06-17  1:24 [PATCH 0/3] cxl/region: Cache management and region decode reset fixes Dan Williams
2023-06-17  1:24 ` [PATCH 1/3] cxl/region: Move cache invalidation before region teardown, and before setup Dan Williams
2023-06-21  0:00   ` Dave Jiang
2023-06-22  9:00   ` Jonathan Cameron
2023-06-17  1:24 ` [PATCH 2/3] cxl/region: Flag partially torn down regions as unusable Dan Williams
2023-06-21  0:04   ` Dave Jiang
2023-06-22  9:14   ` Jonathan Cameron
2023-06-17  1:24 ` [PATCH 3/3] cxl/region: Fix state transitions after reset failure Dan Williams
2023-06-21  0:06   ` Dave Jiang
2023-06-22  9:18   ` Jonathan Cameron
2023-06-25 20:42     ` Dan Williams