* [PATCH v2] cxl/region: Fix null pointer dereference in region lookup
@ 2024-06-05 2:19 alison.schofield
2024-06-05 12:18 ` Jonathan Cameron
0 siblings, 1 reply; 3+ messages in thread
From: alison.schofield @ 2024-06-05 2:19 UTC (permalink / raw)
To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams
Cc: linux-cxl, Jonathan Cameron
From: Alison Schofield <alison.schofield@intel.com>
cxl_dpa_to_region() looks up a region based on a memdev and DPA.
It wrongly assumes an endpoint found mapping the DPA is also of
a fully assembled region. When not true it leads to a null pointer
dereference looking up the region name.
BUG: kernel NULL pointer dereference, address: 0000000000000050
RIP: 0010:__cxl_dpa_to_region+0x8c/0xc0 [cxl_core]
Call Trace:
<TASK>
? show_regs+0x5f/0x70
? __die+0x1f/0x70
? page_fault_oops+0x14b/0x430
? __cxl_dpa_to_region+0x8c/0xc0 [cxl_core]
? search_exception_tables+0x5b/0x60
? fixup_exception+0x22/0x300
? kernelmode_fixup_or_oops.constprop.0+0x5a/0x70
? __bad_area_nosemaphore+0x166/0x230
? up_read+0x43/0x90
? bad_area_nosemaphore+0x11/0x20
? do_user_addr_fault+0x2cb/0x6b0
? find_held_lock+0x31/0x90
? exc_page_fault+0x6e/0x220
? asm_exc_page_fault+0x27/0x30
? __cxl_dpa_to_region+0x8c/0xc0 [cxl_core]
? __cxl_dpa_to_region+0x35/0xc0 [cxl_core]
? __pfx___cxl_dpa_to_region+0x10/0x10 [cxl_core]
device_for_each_child+0x4a/0x70
cxl_dpa_to_region+0x61/0x70 [cxl_core]
cxl_inject_poison+0xde/0x1e0 [cxl_core]
cxl_debugfs_poison_inject+0x9/0x10 [cxl_mem]
This appears during testing of region lookup after a failure to
assemble a BIOS defined region or if the lookup raced with the
assembly of the BIOS defined region.
Failure to clean up BIOS defined regions that fail assembly is an
issue in itself and a fix to that problem will alleviate some of
the impact. It will not alleviate the race condition so let's harden
this path.
The behavior change is that the kernel oops due to a null pointer
dereference is replaced with a dev_dbg() message noting that an
endpoint was mapped.
Additional comments are added so that future users of this function
can more clearly understand what it provides.
Fixes: 0a105ab28a4d ("cxl/memdev: Warn of poison inject or clear to a mapped region")
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
Changes in v2:
- emit endpoint dev_dbg() message only, rm cxlr-ness (Dan)
- rm redundant null cxled check (Dan)
- add stack trace to commit log (Dan)
- s/Avoid/Fix in commit msg (Dan)
- add Reviewed-by Tag (Jonathan)
drivers/cxl/core/region.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 3c2b6144be23..d9819650c529 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2700,9 +2700,13 @@ static int __cxl_dpa_to_region(struct device *dev, void *arg)
if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start)
return 0;
- dev_dbg(dev, "dpa:0x%llx mapped in region:%s\n", dpa,
- dev_name(&cxled->cxld.region->dev));
-
+ /*
+ * Stop the region search (return 1) when an endpoint mapping is
+ * found. The region may not be fully constructed so offering
+ * the cxlr in the context structure is not guaranteed.
+ */
+ dev_dbg(dev, "dpa:0x%llx mapped in endpoint:%s\n", dpa,
+ dev_name(dev));
ctx->cxlr = cxled->cxld.region;
return 1;
base-commit: 49ba7b515c4c0719b866d16f068e62d16a8a3dd1
--
2.37.3
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: [PATCH v2] cxl/region: Fix null pointer dereference in region lookup
2024-06-05 2:19 [PATCH v2] cxl/region: Fix null pointer dereference in region lookup alison.schofield
@ 2024-06-05 12:18 ` Jonathan Cameron
2024-06-05 15:31 ` Alison Schofield
0 siblings, 1 reply; 3+ messages in thread
From: Jonathan Cameron @ 2024-06-05 12:18 UTC (permalink / raw)
To: alison.schofield
Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
Dan Williams, linux-cxl
On Tue, 4 Jun 2024 19:19:28 -0700
alison.schofield@intel.com wrote:
> From: Alison Schofield <alison.schofield@intel.com>
>
> cxl_dpa_to_region() looks up a region based on a memdev and DPA.
> It wrongly assumes an endpoint found mapping the DPA is also of
> a fully assembled region. When not true it leads to a null pointer
> dereference looking up the region name.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000050
> RIP: 0010:__cxl_dpa_to_region+0x8c/0xc0 [cxl_core]
> Call Trace:
> <TASK>
> ? show_regs+0x5f/0x70
> ? __die+0x1f/0x70
> ? page_fault_oops+0x14b/0x430
> ? __cxl_dpa_to_region+0x8c/0xc0 [cxl_core]
> ? search_exception_tables+0x5b/0x60
> ? fixup_exception+0x22/0x300
> ? kernelmode_fixup_or_oops.constprop.0+0x5a/0x70
> ? __bad_area_nosemaphore+0x166/0x230
> ? up_read+0x43/0x90
> ? bad_area_nosemaphore+0x11/0x20
> ? do_user_addr_fault+0x2cb/0x6b0
> ? find_held_lock+0x31/0x90
> ? exc_page_fault+0x6e/0x220
> ? asm_exc_page_fault+0x27/0x30
> ? __cxl_dpa_to_region+0x8c/0xc0 [cxl_core]
> ? __cxl_dpa_to_region+0x35/0xc0 [cxl_core]
> ? __pfx___cxl_dpa_to_region+0x10/0x10 [cxl_core]
> device_for_each_child+0x4a/0x70
> cxl_dpa_to_region+0x61/0x70 [cxl_core]
> cxl_inject_poison+0xde/0x1e0 [cxl_core]
> cxl_debugfs_poison_inject+0x9/0x10 [cxl_mem]
>
> This appears during testing of region lookup after a failure to
> assemble a BIOS defined region or if the lookup raced with the
> assembly of the BIOS defined region.
>
> Failure to clean up BIOS defined regions that fail assembly is an
> issue in itself and a fix to that problem will alleviate some of
> the impact. It will not alleviate the race condition so let's harden
> this path.
>
> The behavior change is that the kernel oops due to a null pointer
> dereference is replaced with a dev_dbg() message noting that an
> endpoint was mapped.
>
> Additional comments are added so that future users of this function
> can more clearly understand what it provides.
>
> Fixes: 0a105ab28a4d ("cxl/memdev: Warn of poison inject or clear to a mapped region")
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
This simplified version works for me so keep the tag... However given the spirit of
the patch changed rather a lot would have been reasonable to drop the RB.
> ---
>
> Changes in v2:
> - emit endpoint dev_dbg() message only, rm cxlr-ness (Dan)
> - rm redundant null cxled check (Dan)
> - add stack trace to commit log (Dan)
> - s/Avoid/Fix in commit msg (Dan)
> - add Reviewed-by Tag (Jonathan)
>
> drivers/cxl/core/region.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 3c2b6144be23..d9819650c529 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2700,9 +2700,13 @@ static int __cxl_dpa_to_region(struct device *dev, void *arg)
> if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start)
> return 0;
>
> - dev_dbg(dev, "dpa:0x%llx mapped in region:%s\n", dpa,
> - dev_name(&cxled->cxld.region->dev));
> -
> + /*
> + * Stop the region search (return 1) when an endpoint mapping is
> + * found. The region may not be fully constructed so offering
> + * the cxlr in the context structure is not guaranteed.
> + */
> + dev_dbg(dev, "dpa:0x%llx mapped in endpoint:%s\n", dpa,
> + dev_name(dev));
> ctx->cxlr = cxled->cxld.region;
>
> return 1;
>
> base-commit: 49ba7b515c4c0719b866d16f068e62d16a8a3dd1
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [PATCH v2] cxl/region: Fix null pointer dereference in region lookup
2024-06-05 12:18 ` Jonathan Cameron
@ 2024-06-05 15:31 ` Alison Schofield
0 siblings, 0 replies; 3+ messages in thread
From: Alison Schofield @ 2024-06-05 15:31 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
Dan Williams, linux-cxl
On Wed, Jun 05, 2024 at 01:18:35PM +0100, Jonathan Cameron wrote:
> On Tue, 4 Jun 2024 19:19:28 -0700
> alison.schofield@intel.com wrote:
>
> > From: Alison Schofield <alison.schofield@intel.com>
> >
> > cxl_dpa_to_region() looks up a region based on a memdev and DPA.
> > It wrongly assumes an endpoint found mapping the DPA is also of
> > a fully assembled region. When not true it leads to a null pointer
> > dereference looking up the region name.
> >
> > BUG: kernel NULL pointer dereference, address: 0000000000000050
> > RIP: 0010:__cxl_dpa_to_region+0x8c/0xc0 [cxl_core]
> > Call Trace:
> > <TASK>
> > ? show_regs+0x5f/0x70
> > ? __die+0x1f/0x70
> > ? page_fault_oops+0x14b/0x430
> > ? __cxl_dpa_to_region+0x8c/0xc0 [cxl_core]
> > ? search_exception_tables+0x5b/0x60
> > ? fixup_exception+0x22/0x300
> > ? kernelmode_fixup_or_oops.constprop.0+0x5a/0x70
> > ? __bad_area_nosemaphore+0x166/0x230
> > ? up_read+0x43/0x90
> > ? bad_area_nosemaphore+0x11/0x20
> > ? do_user_addr_fault+0x2cb/0x6b0
> > ? find_held_lock+0x31/0x90
> > ? exc_page_fault+0x6e/0x220
> > ? asm_exc_page_fault+0x27/0x30
> > ? __cxl_dpa_to_region+0x8c/0xc0 [cxl_core]
> > ? __cxl_dpa_to_region+0x35/0xc0 [cxl_core]
> > ? __pfx___cxl_dpa_to_region+0x10/0x10 [cxl_core]
> > device_for_each_child+0x4a/0x70
> > cxl_dpa_to_region+0x61/0x70 [cxl_core]
> > cxl_inject_poison+0xde/0x1e0 [cxl_core]
> > cxl_debugfs_poison_inject+0x9/0x10 [cxl_mem]
> >
> > This appears during testing of region lookup after a failure to
> > assemble a BIOS defined region or if the lookup raced with the
> > assembly of the BIOS defined region.
> >
> > Failure to clean up BIOS defined regions that fail assembly is an
> > issue in itself and a fix to that problem will alleviate some of
> > the impact. It will not alleviate the race condition so let's harden
> > this path.
> >
> > The behavior change is that the kernel oops due to a null pointer
> > dereference is replaced with a dev_dbg() message noting that an
> > endpoint was mapped.
> >
> > Additional comments are added so that future users of this function
> > can more clearly understand what it provides.
> >
> > Fixes: 0a105ab28a4d ("cxl/memdev: Warn of poison inject or clear to a mapped region")
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>
> This simplified version works for me so keep the tag... However given the spirit of
> the patch changed rather a lot would have been reasonable to drop the RB.
Understand, sorry for the oversight.
Thanks for the second review too.
-- Alison
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2024-06-05 15:32 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-05 2:19 [PATCH v2] cxl/region: Fix null pointer dereference in region lookup alison.schofield
2024-06-05 12:18 ` Jonathan Cameron
2024-06-05 15:31 ` Alison Schofield
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox