linux-acpi.vger.kernel.org archive mirror
* [PATCH] libnvdimm, pmem: fix badblocks notification crash
@ 2017-04-27 22:10 Dan Williams
       [not found] ` <149333101097.4714.1923436715100717938.stgit-p8uTFz9XbKj2zm6wflaqv1nYeNYlB/vhral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Dan Williams @ 2017-04-27 22:10 UTC (permalink / raw)
  To: linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA

The nd_pmem_notify() routine is called whenever an ARS
(address-range-scrub) completes to communicate results to the
per-namespace badblocks instances.

When the namespace is in btt mode we crash because we do not allocate a
struct pmem_device instance in that case, resulting in the following
crash signature:

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
 IP: nd_pmem_notify+0x30/0xf0 [nd_pmem]
 Call Trace:
  nd_device_notify+0x40/0x50
  child_notify+0x10/0x20
  device_for_each_child+0x50/0x90
  nd_region_notify+0x20/0x30
  nd_device_notify+0x40/0x50
  nvdimm_region_notify+0x27/0x30
  acpi_nfit_scrub+0x341/0x590 [nfit]
  process_one_work+0x197/0x450
  worker_thread+0x4e/0x4a0
  kthread+0x109/0x140

Given that we don't even populate the btt badblocks instance, just
return early and skip the device-to-region lookup.

This is a simpler version of the original fix by Toshi [1].

[1]: https://patchwork.kernel.org/patch/9700055/

Cc: Vishal Verma <vishal.l.verma-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reported-by: Toshi Kani <toshi.kani-ZPxbGqLxI0U@public.gmane.org>
Signed-off-by: Dan Williams <dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/nvdimm/pmem.c |   14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 5b536be5a12e..ee6cd31dafcf 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -388,21 +388,21 @@ static void nd_pmem_shutdown(struct device *dev)
 
 static void nd_pmem_notify(struct device *dev, enum nvdimm_event event)
 {
-	struct pmem_device *pmem = dev_get_drvdata(dev);
-	struct nd_region *nd_region = to_region(pmem);
 	resource_size_t offset = 0, end_trunc = 0;
 	struct nd_namespace_common *ndns;
 	struct nd_namespace_io *nsio;
+	struct nd_region *nd_region;
+	struct pmem_device *pmem;
 	struct resource res;
 
 	if (event != NVDIMM_REVALIDATE_POISON)
 		return;
 
-	if (is_nd_btt(dev)) {
-		struct nd_btt *nd_btt = to_nd_btt(dev);
+	/* no badblocks instance to update in the btt case */
+	if (is_nd_btt(dev))
+		return;
 
-		ndns = nd_btt->ndns;
-	} else if (is_nd_pfn(dev)) {
+	if (is_nd_pfn(dev)) {
 		struct nd_pfn *nd_pfn = to_nd_pfn(dev);
 		struct nd_pfn_sb *pfn_sb = nd_pfn->pfn_sb;
 
@@ -415,6 +415,8 @@ static void nd_pmem_notify(struct device *dev, enum nvdimm_event event)
 	nsio = to_nd_namespace_io(&ndns->dev);
 	res.start = nsio->res.start + offset;
 	res.end = nsio->res.end - end_trunc;
+	pmem = dev_get_drvdata(dev);
+	nd_region = to_region(pmem);
 	nvdimm_badblocks_populate(nd_region, &pmem->bb, &res);
 }
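
For readers following the ordering change in the hunks above, here is a minimal userspace sketch of the idea, not kernel code. The types `device_model` and `pmem_model` and the helpers `is_btt_model()` and `notify_model()` are hypothetical stand-ins invented for illustration. The only behavior taken from the changelog is that driver data is absent in btt mode, so the mode test has to come before any use of it. A fault address such as 0x30 is consistent with reading a member at a small offset from a NULL base pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum mode_model { MODE_RAW, MODE_BTT };

struct pmem_model {
	int bb_updates;			/* stands in for pmem->bb */
};

struct device_model {
	enum mode_model mode;
	struct pmem_model *drvdata;	/* NULL in btt mode, per the changelog */
};

/* Models is_nd_btt(): true when the device is a btt instance. */
static int is_btt_model(const struct device_model *dev)
{
	return dev->mode == MODE_BTT;
}

/*
 * Models the patched ordering: test the mode first, and only then read
 * the driver data. Returns 1 if a badblocks update ran, 0 if skipped.
 */
static int notify_model(struct device_model *dev)
{
	struct pmem_model *pmem;

	assert(dev != NULL);

	if (is_btt_model(dev))
		return 0;		/* early return: drvdata is never read */

	pmem = dev->drvdata;
	if (pmem == NULL)		/* defensive check for this model only */
		return 0;

	pmem->bb_updates++;		/* stands in for nvdimm_badblocks_populate() */
	return 1;
}
```

Calling `notify_model()` on a device in `MODE_BTT` with a NULL `drvdata` returns 0 without touching the pointer, while a `MODE_RAW` device with valid driver data is updated and returns 1.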


* Re: [PATCH] libnvdimm, pmem: fix badblocks notification crash
       [not found] ` <149333101097.4714.1923436715100717938.stgit-p8uTFz9XbKj2zm6wflaqv1nYeNYlB/vhral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2017-04-27 22:25   ` Kani, Toshimitsu
  2017-04-27 22:26     ` Dan Williams
  0 siblings, 1 reply; 4+ messages in thread
From: Kani, Toshimitsu @ 2017-04-27 22:25 UTC (permalink / raw)
  To: dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org,
	linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org
  Cc: linux-acpi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Thu, 2017-04-27 at 15:10 -0700, Dan Williams wrote:
> The nd_pmem_notify() routine is called whenever an ARS
> (address-range-scrub) completes to communicate results to the
> per-namespace badblocks instances.
> 
> When the namespace is in btt mode we crash because we do not allocate
> a struct pmem_device instance in that case. Resulting in the
> following crash signature:
> 
>  BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
>  IP: nd_pmem_notify+0x30/0xf0 [nd_pmem]
>  Call Trace:
>   nd_device_notify+0x40/0x50
>   child_notify+0x10/0x20
>   device_for_each_child+0x50/0x90
>   nd_region_notify+0x20/0x30
>   nd_device_notify+0x40/0x50
>   nvdimm_region_notify+0x27/0x30
>   acpi_nfit_scrub+0x341/0x590 [nfit]
>   process_one_work+0x197/0x450
>   worker_thread+0x4e/0x4a0
>   kthread+0x109/0x140
> 
> Given that we don't even populate the btt badblocks instance, just
> return early and skip the device to region lookup.

We populate the btt badblocks into nsio->bb, and check/clear them in
nsio_rw_bytes().

Thanks,
-Toshi
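
The check/clear behavior mentioned in this reply can be pictured with a small userspace model. This is only a sketch under stated assumptions: `bb_model`, `bb_add()`, `bb_check()`, `bb_clear()` and `rw_bytes_model()` are hypothetical names, not the kernel badblocks API or the real nsio_rw_bytes(). It assumes, for illustration, that a read overlapping a recorded bad range is refused with -EIO and that a write over it drops the whole range.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_BAD 8

/* Hypothetical, simplified stand-in for a per-namespace badblocks list. */
struct bad_range {
	unsigned long start, len;
};

struct bb_model {
	struct bad_range r[MAX_BAD];
	size_t count;
};

/* Records a bad range; returns 0 on success, -1 if full or empty range. */
static int bb_add(struct bb_model *bb, unsigned long start, unsigned long len)
{
	if (bb->count == MAX_BAD || len == 0)
		return -1;
	bb->r[bb->count].start = start;
	bb->r[bb->count].len = len;
	bb->count++;
	return 0;
}

/* Half-open interval overlap test: [a, a+alen) against [b, b+blen). */
static int overlaps(unsigned long a, unsigned long alen,
		    unsigned long b, unsigned long blen)
{
	return a < b + blen && b < a + alen;
}

/* Returns 1 if [start, start+len) overlaps any recorded bad range. */
static int bb_check(const struct bb_model *bb, unsigned long start,
		    unsigned long len)
{
	for (size_t i = 0; i < bb->count; i++)
		if (overlaps(start, len, bb->r[i].start, bb->r[i].len))
			return 1;
	return 0;
}

/* Drops every recorded range that overlaps [start, start+len). */
static void bb_clear(struct bb_model *bb, unsigned long start,
		     unsigned long len)
{
	size_t keep = 0;

	for (size_t i = 0; i < bb->count; i++)
		if (!overlaps(start, len, bb->r[i].start, bb->r[i].len))
			bb->r[keep++] = bb->r[i];
	bb->count = keep;
}

/* Models one access: bad reads fail with -EIO, writes clear the range. */
static int rw_bytes_model(struct bb_model *bb, unsigned long off,
			  unsigned long len, int is_write)
{
	assert(bb != NULL);

	if (!bb_check(bb, off, len))
		return 0;
	if (!is_write)
		return -EIO;
	bb_clear(bb, off, len);
	return 0;
}
```

In this model the list has to be kept current for the check to mean anything, which is why skipping the update entirely in btt mode would leave accesses unchecked against newly found errors.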


* Re: [PATCH] libnvdimm, pmem: fix badblocks notification crash
  2017-04-27 22:25   ` Kani, Toshimitsu
@ 2017-04-27 22:26     ` Dan Williams
  2017-04-27 22:28       ` Kani, Toshimitsu
  0 siblings, 1 reply; 4+ messages in thread
From: Dan Williams @ 2017-04-27 22:26 UTC (permalink / raw)
  To: Kani, Toshimitsu
  Cc: linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, Verma, Vishal L

On Thu, Apr 27, 2017 at 3:25 PM, Kani, Toshimitsu <toshi.kani@hpe.com> wrote:
> On Thu, 2017-04-27 at 15:10 -0700, Dan Williams wrote:
>> The nd_pmem_notify() routine is called whenever an ARS
>> (address-range-scrub) completes to communicate results to the
>> per-namespace badblocks instances.
>>
>> When the namespace is in btt mode we crash because we do not allocate
>> a struct pmem_device instance in that case. Resulting in the
>> following crash signature:
>>
>>  BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
>>  IP: nd_pmem_notify+0x30/0xf0 [nd_pmem]
>>  Call Trace:
>>   nd_device_notify+0x40/0x50
>>   child_notify+0x10/0x20
>>   device_for_each_child+0x50/0x90
>>   nd_region_notify+0x20/0x30
>>   nd_device_notify+0x40/0x50
>>   nvdimm_region_notify+0x27/0x30
>>   acpi_nfit_scrub+0x341/0x590 [nfit]
>>   process_one_work+0x197/0x450
>>   worker_thread+0x4e/0x4a0
>>   kthread+0x109/0x140
>>
>> Given that we don't even populate the btt badblocks instance, just
>> return early and skip the device to region lookup.
>
> We populate the btt badblocks into nsio->bb, and check/clear them in
> nsio_rw_bytes().

Argh, yes, we don't populate them out to the disk badblocks. I'll go
with your patch.


* Re: [PATCH] libnvdimm, pmem: fix badblocks notification crash
  2017-04-27 22:26     ` Dan Williams
@ 2017-04-27 22:28       ` Kani, Toshimitsu
  0 siblings, 0 replies; 4+ messages in thread
From: Kani, Toshimitsu @ 2017-04-27 22:28 UTC (permalink / raw)
  To: dan.j.williams@intel.com
  Cc: linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org,
	linux-acpi@vger.kernel.org, vishal.l.verma@intel.com

On Thu, 2017-04-27 at 15:26 -0700, Dan Williams wrote:
> On Thu, Apr 27, 2017 at 3:25 PM, Kani, Toshimitsu <toshi.kani@hpe.com> wrote:
> > On Thu, 2017-04-27 at 15:10 -0700, Dan Williams wrote:
> > > The nd_pmem_notify() routine is called whenever an ARS
> > > (address-range-scrub) completes to communicate results to the
> > > per-namespace badblocks instances.
> > > 
> > > When the namespace is in btt mode we crash because we do not
> > > allocate a struct pmem_device instance in that case. Resulting in
> > > the following crash signature:
> > > 
> > >  BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
> > >  IP: nd_pmem_notify+0x30/0xf0 [nd_pmem]
> > >  Call Trace:
> > >   nd_device_notify+0x40/0x50
> > >   child_notify+0x10/0x20
> > >   device_for_each_child+0x50/0x90
> > >   nd_region_notify+0x20/0x30
> > >   nd_device_notify+0x40/0x50
> > >   nvdimm_region_notify+0x27/0x30
> > >   acpi_nfit_scrub+0x341/0x590 [nfit]
> > >   process_one_work+0x197/0x450
> > >   worker_thread+0x4e/0x4a0
> > >   kthread+0x109/0x140
> > > 
> > > Given that we don't even populate the btt badblocks instance,
> > > just return early and skip the device to region lookup.
> > 
> > We populate the btt badblocks into nsio->bb, and check/clear them
> > in nsio_rw_bytes().
> 
> Argh, yes, we don't populate them out to the disk badblocks. I'll go
> with your patch.

Thanks!
-Toshi

