* [PATCHv2] nvmet: Fix crash when a namespace is disabled
@ 2025-02-07 12:41 Hannes Reinecke
2025-02-10 6:14 ` Shinichiro Kawasaki
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Hannes Reinecke @ 2025-02-07 12:41 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Keith Busch, Sagi Grimberg, linux-nvme, Hannes Reinecke,
Nilay Shroff
The namespace percpu counter protects pending I/O, and we can
only safely diable the namespace once the counter drop to zero.
Otherwise we end up with a crash when running blktests/nvme/058
(eg for loop transport):
[ 2352.930426] [ T53909] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN PTI
[ 2352.930431] [ T53909] KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
[ 2352.930434] [ T53909] CPU: 3 UID: 0 PID: 53909 Comm: kworker/u16:5 Tainted: G W 6.13.0-rc6 #232
[ 2352.930438] [ T53909] Tainted: [W]=WARN
[ 2352.930440] [ T53909] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
[ 2352.930443] [ T53909] Workqueue: nvmet-wq nvme_loop_execute_work [nvme_loop]
[ 2352.930449] [ T53909] RIP: 0010:blkcg_set_ioprio+0x44/0x180
as the queue is already torn down when calling submit_bio();
So we need to init the percpu counter in nvmet_ns_enable(), and
wait for it to drop to zero in nvmet_ns_disable() to avoid having
I/O pending after the namespace has been disabled.
Fixes: 74d16965d7ac ("nvmet-loop: avoid using mutex in IO hotpath")
Signed-off-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
---
drivers/nvme/target/core.c | 40 ++++++++++++++++++--------------------
1 file changed, 19 insertions(+), 21 deletions(-)
diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index cdc4a09a6e8a..2e741696f371 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -606,6 +606,9 @@ int nvmet_ns_enable(struct nvmet_ns *ns)
goto out_dev_put;
}
+ if (percpu_ref_init(&ns->ref, nvmet_destroy_namespace, 0, GFP_KERNEL))
+ goto out_pr_exit;
+
nvmet_ns_changed(subsys, ns->nsid);
ns->enabled = true;
xa_set_mark(&subsys->namespaces, ns->nsid, NVMET_NS_ENABLED);
@@ -613,6 +616,9 @@ int nvmet_ns_enable(struct nvmet_ns *ns)
out_unlock:
mutex_unlock(&subsys->lock);
return ret;
+out_pr_exit:
+ if (ns->pr.enable)
+ nvmet_pr_exit_ns(ns);
out_dev_put:
list_for_each_entry(ctrl, &subsys->ctrls, subsys_entry)
pci_dev_put(radix_tree_delete(&ctrl->p2p_ns_map, ns->nsid));
@@ -638,6 +644,19 @@ void nvmet_ns_disable(struct nvmet_ns *ns)
mutex_unlock(&subsys->lock);
+ /*
+ * Now that we removed the namespaces from the lookup list, we
+ * can kill the per_cpu ref and wait for any remaining references
+ * to be dropped, as well as a RCU grace period for anyone only
+ * using the namepace under rcu_read_lock(). Note that we can't
+ * use call_rcu here as we need to ensure the namespaces have
+ * been fully destroyed before unloading the module.
+ */
+ percpu_ref_kill(&ns->ref);
+ synchronize_rcu();
+ wait_for_completion(&ns->disable_done);
+ percpu_ref_exit(&ns->ref);
+
if (ns->pr.enable)
nvmet_pr_exit_ns(ns);
@@ -660,22 +679,6 @@ void nvmet_ns_free(struct nvmet_ns *ns)
if (ns->nsid == subsys->max_nsid)
subsys->max_nsid = nvmet_max_nsid(subsys);
- mutex_unlock(&subsys->lock);
-
- /*
- * Now that we removed the namespaces from the lookup list, we
- * can kill the per_cpu ref and wait for any remaining references
- * to be dropped, as well as a RCU grace period for anyone only
- * using the namepace under rcu_read_lock(). Note that we can't
- * use call_rcu here as we need to ensure the namespaces have
- * been fully destroyed before unloading the module.
- */
- percpu_ref_kill(&ns->ref);
- synchronize_rcu();
- wait_for_completion(&ns->disable_done);
- percpu_ref_exit(&ns->ref);
-
- mutex_lock(&subsys->lock);
subsys->nr_namespaces--;
mutex_unlock(&subsys->lock);
@@ -705,9 +708,6 @@ struct nvmet_ns *nvmet_ns_alloc(struct nvmet_subsys *subsys, u32 nsid)
ns->nsid = nsid;
ns->subsys = subsys;
- if (percpu_ref_init(&ns->ref, nvmet_destroy_namespace, 0, GFP_KERNEL))
- goto out_free;
-
if (ns->nsid > subsys->max_nsid)
subsys->max_nsid = nsid;
@@ -730,8 +730,6 @@ struct nvmet_ns *nvmet_ns_alloc(struct nvmet_subsys *subsys, u32 nsid)
return ns;
out_exit:
subsys->max_nsid = nvmet_max_nsid(subsys);
- percpu_ref_exit(&ns->ref);
-out_free:
kfree(ns);
out_unlock:
mutex_unlock(&subsys->lock);
--
2.35.3
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCHv2] nvmet: Fix crash when a namespace is disabled
2025-02-07 12:41 [PATCHv2] nvmet: Fix crash when a namespace is disabled Hannes Reinecke
@ 2025-02-10 6:14 ` Shinichiro Kawasaki
2025-02-12 18:27 ` Chaitanya Kulkarni
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Shinichiro Kawasaki @ 2025-02-10 6:14 UTC (permalink / raw)
To: Hannes Reinecke
Cc: hch, Keith Busch, Sagi Grimberg, linux-nvme@lists.infradead.org,
Nilay Shroff
On Feb 07, 2025 / 13:41, Hannes Reinecke wrote:
> The namespace percpu counter protects pending I/O, and we can
> only safely diable the namespace once the counter drop to zero.
> Otherwise we end up with a crash when running blktests/nvme/058
> (eg for loop transport):
>
> [ 2352.930426] [ T53909] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN PTI
> [ 2352.930431] [ T53909] KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
> [ 2352.930434] [ T53909] CPU: 3 UID: 0 PID: 53909 Comm: kworker/u16:5 Tainted: G W 6.13.0-rc6 #232
> [ 2352.930438] [ T53909] Tainted: [W]=WARN
> [ 2352.930440] [ T53909] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
> [ 2352.930443] [ T53909] Workqueue: nvmet-wq nvme_loop_execute_work [nvme_loop]
> [ 2352.930449] [ T53909] RIP: 0010:blkcg_set_ioprio+0x44/0x180
>
> as the queue is already torn down when calling submit_bio();
>
> So we need to init the percpu counter in nvmet_ns_enable(), and
> wait for it to drop to zero in nvmet_ns_disable() to avoid having
> I/O pending after the namespace has been disabled.
>
> Fixes: 74d16965d7ac ("nvmet-loop: avoid using mutex in IO hotpath")
>
> Signed-off-by: Hannes Reinecke <hare@kernel.org>
> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
I confirmed that this fix avoids the blktests failure at nvme/058. I also
run all blktests test cases and observed no regression. Thanks!
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCHv2] nvmet: Fix crash when a namespace is disabled
2025-02-07 12:41 [PATCHv2] nvmet: Fix crash when a namespace is disabled Hannes Reinecke
2025-02-10 6:14 ` Shinichiro Kawasaki
@ 2025-02-12 18:27 ` Chaitanya Kulkarni
2025-02-13 6:25 ` Christoph Hellwig
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Chaitanya Kulkarni @ 2025-02-12 18:27 UTC (permalink / raw)
To: Hannes Reinecke, Christoph Hellwig
Cc: Keith Busch, Sagi Grimberg, linux-nvme@lists.infradead.org,
Nilay Shroff
On 2/7/25 04:41, Hannes Reinecke wrote:
> The namespace percpu counter protects pending I/O, and we can
> only safely diable the namespace once the counter drop to zero.
> Otherwise we end up with a crash when running blktests/nvme/058
> (eg for loop transport):
>
> [ 2352.930426] [ T53909] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN PTI
> [ 2352.930431] [ T53909] KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
> [ 2352.930434] [ T53909] CPU: 3 UID: 0 PID: 53909 Comm: kworker/u16:5 Tainted: G W 6.13.0-rc6 #232
> [ 2352.930438] [ T53909] Tainted: [W]=WARN
> [ 2352.930440] [ T53909] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
> [ 2352.930443] [ T53909] Workqueue: nvmet-wq nvme_loop_execute_work [nvme_loop]
> [ 2352.930449] [ T53909] RIP: 0010:blkcg_set_ioprio+0x44/0x180
>
> as the queue is already torn down when calling submit_bio();
>
> So we need to init the percpu counter in nvmet_ns_enable(), and
> wait for it to drop to zero in nvmet_ns_disable() to avoid having
> I/O pending after the namespace has been disabled.
>
> Fixes: 74d16965d7ac ("nvmet-loop: avoid using mutex in IO hotpath")
>
> Signed-off-by: Hannes Reinecke<hare@kernel.org>
> Reviewed-by: Nilay Shroff<nilay@linux.ibm.com>
> ---
Looks good.
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
-ck
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCHv2] nvmet: Fix crash when a namespace is disabled
2025-02-07 12:41 [PATCHv2] nvmet: Fix crash when a namespace is disabled Hannes Reinecke
2025-02-10 6:14 ` Shinichiro Kawasaki
2025-02-12 18:27 ` Chaitanya Kulkarni
@ 2025-02-13 6:25 ` Christoph Hellwig
2025-02-17 8:03 ` Sagi Grimberg
2025-02-18 15:44 ` Keith Busch
4 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2025-02-13 6:25 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: Keith Busch, Sagi Grimberg, linux-nvme, Nilay Shroff
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCHv2] nvmet: Fix crash when a namespace is disabled
2025-02-07 12:41 [PATCHv2] nvmet: Fix crash when a namespace is disabled Hannes Reinecke
` (2 preceding siblings ...)
2025-02-13 6:25 ` Christoph Hellwig
@ 2025-02-17 8:03 ` Sagi Grimberg
2025-02-18 15:44 ` Keith Busch
4 siblings, 0 replies; 6+ messages in thread
From: Sagi Grimberg @ 2025-02-17 8:03 UTC (permalink / raw)
To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme, Nilay Shroff
Looks good,
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCHv2] nvmet: Fix crash when a namespace is disabled
2025-02-07 12:41 [PATCHv2] nvmet: Fix crash when a namespace is disabled Hannes Reinecke
` (3 preceding siblings ...)
2025-02-17 8:03 ` Sagi Grimberg
@ 2025-02-18 15:44 ` Keith Busch
4 siblings, 0 replies; 6+ messages in thread
From: Keith Busch @ 2025-02-18 15:44 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Christoph Hellwig, Sagi Grimberg, linux-nvme, Nilay Shroff
On Fri, Feb 07, 2025 at 01:41:34PM +0100, Hannes Reinecke wrote:
> The namespace percpu counter protects pending I/O, and we can
> only safely diable the namespace once the counter drop to zero.
> Otherwise we end up with a crash when running blktests/nvme/058
> (eg for loop transport):
Thanks, applied to nvme-6.14.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-02-18 15:44 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-07 12:41 [PATCHv2] nvmet: Fix crash when a namespace is disabled Hannes Reinecke
2025-02-10 6:14 ` Shinichiro Kawasaki
2025-02-12 18:27 ` Chaitanya Kulkarni
2025-02-13 6:25 ` Christoph Hellwig
2025-02-17 8:03 ` Sagi Grimberg
2025-02-18 15:44 ` Keith Busch
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox