* [PATCH RFC] nvme: fix crash due to invalid cdev teardown
@ 2026-06-04 13:43 Maurizio Lombardi
2026-06-04 13:51 ` Maurizio Lombardi
2026-06-04 21:33 ` Keith Busch
0 siblings, 2 replies; 3+ messages in thread
From: Maurizio Lombardi @ 2026-06-04 13:43 UTC (permalink / raw)
To: kbusch; +Cc: hch, linux-nvme, dwagner, mlombard
In the NVMe multipath code, if nvme_add_ns_head_cdev() fails during
nvme_mpath_set_live(), the error is completely ignored. However, during
teardown, nvme_remove_head() unconditionally calls nvme_cdev_del().
This teardown asymmetry leads to a kernel panic if the character device
was never successfully initialized, which might happen in very rare
circumstances.
BUG: kernel NULL pointer dereference, address: 00000000000000d0
device_del+0x39/0x3c0
cdev_device_del+0x15/0x50
nvme_cdev_del+0xe/0x20 [nvme_core]
nvme_mpath_shutdown_disk+0x38/0x60 [nvme_core]
nvme_ns_remove+0x177/0x1f0 [nvme_core]
nvme_remove_namespaces+0xdc/0x130 [nvme_core]
nvme_do_delete_ctrl+0x71/0xd0 [nvme_core]
Regarding head namespaces, introduce the NVME_NSHEAD_CDEV_LIVE bit to
track the successful creation of the head character device.
nvme_remove_head() now checks this bit before attempting deletion.
Enforcing rollback here is unsafe, as it triggers kobject_init panics
on the embedded device struct during asynchronous ANA retries.
Regarding non-head namespaces, enforce strict failure by capturing the
return value of nvme_add_ns_cdev() and rolling back the namespace's
creation. Because nvme_alloc_ns() completely frees the nvme_ns structure
on error, rolling back is safe here.
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
drivers/nvme/host/core.c | 5 ++++-
drivers/nvme/host/multipath.c | 5 ++++-
drivers/nvme/host/nvme.h | 1 +
3 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 72c50d5e938d..1e24232c0aac 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4213,13 +4213,16 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, struct nvme_ns_info *info)
goto out_cleanup_ns_from_list;
if (!nvme_ns_head_multipath(ns->head))
- nvme_add_ns_cdev(ns);
+ if (nvme_add_ns_cdev(ns))
+ goto out_del_disk;
nvme_mpath_add_disk(ns, info->anagrpid);
nvme_fault_inject_init(&ns->fault_inject, ns->disk->disk_name);
return;
+ out_del_disk:
+ del_gendisk(ns->disk);
out_cleanup_ns_from_list:
nvme_put_ctrl(ctrl);
mutex_lock(&ctrl->namespaces_lock);
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 263161cb8ac0..a786eb18735a 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -628,6 +628,8 @@ static int nvme_add_ns_head_cdev(struct nvme_ns_head *head)
return ret;
ret = nvme_cdev_add(&head->cdev, &head->cdev_device,
&nvme_ns_head_chr_fops, THIS_MODULE);
+ if (!ret)
+ set_bit(NVME_NSHEAD_CDEV_LIVE, &head->flags);
return ret;
}
@@ -672,7 +674,8 @@ static void nvme_remove_head(struct nvme_ns_head *head)
*/
kblockd_schedule_work(&head->requeue_work);
- nvme_cdev_del(&head->cdev, &head->cdev_device);
+ if (test_and_clear_bit(NVME_NSHEAD_CDEV_LIVE, &head->flags))
+ nvme_cdev_del(&head->cdev, &head->cdev_device);
synchronize_srcu(&head->srcu);
del_gendisk(head->disk);
}
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 9ccaed0b9dbf..e8833e56e9fd 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -567,6 +567,7 @@ struct nvme_ns_head {
unsigned int delayed_removal_secs;
#define NVME_NSHEAD_DISK_LIVE 0
#define NVME_NSHEAD_QUEUE_IF_NO_PATH 1
+#define NVME_NSHEAD_CDEV_LIVE 2
struct nvme_ns __rcu *current_path[];
#endif
};
--
2.54.0
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: [PATCH RFC] nvme: fix crash due to invalid cdev teardown
2026-06-04 13:43 [PATCH RFC] nvme: fix crash due to invalid cdev teardown Maurizio Lombardi
@ 2026-06-04 13:51 ` Maurizio Lombardi
2026-06-04 21:33 ` Keith Busch
1 sibling, 0 replies; 3+ messages in thread
From: Maurizio Lombardi @ 2026-06-04 13:51 UTC (permalink / raw)
To: Maurizio Lombardi, kbusch; +Cc: hch, linux-nvme, dwagner, mlombard
I marked it as RFC because I am not sure if not creating
the cdevs is an acceptable solution.
Thoughts?
Maurizio
On Thu Jun 4, 2026 at 3:43 PM CEST, Maurizio Lombardi wrote:
> In the NVMe multipath code, if nvme_add_ns_head_cdev() fails during
> nvme_mpath_set_live(), the error is completely ignored. However, during
> teardown, nvme_remove_head() unconditionally calls nvme_cdev_del().
> This teardown asymmetry leads to a kernel panic if the character device
> was never successfully initialized, which might happen in very rare
> circumstances.
>
> BUG: kernel NULL pointer dereference, address: 00000000000000d0
> device_del+0x39/0x3c0
> cdev_device_del+0x15/0x50
> nvme_cdev_del+0xe/0x20 [nvme_core]
> nvme_mpath_shutdown_disk+0x38/0x60 [nvme_core]
> nvme_ns_remove+0x177/0x1f0 [nvme_core]
> nvme_remove_namespaces+0xdc/0x130 [nvme_core]
> nvme_do_delete_ctrl+0x71/0xd0 [nvme_core]
>
> Regarding head namespaces, introduce the NVME_NSHEAD_CDEV_LIVE bit to
> track the successful creation of the head character device.
> nvme_remove_head() now checks this bit before attempting deletion.
> Enforcing rollback here is unsafe, as it triggers kobject_init panics
> on the embedded device struct during asynchronous ANA retries.
>
> Regarding non-head namespaces, enforce strict failure by capturing the
> return value of nvme_add_ns_cdev() and rolling back the namespace's
> creation. Because nvme_alloc_ns() completely frees the nvme_ns structure
> on error, rolling back is safe here.
>
> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
> ---
> drivers/nvme/host/core.c | 5 ++++-
> drivers/nvme/host/multipath.c | 5 ++++-
> drivers/nvme/host/nvme.h | 1 +
> 3 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 72c50d5e938d..1e24232c0aac 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4213,13 +4213,16 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, struct nvme_ns_info *info)
> goto out_cleanup_ns_from_list;
>
> if (!nvme_ns_head_multipath(ns->head))
> - nvme_add_ns_cdev(ns);
> + if (nvme_add_ns_cdev(ns))
> + goto out_del_disk;
>
> nvme_mpath_add_disk(ns, info->anagrpid);
> nvme_fault_inject_init(&ns->fault_inject, ns->disk->disk_name);
>
> return;
>
> + out_del_disk:
> + del_gendisk(ns->disk);
> out_cleanup_ns_from_list:
> nvme_put_ctrl(ctrl);
> mutex_lock(&ctrl->namespaces_lock);
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index 263161cb8ac0..a786eb18735a 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -628,6 +628,8 @@ static int nvme_add_ns_head_cdev(struct nvme_ns_head *head)
> return ret;
> ret = nvme_cdev_add(&head->cdev, &head->cdev_device,
> &nvme_ns_head_chr_fops, THIS_MODULE);
> + if (!ret)
> + set_bit(NVME_NSHEAD_CDEV_LIVE, &head->flags);
> return ret;
> }
>
> @@ -672,7 +674,8 @@ static void nvme_remove_head(struct nvme_ns_head *head)
> */
> kblockd_schedule_work(&head->requeue_work);
>
> - nvme_cdev_del(&head->cdev, &head->cdev_device);
> + if (test_and_clear_bit(NVME_NSHEAD_CDEV_LIVE, &head->flags))
> + nvme_cdev_del(&head->cdev, &head->cdev_device);
> synchronize_srcu(&head->srcu);
> del_gendisk(head->disk);
> }
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 9ccaed0b9dbf..e8833e56e9fd 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -567,6 +567,7 @@ struct nvme_ns_head {
> unsigned int delayed_removal_secs;
> #define NVME_NSHEAD_DISK_LIVE 0
> #define NVME_NSHEAD_QUEUE_IF_NO_PATH 1
> +#define NVME_NSHEAD_CDEV_LIVE 2
> struct nvme_ns __rcu *current_path[];
> #endif
> };
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [PATCH RFC] nvme: fix crash due to invalid cdev teardown
2026-06-04 13:43 [PATCH RFC] nvme: fix crash due to invalid cdev teardown Maurizio Lombardi
2026-06-04 13:51 ` Maurizio Lombardi
@ 2026-06-04 21:33 ` Keith Busch
1 sibling, 0 replies; 3+ messages in thread
From: Keith Busch @ 2026-06-04 21:33 UTC (permalink / raw)
To: Maurizio Lombardi; +Cc: hch, linux-nvme, dwagner, mlombard
On Thu, Jun 04, 2026 at 03:43:08PM +0200, Maurizio Lombardi wrote:
> In the NVMe multipath code, if nvme_add_ns_head_cdev() fails during
> nvme_mpath_set_live(), the error is completely ignored. However, during
> teardown, nvme_remove_head() unconditionally calls nvme_cdev_del().
> This teardown asymmetry leads to a kernel panic if the character device
> was never successfully initialized, which might happen in very rare
> circumstances.
Yeah, we need symmetry here, so no argument on that. Your patch is a bit
odd, though, in that you're treating multipath cdev failure differently
than non-multipath. I honestly hadn't considered this extremely unlikely
scenario. I guess we should proceed with just the gendisk in all cases,
but I know of several important uses that definitely need the generic
cdev handle here, so I think we need a clear error message that this
error event happened.
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2026-06-04 21:33 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-04 13:43 [PATCH RFC] nvme: fix crash due to invalid cdev teardown Maurizio Lombardi
2026-06-04 13:51 ` Maurizio Lombardi
2026-06-04 21:33 ` Keith Busch
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox