* [PATCH] nvme-multipath: fix lockdep WARN due to partition scan work
From: Shin'ichiro Kawasaki @ 2025-10-23 0:19 UTC (permalink / raw)
To: linux-nvme, Keith Busch
Cc: Christoph Hellwig, Sagi Grimberg, Yi Zhang,
Shin'ichiro Kawasaki
Blktests test cases nvme/014, 057 and 058 fail occasionally due to a
lockdep WARN. As reported in the Closes tag URL, the WARN indicates that
a deadlock can happen due to the dependency among disk->open_mutex,
kblockd workqueue completion and partition_scan_work completion.
To avoid the lockdep WARN and the potential deadlock, cut the dependency
by running the partition_scan_work not by kblockd workqueue but by
nvme_wq.
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Closes: https://lore.kernel.org/linux-block/CAHj4cs8mJ+R_GmQm9R8ebResKAWUE8kF5+_WVg0v8zndmqd6BQ@mail.gmail.com/
Link: https://lore.kernel.org/linux-block/oeyzci6ffshpukpfqgztsdeke5ost5hzsuz4rrsjfmvpqcevax@5nhnwbkzbrpa/
Fixes: 1f021341eef4 ("nvme-multipath: defer partition scanning")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
drivers/nvme/host/multipath.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 543e17aead12..e35eccacee8c 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -793,7 +793,7 @@ static void nvme_mpath_set_live(struct nvme_ns *ns)
 			return;
 		}
 		nvme_add_ns_head_cdev(head);
-		kblockd_schedule_work(&head->partition_scan_work);
+		queue_work(nvme_wq, &head->partition_scan_work);
 	}
 
 	nvme_mpath_add_sysfs_link(ns->head);
--
2.51.0
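
Editorial illustration, not part of the patch: the dependency chain named in
the commit message can be modeled as a small wait-for graph, which is roughly
how lockdep reasons about it. The sketch below is plain userspace C, not
kernel code. The node names, the edges, and the helpers build_graph() and
has_cycle() are hypothetical and chosen only for illustration. In particular,
the edges "a path flushes kblockd while holding open_mutex" and "the partition
scan takes open_mutex" are assumptions based on the description above, not a
transcription of the actual lockdep report.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Nodes of the illustrative wait-for graph (names are hypothetical). */
enum dep_node {
	OPEN_MUTEX,          /* disk->open_mutex */
	KBLOCKD_WQ,          /* completion of the kblockd workqueue */
	NVME_WQ,             /* completion of nvme_wq */
	PARTITION_SCAN_WORK, /* completion of head->partition_scan_work */
	NR_NODES,
};

struct dep_graph {
	bool edge[NR_NODES][NR_NODES]; /* edge[a][b]: a waits for b */
};

static void add_edge(struct dep_graph *g, enum dep_node from, enum dep_node to)
{
	assert(from < NR_NODES && to < NR_NODES);
	g->edge[from][to] = true;
}

/* Depth-first search; state: 0 = unvisited, 1 = on stack, 2 = done. */
static bool dfs(const struct dep_graph *g, int node, int *state)
{
	state[node] = 1;
	for (int next = 0; next < NR_NODES; next++) {
		if (!g->edge[node][next])
			continue;
		if (state[next] == 1)
			return true; /* back edge: dependency cycle */
		if (state[next] == 0 && dfs(g, next, state))
			return true;
	}
	state[node] = 2;
	return false;
}

/* Returns true if the wait-for graph contains a cycle. */
static bool has_cycle(const struct dep_graph *g)
{
	int state[NR_NODES] = { 0 };

	for (int n = 0; n < NR_NODES; n++)
		if (state[n] == 0 && dfs(g, n, state))
			return true;
	return false;
}

/* Build the graph with partition_scan_work queued on scan_wq. */
static void build_graph(struct dep_graph *g, enum dep_node scan_wq)
{
	memset(g, 0, sizeof(*g));
	/* Assumed: some path flushes kblockd while holding open_mutex. */
	add_edge(g, OPEN_MUTEX, KBLOCKD_WQ);
	/* A workqueue completes only after the work queued on it completes. */
	add_edge(g, scan_wq, PARTITION_SCAN_WORK);
	/* Assumed: the partition scan takes open_mutex. */
	add_edge(g, PARTITION_SCAN_WORK, OPEN_MUTEX);
}
```

Under these assumptions, build_graph(&g, KBLOCKD_WQ) produces a graph for
which has_cycle() reports a cycle, while build_graph(&g, NVME_WQ) does not.
The second result holds only as long as nothing waits for nvme_wq while
holding open_mutex, which this model does not check.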
* Re: [PATCH] nvme-multipath: fix lockdep WARN due to partition scan work
From: Christoph Hellwig @ 2025-10-23 5:54 UTC (permalink / raw)
To: Shin'ichiro Kawasaki
Cc: linux-nvme, Keith Busch, Christoph Hellwig, Sagi Grimberg,
Yi Zhang
On Thu, Oct 23, 2025 at 09:19:37AM +0900, Shin'ichiro Kawasaki wrote:
> Blktests test cases nvme/014, 057 and 058 fail occasionally due to a
> lockdep WARN. As reported in the Closes tag URL, the WARN indicates that
> a deadlock can happen due to the dependency among disk->open_mutex,
> kblockd workqueue completion and partition_scan_work completion.
>
> To avoid the lockdep WARN and the potential deadlock, cut the dependency
> by running the partition_scan_work not by kblockd workqueue but by
> nvme_wq.
The partition_scan_work was added in 1f021341eef4 ("nvme-multipath:
defer partition scanning") to get it out of the scan work to avoid
deadlocks. I suspect moving it to the same workqueue might reintroduce
the deadlocks, so we might have to add a new workqueue here. Keith
might remember more.
* Re: [PATCH] nvme-multipath: fix lockdep WARN due to partition scan work
From: Keith Busch @ 2025-10-23 14:28 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Shin'ichiro Kawasaki, linux-nvme, Sagi Grimberg, Yi Zhang
On Wed, Oct 22, 2025 at 10:54:54PM -0700, Christoph Hellwig wrote:
> On Thu, Oct 23, 2025 at 09:19:37AM +0900, Shin'ichiro Kawasaki wrote:
> > Blktests test cases nvme/014, 057 and 058 fail occasionally due to a
> > lockdep WARN. As reported in the Closes tag URL, the WARN indicates that
> > a deadlock can happen due to the dependency among disk->open_mutex,
> > kblockd workqueue completion and partition_scan_work completion.
> >
> > To avoid the lockdep WARN and the potential deadlock, cut the dependency
> > by running the partition_scan_work not by kblockd workqueue but by
> > nvme_wq.
>
> The partition_scan_work was added in 1f021341eef4 ("nvme-multipath:
> defer partition scanning") to get it out of the scan work to avoid
> deadlocks. I suspect moving it to the same workqueue might reintroduce
> the deadlocks, so we might have to add a new workqueue here. Keith
> might remember more.
I don't think it was a problem of the same workqueue, but because
partition scanning happened in the same work_struct as namespace
scanning. As long as the two happen in different work items, I believe
they can use the same workqueue and be fine.
* Re: [PATCH] nvme-multipath: fix lockdep WARN due to partition scan work
From: Shinichiro Kawasaki @ 2025-10-28 7:55 UTC (permalink / raw)
To: Keith Busch
Cc: hch@infradead.org, linux-nvme@lists.infradead.org, Sagi Grimberg,
Yi Zhang
On Oct 23, 2025 / 08:28, Keith Busch wrote:
> On Wed, Oct 22, 2025 at 10:54:54PM -0700, Christoph Hellwig wrote:
> > On Thu, Oct 23, 2025 at 09:19:37AM +0900, Shin'ichiro Kawasaki wrote:
> > > Blktests test cases nvme/014, 057 and 058 fail occasionally due to a
> > > lockdep WARN. As reported in the Closes tag URL, the WARN indicates that
> > > a deadlock can happen due to the dependency among disk->open_mutex,
> > > kblockd workqueue completion and partition_scan_work completion.
> > >
> > > To avoid the lockdep WARN and the potential deadlock, cut the dependency
> > > by running the partition_scan_work not by kblockd workqueue but by
> > > nvme_wq.
> >
> > The partition_scan_work was added in 1f021341eef4 ("nvme-multipath:
> > defer partition scanning") to get it out of the scan work to avoid
> > deadlocks. I suspect moving it to the same workqueue might reintroduce
> > the deadlocks, so we might have to add a new workqueue here. Keith
> > might remember more.
>
> I don't think it was a problem of the same workqueue, but because
> partition scanning happened in the same work_struct as namespace
> scanning. As long as the two happen in different work items, I believe
> they can use the same workqueue and be fine.
Keith, thank you for the comment.
Christoph, do you think the comment by Keith is enough to apply the patch?
If you still have concerns, I will rework the patch to add a new workqueue as
you suggested.
* Re: [PATCH] nvme-multipath: fix lockdep WARN due to partition scan work
From: hch @ 2025-10-29 7:54 UTC (permalink / raw)
To: Shinichiro Kawasaki
Cc: Keith Busch, hch@infradead.org, linux-nvme@lists.infradead.org,
Sagi Grimberg, Yi Zhang
Yes, the patch looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH] nvme-multipath: fix lockdep WARN due to partition scan work
From: Hannes Reinecke @ 2025-11-05 7:32 UTC (permalink / raw)
To: Shin'ichiro Kawasaki, linux-nvme, Keith Busch
Cc: Christoph Hellwig, Sagi Grimberg, Yi Zhang
On 10/23/25 02:19, Shin'ichiro Kawasaki wrote:
> Blktests test cases nvme/014, 057 and 058 fail occasionally due to a
> lockdep WARN. As reported in the Closes tag URL, the WARN indicates that
> a deadlock can happen due to the dependency among disk->open_mutex,
> kblockd workqueue completion and partition_scan_work completion.
>
> To avoid the lockdep WARN and the potential deadlock, cut the dependency
> by running the partition_scan_work not by kblockd workqueue but by
> nvme_wq.
>
> Reported-by: Yi Zhang <yi.zhang@redhat.com>
> Closes: https://lore.kernel.org/linux-block/CAHj4cs8mJ+R_GmQm9R8ebResKAWUE8kF5+_WVg0v8zndmqd6BQ@mail.gmail.com/
> Link: https://lore.kernel.org/linux-block/oeyzci6ffshpukpfqgztsdeke5ost5hzsuz4rrsjfmvpqcevax@5nhnwbkzbrpa/
> Fixes: 1f021341eef4 ("nvme-multipath: defer partition scanning")
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> ---
> drivers/nvme/host/multipath.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index 543e17aead12..e35eccacee8c 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -793,7 +793,7 @@ static void nvme_mpath_set_live(struct nvme_ns *ns)
>  			return;
>  		}
>  		nvme_add_ns_head_cdev(head);
> -		kblockd_schedule_work(&head->partition_scan_work);
> +		queue_work(nvme_wq, &head->partition_scan_work);
>  	}
>
>  	nvme_mpath_add_sysfs_link(ns->head);
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich