public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
* [bug report from blktests nvme/032] WARNING: possible circular locking dependency detected
@ 2022-06-28  4:03 yangx.jy
  2022-06-28  6:07 ` Greg Kroah-Hartman
  0 siblings, 1 reply; 8+ messages in thread
From: yangx.jy @ 2022-06-28  4:03 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Tejun Heo, Bjorn Helgaas, Jason Gunthorpe,
	Leon Romanovsky
  Cc: linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-block@vger.kernel.org

Hi everyone,

Running blktests nvme/032 on kernel v5.19-rc2+ which enables 
CONFIG_LOCKDEP and CONFIG_PROVE_LOCKING triggered the below WARNING.

It seems that the deadlock comes from fs/kernfs and driver/pci.
I am not familiar with the related code so I hope someone can give some 
advice.

[ 8245.501691] run blktests nvme/032 at 2022-06-28 11:02:59

[ 8253.061372] ======================================================
[ 8253.061372] WARNING: possible circular locking dependency detected
[ 8253.061372] 5.19.0-rc2+ #33 Not tainted
[ 8253.061372] ------------------------------------------------------
[ 8253.061372] check/1875 is trying to acquire lock:
[ 8253.061372] ffff91a2420277e8 (kn->active#282){++++}-{0:0}, at: 
kernfs_remove_by_name_ns+0x53/0x90
[ 8253.061372]
                but task is already holding lock:
[ 8253.061372] ffffffffb5a07ae8 (pci_rescan_remove_lock){+.+.}-{3:3}, 
at: pci_stop_and_remove_bus_device_locked+0xe/0x30
[ 8253.061372]
                which lock already depends on the new lock.

[ 8253.061372]
                the existing dependency chain (in reverse order) is:
[ 8253.061372]
                -> #1 (pci_rescan_remove_lock){+.+.}-{3:3}:
[ 8253.061372]        __mutex_lock+0x89/0xee0
[ 8253.061372]        dev_rescan_store+0x58/0x80
[ 8253.061372]        kernfs_fop_write_iter+0x140/0x1e0
[ 8253.061372]        new_sync_write+0x10c/0x190
[ 8253.061372]        vfs_write+0x268/0x370
[ 8253.061372]        ksys_write+0x65/0xe0
[ 8253.061372]        do_syscall_64+0x3b/0x90
[ 8253.061372]        entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 8253.061372]
                -> #0 (kn->active#282){++++}-{0:0}:
[ 8253.061372]        __lock_acquire+0x131d/0x2810
[ 8253.061372]        lock_acquire+0xd5/0x2f0
[ 8253.061372]        __kernfs_remove+0x281/0x2e0
[ 8253.061372]        kernfs_remove_by_name_ns+0x53/0x90
[ 8253.061372]        remove_files.isra.0+0x30/0x70
[ 8253.061372]        sysfs_remove_group+0x3d/0x80
[ 8253.061372]        sysfs_remove_groups+0x29/0x40
[ 8253.061372]        device_remove_attrs+0xb3/0xf0
[ 8253.061372]        device_del+0x1a2/0x410
[ 8253.061372]        pci_remove_bus_device+0x70/0x110
[ 8253.061372]        pci_stop_and_remove_bus_device_locked+0x1e/0x30
[ 8253.061372]        remove_store+0x75/0x90
[ 8253.061372]        kernfs_fop_write_iter+0x140/0x1e0
[ 8253.061372]        new_sync_write+0x10c/0x190
[ 8253.061372]        vfs_write+0x268/0x370
[ 8253.061372]        ksys_write+0x65/0xe0
[ 8253.061372]        do_syscall_64+0x3b/0x90
[ 8253.061372]        entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 8253.061372]
                other info that might help us debug this:

[ 8253.061372]  Possible unsafe locking scenario:

[ 8253.061372]        CPU0                    CPU1
[ 8253.061372]        ----                    ----
[ 8253.061372]   lock(pci_rescan_remove_lock);
[ 8253.061372]                                lock(kn->active#282);
[ 8253.061372]                                lock(pci_rescan_remove_lock);
[ 8253.061372]   lock(kn->active#282);
[ 8253.061372]
                 *** DEADLOCK ***

[ 8253.061372] 3 locks held by check/1875:
[ 8253.061372]  #0: ffff91a275567458 (sb_writers#4){.+.+}-{0:0}, at: 
ksys_write+0x65/0xe0
[ 8253.061372]  #1: ffff91a246939488 (&of->mutex){+.+.}-{3:3}, at: 
kernfs_fop_write_iter+0x10c/0x1e0
[ 8253.061372]  #2: ffffffffb5a07ae8 
(pci_rescan_remove_lock){+.+.}-{3:3}, at: 
pci_stop_and_remove_bus_device_locked+0xe/0x30
[ 8253.061372]
                stack backtrace:
[ 8253.061372] CPU: 1 PID: 1875 Comm: check Kdump: loaded Not tainted 
5.19.0-rc2+ #33
[ 8253.061372] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
rel-1.15.0-29-g6a62e0cb0dfe-prebuilt.qemu.org 04/01/2014
[ 8253.074562] Call Trace:
[ 8253.074562]  <TASK>
[ 8253.074562]  dump_stack_lvl+0x56/0x6f
[ 8253.074562]  check_noncircular+0xfe/0x110
[ 8253.074562]  __lock_acquire+0x131d/0x2810
[ 8253.074562]  lock_acquire+0xd5/0x2f0
[ 8253.074562]  ? kernfs_remove_by_name_ns+0x53/0x90
[ 8253.074562]  ? lock_release+0x144/0x2c0
[ 8253.074562]  ? lock_is_held_type+0xe2/0x140
[ 8253.074562]  __kernfs_remove+0x281/0x2e0
[ 8253.074562]  ? kernfs_remove_by_name_ns+0x53/0x90
[ 8253.074562]  ? kernfs_name_hash+0x12/0x80
[ 8253.074562]  kernfs_remove_by_name_ns+0x53/0x90
[ 8253.074562]  remove_files.isra.0+0x30/0x70
[ 8253.074562]  sysfs_remove_group+0x3d/0x80
[ 8253.074562]  sysfs_remove_groups+0x29/0x40
[ 8253.074562]  device_remove_attrs+0xb3/0xf0
[ 8253.074562]  ? kernfs_remove_by_name_ns+0x5b/0x90
[ 8253.074562]  device_del+0x1a2/0x410
[ 8253.074562]  pci_remove_bus_device+0x70/0x110
[ 8253.074562]  pci_stop_and_remove_bus_device_locked+0x1e/0x30
[ 8253.074562]  remove_store+0x75/0x90
[ 8253.074562]  kernfs_fop_write_iter+0x140/0x1e0
[ 8253.074562]  new_sync_write+0x10c/0x190
[ 8253.074562]  vfs_write+0x268/0x370
[ 8253.074562]  ksys_write+0x65/0xe0
[ 8253.074562]  do_syscall_64+0x3b/0x90
[ 8253.074562]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 8253.074562] RIP: 0033:0x7f0db99004a7
[ 8253.074562] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 
00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 
05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 8253.074562] RSP: 002b:00007ffd07c32428 EFLAGS: 00000246 ORIG_RAX: 
0000000000000001
[ 8253.074562] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 
00007f0db99004a7
[ 8253.074562] RDX: 0000000000000002 RSI: 0000561722b20200 RDI: 
0000000000000001
[ 8253.074562] RBP: 0000561722b20200 R08: 000000000000000a R09: 
0000000000000001
[ 8253.074562] R10: 0000561722c050a0 R11: 0000000000000246 R12: 
0000000000000002
[ 8253.074562] R13: 00007f0db99d1500 R14: 0000000000000002 R15: 
00007f0db99d1700
[ 8253.074562]  </TASK>
[ 8253.289448] pci 0000:00:03.0: [8086:5845] type 00 class 0x010802
[ 8253.299765] pci 0000:00:03.0: reg 0x10: [mem 0xfebd0000-0xfebd3fff 64bit]
[ 8253.418780] pci 0000:00:03.0: BAR 0: assigned [mem 
0x640000000-0x640003fff 64bit]
[ 8253.481841] nvme nvme0: pci function 0000:00:03.0
[ 8253.660699] nvme nvme0: 4/0/0 default/read/poll queues
[ 8253.681913] nvme nvme0: Ignoring bogus Namespace Identifiers

Best Regards,
Xiao Yang

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [bug report from blktests nvme/032] WARNING: possible circular locking dependency detected
  2022-06-28  4:03 [bug report from blktests nvme/032] WARNING: possible circular locking dependency detected yangx.jy
@ 2022-06-28  6:07 ` Greg Kroah-Hartman
  2022-06-28  6:09   ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-28  6:07 UTC (permalink / raw)
  To: yangx.jy@fujitsu.com
  Cc: Tejun Heo, Bjorn Helgaas, Jason Gunthorpe, Leon Romanovsky,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-block@vger.kernel.org

On Tue, Jun 28, 2022 at 04:03:55AM +0000, yangx.jy@fujitsu.com wrote:
> Hi everyone,
> 
> Running blktests nvme/032 on kernel v5.19-rc2+ which enables 
> CONFIG_LOCKDEP and CONFIG_PROVE_LOCKING triggered the below WARNING.

What was happening when this trace was created?  It looks like you were
manually removing a PCI device from the system through sysfs?  Why would
blktests do that?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [bug report from blktests nvme/032] WARNING: possible circular locking dependency detected
  2022-06-28  6:07 ` Greg Kroah-Hartman
@ 2022-06-28  6:09   ` Christoph Hellwig
  2022-06-28  6:12     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2022-06-28  6:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: yangx.jy@fujitsu.com, Tejun Heo, Bjorn Helgaas, Jason Gunthorpe,
	Leon Romanovsky, linux-kernel@vger.kernel.org,
	linux-rdma@vger.kernel.org, linux-block@vger.kernel.org

On Tue, Jun 28, 2022 at 08:07:52AM +0200, Greg Kroah-Hartman wrote:
> On Tue, Jun 28, 2022 at 04:03:55AM +0000, yangx.jy@fujitsu.com wrote:
> > Hi everyone,
> > 
> > Running blktests nvme/032 on kernel v5.19-rc2+ which enables 
> > CONFIG_LOCKDEP and CONFIG_PROVE_LOCKING triggered the below WARNING.
> 
> What was happening when this trace was created?  It looks like you were
> manually removing a PCI device from the system through sysfs?  Why would
> blktests do that?

To test how the block layer copes with the fact that the underlying
device goes away.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [bug report from blktests nvme/032] WARNING: possible circular locking dependency detected
  2022-06-28  6:09   ` Christoph Hellwig
@ 2022-06-28  6:12     ` Greg Kroah-Hartman
  2022-06-28  6:18       ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-28  6:12 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: yangx.jy@fujitsu.com, Tejun Heo, Bjorn Helgaas, Jason Gunthorpe,
	Leon Romanovsky, linux-kernel@vger.kernel.org,
	linux-rdma@vger.kernel.org, linux-block@vger.kernel.org

On Mon, Jun 27, 2022 at 11:09:57PM -0700, Christoph Hellwig wrote:
> On Tue, Jun 28, 2022 at 08:07:52AM +0200, Greg Kroah-Hartman wrote:
> > On Tue, Jun 28, 2022 at 04:03:55AM +0000, yangx.jy@fujitsu.com wrote:
> > > Hi everyone,
> > > 
> > > Running blktests nvme/032 on kernel v5.19-rc2+ which enables 
> > > CONFIG_LOCKDEP and CONFIG_PROVE_LOCKING triggered the below WARNING.
> > 
> > What was happening when this trace was created?  It looks like you were
> > manually removing a PCI device from the system through sysfs?  Why would
> > blktests do that?
> 
> To test how the block layer copes with the fact that the underlying
> device goes away.

Ah, so it's a fake PCI device, or is it a real one?

Anyway, it's a valid thing to do, I'll look at the report in a bit...

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [bug report from blktests nvme/032] WARNING: possible circular locking dependency detected
  2022-06-28  6:12     ` Greg Kroah-Hartman
@ 2022-06-28  6:18       ` Christoph Hellwig
  2022-07-05  0:48         ` Shinichiro Kawasaki
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2022-06-28  6:18 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Christoph Hellwig, yangx.jy@fujitsu.com, Tejun Heo, Bjorn Helgaas,
	Jason Gunthorpe, Leon Romanovsky, linux-kernel@vger.kernel.org,
	linux-rdma@vger.kernel.org, linux-block@vger.kernel.org

On Tue, Jun 28, 2022 at 08:12:51AM +0200, Greg Kroah-Hartman wrote:
> Ah, so it's a fake PCI device, or is it a real one?

Whatever the blktests configuration points to.  But even if it is
"fake" that fake would come from the hypervisor.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [bug report from blktests nvme/032] WARNING: possible circular locking dependency detected
  2022-06-28  6:18       ` Christoph Hellwig
@ 2022-07-05  0:48         ` Shinichiro Kawasaki
  2022-07-19  2:05           ` yangx.jy
  0 siblings, 1 reply; 8+ messages in thread
From: Shinichiro Kawasaki @ 2022-07-05  0:48 UTC (permalink / raw)
  To: hch@infradead.org
  Cc: Greg Kroah-Hartman, yangx.jy@fujitsu.com, Tejun Heo,
	Bjorn Helgaas, Jason Gunthorpe, Leon Romanovsky,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-block@vger.kernel.org

On Jun 27, 2022 / 23:18, Christoph Hellwig wrote:
> On Tue, Jun 28, 2022 at 08:12:51AM +0200, Greg Kroah-Hartman wrote:
> > Ah, so it's a fake PCI device, or is it a real one?
> 
> Whatever the blktests configuration points to.  But even if it is
> "fake" that fake would come from the hypervisor.

FYI, the WARN was observed with real PCI NVMe device also. I observed the WARN
with v5.19-rc1 and still observe with v5.19-rc5. I'm not sure which version
introduced the WARN.

I once reported this failure, and shared with linux-pci list.

https://lore.kernel.org/linux-block/20220614010907.bvbrgbz7nnvpnw5w@shindev/

-- 
Shin'ichiro Kawasaki

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [bug report from blktests nvme/032] WARNING: possible circular locking dependency detected
  2022-07-05  0:48         ` Shinichiro Kawasaki
@ 2022-07-19  2:05           ` yangx.jy
  2022-07-19  4:42             ` Shinichiro Kawasaki
  0 siblings, 1 reply; 8+ messages in thread
From: yangx.jy @ 2022-07-19  2:05 UTC (permalink / raw)
  To: Shinichiro Kawasaki, hch@infradead.org
  Cc: Greg Kroah-Hartman, Tejun Heo, Bjorn Helgaas, Jason Gunthorpe,
	Leon Romanovsky, linux-kernel@vger.kernel.org,
	linux-rdma@vger.kernel.org, linux-block@vger.kernel.org

On 2022/7/5 8:48, Shinichiro Kawasaki wrote:
> On Jun 27, 2022 / 23:18, Christoph Hellwig wrote:
>> On Tue, Jun 28, 2022 at 08:12:51AM +0200, Greg Kroah-Hartman wrote:
>>> Ah, so it's a fake PCI device, or is it a real one?
>>
>> Whatever the blktests configuration points to.  But even if it is
>> "fake" that fake would come from the hypervisor.
> 
> FYI, the WARN was observed with real PCI NVMe device also. I observed the WARN
> with v5.19-rc1 and still observe with v5.19-rc5. I'm not sure which version
> introduced the WARN.
> 
> I once reported this failure, and shared with linux-pci list.
> 
> https://lore.kernel.org/linux-block/20220614010907.bvbrgbz7nnvpnw5w@shindev/
> 

Hi Shinichiro,

I wonder if the failure has been fixed? If so, could you tell me the fix 
patch?

Best Regards,
Xiao Yang

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [bug report from blktests nvme/032] WARNING: possible circular locking dependency detected
  2022-07-19  2:05           ` yangx.jy
@ 2022-07-19  4:42             ` Shinichiro Kawasaki
  0 siblings, 0 replies; 8+ messages in thread
From: Shinichiro Kawasaki @ 2022-07-19  4:42 UTC (permalink / raw)
  To: yangx.jy@fujitsu.com
  Cc: hch@infradead.org, Greg Kroah-Hartman, Tejun Heo, Bjorn Helgaas,
	Jason Gunthorpe, Leon Romanovsky, linux-kernel@vger.kernel.org,
	linux-rdma@vger.kernel.org, linux-block@vger.kernel.org

On Jul 19, 2022 / 02:05, yangx.jy@fujitsu.com wrote:
> On 2022/7/5 8:48, Shinichiro Kawasaki wrote:
> > On Jun 27, 2022 / 23:18, Christoph Hellwig wrote:
> >> On Tue, Jun 28, 2022 at 08:12:51AM +0200, Greg Kroah-Hartman wrote:
> >>> Ah, so it's a fake PCI device, or is it a real one?
> >>
> >> Whatever the blktests configuration points to.  But even if it is
> >> "fake" that fake would come from the hypervisor.
> > 
> > FYI, the WARN was observed with real PCI NVMe device also. I observed the WARN
> > with v5.19-rc1 and still observe with v5.19-rc5. I'm not sure which version
> > introduced the WARN.
> > 
> > I once reported this failure, and shared with linux-pci list.
> > 
> > https://lore.kernel.org/linux-block/20220614010907.bvbrgbz7nnvpnw5w@shindev/
> > 
> 
> Hi Shinichiro,
> 
> I wonder if the failure has been fixed? If so, could you tell me the fix 
> patch?

No, I'm not aware of fix for the failure yet. At v5.19-rc6, I still observe the
failure.

-- 
Shin'ichiro Kawasaki

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-07-19  5:06 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-06-28  4:03 [bug report from blktests nvme/032] WARNING: possible circular locking dependency detected yangx.jy
2022-06-28  6:07 ` Greg Kroah-Hartman
2022-06-28  6:09   ` Christoph Hellwig
2022-06-28  6:12     ` Greg Kroah-Hartman
2022-06-28  6:18       ` Christoph Hellwig
2022-07-05  0:48         ` Shinichiro Kawasaki
2022-07-19  2:05           ` yangx.jy
2022-07-19  4:42             ` Shinichiro Kawasaki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox