* Re: [PATCH] block: model freeze & enter queue as rwsem for supporting lockdep
  [not found] <20241018013542.3013963-1-ming.lei@redhat.com>
From: Christoph Hellwig @ 2024-10-22 6:18 UTC
To: Ming Lei
Cc: Jens Axboe, linux-block, Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long, Boqun Feng, linux-kernel

On Fri, Oct 18, 2024 at 09:35:42AM +0800, Ming Lei wrote:
> Recently we got several deadlock reports[1][2][3] caused by blk_mq_freeze_queue
> and blk_enter_queue().
>
> Turns out the two are just like one rwsem, so model them as rwsem for
> supporting lockdep:
>
> 1) model blk_mq_freeze_queue() as down_write_trylock()
> - it is an exclusive lock, so dependency with blk_enter_queue() is covered
> - it is a trylock because blk_mq_freeze_queue() is allowed to run concurrently

Is this using the right terminology?  down_write and other locking
primitives obviously can run concurrently, the whole point is to
synchronize the code run inside the critical section.

I think what you mean here is blk_mq_freeze_queue can be called more
than once due to a global recursion counter.

Not sure modelling it as a trylock is the right approach here,
I've added the lockdep maintainers in case they have an idea.

>
> 2) model blk_enter_queue() as down_read()
> - it is a shared lock, so concurrent blk_enter_queue() calls are allowed
> - it is a read lock, so dependency with blk_mq_freeze_queue() is modeled
> - blk_queue_exit() is often called from other contexts (such as irq), and
>   it can't be annotated as rwsem_release(), so simply do it in
>   blk_enter_queue(), this way still covers as many cases as possible
>
> NVMe is the only subsystem which may call blk_mq_freeze_queue() and
> blk_mq_unfreeze_queue() from different contexts, so it is the only
> exception for the modeling. Add one tagset flag to exclude it from
> the lockdep support.

rwsems have a non_owner variant for these kinds of use cases,
we should do the same for blk_mq_freeze_queue to annotate the callsite
instead of a global flag.
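The modeling described in the quoted patch can be pictured with a toy userspace order checker. This is only a sketch under stated assumptions, not the patch and not lockdep: every name here (ord_checker, ord_acquire, ord_freeze, ord_enter, ORD_Q, ORD_MUTEX) is hypothetical, the checker is single-threaded, and it only notices direct two-lock inversions. Freeze holds the "exclusive side" until unfreeze, while enter reports acquire and release back to back, mirroring the point that blk_queue_exit() may run in another context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes: the queue "rwsem" and one ordinary mutex. */
enum { ORD_Q = 0, ORD_MUTEX = 1, ORD_NCLASS = 2 };

struct ord_checker {
    bool held[ORD_NCLASS];               /* classes currently held */
    bool before[ORD_NCLASS][ORD_NCLASS]; /* before[a][b]: a was held when b was taken */
    int inversions;                      /* direct A->B vs B->A conflicts seen */
};

/* Record "held -> cls" edges and count a conflict if the reverse edge exists. */
static void ord_acquire(struct ord_checker *c, int cls)
{
    for (int h = 0; h < ORD_NCLASS; h++) {
        if (!c->held[h] || h == cls)
            continue;
        if (c->before[cls][h])
            c->inversions++;
        c->before[h][cls] = true;
    }
    c->held[cls] = true;
}

static void ord_release(struct ord_checker *c, int cls)
{
    c->held[cls] = false;
}

/* Freeze: the exclusive side stays held until unfreeze. */
static void ord_freeze(struct ord_checker *c)   { ord_acquire(c, ORD_Q); }
static void ord_unfreeze(struct ord_checker *c) { ord_release(c, ORD_Q); }

/* Enter: acquire and release are reported together, so only the locks
 * already held at enter time produce dependencies on the queue. */
static void ord_enter(struct ord_checker *c)
{
    ord_acquire(c, ORD_Q);
    ord_release(c, ORD_Q);
}
```

With this model, taking ORD_MUTEX while frozen records Q -> MUTEX, and a later ord_enter() with ORD_MUTEX held records MUTEX -> Q and bumps the inversion count, which is the class of report the annotation is meant to surface.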
* Re: [PATCH] block: model freeze & enter queue as rwsem for supporting lockdep
From: Peter Zijlstra @ 2024-10-22 7:19 UTC
To: Christoph Hellwig
Cc: Ming Lei, Jens Axboe, linux-block, Ingo Molnar, Will Deacon, Waiman Long, Boqun Feng, linux-kernel

On Tue, Oct 22, 2024 at 08:18:05AM +0200, Christoph Hellwig wrote:
> On Fri, Oct 18, 2024 at 09:35:42AM +0800, Ming Lei wrote:
> > Recently we got several deadlock reports[1][2][3] caused by blk_mq_freeze_queue
> > and blk_enter_queue().
> >
> > Turns out the two are just like one rwsem, so model them as rwsem for
> > supporting lockdep:
> >
> > 1) model blk_mq_freeze_queue() as down_write_trylock()
> > - it is an exclusive lock, so dependency with blk_enter_queue() is covered
> > - it is a trylock because blk_mq_freeze_queue() is allowed to run concurrently
>
> Is this using the right terminology?  down_write and other locking
> primitives obviously can run concurrently, the whole point is to
> synchronize the code run inside the critical section.
>
> I think what you mean here is blk_mq_freeze_queue can be called more
> than once due to a global recursion counter.
>
> Not sure modelling it as a trylock is the right approach here,
> I've added the lockdep maintainers in case they have an idea.

So lockdep supports recursive reader state, but you're looking at
recursive exclusive state?

If you achieve this using an external nest count, then it is probably (I
haven't yet had morning juice) sufficient to use the regular exclusive
state on the outermost lock / unlock pair and simply ignore the inner
locks.
* Re: [PATCH] block: model freeze & enter queue as rwsem for supporting lockdep
From: Christoph Hellwig @ 2024-10-22 7:21 UTC
To: Peter Zijlstra
Cc: Christoph Hellwig, Ming Lei, Jens Axboe, linux-block, Ingo Molnar, Will Deacon, Waiman Long, Boqun Feng, linux-kernel

On Tue, Oct 22, 2024 at 09:19:05AM +0200, Peter Zijlstra wrote:
> > Not sure modelling it as a trylock is the right approach here,
> > I've added the lockdep maintainers in case they have an idea.
>
> So lockdep supports recursive reader state, but you're looking at
> recursive exclusive state?

Yes.

> If you achieve this using an external nest count, then it is probably (I
> haven't yet had morning juice) sufficient to use the regular exclusive
> state on the outermost lock / unlock pair and simply ignore the inner
> locks.

There obviously is an external nest count here, so I guess that should
do the job.  Ming?
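The "annotate only the outermost pair" idea discussed above can be sketched in plain userspace C. This is a minimal illustration, assuming a single-threaded caller and a per-queue depth counter; nest_queue, nest_lockdep, nest_freeze and nest_unfreeze are hypothetical names, and the counters merely stand in for where an exclusive acquire or release would be reported to lockdep.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for lockdep: it only counts what it was told. */
struct nest_lockdep {
    int acquires; /* exclusive acquisitions reported */
    int releases; /* exclusive releases reported */
};

/* Hypothetical queue carrying an external freeze nest count. */
struct nest_queue {
    int freeze_depth;        /* outstanding freeze calls */
    struct nest_lockdep dep; /* what the checker has seen */
};

/* Returns true when this call is the outermost freeze (depth 0 -> 1). */
static bool nest_freeze(struct nest_queue *q)
{
    if (q->freeze_depth++ == 0) {
        q->dep.acquires++; /* report the exclusive acquire exactly once */
        return true;
    }
    return false;          /* inner freeze: ignored by the checker */
}

/* Returns true when this call drops the last freeze (depth 1 -> 0). */
static bool nest_unfreeze(struct nest_queue *q)
{
    assert(q->freeze_depth > 0); /* unbalanced unfreeze is a caller bug */
    if (--q->freeze_depth == 0) {
        q->dep.releases++;       /* report the matching release */
        return true;
    }
    return false;
}
```

Two nested nest_freeze() calls followed by two nest_unfreeze() calls report one acquire and one release in total, so the checker never sees recursive exclusive state. A real implementation would also need to protect the depth counter against concurrent callers, which this sketch leaves out.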
* Re: [PATCH] block: model freeze & enter queue as rwsem for supporting lockdep
From: Ming Lei @ 2024-10-23 3:22 UTC
To: Christoph Hellwig
Cc: Jens Axboe, linux-block, Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long, Boqun Feng, linux-kernel

On Tue, Oct 22, 2024 at 08:18:05AM +0200, Christoph Hellwig wrote:
> On Fri, Oct 18, 2024 at 09:35:42AM +0800, Ming Lei wrote:
> > Recently we got several deadlock reports[1][2][3] caused by blk_mq_freeze_queue
> > and blk_enter_queue().
> >
> > Turns out the two are just like one rwsem, so model them as rwsem for
> > supporting lockdep:
> >
> > 1) model blk_mq_freeze_queue() as down_write_trylock()
> > - it is an exclusive lock, so dependency with blk_enter_queue() is covered
> > - it is a trylock because blk_mq_freeze_queue() is allowed to run concurrently
>
> Is this using the right terminology?  down_write and other locking
> primitives obviously can run concurrently, the whole point is to
> synchronize the code run inside the critical section.
>
> I think what you mean here is blk_mq_freeze_queue can be called more
> than once due to a global recursion counter.
>
> Not sure modelling it as a trylock is the right approach here,
> I've added the lockdep maintainers in case they have an idea.

Yeah, looks like we can just call lock_acquire() for the outermost
freeze/unfreeze.

>
> >
> > 2) model blk_enter_queue() as down_read()
> > - it is a shared lock, so concurrent blk_enter_queue() calls are allowed
> > - it is a read lock, so dependency with blk_mq_freeze_queue() is modeled
> > - blk_queue_exit() is often called from other contexts (such as irq), and
> >   it can't be annotated as rwsem_release(), so simply do it in
> >   blk_enter_queue(), this way still covers as many cases as possible
> >
> > NVMe is the only subsystem which may call blk_mq_freeze_queue() and
> > blk_mq_unfreeze_queue() from different contexts, so it is the only
> > exception for the modeling. Add one tagset flag to exclude it from
> > the lockdep support.
>
> rwsems have a non_owner variant for these kinds of use cases,
> we should do the same for blk_mq_freeze_queue to annotate the callsite
> instead of a global flag.

Here it isn't a real rwsem, and lockdep doesn't have a non_owner variant
for rwsem_acquire() and rwsem_release().

Another corner case is blk_mark_disk_dead(), in which freeze & unfreeze
may be run from different task contexts too.

thanks,
Ming
* Re: [PATCH] block: model freeze & enter queue as rwsem for supporting lockdep
From: Christoph Hellwig @ 2024-10-23 6:07 UTC
To: Ming Lei
Cc: Christoph Hellwig, Jens Axboe, linux-block, Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long, Boqun Feng, linux-kernel

On Wed, Oct 23, 2024 at 11:22:55AM +0800, Ming Lei wrote:
> > > 2) model blk_enter_queue() as down_read()
> > > - it is a shared lock, so concurrent blk_enter_queue() calls are allowed
> > > - it is a read lock, so dependency with blk_mq_freeze_queue() is modeled
> > > - blk_queue_exit() is often called from other contexts (such as irq), and
> > >   it can't be annotated as rwsem_release(), so simply do it in
> > >   blk_enter_queue(), this way still covers as many cases as possible
> > >
> > > NVMe is the only subsystem which may call blk_mq_freeze_queue() and
> > > blk_mq_unfreeze_queue() from different contexts, so it is the only
> > > exception for the modeling. Add one tagset flag to exclude it from
> > > the lockdep support.
> >
> > rwsems have a non_owner variant for these kinds of use cases,
> > we should do the same for blk_mq_freeze_queue to annotate the callsite
> > instead of a global flag.
>
> Here it isn't a real rwsem, and lockdep doesn't have a non_owner variant
> for rwsem_acquire() and rwsem_release().

Hmm, it looks like down_read_non_owner completely skips lockdep,
which seems rather problematic.  Sure, we can't really track an owner,
but having it take part in the lock chain would be extremely useful.
Whatever we're using there should work for the freeze protection.

> Another corner case is blk_mark_disk_dead(), in which freeze & unfreeze
> may be run from different task contexts too.

Yes, this is a pretty questionable one though, as we should be able to
unfreeze as soon as the dying bit is set.  Separate discussion, though.

Either way the non-ownership should be per call and not a queue or
tagset flag.
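The difference between a per-call non-owner annotation and a queue-wide or tagset-wide flag can be illustrated with a small userspace sketch. Everything below is hypothetical (own_queue, own_freeze, own_freeze_non_owner, own_unfreeze, integer task ids) and does not correspond to an existing block-layer or lockdep interface; it only assumes that ownership checking is decided by which freeze variant a call site picks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical queue state; "task" ids are plain integers in this sketch. */
struct own_queue {
    bool frozen;
    bool owner_tracked; /* decided by each freeze call, not by the queue */
    int owner;          /* task that froze the queue, if tracked */
    int violations;     /* unfreeze calls made by a task other than the owner */
};

/* Ordinary freeze: the freezing task is expected to unfreeze. */
static void own_freeze(struct own_queue *q, int task)
{
    assert(!q->frozen);
    q->frozen = true;
    q->owner_tracked = true;
    q->owner = task;
}

/* Call-site opt-out: this particular freeze may be undone by any task. */
static void own_freeze_non_owner(struct own_queue *q)
{
    assert(!q->frozen);
    q->frozen = true;
    q->owner_tracked = false;
}

/* Unfreeze: flag a violation only if this freeze was owner-tracked. */
static void own_unfreeze(struct own_queue *q, int task)
{
    assert(q->frozen);
    if (q->owner_tracked && q->owner != task)
        q->violations++;
    q->frozen = false;
}
```

Because the choice is made per call, the same queue can be checked strictly on most paths while a single cross-context call site opts out, instead of the whole queue or tagset losing coverage.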