public inbox for linux-block@vger.kernel.org
* [RFC] hard LOCKUP caused by race between blk_init_queue_node and blkcg_print_blkgs
@ 2018-01-30 11:21 Joseph Qi
From: Joseph Qi @ 2018-01-30 11:21 UTC
  To: Jens Axboe; +Cc: linux-block, Gang Deng

Hi Jens and Folks,

We recently hit a hard LOCKUP. After investigating, we found a race
between blk_init_queue_node and blkcg_print_blkgs, described below.

blk_init_queue_node                 blkcg_print_blkgs
  blk_alloc_queue_node (1)
    q->queue_lock = &q->__queue_lock (2)
    blkcg_init_queue(q) (3)
                                    spin_lock_irq(blkg->q->queue_lock) (4)
  q->queue_lock = lock (5)
                                    spin_unlock_irq(blkg->q->queue_lock) (6)

(1) allocate an uninitialized queue;
(2) initialize queue_lock to its default internal lock;
(3) initialize the blkcg part of the request queue, which creates a blkg
and inserts it into blkg_list;
(4) traverse blkg_list, find the newly created blkg, and take its queue
lock, which at this point is still the default *internal lock*;
(5) *race window*: queue_lock is now overridden with the *driver
specified lock*;
(6) unlock the *driver specified lock* instead of the *internal lock*
taken in (4), so lock and unlock are no longer balanced.
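
The interleaving above can be replayed in a small userspace sketch. This
is not kernel code: toy_lock, toy_queue and replay_race are hypothetical
names made up for illustration, and the "lock" is just a counter so the
imbalance is easy to see.

```c
#include <stddef.h>

/* Toy stand-in for a spinlock: a counter, so imbalance is visible. */
struct toy_lock {
	int held;
};

/* Hypothetical miniature of a request queue: only the two fields
 * that matter for the race. */
struct toy_queue {
	struct toy_lock internal_lock;	/* plays the role of q->__queue_lock */
	struct toy_lock *queue_lock;	/* plays the role of q->queue_lock */
};

static void toy_lock_acquire(struct toy_lock *l) { l->held++; }
static void toy_lock_release(struct toy_lock *l) { l->held--; }

/* Replays steps (2), (4), (5), (6) in the order shown in the diagram. */
static void replay_race(struct toy_queue *q, struct toy_lock *driver_lock)
{
	q->internal_lock.held = 0;
	q->queue_lock = &q->internal_lock;	/* (2) default internal lock */
	/* (3) the blkg would become visible on blkg_list here */
	toy_lock_acquire(q->queue_lock);	/* (4) reader takes internal lock */
	q->queue_lock = driver_lock;		/* (5) pointer is overridden */
	toy_lock_release(q->queue_lock);	/* (6) reader releases driver lock */
}
```

With a driver lock whose counter starts at 0, replay_race leaves
internal_lock.held at 1 (never released) and the driver lock at -1
(released without being taken).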

As for the cause, I think blkcg_init_queue is called too early. We must
not allow the request queue to be used before it is fully initialized.
Since blk_init_queue_node is a very common path and it allows a driver
to override the default internal lock, I'm afraid several other places
may have the same issue.
Am I missing something here?
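
To make the ordering concern concrete, here is a minimal userspace
sketch of one possible ordering, assuming the final lock is chosen
before the blkg becomes visible. It is only an illustration, not a
proposed patch: fix_lock, fix_queue, fix_init_queue and fix_print_blkgs
are hypothetical names, and the lock is again just a counter.

```c
/* Toy stand-in for a spinlock: a counter, so imbalance is visible. */
struct fix_lock {
	int held;
};

/* Hypothetical miniature of a request queue. */
struct fix_queue {
	struct fix_lock internal_lock;	/* default lock */
	struct fix_lock *queue_lock;	/* lock that readers actually use */
	int blkg_visible;		/* stands in for "blkg is on blkg_list" */
};

/* Pick the final lock first, and only then publish the blkg. */
static void fix_init_queue(struct fix_queue *q, struct fix_lock *driver_lock)
{
	q->internal_lock.held = 0;
	q->queue_lock = driver_lock ? driver_lock : &q->internal_lock;
	q->blkg_visible = 1;	/* readers can find the queue only from here */
}

/* Reader side: lock and unlock go through the same, final pointer. */
static int fix_print_blkgs(struct fix_queue *q)
{
	if (!q->blkg_visible)
		return 0;		/* nothing published yet, nothing to lock */
	q->queue_lock->held++;		/* lock */
	/* ... walk the blkgs ... */
	q->queue_lock->held--;		/* unlock the same lock */
	return 1;
}
```

Because queue_lock no longer changes after the blkg is published, the
reader always releases the lock it took, and both counters end at 0.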

Thanks,
Joseph



Thread overview: 4+ messages
2018-01-30 11:21 [RFC] hard LOCKUP caused by race between blk_init_queue_node and blkcg_print_blkgs Joseph Qi
2018-01-30 21:19 ` Bart Van Assche
2018-01-31  1:53   ` Joseph Qi
2018-01-31 16:39     ` Bart Van Assche
