Lockdep splat involving all_q_mutex

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: peterz@infradead.org, axboe@kernel.dk
Cc: linux-kernel@vger.kernel.org, rostedt@goodmis.org, tglx@linutronix.de
Subject: Lockdep splat involving all_q_mutex
Date: Wed, 10 May 2017 15:34:07 -0700
Message-ID: <20170510223407.GA7122@linux.vnet.ibm.com>

Hello!

I got the lockdep splat shown below during some rcutorture testing (which
does CPU hotplug operations) on mainline at commit dc9edaab90de ("Merge
tag 'acpi-extra-4.12-rc1' of git://git.kernel.org/.../rafael/linux-pm").
My knee-jerk reaction was just to reverse the "mutex_lock(&all_q_mutex);"
and "get_online_cpus();" in blk_mq_init_allocated_queue(), but then
I noticed that commit eabe06595d62 ("block/mq: Cure cpu hotplug lock
inversion") just got done moving these two statements in the other
direction.

Acquiring the update-side CPU-hotplug lock across sched_feat_write()
seems like it might be an alternative to eabe06595d62, but I figured
I should check first.  Another approach would be to do the work in
blk_mq_queue_reinit_dead() asynchronously, for example, from a workqueue,
but I would have to know much more about blk_mq to know what effects
that would have.

Thoughts?

							Thanx, Paul

[   32.808758] ======================================================
[   32.810110] [ INFO: possible circular locking dependency detected ]
[   32.811468] 4.11.0+ #1 Not tainted
[   32.812190] -------------------------------------------------------
[   32.813626] torture_onoff/769 is trying to acquire lock:
[   32.814769]  (all_q_mutex){+.+...}, at: [<ffffffff93b884c3>] blk_mq_queue_reinit_work+0x13/0x110
[   32.816655] 
[   32.816655] but task is already holding lock:
[   32.817926]  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff9386014b>] cpu_hotplug_begin+0x6b/0xb0
[   32.819754] 
[   32.819754] which lock already depends on the new lock.
[   32.819754] 
[   32.821898] 
[   32.821898] the existing dependency chain (in reverse order) is:
[   32.823518] 
[   32.823518] -> #1 (cpu_hotplug.lock){+.+.+.}:
[   32.824788]        lock_acquire+0xd1/0x1b0
[   32.825728]        __mutex_lock+0x54/0x8a0
[   32.826629]        mutex_lock_nested+0x16/0x20
[   32.827585]        get_online_cpus+0x47/0x60
[   32.828519]        blk_mq_init_allocated_queue+0x398/0x4d0
[   32.829754]        blk_mq_init_queue+0x35/0x60
[   32.830766]        loop_add+0xe0/0x270
[   32.831595]        loop_init+0x10d/0x14b
[   32.832454]        do_one_initcall+0xef/0x160
[   32.833449]        kernel_init_freeable+0x1b6/0x23e
[   32.834517]        kernel_init+0x9/0x100
[   32.835363]        ret_from_fork+0x2e/0x40
[   32.836254] 
[   32.836254] -> #0 (all_q_mutex){+.+...}:
[   32.837635]        __lock_acquire+0x10bf/0x1350
[   32.838714]        lock_acquire+0xd1/0x1b0
[   32.839702]        __mutex_lock+0x54/0x8a0
[   32.840667]        mutex_lock_nested+0x16/0x20
[   32.841767]        blk_mq_queue_reinit_work+0x13/0x110
[   32.842987]        blk_mq_queue_reinit_dead+0x17/0x20
[   32.844164]        cpuhp_invoke_callback+0x1d1/0x770
[   32.845379]        cpuhp_down_callbacks+0x3d/0x80
[   32.846484]        _cpu_down+0xad/0xe0
[   32.847388]        do_cpu_down+0x39/0x50
[   32.848316]        cpu_down+0xb/0x10
[   32.849236]        torture_offline+0x75/0x140
[   32.850258]        torture_onoff+0x102/0x1e0
[   32.851278]        kthread+0x104/0x140
[   32.852158]        ret_from_fork+0x2e/0x40
[   32.853167] 
[   32.853167] other info that might help us debug this:
[   32.853167] 
[   32.855052]  Possible unsafe locking scenario:
[   32.855052] 
[   32.856442]        CPU0                    CPU1
[   32.857366]        ----                    ----
[   32.858429]   lock(cpu_hotplug.lock);
[   32.859289]                                lock(all_q_mutex);
[   32.860649]                                lock(cpu_hotplug.lock);
[   32.862148]   lock(all_q_mutex);
[   32.862910] 
[   32.862910]  *** DEADLOCK ***
[   32.862910] 
[   32.864289] 3 locks held by torture_onoff/769:
[   32.865386]  #0:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff938601f2>] do_cpu_down+0x22/0x50
[   32.867429]  #1:  (cpu_hotplug.dep_map){++++++}, at: [<ffffffff938600e0>] cpu_hotplug_begin+0x0/0xb0
[   32.869612]  #2:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff9386014b>] cpu_hotplug_begin+0x6b/0xb0
[   32.871700] 
[   32.871700] stack backtrace:
[   32.872727] CPU: 1 PID: 769 Comm: torture_onoff Not tainted 4.11.0+ #1
[   32.874299] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   32.876447] Call Trace:
[   32.877053]  dump_stack+0x67/0x97
[   32.877828]  print_circular_bug+0x1e3/0x250
[   32.878789]  __lock_acquire+0x10bf/0x1350
[   32.879701]  ? retint_kernel+0x10/0x10
[   32.880567]  lock_acquire+0xd1/0x1b0
[   32.881453]  ? lock_acquire+0xd1/0x1b0
[   32.882305]  ? blk_mq_queue_reinit_work+0x13/0x110
[   32.883400]  __mutex_lock+0x54/0x8a0
[   32.884215]  ? blk_mq_queue_reinit_work+0x13/0x110
[   32.885390]  ? kernfs_put+0x103/0x1a0
[   32.886227]  ? kernfs_put+0x103/0x1a0
[   32.887063]  ? blk_mq_queue_reinit_work+0x13/0x110
[   32.888158]  ? rcu_read_lock_sched_held+0x58/0x60
[   32.889287]  ? kmem_cache_free+0x1f7/0x260
[   32.890224]  ? anon_transport_class_unregister+0x20/0x20
[   32.891443]  ? kernfs_put+0x103/0x1a0
[   32.892274]  ? blk_mq_queue_reinit_work+0x110/0x110
[   32.893436]  mutex_lock_nested+0x16/0x20
[   32.894328]  ? mutex_lock_nested+0x16/0x20
[   32.895260]  blk_mq_queue_reinit_work+0x13/0x110
[   32.896307]  blk_mq_queue_reinit_dead+0x17/0x20
[   32.897425]  cpuhp_invoke_callback+0x1d1/0x770
[   32.898443]  ? __flow_cache_shrink+0x130/0x130
[   32.899453]  cpuhp_down_callbacks+0x3d/0x80
[   32.900402]  _cpu_down+0xad/0xe0
[   32.901213]  do_cpu_down+0x39/0x50
[   32.902002]  cpu_down+0xb/0x10
[   32.902716]  torture_offline+0x75/0x140
[   32.903603]  torture_onoff+0x102/0x1e0
[   32.904459]  kthread+0x104/0x140
[   32.905243]  ? torture_kthread_stopping+0x70/0x70
[   32.906316]  ? kthread_create_on_node+0x40/0x40
[   32.907351]  ret_from_fork+0x2e/0x40
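
The cycle reported above can be modeled in a few lines of userspace C.
This is only a minimal sketch of the idea behind the report, not the
kernel's lockdep implementation: it records "acquired B while holding A"
edges and refuses an edge that would close a cycle.  The names
lock_graph, record_dependency(), and the two enum constants are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8

/* Hypothetical lock-class IDs for this sketch (not kernel identifiers). */
enum { CPU_HOTPLUG_LOCK, ALL_Q_MUTEX, NR_CLASSES };

struct lock_graph {
	/* edge[a][b] is true if b was ever acquired while a was held. */
	bool edge[MAX_LOCKS][MAX_LOCKS];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: can "to" be reached from "from" via recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_LOCKS; next++) {
		if (g->edge[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "next was acquired while held was held".  Returns false, without
 * recording anything, if a path from next back to held already exists,
 * because the new edge would then close a cycle.
 */
static bool record_dependency(struct lock_graph *g, int held, int next)
{
	bool seen[MAX_LOCKS] = { false };

	assert(held >= 0 && held < MAX_LOCKS);
	assert(next >= 0 && next < MAX_LOCKS);

	if (reachable(g, next, held, seen))
		return false;
	g->edge[held][next] = true;
	return true;
}
```

In terms of the report, chain #1 corresponds to
record_dependency(&g, ALL_Q_MUTEX, CPU_HOTPLUG_LOCK), which succeeds on
an empty graph.  Chain #0 corresponds to
record_dependency(&g, CPU_HOTPLUG_LOCK, ALL_Q_MUTEX), which is then
rejected because the opposite ordering has already been recorded.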
