From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: linux-kernel@vger.kernel.org, tglx@linutronix.de,
	mingo@redhat.com, hpa@zytor.com, x86@kernel.org,
	fenghua.yu@intel.com
Cc: xen-devel@lists.xensource.com
Subject: v3.9 - CPU hotplug and microcode earlier loading hits a mutex deadlock (x86_cpu_hotplug_driver_mutex)
Date: Mon, 6 May 2013 08:59:37 -0400
Message-ID: <20130506125937.GA14036@phenom.dumpdata.com>


Hey,

As I was fixing the PV HVM's broken CPU hotplug mechanism I discovered
a deadlock between the microcode early-loading code and the generic
CPU hotplug code.

I am not sure if the ACPI generic mechanism would expose this, but
looking at the flow (arch_register_cpu, then expecting user-space to call
cpu_up), it should trigger this.

Anyhow, this can easily be triggered when a new CPU is added and
user-space then does:

echo 1 > /sys/devices/system/cpu/cpu3/online

on a newly appeared CPU. The deadlock is that the "store_online" in
drivers/base/cpu.c takes the cpu_hotplug_driver_lock() lock, then
calls "cpu_up". "cpu_up" eventually ends up calling "save_mc_for_early"
which also takes the cpu_hotplug_driver_lock() lock.
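
To make the recursion concrete, here is a minimal user-space sketch of
the same pattern. It is not kernel code: the *_sketch functions are
hypothetical stand-ins for store_online, cpu_up and save_mc_for_early,
and an error-checking pthread mutex stands in for
x86_cpu_hotplug_driver_mutex so that the second acquisition returns
EDEADLK instead of hanging the way a kernel mutex would.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

/* Stand-in for x86_cpu_hotplug_driver_mutex (assumption: user-space model). */
static pthread_mutex_t hotplug_driver_mutex;

/* Set up an error-checking mutex so self-deadlock is reported, not hung on. */
static int hotplug_lock_init(void)
{
	pthread_mutexattr_t attr;
	int ret = pthread_mutexattr_init(&attr);

	if (ret)
		return ret;
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!ret)
		ret = pthread_mutex_init(&hotplug_driver_mutex, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}

/* Hypothetical stand-in for save_mc_for_early(): takes the lock itself. */
static int save_mc_sketch(void)
{
	int ret = pthread_mutex_lock(&hotplug_driver_mutex);

	if (ret)
		return ret;	/* EDEADLK when the caller already holds it */
	pthread_mutex_unlock(&hotplug_driver_mutex);
	return 0;
}

/* Hypothetical stand-in for cpu_up(): the notifier chain is collapsed here. */
static int cpu_up_sketch(void)
{
	return save_mc_sketch();
}

/* Hypothetical stand-in for store_online(): holds the lock across cpu_up. */
static int store_online_sketch(void)
{
	int ret = pthread_mutex_lock(&hotplug_driver_mutex);

	if (ret)
		return ret;
	ret = cpu_up_sketch();
	pthread_mutex_unlock(&hotplug_driver_mutex);
	return ret;
}

#ifdef DEMO_MAIN
int main(void)
{
	if (hotplug_lock_init())
		return 1;
	printf("recursive lock attempt: %s\n",
	       store_online_sketch() == EDEADLK ? "EDEADLK" : "no error");
	return 0;
}
#endif
```

Built with -DDEMO_MAIN -pthread, the nested call fails on the second
lock attempt, while calling save_mc_sketch() on its own (lock not held)
succeeds. That asymmetry is the point: the inner function is fine by
itself and only breaks when reached with the lock already held.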

And here is what the kernel thinks of it:

smpboot: Stack at about ffff880075c39f44
smpboot: CPU3: has booted.
microcode: CPU3 sig=0x206a7, pf=0x2, revision=0x25

=============================================
[ INFO: possible recursive locking detected ]
3.9.0upstream-10129-g167af0e #1 Not tainted
---------------------------------------------
sh/2487 is trying to acquire lock:
 (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81075512>] cpu_hotplug_driver_lock+0x12/0x20

but task is already holding lock:
 (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81075512>] cpu_hotplug_driver_lock+0x12/0x20

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(x86_cpu_hotplug_driver_mutex);
  lock(x86_cpu_hotplug_driver_mutex);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

6 locks held by sh/2487:
 #0:  (sb_writers#5){.+.+.+}, at: [<ffffffff811ca48d>] vfs_write+0x17d/0x190
 #1:  (&buffer->mutex){+.+.+.}, at: [<ffffffff812464ef>] sysfs_write_file+0x3f/0x160
 #2:  (s_active#20){.+.+.+}, at: [<ffffffff81246578>] sysfs_write_file+0xc8/0x160
 #3:  (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81075512>] cpu_hotplug_driver_lock+0x12/0x20
 #4:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff810961c2>] cpu_maps_update_begin+0x12/0x20
 #5:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff810962a7>] cpu_hotplug_begin+0x27/0x60

stack backtrace:
CPU: 1 PID: 2487 Comm: sh Not tainted 3.9.0upstream-10129-g167af0e #1
Hardware name: Xen HVM domU, BIOS 4.3-unstable 05/03/2013
 ffffffff8229b710 ffff880064e75538 ffffffff816fe47e ffff880064e75608
 ffffffff81100cb6 ffff880064e75560 ffff88006670e290 ffff880064e75588
 ffffffff00000000 ffff88006670e9b8 21710c800c83a1f5 ffff880064e75598
Call Trace:
 [<ffffffff816fe47e>] dump_stack+0x19/0x1b
 [<ffffffff81100cb6>] __lock_acquire+0x726/0x1890
 [<ffffffff8110b352>] ? is_module_text_address+0x22/0x40
 [<ffffffff810bbff8>] ? __kernel_text_address+0x58/0x80
 [<ffffffff81101eca>] lock_acquire+0xaa/0x190
 [<ffffffff81075512>] ? cpu_hotplug_driver_lock+0x12/0x20
 [<ffffffff816fee1e>] __mutex_lock_common+0x5e/0x450
 [<ffffffff81075512>] ? cpu_hotplug_driver_lock+0x12/0x20
 [<ffffffff810d4225>] ? sched_clock_local+0x25/0x90
 [<ffffffff81075512>] ? cpu_hotplug_driver_lock+0x12/0x20
 [<ffffffff816ff340>] mutex_lock_nested+0x40/0x50
 [<ffffffff81075512>] cpu_hotplug_driver_lock+0x12/0x20
 [<ffffffff81080757>] save_mc_for_early+0x27/0xf0
 [<ffffffff810ffd30>] ? mark_held_locks+0x90/0x150
 [<ffffffff81176a5d>] ? get_page_from_freelist+0x46d/0x8e0
 [<ffffffff8110029d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff81056a39>] ? sched_clock+0x9/0x10
 [<ffffffff81177275>] ? __alloc_pages_nodemask+0x165/0xa30
 [<ffffffff810d4225>] ? sched_clock_local+0x25/0x90
 [<ffffffff810d4348>] ? sched_clock_cpu+0xb8/0x110
 [<ffffffff810fb88d>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff811a5e79>] ? vmap_page_range_noflush+0x279/0x370
 [<ffffffff811a5f9d>] ? map_vm_area+0x2d/0x50
 [<ffffffff811a7dce>] ? __vmalloc_node_range+0x18e/0x260
 [<ffffffff810812a8>] ? generic_load_microcode+0xb8/0x1c0
 [<ffffffff8108135c>] generic_load_microcode+0x16c/0x1c0
 [<ffffffff8110917e>] ? generic_exec_single+0x7e/0xb0
 [<ffffffff81081470>] ? request_microcode_user+0x20/0x20
 [<ffffffff8108142f>] request_microcode_fw+0x7f/0xa0
 [<ffffffff813356ab>] ? kobject_uevent+0xb/0x10
 [<ffffffff81081004>] microcode_init_cpu+0xf4/0x110
 [<ffffffff816f6b8c>] mc_cpu_callback+0x5b/0xb3
 [<ffffffff81706d7c>] notifier_call_chain+0x5c/0x120
 [<ffffffff810c6359>] __raw_notifier_call_chain+0x9/0x10
 [<ffffffff8109616b>] __cpu_notify+0x1b/0x30
 [<ffffffff816f72e1>] _cpu_up+0x103/0x14b
 [<ffffffff816f7404>] cpu_up+0xdb/0xee
 [<ffffffff816eda0a>] store_online+0xba/0x120
 [<ffffffff8145f08b>] dev_attr_store+0x1b/0x20
 [<ffffffff81246591>] sysfs_write_file+0xe1/0x160
 [<ffffffff811ca3ef>] vfs_write+0xdf/0x190
 [<ffffffff811ca96d>] SyS_write+0x5d/0xa0
 [<ffffffff8133f4fe>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8170b7a9>] system_call_fastpath+0x16/0x1b

Thoughts? 
