* [BUG] smp alternatives: sleeping function called from invalid context
@ 2008-04-30 19:51 Luca Tettamanti
2008-05-01 13:04 ` Mathieu Desnoyers
0 siblings, 1 reply; 9+ messages in thread
From: Luca Tettamanti @ 2008-04-30 19:51 UTC (permalink / raw)
To: Mathieu Desnoyers; +Cc: linux-kernel
Hello,
with git current I'm seeing these warnings when bringing CPUs down:
[13794.108805] CPU0 attaching NULL sched-domain.
[13794.108815] CPU1 attaching NULL sched-domain.
[13794.124695] Broke affinity for irq 22
[13794.192376] CPU 1 is now offline
[13794.192381] lockdep: fixing up alternatives.
[13794.192389] SMP alternatives: switching to UP code
[13794.192395] BUG: sleeping function called from invalid context at /home/kronos/src/linux-2.6.git/mm/slab.c:3049
[13794.192398] in_atomic():1, irqs_disabled():0
[13794.192401] 4 locks held by bash/11413:
[13794.192404] #0: (&buffer->mutex){--..}, at: [<ffffffff802c638c>] sysfs_write_file+0x36/0x10c
[13794.192417] #1: (cpu_add_remove_lock){--..}, at: [<ffffffff80253479>] cpu_down+0x12/0x32
[13794.192427] #2: (&cpu_hotplug.lock){--..}, at: [<ffffffff80253024>] cpu_hotplug_begin+0x36/0xa6
[13794.192435] #3: (smp_alt){--..}, at: [<ffffffff8021182f>] alternatives_smp_switch+0x63/0x1a5
[13794.192446] Pid: 11413, comm: bash Not tainted 2.6.25-64-05976-ge31a94e-dirty #68
[13794.192449]
[13794.192450] Call Trace:
[13794.192681] [<ffffffff80280f61>] __kmalloc+0x29/0x7d
[13794.192695] [<ffffffff80276415>] __get_vm_area_node+0x9a/0x1b1
[13794.192703] [<ffffffff80209391>] clear_ti_thread_flag+0xc/0x12
[13794.192709] [<ffffffff80276585>] get_vm_area_caller+0x2b/0x2e
[13794.192715] [<ffffffff8021159d>] text_poke+0xdd/0x156
[13794.192719] [<ffffffff80276cc5>] vmap+0x2d/0x5a
[13794.192727] [<ffffffff8021159d>] text_poke+0xdd/0x156
[13794.192736] [<ffffffff80446f1b>] _etext+0x0/0x5
[13794.192742] [<ffffffff80211666>] alternatives_smp_unlock+0x50/0x65
[13794.192752] [<ffffffff80211934>] alternatives_smp_switch+0x168/0x1a5
[13794.192758] [<ffffffff80219439>] __cpus_weight+0x3f/0x41
[13794.192767] [<ffffffff802532b4>] _cpu_down+0x18f/0x264
[13794.192796] [<ffffffff8025348b>] cpu_down+0x24/0x32
[13794.192805] [<ffffffff8043259d>] store_online+0x29/0x67
[13794.192812] [<ffffffff802c642b>] sysfs_write_file+0xd5/0x10c
[13794.192824] [<ffffffff80283cba>] vfs_write+0xa5/0xde
[13794.192832] [<ffffffff80283da5>] sys_write+0x45/0x6b
[13794.192842] [<ffffffff80221ed2>] ia32_sysret+0x0/0xa
[13794.192857]
[13794.208799] CPU0 attaching NULL sched-domain.
...and up:
[13803.369635] lockdep: fixing up alternatives.
[13803.369640] SMP alternatives: switching to SMP code
[13803.369645] BUG: sleeping function called from invalid context at /home/kronos/src/linux-2.6.git/mm/slab.c:3049
[13803.369648] in_atomic():1, irqs_disabled():0
[13803.369651] 4 locks held by bash/11413:
[13803.369654] #0: (&buffer->mutex){--..}, at: [<ffffffff802c638c>] sysfs_write_file+0x36/0x10c
[13803.369666] #1: (cpu_add_remove_lock){--..}, at: [<ffffffff8044115e>] cpu_up+0x40/0x64
[13803.369676] #2: (&cpu_hotplug.lock){--..}, at: [<ffffffff80253024>] cpu_hotplug_begin+0x36/0xa6
[13803.369686] #3: (smp_alt){--..}, at: [<ffffffff8021182f>] alternatives_smp_switch+0x63/0x1a5
[13803.369697] Pid: 11413, comm: bash Not tainted 2.6.25-64-05976-ge31a94e-dirty #68
[13803.369700]
[13803.369701] Call Trace:
[13803.369712] [<ffffffff80280f61>] __kmalloc+0x29/0x7d
[13803.369721] [<ffffffff80276415>] __get_vm_area_node+0x9a/0x1b1
[13803.369731] [<ffffffff80209391>] clear_ti_thread_flag+0xc/0x12
[13803.369738] [<ffffffff80446f1b>] _etext+0x0/0x5
[13803.369743] [<ffffffff80276585>] get_vm_area_caller+0x2b/0x2e
[13803.369749] [<ffffffff8021159d>] text_poke+0xdd/0x156
[13803.369753] [<ffffffff80276cc5>] vmap+0x2d/0x5a
[13803.369763] [<ffffffff8021159d>] text_poke+0xdd/0x156
[13803.369773] [<ffffffff802118c5>] alternatives_smp_switch+0xf9/0x1a5
[13803.369795] [<ffffffff8043f29e>] native_cpu_up+0x224/0x7fc
[13803.369804] [<ffffffff8043ee95>] do_fork_idle+0x0/0x20
[13803.369833] [<ffffffff80441098>] _cpu_up+0x91/0x117
[13803.369842] [<ffffffff80441175>] cpu_up+0x57/0x64
[13803.369849] [<ffffffff804325b7>] store_online+0x43/0x67
[13803.369856] [<ffffffff802c642b>] sysfs_write_file+0xd5/0x10c
[13803.369867] [<ffffffff80283cba>] vfs_write+0xa5/0xde
[13803.369875] [<ffffffff80283da5>] sys_write+0x45/0x6b
[13803.369886] [<ffffffff80221ed2>] ia32_sysret+0x0/0xa
[13803.369898]
[13803.380408] Booting processor 1/1 ip 6000
[13803.389154] Initializing CPU#1
[13803.389154] masked ExtINT on CPU#1
The bug has been introduced (assuming that I'm using git blame correctly) by this commit:
commit e587cadd8f47e202a30712e2906a65a0606d5865
Author: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Date: Thu Mar 6 08:48:49 2008 -0500
x86: enhance DEBUG_RODATA support - alternatives
In both code paths - cpu_up() and cpud_down() - alternatives_smp_switch ends up
calling text_poke() (via alternatives_smp_{,un}lock) with smp_alt held;
then text_poke does a vmap which might sleep!
Luca
--
Windows NT crashed.
I'm the Blue Screen of Death.
No one hears your screams.
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [BUG] smp alternatives: sleeping function called from invalid context 2008-04-30 19:51 [BUG] smp alternatives: sleeping function called from invalid context Luca Tettamanti @ 2008-05-01 13:04 ` Mathieu Desnoyers 2008-05-01 15:02 ` Luca Tettamanti 0 siblings, 1 reply; 9+ messages in thread From: Mathieu Desnoyers @ 2008-05-01 13:04 UTC (permalink / raw) To: Luca Tettamanti; +Cc: linux-kernel * Luca Tettamanti (kronos.it@gmail.com) wrote: > Hello, > with git current I'm seeing these warnings when bringing CPUs down: > Hi Luca, Does your tree include the patch posted here ? http://lkml.org/lkml/2008/4/19/139 Mathieu > [13794.108805] CPU0 attaching NULL sched-domain. > [13794.108815] CPU1 attaching NULL sched-domain. > [13794.124695] Broke affinity for irq 22 > [13794.192376] CPU 1 is now offline > [13794.192381] lockdep: fixing up alternatives. > [13794.192389] SMP alternatives: switching to UP code > [13794.192395] BUG: sleeping function called from invalid context at /home/kronos/src/linux-2.6.git/mm/slab.c:3049 > [13794.192398] in_atomic():1, irqs_disabled():0 > [13794.192401] 4 locks held by bash/11413: > [13794.192404] #0: (&buffer->mutex){--..}, at: [<ffffffff802c638c>] sysfs_write_file+0x36/0x10c > [13794.192417] #1: (cpu_add_remove_lock){--..}, at: [<ffffffff80253479>] cpu_down+0x12/0x32 > [13794.192427] #2: (&cpu_hotplug.lock){--..}, at: [<ffffffff80253024>] cpu_hotplug_begin+0x36/0xa6 > [13794.192435] #3: (smp_alt){--..}, at: [<ffffffff8021182f>] alternatives_smp_switch+0x63/0x1a5 > [13794.192446] Pid: 11413, comm: bash Not tainted 2.6.25-64-05976-ge31a94e-dirty #68 > [13794.192449] > [13794.192450] Call Trace: > [13794.192681] [<ffffffff80280f61>] __kmalloc+0x29/0x7d > [13794.192695] [<ffffffff80276415>] __get_vm_area_node+0x9a/0x1b1 > [13794.192703] [<ffffffff80209391>] clear_ti_thread_flag+0xc/0x12 > [13794.192709] [<ffffffff80276585>] get_vm_area_caller+0x2b/0x2e > [13794.192715] [<ffffffff8021159d>] text_poke+0xdd/0x156 > [13794.192719] [<ffffffff80276cc5>] vmap+0x2d/0x5a > [13794.192727] [<ffffffff8021159d>] text_poke+0xdd/0x156 > [13794.192736] [<ffffffff80446f1b>] _etext+0x0/0x5 > [13794.192742] [<ffffffff80211666>] alternatives_smp_unlock+0x50/0x65 > [13794.192752] [<ffffffff80211934>] alternatives_smp_switch+0x168/0x1a5 > [13794.192758] [<ffffffff80219439>] __cpus_weight+0x3f/0x41 > [13794.192767] [<ffffffff802532b4>] _cpu_down+0x18f/0x264 > [13794.192796] [<ffffffff8025348b>] cpu_down+0x24/0x32 > [13794.192805] [<ffffffff8043259d>] store_online+0x29/0x67 > [13794.192812] [<ffffffff802c642b>] sysfs_write_file+0xd5/0x10c > [13794.192824] [<ffffffff80283cba>] vfs_write+0xa5/0xde > [13794.192832] [<ffffffff80283da5>] sys_write+0x45/0x6b > [13794.192842] [<ffffffff80221ed2>] ia32_sysret+0x0/0xa > [13794.192857] > [13794.208799] CPU0 attaching NULL sched-domain. > > ...and up: > > [13803.369635] lockdep: fixing up alternatives. > [13803.369640] SMP alternatives: switching to SMP code > [13803.369645] BUG: sleeping function called from invalid context at /home/kronos/src/linux-2.6.git/mm/slab.c:3049 > [13803.369648] in_atomic():1, irqs_disabled():0 > [13803.369651] 4 locks held by bash/11413: > [13803.369654] #0: (&buffer->mutex){--..}, at: [<ffffffff802c638c>] sysfs_write_file+0x36/0x10c > [13803.369666] #1: (cpu_add_remove_lock){--..}, at: [<ffffffff8044115e>] cpu_up+0x40/0x64 > [13803.369676] #2: (&cpu_hotplug.lock){--..}, at: [<ffffffff80253024>] cpu_hotplug_begin+0x36/0xa6 > [13803.369686] #3: (smp_alt){--..}, at: [<ffffffff8021182f>] alternatives_smp_switch+0x63/0x1a5 > [13803.369697] Pid: 11413, comm: bash Not tainted 2.6.25-64-05976-ge31a94e-dirty #68 > [13803.369700] > [13803.369701] Call Trace: > [13803.369712] [<ffffffff80280f61>] __kmalloc+0x29/0x7d > [13803.369721] [<ffffffff80276415>] __get_vm_area_node+0x9a/0x1b1 > [13803.369731] [<ffffffff80209391>] clear_ti_thread_flag+0xc/0x12 > [13803.369738] [<ffffffff80446f1b>] _etext+0x0/0x5 > [13803.369743] [<ffffffff80276585>] get_vm_area_caller+0x2b/0x2e > [13803.369749] [<ffffffff8021159d>] text_poke+0xdd/0x156 > [13803.369753] [<ffffffff80276cc5>] vmap+0x2d/0x5a > [13803.369763] [<ffffffff8021159d>] text_poke+0xdd/0x156 > [13803.369773] [<ffffffff802118c5>] alternatives_smp_switch+0xf9/0x1a5 > [13803.369795] [<ffffffff8043f29e>] native_cpu_up+0x224/0x7fc > [13803.369804] [<ffffffff8043ee95>] do_fork_idle+0x0/0x20 > [13803.369833] [<ffffffff80441098>] _cpu_up+0x91/0x117 > [13803.369842] [<ffffffff80441175>] cpu_up+0x57/0x64 > [13803.369849] [<ffffffff804325b7>] store_online+0x43/0x67 > [13803.369856] [<ffffffff802c642b>] sysfs_write_file+0xd5/0x10c > [13803.369867] [<ffffffff80283cba>] vfs_write+0xa5/0xde > [13803.369875] [<ffffffff80283da5>] sys_write+0x45/0x6b > [13803.369886] [<ffffffff80221ed2>] ia32_sysret+0x0/0xa > [13803.369898] > [13803.380408] Booting processor 1/1 ip 6000 > [13803.389154] Initializing CPU#1 > [13803.389154] masked ExtINT on CPU#1 > > The bug has been introduced (assuming that I'm using git blame correctly) by this commit: > > commit e587cadd8f47e202a30712e2906a65a0606d5865 > Author: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> > Date: Thu Mar 6 08:48:49 2008 -0500 > > x86: enhance DEBUG_RODATA support - alternatives > > > In both code paths - cpu_up() and cpud_down() - alternatives_smp_switch ends up > calling text_poke() (via alternatives_smp_{,un}lock) with smp_alt held; > then text_poke does a vmap which might sleep! > > Luca > -- > Windows NT crashed. > I'm the Blue Screen of Death. > No one hears your screams. -- Mathieu Desnoyers OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] smp alternatives: sleeping function called from invalid context 2008-05-01 13:04 ` Mathieu Desnoyers @ 2008-05-01 15:02 ` Luca Tettamanti 2008-05-01 16:44 ` Luca Tettamanti 0 siblings, 1 reply; 9+ messages in thread From: Luca Tettamanti @ 2008-05-01 15:02 UTC (permalink / raw) To: Mathieu Desnoyers; +Cc: linux-kernel On Thu, May 1, 2008 at 3:04 PM, Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote: > * Luca Tettamanti (kronos.it@gmail.com) wrote: > > Hello, > > with git current I'm seeing these warnings when bringing CPUs down: > > > > > Hi Luca, > > Does your tree include the patch posted here ? > > http://lkml.org/lkml/2008/4/19/139 Nope, I was using vanilla kernel (smp_alt is still a spinlock). Will test the patch ASAP. Luca ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] smp alternatives: sleeping function called from invalid context 2008-05-01 15:02 ` Luca Tettamanti @ 2008-05-01 16:44 ` Luca Tettamanti 2008-05-01 18:46 ` [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable Mathieu Desnoyers 0 siblings, 1 reply; 9+ messages in thread From: Luca Tettamanti @ 2008-05-01 16:44 UTC (permalink / raw) To: Mathieu Desnoyers; +Cc: linux-kernel On Thu, May 1, 2008 at 5:02 PM, Luca Tettamanti <kronos.it@gmail.com> wrote: > On Thu, May 1, 2008 at 3:04 PM, Mathieu Desnoyers > <mathieu.desnoyers@polymtl.ca> wrote: > > * Luca Tettamanti (kronos.it@gmail.com) wrote: > > > Hello, > > > with git current I'm seeing these warnings when bringing CPUs down: > > > > > > > > > Hi Luca, > > > > Does your tree include the patch posted here ? > > > > http://lkml.org/lkml/2008/4/19/139 > > Nope, I was using vanilla kernel (smp_alt is still a spinlock). Will > test the patch ASAP. Tested the patch: works fine, and the system is stable after multiple up&down, I'm not experiencing a reboot. Luca ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable 2008-05-01 16:44 ` Luca Tettamanti @ 2008-05-01 18:46 ` Mathieu Desnoyers 2008-05-01 18:52 ` Linus Torvalds 0 siblings, 1 reply; 9+ messages in thread From: Mathieu Desnoyers @ 2008-05-01 18:46 UTC (permalink / raw) To: Ingo Molnar, Linus Torvalds, Luca Tettamanti; +Cc: linux-kernel * Luca Tettamanti (kronos.it@gmail.com) wrote: > On Thu, May 1, 2008 at 5:02 PM, Luca Tettamanti <kronos.it@gmail.com> wrote: > > On Thu, May 1, 2008 at 3:04 PM, Mathieu Desnoyers > > <mathieu.desnoyers@polymtl.ca> wrote: > > > * Luca Tettamanti (kronos.it@gmail.com) wrote: > > > > Hello, > > > > with git current I'm seeing these warnings when bringing CPUs down: > > > > > > > > > > > > > Hi Luca, > > > > > > Does your tree include the patch posted here ? > > > > > > http://lkml.org/lkml/2008/4/19/139 > > > > Nope, I was using vanilla kernel (smp_alt is still a spinlock). Will > > test the patch ASAP. > > Tested the patch: works fine, and the system is stable after multiple > up&down, I'm not experiencing a reboot. > > Luca Ingo, Linus, I guess fast-tracking pulling the following patch into mainline wouldn't hurt. I just updated it so it applies to mainline. Thanks, Mathieu Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable * Pekka Paalanen (pq@iki.fi) wrote: > On Mon, 14 Apr 2008 08:57:13 +0200 > Ingo Molnar <mingo@elte.hu> wrote: > > > Pekka Paalanen wrote: > > > When I tested this patch on Intel Core 2 Duo, enter_uniprocessor() > > > triggered the following kernel bug: > > > > > > Linux version 2.6.25-rc8-sched-devel.git-x86-latest.git (paalanen@ct200006) > > > (gcc version 4.1.2 (Gentoo 4.1.2 p1.0.1)) #2 SMP PREEMPT Sun Apr 13 > > > 22:09:03 EEST 2008 > > > ... > > > in mmio_trace_init > > > mmiotrace: Disabling non-boot CPUs... > > > CPU 1 is now offline > > > lockdep: fixing up alternatives. > > > SMP alternatives: switching to UP code > > > BUG: sleeping function called from invalid context at mm/slab.c:3053 > > > in_atomic():1, irqs_disabled():0 > > > 5 locks held by bash/4423: > > > #0: (trace_types_lock){--..}, at: [<ffffffff8026442f>] tracing_set_trace_write+0x93/0x11a > > > #1: (mmiotrace_mutex){--..}, at: [<ffffffff802251e0>] enable_mmiotrace+0x17/0x142 > > > #2: (cpu_add_remove_lock){--..}, at: [<ffffffff802580e5>] cpu_maps_update_begin+0x12/0x14 > > > #3: (&cpu_hotplug.lock){--..}, at: [<ffffffff8025814f>] cpu_hotplug_begin+0x39/0x9f > > > #4: (smp_alt){--..}, at: [<ffffffff80211bd9>] alternatives_smp_switch+0x66/0x1ab > > > Pid: 4423, comm: bash Not tainted 2.6.25-rc8-sched-devel.git-x86-latest.git #2 > > > > > > Call Trace: > > > [<ffffffff802520c0>] ? __debug_show_held_locks+0x22/0x24 > > > [<ffffffff8022d292>] __might_sleep+0xd9/0xdb > > > [<ffffffff80287326>] cache_alloc_debugcheck_before+0x23/0x32 > > > [<ffffffff80287a32>] __kmalloc+0x34/0xa5 > > > [<ffffffff80209393>] ? clear_ti_thread_flag+0x10/0x17 > > > [<ffffffff8027d7f6>] kmalloc_node+0x9/0xb > > > [<ffffffff8027d8e0>] __get_vm_area_node+0xa2/0x1cb > > > [<ffffffff80209393>] ? clear_ti_thread_flag+0x10/0x17 > > > [<ffffffff8027da41>] __get_vm_area+0x13/0x15 > > > [<ffffffff8027da60>] get_vm_area+0x1d/0x1f > > > [<ffffffff8027e14c>] vmap+0x2a/0x5c > > > [<ffffffff8021191b>] text_poke+0xaa/0x136 > > > [<ffffffff804cba2b>] ? _etext+0x0/0x5 > > > [<ffffffff802119f6>] alternatives_smp_unlock+0x4f/0x63 > > > [<ffffffff80211ce1>] alternatives_smp_switch+0x16e/0x1ab > > > [<ffffffff8021b163>] __cpu_die+0x53/0x7d > > > [<ffffffff802583e2>] _cpu_down+0x195/0x26c > > > [<ffffffff802585ca>] cpu_down+0x26/0x36 > > > [<ffffffff80225270>] enable_mmiotrace+0xa7/0x142 > > > [<ffffffff80266b8d>] mmio_trace_init+0x3c/0x40 > > > [<ffffffff8026448e>] tracing_set_trace_write+0xf2/0x11a > > > [<ffffffff80327fac>] ? security_file_permission+0x11/0x13 > > > [<ffffffff8028b047>] vfs_write+0xa7/0xe1 > > > [<ffffffff8028b13b>] sys_write+0x47/0x6d > > > [<ffffffff8020b4db>] system_call_after_swapgs+0x7b/0x80 > > > > > > mmiotrace: CPU1 is down. > > > mmiotrace: enabled. > > > > > > Is this my fault, or is there a bug somewhere else? The kernel tree is > > > sched-devel/latest git from 12th April, IIRC. > > > > there's no known bug of sched-devel/latest of this kind (or any known > > bug for that matter). > > > > i suspect the bug is that you bring the CPU down from an atomic > > (spinlocked or irq disabled) context. > > I have been eyeballing the code in current sched-devel/latest and there's > something I think I found. > > This is the beginning of my enter_uniprocessor() which is called from > enable_mmiotrace() seen in the backtrace. > > static void enter_uniprocessor(void) > { > int cpu; > int err; > > get_online_cpus(); > downed_cpus = cpu_online_map; > cpu_clear(first_cpu(cpu_online_map), downed_cpus); > if (num_online_cpus() > 1) > pr_notice(NAME "Disabling non-boot CPUs...\n"); > put_online_cpus(); > > for_each_cpu_mask(cpu, downed_cpus) { > err = cpu_down(cpu); > > The function get_online_cpus() calls might_sleep(), so at that > point everything is fine. > > Following the backtrace, we come to alternatives_smp_switch(), > which does > spin_lock(&smp_alt); > and then continues eventually into this function: > > void *__kprobes text_poke(void *addr, const void *opcode, size_t len) > { > unsigned long flags; > char *vaddr; > int nr_pages = 2; > > BUG_ON(len > sizeof(long)); > BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1)) > - ((long)addr & ~(sizeof(long) - 1))); > if (kernel_text_address((unsigned long)addr)) { > struct page *pages[2] = { virt_to_page(addr), > virt_to_page(addr + PAGE_SIZE) }; > if (!pages[1]) > nr_pages = 1; > vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL); > > After which we find ourselves in > > struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags, > unsigned long start, unsigned long end) > { > return __get_vm_area_node(size, flags, start, end, -1, GFP_KERNEL); > > and in __get_vm_area_node() there is > area = kmalloc_node(sizeof(*area), gfp_mask & GFP_RECLAIM_MASK, node); > > where gfp_mask is now GFP_KERNEL. As far as I can tell, using SLAB, > kmalloc_node() here is actually kmalloc(). Anyway, looks like we end up > in __kmalloc with such flags that it may sleep. > > I have followed the code so that the path up to kmalloc_node() is > possible, unless there are some "should never happen" conditions > along the way, which I just cannot know of. > > Is this the bug? > Yep. I think using a mutex should fix it. There is no reason to use a spinlock rather than a mutex here. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> CC: Pekka Paalanen <pq@iki.fi> CC: Ingo Molnar <mingo@elte.hu> CC: Steven Rostedt <rostedt@goodmis.org> --- arch/x86/kernel/alternative.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) Index: linux-2.6-lttng/arch/x86/kernel/alternative.c =================================================================== --- linux-2.6-lttng.orig/arch/x86/kernel/alternative.c 2008-05-01 14:41:09.000000000 -0400 +++ linux-2.6-lttng/arch/x86/kernel/alternative.c 2008-05-01 14:44:28.000000000 -0400 @@ -1,6 +1,6 @@ #include <linux/module.h> #include <linux/sched.h> -#include <linux/spinlock.h> +#include <linux/mutex.h> #include <linux/list.h> #include <linux/kprobes.h> #include <linux/mm.h> @@ -279,7 +279,7 @@ struct smp_alt_module { struct list_head next; }; static LIST_HEAD(smp_alt_modules); -static DEFINE_SPINLOCK(smp_alt); +static DEFINE_MUTEX(smp_alt); static int smp_mode = 1; /* protected by smp_alt */ void alternatives_smp_module_add(struct module *mod, char *name, @@ -312,12 +312,12 @@ void alternatives_smp_module_add(struct __func__, smp->locks, smp->locks_end, smp->text, smp->text_end, smp->name); - spin_lock(&smp_alt); + mutex_lock(&smp_alt); list_add_tail(&smp->next, &smp_alt_modules); if (boot_cpu_has(X86_FEATURE_UP)) alternatives_smp_unlock(smp->locks, smp->locks_end, smp->text, smp->text_end); - spin_unlock(&smp_alt); + mutex_unlock(&smp_alt); } void alternatives_smp_module_del(struct module *mod) @@ -327,17 +327,17 @@ void alternatives_smp_module_del(struct if (smp_alt_once || noreplace_smp) return; - spin_lock(&smp_alt); + mutex_lock(&smp_alt); list_for_each_entry(item, &smp_alt_modules, next) { if (mod != item->mod) continue; list_del(&item->next); - spin_unlock(&smp_alt); + mutex_unlock(&smp_alt); DPRINTK("%s: %s\n", __func__, item->name); kfree(item); return; } - spin_unlock(&smp_alt); + mutex_unlock(&smp_alt); } void alternatives_smp_switch(int smp) @@ -359,7 +359,7 @@ void alternatives_smp_switch(int smp) return; BUG_ON(!smp && (num_online_cpus() > 1)); - spin_lock(&smp_alt); + mutex_lock(&smp_alt); /* * Avoid unnecessary switches because it forces JIT based VMs to @@ -383,7 +383,7 @@ void alternatives_smp_switch(int smp) mod->text, mod->text_end); } smp_mode = smp; - spin_unlock(&smp_alt); + mutex_unlock(&smp_alt); } #endif -- Mathieu Desnoyers OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable 2008-05-01 18:46 ` [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable Mathieu Desnoyers @ 2008-05-01 18:52 ` Linus Torvalds 2008-05-01 20:08 ` Mathieu Desnoyers 0 siblings, 1 reply; 9+ messages in thread From: Linus Torvalds @ 2008-05-01 18:52 UTC (permalink / raw) To: Mathieu Desnoyers; +Cc: Ingo Molnar, Luca Tettamanti, linux-kernel On Thu, 1 May 2008, Mathieu Desnoyers wrote: > > Yep. I think using a mutex should fix it. There is no reason to use a > spinlock rather than a mutex here. Is there any reason to do any locking what-so-ever? If this code could possibly race with something else, that sounds like a bug in itself, no? Linus ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable 2008-05-01 18:52 ` Linus Torvalds @ 2008-05-01 20:08 ` Mathieu Desnoyers 0 siblings, 0 replies; 9+ messages in thread From: Mathieu Desnoyers @ 2008-05-01 20:08 UTC (permalink / raw) To: Linus Torvalds; +Cc: Ingo Molnar, Luca Tettamanti, linux-kernel * Linus Torvalds (torvalds@linux-foundation.org) wrote: > > > On Thu, 1 May 2008, Mathieu Desnoyers wrote: > > > > Yep. I think using a mutex should fix it. There is no reason to use a > > spinlock rather than a mutex here. > > Is there any reason to do any locking what-so-ever? > > If this code could possibly race with something else, that sounds like a > bug in itself, no? > > Linus The smp_alt mutex taken in alternatives_smp_module_add/alternatives_smp_module_del seems required because this code will run at module load/free, even in SMP context (not within stop_machine_run). As for alternatives_smp_switch, which is called when switching between UP and SMP, always in UP context, I guess we should assume it could be executed concurrently wrt the module_load/free, even in UP on a CONFIG_PREEMPT kernel. Mathieu -- Mathieu Desnoyers OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <20080413224207.4430a09c@daedalus.pq.iki.fi>]
* [BUG/PATCH] x86 mmiotrace: dynamically disable non-boot CPUs [not found] <20080413224207.4430a09c@daedalus.pq.iki.fi> @ 2008-04-13 20:05 ` Pekka Paalanen 2008-04-14 6:57 ` Ingo Molnar 0 siblings, 1 reply; 9+ messages in thread From: Pekka Paalanen @ 2008-04-13 20:05 UTC (permalink / raw) To: Ingo Molnar; +Cc: Pekka Paalanen, Steven Rostedt, linux-kernel >From 9d00d006631650e8ba9131356c544a5ae79f873e Mon Sep 17 00:00:00 2001 From: Pekka Paalanen <pq@iki.fi> Date: Sat, 12 Apr 2008 00:18:57 +0300 Subject: [PATCH] x86 mmiotrace: dynamically disable non-boot CPUs Mmiotrace is not reliable with multiple CPUs and may miss events. Drop to single CPU when mmiotrace is activated. Signed-off-by: Pekka Paalanen <pq@iki.fi> --- When I tested this patch on Intel Core 2 Duo, enter_uniprocessor() triggered the following kernel bug: Linux version 2.6.25-rc8-sched-devel.git-x86-latest.git (paalanen@ct200006) (gcc version 4.1.2 (Gentoo 4.1.2 p1.0.1)) #2 SMP PREEMPT Sun Apr 13 22:09:03 EEST 2008 ... in mmio_trace_init mmiotrace: Disabling non-boot CPUs... CPU 1 is now offline lockdep: fixing up alternatives. SMP alternatives: switching to UP code BUG: sleeping function called from invalid context at mm/slab.c:3053 in_atomic():1, irqs_disabled():0 5 locks held by bash/4423: #0: (trace_types_lock){--..}, at: [<ffffffff8026442f>] tracing_set_trace_write+0x93/0x11a #1: (mmiotrace_mutex){--..}, at: [<ffffffff802251e0>] enable_mmiotrace+0x17/0x142 #2: (cpu_add_remove_lock){--..}, at: [<ffffffff802580e5>] cpu_maps_update_begin+0x12/0x14 #3: (&cpu_hotplug.lock){--..}, at: [<ffffffff8025814f>] cpu_hotplug_begin+0x39/0x9f #4: (smp_alt){--..}, at: [<ffffffff80211bd9>] alternatives_smp_switch+0x66/0x1ab Pid: 4423, comm: bash Not tainted 2.6.25-rc8-sched-devel.git-x86-latest.git #2 Call Trace: [<ffffffff802520c0>] ? __debug_show_held_locks+0x22/0x24 [<ffffffff8022d292>] __might_sleep+0xd9/0xdb [<ffffffff80287326>] cache_alloc_debugcheck_before+0x23/0x32 [<ffffffff80287a32>] __kmalloc+0x34/0xa5 [<ffffffff80209393>] ? clear_ti_thread_flag+0x10/0x17 [<ffffffff8027d7f6>] kmalloc_node+0x9/0xb [<ffffffff8027d8e0>] __get_vm_area_node+0xa2/0x1cb [<ffffffff80209393>] ? clear_ti_thread_flag+0x10/0x17 [<ffffffff8027da41>] __get_vm_area+0x13/0x15 [<ffffffff8027da60>] get_vm_area+0x1d/0x1f [<ffffffff8027e14c>] vmap+0x2a/0x5c [<ffffffff8021191b>] text_poke+0xaa/0x136 [<ffffffff804cba2b>] ? _etext+0x0/0x5 [<ffffffff802119f6>] alternatives_smp_unlock+0x4f/0x63 [<ffffffff80211ce1>] alternatives_smp_switch+0x16e/0x1ab [<ffffffff8021b163>] __cpu_die+0x53/0x7d [<ffffffff802583e2>] _cpu_down+0x195/0x26c [<ffffffff802585ca>] cpu_down+0x26/0x36 [<ffffffff80225270>] enable_mmiotrace+0xa7/0x142 [<ffffffff80266b8d>] mmio_trace_init+0x3c/0x40 [<ffffffff8026448e>] tracing_set_trace_write+0xf2/0x11a [<ffffffff80327fac>] ? security_file_permission+0x11/0x13 [<ffffffff8028b047>] vfs_write+0xa7/0xe1 [<ffffffff8028b13b>] sys_write+0x47/0x6d [<ffffffff8020b4db>] system_call_after_swapgs+0x7b/0x80 mmiotrace: CPU1 is down. mmiotrace: enabled. Is this my fault, or is there a bug somewhere else? The kernel tree is sched-devel/latest git from 12th April, IIRC. arch/x86/mm/mmio-mod.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 51 insertions(+), 0 deletions(-) diff --git a/arch/x86/mm/mmio-mod.c b/arch/x86/mm/mmio-mod.c index ab2bb77..f523c5f 100644 --- a/arch/x86/mm/mmio-mod.c +++ b/arch/x86/mm/mmio-mod.c @@ -32,6 +32,7 @@ #include <asm/e820.h> /* for ISA_START_ADDRESS */ #include <asm/atomic.h> #include <linux/percpu.h> +#include <linux/cpu.h> #include "pf_in.h" @@ -399,6 +400,54 @@ static void clear_trace_list(void) } } +static cpumask_t downed_cpus; + +static void enter_uniprocessor(void) +{ +#ifdef CONFIG_SMP + int cpu; + int err; + + get_online_cpus(); + downed_cpus = cpu_online_map; + cpu_clear(first_cpu(cpu_online_map), downed_cpus); + if (num_online_cpus() > 1) + pr_notice(NAME "Disabling non-boot CPUs...\n"); + put_online_cpus(); + + for_each_cpu_mask(cpu, downed_cpus) { + err = cpu_down(cpu); + if (!err) { + pr_info(NAME "CPU%d is down.\n", cpu); + } else { + pr_err(NAME "Error taking CPU%d down: %d\n", cpu, err); + } + } + if (num_online_cpus() > 1) + pr_warning(NAME "multiple CPUs still online, " + "may miss events.\n"); +#endif /* CONFIG_SMP */ +} + +static void leave_uniprocessor(void) +{ +#ifdef CONFIG_SMP + int cpu; + int err; + + if (cpus_weight(downed_cpus) == 0) + return; + pr_notice(NAME "Re-enabling CPUs...\n"); + for_each_cpu_mask(cpu, downed_cpus) { + err = cpu_up(cpu); + if (!err) + pr_info(NAME "enabled CPU%d.\n", cpu); + else + pr_err(NAME "cannot re-enable CPU%d: %d\n", cpu, err); + } +#endif /* CONFIG_SMP */ +} + #if 0 /* XXX: out of order */ static struct file_operations fops_marker = { .owner = THIS_MODULE, @@ -421,6 +470,7 @@ void enable_mmiotrace(void) if (nommiotrace) pr_info(NAME "MMIO tracing disabled.\n"); + enter_uniprocessor(); spin_lock_irq(&trace_lock); atomic_inc(&mmiotrace_enabled); spin_unlock_irq(&trace_lock); @@ -441,6 +491,7 @@ void disable_mmiotrace(void) spin_unlock_irq(&trace_lock); clear_trace_list(); /* guarantees: no more kmmio callbacks */ + leave_uniprocessor(); if (marker_file) { debugfs_remove(marker_file); marker_file = NULL; -- 1.5.3.7 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [BUG/PATCH] x86 mmiotrace: dynamically disable non-boot CPUs 2008-04-13 20:05 ` [BUG/PATCH] x86 mmiotrace: dynamically disable non-boot CPUs Pekka Paalanen @ 2008-04-14 6:57 ` Ingo Molnar 2008-04-19 15:41 ` [BUG] kmalloc_node(GFP_KERNEL) while smp_alt spinlocked Pekka Paalanen 0 siblings, 1 reply; 9+ messages in thread From: Ingo Molnar @ 2008-04-14 6:57 UTC (permalink / raw) To: Pekka Paalanen; +Cc: Steven Rostedt, linux-kernel thanks Pekka, i've applied your patches. i'm wondering about this though: > Mmiotrace is not reliable with multiple CPUs and may miss events. Drop > to single CPU when mmiotrace is activated. we should fix this restriction ASAP. Forcibly dropping to UP will cause mmiotrace to be much less useful for diagnostic purposes of Linux drivers. We want to enable the mmiotrace-ing of specific devices via some /sys flag. For example via: cat /sys/devices/pci0000\:00/0000\:00\:1f.2/mmiotrace this should start mmiotracing of that specific device - or something like that. Hm? > When I tested this patch on Intel Core 2 Duo, enter_uniprocessor() > triggered the following kernel bug: > > Linux version 2.6.25-rc8-sched-devel.git-x86-latest.git (paalanen@ct200006) > (gcc version 4.1.2 (Gentoo 4.1.2 p1.0.1)) #2 SMP PREEMPT Sun Apr 13 > 22:09:03 EEST 2008 > ... > in mmio_trace_init > mmiotrace: Disabling non-boot CPUs... > CPU 1 is now offline > lockdep: fixing up alternatives. > SMP alternatives: switching to UP code > BUG: sleeping function called from invalid context at mm/slab.c:3053 > in_atomic():1, irqs_disabled():0 > 5 locks held by bash/4423: > #0: (trace_types_lock){--..}, at: [<ffffffff8026442f>] tracing_set_trace_write+0x93/0x11a > #1: (mmiotrace_mutex){--..}, at: [<ffffffff802251e0>] enable_mmiotrace+0x17/0x142 > #2: (cpu_add_remove_lock){--..}, at: [<ffffffff802580e5>] cpu_maps_update_begin+0x12/0x14 > #3: (&cpu_hotplug.lock){--..}, at: [<ffffffff8025814f>] cpu_hotplug_begin+0x39/0x9f > #4: (smp_alt){--..}, at: [<ffffffff80211bd9>] alternatives_smp_switch+0x66/0x1ab > Pid: 4423, comm: bash Not tainted 2.6.25-rc8-sched-devel.git-x86-latest.git #2 > > Call Trace: > [<ffffffff802520c0>] ? __debug_show_held_locks+0x22/0x24 > [<ffffffff8022d292>] __might_sleep+0xd9/0xdb > [<ffffffff80287326>] cache_alloc_debugcheck_before+0x23/0x32 > [<ffffffff80287a32>] __kmalloc+0x34/0xa5 > [<ffffffff80209393>] ? clear_ti_thread_flag+0x10/0x17 > [<ffffffff8027d7f6>] kmalloc_node+0x9/0xb > [<ffffffff8027d8e0>] __get_vm_area_node+0xa2/0x1cb > [<ffffffff80209393>] ? clear_ti_thread_flag+0x10/0x17 > [<ffffffff8027da41>] __get_vm_area+0x13/0x15 > [<ffffffff8027da60>] get_vm_area+0x1d/0x1f > [<ffffffff8027e14c>] vmap+0x2a/0x5c > [<ffffffff8021191b>] text_poke+0xaa/0x136 > [<ffffffff804cba2b>] ? _etext+0x0/0x5 > [<ffffffff802119f6>] alternatives_smp_unlock+0x4f/0x63 > [<ffffffff80211ce1>] alternatives_smp_switch+0x16e/0x1ab > [<ffffffff8021b163>] __cpu_die+0x53/0x7d > [<ffffffff802583e2>] _cpu_down+0x195/0x26c > [<ffffffff802585ca>] cpu_down+0x26/0x36 > [<ffffffff80225270>] enable_mmiotrace+0xa7/0x142 > [<ffffffff80266b8d>] mmio_trace_init+0x3c/0x40 > [<ffffffff8026448e>] tracing_set_trace_write+0xf2/0x11a > [<ffffffff80327fac>] ? security_file_permission+0x11/0x13 > [<ffffffff8028b047>] vfs_write+0xa7/0xe1 > [<ffffffff8028b13b>] sys_write+0x47/0x6d > [<ffffffff8020b4db>] system_call_after_swapgs+0x7b/0x80 > > mmiotrace: CPU1 is down. > mmiotrace: enabled. > > Is this my fault, or is there a bug somewhere else? The kernel tree is > sched-devel/latest git from 12th April, IIRC. there's no known bug of sched-devel/latest of this kind (or any known bug for that matter). i suspect the bug is that you bring the CPU down from an atomic (spinlocked or irq disabled) context. Ingo ^ permalink raw reply [flat|nested] 9+ messages in thread
* [BUG] kmalloc_node(GFP_KERNEL) while smp_alt spinlocked 2008-04-14 6:57 ` Ingo Molnar @ 2008-04-19 15:41 ` Pekka Paalanen 2008-04-19 16:19 ` [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable Mathieu Desnoyers 0 siblings, 1 reply; 9+ messages in thread From: Pekka Paalanen @ 2008-04-19 15:41 UTC (permalink / raw) To: linux-kernel; +Cc: Ingo Molnar, Steven Rostedt, mathieu.desnoyers On Mon, 14 Apr 2008 08:57:13 +0200 Ingo Molnar <mingo@elte.hu> wrote: > Pekka Paalanen wrote: > > When I tested this patch on Intel Core 2 Duo, enter_uniprocessor() > > triggered the following kernel bug: > > > > Linux version 2.6.25-rc8-sched-devel.git-x86-latest.git (paalanen@ct200006) > > (gcc version 4.1.2 (Gentoo 4.1.2 p1.0.1)) #2 SMP PREEMPT Sun Apr 13 > > 22:09:03 EEST 2008 > > ... > > in mmio_trace_init > > mmiotrace: Disabling non-boot CPUs... > > CPU 1 is now offline > > lockdep: fixing up alternatives. > > SMP alternatives: switching to UP code > > BUG: sleeping function called from invalid context at mm/slab.c:3053 > > in_atomic():1, irqs_disabled():0 > > 5 locks held by bash/4423: > > #0: (trace_types_lock){--..}, at: [<ffffffff8026442f>] tracing_set_trace_write+0x93/0x11a > > #1: (mmiotrace_mutex){--..}, at: [<ffffffff802251e0>] enable_mmiotrace+0x17/0x142 > > #2: (cpu_add_remove_lock){--..}, at: [<ffffffff802580e5>] cpu_maps_update_begin+0x12/0x14 > > #3: (&cpu_hotplug.lock){--..}, at: [<ffffffff8025814f>] cpu_hotplug_begin+0x39/0x9f > > #4: (smp_alt){--..}, at: [<ffffffff80211bd9>] alternatives_smp_switch+0x66/0x1ab > > Pid: 4423, comm: bash Not tainted 2.6.25-rc8-sched-devel.git-x86-latest.git #2 > > > > Call Trace: > > [<ffffffff802520c0>] ? __debug_show_held_locks+0x22/0x24 > > [<ffffffff8022d292>] __might_sleep+0xd9/0xdb > > [<ffffffff80287326>] cache_alloc_debugcheck_before+0x23/0x32 > > [<ffffffff80287a32>] __kmalloc+0x34/0xa5 > > [<ffffffff80209393>] ? clear_ti_thread_flag+0x10/0x17 > > [<ffffffff8027d7f6>] kmalloc_node+0x9/0xb > > [<ffffffff8027d8e0>] __get_vm_area_node+0xa2/0x1cb > > [<ffffffff80209393>] ? clear_ti_thread_flag+0x10/0x17 > > [<ffffffff8027da41>] __get_vm_area+0x13/0x15 > > [<ffffffff8027da60>] get_vm_area+0x1d/0x1f > > [<ffffffff8027e14c>] vmap+0x2a/0x5c > > [<ffffffff8021191b>] text_poke+0xaa/0x136 > > [<ffffffff804cba2b>] ? _etext+0x0/0x5 > > [<ffffffff802119f6>] alternatives_smp_unlock+0x4f/0x63 > > [<ffffffff80211ce1>] alternatives_smp_switch+0x16e/0x1ab > > [<ffffffff8021b163>] __cpu_die+0x53/0x7d > > [<ffffffff802583e2>] _cpu_down+0x195/0x26c > > [<ffffffff802585ca>] cpu_down+0x26/0x36 > > [<ffffffff80225270>] enable_mmiotrace+0xa7/0x142 > > [<ffffffff80266b8d>] mmio_trace_init+0x3c/0x40 > > [<ffffffff8026448e>] tracing_set_trace_write+0xf2/0x11a > > [<ffffffff80327fac>] ? security_file_permission+0x11/0x13 > > [<ffffffff8028b047>] vfs_write+0xa7/0xe1 > > [<ffffffff8028b13b>] sys_write+0x47/0x6d > > [<ffffffff8020b4db>] system_call_after_swapgs+0x7b/0x80 > > > > mmiotrace: CPU1 is down. > > mmiotrace: enabled. > > > > Is this my fault, or is there a bug somewhere else? The kernel tree is > > sched-devel/latest git from 12th April, IIRC. > > there's no known bug of sched-devel/latest of this kind (or any known > bug for that matter). > > i suspect the bug is that you bring the CPU down from an atomic > (spinlocked or irq disabled) context. I have been eyeballing the code in current sched-devel/latest and there's something I think I found. This is the beginning of my enter_uniprocessor() which is called from enable_mmiotrace() seen in the backtrace. static void enter_uniprocessor(void) { int cpu; int err; get_online_cpus(); downed_cpus = cpu_online_map; cpu_clear(first_cpu(cpu_online_map), downed_cpus); if (num_online_cpus() > 1) pr_notice(NAME "Disabling non-boot CPUs...\n"); put_online_cpus(); for_each_cpu_mask(cpu, downed_cpus) { err = cpu_down(cpu); The function get_online_cpus() calls might_sleep(), so at that point everything is fine. Following the backtrace, we come to alternatives_smp_switch(), which does spin_lock(&smp_alt); and then continues eventually into this function: void *__kprobes text_poke(void *addr, const void *opcode, size_t len) { unsigned long flags; char *vaddr; int nr_pages = 2; BUG_ON(len > sizeof(long)); BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1)) - ((long)addr & ~(sizeof(long) - 1))); if (kernel_text_address((unsigned long)addr)) { struct page *pages[2] = { virt_to_page(addr), virt_to_page(addr + PAGE_SIZE) }; if (!pages[1]) nr_pages = 1; vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL); After which we find ourselves in struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags, unsigned long start, unsigned long end) { return __get_vm_area_node(size, flags, start, end, -1, GFP_KERNEL); and in __get_vm_area_node() there is area = kmalloc_node(sizeof(*area), gfp_mask & GFP_RECLAIM_MASK, node); where gfp_mask is now GFP_KERNEL. As far as I can tell, using SLAB, kmalloc_node() here is actually kmalloc(). Anyway, looks like we end up in __kmalloc with such flags that it may sleep. I have followed the code so that the path up to kmalloc_node() is possible, unless there are some "should never happen" conditions along the way, which I just cannot know of. Is this the bug? Thanks. -- Pekka Paalanen http://www.iki.fi/pq/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable 2008-04-19 15:41 ` [BUG] kmalloc_node(GFP_KERNEL) while smp_alt spinlocked Pekka Paalanen @ 2008-04-19 16:19 ` Mathieu Desnoyers 2008-04-19 21:06 ` Pekka Paalanen 0 siblings, 1 reply; 9+ messages in thread From: Mathieu Desnoyers @ 2008-04-19 16:19 UTC (permalink / raw) To: Pekka Paalanen; +Cc: linux-kernel, Ingo Molnar, Steven Rostedt * Pekka Paalanen (pq@iki.fi) wrote: > On Mon, 14 Apr 2008 08:57:13 +0200 > Ingo Molnar <mingo@elte.hu> wrote: > > > Pekka Paalanen wrote: > > > When I tested this patch on Intel Core 2 Duo, enter_uniprocessor() > > > triggered the following kernel bug: > > > > > > Linux version 2.6.25-rc8-sched-devel.git-x86-latest.git (paalanen@ct200006) > > > (gcc version 4.1.2 (Gentoo 4.1.2 p1.0.1)) #2 SMP PREEMPT Sun Apr 13 > > > 22:09:03 EEST 2008 > > > ... > > > in mmio_trace_init > > > mmiotrace: Disabling non-boot CPUs... > > > CPU 1 is now offline > > > lockdep: fixing up alternatives. > > > SMP alternatives: switching to UP code > > > BUG: sleeping function called from invalid context at mm/slab.c:3053 > > > in_atomic():1, irqs_disabled():0 > > > 5 locks held by bash/4423: > > > #0: (trace_types_lock){--..}, at: [<ffffffff8026442f>] tracing_set_trace_write+0x93/0x11a > > > #1: (mmiotrace_mutex){--..}, at: [<ffffffff802251e0>] enable_mmiotrace+0x17/0x142 > > > #2: (cpu_add_remove_lock){--..}, at: [<ffffffff802580e5>] cpu_maps_update_begin+0x12/0x14 > > > #3: (&cpu_hotplug.lock){--..}, at: [<ffffffff8025814f>] cpu_hotplug_begin+0x39/0x9f > > > #4: (smp_alt){--..}, at: [<ffffffff80211bd9>] alternatives_smp_switch+0x66/0x1ab > > > Pid: 4423, comm: bash Not tainted 2.6.25-rc8-sched-devel.git-x86-latest.git #2 > > > > > > Call Trace: > > > [<ffffffff802520c0>] ? __debug_show_held_locks+0x22/0x24 > > > [<ffffffff8022d292>] __might_sleep+0xd9/0xdb > > > [<ffffffff80287326>] cache_alloc_debugcheck_before+0x23/0x32 > > > [<ffffffff80287a32>] __kmalloc+0x34/0xa5 > > > [<ffffffff80209393>] ? clear_ti_thread_flag+0x10/0x17 > > > [<ffffffff8027d7f6>] kmalloc_node+0x9/0xb > > > [<ffffffff8027d8e0>] __get_vm_area_node+0xa2/0x1cb > > > [<ffffffff80209393>] ? clear_ti_thread_flag+0x10/0x17 > > > [<ffffffff8027da41>] __get_vm_area+0x13/0x15 > > > [<ffffffff8027da60>] get_vm_area+0x1d/0x1f > > > [<ffffffff8027e14c>] vmap+0x2a/0x5c > > > [<ffffffff8021191b>] text_poke+0xaa/0x136 > > > [<ffffffff804cba2b>] ? _etext+0x0/0x5 > > > [<ffffffff802119f6>] alternatives_smp_unlock+0x4f/0x63 > > > [<ffffffff80211ce1>] alternatives_smp_switch+0x16e/0x1ab > > > [<ffffffff8021b163>] __cpu_die+0x53/0x7d > > > [<ffffffff802583e2>] _cpu_down+0x195/0x26c > > > [<ffffffff802585ca>] cpu_down+0x26/0x36 > > > [<ffffffff80225270>] enable_mmiotrace+0xa7/0x142 > > > [<ffffffff80266b8d>] mmio_trace_init+0x3c/0x40 > > > [<ffffffff8026448e>] tracing_set_trace_write+0xf2/0x11a > > > [<ffffffff80327fac>] ? security_file_permission+0x11/0x13 > > > [<ffffffff8028b047>] vfs_write+0xa7/0xe1 > > > [<ffffffff8028b13b>] sys_write+0x47/0x6d > > > [<ffffffff8020b4db>] system_call_after_swapgs+0x7b/0x80 > > > > > > mmiotrace: CPU1 is down. > > > mmiotrace: enabled. > > > > > > Is this my fault, or is there a bug somewhere else? The kernel tree is > > > sched-devel/latest git from 12th April, IIRC. > > > > there's no known bug of sched-devel/latest of this kind (or any known > > bug for that matter). > > > > i suspect the bug is that you bring the CPU down from an atomic > > (spinlocked or irq disabled) context. > > I have been eyeballing the code in current sched-devel/latest and there's > something I think I found. > > This is the beginning of my enter_uniprocessor() which is called from > enable_mmiotrace() seen in the backtrace. > > static void enter_uniprocessor(void) > { > int cpu; > int err; > > get_online_cpus(); > downed_cpus = cpu_online_map; > cpu_clear(first_cpu(cpu_online_map), downed_cpus); > if (num_online_cpus() > 1) > pr_notice(NAME "Disabling non-boot CPUs...\n"); > put_online_cpus(); > > for_each_cpu_mask(cpu, downed_cpus) { > err = cpu_down(cpu); > > The function get_online_cpus() calls might_sleep(), so at that > point everything is fine. > > Following the backtrace, we come to alternatives_smp_switch(), > which does > spin_lock(&smp_alt); > and then continues eventually into this function: > > void *__kprobes text_poke(void *addr, const void *opcode, size_t len) > { > unsigned long flags; > char *vaddr; > int nr_pages = 2; > > BUG_ON(len > sizeof(long)); > BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1)) > - ((long)addr & ~(sizeof(long) - 1))); > if (kernel_text_address((unsigned long)addr)) { > struct page *pages[2] = { virt_to_page(addr), > virt_to_page(addr + PAGE_SIZE) }; > if (!pages[1]) > nr_pages = 1; > vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL); > > After which we find ourselves in > > struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags, > unsigned long start, unsigned long end) > { > return __get_vm_area_node(size, flags, start, end, -1, GFP_KERNEL); > > and in __get_vm_area_node() there is > area = kmalloc_node(sizeof(*area), gfp_mask & GFP_RECLAIM_MASK, node); > > where gfp_mask is now GFP_KERNEL. As far as I can tell, using SLAB, > kmalloc_node() here is actually kmalloc(). Anyway, looks like we end up > in __kmalloc with such flags that it may sleep. > > I have followed the code so that the path up to kmalloc_node() is > possible, unless there are some "should never happen" conditions > along the way, which I just cannot know of. > > Is this the bug? > Yep. I think using a mutex should fix it. There is no reason to use a spinlock rather than a mutex here. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> CC: Pekka Paalanen <pq@iki.fi> CC: Ingo Molnar <mingo@elte.hu> CC: Steven Rostedt <rostedt@goodmis.org> --- arch/x86/kernel/alternative.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) Index: linux-2.6-lttng/arch/x86/kernel/alternative.c =================================================================== --- linux-2.6-lttng.orig/arch/x86/kernel/alternative.c 2008-04-19 12:01:08.000000000 -0400 +++ linux-2.6-lttng/arch/x86/kernel/alternative.c 2008-04-19 12:03:14.000000000 -0400 @@ -1,6 +1,6 @@ #include <linux/module.h> #include <linux/sched.h> -#include <linux/spinlock.h> +#include <linux/mutex.h> #include <linux/list.h> #include <linux/kprobes.h> #include <linux/mm.h> @@ -279,7 +279,7 @@ struct smp_alt_module { struct list_head next; }; static LIST_HEAD(smp_alt_modules); -static DEFINE_SPINLOCK(smp_alt); +static DEFINE_MUTEX(smp_alt); static int smp_mode = 1; /* protected by smp_alt */ void alternatives_smp_module_add(struct module *mod, char *name, @@ -312,12 +312,12 @@ void alternatives_smp_module_add(struct __FUNCTION__, smp->locks, smp->locks_end, smp->text, smp->text_end, smp->name); - spin_lock(&smp_alt); + mutex_lock(&smp_alt); list_add_tail(&smp->next, &smp_alt_modules); if (boot_cpu_has(X86_FEATURE_UP)) alternatives_smp_unlock(smp->locks, smp->locks_end, smp->text, smp->text_end); - spin_unlock(&smp_alt); + mutex_unlock(&smp_alt); } void alternatives_smp_module_del(struct module *mod) @@ -327,17 +327,17 @@ void alternatives_smp_module_del(struct if (smp_alt_once || noreplace_smp) return; - spin_lock(&smp_alt); + mutex_lock(&smp_alt); list_for_each_entry(item, &smp_alt_modules, next) { if (mod != item->mod) continue; list_del(&item->next); - spin_unlock(&smp_alt); + mutex_unlock(&smp_alt); DPRINTK("%s: %s\n", __FUNCTION__, item->name); kfree(item); return; } - spin_unlock(&smp_alt); + mutex_unlock(&smp_alt); } void alternatives_smp_switch(int smp) @@ -359,7 +359,7 @@ void alternatives_smp_switch(int smp) return; BUG_ON(!smp && (num_online_cpus() > 1)); - spin_lock(&smp_alt); + mutex_lock(&smp_alt); /* * Avoid unnecessary switches because it forces JIT based VMs to @@ -383,7 +383,7 @@ void alternatives_smp_switch(int smp) mod->text, mod->text_end); } smp_mode = smp; - spin_unlock(&smp_alt); + mutex_unlock(&smp_alt); } #endif -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable 2008-04-19 16:19 ` [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable Mathieu Desnoyers @ 2008-04-19 21:06 ` Pekka Paalanen 0 siblings, 0 replies; 9+ messages in thread From: Pekka Paalanen @ 2008-04-19 21:06 UTC (permalink / raw) To: Mathieu Desnoyers; +Cc: linux-kernel, Ingo Molnar, Steven Rostedt >From 58390e93507669d860ef2d178b60ad447d5260c4 Mon Sep 17 00:00:00 2001 From: Pekka Paalanen <pq@iki.fi> Date: Sat, 19 Apr 2008 23:22:11 +0300 Subject: [PATCH] x86: Fix SMP alternatives: use mutex instead of spinlock. text_poke is sleepable. The original fix by Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>. Signed-off-by: Pekka Paalanen <pq@iki.fi> --- Mathieu, thank you for the quick reply. Your patch did not apply on sched-devel/latest so I made the changes by hand, resulting patch here. The change was in the context lines, which made `patch' fail, e.g. __FUNCTION__ -> __func__. Now my enter_uniprocessor(), that disables all but the first cpu, works fine. My next quest is, why attempting to re-enable the cpus makes the whole machine reboot after a short hang. arch/x86/kernel/alternative.c | 18 +++++++++--------- 1 files changed, 9 insertions(+), 9 deletions(-) diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 5412fd7..7ce0939 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -1,6 +1,6 @@ #include <linux/module.h> #include <linux/sched.h> -#include <linux/spinlock.h> +#include <linux/mutex.h> #include <linux/list.h> #include <linux/kprobes.h> #include <linux/mm.h> @@ -279,7 +279,7 @@ struct smp_alt_module { struct list_head next; }; static LIST_HEAD(smp_alt_modules); -static DEFINE_SPINLOCK(smp_alt); +static DEFINE_MUTEX(smp_alt); static int smp_mode = 1; /* protected by smp_alt */ void alternatives_smp_module_add(struct module *mod, char *name, @@ -312,12 +312,12 @@ void alternatives_smp_module_add(struct module *mod, char *name, __func__, smp->locks, smp->locks_end, smp->text, smp->text_end, smp->name); - spin_lock(&smp_alt); + mutex_lock(&smp_alt); list_add_tail(&smp->next, &smp_alt_modules); if (boot_cpu_has(X86_FEATURE_UP)) alternatives_smp_unlock(smp->locks, smp->locks_end, smp->text, smp->text_end); - spin_unlock(&smp_alt); + mutex_unlock(&smp_alt); } void alternatives_smp_module_del(struct module *mod) @@ -327,17 +327,17 @@ void alternatives_smp_module_del(struct module *mod) if (smp_alt_once || noreplace_smp) return; - spin_lock(&smp_alt); + mutex_lock(&smp_alt); list_for_each_entry(item, &smp_alt_modules, next) { if (mod != item->mod) continue; list_del(&item->next); - spin_unlock(&smp_alt); + mutex_unlock(&smp_alt); DPRINTK("%s: %s\n", __func__, item->name); kfree(item); return; } - spin_unlock(&smp_alt); + mutex_unlock(&smp_alt); } void alternatives_smp_switch(int smp) @@ -359,7 +359,7 @@ void alternatives_smp_switch(int smp) return; BUG_ON(!smp && (num_online_cpus() > 1)); - spin_lock(&smp_alt); + mutex_lock(&smp_alt); /* * Avoid unnecessary switches because it forces JIT based VMs to @@ -383,7 +383,7 @@ void alternatives_smp_switch(int smp) mod->text, mod->text_end); } smp_mode = smp; - spin_unlock(&smp_alt); + mutex_unlock(&smp_alt); } #endif -- 1.5.3.7 ^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2008-05-01 20:09 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-30 19:51 [BUG] smp alternatives: sleeping function called from invalid context Luca Tettamanti
2008-05-01 13:04 ` Mathieu Desnoyers
2008-05-01 15:02 ` Luca Tettamanti
2008-05-01 16:44 ` Luca Tettamanti
2008-05-01 18:46 ` [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable Mathieu Desnoyers
2008-05-01 18:52 ` Linus Torvalds
2008-05-01 20:08 ` Mathieu Desnoyers
[not found] <20080413224207.4430a09c@daedalus.pq.iki.fi>
2008-04-13 20:05 ` [BUG/PATCH] x86 mmiotrace: dynamically disable non-boot CPUs Pekka Paalanen
2008-04-14 6:57 ` Ingo Molnar
2008-04-19 15:41 ` [BUG] kmalloc_node(GFP_KERNEL) while smp_alt spinlocked Pekka Paalanen
2008-04-19 16:19 ` [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable Mathieu Desnoyers
2008-04-19 21:06 ` Pekka Paalanen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox