* [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable
2008-04-19 15:41 ` [BUG] kmalloc_node(GFP_KERNEL) while smp_alt spinlocked Pekka Paalanen
@ 2008-04-19 16:19 ` Mathieu Desnoyers
2008-04-19 21:06 ` Pekka Paalanen
0 siblings, 1 reply; 9+ messages in thread
From: Mathieu Desnoyers @ 2008-04-19 16:19 UTC (permalink / raw)
To: Pekka Paalanen; +Cc: linux-kernel, Ingo Molnar, Steven Rostedt
* Pekka Paalanen (pq@iki.fi) wrote:
> On Mon, 14 Apr 2008 08:57:13 +0200
> Ingo Molnar <mingo@elte.hu> wrote:
>
> > Pekka Paalanen wrote:
> > > When I tested this patch on Intel Core 2 Duo, enter_uniprocessor()
> > > triggered the following kernel bug:
> > >
> > > Linux version 2.6.25-rc8-sched-devel.git-x86-latest.git (paalanen@ct200006)
> > > (gcc version 4.1.2 (Gentoo 4.1.2 p1.0.1)) #2 SMP PREEMPT Sun Apr 13
> > > 22:09:03 EEST 2008
> > > ...
> > > in mmio_trace_init
> > > mmiotrace: Disabling non-boot CPUs...
> > > CPU 1 is now offline
> > > lockdep: fixing up alternatives.
> > > SMP alternatives: switching to UP code
> > > BUG: sleeping function called from invalid context at mm/slab.c:3053
> > > in_atomic():1, irqs_disabled():0
> > > 5 locks held by bash/4423:
> > > #0: (trace_types_lock){--..}, at: [<ffffffff8026442f>] tracing_set_trace_write+0x93/0x11a
> > > #1: (mmiotrace_mutex){--..}, at: [<ffffffff802251e0>] enable_mmiotrace+0x17/0x142
> > > #2: (cpu_add_remove_lock){--..}, at: [<ffffffff802580e5>] cpu_maps_update_begin+0x12/0x14
> > > #3: (&cpu_hotplug.lock){--..}, at: [<ffffffff8025814f>] cpu_hotplug_begin+0x39/0x9f
> > > #4: (smp_alt){--..}, at: [<ffffffff80211bd9>] alternatives_smp_switch+0x66/0x1ab
> > > Pid: 4423, comm: bash Not tainted 2.6.25-rc8-sched-devel.git-x86-latest.git #2
> > >
> > > Call Trace:
> > > [<ffffffff802520c0>] ? __debug_show_held_locks+0x22/0x24
> > > [<ffffffff8022d292>] __might_sleep+0xd9/0xdb
> > > [<ffffffff80287326>] cache_alloc_debugcheck_before+0x23/0x32
> > > [<ffffffff80287a32>] __kmalloc+0x34/0xa5
> > > [<ffffffff80209393>] ? clear_ti_thread_flag+0x10/0x17
> > > [<ffffffff8027d7f6>] kmalloc_node+0x9/0xb
> > > [<ffffffff8027d8e0>] __get_vm_area_node+0xa2/0x1cb
> > > [<ffffffff80209393>] ? clear_ti_thread_flag+0x10/0x17
> > > [<ffffffff8027da41>] __get_vm_area+0x13/0x15
> > > [<ffffffff8027da60>] get_vm_area+0x1d/0x1f
> > > [<ffffffff8027e14c>] vmap+0x2a/0x5c
> > > [<ffffffff8021191b>] text_poke+0xaa/0x136
> > > [<ffffffff804cba2b>] ? _etext+0x0/0x5
> > > [<ffffffff802119f6>] alternatives_smp_unlock+0x4f/0x63
> > > [<ffffffff80211ce1>] alternatives_smp_switch+0x16e/0x1ab
> > > [<ffffffff8021b163>] __cpu_die+0x53/0x7d
> > > [<ffffffff802583e2>] _cpu_down+0x195/0x26c
> > > [<ffffffff802585ca>] cpu_down+0x26/0x36
> > > [<ffffffff80225270>] enable_mmiotrace+0xa7/0x142
> > > [<ffffffff80266b8d>] mmio_trace_init+0x3c/0x40
> > > [<ffffffff8026448e>] tracing_set_trace_write+0xf2/0x11a
> > > [<ffffffff80327fac>] ? security_file_permission+0x11/0x13
> > > [<ffffffff8028b047>] vfs_write+0xa7/0xe1
> > > [<ffffffff8028b13b>] sys_write+0x47/0x6d
> > > [<ffffffff8020b4db>] system_call_after_swapgs+0x7b/0x80
> > >
> > > mmiotrace: CPU1 is down.
> > > mmiotrace: enabled.
> > >
> > > Is this my fault, or is there a bug somewhere else? The kernel tree is
> > > sched-devel/latest git from 12th April, IIRC.
> >
> > there's no known bug of sched-devel/latest of this kind (or any known
> > bug for that matter).
> >
> > i suspect the bug is that you bring the CPU down from an atomic
> > (spinlocked or irq disabled) context.
>
> I have been eyeballing the code in current sched-devel/latest and there's
> something I think I found.
>
> This is the beginning of my enter_uniprocessor() which is called from
> enable_mmiotrace() seen in the backtrace.
>
> static void enter_uniprocessor(void)
> {
> int cpu;
> int err;
>
> get_online_cpus();
> downed_cpus = cpu_online_map;
> cpu_clear(first_cpu(cpu_online_map), downed_cpus);
> if (num_online_cpus() > 1)
> pr_notice(NAME "Disabling non-boot CPUs...\n");
> put_online_cpus();
>
> for_each_cpu_mask(cpu, downed_cpus) {
> err = cpu_down(cpu);
>
> The function get_online_cpus() calls might_sleep(), so at that
> point everything is fine.
>
> Following the backtrace, we come to alternatives_smp_switch(),
> which does
> spin_lock(&smp_alt);
> and then continues eventually into this function:
>
> void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
> {
> unsigned long flags;
> char *vaddr;
> int nr_pages = 2;
>
> BUG_ON(len > sizeof(long));
> BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
> - ((long)addr & ~(sizeof(long) - 1)));
> if (kernel_text_address((unsigned long)addr)) {
> struct page *pages[2] = { virt_to_page(addr),
> virt_to_page(addr + PAGE_SIZE) };
> if (!pages[1])
> nr_pages = 1;
> vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
>
> After which we find ourselves in
>
> struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags,
> unsigned long start, unsigned long end)
> {
> return __get_vm_area_node(size, flags, start, end, -1, GFP_KERNEL);
>
> and in __get_vm_area_node() there is
> area = kmalloc_node(sizeof(*area), gfp_mask & GFP_RECLAIM_MASK, node);
>
> where gfp_mask is now GFP_KERNEL. As far as I can tell, using SLAB,
> kmalloc_node() here is actually kmalloc(). Anyway, looks like we end up
> in __kmalloc with such flags that it may sleep.
>
> I have followed the code so that the path up to kmalloc_node() is
> possible, unless there are some "should never happen" conditions
> along the way, which I just cannot know of.
>
> Is this the bug?
>
Yep. I think using a mutex should fix it. There is no reason to use a
spinlock rather than a mutex here.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Pekka Paalanen <pq@iki.fi>
CC: Ingo Molnar <mingo@elte.hu>
CC: Steven Rostedt <rostedt@goodmis.org>
---
arch/x86/kernel/alternative.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
Index: linux-2.6-lttng/arch/x86/kernel/alternative.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/kernel/alternative.c 2008-04-19 12:01:08.000000000 -0400
+++ linux-2.6-lttng/arch/x86/kernel/alternative.c 2008-04-19 12:03:14.000000000 -0400
@@ -1,6 +1,6 @@
#include <linux/module.h>
#include <linux/sched.h>
-#include <linux/spinlock.h>
+#include <linux/mutex.h>
#include <linux/list.h>
#include <linux/kprobes.h>
#include <linux/mm.h>
@@ -279,7 +279,7 @@ struct smp_alt_module {
struct list_head next;
};
static LIST_HEAD(smp_alt_modules);
-static DEFINE_SPINLOCK(smp_alt);
+static DEFINE_MUTEX(smp_alt);
static int smp_mode = 1; /* protected by smp_alt */
void alternatives_smp_module_add(struct module *mod, char *name,
@@ -312,12 +312,12 @@ void alternatives_smp_module_add(struct
__FUNCTION__, smp->locks, smp->locks_end,
smp->text, smp->text_end, smp->name);
- spin_lock(&smp_alt);
+ mutex_lock(&smp_alt);
list_add_tail(&smp->next, &smp_alt_modules);
if (boot_cpu_has(X86_FEATURE_UP))
alternatives_smp_unlock(smp->locks, smp->locks_end,
smp->text, smp->text_end);
- spin_unlock(&smp_alt);
+ mutex_unlock(&smp_alt);
}
void alternatives_smp_module_del(struct module *mod)
@@ -327,17 +327,17 @@ void alternatives_smp_module_del(struct
if (smp_alt_once || noreplace_smp)
return;
- spin_lock(&smp_alt);
+ mutex_lock(&smp_alt);
list_for_each_entry(item, &smp_alt_modules, next) {
if (mod != item->mod)
continue;
list_del(&item->next);
- spin_unlock(&smp_alt);
+ mutex_unlock(&smp_alt);
DPRINTK("%s: %s\n", __FUNCTION__, item->name);
kfree(item);
return;
}
- spin_unlock(&smp_alt);
+ mutex_unlock(&smp_alt);
}
void alternatives_smp_switch(int smp)
@@ -359,7 +359,7 @@ void alternatives_smp_switch(int smp)
return;
BUG_ON(!smp && (num_online_cpus() > 1));
- spin_lock(&smp_alt);
+ mutex_lock(&smp_alt);
/*
* Avoid unnecessary switches because it forces JIT based VMs to
@@ -383,7 +383,7 @@ void alternatives_smp_switch(int smp)
mod->text, mod->text_end);
}
smp_mode = smp;
- spin_unlock(&smp_alt);
+ mutex_unlock(&smp_alt);
}
#endif
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable
2008-04-19 16:19 ` [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable Mathieu Desnoyers
@ 2008-04-19 21:06 ` Pekka Paalanen
0 siblings, 0 replies; 9+ messages in thread
From: Pekka Paalanen @ 2008-04-19 21:06 UTC (permalink / raw)
To: Mathieu Desnoyers; +Cc: linux-kernel, Ingo Molnar, Steven Rostedt
>From 58390e93507669d860ef2d178b60ad447d5260c4 Mon Sep 17 00:00:00 2001
From: Pekka Paalanen <pq@iki.fi>
Date: Sat, 19 Apr 2008 23:22:11 +0300
Subject: [PATCH] x86: Fix SMP alternatives: use mutex instead of spinlock.
text_poke is sleepable.
The original fix by Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>.
Signed-off-by: Pekka Paalanen <pq@iki.fi>
---
Mathieu,
thank you for the quick reply. Your patch did not apply on sched-devel/latest
so I made the changes by hand, resulting patch here. The change was in the
context lines, which made `patch' fail, e.g. __FUNCTION__ -> __func__.
Now my enter_uniprocessor(), that disables all but the first cpu, works fine.
My next quest is, why attempting to re-enable the cpus makes the whole
machine reboot after a short hang.
arch/x86/kernel/alternative.c | 18 +++++++++---------
1 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 5412fd7..7ce0939 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1,6 +1,6 @@
#include <linux/module.h>
#include <linux/sched.h>
-#include <linux/spinlock.h>
+#include <linux/mutex.h>
#include <linux/list.h>
#include <linux/kprobes.h>
#include <linux/mm.h>
@@ -279,7 +279,7 @@ struct smp_alt_module {
struct list_head next;
};
static LIST_HEAD(smp_alt_modules);
-static DEFINE_SPINLOCK(smp_alt);
+static DEFINE_MUTEX(smp_alt);
static int smp_mode = 1; /* protected by smp_alt */
void alternatives_smp_module_add(struct module *mod, char *name,
@@ -312,12 +312,12 @@ void alternatives_smp_module_add(struct module *mod, char *name,
__func__, smp->locks, smp->locks_end,
smp->text, smp->text_end, smp->name);
- spin_lock(&smp_alt);
+ mutex_lock(&smp_alt);
list_add_tail(&smp->next, &smp_alt_modules);
if (boot_cpu_has(X86_FEATURE_UP))
alternatives_smp_unlock(smp->locks, smp->locks_end,
smp->text, smp->text_end);
- spin_unlock(&smp_alt);
+ mutex_unlock(&smp_alt);
}
void alternatives_smp_module_del(struct module *mod)
@@ -327,17 +327,17 @@ void alternatives_smp_module_del(struct module *mod)
if (smp_alt_once || noreplace_smp)
return;
- spin_lock(&smp_alt);
+ mutex_lock(&smp_alt);
list_for_each_entry(item, &smp_alt_modules, next) {
if (mod != item->mod)
continue;
list_del(&item->next);
- spin_unlock(&smp_alt);
+ mutex_unlock(&smp_alt);
DPRINTK("%s: %s\n", __func__, item->name);
kfree(item);
return;
}
- spin_unlock(&smp_alt);
+ mutex_unlock(&smp_alt);
}
void alternatives_smp_switch(int smp)
@@ -359,7 +359,7 @@ void alternatives_smp_switch(int smp)
return;
BUG_ON(!smp && (num_online_cpus() > 1));
- spin_lock(&smp_alt);
+ mutex_lock(&smp_alt);
/*
* Avoid unnecessary switches because it forces JIT based VMs to
@@ -383,7 +383,7 @@ void alternatives_smp_switch(int smp)
mod->text, mod->text_end);
}
smp_mode = smp;
- spin_unlock(&smp_alt);
+ mutex_unlock(&smp_alt);
}
#endif
--
1.5.3.7
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [BUG] smp alternatives: sleeping function called from invalid context
@ 2008-04-30 19:51 Luca Tettamanti
2008-05-01 13:04 ` Mathieu Desnoyers
0 siblings, 1 reply; 9+ messages in thread
From: Luca Tettamanti @ 2008-04-30 19:51 UTC (permalink / raw)
To: Mathieu Desnoyers; +Cc: linux-kernel
Hello,
with git current I'm seeing these warnings when bringing CPUs down:
[13794.108805] CPU0 attaching NULL sched-domain.
[13794.108815] CPU1 attaching NULL sched-domain.
[13794.124695] Broke affinity for irq 22
[13794.192376] CPU 1 is now offline
[13794.192381] lockdep: fixing up alternatives.
[13794.192389] SMP alternatives: switching to UP code
[13794.192395] BUG: sleeping function called from invalid context at /home/kronos/src/linux-2.6.git/mm/slab.c:3049
[13794.192398] in_atomic():1, irqs_disabled():0
[13794.192401] 4 locks held by bash/11413:
[13794.192404] #0: (&buffer->mutex){--..}, at: [<ffffffff802c638c>] sysfs_write_file+0x36/0x10c
[13794.192417] #1: (cpu_add_remove_lock){--..}, at: [<ffffffff80253479>] cpu_down+0x12/0x32
[13794.192427] #2: (&cpu_hotplug.lock){--..}, at: [<ffffffff80253024>] cpu_hotplug_begin+0x36/0xa6
[13794.192435] #3: (smp_alt){--..}, at: [<ffffffff8021182f>] alternatives_smp_switch+0x63/0x1a5
[13794.192446] Pid: 11413, comm: bash Not tainted 2.6.25-64-05976-ge31a94e-dirty #68
[13794.192449]
[13794.192450] Call Trace:
[13794.192681] [<ffffffff80280f61>] __kmalloc+0x29/0x7d
[13794.192695] [<ffffffff80276415>] __get_vm_area_node+0x9a/0x1b1
[13794.192703] [<ffffffff80209391>] clear_ti_thread_flag+0xc/0x12
[13794.192709] [<ffffffff80276585>] get_vm_area_caller+0x2b/0x2e
[13794.192715] [<ffffffff8021159d>] text_poke+0xdd/0x156
[13794.192719] [<ffffffff80276cc5>] vmap+0x2d/0x5a
[13794.192727] [<ffffffff8021159d>] text_poke+0xdd/0x156
[13794.192736] [<ffffffff80446f1b>] _etext+0x0/0x5
[13794.192742] [<ffffffff80211666>] alternatives_smp_unlock+0x50/0x65
[13794.192752] [<ffffffff80211934>] alternatives_smp_switch+0x168/0x1a5
[13794.192758] [<ffffffff80219439>] __cpus_weight+0x3f/0x41
[13794.192767] [<ffffffff802532b4>] _cpu_down+0x18f/0x264
[13794.192796] [<ffffffff8025348b>] cpu_down+0x24/0x32
[13794.192805] [<ffffffff8043259d>] store_online+0x29/0x67
[13794.192812] [<ffffffff802c642b>] sysfs_write_file+0xd5/0x10c
[13794.192824] [<ffffffff80283cba>] vfs_write+0xa5/0xde
[13794.192832] [<ffffffff80283da5>] sys_write+0x45/0x6b
[13794.192842] [<ffffffff80221ed2>] ia32_sysret+0x0/0xa
[13794.192857]
[13794.208799] CPU0 attaching NULL sched-domain.
...and up:
[13803.369635] lockdep: fixing up alternatives.
[13803.369640] SMP alternatives: switching to SMP code
[13803.369645] BUG: sleeping function called from invalid context at /home/kronos/src/linux-2.6.git/mm/slab.c:3049
[13803.369648] in_atomic():1, irqs_disabled():0
[13803.369651] 4 locks held by bash/11413:
[13803.369654] #0: (&buffer->mutex){--..}, at: [<ffffffff802c638c>] sysfs_write_file+0x36/0x10c
[13803.369666] #1: (cpu_add_remove_lock){--..}, at: [<ffffffff8044115e>] cpu_up+0x40/0x64
[13803.369676] #2: (&cpu_hotplug.lock){--..}, at: [<ffffffff80253024>] cpu_hotplug_begin+0x36/0xa6
[13803.369686] #3: (smp_alt){--..}, at: [<ffffffff8021182f>] alternatives_smp_switch+0x63/0x1a5
[13803.369697] Pid: 11413, comm: bash Not tainted 2.6.25-64-05976-ge31a94e-dirty #68
[13803.369700]
[13803.369701] Call Trace:
[13803.369712] [<ffffffff80280f61>] __kmalloc+0x29/0x7d
[13803.369721] [<ffffffff80276415>] __get_vm_area_node+0x9a/0x1b1
[13803.369731] [<ffffffff80209391>] clear_ti_thread_flag+0xc/0x12
[13803.369738] [<ffffffff80446f1b>] _etext+0x0/0x5
[13803.369743] [<ffffffff80276585>] get_vm_area_caller+0x2b/0x2e
[13803.369749] [<ffffffff8021159d>] text_poke+0xdd/0x156
[13803.369753] [<ffffffff80276cc5>] vmap+0x2d/0x5a
[13803.369763] [<ffffffff8021159d>] text_poke+0xdd/0x156
[13803.369773] [<ffffffff802118c5>] alternatives_smp_switch+0xf9/0x1a5
[13803.369795] [<ffffffff8043f29e>] native_cpu_up+0x224/0x7fc
[13803.369804] [<ffffffff8043ee95>] do_fork_idle+0x0/0x20
[13803.369833] [<ffffffff80441098>] _cpu_up+0x91/0x117
[13803.369842] [<ffffffff80441175>] cpu_up+0x57/0x64
[13803.369849] [<ffffffff804325b7>] store_online+0x43/0x67
[13803.369856] [<ffffffff802c642b>] sysfs_write_file+0xd5/0x10c
[13803.369867] [<ffffffff80283cba>] vfs_write+0xa5/0xde
[13803.369875] [<ffffffff80283da5>] sys_write+0x45/0x6b
[13803.369886] [<ffffffff80221ed2>] ia32_sysret+0x0/0xa
[13803.369898]
[13803.380408] Booting processor 1/1 ip 6000
[13803.389154] Initializing CPU#1
[13803.389154] masked ExtINT on CPU#1
The bug has been introduced (assuming that I'm using git blame correctly) by this commit:
commit e587cadd8f47e202a30712e2906a65a0606d5865
Author: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Date: Thu Mar 6 08:48:49 2008 -0500
x86: enhance DEBUG_RODATA support - alternatives
In both code paths - cpu_up() and cpud_down() - alternatives_smp_switch ends up
calling text_poke() (via alternatives_smp_{,un}lock) with smp_alt held;
then text_poke does a vmap which might sleep!
Luca
--
Windows NT crashed.
I'm the Blue Screen of Death.
No one hears your screams.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] smp alternatives: sleeping function called from invalid context
2008-04-30 19:51 [BUG] smp alternatives: sleeping function called from invalid context Luca Tettamanti
@ 2008-05-01 13:04 ` Mathieu Desnoyers
2008-05-01 15:02 ` Luca Tettamanti
0 siblings, 1 reply; 9+ messages in thread
From: Mathieu Desnoyers @ 2008-05-01 13:04 UTC (permalink / raw)
To: Luca Tettamanti; +Cc: linux-kernel
* Luca Tettamanti (kronos.it@gmail.com) wrote:
> Hello,
> with git current I'm seeing these warnings when bringing CPUs down:
>
Hi Luca,
Does your tree include the patch posted here ?
http://lkml.org/lkml/2008/4/19/139
Mathieu
> [13794.108805] CPU0 attaching NULL sched-domain.
> [13794.108815] CPU1 attaching NULL sched-domain.
> [13794.124695] Broke affinity for irq 22
> [13794.192376] CPU 1 is now offline
> [13794.192381] lockdep: fixing up alternatives.
> [13794.192389] SMP alternatives: switching to UP code
> [13794.192395] BUG: sleeping function called from invalid context at /home/kronos/src/linux-2.6.git/mm/slab.c:3049
> [13794.192398] in_atomic():1, irqs_disabled():0
> [13794.192401] 4 locks held by bash/11413:
> [13794.192404] #0: (&buffer->mutex){--..}, at: [<ffffffff802c638c>] sysfs_write_file+0x36/0x10c
> [13794.192417] #1: (cpu_add_remove_lock){--..}, at: [<ffffffff80253479>] cpu_down+0x12/0x32
> [13794.192427] #2: (&cpu_hotplug.lock){--..}, at: [<ffffffff80253024>] cpu_hotplug_begin+0x36/0xa6
> [13794.192435] #3: (smp_alt){--..}, at: [<ffffffff8021182f>] alternatives_smp_switch+0x63/0x1a5
> [13794.192446] Pid: 11413, comm: bash Not tainted 2.6.25-64-05976-ge31a94e-dirty #68
> [13794.192449]
> [13794.192450] Call Trace:
> [13794.192681] [<ffffffff80280f61>] __kmalloc+0x29/0x7d
> [13794.192695] [<ffffffff80276415>] __get_vm_area_node+0x9a/0x1b1
> [13794.192703] [<ffffffff80209391>] clear_ti_thread_flag+0xc/0x12
> [13794.192709] [<ffffffff80276585>] get_vm_area_caller+0x2b/0x2e
> [13794.192715] [<ffffffff8021159d>] text_poke+0xdd/0x156
> [13794.192719] [<ffffffff80276cc5>] vmap+0x2d/0x5a
> [13794.192727] [<ffffffff8021159d>] text_poke+0xdd/0x156
> [13794.192736] [<ffffffff80446f1b>] _etext+0x0/0x5
> [13794.192742] [<ffffffff80211666>] alternatives_smp_unlock+0x50/0x65
> [13794.192752] [<ffffffff80211934>] alternatives_smp_switch+0x168/0x1a5
> [13794.192758] [<ffffffff80219439>] __cpus_weight+0x3f/0x41
> [13794.192767] [<ffffffff802532b4>] _cpu_down+0x18f/0x264
> [13794.192796] [<ffffffff8025348b>] cpu_down+0x24/0x32
> [13794.192805] [<ffffffff8043259d>] store_online+0x29/0x67
> [13794.192812] [<ffffffff802c642b>] sysfs_write_file+0xd5/0x10c
> [13794.192824] [<ffffffff80283cba>] vfs_write+0xa5/0xde
> [13794.192832] [<ffffffff80283da5>] sys_write+0x45/0x6b
> [13794.192842] [<ffffffff80221ed2>] ia32_sysret+0x0/0xa
> [13794.192857]
> [13794.208799] CPU0 attaching NULL sched-domain.
>
> ...and up:
>
> [13803.369635] lockdep: fixing up alternatives.
> [13803.369640] SMP alternatives: switching to SMP code
> [13803.369645] BUG: sleeping function called from invalid context at /home/kronos/src/linux-2.6.git/mm/slab.c:3049
> [13803.369648] in_atomic():1, irqs_disabled():0
> [13803.369651] 4 locks held by bash/11413:
> [13803.369654] #0: (&buffer->mutex){--..}, at: [<ffffffff802c638c>] sysfs_write_file+0x36/0x10c
> [13803.369666] #1: (cpu_add_remove_lock){--..}, at: [<ffffffff8044115e>] cpu_up+0x40/0x64
> [13803.369676] #2: (&cpu_hotplug.lock){--..}, at: [<ffffffff80253024>] cpu_hotplug_begin+0x36/0xa6
> [13803.369686] #3: (smp_alt){--..}, at: [<ffffffff8021182f>] alternatives_smp_switch+0x63/0x1a5
> [13803.369697] Pid: 11413, comm: bash Not tainted 2.6.25-64-05976-ge31a94e-dirty #68
> [13803.369700]
> [13803.369701] Call Trace:
> [13803.369712] [<ffffffff80280f61>] __kmalloc+0x29/0x7d
> [13803.369721] [<ffffffff80276415>] __get_vm_area_node+0x9a/0x1b1
> [13803.369731] [<ffffffff80209391>] clear_ti_thread_flag+0xc/0x12
> [13803.369738] [<ffffffff80446f1b>] _etext+0x0/0x5
> [13803.369743] [<ffffffff80276585>] get_vm_area_caller+0x2b/0x2e
> [13803.369749] [<ffffffff8021159d>] text_poke+0xdd/0x156
> [13803.369753] [<ffffffff80276cc5>] vmap+0x2d/0x5a
> [13803.369763] [<ffffffff8021159d>] text_poke+0xdd/0x156
> [13803.369773] [<ffffffff802118c5>] alternatives_smp_switch+0xf9/0x1a5
> [13803.369795] [<ffffffff8043f29e>] native_cpu_up+0x224/0x7fc
> [13803.369804] [<ffffffff8043ee95>] do_fork_idle+0x0/0x20
> [13803.369833] [<ffffffff80441098>] _cpu_up+0x91/0x117
> [13803.369842] [<ffffffff80441175>] cpu_up+0x57/0x64
> [13803.369849] [<ffffffff804325b7>] store_online+0x43/0x67
> [13803.369856] [<ffffffff802c642b>] sysfs_write_file+0xd5/0x10c
> [13803.369867] [<ffffffff80283cba>] vfs_write+0xa5/0xde
> [13803.369875] [<ffffffff80283da5>] sys_write+0x45/0x6b
> [13803.369886] [<ffffffff80221ed2>] ia32_sysret+0x0/0xa
> [13803.369898]
> [13803.380408] Booting processor 1/1 ip 6000
> [13803.389154] Initializing CPU#1
> [13803.389154] masked ExtINT on CPU#1
>
> The bug has been introduced (assuming that I'm using git blame correctly) by this commit:
>
> commit e587cadd8f47e202a30712e2906a65a0606d5865
> Author: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> Date: Thu Mar 6 08:48:49 2008 -0500
>
> x86: enhance DEBUG_RODATA support - alternatives
>
>
> In both code paths - cpu_up() and cpud_down() - alternatives_smp_switch ends up
> calling text_poke() (via alternatives_smp_{,un}lock) with smp_alt held;
> then text_poke does a vmap which might sleep!
>
> Luca
> --
> Windows NT crashed.
> I'm the Blue Screen of Death.
> No one hears your screams.
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] smp alternatives: sleeping function called from invalid context
2008-05-01 13:04 ` Mathieu Desnoyers
@ 2008-05-01 15:02 ` Luca Tettamanti
2008-05-01 16:44 ` Luca Tettamanti
0 siblings, 1 reply; 9+ messages in thread
From: Luca Tettamanti @ 2008-05-01 15:02 UTC (permalink / raw)
To: Mathieu Desnoyers; +Cc: linux-kernel
On Thu, May 1, 2008 at 3:04 PM, Mathieu Desnoyers
<mathieu.desnoyers@polymtl.ca> wrote:
> * Luca Tettamanti (kronos.it@gmail.com) wrote:
> > Hello,
> > with git current I'm seeing these warnings when bringing CPUs down:
> >
>
>
> Hi Luca,
>
> Does your tree include the patch posted here ?
>
> http://lkml.org/lkml/2008/4/19/139
Nope, I was using vanilla kernel (smp_alt is still a spinlock). Will
test the patch ASAP.
Luca
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] smp alternatives: sleeping function called from invalid context
2008-05-01 15:02 ` Luca Tettamanti
@ 2008-05-01 16:44 ` Luca Tettamanti
2008-05-01 18:46 ` [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable Mathieu Desnoyers
0 siblings, 1 reply; 9+ messages in thread
From: Luca Tettamanti @ 2008-05-01 16:44 UTC (permalink / raw)
To: Mathieu Desnoyers; +Cc: linux-kernel
On Thu, May 1, 2008 at 5:02 PM, Luca Tettamanti <kronos.it@gmail.com> wrote:
> On Thu, May 1, 2008 at 3:04 PM, Mathieu Desnoyers
> <mathieu.desnoyers@polymtl.ca> wrote:
> > * Luca Tettamanti (kronos.it@gmail.com) wrote:
> > > Hello,
> > > with git current I'm seeing these warnings when bringing CPUs down:
> > >
> >
> >
> > Hi Luca,
> >
> > Does your tree include the patch posted here ?
> >
> > http://lkml.org/lkml/2008/4/19/139
>
> Nope, I was using vanilla kernel (smp_alt is still a spinlock). Will
> test the patch ASAP.
Tested the patch: works fine, and the system is stable after multiple
up&down, I'm not experiencing a reboot.
Luca
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable
2008-05-01 16:44 ` Luca Tettamanti
@ 2008-05-01 18:46 ` Mathieu Desnoyers
2008-05-01 18:52 ` Linus Torvalds
0 siblings, 1 reply; 9+ messages in thread
From: Mathieu Desnoyers @ 2008-05-01 18:46 UTC (permalink / raw)
To: Ingo Molnar, Linus Torvalds, Luca Tettamanti; +Cc: linux-kernel
* Luca Tettamanti (kronos.it@gmail.com) wrote:
> On Thu, May 1, 2008 at 5:02 PM, Luca Tettamanti <kronos.it@gmail.com> wrote:
> > On Thu, May 1, 2008 at 3:04 PM, Mathieu Desnoyers
> > <mathieu.desnoyers@polymtl.ca> wrote:
> > > * Luca Tettamanti (kronos.it@gmail.com) wrote:
> > > > Hello,
> > > > with git current I'm seeing these warnings when bringing CPUs down:
> > > >
> > >
> > >
> > > Hi Luca,
> > >
> > > Does your tree include the patch posted here ?
> > >
> > > http://lkml.org/lkml/2008/4/19/139
> >
> > Nope, I was using vanilla kernel (smp_alt is still a spinlock). Will
> > test the patch ASAP.
>
> Tested the patch: works fine, and the system is stable after multiple
> up&down, I'm not experiencing a reboot.
>
> Luca
Ingo, Linus,
I guess fast-tracking pulling the following patch into mainline wouldn't
hurt. I just updated it so it applies to mainline.
Thanks,
Mathieu
Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable
* Pekka Paalanen (pq@iki.fi) wrote:
> On Mon, 14 Apr 2008 08:57:13 +0200
> Ingo Molnar <mingo@elte.hu> wrote:
>
> > Pekka Paalanen wrote:
> > > When I tested this patch on Intel Core 2 Duo, enter_uniprocessor()
> > > triggered the following kernel bug:
> > >
> > > Linux version 2.6.25-rc8-sched-devel.git-x86-latest.git (paalanen@ct200006)
> > > (gcc version 4.1.2 (Gentoo 4.1.2 p1.0.1)) #2 SMP PREEMPT Sun Apr 13
> > > 22:09:03 EEST 2008
> > > ...
> > > in mmio_trace_init
> > > mmiotrace: Disabling non-boot CPUs...
> > > CPU 1 is now offline
> > > lockdep: fixing up alternatives.
> > > SMP alternatives: switching to UP code
> > > BUG: sleeping function called from invalid context at mm/slab.c:3053
> > > in_atomic():1, irqs_disabled():0
> > > 5 locks held by bash/4423:
> > > #0: (trace_types_lock){--..}, at: [<ffffffff8026442f>] tracing_set_trace_write+0x93/0x11a
> > > #1: (mmiotrace_mutex){--..}, at: [<ffffffff802251e0>] enable_mmiotrace+0x17/0x142
> > > #2: (cpu_add_remove_lock){--..}, at: [<ffffffff802580e5>] cpu_maps_update_begin+0x12/0x14
> > > #3: (&cpu_hotplug.lock){--..}, at: [<ffffffff8025814f>] cpu_hotplug_begin+0x39/0x9f
> > > #4: (smp_alt){--..}, at: [<ffffffff80211bd9>] alternatives_smp_switch+0x66/0x1ab
> > > Pid: 4423, comm: bash Not tainted 2.6.25-rc8-sched-devel.git-x86-latest.git #2
> > >
> > > Call Trace:
> > > [<ffffffff802520c0>] ? __debug_show_held_locks+0x22/0x24
> > > [<ffffffff8022d292>] __might_sleep+0xd9/0xdb
> > > [<ffffffff80287326>] cache_alloc_debugcheck_before+0x23/0x32
> > > [<ffffffff80287a32>] __kmalloc+0x34/0xa5
> > > [<ffffffff80209393>] ? clear_ti_thread_flag+0x10/0x17
> > > [<ffffffff8027d7f6>] kmalloc_node+0x9/0xb
> > > [<ffffffff8027d8e0>] __get_vm_area_node+0xa2/0x1cb
> > > [<ffffffff80209393>] ? clear_ti_thread_flag+0x10/0x17
> > > [<ffffffff8027da41>] __get_vm_area+0x13/0x15
> > > [<ffffffff8027da60>] get_vm_area+0x1d/0x1f
> > > [<ffffffff8027e14c>] vmap+0x2a/0x5c
> > > [<ffffffff8021191b>] text_poke+0xaa/0x136
> > > [<ffffffff804cba2b>] ? _etext+0x0/0x5
> > > [<ffffffff802119f6>] alternatives_smp_unlock+0x4f/0x63
> > > [<ffffffff80211ce1>] alternatives_smp_switch+0x16e/0x1ab
> > > [<ffffffff8021b163>] __cpu_die+0x53/0x7d
> > > [<ffffffff802583e2>] _cpu_down+0x195/0x26c
> > > [<ffffffff802585ca>] cpu_down+0x26/0x36
> > > [<ffffffff80225270>] enable_mmiotrace+0xa7/0x142
> > > [<ffffffff80266b8d>] mmio_trace_init+0x3c/0x40
> > > [<ffffffff8026448e>] tracing_set_trace_write+0xf2/0x11a
> > > [<ffffffff80327fac>] ? security_file_permission+0x11/0x13
> > > [<ffffffff8028b047>] vfs_write+0xa7/0xe1
> > > [<ffffffff8028b13b>] sys_write+0x47/0x6d
> > > [<ffffffff8020b4db>] system_call_after_swapgs+0x7b/0x80
> > >
> > > mmiotrace: CPU1 is down.
> > > mmiotrace: enabled.
> > >
> > > Is this my fault, or is there a bug somewhere else? The kernel tree is
> > > sched-devel/latest git from 12th April, IIRC.
> >
> > there's no known bug of sched-devel/latest of this kind (or any known
> > bug for that matter).
> >
> > i suspect the bug is that you bring the CPU down from an atomic
> > (spinlocked or irq disabled) context.
>
> I have been eyeballing the code in current sched-devel/latest and there's
> something I think I found.
>
> This is the beginning of my enter_uniprocessor() which is called from
> enable_mmiotrace() seen in the backtrace.
>
> static void enter_uniprocessor(void)
> {
> int cpu;
> int err;
>
> get_online_cpus();
> downed_cpus = cpu_online_map;
> cpu_clear(first_cpu(cpu_online_map), downed_cpus);
> if (num_online_cpus() > 1)
> pr_notice(NAME "Disabling non-boot CPUs...\n");
> put_online_cpus();
>
> for_each_cpu_mask(cpu, downed_cpus) {
> err = cpu_down(cpu);
>
> The function get_online_cpus() calls might_sleep(), so at that
> point everything is fine.
>
> Following the backtrace, we come to alternatives_smp_switch(),
> which does
> spin_lock(&smp_alt);
> and then continues eventually into this function:
>
> void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
> {
> unsigned long flags;
> char *vaddr;
> int nr_pages = 2;
>
> BUG_ON(len > sizeof(long));
> BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
> - ((long)addr & ~(sizeof(long) - 1)));
> if (kernel_text_address((unsigned long)addr)) {
> struct page *pages[2] = { virt_to_page(addr),
> virt_to_page(addr + PAGE_SIZE) };
> if (!pages[1])
> nr_pages = 1;
> vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
>
> After which we find ourselves in
>
> struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags,
> unsigned long start, unsigned long end)
> {
> return __get_vm_area_node(size, flags, start, end, -1, GFP_KERNEL);
>
> and in __get_vm_area_node() there is
> area = kmalloc_node(sizeof(*area), gfp_mask & GFP_RECLAIM_MASK, node);
>
> where gfp_mask is now GFP_KERNEL. As far as I can tell, using SLAB,
> kmalloc_node() here is actually kmalloc(). Anyway, looks like we end up
> in __kmalloc with such flags that it may sleep.
>
> I have followed the code so that the path up to kmalloc_node() is
> possible, unless there are some "should never happen" conditions
> along the way, which I just cannot know of.
>
> Is this the bug?
>
Yep. I think using a mutex should fix it. There is no reason to use a
spinlock rather than a mutex here.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Pekka Paalanen <pq@iki.fi>
CC: Ingo Molnar <mingo@elte.hu>
CC: Steven Rostedt <rostedt@goodmis.org>
---
arch/x86/kernel/alternative.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
Index: linux-2.6-lttng/arch/x86/kernel/alternative.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/kernel/alternative.c 2008-05-01 14:41:09.000000000 -0400
+++ linux-2.6-lttng/arch/x86/kernel/alternative.c 2008-05-01 14:44:28.000000000 -0400
@@ -1,6 +1,6 @@
#include <linux/module.h>
#include <linux/sched.h>
-#include <linux/spinlock.h>
+#include <linux/mutex.h>
#include <linux/list.h>
#include <linux/kprobes.h>
#include <linux/mm.h>
@@ -279,7 +279,7 @@ struct smp_alt_module {
struct list_head next;
};
static LIST_HEAD(smp_alt_modules);
-static DEFINE_SPINLOCK(smp_alt);
+static DEFINE_MUTEX(smp_alt);
static int smp_mode = 1; /* protected by smp_alt */
void alternatives_smp_module_add(struct module *mod, char *name,
@@ -312,12 +312,12 @@ void alternatives_smp_module_add(struct
__func__, smp->locks, smp->locks_end,
smp->text, smp->text_end, smp->name);
- spin_lock(&smp_alt);
+ mutex_lock(&smp_alt);
list_add_tail(&smp->next, &smp_alt_modules);
if (boot_cpu_has(X86_FEATURE_UP))
alternatives_smp_unlock(smp->locks, smp->locks_end,
smp->text, smp->text_end);
- spin_unlock(&smp_alt);
+ mutex_unlock(&smp_alt);
}
void alternatives_smp_module_del(struct module *mod)
@@ -327,17 +327,17 @@ void alternatives_smp_module_del(struct
if (smp_alt_once || noreplace_smp)
return;
- spin_lock(&smp_alt);
+ mutex_lock(&smp_alt);
list_for_each_entry(item, &smp_alt_modules, next) {
if (mod != item->mod)
continue;
list_del(&item->next);
- spin_unlock(&smp_alt);
+ mutex_unlock(&smp_alt);
DPRINTK("%s: %s\n", __func__, item->name);
kfree(item);
return;
}
- spin_unlock(&smp_alt);
+ mutex_unlock(&smp_alt);
}
void alternatives_smp_switch(int smp)
@@ -359,7 +359,7 @@ void alternatives_smp_switch(int smp)
return;
BUG_ON(!smp && (num_online_cpus() > 1));
- spin_lock(&smp_alt);
+ mutex_lock(&smp_alt);
/*
* Avoid unnecessary switches because it forces JIT based VMs to
@@ -383,7 +383,7 @@ void alternatives_smp_switch(int smp)
mod->text, mod->text_end);
}
smp_mode = smp;
- spin_unlock(&smp_alt);
+ mutex_unlock(&smp_alt);
}
#endif
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable
2008-05-01 18:46 ` [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable Mathieu Desnoyers
@ 2008-05-01 18:52 ` Linus Torvalds
2008-05-01 20:08 ` Mathieu Desnoyers
0 siblings, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 2008-05-01 18:52 UTC (permalink / raw)
To: Mathieu Desnoyers; +Cc: Ingo Molnar, Luca Tettamanti, linux-kernel
On Thu, 1 May 2008, Mathieu Desnoyers wrote:
>
> Yep. I think using a mutex should fix it. There is no reason to use a
> spinlock rather than a mutex here.
Is there any reason to do any locking what-so-ever?
If this code could possibly race with something else, that sounds like a
bug in itself, no?
Linus
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable
2008-05-01 18:52 ` Linus Torvalds
@ 2008-05-01 20:08 ` Mathieu Desnoyers
0 siblings, 0 replies; 9+ messages in thread
From: Mathieu Desnoyers @ 2008-05-01 20:08 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Ingo Molnar, Luca Tettamanti, linux-kernel
* Linus Torvalds (torvalds@linux-foundation.org) wrote:
>
>
> On Thu, 1 May 2008, Mathieu Desnoyers wrote:
> >
> > Yep. I think using a mutex should fix it. There is no reason to use a
> > spinlock rather than a mutex here.
>
> Is there any reason to do any locking what-so-ever?
>
> If this code could possibly race with something else, that sounds like a
> bug in itself, no?
>
> Linus
The smp_alt mutex taken in
alternatives_smp_module_add/alternatives_smp_module_del seems required
because this code will run at module load/free, even in SMP context (not
within stop_machine_run).
As for alternatives_smp_switch, which is called when switching between
UP and SMP, always in UP context, I guess we should assume it could be
executed concurrently wrt the module_load/free, even in UP on a
CONFIG_PREEMPT kernel.
Mathieu
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2008-05-01 20:09 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-30 19:51 [BUG] smp alternatives: sleeping function called from invalid context Luca Tettamanti
2008-05-01 13:04 ` Mathieu Desnoyers
2008-05-01 15:02 ` Luca Tettamanti
2008-05-01 16:44 ` Luca Tettamanti
2008-05-01 18:46 ` [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable Mathieu Desnoyers
2008-05-01 18:52 ` Linus Torvalds
2008-05-01 20:08 ` Mathieu Desnoyers
[not found] <20080413224207.4430a09c@daedalus.pq.iki.fi>
2008-04-13 20:05 ` [BUG/PATCH] x86 mmiotrace: dynamically disable non-boot CPUs Pekka Paalanen
2008-04-14 6:57 ` Ingo Molnar
2008-04-19 15:41 ` [BUG] kmalloc_node(GFP_KERNEL) while smp_alt spinlocked Pekka Paalanen
2008-04-19 16:19 ` [PATCH] Fix SMP alternatives : use mutex instead of spinlock, text_poke is sleepable Mathieu Desnoyers
2008-04-19 21:06 ` Pekka Paalanen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox