* [BUG 2.6.36-rc6] list corruption in module_bug_finalize
@ 2010-10-03 19:51 Thomas Gleixner
2010-10-04 11:00 ` Arnd Bergmann
2010-10-05 4:18 ` Rusty Russell
0 siblings, 2 replies; 14+ messages in thread
From: Thomas Gleixner @ 2010-10-03 19:51 UTC (permalink / raw)
To: LKML; +Cc: Rusty Russell, Arnd Bergmann
Current mainline triggers a list corruption bug in
module_bug_finalize(). dmesg excerpt below.
The corresponding code says:
	/*
	 * Strictly speaking this should have a spinlock to protect against
	 * traversals, but since we only traverse on BUG()s, a spinlock
	 * could potentially lead to deadlock and thus be counter-productive.
	 */
	list_add(&mod->bug_list, &module_bug_list);
I can see the traversal problem vs. BUG(), but what's protecting the
list_add() ? BKL probably did, but is that true anymore ?
Thanks,
tglx
---
initcall floppy_module_init+0x0/0xddb [floppy] returned 0 after 12247 usecs
calling mb862xxfb_init+0x0/0x25 [mb862xxfb] @ 768
mb862xxfb 0000:05:00.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
mb862xxfb 0000:05:00.0: Fujitsu Carmine GDC Rev.3 found
initcall mb862xxfb_init+0x0/0x25 [mb862xxfb] returned 0 after 36925 usecs
calling parport_default_proc_register+0x0/0x1b [parport] @ 800
initcall parport_default_proc_register+0x0/0x1b [parport] returned 0 after 6 usecs
calling alsa_sound_init+0x0/0x96 [snd] @ 689
initcall alsa_sound_init+0x0/0x96 [snd] returned 0 after 20 usecs
calling i82975x_init+0x0/0xa1 [i82975x_edac] @ 690
EDAC i82975x: ECC disabled on both channels.
initcall i82975x_init+0x0/0xa1 [i82975x_edac] returned 0 after 4060 usecs
calling alsa_timer_init+0x0/0x17f [snd_timer] @ 689
initcall alsa_timer_init+0x0/0x17f [snd_timer] returned 0 after 45 usecs
calling parport_pc_init+0x0/0x357 [parport_pc] @ 800
parport_pc 00:09: reported by Plug and Play ACPI
parport0: PC-style at 0x378 (0x778), irq 7 [PCSPP,TRISTATE]
calling cp_init+0x0/0x35 [8139cp] @ 766
8139cp: 8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
8139cp 0000:05:02.0: This (id 10ec:8139 rev 10) is not an 8139C+ compatible chip, use 8139too
calling alsa_pcm_init+0x0/0x71 [snd_pcm] @ 845
initcall alsa_pcm_init+0x0/0x71 [snd_pcm] returned 0 after 6 usecs
initcall cp_init+0x0/0x35 [8139cp] returned 0 after 23729 usecs
initcall parport_pc_init+0x0/0x357 [parport_pc] returned 0 after 99365 usecs
calling ppdev_init+0x0/0xd2 [ppdev] @ 847
ppdev: user-space parallel port driver
initcall ppdev_init+0x0/0xd2 [ppdev] returned 0 after 3744 usecs
calling alsa_seq_device_init+0x0/0x60 [snd_seq_device] @ 848
initcall alsa_seq_device_init+0x0/0x60 [snd_seq_device] returned 0 after 5 usecs
calling rtl8139_init_module+0x0/0x2e [8139too] @ 766
8139too: 8139too Fast Ethernet driver 0.9.28
8139too 0000:05:02.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
8139too 0000:05:02.0: eth0: RealTek RTL8139 at 0xffffc900055dcc00, 00:50:fc:23:8e:aa, IRQ 18
initcall rtl8139_init_module+0x0/0x2e [8139too] returned 0 after 18257 usecs
calling shpcd_init+0x0/0x68 [shpchp] @ 654
shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
initcall shpcd_init+0x0/0x68 [shpchp] returned 0 after 94 usecs
udev: renamed network interface eth0 to eth2
calling alsa_seq_init+0x0/0x4c [snd_seq] @ 848
initcall alsa_seq_init+0x0/0x4c [snd_seq] returned 0 after 56 usecs
------------[ cut here ]------------
calling alsa_hwdep_init+0x0/0x69 [snd_hwdep] @ 856
initcall alsa_hwdep_init+0x0/0x69 [snd_hwdep] returned 0 after 5 usecs
WARNING: at /home/tglx/work/kernel/rt-new/linux-2.6-tip/lib/list_debug.c:26 __list_add+0x3f/0x83()
Hardware name:
list_add corruption. next->prev should be prev (ffffffff81a4c260), but was ffffffffa02a1368. (next=ffffffffa028b5c8).
calling e1000_init_module+0x0/0x43 [e1000e] @ 853
e1000e: Intel(R) PRO/1000 Network Driver - 1.2.7-k2
e1000e: Copyright (c) 1999 - 2010 Intel Corporation.
e1000e 0000:04:00.0: Disabling ASPM L1
e1000e 0000:04:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
e1000e 0000:04:00.0: setting latency timer to 64
e1000e 0000:04:00.0: irq 42 for MSI/MSI-X
e1000e 0000:04:00.0: Disabling ASPM L0s
Modules linked in: e1000e(+) snd_hwdep snd_seq shpchp 8139too snd_seq_device ppdev snd_pcm 8139cp parport_pc snd_timer i82975x_edac snd parport mii mb862xxfb mb862xxfb_accel floppy edac_core i2c_i801 serio_raw pcspkr microcode soundcore iTCO_wdt snd_page_alloc iTCO_vendor_support raid0 raid1 firewire_ohci firewire_core sata_sil crc_itu_t radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
Pid: 689, comm: modprobe Not tainted 2.6.36-rc5+ #80
e1000e 0000:04:00.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:16:76:ab:5f:54
e1000e 0000:04:00.0: eth0: Intel(R) PRO/1000 Network Connection
e1000e 0000:04:00.0: eth0: MAC: 2, PHY: 2, PBA No: ffffff-0ff
initcall e1000_init_module+0x0/0x43 [e1000e] returned 0 after 82625 usecs
Call Trace:
[<ffffffff81048c95>] warn_slowpath_common+0x85/0x9d
[<ffffffff81048d50>] warn_slowpath_fmt+0x46/0x48
[<ffffffff811fac30>] __list_add+0x3f/0x83
[<ffffffff811ecf0b>] module_bug_finalize+0xb9/0xca
[<ffffffff810276f5>] module_finalize+0x156/0x165
[<ffffffff81079b65>] load_module+0xf75/0x177a
[<ffffffff8107a3b4>] sys_init_module+0x4a/0x1e2
[<ffffffff81009cd2>] system_call_fastpath+0x16/0x1b
---[ end trace c97cbc43385366a8 ]---
------------[ cut here ]------------
WARNING: at /home/tglx/work/kernel/rt-new/linux-2.6-tip/lib/list_debug.c:30 __list_add+0x68/0x83()
Hardware name:
list_add corruption. prev->next should be next (ffffffffa028b5c8), but was ffffffffa02c9068. (prev=ffffffff81a4c260).
Modules linked in: e1000e snd_hwdep snd_seq shpchp 8139too snd_seq_device ppdev snd_pcm 8139cp parport_pc snd_timer i82975x_edac snd parport mii mb862xxfb mb862xxfb_accel floppy edac_core i2c_i801 serio_raw pcspkr microcode soundcore iTCO_wdt snd_page_alloc iTCO_vendor_support raid0 raid1 firewire_ohci firewire_core sata_sil crc_itu_t radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
Pid: 689, comm: modprobe Tainted: G W 2.6.36-rc5+ #80
Call Trace:
[<ffffffff81048c95>] warn_slowpath_common+0x85/0x9d
[<ffffffff81048d50>] warn_slowpath_fmt+0x46/0x48
[<ffffffff811fac59>] __list_add+0x68/0x83
[<ffffffff811ecf0b>] module_bug_finalize+0xb9/0xca
[<ffffffff810276f5>] module_finalize+0x156/0x165
[<ffffffff81079b65>] load_module+0xf75/0x177a
[<ffffffff8107a3b4>] sys_init_module+0x4a/0x1e2
[<ffffffff81009cd2>] system_call_fastpath+0x16/0x1b
---[ end trace c97cbc43385366a9 ]---
udev: renamed network interface eth0 to eth1
calling alsa_card_azx_init+0x0/0x20 [snd_hda_intel] @ 689
HDA Intel 0000:00:1b.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22
HDA Intel 0000:00:1b.0: irq 43 for MSI/MSI-X
HDA Intel 0000:00:1b.0: setting latency timer to 64
md: bind<sdb6>
md: bind<sda6>
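For readers who have not met the check that fired above: with list debugging enabled, the add path verifies that the two neighbours still point at each other before linking a new entry. What follows is a minimal userspace sketch of that idea, not the kernel's code. The names `list_node`, `checked_insert()` and `replay_unlocked_adds()` are made up for illustration, and the interleaving it replays is only one plausible way two unlocked adders could trip both checks (in the log the two "but was" pointers differ, which suggests more than one other adder was involved).

```c
#include <assert.h>
#include <stddef.h>

/* Circular doubly linked list node, modelled on the kernel's list_head. */
struct list_node {
	struct list_node *next, *prev;
};

enum {
	BAD_NEXT_PREV = 1,	/* "next->prev should be prev" */
	BAD_PREV_NEXT = 2,	/* "prev->next should be next" */
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

/*
 * Link 'entry' between 'prev' and 'next', which the caller sampled earlier.
 * Returns 0 on success, or a mask of failed checks (and links nothing).
 */
static int checked_insert(struct list_node *entry,
			  struct list_node *prev, struct list_node *next)
{
	int bad = 0;

	if (next->prev != prev)
		bad |= BAD_NEXT_PREV;
	if (prev->next != next)
		bad |= BAD_PREV_NEXT;
	if (bad)
		return bad;

	next->prev = entry;
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
	return 0;
}

/* Replay two unlocked adders racing on the same head, step by step. */
int replay_unlocked_adds(void)
{
	struct list_node head, a, b, c;
	struct list_node *stale_prev, *stale_next;
	int rc;

	list_init(&head);
	rc = checked_insert(&a, &head, head.next);
	assert(rc == 0);

	/* Adder 2 samples the neighbours, then gets delayed. */
	stale_prev = &head;
	stale_next = head.next;		/* &a */

	/* Adder 1 runs to completion in the meantime. */
	rc = checked_insert(&b, &head, head.next);
	assert(rc == 0);

	/* Adder 2 resumes with its stale view: both checks now fail. */
	return checked_insert(&c, stale_prev, stale_next);
}
```

In this replay the second adder gets back `BAD_NEXT_PREV | BAD_PREV_NEXT`, which corresponds to the pair of warnings in the log. With a lock held across "sample the neighbours" and "link the entry", the stale view cannot arise.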
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [BUG 2.6.36-rc6] list corruption in module_bug_finalize
2010-10-03 19:51 [BUG 2.6.36-rc6] list corruption in module_bug_finalize Thomas Gleixner
@ 2010-10-04 11:00 ` Arnd Bergmann
2010-10-04 22:43 ` Thomas Gleixner
2010-10-05 4:18 ` Rusty Russell
1 sibling, 1 reply; 14+ messages in thread
From: Arnd Bergmann @ 2010-10-04 11:00 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: LKML, Rusty Russell, Kay Sievers, Brandon Philips
On Sunday 03 October 2010, Thomas Gleixner wrote:
> Current mainline triggers a list corruption bug in
> module_bug_finalize(). dmesg excerpt below.
>
> The corresponding code says:
>
> /*
> * Strictly speaking this should have a spinlock to protect against
> * traversals, but since we only traverse on BUG()s, a spinlock
> * could potentially lead to deadlock and thus be counter-productive.
> */
> list_add(&mod->bug_list, &module_bug_list);
>
> I can see the traversal problem vs. BUG(), but what's protecting the
> list_add() ? BKL probably did, but is that true anymore ?
BKL hasn't been in this code path since before git.
I think this relatively recent change caused module_finalize to be
called without module_mutex held:
commit 75676500f8298f0ee89db12db97294883c4b768e
Author: Rusty Russell <rusty@rustcorp.com.au>
Date: Sat Jun 5 11:17:36 2010 -0600
module: make locking more fine-grained.
Kay Sievers <kay.sievers@vrfy.org> reports that we still have some
contention over module loading which is slowing boot.
Linus also disliked a previous "drop lock and regrab" patch to fix the
bne2 "gave up waiting for init of module libcrc32c" message.
This is more ambitious: we only grab the lock where we need it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Brandon Philips <brandon@ifup.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Arnd
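The failure mode described here, a callee that silently relied on its caller holding a lock, can be sketched in userspace as below. This is an illustration under stated assumptions, not kernel code: `tracked_mutex`, `registry_add_locked()` and the two loader functions are hypothetical names, and the `held` flag is only meaningful in this single-threaded demo. In the kernel, `lockdep_assert_held()` serves a similar purpose and tracks ownership properly.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* A mutex that remembers whether it is held (demo-only bookkeeping). */
struct tracked_mutex {
	pthread_mutex_t lock;
	bool held;
};

static struct tracked_mutex registry_mutex = {
	PTHREAD_MUTEX_INITIALIZER, false
};
static int registry_count;

static void tracked_lock(struct tracked_mutex *m)
{
	pthread_mutex_lock(&m->lock);
	m->held = true;
}

static void tracked_unlock(struct tracked_mutex *m)
{
	m->held = false;
	pthread_mutex_unlock(&m->lock);
}

/* Callee with a "caller must hold registry_mutex" contract, now checked. */
static int registry_add_locked(void)
{
	if (!registry_mutex.held)
		return -1;	/* contract violated: nobody holds the lock */
	registry_count++;
	return 0;
}

/* Old-style caller: one big critical section, contract satisfied. */
static int coarse_grained_load(void)
{
	int err;

	tracked_lock(&registry_mutex);
	err = registry_add_locked();
	tracked_unlock(&registry_mutex);
	return err;
}

/* Refactored caller: the lock was narrowed and no longer covers the callee. */
static int fine_grained_load(void)
{
	return registry_add_locked();
}

int contract_demo(void)
{
	int err;

	err = coarse_grained_load();
	assert(err == 0);	/* lock held, add succeeds */

	err = fine_grained_load();
	assert(err == -1);	/* lock not held, violation reported */

	return registry_count;	/* only the locked add took effect */
}
```

Without such a check the unlocked call would simply succeed most of the time, which is why a change of this kind can go unnoticed until two loads happen to overlap.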
* Re: [BUG 2.6.36-rc6] list corruption in module_bug_finalize
2010-10-04 11:00 ` Arnd Bergmann
@ 2010-10-04 22:43 ` Thomas Gleixner
2010-10-04 23:55 ` Linus Torvalds
2010-10-05 5:14 ` Rusty Russell
0 siblings, 2 replies; 14+ messages in thread
From: Thomas Gleixner @ 2010-10-04 22:43 UTC (permalink / raw)
To: Arnd Bergmann
Cc: LKML, Rusty Russell, Kay Sievers, Brandon Philips, Linus Torvalds
On Mon, 4 Oct 2010, Arnd Bergmann wrote:
> On Sunday 03 October 2010, Thomas Gleixner wrote:
> > Current mainline triggers a list corruption bug in
> > module_bug_finalize(). dmesg excerpt below.
> >
> > The corresponding code says:
> >
> > /*
> > * Strictly speaking this should have a spinlock to protect against
> > * traversals, but since we only traverse on BUG()s, a spinlock
> > * could potentially lead to deadlock and thus be counter-productive.
> > */
> > list_add(&mod->bug_list, &module_bug_list);
> >
> > I can see the traversal problem vs. BUG(), but what's protecting the
> > list_add() ? BKL probably did, but is that true anymore ?
>
> BKL hasn't been in this code path since before git.
Fair enough. I have to admit that I did not even look. :)
> I think this relatively recent change caused module_finalize to be
> called without module_mutex held:
Yeah.
> commit 75676500f8298f0ee89db12db97294883c4b768e
> Author: Rusty Russell <rusty@rustcorp.com.au>
> Date: Sat Jun 5 11:17:36 2010 -0600
>
> module: make locking more fine-grained.
>
> Kay Sievers <kay.sievers@vrfy.org> reports that we still have some
> contention over module loading which is slowing boot.
>
> Linus also disliked a previous "drop lock and regrab" patch to fix the
> bne2 "gave up waiting for init of module libcrc32c" message.
>
> This is more ambitious: we only grab the lock where we need it.
>
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> Cc: Brandon Philips <brandon@ifup.org>
> Cc: Kay Sievers <kay.sievers@vrfy.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
>
> Arnd
The patch below cures it.
Thanks,
tglx
---->
diff --git a/lib/bug.c b/lib/bug.c
index 7cdfad8..40f32d8 100644
--- a/lib/bug.c
+++ b/lib/bug.c
@@ -92,18 +92,21 @@ int module_bug_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
 	}
 
 	/*
-	 * Strictly speaking this should have a spinlock to protect against
-	 * traversals, but since we only traverse on BUG()s, a spinlock
-	 * could potentially lead to deadlock and thus be counter-productive.
+	 * We need to take module_mutex here to protect the list add, though
+	 * it won't protect against a concurrent BUG().
 	 */
+	mutex_lock(&module_mutex);
 	list_add(&mod->bug_list, &module_bug_list);
+	mutex_unlock(&module_mutex);
 
 	return 0;
 }
 
 void module_bug_cleanup(struct module *mod)
 {
+	mutex_lock(&module_mutex);
 	list_del(&mod->bug_list);
+	mutex_unlock(&module_mutex);
 }
 
 #else
* Re: [BUG 2.6.36-rc6] list corruption in module_bug_finalize
2010-10-04 22:43 ` Thomas Gleixner
@ 2010-10-04 23:55 ` Linus Torvalds
2010-10-05 1:11 ` Linus Torvalds
2010-10-05 5:14 ` Rusty Russell
1 sibling, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2010-10-04 23:55 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Arnd Bergmann, LKML, Rusty Russell, Kay Sievers, Brandon Philips
On Mon, Oct 4, 2010 at 3:43 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> The patch below cures it.
Hmm. I think I'd rather move the module_bug_finalize() call away from
the arch-specific module_finalize(), and down later into
load_module().
Basically, we simply shouldn't be doing global things that need the
'module_mutex' until after we've done all the local things, and then
checked for uniqueness.
And I don't see why module_bug_finalize() (and module_bug_cleanup) is
called from arch-specific code anyway. I think that placement is
purely historical.
It would seem to make most sense to do the module_bug_finalize() in
the same location where we add the module to the list of modules.
IOW, a patch like the attached. UNTESTED!
Linus
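The ordering argued for above (local work first, then every global update in one locked section next to the uniqueness check) can be sketched in userspace roughly as follows. This is an illustration only: `mod_sketch`, `mod_mutex`, `load_mod_sketch()` and the singly linked lists are assumptions made for the example, not the kernel's data structures.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

struct mod_sketch {
	const char *name;
	int relocated;			/* stands in for per-module setup */
	struct mod_sketch *next;	/* link in the global module list */
	struct mod_sketch *bug_next;	/* link in the global bug-table list */
};

static pthread_mutex_t mod_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct mod_sketch *mod_list;
static struct mod_sketch *mod_bug_list;

/* Caller must hold mod_mutex. */
static struct mod_sketch *find_mod_locked(const char *name)
{
	struct mod_sketch *m;

	for (m = mod_list; m; m = m->next)
		if (strcmp(m->name, name) == 0)
			return m;
	return NULL;
}

static int load_mod_sketch(struct mod_sketch *mod)
{
	int err = 0;

	/* Phase 1: purely local work, no lock needed. */
	mod->relocated = 1;

	/* Phase 2: everything global, in one critical section. */
	pthread_mutex_lock(&mod_mutex);
	if (find_mod_locked(mod->name)) {
		err = -EEXIST;
		goto unlock;
	}
	mod->bug_next = mod_bug_list;	/* bug-table registration ... */
	mod_bug_list = mod;
	mod->next = mod_list;		/* ... right beside the module list add */
	mod_list = mod;
unlock:
	pthread_mutex_unlock(&mod_mutex);
	return err;
}

int load_demo(void)
{
	static struct mod_sketch first = { .name = "e1000e" };
	static struct mod_sketch second = { .name = "snd_hwdep" };
	static struct mod_sketch dup = { .name = "e1000e" };
	struct mod_sketch *m;
	int err, n = 0;

	err = load_mod_sketch(&first);
	assert(err == 0);
	err = load_mod_sketch(&second);
	assert(err == 0);
	err = load_mod_sketch(&dup);
	assert(err == -EEXIST);		/* rejected before touching any list */

	for (m = mod_bug_list; m; m = m->bug_next)
		n++;
	return n;			/* entries on the bug-table list */
}
```

A side effect of this ordering is that a module rejected as a duplicate never reaches the bug-table list, so the error path has nothing to undo there.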
[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 6497 bytes --]
arch/avr32/kernel/module.c | 3 +--
arch/h8300/kernel/module.c | 3 +--
arch/mn10300/kernel/module.c | 3 +--
arch/parisc/kernel/module.c | 3 +--
arch/powerpc/kernel/module.c | 5 -----
arch/s390/kernel/module.c | 3 +--
arch/sh/kernel/module.c | 2 --
arch/x86/kernel/module.c | 3 +--
include/linux/module.h | 5 ++---
kernel/module.c | 3 +++
lib/bug.c | 6 ++----
11 files changed, 13 insertions(+), 26 deletions(-)
diff --git a/arch/avr32/kernel/module.c b/arch/avr32/kernel/module.c
index 98f94d0..a727f54 100644
--- a/arch/avr32/kernel/module.c
+++ b/arch/avr32/kernel/module.c
@@ -314,10 +314,9 @@ int module_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
vfree(module->arch.syminfo);
module->arch.syminfo = NULL;
- return module_bug_finalize(hdr, sechdrs, module);
+ return 0;
}
void module_arch_cleanup(struct module *module)
{
- module_bug_cleanup(module);
}
diff --git a/arch/h8300/kernel/module.c b/arch/h8300/kernel/module.c
index 0865e29..db4953d 100644
--- a/arch/h8300/kernel/module.c
+++ b/arch/h8300/kernel/module.c
@@ -112,10 +112,9 @@ int module_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs,
struct module *me)
{
- return module_bug_finalize(hdr, sechdrs, me);
+ return 0;
}
void module_arch_cleanup(struct module *mod)
{
- module_bug_cleanup(mod);
}
diff --git a/arch/mn10300/kernel/module.c b/arch/mn10300/kernel/module.c
index 6aea7fd..196a111 100644
--- a/arch/mn10300/kernel/module.c
+++ b/arch/mn10300/kernel/module.c
@@ -206,7 +206,7 @@ int module_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs,
struct module *me)
{
- return module_bug_finalize(hdr, sechdrs, me);
+ return 0;
}
/*
@@ -214,5 +214,4 @@ int module_finalize(const Elf_Ehdr *hdr,
*/
void module_arch_cleanup(struct module *mod)
{
- module_bug_cleanup(mod);
}
diff --git a/arch/parisc/kernel/module.c b/arch/parisc/kernel/module.c
index 159a2b8..6e81bb5 100644
--- a/arch/parisc/kernel/module.c
+++ b/arch/parisc/kernel/module.c
@@ -941,11 +941,10 @@ int module_finalize(const Elf_Ehdr *hdr,
nsyms = newptr - (Elf_Sym *)symhdr->sh_addr;
DEBUGP("NEW num_symtab %lu\n", nsyms);
symhdr->sh_size = nsyms * sizeof(Elf_Sym);
- return module_bug_finalize(hdr, sechdrs, me);
+ return 0;
}
void module_arch_cleanup(struct module *mod)
{
deregister_unwind_table(mod);
- module_bug_cleanup(mod);
}
diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index 477c663..4ef93ae 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -65,10 +65,6 @@ int module_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sect;
int err;
- err = module_bug_finalize(hdr, sechdrs, me);
- if (err)
- return err;
-
/* Apply feature fixups */
sect = find_section(hdr, sechdrs, "__ftr_fixup");
if (sect != NULL)
@@ -101,5 +97,4 @@ int module_finalize(const Elf_Ehdr *hdr,
void module_arch_cleanup(struct module *mod)
{
- module_bug_cleanup(mod);
}
diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
index 22cfd63..f7167ee 100644
--- a/arch/s390/kernel/module.c
+++ b/arch/s390/kernel/module.c
@@ -407,10 +407,9 @@ int module_finalize(const Elf_Ehdr *hdr,
{
vfree(me->arch.syminfo);
me->arch.syminfo = NULL;
- return module_bug_finalize(hdr, sechdrs, me);
+ return 0;
}
void module_arch_cleanup(struct module *mod)
{
- module_bug_cleanup(mod);
}
diff --git a/arch/sh/kernel/module.c b/arch/sh/kernel/module.c
index 43adddf..ae0be69 100644
--- a/arch/sh/kernel/module.c
+++ b/arch/sh/kernel/module.c
@@ -149,13 +149,11 @@ int module_finalize(const Elf_Ehdr *hdr,
int ret = 0;
ret |= module_dwarf_finalize(hdr, sechdrs, me);
- ret |= module_bug_finalize(hdr, sechdrs, me);
return ret;
}
void module_arch_cleanup(struct module *mod)
{
- module_bug_cleanup(mod);
module_dwarf_cleanup(mod);
}
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index e0bc186..1c355c5 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -239,11 +239,10 @@ int module_finalize(const Elf_Ehdr *hdr,
apply_paravirt(pseg, pseg + para->sh_size);
}
- return module_bug_finalize(hdr, sechdrs, me);
+ return 0;
}
void module_arch_cleanup(struct module *mod)
{
alternatives_smp_module_del(mod);
- module_bug_cleanup(mod);
}
diff --git a/include/linux/module.h b/include/linux/module.h
index 8a6b9fd..aace066 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -686,17 +686,16 @@ extern int module_sysfs_initialized;
#ifdef CONFIG_GENERIC_BUG
-int module_bug_finalize(const Elf_Ehdr *, const Elf_Shdr *,
+void module_bug_finalize(const Elf_Ehdr *, const Elf_Shdr *,
struct module *);
void module_bug_cleanup(struct module *);
#else /* !CONFIG_GENERIC_BUG */
-static inline int module_bug_finalize(const Elf_Ehdr *hdr,
+static inline void module_bug_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs,
struct module *mod)
{
- return 0;
}
static inline void module_bug_cleanup(struct module *mod) {}
#endif /* CONFIG_GENERIC_BUG */
diff --git a/kernel/module.c b/kernel/module.c
index d0b5f8d..e2aeddd 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2625,6 +2625,7 @@ static struct module *load_module(void __user *umod,
if (err < 0)
goto ddebug;
+ module_bug_finalize(info.hdr, info.sechdrs, mod);
list_add_rcu(&mod->list, &modules);
mutex_unlock(&module_mutex);
@@ -2650,6 +2651,8 @@ static struct module *load_module(void __user *umod,
mutex_lock(&module_mutex);
/* Unlink carefully: kallsyms could be walking list. */
list_del_rcu(&mod->list);
+ module_bug_cleanup(mod);
+
ddebug:
if (!mod->taints)
dynamic_debug_remove(info.debug);
diff --git a/lib/bug.c b/lib/bug.c
index 7cdfad8..1955209 100644
--- a/lib/bug.c
+++ b/lib/bug.c
@@ -72,8 +72,8 @@ static const struct bug_entry *module_find_bug(unsigned long bugaddr)
return NULL;
}
-int module_bug_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
- struct module *mod)
+void module_bug_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
+ struct module *mod)
{
char *secstrings;
unsigned int i;
@@ -97,8 +97,6 @@ int module_bug_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
* could potentially lead to deadlock and thus be counter-productive.
*/
list_add(&mod->bug_list, &module_bug_list);
-
- return 0;
}
void module_bug_cleanup(struct module *mod)
* Re: [BUG 2.6.36-rc6] list corruption in module_bug_finalize
2010-10-04 23:55 ` Linus Torvalds
@ 2010-10-05 1:11 ` Linus Torvalds
0 siblings, 0 replies; 14+ messages in thread
From: Linus Torvalds @ 2010-10-05 1:11 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Arnd Bergmann, LKML, Rusty Russell, Kay Sievers, Brandon Philips
On Mon, Oct 4, 2010 at 4:55 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> It would seem to make most sense to do the module_bug_finalize() in
> the same location where we add the module to the list of modules.
>
> IOW, a patch like the attached. UNTESTED!
That patch is still untested, but reading through it once more I
noticed that I forgot to add the module_bug_cleanup() call to the
module exit path.
So here's a trivially fixed version that has the module unload doing
the bug cleanup too.
I also suspect that the whole 'module_bug_{finalize,cleanup}()' thing
should probably be moved to kernel/module.c, but that's a separate
issue entirely.
Linus
[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 6659 bytes --]
arch/avr32/kernel/module.c | 3 +--
arch/h8300/kernel/module.c | 3 +--
arch/mn10300/kernel/module.c | 3 +--
arch/parisc/kernel/module.c | 3 +--
arch/powerpc/kernel/module.c | 5 -----
arch/s390/kernel/module.c | 3 +--
arch/sh/kernel/module.c | 2 --
arch/x86/kernel/module.c | 3 +--
include/linux/module.h | 5 ++---
kernel/module.c | 4 ++++
lib/bug.c | 6 ++----
11 files changed, 14 insertions(+), 26 deletions(-)
diff --git a/arch/avr32/kernel/module.c b/arch/avr32/kernel/module.c
index 98f94d0..a727f54 100644
--- a/arch/avr32/kernel/module.c
+++ b/arch/avr32/kernel/module.c
@@ -314,10 +314,9 @@ int module_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
vfree(module->arch.syminfo);
module->arch.syminfo = NULL;
- return module_bug_finalize(hdr, sechdrs, module);
+ return 0;
}
void module_arch_cleanup(struct module *module)
{
- module_bug_cleanup(module);
}
diff --git a/arch/h8300/kernel/module.c b/arch/h8300/kernel/module.c
index 0865e29..db4953d 100644
--- a/arch/h8300/kernel/module.c
+++ b/arch/h8300/kernel/module.c
@@ -112,10 +112,9 @@ int module_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs,
struct module *me)
{
- return module_bug_finalize(hdr, sechdrs, me);
+ return 0;
}
void module_arch_cleanup(struct module *mod)
{
- module_bug_cleanup(mod);
}
diff --git a/arch/mn10300/kernel/module.c b/arch/mn10300/kernel/module.c
index 6aea7fd..196a111 100644
--- a/arch/mn10300/kernel/module.c
+++ b/arch/mn10300/kernel/module.c
@@ -206,7 +206,7 @@ int module_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs,
struct module *me)
{
- return module_bug_finalize(hdr, sechdrs, me);
+ return 0;
}
/*
@@ -214,5 +214,4 @@ int module_finalize(const Elf_Ehdr *hdr,
*/
void module_arch_cleanup(struct module *mod)
{
- module_bug_cleanup(mod);
}
diff --git a/arch/parisc/kernel/module.c b/arch/parisc/kernel/module.c
index 159a2b8..6e81bb5 100644
--- a/arch/parisc/kernel/module.c
+++ b/arch/parisc/kernel/module.c
@@ -941,11 +941,10 @@ int module_finalize(const Elf_Ehdr *hdr,
nsyms = newptr - (Elf_Sym *)symhdr->sh_addr;
DEBUGP("NEW num_symtab %lu\n", nsyms);
symhdr->sh_size = nsyms * sizeof(Elf_Sym);
- return module_bug_finalize(hdr, sechdrs, me);
+ return 0;
}
void module_arch_cleanup(struct module *mod)
{
deregister_unwind_table(mod);
- module_bug_cleanup(mod);
}
diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index 477c663..4ef93ae 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -65,10 +65,6 @@ int module_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sect;
int err;
- err = module_bug_finalize(hdr, sechdrs, me);
- if (err)
- return err;
-
/* Apply feature fixups */
sect = find_section(hdr, sechdrs, "__ftr_fixup");
if (sect != NULL)
@@ -101,5 +97,4 @@ int module_finalize(const Elf_Ehdr *hdr,
void module_arch_cleanup(struct module *mod)
{
- module_bug_cleanup(mod);
}
diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
index 22cfd63..f7167ee 100644
--- a/arch/s390/kernel/module.c
+++ b/arch/s390/kernel/module.c
@@ -407,10 +407,9 @@ int module_finalize(const Elf_Ehdr *hdr,
{
vfree(me->arch.syminfo);
me->arch.syminfo = NULL;
- return module_bug_finalize(hdr, sechdrs, me);
+ return 0;
}
void module_arch_cleanup(struct module *mod)
{
- module_bug_cleanup(mod);
}
diff --git a/arch/sh/kernel/module.c b/arch/sh/kernel/module.c
index 43adddf..ae0be69 100644
--- a/arch/sh/kernel/module.c
+++ b/arch/sh/kernel/module.c
@@ -149,13 +149,11 @@ int module_finalize(const Elf_Ehdr *hdr,
int ret = 0;
ret |= module_dwarf_finalize(hdr, sechdrs, me);
- ret |= module_bug_finalize(hdr, sechdrs, me);
return ret;
}
void module_arch_cleanup(struct module *mod)
{
- module_bug_cleanup(mod);
module_dwarf_cleanup(mod);
}
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index e0bc186..1c355c5 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -239,11 +239,10 @@ int module_finalize(const Elf_Ehdr *hdr,
apply_paravirt(pseg, pseg + para->sh_size);
}
- return module_bug_finalize(hdr, sechdrs, me);
+ return 0;
}
void module_arch_cleanup(struct module *mod)
{
alternatives_smp_module_del(mod);
- module_bug_cleanup(mod);
}
diff --git a/include/linux/module.h b/include/linux/module.h
index 8a6b9fd..aace066 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -686,17 +686,16 @@ extern int module_sysfs_initialized;
#ifdef CONFIG_GENERIC_BUG
-int module_bug_finalize(const Elf_Ehdr *, const Elf_Shdr *,
+void module_bug_finalize(const Elf_Ehdr *, const Elf_Shdr *,
struct module *);
void module_bug_cleanup(struct module *);
#else /* !CONFIG_GENERIC_BUG */
-static inline int module_bug_finalize(const Elf_Ehdr *hdr,
+static inline void module_bug_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs,
struct module *mod)
{
- return 0;
}
static inline void module_bug_cleanup(struct module *mod) {}
#endif /* CONFIG_GENERIC_BUG */
diff --git a/kernel/module.c b/kernel/module.c
index d0b5f8d..ccd6419 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -1537,6 +1537,7 @@ static int __unlink_module(void *_mod)
{
struct module *mod = _mod;
list_del(&mod->list);
+ module_bug_cleanup(mod);
return 0;
}
@@ -2625,6 +2626,7 @@ static struct module *load_module(void __user *umod,
if (err < 0)
goto ddebug;
+ module_bug_finalize(info.hdr, info.sechdrs, mod);
list_add_rcu(&mod->list, &modules);
mutex_unlock(&module_mutex);
@@ -2650,6 +2652,8 @@ static struct module *load_module(void __user *umod,
mutex_lock(&module_mutex);
/* Unlink carefully: kallsyms could be walking list. */
list_del_rcu(&mod->list);
+ module_bug_cleanup(mod);
+
ddebug:
if (!mod->taints)
dynamic_debug_remove(info.debug);
diff --git a/lib/bug.c b/lib/bug.c
index 7cdfad8..1955209 100644
--- a/lib/bug.c
+++ b/lib/bug.c
@@ -72,8 +72,8 @@ static const struct bug_entry *module_find_bug(unsigned long bugaddr)
return NULL;
}
-int module_bug_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
- struct module *mod)
+void module_bug_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
+ struct module *mod)
{
char *secstrings;
unsigned int i;
@@ -97,8 +97,6 @@ int module_bug_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
* could potentially lead to deadlock and thus be counter-productive.
*/
list_add(&mod->bug_list, &module_bug_list);
-
- return 0;
}
void module_bug_cleanup(struct module *mod)
* Re: [BUG 2.6.36-rc6] list corruption in module_bug_finalize
2010-10-03 19:51 [BUG 2.6.36-rc6] list corruption in module_bug_finalize Thomas Gleixner
2010-10-04 11:00 ` Arnd Bergmann
@ 2010-10-05 4:18 ` Rusty Russell
2010-10-05 11:08 ` Adrian Bunk
1 sibling, 1 reply; 14+ messages in thread
From: Rusty Russell @ 2010-10-05 4:18 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Arnd Bergmann, Linus Torvalds, Jeremy Fitzhardinge,
Adrian Bunk
On Mon, 4 Oct 2010 06:21:08 am Thomas Gleixner wrote:
> Current mainline triggers a list corruption bug in
> module_bug_finalize(). dmesg excerpt below.
>
> The corresponding code says:
>
> /*
> * Strictly speaking this should have a spinlock to protect against
> * traversals, but since we only traverse on BUG()s, a spinlock
> * could potentially lead to deadlock and thus be counter-productive.
> */
> list_add(&mod->bug_list, &module_bug_list);
>
> I can see the traversal problem vs. BUG(), but what's protecting the
> list_add() ? BKL probably did, but is that true anymore ?
I've never even *seen* this code before :(
Looks like it went through Adrian Bunk to Andrew, but despite the fact that
it (foolishly) doesn't touch kernel/module.c, it's generic code and I should
have seen it. It did change the linux/module.h header.
So, it used to be protected by module_mutex, but Linus and I cleaned that up.
So, we need a lock around this list for adding and removal. I'd use
list_add_rcu to try to help the lockless traversal too...
And moving it from all the archs into kernel/module.c would be a nice bonus.
Nice catch!
Rusty.
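A rough userspace sketch of the combination suggested above: a dedicated lock serialises writers, and new entries are published with release ordering so that a reader walking the list without the lock never sees a half-initialised node. This uses C11 atomics in place of the kernel's RCU list helpers, and `bug_node`, `bug_list_lock` and the three functions are hypothetical names chosen for the example. Real RCU removal would additionally wait for a grace period before the node is freed or reused, which this sketch does not model.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

struct bug_node {
	unsigned long start, end;	/* address range covered (illustrative) */
	_Atomic(struct bug_node *) next;
};

/* Dedicated lock: taken by writers only, never by the lookup path. */
static pthread_mutex_t bug_list_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic(struct bug_node *) bug_list_head;

static void bug_list_add(struct bug_node *node)
{
	assert(node != NULL);
	pthread_mutex_lock(&bug_list_lock);
	/* Fully initialise the node first ... */
	atomic_store_explicit(&node->next,
		atomic_load_explicit(&bug_list_head, memory_order_relaxed),
		memory_order_relaxed);
	/* ... then publish it; release orders the stores above before this. */
	atomic_store_explicit(&bug_list_head, node, memory_order_release);
	pthread_mutex_unlock(&bug_list_lock);
}

static void bug_list_del(struct bug_node *node)
{
	_Atomic(struct bug_node *) *link = &bug_list_head;
	struct bug_node *cur;

	pthread_mutex_lock(&bug_list_lock);
	while ((cur = atomic_load_explicit(link, memory_order_relaxed)) != NULL) {
		if (cur == node) {
			/* Unlink; node->next stays intact for readers in flight. */
			atomic_store_explicit(link,
				atomic_load_explicit(&cur->next, memory_order_relaxed),
				memory_order_release);
			break;
		}
		link = &cur->next;
	}
	pthread_mutex_unlock(&bug_list_lock);
}

/* Lockless lookup, as a BUG() handler would need: no lock, no blocking. */
static struct bug_node *bug_list_find(unsigned long addr)
{
	struct bug_node *cur;

	for (cur = atomic_load_explicit(&bug_list_head, memory_order_acquire);
	     cur != NULL;
	     cur = atomic_load_explicit(&cur->next, memory_order_acquire))
		if (addr >= cur->start && addr < cur->end)
			return cur;
	return NULL;
}

int bug_list_demo(void)
{
	static struct bug_node a = { .start = 0x1000, .end = 0x2000 };
	static struct bug_node b = { .start = 0x3000, .end = 0x4000 };
	int hits = 0;

	bug_list_add(&a);
	bug_list_add(&b);
	hits += (bug_list_find(0x1800) == &a);
	hits += (bug_list_find(0x3000) == &b);
	hits += (bug_list_find(0x2800) == NULL);

	bug_list_del(&a);
	hits += (bug_list_find(0x1800) == NULL);
	hits += (bug_list_find(0x3fff) == &b);
	return hits;		/* number of lookups that behaved as expected */
}
```

Keeping the lookup lock-free preserves the property the original comment cared about: a BUG() raised while a writer holds the lock cannot deadlock on the list.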
* Re: [BUG 2.6.36-rc6] list corruption in module_bug_finalize
2010-10-04 22:43 ` Thomas Gleixner
2010-10-04 23:55 ` Linus Torvalds
@ 2010-10-05 5:14 ` Rusty Russell
2010-10-05 7:30 ` Thomas Gleixner
1 sibling, 1 reply; 14+ messages in thread
From: Rusty Russell @ 2010-10-05 5:14 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Arnd Bergmann, LKML, Kay Sievers, Brandon Philips, Linus Torvalds
On Tue, 5 Oct 2010 09:13:38 am Thomas Gleixner wrote:
> The patch below cures it.
Using module_mutex here is just lazy... Here's 5c, go buy your own lock :)
> /*
> - * Strictly speaking this should have a spinlock to protect against
> - * traversals, but since we only traverse on BUG()s, a spinlock
> - * could potentially lead to deadlock and thus be counter-productive.
> + * We need to take module_mutex here to protect the list add, though
> + * it won't protect against a concurrent BUG().
> */
> + mutex_lock(&module_mutex);
> list_add(&mod->bug_list, &module_bug_list);
> + mutex_unlock(&module_mutex);
>
> return 0;
> }
>
> void module_bug_cleanup(struct module *mod)
> {
> + mutex_lock(&module_mutex);
> list_del(&mod->bug_list);
> + mutex_unlock(&module_mutex);
> }
>
> #else
>
* Re: [BUG 2.6.36-rc6] list corruption in module_bug_finalize
2010-10-05 5:14 ` Rusty Russell
@ 2010-10-05 7:30 ` Thomas Gleixner
2010-10-05 15:34 ` Linus Torvalds
0 siblings, 1 reply; 14+ messages in thread
From: Thomas Gleixner @ 2010-10-05 7:30 UTC (permalink / raw)
To: Rusty Russell
Cc: Arnd Bergmann, LKML, Kay Sievers, Brandon Philips, Linus Torvalds
On Tue, 5 Oct 2010, Rusty Russell wrote:
> On Tue, 5 Oct 2010 09:13:38 am Thomas Gleixner wrote:
> > The patch below cures it.
>
> Using module_mutex here is just lazy... Here's 5c, go buy your own lock :)
I'm lazy. :) My evil plan of sending a crap patch so it gets replaced
by a nice one worked well :)
Thanks,
tglx
* Re: [BUG 2.6.36-rc6] list corruption in module_bug_finalize
2010-10-05 4:18 ` Rusty Russell
@ 2010-10-05 11:08 ` Adrian Bunk
0 siblings, 0 replies; 14+ messages in thread
From: Adrian Bunk @ 2010-10-05 11:08 UTC (permalink / raw)
To: Rusty Russell
Cc: Thomas Gleixner, LKML, Arnd Bergmann, Linus Torvalds,
Jeremy Fitzhardinge, Andrew Morton
On Tue, Oct 05, 2010 at 02:48:34PM +1030, Rusty Russell wrote:
> On Mon, 4 Oct 2010 06:21:08 am Thomas Gleixner wrote:
> > Current mainline triggers a list corruption bug in
> > module_bug_finalize(). dmesg excerpt below.
> >
> > The corresponding code says:
> >
> > /*
> > * Strictly speaking this should have a spinlock to protect against
> > * traversals, but since we only traverse on BUG()s, a spinlock
> > * could potentially lead to deadlock and thus be counter-productive.
> > */
> > list_add(&mod->bug_list, &module_bug_list);
> >
> > I can see the traversal problem vs. BUG(), but what's protecting the
> > list_add() ? BKL probably did, but is that true anymore ?
>
> I've never even *seen* this code before :(
>
> Looks like it went through Adrian Bunk to Andrew,
>...
[bunk@stusta.de: include/linux/bug.h must always #include <linux/module.h>]
Signed-off-by: Adrian Bunk <bunk@stusta.de>
The commit did not go through me, and I never reviewed or forwarded it.
My Signed-off-by: was for the change I sent against the original patch,
and it was added to the commit when Andrew included my change into the
original patch.
> but despite the fact that
> it (foolishly) doesn't touch kernel/module.c, it's generic code and I should
> have seen it. It did change the linux/module.h header.
>...
The commit says
Cc: Rusty Russell <rusty@rustcorp.com.au>
When Andrew submitted it to Linus that should have resulted in an email
to you by the script Andrew uses for submitting patches.
And according to my mail archives that did happen:
Message-Id: <200612081036.kB8AaJDK016473@shell0.pdx.osdl.net>
Subject: [patch 027/368] Generic BUG implementation
To: torvalds@osdl.org
Cc: akpm@osdl.org,
jeremy@goop.org,
ak@muc.de,
benh@kernel.crashing.org,
bunk@stusta.de,
hugh@veritas.com,
michael@ellerman.id.au,
paulus@samba.org,
rusty@rustcorp.com.au
From: akpm@osdl.org
Date: Fri, 08 Dec 2006 02:36:19 -0800
> Nice catch!
> Rusty.
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
* Re: [BUG 2.6.36-rc6] list corruption in module_bug_finalize
2010-10-05 7:30 ` Thomas Gleixner
@ 2010-10-05 15:34 ` Linus Torvalds
2010-10-05 16:40 ` Thomas Gleixner
0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2010-10-05 15:34 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Rusty Russell, Arnd Bergmann, LKML, Kay Sievers, Brandon Philips
On Tue, Oct 5, 2010 at 12:30 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Tue, 5 Oct 2010, Rusty Russell wrote:
>
>> On Tue, 5 Oct 2010 09:13:38 am Thomas Gleixner wrote:
>> > The patch below cures it.
>>
>> Using module_mutex here is just lazy... Here's 5c, go buy your own lock :)
>
> I'm lazy. :) My evil plan of sending a crap patch so it gets replaced
> by a nice one worked well :)
Can you test the one I sent out (the second one with the trivial
module unload fix)? My own testing would be pretty pointless, since I
don't use modules myself (I've compile-tested it, and it all looks
sane, but...)
Linus
* Re: [BUG 2.6.36-rc6] list corruption in module_bug_finalize
2010-10-05 15:34 ` Linus Torvalds
@ 2010-10-05 16:40 ` Thomas Gleixner
2010-10-05 17:17 ` Linus Torvalds
0 siblings, 1 reply; 14+ messages in thread
From: Thomas Gleixner @ 2010-10-05 16:40 UTC (permalink / raw)
To: Linus Torvalds
Cc: Rusty Russell, Arnd Bergmann, LKML, Kay Sievers, Brandon Philips
Linus,
On Tue, 5 Oct 2010, Linus Torvalds wrote:
> On Tue, Oct 5, 2010 at 12:30 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Tue, 5 Oct 2010, Rusty Russell wrote:
> >
> >> On Tue, 5 Oct 2010 09:13:38 am Thomas Gleixner wrote:
> >> > The patch below cures it.
> >>
> >> Using module_mutex here is just lazy... Here's 5c, go buy your own lock :)
> >
> > I'm lazy. :) My evil plan of sending a crap patch so it gets replaced
> > by a nice one worked well :)
>
> Can you test the one I sent out (the second one with the trivial
> module unload fix)? My own testing would be pretty pointless, since I
> don't use modules myself (I've compile-tested it, and it all looks
> sane, but...)
Hmm, with this patch the corruption triggers at every boot (5 of
5). Without it it's just happening randomly (1 of 10)
Digging further. Dammit, I fear my evil plan fires back now due to
some even more lazy person who doesn't use modules. :)
Thanks,
tglx
------------[ cut here ]------------
WARNING: at /home/tglx/work/kernel/rt-new/linux-2.6-tip/lib/list_debug.c:26 __list_add+0x3f/0x83()
Hardware name:
list_add corruption. next->prev should be prev (ffffffff81a4c460), but was 0f000000a8838948. (next=ffffffffa00091a8).
Modules linked in: firewire_ohci floppy sata_sil firewire_core crc_itu_t radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
Pid: 421, comm: modprobe Not tainted 2.6.36-rc5+ #96
Call Trace:
[<ffffffff81049c25>] warn_slowpath_common+0x85/0x9d
[<ffffffff81049ce0>] warn_slowpath_fmt+0x46/0x48
[<ffffffff810795b1>] ? verify_export_symbols+0x16/0x126
[<ffffffff811fb820>] __list_add+0x3f/0x83
[<ffffffff811edb03>] module_bug_finalize+0xb9/0xca
[<ffffffff8107abac>] load_module+0x1038/0x1798
[<ffffffff8107b356>] sys_init_module+0x4a/0x1e0
[<ffffffff81009cd2>] system_call_fastpath+0x16/0x1b
---[ end trace f5f118a264676de3 ]---
calling raid_init+0x0/0x12 [raid1] @ 421
md: raid1 personality registered for level 1
initcall raid_init+0x0/0x12 [raid1] returned 0 after 1 usecs
invalid opcode: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/dev
CPU 1
Modules linked in: raid1 firewire_ohci floppy sata_sil firewire_core crc_itu_t radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
Pid: 409, comm: mdadm Tainted: G W 2.6.36-rc5+ #96 D975XBX/
RIP: 0010:[<ffffffffa00091b1>] [<ffffffffa00091b1>] setup_conf+0xc2/0x294 [raid1]
RSP: 0018:ffff880078fe3c40 EFLAGS: 00010282
RAX: ffff880078fe3c58 RBX: ffff88007844a300 RCX: ffff880078702d40
RDX: 0000000000000100 RSI: ffff880078a3e800 RDI: 00000000000000ff
RBP: ffff880078fe3c58 R08: 00000000000080d0 R09: ffff880001e20940
R10: ffff88007fbb9c00 R11: 0000000000000060 R12: ffff880078b81000
R13: 00000000fffffff4 R14: ffff880078b81018 R15: ffff880078fe3ca8
FS: 00007f532de28700(0000) GS:ffff880002080000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000041a078 CR3: 00000000784c2000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process mdadm (pid: 409, threadinfo ffff880078fe2000, task ffff880078ab0080)
Stack:
ffff880078b81000 0000000000000000 ffff880078b81018 ffff880078fe3c88
<0> ffffffffa000c00b ffff880078b81000 ffff880078b81018 ffff880078b81018
<0> ffff880078b81018 ffff880078fe3d28 ffffffff8133ca1b 0000000000000000
Call Trace:
[<ffffffffa000c00b>] run+0x89/0x248 [raid1]
[<ffffffff8133ca1b>] md_run+0x57c/0x84a
[<ffffffff8133ccfd>] do_md_run+0x14/0x67
[<ffffffff8133ea62>] md_ioctl+0xdf9/0x1074
[<ffffffff811bc355>] ? inode_has_perm+0x7a/0x90
[<ffffffff811bc682>] ? dentry_has_perm+0x5a/0x70
[<ffffffff811e33a4>] __blkdev_driver_ioctl+0x28/0x2a
[<ffffffff811e3c29>] blkdev_ioctl+0x5bd/0x5fc
[<ffffffff811bc40f>] ? file_has_perm+0xa4/0xc6
[<ffffffff8111886c>] block_ioctl+0x37/0x3b
[<ffffffff810fe119>] do_vfs_ioctl+0x4b9/0x508
[<ffffffff810fe1be>] sys_ioctl+0x56/0x79
[<ffffffff81009cd2>] system_call_fastpath+0x16/0x1b
Code: 24 e0 00 00 00 48 c7 c6 54 b1 00 a0 bf 00 01 00 00 89 50 08 48 8b 8b 98 00 00 00 48 c7 c2 31 90 00 a0 e8 34 f5 0a e1 48 85 c0 58 <d4> 00 a0 ff ff ff ff 84 b2 01 00 00 48 8b 83 98 00 00 00 49 8d
RIP [<ffffffffa00091b1>] setup_conf+0xc2/0x294 [raid1]
RSP <ffff880078fe3c40>
---[ end trace f5f118a264676de4 ]---
Segmentation fault
calling wait_scan_init+0x0/0x12 [scsi_wait_scan] @ 435
initcall wait_scan_init+0x0/0x12 [scsi_wait_scan] returned 0 after 0 usecs
dracut: Autoassembling MD Raid
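For readers unfamiliar with the list debugging check that produced the warning above, here is a minimal userspace sketch of the idea. It is not the kernel's lib/list_debug.c: the names `checked_list_add` and `demo_stale_entry` are made up for illustration, and unlike the kernel, which warns and carries on, this sketch refuses the insert when the check fails. It assumes the stale entry belongs to an object that was freed without being removed from the list, which is one plausible reading of the trace and not something the log alone proves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same shape as the kernel's doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/*
 * Insert 'new' between 'prev' and 'next', but first verify that the two
 * neighbours still point at each other. Returns the number of
 * inconsistencies found; the insert only happens when that number is 0.
 */
static int checked_list_add(struct list_head *new,
			    struct list_head *prev,
			    struct list_head *next)
{
	int bad = 0;

	if (next->prev != prev) {
		fprintf(stderr, "list_add corruption: next->prev should be prev\n");
		bad++;
	}
	if (prev->next != next) {
		fprintf(stderr, "list_add corruption: prev->next should be next\n");
		bad++;
	}
	if (bad)
		return bad;

	next->prev = new;
	new->next = next;
	new->prev = prev;
	prev->next = new;
	return 0;
}

/*
 * Hypothetical scenario: 'stale' is linked in, its memory is then reused
 * without a list_del(), and the next insert trips over the garbage.
 */
static int demo_stale_entry(void)
{
	struct list_head head, stale, fresh;

	init_list_head(&head);
	if (checked_list_add(&stale, &head, head.next) != 0)
		return -1;

	/* Simulate the owner going away and its memory being overwritten. */
	stale.next = NULL;
	stale.prev = NULL;

	/* head.next still points at 'stale', whose prev is no longer &head. */
	return checked_list_add(&fresh, &head, head.next);
}
```

Calling `demo_stale_entry()` reports one inconsistency on the second insert, the same "next->prev should be prev" condition as in the warning.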
* Re: [BUG 2.6.36-rc6] list corruption in module_bug_finalize
2010-10-05 16:40 ` Thomas Gleixner
@ 2010-10-05 17:17 ` Linus Torvalds
2010-10-05 17:43 ` Thomas Gleixner
0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2010-10-05 17:17 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Rusty Russell, Arnd Bergmann, LKML, Kay Sievers, Brandon Philips
On Tue, Oct 5, 2010 at 9:40 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Hmm, with this patch the corruption triggers at every boot (5 of
> 5). Without it it's just happening randomly (1 of 10)
>
> Digging further. Dammit, I fear my evil plan fires back now due to
> some even more lazy person who doesn't use modules. :)
Are you sure you used the second version? The first version missed the
cleanup at module unload time, and would have caused the symptoms you
see.
I just tried my own machine with modules (and list debugging), and it
seemed fine. And it should now all happen under the module mutex, so I
don't see how it would be timing-sensitive.
But hey, maybe I'm missing something really silly.
Linus
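The two points made in this message (do the list update under a lock, and undo it at unload time) can be sketched in userspace as follows. This is only an illustration under stated assumptions, not the patch discussed in the thread, which is not shown here: `fake_module`, `bug_list_lock`, `fake_bug_finalize`, `fake_bug_cleanup` and `fake_bug_list_len` are hypothetical names, and a pthread mutex stands in for the module mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical stand-in for struct module; only the list hook matters. */
struct fake_module {
	const char *name;
	struct list_head bug_list;
};

/* Empty list: the head points at itself. */
static struct list_head bug_list_head = { &bug_list_head, &bug_list_head };

/* Stand-in for the module mutex; every list update takes it. */
static pthread_mutex_t bug_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Load path: link the module in right after the head, under the lock. */
static void fake_bug_finalize(struct fake_module *mod)
{
	pthread_mutex_lock(&bug_list_lock);
	mod->bug_list.next = bug_list_head.next;
	mod->bug_list.prev = &bug_list_head;
	bug_list_head.next->prev = &mod->bug_list;
	bug_list_head.next = &mod->bug_list;
	pthread_mutex_unlock(&bug_list_lock);
}

/*
 * Unload path: unlink before the module memory goes away. Skipping this
 * step leaves a dangling entry on the list.
 */
static void fake_bug_cleanup(struct fake_module *mod)
{
	pthread_mutex_lock(&bug_list_lock);
	mod->bug_list.prev->next = mod->bug_list.next;
	mod->bug_list.next->prev = mod->bug_list.prev;
	mod->bug_list.next = NULL;
	mod->bug_list.prev = NULL;
	pthread_mutex_unlock(&bug_list_lock);
}

/* Count entries, also under the lock. */
static int fake_bug_list_len(void)
{
	const struct list_head *p;
	int n = 0;

	pthread_mutex_lock(&bug_list_lock);
	for (p = bug_list_head.next; p != &bug_list_head; p = p->next)
		n++;
	pthread_mutex_unlock(&bug_list_lock);
	return n;
}
```

With add and remove serialized by the same lock, two concurrent loaders can no longer interleave their pointer updates, and the explicit cleanup keeps the list from pointing into freed memory.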
* Re: [BUG 2.6.36-rc6] list corruption in module_bug_finalize
2010-10-05 17:17 ` Linus Torvalds
@ 2010-10-05 17:43 ` Thomas Gleixner
2010-10-06 9:10 ` Rusty Russell
0 siblings, 1 reply; 14+ messages in thread
From: Thomas Gleixner @ 2010-10-05 17:43 UTC (permalink / raw)
To: Linus Torvalds
Cc: Rusty Russell, Arnd Bergmann, LKML, Kay Sievers, Brandon Philips
On Tue, 5 Oct 2010, Linus Torvalds wrote:
> On Tue, Oct 5, 2010 at 9:40 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > Hmm, with this patch the corruption triggers at every boot (5 of
> > 5). Without it it's just happening randomly (1 of 10)
> >
> > Digging further. Dammit, I fear my evil plan fires back now due to
> > some even more lazy person who doesn't use modules. :)
>
> Are you sure you used the second version? The first version missed the
> cleanup at module unload time, and would have caused the symptoms you
> see.
Crap, yes. Stupid me managed to pick the wrong one though I'm sure I
double checked.
> I just tried my own machine with modules (and list debugging), and it
> seemed fine. And it should now all happen under the module mutex, so I
> don't see how it would be timing-sensitive.
>
> But hey, maybe I'm missing something really silly.
Nah. Works fine. Sorry for making you build modules :)
Tested-by: Thomas Gleixner <tglx@linutronix.de>
* Re: [BUG 2.6.36-rc6] list corruption in module_bug_finalize
2010-10-05 17:43 ` Thomas Gleixner
@ 2010-10-06 9:10 ` Rusty Russell
0 siblings, 0 replies; 14+ messages in thread
From: Rusty Russell @ 2010-10-06 9:10 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Linus Torvalds, Arnd Bergmann, LKML, Kay Sievers, Brandon Philips
On Wed, 6 Oct 2010 04:13:33 am Thomas Gleixner wrote:
> On Tue, 5 Oct 2010, Linus Torvalds wrote:
> > But hey, maybe I'm missing something really silly.
>
> Nah. Works fine. Sorry for making you build modules :)
>
> Tested-by: Thomas Gleixner <tglx@linutronix.de>
It's also much saner. Thanks Linus!
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty.
Thread overview: 14+ messages
2010-10-03 19:51 [BUG 2.6.36-rc6] list corruption in module_bug_finalize Thomas Gleixner
2010-10-04 11:00 ` Arnd Bergmann
2010-10-04 22:43 ` Thomas Gleixner
2010-10-04 23:55 ` Linus Torvalds
2010-10-05 1:11 ` Linus Torvalds
2010-10-05 5:14 ` Rusty Russell
2010-10-05 7:30 ` Thomas Gleixner
2010-10-05 15:34 ` Linus Torvalds
2010-10-05 16:40 ` Thomas Gleixner
2010-10-05 17:17 ` Linus Torvalds
2010-10-05 17:43 ` Thomas Gleixner
2010-10-06 9:10 ` Rusty Russell
2010-10-05 4:18 ` Rusty Russell
2010-10-05 11:08 ` Adrian Bunk