From: Ingo Molnar <mingo@elte.hu>
To: Adrian Bunk <bunk@stusta.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, Greg Kroah-Hartman <gregkh@suse.de>
Subject: [bug] hung bootup in various drivers, was: "2.6.21-rc5: known regressions"
Date: Fri, 30 Mar 2007 14:04:16 +0200
Message-ID: <20070330120416.GA19373@elte.hu>
In-Reply-To: <20070327015929.GY16477@stusta.de>

i just found a new category of driver regressions in 2.6.21, doing
allyesconfig bzImage bootup tests: the init methods of various drivers
hang in driver_unregister().

It is caused by this problem: the semantics of driver_unregister() [also
implicitly called in pci_unregister_driver()] have apparently changed
recently. If a driver does:

	pci_register_driver(&my_driver);
	...
	if (some_failure) {
		pci_unregister_driver(&my_driver);
		...
	}

it will hang the bootup in the following piece of code:

drivers/base/driver.c:

 void driver_unregister(struct device_driver * drv)
 {
	bus_remove_driver(drv);
	wait_for_completion(&drv->unloaded);

the completion is never done - because nobody removes the bus while the
init is still happening, obviously. (and bootup is serialized anyway)
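
To make the blocking behaviour concrete, here is a minimal user-space sketch of a completion, assuming a single-threaded caller that stands in for the serialized bootup. It is not kernel code: every model_* name is hypothetical, and the bounded poll loop only replaces the real sleeping wait so that the demonstration cannot hang.

```c
/*
 * User-space, single-threaded model of a completion. This is NOT kernel
 * code; every model_* name is hypothetical and exists only for illustration.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_completion {
	unsigned int done;	/* number of pending "complete" events */
};

static void model_init_completion(struct model_completion *c)
{
	assert(c != NULL);
	c->done = 0;
}

/* Stand-in for complete(): record that the awaited event has happened. */
static void model_complete(struct model_completion *c)
{
	c->done++;
}

/*
 * Bounded stand-in for wait_for_completion(). The real call sleeps until
 * someone else signals the completion; here we poll at most max_polls
 * times so the demonstration itself cannot hang.
 * Returns true if the completion was consumed, false if we gave up.
 */
static bool model_wait_bounded(struct model_completion *c,
			       unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if (c->done > 0) {
			c->done--;
			return true;
		}
		/*
		 * In a serialized init sequence no other context runs at
		 * this point, so c->done cannot change between polls.
		 */
	}
	return false;
}
```

Calling model_wait_bounded() on a freshly initialized completion gives up no matter how large max_polls is, which is the model's version of the hang; it only succeeds if model_complete() was called beforehand.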

now, most drivers do the driver unregistration from their
module-cleanup function, so they are not affected by this problem. But if
you apply the debug patch attached further below, and do an allyesconfig
bzImage bootup, there are 3 hits already:
BUG: at drivers/base/driver.c:187 driver_unregister()
[<c0105ff9>] show_trace_log_lvl+0x19/0x2e
[<c01063e2>] show_trace+0x12/0x14
[<c01063f8>] dump_stack+0x14/0x16
[<c063f7e6>] driver_unregister+0x3d/0x43
[<c0488048>] pci_unregister_driver+0x10/0x5f
[<c1b5f7c7>] slgt_init+0x9b/0x1ca
[<c1b31a2d>] init+0x15d/0x2bd
[<c0105bc3>] kernel_thread_helper+0x7/0x10
BUG: at drivers/base/driver.c:187 driver_unregister()
[<c0105ff9>] show_trace_log_lvl+0x19/0x2e
[<c01063e2>] show_trace+0x12/0x14
[<c01063f8>] dump_stack+0x14/0x16
[<c063f7e6>] driver_unregister+0x3d/0x43
[<c0488048>] pci_unregister_driver+0x10/0x5f
[<c0619505>] init_ipmi_si+0x70a/0x738
[<c1b31a2d>] init+0x15d/0x2bd
[<c0105bc3>] kernel_thread_helper+0x7/0x10
BUG: at drivers/base/driver.c:187 driver_unregister()
[<c0105ff9>] show_trace_log_lvl+0x19/0x2e
[<c01063e2>] show_trace+0x12/0x14
[<c01063f8>] dump_stack+0x14/0x16
[<c063f7e6>] driver_unregister+0x3d/0x43
[<c0488048>] pci_unregister_driver+0x10/0x5f
[<c1b6d2d8>] tlan_probe+0x2dd/0x30e
[<c1b31a2d>] init+0x15d/0x2bd
[<c0105bc3>] kernel_thread_helper+0x7/0x10
possibly more could trigger. Each of these 3 places caused an actual
bootup hang on my testbox, so these are real regressions and need to be
fixed.
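
The failing pattern, combined with the warn-instead-of-wait idea of the debug patch, can be sketched in user space roughly as follows. This is an illustration under stated assumptions, not the driver core: all model_* names, the probe_fails flag and the hit counter are hypothetical, and unloaded_done merely mirrors the drv->unloaded.done field that the debug patch tests.

```c
/*
 * User-space model of the init error path. NOT kernel code: all model_*
 * names, the probe_fails flag and the hit counter are hypothetical.
 */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct model_driver {
	const char *name;
	bool registered;
	unsigned int unloaded_done;	/* models drv->unloaded.done */
};

static unsigned int model_warn_hits;	/* how often the check fired */

static int model_register_driver(struct model_driver *drv)
{
	assert(drv != NULL && drv->name != NULL);
	drv->registered = true;
	drv->unloaded_done = 0;
	return 0;
}

/*
 * Debug variant of the unregister step: where the original code would
 * wait for the completion, only report that the wait would have blocked.
 */
static void model_unregister_driver(struct model_driver *drv)
{
	drv->registered = false;
	if (!drv->unloaded_done) {
		model_warn_hits++;
		fprintf(stderr, "model: %s would hang in unregister\n",
			drv->name);
	}
}

/* Init function that registers first and backs out on a later failure. */
static int model_driver_init(struct model_driver *drv, bool probe_fails)
{
	int err = model_register_driver(drv);

	if (err)
		return err;
	if (probe_fails) {
		model_unregister_driver(drv);	/* the call that hung at boot */
		return -ENODEV;
	}
	return 0;
}
```

Only the init call that takes the failure branch bumps model_warn_hits, which corresponds to one "BUG: at drivers/base/driver.c" hit per affected driver in the traces above.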

because there are a good number of drivers that do
pci_unregister_driver() from their init function, and because i cannot
see anything obviously wrong in doing an unregister call after a
failure, i think it's driver_unregister() that needs to be fixed. Greg,
what do you think?
Ingo
Index: linux/drivers/base/driver.c
===================================================================
--- linux.orig/drivers/base/driver.c
+++ linux/drivers/base/driver.c
@@ -183,7 +183,8 @@ int driver_register(struct device_driver
 void driver_unregister(struct device_driver * drv)
 {
 	bus_remove_driver(drv);
-	wait_for_completion(&drv->unloaded);
+	if (!drv->unloaded.done)
+		WARN_ON(1);
 }
 
 /**