* Top kernel oopses/warnings for the week of May 16th 2008
@ 2008-05-16 16:41 Arjan van de Ven
2008-05-16 17:14 ` Evgeniy Polyakov
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Arjan van de Ven @ 2008-05-16 16:41 UTC (permalink / raw)
To: Linux Kernel Mailing List
Cc: Linus Torvalds, NetDev, Andrew Morton, Jeff Garzik
The http://www.kerneloops.org website collects kernel oops and
warning reports from various mailing lists and bugzillas, as well as
from a client that users can install to auto-submit oopses.
Below is a top 10 list of the oopses collected in the last 7 days.
(Reports prior to 2.6.23 have been omitted in collecting the top 10)
This week, a total of 1617 oopses and warnings have been reported,
compared to 452 reports in the previous week. This sharp increase
is due to Fedora 9 being released, which includes the automatic
collection client.
Per file statistics
-------------------
743 kernel/sysctl.c
113 fs/buffer.c
76 fs/sysfs/dir.c
38 kernel/spinlock.c
22 fs/inotify.c
21 kernel/sysctl_check.c (P)
21 net/core/sock.c
18 fs/file_table.c
17 mm/page_alloc.c
15 lib/iomap.c
Bug of the week
---------------
Not yet in the top 10 (though only barely), but rising fast, is a bug that has a very
distinct pattern.
The backtraces are at http://www.kerneloops.org/searchweek.php?search=fput
The pattern is that the kernel gets an invalid pointer passed to fput(),
coming down from a select() system call done by the "wpa_supplicant" program.
The fact that it is ONLY wpa_supplicant implicates the wireless/network stack.
Another observation is that this only happens with 64-bit kernels, even though
a large portion of the users run 32-bit kernels. This suggests a 64-bit-specific
bug. It appears that the top 32 bits of the pointers are getting corrupted
(the bottom part at least looks valid).
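To make the "top 32 bits corrupted, bottom part valid" observation concrete, here is a minimal user-space sketch of how one might triage a suspect pointer value taken from a backtrace. It is not code from the kernel or from kerneloops.org. The helper names are hypothetical, and the kernel-address check assumes x86-64 with 48-bit virtual addresses, where kernel pointers have bits 63..47 all set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Upper and lower 32-bit halves of a 64-bit pointer value. */
static uint32_t upper32(uint64_t v) { return (uint32_t)(v >> 32); }
static uint32_t lower32(uint64_t v) { return (uint32_t)(v & 0xffffffffu); }

/*
 * Assumption: x86-64 with 48-bit virtual addresses. A canonical
 * kernel-half address then has bits 63..47 all set to 1.
 * (Hypothetical helper, named for illustration only.)
 */
static bool looks_like_kernel_pointer(uint64_t v)
{
    return (v >> 47) == 0x1ffffu;
}

/*
 * Heuristic for the pattern described above: the low half matches a
 * known-good pointer while the high half does not.
 */
static bool only_upper_half_differs(uint64_t suspect, uint64_t reference)
{
    return lower32(suspect) == lower32(reference) &&
           upper32(suspect) != upper32(reference);
}
```

With made-up values, 0xffff810012345678 would pass looks_like_kernel_pointer(), while 0x0000000112345678 would fail it and, compared against the first value, would satisfy only_upper_half_differs().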
Top 10 reported bugs
--------------------
Rank 1: __register_sysctl_paths
Reported 741 times (1254 total reports)
Duplicate /proc registration. Bugs in madwifi but also in the parport driver
This oops was last seen in version 2.6.25.3, and first seen in 2.6.25-rc3.
More info: http://www.kerneloops.org/searchweek.php?search=__register_sysctl_paths
Rank 2: mark_buffer_dirty
Reported 110 times (306 total reports)
EXT3 bug while hot-removing a USB device
This oops was last seen in version 2.6.25.3, and first seen in 2.6.24-rc6.
More info: http://www.kerneloops.org/searchweek.php?search=mark_buffer_dirty
Rank 3: sysfs_add_one
Reported 67 times (272 total reports)
Adding duplicate sysfs files... seems to be mostly around USB so all in GregKH's park
This oops was last seen in version 2.6.26-rc2, and first seen in 2.6.24-rc6.
More info: http://www.kerneloops.org/searchweek.php?search=sysfs_add_one
Rank 4: _spin_unlock_irqrestore
Reported 38 times (130 total reports)
Mostly "softlockups" coming out of idle. This could well be mostly hardware issues;
idle is the most harsh thing in terms of voltage/current swings.
This oops was last seen in version 2.6.25.3, and first seen in 2.6.22-rc1.
More info: http://www.kerneloops.org/searchweek.php?search=_spin_unlock_irqrestore
Rank 5: ieee80211_stop_tx_ba_session
Reported 36 times (56 total reports)
Seems to be caused by the 4965 driver
This oops was last seen in version 2.6.25.3, and first seen in 2.6.25-rc7-git6.
More info: http://www.kerneloops.org/searchweek.php?search=ieee80211_stop_tx_ba_session
Rank 6: nouveau_gpuobj_ref_del
Reported 22 times (40 total reports)
Bug in the out-of-tree nouveau driver
This oops was last seen in version 2.6.25, and first seen in 2.6.25-rc4.
More info: http://www.kerneloops.org/searchweek.php?search=nouveau_gpuobj_ref_del
Rank 7: set_dentry_child_flags
Reported 22 times (741 total reports)
Bug in the 2.6.24 inotify code that got exposed by KDE4 and got fixed in 2.6.25
This oops was last seen in version 2.6.24.4, and first seen in 2.6.24-rc8-git4.
More info: http://www.kerneloops.org/searchweek.php?search=set_dentry_child_flags
Rank 8: sysctl_check_lookup
Reported 21 times (239 total reports)
Bug in the proprietary madwifi driver
Oops only shows up in tainted kernels
This oops was last seen in version 2.6.24.5, and first seen in 2.6.24-rc5.
More info: http://www.kerneloops.org/searchweek.php?search=sysctl_check_lookup
Rank 9: sk_free
Reported 19 times (80 total reports)
VMWare driver bug
Oops only shows up in tainted kernels
This oops was last seen in version 2.6.25.3, and first seen in 2.6.23.9.
More info: http://www.kerneloops.org/searchweek.php?search=sk_free
Rank 10: __alloc_pages
Reported 16 times (31 total reports)
Sleeping allocation in interrupt context, some in netlink, some in the nv sata driver
This oops was last seen in version 2.6.25.3, and first seen in 2.6.18-rc1.
More info: http://www.kerneloops.org/searchweek.php?search=__alloc_pages
* Re: Top kernel oopses/warnings for the week of May 16th 2008
2008-05-16 16:41 Top kernel oopses/warnings for the week of May 16th 2008 Arjan van de Ven
@ 2008-05-16 17:14 ` Evgeniy Polyakov
2008-05-16 18:04 ` Adrian Bunk
2008-05-16 18:50 ` notification for systemtap-related oops Frank Ch. Eigler
2 siblings, 0 replies; 8+ messages in thread
From: Evgeniy Polyakov @ 2008-05-16 17:14 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Linux Kernel Mailing List, Linus Torvalds, NetDev, Andrew Morton,
Jeff Garzik, Francois Romieu
On Fri, May 16, 2008 at 09:41:31AM -0700, Arjan van de Ven (arjan@linux.intel.com) wrote:
> Rank 10: __alloc_pages
> Reported 16 times (31 total reports)
> Sleeping allocation in interrupt context, some in netlink, some in
> the nv sata driver
> This oops was last seen in version 2.6.25.3, and first seen in
> 2.6.18-rc1.
> More info:
> http://www.kerneloops.org/searchweek.php?search=__alloc_pages
A number of them, from the via-velocity driver, should be fixed by the attached
patch (added Francois Romieu <romieu@fr.zoreil.com> to copy), but
frankly that looks really bad. The allocations are protected by a lock that
is also used for interrupts, which is safe since the device is turned off,
but also for suspend (which can free them again, btw), mii register dump
(which will break without the lock) and something else, which should be fine
though because of rtnl. What we could do better is to allocate the new
rings in advance and only substitute pointers and write registers under
the lock. Francois?
diff --git a/drivers/net/via-velocity.c b/drivers/net/via-velocity.c
index 6b8d882..d6b7972 100644
--- a/drivers/net/via-velocity.c
+++ b/drivers/net/via-velocity.c
@@ -1251,7 +1251,7 @@ static int velocity_init_rd_ring(struct velocity_info *vptr)
 	vptr->rx_buf_sz = (mtu <= ETH_DATA_LEN) ? PKT_BUF_SZ : mtu + 32;
 	vptr->rd_info = kcalloc(vptr->options.numrx,
-			sizeof(struct velocity_rd_info), GFP_KERNEL);
+			sizeof(struct velocity_rd_info), GFP_ATOMIC);
 	if (!vptr->rd_info)
 		return -ENOMEM;
@@ -1324,7 +1324,7 @@ static int velocity_init_td_ring(struct velocity_info *vptr)
 		vptr->td_infos[j] = kcalloc(vptr->options.numtx,
 					sizeof(struct velocity_td_info),
-					GFP_KERNEL);
+					GFP_ATOMIC);
 		if (!vptr->td_infos[j]) {
 			while(--j >= 0)
 				kfree(vptr->td_infos[j]);
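
The alternative raised in the message above (allocate the new rings in advance, then only substitute pointers under the lock) could look roughly like the user-space sketch below. It is an illustration of the pattern only, not the via-velocity driver: struct ring_info, struct dev_state and replace_ring() are hypothetical names, a pthread mutex stands in for the driver's lock, and calloc()/free() stand in for kcalloc()/kfree().

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-descriptor bookkeeping; stands in for velocity_rd_info. */
struct ring_info {
    void *buf;
};

/* Hypothetical device state; the mutex stands in for the driver's lock. */
struct dev_state {
    pthread_mutex_t lock;
    struct ring_info *rd_info;  /* ring currently in use */
    size_t numrx;               /* number of entries in rd_info */
};

/*
 * Replace the ring with a freshly allocated one of new_numrx entries.
 * Returns 0 on success, -1 if the allocation fails (the old ring is kept).
 */
static int replace_ring(struct dev_state *dev, size_t new_numrx)
{
    struct ring_info *fresh, *old;

    /* Step 1: allocate before taking the lock, where sleeping is allowed. */
    fresh = calloc(new_numrx, sizeof(*fresh));
    if (!fresh)
        return -1;

    /* Step 2: only the pointer swap happens under the lock. A real
     * driver would also write the device registers here. */
    pthread_mutex_lock(&dev->lock);
    old = dev->rd_info;
    dev->rd_info = fresh;
    dev->numrx = new_numrx;
    pthread_mutex_unlock(&dev->lock);

    /* Step 3: release the old ring after dropping the lock. */
    free(old);
    return 0;
}
```

Because nothing is allocated while the lock is held, the allocation could keep using a sleeping allocator (GFP_KERNEL in kernel terms) instead of switching to GFP_ATOMIC.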
--
Evgeniy Polyakov
* Re: Top kernel oopses/warnings for the week of May 16th 2008
2008-05-16 16:41 Top kernel oopses/warnings for the week of May 16th 2008 Arjan van de Ven
2008-05-16 17:14 ` Evgeniy Polyakov
@ 2008-05-16 18:04 ` Adrian Bunk
2008-05-16 18:19 ` Arjan van de Ven
2008-05-20 3:53 ` Dave Jones
2008-05-16 18:50 ` notification for systemtap-related oops Frank Ch. Eigler
2 siblings, 2 replies; 8+ messages in thread
From: Adrian Bunk @ 2008-05-16 18:04 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Linux Kernel Mailing List, Linus Torvalds, NetDev, Andrew Morton,
Jeff Garzik
On Fri, May 16, 2008 at 09:41:31AM -0700, Arjan van de Ven wrote:
>...
> Bug of the week
> ---------------
> Not in the top 10 (but barely not so), but upcoming fast is a bug that has a very
> distinct pattern.
> The backtraces are at http://www.kerneloops.org/searchweek.php?search=fput
>
> The pattern is that the kernel gets an invalid pointer passed to fput(),
> coming down from a select() system call done by the "wpa_supplicant" program.
> The fact that it is ONLY wpa_supplicant implicates the wireless/network stack.
> Another observation is that this only happens with 64 bit kernels, even though
> a large portion of the users uses 32 bit kernels. This implies that this is a 64-bit
> type of bug. It appears that the top 32 bit of the pointers is getting corrupted
> (the bottom part at least looks valid).
>...
Unless I misunderstand your web interface, another pattern is a "fc9" in
the version string.
My first guess would be that it might be a problem in some code that is
only in Fedora kernels?
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
* Re: Top kernel oopses/warnings for the week of May 16th 2008
2008-05-16 18:04 ` Adrian Bunk
@ 2008-05-16 18:19 ` Arjan van de Ven
2008-05-20 3:53 ` Dave Jones
1 sibling, 0 replies; 8+ messages in thread
From: Arjan van de Ven @ 2008-05-16 18:19 UTC (permalink / raw)
To: Adrian Bunk
Cc: Linux Kernel Mailing List, Linus Torvalds, NetDev, Andrew Morton,
Jeff Garzik
Adrian Bunk wrote:
> On Fri, May 16, 2008 at 09:41:31AM -0700, Arjan van de Ven wrote:
>> ...
>> Bug of the week
>> ---------------
>> Not in the top 10 (but barely not so), but upcoming fast is a bug that has a very
>> distinct pattern.
>> The backtraces are at http://www.kerneloops.org/searchweek.php?search=fput
>>
>> The pattern is that the kernel gets an invalid pointer passed to fput(),
>> coming down from a select() system call done by the "wpa_supplicant" program.
>> The fact that it is ONLY wpa_supplicant implicates the wireless/network stack.
>> Another observation is that this only happens with 64 bit kernels, even though
>> a large portion of the users uses 32 bit kernels. This implies that this is a 64-bit
>> type of bug. It appears that the top 32 bit of the pointers is getting corrupted
>> (the bottom part at least looks valid).
>> ...
>
> Unless I misunderstand your webinterface another pattern is a "fc9" in
> the version string.
that's because fc9 is the only OS that currently ships the client by default,
which means that it's a statistical thing where 90%+ of the reports come from
Fedora kernels, just because that's where the data is mined.
>
> My first guess would be that it might be a problem in some code that is
> only in Fedora kernels?
that may or may not be true, but we can't conclude that right now.
* Re: Top kernel oopses/warnings for the week of May 16th 2008
2008-05-16 18:04 ` Adrian Bunk
2008-05-16 18:19 ` Arjan van de Ven
@ 2008-05-20 3:53 ` Dave Jones
1 sibling, 0 replies; 8+ messages in thread
From: Dave Jones @ 2008-05-20 3:53 UTC (permalink / raw)
To: Adrian Bunk
Cc: Arjan van de Ven, Linux Kernel Mailing List, Linus Torvalds,
NetDev, Andrew Morton, Jeff Garzik
On Fri, May 16, 2008 at 09:04:26PM +0300, Adrian Bunk wrote:
> On Fri, May 16, 2008 at 09:41:31AM -0700, Arjan van de Ven wrote:
> >...
> > Bug of the week
> > ---------------
> > Not in the top 10 (but barely not so), but upcoming fast is a bug that has a very
> > distinct pattern.
> > The backtraces are at http://www.kerneloops.org/searchweek.php?search=fput
> >
> > The pattern is that the kernel gets an invalid pointer passed to fput(),
> > coming down from a select() system call done by the "wpa_supplicant" program.
> > The fact that it is ONLY wpa_supplicant implicates the wireless/network stack.
> > Another observation is that this only happens with 64 bit kernels, even though
> > a large portion of the users uses 32 bit kernels. This implies that this is a 64-bit
> > type of bug. It appears that the top 32 bit of the pointers is getting corrupted
> > (the bottom part at least looks valid).
> >...
>
> Unless I misunderstand your webinterface another pattern is a "fc9" in
> the version string.
Unsurprising really given we just did a release, and not many other distros
are enabling kerneloops by default yet.
> My first guess would be that it might be a problem in some code that is
> only in Fedora kernels?
Very likely, though it's worth noting that all the wireless patches we have
in f9 are from wireless.git, so they're valid 2.6.26-rc bugs
Dave
--
http://www.codemonkey.org.uk
* notification for systemtap-related oops
2008-05-16 16:41 Top kernel oopses/warnings for the week of May 16th 2008 Arjan van de Ven
2008-05-16 17:14 ` Evgeniy Polyakov
2008-05-16 18:04 ` Adrian Bunk
@ 2008-05-16 18:50 ` Frank Ch. Eigler
2008-05-16 19:47 ` Arjan van de Ven
2 siblings, 1 reply; 8+ messages in thread
From: Frank Ch. Eigler @ 2008-05-16 18:50 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Linux Kernel Mailing List, systemtap
Hi -
Arjan, would it be possible for kerneloops.org to notify us systemtap
people (cc:d) automagically if the oops messages implicate systemtap
modules by including "stap_.*" in the loaded-module list or stack
backtrace symbols?
- FChE
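
The filter requested above could be sketched as follows, using the POSIX regex API. This is only an illustration of the matching step, not kerneloops.org code: mentions_systemtap() is a hypothetical name, the input is assumed to be an already-parsed list of module names or backtrace symbols, and anchoring the pattern at the start of each name is an assumption on top of the "stap_.*" expression given in the request.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if any entry in names[] (loaded modules or backtrace
 * symbols) begins with "stap_". Returns false if the pattern fails
 * to compile.
 */
static bool mentions_systemtap(const char *const names[], size_t count)
{
    regex_t re;
    bool hit = false;

    /* REG_NOSUB: only a yes/no match is needed, no capture offsets. */
    if (regcomp(&re, "^stap_.*", REG_EXTENDED | REG_NOSUB) != 0)
        return false;

    for (size_t i = 0; i < count && !hit; i++)
        hit = (regexec(&re, names[i], 0, NULL, 0) == 0);

    regfree(&re);
    return hit;
}
```

A report whose module list contained an entry such as "stap_1234abcd" (a made-up name) would be flagged, while a list with only "ext3" and "usbcore" would not.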
* Re: notification for systemtap-related oops
2008-05-16 18:50 ` notification for systemtap-related oops Frank Ch. Eigler
@ 2008-05-16 19:47 ` Arjan van de Ven
2008-05-16 20:23 ` Frank Ch. Eigler
0 siblings, 1 reply; 8+ messages in thread
From: Arjan van de Ven @ 2008-05-16 19:47 UTC (permalink / raw)
To: Frank Ch. Eigler; +Cc: Linux Kernel Mailing List, systemtap
Frank Ch. Eigler wrote:
> Hi -
>
> Arjan, would it be possible for kerneloops.org to notify us systemtap
> people (cc:d) automagically if the oops messages implicate systemtap
> modules by including "stap_.*" in the loaded-module list or stack
> backtrace symbols?
I can get you an RSS feed for that....
doing something more proactive is harder due to the unscalable nature
* Re: notification for systemtap-related oops
2008-05-16 19:47 ` Arjan van de Ven
@ 2008-05-16 20:23 ` Frank Ch. Eigler
0 siblings, 0 replies; 8+ messages in thread
From: Frank Ch. Eigler @ 2008-05-16 20:23 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Linux Kernel Mailing List, systemtap
Hi -
On Fri, May 16, 2008 at 12:47:29PM -0700, Arjan van de Ven wrote:
> I can get you an RSS feed for [notifying us of stap* oopses]....
> doing something more proactive is harder due to the unscalable nature
RSS would be fine, and can be converted to email by other tools.
- FChE