netdev.vger.kernel.org archive mirror
From: Ingo Molnar <mingo@elte.hu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ayaz Abdulla <aabdulla@nvidia.com>,
	e1000-devel@lists.sourceforge.net, netdev@vger.kernel.org,
	Adrian Bunk <bunk@stusta.de>, Greg KH <greg@kroah.com>,
	Dave Jones <davej@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jeff Garzik <jgarzik@pobox.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: [1/3] 2.6.21-rc6: known regressions
Date: Sat, 14 Apr 2007 08:21:43 +0200
Message-ID: <20070414062143.GA12707@elte.hu>
In-Reply-To: <Pine.LNX.4.64.0704131821020.28042@woody.linux-foundation.org>


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Note: Ingo also reports what looks like a memory corruption due to the 
> 6b6b6b6b pattern on presumably the same box.
> 
> The 6b6b6b6b pattern is POISON_FREE, implying some kind of slab 
> misuse, most likely a use-after-free, although possibly just due to 
> overrunning a slab into the next one or something like that.

unfortunately, although i'm at -rc6 based kernel #445 by now, this 
incident was the only time i saw this problem. Note: while it's a 
CONFIG_SMP kernel, in that bootup i was using maxcpus=1:

   WARNING: maxcpus limit of 1 reached. Processor ignored.

so it's a pure UP problem. Plus i used PREEMPT_NONE. So this really must 
be something fundamental.
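
For readers unfamiliar with the 6b6b6b6b pattern quoted above, here is a minimal userspace sketch of how free-poisoning makes a use-after-free visible. The byte values 0x6b (POISON_FREE) and 0xa5 (POISON_END) are the ones the kernel's slab debugging uses; everything else (struct toy_obj, toy_free(), looks_poisoned()) is hypothetical and only illustrates the idea, it is not the kernel's allocator code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define POISON_FREE 0x6b /* byte written over a freed object */
#define POISON_END  0xa5 /* marker written into the object's last byte */

/* Hypothetical object standing in for a slab-allocated structure. */
struct toy_obj {
    uint32_t owner;  /* stands in for a 32-bit pointer field */
    uint32_t refcnt;
};

/* Toy "free": overwrite the object the way a debugging allocator would. */
static void toy_free(struct toy_obj *obj)
{
    unsigned char *p = (unsigned char *)obj;

    assert(obj != NULL);
    memset(p, POISON_FREE, sizeof(*obj));
    p[sizeof(*obj) - 1] = POISON_END;
}

/* Returns 1 if a 32-bit value looks like it was read from freed memory. */
static int looks_poisoned(uint32_t v)
{
    return v == 0x6b6b6b6bu;
}
```

Calling toy_free() on an object and then reading its owner field gives 0x6b6b6b6b, which is why a stale pointer loaded from freed memory tends to show up with that value in an oops. An overrun into a neighbouring free object would produce the same reading, so the pattern alone does not distinguish the two cases.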

> What I'm leading up to is that I'm wondering if these mysterious 
> network driver bugs aren't due to the network drivers themselves, but 
> due to some higher-level problem. I think the hangs that Ingo sees 
> with forcedeth were preceded by mysterious and "impossible" NULL 
> pointer oopses. Ingo?

hm. I would tend to exclude networking, because the oops happened right 
during bootup (i saw it happen in real time on the serial console), 
possibly before networking was brought up. It was udevd that crashed, 
and udevd rarely does anything after its initial /dev hierarchy setup 
frenzy. (But this testbox boots very fast so it might have been near 
network bringup.)

note that i can pretty much freely force the forcedeth problem to occur 
on -rt [but all the reports i sent about it were done on a vanilla 
kernel]. I triggered that problem at least a couple of dozen times, and 
it _never_ caused any other effect besides the skb NULL dereference - or 
lately (with the latest forcedeth.c version), a pure forcedeth interface 
hang. That doesn't exclude networking driver badness, but makes it less 
likely.

to me this crash has the feeling of being sysfs related: not just 
because the crash itself is within sysfs:

 EIP is at module_put+0x19/0x2d

 [<c0104c44>] show_trace_log_lvl+0x19/0x2e
 [<c0104cf4>] show_stack_log_lvl+0x9b/0xa3
 [<c0104fdd>] show_registers+0x1c8/0x29a
 [<c01052d0>] die+0x119/0x1f0
 [<c03cd075>] do_page_fault+0x4e3/0x5b8
 [<c03cb7a4>] error_code+0x7c/0x84
 [<c019e832>] sysfs_release+0x55/0x76
 [<c0167c7f>] __fput+0xb9/0x15e
 [<c0167d3b>] fput+0x17/0x19
 [<c01658b2>] filp_close+0x52/0x5a
 [<c01660a3>] sys_close+0x76/0xad
 [<c0103dc0>] syscall_call+0x7/0xb

but also because udevd itself is _very_ sysfs intensive - and in fact on 
this bzImage kernel it's perhaps the _only_ true sysfs activity that 
happens. (there are no loadable modules whatsoever, all drivers are 
built in)
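
As an aid to reading the trace above: each entry has the form "[<address>] symbol+offset/size", where offset is the distance into the function and size is the function's length. The following is a small sketch that splits such a line into its parts. The names struct trace_entry, parse_trace_line() and trace_func_start() are made up for this illustration, and the parser assumes the exact layout shown in this mail.

```c
#include <assert.h>
#include <stdio.h>

/* One decoded call-trace entry. */
struct trace_entry {
    unsigned long addr;   /* address printed inside [<...>] */
    char symbol[64];      /* function name */
    unsigned long offset; /* bytes into the function */
    unsigned long size;   /* total size of the function */
};

/* Parse "[<c019e832>] sysfs_release+0x55/0x76"; returns 0 on success. */
static int parse_trace_line(const char *line, struct trace_entry *e)
{
    int n;

    assert(line != NULL && e != NULL);
    n = sscanf(line, " [<%lx>] %63[^+]+%lx/%lx",
               &e->addr, e->symbol, &e->offset, &e->size);
    return n == 4 ? 0 : -1;
}

/* Start address of the function the entry points into. */
static unsigned long trace_func_start(const struct trace_entry *e)
{
    return e->addr - e->offset;
}
```

Applied to the sysfs_release line, this yields address 0xc019e832, offset 0x55 and size 0x76, so the function starts at 0xc019e7dd. The "EIP is at module_put+0x19/0x2d" line uses the same symbol+offset/size notation without the bracketed address.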

	Ingo

