* [TRIVIAL] Fix recent bug in fib_semantics.c
@ 2004-09-17 6:20 David Gibson
2004-09-17 6:37 ` Jeff Garzik
2004-09-17 18:27 ` David S. Miller
0 siblings, 2 replies; 12+ messages in thread
From: David Gibson @ 2004-09-17 6:20 UTC (permalink / raw)
To: Andrew Morton; +Cc: David Miller, trivial, linux-kernel, netdev
Andrew, please apply:
When fib_create_info() allocates new hash tables, it neglects to
initialize them. This leads to an oops during boot on at least one
machine I use. This patch addresses the problem.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Index: working-2.6/net/ipv4/fib_semantics.c
===================================================================
--- working-2.6.orig/net/ipv4/fib_semantics.c 2004-09-17 09:20:04.000000000 +1000
+++ working-2.6/net/ipv4/fib_semantics.c 2004-09-17 16:24:42.634638304 +1000
@@ -604,8 +604,12 @@
 		if (!new_info_hash || !new_laddrhash) {
 			fib_hash_free(new_info_hash, bytes);
 			fib_hash_free(new_laddrhash, bytes);
-		} else
+		} else {
+			memset(new_info_hash, 0, bytes);
+			memset(new_laddrhash, 0, bytes);
+
 			fib_hash_move(new_info_hash, new_laddrhash, new_size);
+		}

 		if (!fib_hash_size)
 			goto failure;
--
David Gibson | For every complex problem there is a
david AT gibson.dropbear.id.au | solution which is simple, neat and
| wrong.
http://www.ozlabs.org/people/dgibson
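
Editor's sketch (not part of the original message): the patch in this
message zeroes the freshly allocated tables because an hlist bucket only
counts as empty when its head pointer is NULL. The user-space C model
below illustrates that idea under stated assumptions. The struct
definitions are simplified stand-ins for the kernel's hlist types, and
alloc_empty_table() and table_entries() are hypothetical names, not
kernel functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for the kernel's hlist types (assumption). */
struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };

/* Hypothetical helper: allocate nbuckets buckets and mark each one empty. */
struct hlist_head *alloc_empty_table(size_t nbuckets)
{
	size_t bytes = nbuckets * sizeof(struct hlist_head);
	struct hlist_head *table;

	assert(nbuckets > 0);
	table = malloc(bytes);
	if (!table)
		return NULL;

	/* malloc() does not clear memory; zero it so every 'first' is NULL. */
	memset(table, 0, bytes);
	return table;
}

/* Count nodes across all buckets; relies on empty buckets being NULL. */
size_t table_entries(const struct hlist_head *table, size_t nbuckets)
{
	size_t total = 0;

	for (size_t i = 0; i < nbuckets; i++)
		for (const struct hlist_node *p = table[i].first; p; p = p->next)
			total++;
	return total;
}
```

A table returned by alloc_empty_table(16) reports zero entries from
table_entries(). Without the memset() the walk would follow whatever
pointer values the allocator left behind, which is the kind of failure
the patch is meant to prevent.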
* Re: [TRIVIAL] Fix recent bug in fib_semantics.c
2004-09-17 6:20 [TRIVIAL] Fix recent bug in fib_semantics.c David Gibson
@ 2004-09-17 6:37 ` Jeff Garzik
2004-09-17 18:27 ` David S. Miller
1 sibling, 0 replies; 12+ messages in thread
From: Jeff Garzik @ 2004-09-17 6:37 UTC (permalink / raw)
To: David Gibson; +Cc: Andrew Morton, David Miller, trivial, linux-kernel, netdev
David Gibson wrote:
> Andrew, please apply:
>
> When fib_create_info() allocates new hash tables, it neglects to
> initialize them. This leads to an oops during boot on at least one
> machine I use. This patch addresses the problem.
>
> Signed-off-by: David Gibson <dwg@au1.ibm.com>
This may be the oops in fib_xxx I just saw on my Athlon64 box...
Jeff
* Re: [TRIVIAL] Fix recent bug in fib_semantics.c
2004-09-17 6:20 [TRIVIAL] Fix recent bug in fib_semantics.c David Gibson
2004-09-17 6:37 ` Jeff Garzik
@ 2004-09-17 18:27 ` David S. Miller
2004-09-18 0:21 ` Jon Smirl
1 sibling, 1 reply; 12+ messages in thread
From: David S. Miller @ 2004-09-17 18:27 UTC (permalink / raw)
To: David Gibson; +Cc: akpm, trivial, linux-kernel, netdev
Thanks David, I'll push this upstream asap.
I can't believe in all the route testing I did I never
triggered this on my sparc64 boxes, must have been lucky :(
* Re: [TRIVIAL] Fix recent bug in fib_semantics.c
2004-09-17 18:27 ` David S. Miller
@ 2004-09-18 0:21 ` Jon Smirl
2004-09-18 0:27 ` Herbert Xu
0 siblings, 1 reply; 12+ messages in thread
From: Jon Smirl @ 2004-09-18 0:21 UTC (permalink / raw)
To: David S. Miller; +Cc: David Gibson, akpm, trivial, linux-kernel, netdev
I'm still OOPsing at boot in fib_disable_ip+21 from
fib_netdev_event+63. Both e1000 and tg3 are affected. I have current
linus bk as of the time of this message.
It only occurs when Red Hat goes through the scanning for new hardware
phase during boot. Is RH loading the drivers in some special way
during this phase? If I load the drivers manually after I've booted,
they load OK. I'm running with the drivers as modules; I'll try
switching to compiled in.
The change referenced in this thread is in my kernel:
fib_semantics.c, 604
		} else {
			memset(new_info_hash, 0, bytes);
			memset(new_laddrhash, 0, bytes);
			fib_hash_move(new_info_hash, new_laddrhash, new_size);
		}
On Fri, 17 Sep 2004 11:27:44 -0700, David S. Miller <davem@davemloft.net> wrote:
>
> Thanks David, I'll push this upstream asap.
>
> I can't believe in all the route testing I did I never
> triggered this on my sparc64 boxes, must have been lucky :(
--
Jon Smirl
jonsmirl@gmail.com
* Re: [TRIVIAL] Fix recent bug in fib_semantics.c
2004-09-18 0:21 ` Jon Smirl
@ 2004-09-18 0:27 ` Herbert Xu
2004-09-18 0:59 ` Jon Smirl
2004-09-18 1:37 ` Jon Smirl
0 siblings, 2 replies; 12+ messages in thread
From: Herbert Xu @ 2004-09-18 0:27 UTC (permalink / raw)
To: Jon Smirl; +Cc: davem, david, akpm, trivial, linux-kernel, netdev
Jon Smirl <jonsmirl@gmail.com> wrote:
> I'm still OOPsing at boot in fib_disable_ip+21 from
> fib_netdev_event+63. Both e1000 and tg3 are affected. I have current
> linus bk as of time of this message.
Please post the complete error message.
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [TRIVIAL] Fix recent bug in fib_semantics.c
2004-09-18 0:27 ` Herbert Xu
@ 2004-09-18 0:59 ` Jon Smirl
2004-09-18 1:37 ` Jon Smirl
1 sibling, 0 replies; 12+ messages in thread
From: Jon Smirl @ 2004-09-18 0:59 UTC (permalink / raw)
To: Herbert Xu; +Cc: davem, david, akpm, trivial, linux-kernel, netdev
I have verified that compiling the drivers in avoids the problem.
I'll boot again and get more of the error message. It's not making it
to the logs so I am copying it by hand from the screen.
On Sat, 18 Sep 2004 10:27:47 +1000, Herbert Xu
<herbert@gondor.apana.org.au> wrote:
> Jon Smirl <jonsmirl@gmail.com> wrote:
> > I'm still OOPsing at boot in fib_disable_ip+21 from
> > fib_netdev_event+63. Both e1000 and tg3 are affected. I have current
> > linus bk as of time of this message.
>
> Please post the complete error message.
--
Jon Smirl
jonsmirl@gmail.com
* Re: [TRIVIAL] Fix recent bug in fib_semantics.c
2004-09-18 0:27 ` Herbert Xu
2004-09-18 0:59 ` Jon Smirl
@ 2004-09-18 1:37 ` Jon Smirl
2004-09-18 4:16 ` Herbert Xu
1 sibling, 1 reply; 12+ messages in thread
From: Jon Smirl @ 2004-09-18 1:37 UTC (permalink / raw)
To: Herbert Xu; +Cc: davem, david, akpm, trivial, linux-kernel, netdev
Call stack at failure:
e1000_exit_module
...pci calls...
e1000_remove
unregister_netdev
unregister_netdevice
notifier_call_chain
fib_netdev_event
fib_disable_ip
error_code
Rest of the info has scrolled off the screen.
The problem is when RH/Fedora is doing its modprobe/rmmod to detect
what hardware is in the system, since that's the only thing that would
be rmmod'ing e1000.
On the same system, if I disable networking and boot, I can
modprobe/rmmod the drivers without problem. So I'd conclude that RH is
doing something special during its probing phase, but I don't know
enough about the RH init scripts to know what it is.
On Sat, 18 Sep 2004 10:27:47 +1000, Herbert Xu
<herbert@gondor.apana.org.au> wrote:
> Jon Smirl <jonsmirl@gmail.com> wrote:
> > I'm still OOPsing at boot in fib_disable_ip+21 from
> > fib_netdev_event+63. Both e1000 and tg3 are affected. I have current
> > linus bk as of time of this message.
>
> Please post the complete error message.
--
Jon Smirl
jonsmirl@gmail.com
* Re: [TRIVIAL] Fix recent bug in fib_semantics.c
2004-09-18 1:37 ` Jon Smirl
@ 2004-09-18 4:16 ` Herbert Xu
2004-09-18 5:22 ` Jon Smirl
2004-09-18 6:31 ` David S. Miller
0 siblings, 2 replies; 12+ messages in thread
From: Herbert Xu @ 2004-09-18 4:16 UTC (permalink / raw)
To: Jon Smirl; +Cc: davem, david, akpm, trivial, linux-kernel, netdev
[-- Attachment #1: Type: text/plain, Size: 810 bytes --]
On Fri, Sep 17, 2004 at 09:37:15PM -0400, Jon Smirl wrote:
> Call stack at failure:
> e1000_exit_module
> ...pci calls...
> e1000_remove
> unregister_netdev
> unregister_netdevice
> notifier_call_chain
> fib_netdev_event
> fib_disable_ip
> error_code
Thanks. The following bug is probably your problem.
> Rest of the info has scrolled off the screen.
You should be able to hit Shift-PageUp to scroll up.
There is a thinko in the allocation for the devindex hash. We're
only giving it 8 elements when it should be 1<<8 elements.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[-- Attachment #2: p --]
[-- Type: text/plain, Size: 718 bytes --]
===== net/ipv4/fib_semantics.c 1.16 vs edited =====
--- 1.16/net/ipv4/fib_semantics.c 2004-09-18 04:11:04 +10:00
+++ edited/net/ipv4/fib_semantics.c 2004-09-18 14:08:55 +10:00
@@ -52,7 +52,8 @@
 static unsigned int fib_info_cnt;
 #define DEVINDEX_HASHBITS 8
-static struct hlist_head fib_info_devhash[DEVINDEX_HASHBITS];
+#define DEVINDEX_HASHSIZE (1U << DEVINDEX_HASHBITS)
+static struct hlist_head fib_info_devhash[DEVINDEX_HASHSIZE];
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
@@ -229,7 +230,7 @@
 static inline unsigned int fib_devindex_hashfn(unsigned int val)
 {
-	unsigned int mask = ((1U << DEVINDEX_HASHBITS) - 1);
+	unsigned int mask = DEVINDEX_HASHSIZE - 1;
 	return (val ^
 		(val >> DEVINDEX_HASHBITS) ^
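
Editor's sketch (not part of the original message): the thinko described
in this message is the difference between an array with DEVINDEX_HASHBITS
(8) elements and one with 1 << DEVINDEX_HASHBITS (256) elements, while the
hash function masks its result to 8 bits. The user-space C model below
checks which array length can hold every index the hash may produce. The
fold in devindex_hash() is simplified and assumed for illustration; it is
not claimed to match the kernel's full fib_devindex_hashfn(), and
indices_fit() is a hypothetical helper.

```c
#include <assert.h>

#define DEVINDEX_HASHBITS 8
#define DEVINDEX_HASHSIZE (1U << DEVINDEX_HASHBITS)

/* Simplified fold (assumption): result is always in 0..DEVINDEX_HASHSIZE-1. */
unsigned int devindex_hash(unsigned int val)
{
	unsigned int mask = DEVINDEX_HASHSIZE - 1;

	return (val ^ (val >> DEVINDEX_HASHBITS)) & mask;
}

/*
 * Hypothetical helper: return 1 if the hash of every value in
 * 0..limit-1 is a valid index into an array of 'len' elements.
 */
int indices_fit(unsigned int len, unsigned int limit)
{
	assert(len > 0);
	for (unsigned int v = 0; v < limit; v++)
		if (devindex_hash(v) >= len)
			return 0;
	return 1;
}
```

With these definitions, indices_fit(DEVINDEX_HASHSIZE, n) holds for any n,
while indices_fit(DEVINDEX_HASHBITS, n) already fails once n exceeds 8,
because an input of 8 hashes to index 8 in an 8-element array.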
* Re: [TRIVIAL] Fix recent bug in fib_semantics.c
2004-09-18 4:16 ` Herbert Xu
@ 2004-09-18 5:22 ` Jon Smirl
2004-09-18 6:31 ` David S. Miller
1 sibling, 0 replies; 12+ messages in thread
From: Jon Smirl @ 2004-09-18 5:22 UTC (permalink / raw)
To: Herbert Xu; +Cc: davem, david, akpm, trivial, linux-kernel, netdev
Still getting the same fault with the patch. Someone else has this
problem too. The full stack trace is in this thread....
[2.6.9-rc2-bk] Network-related panic on boot
--
Jon Smirl
jonsmirl@gmail.com
* Re: [TRIVIAL] Fix recent bug in fib_semantics.c
2004-09-18 4:16 ` Herbert Xu
2004-09-18 5:22 ` Jon Smirl
@ 2004-09-18 6:31 ` David S. Miller
2004-09-18 15:28 ` Jon Smirl
2004-09-18 20:31 ` jamal
1 sibling, 2 replies; 12+ messages in thread
From: David S. Miller @ 2004-09-18 6:31 UTC (permalink / raw)
To: Herbert Xu; +Cc: jonsmirl, david, akpm, trivial, linux-kernel, netdev
On Sat, 18 Sep 2004 14:16:28 +1000
Herbert Xu <herbert@gondor.apana.org.au> wrote:
> Thanks. The following bug is probably your problem.
Good catch on this fix, but really he's hitting the
BUG_ON() in fib_sync_down(). (I hate i386 backtraces,
it's an art to decode them properly.)
So if you rmmod() a device before any routes are ever
created in ipv4, this triggers. I didn't think this
was possible, but it is.
The fix is simple enough.
===== net/ipv4/fib_semantics.c 1.16 vs edited =====
--- 1.16/net/ipv4/fib_semantics.c 2004-09-17 11:11:04 -07:00
+++ edited/net/ipv4/fib_semantics.c 2004-09-17 23:14:44 -07:00
@@ -1040,9 +1040,7 @@
 	if (force)
 		scope = -1;

-	BUG_ON(!fib_info_laddrhash);
-
-	if (local) {
+	if (local && fib_info_laddrhash) {
 		unsigned int hash = fib_laddr_hashfn(local);
 		struct hlist_head *head = &fib_info_laddrhash[hash];
 		struct hlist_node *node;
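
Editor's sketch (not part of the original message): the fix in this
message replaces an assertion that the local-address hash exists with a
check that skips the lookup when the table was never allocated, which is
the case when a device is removed before any IPv4 route has been created.
The user-space C model below shows that pattern under stated assumptions.
struct lazy_table and count_matches() are hypothetical, and the struct
definitions are simplified stand-ins for the kernel's hlist types.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel's hlist types (assumption). */
struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };

/* Hypothetical model of a table that stays NULL until first use. */
struct lazy_table {
	struct hlist_head *buckets;	/* NULL until something is inserted */
	unsigned int size;		/* power of two once allocated */
};

/* Count the nodes in the bucket selected by 'key'. */
unsigned int count_matches(const struct lazy_table *t, unsigned int key)
{
	unsigned int n = 0;

	/* An unallocated table holds no entries; treat it as empty, not a bug. */
	if (!t->buckets)
		return 0;

	for (const struct hlist_node *p = t->buckets[key & (t->size - 1)].first;
	     p; p = p->next)
		n++;
	return n;
}
```

The design choice is to treat "table not allocated yet" as "nothing to
do" rather than as an impossible state, so a caller that runs before the
first insertion returns cleanly instead of crashing.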
* Re: [TRIVIAL] Fix recent bug in fib_semantics.c
2004-09-18 6:31 ` David S. Miller
@ 2004-09-18 15:28 ` Jon Smirl
2004-09-18 20:31 ` jamal
1 sibling, 0 replies; 12+ messages in thread
From: Jon Smirl @ 2004-09-18 15:28 UTC (permalink / raw)
To: David S. Miller; +Cc: Herbert Xu, david, akpm, trivial, linux-kernel, netdev
The last patch fixes things so that I can boot. The net is working too.
--
Jon Smirl
jonsmirl@gmail.com
* Re: [TRIVIAL] Fix recent bug in fib_semantics.c
2004-09-18 6:31 ` David S. Miller
2004-09-18 15:28 ` Jon Smirl
@ 2004-09-18 20:31 ` jamal
1 sibling, 0 replies; 12+ messages in thread
From: jamal @ 2004-09-18 20:31 UTC (permalink / raw)
To: David S. Miller
Cc: Herbert Xu, jonsmirl, david, akpm, trivial, linux-kernel, netdev
On Sat, 2004-09-18 at 02:31, David S. Miller wrote:
> On Sat, 18 Sep 2004 14:16:28 +1000
> So if you rmmod() a device before any routes are ever
> created in ipv4, this triggers. I didn't think this
> was possible, but it is.
May have been exposed by LLTX. When I turned off LLTX on e1000
before seeing your fix, the oops disappeared.
cheers,
jamal