* 2.6.0-test9 : bridge freezes
@ 2003-11-22 15:27 SVR Anand
2003-11-22 16:19 ` Gene Heskett
` (3 more replies)
0 siblings, 4 replies; 22+ messages in thread
From: SVR Anand @ 2003-11-22 15:27 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-net
Hi,
I am one of the system administrators who manage a campus network of 5000 users
that is connected to the Internet. We have placed a Linux bridge between the
campus and the Internet to isolate the two. To limit the effect of network
flooding, we use iptables. The kernel is 2.6.0-test9, and the Ethernet cards
are RTL8139.
The problem is: after 3 to 4 hours of operation, the bridge stops working
and the machine becomes unusable. It doesn't respond to the keyboard, and
there is no video display. In simple terms, it freezes. Before moving to
2.6.0-test9 I tried 2.4.20 with the bridge patches for iptables support. It
bridged reliably, except that after a while I could no longer log in from the
console because I did not get a shell prompt.
For now I have gone back to 2.4.20 for the sake of robustness. Can someone
let me know what I can do to use a 2.6.x kernel with a good amount of confidence,
so that I can keep the campus users happy? I am only guessing as
to whether the problem is with the network drivers, power management,
or something else.
Any input from you would be really useful. I am eager to try any amount
of debugging; the thing is, I don't know what to look for.
Anand
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.0-test9 : bridge freezes
2003-11-22 15:27 2.6.0-test9 : bridge freezes SVR Anand
@ 2003-11-22 16:19 ` Gene Heskett
2003-11-22 16:20 ` Linus Torvalds
` (2 subsequent siblings)
3 siblings, 0 replies; 22+ messages in thread
From: Gene Heskett @ 2003-11-22 16:19 UTC (permalink / raw)
To: SVR Anand, linux-kernel; +Cc: linux-net
On Saturday 22 November 2003 10:27, SVR Anand wrote:
>Hi,
>
>I am one of the system administrators who manage a campus network of
> 5000 users that is connected to Internet. We have placed a Linux
> bridge to isolate the Internet from the campus. To nullify network
> flooding effect, we have used iptables. The kernel is 2.6.0-test9,
> the ethernet cards that are used are RTL8139.
>
>The problem is : After 3 to 4 hours of functioning, the bridge stops
> working and the machine becomes unusable where it doesn't respond
> to keyboard, and there is no video display. In simple terms it
> freezes. Before going in for 2.6.0-test9 I have tried 2.4.20 with
> bridge patches for iptables support. It worked reliably except that
> I cannot even login from the console because I don't get the shell
> prompt after a while.
>
>Presently I have gone back to 2.4.20 for the sake of robustness. Can
> someone let me know what I can do to use 2.6.x kernel with a good
> amount of confidence so that I can keep the campus users happy ? I
> am making guess work as to whether the problem is with the network
> drivers, or some power management issues, and so on.
>
>Any inputs from you will be really useful. I am eager to try out any
> amount of debugging, the thing is I don't know what to look for.
Neither do I, but I can report that iptables is apparently stable,
witness this from my firewall machine:
---------------------
[root@gene root]# uname -a
Linux gene.coyote.den 2.4.21-rc1-ck6 #9 Mon May 5 23:31:30 EDT 2003
i586 unknown
[root@gene root]# uptime
11:14am up 35 days, 3:19, 2 users, load average: 1.00, 1.00, 1.00
---------------------
Now admittedly it doesn't have 5000 users, just one. But every time
I've had problems such as you are describing at the TV station, where
the user count is about 65, it has been hardware related. I'd start by
letting memtest86 run on that box for a couple of days; maybe it will
find some flaky memory.
--
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz 512M
99.27% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attornies please note, additions to this message
by Gene Heskett are:
Copyright 2003 by Maurice Eugene Heskett, all rights reserved.
* Re: 2.6.0-test9 : bridge freezes
2003-11-22 15:27 2.6.0-test9 : bridge freezes SVR Anand
2003-11-22 16:19 ` Gene Heskett
@ 2003-11-22 16:20 ` Linus Torvalds
2003-11-23 23:26 ` David S. Miller
2003-11-22 19:18 ` 2.6.0-test9 : bridge freezes Markus Hästbacka
2003-11-24 19:09 ` Stephen Hemminger
3 siblings, 1 reply; 22+ messages in thread
From: Linus Torvalds @ 2003-11-22 16:20 UTC (permalink / raw)
To: SVR Anand; +Cc: Kernel Mailing List, netdev
On Sat, 22 Nov 2003, SVR Anand wrote:
>
> The problem is : After 3 to 4 hours of functioning, the bridge stops working
> and the machine becomes unusable where it doesn't respond to keyboard, and
> there is no video display.
Sounds like a memory leak somewhere. It would probably be interesting to
watch /proc/slabinfo every five minutes or so, and see what happens..
Linus
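The slabinfo comparison suggested here can be sketched in a few lines of
Python. This is only a rough sketch, not a tool anyone in this thread used:
the function names parse_slabinfo and growth are made up for illustration,
and the parser assumes that each data line starts with the cache name
followed by active_objs, num_objs and objsize, which is how the slabinfo
formats of this era lay out their first columns.

```python
def parse_slabinfo(text):
    """Return {cache_name: (active_objs, objsize)} from /proc/slabinfo text."""
    caches = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the version banner, the "# name ..." header and malformed lines.
        if len(fields) < 4 or fields[0] in ("slabinfo", "#"):
            continue
        try:
            active_objs, objsize = int(fields[1]), int(fields[3])
        except ValueError:
            continue
        caches[fields[0]] = (active_objs, objsize)
    return caches


def growth(before, after):
    """Return [(cache_name, extra_bytes)] for caches that grew, largest first."""
    grown = []
    for name, (active_objs, objsize) in after.items():
        old_active = before.get(name, (0, objsize))[0]
        delta = (active_objs - old_active) * objsize
        if delta > 0:
            grown.append((name, delta))
    return sorted(grown, key=lambda item: item[1], reverse=True)
```

Feeding it two snapshots taken some minutes apart, for example
growth(parse_slabinfo(old_text), parse_slabinfo(new_text)), lists the caches
whose live objects grew. A cache that keeps climbing across many snapshots is
the one worth reporting.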
* Re: 2.6.0-test9 : bridge freezes
2003-11-22 15:27 2.6.0-test9 : bridge freezes SVR Anand
2003-11-22 16:19 ` Gene Heskett
2003-11-22 16:20 ` Linus Torvalds
@ 2003-11-22 19:18 ` Markus Hästbacka
2003-11-24 19:09 ` Stephen Hemminger
3 siblings, 0 replies; 22+ messages in thread
From: Markus Hästbacka @ 2003-11-22 19:18 UTC (permalink / raw)
To: SVR Anand; +Cc: Kernel Mailinglist
Hi!
I had this problem on my router too: the computer froze after about
3-4 hours. In my case 2.6.0-test4 worked, but test8 locked up
(I didn't test anything between test4 and test8).
Regards,
Markus
On Sat, 2003-11-22 at 17:27, SVR Anand wrote:
> The problem is : After 3 to 4 hours of functioning, the bridge stops working
> and the machine becomes unusable where it doesn't respond to keyboard, and
> there is no video display.
--
"Software is like sex, it's better when it's free."
Markus Hästbacka <midian at ihme.org>
* Re: 2.6.0-test9 : bridge freezes
2003-11-22 16:20 ` Linus Torvalds
@ 2003-11-23 23:26 ` David S. Miller
2003-11-24 0:02 ` Markus Hästbacka
` (2 more replies)
0 siblings, 3 replies; 22+ messages in thread
From: David S. Miller @ 2003-11-23 23:26 UTC (permalink / raw)
To: Linus Torvalds; +Cc: anand, linux-kernel, netdev
On Sat, 22 Nov 2003 08:20:40 -0800 (PST)
Linus Torvalds <torvalds@osdl.org> wrote:
> On Sat, 22 Nov 2003, SVR Anand wrote:
> >
> > The problem is : After 3 to 4 hours of functioning, the bridge stops working
> > and the machine becomes unusable where it doesn't respond to keyboard, and
> > there is no video display.
>
> Sounds like a memory leak somewhere. It would probably be interesting to
> watch /proc/slabinfo every five minutes or so, and see what happens..
Also, we've certainly fixed some serious networking bugs since test9
came out.
* Re: 2.6.0-test9 : bridge freezes
2003-11-23 23:26 ` David S. Miller
@ 2003-11-24 0:02 ` Markus Hästbacka
2003-11-25 17:21 ` 2.6.0-test9-bk25 : bridge works fine SVR Anand
2003-11-29 9:44 ` Bridging woes after 3 days SVR Anand
2 siblings, 0 replies; 22+ messages in thread
From: Markus Hästbacka @ 2003-11-24 0:02 UTC (permalink / raw)
To: David S. Miller; +Cc: Kernel Mailinglist
I wonder how it's possible that test4 worked fine and then something
like this comes up? (I DID report this earlier, but who would care?)
Also, it's been too long since test9, and there aren't many people
testing the bk's at all.
There may be a reason for people not to test the bk's: maybe their
experience with the 2.4 bk's, yes, those which won't compile/boot at all.
So I'd suggest removing the 2.4 bk's totally from kernel.org.
Regards,
Markus
On Mon, 2003-11-24 at 01:26, David S. Miller wrote:
> Also, we've certainly fixed some serious networking bugs since test9
> came out.
--
"Software is like sex, it's better when it's free."
Markus Hästbacka <midian at ihme dot org>
* [Bridge] Re: 2.6.0-test9 : bridge freezes
2003-11-22 15:27 2.6.0-test9 : bridge freezes SVR Anand
@ 2003-11-24 19:09 ` Stephen Hemminger
2003-11-22 16:20 ` Linus Torvalds
` (2 subsequent siblings)
3 siblings, 0 replies; 22+ messages in thread
From: Stephen Hemminger @ 2003-11-24 19:09 UTC (permalink / raw)
To: SVR Anand; +Cc: linux-net, bridge, linux-kernel
On Sat, 22 Nov 2003 20:57:44 +0530 (GMT+05:30)
anand@eis.iisc.ernet.in (SVR Anand) wrote:
> Hi,
>
> I am one of the system administrators who manage a campus network of 5000 users
> that is connected to Internet. We have placed a Linux bridge to isolate the
> Internet from the campus. To nullify network flooding effect, we have used
> iptables. The kernel is 2.6.0-test9, the ethernet cards that are used are
> RTL8139.
>
> The problem is : After 3 to 4 hours of functioning, the bridge stops working
> and the machine becomes unusable where it doesn't respond to keyboard, and
> there is no video display. In simple terms it freezes. Before going in for
> 2.6.0-test9 I have tried 2.4.20 with bridge patches for iptables support. It
> worked reliably except that I cannot even login from the console because
> I don't get the shell prompt after a while.
>
> Presently I have gone back to 2.4.20 for the sake of robustness. Can someone
> let me know what I can do to use 2.6.x kernel with a good amount of confidence
> so that I can keep the campus users happy ? I am making guess work as
> to whether the problem is with the network drivers, or some power management
> issues, and so on.
>
> Any inputs from you will be really useful. I am eager to try out any amount
> of debugging, the thing is I don't know what to look for.
>
>
> Anand
Linus is right, this is probably a memory leak issue. There are several areas
that could be the problem:
- core networking
- iptables
- iptables filter
- ethernet bridging
- ethernet driver (RTL8139)
To find and fix the problem, we need to narrow down the scope.
Things that would help: what iptables rules are you using?
Are there any errors showing up on the ethernet devices?
Also, what does the bridge forwarding table look like? Are there lots of entries? Are
you running spanning tree?
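One way to answer the question about errors on the ethernet devices is to
read the per-interface counters in /proc/net/dev. The sketch below is only an
illustration with a made-up function name (interface_errors). It assumes the
usual layout of that file: two header lines, then one line per interface with
eight receive counters followed by eight transmit counters.

```python
def interface_errors(text):
    """Return {iface: {rx_errs, rx_drop, tx_errs, tx_drop}} from /proc/net/dev text."""
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if len(fields) < 16:
            continue
        # Receive columns: bytes packets errs drop ...; transmit columns start at index 8.
        stats[name.strip()] = {
            "rx_errs": int(fields[2]),
            "rx_drop": int(fields[3]),
            "tx_errs": int(fields[10]),
            "tx_drop": int(fields[11]),
        }
    return stats
```

Calling interface_errors(open("/proc/net/dev").read()) before and after a few
hours of bridging shows whether any counter on the bridged interfaces is
climbing.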
* [Bridge] Re: 2.6.0-test9 : bridge freezes
2003-11-24 19:09 ` Stephen Hemminger
@ 2003-11-25 6:39 ` SVR Anand
-1 siblings, 0 replies; 22+ messages in thread
From: SVR Anand @ 2003-11-25 6:39 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: linux-net, bridge, linux-kernel
Hi,
To begin with, thanks a lot for the concern you have all shown in addressing my
problem.
This morning I installed a test9-bk25 image to see if the problem disappears.
The result should be known in the next few hours. I hope it is OK if I send you
the slabinfo in case the problem persists.
I plan to test in stages.
i) Just bridging, no iptables
ii) With iptables.
I have a very limited set of iptables rules. In fact it is as simple as blocking
icmp. There are no errors reported by the ethernet devices.
Anand
PS : The latest test10 stops at the booting stage while initialising my aic7xxx
scsi. So, I had to use bk25.
>
> Linus is right, this is probably a memory leak issue. There are several areas
> that could be the problem:
> - core networking
> - iptables
> - iptables filter
> - ethernet bridging
> - ethernet driver (RTL8139)
>
> To find/fix the problem, we need to narrow down the scope.
> Things that would help are, what are the iptables rules you are using?
> Are there any errors showing up on the ethernet devices?
> Also what does the bridge forwarding table look like? are there lots of entries, are
> you running spanning tree?
>
>
>
* Re: 2.6.0-test9-bk25 : bridge works fine
2003-11-23 23:26 ` David S. Miller
2003-11-24 0:02 ` Markus Hästbacka
@ 2003-11-25 17:21 ` SVR Anand
2003-11-29 9:44 ` Bridging woes after 3 days SVR Anand
2 siblings, 0 replies; 22+ messages in thread
From: SVR Anand @ 2003-11-25 17:21 UTC (permalink / raw)
To: David S. Miller; +Cc: Linus Torvalds, linux-kernel, netdev
Hi,
With test9-bk25, I have not faced any problems for many hours now, which
was not the case with test9. I am hopeful that it will keep working.
Thanks a lot for all the help. Next time I will make it a point to try
the latest of the latest before shooting off a mail :)
Anand
>
> On Sat, 22 Nov 2003 08:20:40 -0800 (PST)
> Linus Torvalds <torvalds@osdl.org> wrote:
>
> > On Sat, 22 Nov 2003, SVR Anand wrote:
> > >
> > > The problem is : After 3 to 4 hours of functioning, the bridge stops working
> > > and the machine becomes unusable where it doesn't respond to keyboard, and
> > > there is no video display.
> >
> > Sounds like a memory leak somewhere. It would probably be interesting to
> > watch /proc/slabinfo every five minutes or so, and see what happens..
>
> Also, we've certainly fixed some serious networking bugs since test9
> came out.
>
* Bridging woes after 3 days
2003-11-23 23:26 ` David S. Miller
2003-11-24 0:02 ` Markus Hästbacka
2003-11-25 17:21 ` 2.6.0-test9-bk25 : bridge works fine SVR Anand
@ 2003-11-29 9:44 ` SVR Anand
2003-12-01 19:33 ` Stephen Hemminger
2 siblings, 1 reply; 22+ messages in thread
From: SVR Anand @ 2003-11-29 9:44 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev
Hi,
After a continuous run of 3 days the bridge came crashing down with the
following kernel panic screen dump. The kernel is 2.6.0-test9-bk25 with
kernel preemption disabled.
The following call stack is what I saw on the console. The ethernet
cards are RTL8139. Please let me know if you want more information or a finer
debugging method; I will pass it on when the bridge fails the next time.
When it worked it really worked.
Anand
----------------------------------------------------------------------------
nf_hook_slow
br_nf_pre_routing_finish
br_nf_pre_routing
br_nf_pre_finish
nf_iterate
br_handle_frame_finish
nf_hook_slow
br_handle_frame_finish
br_handle_frame
br_handle_frame_finish
netif_receive_skb
process_backlog
net_rx_action
do_softirq
do_IRQ
common_interrupt
default_idle
default_idle
cpu_idle
printk
code: 8b 40 18 85 c0 75 7f 85 ff 74 44 89 7b 70 0f ba 6b 40 00 89
<0> kernel panic : Fatal exception in interrupt
In interrupt handler - not syncing
----------------------------------------------------------------------------
* Re: Bridging woes after 3 days
2003-11-29 9:44 ` Bridging woes after 3 days SVR Anand
@ 2003-12-01 19:33 ` Stephen Hemminger
0 siblings, 0 replies; 22+ messages in thread
From: Stephen Hemminger @ 2003-12-01 19:33 UTC (permalink / raw)
To: SVR Anand; +Cc: David S. Miller, netdev
On Sat, 29 Nov 2003 15:14:08 +0530 (GMT+05:30)
anand@eis.iisc.ernet.in (SVR Anand) wrote:
> Hi,
>
> After a continous run for 3 days the bridge came down crashing with the
> following kernel panic screen dump. The kernel is 2.6.0-test9-bk25 with
> kernel preemption disabled.
>
> The following call stack is what I have seen on the console. The ethernet
> cards are RTL8139. Please let me know if you want more information or finer
> debugging method, I will pass it on when the bridge fails the next time.
Can you hook a serial console to catch the precise wording?
Also, if you save copies of /proc/slabinfo at a regular interval (say, every hour),
then it is possible to see whether there is a memory leak.
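Saving the snapshots can be automated. What follows is a minimal sketch, not
something used in this thread: the function names, the file naming scheme and
the default one-hour interval are all assumptions. Each copy is forced to
disk with fsync, since the machine is expected to freeze without warning and
the last snapshots are the interesting ones.

```python
import os
import time
from pathlib import Path


def save_snapshot(dest_dir, source="/proc/slabinfo", now=None):
    """Copy `source` to dest_dir/slabinfo-<UTC timestamp> and return the new path."""
    stamp = time.strftime("%Y%m%d-%H%M%S", time.gmtime(now))
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / f"slabinfo-{stamp}"
    data = Path(source).read_bytes()
    with open(target, "wb") as out:
        out.write(data)
        out.flush()
        os.fsync(out.fileno())  # make sure the copy survives a sudden freeze
    return target


def snapshot_loop(dest_dir, interval_s=3600, rounds=None, source="/proc/slabinfo"):
    """Save a snapshot every interval_s seconds; rounds=None runs until killed."""
    done = 0
    while rounds is None or done < rounds:
        save_snapshot(dest_dir, source)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_s)
```

Running snapshot_loop("/var/log/slabinfo") in the background (the directory
name is arbitrary) leaves one timestamped file per hour to compare after the
next crash. Reading /proc/slabinfo may require root.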
* Re: 2.6.0-test9 : bridge freezes
@ 2003-12-15 13:15 Steve Hill
2003-12-16 1:17 ` David S. Miller
0 siblings, 1 reply; 22+ messages in thread
From: Steve Hill @ 2003-12-15 13:15 UTC (permalink / raw)
To: netdev
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1200 bytes --]
With both conntrack and bridging turned on in the 2.6.0-test11 kernel,
sending fragmented packets over the bridge reveals a memory leak
(specifically, when forwarding packets from any interface to a bridge). The
leaking memory appears to be allocated at line 299 of
net/bridge/br_netfilter.c:
	if ((nf_bridge = nf_bridge_alloc(skb)) == NULL)
		return NF_DROP;
Only the first fragment gets freed later on.
The patch attached fixes the problem by freeing nf_bridge when the
packets are defragmented, however I am sure this is not the right place
to do this. Where would the skb's for the fragments usually get freed?
Bart De Schuymer suggested that they should be freed in
skbuff.c::skb_release_data(), but having looked at this it seems to do
this already. skb_release_data() calls skb_drop_fraglist(), which does
kfree_skb() on each fragment, and kfree_skb calls nf_bridge_put correctly
so this isn't the problem.
--
- Steve Hill
Senior Software Developer Email: steve@navaho.co.uk
Navaho Technologies Ltd. Tel: +44-870-7034015
... Alcohol and calculus don't mix - Don't drink and derive! ...
[-- Attachment #2: Type: TEXT/PLAIN, Size: 565 bytes --]
diff -urN linux-2.6.0-test11.vanilla/net/ipv4/ip_fragment.c linux-2.6.0-test11/net/ipv4/ip_fragment.c
--- linux-2.6.0-test11.vanilla/net/ipv4/ip_fragment.c 2003-12-12 19:27:07.000000000 +0000
+++ linux-2.6.0-test11/net/ipv4/ip_fragment.c 2003-12-15 08:49:01.000000000 +0000
@@ -592,6 +592,9 @@
 	atomic_sub(head->truesize, &ip_frag_mem);
 
 	for (fp=head->next; fp; fp = fp->next) {
+#ifdef CONFIG_BRIDGE_NETFILTER
+		nf_bridge_put(fp->nf_bridge);
+#endif
 		head->data_len += fp->len;
 		head->len += fp->len;
 		if (head->ip_summed != fp->ip_summed)
* Re: 2.6.0-test9 : bridge freezes
2003-12-15 13:15 Steve Hill
@ 2003-12-16 1:17 ` David S. Miller
2003-12-16 7:43 ` Bart De Schuymer
0 siblings, 1 reply; 22+ messages in thread
From: David S. Miller @ 2003-12-16 1:17 UTC (permalink / raw)
To: Steve Hill; +Cc: netdev
On Mon, 15 Dec 2003 13:15:44 +0000 (GMT)
Steve Hill <steve@navaho.co.uk> wrote:
> The memory that is leaking seems to be being allocated on line 299 on
> net/bridge/br_netfilter.c:
>
> if ((nf_bridge = nf_bridge_alloc(skb)) == NULL)
> return NF_DROP;
>
> Only the first fragment gets freed later on.
I see.
> The patch attached fixes the problem by freeing nf_bridge when the
> packets are defragmented, however I am sure this is not the right place
> to do this. Where would the skb's for the fragments usually get freed?
>
> Bart De Schuymer suggested that they should be freed in
> skbuff.c::skb_release_data(), but having looked at this it seems to do
> this already. skb_release_data() calls skb_drop_fraglist(), which does
> kfree_skb() on each fragment, and kfree_skb calls nf_bridge_put correctly
> so this isn't the problem.
There must be something in particular that the IPV4 fragmentation code
is doing that makes these fragment reference drops get forgotten. Hmmm...
I just noticed that both bridge netfilter and IPV4 fragmentation make much
use of the skb->cb[] control block, this may be the true source of the
troubles.
In fact, since bridge netfilter expects pointers to be there, I'm surprised
this does not cause a crash.
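The hazard of two layers sharing one per-packet scratch area can be shown
with a toy model. This is not kernel code and does not reproduce the real
struct layouts: the 48-byte size, the Packet class and the function names are
all assumptions chosen for illustration. One layer stores the original
destination address at the start of the buffer, another layer later reuses
the same bytes, and the first layer then reads back a value it never wrote.

```python
import socket
import struct

CB_SIZE = 48  # assumed size of the shared per-packet scratch area


class Packet:
    def __init__(self, daddr):
        self.daddr = daddr              # current destination address
        self.cb = bytearray(CB_SIZE)    # scratch bytes shared by every layer


def bridge_store_orig_daddr(pkt):
    """Bridge layer: remember the destination address in the scratch area."""
    pkt.cb[0:4] = socket.inet_aton(pkt.daddr)


def bridge_dnat_took_place(pkt):
    """Bridge layer: has the destination changed since it was stored?"""
    return bytes(pkt.cb[0:4]) != socket.inet_aton(pkt.daddr)


def fragment_layer_touch(pkt, offset):
    """Another layer reuses the same bytes for its own bookkeeping (hypothetical)."""
    struct.pack_into("!I", pkt.cb, 0, offset)
```

After bridge_store_orig_daddr(pkt), the check bridge_dnat_took_place(pkt)
returns False. Once fragment_layer_touch(pkt, 1480) has run it returns True,
although the destination never changed. Keeping such state in storage owned
by one layer avoids this kind of clash.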
* Re: 2.6.0-test9 : bridge freezes
2003-12-16 1:17 ` David S. Miller
@ 2003-12-16 7:43 ` Bart De Schuymer
2003-12-16 7:46 ` David S. Miller
2003-12-16 9:00 ` Steve Hill
0 siblings, 2 replies; 22+ messages in thread
From: Bart De Schuymer @ 2003-12-16 7:43 UTC (permalink / raw)
To: David S. Miller, Steve Hill; +Cc: netdev
On Tuesday 16 December 2003 02:17, David S. Miller wrote:
> There must be something in particular that the IPV4 fragmentation code
> is doing that makes these fragment reference drops get forgotten. Hmmm...
>
> I just noticed that both bridge netfilter and IPV4 fragmentation make much
> use of the skb->cb[] control block, this may be the true source of the
> troubles.
>
> In fact, since bridge netfilter expects pointers to be there, I'm surprised
> this does not cause a crash.
It only expects a pointer in br_nf_forward_finish() for ARP traffic. I
checked and the ARP code doesn't use the control buffer.
For IP traffic, it uses the control buffer just before and just after
the call to the IP PRE_ROUTING hook.
OK, I just looked at the ip_fragment.c code and it uses the control buffer
too. You are truly amazing. I'll use skbuff.c::nf_bridge_info instead.
Steve, does this patch fix things? Of course, first remove your code from
ip_fragment.c. I haven't tested this patch yet, this will have to wait
until this evening.
Dave, I'll cook up a slightly different patch for you later, I think
nf_bridge->hh is now a bad name, I'll change it into nf_bridge->data.
thanks,
Bart
--- linux-2.6.0-test11-bk10/net/bridge/br_netfilter.c.old 2003-12-16 08:33:35.000000000 +0100
+++ linux-2.6.0-test11-bk10/net/bridge/br_netfilter.c 2003-12-16 08:34:12.000000000 +0100
@@ -38,11 +38,9 @@
#define skb_origaddr(skb) (((struct bridge_skb_cb *) \
- (skb->cb))->daddr.ipv4)
+ (skb->nf_bridge->hh))->daddr.ipv4)
#define store_orig_dstaddr(skb) (skb_origaddr(skb) = (skb)->nh.iph->daddr)
#define dnat_took_place(skb) (skb_origaddr(skb) != (skb)->nh.iph->daddr)
-#define clear_cb(skb) (memset(&skb_origaddr(skb), 0, \
- sizeof(struct bridge_skb_cb)))
#define has_bridge_parent(device) ((device)->br_port != NULL)
#define bridge_parent(device) ((device)->br_port->br->dev)
@@ -203,7 +201,6 @@ bridged_dnat:
*/
nf_bridge->mask |= BRNF_BRIDGED_DNAT;
skb->dev = nf_bridge->physindev;
- clear_cb(skb);
if (skb->protocol ==
__constant_htons(ETH_P_8021Q)) {
skb_push(skb, VLAN_HLEN);
@@ -224,7 +221,6 @@ bridged_dnat:
dst_hold(skb->dst);
}
- clear_cb(skb);
skb->dev = nf_bridge->physindev;
if (skb->protocol == __constant_htons(ETH_P_8021Q)) {
skb_push(skb, VLAN_HLEN);
* Re: 2.6.0-test9 : bridge freezes
2003-12-16 7:43 ` Bart De Schuymer
@ 2003-12-16 7:46 ` David S. Miller
2003-12-16 9:00 ` Steve Hill
1 sibling, 0 replies; 22+ messages in thread
From: David S. Miller @ 2003-12-16 7:46 UTC (permalink / raw)
To: Bart De Schuymer; +Cc: steve, netdev
On Tue, 16 Dec 2003 08:43:58 +0100
Bart De Schuymer <bdschuym@pandora.be> wrote:
> You are truly amazing.
Don't make me blush in public :)
> Dave, I'll cook up a slightly different patch for you later, I think
> nf_bridge->hh is now a bad name, I'll change it into nf_bridge->data.
Great, I hope we've got this one nailed.
* Re: 2.6.0-test9 : bridge freezes
2003-12-16 7:43 ` Bart De Schuymer
2003-12-16 7:46 ` David S. Miller
@ 2003-12-16 9:00 ` Steve Hill
2003-12-16 21:46 ` Bart De Schuymer
1 sibling, 1 reply; 22+ messages in thread
From: Steve Hill @ 2003-12-16 9:00 UTC (permalink / raw)
To: Bart De Schuymer; +Cc: David S. Miller, netdev
On Tue, 16 Dec 2003, Bart De Schuymer wrote:
> Steve, does this patch fix things? Of course, first remove your code from
> ip_fragment.c. I haven't tested this patch yet, this will have to wait
> until this evening.
No, it still leaks I'm afraid :(
--
- Steve Hill
Senior Software Developer Email: steve@navaho.co.uk
Navaho Technologies Ltd. Tel: +44-870-7034015
... Alcohol and calculus don't mix - Don't drink and derive! ...
* Re: 2.6.0-test9 : bridge freezes
2003-12-16 9:00 ` Steve Hill
@ 2003-12-16 21:46 ` Bart De Schuymer
2003-12-16 21:49 ` David S. Miller
2003-12-17 8:36 ` Steve Hill
0 siblings, 2 replies; 22+ messages in thread
From: Bart De Schuymer @ 2003-12-16 21:46 UTC (permalink / raw)
To: Steve Hill; +Cc: David S. Miller, netdev
On Tuesday 16 December 2003 10:00, Steve Hill wrote:
> On Tue, 16 Dec 2003, Bart De Schuymer wrote:
> > Steve, does this patch fix things? Of course, first remove your code from
> > ip_fragment.c. I haven't tested this patch yet, this will have to wait
> > until this evening.
>
> No, it still leaks I'm afraid :(
OK, I think the patch below should fix it; my previous patch is still valid.
I'll send a combined patch once it's confirmed that this fixes the bug.
Steve, please test this patch.
cheers,
Bart
--- linux-2.6.0-test11-bk10/net/ipv4/ip_output.c.old 2003-12-16 22:05:02.000000000 +0100
+++ linux-2.6.0-test11-bk10/net/ipv4/ip_output.c 2003-12-16 22:36:02.000000000 +0100
@@ -417,6 +417,7 @@ static void ip_copy_metadata(struct sk_b
 	to->nfct = from->nfct;
 	nf_conntrack_get(to->nfct);
 #ifdef CONFIG_BRIDGE_NETFILTER
+	nf_bridge_put(to->nf_bridge);
 	to->nf_bridge = from->nf_bridge;
 	nf_bridge_get(to->nf_bridge);
 #endif
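Why the added nf_bridge_put() matters can be seen in a toy reference-counting
model. This is a sketch under stated assumptions, not the kernel
implementation: the classes and helper names below are invented, and the
model only assumes what the patch implies, namely that the destination
already holds a reference to its own object when the pointer is overwritten.

```python
class BridgeInfo:
    live = 0  # objects allocated and not yet freed

    def __init__(self):
        self.use = 1
        BridgeInfo.live += 1


def get(info):
    if info is not None:
        info.use += 1


def put(info):
    if info is not None:
        info.use -= 1
        if info.use == 0:
            BridgeInfo.live -= 1  # last reference gone: object is freed


class Skb:
    def __init__(self, nf_bridge=None):
        self.nf_bridge = nf_bridge


def copy_metadata(to, frm, drop_old):
    if drop_old:
        put(to.nf_bridge)   # the fix: release what `to` already holds
    to.nf_bridge = frm.nf_bridge
    get(to.nf_bridge)


def free_skb(skb):
    put(skb.nf_bridge)
    skb.nf_bridge = None


def leaked_after_copy(drop_old):
    """Count objects still alive after both buffers have been freed."""
    BridgeInfo.live = 0
    head, frag = Skb(BridgeInfo()), Skb(BridgeInfo())
    copy_metadata(frag, head, drop_old)
    free_skb(head)
    free_skb(frag)
    return BridgeInfo.live
```

In this model leaked_after_copy(False) returns 1, because the object the
fragment held before the copy is never released, while
leaked_after_copy(True) returns 0.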
* Re: 2.6.0-test9 : bridge freezes
2003-12-16 21:46 ` Bart De Schuymer
@ 2003-12-16 21:49 ` David S. Miller
2003-12-17 8:36 ` Steve Hill
1 sibling, 0 replies; 22+ messages in thread
From: David S. Miller @ 2003-12-16 21:49 UTC (permalink / raw)
To: Bart De Schuymer; +Cc: steve, netdev
On Tue, 16 Dec 2003 22:46:45 +0100
Bart De Schuymer <bdschuym@pandora.be> wrote:
> #ifdef CONFIG_BRIDGE_NETFILTER
> + nf_bridge_put(to->nf_bridge);
> to->nf_bridge = from->nf_bridge;
> nf_bridge_get(to->nf_bridge);
> #endif
Now this change makes a LOT of sense, great that you discovered it.
Steve, we eagerly await your test results :)
* Re: 2.6.0-test9 : bridge freezes
2003-12-16 21:46 ` Bart De Schuymer
2003-12-16 21:49 ` David S. Miller
@ 2003-12-17 8:36 ` Steve Hill
2003-12-17 18:27 ` Bart De Schuymer
1 sibling, 1 reply; 22+ messages in thread
From: Steve Hill @ 2003-12-17 8:36 UTC (permalink / raw)
To: Bart De Schuymer; +Cc: David S. Miller, netdev
On Tue, 16 Dec 2003, Bart De Schuymer wrote:
> OK, I think the patch below should fix it, my previous patch is still valid.
> I'll send a combined patch once it's confirmed this fixes the bug.
Yep, this seems to fix the problem. Thanks.
--
- Steve Hill
Senior Software Developer Email: steve@navaho.co.uk
Navaho Technologies Ltd. Tel: +44-870-7034015
... Alcohol and calculus don't mix - Don't drink and derive! ...
* Re: 2.6.0-test9 : bridge freezes
2003-12-17 8:36 ` Steve Hill
@ 2003-12-17 18:27 ` Bart De Schuymer
0 siblings, 0 replies; 22+ messages in thread
From: Bart De Schuymer @ 2003-12-17 18:27 UTC (permalink / raw)
To: Steve Hill; +Cc: David S. Miller, netdev
On Wednesday 17 December 2003 09:36, Steve Hill wrote:
> On Tue, 16 Dec 2003, Bart De Schuymer wrote:
> > OK, I think the patch below should fix it, my previous patch is still
> > valid. I'll send a combined patch once it's confirmed this fixes the bug.
>
> Yep, this seems to fix the problem. Thanks.
Cool.
Dave, here's the patch: nf_bridge_info.hh is renamed to nf_bridge_info.data,
the bridge-nf IP code no longer uses the control buffer, and a very
necessary call to nf_bridge_put is added.
cheers,
Bart
--- linux-2.6.0-test11-bk10/include/linux/skbuff.h.old 2003-12-17 08:12:12.000000000 +0100
+++ linux-2.6.0-test11-bk10/include/linux/skbuff.h 2003-12-17 08:12:38.000000000 +0100
@@ -107,7 +107,7 @@ struct nf_bridge_info {
struct net_device *netoutdev;
#endif
unsigned int mask;
- unsigned long hh[32 / sizeof(unsigned long)];
+ unsigned long data[32 / sizeof(unsigned long)];
};
#endif
--- linux-2.6.0-test11-bk10/include/linux/netfilter_bridge.h.old 2003-12-17 08:08:21.000000000 +0100
+++ linux-2.6.0-test11-bk10/include/linux/netfilter_bridge.h 2003-12-17 08:09:19.000000000 +0100
@@ -73,11 +73,11 @@ void nf_bridge_maybe_copy_header(struct
if (skb->nf_bridge) {
#if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
if (skb->protocol == __constant_htons(ETH_P_8021Q)) {
- memcpy(skb->data - 18, skb->nf_bridge->hh, 18);
+ memcpy(skb->data - 18, skb->nf_bridge->data, 18);
skb_push(skb, 4);
} else
#endif
- memcpy(skb->data - 16, skb->nf_bridge->hh, 16);
+ memcpy(skb->data - 16, skb->nf_bridge->data, 16);
}
}
@@ -90,7 +90,7 @@ void nf_bridge_save_header(struct sk_buf
if (skb->protocol == __constant_htons(ETH_P_8021Q))
header_size = 18;
#endif
- memcpy(skb->nf_bridge->hh, skb->data - header_size, header_size);
+ memcpy(skb->nf_bridge->data, skb->data - header_size, header_size);
}
struct bridge_skb_cb {
--- linux-2.6.0-test11-bk10/net/bridge/br_netfilter.c.old 2003-12-16 08:33:35.000000000 +0100
+++ linux-2.6.0-test11-bk10/net/bridge/br_netfilter.c 2003-12-17 08:08:08.000000000 +0100
@@ -38,11 +38,9 @@
#define skb_origaddr(skb) (((struct bridge_skb_cb *) \
- (skb->cb))->daddr.ipv4)
+ (skb->nf_bridge->data))->daddr.ipv4)
#define store_orig_dstaddr(skb) (skb_origaddr(skb) = (skb)->nh.iph->daddr)
#define dnat_took_place(skb) (skb_origaddr(skb) != (skb)->nh.iph->daddr)
-#define clear_cb(skb) (memset(&skb_origaddr(skb), 0, \
- sizeof(struct bridge_skb_cb)))
#define has_bridge_parent(device) ((device)->br_port != NULL)
#define bridge_parent(device) ((device)->br_port->br->dev)
@@ -203,7 +201,6 @@ bridged_dnat:
*/
nf_bridge->mask |= BRNF_BRIDGED_DNAT;
skb->dev = nf_bridge->physindev;
- clear_cb(skb);
if (skb->protocol ==
__constant_htons(ETH_P_8021Q)) {
skb_push(skb, VLAN_HLEN);
@@ -224,7 +221,6 @@ bridged_dnat:
dst_hold(skb->dst);
}
- clear_cb(skb);
skb->dev = nf_bridge->physindev;
if (skb->protocol == __constant_htons(ETH_P_8021Q)) {
skb_push(skb, VLAN_HLEN);
--- linux-2.6.0-test11-bk10/net/ipv4/ip_output.c.old 2003-12-16 22:05:02.000000000 +0100
+++ linux-2.6.0-test11-bk10/net/ipv4/ip_output.c 2003-12-16 22:36:02.000000000 +0100
@@ -417,6 +417,7 @@ static void ip_copy_metadata(struct sk_b
to->nfct = from->nfct;
nf_conntrack_get(to->nfct);
#ifdef CONFIG_BRIDGE_NETFILTER
+ nf_bridge_put(to->nf_bridge);
to->nf_bridge = from->nf_bridge;
nf_bridge_get(to->nf_bridge);
#endif
Thread overview: 22+ messages
2003-11-22 15:27 2.6.0-test9 : bridge freezes SVR Anand
2003-11-22 16:19 ` Gene Heskett
2003-11-22 16:20 ` Linus Torvalds
2003-11-23 23:26 ` David S. Miller
2003-11-24 0:02 ` Markus Hästbacka
2003-11-25 17:21 ` 2.6.0-test9-bk25 : bridge works fine SVR Anand
2003-11-29 9:44 ` Bridging woes after 3 days SVR Anand
2003-12-01 19:33 ` Stephen Hemminger
2003-11-22 19:18 ` 2.6.0-test9 : bridge freezes Markus Hästbacka
2003-11-24 19:09 ` [Bridge] " Stephen Hemminger
2003-11-24 19:09 ` Stephen Hemminger
2003-11-25 6:39 ` [Bridge] " SVR Anand
2003-11-25 6:39 ` SVR Anand
-- strict thread matches above, loose matches on Subject: below --
2003-12-15 13:15 Steve Hill
2003-12-16 1:17 ` David S. Miller
2003-12-16 7:43 ` Bart De Schuymer
2003-12-16 7:46 ` David S. Miller
2003-12-16 9:00 ` Steve Hill
2003-12-16 21:46 ` Bart De Schuymer
2003-12-16 21:49 ` David S. Miller
2003-12-17 8:36 ` Steve Hill
2003-12-17 18:27 ` Bart De Schuymer