* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
[not found] <3F964F9D.D5C69498@fy.chalmers.se>
@ 2003-10-23 9:02 ` Harald Welte
2003-10-23 9:52 ` Andy Polyakov
0 siblings, 1 reply; 9+ messages in thread
From: Harald Welte @ 2003-10-23 9:02 UTC (permalink / raw)
To: Andy Polyakov; +Cc: coreteam, Netfilter Development Mailinglist
[-- Attachment #1: Type: text/plain, Size: 1590 bytes --]
On Wed, Oct 22, 2003 at 11:36:29AM +0200, Andy Polyakov wrote:
> Hi,
>
> This is a preliminary report:-)
ok, thanks.
> - VMware "host-only" network is NAT-ed by ipchains;
Just a quick unrelated question: Why would somebody be using ipchains on
a 2.6 kernel?
> What the heck with rebooting guest and connecting to same URL? Guest OS
> will use very same port numbers at second boot and NAT layer will use
> same port translations, which appears as the triggering factor itself.
>
> Reloading ipchains module in between guest OS boots makes it possible to
> avoid lock-ups/oopses.
As I have no idea about vmware: does it destroy the virtual interface on
the host at time of the reboot in your guest os?
What about using iptables? Does it produce a similar behaviour? I
think this is the first time within at least a year that we've had any
report of somebody using (or finding bugs) in the ipfwadm/ipchains
compat layer... it wasn't even frequently used with 2.4.x
> Question. Should I pursue the issue further?
yes, please. Especially a means of reproduction without running
proprietary software (and thus being reproducible for me) would be very
helpful.
> Cheers. A.
--
- Harald Welte <laforge@netfilter.org> http://www.netfilter.org/
============================================================================
"Fragmentation is like classful addressing -- an interesting early
architectural error that shows how much experimentation was going
on while IP was being designed." -- Paul Vixie
* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
2003-10-23 9:02 ` [netfilter-core] linux-2.6.0-testX ipchains oops in NAT Harald Welte
@ 2003-10-23 9:52 ` Andy Polyakov
2003-10-23 10:57 ` Harald Welte
2003-10-23 11:16 ` Andy Polyakov
0 siblings, 2 replies; 9+ messages in thread
From: Andy Polyakov @ 2003-10-23 9:52 UTC (permalink / raw)
To: Harald Welte; +Cc: coreteam, Netfilter Development Mailinglist
> Just a quick unrelated question: Why would somebody be using ipchains on
> a 2.6 kernel?
Well, I use it because it has been on my computer since eternity and I am
used to it. It's hardly a "crime":-):-):-)
> > What the heck with rebooting guest and connecting to same URL? Guest OS
> > will use very same port numbers at second boot and NAT layer will use
> > same port translations, which appears as the triggering factor itself.
> >
> > Reloading ipchains module in between guest OS boots makes it possible to
> > avoid lock-ups/oopses.
>
> As I have no idea about vmware: does it destroy the virtual interface on
> the host at time of the reboot in your guest os?
No, vmnetN interfaces are persistent and are taken up/down upon system
boot/shutdown. They are pretty much independent from the VMware
application, the one which arranges for communication between guest OS
and the vmnetN interface.
> What about using iptables? Does it produce a similar behaviour?
I don't know. I'll check at some point, but not right away...
> I
> think this is the first time within at least a year that we've had any
> report of somebody using (or finding bugs) in the ipfwadm/ipchains
> compat layer... it wasn't even frequently used with 2.4.x
Well, then it probably should have disappeared from 2.6. I mean if the
code is there, then it should work and is actually expected to work.
> > Question. Should I pursue the issue further?
>
> yes, please. Especially a means of reproduction without running
> proprietary software (and thus being reproducible for me) would be very
> helpful.
Would eth0:1 be sufficient? A.
* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
2003-10-23 9:52 ` Andy Polyakov
@ 2003-10-23 10:57 ` Harald Welte
2003-10-23 14:29 ` Andy Polyakov
2003-10-23 11:16 ` Andy Polyakov
1 sibling, 1 reply; 9+ messages in thread
From: Harald Welte @ 2003-10-23 10:57 UTC (permalink / raw)
To: Andy Polyakov; +Cc: coreteam, Netfilter Development Mailinglist
[-- Attachment #1: Type: text/plain, Size: 1785 bytes --]
On Thu, Oct 23, 2003 at 11:52:05AM +0200, Andy Polyakov wrote:
> > Just a quick unrelated question: Why would somebody be using ipchains on
> > a 2.6 kernel?
>
> Well, I use it because it has been on my computer since eternity and I am
> used to it. It's hardly a "crime":-):-):-)
No, it's not a crime - just a very strange thing to do (at least from my
experience).
> > What about using iptables? Does it produce a similar behaviour?
>
> I don't know. I'll check at some point, but not right away...
That would be helpful, since it would become a way more important bug if
it was in iptables.
> > think this is the first time within at least a year that we've had any
> > report of somebody using (or finding bugs) in the ipfwadm/ipchains
> > compat layer... it wasn't even frequently used with 2.4.x
>
> Well, then it probably should have disappeared from 2.6. I mean if the
> code is there, then it should work and is actually expected to work.
Actually we were thinking about removal - and I am still tempted to do
so.
> > yes, please. Especially a means of reproduction without running
> > proprietary software (and thus being reproducible for me) would be very
> > helpful.
>
> Would eth0:1 be sufficient? A.
Yes, it would. Can you confirm the bug happens when you use ipchains,
an alias interface and reuse addresses/ports from a machine behind that
interface?
--
- Harald Welte <laforge@netfilter.org> http://www.netfilter.org/
============================================================================
"Fragmentation is like classful addressing -- an interesting early
architectural error that shows how much experimentation was going
on while IP was being designed." -- Paul Vixie
* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
2003-10-23 9:52 ` Andy Polyakov
2003-10-23 10:57 ` Harald Welte
@ 2003-10-23 11:16 ` Andy Polyakov
2003-10-23 11:26 ` Andy Polyakov
2003-10-26 6:19 ` Rusty Russell
1 sibling, 2 replies; 9+ messages in thread
From: Andy Polyakov @ 2003-10-23 11:16 UTC (permalink / raw)
To: Harald Welte, coreteam, Netfilter Development Mailinglist
[-- Attachment #1: Type: text/plain, Size: 815 bytes --]
> > > Question. Should I pursue the issue further?
> >
> > yes, please. Especially a means of reproduction without running
> > proprietary software (and thus being reproducible for me) would be very
> > helpful.
>
> Would eth0:1 be sufficient?
It's perfectly reproducible with eth0:1. In other words I
- take up eth0:1 with private ip address, e.g. 192.168.60.1 on computer
running 2.6 with 'ipchains -A forward -s 192.168.0.0/255.255.0.0 -d
0.0.0.0/0.0.0.0 -j MASQ';
- on another computer take up eth0:1 with e.g. 192.168.60.2 and 'route
add host some.host 192.168.60.1';
- on that other computer run attached script as './conn.pl some.host 80
2345';
- wait till port translation expires at first computer;
- run attached script as './conn.pl some.host 80 2345' once again;
- collect attached console.dump;
A.
[-- Attachment #2: conn.pl --]
[-- Type: application/x-perl, Size: 598 bytes --]
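The attached conn.pl is not reproduced in this archive. As a rough stand-in, the sketch below shows one way such a script might behave, assuming (the thread does not say so) that the three arguments are the destination host, the destination port, and a fixed local source port. Pinning the source port is what would make the second run hit the same NAT port translation as the first. The function names parse_args and connect_from are hypothetical, and Python is used only for illustration.

```python
import socket


def parse_args(argv):
    """Split ['host', 'dport', 'sport'] into (host, dport, sport).

    Assumption: the third argument is the local source port.
    """
    host, dport, sport = argv
    return host, int(dport), int(sport)


def connect_from(host, dport, sport, timeout=5.0):
    """Open a TCP connection to host:dport from a fixed local source port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow the same source port to be bound again on the second run.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.settimeout(timeout)
    try:
        sock.bind(("", sport))       # pin the source port
        sock.connect((host, dport))  # goes out through the MASQ box
    except OSError:
        sock.close()
        raise
    return sock
```

With the route from the steps above in place, the equivalent of './conn.pl some.host 80 2345' would be connect_from(*parse_args(["some.host", "80", "2345"])), followed by closing the returned socket.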
[-- Attachment #3: console.dump --]
[-- Type: text/plain, Size: 1824 bytes --]
Unable to handle kernel paging request at virtual address 00100108
printing eip:
e08787e1
*pde = 00000000
Oops: 0000 [#1]
CPU: 0
EIP: 0060:[<e08787e1>] Tainted: PF
EFLAGS: 00013203
EIP is at find_appropriate_src+0x3d/0xa0 [ipchains]
eax: e084dcf0 ebx: 00100100 ecx: dd3cdd44 edx: 0000059e
esi: 00000000 edi: dd3cdcd8 ebp: dd3cdc50 esp: dd3cdc40
ds: 007b es: 007b ss: 0068
Process X (pid: 1257, threadinfo=dd3cc000 task=dd3b5940)
Stack: dd3cdce8 dd3cdd08 dd3cdcd8 0000059e dd3cdc90 e0878a83 dd3cdcd8 dd3cdd44
dd3cdce8 c80e4ea4 dd3cdd08 e08817e0 dd3cdcd8 c80e4e2c c80e4e2c dd3cdc9c
e0876099 dd3cdcd8 c80e4e2c e0881580 dd3cdd18 e0878c2d dd3cdd08 dd3cdcd8
Call Trace:
[<e0878a83>] get_unique_tuple+0x33/0x190 [ipchains]
[<e0876099>] invert_tuplepr+0x1d/0x28 [ipchains]
[<e0878c2d>] ip_nat_setup_info+0x4d/0x2a0 [ipchains]
[<e0875ff3>] ip_conntrack_in+0x18f/0x218 [ipchains]
[<c02300c3>] __ip_route_output_key+0x23/0xe4
[<e08780b8>] gcc2_compiled.+0x168/0x1f0 [ipchains]
[<e0877735>] fw_in+0x1f9/0x228 [ipchains]
[<c0229f70>] nf_iterate+0x44/0xa4
[<c0232c44>] ip_forward_finish+0x0/0x4c
[<c022a30a>] nf_hook_slow+0x8e/0x124
[<c0232c44>] ip_forward_finish+0x0/0x4c
[<c0232bfc>] ip_forward+0x1ec/0x234
[<c0232c44>] ip_forward_finish+0x0/0x4c
[<c0231c89>] ip_rcv_finish+0x1bd/0x204
[<c022a348>] nf_hook_slow+0xcc/0x124
[<c02318da>] ip_rcv+0x3ae/0x3f0
[<c0231acc>] ip_rcv_finish+0x0/0x204
[<c0220fe0>] netif_receive_skb+0x13c/0x18c
[<c022109f>] process_backlog+0x6f/0x100
[<c02211a2>] net_rx_action+0x72/0x11c
[<c01202be>] do_softirq+0x4e/0xa0
[<c010d251>] do_IRQ+0x115/0x130
[<c010ba08>] common_interrupt+0x18/0x20
Code: 8b 53 08 0f b7 47 0e 31 f6 66 39 42 1e 75 2e 8b 07 39 42 10
<0>Kernel panic: Fatal exception in interrupt
In interrupt handler - not syncing
* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
2003-10-23 11:16 ` Andy Polyakov
@ 2003-10-23 11:26 ` Andy Polyakov
2003-10-26 6:19 ` Rusty Russell
1 sibling, 0 replies; 9+ messages in thread
From: Andy Polyakov @ 2003-10-23 11:26 UTC (permalink / raw)
To: Harald Welte, coreteam, Netfilter Development Mailinglist
> - collect attached console.dump;
> Unable to handle kernel paging request at virtual address 00100108
> EIP is at find_appropriate_src+0x3d/0xa0 [ipchains]
Just as in my original report. It was TCP translation which has expired,
therefore +0x3d, 0x100100 is address "i" itself from
"i->conntrack->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.protonum ==
tuple->dst.protonum" line in src_cmp. A.
* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
2003-10-23 10:57 ` Harald Welte
@ 2003-10-23 14:29 ` Andy Polyakov
0 siblings, 0 replies; 9+ messages in thread
From: Andy Polyakov @ 2003-10-23 14:29 UTC (permalink / raw)
To: Harald Welte; +Cc: coreteam, Netfilter Development Mailinglist
> > > What about using iptables? Does it produce a similar behaviour?
> >
> > I don't know. I'll check at some point, but not right away...
>
> That would be helpful, since it would become a way more important bug if
> it was in iptables.
The bug appears to be ipchains specific. Meaning that I can confirm that
my kernel does *not* crash [after VMware guest OS reboot] if I use
iptables to implement equivalent setup. A.
* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
2003-10-23 11:16 ` Andy Polyakov
2003-10-23 11:26 ` Andy Polyakov
@ 2003-10-26 6:19 ` Rusty Russell
2003-10-26 13:31 ` Andy Polyakov
2003-10-27 7:58 ` David S. Miller
1 sibling, 2 replies; 9+ messages in thread
From: Rusty Russell @ 2003-10-26 6:19 UTC (permalink / raw)
To: Andy Polyakov
Cc: Harald Welte, coreteam, Netfilter Development Mailinglist, davem
In message <3F97B874.CB12C184@fy.chalmers.se> you write:
> It's perfectly reproducible with eth0:1. In other words I
Thanks for the excellent help Andy!
Found it by inspection from Andy's description.
We updated ip_nat_setup_info to set the initialized flag and call
place_in_hashes, but *didn't* change the call in ip_fw_compat_masq.c
which also calls place_in_hashes() itself (again!). Result: corrupt
list, and next thing which lands in the same hash bucket goes boom.
This should fix it.
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
Name: ipchains/ipfwadm compat changes for new ip_nat_setup_info
Author: Rusty Russell
Status: Experimental
D: We updated ip_nat_setup_info to set the initialized flag and call
D: place_in_hashes, but *didn't* change the call in ip_fw_compat_masq.c
D: which also calls place_in_hashes() itself (again!). Result: corrupt
D: list, and next thing which lands in the same hash bucket goes boom.
D:
D: Thanks to Andy Polyakov for chasing this down.
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .17896-linux-2.6.0-test9/net/ipv4/netfilter/ip_fw_compat_masq.c .17896-linux-2.6.0-test9.updated/net/ipv4/netfilter/ip_fw_compat_masq.c
--- .17896-linux-2.6.0-test9/net/ipv4/netfilter/ip_fw_compat_masq.c 2003-09-22 10:28:14.000000000 +1000
+++ .17896-linux-2.6.0-test9.updated/net/ipv4/netfilter/ip_fw_compat_masq.c 2003-10-26 17:17:30.000000000 +1100
@@ -91,9 +91,6 @@ do_masquerade(struct sk_buff **pskb, con
WRITE_UNLOCK(&ip_nat_lock);
return ret;
}
-
- place_in_hashes(ct, info);
- info->initialized = 1;
} else
DEBUGP("Masquerading already done on this conn.\n");
WRITE_UNLOCK(&ip_nat_lock);
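The failure mode described in the patch notes can be modelled outside the kernel. The sketch below is a simplified Python model, not the netfilter code: it assumes the hash buckets are circular doubly linked lists with list_add()/list_del() semantics like the kernel's, and that list_del() leaves the poison value 0x00100100 in the removed entry's next pointer. The names Node, walk and demo are hypothetical. Under those assumptions, inserting the same entry twice makes it point at itself, so deleting it on expiry leaves the bucket head still referring to the stale entry, and the next walk of that bucket runs into the poison value. That would be consistent with ebx: 00100100 and the faulting address 00100108 (the same value plus an 8-byte field offset) in the oops above.

```python
LIST_POISON1 = 0x00100100  # assumed: value list_del() leaves in entry.next
LIST_POISON2 = 0x00200200  # assumed: value list_del() leaves in entry.prev


class Node:
    """List head or entry; a fresh node is an empty circular list."""

    def __init__(self, name):
        self.name = name
        self.next = self
        self.prev = self


def list_add(new, head):
    """Insert new right after head, in the same order as the kernel helper."""
    nxt = head.next
    nxt.prev = new
    new.next = nxt
    new.prev = head
    head.next = new


def list_del(entry):
    """Unlink entry from its neighbours and poison its pointers."""
    entry.next.prev = entry.prev
    entry.prev.next = entry.next
    entry.next = LIST_POISON1
    entry.prev = LIST_POISON2


def walk(head, limit=8):
    """Follow next pointers from head; stop at head, a poison value, or limit."""
    seen = []
    node = head.next
    while node is not head and len(seen) < limit:
        if not isinstance(node, Node):
            seen.append(hex(node))  # the kernel would fault here
            break
        seen.append(node.name)
        node = node.next
    return seen


def demo(double_insert):
    head, old, conn = Node("bucket"), Node("old"), Node("conn")
    list_add(old, head)
    list_add(conn, head)
    if double_insert:
        list_add(conn, head)  # models the redundant second place_in_hashes()
    list_del(conn)            # models the translation expiring
    return walk(head)
```

In this model demo(False) returns ['old'], a healthy bucket, while demo(True) returns ['conn', '0x100100']: the bucket still reaches the deleted entry and then the poison pointer.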
* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
2003-10-26 6:19 ` Rusty Russell
@ 2003-10-26 13:31 ` Andy Polyakov
2003-10-27 7:58 ` David S. Miller
1 sibling, 0 replies; 9+ messages in thread
From: Andy Polyakov @ 2003-10-26 13:31 UTC (permalink / raw)
To: Rusty Russell
Cc: Harald Welte, coreteam, Netfilter Development Mailinglist, davem
> We updated ip_nat_setup_info to set the initialized flag and call
> place_in_hashes, but *didn't* change the call in ip_fw_compat_masq.c
> which also calls place_in_hashes() itself (again!). Result: corrupt
> list, and next thing which lands in the same hash bucket goes boom.
>
> This should fix it.
I can confirm that the proposed patch does resolve the problem. Thank
you. A.
* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
2003-10-26 6:19 ` Rusty Russell
2003-10-26 13:31 ` Andy Polyakov
@ 2003-10-27 7:58 ` David S. Miller
1 sibling, 0 replies; 9+ messages in thread
From: David S. Miller @ 2003-10-27 7:58 UTC (permalink / raw)
To: Rusty Russell; +Cc: appro, laforge, coreteam, netfilter-devel
On Sun, 26 Oct 2003 17:19:38 +1100
Rusty Russell <rusty@rustcorp.com.au> wrote:
> We updated ip_nat_setup_info to set the initialized flag and call
> place_in_hashes, but *didn't* change the call in ip_fw_compat_masq.c
> which also calls place_in_hashes() itself (again!). Result: corrupt
> list, and next thing which lands in the same hash bucket goes boom.
>
> This should fix it.
Applied, thanks a lot Rusty.