* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
From: Harald Welte @ 2003-10-23 9:02 UTC
To: Andy Polyakov; +Cc: coreteam, Netfilter Development Mailinglist

On Wed, Oct 22, 2003 at 11:36:29AM +0200, Andy Polyakov wrote:
> Hi,
>
> This is a preliminary report:-)

ok, thanks.

> - VMware "host-only" network is NAT-ed by ipchains;

Just a quick unrelated question: Why would somebody be using ipchains on
a 2.6 kernel?

> What the heck with rebooting guest and connecting to same URL? Guest OS
> will use very same port numbers at second boot and NAT layer will use
> same port translations, which appears as the triggering factor itself.
>
> Reloading ipchains module in between guest OS boots makes it possible to
> avoid lock-ups/oopses.

As I have no idea about vmware: does it destroy the virtual interface on
the host at time of the reboot in your guest os?

What about using iptables? Does it produce a similar behaviour? I
think this is the first time within at least a year that we've had any
report of somebody using (or finding bugs) in the ipfwadm/ipchains
compat layer... it wasn't even frequently used with 2.4.x

> Question. Should I pursue the issue further?

yes, please. Especially a means of reproduction without running
proprietary software (and thus being reproducible for me) would be very
helpful.

> Cheers. A.

--
- Harald Welte <laforge@netfilter.org> http://www.netfilter.org/
============================================================================
"Fragmentation is like classful addressing -- an interesting early
architectural error that shows how much experimentation was going
on while IP was being designed." -- Paul Vixie
* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
From: Andy Polyakov @ 2003-10-23 9:52 UTC
To: Harald Welte; +Cc: coreteam, Netfilter Development Mailinglist

> Just a quick unrelated question: Why would somebody be using ipchains on
> a 2.6 kernel?

Well, I use it, because it was on my computer since eternity and I'm used
to it. It's hardly a "crime":-):-):-)

> > What the heck with rebooting guest and connecting to same URL? Guest OS
> > will use very same port numbers at second boot and NAT layer will use
> > same port translations, which appears as the triggering factor itself.
> >
> > Reloading ipchains module in between guest OS boots makes it possible to
> > avoid lock-ups/oopses.
>
> As I have no idea about vmware: does it destroy the virtual interface on
> the host at time of the reboot in your guest os?

No, vmnetN interfaces are persistent and are taken up/down upon system
boot/shutdown. They are pretty much independent from the VMware
application, the one which arranges for communication between guest OS
and the vmnetN interface.

> What about using iptables? Does it produce a similar behaviour?

I don't know. I'll check at some point, but not right away...

> I
> think this is the first time within at least a year that we've had any
> report of somebody using (or finding bugs) in the ipfwadm/ipchains
> compat layer... it wasn't even frequently used with 2.4.x

Well, then it probably should have disappeared from 2.6. I mean if the
code is there, then it should work and is actually expected to work.

> > Question. Should I pursue the issue further?
>
> yes, please. Especially a means of reproduction without running
> proprietary software (and thus being reproducible for me) would be very
> helpful.

Would eth0:1 be sufficient? A.
* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
From: Harald Welte @ 2003-10-23 10:57 UTC
To: Andy Polyakov; +Cc: coreteam, Netfilter Development Mailinglist

On Thu, Oct 23, 2003 at 11:52:05AM +0200, Andy Polyakov wrote:
> > Just a quick unrelated question: Why would somebody be using ipchains on
> > a 2.6 kernel?
>
> Well, I use it, because it was on my computer since eternity and I'm used
> to it. It's hardly a "crime":-):-):-)

No, it's not a crime - just a very strange thing to do (at least from my
experience).

> > What about using iptables? Does it produce a similar behaviour?
>
> I don't know. I'll check at some point, but not right away...

That would be helpful, since it would become a way more important bug if
it was in iptables.

> > think this is the first time within at least a year that we've had any
> > report of somebody using (or finding bugs) in the ipfwadm/ipchains
> > compat layer... it wasn't even frequently used with 2.4.x
>
> Well, then it probably should have disappeared from 2.6. I mean if the
> code is there, then it should work and is actually expected to work.

Actually we were thinking about removal - and I am still tempted to do
so.

> > yes, please. Especially a means of reproduction without running
> > proprietary software (and thus being reproducible for me) would be very
> > helpful.
>
> Would eth0:1 be sufficient? A.

Yes, it would. Can you confirm the bug happens when you use ipchains, an
alias interface and reuse addresses/ports from a machine behind that
interface?

--
- Harald Welte <laforge@netfilter.org> http://www.netfilter.org/
* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
From: Andy Polyakov @ 2003-10-23 14:29 UTC
To: Harald Welte; +Cc: coreteam, Netfilter Development Mailinglist

> > > What about using iptables? Does it produce a similar behaviour?
> >
> > I don't know. I'll check at some point, but not right away...
>
> That would be helpful, since it would become a way more important bug if
> it was in iptables.

The bug appears to be ipchains-specific. Meaning that I can confirm that
my kernel does *not* crash [after VMware guest OS reboot] if I use
iptables to implement an equivalent setup. A.
* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
From: Andy Polyakov @ 2003-10-23 11:16 UTC
To: Harald Welte, coreteam, Netfilter Development Mailinglist

> > > Question. Should I pursue the issue further?
> >
> > yes, please. Especially a means of reproduction without running
> > proprietary software (and thus being reproducible for me) would be very
> > helpful.
>
> Would eth0:1 be sufficient?

It's perfectly reproducible with eth0:1. In other words I

- take up eth0:1 with private ip address, e.g. 192.168.60.1 on computer
  running 2.6 with 'ipchains -A forward -s 192.168.0.0/255.255.0.0 -d
  0.0.0.0/0.0.0.0 -j MASQ';
- on another computer take up eth0:1 with e.g. 192.168.60.2 and 'route
  add host some.host 192.168.60.1';
- on that other computer run attached script as './conn.pl some.host 80
  2345';
- wait till port translation expires at first computer;
- run attached script as './conn.pl some.host 80 2345' once again;
- collect attached console.dump;

A.

[-- Attachment #2: conn.pl --]
[-- Type: application/x-perl, Size: 598 bytes --]

[-- Attachment #3: console.dump --]
[-- Type: text/plain, Size: 1824 bytes --]

Unable to handle kernel paging request at virtual address 00100108
 printing eip:
e08787e1
*pde = 00000000
Oops: 0000 [#1]
CPU:    0
EIP:    0060:[<e08787e1>]    Tainted: PF
EFLAGS: 00013203
EIP is at find_appropriate_src+0x3d/0xa0 [ipchains]
eax: e084dcf0   ebx: 00100100   ecx: dd3cdd44   edx: 0000059e
esi: 00000000   edi: dd3cdcd8   ebp: dd3cdc50   esp: dd3cdc40
ds: 007b   es: 007b   ss: 0068
Process X (pid: 1257, threadinfo=dd3cc000 task=dd3b5940)
Stack: dd3cdce8 dd3cdd08 dd3cdcd8 0000059e dd3cdc90 e0878a83 dd3cdcd8 dd3cdd44
       dd3cdce8 c80e4ea4 dd3cdd08 e08817e0 dd3cdcd8 c80e4e2c c80e4e2c dd3cdc9c
       e0876099 dd3cdcd8 c80e4e2c e0881580 dd3cdd18 e0878c2d dd3cdd08 dd3cdcd8
Call Trace:
 [<e0878a83>] get_unique_tuple+0x33/0x190 [ipchains]
 [<e0876099>] invert_tuplepr+0x1d/0x28 [ipchains]
 [<e0878c2d>] ip_nat_setup_info+0x4d/0x2a0 [ipchains]
 [<e0875ff3>] ip_conntrack_in+0x18f/0x218 [ipchains]
 [<c02300c3>] __ip_route_output_key+0x23/0xe4
 [<e08780b8>] gcc2_compiled.+0x168/0x1f0 [ipchains]
 [<e0877735>] fw_in+0x1f9/0x228 [ipchains]
 [<c0229f70>] nf_iterate+0x44/0xa4
 [<c0232c44>] ip_forward_finish+0x0/0x4c
 [<c022a30a>] nf_hook_slow+0x8e/0x124
 [<c0232c44>] ip_forward_finish+0x0/0x4c
 [<c0232bfc>] ip_forward+0x1ec/0x234
 [<c0232c44>] ip_forward_finish+0x0/0x4c
 [<c0231c89>] ip_rcv_finish+0x1bd/0x204
 [<c022a348>] nf_hook_slow+0xcc/0x124
 [<c02318da>] ip_rcv+0x3ae/0x3f0
 [<c0231acc>] ip_rcv_finish+0x0/0x204
 [<c0220fe0>] netif_receive_skb+0x13c/0x18c
 [<c022109f>] process_backlog+0x6f/0x100
 [<c02211a2>] net_rx_action+0x72/0x11c
 [<c01202be>] do_softirq+0x4e/0xa0
 [<c010d251>] do_IRQ+0x115/0x130
 [<c010ba08>] common_interrupt+0x18/0x20

Code: 8b 53 08 0f b7 47 0e 31 f6 66 39 42 1e 75 2e 8b 07 39 42 10
 <0>Kernel panic: Fatal exception in interrupt
In interrupt handler - not syncing
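[Editorial sketch, not part of the original mail.] The attached conn.pl is not reproduced in the archive. Judging only from the command line './conn.pl some.host 80 2345' and the remark about reusing the same port numbers, it presumably opens a TCP connection to the given host and port from a fixed local source port. The C function below is a guess at that behaviour, not the author's script; the name connect_from_port and the argument order are assumptions.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for conn.pl: open one IPv4 TCP connection to
 * host:dst_port from the fixed local port src_port, then close it.
 * Returns 0 if the connection was established, -1 on any failure. */
int connect_from_port(const char *host, const char *dst_port,
                      unsigned short src_port)
{
    struct addrinfo hints, *res;
    struct sockaddr_in local;
    int fd, one = 1, rc = -1;

    assert(host != NULL && dst_port != NULL);

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;          /* the setup in the thread is IPv4 */
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, dst_port, &hints, &res) != 0)
        return -1;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        freeaddrinfo(res);
        return -1;
    }

    /* Let the second run bind the same source port again. */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);

    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    local.sin_port = htons(src_port);   /* fixed source port, e.g. 2345 */

    if (bind(fd, (struct sockaddr *)&local, sizeof local) == 0 &&
        connect(fd, res->ai_addr, res->ai_addrlen) == 0)
        rc = 0;

    close(fd);
    freeaddrinfo(res);
    return rc;
}
```

Calling connect_from_port("some.host", "80", 2345) twice, with a pause long enough for the NAT entry to expire in between, would correspond to the two conn.pl runs in the steps above, so that the masquerading box sees the same source address and port both times.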
* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
From: Andy Polyakov @ 2003-10-23 11:26 UTC
To: Harald Welte, coreteam, Netfilter Development Mailinglist

> - collect attached console.dump;

> Unable to handle kernel paging request at virtual address 00100108
> EIP is at find_appropriate_src+0x3d/0xa0 [ipchains]

Just as in my original report. It was a TCP translation that had expired,
therefore +0x3d; 0x100100 is the value of "i" itself in the
"i->conntrack->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.protonum ==
tuple->dst.protonum" line in src_cmp. A.
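[Editorial sketch, not part of the original mail.] The arithmetic behind this observation can be spelled out. The oops shows ebx = 00100100 and a fault at 00100108, and the first instruction bytes (8b 53 08) load from ebx+8. The value 0x00100100 looks like the poison that 2.6-era list_del() writes into a removed entry's next pointer; the two constants below are quoted from memory and should be checked against include/linux/list.h in the tree being debugged. The helper names are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values of LIST_POISON1/LIST_POISON2 in 2.6-era kernels. */
#define POISON_NEXT 0x00100100u
#define POISON_PREV 0x00200200u

/* Given the fault address from an oops and the byte offset of the field
 * being read, return the base pointer that was dereferenced. */
uint32_t base_from_fault(uint32_t fault_addr, uint32_t field_offset)
{
    assert(fault_addr >= field_offset);
    return fault_addr - field_offset;
}

/* Non-zero if ptr equals one of the assumed list poison values, i.e. the
 * code most likely followed a link of an already deleted list entry. */
int is_list_poison(uint32_t ptr)
{
    return ptr == POISON_NEXT || ptr == POISON_PREV;
}
```

With the numbers from console.dump, base_from_fault(0x00100108, 8) gives 0x00100100, which is_list_poison() accepts. Offset 8 is where a pointer placed directly after a two-pointer list head would sit on i386, which fits the read of "i->conntrack" if the hash entry is laid out that way.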
* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
From: Rusty Russell @ 2003-10-26 6:19 UTC
To: Andy Polyakov
Cc: Harald Welte, coreteam, Netfilter Development Mailinglist, davem

In message <3F97B874.CB12C184@fy.chalmers.se> you write:
> It's perfectly reproducible with eth0:1. In other words I

Thanks for the excellent help Andy!

Found it by inspection from Andy's description.

We updated ip_nat_setup_info to set the initialized flag and call
place_in_hashes, but *didn't* change the call in ip_fw_compat_masq.c
which also calls place_in_hashes() itself (again!). Result: corrupt
list, and next thing which lands in the same hash bucket goes boom.

This should fix it.
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

Name: ipchains/ipfwadm compat changes for new ip_nat_setup_info
Author: Rusty Russell
Status: Experimental

D: We updated ip_nat_setup_info to set the initialized flag and call
D: place_in_hashes, but *didn't* change the call in ip_fw_compat_masq.c
D: which also calls place_in_hashes() itself (again!). Result: corrupt
D: list, and next thing which lands in the same hash bucket goes boom.
D:
D: Thanks to Andy Polyakov for chasing this down.

diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .17896-linux-2.6.0-test9/net/ipv4/netfilter/ip_fw_compat_masq.c .17896-linux-2.6.0-test9.updated/net/ipv4/netfilter/ip_fw_compat_masq.c
--- .17896-linux-2.6.0-test9/net/ipv4/netfilter/ip_fw_compat_masq.c	2003-09-22 10:28:14.000000000 +1000
+++ .17896-linux-2.6.0-test9.updated/net/ipv4/netfilter/ip_fw_compat_masq.c	2003-10-26 17:17:30.000000000 +1100
@@ -91,9 +91,6 @@ do_masquerade(struct sk_buff **pskb, con
 			WRITE_UNLOCK(&ip_nat_lock);
 			return ret;
 		}
-
-		place_in_hashes(ct, info);
-		info->initialized = 1;
 	} else
 		DEBUGP("Masquerading already done on this conn.\n");
 	WRITE_UNLOCK(&ip_nat_lock);
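[Editorial sketch, not part of the original mail.] To see why a second place_in_hashes() call leads to the reported oops, it may help to replay the pattern on a toy list in user space. The code below is not the netfilter code. It assumes kernel-style circular doubly linked lists with insertion at the head and a list_del() that poisons the removed entry with 0x00100100; double_add_then_del and walk are names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct list_head { struct list_head *next, *prev; };

/* Assumed 2.6-era poison values written by list_del(). */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0x00100100)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0x00200200)

void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head, as the kernel's list_add() does. */
void list_add(struct list_head *entry, struct list_head *head)
{
    struct list_head *next;

    assert(entry != NULL && head != NULL);
    next = head->next;
    next->prev = entry;
    entry->next = next;
    entry->prev = head;
    head->next = entry;
}

/* Unlink entry from its neighbours and poison its own links. */
void list_del(struct list_head *entry)
{
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = LIST_POISON1;
    entry->prev = LIST_POISON2;
}

/* Follow at most limit links from head without touching poison. Returns
 * head for a clean list, or LIST_POISON1 if the walk ran onto a deleted
 * entry, which is the pointer a real traversal would dereference next. */
struct list_head *walk(struct list_head *head, int limit)
{
    struct list_head *i = head->next;

    while (limit-- > 0 && i != head && i != LIST_POISON1)
        i = i->next;
    return i;
}

/* The bug pattern: one entry inserted twice, then deleted once. */
struct list_head *double_add_then_del(void)
{
    static struct list_head bucket, entry;

    list_init(&bucket);
    list_add(&entry, &bucket);  /* first insertion                      */
    list_add(&entry, &bucket);  /* second insertion: entry.next = entry */
    list_del(&entry);           /* bucket.next still points at entry    */
    return walk(&bucket, 8);
}
```

After the second list_add() the entry points at itself, so list_del() cannot unlink it from the bucket head. The head keeps a pointer to the deleted entry, and the next traversal of that bucket picks up the poison value from it, which matches the 0x00100100 seen in ebx in the oops. With a single list_add() the same walk ends back at the bucket head.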
* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
From: Andy Polyakov @ 2003-10-26 13:31 UTC
To: Rusty Russell
Cc: Harald Welte, coreteam, Netfilter Development Mailinglist, davem

> We updated ip_nat_setup_info to set the initialized flag and call
> place_in_hashes, but *didn't* change the call in ip_fw_compat_masq.c
> which also calls place_in_hashes() itself (again!). Result: corrupt
> list, and next thing which lands in the same hash bucket goes boom.
>
> This should fix it.

I can confirm that the proposed patch does resolve the problem. Thank
you. A.
* Re: [netfilter-core] linux-2.6.0-testX ipchains oops in NAT
From: David S. Miller @ 2003-10-27 7:58 UTC
To: Rusty Russell; +Cc: appro, laforge, coreteam, netfilter-devel

On Sun, 26 Oct 2003 17:19:38 +1100
Rusty Russell <rusty@rustcorp.com.au> wrote:

> We updated ip_nat_setup_info to set the initialized flag and call
> place_in_hashes, but *didn't* change the call in ip_fw_compat_masq.c
> which also calls place_in_hashes() itself (again!). Result: corrupt
> list, and next thing which lands in the same hash bucket goes boom.
>
> This should fix it.

Applied, thanks a lot Rusty.
Thread overview: 9+ messages
     [not found] <3F964F9D.D5C69498@fy.chalmers.se>
2003-10-23  9:02 ` [netfilter-core] linux-2.6.0-testX ipchains oops in NAT Harald Welte
2003-10-23  9:52   ` Andy Polyakov
2003-10-23 10:57     ` Harald Welte
2003-10-23 14:29       ` Andy Polyakov
2003-10-23 11:16     ` Andy Polyakov
2003-10-23 11:26       ` Andy Polyakov
2003-10-26  6:19       ` Rusty Russell
2003-10-26 13:31         ` Andy Polyakov
2003-10-27  7:58         ` David S. Miller