* ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking) [not found] ` <200012181811.KAA05490@pizda.ninka.net> @ 2000-12-18 19:14 ` Harald Welte 2000-12-19 13:55 ` Tom Leete 0 siblings, 1 reply; 8+ messages in thread From: Harald Welte @ 2000-12-18 19:14 UTC (permalink / raw) To: David S. Miller; +Cc: rusty, kuznet, netfilter-devel, linux-kernel On Mon, Dec 18, 2000 at 10:11:14AM -0800, David S. Miller wrote: > From: Rusty Russell <rusty@linuxcare.com.au> > Date: Mon, 18 Dec 2000 14:15:52 +1100 > > Alexey is right, locking is screwed (explains some reports of > occasional failure during rmmod). > > Patch applied, thank you. > > I have a patch which compiles for non-linear skb mods to netfilter > (NAT uses linear packets still, but connection tracking and packet > filtering only linearize minimal requirements). Waiting for > DaveM's solution to ip_ct_gather_frags()... > > I feel it's best to just skb_clone() the skb arg to ip_defrag > and this will close the whole thing, I think. no. The clone()d skb will still have a skb->dev pointing to NULL. The problem occurs only for locally-generated outgoing packets, which need to be fragmented: - ip_build_xmit() discoveres it has to fragment - ip_build_xmit_slow() generates fragments and calls - NF_HOOK() for NF_IP_LOCAL_OUT - ip_conntrack_local() is called, which in turn calls - ip_conntrack_in(), which calls - ip_ct_gather_frags(), which calls - ip_defrag(), which calls [now we have two possible oops - caues] a ip_find(), which calls a ip_frag_create(), which initialises a timer with the function a ip_expire(), which dereferences qp->iif b ip_frag_queue(), which dereferences it in qp->iif= skb->dev->ifindex as andi kleen pointed out: > > Also is it sure that the backtrace involves ip_rcv ? A more likely > > guess is that it happens during the IP_LOCAL_OUT hook, when skb->dev > > isn't set yet, but conntrack already has to already reassemble fragments. > Actually, I do not understand how current code could even have worked > in the past. Once the SKB is passed to ip_defrag, it is nobody's > buisness to reference that SKB anymore. This ip_defrag call is (from mmh... we really don't do this. We use the return value of ip_defrag(), which is what ip_frag_reasm() returns (== the new datagram consisting out of all its fragments). > Alexey, what have I missed? I don't like the ip_fragment.c proposed > fix for this reason, what netfilter is doing with ip_defrag here looks > just wrong. Well, my conclusion is: - the defragmentation code in the ipv4 stack assumes that skb->dev points to a valid device. It does this primaryly to send icmp reassembly errors if a fragment reassembly timeout occurs - netfilter wants to use the ip_defrag for defragmentation, not only for incoming packets from the network at NF_IP_PRE_ROUTING, but also for locally-generated fragmented packets (at NF_IP_LOCAL_OUT). Unfortunately we don't have a valid skb->dev at this point. I don't think that there is any skb_clone()ing needed, nor this would solve the problem. The solution is a) netfilter conntrack (more exactly: ip_ct_gather_frags) sets skb->dev to something valid (skb->dst->dev). Dirty hack. b) make ip_defrag(), ... aware of the case where skb->dev == NULL. Sounds like a good idea, since it is only one if(skb->dev) clause. c) netfilter stops using ip_defrag() for this case. Bad idea, it had to reinvent the wheel :( > David S. Miller > davem@redhat.com -- Live long and prosper - Harald Welte / laforge@gnumonks.org http://www.gnumonks.org ============================================================================ GCS/E/IT d- s-: a-- C+++ UL++++$ P+++ L++++$ E--- W- N++ o? K- w--- O- M- V-- PS+ PE-- Y+ PGP++ t++ 5-- !X !R tv-- b+++ DI? !D G+ e* h+ r% y+(*) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking) 2000-12-18 19:14 ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking) Harald Welte @ 2000-12-19 13:55 ` Tom Leete 2000-12-19 14:12 ` David S. Miller 0 siblings, 1 reply; 8+ messages in thread From: Tom Leete @ 2000-12-19 13:55 UTC (permalink / raw) To: Harald Welte Cc: David S. Miller, rusty, kuznet, netfilter-devel, linux-kernel Harald Welte wrote: > > On Mon, Dec 18, 2000 at 10:11:14AM -0800, David S. Miller wrote: > > From: Rusty Russell <rusty@linuxcare.com.au> > > Date: Mon, 18 Dec 2000 14:15:52 +1100 > > > > Alexey is right, locking is screwed (explains some reports of > > occasional failure during rmmod). > > > > Patch applied, thank you. > > > > I have a patch which compiles for non-linear skb mods to netfilter > > (NAT uses linear packets still, but connection tracking and packet > > filtering only linearize minimal requirements). Waiting for > > DaveM's solution to ip_ct_gather_frags()... > > > > I feel it's best to just skb_clone() the skb arg to ip_defrag > > and this will close the whole thing, I think. > > no. The clone()d skb will still have a skb->dev pointing to NULL. > > The problem occurs only for locally-generated outgoing packets, which > need to be fragmented: > > - ip_build_xmit() discoveres it has to fragment > - ip_build_xmit_slow() generates fragments and calls > - NF_HOOK() for NF_IP_LOCAL_OUT > - ip_conntrack_local() is called, which in turn calls > - ip_conntrack_in(), which calls > - ip_ct_gather_frags(), which calls > - ip_defrag(), which calls > > [now we have two possible oops - caues] > > a ip_find(), which calls > a ip_frag_create(), which initialises a timer with the function > a ip_expire(), which dereferences qp->iif > > b ip_frag_queue(), which dereferences it in qp->iif= skb->dev->ifindex > > as andi kleen pointed out: > > > > Also is it sure that the backtrace involves ip_rcv ? A more likely > > > guess is that it happens during the IP_LOCAL_OUT hook, when skb->dev > > > isn't set yet, but conntrack already has to already reassemble fragments. > > > > Actually, I do not understand how current code could even have worked > > in the past. Once the SKB is passed to ip_defrag, it is nobody's > > buisness to reference that SKB anymore. This ip_defrag call is (from > > mmh... we really don't do this. We use the return value of ip_defrag(), > which is what ip_frag_reasm() returns (== the new datagram consisting > out of all its fragments). > > > Alexey, what have I missed? I don't like the ip_fragment.c proposed > > fix for this reason, what netfilter is doing with ip_defrag here looks > > just wrong. > > Well, my conclusion is: > > - the defragmentation code in the ipv4 stack assumes that skb->dev points > to a valid device. It does this primaryly to send icmp reassembly errors > if a fragment reassembly timeout occurs > > - netfilter wants to use the ip_defrag for defragmentation, not only for > incoming packets from the network at NF_IP_PRE_ROUTING, but also for > locally-generated fragmented packets (at NF_IP_LOCAL_OUT). Unfortunately > we don't have a valid skb->dev at this point. > > I don't think that there is any skb_clone()ing needed, nor this would solve > the problem. The solution is > > a) netfilter conntrack (more exactly: ip_ct_gather_frags) sets skb->dev to > something valid (skb->dst->dev). Dirty hack. > > b) make ip_defrag(), ... aware of the case where skb->dev == NULL. Sounds > like a good idea, since it is only one if(skb->dev) clause. > > c) netfilter stops using ip_defrag() for this case. Bad idea, it had to > reinvent the wheel :( > > - Harald Welte / laforge@gnumonks.org http://www.gnumonks.org I'm now testing this small patch based on your suggestion b). I have netfilter running, nfs mounts are behaving well, and fragmented pings dont kill the box. I made no attempt to resolve anything but the derefernce of a netdevice *dev.= NULL The patch only deals with the ip_defrag_queue path. I have not seen the alternate one happen. It's only been up an hour, I'll put some more time on it. Cheers, Tom --- linux/net/ipv4/ip_fragment.c~ Tue Dec 12 06:56:29 2000 +++ linux/net/ipv4/ip_fragment.c Tue Dec 19 07:29:53 2000 @@ -485,7 +485,8 @@ else qp->fragments = skb; - qp->iif = skb->dev->ifindex; + if (skb->dev) + qp->iif = skb->dev->ifindex; skb->dev = NULL; qp->meat += skb->len; atomic_add(skb->truesize, &ip_frag_mem); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking) 2000-12-19 13:55 ` Tom Leete @ 2000-12-19 14:12 ` David S. Miller 2000-12-19 14:54 ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter kuznet 2000-12-20 5:59 ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking) Tom Leete 0 siblings, 2 replies; 8+ messages in thread From: David S. Miller @ 2000-12-19 14:12 UTC (permalink / raw) To: tleete; +Cc: laforge, rusty, kuznet, netfilter-devel, linux-kernel Date: Tue, 19 Dec 2000 08:55:35 -0500 From: Tom Leete <tleete@mountain.net> The patch only deals with the ip_defrag_queue path. I have not seen the alternate one happen. It's only been up an hour, I'll put some more time on it. --- linux/net/ipv4/ip_fragment.c~ Tue Dec 12 06:56:29 2000 +++ linux/net/ipv4/ip_fragment.c Tue Dec 19 07:29:53 2000 @@ -485,7 +485,8 @@ This is basically what is in my tree right now. However, there was one reporter who claimed that after this kind of change he still was able to lockup/OOPS his machine by logging into X as a user who had his home directory over NFS. This was with netfilter enabled as well so it has to be the same bug. Later, David S. Miller davem@redhat.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter 2000-12-19 14:12 ` David S. Miller @ 2000-12-19 14:54 ` kuznet 2000-12-20 1:45 ` Mohammad A. Haque 2000-12-20 5:59 ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking) Tom Leete 1 sibling, 1 reply; 8+ messages in thread From: kuznet @ 2000-12-19 14:54 UTC (permalink / raw) To: David S. Miller; +Cc: tleete, laforge, rusty, netfilter-devel, linux-kernel Hello! > able to lockup/OOPS his machine by logging into X as a user who had > his home directory over NFS. I believe this report is to be ignored. It is fully meaningless. X has nothing to do with NFS, NFS is with X, and defragmenter is at least with one of them. Alexey - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter 2000-12-19 14:54 ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter kuznet @ 2000-12-20 1:45 ` Mohammad A. Haque 2000-12-20 1:35 ` David S. Miller 0 siblings, 1 reply; 8+ messages in thread From: Mohammad A. Haque @ 2000-12-20 1:45 UTC (permalink / raw) To: kuznet Cc: David S. Miller, tleete, laforge, rusty, netfilter-devel, linux-kernel how is this meaningless? This just confirms what I and others have found in test12 wrt to the netfilter issue. kuznet@ms2.inr.ac.ru wrote: > > Hello! > > > able to lockup/OOPS his machine by logging into X as a user who had > > his home directory over NFS. > > I believe this report is to be ignored. It is fully meaningless. > X has nothing to do with NFS, NFS is with X, and defragmenter is > at least with one of them. -- ===================================================================== Mohammad A. Haque http://www.haque.net/ mhaque@haque.net "Alcohol and calculus don't mix. Project Lead Don't drink and derive." --Unknown http://wm.themes.org/ batmanppc@themes.org ===================================================================== - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter 2000-12-20 1:45 ` Mohammad A. Haque @ 2000-12-20 1:35 ` David S. Miller 2000-12-20 2:47 ` Mohammad A. Haque 0 siblings, 1 reply; 8+ messages in thread From: David S. Miller @ 2000-12-20 1:35 UTC (permalink / raw) To: mhaque; +Cc: kuznet, tleete, laforge, rusty, netfilter-devel, linux-kernel Date: Tue, 19 Dec 2000 20:45:14 -0500 From: "Mohammad A. Haque" <mhaque@haque.net> how is this meaningless? This just confirms what I and others have found in test12 wrt to the netfilter issue. Alexey is talking about test12/netfilter + our ip_fragment.c fix, not vanilla test12. Later, David S. Miller davem@redhat.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter 2000-12-20 1:35 ` David S. Miller @ 2000-12-20 2:47 ` Mohammad A. Haque 0 siblings, 0 replies; 8+ messages in thread From: Mohammad A. Haque @ 2000-12-20 2:47 UTC (permalink / raw) To: David S. Miller Cc: kuznet, tleete, laforge, rusty, netfilter-devel, linux-kernel Woops, my bad. These finally exams are getting to me. Sorry. "David S. Miller" wrote: > Alexey is talking about test12/netfilter + our ip_fragment.c fix, > not vanilla test12. -- ===================================================================== Mohammad A. Haque http://www.haque.net/ mhaque@haque.net "Alcohol and calculus don't mix. Project Lead Don't drink and derive." --Unknown http://wm.themes.org/ batmanppc@themes.org ===================================================================== - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking) 2000-12-19 14:12 ` David S. Miller 2000-12-19 14:54 ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter kuznet @ 2000-12-20 5:59 ` Tom Leete 1 sibling, 0 replies; 8+ messages in thread From: Tom Leete @ 2000-12-20 5:59 UTC (permalink / raw) To: David S. Miller; +Cc: laforge, rusty, kuznet, netfilter-devel, linux-kernel "David S. Miller" wrote: > This is basically what is in my tree right now. However, there was > one reporter who claimed that after this kind of change he still was > able to lockup/OOPS his machine by logging into X as a user who had > his home directory over NFS. This was with netfilter enabled as well > so it has to be the same bug. > > Later, > David S. Miller > davem@redhat.com 2.4.0-test12+ip_fragment.c.patch is still up. I hadn't previously tried client-side mounts on that box, just exports. I tried to reproduce the problem with nfs home directory mount + X. I am unable to generate the error -- it works for me. I think Alexey is right. There may be either coincidence or confusion in the report. Btw, I am able to compile a kernel over an nfs mount with this. Very handy since the remote is much faster at it. Cheers, Tom - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2000-12-20 6:30 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <E147qmG-0001yB-00@halfway>
[not found] ` <200012181811.KAA05490@pizda.ninka.net>
2000-12-18 19:14 ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking) Harald Welte
2000-12-19 13:55 ` Tom Leete
2000-12-19 14:12 ` David S. Miller
2000-12-19 14:54 ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter kuznet
2000-12-20 1:45 ` Mohammad A. Haque
2000-12-20 1:35 ` David S. Miller
2000-12-20 2:47 ` Mohammad A. Haque
2000-12-20 5:59 ` ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking) Tom Leete
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox