netdev.vger.kernel.org archive mirror
* [2.4/2.6 PATCHES] Change some ip_fragment checks to help IPVS
@ 2003-10-11 17:18 Julian Anastasov
  2003-10-11 18:07 ` David S. Miller
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Julian Anastasov @ 2003-10-11 17:18 UTC (permalink / raw)
  To: David S. Miller; +Cc: Wensong Zhang, netdev


	Hello,

	The included changes help IPVS forward fragmented packets
correctly. The first problem is that IPVS relies on skb->nfcache
containing valid information in all hooks, but ip_fragment does
not copy this field, in both 2.4 and 2.6. I just realized that
this is the cause of the problem one IPVS user reported for 2.4.
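
As a rough user-space illustration of the first problem (not kernel code): the struct, the flag value, and the function names below are hypothetical stand-ins for the netfilter fields of struct sk_buff. The sketch only shows that a fragment built without copying nfcache loses whatever an earlier hook recorded there.

```c
#include <stdbool.h>

/* Hypothetical flag an earlier hook might set in nfcache. */
#define MODEL_NFC_FLAG 0x1u

/* Hypothetical, simplified model of the netfilter fields in an skb. */
struct nf_model_skb {
	unsigned long nfmark;
	unsigned int nfcache;
};

/*
 * Copy per-packet netfilter metadata from the original packet to a
 * freshly built fragment. With copy_nfcache == false this models the
 * behaviour before the patches; with true, the behaviour after them.
 */
static void nf_model_copy_metadata(struct nf_model_skb *to,
				   const struct nf_model_skb *from,
				   bool copy_nfcache)
{
	to->nfmark = from->nfmark;
	if (copy_nfcache)
		to->nfcache = from->nfcache;
}

/* A later hook that relies on the flag still being present. */
static bool nf_model_hook_sees_flag(const struct nf_model_skb *skb)
{
	return (skb->nfcache & MODEL_NFC_FLAG) != 0;
}
```

In this model, a fragment copied with copy_nfcache set to false starts from a zeroed nfcache, so the later hook no longer recognises it.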

	The second problem is that 2.6 does not allow IPVS to
transmit fragmented skbs: they are reassembled just because one of
the frags has no owner. Now we add a check to avoid the
reassembly if the first skb is not owned, and this way we
have reason to be happy with all the recent non-linear handling
changes.
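
The effect of the second change can be sketched in user space as follows. This is a simplified model, not the kernel code: the struct layout and helper name are hypothetical, and only the socket-ownership test from the fast path is modelled (the real function has further checks).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket. */
struct own_sock {
	int id;
};

/* Hypothetical, simplified model of an skb with a fragment list. */
struct own_skb {
	struct own_sock *sk;		/* owning socket, NULL if unowned */
	struct own_skb *next;		/* next fragment in the list */
	struct own_skb *frag_list;	/* first fragment (head skb only) */
};

/*
 * Return true when the ownership test alone would send the packet to
 * the slow path. patched == false models the old test
 * (frag->sk == NULL); patched == true models the new test
 * (frag->sk == NULL && skb->sk).
 */
static bool own_forces_slow_path(const struct own_skb *skb, bool patched)
{
	const struct own_skb *frag;

	for (frag = skb->frag_list; frag != NULL; frag = frag->next) {
		if (frag->sk != NULL)
			continue;
		if (!patched || skb->sk != NULL)
			return true;
	}
	return false;
}
```

A forwarded packet, where neither the head nor any frag has an owner, takes the slow path under the old test but not under the new one. A locally generated packet with an owned head and an unowned frag is still rejected by both.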

	First is the nfcache patch for 2.4, followed by 2 csets
for 2.6:

--- linux/net/ipv4/ip_output.c.orig	Mon Aug 25 22:06:13 2003
+++ linux/net/ipv4/ip_output.c	Sat Oct 11 19:54:34 2003
@@ -879,6 +879,7 @@
 #endif
 #ifdef CONFIG_NETFILTER
 		skb2->nfmark = skb->nfmark;
+		skb2->nfcache = skb->nfcache;
 		/* Connection association is same as pre-frag packet */
 		skb2->nfct = skb->nfct;
 		nf_conntrack_get(skb2->nfct);



===

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.1502  -> 1.1503 
#	net/ipv4/ip_output.c	1.45    -> 1.46   
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 03/10/11	ja@ssi.bg	1.1503
# [IPV4]: ip_copy_metadata must copy the nfcache field
# --------------------------------------------
#
diff -Nru a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
--- a/net/ipv4/ip_output.c	Sat Oct 11 19:35:15 2003
+++ b/net/ipv4/ip_output.c	Sat Oct 11 19:35:15 2003
@@ -412,6 +412,7 @@
 #endif
 #ifdef CONFIG_NETFILTER
 	to->nfmark = from->nfmark;
+	to->nfcache = from->nfcache;
 	/* Connection association is same as pre-frag packet */
 	to->nfct = from->nfct;
 	nf_conntrack_get(to->nfct);



===


# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.1503  -> 1.1504 
#	net/ipv4/ip_output.c	1.46    -> 1.47   
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 03/10/11	ja@ssi.bg	1.1504
# [IPV4]: ip_fragment should not reassemble if all frags do not have owner
# --------------------------------------------
#
diff -Nru a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
--- a/net/ipv4/ip_output.c	Sat Oct 11 19:49:38 2003
+++ b/net/ipv4/ip_output.c	Sat Oct 11 19:49:38 2003
@@ -493,7 +493,7 @@
 			    goto slow_path;
 
 			/* Correct socket ownership. */
-			if (frag->sk == NULL)
+			if (frag->sk == NULL && skb->sk)
 				goto slow_path;
 
 			/* Partially cloned skb? */

===

Regards

--
Julian Anastasov <ja@ssi.bg>


* Re: [2.4/2.6 PATCHES] Change some ip_fragment checks to help IPVS
  2003-10-11 17:18 [2.4/2.6 PATCHES] Change some ip_fragment checks to help IPVS Julian Anastasov
@ 2003-10-11 18:07 ` David S. Miller
  2003-10-11 18:08 ` David S. Miller
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: David S. Miller @ 2003-10-11 18:07 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: wensong, netdev


I don't read patches emailed privately to me. If you don't bother to
at least CC: netdev on your patches (so that other people can review
the patch, I'm actually quite busy right now) then I'm not going to
bother reading your email.


* Re: [2.4/2.6 PATCHES] Change some ip_fragment checks to help IPVS
  2003-10-11 17:18 [2.4/2.6 PATCHES] Change some ip_fragment checks to help IPVS Julian Anastasov
  2003-10-11 18:07 ` David S. Miller
@ 2003-10-11 18:08 ` David S. Miller
  2003-10-11 18:53 ` David S. Miller
  2003-10-11 19:02 ` David S. Miller
  3 siblings, 0 replies; 9+ messages in thread
From: David S. Miller @ 2003-10-11 18:08 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: wensong, netdev


My bad, I just saw netdev in the CC, sorry about that.

I really do hate when people email me crap privately.  Because
even if I wanted to forward it to netdev I can't do that without
asking their permission which takes even more time, so that's why
I just chew people out when they make this mistake.

Sorry again.


* Re: [2.4/2.6 PATCHES] Change some ip_fragment checks to help IPVS
  2003-10-11 17:18 [2.4/2.6 PATCHES] Change some ip_fragment checks to help IPVS Julian Anastasov
  2003-10-11 18:07 ` David S. Miller
  2003-10-11 18:08 ` David S. Miller
@ 2003-10-11 18:53 ` David S. Miller
  2003-10-11 19:02 ` David S. Miller
  3 siblings, 0 replies; 9+ messages in thread
From: David S. Miller @ 2003-10-11 18:53 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: wensong, netdev

On Sat, 11 Oct 2003 20:18:33 +0300 (EEST)
Julian Anastasov <ja@ssi.bg> wrote:

> # [IPV4]: ip_copy_metadata must copy the nfcache field

I can't believe we missed this, I'll definitely apply
these patches.

Thanks.

> # [IPV4]: ip_fragment should not reassemble if all frags do not have owner

This one I have to think about some more.


* Re: [2.4/2.6 PATCHES] Change some ip_fragment checks to help IPVS
  2003-10-11 17:18 [2.4/2.6 PATCHES] Change some ip_fragment checks to help IPVS Julian Anastasov
                   ` (2 preceding siblings ...)
  2003-10-11 18:53 ` David S. Miller
@ 2003-10-11 19:02 ` David S. Miller
  2003-10-14 12:03   ` kuznet
  3 siblings, 1 reply; 9+ messages in thread
From: David S. Miller @ 2003-10-11 19:02 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: wensong, netdev, kuznet

On Sat, 11 Oct 2003 20:18:33 +0300 (EEST)
Julian Anastasov <ja@ssi.bg> wrote:

>  			/* Correct socket ownership. */
> -			if (frag->sk == NULL)
> +			if (frag->sk == NULL && skb->sk)
>  				goto slow_path;

Alexey I think this piece of Julian's patch is OK and this is
the test you meant to make in the first place.

Right?


* Re: [2.4/2.6 PATCHES] Change some ip_fragment checks to help IPVS
  2003-10-11 19:02 ` David S. Miller
@ 2003-10-14 12:03   ` kuznet
  2003-10-14 22:55     ` Julian Anastasov
  0 siblings, 1 reply; 9+ messages in thread
From: kuznet @ 2003-10-14 12:03 UTC (permalink / raw)
  To: David S. Miller; +Cc: ja, wensong, netdev, kuznet

Hello!

> On Sat, 11 Oct 2003 20:18:33 +0300 (EEST)
> Julian Anastasov <ja@ssi.bg> wrote:
> 
> >  			/* Correct socket ownership. */
> > -			if (frag->sk == NULL)
> > +			if (frag->sk == NULL && skb->sk)
> >  				goto slow_path;
> 
> Alexey I think this piece of Julian's patch is OK and this is
> the test you meant to make in the first place.

Yes. The test was to eliminate the pathological cases, but the test
is really valid only when the skb is generated by ip_append_*. If the skb
as a whole is not owned, that is also a perfectly valid case.

> Right?

Perfectly correct.

Alexey


* Re: [2.4/2.6 PATCHES] Change some ip_fragment checks to help IPVS
  2003-10-14 22:55     ` Julian Anastasov
@ 2003-10-14 22:51       ` David S. Miller
  2003-10-15  7:01         ` Julian Anastasov
  0 siblings, 1 reply; 9+ messages in thread
From: David S. Miller @ 2003-10-14 22:51 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: kuznet, wensong, netdev

On Wed, 15 Oct 2003 01:55:43 +0300 (EEST)
Julian Anastasov <ja@ssi.bg> wrote:

> 	One related question: is ip_fragment() backport from 2.6
> to 2.4 planned?

No, it only makes sense in the 2.6.x stack.

2.4.x and 2.6.x networking are wildly different beasts and
you'll therefore need to make the IPVS implementation cope
with that in each tree.


* Re: [2.4/2.6 PATCHES] Change some ip_fragment checks to help IPVS
  2003-10-14 12:03   ` kuznet
@ 2003-10-14 22:55     ` Julian Anastasov
  2003-10-14 22:51       ` David S. Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Julian Anastasov @ 2003-10-14 22:55 UTC (permalink / raw)
  To: kuznet; +Cc: David S. Miller, wensong, netdev


	Hello,

On Tue, 14 Oct 2003 kuznet@ms2.inr.ac.ru wrote:

> Yes. The test was to eliminate the pathological cases, but the test
> is really valid only when the skb is generated by ip_append_*. If the skb
> as a whole is not owned, that is also a perfectly valid case.

	One related question: is ip_fragment() backport from 2.6
to 2.4 planned? I ask because many NF hooks were changed to avoid
linearization, but ip_fragment is still outdated and the forced
reassembly after ip_defrag is not always a good thing.

> Alexey

Regards

--
Julian Anastasov <ja@ssi.bg>


* Re: [2.4/2.6 PATCHES] Change some ip_fragment checks to help IPVS
  2003-10-14 22:51       ` David S. Miller
@ 2003-10-15  7:01         ` Julian Anastasov
  0 siblings, 0 replies; 9+ messages in thread
From: Julian Anastasov @ 2003-10-15  7:01 UTC (permalink / raw)
  To: David S. Miller; +Cc: kuznet, wensong, netdev


	Hello,

On Tue, 14 Oct 2003, David S. Miller wrote:

> > 	One related question: is ip_fragment() backport from 2.6
> > to 2.4 planned?
>
> No, it only makes sense in the 2.6.x stack.
>
> 2.4.x and 2.6.x networking are wildly different beasts and
> you'll therefore need to make the IPVS implementation cope
> with that in each tree.

	OK. But my subject is wrong: in fact, this covers
the general case of packets forwarded after calling ip_defrag.
It means, for example, that if ip_conntrack is running in 2.4, then all
packets with total_length > mtu are refragmented or linearized. This
includes 60k packets made of frags where each frag already fits
in the mtu. But I see, it is a dangerous place to change.
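
To give a sense of scale for the 60k case, here is a small worked calculation. It is only a sketch under stated assumptions: a 20-byte IPv4 header without options and an MTU of 1500 are example values not taken from this thread, and the helper name is hypothetical. Fragment payloads other than the last must be a multiple of 8 bytes, which is why the per-fragment size is rounded down.

```c
/*
 * Number of IPv4 fragments needed for a datagram of total_len bytes
 * (header included), given the link MTU and the IP header length.
 * Assumes total_len > hdr_len and mtu >= hdr_len + 8.
 */
static unsigned int frag_model_count(unsigned int total_len,
				     unsigned int mtu,
				     unsigned int hdr_len)
{
	unsigned int data = total_len - hdr_len;
	/* Payload per fragment, rounded down to a multiple of 8. */
	unsigned int per_frag = (mtu - hdr_len) & ~7u;

	if (total_len <= mtu)
		return 1;	/* fits as is, no fragmentation needed */
	return (data + per_frag - 1) / per_frag;
}
```

With these assumed values, frag_model_count(60000, 1500, 20) gives 41: each fragment carries at most 1480 bytes of the 59980 bytes of data. Rebuilding that many fragments for a packet whose incoming frags already fit in the mtu is the cost described above.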

Regards

--
Julian Anastasov <ja@ssi.bg>

