* Fragment flooding in 2.4.x/2.5.x
@ 2002-06-27 15:57 Trond Myklebust
From: Trond Myklebust @ 2002-06-27 15:57 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1026 bytes --]

Hi David,

  I have a question about the case of non-blocking sends in
ip_build_xmit_slow(). While investigating a problem in which the RH7.3 kernel
causes the Netapp filer IP stack to blow up, we've observed that use of the
MSG_DONTWAIT flag causes some pretty nasty behaviour.

Because each fragment is queued for sending as soon as it is built, if
sock_alloc_send_skb() fails partway through building the message, you have
already sent off a bunch of fragments for which there is not even a header.
Given heavy NFS traffic, this can be a large source of wasted bandwidth.
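
To make the failure mode concrete, here is a minimal userspace sketch of the
send-as-you-build behaviour. It is a toy model, not kernel code:
send_as_built(), struct send_result and the fail_at parameter are hypothetical
names for illustration, and the model assumes the fragment carrying the header
(offset 0) is built last.

```c
#include <assert.h>

/* Hypothetical toy model of the current behaviour; not kernel code. */
struct send_result {
	int sent;	/* fragments already handed to the wire */
	int err;	/* 0 on success, -1 if an allocation failed */
};

/*
 * Build nfrags fragments and transmit each one as soon as it is built.
 * Assumption: fragments are built from the tail of the datagram towards
 * the head, so the header-bearing fragment is the last one built.
 * fail_at is the build index whose allocation fails (-1: no failure).
 */
static struct send_result send_as_built(int nfrags, int fail_at)
{
	struct send_result r = { 0, 0 };

	assert(nfrags > 0);
	for (int i = 0; i < nfrags; i++) {
		if (i == fail_at) {
			/* Allocation failed: everything sent so far is wasted. */
			r.err = -1;
			return r;
		}
		r.sent++;	/* fragment goes out immediately */
	}
	return r;
}
```

With six fragments and a failure on the fifth allocation, send_as_built(6, 4)
reports an error with four fragments already on the wire, none of which the
receiver can ever reassemble.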

The appended patch was originally designed purely to test inverting the
sending order of fragments (on the hypothesis that the receiving devices were
making buffer management assumptions based on ordering). It also removes this
effect, because it delays sending off the fragments until the entire message
has been built.
Would such a patch be acceptable, or is there a better way of doing this?
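
The idea of the patch can be sketched in userspace roughly as follows. This is
a simplified model under stated assumptions, not the patch itself: struct frag,
queue_head(), dequeue() and build_then_send() are hypothetical stand-ins for
the sk_buff queue helpers, fail_at simulates a failed allocation, and
fragments are assumed to be built in order of decreasing offset.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an sk_buff on a private queue. */
struct frag {
	int offset;
	struct frag *next;
};

/* Insert at the head of the list (LIFO). */
static void queue_head(struct frag **q, struct frag *f)
{
	f->next = *q;
	*q = f;
}

/* Remove from the head of the list; returns NULL when empty. */
static struct frag *dequeue(struct frag **q)
{
	struct frag *f = *q;

	if (f)
		*q = f->next;
	return f;
}

/*
 * Build all fragments first, then send them. Returns the number of
 * fragments sent and records their offsets, in send order, in out[]
 * (capacity >= nfrags). On allocation failure nothing is sent and 0 is
 * returned. fail_at is the build index that fails (-1: no failure).
 */
static int build_then_send(int nfrags, int fraglen, int fail_at, int *out)
{
	struct frag *q = NULL, *f;
	int sent = 0;

	assert(nfrags > 0 && fraglen > 0);
	for (int i = 0; i < nfrags; i++) {
		f = (i == fail_at) ? NULL : malloc(sizeof(*f));
		if (f == NULL) {
			/* Drop everything built so far; nothing hit the wire. */
			while ((f = dequeue(&q)) != NULL)
				free(f);
			return 0;
		}
		/* Assumed build order: highest offset first. */
		f->offset = (nfrags - 1 - i) * fraglen;
		queue_head(&q, f);
	}
	/* Head insertion reversed the build order: lowest offset goes first. */
	while ((f = dequeue(&q)) != NULL) {
		out[sent++] = f->offset;
		free(f);
	}
	return sent;
}
```

In this model a failure at any point sends zero fragments, and a successful
build sends them in order of increasing offset, which is the ordering the
patch comment describes.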

Cheers,
  Trond

[-- Attachment #2: ip_build_xmit_slow.dif --]
[-- Type: text/plain, Size: 1348 bytes --]

--- linux-2.4.19-smp/net/ipv4/ip_output.c.orig	Mon May 13 23:34:37 2002
+++ linux-2.4.19-smp/net/ipv4/ip_output.c	Mon Jun 17 23:13:28 2002
@@ -437,6 +437,8 @@
 		  struct rtable *rt,
 		  int flags)
 {
+	struct sk_buff_head frags;
+	struct sk_buff * skb;
 	unsigned int fraglen, maxfraglen, fragheaderlen;
 	int err;
 	int offset, mf;
@@ -512,10 +514,10 @@
 	 */
 
 	id = sk->protinfo.af_inet.id++;
+	skb_queue_head_init(&frags);
 
 	do {
 		char *data;
-		struct sk_buff * skb;
 
 		/*
 		 *	Get the memory we require with some space left for alignment.
@@ -599,7 +601,11 @@
 		fraglen = maxfraglen;
 
 		nfrags++;
+		__skb_queue_head(&frags, skb);
+	} while (offset >= 0);
 
+	/* Ensure we send fragments in order of increasing offset */
+	while ((skb = __skb_dequeue(&frags)) != NULL) {
 		err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, 
 			      skb->dst->dev, output_maybe_reroute);
 		if (err) {
@@ -608,7 +614,7 @@
 			if (err)
 				goto error;
 		}
-	} while (offset >= 0);
+	}
 
 	if (nfrags>1)
 		ip_statistics[smp_processor_id()*2 + !in_softirq()].IpFragCreates += nfrags;
@@ -617,6 +623,10 @@
 
 error:
 	IP_INC_STATS(IpOutDiscards);
+	while ((skb = __skb_dequeue(&frags)) != NULL) {
+		kfree_skb(skb);
+		nfrags--;
+	}
 	if (nfrags>1)
 		ip_statistics[smp_processor_id()*2 + !in_softirq()].IpFragCreates += nfrags;
 	return err; 
