public inbox for netdev@vger.kernel.org
From: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Marc MERLIN <marc-xnduUnryOU1AfugRpC6u6w@public.gmane.org>
Cc: David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>,
	Larry.Finger-tQ5ms3gMjBLk1uMJSBkQmQ@public.gmane.org,
	bhutchings-s/n/eUQHGBpZroRs9YW3xA@public.gmane.org,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: [PATCH] tcp: avoid order-1 allocations on wifi and tx path
Date: Wed, 11 Apr 2012 08:30:48 +0200
Message-ID: <1334125848.5300.2330.camel@edumazet-glaptop>
In-Reply-To: <1334122980.5300.2154.camel@edumazet-glaptop>

Marc Merlin reported many order-1 allocation failures in the TX path on
his wireless setup, which don't make any sense with an MTU=1500 network
and non-SG-capable hardware.

After investigation, it turns out TCP uses sk_stream_alloc_skb() and,
as a convention, used skb_tailroom(skb) to know how many bytes of data
payload could be put in this skb (for non-SG-capable devices).

Note: these skbs used kmalloc-4096 (MTU=1500 + MAX_HEADER +
sizeof(struct skb_shared_info) being above 2048).
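
As a rough illustration of the arithmetic in the note above, the sketch
below rounds a request up to the next power-of-two cache size. The values
chosen for MAX_HEADER and sizeof(struct skb_shared_info) are assumptions
picked only for the example (both depend on kernel configuration and
architecture), and the ex_* names are hypothetical, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values only: the last two depend on config and architecture. */
enum {
	EX_MTU         = 1500,	/* from the commit message */
	EX_MAX_HEADER  = 256,	/* assumed stand-in for MAX_HEADER */
	EX_SHINFO_SIZE = 320	/* assumed stand-in for sizeof(struct skb_shared_info) */
};

/* Simplified model: round a request up to the next power-of-two cache size. */
size_t ex_kmalloc_bucket(size_t bytes)
{
	size_t bucket = 32;

	while (bucket < bytes)
		bucket *= 2;
	return bucket;
}

/* Total bytes requested for one full-sized segment in this model. */
size_t ex_request_size(void)
{
	return EX_MTU + EX_MAX_HEADER + EX_SHINFO_SIZE;
}

/* Check the claim in the note: the request is above 2048, so it lands in 4096. */
void ex_check_bucket(void)
{
	assert(ex_request_size() > 2048);
	assert(ex_kmalloc_bucket(ex_request_size()) == 4096);
}
```

Under these assumed values the request is slightly above 2048 bytes, so
almost half of the 4096-byte object is slack that the old reservation
scheme hid from later layers.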

Later, the mac80211 layer needs to add some bytes at the tail of the skb
(IEEE80211_ENCRYPT_TAILROOM = 18 bytes) and, since no more tailroom is
available, has to call pskb_expand_head() and request order-1
allocations.

This patch changes sk_stream_alloc_skb() so that only
sk->sk_prot->max_header bytes of headroom are reserved, and uses a new
skb field, avail_size, to hold the data payload limit.

This way, order-0 allocations done by the TCP stack can leave more than
2 KB of tailroom, and no further allocation is performed in the mac80211
layer (or any layer needing some tailroom).
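
The difference between the two reservation schemes can be sketched with a
toy model of the linear part of an skb. This is not kernel code: struct
toy_skb and the toy_* helpers are hypothetical, and the buffer size and
header size used in the usage note are assumed numbers.

```c
#include <assert.h>

/* Toy stand-in for the linear part of an sk_buff; not the kernel structure. */
struct toy_skb {
	unsigned int buf_size;		/* usable bytes in the data buffer */
	unsigned int headroom;		/* bytes reserved in front of the payload */
	unsigned int len;		/* payload bytes currently stored */
	unsigned int avail_size;	/* payload limit seen by TCP */
};

/* Free bytes behind the payload, as a lower layer would see them. */
unsigned int toy_tailroom(const struct toy_skb *skb)
{
	return skb->buf_size - skb->headroom - skb->len;
}

/* Payload bytes TCP may still add, mirroring skb_availroom() for linear skbs. */
unsigned int toy_availroom(const struct toy_skb *skb)
{
	return skb->avail_size - skb->len;
}

/* Old scheme: grow headroom until exactly 'size' bytes of tailroom remain. */
struct toy_skb toy_alloc_old(unsigned int buf_size, unsigned int size)
{
	struct toy_skb skb = { buf_size, buf_size - size, 0, size };

	return skb;
}

/* New scheme: reserve only max_header and record the limit separately. */
struct toy_skb toy_alloc_new(unsigned int buf_size, unsigned int max_header,
			     unsigned int size)
{
	struct toy_skb skb = { buf_size, max_header, 0, size };

	return skb;
}

/* Append n payload bytes; the caller must respect the payload limit. */
void toy_put(struct toy_skb *skb, unsigned int n)
{
	assert(n <= toy_availroom(skb));
	skb->len += n;
}
```

With an assumed 3776-byte buffer, a 1448-byte payload limit and 224 bytes
of max_header, filling the skb leaves toy_tailroom() at 0 under the old
scheme but at 2104 under the new one, which is enough for an 18-byte
trailer without reallocating. In both cases toy_availroom() is 0, so TCP
still stops at the same payload size.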

avail_size is unioned with mark/dropcount, since mark will be set later
in the IP stack for output packets. Therefore, the size of struct
sk_buff is unchanged.
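
A minimal sketch of why sharing the union costs no space, and why it is
only safe while mark has not been written yet. struct toy_fields is a
hypothetical stand-in that copies just the three unioned members; it is
not the real struct sk_buff layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the three members that share storage. */
struct toy_fields {
	union {
		uint32_t mark;
		uint32_t dropcount;
		uint32_t avail_size;
	};
};

/* Adding avail_size to the union does not grow the structure. */
static_assert(sizeof(struct toy_fields) == sizeof(uint32_t),
	      "union members must share one 32-bit slot");

/* Writing mark replaces whatever avail_size held: they are the same bytes. */
uint32_t toy_overwrite(void)
{
	struct toy_fields f;

	f.avail_size = 1448;	/* set when the skb is allocated */
	f.mark = 7;		/* set later on the output path */
	return f.avail_size;	/* reads back 7, the old limit is gone */
}
```

The sketch shows the trade-off: the payload limit is only meaningful
until a later layer stores a mark in the same slot.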

Reported-by: Marc MERLIN <marc-xnduUnryOU1AfugRpC6u6w@public.gmane.org>
Tested-by: Marc MERLIN <marc-xnduUnryOU1AfugRpC6u6w@public.gmane.org>
Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 include/linux/skbuff.h |   13 +++++++++++++
 net/ipv4/tcp.c         |    8 ++++----
 net/ipv4/tcp_output.c  |    2 +-
 3 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 3337027..70a3f8d 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -481,6 +481,7 @@ struct sk_buff {
 	union {
 		__u32		mark;
 		__u32		dropcount;
+		__u32		avail_size;
 	};
 
 	sk_buff_data_t		transport_header;
@@ -1366,6 +1367,18 @@ static inline int skb_tailroom(const struct sk_buff *skb)
 }
 
 /**
+ *	skb_availroom - bytes at buffer end
+ *	@skb: buffer to check
+ *
+ *	Return the number of bytes of free space at the tail of an sk_buff
+ *	allocated by sk_stream_alloc_skb()
+ */
+static inline int skb_availroom(const struct sk_buff *skb)
+{
+	return skb_is_nonlinear(skb) ? 0 : skb->avail_size - skb->len;
+}
+
+/**
  *	skb_reserve - adjust headroom
  *	@skb: buffer to alter
  *	@len: bytes to move
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 5d54ed3..87f497f 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -701,11 +701,12 @@ struct sk_buff *sk_stream_alloc_skb(struct sock *sk, int size, gfp_t gfp)
 	skb = alloc_skb_fclone(size + sk->sk_prot->max_header, gfp);
 	if (skb) {
 		if (sk_wmem_schedule(sk, skb->truesize)) {
+			skb_reserve(skb, sk->sk_prot->max_header);
 			/*
 			 * Make sure that we have exactly size bytes
 			 * available to the caller, no more, no less.
 			 */
-			skb_reserve(skb, skb_tailroom(skb) - size);
+			skb->avail_size = size;
 			return skb;
 		}
 		__kfree_skb(skb);
@@ -995,10 +996,9 @@ new_segment:
 				copy = seglen;
 
 			/* Where to copy to? */
-			if (skb_tailroom(skb) > 0) {
+			if (skb_availroom(skb) > 0) {
 				/* We have some space in skb head. Superb! */
-				if (copy > skb_tailroom(skb))
-					copy = skb_tailroom(skb);
+				copy = min_t(int, copy, skb_availroom(skb));
 				err = skb_add_data_nocache(sk, skb, from, copy);
 				if (err)
 					goto do_fault;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 364784a..376b2cf 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2060,7 +2060,7 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *to,
 		/* Punt if not enough space exists in the first SKB for
 		 * the data in the second
 		 */
-		if (skb->len > skb_tailroom(to))
+		if (skb->len > skb_availroom(to))
 			break;
 
 		if (after(TCP_SKB_CB(skb)->end_seq, tcp_wnd_end(tp)))