* [PATCH net-next 0/2] inet: factorize sk_wmem_alloc updates
@ 2018-03-31 20:16 Eric Dumazet
2018-03-31 20:16 ` [PATCH net-next 1/2] ipv4: factorize sk_wmem_alloc updates done by __ip_append_data() Eric Dumazet
From: Eric Dumazet @ 2018-03-31 20:16 UTC (permalink / raw)
To: David S . Miller; +Cc: netdev, Eric Dumazet, Eric Dumazet
While testing my inet defrag changes, I found that senders
could spend ~20% of cpu cycles in skb_set_owner_w() updating
sk->sk_wmem_alloc for every fragment they cook, competing
with TX completion of prior skbs possibly happening on other cpus.
One solution to this problem is to use alloc_skb() instead
of sock_wmalloc() and manually perform a single sk_wmem_alloc change.
This greatly increases speed for applications sending big UDP datagrams.
Eric Dumazet (2):
ipv4: factorize sk_wmem_alloc updates done by __ip_append_data()
ipv6: factorize sk_wmem_alloc updates done by __ip6_append_data()
net/ipv4/ip_output.c | 17 ++++++++++++-----
net/ipv6/ip6_output.c | 17 ++++++++++++-----
2 files changed, 24 insertions(+), 10 deletions(-)
--
2.17.0.rc1.321.gba9d0f2565-goog
* [PATCH net-next 1/2] ipv4: factorize sk_wmem_alloc updates done by __ip_append_data()
From: Eric Dumazet @ 2018-03-31 20:16 UTC (permalink / raw)
To: David S . Miller; +Cc: netdev, Eric Dumazet, Eric Dumazet
While testing my inet defrag changes, I found that the senders
could spend ~20% of cpu cycles in skb_set_owner_w() updating
sk->sk_wmem_alloc for every fragment they cook.
The solution to this problem is to use alloc_skb() instead
of sock_wmalloc() and manually perform a single sk_wmem_alloc change.
A similar change for IPv6 is provided in the following patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/ipv4/ip_output.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 66340ab750e69ff5775f7996192839a24ddc6e65..94cacae76aca41e6e7feb7575c7999a414145c49 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -876,6 +876,7 @@ static int __ip_append_data(struct sock *sk,
unsigned int maxfraglen, fragheaderlen, maxnonfragsize;
int csummode = CHECKSUM_NONE;
struct rtable *rt = (struct rtable *)cork->dst;
+ unsigned int wmem_alloc_delta = 0;
u32 tskey = 0;
skb = skb_peek_tail(queue);
@@ -971,11 +972,10 @@ static int __ip_append_data(struct sock *sk,
(flags & MSG_DONTWAIT), &err);
} else {
skb = NULL;
- if (refcount_read(&sk->sk_wmem_alloc) <=
+ if (refcount_read(&sk->sk_wmem_alloc) + wmem_alloc_delta <=
2 * sk->sk_sndbuf)
- skb = sock_wmalloc(sk,
- alloclen + hh_len + 15, 1,
- sk->sk_allocation);
+ skb = alloc_skb(alloclen + hh_len + 15,
+ sk->sk_allocation);
if (unlikely(!skb))
err = -ENOBUFS;
}
@@ -1033,6 +1033,11 @@ static int __ip_append_data(struct sock *sk,
/*
* Put the packet on the pending queue.
*/
+ if (!skb->destructor) {
+ skb->destructor = sock_wfree;
+ skb->sk = sk;
+ wmem_alloc_delta += skb->truesize;
+ }
__skb_queue_tail(queue, skb);
continue;
}
@@ -1079,12 +1084,13 @@ static int __ip_append_data(struct sock *sk,
skb->len += copy;
skb->data_len += copy;
skb->truesize += copy;
- refcount_add(copy, &sk->sk_wmem_alloc);
+ wmem_alloc_delta += copy;
}
offset += copy;
length -= copy;
}
+ refcount_add(wmem_alloc_delta, &sk->sk_wmem_alloc);
return 0;
error_efault:
@@ -1092,6 +1098,7 @@ static int __ip_append_data(struct sock *sk,
error:
cork->length -= length;
IP_INC_STATS(sock_net(sk), IPSTATS_MIB_OUTDISCARDS);
+ refcount_add(wmem_alloc_delta, &sk->sk_wmem_alloc);
return err;
}
--
2.17.0.rc1.321.gba9d0f2565-goog
* [PATCH net-next 2/2] ipv6: factorize sk_wmem_alloc updates done by __ip6_append_data()
From: Eric Dumazet @ 2018-03-31 20:16 UTC (permalink / raw)
To: David S . Miller; +Cc: netdev, Eric Dumazet, Eric Dumazet
While testing my inet defrag changes, I found that the senders
could spend ~20% of cpu cycles in skb_set_owner_w() updating
sk->sk_wmem_alloc for every fragment they cook, competing
with TX completion of prior skbs possibly happening on other cpus.
The solution to this problem is to use alloc_skb() instead
of sock_wmalloc() and manually perform a single sk_wmem_alloc change.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/ipv6/ip6_output.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 2c7f09c3c39ed8a1e85a967e105ff3cc30dce5b9..323d7a354ffb6f75e2a948dea63a8018ed0e057f 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1259,6 +1259,7 @@ static int __ip6_append_data(struct sock *sk,
struct ipv6_txoptions *opt = v6_cork->opt;
int csummode = CHECKSUM_NONE;
unsigned int maxnonfragsize, headersize;
+ unsigned int wmem_alloc_delta = 0;
skb = skb_peek_tail(queue);
if (!skb) {
@@ -1411,11 +1412,10 @@ static int __ip6_append_data(struct sock *sk,
(flags & MSG_DONTWAIT), &err);
} else {
skb = NULL;
- if (refcount_read(&sk->sk_wmem_alloc) <=
+ if (refcount_read(&sk->sk_wmem_alloc) + wmem_alloc_delta <=
2 * sk->sk_sndbuf)
- skb = sock_wmalloc(sk,
- alloclen + hh_len, 1,
- sk->sk_allocation);
+ skb = alloc_skb(alloclen + hh_len,
+ sk->sk_allocation);
if (unlikely(!skb))
err = -ENOBUFS;
}
@@ -1474,6 +1474,11 @@ static int __ip6_append_data(struct sock *sk,
/*
* Put the packet on the pending queue
*/
+ if (!skb->destructor) {
+ skb->destructor = sock_wfree;
+ skb->sk = sk;
+ wmem_alloc_delta += skb->truesize;
+ }
__skb_queue_tail(queue, skb);
continue;
}
@@ -1520,12 +1525,13 @@ static int __ip6_append_data(struct sock *sk,
skb->len += copy;
skb->data_len += copy;
skb->truesize += copy;
- refcount_add(copy, &sk->sk_wmem_alloc);
+ wmem_alloc_delta += copy;
}
offset += copy;
length -= copy;
}
+ refcount_add(wmem_alloc_delta, &sk->sk_wmem_alloc);
return 0;
error_efault:
@@ -1533,6 +1539,7 @@ static int __ip6_append_data(struct sock *sk,
error:
cork->length -= length;
IP6_INC_STATS(sock_net(sk), rt->rt6i_idev, IPSTATS_MIB_OUTDISCARDS);
+ refcount_add(wmem_alloc_delta, &sk->sk_wmem_alloc);
return err;
}
--
2.17.0.rc1.321.gba9d0f2565-goog
* Re: [PATCH net-next 0/2] inet: factorize sk_wmem_alloc updates
From: David Miller @ 2018-04-01 18:09 UTC (permalink / raw)
To: edumazet; +Cc: netdev, eric.dumazet
From: Eric Dumazet <edumazet@google.com>
Date: Sat, 31 Mar 2018 13:16:24 -0700
> While testing my inet defrag changes, I found that senders
> could spend ~20% of cpu cycles in skb_set_owner_w() updating
> sk->sk_wmem_alloc for every fragment they cook, competing
> with TX completion of prior skbs possibly happening on other cpus.
>
> One solution to this problem is to use alloc_skb() instead
> of sock_wmalloc() and manually perform a single sk_wmem_alloc change.
>
> This greatly increases speed for applications sending big UDP datagrams.
Looks good, series applied, thanks.