* [RFC] skb_free_datagram() doing something expensive ?
@ 2008-11-04 23:02 Eric Dumazet
2008-11-04 23:41 ` David Miller
2008-11-05 5:05 ` Eric Dumazet
0 siblings, 2 replies; 4+ messages in thread
From: Eric Dumazet @ 2008-11-04 23:02 UTC (permalink / raw)
To: Linux Netdev List, Corey Minyard
Hi all
I noticed high contention on udp_memory_allocated on a typical VOIP application.
(Now that oprofile correctly runs on my machine :) )
I can see that skb_free_datagram() is :
void skb_free_datagram(struct sock *sk, struct sk_buff *skb)
{
	kfree_skb(skb);
	sk_mem_reclaim(sk);
}
So each time a UDP packet is received, we must touch udp_memory_allocated.
Each time the application reads a packet, we call sk_mem_reclaim() and touch udp_memory_allocated again.
Surely this cannot be correct?
If this is correct, it is time to resurrect a patch to make proto->memory_allocated a percpu_counter,
or something that keeps a percpu reserve of, say, 64 or 128 pages to avoid cache line thrashing...
tcp_memory_allocated does not have this problem, since TCP carefully calls sk_mem_reclaim(sk) only on
selected paths, not on the fast path.
Thanks
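
[Editorial illustration, not part of the original message.] The percpu_counter idea above can be sketched as a batched counter: each CPU accumulates a private delta and only writes the shared value once that delta reaches a batch size. The following is a minimal single-threaded userspace model, not kernel code. All names in it (batched_counter, bc_add, bc_sum, MODEL_NR_CPUS, MODEL_BATCH) are hypothetical, and the batch of 64 is taken from the "64 or 128 pages" suggestion.

```c
#include <assert.h>

#define MODEL_NR_CPUS 4   /* hypothetical CPU count for the model */
#define MODEL_BATCH   64  /* pages accumulated locally before touching the shared value */

struct batched_counter {
	long shared;                 /* stands in for the contended global counter */
	long local[MODEL_NR_CPUS];   /* per-CPU deltas not yet folded into `shared` */
	unsigned long shared_writes; /* how often the shared cache line was written */
};

/* Add `pages` (may be negative) on behalf of `cpu`. */
static void bc_add(struct batched_counter *c, int cpu, long pages)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);

	long v = c->local[cpu] + pages;

	/* Fold the local delta into the shared value only at a batch boundary. */
	if (v >= MODEL_BATCH || v <= -MODEL_BATCH) {
		c->shared += v;
		c->shared_writes++;
		v = 0;
	}
	c->local[cpu] = v;
}

/* Exact total: shared value plus every outstanding per-CPU delta. */
static long bc_sum(const struct batched_counter *c)
{
	long sum = c->shared;

	for (int i = 0; i < MODEL_NR_CPUS; i++)
		sum += c->local[i];
	return sum;
}
```

In this model, charging one page at a time on a single CPU writes the shared value once per 64 charges instead of on every charge. The cost is that `shared` alone can lag the true total by up to MODEL_BATCH - 1 pages per CPU, which matters when the counter is compared against a global limit.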

* Re: [RFC] skb_free_datagram() doing something expensive ?
2008-11-04 23:02 [RFC] skb_free_datagram() doing something expensive ? Eric Dumazet
@ 2008-11-04 23:41 ` David Miller
2008-11-05 5:05 ` Eric Dumazet
1 sibling, 0 replies; 4+ messages in thread
From: David Miller @ 2008-11-04 23:41 UTC (permalink / raw)
To: dada1; +Cc: netdev, minyard
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Wed, 05 Nov 2008 00:02:22 +0100
> tcp_memory_allocated does not have this problem, since TCP carefully calls sk_mem_reclaim(sk) only on
> selected paths, not on the fast path.
I think something similar can be done for UDP.
Otherwise, yes, we'll need to do something else about it.
I guess your per-cpu idea, since we're trying to enforce global limits, is to cache the available quota on a per-cpu basis.
But I wonder if that can work properly. If a cpu gets overloaded and runs out of its local quota, it'll need to grab from another cpu.... hmmm...
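
[Editorial illustration, not part of the original message.] The concern about per-cpu quota caching can be made concrete with a small userspace model: a global limit is handed out to CPUs in chunks, and a CPU that finds the global pool empty fails even though other CPUs hold unused quota. This is only a sketch under assumed names (quota_pool, quota_charge, QUOTA_NR_CPUS, QUOTA_REFILL are all hypothetical), and it deliberately implements no stealing between CPUs.

```c
#include <assert.h>
#include <stdbool.h>

#define QUOTA_NR_CPUS 4   /* hypothetical CPU count for the model */
#define QUOTA_REFILL  64  /* pages moved from the global pool per refill */

struct quota_pool {
	long global_free;            /* pages still available under the global limit */
	long cached[QUOTA_NR_CPUS];  /* pages reserved for each CPU but not yet used */
};

/* Try to charge `pages` on `cpu`; returns false when the quota is exhausted. */
static bool quota_charge(struct quota_pool *q, int cpu, long pages)
{
	assert(cpu >= 0 && cpu < QUOTA_NR_CPUS && pages > 0);

	if (q->cached[cpu] < pages) {
		/* Local cache too small: pull a chunk from the global pool. */
		long want = pages - q->cached[cpu];
		if (want < QUOTA_REFILL)
			want = QUOTA_REFILL;

		long take = want < q->global_free ? want : q->global_free;
		q->global_free -= take;
		q->cached[cpu] += take;
	}

	/* No stealing from other CPUs in this model, so this may fail early. */
	if (q->cached[cpu] < pages)
		return false;

	q->cached[cpu] -= pages;
	return true;
}
```

With a global limit of 128 pages, two CPUs that each charge a single page drain the global pool (64 pages cached on each), and a third CPU is refused although only 2 of the 128 pages are actually in use. A real design would need some way to pull quota back from idle CPUs, which is the open question raised above.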

* Re: [RFC] skb_free_datagram() doing something expensive ?
2008-11-04 23:02 [RFC] skb_free_datagram() doing something expensive ? Eric Dumazet
2008-11-04 23:41 ` David Miller
@ 2008-11-05 5:05 ` Eric Dumazet
2008-11-05 9:38 ` David Miller
1 sibling, 1 reply; 4+ messages in thread
From: Eric Dumazet @ 2008-11-05 5:05 UTC (permalink / raw)
To: David S. Miller; +Cc: Linux Netdev List, Corey Minyard
[-- Attachment #1: Type: text/plain, Size: 2241 bytes --]
Eric Dumazet wrote:
> Hi all
>
> I noticed high contention on udp_memory_allocated on a typical VOIP
> application.
>
> (Now that oprofile correctly runs on my machine :) )
>
> I can see that skb_free_datagram() is :
>
> void skb_free_datagram(struct sock *sk, struct sk_buff *skb)
> {
> 	kfree_skb(skb);
> 	sk_mem_reclaim(sk);
> }
>
> So each time a UDP packet is received, we must touch udp_memory_allocated.
>
> Each time the application reads a packet, we call sk_mem_reclaim() and touch
> udp_memory_allocated again.
>
> Surely this cannot be correct?
>
> If this is correct, it is time to resurrect a patch to make
> proto->memory_allocated a percpu_counter,
> or something that keeps a percpu reserve of, say, 64 or 128 pages to avoid
> cache line thrashing...
>
> tcp_memory_allocated does not have this problem, since TCP carefully calls
> sk_mem_reclaim(sk) only on
> selected paths, not on the fast path.
>
> Thanks
>
>
What we can do is avoid reclaiming space if forward_alloc is less than a page.
We did that in the past, when introducing sk_mem_reclaim_partial() in commit 9993e7d313e80bdc005d09c7def91903e0068f07
([TCP]: Do not purge sk_forward_alloc entirely in tcp_delack_timer())
This patch gives a nice speedup on UDP, particularly for multiple RTP flows, where each flow carries medium traffic (say VOIP traffic).

[PATCH] net: sk_free_datagram() should use sk_mem_reclaim_partial()

I noticed contention on udp_memory_allocated on regular UDP applications.
While tcp_memory_allocated is seldom used, it appears each incoming UDP frame is currently touching udp_memory_allocated when queued, and again when received by the application.
One possible solution is to use sk_mem_reclaim_partial() instead of sk_mem_reclaim(), so that we keep a small reserve (less than one page) of memory for each UDP socket.
We did something very similar on the TCP side in commit 9993e7d313e80bdc005d09c7def91903e0068f07
([TCP]: Do not purge sk_forward_alloc entirely in tcp_delack_timer())
A more complex solution would need to convert prot->memory_allocated to use a percpu_counter with batches of 64 or 128 pages.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
[-- Attachment #2: udp_mem_reclaim.patch --]
[-- Type: text/plain, Size: 611 bytes --]
diff --git a/net/core/datagram.c b/net/core/datagram.c
index ee63184..5e2ac0c 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -209,7 +209,7 @@ struct sk_buff *skb_recv_datagram(struct sock *sk, unsigned flags,
 void skb_free_datagram(struct sock *sk, struct sk_buff *skb)
 {
 	kfree_skb(skb);
-	sk_mem_reclaim(sk);
+	sk_mem_reclaim_partial(sk);
 }
 
 /**
@@ -248,8 +248,7 @@ int skb_kill_datagram(struct sock *sk, struct sk_buff *skb, unsigned int flags)
 		spin_unlock_bh(&sk->sk_receive_queue.lock);
 	}
 
-	kfree_skb(skb);
-	sk_mem_reclaim(sk);
+	skb_free_datagram(sk, skb);
 
 	return err;
 }
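
[Editorial illustration, not part of the original message.] To see why keeping a small per-socket reserve removes the traffic on the shared counter, here is a simplified userspace model of the two reclaim policies. It is not the kernel implementation. The names (model_sock, model_proto, model_charge, model_uncharge, model_run), the 4096-byte page, and the exact comparisons (full reclaim at ">= one page", partial reclaim at "> one page") are assumptions of this sketch; the thread itself only says that the partial variant leaves a small reserve with the socket.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_BYTES 4096L  /* assumed page size for the model */

struct model_sock {
	long forward_alloc;       /* bytes already charged to this socket but unused */
};

struct model_proto {
	long memory_allocated;    /* pages charged globally (the shared counter) */
	unsigned long touches;    /* number of writes to the shared counter */
};

/* Queue a packet: charge whole pages globally only if the reserve is too small. */
static void model_charge(struct model_proto *p, struct model_sock *sk, long bytes)
{
	assert(bytes > 0);

	if (sk->forward_alloc < bytes) {
		long pages = (bytes - sk->forward_alloc + PAGE_BYTES - 1) / PAGE_BYTES;

		p->memory_allocated += pages;
		p->touches++;
		sk->forward_alloc += pages * PAGE_BYTES;
	}
	sk->forward_alloc -= bytes;
}

/* Free a packet, then apply the chosen reclaim policy (assumed thresholds). */
static void model_uncharge(struct model_proto *p, struct model_sock *sk,
			   long bytes, bool partial)
{
	sk->forward_alloc += bytes;

	bool reclaim = partial ? sk->forward_alloc > PAGE_BYTES
			       : sk->forward_alloc >= PAGE_BYTES;
	if (reclaim) {
		/* Give whole pages back; keep the sub-page remainder. */
		p->memory_allocated -= sk->forward_alloc / PAGE_BYTES;
		p->touches++;
		sk->forward_alloc %= PAGE_BYTES;
	}
}

/* Receive and read `packets` packets one at a time; return shared-counter writes. */
static unsigned long model_run(int packets, long bytes, bool partial)
{
	struct model_proto p = { 0, 0 };
	struct model_sock sk = { 0 };

	for (int i = 0; i < packets; i++) {
		model_charge(&p, &sk, bytes);
		model_uncharge(&p, &sk, bytes, partial);
	}
	return p.touches;
}
```

In this model, 1000 packets of 500 bytes write the shared counter 2000 times under the full-reclaim policy (one charge and one reclaim per packet) and only once under the partial policy, because the first page charged stays with the socket and covers every later packet.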

* Re: [RFC] skb_free_datagram() doing something expensive ?
2008-11-05 5:05 ` Eric Dumazet
@ 2008-11-05 9:38 ` David Miller
0 siblings, 0 replies; 4+ messages in thread
From: David Miller @ 2008-11-05 9:38 UTC (permalink / raw)
To: dada1; +Cc: netdev, minyard
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Wed, 05 Nov 2008 06:05:08 +0100
> What we can do is avoid reclaiming space if forward_alloc is less than a page.
>
> We did that in the past, when introducing sk_mem_reclaim_partial() in commit 9993e7d313e80bdc005d09c7def91903e0068f07
> ([TCP]: Do not purge sk_forward_alloc entirely in tcp_delack_timer())
>
> This patch gives a nice speedup on UDP, particularly for multiple
> RTP flows, where each flow carries medium traffic (say VOIP traffic).
I like it, patch applied, thanks Eric!