From: Eric Dumazet <eric.dumazet@gmail.com>
To: Lucian Adrian Grijincu <lgrijincu@ixiacom.com>
Cc: David Miller <davem@davemloft.net>,
netdev@vger.kernel.org, opurdila@ixiacom.com
Subject: Re: [PATCH net-next-2.6] udp: Optimise multicast reception
Date: Fri, 06 Nov 2009 17:54:15 +0100 [thread overview]
Message-ID: <4AF454B7.6050103@gmail.com> (raw)
In-Reply-To: <200911061835.24928.lgrijincu@ixiacom.com>
Lucian Adrian Grijincu a écrit :
> În data de Vin 06 Noi 2009 17:59:51 ați scris:
>> +static void flush_stack(struct sock **stack, unsigned int count,
>> + struct sk_buff *skb, unsigned int final)
>> +{
>> + unsigned int i;
>> + struct sk_buff *skb1 = NULL;
>> +
>> + for (i = 0; i < count; i++) {
>> + if (likely(skb1 == NULL))
>> + skb1 = (i == final) ? skb : skb_clone(skb, GFP_ATOMIC);
>> +
>> + if (skb1 && udp_queue_rcv_skb(stack[i], skb1) <= 0)
>> + skb1 = NULL;
>> + }
>> + if (skb1)
>> + consume_skb(skb1);
>> +}
>
> consume_skb() assumes the skb was successfuly transmitted.
>
> free_skb() does the same thing, but assumes that the frame is being dropped
> after a failure and notes that.
>
> In your code, if (count == 0) you:
> * fail to remove the original skb (memory leak),
> * simply consume the last dropped skb, without noting the droping failure.
>
> I fixed these in the attached (untested) patch.
>
> One last issue: you silently ignore dropped failures (skb1 is reused in case
> of a failure).
>
> If this tracing must record all failures, I'd add an
> trace_kfree_skb(skb1, __builtin_return_address(0));
> if udp_queue_rcv_skb() fails.
>
> Other than this, nicely done!
>
Thanks !
Note, free_skb() doesnt exist ;)
And we should not call consume_skb() in this path.
I made the if (count == 0) done at the end of __udp4_lib_mcast_deliver()
[PATCH net-next-2.6] udp: Optimise multicast reception
UDP multicast rx path is a bit complex and can hold a spinlock
for a long time.
Using a small (32 or 64 entries) stack of socket pointers can help
to perform expensive operations (skb_clone(), udp_queue_rcv_skb())
outside of the lock, in most cases.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Lucian Adrian Grijincu <lgrijincu@ixiacom.com>
---
net/ipv4/udp.c | 76 ++++++++++++++++++++++++++++++-----------------
1 files changed, 50 insertions(+), 26 deletions(-)
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index d5e75e9..45c73b1 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1190,49 +1190,73 @@ drop:
return -1;
}
+
+static void flush_stack(struct sock **stack, unsigned int count,
+ struct sk_buff *skb, unsigned int final)
+{
+ unsigned int i;
+ struct sk_buff *skb1 = NULL;
+
+ for (i = 0; i < count; i++) {
+ if (likely(skb1 == NULL))
+ skb1 = (i == final) ? skb : skb_clone(skb, GFP_ATOMIC);
+
+ if (skb1 && udp_queue_rcv_skb(stack[i], skb1) <= 0)
+ skb1 = NULL;
+ }
+ if (unlikely(skb1))
+ kfree_skb(skb1);
+}
+
/*
* Multicasts and broadcasts go to each listener.
*
- * Note: called only from the BH handler context,
- * so we don't need to lock the hashes.
+ * Note: called only from the BH handler context.
*/
static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
struct udphdr *uh,
__be32 saddr, __be32 daddr,
struct udp_table *udptable)
{
- struct sock *sk;
+ struct sock *sk, *stack[256 / sizeof(struct sock *)];
struct udp_hslot *hslot = udp_hashslot(udptable, net, ntohs(uh->dest));
int dif;
+ unsigned int i, count = 0;
spin_lock(&hslot->lock);
sk = sk_nulls_head(&hslot->head);
dif = skb->dev->ifindex;
sk = udp_v4_mcast_next(net, sk, uh->dest, daddr, uh->source, saddr, dif);
- if (sk) {
- struct sock *sknext = NULL;
-
- do {
- struct sk_buff *skb1 = skb;
-
- sknext = udp_v4_mcast_next(net, sk_nulls_next(sk), uh->dest,
- daddr, uh->source, saddr,
- dif);
- if (sknext)
- skb1 = skb_clone(skb, GFP_ATOMIC);
-
- if (skb1) {
- int ret = udp_queue_rcv_skb(sk, skb1);
- if (ret > 0)
- /* we should probably re-process instead
- * of dropping packets here. */
- kfree_skb(skb1);
- }
- sk = sknext;
- } while (sknext);
- } else
- consume_skb(skb);
+ while (sk) {
+ stack[count++] = sk;
+ sk = udp_v4_mcast_next(net, sk_nulls_next(sk), uh->dest,
+ daddr, uh->source, saddr, dif);
+ if (unlikely(count == ARRAY_SIZE(stack))) {
+ if (!sk)
+ break;
+ flush_stack(stack, count, skb, ~0);
+ count = 0;
+ }
+ }
+ /*
+ * before releasing chain lock, we must take a reference on sockets
+ */
+ for (i = 0; i < count; i++)
+ sock_hold(stack[i]);
+
spin_unlock(&hslot->lock);
+
+ /*
+ * do the slow work with no lock held
+ */
+ if (count) {
+ flush_stack(stack, count, skb, count - 1);
+
+ for (i = 0; i < count; i++)
+ sock_put(stack[i]);
+ } else {
+ kfree_skb(skb);
+ }
return 0;
}
prev parent reply other threads:[~2009-11-06 16:54 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-11-05 18:33 [PATCH 1/2] udp: cleanup __udp4_lib_mcast_deliver Lucian Adrian Grijincu
2009-11-06 8:42 ` David Miller
2009-11-06 14:04 ` Lucian Adrian Grijincu
2009-11-06 15:10 ` Eric Dumazet
2009-11-06 15:59 ` [PATCH net-next-2.6] udp: Optimise multicast reception Eric Dumazet
2009-11-06 16:30 ` [PATCH net-next-2.6] ipv6: " Eric Dumazet
2009-11-06 16:35 ` Eric Dumazet
2009-11-06 17:06 ` [PATCH net-next-2.6 take2] " Eric Dumazet
2009-11-06 17:19 ` Lucian Adrian Grijincu
2009-11-06 17:24 ` Eric Dumazet
2009-11-06 17:54 ` Eric Dumazet
2009-11-06 17:59 ` Lucian Adrian Grijincu
2009-11-06 18:03 ` Eric Dumazet
2009-11-06 16:35 ` [PATCH net-next-2.6] " Lucian Adrian Grijincu
2009-11-06 16:54 ` Eric Dumazet [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4AF454B7.6050103@gmail.com \
--to=eric.dumazet@gmail.com \
--cc=davem@davemloft.net \
--cc=lgrijincu@ixiacom.com \
--cc=netdev@vger.kernel.org \
--cc=opurdila@ixiacom.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.