netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Dumazet <dada1@cosmosbay.com>
To: Linux Netdev List <netdev@vger.kernel.org>
Subject: [RFC] Could we avoid touching dst->refcount in some cases ?
Date: Mon, 24 Nov 2008 09:57:56 +0100	[thread overview]
Message-ID: <492A6C94.7030308@cosmosbay.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 484 bytes --]

tbench has hard time incrementing decrementing the route cache refcount
shared by all communications on localhost.

On real world, we also have this problem on RTP servers sending many UDP
frames to mediagateways, especially big ones handling thousand of streams.

Given that route entries are using RCU, we probably can avoid incrementing
their refcount in case of connected sockets ?

Here is a (untested and probably not working at all) patch on UDP part to
illustrate the idea :


[-- Attachment #2: avoid_touching_refcount.patch --]
[-- Type: text/plain, Size: 1271 bytes --]

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index da869ce..c385f13 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -553,6 +553,7 @@ int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	int ulen = len;
 	struct ipcm_cookie ipc;
 	struct rtable *rt = NULL;
+	int rt_release = 0;
 	int free = 0;
 	int connected = 0;
 	__be32 daddr, faddr, saddr;
@@ -656,8 +657,9 @@ int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		connected = 0;
 	}
 
+	rcu_read_lock();
 	if (connected)
-		rt = (struct rtable*)sk_dst_check(sk, 0);
+		rt = (struct rtable *)__sk_dst_check(sk, 0);
 
 	if (rt == NULL) {
 		struct flowi fl = { .oif = ipc.oif,
@@ -681,11 +683,14 @@ int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		}
 
 		err = -EACCES;
+		rt_release = 1;
 		if ((rt->rt_flags & RTCF_BROADCAST) &&
 		    !sock_flag(sk, SOCK_BROADCAST))
 			goto out;
-		if (connected)
-			sk_dst_set(sk, dst_clone(&rt->u.dst));
+		if (connected) {
+			sk_dst_set(sk, &rt->u.dst);
+			rt_release = 0;
+		}
 	}
 
 	if (msg->msg_flags&MSG_CONFIRM)
@@ -730,7 +735,9 @@ do_append_data:
 	release_sock(sk);
 
 out:
-	ip_rt_put(rt);
+	if (rt_release)
+		ip_rt_put(rt);
+	rcu_read_unlock();
 	if (free)
 		kfree(ipc.opt);
 	if (!err)

             reply	other threads:[~2008-11-24  8:57 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-24  8:57 Eric Dumazet [this message]
2008-11-24  9:42 ` [RFC] Could we avoid touching dst->refcount in some cases ? Andi Kleen
2008-11-24 10:14   ` Eric Dumazet
2008-11-24 11:24     ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data() Eric Dumazet
2008-11-24 13:59       ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_push_pending_frames() Eric Dumazet
2008-11-25  0:07         ` David Miller
2008-11-24 23:55       ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data() David Miller
2008-11-25  2:22         ` Andi Kleen
2008-11-24 11:27     ` [RFC] Could we avoid touching dst->refcount in some cases ? Andi Kleen
2008-11-24 23:36       ` David Miller
2008-11-24 23:39     ` David Miller
2008-11-25  4:43       ` Eric Dumazet
2008-11-25  5:00         ` David Miller
2008-11-26  0:00           ` [PATCH] net: release skb->dst in sock_queue_rcv_skb() Eric Dumazet
2008-11-26  0:23             ` David Miller
2008-11-26  2:04             ` David Miller
2008-11-26  7:39               ` Eric Dumazet
2008-11-26  9:08                 ` David Miller
2008-12-17 11:25             ` net-next: broken IP_PKTINFO [was Re: [PATCH] net: release skb->dst in sock_queue_rcv_skb()] Mark McLoughlin
2008-12-18  3:34               ` net-next: broken IP_PKTINFO David Miller
2008-12-18  5:59                 ` Eric Dumazet
2008-12-18  6:17                   ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=492A6C94.7030308@cosmosbay.com \
    --to=dada1@cosmosbay.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).