From: Andi Kleen <andi@firstfloor.org>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: Linux Netdev List <netdev@vger.kernel.org>
Subject: Re: [RFC] Could we avoid touching dst->refcount in some cases ?
Date: Mon, 24 Nov 2008 10:42:42 +0100
Message-ID: <87y6z9h33h.fsf@basil.nowhere.org>
In-Reply-To: <492A6C94.7030308@cosmosbay.com> (Eric Dumazet's message of "Mon, 24 Nov 2008 09:57:56 +0100")

Eric Dumazet <dada1@cosmosbay.com> writes:

> tbench has a hard time incrementing and decrementing the route cache
> refcount shared by all communications on localhost.

IIRC there was a patch some time ago to use per-CPU loopback devices to
avoid this, but it was considered too much of a benchmark hack.
As core counts increase, it might stop being just that, though.

>
> In the real world, we also have this problem on RTP servers sending many UDP
> frames to media gateways, especially big ones handling thousands of streams.
>
> Given that route entries are using RCU, we probably can avoid incrementing
> their refcount in case of connected sockets ?

Normally they can be held across sleeps or while skbs are queued, too, and
RCU doesn't handle that. To make it handle that, you would need to define a
custom RCU grace period designed for this case, but that would probably be
tricky and fragile. In particular, even if you had an "any packet queued"
RCU method, I'm not sure it would be guaranteed to always finish,
because there is no fixed upper bound on the lifetime of a packet.

The other issue is that on preemptible kernels you would need to
disable preemption for as long as such a routing entry is held, which
could potentially be quite long.
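
To make the first point concrete, here is a minimal userspace sketch, not
kernel code. All toy_* names are hypothetical stand-ins and do not mirror
the real dst_entry or sk_buff layout. It only shows why a queued packet
needs its own counted reference: the route has to stay alive until the
packet is freed, which can be arbitrarily later than any read-side
critical section.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a route cache entry. */
struct toy_dst {
	atomic_int refcnt;	/* number of current holders */
	bool freed;		/* set once the last holder lets go */
};

/* Hypothetical stand-in for a packet that pins its route while queued. */
struct toy_skb {
	struct toy_dst *dst;
};

/* The cache itself owns the initial reference. */
static void toy_dst_init(struct toy_dst *d)
{
	atomic_init(&d->refcnt, 1);
	d->freed = false;
}

static void toy_dst_hold(struct toy_dst *d)
{
	atomic_fetch_add(&d->refcnt, 1);
}

static void toy_dst_release(struct toy_dst *d)
{
	/* atomic_fetch_sub() returns the old value: 1 means we were last. */
	if (atomic_fetch_sub(&d->refcnt, 1) == 1)
		d->freed = true;
}

/* Queueing a packet takes a reference that outlives the caller. */
static void toy_skb_attach(struct toy_skb *skb, struct toy_dst *d)
{
	toy_dst_hold(d);
	skb->dst = d;
}

/* Freeing the packet, possibly much later, drops that reference. */
static void toy_skb_free(struct toy_skb *skb)
{
	toy_dst_release(skb->dst);
	skb->dst = NULL;
}
```

If the cache drops its own reference while a packet is still attached, the
entry stays valid until toy_skb_free() runs. That interval has no upper
bound, which is the part a plain RCU read-side section does not cover.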

-Andi

-- 
ak@linux.intel.com
