All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jarek Poplawski <jarkao2@gmail.com>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: David Miller <davem@davemloft.net>, netdev@vger.kernel.org
Subject: Re: [RFC] net: release dst entry in dev_queue_xmit()
Date: Wed, 25 Mar 2009 20:18:50 +0100	[thread overview]
Message-ID: <20090325191850.GA2928@ami.dom.local> (raw)
In-Reply-To: <49CA7AD7.8040401@cosmosbay.com>

On Wed, Mar 25, 2009 at 07:41:27PM +0100, Eric Dumazet wrote:
> Jarek Poplawski a écrit :
> > David Miller wrote, On 03/25/2009 08:17 AM:
> > 
> >> From: Eric Dumazet <dada1@cosmosbay.com>
> >> Date: Wed, 25 Mar 2009 08:13:30 +0100
> >>
> >>> If done in dev_hard_start_xmit(), skb could be requeued (because of
> >>> NETDEV_TX_BUSY).  Then if requeued, maybe at this time, dst being
> >>> NULL is not a problem ?
> >> Usually it should be OK because the packet schedulers have
> >> a sort-of one-behind state that allows them to reinsert
> >> the SKB into their queue datastructures without reclassifying.
> > 
> > 
> > Actually, since David has dumped requeuing there is no reinserting;
> > this last one "requeued" skb is buffered at the top in q->gso_skb
> > and waiting for better times.
> 
> Yes indeed, this is what I thought too, thanks Jarek.

Alas I'm a bit concerned with virtual devs, e.g. now I'm looking at
xmits in macvlan and pppoe. Maybe this patch should exclude them?

Jarek P.

> 
> I tested following patch today on my machine, but obviously could not try 
> all possible quirks :)
> 
> [PATCH] net: release dst entry in dev_hard_start_xmit()
> 
> One point of contention in high network loads is the dst_release() performed
> when a transmited skb is freed. This is because NIC tx completion calls skb free
> long after original call to dev_queue_xmit(skb).
> 
> CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is
> quite visible if one CPU is 100% handling softirqs for a network device,
> since dst_clone() is done by other cpus, involving cache line ping pongs.
> 
> I believe we can release dst in dev_hard_start_xmit(), while cpu cache is hot, since
> caller of dev_queue_xmit() had to hold a reference on dst right before. This reduce
> work to be done by softirq handler, and decrease cache misses.
> 
> I also believe only pktgen can call dev_queue_xmit() with skb which have
> a skb->users != 1. But pkthen skbs have a NULL dst entry.
> 
> I added a WARN_ON_ONCE() to catch other cases, and not release skb->dst
> if skb->users != 1
> 
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index e3fe5c7..a622db6 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1664,6 +1664,26 @@ static int dev_gso_segment(struct sk_buff *skb)
>  	return 0;
>  }
>  
> +
> +/*
> + * Release dst while its refcnt is likely hot in CPU cache, instead
> + * of waiting NIC tx completion.
> + * We inline dst_release() code for performance reason
> + */
> +static void release_skb_dst(struct sk_buff *skb)
> +{
> +	if (likely(skb->dst)) {
> +		if (!WARN_ON_ONCE(atomic_read(&skb->users) != 1)) {
> +			int newrefcnt;
> +
> +			smp_mb__before_atomic_dec();
> +			newrefcnt = atomic_dec_return(&skb->dst->__refcnt);
> +			WARN_ON(newrefcnt < 0);
> +			skb->dst = NULL;
> +		}
> +	}
> +}
> +
>  int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
>  			struct netdev_queue *txq)
>  {
> @@ -1681,6 +1701,7 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
>  				goto gso;
>  		}
>  
> +		release_skb_dst(skb);
>  		return ops->ndo_start_xmit(skb, dev);
>  	}
>  
> @@ -1691,6 +1712,7 @@ gso:
>  
>  		skb->next = nskb->next;
>  		nskb->next = NULL;
> +		release_skb_dst(nskb);
>  		rc = ops->ndo_start_xmit(nskb, dev);
>  		if (unlikely(rc)) {
>  			nskb->next = skb->next;
> 

  reply	other threads:[~2009-03-25 19:19 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-20 11:40 [RFC] net: release dst entry in dev_queue_xmit() Eric Dumazet
2009-03-20 14:10 ` Neil Horman
2009-03-25  6:43 ` David Miller
2009-03-25  7:13   ` Eric Dumazet
2009-03-25  7:17     ` David Miller
2009-03-25 18:22       ` Jarek Poplawski
2009-03-25 18:41         ` Eric Dumazet
2009-03-25 19:18           ` Jarek Poplawski [this message]
2009-03-25 19:40             ` Eric Dumazet
2009-03-25 19:54               ` Jarek Poplawski
2009-03-25 20:28                 ` Eric Dumazet
2009-03-25 21:12                   ` Jarek Poplawski
2009-03-25 21:20                     ` Patrick McHardy
2009-05-12  8:12 ` [PATCH] net: release dst entry in dev_hard_start_xmit() Eric Dumazet
2009-05-12  8:21   ` Eric Dumazet
2009-05-12 19:26     ` [PATCH, v2] " Eric Dumazet
2009-05-19  5:19       ` David Miller
2009-05-19  5:44         ` Eric Dumazet
2009-05-19 19:44         ` Eric Dumazet
2009-05-19 21:09           ` Jarek Poplawski
2009-05-19 21:21             ` Eric Dumazet
2009-05-19 21:24             ` David Miller
2009-05-12 19:27     ` [PATCH] " Jarek Poplawski
2009-05-12 19:44       ` Eric Dumazet
2009-05-12 20:05         ` Jarek Poplawski
2009-05-12 20:24           ` Jarek Poplawski
2009-05-12 20:52           ` Eric Dumazet
2009-05-12 20:59             ` Jarek Poplawski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090325191850.GA2928@ami.dom.local \
    --to=jarkao2@gmail.com \
    --cc=dada1@cosmosbay.com \
    --cc=davem@davemloft.net \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.