All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Eric Dumazet <edumazet@google.com>
Cc: "David S . Miller" <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>,
	John Fastabend <john.r.fastabend@intel.com>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	brouer@redhat.com
Subject: Re: [PATCH net-next 1/4] net_sched: drop packets after root qdisc lock is released
Date: Wed, 22 Jun 2016 17:14:37 +0200	[thread overview]
Message-ID: <20160622171437.3f52d3a4@redhat.com> (raw)
In-Reply-To: <1466576212-15012-2-git-send-email-edumazet@google.com>

On Tue, 21 Jun 2016 23:16:49 -0700
Eric Dumazet <edumazet@google.com> wrote:

> Qdisc performance suffers when packets are dropped at enqueue()
> time because drops (kfree_skb()) are done while qdisc lock is held,
> delaying a dequeue() draining the queue.
> 
> Nominal throughput can be reduced by 50 % when this happens,
> at a time we would like the dequeue() to proceed as fast as possible.
> 
> Even FQ is vulnerable to this problem, while one of FQ goals was
> to provide some flow isolation.
> 
> This patch adds a 'struct sk_buff **to_free' parameter to all
> qdisc->enqueue(), and in qdisc_drop() helper.
> 
> I measured a performance increase of up to 12 %, but this patch
> is a prereq so that future batches in enqueue() can fly.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
[...]
> +/* Instead of calling kfree_skb() while root qdisc lock is held,
> + * queue the skb for future freeing at end of __dev_xmit_skb()
> + */
> +static inline void __qdisc_drop(struct sk_buff *skb, struct sk_buff **to_free)
> +{
> +	skb->next = *to_free;
> +	*to_free = skb;
> +}
> +
[...]
> diff --git a/net/core/dev.c b/net/core/dev.c
> index d40593b3b9fb..aba10d2a8bc3 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3070,6 +3070,7 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
>  				 struct netdev_queue *txq)
>  {
>  	spinlock_t *root_lock = qdisc_lock(q);
> +	struct sk_buff *to_free = NULL;
>  	bool contended;
>  	int rc;
>  
> @@ -3086,7 +3087,7 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
>  
>  	spin_lock(root_lock);
>  	if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
> -		kfree_skb(skb);
> +		__qdisc_drop(skb, &to_free);
>  		rc = NET_XMIT_DROP;
>  	} else if ((q->flags & TCQ_F_CAN_BYPASS) && !qdisc_qlen(q) &&
>  		   qdisc_run_begin(q)) {
> @@ -3109,7 +3110,7 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
>  
>  		rc = NET_XMIT_SUCCESS;
>  	} else {
> -		rc = q->enqueue(skb, q) & NET_XMIT_MASK;
> +		rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
>  		if (qdisc_run_begin(q)) {
>  			if (unlikely(contended)) {
>  				spin_unlock(&q->busylock);
> @@ -3119,6 +3120,8 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
>  		}
>  	}
>  	spin_unlock(root_lock);
> +	if (unlikely(to_free))
> +		kfree_skb_list(to_free);

Great, now there is a good argument for implementing kmem_cache bulk
freeing inside kfree_skb_list().  I did a ugly PoC implementation
once, but there was no use-case that really needed the performance
boost.

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

  reply	other threads:[~2016-06-22 15:14 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-22  6:16 [PATCH net-next 0/4] net_sched: bulk dequeue and deferred drops Eric Dumazet
2016-06-22  6:16 ` [PATCH net-next 1/4] net_sched: drop packets after root qdisc lock is released Eric Dumazet
2016-06-22 15:14   ` Jesper Dangaard Brouer [this message]
2016-06-22  6:16 ` [PATCH net-next 2/4] net_sched: fq_codel: cache skb->truesize into skb->cb Eric Dumazet
2016-06-22  6:16 ` [PATCH net-next 3/4] net_sched: sch_htb: export class backlog in dumps Eric Dumazet
2016-06-22  6:16 ` [PATCH net-next 4/4] net_sched: generalize bulk dequeue Eric Dumazet
2016-06-22 15:03   ` Jesper Dangaard Brouer
2016-06-23  7:26   ` Paolo Abeni
2016-06-22 14:47 ` [PATCH net-next 0/4] net_sched: bulk dequeue and deferred drops Jesper Dangaard Brouer
2016-06-22 14:55   ` Eric Dumazet
2016-06-22 15:44     ` Jesper Dangaard Brouer
2016-06-22 16:49       ` Eric Dumazet
2016-06-23 14:22         ` Jesper Dangaard Brouer
2016-06-23 16:21         ` Luigi Rizzo
2016-06-25 16:20 ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160622171437.3f52d3a4@redhat.com \
    --to=brouer@redhat.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=eric.dumazet@gmail.com \
    --cc=john.r.fastabend@intel.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.