From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Eric Dumazet <edumazet@google.com>
Cc: "David S . Miller" <davem@davemloft.net>,
netdev <netdev@vger.kernel.org>,
John Fastabend <john.r.fastabend@intel.com>,
Eric Dumazet <eric.dumazet@gmail.com>,
brouer@redhat.com
Subject: Re: [PATCH net-next 1/4] net_sched: drop packets after root qdisc lock is released
Date: Wed, 22 Jun 2016 17:14:37 +0200
Message-ID: <20160622171437.3f52d3a4@redhat.com>
In-Reply-To: <1466576212-15012-2-git-send-email-edumazet@google.com>
On Tue, 21 Jun 2016 23:16:49 -0700
Eric Dumazet <edumazet@google.com> wrote:
> Qdisc performance suffers when packets are dropped at enqueue()
> time because drops (kfree_skb()) are done while qdisc lock is held,
> delaying a dequeue() draining the queue.
>
> Nominal throughput can be reduced by 50 % when this happens,
> at a time we would like the dequeue() to proceed as fast as possible.
>
> Even FQ is vulnerable to this problem, while one of FQ goals was
> to provide some flow isolation.
>
> This patch adds a 'struct sk_buff **to_free' parameter to all
> qdisc->enqueue(), and in qdisc_drop() helper.
>
> I measured a performance increase of up to 12 %, but this patch
> is a prereq so that future batches in enqueue() can fly.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
[...]
> +/* Instead of calling kfree_skb() while root qdisc lock is held,
> + * queue the skb for future freeing at end of __dev_xmit_skb()
> + */
> +static inline void __qdisc_drop(struct sk_buff *skb, struct sk_buff **to_free)
> +{
> + skb->next = *to_free;
> + *to_free = skb;
> +}
> +
[...]
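(Just to illustrate how the new parameter is meant to be used -- this is
not part of the patch: a leaf qdisc's enqueue() now chains an over-limit
skb onto *to_free via the qdisc_drop() helper instead of freeing it while
the root lock is held. Roughly along these lines, with a made-up
pfifo-style limit check for illustration:)

static int example_enqueue(struct sk_buff *skb, struct Qdisc *sch,
			   struct sk_buff **to_free)
{
	/* Over limit: do not kfree_skb() here, the root qdisc lock is held.
	 * qdisc_drop() chains the skb onto *to_free via __qdisc_drop();
	 * __dev_xmit_skb() frees the whole list once the lock is released.
	 */
	if (unlikely(sch->q.qlen >= sch->limit))
		return qdisc_drop(skb, sch, to_free);

	return qdisc_enqueue_tail(skb, sch);
}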
> diff --git a/net/core/dev.c b/net/core/dev.c
> index d40593b3b9fb..aba10d2a8bc3 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3070,6 +3070,7 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
> struct netdev_queue *txq)
> {
> spinlock_t *root_lock = qdisc_lock(q);
> + struct sk_buff *to_free = NULL;
> bool contended;
> int rc;
>
> @@ -3086,7 +3087,7 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
>
> spin_lock(root_lock);
> if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
> - kfree_skb(skb);
> + __qdisc_drop(skb, &to_free);
> rc = NET_XMIT_DROP;
> } else if ((q->flags & TCQ_F_CAN_BYPASS) && !qdisc_qlen(q) &&
> qdisc_run_begin(q)) {
> @@ -3109,7 +3110,7 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
>
> rc = NET_XMIT_SUCCESS;
> } else {
> - rc = q->enqueue(skb, q) & NET_XMIT_MASK;
> + rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
> if (qdisc_run_begin(q)) {
> if (unlikely(contended)) {
> spin_unlock(&q->busylock);
> @@ -3119,6 +3120,8 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
> }
> }
> spin_unlock(root_lock);
> + if (unlikely(to_free))
> + kfree_skb_list(to_free);
Great, now there is a good argument for implementing kmem_cache bulk
freeing inside kfree_skb_list(). I did an ugly PoC implementation
once, but back then there was no use case that really needed the
performance boost.
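(The PoC itself isn't shown here; very roughly, and glossing over the
refcount/fclone slow paths and the skb_release_all()/skbuff_head_cache
internals of net/core/skbuff.c, the idea would be something like:)

void kfree_skb_list(struct sk_buff *segs)
{
	void *heads[16];
	unsigned int n = 0;

	while (segs) {
		struct sk_buff *next = segs->next;

		if (atomic_read(&segs->users) != 1 ||
		    segs->fclone != SKB_FCLONE_UNAVAILABLE) {
			kfree_skb(segs);	/* slow path, as today */
		} else {
			/* Release data, frags, dst, ... as usual, but defer
			 * the head free and hand a batch to the slab at once.
			 */
			skb_release_all(segs);
			heads[n++] = segs;
			if (n == ARRAY_SIZE(heads)) {
				kmem_cache_free_bulk(skbuff_head_cache,
						     n, heads);
				n = 0;
			}
		}
		segs = next;
	}
	if (n)
		kmem_cache_free_bulk(skbuff_head_cache, n, heads);
}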
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
Thread overview: 15+ messages
2016-06-22 6:16 [PATCH net-next 0/4] net_sched: bulk dequeue and deferred drops Eric Dumazet
2016-06-22 6:16 ` [PATCH net-next 1/4] net_sched: drop packets after root qdisc lock is released Eric Dumazet
2016-06-22 15:14 ` Jesper Dangaard Brouer [this message]
2016-06-22 6:16 ` [PATCH net-next 2/4] net_sched: fq_codel: cache skb->truesize into skb->cb Eric Dumazet
2016-06-22 6:16 ` [PATCH net-next 3/4] net_sched: sch_htb: export class backlog in dumps Eric Dumazet
2016-06-22 6:16 ` [PATCH net-next 4/4] net_sched: generalize bulk dequeue Eric Dumazet
2016-06-22 15:03 ` Jesper Dangaard Brouer
2016-06-23 7:26 ` Paolo Abeni
2016-06-22 14:47 ` [PATCH net-next 0/4] net_sched: bulk dequeue and deferred drops Jesper Dangaard Brouer
2016-06-22 14:55 ` Eric Dumazet
2016-06-22 15:44 ` Jesper Dangaard Brouer
2016-06-22 16:49 ` Eric Dumazet
2016-06-23 14:22 ` Jesper Dangaard Brouer
2016-06-23 16:21 ` Luigi Rizzo
2016-06-25 16:20 ` David Miller