From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jesper Dangaard Brouer Subject: [net-next PATCH 1/1 V4] qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE Date: Wed, 24 Sep 2014 18:12:12 +0200 Message-ID: <20140924161047.9721.43080.stgit@localhost> References: <20140924160932.9721.56450.stgit@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Cc: Alexander Duyck , toke@toke.dk, Florian Westphal , jhs@mojatatu.com, Dave Taht , John Fastabend , Daniel Borkmann , Hannes Frederic Sowa To: netdev@vger.kernel.org, therbert@google.com, "David S. Miller" , Eric Dumazet , Jesper Dangaard Brouer Return-path: Received: from mx1.redhat.com ([209.132.183.28]:50690 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751630AbaIXQMh (ORCPT ); Wed, 24 Sep 2014 12:12:37 -0400 In-Reply-To: <20140924160932.9721.56450.stgit@localhost> Sender: netdev-owner@vger.kernel.org List-ID: Based on DaveM's recent API work on dev_hard_start_xmit(), that allows sending/processing an entire skb list. This patch implements qdisc bulk dequeue, by allowing multiple packets to be dequeued in dequeue_skb(). The optimization principle for this is two fold, (1) to amortize locking cost and (2) avoid expensive tailptr update for notifying HW. (1) Several packets are dequeued while holding the qdisc root_lock, amortizing locking cost over several packet. The dequeued SKB list is processed under the TXQ lock in dev_hard_start_xmit(), thus also amortizing the cost of the TXQ lock. (2) Further more, dev_hard_start_xmit() will utilize the skb->xmit_more API to delay HW tailptr update, which also reduces the cost per packet. One restriction of the new API is that every SKB must belong to the same TXQ. This patch takes the easy way out, by restricting bulk dequeue to qdisc's with the TCQ_F_ONETXQUEUE flag, that specifies the qdisc only have attached a single TXQ. Some detail about the flow; dev_hard_start_xmit() will process the skb list, and transmit packets individually towards the driver (see xmit_one()). In case the driver stops midway in the list, the remaining skb list is returned by dev_hard_start_xmit(). In sch_direct_xmit() this returned list is requeued by dev_requeue_skb(). To avoid overshooting the HW limits, which results in requeuing, the patch limits the amount of bytes dequeued, based on the drivers BQL limits. In-effect bulking will only happen for BQL enabled drivers. Besides the bytelimit from BQL, also limit bulking to maximum 8 packets to avoid any issues with available descriptor in HW. For now, as a conservative approach, don't bulk dequeue GSO and segmented GSO packets, because testing showed requeuing occuring with segmented GSO packets. Jointed work with Hannes, Daniel and Florian. Signed-off-by: Jesper Dangaard Brouer Signed-off-by: Hannes Frederic Sowa Signed-off-by: Daniel Borkmann Signed-off-by: Florian Westphal --- V4: - Patch rewritten in the Red Hat Neuchatel office jointed work with Hannes, Daniel and Florian. - Conservative approach of only using on BQL enabled drivers - No user tunable parameter, but limit bulking to 8 packets. - For now, avoid bulking GSO packets packets. V3: - Correct use of BQL - Some minor adjustments based on feedback. - Default setting only bulk dequeue 1 extra packet (2 packets). V2: - Restruct functions, split out functionality - Use BQL bytelimit to avoid overshooting driver limits --- include/net/sch_generic.h | 16 +++++++++++++++ net/sched/sch_generic.c | 47 +++++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 61 insertions(+), 2 deletions(-) diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index a3cfb8e..4e39a3e 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -6,6 +6,7 @@ #include #include #include +#include #include #include @@ -111,6 +112,21 @@ static inline void qdisc_run_end(struct Qdisc *qdisc) qdisc->__state &= ~__QDISC___STATE_RUNNING; } +static inline bool qdisc_may_bulk(const struct Qdisc *qdisc) +{ + return qdisc->flags & TCQ_F_ONETXQUEUE; +} + +static inline int qdisc_avail_bulklimit(const struct netdev_queue *txq) +{ +#ifdef CONFIG_BQL + /* Non-BQL migrated drivers will return 0, too. */ + return dql_avail(&txq->dql); +#else + return 0; +#endif +} + static inline bool qdisc_is_throttled(const struct Qdisc *qdisc) { return test_bit(__QDISC_STATE_THROTTLED, &qdisc->state) ? true : false; diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index 19696eb..6fba089 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -56,6 +56,42 @@ static inline int dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q) return 0; } +static struct sk_buff *try_bulk_dequeue_skb(struct Qdisc *q, + struct sk_buff *head_skb, + int bytelimit) +{ + struct sk_buff *skb, *tail_skb = head_skb; + int budget = 8; /* Arbitrary, but conservatively choosen limit */ + + while (bytelimit > 0 && --budget > 0) { + /* For now, don't bulk dequeue GSO (or GSO segmented) pkts */ + if (tail_skb->next || skb_is_gso(tail_skb)) + break; + + skb = q->dequeue(q); + if (!skb) + break; + + bytelimit -= skb->len; /* covers GSO len */ + skb = validate_xmit_skb(skb, qdisc_dev(q)); + if (!skb) + break; + + /* "skb" can be a skb list after validate call above + * (GSO segmented), but it is okay to append it to + * current tail_skb->next, because next round will exit + * in-case "tail_skb->next" is a skb list. + */ + tail_skb->next = skb; + tail_skb = skb; + } + + return head_skb; +} + +/* Note that dequeue_skb can possibly return a SKB list (via skb->next). + * A requeued skb (via q->gso_skb) can also be a SKB list. + */ static inline struct sk_buff *dequeue_skb(struct Qdisc *q) { struct sk_buff *skb = q->gso_skb; @@ -70,10 +106,17 @@ static inline struct sk_buff *dequeue_skb(struct Qdisc *q) } else skb = NULL; } else { - if (!(q->flags & TCQ_F_ONETXQUEUE) || !netif_xmit_frozen_or_stopped(txq)) { + if (!(q->flags & TCQ_F_ONETXQUEUE) || + !netif_xmit_frozen_or_stopped(txq)) { + int bytelimit = qdisc_avail_bulklimit(txq); + skb = q->dequeue(q); - if (skb) + if (skb) { + bytelimit -= skb->len; skb = validate_xmit_skb(skb, qdisc_dev(q)); + } + if (skb && qdisc_may_bulk(q)) + skb = try_bulk_dequeue_skb(q, skb, bytelimit); } }