From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
netdev@vger.kernel.org, therbert@google.com,
hannes@stressinduktion.org, fw@strlen.de, dborkman@redhat.com,
jhs@mojatatu.com, alexander.duyck@gmail.com,
john.r.fastabend@intel.com, dave.taht@gmail.com, toke@toke.dk,
brouer@redhat.com
Subject: Re: Quota in __qdisc_run() (was: qdisc: validate skb without holding lock)
Date: Tue, 7 Oct 2014 15:30:50 +0200
Message-ID: <20141007153050.792c9743@redhat.com>
In-Reply-To: <1412686038.11091.111.camel@edumazet-glaptop2.roam.corp.google.com>
On Tue, 07 Oct 2014 05:47:18 -0700
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2014-10-07 at 09:34 +0200, Jesper Dangaard Brouer wrote:
> > On Fri, 03 Oct 2014 16:30:44 -0700 Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> > > Another problem we need to address is the quota in __qdisc_run()
> > > is no longer meaningful if each qdisc_restart() can pump many packets.
> >
> > I fully agree. My earlier "magic" packet limit was covering/papering
> > over this issue.
>
> Although quota was multiplied by 7 or 8 in worst case ?
Yes, exactly. Not a very elegant solution ;-)
> > > An idea would be to use the bstats (or cpu_qstats if applicable)
> >
> > Please elaborate some more, as I don't completely follow (feel free to
> > show with a patch ;-)).
> >
>
> I was hoping John could finish the percpu stats before I do that.
>
> Problem with q->bstats.packets is that TSO packets with 45 MSS add 45 to
> this counter.
>
> Using a time quota would be better, but : jiffies is too big, and
> local_clock() might be too expensive.
Hannes hacked up this patch for me... (I didn't finish testing it.)

The basic idea is that we want to keep/restore the quota fairness between
qdiscs, which we sort of broke with commit 5772e9a346 ("qdisc: bulk
dequeue support for qdiscs with TCQ_F_ONETXQUEUE").

We chose not to account for the number of packets inside the TSO/GSO
packets ("skb_gso_segs"), as the previous fairness also had this "defect".

You might view this as a short-term solution, until you can fix it with
your q->bstats.packets or time quota?
[RFC PATCH] net_sched: restore quota limits after bulk dequeue
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -57,17 +57,19 @@ static inline int dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)
static void try_bulk_dequeue_skb(struct Qdisc *q,
struct sk_buff *skb,
- const struct netdev_queue *txq)
+ const struct netdev_queue *txq,
+ int *quota)
{
int bytelimit = qdisc_avail_bulklimit(txq) - skb->len;
- while (bytelimit > 0) {
+ while (bytelimit > 0 && *quota > 0) {
struct sk_buff *nskb = q->dequeue(q);
if (!nskb)
break;
bytelimit -= nskb->len; /* covers GSO len */
+ --*quota;
skb->next = nskb;
skb = nskb;
}
@@ -77,7 +79,7 @@ static void try_bulk_dequeue_skb(struct Qdisc *q,
/* Note that dequeue_skb can possibly return a SKB list (via skb->next).
* A requeued skb (via q->gso_skb) can also be a SKB list.
*/
-static struct sk_buff *dequeue_skb(struct Qdisc *q, bool *validate)
+static struct sk_buff *dequeue_skb(struct Qdisc *q, bool *validate, int *quota)
{
struct sk_buff *skb = q->gso_skb;
const struct netdev_queue *txq = q->dev_queue;
@@ -87,18 +89,25 @@ static struct sk_buff *dequeue_skb(struct Qdisc *q, bool *validate)
/* check the reason of requeuing without tx lock first */
txq = skb_get_tx_queue(txq->dev, skb);
if (!netif_xmit_frozen_or_stopped(txq)) {
+ struct sk_buff *iskb = skb;
+
q->gso_skb = NULL;
q->q.qlen--;
- } else
+ do
+ --*quota;
+ while ((iskb = iskb->next));
+ } else {
skb = NULL;
+ }
/* skb in gso_skb were already validated */
*validate = false;
} else {
if (!(q->flags & TCQ_F_ONETXQUEUE) ||
!netif_xmit_frozen_or_stopped(txq)) {
skb = q->dequeue(q);
+ --*quota;
if (skb && qdisc_may_bulk(q))
- try_bulk_dequeue_skb(q, skb, txq);
+ try_bulk_dequeue_skb(q, skb, txq, quota);
}
}
return skb;
@@ -204,7 +213,7 @@ int sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
* >0 - queue is not empty.
*
*/
-static inline int qdisc_restart(struct Qdisc *q)
+static inline int qdisc_restart(struct Qdisc *q, int *quota)
{
struct netdev_queue *txq;
struct net_device *dev;
@@ -213,7 +222,7 @@ static inline int qdisc_restart(struct Qdisc *q)
bool validate;
/* Dequeue packet */
- skb = dequeue_skb(q, &validate);
+ skb = dequeue_skb(q, &validate, quota);
if (unlikely(!skb))
return 0;
@@ -230,13 +239,13 @@ void __qdisc_run(struct Qdisc *q)
{
int quota = weight_p;
- while (qdisc_restart(q)) {
+ while (qdisc_restart(q, &quota)) {
/*
* Ordered by possible occurrence: Postpone processing if
* 1. we've exceeded packet quota
* 2. another process needs the CPU;
*/
- if (--quota <= 0 || need_resched()) {
+ if (quota <= 0 || need_resched()) {
__netif_schedule(q);
break;
}