From: Jarek Poplawski <jarkao2@gmail.com>
To: David Miller <davem@davemloft.net>
Cc: herbert@gondor.apana.org.au, netdev@vger.kernel.org, kaber@trash.net
Subject: Re: [PATCH take 2] pkt_sched: Fix qdisc_watchdog() vs. dev_deactivate() race
Date: Sun, 21 Sep 2008 01:48:43 +0200
Message-ID: <20080920234843.GA2531@ami.dom.local>
In-Reply-To: <20080920.002137.108837580.davem@davemloft.net>

On Sat, Sep 20, 2008 at 12:21:37AM -0700, David Miller wrote:
...
> Let's look at what actually matters for cpu utilization. These
> __qdisc_run() things are invoked in two situations where we might
> block on the hw queue being stopped:
>
> 1) When feeding packets into the qdisc in dev_queue_xmit().
>
> Guess what? We _know_ the queue this packet is going to
> hit.
>
> The only new thing we can possibly trigger and be interested
> in at this specific point is if _this_ packet can be sent at
> this time.
>
> And we can check that queue mapping after the qdisc_enqueue_root()
> call, so that multiq aware qdiscs can have made their changes.
>
> 2) When waking up a queue. And here we should schedule the qdisc_run
> _unconditionally_.
>
> If the queue was full, it is extremely likely that new packets
> are bound for that device queue. There is no real savings to
> be had by doing this peek/requeue/dequeue stuff.
>
> The cpu utilization savings exist for case #1 only, and we can
> implement the bypass logic _perfectly_ as described above.
>
> For #2 there is nothing to check, just do it and see what comes
> out of the qdisc.

Right, unless __netif_schedule() wasn't done when waking up. I've
thought about this because of another thread/patch around this
problem, and got misled by dev_requeue_skb() scheduling. Now, I think
this could be the main reason for this high load. Anyway, if we want
to skip this check for #2 I think something like the patch below is
needed.

> I would suggest adding an skb pointer argument to qdisc_run().
> If it's NULL, unconditionally schedule __qdisc_run(). Else,
> only schedule if the TX queue indicated by skb_queue_mapping()
> is not stopped.
>
> dev_queue_xmit() will use the "pass the skb" case, but only if
> qdisc_enqueue_root()'s return value doesn't indicate that there
> is a potential drop. On potential drop, we'll pass NULL to
> make sure we don't potentially reference a free'd SKB.
>
> The other case in net_tx_action() can always pass NULL to qdisc_run().

I'm not convinced this #1 is useful for us: this could be an skb #1000
in a queue; the tx status could change many times before this packet
would be #1; why worry? This adds additional checks on the fast path
for something which is unlikely even if this skb would be #1, but for
any later skbs it's only a guess. IMHO, if we can't check for the next
skb to be xmitted it's better to skip this test entirely (which seems
to be safe with the patch below).
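
For reference, the qdisc_run(skb) variant quoted above could be sketched
roughly as follows. This is a user-space model only, not kernel code: the
model_* names, NUM_TXQ and the structs are hypothetical stand-ins, and the
sketch shows nothing more than the proposed NULL / non-NULL decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are the kernel's types. */
#define NUM_TXQ 4

struct model_skb {
	unsigned int queue_mapping;	/* index of the TX queue this packet targets */
};

struct model_dev {
	bool txq_stopped[NUM_TXQ];	/* true when the hardware queue is stopped */
	unsigned int runs;		/* how many times the qdisc was run */
};

/* Stand-in for __qdisc_run(): only counts the invocation. */
static void model_qdisc_run_slow(struct model_dev *dev)
{
	dev->runs++;
}

/*
 * Proposed semantics from the quoted text:
 *   skb == NULL -> run unconditionally (queue wakeup, or possible drop);
 *   skb != NULL -> run only if the TX queue the skb maps to is not stopped.
 * Returns true if the qdisc was run.
 */
static bool model_qdisc_run(struct model_dev *dev, const struct model_skb *skb)
{
	assert(dev != NULL);
	if (skb != NULL) {
		assert(skb->queue_mapping < NUM_TXQ);
		if (dev->txq_stopped[skb->queue_mapping])
			return false;
	}
	model_qdisc_run_slow(dev);
	return true;
}
```

Note that the check looks only at the queue of the skb just enqueued, which
is the limitation discussed above: it says nothing about the skb that would
actually be dequeued next.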
Jarek P.
--------------->
pkt_sched: dev_requeue_skb: Don't schedule if a queue is stopped
Doing __netif_schedule() while requeuing because of a stopped tx queue
and skipping such a test in qdisc_run() can cause a requeuing loop with
high cpu use until the queue is woken.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---
net/sched/sch_generic.c | 23 +++++++++++++++--------
1 files changed, 15 insertions(+), 8 deletions(-)
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index ec0a083..bae2eb8 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -42,14 +42,17 @@ static inline int qdisc_qlen(struct Qdisc *q)
return q->q.qlen;
}
-static inline int dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)
+static inline int dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q,
+ bool stopped)
{
if (unlikely(skb->next))
q->gso_skb = skb;
else
q->ops->requeue(skb, q);
- __netif_schedule(q);
+ if (!stopped)
+ __netif_schedule(q);
+
return 0;
}
@@ -89,7 +92,7 @@ static inline int handle_dev_cpu_collision(struct sk_buff *skb,
* some time.
*/
__get_cpu_var(netdev_rx_stat).cpu_collision++;
- ret = dev_requeue_skb(skb, q);
+ ret = dev_requeue_skb(skb, q, false);
}
return ret;
@@ -121,6 +124,7 @@ static inline int qdisc_restart(struct Qdisc *q)
struct net_device *dev;
spinlock_t *root_lock;
struct sk_buff *skb;
+ bool stopped;
/* Dequeue packet */
if (unlikely((skb = dequeue_skb(q)) == NULL))
@@ -135,9 +139,13 @@ static inline int qdisc_restart(struct Qdisc *q)
txq = netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));
HARD_TX_LOCK(dev, txq, smp_processor_id());
- if (!netif_tx_queue_stopped(txq) &&
- !netif_tx_queue_frozen(txq))
+ if (!netif_tx_queue_stopped(txq) && !netif_tx_queue_frozen(txq)) {
ret = dev_hard_start_xmit(skb, dev, txq);
+ stopped = netif_tx_queue_stopped(txq) ||
+ netif_tx_queue_frozen(txq);
+ } else {
+ stopped = true;
+ }
HARD_TX_UNLOCK(dev, txq);
spin_lock(root_lock);
@@ -159,12 +167,11 @@ static inline int qdisc_restart(struct Qdisc *q)
printk(KERN_WARNING "BUG %s code %d qlen %d\n",
dev->name, ret, q->q.qlen);
- ret = dev_requeue_skb(skb, q);
+ ret = dev_requeue_skb(skb, q, stopped);
break;
}
- if (ret && (netif_tx_queue_stopped(txq) ||
- netif_tx_queue_frozen(txq)))
+ if (ret && stopped)
ret = 0;
return ret;
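
The requeuing loop described in the changelog can be illustrated with a
small user-space model. This is only a sketch under simplifying assumptions:
one TX queue, no locking, the stopped test in qdisc_run() already skipped,
and the case where the queue becomes stopped during dev_hard_start_xmit()
left out. All model_* names are hypothetical; the `patched` flag selects
between the old unconditional reschedule and the `if (!stopped)` behaviour
of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_q {
	int qlen;		/* packets waiting in the qdisc */
	bool txq_stopped;	/* state of the single modelled TX queue */
	bool scheduled;		/* set by the stand-in for __netif_schedule() */
};

/* Stand-in for dev_requeue_skb(): put the packet back, maybe reschedule. */
static void model_requeue(struct model_q *q, bool stopped, bool patched)
{
	q->qlen++;
	if (!patched || !stopped)
		q->scheduled = true;
}

/* One qdisc_restart()-like pass: dequeue, try to transmit, requeue on failure. */
static void model_restart(struct model_q *q, bool patched)
{
	if (q->qlen == 0)
		return;
	q->qlen--;
	if (q->txq_stopped)
		model_requeue(q, true, patched);
	/* otherwise the packet counts as transmitted */
}

/* Count passes until nothing is rescheduled, capped at max_passes. */
static int model_spin(struct model_q *q, bool patched, int max_passes)
{
	int passes = 0;

	assert(q != NULL);
	q->scheduled = true;
	while (q->scheduled && passes < max_passes) {
		q->scheduled = false;
		model_restart(q, patched);
		passes++;
	}
	return passes;
}
```

With one packet queued and the TX queue stopped, model_spin() with
patched == false keeps running until it hits the cap, while patched == true
stops after a single pass and leaves the packet queued for the next wakeup.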