From: Jarek Poplawski <jarkao2@gmail.com>
To: David Miller <davem@davemloft.net>
Cc: herbert@gondor.apana.org.au, netdev@vger.kernel.org, kaber@trash.net
Subject: Re: [PATCH take 2] pkt_sched: Fix qdisc_watchdog() vs. dev_deactivate() race
Date: Sun, 21 Sep 2008 01:48:43 +0200
Message-ID: <20080920234843.GA2531@ami.dom.local>
In-Reply-To: <20080920.002137.108837580.davem@davemloft.net>
On Sat, Sep 20, 2008 at 12:21:37AM -0700, David Miller wrote:
...
> Let's look at what actually matters for cpu utilization. These
> __qdisc_run() things are invoked in two situations where we might
> block on the hw queue being stopped:
>
> 1) When feeding packets into the qdisc in dev_queue_xmit().
>
> Guess what? We _know_ the queue this packet is going to
> hit.
>
> The only new thing we can possible trigger and be interested
> in at this specific point is if _this_ packet can be sent at
> this time.
>
> And we can check that queue mapping after the qdisc_enqueue_root()
> call, so that multiq aware qdiscs can have made their changes.
>
> 2) When waking up a queue. And here we should schedule the qdisc_run
> _unconditionally_.
>
> If the queue was full, it is extremely likely that new packets
> are bound for that device queue. There is no real savings to
> be had by doing this peek/requeue/dequeue stuff.
>
> The cpu utilization savings exist for case #1 only, and we can
> implement the bypass logic _perfectly_ as described above.
>
> For #2 there is nothing to check, just do it and see what comes
> out of the qdisc.
Right, unless __netif_schedule() wasn't done when waking up. I've
thought about this because of another thread/patch around this
problem, and got misled by dev_requeue_skb() scheduling. Now, I think
this could be the main reason for this high load. Anyway, if we want
to skip this check for #2 I think something like the patch below is
needed.
> I would suggest adding an skb pointer argument to qdisc_run().
> If it's NULL, unconditionally schedule __qdisc_run(). Else,
> only schedule if the TX queue indicated by skb_queue_mapping()
> is not stopped.
>
> dev_queue_xmit() will use the "pass the skb" case, but only if
> qdisc_enqueue_root()'s return value doesn't indicate that there
> is a potential drop. On potential drop, we'll pass NULL to
> make sure we don't potentially reference a free'd SKB.
>
> The other case in net_tx_action() can always pass NULL to qdisc_run().
I'm not convinced #1 is useful for us: this could be skb #1000 in a
queue; the tx status could change many times before this packet
becomes #1, so why worry? This adds checks on the fast path for
something which is unlikely even if this skb were #1, and for any
later skbs it's only a guess. IMHO, if we can't check the next skb to
be xmitted it's better to skip this test entirely (which seems to be
safe with the patch below).
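For reference, the skb-argument check proposed in the quoted text can be
modeled in userspace roughly as below. This is only a sketch under stated
assumptions: the model_* types and model_should_run() are hypothetical
stand-ins, not kernel structures or functions, and the out-of-range
fallback is an assumption added for the model.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct model_txq {
	bool stopped;			/* models a stopped hw tx queue */
};

struct model_skb {
	unsigned int queue_mapping;	/* models skb_get_queue_mapping() */
};

struct model_dev {
	struct model_txq *txqs;
	unsigned int num_txqs;
};

/*
 * Decide whether the qdisc should be run.
 * skb == NULL models the wakeup path (or a potential drop on enqueue):
 * run unconditionally. Otherwise run only if the tx queue this skb maps
 * to is not stopped.
 */
static bool model_should_run(const struct model_dev *dev,
			     const struct model_skb *skb)
{
	if (skb == NULL)
		return true;
	if (skb->queue_mapping >= dev->num_txqs)
		return true;	/* assumption: fall back to running */
	return !dev->txqs[skb->queue_mapping].stopped;
}
```

Note that the model only looks at the queue of the skb just enqueued,
which is exactly the limitation discussed above: it says nothing about
the skb that would actually be dequeued next.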
Jarek P.
--------------->
pkt_sched: dev_requeue_skb: Don't schedule if a queue is stopped

Doing __netif_schedule() while requeuing because of a stopped tx queue
and skipping such a test in qdisc_run() can cause a requeuing loop with
high cpu use until the queue is woken.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---
net/sched/sch_generic.c | 23 +++++++++++++++--------
1 files changed, 15 insertions(+), 8 deletions(-)
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index ec0a083..bae2eb8 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -42,14 +42,17 @@ static inline int qdisc_qlen(struct Qdisc *q)
 	return q->q.qlen;
 }
 
-static inline int dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)
+static inline int dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q,
+				  bool stopped)
 {
 	if (unlikely(skb->next))
 		q->gso_skb = skb;
 	else
 		q->ops->requeue(skb, q);
 
-	__netif_schedule(q);
+	if (!stopped)
+		__netif_schedule(q);
+
 	return 0;
 }
 
@@ -89,7 +92,7 @@ static inline int handle_dev_cpu_collision(struct sk_buff *skb,
 		 * some time.
 		 */
 		__get_cpu_var(netdev_rx_stat).cpu_collision++;
-		ret = dev_requeue_skb(skb, q);
+		ret = dev_requeue_skb(skb, q, false);
 	}
 
 	return ret;
@@ -121,6 +124,7 @@ static inline int qdisc_restart(struct Qdisc *q)
 	struct net_device *dev;
 	spinlock_t *root_lock;
 	struct sk_buff *skb;
+	bool stopped;
 
 	/* Dequeue packet */
 	if (unlikely((skb = dequeue_skb(q)) == NULL))
@@ -135,9 +139,13 @@ static inline int qdisc_restart(struct Qdisc *q)
 	txq = netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));
 
 	HARD_TX_LOCK(dev, txq, smp_processor_id());
-	if (!netif_tx_queue_stopped(txq) &&
-	    !netif_tx_queue_frozen(txq))
+	if (!netif_tx_queue_stopped(txq) && !netif_tx_queue_frozen(txq)) {
 		ret = dev_hard_start_xmit(skb, dev, txq);
+		stopped = netif_tx_queue_stopped(txq) ||
+			  netif_tx_queue_frozen(txq);
+	} else {
+		stopped = true;
+	}
 	HARD_TX_UNLOCK(dev, txq);
 
 	spin_lock(root_lock);
@@ -159,12 +167,11 @@ static inline int qdisc_restart(struct Qdisc *q)
 			printk(KERN_WARNING "BUG %s code %d qlen %d\n",
 			       dev->name, ret, q->q.qlen);
 
-		ret = dev_requeue_skb(skb, q);
+		ret = dev_requeue_skb(skb, q, stopped);
 		break;
 	}
 
-	if (ret && (netif_tx_queue_stopped(txq) ||
-		    netif_tx_queue_frozen(txq)))
+	if (ret && stopped)
 		ret = 0;
 
 	return ret;
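
The requeuing loop described in the changelog can be illustrated with a
small userspace model. This is a sketch, not kernel code: struct
model_state and the model_* functions are hypothetical, one loop
iteration stands in for one softirq round, and the wakeup is assumed to
schedule the qdisc once by itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one qdisc feeding one tx queue. */
struct model_state {
	bool queue_stopped;	/* models netif_tx_queue_stopped() */
	bool scheduled;		/* models a pending __netif_schedule() */
	int backlog;		/* packets waiting in the qdisc */
	int restart_calls;	/* how often the restart path ran */
};

/* Requeue one packet; with the patch, do not reschedule if stopped. */
static void model_requeue(struct model_state *s, bool stopped, bool patched)
{
	s->backlog++;
	if (!patched || !stopped)
		s->scheduled = true;
}

/* One pass of the restart path: dequeue, then xmit or requeue. */
static void model_restart(struct model_state *s, bool patched)
{
	s->restart_calls++;
	if (s->backlog == 0)
		return;
	s->backlog--;				/* dequeue */
	if (s->queue_stopped)
		model_requeue(s, true, patched);
	else if (s->backlog > 0)		/* sent; more to do */
		s->scheduled = true;
}

/*
 * Keep the queue stopped for rounds_stopped rounds with one packet
 * pending, then wake it. Returns how many times the restart path ran.
 */
static int model_run(int rounds_stopped, bool patched)
{
	struct model_state s = {
		.queue_stopped = true,
		.scheduled = true,
		.backlog = 1,
		.restart_calls = 0,
	};
	int i;

	assert(rounds_stopped >= 0);
	for (i = 0; i < rounds_stopped; i++) {
		if (s.scheduled) {
			s.scheduled = false;
			model_restart(&s, patched);
		}
	}

	/* Assumption: waking the queue schedules the qdisc once. */
	s.queue_stopped = false;
	s.scheduled = true;
	while (s.scheduled) {
		s.scheduled = false;
		model_restart(&s, patched);
	}
	return s.restart_calls;
}
```

In this model, model_run(n, false) runs the restart path on every round
while the queue is stopped, whereas model_run(n, true) runs it once
before the wakeup and once after, independent of n.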