* [PATCH] sched: Optimize return value of qdisc_restart
@ 2007-05-08 7:31 Krishna Kumar
2007-05-09 2:05 ` David Miller
0 siblings, 1 reply; 22+ messages in thread
From: Krishna Kumar @ 2007-05-08 7:31 UTC (permalink / raw)
To: netdev; +Cc: krkumar2, Krishna Kumar
Optimize the return value of qdisc_restart so that it is not called an
extra time when there are no more packets on the queue to be sent out.
It is also not required to check for gso_skb (though the lock is
dropped), since another CPU that added it would have done a
netif_schedule.
Patch against net-2.6.22.git
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
diff -ruNp org/net/sched/sch_generic.c new/net/sched/sch_generic.c
--- org/net/sched/sch_generic.c 2007-05-07 17:25:25.000000000 +0530
+++ new/net/sched/sch_generic.c 2007-05-07 17:39:04.000000000 +0530
@@ -115,7 +115,7 @@ static inline int qdisc_restart(struct n
kfree_skb(skb);
if (net_ratelimit())
printk(KERN_DEBUG "Dead loop on netdevice %s, fix it urgently!\n", dev->name);
- return -1;
+ return q->q.qlen ? -1 : 0;
}
__get_cpu_var(netdev_rx_stat).cpu_collision++;
goto requeue;
@@ -135,7 +135,7 @@ static inline int qdisc_restart(struct n
netif_tx_unlock(dev);
}
spin_lock(&dev->queue_lock);
- return -1;
+ return q->q.qlen ? -1 : 0;
}
if (ret == NETDEV_TX_LOCKED && nolock) {
spin_lock(&dev->queue_lock);
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-08 7:31 [PATCH] sched: Optimize return value of qdisc_restart Krishna Kumar
@ 2007-05-09 2:05 ` David Miller
2007-05-09 4:35 ` Krishna Kumar2
From: David Miller @ 2007-05-09 2:05 UTC (permalink / raw)
To: krkumar2; +Cc: netdev
From: Krishna Kumar <krkumar2@in.ibm.com>
Date: Tue, 08 May 2007 13:01:32 +0530
> Optimize return value of qdisc_restart so that it is not called an
> extra time if there are no more packets on the queue to be sent out.
> It is also not required to check for gso_skb (though the lock is
> dropped) since another cpu which added this would have done a
> netif_schedule.
>
> Patch against net-2.6.22.git
>
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
<0 return value here means that the queue is not empty, and the device
is throttled. If you want to do this, just branch down to the end of
the function which asserts that q->q.qlen is not negative, and returns
it.
That will achieve the right effect.
But I'm not so sure about this idea, I have this strange feeling that
we do things this way for a reason... Hmmm...
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-09 2:05 ` David Miller
@ 2007-05-09 4:35 ` Krishna Kumar2
2007-05-09 6:36 ` David Miller
From: Krishna Kumar2 @ 2007-05-09 4:35 UTC (permalink / raw)
To: David Miller; +Cc: netdev
Hi Dave,
This change will not alter existing behavior when there are packets in
the queue: it returns -1 each time, as the original code does. But when
there are no packets, the original qdisc_restart returns -1, and the
caller then unnecessarily calls qdisc_restart again, which branches to
the bottom and returns 0 (qlen). That extra call can be avoided if we
return 0 in the previous iteration.
Branching to the bottom cannot be done, since it breaks the "packets
present" case: it would return a positive number, and the caller would
assume that the queue is throttled.
thanks,
- KK
David Miller <davem@davemloft.net> wrote on 05/09/2007 07:35:06 AM:
> From: Krishna Kumar <krkumar2@in.ibm.com>
> Date: Tue, 08 May 2007 13:01:32 +0530
>
> > Optimize return value of qdisc_restart so that it is not called an
> > extra time if there are no more packets on the queue to be sent out.
> > It is also not required to check for gso_skb (though the lock is
> > dropped) since another cpu which added this would have done a
> > netif_schedule.
> >
> > Patch against net-2.6.22.git
> >
> > Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
>
> <0 return value here means that the queue is not empty, and the device
> is throttled. If you want to do this, just branch down to the end of
> the function which asserts that q->q.qlen is not negative, and returns
> it.
>
> That will achieve the right effect.
>
> But I'm not so sure about this idea, I have this strange feeling that
> we do things this way for a reason... Hmmm...
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-09 4:35 ` Krishna Kumar2
@ 2007-05-09 6:36 ` David Miller
2007-05-09 7:23 ` Krishna Kumar2
From: David Miller @ 2007-05-09 6:36 UTC (permalink / raw)
To: krkumar2; +Cc: netdev
From: Krishna Kumar2 <krkumar2@in.ibm.com>
Date: Wed, 9 May 2007 10:05:57 +0530
> This change will not change existing behavior in case there are
> packets in the queue, it will return -1 each time as does the
> original code. But when there are no packets, the original
> qdisc_restart returns -1 and the caller will unnecessarily call
> qdisc_restart which branches to the bottom and returns 0
> (qlen). This call can be avoided if we had returned 0 in the
> previous iteration. Branching to the bottom cannot be done since it
> breaks the "packets present" case, as it will return a positive
> number and the caller will assume that the queue is throttled.
Ok, thanks for the explanation.
But this only makes sense for the second hunk you changed.
The first hunk is a bug case, an improper looping device
decapsulation condition, and we do want to always return
-1 in that case in order to break out of the loop.
Your change will make the kernel essentially hang instead
when this bug check is triggered.
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-09 6:36 ` David Miller
@ 2007-05-09 7:23 ` Krishna Kumar2
2007-05-09 8:12 ` David Miller
From: Krishna Kumar2 @ 2007-05-09 7:23 UTC (permalink / raw)
To: David Miller; +Cc: netdev
David Miller <davem@davemloft.net> wrote on 05/09/2007 12:06:30 PM:
> But this only makes sense for the second hunk you changed.
I think it should work fine for both.
> The first hunk is a bug case, an improper looping device
> decapsulation condition, and we do want to always return
> -1 in that case in order to break out of the loop.
I guess you meant "we do *not* want to always return -1" ?
> Your change will make the kernel essentially hang instead
> when this bug check is triggered.
By returning -1, we end up freeing all the skbs one by one, or until
the "transient error caused by hard_start_xmit recursing" condition is
over. Essentially, that behavior is also not changed by my patch (the
only runtime change is to return 0 if there are no more skbs).
Thanks,
- KK
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-09 7:23 ` Krishna Kumar2
@ 2007-05-09 8:12 ` David Miller
2007-05-09 12:56 ` jamal
From: David Miller @ 2007-05-09 8:12 UTC (permalink / raw)
To: krkumar2; +Cc: netdev
From: Krishna Kumar2 <krkumar2@in.ibm.com>
Date: Wed, 9 May 2007 12:53:05 +0530
> David Miller <davem@davemloft.net> wrote on 05/09/2007 12:06:30 PM:
>
> > Your change will make the kernel essentially hang instead
> > when this bug check is triggered.
>
> By returning -1, we end up freeing all the skbs one by one, or
> until the condition : "transient error caused by hard_start_xmit
> recursing" is over. Essentially, that behavior is also not
> changed in my patch (only run time change is to return 0 if
> there are no more skbs).
Something this evening is obviously making it impossible
for my brain to understand this function and your patch,
so I'm going to sleep on it and try again tomorrow :-)
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-09 8:12 ` David Miller
@ 2007-05-09 12:56 ` jamal
2007-05-09 14:47 ` Krishna Kumar2
From: jamal @ 2007-05-09 12:56 UTC (permalink / raw)
To: David Miller; +Cc: krkumar2, netdev
On Wed, 2007-09-05 at 01:12 -0700, David Miller wrote:
> Something this evening is obviously making it impossible
> for my brain to understand this function and your patch,
> so I'm going to sleep on it and try again tomorrow :-)
It is one of those areas that are hard to size up in a blink ;->
Gut feeling: it doesn't sit right with me either.
With the QDISC_RUNNING changes (2.6.18-rcX and later), only one of N
CPUs will be dequeueing while the other N-1 may be enqueueing
concurrently. All N CPUs contend for the queue lock, and there's a
possible window for a race between the dequeuer CPU releasing the
queue lock and an enqueuer CPU taking it. The dequeuer CPU entering
one last time helps.
Krishna, you probably saw this "wasted entry into qdisc" under low
traffic conditions, with more than likely only one CPU sending - am I
correct? Under heavier traffic, when we have multiple CPUs funneling
to the same device, that entry is not really a "waste", because we end
up going in only once per X packets enqueued on the qdisc, and that
check is absolutely necessary because a different CPU may have
enqueued while you were not looking. In the case of low traffic X=1,
so it is a waste there, albeit a necessary one.
cheers,
jamal
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-09 12:56 ` jamal
@ 2007-05-09 14:47 ` Krishna Kumar2
2007-05-09 15:52 ` jamal
From: Krishna Kumar2 @ 2007-05-09 14:47 UTC (permalink / raw)
To: hadi; +Cc: David Miller, J Hadi Salim, netdev
Hi Jamal,
J Hadi Salim <j.hadi123@gmail.com> wrote on 05/09/2007 06:26:43 PM:
> On Wed, 2007-09-05 at 01:12 -0700, David Miller wrote:
>
> > Something this evening is obviously making it impossible
> > for my brain to understand this function and your patch,
> > so I'm going to sleep on it and try again tomorrow :-)
>
> It is one of those areas that are hard to size-up in a blink;->
> Gut-feeling: It doesnt sit right with me as well.
>
> With (2.6.18-rxX++) QDISC_RUNNING changes that mean only one of N CPUs
> will be dequeueing while the N-1 maybe enqueueing concurently. All N
Concurrently is not possible, since everyone needs the queue_lock
to add/delete. Did I misunderstand your comment?
> CPUs contend for the queue lock; and theres a possible window between
> releasing the queue lock by the dequeuer-CPU and enqueuer-CPU for a
> race. The dequeuer-CPU entering one last time helps.
The dev->queue_lock is held by both the enqueuer and the dequeuer
(though the dequeuer drops it before calling xmit). But once the
dequeuer re-acquires the lock, it is guaranteed that no one else has
it. Other CPUs trying to add will block on the lock; or, if they
already added by getting the lock for a short time while my CPU was
doing the xmit, their qdisc_run returns doing nothing as RUNNING is
true.
Since I am holding the lock in these two changed areas till I return
back to __qdisc_run (which clears the RUNNING bit) and then drop the
lock, there is no way packets can be on the queue while I falsely
return 0, or no packets on the queue while I falsely return -1.
I hope my explanation was not confusing.
> Krishna, you probably saw this "wasted entry into qdisc" under low
> traffic conditions with more than likely only one CPU sending, am i
> correct? Under heavier traffic when we have multiple CPUs funneling to
Actually, I have never seen this problem; I noticed it just from
reading the code while trying to implement something else in this
area, for which I plan to bug Dave in a few days.
Thanks,
- KK
> the same device, that entry is not really a "waste" because we endup
> only go in once per X number of packets enqueued on the qdisc and that
> check is absolutely necessary because a different CPU may have enqueued
> while you were not looking. In the case of low traffic, X=1 - so it is a
> waste there albeit a necessary one.
>
> cheers,
> jamal
>
>
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-09 14:47 ` Krishna Kumar2
@ 2007-05-09 15:52 ` jamal
2007-05-10 5:12 ` Krishna Kumar2
From: jamal @ 2007-05-09 15:52 UTC (permalink / raw)
To: Krishna Kumar2; +Cc: David Miller, netdev, Herbert Xu
Krishna,
On Wed, 2007-09-05 at 20:17 +0530, Krishna Kumar2 wrote:
> Concurrently is not possible, since everyone needs the queue_lock
> to add/delete. Did I misunderstand your comment ?
>
I think so, more below where you explain it:
> The dev->queue_lock is held by both enqueue'r and dequeue'r (though
> the dequeue'r drops it before calling xmit). But once the dequeue'r
> re-gets the lock, it is guaranteed that no one else has the lock
> Other CPU's trying to add will block on the lock, or if they have
> already added by getting the lock for a short time while my CPU was
That is how concurrency is achieved on the queue. If you have N CPUs,
N-1 could be queueing.
Important to note, only one - that owns the QDISC_RUNNING can dequeue.
> doing the xmit, then their qdisc_run returns doing nothing as RUNNING
> is true.
>
Lack of ownership of QDISC_RUNNING is what makes them enqueuers. The
CPU that owns it is the dequeuer.
> Since I am holding a lock in these two changed areas till I return
> back to __qdisc_run (which clears the RUNNING bit) and then drop the
> lock, there is no way packets can be on the queue while I falsely
> return 0, or no packets on the queue while I falsely return -1.
>
If you relinquish being the dequeuer by letting go of RUNNING, then it
is possible that during that short window one of the other N-1 CPUs
was enqueueing; that packet will never be dequeued unless a new packet
shows up some X amount of time later.
> I hope my explanation was not confusing.
>
I hope what I described above helps. Off for about a day. CCing
Herbert, who last made changes to that area, in case I missed
something...
cheers,
jamal
PS: Please don't use my temporary gmail account to respond; a reply-to
will pick the right address (@cyberus.ca).
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-09 15:52 ` jamal
@ 2007-05-10 5:12 ` Krishna Kumar2
2007-05-10 11:50 ` Herbert Xu
From: Krishna Kumar2 @ 2007-05-10 5:12 UTC (permalink / raw)
To: hadi; +Cc: David Miller, Herbert Xu, netdev
Hi Jamal,
J Hadi Salim <j.hadi123@gmail.com> wrote on 05/09/2007 09:22:44 PM:
> > Since I am holding a lock in these two changed areas till I return
> > back to __qdisc_run (which clears the RUNNING bit) and then drop the
> > lock, there is no way packets can be on the queue while I falsely
> > return 0, or no packets on the queue while I falsely return -1.
>
> If you relinquish yourself from being a dequeuer by letting go of
> RUNNING then it is possible during that short window one of the other
> N-1 CPUs could have been enqueueing; that packet will never be dequeued
> unless a new packet shows up some X amount of time later.
But RUNNING is never relinquished till all the way back out to
__qdisc_run. Only the lock is dropped before calling xmit and
re-acquired after xmit is finished (all the while holding RUNNING).
After xmit, both queue_lock and RUNNING are held. If some other CPU
enqueued during this release window (and it would have returned after
dropping the lock, as it is a pure enqueuer), qdisc_restart will find
those skbs. Similarly, if no skbs were added, qdisc_restart will
return 0.
Thanks,
- KK
> I hope what i described above helps. Off for about a day. CCing Herbert
> who last made changes to that area incase i missed something ..
>
> cheers,
> jamal
>
> PS:- Please dont use my temporary gmail account to respond; a reply-to
> will pick the right address (@cyberus.ca).
>
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-10 5:12 ` Krishna Kumar2
@ 2007-05-10 11:50 ` Herbert Xu
2007-05-10 11:55 ` David Miller
From: Herbert Xu @ 2007-05-10 11:50 UTC (permalink / raw)
To: Krishna Kumar2; +Cc: hadi, David Miller, netdev
On Thu, May 10, 2007 at 10:42:59AM +0530, Krishna Kumar2 wrote:
>
> But RUNNING is never relinquished till all the way back out to
> __qdisc_run. Only the lock is dropped before calling xmit and
> re-got after xmit is finished (all the while holding RUNNING).
> After xmit both queue_lock and RUNNING are held. If some other
> cpu enqueue'd during this release window (and they would have
> returned after dropping the lock as they are pure enqueuer's),
> qdisc_restart will find those skbs. Similarly if no skbs were
> added, qdisc_restart will return 0.
Yes I agree with Krishna completely. In fact this whole section
is so 20th-century :) Let's fix up the comments too while we're
at it.
[NET_SCHED]: Rationalise return value of qdisc_restart
The current return value scheme and associated comment was invented
back in the 20th century when we still had that tbusy flag. Things
have changed quite a bit since then (even Tony Blair is moving on
now, not to mention the new French president).
All we need to indicate now is whether the caller should continue
processing the queue. Therefore it's sufficient if we return 0 if
we want to stop and non-zero otherwise.
This is based on a patch by Krishna Kumar.
Signed-off-by: <herbert@gondor.apana.org.au>
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 3385ee5..3480e81 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -71,12 +71,9 @@ void qdisc_unlock_tree(struct net_device *dev)
/* Kick device.
- Note, that this procedure can be called by a watchdog timer, so that
- we do not check dev->tbusy flag here.
- Returns: 0 - queue is empty.
- >0 - queue is not empty, but throttled.
- <0 - queue is not empty. Device is throttled, if dev->tbusy != 0.
+ Returns: 0 - queue is empty or throttled.
+ >0 - queue is not empty.
NOTE: Called under dev->queue_lock with locally disabled BH.
*/
@@ -115,7 +112,7 @@ static inline int qdisc_restart(struct net_device *dev)
kfree_skb(skb);
if (net_ratelimit())
printk(KERN_DEBUG "Dead loop on netdevice %s, fix it urgently!\n", dev->name);
- return -1;
+ goto out;
}
__get_cpu_var(netdev_rx_stat).cpu_collision++;
goto requeue;
@@ -135,7 +132,7 @@ static inline int qdisc_restart(struct net_device *dev)
netif_tx_unlock(dev);
}
spin_lock(&dev->queue_lock);
- return -1;
+ goto out;
}
if (ret == NETDEV_TX_LOCKED && nolock) {
spin_lock(&dev->queue_lock);
@@ -168,8 +165,10 @@ requeue:
else
q->ops->requeue(skb, q);
netif_schedule(dev);
- return 1;
+ return 0;
}
+
+out:
BUG_ON((int) q->q.qlen < 0);
return q->q.qlen;
}
@@ -179,8 +178,10 @@ void __qdisc_run(struct net_device *dev)
if (unlikely(dev->qdisc == &noop_qdisc))
goto out;
- while (qdisc_restart(dev) < 0 && !netif_queue_stopped(dev))
- /* NOTHING */;
+ do {
+ if (!qdisc_restart(dev))
+ break;
+ } while (!netif_queue_stopped(dev));
out:
clear_bit(__LINK_STATE_QDISC_RUNNING, &dev->state);
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-10 11:50 ` Herbert Xu
@ 2007-05-10 11:55 ` David Miller
2007-05-10 12:10 ` Herbert Xu
2007-05-10 12:21 ` jamal
From: David Miller @ 2007-05-10 11:55 UTC (permalink / raw)
To: herbert; +Cc: krkumar2, hadi, netdev
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 10 May 2007 21:50:39 +1000
> On Thu, May 10, 2007 at 10:42:59AM +0530, Krishna Kumar2 wrote:
> >
> > But RUNNING is never relinquished till all the way back out to
> > __qdisc_run. Only the lock is dropped before calling xmit and
> > re-got after xmit is finished (all the while holding RUNNING).
> > After xmit both queue_lock and RUNNING are held. If some other
> > cpu enqueue'd during this release window (and they would have
> > returned after dropping the lock as they are pure enqueuer's),
> > qdisc_restart will find those skbs. Similarly if no skbs were
> > added, qdisc_restart will return 0.
>
> Yes I agree with Krishna completely. In fact this whole section
> is so 20th-century :) Let's fix up the comments too while we're
> at it.
>
> [NET_SCHED]: Rationalise return value of qdisc_restart
>
> The current return value scheme and associated comment was invented
> back in the 20th century when we still had that tbusy flag. Things
> have changed quite a bit since then (even Tony Blair is moving on
> now, not to mention the new French president).
>
> All we need to indicate now is whether the caller should continue
> processing the queue. Therefore it's sufficient if we return 0 if
> we want to stop and non-zero otherwise.
>
> This is based on a patch by Krishna Kumar.
>
> Signed-off-by: <herbert@gondor.apana.org.au>
Fair enough, patch applied :-)
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-10 11:55 ` David Miller
@ 2007-05-10 12:10 ` Herbert Xu
2007-05-10 21:11 ` David Miller
2007-05-10 12:21 ` jamal
From: Herbert Xu @ 2007-05-10 12:10 UTC (permalink / raw)
To: David Miller; +Cc: krkumar2, hadi, netdev
On Thu, May 10, 2007 at 04:55:34AM -0700, David Miller wrote:
>
> > [NET_SCHED]: Rationalise return value of qdisc_restart
>
> Fair enough, patch applied :-)
Sorry, reading Thomas's patch made me realise that I've just added
that bug all over again.
[NET_SCHED]: Reread dev->qdisc for NETDEV_TX_OK
Now that we return the queue length after NETDEV_TX_OK we better
make sure that we have the right queue. Otherwise we can cause a
stall after a really quick dev_deactive/dev_activate.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 3480e81..5f04b06 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -132,6 +132,7 @@ static inline int qdisc_restart(struct net_device *dev)
netif_tx_unlock(dev);
}
spin_lock(&dev->queue_lock);
+ q = dev->qdisc;
goto out;
}
if (ret == NETDEV_TX_LOCKED && nolock) {
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-10 11:55 ` David Miller
2007-05-10 12:10 ` Herbert Xu
@ 2007-05-10 12:21 ` jamal
2007-05-10 12:50 ` jamal
2007-05-10 12:59 ` Herbert Xu
From: jamal @ 2007-05-10 12:21 UTC (permalink / raw)
To: David Miller; +Cc: herbert, krkumar2, netdev
On Thu, 2007-10-05 at 04:55 -0700, David Miller wrote:
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Thu, 10 May 2007 21:50:39 +1000
>
> > [NET_SCHED]: Rationalise return value of qdisc_restart
> >
> > The current return value scheme and associated comment was invented
> > back in the 20th century when we still had that tbusy flag. Things
> > have changed quite a bit since then (even Tony Blair is moving on
> > now, not to mention the new French president).
> >
> > All we need to indicate now is whether the caller should continue
> > processing the queue. Therefore it's sufficient if we return 0 if
> > we want to stop and non-zero otherwise.
> >
> > This is based on a patch by Krishna Kumar.
> >
> > Signed-off-by: <herbert@gondor.apana.org.au>
>
> Fair enough, patch applied :-)
Ok, see if this makes sense:
CPU0 CPU1 (holding qdisc running)
. |
. |
. + grab qlock
. |
. | deq pkt
. + release qlock
. + grab txlock
. | send pkt
. + release txlock
. + grab qlock
has pktX | (NEW: qlen = 0); return 0 instead of -1
waiting for qlock + release qlock
+ grab qlock |
| + find that return code is 0
| enq pkt X + release qdisc running
| |
+ release qlock | ==> outta here
pkt X is stuck unless some event happens such as a new pkt arrival.
In other words it sits there for an indeterminate period.
cheers,
jamal
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-10 12:21 ` jamal
@ 2007-05-10 12:50 ` jamal
2007-05-10 12:59 ` Herbert Xu
From: jamal @ 2007-05-10 12:50 UTC (permalink / raw)
To: David Miller; +Cc: herbert, krkumar2, netdev
Never mind, I was wrong. qdisc_run will be invoked by CPU0, i.e.:
CPU0 CPU1
+ grab qlock |
| + find that return code is 0
| enq pkt X + release qdisc running
| |
+ grab qdisc running | ==> outta here
| call qdisc_run
+ release qlock
Should have looked at my netconf slides first ;->
cheers,
jamal
On Thu, 2007-10-05 at 08:21 -0400, jamal wrote:
> On Thu, 2007-10-05 at 04:55 -0700, David Miller wrote:
> > From: Herbert Xu <herbert@gondor.apana.org.au>
> > Date: Thu, 10 May 2007 21:50:39 +1000
> >
>
> > > [NET_SCHED]: Rationalise return value of qdisc_restart
> > >
> > > The current return value scheme and associated comment was invented
> > > back in the 20th century when we still had that tbusy flag. Things
> > > have changed quite a bit since then (even Tony Blair is moving on
> > > now, not to mention the new French president).
> > >
> > > All we need to indicate now is whether the caller should continue
> > > processing the queue. Therefore it's sufficient if we return 0 if
> > > we want to stop and non-zero otherwise.
> > >
> > > This is based on a patch by Krishna Kumar.
> > >
> > > Signed-off-by: <herbert@gondor.apana.org.au>
> >
> > Fair enough, patch applied :-)
>
> Ok, see if this makes sense:
>
> CPU0 CPU1 (holding qdisc running)
> . |
> . |
> . + grab qlock
> . |
> . | deq pkt
> . + release qlock
> . + grab txlock
> . | send pkt
> . + release txlock
> . + grab qlock
> has pktX | (NEW: qlen = 0); return 0 instead of -1
> waiting for qlock + release qlock
> + grab qlock |
> | + find that return code is 0
> | enq pkt X + release qdisc running
> | |
> + release qlock | ==> outta here
>
> pkt X is stuck unless some event happens such as a new pkt arrival.
> In other words it sits there for an indeterminate period.
>
> cheers,
> jamal
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-10 12:21 ` jamal
2007-05-10 12:50 ` jamal
@ 2007-05-10 12:59 ` Herbert Xu
2007-05-10 13:18 ` jamal
1 sibling, 1 reply; 22+ messages in thread
From: Herbert Xu @ 2007-05-10 12:59 UTC (permalink / raw)
To: hadi; +Cc: davem, herbert, krkumar2, netdev
jamal <hadi@cyberus.ca> wrote:
>
> Ok, see if this makes sense:
>
> CPU0 CPU1 (holding qdisc running)
> . |
> . |
> . + grab qlock
> . |
> . | deq pkt
> . + release qlock
> . + grab txlock
> . | send pkt
> . + release txlock
> . + grab qlock
> has pktX | (NEW: qlen = 0); return 0 instead of -1
> waiting for qlock + release qlock
This release qlock isn't in our source code :)
> + grab qlock |
> | + find that return code is 0
> | enq pkt X + release qdisc running
> | |
> + release qlock | ==> outta here
>
> pkt X is stuck unless some event happens such as a new pkt arrival.
> In other words it sits there for an indeterminate period.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-10 12:59 ` Herbert Xu
@ 2007-05-10 13:18 ` jamal
2007-05-10 13:52 ` Herbert Xu
From: jamal @ 2007-05-10 13:18 UTC (permalink / raw)
To: Herbert Xu; +Cc: davem, krkumar2, netdev
On Thu, 2007-10-05 at 22:59 +1000, Herbert Xu wrote:
> jamal <hadi@cyberus.ca> wrote:
> This release qlock isn't in our source code :)
This is why I deferred this to you ;->
For completeness' sake, this is how it looks:
CPU0 CPU1
| wait qlock |
| |
| + find that return code is 0
| + release qdisc running
| + release qlock
+ grab qlock + ==> outta here
| enq pktx |
| call qdisc_run |
+ grab running
| deq etc
+ release running
+ release qlock
+ ==> outta here
I wonder how much performance improvement this gives now that the
extra incursion into qdisc_restart is avoided.
cheers,
jamal
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-10 13:18 ` jamal
@ 2007-05-10 13:52 ` Herbert Xu
2007-05-10 14:12 ` jamal
From: Herbert Xu @ 2007-05-10 13:52 UTC (permalink / raw)
To: jamal; +Cc: davem, krkumar2, netdev
On Thu, May 10, 2007 at 09:18:04AM -0400, jamal wrote:
>
> I wonder how much performance improvement this give now that the extra
> incursion into qdisc_restart is avoided.
Probably very little since the only way it can make a difference is if
there is lots of contention. But if there is lots of contention we
should stay in the function anyway.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-10 13:52 ` Herbert Xu
@ 2007-05-10 14:12 ` jamal
2007-05-10 14:26 ` Krishna Kumar2
From: jamal @ 2007-05-10 14:12 UTC (permalink / raw)
To: Herbert Xu; +Cc: davem, krkumar2, netdev, Thomas Graf, Roberto Nibali
On Thu, 2007-10-05 at 23:52 +1000, Herbert Xu wrote:
> On Thu, May 10, 2007 at 09:18:04AM -0400, jamal wrote:
> >
> > I wonder how much performance improvement this give now that the extra
> > incursion into qdisc_restart is avoided.
>
> Probably very little since the only way it can make a difference is if
> there is lots of contention. But if there is lots of contention we
> should stay in the function anyway.
Less to do with CPU contention and more with Krishna's change, which
avoids entering that region altogether; that is useful even in the
absence of contention. In the past (based on some measurements I did),
up to 50% of the executions of qdisc_restart were on an empty queue,
depending on traffic rates.
BTW, given that I couldn't recall the layout of the locks in
qdisc_restart, I think I will come back with the old patch I had to
clean up the readability of qdisc_restart, so it is easy to tell just
by quick inspection.
Since Canada seems to be comfortably taking revenge on the Swiss (my
condolences to Thomas and Ratz), I think it is time for me to head
out ;-> ttl.
cheers,
jamal
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-10 14:12 ` jamal
@ 2007-05-10 14:26 ` Krishna Kumar2
2007-05-10 14:31 ` Herbert Xu
From: Krishna Kumar2 @ 2007-05-10 14:26 UTC (permalink / raw)
To: hadi; +Cc: davem, Herbert Xu, netdev, Roberto Nibali, Thomas Graf
I would guess that even with this change, qdisc_restart can get called
with no skbs in the queue, due to a softirq that was scheduled (but by
the time it ran, a CPU that enqueued more skbs could have dequeued
those packets). But it would definitely reduce the calls by one per
iteration of qdisc_run (if one packet, then a 50% reduction; if 9
packets, then a 10% reduction; and so on).
Herbert,
- while (qdisc_restart(dev) < 0 && !netif_queue_stopped(dev))
- /* NOTHING */;
+ do {
+ if (!qdisc_restart(dev))
+ break;
+ } while (!netif_queue_stopped(dev));
This could be changed to the old format, if required:
+ while (qdisc_restart(dev) && !netif_queue_stopped(dev))
+ /* NOTHING */;
thanks,
- KK
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-10 14:26 ` Krishna Kumar2
@ 2007-05-10 14:31 ` Herbert Xu
0 siblings, 0 replies; 22+ messages in thread
From: Herbert Xu @ 2007-05-10 14:31 UTC (permalink / raw)
To: Krishna Kumar2; +Cc: hadi, davem, netdev, Roberto Nibali, Thomas Graf
On Thu, May 10, 2007 at 07:56:02PM +0530, Krishna Kumar2 wrote:
>
> Herbert,
>
> - while (qdisc_restart(dev) < 0 && !netif_queue_stopped(dev))
> - /* NOTHING */;
> + do {
> + if (!qdisc_restart(dev))
> + break;
> + } while (!netif_queue_stopped(dev));
>
> This could be changed to the old format, if required:
>
> + while (qdisc_restart(dev) && !netif_queue_stopped(dev))
> + /* NOTHING */;
Actually I deliberately changed it so that it's slightly clearer.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] sched: Optimize return value of qdisc_restart
2007-05-10 12:10 ` Herbert Xu
@ 2007-05-10 21:11 ` David Miller
0 siblings, 0 replies; 22+ messages in thread
From: David Miller @ 2007-05-10 21:11 UTC (permalink / raw)
To: herbert; +Cc: krkumar2, hadi, netdev
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 10 May 2007 22:10:10 +1000
> On Thu, May 10, 2007 at 04:55:34AM -0700, David Miller wrote:
> >
> > > [NET_SCHED]: Rationalise return value of qdisc_restart
> >
> > Fair enough, patch applied :-)
>
> Sorry, reading Thomas's patch made me realise that I've just added
> that bug all over again.
>
> [NET_SCHED]: Reread dev->qdisc for NETDEV_TX_OK
>
> Now that we return the queue length after NETDEV_TX_OK we better
> make sure that we have the right queue. Otherwise we can cause a
> stall after a really quick dev_deactivate/dev_activate.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
:-) Applied.
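[Editorial note: a simplified pseudocode sketch of what the follow-up guards
against, not the actual diff. Field names follow the qdisc_restart fragments
quoted earlier in the thread; the surrounding control flow is elided.]

```c
/* Pseudocode: once qdisc_restart reports q->q.qlen after NETDEV_TX_OK,
 * it must reread dev->qdisc under dev->queue_lock before doing so,
 * since dev_deactivate()/dev_activate() may have swapped in a fresh
 * qdisc while the lock was dropped for the transmit. Reporting the
 * stale qdisc's length (zero) would stop qdisc_run and stall the
 * new queue. */
spin_lock(&dev->queue_lock);
q = dev->qdisc;		/* reread: may differ from the qdisc dequeued from */
return q->q.qlen;	/* length of the *current* qdisc, not a stale one */
```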
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2007-05-10 21:11 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-05-08 7:31 [PATCH] sched: Optimize return value of qdisc_restart Krishna Kumar
2007-05-09 2:05 ` David Miller
2007-05-09 4:35 ` Krishna Kumar2
2007-05-09 6:36 ` David Miller
2007-05-09 7:23 ` Krishna Kumar2
2007-05-09 8:12 ` David Miller
2007-05-09 12:56 ` jamal
2007-05-09 14:47 ` Krishna Kumar2
2007-05-09 15:52 ` jamal
2007-05-10 5:12 ` Krishna Kumar2
2007-05-10 11:50 ` Herbert Xu
2007-05-10 11:55 ` David Miller
2007-05-10 12:10 ` Herbert Xu
2007-05-10 21:11 ` David Miller
2007-05-10 12:21 ` jamal
2007-05-10 12:50 ` jamal
2007-05-10 12:59 ` Herbert Xu
2007-05-10 13:18 ` jamal
2007-05-10 13:52 ` Herbert Xu
2007-05-10 14:12 ` jamal
2007-05-10 14:26 ` Krishna Kumar2
2007-05-10 14:31 ` Herbert Xu