* bk16 changes to cbq
@ 2004-07-03 13:36 jamal
2004-07-03 17:16 ` David S. Miller
2004-07-05 20:27 ` Alexey Kuznetsov
0 siblings, 2 replies; 9+ messages in thread
From: jamal @ 2004-07-03 13:36 UTC (permalink / raw)
To: Alexey, David S. Miller; +Cc: Stephen Hemminger, netdev
I am noticing this in bk16; it may be one of the changes that came
from Stephen recently..
--- a/net/sched/sch_cbq.c 2004-02-20 18:37:25 -08:00
+++ b/net/sched/sch_cbq.c 2004-06-18 13:51:18 -07:00
@@ -1054,7 +1054,7 @@
if (sch->q.qlen) {
sch->stats.overlimits++;
- if (q->wd_expires && !netif_queue_stopped(sch->dev)) {
+ if (q->wd_expires) {
long delay = PSCHED_US2JIFFIE(q->wd_expires);
if (delay <= 0)
delay = 1;
What i remember is this (4-5 years back) used to cure a bug - cant
remember the details unfortunately, but Alexey may remember.
I am hoping removal of the above line implies that those conditions dont
exist anymore?
At the expense of repeating a discussion that may have already happened,
what was the logic for removing this line?
cheers,
jamal
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: bk16 changes to cbq 2004-07-03 13:36 bk16 changes to cbq jamal @ 2004-07-03 17:16 ` David S. Miller 2004-07-04 0:03 ` jamal 2004-07-04 1:21 ` jamal 2004-07-05 20:27 ` Alexey Kuznetsov 1 sibling, 2 replies; 9+ messages in thread From: David S. Miller @ 2004-07-03 17:16 UTC (permalink / raw) To: hadi; +Cc: kuznet, shemminger, netdev On 03 Jul 2004 09:36:50 -0400 jamal <hadi@cyberus.ca> wrote: > - if (q->wd_expires && !netif_queue_stopped(sch->dev)) { > + if (q->wd_expires) { > long delay = PSCHED_US2JIFFIE(q->wd_expires); > if (delay <= 0) > delay = 1; > > What i remember is this (4-5 years back) used to cure a bug - cant > remember the details unfortunately, but Alexey may remember. > I am hoping removal of the above line implies that those conditions dont > exist anymore? The test is racy with drivers, on the very next line the driver could take a TX completion interrupt and unplug the queue invalidating the test entirely. If the test proves wrong, that's OK because we'll try again at the top level of packet queue dispatch. There was a good explaination of Stephen's patch on netdev when he posted it. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: bk16 changes to cbq 2004-07-03 17:16 ` David S. Miller @ 2004-07-04 0:03 ` jamal 2004-07-04 1:21 ` jamal 1 sibling, 0 replies; 9+ messages in thread From: jamal @ 2004-07-04 0:03 UTC (permalink / raw) To: David S. Miller; +Cc: Alexey, shemminger, netdev On Sat, 2004-07-03 at 13:16, David S. Miller wrote: > On 03 Jul 2004 09:36:50 -0400 > The test is racy with drivers, on the very next line the > driver could take a TX completion interrupt and unplug the > queue invalidating the test entirely. I could be wrong: The only thing the tx complete can do is open up the device for more packets to tx into the device... i.e !netif_queue_stopped(sch->dev) is true in that specific case. Which would mean theres no race. > If the test proves wrong, that's OK because we'll try again > at the top level of packet queue dispatch. > > There was a good explaination of Stephen's patch > on netdev when he posted it. Let me dig into the emails and get back. cheers, jamal ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: bk16 changes to cbq 2004-07-03 17:16 ` David S. Miller 2004-07-04 0:03 ` jamal @ 2004-07-04 1:21 ` jamal 1 sibling, 0 replies; 9+ messages in thread From: jamal @ 2004-07-04 1:21 UTC (permalink / raw) To: David S. Miller; +Cc: Alexey, shemminger, netdev Ok, I looked at Stephens email; i found it here: http://marc.theaimsgroup.com/?l=linux-netdev&m=108751387802077&w=2 Basically similar to what you said. My suggestion is we back out this change - i think there may be a problem elsewhere, somehow we end up pointing here. It is also possible there was a hidden bug, but if this stuff was problematic we would have had complaints in the last few years related. The other thing is you can leave it in there and wait for people to complain. Lets say there was a race condition: The worst that could happen in such a case (on TX completion) is that the cbq watchdog is started (which later reschedules the device) and the top layer reschedules the device as well(because the device was stopped; stopped devices get descheduled). In such a case there will be two netif_schedules() with one being unnecessary. No biggie, just a few extra lines. Lets say there was no race condition: The cbq watchdog gets scheduled only when the netdevice is not stopped. This is optimization is now gone. To reiterate what i said earlier. The only thing the tx complete can do is open up the device for more packets to tx into the device... i.e !netif_queue_stopped(sch->dev) is true in that specific case. Actually while looking at this i noticed some odd things; it may be time to remind people of the return code rules on dev->hard_start_xmit(). I will make another posting. cheers, jamal ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: bk16 changes to cbq 2004-07-03 13:36 bk16 changes to cbq jamal 2004-07-03 17:16 ` David S. Miller @ 2004-07-05 20:27 ` Alexey Kuznetsov 2004-07-06 1:28 ` jamal 2004-07-07 5:13 ` David S. Miller 1 sibling, 2 replies; 9+ messages in thread From: Alexey Kuznetsov @ 2004-07-05 20:27 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, Stephen Hemminger, netdev Hello! > What i remember is this (4-5 years back) used to cure a bug - cant > remember the details unfortunately, but Alexey may remember. Actually, I do not. It looks like an optimization: if the device is throttled we just do not need to add timer, qdisc will be woken up by unthrottle. I do not see any race condition here, but also I do not see why this check could be important. > At the expense of repeating a discussion that may have already happened, To be honest, I wish to be reminded. If this check cured something, it would be interesting to know what was this. Alexey ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: bk16 changes to cbq 2004-07-05 20:27 ` Alexey Kuznetsov @ 2004-07-06 1:28 ` jamal 2004-07-07 5:13 ` David S. Miller 1 sibling, 0 replies; 9+ messages in thread From: jamal @ 2004-07-06 1:28 UTC (permalink / raw) To: Alexey Kuznetsov; +Cc: David S. Miller, Stephen Hemminger, netdev On Mon, 2004-07-05 at 16:27, Alexey Kuznetsov wrote: > Hello! > > > What i remember is this (4-5 years back) used to cure a bug - cant > > remember the details unfortunately, but Alexey may remember. > > Actually, I do not. It looks like an optimization: if the device is > throttled we just do not need to add timer, qdisc will be woken up > by unthrottle. I do not see any race condition here, but also I do not > see why this check could be important. Indeed it looks like an optimization; instead of two events (queue watchdog and probably EOL )or device wdog) trying to reschedule device, only one does. > > At the expense of repeating a discussion that may have already happened, > > To be honest, I wish to be reminded. If this check cured something, > it would be interesting to know what was this. I searched the archives couldnt find anything relevant. cheers, jamal ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: bk16 changes to cbq 2004-07-05 20:27 ` Alexey Kuznetsov 2004-07-06 1:28 ` jamal @ 2004-07-07 5:13 ` David S. Miller 2004-07-07 15:59 ` Alexey Kuznetsov 1 sibling, 1 reply; 9+ messages in thread From: David S. Miller @ 2004-07-07 5:13 UTC (permalink / raw) To: Alexey Kuznetsov; +Cc: hadi, shemminger, netdev On Tue, 6 Jul 2004 00:27:27 +0400 Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> wrote: > > What i remember is this (4-5 years back) used to cure a bug - cant > > remember the details unfortunately, but Alexey may remember. > > Actually, I do not. It looks like an optimization: if the device is > throttled we just do not need to add timer, qdisc will be woken up > by unthrottle. I do not see any race condition here, but also I do not > see why this check could be important. > > > At the expense of repeating a discussion that may have already happened, > > To be honest, I wish to be reminded. If this check cured something, > it would be interesting to know what was this. e1000 would hang the delay scheduler sometimes Let me look for the exact email from Stephen. Here it is: http://mailman.ds9a.nl/pipermail/lartc/2004q2/012736.html Jamal read this, and that is what prompted him to say that if the delay scheduler was made classful instead of "pretending" to be, the bug mentioned in Stephen's email would not occur. I do not understand things fully. I think it would help if, for example, each of the qdisc and classification operations had some comments describing the exact semantics of each ops->method. For example, what does requeue do and what is it indicating with each of the possible return values. What is expected of it? What kind of state is it allowed to leave the queue in? ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: bk16 changes to cbq 2004-07-07 5:13 ` David S. Miller @ 2004-07-07 15:59 ` Alexey Kuznetsov 2004-07-07 18:11 ` jamal 0 siblings, 1 reply; 9+ messages in thread From: Alexey Kuznetsov @ 2004-07-07 15:59 UTC (permalink / raw) To: David S. Miller; +Cc: hadi, shemminger, netdev Hello! > e1000 would hang the delay scheduler sometimes > > Let me look for the exact email from Stephen. Here it is: > > http://mailman.ds9a.nl/pipermail/lartc/2004q2/012736.html Well, that test was really meaningless. So, Stephen's patch is right. The problem is that Stephen's argument just does not look right and forces to suspect some worse bug somewhere. If netif_queue_stopped(), it will be awaken, so no loss of wakeups can happen. If a device sets stopped flag and forgets to wake up after that, it is really badly broken and this should be repaired, it inevitably will result in stalls. > Jamal read this, and that is what prompted him to say that if > the delay scheduler was made classful instead of "pretending" > to be, the bug mentioned in Stephen's email would not occur. It is not related issue. sch_delay allocates private fifo qdisc, which is quite pointless (it is not accessible for tuning anyway) and complicates things a lot. F.e. if it copied tbf structure, it would not have to use ->requeue. And if it is planned to be tunable and replacable with something else, it should be classful. > I do not understand things fully. I think it would help if, for example, > each of the qdisc and classification operations had some comments > describing the exact semantics of each ops->method. For example, > what does requeue do and what is it indicating with each of the > possible return values. What is expected of it? What kind of state > is it allowed to leave the queue in? Ideally, ->requeue() should return queue to the state before ->dequeue(), undoing all the state. When it fails (which is possible only due to some fault sort of failed memory allocation. In the case of sch_delay, which uses requeue of fifo it is absolutely impossible), it drops skb and returns NET_XMIT_DROP or NET_XMIT_CN. Stephen's statement: > Also, if requeuing the packet fails, it is probably because the queue became full > by a racing enqueue. is not true. Everything between dequeue() and requeue() happens under dev->queue_lock, so there is no place for races there. Simple qdiscs never fail. cbq cannot fail too, the place where it returns NET_XMIT_CN is rather for safety when some user drops lock somewhere between dequeue and requeue, which is not allowed actually. > So the right thing to do is to go back and try again In theory he is right. sch_delay just have no choice. If it just returns, when ->requeue() failed by some unknown reason, sch_delay would stall. But, again, sch_delay uses fifo, which never fails in any case. So, it does not matter. :-) Alexey ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: bk16 changes to cbq 2004-07-07 15:59 ` Alexey Kuznetsov @ 2004-07-07 18:11 ` jamal 0 siblings, 0 replies; 9+ messages in thread From: jamal @ 2004-07-07 18:11 UTC (permalink / raw) To: Alexey Kuznetsov; +Cc: David S. Miller, shemminger, netdev On Wed, 2004-07-07 at 11:59, Alexey Kuznetsov wrote: > Hello! > > > e1000 would hang the delay scheduler sometimes > > > > Let me look for the exact email from Stephen. Here it is: > > > > http://mailman.ds9a.nl/pipermail/lartc/2004q2/012736.html > > Well, that test was really meaningless. So, Stephen's patch is right. Not totaly Alexey. Its like you said before, the code there was an optimization. Heres a scenario especialy with drivers returning 0 on everything in their txmit() (discussion going elsewhere with Jeff Garzik): Driver does netif_queue_stop() because of full ring and returns 0. qdisc gets called and it decides to throttle. In that case if it starts watchdog (because it doesnt look at device state). That would be two possible future reschedules of device. One because of qdisc throttle timer and another because of netif_stop. Its not much code difference or cpu cycles, but still an optimization for that case. OTOH, this would not happen if the driver returned a 1 along with skb (the way e1000 does). cheers, jamal ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2004-07-07 18:11 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2004-07-03 13:36 bk16 changes to cbq jamal 2004-07-03 17:16 ` David S. Miller 2004-07-04 0:03 ` jamal 2004-07-04 1:21 ` jamal 2004-07-05 20:27 ` Alexey Kuznetsov 2004-07-06 1:28 ` jamal 2004-07-07 5:13 ` David S. Miller 2004-07-07 15:59 ` Alexey Kuznetsov 2004-07-07 18:11 ` jamal
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).