* Re: ACPI/HT or Packet Scheduler BUG?
  [not found] ` <1113601029.4294.80.camel@localhost.localdomain>
@ 2005-04-15 21:44 ` jamal
  2005-04-15 21:54   ` Steven Rostedt
  0 siblings, 1 reply; 13+ messages in thread
From: jamal @ 2005-04-15 21:44 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: netdev, Tarhon-Onu Victor, kuznet, devik, linux-kernel

Didn't see the beginnings of this thread - please post network-related
questions on netdev instead of lkml.

The real cause seems to be an ARP issue from what I saw in the oops
posted a while back:
--
[4294692.342000] Call Trace:
[4294692.342000]  [<c0104d76>] show_stack+0xa6/0xe0
[4294692.342000]  [<c0104f2b>] show_registers+0x15b/0x1f0
[4294692.342000]  [<c01051a1>] die+0x141/0x2d0
[4294692.342000]  [<c011e13e>] do_page_fault+0x22e/0x6a6
[4294692.342000]  [<c0104817>] error_code+0x4f/0x54
[4294692.342000]  [<c04236da>] qdisc_restart+0xba/0x730
[4294692.342000]  [<c04136fe>] dev_queue_xmit+0x13e/0x640
[4294692.342000]  [<c0454c4c>] arp_solicit+0xfc/0x210
[4294692.342000]  [<c041a6ee>] neigh_timer_handler+0x13e/0x320
[4294692.342000]  [<c0137450>] run_timer_softirq+0x130/0x490
[4294692.342000]  [<c0131ad2>] __do_softirq+0x42/0xa0
[4294692.342000]  [<c01066e1>] do_softirq+0x51/0x60
-----

Is this the same issue?
Can you describe how you create this issue; kernel version etc.

cheers,
jamal

On Fri, 2005-15-04 at 17:37 -0400, Steven Rostedt wrote:
> On Thu, 2005-04-14 at 18:46 +0300, Tarhon-Onu Victor wrote:
> > On Tue, 12 Apr 2005, Tarhon-Onu Victor wrote:
> >
> > > So the problem should be looked for in those changes to the pkt
> > > sched API; the patch containing only those changes is at
> >
> > The bug is in this portion of code from net/sched/sch_generic.c,
> > in the qdisc_destroy() function:
> >
> > ==
> > 	list_for_each_entry(cq, &cql, list)
> > 		list_for_each_entry_safe(q, n, &qdisc->dev->qdisc_list, list)
> > 			if (TC_H_MAJ(q->parent) == TC_H_MAJ(cq->handle)) {
> > 				if (q->ops->cl_ops == NULL)
> > 					list_del_init(&q->list);
> > 				else
> > 					list_move_tail(&q->list, &cql);
> > 			}
> > 	list_for_each_entry_safe(cq, n, &cql, list)
> > 		list_del_init(&cq->list);
> > ==
> >
> > ...and it happens when q->ops->cl_ops is NULL and
> > list_del_init(&q->list) is executed.
> >
> > The stuff from include/linux/list.h looks OK; it seems like one of
> > those two iterations (list_for_each_entry() and
> > list_for_each_entry_safe()) enters an endless loop when an element
> > is removed from the list under some circumstances.
>
> There's a comment above qdisc_destroy that says:
>
> /* Under dev->queue_lock and BH! */
>
> I'm not so sure this is the case. I've included the emails of those
> listed as authors of sch_generic.c and sch_htb.c; hopefully they are
> the ones who can help (if not, sorry to bother you).
>
> The list.h code is fine, but if another task walks the list while the
> list_del_init is done, there's a chance that the reading task gets to
> the deleted item just as it is being deleted and has pointed itself
> at itself: p->next == p. This would go into an infinite loop.
>
> The reason SysRq works is that this doesn't stop interrupts. But put
> a local_irq_save around that list walk and run your test; I bet you
> won't be able to do anything but power off with the big button.
>
> Hope someone can help. I don't know the queue disciplines well enough
> to make a proper fix.
>
> -- Steve

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: ACPI/HT or Packet Scheduler BUG?
  2005-04-15 21:44 ` ACPI/HT or Packet Scheduler BUG? jamal
@ 2005-04-15 21:54 ` Steven Rostedt
  2005-04-15 22:54   ` Thomas Graf
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2005-04-15 21:54 UTC (permalink / raw)
  To: hadi; +Cc: netdev, Tarhon-Onu Victor, kuznet, devik, linux-kernel

I don't think the issue is the same. This problem is happening to
Tarhon-Onu Victor. I'm including his previous two posts from LKML here
so you can read the whole thing in one message.

He states that the problem started in 2.6.10-rc2, and looking at a diff
between rc1 and rc2, the list walk was added in qdisc_destroy. Take a
look at his scripts and you may see that on an SMP machine this may
have a race condition.

-- Steve

From April 8th:

I am not subscribed to this list so please CC me if you post a reply,
if you need additional info or if you suggest a patch (in which case I
would be very interested).

Occurrence:
- the kernel must be compiled with Hyper-Threading support (and the
  CPU/motherboard must support it);
- a (tc) process is reading statistics from the HTB tree while another
  tries to delete, or deletes, the tree.

The bug will not occur if the system is booted with acpi=off or if it's
not SMP (HT) capable.

I reproduced the bug on 2.6.10-1.770_FCsmp (Fedora Core update package)
and vanilla 2.6.11, 2.6.11.5, 2.6.11.6, 2.6.12-rc1 and 2.6.12-rc2
compiled with SMP and ACPI support in order to enable Hyper-Threading
(with and without preempt, with or without SMT support). The compiler I
used is GCC version 3.4.2 (from Fedora Core 3).

Result: almost all the time the system hangs but can still execute
SysRq commands.

How I tested
~~~~~~~~~~~~
On one console I ran a script that initializes an HTB tree on an
interface (dummy0) in an endless loop:

  while /bin/true; do . htbdown.dummy0.sh; done

...and on another console I ran tc -s, also in an endless loop:

  while /bin/true; do tc -s class sh dev dummy0; done

After a while (sometimes after 2 runs of the htbdown.dummy0.sh script,
sometimes after 20) the system hangs. It still responds to SysRq
commands and I was able to see the two tc processes running. Sometimes
I still get ping replies from it, and one time - just one time in 2
days - I was still able to log in remotely. There are no printk
messages or other warnings or errors printed in the system log or
kernel log. It just hangs, and it seems like something is wasting all
the CPU power: when I was still able to log in I noticed that one of
the two virtual CPUs was at 100% with system interrupts and the other
was at 100% system. I wasn't able to strace either of the two running
tc's.

What I was able to paste with the mouse in my console, to catch in a
typescript, and also the script that initializes the HTB tree on dummy0
can be found at ftp://blackblue.iasi.rdsnet.ro/pub/various/k/ .

The test host is a 3.0GHz Intel Prescott and I first noticed the bug on
a system with a 2.8GHz Intel Northwood, both having motherboards with
an Intel chipset (865GBF). I am not able to test it in other SMP
environments (Intel Xeon or Itanium, AMD Opteron, dual P3, etc.), so
I'm not able to tell if it's an ACPI bug, an SMP bug or a Packet
Scheduler bug.

From April 12th:

On Fri, 8 Apr 2005, Tarhon-Onu Victor wrote:

> I am not subscribed to this list so please CC me if you post a reply,
> if you need additional info or if you suggest a patch (in which case
> I would be very interested).
[snip]
> (Intel Xeon or Itanium, AMD Opteron, dual P3, etc), so I'm not able
> to tell if it's an ACPI bug, an SMP bug or a Packet Scheduler bug.

It seems like this bug is a packet scheduler one and it was introduced
in 2.6.10-rc2. In the summary of changes from 2.6.10-rc1 to 2.6.10-rc2
there are a lot of changes announced for the packet scheduler.

I removed all the changes to the packet scheduler files from the
incremental patch from 2.6.10-rc1 to 2.6.10-rc2, applied it to
2.6.10-rc1, and the resulting 2.6.10-rc2-without-sched-changes does not
hang. So the problem should be looked for in those changes to the pkt
sched API; the patch containing only those changes is at
ftp://blackblue.iasi.rdsnet.ro/pub/various/k/patch-2.6.10-sched_changes-from_rc1-to-rc2.gz

On Fri, 2005-04-15 at 17:44 -0400, jamal wrote:
> Didn't see the beginnings of this thread - please post
> network-related questions on netdev instead of lkml.
>
> The real cause seems to be an ARP issue from what I saw in the oops
> posted a while back:
> --
> [4294692.342000] Call Trace:
> [4294692.342000]  [<c0104d76>] show_stack+0xa6/0xe0
> [4294692.342000]  [<c0104f2b>] show_registers+0x15b/0x1f0
> [4294692.342000]  [<c01051a1>] die+0x141/0x2d0
> [4294692.342000]  [<c011e13e>] do_page_fault+0x22e/0x6a6
> [4294692.342000]  [<c0104817>] error_code+0x4f/0x54
> [4294692.342000]  [<c04236da>] qdisc_restart+0xba/0x730
> [4294692.342000]  [<c04136fe>] dev_queue_xmit+0x13e/0x640
> [4294692.342000]  [<c0454c4c>] arp_solicit+0xfc/0x210
> [4294692.342000]  [<c041a6ee>] neigh_timer_handler+0x13e/0x320
> [4294692.342000]  [<c0137450>] run_timer_softirq+0x130/0x490
> [4294692.342000]  [<c0131ad2>] __do_softirq+0x42/0xa0
> [4294692.342000]  [<c01066e1>] do_softirq+0x51/0x60
> -----
>
> Is this the same issue?
> Can you describe how you create this issue; kernel version etc.
>
> cheers,
> jamal
>
> On Fri, 2005-15-04 at 17:37 -0400, Steven Rostedt wrote:
> > On Thu, 2005-04-14 at 18:46 +0300, Tarhon-Onu Victor wrote:
> > > On Tue, 12 Apr 2005, Tarhon-Onu Victor wrote:
> > >
> > > > So the problem should be looked for in those changes to the pkt
> > > > sched API; the patch containing only those changes is at
> > >
> > > The bug is in this portion of code from net/sched/sch_generic.c,
> > > in the qdisc_destroy() function:
> > >
> > > ==
> > > 	list_for_each_entry(cq, &cql, list)
> > > 		list_for_each_entry_safe(q, n, &qdisc->dev->qdisc_list, list)
> > > 			if (TC_H_MAJ(q->parent) == TC_H_MAJ(cq->handle)) {
> > > 				if (q->ops->cl_ops == NULL)
> > > 					list_del_init(&q->list);
> > > 				else
> > > 					list_move_tail(&q->list, &cql);
> > > 			}
> > > 	list_for_each_entry_safe(cq, n, &cql, list)
> > > 		list_del_init(&cq->list);
> > > ==
> > >
> > > ...and it happens when q->ops->cl_ops is NULL and
> > > list_del_init(&q->list) is executed.
> > >
> > > The stuff from include/linux/list.h looks OK; it seems like one
> > > of those two iterations (list_for_each_entry() and
> > > list_for_each_entry_safe()) enters an endless loop when an
> > > element is removed from the list under some circumstances.
> >
> > There's a comment above qdisc_destroy that says:
> >
> > /* Under dev->queue_lock and BH! */
> >
> > I'm not so sure this is the case. I've included the emails of those
> > listed as authors of sch_generic.c and sch_htb.c; hopefully they
> > are the ones who can help (if not, sorry to bother you).
> >
> > The list.h code is fine, but if another task walks the list while
> > the list_del_init is done, there's a chance that the reading task
> > gets to the deleted item just as it is being deleted and has
> > pointed itself at itself: p->next == p. This would go into an
> > infinite loop.
> >
> > The reason SysRq works is that this doesn't stop interrupts. But
> > put a local_irq_save around that list walk and run your test; I
> > bet you won't be able to do anything but power off with the big
> > button.
> >
> > Hope someone can help. I don't know the queue disciplines well
> > enough to make a proper fix.
> >
> > -- Steve

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: ACPI/HT or Packet Scheduler BUG?
  2005-04-15 21:54 ` Steven Rostedt
@ 2005-04-15 22:54 ` Thomas Graf
  2005-04-16  1:49   ` Herbert Xu
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Graf @ 2005-04-15 22:54 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: hadi, netdev, Tarhon-Onu Victor, kuznet, devik, linux-kernel,
	Patrick McHardy

* Steven Rostedt <1113602052.4294.89.camel@localhost.localdomain> 2005-04-15 17:54
> > ==
> > 	list_for_each_entry(cq, &cql, list)
> > 		list_for_each_entry_safe(q, n, &qdisc->dev->qdisc_list, list)
> > 			if (TC_H_MAJ(q->parent) == TC_H_MAJ(cq->handle)) {
> > 				if (q->ops->cl_ops == NULL)
> > 					list_del_init(&q->list);
> > 				else
> > 					list_move_tail(&q->list, &cql);
> > 			}
> > 	list_for_each_entry_safe(cq, n, &cql, list)
> > 		list_del_init(&cq->list);
> > ==
> >
> > ...and it happens when q->ops->cl_ops is NULL and
> > list_del_init(&q->list) is executed.
> >
> > The stuff from include/linux/list.h looks OK; it seems like one
> > of those two iterations (list_for_each_entry() and
> > list_for_each_entry_safe()) enters an endless loop when an element
> > is removed from the list under some circumstances.
>
> There's a comment above qdisc_destroy that says:
>
> /* Under dev->queue_lock and BH! */
>
> I'm not so sure this is the case.

It's not; we should change the comment. qdisc_destroy() calls for
inner leaf qdiscs coming from the RCU-scheduled __qdisc_destroy ->
qdisc::destroy() -> qdisc_destroy() will not hold queue_lock, but that
shouldn't be a problem since the qdiscs cannot be found anymore.

Another case where it's not locked is upon deletion of a class, where
the class deletes its inner qdisc; but there is only one way this can
happen and that's under the rtnl semaphore, so there is no way we can
have a dumper running at the same time.

A wild guess would be that the problem is one of the many places with
wrong locking, where dev->queue_lock is acquired but qdisc_tree_lock
is not. tc_dump_qdisc assumed that locking on qdisc_tree_lock is
enough, which is absolutely right; still, most callers of
qdisc_destroy only take dev->queue_lock for the modification of the
tree, which _was_ fine before we used list.h, since the unlinking was
atomic.

This is nothing new, and Patrick's patch in 2.6.10 ought to have fixed
it by unlinking inner leaf qdiscs from dev->qdisc_list and thus
preventing them from being found by anyone during unlocked calls to
qdisc_destroy. So... we might have an unlocked qdisc_destroy call for
a qdisc not properly unlinked from dev->qdisc_list.

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: ACPI/HT or Packet Scheduler BUG?
  2005-04-15 22:54 ` Thomas Graf
@ 2005-04-16  1:49 ` Herbert Xu
  2005-04-16  5:01   ` Steven Rostedt
  2005-04-16 11:06   ` Thomas Graf
  0 siblings, 2 replies; 13+ messages in thread
From: Herbert Xu @ 2005-04-16  1:49 UTC (permalink / raw)
  To: Thomas Graf
  Cc: Steven Rostedt, hadi, netdev, Tarhon-Onu Victor, kuznet, devik,
	linux-kernel, Patrick McHardy, David S. Miller

[-- Attachment #1: Type: text/plain, Size: 2081 bytes --]

On Fri, Apr 15, 2005 at 10:54:22PM +0000, Thomas Graf wrote:
>
> Another case where it's not locked is upon deletion of a class, where
> the class deletes its inner qdisc; but there is only one way this can
> happen and that's under the rtnl semaphore, so there is no way we can
> have a dumper running at the same time.

Sorry, that's where tc went astray :)

The assumption that the rtnl is held during dumping is false. It is
only true for the initial dump call. All subsequent dumps are not
protected by the rtnl.

The solution is certainly not to place the dumpers under the rtnl :)
The dump operation is read-only and should not need any exclusive
locks.

In fact the whole qdisc locking is a mess. It's a cross-breed between
locking with ad-hoc reference counting and RCU. What's more, the RCU
is completely useless too, because for the case where we actually have
a queue we still end up taking the spin lock on each transmit! I think
someone's been benchmarking the loopback device again :)

It needs to be reengineered.

Here is a quick'n'dirty fix to the problem at hand. What happened
between 2.6.10-rc1 and 2.6.10-rc2 is that qdisc_destroy started
changing the next pointer of qdisc entries, which totally confuses
the readers because qdisc_destroy doesn't always take the tree lock.

This patch tries to ensure that all top-level calls to qdisc_destroy
come under the tree lock. As Thomas correctly pointed out, most of
the other qdisc_destroy calls occur after the top qdisc has been
unlinked from the device qdisc_list. However, someone should go
through each one of the remaining ones (they're all in the individual
sch_* implementations) and make sure that this assumption is really
true.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

If anyone has cycles to spare and a stomach strong enough for this
stuff, here is your chance :)

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

[-- Attachment #2: p --]
[-- Type: text/plain, Size: 966 bytes --]

===== net/sched/sch_api.c 1.47 vs edited =====
--- 1.47/net/sched/sch_api.c	2005-04-01 14:35:56 +10:00
+++ edited/net/sched/sch_api.c	2005-04-16 08:47:16 +10:00
@@ -608,9 +608,9 @@
 			return err;
 		if (q) {
 			qdisc_notify(skb, n, clid, q, NULL);
-			spin_lock_bh(&dev->queue_lock);
+			qdisc_lock_tree(dev);
 			qdisc_destroy(q);
-			spin_unlock_bh(&dev->queue_lock);
+			qdisc_unlock_tree(dev);
 		}
 	} else {
 		qdisc_notify(skb, n, clid, NULL, q);
@@ -743,17 +743,17 @@
 	err = qdisc_graft(dev, p, clid, q, &old_q);
 	if (err) {
 		if (q) {
-			spin_lock_bh(&dev->queue_lock);
+			qdisc_lock_tree(dev);
 			qdisc_destroy(q);
-			spin_unlock_bh(&dev->queue_lock);
+			qdisc_unlock_tree(dev);
 		}
 		return err;
 	}
 	qdisc_notify(skb, n, clid, old_q, q);
 	if (old_q) {
-		spin_lock_bh(&dev->queue_lock);
+		qdisc_lock_tree(dev);
 		qdisc_destroy(old_q);
-		spin_unlock_bh(&dev->queue_lock);
+		qdisc_unlock_tree(dev);
 	}
 }
 return 0;

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: ACPI/HT or Packet Scheduler BUG?
  2005-04-16  1:49 ` Herbert Xu
@ 2005-04-16  5:01 ` Steven Rostedt
  2005-04-16 11:06 ` Thomas Graf
  1 sibling, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2005-04-16  5:01 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Thomas Graf, hadi, netdev, Tarhon-Onu Victor, kuznet, devik,
	linux-kernel, Patrick McHardy, David S. Miller

On Sat, 2005-04-16 at 11:49 +1000, Herbert Xu wrote:
> Here is a quick'n'dirty fix to the problem at hand. What happened
> between 2.6.10-rc1 and 2.6.10-rc2 is that qdisc_destroy started
> changing the next pointer of qdisc entries, which totally confuses
> the readers because qdisc_destroy doesn't always take the tree lock.
>
> This patch tries to ensure that all top-level calls to qdisc_destroy
> come under the tree lock. As Thomas correctly pointed out, most of
> the other qdisc_destroy calls occur after the top qdisc has been
> unlinked from the device qdisc_list. However, someone should go
> through each one of the remaining ones (they're all in the individual
> sch_* implementations) and make sure that this assumption is really
> true.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> If anyone has cycles to spare and a stomach strong enough for
> this stuff, here is your chance :)

FYI, I ran the test case that Tarhon-Onu had, but had to change his tc
execution from batch to single lines, since the version of tc I have
segfaults on newlines. Anyway, I did see the lockup with 2.6.11.2 after
7 iterations. I applied your patch, and it ran for 30 iterations before
I manually killed it. I didn't test any more than that, but this seems
to be the quick fix for now.

-- Steve

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: ACPI/HT or Packet Scheduler BUG?
  2005-04-16  1:49 ` Herbert Xu
  2005-04-16  5:01 ` Steven Rostedt
@ 2005-04-16 11:06 ` Thomas Graf
  2005-04-16 11:12   ` Herbert Xu
  2005-04-16 11:23   ` Herbert Xu
  1 sibling, 2 replies; 13+ messages in thread
From: Thomas Graf @ 2005-04-16 11:06 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Steven Rostedt, hadi, netdev, Tarhon-Onu Victor, kuznet, devik,
	linux-kernel, Patrick McHardy, David S. Miller

* Herbert Xu <20050416014906.GA3291@gondor.apana.org.au> 2005-04-16 11:49
> On Fri, Apr 15, 2005 at 10:54:22PM +0000, Thomas Graf wrote:
> >
> > Another case where it's not locked is upon deletion of a class,
> > where the class deletes its inner qdisc; but there is only one way
> > this can happen and that's under the rtnl semaphore, so there is no
> > way we can have a dumper running at the same time.
>
> Sorry, that's where tc went astray :)
>
> The assumption that the rtnl is held during dumping is false. It is
> only true for the initial dump call. All subsequent dumps are not
> protected by the rtnl.

Ahh... it's the unlocked subsequent dump calls that are _still_
running when the destroy is invoked. That's where Patrick and I went
wrong when we tried to fix this issue: we set our t0 to qdisc_destroy
and didn't really consider any prior unlocked tasks still running.

> In fact the whole qdisc locking is a mess. It's a cross-breed between
> locking with ad-hoc reference counting and RCU. What's more, the RCU
> is completely useless too, because for the case where we actually
> have a queue we still end up taking the spin lock on each transmit!
> I think someone's been benchmarking the loopback device again :)

It's not completely useless; it speeds up the deletion of classful
qdiscs having some depth. However, it's not worth the locking trouble
I guess.

> Here is a quick'n'dirty fix to the problem at hand.

I think it's pretty clean, but it doesn't solve the problem
completely; see below.

> This patch tries to ensure that all top-level calls to qdisc_destroy
> come under the tree lock. As Thomas correctly pointed out, most of
> the other qdisc_destroy calls occur after the top qdisc has been
> unlinked from the device qdisc_list. However, someone should go
> through each one of the remaining ones (they're all in the individual
> sch_* implementations) and make sure that this assumption is really
> true.

qdisc_destroy can still be invoked without qdisc_tree_lock via the
deletion of a class, when it calls qdisc_destroy to destroy its leaf
qdisc.

> If anyone has cycles to spare and a stomach strong enough for
> this stuff, here is your chance :)

I will look into this.

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: ACPI/HT or Packet Scheduler BUG?
  2005-04-16 11:06 ` Thomas Graf
@ 2005-04-16 11:12 ` Herbert Xu
  2005-04-17 17:46   ` Patrick McHardy
  0 siblings, 1 reply; 13+ messages in thread
From: Herbert Xu @ 2005-04-16 11:12 UTC (permalink / raw)
  To: Thomas Graf
  Cc: Steven Rostedt, hadi, netdev, Tarhon-Onu Victor, kuznet, devik,
	linux-kernel, Patrick McHardy, David S. Miller

On Sat, Apr 16, 2005 at 01:06:39PM +0200, Thomas Graf wrote:
>
> qdisc_destroy can still be invoked without qdisc_tree_lock via the
> deletion of a class, when it calls qdisc_destroy to destroy its leaf
> qdisc.

Indeed. Fortunately HTB seems to be safe, as it calls sch_tree_lock,
which is another name for qdisc_tree_lock. CBQ on the other hand needs
a little tweak.

> I will look into this.

Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: ACPI/HT or Packet Scheduler BUG?
  2005-04-16 11:12 ` Herbert Xu
@ 2005-04-17 17:46 ` Patrick McHardy
  2005-04-17 21:37   ` Herbert Xu
  0 siblings, 1 reply; 13+ messages in thread
From: Patrick McHardy @ 2005-04-17 17:46 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Thomas Graf, Steven Rostedt, hadi, netdev, Tarhon-Onu Victor,
	kuznet, devik, linux-kernel, David S. Miller

Herbert Xu wrote:
> On Sat, Apr 16, 2005 at 01:06:39PM +0200, Thomas Graf wrote:
>
> > qdisc_destroy can still be invoked without qdisc_tree_lock via the
> > deletion of a class, when it calls qdisc_destroy to destroy its
> > leaf qdisc.
>
> Indeed. Fortunately HTB seems to be safe, as it calls sch_tree_lock,
> which is another name for qdisc_tree_lock. CBQ on the other hand
> needs a little tweak.

HTB also needs to be fixed. Destruction is usually deferred by the
refcnt until ->put(), and htb_put() doesn't lock the tree. Same for
HFSC and CBQ.

Regards
Patrick

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: ACPI/HT or Packet Scheduler BUG?
  2005-04-17 17:46 ` Patrick McHardy
@ 2005-04-17 21:37 ` Herbert Xu
  0 siblings, 0 replies; 13+ messages in thread
From: Herbert Xu @ 2005-04-17 21:37 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Thomas Graf, Steven Rostedt, hadi, netdev, Tarhon-Onu Victor,
	kuznet, devik, linux-kernel, David S. Miller

On Sun, Apr 17, 2005 at 07:46:16PM +0200, Patrick McHardy wrote:
>
> HTB also needs to be fixed. Destruction is usually deferred by the
> refcnt until ->put(), and htb_put() doesn't lock the tree. Same for
> HFSC and CBQ.

Yes, you're absolutely right.
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: ACPI/HT or Packet Scheduler BUG?
  2005-04-16 11:06 ` Thomas Graf
  2005-04-16 11:12 ` Herbert Xu
@ 2005-04-16 11:23 ` Herbert Xu
  2005-04-16 11:34   ` Thomas Graf
  1 sibling, 1 reply; 13+ messages in thread
From: Herbert Xu @ 2005-04-16 11:23 UTC (permalink / raw)
  To: Thomas Graf
  Cc: Steven Rostedt, hadi, netdev, Tarhon-Onu Victor, kuznet, devik,
	linux-kernel, Patrick McHardy, David S. Miller

On Sat, Apr 16, 2005 at 01:06:39PM +0200, Thomas Graf wrote:
>
> It's not completely useless; it speeds up the deletion of classful
> qdiscs having some depth. However, it's not worth the locking trouble
> I guess.

RCU is meant to optimise the common reader path. In this case that's
the packet transmission code. Unfortunately it fails miserably when
judged by that criterion.
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: ACPI/HT or Packet Scheduler BUG?
  2005-04-16 11:23 ` Herbert Xu
@ 2005-04-16 11:34 ` Thomas Graf
  2005-04-16 16:04   ` jamal
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Graf @ 2005-04-16 11:34 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Steven Rostedt, hadi, netdev, Tarhon-Onu Victor, kuznet, devik,
	linux-kernel, Patrick McHardy, David S. Miller

* Herbert Xu <20050416112329.GA31847@gondor.apana.org.au> 2005-04-16 21:23
> On Sat, Apr 16, 2005 at 01:06:39PM +0200, Thomas Graf wrote:
> >
> > It's not completely useless; it speeds up the deletion of classful
> > qdiscs having some depth. However, it's not worth the locking
> > trouble I guess.
>
> RCU is meant to optimise the common reader path. In this case that's
> the packet transmission code. Unfortunately it fails miserably when
> judged by that criterion.

There is one case where it can help latency, which is per-flow qdiscs
or any other scenario implying hundreds or thousands of leaf qdiscs,
where the destruction of one such qdisc tree will take up quite some
CPU to traverse all the classes under dev->queue_lock. I don't have
any numbers on this, but I don't completely dislike the method of
hiding the qdiscs under the lock and doing the expensive traversal
unlocked.

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: ACPI/HT or Packet Scheduler BUG?
  2005-04-16 11:34 ` Thomas Graf
@ 2005-04-16 16:04 ` jamal
  2005-04-16 18:21   ` Thomas Graf
  0 siblings, 1 reply; 13+ messages in thread
From: jamal @ 2005-04-16 16:04 UTC (permalink / raw)
  To: Thomas Graf
  Cc: Herbert Xu, Steven Rostedt, netdev, Tarhon-Onu Victor, kuznet,
	devik, linux-kernel, Patrick McHardy, David S. Miller

On Sat, 2005-16-04 at 13:34 +0200, Thomas Graf wrote:
> * Herbert Xu <20050416112329.GA31847@gondor.apana.org.au> 2005-04-16 21:23
> > On Sat, Apr 16, 2005 at 01:06:39PM +0200, Thomas Graf wrote:
> > >
> > > It's not completely useless; it speeds up the deletion of classful
> > > qdiscs having some depth. However, it's not worth the locking
> > > trouble I guess.
> >
> > RCU is meant to optimise the common reader path. In this case that's
> > the packet transmission code. Unfortunately it fails miserably when
> > judged by that criterion.
>
> There is one case where it can help latency, which is per-flow qdiscs
> or any other scenario implying hundreds or thousands of leaf qdiscs,
> where the destruction of one such qdisc tree will take up quite some
> CPU to traverse all the classes under dev->queue_lock. I don't have
> any numbers on this, but I don't completely dislike the method of
> hiding the qdiscs under the lock and doing the expensive traversal
> unlocked.

The rule of "optimize for the common case" fails miserably here
because this is not a common case/usage of qdiscs. I have a feeling,
though, that the patch went in due to dude-optimizing-loopback, as
pointed out by Herbert. It could also be that it was done because
RCU-is-so-cool; I don't recall. Maybe it's worth reverting to the
earlier scheme if it is going to continue to be problematic.

cheers,
jamal

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: ACPI/HT or Packet Scheduler BUG?
  2005-04-16 16:04 ` jamal
@ 2005-04-16 18:21 ` Thomas Graf
  0 siblings, 0 replies; 13+ messages in thread
From: Thomas Graf @ 2005-04-16 18:21 UTC (permalink / raw)
  To: jamal
  Cc: Herbert Xu, Steven Rostedt, netdev, Tarhon-Onu Victor, kuznet,
	devik, linux-kernel, Patrick McHardy, David S. Miller

* jamal <1113667447.7419.9.camel@localhost.localdomain> 2005-04-16 12:04
> The rule of "optimize for the common case" fails miserably here
> because this is not a common case/usage of qdiscs.

I tend to agree. OTOH, I use exactly such setups... ;->

> I have a feeling, though, that the patch went in due to
> dude-optimizing-loopback, as pointed out by Herbert.

I checked; it was in fact during the lockless loopback optimizations.

> Maybe it's worth reverting to the earlier scheme if it is going to
> continue to be problematic.

Let me first check and see how the locking can best be done; it
doesn't match the principles in sch_generic.c at the moment anyway.
Once we know how to do the locking efficiently and how to remove the
error-proneness, we can see if the optimization fits in without
problems and make a call.

^ permalink raw reply	[flat|nested] 13+ messages in thread