netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: Suspend to RAM regression (bisected)
       [not found]   ` <20080721150415.GA3804@localhost>
@ 2008-07-21 17:12     ` Linus Torvalds
  2008-07-21 18:54       ` Carlos R. Mafra
  0 siblings, 1 reply; 2+ messages in thread
From: Linus Torvalds @ 2008-07-21 17:12 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: David S. Miller, netdev



On Mon, 21 Jul 2008, Carlos R. Mafra wrote:
> 
> Ok, I tried your new v2.6.26-5253-g14b395e and unfortunately suspend to
> RAM fails completely now. The backlight doesn't turn off and the keyboard
> leds keep blinking.
> 
> After several boots I bisected this new regression down to 
> 37437bb2e1ae8af470dfcd5b4ff454110894ccaf ("pkt_sched: Schedule qdiscs 
> instead of netdev_queue.") by David Miller. 

Ok, I think this is an oops, and since it's bisected down to the same 
commit that some other oopses were bisected down to at boot-time, it's 
probably the same thing: something is calling "netif_wake_queue()" without 
having called "netif_start_queue()".

Or, to be more precise, in the case of suspending, something has probably 
called "dev_deactivate()" because of a link event or something like that, 
which seems to be a total piece-of-sh*t code that sets the qdisc back to 
the "illegal" noop_qdisc (thus causing oopses if some qdisc event 
happens), but does so *before* al the qdisc's have been quiesced (which it 
must do - because otherwise they may keep coming), so the same problem 
that plagued netif_wake_queue() will happen.

I don't really know the code very well (I'm waiting for David to fix up 
the mess), but I can imagine that the appended patch may at least turn the 
dead machine into a single warning and hopefully a working setup. Can you 
please try?

			Linus

---
 net/core/dev.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 2eed17b..43ab4f5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1325,7 +1325,8 @@ static void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
 
 void __netif_schedule(struct Qdisc *q)
 {
-	BUG_ON(q == &noop_qdisc);
+	if (WARN_ON_ONCE(q == &noop_qdisc))
+		return;
 
 	if (!test_and_set_bit(__QDISC_STATE_SCHED, &q->state)) {
 		struct softnet_data *sd;

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: Suspend to RAM regression (bisected)
  2008-07-21 17:12     ` Suspend to RAM regression (bisected) Linus Torvalds
@ 2008-07-21 18:54       ` Carlos R. Mafra
  0 siblings, 0 replies; 2+ messages in thread
From: Carlos R. Mafra @ 2008-07-21 18:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: David S. Miller, netdev

On Mon 21.Jul'08 at 10:12:30 -0700, Linus Torvalds wrote:
> 
> 
> On Mon, 21 Jul 2008, Carlos R. Mafra wrote:
> > 
> > Ok, I tried your new v2.6.26-5253-g14b395e and unfortunately suspend to
> > RAM fails completely now. The backlight doesn't turn off and the keyboard
> > leds keep blinking.
> > 
> > After several boots I bisected this new regression down to 
> > 37437bb2e1ae8af470dfcd5b4ff454110894ccaf ("pkt_sched: Schedule qdiscs 
> > instead of netdev_queue.") by David Miller. 
> 
> Ok, I think this is an oops, and since it's bisected down to the same 
> commit that some other oopses were bisected down to at boot-time, it's 
> probably the same thing: something is calling "netif_wake_queue()" without 
> having called "netif_start_queue()".
> 
> Or, to be more precise, in the case of suspending, something has probably 
> called "dev_deactivate()" because of a link event or something like that, 
> which seems to be a total piece-of-sh*t code that sets the qdisc back to 
> the "illegal" noop_qdisc (thus causing oopses if some qdisc event 
> happens), but does so *before* al the qdisc's have been quiesced (which it 
> must do - because otherwise they may keep coming), so the same problem 
> that plagued netif_wake_queue() will happen.
> 
> I don't really know the code very well (I'm waiting for David to fix up 
> the mess), but I can imagine that the appended patch may at least turn the 
> dead machine into a single warning and hopefully a working setup. Can you 
> please try?

I tested your patch below and now suspend to RAM is working again, it 
resumed just fine into X. So both regressions are gone now.

Thanks a lot,
Carlos


> 			Linus
> 
> ---
>  net/core/dev.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 2eed17b..43ab4f5 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1325,7 +1325,8 @@ static void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
>  
>  void __netif_schedule(struct Qdisc *q)
>  {
> -	BUG_ON(q == &noop_qdisc);
> +	if (WARN_ON_ONCE(q == &noop_qdisc))
> +		return;
>  
>  	if (!test_and_set_bit(__QDISC_STATE_SCHED, &q->state)) {
>  		struct softnet_data *sd;

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2008-07-21 18:53 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20080721020349.GA4328@localhost>
     [not found] ` <alpine.LFD.1.10.0807202119240.31863@woody.linux-foundation.org>
     [not found]   ` <20080721150415.GA3804@localhost>
2008-07-21 17:12     ` Suspend to RAM regression (bisected) Linus Torvalds
2008-07-21 18:54       ` Carlos R. Mafra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).