netdev.vger.kernel.org archive mirror
From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Matt Mackall <mpm@selenic.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	netdev@vger.kernel.org
Subject: Re: Using netconsole for debugging suspend/resume
Date: Thu, 08 Jun 2006 18:54:38 -0700
Message-ID: <4488D4DE.3000100@goop.org>
In-Reply-To: <20060608210702.GD24227@waste.org>

Matt Mackall wrote:
> That's odd. Netpoll holds a reference to the device, of course, but so
> does a normal "up" interface. So that shouldn't be the problem.
> Another possibility is that outgoing packets from printks in the
> driver are causing difficulty. Not sure what can be done about that.

I only tried once; maybe I misunderstood what was going on.  I'll try
again tonight.

Oh, I think I see what's happening.  The e1000 suspend routine does this:

	if (netif_running(netdev))
		e1000_down(adapter);

This leaves the interface up, but it stops the queue.  Then 
netpoll_send_skb() has this loop:

	do {
		npinfo->tries--;
		spin_lock(&np->dev->xmit_lock);
		np->dev->xmit_lock_owner = smp_processor_id();

		/*
		 * network drivers do not expect to be called if the queue is
		 * stopped.
		 */
		if (netif_queue_stopped(np->dev)) {
			np->dev->xmit_lock_owner = -1;
			spin_unlock(&np->dev->xmit_lock);
			netpoll_poll(np);
			udelay(50);
			continue;
		}
/* ... */
again: ;	/* proposed label; the empty statement keeps it valid C before the brace */
	} while (npinfo->tries > 0);


so this will end up in an infinite loop, since netif_queue_stopped() 
will always return true, and it never looks at npinfo->tries.  Should 
the "continue" be "goto again"?

Also, e1000_down does a netif_poll_disable(), but I'm not sure what that 
actually does...  Should it prevent netpoll from even trying to send?

> It's generally going to suck, because unlike a polled serial port, the
> device needs to be put to sleep. But if you're doing suspend to RAM,

I'm interested in suspend-to-RAM.  I presume that with suspend-to-disk,
booting with built-in netconsole will tell me useful stuff; that'll be 
the next experiment.

> you might be able to do something like this:
>
> - unhook net device from suspend machinery (possibly just return success)
> - bounce out of suspend before the final call to ACPI is made
>
> Net effect is you do OS-level suspend and resume of everything but the
> NIC without actually powering down the core. Which should let you
> debug just about everything.

Well, the machine has to really suspend so that I can see (and debug) a 
mostly normal resume.  In particular, I need the hardware to be zapped 
so I can see if it is being restarted properly.

What might work is to change the e1000 suspend routine to save enough
state for resume to work, but keep the interface up so that netconsole
can keep transmitting right up to the point where the final ACPI
call powers off the machine.

Then the e1000 would resume normally, including restarting the xmit 
queue so that netconsole can start again immediately; any netconsole 
output before the e1000 resume would be lost, of course (I guess it 
could be buffered).  That would suit me for now.

    J



