[PATCH 0/2] suspend-to-ram debugging patches

From: Linus Torvalds @ 2006-06-13 21:30 UTC
  To: Power management list

Ok,
 some of the people on this list have already seen the first of these two 
patches, but others haven't, and comments are welcome.

These two patches came about due to me debugging my Mac Mini 
suspend/resume, and not being able to make a lot of headway. 

The patches do two things:

 [patch 1]: Add some basic resume trace facilities

	This adds the capability to trace what the last operation was 
	before the machine hung or rebooted. It does so by saving off a 
	few magic hashes into the machine RTC, so that on next bootup 
	(within three minutes!) you can tell which device, and which 
	source code line number was the last one that was traced.
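
To make the "magic hashes in the RTC" idea concrete, here is a rough user-space sketch, not the code from the patch. It assumes three small hashes (a user value, a file+line hash, and a device-name hash) packed into one number and spread over the date and time fields of a clock. The names `hash_string`, `encode_magic`, `decode_magic`, the moduli, and the field layout are all illustrative assumptions. Storing the minute field in steps of three is one plausible way to survive a clock that keeps ticking before you read it back.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical moduli: the packed value must fit in the clock fields below. */
#define USER_MOD 16u
#define FILE_MOD 997u
#define DEV_MOD  1009u

/* Simple sdbm-style string hash, reduced to the range [0, mod). */
static unsigned int hash_string(unsigned int seed, const char *s, unsigned int mod)
{
    unsigned char c;

    while ((c = (unsigned char)*s++) != 0)
        seed = (seed << 16) + (seed << 6) - seed + c;
    return seed % mod;
}

/* Stand-in for the date/time fields of an RTC (assumed layout). */
struct magic_time {
    unsigned int year;  /* 0..99 */
    unsigned int mon;   /* 0..11 */
    unsigned int mday;  /* 1..28 */
    unsigned int hour;  /* 0..23 */
    unsigned int min;   /* 0..57, always a multiple of 3 when written */
};

/* Pack the three hashes into one number and spread it over the fields. */
static struct magic_time encode_magic(unsigned int user, unsigned int file,
                                      unsigned int dev)
{
    struct magic_time t;
    unsigned int n;

    assert(user < USER_MOD && file < FILE_MOD && dev < DEV_MOD);
    n = user + USER_MOD * (file + FILE_MOD * dev);

    t.year = n % 100;      n /= 100;
    t.mon  = n % 12;       n /= 12;
    t.mday = n % 28 + 1;   n /= 28;
    t.hour = n % 24;       n /= 24;
    t.min  = (n % 20) * 3;
    return t;
}

/* Reverse of encode_magic(); min / 3 absorbs up to two minutes of ticking. */
static unsigned int decode_magic(struct magic_time t)
{
    unsigned int n = t.min / 3;

    n = n * 24 + t.hour;
    n = n * 28 + (t.mday - 1);
    n = n * 12 + t.mon;
    n = n * 100 + t.year;
    return n;
}
```

On the next boot you would decode the number, split it back into the three hashes, and compare the device hash against the hash of every known device name to find the likely culprit. A hash match is only a hint, since different names can collide.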

	NOTE! On its own, the patch does nothing. You also need to add 
	trace-points by hand, i.e. at a minimum add a TRACE_DEVICE(dev)
	in resume_device(), and then TRACE_RESUME() points all along the 
	path you're trying to debug to see which one is the one you hit 
	last.
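
As a rough illustration of what hand-placed trace-points could look like, here is a user-space mock, not the macros from the patch. The `struct device` stand-in, the `record_*` helpers, the `last_*` variables, and the argument to `TRACE_RESUME` are assumptions made for the sketch. In the real thing the recorded values would be hashed and written to the RTC instead of being kept in variables.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal stand-in for the kernel's struct device (assumption). */
struct device {
    const char *name;
};

/* Last recorded trace state; a real version would store hashes in the RTC. */
static const char *last_dev;
static const char *last_file;
static int last_line;
static unsigned int last_user;

static void record_device(const struct device *dev)
{
    assert(dev != NULL);
    last_dev = dev->name;
}

static void record_resume(const char *file, int line, unsigned int user)
{
    last_file = file;
    last_line = line;
    last_user = user;
}

/* Hypothetical trace-point macros: each one overwrites the previous record. */
#define TRACE_DEVICE(dev)  record_device(dev)
#define TRACE_RESUME(user) record_resume(__FILE__, __LINE__, (user))

/* Mock of a per-device resume path with trace-points added by hand. */
static int resume_device(struct device *dev)
{
    TRACE_DEVICE(dev);
    TRACE_RESUME(0);
    /* ... the driver's resume callback would run here ... */
    TRACE_RESUME(1);
    return 0;
}
```

If the machine hangs between the two `TRACE_RESUME` points, the last record still names the device and the line of the first one, which narrows down where to add the next set of trace-points.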

	IOW, it's very nasty to use, but it's better than "my machine 
	never came back, and doesn't tell me anything, what should I do 
	now?"

 [patch 2]: Fix console handling during suspend/resume

	Some people may hate this, but what it does is to suspend the 
	console handling _properly_, so that if there are messages that 
	happen while the machine is suspending or resuming, they can 
	actually be printed out over a netconsole window, even if the 
	network device was part of the devices going down.

	The reason people may hate it is that it actually means that we 
	don't print the messages at all when the machine is going down. We 
	really can't. Even VGA may be behind a bridge or something, and 
	trying to access it is just totally random luck. So suspend and 
	resume actually get a lot quieter - but in the process they 
	actually get more reliable.

	This makes netconsole usable over a suspend/resume, for example, 
	instead of just oopsing or doing really bad things because we're 
	trying to use the network device at the same time that it's going 
	down.

	When the resume is done, the normal printk() buffering will have 
	kept all the messages, so they are then printed when the devices 
	actually work again.
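
The buffering behaviour described above can be mocked up in user space roughly like this. This is a sketch under simplifying assumptions, not the printk code: the log buffer is linear rather than a ring, the "console" is just another array, and the names `log_msg`, `sketch_suspend_console`, and `sketch_resume_console` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOG_BUF_LEN 1024

static char log_buf[LOG_BUF_LEN];  /* every message lands here first */
static size_t log_end;             /* number of bytes stored in log_buf */
static size_t con_start;           /* first byte not yet sent to the console */
static int console_suspended;      /* set while devices are going down */

/* Stand-in for the real console output (assumption for the sketch). */
static char console_out[LOG_BUF_LEN + 1];
static size_t console_len;

/* Push everything logged since the last flush out to the console. */
static void flush_to_console(void)
{
    while (con_start < log_end)
        console_out[console_len++] = log_buf[con_start++];
    assert(console_len <= LOG_BUF_LEN);
    console_out[console_len] = '\0';
}

/* Always buffer the message; only touch the console if it is not suspended. */
static void log_msg(const char *msg)
{
    size_t len = strlen(msg);
    size_t room = LOG_BUF_LEN - log_end;

    if (len > room)
        len = room;  /* sketch: truncate instead of wrapping around */
    memcpy(log_buf + log_end, msg, len);
    log_end += len;

    if (!console_suspended)
        flush_to_console();
}

static void sketch_suspend_console(void)
{
    console_suspended = 1;
}

/* Re-enable the console and print whatever accumulated in the meantime. */
static void sketch_resume_console(void)
{
    console_suspended = 0;
    flush_to_console();
}
```

The point of the design is that logging never touches the output device while it may be powered down. Messages are only delayed, not lost, as long as the buffer does not overflow.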

	I suspect that we might want to have a "debug mode" that basically 
	doesn't stop the console at all, because sometimes the extra 
	messages are very useful, even if they sometimes also just help 
	break the suspend/resume further. That might make some of the 
	people who otherwise hate this happier.

Actual patches in the next two mails as replies to this one.

[ And note: I'm not on the linux-pm list, so please cc me with any useful 
  commentary ]

			Linus
