* [2.6.21.1] resume doesn't run suspended kernel?
@ 2007-05-26 22:42 Bill Davidsen
From: Bill Davidsen @ 2007-05-26 22:42 UTC
To: Linux Kernel M/L

I was testing susp2disk in 2.6.21.1 under FC6, to support reliable
computing environment (RCE) needs. The idea is that if power fails,
after some short time on UPS the system does susp2disk with a time set,
and boots back every so often to see if power is stable.

No, I don't want susp2mem until I debug it: the console comes up in a
useless mode, and console as kaleidoscope is not what I need.

Anyway, I pulled the plug on the UPS, and the system shut down. But when
it powered up, it booted the default kernel rather than the test kernel,
decided that it couldn't resume, and then did a cold boot.

I can bypass this by making the debug kernel the default, but WHY? Is
the kernel not saved such that any kernel can be rolled back into memory
and run? Actually, the answer is HELL NO, so I really ask if this is the
intended mode of operation, that only the default boot kernel will restore.

--
Bill Davidsen <davidsen@tmr.com>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: David Greaves @ 2007-05-27 8:41 UTC
To: Bill Davidsen; +Cc: Linux Kernel M/L

Bill Davidsen wrote:
> Anyway, I pulled the plug on the UPS, and the system shut down. But when
> it powered up, it booted the default kernel rather than the test kernel,
> decided that it couldn't resume, and then did a cold boot.

Booting the machine isn't the kernel's job, it's the bootloader's job.

> I can bypass this by making the debug kernel the default, but WHY? Is
> the kernel not saved such that any kernel can be rolled back into memory
> and run? Actually, the answer is HELL NO, so I really ask if this is the
> intended mode of operation, that only the default boot kernel will restore.

Yes.

It is very dangerous to attempt a resume with a different kernel than the one
that has gone to sleep.
Different kernels may be compiled with different options that affect where or
how in-memory structures are saved.

So you suspend with a kernel which holds your filesystem data/cache/inodes at
0x1234000 and restore with a kernel that expects to see your filesystem data at
0x1235000.

Ouch.

Personally I think the kernel suspend should write a signature - similar to a
hash of the bzImage - into the suspend image so it won't even attempt a resume
if there's a mismatch. (Yes, I made this mistake once whilst playing with suspend).

David

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: Bill Davidsen @ 2007-05-27 13:10 UTC
To: David Greaves; +Cc: Linux Kernel M/L

David Greaves wrote:
> Bill Davidsen wrote:
>> Anyway, I pulled the plug on the UPS, and the system shut down. But when
>> it powered up, it booted the default kernel rather than the test kernel,
>> decided that it couldn't resume, and then did a cold boot.
>
> Booting the machine isn't the kernel's job, it's the bootloader's job.
>
And resume is not the bootloader's job... if memory and registers are
restored, and a jump is made to the resume address, a resumed system
should result. Clearly some part of that didn't happen :-(

>> I can bypass this by making the debug kernel the default, but WHY? Is
>> the kernel not saved such that any kernel can be rolled back into memory
>> and run? Actually, the answer is HELL NO, so I really ask if this is the
>> intended mode of operation, that only the default boot kernel will restore.
>
> Yes.
>
> It is very dangerous to attempt a resume with a different kernel than the one
> that has gone to sleep.
> Different kernels may be compiled with different options that affect where or
> how in-memory structures are saved.
>
If the mainline resume is depending on that, no wonder resume is so fragile.
User action can change order of module loads, kmalloc calls move allocated
structures, etc. Counting on anything to be locked in place seems naive.

> So you suspend with a kernel which holds your filesystem data/cache/inodes at
> 0x1234000 and restore with a kernel that expects to see your filesystem data at
> 0x1235000.
>
> Ouch.
>
I would hope that the data used by the resumed kernel would be the same data
that was suspended, not something from another kernel.

> Personally I think the kernel suspend should write a signature - similar to a
> hash of the bzImage - into the suspend image so it won't even attempt a resume
> if there's a mismatch. (Yes, I made this mistake once whilst playing with suspend).
>
Someone else dropped a note saying the FC kernels use suspend2, and work
fine. I'm off to look at the FC source and see if that's the case. That
would explain why suspend works and resume doesn't; hopefully there's a
2.6.21 suspend2 patch in that case.

Thanks for the feedback in any case.

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: David Greaves @ 2007-05-27 15:26 UTC
To: Bill Davidsen; +Cc: Linux Kernel M/L

Bill Davidsen wrote:
> David Greaves wrote:
>> Bill Davidsen wrote:
>>> Anyway, I pulled the plug on the UPS, and the system shut down. But when
>>> it powered up, it booted the default kernel rather than the test kernel,
>>> decided that it couldn't resume, and then did a cold boot.
>>
>> Booting the machine isn't the kernel's job, it's the bootloader's job.
>>
> And resume is not the bootloader's job... if memory and registers
> are restored, and a jump is made to the resume address, a resumed system
> should result. clearly some part of that didn't happen :-(

Well, what if you wanted to boot a 2nd, dual-boot OS?
The bootloader needs to boot the kernel which may choose to resume.

Is there a misunderstanding here?
I read your OP as saying that you booted kernel B (configured to have suspend
support) and then hit suspend. When the machine rebooted the "default kernel",
i.e. kernel A, not kernel B, was selected by the bootloader. Since the default
kernel didn't have or couldn't resume, it simply booted.

Just what I'd expect.

>> It is very dangerous to attempt a resume with a different kernel than the one
>> that has gone to sleep.
>> Different kernels may be compiled with different options that affect where or
>> how in-memory structures are saved.
>>
> If the mainline resume is depending on that no wonder resume is so
> fragile. User action can change order of module loads, kmalloc calls
> move allocated structures, etc. Counting on anything to be locked in
> place seems naive.

Err, no. It's a lot more sophisticated. However it does ask that you not resume
with a different kernel than you suspended with - not unreasonable!!

>> So you suspend with a kernel which holds your filesystem data/cache/inodes at
>> 0x1234000 and restore with a kernel that expects to see your filesystem data at
>> 0x1235000.
>>
>> Ouch.
>>
> I would hope that the data used by the resumed kernel would be the same
> data that was suspended, not something from another kernel.

Linux based OSes provide enough rope to build a harness or a noose. Choose wisely :)

As you suggest you are about to, it may be best to get a distro-configured system
or do some more background research. Mainline doesn't provide scripts to interact
with bootloaders etc.

Nb I replied because I've just done some work configuring s2d and now have 3
desktop/server machines doing suspend2disk on 2.6.21 quite nicely - thanks all around.

David

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: Pavel Machek @ 2007-05-27 21:20 UTC
To: Bill Davidsen; +Cc: David Greaves, Linux Kernel M/L

Hi!

> >It is very dangerous to attempt a resume with a
> >different kernel than the one
> >that has gone to sleep.
> >Different kernels may be compiled with different
> >options that affect where or
> >how in-memory structures are saved.
> >
> If the mainline resume is depending on that no wonder
> resume is so fragile. User action can change order of
> module loads, kmalloc calls move allocated structures,
> etc. Counting on anything to be locked in place seems
> naive.

Look at code before spreading FUD.

(suspend and suspend2 are same in this matter).

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: Pavel Machek @ 2007-05-27 21:17 UTC
To: David Greaves; +Cc: Bill Davidsen, Linux Kernel M/L

Hi!

> Personally I think the kernel suspend should write a signature - similar to a
> hash of the bzImage - into the suspend image so it won't even attempt a resume
> if there's a mismatch. (Yes, I made this mistake once whilst playing with suspend).

We have such 'hash' but it is not foolproof. Improvements welcome.
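
The signature idea discussed in this sub-thread can be illustrated with a small user-space sketch. It is only an illustration under stated assumptions: `struct image_header`, its fields, and `header_matches()` are hypothetical names invented here, and the check mainline actually performs may differ. The point is simply that resume compares identifying data stored in the image with the same data taken from the kernel that is now booting, and refuses to continue on any mismatch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical image header: NOT the real swsusp on-disk layout. */
struct image_header {
    char release[65];            /* kernel release string of the suspended kernel */
    char version[65];            /* build string of the suspended kernel */
    char machine[65];            /* architecture name */
    unsigned long num_physpages; /* amount of RAM seen at suspend time */
};

/*
 * Return 1 if the image appears to come from the same kernel build and
 * the same memory size as the kernel that is booting now, 0 otherwise.
 */
static int header_matches(const struct image_header *saved,
                          const struct image_header *running)
{
    assert(saved != NULL && running != NULL);

    return strcmp(saved->release, running->release) == 0 &&
           strcmp(saved->version, running->version) == 0 &&
           strcmp(saved->machine, running->machine) == 0 &&
           saved->num_physpages == running->num_physpages;
}
```

A string comparison like this only catches mismatches that show up in the compared fields, which is one way such a check can fail to be foolproof.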

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: Pavel Machek @ 2007-05-27 21:14 UTC
To: Bill Davidsen; +Cc: Linux Kernel M/L

On Sat 2007-05-26 18:42:37, Bill Davidsen wrote:
> I was testing susp2disk in 2.6.21.1 under FC6, to
> support reliable computing environment (RCE) needs. The
> idea is that if power fails, after some short time on
> UPS the system does susp2disk with a time set, and boots
> back every so often to see if power is stable.
>
> No, I don't want susp2mem until I debug it: the console
> comes up in a useless mode, and console as kaleidoscope
> is not what I need.
>
> Anyway, I pulled the plug on the UPS, and the system
> shut down. But when it powered up, it booted the default
> kernel rather than the test kernel, decided that it
> couldn't resume, and then did a cold boot.
>
> I can bypass this by making the debug kernel the
> default, but WHY?

HELL YES :-). We do not save kernel code into image.

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: Bill Davidsen @ 2007-05-28 3:15 UTC
To: Pavel Machek; +Cc: Linux Kernel M/L

Pavel Machek wrote:
> On Sat 2007-05-26 18:42:37, Bill Davidsen wrote:
>
>> I was testing susp2disk in 2.6.21.1 under FC6, to
>> support reliable computing environment (RCE) needs. The
>> idea is that if power fails, after some short time on
>> UPS the system does susp2disk with a time set, and boots
>> back every so often to see if power is stable.
>>
>> No, I don't want susp2mem until I debug it: the console
>> comes up in a useless mode, and console as kaleidoscope
>> is not what I need.
>>
>> Anyway, I pulled the plug on the UPS, and the system
>> shut down. But when it powered up, it booted the default
>> kernel rather than the test kernel, decided that it
>> couldn't resume, and then did a cold boot.
>>
>> I can bypass this by making the debug kernel the
>> default, but WHY?
>>
>
> HELL YES :-). We do not save kernel code into image.
>
That's clear, I'll have to use xen or kvm or similar which restores the
system as suspended. Thanks for the clarification of the limitations.

--
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: Bill Davidsen @ 2007-05-28 13:21 UTC
To: Pavel Machek; +Cc: Bill Davidsen, Linux Kernel M/L

Bill Davidsen wrote:
> Pavel Machek wrote:
>> On Sat 2007-05-26 18:42:37, Bill Davidsen wrote:
>>
>>> I was testing susp2disk in 2.6.21.1 under FC6, to support reliable
>>> computing environment (RCE) needs. The idea is that if power fails,
>>> after some short time on UPS the system does susp2disk with a time
>>> set, and boots back every so often to see if power is stable.
>>>
>>> No, I don't want susp2mem until I debug it: the console comes up in
>>> a useless mode, and console as kaleidoscope is not what I need.
>>>
>>> Anyway, I pulled the plug on the UPS, and the system shut down. But
>>> when it powered up, it booted the default kernel rather than the
>>> test kernel, decided that it couldn't resume, and then did a cold boot.
>>>
>>> I can bypass this by making the debug kernel the default, but WHY?
>>
>> HELL YES :-). We do not save kernel code into image.
>>
> That's clear, I'll have to use xen or kvm or similar which restores
> the system as suspended. Thanks for the clarification of the limitations.
>
Sorry, I wrote that late at night and quickly. I should have said
"design decision" rather than "limitation." For systems which don't do
multiple kernels it's not an issue.

I certainly would not have made the same decision, but I didn't write
the code. It seems more robust to save everything than to try to
identify what has and hasn't changed in a modular kernel.

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: Pavel Machek @ 2007-05-28 13:26 UTC
To: Bill Davidsen; +Cc: Linux Kernel M/L

Hi!

> >That's clear, I'll have to use xen or kvm or similar which restores
> >the system as suspended. Thanks for the clarification of the limitations.
> >
> Sorry, I wrote that late at night and quickly. I should have said
> "design decision" rather than "limitation." For systems which don't do
> multiple kernels it's not an issue.
>
> I certainly would not have made the same decision, but I didn't write
> the code. It seems more robust to save everything than to try to
> identify what has and hasn't changed in a modular kernel.

We rely on atomic copy routine not moving inside the kernel. Yes, it
would be possible to copy it to "known good" address and gain ability
to resume different kernels. Actually it should not be _that_ hard.

Pavel

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: Rafael J. Wysocki @ 2007-05-28 17:57 UTC
To: Pavel Machek; +Cc: Bill Davidsen, Linux Kernel M/L

On Monday, 28 May 2007 15:26, Pavel Machek wrote:
> Hi!
>
> > >That's clear, I'll have to use xen or kvm or similar which restores
> > >the system as suspended. Thanks for the clarification of the limitations.
> > >
> > Sorry, I wrote that late at night and quickly. I should have said
> > "design decision" rather than "limitation." For systems which don't do
> > multiple kernels it's not an issue.
> >
> > I certainly would not have made the same decision, but I didn't write
> > the code. It seems more robust to save everything than to try to
> > identify what has and hasn't changed in a modular kernel.
>
> We rely on atomic copy routine not moving inside the kernel. Yes, it
> would be possible to copy it to "known good" address and gain ability
> to resume different kernels. Actually it should not be _that_ hard.

Yup. Don't we do something like this for the (ACPI-based) suspend to RAM
already?

Greetings,
Rafael

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: Nigel Cunningham @ 2007-05-28 22:48 UTC
To: Rafael J. Wysocki; +Cc: Pavel Machek, Bill Davidsen, Linux Kernel M/L

Hi.

On Mon, 2007-05-28 at 19:57 +0200, Rafael J. Wysocki wrote:
> On Monday, 28 May 2007 15:26, Pavel Machek wrote:
> > Hi!
> >
> > > >That's clear, I'll have to use xen or kvm or similar which restores
> > > >the system as suspended. Thanks for the clarification of the limitations.
> > > >
> > > Sorry, I wrote that late at night and quickly. I should have said
> > > "design decision" rather than "limitation." For systems which don't do
> > > multiple kernels it's not an issue.
> > >
> > > I certainly would not have made the same decision, but I didn't write
> > > the code. It seems more robust to save everything than to try to
> > > identify what has and hasn't changed in a modular kernel.
> >
> > We rely on atomic copy routine not moving inside the kernel. Yes, it
> > would be possible to copy it to "known good" address and gain ability
> > to resume different kernels. Actually it should not be _that_ hard.
>
> Yup. Don't we do something like this for the (ACPI-based) suspend to RAM
> already?

Yeah, I was thinking about this overnight too. It should be doable. In
addition to what we already do, I think you'd want:

- to copy the assembly to do the copying to a safe page;
- to put the location of the cpu state that was saved in the image
  header so that it can be used after the data is copied back;
- to copy the nosave data to a 'safe' page.

What else?

Regards,

Nigel
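
The notion of a "safe" page used in this message can be made concrete with a toy sketch. The assumption here is that a page frame counts as safe when the image being restored will not write into it, so whatever is parked there (the copy routine, the nosave data) survives the restore. `pfn_in_image()` and `find_safe_pfn()` are hypothetical helpers written for this illustration and are not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if the image will restore data into frame `pfn`, 0 otherwise. */
static int pfn_in_image(unsigned long pfn,
                        const unsigned long *image_pfns, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (image_pfns[i] == pfn)
            return 1;
    return 0;
}

/*
 * Return the lowest frame number in [first, last) that the image does
 * not claim, or -1 if every frame in that range will be overwritten.
 */
static long find_safe_pfn(const unsigned long *image_pfns, size_t n,
                          unsigned long first, unsigned long last)
{
    assert(first <= last);

    for (unsigned long pfn = first; pfn < last; pfn++)
        if (!pfn_in_image(pfn, image_pfns, n))
            return (long)pfn;
    return -1;
}
```

The linear scan keeps the sketch short; a real implementation would more plausibly consult a bitmap of the frames used by the image.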

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: Pavel Machek @ 2007-05-29 11:29 UTC
To: Nigel Cunningham; +Cc: Rafael J. Wysocki, Bill Davidsen, Linux Kernel M/L

Hi!

> > Yup. Don't we do something like this for the (ACPI-based) suspend to RAM
> > already?
>
> Yeah, I was thinking about this overnight too. It should be doable. In
> addition to what we already do, I think you'd want:
>
> - to copy the assembly to do the copying to a safe page;
> - to put the location of the cpu state that was saved in the image
> header so that it can be used after the data is copied back;

...alternatively, we can just rely on copy routine (and its data) not
changing frequently.

> - to copy the nosave data to a 'safe' page.
>
> What else?

page directories need to be on a safe place, too.

Pavel

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: Rafael J. Wysocki @ 2007-05-29 12:03 UTC
To: Pavel Machek; +Cc: Nigel Cunningham, Bill Davidsen, Linux Kernel M/L

On Tuesday, 29 May 2007 13:29, Pavel Machek wrote:
> Hi!
>
> > > Yup. Don't we do something like this for the (ACPI-based) suspend to RAM
> > > already?
> >
> > Yeah, I was thinking about this overnight too. It should be doable. In
> > addition to what we already do, I think you'd want:
> >
> > - to copy the assembly to do the copying to a safe page;
> > - to put the location of the cpu state that was saved in the image
> > header so that it can be used after the data is copied back;
>
> ...alternatively, we can just rely on copy routine (and its data) not
> changing frequently.
>
> > - to copy the nosave data to a 'safe' page.
> >
> > What else?
>
> page directories need to be on a safe place, too.

They are already.

Greetings,
Rafael

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: Nigel Cunningham @ 2007-05-29 12:23 UTC
To: Rafael J. Wysocki; +Cc: Pavel Machek, Bill Davidsen, Linux Kernel M/L

Hi.

On Tue, 2007-05-29 at 14:03 +0200, Rafael J. Wysocki wrote:
> On Tuesday, 29 May 2007 13:29, Pavel Machek wrote:
> > Hi!
> >
> > > > Yup. Don't we do something like this for the (ACPI-based) suspend to RAM
> > > > already?
> > >
> > > Yeah, I was thinking about this overnight too. It should be doable. In
> > > addition to what we already do, I think you'd want:
> > >
> > > - to copy the assembly to do the copying to a safe page;
> > > - to put the location of the cpu state that was saved in the image
> > > header so that it can be used after the data is copied back;
> >
> > ...alternatively, we can just rely on copy routine (and its data) not
> > changing frequently.

I'd rather be sure. It will be extra code, but reliability is important.

> > > - to copy the nosave data to a 'safe' page.
> > >
> > > What else?
> >
> > page directories need to be on a safe place, too.
>
> They are already.

Yeah - that's why I ignored them.

Regards,

Nigel

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: Pavel Machek @ 2007-05-29 12:40 UTC
To: Rafael J. Wysocki; +Cc: Nigel Cunningham, Bill Davidsen, Linux Kernel M/L

On Tue 2007-05-29 14:03:07, Rafael J. Wysocki wrote:
> On Tuesday, 29 May 2007 13:29, Pavel Machek wrote:
> > Hi!
> >
> > > > Yup. Don't we do something like this for the (ACPI-based) suspend to RAM
> > > > already?
> > >
> > > Yeah, I was thinking about this overnight too. It should be doable. In
> > > addition to what we already do, I think you'd want:
> > >
> > > - to copy the assembly to do the copying to a safe page;
> > > - to put the location of the cpu state that was saved in the image
> > > header so that it can be used after the data is copied back;
> >
> > ...alternatively, we can just rely on copy routine (and its data) not
> > changing frequently.
> >
> > > - to copy the nosave data to a 'safe' page.
> > >
> > > What else?
> >
> > page directories need to be on a safe place, too.
>
> They are already.

...but will that place still be safe when we use other version of
kernel?

Anyway, pagedirs are on the safe place, right? That means that
swsusp should no longer clash with page allocation debugging...

Pavel

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: Nigel Cunningham @ 2007-05-29 13:13 UTC
To: Pavel Machek; +Cc: Rafael J. Wysocki, Bill Davidsen, Linux Kernel M/L

Hi.

On Tue, 2007-05-29 at 14:40 +0200, Pavel Machek wrote:
> On Tue 2007-05-29 14:03:07, Rafael J. Wysocki wrote:
> > On Tuesday, 29 May 2007 13:29, Pavel Machek wrote:
> > > Hi!
> > >
> > > > > Yup. Don't we do something like this for the (ACPI-based) suspend to RAM
> > > > > already?
> > > >
> > > > Yeah, I was thinking about this overnight too. It should be doable. In
> > > > addition to what we already do, I think you'd want:
> > > >
> > > > - to copy the assembly to do the copying to a safe page;
> > > > - to put the location of the cpu state that was saved in the image
> > > > header so that it can be used after the data is copied back;
> > >
> > > ...alternatively, we can just rely on copy routine (and its data) not
> > > changing frequently.
> > >
> > > > - to copy the nosave data to a 'safe' page.
> > > >
> > > > What else?
> > >
> > > page directories need to be on a safe place, too.
> >
> > They are already.
>
> ...but will that place still be safe when we use other version of
> kernel?

They'll be in the image too, won't they? Failing that, the information
could be stored in the image header.

> Anyway, pagedirs are on the safe place, right? That means that
> swsusp should no longer clash with page allocation debugging...

You mean DEBUG_PAGEALLOC? That can be overcome easily - I have code in
current Suspend2 that works with DEBUG_PAGEALLOC. I handle the page
fault, mapping the page and setting a flag in the fault handler to tell
the atomic copy code to unmap the page again once it has been copied.

Regards,

Nigel

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: Rafael J. Wysocki @ 2007-05-29 21:51 UTC
To: nigel; +Cc: Pavel Machek, Bill Davidsen, Linux Kernel M/L

Hi,

On Tuesday, 29 May 2007 15:13, Nigel Cunningham wrote:
> Hi.
>
> On Tue, 2007-05-29 at 14:40 +0200, Pavel Machek wrote:
> > On Tue 2007-05-29 14:03:07, Rafael J. Wysocki wrote:
> > > On Tuesday, 29 May 2007 13:29, Pavel Machek wrote:
> > > > Hi!
> > > >
> > > > > > Yup. Don't we do something like this for the (ACPI-based) suspend to RAM
> > > > > > already?
> > > > >
> > > > > Yeah, I was thinking about this overnight too. It should be doable. In
> > > > > addition to what we already do, I think you'd want:
> > > > >
> > > > > - to copy the assembly to do the copying to a safe page;
> > > > > - to put the location of the cpu state that was saved in the image
> > > > > header so that it can be used after the data is copied back;
> > > >
> > > > ...alternatively, we can just rely on copy routine (and its data) not
> > > > changing frequently.
> > > >
> > > > > - to copy the nosave data to a 'safe' page.
> > > > >
> > > > > What else?
> > > >
> > > > page directories need to be on a safe place, too.
> > >
> > > They are already.
> >
> > ...but will that place still be safe when we use other version of
> > kernel?

Yes.

> They'll be in the image too, won't they? Failing that, the information
> could be stored in the image header.

In fact, for each page we have the number of the page frame that it should be
restored to. Page frame numbers don't change. :-)

> > Anyway, pagedirs are on the safe place, right? That means that
> > swsusp should no longer clash with page allocation debugging...
>
> You mean DEBUG_PAGEALLOC? That can be overcome easily - I have code in
> current Suspend2 that works with DEBUG_PAGEALLOC. I handle the page
> fault, mapping the page and setting a flag in the fault handler to tell
> the atomic copy code to unmap the page again once it has been copied.

Well, I can't comment, I haven't looked at that yet.

Greetings,
Rafael
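
The remark that every saved page carries the number of the page frame it must return to can be shown with a toy model. This is a user-space sketch only: memory is simulated as a small array, and `struct saved_page`, `restore_pages()`, `TOY_PAGE_SIZE` and `NR_FRAMES` are names made up for the illustration, not kernel code.

```c
#include <assert.h>
#include <string.h>

#define TOY_PAGE_SIZE 16   /* deliberately tiny so the example stays small */
#define NR_FRAMES     8    /* size of the simulated physical memory */

/* One page of the image plus the frame number it was saved from. */
struct saved_page {
    unsigned long pfn;
    unsigned char data[TOY_PAGE_SIZE];
};

/*
 * Copy each saved page back into the frame it came from.
 * Returns 0 on success, -1 if a page names a frame that does not exist.
 */
static int restore_pages(unsigned char mem[NR_FRAMES][TOY_PAGE_SIZE],
                         const struct saved_page *pages, size_t n)
{
    assert(mem != NULL && (pages != NULL || n == 0));

    for (size_t i = 0; i < n; i++) {
        if (pages[i].pfn >= NR_FRAMES)
            return -1;
        memcpy(mem[pages[i].pfn], pages[i].data, TOY_PAGE_SIZE);
    }
    return 0;
}
```

Because the destination is taken from the image and not from any address the booting kernel happens to use, the order in which pages sit in the image does not matter in this model.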

* Re: [2.6.21.1] resume doesn't run suspended kernel?
From: Pavel Machek @ 2007-06-04 11:02 UTC
To: Nigel Cunningham; +Cc: Rafael J. Wysocki, Bill Davidsen, Linux Kernel M/L

Hi!

> > > They are already.
> >
> > ...but will that place still be safe when we use other version of
> > kernel?
>
> They'll be in the image too, won't they? Failing that, the information
> could be stored in the image header.
>
> > Anyway, pagedirs are on the safe place, right? That means that
> > swsusp should no longer clash with page allocation debugging...
>
> You mean DEBUG_PAGEALLOC? That can be overcome easily - I have code in
> current Suspend2 that works with DEBUG_PAGEALLOC. I handle the page
> fault, mapping the page and setting a flag in the fault handler to tell
> the atomic copy code to unmap the page again once it has been copied.

I meant debug_pagealloc, but no, I do not think we want to make page
fault handler more complex. Switching to 1:1 mapping tables should be
enough.

Pavel
* Re: [2.6.21.1] resume doesn't run suspended kernel?
2007-06-04 11:02 ` Pavel Machek
@ 2007-06-04 11:05 ` Nigel Cunningham
0 siblings, 0 replies; 23+ messages in thread
From: Nigel Cunningham @ 2007-06-04 11:05 UTC (permalink / raw)
To: Pavel Machek; +Cc: Rafael J. Wysocki, Bill Davidsen, Linux Kernel M/L

Hi.

On Mon, 2007-06-04 at 13:02 +0200, Pavel Machek wrote:
> Hi!
>
> > > > They are already.
> > >
> > > ...but will that place still be safe when we use other version of
> > > kernel?
> >
> > They'll be in the image too, won't they? Failing that, the information
> > could be stored in the image header.
> >
> > > Anyway, pagedirs are on the safe place, right? That means that we
> > > swsusp should no longer clash with page allocation debugging...
> >
> > You mean DEBUG_PAGEALLOC? That can be overcome easily - I have code in
> > current Suspend2 that works with DEBUG_PAGEALLOC. I handle the page
> > fault, mapping the page and setting a flag in the fault handler to tell
> > the atomic copy code to unmap the page again once it has been copied.
>
> I meant debug_pagealloc, but no, I do not think we want to make page
> fault handler more complex. Switching to 1:1 mapping tables should be
> enough.
> Pavel

@@ -311,6 +315,20 @@ fastcall void __kprobes do_page_fault(struct pt_regs *regs,

 	si_code = SEGV_MAPERR;

+	/* During a Suspend2 atomic copy, with DEBUG_SLAB, we will
+	 * get page faults where slab has been unmapped. Map them
+	 * temporarily and set the variable that tells Suspend2 to
+	 * unmap afterwards.
+	 */
+
+	if (unlikely(suspend2_running && !suspend2_faulted)) {
+		struct page *page = NULL;
+		suspend2_faulted = 1;
+		page = virt_to_page(address);
+		kernel_map_pages(page, 1, 1);
+		return;
+	}
+
 	/*
 	 * We fault-in kernel-space virtual memory on-demand. The
 	 * 'reference' page table is init_mm.pgd.

Regards,

Nigel
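The mechanism Nigel describes (the fault handler maps the page and sets a flag, and the atomic copy code unmaps the page again once it has been copied) can be modelled in user space. What follows is only a toy sketch under stated assumptions, not Suspend2 code: toy_page, toy_fault, toy_read and toy_atomic_copy are hypothetical names, a boolean stands in for the page-table mapping, a plain function call stands in for the hardware fault, and the flag is kept as a pointer to the faulting page rather than the integer used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 16

/* Hypothetical stand-in for a page: "mapped" models the page-table state
 * that page allocation debugging changes when a page is not in use. */
struct toy_page {
	bool mapped;
	char data[TOY_PAGE_SIZE];
};

static bool toy_copy_running;         /* plays the role of suspend2_running */
static struct toy_page *toy_faulted;  /* plays the role of suspend2_faulted */

/* Models the fault path: during the copy, map the page and remember it. */
static void toy_fault(struct toy_page *page)
{
	if (toy_copy_running && toy_faulted == NULL) {
		page->mapped = true;
		toy_faulted = page;
	}
}

/* Models a memory access: touching an unmapped page "faults" first. */
static void toy_read(struct toy_page *page, char *dst)
{
	if (!page->mapped)
		toy_fault(page);
	assert(page->mapped); /* an unhandled fault would be fatal */
	memcpy(dst, page->data, TOY_PAGE_SIZE);
}

/* Models the atomic copy: copy each page, then undo any temporary mapping
 * so the page is left in the state it had before the copy. */
void toy_atomic_copy(struct toy_page *src, char dst[][TOY_PAGE_SIZE], size_t n)
{
	toy_copy_running = true;
	for (size_t i = 0; i < n; i++) {
		toy_read(&src[i], dst[i]);
		if (toy_faulted != NULL) {
			toy_faulted->mapped = false;
			toy_faulted = NULL;
		}
	}
	toy_copy_running = false;
}
```

The cost Pavel objects to is visible even in the toy: the access path has to know about the copy in progress. His alternative of switching to 1:1 mapping tables would, as far as the thread describes it, avoid the fault altogether instead of handling it.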
* Re: [2.6.21.1] resume doesn't run suspended kernel?
2007-05-26 22:42 [2.6.21.1] resume doesn't run suspended kernel? Bill Davidsen
2007-05-27 8:41 ` David Greaves
2007-05-27 21:14 ` Pavel Machek
@ 2007-06-05 7:23 ` Stefan Seyfried
2007-06-05 14:08 ` Bill Davidsen
2 siblings, 1 reply; 23+ messages in thread
From: Stefan Seyfried @ 2007-06-05 7:23 UTC (permalink / raw)
To: Bill Davidsen; +Cc: Linux Kernel M/L

Hi,

On Sat, May 26, 2007 at 06:42:37PM -0400, Bill Davidsen wrote:
> I was testing susp2disk in 2.6.21.1 under FC6, to support reliable computing
> environment (RCE) needs. The idea is that if power fails, after some short
> time on UPS the system does susp2disk with a time set, and boots back every
> so often to see if power is stable.

Interesting use case.

> No, I don't want susp2mem until I debug it, console come up in useless mode,
> console as kalidescope is not what I need.

You probably need to reset the video mode. Try the s2ram workaround,
specifically "-m".

> Anyway, I pulled the plug on the UPS, and the system shut down. But when it
> powered up, it booted the default kernel rather than the test kernel, decided
> that it couldn't resume, and then did a cold boot.
>
> I can bypass this by making the debug kernel the default, but WHY? Is the
> kernel not saved such that any kernel can be rolled back into memory and run?

The Kernel does nothing to the bootloader during suspend. The kernel does not
even know that you are using a bootloader and how it might be configured.

Userland has to do this (and SUSE's pm-utils actually do. I thought the
Fedora pm-utils also did, but i cannot say for sure). "Just" find out which
entry in menu.lst corresponds to the currently running kernel, and preselect
it for the next boot. It is doable.

So it's a problem of your distro's userland (and if you did not use
pm-hibernate to suspend, it is your very own problem).

You could of course simply go for GRUB's "default saved" and "savedefault"
feature, to always boot the last-booted kernel unless changed in the menu.
--
Stefan Seyfried
QA / R&D Team Mobile Devices | "Any ideas, John?"
SUSE LINUX Products GmbH, Nürnberg | "Well, surrounding them's out."

This footer brought to you by insane German lawmakers:
SUSE Linux Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
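The "find out which entry in menu.lst corresponds to the currently running kernel" step can be sketched in C. This is a minimal illustration, not what pm-utils actually does: find_menu_entry and after_keyword are hypothetical names, and the sketch assumes a GRUB legacy menu.lst in which entries are numbered from 0 in the order of their "title" lines and the first argument of each "kernel" line is an image path that ends in the kernel release (for example /vmlinuz-2.6.21.1).

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* If `line` starts (after leading blanks) with keyword `kw` followed by
 * whitespace, return a pointer to the first argument; otherwise NULL. */
static const char *after_keyword(const char *line, const char *kw)
{
	size_t n = strlen(kw);

	while (*line == ' ' || *line == '\t')
		line++;
	if (strncmp(line, kw, n) != 0 || !isspace((unsigned char)line[n]))
		return NULL;
	line += n;
	while (*line == ' ' || *line == '\t')
		line++;
	return line;
}

/* Return the zero-based index of the first menu entry whose kernel image
 * path ends in `release`, or -1 if no entry matches. */
int find_menu_entry(FILE *menu, const char *release)
{
	char line[1024];
	size_t rlen = strlen(release);
	int entry = -1;

	if (rlen == 0)
		return -1;

	while (fgets(line, sizeof line, menu) != NULL) {
		const char *arg;

		if (after_keyword(line, "title") != NULL) {
			entry++;        /* each title line starts a new entry */
			continue;
		}
		arg = after_keyword(line, "kernel");
		if (arg != NULL && entry >= 0) {
			/* Length of the image path (first argument only). */
			size_t plen = strcspn(arg, " \t\r\n");

			/* Suffix match, so 2.6.21.1 does not match 2.6.21.10. */
			if (plen >= rlen &&
			    strncmp(arg + plen - rlen, release, rlen) == 0)
				return entry;
		}
	}
	return -1;
}
```

A caller would pass the release field filled in by uname(2) as the second argument. Telling the bootloader to use the returned index for the next boot only is distribution-specific and is deliberately left out of the sketch.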
* Re: [2.6.21.1] resume doesn't run suspended kernel?
2007-06-05 7:23 ` Stefan Seyfried
@ 2007-06-05 14:08 ` Bill Davidsen
0 siblings, 0 replies; 23+ messages in thread
From: Bill Davidsen @ 2007-06-05 14:08 UTC (permalink / raw)
To: Stefan Seyfried; +Cc: Linux Kernel M/L

Stefan Seyfried wrote:
> Hi,
>
> On Sat, May 26, 2007 at 06:42:37PM -0400, Bill Davidsen wrote:
>
>> I was testing susp2disk in 2.6.21.1 under FC6, to support reliable computing
>> environment (RCE) needs. The idea is that if power fails, after some short
>> time on UPS the system does susp2disk with a time set, and boots back every
>> so often to see if power is stable.
>
> Interesting use case.
>
>> No, I don't want susp2mem until I debug it, console come up in useless mode,
>> console as kalidescope is not what I need.
>
> You probably need to reset the video mode. Try the s2ram workaround,
> specifically "-m".
>
>> Anyway, I pulled the plug on the UPS, and the system shut down. But when it
>> powered up, it booted the default kernel rather than the test kernel, decided
>> that it couldn't resume, and then did a cold boot.
>>
>> I can bypass this by making the debug kernel the default, but WHY? Is the
>> kernel not saved such that any kernel can be rolled back into memory and run?
>
> The Kernel does nothing to the bootloader during suspend. The kernel does not
> even know that you are using a bootloader and how it might be configured.

What I really expected is that what I was running would be saved, and
resume would restore what I was running and then jump back to where that
suspended itself. Without having to address the issue of booting the
"right" kernel, but having any functional kernel which was booted then
restore what was originally suspended. From discussion here, I conclude
that "it could work that way but doesn't."

> Userland has to do this (and SUSE's pm-utils actually do. I thought the
> Fedora pm-utils also did, but i cannot say for sure). "Just" find out which
> entry in menu.lst corresponds to the currently running kernel, and preselect
> it for the next boot. It is doable.
>
> So it's a problem of your distro's userland (and if you did not use
> pm-hibernate to suspend, it is your very own problem).
>
> You could of course simply go for GRUB's "default saved" and "savedefault"
> feature, to always boot the last-booted kernel unless changed in the menu.

I'm being very careful to avoid changing the default boot kernel. If the
system suspends (i.e. deliberately) I want to resume in the running
kernel, but if it crashes I want the cold boot to bring up a known stable
kernel, even though that may be lacking in features, have an old
scheduler, etc.

--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
[parent not found: <fa.rXcMBo+RSE/6L84EBqFCeyFql/k@ifi.uio.no>]
* Re: [2.6.21.1] resume doesn't run suspended kernel?
[not found] <fa.rXcMBo+RSE/6L84EBqFCeyFql/k@ifi.uio.no>
@ 2007-05-27 2:44 ` Robert Hancock
0 siblings, 0 replies; 23+ messages in thread
From: Robert Hancock @ 2007-05-27 2:44 UTC (permalink / raw)
To: Bill Davidsen; +Cc: Linux Kernel M/L

Bill Davidsen wrote:
> I was testing susp2disk in 2.6.21.1 under FC6, to support reliable
> computing environment (RCE) needs. The idea is that if power fails,
> after some short time on UPS the system does susp2disk with a time set,
> and boots back every so often to see if power is stable.
>
> No, I don't want susp2mem until I debug it, console come up in useless
> mode, console as kalidescope is not what I need.
>
> Anyway, I pulled the plug on the UPS, and the system shut down. But when
> it powered up, it booted the default kernel rather than the test kernel,
> decided that it couldn't resume, and then did a cold boot.
>
> I can bypass this by making the debug kernel the default, but WHY? Is
> the kernel not saved such that any kernel can be rolled back into memory
> and run? Actually, the answer is HELL NO, so I really ask if this is the
> intended mode of operation, that only the default boot kernel will restore.

Fedora scripts for hibernation are supposed to tell GRUB to set the
default kernel on the next boot to be the current one before suspending
to disk, so that it comes up with the same version it was running and the
resume can succeed. If the way you're triggering the suspend bypasses
this mechanism, you'll see this problem.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@nospamshaw.ca
Home Page: http://www.roberthancock.com/
end of thread, other threads:[~2007-06-05 14:07 UTC | newest]
Thread overview: 23+ messages
2007-05-26 22:42 [2.6.21.1] resume doesn't run suspended kernel? Bill Davidsen
2007-05-27 8:41 ` David Greaves
2007-05-27 13:10 ` Bill Davidsen
2007-05-27 15:26 ` David Greaves
2007-05-27 21:20 ` Pavel Machek
2007-05-27 21:17 ` Pavel Machek
2007-05-27 21:14 ` Pavel Machek
2007-05-28 3:15 ` Bill Davidsen
2007-05-28 13:21 ` Bill Davidsen
2007-05-28 13:26 ` Pavel Machek
2007-05-28 17:57 ` Rafael J. Wysocki
2007-05-28 22:48 ` Nigel Cunningham
2007-05-29 11:29 ` Pavel Machek
2007-05-29 12:03 ` Rafael J. Wysocki
2007-05-29 12:23 ` Nigel Cunningham
2007-05-29 12:40 ` Pavel Machek
2007-05-29 13:13 ` Nigel Cunningham
2007-05-29 21:51 ` Rafael J. Wysocki
2007-06-04 11:02 ` Pavel Machek
2007-06-04 11:05 ` Nigel Cunningham
2007-06-05 7:23 ` Stefan Seyfried
2007-06-05 14:08 ` Bill Davidsen
[not found] <fa.rXcMBo+RSE/6L84EBqFCeyFql/k@ifi.uio.no>
2007-05-27 2:44 ` Robert Hancock