* Suspend-to-disk woes
@ 2005-03-18 16:28 Erik Andrén
2005-03-18 17:54 ` Stefan Seyfried
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Erik Andrén @ 2005-03-18 16:28 UTC (permalink / raw)
To: linux-kernel
Hello, I experienced a pretty nasty problem a couple of days back:
I ran 2.6.11-ck1 and built 2.6.11-ck2. The last thing I did before
booting the new kernel was to suspend-to-disk the old kernel (something
I usually do as I'm working on this laptop).
I ran the new kernel a couple of days and decided to boot the old kernel
to do some performance tests. Imagine my dread as the old kernel instead
of detecting that the system has booted another kernel just reloads the
old suspend-to-disk image. The result is that after successfully
resuming, my harddrive goes bonkers and starts to work. After a couple
of minutes the whole kernel hangs. I reboot and try to boot the -ck2
kernel again only to find that the system complains as it finds missing
nodes. The reisertools try to rebuild the system unsuccessfully. The
--rebuild-tree parameter worked but a lot of files were still missing.
In the end I had to reinstall the whole system as it went so unstable.
My question is: Why isn't there a check before resuming a
suspend-to-disk image if the system has booted another kernel since the
suspend to prevent this kind of hassle?
//Regards Erik Andrén
Please cc me as I'm not on the lkml list yadda yadda
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Suspend-to-disk woes
2005-03-18 16:28 Suspend-to-disk woes Erik Andrén
@ 2005-03-18 17:54 ` Stefan Seyfried
2005-03-18 22:04 ` Nigel Cunningham
2005-03-19 13:26 ` Pavel Machek
2 siblings, 0 replies; 14+ messages in thread
From: Stefan Seyfried @ 2005-03-18 17:54 UTC (permalink / raw)
To: Erik Andrén; +Cc: kernel list

Erik Andrén wrote:

> My question is: Why isn't there a check before resuming a
> suspend-to-disk image if the system has booted another kernel since the
> suspend to prevent this kind of hassle?

Just provide a patch which does this. Hint: this is highly nontrivial.

If you boot a kernel, that does not know swsusp (and if it knew, it would
have invalidated the suspend image in the swap), or which does not have
the necessary information (because of a missing resume= parameter), this
kernel cannot do much.

Stefan
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Suspend-to-disk woes
2005-03-18 16:28 Suspend-to-disk woes Erik Andrén
2005-03-18 17:54 ` Stefan Seyfried
@ 2005-03-18 22:04 ` Nigel Cunningham
2005-03-19 13:26 ` Pavel Machek
2 siblings, 0 replies; 14+ messages in thread
From: Nigel Cunningham @ 2005-03-18 22:04 UTC (permalink / raw)
To: Erik Andrén; +Cc: Linux Kernel Mailing List

Hi.

The simplest solution is to mkswap your swap partitions during boot.

Nigel

On Sat, 2005-03-19 at 03:28, Erik Andrén wrote:
> Hello, I experienced a pretty nasty problem a couple of days back:
>
> I ran 2.6.11-ck1 and built 2.6.11-ck2. The last thing I did before
> booting the new kernel was to suspend-to-disk the old kernel (something
> I usually do as I'm working on this laptop).
> I ran the new kernel a couple of days and decided to boot the old kernel
> to do some performance tests. Imagine my dread as the old kernel instead
> of detecting that the system has booted another kernel just reloads the
> old suspend-to-disk image. The result is that after successfully
> resuming, my harddrive goes bonkers and starts to work. After a couple
> of minutes the whole kernel hangs. I reboot and try to boot the -ck2
> kernel again only to find that the system complains as it finds missing
> nodes. The reisertools try to rebuild the system unsuccessfully. The
> --rebuild-tree parameter worked but a lot of files were still missing.
> In the end I had to reinstall the whole system as it went so unstable.
>
> My question is: Why isn't there a check before resuming a
> suspend-to-disk image if the system has booted another kernel since the
> suspend to prevent this kind of hassle?
>
> //Regards Erik Andrén
>
> Please cc me as I'm not on the lkml list yadda yadda
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028; Mob: +61 (417) 100 574

Maintainer of Suspend2 Kernel Patches http://suspend2.net
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Suspend-to-disk woes
2005-03-18 16:28 Suspend-to-disk woes Erik Andrén
2005-03-18 17:54 ` Stefan Seyfried
2005-03-18 22:04 ` Nigel Cunningham
@ 2005-03-19 13:26 ` Pavel Machek
2005-03-19 20:20 ` Russell Miller
2 siblings, 1 reply; 14+ messages in thread
From: Pavel Machek @ 2005-03-19 13:26 UTC (permalink / raw)
To: erik.andren; +Cc: linux-kernel

Hi!

> Hello, I experienced a pretty nasty problem a couple of days back:
>
> I ran 2.6.11-ck1 and built 2.6.11-ck2. The last thing I did before
> booting the new kernel was to suspend-to-disk the old kernel
> (something I usually do as I'm working on this laptop).
> I ran the new kernel a couple of days and decided to boot the old
> kernel to do some performance tests. Imagine my dread as the old
> kernel instead of detecting that the system has booted another kernel
> just reloads the old suspend-to-disk image. The result is that after
> successfully resuming, my harddrive goes bonkers and starts to work.
> After a couple of minutes the whole kernel hangs. I reboot and try to
> boot the -ck2 kernel again only to find that the system complains as
> it finds missing nodes. The reisertools try to rebuild the system
> unsuccessfully. The --rebuild-tree parameter worked but a lot of files
> were still missing. In the end I had to reinstall the whole system as
> it went so unstable.
>
> My question is: Why isn't there a check before resuming a
> suspend-to-disk image if the system has booted another kernel since
> the suspend to prevent this kind of hassle?

Checking that would be hard, but you might want to provide patch to
check last-mounted dates of filesystems and panic if they changed.

Pavel
--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms
^ permalink raw reply [flat|nested] 14+ messages in thread
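The last-mounted check Pavel describes can be sketched roughly as below. This is only an illustration in userspace C, not swsusp code: the function name first_changed_fs and the idea of keeping one timestamp per filesystem in two parallel arrays are assumptions made for the example, and reading the timestamps from each filesystem's on-disk metadata is filesystem-specific and left out.

```c
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: compare the last-mounted time recorded for each
 * filesystem when the image was written with the time found on disk at
 * resume. Returns the index of the first filesystem whose timestamp
 * differs, or -1 if none changed (resuming would then look safe by this
 * criterion alone).
 */
int first_changed_fs(const time_t *at_suspend, const time_t *at_resume,
                     size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (at_suspend[i] != at_resume[i])
            return (int)i;
    }
    return -1;
}
```

A caller would refuse to resume whenever the result is not -1, since a changed timestamp means something mounted the filesystem after the image was taken.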
* Re: Suspend-to-disk woes
2005-03-19 13:26 ` Pavel Machek
@ 2005-03-19 20:20 ` Russell Miller
2005-03-19 21:29 ` Pavel Machek
0 siblings, 1 reply; 14+ messages in thread
From: Russell Miller @ 2005-03-19 20:20 UTC (permalink / raw)
To: Pavel Machek; +Cc: erik.andren, linux-kernel

On Saturday 19 March 2005 05:26, Pavel Machek wrote:

> Checking that would be hard, but you might want to provide patch to check
> last-mounted dates of filesystems and panic if they changed.
> Pavel

Then how would you fix it? There'd also have to be a way to reset it,
otherwise the kernel will never boot again. Perhaps an argument to the
kernel that allows for resetting of the mechanism?

--Russell

--
Russell Miller - rmiller@duskglow.com - Agoura, CA
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Suspend-to-disk woes
2005-03-19 20:20 ` Russell Miller
@ 2005-03-19 21:29 ` Pavel Machek
2005-03-19 21:44 ` Russell Miller
2005-03-21 0:14 ` Nigel Cunningham
0 siblings, 2 replies; 14+ messages in thread
From: Pavel Machek @ 2005-03-19 21:29 UTC (permalink / raw)
To: Russell Miller; +Cc: erik.andren, linux-kernel

On So 19-03-05 12:20:35, Russell Miller wrote:
> On Saturday 19 March 2005 05:26, Pavel Machek wrote:
>
> > Checking that would be hard, but you might want to provide patch to check
> > last-mounted dates of filesystems and panic if they changed.
> > Pavel
>
> Then how would you fix it? There'd also have to be a way to reset it,

boot with "noresume", then mkswap.

Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Suspend-to-disk woes
2005-03-19 21:29 ` Pavel Machek
@ 2005-03-19 21:44 ` Russell Miller
2005-03-21 0:14 ` Nigel Cunningham
1 sibling, 0 replies; 14+ messages in thread
From: Russell Miller @ 2005-03-19 21:44 UTC (permalink / raw)
To: Pavel Machek; +Cc: erik.andren, linux-kernel

On Saturday 19 March 2005 13:29, Pavel Machek wrote:
> On So 19-03-05 12:20:35, Russell Miller wrote:
> > On Saturday 19 March 2005 05:26, Pavel Machek wrote:
> > > Checking that would be hard, but you might want to provide patch to
> > > check last-mounted dates of filesystems and panic if they changed.
> > > Pavel
> >
> > Then how would you fix it? There'd also have to be a way to reset it,
>
> boot with "noresume", then mkswap.
> Pavel

Ah, makes sense. I've never used the resume functionality, so my
ignorance on that subject is understandable... :-)

--Russell

--
Russell Miller - rmiller@duskglow.com - Agoura, CA
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Suspend-to-disk woes
2005-03-19 21:29 ` Pavel Machek
2005-03-19 21:44 ` Russell Miller
@ 2005-03-21 0:14 ` Nigel Cunningham
2005-03-21 0:17 ` Matthew Garrett
2005-03-21 7:38 ` Stefan Seyfried
1 sibling, 2 replies; 14+ messages in thread
From: Nigel Cunningham @ 2005-03-21 0:14 UTC (permalink / raw)
To: Pavel Machek; +Cc: Russell Miller, erik.andren, Linux Kernel Mailing List

Hi.

On Sun, 2005-03-20 at 08:29, Pavel Machek wrote:
> On So 19-03-05 12:20:35, Russell Miller wrote:
> > On Saturday 19 March 2005 05:26, Pavel Machek wrote:
> >
> > > Checking that would be hard, but you might want to provide patch to check
> > > last-mounted dates of filesystems and panic if they changed.
> > > Pavel
> >
> > Then how would you fix it? There'd also have to be a way to reset it,
>
> boot with "noresume", then mkswap.

Yuck! Why panic when you know what is needed? A better solution is to
tell the user they've messed up and give them the option to (1) reboot
and try another kernel or (2) have swsusp restore the original swap
signature and continue booting. This is what suspend2 does (with a
timeout for the prompt). It's not that hard.

Regards,

Nigel
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028; Mob: +61 (417) 100 574

Maintainer of Suspend2 Kernel Patches http://suspend2.net
^ permalink raw reply [flat|nested] 14+ messages in thread
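A prompt with a timeout of the kind Nigel describes could look roughly like this in userspace C. It is a sketch under stated assumptions, not Suspend2's code: the names ask_with_timeout and resume_choice, the 'r'/'c' keys and the caller-supplied default are all invented for the example, and the thread does not say which option Suspend2 falls back to when the timeout expires.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* The two options from the message above (names are hypothetical). */
enum resume_choice {
    CHOICE_REBOOT,        /* (1) reboot and try another kernel */
    CHOICE_DISCARD_IMAGE  /* (2) restore the swap signature, keep booting */
};

/*
 * Wait up to timeout_sec seconds for a single key on fd.
 * 'r' selects a reboot, 'c' selects continuing without the image.
 * A timeout, a read error or any other key yields default_choice,
 * so the boot never hangs indefinitely.
 */
enum resume_choice ask_with_timeout(int fd, int timeout_sec,
                                    enum resume_choice default_choice)
{
    fd_set readable;
    struct timeval tv;
    char key;

    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    /* select() returns 0 on timeout and -1 on error. */
    if (select(fd + 1, &readable, NULL, NULL, &tv) <= 0)
        return default_choice;
    if (read(fd, &key, 1) != 1)
        return default_choice;

    if (key == 'r' || key == 'R')
        return CHOICE_REBOOT;
    if (key == 'c' || key == 'C')
        return CHOICE_DISCARD_IMAGE;
    return default_choice;
}
```

Passing the console's file descriptor and a 30 second timeout would mirror the behaviour mentioned later in the thread; the default is left to the caller because it is a policy decision.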
* Re: Suspend-to-disk woes
2005-03-21 0:14 ` Nigel Cunningham
@ 2005-03-21 0:17 ` Matthew Garrett
2005-03-21 5:59 ` Nigel Cunningham
2005-03-21 7:38 ` Stefan Seyfried
1 sibling, 1 reply; 14+ messages in thread
From: Matthew Garrett @ 2005-03-21 0:17 UTC (permalink / raw)
To: linux-kernel

Nigel Cunningham <ncunningham@cyclades.com> wrote:

> Yuck! Why panic when you know what is needed? A better solution is to
> tell the user they've messed up and give them the option to (1) reboot
> and try another kernel or (2) have swsusp restore the original swap
> signature and continue booting. This is what suspend2 does (with a
> timeout for the prompt). It's not that hard.

It's trivial to do this in userspace - just have an app in initramfs
that checks for a swsusp signature, and then compare the kernel
versions. If they mismatch, prompt for what to do. Putting it in the
kernel is madness.

--
Matthew Garrett | mjg59-chiark.mail.linux-rutgers.kernel@srcf.ucam.org
^ permalink raw reply [flat|nested] 14+ messages in thread
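The signature check Matthew mentions might start from something like the sketch below. It assumes that a swap area carries a 10-byte magic at the very end of its first page ("SWAP-SPACE" or "SWAPSPACE2" for ordinary swap) and that swsusp of this era replaces that magic with "S1SUSPEND" while an image is pending. Treat the swsusp marker as an assumption to verify against the kernel source in use. Suspend2 signatures are not covered, and reading the page from the resume device is left to the caller. The names classify_swap_page and swap_sig are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* Length of the magic stored in the last bytes of the first swap page. */
#define SWAP_MAGIC_LEN 10

enum swap_sig {
    SIG_UNKNOWN,      /* neither plain swap nor a recognised image */
    SIG_PLAIN_SWAP,   /* ordinary swap area, nothing to resume */
    SIG_SWSUSP_IMAGE  /* assumed swsusp marker: an image is pending */
};

/*
 * Classify the first page of a swap area by its trailing magic.
 * page points at a buffer holding that page, page_size is its length.
 */
enum swap_sig classify_swap_page(const unsigned char *page, size_t page_size)
{
    const unsigned char *magic;

    if (page == NULL || page_size < SWAP_MAGIC_LEN)
        return SIG_UNKNOWN;

    magic = page + page_size - SWAP_MAGIC_LEN;

    if (memcmp(magic, "SWAP-SPACE", SWAP_MAGIC_LEN) == 0 ||
        memcmp(magic, "SWAPSPACE2", SWAP_MAGIC_LEN) == 0)
        return SIG_PLAIN_SWAP;

    /* "S1SUSPEND" is nine characters; the tenth byte is not compared. */
    if (memcmp(magic, "S1SUSPEND", 9) == 0)
        return SIG_SWSUSP_IMAGE;

    return SIG_UNKNOWN;
}
```

An initramfs tool would read the first page of the device named by resume= into a buffer, call this function, and only go on to compare kernel versions when the result is SIG_SWSUSP_IMAGE.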
* Re: Suspend-to-disk woes
2005-03-21 0:17 ` Matthew Garrett
@ 2005-03-21 5:59 ` Nigel Cunningham
2005-03-21 9:33 ` Stefan Seyfried
0 siblings, 1 reply; 14+ messages in thread
From: Nigel Cunningham @ 2005-03-21 5:59 UTC (permalink / raw)
To: Matthew Garrett; +Cc: Linux Kernel Mailing List

Hi.

On Mon, 2005-03-21 at 11:17, Matthew Garrett wrote:
> Nigel Cunningham <ncunningham@cyclades.com> wrote:
>
> > Yuck! Why panic when you know what is needed? A better solution is to
> > tell the user they've messed up and give them the option to (1) reboot
> > and try another kernel or (2) have swsusp restore the original swap
> > signature and continue booting. This is what suspend2 does (with a
> > timeout for the prompt). It's not that hard.
>
> It's trivial to do this in userspace - just have an app in initramfs
> that checks for a swsusp signature, and then compare the kernel
> versions. If they mismatch, prompt for what to do. Putting it in the
> kernel is madness.

It's not that trivial.

- You need to know how to modify your initramfs to do it;
- You might have to (learn how to) set up an initramfs just for this;
- Your image might not be stored in a swap partition. For Suspend2, it
  can potentially be in a swap file or (soon) an ordinary file;
- Finding which partition to look in for the signature might be non
  trivial (labels in fstab). You'd want to hard code it or (preferably)
  copy a config file from the root (or other) partition;
- Having addressed the above issues, you still need to add code to read
  the swap header, parse it to find the header, read the header from the
  image, parse it and obtain the kernel version of the saved image.

If your image is not stored in a swap partition, you probably can't
mount the fs the image is stored on, because doing so will replay the
image and make resuming unsafe, so this approach is less trivial without
knowing exactly which disk blocks and device IDs to use (and using dd to
access them).

On top of these, we have two implementations, so you'll want to check
for the signatures of both.

That said, I am considering making something like what you're saying:
exposing methods of testing whether an image exists and an entry through
which you can get Suspend to erase an image via a proc (eventually
sysfs) entry. This will allow something like what you're saying to be
controlled from userspace.

Regards,

Nigel
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028; Mob: +61 (417) 100 574

Maintainer of Suspend2 Kernel Patches http://suspend2.net
^ permalink raw reply [flat|nested] 14+ messages in thread
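The last step in Nigel's list, comparing the kernel version stored with the image against the running kernel, is the easy part and can be sketched as below. The hard parts he lists (locating the image and parsing its header) are not shown. The sketch assumes the stored version is already available as a release string such as "2.6.11-ck1", and the function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * Return 1 if the release string saved with the image equals the given
 * running release, 0 otherwise (including when either string is missing).
 */
int image_matches_release(const char *image_release,
                          const char *running_release)
{
    if (image_release == NULL || running_release == NULL)
        return 0;
    return strcmp(image_release, running_release) == 0;
}

/*
 * Convenience wrapper that asks the running kernel for its release
 * string via uname(2). Returns 0 if uname() fails.
 */
int image_matches_running_kernel(const char *image_release)
{
    struct utsname u;

    if (uname(&u) != 0)
        return 0;
    return image_matches_release(image_release, u.release);
}
```

Such a comparison only helps if it runs on every boot: in the situation that started this thread, the kernel that finally resumed was the same one that had written the image, so the mismatch would have had to be caught while the other kernel was booting.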
* Re: Suspend-to-disk woes
2005-03-21 5:59 ` Nigel Cunningham
@ 2005-03-21 9:33 ` Stefan Seyfried
2005-03-21 21:38 ` Nigel Cunningham
0 siblings, 1 reply; 14+ messages in thread
From: Stefan Seyfried @ 2005-03-21 9:33 UTC (permalink / raw)
To: ncunningham; +Cc: Linux Kernel Mailing List, mgarrett

Hi,

Nigel Cunningham wrote:

> On Mon, 2005-03-21 at 11:17, Matthew Garrett wrote:

>> It's trivial to do this in userspace - just have an app in initramfs

> It's not that trivial.

> - Your image might not be stored in a swap partition. For Suspend2, it
> can potentially be in a swap file or (soon) an ordinary file;
> - Finding which partition to look in for the signature might be non
> trivial (labels in fstab). You'd want to hard code it or (preferably)
> copy a config file from the root (or other) partition;
> - Having addressed the above issues, you still need to add code to read
> the swap header, parse it to find the header, read the header from the
> image, parse it and obtain the kernel version of the saved image.

Well, and you want to compile all this into the kernel? Just to hold the
hands of users who have not read the fine manual?

And you'd need to compile this into all kernels, especially those that
_don't_ support suspend to disk. Or you are back at the place where the
thread started.

> If your image is not stored in a swap partition, you probably can't
> mount the fs the image is stored on, because doing so will replay the
> image and make resuming unsafe, so this approach is less trivial without
> knowing exactly which disk blocks and device IDs to use (and using dd to
> access them).

GRUB reads kernel and initramfs from a dirty reiserfs partition on
resume (although this is a bad idea if you want a fast resume, but
that's another problem). It is possible.

> On top of these, we have two implementations, so you'll want to check
> for the signatures of both.

This is the final argument for doing it in userspace :-).

> That said, I am considering making something like what you're saying:
> exposing methods of testing whether an image exists and an entry through
> which you can get Suspend to erase an image via a proc (eventually
> sysfs) entry. This will allow something like what you're saying to be
> controlled from userspace.

It does not help if the next kernel I boot is not suspend2 patched. This
work should rather go into a library that exports these functions to
userspace programs, for all known suspend implementations.

Regards,

Stefan
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Suspend-to-disk woes
2005-03-21 9:33 ` Stefan Seyfried
@ 2005-03-21 21:38 ` Nigel Cunningham
0 siblings, 0 replies; 14+ messages in thread
From: Nigel Cunningham @ 2005-03-21 21:38 UTC (permalink / raw)
To: Stefan Seyfried; +Cc: Linux Kernel Mailing List, mgarrett

Hi.

On Mon, 2005-03-21 at 20:33, Stefan Seyfried wrote:
> Hi,
>
> Nigel Cunningham wrote:
>
> > On Mon, 2005-03-21 at 11:17, Matthew Garrett wrote:
>
> >> It's trivial to do this in userspace - just have an app in initramfs
>
> > It's not that trivial.
>
> > - Your image might not be stored in a swap partition. For Suspend2, it
> > can potentially be in a swap file or (soon) an ordinary file;
> > - Finding which partition to look in for the signature might be non
> > trivial (labels in fstab). You'd want to hard code it or (preferably)
> > copy a config file from the root (or other) partition;
> > - Having addressed the above issues, you still need to add code to read
> > the swap header, parse it to find the header, read the header from the
> > image, parse it and obtain the kernel version of the saved image.
>
> Well, and you want to compile all this into the kernel? Just to hold the
> hands of users who have not read the fine manual?

Most of it is in there anyway - the kernel code needs to check the image
exists and read the header irrespective of whether it does sanity
checking. In Suspend2, this code is also used for other error conditions
that can stop you being able to resume (failure to load the right
modules in an initrd, failure at accessing the device where the image
should be found etc).

> And you'd need to compile this into all kernels, especially those that
> _don't_ support suspend to disk. Or you are back at the place where the
> thread started.

Yes. The real solution is for all kernels on a system to either support
suspend to disk or not support it. Half measures are what cause the
problem.

> > If your image is not stored in a swap partition, you probably can't
> > mount the fs the image is stored on, because doing so will replay the
> > image and make resuming unsafe, so this approach is less trivial without
> > knowing exactly which disk blocks and device IDs to use (and using dd to
> > access them).
>
> GRUB reads kernel and initramfs from a dirty reiserfs partition on
> resume (although this is a bad idea if you want a fast resume, but
> that's another problem). It is possible.

Mmm. I know it's all possible, but I'm pointing out the issues that make
it not "trivial", which was the original claim.

> > On top of these, we have two implementations, so you'll want to check
> > for the signatures of both.
>
> This is the final argument for doing it in userspace :-).

How so? You then have to maintain two codebases for doing all this
reading and parsing.

> > That said, I am considering making something like what you're saying:
> > exposing methods of testing whether an image exists and an entry through
> > which you can get Suspend to erase an image via a proc (eventually
> > sysfs) entry. This will allow something like what you're saying to be
> > controlled from userspace.
>
> It does not help if the next kernel I boot is not suspend2 patched. This
> work should rather go into a library that exports these functions to
> userspace programs, for all known suspend implementations.

So don't use kernels that aren't suspend2 patched :>

If someone said "I want to boot a kernel that doesn't have support for
ext3 but my rootfs is ext3", would we say "Well then, write a userspace
ext3 driver"? Not exactly the same, I know, but I think the point
stands. We'd say "Don't be silly. Put in the support you need." The real
solution to this mess is to get distros compiling in support for
suspend-to-disk by default. I realise that hasn't been attractive.
Hopefully it will change real-soon-now.

Regards,

Nigel
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028; Mob: +61 (417) 100 574

Maintainer of Suspend2 Kernel Patches http://suspend2.net
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Suspend-to-disk woes
2005-03-21 0:14 ` Nigel Cunningham
2005-03-21 0:17 ` Matthew Garrett
@ 2005-03-21 7:38 ` Stefan Seyfried
2005-03-21 11:23 ` Nigel Cunningham
1 sibling, 1 reply; 14+ messages in thread
From: Stefan Seyfried @ 2005-03-21 7:38 UTC (permalink / raw)
To: ncunningham
Cc: Russell Miller, erik.andren, Linux Kernel Mailing List, Pavel Machek

Nigel Cunningham wrote:
> Hi.
>
> On Sun, 2005-03-20 at 08:29, Pavel Machek wrote:

>> boot with "noresume", then mkswap.
>
> Yuck! Why panic when you know what is needed? A better solution is to

Ok, so let's

    printk("You booted another kernel than you suspended with.\n");
    printk("You have two options now:\n");
    printk(" - boot the kernel you suspended with\n");
    printk(" - pass 'noresume' at boot and mkswap your swap partition "
           " later\n");
    printk("Try again, player 1!\n");
    panic();

> tell the user they've messed up and give them the option to (1) reboot
> and try another kernel or (2) have swsusp restore the original swap
> signature and continue booting. This is what suspend2 does (with a
> timeout for the prompt). It's not that hard.

yes, but you need user input etc. Not considered a good idea IIRC.

Anyway, the hard thing to do is to find out when to bail out and when
not. The part that handles the user interface is the easier one :-)

Regards,

Stefan
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Suspend-to-disk woes
2005-03-21 7:38 ` Stefan Seyfried
@ 2005-03-21 11:23 ` Nigel Cunningham
0 siblings, 0 replies; 14+ messages in thread
From: Nigel Cunningham @ 2005-03-21 11:23 UTC (permalink / raw)
To: Stefan Seyfried
Cc: Russell Miller, erik.andren, Linux Kernel Mailing List, Pavel Machek

Hi.

On Mon, 2005-03-21 at 18:38, Stefan Seyfried wrote:
> Nigel Cunningham wrote:
> > Hi.
> >
> > On Sun, 2005-03-20 at 08:29, Pavel Machek wrote:
>
> >> boot with "noresume", then mkswap.
> >
> > Yuck! Why panic when you know what is needed? A better solution is to
>
> Ok, so let's
>
>     printk("You booted another kernel than you suspended with.\n");
>     printk("You have two options now:\n");
>     printk(" - boot the kernel you suspended with\n");
>     printk(" - pass 'noresume' at boot and mkswap your swap partition "
>            " later\n");
>     printk("Try again, player 1!\n");
>     panic();

Still in the yuck category, although the better information is
definitely an improvement :>

> > tell the user they've messed up and give them the option to (1) reboot
> > and try another kernel or (2) have swsusp restore the original swap
> > signature and continue booting. This is what suspend2 does (with a
> > timeout for the prompt). It's not that hard.
>
> yes, but you need user input etc. Not considered a good idea IIRC.

I understood that having it hang indefinitely was considered a bad idea.
Suspend2 already has code that does what I'm suggesting, and
incorporates a 30 second timeout.

> Anyway, the hard thing to do is to find out when to bail out and when
> not. The part that handles the user interface is the easier one :-)

Agreed. That's where Pavel's code might need a little hacking around.

Regards,

Nigel
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028; Mob: +61 (417) 100 574

Maintainer of Suspend2 Kernel Patches http://suspend2.net
^ permalink raw reply [flat|nested] 14+ messages in thread