* Re: [PATCH] Freeze bdevs when freezing processes. [not found] ` <200610231236.54317.rjw@sisk.pl> @ 2006-10-24 14:44 ` David Chinner 2006-10-24 15:29 ` Rafael J. Wysocki 2006-10-24 22:19 ` Nigel Cunningham 0 siblings, 2 replies; 28+ messages in thread From: David Chinner @ 2006-10-24 14:44 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Nigel Cunningham, Andrew Morton, LKML, Pavel Machek, xfs On Mon, Oct 23, 2006 at 12:36:53PM +0200, Rafael J. Wysocki wrote: > On Monday, 23 October 2006 06:12, Nigel Cunningham wrote: > > XFS can continue to submit I/O from a timer routine, even after > > freezeable kernel and userspace threads are frozen. This doesn't seem to > > be an issue for current swsusp code, > > So it doesn't look like we need the patch _now_. > > > but is definitely an issue for Suspend2, where the pages being written could > > be overwritten by Suspend2's atomic copy. > > And IMO that's a good reason why we shouldn't use RCU pages for storing the > image. XFS is one known example that breaks things if we do so and > there may be more such things that we don't know of. The fact that they > haven't appeared in testing so far doesn't mean they don't exist and > moreover some things like that may appear in the future. Could you please tell us which XFS bits are broken so we can get them fixed? The XFS daemons should all be checking if they are supposed to freeze (i.e. they call try_to_freeze() after they wake up due to timer expiry) so I thought they were doing the right thing. However, I have to say that I agree with freezing the filesystems before suspend - at least XFS will be in a consistent state that can be recovered from without corruption if your machine fails to resume.... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-24 14:44 ` [PATCH] Freeze bdevs when freezing processes David Chinner @ 2006-10-24 15:29 ` Rafael J. Wysocki 2006-10-24 16:27 ` Oleg Verych ` (2 more replies) 2006-10-24 22:19 ` Nigel Cunningham 1 sibling, 3 replies; 28+ messages in thread From: Rafael J. Wysocki @ 2006-10-24 15:29 UTC (permalink / raw) To: David Chinner; +Cc: Nigel Cunningham, Andrew Morton, LKML, Pavel Machek, xfs On Tuesday, 24 October 2006 16:44, David Chinner wrote: > On Mon, Oct 23, 2006 at 12:36:53PM +0200, Rafael J. Wysocki wrote: > > On Monday, 23 October 2006 06:12, Nigel Cunningham wrote: > > > XFS can continue to submit I/O from a timer routine, even after > > > freezeable kernel and userspace threads are frozen. This doesn't seem to > > > be an issue for current swsusp code, > > > > So it doesn't look like we need the patch _now_. > > > > > but is definitely an issue for Suspend2, where the pages being written could > > > be overwritten by Suspend2's atomic copy. > > > > And IMO that's a good reason why we shouldn't use RCU pages for storing the > > image. XFS is one known example that breaks things if we do so and > > there may be more such things that we don't know of. The fact that they > > haven't appeared in testing so far doesn't mean they don't exist and > > moreover some things like that may appear in the future. > > Could you please tell us which XFS bits are broken so we can get > them fixed? The XFS daemons should all be checking if they are > supposed to freeze (i.e. they call try_to_freeze() after they wake > up due to timer expiry) so I thought they were doing the right > thing. > > However, I have to say that I agree with freezing the filesystems > before suspend - at least XFS will be in a consistent state that can > be recovered from without corruption if your machine fails to > resume.... Do you mean calling sys_sync() after the userspace has been frozen may not be sufficient? Greetings, Rafael -- You never change things by fighting the existing reality. R. Buckminster Fuller ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-24 15:29 ` Rafael J. Wysocki @ 2006-10-24 16:27 ` Oleg Verych 2006-10-25 8:05 ` Pavel Machek 2006-10-24 16:33 ` David Chinner 2006-10-24 17:06 ` Christoph Hellwig 2 siblings, 1 reply; 28+ messages in thread From: Oleg Verych @ 2006-10-24 16:27 UTC (permalink / raw) To: Rafael J. Wysocki, David Chinner, Nigel Cunningham, Andrew Morton, Pavel Machek, xfs, Oleg Verych On 2006-10-24, Rafael J. Wysocki wrote: > On Tuesday, 24 October 2006 16:44, David Chinner wrote: >> On Mon, Oct 23, 2006 at 12:36:53PM +0200, Rafael J. Wysocki wrote: >> > On Monday, 23 October 2006 06:12, Nigel Cunningham wrote: >> > > XFS can continue to submit I/O from a timer routine, even after >> > > freezeable kernel and userspace threads are frozen. This doesn't seem to >> > > be an issue for current swsusp code, >> > >> > So it doesn't look like we need the patch _now_. >> > >> > > but is definitely an issue for Suspend2, where the pages being written could >> > > be overwritten by Suspend2's atomic copy. >> > >> > And IMO that's a good reason why we shouldn't use RCU pages for storing the >> > image. XFS is one known example that breaks things if we do so and >> > there may be more such things that we don't know of. The fact that they >> > haven't appeared in testing so far doesn't mean they don't exist and >> > moreover some things like that may appear in the future. >> >> Could you please tell us which XFS bits are broken so we can get >> them fixed? The XFS daemons should all be checking if they are >> supposed to freeze (i.e. they call try_to_freeze() after they wake >> up due to timer expiry) so I thought they were doing the right >> thing. >> >> However, I have to say that I agree with freezing the filesystems >> before suspend - at least XFS will be in a consistent state that can >> be recovered from without corruption if your machine fails to >> resume.... > > Do you mean calling sys_sync() after the userspace has been frozen > may not be sufficient? Please see <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=317479;msg=105;att=0> it's bottom of <http://bugs.debian.org/317479> IMHO it's may be helpful. -o--=O`C #oo'L O <___=E M ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-24 16:27 ` Oleg Verych @ 2006-10-25 8:05 ` Pavel Machek 0 siblings, 0 replies; 28+ messages in thread From: Pavel Machek @ 2006-10-25 8:05 UTC (permalink / raw) To: Oleg Verych Cc: Rafael J. Wysocki, David Chinner, Nigel Cunningham, Andrew Morton, xfs Hi! > >> However, I have to say that I agree with freezing the filesystems > >> before suspend - at least XFS will be in a consistent state that can > >> be recovered from without corruption if your machine fails to > >> resume.... > > > > Do you mean calling sys_sync() after the userspace has been frozen > > may not be sufficient? > > Please see > <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=317479;msg=105;att=0> > > it's bottom of > <http://bugs.debian.org/317479> > > IMHO it's may be helpful. Heh, ouch. Okay, so what happens is that Debian's scripts attempt to change /boot/../grub.conf.something (xfs mounted). Changes are properly propagated to the journal, but grub can't parse the journal, and fails. Ouch. Either 1) fix grub to parse the journal (not practical, I guess) 2) do not modify grub configuration from hibernate scripts or 3) fix hibernate scripts to sync down to disk, properly. swsusp does not guarantee any particular state of filesystems when system is suspended. Sorry. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-24 15:29 ` Rafael J. Wysocki 2006-10-24 16:27 ` Oleg Verych @ 2006-10-24 16:33 ` David Chinner 2006-10-24 21:37 ` Pavel Machek 2006-10-24 17:06 ` Christoph Hellwig 2 siblings, 1 reply; 28+ messages in thread From: David Chinner @ 2006-10-24 16:33 UTC (permalink / raw) To: Rafael J. Wysocki Cc: David Chinner, Nigel Cunningham, Andrew Morton, LKML, Pavel Machek, xfs On Tue, Oct 24, 2006 at 05:29:59PM +0200, Rafael J. Wysocki wrote: > On Tuesday, 24 October 2006 16:44, David Chinner wrote: > > On Mon, Oct 23, 2006 at 12:36:53PM +0200, Rafael J. Wysocki wrote: > > > On Monday, 23 October 2006 06:12, Nigel Cunningham wrote: > > > > XFS can continue to submit I/O from a timer routine, even after > > > > freezeable kernel and userspace threads are frozen. This doesn't seem to > > > > be an issue for current swsusp code, > > > > > > So it doesn't look like we need the patch _now_. > > > > > > > but is definitely an issue for Suspend2, where the pages being written could > > > > be overwritten by Suspend2's atomic copy. > > > > > > And IMO that's a good reason why we shouldn't use RCU pages for storing the > > > image. XFS is one known example that breaks things if we do so and > > > there may be more such things that we don't know of. The fact that they > > > haven't appeared in testing so far doesn't mean they don't exist and > > > moreover some things like that may appear in the future. > > > > Could you please tell us which XFS bits are broken so we can get > > them fixed? The XFS daemons should all be checking if they are > > supposed to freeze (i.e. they call try_to_freeze() after they wake > > up due to timer expiry) so I thought they were doing the right > > thing. > > > > However, I have to say that I agree with freezing the filesystems > > before suspend - at least XFS will be in a consistent state that can > > be recovered from without corruption if your machine fails to > > resume.... > > Do you mean calling sys_sync() after the userspace has been frozen > may not be sufficient? In most cases it probably is, but sys_sync() doesn't provide any guarantees that the filesystem is not being used or written to after it completes. Given that every so often I hear about an XFS filesystem that was corrupted by suspend, I don't think this is sufficient... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-24 16:33 ` David Chinner @ 2006-10-24 21:37 ` Pavel Machek 2006-10-25 0:13 ` David Chinner 0 siblings, 1 reply; 28+ messages in thread From: Pavel Machek @ 2006-10-24 21:37 UTC (permalink / raw) To: David Chinner Cc: Rafael J. Wysocki, Nigel Cunningham, Andrew Morton, LKML, xfs Hi! > > Do you mean calling sys_sync() after the userspace has been frozen > > may not be sufficient? > > In most cases it probably is, but sys_sync() doesn't provide any > guarantees that the filesystem is not being used or written to after > it completes. Given that every so often I hear about an XFS filesystem > that was corrupted by suspend, I don't think this is sufficient... Userspace is frozen. There's noone that can write to the XFS filesystem. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-24 21:37 ` Pavel Machek @ 2006-10-25 0:13 ` David Chinner 2006-10-25 8:10 ` Pavel Machek 0 siblings, 1 reply; 28+ messages in thread From: David Chinner @ 2006-10-25 0:13 UTC (permalink / raw) To: Pavel Machek Cc: David Chinner, Rafael J. Wysocki, Nigel Cunningham, Andrew Morton, LKML, xfs On Tue, Oct 24, 2006 at 11:37:37PM +0200, Pavel Machek wrote: > Hi! > > > > Do you mean calling sys_sync() after the userspace has been frozen > > > may not be sufficient? > > > > In most cases it probably is, but sys_sync() doesn't provide any > > guarantees that the filesystem is not being used or written to after > > it completes. Given that every so often I hear about an XFS filesystem > > that was corrupted by suspend, I don't think this is sufficient... > > Userspace is frozen. There's noone that can write to the XFS > filesystem. Sure, no new userspace processes can write data, but what about the internal state of the filesystem? All a sync guarantees is that the filesystem is consistent when the sync returns and XFS provides this guarantee by writing all data and ensuring all metadata changes are logged so if a crash occurs it can be recovered (which provides the sync guarantee). hence after a sys_sync(), XFS will still have lots of dirty metadata that needs to be written to disk at some time in the future so the transactions can be removed from the log. This dirty metadata can be flushed at any time, and the dirty state is kept in XFS structures and not always in page structures (think multipage metadata buffers). Hence I cannot see how suspend can guarantee that it has saved all the dirty data in XFS, nor restore it correctly on resume. Once you toss dirty metadata that is currently in the log, further operations will result in that log transaction being overwritten without it ever being written to disk. That then means any subsequent operations after resume will corrupt the filesystem.... Hence the only way to correctly rebuild the XFS state on resume is to quiesce the filesystem on suspend and thaw it on resume so as to trigger log recovery. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-25 0:13 ` David Chinner @ 2006-10-25 8:10 ` Pavel Machek 2006-10-25 8:38 ` David Chinner 0 siblings, 1 reply; 28+ messages in thread From: Pavel Machek @ 2006-10-25 8:10 UTC (permalink / raw) To: David Chinner Cc: Rafael J. Wysocki, Nigel Cunningham, Andrew Morton, LKML, xfs Hi! > > > > Do you mean calling sys_sync() after the userspace has been frozen > > > > may not be sufficient? > > > > > > In most cases it probably is, but sys_sync() doesn't provide any > > > guarantees that the filesystem is not being used or written to after > > > it completes. Given that every so often I hear about an XFS filesystem > > > that was corrupted by suspend, I don't think this is sufficient... > > > > Userspace is frozen. There's noone that can write to the XFS > > filesystem. > > Sure, no new userspace processes can write data, but what about the > internal state of the filesystem? > > All a sync guarantees is that the filesystem is consistent when the > sync returns and XFS provides this guarantee by writing all data and > ensuring all metadata changes are logged so if a crash occurs it can > be recovered (which provides the sync guarantee). hence after a > sys_sync(), XFS will still have lots of dirty metadata that needs to > be written to disk at some time in the future so the transactions > can be removed from the log. > > This dirty metadata can be flushed at any time, and the dirty state > is kept in XFS structures and not always in page structures (think > multipage metadata buffers). Hence I cannot see how suspend can > guarantee that it has saved all the dirty data in XFS, nor > restore it correctly on resume. Once you toss dirty metadata that > is currently in the log, further operations will result in that log > transaction being overwritten without it ever being written to disk. > That then means any subsequent operations after resume will corrupt > the filesystem.... > > Hence the only way to correctly rebuild the XFS state on resume is > to quiesce the filesystem on suspend and thaw it on resume so as to > trigger log recovery. No, during suspend/resume, memory image is saved, and no state is lost. We would not even have to do sys_sync(), and suspend/resume would still work properly. sys_sync() is there only to limit damage in case of suspend/resume failure. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-25 8:10 ` Pavel Machek @ 2006-10-25 8:38 ` David Chinner 2006-10-25 8:47 ` Pavel Machek 0 siblings, 1 reply; 28+ messages in thread From: David Chinner @ 2006-10-25 8:38 UTC (permalink / raw) To: Pavel Machek Cc: David Chinner, Rafael J. Wysocki, Nigel Cunningham, Andrew Morton, LKML, xfs On Wed, Oct 25, 2006 at 10:10:01AM +0200, Pavel Machek wrote: > > Hence the only way to correctly rebuild the XFS state on resume is > > to quiesce the filesystem on suspend and thaw it on resume so as to > > trigger log recovery. > > No, during suspend/resume, memory image is saved, and no state is > lost. We would not even have to do sys_sync(), and suspend/resume > would still work properly. It seems to me that you ensure the filesystem is synced to disk and then at some point later you record the memory state of the filesystem, but these happen at different times. That leaves a window for things to get out of sync again, right? Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-25 8:38 ` David Chinner @ 2006-10-25 8:47 ` Pavel Machek 2006-10-25 12:32 ` Rafael J. Wysocki 0 siblings, 1 reply; 28+ messages in thread From: Pavel Machek @ 2006-10-25 8:47 UTC (permalink / raw) To: David Chinner Cc: Rafael J. Wysocki, Nigel Cunningham, Andrew Morton, LKML, xfs On Wed 2006-10-25 18:38:30, David Chinner wrote: > On Wed, Oct 25, 2006 at 10:10:01AM +0200, Pavel Machek wrote: > > > Hence the only way to correctly rebuild the XFS state on resume is > > > to quiesce the filesystem on suspend and thaw it on resume so as to > > > trigger log recovery. > > > > No, during suspend/resume, memory image is saved, and no state is > > lost. We would not even have to do sys_sync(), and suspend/resume > > would still work properly. > > It seems to me that you ensure the filesystem is synced to disk and > then at some point later you record the memory state of the > filesystem, but these happen at different times. That leaves a > window for things to get out of sync again, right? I DO NOT HAVE TO ENSURE FILESYSTEM IS SYNCED. That sys_sync() is optional. Recording of memory state is atomic, and as long as noone writes to the disk after atomic snapshot, memory image matches what is on disk. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-25 8:47 ` Pavel Machek @ 2006-10-25 12:32 ` Rafael J. Wysocki 2006-10-25 13:23 ` Nigel Cunningham 0 siblings, 1 reply; 28+ messages in thread From: Rafael J. Wysocki @ 2006-10-25 12:32 UTC (permalink / raw) To: Pavel Machek; +Cc: David Chinner, Nigel Cunningham, Andrew Morton, LKML, xfs On Wednesday, 25 October 2006 10:47, Pavel Machek wrote: > On Wed 2006-10-25 18:38:30, David Chinner wrote: > > On Wed, Oct 25, 2006 at 10:10:01AM +0200, Pavel Machek wrote: > > > > Hence the only way to correctly rebuild the XFS state on resume is > > > > to quiesce the filesystem on suspend and thaw it on resume so as to > > > > trigger log recovery. > > > > > > No, during suspend/resume, memory image is saved, and no state is > > > lost. We would not even have to do sys_sync(), and suspend/resume > > > would still work properly. > > > > It seems to me that you ensure the filesystem is synced to disk and > > then at some point later you record the memory state of the > > filesystem, but these happen at different times. That leaves a > > window for things to get out of sync again, right? > > I DO NOT HAVE TO ENSURE FILESYSTEM IS SYNCED. That sys_sync() is > optional. > > Recording of memory state is atomic, and as long as noone writes to > the disk after atomic snapshot, memory image matches what is on disk. Well, my impression is that this is exactly what happens here: Something in the XFS code causes metadata to be written to disk _after_ the atomic snapshot. That's why I asked if the dirty XFS metadata were flushed by a kernel thread. Greetings, Rafael -- You never change things by fighting the existing reality. R. Buckminster Fuller ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-25 12:32 ` Rafael J. Wysocki @ 2006-10-25 13:23 ` Nigel Cunningham [not found] ` <200610252105.56862.rjw@sisk.pl> 0 siblings, 1 reply; 28+ messages in thread From: Nigel Cunningham @ 2006-10-25 13:23 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Pavel Machek, David Chinner, Andrew Morton, LKML, xfs Hi. On Wed, 2006-10-25 at 14:32 +0200, Rafael J. Wysocki wrote: > On Wednesday, 25 October 2006 10:47, Pavel Machek wrote: > > On Wed 2006-10-25 18:38:30, David Chinner wrote: > > > On Wed, Oct 25, 2006 at 10:10:01AM +0200, Pavel Machek wrote: > > > > > Hence the only way to correctly rebuild the XFS state on resume is > > > > > to quiesce the filesystem on suspend and thaw it on resume so as to > > > > > trigger log recovery. > > > > > > > > No, during suspend/resume, memory image is saved, and no state is > > > > lost. We would not even have to do sys_sync(), and suspend/resume > > > > would still work properly. > > > > > > It seems to me that you ensure the filesystem is synced to disk and > > > then at some point later you record the memory state of the > > > filesystem, but these happen at different times. That leaves a > > > window for things to get out of sync again, right? > > > > I DO NOT HAVE TO ENSURE FILESYSTEM IS SYNCED. That sys_sync() is > > optional. > > > > Recording of memory state is atomic, and as long as noone writes to > > the disk after atomic snapshot, memory image matches what is on disk. > > Well, my impression is that this is exactly what happens here: Something > in the XFS code causes metadata to be written to disk _after_ the atomic > snapshot. > > That's why I asked if the dirty XFS metadata were flushed by a kernel thread. When I first added bdev freezing it was because there was an XFS timer doing writes. Regards, Nigel ^ permalink raw reply [flat|nested] 28+ messages in thread
[parent not found: <200610252105.56862.rjw@sisk.pl>]
* Re: [PATCH] Freeze bdevs when freezing processes. [not found] ` <200610252105.56862.rjw@sisk.pl> @ 2006-10-26 7:30 ` David Chinner 2006-10-26 8:18 ` Nigel Cunningham 0 siblings, 1 reply; 28+ messages in thread From: David Chinner @ 2006-10-26 7:30 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Nigel Cunningham, Pavel Machek, David Chinner, Andrew Morton, LKML, xfs On Wed, Oct 25, 2006 at 09:05:56PM +0200, Rafael J. Wysocki wrote: > On Wednesday, 25 October 2006 15:23, Nigel Cunningham wrote: > > > > > > Well, my impression is that this is exactly what happens here: Something > > > in the XFS code causes metadata to be written to disk _after_ the atomic > > > snapshot. > > > > > > That's why I asked if the dirty XFS metadata were flushed by a kernel thread. > > > > When I first added bdev freezing it was because there was an XFS timer > > doing writes. > > Yes, I noticed you said that, but I'd like someone from the XFS team to either > confirm or deny it. We have daemons running in the background that can definitely do stuff after a sync. hmm - one does try_to_freeze() after a wakeup, the other does: if (unlikely(freezing(current))) { set_bit(XBT_FORCE_SLEEP, &target->bt_flags); refrigerator(); } else { clear_bit(XBT_FORCE_SLEEP, &target->bt_flags); } before it goes to sleep. So that one (xfsbufd - metadata buffer flushing) can definitely wake up after the sync and do work, and the other could if the kernel thread freeze occurs after the sync. Another good question at this point - exactly how should we be putting these thread to to sleep? Are both these valid methods for freezing them? And should we be freezing when we wake up instead of before we go to sleep? i.e. what are teh rules we are supposed to be following? Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-26 7:30 ` David Chinner @ 2006-10-26 8:18 ` Nigel Cunningham 2006-10-26 8:48 ` Rafael J. Wysocki ` (2 more replies) 0 siblings, 3 replies; 28+ messages in thread From: Nigel Cunningham @ 2006-10-26 8:18 UTC (permalink / raw) To: David Chinner; +Cc: Rafael J. Wysocki, Pavel Machek, Andrew Morton, LKML, xfs Hi Dave. On Thu, 2006-10-26 at 17:30 +1000, David Chinner wrote: > On Wed, Oct 25, 2006 at 09:05:56PM +0200, Rafael J. Wysocki wrote: > > On Wednesday, 25 October 2006 15:23, Nigel Cunningham wrote: > > > > > > > > Well, my impression is that this is exactly what happens here: Something > > > > in the XFS code causes metadata to be written to disk _after_ the atomic > > > > snapshot. > > > > > > > > That's why I asked if the dirty XFS metadata were flushed by a kernel thread. > > > > > > When I first added bdev freezing it was because there was an XFS timer > > > doing writes. > > > > Yes, I noticed you said that, but I'd like someone from the XFS team to either > > confirm or deny it. > > We have daemons running in the background that can definitely do stuff > after a sync. hmm - one does try_to_freeze() after a wakeup, the > other does: > > if (unlikely(freezing(current))) { > set_bit(XBT_FORCE_SLEEP, &target->bt_flags); > refrigerator(); > } else { > clear_bit(XBT_FORCE_SLEEP, &target->bt_flags); > } > > before it goes to sleep. So that one (xfsbufd - metadata buffer flushing) > can definitely wake up after the sync and do work, and the other could if > the kernel thread freeze occurs after the sync. > > Another good question at this point - exactly how should we be putting > these thread to to sleep? Are both these valid methods for freezing them? > And should we be freezing when we wake up instead of before we go to > sleep? i.e. what are teh rules we are supposed to be following? As you have them at the moment, the threads seem to be freezing fine. The issue I've seen in the past related not to threads but to timer based activity. Admittedly it was 2.6.14 when I last looked at it, but there used to be a possibility for XFS to submit I/O from a timer when the threads are frozen but the bdev isn't frozen. Has that changed? Regards, Nigel ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-26 8:18 ` Nigel Cunningham @ 2006-10-26 8:48 ` Rafael J. Wysocki 2006-10-26 8:57 ` David Chinner 2006-10-26 9:08 ` Rafael J. Wysocki 2 siblings, 0 replies; 28+ messages in thread From: Rafael J. Wysocki @ 2006-10-26 8:48 UTC (permalink / raw) To: Nigel Cunningham; +Cc: David Chinner, Pavel Machek, Andrew Morton, LKML, xfs On Thursday, 26 October 2006 10:18, Nigel Cunningham wrote: > Hi Dave. > > On Thu, 2006-10-26 at 17:30 +1000, David Chinner wrote: > > On Wed, Oct 25, 2006 at 09:05:56PM +0200, Rafael J. Wysocki wrote: > > > On Wednesday, 25 October 2006 15:23, Nigel Cunningham wrote: > > > > > > > > > > Well, my impression is that this is exactly what happens here: Something > > > > > in the XFS code causes metadata to be written to disk _after_ the atomic > > > > > snapshot. > > > > > > > > > > That's why I asked if the dirty XFS metadata were flushed by a kernel thread. > > > > > > > > When I first added bdev freezing it was because there was an XFS timer > > > > doing writes. > > > > > > Yes, I noticed you said that, but I'd like someone from the XFS team to either > > > confirm or deny it. > > > > We have daemons running in the background that can definitely do stuff > > after a sync. hmm - one does try_to_freeze() after a wakeup, the > > other does: > > > > if (unlikely(freezing(current))) { > > set_bit(XBT_FORCE_SLEEP, &target->bt_flags); > > refrigerator(); > > } else { > > clear_bit(XBT_FORCE_SLEEP, &target->bt_flags); > > } > > > > before it goes to sleep. So that one (xfsbufd - metadata buffer flushing) > > can definitely wake up after the sync and do work, Once it's entered the refrigerator, it won't return from it before freeze_processes() is called (usually during the resume). > > and the other could if the kernel thread freeze occurs after the sync. Even if it does anything after the sync, it's okay as long as that's before we create the image. And it is, because both threads are frozen before we do it. > > Another good question at this point - exactly how should we be putting > > these thread to to sleep? Are both these valid methods for freezing them? > > And should we be freezing when we wake up instead of before we go to > > sleep? i.e. what are teh rules we are supposed to be following? > > As you have them at the moment, the threads seem to be freezing fine. Yes. IMO they are freezing fine. In fact the suspend would fail if they didn't freeze unless one of them had PF_NOFREEZE set. Greetings, Rafael -- You never change things by fighting the existing reality. R. Buckminster Fuller ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-26 8:18 ` Nigel Cunningham 2006-10-26 8:48 ` Rafael J. Wysocki @ 2006-10-26 8:57 ` David Chinner 2006-10-26 9:11 ` Rafael J. Wysocki 2006-10-26 9:18 ` Nigel Cunningham 2006-10-26 9:08 ` Rafael J. Wysocki 2 siblings, 2 replies; 28+ messages in thread From: David Chinner @ 2006-10-26 8:57 UTC (permalink / raw) To: Nigel Cunningham Cc: David Chinner, Rafael J. Wysocki, Pavel Machek, Andrew Morton, LKML, xfs Hi Nigel, On Thu, Oct 26, 2006 at 06:18:29PM +1000, Nigel Cunningham wrote: > On Thu, 2006-10-26 at 17:30 +1000, David Chinner wrote: > > We have daemons running in the background that can definitely do stuff > > after a sync. hmm - one does try_to_freeze() after a wakeup, the > > other does: > > > > if (unlikely(freezing(current))) { > > set_bit(XBT_FORCE_SLEEP, &target->bt_flags); > > refrigerator(); > > } else { > > clear_bit(XBT_FORCE_SLEEP, &target->bt_flags); > > } > > > > before it goes to sleep. So that one (xfsbufd - metadata buffer flushing) > > can definitely wake up after the sync and do work, and the other could if > > the kernel thread freeze occurs after the sync. > > > > Another good question at this point - exactly how should we be putting > > these thread to to sleep? Are both these valid methods for freezing them? > > And should we be freezing when we wake up instead of before we go to > > sleep? i.e. what are teh rules we are supposed to be following? > > As you have them at the moment, the threads seem to be freezing fine. > The issue I've seen in the past related not to threads but to timer > based activity. Admittedly it was 2.6.14 when I last looked at it, but > there used to be a possibility for XFS to submit I/O from a timer when > the threads are frozen but the bdev isn't frozen. Has that changed? I didn't think we've ever done that - periodic or delayed operations are passed off to the kernel threads to execute. A stack trace (if you still have it) would be really help here. Hmmm - we have a couple of per-cpu work queues as well that are used on I/O completion and that can, in some circumstances, trigger new transactions. If we are only flush metadata, then I don't think that any more I/o will be issued, but I could be wrong (maze of twisty passages). Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-26 8:57 ` David Chinner @ 2006-10-26 9:11 ` Rafael J. Wysocki 2006-10-27 1:38 ` David Chinner 2006-10-26 9:18 ` Nigel Cunningham 1 sibling, 1 reply; 28+ messages in thread From: Rafael J. Wysocki @ 2006-10-26 9:11 UTC (permalink / raw) To: David Chinner; +Cc: Nigel Cunningham, Pavel Machek, Andrew Morton, LKML, xfs Hi, On Thursday, 26 October 2006 10:57, David Chinner wrote: > Hi Nigel, > > On Thu, Oct 26, 2006 at 06:18:29PM +1000, Nigel Cunningham wrote: > > On Thu, 2006-10-26 at 17:30 +1000, David Chinner wrote: > > > We have daemons running in the background that can definitely do stuff > > > after a sync. hmm - one does try_to_freeze() after a wakeup, the > > > other does: > > > > > > if (unlikely(freezing(current))) { > > > set_bit(XBT_FORCE_SLEEP, &target->bt_flags); > > > refrigerator(); > > > } else { > > > clear_bit(XBT_FORCE_SLEEP, &target->bt_flags); > > > } > > > > > > before it goes to sleep. So that one (xfsbufd - metadata buffer flushing) > > > can definitely wake up after the sync and do work, and the other could if > > > the kernel thread freeze occurs after the sync. > > > > > > Another good question at this point - exactly how should we be putting > > > these thread to to sleep? Are both these valid methods for freezing them? > > > And should we be freezing when we wake up instead of before we go to > > > sleep? i.e. what are teh rules we are supposed to be following? > > > > As you have them at the moment, the threads seem to be freezing fine. > > The issue I've seen in the past related not to threads but to timer > > based activity. Admittedly it was 2.6.14 when I last looked at it, but > > there used to be a possibility for XFS to submit I/O from a timer when > > the threads are frozen but the bdev isn't frozen. Has that changed? > > I didn't think we've ever done that - periodic or delayed operations > are passed off to the kernel threads to execute. A stack trace > (if you still have it) would be really help here. > > Hmmm - we have a couple of per-cpu work queues as well that are > used on I/O completion and that can, in some circumstances, > trigger new transactions. If we are only flush metadata, then > I don't think that any more I/o will be issued, but I could be > wrong (maze of twisty passages). Well, I think this exactly is the problem, because worker_threads run with PF_NOFREEZE set (as I've just said in another message). Greetings, Rafael -- You never change things by fighting the existing reality. R. Buckminster Fuller ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-26 9:11 ` Rafael J. Wysocki @ 2006-10-27 1:38 ` David Chinner 2006-10-27 14:37 ` Rafael J. Wysocki 2006-10-29 17:35 ` Pavel Machek 0 siblings, 2 replies; 28+ messages in thread From: David Chinner @ 2006-10-27 1:38 UTC (permalink / raw) To: Rafael J. Wysocki Cc: David Chinner, Nigel Cunningham, Pavel Machek, Andrew Morton, LKML, xfs On Thu, Oct 26, 2006 at 11:11:29AM +0200, Rafael J. Wysocki wrote: > On Thursday, 26 October 2006 10:57, David Chinner wrote: > > On Thu, Oct 26, 2006 at 06:18:29PM +1000, Nigel Cunningham wrote: > > > As you have them at the moment, the threads seem to be freezing fine. > > > The issue I've seen in the past related not to threads but to timer > > > based activity. Admittedly it was 2.6.14 when I last looked at it, but > > > there used to be a possibility for XFS to submit I/O from a timer when > > > the threads are frozen but the bdev isn't frozen. Has that changed? > > > > I didn't think we've ever done that - periodic or delayed operations > > are passed off to the kernel threads to execute. A stack trace > > (if you still have it) would be really help here. > > > > Hmmm - we have a couple of per-cpu work queues as well that are > > used on I/O completion and that can, in some circumstances, > > trigger new transactions. If we are only flush metadata, then > > I don't think that any more I/o will be issued, but I could be > > wrong (maze of twisty passages). > > Well, I think this exactly is the problem, because worker_threads run with > PF_NOFREEZE set (as I've just said in another message). Ok, so freezing the filesystem is the only way you can prevent this as the workqueues are flushed as part of quiescing the filesystem. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-27 1:38 ` David Chinner @ 2006-10-27 14:37 ` Rafael J. Wysocki 2006-10-29 17:35 ` Pavel Machek 1 sibling, 0 replies; 28+ messages in thread From: Rafael J. Wysocki @ 2006-10-27 14:37 UTC (permalink / raw) To: David Chinner; +Cc: Nigel Cunningham, Pavel Machek, Andrew Morton, LKML, xfs On Friday, 27 October 2006 03:38, David Chinner wrote: > On Thu, Oct 26, 2006 at 11:11:29AM +0200, Rafael J. Wysocki wrote: > > On Thursday, 26 October 2006 10:57, David Chinner wrote: > > > On Thu, Oct 26, 2006 at 06:18:29PM +1000, Nigel Cunningham wrote: > > > > As you have them at the moment, the threads seem to be freezing fine. > > > > The issue I've seen in the past related not to threads but to timer > > > > based activity. Admittedly it was 2.6.14 when I last looked at it, but > > > > there used to be a possibility for XFS to submit I/O from a timer when > > > > the threads are frozen but the bdev isn't frozen. Has that changed? > > > > > > I didn't think we've ever done that - periodic or delayed operations > > > are passed off to the kernel threads to execute. A stack trace > > > (if you still have it) would be really help here. > > > > > > Hmmm - we have a couple of per-cpu work queues as well that are > > > used on I/O completion and that can, in some circumstances, > > > trigger new transactions. If we are only flush metadata, then > > > I don't think that any more I/o will be issued, but I could be > > > wrong (maze of twisty passages). > > > > Well, I think this exactly is the problem, because worker_threads run with > > PF_NOFREEZE set (as I've just said in another message). > > Ok, so freezing the filesystem is the only way you can prevent > this as the workqueues are flushed as part of quiescing the filesystem. Yes, I think so. Now at last I know what the problem actually is and why we need the freezing of filesystems, so thanks for helping me understand that. :-) Greetings, Rafael -- You never change things by fighting the existing reality. R. Buckminster Fuller ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-27 1:38 ` David Chinner 2006-10-27 14:37 ` Rafael J. Wysocki @ 2006-10-29 17:35 ` Pavel Machek [not found] ` <200610300029.25555.rjw@sisk.pl> 1 sibling, 1 reply; 28+ messages in thread From: Pavel Machek @ 2006-10-29 17:35 UTC (permalink / raw) To: David Chinner Cc: Rafael J. Wysocki, Nigel Cunningham, Andrew Morton, LKML, xfs Hi! > > > > As you have them at the moment, the threads seem to be freezing fine. > > > > The issue I've seen in the past related not to threads but to timer > > > > based activity. Admittedly it was 2.6.14 when I last looked at it, but > > > > there used to be a possibility for XFS to submit I/O from a timer when > > > > the threads are frozen but the bdev isn't frozen. Has that changed? > > > > > > I didn't think we've ever done that - periodic or delayed operations > > > are passed off to the kernel threads to execute. A stack trace > > > (if you still have it) would be really help here. > > > > > > Hmmm - we have a couple of per-cpu work queues as well that are > > > used on I/O completion and that can, in some circumstances, > > > trigger new transactions. If we are only flush metadata, then > > > I don't think that any more I/o will be issued, but I could be > > > wrong (maze of twisty passages). > > > > Well, I think this exactly is the problem, because worker_threads run with > > PF_NOFREEZE set (as I've just said in another message). > > Ok, so freezing the filesystem is the only way you can prevent > this as the workqueues are flushed as part of quiescing the filesystem. Well, alternative is to teach XFS to sense that we are being frozen and stop disk writes in such case. OTOH freeze_bdevs is perhaps not that bad solution... Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 28+ messages in thread
[parent not found: <200610300029.25555.rjw@sisk.pl>]
* Re: [PATCH] Freeze bdevs when freezing processes. [not found] ` <200610300029.25555.rjw@sisk.pl> @ 2006-10-29 23:46 ` Nigel Cunningham 0 siblings, 0 replies; 28+ messages in thread From: Nigel Cunningham @ 2006-10-29 23:46 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Pavel Machek, David Chinner, Andrew Morton, LKML, xfs, Christoph Hellwig Hi. On Mon, 2006-10-30 at 00:29 +0100, Rafael J. Wysocki wrote: > Hi, > > On Sunday, 29 October 2006 18:35, Pavel Machek wrote: > > Hi! > > > > > > > > As you have them at the moment, the threads seem to be freezing fine. > > > > > > The issue I've seen in the past related not to threads but to timer > > > > > > based activity. Admittedly it was 2.6.14 when I last looked at it, but > > > > > > there used to be a possibility for XFS to submit I/O from a timer when > > > > > > the threads are frozen but the bdev isn't frozen. Has that changed? > > > > > > > > > > I didn't think we've ever done that - periodic or delayed operations > > > > > are passed off to the kernel threads to execute. A stack trace > > > > > (if you still have it) would be really help here. > > > > > > > > > > Hmmm - we have a couple of per-cpu work queues as well that are > > > > > used on I/O completion and that can, in some circumstances, > > > > > trigger new transactions. If we are only flush metadata, then > > > > > I don't think that any more I/o will be issued, but I could be > > > > > wrong (maze of twisty passages). > > > > > > > > Well, I think this exactly is the problem, because worker_threads run with > > > > PF_NOFREEZE set (as I've just said in another message). > > > > > > Ok, so freezing the filesystem is the only way you can prevent > > > this as the workqueues are flushed as part of quiescing the filesystem. > > > > Well, alternative is to teach XFS to sense that we are being frozen > > and stop disk writes in such case. > > > > OTOH freeze_bdevs is perhaps not that bad solution... > > Okay, appended is a patch that implements the freezing of bdevs in a slightly > different way than the Nigel's patch did it. > As Christoph suggested, I have put freeze_filesystems() and thaw_filesystems() > into fs/buffer.c and indroduced the MS_FROZEN flag to mark frozen > filesystems. > > It seems to work fine, except I get the following trace from lockdep during > the suspend on a regular basis (not 100% reproducible, though): > > Stopping tasks... > ============================================= > [ INFO: possible recursive locking detected ] > 2.6.19-rc2-mm2 #15 > --------------------------------------------- > s2disk/5564 is trying to acquire lock: > (&bdev->bd_mount_mutex){--..}, at: [<ffffffff80475e79>] mutex_lock+0x9/0x10 > > but task is already holding lock: > (&bdev->bd_mount_mutex){--..}, at: [<ffffffff80475e79>] mutex_lock+0x9/0x10 > > other info that might help us debug this: > 3 locks held by s2disk/5564: > #0: (&bdev->bd_mount_mutex){--..}, at: [<ffffffff80475e79>] mutex_lock+0x9/0x10 > #1: (&type->s_umount_key#16){----}, at: [<ffffffff80291647>] get_super+0x67/0xc0 > #2: (&journal->j_barrier){--..}, at: [<ffffffff80475e79>] mutex_lock+0x9/0x10 > > stack backtrace: > > Call Trace: > [<ffffffff8020af79>] dump_trace+0xb9/0x430 > [<ffffffff8020b333>] show_trace+0x43/0x60 > [<ffffffff8020b635>] dump_stack+0x15/0x20 > [<ffffffff8024a1d1>] __lock_acquire+0x881/0xc60 > [<ffffffff8024a94d>] lock_acquire+0x8d/0xc0 > [<ffffffff80475cd4>] __mutex_lock_slowpath+0xd4/0x270 > [<ffffffff80475e79>] mutex_lock+0x9/0x10 > [<ffffffff802b2bb6>] freeze_bdev+0x16/0x80 > [<ffffffff802b3105>] freeze_filesystems+0x55/0x80 > [<ffffffff80255942>] freeze_processes+0x1e2/0x360 > [<ffffffff802592a3>] snapshot_ioctl+0x163/0x610 > [<ffffffff8029cf0b>] do_ioctl+0x6b/0xa0 > [<ffffffff8029d1eb>] vfs_ioctl+0x2ab/0x2d0 > [<ffffffff8029d27a>] sys_ioctl+0x6a/0xa0 > [<ffffffff80209c2e>] system_call+0x7e/0x83 > [<00002afb13a4d8a9>] > > done. > Shrinking memory... done (19126 pages freed) > > Greetings, > Rafael > > > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Heh. I've just prepared almost exactly the same patch (except for unwinding the process thawing). I haven't ever seen those mutex warnings - is that due to something new in -mm or one of those gazillion new warnings you can enable in vanilla? Apart from that, I'll add: Signed-off-by: Nigel Cunningham <nigel@suspend2.net> Regards, Nigel ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-26 8:57 ` David Chinner 2006-10-26 9:11 ` Rafael J. Wysocki @ 2006-10-26 9:18 ` Nigel Cunningham 1 sibling, 0 replies; 28+ messages in thread From: Nigel Cunningham @ 2006-10-26 9:18 UTC (permalink / raw) To: David Chinner; +Cc: Rafael J. Wysocki, Pavel Machek, Andrew Morton, LKML, xfs Hi. On Thu, 2006-10-26 at 18:57 +1000, David Chinner wrote: > I didn't think we've ever done that - periodic or delayed operations > are passed off to the kernel threads to execute. A stack trace > (if you still have it) would be really help here. I don't, but I know how to get it again. Will give it a go shortly. > Hmmm - we have a couple of per-cpu work queues as well that are > used on I/O completion and that can, in some circumstances, > trigger new transactions. If we are only flush metadata, then > I don't think that any more I/o will be issued, but I could be > wrong (maze of twisty passages). I can understand that. I'm trying to learn Xgl programming at the mo :) Nigel ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-26 8:18 ` Nigel Cunningham 2006-10-26 8:48 ` Rafael J. Wysocki 2006-10-26 8:57 ` David Chinner @ 2006-10-26 9:08 ` Rafael J. Wysocki 2 siblings, 0 replies; 28+ messages in thread From: Rafael J. Wysocki @ 2006-10-26 9:08 UTC (permalink / raw) To: Nigel Cunningham; +Cc: David Chinner, Pavel Machek, Andrew Morton, LKML, xfs On Thursday, 26 October 2006 10:18, Nigel Cunningham wrote: > Hi Dave. > > On Thu, 2006-10-26 at 17:30 +1000, David Chinner wrote: > > On Wed, Oct 25, 2006 at 09:05:56PM +0200, Rafael J. Wysocki wrote: > > > On Wednesday, 25 October 2006 15:23, Nigel Cunningham wrote: > > > > > > > > > > Well, my impression is that this is exactly what happens here: Something > > > > > in the XFS code causes metadata to be written to disk _after_ the atomic > > > > > snapshot. > > > > > > > > > > That's why I asked if the dirty XFS metadata were flushed by a kernel thread. > > > > > > > > When I first added bdev freezing it was because there was an XFS timer > > > > doing writes. > > > > > > Yes, I noticed you said that, but I'd like someone from the XFS team to either > > > confirm or deny it. > > > > We have daemons running in the background that can definitely do stuff > > after a sync. hmm - one does try_to_freeze() after a wakeup, the > > other does: > > > > if (unlikely(freezing(current))) { > > set_bit(XBT_FORCE_SLEEP, &target->bt_flags); > > refrigerator(); > > } else { > > clear_bit(XBT_FORCE_SLEEP, &target->bt_flags); > > } > > > > before it goes to sleep. So that one (xfsbufd - metadata buffer flushing) > > can definitely wake up after the sync and do work, and the other could if > > the kernel thread freeze occurs after the sync. > > > > Another good question at this point - exactly how should we be putting > > these thread to to sleep? Are both these valid methods for freezing them? > > And should we be freezing when we wake up instead of before we go to > > sleep? i.e. what are teh rules we are supposed to be following? > > As you have them at the moment, the threads seem to be freezing fine. > The issue I've seen in the past related not to threads but to timer > based activity. Admittedly it was 2.6.14 when I last looked at it, but > there used to be a possibility for XFS to submit I/O from a timer when > the threads are frozen but the bdev isn't frozen. Also there may be a problem if a workqueue is used for that, because worker_threads run with PF_NOFREEZE set. Greetings, Rafael -- You never change things by fighting the existing reality. R. Buckminster Fuller ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-24 15:29 ` Rafael J. Wysocki 2006-10-24 16:27 ` Oleg Verych 2006-10-24 16:33 ` David Chinner @ 2006-10-24 17:06 ` Christoph Hellwig 2006-10-24 21:26 ` Pavel Machek 2 siblings, 1 reply; 28+ messages in thread From: Christoph Hellwig @ 2006-10-24 17:06 UTC (permalink / raw) To: Rafael J. Wysocki Cc: David Chinner, Nigel Cunningham, Andrew Morton, LKML, Pavel Machek, xfs On Tue, Oct 24, 2006 at 05:29:59PM +0200, Rafael J. Wysocki wrote: > Do you mean calling sys_sync() after the userspace has been frozen > may not be sufficient? No, that's definitly not enough. You need to freeze_bdev to make sure data is on disk in the place it's expected by the filesystem without starting a log recovery. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-24 17:06 ` Christoph Hellwig @ 2006-10-24 21:26 ` Pavel Machek 2006-10-24 21:33 ` Christoph Hellwig 0 siblings, 1 reply; 28+ messages in thread From: Pavel Machek @ 2006-10-24 21:26 UTC (permalink / raw) To: Christoph Hellwig, Rafael J. Wysocki, David Chinner, Nigel Cunningham, Andrew Morton, LKML, xfs Hi! On Tue 2006-10-24 18:06:33, Christoph Hellwig wrote: > On Tue, Oct 24, 2006 at 05:29:59PM +0200, Rafael J. Wysocki wrote: > > Do you mean calling sys_sync() after the userspace has been frozen > > may not be sufficient? > > No, that's definitly not enough. You need to freeze_bdev to make sure > data is on disk in the place it's expected by the filesystem without > starting a log recovery. I believe log recovery is okay in this case. It can only happen when kernel dies during suspend or during resume... And log recovery seems okay in that case. We even guarantee that user did not loose any data -- by using sys_sync() after userland is stopped -- but let's not overdo over protections. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-24 21:26 ` Pavel Machek @ 2006-10-24 21:33 ` Christoph Hellwig 2006-10-24 21:43 ` Pavel Machek 0 siblings, 1 reply; 28+ messages in thread From: Christoph Hellwig @ 2006-10-24 21:33 UTC (permalink / raw) To: Pavel Machek Cc: Christoph Hellwig, Rafael J. Wysocki, David Chinner, Nigel Cunningham, Andrew Morton, LKML, xfs On Tue, Oct 24, 2006 at 11:26:48PM +0200, Pavel Machek wrote: > > No, that's definitly not enough. You need to freeze_bdev to make sure > > data is on disk in the place it's expected by the filesystem without > > starting a log recovery. > > I believe log recovery is okay in this case. > > It can only happen when kernel dies during suspend or during > resume... And log recovery seems okay in that case. We even guarantee > that user did not loose any data -- by using sys_sync() after userland > is stopped -- but let's not overdo over protections. You're still entirely missing the problem. Take a look at http://www.opengroup.org/onlinepubs/007908799/xsh/sync.html and the linux sync(2) manpage. The only thing sync guarantees is writing out all in-memory data to disk. It doesn't even gurantee completion, although we've been synchronous in Linux for a while. What it does not gurantee is where on disk the data is located. Now for a journaling filesystem pushing everything to the log is the easiest way to complete sync, and it's perfectly valid - if the system crashes after the sync and before data is written back to it's normal place on disk the system notices it's not been unmounted cleanly and will do a log recovery. In the suspend case however the system neither crashes nor is unmounted - thus the filesystem doesn't know it has to recover the log. We have to choices to fix this: (1) force a log recovery of an already mounted and in use filesystem (2) make sure data is in the right place before suspending (1) is pretty nasty, and hard to do across filesystems. (2) is already implemented and easily useable by the suspend code. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-24 21:33 ` Christoph Hellwig @ 2006-10-24 21:43 ` Pavel Machek 0 siblings, 0 replies; 28+ messages in thread From: Pavel Machek @ 2006-10-24 21:43 UTC (permalink / raw) To: Christoph Hellwig, Rafael J. Wysocki, David Chinner, Nigel Cunningham, Andrew Morton, LKML, xfs Hi! On Tue 2006-10-24 22:33:42, Christoph Hellwig wrote: > On Tue, Oct 24, 2006 at 11:26:48PM +0200, Pavel Machek wrote: > > > No, that's definitly not enough. You need to freeze_bdev to make sure > > > data is on disk in the place it's expected by the filesystem without > > > starting a log recovery. > > > > I believe log recovery is okay in this case. > > > > It can only happen when kernel dies during suspend or during > > resume... And log recovery seems okay in that case. We even guarantee > > that user did not loose any data -- by using sys_sync() after userland > > is stopped -- but let's not overdo over protections. > > You're still entirely missing the problem. > > Take a look at http://www.opengroup.org/onlinepubs/007908799/xsh/sync.html > and the linux sync(2) manpage. The only thing sync guarantees is writing > out all in-memory data to disk. It doesn't even gurantee completion, > although we've been synchronous in Linux for a while. Ok, I assume sys_sync is synchronous, but that's okay. > What it does not gurantee is where on disk the data is located. Now for > a journaling filesystem pushing everything to the log is the easiest way > to complete sync, and it's perfectly valid - if the system crashes after > the sync and before data is written back to it's normal place on disk > the system notices it's not been unmounted cleanly and will do a log > recovery. In the suspend case however the system neither crashes nor > is unmounted - thus the filesystem doesn't know it has to recover the > log. We have to choices to fix this: > > (1) force a log recovery of an already mounted and in use filesystem > (2) make sure data is in the right place before suspending > > (1) is pretty nasty, and hard to do across filesystems. (2) is already > implemented and easily useable by the suspend code. No, there's no need to do either (1) or (2) in "machine suspended and resumed successfully". In that case, machine just continues as if no suspend has happened. In fact I could remove sys_sync() from freezer. suspend code would still be correct. That sys_sync() only matters in case of suspend but machine died during resume... and in that case we know we crashed, and journal recovery is okay. I do know how journaling works (from 10000feet, anyway), and it is okay in this case. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Freeze bdevs when freezing processes. 2006-10-24 14:44 ` [PATCH] Freeze bdevs when freezing processes David Chinner 2006-10-24 15:29 ` Rafael J. Wysocki @ 2006-10-24 22:19 ` Nigel Cunningham 1 sibling, 0 replies; 28+ messages in thread From: Nigel Cunningham @ 2006-10-24 22:19 UTC (permalink / raw) To: David Chinner; +Cc: Rafael J. Wysocki, Andrew Morton, LKML, Pavel Machek, xfs Hi David. On Wed, 2006-10-25 at 00:44 +1000, David Chinner wrote: > On Mon, Oct 23, 2006 at 12:36:53PM +0200, Rafael J. Wysocki wrote: > > On Monday, 23 October 2006 06:12, Nigel Cunningham wrote: > > > XFS can continue to submit I/O from a timer routine, even after > > > freezeable kernel and userspace threads are frozen. This doesn't seem to > > > be an issue for current swsusp code, > > > > So it doesn't look like we need the patch _now_. > > > > > but is definitely an issue for Suspend2, where the pages being written could > > > be overwritten by Suspend2's atomic copy. > > > > And IMO that's a good reason why we shouldn't use RCU pages for storing the > > image. XFS is one known example that breaks things if we do so and > > there may be more such things that we don't know of. The fact that they > > haven't appeared in testing so far doesn't mean they don't exist and > > moreover some things like that may appear in the future. > > Could you please tell us which XFS bits are broken so we can get > them fixed? The XFS daemons should all be checking if they are > supposed to freeze (i.e. they call try_to_freeze() after they wake > up due to timer expiry) so I thought they were doing the right > thing. The problem (in my experience) isn't the threads but a timer that submits I/O even when the threads are frozen. It stops when the bdev is frozen. The last report I've seen was before I added bdev freezing to suspend2, which was 2.6.14, so you guys may have fixed it since then. I can seek to get a trace if you like. Regards, Nigel ^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2006-10-29 23:59 UTC | newest]
Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <1161576735.3466.7.camel@nigel.suspend2.net>
[not found] ` <200610231236.54317.rjw@sisk.pl>
2006-10-24 14:44 ` [PATCH] Freeze bdevs when freezing processes David Chinner
2006-10-24 15:29 ` Rafael J. Wysocki
2006-10-24 16:27 ` Oleg Verych
2006-10-25 8:05 ` Pavel Machek
2006-10-24 16:33 ` David Chinner
2006-10-24 21:37 ` Pavel Machek
2006-10-25 0:13 ` David Chinner
2006-10-25 8:10 ` Pavel Machek
2006-10-25 8:38 ` David Chinner
2006-10-25 8:47 ` Pavel Machek
2006-10-25 12:32 ` Rafael J. Wysocki
2006-10-25 13:23 ` Nigel Cunningham
[not found] ` <200610252105.56862.rjw@sisk.pl>
2006-10-26 7:30 ` David Chinner
2006-10-26 8:18 ` Nigel Cunningham
2006-10-26 8:48 ` Rafael J. Wysocki
2006-10-26 8:57 ` David Chinner
2006-10-26 9:11 ` Rafael J. Wysocki
2006-10-27 1:38 ` David Chinner
2006-10-27 14:37 ` Rafael J. Wysocki
2006-10-29 17:35 ` Pavel Machek
[not found] ` <200610300029.25555.rjw@sisk.pl>
2006-10-29 23:46 ` Nigel Cunningham
2006-10-26 9:18 ` Nigel Cunningham
2006-10-26 9:08 ` Rafael J. Wysocki
2006-10-24 17:06 ` Christoph Hellwig
2006-10-24 21:26 ` Pavel Machek
2006-10-24 21:33 ` Christoph Hellwig
2006-10-24 21:43 ` Pavel Machek
2006-10-24 22:19 ` Nigel Cunningham
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox