* [bug] GNOME loses all settings following failure to resume from suspend @ 2022-01-05 17:34 Chris Murphy 2022-01-05 18:04 ` Filipe Manana 0 siblings, 1 reply; 18+ messages in thread From: Chris Murphy @ 2022-01-05 17:34 UTC (permalink / raw) To: Btrfs BTRFS; +Cc: Josef Bacik https://gitlab.gnome.org/GNOME/dconf/-/issues/73 Following a crash, instead of either the old or new dconf database file being present, a corrupt one is present. dconf uses g_file_set_contents() to atomically update the database file, which effectively inhibits (one or more?) fsync's, yet somehow in the crash/powerfail case this is resulting in a corrupt dconf database. I don't know if by "corrupt" this is a 0 length file or some other effect. Thanks, -- Chris Murphy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-05 17:34 [bug] GNOME loses all settings following failure to resume from suspend Chris Murphy @ 2022-01-05 18:04 ` Filipe Manana 2022-01-05 18:32 ` Filipe Manana ` (3 more replies) 0 siblings, 4 replies; 18+ messages in thread From: Filipe Manana @ 2022-01-05 18:04 UTC (permalink / raw) To: Chris Murphy; +Cc: Btrfs BTRFS, Josef Bacik On Wed, Jan 05, 2022 at 10:34:10AM -0700, Chris Murphy wrote: > https://gitlab.gnome.org/GNOME/dconf/-/issues/73 > > Following a crash, instead of either the old or new dconf database > file being present, a corrupt one is present. > > dconf uses g_file_set_contents() to atomically update the database > file, which effectively inhibits (one or more?) fsync's, yet somehow > in the crash/powerfail case this is resulting in a corrupt dconf > database. I don't know if by "corrupt" this is a 0 length file or some > other effect. Looking at the issue, Sebastian Keller posted a patch to disable one optimization in glib, that should fix it. Looking at the code before that patch, it explicitly skips fsync after a rename pointing out to: https://btrfs.wiki.kernel.org/index.php/FAQ#What_are_the_crash_guarantees_of_overwrite-by-rename.3F I'm afraid that information is wrong, perhaps it might have been true in some very distant past, but certainly not for many years. The wiki says, doing something like this: echo "oldcontent" > file # make sure oldcontent is on disk sync echo "newcontent" > file.tmp mv -f file.tmp file # *crash* Will give either: file contains "newcontent"; file.tmp does not exist file contains "oldcontent"; file.tmp may contain "newcontent", be zero-length or not exists at all. However that's not true, there's a chance 'file' will be empty. During a rename with overwrite we trigger writeback (via filemap_flush()) of the file being renamed but never wait for it to complete before returning to user space. So what can happen is: 1) We trigger writeback; 2) We join the current transaction and do the rename; 3) We return from rename to user space with success (0); 4) Writeback didn't finish yet, or it has finished but the ordered extent is not yet complete - i.e. btrfs_finish_ordered_io() did not complete yet; 5) The transaction used by the rename is committed; A transaction commit does not wait for any writeback or in flight ordered extents to complete - except if the fs is mounted with "-o flushoncommit"; 6) Crash After mounting the fs again 'file' is empty and 'file.tmp' does not exists. The only for that to guarantee 'file' is not empty and has the expected data ("newcontent"), would be to mount with -o flushoncommit. I don't think I have a wiki account enabled, but I'll see if I get that updated soon. Thanks. > > Thanks, > > -- > Chris Murphy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-05 18:04 ` Filipe Manana @ 2022-01-05 18:32 ` Filipe Manana 2022-01-05 18:34 ` Hugo Mills ` (2 subsequent siblings) 3 siblings, 0 replies; 18+ messages in thread From: Filipe Manana @ 2022-01-05 18:32 UTC (permalink / raw) To: Chris Murphy; +Cc: Btrfs BTRFS, Josef Bacik On Wed, Jan 05, 2022 at 06:04:38PM +0000, Filipe Manana wrote: > On Wed, Jan 05, 2022 at 10:34:10AM -0700, Chris Murphy wrote: > > https://gitlab.gnome.org/GNOME/dconf/-/issues/73 > > > > Following a crash, instead of either the old or new dconf database > > file being present, a corrupt one is present. > > > > dconf uses g_file_set_contents() to atomically update the database > > file, which effectively inhibits (one or more?) fsync's, yet somehow > > in the crash/powerfail case this is resulting in a corrupt dconf > > database. I don't know if by "corrupt" this is a 0 length file or some > > other effect. > > Looking at the issue, Sebastian Keller posted a patch to disable one > optimization in glib, that should fix it. > > Looking at the code before that patch, it explicitly skips fsync after > a rename pointing out to: > > https://btrfs.wiki.kernel.org/index.php/FAQ#What_are_the_crash_guarantees_of_overwrite-by-rename.3F > > I'm afraid that information is wrong, perhaps it might have been true in > some very distant past, but certainly not for many years. After some digging in git history: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d875f95da43c6a8f18f77869f2ef26e9594fecc So that guarantee existed until August 2014. Not anymore, and no one should rely on it (except when using "-o flushoncommit"). > > The wiki says, doing something like this: > > echo "oldcontent" > file > > # make sure oldcontent is on disk > sync > > echo "newcontent" > file.tmp > mv -f file.tmp file > > # *crash* > > Will give either: > > file contains "newcontent"; file.tmp does not exist > file contains "oldcontent"; file.tmp may contain "newcontent", be zero-length or not exists at all. > > However that's not true, there's a chance 'file' will be empty. > > During a rename with overwrite we trigger writeback (via filemap_flush()) > of the file being renamed but never wait for it to complete before > returning to user space. So what can happen is: > > 1) We trigger writeback; > > 2) We join the current transaction and do the rename; > > 3) We return from rename to user space with success (0); > > 4) Writeback didn't finish yet, or it has finished but the > ordered extent is not yet complete - i.e. btrfs_finish_ordered_io() > did not complete yet; > > 5) The transaction used by the rename is committed; > A transaction commit does not wait for any writeback or in flight > ordered extents to complete - except if the fs is mounted with > "-o flushoncommit"; > > 6) Crash > > After mounting the fs again 'file' is empty and 'file.tmp' does not exists. > > The only for that to guarantee 'file' is not empty and has the expected > data ("newcontent"), would be to mount with -o flushoncommit. > > I don't think I have a wiki account enabled, but I'll see if I get that > updated soon. > > Thanks. > > > > > Thanks, > > > > -- > > Chris Murphy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-05 18:04 ` Filipe Manana 2022-01-05 18:32 ` Filipe Manana @ 2022-01-05 18:34 ` Hugo Mills 2022-01-05 20:38 ` Filipe Manana 2022-01-05 18:40 ` Chris Murphy 2022-01-09 17:04 ` Remi Gauvin 3 siblings, 1 reply; 18+ messages in thread From: Hugo Mills @ 2022-01-05 18:34 UTC (permalink / raw) To: Filipe Manana; +Cc: Chris Murphy, Btrfs BTRFS, Josef Bacik Hi, Filipe, On Wed, Jan 05, 2022 at 06:04:38PM +0000, Filipe Manana wrote: > I don't think I have a wiki account enabled, but I'll see if I get that > updated soon. If you can't (or don't want to), feel free to put the text you want to replace it with here, and I'll update the wiki for you... Hugo. -- Hugo Mills | "There's a Martian war machine outside -- they want hugo@... carfax.org.uk | to talk to you about a cure for the common cold." http://carfax.org.uk/ | PGP: E2AB1DE4 | Stephen Franklin, Babylon 5 ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-05 18:34 ` Hugo Mills @ 2022-01-05 20:38 ` Filipe Manana 2022-01-05 21:31 ` Hugo Mills 0 siblings, 1 reply; 18+ messages in thread From: Filipe Manana @ 2022-01-05 20:38 UTC (permalink / raw) To: Hugo Mills, Filipe Manana, Chris Murphy, Btrfs BTRFS, Josef Bacik On Wed, Jan 5, 2022 at 6:34 PM Hugo Mills <hugo@carfax.org.uk> wrote: > > Hi, Filipe, > > On Wed, Jan 05, 2022 at 06:04:38PM +0000, Filipe Manana wrote: > > I don't think I have a wiki account enabled, but I'll see if I get that > > updated soon. > > If you can't (or don't want to), feel free to put the text you want > to replace it with here, and I'll update the wiki for you... Hi Hugo, That would be great. I don't have a concrete text, but as you are a native english speaker, a version from you would sound better :) Perhaps just mention that as of kernel 3.17 (and maybe point to that commit too), the behaviour is no longer guaranteed, and we can end up getting a file of 0 bytes. So an explicit fsync on the file is needed (just like ext4 and other filesystems). I asked for an account creation before seeing your reply. Anyway, if you want to do it, go ahead. Thanks. > > Hugo. > > -- > Hugo Mills | "There's a Martian war machine outside -- they want > hugo@... carfax.org.uk | to talk to you about a cure for the common cold." > http://carfax.org.uk/ | > PGP: E2AB1DE4 | Stephen Franklin, Babylon 5 ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-05 20:38 ` Filipe Manana @ 2022-01-05 21:31 ` Hugo Mills 2022-01-05 21:39 ` Hugo Mills 0 siblings, 1 reply; 18+ messages in thread From: Hugo Mills @ 2022-01-05 21:31 UTC (permalink / raw) To: Filipe Manana; +Cc: Chris Murphy, Btrfs BTRFS, Josef Bacik On Wed, Jan 05, 2022 at 08:38:37PM +0000, Filipe Manana wrote: > On Wed, Jan 5, 2022 at 6:34 PM Hugo Mills <hugo@carfax.org.uk> wrote: > > > > Hi, Filipe, > > > > On Wed, Jan 05, 2022 at 06:04:38PM +0000, Filipe Manana wrote: > > > I don't think I have a wiki account enabled, but I'll see if I get that > > > updated soon. > > > > If you can't (or don't want to), feel free to put the text you want > > to replace it with here, and I'll update the wiki for you... > > Hi Hugo, > > That would be great. > I don't have a concrete text, but as you are a native english speaker, > a version from you would sound better :) > > Perhaps just mention that as of kernel 3.17 (and maybe point to that > commit too), the behaviour is no longer guaranteed, and we can end up > getting a file of 0 bytes. I'd rather not reinforce the wrong usage with an example of it. :) Better to document the correct usage... > So an explicit fsync on the file is needed (just like ext4 and other > filesystems). At what point in the process does the fsync need to be done? Before/after/instead of the sync? Hugo. -- Hugo Mills | Geek, n.: hugo@... carfax.org.uk | Circus sideshow performer specialising in the eating http://carfax.org.uk/ | of live animals. PGP: E2AB1DE4 | OED ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-05 21:31 ` Hugo Mills @ 2022-01-05 21:39 ` Hugo Mills 2022-01-05 21:53 ` Hugo Mills 0 siblings, 1 reply; 18+ messages in thread From: Hugo Mills @ 2022-01-05 21:39 UTC (permalink / raw) To: Filipe Manana, Chris Murphy, Btrfs BTRFS, Josef Bacik On Wed, Jan 05, 2022 at 09:31:57PM +0000, Hugo Mills wrote: > On Wed, Jan 05, 2022 at 08:38:37PM +0000, Filipe Manana wrote: > > On Wed, Jan 5, 2022 at 6:34 PM Hugo Mills <hugo@carfax.org.uk> wrote: > > > > > > Hi, Filipe, > > > > > > On Wed, Jan 05, 2022 at 06:04:38PM +0000, Filipe Manana wrote: > > > > I don't think I have a wiki account enabled, but I'll see if I get that > > > > updated soon. > > > > > > If you can't (or don't want to), feel free to put the text you want > > > to replace it with here, and I'll update the wiki for you... > > > > Hi Hugo, > > > > That would be great. > > I don't have a concrete text, but as you are a native english speaker, > > a version from you would sound better :) > > > > Perhaps just mention that as of kernel 3.17 (and maybe point to that > > commit too), the behaviour is no longer guaranteed, and we can end up > > getting a file of 0 bytes. > > I'd rather not reinforce the wrong usage with an example of it. :) > Better to document the correct usage... > > > So an explicit fsync on the file is needed (just like ext4 and other > > filesystems). > > At what point in the process does the fsync need to be done? > Before/after/instead of the sync? Hmm. That doesn't make sense, of course (sorry, it's late, I've had a hard day). I'm guessing that the fsync needs to go after the write of the new data and before the rename. Is there any other constraint on what needs to be done to make this work safely? Hugo. -- Hugo Mills | Hickory Dickory Dock, hugo@... carfax.org.uk | Three mice ran up the clock. http://carfax.org.uk/ | The clock struck one, PGP: E2AB1DE4 | The other two escaped with minor injuries ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-05 21:39 ` Hugo Mills @ 2022-01-05 21:53 ` Hugo Mills 2022-01-06 9:51 ` Filipe Manana 0 siblings, 1 reply; 18+ messages in thread From: Hugo Mills @ 2022-01-05 21:53 UTC (permalink / raw) To: Filipe Manana, Chris Murphy, Btrfs BTRFS, Josef Bacik On Wed, Jan 05, 2022 at 09:39:21PM +0000, Hugo Mills wrote: > On Wed, Jan 05, 2022 at 09:31:57PM +0000, Hugo Mills wrote: > > On Wed, Jan 05, 2022 at 08:38:37PM +0000, Filipe Manana wrote: > > > On Wed, Jan 5, 2022 at 6:34 PM Hugo Mills <hugo@carfax.org.uk> wrote: > > > > > > > > Hi, Filipe, > > > > > > > > On Wed, Jan 05, 2022 at 06:04:38PM +0000, Filipe Manana wrote: > > > > > I don't think I have a wiki account enabled, but I'll see if I get that > > > > > updated soon. > > > > > > > > If you can't (or don't want to), feel free to put the text you want > > > > to replace it with here, and I'll update the wiki for you... > > > > > > Hi Hugo, > > > > > > That would be great. > > > I don't have a concrete text, but as you are a native english speaker, > > > a version from you would sound better :) > > > > > > Perhaps just mention that as of kernel 3.17 (and maybe point to that > > > commit too), the behaviour is no longer guaranteed, and we can end up > > > getting a file of 0 bytes. > > > > I'd rather not reinforce the wrong usage with an example of it. :) > > Better to document the correct usage... > > > > > So an explicit fsync on the file is needed (just like ext4 and other > > > filesystems). > > > > At what point in the process does the fsync need to be done? > > Before/after/instead of the sync? > > Hmm. That doesn't make sense, of course (sorry, it's late, I've had > a hard day). I'm guessing that the fsync needs to go after the write > of the new data and before the rename. Is there any other constraint > on what needs to be done to make this work safely? Right, I think I've got it. Ping me in the morning if it's not correct. Hugo. -- Hugo Mills | Great films about cricket: Monster's No-Ball hugo@... carfax.org.uk | http://carfax.org.uk/ | PGP: E2AB1DE4 | ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-05 21:53 ` Hugo Mills @ 2022-01-06 9:51 ` Filipe Manana 2022-01-06 10:20 ` Hugo Mills 0 siblings, 1 reply; 18+ messages in thread From: Filipe Manana @ 2022-01-06 9:51 UTC (permalink / raw) To: Hugo Mills, Filipe Manana, Chris Murphy, Btrfs BTRFS, Josef Bacik On Wed, Jan 5, 2022 at 9:54 PM Hugo Mills <hugo@carfax.org.uk> wrote: > > On Wed, Jan 05, 2022 at 09:39:21PM +0000, Hugo Mills wrote: > > On Wed, Jan 05, 2022 at 09:31:57PM +0000, Hugo Mills wrote: > > > On Wed, Jan 05, 2022 at 08:38:37PM +0000, Filipe Manana wrote: > > > > On Wed, Jan 5, 2022 at 6:34 PM Hugo Mills <hugo@carfax.org.uk> wrote: > > > > > > > > > > Hi, Filipe, > > > > > > > > > > On Wed, Jan 05, 2022 at 06:04:38PM +0000, Filipe Manana wrote: > > > > > > I don't think I have a wiki account enabled, but I'll see if I get that > > > > > > updated soon. > > > > > > > > > > If you can't (or don't want to), feel free to put the text you want > > > > > to replace it with here, and I'll update the wiki for you... > > > > > > > > Hi Hugo, > > > > > > > > That would be great. > > > > I don't have a concrete text, but as you are a native english speaker, > > > > a version from you would sound better :) > > > > > > > > Perhaps just mention that as of kernel 3.17 (and maybe point to that > > > > commit too), the behaviour is no longer guaranteed, and we can end up > > > > getting a file of 0 bytes. > > > > > > I'd rather not reinforce the wrong usage with an example of it. :) > > > Better to document the correct usage... > > > > > > > So an explicit fsync on the file is needed (just like ext4 and other > > > > filesystems). > > > > > > At what point in the process does the fsync need to be done? > > > Before/after/instead of the sync? > > > > Hmm. That doesn't make sense, of course (sorry, it's late, I've had > > a hard day). I'm guessing that the fsync needs to go after the write > > of the new data and before the rename. Is there any other constraint > > on what needs to be done to make this work safely? > > Right, I think I've got it. Ping me in the morning if it's not > correct. Yep, that's correct Hugo. Starting with 3.17, the example on the wiki needs an fsync on "file.tmp" after writing to it and before renaming it over "file". I.e. the full example should be: **** echo "oldcontent" > file # make sure oldcontent is on disk sync echo "newcontent" > file.tmp fsync file.tmp mv -f file.tmp file # *crash* Will give either file contains "newcontent"; file.tmp does not exist file contains "oldcontent"; file.tmp may contain "newcontent", be zero-length or not exists at all. **** > > Hugo. > > -- > Hugo Mills | Great films about cricket: Monster's No-Ball > hugo@... carfax.org.uk | > http://carfax.org.uk/ | > PGP: E2AB1DE4 | ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-06 9:51 ` Filipe Manana @ 2022-01-06 10:20 ` Hugo Mills 2022-01-06 10:27 ` Filipe Manana 2022-01-06 21:07 ` Adam Borowski 0 siblings, 2 replies; 18+ messages in thread From: Hugo Mills @ 2022-01-06 10:20 UTC (permalink / raw) To: Filipe Manana; +Cc: Chris Murphy, Btrfs BTRFS, Josef Bacik On Thu, Jan 06, 2022 at 09:51:16AM +0000, Filipe Manana wrote: > On Wed, Jan 5, 2022 at 9:54 PM Hugo Mills <hugo@carfax.org.uk> wrote: > > > > On Wed, Jan 05, 2022 at 09:39:21PM +0000, Hugo Mills wrote: > > > On Wed, Jan 05, 2022 at 09:31:57PM +0000, Hugo Mills wrote: > > > > On Wed, Jan 05, 2022 at 08:38:37PM +0000, Filipe Manana wrote: > > > > > On Wed, Jan 5, 2022 at 6:34 PM Hugo Mills <hugo@carfax.org.uk> wrote: > > > > > > > > > > > > Hi, Filipe, > > > > > > > > > > > > On Wed, Jan 05, 2022 at 06:04:38PM +0000, Filipe Manana wrote: > > > > > > > I don't think I have a wiki account enabled, but I'll see if I get that > > > > > > > updated soon. > > > > > > > > > > > > If you can't (or don't want to), feel free to put the text you want > > > > > > to replace it with here, and I'll update the wiki for you... > > > > > > > > > > Hi Hugo, > > > > > > > > > > That would be great. > > > > > I don't have a concrete text, but as you are a native english speaker, > > > > > a version from you would sound better :) > > > > > > > > > > Perhaps just mention that as of kernel 3.17 (and maybe point to that > > > > > commit too), the behaviour is no longer guaranteed, and we can end up > > > > > getting a file of 0 bytes. > > > > > > > > I'd rather not reinforce the wrong usage with an example of it. :) > > > > Better to document the correct usage... > > > > > > > > > So an explicit fsync on the file is needed (just like ext4 and other > > > > > filesystems). > > > > > > > > At what point in the process does the fsync need to be done? > > > > Before/after/instead of the sync? > > > > > > Hmm. That doesn't make sense, of course (sorry, it's late, I've had > > > a hard day). I'm guessing that the fsync needs to go after the write > > > of the new data and before the rename. Is there any other constraint > > > on what needs to be done to make this work safely? > > > > Right, I think I've got it. Ping me in the morning if it's not > > correct. > > Yep, that's correct Hugo. > > Starting with 3.17, the example on the wiki needs an fsync on > "file.tmp" after writing to it and before renaming it over "file". > I.e. the full example should be: > > **** > echo "oldcontent" > file > > # make sure oldcontent is on disk > sync > > echo "newcontent" > file.tmp > fsync file.tmp > mv -f file.tmp file > > # *crash* > > Will give either > > file contains "newcontent"; file.tmp does not exist > file contains "oldcontent"; file.tmp may contain "newcontent", be > zero-length or not exists at all. > **** Since I can't find an "fsync" command line tool, I've rewritten the example in more general terms, rather than tying it to a specific implementation (such as a sequence of shell commands). I note that we've got a near-duplicate entry in the FAQ, two items down ("What are the crash guarantees of rename?"), that should probably be removed. Updated entry: https://btrfs.wiki.kernel.org/index.php/FAQ#What_are_the_crash_guarantees_of_overwrite-by-rename.3F Candidate for deletion: https://btrfs.wiki.kernel.org/index.php/FAQ#What_are_the_crash_guarantees_of_rename.3F Hugo. -- Hugo Mills | I thought I'd discovered a new colour, but it was hugo@... carfax.org.uk | just a pigment of my imagination. http://carfax.org.uk/ | PGP: E2AB1DE4 | ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-06 10:20 ` Hugo Mills @ 2022-01-06 10:27 ` Filipe Manana 2022-01-06 20:02 ` Chris Murphy 2022-01-06 21:07 ` Adam Borowski 1 sibling, 1 reply; 18+ messages in thread From: Filipe Manana @ 2022-01-06 10:27 UTC (permalink / raw) To: Hugo Mills, Filipe Manana, Chris Murphy, Btrfs BTRFS, Josef Bacik On Thu, Jan 6, 2022 at 10:21 AM Hugo Mills <hugo@carfax.org.uk> wrote: > > On Thu, Jan 06, 2022 at 09:51:16AM +0000, Filipe Manana wrote: > > On Wed, Jan 5, 2022 at 9:54 PM Hugo Mills <hugo@carfax.org.uk> wrote: > > > > > > On Wed, Jan 05, 2022 at 09:39:21PM +0000, Hugo Mills wrote: > > > > On Wed, Jan 05, 2022 at 09:31:57PM +0000, Hugo Mills wrote: > > > > > On Wed, Jan 05, 2022 at 08:38:37PM +0000, Filipe Manana wrote: > > > > > > On Wed, Jan 5, 2022 at 6:34 PM Hugo Mills <hugo@carfax.org.uk> wrote: > > > > > > > > > > > > > > Hi, Filipe, > > > > > > > > > > > > > > On Wed, Jan 05, 2022 at 06:04:38PM +0000, Filipe Manana wrote: > > > > > > > > I don't think I have a wiki account enabled, but I'll see if I get that > > > > > > > > updated soon. > > > > > > > > > > > > > > If you can't (or don't want to), feel free to put the text you want > > > > > > > to replace it with here, and I'll update the wiki for you... > > > > > > > > > > > > Hi Hugo, > > > > > > > > > > > > That would be great. > > > > > > I don't have a concrete text, but as you are a native english speaker, > > > > > > a version from you would sound better :) > > > > > > > > > > > > Perhaps just mention that as of kernel 3.17 (and maybe point to that > > > > > > commit too), the behaviour is no longer guaranteed, and we can end up > > > > > > getting a file of 0 bytes. > > > > > > > > > > I'd rather not reinforce the wrong usage with an example of it. :) > > > > > Better to document the correct usage... > > > > > > > > > > > So an explicit fsync on the file is needed (just like ext4 and other > > > > > > filesystems). > > > > > > > > > > At what point in the process does the fsync need to be done? > > > > > Before/after/instead of the sync? > > > > > > > > Hmm. That doesn't make sense, of course (sorry, it's late, I've had > > > > a hard day). I'm guessing that the fsync needs to go after the write > > > > of the new data and before the rename. Is there any other constraint > > > > on what needs to be done to make this work safely? > > > > > > Right, I think I've got it. Ping me in the morning if it's not > > > correct. > > > > Yep, that's correct Hugo. > > > > Starting with 3.17, the example on the wiki needs an fsync on > > "file.tmp" after writing to it and before renaming it over "file". > > I.e. the full example should be: > > > > **** > > echo "oldcontent" > file > > > > # make sure oldcontent is on disk > > sync > > > > echo "newcontent" > file.tmp > > fsync file.tmp > > mv -f file.tmp file > > > > # *crash* > > > > Will give either > > > > file contains "newcontent"; file.tmp does not exist > > file contains "oldcontent"; file.tmp may contain "newcontent", be > > zero-length or not exists at all. > > **** > > Since I can't find an "fsync" command line tool, I've rewritten the > example in more general terms, rather than tying it to a specific > implementation (such as a sequence of shell commands). I note that > we've got a near-duplicate entry in the FAQ, two items down ("What are > the crash guarantees of rename?"), that should probably be removed. > > Updated entry: https://btrfs.wiki.kernel.org/index.php/FAQ#What_are_the_crash_guarantees_of_overwrite-by-rename.3F > Candidate for deletion: https://btrfs.wiki.kernel.org/index.php/FAQ#What_are_the_crash_guarantees_of_rename.3F There's the xfs_io command from xfsprogs that can be used to trigger an fsync like this: xfs_io -c fsync /path/to/file But it's probably not well known for non-developers. Anyway, as it is, it looks perfect to me, thanks! > > Hugo. > > -- > Hugo Mills | I thought I'd discovered a new colour, but it was > hugo@... carfax.org.uk | just a pigment of my imagination. > http://carfax.org.uk/ | > PGP: E2AB1DE4 | ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-06 10:27 ` Filipe Manana @ 2022-01-06 20:02 ` Chris Murphy 2022-01-06 20:06 ` Chris Murphy 0 siblings, 1 reply; 18+ messages in thread From: Chris Murphy @ 2022-01-06 20:02 UTC (permalink / raw) To: Filipe Manana; +Cc: Hugo Mills, Chris Murphy, Btrfs BTRFS, Josef Bacik On Thu, Jan 6, 2022 at 3:28 AM Filipe Manana <fdmanana@kernel.org> wrote: > > On Thu, Jan 6, 2022 at 10:21 AM Hugo Mills <hugo@carfax.org.uk> wrote: > > > > > Yep, that's correct Hugo. > > > > > > Starting with 3.17, the example on the wiki needs an fsync on > > > "file.tmp" after writing to it and before renaming it over "file". > > > I.e. the full example should be: > > > > > > **** > > > echo "oldcontent" > file > > > > > > # make sure oldcontent is on disk > > > sync > > > > > > echo "newcontent" > file.tmp > > > fsync file.tmp > > > mv -f file.tmp file > > > > > > # *crash* > > > > > > Will give either > > > > > > file contains "newcontent"; file.tmp does not exist > > > file contains "oldcontent"; file.tmp may contain "newcontent", be > > > zero-length or not exists at all. > > > **** > > > > Since I can't find an "fsync" command line tool, I've rewritten the > > example in more general terms, rather than tying it to a specific > > implementation (such as a sequence of shell commands). I note that > > we've got a near-duplicate entry in the FAQ, two items down ("What are > > the crash guarantees of rename?"), that should probably be removed. > > > > Updated entry: https://btrfs.wiki.kernel.org/index.php/FAQ#What_are_the_crash_guarantees_of_overwrite-by-rename.3F > > Candidate for deletion: https://btrfs.wiki.kernel.org/index.php/FAQ#What_are_the_crash_guarantees_of_rename.3F > > There's the xfs_io command from xfsprogs that can be used to trigger > an fsync like this: xfs_io -c fsync /path/to/file > But it's probably not well known for non-developers. > > Anyway, as it is, it looks perfect to me, thanks! I think it's OK to use pseudo-code. Folks will figure it out. So you can just write it as fsync() and even if you're not using the exact correct syntax it'll be understood. -- Chris Murphy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-06 20:02 ` Chris Murphy @ 2022-01-06 20:06 ` Chris Murphy 2022-01-06 20:23 ` Hugo Mills 0 siblings, 1 reply; 18+ messages in thread From: Chris Murphy @ 2022-01-06 20:06 UTC (permalink / raw) To: Chris Murphy; +Cc: Filipe Manana, Hugo Mills, Btrfs BTRFS, Josef Bacik On Thu, Jan 6, 2022 at 1:02 PM Chris Murphy <lists@colorremedies.com> wrote: > > On Thu, Jan 6, 2022 at 3:28 AM Filipe Manana <fdmanana@kernel.org> wrote: > > > > On Thu, Jan 6, 2022 at 10:21 AM Hugo Mills <hugo@carfax.org.uk> wrote: > > > > > > > Yep, that's correct Hugo. > > > > > > > > Starting with 3.17, the example on the wiki needs an fsync on > > > > "file.tmp" after writing to it and before renaming it over "file". > > > > I.e. the full example should be: > > > > > > > > **** > > > > echo "oldcontent" > file > > > > > > > > # make sure oldcontent is on disk > > > > sync > > > > > > > > echo "newcontent" > file.tmp > > > > fsync file.tmp > > > > mv -f file.tmp file > > > > > > > > # *crash* > > > > > > > > Will give either > > > > > > > > file contains "newcontent"; file.tmp does not exist > > > > file contains "oldcontent"; file.tmp may contain "newcontent", be > > > > zero-length or not exists at all. > > > > **** > > > > > > Since I can't find an "fsync" command line tool, I've rewritten the > > > example in more general terms, rather than tying it to a specific > > > implementation (such as a sequence of shell commands). I note that > > > we've got a near-duplicate entry in the FAQ, two items down ("What are > > > the crash guarantees of rename?"), that should probably be removed. > > > > > > Updated entry: https://btrfs.wiki.kernel.org/index.php/FAQ#What_are_the_crash_guarantees_of_overwrite-by-rename.3F > > > Candidate for deletion: https://btrfs.wiki.kernel.org/index.php/FAQ#What_are_the_crash_guarantees_of_rename.3F > > > > There's the xfs_io command from xfsprogs that can be used to trigger > > an fsync like this: xfs_io -c fsync /path/to/file > > But it's probably not well known for non-developers. > > > > Anyway, as it is, it looks perfect to me, thanks! > > > I think it's OK to use pseudo-code. Folks will figure it out. So you > can just write it as fsync() and even if you're not using the exact > correct syntax it'll be understood. Topical xxample: https://lore.kernel.org/linux-xfs/6A65F394-C1BA-4339-AC9B-051885D12F65@corp.ovh.com/ Reminds me to ask Filipe if the example Hugo is writing up also needs an fsync() on the enclosing directory *after* the rename? -- Chris Murphy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-06 20:06 ` Chris Murphy @ 2022-01-06 20:23 ` Hugo Mills 0 siblings, 0 replies; 18+ messages in thread From: Hugo Mills @ 2022-01-06 20:23 UTC (permalink / raw) To: Chris Murphy; +Cc: Filipe Manana, Btrfs BTRFS, Josef Bacik On Thu, Jan 06, 2022 at 01:06:40PM -0700, Chris Murphy wrote: > On Thu, Jan 6, 2022 at 1:02 PM Chris Murphy <lists@colorremedies.com> wrote: > > > > On Thu, Jan 6, 2022 at 3:28 AM Filipe Manana <fdmanana@kernel.org> wrote: > > > > > > On Thu, Jan 6, 2022 at 10:21 AM Hugo Mills <hugo@carfax.org.uk> wrote: > > > > > > > > > Yep, that's correct Hugo. > > > > > > > > > > Starting with 3.17, the example on the wiki needs an fsync on > > > > > "file.tmp" after writing to it and before renaming it over "file". > > > > > I.e. the full example should be: > > > > > > > > > > **** > > > > > echo "oldcontent" > file > > > > > > > > > > # make sure oldcontent is on disk > > > > > sync > > > > > > > > > > echo "newcontent" > file.tmp > > > > > fsync file.tmp > > > > > mv -f file.tmp file > > > > > > > > > > # *crash* > > > > > > > > > > Will give either > > > > > > > > > > file contains "newcontent"; file.tmp does not exist > > > > > file contains "oldcontent"; file.tmp may contain "newcontent", be > > > > > zero-length or not exists at all. > > > > > **** > > > > > > > > Since I can't find an "fsync" command line tool, I've rewritten the > > > > example in more general terms, rather than tying it to a specific > > > > implementation (such as a sequence of shell commands). I note that > > > > we've got a near-duplicate entry in the FAQ, two items down ("What are > > > > the crash guarantees of rename?"), that should probably be removed. > > > > > > > > Updated entry: https://btrfs.wiki.kernel.org/index.php/FAQ#What_are_the_crash_guarantees_of_overwrite-by-rename.3F > > > > Candidate for deletion: https://btrfs.wiki.kernel.org/index.php/FAQ#What_are_the_crash_guarantees_of_rename.3F > > > > > > There's the xfs_io command from xfsprogs that can be used to trigger > > > an fsync like this: xfs_io -c fsync /path/to/file > > > But it's probably not well known for non-developers. > > > > > > Anyway, as it is, it looks perfect to me, thanks! > > > > > > I think it's OK to use pseudo-code. Folks will figure it out. So you > > can just write it as fsync() and even if you're not using the exact > > correct syntax it'll be understood. > > Topical xxample: > https://lore.kernel.org/linux-xfs/6A65F394-C1BA-4339-AC9B-051885D12F65@corp.ovh.com/ I did wonder whether to write it as C-ish code, but I felt that the prose description was sufficient, and didn't presuppose any particular implementation. > Reminds me to ask Filipe if the example Hugo is writing up also needs > an fsync() on the enclosing directory *after* the rename? As I understand it, as long as the data is fsynced properly before the move, the semantics are safe and atomic (if you don't get the new data afterwards, you get the old data, regardless of where the crash is). Syncing the directory and waiting for that to complete gives you the guarantee that you'll get the new data from that point on of the system crashes afterwards. Hugo. -- Hugo Mills | I hate housework. You make the beds, you wash the hugo@... carfax.org.uk | dishes, and six months later you have to start all http://carfax.org.uk/ | over again. PGP: E2AB1DE4 | Joan Rivers ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-06 10:20 ` Hugo Mills 2022-01-06 10:27 ` Filipe Manana @ 2022-01-06 21:07 ` Adam Borowski 1 sibling, 0 replies; 18+ messages in thread From: Adam Borowski @ 2022-01-06 21:07 UTC (permalink / raw) To: Hugo Mills, Filipe Manana, Chris Murphy, Btrfs BTRFS, Josef Bacik On Thu, Jan 06, 2022 at 10:20:56AM +0000, Hugo Mills wrote: > Since I can't find an "fsync" command line tool sync /the/file Meow! -- ⢀⣴⠾⠻⢶⣦⠀ ⣾⠁⢠⠒⠀⣿⡁ ⢿⡄⠘⠷⠚⠋⠀ Certified airhead; got the CT scan to prove that! ⠈⠳⣄⠀⠀⠀⠀ ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-05 18:04 ` Filipe Manana 2022-01-05 18:32 ` Filipe Manana 2022-01-05 18:34 ` Hugo Mills @ 2022-01-05 18:40 ` Chris Murphy 2022-01-05 20:32 ` Filipe Manana 2022-01-09 17:04 ` Remi Gauvin 3 siblings, 1 reply; 18+ messages in thread From: Chris Murphy @ 2022-01-05 18:40 UTC (permalink / raw) To: Filipe Manana; +Cc: Chris Murphy, Btrfs BTRFS, Josef Bacik, David Sterba On Wed, Jan 5, 2022 at 11:04 AM Filipe Manana <fdmanana@kernel.org> wrote: > > Looking at the code before that patch, it explicitly skips fsync after > a rename pointing out to: > > https://btrfs.wiki.kernel.org/index.php/FAQ#What_are_the_crash_guarantees_of_overwrite-by-rename.3F > > I'm afraid that information is wrong, perhaps it might have been true in > some very distant past, but certainly not for many years. However, the system does suspend correctly and before that there should be "Filesystems sync" message from the kernel (this is the case on my laptops, unconfirmed for the GNOME bug case). [1] If the sequence is: write to file no fsync "filesystem sync" (I guess that's syncfs for all mounted filesystems? no idea) suspend crash Still seems like the file should be either old or new, not 0 length? > I don't think I have a wiki account enabled, but I'll see if I get that > updated soon. Thanks Felipe! [1] "Filesystems sync" message appears to come from here https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/power/main.c?h=v5.16-rc8#n62 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/power/main.c?h=v5.16-rc8#n195 -- Chris Murphy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-05 18:40 ` Chris Murphy @ 2022-01-05 20:32 ` Filipe Manana 0 siblings, 0 replies; 18+ messages in thread From: Filipe Manana @ 2022-01-05 20:32 UTC (permalink / raw) To: Chris Murphy; +Cc: Btrfs BTRFS, Josef Bacik, David Sterba On Wed, Jan 5, 2022 at 6:40 PM Chris Murphy <lists@colorremedies.com> wrote: > > On Wed, Jan 5, 2022 at 11:04 AM Filipe Manana <fdmanana@kernel.org> wrote: > > > > Looking at the code before that patch, it explicitly skips fsync after > > a rename pointing out to: > > > > https://btrfs.wiki.kernel.org/index.php/FAQ#What_are_the_crash_guarantees_of_overwrite-by-rename.3F > > > > I'm afraid that information is wrong, perhaps it might have been true in > > some very distant past, but certainly not for many years. > > However, the system does suspend correctly and before that there > should be "Filesystems sync" message from the kernel (this is the case > on my laptops, unconfirmed for the GNOME bug case). [1] > > If the sequence is: > > write to file > no fsync > "filesystem sync" (I guess that's syncfs for all mounted filesystems? no idea) > suspend > crash > > Still seems like the file should be either old or new, not 0 length? Nop, it can still be 0. So where that message is printed, before it, ksys_sync() is called, which will indeed call btrfs' sync_fs() callback (btrfs_sync_fs()). However btrfs_sync_fs() will only wait for existing ordered extents to complete, and then commit the current transaction. Ordered extents are created when writeback is started. In the rename path (btrfs_rename()) we trigger writeback with filemap_flush() (mm/filemap.c), but that does not guarantee that I/O is started for all dirty pages: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/filemap.c?h=v5.16-rc8#n462 /** * filemap_flush - mostly a non-blocking flush * @mapping: target address_space * * This is a mostly non-blocking flush. Not suitable for data-integrity * purposes - I/O may not be started against all dirty pages. * * Return: %0 on success, negative error code otherwise. */ int filemap_flush(struct address_space *mapping) { return __filemap_fdatawrite(mapping, WB_SYNC_NONE); } EXPORT_SYMBOL(filemap_flush); I.e., we have no guarantee that writeback was started by the time the rename returns to user space. Regardless of that, it was not safe... a crash can happen after the rename and before ksys_sync() is called. > > > > I don't think I have a wiki account enabled, but I'll see if I get that > > updated soon. > > Thanks Felipe! Filipe Thanks. > > [1] "Filesystems sync" message appears to come from here > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/power/main.c?h=v5.16-rc8#n62 > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/power/main.c?h=v5.16-rc8#n195 > > > > > -- > Chris Murphy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [bug] GNOME loses all settings following failure to resume from suspend 2022-01-05 18:04 ` Filipe Manana ` (2 preceding siblings ...) 2022-01-05 18:40 ` Chris Murphy @ 2022-01-09 17:04 ` Remi Gauvin 3 siblings, 0 replies; 18+ messages in thread From: Remi Gauvin @ 2022-01-09 17:04 UTC (permalink / raw) To: linux-btrfs On 2022-01-05 1:04 p.m., Filipe Manana wrote: > After mounting the fs again 'file' is empty and 'file.tmp' does not exists. > > The only for that to guarantee 'file' is not empty and has the expected > data ("newcontent"), would be to mount with -o flushoncommit. > Was flushoncommit on the default back in kernel/tools version 4.4? I found an old Ubuntu man page that seemed to indicate as much. That would be a very radical and (for me at least.) unexpected change. ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2022-01-09 17:04 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-01-05 17:34 [bug] GNOME loses all settings following failure to resume from suspend Chris Murphy 2022-01-05 18:04 ` Filipe Manana 2022-01-05 18:32 ` Filipe Manana 2022-01-05 18:34 ` Hugo Mills 2022-01-05 20:38 ` Filipe Manana 2022-01-05 21:31 ` Hugo Mills 2022-01-05 21:39 ` Hugo Mills 2022-01-05 21:53 ` Hugo Mills 2022-01-06 9:51 ` Filipe Manana 2022-01-06 10:20 ` Hugo Mills 2022-01-06 10:27 ` Filipe Manana 2022-01-06 20:02 ` Chris Murphy 2022-01-06 20:06 ` Chris Murphy 2022-01-06 20:23 ` Hugo Mills 2022-01-06 21:07 ` Adam Borowski 2022-01-05 18:40 ` Chris Murphy 2022-01-05 20:32 ` Filipe Manana 2022-01-09 17:04 ` Remi Gauvin
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).