* UBIFS robustness questions
From: Charles Manning @ 2009-07-24 4:00 UTC
To: linux-mtd

This is probably documented somewhere but I could not find it...

What operations in UBIFS are robust to power failure and which are not?

I know, for example, that writing a file does not mean it has been
completely written to flash until after a sync, but what about other
operations such as mv?

The reason I'm asking is that I want to be able to "hot-swap" a
directory of files without losing any file state.

What I'm considering doing is something like:

Start with ~/runtime having a sane set of files

  untar etc into ~/updated
  sync
  mv ~/updated ~/runtime
  sync

What is unacceptable is that, at any time, a power failure/reboot results in
~/runtime having a non-sane set of files.

* Does the above sequence look safe?
* Is the second sync required?

TIA

-- Charles

* Re: UBIFS robustness questions
From: Artem Bityutskiy @ 2009-07-24 6:03 UTC
To: Charles Manning; +Cc: linux-mtd

On 07/24/2009 07:00 AM, Charles Manning wrote:
> This is probably documented somewhere but I could not find it...
>
> What operations in UBIFS are robust to power failure and which are not?

Hi, did you look through these:

http://www.linux-mtd.infradead.org/doc/ubifs.html#L_writeback
http://www.linux-mtd.infradead.org/doc/ubifs.html#L_writebuffer
http://www.linux-mtd.infradead.org/doc/ubifs.html#L_sync_exceptions
http://www.linux-mtd.infradead.org/faq/ubifs.html#L_empty_file

> I know, for example, that writing a file does not mean it has been
> completely written to flash until after a sync, but what about other
> operations such as mv?
>
> The reason I'm asking is that I want to be able to "hot-swap" a
> directory of files without losing any file state.

Err, if you do sync() and the like properly, you should not lose
anything.

> What I'm considering doing is something like:
>
> Start with ~/runtime having a sane set of files
>
>   untar etc into ~/updated
>   sync
>   mv ~/updated ~/runtime
>   sync
>
> What is unacceptable is that, at any time, a power failure/reboot results in
> ~/runtime having a non-sane set of files.

Err, this will just move "updated" into the "runtime" directory. Is this
what you mean? But the above should be safe.

> * Does the above sequence look safe?
> * Is the second sync required?

It is required if you want to make sure that the directory has really
been renamed. Otherwise the rename data will sit in the write-buffer for
some time, and in case of a power cut you end up with "updated" at the
old place, but nothing should be corrupted. IOW, you do not have to, but
may want to.

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
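
Artem's remark about what the plain mv does can be checked with a
minimal sketch. It assumes a POSIX shell and works in a scratch
directory from mktemp that stands in for ~; the file name "config" is
made up for illustration.

```shell
set -eu
work=$(mktemp -d)          # scratch directory standing in for ~
cd "$work"

mkdir runtime updated
echo old > runtime/config  # hypothetical file in the live set
echo new > updated/config  # hypothetical file in the staged set

# "runtime" already exists as a directory, so mv moves "updated"
# inside it instead of replacing it.
mv updated runtime

ls -R runtime
```

The old runtime/config is untouched and the staged files end up under
runtime/updated, so nothing is replaced.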

* Re: UBIFS robustness questions
From: Adrian Hunter @ 2009-07-24 6:43 UTC
To: Charles Manning; +Cc: linux-mtd@lists.infradead.org

Charles Manning wrote:
> This is probably documented somewhere but I could not find it...
>
> What operations in UBIFS are robust to power failure and which are not?

Only sync operations guarantee that changes have reached the flash.
There are all the usual ways to sync:
	fsync/fdatasync a file/directory
	open a file as synchronous
	mark a file with the sync flag
	sync the filesystem
	mount the file system as synchronous

> I know, for example, that writing a file does not mean it has been
> completely written to flash until after a sync, but what about other
> operations such as mv?

After mv, the containing directory must be sync'd to be sure the change
reaches the flash. But rename is atomic, so there will always be either
the old naming or the new naming.

> The reason I'm asking is that I want to be able to "hot-swap" a
> directory of files without losing any file state.

Should be no problem if you sync correctly.

> What I'm considering doing is something like:
>
> Start with ~/runtime having a sane set of files
>
>   untar etc into ~/updated
>   sync
>   mv ~/updated ~/runtime
>   sync
>
> What is unacceptable is that, at any time, a power failure/reboot results in
> ~/runtime having a non-sane set of files.
>
> * Does the above sequence look safe?

Yes

> * Is the second sync required?

It is required to guarantee that the mv has reached the flash at that
point in time, i.e. power loss before the second sync => same as if mv
was not done.

* Re: UBIFS robustness questions
From: Adrian Hunter @ 2009-07-24 9:24 UTC
To: Charles Manning; +Cc: linux-mtd@lists.infradead.org

Hunter Adrian (Nokia-D/Helsinki) wrote:
> Charles Manning wrote:
>> What I'm considering doing is something like:
>>
>> Start with ~/runtime having a sane set of files
>>
>>   untar etc into ~/updated
>>   sync
>>   mv ~/updated ~/runtime
>>   sync
>>
>> What is unacceptable is that, at any time, a power failure/reboot results in
>> ~/runtime having a non-sane set of files.
>>
>> * Does the above sequence look safe?
>
> Yes

Well, safe but not possible. You cannot rename over the top
of a non-empty directory. Sorry I was misleading.
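
The correction above can be reproduced with a short sketch. It assumes
GNU mv (for the -T option) and a scratch directory; the file name
"config" is made up for illustration.

```shell
set -eu
work=$(mktemp -d)          # scratch directory standing in for ~
cd "$work"

mkdir runtime updated
echo old > runtime/config  # hypothetical file, makes "runtime" non-empty
echo new > updated/config

# -T makes mv treat "runtime" as the destination itself, so this is a
# plain rename of one directory over another. Because "runtime" is not
# empty, the rename is refused and both directories stay as they were.
if mv -T updated runtime 2>/dev/null; then
    outcome=replaced
else
    outcome=refused
fi

echo "$outcome"
```

So the sequence is safe in the sense that the live directory is never
left half-updated, but the swap itself never happens.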

* Re: UBIFS robustness questions
From: Adrian Hunter @ 2009-07-24 10:03 UTC
To: Charles Manning; +Cc: linux-mtd@lists.infradead.org

Adrian Hunter wrote:
> Hunter Adrian (Nokia-D/Helsinki) wrote:
>> Charles Manning wrote:
>>> * Does the above sequence look safe?
>> Yes
>
> Well, safe but not possible. You cannot rename over the top
> of a non-empty directory. Sorry I was misleading.

Sorry to drag this out, but it seems like it can be done with symlinks,
e.g.

/ # mkdir test
/ # cd test
/test # mkdir version1
/test # mkdir version2
/test # echo "This is version 1" > version1/afile
/test # echo "This is version 2" > version2/afile
/test # ln -s version1 current
/test # ln -s version2 next
/test # cat current/afile
This is version 1
/test # cat next/afile
This is version 2
/test # mv -T next current
/test # ls -al
drwxr-xr-x    4 root     root          432 Jan  2 01:57 .
drwxrwxrwx   25 root     root         1704 Jan  2 01:44 ..
lrwxrwxrwx    1 root     root            8 Jan  2 01:46 current -> version2
-rwxr-xr-x    1 root     root       261307 Jul 24  2009 mv
drwxr-xr-x    2 root     root          224 Jan  2 01:47 version1
drwxr-xr-x    2 root     root          224 Jan  2 01:45 version2
/test # cat current/afile
This is version 2
/test #

Note that busybox's 'mv' does not support the -T option.
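
Putting the symlink switch together with the two syncs from the original
question gives roughly the following. This is only a sketch under stated
assumptions: GNU mv and ln, a scratch directory as the install root, and
echo standing in for the untar step.

```shell
set -eu
root=$(mktemp -d)             # hypothetical install root
cd "$root"

# Initial state: version1 is live behind the "current" symlink.
mkdir version1
echo "This is version 1" > version1/afile
ln -s version1 current

# 1. Stage the new version next to the old one (untar would go here).
mkdir version2
echo "This is version 2" > version2/afile
sync                          # new files reach the flash before the switch

# 2. Create a temporary symlink and rename it over "current".
ln -sfn version2 next
mv -T next current            # one rename: old naming or new naming
sync                          # make the switch itself durable

cat current/afile
```

A power loss before the second sync may leave "current" pointing at
version1, which is still a complete, sane set of files.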

* Re: UBIFS robustness questions
From: Jamie Lokier @ 2009-07-24 23:39 UTC
To: Adrian Hunter; +Cc: Charles Manning, linux-mtd@lists.infradead.org

Adrian Hunter wrote:
> Sorry to drag this out but it seems like it can be done with symlinks

That's right. It should be powerfail safe.
Don't forget to "rm -fr version1" at the end :-)

However, if you are looking to use this for atomic update of a
directory while there are programs still running which use the
directory, it won't work.

You can't delete the old directory, because programs might still be
inside it... It's not even always safe to kill and restart the
programs after renaming the symlink, because they might read some
files from the new directory before they've finished reading other
files from the old directory.

Regarding powerfail safety, it means you might have to defer deleting
the old directory until some major system action, like the next
reboot.

-- Jamie
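
The deferred cleanup Jamie suggests could look something like the sketch
below, run early at boot before anything has files open. The version*
naming and the "current" symlink follow the earlier example; the scratch
directory and the idea of a boot-time hook are assumptions.

```shell
set -eu
root=$(mktemp -d)             # hypothetical install root
cd "$root"

# State left behind by two earlier updates.
mkdir version1 version2 version3
ln -s version3 current

# Boot-time cleanup: keep only the directory "current" points at.
live=$(readlink current)
for dir in version*; do
    [ -d "$dir" ] || continue
    if [ "$dir" != "$live" ]; then
        rm -rf -- "$dir"
    fi
done

ls
```

Because this runs before any program has entered an old version
directory, nothing can be broken by the removal.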

* Re: UBIFS robustness questions
From: Adrian Hunter @ 2009-07-26 6:29 UTC
To: Jamie Lokier; +Cc: Charles Manning, linux-mtd@lists.infradead.org

Jamie Lokier wrote:
> Adrian Hunter wrote:
>> Sorry to drag this out but it seems like it can be done with symlinks
>
> That's right. It should be powerfail safe.
> Don't forget to "rm -fr version1" at the end :-)
>
> However, if you are looking to use this for atomic update of a
> directory while there are programs still running which use the
> directory, it won't work.
>
> You can't delete the old directory, because programs might still be
> inside it...

Are you sure about that? I can do this:

/ # mkdir test2
/ # cd test2
/test2 # cp /bin/bash .
/test2 # ls -al
drwxr-xr-x    2 root     root          224 Jan  3 22:20 .
drwxrwxrwx   25 root     root         1768 Jan  3 22:20 ..
-rwxr-xr-x    1 root     root       612764 Jan  3 22:20 bash
/test2 # ./bash -c "sleep 30;echo Done" &
/test2 # rm bash
/test2 # cd ..
/ # rmdir test2
/ # ps | grep bash
 1261 root       2500 S   ./bash -c sleep 30;echo Done
/ #
/ #
/ # Done

[2] + Done                       ./bash -c "sleep 30;echo Done"
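
The same behaviour can be shown for an ordinary data file. The sketch
below assumes a POSIX shell and a scratch directory; it holds a file
open on descriptor 3, removes the file and its directory, and then still
reads the contents through the descriptor.

```shell
set -eu
work=$(mktemp -d)             # scratch directory
cd "$work"

mkdir test2
echo "still readable" > test2/afile

exec 3< test2/afile           # hold the file open on descriptor 3
rm test2/afile                # unlink the name
rmdir test2                   # the directory is now empty and can go too

content=$(cat <&3)            # the data is still reachable via the fd
exec 3<&-                     # closing the fd finally releases the file

echo "$content"
```

The name is gone as soon as rm returns, but the file itself lives on
until the last open descriptor is closed.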

* Re: UBIFS robustness questions
From: Jamie Lokier @ 2009-07-26 19:21 UTC
To: Adrian Hunter; +Cc: Charles Manning, linux-mtd@lists.infradead.org

Adrian Hunter wrote:
> Jamie Lokier wrote:
> >Adrian Hunter wrote:
> >>Sorry to drag this out but it seems like it can be done with symlinks
> >
> >That's right. It should be powerfail safe.
> >Don't forget to "rm -fr version1" at the end :-)
> >
> >However, if you are looking to use this for atomic update of a
> >directory while there are programs still running which use the
> >directory, it won't work.
> >
> >You can't delete the old directory, because programs might still be
> >inside it...
>
> Are you sure about that? I can do this:
>
> / # mkdir test2
> / # cd test2
> /test2 # cp /bin/bash .
> /test2 # ls -al
> drwxr-xr-x    2 root     root          224 Jan  3 22:20 .
> drwxrwxrwx   25 root     root         1768 Jan  3 22:20 ..
> -rwxr-xr-x    1 root     root       612764 Jan  3 22:20 bash
> /test2 # ./bash -c "sleep 30;echo Done" &
> /test2 # rm bash
> /test2 # cd ..
> / # rmdir test2
> / # ps | grep bash
>  1261 root       2500 S   ./bash -c sleep 30;echo Done
> / #
> / #
> / # Done
>
> [2] + Done                       ./bash -c "sleep 30;echo Done"

(By the way, Linux has not always allowed an empty but in-use directory
to be rmdir'd, but it does these days).

What I mean is, you can delete the old directory, but it's not always
safe because you might break programs which are depending on the
directory's contents when you do.

For example:

    $ mkdir dir1
    $ echo "message1" > dir1/message
    $ ln -sfT dir1 new
    $ mv -T new current

    $ sh -c 'cd current; while :; do cat message > /dev/ttyAM0; sleep 1; done' &

    ==> Writes "message1" to the serial port every second.

    $ mkdir dir2
    $ echo "message2" > dir2/message
    $ ln -sfT dir2 new
    $ mv -T new current    # Looks atomic

    ==> Still writes "message1" to the serial port every second.
    ==> Maybe that's ok, maybe not.

    $ rm -fr dir1          # Old version, no longer in use?

    ==> The background script writes a "File not found" error every second...
    ==> Clearly not ok.

If the script is written differently as

    $ sh -c 'while :; do cat current/message > /dev/ttyAM0; sleep 1; done' &

then it works better, changing the message in this example most of the
time.

It's not obvious, but even that version has an extremely rare race
condition: "cat current/message" does path traversal in the kernel,
which may open "current" just before the symlink changes, then (due to
preemptive scheduling or SMP) look up "message" after that's been
deleted. It is probably very hard to trigger, but it's a race condition.

And even without that race condition, the method doesn't work in
general. If it was reading two different files, it could easily see
one file from the old version and one file from the new version for a
moment. The inconsistency could be harmless or fatal depending on the
application.

It's a hard problem to solve properly, unless you analyse each
application or kill each application before the change and restart
them afterwards. In which case maybe you don't need the change to be
atomic :-)

Databases solve it with transactions, which are nice to use and
understand, but they introduce coordination problems in a different
way if they aren't used consistently and correctly.

This is why every Linux distro has occasional glitches when package
managers update a running system, and reports of things going wrong
which are too rare to fix, too transient to repeat, and go away on the
next reboot.

-- Jamie
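
One way a reader can avoid mixing files from two versions is to resolve
the symlink once and use only the resolved path afterwards. The sketch
below is an illustration, not something proposed in the thread: it
assumes GNU readlink, mv and ln, and the second file name "footer" is
made up. It does not help if the old directory is deleted while the
reader is still using it, so cleanup still has to be deferred.

```shell
set -eu
work=$(mktemp -d)             # scratch directory
cd "$work"

mkdir dir1 dir2
echo "message1" > dir1/message
echo "footer1"  > dir1/footer   # hypothetical second file
echo "message2" > dir2/message
echo "footer2"  > dir2/footer
ln -s dir1 current

# Reader: resolve "current" once, then read everything via that path.
resolved=$(readlink -f current)
first=$(cat "$resolved/message")

# Updater switches versions between the reader's two reads.
ln -sfn dir2 new
mv -T new current

# Reader continues in the version it started with.
second=$(cat "$resolved/footer")

echo "$first $second"
```

A reader that used current/message and current/footer instead would
have seen message1 followed by footer2 here.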

* Re: UBIFS robustness questions
From: Adrian Hunter @ 2009-07-27 8:09 UTC
To: Jamie Lokier; +Cc: Charles Manning, linux-mtd@lists.infradead.org

Jamie Lokier wrote:
> What I mean is, you can delete the old directory, but it's not always
> safe because you might break programs which are depending on the
> directory's contents when you do.
>
> This is why every Linux distro has occasional glitches when package
> managers update a running system, and reports of things going wrong
> which are too rare to fix, too transient to repeat, and go away on the
> next reboot.

Another problem is that unlinked files that have not been deleted
because they are still open continue to consume file system space. So on
a little embedded system, you can unexpectedly run out of space.
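
On Linux, files in this state can be spotted through /proc: the symlink
for an open descriptor gains a " (deleted)" suffix once the name has
been removed. The sketch below assumes /proc is mounted and uses a
scratch file; it is meant only to illustrate the effect.

```shell
set -eu
work=$(mktemp -d)             # scratch directory
echo "old version data" > "$work/afile"

exec 3< "$work/afile"         # some program still has the file open
rm "$work/afile"              # the update removes the old version

# readlink inherits descriptor 3, so /proc/self/fd/3 refers to the
# unlinked file. Its space is not freed until the descriptor is closed.
target=$(readlink /proc/self/fd/3)
echo "$target"

exec 3<&-                     # only now can the space be reclaimed
```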