* UBIFS power cut issues
From: JiSheng Zhang @ 2009-09-02 9:35 UTC
To: linux-mtd

Hi list,

If we cut power while copying a file into UBIFS, then remount UBIFS and
try to read the file, we find that from some offset onward the data
differs from the data of the original file at the same offset. Is this
a bug in UBIFS?

PS: How do you test the data integrity of UBIFS under power loss?

Thanks in advance,
Jisheng

* Re: UBIFS power cut issues
From: Artem Bityutskiy @ 2009-09-08 6:22 UTC
To: JiSheng Zhang; +Cc: linux-mtd

Hi,

sorry for the late answer, I was very busy.

On Wed, 2009-09-02 at 17:35 +0800, JiSheng Zhang wrote:
> If we cut power when copy file into ubifs, then remount ubifs and try
> to read the file, we found that the data at some offset of the file
> began different from the data of the original file at the same offset.
> Is this a bug of ubifs?

This is expected behavior on any asynchronous FS. You may switch to
synchronous behavior with the '-o sync' mount option.

I wrote a lot of docs about write-back and the related issues. Dig
through the UBIFS docs and FAQ. E.g.:

http://www.linux-mtd.infradead.org/faq/ubifs.html#L_empty_file
http://www.linux-mtd.infradead.org/doc/ubifs.html#L_writeback

If you have a _specific_ question, feel free to ask, of course. But for
this general question I do not have a better answer than RTFM :-)))

> PS:how do you test data integrity of ubifs under power loss?

We mostly checked it using either 'integck' (see mtd-utils) or
'fsstress' (see LTP). We ran those tests and cut power off at random
points using these devices:

http://www.cpscom.com/gprod/ipn.htm

Then we mounted the FS. We did not really check the contents of the FS,
because that is tricky and not simple, but we checked that it mounts,
re-mounts, and that files are readable/writable/deletable.

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

* Re: UBIFS power cut issues
From: JiSheng Zhang @ 2009-09-09 9:45 UTC
To: dedekind1; +Cc: linux-mtd

Hi Artem,

2009/9/8 Artem Bityutskiy <dedekind1@gmail.com>:
> Hi,
>
> sorry for late answer, was very busy.
>
> On Wed, 2009-09-02 at 17:35 +0800, JiSheng Zhang wrote:
>> If we cut power when copy file into ubifs, then remount ubifs and try
>> to read the file, we found that the data at some offset of the file
>> began different from the data of the original file at the same offset.
>> Is this a bug of ubifs?
>
> This is expected behavior on any asynchronous FS. You may switch to
> synchronous behavior with '-o sync' mount option. I wrote a lot of

I have tested with "mount -o sync"; the result is the same. The file is
not empty. For example:

cp fileA /mnt/ubifs/fileB
(cut power at a random point before "cp" completes)
(then remount)

From the head of /mnt/ubifs/fileB up to some offset offsetC, the data is
the same as in fileA. But from offsetC to the end it differs from fileA
at the same offsets, and it is not empty either.
I hope I have expressed myself clearly.

> docs about write-back and the related issues. Dig UBIFS docs and FAQ.
> E.g.:
>
> http://www.linux-mtd.infradead.org/faq/ubifs.html#L_empty_file
> http://www.linux-mtd.infradead.org/doc/ubifs.html#L_writeback
>
> If you have a _specific_ question, feel free to ask, of course. But
> for this general question I do not have a better answer than RTFM
> :-)))
>
>> PS:how do you test data integrity of ubifs under power loss?
>
> We mostly checked it using either 'integck' (see mtd-utils) or
> 'fsstress' (see LTP). We ran those tests and cut power off at random
> points using these devices:
>
> http://www.cpscom.com/gprod/ipn.htm
>
> Then we mounted the FS. We did not really check the contents of the
> FS, because it is not simple and tricky, but we checked that it mounts,
> re-mounts, and files are readable/writable/deletable.

Thanks for this information.

Regards,
Jisheng

* Re: UBIFS power cut issues
From: Artem Bityutskiy @ 2009-09-09 10:06 UTC
To: JiSheng Zhang; +Cc: linux-mtd

On 09/09/2009 12:45 PM, JiSheng Zhang wrote:
>> On Wed, 2009-09-02 at 17:35 +0800, JiSheng Zhang wrote:
>>> If we cut power when copy file into ubifs, then remount ubifs and try
>>> to read the file, we found that the data at some offset of the file
>>> began different from the data of the original file at the same offset.
>>> Is this a bug of ubifs?
>>
>> This is expected behavior on any asynchronous FS. You may switch to
>> synchronous behavior with '-o sync' mount option. I wrote a lot of
>
> I have tested with "mount -o sync", the result is the same. It's not
> empty file. For example:
> cp fileA /mnt/ubifs/fileB
> random cut power before "cp" completed.
> then remount
> From head of /mnt/ubifs/fileB to some offset offsetC is the same as
> fileA. But from offsetC to the end is different from fileA at the same
> offset offsetC, it's not empty either.
> Hope I expressed myself clearly.

Hmm, ok. What is your kernel version?

Could you please take a closer look and see if these differences
are zeroes or not?

Do you have an automated test for this? Can you share your script?

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

* Re: UBIFS power cut issues
From: JiSheng Zhang @ 2009-09-11 9:23 UTC
To: Artem Bityutskiy; +Cc: linux-mtd

Hi Artem,

2009/9/9 Artem Bityutskiy <dedekind1@gmail.com>:
> On 09/09/2009 12:45 PM, JiSheng Zhang wrote:
>>>
>>> On Wed, 2009-09-02 at 17:35 +0800, JiSheng Zhang wrote:
>>>>
>>>> If we cut power when copy file into ubifs, then remount ubifs and try
>>>> to read the file, we found that the data at some offset of the file
>>>> began different from the data of the original file at the same offset.
>>>> Is this a bug of ubifs?
>>>
>>> This is expected behavior on any asynchronous FS. You may switch to
>>> synchronous behavior with '-o sync' mount option. I wrote a lot of
>>
>> I have tested with "mount -o sync", the result is the same. It's not
>> empty file. For example:
>> cp fileA /mnt/ubifs/fileB
>> random cut power before "cp" completed.
>> then remount
>> From head of /mnt/ubifs/fileB to some offset offsetC is the same as
>> fileA. But from offsetC to the end is different from fileA at the same
>> offset offsetC, it's not empty either.
>> Hope I expressed myself clearly.
>
> Hmm, ok. What is your kernel version?
>
> Could you please take a closer look and see if these differences
> are zeroes or not?

My mistake, sorry. I have looked from that offset to the end of the
file, and the bytes really are 0, that is, a file hole.

>
> Do you have an automated test for this? Can you share your script?

Hmm, I just run the copy manually and diff once the FS is mounted again.

>
> --
> Best Regards,
> Artem Bityutskiy (Артём Битюцкий)
>

Best Regards,
Jisheng

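The manual "copy, cut power, remount, diff" check discussed above could be scripted roughly as follows. This is only a sketch, not a tool from mtd-utils or LTP: the helper names `first_difference` and `tail_is_zeroes` and the chunk size are made up for illustration. It finds the first offset at which the copy diverges from the original, then checks whether everything from there to the end of the copy is zero bytes.

```python
CHUNK = 64 * 1024  # read size; an arbitrary choice for this sketch


def first_difference(path_a, path_b):
    """Return the first byte offset at which the two files differ,
    or None if they have identical length and content."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(CHUNK)
            b = fb.read(CHUNK)
            if a != b:
                # Walk the two chunks to locate the first mismatching byte.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # One chunk is a prefix of the other: the files differ in length.
                return offset + min(len(a), len(b))
            if not a:  # both files hit EOF with no difference
                return None
            offset += len(a)


def tail_is_zeroes(path, start):
    """Return True if every byte of `path` from `start` to EOF is 0x00.
    An empty tail (start at or past EOF) also counts as True."""
    with open(path, "rb") as f:
        f.seek(start)
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                return True
            if chunk.count(0) != len(chunk):
                return False
```

After remounting, one would call `first_difference("fileA", "/mnt/ubifs/fileB")` and pass the returned offset to `tail_is_zeroes("/mnt/ubifs/fileB", offset)`. Note that this only shows that the tail reads back as zeroes; it does not by itself prove that the range is an unallocated hole on flash.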
* Re: UBIFS power cut issues
From: Artem Bityutskiy @ 2009-09-10 15:42 UTC
To: JiSheng Zhang; +Cc: linux-mtd

On Wed, 2009-09-09 at 17:45 +0800, JiSheng Zhang wrote:
> Hi Artem,
>
> 2009/9/8 Artem Bityutskiy <dedekind1@gmail.com>:
> > Hi,
> >
> > sorry for late answer, was very busy.
> >
> > On Wed, 2009-09-02 at 17:35 +0800, JiSheng Zhang wrote:
> >> If we cut power when copy file into ubifs, then remount ubifs and try
> >> to read the file, we found that the data at some offset of the file
> >> began different from the data of the original file at the same offset.
> >> Is this a bug of ubifs?
> >
> > This is expected behavior on any asynchronous FS. You may switch to
> > synchronous behavior with '-o sync' mount option. I wrote a lot of
>
> I have tested with "mount -o sync", the result is the same. It's not
> empty file. For example:
> cp fileA /mnt/ubifs/fileB
> random cut power before "cp" completed.
> then remount
> From head of /mnt/ubifs/fileB to some offset offsetC is the same as
> fileA. But from offsetC to the end is different from fileA at the same
> offset offsetC, it's not empty either.
> Hope I expressed myself clearly.

I believe you have zeroes at the end. These are actually holes. And this
is actually expected. I've added these pieces of documentation for you:

http://www.linux-mtd.infradead.org/faq/ubifs.html#L_end_hole
http://www.linux-mtd.infradead.org/doc/ubifs.html#L_sync_semantics

And the text is here, just in case someone would like to review it.

UBIFS in synchronous mode vs JFFS2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When UBIFS is mounted in synchronous mode (the -o sync mount option),
all file system operations become synchronous. This means that all data
are written to flash before the file-system operations return.

For example, if you write 10MiB of data to a file f.dat using the
write() call, and UBIFS is in synchronous mode, then UBIFS guarantees
that all 10MiB of data and the meta-data (file size and date changes)
will reach the flash media before write() returns. And if a power cut
happens after the write() call returns, the file will contain the
written data.

The same is true for situations when f.dat was opened with O_SYNC or
has the sync flag (see man 2 chattr).

It is well-known that the JFFS2 file-system is synchronous (except for a
small write-buffer). However, UBIFS in synchronous mode is not the same
as JFFS2 and provides somewhat fewer guarantees than JFFS2 does with
respect to sudden power cuts.

In JFFS2 all the meta-data (like inode atime/mtime/ctime, inode size,
UID/GID, etc) are stored in the data node headers. Data nodes carry 4KiB
of (compressed) data. This means that the meta-data information is
duplicated in many places, but this also means that every time JFFS2
writes a data node to the flash media, it updates the inode size as
well. In practice this means that JFFS2 will write these 10MiB of data
sequentially, from the beginning to the end. And if you have a power
cut, you will just loose some amount of data at the end of the inode.
For example, if JFFS2 starts writing those 10MiB of data, writes 5MiB,
and a power cut happens, you will end up with a 5MiB f.dat file. You
loose only the last 5MiB.

Things are a little bit more complex in the case of UBIFS, where data
are stored in data nodes and meta-data are stored in (separate) inode
nodes. The meta-data are not duplicated in each data node, like in
JFFS2. Let's consider an example.

  * User creates an empty file f.dat. The file is synchronous, or
    UBIFS is mounted in synchronous mode. User calls the write()
    function with a 10MiB buffer.
  * The kernel first copies all 10MiB of the data to the page cache.
    Inode size is changed to 10MiB as well and the inode is marked
    as dirty. Nothing has been written to the flash media so far. If
    a power cut happens at this point, the user will end up with an
    empty f.dat file.
  * UBIFS sees that the I/O has to be synchronous, and starts
    synchronizing the inode. First of all, it writes the inode node
    to the flash media. If a power cut happens at this moment, the
    user will end up with a 10MiB file which contains no data
    (hole), and if he reads this file, he will get 10MiB of zeroes.
  * UBIFS starts writing the data. If a power cut happens at this
    point, the user will end up with a 10MiB file containing a hole
    at the end.

Note, if the I/O was not synchronous, UBIFS would skip the last step and
would just return. And the actual write-back would then happen in the
background. But power cuts during write-back could anyway lead to files
with holes at the end.

Thus, synchronous I/O in UBIFS provides fewer guarantees than JFFS2 I/O:
UBIFS has the effect of holes at the end of files. In an ideal world,
applications should not assume anything about the contents of files
which were not synchronized before a power cut happened. And
"mainstream" file-systems like ext3 do not provide JFFS2-like
guarantees.

However, UBIFS is sometimes used as a JFFS2 replacement and people may
want it to behave the same way as JFFS2 if it is mounted synchronously.
This is doable, but needs some non-trivial development, so this has not
been implemented so far. On the other hand, there was no strong demand.
You may implement this as an exercise, or you may try to convince the
UBIFS authors to do this.

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

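The difference between the two file systems described in the message above can be illustrated with a toy model. The sketch below is not how either file system is implemented; it only encodes the outcomes stated in the text for a newly created file. The function names `jffs2_after_cut` and `ubifs_after_cut`, the byte-level granularity, and the assumption that data reaches flash strictly in order are all simplifications made for illustration.

```python
def jffs2_after_cut(data, written):
    """JFFS2 model: every data node carries the inode size, so after a
    power cut the file simply ends where the last written node ends."""
    return data[:written]


def ubifs_after_cut(data, inode_written, written):
    """UBIFS model: the inode node (carrying the full size) is written
    first, then the data nodes. Data nodes that never reached flash
    read back as zeroes, i.e. a hole at the end of the file."""
    if not inode_written:
        # Power cut before the inode node reached flash: empty file.
        return b""
    missing = len(data) - written
    return data[:written] + b"\x00" * missing


# Scaled-down version of the 10MiB example: 10 bytes, cut after 5.
payload = bytes(range(1, 11))
jffs2_file = jffs2_after_cut(payload, 5)        # 5 bytes, all valid
ubifs_file = ubifs_after_cut(payload, True, 5)  # 10 bytes, last 5 are zero
```

In this model the JFFS2 file is short but contains only valid data, while the UBIFS file has its full size with a zero-filled tail. When existing data is being overwritten rather than appended, the unwritten range would show the old contents instead of zeroes, which this sketch does not model.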
* Re: UBIFS power cut issues
From: Bill Gatliff @ 2009-09-10 16:00 UTC
To: dedekind1; +Cc: linux-mtd, JiSheng Zhang

Artem Bityutskiy wrote:
> And the text here, just in case someone would review it.
>

When you mean "something is lost", the correct spelling is "lose". To
"loose" means to "disconnect", or "release" something.

> However, UBIFS is sometimes used as a JFFS2 replacement and people may
> want it to behave the same way as JFFS2 if it is mounted synchronously.
> This is doable, but needs some non-trivial development, so this was not
> implemented so far. On the other hand, there was no strong demand. You
> may implement this as an excercise, or you may try to convince UBIFS
> authors to do this.
>

In summary, the differences in results between JFFS2 and UBIFS in the
case of interrupted, large synchronous writes are related to differences
in how the two store and/or compute file sizes?

Based on your documentation, my understanding is that with JFFS2 file
sizes are stored along with the file data nodes, and are updated as the
file grows in size--- so an interruption truncates the file at the point
the interruption occurs. For UBIFS, in contrast, file sizes are stored
in separate nodes which might not have been written at the point of
interruption--- so the state of the file when power is restored depends
highly upon the precise moment that the interruption occurs.

b.g.

--
Bill Gatliff
bgat@billgatliff.com

* Re: UBIFS power cut issues
From: Artem Bityutskiy @ 2009-09-11 8:01 UTC
To: Bill Gatliff; +Cc: linux-mtd, JiSheng Zhang

On 09/10/2009 07:00 PM, Bill Gatliff wrote:
> Artem Bityutskiy wrote:
>> And the text here, just in case someone would review it.
>
> When you mean "something is lost", the correct spelling is "lose". To
> "loose" means to "disconnect", or "release" something.

Thanks, fixed:
http://git.infradead.org/mtd-www.git/commit/8b407024f4b8377eae6557644c29a31bc20e1350

>> However, UBIFS is sometimes used as a JFFS2 replacement and people may
>> want it to behave the same way as JFFS2 if it is mounted synchronously.
>> This is doable, but needs some non-trivial development, so this was not
>> implemented so far. On the other hand, there was no strong demand. You
>> may implement this as an excercise, or you may try to convince UBIFS
>> authors to do this.
>
> In summary, the differences in results between JFFS2 and UBIFS in the
> case of interrupted, large synchronous writes are related to differences
> in how the two store and/or compute file sizes?

Yes. JFFS2 stores the inode size in data nodes. So every time it writes
a data node to the flash, it updates the inode size. When JFFS2 mounts
the flash, it does a full scan, finds the last written data node and
thus has the correct inode size.

UBIFS does not store the file size in data nodes, but stores it in
separate inode nodes, pretty much like any FS does. And UBIFS does not
do scanning. This is where the difficulties come from.

> Based on your documentation, my understanding is that with JFFS2 file
> sizes are stored along with the file data nodes, and are updated as the
> file grows in size--- so an interruption truncates the file at the point
> the interruption occurs.

Right.

> For UBIFS, in contrast, file sizes are stored
> in separate nodes which might not have been written at the point of
> interruption--- so the state if the file when power is restored depends
> highly upon the precise moment that the interruption occurs.

Not exactly. UBIFS never writes data nodes beyond the on-flash inode
size. If it has to write a data node and the data node is beyond the
on-flash inode size (the in-memory inode has the up-to-date size, but it
is dirty and has not been flushed yet), then UBIFS first writes the
inode to the media, and then it starts writing the data. And if you have
an interruption, you _lose_ data nodes and you have holes (or old data
nodes, if you are overwriting).

If you need information on why UBIFS never writes beyond the inode size,
you may take a look at file.c; there is a comment explaining this:

/*
 * When writing-back dirty inodes, VFS first writes-back pages belonging to the
 * inode, then the inode itself. For UBIFS this may cause a problem. Consider a
 * situation when we have an inode with size 0, then a megabyte of data is
 * appended to the inode, then write-back starts and flushes some amount of the
 * dirty pages, the journal becomes full, commit happens and finishes, and then
 * an unclean reboot happens. When the file system is mounted next time, the
 * inode size would still be 0, but there would be many pages which are beyond
 * the inode size, they would be indexed and consume flash space. Because the
 * journal has been committed, the replay would not be able to detect this
 * situation and correct the inode size. This means UBIFS would have to scan
 * whole index and correct all inode sizes, which is long and unacceptable.
 *
 * To prevent situations like this, UBIFS writes pages back only if they are
 * within the last synchronized inode size, i.e. the size which has been
 * written to the flash media last time. Otherwise, UBIFS forces inode
 * write-back, thus making sure the on-flash inode contains current inode size,
 * and then keeps writing pages back.
...

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

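The ordering rule in the quoted file.c comment can be sketched as a small planner that decides when the inode has to go to flash ahead of a page. This is a simplified reading of the comment, not the kernel code: the function name `writeback_order`, the string labels, and the 4096-byte `PAGE_SIZE` are assumptions made for illustration, and the exact boundary test in the kernel may differ.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def writeback_order(synced_size, new_size, dirty_pages):
    """Return the order of flash writes for the given dirty page indices.

    synced_size -- inode size last written to the flash media
    new_size    -- current in-memory inode size
    dirty_pages -- iterable of dirty page indices
    """
    ops = []
    for page in sorted(dirty_pages):
        # End of the valid data in this page, clamped to the file size.
        page_end = min((page + 1) * PAGE_SIZE, new_size)
        if page_end > synced_size:
            # The page reaches beyond the on-flash inode size:
            # force inode write-back before writing the page.
            ops.append("inode(size=%d)" % new_size)
            synced_size = new_size
        ops.append("page %d" % page)
    return ops


# Appending three pages to an empty file: the inode goes first.
append_plan = writeback_order(0, 3 * PAGE_SIZE, [0, 1, 2])

# Pages inside the synchronized size are written without touching the inode.
mixed_plan = writeback_order(2 * PAGE_SIZE, 3 * PAGE_SIZE, [0, 2])
```

In the first plan the inode write precedes every page, which is why a power cut leaves a full-size file with a hole rather than data pages beyond the on-flash inode size. In the second plan page 0 is written directly, and the inode is flushed only before page 2.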
* Re: UBIFS power cut issues
From: JiSheng Zhang @ 2009-09-11 9:33 UTC
To: dedekind1; +Cc: linux-mtd

Hi Artem,

2009/9/10 Artem Bityutskiy <dedekind1@gmail.com>:
>
>  * User creates an empty file f.dat. The file is synchronous, or
>    UBIFS is mounted in synchronous mode. User calls the write()
>    function with a 10MiB buffer.
>  * The kernel first copies all 10MiB of the data to the page cache.
>    Inode size is changed to 10MiB as well and the inode is marked
>    as dirty. Nothing has been written to the flash media so far. If
>    a power cut happens at this point, the user will end up with an
>    empty f.dat file.
>  * UBIFS sees that the I/O has to be synchronous, and starts
>    synchronizing the inode. First of all, it writes the inode node
>    to the flash media. If a power cut happens at this moment, the
>    user will end up with a 10MiB file which contains no data
>    (hole), and if he read this file, he will get 10MiB of zeroes.
>  * UBIFS starts writing the data. If a power cut happens at this
>    point, the user will end up with a 10MiB file containing a hole
>    at the end.
>
> Note, if the I/O was not synchronous, UBIFS would skip the last step and
> would just return. And the actual write-back would then happen in
> back-ground. But power cuts during write-back could anyway lead to files
> with holes at the end.

Thanks very much for this excellent document. I like it very much.

>
> Thus, synchronous I/O in UBIFS provides less guarantees than JFFS2 I/O -
> UBIFS has an effect of holes at the end of files. In ideal world
> applications should not assume anything about the contents of files
> which were not synchronized before a power-cut has happened. And
> "mainstream" file-systems like ext3 do not provide JFSS2-like
> guarantees.
>
> However, UBIFS is sometimes used as a JFFS2 replacement and people may
> want it to behave the same way as JFFS2 if it is mounted synchronously.
> This is doable, but needs some non-trivial development, so this was not
> implemented so far. On the other hand, there was no strong demand. You
> may implement this as an excercise, or you may try to convince UBIFS
> authors to do this.

Hmmm, this behavior (a hole at the end of the file) is acceptable.

Thanks again,
Jisheng

* Re: UBIFS power cut issues
From: Artem Bityutskiy @ 2009-09-11 10:06 UTC
To: JiSheng Zhang; +Cc: linux-mtd

On 09/11/2009 12:33 PM, JiSheng Zhang wrote:
>> However, UBIFS is sometimes used as a JFFS2 replacement and people may
>> want it to behave the same way as JFFS2 if it is mounted synchronously.
>> This is doable, but needs some non-trivial development, so this was not
>> implemented so far. On the other hand, there was no strong demand. You
>> may implement this as an excercise, or you may try to convince UBIFS
>> authors to do this.
>
> Hmmm, this style(there's hole at the end of file) can be accepted.

Also note that, due to an MM bug, the pages are sometimes written not
exactly in order. Adrian made a patch for this, but the patch has not
yet made it upstream:

http://marc.info/?l=linux-kernel&m=125233252015797&w=2

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)