* UBIFS and MLC NAND Flash
From: Pedro I. Sanchez @ 2010-03-22 16:57 UTC
To: linux-mtd

Hello,

I have a few questions regarding this topic.

1. The UBIFS FAQ has a summary of the state of the support for MLC NAND
flash here:

http://www.linux-mtd.infradead.org/faq/ubifs.html#L_ubifs_mlc

The question is, is this still the case? Does the FAQ reflect the
current state of affairs?

2. I have several boards with MLC NAND flash running the Linux kernel
2.6.29 and UBIFS. I am seeing a fairly large rate of file "corruption"
errors, files that all of a sudden become unreadable. Curiously enough,
they have been read-only files in all cases, program executables and
shared libraries.

Would upgrading to a more recent kernel, or back porting the latest
UBIFS code, help? Shall I expect better support for MLC NAND flash in
the latest UBIFS code?

3. I am also seeing other errors where it is the U-Boot or the Kernel
partitions that become corrupted. UBIFS is not involved there directly
since these partitions are at the mtd level and outside the UBI layer.

More specifically, my flash is partitioned as mtd0, mtd1, mtd2, mtd3,
mtd4. Only mtd4 has UBI/UBIFS on top. Is it possible that some flash
handling problems in UBIFS (mtd4) "spill over" other non-UBIFS mtd
partitions?

4. Other than minimizing flash writes, is there any other suggestion on
what to do to improve on the failure rate I see in the file system?

Thank you in advance, I would very much appreciate your answers.

--
Pedro

* Re: UBIFS and MLC NAND Flash
From: Artem Bityutskiy @ 2010-04-08 8:22 UTC
To: Pedro I. Sanchez; +Cc: linux-mtd

Hi,

On Mon, 2010-03-22 at 12:57 -0400, Pedro I. Sanchez wrote:
> Hello,
>
> I have a few questions regarding this topic.
>
> 1. The UBIFS FAQ has a summary of the state of the support for MLC NAND
> flash here:
>
> http://www.linux-mtd.infradead.org/faq/ubifs.html#L_ubifs_mlc
>
> The question is, is this still the case? Does the FAQ reflect the
> current state of affairs?

I think so.

> 2. I have several boards with MLC NAND flash running the Linux kernel
> 2.6.29 and UBIFS. I am seeing a fairly large rate of file "corruption"
> errors, files that all of a sudden become unreadable. Curiously enough,
> they have been read-only files in all cases, program executables and
> shared libraries.

Hmm. Do you do unclean power cuts?

> Would upgrading to a more recent kernel, or back porting the latest
> UBIFS code, help? Shall I expect better support for MLC NAND flash in
> the latest UBIFS code?

You did not specify whether you pulled the ubifs-v2.6.29.git tree. If
you did this, then your UBI/UBIFS should be the same as in the latest
kernels. Please, do this, although this will probably not solve your
corruption problems, but you'll have other bug-fixes we have made since
2.6.29 times.

Please, take a look here as well:
http://www.linux-mtd.infradead.org/doc/ubifs.html#L_how_send_bugreport

Did you run MTD tests? If not, run them. You may have issues at the
driver/HW level.

As for "Shall I expect better support": not as far as I know, because I
have not heard of anyone working on this. But you could do that and
contribute.

> 3. I am also seeing other errors where it is the U-Boot or the Kernel
> partitions that become corrupted. UBIFS is not involved there directly
> since these partitions are at the mtd level and outside the UBI layer.
>
> More specifically, my flash is partitioned as mtd0, mtd1, mtd2, mtd3,
> mtd4. Only mtd4 has UBI/UBIFS on top. Is it possible that some flash
> handling problems in UBIFS (mtd4) "spill over" other non-UBIFS mtd
> partitions?

Maybe. Start with validating your MTD driver using MTD tests. This may
help you to narrow down the problem.

> 4. Other than minimizing flash writes, is there any other suggestion on
> what to do to improve on the failure rate I see in the file system?

Not sure. You should carefully investigate your problems and find out
the nature of the failures, and then we can discuss that.

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

* Re: UBIFS and MLC NAND Flash
From: twebb @ 2010-04-19 21:57 UTC
To: dedekind1; +Cc: Pedro I. Sanchez, linux-mtd

> > 2. I have several boards with MLC NAND flash running the Linux kernel
> > 2.6.29 and UBIFS. I am seeing a fairly large rate of file "corruption"
> > errors, files that all of a sudden become unreadable. Curiously enough,
> > they have been read-only files in all cases, program executables and
> > shared libraries.
>
> Hmm. Do you do unclean power cuts?
>
> > Would upgrading to a more recent kernel, or back porting the latest
> > UBIFS code, help? Shall I expect better support for MLC NAND flash in
> > the latest UBIFS code?
>
> You did not specify whether you pulled the ubifs-v2.6.29.git tree. If
> you did this, then your UBI/UBIFS should be the same as in the latest
> kernels. Please, do this, although this will probably not solve your
> corruption problems, but you'll have other bug-fixes we have made since
> 2.6.29 times.

Pedro,
I'm seeing very similar issues with MLC+UBIFS, though not only with
read-only files. Have you made any progress in your investigation or
while trying Artem's suggestions? I'm about to start digging into
this and would be interested to hear about any issues you may have
come across. Do you have any opinion on whether this "corruption" is
related to the information posted on the linux-mtd site at...
http://www.linux-mtd.infradead.org/faq/ubifs.html#L_ubifs_mlc ?

A few notes:
- I do occasionally have power cuts, but my understanding was that
  UBI/UBIFS was very tolerant of that condition.
- I use CONFIG_MTD_UBI_WL_THRESHOLD=256
- I'm using linux-2.6.29

Thanks,
twebb

* Re: UBIFS and MLC NAND Flash
From: Pedro I. Sanchez @ 2010-04-19 23:52 UTC
To: twebb; +Cc: linux-mtd, dedekind1

twebb wrote:
>>> 2. I have several boards with MLC NAND flash running the Linux kernel
>>> 2.6.29 and UBIFS. I am seeing a fairly large rate of file "corruption"
>>> errors, files that all of a sudden become unreadable. Curiously enough,
>>> they have been read-only files in all cases, program executables and
>>> shared libraries.
>> Hmm. Do you do unclean power cuts?
>>
>>> Would upgrading to a more recent kernel, or back porting the latest
>>> UBIFS code, help? Shall I expect better support for MLC NAND flash in
>>> the latest UBIFS code?
>> You did not specify whether you pulled the ubifs-v2.6.29.git tree. If
>> you did this, then your UBI/UBIFS should be the same as in the latest
>> kernels. Please, do this, although this will probably not solve your
>> corruption problems, but you'll have other bug-fixes we have made since
>> 2.6.29 times.
>
> Pedro,
> I'm seeing very similar issues with MLC+UBIFS, though not only with
> read-only files. Have you made any progress in your investigation or
> while trying Artem's suggestions? I'm about to start digging into
> this and would be interested to hear about any issues you may have
> come across. Do you have any opinion on whether this "corruption" is
> related to the information posted on the linux-mtd site at...
> http://www.linux-mtd.infradead.org/faq/ubifs.html#L_ubifs_mlc ?
>
> A few notes:
> - I do occasionally have power cuts, but my understanding was that
> UBI/UBIFS was very tolerant of that condition.
> - I use CONFIG_MTD_UBI_WL_THRESHOLD=256
> - I'm using linux-2.6.29
>
> Thanks,
> twebb

I haven't had the opportunity to use 2.6.29 with the ubifs backport yet.
However, I ran my devices through an extended operational test and
couldn't reproduce the errors. In this test I avoided any power cuts on
purpose because I wanted to verify that the boards' software was not at
fault during normal conditions.

I still see the errors in the deployed boards, and those are subject to
random power cuts. After analyzing the logs I conclude that there is a
strong correlation between the power cuts and the corruption errors.
The typical scenario is a board running fine for two months without
interruption, then a power cut, and then upon reboot a myriad of UBIFS
error messages show up (see the sample following my signature).

I'm almost convinced now that power cuts are the culprit. I will be
conducting tests in the next few days to fully verify this. I'll post my
results.

Thanks,

--
Pedro

Mar 16 00:58:22 blazepoint kernel: uncorrectable error : <3>UBI error: ubi_io_read: error -74 while reading 2560 bytes from PEB 376:213720, read 2560 bytes
Mar 16 00:58:22 blazepoint kernel: UBIFS error (pid 1507): try_read_node: cannot read node type 1 from LEB 322:209624, error -74
Mar 16 00:58:22 blazepoint kernel: uncorrectable error : <3>UBI error: ubi_io_read: error -74 while reading 2560 bytes from PEB 376:213720, read 2560 bytes
Mar 16 00:58:22 blazepoint kernel: UBIFS error (pid 1507): ubifs_check_node: bad CRC: calculated 0x5edaa128, read 0xacfe20eb
Mar 16 00:58:22 blazepoint kernel: UBIFS error (pid 1507): ubifs_check_node: bad node at LEB 322:209624
Mar 16 00:58:22 blazepoint kernel: UBIFS error (pid 1507): ubifs_read_node: expected node type 1
Mar 16 00:58:22 blazepoint kernel: UBIFS error (pid 1507): do_readpage: cannot read page 261 of inode 3463, error -117
Mar 16 00:58:22 blazepoint kernel: uncorrectable error : <3>UBI error: ubi_io_read: error -74 while reading 2560 bytes from PEB 376:213720, read 2560 bytes

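For anyone wanting to repeat the log analysis on their own boards, here is a minimal sketch in Python that tallies the error codes and eraseblocks in messages like the sample above. The function name `summarize` and the regular expressions are illustrative assumptions, not part of any UBI/UBIFS tool. The errno names assume Linux numbering: -74 is EBADMSG, which the MTD layer reports for an uncorrectable ECC error, and -117 is EUCLEAN, which UBIFS returns when a node fails its check.

```python
import re
from collections import Counter

# Linux errno numbers that appear in the sample log (assumed Linux numbering).
ERRNO_NAMES = {74: "EBADMSG", 117: "EUCLEAN"}

# "UBI error ... error -74" or "UBIFS error ... error -117"
ERROR_RE = re.compile(r"\b(UBI|UBIFS) error\b.*?\berror (-\d+)")
# "PEB 376:213720" or "LEB 322:209624" (eraseblock number:offset)
BLOCK_RE = re.compile(r"\b(PEB|LEB) (\d+):(\d+)")


def summarize(lines):
    """Count error codes and affected eraseblocks in kernel log lines."""
    codes = Counter()
    blocks = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            code = -int(match.group(2))
            codes[ERRNO_NAMES.get(code, str(code))] += 1
        for kind, number, _offset in BLOCK_RE.findall(line):
            blocks[(kind, int(number))] += 1
    return codes, blocks


# Three lines taken from the log sample, with the line wraps rejoined.
SAMPLE = [
    "Mar 16 00:58:22 blazepoint kernel: uncorrectable error : <3>UBI error: "
    "ubi_io_read: error -74 while reading 2560 bytes from PEB 376:213720, "
    "read 2560 bytes",
    "Mar 16 00:58:22 blazepoint kernel: UBIFS error (pid 1507): try_read_node: "
    "cannot read node type 1 from LEB 322:209624, error -74",
    "Mar 16 00:58:22 blazepoint kernel: UBIFS error (pid 1507): do_readpage: "
    "cannot read page 261 of inode 3463, error -117",
]

if __name__ == "__main__":
    codes, blocks = summarize(SAMPLE)
    print(dict(codes))
    print(dict(blocks))
```

Running it over logs from several boards and comparing the affected PEB numbers against reboot timestamps is one way to check whether the failures cluster around power cuts, as suspected here.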
* Re: UBIFS and MLC NAND Flash
From: Pedro I. Sanchez @ 2010-05-03 14:48 UTC
To: linux-mtd; +Cc: twebb, dedekind1

Pedro I. Sanchez wrote:
> twebb wrote:
>>>> 2. I have several boards with MLC NAND flash running the Linux kernel
>>>> 2.6.29 and UBIFS. I am seeing a fairly large rate of file "corruption"
>>>> errors, files that all of a sudden become unreadable. Curiously enough,
>>>> they have been read-only files in all cases, program executables and
>>>> shared libraries.
>>> Hmm. Do you do unclean power cuts?
>>>
>>>> Would upgrading to a more recent kernel, or back porting the latest
>>>> UBIFS code, help? Shall I expect better support for MLC NAND flash in
>>>> the latest UBIFS code?
>>> You did not specify whether you pulled the ubifs-v2.6.29.git tree. If
>>> you did this, then your UBI/UBIFS should be the same as in the latest
>>> kernels. Please, do this, although this will probably not solve your
>>> corruption problems, but you'll have other bug-fixes we have made since
>>> 2.6.29 times.
>>
>> Pedro,
>> I'm seeing very similar issues with MLC+UBIFS, though not only with
>> read-only files. Have you made any progress in your investigation or
>> while trying Artem's suggestions? I'm about to start digging into
>> this and would be interested to hear about any issues you may have
>> come across. Do you have any opinion on whether this "corruption" is
>> related to the information posted on the linux-mtd site at...
>> http://www.linux-mtd.infradead.org/faq/ubifs.html#L_ubifs_mlc ?
>>
>> A few notes:
>> - I do occasionally have power cuts, but my understanding was that
>> UBI/UBIFS was very tolerant of that condition.
>> - I use CONFIG_MTD_UBI_WL_THRESHOLD=256
>> - I'm using linux-2.6.29
>>
>> Thanks,
>> twebb
>
> I haven't had the opportunity to use 2.6.29 with the ubifs backport yet.
> However, I ran my devices through an extended operational test and
> couldn't reproduce the errors. In this test I avoided any power cuts on
> purpose because I wanted to verify that the boards' software was not at
> fault during normal conditions.
>
> I still see the errors in the deployed boards, and those are subject to
> random power cuts. After analyzing the logs I conclude that there is a
> strong correlation between the power cuts and the corruption errors.
> The typical scenario is a board running fine for two months without
> interruption, then a power cut, and then upon reboot a myriad of UBIFS
> error messages show up (see the sample following my signature).
>
> I'm almost convinced now that power cuts are the culprit. I will be
> conducting tests in the next few days to fully verify this. I'll post my
> results.
>
> Thanks,
>

My tests are done. I arrived at the following conclusions:

1. All errors, zero-size files and random corruption, are related to
power outages.

2. I was not able to reproduce any corruption errors under stable
conditions (no sudden power cuts).

We are now making some hardware mods to better handle power outages,
basically holding the processor's reset line until power is stable.

Item 2 above speaks well of the UBIFS layer. Even though we have MLC
flash, I couldn't replicate any corruption problems under stable power.
However, we are moving to SLC flash for our next round of boards anyway,
just to be safe (or safer!).

Thanks,

--
Pedro

* Re: UBIFS and MLC NAND Flash
From: Artem Bityutskiy @ 2010-05-04 14:17 UTC
To: Pedro I. Sanchez; +Cc: twebb, linux-mtd

On Mon, 2010-05-03 at 10:48 -0400, Pedro I. Sanchez wrote:
> Pedro I. Sanchez wrote:
> > twebb wrote:
> >>>> 2. I have several boards with MLC NAND flash running the Linux kernel
> >>>> 2.6.29 and UBIFS. I am seeing a fairly large rate of file "corruption"
> >>>> errors, files that all of a sudden become unreadable. Curiously enough,
> >>>> they have been read-only files in all cases, program executables and
> >>>> shared libraries.
> >>> Hmm. Do you do unclean power cuts?
> >>>
> >>>> Would upgrading to a more recent kernel, or back porting the latest
> >>>> UBIFS code, help? Shall I expect better support for MLC NAND flash in
> >>>> the latest UBIFS code?
> >>> You did not specify whether you pulled the ubifs-v2.6.29.git tree. If
> >>> you did this, then your UBI/UBIFS should be the same as in the latest
> >>> kernels. Please, do this, although this will probably not solve your
> >>> corruption problems, but you'll have other bug-fixes we have made since
> >>> 2.6.29 times.
> >>
> >> Pedro,
> >> I'm seeing very similar issues with MLC+UBIFS, though not only with
> >> read-only files. Have you made any progress in your investigation or
> >> while trying Artem's suggestions? I'm about to start digging into
> >> this and would be interested to hear about any issues you may have
> >> come across. Do you have any opinion on whether this "corruption" is
> >> related to the information posted on the linux-mtd site at...
> >> http://www.linux-mtd.infradead.org/faq/ubifs.html#L_ubifs_mlc ?
> >>
> >> A few notes:
> >> - I do occasionally have power cuts, but my understanding was that
> >> UBI/UBIFS was very tolerant of that condition.
> >> - I use CONFIG_MTD_UBI_WL_THRESHOLD=256
> >> - I'm using linux-2.6.29
> >>
> >> Thanks,
> >> twebb
> >
> > I haven't had the opportunity to use 2.6.29 with the ubifs backport yet.
> > However, I ran my devices through an extended operational test and
> > couldn't reproduce the errors. In this test I avoided any power cuts on
> > purpose because I wanted to verify that the boards' software was not at
> > fault during normal conditions.
> >
> > I still see the errors in the deployed boards, and those are subject to
> > random power cuts. After analyzing the logs I conclude that there is a
> > strong correlation between the power cuts and the corruption errors.
> > The typical scenario is a board running fine for two months without
> > interruption, then a power cut, and then upon reboot a myriad of UBIFS
> > error messages show up (see the sample following my signature).
> >
> > I'm almost convinced now that power cuts are the culprit. I will be
> > conducting tests in the next few days to fully verify this. I'll post my
> > results.
> >
> > Thanks,
> >
> My tests are done. I arrived at the following conclusions:
>
> 1. All errors, zero-size files and random corruption, are related to
> power outages.

Well, on SLC we did a huge amount of power-cut testing and were always
able to mount the FS. Zero-length files and zeroes in files are possible,
and this is described here:

http://www.linux-mtd.infradead.org/faq/ubifs.html#L_empty_file
http://www.linux-mtd.infradead.org/faq/ubifs.html#L_end_hole

Not sure what you mean by random corruption, but this is probably
something which should not happen. A better description would be
interesting.

Anyway, if you have problems, they are probably MLC-specific, and of
course it would be nice if someone with the real HW would investigate
and fix them...

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

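On the zero-length files mentioned above: a minimal sketch in Python of the application-side pattern commonly used for files that must survive a power cut, assuming the empty files come from data that was still in the page cache when power was lost. The function name `atomic_write` is hypothetical and is not taken from the thread or from UBIFS. It writes to a temporary file in the same directory, calls fsync, and then renames over the target, so after an outage the file should hold either the old or the new contents rather than nothing. It assumes a Linux system and does nothing for read-only files going bad, which is the part of this thread that looks MLC-specific.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` (bytes) via write, fsync, rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same file system as the target,
    # otherwise the rename below is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data out of the page cache
        os.replace(tmp_path, path)  # atomic rename over the old file
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory so the rename itself reaches the medium.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    target = os.path.join(demo_dir, "settings.conf")
    atomic_write(target, b"mode=safe\n")
    with open(target, "rb") as f:
        print(f.read())
```

The trade-off is extra flash writes per update, so it is best reserved for the handful of files whose loss would stop the board from booting or running.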