* UBIFS and MLC NAND Flash
@ 2010-03-22 16:57 Pedro I. Sanchez
2010-04-08 8:22 ` Artem Bityutskiy
0 siblings, 1 reply; 6+ messages in thread
From: Pedro I. Sanchez @ 2010-03-22 16:57 UTC (permalink / raw)
To: linux-mtd
Hello,
I have a few questions regarding this topic.
1. The UBIFS FAQ has a summary of the state of the support for MLC NAND
flash here:
http://www.linux-mtd.infradead.org/faq/ubifs.html#L_ubifs_mlc
The question is, is this still the case? Does the FAQ reflect the
current state of affairs?
2. I have several boards with MLC NAND flash running the Linux kernel
2.6.29 and UBIFS. I am seeing a fairly large rate of file "corruption"
errors, files that all of a sudden become unreadable. Curiously enough,
they have been read-only files in all cases, program executables and
shared libraries.
Would upgrading to a more recent kernel, or back porting the latest
UBIFS code, help? Shall I expect better support for MLC NAND flash in
the latest UBIFS code?
3. I am also seeing other errors where it is the U-Boot or the Kernel
partitions that become corrupted. UBIFS is not involved there directly
since these partitions are at the mtd level and outside the UBI layer.
More specifically, my flash is partitioned as mtd0, mtd1, mtd2, mtd3,
mtd4. Only mtd4 has UBI/UBIFS on top. Is it possible that some flash
handling problems in UBIFS (mtd4) "spill over" other non-UBIFS mtd
partitions?
4. Other than minimizing flash writes, is there any other suggestion on
what to do to improve on the failure rate I see in the file system?
Thank you in advance; I would very much appreciate your answers.
--
Pedro
* Re: UBIFS and MLC NAND Flash
2010-03-22 16:57 UBIFS and MLC NAND Flash Pedro I. Sanchez
@ 2010-04-08 8:22 ` Artem Bityutskiy
2010-04-19 21:57 ` twebb
0 siblings, 1 reply; 6+ messages in thread
From: Artem Bityutskiy @ 2010-04-08 8:22 UTC (permalink / raw)
To: Pedro I. Sanchez; +Cc: linux-mtd
Hi,
On Mon, 2010-03-22 at 12:57 -0400, Pedro I. Sanchez wrote:
> Hello,
>
> I have a few questions regarding this topic.
>
> 1. The UBIFS FAQ has a summary of the state of the support for MLC NAND
> flash here:
>
> http://www.linux-mtd.infradead.org/faq/ubifs.html#L_ubifs_mlc
>
> The question is, is this still the case? Does the FAQ reflect the
> current state of affairs?
I think so.
> 2. I have several boards with MLC NAND flash running the Linux kernel
> 2.6.29 and UBIFS. I am seeing a fairly large rate of file "corruption"
> errors, files that all of a sudden become unreadable. Curiously enough,
> they have been read-only files in all cases, program executables and
> shared libraries.
Hmm. Do you have unclean power cuts?
> Would upgrading to a more recent kernel, or back porting the latest
> UBIFS code, help? Shall I expect better support for MLC NAND flash in
> the latest UBIFS code?
You did not specify whether you pulled the ubifs-v2.6.29.git tree. If
you did, then your UBI/UBIFS should be the same as in the latest
kernels. If not, please do so. This will probably not solve your
corruption problems, but you'll get the other bug-fixes we have made
since 2.6.29.
Please, take a look here as well:
http://www.linux-mtd.infradead.org/doc/ubifs.html#L_how_send_bugreport
Did you run the MTD tests? If not, run them. You may have issues at the
driver/HW level.
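Before pointing the MTD tests at a partition, it helps to confirm which
MTD index maps to which partition, since several of the tests erase and
rewrite the device they are given. The following is only a minimal
sketch, assuming the usual /proc/mtd layout (index, size and erase size
in hex, quoted name). The helper name parse_proc_mtd and the sample
partition names and sizes are invented for illustration and do not
describe Pedro's boards.

```python
import re

# Matches one data row of /proc/mtd, for example:
#   mtd4: 0f000000 00080000 "rootfs"
_ROW = re.compile(r'^mtd(\d+):\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+"(.*)"$')


def parse_proc_mtd(text):
    """Return one dict per partition: index, size, erasesize (bytes), name."""
    parts = []
    for line in text.splitlines():
        m = _ROW.match(line.strip())
        if m is None:
            continue  # header row or blank line
        parts.append({
            "index": int(m.group(1)),
            "size": int(m.group(2), 16),
            "erasesize": int(m.group(3), 16),
            "name": m.group(4),
        })
    return parts


# Hypothetical sample; on a real board read the text from /proc/mtd instead.
SAMPLE = '''dev:    size   erasesize  name
mtd0: 00100000 00080000 "u-boot"
mtd4: 0f000000 00080000 "rootfs"
'''

if __name__ == "__main__":
    for p in parse_proc_mtd(SAMPLE):
        blocks = p["size"] // p["erasesize"]
        print("mtd%d %-8s %d eraseblocks" % (p["index"], p["name"], blocks))
```

Checking the index and name this way before loading a test module
reduces the chance of running a destructive test against the boot
loader or kernel partition by mistake.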
As for "Shall I expect better support": not as far as I know, because I
have not heard of anyone working on this. But you could do that and
contribute.
> 3. I am also seeing other errors where it is the U-Boot or the Kernel
> partitions that become corrupted. UBIFS is not involved there directly
> since these partitions are at the mtd level and outside the UBI layer.
>
> More specifically, my flash is partitioned as mtd0, mtd1, mtd2, mtd3,
> mtd4. Only mtd4 has UBI/UBIFS on top. Is it possible that some flash
> handling problems in UBIFS (mtd4) "spill over" other non-UBIFS mtd
> partitions?
Maybe. Start by validating your MTD driver with the MTD tests. This may
help you narrow down the problem.
> 4. Other than minimizing flash writes, is there any other suggestion on
> what to do to improve on the failure rate I see in the file system?
Not sure. You should carefully investigate your problems and find out the
nature of the failures, and then we can discuss that.
--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
* Re: UBIFS and MLC NAND Flash
2010-04-08 8:22 ` Artem Bityutskiy
@ 2010-04-19 21:57 ` twebb
2010-04-19 23:52 ` Pedro I. Sanchez
0 siblings, 1 reply; 6+ messages in thread
From: twebb @ 2010-04-19 21:57 UTC (permalink / raw)
To: dedekind1; +Cc: Pedro I. Sanchez, linux-mtd
>
> > 2. I have several boards with MLC NAND flash running the Linux kernel
> > 2.6.29 and UBIFS. I am seeing a fairly large rate of file "corruption"
> > errors, files that all of a sudden become unreadable. Curiously enough,
> > they have been read-only files in all cases, program executables and
> > shared libraries.
>
> Hmm. Do you do unclean power cuts?
>
> > Would upgrading to a more recent kernel, or back porting the latest
> > UBIFS code, help? Shall I expect better support for MLC NAND flash in
> > the latest UBIFS code?
>
> You did not specify whether you pulled the ubifs-v2.6.29.git tree. If
> you did this, then your UBI/UBIFS should be the same as in the latest
> kernels. Please, do this, although this will probably not solve your
> corruption problems, but you'll have other bug-fixes we have made since
> 2.6.29 times.
>
>
Pedro,
I'm seeing very similar issues with MLC+UBIFS, though not only with
read-only files. Have you made any progress in your investigation or
while trying Artem's suggestions? I'm about to start digging into
this and would be interested to hear about any issues you may have
come across. Do you have any opinion on whether this "corruption" is
related to the information posted on the linux-mtd site at...
http://www.linux-mtd.infradead.org/faq/ubifs.html#L_ubifs_mlc ?
A few notes:
- I do occasionally have power cuts, but my understanding was that
UBI/UBIFS was very tolerant of that condition.
- I use CONFIG_MTD_UBI_WL_THRESHOLD=256
- I'm using linux-2.6.29
Thanks,
twebb
* Re: UBIFS and MLC NAND Flash
2010-04-19 21:57 ` twebb
@ 2010-04-19 23:52 ` Pedro I. Sanchez
2010-05-03 14:48 ` Pedro I. Sanchez
0 siblings, 1 reply; 6+ messages in thread
From: Pedro I. Sanchez @ 2010-04-19 23:52 UTC (permalink / raw)
To: twebb; +Cc: linux-mtd, dedekind1
twebb wrote:
>>> 2. I have several boards with MLC NAND flash running the Linux kernel
>>> 2.6.29 and UBIFS. I am seeing a fairly large rate of file "corruption"
>>> errors, files that all of a sudden become unreadable. Curiously enough,
>>> they have been read-only files in all cases, program executables and
>>> shared libraries.
>> Hmm. Do you do unclean power cuts?
>>
>>> Would upgrading to a more recent kernel, or back porting the latest
>>> UBIFS code, help? Shall I expect better support for MLC NAND flash in
>>> the latest UBIFS code?
>> You did not specify whether you pulled the ubifs-v2.6.29.git tree. If
>> you did this, then your UBI/UBIFS should be the same as in the latest
>> kernels. Please, do this, although this will probably not solve your
>> corruption problems, but you'll have other bug-fixes we have made since
>> 2.6.29 times.
>>
>>
>
> Pedro,
> I'm seeing very similar issues with MLC+UBIFS, though not only with
> read-only files. Have you made any progress in your investigation or
> while trying Artem's suggestions? I'm about to start digging into
> this and would be interested to hear about any issues you may have
> come across. Do you have any opinion on whether this "corruption" is
> related to the information posted on the linux-mtd site at...
> http://www.linux-mtd.infradead.org/faq/ubifs.html#L_ubifs_mlc ?
>
> A few notes:
> - I do occasionally have power cuts, but my understanding was that
> UBI/UBIFS was very tolerant of that condition.
> - I use CONFIG_MTD_UBI_WL_THRESHOLD=256
> - I'm using linux-2.6.29
>
> Thanks,
> twebb
I haven't had the opportunity to use 2.6.29 with the ubifs backport yet.
However, I ran my devices through an extended operational test and
couldn't reproduce the errors. In this test I avoided power cuts on
purpose because I wanted to verify that the boards' software was not at
fault under normal conditions.
I still see the errors on the deployed boards, which are subject to
random power cuts. After analyzing the logs I conclude that there is a
strong correlation between the power cuts and the corruption errors.
The typical scenario is a board running fine for two months without
interruption, then a power cut, and then upon reboot a myriad of UBIFS
error messages show up (see the sample following my signature).
I'm almost convinced now that power cuts are the culprit. I will be
conducting tests in the next few days to fully verify this. I'll post my
results.
Thanks,
--
Pedro
Mar 16 00:58:22 blazepoint kernel: uncorrectable error : <3>UBI error: ubi_io_read: error -74 while reading 2560 bytes from PEB 376:213720, read 2560 bytes
Mar 16 00:58:22 blazepoint kernel: UBIFS error (pid 1507): try_read_node: cannot read node type 1 from LEB 322:209624, error -74
Mar 16 00:58:22 blazepoint kernel: uncorrectable error : <3>UBI error: ubi_io_read: error -74 while reading 2560 bytes from PEB 376:213720, read 2560 bytes
Mar 16 00:58:22 blazepoint kernel: UBIFS error (pid 1507): ubifs_check_node: bad CRC: calculated 0x5edaa128, read 0xacfe20eb
Mar 16 00:58:22 blazepoint kernel: UBIFS error (pid 1507): ubifs_check_node: bad node at LEB 322:209624
Mar 16 00:58:22 blazepoint kernel: UBIFS error (pid 1507): ubifs_read_node: expected node type 1
Mar 16 00:58:22 blazepoint kernel: UBIFS error (pid 1507): do_readpage: cannot read page 261 of inode 3463, error -117
Mar 16 00:58:22 blazepoint kernel: uncorrectable error : <3>UBI error: ubi_io_read: error -74 while reading 2560 bytes from PEB 376:213720, read 2560 bytes
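The numeric codes in the excerpt above are negated Linux errno values:
74 is EBADMSG and 117 is EUCLEAN. The sketch below shows one way to
pull the codes and the PEB/LEB locations out of such lines when going
through a long log. The errno numbers are the standard Linux ones; the
short interpretations in the table and the helper name annotate are a
reader's gloss for illustration, not something stated in this thread.

```python
import re

# Standard Linux errno numbers seen in the excerpt. The descriptions are an
# informal gloss (assumption), not text taken from the UBI/UBIFS documentation.
ERRNO = {
    74: ("EBADMSG", "read failed with an uncorrectable ECC error"),
    117: ("EUCLEAN", "UBIFS detected corrupted data, e.g. a bad node CRC"),
}

_ERR = re.compile(r"error -(\d+)")
_LOC = re.compile(r"\b(PEB|LEB) (\d+):(\d+)")


def annotate(line):
    """Return (errno names, [(kind, block, offset), ...]) for one log line."""
    names = [ERRNO.get(int(n), ("unknown", ""))[0] for n in _ERR.findall(line)]
    locs = [(kind, int(blk), int(off)) for kind, blk, off in _LOC.findall(line)]
    return names, locs


if __name__ == "__main__":
    sample = [
        "UBI error: ubi_io_read: error -74 while reading 2560 bytes "
        "from PEB 376:213720, read 2560 bytes",
        "UBIFS error (pid 1507): do_readpage: cannot read page 261 "
        "of inode 3463, error -117",
    ]
    for line in sample:
        print(annotate(line))
```

Grouping the output by PEB number is a quick way to see whether the
failures cluster in a few physical eraseblocks, as they do in this
sample, where every read error points at PEB 376.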
* Re: UBIFS and MLC NAND Flash
2010-04-19 23:52 ` Pedro I. Sanchez
@ 2010-05-03 14:48 ` Pedro I. Sanchez
2010-05-04 14:17 ` Artem Bityutskiy
0 siblings, 1 reply; 6+ messages in thread
From: Pedro I. Sanchez @ 2010-05-03 14:48 UTC (permalink / raw)
To: linux-mtd; +Cc: twebb, dedekind1
My tests are done. I arrived at the following conclusions:
1. All errors, zero-size files and random corruption, are related to
power outages.
2. I was not able to reproduce any corruption errors under stable
conditions (no sudden power cuts).
We are now making some hardware mods to better handle power outages,
basically holding the processor's reset line until power is stable.
Item 2 above speaks well of the UBIFS layer. Even though we have MLC
flash, I couldn't replicate any corruption problems under stable power.
However, we are moving to SLC flash for our next round of boards, just
to be safe (or safer!).
Thanks,
--
Pedro
* Re: UBIFS and MLC NAND Flash
2010-05-03 14:48 ` Pedro I. Sanchez
@ 2010-05-04 14:17 ` Artem Bityutskiy
0 siblings, 0 replies; 6+ messages in thread
From: Artem Bityutskiy @ 2010-05-04 14:17 UTC (permalink / raw)
To: Pedro I. Sanchez; +Cc: twebb, linux-mtd
On Mon, 2010-05-03 at 10:48 -0400, Pedro I. Sanchez wrote:
> My tests are done. I arrived to the following conclusions:
>
> 1. All errors, zero-size files and random corruption, are related to
> power outages.
Well, on SLC we did a huge number of power-cut tests and were always able
to mount the FS. Zero-size files and zeroes in files are possible, and
this is described here:
http://www.linux-mtd.infradead.org/faq/ubifs.html#L_empty_file
http://www.linux-mtd.infradead.org/faq/ubifs.html#L_end_hole
Not sure what you mean by random corruption, but this is probably
something which should not happen. A better description would be
interesting.
Anyway, if you have problems, they are probably MLC-specific, and of
course it would be nice if someone with real HW investigated and fixed
them...
--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)