* Does UBIFS NAND ECC info get stored in OOB?
@ 2014-12-30 19:44 Steve deRosier
  2014-12-31  2:04 ` Josh Wu
  2014-12-31  2:15 ` hujianyang
  0 siblings, 2 replies; 8+ messages in thread
From: Steve deRosier @ 2014-12-30 19:44 UTC
  To: linux-mtd

 Hi All,

Sorry if this is a stupid question, but I found a number of old
archived messages that explicitly state that UBIFS (actually, probably
UBI) doesn't utilize the OOB of a NAND flash at all for storing the
ECC information. And as near as I can tell from behavior and code, it
does certainly store ECC info in the OOB area.

So, does UBIFS utilize the OOB area to store ECC bits?  And if not,
where/how does it store this information?

I'm starting to assume that you're simply saying that UBIFS itself
doesn't use the OOB area, nor even handles the ECC itself, but that's
up to the chip driver layer. And that the driver will handle the ECC
and OOB as appropriate.  Am I correct?

Details of my question:

We're having some trouble with filesystem corruption on a Linux 3.8
kernel based on an Atmel SAM9g25 controller.  The controller does have
the PMECC unit.

It utilizes the mtd/nand/atmel_nand.c driver. This driver has the
PMECC bits in it and does appear to write/read/correct-via ECC bits in
the OOB area of the NAND.

We're using UBIFS for our rootfs.

And yes, I understand the 3.8 kernel is old, and we're upgrading, but
I'm trying to figure out why we're having the problems as I'm assuming
it's not a bug in the code but more of a configuration or process or
hardware issue.

One example of the claim that UBI and UBIFS don't use the OOB area is
"this is not a problem for UBI/UBIFS, because neither UBIFS nor UBI use
OOB area;" from http://www.linux-mtd.infradead.org/doc/ubifs.html

Thanks for any help,
- Steve


* Re: Does UBIFS NAND ECC info get stored in OOB?
  2014-12-30 19:44 Does UBIFS NAND ECC info get stored in OOB? Steve deRosier
@ 2014-12-31  2:04 ` Josh Wu
  2015-01-02 18:06   ` Steve deRosier
  2014-12-31  2:15 ` hujianyang
  1 sibling, 1 reply; 8+ messages in thread
From: Josh Wu @ 2014-12-31  2:04 UTC
  To: linux-mtd

Hi, Steve

On 12/31/2014 3:44 AM, Steve deRosier wrote:
>   Hi All,
>
> Sorry if this is a stupid question, but I found a number of old
> archived messages that explicitly state that UBIFS (actually, probably
> UBI) doesn't utilize the OOB of a NAND flash at all for storing the
> ECC information.
Could you list out these UBI/UBIFS messages so that people can help?

> And as near as I can tell from behavior and code, it
> does certainly store ECC info in the OOB area.
>
> So, does UBIFS utilize the OOB area to store ECC bits?  And if not,
> where/how does it store this information?
>
> I'm starting to assume that you're simply saying that UBIFS itself
> doesn't use the OOB area, nor even handles the ECC itself, but that's
> up to the chip driver layer. And that the driver will handle the ECC
> and OOB as appropriate.  Am I correct?
Yes, I think you are correct.

>
> Details of my question:
>
> We're having some trouble with filesystem corruption on a Linux 3.8
> kernel based on an Atmel SAM9g25 controller.  The controller does have
> the PMECC unit.

Can your system boot up correctly and work some of the time? Or can you
not mount your UBI filesystem at all?
Could you get me a system boot log showing the corruption, and another
boot log without corruption?
>
> It utilizes the mtd/nand/atmel_nand.c driver. This driver has the
> PMECC bits in it and does appear to write/read/correct-via ECC bits in
> the OOB area of the NAND.
>
> We're using UBIFS for our rootfs.
>
> And yes, I understand the 3.8 kernel is old, and we're upgrading, but
> I'm trying to figure out why we're having the problems as I'm assuming
> it's not a bug in the code but more of a configuration or process or
> hardware issue.
So could you give me some details of your PMECC configuration?
Is it 4-bit correction per 512 bytes, or something else? What is your
NAND flash's minimum ECC requirement?

Best Regards,
Josh Wu

>
> One example of the claim that UBI and UBIFS don't use the OOB area is
> "this is not a problem for UBI/UBIFS, because neither UBIFS nor UBI use
> OOB area;" from http://www.linux-mtd.infradead.org/doc/ubifs.html
>
> Thanks for any help,
> - Steve
>


* Re: Does UBIFS NAND ECC info get stored in OOB?
  2014-12-30 19:44 Does UBIFS NAND ECC info get stored in OOB? Steve deRosier
  2014-12-31  2:04 ` Josh Wu
@ 2014-12-31  2:15 ` hujianyang
  2015-01-02 18:12   ` Steve deRosier
  1 sibling, 1 reply; 8+ messages in thread
From: hujianyang @ 2014-12-31  2:15 UTC
  To: Steve deRosier; +Cc: Richard Weinberger, linux-mtd, Artem Bityutskiy

On 2014/12/31 3:44, Steve deRosier wrote:
> 
> So, does UBIFS utilize the OOB area to store ECC bits?  And if not,
> where/how does it store this information?

No, UBIFS doesn't use the OOB area.

See the function ubi_io_mark_bad() in drivers/mtd/ubi/io.c. If an
eraseblock goes bad, the UBI driver uses the MTD interface to mark it as bad.

> 
> I'm starting to assume that you're simply saying that UBIFS itself
> doesn't use the OOB area, nor even handles the ECC itself, but that's
> up to the chip driver layer. And that the driver will handle the ECC
> and OOB as appropriate.  Am I correct?
>

Yes, you are right.

UBIFS doesn't write directly to the OOB area. MTD or the NAND controller
is responsible for managing the data in the OOB area.

I think the statement "neither UBIFS nor UBI use OOB area" is made in
comparison to filesystems that write data directly to the OOB area through
the interface provided by the lower layer. For example, YAFFS2? I'm not
sure about that.

>
> And yes, I understand the 3.8 kernel is old, and we're upgrading, but
> I'm trying to figure out why we're having the problems as I'm assuming
> it's not a bug in the code but more of a configuration or process or
> hardware issue.
>

What kind of problems?

Thanks,
Hu


* Re: Does UBIFS NAND ECC info get stored in OOB?
  2014-12-31  2:04 ` Josh Wu
@ 2015-01-02 18:06   ` Steve deRosier
  2015-01-04  3:52     ` Josh Wu
  0 siblings, 1 reply; 8+ messages in thread
From: Steve deRosier @ 2015-01-02 18:06 UTC
  To: Josh Wu; +Cc: linux-mtd

Hi Josh,


On Tue, Dec 30, 2014 at 6:04 PM, Josh Wu <josh.wu@atmel.com> wrote:
> Hi, Steve
>
> On 12/31/2014 3:44 AM, Steve deRosier wrote:
>>
>>   Hi All,
>>
>> Sorry if this is a stupid question, but I found a number of old
>> archived messages that explicitly state that UBIFS (actually, probably
>> UBI) doesn't utilize the OOB of a NAND flash at all for storing the
>> ECC information.
>
> Could you list out these UBI/UBIFS messages so that people can help?
>

Sorry, I found them about a month ago and have already cleared the
tabs.  But one clear version of it is directly on the pages at the MTD
site:

http://www.linux-mtd.infradead.org/doc/ubifs.html  under the title
"UBIFS and MLC NAND flash": "because neither UBIFS nor UBI use OOB
area;"
and here:
http://www.linux-mtd.infradead.org/faq/ubi.html#L_why_no_oob

The list messages were from ~5 years ago or so from Artem IIRC.



>
> Can your system boot up correctly and work some of the time? Or can you
> not mount your UBI filesystem at all?
> Could you get me a system boot log showing the corruption, and another
> boot log without corruption?

Our system actually works 99.999% of the time. Which is why it's been
so difficult finding the problem. It's not so much a mount or
boot-time problem, though it happens sometimes then.  The system
usually works fine for a while, then you set it on a shelf for a
couple of weeks and when you bring it back up, it then randomly fails.
Sometimes at boot, sometimes when reading or running a specific file.
Sometimes the error message is an LZO muckup one, sometimes it's a bad
data node.  Typical:

UBIFS error (pid 919): read_block: bad data node (block 290, inode 67)
     magic          0x6101831
     crc            0x92684951
     node_type      1 (data node)
     group_type     0 (no node group)
     sqnum          297
     len            2152
     key            (67, data, 290)
     size           4096
     compr_typ      1
     data size      2104
     data:
     00000000: 2f 04 88 05 87 06 86 07 85 08 84 09 46 0e 58 00 00 24
00 00 00 cc 4f 00 00 f8 f1 fb ff 38 01 50
 ...
     00000820: 5d 02 92 5d 01 d1 4d 04 e4 4d 03 0a 7c 03 4d 03 bd ec
44 cc 6f 11 00 00
UBIFS error (pid 919): do_readpage: cannot read page 290 of inode 67, error -22

I think I've tracked it down to one of our junior engineers choosing
to use `nandwrite -n` in an update script he wrote. This results in
lack of ECC information being created on flashing it.  Not to mention
the writing of 0xffs and killing of the UBI ECs.  His tool then goes
further and ubiattaches the system, which then corrects the UBI
metadata, including writing the ECC data.  Which results in a weird
situation where a quick look at the flash data shows ECC data there,
but if you dig deeper, it's missing on the data nodes further on in
the system.

So, the rewrite of the UBI metadata with the ECC info obfuscated the
problem. It looks like we're not writing the ECC data on most of the
data. It works fine, then a bit flips and it fails later.
Unfortunately, waiting for bitflips is random and not terribly
testable. Knowing what I know now, I am able to update it with the old
script, manually cause a bitflip and see the exact same symptoms. And
with the rewritten version with ubiformat, I can do the same test and
it works fully.
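
To illustrate why a page written without ECC ends up as a "bad data node" rather than being silently repaired, here is a minimal sketch. It uses Python's `zlib.crc32` as a stand-in for the per-node CRC (the real UBIFS CRC parameters are not reproduced here), and `flip_bit` and `node_is_valid` are hypothetical helper names. A checksum can only detect the flipped bit; without ECC written by the driver there is nothing to correct it.

```python
import zlib


def flip_bit(buf: bytes, byte_pos: int, bit_pos: int) -> bytes:
    """Return a copy of buf with a single bit inverted."""
    out = bytearray(buf)
    out[byte_pos] ^= 1 << bit_pos
    return bytes(out)


def node_is_valid(payload: bytes, stored_crc: int) -> bool:
    """Detect (but not repair) corruption, the way a per-node CRC does."""
    return zlib.crc32(payload) == stored_crc


# Stand-in node payload and the CRC computed when it was written.
payload = bytes(range(256)) * 8
stored_crc = zlib.crc32(payload)

# One bit flips while the unit sits on the shelf.
damaged = flip_bit(payload, 100, 3)

print(node_is_valid(payload, stored_crc))   # True
print(node_is_valid(damaged, stored_crc))   # False
```

With ECC present in the OOB, the driver would have repaired the flip before the filesystem ever saw the data, so the CRC check would still pass.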


>
> So could you give me some details of your PMECC configuration?
> Is it 4-bit correction per 512 bytes, or something else? What is your
> NAND flash's minimum ECC requirement?
>

4 bits, yes.  And the requirement is 4 bits.  For clarity, here's the
relevant chunk from the devicetree:

    nand0: nand@40000000 {
        nand-bus-width = <8>;
        nand-ecc-mode = "hw";
        atmel,has-pmecc; /* enable PMECC */
        atmel,pmecc-cap = <4>;
        atmel,pmecc-sector-size = <512>;
        atmel,pmecc-lookup-table-offset = <0x8000 0x10000>;
        nand-on-flash-bbt;
        status = "okay";

Thanks,
- Steve


* Re: Does UBIFS NAND ECC info get stored in OOB?
  2014-12-31  2:15 ` hujianyang
@ 2015-01-02 18:12   ` Steve deRosier
  0 siblings, 0 replies; 8+ messages in thread
From: Steve deRosier @ 2015-01-02 18:12 UTC
  To: hujianyang; +Cc: linux-mtd

Hu,


On Tue, Dec 30, 2014 at 6:15 PM, hujianyang <hujianyang@huawei.com> wrote:
> On 2014/12/31 3:44, Steve deRosier wrote:
>> I'm starting to assume that you're simply saying that UBIFS itself
>> doesn't use the OOB area, nor even handles the ECC itself, but that's
>> up to the chip driver layer. And that the driver will handle the ECC
>> and OOB as appropriate.  Am I correct?
>>
>
> Yes, you are right.
>
> UBIFS doesn't write directly to the OOB area. MTD or the NAND controller
> is responsible for managing the data in the OOB area.
>

Thanks for your answer, I was looking to see if I interpreted it right
and it looks like that's the case.


> I think the statement "neither UBIFS nor UBI use OOB area" is made in
> comparison to filesystems that write data directly to the OOB area through
> the interface provided by the lower layer. For example, YAFFS2? I'm not
> sure about that.
>

I have no clue. I'm glad that UBIFS doesn't use OOB for its own
meta-data because I'm about to up our ECC strength to 8-bit, which
will take up nearly all the OOB area.  I was just worried that it
meant that it was keeping and doing ECC elsewhere, and I wanted to
know how it was doing that.
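
As a rough check on how much OOB a stronger ECC setting would consume, here is a small sketch. It assumes the usual BCH parity size of m * t bits per sector (m = 13 for 512-byte sectors, 14 for 1024-byte sectors), a 2048-byte page with a 64-byte OOB, and 2 bytes reserved for the bad block marker. The page geometry is an assumption and is not stated anywhere in this thread; `pmecc_ecc_bytes` and `oob_usage` are hypothetical helper names.

```python
def pmecc_ecc_bytes(cap_bits: int, sector_size: int) -> int:
    """ECC bytes per sector for a BCH code correcting cap_bits errors."""
    m = 12 + sector_size // 512        # 13 for 512-byte, 14 for 1024-byte sectors
    return (m * cap_bits + 7) // 8     # round the bit count up to whole bytes


def oob_usage(page_size: int, oob_size: int, cap_bits: int,
              sector_size: int, reserved: int = 2):
    """Return (total ECC bytes per page, OOB bytes left over)."""
    sectors = page_size // sector_size
    ecc = sectors * pmecc_ecc_bytes(cap_bits, sector_size)
    return ecc, oob_size - reserved - ecc


# Assumed geometry: 2048-byte page, 64-byte OOB, 512-byte sectors.
for cap in (4, 8):
    ecc, free = oob_usage(2048, 64, cap, 512)
    print(f"{cap}-bit: {ecc} ECC bytes, {free} OOB bytes free")
# 4-bit: 28 ECC bytes, 34 OOB bytes free
# 8-bit: 52 ECC bytes, 10 OOB bytes free
```

Under those assumptions, 8-bit correction over 512-byte sectors leaves only about 10 spare OOB bytes per page.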

Thanks,
- Steve


* Re: Does UBIFS NAND ECC info get stored in OOB?
  2015-01-02 18:06   ` Steve deRosier
@ 2015-01-04  3:52     ` Josh Wu
  2015-01-09  5:05       ` Steve deRosier
  0 siblings, 1 reply; 8+ messages in thread
From: Josh Wu @ 2015-01-04  3:52 UTC
  To: Steve deRosier; +Cc: linux-mtd

Hi, Steve

On 1/3/2015 2:06 AM, Steve deRosier wrote:
> Hi Josh,
>
>
> On Tue, Dec 30, 2014 at 6:04 PM, Josh Wu <josh.wu@atmel.com> wrote:
>> Hi, Steve
>>
>> On 12/31/2014 3:44 AM, Steve deRosier wrote:
>>>    Hi All,
>>>
>>> Sorry if this is a stupid question, but I found a number of old
>>> archived messages that explicitly state that UBIFS (actually, probably
>>> UBI) doesn't utilize the OOB of a NAND flash at all for storing the
>>> ECC information.
>> Could you list out these UBI/UBIFS messages so that people can help?
>>
> Sorry, I found them about a month ago and have already cleared the
> tabs.  But one clear version of it is directly on the pages at the MTD
> site:
>
> http://www.linux-mtd.infradead.org/doc/ubifs.html  under the title
> "UBIFS and MLC NAND flash": "because neither UBIFS nor UBI use OOB
> area;"
> and here:
> http://www.linux-mtd.infradead.org/faq/ubi.html#L_why_no_oob
>
> The list messages were from ~5 years ago or so from Artem IIRC.
Sorry, I didn't make myself clear here. I just wanted to see the error
message you get when your UBI system fails to work.
But never mind, I saw it further down in your message  :)

>
>
>
>> Can your system boot up correctly and work some of the time? Or can you
>> not mount your UBI filesystem at all?
>> Could you get me a system boot log showing the corruption, and another
>> boot log without corruption?
> Our system actually works 99.999% of the time. Which is why it's been
> so difficult finding the problem.
Okay.

> It's not so much a mount or
> boot-time problem, though it happens sometimes then.  The system
> usually works fine for a while, then you set it on a shelf for a
> couple of weeks and when you bring it back up, it then randomly fails.
> Sometimes at boot, sometimes when reading or running a specific file.
> Sometimes the error message is an LZO muckup one, sometimes it's a bad
> data node.  Typical:
>
> UBIFS error (pid 919): read_block: bad data node (block 290, inode 67)
>       magic          0x6101831
>       crc            0x92684951
>       node_type      1 (data node)
>       group_type     0 (no node group)
>       sqnum          297
>       len            2152
>       key            (67, data, 290)
>       size           4096
>       compr_typ      1
>       data size      2104
>       data:
>       00000000: 2f 04 88 05 87 06 86 07 85 08 84 09 46 0e 58 00 00 24
> 00 00 00 cc 4f 00 00 f8 f1 fb ff 38 01 50
>   ...
>       00000820: 5d 02 92 5d 01 d1 4d 04 e4 4d 03 0a 7c 03 4d 03 bd ec
> 44 cc 6f 11 00 00
> UBIFS error (pid 919): do_readpage: cannot read page 290 of inode 67, error -22
There seem to be some UBI fixes in the 3.8.x stable tree. It would be
better if you could apply them.

➜  mainline git:(99f3cd5) ✗  git log --oneline v3.8..v3.8.13 | grep -i UBI
1afae69 UBIFS: make space fixup work in the remount case
d90dc15 UBIFS: fix double free of ubifs_orphan objects
ce7f4e8 UBIFS: fix use of freed ubifs_orphan objects
>
> I think I've tracked it down to one of our junior engineers choosing
> to use `nandwrite -n` in an update script he wrote. This results in
> lack of ECC information being created on flashing it.  Not to mention
> the writing of 0xffs and killing of the UBI ECs.  His tool then goes
> further and ubiattaches the system, which then corrects the UBI
> metadata, including writing the ECC data.  Which results in a weird
> situation where a quick look at the flash data shows ECC data there,
> but if you dig deeper, it's missing on the data nodes further on in
> the system.
>
> So, the rewrite of the UBI metadata with the ECC info obfuscated the
> problem. It looks like we're not writing the ECC data on most of the
> data. It works fine, then a bit flips and it fails later.
> Unfortunately, waiting for bitflips is random and not terribly
> testable. Knowing what I know now, I am able to update it with the old
> script, manually cause a bitflip and see the exact same symptoms. And
> with the rewritten version with ubiformat, I can do the same test and
> it works fully.
For the at91sam9x5ek PMECC, we cannot do PMECC correction on an erased
page (all 0xff) that has bit flips.
The reason is that the 9x5ek PMECC generates a non-0xff ECC code for an
erased page (all 0xff in the page).

This causes two issues:
1. If a bitflip happens in an erased page's OOB area, it will cause a
PMECC error.
2. If a bitflip happens in an erased page's data area, it cannot be
corrected, and the driver won't report any ECC error. I am not sure
whether this can cause a problem. UBI may keep track of erased pages, so
the data corruption may not matter. When UBI writes data to an erased
page that has a bitflip, the PMECC code is written correctly into the OOB
area, so the bitflip can then be corrected by the PMECC hardware.

I think you can manually insert a bitflip into an erased page to see
whether this causes your issue.
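
One common way to cope with bitflips in erased pages, sketched here only as an illustration and not as what the 3.8 atmel_nand driver does, is to count the zero bits in the page and its OOB: if there are only a few, treat the page as erased and hand back all 0xff instead of running ECC on it. The names `count_zero_bits`, `looks_erased` and `read_page` are hypothetical, and the threshold is applied per page here for simplicity, whereas ECC strength is really per sector.

```python
def count_zero_bits(buf: bytes) -> int:
    """Number of bits that are 0 (an erased NAND byte is 0xff, so 0 zeros)."""
    return sum(8 - bin(b).count("1") for b in buf)


def looks_erased(data: bytes, oob: bytes, max_bitflips: int) -> bool:
    """True if the page is all 0xff apart from at most max_bitflips bits."""
    return count_zero_bits(data) + count_zero_bits(oob) <= max_bitflips


def read_page(data: bytes, oob: bytes, max_bitflips: int = 4):
    """Return (payload, status); erased pages are normalised to all 0xff."""
    if looks_erased(data, oob, max_bitflips):
        return b"\xff" * len(data), "erased"
    return data, "needs-ecc"


erased = bytes([0xFF]) * 2048
oob = bytes([0xFF]) * 64

# An erased page with a single flipped bit in its data area.
flipped = bytearray(erased)
flipped[10] &= ~0x01 & 0xFF

print(read_page(bytes(flipped), oob)[1])        # erased
print(read_page(bytes(2048), oob)[1])           # needs-ecc
```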

>
>
>> So could you give me some details of your PMECC configuration?
>> Is it 4-bit correction per 512 bytes, or something else? What is your
>> NAND flash's minimum ECC requirement?
>>
> 4 bits, yes.  And the requirement is 4 bits.  For clarity, here's the
> relevant chunk from the devicetree:
>
>      nand0: nand@40000000 {
>          nand-bus-width = <8>;
>          nand-ecc-mode = "hw";
>          atmel,has-pmecc; /* enable PMECC */
>          atmel,pmecc-cap = <4>;
>          atmel,pmecc-sector-size = <512>;
>          atmel,pmecc-lookup-table-offset = <0x8000 0x10000>;
>          nand-on-flash-bbt;
>          status = "okay";
These seem OK.
Be careful: if you use 1024 as the sector size, you need to apply the fix
2fa831f9db1f <mtd: atmel_nand: pmecc: fix failure to correct bit error
in 1024-bytes sector>

>
> Thanks,
> - Steve
Best Regards,
Josh Wu


* Re: Does UBIFS NAND ECC info get stored in OOB?
  2015-01-04  3:52     ` Josh Wu
@ 2015-01-09  5:05       ` Steve deRosier
  2015-01-12  8:33         ` Josh Wu
  0 siblings, 1 reply; 8+ messages in thread
From: Steve deRosier @ 2015-01-09  5:05 UTC
  To: Josh Wu; +Cc: linux-mtd@lists.infradead.org

Hi Josh,

On Sat, Jan 3, 2015 at 7:52 PM, Josh Wu <josh.wu@atmel.com> wrote:
> Hi, Steve
>
> On 1/3/2015 2:06 AM, Steve deRosier wrote:
>>
>
> There seem to be some UBI fixes in the 3.8.x stable tree. It would be
> better if you could apply them.
>
> ➜  mainline git:(99f3cd5) ✗  git log --oneline v3.8..v3.8.13 | grep -i UBI
> 1afae69 UBIFS: make space fixup work in the remount case
> d90dc15 UBIFS: fix double free of ubifs_orphan objects
> ce7f4e8 UBIFS: fix use of freed ubifs_orphan objects

Will do!  I had pulled in a number of other upstreamed fixes but these
must be newer than last time I looked. Thanks!



>
> For the at91sam9x5ek PMECC, we cannot do PMECC correction on an erased
> page (all 0xff) that has bit flips.
> The reason is that the 9x5ek PMECC generates a non-0xff ECC code for an
> erased page (all 0xff in the page).
>
> This causes two issues:
> 1. If a bitflip happens in an erased page's OOB area, it will cause a
> PMECC error.
> 2. If a bitflip happens in an erased page's data area, it cannot be
> corrected, and the driver won't report any ECC error. I am not sure
> whether this can cause a problem. UBI may keep track of erased pages, so
> the data corruption may not matter. When UBI writes data to an erased
> page that has a bitflip, the PMECC code is written correctly into the OOB
> area, so the bitflip can then be corrected by the PMECC hardware.
>
> I think you can manually insert a bitflip into an erased page to see
> whether this causes your issue.

Well, our issue is clearly caused by the use of `nandwrite -n`.
Moving to ubiformat fixes it.

But, what you pointed out made me interested in a few more problem scenarios:

1. Bitflip in ECC data of a valid data page
2. Bitflip in data area of an erased page
3. Bitflip in the ECC data of an erased page.

So I tried them.  I was hoping for the best and fearing the worst.
Thankfully I effectively got the best.
1. This was the scary one for me. But, it seems that this is handled
nicely by the ECC process. dmesg printed:
    atmel_nand 40000000.nand: Bit flip in OOB, oob_byte_pos: 48,
bit_pos: 0, 0xec -> 0xed
This is awesome, it found the flip, identified where it was and fixed it. Yay.

Both 2 and 3 were non-events.  As near as I could tell, UBIFS and the
MTD system ignored those. I have some special code that noticed it,
but none of the stock stuff did.  Writing and reading data there
worked fine.  And, I'd expect that if the flip caused a flip in data
that was written and later corrected, it would be fine.

>
> These seem OK.
> Be careful: if you use 1024 as the sector size, you need to apply the fix
> 2fa831f9db1f <mtd: atmel_nand: pmecc: fix failure to correct bit error in
> 1024-bytes sector>
>

Thanks for the heads up on this fix.  We're using 512, but after
reading some stuff, I'm thinking that going to 1024 might make some
sense, so I might need that.

Thanks,
- Steve


* Re: Does UBIFS NAND ECC info get stored in OOB?
  2015-01-09  5:05       ` Steve deRosier
@ 2015-01-12  8:33         ` Josh Wu
  0 siblings, 0 replies; 8+ messages in thread
From: Josh Wu @ 2015-01-12  8:33 UTC
  To: Steve deRosier; +Cc: ricard.wanderlof, linux-mtd@lists.infradead.org, ezequiel

Hi, Steve

On 1/9/2015 1:05 PM, Steve deRosier wrote:
> Hi Josh,
>
> On Sat, Jan 3, 2015 at 7:52 PM, Josh Wu <josh.wu@atmel.com> wrote:
>> Hi, Steve
>>
>> On 1/3/2015 2:06 AM, Steve deRosier wrote:
>> There seem to be some UBI fixes in the 3.8.x stable tree. It would be
>> better if you could apply them.
>>
>> ➜  mainline git:(99f3cd5) ✗  git log --oneline v3.8..v3.8.13 | grep -i UBI
>> 1afae69 UBIFS: make space fixup work in the remount case
>> d90dc15 UBIFS: fix double free of ubifs_orphan objects
>> ce7f4e8 UBIFS: fix use of freed ubifs_orphan objects
> Will do!  I had pulled in a number of other upstreamed fixes but these
> must be newer than last time I looked. Thanks!
>
>
>
>> For the at91sam9x5ek PMECC, we cannot do PMECC correction on an erased
>> page (all 0xff) that has bit flips.
>> The reason is that the 9x5ek PMECC generates a non-0xff ECC code for an
>> erased page (all 0xff in the page).
>>
>> This causes two issues:
>> 1. If a bitflip happens in an erased page's OOB area, it will cause a
>> PMECC error.
>> 2. If a bitflip happens in an erased page's data area, it cannot be
>> corrected, and the driver won't report any ECC error. I am not sure
>> whether this can cause a problem. UBI may keep track of erased pages, so
>> the data corruption may not matter. When UBI writes data to an erased
>> page that has a bitflip, the PMECC code is written correctly into the OOB
>> area, so the bitflip can then be corrected by the PMECC hardware.
>>
>> I think you can manually insert a bitflip into an erased page to see
>> whether this causes your issue.
> Well, our issue is clearly caused by the use of `nandwrite -n`.
> Moving to ubiformat fixes it.
>
> But, what you pointed out made me interested in a few more problem scenarios:
>
> 1. Bitflip in ECC data of a valid data page
> 2. Bitflip in data area of an erased page
> 3. Bitflip in the ECC data of an erased page.
>
> So I tried them.  I was hoping for the best and fearing the worst.
> Thankfully I effectively got the best.
> 1. This was the scary one for me. But, it seems that this is handled
> nicely by the ECC process. dmesg printed:
>      atmel_nand 40000000.nand: Bit flip in OOB, oob_byte_pos: 48,
> bit_pos: 0, 0xec -> 0xed
> This is awesome, it found the flip, identified where it was and fixed it. Yay.
Yes. In this case, the ECC and the data sector (512 bytes) are combined
into one code word, so any bitflip that happens within the code word can
be corrected.

That means if only the PMECC driver operates on the OOB, i.e. all used
OOB data is ECC and is part of a code word, then any bitflips within the
PMECC's capability can be corrected.
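
The code word idea can be shown with a toy example. The sketch below uses a Hamming(7,4) code purely as a stand-in (PMECC uses BCH, which is far stronger and not reproduced here), and `encode` and `decode` are hypothetical helper names. The point is that a flipped parity bit is located and fixed exactly like a flipped data bit, because both are part of the same code word.

```python
def encode(d):
    """Encode 4 data bits into a 7-bit code word (positions 1..7)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4          # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def decode(c):
    """Correct up to one flipped bit; return (data bits, flipped position or 0)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s3   # syndrome is the 1-based error position
    if pos:
        c[pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], pos


data = [1, 0, 1, 1]
word = encode(data)

# Flip a parity bit (position 4), the analogue of a bitflip in the OOB.
word[3] ^= 1
print(decode(word))   # ([1, 0, 1, 1], 4)
```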

>
> Both 2 and 3 were non-events.  As near as I could tell, UBIFS and the
> MTD system ignored those. I have some special code that noticed it,
> but none of the stock stuff did.  Writing and reading data there
> worked fine.  And, I'd expect that if the flip caused a flip in data
> that was written and later corrected, it would be fine.
This test result sounds good to me. Actually, I was worried about this
kind of situation.
I haven't checked the UBI code in detail, but I guess this is because UBI
keeps track of the erased pages, so it doesn't read an erased page at all;
it only writes data into it.

Best Regards,
Josh Wu

>
>> These seem OK.
>> Be careful: if you use 1024 as the sector size, you need to apply the fix
>> 2fa831f9db1f <mtd: atmel_nand: pmecc: fix failure to correct bit error in
>> 1024-bytes sector>
>>
> Thanks for the heads up on this fix.  We're using 512, but after
> reading some stuff, I'm thinking that going to 1024 might make some
> sense, so I might need that.
>
> Thanks,
> - Steve


end of thread, other threads:[~2015-01-12  8:36 UTC | newest]

Thread overview: 8+ messages
2014-12-30 19:44 Does UBIFS NAND ECC info get stored in OOB? Steve deRosier
2014-12-31  2:04 ` Josh Wu
2015-01-02 18:06   ` Steve deRosier
2015-01-04  3:52     ` Josh Wu
2015-01-09  5:05       ` Steve deRosier
2015-01-12  8:33         ` Josh Wu
2014-12-31  2:15 ` hujianyang
2015-01-02 18:12   ` Steve deRosier
