public inbox for linux-mtd@lists.infradead.org
* Single bit error correction on NAND.
@ 2005-07-28 19:40 DataCom - Virgílio
  2005-07-28 21:06 ` Thomas Gleixner
  0 siblings, 1 reply; 4+ messages in thread
From: DataCom - Virgílio @ 2005-07-28 19:40 UTC (permalink / raw)
  To: linux-mtd

I have noticed that NAND support on MTD can detect/correct single bit errors 
in 256 bytes, using software ECC. However, the correct data is not written 
back to flash. In other words, the bit stays "flipped" in flash. Isn't that 
dangerous? I mean, if another bitflip occurs in the same 256 bytes, we won't 
be able to correct by software ECC...
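To make the concern concrete, here is a minimal sketch of a single-bit-correcting code over one 256-byte chunk. It is a toy model in the spirit of Hamming-style NAND ECC, not the MTD implementation: the names `toy_ecc_calculate` and `toy_ecc_correct`, the 22-bit code layout, and the return values are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_BYTES 256u
#define CHUNK_BITS  (CHUNK_BYTES * 8u)  /* 2048 bit positions, 11-bit index */
#define INDEX_MASK  0x7FFu

/* Toy 22-bit code (hypothetical layout, not the kernel's):
 * low 11 bits  = XOR of the indices of all set data bits,
 * high 11 bits = XOR of the inverted indices of all set data bits. */
static uint32_t toy_ecc_calculate(const uint8_t *data)
{
    uint32_t idx = 0, inv = 0;

    assert(data != NULL);
    for (uint32_t bit = 0; bit < CHUNK_BITS; bit++) {
        if (data[bit / 8] & (1u << (bit % 8))) {
            idx ^= bit;
            inv ^= ~bit & INDEX_MASK;
        }
    }
    return (inv << 11) | idx;
}

/* Returns 0 if the chunk is clean, 1 if a single flipped bit was found
 * (and, if it was in the data, repaired in the RAM buffer),
 * -EBADMSG if the damage cannot be corrected. */
static int toy_ecc_correct(uint8_t *data, uint32_t stored_ecc)
{
    uint32_t syndrome = stored_ecc ^ toy_ecc_calculate(data);
    uint32_t lo = syndrome & INDEX_MASK;
    uint32_t hi = (syndrome >> 11) & INDEX_MASK;

    if (syndrome == 0)
        return 0;
    if ((lo ^ hi) == INDEX_MASK) {
        /* One data bit flipped: the low half is its index. */
        data[lo / 8] ^= (uint8_t)(1u << (lo % 8));
        return 1;
    }
    if ((syndrome & (syndrome - 1)) == 0)
        return 1;  /* One bit of the stored code flipped; data is intact. */
    return -EBADMSG;  /* Two or more flips: detected but not correctable. */
}
```

Note that the repair only touches the buffer in RAM. The flipped bit remains in flash, so a second flip in the same chunk produces the `-EBADMSG` case.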

Am I missing something here? Thanks in advance.

-- 
Virgílio Silva


* Re: Single bit error correction on NAND.
  2005-07-28 19:40 Single bit error correction on NAND DataCom - Virgílio
@ 2005-07-28 21:06 ` Thomas Gleixner
  2005-07-28 21:40   ` DataCom - Virgílio
  0 siblings, 1 reply; 4+ messages in thread
From: Thomas Gleixner @ 2005-07-28 21:06 UTC (permalink / raw)
  To: DataCom - Virgílio; +Cc: linux-mtd

On Thu, 2005-07-28 at 16:40 -0300, DataCom - Virgílio wrote:
> I have noticed that NAND support on MTD can detect/correct single bit errors 
> in 256 bytes, using software ECC. However, the correct data is not written 
> back to flash. In other words, the bit stays "flipped" in flash. Isn't that 
> dangerous? I mean, if another bitflip occurs in the same 256 bytes, we won't 
> be able to correct by software ECC...
> 
> Am I missing something here?

Yes. 

It is not and cannot be the responsibility of the NAND driver to handle
this. Where should it write the data? The existing data _cannot_ be
overwritten without erasing the whole eraseblock, with the implied risk
of losing _ALL_ the data in the block on a power failure.

The NAND driver returns the corrected data and an appropriate error code
and the calling (e.g. filesystem) driver has to deal with the data
copying or whatever the developer has decided to be the correct
solution.
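The division of labour described above could be sketched as follows. This is only an illustration of a caller-side policy: `toy_decide`, the `toy_action` values, and the `TOY_ECC_CORRECTED` status are hypothetical names, and the sketch assumes the driver reports a corrected bit-flip with a status distinct from a clean read.

```c
#include <assert.h>  /* only needed for the checks in the usage note */
#include <errno.h>

/* Hypothetical status meaning "data returned, but ECC had to fix a bit".
 * This value is an assumption of the sketch, not a code from the driver. */
#define TOY_ECC_CORRECTED 1

enum toy_action {
    TOY_USE_DATA,          /* clean read: nothing else to do */
    TOY_RELOCATE_AND_USE,  /* use the data, then copy it to a fresh block */
    TOY_REPORT_ERROR       /* data is not trustworthy */
};

/* Caller-side policy: map the driver's read status to an action. */
static enum toy_action toy_decide(int read_status)
{
    switch (read_status) {
    case 0:
        return TOY_USE_DATA;
    case TOY_ECC_CORRECTED:
        /* The flash still holds the flipped bit. Since it cannot be
         * overwritten in place, the corrected data goes elsewhere and
         * the old block is erased later. */
        return TOY_RELOCATE_AND_USE;
    case -EBADMSG:
    default:
        return TOY_REPORT_ERROR;
    }
}
```

With this policy, `toy_decide(TOY_ECC_CORRECTED)` yields `TOY_RELOCATE_AND_USE`, which only works if the layer below actually distinguishes a corrected read from a clean one.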

The NAND driver provides only the lowest level of access to the flash
device.

tglx


* Re: Single bit error correction on NAND.
  2005-07-28 21:06 ` Thomas Gleixner
@ 2005-07-28 21:40   ` DataCom - Virgílio
  2005-07-29  7:35     ` Thomas Gleixner
  0 siblings, 1 reply; 4+ messages in thread
From: DataCom - Virgílio @ 2005-07-28 21:40 UTC (permalink / raw)
  To: linux-mtd

On Thursday 28 July 2005 18:06, Thomas Gleixner wrote:
> On Thu, 2005-07-28 at 16:40 -0300, DataCom - Virgílio wrote:
> > I have noticed that NAND support on MTD can detect/correct single bit
> > errors in 256 bytes, using software ECC. However, the correct data is not
> > written back to flash. In other words, the bit stays "flipped" in flash.
> > Isn't that dangerous? I mean, if another bitflip occurs in the same 256
> > bytes, we won't be able to correct by software ECC...
> >
> > Am I missing something here?
>
> Yes.
>
> It is not and cannot be the responsibility of the NAND driver to handle
> this. Where should it write the data? The existing data _cannot_ be
> overwritten without erasing the whole eraseblock, with the implied risk
> of losing _ALL_ the data in the block on a power failure.
>
> The NAND driver returns the corrected data and an appropriate error code

Maybe I'm missing something, but the function nand_do_read_ecc returns 
-EBADMSG only if an uncorrectable error is detected by nand_correct_data. If 
just one bit is flipped, nand_correct_data corrects it, and nand_do_read_ecc 
returns 0.

> and the calling (e.g. filesystem) driver has to deal with the data
> copying or whatever the developer has decided to be the correct
> solution.

So it would be JFFS2's responsibility to write the corrected data back to 
flash after a bit-flip is detected and corrected by the NAND driver. However, 
apparently it doesn't.

My question is: do you think it's safe to just leave the bit flipped in flash? 
I mean, every time I try to read it, ECC will handle it. But what are the odds 
of another bit-flip in the same 256 bytes?
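As a rough way to frame that question: assuming, purely for illustration, that each bit flips independently with the same small probability p over the period of interest, the chance that at least one of the remaining 2047 data bits in the chunk also flips is

```latex
P(\text{second flip}) = 1 - (1 - p)^{2047} \approx 2047\,p
\quad \text{for } p \ll 1/2047
```

The independence assumption is a simplification, p is not a figure taken from any datasheet, and the stored ECC bytes are ignored here.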

-- 
VIRGÍLIO Silva


* Re: Single bit error correction on NAND.
  2005-07-28 21:40   ` DataCom - Virgílio
@ 2005-07-29  7:35     ` Thomas Gleixner
  0 siblings, 0 replies; 4+ messages in thread
From: Thomas Gleixner @ 2005-07-29  7:35 UTC (permalink / raw)
  To: DataCom - Virgílio; +Cc: linux-mtd

On Thu, 2005-07-28 at 18:40 -0300, DataCom - Virgílio wrote:
> > It is not and cannot be the responsibility of the NAND driver to handle
> > this. Where should it write the data? The existing data _cannot_ be
> > overwritten without erasing the whole eraseblock, with the implied risk
> > of losing _ALL_ the data in the block on a power failure.
> >
> > The NAND driver returns the corrected data and an appropriate error code
> 
> Maybe I'm missing something, but the function nand_do_read_ecc returns 
> -EBADMSG only if an uncorrectable error is detected by nand_correct_data. If 
> just one bit is flipped, nand_correct_data corrects it, and nand_do_read_ecc 
> returns 0.

Damn, you're right. I'll check into this.

tglx

