public inbox for linux-mtd@lists.infradead.org
* NAND read errors, R/O FS
@ 2006-05-10 19:12 Steve Finney
  2006-05-10 19:20 ` Josh Boyer
  2006-05-10 23:28 ` Jamie Lokier
  0 siblings, 2 replies; 3+ messages in thread
From: Steve Finney @ 2006-05-10 19:12 UTC
  To: linux-mtd

>On Wed, 10 May 2006 15:20:44 +0400, Vitaly Wool wrote:
>> Josh Boyer wrote:
>>
>> >Not quite the case.  You need bad block skipping, yes.  But NAND can
>> >get bit flips in good blocks still.  How do you deal with that?  You
>> >can't leave the block in that state forever because it will continue
>> >to get bit flips and then your data will be unusable.
>> Yep, I know about the issue. The recommended way to go here AFAIK is to 
>> mark the block as bad and copy its contents to a free one.

FWIW, on the Samsung K9F* NAND chip, the explicitly recommended procedure
for single-bit read errors is to do single-bit error correction (assuming ECC
is implemented) and NOT to remap (you _are_ supposed to remap and
copy on write errors). Apparently the chances of a second bit error within a single
512-byte page (the ECC unit) are negligible.
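
As a rough illustration of that rule, here is a minimal Python sketch of the
decision only. The names Event, Action and recommended_action are made up
for this example; they are not from the datasheet or from any MTD code.

```python
from enum import Enum, auto


class Event(Enum):
    """Hypothetical error events reported by a NAND driver."""
    SINGLE_BIT_READ_ERROR = auto()
    WRITE_ERROR = auto()


class Action(Enum):
    """Hypothetical responses to those events."""
    CORRECT_WITH_ECC = auto()  # fix the bit via ECC, leave the block where it is
    REMAP_AND_COPY = auto()    # remap the block and copy its contents elsewhere


def recommended_action(event: Event) -> Action:
    """Map an event to the action described in the paragraph above."""
    if event is Event.SINGLE_BIT_READ_ERROR:
        return Action.CORRECT_WITH_ECC
    if event is Event.WRITE_ERROR:
        return Action.REMAP_AND_COPY
    raise ValueError(f"no recommendation described for {event!r}")


if __name__ == "__main__":
    for ev in Event:
        print(ev.name, "->", recommended_action(ev).name)
```

Multi-bit read errors are deliberately left out, since the recommendation
above does not cover them.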

sf


* Re: NAND read errors, R/O FS
From: Josh Boyer @ 2006-05-10 19:20 UTC
  To: Steve Finney; +Cc: linux-mtd

On 5/10/06, Steve Finney <saf76@earthlink.net> wrote:
> >On Wed, 10 May 2006 15:20:44 +0400, Vitaly Wool wrote:
> >> Josh Boyer wrote:
> >>
> >> >Not quite the case.  You need bad block skipping, yes.  But NAND can
> >> >get bit flips in good blocks still.  How do you deal with that?  You
> >> >can't leave the block in that state forever because it will continue
> >> >to get bit flips and then your data will be unusable.
> >> Yep, I know about the issue. The recommended way to go here AFAIK is to
> >> mark the block as bad and copy its contents to a free one.
>
> FWIW, on the Samsung K9F* NAND chip, the explicitly recommended procedure
> for single-bit read errors is to do single-bit error correction (assuming ECC
> is implemented) and NOT to remap (you _are_ supposed to remap and
> copy on write errors). Apparently the chances of a second bit error within a single
> 512-byte page (the ECC unit) are negligible.

That's OK for those specific chips, but the problems are more generic.
A read-only filesystem can read a single page repeatedly and very
often.  That can cause read disturb errors in other pages.

And I've really only been keeping SLC chips in mind.  When you get to
MLC technology, it's even less reliable.

josh


* Re: NAND read errors, R/O FS
From: Jamie Lokier @ 2006-05-10 23:28 UTC
  To: Steve Finney; +Cc: linux-mtd

Steve Finney wrote:
> >On Wed, 10 May 2006 15:20:44 +0400, Vitaly Wool wrote:
> >> Josh Boyer wrote:
> >>
> >> >Not quite the case.  You need bad block skipping, yes.  But NAND can
> >> >get bit flips in good blocks still.  How do you deal with that?  You
> >> >can't leave the block in that state forever because it will continue
> >> >to get bit flips and then your data will be unusable.
> >> Yep, I know about the issue. The recommended way to go here AFAIK is to 
> >> mark the block as bad and copy its contents to a free one.
> 
> FWIW, on the Samsung K9F* NAND chip, the explicitly recommended
> procedure for single-bit read errors is to do single-bit error
> correction (assuming ECC is implemented) and NOT to remap (you _are_
> supposed to remap and copy on write errors). Apparently the chances
> of a second bit error within a single 512-byte page (the ECC unit) are
> negligible.

Surely after observing one bit error, that doesn't reduce the
probability of observing another one later?  I'd expect the bits to be
independent.

In other words, is it not the case that, _after_ observing a single
bit error, the probability of observing another bit error is exactly
the same as the probability of observing the first one was, before it
was observed?
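
To put rough numbers on that, here is a small Python sketch. It assumes the
bits fail independently with the same probability; the per-bit rate P_BIT is
an arbitrary value picked for illustration, not a figure from any datasheet,
and the helper name p_any_error is made up.

```python
import math

ECC_UNIT_BITS = 512 * 8  # one 512-byte ECC unit
P_BIT = 1e-9             # ASSUMED per-bit error probability, illustrative only


def p_any_error(bits: int, p: float) -> float:
    """Probability that at least one of `bits` independent bits is in error.

    Computes 1 - (1 - p)**bits via log1p/expm1 to stay accurate for tiny p.
    """
    return -math.expm1(bits * math.log1p(-p))


# Probability of seeing a first error somewhere in the ECC unit.
p_first = p_any_error(ECC_UNIT_BITS, P_BIT)

# Given one bit is already known bad, probability that any of the
# remaining bits is also bad (independence means the first error
# tells us nothing about the others).
p_second_given_first = p_any_error(ECC_UNIT_BITS - 1, P_BIT)

ratio = p_second_given_first / p_first

if __name__ == "__main__":
    print(f"P(first error)          = {p_first:.6e}")
    print(f"P(second | first seen)  = {p_second_given_first:.6e}")
    print(f"ratio                   = {ratio:.6f}")
```

Under these assumptions the two probabilities differ only because one fewer
bit remains, so the ratio is very close to 1. Whether real NAND bits are in
fact independent is a separate question that the sketch does not address.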

-- Jamie
