From: Artem Bityutskiy <dedekind1@gmail.com>
To: Ivan Djelic <ivan.djelic@parrot.com>
Cc: "linux-mtd@lists.infradead.org" <linux-mtd@lists.infradead.org>,
	David Peverley <pev@sketchymonkey.com>,
	Ricard Wanderlof <ricard.wanderlof@axis.com>
Subject: Re: CONFIG_MTD_NAND_VERIFY_WRITE with Software ECC
Date: Fri, 25 Feb 2011 18:41:36 +0200
Message-ID: <1298652096.2346.14.camel@koala>
In-Reply-To: <20110225144410.GC21841@parrot.com>

On Fri, 2011-02-25 at 15:44 +0100, Ivan Djelic wrote:
> On Fri, Feb 25, 2011 at 12:12:10PM +0000, Artem Bityutskiy wrote:
> > On Fri, 2011-02-25 at 12:36 +0100, Ivan Djelic wrote:
> (...)
> > > 
> > > The fact that a bitflip detected during torture is enough to decide that a
> > > block is bad causes problems on some 4-bit ecc devices we are using. If we
> > > stick to this policy, we end up with a _lot_ of blocks being marked as bad
> > > (i.e. way too many).
> > 
> > I see. Maybe in your case 1-bit errors are completely harmless, but 2
> > and 3 are not?
> 
> When a NAND device requires 4-bit ecc or more, you do see a lot of 1-bit errors
> (compared to previous NAND devices). They are not "completely harmless" because
> you are still supposed to relocate data in some other block and erase the block
> (those bitflips are reversible errors), in order to avoid error accumulation
> and stay below the specified ecc requirement. But they probably should not be
> considered an indication that the block has gone bad.

So basically, this means that UBI needs to distinguish between "gentle
bit-flips" and "evil bit-flips". This could be done, again, by teaching
MTD to return the bit-flip level and provide us with some thresholds, which
could either be set automatically from ONFI data or heuristics, or be
provided by the user via some MTD sysfs files.

This should not be a huge amount of work.
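
To make the idea concrete, here is a rough user-space sketch in plain C of
how a per-read bit-flip count could be compared against a threshold. All
names (struct bitflip_policy, effective_threshold(), classify_bitflips())
are hypothetical and are not an existing MTD interface. The 3/4-of-ECC-strength
default is only an assumed heuristic, and evil_threshold stands in for a
value the user might supply through sysfs.

```c
#include <assert.h>

/* Hypothetical names throughout: this is not an existing MTD interface. */
struct bitflip_policy {
	unsigned int ecc_strength;   /* bits the ECC can correct per ECC step */
	unsigned int evil_threshold; /* user override; 0 means "not set" */
};

/*
 * Assumed heuristic: with no user-supplied value, a read counts as "evil"
 * once it uses at least 3/4 of the ECC strength (rounded up), minimum 1.
 */
unsigned int effective_threshold(const struct bitflip_policy *p)
{
	unsigned int t;

	if (p->evil_threshold)
		return p->evil_threshold;
	t = (3 * p->ecc_strength + 3) / 4;
	return t ? t : 1;
}

enum bitflip_level { BITFLIP_NONE, BITFLIP_GENTLE, BITFLIP_EVIL };

/* corrected_bits: number of bits the ECC corrected on this read. */
enum bitflip_level classify_bitflips(const struct bitflip_policy *p,
				     unsigned int corrected_bits)
{
	/* Uncorrectable reads are a separate error path, not handled here. */
	assert(corrected_bits <= p->ecc_strength);

	if (corrected_bits == 0)
		return BITFLIP_NONE;
	if (corrected_bits >= effective_threshold(p))
		return BITFLIP_EVIL;
	return BITFLIP_GENTLE;
}
```

With this heuristic a 4-bit ECC device would treat 1 or 2 corrected bits as
gentle and 3 or 4 as evil, while a 1-bit ECC device would keep treating any
bit-flip as evil.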


> > > Our NAND manufacturer tells us that, as long as a block erase operation
> > > completes without a failure reported by the device, it should not be classified
> > > as bad, even if it has bitflips (which sounds risky at best).
> > 
> > For any amount of flipped bits per page? Sounds a bit scary.
> 
> I agree. Our NAND manufacturer even told us that a single permanent 1-bit
> failure in a block is not enough for marking this block as bad on 4-bit ecc NAND
> devices. I still think there should be a specified amount of errors above which
> the block should be considered bad. Maybe only permanent bit failures should
> be considered.

Yes, this sounds logical and gives system designers flexibility.

> Yes, this makes sense. When you say "react", do you also mean not doing any
> scrubbing when the error count is below the threshold ?

Yes. There may be separate thresholds for different things: one for
scrubbing, one for marking bad. Feel free to send patches :-)
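
Something along these lines, as a sketch only: two independent thresholds,
with the "bad" decision looking only at permanent bit failures, as you
suggest above. The types and the decide_action() helper are made up for
illustration and do not correspond to existing UBI code.

```c
#include <assert.h>

/* Hypothetical types for illustration; not an existing UBI interface. */
enum block_action { ACTION_NONE, ACTION_SCRUB, ACTION_MARK_BAD };

struct block_thresholds {
	unsigned int scrub; /* corrected flips on a read that trigger scrubbing */
	unsigned int bad;   /* permanent flips after torture that condemn the block */
};

/*
 * read_flips:      bits corrected on a normal read (reversible errors included).
 * permanent_flips: bits still wrong after erase and rewrite during torture.
 */
enum block_action decide_action(const struct block_thresholds *t,
				unsigned int read_flips,
				unsigned int permanent_flips)
{
	assert(t->scrub > 0 && t->bad > 0);

	if (permanent_flips >= t->bad)
		return ACTION_MARK_BAD;
	if (read_flips >= t->scrub)
		return ACTION_SCRUB;
	return ACTION_NONE;
}
```

Below the scrub threshold nothing happens at all, which also answers the
question about not scrubbing for low error counts.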

-- 
Best Regards,
Artem Bityutskiy (Битюцкий Артём)

