Re: [PATCH 0/3] MTD: Change meaning of -EUCLEAN return code on reads

public inbox for linux-mtd@lists.infradead.org

From: Ivan Djelic <ivan.djelic@parrot.com>
To: Mike Dunn <mikedunn@newsguy.com>
Cc: Ricard Wanderlof <ricard.wanderlof@axis.com>,
	Robert Jarzmik <robert.jarzmik@free.fr>,
	"linux-mtd@lists.infradead.org" <linux-mtd@lists.infradead.org>
Subject: Re: [PATCH 0/3] MTD: Change meaning of -EUCLEAN return code on reads
Date: Fri, 16 Mar 2012 19:43:45 +0100
Message-ID: <20120316184345.GF10228@parrot.com>
In-Reply-To: <4F636964.3030904@newsguy.com>

On Fri, Mar 16, 2012 at 04:25:08PM +0000, Mike Dunn wrote:
> 
> Maybe my (admittedly limited) understanding of the physical nature of NAND flash
> is flawed.  I assumed that a writesize region (i.e., a NAND page for our
> purposes) is the most elemental unit wrt physical wear, regardless of whether or
> not ecc is calculated once for the whole page or incrementally in steps.
> 
> 
> > 
> > In both cases, you will compare the same value 4 to mtd->bitflip_threshold (16)
> > and decide to return 0 (and not -EUCLEAN).
> > 
> > So my point is that the cleaning decision happens at the ecc step level,
> > not at the page reading level.
> 
> 
> But you're saying my assumption is incorrect.  So each ecc-sized area within a
> page is physically distinct and must be considered in isolation?  Could you
> maybe elaborate on this?

When NAND manufacturers specify ECC requirements in their datasheets, they indicate:
- a block size on which ECC should be computed (possibly smaller than the page size)
- how many errors should be correctable within one such block, i.e. the strength

For instance:
- size = 512 bytes
- strength = 8 errors

If the NAND device has 2 kB (resp. 4 kB) pages, you will need to perform 4 (resp. 8) ECC
computations per page.
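
To make the arithmetic explicit, here is a minimal sketch in plain C. The function
and parameter names are made up for illustration (they are not MTD API), and it
assumes the page size is an exact multiple of the ECC block size:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of ECC computations (steps) needed to cover one page.
 * page_size and ecc_size are in bytes. Hypothetical helper, not kernel code.
 */
static size_t ecc_steps_per_page(size_t page_size, size_t ecc_size)
{
	/* Assumption: the page divides evenly into ECC blocks. */
	assert(ecc_size != 0 && page_size % ecc_size == 0);
	return page_size / ecc_size;
}
```

With size = 512, a 2048-byte page gives 4 steps and a 4096-byte page gives 8.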

The point of implementing the specified ECC is to ensure the integrity of data for the
specified lifetime, i.e. its longevity.

Manufacturers select an ECC size/strength setup for a device purely from statistical
computations or empirical results: there is no "special" physical ecc-sized area in each
page. It's just that the error distribution empirically observed is well covered by the
specified size/strength combination. You could perfectly well protect your NAND device
using a different size/strength combination, with different longevity results.
The manufacturer's size/strength recommendation also takes into account the available
spare space and ECC computational requirements.

As a matter of fact, NAND manufacturers run write/read cycle aging tests in ovens, and
measure how fast data corruption happens, assuming various protection schemes: no ECC,
1-bit ECC/512, 2-bit ECC/512, and so on. Google "NAND UBER" for more details (UBER =
Uncorrectable Bit Error Rate).

Now, the point of scrubbing a block is to avoid error accumulation to the point where
ECC is no longer able to correct all errors. This is why we must monitor each individual
ECC step rather than consider the total number of errors corrected in a page.
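
A rough sketch of why the per-step maximum is the relevant quantity, in plain C.
All names here are hypothetical, the per-step counts are assumed to be reported by
the ECC layer, and the ">=" comparison against the threshold is my assumption for
the sake of the example:

```c
#include <assert.h>
#include <stddef.h>

/* Largest number of bitflips corrected in any single ECC step of a page. */
static unsigned int max_bitflips(const unsigned int *step_bitflips, size_t nsteps)
{
	unsigned int max = 0;

	assert(step_bitflips != NULL || nsteps == 0);
	for (size_t i = 0; i < nsteps; i++)
		if (step_bitflips[i] > max)
			max = step_bitflips[i];
	return max;
}

/* Sum over the whole page, shown only for comparison. */
static unsigned int total_bitflips(const unsigned int *step_bitflips, size_t nsteps)
{
	unsigned int sum = 0;

	for (size_t i = 0; i < nsteps; i++)
		sum += step_bitflips[i];
	return sum;
}

/* Nonzero when the worst ECC step has reached the (assumed) threshold. */
static int needs_scrub(const unsigned int *step_bitflips, size_t nsteps,
		       unsigned int threshold)
{
	return max_bitflips(step_bitflips, nsteps) >= threshold;
}
```

Take strength = 8 and four steps per page. The counts {2, 2, 2, 2} and {8, 0, 0, 0}
both total 8 corrected errors, but only the second page has a step with no margin
left. With an illustrative threshold of 6, needs_scrub() flags the second page and
not the first, which the page total alone cannot distinguish.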

BR,
-- 
Ivan
