All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Dunn <mikedunn@newsguy.com>
To: Ivan Djelic <ivan.djelic@parrot.com>
Cc: Ricard Wanderlof <ricard.wanderlof@axis.com>,
	Robert Jarzmik <robert.jarzmik@free.fr>,
	"linux-mtd@lists.infradead.org" <linux-mtd@lists.infradead.org>
Subject: Re: [PATCH 0/3] MTD: Change meaning of -EUCLEAN return code on reads
Date: Fri, 16 Mar 2012 09:25:08 -0700	[thread overview]
Message-ID: <4F636964.3030904@newsguy.com> (raw)
In-Reply-To: <20120316111939.GA10362@parrot.com>

Hi Ivan.  Thanks for the review!

On 03/16/2012 04:19 AM, Ivan Djelic wrote:
> 
> Consider the following situation:
> - a NAND device with 2kB pages and 4 ecc steps per page (4 x 512 bytes)
> - the driver has chip->ecc.strength = 4, and therefore mtd->ecc_strength = 16
> - let's say mtd->bitflip_threshold = 16
> 
> The driver read() method could return a non-negative integer, say 4, in at least
> the following cases:
> 
> 1. During a single page read, each of the 4 ecc steps corrected 1 bit, with a
> total variation of ecc_stats.corrected equal to 4.
> => no cleaning needed
> 
> 2. During a single page read, 1 ecc step corrected 4 bits, the 3 other steps had
> no correction to perform, with a total variation of ecc_stats.corrected equal
> to 4.
> => cleaning is needed


Maybe my (admittedly limited) understanding of the physical nature of NAND flash
is flawed.  I assumed that a writesize region (i.e., a NAND page for our
purposes) is the most elemental unit wrt physical wear, regardless of whether or
not ecc is caclulated once for the whole page or incrementally in steps.


> 
> In both cases, you will compare the same value 4 to mtd->bitflip_threshold (16)
> and decide to return 0 (and not -EUCLEAN).
> 
> So my point is that the cleaning decision happens at the ecc step level,
> not at the page reading level.


But you're sayimg my assumption is incorrect.  So each ecc-sized area within a
page is physically distinct and must be considered in isolation?  Could you
maybe elaborate on this?


> 
> I think this could be fixed by dropping 'ecc_strength' and changing the semantics
> of 'bitflip_threshold' in the following way (rephrasing your explanation):
> 
>   (3) The drivers' read methods, absent an error, return a non-negative integer
>       indicating the maximum number of bit errors that were corrected in any one
>       ecc step.  MTD returns -EUCLEAN if this is >= bitflip_threshold, 0
>       otherwise.
> 
>   So basically, the meaning of -EUCLEAN is changed from "one or more bit errors
>   were corrected", to "a dangerously high number of bit errors were corrected on
>   one or more ecc step block".  By default, "dangerously high" is interpreted
>   as chip->ecc.strength.  Drivers can specify a different value, and the user can
>   override it if more or less caution regarding data integrity is desired.
> 
> But still, there is a problem: how do we implement (3), i.e. how do we know
> "the maximum number of bit errors that were corrected in any one ecc step" ?
> 
> Just looking at ecc_stats.corrected is not enough, as it accumulates over each
> ecc step result, and does not allow us to distinguish cases 1 and 2 (from my
> previous example). Maybe we could have per-step ecc stats ? or have the driver
> return directly the information ?


Yes, this will require more work, touching many drivers :(  The per-page stats
allowed me to limit most of the changes to nand_base.c.

If you are correct about the need to consider each ecc-sized region separately,
then these patches are actually a regression, since a "dangerously high" number
of bitflips will be considered OK, as your example illustrates.

Thanks again.

Mike

  parent reply	other threads:[~2012-03-16 16:25 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-15 17:25 [PATCH 0/3] MTD: Change meaning of -EUCLEAN return code on reads Mike Dunn
2012-03-15 17:25 ` [PATCH 1/3] MTD: expose ecc_strength through sysfs Mike Dunn
2012-03-15 17:25 ` [PATCH 2/3] MTD: bitflip_threshold added to mtd_info and sysfs Mike Dunn
2012-03-16 16:31   ` Ivan Djelic
2012-03-15 17:25 ` [PATCH 3/3] MTD: drivers return max_bitflips, mtd returns -EUCLEAN Mike Dunn
2012-03-16 11:19 ` [PATCH 0/3] MTD: Change meaning of -EUCLEAN return code on reads Ivan Djelic
2012-03-16 12:49   ` Artem Bityutskiy
2012-03-16 16:30     ` Mike Dunn
2012-03-16 16:25   ` Mike Dunn [this message]
2012-03-16 18:43     ` Ivan Djelic
2012-03-17 20:18       ` Mike Dunn
2012-03-18  8:00         ` Shmulik Ladkani
2012-03-19  8:50           ` Matthieu CASTET
2012-03-19  9:29             ` Shmulik Ladkani
2012-03-19 19:09             ` Mike Dunn
     [not found]               ` <20120319211835.1073a491@halley>
2012-03-20  1:27                 ` Mike Dunn
2012-03-30 14:21           ` Artem Bityutskiy
2012-03-31  2:03             ` Mike Dunn
2012-03-30 14:16         ` Artem Bityutskiy
2012-03-31  1:23           ` Mike Dunn
2012-03-30 14:19         ` Artem Bityutskiy
2012-03-16 21:54     ` Shmulik Ladkani
2012-03-16 22:57       ` Peter Barada
2012-03-17 21:10         ` Mike Dunn
2012-03-17 20:50       ` Mike Dunn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F636964.3030904@newsguy.com \
    --to=mikedunn@newsguy.com \
    --cc=ivan.djelic@parrot.com \
    --cc=linux-mtd@lists.infradead.org \
    --cc=ricard.wanderlof@axis.com \
    --cc=robert.jarzmik@free.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.