From: Ivan Djelic <ivan.djelic@parrot.com>
To: Mike Dunn <mikedunn@newsguy.com>
Cc: Ricard Wanderlof <ricard.wanderlof@axis.com>,
	"linux-mtd@lists.infradead.org" <linux-mtd@lists.infradead.org>
Subject: Re: [PATCH 2/3] MTD: bitflip_threshold added to mtd_info and sysfs
Date: Fri, 16 Mar 2012 17:31:11 +0100
Message-ID: <20120316163111.GE10228@parrot.com>
In-Reply-To: <1331832353-15569-3-git-send-email-mikedunn@newsguy.com>

On Thu, Mar 15, 2012 at 05:25:52PM +0000, Mike Dunn wrote:
> +
> +What:		/sys/class/mtd/mtdX/bitflip_threshold
> +Date:		March 2012
> +KernelVersion:	3.3.1
> +Contact:	linux-mtd@lists.infradead.org
> +Description:
> +		This allows the user to examine and adjust the criteria by which
> +		mtd returns -EUCLEAN from mtd_read() and mtd_read_oob().  If the
> +		maximum number of bit errors that were corrected on any single
> +		writesize region (as reported by the driver) equals or exceeds
> +		this value, -EUCLEAN is returned.  Otherwise, absent an error, 0
> +		is returned.  Higher layers (e.g., UBI) use this return code as
> +		an indication that an erase block may be degrading and should be
> +		scrutinized as a candidate for being marked as bad.
> +
> +		The initial value may be specified by the flash device driver.
> +		If not, then the default value is ecc_strength.  Users who wish
> +		to be more paranoid about data integrity can lower the value.
> +		If the value exceeds ecc_strength, -EUCLEAN is never returned by
> +		the read functions.

Hmmm. I don't think it's a good idea to say "Users who wish to be more paranoid
about data integrity can lower the value", because this is not exactly true.

Lowering the value is very dangerous, and can have devastating effects: on NAND
devices where sticky bitflips appear (we have plenty of those devices), a low
threshold (say 1) triggers block torture by UBI, then bad block retirement,
quickly reducing the number of valid blocks; the other "sane" blocks
with intermittent bitflips keep being scrubbed, thrashing the whole device.
Even worse: if enough bad blocks appear, UBI runs out of replacement blocks
and stops working.
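
To make the failure mode concrete, here is a minimal userspace sketch of the
comparison described in the quoted text (corrected bitflips >= threshold gives
-EUCLEAN). The helper name read_status() is hypothetical and this is not the
actual mtd code, only an illustration of the rule as stated in the patch
description:

```c
#include <errno.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* Linux value; fallback for non-Linux builds */
#endif

/*
 * Hypothetical helper, not the actual mtd code: map the maximum number of
 * bitflips corrected in any single writesize region to the read return
 * code, following the rule in the quoted description.
 */
static int read_status(unsigned int max_bitflips,
		       unsigned int bitflip_threshold)
{
	return max_bitflips >= bitflip_threshold ? -EUCLEAN : 0;
}
```

With a threshold of 1, read_status(1, 1) is -EUCLEAN, so a page with a single
sticky bitflip is reported on every read; with a threshold of 4,
read_status(1, 4) is 0 and the same page is left alone.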

IMHO the value of 'bitflip_threshold' should be carefully chosen:

- low enough to ensure ecc correction has a safety margin and manufacturer
  requirements are met
- high enough to avoid the effects described above

In some cases, controlling bitflip_threshold can be interesting for other
reasons; for instance, on a specific board, I have used a NAND device
requiring 4-bit ecc, but I implemented 8-bit protection through hardware BCH for
extra safety (and future 8-bit NAND upgrades).
In that particular setup, I would set bitflip_threshold to 3 or 4 instead of the
value derived from the ecc strength (8).
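
One possible way to express that choice, purely as a sketch: cap the threshold
at the correction level the chip actually requires instead of the implemented
ecc strength. The function suggested_threshold() and its parameter names are
my own assumptions for illustration; nothing like it exists in the patch.

```c
/*
 * Hypothetical rule of thumb, not part of the patch: never let the
 * threshold exceed what the NAND datasheet requires, even when the
 * controller implements a stronger code.
 */
static unsigned int suggested_threshold(unsigned int chip_required_ecc,
					unsigned int ecc_strength)
{
	return chip_required_ecc < ecc_strength ? chip_required_ecc
						: ecc_strength;
}
```

For the board above, suggested_threshold(4, 8) gives 4; whether 3 or 4 is the
better value still depends on the device and on how UBI reacts to it.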

So in practice, setting bitflip_threshold is tricky and requires a good
knowledge of the NAND (or Doc/whatever) device you are using, and of how mtd/UBI
will use the threshold.
I suggest we warn about the dangers and discourage people from messing with this
knob unless they know what they are doing.

BR,
-- 
Ivan
