From: Brian Norris <computersforpeace@gmail.com>
To: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: David Woodhouse <dwmw2@infradead.org>,
	linux-mtd@lists.infradead.org, Andrea Scian <rnd4@dave-tech.it>,
	Richard Weinberger <richard@nod.at>
Subject: Re: [PATCH v2 2/2] mtd: nand: use nand_check_erased_ecc_chunk in default ECC read functions
Date: Wed, 2 Sep 2015 13:35:30 -0700
Message-ID: <20150902203530.GT81844@google.com>
In-Reply-To: <1440409642-5495-3-git-send-email-boris.brezillon@free-electrons.com>

On Mon, Aug 24, 2015 at 11:47:22AM +0200, Boris Brezillon wrote:
> The default NAND read functions are relying on an underlying controller
> to correct bitflips, but some of those controllers cannot properly fix
> bitflips in erased pages.
> In case of ECC failures, check if the page or subpage is empty before
> reporting an ECC failure.
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>

General note: this looks pretty good to me. Given this patch, are there
drivers whose own erased-page checks we should now remove? Several are
of dubious value and could probably be dropped without consequence. But
with some, I'd wonder if we might cause a performance slowdown and/or
high CPU utilization, particularly with those that look like they might
signal ECC errors on all-0xff pages, even with no bitflips.

> ---
>  drivers/mtd/nand/nand_base.c | 62 +++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 55 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> index 4d2ef65..e095d86 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -1400,6 +1400,19 @@ static int nand_read_subpage(struct mtd_info *mtd, struct nand_chip *chip,
>  		stat = chip->ecc.correct(mtd, p,
>  			&chip->buffers->ecccode[i], &chip->buffers->ecccalc[i]);
>  		if (stat < 0) {

I'm not sure if this is a fault of your patch or of the API design, but
do we want to do erased-ECC checks on all failures, regardless of type?
I would have expected maybe we could check only for -EBADMSG, but it
appears that's not consistent. Apparently all correction failures are
just "some negative value."

Anyway, if we had better consistency, I'd suggest:

		if (stat == -EBADMSG) {

But I suppose that 'stat < 0' is the best we can do for now.
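
To make the difference between the two conditions concrete, here is a
minimal user-space sketch. It is not nand_base code: the helper name
should_check_erased() and its 'strict' flag are hypothetical, and the
only assumption taken from the discussion above is that ecc.correct()
returns a negative value on failure.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of nand_base: decide whether a failed
 * ecc.correct() call should fall back to the erased-chunk check.
 *
 * strict == true  : only an explicit -EBADMSG triggers the check
 * strict == false : any negative return triggers it ('stat < 0')
 */
bool should_check_erased(int stat, bool strict)
{
	if (stat >= 0)
		return false;	/* correction succeeded, nothing to check */

	return strict ? stat == -EBADMSG : true;
}
```

With strict set, a driver that reports failures as, say, -EIO or -1
would never reach the erased-chunk check, which is why 'stat < 0' is
the safer condition while return codes remain inconsistent.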

> +			/* check for empty pages with bitflips */
> +			int col = (int)(p - bufpoi);
> +
> +			chip->cmdfunc(mtd, NAND_CMD_RNDOUT, col, -1);

Are all drivers that use this function prepared to handle another RNDOUT
properly? I know some drivers tend to make assumptions about the command
sequence that nand_base issues, and an extra RNDOUT could break them.
Relying on that would be a dirty trick on the driver's part, but it's
not impossible...

> +			chip->read_buf(mtd, p, chip->ecc.size);

Also, are you sure we need to re-read here? Technically, drivers are
supposed to be leaving uncorrected data in their buffers if they can't
correct it, no?

Similar comments apply to the other cases.
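
For illustration, an erased-chunk check of this kind only needs the
bytes already sitting in the buffers. The following is a simplified
user-space sketch, not the helper from patch 1/2: the names
check_erased_chunk() and count_zero_bits() are hypothetical, it ignores
the extra-OOB argument, and it only resets the data buffer. It assumes
that an erased chunk reads back as all 0xff and that every 0 bit
therefore counts as a bitflip.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the 0 bits in buf, i.e. the bits that differ from the erased state. */
static int count_zero_bits(const uint8_t *buf, size_t len)
{
	int zeros = 0;

	for (size_t i = 0; i < len; i++) {
		uint8_t flipped = (uint8_t)~buf[i];	/* 1 bits mark flips */

		/* Clear the lowest set bit until none are left. */
		for (; flipped; flipped &= flipped - 1)
			zeros++;
	}

	return zeros;
}

/*
 * Hypothetical stand-in for an erased-chunk check. Returns the number
 * of bitflips if data + ECC look like an erased chunk with at most
 * 'threshold' flips (and resets the data to 0xff), -EBADMSG otherwise.
 */
int check_erased_chunk(uint8_t *data, size_t datalen,
		       const uint8_t *ecc, size_t ecclen, int threshold)
{
	int flips = count_zero_bits(data, datalen) +
		    count_zero_bits(ecc, ecclen);

	if (flips > threshold)
		return -EBADMSG;

	memset(data, 0xff, datalen);
	return flips;
}
```

If the driver really does leave the uncorrected data in place, a check
like this can run directly on p and the ECC bytes after a failed
correct(), with no extra RNDOUT or read_buf() in between.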

> +			stat = nand_check_erased_ecc_chunk(p, chip->ecc.size,
> +						&chip->buffers->ecccode[i],
> +						chip->ecc.bytes,
> +						NULL, 0,
> +						chip->ecc.strength);
> +		}
> +
> +		if (stat < 0) {
>  			mtd->ecc_stats.failed++;
>  		} else {
>  			mtd->ecc_stats.corrected += stat;
> @@ -1449,6 +1462,16 @@ static int nand_read_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
>  
>  		stat = chip->ecc.correct(mtd, p, &ecc_code[i], &ecc_calc[i]);
>  		if (stat < 0) {
> +			/* check for empty pages with bitflips */
> +			chip->cmdfunc(mtd, NAND_CMD_RNDOUT, i, -1);
> +			chip->read_buf(mtd, p, eccsize);
> +			stat = nand_check_erased_ecc_chunk(p, eccsize,
> +						&ecc_code[i], eccbytes,
> +						NULL, 0,
> +						chip->ecc.strength);
> +		}
> +
> +		if (stat < 0) {
>  			mtd->ecc_stats.failed++;
>  		} else {
>  			mtd->ecc_stats.corrected += stat;
> @@ -1501,6 +1524,17 @@ static int nand_read_page_hwecc_oob_first(struct mtd_info *mtd,
>  
>  		stat = chip->ecc.correct(mtd, p, &ecc_code[i], NULL);
>  		if (stat < 0) {
> +			/* check for empty pages with bitflips */
> +			chip->cmdfunc(mtd, NAND_CMD_RNDOUT,
> +				      i + mtd->oobsize, -1);
> +			chip->read_buf(mtd, p, eccsize);
> +			stat = nand_check_erased_ecc_chunk(p, eccsize,
> +						&ecc_code[i], eccbytes,
> +						NULL, 0,
> +						chip->ecc.strength);
> +		}
> +
> +		if (stat < 0) {
>  			mtd->ecc_stats.failed++;
>  		} else {
>  			mtd->ecc_stats.corrected += stat;
> @@ -1527,6 +1561,8 @@ static int nand_read_page_syndrome(struct mtd_info *mtd, struct nand_chip *chip,
>  	int i, eccsize = chip->ecc.size;
>  	int eccbytes = chip->ecc.bytes;
>  	int eccsteps = chip->ecc.steps;
> +	int eccstepsize = eccsize + eccbytes + chip->ecc.prepad +
> +			  chip->ecc.postpad;
>  	uint8_t *p = buf;
>  	uint8_t *oob = chip->oob_poi;
>  	unsigned int max_bitflips = 0;
> @@ -1546,19 +1582,31 @@ static int nand_read_page_syndrome(struct mtd_info *mtd, struct nand_chip *chip,
>  		chip->read_buf(mtd, oob, eccbytes);
>  		stat = chip->ecc.correct(mtd, p, oob, NULL);
>  
> -		if (stat < 0) {
> -			mtd->ecc_stats.failed++;
> -		} else {
> -			mtd->ecc_stats.corrected += stat;
> -			max_bitflips = max_t(unsigned int, max_bitflips, stat);
> -		}
> -
>  		oob += eccbytes;
>  
>  		if (chip->ecc.postpad) {
>  			chip->read_buf(mtd, oob, chip->ecc.postpad);
>  			oob += chip->ecc.postpad;
>  		}
> +
> +		if (stat < 0) {
> +			/* check for empty pages with bitflips */
> +			chip->cmdfunc(mtd, NAND_CMD_RNDOUT,
> +				      i * eccstepsize, -1);
> +			chip->read_buf(mtd, p, chip->ecc.size);
> +			stat = nand_check_erased_ecc_chunk(p, chip->ecc.size,
> +							   oob - eccstepsize,
> +							   eccstepsize,
> +							   NULL, 0,
> +							   chip->ecc.strength);
> +		}
> +
> +		if (stat < 0) {
> +			mtd->ecc_stats.failed++;
> +		} else {
> +			mtd->ecc_stats.corrected += stat;
> +			max_bitflips = max_t(unsigned int, max_bitflips, stat);
> +		}
>  	}
>  
>  	/* Calculate remaining oob bytes */

Brian