linux-mtd.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Markus Pargmann <mpa@pengutronix.de>
To: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: Han Xu <han.xu@nxp.com>, David Woodhouse <dwmw2@infradead.org>,
	Fabio Estevam <fabio.estevam@freescale.com>,
	linux-mtd@lists.infradead.org, kernel@pengutronix.de,
	Huang Shijie <shijie.huang@intel.com>,
	Brian Norris <computersforpeace@gmail.com>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
Date: Mon, 11 Apr 2016 08:34:35 +0200	[thread overview]
Message-ID: <3992227.rRuQ67A3ma@galactica> (raw)
In-Reply-To: <20160224145750.21050def@bbrezillon>

[-- Attachment #1: Type: text/plain, Size: 4871 bytes --]

Hi Boris,

On Wednesday 24 February 2016 14:57:50 Boris Brezillon wrote:
> Hi Markus,
> 
> On Sun, 21 Feb 2016 13:52:06 +0100
> Markus Pargmann <mpa@pengutronix.de> wrote:
> 
> > ECC is only calculated for written pages. As erased pages are not
> > actively written the ECC is always invalid. For this purpose the
> > Hardware BCH unit is able to check for erased pages and does not raise
> > an ECC error in this case. This behaviour can be influenced using the
> > BCH_MODE register which sets the number of allowed bitflips in an erased
> > page. Unfortunately the unit is not capable of fixing the bitflips in
> > memory.
> > 
> > To avoid complete software checks for erased pages, we can simply check
> > buffers with uncorrectable ECC errors because we know that any erased
> > page with errors is uncorrectable by the BCH unit.
> > 
> > This patch adds the generic nand_check_erased_ecc_chunk() to gpmi-nand
> > to correct erased pages. To have the valid data in the buffer before
> > using them, this patch moves the read_page_swap_end() call before the
> > ECC status checking for-loop.
> > 
> > Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
> > ---
> >  drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 49 ++++++++++++++++++++++++++++++----
> >  1 file changed, 44 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > index 235ddcb58f39..ce5a21252102 100644
> > --- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > +++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > @@ -1035,14 +1035,58 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
> >  	/* Loop over status bytes, accumulating ECC status. */
> >  	status = auxiliary_virt + nfc_geo->auxiliary_status_offset;
> >  
> > +	read_page_swap_end(this, buf, nfc_geo->payload_size,
> > +			   this->payload_virt, this->payload_phys,
> > +			   nfc_geo->payload_size,
> > +			   payload_virt, payload_phys);
> > +
> >  	for (i = 0; i < nfc_geo->ecc_chunk_count; i++, status++) {
> >  		if ((*status == STATUS_GOOD) || (*status == STATUS_ERASED))
> >  			continue;
> >  
> >  		if (*status == STATUS_UNCORRECTABLE) {
> > +			int flips;
> > +
> > +			/*
> > +			 * The ECC hardware has an uncorrectable ECC status
> > +			 * code in case we have bitflips in an erased page. As
> > +			 * nothing was written into this subpage the ECC is
> > +			 * obviously wrong and we can not trust it. We assume
> > +			 * at this point that we are reading an erased page and
> > +			 * try to correct the bitflips in buffer up to
> > +			 * ecc_strength bitflips. If this is a page with random
> > +			 * data, we exceed this number of bitflips and have a
> > +			 * ECC failure. Otherwise we use the corrected buffer.
> > +			 */
> > +			if (i == 0) {
> > +				/* The first block includes metadata */
> > +				flips = nand_check_erased_ecc_chunk(
> > +						buf + i * nfc_geo->ecc_chunk_size,
> > +						nfc_geo->ecc_chunk_size,
> > +						NULL, 0,
> > +						auxiliary_virt,
> > +						nfc_geo->metadata_size,
> > +						nfc_geo->ecc_strength);
> > +			} else {
> > +				flips = nand_check_erased_ecc_chunk(
> > +						buf + i * nfc_geo->ecc_chunk_size,
> > +						nfc_geo->ecc_chunk_size,
> > +						NULL, 0,
> > +						NULL, 0,
> > +						nfc_geo->ecc_strength);
> > +			}
> 
> You're not checking ECC bytes. I know it's a bit more complicated to
> implement, but as Brian stated several times, you should always check
> ECC bytes along with data bytes when testing for an erased chunk.
> 
> Indeed, it might appear that the user really programmed ff on the page,
> and in this case you don't want to consider the page as erased.
> In this case, the ECC bytes will be different from ff, and you'll
> be able to identify it by checking those ECC bytes.

Thanks for the feedback. Talking with a coworker about this we may have found a
better approach to this that is less complicated to implement. The hardware
unit allows us to set a bitflip threshold for erased pages. The ECC unit
creates an ECC error only if the number of bitflips exceeds this threshold, but
it does not correct these. So the idea is to change the patch so that we set
pages, that are signaled by the ECC as erased, to 0xff completely without
checking. So the ECC will do all the work and we completely trust in its
abilities to do it correctly.

I will send a modified patch as soon as I have some time for this.

Best Regards,

Markus

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

  reply	other threads:[~2016-04-11  6:35 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-21 12:52 [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages Markus Pargmann
2016-02-24 13:57 ` Boris Brezillon
2016-04-11  6:34   ` Markus Pargmann [this message]
2016-04-11  7:29     ` Boris Brezillon
2016-04-12 22:39       ` Han Xu
2016-04-12 22:51         ` Boris Brezillon
2016-04-15  7:55           ` Markus Pargmann
2016-04-15  8:35             ` Boris Brezillon
2016-04-15  9:35               ` Markus Pargmann
2016-04-15  9:39                 ` Boris Brezillon
2016-04-15 12:03                   ` Markus Pargmann
2016-04-15 15:33                     ` Han Xu
2016-04-15 15:40                       ` Boris Brezillon
2016-04-18 10:07                         ` Markus Pargmann
2016-04-18 14:47                   ` Stefan Christ
2016-04-18 15:10                     ` Boris Brezillon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3992227.rRuQ67A3ma@galactica \
    --to=mpa@pengutronix.de \
    --cc=boris.brezillon@free-electrons.com \
    --cc=computersforpeace@gmail.com \
    --cc=dwmw2@infradead.org \
    --cc=fabio.estevam@freescale.com \
    --cc=han.xu@nxp.com \
    --cc=kernel@pengutronix.de \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-mtd@lists.infradead.org \
    --cc=shijie.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).