Re: [PATCH 12/13] mtd/docg3: add ECC correction code

From: Ivan Djelic <ivan.djelic@parrot.com>
To: Robert Jarzmik <robert.jarzmik@free.fr>
Cc: "dwmw2@infradead.org" <dwmw2@infradead.org>,
	"dedekind1@gmail.com" <dedekind1@gmail.com>,
	"mikedunn@newsguy.com" <mikedunn@newsguy.com>,
	"linux-mtd@lists.infradead.org" <linux-mtd@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 12/13] mtd/docg3: add ECC correction code
Date: Sun, 30 Oct 2011 02:10:00 +0200
Message-ID: <20111030000959.GA12624@parrot.com>
In-Reply-To: <8739ebbvow.fsf@free.fr>

On Sat, Oct 29, 2011 at 05:37:35PM +0100, Robert Jarzmik wrote:
> >> +static struct bch_control *docg3_bch;
> >
> > Why not put this into your struct docg3, instead of adding a global var?
> Because I have multiple floors (ie. 4 floors for example), which are split into
> 4 different devices. If I put that in docg3 structures (ie. the 4 allocated
> structures, each for one floor), I'd either have to :
>  - allocate 4 different bch "engines"
>  - or count docg3 releases and release the bch at the last kfree(docg3), which
>  makes me have another global variable.

OK, got it; using a struct to hold all your common vars (docg3_floors,
docg3_bch, ...) and hook that to your platform data instead of docg3_floors
would still be a bit cleaner I think, but no big deal.
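
To make the suggestion concrete, here is a minimal userspace sketch of such a
container. The names docg3_shared, docg3_shared_alloc() and
docg3_shared_free() are hypothetical, the floor count of 4 is taken from your
example, the floors are assumed to be kept as mtd_info pointers, and calloc()
and free() stand in for the kernel allocators:

```c
#include <assert.h>
#include <stdlib.h>

#define DOC_MAX_NBFLOORS 4	/* assumption: up to 4 floors, as in the example */

struct mtd_info;	/* opaque stand-ins for the kernel types */
struct bch_control;

/* Hypothetical container for everything the floors share. */
struct docg3_shared {
	struct mtd_info *floors[DOC_MAX_NBFLOORS];	/* one mtd device per floor */
	struct bch_control *bch;			/* single BCH engine for all floors */
};

/* Allocate once at probe time; calloc() stands in for kzalloc(). */
static struct docg3_shared *docg3_shared_alloc(void)
{
	return calloc(1, sizeof(struct docg3_shared));
}

/* Free once at remove time, after the BCH engine has been released. */
static void docg3_shared_free(struct docg3_shared *s)
{
	assert(!s || !s->bch);
	free(s);
}
```

With one such object hooked to the platform data, the BCH engine is allocated
and released exactly once, and no release counting is needed.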

> What I'm a bit afraid of is my poor understanding of the hardware ECC engine. I
> know that the write part is correct (ie. ECC calculation), but I'm a bit
> confused by the read part.
> 
> What worries me is that the hardware ECC I get back while reading (ie. what I
> called calc_ecc) is always 00:00:00:00:00:00:00 when I read data (because I
> don't have bitflips on my flash). This looks to me more like a "syndrome" than
> a "calc_ecc".

OK, I'll try to clarify that. The hardware ECC engine divides a huge polynomial
(520*8 = 4160 bits) by a generator polynomial and computes a 56-bit remainder.
So this remainder (let's call it R) depends only on the 520 input data bytes.

- during a write operation: input data is what you write to the controller,
you get R from the ecc engine and this is what you write to oob[8..14].

- during a read operation: the ecc engine computes R on 520 input bytes read
from flash (this is calc_ecc), and also reads oob[8..14] (this is recv_ecc,
previously programmed during the write operation).
Then the ecc engine computes calc_ecc ^ recv_ecc, and this is what you get from
the ecc registers. As long as there is no bitflip, it's all 00s (because
calc_ecc == recv_ecc).
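
The read-side behaviour described above can be sketched in plain C. This is
only an illustration: TOY_GEN is an arbitrary placeholder, not the DocG3
generator polynomial, and poly_remainder(), ecc_syndrome() and
syndrome_self_check() are hypothetical names. The bit ordering used by the
real hardware is not modelled either.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ECC_BITS 56
#define ECC_MASK ((UINT64_C(1) << ECC_BITS) - 1)
/* Placeholder generator: coefficients of x^55..x^0, the x^56 term is implicit.
 * NOT the DocG3 polynomial; all that matters here is that the x^0 term is set. */
#define TOY_GEN UINT64_C(0x00C3A5965A3C69)

/* Remainder of data(x) * x^56 divided by the toy generator, MSB first. */
static uint64_t poly_remainder(const uint8_t *data, size_t len)
{
	uint64_t r = 0;

	for (size_t i = 0; i < len; i++) {
		for (int b = 7; b >= 0; b--) {
			uint64_t in = (data[i] >> b) & 1;
			uint64_t top = (r >> (ECC_BITS - 1)) & 1;

			r = (r << 1) & ECC_MASK;
			if (top ^ in)
				r ^= TOY_GEN;
		}
	}
	return r;
}

/* What the ecc registers would hold after a read: calc_ecc ^ recv_ecc. */
static uint64_t ecc_syndrome(const uint8_t *page, size_t len, uint64_t recv_ecc)
{
	return poly_remainder(page, len) ^ recv_ecc;
}

/* Clean read gives all 00s; one bitflip gives the remainder of the error alone. */
static void syndrome_self_check(void)
{
	uint8_t ref_buf[520], bad[520], err[520];

	for (size_t i = 0; i < sizeof(ref_buf); i++)
		ref_buf[i] = (uint8_t)(i * 31u + 7u);
	uint64_t ref_ecc = poly_remainder(ref_buf, sizeof(ref_buf));

	assert(ecc_syndrome(ref_buf, sizeof(ref_buf), ref_ecc) == 0);

	memcpy(bad, ref_buf, sizeof(bad));
	bad[100] ^= 0x10;			/* simulate a single bitflip */
	memset(err, 0, sizeof(err));
	err[100] = 0x10;			/* the error pattern on its own */

	uint64_t s = ecc_syndrome(bad, sizeof(bad), ref_ecc);
	assert(s != 0);
	assert(s == poly_remainder(err, sizeof(err)));
}
```

Because the remainder is linear over GF(2), the value read from the ecc
registers depends only on the error pattern, not on the page contents. That
is why it behaves like a syndrome.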

> To be sure, I could write a page of 512 bytes + 16 bytes, where the BCH would be
> forced (and incorrect), to check what the hardware generator gives me back. I'd
> like you to help me, ie:
>  - tell me what to write to the first 512 bytes (only 0, all 0 but one byte to
>  1, other ...)
>  - I think I'll write 8 bytes to 0x01 for the first 8 OOB bytes (Hamming false
>  but I won't care)
>  - tell me what to write to the 7 BCH ECC

OK, this is really simple:

1. Prepare a buffer of 520 bytes of data, containing pseudo-random bytes or
any pattern you like. Let's call this buffer 'ref_buf'.

2. Program 'ref_buf' to a nand page; you will write ecc bytes to oob during
that operation; let's call those ecc bytes 'ref_ecc'.

3. Now, you are ready to perform corruption tests:

 3.1 Make a copy of 'ref_buf' in which you flip 1, 2, 3 or 4 bits selected
     at random.

 3.2 Program this corrupt buffer, _but_ write 'ref_ecc' to oob instead of hw
     generated ecc bytes.

 3.3 Read the page back: you should get exactly 'ref_buf', and the errorpos[]
     array of corrected bits should match the bits you flipped.

After step 3.2, your flash is exactly in the same state as if it had produced
the bitflips itself.

Repeat steps 3.1 to 3.3 on a large enough set of random vectors to convince
yourself that your code works (be careful not to wear out your device,
though :-). You should also try a few 5-bit corruptions and see failures, just
to verify that your corruptions have some effect.
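
For steps 3.1 and 3.3, the host-side bookkeeping could look like the sketch
below. The helper names flip_random_bits() and same_positions() are
hypothetical, and the bit numbering (byte = pos / 8, bit = pos % 8) is an
assumption that you may need to adapt to whatever convention your errorpos[]
array uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Flip nflips distinct, randomly chosen bits in buf and record their
 * bit offsets in pos[] (byte = pos / 8, bit = pos % 8). */
static void flip_random_bits(uint8_t *buf, size_t len, int nflips, int *pos)
{
	int nbits = (int)(len * 8);
	int n = 0;

	assert(nflips >= 0 && nflips <= nbits);
	while (n < nflips) {
		int p = rand() % nbits;
		int dup = 0;

		for (int i = 0; i < n; i++)
			if (pos[i] == p)
				dup = 1;
		if (dup)
			continue;	/* already flipped, pick another bit */
		buf[p / 8] ^= (uint8_t)(1u << (p % 8));
		pos[n++] = p;
	}
}

static int cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}

/* Return 1 if a[] and b[] hold the same n positions in any order.
 * Both arrays are sorted in place. */
static int same_positions(int *a, int *b, int n)
{
	qsort(a, (size_t)n, sizeof(*a), cmp_int);
	qsort(b, (size_t)n, sizeof(*b), cmp_int);
	for (int i = 0; i < n; i++)
		if (a[i] != b[i])
			return 0;
	return 1;
}
```

In a test loop you would call flip_random_bits() on a copy of 'ref_buf',
program the copy with 'ref_ecc', read the page back, and then pass the
recorded positions and errorpos[] to same_positions().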

In theory, testing the BCH algorithm like you did should be enough, but real
hardware tests are helpful to verify that the entire system behaves as
expected.

Hope that helps,
BR,
--
Ivan

Thread overview: 22+ messages
2011-10-28 17:51 [PATCH 00/13] DocG3 fixes and write support Robert Jarzmik
2011-10-28 17:51 ` [PATCH 01/13] mtd/docg3: fix debug log verbosity Robert Jarzmik
2011-10-28 17:51 ` [PATCH 02/13] mtd/docg3: fix tracing of IO in writeb Robert Jarzmik
2011-10-28 17:51 ` [PATCH 03/13] mtd/docg3: fix protection areas reading Robert Jarzmik
2011-10-28 17:51 ` [PATCH 04/13] mtd/docg3: fix BCH registers Robert Jarzmik
2011-10-28 17:51 ` [PATCH 05/13] mtd/docg3: add multiple floor support Robert Jarzmik
2011-10-28 17:51 ` [PATCH 06/13] mtd/docg3: add OOB layout to mtdinfo Robert Jarzmik
2011-10-28 17:51 ` [PATCH 07/13] mtd/docg3: add registers for erasing and writing Robert Jarzmik
2011-10-28 17:51 ` [PATCH 08/13] mtd/docg3: add OOB buffer to device structure Robert Jarzmik
2011-10-28 17:51 ` [PATCH 09/13] mtd/docg3: add write functions Robert Jarzmik
2011-10-28 17:51 ` [PATCH 10/13] mtd/docg3: add erase functions Robert Jarzmik
2011-10-28 17:51 ` [PATCH 11/13] mtd/docg3: map erase and write functions Robert Jarzmik
2011-10-28 17:51 ` [PATCH 12/13] mtd/docg3: add ECC correction code Robert Jarzmik
2011-10-29  8:52   ` Ivan Djelic
2011-10-29  9:09     ` Ivan Djelic
2011-10-29 16:37     ` Robert Jarzmik
2011-10-30  0:10       ` Ivan Djelic [this message]
2011-10-28 17:51 ` [PATCH 13/13] mtd/docg3: add suspend and resume Robert Jarzmik
2011-10-30  0:41 ` [PATCH 00/13] DocG3 fixes and write support Marek Vasut
2011-10-30  9:04   ` Robert Jarzmik
2011-10-30 21:43     ` Mike Dunn
2011-10-30 22:18       ` Marek Vasut
