From: Bill Pringlemeir <bpringlemeir@nbsps.com>
To: Stefan Agner <stefan@agner.ch>
Cc: b21989@freescale.com, linux-mtd@lists.infradead.org,
	Jason.jin@freescale.com, linux-arm-kernel@lists.infradead.org
Subject: Re: [RFC 2/5] mtd:fsl_nfc: Add hardware 45 byte BHC-ECC support for 24 bit corrections.
Date: Thu, 11 Dec 2014 11:44:30 -0500
Message-ID: <8761dit9u9.fsf@nbsps.com>
In-Reply-To: <0bc8cec13bcf5b9cbea9cd3345815e4a@agner.ch> (Stefan Agner's message of "Wed, 10 Dec 2014 15:56:39 +0100")


>>>> On 17 Sep 2014, stefan@agner.ch wrote:

>>> Yes, we are using Macronix SLC NAND.

>>>> On 17 Sep 2014, stefan@agner.ch wrote:

>>> This is a new device, but it's one out of several dozen. The device
>>> had two factory-marked bad pages. These four pages would then make 6
>>> bad pages. I would say that your guess is probably the case at hand
>>> (should be considered bad, but were marked by factory).

On 10 Dec 2014, stefan@agner.ch wrote:

> What I currently do is simply accept up to strength / 2 flipped bits.
> This is not a clean solution since it will also count the ECC bits,
> but it works for now:
> --- a/drivers/mtd/nand/fsl_nfc.c
> +++ b/drivers/mtd/nand/fsl_nfc.c
> @@ -524,7 +524,7 @@ static int nfc_correct_data(struct mtd_info *mtd, u_char *dat,
> flip = count_written_bits(dat, nfc->chip.ecc.size, ecc_count);
>
> /* ECC failed. */
> -       if (flip > ecc_count)
> +       if (flip > ecc_count && flip > (nfc->chip.ecc.strength / 2))
> return -1;
>
> /* Erased page. */
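
To make the effect of the quoted change concrete, here is a minimal sketch of the two conditions as standalone predicates. The function names are hypothetical and not part of the driver; only the two comparisons are taken from the diff. The strength value of 24 used in the note below is an assumption based on the "24 bit corrections" in the subject line.

```c
#include <stdbool.h>

/*
 * Hypothetical helper names for illustration only; they do not exist in
 * fsl_nfc.c. "flip" is the number of zero bits counted in the page data,
 * "ecc_count" is the error count reported by the controller.
 */

/* Original condition: fail when more zero bits are seen than ecc_count. */
static bool ecc_failed_original(int flip, int ecc_count)
{
	return flip > ecc_count;
}

/* Patched condition: also tolerate up to strength / 2 zero bits. */
static bool ecc_failed_patched(int flip, int ecc_count, int strength)
{
	return flip > ecc_count && flip > strength / 2;
}
```

With flip = 3, ecc_count = 2 and an assumed strength of 24, the original condition reports a failure while the patched one does not, since 3 is not greater than 12.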

> I think we are facing multiple issues here. One might contain general
> software/hardware issues (non bit-flip related). I had this issue
> again on a different module with 3.18-rc5 (without the "fix"
> above). The kernel output looks like this:

[snip]

> Interestingly, this error happens on every second PEB (every 128
> pages, but the erase block size is 64) and it is always the second
> page. On that device, this is completely reproducible, e.g. I can
> erase everything and flash it again, and the same happens.

> I dumped the block in question:

> Page 00240800 dump:
> ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
> ....
> ff ff ff ff ff ff ff ff  f7 ff ff ff ff ff ff ff
> ....
> ff ff ff ff ff ff ff ff  ff ff fb ff ff ff ff ff
> ....
> ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff f7
> ....

> I also printed the flip count and ecc_count; the values for all those
> pages are: flip 3, ecc_count 2
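
The reported flip count can be checked against the bytes visible in the dump. The sketch below counts bits that read back as 0 in a buffer; `count_zero_bits` and `dump_bytes` are hypothetical names, and this is not claimed to be how the driver's `count_written_bits()` is implemented. It only covers the three non-0xff bytes shown above, since the rest of the dump is elided.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the bits that read back as 0. An erased NAND page is expected to
 * read as all 1s, so every 0 bit is either written data or a bit flip.
 */
static int count_zero_bits(const uint8_t *buf, size_t len)
{
	int count = 0;

	for (size_t i = 0; i < len; i++) {
		uint8_t inverted = (uint8_t)~buf[i];

		/* Clear the lowest set bit until none remain. */
		while (inverted) {
			inverted &= (uint8_t)(inverted - 1);
			count++;
		}
	}
	return count;
}

/* The three non-0xff bytes visible in the quoted dump. */
static const uint8_t dump_bytes[] = { 0xf7, 0xfb, 0xf7 };
```

Each of 0xf7 and 0xfb has exactly one cleared bit, so the three bytes contribute three zero bits in total, which agrees with the reported "flip 3".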

> Now the interesting part: When I erase the block, and dump that page
> again, it is completely empty! No flips, no ecc_count anymore! UBI
> attach writes something into the first page, hence it looks like this
> write into the first page influences the values of the second
> page... I verified this behavior using U-Boot and the Linux
> kernel.

> I dug a bit deeper and wrote just zeros into the first page. In
> the second page some bits are flipped. However, writing into the
> second page does not influence the third page. But a bit in the first
> page is flipped. And the third page influences the fourth page. It
> looks like the pages behave in pairs.... Any idea what kind of issue
> we are facing here?

Hmm.  It sounds like MLC flash, but you say you have SLC.  It could be
that some bus signalling is marginal?  Could you reduce the clocks a bit
on this device and see if the behaviour changes?  I am pretty sure that
stuck-at-zero errors will stay that way.

I would love to get back to this controller code to fix some issues you
noted and bring in the changes to the u-boot review.  Unfortunately, I
keep getting stuck with legacy hw issues.

fwiw,
Bill.
