From: Boris Brezillon <boris.brezillon@free-electrons.com>
To: Tim Harvey <tharvey@gateworks.com>
Cc: Richard Weinberger <richard@nod.at>,
	Elie De Brauwer <eliedebrauwer@gmail.com>,
	Artem Bityutskiy <dedekind1@gmail.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	linux-mtd@lists.infradead.org,
	Huang Shijie <shijie.huang@arm.com>,
	Brian Norris <computersforpeace@gmail.com>
Subject: Re: UBIFS corruption after power cut - possibly unstable bits issue?
Date: Tue, 1 Dec 2015 10:12:49 +0100
Message-ID: <20151201101249.1fc3448f@bbrezillon>
In-Reply-To: <CAJ+vNU2M5getk6=J-wXDhQf1qwPr7ivRQFtO8pV1695WK7ZR+g@mail.gmail.com>

On Mon, 30 Nov 2015 13:58:34 -0800
Tim Harvey <tharvey@gateworks.com> wrote:

> On Mon, Nov 16, 2015 at 7:01 AM, Tim Harvey <tharvey@gateworks.com> wrote:
> > On Tue, Nov 3, 2015 at 5:38 AM, Boris Brezillon
> > <boris.brezillon@free-electrons.com> wrote:
> >> Hi Tim,
> >>
> >> On Mon, 2 Nov 2015 12:31:11 -0800
> >> Tim Harvey <tharvey@gateworks.com> wrote:
> >>
> >>> On Mon, Nov 2, 2015 at 12:27 PM, Tim Harvey <tharvey@gateworks.com> wrote:
> >>> > [    8.635364] UBIFS (ubi0:0): recovery needed
> >>> > [    8.676203] ubi0 warning: ubi_io_read: error -74 (ECC error) while
> >>> > reading 69632 bytes from PEB 2254:192512, read only 69632 bytes, retry
> >>> > [    8.692460] ubi0 warning: ubi_io_read: error -74 (ECC error) while
> >>> > reading 69632 bytes from PEB 2254:192512, read only 69632 bytes, retry
> >>> > [    8.708741] ubi0 warning: ubi_io_read: error -74 (ECC error) while
> >>> > reading 69632 bytes from PEB 2254:192512, read only 69632 bytes, retry
> >>> > ^^^^ uncorrectable ECC error on PEB 2254 - I verified that this was
> >>> > not the first time this PEB has been used
> >>
> >> I suspect one of the bits in PEB 2254 is stuck at 0 (even after
> >> erasing the block, the bit stays at 0). Have you tried erasing this
> >> block (flash_erase /dev/mtd2 0x23380000 1) and dumping it in raw mode
> >> (nanddump -n -l 0x40000 -s 0x23380000 -f /tmp/dump /dev/mtd2)?
> >
> > Boris,
> >
> > I examined the bad PEB on several boards on which I have now
> > reproduced this issue and found no stuck bits (no 0's following erase, no
> > 1's following erase and raw write all ff's).
> >
> > So in this case it doesn't appear to be a bad block. Incidentally for
> > UBI/UBIFS, what is in charge of detecting bad blocks, how are they
> > detected, and when/how are they marked?
> >
> >>
> >>> >
> >>> > I've cc'd Huang, Elie, and Brian who were involved in the patch to
> >>> > detect bit-flips in gpmi-nand.c reads - perhaps they have some more
> >>> > ideas. I find it interesting that in one case that patch resolves the
> >>> > issue and in the other it does not.
> >>
> >> I posted a slightly reworked version of Huang's patch [1] a while ago
> >> addressing the "account for bitflips in OOB area" problem, but maybe we
> >> could do better (avoid this extra "read in raw mode" step, or use the
> >> generic nand_check_erased_ecc_chunk() function when ECC bytes are
> >> aligned).
> >>
> >> Best Regards,
> >>
> >> Boris
> >>
> >> [1] https://patchwork.ozlabs.org/patch/416543/
> >
> > At this point I likely need to reproduce this problem with additional
> > debugging enabled to show what last erased and/or wrote to the PEBs
> > that are corrupt. I will also try your patch and see if that
> > resolves anything.
> >
> > Regards,
> >
> > Tim
> 
> Boris,
> 
> I tried your patch [1] in a week-long test on 10 IMX6 boards,
> booting over 60K times across temperature ranges, and the patch
> resolved many of the previous failures to mount the rootfs (previously
> I would see around a 1% failure rate). In addition, I saw
> no NAND corruption, where I would have expected to see it several times
> with those numbers, so I suspect the patch may have resolved that as well.
> 
> Can you re-submit your patch for inclusion and/or discussion?

I'm quite busy on other topics lately, but feel free to adapt/resubmit
the patch.

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
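
For readers following the thread, the stuck-bit check and the "erased page with a few bitflips" idea discussed above can be sketched in a few lines of Python. This is only an illustration, not the kernel's nand_check_erased_ecc_chunk() and not the patch in [1]. It assumes UBI is attached to the whole of /dev/mtd2 starting at offset 0 and that the PEB size is 0x40000, as the quoted flash_erase/nanddump commands imply. The helper names and the threshold parameter are hypothetical.

```python
def peb_offset(peb, peb_size):
    """Byte offset of a PEB on the MTD device.

    Assumption: UBI is attached at offset 0 of the device.
    """
    return peb * peb_size


def non_ff_bytes(data):
    """Return (offset, value) for every byte that is not 0xFF.

    On a freshly erased block dumped in raw mode, any hit here is a
    bit that reads back as 0, i.e. a stuck-at-0 candidate.
    """
    return [(i, b) for i, b in enumerate(data) if b != 0xFF]


def zero_bits(data):
    """Count the bits that read as 0 in a buffer."""
    return sum(bin(b ^ 0xFF).count("1") for b in data)


def looks_erased(chunk, threshold):
    """Treat a chunk as erased if it has at most `threshold` zero bits.

    `threshold` is a hypothetical parameter; a real driver would derive
    it from the ECC strength.
    """
    return zero_bits(chunk) <= threshold


if __name__ == "__main__":
    # PEB 2254 with 256 KiB PEBs matches the offset used in the thread.
    print(hex(peb_offset(2254, 0x40000)))
```

To check a dump, one could pass the contents of the raw dump file (for example /tmp/dump from the nanddump command quoted above) to non_ff_bytes(); an empty list means no bit read back as 0 after the erase.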
