public inbox for linux-mtd@lists.infradead.org
From: Richard Weinberger <richard@nod.at>
To: Olivier Schonken <olivier.schonken@gmail.com>
Cc: linux-mtd@lists.infradead.org,
	David Woodhouse <dwmw2@infradead.org>,
	Brian Norris <computersforpeace@gmail.com>,
	Boris Brezillon <boris.brezillon@bootlin.com>,
	Marek Vasut <marek.vasut@gmail.com>
Subject: Re: Atmel Nand PMECC UBI ECC issue
Date: Mon, 26 Mar 2018 22:07:49 +0200
Message-ID: <1849050.lXIvJB3S2I@blindfold>
In-Reply-To: <CALdGskLaHrZGa7u4_jmJjq1VPRv6AYgRpSsYcNy+-EtgWR0H=Q@mail.gmail.com>

Olivier,

On Monday, 26 March 2018, 16:56:17 CEST, Olivier Schonken wrote:
> Sorry for the resend, seems my gmail editor was in HTML mode which got
> rejected by the mailing list.  Humble apologies.
> 
> I have run into an issue with the Atmel nand controller on the
> SAMA5D36, which I am struggling to debug.
> 
> We are using custom hardware based on the SAMA5D36, with Micron
> MT29F8G08ABBCAH4 NAND flash.  Kernel version is 4.14.29 - mainline
> from kernel.org.  ECC strength is 24 bits with 1024 byte sector size.
> The PMECC settings were calculated as per
> https://www.at91.com/linux4sam/bin/view/Linux4SAM/PmeccConfigure, with
> the nand HEADER value at 0xc0e18e05.
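As a sanity check on that header value, here is a minimal sketch that
decodes it. The bit layout is an assumption based on my reading of the
Linux4SAM PmeccConfigure page (bit 0: use PMECC, bits 1-3: sectors per
page code, bits 4-12: spare size, bits 13-15: ECC strength code, bits
16-17: sector size code, bits 18-26: ECC offset, bits 28-31: key).
decode_pmecc_header is a hypothetical helper name, not part of any tool.

```python
# Assumed code tables for the PMECC header fields (not taken from this thread).
SECTORS_PER_PAGE = {0: 1, 1: 2, 2: 4, 3: 8}
ECC_BITS = {0: 2, 1: 4, 2: 8, 3: 12, 4: 24}
SECTOR_SIZE = {0: 512, 1: 1024}


def decode_pmecc_header(value):
    """Split a 32-bit PMECC header word into its assumed fields."""
    return {
        "use_pmecc": bool(value & 0x1),
        "sectors_per_page": SECTORS_PER_PAGE.get((value >> 1) & 0x7),
        "spare_size": (value >> 4) & 0x1FF,
        "ecc_bits": ECC_BITS.get((value >> 13) & 0x7),
        "sector_size": SECTOR_SIZE.get((value >> 16) & 0x3),
        "ecc_offset": (value >> 18) & 0x1FF,
        "key": (value >> 28) & 0xF,
    }


fields = decode_pmecc_header(0xC0E18E05)
for name, val in fields.items():
    print(f"{name}: {val}")
```

Under that assumed layout the value decodes to 4 sectors of 1024 bytes,
24-bit ECC, a 224-byte spare area and an ECC offset of 56, which agrees
with the mtdinfo output and the device tree entry quoted further down.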
> 
> The system works, and only some units show the error. The baffling
> part is that a unit can work properly for a long while and then
> suddenly the error appears. (I once traced it to a glibc library
> file, which means it isn't even due to heavy writing on the
> filesystem.) I have noticed that most of the time the error occurs in
> the same PEB, even after reprogramming the device via ubiformat or
> SAM-BA.
> 
> In the attached log output, you will see that there is a UBIFS error,
> where it detects a bitflip, which I confirmed by comparing the binary
> sequence to the Buildroot generated ubi file.
> 
> Using Atmel's SAM-BA to read back the contents of the NAND flash,
> yields the correct contents for the page causing the ECC error.
> 
> 31 18 10 06 00 FE A2 74 FB CF 00 00 00 00 00 00 C5 05 00 00 01 00 00
> 00 AB 0C 00 00

At which offset is this?

> Starting up linux again results in the same issue.
> This extract shows the ubifs magic number with the bitflip. The rest
> of the binary sequence matches a unique part of the ubi image.
> 
> [   75.140000] 7fe0: b6f8f8e4 becf7a40 b6be5788 b6e9c000 60000010 ffffffff
> [   75.150000] UBIFS error (ubi0:0 pid 1): ubifs_check_node: bad magic 0x6101830, expected 0x6101831
> [   75.160000] UBIFS error (ubi0:0 pid 1): ubifs_check_node: bad node at LEB 325:216208
> [   75.160000] Not a node, first 24 bytes:
> [   75.160000] 00000000: 30 18 10 06 00 fe a2 74 fb cf 00 00 00 00 00 00 c5 05 00 00 01 00 00 00  0......t................
> [   75.180000] CPU: 0 PID: 1 Comm: systemd Not tainted 4.14.29+ #706
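To make the single-bit difference in the log above explicit, here is a
small sketch that XORs the value read against the expected UBIFS magic
and lists the differing bit positions. flipped_bits is a hypothetical
helper name. The sketch assumes the magic is stored little-endian on
flash, which matches the "30 18 10 06" byte order in the dump.

```python
def flipped_bits(read_value, expected_value):
    """Return the bit positions (0 = least significant) that differ."""
    diff = read_value ^ expected_value
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


bad_magic = 0x06101830    # value reported by ubifs_check_node
good_magic = 0x06101831   # expected value from the same log line

bits = flipped_bits(bad_magic, good_magic)
first_byte_read = bad_magic.to_bytes(4, "little")[0]
first_byte_expected = good_magic.to_bytes(4, "little")[0]

print("differing bits:", bits)
print(f"first byte on flash: read 0x{first_byte_read:02x}, "
      f"expected 0x{first_byte_expected:02x}")
```

Only bit 0 differs, so the first byte of the node reads 0x30 where the
image has 0x31.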
> 
> mtdinfo for the partition in question
> Type:                           nand
> Eraseblock size:                262144 bytes, 256.0 KiB
> Amount of eraseblocks:          2048 (536870912 bytes, 512.0 MiB)
> Minimum input/output unit size: 4096 bytes
> Sub-page size:                  4096 bytes
> OOB size:                       224 bytes
> Character device major/minor:   90:10
> Bad blocks are allowed:         true
> Device is writable:             true
> 
> Device tree entry:
>         nand_controller: nand-controller {
>                 status = "okay";
> 
>                 nand@3 {
>                     reg = <0x3 0x0 0x800000>;
>                     atmel,rb = <0>;
>                     nand-bus-width = <8>;
>                     nand-ecc-mode = "hw";
>                     nand-ecc-strength = <24>;
>                     nand-ecc-step-size = <1024>;
>                     nand-on-flash-bbt;
>                     label = "atmel_nand";
>                 };
>             };
> 
> Attached are the dmesg traces with the ECC issue.  A nanddump of the
> block with the ECC error, including OOB contents as per "nanddump -f
> nandblock-withoob.ubi /dev/mtd5 -s 0x51c0000 -o -l 262144 &>
> nanddump-cmdline-output.txt"

Can you please share the dump without OOB?
UBI does not use OOB, so we don't need it and can use offsets as seen by UBI 
and UBIFS as-is. :)
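
If re-dumping is inconvenient, the existing dump could also be converted.
The following is only a sketch under stated assumptions: that nanddump
wrote each 4096-byte page followed by its 224 OOB bytes, and that UBI
uses its usual layout for this geometry (EC header in page 0, VID header
in page 1, data starting at offset 8192). strip_oob and locate_in_peb are
hypothetical helper names.

```python
PAGE_SIZE = 4096     # "Minimum input/output unit size" from mtdinfo
OOB_SIZE = 224       # "OOB size" from mtdinfo
PEB_SIZE = 262144    # "Eraseblock size" from mtdinfo


def strip_oob(dump, page_size=PAGE_SIZE, oob_size=OOB_SIZE):
    """Drop the OOB bytes that follow every page in a page+OOB dump."""
    stride = page_size + oob_size
    if len(dump) % stride:
        raise ValueError("dump length is not a multiple of page + OOB size")
    return b"".join(dump[i:i + page_size] for i in range(0, len(dump), stride))


def locate_in_peb(leb_offset, data_offset=2 * PAGE_SIZE, page_size=PAGE_SIZE):
    """Map a LEB offset to (offset in PEB, page index, offset in page).

    data_offset is an assumption: UBI data starting after two header pages.
    """
    peb_offset = data_offset + leb_offset
    return peb_offset, peb_offset // page_size, peb_offset % page_size


peb_index = 0x51C0000 // PEB_SIZE          # start offset passed to nanddump -s
peb_offset, page, in_page = locate_in_peb(216208)   # from "LEB 325:216208"
print(f"PEB {peb_index}, offset {peb_offset}, page {page}, byte {in_page} in page")
```

With those assumptions the bad node sits in PEB 327, at offset 224400
within the block, which is byte 3216 of page 54.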

Thanks,
//richard
