From: Richard Weinberger <richard@nod.at>
To: Olivier Schonken <olivier.schonken@gmail.com>
Cc: linux-mtd@lists.infradead.org,
David Woodhouse <dwmw2@infradead.org>,
Brian Norris <computersforpeace@gmail.com>,
Boris Brezillon <boris.brezillon@bootlin.com>,
Marek Vasut <marek.vasut@gmail.com>
Subject: Re: Atmel Nand PMECC UBI ECC issue
Date: Mon, 26 Mar 2018 22:07:49 +0200
Message-ID: <1849050.lXIvJB3S2I@blindfold>
In-Reply-To: <CALdGskLaHrZGa7u4_jmJjq1VPRv6AYgRpSsYcNy+-EtgWR0H=Q@mail.gmail.com>
Olivier,
On Monday, 26 March 2018, 16:56:17 CEST, Olivier Schonken wrote:
> Sorry for the resend, seems my gmail editor was in HTML mode which got
> rejected by the mailing list. Humble apologies.
>
> I have run into an issue with the Atmel nand controller on the
> SAMA5D36, which I am struggling to debug.
>
> We are using custom hardware based on the SAMA5D36, with Micron
> MT29F8G08ABBCAH4 NAND flash. Kernel version is 4.14.29 - mainline
> from kernel.org. ECC strength is 24 bits with 1024 byte sector size.
> The PMECC settings were calculated as per
> https://www.at91.com/linux4sam/bin/view/Linux4SAM/PmeccConfigure, with
> the nand HEADER value at 0xc0e18e05.
>
> The system works, and only some units show the error. The baffling
> part is that a unit can work properly for a long while, and then
> the error suddenly appears. (I once traced it to a glibc
> library file, which means it isn't even due to heavy writing to the
> filesystem.) I have noticed that most of the time the error occurs
> in the same PEB, even after reprogramming the device via
> ubiformat or SAM-BA.
>
> In the attached log output, you will see that there is a UBIFS error,
> where it detects a bitflip, which I confirmed by comparing the binary
> sequence to the Buildroot generated ubi file.
>
> Using Atmel's SAM-BA to read back the contents of the NAND flash,
> yields the correct contents for the page causing the ECC error.
>
> 31 18 10 06 00 FE A2 74 FB CF 00 00 00 00 00 00 C5 05 00 00 01 00 00 00 AB 0C 00 00
At which offset is this?
> Starting up linux again results in the same issue.
> This extract shows the ubifs magic number with the bitflip. The rest
> of the binary sequence matches a unique part of the ubi image.
>
> [ 75.140000] 7fe0: b6f8f8e4 becf7a40 b6be5788 b6e9c000 60000010 ffffffff
> [ 75.150000] UBIFS error (ubi0:0 pid 1): ubifs_check_node: bad magic 0x6101830, expected 0x6101831
> [ 75.160000] UBIFS error (ubi0:0 pid 1): ubifs_check_node: bad node at LEB 325:216208
> [ 75.160000] Not a node, first 24 bytes:
> [ 75.160000] 00000000: 30 18 10 06 00 fe a2 74 fb cf 00 00 00 00 00 00 c5 05 00 00 01 00 00 00  0......t................
> [ 75.180000] CPU: 0 PID: 1 Comm: systemd Not tainted 4.14.29+ #706
>
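The bad and expected magic values differ in a single bit. A minimal sketch to check that, using only the bytes quoted from the log and from the SAM-BA readback (the helper name `flipped_bits` is made up for illustration):

```python
import struct

# First four bytes of the node as seen by Linux (from the dmesg hexdump)
BAD = bytes.fromhex("30 18 10 06")
# Same four bytes as read back through SAM-BA
GOOD = bytes.fromhex("31 18 10 06")


def flipped_bits(a: bytes, b: bytes):
    """Return (byte_index, bit_index) for every bit that differs between a and b."""
    if len(a) != len(b):
        raise ValueError("buffers must have the same length")
    return [
        (i, bit)
        for i, (x, y) in enumerate(zip(a, b))
        for bit in range(8)
        if ((x ^ y) >> bit) & 1
    ]


# The magic is stored little-endian, so the bytes decode to the values in the log.
bad_magic = struct.unpack("<I", BAD)[0]
good_magic = struct.unpack("<I", GOOD)[0]

print(hex(bad_magic), hex(good_magic), flipped_bits(BAD, GOOD))
```

Bit 0 of byte 0 is the only difference, i.e. a single bitflip in the first byte of the node.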
> mtdinfo for the partition in question
> Type: nand
> Eraseblock size: 262144 bytes, 256.0 KiB
> Amount of eraseblocks: 2048 (536870912 bytes, 512.0 MiB)
> Minimum input/output unit size: 4096 bytes
> Sub-page size: 4096 bytes
> OOB size: 224 bytes
> Character device major/minor: 90:10
> Bad blocks are allowed: true
> Device is writable: true
>
> Device tree entry:
> nand_controller: nand-controller {
>         status = "okay";
>
>         nand@3 {
>                 reg = <0x3 0x0 0x800000>;
>                 atmel,rb = <0>;
>                 nand-bus-width = <8>;
>                 nand-ecc-mode = "hw";
>                 nand-ecc-strength = <24>;
>                 nand-ecc-step-size = <1024>;
>                 nand-on-flash-bbt;
>                 label = "atmel_nand";
>         };
> };
>
> Attached are the dmesg traces with the ECC issue. A nanddump of the
> block with the ECC error, including OOB contents as per "nanddump -f
> nandblock-withoob.ubi /dev/mtd5 -s 0x51c0000 -o -l 262144 &>
> nanddump-cmdline-output.txt"
Can you please share the dump without OOB?
UBI does not use OOB, so we don't need it and can use offsets as seen by UBI
and UBIFS as-is. :)
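To illustrate why the OOB gets in the way, here is a rough sketch, not a definitive tool. It assumes the dump stores each 4096-byte page followed directly by its 224 bytes of OOB (sizes taken from your mtdinfo output), and that UBI uses its usual layout for this geometry, with the EC and VID headers occupying one page each so that LEB data starts 8192 bytes into the PEB. The helper names are hypothetical.

```python
PAGE_SIZE = 4096  # "Minimum input/output unit size" from mtdinfo
OOB_SIZE = 224    # "OOB size" from mtdinfo
# Assumption: EC header in page 0, VID header in page 1, data from page 2 on.
LEB_DATA_OFFSET = 2 * PAGE_SIZE


def strip_oob(dump: bytes, page_size: int = PAGE_SIZE, oob_size: int = OOB_SIZE) -> bytes:
    """Drop the OOB area that follows every page in a page+OOB dump."""
    stride = page_size + oob_size
    if len(dump) % stride:
        raise ValueError("dump is not a whole number of page+OOB records")
    return b"".join(dump[i:i + page_size] for i in range(0, len(dump), stride))


def leb_to_peb_offset(leb_offset: int, data_offset: int = LEB_DATA_OFFSET) -> int:
    """Translate an offset inside a LEB to an offset inside the PEB holding it."""
    return data_offset + leb_offset


def offset_in_oob_dump(peb_offset: int, page_size: int = PAGE_SIZE, oob_size: int = OOB_SIZE) -> int:
    """Where a PEB offset ends up in a dump that interleaves OOB after each page."""
    page, within = divmod(peb_offset, page_size)
    return page * (page_size + oob_size) + within


# "bad node at LEB 325:216208" from the log
peb_off = leb_to_peb_offset(216208)
print(peb_off, divmod(peb_off, PAGE_SIZE), offset_in_oob_dump(peb_off))
```

Without OOB the node sits at the offset UBI reports plus the header pages; with OOB in the file every page shifts the position by another 224 bytes.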
Thanks,
//richard