From: Miquel Raynal <miquel.raynal@bootlin.com>
To: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: linux-mtd <linux-mtd@lists.infradead.org>
Subject: Re: nand: WARNING: a0000000.nand: the ECC used on your system (1b/256B) is too weak compared to the one required by the NAND chip (4b/512B)
Date: Sat, 19 Jun 2021 20:40:35 +0200
Message-ID: <20210618225032.69cdc30c@xps13>
In-Reply-To: <d37a8a7e-6181-9642-18fb-470d1d8cf006@csgroup.eu>

Hi Christophe,

> >> Now and then I'm using one of the latest kernels (Today is 5.13-rc6), and sometime in one of the 5.x releases, I started to get errors like:
> >>
> >> [    5.098265] ecc_sw_hamming_correct: uncorrectable ECC error
> >> [    5.103859] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 60 bytes from PEB 99:59824, read only 60 bytes, retry
> >> [    5.525843] ecc_sw_hamming_correct: uncorrectable ECC error
> >> [    5.531571] ecc_sw_hamming_correct: uncorrectable ECC error
> >> [    5.537490] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 3073 bytes from PEB 107:108976, read only 3073 bytes, retry
> >> [    5.691121] ecc_sw_hamming_correct: uncorrectable ECC error
> >> [    5.696709] ecc_sw_hamming_correct: uncorrectable ECC error
> >> [    5.702426] ecc_sw_hamming_correct: uncorrectable ECC error
> >> [    5.708141] ecc_sw_hamming_correct: uncorrectable ECC error
> >> [    5.714103] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 3035 bytes from PEB 107:25144, read only 3035 bytes, retry
> >> [   20.523689] random: crng init done
> >> [   21.892130] ecc_sw_hamming_correct: uncorrectable ECC error
> >> [   21.897730] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 1394 bytes from PEB 116:75776, read only 1394 bytes, retry
> >>
> >> Most of the time, when the reading of the file fails, I just have to read it once more and it gets read without that error.  
> > 
> > It really looks like a regular bitflip happening "sometimes". Is this a
> > board which already had a life? What are the usage counters (UBI should
> > tell you this) compared to the official endurance of your chip (see the
> > datasheet)?  
> 
> The board had a peaceful life:
> 
> UBI reports "ubi0: max/mean erase counter: 49/20, WL threshold: 4096"

Mmmh. Indeed.

> 
> I have tried with half a dozen boards and all have the issue.
> 
> >   
> >> What am I supposed to do to avoid the ECC weakness warning at startup and to fix that ECC error issue ?  
> > 
> > I honestly don't think the errors come from the 5.1x kernels given the
> > above logs. If you flash back your old 4.14 I am pretty sure you'll
> > have the same errors at some point.  
> 
> I don't have any problem like that with 4.14 on any of the boards.
> 
> When booting a 4.14 kernel I don't get any problem on the same board.
> 

If you can reliably show that when returning to a 4.14 kernel the ECC
weakness disappears, then there is certainly something new. What driver
are you using? Maybe you can do a bisection?

> > 
> > NAND really is a fragile storage medium, not following in a production
> > environment the minimum ECC scheme (there is a real difference between
> > 1/256 vs 4/512) really leads to complicated solutions like this one,
> > unfortunately...  
> 
> I see kernel has "Software BCH ECC". Should I use that with that chip ?
> 
> If yes, how do I use it ? Seems like selecting the option at Kernel build is not enough, do I have to configure something somewhere, for instance in the device tree ? At the time being I have the following in the device tree:

Enabling software BCH in the configuration will just build the support
into the kernel. You then need to follow the NAND controller bindings,
see the example in [1].

However, given all the data you provided, I now think that there is
something weird happening in the driver you use; it would be worth
trying to understand what.

[1] Documentation/devicetree/bindings/mtd/nand-controller.yaml
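
To put the 1b/256B versus 4b/512B mismatch from the warning into
numbers, here is a rough Python sketch. It is only an illustration of
the arithmetic, not the kernel's code. The 2048-byte page size and the
helper names are assumptions; the page size of the chip is not given
in this thread.

```python
def ecc_bits_per_page(strength, step_size, page_size):
    """Correctable bits per page, assuming errors spread evenly over the ECC steps."""
    return (page_size // step_size) * strength


def ecc_is_strong_enough(used, required, page_size):
    """used and required are (strength_bits, step_size_bytes) tuples."""
    used_strength, used_step = used
    req_strength, req_step = required
    per_page_ok = (ecc_bits_per_page(used_strength, used_step, page_size)
                   >= ecc_bits_per_page(req_strength, req_step, page_size))
    # Several bitflips landing in the same step must also be correctable.
    per_step_ok = used_strength >= req_strength
    return per_page_ok and per_step_ok


PAGE_SIZE = 2048     # assumption: hypothetical page size, not from this thread

hamming = (1, 256)   # 1 bit per 256 bytes, as reported for the system
required = (4, 512)  # 4 bits per 512 bytes, as reported for the chip

print(ecc_bits_per_page(*hamming, PAGE_SIZE))    # 8
print(ecc_bits_per_page(*required, PAGE_SIZE))   # 16
print(ecc_is_strong_enough(hamming, required, PAGE_SIZE))  # False
```

Under these assumptions the Hamming scheme covers half as many bitflips
per page as the chip asks for, and it cannot repair two bitflips that
land in the same 256-byte step.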

Thanks,
Miquèl

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/
