* NAND ECC errors
@ 2025-08-07 15:16 markus.stockhausen
2025-08-07 20:36 ` Chris Packham
0 siblings, 1 reply; 10+ messages in thread
From: markus.stockhausen @ 2025-08-07 15:16 UTC (permalink / raw)
To: miquel.raynal, vigneshr, richard, tudor.ambarus, linux-mtd
Cc: 'Chris Packham'
Hi,
Chris (CC) developed the drivers/spi/spi-realtek-rtl-snand.c for the
Realtek switch platform. Thanks for that and the inclusion into mainline.
While adding it to one of my devices I'm getting ECC errors.
Situation is as follows.
- Linksys LGS328 (with RTL9301 SOC and that NAND controller)
- OpenWrt with Kernel 6.12 longterm
- The Realtek SPI NAND driver (backported from current master)
- Macronix MX35LF1GE4AB (1GBit)
- Boot via TFTP
I found a vendor UBI partition in NAND that I want to analyze.
It is actively used, and the vendor firmware seems to work on it.
I assume it contains a filesystem with configuration and logs.
During ubiattach I get tons of errors "ubi0 warning: ubi_io_read:
Error -77 (ECC error) while reading 64 bytes from PEB 0:0, read
only 64 bytes, retry".
Call stack shows:
spinand_mtd_regular_page_read
spinand_read_page
spinand_load_page_op
spinand_wait -> sets status = STATUS_ECC_UNCOR_ERROR
nand_ecc_finish_io_req start
spinand_ondie_ecc_finish_io_req run
spinand_check_ecc_status start
macronix_ecc_get_status -> reads status & returns -EBADMSG
Reading data from NAND directly, I see this layout for 2K of data:
- 4x 512 bytes data
- 4x 6 bytes oob = 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
- 4x 10 bytes ECC
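The layout listed above can be sketched in Python as follows. This is a minimal sketch, assuming the 4x6 free bytes and the 4x10 ECC bytes sit back to back in the 64-byte OOB area in exactly the order listed; the function name `split_raw_page` and the constants are illustrative and not part of any kernel or vendor code.

```python
STEPS = 4        # ECC steps per page
STEP_DATA = 512  # data bytes per step
STEP_FREE = 6    # free OOB bytes per step (0xff on the dumped pages)
STEP_ECC = 10    # ECC bytes per step


def split_raw_page(raw: bytes):
    """Split one raw page (2048 data + 64 OOB) into per-step (data, free, ecc)."""
    expected = STEPS * (STEP_DATA + STEP_FREE + STEP_ECC)  # 2112 bytes
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")

    free_base = STEPS * STEP_DATA              # free bytes start after the data
    ecc_base = free_base + STEPS * STEP_FREE   # ECC bytes start after the free bytes

    steps = []
    for i in range(STEPS):
        data = raw[i * STEP_DATA:(i + 1) * STEP_DATA]
        free = raw[free_base + i * STEP_FREE:free_base + (i + 1) * STEP_FREE]
        ecc = raw[ecc_base + i * STEP_ECC:ecc_base + (i + 1) * STEP_ECC]
        steps.append((data, free, ecc))
    return steps
```

Feeding it a raw dump that includes the OOB area yields four tuples, one per 512-byte step, which makes it easy to compare the stored ECC bytes against a recalculated value.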
A quick ECC calc for empty blocks says it must be BCH6. So now I have
several options but have no idea if I'm right or which to follow.
1. The NAND chip seems to have ECC built in. Ignored by vendor?
2. There is a hardware ECC controller -> Driver must be coded
3. Maybe I must activate the software BCH driver
4. The old vendor firmware (Linux 4.x) uses other ECC logic.
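The BCH6 guess above can be cross-checked with plain arithmetic. A binary BCH code over GF(2^m) that corrects t bit errors needs at most m*t parity bits, and its codeword cannot be longer than 2^m - 1 bits. The sketch below finds the smallest usable m and the resulting parity size; the helper name `bch_parameters` is made up for illustration.

```python
import math


def bch_parameters(message_bytes: int, strength: int):
    """Return (m, parity_bits, parity_bytes) for a binary BCH code.

    m is the smallest field order GF(2^m) whose maximum codeword length
    (2^m - 1 bits) holds the message plus up to m * strength parity bits.
    """
    message_bits = 8 * message_bytes
    m = 2
    while (1 << m) - 1 < message_bits + m * strength:
        m += 1
    parity_bits = m * strength  # upper bound on the parity size
    return m, parity_bits, math.ceil(parity_bits / 8)


print(bch_parameters(512, 6))  # (13, 78, 10)
```

For a 512-byte step this gives m = 13, so strength 6 needs 78 bits, which rounds up to the 10 ECC bytes seen per step. Strengths 5 and 7 would need 9 and 12 bytes respectively, so the observed size fits BCH6.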
Does anyone have a good idea what to do first from here?
Thanks in advance.
Markus
______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/
* Re: NAND ECC errors
2025-08-07 15:16 NAND ECC errors markus.stockhausen
@ 2025-08-07 20:36 ` Chris Packham
2025-08-07 21:23 ` AW: " markus.stockhausen
2025-08-08 8:37 ` Miquel Raynal
0 siblings, 2 replies; 10+ messages in thread
From: Chris Packham @ 2025-08-07 20:36 UTC (permalink / raw)
To: markus.stockhausen@gmx.de, miquel.raynal@bootlin.com,
vigneshr@ti.com, richard@nod.at, tudor.ambarus@linaro.org,
linux-mtd@lists.infradead.org
Hi Markus,
On 08/08/2025 03:16, markus.stockhausen@gmx.de wrote:
> Hi,
>
> Chris (CC) developed the drivers/spi/spi-realtek-rtl-snand.c for the
> Realtek switch platform. Thanks for that and the inclusion into mainline.
> While adding it to one of my devices I'm getting ECC errors.
>
> Situation is as follows.
>
> - Linksys LGS328 (with RTL9301 SOC and that NAND controller)
> - OpenWrt with Kernel 6.12 longterm
> - The Realtek SPI NAND driver (backported from current master)
> - Macronix MX35LF1GE4AB (1GBit)
> - Boot via TFTP
>
> I found a vendor UBI partition in NAND that I want to analyze.
> It is actively used, and the vendor firmware seems to work on it.
> I assume it contains a filesystem with configuration and logs.
> During ubiattach I get tons of errors "ubi0 warning: ubi_io_read:
> Error -77 (ECC error) while reading 64 bytes from PEB 0:0, read
> only 64 bytes, retry".
>
> Call stack shows:
>
> spinand_mtd_regular_page_read
> spinand_read_page
> spinand_load_page_op
> spinand_wait -> sets status = STATUS_ECC_UNCOR_ERROR
> nand_ecc_finish_io_req start
> spinand_ondie_ecc_finish_io_req run
> spinand_check_ecc_status start
> macronix_ecc_get_status -> reads status & returns -EBADMSG
>
> Reading data from NAND directly I see this data layout for 2K data
>
> - 4x 512 bytes data
> - 4x 6 bytes oob = 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
> - 4x 10 bytes ECC
>
> A quick ECC calc for empty blocks says it must be BCH6. So now I have
> several options but have no idea if I'm right or which to follow.
>
> 1. The NAND chip seems to have ECC built in. Ignored by vendor?
As far as I understand the expectation in Linux was that all SPI-NAND
chips have on-die ECC.
> 2. There is a hardware ECC controller -> Driver must be coded
Yes there is an ECC controller in the RTL93xx chips but based on the
comment above (and some pretty useless documentation) I elected not to
attempt to use it.
> 3. Maybe I must activate the software BCH driver
Software BCH might be an alternative to using the ECC controller.
> 4. The old vendor firmware (Linux 4.x) uses other ECC logic.
I think this is the crux of the problem. Realtek seems totally
uninterested in upstreaming support for their chips (not sure how that's
going to pan out with emerging requirements like RED and CRA) so it's
left to people like you and me. In the meantime their SDK has made
decisions that upstream doesn't know about, and when it comes to things
like NAND ECC layouts this causes problems.
> Does anyone have a good idea what to do first from here?
Probably depends. Blanking the NAND chip and reformatting it will
resolve the errors from an upstream point of view. That's obviously not
really going to be something you want to do if you expect to swap back
and forth between the stock firmware and an upstream kernel.
You'll probably want to convince the mtd code to allow the on-die ECC to
be disabled and find whatever software BCH settings are needed that work
with the stock firmware. Then we could maybe look at using the ECC
controller to accelerate that.
> Thanks in advance.
>
> Markus
>
* AW: NAND ECC errors
2025-08-07 20:36 ` Chris Packham
@ 2025-08-07 21:23 ` markus.stockhausen
2025-08-07 21:32 ` Chris Packham
2025-08-08 8:37 ` Miquel Raynal
1 sibling, 1 reply; 10+ messages in thread
From: markus.stockhausen @ 2025-08-07 21:23 UTC (permalink / raw)
To: 'Chris Packham', miquel.raynal, vigneshr, richard,
tudor.ambarus, linux-mtd
> Does anyone have a good idea what to do first from here?
>
> Probably depends. Blanking the NAND chip and reformatting it will
> resolve the errors from an upstream point of view. That's obviously not
> really going to be something you want to do if you expect to swap back
> and forth between the stock firmware and an upstream kernel.
>
> You'll probably want to convince the mtd code to allow the on-die ECC to
> be disabled and find whatever software BCH settings are needed that work
> with the stock firmware. Then we could maybe look at using the ECC
> controller to accelerate that.
Good direction. So assumption is:
- vendor wrote BCH6 into OOB
- This design is also active in the vendor firmware (with Realtek ECC engine)
- now we read data with upstream kernel and on-die ECC
- Each read produces an on-die ECC error
- As soon as we write new data on-die ECC will be created
But where is the ECC knob in between spi-realtek-nand, mtd and ubi?
* Re: AW: NAND ECC errors
2025-08-07 21:23 ` AW: " markus.stockhausen
@ 2025-08-07 21:32 ` Chris Packham
2025-08-08 8:41 ` Miquel Raynal
0 siblings, 1 reply; 10+ messages in thread
From: Chris Packham @ 2025-08-07 21:32 UTC (permalink / raw)
To: markus.stockhausen@gmx.de, miquel.raynal@bootlin.com,
vigneshr@ti.com, richard@nod.at, tudor.ambarus@linaro.org,
linux-mtd@lists.infradead.org
On 08/08/2025 09:23, markus.stockhausen@gmx.de wrote:
>> Does anyone have a good idea what to do first from here?
>>
>> Probably depends. Blanking the NAND chip and reformatting it will
>> resolve the errors from an upstream point of view. That's obviously not
>> really going to be something you want to do if you expect to swap back
>> and forth between the stock firmware and an upstream kernel.
>>
>> You'll probably want to convince the mtd code to allow the on-die ECC to
>> be disabled and find whatever software BCH settings are needed that work
>> with the stock firmware. Then we could maybe look at using the ECC
>> controller to accelerate that.
> Good direction. So assumption is:
> - vendor wrote BCH6 into OOB
> - This design is also active in the vendor firmware (with Realtek ECC engine)
> - now we read data with upstream kernel and on-die ECC
> - Each read produces an on-die ECC error
> - As soon as we write new data on-die ECC will be created
>
> But where is the ECC knob in between spi-realtek-nand, mtd and ubi?
Yeah that's the part I don't know. I don't think there is an existing
one for SPI-NAND (there is one for parallel NAND). I think it's a chip
level thing so spi-realtek-nand will probably be ignorant of any such
option (at least until the ECC controller comes into play). I also think
UBI doesn't care who does the ECC as long as it is told when there
is an error.
* Re: NAND ECC errors
2025-08-07 20:36 ` Chris Packham
2025-08-07 21:23 ` AW: " markus.stockhausen
@ 2025-08-08 8:37 ` Miquel Raynal
1 sibling, 0 replies; 10+ messages in thread
From: Miquel Raynal @ 2025-08-08 8:37 UTC (permalink / raw)
To: Chris Packham
Cc: markus.stockhausen@gmx.de, vigneshr@ti.com, richard@nod.at,
tudor.ambarus@linaro.org, linux-mtd@lists.infradead.org
On 07/08/2025 at 20:36:56 GMT, Chris Packham <Chris.Packham@alliedtelesis.co.nz> wrote:
> Hi Markus,
>
> On 08/08/2025 03:16, markus.stockhausen@gmx.de wrote:
>> Hi,
>>
>> Chris (CC) developed the drivers/spi/spi-realtek-rtl-snand.c for the
>> Realtek switch platform. Thanks for that and the inclusion into mainline.
>> While adding it to one of my devices I'm getting ECC errors.
>>
>> Situation is as follows.
>>
>> - Linksys LGS328 (with RTL9301 SOC and that NAND controller)
>> - OpenWrt with Kernel 6.12 longterm
>> - The Realtek SPI NAND driver (backported from current master)
>> - Macronix MX35LF1GE4AB (1GBit)
>> - Boot via TFTP
>>
>> I found a vendor UBI partition in NAND that I want to analyze.
>> It is actively used, and the vendor firmware seems to work on it.
>> I assume it contains a filesystem with configuration and logs.
>> During ubiattach I get tons of errors "ubi0 warning: ubi_io_read:
>> Error -77 (ECC error) while reading 64 bytes from PEB 0:0, read
>> only 64 bytes, retry".
>>
>> Call stack shows:
>>
>> spinand_mtd_regular_page_read
>> spinand_read_page
>> spinand_load_page_op
>> spinand_wait -> sets status = STATUS_ECC_UNCOR_ERROR
>> nand_ecc_finish_io_req start
>> spinand_ondie_ecc_finish_io_req run
>> spinand_check_ecc_status start
>> macronix_ecc_get_status -> reads status & returns -EBADMSG
>>
>> Reading data from NAND directly I see this data layout for 2K data
>>
>> - 4x 512 bytes data
>> - 4x 6 bytes oob = 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
>> - 4x 10 bytes ECC
>>
>> A quick ECC calc for empty blocks says it must be BCH6. So now I have
>> several options but have no idea if I'm right or which to follow.
>>
>> 1. The NAND chip seems to have ECC built in. Ignored by vendor?
> As far as I understand the expectation in Linux was that all SPI-NAND
> chips have on-die ECC.
This was initially true, but a year ago (or so) I added support for
external engines, which allows software and external HW engines to be used.
>> 2. There is a hardware ECC controller -> Driver must be coded
> Yes there is an ECC controller in the RTL93xx chips but based on the
> comment above (and some pretty useless documentation) I elected not to
> attempt to use it.
It was not even possible to do differently at that time :)
>> 3. Maybe I must activate the software BCH driver
> Software BCH might be an alternative to using the ECC controller.
It is now, indeed. I haven't played much with software engines on SPI NANDs,
but it should work; check out the bindings.
>> 4. The old vendor firmware (Linux 4.x) uses other ECC logic.
> I think this is the crux of the problem. Realtek seems totally
> uninterested in upstreaming support for their chips (not sure how that's
> going to pan out with emerging requirements like RED and CRA) so it's
> left to people like you and me. In the meantime their SDK has made
> decisions that upstream doesn't know about, and when it comes to things
> like NAND ECC layouts this causes problems.
The ECC layout matters if you use jffs2, or if you disable the on-die
ECC engine and replace it with something else.
>> Does anyone have a good idea what to do first from here?
Any chance the data could be scrambled? (Just asking.) Be careful: the
current Macronix DT property to enable scrambling is an OTP bit. There
is a method with ->set_feature() which is volatile, but it is not yet
implemented upstream. So in case your vendor fw did enable it, it might
lead to errors that look like uncorrectable ECC errors.
> Probably depends. Blanking the NAND chip and reformatting it will
> resolve the errors from an upstream point of view. That's obviously not
> really going to be something you want to do if you expect to swap back
> and forth between the stock firmware and an upstream kernel.
>
> You'll probably want to convince the mtd code to allow the on-die ECC to
> be disabled and find whatever software BCH settings are needed that work
> with the stock firmware. Then we could maybe look at using the ECC
> controller to accelerate that.
Yes.
Good luck,
Miquèl
* Re: AW: NAND ECC errors
2025-08-07 21:32 ` Chris Packham
@ 2025-08-08 8:41 ` Miquel Raynal
2025-08-08 12:25 ` AW: " markus.stockhausen
0 siblings, 1 reply; 10+ messages in thread
From: Miquel Raynal @ 2025-08-08 8:41 UTC (permalink / raw)
To: Chris Packham
Cc: markus.stockhausen@gmx.de, vigneshr@ti.com, richard@nod.at,
tudor.ambarus@linaro.org, linux-mtd@lists.infradead.org
On 07/08/2025 at 21:32:37 GMT, Chris Packham <Chris.Packham@alliedtelesis.co.nz> wrote:
> On 08/08/2025 09:23, markus.stockhausen@gmx.de wrote:
>>> Does anyone have a good idea what to do first from here?
>>>
>>> Probably depends. Blanking the NAND chip and reformatting it will
>>> resolve the errors from an upstream point of view. That's obviously not
>>> really going to be something you want to do if you expect to swap back
>>> and forth between the stock firmware and an upstream kernel.
>>>
>>> You'll probably want to convince the mtd code to allow the on-die ECC to
>>> be disabled and find whatever software BCH settings are needed that work
>>> with the stock firmware. Then we could maybe look at using the ECC
>>> controller to accelerate that.
>> Good direction. So assumption is:
>> - vendor wrote BCH6 into OOB
>> - This design is also active in the vendor firmware (with Realtek ECC engine)
>> - now we read data with upstream kernel and on-die ECC
>> - Each read produces an on-die ECC error
>> - As soon as we write new data on-die ECC will be created
>>
>> But where is the ECC knob in between spi-realtek-nand, mtd and ubi?
> Yeah that's the part I don't know. I don't think there is an existing
> one for SPI-NAND (there is one for parallel NAND). I think it's a chip
> level thing so spi-realtek-nand will probably be ignorant of any such
> option (at least until the ECC controller comes into play). I also think
> UBI doesn't care who does the ECC as long as it is told when there
> is an error.
As explained in my other answer, you can now choose between ECC engines
when there are several. It all happens at the SPI NAND level though; MTD,
UBI, or filesystems are ignorant of it. They just expect (corrected)
data.
However, if your fw used to rely on an external or pipelined ECC
engine, you will have to write a driver for it! But be happy, all the
infrastructure needed for it is there now.
Reference implementations:
- drivers/mtd/nand/ecc-mxic.c
- drivers/mtd/nand/ecc-mtk.c
Thanks,
Miquèl
* AW: AW: NAND ECC errors
2025-08-08 8:41 ` Miquel Raynal
@ 2025-08-08 12:25 ` markus.stockhausen
2025-08-08 12:41 ` Miquel Raynal
0 siblings, 1 reply; 10+ messages in thread
From: markus.stockhausen @ 2025-08-08 12:25 UTC (permalink / raw)
To: 'Miquel Raynal'
Cc: vigneshr, richard, tudor.ambarus, linux-mtd,
'Chris Packham'
Hi Miquel,
> -----Original Message-----
> From: Miquel Raynal <miquel.raynal@bootlin.com>
> Sent: Friday, 8 August 2025 10:42
>
>
> As explained in my other answer, you can now choose between ECC engines
> when there are several. It all happens at the SPI NAND level though; MTD,
> UBI, or filesystems are ignorant of it. They just expect (corrected)
> data.
>
> However, if your fw used to rely on an external or pipelined ECC
> engine, you will have to write a driver for it! But be happy, all the
> infrastructure needed for it is there now.
>
> Reference implementations:
> - drivers/mtd/nand/ecc-mxic.c
> - drivers/mtd/nand/ecc-mtk.c
Thanks a lot for your detailed explanations. The missing bit was the
info that all is in place for spi nand. I just was focused on that
on-die code snippet. So I was able to configure all I need. The DTS
reads now:
flash@0 {
    compatible = "spi-nand";
    reg = <0>;

    nand-use-soft-ecc-engine;
    nand-ecc-algo = "bch";
    nand-ecc-placement = "oob";
    nand-ecc-strength = <6>;
    nand-ecc-step-size = <512>;
The software BCH engine is finally kicking in. Now there is a last
small bit that is somehow off. As explained I found this data
4*512 bytes data
4*6 bytes (set to 0xff 0xff 0xff 0xff 0xff 0xff)
4*10 bytes ECC.
While the bch6 code is automatically directed to the right ECC
data slot the calculated checksums are totally different. So
still unrecoverable errors. Looking deeper I noticed that the
original ECC was not only calculated over the 512 bytes data
but also over the 6*0xff extra bytes. These are simply appended.
So we have
DATA:
000 : 55 42 49 23 01 00 00 00 00 00 00 00 00 00 00 01 00 00 08 00 00 00 10 00 7b 6d c1 75 00 00 00 00
032 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 9f 21 e9 0b
064 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
096 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
128 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
160 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
192 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
224 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
256 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
288 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
320 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
352 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
384 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
416 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
448 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
480 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ECC read from flash: a8 5e 66 15 61 32 de 2b 8b 44
ECC BCH6 512 bytes : fa 7c bc 7b 44 c5 c7 b7 d8 6b
ECC BCH6 518 (0xff): a8 5e 66 15 61 32 de 2b 8b 44
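The three ECC lines above suggest that the vendor engine feeds 518 bytes per step into BCH6: the 512 data bytes followed by that step's 6 free OOB bytes. Below is a minimal sketch of that input assembly. The BCH encoder itself is left as a caller-supplied function; `bch6` is a placeholder parameter, not a real library call, and both function names are invented for illustration.

```python
from typing import Callable


def vendor_ecc_input(data: bytes, free_oob: bytes) -> bytes:
    """Build the message the vendor ECC appears to cover: data + free OOB bytes."""
    if len(data) != 512 or len(free_oob) != 6:
        raise ValueError("expected 512 data bytes and 6 free OOB bytes")
    return data + free_oob


def step_matches(data: bytes, free_oob: bytes, stored_ecc: bytes,
                 bch6: Callable[[bytes], bytes]) -> bool:
    """Check one step: recompute the ECC over 518 bytes and compare it."""
    return bch6(vendor_ecc_input(data, free_oob)) == stored_ecc
```

Passing in an encoder that covers only the 512 data bytes would correspond to the mismatching second ECC line, while one that covers all 518 bytes would correspond to the third.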
This is consistent for all blocks. I'll add some hacks to the
software BCH engine to verify it. Aside from the stupid
idea of coding a BCH "proxy" engine, what would be a proper
solution?
Markus
* Re: AW: AW: NAND ECC errors
2025-08-08 12:25 ` AW: " markus.stockhausen
@ 2025-08-08 12:41 ` Miquel Raynal
2025-08-08 13:21 ` AW: " markus.stockhausen
0 siblings, 1 reply; 10+ messages in thread
From: Miquel Raynal @ 2025-08-08 12:41 UTC (permalink / raw)
To: markus.stockhausen
Cc: vigneshr, richard, tudor.ambarus, linux-mtd,
'Chris Packham'
On 08/08/2025 at 14:25:18 +02, <markus.stockhausen@gmx.de> wrote:
> Hi Miquel,
>
>> -----Original Message-----
>> From: Miquel Raynal <miquel.raynal@bootlin.com>
>> Sent: Friday, 8 August 2025 10:42
>>
>>
>> As explained in my other answer, you can now choose between ECC engines
>> when there are several. It all happens at the SPI NAND level though; MTD,
>> UBI, or filesystems are ignorant of it. They just expect (corrected)
>> data.
>>
>> However, if your fw used to rely on an external or pipelined ECC
>> engine, you will have to write a driver for it! But be happy, all the
>> infrastructure needed for it is there now.
>>
>> Reference implementations:
>> - drivers/mtd/nand/ecc-mxic.c
>> - drivers/mtd/nand/ecc-mtk.c
>
> Thanks a lot for your detailed explanations. The missing bit was the
> info that all is in place for spi nand. I just was focused on that
> on-die code snippet. So I was able to configure all I need. The DTS
> reads now:
>
> flash@0 {
>     compatible = "spi-nand";
>     reg = <0>;
>
>     nand-use-soft-ecc-engine;
>     nand-ecc-algo = "bch";
>     nand-ecc-placement = "oob";
>     nand-ecc-strength = <6>;
>     nand-ecc-step-size = <512>;
>
> The software BCH engine is finally kicking in. Now there is a last
> small bit that is somehow off. As explained I found this data
>
> 4*512 bytes data
> 4*6 bytes (set to 0xff 0xff 0xff 0xff 0xff 0xff)
> 4*10 bytes ECC.
>
> While the bch6 code is automatically directed to the right ECC
> data slot the calculated checksums are totally different. So
> still unrecoverable errors. Looking deeper I noticed that the
> original ECC was not only calculated over the 512 bytes data
> but also over the 6*0xff extra bytes. These are simply appended.
>
> So we have
>
> DATA:
> 000 : 55 42 49 23 01 00 00 00 00 00 00 00 00 00 00 01 00 00 08 00 00 00 10 00 7b 6d c1 75 00 00 00 00
> 032 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 9f 21 e9 0b
> 064 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 096 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 128 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 160 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 192 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 224 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 256 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 288 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 320 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 352 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 384 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 416 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 448 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 480 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>
> ECC read from flash: a8 5e 66 15 61 32 de 2b 8b 44
> ECC BCH6 512 bytes : fa 7c bc 7b 44 c5 c7 b7 d8 6b
> ECC BCH6 518 (0xff): a8 5e 66 15 61 32 de 2b 8b 44
>
> This is consistent for all blocks. I'll add some hacks to the
> software BCH engine to verify it. Aside from the stupid
> idea of coding a BCH "proxy" engine, what would be a proper
> solution?
Just to fully understand, you are trying to replicate what the software
BCH engine of your Realtek SPI controller does?
If so, why not just configure the engine directly and use it? It will
be much faster and easier to do.
Otherwise, if you're trying to do in software what the hardware does,
you must be prepared for more maths:
https://bootlin.com/blog/supporting-a-misbehaving-nand-ecc-engine/
Cheers,
Miquèl
* AW: AW: AW: NAND ECC errors
2025-08-08 12:41 ` Miquel Raynal
@ 2025-08-08 13:21 ` markus.stockhausen
2025-08-08 13:31 ` Miquel Raynal
0 siblings, 1 reply; 10+ messages in thread
From: markus.stockhausen @ 2025-08-08 13:21 UTC (permalink / raw)
To: 'Miquel Raynal'
Cc: vigneshr, richard, tudor.ambarus, linux-mtd,
'Chris Packham'
> From: linux-mtd <linux-mtd-bounces@lists.infradead.org> On behalf of Miquel Raynal
> Sent: Friday, 8 August 2025 14:42
>
> On 08/08/2025 at 14:25:18 +02, <markus.stockhausen@gmx.de> wrote:
> Just to fully understand, you are trying to replicate what the software
> BCH engine of your Realtek SPI controller does?
>
> If so, why not just configure the engine directly and use it? It will
> be much faster and easier to do.
>
> Otherwise, if you're trying to do in software what the hardware does,
> you must be prepared for more maths:
> https://bootlin.com/blog/supporting-a-misbehaving-nand-ecc-engine/
The idea was to start with something I have fully understood. I wanted to
leave the hardware for later, as ECC is a totally new area for me (as NAND
was last week). I had a short look at the hardware engine GPL
code. The most basic parts look straightforward, with three functions:
- Prepare data to be encoded
- Prepare data to be decoded
- Kick engine
ecc-mtk seems tightly integrated into the NAND controller while
ecc-mxic seems to be some passthrough/pipeline integration.
What would be your advice for the simplest enhancement
for our existing Realtek spi-mem controller?
Markus
* Re: AW: AW: AW: NAND ECC errors
2025-08-08 13:21 ` AW: " markus.stockhausen
@ 2025-08-08 13:31 ` Miquel Raynal
0 siblings, 0 replies; 10+ messages in thread
From: Miquel Raynal @ 2025-08-08 13:31 UTC (permalink / raw)
To: markus.stockhausen
Cc: vigneshr, richard, tudor.ambarus, linux-mtd,
'Chris Packham'
On 08/08/2025 at 15:21:40 +02, <markus.stockhausen@gmx.de> wrote:
>> From: linux-mtd <linux-mtd-bounces@lists.infradead.org> On behalf of Miquel Raynal
>> Sent: Friday, 8 August 2025 14:42
>>
>> On 08/08/2025 at 14:25:18 +02, <markus.stockhausen@gmx.de> wrote:
>> Just to fully understand, you are trying to replicate what the software
>> BCH engine of your Realtek SPI controller does?
>>
>> If so, why not just configure the engine directly and use it? It will
>> be much faster and easier to do.
>>
>> Otherwise, if you're trying to do in software what the hardware does,
>> you must be prepared for more maths:
>> https://bootlin.com/blog/supporting-a-misbehaving-nand-ecc-engine/
>
> The idea was to start with something I have fully understood. I wanted to
> leave the hardware for later, as ECC is a totally new area for me (as NAND
> was last week). I had a short look at the hardware engine GPL
> code. The most basic parts look straightforward, with three functions:
>
> - Prepare data to be encoded
> - Prepare data to be decoded
> - Kick engine
>
> ecc-mtk seems tightly integrated into the NAND controller while
> ecc-mxic seems to be some passthrough/pipeline integration.
>
> What would be your advice for the simplest enhancement
> for our existing Realtek spi-mem controller?
From my current understanding, implementing the external approach (see
the Macronix external ECC engine case) should be the easiest path forward.
If the engine is tightly integrated into the controller (I have no idea),
then use the pipelined example or MTK's example.
You need to implement very few callbacks to get it working, and
except for the init, which might be a bit painful, they will all be simple
(like set/clear a bit, return the value of a register...).
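As a rough illustration of how small those callbacks are, here is a toy Python model of the two per-request hooks an ECC engine provides in the kernel (`prepare_io_req` and `finish_io_req`). Everything in it is an assumption made for illustration: the class and method names are invented, and CRC32 stands in for BCH, so it can only detect corruption, not correct it.

```python
import zlib


class UncorrectableEccError(Exception):
    """Raised where a kernel engine would return -EBADMSG."""


class ToyEccEngine:
    """Toy stand-in for an external ECC engine (CRC32 instead of BCH)."""

    CODE_SIZE = 4  # bytes of check data stored in the OOB area per step

    def prepare_write(self, data: bytes) -> bytes:
        """Write path: compute the check bytes that go into the OOB area."""
        return zlib.crc32(data).to_bytes(self.CODE_SIZE, "big")

    def finish_read(self, data: bytes, stored_code: bytes) -> int:
        """Read path: verify one step and return the number of corrected bitflips.

        CRC32 cannot correct anything, so this returns 0 for clean data and
        raises for any mismatch.
        """
        if self.prepare_write(data) != stored_code:
            raise UncorrectableEccError("check bytes do not match data")
        return 0
```

A real engine driver would replace the CRC with register accesses to the hardware and would report the number of corrected bitflips instead of always returning 0.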
Good luck!
Miquèl
Thread overview: 10+ messages in thread
2025-08-07 15:16 NAND ECC errors markus.stockhausen
2025-08-07 20:36 ` Chris Packham
2025-08-07 21:23 ` AW: " markus.stockhausen
2025-08-07 21:32 ` Chris Packham
2025-08-08 8:41 ` Miquel Raynal
2025-08-08 12:25 ` AW: " markus.stockhausen
2025-08-08 12:41 ` Miquel Raynal
2025-08-08 13:21 ` AW: " markus.stockhausen
2025-08-08 13:31 ` Miquel Raynal
2025-08-08 8:37 ` Miquel Raynal