* [PATCH RFC] spi: spi-qpic-snand: overestimate corrected bitflips
@ 2025-05-22 17:33 Gabor Juhos
2025-05-23 14:39 ` Miquel Raynal
0 siblings, 1 reply; 3+ messages in thread
From: Gabor Juhos @ 2025-05-22 17:33 UTC
To: Mark Brown, Miquel Raynal
Cc: Md Sadre Alam, Varadarajan Narayanan, Sricharan Ramabadhran,
linux-spi, linux-mtd, linux-arm-msm, linux-kernel, Gabor Juhos
The QPIC hardware cannot report the exact number of corrected bitflips;
it only reports the number of corrected bytes. However, the current code
treats that value as the number of corrected bitflips, which is quite
inaccurate in most cases. For example, even if the hardware reports only
one byte as corrected, that byte may have contained anywhere from one
bit error up to the maximum number of correctable bits.
Change the code to report the maximum number of bits that may have been
corrected, so that upper layers can act before the data is lost to
uncorrectable errors.
Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
---
The patch tries to address Miquel's concerns [1] about the corrected bit
error reporting capabilities of the QPIC hardware.
[1] https://lore.kernel.org/all/87h61e8kow.fsf@bootlin.com
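To make the ambiguity described in the commit message concrete, here is a small userspace sketch (not driver code). It compares the number of differing bytes with the number of differing bits between a buffer as written and the same buffer as read back. The helper names count_corrected_bytes() and count_corrected_bits() are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes that differ between two buffers. */
static unsigned int count_corrected_bytes(const uint8_t *good,
					  const uint8_t *bad, size_t len)
{
	unsigned int n = 0;

	for (size_t i = 0; i < len; i++)
		if (good[i] != bad[i])
			n++;
	return n;
}

/* Hypothetical helper: number of bits that differ between two buffers. */
static unsigned int count_corrected_bits(const uint8_t *good,
					 const uint8_t *bad, size_t len)
{
	unsigned int n = 0;

	for (size_t i = 0; i < len; i++) {
		uint8_t diff = good[i] ^ bad[i];

		/* Clear the lowest set bit until none remain. */
		while (diff) {
			diff = (uint8_t)(diff & (diff - 1));
			n++;
		}
	}
	return n;
}
```

With a written buffer of { 0xff, 0xff, 0xff, 0xff } and a read-back of { 0xf0, 0xff, 0xff, 0xff }, the byte count is 1 while the bit count is 4, so a byte count alone can understate the number of flipped bits.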
nandbiterrs test result after the change:
# insmod mtd_test; insmod mtd_nandbiterrs dev=4
[ 40.685251]
[ 40.685278] ==================================================
[ 40.685803] mtd_nandbiterrs: MTD device: 4
[ 40.691566] mtd_nandbiterrs: MTD device size 7602176, eraseblock=131072, page=2048, oob=128
[ 40.695528] mtd_nandbiterrs: Device uses 1 subpages of 2048 bytes
[ 40.703789] mtd_nandbiterrs: Using page=0, offset=0, eraseblock=0
[ 40.713655] mtd_nandbiterrs: incremental biterrors test
[ 40.716118] mtd_nandbiterrs: write_page
[ 40.722126] mtd_nandbiterrs: rewrite page
[ 40.725878] mtd_nandbiterrs: read_page
[ 41.103196] mtd_nandbiterrs: verify_page
[ 41.106915] mtd_nandbiterrs: Successfully corrected 0 bit errors per subpage
[ 41.111001] mtd_nandbiterrs: Inserted biterror @ 1/7
[ 41.118007] mtd_nandbiterrs: rewrite page
[ 41.123850] mtd_nandbiterrs: read_page
[ 41.127458] qcom_snand 79b0000.spi: the number of corrected bits may be inaccurate
[ 41.130538] mtd_nandbiterrs: Read reported 4 corrected bit errors
[ 41.138060] mtd_nandbiterrs: verify_page
[ 41.144265] mtd_nandbiterrs: Successfully corrected 1 bit errors per subpage
[ 41.148217] mtd_nandbiterrs: Inserted biterror @ 3/7
[ 41.155260] mtd_nandbiterrs: rewrite page
[ 41.161076] mtd_nandbiterrs: read_page
[ 41.164687] qcom_snand 79b0000.spi: the number of corrected bits may be inaccurate
[ 41.167750] mtd_nandbiterrs: Read reported 4 corrected bit errors
[ 41.175324] mtd_nandbiterrs: verify_page
[ 41.181511] mtd_nandbiterrs: Successfully corrected 2 bit errors per subpage
[ 41.185456] mtd_nandbiterrs: Inserted biterror @ 5/7
[ 41.192516] mtd_nandbiterrs: rewrite page
[ 41.198301] mtd_nandbiterrs: read_page
[ 41.201949] qcom_snand 79b0000.spi: the number of corrected bits may be inaccurate
[ 41.204990] mtd_nandbiterrs: Read reported 4 corrected bit errors
[ 41.212568] mtd_nandbiterrs: verify_page
[ 41.218720] mtd_nandbiterrs: Successfully corrected 3 bit errors per subpage
[ 41.222714] mtd_nandbiterrs: Inserted biterror @ 7/7
[ 41.229739] mtd_nandbiterrs: rewrite page
[ 41.235548] mtd_nandbiterrs: read_page
[ 41.239168] mtd_nandbiterrs: Read reported 4 corrected bit errors
[ 41.242252] mtd_nandbiterrs: verify_page
[ 41.248407] mtd_nandbiterrs: Successfully corrected 4 bit errors per subpage
[ 41.252406] mtd_nandbiterrs: Inserted biterror @ 8/7
[ 41.259413] mtd_nandbiterrs: rewrite page
[ 41.265238] mtd_nandbiterrs: read_page
[ 41.268858] mtd_nandbiterrs: error: read failed at 0x0
[ 41.271937] mtd_nandbiterrs: After 5 biterrors per subpage, read reported error -74
[ 41.280512] mtd_nandbiterrs: finished successfully.
[ 41.284587] ==================================================
failed to insert /lib/modules/6.15.0-rc5+/mtd_nandbiterrs.ko
#
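The "Read reported" values in the log above follow from the adjustment made by the patch in this message. A userspace sketch of that rule is shown here; overestimate_bitflips() is a hypothetical name, and the strength parameter stands in for ecc_cfg->strength, which appears to be 4 on the test device judging from the log.

```c
/*
 * Hypothetical userspace model of the rule in the patch: any non-zero
 * count reported by the hardware that differs from the ECC strength is
 * replaced by the ECC strength. Zero is passed through unchanged.
 */
static unsigned int overestimate_bitflips(unsigned int stat,
					  unsigned int strength)
{
	if (stat && stat != strength)
		stat = strength;
	return stat;
}
```

Under this rule, a hardware count of 1, 2 or 3 is reported as 4 when the strength is 4, while 0 stays 0 and 4 stays 4. The dev_warn() in the patch is emitted only on the path that changes the value.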
---
drivers/spi/spi-qpic-snand.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/drivers/spi/spi-qpic-snand.c b/drivers/spi/spi-qpic-snand.c
index 17bcd12f6cb932fdf2d968692359a0301ce4acdc..137d60081754db04ceb5db64e6f250f6477021db 100644
--- a/drivers/spi/spi-qpic-snand.c
+++ b/drivers/spi/spi-qpic-snand.c
@@ -638,6 +638,21 @@ static int qcom_spi_check_error(struct qcom_nand_controller *snandc)
 			unsigned int stat;
 
 			stat = buffer & BS_CORRECTABLE_ERR_MSK;
+
+			if (stat && stat != ecc_cfg->strength) {
+				/*
+				 * The exact number of the corrected bits is
+				 * unknown because the hardware only reports
+				 * the number of the corrected bytes.
+				 *
+				 * Assume the worst case scenario and use
+				 * the maximum.
+				 */
+				dev_warn(snandc->dev,
+					 "the number of corrected bits may be inaccurate\n");
+				stat = ecc_cfg->strength;
+			}
+
 			snandc->qspi->ecc_stats.corrected += stat;
 			max_bitflips = max(max_bitflips, stat);
 		}
---
base-commit: e7f3d11567c2c79c4342791ba91c500b434ce147
change-id: 20250521-qpic-snand-overestimate-bitflips-64112df41c38
Best regards,
--
Gabor Juhos <j4g8y7@gmail.com>
______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/
* Re: [PATCH RFC] spi: spi-qpic-snand: overestimate corrected bitflips
From: Miquel Raynal @ 2025-05-23 14:39 UTC
To: Gabor Juhos
Cc: Mark Brown, Md Sadre Alam, Varadarajan Narayanan,
Sricharan Ramabadhran, linux-spi, linux-mtd, linux-arm-msm,
linux-kernel
On 22/05/2025 at 19:33:26 +02, Gabor Juhos <j4g8y7@gmail.com> wrote:
> The QPIC hardware cannot report the exact number of corrected bitflips;
> it only reports the number of corrected bytes. However, the current code
> treats that value as the number of corrected bitflips, which is quite
> inaccurate in most cases. For example, even if the hardware reports only
> one byte as corrected, that byte may have contained anywhere from one
> bit error up to the maximum number of correctable bits.
>
> Change the code to report the maximum number of bits that may have been
> corrected, so that upper layers can act before the data is lost to
> uncorrectable errors.
>
> Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
> ---
> The patch tries to address Miquel's concerns [1] about the corrected bit
> error reporting capabilities of the QPIC hardware.
>
> [1] https://lore.kernel.org/all/87h61e8kow.fsf@bootlin.com
Thank you very much for attempting to improve the situation. On a second
look, this will not work either and will be even worse, forcing wear
levelling after each read. So let's not change the returned value;
hopefully real life differs from the test case, and most bitflips will
be spread out rather than concentrated in a single byte. However, I'd
welcome either a pr_warn_once() or at least a comment somewhere about
this.
Thanks,
Miquèl
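The wear-levelling concern above can be sketched in userspace. The MTD core returns -EUCLEAN from a read once the reported maximum number of bitflips reaches the device's bitflip threshold, and upper layers such as UBI typically react by scrubbing the block. The sketch assumes the threshold is three quarters of the ECC strength, rounded up; whether this device uses that value is an assumption, and both function names are hypothetical.

```c
#include <errno.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* Linux value; defined here only if the libc lacks it */
#endif

/* Assumed default: three quarters of the ECC strength, rounded up. */
static unsigned int default_bitflip_threshold(unsigned int ecc_strength)
{
	return (ecc_strength * 3 + 3) / 4;
}

/* Hypothetical model of the threshold check applied to a read result. */
static int read_result(unsigned int max_bitflips, unsigned int threshold)
{
	return max_bitflips >= threshold ? -EUCLEAN : 0;
}
```

With a strength of 4 the assumed threshold is 3. A read with a single corrected byte reported as 1 stays below the threshold, but the same read reported as 4 reaches it, so every read with any correction would ask the upper layers to move the data.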
* Re: [PATCH RFC] spi: spi-qpic-snand: overestimate corrected bitflips
From: Gabor Juhos @ 2025-05-26 19:00 UTC
To: Miquel Raynal
Cc: Mark Brown, Md Sadre Alam, Varadarajan Narayanan,
Sricharan Ramabadhran, linux-spi, linux-mtd, linux-arm-msm,
linux-kernel
On 2025-05-23 at 16:39, Miquel Raynal wrote:
> On 22/05/2025 at 19:33:26 +02, Gabor Juhos <j4g8y7@gmail.com> wrote:
>
>> The QPIC hardware cannot report the exact number of corrected bitflips;
>> it only reports the number of corrected bytes. However, the current code
>> treats that value as the number of corrected bitflips, which is quite
>> inaccurate in most cases. For example, even if the hardware reports only
>> one byte as corrected, that byte may have contained anywhere from one
>> bit error up to the maximum number of correctable bits.
>>
>> Change the code to report the maximum number of bits that may have been
>> corrected, so that upper layers can act before the data is lost to
>> uncorrectable errors.
>>
>> Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
>> ---
>> The patch tries to address Miquel's concerns [1] about the corrected bit
>> error reporting capabilities of the QPIC hardware.
>>
>> [1] https://lore.kernel.org/all/87h61e8kow.fsf@bootlin.com
>
> Thank you very much for attempting to improve the situation. On a second
> look, this will not work either and will be even worse, forcing wear
> levelling after each read. So let's not change the returned value;
> hopefully real life differs from the test case, and most bitflips will
> be spread out rather than concentrated in a single byte. However, I'd
> welcome either a pr_warn_once() or at least a comment somewhere about
> this.
Ok, I will rework the patch. If it turns out that the current approach behaves
badly in real life, we can still change it later.
Thanks,
Gabor