* [PATCH 0/2] spi: spi-qpic-snand: avoid memory corruption
@ 2025-05-25 17:05 Gabor Juhos
2025-05-25 17:05 ` [PATCH 1/2] spi: spi-qpic-snand: reallocate BAM transactions Gabor Juhos
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Gabor Juhos @ 2025-05-25 17:05 UTC (permalink / raw)
To: Mark Brown, Md Sadre Alam, Varadarajan Narayanan,
Sricharan Ramabadhran, Miquel Raynal, Richard Weinberger,
Vignesh Raghavendra
Cc: linux-spi, linux-mtd, linux-arm-msm, linux-kernel, Gabor Juhos
The 'spi-qpic-nand' driver may cause memory corruption under some
circumstances. The first patch in the series changes the driver to
avoid that, whereas the second adds some sanity checks to the common
QPIC code in order to make detecting such errors easier in the future.
Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
---
Gabor Juhos (2):
spi: spi-qpic-snand: reallocate BAM transactions
mtd: nand: qpic_common: prevent out of bounds access of BAM arrays
drivers/mtd/nand/qpic_common.c | 28 ++++++++++++++++++++++++----
drivers/spi/spi-qpic-snand.c | 16 ++++++++++++++++
include/linux/mtd/nand-qpic-common.h | 8 ++++++++
3 files changed, 48 insertions(+), 4 deletions(-)
---
base-commit: b00d6864a4c948529dc6ddd2df76bf175bf27c63
change-id: 20250523-qpic-snand-avoid-mem-corruption-301afabeb0eb
Best regards,
--
Gabor Juhos <j4g8y7@gmail.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/2] spi: spi-qpic-snand: reallocate BAM transactions
2025-05-25 17:05 [PATCH 0/2] spi: spi-qpic-snand: avoid memory corruption Gabor Juhos
@ 2025-05-25 17:05 ` Gabor Juhos
2025-05-26 5:56 ` Md Sadre Alam
2025-05-27 11:28 ` Mark Brown
2025-05-25 17:05 ` [PATCH 2/2] mtd: nand: qpic_common: prevent out of bounds access of BAM arrays Gabor Juhos
2025-05-26 8:13 ` [PATCH 0/2] spi: spi-qpic-snand: avoid memory corruption Miquel Raynal
2 siblings, 2 replies; 12+ messages in thread
From: Gabor Juhos @ 2025-05-25 17:05 UTC (permalink / raw)
To: Mark Brown, Md Sadre Alam, Varadarajan Narayanan,
Sricharan Ramabadhran, Miquel Raynal, Richard Weinberger,
Vignesh Raghavendra
Cc: linux-spi, linux-mtd, linux-arm-msm, linux-kernel, Gabor Juhos
Using the mtd_nandbiterrs module for testing the driver occasionally
results in weird things like below.
1. swiotlb mapping fails with the following message:
[ 85.926216] qcom_snand 79b0000.spi: swiotlb buffer is full (sz: 4294967294 bytes), total 512 (slots), used 0 (slots)
[ 85.932937] qcom_snand 79b0000.spi: failure in mapping desc
[ 87.999314] qcom_snand 79b0000.spi: failure to write raw page
[ 87.999352] mtd_nandbiterrs: error: write_oob failed (-110)
Rebooting the board after that causes a panic:
# reboot
[ 877.178844] Unable to handle kernel read from unreadable memory at virtual address 0000000000000078
[ 877.178913] Mem abort info:
[ 877.186767] ESR = 0x0000000096000005
[ 877.189508] EC = 0x25: DABT (current EL), IL = 32 bits
[ 877.193312] SET = 0, FnV = 0
[ 877.198780] EA = 0, S1PTW = 0
[ 877.201676] FSC = 0x05: level 1 translation fault
[ 877.204684] Data abort info:
[ 877.209559] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[ 877.212669] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 877.217964] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 877.223098] user pgtable: 4k pages, 39-bit VAs, pgdp=00000000446ae000
[ 877.228470] [0000000000000078] pgd=080000004492d403, p4d=080000004492d403, pud=080000004492d403, pmd=0000000000000000
[ 877.234944] Internal error: Oops: 0000000096000005 [#1] SMP
[ 877.245395] Modules linked in: mtd_test pppoe ppp_async nft_fib_inet nf_flow_table_inet pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack slhc sfp qrtr_smd nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 mdio_i2c crc_ccitt evdev gpio_fan crypto_user algif_skcipher algif_rng algif_hash algif_aead af_alg sha512_generic seqiv geniv drbg hmac arc4 libarc4 gpio_keys usb_storage input_core leds_gpio xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_qcom phy_qcom_qusb2 phy_qcom_qmp_usb phy_qcom_m31 aquantia hwmon crc_itu_t crc32c_generic
[ 877.297039] CPU: 1 UID: 0 PID: 2060 Comm: reboot Not tainted 6.15.0-rc4+ #0 NONE
[ 877.319267] Hardware name: Qualcomm Technologies, Inc. IPQ9574/AP-AL02-C7 (DT)
[ 877.326820] pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 877.333936] pc : wakeup_flusher_threads+0x98/0x1d0
[ 877.340792] lr : wakeup_flusher_threads+0x128/0x1d0
[ 877.345653] sp : ffffffc082b23db0
[ 877.350426] x29: ffffffc082b23db0 x28: ffffff8003dd4000 x27: 0000000000000000
[ 877.353902] x26: 0000000000000000 x25: 0000000000000000 x24: ffffffc081c89b08
[ 877.361021] x23: ffffffc081cafc78 x22: 0000000000000020 x21: 0000000000000002
[ 877.368138] x20: 0000000000000001 x19: ffffffc08032c000 x18: 0000000000000000
[ 877.375257] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 877.382374] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 877.389492] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffffc08032c36c
[ 877.396610] x8 : 0000000000000001 x7 : ffffffc080327318 x6 : 0000000000000000
[ 877.403728] x5 : 0000000000000000 x4 : 000000000002000f x3 : 0000000000000000
[ 877.410847] x2 : 0000000000000001 x1 : ffffff8003dd4000 x0 : 0000000000000040
[ 877.417965] Call trace:
[ 877.425078] wakeup_flusher_threads+0x98/0x1d0 (P)
[ 877.427337] ksys_sync+0x2c/0xa0
[ 877.432197] __arm64_sys_sync+0x18/0x30
[ 877.435582] invoke_syscall.constprop.0+0x64/0x110
[ 877.439143] do_el0_svc+0x48/0xd8
[ 877.444003] el0_svc+0x3c/0xf8
[ 877.447388] el0t_64_sync_handler+0x10c/0x138
[ 877.450341] el0t_64_sync+0x180/0x188
[ 877.454770] Code: d1008016 eb17001f 540002c0 a90153f3 (f9402ec0)
[ 877.458416] ---[ end trace 0000000000000000 ]---
[ 877.464492] Kernel panic - not syncing: Oops: Fatal exception
[ 877.469180] SMP: stopping secondary CPUs
[ 877.474825] Kernel Offset: disabled
[ 877.478812] CPU features: 0x0000,00000068,00000000,0200400b
[ 877.482025] Memory Limit: none
[ 877.487581] Rebooting in 3 seconds..
2. If the swiotlb mapping does not fail, rebooting the board may result
in a different panic:
# reboot
[ 256.104459] BUG: spinlock bad magic on CPU#3, procd/2241
[ 256.104488] Unable to handle kernel paging request at virtual address ffffffff0000049b
[ 256.108827] Mem abort info:
[ 256.116548] ESR = 0x0000000096000005
[ 256.119240] EC = 0x25: DABT (current EL), IL = 32 bits
[ 256.123060] SET = 0, FnV = 0
[ 256.128528] EA = 0, S1PTW = 0
[ 256.131391] FSC = 0x05: level 1 translation fault
[ 256.134431] Data abort info:
[ 256.139291] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[ 256.142419] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 256.147713] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 256.152836] swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000041e5b000
[ 256.158218] [ffffffff0000049b] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[ 256.164906] Internal error: Oops: 0000000096000005 [#1] SMP
[ 256.173323] Modules linked in: mtd_test pppoe ppp_async nft_fib_inet nf_flow_table_inet pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack slhc sfp qrtr_smd nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 mdio_i2c crc_ccitt evdev gpio_fan crypto_user algif_skcipher algif_rng algif_hash algif_aead af_alg sha512_generic seqiv geniv drbg hmac arc4 libarc4 gpio_keys usb_storage input_core leds_gpio xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_qcom phy_qcom_qusb2 phy_qcom_qmp_usb phy_qcom_m31 aquantia hwmon crc_itu_t crc32c_generic
[ 256.225139] CPU: 3 UID: 0 PID: 2241 Comm: procd Not tainted 6.15.0-rc4+ #0 NONE
[ 256.247369] Hardware name: Qualcomm Technologies, Inc. IPQ9574/AP-AL02-C7 (DT)
[ 256.254921] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 256.261951] pc : spin_bug+0x70/0xd8
[ 256.268806] lr : spin_bug+0x68/0xd8
[ 256.272278] sp : ffffffc083763b00
[ 256.275750] x29: ffffffc083763b00 x28: ffffff80005b8000 x27: ffffffc0827f4000
[ 256.279225] x26: ffffffc080dab000 x25: ffffffc083763c18 x24: ffffff80005b8000
[ 256.286344] x23: 0000000000000002 x22: ffffffc081c00c00 x21: ffffffff0000001b
[ 256.293462] x20: ffffffc080d44c80 x19: ffffff8004d52888 x18: 0000000000000005
[ 256.300580] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000006
[ 256.307698] x14: 0000000000000000 x13: 313432322f64636f x12: ffffffc081c707b8
[ 256.314816] x11: 0000000000000001 x10: 0000000000000001 x9 : ffffffc0800cdfd0
[ 256.321935] x8 : c0000000ffffefff x7 : ffffffc081c186a8 x6 : 0000000000057fa8
[ 256.329052] x5 : ffffffc081c70760 x4 : 0000000000000000 x3 : ffffffc0837638e0
[ 256.336170] x2 : 000000000000001b x1 : ffffff80005b8000 x0 : 000000000000002c
[ 256.343289] Call trace:
[ 256.350402] spin_bug+0x70/0xd8 (P)
[ 256.352660] do_raw_spin_lock+0xa4/0x128
[ 256.356132] _raw_spin_lock_irqsave+0x70/0x98
[ 256.360299] __mutex_lock+0x268/0xdf0
[ 256.364552] mutex_lock_nested+0x2c/0x40
[ 256.368199] device_shutdown+0xfc/0x260
[ 256.372191] kernel_restart+0x4c/0xb8
[ 256.375750] __do_sys_reboot+0x108/0x230
[ 256.379570] __arm64_sys_reboot+0x2c/0x40
[ 256.383563] invoke_syscall.constprop.0+0x64/0x110
[ 256.387470] do_el0_svc+0x48/0xd8
[ 256.392156] el0_svc+0x3c/0xf8
[ 256.395541] el0t_64_sync_handler+0x10c/0x138
[ 256.398494] el0t_64_sync+0x180/0x188
[ 256.402923] Code: 54000220 940009a5 b9400662 b4000295 (b94482a4)
[ 256.406570] ---[ end trace 0000000000000000 ]---
[ 256.412645] Kernel panic - not syncing: Oops: Fatal exception
[ 256.417340] Kernel Offset: disabled
[ 256.422972] CPU features: 0x0000,00000068,00000000,0200400b
[ 256.426273] Memory Limit: none
[ 256.431828] Rebooting in 3 seconds..
Investigating the issue revealed that these symptoms are results of
memory corruption which is caused by out of bounds access within the
driver.
The driver uses a dynamically allocated structure for BAM transactions,
which structure must have enough space for all possible variations of
different flash operations initiated by the driver. The required space
heavily depends on the actual number of 'codewords' which is calculated
from the pagesize of the actual NAND chip.
Although the qcom_nandc_alloc() function allocates memory for the BAM
transactions during probe, but since the actual number of 'codewords'
is not yet know the allocation is done for one 'codeword' only.
Because of this, whenever the driver does a flash operation, and the
number of the required transactions exceeds the size of the allocated
arrays the driver accesses memory out of the allocated range.
To avoid this, change the code to free the initially allocated BAM
transactions memory, and allocate a new one once the actual number of
'codewords' required for a given NAND chip is known.
Fixes: 7304d1909080 ("spi: spi-qpic: add driver for QCOM SPI NAND flash Interface")
Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
---
drivers/spi/spi-qpic-snand.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/drivers/spi/spi-qpic-snand.c b/drivers/spi/spi-qpic-snand.c
index fd129650434f0129e24d3bdac7e7c4d5542627e6..c552cb7aa80e368e193d71e1826b2cc005571a9c 100644
--- a/drivers/spi/spi-qpic-snand.c
+++ b/drivers/spi/spi-qpic-snand.c
@@ -315,6 +315,22 @@ static int qcom_spi_ecc_init_ctx_pipelined(struct nand_device *nand)
mtd_set_ooblayout(mtd, &qcom_spi_ooblayout);
+ /*
+ * Free the temporary BAM transaction allocated initially by
+ * qcom_nandc_alloc(), and allocate a new one based on the
+ * updated max_cwperpage value.
+ */
+ qcom_free_bam_transaction(snandc);
+
+ snandc->max_cwperpage = cwperpage;
+
+ snandc->bam_txn = qcom_alloc_bam_transaction(snandc);
+ if (!snandc->bam_txn) {
+ dev_err(snandc->dev, "failed to allocate BAM transaction\n");
+ ret = -ENOMEM;
+ goto err_free_ecc_cfg;
+ }
+
ecc_cfg->cfg0 = FIELD_PREP(CW_PER_PAGE_MASK, (cwperpage - 1)) |
FIELD_PREP(UD_SIZE_BYTES_MASK, ecc_cfg->cw_data) |
FIELD_PREP(DISABLE_STATUS_AFTER_WRITE, 1) |
--
2.49.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/2] mtd: nand: qpic_common: prevent out of bounds access of BAM arrays
2025-05-25 17:05 [PATCH 0/2] spi: spi-qpic-snand: avoid memory corruption Gabor Juhos
2025-05-25 17:05 ` [PATCH 1/2] spi: spi-qpic-snand: reallocate BAM transactions Gabor Juhos
@ 2025-05-25 17:05 ` Gabor Juhos
2025-05-26 6:53 ` Md Sadre Alam
2025-05-26 9:27 ` Miquel Raynal
2025-05-26 8:13 ` [PATCH 0/2] spi: spi-qpic-snand: avoid memory corruption Miquel Raynal
2 siblings, 2 replies; 12+ messages in thread
From: Gabor Juhos @ 2025-05-25 17:05 UTC (permalink / raw)
To: Mark Brown, Md Sadre Alam, Varadarajan Narayanan,
Sricharan Ramabadhran, Miquel Raynal, Richard Weinberger,
Vignesh Raghavendra
Cc: linux-spi, linux-mtd, linux-arm-msm, linux-kernel, Gabor Juhos
The common QPIC code does not do any boundary checking when it handles
the command elements and scatter gater list arrays of a BAM transaction,
thus it allows to access out of bounds elements in those.
Although it is the responsibility of the given driver to allocate enough
space for all possible BAM transaction variations, however there can be
mistakes in the driver code which can lead to hidden memory corruption
issues which are hard to debug.
This kind of problem has been observed during testing the 'spi-qpic-snand'
driver. Although the driver has been fixed with a preceding patch, but it
still makes sense to reduce the chance of having such errors again later.
In order to prevent such errors, change the qcom_alloc_bam_transaction()
function to store the number of elements of the arrays in the
'bam_transaction' strucutre during allocation. Also, add sanity checks to
the qcom_prep_bam_dma_desc_{cmd,data}() functions to avoid using out of
bounds indices for the arrays.
Tested with the 'spi-qpic-snand' driver only.
Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
---
Preferably, this should go in via the SPI tree along with the previous
patch. It is not a strict requirement though, in the case it gets
included separately through the mtd tree it reveals the bug fixed in
the first patch.
---
drivers/mtd/nand/qpic_common.c | 28 ++++++++++++++++++++++++----
include/linux/mtd/nand-qpic-common.h | 8 ++++++++
2 files changed, 32 insertions(+), 4 deletions(-)
diff --git a/drivers/mtd/nand/qpic_common.c b/drivers/mtd/nand/qpic_common.c
index e0ed25b5afea9b289b767cd3d9c2d7572ed52008..fb1f81e4bdacaa3e81660a20e164926c64633513 100644
--- a/drivers/mtd/nand/qpic_common.c
+++ b/drivers/mtd/nand/qpic_common.c
@@ -15,6 +15,13 @@
#include <linux/slab.h>
#include <linux/mtd/nand-qpic-common.h>
+static inline int qcom_err_bam_array_full(struct qcom_nand_controller *nandc,
+ const char *name)
+{
+ dev_err(nandc->dev, "BAM %s array is full\n", name);
+ return -EINVAL;
+}
+
/**
* qcom_free_bam_transaction() - Frees the BAM transaction memory
* @nandc: qpic nand controller
@@ -57,14 +64,15 @@ qcom_alloc_bam_transaction(struct qcom_nand_controller *nandc)
bam_txn_buf += sizeof(*bam_txn);
bam_txn->bam_ce = bam_txn_buf;
- bam_txn_buf +=
- sizeof(*bam_txn->bam_ce) * QPIC_PER_CW_CMD_ELEMENTS * num_cw;
+ bam_txn->bam_ce_nitems = QPIC_PER_CW_CMD_ELEMENTS * num_cw;
+ bam_txn_buf += sizeof(*bam_txn->bam_ce) * bam_txn->bam_ce_nitems;
bam_txn->cmd_sgl = bam_txn_buf;
- bam_txn_buf +=
- sizeof(*bam_txn->cmd_sgl) * QPIC_PER_CW_CMD_SGL * num_cw;
+ bam_txn->cmd_sgl_nitems = QPIC_PER_CW_CMD_SGL * num_cw;
+ bam_txn_buf += sizeof(*bam_txn->cmd_sgl) * bam_txn->cmd_sgl_nitems;
bam_txn->data_sgl = bam_txn_buf;
+ bam_txn->data_sgl_nitems = QPIC_PER_CW_DATA_SGL * num_cw;
init_completion(&bam_txn->txn_done);
@@ -237,6 +245,9 @@ int qcom_prep_bam_dma_desc_cmd(struct qcom_nand_controller *nandc, bool read,
struct bam_cmd_element *bam_ce_buffer;
struct bam_transaction *bam_txn = nandc->bam_txn;
+ if (bam_txn->bam_ce_pos + size > bam_txn->bam_ce_nitems)
+ return qcom_err_bam_array_full(nandc, "CE");
+
bam_ce_buffer = &bam_txn->bam_ce[bam_txn->bam_ce_pos];
/* fill the command desc */
@@ -258,6 +269,9 @@ int qcom_prep_bam_dma_desc_cmd(struct qcom_nand_controller *nandc, bool read,
/* use the separate sgl after this command */
if (flags & NAND_BAM_NEXT_SGL) {
+ if (bam_txn->cmd_sgl_pos >= bam_txn->cmd_sgl_nitems)
+ return qcom_err_bam_array_full(nandc, "CMD sgl");
+
bam_ce_buffer = &bam_txn->bam_ce[bam_txn->bam_ce_start];
bam_ce_size = (bam_txn->bam_ce_pos -
bam_txn->bam_ce_start) *
@@ -297,10 +311,16 @@ int qcom_prep_bam_dma_desc_data(struct qcom_nand_controller *nandc, bool read,
struct bam_transaction *bam_txn = nandc->bam_txn;
if (read) {
+ if (bam_txn->rx_sgl_pos >= bam_txn->data_sgl_nitems)
+ return qcom_err_bam_array_full(nandc, "RX sgl");
+
sg_set_buf(&bam_txn->data_sgl[bam_txn->rx_sgl_pos],
vaddr, size);
bam_txn->rx_sgl_pos++;
} else {
+ if (bam_txn->tx_sgl_pos >= bam_txn->data_sgl_nitems)
+ return qcom_err_bam_array_full(nandc, "TX sgl");
+
sg_set_buf(&bam_txn->data_sgl[bam_txn->tx_sgl_pos],
vaddr, size);
bam_txn->tx_sgl_pos++;
diff --git a/include/linux/mtd/nand-qpic-common.h b/include/linux/mtd/nand-qpic-common.h
index cd7172e6c1bbffeee0363a14044980a72ea17723..3ca4073a496b8fd2a99112a9caefd3f110260568 100644
--- a/include/linux/mtd/nand-qpic-common.h
+++ b/include/linux/mtd/nand-qpic-common.h
@@ -240,6 +240,9 @@
* @last_data_desc - last DMA desc in data channel (tx/rx).
* @last_cmd_desc - last DMA desc in command channel.
* @txn_done - completion for NAND transfer.
+ * @bam_ce_nitems - the number of elements in the @bam_ce array
+ * @cmd_sgl_nitems - the number of elements in the @cmd_sgl array
+ * @data_sgl_nitems - the number of elements in the @data_sgl array
* @bam_ce_pos - the index in bam_ce which is available for next sgl
* @bam_ce_start - the index in bam_ce which marks the start position ce
* for current sgl. It will be used for size calculation
@@ -258,6 +261,11 @@ struct bam_transaction {
struct dma_async_tx_descriptor *last_data_desc;
struct dma_async_tx_descriptor *last_cmd_desc;
struct completion txn_done;
+
+ unsigned int bam_ce_nitems;
+ unsigned int cmd_sgl_nitems;
+ unsigned int data_sgl_nitems;
+
struct_group(bam_positions,
u32 bam_ce_pos;
u32 bam_ce_start;
--
2.49.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] spi: spi-qpic-snand: reallocate BAM transactions
2025-05-25 17:05 ` [PATCH 1/2] spi: spi-qpic-snand: reallocate BAM transactions Gabor Juhos
@ 2025-05-26 5:56 ` Md Sadre Alam
2025-05-27 11:28 ` Mark Brown
1 sibling, 0 replies; 12+ messages in thread
From: Md Sadre Alam @ 2025-05-26 5:56 UTC (permalink / raw)
To: Gabor Juhos, Mark Brown, Varadarajan Narayanan,
Sricharan Ramabadhran, Miquel Raynal, Richard Weinberger,
Vignesh Raghavendra
Cc: linux-spi, linux-mtd, linux-arm-msm, linux-kernel
Hi,
On 5/25/2025 10:35 PM, Gabor Juhos wrote:
> Using the mtd_nandbiterrs module for testing the driver occasionally
> results in weird things like below.
>
> 1. swiotlb mapping fails with the following message:
>
> [ 85.926216] qcom_snand 79b0000.spi: swiotlb buffer is full (sz: 4294967294 bytes), total 512 (slots), used 0 (slots)
> [ 85.932937] qcom_snand 79b0000.spi: failure in mapping desc
> [ 87.999314] qcom_snand 79b0000.spi: failure to write raw page
> [ 87.999352] mtd_nandbiterrs: error: write_oob failed (-110)
>
> Rebooting the board after that causes a panic:
>
> # reboot
> [ 877.178844] Unable to handle kernel read from unreadable memory at virtual address 0000000000000078
> [ 877.178913] Mem abort info:
> [ 877.186767] ESR = 0x0000000096000005
> [ 877.189508] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 877.193312] SET = 0, FnV = 0
> [ 877.198780] EA = 0, S1PTW = 0
> [ 877.201676] FSC = 0x05: level 1 translation fault
> [ 877.204684] Data abort info:
> [ 877.209559] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
> [ 877.212669] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [ 877.217964] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [ 877.223098] user pgtable: 4k pages, 39-bit VAs, pgdp=00000000446ae000
> [ 877.228470] [0000000000000078] pgd=080000004492d403, p4d=080000004492d403, pud=080000004492d403, pmd=0000000000000000
> [ 877.234944] Internal error: Oops: 0000000096000005 [#1] SMP
> [ 877.245395] Modules linked in: mtd_test pppoe ppp_async nft_fib_inet nf_flow_table_inet pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack slhc sfp qrtr_smd nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 mdio_i2c crc_ccitt evdev gpio_fan crypto_user algif_skcipher algif_rng algif_hash algif_aead af_alg sha512_generic seqiv geniv drbg hmac arc4 libarc4 gpio_keys usb_storage input_core leds_gpio xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_qcom phy_qcom_qusb2 phy_qcom_qmp_usb phy_qcom_m31 aquantia hwmon crc_itu_t crc32c_generic
> [ 877.297039] CPU: 1 UID: 0 PID: 2060 Comm: reboot Not tainted 6.15.0-rc4+ #0 NONE
> [ 877.319267] Hardware name: Qualcomm Technologies, Inc. IPQ9574/AP-AL02-C7 (DT)
> [ 877.326820] pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 877.333936] pc : wakeup_flusher_threads+0x98/0x1d0
> [ 877.340792] lr : wakeup_flusher_threads+0x128/0x1d0
> [ 877.345653] sp : ffffffc082b23db0
> [ 877.350426] x29: ffffffc082b23db0 x28: ffffff8003dd4000 x27: 0000000000000000
> [ 877.353902] x26: 0000000000000000 x25: 0000000000000000 x24: ffffffc081c89b08
> [ 877.361021] x23: ffffffc081cafc78 x22: 0000000000000020 x21: 0000000000000002
> [ 877.368138] x20: 0000000000000001 x19: ffffffc08032c000 x18: 0000000000000000
> [ 877.375257] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> [ 877.382374] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
> [ 877.389492] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffffc08032c36c
> [ 877.396610] x8 : 0000000000000001 x7 : ffffffc080327318 x6 : 0000000000000000
> [ 877.403728] x5 : 0000000000000000 x4 : 000000000002000f x3 : 0000000000000000
> [ 877.410847] x2 : 0000000000000001 x1 : ffffff8003dd4000 x0 : 0000000000000040
> [ 877.417965] Call trace:
> [ 877.425078] wakeup_flusher_threads+0x98/0x1d0 (P)
> [ 877.427337] ksys_sync+0x2c/0xa0
> [ 877.432197] __arm64_sys_sync+0x18/0x30
> [ 877.435582] invoke_syscall.constprop.0+0x64/0x110
> [ 877.439143] do_el0_svc+0x48/0xd8
> [ 877.444003] el0_svc+0x3c/0xf8
> [ 877.447388] el0t_64_sync_handler+0x10c/0x138
> [ 877.450341] el0t_64_sync+0x180/0x188
> [ 877.454770] Code: d1008016 eb17001f 540002c0 a90153f3 (f9402ec0)
> [ 877.458416] ---[ end trace 0000000000000000 ]---
> [ 877.464492] Kernel panic - not syncing: Oops: Fatal exception
> [ 877.469180] SMP: stopping secondary CPUs
> [ 877.474825] Kernel Offset: disabled
> [ 877.478812] CPU features: 0x0000,00000068,00000000,0200400b
> [ 877.482025] Memory Limit: none
> [ 877.487581] Rebooting in 3 seconds..
>
> 2. If the swiotlb mapping does not fail, rebooting the board may result
> in a different panic:
>
> # reboot
> [ 256.104459] BUG: spinlock bad magic on CPU#3, procd/2241
> [ 256.104488] Unable to handle kernel paging request at virtual address ffffffff0000049b
> [ 256.108827] Mem abort info:
> [ 256.116548] ESR = 0x0000000096000005
> [ 256.119240] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 256.123060] SET = 0, FnV = 0
> [ 256.128528] EA = 0, S1PTW = 0
> [ 256.131391] FSC = 0x05: level 1 translation fault
> [ 256.134431] Data abort info:
> [ 256.139291] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
> [ 256.142419] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [ 256.147713] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [ 256.152836] swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000041e5b000
> [ 256.158218] [ffffffff0000049b] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
> [ 256.164906] Internal error: Oops: 0000000096000005 [#1] SMP
> [ 256.173323] Modules linked in: mtd_test pppoe ppp_async nft_fib_inet nf_flow_table_inet pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack slhc sfp qrtr_smd nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 mdio_i2c crc_ccitt evdev gpio_fan crypto_user algif_skcipher algif_rng algif_hash algif_aead af_alg sha512_generic seqiv geniv drbg hmac arc4 libarc4 gpio_keys usb_storage input_core leds_gpio xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_qcom phy_qcom_qusb2 phy_qcom_qmp_usb phy_qcom_m31 aquantia hwmon crc_itu_t crc32c_generic
> [ 256.225139] CPU: 3 UID: 0 PID: 2241 Comm: procd Not tainted 6.15.0-rc4+ #0 NONE
> [ 256.247369] Hardware name: Qualcomm Technologies, Inc. IPQ9574/AP-AL02-C7 (DT)
> [ 256.254921] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 256.261951] pc : spin_bug+0x70/0xd8
> [ 256.268806] lr : spin_bug+0x68/0xd8
> [ 256.272278] sp : ffffffc083763b00
> [ 256.275750] x29: ffffffc083763b00 x28: ffffff80005b8000 x27: ffffffc0827f4000
> [ 256.279225] x26: ffffffc080dab000 x25: ffffffc083763c18 x24: ffffff80005b8000
> [ 256.286344] x23: 0000000000000002 x22: ffffffc081c00c00 x21: ffffffff0000001b
> [ 256.293462] x20: ffffffc080d44c80 x19: ffffff8004d52888 x18: 0000000000000005
> [ 256.300580] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000006
> [ 256.307698] x14: 0000000000000000 x13: 313432322f64636f x12: ffffffc081c707b8
> [ 256.314816] x11: 0000000000000001 x10: 0000000000000001 x9 : ffffffc0800cdfd0
> [ 256.321935] x8 : c0000000ffffefff x7 : ffffffc081c186a8 x6 : 0000000000057fa8
> [ 256.329052] x5 : ffffffc081c70760 x4 : 0000000000000000 x3 : ffffffc0837638e0
> [ 256.336170] x2 : 000000000000001b x1 : ffffff80005b8000 x0 : 000000000000002c
> [ 256.343289] Call trace:
> [ 256.350402] spin_bug+0x70/0xd8 (P)
> [ 256.352660] do_raw_spin_lock+0xa4/0x128
> [ 256.356132] _raw_spin_lock_irqsave+0x70/0x98
> [ 256.360299] __mutex_lock+0x268/0xdf0
> [ 256.364552] mutex_lock_nested+0x2c/0x40
> [ 256.368199] device_shutdown+0xfc/0x260
> [ 256.372191] kernel_restart+0x4c/0xb8
> [ 256.375750] __do_sys_reboot+0x108/0x230
> [ 256.379570] __arm64_sys_reboot+0x2c/0x40
> [ 256.383563] invoke_syscall.constprop.0+0x64/0x110
> [ 256.387470] do_el0_svc+0x48/0xd8
> [ 256.392156] el0_svc+0x3c/0xf8
> [ 256.395541] el0t_64_sync_handler+0x10c/0x138
> [ 256.398494] el0t_64_sync+0x180/0x188
> [ 256.402923] Code: 54000220 940009a5 b9400662 b4000295 (b94482a4)
> [ 256.406570] ---[ end trace 0000000000000000 ]---
> [ 256.412645] Kernel panic - not syncing: Oops: Fatal exception
> [ 256.417340] Kernel Offset: disabled
> [ 256.422972] CPU features: 0x0000,00000068,00000000,0200400b
> [ 256.426273] Memory Limit: none
> [ 256.431828] Rebooting in 3 seconds..
>
> Investigating the issue revealed that these symptoms are results of
> memory corruption which is caused by out of bounds access within the
> driver.
>
> The driver uses a dynamically allocated structure for BAM transactions,
> which structure must have enough space for all possible variations of
> different flash operations initiated by the driver. The required space
> heavily depends on the actual number of 'codewords' which is calculated
> from the pagesize of the actual NAND chip.
>
> Although the qcom_nandc_alloc() function allocates memory for the BAM
> transactions during probe, but since the actual number of 'codewords'
> is not yet know the allocation is done for one 'codeword' only.
>
> Because of this, whenever the driver does a flash operation, and the
> number of the required transactions exceeds the size of the allocated
> arrays the driver accesses memory out of the allocated range.
>
> To avoid this, change the code to free the initially allocated BAM
> transactions memory, and allocate a new one once the actual number of
> 'codewords' required for a given NAND chip is known.
>
> Fixes: 7304d1909080 ("spi: spi-qpic: add driver for QCOM SPI NAND flash Interface")
> Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
> ---
> drivers/spi/spi-qpic-snand.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/drivers/spi/spi-qpic-snand.c b/drivers/spi/spi-qpic-snand.c
> index fd129650434f0129e24d3bdac7e7c4d5542627e6..c552cb7aa80e368e193d71e1826b2cc005571a9c 100644
> --- a/drivers/spi/spi-qpic-snand.c
> +++ b/drivers/spi/spi-qpic-snand.c
> @@ -315,6 +315,22 @@ static int qcom_spi_ecc_init_ctx_pipelined(struct nand_device *nand)
>
> mtd_set_ooblayout(mtd, &qcom_spi_ooblayout);
>
> + /*
> + * Free the temporary BAM transaction allocated initially by
> + * qcom_nandc_alloc(), and allocate a new one based on the
> + * updated max_cwperpage value.
> + */
> + qcom_free_bam_transaction(snandc);
> +
> + snandc->max_cwperpage = cwperpage;
> +
> + snandc->bam_txn = qcom_alloc_bam_transaction(snandc);
> + if (!snandc->bam_txn) {
> + dev_err(snandc->dev, "failed to allocate BAM transaction\n");
> + ret = -ENOMEM;
> + goto err_free_ecc_cfg;
> + }
> +
> ecc_cfg->cfg0 = FIELD_PREP(CW_PER_PAGE_MASK, (cwperpage - 1)) |
> FIELD_PREP(UD_SIZE_BYTES_MASK, ecc_cfg->cw_data) |
> FIELD_PREP(DISABLE_STATUS_AFTER_WRITE, 1) |
>
Reviewed-by: Md Sadre Alam <quic_mdalam@quicinc.com>
Thanks,
Alam.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] mtd: nand: qpic_common: prevent out of bounds access of BAM arrays
2025-05-25 17:05 ` [PATCH 2/2] mtd: nand: qpic_common: prevent out of bounds access of BAM arrays Gabor Juhos
@ 2025-05-26 6:53 ` Md Sadre Alam
2025-05-26 20:01 ` Gabor Juhos
2025-05-26 9:27 ` Miquel Raynal
1 sibling, 1 reply; 12+ messages in thread
From: Md Sadre Alam @ 2025-05-26 6:53 UTC (permalink / raw)
To: Gabor Juhos, Mark Brown, Varadarajan Narayanan,
Sricharan Ramabadhran, Miquel Raynal, Richard Weinberger,
Vignesh Raghavendra
Cc: linux-spi, linux-mtd, linux-arm-msm, linux-kernel,
Lakshmi Sowjanya D
Hi,
On 5/25/2025 10:35 PM, Gabor Juhos wrote:
> The common QPIC code does not do any boundary checking when it handles
> the command elements and scatter gater list arrays of a BAM transaction,
> thus it allows to access out of bounds elements in those.
>
> Although it is the responsibility of the given driver to allocate enough
> space for all possible BAM transaction variations, however there can be
> mistakes in the driver code which can lead to hidden memory corruption
> issues which are hard to debug.
>
> This kind of problem has been observed during testing the 'spi-qpic-snand'
> driver. Although the driver has been fixed with a preceding patch, but it
> still makes sense to reduce the chance of having such errors again later.
>
> In order to prevent such errors, change the qcom_alloc_bam_transaction()
> function to store the number of elements of the arrays in the
> 'bam_transaction' strucutre during allocation. Also, add sanity checks to
> the qcom_prep_bam_dma_desc_{cmd,data}() functions to avoid using out of
> bounds indices for the arrays.
>
> Tested with the 'spi-qpic-snand' driver only.
I recommend testing this patch on both the IPQ and SDX platforms,
as the QPIC raw NAND driver are utilized across both.
If you have access to IPQ and SDX devices with raw NAND, please proceed
with testing on both.
Otherwise, I can handle testing on the IPQ raw NAND device and
coordinate with Lakshmi Sowjanya D (quic_laksd@quicinc.com)
for testing on the SDX platform.
Thanks,
Alam.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] spi: spi-qpic-snand: avoid memory corruption
2025-05-25 17:05 [PATCH 0/2] spi: spi-qpic-snand: avoid memory corruption Gabor Juhos
2025-05-25 17:05 ` [PATCH 1/2] spi: spi-qpic-snand: reallocate BAM transactions Gabor Juhos
2025-05-25 17:05 ` [PATCH 2/2] mtd: nand: qpic_common: prevent out of bounds access of BAM arrays Gabor Juhos
@ 2025-05-26 8:13 ` Miquel Raynal
2 siblings, 0 replies; 12+ messages in thread
From: Miquel Raynal @ 2025-05-26 8:13 UTC (permalink / raw)
To: Gabor Juhos
Cc: Mark Brown, Md Sadre Alam, Varadarajan Narayanan,
Sricharan Ramabadhran, Richard Weinberger, Vignesh Raghavendra,
linux-spi, linux-mtd, linux-arm-msm, linux-kernel
Hi Gabor,
On 25/05/2025 at 19:05:34 +02, Gabor Juhos <j4g8y7@gmail.com> wrote:
> The 'spi-qpic-nand' driver may cause memory corruption under some
> circumstances. The first patch in the series changes the driver to
> avoid that, whereas the second adds some sanity checks to the common
> QPIC code in order to make detecting such errors easier in the future.
>
> Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
> ---
> Gabor Juhos (2):
> spi: spi-qpic-snand: reallocate BAM transactions
> mtd: nand: qpic_common: prevent out of bounds access of BAM
> arrays
Are these two paches related somehow? Or can we apply them in
two different trees? If yes, please send the patches separately, which
eases our work.
Thanks,
Miquèl
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] mtd: nand: qpic_common: prevent out of bounds access of BAM arrays
2025-05-25 17:05 ` [PATCH 2/2] mtd: nand: qpic_common: prevent out of bounds access of BAM arrays Gabor Juhos
2025-05-26 6:53 ` Md Sadre Alam
@ 2025-05-26 9:27 ` Miquel Raynal
2025-05-26 20:21 ` Gabor Juhos
1 sibling, 1 reply; 12+ messages in thread
From: Miquel Raynal @ 2025-05-26 9:27 UTC (permalink / raw)
To: Gabor Juhos
Cc: Mark Brown, Md Sadre Alam, Varadarajan Narayanan,
Sricharan Ramabadhran, Richard Weinberger, Vignesh Raghavendra,
linux-spi, linux-mtd, linux-arm-msm, linux-kernel
Hi Gabor,
On 25/05/2025 at 19:05:36 +02, Gabor Juhos <j4g8y7@gmail.com> wrote:
> The common QPIC code does not do any boundary checking when it handles
> the command elements and scatter gater list arrays of a BAM transaction,
> thus it allows to access out of bounds elements in those.
>
> Although it is the responsibility of the given driver to allocate enough
> space for all possible BAM transaction variations, however there can be
> mistakes in the driver code which can lead to hidden memory corruption
> issues which are hard to debug.
>
> This kind of problem has been observed during testing the 'spi-qpic-snand'
> driver. Although the driver has been fixed with a preceding patch, but it
> still makes sense to reduce the chance of having such errors again later.
>
> In order to prevent such errors, change the qcom_alloc_bam_transaction()
> function to store the number of elements of the arrays in the
> 'bam_transaction' strucutre during allocation. Also, add sanity checks to
> the qcom_prep_bam_dma_desc_{cmd,data}() functions to avoid using out of
> bounds indices for the arrays.
>
> Tested with the 'spi-qpic-snand' driver only.
>
> Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
> ---
> Preferably, this should go in via the SPI tree along with the previous
> patch. It is not a strict requirement though, in the case it gets
> included separately through the mtd tree it reveals the bug fixed in
> the first patch.
Sorry, didn't see that in the first place. Fine by me.
> ---
> drivers/mtd/nand/qpic_common.c | 28 ++++++++++++++++++++++++----
> include/linux/mtd/nand-qpic-common.h | 8 ++++++++
> 2 files changed, 32 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/mtd/nand/qpic_common.c b/drivers/mtd/nand/qpic_common.c
> index e0ed25b5afea9b289b767cd3d9c2d7572ed52008..fb1f81e4bdacaa3e81660a20e164926c64633513 100644
> --- a/drivers/mtd/nand/qpic_common.c
> +++ b/drivers/mtd/nand/qpic_common.c
> @@ -15,6 +15,13 @@
> #include <linux/slab.h>
> #include <linux/mtd/nand-qpic-common.h>
>
> +static inline int qcom_err_bam_array_full(struct qcom_nand_controller *nandc,
> + const char *name)
> +{
> + dev_err(nandc->dev, "BAM %s array is full\n", name);
> + return -EINVAL;
> +}
This is rather uncommon, I don't know if it's very relevant to do
that. Please drop this static inline function.
Thanks,
Miquèl
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] mtd: nand: qpic_common: prevent out of bounds access of BAM arrays
2025-05-26 6:53 ` Md Sadre Alam
@ 2025-05-26 20:01 ` Gabor Juhos
2025-05-28 6:11 ` Lakshmi Sowjanya D (QUIC)
0 siblings, 1 reply; 12+ messages in thread
From: Gabor Juhos @ 2025-05-26 20:01 UTC (permalink / raw)
To: Md Sadre Alam, Mark Brown, Varadarajan Narayanan,
Sricharan Ramabadhran, Miquel Raynal, Richard Weinberger,
Vignesh Raghavendra
Cc: linux-spi, linux-mtd, linux-arm-msm, linux-kernel,
Lakshmi Sowjanya D
2025. 05. 26. 8:53 keltezéssel, Md Sadre Alam írta:
> Hi,
>
> On 5/25/2025 10:35 PM, Gabor Juhos wrote:
>> The common QPIC code does not do any boundary checking when it handles
>> the command elements and scatter gater list arrays of a BAM transaction,
>> thus it allows to access out of bounds elements in those.
>>
>> Although it is the responsibility of the given driver to allocate enough
>> space for all possible BAM transaction variations, however there can be
>> mistakes in the driver code which can lead to hidden memory corruption
>> issues which are hard to debug.
>>
>> This kind of problem has been observed during testing the 'spi-qpic-snand'
>> driver. Although the driver has been fixed with a preceding patch, but it
>> still makes sense to reduce the chance of having such errors again later.
>>
>> In order to prevent such errors, change the qcom_alloc_bam_transaction()
>> function to store the number of elements of the arrays in the
>> 'bam_transaction' strucutre during allocation. Also, add sanity checks to
>> the qcom_prep_bam_dma_desc_{cmd,data}() functions to avoid using out of
>> bounds indices for the arrays.
>>
>> Tested with the 'spi-qpic-snand' driver only.
> I recommend testing this patch on both the IPQ and SDX platforms,
> as the QPIC raw NAND driver are utilized across both.
>
> If you have access to IPQ and SDX devices with raw NAND, please proceed
> with testing on both.
Sorry, I have no SDX devices at all, and unfortunately I can't access my older
IPQ boards before next week.
>
> Otherwise, I can handle testing on the IPQ raw NAND device and coordinate with
> Lakshmi Sowjanya D (quic_laksd@quicinc.com)
> for testing on the SDX platform.
If you could do some testing in the meantime, that would be superb.
Thanks for that in advance!
Regards,
Gabor
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] mtd: nand: qpic_common: prevent out of bounds access of BAM arrays
2025-05-26 9:27 ` Miquel Raynal
@ 2025-05-26 20:21 ` Gabor Juhos
0 siblings, 0 replies; 12+ messages in thread
From: Gabor Juhos @ 2025-05-26 20:21 UTC (permalink / raw)
To: Miquel Raynal
Cc: Mark Brown, Md Sadre Alam, Varadarajan Narayanan,
Sricharan Ramabadhran, Richard Weinberger, Vignesh Raghavendra,
linux-spi, linux-mtd, linux-arm-msm, linux-kernel
Hi Miquel,
[...]
>> ---
>> Preferably, this should go in via the SPI tree along with the previous
>> patch. It is not a strict requirement though, in the case it gets
>> included separately through the mtd tree it reveals the bug fixed in
>> the first patch.
>
> Sorry, didn't see that in the first place. Fine by me.
Sorry for the inconvenience. Should I add these kind of notes into the cover
letter next time?
>
>> ---
>> drivers/mtd/nand/qpic_common.c | 28 ++++++++++++++++++++++++----
>> include/linux/mtd/nand-qpic-common.h | 8 ++++++++
>> 2 files changed, 32 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/mtd/nand/qpic_common.c b/drivers/mtd/nand/qpic_common.c
>> index e0ed25b5afea9b289b767cd3d9c2d7572ed52008..fb1f81e4bdacaa3e81660a20e164926c64633513 100644
>> --- a/drivers/mtd/nand/qpic_common.c
>> +++ b/drivers/mtd/nand/qpic_common.c
>> @@ -15,6 +15,13 @@
>> #include <linux/slab.h>
>> #include <linux/mtd/nand-qpic-common.h>
>>
>> +static inline int qcom_err_bam_array_full(struct qcom_nand_controller *nandc,
>> + const char *name)
>> +{
>> + dev_err(nandc->dev, "BAM %s array is full\n", name);
>> + return -EINVAL;
>> +}
>
> This is rather uncommon, I don't know if it's very relevant to do
> that. Please drop this static inline function.
The purpose of the inline function is a bit similar to dev_err_probe().
It helps to standardize the error message, and it allows to print the message
and return with a value in one go.
So this kind of statement ...
if (bam_txn->bam_ce_pos + size > bam_txn->bam_ce_nitems) {
dev_err(nandc->dev, "BAM %s array is full\n", "CE");
return -EINVAL;
}
... can be simplied like this:
if (bam_txn->bam_ce_pos + size > bam_txn->bam_ce_nitems)
return qcom_err_bam_array_full(nandc, "CE");
The latter is cleaner for me, but no problem, I will remove that.
Regards,
Gabor
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] spi: spi-qpic-snand: reallocate BAM transactions
2025-05-25 17:05 ` [PATCH 1/2] spi: spi-qpic-snand: reallocate BAM transactions Gabor Juhos
2025-05-26 5:56 ` Md Sadre Alam
@ 2025-05-27 11:28 ` Mark Brown
2025-05-27 12:23 ` Gabor Juhos
1 sibling, 1 reply; 12+ messages in thread
From: Mark Brown @ 2025-05-27 11:28 UTC (permalink / raw)
To: Gabor Juhos
Cc: Md Sadre Alam, Varadarajan Narayanan, Sricharan Ramabadhran,
Miquel Raynal, Richard Weinberger, Vignesh Raghavendra, linux-spi,
linux-mtd, linux-arm-msm, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 779 bytes --]
On Sun, May 25, 2025 at 07:05:35PM +0200, Gabor Juhos wrote:
> # reboot
> [ 877.178844] Unable to handle kernel read from unreadable memory at virtual address 0000000000000078
> [ 877.178913] Mem abort info:
> [ 877.186767] ESR = 0x0000000096000005
> [ 877.189508] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 877.193312] SET = 0, FnV = 0
> [ 877.198780] EA = 0, S1PTW = 0
Please think hard before including complete backtraces in upstream
reports, they are very large and contain almost no useful information
relative to their size so often obscure the relevant content in your
message. If part of the backtrace is usefully illustrative (it often is
for search engines if nothing else) then it's usually better to pull out
the relevant sections.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] spi: spi-qpic-snand: reallocate BAM transactions
2025-05-27 11:28 ` Mark Brown
@ 2025-05-27 12:23 ` Gabor Juhos
0 siblings, 0 replies; 12+ messages in thread
From: Gabor Juhos @ 2025-05-27 12:23 UTC (permalink / raw)
To: Mark Brown
Cc: Md Sadre Alam, Varadarajan Narayanan, Sricharan Ramabadhran,
Miquel Raynal, Richard Weinberger, Vignesh Raghavendra, linux-spi,
linux-mtd, linux-arm-msm, linux-kernel
2025. 05. 27. 13:28 keltezéssel, Mark Brown írta:
> On Sun, May 25, 2025 at 07:05:35PM +0200, Gabor Juhos wrote:
>
>> # reboot
>> [ 877.178844] Unable to handle kernel read from unreadable memory at virtual address 0000000000000078
>> [ 877.178913] Mem abort info:
>> [ 877.186767] ESR = 0x0000000096000005
>> [ 877.189508] EC = 0x25: DABT (current EL), IL = 32 bits
>> [ 877.193312] SET = 0, FnV = 0
>> [ 877.198780] EA = 0, S1PTW = 0
>
> Please think hard before including complete backtraces in upstream
> reports, they are very large and contain almost no useful information
> relative to their size so often obscure the relevant content in your
> message. If part of the backtrace is usefully illustrative (it often is
> for search engines if nothing else) then it's usually better to pull out
> the relevant sections.
Sorry about that. I try to avoid doing that in the future.
-Gabor
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: [PATCH 2/2] mtd: nand: qpic_common: prevent out of bounds access of BAM arrays
2025-05-26 20:01 ` Gabor Juhos
@ 2025-05-28 6:11 ` Lakshmi Sowjanya D (QUIC)
0 siblings, 0 replies; 12+ messages in thread
From: Lakshmi Sowjanya D (QUIC) @ 2025-05-28 6:11 UTC (permalink / raw)
To: Gabor Juhos, Md Sadre Alam (QUIC), Mark Brown,
Varadarajan Narayanan (QUIC), Sricharan Ramabadhran (QUIC),
Miquel Raynal, Richard Weinberger, Vignesh Raghavendra
Cc: linux-spi@vger.kernel.org, linux-mtd@lists.infradead.org,
linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org
> -----Original Message-----
> From: Gabor Juhos <j4g8y7@gmail.com>
> Sent: Tuesday, May 27, 2025 1:31 AM
> To: Md Sadre Alam (QUIC) <quic_mdalam@quicinc.com>; Mark Brown
> <broonie@kernel.org>; Varadarajan Narayanan (QUIC)
> <quic_varada@quicinc.com>; Sricharan Ramabadhran (QUIC)
> <quic_srichara@quicinc.com>; Miquel Raynal <miquel.raynal@bootlin.com>;
> Richard Weinberger <richard@nod.at>; Vignesh Raghavendra
> <vigneshr@ti.com>
> Cc: linux-spi@vger.kernel.org; linux-mtd@lists.infradead.org; linux-arm-
> msm@vger.kernel.org; linux-kernel@vger.kernel.org; Lakshmi Sowjanya D
> (QUIC) <quic_laksd@quicinc.com>
> Subject: Re: [PATCH 2/2] mtd: nand: qpic_common: prevent out of bounds
> access of BAM arrays
>
> 2025. 05. 26. 8:53 keltezéssel, Md Sadre Alam írta:
> > Hi,
> >
> > On 5/25/2025 10:35 PM, Gabor Juhos wrote:
> >> The common QPIC code does not do any boundary checking when it
> >> handles the command elements and scatter gater list arrays of a BAM
> >> transaction, thus it allows to access out of bounds elements in those.
> >>
> >> Although it is the responsibility of the given driver to allocate
> >> enough space for all possible BAM transaction variations, however
> >> there can be mistakes in the driver code which can lead to hidden
> >> memory corruption issues which are hard to debug.
> >>
> >> This kind of problem has been observed during testing the 'spi-qpic-snand'
> >> driver. Although the driver has been fixed with a preceding patch,
> >> but it still makes sense to reduce the chance of having such errors again
> later.
> >>
> >> In order to prevent such errors, change the
> >> qcom_alloc_bam_transaction() function to store the number of elements
> >> of the arrays in the 'bam_transaction' strucutre during allocation.
> >> Also, add sanity checks to the qcom_prep_bam_dma_desc_{cmd,data}()
> >> functions to avoid using out of bounds indices for the arrays.
> >>
> >> Tested with the 'spi-qpic-snand' driver only.
> > I recommend testing this patch on both the IPQ and SDX platforms, as
> > the QPIC raw NAND driver are utilized across both.
> >
> > If you have access to IPQ and SDX devices with raw NAND, please
> > proceed with testing on both.
>
> Sorry, I have no SDX devices at all, and unfortunately I can't access my older
> IPQ boards before next week.
>
> >
> > Otherwise, I can handle testing on the IPQ raw NAND device and
> > coordinate with Lakshmi Sowjanya D (quic_laksd@quicinc.com) for
> > testing on the SDX platform.
>
> If you could do some testing in the meantime, that would be superb.
> Thanks for that in advance!
>
> Regards,
> Gabor
Tested-by: Lakshmi Sowjanya D <quic_laksd@quicinc.com> # on SDX75
--
Regards
Lakshmi Sowjanya
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-05-28 6:11 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-25 17:05 [PATCH 0/2] spi: spi-qpic-snand: avoid memory corruption Gabor Juhos
2025-05-25 17:05 ` [PATCH 1/2] spi: spi-qpic-snand: reallocate BAM transactions Gabor Juhos
2025-05-26 5:56 ` Md Sadre Alam
2025-05-27 11:28 ` Mark Brown
2025-05-27 12:23 ` Gabor Juhos
2025-05-25 17:05 ` [PATCH 2/2] mtd: nand: qpic_common: prevent out of bounds access of BAM arrays Gabor Juhos
2025-05-26 6:53 ` Md Sadre Alam
2025-05-26 20:01 ` Gabor Juhos
2025-05-28 6:11 ` Lakshmi Sowjanya D (QUIC)
2025-05-26 9:27 ` Miquel Raynal
2025-05-26 20:21 ` Gabor Juhos
2025-05-26 8:13 ` [PATCH 0/2] spi: spi-qpic-snand: avoid memory corruption Miquel Raynal
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).