* [PATCH] net: stmmac: fix oops when split header is enabled
From: Jie Zhang @ 2025-12-02 2:54 UTC (permalink / raw)
To: netdev
Cc: Jie Zhang, Jie Zhang, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Maxime Coquelin, Alexandre Torgue,
Russell King (Oracle), Furong Xu, Kunihiko Hayashi,
Vladimir Oltean, linux-stm32, linux-arm-kernel, linux-kernel
For GMAC4, when split header is enabled, in some rare cases, the
hardware does not fill buf2 of the first descriptor with payload.
Thus we cannot assume buf2 is always fully filled if it is not
the last descriptor. Otherwise, the length of buf2 of the second
descriptor will be calculated wrong and cause an oops:
Unable to handle kernel paging request at virtual address ffff00019246bfc0
Mem abort info:
ESR = 0x0000000096000145
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x05: level 1 translation fault
Data abort info:
ISV = 0, ISS = 0x00000145, ISS2 = 0x00000000
CM = 1, WnR = 1, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000090d8b000
[ffff00019246bfc0] pgd=180000009dfff403, p4d=180000009dfff403, pud=0000000000000000
Internal error: Oops: 0000000096000145 [#1] SMP
Modules linked in:
CPU: 0 UID: 0 PID: 157 Comm: iperf3 Not tainted 6.18.0-rc6 #1 PREEMPT
Hardware name: ADI 64-bit SC598 SOM EZ Kit (DT)
pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : dcache_inval_poc+0x28/0x58
lr : arch_sync_dma_for_cpu+0x28/0x34
sp : ffff800080dcbc40
x29: ffff800080dcbc40 x28: 0000000000000008 x27: ffff000091c50980
x26: ffff000091c50980 x25: 0000000000000000 x24: ffff000092a5fb00
x23: ffff000092768f28 x22: 000000009246c000 x21: 0000000000000002
x20: 00000000ffffffdc x19: ffff000091844c10 x18: 0000000000000000
x17: ffff80001d308000 x16: ffff800080dc8000 x15: ffff0000929fb034
x14: 70f709157374dd21 x13: ffff000092812ec0 x12: 0000000000000000
x11: 000000000000dd86 x10: 0000000000000040 x9 : 0000000000000600
x8 : ffff000092a5fbac x7 : 0000000000000001 x6 : 0000000000004240
x5 : 000000009246c000 x4 : ffff000091844c10 x3 : 000000000000003f
x2 : 0000000000000040 x1 : ffff00019246bfc0 x0 : ffff00009246c000
Call trace:
dcache_inval_poc+0x28/0x58 (P)
dma_direct_sync_single_for_cpu+0x38/0x6c
__dma_sync_single_for_cpu+0x34/0x6c
stmmac_napi_poll_rx+0x8f0/0xb60
__napi_poll.constprop.0+0x30/0x144
net_rx_action+0x160/0x274
handle_softirqs+0x1b8/0x1fc
__do_softirq+0x10/0x18
____do_softirq+0xc/0x14
call_on_irq_stack+0x30/0x48
do_softirq_own_stack+0x18/0x20
__irq_exit_rcu+0x64/0xe8
irq_exit_rcu+0xc/0x14
el1_interrupt+0x3c/0x58
el1h_64_irq_handler+0x14/0x1c
el1h_64_irq+0x6c/0x70
__arch_copy_to_user+0xbc/0x240 (P)
simple_copy_to_iter+0x28/0x30
__skb_datagram_iter+0x1bc/0x268
skb_copy_datagram_iter+0x1c/0x24
tcp_recvmsg_locked+0x3ec/0x778
tcp_recvmsg+0x10c/0x194
inet_recvmsg+0x64/0xa0
sock_recvmsg_nosec+0x1c/0x24
sock_read_iter+0x8c/0xdc
vfs_read+0x144/0x1a0
ksys_read+0x74/0xdc
__arm64_sys_read+0x14/0x1c
invoke_syscall+0x60/0xe4
el0_svc_common.constprop.0+0xb0/0xcc
do_el0_svc+0x18/0x20
el0_svc+0x80/0xc8
el0t_64_sync_handler+0x58/0x134
el0t_64_sync+0x170/0x174
Code: d1000443 ea03003f 8a230021 54000040 (d50b7e21)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops: Fatal exception in interrupt
Kernel Offset: disabled
CPU features: 0x080000,00008000,08006281,0400520b
Memory Limit: none
---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
To fix this, the PL bit-field in RDES3 register is used for all
descriptors, whether it is the last descriptor or not.
Signed-off-by: Jie Zhang <jie.zhang@analog.com>
---
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 20 ++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 7b90ecd3a55e..848b1769c573 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -4878,13 +4878,27 @@ static unsigned int stmmac_rx_buf2_len(struct stmmac_priv *priv,
 	if (!priv->sph)
 		return 0;
 
-	/* Not last descriptor */
-	if (status & rx_not_ls)
+	/* For GMAC4, when split header is enabled, in some rare cases, the
+	 * hardware does not fill buf2 of the first descriptor with payload.
+	 * Thus we cannot assume buf2 is always fully filled if it is not
+	 * the last descriptor. Otherwise, the length of buf2 of the second
+	 * descriptor will be calculated wrong and cause an oops.
+	 *
+	 * If this is the last descriptor, 'plen' is the length of the
+	 * received packet that was transferred to system memory.
+	 * Otherwise, it is the accumulated number of bytes that have been
+	 * transferred for the current packet.
+	 *
+	 * Thus 'plen - len' always gives the correct length of buf2.
+	 */
+
+	/* Not GMAC4 and not last descriptor */
+	if (!priv->plat->has_gmac4 && (status & rx_not_ls))
 		return priv->dma_conf.dma_buf_sz;
 
+	/* GMAC4 or last descriptor */
 	plen = stmmac_get_rx_frame_len(priv, p, coe);
 
-	/* Last descriptor */
 	return plen - len;
 }
--
2.47.3
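
The length bookkeeping described in the commit message can be sketched in isolation. This is a simplified model, not the driver code: `struct rx_desc_model`, `MODEL_BUF_SZ`, `buf2_len_old()`, `buf2_len_new()` and `last_buf2_len()` are hypothetical names, and the 66-byte header and 1566-byte frame used below are illustrative values that do not come from the report. It assumes a 32-bit `unsigned int`.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for one RX descriptor. */
struct rx_desc_model {
	unsigned int pl; /* PL field: bytes of this packet written so far */
	bool last;       /* true for the last descriptor of the packet */
};

#define MODEL_BUF_SZ 1536u /* assumed DMA buffer size */

/* Old logic: a non-last descriptor is assumed to have a full buf2. */
static unsigned int buf2_len_old(const struct rx_desc_model *d,
				 unsigned int len)
{
	if (!d->last)
		return MODEL_BUF_SZ;
	return d->pl - len;
}

/* Patched GMAC4 logic: always derive the buf2 length from PL. */
static unsigned int buf2_len_new(const struct rx_desc_model *d,
				 unsigned int len)
{
	return d->pl - len;
}

typedef unsigned int (*buf2_len_fn)(const struct rx_desc_model *,
				    unsigned int);

/*
 * Walk the descriptors of one packet. With split header on, only the
 * first descriptor has a non-empty buf1 (the header). Returns the buf2
 * length computed for the final descriptor.
 */
static unsigned int last_buf2_len(const struct rx_desc_model *d, int n,
				  unsigned int hdr_len, buf2_len_fn fn)
{
	unsigned int len = 0, buf2 = 0;

	for (int i = 0; i < n; i++) {
		len += (i == 0) ? hdr_len : 0; /* buf1 */
		buf2 = fn(&d[i], len);
		len += buf2;
	}
	return buf2;
}
```

With a first descriptor whose PL is 66 (header only, buf2 left empty) followed by a last descriptor whose PL is 1566, the old logic accumulates 66 + 1536 = 1602 bytes and then computes 1566 - 1602, which wraps to 0xffffffdc. The patched logic yields 0 for the first buf2 and 1500 for the second. In the trace above, x0 + 0xffffffdc rounded down to a 64-byte boundary equals the faulting address in x1, which would be consistent with an underflowed sync length of this kind, although the trace alone does not show which variable held it.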
* Re: [PATCH] net: stmmac: fix oops when split header is enabled
From: Jacob Keller @ 2025-12-04 0:37 UTC (permalink / raw)
To: Jie Zhang, netdev
Cc: Jie Zhang, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Maxime Coquelin, Alexandre Torgue,
Russell King (Oracle), Furong Xu, Kunihiko Hayashi,
Vladimir Oltean, linux-stm32, linux-arm-kernel, linux-kernel
On 12/1/2025 6:54 PM, Jie Zhang wrote:
> For GMAC4, when split header is enabled, in some rare cases, the
> hardware does not fill buf2 of the first descriptor with payload.
> Thus we cannot assume buf2 is always fully filled if it is not
> the last descriptor. Otherwise, the length of buf2 of the second
> descriptor will be calculated wrong and cause an oops:
>
> Unable to handle kernel paging request at virtual address ffff00019246bfc0
> Mem abort info:
> ESR = 0x0000000096000145
> EC = 0x25: DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> FSC = 0x05: level 1 translation fault
> Data abort info:
> ISV = 0, ISS = 0x00000145, ISS2 = 0x00000000
> CM = 1, WnR = 1, TnD = 0, TagAccess = 0
> GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000090d8b000
> [ffff00019246bfc0] pgd=180000009dfff403, p4d=180000009dfff403, pud=0000000000000000
> Internal error: Oops: 0000000096000145 [#1] SMP
> Modules linked in:
> CPU: 0 UID: 0 PID: 157 Comm: iperf3 Not tainted 6.18.0-rc6 #1 PREEMPT
> Hardware name: ADI 64-bit SC598 SOM EZ Kit (DT)
> pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : dcache_inval_poc+0x28/0x58
> lr : arch_sync_dma_for_cpu+0x28/0x34
> sp : ffff800080dcbc40
> x29: ffff800080dcbc40 x28: 0000000000000008 x27: ffff000091c50980
> x26: ffff000091c50980 x25: 0000000000000000 x24: ffff000092a5fb00
> x23: ffff000092768f28 x22: 000000009246c000 x21: 0000000000000002
> x20: 00000000ffffffdc x19: ffff000091844c10 x18: 0000000000000000
> x17: ffff80001d308000 x16: ffff800080dc8000 x15: ffff0000929fb034
> x14: 70f709157374dd21 x13: ffff000092812ec0 x12: 0000000000000000
> x11: 000000000000dd86 x10: 0000000000000040 x9 : 0000000000000600
> x8 : ffff000092a5fbac x7 : 0000000000000001 x6 : 0000000000004240
> x5 : 000000009246c000 x4 : ffff000091844c10 x3 : 000000000000003f
> x2 : 0000000000000040 x1 : ffff00019246bfc0 x0 : ffff00009246c000
> Call trace:
> dcache_inval_poc+0x28/0x58 (P)
> dma_direct_sync_single_for_cpu+0x38/0x6c
> __dma_sync_single_for_cpu+0x34/0x6c
> stmmac_napi_poll_rx+0x8f0/0xb60
> __napi_poll.constprop.0+0x30/0x144
> net_rx_action+0x160/0x274
> handle_softirqs+0x1b8/0x1fc
> __do_softirq+0x10/0x18
> ____do_softirq+0xc/0x14
> call_on_irq_stack+0x30/0x48
> do_softirq_own_stack+0x18/0x20
> __irq_exit_rcu+0x64/0xe8
> irq_exit_rcu+0xc/0x14
> el1_interrupt+0x3c/0x58
> el1h_64_irq_handler+0x14/0x1c
> el1h_64_irq+0x6c/0x70
> __arch_copy_to_user+0xbc/0x240 (P)
> simple_copy_to_iter+0x28/0x30
> __skb_datagram_iter+0x1bc/0x268
> skb_copy_datagram_iter+0x1c/0x24
> tcp_recvmsg_locked+0x3ec/0x778
> tcp_recvmsg+0x10c/0x194
> inet_recvmsg+0x64/0xa0
> sock_recvmsg_nosec+0x1c/0x24
> sock_read_iter+0x8c/0xdc
> vfs_read+0x144/0x1a0
> ksys_read+0x74/0xdc
> __arm64_sys_read+0x14/0x1c
> invoke_syscall+0x60/0xe4
> el0_svc_common.constprop.0+0xb0/0xcc
> do_el0_svc+0x18/0x20
> el0_svc+0x80/0xc8
> el0t_64_sync_handler+0x58/0x134
> el0t_64_sync+0x170/0x174
> Code: d1000443 ea03003f 8a230021 54000040 (d50b7e21)
> ---[ end trace 0000000000000000 ]---
> Kernel panic - not syncing: Oops: Fatal exception in interrupt
> Kernel Offset: disabled
> CPU features: 0x080000,00008000,08006281,0400520b
> Memory Limit: none
> ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
>
You could probably minify the crash here a bit since much of this
information isn't that helpful in showing the crash. Not too big a deal
overall.
> To fix this, the PL bit-field in RDES3 register is used for all
> descriptors, whether it is the last descriptor or not.
>
> Signed-off-by: Jie Zhang <jie.zhang@analog.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
* Re: [PATCH] net: stmmac: fix oops when split header is enabled
From: Paolo Abeni @ 2025-12-04 12:26 UTC (permalink / raw)
To: Jie Zhang, netdev
Cc: Jie Zhang, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Maxime Coquelin, Alexandre Torgue,
Russell King (Oracle), Furong Xu, Kunihiko Hayashi,
Vladimir Oltean, linux-stm32, linux-arm-kernel, linux-kernel
On 12/2/25 3:54 AM, Jie Zhang wrote:
> For GMAC4, when split header is enabled, in some rare cases, the
> hardware does not fill buf2 of the first descriptor with payload.
> Thus we cannot assume buf2 is always fully filled if it is not
> the last descriptor. Otherwise, the length of buf2 of the second
> descriptor will be calculated wrong and cause an oops:
>
> Unable to handle kernel paging request at virtual address ffff00019246bfc0
> Mem abort info:
> ESR = 0x0000000096000145
> EC = 0x25: DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> FSC = 0x05: level 1 translation fault
> Data abort info:
> ISV = 0, ISS = 0x00000145, ISS2 = 0x00000000
> CM = 1, WnR = 1, TnD = 0, TagAccess = 0
> GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000090d8b000
> [ffff00019246bfc0] pgd=180000009dfff403, p4d=180000009dfff403, pud=0000000000000000
> Internal error: Oops: 0000000096000145 [#1] SMP
> Modules linked in:
> CPU: 0 UID: 0 PID: 157 Comm: iperf3 Not tainted 6.18.0-rc6 #1 PREEMPT
> Hardware name: ADI 64-bit SC598 SOM EZ Kit (DT)
> pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : dcache_inval_poc+0x28/0x58
> lr : arch_sync_dma_for_cpu+0x28/0x34
> sp : ffff800080dcbc40
> x29: ffff800080dcbc40 x28: 0000000000000008 x27: ffff000091c50980
> x26: ffff000091c50980 x25: 0000000000000000 x24: ffff000092a5fb00
> x23: ffff000092768f28 x22: 000000009246c000 x21: 0000000000000002
> x20: 00000000ffffffdc x19: ffff000091844c10 x18: 0000000000000000
> x17: ffff80001d308000 x16: ffff800080dc8000 x15: ffff0000929fb034
> x14: 70f709157374dd21 x13: ffff000092812ec0 x12: 0000000000000000
> x11: 000000000000dd86 x10: 0000000000000040 x9 : 0000000000000600
> x8 : ffff000092a5fbac x7 : 0000000000000001 x6 : 0000000000004240
> x5 : 000000009246c000 x4 : ffff000091844c10 x3 : 000000000000003f
> x2 : 0000000000000040 x1 : ffff00019246bfc0 x0 : ffff00009246c000
> Call trace:
> dcache_inval_poc+0x28/0x58 (P)
> dma_direct_sync_single_for_cpu+0x38/0x6c
> __dma_sync_single_for_cpu+0x34/0x6c
> stmmac_napi_poll_rx+0x8f0/0xb60
> __napi_poll.constprop.0+0x30/0x144
> net_rx_action+0x160/0x274
> handle_softirqs+0x1b8/0x1fc
> __do_softirq+0x10/0x18
> ____do_softirq+0xc/0x14
> call_on_irq_stack+0x30/0x48
> do_softirq_own_stack+0x18/0x20
> __irq_exit_rcu+0x64/0xe8
> irq_exit_rcu+0xc/0x14
> el1_interrupt+0x3c/0x58
> el1h_64_irq_handler+0x14/0x1c
> el1h_64_irq+0x6c/0x70
> __arch_copy_to_user+0xbc/0x240 (P)
> simple_copy_to_iter+0x28/0x30
> __skb_datagram_iter+0x1bc/0x268
> skb_copy_datagram_iter+0x1c/0x24
> tcp_recvmsg_locked+0x3ec/0x778
> tcp_recvmsg+0x10c/0x194
> inet_recvmsg+0x64/0xa0
> sock_recvmsg_nosec+0x1c/0x24
> sock_read_iter+0x8c/0xdc
> vfs_read+0x144/0x1a0
> ksys_read+0x74/0xdc
> __arm64_sys_read+0x14/0x1c
> invoke_syscall+0x60/0xe4
> el0_svc_common.constprop.0+0xb0/0xcc
> do_el0_svc+0x18/0x20
> el0_svc+0x80/0xc8
> el0t_64_sync_handler+0x58/0x134
> el0t_64_sync+0x170/0x174
> Code: d1000443 ea03003f 8a230021 54000040 (d50b7e21)
> ---[ end trace 0000000000000000 ]---
> Kernel panic - not syncing: Oops: Fatal exception in interrupt
> Kernel Offset: disabled
> CPU features: 0x080000,00008000,08006281,0400520b
> Memory Limit: none
> ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
Please avoid including the last 6 lines in the commit message: the '---'
separator could confuse git when applying the patch and truncate the
commit message.
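
A submitter could screen a commit message for such lines before sending. The sketch below is a deliberately conservative check, not a reimplementation of git's parser: it flags every line whose first non-blank characters are "---", whether or not git would treat that particular line as a separator. The helper names `starts_with_triple_dash()` and `count_risky_lines()` are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical lint helper, not git's real parsing rule: report a line
 * whose first non-blank characters are "---".
 */
static bool starts_with_triple_dash(const char *line)
{
	while (*line == ' ' || *line == '\t')
		line++;
	return strncmp(line, "---", 3) == 0;
}

/* Count flagged lines in a NUL-terminated, newline-separated message. */
static int count_risky_lines(const char *msg)
{
	int n = 0;

	while (*msg) {
		const char *nl;

		if (starts_with_triple_dash(msg))
			n++;
		nl = strchr(msg, '\n');
		if (!nl)
			break;
		msg = nl + 1;
	}
	return n;
}
```

Run over the tail of the oops quoted above, this flags both the "---[ end trace ... ]---" line and the "---[ end Kernel panic ... ]---" line, which are the ones worth trimming from the commit message.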
> To fix this, the PL bit-field in RDES3 register is used for all
> descriptors, whether it is the last descriptor or not.
>
> Signed-off-by: Jie Zhang <jie.zhang@analog.com>
Looks like a fix suitable for net; a Fixes tag is required and you should
include the target tree in the subject prefix, see:

https://elixir.bootlin.com/linux/v6.18/source/Documentation/process/maintainer-netdev.rst#L64

The patch does not apply to 'net' anymore, please rebase and resubmit. You
can retain Jacob's ack.
/P
* Re: [PATCH] net: stmmac: fix oops when split header is enabled
From: Simon Horman @ 2025-12-04 13:11 UTC (permalink / raw)
To: Jie Zhang
Cc: netdev, Jie Zhang, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Maxime Coquelin, Alexandre Torgue,
Russell King (Oracle), Furong Xu, Kunihiko Hayashi,
Vladimir Oltean, linux-stm32, linux-arm-kernel, linux-kernel
On Mon, Dec 01, 2025 at 09:54:16PM -0500, Jie Zhang wrote:
> For GMAC4, when split header is enabled, in some rare cases, the
> hardware does not fill buf2 of the first descriptor with payload.
> Thus we cannot assume buf2 is always fully filled if it is not
> the last descriptor. Otherwise, the length of buf2 of the second
> descriptor will be calculated wrong and cause an oops:
...
As a bug fix this should have a Fixes tag here
(no blank line between it and your Signed-off-by line).
As a rule of thumb, it should cite the patch where
the bug was introduced.
Perhaps in this case the following is appropriate:
Fixes: ec222003bd94 ("net: stmmac: Prepare to add Split Header support")
> Signed-off-by: Jie Zhang <jie.zhang@analog.com>
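
The shape of the suggested tag can be checked mechanically. The sketch below is a hypothetical helper, `looks_like_fixes_tag()`, that accepts a line of the form `Fixes: <hash> ("<subject>")`. It assumes the kernel's convention of at least 12 hex digits for the abbreviated hash, and it does not verify that the commit exists or that the subject matches it.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical format check for a line shaped like:
 *   Fixes: <at least 12 hex digits> ("<subject>")
 */
static bool looks_like_fixes_tag(const char *line)
{
	static const char prefix[] = "Fixes: ";
	size_t n = 0, len;

	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return false;
	line += sizeof(prefix) - 1;

	/* Abbreviated commit hash. */
	while (isxdigit((unsigned char)line[n]))
		n++;
	if (n < 12)
		return false;
	line += n;

	/* Opening of the quoted subject. */
	if (strncmp(line, " (\"", 3) != 0)
		return false;
	line += 3;

	/* Need a non-empty subject followed by the closing quote and paren. */
	len = strlen(line);
	return len > 2 && strcmp(line + len - 2, "\")") == 0;
}
```

The tag proposed above passes this check; a tag with a shorter hash or without the quoted subject does not.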