* [PATCH net-next v2] net: airoha: Validate egress gdm port in airoha_ppe_foe_entry_prepare()
@ 2025-03-15 12:04 Lorenzo Bianconi
2025-03-15 13:58 ` Andrew Lunn
0 siblings, 1 reply; 5+ messages in thread
From: Lorenzo Bianconi @ 2025-03-15 12:04 UTC (permalink / raw)
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Lorenzo Bianconi
Cc: linux-arm-kernel, linux-mediatek, netdev
The system occasionally crashes dereferencing a NULL pointer when it is
forwarding constant, high load bidirectional traffic.
[ 2149.913414] Unable to handle kernel read from unreadable memory at virtual address 0000000000000000
[ 2149.925812] Mem abort info:
[ 2149.928713] ESR = 0x0000000096000005
[ 2149.932762] EC = 0x25: DABT (current EL), IL = 32 bits
[ 2149.938429] SET = 0, FnV = 0
[ 2149.941814] EA = 0, S1PTW = 0
[ 2149.945187] FSC = 0x05: level 1 translation fault
[ 2149.950348] Data abort info:
[ 2149.953472] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[ 2149.959243] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 2149.964593] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 2149.970243] user pgtable: 4k pages, 39-bit VAs, pgdp=000000008b507000
[ 2149.977068] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[ 2149.986062] Internal error: Oops: 0000000096000005 [#1] SMP
[ 2150.082282] arht_wrapper(O) i2c_core arht_hook(O) crc32_generic
[ 2150.177623] CPU: 0 PID: 38 Comm: kworker/u9:1 Tainted: G O 6.6.73 #0
[ 2150.185362] Hardware name: Airoha AN7581 Evaluation Board (DT)
[ 2150.191189] Workqueue: nf_ft_offload_add nf_flow_rule_route_ipv6 [nf_flow_table]
[ 2150.198653] pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 2150.205615] pc : airoha_ppe_flow_offload_replace.isra.0+0x6dc/0xc54
[ 2150.211882] lr : airoha_ppe_flow_offload_replace.isra.0+0x6cc/0xc54
[ 2150.218149] sp : ffffffc080e8ba10
[ 2150.221456] x29: ffffffc080e8bae0 x28: ffffff80080b0000 x27: 0000000000000000
[ 2150.228591] x26: ffffff8001c70020 x25: 0000000000000002 x24: 0000000000000000
[ 2150.235727] x23: 0000000061000000 x22: 00000000ffffffed x21: ffffffc080e8bbb0
[ 2150.242862] x20: ffffff8001c70000 x19: 0000000000000008 x18: 0000000000000000
[ 2150.249998] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 2150.257133] x14: 0000000000000001 x13: 0000000000000008 x12: 0101010101010101
[ 2150.264268] x11: 7f7f7f7f7f7f7f7f x10: 0000000000000041 x9 : 0000000000000000
[ 2150.271404] x8 : ffffffc080e8bad8 x7 : 0000000000000000 x6 : 0000000000000015
[ 2150.278540] x5 : ffffffc080e8ba4e x4 : 0000000000000004 x3 : 0000000000000000
[ 2150.285675] x2 : 0000000000000008 x1 : 00000000080b0000 x0 : 0000000000000000
[ 2150.292811] Call trace:
[ 2150.295250] airoha_ppe_flow_offload_replace.isra.0+0x6dc/0xc54
[ 2150.301171] airoha_ppe_setup_tc_block_cb+0x7c/0x8b4
[ 2150.306135] nf_flow_offload_ip_hook+0x710/0x874 [nf_flow_table]
[ 2150.312168] nf_flow_rule_route_ipv6+0x53c/0x580 [nf_flow_table]
[ 2150.318200] process_one_work+0x178/0x2f0
[ 2150.322211] worker_thread+0x2e4/0x4cc
[ 2150.325953] kthread+0xd8/0xdc
[ 2150.329008] ret_from_fork+0x10/0x20
[ 2150.332589] Code: b9007bf7 b4001e9c f9448380 b9491381 (f9400000)
[ 2150.338681] ---[ end trace 0000000000000000 ]---
[ 2150.343298] Kernel panic - not syncing: Oops: Fatal exception
[ 2150.349035] SMP: stopping secondary CPUs
[ 2150.352954] Kernel Offset: disabled
[ 2150.356438] CPU features: 0x0,00000000,00000000,1000400b
[ 2150.361743] Memory Limit: none
Fix the issue by validating the egress gdm port in the
airoha_ppe_foe_entry_prepare() routine.
Fixes: 00a7678310fe ("net: airoha: Introduce flowtable offload support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
Changes in v2:
- Avoid checking the netdev_priv pointer since it is never NULL
- Link to v1: https://lore.kernel.org/r/20250312-airoha-flowtable-null-ptr-fix-v1-1-6363fab884d0@kernel.org
---
drivers/net/ethernet/airoha/airoha_eth.c | 13 +++++++++++++
drivers/net/ethernet/airoha/airoha_eth.h | 3 +++
drivers/net/ethernet/airoha/airoha_ppe.c | 10 ++++++++--
3 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
index c0a642568ac115ea9df6fbaf7133627a4405a36c..bf9c882e9c8b087dbf5e907636547a0117d1b96a 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.c
+++ b/drivers/net/ethernet/airoha/airoha_eth.c
@@ -2454,6 +2454,19 @@ static void airoha_metadata_dst_free(struct airoha_gdm_port *port)
 	}
 }
 
+int airoha_is_valid_gdm_port(struct airoha_eth *eth,
+			     struct airoha_gdm_port *port)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(eth->ports); i++) {
+		if (eth->ports[i] == port)
+			return 0;
+	}
+
+	return -EINVAL;
+}
+
 static int airoha_alloc_gdm_port(struct airoha_eth *eth,
 				 struct device_node *np, int index)
 {
diff --git a/drivers/net/ethernet/airoha/airoha_eth.h b/drivers/net/ethernet/airoha/airoha_eth.h
index f66b9b736b9447b31afc036eb906d0a1c617e132..c7d4f124d11481cd31c1566936cd47e3446877c0 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.h
+++ b/drivers/net/ethernet/airoha/airoha_eth.h
@@ -532,6 +532,9 @@ u32 airoha_rmw(void __iomem *base, u32 offset, u32 mask, u32 val);
 #define airoha_qdma_clear(qdma, offset, val)			\
 	airoha_rmw((qdma)->regs, (offset), (val), 0)
 
+int airoha_is_valid_gdm_port(struct airoha_eth *eth,
+			     struct airoha_gdm_port *port);
+
 void airoha_ppe_check_skb(struct airoha_ppe *ppe, u16 hash);
 int airoha_ppe_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
 				 void *cb_priv);
diff --git a/drivers/net/ethernet/airoha/airoha_ppe.c b/drivers/net/ethernet/airoha/airoha_ppe.c
index 8b55e871352d359fa692c253d3f3315c619472b3..65833e2058194a64569eafec08b80df8190bba6c 100644
--- a/drivers/net/ethernet/airoha/airoha_ppe.c
+++ b/drivers/net/ethernet/airoha/airoha_ppe.c
@@ -197,7 +197,8 @@ static int airoha_get_dsa_port(struct net_device **dev)
 #endif
 }
 
-static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
+static int airoha_ppe_foe_entry_prepare(struct airoha_eth *eth,
+					struct airoha_foe_entry *hwe,
 					struct net_device *dev, int type,
 					struct airoha_flow_data *data,
 					int l4proto)
@@ -224,6 +225,11 @@ static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
 	if (dev) {
 		struct airoha_gdm_port *port = netdev_priv(dev);
 		u8 pse_port;
+		int err;
+
+		err = airoha_is_valid_gdm_port(eth, port);
+		if (err)
+			return err;
 
 		if (dsa_port >= 0)
 			pse_port = port->id == 4 ? FE_PSE_PORT_GDM4 : port->id;
@@ -633,7 +639,7 @@ static int airoha_ppe_flow_offload_replace(struct airoha_gdm_port *port,
 	    !is_valid_ether_addr(data.eth.h_dest))
 		return -EINVAL;
 
-	err = airoha_ppe_foe_entry_prepare(&hwe, odev, offload_type,
+	err = airoha_ppe_foe_entry_prepare(eth, &hwe, odev, offload_type,
 					   &data, l4proto);
 	if (err)
 		return err;
---
base-commit: bfc6c67ec2d64d0ca4e5cc3e1ac84298a10b8d62
change-id: 20250312-airoha-flowtable-null-ptr-fix-a4656d12546a
Best regards,
--
Lorenzo Bianconi <lorenzo@kernel.org>
* Re: [PATCH net-next v2] net: airoha: Validate egress gdm port in airoha_ppe_foe_entry_prepare()
2025-03-15 12:04 [PATCH net-next v2] net: airoha: Validate egress gdm port in airoha_ppe_foe_entry_prepare() Lorenzo Bianconi
@ 2025-03-15 13:58 ` Andrew Lunn
2025-03-15 14:59 ` Lorenzo Bianconi
0 siblings, 1 reply; 5+ messages in thread
From: Andrew Lunn @ 2025-03-15 13:58 UTC (permalink / raw)
To: Lorenzo Bianconi
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, linux-arm-kernel, linux-mediatek, netdev
> Fix the issue validating egress gdm port in airoha_ppe_foe_entry_prepare
> routine.
A more interesting question is, why do you see an invalid port? Is the
hardware broken? Something not correctly configured? Are you just
papering over the crack?
> -static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
> +static int airoha_ppe_foe_entry_prepare(struct airoha_eth *eth,
> + struct airoha_foe_entry *hwe,
> struct net_device *dev, int type,
> struct airoha_flow_data *data,
> int l4proto)
> @@ -224,6 +225,11 @@ static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
> if (dev) {
> struct airoha_gdm_port *port = netdev_priv(dev);
If port is invalid, is dev also invalid? And if dev is invalid, could
dereferencing it to get priv cause an oops?
Andrew
* Re: [PATCH net-next v2] net: airoha: Validate egress gdm port in airoha_ppe_foe_entry_prepare()
2025-03-15 13:58 ` Andrew Lunn
@ 2025-03-15 14:59 ` Lorenzo Bianconi
2025-03-21 18:18 ` Paolo Abeni
0 siblings, 1 reply; 5+ messages in thread
From: Lorenzo Bianconi @ 2025-03-15 14:59 UTC (permalink / raw)
To: Andrew Lunn
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, linux-arm-kernel, linux-mediatek, netdev
> > Fix the issue validating egress gdm port in airoha_ppe_foe_entry_prepare
> > routine.
>
> A more interesting question is, why do you see an invalid port? Is the
> hardware broken? Something not correctly configured? Are you just
> papering over the crack?
>
> > -static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
> > +static int airoha_ppe_foe_entry_prepare(struct airoha_eth *eth,
> > + struct airoha_foe_entry *hwe,
> > struct net_device *dev, int type,
> > struct airoha_flow_data *data,
> > int l4proto)
> > @@ -224,6 +225,11 @@ static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
> > if (dev) {
> > struct airoha_gdm_port *port = netdev_priv(dev);
>
> If port is invalid, is dev also invalid? And if dev is invalid, could
> dereferencing it to get priv cause an oops?
I do not think this is a hw problem. I get the sporadic crash reported above
when running bidirectional high load traffic. In particular, netfilter runs
airoha_ppe_flow_offload_replace(), which provides the egress net_device pointer
used in airoha_ppe_foe_entry_prepare(). Debugging with gdb, I found that the
system crashes dereferencing the port pointer in airoha_ppe_foe_entry_prepare()
(even though the dev pointer is not NULL). Adding this sanity check makes the
system stable. Please note that a similar check already exists in the mtk
driver [0].
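For context on why a non-NULL dev says nothing about port: in kernels of the
6.6 era netdev_priv() is, as far as I know, plain pointer arithmetic past the
aligned end of struct net_device, so it never returns NULL for a non-NULL dev
and does not check what the private area holds. A minimal userspace sketch of
that arithmetic follows; model_netdev, MODEL_ALIGN, model_priv_offset() and
model_netdev_priv() are hypothetical stand-ins, not kernel definitions.

```c
#include <stddef.h>

/* Hypothetical alignment of the private area (stand-in for NETDEV_ALIGN). */
#define MODEL_ALIGN 32u

/* Hypothetical, heavily reduced stand-in for struct net_device. */
struct model_netdev {
	char name[16];
	unsigned int mtu;
};

/* Offset of the private area: sizeof(dev) rounded up to MODEL_ALIGN. */
static size_t model_priv_offset(void)
{
	size_t mask = (size_t)MODEL_ALIGN - 1;

	return (sizeof(struct model_netdev) + mask) & ~mask;
}

/*
 * Model of netdev_priv(): pure pointer arithmetic on dev. The result is
 * non-NULL whenever dev is non-NULL, whatever type the owner of the
 * device actually stored in the private area.
 */
static void *model_netdev_priv(struct model_netdev *dev)
{
	return (char *)dev + model_priv_offset();
}
```

Under this model a NULL check on the returned pointer can never fire, which is
consistent with dropping that check in v2.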
Regards,
Lorenzo
[0] https://github.com/torvalds/linux/blob/master/drivers/net/ethernet/mediatek/mtk_ppe_offload.c#L220
>
> Andrew
* Re: [PATCH net-next v2] net: airoha: Validate egress gdm port in airoha_ppe_foe_entry_prepare()
2025-03-15 14:59 ` Lorenzo Bianconi
@ 2025-03-21 18:18 ` Paolo Abeni
2025-03-21 18:30 ` Lorenzo Bianconi
0 siblings, 1 reply; 5+ messages in thread
From: Paolo Abeni @ 2025-03-21 18:18 UTC (permalink / raw)
To: Lorenzo Bianconi, Andrew Lunn
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-mediatek, netdev
On 3/15/25 3:59 PM, Lorenzo Bianconi wrote:
>>> Fix the issue validating egress gdm port in airoha_ppe_foe_entry_prepare
>>> routine.
>>
>> A more interesting question is, why do you see an invalid port? Is the
>> hardware broken? Something not correctly configured? Are you just
>> papering over the crack?
>>
>>> -static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
>>> +static int airoha_ppe_foe_entry_prepare(struct airoha_eth *eth,
>>> + struct airoha_foe_entry *hwe,
>>> struct net_device *dev, int type,
>>> struct airoha_flow_data *data,
>>> int l4proto)
>>> @@ -224,6 +225,11 @@ static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
>>> if (dev) {
>>> struct airoha_gdm_port *port = netdev_priv(dev);
>>
>> If port is invalid, is dev also invalid? And if dev is invalid, could
>> dereferencing it to get priv cause an oops?
>
> I do not think this is a hw problem. Running bidirectional high load traffic,
> I got the sporadic crash reported above. In particular, netfilter runs
> airoha_ppe_flow_offload_replace() providing the egress net_device pointer used
> in airoha_ppe_foe_entry_prepare(). Debugging with gdb, I discovered the system
> crashes dereferencing port pointer in airoha_ppe_foe_entry_prepare() (even if
> dev pointer is not NULL). Adding this sanity check makes the system stable.
> Please note a similar check is available even in mtk driver [0].
I agree with Andrew, you need a better understanding of the root cause.
This really looks like papering over some deeper issue.
AFAICS 'dev' is fetched from the airoha driver itself a few lines
before. Possibly you should double check that code.
Thanks,
Paolo
* Re: [PATCH net-next v2] net: airoha: Validate egress gdm port in airoha_ppe_foe_entry_prepare()
2025-03-21 18:18 ` Paolo Abeni
@ 2025-03-21 18:30 ` Lorenzo Bianconi
0 siblings, 0 replies; 5+ messages in thread
From: Lorenzo Bianconi @ 2025-03-21 18:30 UTC (permalink / raw)
To: Paolo Abeni
Cc: Lorenzo Bianconi, Andrew Lunn, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-mediatek,
netdev
> On 3/15/25 3:59 PM, Lorenzo Bianconi wrote:
> >>> Fix the issue validating egress gdm port in airoha_ppe_foe_entry_prepare
> >>> routine.
> >>
> >> A more interesting question is, why do you see an invalid port? Is the
> >> hardware broken? Something not correctly configured? Are you just
> >> papering over the crack?
> >>
> >>> -static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
> >>> +static int airoha_ppe_foe_entry_prepare(struct airoha_eth *eth,
> >>> + struct airoha_foe_entry *hwe,
> >>> struct net_device *dev, int type,
> >>> struct airoha_flow_data *data,
> >>> int l4proto)
> >>> @@ -224,6 +225,11 @@ static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
> >>> if (dev) {
> >>> struct airoha_gdm_port *port = netdev_priv(dev);
> >>
> >> If port is invalid, is dev also invalid? And if dev is invalid, could
> >> dereferencing it to get priv cause an oops?
> >
> > I do not think this is a hw problem. Running bidirectional high load traffic,
> > I got the sporadic crash reported above. In particular, netfilter runs
> > airoha_ppe_flow_offload_replace() providing the egress net_device pointer used
> > in airoha_ppe_foe_entry_prepare(). Debugging with gdb, I discovered the system
> > crashes dereferencing port pointer in airoha_ppe_foe_entry_prepare() (even if
> > dev pointer is not NULL). Adding this sanity check makes the system stable.
> > Please note a similar check is available even in mtk driver [0].
>
> I agree with Andrew, you need a better understanding of the root cause.
> This really looks like papering over some deeper issue.
>
> AFAICS 'dev' is fetched from the airoha driver itself a few lines
> before. Possibly you should double check that code.
Are you referring to the airoha_get_dsa_port() routine?
I think the dev pointer in airoha_ppe_foe_entry_prepare() is not
necessarily a device owned by the driver itself, since it is an egress
device and the flowtable can also contain a wlan or a vlan device. In this
case airoha_get_dsa_port() will just return the original device pointer
and we can't assume the priv pointer points to an airoha_gdm_port struct.
Agree?
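
The argument above can be sketched in userspace C: a private pointer is
trusted only if it matches one of the ports the driver registered itself. This
is a simplified model of the idea behind airoha_is_valid_gdm_port(), not the
driver code; model_eth, model_gdm_port, MODEL_MAX_PORTS and
model_is_valid_gdm_port() are hypothetical names, and the explicit NULL guard
is an addition made for the model.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical number of port slots; the real array size is not assumed. */
#define MODEL_MAX_PORTS 4

/* Hypothetical stand-in for struct airoha_gdm_port. */
struct model_gdm_port {
	int id;
};

/* Hypothetical stand-in for struct airoha_eth: it owns the known ports. */
struct model_eth {
	struct model_gdm_port *ports[MODEL_MAX_PORTS];
};

/*
 * Return 0 if priv is one of the ports registered in eth, -EINVAL
 * otherwise. Only pointer identity is compared; priv is never
 * dereferenced, so it is safe to call on the private area of a
 * foreign (e.g. wlan or vlan) device.
 */
static int model_is_valid_gdm_port(const struct model_eth *eth,
				   const void *priv)
{
	size_t i;

	/* Model-only guard: unused slots are NULL and must not match. */
	if (!priv)
		return -EINVAL;

	for (i = 0; i < MODEL_MAX_PORTS; i++) {
		if (eth->ports[i] == priv)
			return 0;
	}

	return -EINVAL;
}
```

With one registered port, passing that port returns 0, while the address of
any unrelated object (standing in for a foreign device's private area) returns
-EINVAL without ever being dereferenced.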
Regards,
Lorenzo
>
> Thanks,
>
> Paolo
>
>