netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v2] net: airoha: Validate egress gdm port in airoha_ppe_foe_entry_prepare()
@ 2025-03-15 12:04 Lorenzo Bianconi
  2025-03-15 13:58 ` Andrew Lunn
  0 siblings, 1 reply; 5+ messages in thread
From: Lorenzo Bianconi @ 2025-03-15 12:04 UTC (permalink / raw)
  To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Lorenzo Bianconi
  Cc: linux-arm-kernel, linux-mediatek, netdev

The system occasionally crashes dereferencing a NULL pointer when it is
forwarding constant, high load bidirectional traffic.

[ 2149.913414] Unable to handle kernel read from unreadable memory at virtual address 0000000000000000
[ 2149.925812] Mem abort info:
[ 2149.928713]   ESR = 0x0000000096000005
[ 2149.932762]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 2149.938429]   SET = 0, FnV = 0
[ 2149.941814]   EA = 0, S1PTW = 0
[ 2149.945187]   FSC = 0x05: level 1 translation fault
[ 2149.950348] Data abort info:
[ 2149.953472]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[ 2149.959243]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 2149.964593]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 2149.970243] user pgtable: 4k pages, 39-bit VAs, pgdp=000000008b507000
[ 2149.977068] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[ 2149.986062] Internal error: Oops: 0000000096000005 [#1] SMP
[ 2150.082282]  arht_wrapper(O) i2c_core arht_hook(O) crc32_generic
[ 2150.177623] CPU: 0 PID: 38 Comm: kworker/u9:1 Tainted: G           O       6.6.73 #0
[ 2150.185362] Hardware name: Airoha AN7581 Evaluation Board (DT)
[ 2150.191189] Workqueue: nf_ft_offload_add nf_flow_rule_route_ipv6 [nf_flow_table]
[ 2150.198653] pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 2150.205615] pc : airoha_ppe_flow_offload_replace.isra.0+0x6dc/0xc54
[ 2150.211882] lr : airoha_ppe_flow_offload_replace.isra.0+0x6cc/0xc54
[ 2150.218149] sp : ffffffc080e8ba10
[ 2150.221456] x29: ffffffc080e8bae0 x28: ffffff80080b0000 x27: 0000000000000000
[ 2150.228591] x26: ffffff8001c70020 x25: 0000000000000002 x24: 0000000000000000
[ 2150.235727] x23: 0000000061000000 x22: 00000000ffffffed x21: ffffffc080e8bbb0
[ 2150.242862] x20: ffffff8001c70000 x19: 0000000000000008 x18: 0000000000000000
[ 2150.249998] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 2150.257133] x14: 0000000000000001 x13: 0000000000000008 x12: 0101010101010101
[ 2150.264268] x11: 7f7f7f7f7f7f7f7f x10: 0000000000000041 x9 : 0000000000000000
[ 2150.271404] x8 : ffffffc080e8bad8 x7 : 0000000000000000 x6 : 0000000000000015
[ 2150.278540] x5 : ffffffc080e8ba4e x4 : 0000000000000004 x3 : 0000000000000000
[ 2150.285675] x2 : 0000000000000008 x1 : 00000000080b0000 x0 : 0000000000000000
[ 2150.292811] Call trace:
[ 2150.295250]  airoha_ppe_flow_offload_replace.isra.0+0x6dc/0xc54
[ 2150.301171]  airoha_ppe_setup_tc_block_cb+0x7c/0x8b4
[ 2150.306135]  nf_flow_offload_ip_hook+0x710/0x874 [nf_flow_table]
[ 2150.312168]  nf_flow_rule_route_ipv6+0x53c/0x580 [nf_flow_table]
[ 2150.318200]  process_one_work+0x178/0x2f0
[ 2150.322211]  worker_thread+0x2e4/0x4cc
[ 2150.325953]  kthread+0xd8/0xdc
[ 2150.329008]  ret_from_fork+0x10/0x20
[ 2150.332589] Code: b9007bf7 b4001e9c f9448380 b9491381 (f9400000)
[ 2150.338681] ---[ end trace 0000000000000000 ]---
[ 2150.343298] Kernel panic - not syncing: Oops: Fatal exception
[ 2150.349035] SMP: stopping secondary CPUs
[ 2150.352954] Kernel Offset: disabled
[ 2150.356438] CPU features: 0x0,00000000,00000000,1000400b
[ 2150.361743] Memory Limit: none

Fix the issue validating egress gdm port in airoha_ppe_foe_entry_prepare
routine.

Fixes: 00a7678310fe ("net: airoha: Introduce flowtable offload support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
Changes in v2:
- Avoid checking netdev_priv pointer since it is always not NULL
- Link to v1: https://lore.kernel.org/r/20250312-airoha-flowtable-null-ptr-fix-v1-1-6363fab884d0@kernel.org
---
 drivers/net/ethernet/airoha/airoha_eth.c | 13 +++++++++++++
 drivers/net/ethernet/airoha/airoha_eth.h |  3 +++
 drivers/net/ethernet/airoha/airoha_ppe.c | 10 ++++++++--
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
index c0a642568ac115ea9df6fbaf7133627a4405a36c..bf9c882e9c8b087dbf5e907636547a0117d1b96a 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.c
+++ b/drivers/net/ethernet/airoha/airoha_eth.c
@@ -2454,6 +2454,19 @@ static void airoha_metadata_dst_free(struct airoha_gdm_port *port)
 	}
 }
 
+int airoha_is_valid_gdm_port(struct airoha_eth *eth,
+			     struct airoha_gdm_port *port)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(eth->ports); i++) {
+		if (eth->ports[i] == port)
+			return 0;
+	}
+
+	return -EINVAL;
+}
+
 static int airoha_alloc_gdm_port(struct airoha_eth *eth,
 				 struct device_node *np, int index)
 {
diff --git a/drivers/net/ethernet/airoha/airoha_eth.h b/drivers/net/ethernet/airoha/airoha_eth.h
index f66b9b736b9447b31afc036eb906d0a1c617e132..c7d4f124d11481cd31c1566936cd47e3446877c0 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.h
+++ b/drivers/net/ethernet/airoha/airoha_eth.h
@@ -532,6 +532,9 @@ u32 airoha_rmw(void __iomem *base, u32 offset, u32 mask, u32 val);
 #define airoha_qdma_clear(qdma, offset, val)			\
 	airoha_rmw((qdma)->regs, (offset), (val), 0)
 
+int airoha_is_valid_gdm_port(struct airoha_eth *eth,
+			     struct airoha_gdm_port *port);
+
 void airoha_ppe_check_skb(struct airoha_ppe *ppe, u16 hash);
 int airoha_ppe_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
 				 void *cb_priv);
diff --git a/drivers/net/ethernet/airoha/airoha_ppe.c b/drivers/net/ethernet/airoha/airoha_ppe.c
index 8b55e871352d359fa692c253d3f3315c619472b3..65833e2058194a64569eafec08b80df8190bba6c 100644
--- a/drivers/net/ethernet/airoha/airoha_ppe.c
+++ b/drivers/net/ethernet/airoha/airoha_ppe.c
@@ -197,7 +197,8 @@ static int airoha_get_dsa_port(struct net_device **dev)
 #endif
 }
 
-static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
+static int airoha_ppe_foe_entry_prepare(struct airoha_eth *eth,
+					struct airoha_foe_entry *hwe,
 					struct net_device *dev, int type,
 					struct airoha_flow_data *data,
 					int l4proto)
@@ -224,6 +225,11 @@ static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
 	if (dev) {
 		struct airoha_gdm_port *port = netdev_priv(dev);
 		u8 pse_port;
+		int err;
+
+		err = airoha_is_valid_gdm_port(eth, port);
+		if (err)
+			return err;
 
 		if (dsa_port >= 0)
 			pse_port = port->id == 4 ? FE_PSE_PORT_GDM4 : port->id;
@@ -633,7 +639,7 @@ static int airoha_ppe_flow_offload_replace(struct airoha_gdm_port *port,
 	    !is_valid_ether_addr(data.eth.h_dest))
 		return -EINVAL;
 
-	err = airoha_ppe_foe_entry_prepare(&hwe, odev, offload_type,
+	err = airoha_ppe_foe_entry_prepare(eth, &hwe, odev, offload_type,
 					   &data, l4proto);
 	if (err)
 		return err;

---
base-commit: bfc6c67ec2d64d0ca4e5cc3e1ac84298a10b8d62
change-id: 20250312-airoha-flowtable-null-ptr-fix-a4656d12546a

Best regards,
-- 
Lorenzo Bianconi <lorenzo@kernel.org>


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next v2] net: airoha: Validate egress gdm port in airoha_ppe_foe_entry_prepare()
  2025-03-15 12:04 [PATCH net-next v2] net: airoha: Validate egress gdm port in airoha_ppe_foe_entry_prepare() Lorenzo Bianconi
@ 2025-03-15 13:58 ` Andrew Lunn
  2025-03-15 14:59   ` Lorenzo Bianconi
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Lunn @ 2025-03-15 13:58 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-arm-kernel, linux-mediatek, netdev

> Fix the issue validating egress gdm port in airoha_ppe_foe_entry_prepare
> routine.

A more interesting question is, why do you see an invalid port? Is the
hardware broken? Something not correctly configured? Are you just
papering over the crack?

> -static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
> +static int airoha_ppe_foe_entry_prepare(struct airoha_eth *eth,
> +					struct airoha_foe_entry *hwe,
>  					struct net_device *dev, int type,
>  					struct airoha_flow_data *data,
>  					int l4proto)
> @@ -224,6 +225,11 @@ static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
>  	if (dev) {
>  		struct airoha_gdm_port *port = netdev_priv(dev);

If port is invalid, is dev also invalid? And if dev is invalid, could
dereferencing it to get priv cause an opps?

	Andrew

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next v2] net: airoha: Validate egress gdm port in airoha_ppe_foe_entry_prepare()
  2025-03-15 13:58 ` Andrew Lunn
@ 2025-03-15 14:59   ` Lorenzo Bianconi
  2025-03-21 18:18     ` Paolo Abeni
  0 siblings, 1 reply; 5+ messages in thread
From: Lorenzo Bianconi @ 2025-03-15 14:59 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-arm-kernel, linux-mediatek, netdev

[-- Attachment #1: Type: text/plain, Size: 1543 bytes --]

> > Fix the issue validating egress gdm port in airoha_ppe_foe_entry_prepare
> > routine.
> 
> A more interesting question is, why do you see an invalid port? Is the
> hardware broken? Something not correctly configured? Are you just
> papering over the crack?
> 
> > -static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
> > +static int airoha_ppe_foe_entry_prepare(struct airoha_eth *eth,
> > +					struct airoha_foe_entry *hwe,
> >  					struct net_device *dev, int type,
> >  					struct airoha_flow_data *data,
> >  					int l4proto)
> > @@ -224,6 +225,11 @@ static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
> >  	if (dev) {
> >  		struct airoha_gdm_port *port = netdev_priv(dev);
> 
> If port is invalid, is dev also invalid? And if dev is invalid, could
> dereferencing it to get priv cause an opps?

I do not think this is a hw problem. Running bidirectional high load traffic,
I got the sporadic crash reported above. In particular, netfilter runs
airoha_ppe_flow_offload_replace() providing the egress net_device pointer used
in airoha_ppe_foe_entry_prepare(). Debugging with gdb, I discovered the system
crashes dereferencing port pointer in airoha_ppe_foe_entry_prepare() (even if
dev pointer is not NULL). Adding this sanity check makes the system stable.
Please note a similar check is available even in mtk driver [0].

Regards,
Lorenzo

[0] https://github.com/torvalds/linux/blob/master/drivers/net/ethernet/mediatek/mtk_ppe_offload.c#L220

> 
> 	Andrew

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next v2] net: airoha: Validate egress gdm port in airoha_ppe_foe_entry_prepare()
  2025-03-15 14:59   ` Lorenzo Bianconi
@ 2025-03-21 18:18     ` Paolo Abeni
  2025-03-21 18:30       ` Lorenzo Bianconi
  0 siblings, 1 reply; 5+ messages in thread
From: Paolo Abeni @ 2025-03-21 18:18 UTC (permalink / raw)
  To: Lorenzo Bianconi, Andrew Lunn
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	linux-arm-kernel, linux-mediatek, netdev

On 3/15/25 3:59 PM, Lorenzo Bianconi wrote:
>>> Fix the issue validating egress gdm port in airoha_ppe_foe_entry_prepare
>>> routine.
>>
>> A more interesting question is, why do you see an invalid port? Is the
>> hardware broken? Something not correctly configured? Are you just
>> papering over the crack?
>>
>>> -static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
>>> +static int airoha_ppe_foe_entry_prepare(struct airoha_eth *eth,
>>> +					struct airoha_foe_entry *hwe,
>>>  					struct net_device *dev, int type,
>>>  					struct airoha_flow_data *data,
>>>  					int l4proto)
>>> @@ -224,6 +225,11 @@ static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
>>>  	if (dev) {
>>>  		struct airoha_gdm_port *port = netdev_priv(dev);
>>
>> If port is invalid, is dev also invalid? And if dev is invalid, could
>> dereferencing it to get priv cause an opps?
> 
> I do not think this is a hw problem. Running bidirectional high load traffic,
> I got the sporadic crash reported above. In particular, netfilter runs
> airoha_ppe_flow_offload_replace() providing the egress net_device pointer used
> in airoha_ppe_foe_entry_prepare(). Debugging with gdb, I discovered the system
> crashes dereferencing port pointer in airoha_ppe_foe_entry_prepare() (even if
> dev pointer is not NULL). Adding this sanity check makes the system stable.
> Please note a similar check is available even in mtk driver [0].

I agree with Andrew, you need a better understanding of the root cause.
This really looks like papering over some deeper issue.

AFAICS 'dev' is fetched from the airoha driver itself a few lines
before. Possibly you should double check that code.

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next v2] net: airoha: Validate egress gdm port in airoha_ppe_foe_entry_prepare()
  2025-03-21 18:18     ` Paolo Abeni
@ 2025-03-21 18:30       ` Lorenzo Bianconi
  0 siblings, 0 replies; 5+ messages in thread
From: Lorenzo Bianconi @ 2025-03-21 18:30 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Lorenzo Bianconi, Andrew Lunn, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-mediatek,
	netdev

[-- Attachment #1: Type: text/plain, Size: 2262 bytes --]

> On 3/15/25 3:59 PM, Lorenzo Bianconi wrote:
> >>> Fix the issue validating egress gdm port in airoha_ppe_foe_entry_prepare
> >>> routine.
> >>
> >> A more interesting question is, why do you see an invalid port? Is the
> >> hardware broken? Something not correctly configured? Are you just
> >> papering over the crack?
> >>
> >>> -static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
> >>> +static int airoha_ppe_foe_entry_prepare(struct airoha_eth *eth,
> >>> +					struct airoha_foe_entry *hwe,
> >>>  					struct net_device *dev, int type,
> >>>  					struct airoha_flow_data *data,
> >>>  					int l4proto)
> >>> @@ -224,6 +225,11 @@ static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
> >>>  	if (dev) {
> >>>  		struct airoha_gdm_port *port = netdev_priv(dev);
> >>
> >> If port is invalid, is dev also invalid? And if dev is invalid, could
> >> dereferencing it to get priv cause an opps?
> > 
> > I do not think this is a hw problem. Running bidirectional high load traffic,
> > I got the sporadic crash reported above. In particular, netfilter runs
> > airoha_ppe_flow_offload_replace() providing the egress net_device pointer used
> > in airoha_ppe_foe_entry_prepare(). Debugging with gdb, I discovered the system
> > crashes dereferencing port pointer in airoha_ppe_foe_entry_prepare() (even if
> > dev pointer is not NULL). Adding this sanity check makes the system stable.
> > Please note a similar check is available even in mtk driver [0].
> 
> I agree with Andrew, you need a better understanding of the root cause.
> This really looks like papering over some deeper issue.
> 
> AFAICS 'dev' is fetched from the airoha driver itself a few lines
> before. Possibly you should double check that code.

Are you referring to airoha_get_dsa_port() routine?
I think dev pointer in airoha_ppe_foe_entry_prepare() is not strictly
necessary a device from a driver itself since it is an egress device
and the flowtable can contain even a wlan or a vlan device. In this
case airoha_get_dsa_port() will just return the original device pointer
and we can't assume priv pointer points to a airoha_gdm_port struct.
Agree?

Regards,
Lorenzo

> 
> Thanks,
> 
> Paolo
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-03-21 18:30 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-15 12:04 [PATCH net-next v2] net: airoha: Validate egress gdm port in airoha_ppe_foe_entry_prepare() Lorenzo Bianconi
2025-03-15 13:58 ` Andrew Lunn
2025-03-15 14:59   ` Lorenzo Bianconi
2025-03-21 18:18     ` Paolo Abeni
2025-03-21 18:30       ` Lorenzo Bianconi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).