* [PATCH net 0/4] Bug fixes from XDP patch series
@ 2025-04-28 12:04 Meghana Malladi
2025-04-28 12:04 ` [PATCH net 1/4] net: ti: icssg-prueth: Set XDP feature flags for ndev Meghana Malladi
` (4 more replies)
0 siblings, 5 replies; 12+ messages in thread
From: Meghana Malladi @ 2025-04-28 12:04 UTC (permalink / raw)
To: dan.carpenter, m-malladi, john.fastabend, hawk, daniel, ast,
pabeni, kuba, edumazet, davem, andrew+netdev
Cc: bpf, linux-kernel, netdev, linux-arm-kernel, srk,
Vignesh Raghavendra, Roger Quadros, danishanwar
This patch series fixes the bugs introduced while adding
xdp support in the icssg driver, and were reproduced while
running xdp-trafficgen to generate xdp traffic on icssg interfaces.
Meghana Malladi (4):
net: ti: icssg-prueth: Set XDP feature flags for ndev
net: ti: icssg-prueth: Report BQL before sending XDP packets
net: ti: icssg-prueth: Fix race condition for traffic from different
network sockets
net: ti: icssg-prueth: Fix kernel panic during concurrent Tx queue
access
drivers/net/ethernet/ti/icssg/icssg_common.c | 22 ++++++++++++++++++--
drivers/net/ethernet/ti/icssg/icssg_prueth.c | 16 ++++++++------
drivers/net/ethernet/ti/icssg/icssg_prueth.h | 1 +
3 files changed, 31 insertions(+), 8 deletions(-)
base-commit: cc17b4b9c332cc654b248619c1d8ca40d80d1155
--
2.43.0
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net 1/4] net: ti: icssg-prueth: Set XDP feature flags for ndev
2025-04-28 12:04 [PATCH net 0/4] Bug fixes from XDP patch series Meghana Malladi
@ 2025-04-28 12:04 ` Meghana Malladi
2025-04-28 12:04 ` [PATCH net 2/4] net: ti: icssg-prueth: Report BQL before sending XDP packets Meghana Malladi
` (3 subsequent siblings)
4 siblings, 0 replies; 12+ messages in thread
From: Meghana Malladi @ 2025-04-28 12:04 UTC (permalink / raw)
To: dan.carpenter, m-malladi, john.fastabend, hawk, daniel, ast,
pabeni, kuba, edumazet, davem, andrew+netdev
Cc: bpf, linux-kernel, netdev, linux-arm-kernel, srk,
Vignesh Raghavendra, Roger Quadros, danishanwar
xdp_features demonstrates what all XDP capabilities are supported
on a given network device. The driver needs to set these xdp_features
flag to let the network stack know what XDP features a given driver
is supporting. These flags need to be set for a given ndev irrespective
of any XDP program being loaded or not.
Fixes: 62aa3246f462 ("net: ti: icssg-prueth: Add XDP support")
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
---
drivers/net/ethernet/ti/icssg/icssg_prueth.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.c b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
index 443f90fa6557..ee35fecf61e7 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_prueth.c
+++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
@@ -1109,11 +1109,6 @@ static int emac_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frame
static int emac_xdp_setup(struct prueth_emac *emac, struct netdev_bpf *bpf)
{
struct bpf_prog *prog = bpf->prog;
- xdp_features_t val;
-
- val = NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT |
- NETDEV_XDP_ACT_NDO_XMIT;
- xdp_set_features_flag(emac->ndev, val);
if (!emac->xdpi.prog && !prog)
return 0;
@@ -1291,6 +1286,10 @@ static int prueth_netdev_init(struct prueth *prueth,
ndev->hw_features = NETIF_F_SG;
ndev->features = ndev->hw_features | NETIF_F_HW_VLAN_CTAG_FILTER;
ndev->hw_features |= NETIF_PRUETH_HSR_OFFLOAD_FEATURES;
+ xdp_set_features_flag(ndev,
+ NETDEV_XDP_ACT_BASIC |
+ NETDEV_XDP_ACT_REDIRECT |
+ NETDEV_XDP_ACT_NDO_XMIT);
netif_napi_add(ndev, &emac->napi_rx, icssg_napi_rx_poll);
hrtimer_setup(&emac->rx_hrtimer, &emac_rx_timer_callback, CLOCK_MONOTONIC,
--
2.43.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net 2/4] net: ti: icssg-prueth: Report BQL before sending XDP packets
2025-04-28 12:04 [PATCH net 0/4] Bug fixes from XDP patch series Meghana Malladi
2025-04-28 12:04 ` [PATCH net 1/4] net: ti: icssg-prueth: Set XDP feature flags for ndev Meghana Malladi
@ 2025-04-28 12:04 ` Meghana Malladi
2025-05-01 14:53 ` Jakub Kicinski
2025-04-28 12:04 ` [PATCH net 3/4] net: ti: icssg-prueth: Fix race condition for traffic from different network sockets Meghana Malladi
` (2 subsequent siblings)
4 siblings, 1 reply; 12+ messages in thread
From: Meghana Malladi @ 2025-04-28 12:04 UTC (permalink / raw)
To: dan.carpenter, m-malladi, john.fastabend, hawk, daniel, ast,
pabeni, kuba, edumazet, davem, andrew+netdev
Cc: bpf, linux-kernel, netdev, linux-arm-kernel, srk,
Vignesh Raghavendra, Roger Quadros, danishanwar
When sending out any kind of traffic, it is essential that the driver
keeps reporting BQL of the number of bytes that have been sent so that
BQL can track the amount of data in the queue and prevents it from
overflowing. If BQL is not reported, the driver may continue sending
packets even when the queue is full, leading to packet loss, congestion
and decreased network performance. Currently this is missing in
emac_xmit_xdp_frame() and this patch fixes it.
Fixes: 62aa3246f462 ("net: ti: icssg-prueth: Add XDP support")
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
---
drivers/net/ethernet/ti/icssg/icssg_common.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/ti/icssg/icssg_common.c b/drivers/net/ethernet/ti/icssg/icssg_common.c
index b4be76e13a2f..4f45f2b6b67f 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_common.c
+++ b/drivers/net/ethernet/ti/icssg/icssg_common.c
@@ -187,7 +187,6 @@ int emac_tx_complete_packets(struct prueth_emac *emac, int chn,
xdp_return_frame(xdpf);
break;
default:
- netdev_err(ndev, "tx_complete: invalid swdata type %d\n", swdata->type);
prueth_xmit_free(tx_chn, desc_tx);
ndev->stats.tx_dropped++;
continue;
@@ -567,6 +566,7 @@ u32 emac_xmit_xdp_frame(struct prueth_emac *emac,
{
struct cppi5_host_desc_t *first_desc;
struct net_device *ndev = emac->ndev;
+ struct netdev_queue *netif_txq;
struct prueth_tx_chn *tx_chn;
dma_addr_t desc_dma, buf_dma;
struct prueth_swdata *swdata;
@@ -620,12 +620,17 @@ u32 emac_xmit_xdp_frame(struct prueth_emac *emac,
swdata->data.xdpf = xdpf;
}
+ /* Report BQL before sending the packet */
+ netif_txq = netdev_get_tx_queue(ndev, tx_chn->id);
+ netdev_tx_sent_queue(netif_txq, xdpf->len);
+
cppi5_hdesc_set_pktlen(first_desc, xdpf->len);
desc_dma = k3_cppi_desc_pool_virt2dma(tx_chn->desc_pool, first_desc);
ret = k3_udma_glue_push_tx_chn(tx_chn->tx_chn, first_desc, desc_dma);
if (ret) {
netdev_err(ndev, "xdp tx: push failed: %d\n", ret);
+ netdev_tx_completed_queue(netif_txq, 1, xdpf->len);
goto drop_free_descs;
}
@@ -979,6 +984,7 @@ enum netdev_tx icssg_ndo_start_xmit(struct sk_buff *skb, struct net_device *ndev
ret = k3_udma_glue_push_tx_chn(tx_chn->tx_chn, first_desc, desc_dma);
if (ret) {
netdev_err(ndev, "tx: push failed: %d\n", ret);
+ netdev_tx_completed_queue(netif_txq, 1, pkt_len);
goto drop_free_descs;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net 3/4] net: ti: icssg-prueth: Fix race condition for traffic from different network sockets
2025-04-28 12:04 [PATCH net 0/4] Bug fixes from XDP patch series Meghana Malladi
2025-04-28 12:04 ` [PATCH net 1/4] net: ti: icssg-prueth: Set XDP feature flags for ndev Meghana Malladi
2025-04-28 12:04 ` [PATCH net 2/4] net: ti: icssg-prueth: Report BQL before sending XDP packets Meghana Malladi
@ 2025-04-28 12:04 ` Meghana Malladi
2025-05-01 14:56 ` Jakub Kicinski
2025-04-28 12:04 ` [PATCH net 4/4] net: ti: icssg-prueth: Fix kernel panic during concurrent Tx queue access Meghana Malladi
2025-05-02 7:16 ` [PATCH net 0/4] Bug fixes from XDP patch series Jesper Dangaard Brouer
4 siblings, 1 reply; 12+ messages in thread
From: Meghana Malladi @ 2025-04-28 12:04 UTC (permalink / raw)
To: dan.carpenter, m-malladi, john.fastabend, hawk, daniel, ast,
pabeni, kuba, edumazet, davem, andrew+netdev
Cc: bpf, linux-kernel, netdev, linux-arm-kernel, srk,
Vignesh Raghavendra, Roger Quadros, danishanwar
When dealing with transmitting traffic from different network
sockets to a single Tx channel, freeing the DMA descriptors can lead
to kernel panic with the following error:
[ 394.602494] ------------[ cut here ]------------
[ 394.607134] kernel BUG at lib/genalloc.c:508!
[ 394.611485] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
logs: https://gist.github.com/MeghanaMalladiTI/ad1d1da3b6e966bc6962c105c0b1d0b6
The above error was reproduced when sending XDP traffic from XSK
socket along with network traffic from BSD socket. This causes
a race condition leading to corrupted DMA descriptors. Fix this
by adding spinlock protection while accessing the DMA descriptors
of a Tx ring.
Fixes: 62aa3246f462 ("net: ti: icssg-prueth: Add XDP support")
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
---
drivers/net/ethernet/ti/icssg/icssg_common.c | 7 +++++++
drivers/net/ethernet/ti/icssg/icssg_prueth.h | 1 +
2 files changed, 8 insertions(+)
diff --git a/drivers/net/ethernet/ti/icssg/icssg_common.c b/drivers/net/ethernet/ti/icssg/icssg_common.c
index 4f45f2b6b67f..a120ff6fec8f 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_common.c
+++ b/drivers/net/ethernet/ti/icssg/icssg_common.c
@@ -157,7 +157,9 @@ int emac_tx_complete_packets(struct prueth_emac *emac, int chn,
tx_chn = &emac->tx_chns[chn];
while (true) {
+ spin_lock(&tx_chn->lock);
res = k3_udma_glue_pop_tx_chn(tx_chn->tx_chn, &desc_dma);
+ spin_unlock(&tx_chn->lock);
if (res == -ENODATA)
break;
@@ -325,6 +327,7 @@ int prueth_init_tx_chns(struct prueth_emac *emac)
snprintf(tx_chn->name, sizeof(tx_chn->name),
"tx%d-%d", slice, i);
+ spin_lock_init(&tx_chn->lock);
tx_chn->emac = emac;
tx_chn->id = i;
tx_chn->descs_num = PRUETH_MAX_TX_DESC;
@@ -627,7 +630,9 @@ u32 emac_xmit_xdp_frame(struct prueth_emac *emac,
cppi5_hdesc_set_pktlen(first_desc, xdpf->len);
desc_dma = k3_cppi_desc_pool_virt2dma(tx_chn->desc_pool, first_desc);
+ spin_lock_bh(&tx_chn->lock);
ret = k3_udma_glue_push_tx_chn(tx_chn->tx_chn, first_desc, desc_dma);
+ spin_unlock_bh(&tx_chn->lock);
if (ret) {
netdev_err(ndev, "xdp tx: push failed: %d\n", ret);
netdev_tx_completed_queue(netif_txq, 1, xdpf->len);
@@ -981,7 +986,9 @@ enum netdev_tx icssg_ndo_start_xmit(struct sk_buff *skb, struct net_device *ndev
/* cppi5_desc_dump(first_desc, 64); */
skb_tx_timestamp(skb); /* SW timestamp if SKBTX_IN_PROGRESS not set */
+ spin_lock_bh(&tx_chn->lock);
ret = k3_udma_glue_push_tx_chn(tx_chn->tx_chn, first_desc, desc_dma);
+ spin_unlock_bh(&tx_chn->lock);
if (ret) {
netdev_err(ndev, "tx: push failed: %d\n", ret);
netdev_tx_completed_queue(netif_txq, 1, pkt_len);
diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.h b/drivers/net/ethernet/ti/icssg/icssg_prueth.h
index b6be4aa57a61..4e5354c2866a 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_prueth.h
+++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.h
@@ -119,6 +119,7 @@ struct prueth_tx_chn {
struct k3_cppi_desc_pool *desc_pool;
struct k3_udma_glue_tx_channel *tx_chn;
struct prueth_emac *emac;
+ spinlock_t lock; /* protect TX rings in multi-port mode */
u32 id;
u32 descs_num;
unsigned int irq;
--
2.43.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net 4/4] net: ti: icssg-prueth: Fix kernel panic during concurrent Tx queue access
2025-04-28 12:04 [PATCH net 0/4] Bug fixes from XDP patch series Meghana Malladi
` (2 preceding siblings ...)
2025-04-28 12:04 ` [PATCH net 3/4] net: ti: icssg-prueth: Fix race condition for traffic from different network sockets Meghana Malladi
@ 2025-04-28 12:04 ` Meghana Malladi
2025-05-02 7:14 ` Jesper Dangaard Brouer
2025-05-02 7:16 ` [PATCH net 0/4] Bug fixes from XDP patch series Jesper Dangaard Brouer
4 siblings, 1 reply; 12+ messages in thread
From: Meghana Malladi @ 2025-04-28 12:04 UTC (permalink / raw)
To: dan.carpenter, m-malladi, john.fastabend, hawk, daniel, ast,
pabeni, kuba, edumazet, davem, andrew+netdev
Cc: bpf, linux-kernel, netdev, linux-arm-kernel, srk,
Vignesh Raghavendra, Roger Quadros, danishanwar
Add __netif_tx_lock() to ensure that only one packet is being
transmitted at a time to avoid race conditions in the netif_txq
struct and prevent packet data corruption. Failing to do so causes
kernel panic with the following error:
[ 2184.746764] ------------[ cut here ]------------
[ 2184.751412] kernel BUG at lib/dynamic_queue_limits.c:99!
[ 2184.756728] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
logs: https://gist.github.com/MeghanaMalladiTI/9c7aa5fc3b7fb03f87c74aad487956e9
The lock is acquired before calling emac_xmit_xdp_frame() and released after the
call returns. This ensures that the TX queue is protected from concurrent access
during the transmission of XDP frames.
Fixes: 62aa3246f462 ("net: ti: icssg-prueth: Add XDP support")
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
---
drivers/net/ethernet/ti/icssg/icssg_common.c | 7 ++++++-
drivers/net/ethernet/ti/icssg/icssg_prueth.c | 7 ++++++-
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/ti/icssg/icssg_common.c b/drivers/net/ethernet/ti/icssg/icssg_common.c
index a120ff6fec8f..e509b6ff81e7 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_common.c
+++ b/drivers/net/ethernet/ti/icssg/icssg_common.c
@@ -660,6 +660,8 @@ static u32 emac_run_xdp(struct prueth_emac *emac, struct xdp_buff *xdp,
struct page *page, u32 *len)
{
struct net_device *ndev = emac->ndev;
+ struct netdev_queue *netif_txq;
+ int cpu = smp_processor_id();
struct bpf_prog *xdp_prog;
struct xdp_frame *xdpf;
u32 pkt_len = *len;
@@ -679,8 +681,11 @@ static u32 emac_run_xdp(struct prueth_emac *emac, struct xdp_buff *xdp,
goto drop;
}
- q_idx = smp_processor_id() % emac->tx_ch_num;
+ q_idx = cpu % emac->tx_ch_num;
+ netif_txq = netdev_get_tx_queue(ndev, q_idx);
+ __netif_tx_lock(netif_txq, cpu);
result = emac_xmit_xdp_frame(emac, xdpf, page, q_idx);
+ __netif_tx_unlock(netif_txq);
if (result == ICSSG_XDP_CONSUMED) {
ndev->stats.tx_dropped++;
goto drop;
diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.c b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
index ee35fecf61e7..b31060e7f698 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_prueth.c
+++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
@@ -1075,20 +1075,25 @@ static int emac_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frame
{
struct prueth_emac *emac = netdev_priv(dev);
struct net_device *ndev = emac->ndev;
+ struct netdev_queue *netif_txq;
+ int cpu = smp_processor_id();
struct xdp_frame *xdpf;
unsigned int q_idx;
int nxmit = 0;
u32 err;
int i;
- q_idx = smp_processor_id() % emac->tx_ch_num;
+ q_idx = cpu % emac->tx_ch_num;
+ netif_txq = netdev_get_tx_queue(ndev, q_idx);
if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
return -EINVAL;
for (i = 0; i < n; i++) {
xdpf = frames[i];
+ __netif_tx_lock(netif_txq, cpu);
err = emac_xmit_xdp_frame(emac, xdpf, NULL, q_idx);
+ __netif_tx_unlock(netif_txq);
if (err != ICSSG_XDP_TX) {
ndev->stats.tx_dropped++;
break;
--
2.43.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net 2/4] net: ti: icssg-prueth: Report BQL before sending XDP packets
2025-04-28 12:04 ` [PATCH net 2/4] net: ti: icssg-prueth: Report BQL before sending XDP packets Meghana Malladi
@ 2025-05-01 14:53 ` Jakub Kicinski
2025-05-02 6:18 ` Malladi, Meghana
0 siblings, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2025-05-01 14:53 UTC (permalink / raw)
To: Meghana Malladi
Cc: dan.carpenter, john.fastabend, hawk, daniel, ast, pabeni,
edumazet, davem, andrew+netdev, bpf, linux-kernel, netdev,
linux-arm-kernel, srk, Vignesh Raghavendra, Roger Quadros,
danishanwar
On Mon, 28 Apr 2025 17:34:57 +0530 Meghana Malladi wrote:
> When sending out any kind of traffic, it is essential that the driver
> keeps reporting BQL of the number of bytes that have been sent so that
> BQL can track the amount of data in the queue and prevents it from
> overflowing. If BQL is not reported, the driver may continue sending
> packets even when the queue is full, leading to packet loss, congestion
> and decreased network performance. Currently this is missing in
> emac_xmit_xdp_frame() and this patch fixes it.
The ordering of patches in the series is a bit off.
The order should be something like:
net: ti: icssg-prueth: Set XDP feature flags for ndev
net: ti: icssg-prueth: Fix kernel panic during concurrent Tx queue ...
net: ti: icssg-prueth: Fix race condition for traffic from different ...
net: ti: icssg-prueth: Report BQL before sending XDP packets
This patch is not correct without the extra locking in place.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net 3/4] net: ti: icssg-prueth: Fix race condition for traffic from different network sockets
2025-04-28 12:04 ` [PATCH net 3/4] net: ti: icssg-prueth: Fix race condition for traffic from different network sockets Meghana Malladi
@ 2025-05-01 14:56 ` Jakub Kicinski
2025-05-02 9:31 ` Malladi, Meghana
0 siblings, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2025-05-01 14:56 UTC (permalink / raw)
To: Meghana Malladi
Cc: dan.carpenter, john.fastabend, hawk, daniel, ast, pabeni,
edumazet, davem, andrew+netdev, bpf, linux-kernel, netdev,
linux-arm-kernel, srk, Vignesh Raghavendra, Roger Quadros,
danishanwar
On Mon, 28 Apr 2025 17:34:58 +0530 Meghana Malladi wrote:
> When dealing with transmitting traffic from different network
> sockets to a single Tx channel, freeing the DMA descriptors can lead
> to kernel panic with the following error:
>
> [ 394.602494] ------------[ cut here ]------------
> [ 394.607134] kernel BUG at lib/genalloc.c:508!
> [ 394.611485] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
>
> logs: https://gist.github.com/MeghanaMalladiTI/ad1d1da3b6e966bc6962c105c0b1d0b6
>
> The above error was reproduced when sending XDP traffic from XSK
> socket along with network traffic from BSD socket. This causes
> a race condition leading to corrupted DMA descriptors. Fix this
> by adding spinlock protection while accessing the DMA descriptors
> of a Tx ring.
IDK how XSK vs normal sockets matters after what is now patch 4.
The only possible race you may be protecting against is pushing
work vs completion. Please double check this is even needed,
and if so fix the commit msg.
> Fixes: 62aa3246f462 ("net: ti: icssg-prueth: Add XDP support")
> Signed-off-by: Meghana Malladi <m-malladi@ti.com>
> ---
> drivers/net/ethernet/ti/icssg/icssg_common.c | 7 +++++++
> drivers/net/ethernet/ti/icssg/icssg_prueth.h | 1 +
> 2 files changed, 8 insertions(+)
>
> diff --git a/drivers/net/ethernet/ti/icssg/icssg_common.c b/drivers/net/ethernet/ti/icssg/icssg_common.c
> index 4f45f2b6b67f..a120ff6fec8f 100644
> --- a/drivers/net/ethernet/ti/icssg/icssg_common.c
> +++ b/drivers/net/ethernet/ti/icssg/icssg_common.c
> @@ -157,7 +157,9 @@ int emac_tx_complete_packets(struct prueth_emac *emac, int chn,
> tx_chn = &emac->tx_chns[chn];
>
> while (true) {
> + spin_lock(&tx_chn->lock);
> res = k3_udma_glue_pop_tx_chn(tx_chn->tx_chn, &desc_dma);
> + spin_unlock(&tx_chn->lock);
> if (res == -ENODATA)
> break;
>
> @@ -325,6 +327,7 @@ int prueth_init_tx_chns(struct prueth_emac *emac)
> snprintf(tx_chn->name, sizeof(tx_chn->name),
> "tx%d-%d", slice, i);
>
> + spin_lock_init(&tx_chn->lock);
> tx_chn->emac = emac;
> tx_chn->id = i;
> tx_chn->descs_num = PRUETH_MAX_TX_DESC;
> @@ -627,7 +630,9 @@ u32 emac_xmit_xdp_frame(struct prueth_emac *emac,
> cppi5_hdesc_set_pktlen(first_desc, xdpf->len);
> desc_dma = k3_cppi_desc_pool_virt2dma(tx_chn->desc_pool, first_desc);
>
> + spin_lock_bh(&tx_chn->lock);
> ret = k3_udma_glue_push_tx_chn(tx_chn->tx_chn, first_desc, desc_dma);
> + spin_unlock_bh(&tx_chn->lock);
I'm afraid this needs to be some form of spin_lock_irq
The completions may run from hard irq context when netpoll/netconsole
is used.
--
pw-bot: cr
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net 2/4] net: ti: icssg-prueth: Report BQL before sending XDP packets
2025-05-01 14:53 ` Jakub Kicinski
@ 2025-05-02 6:18 ` Malladi, Meghana
0 siblings, 0 replies; 12+ messages in thread
From: Malladi, Meghana @ 2025-05-02 6:18 UTC (permalink / raw)
To: Jakub Kicinski
Cc: dan.carpenter, john.fastabend, hawk, daniel, ast, pabeni,
edumazet, davem, andrew+netdev, bpf, linux-kernel, netdev,
linux-arm-kernel, srk, Vignesh Raghavendra, Roger Quadros,
danishanwar
Hi Jakub,
On 5/1/2025 8:23 PM, Jakub Kicinski wrote:
> On Mon, 28 Apr 2025 17:34:57 +0530 Meghana Malladi wrote:
>> When sending out any kind of traffic, it is essential that the driver
>> keeps reporting BQL of the number of bytes that have been sent so that
>> BQL can track the amount of data in the queue and prevents it from
>> overflowing. If BQL is not reported, the driver may continue sending
>> packets even when the queue is full, leading to packet loss, congestion
>> and decreased network performance. Currently this is missing in
>> emac_xmit_xdp_frame() and this patch fixes it.
>
> The ordering of patches in the series is a bit off.
> The order should be something like:
>
> net: ti: icssg-prueth: Set XDP feature flags for ndev
> net: ti: icssg-prueth: Fix kernel panic during concurrent Tx queue ...
> net: ti: icssg-prueth: Fix race condition for traffic from different ...
> net: ti: icssg-prueth: Report BQL before sending XDP packets
>
> This patch is not correct without the extra locking in place.
Actually the order of bug fixes which I posted, is the order in which I
fixed the bugs, while running xdp-trafficgen:
1. ndo_xdp_xmit() wasn't getting called because of missing XDP feature
flags check.
2. kernel crash: kernel BUG at lib/dynamic_queue_limits.c:99! and was
fixed by reporting BQL for packet transmission.My bad, I forgot to add
this in the commit message.
3. kernel crash: kernel BUG at lib/genalloc.c:508! due to memory
corruption in DMA descriptors and fixed it with spinlock
4. kernel crash: kernel BUG at lib/dynamic_queue_limits.c:99! due to
race condition in netif_txq and fixed it with __netif_tx_lock()
But yeah I think the order you suggested makes more sense. Let me try
that and post v2.
--
Thanks,
Meghana Malladi
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net 4/4] net: ti: icssg-prueth: Fix kernel panic during concurrent Tx queue access
2025-04-28 12:04 ` [PATCH net 4/4] net: ti: icssg-prueth: Fix kernel panic during concurrent Tx queue access Meghana Malladi
@ 2025-05-02 7:14 ` Jesper Dangaard Brouer
2025-05-02 9:07 ` Malladi, Meghana
0 siblings, 1 reply; 12+ messages in thread
From: Jesper Dangaard Brouer @ 2025-05-02 7:14 UTC (permalink / raw)
To: Meghana Malladi, dan.carpenter, john.fastabend, daniel, ast,
pabeni, kuba, edumazet, davem, andrew+netdev
Cc: bpf, linux-kernel, netdev, linux-arm-kernel, srk,
Vignesh Raghavendra, Roger Quadros, danishanwar
On 28/04/2025 14.04, Meghana Malladi wrote:
> Add __netif_tx_lock() to ensure that only one packet is being
> transmitted at a time to avoid race conditions in the netif_txq
> struct and prevent packet data corruption. Failing to do so causes
> kernel panic with the following error:
>
> [ 2184.746764] ------------[ cut here ]------------
> [ 2184.751412] kernel BUG at lib/dynamic_queue_limits.c:99!
> [ 2184.756728] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
>
> logs: https://gist.github.com/MeghanaMalladiTI/9c7aa5fc3b7fb03f87c74aad487956e9
>
> The lock is acquired before calling emac_xmit_xdp_frame() and released after the
> call returns. This ensures that the TX queue is protected from concurrent access
> during the transmission of XDP frames.
>
> Fixes: 62aa3246f462 ("net: ti: icssg-prueth: Add XDP support")
> Signed-off-by: Meghana Malladi <m-malladi@ti.com>
> ---
> drivers/net/ethernet/ti/icssg/icssg_common.c | 7 ++++++-
> drivers/net/ethernet/ti/icssg/icssg_prueth.c | 7 ++++++-
> 2 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/ti/icssg/icssg_common.c b/drivers/net/ethernet/ti/icssg/icssg_common.c
> index a120ff6fec8f..e509b6ff81e7 100644
> --- a/drivers/net/ethernet/ti/icssg/icssg_common.c
> +++ b/drivers/net/ethernet/ti/icssg/icssg_common.c
> @@ -660,6 +660,8 @@ static u32 emac_run_xdp(struct prueth_emac *emac, struct xdp_buff *xdp,
> struct page *page, u32 *len)
> {
> struct net_device *ndev = emac->ndev;
> + struct netdev_queue *netif_txq;
> + int cpu = smp_processor_id();
> struct bpf_prog *xdp_prog;
> struct xdp_frame *xdpf;
> u32 pkt_len = *len;
> @@ -679,8 +681,11 @@ static u32 emac_run_xdp(struct prueth_emac *emac, struct xdp_buff *xdp,
> goto drop;
> }
>
> - q_idx = smp_processor_id() % emac->tx_ch_num;
> + q_idx = cpu % emac->tx_ch_num;
> + netif_txq = netdev_get_tx_queue(ndev, q_idx);
> + __netif_tx_lock(netif_txq, cpu);
> result = emac_xmit_xdp_frame(emac, xdpf, page, q_idx);
> + __netif_tx_unlock(netif_txq);
> if (result == ICSSG_XDP_CONSUMED) {
> ndev->stats.tx_dropped++;
> goto drop;
> diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.c b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
> index ee35fecf61e7..b31060e7f698 100644
> --- a/drivers/net/ethernet/ti/icssg/icssg_prueth.c
> +++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
> @@ -1075,20 +1075,25 @@ static int emac_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frame
> {
> struct prueth_emac *emac = netdev_priv(dev);
> struct net_device *ndev = emac->ndev;
> + struct netdev_queue *netif_txq;
> + int cpu = smp_processor_id();
> struct xdp_frame *xdpf;
> unsigned int q_idx;
> int nxmit = 0;
> u32 err;
> int i;
>
> - q_idx = smp_processor_id() % emac->tx_ch_num;
> + q_idx = cpu % emac->tx_ch_num;
> + netif_txq = netdev_get_tx_queue(ndev, q_idx);
>
> if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
> return -EINVAL;
>
> for (i = 0; i < n; i++) {
> xdpf = frames[i];
> + __netif_tx_lock(netif_txq, cpu);
> err = emac_xmit_xdp_frame(emac, xdpf, NULL, q_idx);
> + __netif_tx_unlock(netif_txq);
Why are you taking and releasing this lock in a loop?
XDP gain performance by sending a batch of 'n' packets.
This approach looks like a performance killer.
> if (err != ICSSG_XDP_TX) {
> ndev->stats.tx_dropped++;
> break;
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net 0/4] Bug fixes from XDP patch series
2025-04-28 12:04 [PATCH net 0/4] Bug fixes from XDP patch series Meghana Malladi
` (3 preceding siblings ...)
2025-04-28 12:04 ` [PATCH net 4/4] net: ti: icssg-prueth: Fix kernel panic during concurrent Tx queue access Meghana Malladi
@ 2025-05-02 7:16 ` Jesper Dangaard Brouer
4 siblings, 0 replies; 12+ messages in thread
From: Jesper Dangaard Brouer @ 2025-05-02 7:16 UTC (permalink / raw)
To: Meghana Malladi, dan.carpenter, john.fastabend, daniel, ast,
pabeni, kuba, edumazet, davem, andrew+netdev
Cc: bpf, linux-kernel, netdev, linux-arm-kernel, srk,
Vignesh Raghavendra, Roger Quadros, danishanwar
On 28/04/2025 14.04, Meghana Malladi wrote:
> This patch series fixes the bugs introduced while adding
> xdp support in the icssg driver, and were reproduced while
> running xdp-trafficgen to generate xdp traffic on icssg interfaces.
>
This patchset clearly hurts performance by introducing a lot of locking.
Can you please report what the performance regressions this is introducing?
> Meghana Malladi (4):
> net: ti: icssg-prueth: Set XDP feature flags for ndev
> net: ti: icssg-prueth: Report BQL before sending XDP packets
> net: ti: icssg-prueth: Fix race condition for traffic from different
> network sockets
> net: ti: icssg-prueth: Fix kernel panic during concurrent Tx queue
> access
>
> drivers/net/ethernet/ti/icssg/icssg_common.c | 22 ++++++++++++++++++--
> drivers/net/ethernet/ti/icssg/icssg_prueth.c | 16 ++++++++------
> drivers/net/ethernet/ti/icssg/icssg_prueth.h | 1 +
> 3 files changed, 31 insertions(+), 8 deletions(-)
>
>
> base-commit: cc17b4b9c332cc654b248619c1d8ca40d80d1155
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net 4/4] net: ti: icssg-prueth: Fix kernel panic during concurrent Tx queue access
2025-05-02 7:14 ` Jesper Dangaard Brouer
@ 2025-05-02 9:07 ` Malladi, Meghana
0 siblings, 0 replies; 12+ messages in thread
From: Malladi, Meghana @ 2025-05-02 9:07 UTC (permalink / raw)
To: Jesper Dangaard Brouer, dan.carpenter, john.fastabend, daniel,
ast, pabeni, kuba, edumazet, davem, andrew+netdev
Cc: bpf, linux-kernel, netdev, linux-arm-kernel, srk,
Vignesh Raghavendra, Roger Quadros, danishanwar
Hi Jesper,
On 5/2/2025 12:44 PM, Jesper Dangaard Brouer wrote:
>
>
> On 28/04/2025 14.04, Meghana Malladi wrote:
>> Add __netif_tx_lock() to ensure that only one packet is being
>> transmitted at a time to avoid race conditions in the netif_txq
>> struct and prevent packet data corruption. Failing to do so causes
>> kernel panic with the following error:
>>
>> [ 2184.746764] ------------[ cut here ]------------
>> [ 2184.751412] kernel BUG at lib/dynamic_queue_limits.c:99!
>> [ 2184.756728] Internal error: Oops - BUG: 00000000f2000800 [#1]
>> PREEMPT SMP
>>
>> logs: https://gist.github.com/
>> MeghanaMalladiTI/9c7aa5fc3b7fb03f87c74aad487956e9
>>
>> The lock is acquired before calling emac_xmit_xdp_frame() and released
>> after the
>> call returns. This ensures that the TX queue is protected from
>> concurrent access
>> during the transmission of XDP frames.
>>
>> Fixes: 62aa3246f462 ("net: ti: icssg-prueth: Add XDP support")
>> Signed-off-by: Meghana Malladi <m-malladi@ti.com>
>> ---
>> drivers/net/ethernet/ti/icssg/icssg_common.c | 7 ++++++-
>> drivers/net/ethernet/ti/icssg/icssg_prueth.c | 7 ++++++-
>> 2 files changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/ti/icssg/icssg_common.c b/drivers/
>> net/ethernet/ti/icssg/icssg_common.c
>> index a120ff6fec8f..e509b6ff81e7 100644
>> --- a/drivers/net/ethernet/ti/icssg/icssg_common.c
>> +++ b/drivers/net/ethernet/ti/icssg/icssg_common.c
>> @@ -660,6 +660,8 @@ static u32 emac_run_xdp(struct prueth_emac *emac,
>> struct xdp_buff *xdp,
>> struct page *page, u32 *len)
>> {
>> struct net_device *ndev = emac->ndev;
>> + struct netdev_queue *netif_txq;
>> + int cpu = smp_processor_id();
>> struct bpf_prog *xdp_prog;
>> struct xdp_frame *xdpf;
>> u32 pkt_len = *len;
>> @@ -679,8 +681,11 @@ static u32 emac_run_xdp(struct prueth_emac *emac,
>> struct xdp_buff *xdp,
>> goto drop;
>> }
>> - q_idx = smp_processor_id() % emac->tx_ch_num;
>> + q_idx = cpu % emac->tx_ch_num;
>> + netif_txq = netdev_get_tx_queue(ndev, q_idx);
>> + __netif_tx_lock(netif_txq, cpu);
>> result = emac_xmit_xdp_frame(emac, xdpf, page, q_idx);
>> + __netif_tx_unlock(netif_txq);
>> if (result == ICSSG_XDP_CONSUMED) {
>> ndev->stats.tx_dropped++;
>> goto drop;
>> diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.c b/drivers/
>> net/ethernet/ti/icssg/icssg_prueth.c
>> index ee35fecf61e7..b31060e7f698 100644
>> --- a/drivers/net/ethernet/ti/icssg/icssg_prueth.c
>> +++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
>> @@ -1075,20 +1075,25 @@ static int emac_xdp_xmit(struct net_device
>> *dev, int n, struct xdp_frame **frame
>> {
>> struct prueth_emac *emac = netdev_priv(dev);
>> struct net_device *ndev = emac->ndev;
>> + struct netdev_queue *netif_txq;
>> + int cpu = smp_processor_id();
>> struct xdp_frame *xdpf;
>> unsigned int q_idx;
>> int nxmit = 0;
>> u32 err;
>> int i;
>> - q_idx = smp_processor_id() % emac->tx_ch_num;
>> + q_idx = cpu % emac->tx_ch_num;
>> + netif_txq = netdev_get_tx_queue(ndev, q_idx);
>> if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
>> return -EINVAL;
>> for (i = 0; i < n; i++) {
>> xdpf = frames[i];
>> + __netif_tx_lock(netif_txq, cpu);
>> err = emac_xmit_xdp_frame(emac, xdpf, NULL, q_idx);
>> + __netif_tx_unlock(netif_txq);
>
> Why are you taking and releasing this lock in a loop?
>
> XDP gain performance by sending a batch of 'n' packets.
> This approach looks like a performance killer.
>
Yes, I agree with you. This wasn't the intended change. Thank you for
pointing this out. The lock and unlock should happen outside the loop.
Will fix this in v2.
>
>> if (err != ICSSG_XDP_TX) {
>> ndev->stats.tx_dropped++;
>> break;
>
>
--
Thanks,
Meghana Malladi
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net 3/4] net: ti: icssg-prueth: Fix race condition for traffic from different network sockets
2025-05-01 14:56 ` Jakub Kicinski
@ 2025-05-02 9:31 ` Malladi, Meghana
0 siblings, 0 replies; 12+ messages in thread
From: Malladi, Meghana @ 2025-05-02 9:31 UTC (permalink / raw)
To: Jakub Kicinski
Cc: dan.carpenter, john.fastabend, hawk, daniel, ast, pabeni,
edumazet, davem, andrew+netdev, bpf, linux-kernel, netdev,
linux-arm-kernel, srk, Vignesh Raghavendra, Roger Quadros,
danishanwar
Hi Jakub,
On 5/1/2025 8:26 PM, Jakub Kicinski wrote:
> On Mon, 28 Apr 2025 17:34:58 +0530 Meghana Malladi wrote:
>> When dealing with transmitting traffic from different network
>> sockets to a single Tx channel, freeing the DMA descriptors can lead
>> to kernel panic with the following error:
>>
>> [ 394.602494] ------------[ cut here ]------------
>> [ 394.607134] kernel BUG at lib/genalloc.c:508!
>> [ 394.611485] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
>>
>> logs: https://gist.github.com/MeghanaMalladiTI/ad1d1da3b6e966bc6962c105c0b1d0b6
>>
>> The above error was reproduced when sending XDP traffic from XSK
>> socket along with network traffic from BSD socket. This causes
>> a race condition leading to corrupted DMA descriptors. Fix this
>> by adding spinlock protection while accessing the DMA descriptors
>> of a Tx ring.
>
> IDK how XSK vs normal sockets matters after what is now patch 4.
> The only possible race you may be protecting against is pushing
> work vs completion. Please double check this is even needed,
> and if so fix the commit msg.
I can think of race conditions happening in the following cases:
1. Multiport use cases where traffic is being handled on more than one
interface to a single Tx channel.
2. Having emac_xmit_xdp_frame() and icssg_ndo_start_xmit(), two
different traffics being transmitted over a single interface to a single
tx channel.
In both of the above scenarios Tx channel is a common resource which
needs to be protected from any race conditions, which might happen
during Tx descriptor push/pop. As suggested by you, I am currently
excluding this patch and doing some stress testing. Regardless
conceptually I still think spinlock is needed, please do correct me if I
am wrong.
>
>> Fixes: 62aa3246f462 ("net: ti: icssg-prueth: Add XDP support")
>> Signed-off-by: Meghana Malladi <m-malladi@ti.com>
>> ---
>> drivers/net/ethernet/ti/icssg/icssg_common.c | 7 +++++++
>> drivers/net/ethernet/ti/icssg/icssg_prueth.h | 1 +
>> 2 files changed, 8 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/ti/icssg/icssg_common.c b/drivers/net/ethernet/ti/icssg/icssg_common.c
>> index 4f45f2b6b67f..a120ff6fec8f 100644
>> --- a/drivers/net/ethernet/ti/icssg/icssg_common.c
>> +++ b/drivers/net/ethernet/ti/icssg/icssg_common.c
>> @@ -157,7 +157,9 @@ int emac_tx_complete_packets(struct prueth_emac *emac, int chn,
>> tx_chn = &emac->tx_chns[chn];
>>
>> while (true) {
>> + spin_lock(&tx_chn->lock);
>> res = k3_udma_glue_pop_tx_chn(tx_chn->tx_chn, &desc_dma);
>> + spin_unlock(&tx_chn->lock);
>> if (res == -ENODATA)
>> break;
>>
>> @@ -325,6 +327,7 @@ int prueth_init_tx_chns(struct prueth_emac *emac)
>> snprintf(tx_chn->name, sizeof(tx_chn->name),
>> "tx%d-%d", slice, i);
>>
>> + spin_lock_init(&tx_chn->lock);
>> tx_chn->emac = emac;
>> tx_chn->id = i;
>> tx_chn->descs_num = PRUETH_MAX_TX_DESC;
>> @@ -627,7 +630,9 @@ u32 emac_xmit_xdp_frame(struct prueth_emac *emac,
>> cppi5_hdesc_set_pktlen(first_desc, xdpf->len);
>> desc_dma = k3_cppi_desc_pool_virt2dma(tx_chn->desc_pool, first_desc);
>>
>> + spin_lock_bh(&tx_chn->lock);
>> ret = k3_udma_glue_push_tx_chn(tx_chn->tx_chn, first_desc, desc_dma);
>> + spin_unlock_bh(&tx_chn->lock);
>
> I'm afraid this needs to be some form of spin_lock_irq
> The completions may run from hard irq context when netpoll/netconsole
> is used.
Didn't know system can handle network interrupts in a hard IRQ context.
Ok I will update to spin_lock_irq() if this patch is necessary.
--
Thanks,
Meghana Malladi
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-05-02 9:37 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-28 12:04 [PATCH net 0/4] Bug fixes from XDP patch series Meghana Malladi
2025-04-28 12:04 ` [PATCH net 1/4] net: ti: icssg-prueth: Set XDP feature flags for ndev Meghana Malladi
2025-04-28 12:04 ` [PATCH net 2/4] net: ti: icssg-prueth: Report BQL before sending XDP packets Meghana Malladi
2025-05-01 14:53 ` Jakub Kicinski
2025-05-02 6:18 ` Malladi, Meghana
2025-04-28 12:04 ` [PATCH net 3/4] net: ti: icssg-prueth: Fix race condition for traffic from different network sockets Meghana Malladi
2025-05-01 14:56 ` Jakub Kicinski
2025-05-02 9:31 ` Malladi, Meghana
2025-04-28 12:04 ` [PATCH net 4/4] net: ti: icssg-prueth: Fix kernel panic during concurrent Tx queue access Meghana Malladi
2025-05-02 7:14 ` Jesper Dangaard Brouer
2025-05-02 9:07 ` Malladi, Meghana
2025-05-02 7:16 ` [PATCH net 0/4] Bug fixes from XDP patch series Jesper Dangaard Brouer
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).