* [PATCH net v2 1/2] net: ethernet: oa_tc6: Protect skb pointer used by two different kernel instances
2026-06-26 15:35 [PATCH net v2 0/2] Fix to possible skb leak due to race condtion in tx path Selvamani Rajagopal via B4 Relay
@ 2026-06-26 15:35 ` Selvamani Rajagopal via B4 Relay
2026-06-30 2:15 ` Jakub Kicinski
2026-06-26 15:35 ` [PATCH net v2 2/2] net: ethernet: oa_tc6: Improvement in buffer overflow handling Selvamani Rajagopal via B4 Relay
2026-06-30 5:38 ` [PATCH net v2 0/2] Fix to possible skb leak due to race condtion in tx path Parthiban.Veerasooran
2 siblings, 1 reply; 9+ messages in thread
From: Selvamani Rajagopal via B4 Relay @ 2026-06-26 15:35 UTC (permalink / raw)
To: Parthiban Veerasooran, Andrew Lunn, Piergiorgio Beruto,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: netdev, linux-kernel, Andrew Lunn, Parthiban Veerasooran,
Selvamani Rajagopal
From: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>
Threaded IRQ uses waiting_tx_skb. Transmit path also uses
this pointer without any mutual exclusion protection. As a
result, it might leak skb buffer, particularly threaded IRQ
runs in the middle of tranmsmit path, near skb_linearize.
Fixes: b542d13fab0f ("net: ethernet: oa_tc6: Interrupt is active low, level triggered.")
Signed-off-by: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>
---
changes in v2:
added the missing prefix to the title
---
drivers/net/ethernet/oa_tc6.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/oa_tc6.c b/drivers/net/ethernet/oa_tc6.c
index 0727d53345a3..3fd4851ee66d 100644
--- a/drivers/net/ethernet/oa_tc6.c
+++ b/drivers/net/ethernet/oa_tc6.c
@@ -672,10 +672,16 @@ static void oa_tc6_cleanup_ongoing_tx_skb(struct oa_tc6 *tc6)
static void oa_tc6_cleanup_waiting_tx_skb(struct oa_tc6 *tc6)
{
- if (tc6->waiting_tx_skb) {
+ struct sk_buff *skb;
+
+ spin_lock_bh(&tc6->tx_skb_lock);
+ skb = tc6->waiting_tx_skb;
+ tc6->waiting_tx_skb = NULL;
+ spin_unlock_bh(&tc6->tx_skb_lock);
+
+ if (skb) {
tc6->netdev->stats.tx_dropped++;
- kfree_skb(tc6->waiting_tx_skb);
- tc6->waiting_tx_skb = NULL;
+ kfree_skb(skb);
}
}
@@ -1250,11 +1256,6 @@ EXPORT_SYMBOL_GPL(oa_tc6_zero_align_receive_frame_enable);
*/
netdev_tx_t oa_tc6_start_xmit(struct oa_tc6 *tc6, struct sk_buff *skb)
{
- if (tc6->disable_traffic || tc6->waiting_tx_skb) {
- netif_stop_queue(tc6->netdev);
- return NETDEV_TX_BUSY;
- }
-
if (skb_linearize(skb)) {
dev_kfree_skb_any(skb);
tc6->netdev->stats.tx_dropped++;
@@ -1262,6 +1263,11 @@ netdev_tx_t oa_tc6_start_xmit(struct oa_tc6 *tc6, struct sk_buff *skb)
}
spin_lock_bh(&tc6->tx_skb_lock);
+ if (tc6->disable_traffic || tc6->waiting_tx_skb) {
+ netif_stop_queue(tc6->netdev);
+ spin_unlock_bh(&tc6->tx_skb_lock);
+ return NETDEV_TX_BUSY;
+ }
tc6->waiting_tx_skb = skb;
spin_unlock_bh(&tc6->tx_skb_lock);
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread* [PATCH net v2 2/2] net: ethernet: oa_tc6: Improvement in buffer overflow handling
2026-06-26 15:35 [PATCH net v2 0/2] Fix to possible skb leak due to race condtion in tx path Selvamani Rajagopal via B4 Relay
2026-06-26 15:35 ` [PATCH net v2 1/2] net: ethernet: oa_tc6: Protect skb pointer used by two different kernel instances Selvamani Rajagopal via B4 Relay
@ 2026-06-26 15:35 ` Selvamani Rajagopal via B4 Relay
2026-06-30 2:29 ` Jakub Kicinski
2026-06-30 5:38 ` [PATCH net v2 0/2] Fix to possible skb leak due to race condtion in tx path Parthiban.Veerasooran
2 siblings, 1 reply; 9+ messages in thread
From: Selvamani Rajagopal via B4 Relay @ 2026-06-26 15:35 UTC (permalink / raw)
To: Parthiban Veerasooran, Andrew Lunn, Piergiorgio Beruto,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: netdev, linux-kernel, Andrew Lunn, Parthiban Veerasooran,
Selvamani Rajagopal
From: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>
When oversubscribed traffic causes lot of buffer overflow errors,
probably due to loss of data chunks, driver fails to find a
data chunk with end_valid bit set, before it runs out of sk buffer
space. As a result, assert is seen during skb_put.
Now check is made if tail + len > end, driver abandons the current
data and starts look for a data chunk with start_valid bit,
that is a new frame.
Fixes: d70a0d8f2f2d ("net: ethernet: oa_tc6: implement receive path to receive rx ethernet frames")
Signed-off-by: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>
---
changes in v2
- Check rx_skb pointer before new allocation and NULL before use.
---
drivers/net/ethernet/oa_tc6.c | 69 +++++++++++++++++++++++++++++++------------
1 file changed, 50 insertions(+), 19 deletions(-)
diff --git a/drivers/net/ethernet/oa_tc6.c b/drivers/net/ethernet/oa_tc6.c
index 3fd4851ee66d..c59daa032e70 100644
--- a/drivers/net/ethernet/oa_tc6.c
+++ b/drivers/net/ethernet/oa_tc6.c
@@ -692,6 +692,12 @@ static void oa_tc6_free_pending_skbs(struct oa_tc6 *tc6)
oa_tc6_cleanup_waiting_tx_skb(tc6);
}
+static void oa_tc6_look_for_new_frame(struct oa_tc6 *tc6)
+{
+ tc6->rx_buf_overflow = true;
+ oa_tc6_cleanup_ongoing_rx_skb(tc6);
+}
+
/* If the failure is at SPI interface level, masking and clearing
* the interrupt of the device won't work. Since SPI interrupt is
* disabled, it should stop the repeated interrupts.
@@ -729,8 +735,7 @@ static int oa_tc6_process_extended_status(struct oa_tc6 *tc6)
}
if (FIELD_GET(STATUS0_RX_BUFFER_OVERFLOW_ERROR, value)) {
- tc6->rx_buf_overflow = true;
- oa_tc6_cleanup_ongoing_rx_skb(tc6);
+ oa_tc6_look_for_new_frame(tc6);
net_err_ratelimited("%s: Receive buffer overflow error\n",
tc6->netdev->name);
return -EAGAIN;
@@ -811,13 +816,35 @@ static void oa_tc6_submit_rx_skb(struct oa_tc6 *tc6)
tc6->rx_skb = NULL;
}
-static void oa_tc6_update_rx_skb(struct oa_tc6 *tc6, u8 *payload, u8 length)
+/* On oversubscribed traffic condition, particularly with overwhelming rx
+ * buffer overflow errors, there could be data chunk loss. If tail + length
+ * goes beyond end pointer, that is an indication that the data chunk with
+ * end_valid bit is lost. Time to look for a data chunk with start_valid bit.
+ *
+ * If rx_skb is NULL, it is time to start looking for data chunk with
+ * start_bit.
+ */
+static int oa_tc6_update_rx_skb(struct oa_tc6 *tc6, u8 *payload, u8 length)
{
+ if (!tc6->rx_skb ||
+ (tc6->rx_skb->tail + length) > tc6->rx_skb->end) {
+ oa_tc6_look_for_new_frame(tc6);
+ return -EAGAIN;
+ }
+
memcpy(skb_put(tc6->rx_skb, length), payload, length);
+ return 0;
}
+/* On overwhelming rx buffer overflow errors, due to data chunk loss, it is
+ * possible that we get two data chunks with start_valid bit set, without
+ * end_valid bit set in between. In this case, rx_skb would have a valid
+ * buffer pointer. We should release, if a valid pointer is found before
+ * allocating a new one.
+ */
static int oa_tc6_allocate_rx_skb(struct oa_tc6 *tc6)
{
+ oa_tc6_cleanup_ongoing_rx_skb(tc6);
tc6->rx_skb = netdev_alloc_skb_ip_align(tc6->netdev, tc6->netdev->mtu +
ETH_HLEN + ETH_FCS_LEN);
if (!tc6->rx_skb) {
@@ -837,7 +864,9 @@ static int oa_tc6_prcs_complete_rx_frame(struct oa_tc6 *tc6, u8 *payload,
if (ret)
return ret;
- oa_tc6_update_rx_skb(tc6, payload, size);
+ ret = oa_tc6_update_rx_skb(tc6, payload, size);
+ if (ret)
+ return ret;
oa_tc6_submit_rx_skb(tc6);
@@ -852,22 +881,24 @@ static int oa_tc6_prcs_rx_frame_start(struct oa_tc6 *tc6, u8 *payload, u16 size)
if (ret)
return ret;
- oa_tc6_update_rx_skb(tc6, payload, size);
-
- return 0;
+ return oa_tc6_update_rx_skb(tc6, payload, size);
}
-static void oa_tc6_prcs_rx_frame_end(struct oa_tc6 *tc6, u8 *payload, u16 size)
+static int oa_tc6_prcs_rx_frame_end(struct oa_tc6 *tc6, u8 *payload, u16 size)
{
- oa_tc6_update_rx_skb(tc6, payload, size);
+ int ret;
- oa_tc6_submit_rx_skb(tc6);
+ ret = oa_tc6_update_rx_skb(tc6, payload, size);
+ if (!ret)
+ oa_tc6_submit_rx_skb(tc6);
+ return ret;
}
-static void oa_tc6_prcs_ongoing_rx_frame(struct oa_tc6 *tc6, u8 *payload,
- u32 footer)
+static int oa_tc6_prcs_ongoing_rx_frame(struct oa_tc6 *tc6, u8 *payload,
+ u32 footer)
{
- oa_tc6_update_rx_skb(tc6, payload, OA_TC6_CHUNK_PAYLOAD_SIZE);
+ return oa_tc6_update_rx_skb(tc6, payload,
+ OA_TC6_CHUNK_PAYLOAD_SIZE);
}
static int oa_tc6_prcs_rx_chunk_payload(struct oa_tc6 *tc6, u8 *data,
@@ -880,6 +911,7 @@ static int oa_tc6_prcs_rx_chunk_payload(struct oa_tc6 *tc6, u8 *data,
bool start_valid = FIELD_GET(OA_TC6_DATA_FOOTER_START_VALID, footer);
bool end_valid = FIELD_GET(OA_TC6_DATA_FOOTER_END_VALID, footer);
u16 size;
+ int ret;
/* Restart the new rx frame after receiving rx buffer overflow error */
if (start_valid && tc6->rx_buf_overflow)
@@ -907,8 +939,7 @@ static int oa_tc6_prcs_rx_chunk_payload(struct oa_tc6 *tc6, u8 *data,
/* Process the chunk with only rx frame end */
if (end_valid && !start_valid) {
size = end_byte_offset + 1;
- oa_tc6_prcs_rx_frame_end(tc6, data, size);
- return 0;
+ return oa_tc6_prcs_rx_frame_end(tc6, data, size);
}
/* Process the chunk with previous rx frame end and next rx frame
@@ -921,7 +952,9 @@ static int oa_tc6_prcs_rx_chunk_payload(struct oa_tc6 *tc6, u8 *data,
*/
if (tc6->rx_skb) {
size = end_byte_offset + 1;
- oa_tc6_prcs_rx_frame_end(tc6, data, size);
+ ret = oa_tc6_prcs_rx_frame_end(tc6, data, size);
+ if (ret)
+ return ret;
}
size = OA_TC6_CHUNK_PAYLOAD_SIZE - start_byte_offset;
return oa_tc6_prcs_rx_frame_start(tc6,
@@ -930,9 +963,7 @@ static int oa_tc6_prcs_rx_chunk_payload(struct oa_tc6 *tc6, u8 *data,
}
/* Process the chunk with ongoing rx frame data */
- oa_tc6_prcs_ongoing_rx_frame(tc6, data, footer);
-
- return 0;
+ return oa_tc6_prcs_ongoing_rx_frame(tc6, data, footer);
}
static u32 oa_tc6_get_rx_chunk_footer(struct oa_tc6 *tc6, u16 footer_offset)
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH net v2 0/2] Fix to possible skb leak due to race condtion in tx path
2026-06-26 15:35 [PATCH net v2 0/2] Fix to possible skb leak due to race condtion in tx path Selvamani Rajagopal via B4 Relay
2026-06-26 15:35 ` [PATCH net v2 1/2] net: ethernet: oa_tc6: Protect skb pointer used by two different kernel instances Selvamani Rajagopal via B4 Relay
2026-06-26 15:35 ` [PATCH net v2 2/2] net: ethernet: oa_tc6: Improvement in buffer overflow handling Selvamani Rajagopal via B4 Relay
@ 2026-06-30 5:38 ` Parthiban.Veerasooran
2026-06-30 5:57 ` Selvamani Rajagopal
2 siblings, 1 reply; 9+ messages in thread
From: Parthiban.Veerasooran @ 2026-06-30 5:38 UTC (permalink / raw)
To: Selvamani.Rajagopal
Cc: netdev, linux-kernel, andrew, andrew+netdev, pier.beruto, davem,
edumazet, kuba, pabeni
Hi Selvamani,
On 26/06/26 9:05 pm, Selvamani Rajagopal via B4 Relay wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
>
> Now the traffic is handled in threaded IRQ, and the
> disable_traffic flag is checked before handling the
> data, new race condition is exposed, in which
> buffer may leak, if threaded IRQ interrupts the
> trasmit path midway.
>
> With this change, disable_traffic and waiting_tx_skb
> pointer are protected by spin lock/unlock pair.
>
> This is highlighted in Sashiko review
> https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260611-level-trigger-v5-0-4533a9e85ce2%40onsemi.com
>
> Also on buffer overrun condition, probably due to loss of
> SPI data chunks, receive path doesn't see the expected
> data chunk with end_valid bit set. As a result, driver
> keeps adding data chunks to the skb before running out
> of space and kernel panic is seen.
>
> With this change, before adding data to the skb, if there
> is no space, skb is freed and driver starts looking for
> new frame by looking for a data chunk with start_valid
> bit set.
>
> [ 705.405490] skbuff: skb_over_panic: text:ffffffd2eb72a264 len:1600 put:64 head:ffffff804e5cdc40 data:ffffff804e5cdc80 tail:0x680 end:0x640 dev:eth1
> [ 705.405569] ------------[ cut here ]------------
> [ 705.405575] kernel BUG at net/core/skbuff.c:214!
> [ 705.405589] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
>
> [ 6703.427690] Call trace:
> [ 705.925157] skb_panic+0x58/0x68 (P)
> [ 705.928726] skb_put+0x74/0x80
> [ 705.931772] oa_tc6_update_rx_skb+0x44/0x98 [oa_tc6_mod]
> [ 705.937084] oa_tc6_macphy_threaded_irq+0x3f4/0x900 [oa_tc6_mod]
> [ 705.943084] irq_thread_fn+0x34/0xb8
> [ 705.946654] irq_thread+0x1a0/0x300
> [ 705.950134] kthread+0x138/0x150
> [ 705.953356] ret_from_fork+0x10/0x20
>
> Signed-off-by: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>
> ---
> Changes in v2:
> - Improvment to how error -EAGAIN is handled. Took care of
> couple of use cases where start_bit and end_bit may be missing or
> repeated due to lost data chunks.
> - Protected handling of waiting_tx_skb pointer with spin lock
> - Link to v1: https://lore.kernel.org/r/20260621-fix-race-condition-and-crash-v1-0-87e290d9357f@onsemi.com
I performed a one-hour quick test using two instances of the LAN8651
MAC-PHY Click (Test Case 2), and it ran without any crashes. Thank you
for the fixes.
Best regards,
Parthiban V
>
> ---
> Selvamani Rajagopal (2):
> net: ethernet: oa_tc6: Protect skb pointer used by two different kernel instances
> net: ethernet: oa_tc6: Improvement in buffer overflow handling
>
> drivers/net/ethernet/oa_tc6.c | 91 ++++++++++++++++++++++++++++++-------------
> 1 file changed, 64 insertions(+), 27 deletions(-)
> ---
> base-commit: 805185b7c7a1069e407b6f7b3bc98e44d415f484
> change-id: 20260621-fix-race-condition-and-crash-94d055a665c4
>
> Best regards,
> --
> Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>
>
>
^ permalink raw reply [flat|nested] 9+ messages in thread