From: Sean Anderson <sean.anderson@linux.dev>
To: Mike Galbraith <efault@gmx.de>,
Robert Hancock <robert.hancock@calian.com>
Cc: Breno Leitao <leitao@debian.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: netpoll: raspberrypi [4 5] driver locking woes
Date: Thu, 28 Aug 2025 14:56:00 -0400 [thread overview]
Message-ID: <8e60d336-9cab-4003-8972-bda0b041d8cf@linux.dev> (raw)
In-Reply-To: <39f14032374d5d60c62b283637267a96ce535861.camel@gmx.de>
[-- Attachment #1: Type: text/plain, Size: 2058 bytes --]
Hi Mike,
On 8/28/25 13:26, Mike Galbraith wrote:
> On Thu, 2025-08-28 at 10:57 -0400, Sean Anderson wrote:
>> Hi Mike,
>>
>> On 8/27/25 12:02, Mike Galbraith wrote:
>> > Unexpected addendum to done deal datapoint, so off list.
>> >
>> > On Tue, 2025-08-26 at 11:49 +0200, Mike Galbraith wrote:
>> > >
>> > > The pi5 gripe fix is equally trivial, but submitting that is pointless
>> > > given there's something else amiss in fingered commit. This is all of
>> > > the crash info that escapes the box w/wo gripes silenced.
>> > >
>> > > [ 51.688868] sysrq: Trigger a crash
>> > > [ 51.688892] Kernel panic - not syncing: sysrq triggered crash
>> > > [ 51.698066] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.17.0-rc2-v8-lockdep #533 PREEMPTLAZY
>> > > [ 51.707234] Hardware name: Raspberry Pi 5 Model B Rev 1.0 (DT)
>> > > [ 51.713085] Call trace:
>> > > [ 51.715532] show_stack+0x20/0x38 (C)
>> > > [ 51.719206] dump_stack_lvl+0x38/0xd0
>> > > [ 51.722878] dump_stack+0x18/0x28
>> > >
>> > > That aspect is a punt and run atm (time.. and dash of laziness:).
>> >
>> > Plan was to end datapoint thread, but after booting pi5's 6.12 kernel,
>> > for some reason I fired up netconsole.. and box promptly exhibited the
>> > netpoll locking bug warning, indicating presence of 138badbc21a0.
>> > Instead of saying to self "nope, just walk away", I poked SysRq-C.. and
>> > the bloody damn monitoring box received a 100% complete death rattle.
>> > Well bugger.
>>
>> Did you get a backtrace for this?
>
> Yes, logs for 6.12.41 and 6.17.0-rc2 attached.
>
> Since a patch has meanwhile landed, also a log of patched 6.17.0-rc3
> now gripe free (yay) but with aforementioned broken output, followed by
> addition of the e6a532185daa revert to confirm it still cures that.
>
>> And to be clear, the steps to reproduce this are to boot a kernel with
>> lockdep enabled with netconsole on macb and then hit sysrq?
>
> Yup.
Looks like the tx completion path can also be called from netpoll. Can
you try the attached patch?
--Sean
[-- Attachment #2: 0001-net-macb-Fix-tx_ptr_lock-locking.patch --]
[-- Type: text/x-patch, Size: 4585 bytes --]
From 3fe439760955370b9f005ef2728b5fc9b14eeea5 Mon Sep 17 00:00:00 2001
From: Sean Anderson <sean.anderson@linux.dev>
Date: Thu, 28 Aug 2025 11:55:01 -0400
Subject: [PATCH net v2] net: macb: Fix tx_ptr_lock locking
macb_start_xmit can be called with bottom-halves disabled (e.g.
transmitting from softirqs) as well as with interrupts disabled (with
netpoll). Because of this, all other functions taking tx_ptr_lock must
disable IRQs, and macb_start_xmit must only re-enable IRQs if they
were already enabled.
Fixes: 138badbc21a0 ("net: macb: use NAPI for TX completion path")
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
---
Changes in v2:
- Use irqsave/restore for all accesses, since they can also also be
called from netpoll.
drivers/net/ethernet/cadence/macb_main.c | 28 ++++++++++++++----------
1 file changed, 16 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 16d28a8b3b56..c769b7dbd3ba 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -1223,12 +1223,13 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
{
struct macb *bp = queue->bp;
u16 queue_index = queue - bp->queues;
+ unsigned long flags;
unsigned int tail;
unsigned int head;
int packets = 0;
u32 bytes = 0;
- spin_lock(&queue->tx_ptr_lock);
+ spin_lock_irqsave(&queue->tx_ptr_lock, flags);
head = queue->tx_head;
for (tail = queue->tx_tail; tail != head && packets < budget; tail++) {
struct macb_tx_skb *tx_skb;
@@ -1291,7 +1292,7 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
CIRC_CNT(queue->tx_head, queue->tx_tail,
bp->tx_ring_size) <= MACB_TX_WAKEUP_THRESH(bp))
netif_wake_subqueue(bp->dev, queue_index);
- spin_unlock(&queue->tx_ptr_lock);
+ spin_unlock_irqrestore(&queue->tx_ptr_lock, flags);
return packets;
}
@@ -1707,8 +1708,9 @@ static void macb_tx_restart(struct macb_queue *queue)
{
struct macb *bp = queue->bp;
unsigned int head_idx, tbqp;
+ unsigned long flags;
- spin_lock(&queue->tx_ptr_lock);
+ spin_lock_irqsave(&queue->tx_ptr_lock, flags);
if (queue->tx_head == queue->tx_tail)
goto out_tx_ptr_unlock;
@@ -1720,19 +1722,20 @@ static void macb_tx_restart(struct macb_queue *queue)
if (tbqp == head_idx)
goto out_tx_ptr_unlock;
- spin_lock_irq(&bp->lock);
+ spin_lock(&bp->lock);
macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
- spin_unlock_irq(&bp->lock);
+ spin_unlock(&bp->lock);
out_tx_ptr_unlock:
- spin_unlock(&queue->tx_ptr_lock);
+ spin_unlock_irqrestore(&queue->tx_ptr_lock, flags);
}
static bool macb_tx_complete_pending(struct macb_queue *queue)
{
bool retval = false;
+ unsigned long flags;
- spin_lock(&queue->tx_ptr_lock);
+ spin_lock_irqsave(&queue->tx_ptr_lock, flags);
if (queue->tx_head != queue->tx_tail) {
/* Make hw descriptor updates visible to CPU */
rmb();
@@ -1740,7 +1743,7 @@ static bool macb_tx_complete_pending(struct macb_queue *queue)
if (macb_tx_desc(queue, queue->tx_tail)->ctrl & MACB_BIT(TX_USED))
retval = true;
}
- spin_unlock(&queue->tx_ptr_lock);
+ spin_unlock_irqrestore(&queue->tx_ptr_lock, flags);
return retval;
}
@@ -2308,6 +2311,7 @@ static netdev_tx_t macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
struct macb_queue *queue = &bp->queues[queue_index];
unsigned int desc_cnt, nr_frags, frag_size, f;
unsigned int hdrlen;
+ unsigned long flags;
bool is_lso;
netdev_tx_t ret = NETDEV_TX_OK;
@@ -2368,7 +2372,7 @@ static netdev_tx_t macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
desc_cnt += DIV_ROUND_UP(frag_size, bp->max_tx_length);
}
- spin_lock_bh(&queue->tx_ptr_lock);
+ spin_lock_irqsave(&queue->tx_ptr_lock, flags);
/* This is a hard error, log it. */
if (CIRC_SPACE(queue->tx_head, queue->tx_tail,
@@ -2392,15 +2396,15 @@ static netdev_tx_t macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
netdev_tx_sent_queue(netdev_get_tx_queue(bp->dev, queue_index),
skb->len);
- spin_lock_irq(&bp->lock);
+ spin_lock(&bp->lock);
macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
- spin_unlock_irq(&bp->lock);
+ spin_unlock(&bp->lock);
if (CIRC_SPACE(queue->tx_head, queue->tx_tail, bp->tx_ring_size) < 1)
netif_stop_subqueue(dev, queue_index);
unlock:
- spin_unlock_bh(&queue->tx_ptr_lock);
+ spin_unlock_irqrestore(&queue->tx_ptr_lock, flags);
return ret;
}
--
2.35.1.1320.gc452695387.dirty
next prev parent reply other threads:[~2025-08-28 18:56 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-25 5:53 netpoll: raspberrypi [4 5] driver locking woes Mike Galbraith
2025-08-25 10:10 ` Mike Galbraith
2025-08-26 9:49 ` Mike Galbraith
[not found] ` <f4fa3fcc637ffb6531982a90dbd9c27114e93036.camel@gmx.de>
2025-08-28 14:57 ` Sean Anderson
2025-08-28 17:26 ` Mike Galbraith
2025-08-28 18:56 ` Sean Anderson [this message]
2025-08-29 2:55 ` Mike Galbraith
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8e60d336-9cab-4003-8972-bda0b041d8cf@linux.dev \
--to=sean.anderson@linux.dev \
--cc=efault@gmx.de \
--cc=leitao@debian.org \
--cc=netdev@vger.kernel.org \
--cc=robert.hancock@calian.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.