From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Bert Karwatzki <spasswolf@web.de>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, linux-rt-devel@lists.linux.dev,
	Jakub Kicinski <kuba@kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	netdev@vger.kernel.org
Subject: Re: "Dead loop on virtual device" error without softirq-BKL on PREEMPT_RT
Date: Thu, 26 Feb 2026 18:29:27 +0100
Message-ID: <20260226172927.2Ck9wZMw@linutronix.de>
In-Reply-To: <ae213d2b908e46d34856be9daec891d7210f46f7.camel@web.de>

On 2026-02-18 13:50:14 [+0100], Bert Karwatzki wrote:
> On Wednesday, 18.02.2026 at 08:30 +0100, Sebastian Andrzej Siewior wrote:
> > On 2026-02-17 20:10:09 [+0100], Bert Karwatzki wrote:
> > > 
> > > I tried to research the original commit which introduced the xmit_lock_owner check, but
> > > it has been present since Linux 2.3.6 (released 1999-06-10), when __dev_queue_xmit() was still called dev_queue_xmit().
> > > So I can't tell the original idea behind that check (perhaps recursion detection ...), and I'm
> > > not completely sure whether it can be omitted (and just let dev_xmit_recursion() do the recursion checking).
> > 
> > Okay. Thank you. I'll add it to my list.
> > 
> I've thought about it again and I now think the xmit_lock_owner check IS necessary to
> avoid deadlocks on recursion in the non-RT case.
> 
> My idea to use get_current()->tgid as lock owner also does not work, as interrupts are still enabled
> and __dev_queue_xmit() can be called from interrupt context. So if an interrupt occurs after the
> lock has been taken and the interrupt handler calls __dev_queue_xmit() on the same CPU, a deadlock
> would occur.

The warning happens because taskA on cpuX goes through
HARD_TX_LOCK(), gets preempted, and then taskB on cpuX also wants to
send a packet. The second one triggers the warning.

We could ignore this check because a deadlock will throw a warning and
"halt" the task that runs into the deadlock.
But then we could be smart about this in the same way !RT is and
proactively check for the simple deadlock. The lock owner of
netdev_queue::_xmit_lock is recorded and can be checked against current.

The snippet below should work. I need to see tomorrow whether this is
still a good idea.

diff --git a/net/core/dev.c b/net/core/dev.c
index 6ff4256700e60..de342ceb17201 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4821,7 +4821,7 @@ int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
 		/* Other cpus might concurrently change txq->xmit_lock_owner
 		 * to -1 or to their cpu id, but not to our id.
 		 */
-		if (READ_ONCE(txq->xmit_lock_owner) != cpu) {
+		if (rt_mutex_owner(&txq->_xmit_lock.lock) != current) {
 			if (dev_xmit_recursion())
 				goto recursion_alert;
 

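To illustrate the difference between the two owner checks, here is a
minimal user-space sketch. It is not kernel code: struct task,
struct txq_model and all helper names are hypothetical and exist only for
this illustration. It assumes the scenario described above, where the
lock holder can be preempted on PREEMPT_RT and a second task then runs
on the same CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; nothing here is a real kernel structure. */
struct task {
	const char *name;
	int cpu;			/* CPU the task currently runs on */
};

struct txq_model {
	int owner_cpu;			/* stands in for txq->xmit_lock_owner, -1 if unlocked */
	const struct task *owner_task;	/* stands in for the rt_mutex owner, NULL if unlocked */
};

#define TXQ_MODEL_UNLOCKED { .owner_cpu = -1, .owner_task = NULL }

/* Record both notions of "owner" when the modelled lock is taken. */
static void model_lock(struct txq_model *q, const struct task *t)
{
	assert(q->owner_task == NULL);	/* the model only handles an uncontended lock */
	q->owner_cpu = t->cpu;
	q->owner_task = t;
}

static void model_unlock(struct txq_model *q)
{
	q->owner_cpu = -1;
	q->owner_task = NULL;
}

/* CPU-based check: reports recursion if the lock was taken on this CPU. */
static bool cpu_check_sees_recursion(const struct txq_model *q,
				     const struct task *t)
{
	return q->owner_cpu == t->cpu;
}

/* Task-based check: reports recursion only if this very task holds the lock. */
static bool task_check_sees_recursion(const struct txq_model *q,
				      const struct task *t)
{
	return q->owner_task == t;
}
```

With taskA and taskB both on the same CPU and taskA holding the modelled
lock, cpu_check_sees_recursion() returns true for taskB (the false
positive), while task_check_sees_recursion() returns true only for taskA
itself. The model says nothing about interrupt context, which is the
concern raised for the !RT case.
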
> Bert Karwatzki

Sebastian

