From: Pavel Begunkov <asml.silence@gmail.com>
To: Breno Leitao <leitao@debian.org>, Mike Galbraith <efault@gmx.de>,
	paulmck@kernel.org, kuba@kernel.org
Cc: LKML <linux-kernel@vger.kernel.org>,
	netdev@vger.kernel.org, boqun.feng@gmail.com
Subject: Re: netconsole: HARDIRQ-safe -> HARDIRQ-unsafe lock order warning
Date: Thu, 14 Aug 2025 16:45:19 +0100
Message-ID: <001d822f-2f78-4ba5-b29f-23ec1813d3d4@gmail.com>
In-Reply-To: <oth5t27z6acp7qxut7u45ekyil7djirg2ny3bnsvnzeqasavxb@nhwdxahvcosh>

On 8/14/25 11:16, Breno Leitao wrote:
> Hello Mike,
> 
> On Wed, Aug 13, 2025 at 06:14:36AM +0200, Mike Galbraith wrote:
>> [  107.984942] Chain exists of:
>>                   console_owner --> target_list_lock --> &fq->lock
>>
>> [  107.984947]  Possible interrupt unsafe locking scenario:
>>
>> [  107.984948]        CPU0                    CPU1
>> [  107.984949]        ----                    ----
>> [  107.984950]   lock(&fq->lock);
>> [  107.984952]                                local_irq_disable();
>> [  107.984952]                                lock(console_owner);
>> [  107.984954]                                lock(target_list_lock);
> 
> Thanks for the report. I _think_ I understand the problem, it should be
> easier to see it while thinking about a single CPU:
> 
>   1) lock(&fq->lock);          // This is not a hardirq-safe lock
>   2) IRQ                       // IRQ hits while the lock is held
>   2.1) printk()                // WARNs and printk can in fact happen during IRQs
>   2.2) netconsole subsystem    // target_list_lock is not important and can be ignored
>   2.3) netpoll                 // netpoll will call the network subsystem to send the packet
>   2.4) lock(&fq->lock);        // Try to get the lock while the lock was already held
>   3) Deadlock!
> 
> Given that fq->lock is not IRQ-safe, this is a possible deadlock.
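
To make the single-CPU sequence above concrete, here is a toy userspace
model in Python. It is not kernel code. The names fq_lock, tx_path,
irq_handler and netpoll_send are made up for illustration, a
threading.Lock stands in for the spinlock, and a try-lock is used so
that the model reports the deadlock instead of hanging the way a real
spin_lock() would.

```python
import threading

# Stand-in for fq->lock: a plain, non-reentrant lock (assumption of this model).
fq_lock = threading.Lock()


def netpoll_send():
    """Model of the xmit path, which needs fq_lock to send a packet."""
    # A real spin_lock() would spin forever if the lock is already held on
    # this CPU; the non-blocking acquire lets the model report that instead.
    if not fq_lock.acquire(blocking=False):
        return "deadlock"
    try:
        return "sent"
    finally:
        fq_lock.release()


def irq_handler():
    """Model of an IRQ whose printk() ends up in netconsole -> netpoll."""
    return netpoll_send()


def tx_path(irq_fires):
    """Model of step 1: take fq_lock while interrupts are still enabled."""
    with fq_lock:
        if irq_fires:
            # The "interrupt" runs on the same CPU while the lock is held.
            return irq_handler()
    return "no irq"
```

Calling tx_path(True) models the IRQ arriving while the lock is held;
calling irq_handler() on its own models the same printk with nobody
holding the lock.
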
> 
> In fact, I would say that FQ is not the only lock that might get into
> this deadlock.
> 
> Possible solutions that come to my mind:
> 
> 1) Make those locks (fq->lock and TX locks) IRQ-safe

And I'm pretty sure the list is not exhaustive.

>   * Cons: This has network performance penalties and is very intrusive.
> 2) Make printk from IRQs deferred, e.g. by calling `printk_deferred_enter`
>     in IRQ handlers?!

It'd only help if the deferred printk doesn't need the
console_lock / doesn't disable irqs.

>   * Cons: This will add latency to printk() inside IRQs.
> 3) Create a deferral mechanism inside netconsole that would buffer the
>     packet and defer its TX to outside of the IRQ.
>     a) Basically, in netconsole, check if it is being invoked inside an
>     IRQ, then buffer the message and send it at softirq/task context.
>   * Cons: This would use extra memory for printk()s inside IRQs and also
>     add latency (netconsole only).

That should work. We basically need to pull xmit out of the
console_lock protected section, and deferring is not a bad option.
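
A rough userspace sketch of that deferral idea, in Python rather than
kernel C. Everything here is hypothetical: the DeferredConsole class,
the in_irq flag (standing in for whatever context check netconsole
would use) and the bounded queue that drops the oldest message when
full are assumptions for illustration, not an existing netconsole
interface.

```python
from collections import deque


class DeferredConsole:
    """Toy model: buffer messages written in IRQ context, send them later."""

    def __init__(self, xmit, capacity=4):
        self.xmit = xmit  # callable that actually transmits one message
        # Bounded buffer: when full, the oldest pending message is dropped.
        self.pending = deque(maxlen=capacity)
        self.in_irq = False  # hypothetical stand-in for a context check

    def write(self, msg):
        """Return True if msg was sent now, False if it was deferred."""
        if self.in_irq:
            # Never call xmit here, so no TX locks are taken from the IRQ.
            self.pending.append(msg)
            return False
        self.flush()  # keep ordering: drain older deferred messages first
        self.xmit(msg)
        return True

    def flush(self):
        """Send everything that was deferred; call from softirq/task context."""
        while self.pending:
            self.xmit(self.pending.popleft())
```

The bounded queue is where the extra memory cost shows up, and the gap
between write() and flush() is the added latency. What to do on
overflow (drop oldest, drop newest, or something else) is a design
choice this sketch does not try to settle.
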

> Let me add some other developers who might have other opinions and help
> to decide what is the best approach.

-- 
Pavel Begunkov


