From: Thomas Gleixner <tglx@linutronix.de>
To: Liangyan <liangyan.peng@bytedance.com>
Cc: linux-kernel@vger.kernel.org,
Liangyan <liangyan.peng@bytedance.com>,
Yicong Shen <shenyicong.1023@bytedance.com>
Subject: Re: [RFC] genirq: Fix lockup in handle_edge_irq
Date: Wed, 02 Jul 2025 15:17:26 +0200 [thread overview]
Message-ID: <87a55mlok9.ffs@tglx> (raw)
In-Reply-To: <20250701163558.2588435-1-liangyan.peng@bytedance.com>
On Wed, Jul 02 2025 at 00:35, Liangyan wrote:
> void handle_edge_irq(struct irq_desc *desc)
> {
> + bool need_unmask = false;
> +
> guard(raw_spinlock)(&desc->lock);
>
> if (!irq_can_handle(desc)) {
> @@ -791,12 +793,16 @@ void handle_edge_irq(struct irq_desc *desc)
> if (unlikely(desc->istate & IRQS_PENDING)) {
> if (!irqd_irq_disabled(&desc->irq_data) &&
> irqd_irq_masked(&desc->irq_data))
> - unmask_irq(desc);
> + need_unmask = true;
> }
>
> handle_irq_event(desc);
>
> } while ((desc->istate & IRQS_PENDING) && !irqd_irq_disabled(&desc->irq_data));
> +
> + if (need_unmask && !irqd_irq_disabled(&desc->irq_data) &&
> + irqd_irq_masked(&desc->irq_data))
> + unmask_irq(desc);
This might work in your setup by some definition of "works", but it
breaks the semantics of this handler because of this:
device interrupt CPU0 CPU1
handle_edge_irq()
set(INPROGRESS);
do {
handle_event();
device interrupt
handle_edge_irq()
if (INPROGRESS) {
set(PENDING);
mask();
return;
}
...
if (PENDING) {
need_unmask = true;
}
handle_event();
device interrupt << possible FAIL
because there are enough edge type interrupt controllers out there which
lose an edge when the line is masked at the interrupt controller
level. As edge type interrupts are fire and forget from the device
perspective, the interrupt is not retriggered when unmasking later.
That's the reason why this handler is written the way it is and this
cannot be changed for obvious reasons.
So no, this is not going to happen.
The only possible solution for this is to analyze all interrupt
controllers, which are involved in the delivery chain, and establish
whether they are affected by the above problem. If not, then that
particular delivery chain combination of interrupt controllers can be
changed to use a different flow handler along with a profound
explanation why this is correct under all circumstances.
As you failed to provide any information about the involved controllers,
I cannot even give any hint about a possible solution.
Thanks,
tglx
next prev parent reply other threads:[~2025-07-02 13:17 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-01 16:35 [RFC] genirq: Fix lockup in handle_edge_irq Liangyan
2025-07-02 13:17 ` Thomas Gleixner [this message]
2025-07-03 15:31 ` [External] " Liangyan
2025-07-04 14:42 ` Thomas Gleixner
2025-07-04 15:36 ` Liangyan
2025-07-08 1:43 ` Liangyan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87a55mlok9.ffs@tglx \
--to=tglx@linutronix.de \
--cc=liangyan.peng@bytedance.com \
--cc=linux-kernel@vger.kernel.org \
--cc=shenyicong.1023@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.