From: Marc Zyngier <maz@kernel.org>
To: James Gowans <jgowans@amazon.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
<linux-kernel@vger.kernel.org>,
Liao Chang <liaochang1@huawei.com>,
KarimAllah Raslan <karahmed@amazon.com>,
Yipeng Zou <zouyipeng@huawei.com>,
Zhang Jianhua <chris.zjh@huawei.com>
Subject: Re: [PATCH 2/2] genirq: fasteoi resends interrupt on concurrent invoke
Date: Wed, 31 May 2023 08:00:37 +0100 [thread overview]
Message-ID: <867cspc7e2.wl-maz@kernel.org> (raw)
In-Reply-To: <20230530213848.3273006-2-jgowans@amazon.com>
On Tue, 30 May 2023 22:38:48 +0100,
James Gowans <jgowans@amazon.com> wrote:
>
> Update the generic handle_fasteoi_irq to cater for the case when the
> next interrupt comes in while the previous handler is still running.
> Currently when that happens the irq_may_run() early out causes the next
> IRQ to be lost. Change the behaviour to mark the interrupt as pending
> and re-send the interrupt when handle_fasteoi_irq sees that the pending
> flag has been set. This is largely inspired by handle_edge_irq.
>
> Generally it should not be possible for the next interrupt to arrive
> while the previous handler is still running: the next interrupt should
> only arrive after the EOI message has been sent and the previous handler
> has returned.
There is no such message with LPIs. I pointed that out previously.
> However, there is a race where if the interrupt affinity
> is changed while the previous handler is running, then the next
> interrupt can arrive at a different CPU while the previous handler is
> still running. In that case there will be a concurrent invoke and the
> early out will be taken.
>
> For example:
>
> CPU 0 | CPU 1
> -----------------------------|-----------------------------
> interrupt start |
> handle_fasteoi_irq | set_affinity(CPU 1)
> handler |
> ... | interrupt start
> ... | handle_fasteoi_irq -> early out
> handle_fasteoi_irq return | interrupt end
> interrupt end |
>
> This issue was observed specifically on an arm64 system with a GIC-v3
> handling MSIs; GIC-v3 uses the handle_fasteoi_irq handler. The issue is
> that the global ITS is responsible for affinity but does not know
> whether interrupts are pending/running, only the CPU-local redistributor
> handles the EOI. Hence when the affinity is changed in the ITS, the new
> CPU's redistributor does not know that the original CPU is still running
> the handler.
Similar to your previous patch, you don't explain *why* the interrupt
gets delivered when it is an LPI, and not for any of the other GICv3
interrupt types. That's an important point.
>
> Implementation notes:
>
> It is believed that it's NOT necessary to mask the interrupt in
> handle_fasteoi_irq() the way that handle_edge_irq() does. This is
> because handle_edge_irq() caters for controllers which are too simple to
> gate interrupts from the same source, so the kernel explicitly masks the
> interrupt if it re-occurs [0].
>
> [0] https://lore.kernel.org/all/bf94a380-fadd-8c38-cc51-4b54711d84b3@huawei.com/
>
> Suggested-by: Liao Chang <liaochang1@huawei.com>
> Signed-off-by: James Gowans <jgowans@amazon.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: KarimAllah Raslan <karahmed@amazon.com>
> Cc: Yipeng Zou <zouyipeng@huawei.com>
> Cc: Zhang Jianhua <chris.zjh@huawei.com>
> ---
> kernel/irq/chip.c | 15 ++++++++++++++-
> 1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index 49e7bc871fec..42f33e77c16b 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -692,8 +692,15 @@ void handle_fasteoi_irq(struct irq_desc *desc)
>
> raw_spin_lock(&desc->lock);
>
> - if (!irq_may_run(desc))
> + /*
> + * When an affinity change races with IRQ delivery, the next interrupt
> + * can arrive on the new CPU before the original CPU has completed
> + * handling the previous one. Mark it as pending and return EOI.
> + */
> + if (!irq_may_run(desc)) {
> + desc->istate |= IRQS_PENDING;
> goto out;
> + }
>
> desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING);
>
> @@ -715,6 +722,12 @@ void handle_fasteoi_irq(struct irq_desc *desc)
>
> cond_unmask_eoi_irq(desc, chip);
>
> + /*
> + * When the race descibed above happens, this will resend the interrupt.
> + */
> + if (unlikely(desc->istate & IRQS_PENDING))
> + check_irq_resend(desc, false);
> +
> raw_spin_unlock(&desc->lock);
> return;
> out:
While I'm glad that you eventually decided to use the resend mechanism
instead of spinning on the "old" CPU, I still think imposing this
behaviour on all users without any discrimination is wrong.
Look at what it does if an interrupt is a wake-up source. You'd
pointlessly requeue the interrupt (bonus points if the irqchip doesn't
provide a HW-based retrigger mechanism).
I still maintain that this change should only be applied for the
particular interrupts that *require* it, and not as a blanket change
affecting everything under the sun. I have proposed such a change in
the past, feel free to use it or roll your own.
In the meantime, I strongly oppose this change as proposed.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
next prev parent reply other threads:[~2023-05-31 7:01 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-30 21:38 [PATCH 1/2] genirq: Expand doc for PENDING and REPLAY flags James Gowans
2023-05-30 21:38 ` [PATCH 2/2] genirq: fasteoi resends interrupt on concurrent invoke James Gowans
2023-05-30 23:32 ` Randy Dunlap
2023-05-31 7:00 ` Marc Zyngier [this message]
2023-06-01 7:24 ` Gowans, James
2023-06-05 16:10 ` Gowans, James
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=867cspc7e2.wl-maz@kernel.org \
--to=maz@kernel.org \
--cc=chris.zjh@huawei.com \
--cc=jgowans@amazon.com \
--cc=karahmed@amazon.com \
--cc=liaochang1@huawei.com \
--cc=linux-kernel@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=zouyipeng@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.