From: "Gowans, James" <jgowans@amazon.com>
To: "maz@kernel.org" <maz@kernel.org>
Cc: "tglx@linutronix.de" <tglx@linutronix.de>,
"Raslan, KarimAllah" <karahmed@amazon.com>,
"Woodhouse, David" <dwmw@amazon.co.uk>,
"zouyipeng@huawei.com" <zouyipeng@huawei.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"Sironi, Filippo" <sironi@amazon.de>,
"chris.zjh@huawei.com" <chris.zjh@huawei.com>
Subject: Re: [PATCH] irq: fasteoi handler re-runs on concurrent invoke
Date: Tue, 18 Apr 2023 10:56:07 +0000 [thread overview]
Message-ID: <7fdfb01590d8e502f384aa0bb0dc9c614caa5dfc.camel@amazon.com> (raw)
In-Reply-To: <86pm89kyyt.wl-maz@kernel.org>
On Wed, 2023-04-12 at 14:32 +0100, Marc Zyngier wrote:
>
> > > > 1. Do we need to mask the IRQ and then unmask it later? I don't think so
> > > > but it's not entirely clear why handle_edge_irq does this anyway; it's
> > > > an edge IRQ so not sure why it needs to be masked.
> > >
> > > Please measure that cost and weep, specially in the context of
> > > multiple concurrent interrupts serviced by a single ITS (cost of
> > > locking the command queue, of waiting for a full round trip to the ITS
> > > for a couple of commands...).
> >
> > Fortunately this mask/unmasking would only happen in the rare(ish) cased of the
> > race condition described here being hit. Exactly the same as
> > with handle_edge_irq(), the masking and later unmasking would only be done
> > when irq_may_run() == false due to the race being hit. Considering that this is
> > a rare occurrence, I think we could stomach the occasional overhead? I was more
> > asking if it's actually *necessary* to do this masking/unmasking. I'm not sure
> > it's necessary anyway, hence it wasn't implemented in my patch.
>
> But does it solve anything? At the point where you mask the interrupt,
> you already have consumed it. You'd still need to make it pending
> somehow, which is what your patch somehow.
I don't really know - the reason I asked the question is that the related
handle_edge_irq() does this mask/unmasking, and I wasn't quite sure why it
did that and hence if we needed to do something similar.
Anyway, let's focus on your patch rather - I think it's more compelling.
> > Yes. This bothered me too initially, but on reflection I'm not sure it's
> > actually a problem. One possible issue that came to mind was around CPU
> > offlining, but in the event that a CPU being offlined was running interrupt
> > handlers it wouldn't be able to complete the offline anyway until the handlers
> > were finished, so I don't think this is an issue. Do you see any practical issue
> > with running the handler once more on the original CPU immediately after the
> > affinity has been changed?
>
> My take on this is that we put the pressure on the CPU we want to move
> away from. I'd rather we put the it on the GIC itself, and use its
> Turing-complete powers to force it to redeliver the interrupt at a
> more convenient time.
This idea and implementation looks and works great! It may need a few
tweaks; discussing below.
>
> From c96d2ab37fe273724f1264fba5f4913259875d56 Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <maz@kernel.org>
> Date: Mon, 10 Apr 2023 10:56:32 +0100
> Subject: [PATCH] irqchip/gicv3-its: Force resend of LPIs taken while
> already
> in-progress
Perhaps you can pillage some of my commit message to explain the race here
when you send this patch?
>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
>
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index b1b28affb32a..4b2a7cc96eb2 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -223,6 +223,8 @@ struct irq_data {
> * irq_chip::irq_set_affinity() when
> deactivated.
> * IRQD_IRQ_ENABLED_ON_SUSPEND - Interrupt is enabled on suspend by irq
> pm if
> * irqchip have flag
> IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND set.
> + * IRQD_RESEND_WHEN_IN_PROGRESS - Interrupt may fire when already in
> progress,
> + * needs resending.
> */
> enum {
> IRQD_TRIGGER_MASK = 0xf,
> @@ -249,6 +251,7 @@ enum {
> IRQD_HANDLE_ENFORCE_IRQCTX = (1 << 28),
> IRQD_AFFINITY_ON_ACTIVATE = (1 << 29),
> IRQD_IRQ_ENABLED_ON_SUSPEND = (1 << 30),
> + IRQD_RESEND_WHEN_IN_PROGRESS = (1 << 31),
> };
Do we really want a new flag here? I'd be keen to fix this race for all
drivers, not just those who know to set this flag. I think the patch
you're suggesting is pretty close to being safe to enable generally? If so
my preference is for one less config option - just run it always.
> static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index 49e7bc871fec..73546ba8bc43 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -692,8 +692,11 @@ void handle_fasteoi_irq(struct irq_desc *desc)
>
> raw_spin_lock(&desc->lock);
>
> - if (!irq_may_run(desc))
> + if (!irq_may_run(desc)) {
> + if (irqd_needs_resend_when_in_progress(&desc->irq_data))
> + check_irq_resend(desc, true);
> goto out;
> + }
This will run check_irq_resend() on the *newly affined* CPU, while the old
one is still running the original handler. AFAICT what will happen is:
check_irq_resend
try_retrigger
irq_chip_retrigger_hierarchy
its_irq_retrigger
... which will cause the ITS to *immediately* re-trigger the IRQ. The
original CPU can still be running the handler in that case.
If that happens, consider what will happen in check_irq_resend:
- first IRQ comes in, successflly runs try_retrigger and sets IRQS_REPLAY.
- it is *immediately* retriggered by ITS, and because the original handler
on the other CPU is still running, comes into check_irq_resend again.
- check_irq_resend now observes that IRQS_REPLAY is set and early outs.
- No more resends, the IRQ is still lost. :-(
Now I admit the failure mode is getting a bit pathological: two re-
triggers while the original handler is still running, but I was able to
hit this on my test machine by intentionally slowing
the handler down by a few dozen micros. Should we cater for this?
I can see two possibilities:
- tweak check_irq_resend() to not early-out in this case but to keep re-
triggering until it eventually runs.
- move the check_irq_resend to only happen later, *after* the original
handler has finished running. This would be very similar to what I
suggested in my original patch, except instead of running a do/while loop,
the code would observe that the pending flag was set again and run
check_irq_resend.
I'm also wondering what will happen for users who don't have the
chip->irq_retrigger callback set and fall back to the tasklet
via irq_sw_resend()... Looks like it will work fine. However if we do my
suggestion and move check_irq_resend to the end of handle_fasteoi_irq then
the tasklet will be scheduled on the old CPU again, which may be sub-
optimal.
JG
next prev parent reply other threads:[~2023-04-18 10:56 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-17 9:53 [PATCH] irq: fasteoi handler re-runs on concurrent invoke James Gowans
2023-03-17 10:12 ` Yipeng Zou
2023-03-17 11:49 ` Gowans, James
2023-03-22 6:26 ` Yipeng Zou
2023-03-22 7:48 ` Gowans, James
2023-03-22 10:37 ` Thomas Gleixner
2023-04-03 13:17 ` zhangjianhua (E)
2023-04-03 13:19 ` Marc Zyngier
2023-04-03 13:39 ` Gowans, James
2023-03-22 10:38 ` Yipeng Zou
2023-04-09 11:41 ` Marc Zyngier
2023-04-11 10:27 ` Gowans, James
2023-04-12 13:32 ` Marc Zyngier
2023-04-18 2:39 ` Yipeng Zou
2023-04-18 10:56 ` Gowans, James [this message]
2023-04-19 3:08 ` Yipeng Zou
2023-05-02 8:43 ` Gowans, James
2023-05-23 3:16 ` liaochang (A)
2023-05-25 10:04 ` Gowans, James
2023-05-29 2:47 ` Liao, Chang
2023-05-30 21:47 ` Gowans, James
[not found] ` <86sfcfghqh.wl-maz@kernel.org>
2023-05-23 12:47 ` Gowans, James
2023-05-25 12:31 ` Liao, Chang
2023-05-02 10:28 ` Marc Zyngier
2023-05-23 3:16 ` liaochang (A)
2023-05-23 3:15 ` liaochang (A)
2023-05-23 11:59 ` Gowans, James
2023-05-25 12:31 ` Liao, Chang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7fdfb01590d8e502f384aa0bb0dc9c614caa5dfc.camel@amazon.com \
--to=jgowans@amazon.com \
--cc=chris.zjh@huawei.com \
--cc=dwmw@amazon.co.uk \
--cc=karahmed@amazon.com \
--cc=linux-kernel@vger.kernel.org \
--cc=maz@kernel.org \
--cc=sironi@amazon.de \
--cc=tglx@linutronix.de \
--cc=zouyipeng@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox