From: Marc Zyngier <maz@kernel.org>
To: Valentin Schneider <valentin.schneider@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>,
linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
Jason Cooper <jason@lakedaemon.net>
Subject: Re: [RFC PATCH] irqchip/gic: Implement irq_chip->irq_retrigger()
Date: Fri, 10 Jul 2020 17:29:49 +0100 [thread overview]
Message-ID: <87tuyfxria.wl-maz@kernel.org> (raw)
In-Reply-To: <20200710145642.28978-1-valentin.schneider@arm.com>
On Fri, 10 Jul 2020 15:56:42 +0100,
Valentin Schneider <valentin.schneider@arm.com> wrote:
>
> While digging around IRQCHIP_EOI_IF_HANDLED and irq/resend.c, it has come
> to my attention that the IRQ resend situation seems a bit precarious for
> the GIC(s). Below is a (somewhat structured) dump of my notes/thoughts
> about that.
>
> IRQCHIP_EOI_IF_HANDLED
> ======================
>
> If the irqchip doesn't have this flag set, we may issue an irq_eoi()
> despite not having handled the IRQ in the following cases:
>
> o !irq_may_run()
> - IRQ may be in progress (handle_irq_event() ongoing)
> - IRQ is an armed wakeup source (also sets it pending)
>
> o !action or IRQ disabled; in this case it is set as pending.
>
> irq_resend()
> ============
>
> For the above cases where the IRQ is marked as pending, it means that when
> we'll go through check_irq_resend() (e.g. when we re-enable the IRQ) we
> will end up in resend_irqs() which goes through the flow handler again, IOW
> will issue *another* EOI on the same IRQ.
>
> Now, this is all done via a tasklet, so AFAICT cannot happen in irq
> context (as defined by in_interrupt() - it can happen at the tail of
> handling an IRQ if it wasn't nesting).
>
> GIC woes
> ========
>
> The TL;DR for IRQ handling on the GIC is that we have 3 steps:
> o Acknowledgement (Ack)
> o priority drop (PD)
> o deactivation (D)
>
> The GIC also has an "eoimode" knob that says whether PD and D are split, IOW:
> o eoimode=0: irq_eoi() does PD + D
> o eoimode=1: irq_eoi() does D
>
> Regardless of the mode, it is paramount that any PD is
> o issued on the same CPU that Ack'd the IRQ
> o issued in reverse order of the Acks
>
> IHI0069D, 4.1.1 Physical CPU interface, Priority drop
> """
> A priority drop must be performed by the same PE that activated the
> interrupt.
>
> [...]
>
> For each CPU interface, the GIC architecture requires the order of the
> valid writes to ICC_EOIR0_EL1 and ICC_EOIR1_EL1 to be the exact reverse of
> the order of the reads from ICC_IAR0_EL1 and ICC_IAR1_EL1
> """
>
> IHI0069D, 8.2.9 ICC_EOIR0_EL1, Interrupt Controller End Of Interrupt Register 0
> """
> A write to this register must correspond to the most recent valid read by
> this PE from an Interrupt Acknowledge Register, and must correspond to the
> INTID that was read from ICC_IAR0_EL1, otherwise the system behavior is
> UNPREDICTABLE.
> """
>
> For eoimode=1, the PD is hidden from genirq and is contained within the GIC
> driver. This means that issuing another irq_eoi() will only be re-issuing a
> D, which I think the GIC can live with (even if issued from a different CPU).
>
> For eoimode=0, it is more troubling, as we break the aforementioned
> restrictions. That said, IIUC this cannot cause e.g. a bogus running
> priority reduction due to the tasklet context mentioned above (running
> priority ought to be idle priority).
>
> Linux hosts will almost always use eoimode=1, so that leaves us with
> guests running with eoimode=0, and I don't know how common it is (if at all
> possible) for those to use pm / wakeup IRQs. I suppose that is a reason
> why this hasn't cropped up before, that or I miserably misunderstood the
> whole thing.
>
> In any case, the virtual interface follows the same restrictions wrt
> PD ordering:
>
> IHI0069D 8.3.7 ICV_EOIR0_EL1, Interrupt Controller Virtual End Of Interrupt Register 0
> """
> A write to this register must correspond to the most recent valid read by
> this vPE from a Virtual Interrupt Acknowledge Register, and must correspond
> to the INTID that was read from ICV_IAR0_EL1, otherwise the system behavior
> is UNPREDICTABLE.
> """
>
> Change
> ======
>
> One way to ensure we only get one PD per interrupt activation and maintain
> the expected ordering is to add the IRQCHIP_EOI_IF_HANDLED flag to all
> irq-gic chips, but that only really works for eoimode=1; for eoimode=0 that
> would mean leaving the flow handler without having issued a PD at all.
>
> At the same time, this whole IRQS_PENDING & resend thing feels like
> something we can handle in hardware: we can leverage
> irq_chip.irq_retrigger() and use this to mark the interrupt as pending in
> the GIC itself, in which case we *don't* want to have
> IRQCHIP_EOI_IF_HANDLED (as the retrigger will lead to another ack+eoi
> pair).
>
> Implement irq_chip.irq_retrigger() for both GICs.
Although I am very grateful for the whole documentation, I'd rather
have a slightly more condensed changelog that documents the
implementation of the retrigger callback! ;-)
>
> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
> ---
> drivers/irqchip/irq-gic-v3.c | 7 +++++++
> drivers/irqchip/irq-gic.c | 6 ++++++
> 2 files changed, 13 insertions(+)
>
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index cc46bc2d634b..c025e8b51464 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -1207,6 +1207,11 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
> #define gic_smp_init() do { } while(0)
> #endif
>
> +static int gic_retrigger(struct irq_data *data)
> +{
> + return gic_irq_set_irqchip_state(data, IRQCHIP_STATE_PENDING, true);
If I'm not mistaken, check_irq_resend() requires a non-zero return
value if the retrigger has succeeded. So something like
return !gic_irq_set_irqchip_state(data, IRQCHIP_STATE_PENDING, true);
would be more appropriate.
Otherwise, looks good.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
WARNING: multiple messages have this Message-ID (diff)
From: Marc Zyngier <maz@kernel.org>
To: Valentin Schneider <valentin.schneider@arm.com>
Cc: linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
Thomas Gleixner <tglx@linutronix.de>,
Jason Cooper <jason@lakedaemon.net>,
Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Subject: Re: [RFC PATCH] irqchip/gic: Implement irq_chip->irq_retrigger()
Date: Fri, 10 Jul 2020 17:29:49 +0100 [thread overview]
Message-ID: <87tuyfxria.wl-maz@kernel.org> (raw)
In-Reply-To: <20200710145642.28978-1-valentin.schneider@arm.com>
On Fri, 10 Jul 2020 15:56:42 +0100,
Valentin Schneider <valentin.schneider@arm.com> wrote:
>
> While digging around IRQCHIP_EOI_IF_HANDLED and irq/resend.c, it has come
> to my attention that the IRQ resend situation seems a bit precarious for
> the GIC(s). Below is a (somewhat structured) dump of my notes/thoughts
> about that.
>
> IRQCHIP_EOI_IF_HANDLED
> ======================
>
> If the irqchip doesn't have this flag set, we may issue an irq_eoi()
> despite not having handled the IRQ in the following cases:
>
> o !irq_may_run()
> - IRQ may be in progress (handle_irq_event() ongoing)
> - IRQ is an armed wakeup source (also sets it pending)
>
> o !action or IRQ disabled; in this case it is set as pending.
>
> irq_resend()
> ============
>
> For the above cases where the IRQ is marked as pending, it means that when
> we'll go through check_irq_resend() (e.g. when we re-enable the IRQ) we
> will end up in resend_irqs() which goes through the flow handler again, IOW
> will issue *another* EOI on the same IRQ.
>
> Now, this is all done via a tasklet, so AFAICT cannot happen in irq
> context (as defined by in_interrupt() - it can happen at the tail of
> handling an IRQ if it wasn't nesting).
>
> GIC woes
> ========
>
> The TL;DR for IRQ handling on the GIC is that we have 3 steps:
> o Acknowledgement (Ack)
> o priority drop (PD)
> o deactivation (D)
>
> The GIC also has an "eoimode" knob that says whether PD and D are split, IOW:
> o eoimode=0: irq_eoi() does PD + D
> o eoimode=1: irq_eoi() does D
>
> Regardless of the mode, it is paramount that any PD is
> o issued on the same CPU that Ack'd the IRQ
> o issued in reverse order of the Acks
>
> IHI0069D, 4.1.1 Physical CPU interface, Priority drop
> """
> A priority drop must be performed by the same PE that activated the
> interrupt.
>
> [...]
>
> For each CPU interface, the GIC architecture requires the order of the
> valid writes to ICC_EOIR0_EL1 and ICC_EOIR1_EL1 to be the exact reverse of
> the order of the reads from ICC_IAR0_EL1 and ICC_IAR1_EL1
> """
>
> IHI0069D, 8.2.9 ICC_EOIR0_EL1, Interrupt Controller End Of Interrupt Register 0
> """
> A write to this register must correspond to the most recent valid read by
> this PE from an Interrupt Acknowledge Register, and must correspond to the
> INTID that was read from ICC_IAR0_EL1, otherwise the system behavior is
> UNPREDICTABLE.
> """
>
> For eoimode=1, the PD is hidden from genirq and is contained within the GIC
> driver. This means that issuing another irq_eoi() will only be re-issuing a
> D, which I think the GIC can live with (even if issued from a different CPU).
>
> For eoimode=0, it is more troubling, as we break the aforementioned
> restrictions. That said, IIUC this cannot cause e.g. a bogus running
> priority reduction due to the tasklet context mentioned above (running
> priority ought to be idle priority).
>
> Linux hosts will almost always use eoimode=1, so that leaves us with
> guests running with eoimode=0, and I don't know how common it is (if at all
> possible) for those to use pm / wakeup IRQs. I suppose that is a reason
> why this hasn't cropped up before, that or I miserably misunderstood the
> whole thing.
>
> In any case, the virtual interface follows the same restrictions wrt
> PD ordering:
>
> IHI0069D 8.3.7 ICV_EOIR0_EL1, Interrupt Controller Virtual End Of Interrupt Register 0
> """
> A write to this register must correspond to the most recent valid read by
> this vPE from a Virtual Interrupt Acknowledge Register, and must correspond
> to the INTID that was read from ICV_IAR0_EL1, otherwise the system behavior
> is UNPREDICTABLE.
> """
>
> Change
> ======
>
> One way to ensure we only get one PD per interrupt activation and maintain
> the expected ordering is to add the IRQCHIP_EOI_IF_HANDLED flag to all
> irq-gic chips, but that only really works for eoimode=1; for eoimode=0 that
> would mean leaving the flow handler without having issued a PD at all.
>
> At the same time, this whole IRQS_PENDING & resend thing feels like
> something we can handle in hardware: we can leverage
> irq_chip.irq_retrigger() and use this to mark the interrupt as pending in
> the GIC itself, in which case we *don't* want to have
> IRQCHIP_EOI_IF_HANDLED (as the retrigger will lead to another ack+eoi
> pair).
>
> Implement irq_chip.irq_retrigger() for both GICs.
Although I am very grateful for the whole documentation, I'd rather
have a slightly more condensed changelog that documents the
implementation of the retrigger callback! ;-)
>
> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
> ---
> drivers/irqchip/irq-gic-v3.c | 7 +++++++
> drivers/irqchip/irq-gic.c | 6 ++++++
> 2 files changed, 13 insertions(+)
>
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index cc46bc2d634b..c025e8b51464 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -1207,6 +1207,11 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
> #define gic_smp_init() do { } while(0)
> #endif
>
> +static int gic_retrigger(struct irq_data *data)
> +{
> + return gic_irq_set_irqchip_state(data, IRQCHIP_STATE_PENDING, true);
If I'm not mistaken, check_irq_resend() requires a non-zero return
value if the retrigger has succeeded. So something like
return !gic_irq_set_irqchip_state(data, IRQCHIP_STATE_PENDING, true);
would be more appropriate.
Otherwise, looks good.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
next prev parent reply other threads:[~2020-07-10 16:31 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-10 14:56 [RFC PATCH] irqchip/gic: Implement irq_chip->irq_retrigger() Valentin Schneider
2020-07-10 14:56 ` Valentin Schneider
2020-07-10 16:29 ` Marc Zyngier [this message]
2020-07-10 16:29 ` Marc Zyngier
2020-07-10 17:08 ` Valentin Schneider
2020-07-10 17:08 ` Valentin Schneider
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87tuyfxria.wl-maz@kernel.org \
--to=maz@kernel.org \
--cc=Lorenzo.Pieralisi@arm.com \
--cc=jason@lakedaemon.net \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=valentin.schneider@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.