All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: James Gowans <jgowans@amazon.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	<linux-kernel@vger.kernel.org>,
	Liao Chang <liaochang1@huawei.com>,
	KarimAllah Raslan <karahmed@amazon.com>,
	Yipeng Zou <zouyipeng@huawei.com>,
	Zhang Jianhua <chris.zjh@huawei.com>
Subject: Re: [PATCH v3 2/2] genirq: fasteoi supports resend on concurrent invoke
Date: Tue, 06 Jun 2023 18:05:42 +0100	[thread overview]
Message-ID: <87jzwgo715.wl-maz@kernel.org> (raw)
In-Reply-To: <20230605155723.2628097-2-jgowans@amazon.com>

On Mon, 05 Jun 2023 16:57:23 +0100,
James Gowans <jgowans@amazon.com> wrote:
> 
> ... and enable that functionality for GIC-v3 only.

nit: drop the multi-line subject.

>
> Update the generic handle_fasteoi_irq to cater for the case when the
> next interrupt comes in while the previous handler is still running.
> Currently when that happens the irq_may_run() early out causes the next
> IRQ to be lost. Support marking the interrupt as pending when that
> happens so that the interrupt can be resent just before
> handle_fasteoi_irq returns.  This is largely inspired by
> handle_edge_irq.
> 
> Generally it should not be possible for the next interrupt to arrive
> while the previous handler is still running: the CPU will not preempt an
> interrupt with another from the same source or same priority. However,
> there is a race: if the interrupt affinity is changed while the previous
> handler is running, then the next interrupt can arrive at a different
> CPU while the previous handler is still running. In that case there will
> be a concurrent invoke and the early out will be taken.
> 
> For example:
> 
>            CPU 0             |          CPU 1
> -----------------------------|-----------------------------
> interrupt start              |
>   handle_fasteoi_irq         | set_affinity(CPU 1)
>     handler                  |
>     ...                      | interrupt start
>     ...                      |   handle_fasteoi_irq -> early out
>   handle_fasteoi_irq return  | interrupt end
> interrupt end                |
> 
> This issue was observed specifically on an arm64 system with a GIC-v3
> handling MSIs via LPI; GIC-v3 uses the handle_fasteoi_irq handler. The
> issue is that the GIC-v3's physical LPIs do not have a global active
> state. If LPIs had an active state, then it would not be able to be
> retriggered until the first CPU had issued a deactivation. Without this
> global active state, when the affinity is change the next interrupt can
> be triggered on the new CPU while the old CPU is still running the
> handler from the previous interrupt.
> 
> Implementation notes:
> 
> It is believed that it's NOT necessary to mask the interrupt in
> handle_fasteoi_irq() the way that handle_edge_irq() does. This is
> because handle_edge_irq() caters for controllers which are too simple to
> gate interrupts from the same source, so the kernel explicitly masks the
> interrupt if it re-occurs [0].
> 
> The resend on concurrent invoke logic is gated by a flag which the
> interrupt controller can set if it's susceptible to this problem. It is
> not desirable to resend unconditionally: a wake up source for example
> has no need to be re-sent.
> 
> [0] https://lore.kernel.org/all/bf94a380-fadd-8c38-cc51-4b54711d84b3@huawei.com/
> 
> Suggested-by: Liao Chang <liaochang1@huawei.com>
> Suggested-by: Marc Zyngier <maz@kernel.org>
> Signed-off-by: James Gowans <jgowans@amazon.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: KarimAllah Raslan <karahmed@amazon.com>
> Cc: Yipeng Zou <zouyipeng@huawei.com>
> Cc: Zhang Jianhua <chris.zjh@huawei.com>
> ---
>  drivers/irqchip/irq-gic-v3-its.c |  2 ++
>  include/linux/irq.h              | 13 +++++++++++++
>  kernel/irq/chip.c                | 16 +++++++++++++++-
>  kernel/irq/debugfs.c             |  2 ++
>  4 files changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 586271b8aa39..d9becfe696f0 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3574,6 +3574,7 @@ static int its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
>  		irqd = irq_get_irq_data(virq + i);
>  		irqd_set_single_target(irqd);
>  		irqd_set_affinity_on_activate(irqd);
> +		irqd_set_resend_when_in_progress(irqd);
>  		pr_debug("ID:%d pID:%d vID:%d\n",
>  			 (int)(hwirq + i - its_dev->event_map.lpi_base),
>  			 (int)(hwirq + i), virq + i);
> @@ -4512,6 +4513,7 @@ static int its_vpe_irq_domain_alloc(struct irq_domain *domain, unsigned int virq
>  		irq_domain_set_hwirq_and_chip(domain, virq + i, i,
>  					      irqchip, vm->vpes[i]);
>  		set_bit(i, bitmap);
> +		irqd_set_resend_when_in_progress(irq_get_irq_data(virq + i));
>  	}
>  
>  	if (err) {
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index b1b28affb32a..b76cc90faebd 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -223,6 +223,8 @@ struct irq_data {
>   *				  irq_chip::irq_set_affinity() when deactivated.
>   * IRQD_IRQ_ENABLED_ON_SUSPEND	- Interrupt is enabled on suspend by irq pm if
>   *				  irqchip have flag IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND set.
> + * RQD_RESEND_WHEN_IN_PROGRESS	- Interrupt may fire when already in progress in which
> + *				  case it must be resent at the next available opportunity.
>   */
>  enum {
>  	IRQD_TRIGGER_MASK		= 0xf,
> @@ -249,6 +251,7 @@ enum {
>  	IRQD_HANDLE_ENFORCE_IRQCTX	= (1 << 28),
>  	IRQD_AFFINITY_ON_ACTIVATE	= (1 << 29),
>  	IRQD_IRQ_ENABLED_ON_SUSPEND	= (1 << 30),
> +	IRQD_RESEND_WHEN_IN_PROGRESS    = (1 << 31),

Make this unsigned, as it otherwise has the potential to sign-extend
and lead to "fun to debug" issues.

>  };
>  
>  #define __irqd_to_state(d) ACCESS_PRIVATE((d)->common, state_use_accessors)
> @@ -448,6 +451,16 @@ static inline bool irqd_affinity_on_activate(struct irq_data *d)
>  	return __irqd_to_state(d) & IRQD_AFFINITY_ON_ACTIVATE;
>  }
>  
> +static inline void irqd_set_resend_when_in_progress(struct irq_data *d)
> +{
> +	__irqd_to_state(d) |= IRQD_RESEND_WHEN_IN_PROGRESS;
> +}
> +
> +static inline bool irqd_needs_resend_when_in_progress(struct irq_data *d)
> +{
> +	return __irqd_to_state(d) & IRQD_RESEND_WHEN_IN_PROGRESS;
> +}
> +
>  #undef __irqd_to_state
>  
>  static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index 49e7bc871fec..a531692ca62f 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -692,8 +692,16 @@ void handle_fasteoi_irq(struct irq_desc *desc)
>  
>  	raw_spin_lock(&desc->lock);
>  
> -	if (!irq_may_run(desc))
> +	/*
> +	 * When an affinity change races with IRQ delivery, the next interrupt

The race is with the IRQ *handling*.

> +	 * can arrive on the new CPU before the original CPU has completed
> +	 * handling the previous one - it may need to be resent.
> +	 */
> +	if (!irq_may_run(desc)) {
> +		if (irqd_needs_resend_when_in_progress(&desc->irq_data))
> +			desc->istate |= IRQS_PENDING;
>  		goto out;
> +	}
>  
>  	desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING);
>  
> @@ -715,6 +723,12 @@ void handle_fasteoi_irq(struct irq_desc *desc)
>  
>  	cond_unmask_eoi_irq(desc, chip);
>  
> +	/*
> +	 * When the race described above happens this will resend the interrupt.
> +	 */
> +	if (unlikely(desc->istate & IRQS_PENDING))
> +		check_irq_resend(desc, false);
> +
>  	raw_spin_unlock(&desc->lock);
>  	return;
>  out:
> diff --git a/kernel/irq/debugfs.c b/kernel/irq/debugfs.c
> index bbcaac64038e..5971a66be034 100644
> --- a/kernel/irq/debugfs.c
> +++ b/kernel/irq/debugfs.c
> @@ -133,6 +133,8 @@ static const struct irq_bit_descr irqdata_states[] = {
>  	BIT_MASK_DESCR(IRQD_HANDLE_ENFORCE_IRQCTX),
>  
>  	BIT_MASK_DESCR(IRQD_IRQ_ENABLED_ON_SUSPEND),
> +
> +	BIT_MASK_DESCR(IRQD_RESEND_WHEN_IN_PROGRESS),
>  };
>  
>  static const struct irq_bit_descr irqdesc_states[] = {

If you respin this series shortly (*with* a cover letter, please),
I'll queue it so that it can simmer in -next for a bit.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

  parent reply	other threads:[~2023-06-06 17:06 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-05 15:57 [PATCH v3 1/2] genirq: Expand doc for PENDING and REPLAY flags James Gowans
2023-06-05 15:57 ` [PATCH v3 2/2] genirq: fasteoi supports resend on concurrent invoke James Gowans
2023-06-06  2:06   ` Randy Dunlap
2023-06-06 17:05   ` Marc Zyngier [this message]
2023-06-07 12:21     ` Gowans, James
2023-06-07 13:13       ` Marc Zyngier
2023-06-08 12:12         ` Gowans, James

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87jzwgo715.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=chris.zjh@huawei.com \
    --cc=jgowans@amazon.com \
    --cc=karahmed@amazon.com \
    --cc=liaochang1@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=zouyipeng@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.