From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 06 Jun 2023 18:05:42 +0100
Message-ID: <87jzwgo715.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: James Gowans <jgowans@amazon.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org,
	Liao Chang <liaochang1@huawei.com>,
	KarimAllah Raslan <karahmed@amazon.com>,
	Yipeng Zou <zouyipeng@huawei.com>,
	Zhang Jianhua <chris.zjh@huawei.com>
Subject: Re: [PATCH v3 2/2] genirq: fasteoi supports resend on concurrent invoke
In-Reply-To: <20230605155723.2628097-2-jgowans@amazon.com>
References: <20230605155723.2628097-1-jgowans@amazon.com>
	<20230605155723.2628097-2-jgowans@amazon.com>

On Mon, 05 Jun 2023 16:57:23 +0100,
James Gowans <jgowans@amazon.com> wrote:
> 
> ... and enable that functionality for GIC-v3 only.

nit: drop the multi-line subject.

> 
> Update the generic handle_fasteoi_irq to cater for the case when the
> next interrupt comes in while the previous handler is still running.
> Currently when that happens the irq_may_run() early out causes the next
> IRQ to be lost. Support marking the interrupt as pending when that
> happens so that the interrupt can be resent just before
> handle_fasteoi_irq returns. This is largely inspired by
> handle_edge_irq.
> 
> Generally it should not be possible for the next interrupt to arrive
> while the previous handler is still running: the CPU will not preempt an
> interrupt with another from the same source or same priority.
> However, there is a race: if the interrupt affinity is changed while
> the previous handler is running, then the next interrupt can arrive at
> a different CPU while the previous handler is still running. In that
> case there will be a concurrent invoke and the early out will be taken.
> 
> For example:
> 
>            CPU 0             |          CPU 1
> -----------------------------|-----------------------------
> interrupt start              |
>   handle_fasteoi_irq         | set_affinity(CPU 1)
>     handler                  |
>     ...                      | interrupt start
>     ...                      |   handle_fasteoi_irq -> early out
>   handle_fasteoi_irq return  | interrupt end
> interrupt end                |
> 
> This issue was observed specifically on an arm64 system with a GIC-v3
> handling MSIs via LPI; GIC-v3 uses the handle_fasteoi_irq handler. The
> issue is that the GIC-v3's physical LPIs do not have a global active
> state. If LPIs had an active state, then the interrupt could not be
> retriggered until the first CPU had issued a deactivation. Without this
> global active state, when the affinity is changed the next interrupt
> can be triggered on the new CPU while the old CPU is still running the
> handler from the previous interrupt.
> 
> Implementation notes:
> 
> It is believed that it's NOT necessary to mask the interrupt in
> handle_fasteoi_irq() the way that handle_edge_irq() does. This is
> because handle_edge_irq() caters for controllers which are too simple
> to gate interrupts from the same source, so the kernel explicitly masks
> the interrupt if it re-occurs [0].
> 
> The resend on concurrent invoke logic is gated by a flag which the
> interrupt controller can set if it's susceptible to this problem. It is
> not desirable to resend unconditionally: a wake up source for example
> has no need to be re-sent.
> 
> [0] https://lore.kernel.org/all/bf94a380-fadd-8c38-cc51-4b54711d84b3@huawei.com/
> 
> Suggested-by: Liao Chang <liaochang1@huawei.com>
> Suggested-by: Marc Zyngier <maz@kernel.org>
> Signed-off-by: James Gowans <jgowans@amazon.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: KarimAllah Raslan <karahmed@amazon.com>
> Cc: Yipeng Zou <zouyipeng@huawei.com>
> Cc: Zhang Jianhua <chris.zjh@huawei.com>
> ---
>  drivers/irqchip/irq-gic-v3-its.c |  2 ++
>  include/linux/irq.h              | 13 +++++++++++++
>  kernel/irq/chip.c                | 16 +++++++++++++++-
>  kernel/irq/debugfs.c             |  2 ++
>  4 files changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 586271b8aa39..d9becfe696f0 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3574,6 +3574,7 @@ static int its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
>  		irqd = irq_get_irq_data(virq + i);
>  		irqd_set_single_target(irqd);
>  		irqd_set_affinity_on_activate(irqd);
> +		irqd_set_resend_when_in_progress(irqd);
>  		pr_debug("ID:%d pID:%d vID:%d\n",
>  			 (int)(hwirq + i - its_dev->event_map.lpi_base),
>  			 (int)(hwirq + i), virq + i);
> @@ -4512,6 +4513,7 @@ static int its_vpe_irq_domain_alloc(struct irq_domain *domain, unsigned int virq
>  		irq_domain_set_hwirq_and_chip(domain, virq + i, i,
>  					      irqchip, vm->vpes[i]);
>  		set_bit(i, bitmap);
> +		irqd_set_resend_when_in_progress(irq_get_irq_data(virq + i));
>  	}
>  
>  	if (err) {
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index b1b28affb32a..b76cc90faebd 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -223,6 +223,8 @@ struct irq_data {
>   *				  irq_chip::irq_set_affinity() when deactivated.
>   * IRQD_IRQ_ENABLED_ON_SUSPEND	- Interrupt is enabled on suspend by irq pm if
>   *				  irqchip have flag IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND set.
> + * IRQD_RESEND_WHEN_IN_PROGRESS	- Interrupt may fire when already in progress in which
> + *				  case it must be resent at the next available opportunity.
>   */
>  enum {
>  	IRQD_TRIGGER_MASK		= 0xf,
> @@ -249,6 +251,7 @@ enum {
>  	IRQD_HANDLE_ENFORCE_IRQCTX	= (1 << 28),
>  	IRQD_AFFINITY_ON_ACTIVATE	= (1 << 29),
>  	IRQD_IRQ_ENABLED_ON_SUSPEND	= (1 << 30),
> +	IRQD_RESEND_WHEN_IN_PROGRESS	= (1 << 31),

Make this unsigned, as it otherwise has the potential to sign-extend
and lead to "fun to debug" issues.

>  };
>  
>  #define __irqd_to_state(d) ACCESS_PRIVATE((d)->common, state_use_accessors)
> @@ -448,6 +451,16 @@ static inline bool irqd_affinity_on_activate(struct irq_data *d)
>  	return __irqd_to_state(d) & IRQD_AFFINITY_ON_ACTIVATE;
>  }
>  
> +static inline void irqd_set_resend_when_in_progress(struct irq_data *d)
> +{
> +	__irqd_to_state(d) |= IRQD_RESEND_WHEN_IN_PROGRESS;
> +}
> +
> +static inline bool irqd_needs_resend_when_in_progress(struct irq_data *d)
> +{
> +	return __irqd_to_state(d) & IRQD_RESEND_WHEN_IN_PROGRESS;
> +}
> +
>  #undef __irqd_to_state
>  
>  static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index 49e7bc871fec..a531692ca62f 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -692,8 +692,16 @@ void handle_fasteoi_irq(struct irq_desc *desc)
>  
>  	raw_spin_lock(&desc->lock);
>  
> -	if (!irq_may_run(desc))
> +	/*
> +	 * When an affinity change races with IRQ delivery, the next interrupt

The race is with the IRQ *handling*.

> +	 * can arrive on the new CPU before the original CPU has completed
> +	 * handling the previous one - it may need to be resent.
> +	 */
> +	if (!irq_may_run(desc)) {
> +		if (irqd_needs_resend_when_in_progress(&desc->irq_data))
> +			desc->istate |= IRQS_PENDING;
>  		goto out;
> +	}
>  
>  	desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING);
>  
> @@ -715,6 +723,12 @@ void handle_fasteoi_irq(struct irq_desc *desc)
>  
>  	cond_unmask_eoi_irq(desc, chip);
>  
> +	/*
> +	 * When the race described above happens this will resend the interrupt.
> +	 */
> +	if (unlikely(desc->istate & IRQS_PENDING))
> +		check_irq_resend(desc, false);
> +
>  	raw_spin_unlock(&desc->lock);
>  	return;
>  out:
> diff --git a/kernel/irq/debugfs.c b/kernel/irq/debugfs.c
> index bbcaac64038e..5971a66be034 100644
> --- a/kernel/irq/debugfs.c
> +++ b/kernel/irq/debugfs.c
> @@ -133,6 +133,8 @@ static const struct irq_bit_descr irqdata_states[] = {
>  	BIT_MASK_DESCR(IRQD_HANDLE_ENFORCE_IRQCTX),
>  
>  	BIT_MASK_DESCR(IRQD_IRQ_ENABLED_ON_SUSPEND),
> +
> +	BIT_MASK_DESCR(IRQD_RESEND_WHEN_IN_PROGRESS),
>  };
>  
>  static const struct irq_bit_descr irqdesc_states[] = {

If you respin this series shortly (*with* a cover letter, please),
I'll queue it so that it can simmer in -next for a bit.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.