From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 26 Aug 2023 17:16:19 +0100
Message-ID: <867cphg4oc.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: Johan Hovold <johan+linaro@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>, linux-kernel@vger.kernel.org,
	Shanker Donthineni <sdonthineni@nvidia.com>
Subject: Re: [PATCH] genirq: Fix software resend lockup and nested resend
In-Reply-To: <20230826154004.1417-1-johan+linaro@kernel.org>
References: <20230826154004.1417-1-johan+linaro@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
List-ID: <linux-kernel.vger.kernel.org>

Hi Johan,

On Sat, 26 Aug 2023 16:40:04 +0100, Johan Hovold wrote:
>
> The switch to using hlist for managing software resend of interrupts
> broke resend in at least two ways:
>
> First, unconditionally adding interrupt descriptors to the resend list
> can corrupt the list when the descriptor in question has already been
> added. This causes the resend tasklet to loop indefinitely with
> interrupts disabled as was recently reported with the Lenovo ThinkPad
> X13s after threaded NAPI was disabled in the ath11k WiFi driver. [1]

Gah, of course. Making the descriptor pending again isn't an idempotent
operation anymore (while setting an already set bit in the bitmap was).

You'll need the right timing, but it looks like you've managed to
achieve that reliably.
>
> This bug is easily fixed by restoring the old semantics of
> irq_sw_resend() so that it can be called also for descriptors that have
> already been marked for resend.
>
> Second, the offending commit also broke software resend of nested
> interrupts by simply discarding the code that made sure that such
> interrupts are retriggered using the parent interrupt.
>
> Add back the corresponding code that adds the parent descriptor to the
> resend list. Note that this bit is untested, but I decided to include it
> to avoid having to revert the offending commit and the maple tree
> conversion that depends on it.

Indeed, and that one is not timing related, but 100% broken.

>
> [1] https://lore.kernel.org/lkml/20230809073432.4193-1-johan+linaro@kernel.org/
>
> Fixes: bc06a9e08742 ("genirq: Use hlist for managing resend handlers")
> Cc: Shanker Donthineni <sdonthineni@nvidia.com>
> Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
> ---
>
> Hi Thomas and Marc,
>
> This patch fixes a severe regression in the resend code in 6.5-rc1 that
> breaks machines like the Lenovo X13s and which ideally should be
> addressed before 6.5 is released tomorrow.

This relies on Thomas seeing this email before tomorrow. Unless you
directly reach out to Linus to try and intercept the release.

>
> I hesitated about including the fix for nested interrupts as I've not
> had time to test this bit, but I ultimately decided to include it to
> avoid having to suggest a revert of the maple tree conversion. Let me
> know if you prefer to go this route and I'll post a (prepared) revert
> series instead.

I think it is too late for that revert to happen, and we might as well
plough along and take the fix.
>
> Johan
>
>
>  kernel/irq/resend.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/irq/resend.c b/kernel/irq/resend.c
> index edec335c0a7a..5f2c66860ac6 100644
> --- a/kernel/irq/resend.c
> +++ b/kernel/irq/resend.c
> @@ -68,11 +68,16 @@ static int irq_sw_resend(struct irq_desc *desc)
>  		 */
>  		if (!desc->parent_irq)
>  			return -EINVAL;
> +
> +		desc = irq_to_desc(desc->parent_irq);
> +		if (!desc)
> +			return -EINVAL;
>  	}
>  
>  	/* Add to resend_list and activate the softirq: */
>  	raw_spin_lock(&irq_resend_lock);
> -	hlist_add_head(&desc->resend_node, &irq_resend_list);
> +	if (hlist_unhashed(&desc->resend_node))
> +		hlist_add_head(&desc->resend_node, &irq_resend_list);
>  	raw_spin_unlock(&irq_resend_lock);
>  	tasklet_schedule(&resend_tasklet);
>  	return 0;

FWIW:

Reviewed-by: Marc Zyngier <maz@kernel.org>

	M.

-- 
Without deviation from the norm, progress is not possible.