Date: Fri, 12 May 2023 09:02:35 +0100
Message-ID: <86v8gym0ys.wl-maz@kernel.org>
From: Marc Zyngier
To: Douglas Anderson
Cc: Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Matthias Brugger,
    devicetree@vger.kernel.org, linux-mediatek@lists.infradead.org,
    wenst@chromium.org, Eddie Huang, Allen-KH Cheng, Ben Ho, Weiyi Lu,
    AngeloGioacchino Del Regno, linux-arm-kernel@lists.infradead.org,
    Tinghan Shen, jwerner@chromium.org, Hsin-Hsiung Wang,
    yidilin@chromium.org, Seiya Wang, Conor Dooley,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/6] dt-bindings: interrupt-controller: arm,gic-v3: Add quirk for Mediatek SoCs w/ broken FW
In-Reply-To: <20230511150539.1.Iabe67a827e206496efec6beb5616d5a3b99c1e65@changeid>
References: <20230511150539.6.Ia0b6ebbaa351e3cd67e201355b9ae67783c7d718@changeid>
    <20230511150539.1.Iabe67a827e206496efec6beb5616d5a3b99c1e65@changeid>
X-Mailing-List: devicetree@vger.kernel.org

On Thu, 11 May 2023 23:05:35 +0100,
Douglas Anderson wrote:
>
> When trying to turn on the "pseudo NMI" kernel feature in Linux, it
> was discovered that all Mediatek-based Chromebooks that ever shipped
> (at least ones with GICv3) had a firmware bug where they wouldn't save
> certain GIC "GICR" registers properly. If a processor ever entered a
> suspend/idle mode where the GICR registers lost state then they'd be
> reset to their default state.
>
> As a result of the bug, if you try to enable "pseudo NMIs" on the
> affected devices then certain interrupts will unexpectedly get
> promoted to be "pseudo NMIs" and cause crashes / freezes / general
> mayhem.
>
> ChromeOS is looking to start turning on "pseudo NMIs" in production to
> make crash reports more actionable. To do so, we will release firmware
> updates for at least some of the affected Mediatek Chromebooks.
> However, even when we update the firmware of a Chromebook it's always
> possible that a user will end up booting with old firmware. We need to
> be able to detect when we're running with firmware that will crash and
> burn if pseudo NMIs are enabled.
>
> The current plan is:
> * Update the device trees of all affected Chromebooks to include the
>   'mediatek,gicr-save-quirk' property. The kernel can use this to know
>   not to enable certain features like "pseudo NMI". NOTE: device trees
>   for Chromebooks are never baked into the firmware but are bundled
>   with the kernel. A kernel will never be configured to use "pseudo
>   NMIs" and be bundled with an old device tree.
> * When we get a fixed firmware for one of these Chromebooks, it will
>   patch the device tree to remove this property.

Since you're in control of distributing the FW together with the
kernel, I assume you're also in control of the command line. Why can't
that firmware pass the option enabling the pseudo-NMI support,
dispensing ourselves from all of this?
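
A sketch of that alternative, assuming the fixed firmware is able to append to the kernel command line: on arm64, pseudo-NMI support is enabled with irqchip.gicv3_pseudo_nmi=1, and the kernel must be built with CONFIG_ARM64_PSEUDO_NMI. Every other argument below is a placeholder and is not taken from any real Chromebook.

```
# Hypothetical command line as passed by a fixed firmware; only the last
# option matters here, the console and root arguments are placeholders.
console=ttyS0,115200n8 root=/dev/mmcblk0p3 irqchip.gicv3_pseudo_nmi=1
```

With this approach an old firmware would simply not pass the option, so pseudo-NMIs would stay disabled without any device tree property.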

>
> For some details, you can also see the public bug
>
>
> Signed-off-by: Douglas Anderson
> ---
>
>  .../bindings/interrupt-controller/arm,gic-v3.yaml | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml b/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml
> index 92117261e1e1..8c251caae537 100644
> --- a/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml
> +++ b/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml
> @@ -166,6 +166,12 @@ properties:
>    resets:
>      maxItems: 1
>
> +  mediatek,gicr-save-quirk:

I think this deserves something *much* stronger that outlines what is
wrong, because this is not just a quirk. This is a failure to even
remotely grasp the requirements of the architecture (and to use
standard, public code that would have done it correctly).

Something like "mediatek,broken-save-restore-fw" would be more
adequate.

> +    type: boolean
> +    description:
> +      Asserts that the firmware on this device has issues saving and restoring
> +      GICR registers when CPUs are powered off.

Nit: not the CPUs, but the GIC redistributors.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.