Date: Thu, 05 Jan 2023 10:37:50 +0000
Message-ID: <86lemh6zcx.wl-maz@kernel.org>
From: Marc Zyngier
To: Joakim Tjernlund
Cc: linux-arm-kernel@lists.infradead.org, mark.rutland@arm.com
Subject: Re: [PATCH] GICv3: Add restart handler to detach CPU from GICv3
References: <20221216162128.10808-1-joakim.tjernlund@infinera.com>
 <86pmbu6y7k.wl-maz@kernel.org>
 <86o7re6sq5.wl-maz@kernel.org>

On Wed, 04 Jan 2023 19:52:33 +0000,
Joakim Tjernlund wrote:
>
> On Wed, 2023-01-04 at 18:48 +0000, Marc Zyngier wrote:
> > On Wed, 04 Jan 2023 17:23:42 +0000,
> > Joakim Tjernlund wrote:
> > >
> > > On Wed, 2023-01-04 at 16:50 +0000, Marc Zyngier wrote:
> > > > On Wed, 04 Jan 2023 16:04:14 +0000,
> > > > Mark Rutland wrote:
> > > > >
> > > > > On Tue, Jan 03, 2023 at 04:27:07PM +0000, Joakim Tjernlund wrote:
> > > > > > On Fri, 2022-12-16 at 17:21 +0100, Joakim Tjernlund wrote:
> > > > > >
> > > > > > Ping?
> > > > >
> > > > > To whom?
> > > > >
> > > > > You don't appeared to have Cc'd any relevant maintainer, and people are still
> > > > > on holiday, so it's extremely likely this will be missed.
> > > >
> > > > That, plus nobody reads the list looking for this sort of things.
> > > >
> > > > >
> > > > > For the maintainer, please use scripts/get_maintainer.pl, e.g.
> > > > >
> > > > > > [mark@lakrids:~/src/linux]% ./scripts/get_maintainer.pl -f drivers/irqchip/irq-gic-v3.c
> > > > > > Thomas Gleixner (maintainer:IRQCHIP DRIVERS)
> > > > > > Marc Zyngier (maintainer:IRQCHIP DRIVERS)
> > > > > > linux-kernel@vger.kernel.org (open list:IRQCHIP DRIVERS)
> > > > >
> > > > > Note: I've Cc'd Marc, who wrote the GICv3 driver.
> > > >
> > > > Cheers Mark, much appreciated.
> > >
> > > Sorry for missing that extra maintainer CC:
> > >
> > > >
> > > > >
> > > > > > > Needed for reboot without resetting the whole GIC
> > > > >
> > > > > This doesn't really explain what you're trying to do nor why.
> > > > >
> > > > > Why do you need to "reboot without resetting the whole GIC" ?
> > > > >
> > > > > Do you encounter a problem if we try to reset the whole GIC?
> > > > >
> > > > > Is this for kexec?
> > > > >
> > > > > Is this for some use-case enabled by out-of-tree code?
> > > >
> > > > All valid questions. This smells of a terrible hack...
> > >
> > > Yes, all god Q's.
> >
> > And no answer?
>
> No kexec, out of tree kernel with minor mods
> >
> > >
> > > >
> > > > The interesting aspect is that this is only done when DS=1, probably
> > > > meaning that they are doing this in a VM. it also rely on some
> > >
> > > Nope, on our custom target.
> >
> > And you run with DS=1? On bare metal? Humpf...
>
> Yes, we control all SW running on this target.
>
> >
> > >
> > > > (unbounded) UNPRED behaviour as ProcessorSleep is entered without
> > > > any consideration for Group0... Good luck with that.
>
> No Group0 IRQ used(or so I think but I could be mistaken)
>
> > >
> > > hmm, I am doing the same as PM does which also needs DS=1 so I figured
> > > this was uncontroversial.
> >
> > I seriously doubt anyone is actually using that code in anger. I
> > always found it dodgy, and I'm pretty sure it is totally broken.
>
> I see
>
> >
> > >
> > > >
> > > > Anyway, I don't think we want any of this stuff. Certainly not without
> > > > a cast-iron justification.
> > >
> > > We use several cores but only one run Linux so a Linux reboot will
> > > only reset its own core.
> >
> > And what happens with the distributor? One of the key assumption of
> > the GIC architecture is that there is only one operating system in
> > control of it, the whole of it. The only way to share it is by
> > virtualising it. Shades of Jailhouse...
>
> Distributor is not touched until Linux is going up again
>
> >
> > >
> > > Without this patch I either need to reset the GIC as part of the
> > > reboot or I get RWP timeout when linux starts again. Resetting the
> > > GIC kills IRQ on the other cores for a long time which is unwanted.
> >
> > But that's what happens anyway when Linux boots (we reinitialise the
> > whole distributor). So what is this about?
>
> At that point IRQ will only be lost shortly until the other cores
> are reloaded by Linux user space

How different is that from resetting the GIC altogether?

>
> >
> > >
> > > The RWP timeout comes from lost HW handshake between core and GIC.
> > > Is there another way to regain the HW handshake ?
> >
> > That's the firmware's job. But overall, getting these timeouts means
> > your GIC has locked up. It would be more interesting to understand
>
> FW here is u-boot starting from EL3
>
> > *why* you get in that situation, which I presume is due to the way the
> > driver initialises itself.
> >
> > Assuming you use kexec to reboot your Linux instance, does the
> > following help (totally untested)?
>
> no kexec and no it did not help.
> I get the first RWP here:
> static void __init gic_dist_init(void)
> {
> 	unsigned int i;
> 	u64 affinity;
> 	void __iomem *base = gic_data.dist_base;
> 	u32 val;
>
> 	/* Disable the distributor */
> 	writel_relaxed(0, base + GICD_CTLR);
> 	gic_dist_wait_for_rwp(); <------ RWP timeout
>
> I guess it is impossible to recover GIC HW handshake once it is lost ?

This is trying to tear down the distributor while the downstream
redistributor (and CPU interface) are still up. The architecture
doesn't specify what can happen in this case, and I wouldn't be
surprised if you were triggering some really bad HW bugs.

Please provide the exact version of your GIC (at least the GICD_IIDR
value).

Since you're in control of your firmware, you should reset things
there, because even if we hack something here for the boot CPU, this
says nothing of the other CPUs.

	M.

-- 
Without deviation from the norm, progress is not possible.
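For anyone following the request for the GICD_IIDR value, here is a minimal
userspace sketch of how the raw 32-bit register could be split into the
fields that identify the GIC implementation. The field positions follow the
GICv3 architecture layout (Implementer [11:0], Revision [15:12],
Variant [19:16], ProductID [31:24]). The struct and function names are made
up for this illustration and are not kernel APIs, and the sample value in
gicd_iidr_example() is hypothetical, not taken from the hardware discussed
in this thread. In the driver the raw value would presumably be read from
the distributor base at the GICD_IIDR offset.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Decoded GICD_IIDR fields (hypothetical helper type, not a kernel struct). */
struct gicd_iidr {
	uint16_t implementer;	/* bits [11:0], JEP106 code of the implementer */
	uint8_t revision;	/* bits [15:12] */
	uint8_t variant;	/* bits [19:16] */
	uint8_t product_id;	/* bits [31:24] */
};

/* Split a raw GICD_IIDR value into its architected fields. */
static inline struct gicd_iidr gicd_iidr_decode(uint32_t iidr)
{
	struct gicd_iidr f = {
		.implementer = (uint16_t)(iidr & 0xfff),
		.revision = (uint8_t)((iidr >> 12) & 0xf),
		.variant = (uint8_t)((iidr >> 16) & 0xf),
		.product_id = (uint8_t)((iidr >> 24) & 0xff),
	};

	return f;
}

/* Print the raw value and decoded fields in a form suitable for a report. */
static inline void gicd_iidr_print(uint32_t iidr)
{
	struct gicd_iidr f = gicd_iidr_decode(iidr);

	printf("GICD_IIDR=0x%08x implementer=0x%03x product=0x%02x variant=%u revision=%u\n",
	       (unsigned int)iidr, (unsigned int)f.implementer,
	       (unsigned int)f.product_id, (unsigned int)f.variant,
	       (unsigned int)f.revision);
}

/* Usage sketch with a made-up register value (an assumption for the demo). */
static inline void gicd_iidr_example(void)
{
	struct gicd_iidr f = gicd_iidr_decode(0x0201743bu);

	assert(f.implementer == 0x43b);	/* 0x43b is Arm's JEP106 code */
	assert(f.product_id == 0x02);
	gicd_iidr_print(0x0201743bu);
}
```

Reporting the raw value alongside the decoded fields keeps the report
unambiguous even if a field is read differently by the recipient.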