From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 30 Jan 2025 09:58:58 +0000
Message-ID: <86a5b8vd0d.wl-maz@kernel.org>
From: Marc Zyngier
To: chf.fritz@googlemail.com
Cc: Mark Rutland, Chen-Yu Tsai, KeverYang, Heiko Stuebner,
    linux-rockchip@lists.infradead.org, stable, linux-arm-kernel
Subject: Re: rk3399 fails to boot since v6.12.7
X-Mailing-List: stable@vger.kernel.org

Hi Christoph,

Thanks for reporting this.

On Wed, 29 Jan 2025 21:31:53 +0000,
Christoph Fritz wrote:
>
> Hello Marc,
>
> since 773c05f417fa1 ("irqchip/gic-v3: Work around insecure GIC
> integrations") landed in stable v6.12.7 as 0bf32f482887, here the
> rk3399 fails to boot (~4 out of 10 times) because OP-TEE panics (gets a
> secure interrupt that it cannot handle).

I think it may actually get a *non-secure* interrupt.
>
> Setup is:
> - BL31 proprietary (since mainline TF-A has no DMA)
> - OP-TEE mainline version: 3.20
> - Kernel v6.12.7
>
>
> [    0.000000] GICv3: Broken GIC integration, security disabled
>
> E/TC:4 0 Panic 'Secure interrupt handler not defined' at core/kernel/interrupt.c:139
> E/TC:4 0 TEE load address @ 0x30000000
> E/TC:4 0 Call stack:
> E/TC:4 0 0x300091f8
> E/TC:4 0 0x30016664
> E/TC:4 0 0x30015710
> E/TC:4 0 0x30005714
> [   26.087363] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> [   26.087925] rcu: (detected by 2, t=21002 jiffies, g=2233, q=2416 ncpus=6)
> [   26.088530] rcu: All QSes seen, last rcu_preempt kthread activity 21002 (4294693363-4294672361), jiffies_till_next_fqs=3, root ->qsmask 0x0
> [   26.089623] rcu: rcu_preempt kthread timer wakeup didn't happen for 20999 jiffies! g2233 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200
> [   26.090617] rcu: Possible timer handling issue on cpu=4 timer-softirq=293
> [   26.091218] rcu: rcu_preempt kthread starved for 21002 jiffies! g2233 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200 ->cpu=4
> [   26.092131] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
> [   26.092926] rcu: RCU grace-period kthread stack dump:
> [   26.093369] task:rcu_preempt state:R stack:0 pid:16 tgid:16 ppid:2 flags:0x00000008
> [   26.094192] Call trace:
> [   26.094409] __switch_to+0xf0/0x14c
> [   26.094728] __schedule+0x264/0xa90
> [   26.095040] schedule+0x34/0x104
> [   26.095329] schedule_timeout+0x80/0xf4
> [   26.095672] rcu_gp_fqs_loop+0x14c/0x4a4
> [   26.096027] rcu_gp_kthread+0x138/0x164
> [   26.096369] kthread+0x114/0x118
> [   26.096661] ret_from_fork+0x10/0x20
> [   26.096982] rcu: Stack dump where RCU GP kthread last ran:
> [   26.097463] Sending NMI from CPU 2 to CPUs 4:
> [   46.276364] sched: DL replenish lagged too much
>
> Is it too late for the kernel to disable the "security" since OP-TEE
> assumes it is enabled?

Well, it was never effective security in the first place, but clearly
this effect is not expected (my own machine doesn't have anything
running on the secure side).

> Any ideas?

I think this calls for a revert of this patch, potentially at the
expense of NMI support on this machine.

Could you show how SCR_EL3.FIQ is configured on this machine? Mine
shows:

[    0.000000] GICv3: GICD_CTRL.DS=0, SCR_EL3.FIQ=0

and I suspect yours has FIQ=1. If that's the case, we could use that
as the discriminant.

However, this machine has a much bigger issue. For things to work as
expected, the GIC driver must preserve all the secure configuration,
and nothing does that today. So even before this patch, your secure
payload won't get any interrupt, as we blindly configure everything to
be Group-1NS.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.