From: D Scott Phillips
To: James Morse, linux-arm-kernel@lists.infradead.org
Cc: Catalin Marinas, Will Deacon
Subject: Re: [PATCH] arm64: abort SDEI handlers during crash
In-Reply-To: <1c855f3d-980b-c867-ea54-37d53fbc46ab@arm.com>
References: <20230204000851.3871-1-scott@os.amperecomputing.com> <1c855f3d-980b-c867-ea54-37d53fbc46ab@arm.com>
Date: Fri, 10 Feb 2023 14:16:07 -0800
Message-ID: <86sffdnp3s.fsf@scott-ph-mail.amperecomputing.com>

James Morse
writes:

> Hi Scott,
>
> On 04/02/2023 00:08, D Scott Phillips wrote:
>> Interrupts are blocked in SDEI context, per the SDEI spec: "The client
>> interrupts cannot preempt the event handler."
>
> Firmware is supposed to do this by ensuring PSTATE.I is set when the
> handler runs. See "4.3.1 Arm PE Architecture".
>
> Unfortunately trusted-firmware is setting PMR, (which is specific to
> GIC, not the 'PE architecture') to a value in the secure range,
> meaning no normal world interrupt can fire.

Hi James, in the case I see on my machine, the cause isn't an elevated
PMR, but rather an elevated RPR. IRQs are blocked because the interrupt
that triggered the SDEI event is still active and hasn't dropped
priority on the core that's running the SDEI event handler.
Specifically, I see ICC_PMR_EL1 == 0xf8 (all priority bits set), but
ICC_RPR_EL1 == 0x70.

The spec does say that PSTATE.I will be set in the handler context, and
also that you "shouldn't" clear PSTATE.I while in the handler, but I
don't read that section or others in the spec as requiring firmware to
maintain the running priority, priority mask, etc. such that just
clearing PSTATE.I gets you back to a working system. Perhaps I'm reading
more than I should into the "5.2 Event context" bit that says "The
client interrupts cannot preempt the event handler", but I'm taking that
to mean "don't count on working interrupts until you end the handler".

In my AGDI case, the irq related to the event is entirely owned by
firmware, but in the case of irqs bound to SDEI events, the spec says
that the firmware will activate the interrupt before calling the event
handler and end the interrupt after. It's not explicit about dropping
priority, but it is explicit that the interrupt is meant to be active
during the sdei handler. So I'm not sure that what the firmware is doing
with its AGDI interrupt is totally wrong here.
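A rough model of the masking described in the reply above may help. It assumes the simplified GICv3 rule that a pending interrupt is only signaled to the core when its priority is numerically lower (that is, higher priority) than both the priority mask (PMR) and the running priority (RPR). The function name and the 0xa0 interrupt priority are illustrative assumptions, not taken from the kernel or the patch, and binary-point grouping and security-state views of the priority value are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model (assumption): lower numeric value == higher priority.
 * An interrupt reaches the core only if it beats both the priority mask
 * and the priority of whatever interrupt is currently active.
 */
bool irq_can_be_signaled(uint8_t irq_prio, uint8_t pmr, uint8_t rpr)
{
	return irq_prio < pmr && irq_prio < rpr;
}

/* Hypothetical walk-through using the register values quoted in the mail. */
void demo_rpr_blocks_irq(void)
{
	const uint8_t pmr = 0xf8;      /* ICC_PMR_EL1: nothing masked by PMR */
	const uint8_t rpr = 0x70;      /* ICC_RPR_EL1: event's irq still active */
	const uint8_t irq_prio = 0xa0; /* assumed priority of an ordinary irq */

	/* Blocked by the running priority even though PMR would allow it. */
	assert(!irq_can_be_signaled(irq_prio, pmr, rpr));

	/* Once priority is dropped (RPR idle, 0xff), the same irq gets through. */
	assert(irq_can_be_signaled(irq_prio, pmr, 0xff));
}
```

Under this model, clearing PSTATE.I alone would not restore interrupts, since the running priority stays at 0x70 until the active interrupt drops priority.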
> This is a bug in trusted firmware, but it looks like its firmly
> ingrained in their 'exception handling framework', and there is no
> appetite for fixing it.
>
> So we'll have to hack round it in the kernel.
>
>
>> If we crashed in the SDEI
>> handler-running context (as with ACPI's AGDI) then we need to clean up the
>> SDEI state before proceeding to the crash kernel so that the crash kernel
>> can have working interrupts. Try two COMPLETE_AND_RESUMEs in case both a
>> normal and critical event were being handled.
>
> There are multiple reasons the kernel might panic(), doing this in crash_smp_send_stop()
> is a good call.
>
>
>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>> index 11cb99c4d298..03dc233bdaa1 100644
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -909,13 +909,18 @@ NOKPROBE(call_on_irq_stack)
>>  #include
>>  #include
>>
>> -.macro sdei_handler_exit exit_mode
>> -        /* On success, this call never returns... */
>> +.macro sdei_handler_exit_fallthrough exit_mode
>>          cmp     \exit_mode, #SDEI_EXIT_SMC
>>          b.ne    99f
>>          smc     #0
>> -        b       .
>> +        b       100f
>>  99:     hvc     #0
>> +100:
>> +.endm
>
> This was a tangled mess before, but now that it can fallthrough we should try harder.
> Could you look at smccc_patch_fw_mitigation_conduit for an example of how this can be done
> cleanly. That would allow sdei_exit_mode to be removed.
> (if that makes sense to do, please do it as a preparatory patch)

OK, will do.

>
>> @@ -1077,4 +1082,17 @@ alternative_else_nop_endif
>
>> +SYM_CODE_START(sdei_handler_abort)
>> +        mov_q   x0, SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME
>> +        adr     x1, 1f
>> +        ldr_l   x2, sdei_exit_mode
>> +        sdei_handler_exit_fallthrough exit_mode=x2
>> +        // either fallthrough if not in handler context, or exit the handler
>> +        // and jump to the next instruction. Exit will stomp x0-x17, PSTATE,
>> +        // ELR_ELx, and SPSR_ELx.
>> +1:      ret
>> +SYM_CODE_END(sdei_handler_abort)
>> +NOKPROBE(sdei_handler_abort)
>> +
>>  #endif /* CONFIG_ARM_SDE_INTERFACE */
>
>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>> index ffc5d76cf695..bc1b3000197e 100644
>> --- a/arch/arm64/kernel/smp.c
>> +++ b/arch/arm64/kernel/smp.c
>> @@ -1047,10 +1047,8 @@ void crash_smp_send_stop(void)
>>           * If this cpu is the only one alive at this point in time, online or
>>           * not, there are no stop messages to be sent around, so just back out.
>>           */
>> -        if (num_other_online_cpus() == 0) {
>> -                sdei_mask_local_cpu();
>> -                return;
>> -        }
>> +        if (num_other_online_cpus() == 0)
>> +                goto skip_ipi;
>>
>>          cpumask_copy(&mask, cpu_online_mask);
>>          cpumask_clear_cpu(smp_processor_id(), &mask);
>> @@ -1069,7 +1067,15 @@ void crash_smp_send_stop(void)
>>                  pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
>>                          cpumask_pr_args(&mask));
>>
>> +skip_ipi:
>>          sdei_mask_local_cpu();
>> +        /*
>> +         * The crash may have happened in a critical event handler which
>> +         * preempted a normal handler. So at most we might have two
>> +         * levels of SDEI context to exit.
>> +         */
>> +        sdei_handler_abort();
>> +        sdei_handler_abort();
>
> And if SDEI wasn't supported? Before SMC-CC you couldn't probe for firmware calls, you had
> to know they were implemented. Its entirely possible there are platforms out there that
> corrupt more than x0-x17 when you do this.
> (also, what happens on machines where the kernel runs at EL2, and there is no EL3?)
>
> You can tell if the kernel is in the 'middle' of an SDEI event based on the stack.

Ah, all good points. The case I was worried about was an SDEI normal
event that itself got preempted by a critical event before it switched
sp to the sdei stack, or after it had switched back but hadn't ended,
but you're right, "just try and see" is really a non-solution.
How about:

        if (sp within sdei_shadow_call_stack_normal ||
            sp within sdei_shadow_call_stack_critical)
                sdei_handler_abort()

        if (sp within sdei_shadow_call_stack_critical &&
            (interrupted_regs.sp within sdei_shadow_call_stack_normal ||
             interrupted_regs.pc within __sdei_asm_handler))
                sdei_handler_abort()

> As this is a firmware bug, I'd like to print a warning. Something has
> gone wrong already, otherwise we wouldn't be panic()ing, I think its
> important to identify this extra code is running - in case it leads to
> more problems. (and not run it on platforms that aren't affected).

Yep, good idea, will do.

> Could we check PMR_EL1 to see if its got a value outside the kernel's
> PMR range before triggering any of this. This would also let you know
> if you need to do this twice.

PMR and RPR? What if the firmware fiddled with interrupts some other
way? Group enables? Maybe that's too paranoid and it's better to just
deal with the firmware we've got.

>
> Thanks,
>
> James
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
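The stack-based check proposed in the "How about:" pseudocode can be sketched as a small, self-contained decision function. This is only an illustration under stated assumptions: the struct and function names (`addr_range`, `crash_ctx`, `sdei_layout`, `sdei_aborts_needed`) are hypothetical, the address ranges are passed in rather than looked up from real kernel symbols, and the function only counts how many SDEI contexts would need to be exited instead of issuing any firmware call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a memory region: [start, start + size). */
struct addr_range {
	uintptr_t start;
	uintptr_t size;
};

/* Hypothetical snapshot of the crashing CPU's state. */
struct crash_ctx {
	uintptr_t sp;             /* current stack pointer */
	uintptr_t interrupted_sp; /* sp of the context the event interrupted */
	uintptr_t interrupted_pc; /* pc of the context the event interrupted */
};

/* Hypothetical layout: the two SDEI stacks and the asm entry handler. */
struct sdei_layout {
	struct addr_range normal_stack;
	struct addr_range critical_stack;
	struct addr_range asm_handler;
};

bool addr_in_range(uintptr_t addr, const struct addr_range *r)
{
	return addr >= r->start && addr - r->start < r->size;
}

/* Returns 0, 1 or 2: how many nested SDEI event contexts appear active. */
unsigned int sdei_aborts_needed(const struct crash_ctx *c,
				const struct sdei_layout *l)
{
	bool on_normal = addr_in_range(c->sp, &l->normal_stack);
	bool on_critical = addr_in_range(c->sp, &l->critical_stack);
	unsigned int n = 0;

	/* Running on either SDEI stack: at least one event is in progress. */
	if (on_normal || on_critical)
		n++;

	/*
	 * A critical event that preempted a normal one, either on the normal
	 * stack or while the normal event was still in the asm entry/exit path.
	 */
	if (on_critical &&
	    (addr_in_range(c->interrupted_sp, &l->normal_stack) ||
	     addr_in_range(c->interrupted_pc, &l->asm_handler)))
		n++;

	return n;
}
```

A caller would invoke the abort path that many times. The second condition is what covers the window raised earlier in the thread, where a normal event is preempted before it has switched stacks or after it has switched back.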