Date: Fri, 22 Mar 2024 16:06:40 +0000
From: Will Deacon
To: Tyler Hicks
Cc: Robin Murphy, Jason Gunthorpe, Jerry Snitselaar,
	linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, Dexuan Cui, Easwar Hariharan
Subject: Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec?
Message-ID: <20240322160640.GF5634@willie-the-truck>
References: <120d0dec-450f-41f8-9e05-fd763e84f6dd@arm.com>
	<20240319154756.GB2901@willie-the-truck>

On Tue, Mar 19, 2024 at 02:14:26PM -0500, Tyler Hicks wrote:
> On 2024-03-19 15:47:56, Will Deacon wrote:
> > On Tue, Mar 19, 2024 at 12:57:52PM +0000, Robin Murphy wrote:
> > > Beyond properly quiescing and resetting the system back to a boot-time
> > > state, the outgoing kernel in a kexec can only really do things which affect
> > > itself.
> > > Sure, we *could* configure the SMMU to block all traffic and disable
> > > the interrupt to avoid getting stuck in a storm of faults on the way out,
> > > but what does that mean for the incoming kexec payload? That it can have the
> > > pleasure of discovering the SMMU, innocently enabling the interrupt and
> > > getting stuck in an unexpected storm of faults. Or perhaps just resetting
> > > the SMMU into a disabled state and thus still unwittingly allowing its
> > > memory to be corrupted by the previous kernel not supporting kexec properly.
> >
> > Right, it's hard to win if DMA-active devices weren't quiesced properly
> > by the outgoing kernel. Either the SMMU was left in abort (leading to the
> > problems you list above) or the SMMU is left in bypass (leading to possible
> > data corruption). Which is better?
>
> My thoughts are that a loud and obvious failure (via unidentified stream
> fault messages and/or a possible interrupt storm preventing the new
> kernel from booting) is favorable to silent and subtle data corruption
> of the target kernel.

Looking at the SMMUv3 spec, the architecture does actually allow hardware
to reset into an aborting state:

  [GBPA.ABORT]
  | Note: An implementation can reset this field to 1, in order to
  | implement a default deny policy on reset.

so perhaps it's not that unreasonable. I just dread the flood of emails
I'll get because the SMMU driver is noisy due to missing ->shutdown()
callbacks elsewhere :/

> > The best solution is obviously to implement those missing ->shutdown()
> > callbacks.
>
> Completely agree here but it can be difficult to even identify that a
> missing ->shutdown hook is the root cause without code changes to put
> the SMMU into abort mode and sleep for a bit in the SMMU's ->shutdown
> hook.

Perhaps that's the thing to tackle first, then?
If we make it easier for folks to diagnose and fix the missing
->shutdown() callbacks, then going into abort is much more reasonable.

Will
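
As a rough illustration of the diagnostic idea discussed in this thread
(abort mode plus a short observation window in the SMMU's ->shutdown
hook), here is a toy user-space C model. It is only a sketch under
stated assumptions: the enum, the struct and both functions are
hypothetical names made up for this example, and nothing here is kernel
code or an SMMU driver API. It just shows why abort makes a missing
->shutdown() callback visible while bypass hides it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy model only: every name below is hypothetical, not a kernel API. */
enum smmu_mode { SMMU_BYPASS, SMMU_ABORT };

struct model_device {
    const char *name;
    unsigned int stream_id;
    bool has_shutdown; /* driver provides a ->shutdown() callback */
    bool dma_active;   /* device is still issuing DMA */
};

/* Device shutdown phase: only drivers with a callback quiesce their DMA. */
static void shutdown_devices(struct model_device *devs, size_t n)
{
    assert(devs != NULL);
    for (size_t i = 0; i < n; i++)
        if (devs[i].has_shutdown)
            devs[i].dma_active = false;
}

/*
 * Observation window in the SMMU's own shutdown path. In abort mode each
 * still-active device shows up as a reported fault; in bypass the same
 * traffic passes through unreported. Returns the number of faults seen.
 */
static size_t observe_window(enum smmu_mode mode,
                             const struct model_device *devs, size_t n)
{
    size_t faults = 0;

    assert(devs != NULL);
    if (mode != SMMU_ABORT)
        return 0;
    for (size_t i = 0; i < n; i++) {
        if (devs[i].dma_active) {
            printf("fault: stream 0x%x (%s) still doing DMA\n",
                   devs[i].stream_id, devs[i].name);
            faults++;
        }
    }
    return faults;
}
```

Calling shutdown_devices() and then observe_window(SMMU_ABORT, ...) on a
set of devices where one lacks a callback reports that device's stream.
With SMMU_BYPASS the same call reports nothing, which is the silent case
the thread is worried about.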