Date: Fri, 23 Jun 2023 17:37:07 +0100
Message-ID: <86legab0ek.wl-maz@kernel.org>
From: Marc Zyngier
To: Ard Biesheuvel
Cc: "Russell King (Oracle)", Quentin Perret, Mark Rutland,
    Catalin Marinas, Jonathan Corbet, Will Deacon,
    linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org
Subject: Re: [PATCH RFC 00/17] arm64 kernel text replication

On Fri, 23 Jun 2023 16:24:20 +0100,
Ard Biesheuvel wrote:
>
> (cc Marc and Quentin)
>
> On Mon, 5 Jun 2023 at 11:05, Russell King (Oracle)
> wrote:
> >
> > Hi,
> >
> > Are there any comments on this?
> >
>
> Hi Russell,
>
> I think the proposed approach is sound, but it is rather intrusive, as
> you've pointed out already (wrt KASLR and KASAN etc). And once my LPA2
> work gets merged (which uses root level -1 when booted on LPA2 capable
> hardware, and level 0 otherwise), we'll have yet another combination
> that is either fully incompatible, or cumbersome to support at the
> very least.
>
> I wonder if it would be worthwhile to explore an alternative approach,
> using pKVM and the host stage2:
>
> - all stage1 kernel mappings remain as they are, and the kernel code
>   running at EL1 has no awareness of the replication beyond being
>   involved in allocating the memory;
> - host is booted in protected KVM mode, which means that the host
>   kernel executes under a stage 2 mapping;
> - each NUMA node has its own set of stage 2 page tables, and maps the
>   kernel's code/rodata IPA range to a NUMA local PA range
> - the kernel's code and rodata are mapped read-only in the primary
>   stage-2 mapping so updates trap to EL2, permitting the hypervisor to
>   replicate those update to all clones.
>
> Note that pKVM retains the capabilities of ordinary KVM, so as long as
> you boot at EL2, the only downside compared to your approach would be
> the increased TLB footprint due to the stage 2 mappings for the host
> kernel.
>
> Marc, Quentin, Will: any thoughts?

I like the idea, though there are a couple of 'interesting' corner
cases:

- you have to give up VHE, which means that if your workload is to
  mainly run VMs, you pay an extra cost on each guest entry/exit

- the EL2 code doesn't have the luxury of a stage-2, meaning that
  either you accept the fact that this code is going to suffer from
  uneven performance, or you keep the complexity of the kernel-visible
  replication for the EL2 code only

- memory allocation for the stage-2 is tricky (Quentin can talk about
  that), and relies on being able to steal enough memory to cover the
  whole of the host's memory-map, including I/O. Having a set of S2
  PTs per node is going to increase that pressure/complexity

- I'm not too worried about the TLB aspect. Cores tend to cache VA/PA,
  not VA/IPA+IPA/PA. What is going to cost is the walk itself. This
  could be mitigated if S2 uses large mappings (possibly using 64k
  pages).

The last point makes me think that this approach may not be pKVM
itself, but something that builds on top of what pKVM has (host S2)
and the nVHE/hVHE behaviour.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
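
To put a rough number on the walk cost raised in the last bullet of the
reply, here is a minimal sketch of the textbook worst-case count for a
two-stage translation. It assumes no TLB or walk-cache hits at all, which
real cores will usually beat, and the function name
nested_walk_accesses() is made up for this illustration; it is not a
kernel or KVM helper.

```c
/*
 * Worst-case number of memory accesses for one two-stage translation,
 * assuming nothing is cached (hypothetical helper, illustration only).
 *
 * Each of the s1_levels stage-1 descriptor fetches uses an IPA, so it
 * first needs a full stage-2 walk (s2_levels accesses) and then the
 * descriptor read itself. The final output IPA needs one more stage-2
 * walk. That gives:
 *
 *   s1_levels * (s2_levels + 1) + s2_levels
 *     == (s1_levels + 1) * (s2_levels + 1) - 1
 */
static unsigned int nested_walk_accesses(unsigned int s1_levels,
					 unsigned int s2_levels)
{
	return (s1_levels + 1) * (s2_levels + 1) - 1;
}
```

Under these assumptions, a 4-level stage-1 walk on its own costs 4
accesses (s2_levels = 0), 4 levels at both stages cost 24, and ending the
stage-2 walk one level earlier, as a large S2 mapping would, brings that
down to 19.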