Date: Tue, 15 Apr 2025 09:35:40 +0100
Message-ID: <867c3llt43.wl-maz@kernel.org>
From: Marc Zyngier
To: Ilias Stamatis
Subject: Re: [RFC] ARM vGIC-ITS tables serialization when running protected VMs
In-Reply-To: <20250414111244.153528-1-ilstam@amazon.com>
References: <20250414111244.153528-1-ilstam@amazon.com>
X-Mailing-List: kvmarm@lists.linux.dev

On Mon, 14 Apr 2025 12:12:43 +0100, Ilias Stamatis wrote:
>
> # The problem
>
> KVM's ARM Virtual Interrupt Translation Service (ITS) interface supports the
> KVM_DEV_ARM_ITS_SAVE_TABLES and KVM_DEV_ARM_ITS_RESTORE_TABLES operations.
> These operations save and restore a set of tables (Device Tables, Interrupt
> Translation Tables, Collection Table) to and from guest memory.
>
> This can be a problem when running a protected VM on top of pKVM or another
> lowvisor since the host kernel (running at EL1) cannot access guest memory.

pKVM doesn't allow a guest to be saved/restored, full stop.

> # Page declassification and why ITTs are special
>
> The Collection and Device tables are page aligned and their sizes must be a
> multiple of page size. If the lowvisor knows where these tables live, it is
> possible to "declassify" the corresponding pages and configure the MMU such as
> that the EL1 host can write to guest memory directly.
>
> The ITTs (Interrupt Translation Tables) are different. They are NOT page
> aligned, they are 256 byte aligned and their size is variable. That means that
> the lowvisor can't declassify pages containing ITTs and configure the MMU
> giving the host direct access as above since those pages may contain unrelated
> data.

And it is the responsibility of the guest to make these page aligned
if it intends to let the hypervisor use them.

To sum it up, the ITT isn't special at all.

>
> If the lowvisor knows where the ITTs live in guest memory it could instead
> perform the guest memory accesses on behalf of the host. I.e. the EL1 host
> would attempt to save the ITTs to guest memory like it does today, that would
> generate a data abort, and then the EL2 lowvisor could perform the copy after
> validating that the faulty address belongs to an ITT in guest memory.
>
> One issue with the above is that the ITS save/restore happens at hypervisor
> live update which is a time sensitive operation and the extra traps (one per
> interrupt mapping?) can introduce significant additional overhead
> there.

I don't believe this for a second.
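
To illustrate the page-alignment point above, here is a minimal C
sketch, not code from KVM or pKVM. It assumes 4 KiB pages (16 KiB or
64 KiB are equally possible), and every sketch_* name is hypothetical.
It shows how a guest could round an ITT allocation up to whole pages,
and how a region could be checked before its pages are shared.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; not taken from the thread. */
#define SKETCH_PAGE_SIZE 4096ULL
/* ITT base alignment as stated in the quoted text. */
#define SKETCH_ITT_ALIGN 256ULL

/* A page-aligned base always satisfies the 256-byte ITT alignment. */
_Static_assert(SKETCH_PAGE_SIZE % SKETCH_ITT_ALIGN == 0,
               "page size must be a multiple of the ITT alignment");

/* Round a byte count up to a whole number of pages. */
static inline uint64_t sketch_page_align_up(uint64_t bytes)
{
    return (bytes + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * True if [base, base + size) covers only whole pages, so that sharing
 * those pages cannot expose unrelated guest data.
 */
static inline bool sketch_region_is_shareable(uint64_t base, uint64_t size)
{
    return size != 0 &&
           (base % SKETCH_PAGE_SIZE) == 0 &&
           (size % SKETCH_PAGE_SIZE) == 0;
}
```

Under these assumptions, a guest that allocates each ITT with a
page-aligned base and a size of sketch_page_align_up(itt_bytes) ends up
with a region that passes sketch_region_is_shareable(), whereas a
256-byte-aligned ITT packed next to other data does not.
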
>
> Another issue is that it's actually hard for the lowvisor to know where these
> tables live without trusting the EL1 host which virtualizes the ITS. It is
> especially hard knowing the locations of the ITTs (compared to
> Collection/Device tables) because that probably means having to parse the ITS
> command queue from EL2 which is complex and undesirable.
>
> # An alternative: Serializing ITTs into a userspace buffer

NAK. Share the page-aligned memory with the rest of the hypervisor,
and use the existing API.

	M.

-- 
Without deviation from the norm, progress is not possible.