* [RFC PATCH 00/14] KVM: ITS hardening for pKVM
@ 2026-03-10 12:49 Sebastian Ene
From: Sebastian Ene @ 2026-03-10 12:49 UTC
  To: alexandru.elisei, kvmarm, linux-arm-kernel, linux-kernel,
	android-kvm
  Cc: catalin.marinas, dbrazdil, joey.gouly, kees, mark.rutland, maz,
	oupton, perlarsen, qperret, rananta, sebastianene, smostafa,
	suzuki.poulose, tabba, tglx, vdonnefort, bgrzesik, will,
	yuzenghui

This series introduces the machinery needed to trap and emulate device
accesses in pKVM. It also hardens the GIC/ITS controller to prevent an
attacker from tampering with hypervisor-protected memory through this
device.

In pKVM, the host kernel is initially trusted to manage the boot process but
its permissions are revoked once KVM initializes. The GIC/ITS device is
configured before the kernel deprivileges itself. Once the hypervisor
becomes available, the series sanitizes accesses to the ITS controller by
trapping and emulating certain registers and by shadowing some of the
memory structures used by the ITS.

This is required because the ITS can issue transactions on the memory
bus *directly*, without having an SMMU in front of it, which makes it
an interesting target for crossing the hypervisor-established privilege
boundary.


Patch overview
==============

The first patch is re-used from Mostafa's series[1] which brings SMMU-v3
support to pKVM.

[1] https://lore.kernel.org/linux-iommu/20251117184815.1027271-1-smostafa@google.com/#r

Some of the infrastructure built in that series might intersect and we
agreed to converge on some changes. Patches 1-3 allow unmapping
devices from the host address space and installing a handler to trap
accesses from the host. While executing in the handler, the mem-abort
path has to provide enough context to emulate the device: the offset,
the access size, the direction of the access (read or write) and
device-specific private data.
The unmapping of the device from the host address space is performed
after the host deprivilege (during the _kvm_host_prot_finalize call).
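
To make the handler context concrete, here is a minimal sketch of how a
trapped access could be matched against an array of unmapped regions and
handed to a per-device handler. All names (struct mmio_trap_ctx, struct
unmapped_mmio_region, dispatch_mmio_trap, example_scratch_handler) are
hypothetical and are not taken from the patches; only the fields (offset,
size, direction, private data) follow the description above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context handed to a device emulation handler. */
struct mmio_trap_ctx {
	uint64_t offset;	/* offset of the access inside the region */
	unsigned int size;	/* access size in bytes */
	bool is_write;		/* direction of the access */
	uint64_t *val;		/* value written, or slot for the value read */
	void *priv;		/* device-specific private data */
};

typedef bool (*mmio_handler_t)(struct mmio_trap_ctx *ctx);

/* Hypothetical descriptor for one device unmapped from the host. */
struct unmapped_mmio_region {
	uint64_t base;
	uint64_t len;
	mmio_handler_t handler;
	void *priv;
};

/* Toy handler: priv points to a single 64-bit scratch register. */
static bool example_scratch_handler(struct mmio_trap_ctx *ctx)
{
	uint64_t *reg = ctx->priv;

	if (ctx->is_write)
		*reg = *ctx->val;
	else
		*ctx->val = *reg;
	return true;
}

/*
 * Find the region containing the faulting address and call its handler.
 * Returns false if no region claims the address, if the access runs past
 * the end of the region, or if the handler rejects the access.
 */
static bool dispatch_mmio_trap(const struct unmapped_mmio_region *regs,
			       size_t nr, uint64_t fault_addr,
			       unsigned int size, bool is_write, uint64_t *val)
{
	for (size_t i = 0; i < nr; i++) {
		const struct unmapped_mmio_region *r = &regs[i];

		if (fault_addr < r->base || fault_addr - r->base >= r->len)
			continue;
		if (size > r->len - (fault_addr - r->base))
			return false;

		struct mmio_trap_ctx ctx = {
			.offset = fault_addr - r->base,
			.size = size,
			.is_write = is_write,
			.val = val,
			.priv = r->priv,
		};
		return r->handler(&ctx);
	}
	return false;
}
```

A linear scan is used here only because the number of unmapped devices is
expected to be small; that is an assumption of the sketch.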

The 4th patch looks up the ITS node from the device tree and adds it to
an array of unmapped devices. It installs a handler that forwards all
MMIO requests to the emulation layer, which mediates host accesses
without breaking ITS functionality.

The 5th patch changes the GIC/ITS driver to expose two new methods
which will be called from the KVM layer to set up the shadow state and
to take the appropriate locks. This one is the most intrusive as it
changes the current GIC/ITS driver. I tried to avoid creating a
dependency with KVM to keep the GIC driver agnostic of the virtualization
layer but I am happy to explore other options as well. 
To avoid re-programming the ITS device with new shadow structures after
pKVM is ready, I exposed two functions to change the
pointers inside the driver for the following structures:
- the command queue points to a newly allocated queue
- the GITS_BASER<n> tables configured with an indirect layout have the
  first layer shadowed and they point to a new memory region

Patch 6 adds the entry point into the emulation setup and sets up the
shadow command queue. It adds some helper macros to define the register
offset and the associated action that we want to execute in the
emulation. It also unmaps the state passed from the host kernel
to prevent it from playing nasty games later on. The patch
traps accesses to the GITS_CWRITER register and copies the commands from
the host command queue to the shadow command queue.
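
As an illustration of pairing a register offset with an action, the sketch
below uses a small descriptor table. The register offsets are the
architectural ITS ones; the macro REG_DESC, the enum, the function name and
the chosen policy per register are assumptions made for this example and
do not reflect the actual macros in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Architectural ITS register offsets. */
#define GITS_CTLR	0x0000
#define GITS_CBASER	0x0080
#define GITS_CWRITER	0x0088
#define GITS_BASER	0x0100	/* GITS_BASER0..7, 8 bytes each */

/* Hypothetical actions the emulation can take on a host write. */
enum reg_action {
	REG_PASSTHROUGH,	/* forward the write to the hardware */
	REG_EMULATE,		/* sanitize, then let the hypervisor write */
	REG_DENY,		/* drop the write */
};

struct reg_desc {
	uint32_t offset;
	uint32_t len;
	enum reg_action on_write;
};

#define REG_DESC(off, sz, act) \
	{ .offset = (off), .len = (sz), .on_write = (act) }

/* Illustrative policy only. */
static const struct reg_desc its_regs[] = {
	REG_DESC(GITS_CTLR, 4, REG_EMULATE),
	REG_DESC(GITS_CBASER, 8, REG_DENY),
	REG_DESC(GITS_CWRITER, 8, REG_EMULATE),
	REG_DESC(GITS_BASER, 8 * 8, REG_DENY),
};

/* Look up the action for a write; unlisted registers are forwarded. */
static enum reg_action its_write_action(uint32_t offset)
{
	for (size_t i = 0; i < sizeof(its_regs) / sizeof(its_regs[0]); i++) {
		const struct reg_desc *d = &its_regs[i];

		if (offset >= d->offset && offset - d->offset < d->len)
			return d->on_write;
	}
	return REG_PASSTHROUGH;
}
```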

Patch 7 prevents the host from directly accessing the first layer of the
indirect tables held in GITS_BASER<n>. It also prevents the host from
directly accessing the last layer of the Device Table (since the entries
in this table hold the address of the ITT table) and of the vPE Table
(since the vPE table entries hold the address of the virtual LPI pending
table).

Patches [8-10] sanitize the commands sent to the ITS and their
arguments.

Patches [11-13] restrict the access of the host to certain registers
and prevent undefined behaviour. They also prevent the host from
re-programming the tables held in the GITS_BASER<n> registers.

The last patch introduces an hvc to set up the ITS emulation and calls
into the ITS driver to set up the shadow state.


Design
======


1. Command queue shadowing

The ITS hardware supports a command queue which is programmed by the driver
in the GITS_CBASER register. To inform the hardware that a new command
has been added, the driver writes the updated index to the GITS_CWRITER
register. The driver then reads the GITS_CREADR register to see if the
command was processed or if the queue is stalled.
 
To create a new command, the emulation layer mirrors this behavior
as follows:
 (i) The host ITS driver creates a command in the shadow queue:
	its_allocate_entry() -> builder()
 (ii) Notifies the hardware that a new command is available:
	its_post_commands()
 (iii) Hypervisor traps the write to GITS_CWRITER:
	handle_host_mem_abort() -> handle_host_mmio_trap() ->
            pkvm_handle_gic_emulation()
 (iv) Hypervisor copies the command from the host command queue
      to the original queue which is not accessible to the host.
      It parses the command and updates the hardware write pointer
      (GITS_CWRITER).

The driver allocates space for the original command queue and programs
its address into the hardware (GITS_CBASER). When pKVM becomes available,
the driver
allocates a new (shadow) queue and replaces its original pointer to
the queue with this new one. This is to prevent a malicious host from
tampering with the commands sent to the ITS hardware.

The entry point of our emulation shares the memory of the newly
allocated queue with the hypervisor and donates the memory of the
original queue to make it inaccessible to the host.
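
Steps (iii) and (iv) above can be sketched as a ring copy between the
host-visible queue and the hypervisor-owned queue. This is a simplified
model under stated assumptions: both queues have the same size, write
offsets are byte offsets aligned to the 32-byte ITS command size, and the
names (struct its_cmdq, cmd_filter_t, shadow_cmdq_sync) are hypothetical.
The real code also has to parse and sanitize each command, which is
represented here by an optional filter callback.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ITS_CMD_SIZE	32u	/* each ITS command is 32 bytes */

struct its_cmdq {
	uint8_t *base;		/* start of the queue memory */
	uint32_t size;		/* bytes, a multiple of ITS_CMD_SIZE */
};

/* Hypothetical per-command validation hook; returns false to reject. */
typedef bool (*cmd_filter_t)(const uint64_t cmd[4]);

/*
 * Copy the commands in [*hw_writer, new_writer) from the host queue to
 * the hypervisor queue, wrapping at the end of the ring. On success,
 * *hw_writer is advanced to new_writer, which is the value the
 * hypervisor would then write to GITS_CWRITER. On failure, *hw_writer
 * is left untouched so the hardware never sees a rejected command.
 */
static bool shadow_cmdq_sync(const struct its_cmdq *host_q,
			     const struct its_cmdq *hyp_q,
			     uint32_t *hw_writer, uint32_t new_writer,
			     cmd_filter_t filter)
{
	uint32_t pos = *hw_writer;

	if (host_q->size != hyp_q->size || hyp_q->size % ITS_CMD_SIZE)
		return false;
	if (new_writer >= hyp_q->size || new_writer % ITS_CMD_SIZE)
		return false;
	if (pos >= hyp_q->size || pos % ITS_CMD_SIZE)
		return false;

	while (pos != new_writer) {
		uint64_t cmd[4];

		/* Snapshot first so the host cannot change it after the check. */
		memcpy(cmd, host_q->base + pos, ITS_CMD_SIZE);
		if (filter && !filter(cmd))
			return false;
		memcpy(hyp_q->base + pos, cmd, ITS_CMD_SIZE);
		pos = (pos + ITS_CMD_SIZE) % hyp_q->size;
	}

	*hw_writer = new_writer;
	return true;
}
```

Copying each command into a local buffer before validating it is a
deliberate choice in this sketch: the host can still write to its own
queue while the hypervisor is parsing it.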


2. Indirect tables first level shadowing

The ITS hardware supports indirection to minimize the space required to
accommodate large tables (e.g. the DeviceID space used to index the Device
Table is quite sparse). This is a 2-level indirection, with entries from
the first table pointing to a second table.

An attacker in control of the host can insert an address that points to
hypervisor-protected memory in the first level table and then use
subsequent ITS commands to write to this memory (MAPD).

To shadow these tables, we rely on the driver to allocate space for the
copies and we copy the original content of each table into its copy. When
pKVM becomes available we switch the pointers that hold the original
tables to point to the copies.
To keep the hypervisor's tables in sync with what the host
has, we update the tables when commands are sent to the ITS.
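
The kind of check this shadowing enables can be sketched as follows: before
a first-level entry from the host copy is propagated to the table the
hardware uses, the address it carries is compared against memory the
hypervisor protects. The layout assumed here (Valid in bit 63, physical
address in bits [51:12], i.e. a 4KB ITS page size) and the list of
protected ranges are assumptions of the example; a real implementation
would query the hypervisor's page ownership state instead, and all names
are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define L1_ENTRY_VALID		(1ULL << 63)
#define L1_ENTRY_ADDR_MASK	0x000ffffffffff000ULL	/* bits [51:12] */

/* Hypothetical description of a protected region, end exclusive. */
struct phys_range {
	uint64_t start;
	uint64_t end;
};

static bool range_overlaps(uint64_t start, uint64_t len,
			   const struct phys_range *r)
{
	uint64_t end = start + len;

	if (end < start)	/* wrapped around: be conservative */
		return true;
	return start < r->end && r->start < end;
}

/*
 * Return true if the first-level entry is invalid, or if the second-level
 * page it points to does not overlap any protected range.
 */
static bool l1_entry_is_safe(uint64_t entry, uint64_t l2_page_size,
			     const struct phys_range *prot, size_t nr)
{
	uint64_t pa;

	if (!(entry & L1_ENTRY_VALID))
		return true;

	pa = entry & L1_ENTRY_ADDR_MASK;
	for (size_t i = 0; i < nr; i++) {
		if (range_overlaps(pa, l2_page_size, &prot[i]))
			return false;
	}
	return true;
}
```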


3. Hiding the last layer of the Device Table and vPE Table from the host

An attacker in control of the host kernel can alter the content of these
tables directly (the Arm IHI 0069H.b spec says that it is undefined behavior
if entries are created by software). Normally these entries are created in
response to commands sent to the ITS.

A Device Table entry has the following structure:

type DeviceTableEntry is (
	boolean Valid,
	Address ITT_base,
	bits(5) ITT_size
) 

An attacker can craft such an entry and point ITT_base at
hypervisor-protected memory. The MAPTI command can then be used to make
the ITS write an ITE into the memory that ITT_base points to.
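
Since Device Table entries are normally created by MAPD, the ITT_base that
ends up in the entry comes from that command's arguments. The sketch below
decodes those arguments and checks the resulting ITT span against a
protected range. The field positions follow the MAPD encoding in the GICv3
architecture (as used by the Linux ITS driver); the helper names, the
struct and the ite_size parameter (the ITT entry size the hardware
reports) are hypothetical, and the command words are assumed to have been
converted from little-endian already.

```c
#include <stdbool.h>
#include <stdint.h>

#define GITS_CMD_MAPD	0x08

/* Hypothetical decoded view of a MAPD command. */
struct mapd_args {
	uint32_t device_id;
	bool valid;
	uint64_t itt_addr;	/* physical address of the ITT */
	uint64_t itt_bytes;	/* size of the ITT in bytes */
};

/* Extract bits [hi:lo] of v. */
static inline uint64_t bits(uint64_t v, unsigned int hi, unsigned int lo)
{
	return (v >> lo) & (~0ULL >> (63 - (hi - lo)));
}

/* Decode a MAPD command; returns false if cmd is not a MAPD. */
static bool decode_mapd(const uint64_t cmd[4], unsigned int ite_size,
			struct mapd_args *out)
{
	if (bits(cmd[0], 7, 0) != GITS_CMD_MAPD)
		return false;

	out->device_id = (uint32_t)bits(cmd[0], 63, 32);
	out->valid = bits(cmd[2], 63, 63) != 0;
	out->itt_addr = bits(cmd[2], 51, 8) << 8;
	/* The Size field holds the number of EventID bits minus one. */
	out->itt_bytes = (uint64_t)ite_size << (bits(cmd[1], 4, 0) + 1);
	return true;
}

/* True if a valid mapping's ITT overlaps [start, end). */
static bool itt_overlaps(const struct mapd_args *a, uint64_t start,
			 uint64_t end)
{
	return a->valid && a->itt_addr < end &&
	       start < a->itt_addr + a->itt_bytes;
}
```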

Similarly a vCPU Table entry has the following structure:

type VCPUTableEntry is (
	boolean Valid,
	bits(32) RDbase,
	Address VPT_base,
	bits(5) VPT_size
)

VPT_base can be pointed to hypervisor-protected memory and then a
command can be used to raise an interrupt and set the corresponding
pending bit. This gives a 1-bit write primitive, so it is not "as
generous" as the others.
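
To show why this is a 1-bit primitive, the following sketch computes which
bit the hardware would set for a given interrupt ID, assuming the usual
LPI pending table layout of one bit per INTID (byte INTID / 8, bit
INTID % 8, relative to the table base). The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Locate the pending bit for an LPI, assuming one bit per INTID starting
 * at the table base.
 */
static void lpi_pending_bit(uint64_t vpt_base, uint32_t intid,
			    uint64_t *byte_addr, unsigned int *bit)
{
	*byte_addr = vpt_base + intid / 8;
	*bit = intid % 8;
}
```

An attacker who controls VPT_base and can pick the INTID can therefore
choose which single bit gets set, but can only set it, not clear it or
write arbitrary data.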


Notes
=====


Performance impact is expected with this as the emulation dance is not
cost free.
I haven't implemented any ITS quirks in the emulation and I don't know
whether we will need them (some hardware needs explicit dcache flushing,
see ITS_FLAGS_CMDQ_NEEDS_FLUSHING).

Please note that trapping of the Redistributors hasn't been addressed at
all in this series, so the solution is not sufficient on its own, but it
can be extended afterwards.
The current series has been tested with Qemu (-machine
virt,virtualization=true,gic-version=4) and with Pixel 10.


Thanks,
Sebastian E.

Mostafa Saleh (1):
  KVM: arm64: Donate MMIO to the hypervisor

Sebastian Ene (13):
  KVM: arm64: Track host-unmapped MMIO regions in a static array
  KVM: arm64: Support host MMIO trap handlers for unmapped devices
  KVM: arm64: Mediate host access to GIC/ITS MMIO via unmapping
  irqchip/gic-v3-its: Prepare shadow structures for KVM host deprivilege
  KVM: arm64: Add infrastructure for ITS emulation setup
  KVM: arm64: Restrict host access to the ITS tables
  KVM: arm64: Trap & emulate the ITS MAPD command
  KVM: arm64: Trap & emulate the ITS VMAPP command
  KVM: arm64: Trap & emulate the ITS MAPC command
  KVM: arm64: Restrict host updates to GITS_CTLR
  KVM: arm64: Restrict host updates to GITS_CBASER
  KVM: arm64: Restrict host updates to GITS_BASER
  KVM: arm64: Implement HVC interface for ITS emulation setup

 arch/arm64/include/asm/kvm_arm.h              |   3 +
 arch/arm64/include/asm/kvm_asm.h              |   1 +
 arch/arm64/include/asm/kvm_pkvm.h             |  20 +
 arch/arm64/kvm/hyp/include/nvhe/its_emulate.h |  17 +
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |   2 +
 arch/arm64/kvm/hyp/nvhe/Makefile              |   3 +-
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  14 +
 arch/arm64/kvm/hyp/nvhe/its_emulate.c         | 653 ++++++++++++++++++
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 134 ++++
 arch/arm64/kvm/hyp/nvhe/setup.c               |  28 +
 arch/arm64/kvm/hyp/pgtable.c                  |   9 +-
 arch/arm64/kvm/pkvm.c                         |  60 ++
 drivers/irqchip/irq-gic-v3-its.c              | 177 ++++-
 include/linux/irqchip/arm-gic-v3.h            |  36 +
 14 files changed, 1126 insertions(+), 31 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/its_emulate.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/its_emulate.c

-- 
2.53.0.473.g4a7958ca14-goog


Thread overview: 36+ messages
2026-03-10 12:49 [RFC PATCH 00/14] KVM: ITS hardening for pKVM Sebastian Ene
2026-03-10 12:49 ` [PATCH 01/14] KVM: arm64: Donate MMIO to the hypervisor Sebastian Ene
2026-03-12 17:57   ` Fuad Tabba
2026-03-13 10:40   ` Suzuki K Poulose
2026-03-24 10:39   ` Vincent Donnefort
2026-03-10 12:49 ` [PATCH 02/14] KVM: arm64: Track host-unmapped MMIO regions in a static array Sebastian Ene
2026-03-12 19:05   ` Fuad Tabba
2026-03-24 10:46   ` Vincent Donnefort
2026-03-10 12:49 ` [PATCH 03/14] KVM: arm64: Support host MMIO trap handlers for unmapped devices Sebastian Ene
2026-03-13  9:31   ` Fuad Tabba
2026-03-24 10:59   ` Vincent Donnefort
2026-03-10 12:49 ` [PATCH 04/14] KVM: arm64: Mediate host access to GIC/ITS MMIO via unmapping Sebastian Ene
2026-03-13  9:58   ` Fuad Tabba
2026-03-10 12:49 ` [PATCH 05/14] irqchip/gic-v3-its: Prepare shadow structures for KVM host deprivilege Sebastian Ene
2026-03-13 11:26   ` Fuad Tabba
2026-03-13 13:10     ` Fuad Tabba
2026-03-20 15:11     ` Sebastian Ene
2026-03-24 14:36       ` Fuad Tabba
2026-03-10 12:49 ` [PATCH 06/14] KVM: arm64: Add infrastructure for ITS emulation setup Sebastian Ene
2026-03-16 10:46   ` Fuad Tabba
2026-03-17  9:40     ` Fuad Tabba
2026-03-10 12:49 ` [PATCH 07/14] KVM: arm64: Restrict host access to the ITS tables Sebastian Ene
2026-03-16 16:13   ` Fuad Tabba
2026-03-10 12:49 ` [PATCH 08/14] KVM: arm64: Trap & emulate the ITS MAPD command Sebastian Ene
2026-03-17 10:20   ` Fuad Tabba
2026-03-10 12:49 ` [PATCH 09/14] KVM: arm64: Trap & emulate the ITS VMAPP command Sebastian Ene
2026-03-10 12:49 ` [PATCH 10/14] KVM: arm64: Trap & emulate the ITS MAPC command Sebastian Ene
2026-03-10 12:49 ` [PATCH 11/14] KVM: arm64: Restrict host updates to GITS_CTLR Sebastian Ene
2026-03-10 12:49 ` [PATCH 12/14] KVM: arm64: Restrict host updates to GITS_CBASER Sebastian Ene
2026-03-10 12:49 ` [PATCH 13/14] KVM: arm64: Restrict host updates to GITS_BASER Sebastian Ene
2026-03-10 12:49 ` [PATCH 14/14] KVM: arm64: Implement HVC interface for ITS emulation setup Sebastian Ene
2026-03-12 17:56 ` [RFC PATCH 00/14] KVM: ITS hardening for pKVM Fuad Tabba
2026-03-20 14:42   ` Sebastian Ene
2026-03-13 15:18 ` Mostafa Saleh
2026-03-15 13:24   ` Fuad Tabba
2026-03-25 16:26   ` Sebastian Ene
