From: Sebastian Ene <sebastianene@google.com>
To: Fuad Tabba <tabba@google.com>
Cc: alexandru.elisei@arm.com, kvmarm@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, android-kvm@google.com,
catalin.marinas@arm.com, joey.gouly@arm.com, kees@kernel.org,
mark.rutland@arm.com, maz@kernel.org, oupton@kernel.org,
perlarsen@google.com, qperret@google.com, rananta@google.com,
smostafa@google.com, suzuki.poulose@arm.com, tglx@kernel.org,
vdonnefort@google.com, bgrzesik@google.com, will@kernel.org,
yuzenghui@huawei.com
Subject: Re: [PATCH 07/14] KVM: arm64: Restrict host access to the ITS tables
Date: Fri, 10 Apr 2026 13:52:54 +0000
Message-ID: <adkAtvBiYDH3_xSA@google.com>
In-Reply-To: <CA+EHjTxUbLBnyBGg58wsJQvTXPN0FTbj53H5r3sC9_TizpjvdQ@mail.gmail.com>
Hello Fuad,

On Mon, Mar 16, 2026 at 04:13:59PM +0000, Fuad Tabba wrote:
> Hi Sebastian,
>
> On Tue, 10 Mar 2026 at 12:49, Sebastian Ene <sebastianene@google.com> wrote:
> >
> > Setup shadow structures for ITS indirect tables held in
> > the GITS_BASER<n> registers.
> > Make the last level of the Device Table and vPE Table
> > inacessible to the host.
>
> inacessible -> inaccessible
Applied fix, thanks.
> > In a direct layout configuration, donate the table to
> > the hypervisor since the software is not expected to
> > program them directly.
>
> This commit message is too brief and doesn't fully explain the
> problem, the impact, and the mechanism of the solution. It also
> appears to contradict the actual code changes.
>
> For example, could you elaborate why must the last level of indirect
> tables be inaccessible?
For the Device Table, a malicious host can write an ITT address that
points to hyp memory and then use MAPTI to write over that memory.
>
> Can you also please explain the mechanism? You are parsing
> GITS_BASER_INDIRECT to determine if a shadow Level 1 table must be
> shared with the host, while unconditionally donating the original
> physical tables. You also explicitly exclude Collection tables. The
> msg should briefly justify why Collection tables are safe to leave
> accessible to the host.
>
> There is also a contradiction in the message. You state "In a direct
> layout configuration, donate the table...". However, your code donates
> the original hardware table unconditionally on every iteration of the
> loop, regardless of whether GITS_BASER_INDIRECT is set. Please ensure
> the commit log accurately reflects the code implementation.
>
I see no contradiction; I only need to shadow the first level of the
indirect tables. Shadowing implies both donation and sharing: we donate
the original tables from host -> hyp, and we share the host's view of
the tables (which is a copy) with the hypervisor.
> Maybe you could say that the problem is Host DMA attacks via ITS table
> manipulation. Whereas the mechanism is to unconditionally donate
This has nothing to do with host DMA attacks; it is just the AP that
can write to memory.
> hardware tables to EL2. For indirect Device/vPE tables, share a L1
> shadow table with the host and strictly donate the L2 pages to prevent
> the host from writing malicious L2 pointers.
>
> >
> > Signed-off-by: Sebastian Ene <sebastianene@google.com>
> > ---
> > arch/arm64/kvm/hyp/nvhe/its_emulate.c | 143 ++++++++++++++++++++++++++
> > 1 file changed, 143 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/hyp/nvhe/its_emulate.c b/arch/arm64/kvm/hyp/nvhe/its_emulate.c
> > index 4a3ccc90a1a9..865a5d6353ed 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/its_emulate.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/its_emulate.c
> > @@ -141,6 +141,145 @@ static struct pkvm_protected_reg *get_region(phys_addr_t dev_addr)
> > return NULL;
> > }
> >
> > +static int pkvm_host_unmap_last_level(void *shadow, size_t num_pages, u32 psz)
> > +{
> > + u64 *table = shadow;
> > + int ret, i, end = (num_pages << PAGE_SHIFT) / sizeof(table);
> > + phys_addr_t table_addr;
>
> RCT, mixing initialized variables and uninitialized variables, plus
> variables of conceptually different "types" in the same declaration.
>
> Please use sizeof(*table): sizeof(table) evaluates to the size of the
> pointer (8 bytes), NOT the size of the array element. In this case,
> this happens to be the same, but it's still wrong.
>
> Maybe the following is clearer:
> + int end = num_pages * (PAGE_SIZE / sizeof(*table));
>
>
I will use the suggestion and do the same for pkvm_host_map_last_level.
> > +
> > + for (i = 0; i < end; i++) {
> > + if (!(table[i] & GITS_BASER_VALID))
> > + continue;
> > +
> > + table_addr = table[i] & PHYS_MASK;
> > + ret = __pkvm_host_donate_hyp(hyp_phys_to_pfn(table_addr), psz >> PAGE_SHIFT);
>
> The ITS-configured page size and the host page size could be
> different, but the number of pages to donate for Level 2 tables is
> calculated based on psz (the ITS).
>
> If the ITS hardware is configured for 4KB pages, but the host kernel
> is using (e.g.,) 64KB pages, psz >> PAGE_SHIFT evaluates to 0.
I need to revisit this, thanks for pointing out.
>
> You need to account for mismatched page sizes, perhaps by using
> DIV_ROUND_UP(psz, PAGE_SIZE) (or something similar) to ensure the
> containing host page is donated.
>
> > + if (ret)
> > + goto err_donate;
> > + }
> > +
> > + return 0;
> > +err_donate:
> > + for (i = i - 1; i >= 0; i--) {
>
> Please use the while (i--) idiom for rollback loops.
>
>
> > + if (!(table[i] & GITS_BASER_VALID))
> > + continue;
> > +
> > + table_addr = table[i] & PHYS_MASK;
> > + __pkvm_hyp_donate_host(hyp_phys_to_pfn(table_addr), psz >> PAGE_SHIFT);
>
> Please wrap this in WARN_ON(...). If donating back to the host fails
> during a rollback, we have a fatal page leak that needs to be loudly
> flagged, similar to how you handle it in pkvm_unshare_shadow_table.
>
>
> > + }
> > + return ret;
> > +}
> > +
> > +static int pkvm_share_shadow_table(void *shadow, u64 nr_pages)
> > +{
> > + u64 i, ret, start_pfn = hyp_virt_to_pfn(shadow);
>
> Same comment as before with RCT and the mixing of declarations.
>
>
> > +
> > + for (i = 0; i < nr_pages; i++) {
> > + ret = __pkvm_host_share_hyp(start_pfn + i);
> > + if (ret)
> > + goto unshare;
> > + }
> > +
> > + ret = hyp_pin_shared_mem(shadow, shadow + (nr_pages << PAGE_SHIFT));
> > + if (ret)
> > + goto unshare;
> > +
> > + return ret;
> > +unshare:
>
> Please use the while (i--) idiom for rollback loops.
>
> Also, please use consistent naming conventions for the labels. Here
> you call it unshare, and earlier it was err_donate.
>
>
> > + for (i = i - 1; i >= 0; i--)
> > + __pkvm_host_unshare_hyp(start_pfn + i);
> > + return ret;
> > +}
> > +
> > +static void pkvm_unshare_shadow_table(void *shadow, u64 nr_pages)
> > +{
> > + u64 i, start_pfn = hyp_virt_to_pfn(shadow);
> > +
> > + hyp_unpin_shared_mem(shadow, shadow + (nr_pages << PAGE_SHIFT));
> > +
> > + for (i = 0; i < nr_pages; i++)
> > + WARN_ON(__pkvm_host_unshare_hyp(start_pfn + i));
> > +}
> > +
> > +static void pkvm_host_map_last_level(void *shadow, size_t num_pages, u32 psz)
> > +{
> > + u64 *table;
>
> RCT, and you forgot to initialize table:
> + u64 *table = shadow;
Fixed this, thanks. I never hit this code path during testing; maybe I
should write a test to trigger it.
>
> > + int i, end = (num_pages << PAGE_SHIFT) / sizeof(table);
>
> Same sizeof(table) pointer-size bug as above.
>
>
> > + phys_addr_t table_addr;
> > +
> > + for (i = 0; i < end; i++) {
> > + if (!(table[i] & GITS_BASER_VALID))
> > + continue;
> > +
> > + table_addr = table[i] & ~GITS_BASER_VALID;
>
> Inconsistent masking logic, since in pkvm_host_unmap_last_level you
> correctly used PHYS_MASK to extract the address, but here in the
> rollback path you use ~GITS_BASER_VALID.
>
> While both currently work because the upper bits and lower bits (below
> the page size) are defined as RES0 in the GIC spec, ~GITS_BASER_VALID
> is architecturally fragile. If a future hardware revision repurposes
> the upper RES0 bits [62:52] for new attributes (e.g., memory
> encryption flags), ~GITS_BASER_VALID will leak those attribute bits
> into the physical address calculation.
>
> Since PHYS_MASK correctly handles the address extraction across all
> page sizes (relying on the lower bits being RES0) and safely masks off
> future upper attribute bits, please standardize on using table_addr =
> table[i] & PHYS_MASK; for both functions.
>
>
Fixed the inconsistency and used PHYS_MASK everywhere.
> > + WARN_ON(__pkvm_hyp_donate_host(hyp_phys_to_pfn(table_addr), psz >> PAGE_SHIFT));
> > + }
> > +}
> > +
> > +static int pkvm_setup_its_shadow_baser(struct its_shadow_tables *shadow)
> > +{
> > + int i, ret;
> > + u64 baser_val, num_pages, type;
> > + void *base, *host_base;
> > +
> > + for (i = 0; i < GITS_BASER_NR_REGS; i++) {
> > + baser_val = shadow->tables[i].val;
> > + if (!(baser_val & GITS_BASER_VALID))
> > + continue;
> > +
> > + base = kern_hyp_va(shadow->tables[i].base);
> > + num_pages = (1 << shadow->tables[i].order);
> > +
> > + ret = __pkvm_host_donate_hyp(hyp_virt_to_pfn(base), num_pages);
> > + if (ret)
> > + goto err_donate;
> > +
> > + if (baser_val & GITS_BASER_INDIRECT) {
> > + host_base = kern_hyp_va(shadow->tables[i].shadow);
> > + ret = pkvm_share_shadow_table(host_base, num_pages);
> > + if (ret)
> > + goto err_with_donation;
> > +
> > + type = GITS_BASER_TYPE(baser_val);
> > + if (type == GITS_BASER_TYPE_COLLECTION)
> > + continue;
> > +
> > + ret = pkvm_host_unmap_last_level(base, num_pages,
> > + shadow->tables[i].psz);
> > + if (ret)
> > + goto err_with_share;
> > + }
> > + }
> > +
> > + return 0;
> > +err_with_share:
> > + pkvm_unshare_shadow_table(host_base, num_pages);
> > +err_with_donation:
> > + __pkvm_hyp_donate_host(hyp_virt_to_pfn(base), num_pages);
> > +err_donate:
> > + for (i = i - 1; i >= 0; i--) {
>
> Please use the while (i--) idiom for rollback loops.
>
>
> > + baser_val = shadow->tables[i].val;
> > + if (!(baser_val & GITS_BASER_VALID))
> > + continue;
> > +
> > + base = kern_hyp_va(shadow->tables[i].base);
> > + num_pages = (1 << shadow->tables[i].order);
> > +
> > + WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(base), num_pages));
>
> The sequence of rollback operations here creates a TOCTOU vulnerability.
>
There is a different problem here, related to functionality rather than
this: donating the base back to the host first and then iterating over
it will make the hypervisor explode. I fixed this.
> - First, you donate base (the Level 1 indirect table) back to the host.
> - Then, you pass base into pkvm_host_map_last_level().
> - Finally, pkvm_host_map_last_level() reads table[i] out of base to
> determine which Level 2 pages to donate back to the host.
>
> Because the host regains ownership of base _first_, it can be running
> concurrently on another CPU. A malicious host can overwrite the Level
> 1 table with pointers to arbitrary hypervisor-owned memory. The
> hypervisor will then read those malicious pointers and dutifully grant
> the host access to its own secure memory.
>
> The order of operations needs to be reversed: you must read base to
> roll back the L2 pages, unshare the shadow table, and *only then*
> donate base back to the host.
>
> Also, num_pages = (1 << shadow->tables[i].order); calculates a 32-bit
> signed integer because the literal 1 is a signed 32-bit int. If order
> is 31, this evaluates to a negative number. If order is 32 or higher,
> this is undefined behavior. Because num_pages is declared as a u64,
> you should use the standard kernel macro BIT_ULL().
>
> Here's my suggested fix (not tested). Reorder the operations to safely
> rollback L2 before donating L1, use the standard `while (i--)` loop,
> and fix the page calculation:
>
> + while (i--) {
> + baser_val = shadow->tables[i].val;
> + if (!(baser_val & GITS_BASER_VALID))
> + continue;
> +
> + base = kern_hyp_va(shadow->tables[i].base);
> + num_pages = BIT_ULL(shadow->tables[i].order);
> +
> + if (baser_val & GITS_BASER_INDIRECT) {
> + host_base = kern_hyp_va(shadow->tables[i].shadow);
> +
> + type = GITS_BASER_TYPE(baser_val);
> + if (type != GITS_BASER_TYPE_COLLECTION)
> + pkvm_host_map_last_level(base, num_pages,
> + shadow->tables[i].psz);
> +
> + pkvm_unshare_shadow_table(host_base, num_pages);
> + }
> +
> + WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(base), num_pages));
> + }
>
>
>
> > + if (baser_val & GITS_BASER_INDIRECT) {
> > + host_base = kern_hyp_va(shadow->tables[i].shadow);
> > + pkvm_unshare_shadow_table(host_base, num_pages);
> > +
> > + type = GITS_BASER_TYPE(baser_val);
> > + if (type == GITS_BASER_TYPE_COLLECTION)
> > + continue;
> > +
> > + pkvm_host_map_last_level(base, num_pages, shadow->tables[i].psz);
> > + }
> > + }
>
> You have duplicated the entire table decoding logic (calculating base,
> num_pages, checking INDIRECT...) down here in the rollback path.
> Consider abstracting "setup one table" and "teardown one table" into
> helper functions to make pkvm_setup_its_shadow_baser more readable and
> less prone to copy-pasta errors.
>
> Cheers,
> /fuad
>
Thanks,
Sebastian
>
> > +
> > + return ret;
> > +}
> > +
> > static int pkvm_setup_its_shadow_cmdq(struct its_shadow_tables *shadow)
> > {
> > int ret, i, num_pages;
> > @@ -205,6 +344,10 @@ int pkvm_init_gic_its_emulation(phys_addr_t dev_addr, void *host_priv_state,
> > if (ret)
> > goto err_with_shadow;
> >
> > + ret = pkvm_setup_its_shadow_baser(shadow);
> > + if (ret)
> > + goto err_with_shadow;
> > +
> > its_reg->priv = priv_state;
> >
> > hyp_spin_lock_init(&priv_state->its_lock);
> > --
> > 2.53.0.473.g4a7958ca14-goog
> >