From: Sebastian Ene <sebastianene@google.com>
To: Fuad Tabba <tabba@google.com>
Cc: alexandru.elisei@arm.com, kvmarm@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, android-kvm@google.com,
catalin.marinas@arm.com, joey.gouly@arm.com, kees@kernel.org,
mark.rutland@arm.com, maz@kernel.org, oupton@kernel.org,
perlarsen@google.com, qperret@google.com, rananta@google.com,
smostafa@google.com, suzuki.poulose@arm.com, tglx@kernel.org,
vdonnefort@google.com, bgrzesik@google.com, will@kernel.org,
yuzenghui@huawei.com
Subject: Re: [PATCH 07/14] KVM: arm64: Restrict host access to the ITS tables
Date: Fri, 10 Apr 2026 13:52:54 +0000 [thread overview]
Message-ID: <adkAtvBiYDH3_xSA@google.com> (raw)
In-Reply-To: <CA+EHjTxUbLBnyBGg58wsJQvTXPN0FTbj53H5r3sC9_TizpjvdQ@mail.gmail.com>
On Mon, Mar 16, 2026 at 04:13:59PM +0000, Fuad Tabba wrote:
Hello Fuad,
> Hi Sebastian,
>
> On Tue, 10 Mar 2026 at 12:49, Sebastian Ene <sebastianene@google.com> wrote:
> >
> > Setup shadow structures for ITS indirect tables held in
> > the GITS_BASER<n> registers.
> > Make the last level of the Device Table and vPE Table
> > inacessible to the host.
>
> inacessible -> inaccessible
Applied fix, thanks.
> > In a direct layout configuration, donate the table to
> > the hypervisor since the software is not expected to
> > program them directly.
>
> This commit message is too brief and doesn't fully explain the
> problem, the impact, and the mechanism of the solution. It also
> appears to contradict the actual code changes.
>
> For example, could you elaborate why must the last level of indirect
> tables be inaccessible?
For device table, a malicious host can write the ITT address that points
to hyp memory and then use MAPTI to write over that memory.
>
> Can you also please explain the mechanism? You are parsing
> GITS_BASER_INDIRECT to determine if a shadow Level 1 table must be
> shared with the host, while unconditionally donating the original
> physical tables. You also explicitly exclude Collection tables. The
> msg should briefly justify why Collection tables are safe to leave
> accessible to the host.
>
> There is also a contradiction in the message. You state "In a direct
> layout configuration, donate the table...". However, your code donates
> the original hardware table unconditionally on every iteration of the
> loop, regardless of whether GITS_BASER_INDIRECT is set. Please ensure
> the commit log accurately reflects the code implementation.
>
I see no contradition, I only need to shadow the first layer of the
indirect tables. Shadowing implies donation and sharing:
because we are donating the original tables from host -> hyp and we sharing the
host view of the tables with the hypervisor (which is a copy).
> Maybe you could say that the problem is Host DMA attacks via ITS table
> manipulation. Whereas the mechanism is to unconditionally donate
This has nothing to do with Host DMA attacks, it is just the AP that can
write to memory.
> hardware tables to EL2. For indirect Device/vPE tables, share a L1
> shadow table with the host and strictly donate the L2 pages to prevent
> the host from writing malicious L2 pointers.
>
> >
> > Signed-off-by: Sebastian Ene <sebastianene@google.com>
> > ---
> > arch/arm64/kvm/hyp/nvhe/its_emulate.c | 143 ++++++++++++++++++++++++++
> > 1 file changed, 143 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/hyp/nvhe/its_emulate.c b/arch/arm64/kvm/hyp/nvhe/its_emulate.c
> > index 4a3ccc90a1a9..865a5d6353ed 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/its_emulate.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/its_emulate.c
> > @@ -141,6 +141,145 @@ static struct pkvm_protected_reg *get_region(phys_addr_t dev_addr)
> > return NULL;
> > }
> >
> > +static int pkvm_host_unmap_last_level(void *shadow, size_t num_pages, u32 psz)
> > +{
> > + u64 *table = shadow;
> > + int ret, i, end = (num_pages << PAGE_SHIFT) / sizeof(table);
> > + phys_addr_t table_addr;
>
> RCT, mixing initialized variables and uninitialized variables, plus
> variables of conceptually different "types" in the same declaration.
>
> Please use sizeof(*table): sizeof(table) evaluates to the size of the
> pointer (8 bytes), NOT the size of the array element. In this case,
> this happens to be the same, but it's still wrong.
>
> Maybe the following is clearer:
> + int end = num_pages * (PAGE_SIZE / sizeof(*table));
>
>
Will use the suggestion and do the same for the
pkvm_host_map_last_level.
> > +
> > + for (i = 0; i < end; i++) {
> > + if (!(table[i] & GITS_BASER_VALID))
> > + continue;
> > +
> > + table_addr = table[i] & PHYS_MASK;
> > + ret = __pkvm_host_donate_hyp(hyp_phys_to_pfn(table_addr), psz >> PAGE_SHIFT);
>
> The ITS-configured page size and the host page size could be
> different, but the number of pages to donate for Level 2 tables is
> calculated based on psz (the ITS).
>
> If the ITS hardware is configured for 4KB pages, but the host kernel
> is using (e.g.,) 64KB pages, psz >> PAGE_SHIFT evaluates to 0.
I need to revisit this, thanks for pointing out.
>
> You need to account for mismatched page sizes, perhaps by using
> DIV_ROUND_UP(psz, PAGE_SIZE) (or something similar) to ensure the
> containing host page is donated.
>
> > + if (ret)
> > + goto err_donate;
> > + }
> > +
> > + return 0;
> > +err_donate:
> > + for (i = i - 1; i >= 0; i--) {
>
> Please use the while (i--) idiom for rollback loops.
>
>
> > + if (!(table[i] & GITS_BASER_VALID))
> > + continue;
> > +
> > + table_addr = table[i] & PHYS_MASK;
> > + __pkvm_hyp_donate_host(hyp_phys_to_pfn(table_addr), psz >> PAGE_SHIFT);
>
> Please wrap this in WARN_ON(...). If donating back to the host fails
> during a rollback, we have a fatal page leak that needs to be loudly
> flagged, similar to how you handle it in pkvm_unshare_shadow_table.
>
>
> > + }
> > + return ret;
> > +}
> > +
> > +static int pkvm_share_shadow_table(void *shadow, u64 nr_pages)
> > +{
> > + u64 i, ret, start_pfn = hyp_virt_to_pfn(shadow);
>
> Same comment as before with RCT and the mixing of declarations.
>
>
> > +
> > + for (i = 0; i < nr_pages; i++) {
> > + ret = __pkvm_host_share_hyp(start_pfn + i);
> > + if (ret)
> > + goto unshare;
> > + }
> > +
> > + ret = hyp_pin_shared_mem(shadow, shadow + (nr_pages << PAGE_SHIFT));
> > + if (ret)
> > + goto unshare;
> > +
> > + return ret;
> > +unshare:
>
> Please use the while (i--) idiom for rollback loops.
>
> Also, please use consistent naming conventions for the labels. Here
> you call it unshare, and earlier it was err_donate.
>
>
> > + for (i = i - 1; i >= 0; i--)
> > + __pkvm_host_unshare_hyp(start_pfn + i);
> > + return ret;
> > +}
> > +
> > +static void pkvm_unshare_shadow_table(void *shadow, u64 nr_pages)
> > +{
> > + u64 i, start_pfn = hyp_virt_to_pfn(shadow);
> > +
> > + hyp_unpin_shared_mem(shadow, shadow + (nr_pages << PAGE_SHIFT));
> > +
> > + for (i = 0; i < nr_pages; i++)
> > + WARN_ON(__pkvm_host_unshare_hyp(start_pfn + i));
> > +}
> > +
> > +static void pkvm_host_map_last_level(void *shadow, size_t num_pages, u32 psz)
> > +{
> > + u64 *table;
>
> RCT, and you forgot to initialize table:
> + u64 *table = shadow;
Fixed this, thanks. I never ended up on this code path during testing,
maybe I should create a test for it to trigger it.
>
> > + int i, end = (num_pages << PAGE_SHIFT) / sizeof(table);
>
> Same sizeof(table) pointer-size bug as above.
>
>
> > + phys_addr_t table_addr;
> > +
> > + for (i = 0; i < end; i++) {
> > + if (!(table[i] & GITS_BASER_VALID))
> > + continue;
> > +
> > + table_addr = table[i] & ~GITS_BASER_VALID;
>
> Inconsistent masking logic, since in pkvm_host_unmap_last_level you
> correctly used PHYS_MASK to extract the address, but here in the
> rollback path you use ~GITS_BASER_VALID.
>
> While both currently work because the upper bits and lower bits (below
> the page size) are defined as RES0 in the GIC spec, ~GITS_BASER_VALID
> is architecturally fragile. If a future hardware revision repurposes
> the upper RES0 bits [62:52] for new attributes (e.g., memory
> encryption flags), ~GITS_BASER_VALID will leak those attribute bits
> into the physical address calculation.
>
> Since PHYS_MASK correctly handles the address extraction across all
> page sizes (relying on the lower bits being RES0) and safely masks off
> future upper attribute bits, please standardize on using table_addr =
> table[i] & PHYS_MASK; for both functions.
>
>
Fixed the inconsistency and used PHYS_MASK everywhere.
> > + WARN_ON(__pkvm_hyp_donate_host(hyp_phys_to_pfn(table_addr), psz >> PAGE_SHIFT));
> > + }
> > +}
> > +
> > +static int pkvm_setup_its_shadow_baser(struct its_shadow_tables *shadow)
> > +{
> > + int i, ret;
> > + u64 baser_val, num_pages, type;
> > + void *base, *host_base;
> > +
> > + for (i = 0; i < GITS_BASER_NR_REGS; i++) {
> > + baser_val = shadow->tables[i].val;
> > + if (!(baser_val & GITS_BASER_VALID))
> > + continue;
> > +
> > + base = kern_hyp_va(shadow->tables[i].base);
> > + num_pages = (1 << shadow->tables[i].order);
> > +
> > + ret = __pkvm_host_donate_hyp(hyp_virt_to_pfn(base), num_pages);
> > + if (ret)
> > + goto err_donate;
> > +
> > + if (baser_val & GITS_BASER_INDIRECT) {
> > + host_base = kern_hyp_va(shadow->tables[i].shadow);
> > + ret = pkvm_share_shadow_table(host_base, num_pages);
> > + if (ret)
> > + goto err_with_donation;
> > +
> > + type = GITS_BASER_TYPE(baser_val);
> > + if (type == GITS_BASER_TYPE_COLLECTION)
> > + continue;
> > +
> > + ret = pkvm_host_unmap_last_level(base, num_pages,
> > + shadow->tables[i].psz);
> > + if (ret)
> > + goto err_with_share;
> > + }
> > + }
> > +
> > + return 0;
> > +err_with_share:
> > + pkvm_unshare_shadow_table(host_base, num_pages);
> > +err_with_donation:
> > + __pkvm_hyp_donate_host(hyp_virt_to_pfn(base), num_pages);
> > +err_donate:
> > + for (i = i - 1; i >= 0; i--) {
>
> Please use the while (i--) idiom for rollback loops.
>
>
> > + baser_val = shadow->tables[i].val;
> > + if (!(baser_val & GITS_BASER_VALID))
> > + continue;
> > +
> > + base = kern_hyp_va(shadow->tables[i].base);
> > + num_pages = (1 << shadow->tables[i].order);
> > +
> > + WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(base), num_pages));
>
> The sequence of rollback operations here creates a TOCTOU vulnerability.
>
There is a different problem here related to functionality rather than
this: donating the base to the host first and then iterating over it
will make the hypervisor explode. I fixed this.
> - First, you donate base (the Level 1 indirect table) back to the host.
> - Then, you pass base into pkvm_host_map_last_level().
> - Finally, pkvm_host_map_last_level() reads table[i] out of base to
> determine which Level 2 pages to donate back to the host.
>
> Because the host regains ownership of base _first_, it can be running
> concurrently on another CPU. A malicious host can overwrite the Level
> 1 table with pointers to arbitrary hypervisor-owned memory. The
> hypervisor will then read those malicious pointers and dutifully grant
> the host access to its own secure memory.
>
> The order of operations needs to be reversed: you must read base to
> roll back the L2 pages, unshare the shadow table, and *only then*
> donate base back to the host.
>
> Also, num_pages = (1 << shadow->tables[i].order); calculates a 32-bit
> signed integer because the literal 1 is a signed 32-bit int. If order
> is 31, this evaluates to a negative number. If order is 32 or higher,
> this is undefined behavior. Because num_pages is declared as a u64,
> you should use the standard kernel macro BIT_ULL().
>
> Here's my suggested fix (not tested). Reorder the operations to safely
> rollback L2 before donating L1, use the standard `while (i--)` loop,
> and fix the page calculation:
>
> + while (i--) {
> + baser_val = shadow->tables[i].val;
> + if (!(baser_val & GITS_BASER_VALID))
> + continue;
> +
> + base = kern_hyp_va(shadow->tables[i].base);
> + num_pages = BIT_ULL(shadow->tables[i].order);
> +
> + if (baser_val & GITS_BASER_INDIRECT) {
> + host_base = kern_hyp_va(shadow->tables[i].shadow);
> +
> + type = GITS_BASER_TYPE(baser_val);
> + if (type != GITS_BASER_TYPE_COLLECTION)
> + pkvm_host_map_last_level(base, num_pages,
> + shadow->tables[i].psz);
> +
> + pkvm_unshare_shadow_table(host_base, num_pages);
> + }
> +
> + WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(base), num_pages));
> + }
>
>
>
> > + if (baser_val & GITS_BASER_INDIRECT) {
> > + host_base = kern_hyp_va(shadow->tables[i].shadow);
> > + pkvm_unshare_shadow_table(host_base, num_pages);
> > +
> > + type = GITS_BASER_TYPE(baser_val);
> > + if (type == GITS_BASER_TYPE_COLLECTION)
> > + continue;
> > +
> > + pkvm_host_map_last_level(base, num_pages, shadow->tables[i].psz);
> > + }
> > + }
>
> You have duplicated the entire table decoding logic (calculating base,
> num_pages, checking INDIRECT...) down here in the rollback path.
> Consider abstracting "setup one table" and "teardown one table" into
> helper functions to make pkvm_setup_its_shadow_baser more readable and
> less prone to copy-pasta errors.
>
> Cheers,
> /fuad
>
Thanks,
Sebastian
>
> > +
> > + return ret;
> > +}
> > +
> > static int pkvm_setup_its_shadow_cmdq(struct its_shadow_tables *shadow)
> > {
> > int ret, i, num_pages;
> > @@ -205,6 +344,10 @@ int pkvm_init_gic_its_emulation(phys_addr_t dev_addr, void *host_priv_state,
> > if (ret)
> > goto err_with_shadow;
> >
> > + ret = pkvm_setup_its_shadow_baser(shadow);
> > + if (ret)
> > + goto err_with_shadow;
> > +
> > its_reg->priv = priv_state;
> >
> > hyp_spin_lock_init(&priv_state->its_lock);
> > --
> > 2.53.0.473.g4a7958ca14-goog
> >
next prev parent reply other threads:[~2026-04-10 13:53 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-10 12:49 [RFC PATCH 00/14] KVM: ITS hardening for pKVM Sebastian Ene
2026-03-10 12:49 ` [PATCH 01/14] KVM: arm64: Donate MMIO to the hypervisor Sebastian Ene
2026-03-12 17:57 ` Fuad Tabba
2026-03-13 10:40 ` Suzuki K Poulose
2026-03-24 10:39 ` Vincent Donnefort
2026-04-17 17:18 ` Mostafa Saleh
2026-03-10 12:49 ` [PATCH 02/14] KVM: arm64: Track host-unmapped MMIO regions in a static array Sebastian Ene
2026-03-12 19:05 ` Fuad Tabba
2026-03-24 10:46 ` Vincent Donnefort
2026-03-10 12:49 ` [PATCH 03/14] KVM: arm64: Support host MMIO trap handlers for unmapped devices Sebastian Ene
2026-03-13 9:31 ` Fuad Tabba
2026-03-24 10:59 ` Vincent Donnefort
2026-03-10 12:49 ` [PATCH 04/14] KVM: arm64: Mediate host access to GIC/ITS MMIO via unmapping Sebastian Ene
2026-03-13 9:58 ` Fuad Tabba
2026-03-10 12:49 ` [PATCH 05/14] irqchip/gic-v3-its: Prepare shadow structures for KVM host deprivilege Sebastian Ene
2026-03-13 11:26 ` Fuad Tabba
2026-03-13 13:10 ` Fuad Tabba
2026-03-20 15:11 ` Sebastian Ene
2026-03-24 14:36 ` Fuad Tabba
2026-03-10 12:49 ` [PATCH 06/14] KVM: arm64: Add infrastructure for ITS emulation setup Sebastian Ene
2026-03-16 10:46 ` Fuad Tabba
2026-03-17 9:40 ` Fuad Tabba
2026-03-10 12:49 ` [PATCH 07/14] KVM: arm64: Restrict host access to the ITS tables Sebastian Ene
2026-03-16 16:13 ` Fuad Tabba
2026-04-10 13:52 ` Sebastian Ene [this message]
2026-03-10 12:49 ` [PATCH 08/14] KVM: arm64: Trap & emulate the ITS MAPD command Sebastian Ene
2026-03-17 10:20 ` Fuad Tabba
2026-04-08 14:05 ` Sebastian Ene
2026-03-10 12:49 ` [PATCH 09/14] KVM: arm64: Trap & emulate the ITS VMAPP command Sebastian Ene
2026-03-10 12:49 ` [PATCH 10/14] KVM: arm64: Trap & emulate the ITS MAPC command Sebastian Ene
2026-03-10 12:49 ` [PATCH 11/14] KVM: arm64: Restrict host updates to GITS_CTLR Sebastian Ene
2026-03-10 12:49 ` [PATCH 12/14] KVM: arm64: Restrict host updates to GITS_CBASER Sebastian Ene
2026-03-10 12:49 ` [PATCH 13/14] KVM: arm64: Restrict host updates to GITS_BASER Sebastian Ene
2026-03-10 12:49 ` [PATCH 14/14] KVM: arm64: Implement HVC interface for ITS emulation setup Sebastian Ene
2026-03-12 17:56 ` [RFC PATCH 00/14] KVM: ITS hardening for pKVM Fuad Tabba
2026-03-20 14:42 ` Sebastian Ene
2026-03-13 15:18 ` Mostafa Saleh
2026-03-15 13:24 ` Fuad Tabba
2026-03-25 16:26 ` Sebastian Ene
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=adkAtvBiYDH3_xSA@google.com \
--to=sebastianene@google.com \
--cc=alexandru.elisei@arm.com \
--cc=android-kvm@google.com \
--cc=bgrzesik@google.com \
--cc=catalin.marinas@arm.com \
--cc=joey.gouly@arm.com \
--cc=kees@kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=maz@kernel.org \
--cc=oupton@kernel.org \
--cc=perlarsen@google.com \
--cc=qperret@google.com \
--cc=rananta@google.com \
--cc=smostafa@google.com \
--cc=suzuki.poulose@arm.com \
--cc=tabba@google.com \
--cc=tglx@kernel.org \
--cc=vdonnefort@google.com \
--cc=will@kernel.org \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.