public inbox for linux-arm-kernel@lists.infradead.org
From: Sebastian Ene <sebastianene@google.com>
To: Fuad Tabba <tabba@google.com>
Cc: alexandru.elisei@arm.com, kvmarm@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, android-kvm@google.com,
	catalin.marinas@arm.com, joey.gouly@arm.com, kees@kernel.org,
	mark.rutland@arm.com, maz@kernel.org, oupton@kernel.org,
	perlarsen@google.com, qperret@google.com, rananta@google.com,
	smostafa@google.com, suzuki.poulose@arm.com, tglx@kernel.org,
	vdonnefort@google.com, bgrzesik@google.com, will@kernel.org,
	yuzenghui@huawei.com
Subject: Re: [PATCH 07/14] KVM: arm64: Restrict host access to the ITS tables
Date: Fri, 10 Apr 2026 13:52:54 +0000	[thread overview]
Message-ID: <adkAtvBiYDH3_xSA@google.com> (raw)
In-Reply-To: <CA+EHjTxUbLBnyBGg58wsJQvTXPN0FTbj53H5r3sC9_TizpjvdQ@mail.gmail.com>

On Mon, Mar 16, 2026 at 04:13:59PM +0000, Fuad Tabba wrote:

Hello Fuad,

> Hi Sebastian,
> 
> On Tue, 10 Mar 2026 at 12:49, Sebastian Ene <sebastianene@google.com> wrote:
> >
> > Setup shadow structures for ITS indirect tables held in
> > the GITS_BASER<n> registers.
> > Make the last level of the Device Table and vPE Table
> > inacessible to the host.
> 
> inacessible  -> inaccessible

Applied the fix, thanks.

> > In a direct layout configuration, donate the table to
> > the hypervisor since the software is not expected to
> > program them directly.
> 
> This commit message is too brief and doesn't fully explain the
> problem, the impact, and the mechanism of the solution. It also
> appears to contradict the actual code changes.
> 
> For example, could you elaborate why must the last level of indirect
> tables be inaccessible?

For the Device Table, a malicious host can program an ITT address that
points into hyp memory and then use MAPTI to make the ITS write over
that memory.

> 
> Can you also please explain the mechanism? You are parsing
> GITS_BASER_INDIRECT to determine if a shadow Level 1 table must be
> shared with the host, while unconditionally donating the original
> physical tables. You also explicitly exclude Collection tables. The
> message should briefly justify why Collection tables are safe to leave
> accessible to the host.
> 
> There is also a contradiction in the message. You state "In a direct
> layout configuration, donate the table...". However, your code donates
> the original hardware table unconditionally on every iteration of the
> loop, regardless of whether GITS_BASER_INDIRECT is set. Please ensure
> the commit log accurately reflects the code implementation.
>

I see no contradiction: I only need to shadow the first level of the
indirect tables. Shadowing implies both donation and sharing: we donate
the original tables from host -> hyp, and we share the host's view of
the tables (a copy) with the hypervisor.

> Maybe you could say that the problem is Host DMA attacks via ITS table
> manipulation. Whereas the mechanism is to unconditionally donate

This has nothing to do with host DMA attacks; it is simply the AP that
can write to that memory.

> hardware tables to EL2. For indirect Device/vPE tables, share a L1
> shadow table with the host and strictly donate the L2 pages to prevent
> the host from writing malicious L2 pointers.
> 
> >
> > Signed-off-by: Sebastian Ene <sebastianene@google.com>
> > ---
> >  arch/arm64/kvm/hyp/nvhe/its_emulate.c | 143 ++++++++++++++++++++++++++
> >  1 file changed, 143 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/hyp/nvhe/its_emulate.c b/arch/arm64/kvm/hyp/nvhe/its_emulate.c
> > index 4a3ccc90a1a9..865a5d6353ed 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/its_emulate.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/its_emulate.c
> > @@ -141,6 +141,145 @@ static struct pkvm_protected_reg *get_region(phys_addr_t dev_addr)
> >         return NULL;
> >  }
> >
> > +static int pkvm_host_unmap_last_level(void *shadow, size_t num_pages, u32 psz)
> > +{
> > +       u64 *table = shadow;
> > +       int ret, i, end = (num_pages << PAGE_SHIFT) / sizeof(table);
> > +       phys_addr_t table_addr;
> 
> RCT, mixing initialized variables and uninitialized variables, plus
> variables of conceptually different "types" in the same declaration.
> 
> Please use sizeof(*table): sizeof(table) evaluates to the size of the
> pointer (8 bytes), NOT the size of the array element. In this case,
> this happens to be the same, but it's still wrong.
> 
> Maybe the following is clearer:
> +        int end = num_pages * (PAGE_SIZE / sizeof(*table));
> 
>

Will use the suggestion and do the same in pkvm_host_map_last_level.

> > +
> > +       for (i = 0; i < end; i++) {
> > +               if (!(table[i] & GITS_BASER_VALID))
> > +                       continue;
> > +
> > +               table_addr = table[i] & PHYS_MASK;
> > +               ret = __pkvm_host_donate_hyp(hyp_phys_to_pfn(table_addr), psz >> PAGE_SHIFT);
> 
> The ITS-configured page size and the host page size could be
> different, but the number of pages to donate for Level 2 tables is
> calculated based on psz (the ITS).
> 
> If the ITS hardware is configured for 4KB pages, but the host kernel
> is using (e.g.,) 64KB pages, psz >> PAGE_SHIFT evaluates to 0.

I need to revisit this, thanks for pointing out.

> 
> You need to account for mismatched page sizes, perhaps by using
> DIV_ROUND_UP(psz, PAGE_SIZE) (or something similar) to ensure the
> containing host page is donated.
> 
> > +               if (ret)
> > +                       goto err_donate;
> > +       }
> > +
> > +       return 0;
> > +err_donate:
> > +       for (i = i - 1; i >= 0; i--) {
> 
> Please use the while (i--) idiom for rollback loops.
> 
> 
> > +               if (!(table[i] & GITS_BASER_VALID))
> > +                       continue;
> > +
> > +               table_addr = table[i] & PHYS_MASK;
> > +               __pkvm_hyp_donate_host(hyp_phys_to_pfn(table_addr), psz >> PAGE_SHIFT);
> 
> Please wrap this in WARN_ON(...). If donating back to the host fails
> during a rollback, we have a fatal page leak that needs to be loudly
> flagged, similar to how you handle it in pkvm_unshare_shadow_table.
> 
> 
> > +       }
> > +       return ret;
> > +}
> > +
> > +static int pkvm_share_shadow_table(void *shadow, u64 nr_pages)
> > +{
> > +       u64 i, ret, start_pfn = hyp_virt_to_pfn(shadow);
> 
> Same comment as before with RCT and the mixing of declarations.
> 
> 
> > +
> > +       for (i = 0; i < nr_pages; i++) {
> > +               ret = __pkvm_host_share_hyp(start_pfn + i);
> > +               if (ret)
> > +                       goto unshare;
> > +       }
> > +
> > +       ret = hyp_pin_shared_mem(shadow, shadow + (nr_pages << PAGE_SHIFT));
> > +       if (ret)
> > +               goto unshare;
> > +
> > +       return ret;
> > +unshare:
> 
> Please use the while (i--) idiom for rollback loops.
> 
> Also, please use consistent naming conventions for the labels. Here
> you call it unshare, and earlier it was err_donate.
> 
> 
> > +       for (i = i - 1; i >= 0; i--)
> > +               __pkvm_host_unshare_hyp(start_pfn + i);
> > +       return ret;
> > +}
> > +
> > +static void pkvm_unshare_shadow_table(void *shadow, u64 nr_pages)
> > +{
> > +       u64 i, start_pfn = hyp_virt_to_pfn(shadow);
> > +
> > +       hyp_unpin_shared_mem(shadow, shadow + (nr_pages << PAGE_SHIFT));
> > +
> > +       for (i = 0; i < nr_pages; i++)
> > +               WARN_ON(__pkvm_host_unshare_hyp(start_pfn + i));
> > +}
> > +
> > +static void pkvm_host_map_last_level(void *shadow, size_t num_pages, u32 psz)
> > +{
> > +       u64 *table;
> 
> RCT, and you forgot to initialize table:
> +       u64 *table = shadow;

Fixed this, thanks. I never hit this code path during testing; I should
probably add a test that triggers it.

> 
> > +       int i, end = (num_pages << PAGE_SHIFT) / sizeof(table);
> 
> Same sizeof(table) pointer-size bug as above.
> 
> 
> > +       phys_addr_t table_addr;
> > +
> > +       for (i = 0; i < end; i++) {
> > +               if (!(table[i] & GITS_BASER_VALID))
> > +                       continue;
> > +
> > +               table_addr = table[i] & ~GITS_BASER_VALID;
> 
> Inconsistent masking logic, since in pkvm_host_unmap_last_level you
> correctly used PHYS_MASK to extract the address, but here in the
> rollback path you use ~GITS_BASER_VALID.
> 
> While both currently work because the upper bits and lower bits (below
> the page size) are defined as RES0 in the GIC spec, ~GITS_BASER_VALID
> is architecturally fragile. If a future hardware revision repurposes
> the upper RES0 bits [62:52] for new attributes (e.g., memory
> encryption flags), ~GITS_BASER_VALID will leak those attribute bits
> into the physical address calculation.
> 
> Since PHYS_MASK correctly handles the address extraction across all
> page sizes (relying on the lower bits being RES0) and safely masks off
> future upper attribute bits, please standardize on using table_addr =
> table[i] & PHYS_MASK; for both functions.
> 
>

Fixed the inconsistency and used PHYS_MASK everywhere.

> > +               WARN_ON(__pkvm_hyp_donate_host(hyp_phys_to_pfn(table_addr), psz >> PAGE_SHIFT));
> > +       }
> > +}
> > +
> > +static int pkvm_setup_its_shadow_baser(struct its_shadow_tables *shadow)
> > +{
> > +       int i, ret;
> > +       u64 baser_val, num_pages, type;
> > +       void *base, *host_base;
> > +
> > +       for (i = 0; i < GITS_BASER_NR_REGS; i++) {
> > +               baser_val = shadow->tables[i].val;
> > +               if (!(baser_val & GITS_BASER_VALID))
> > +                       continue;
> > +
> > +               base = kern_hyp_va(shadow->tables[i].base);
> > +               num_pages = (1 << shadow->tables[i].order);
> > +
> > +               ret = __pkvm_host_donate_hyp(hyp_virt_to_pfn(base), num_pages);
> > +               if (ret)
> > +                       goto err_donate;
> > +
> > +               if (baser_val & GITS_BASER_INDIRECT) {
> > +                       host_base = kern_hyp_va(shadow->tables[i].shadow);
> > +                       ret = pkvm_share_shadow_table(host_base, num_pages);
> > +                       if (ret)
> > +                               goto err_with_donation;
> > +
> > +                       type = GITS_BASER_TYPE(baser_val);
> > +                       if (type == GITS_BASER_TYPE_COLLECTION)
> > +                               continue;
> > +
> > +                       ret = pkvm_host_unmap_last_level(base, num_pages,
> > +                                                        shadow->tables[i].psz);
> > +                       if (ret)
> > +                               goto err_with_share;
> > +               }
> > +       }
> > +
> > +       return 0;
> > +err_with_share:
> > +       pkvm_unshare_shadow_table(host_base, num_pages);
> > +err_with_donation:
> > +       __pkvm_hyp_donate_host(hyp_virt_to_pfn(base), num_pages);
> > +err_donate:
> > +       for (i = i - 1; i >= 0; i--) {
> 
> Please use the while (i--) idiom for rollback loops.
> 
> 
> > +               baser_val = shadow->tables[i].val;
> > +               if (!(baser_val & GITS_BASER_VALID))
> > +                       continue;
> > +
> > +               base = kern_hyp_va(shadow->tables[i].base);
> > +               num_pages = (1 << shadow->tables[i].order);
> > +
> > +               WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(base), num_pages));
> 
> The sequence of rollback operations here creates a TOCTOU vulnerability.
> 

There is a different problem here, one of functionality rather than
security: donating the base back to the host first and then iterating
over it will make the hypervisor explode. I fixed this.

> - First, you donate base (the Level 1 indirect table) back to the host.
> - Then, you pass base into pkvm_host_map_last_level().
> - Finally, pkvm_host_map_last_level() reads table[i] out of base to
> determine which Level 2 pages to donate back to the host.
>
> Because the host regains ownership of base _first_, it can be running
> concurrently on another CPU. A malicious host can overwrite the Level
> 1 table with pointers to arbitrary hypervisor-owned memory. The
> hypervisor will then read those malicious pointers and dutifully grant
> the host access to its own secure memory.
> 
> The order of operations needs to be reversed: you must read base to
> roll back the L2 pages, unshare the shadow table, and *only then*
> donate base back to the host.
> 
> Also, num_pages = (1 << shadow->tables[i].order); calculates a 32-bit
> signed integer because the literal 1 is a signed 32-bit int. If order
> is 31, this evaluates to a negative number. If order is 32 or higher,
> this is undefined behavior. Because num_pages is declared as a u64,
> you should use the standard kernel macro BIT_ULL().
> 
> Here's my suggested fix (not tested). Reorder the operations to safely
> rollback L2 before donating L1, use the standard `while (i--)` loop,
> and fix the page calculation:
> 
> +    while (i--) {
> +        baser_val = shadow->tables[i].val;
> +        if (!(baser_val & GITS_BASER_VALID))
> +            continue;
> +
> +        base = kern_hyp_va(shadow->tables[i].base);
> +        num_pages = BIT_ULL(shadow->tables[i].order);
> +
> +        if (baser_val & GITS_BASER_INDIRECT) {
> +            host_base = kern_hyp_va(shadow->tables[i].shadow);
> +
> +            type = GITS_BASER_TYPE(baser_val);
> +            if (type != GITS_BASER_TYPE_COLLECTION)
> +                pkvm_host_map_last_level(base, num_pages,
> +                                         shadow->tables[i].psz);
> +
> +            pkvm_unshare_shadow_table(host_base, num_pages);
> +        }
> +
> +        WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(base), num_pages));
> +    }
> 
> 
> 
> > +               if (baser_val & GITS_BASER_INDIRECT) {
> > +                       host_base = kern_hyp_va(shadow->tables[i].shadow);
> > +                       pkvm_unshare_shadow_table(host_base, num_pages);
> > +
> > +                       type = GITS_BASER_TYPE(baser_val);
> > +                       if (type == GITS_BASER_TYPE_COLLECTION)
> > +                               continue;
> > +
> > +                       pkvm_host_map_last_level(base, num_pages, shadow->tables[i].psz);
> > +               }
> > +       }
> 
> You have duplicated the entire table decoding logic (calculating base,
> num_pages, checking INDIRECT...) down here in the rollback path.
> Consider abstracting "setup one table" and "teardown one table" into
> helper functions to make pkvm_setup_its_shadow_baser more readable and
> less prone to copy-pasta errors.
>
> Cheers,
> /fuad
> 

Thanks,
Sebastian

> 
> > +
> > +       return ret;
> > +}
> > +
> >  static int pkvm_setup_its_shadow_cmdq(struct its_shadow_tables *shadow)
> >  {
> >         int ret, i, num_pages;
> > @@ -205,6 +344,10 @@ int pkvm_init_gic_its_emulation(phys_addr_t dev_addr, void *host_priv_state,
> >         if (ret)
> >                 goto err_with_shadow;
> >
> > +       ret = pkvm_setup_its_shadow_baser(shadow);
> > +       if (ret)
> > +               goto err_with_shadow;
> > +
> >         its_reg->priv = priv_state;
> >
> >         hyp_spin_lock_init(&priv_state->its_lock);
> > --
> > 2.53.0.473.g4a7958ca14-goog
> >


Thread overview: 38+ messages
2026-03-10 12:49 [RFC PATCH 00/14] KVM: ITS hardening for pKVM Sebastian Ene
2026-03-10 12:49 ` [PATCH 01/14] KVM: arm64: Donate MMIO to the hypervisor Sebastian Ene
2026-03-12 17:57   ` Fuad Tabba
2026-03-13 10:40   ` Suzuki K Poulose
2026-03-24 10:39   ` Vincent Donnefort
2026-03-10 12:49 ` [PATCH 02/14] KVM: arm64: Track host-unmapped MMIO regions in a static array Sebastian Ene
2026-03-12 19:05   ` Fuad Tabba
2026-03-24 10:46   ` Vincent Donnefort
2026-03-10 12:49 ` [PATCH 03/14] KVM: arm64: Support host MMIO trap handlers for unmapped devices Sebastian Ene
2026-03-13  9:31   ` Fuad Tabba
2026-03-24 10:59   ` Vincent Donnefort
2026-03-10 12:49 ` [PATCH 04/14] KVM: arm64: Mediate host access to GIC/ITS MMIO via unmapping Sebastian Ene
2026-03-13  9:58   ` Fuad Tabba
2026-03-10 12:49 ` [PATCH 05/14] irqchip/gic-v3-its: Prepare shadow structures for KVM host deprivilege Sebastian Ene
2026-03-13 11:26   ` Fuad Tabba
2026-03-13 13:10     ` Fuad Tabba
2026-03-20 15:11     ` Sebastian Ene
2026-03-24 14:36       ` Fuad Tabba
2026-03-10 12:49 ` [PATCH 06/14] KVM: arm64: Add infrastructure for ITS emulation setup Sebastian Ene
2026-03-16 10:46   ` Fuad Tabba
2026-03-17  9:40     ` Fuad Tabba
2026-03-10 12:49 ` [PATCH 07/14] KVM: arm64: Restrict host access to the ITS tables Sebastian Ene
2026-03-16 16:13   ` Fuad Tabba
2026-04-10 13:52     ` Sebastian Ene [this message]
2026-03-10 12:49 ` [PATCH 08/14] KVM: arm64: Trap & emulate the ITS MAPD command Sebastian Ene
2026-03-17 10:20   ` Fuad Tabba
2026-04-08 14:05     ` Sebastian Ene
2026-03-10 12:49 ` [PATCH 09/14] KVM: arm64: Trap & emulate the ITS VMAPP command Sebastian Ene
2026-03-10 12:49 ` [PATCH 10/14] KVM: arm64: Trap & emulate the ITS MAPC command Sebastian Ene
2026-03-10 12:49 ` [PATCH 11/14] KVM: arm64: Restrict host updates to GITS_CTLR Sebastian Ene
2026-03-10 12:49 ` [PATCH 12/14] KVM: arm64: Restrict host updates to GITS_CBASER Sebastian Ene
2026-03-10 12:49 ` [PATCH 13/14] KVM: arm64: Restrict host updates to GITS_BASER Sebastian Ene
2026-03-10 12:49 ` [PATCH 14/14] KVM: arm64: Implement HVC interface for ITS emulation setup Sebastian Ene
2026-03-12 17:56 ` [RFC PATCH 00/14] KVM: ITS hardening for pKVM Fuad Tabba
2026-03-20 14:42   ` Sebastian Ene
2026-03-13 15:18 ` Mostafa Saleh
2026-03-15 13:24   ` Fuad Tabba
2026-03-25 16:26   ` Sebastian Ene
