linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Mostafa Saleh <smostafa@google.com>
To: Will Deacon <will@kernel.org>
Cc: linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
	maz@kernel.org, oliver.upton@linux.dev, joey.gouly@arm.com,
	suzuki.poulose@arm.com, yuzenghui@huawei.com,
	catalin.marinas@arm.com, robin.murphy@arm.com,
	jean-philippe@linaro.org, qperret@google.com, tabba@google.com,
	jgg@ziepe.ca, mark.rutland@arm.com, praan@google.com
Subject: Re: [PATCH v4 02/28] KVM: arm64: Donate MMIO to the hypervisor
Date: Tue, 16 Sep 2025 13:27:39 +0000	[thread overview]
Message-ID: <aMlly_9kDE04Inn2@google.com> (raw)
In-Reply-To: <aMA13Sowq91XEqP9@willie-the-truck>

On Tue, Sep 09, 2025 at 03:12:45PM +0100, Will Deacon wrote:
> On Tue, Aug 19, 2025 at 09:51:30PM +0000, Mostafa Saleh wrote:
> > Add a function to donate MMIO to the hypervisor so IOMMU hypervisor
> > drivers can use that to protect the MMIO of IOMMU.
> > The initial attempt to implement this was to have a new flag to
> > "___pkvm_host_donate_hyp" to accept MMIO. However that had many problems,
> > it was quite intrusive for host/hyp to check/set page state to make it
> > aware of MMIO and to encode the state in the page table in that case.
> > Which is called in paths that can be sensitive to performance (FFA, VMs..)
> > 
> > As donating MMIO is very rare, and we don’t need to encode the full state,
> > it’s reasonable to have a separate function to do this.
> > It will init the host s2 page table with an invalid leaf with the owner ID
> > to prevent the host from mapping the page on faults.
> > 
> > Also, prevent kvm_pgtable_stage2_unmap() from removing owner ID from
> > stage-2 PTEs, as this can be triggered from recycle logic under memory
> > pressure. There is no code relying on this, as all ownership changes is
> > done via kvm_pgtable_stage2_set_owner()
> > 
> > For error path in IOMMU drivers, add a function to donate MMIO back
> > from hyp to host.
> > 
> > Signed-off-by: Mostafa Saleh <smostafa@google.com>
> > ---
> >  arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  2 +
> >  arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 64 +++++++++++++++++++
> >  arch/arm64/kvm/hyp/pgtable.c                  |  9 +--
> >  3 files changed, 68 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
> > index 52d7ee91e18c..98e173da0f9b 100644
> > --- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
> > +++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
> > @@ -37,6 +37,8 @@ int __pkvm_host_share_hyp(u64 pfn);
> >  int __pkvm_host_unshare_hyp(u64 pfn);
> >  int __pkvm_host_donate_hyp(u64 pfn, u64 nr_pages);
> >  int ___pkvm_host_donate_hyp(u64 pfn, u64 nr_pages, enum kvm_pgtable_prot prot);
> > +int __pkvm_host_donate_hyp_mmio(u64 pfn);
> > +int __pkvm_hyp_donate_host_mmio(u64 pfn);
> >  int __pkvm_hyp_donate_host(u64 pfn, u64 nr_pages);
> >  int __pkvm_host_share_ffa(u64 pfn, u64 nr_pages);
> >  int __pkvm_host_unshare_ffa(u64 pfn, u64 nr_pages);
> > diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > index 861e448183fd..c9a15ef6b18d 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > @@ -799,6 +799,70 @@ int ___pkvm_host_donate_hyp(u64 pfn, u64 nr_pages, enum kvm_pgtable_prot prot)
> >  	return ret;
> >  }
> >  
> > +int __pkvm_host_donate_hyp_mmio(u64 pfn)
> > +{
> > +	u64 phys = hyp_pfn_to_phys(pfn);
> > +	void *virt = __hyp_va(phys);
> > +	int ret;
> > +	kvm_pte_t pte;
> > +
> > +	host_lock_component();
> > +	hyp_lock_component();
> > +
> > +	ret = kvm_pgtable_get_leaf(&host_mmu.pgt, phys, &pte, NULL);
> > +	if (ret)
> > +		goto unlock;
> > +
> > +	if (pte && !kvm_pte_valid(pte)) {
> > +		ret = -EPERM;
> > +		goto unlock;
> > +	}
> 
> Shouldn't we first check that the pfn is indeed MMIO? Otherwise, testing
> the pte for the ownership information isn't right.

I will add it, although the input should be trusted as it comes from the
hypervisor SMMUv3 driver.

> 
> > +	ret = kvm_pgtable_get_leaf(&pkvm_pgtable, (u64)virt, &pte, NULL);
> > +	if (ret)
> > +		goto unlock;
> > +	if (pte) {
> > +		ret = -EBUSY;
> > +		goto unlock;
> > +	}
> > +
> > +	ret = pkvm_create_mappings_locked(virt, virt + PAGE_SIZE, PAGE_HYP_DEVICE);
> > +	if (ret)
> > +		goto unlock;
> > +	/*
> > +	 * We set HYP as the owner of the MMIO pages in the host stage-2, for:
> > +	 * - host aborts: host_stage2_adjust_range() would fail for invalid non zero PTEs.
> > +	 * - recycle under memory pressure: host_stage2_unmap_dev_all() would call
> > +	 *   kvm_pgtable_stage2_unmap() which will not clear non zero invalid ptes (counted).
> > +	 * - other MMIO donation: Would fail as we check that the PTE is valid or empty.
> > +	 */
> > +	WARN_ON(host_stage2_try(kvm_pgtable_stage2_set_owner, &host_mmu.pgt, phys,
> > +				PAGE_SIZE, &host_s2_pool, PKVM_ID_HYP));
> > +unlock:
> > +	hyp_unlock_component();
> > +	host_unlock_component();
> > +
> > +	return ret;
> > +}
> > +
> > +int __pkvm_hyp_donate_host_mmio(u64 pfn)
> > +{
> > +	u64 phys = hyp_pfn_to_phys(pfn);
> > +	u64 virt = (u64)__hyp_va(phys);
> > +	size_t size = PAGE_SIZE;
> > +
> > +	host_lock_component();
> > +	hyp_lock_component();
> 
> Shouldn't we check that:
> 
>   1. pfn is mmio
>   2. pfn is owned by hyp
>   3. The host doesn't have something mapped at pfn already
> 
> ?
> 

I thought about this initially, but as
- This code is only called from the hypervisor with trusted
  inputs (only at boot)
- Only called on error path

So WARN_ON in case of failure to unmap MMIO pages seemed is good enough,
to avoid extra code.

But I can add the checks if you think they are necessary, we will need
to add new helpers for MMIO state though.

> > +	WARN_ON(kvm_pgtable_hyp_unmap(&pkvm_pgtable, virt, size) != size);
> > +	WARN_ON(host_stage2_try(kvm_pgtable_stage2_set_owner, &host_mmu.pgt, phys,
> > +				PAGE_SIZE, &host_s2_pool, PKVM_ID_HOST));
> > +	hyp_unlock_component();
> > +	host_unlock_component();
> > +
> > +	return 0;
> > +}
> > +
> >  int __pkvm_host_donate_hyp(u64 pfn, u64 nr_pages)
> >  {
> >  	return ___pkvm_host_donate_hyp(pfn, nr_pages, PAGE_HYP);
> > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > index c351b4abd5db..ba06b0c21d5a 100644
> > --- a/arch/arm64/kvm/hyp/pgtable.c
> > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > @@ -1095,13 +1095,8 @@ static int stage2_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
> >  	kvm_pte_t *childp = NULL;
> >  	bool need_flush = false;
> >  
> > -	if (!kvm_pte_valid(ctx->old)) {
> > -		if (stage2_pte_is_counted(ctx->old)) {
> > -			kvm_clear_pte(ctx->ptep);
> > -			mm_ops->put_page(ctx->ptep);
> > -		}
> > -		return 0;
> > -	}
> > +	if (!kvm_pte_valid(ctx->old))
> > +		return stage2_pte_is_counted(ctx->old) ? -EPERM : 0;
> 
> Can this code be reached for the guest? For example, if
> pkvm_pgtable_stage2_destroy() runs into an MMIO-guarded pte on teardown?

AFAICT, VMs page table is destroyed from reclaim_pgtable_pages() =>
kvm_pgtable_stage2_destroy() => kvm_pgtable_stage2_destroy_range() ... =>
stage2_free_walker()

Which doesn't interact with “stage2_unmap_walker”, so that should be
fine.

Thanks,
Mostafa


> 
> Will


  reply	other threads:[~2025-09-16 13:27 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-19 21:51 [PATCH v4 00/28] KVM: arm64: SMMUv3 driver for pKVM (trap and emulate) Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 01/28] KVM: arm64: Add a new function to donate memory with prot Mostafa Saleh
2025-09-09 13:46   ` Will Deacon
2025-09-14 19:23     ` Pranjal Shrivastava
2025-09-16 11:58       ` Mostafa Saleh
2025-09-16 11:56     ` Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 02/28] KVM: arm64: Donate MMIO to the hypervisor Mostafa Saleh
2025-09-09 14:12   ` Will Deacon
2025-09-16 13:27     ` Mostafa Saleh [this message]
2025-09-26 14:33       ` Will Deacon
2025-09-29 10:57         ` Mostafa Saleh
2025-09-14 20:41   ` Pranjal Shrivastava
2025-09-16 13:43     ` Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 03/28] KVM: arm64: pkvm: Add pkvm_time_get() Mostafa Saleh
2025-09-09 14:16   ` Will Deacon
2025-09-09 15:56     ` Marc Zyngier
2025-09-15 11:10       ` Pranjal Shrivastava
2025-09-16 14:04       ` Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 04/28] iommu/io-pgtable-arm: Move selftests to a separate file Mostafa Saleh
2025-09-15 14:37   ` Pranjal Shrivastava
2025-09-16 14:07     ` Mostafa Saleh
2025-09-15 16:45   ` Jason Gunthorpe
2025-09-16 14:09     ` Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 05/28] iommu/io-pgtable-arm: Factor kernel specific code out Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 06/28] iommu/arm-smmu-v3: Split code with hyp Mostafa Saleh
2025-09-09 14:23   ` Will Deacon
2025-09-16 14:10     ` Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 07/28] iommu/arm-smmu-v3: Move TLB range invalidation into a macro Mostafa Saleh
2025-09-09 14:25   ` Will Deacon
2025-08-19 21:51 ` [PATCH v4 08/28] iommu/arm-smmu-v3: Move IDR parsing to common functions Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 09/28] KVM: arm64: iommu: Introduce IOMMU driver infrastructure Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 10/28] KVM: arm64: iommu: Shadow host stage-2 page table Mostafa Saleh
2025-09-09 14:42   ` Will Deacon
2025-09-16 14:24     ` Mostafa Saleh
2025-09-26 14:42       ` Will Deacon
2025-09-29 11:01         ` Mostafa Saleh
2025-09-30 12:38           ` Jason Gunthorpe
2025-09-30 12:55             ` Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 11/28] KVM: arm64: iommu: Add memory pool Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 12/28] KVM: arm64: iommu: Support DABT for IOMMU Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 13/28] iommu/arm-smmu-v3-kvm: Add SMMUv3 driver Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 14/28] iommu/arm-smmu-v3: Add KVM mode in the driver Mostafa Saleh
2025-09-12 13:52   ` Will Deacon
2025-09-16 14:30     ` Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 15/28] iommu/arm-smmu-v3: Load the driver later in KVM mode Mostafa Saleh
2025-09-12 13:54   ` Will Deacon
2025-09-23 14:35     ` Mostafa Saleh
2025-09-23 17:38       ` Jason Gunthorpe
2025-09-29 11:10         ` Mostafa Saleh
2025-10-02 15:13           ` Jason Gunthorpe
2025-11-05 16:40             ` Mostafa Saleh
2025-11-05 17:12               ` Jason Gunthorpe
2025-11-06 11:06                 ` Mostafa Saleh
2025-11-06 13:23                   ` Jason Gunthorpe
2025-11-06 16:54                     ` Mostafa Saleh
2025-11-06 17:16                       ` Jason Gunthorpe
2025-08-19 21:51 ` [PATCH v4 16/28] iommu/arm-smmu-v3-kvm: Create array for hyp SMMUv3 Mostafa Saleh
2025-09-09 18:30   ` Daniel Mentz
2025-09-16 14:35     ` Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 17/28] iommu/arm-smmu-v3-kvm: Take over SMMUs Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 18/28] iommu/arm-smmu-v3-kvm: Probe SMMU HW Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 19/28] iommu/arm-smmu-v3-kvm: Add MMIO emulation Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 20/28] iommu/arm-smmu-v3-kvm: Shadow the command queue Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 21/28] iommu/arm-smmu-v3-kvm: Add CMDQ functions Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 22/28] iommu/arm-smmu-v3-kvm: Emulate CMDQ for host Mostafa Saleh
2025-09-12 14:18   ` Will Deacon
2025-09-15 16:38     ` Jason Gunthorpe
2025-09-16 15:19       ` Mostafa Saleh
2025-09-17 12:36         ` Jason Gunthorpe
2025-09-17 15:01           ` Will Deacon
2025-09-17 15:16             ` Jason Gunthorpe
2025-09-17 15:25               ` Will Deacon
2025-09-17 15:59                 ` Jason Gunthorpe
2025-09-18 10:26                   ` Will Deacon
2025-09-18 14:36                     ` Jason Gunthorpe
2025-09-16 14:50     ` Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 23/28] iommu/arm-smmu-v3-kvm: Shadow stream table Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 24/28] iommu/arm-smmu-v3-kvm: Shadow STEs Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 25/28] iommu/arm-smmu-v3-kvm: Emulate GBPA Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 26/28] iommu/arm-smmu-v3-kvm: Support io-pgtable Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 27/28] iommu/arm-smmu-v3-kvm: Shadow the CPU stage-2 page table Mostafa Saleh
2025-08-19 21:51 ` [PATCH v4 28/28] iommu/arm-smmu-v3-kvm: Enable nesting Mostafa Saleh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aMlly_9kDE04Inn2@google.com \
    --to=smostafa@google.com \
    --cc=catalin.marinas@arm.com \
    --cc=iommu@lists.linux.dev \
    --cc=jean-philippe@linaro.org \
    --cc=jgg@ziepe.ca \
    --cc=joey.gouly@arm.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=maz@kernel.org \
    --cc=oliver.upton@linux.dev \
    --cc=praan@google.com \
    --cc=qperret@google.com \
    --cc=robin.murphy@arm.com \
    --cc=suzuki.poulose@arm.com \
    --cc=tabba@google.com \
    --cc=will@kernel.org \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).