public inbox for linux-arm-kernel@lists.infradead.org
From: Alexandru Elisei <alexandru.elisei@arm.com>
To: Will Deacon <will@kernel.org>
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
	Marc Zyngier <maz@kernel.org>, Oliver Upton <oupton@kernel.org>,
	Joey Gouly <joey.gouly@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Quentin Perret <qperret@google.com>,
	Fuad Tabba <tabba@google.com>,
	Vincent Donnefort <vdonnefort@google.com>,
	Mostafa Saleh <smostafa@google.com>
Subject: Re: [PATCH v2 14/35] KVM: arm64: Handle aborts from protected VMs
Date: Thu, 12 Feb 2026 10:37:19 +0000
Message-ID: <aY2tX6V0pCqwGth5@raptor>
In-Reply-To: <20260119124629.2563-15-will@kernel.org>

Hi Will,

On Mon, Jan 19, 2026 at 12:46:07PM +0000, Will Deacon wrote:
> Introduce a new abort handler for resolving stage-2 page faults from
> protected VMs by pinning and donating anonymous memory. This is
> considerably simpler than the infamous user_mem_abort() as we only have
> to deal with translation faults at the pte level.
> 
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/kvm/mmu.c | 89 ++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 81 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index a23a4b7f108c..b21a5bf3d104 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1641,6 +1641,74 @@ static int gmem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	return ret != -EAGAIN ? ret : 0;
>  }
>  
> +static int pkvm_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> +		struct kvm_memory_slot *memslot, unsigned long hva)
> +{
> +	unsigned int flags = FOLL_HWPOISON | FOLL_LONGTERM | FOLL_WRITE;
> +	struct kvm_pgtable *pgt = vcpu->arch.hw_mmu->pgt;
> +	struct mm_struct *mm = current->mm;
> +	struct kvm *kvm = vcpu->kvm;
> +	void *hyp_memcache;
> +	struct page *page;
> +	int ret;
> +
> +	ret = prepare_mmu_memcache(vcpu, true, &hyp_memcache);
> +	if (ret)
> +		return -ENOMEM;
> +
> +	ret = account_locked_vm(mm, 1, true);
> +	if (ret)
> +		return ret;
> +
> +	mmap_read_lock(mm);
> +	ret = pin_user_pages(hva, 1, flags, &page);
> +	mmap_read_unlock(mm);

If the page is part of a large folio, the entire folio gets pinned here, not
just the page returned by pin_user_pages(). Do you reckon that should be
considered when calling account_locked_vm()?
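
To make the accounting question concrete, here is a small userspace sketch
(not kernel code). It assumes, as suggested above, that pinning one page keeps
the whole folio resident. The struct and helper names are hypothetical and
exist only for illustration; they are not part of the patch or of any kernel
API.

```c
#include <assert.h>

/* Hypothetical stand-in for a folio: only the page count matters here. */
struct model_folio {
	unsigned long nr_pages;	/* 1 for a small folio, >1 for a large one */
};

/*
 * Number of pages kept resident by the pin but not charged to locked_vm,
 * given that 'charged' pages of this folio have been accounted so far
 * (the patch charges one page per resolved fault).
 */
static unsigned long unaccounted_pages(const struct model_folio *folio,
				       unsigned long charged)
{
	assert(folio->nr_pages >= 1);
	assert(charged <= folio->nr_pages);
	return folio->nr_pages - charged;
}
```

For example, with a 512-page folio and a single faulted page, this model
reports 511 pages that are held in memory without being accounted; for a
single-page folio the gap is zero.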

> +
> +	if (ret == -EHWPOISON) {
> +		kvm_send_hwpoison_signal(hva, PAGE_SHIFT);
> +		ret = 0;
> +		goto dec_account;
> +	} else if (ret != 1) {
> +		ret = -EFAULT;
> +		goto dec_account;
> +	} else if (!folio_test_swapbacked(page_folio(page))) {
> +		/*
> +		 * We really can't deal with page-cache pages returned by GUP
> +		 * because (a) we may trigger writeback of a page for which we
> +		 * no longer have access and (b) page_mkclean() won't find the
> +		 * stage-2 mapping in the rmap so we can get out-of-whack with
> +		 * the filesystem when marking the page dirty during unpinning
> +		 * (see cc5095747edf ("ext4: don't BUG if someone dirty pages
> +		 * without asking ext4 first")).

I've been trying to wrap my head around this. Would you mind providing a few
more hints about what the issue is? I'm sure the approach is correct, it's
likely just me not being familiar with the code.

> +		 *
> +		 * Ideally we'd just restrict ourselves to anonymous pages, but
> +		 * we also want to allow memfd (i.e. shmem) pages, so check for
> +		 * pages backed by swap in the knowledge that the GUP pin will
> +		 * prevent try_to_unmap() from succeeding.
> +		 */
> +		ret = -EIO;
> +		goto unpin;
> +	}
> +
> +	write_lock(&kvm->mmu_lock);
> +	ret = pkvm_pgtable_stage2_map(pgt, fault_ipa, PAGE_SIZE,
> +				      page_to_phys(page), KVM_PGTABLE_PROT_RWX,
> +				      hyp_memcache, 0);
> +	write_unlock(&kvm->mmu_lock);
> +	if (ret) {
> +		if (ret == -EAGAIN)
> +			ret = 0;
> +		goto unpin;
> +	}

This looks correct to me: there's no need to check the notifier sequence
number if the MMU notifiers are ignored, and concurrent faults on the same
page are handled by treating -EAGAIN as success.
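
As a sanity check of that reading, the result handling in the hunk above can
be modelled in userspace roughly as follows. This is only a sketch: the
struct and function names are hypothetical, and it mirrors just the control
flow after pkvm_pgtable_stage2_map() returns, not the mapping itself.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical summary of what the fault handler does with a map result. */
struct map_outcome {
	int ret;	/* value returned to the caller */
	bool keep_pin;	/* true if the page stays pinned and accounted */
};

static struct map_outcome resolve_map_result(int map_ret)
{
	struct map_outcome out = { .ret = map_ret, .keep_pin = false };

	if (map_ret == 0) {
		/* Mapping installed: the pin and the accounting are kept. */
		out.keep_pin = true;
	} else if (map_ret == -EAGAIN) {
		/*
		 * Treated as success: report 0, but still take the unpin
		 * path so the pin and the accounting are dropped.
		 */
		out.ret = 0;
	}
	/* Any other error: unpin, undo the accounting, propagate the error. */
	return out;
}
```

In this model only a successful map keeps the pin; the -EAGAIN case returns 0
and releases both the pin and the locked_vm charge, matching the "goto unpin"
in the quoted code.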

> +
> +	return 0;
> +unpin:
> +	unpin_user_pages(&page, 1);
> +dec_account:
> +	account_locked_vm(mm, 1, false);
> +	return ret;
> +}
> +
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  			  struct kvm_s2_trans *nested,
>  			  struct kvm_memory_slot *memslot, unsigned long hva,
> @@ -2190,15 +2258,20 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>  		goto out_unlock;
>  	}
>  
> -	VM_WARN_ON_ONCE(kvm_vcpu_trap_is_permission_fault(vcpu) &&
> -			!write_fault && !kvm_vcpu_trap_is_exec_fault(vcpu));
> +	if (kvm_vm_is_protected(vcpu->kvm)) {
> +		ret = pkvm_mem_abort(vcpu, fault_ipa, memslot, hva);

I guess the reason this comes after handling an access fault is that you want
the WARN_ON() to trigger in pkvm_pgtable_stage2_mkyoung().

Thanks,
Alex

> +	} else {
> +		VM_WARN_ON_ONCE(kvm_vcpu_trap_is_permission_fault(vcpu) &&
> +				!write_fault &&
> +				!kvm_vcpu_trap_is_exec_fault(vcpu));
>  
> -	if (kvm_slot_has_gmem(memslot))
> -		ret = gmem_abort(vcpu, fault_ipa, nested, memslot,
> -				 esr_fsc_is_permission_fault(esr));
> -	else
> -		ret = user_mem_abort(vcpu, fault_ipa, nested, memslot, hva,
> -				     esr_fsc_is_permission_fault(esr));
> +		if (kvm_slot_has_gmem(memslot))
> +			ret = gmem_abort(vcpu, fault_ipa, nested, memslot,
> +					 esr_fsc_is_permission_fault(esr));
> +		else
> +			ret = user_mem_abort(vcpu, fault_ipa, nested, memslot, hva,
> +					     esr_fsc_is_permission_fault(esr));
> +	}
>  	if (ret == 0)
>  		ret = 1;
>  out:
> -- 
> 2.52.0.457.g6b5491de43-goog
> 
> 

