Re: [PATCH 3/3] KVM: arm64: nv: Punt stage-2 recycling to a vCPU request

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: Oliver Upton <oliver.upton@linux.dev>
Cc: kvmarm@lists.linux.dev, Marc Zyngier <maz@kernel.org>,
	Joey Gouly <joey.gouly@arm.com>,
	 Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>
Subject: Re: [PATCH 3/3] KVM: arm64: nv: Punt stage-2 recycling to a vCPU request
Date: Tue, 1 Oct 2024 12:05:01 -0700	[thread overview]
Message-ID: <ZvxH3el9SNuNWwi8@google.com> (raw)
In-Reply-To: <20241001001709.1303668-4-oliver.upton@linux.dev>

On Tue, Oct 01, 2024, Oliver Upton wrote:
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 367f0d3d3194..ab7d387379a0 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1031,6 +1031,8 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
>  
>  		if (kvm_dirty_ring_check_request(vcpu))
>  			return 0;
> +
> +		check_nested_vcpu_requests(vcpu);
>  	}
>  
>  	return 1;
> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index 1ba211541290..749babdbda81 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
> @@ -632,9 +632,9 @@ static struct kvm_s2_mmu *get_s2_mmu_nested(struct kvm_vcpu *vcpu)
>  	/* set the scene for the next search */
>  	kvm->arch.nested_mmus_next = (i + 1) % kvm->arch.nested_mmus_size;
>  
> -	/* clear the old state */
> +	/* make sure we don't forget to do the laundry */
>  	if (kvm_s2_mmu_valid(s2_mmu))
> -		kvm_stage2_unmap_range(s2_mmu, 0, kvm_phys_size(s2_mmu), false);
> +		s2_mmu->pending_unmap = true;

Question time.  The code that's just out of sight modifies the nested MMU's tlb_vttbr:

	s2_mmu->tlb_vttbr = vcpu_read_sys_reg(vcpu, vttbr_el2) & ~vttbr_cnp_bit;
	s2_mmu->tlb_vtcr = vcpu_read_sys_reg(vcpu, vtcr_el2);
	s2_mmu->nested_stage2_enabled = vcpu_read_sys_reg(vcpu, hcr_el2) & hcr_vm;

which is consumed by kvm_s2_mmu_iterate_by_vmid().  IIUC, kvm_s2_mmu_iterate_by_vmid()
is used only when emulating some form of tlbi that's executed by the guest?  I.e.
is guaranteed to do the right thing because KVM ensures the old state is purged
prior to entering the guest?

>  	/*
>  	 * the virtual vmid (modulo cnp) will be used as a key when matching
> @@ -649,6 +649,9 @@ static struct kvm_s2_mmu *get_s2_mmu_nested(struct kvm_vcpu *vcpu)
>  	s2_mmu->nested_stage2_enabled = vcpu_read_sys_reg(vcpu, hcr_el2) & hcr_vm;
>  
>  out:
> +	if (s2_mmu->pending_unmap)
> +		kvm_make_request(kvm_req_nested_s2_unmap, vcpu);

If I followed everything correctly, I don't think a request is needed.  the
request will never be cross-vCPU, and each vCPU holds a reference to the MMU, so
the MMU can't be recycled, i.e. pending_unmap is guaranteed to be relevant to the
vCPU's usage of the MMU.  More thoughts below in check_nested_vcpu_requests().

Somewhat of a side topic, the handling of xfer_to_guest_mode_handle_work() in
kvm_arch_vcpu_ioctl_run() is confusing.  If a signal is pending (ret == -eintr),
KVM will skip checking requests, e.g. not unmap the nested stage-2, but then continue
with other pieces of the entry flow.  I'm guessing that's intentional as it looks
like KVM needs to flush hardware state in case that info is exposed to userspace.

But it really warps the mind when trying to reason about what will happen if
there is a relevant request, e.g. if the below code needs to load a new stage-2
for a new vmid, it could do so with the to-be-unmapped contents.

		/*
		 * check conditions before entering the guest
		 */
		ret = xfer_to_guest_mode_handle_work(vcpu);
		if (!ret)
			ret = 1;

		if (ret > 0)
			ret = check_vcpu_requests(vcpu);

		/*
		 * preparing the interrupts to be injected also
		 * involves poking the gic, which must be done in a
		 * non-preemptible context.
		 */
		preempt_disable();

		/*
		 * the vmid allocator only tracks active vmids per
		 * physical cpu, and therefore the vmid allocated may not be
		 * preserved on vmid roll-over if the task was preempted,
		 * making a thread's vmid inactive. so we need to call
		 * kvm_arm_vmid_update() in non-premptible context.
		 */
		if (kvm_arm_vmid_update(&vcpu_to_hw_mmu_unsafe(vcpu)->vmid) &&
		    has_vhe())
			__load_stage2(vcpu_to_hw_mmu_unsafe(vcpu),
				      &vcpu->kvm->arch);

> +
>  	kvm_get_s2_mmu(s2_mmu);
>  	return s2_mmu;
>  }
> @@ -1186,3 +1189,17 @@ int kvm_init_nv_sysregs(struct kvm *kvm)
>  
>  	return 0;
>  }
> +
> +void check_nested_vcpu_requests(struct kvm_vcpu *vcpu)
> +{
> +	if (kvm_check_request(kvm_req_nested_s2_unmap, vcpu)) {
> +		CLASS(vcpu_hw_mmu, mmu)(vcpu);
> +
> +		write_lock(&vcpu->kvm->mmu_lock);
> +		if (mmu->pending_unmap) {

I doubt it's a functional issue, but if multiple vCPUs are purging the same nested
stage-2 MMU, and the first vCPU yields while unmapping, then KVM will end up with
multiple vCPUs doing the unmap.  And irrespective of whether or not the first vCPU
(or vCPUs) yields, they'll all contend for mmu_lock, which is especially problematic
since pending writers will block future readers.

Would it instead make sense to do something like the below?  Checking for the
need to unmap outside of mmu_lock would also more or less eliminate the need for
a request.

#define KVM_S2_UNMAP_PENDING		BIT(0)
#define KVM_S2_UNMAP_IN_PROGRESS	BIT(1)

	unsigned long ops = atomic_read(mmu->unmap_operations);
	unsigned long new;

	if (!(ops & KVM_S2_UNMAP_PENDING))
		return;

	if (!(ops & KVM_S2_UNMAP_IN_PROGRESS)) {
		ops = cmpxchg(mmu->unmap_operations, ops,
			      ops | KVM_S2_UNMAP_IN_PROGRESS);

	/* Nothing more to do if the unmap already completed. */
	if (!(ops & KVM_S2_UNMAP_PENDING))
		return;

	/* If a different vCPU started the unmap, simply wait for it to finish. */
	if (ops & KVM_S2_UNMAP_IN_PROGRESS) {
		while (atomic_read(&mmu->unmap_operations))
			cond_resched(); /* enter a waitq? */
		return;
	}

	write_lock(&vcpu->kvm->mmu_lock);
	kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu), true);
	write_unlock(&vcpu->kvm->mmu_lock);

	atomic_set(&mmu->unmap_operations, 0);

> +			kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu), true);
> +			mmu->pending_unmap = false;
> +		}
> +		write_unlock(&vcpu->kvm->mmu_lock);
> +	}
> +}
> -- 
> 2.46.1.824.gd892dcdcdd-goog

next prev parent reply	other threads:[~2024-10-01 19:05 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-01  0:17 [PATCH 0/3] KVM: arm64: nv: Fixes for stage-2 MMU recycling Oliver Upton
2024-10-01  0:17 ` [PATCH 1/3] KVM: arm64: Treat stage-2 MMUs as refcounted generally Oliver Upton
2024-10-01  0:17 ` [PATCH 2/3] KVM: arm64: nv: Do not block when unmapping stage-2 if disallowed Oliver Upton
2024-10-01  0:17 ` [PATCH 3/3] KVM: arm64: nv: Punt stage-2 recycling to a vCPU request Oliver Upton
2024-10-01 19:05   ` Sean Christopherson [this message]
2024-10-01 20:41     ` Oliver Upton
2024-10-01 23:28       ` Sean Christopherson
2024-10-01 23:49         ` Marc Zyngier
2024-10-02  0:06           ` Oliver Upton
2024-10-02  0:23             ` Sean Christopherson
2024-10-02 23:31               ` Marc Zyngier
2024-10-03  0:04                 ` Oliver Upton
2024-10-03  0:12                   ` Oliver Upton
2024-10-03 16:45                     ` Sean Christopherson
2024-10-03 17:52                       ` Oliver Upton
2024-10-03 18:23                         ` Sean Christopherson
2024-10-03 22:03                           ` Oliver Upton
2024-10-01 23:23   ` Marc Zyngier
2024-10-02  0:06     ` Oliver Upton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZvxH3el9SNuNWwi8@google.com \
    --to=seanjc@google.com \
    --cc=joey.gouly@arm.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=maz@kernel.org \
    --cc=oliver.upton@linux.dev \
    --cc=suzuki.poulose@arm.com \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.