Re: [PATCH v2 1/4] KVM: arm64: nv: Keep reference on stage-2 MMU when scheduled out

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Oliver Upton <oliver.upton@linux.dev>
To: Sean Christopherson <seanjc@google.com>
Cc: kvmarm@lists.linux.dev, Marc Zyngier <maz@kernel.org>,
	Joey Gouly <joey.gouly@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>
Subject: Re: [PATCH v2 1/4] KVM: arm64: nv: Keep reference on stage-2 MMU when scheduled out
Date: Mon, 7 Oct 2024 14:22:48 -0700	[thread overview]
Message-ID: <ZwRRKM5dMZ22KYKf@linux.dev> (raw)
In-Reply-To: <ZwQ8EMcZYtUb6wk_@google.com>

On Mon, Oct 07, 2024 at 12:52:48PM -0700, Sean Christopherson wrote:
> On Mon, Oct 07, 2024, Oliver Upton wrote:
> > On Mon, Oct 07, 2024 at 11:39:37AM -0700, Sean Christopherson wrote:
> > > > @@ -674,10 +681,18 @@ void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu)
> > > >  
> > > >  void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu)
> > > >  {
> > > > -	if (kvm_is_nested_s2_mmu(vcpu->kvm, vcpu->arch.hw_mmu)) {
> > > > +	/*
> > > > +	 * Keep a reference on the associated stage-2 MMU if the vCPU is
> > > > +	 * scheduling out and not in WFI emulation, suggesting it is likely to
> > > > +	 * reuse the MMU sometime soon.
> > > > +	 */
> > > > +	if (vcpu->scheduled_out && !vcpu_get_flag(vcpu, IN_WFI))
> > > > +		return;
> > > 
> > > Assuming KVM arm64 supports halt-polling, I think it makes more sense to check
> > > kvm_vcpu_is_blocking() instead of IN_WFI.  That way, KVM will keep the MMU resident
> > > if the vCPU happens to be preempted and halt-polling is successful.
> > > 
> > > And somewhat of a side topic, calling kvm_vgic_put() from kvm_arch_vcpu_blocking()
> > > instead of kvm_vcpu_wfi() would provide the same optimization for the GIC, as KVM
> > > would only need to put/reload the vGIC if the vCPU is actually scheduled out.
> > 
> > That wouldn't be an optimization, it'd render the halt polling loop
> > useless.

<snip>

> Hmm, but what happens if the wakeup event arrives before IN_WFI is set?  Ah, IIUC,
> GICR_VPENDBASER_PendingLast tracks if there's a pending event, KVM makes sure to
> read PendingLast after making vPE non-resident, and hardware is required to ensure
> either PendingLast=1 or a doorbell is signaled.
> 
> Niave question time: why not query GICR_VPENDBASER_PendingLast directly in
> kvm_vgic_vcpu_pending_irq() instead of putting the vGIC?

The GIC is only required to update GICR_VPENDBASER.PendingLast when
software deschedules the loaded vPE, meaning GICR_VPENDBASER.Valid goes
from 1 => 0. It is not updated for virtual interrupts that arrive after
the vPE has been descheduled.

So a polling loop that tests PendingLast would need to repeatedly mark
the vPE as scheduled/descheduled to get the GIC to process the LPI
pending table. You'd at minimum need a dsb(sy) to ensure the
redistributor has seen the new configuration *and* poll it to make sure
the LPI pending state has been completely serialized/deserialized into
memory:

	while !GICR_VPENDBASER.PendingLast:
		GICR_VPENDBASER.Valid = 1
		dsb(sy)
		while GICR_VPENDBASER.Dirty:
			cpu_relax()
		GICR_VPENDBASER.Valid = 0
		while GICR_VPENDBASER.Dirty:
			cpu_relax()

Conceptually, the redistributor is a 'far' piece of hardware, likely
running at a different clock from your CPU. So it is safe to assume that
interactions with it are pretty slow, and this overall scheme would be
quite expensive.

> E.g. if safely querying
> PendingLast is "heavy", wouldn't it still make sense to do a slow check of
> PendingLast in kvm_vgic_vcpu_pending_irq() if IN_WFI=1?  Or is putting the vGIC
> as lightweight as things get when it comes to checking PendingLast?

I believe I've answered this above, since the status bit only makes
sense when hardware has done a state transition.

> > On top of that, we need to activate the doorbell IRQ for GICv4 to give
> > KVM a kick when something arrives for the vPE.
> 
> Right, but KVM needs to "manually" detect pending events to account for events
> that arrive before the doorbell is activated, so from a functional perspective,
> it shouldn't matter when KVM activates the doorbell (with respect to halt-polling).

Like I said earlier:

<paste>

> > The most recent view of the guest's CPU interface is sitting in
> > hardware at this point, so we need to synchronize KVM's view of
> > the CPUIF with it to determine if a pending interrupt exceeds the
> > priority mask.

</paste>

The guest's enable bits and priority mask are in ICH_VMCR_EL2, and we
need to read that out of the CPU in order for this loop to work:

int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
{
	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
	struct vgic_irq *irq;
	bool pending = false;
	unsigned long flags;
	struct vgic_vmcr vmcr;

	if (!vcpu->kvm->arch.vgic.enabled)
		return false;

	if (vcpu->arch.vgic_cpu.vgic_v3.its_vpe.pending_last)
		return true;

	vgic_get_vmcr(vcpu, &vmcr);

	raw_spin_lock_irqsave(&vgic_cpu->ap_list_lock, flags);

	list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
		raw_spin_lock(&irq->irq_lock);
		pending = irq_is_pending(irq) && irq->enabled &&
			  !irq->active &&
			  irq->priority < vmcr.pmr;		<====== Need latest PMR
		raw_spin_unlock(&irq->irq_lock);

		if (pending)
			break;
	}

	raw_spin_unlock_irqrestore(&vgic_cpu->ap_list_lock, flags);

	return pending;
}

KVM's in-memory view of VMCR is updated in vgic_put(). So even w/o GICv4
direct injection, the seemingly premature GIC save/restore is essential
for halt polling to actually do something useful.

-- 
Thanks,
Oliver

next prev parent reply	other threads:[~2024-10-07 21:22 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-07 16:42 [PATCH v2 0/4] KVM: arm64: nv: Fixes for stage-2 MMU recycling Oliver Upton
2024-10-07 16:42 ` [PATCH v2 1/4] KVM: arm64: nv: Keep reference on stage-2 MMU when scheduled out Oliver Upton
2024-10-07 18:39   ` Sean Christopherson
2024-10-07 19:01     ` Oliver Upton
2024-10-07 19:52       ` Sean Christopherson
2024-10-07 21:22         ` Oliver Upton [this message]
2024-10-07 22:18           ` Sean Christopherson
2024-10-07 16:42 ` [PATCH v2 2/4] KVM: arm64: nv: Do not block when unmapping stage-2 if disallowed Oliver Upton
2024-10-07 16:42 ` [PATCH v2 3/4] KVM: arm64: nv: Punt stage-2 recycling to a vCPU request Oliver Upton
2024-10-07 19:06   ` Marc Zyngier
2024-10-07 19:16     ` Oliver Upton
2024-10-07 16:42 ` [PATCH v2 4/4] KVM: arm64: nv: Clarify safety of allowing TLBI unmaps to reschedule Oliver Upton
2024-10-07 18:51   ` Sean Christopherson
2024-10-07 19:20     ` Oliver Upton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZwRRKM5dMZ22KYKf@linux.dev \
    --to=oliver.upton@linux.dev \
    --cc=joey.gouly@arm.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=maz@kernel.org \
    --cc=seanjc@google.com \
    --cc=suzuki.poulose@arm.com \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.