From: Marc Zyngier <maz@kernel.org>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Huacai Chen <chenhuacai@kernel.org>,
Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>,
Paul Mackerras <paulus@ozlabs.org>,
Christian Borntraeger <borntraeger@de.ibm.com>,
Janosch Frank <frankja@linux.ibm.com>,
James Morse <james.morse@arm.com>,
Alexandru Elisei <alexandru.elisei@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
David Hildenbrand <david@redhat.com>,
Cornelia Huck <cohuck@redhat.com>,
Claudio Imbrenda <imbrenda@linux.ibm.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
linux-arm-kernel@lists.infradead.org,
kvmarm@lists.cs.columbia.edu, linux-mips@vger.kernel.org,
kvm@vger.kernel.org, kvm-ppc@vger.kernel.org,
linux-kernel@vger.kernel.org, David Matlack <dmatlack@google.com>,
Jing Zhang <jingzhangos@google.com>
Subject: Re: [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful
Date: Thu, 30 Sep 2021 09:36:19 +0000 [thread overview]
Message-ID: <87czoqtnfw.wl-maz@kernel.org> (raw)
In-Reply-To: <YVNA+KTbLrxGQ6ML@google.com>
On Tue, 28 Sep 2021 17:21:12 +0100,
Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Sep 28, 2021, Marc Zyngier wrote:
> > On Mon, 27 Sep 2021 18:28:14 +0100,
> > Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Sun, Sep 26, 2021, Marc Zyngier wrote:
> > > > On Sun, 26 Sep 2021 07:27:28 +0100,
> > > > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > > >
> > > > > On 25/09/21 11:50, Marc Zyngier wrote:
> > > > > >> there is no need for arm64 to put/load
> > > > > >> the vGIC as KVM hasn't relinquished control of the vCPU in any way.
> > > > > >
> > > > > > This doesn't mean that there is no requirement for any state
> > > > > > change. The put/load on GICv4 is crucial for performance, and the VMCR
> > > > > > resync is a correctness requirement.
> > >
> > > Ah crud, I didn't blame that code beforehand, I simply assumed
> > > kvm_arch_vcpu_blocking() was purely for the blocking/schedule()
> > > sequence. The comment in arm64's kvm_arch_vcpu_blocking() about
> > > kvm_arch_vcpu_runnable() makes more sense now too.
> > >
> > > > > I wouldn't even say it's crucial for performance: halt polling cannot
> > > > > work and is a waste of time without (the current implementation of)
> > > > > put/load.
> > > >
> > > > Not quite. A non-V{LPI,SGI} could still be used as the a wake-up from
> > > > WFI (which is the only reason we end-up on this path). Only LPIs (and
> > > > SGIs on GICv4.1) can be directly injected, meaning that SPIs and PPIs
> > > > still follow the standard SW injection model.
> > > >
> > > > However, there is still the ICH_VMCR_EL2 requirement (to get the
> > > > up-to-date priority mask and group enable bits) for SW-injected
> > > > interrupt wake-up to work correctly, and I really don't want to save
> > > > that one eagerly on each shallow exit.
> > >
> > > IIUC, VMCR is resident in hardware while the guest is running, and
> > > KVM needs to retrieve the VMCR when processing interrupts to
> > > determine if a interrupt is above the priority threshold. If that's
> > > the case, then IMO handling the VMCR via an arch hook is
> > > unnecessarily fragile, e.g. any generic call that leads to
> > > kvm_arch_vcpu_runnable() needs to know that arm64 lazily retrieves a
> > > guest register.
> >
> > Not quite. We only need to retrieve the VMCR if we are in a situation
> > where we need to trigger a wake-up from WFI at the point where we have
> > not done a vcpu_put() yet. All the other cases where the interrupt is
> > injected are managed by the HW. And the only case where
> > kvm_arch_vcpu_runnable() gets called is when blocking.
> >
> > I also don't get why a hook would be fragile, as long as it has well
> > defined semantics.
>
> Generic KVM should not have to know that a seemingly benign arch hook,
> kvm_arch_vcpu_runnable(), cannot be safely called without first calling another
> arch hook. E.g. I suspect there's a (benign?) race in kvm_vcpu_on_spin(). If
> the loop is delayed between checking rcuwait_active() and vcpu_dy_runnable(),
> and the target vCPU is awakened during that period, KVM can call
> kvm_arch_vcpu_runnable() while the vCPU is running.
Humph. Indeed, there is a potential gold-plated turd there.
> It's kind of a counter-example to my below suggestion as putting the vGIC would
> indeed lead to state corruption if the vCPU is running, but I would argue that
> arm64 should override kvm_arch_dy_runnable() so that its correctness is guaranteed,
> e.g. by not calling kvm_arch_vcpu_runnable() if the vCPU is already running.
I'll work something out for that case.
> > > A better approach for VMCR would be to retrieve the value from
> > > hardware on-demand, e.g. via a hook in vgic_get_vmcr(), so that it's all but
> > > impossible to have bugs where KVM is working with a stale VMCR, e.g.
> > >
> > > diff --git a/arch/arm64/kvm/vgic/vgic-mmio.c b/arch/arm64/kvm/vgic/vgic-mmio.c
> > > index 48c6067fc5ec..0784de0c4080 100644
> > > --- a/arch/arm64/kvm/vgic/vgic-mmio.c
> > > +++ b/arch/arm64/kvm/vgic/vgic-mmio.c
> > > @@ -828,6 +828,13 @@ void vgic_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
> > >
> > > void vgic_get_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
> > > {
> > > + if (!vcpu->...->vmcr_available) {
> > > + preempt_disable();
> > > + kvm_vgic_vmcr_sync(vcpu);
> > > + preempt_enable();
> > > + vcpu->...->vmcr_available = true;
> > > + }
> > > +
> >
> > But most of the uses of vgic_get_vmcr() are in contexts where the vcpu
> > isn't running at all (such as save/restore). It really only operates
> > on the shadow state, and what you have above will only lead to state
> > corruption.
>
> Ignoring the kvm_arch_dy_runnable() case for the moment, how would
> it lead to corruption? The idea is that the 'vmcr_available' flag
> would be cleared when the vCPU is run, i.e. it tracks whether or not
> the shadow state may be stale.
I guess that 'vmcr_available' would have to be initialised to 'true'
at vcpu reset time so that the userspace side cannot trigger a read
from the HW.
> > > if (kvm_vgic_global_state.type = VGIC_V2)
> > > vgic_v2_get_vmcr(vcpu, vmcr);
> > > else
> > >
> > >
> > > Regarding vGIC v4, does KVM require it to be resident in hardware
> > > while the vCPU is loaded?
> >
> > It is a requirement. Otherwise, we end-up with an inconsistent state
> > between the delivery of doorbells and the state of the vgic.
>
> For my own understanding, does KVM require it to be resident in
> hardware while the vCPU is loaded but _not_ running? What I don't
> fully understand is how KVM can safely load/put the vCPU if that
> true, i.e. wouldn't there always be a window for badness?
No, that part is fine. It is when you start running the vcpu without
the GICv4 context loaded that ugly stuff happens (get a doorbell that
tells you to schedule the currently running vcpu, for example).
>
> > Also, reloading the GICv4 state can be pretty expensive (multiple MMIO
> > accesses), which is why we really don't want to do that on the hot path
> > (kvm_arch_vcpu_ioctl_run() *is* a hot path).
>
> I wasn't suggesting to reload GICv4 on every entry, it would only be reloaded
> if it was put between vcpu_load() and entry to the guest.
>
> > > If not, then we could do something like
> > > this, which would eliminate the arch hooks entirely if the VMCR is
> > > handled as above.
>
> ...
>
> > > @@ -813,6 +787,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > > */
> > > preempt_disable();
> > >
> > > + /*
> > > + * Reload vGIC v4 if necessary, as it may be put on-demand so
> > > + * that KVM can detect directly injected interrupts, e.g. when
> > > + * determining if the vCPU is runnable due to a pending event.
> > > + */
> > > + vgic_v4_load(vcpu);
> >
> > You'd need to detect that a previous put has been done.
>
> Not that it will likely matter, but doesn't the its_vpe.resident
> check automatically handle this?
Sort of. I eventually want to get rid of this as it papers over all
sort of sins. I introduced it exactly because of the nesting that
vcpu_block triggers, but this is a bit of a layering violation between
KVM and the underlying GICv4 driver.
>
> > But overall, it puts the complexity at the wrong place. WFI (aka
> > kvm_vcpu_block) is the place where we want to handle this synchronisation,
> > and not the run loop.
> >
> > Instead of having a well defined interface with the blocking code
> > where we implement the required synchronisation, you spray the vgic
> > crap all over, and it becomes much harder to reason about it. Guess
> > what, I'm not keen on it.
>
> My objection to the arch hooks is that, from generic KVM's
> perspective, the direct dependency is not on blocking, it's on
> calling kvm_arch_vcpu_runnable(). That's why I suggested handling
> this by tracking whether or not the VMCR is up-to-date/stale, as it
> allows generic KVM to safely call kvm_arch_vcpu_runnable() whenever
> the vCPU is loaded.
>
> I don't have a strong opinion on arm64 preferring the sync to be
> specific to WFI, but if that's the case then IMO this should be
> handled fully in arm64, e.g. a patch like so (or with a wrapper
> around the call to kvm_vcpu_block() if we want to guard against
> future calls into generic KVM)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index fe102cd2e518..312f3acd3ca3 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -367,27 +367,12 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
>
> void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
> {
> - /*
> - * If we're about to block (most likely because we've just hit a
> - * WFI), we need to sync back the state of the GIC CPU interface
> - * so that we have the latest PMR and group enables. This ensures
> - * that kvm_arch_vcpu_runnable has up-to-date data to decide
> - * whether we have pending interrupts.
> - *
> - * For the same reason, we want to tell GICv4 that we need
> - * doorbells to be signalled, should an interrupt become pending.
> - */
> - preempt_disable();
> - kvm_vgic_vmcr_sync(vcpu);
> - vgic_v4_put(vcpu, true);
> - preempt_enable();
> +
> }
>
> void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
> {
> - preempt_disable();
> - vgic_v4_load(vcpu);
> - preempt_enable();
> +
> }
>
> void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 275a27368a04..9870e824a27e 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -95,8 +95,28 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu)
> } else {
> trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
> vcpu->stat.wfi_exit_stat++;
> +
> + /*
> + * Sync back the state of the GIC CPU interface so that we have
> + * the latest PMR and group enables. This ensures that
> + * kvm_arch_vcpu_runnable has up-to-date data to decide whether
> + * we have pending interrupts, e.g. when determining if the
> + * vCPU should block.
> + *
> + * For the same reason, we want to tell GICv4 that we need
> + * doorbells to be signalled, should an interrupt become pending.
> + */
> + preempt_disable();
> + kvm_vgic_vmcr_sync(vcpu);
> + vgic_v4_put(vcpu, true);
> + preempt_enable();
> +
> kvm_vcpu_block(vcpu);
> kvm_clear_request(KVM_REQ_UNHALT, vcpu);
> +
> + preempt_disable();
> + vgic_v4_load(vcpu);
> + preempt_enable();
> }
>
> kvm_incr_pc(vcpu);
I actually largely prefer this approach, which is massively more
readable than the current setup. Feel free to wrap that in your
series.
I'll also have a look at the vcpu_dy_runnable() asap.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
WARNING: multiple messages have this Message-ID (diff)
From: Marc Zyngier <maz@kernel.org>
To: Sean Christopherson <seanjc@google.com>
Cc: Wanpeng Li <wanpengli@tencent.com>,
kvm@vger.kernel.org, David Hildenbrand <david@redhat.com>,
linux-kernel@vger.kernel.org, Paul Mackerras <paulus@ozlabs.org>,
Claudio Imbrenda <imbrenda@linux.ibm.com>,
kvmarm@lists.cs.columbia.edu,
Janosch Frank <frankja@linux.ibm.com>,
Joerg Roedel <joro@8bytes.org>,
Huacai Chen <chenhuacai@kernel.org>,
Christian Borntraeger <borntraeger@de.ibm.com>,
Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>,
kvm-ppc@vger.kernel.org, David Matlack <dmatlack@google.com>,
linux-arm-kernel@lists.infradead.org,
Jim Mattson <jmattson@google.com>,
Cornelia Huck <cohuck@redhat.com>,
linux-mips@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>
Subject: Re: [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful
Date: Thu, 30 Sep 2021 10:36:19 +0100 [thread overview]
Message-ID: <87czoqtnfw.wl-maz@kernel.org> (raw)
In-Reply-To: <YVNA+KTbLrxGQ6ML@google.com>
On Tue, 28 Sep 2021 17:21:12 +0100,
Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Sep 28, 2021, Marc Zyngier wrote:
> > On Mon, 27 Sep 2021 18:28:14 +0100,
> > Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Sun, Sep 26, 2021, Marc Zyngier wrote:
> > > > On Sun, 26 Sep 2021 07:27:28 +0100,
> > > > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > > >
> > > > > On 25/09/21 11:50, Marc Zyngier wrote:
> > > > > >> there is no need for arm64 to put/load
> > > > > >> the vGIC as KVM hasn't relinquished control of the vCPU in any way.
> > > > > >
> > > > > > This doesn't mean that there is no requirement for any state
> > > > > > change. The put/load on GICv4 is crucial for performance, and the VMCR
> > > > > > resync is a correctness requirement.
> > >
> > > Ah crud, I didn't blame that code beforehand, I simply assumed
> > > kvm_arch_vcpu_blocking() was purely for the blocking/schedule()
> > > sequence. The comment in arm64's kvm_arch_vcpu_blocking() about
> > > kvm_arch_vcpu_runnable() makes more sense now too.
> > >
> > > > > I wouldn't even say it's crucial for performance: halt polling cannot
> > > > > work and is a waste of time without (the current implementation of)
> > > > > put/load.
> > > >
> > > > Not quite. A non-V{LPI,SGI} could still be used as the a wake-up from
> > > > WFI (which is the only reason we end-up on this path). Only LPIs (and
> > > > SGIs on GICv4.1) can be directly injected, meaning that SPIs and PPIs
> > > > still follow the standard SW injection model.
> > > >
> > > > However, there is still the ICH_VMCR_EL2 requirement (to get the
> > > > up-to-date priority mask and group enable bits) for SW-injected
> > > > interrupt wake-up to work correctly, and I really don't want to save
> > > > that one eagerly on each shallow exit.
> > >
> > > IIUC, VMCR is resident in hardware while the guest is running, and
> > > KVM needs to retrieve the VMCR when processing interrupts to
> > > determine if a interrupt is above the priority threshold. If that's
> > > the case, then IMO handling the VMCR via an arch hook is
> > > unnecessarily fragile, e.g. any generic call that leads to
> > > kvm_arch_vcpu_runnable() needs to know that arm64 lazily retrieves a
> > > guest register.
> >
> > Not quite. We only need to retrieve the VMCR if we are in a situation
> > where we need to trigger a wake-up from WFI at the point where we have
> > not done a vcpu_put() yet. All the other cases where the interrupt is
> > injected are managed by the HW. And the only case where
> > kvm_arch_vcpu_runnable() gets called is when blocking.
> >
> > I also don't get why a hook would be fragile, as long as it has well
> > defined semantics.
>
> Generic KVM should not have to know that a seemingly benign arch hook,
> kvm_arch_vcpu_runnable(), cannot be safely called without first calling another
> arch hook. E.g. I suspect there's a (benign?) race in kvm_vcpu_on_spin(). If
> the loop is delayed between checking rcuwait_active() and vcpu_dy_runnable(),
> and the target vCPU is awakened during that period, KVM can call
> kvm_arch_vcpu_runnable() while the vCPU is running.
Humph. Indeed, there is a potential gold-plated turd there.
> It's kind of a counter-example to my below suggestion as putting the vGIC would
> indeed lead to state corruption if the vCPU is running, but I would argue that
> arm64 should override kvm_arch_dy_runnable() so that its correctness is guaranteed,
> e.g. by not calling kvm_arch_vcpu_runnable() if the vCPU is already running.
I'll work something out for that case.
> > > A better approach for VMCR would be to retrieve the value from
> > > hardware on-demand, e.g. via a hook in vgic_get_vmcr(), so that it's all but
> > > impossible to have bugs where KVM is working with a stale VMCR, e.g.
> > >
> > > diff --git a/arch/arm64/kvm/vgic/vgic-mmio.c b/arch/arm64/kvm/vgic/vgic-mmio.c
> > > index 48c6067fc5ec..0784de0c4080 100644
> > > --- a/arch/arm64/kvm/vgic/vgic-mmio.c
> > > +++ b/arch/arm64/kvm/vgic/vgic-mmio.c
> > > @@ -828,6 +828,13 @@ void vgic_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
> > >
> > > void vgic_get_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
> > > {
> > > + if (!vcpu->...->vmcr_available) {
> > > + preempt_disable();
> > > + kvm_vgic_vmcr_sync(vcpu);
> > > + preempt_enable();
> > > + vcpu->...->vmcr_available = true;
> > > + }
> > > +
> >
> > But most of the uses of vgic_get_vmcr() are in contexts where the vcpu
> > isn't running at all (such as save/restore). It really only operates
> > on the shadow state, and what you have above will only lead to state
> > corruption.
>
> Ignoring the kvm_arch_dy_runnable() case for the moment, how would
> it lead to corruption? The idea is that the 'vmcr_available' flag
> would be cleared when the vCPU is run, i.e. it tracks whether or not
> the shadow state may be stale.
I guess that 'vmcr_available' would have to be initialised to 'true'
at vcpu reset time so that the userspace side cannot trigger a read
from the HW.
> > > if (kvm_vgic_global_state.type == VGIC_V2)
> > > vgic_v2_get_vmcr(vcpu, vmcr);
> > > else
> > >
> > >
> > > Regarding vGIC v4, does KVM require it to be resident in hardware
> > > while the vCPU is loaded?
> >
> > It is a requirement. Otherwise, we end-up with an inconsistent state
> > between the delivery of doorbells and the state of the vgic.
>
> For my own understanding, does KVM require it to be resident in
> hardware while the vCPU is loaded but _not_ running? What I don't
> fully understand is how KVM can safely load/put the vCPU if that
> true, i.e. wouldn't there always be a window for badness?
No, that part is fine. It is when you start running the vcpu without
the GICv4 context loaded that ugly stuff happens (get a doorbell that
tells you to schedule the currently running vcpu, for example).
>
> > Also, reloading the GICv4 state can be pretty expensive (multiple MMIO
> > accesses), which is why we really don't want to do that on the hot path
> > (kvm_arch_vcpu_ioctl_run() *is* a hot path).
>
> I wasn't suggesting to reload GICv4 on every entry, it would only be reloaded
> if it was put between vcpu_load() and entry to the guest.
>
> > > If not, then we could do something like
> > > this, which would eliminate the arch hooks entirely if the VMCR is
> > > handled as above.
>
> ...
>
> > > @@ -813,6 +787,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > > */
> > > preempt_disable();
> > >
> > > + /*
> > > + * Reload vGIC v4 if necessary, as it may be put on-demand so
> > > + * that KVM can detect directly injected interrupts, e.g. when
> > > + * determining if the vCPU is runnable due to a pending event.
> > > + */
> > > + vgic_v4_load(vcpu);
> >
> > You'd need to detect that a previous put has been done.
>
> Not that it will likely matter, but doesn't the its_vpe.resident
> check automatically handle this?
Sort of. I eventually want to get rid of this as it papers over all
sort of sins. I introduced it exactly because of the nesting that
vcpu_block triggers, but this is a bit of a layering violation between
KVM and the underlying GICv4 driver.
>
> > But overall, it puts the complexity at the wrong place. WFI (aka
> > kvm_vcpu_block) is the place where we want to handle this synchronisation,
> > and not the run loop.
> >
> > Instead of having a well defined interface with the blocking code
> > where we implement the required synchronisation, you spray the vgic
> > crap all over, and it becomes much harder to reason about it. Guess
> > what, I'm not keen on it.
>
> My objection to the arch hooks is that, from generic KVM's
> perspective, the direct dependency is not on blocking, it's on
> calling kvm_arch_vcpu_runnable(). That's why I suggested handling
> this by tracking whether or not the VMCR is up-to-date/stale, as it
> allows generic KVM to safely call kvm_arch_vcpu_runnable() whenever
> the vCPU is loaded.
>
> I don't have a strong opinion on arm64 preferring the sync to be
> specific to WFI, but if that's the case then IMO this should be
> handled fully in arm64, e.g. a patch like so (or with a wrapper
> around the call to kvm_vcpu_block() if we want to guard against
> future calls into generic KVM)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index fe102cd2e518..312f3acd3ca3 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -367,27 +367,12 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
>
> void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
> {
> - /*
> - * If we're about to block (most likely because we've just hit a
> - * WFI), we need to sync back the state of the GIC CPU interface
> - * so that we have the latest PMR and group enables. This ensures
> - * that kvm_arch_vcpu_runnable has up-to-date data to decide
> - * whether we have pending interrupts.
> - *
> - * For the same reason, we want to tell GICv4 that we need
> - * doorbells to be signalled, should an interrupt become pending.
> - */
> - preempt_disable();
> - kvm_vgic_vmcr_sync(vcpu);
> - vgic_v4_put(vcpu, true);
> - preempt_enable();
> +
> }
>
> void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
> {
> - preempt_disable();
> - vgic_v4_load(vcpu);
> - preempt_enable();
> +
> }
>
> void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 275a27368a04..9870e824a27e 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -95,8 +95,28 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu)
> } else {
> trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
> vcpu->stat.wfi_exit_stat++;
> +
> + /*
> + * Sync back the state of the GIC CPU interface so that we have
> + * the latest PMR and group enables. This ensures that
> + * kvm_arch_vcpu_runnable has up-to-date data to decide whether
> + * we have pending interrupts, e.g. when determining if the
> + * vCPU should block.
> + *
> + * For the same reason, we want to tell GICv4 that we need
> + * doorbells to be signalled, should an interrupt become pending.
> + */
> + preempt_disable();
> + kvm_vgic_vmcr_sync(vcpu);
> + vgic_v4_put(vcpu, true);
> + preempt_enable();
> +
> kvm_vcpu_block(vcpu);
> kvm_clear_request(KVM_REQ_UNHALT, vcpu);
> +
> + preempt_disable();
> + vgic_v4_load(vcpu);
> + preempt_enable();
> }
>
> kvm_incr_pc(vcpu);
I actually largely prefer this approach, which is massively more
readable than the current setup. Feel free to wrap that in your
series.
I'll also have a look at the vcpu_dy_runnable() asap.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
WARNING: multiple messages have this Message-ID (diff)
From: Marc Zyngier <maz@kernel.org>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Huacai Chen <chenhuacai@kernel.org>,
Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>,
Paul Mackerras <paulus@ozlabs.org>,
Christian Borntraeger <borntraeger@de.ibm.com>,
Janosch Frank <frankja@linux.ibm.com>,
James Morse <james.morse@arm.com>,
Alexandru Elisei <alexandru.elisei@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
David Hildenbrand <david@redhat.com>,
Cornelia Huck <cohuck@redhat.com>,
Claudio Imbrenda <imbrenda@linux.ibm.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
linux-arm-kernel@lists.infradead.org,
kvmarm@lists.cs.columbia.edu, linux-mips@vger.kernel.org,
kvm@vger.kernel.org, kvm-ppc@vger.kernel.org,
linux-kernel@vger.kernel.org, David Matlack <dmatlack@google.com>,
Jing Zhang <jingzhangos@google.com>
Subject: Re: [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful
Date: Thu, 30 Sep 2021 10:36:19 +0100 [thread overview]
Message-ID: <87czoqtnfw.wl-maz@kernel.org> (raw)
In-Reply-To: <YVNA+KTbLrxGQ6ML@google.com>
On Tue, 28 Sep 2021 17:21:12 +0100,
Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Sep 28, 2021, Marc Zyngier wrote:
> > On Mon, 27 Sep 2021 18:28:14 +0100,
> > Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Sun, Sep 26, 2021, Marc Zyngier wrote:
> > > > On Sun, 26 Sep 2021 07:27:28 +0100,
> > > > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > > >
> > > > > On 25/09/21 11:50, Marc Zyngier wrote:
> > > > > >> there is no need for arm64 to put/load
> > > > > >> the vGIC as KVM hasn't relinquished control of the vCPU in any way.
> > > > > >
> > > > > > This doesn't mean that there is no requirement for any state
> > > > > > change. The put/load on GICv4 is crucial for performance, and the VMCR
> > > > > > resync is a correctness requirement.
> > >
> > > Ah crud, I didn't blame that code beforehand, I simply assumed
> > > kvm_arch_vcpu_blocking() was purely for the blocking/schedule()
> > > sequence. The comment in arm64's kvm_arch_vcpu_blocking() about
> > > kvm_arch_vcpu_runnable() makes more sense now too.
> > >
> > > > > I wouldn't even say it's crucial for performance: halt polling cannot
> > > > > work and is a waste of time without (the current implementation of)
> > > > > put/load.
> > > >
> > > > Not quite. A non-V{LPI,SGI} could still be used as the a wake-up from
> > > > WFI (which is the only reason we end-up on this path). Only LPIs (and
> > > > SGIs on GICv4.1) can be directly injected, meaning that SPIs and PPIs
> > > > still follow the standard SW injection model.
> > > >
> > > > However, there is still the ICH_VMCR_EL2 requirement (to get the
> > > > up-to-date priority mask and group enable bits) for SW-injected
> > > > interrupt wake-up to work correctly, and I really don't want to save
> > > > that one eagerly on each shallow exit.
> > >
> > > IIUC, VMCR is resident in hardware while the guest is running, and
> > > KVM needs to retrieve the VMCR when processing interrupts to
> > > determine if a interrupt is above the priority threshold. If that's
> > > the case, then IMO handling the VMCR via an arch hook is
> > > unnecessarily fragile, e.g. any generic call that leads to
> > > kvm_arch_vcpu_runnable() needs to know that arm64 lazily retrieves a
> > > guest register.
> >
> > Not quite. We only need to retrieve the VMCR if we are in a situation
> > where we need to trigger a wake-up from WFI at the point where we have
> > not done a vcpu_put() yet. All the other cases where the interrupt is
> > injected are managed by the HW. And the only case where
> > kvm_arch_vcpu_runnable() gets called is when blocking.
> >
> > I also don't get why a hook would be fragile, as long as it has well
> > defined semantics.
>
> Generic KVM should not have to know that a seemingly benign arch hook,
> kvm_arch_vcpu_runnable(), cannot be safely called without first calling another
> arch hook. E.g. I suspect there's a (benign?) race in kvm_vcpu_on_spin(). If
> the loop is delayed between checking rcuwait_active() and vcpu_dy_runnable(),
> and the target vCPU is awakened during that period, KVM can call
> kvm_arch_vcpu_runnable() while the vCPU is running.
Humph. Indeed, there is a potential gold-plated turd there.
> It's kind of a counter-example to my below suggestion as putting the vGIC would
> indeed lead to state corruption if the vCPU is running, but I would argue that
> arm64 should override kvm_arch_dy_runnable() so that its correctness is guaranteed,
> e.g. by not calling kvm_arch_vcpu_runnable() if the vCPU is already running.
I'll work something out for that case.
> > > A better approach for VMCR would be to retrieve the value from
> > > hardware on-demand, e.g. via a hook in vgic_get_vmcr(), so that it's all but
> > > impossible to have bugs where KVM is working with a stale VMCR, e.g.
> > >
> > > diff --git a/arch/arm64/kvm/vgic/vgic-mmio.c b/arch/arm64/kvm/vgic/vgic-mmio.c
> > > index 48c6067fc5ec..0784de0c4080 100644
> > > --- a/arch/arm64/kvm/vgic/vgic-mmio.c
> > > +++ b/arch/arm64/kvm/vgic/vgic-mmio.c
> > > @@ -828,6 +828,13 @@ void vgic_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
> > >
> > > void vgic_get_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
> > > {
> > > + if (!vcpu->...->vmcr_available) {
> > > + preempt_disable();
> > > + kvm_vgic_vmcr_sync(vcpu);
> > > + preempt_enable();
> > > + vcpu->...->vmcr_available = true;
> > > + }
> > > +
> >
> > But most of the uses of vgic_get_vmcr() are in contexts where the vcpu
> > isn't running at all (such as save/restore). It really only operates
> > on the shadow state, and what you have above will only lead to state
> > corruption.
>
> Ignoring the kvm_arch_dy_runnable() case for the moment, how would
> it lead to corruption? The idea is that the 'vmcr_available' flag
> would be cleared when the vCPU is run, i.e. it tracks whether or not
> the shadow state may be stale.
I guess that 'vmcr_available' would have to be initialised to 'true'
at vcpu reset time so that the userspace side cannot trigger a read
from the HW.
> > > if (kvm_vgic_global_state.type == VGIC_V2)
> > > vgic_v2_get_vmcr(vcpu, vmcr);
> > > else
> > >
> > >
> > > Regarding vGIC v4, does KVM require it to be resident in hardware
> > > while the vCPU is loaded?
> >
> > It is a requirement. Otherwise, we end-up with an inconsistent state
> > between the delivery of doorbells and the state of the vgic.
>
> For my own understanding, does KVM require it to be resident in
> hardware while the vCPU is loaded but _not_ running? What I don't
> fully understand is how KVM can safely load/put the vCPU if that
> true, i.e. wouldn't there always be a window for badness?
No, that part is fine. It is when you start running the vcpu without
the GICv4 context loaded that ugly stuff happens (get a doorbell that
tells you to schedule the currently running vcpu, for example).
>
> > Also, reloading the GICv4 state can be pretty expensive (multiple MMIO
> > accesses), which is why we really don't want to do that on the hot path
> > (kvm_arch_vcpu_ioctl_run() *is* a hot path).
>
> I wasn't suggesting to reload GICv4 on every entry, it would only be reloaded
> if it was put between vcpu_load() and entry to the guest.
>
> > > If not, then we could do something like
> > > this, which would eliminate the arch hooks entirely if the VMCR is
> > > handled as above.
>
> ...
>
> > > @@ -813,6 +787,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > > */
> > > preempt_disable();
> > >
> > > + /*
> > > + * Reload vGIC v4 if necessary, as it may be put on-demand so
> > > + * that KVM can detect directly injected interrupts, e.g. when
> > > + * determining if the vCPU is runnable due to a pending event.
> > > + */
> > > + vgic_v4_load(vcpu);
> >
> > You'd need to detect that a previous put has been done.
>
> Not that it will likely matter, but doesn't the its_vpe.resident
> check automatically handle this?
Sort of. I eventually want to get rid of this as it papers over all
sort of sins. I introduced it exactly because of the nesting that
vcpu_block triggers, but this is a bit of a layering violation between
KVM and the underlying GICv4 driver.
>
> > But overall, it puts the complexity at the wrong place. WFI (aka
> > kvm_vcpu_block) is the place where we want to handle this synchronisation,
> > and not the run loop.
> >
> > Instead of having a well defined interface with the blocking code
> > where we implement the required synchronisation, you spray the vgic
> > crap all over, and it becomes much harder to reason about it. Guess
> > what, I'm not keen on it.
>
> My objection to the arch hooks is that, from generic KVM's
> perspective, the direct dependency is not on blocking, it's on
> calling kvm_arch_vcpu_runnable(). That's why I suggested handling
> this by tracking whether or not the VMCR is up-to-date/stale, as it
> allows generic KVM to safely call kvm_arch_vcpu_runnable() whenever
> the vCPU is loaded.
>
> I don't have a strong opinion on arm64 preferring the sync to be
> specific to WFI, but if that's the case then IMO this should be
> handled fully in arm64, e.g. a patch like so (or with a wrapper
> around the call to kvm_vcpu_block() if we want to guard against
> future calls into generic KVM)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index fe102cd2e518..312f3acd3ca3 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -367,27 +367,12 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
>
> void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
> {
> - /*
> - * If we're about to block (most likely because we've just hit a
> - * WFI), we need to sync back the state of the GIC CPU interface
> - * so that we have the latest PMR and group enables. This ensures
> - * that kvm_arch_vcpu_runnable has up-to-date data to decide
> - * whether we have pending interrupts.
> - *
> - * For the same reason, we want to tell GICv4 that we need
> - * doorbells to be signalled, should an interrupt become pending.
> - */
> - preempt_disable();
> - kvm_vgic_vmcr_sync(vcpu);
> - vgic_v4_put(vcpu, true);
> - preempt_enable();
> +
> }
>
> void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
> {
> - preempt_disable();
> - vgic_v4_load(vcpu);
> - preempt_enable();
> +
> }
>
> void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 275a27368a04..9870e824a27e 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -95,8 +95,28 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu)
> } else {
> trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
> vcpu->stat.wfi_exit_stat++;
> +
> + /*
> + * Sync back the state of the GIC CPU interface so that we have
> + * the latest PMR and group enables. This ensures that
> + * kvm_arch_vcpu_runnable has up-to-date data to decide whether
> + * we have pending interrupts, e.g. when determining if the
> + * vCPU should block.
> + *
> + * For the same reason, we want to tell GICv4 that we need
> + * doorbells to be signalled, should an interrupt become pending.
> + */
> + preempt_disable();
> + kvm_vgic_vmcr_sync(vcpu);
> + vgic_v4_put(vcpu, true);
> + preempt_enable();
> +
> kvm_vcpu_block(vcpu);
> kvm_clear_request(KVM_REQ_UNHALT, vcpu);
> +
> + preempt_disable();
> + vgic_v4_load(vcpu);
> + preempt_enable();
> }
>
> kvm_incr_pc(vcpu);
I actually largely prefer this approach, which is massively more
readable than the current setup. Feel free to wrap that in your
series.
I'll also have a look at the vcpu_dy_runnable() asap.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
WARNING: multiple messages have this Message-ID (diff)
From: Marc Zyngier <maz@kernel.org>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Huacai Chen <chenhuacai@kernel.org>,
Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>,
Paul Mackerras <paulus@ozlabs.org>,
Christian Borntraeger <borntraeger@de.ibm.com>,
Janosch Frank <frankja@linux.ibm.com>,
James Morse <james.morse@arm.com>,
Alexandru Elisei <alexandru.elisei@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
David Hildenbrand <david@redhat.com>,
Cornelia Huck <cohuck@redhat.com>,
Claudio Imbrenda <imbrenda@linux.ibm.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
linux-arm-kernel@lists.infradead.org,
kvmarm@lists.cs.columbia.edu, linux-mips@vger.kernel.org,
kvm@vger.kernel.org, kvm-ppc@vger.kernel.org,
linux-kernel@vger.kernel.org, David Matlack <dmatlack@google.com>,
Jing Zhang <jingzhangos@google.com>
Subject: Re: [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful
Date: Thu, 30 Sep 2021 10:36:19 +0100 [thread overview]
Message-ID: <87czoqtnfw.wl-maz@kernel.org> (raw)
In-Reply-To: <YVNA+KTbLrxGQ6ML@google.com>
On Tue, 28 Sep 2021 17:21:12 +0100,
Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Sep 28, 2021, Marc Zyngier wrote:
> > On Mon, 27 Sep 2021 18:28:14 +0100,
> > Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Sun, Sep 26, 2021, Marc Zyngier wrote:
> > > > On Sun, 26 Sep 2021 07:27:28 +0100,
> > > > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > > >
> > > > > On 25/09/21 11:50, Marc Zyngier wrote:
> > > > > >> there is no need for arm64 to put/load
> > > > > >> the vGIC as KVM hasn't relinquished control of the vCPU in any way.
> > > > > >
> > > > > > This doesn't mean that there is no requirement for any state
> > > > > > change. The put/load on GICv4 is crucial for performance, and the VMCR
> > > > > > resync is a correctness requirement.
> > >
> > > Ah crud, I didn't blame that code beforehand, I simply assumed
> > > kvm_arch_vcpu_blocking() was purely for the blocking/schedule()
> > > sequence. The comment in arm64's kvm_arch_vcpu_blocking() about
> > > kvm_arch_vcpu_runnable() makes more sense now too.
> > >
> > > > > I wouldn't even say it's crucial for performance: halt polling cannot
> > > > > work and is a waste of time without (the current implementation of)
> > > > > put/load.
> > > >
> > > > Not quite. A non-V{LPI,SGI} could still be used as the a wake-up from
> > > > WFI (which is the only reason we end-up on this path). Only LPIs (and
> > > > SGIs on GICv4.1) can be directly injected, meaning that SPIs and PPIs
> > > > still follow the standard SW injection model.
> > > >
> > > > However, there is still the ICH_VMCR_EL2 requirement (to get the
> > > > up-to-date priority mask and group enable bits) for SW-injected
> > > > interrupt wake-up to work correctly, and I really don't want to save
> > > > that one eagerly on each shallow exit.
> > >
> > > IIUC, VMCR is resident in hardware while the guest is running, and
> > > KVM needs to retrieve the VMCR when processing interrupts to
> > > determine if a interrupt is above the priority threshold. If that's
> > > the case, then IMO handling the VMCR via an arch hook is
> > > unnecessarily fragile, e.g. any generic call that leads to
> > > kvm_arch_vcpu_runnable() needs to know that arm64 lazily retrieves a
> > > guest register.
> >
> > Not quite. We only need to retrieve the VMCR if we are in a situation
> > where we need to trigger a wake-up from WFI at the point where we have
> > not done a vcpu_put() yet. All the other cases where the interrupt is
> > injected are managed by the HW. And the only case where
> > kvm_arch_vcpu_runnable() gets called is when blocking.
> >
> > I also don't get why a hook would be fragile, as long as it has well
> > defined semantics.
>
> Generic KVM should not have to know that a seemingly benign arch hook,
> kvm_arch_vcpu_runnable(), cannot be safely called without first calling another
> arch hook. E.g. I suspect there's a (benign?) race in kvm_vcpu_on_spin(). If
> the loop is delayed between checking rcuwait_active() and vcpu_dy_runnable(),
> and the target vCPU is awakened during that period, KVM can call
> kvm_arch_vcpu_runnable() while the vCPU is running.
Humph. Indeed, there is a potential gold-plated turd there.
> It's kind of a counter-example to my below suggestion as putting the vGIC would
> indeed lead to state corruption if the vCPU is running, but I would argue that
> arm64 should override kvm_arch_dy_runnable() so that its correctness is guaranteed,
> e.g. by not calling kvm_arch_vcpu_runnable() if the vCPU is already running.
I'll work something out for that case.
> > > A better approach for VMCR would be to retrieve the value from
> > > hardware on-demand, e.g. via a hook in vgic_get_vmcr(), so that it's all but
> > > impossible to have bugs where KVM is working with a stale VMCR, e.g.
> > >
> > > diff --git a/arch/arm64/kvm/vgic/vgic-mmio.c b/arch/arm64/kvm/vgic/vgic-mmio.c
> > > index 48c6067fc5ec..0784de0c4080 100644
> > > --- a/arch/arm64/kvm/vgic/vgic-mmio.c
> > > +++ b/arch/arm64/kvm/vgic/vgic-mmio.c
> > > @@ -828,6 +828,13 @@ void vgic_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
> > >
> > > void vgic_get_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
> > > {
> > > + if (!vcpu->...->vmcr_available) {
> > > + preempt_disable();
> > > + kvm_vgic_vmcr_sync(vcpu);
> > > + preempt_enable();
> > > + vcpu->...->vmcr_available = true;
> > > + }
> > > +
> >
> > But most of the uses of vgic_get_vmcr() are in contexts where the vcpu
> > isn't running at all (such as save/restore). It really only operates
> > on the shadow state, and what you have above will only lead to state
> > corruption.
>
> Ignoring the kvm_arch_dy_runnable() case for the moment, how would
> it lead to corruption? The idea is that the 'vmcr_available' flag
> would be cleared when the vCPU is run, i.e. it tracks whether or not
> the shadow state may be stale.
I guess that 'vmcr_available' would have to be initialised to 'true'
at vcpu reset time so that the userspace side cannot trigger a read
from the HW.
> > > if (kvm_vgic_global_state.type == VGIC_V2)
> > > vgic_v2_get_vmcr(vcpu, vmcr);
> > > else
> > >
> > >
> > > Regarding vGIC v4, does KVM require it to be resident in hardware
> > > while the vCPU is loaded?
> >
> > It is a requirement. Otherwise, we end-up with an inconsistent state
> > between the delivery of doorbells and the state of the vgic.
>
> For my own understanding, does KVM require it to be resident in
> hardware while the vCPU is loaded but _not_ running? What I don't
> fully understand is how KVM can safely load/put the vCPU if that
> true, i.e. wouldn't there always be a window for badness?
No, that part is fine. It is when you start running the vcpu without
the GICv4 context loaded that ugly stuff happens (get a doorbell that
tells you to schedule the currently running vcpu, for example).
>
> > Also, reloading the GICv4 state can be pretty expensive (multiple MMIO
> > accesses), which is why we really don't want to do that on the hot path
> > (kvm_arch_vcpu_ioctl_run() *is* a hot path).
>
> I wasn't suggesting to reload GICv4 on every entry, it would only be reloaded
> if it was put between vcpu_load() and entry to the guest.
>
> > > If not, then we could do something like
> > > this, which would eliminate the arch hooks entirely if the VMCR is
> > > handled as above.
>
> ...
>
> > > @@ -813,6 +787,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > > */
> > > preempt_disable();
> > >
> > > + /*
> > > + * Reload vGIC v4 if necessary, as it may be put on-demand so
> > > + * that KVM can detect directly injected interrupts, e.g. when
> > > + * determining if the vCPU is runnable due to a pending event.
> > > + */
> > > + vgic_v4_load(vcpu);
> >
> > You'd need to detect that a previous put has been done.
>
> Not that it will likely matter, but doesn't the its_vpe.resident
> check automatically handle this?
Sort of. I eventually want to get rid of this as it papers over all
sort of sins. I introduced it exactly because of the nesting that
vcpu_block triggers, but this is a bit of a layering violation between
KVM and the underlying GICv4 driver.
>
> > But overall, it puts the complexity at the wrong place. WFI (aka
> > kvm_vcpu_block) is the place where we want to handle this synchronisation,
> > and not the run loop.
> >
> > Instead of having a well defined interface with the blocking code
> > where we implement the required synchronisation, you spray the vgic
> > crap all over, and it becomes much harder to reason about it. Guess
> > what, I'm not keen on it.
>
> My objection to the arch hooks is that, from generic KVM's
> perspective, the direct dependency is not on blocking, it's on
> calling kvm_arch_vcpu_runnable(). That's why I suggested handling
> this by tracking whether or not the VMCR is up-to-date/stale, as it
> allows generic KVM to safely call kvm_arch_vcpu_runnable() whenever
> the vCPU is loaded.
>
> I don't have a strong opinion on arm64 preferring the sync to be
> specific to WFI, but if that's the case then IMO this should be
> handled fully in arm64, e.g. a patch like so (or with a wrapper
> around the call to kvm_vcpu_block() if we want to guard against
> future calls into generic KVM)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index fe102cd2e518..312f3acd3ca3 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -367,27 +367,12 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
>
> void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
> {
> - /*
> - * If we're about to block (most likely because we've just hit a
> - * WFI), we need to sync back the state of the GIC CPU interface
> - * so that we have the latest PMR and group enables. This ensures
> - * that kvm_arch_vcpu_runnable has up-to-date data to decide
> - * whether we have pending interrupts.
> - *
> - * For the same reason, we want to tell GICv4 that we need
> - * doorbells to be signalled, should an interrupt become pending.
> - */
> - preempt_disable();
> - kvm_vgic_vmcr_sync(vcpu);
> - vgic_v4_put(vcpu, true);
> - preempt_enable();
> +
> }
>
> void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
> {
> - preempt_disable();
> - vgic_v4_load(vcpu);
> - preempt_enable();
> +
> }
>
> void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 275a27368a04..9870e824a27e 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -95,8 +95,28 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu)
> } else {
> trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
> vcpu->stat.wfi_exit_stat++;
> +
> + /*
> + * Sync back the state of the GIC CPU interface so that we have
> + * the latest PMR and group enables. This ensures that
> + * kvm_arch_vcpu_runnable has up-to-date data to decide whether
> + * we have pending interrupts, e.g. when determining if the
> + * vCPU should block.
> + *
> + * For the same reason, we want to tell GICv4 that we need
> + * doorbells to be signalled, should an interrupt become pending.
> + */
> + preempt_disable();
> + kvm_vgic_vmcr_sync(vcpu);
> + vgic_v4_put(vcpu, true);
> + preempt_enable();
> +
> kvm_vcpu_block(vcpu);
> kvm_clear_request(KVM_REQ_UNHALT, vcpu);
> +
> + preempt_disable();
> + vgic_v4_load(vcpu);
> + preempt_enable();
> }
>
> kvm_incr_pc(vcpu);
I actually largely prefer this approach, which is massively more
readable than the current setup. Feel free to wrap that in your
series.
I'll also have a look at the vcpu_dy_runnable() asap.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2021-09-30 9:36 UTC|newest]
Thread overview: 192+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-25 0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` [PATCH 01/14] KVM: s390: Ensure kvm_arch_no_poll() is read once when blocking vCPU Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-27 6:54 ` Christian Borntraeger
2021-09-27 6:54 ` Christian Borntraeger
2021-09-27 6:54 ` Christian Borntraeger
2021-09-27 6:54 ` Christian Borntraeger
2021-09-25 0:55 ` [PATCH 02/14] KVM: Update halt-polling stats if and only if halt-polling was attempted Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-28 18:57 ` David Matlack
2021-09-28 18:57 ` David Matlack
2021-09-28 18:57 ` David Matlack
2021-09-28 18:57 ` David Matlack
2021-09-25 0:55 ` [PATCH 03/14] KVM: Refactor and document halt-polling stats update helper Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-28 19:01 ` David Matlack
2021-09-28 19:01 ` David Matlack
2021-09-28 19:01 ` David Matlack
2021-09-25 0:55 ` [PATCH 04/14] KVM: Reconcile discrepancies in halt-polling stats Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-28 21:26 ` David Matlack
2021-09-28 21:26 ` David Matlack
2021-09-28 21:26 ` David Matlack
2021-09-25 0:55 ` [PATCH 05/14] KVM: s390: Clear valid_wakeup in kvm_s390_handle_wait(), not in arch hook Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-27 6:58 ` Christian Borntraeger
2021-09-27 6:58 ` Christian Borntraeger
2021-09-27 6:58 ` Christian Borntraeger
2021-09-27 6:58 ` Christian Borntraeger
2021-09-25 0:55 ` [PATCH 06/14] KVM: Drop obsolete kvm_arch_vcpu_block_finish() Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-27 6:58 ` Christian Borntraeger
2021-09-27 6:58 ` Christian Borntraeger
2021-09-27 6:58 ` Christian Borntraeger
2021-09-27 6:58 ` Christian Borntraeger
2021-09-28 21:28 ` David Matlack
2021-09-28 21:28 ` David Matlack
2021-09-28 21:28 ` David Matlack
2021-09-25 0:55 ` [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 9:50 ` Marc Zyngier
2021-09-25 9:50 ` Marc Zyngier
2021-09-25 9:50 ` Marc Zyngier
2021-09-25 9:50 ` Marc Zyngier
2021-09-26 6:27 ` Paolo Bonzini
2021-09-26 6:27 ` Paolo Bonzini
2021-09-26 6:27 ` Paolo Bonzini
2021-09-26 6:27 ` Paolo Bonzini
2021-09-26 9:02 ` Marc Zyngier
2021-09-26 9:02 ` Marc Zyngier
2021-09-26 9:02 ` Marc Zyngier
2021-09-26 9:02 ` Marc Zyngier
2021-09-27 17:28 ` Sean Christopherson
2021-09-27 17:28 ` Sean Christopherson
2021-09-27 17:28 ` Sean Christopherson
2021-09-27 17:28 ` Sean Christopherson
2021-09-28 9:24 ` Marc Zyngier
2021-09-28 9:24 ` Marc Zyngier
2021-09-28 9:24 ` Marc Zyngier
2021-09-28 9:24 ` Marc Zyngier
2021-09-28 16:21 ` Sean Christopherson
2021-09-28 16:21 ` Sean Christopherson
2021-09-28 16:21 ` Sean Christopherson
2021-09-28 16:21 ` Sean Christopherson
2021-09-30 9:36 ` Marc Zyngier [this message]
2021-09-30 9:36 ` Marc Zyngier
2021-09-30 9:36 ` Marc Zyngier
2021-09-30 9:36 ` Marc Zyngier
2021-09-25 0:55 ` [PATCH 08/14] KVM: x86: Tweak halt emulation helper names to free up kvm_vcpu_halt() Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-28 21:59 ` David Matlack
2021-09-28 21:59 ` David Matlack
2021-09-28 21:59 ` David Matlack
2021-09-25 0:55 ` [PATCH 09/14] KVM: Rename kvm_vcpu_block() => kvm_vcpu_halt() Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-27 7:06 ` Christian Borntraeger
2021-09-27 7:06 ` Christian Borntraeger
2021-09-27 7:06 ` Christian Borntraeger
2021-09-27 7:06 ` Christian Borntraeger
2021-09-28 22:01 ` David Matlack
2021-09-28 22:01 ` David Matlack
2021-09-28 22:01 ` David Matlack
2021-09-28 22:01 ` David Matlack
2021-09-25 0:55 ` [PATCH 10/14] KVM: Split out a kvm_vcpu_block() helper from kvm_vcpu_halt() Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-27 7:41 ` Christian Borntraeger
2021-09-27 7:41 ` Christian Borntraeger
2021-09-27 7:41 ` Christian Borntraeger
2021-09-27 7:41 ` Christian Borntraeger
2021-09-28 22:03 ` David Matlack
2021-09-28 22:03 ` David Matlack
2021-09-28 22:03 ` David Matlack
2021-09-25 0:55 ` [PATCH 11/14] KVM: stats: Add stat to detect if vcpu is currently blocking Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-28 22:04 ` David Matlack
2021-09-28 22:04 ` David Matlack
2021-09-28 22:04 ` David Matlack
2021-09-25 0:55 ` [PATCH 12/14] KVM: Don't redo ktime_get() when calculating halt-polling stop/deadline Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-28 22:08 ` David Matlack
2021-09-28 22:08 ` David Matlack
2021-09-28 22:08 ` David Matlack
2021-09-25 0:55 ` [PATCH 13/14] KVM: x86: Directly block (instead of "halting") UNINITIALIZED vCPUs Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-28 22:12 ` David Matlack
2021-09-28 22:12 ` David Matlack
2021-09-28 22:12 ` David Matlack
2021-09-25 0:55 ` [PATCH 14/14] KVM: x86: Invoke kvm_vcpu_block() directly for non-HALTED wait states Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-25 0:55 ` Sean Christopherson
2021-09-28 22:14 ` David Matlack
2021-09-28 22:14 ` David Matlack
2021-09-28 22:14 ` David Matlack
2021-09-28 22:14 ` David Matlack
2021-09-27 7:22 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new st Christian Borntraeger
2021-09-27 7:22 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) Christian Borntraeger
2021-09-27 7:22 ` Christian Borntraeger
2021-09-27 7:22 ` Christian Borntraeger
2021-09-27 14:59 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a ne Sean Christopherson
2021-09-27 14:59 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) Sean Christopherson
2021-09-27 14:59 ` Sean Christopherson
2021-09-27 14:59 ` Sean Christopherson
2021-09-27 15:03 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a ne Paolo Bonzini
2021-09-27 15:03 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) Paolo Bonzini
2021-09-27 15:03 ` Paolo Bonzini
2021-09-27 15:03 ` Paolo Bonzini
2021-09-27 15:15 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a ne Sean Christopherson
2021-09-27 15:15 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) Sean Christopherson
2021-09-27 15:15 ` Sean Christopherson
2021-09-27 15:15 ` Sean Christopherson
2021-09-27 15:16 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a ne Christian Borntraeger
2021-09-27 15:16 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) Christian Borntraeger
2021-09-27 15:16 ` Christian Borntraeger
2021-09-27 15:16 ` Christian Borntraeger
2021-09-27 16:58 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a ne David Matlack
2021-09-27 16:58 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) David Matlack
2021-09-27 16:58 ` David Matlack
2021-09-27 16:58 ` David Matlack
2021-09-29 6:56 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a ne Christian Borntraeger
2021-09-29 6:56 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) Christian Borntraeger
2021-09-29 6:56 ` Christian Borntraeger
2021-09-29 6:56 ` Christian Borntraeger
2021-09-27 17:24 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a ne Paolo Bonzini
2021-09-27 17:24 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) Paolo Bonzini
2021-09-27 17:24 ` Paolo Bonzini
2021-09-27 17:24 ` Paolo Bonzini
2021-09-27 17:33 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a ne Sean Christopherson
2021-09-27 17:33 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) Sean Christopherson
2021-09-27 17:33 ` Sean Christopherson
2021-09-27 17:33 ` Sean Christopherson
2022-11-15 3:28 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a ne wangyanan (Y)
2022-11-15 3:28 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) wangyanan (Y)
2022-11-15 3:28 ` wangyanan (Y)
2022-11-15 3:28 ` wangyanan (Y)
2022-11-16 17:19 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a ne David Matlack
2022-11-16 17:19 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) David Matlack
2022-11-16 17:19 ` David Matlack
2022-11-16 17:19 ` David Matlack
2022-11-18 2:29 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a ne wangyanan (Y)
2022-11-18 2:29 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) wangyanan (Y)
2022-11-18 2:29 ` wangyanan (Y)
2022-11-18 2:29 ` wangyanan (Y)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87czoqtnfw.wl-maz@kernel.org \
--to=maz@kernel.org \
--cc=aleksandar.qemu.devel@gmail.com \
--cc=alexandru.elisei@arm.com \
--cc=borntraeger@de.ibm.com \
--cc=chenhuacai@kernel.org \
--cc=cohuck@redhat.com \
--cc=david@redhat.com \
--cc=dmatlack@google.com \
--cc=frankja@linux.ibm.com \
--cc=imbrenda@linux.ibm.com \
--cc=james.morse@arm.com \
--cc=jingzhangos@google.com \
--cc=jmattson@google.com \
--cc=joro@8bytes.org \
--cc=kvm-ppc@vger.kernel.org \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mips@vger.kernel.org \
--cc=paulus@ozlabs.org \
--cc=pbonzini@redhat.com \
--cc=seanjc@google.com \
--cc=suzuki.poulose@arm.com \
--cc=vkuznets@redhat.com \
--cc=wanpengli@tencent.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.