From mboxrd@z Thu Jan 1 00:00:00 1970 From: Marc Zyngier Date: Sat, 25 Sep 2021 09:50:05 +0000 Subject: Re: [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful Message-Id: <878rzlass2.wl-maz@kernel.org> List-Id: References: <20210925005528.1145584-1-seanjc@google.com> <20210925005528.1145584-8-seanjc@google.com> In-Reply-To: <20210925005528.1145584-8-seanjc@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: Sean Christopherson Cc: Wanpeng Li , kvm@vger.kernel.org, David Hildenbrand , linux-kernel@vger.kernel.org, Paul Mackerras , Claudio Imbrenda , kvmarm@lists.cs.columbia.edu, Janosch Frank , Joerg Roedel , Huacai Chen , Christian Borntraeger , Aleksandar Markovic , kvm-ppc@vger.kernel.org, David Matlack , linux-arm-kernel@lists.infradead.org, Jim Mattson , Cornelia Huck , linux-mips@vger.kernel.org, Paolo Bonzini , Vitaly Kuznetsov On Sat, 25 Sep 2021 01:55:21 +0100, Sean Christopherson wrote: > > Invoke the arch hooks for block+unblock if and only if KVM actually > attempts to block the vCPU. The only non-nop implementation is on arm64, > and if halt-polling is successful, there is no need for arm64 to put/load > the vGIC as KVM hasn't relinquished control of the vCPU in any way. This doesn't mean that there is no requirement for any state change. The put/load on GICv4 is crucial for performance, and the VMCR resync is a correctness requirement. > > The primary motivation is to allow future cleanup to split out "block" > from "halt", but this is also likely a small performance boost on arm64 > when halt-polling is successful. > > Adjust the post-block path to update "cur" after unblocking, i.e. include > vGIC load time in halt_wait_ns and halt_wait_hist, so that the behavior > is consistent. Moving just the pre-block arch hook would result in only > the vGIC put latency being included in the halt_wait stats. There is no > obvious evidence that one way or the other is correct, so just ensure KVM > is consistent. This effectively reverts 07ab0f8d9a12 ("KVM: Call kvm_arch_vcpu_blocking early into the blocking sequence"), which was a huge gain on arm64, not to mention a correctness fix. Without this, a GICv4 machine will always pay for the full poll penalty, going into schedule(), and only then get a doorbell interrupt signalling telling the kernel that there was an interrupt. On a non-GICv4 machine, it means that interrupts injected by another thread during the pooling will be evaluated with an outdated priority mask, which can result in either a spurious wake-up or a missed wake-up. If it means introducing a new set of {pre,post}-poll arch-specific hooks, so be it. But I don't think this change is acceptable as is. Thanks, M. -- Without deviation from the norm, progress is not possible.