From: "Radim Krčmář" <rkrcmar@redhat.com>
To: Eduardo Valentin <eduval@amazon.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Jonathan Corbet <corbet@lwn.net>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
x86@kernel.org, Peter Zijlstra <peterz@infradead.org>,
Waiman Long <longman@redhat.com>,
kvm@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org,
"Jan H . Schoenherr" <jschoenh@amazon.de>,
Anthony Liguori <aliguori@amazon.com>,
msw@amazon.com
Subject: Re: [PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
Date: Wed, 8 Nov 2017 19:41:19 +0100 [thread overview]
Message-ID: <20171108184119.GB3664@flask> (raw)
In-Reply-To: <20171031170254.GA12738@u40b0340c692b58f6553c.ant.amazon.com>
2017-10-31 10:02-0700, Eduardo Valentin:
> Hello Radim,
>
> On Tue, Oct 24, 2017 at 01:18:59PM +0200, Radim Krčmář wrote:
> > 2017-10-23 17:44-0700, Eduardo Valentin:
> > > Currently, the existing qspinlock implementation will fallback to
> > > test-and-set if the hypervisor has not set the PV_UNHALT flag.
> >
> > Where have you detected the main source of overhead with pinned VCPUs?
> > Makes me wonder if we couldn't improve general PV_UNHALT,
>
> This is essentially for cases of non-overcommitted vCPUs in which we want
> the instance vCPUs to run uninterrupted as much as possible. Here by disabling
> the PV_UNHALT, we avoid the accounting needed to properly do the PV_UNHALT
> hypercall, as the lock holder won't be preempted anyway for the 1:1 pin case.
Right, I would expect that the scenario should very rarely go into the
halt/kick path -- is SPIN_THRESHOLD too low?
We could also try abolishing the SPIN_THRESHOLD completely and only use
vcpu_is_preempted() and state of the previous lock holder to enter the
halt/kick path.
(The drawback is that vcpu_is_preempted() currently gets set even when
dropping into userspace.)
> > > This patch gives the opportunity to guest kernels to select
> > > between test-and-set and the regular queueu fair lock implementation
> > > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> > > flag is not set, the code will still fall back to test-and-set,
> > > but when the PV_DEDICATED flag is set, the code will use
> > > the regular queue spinlock implementation.
> >
> > Some flag makes sense and we do want to make sure that userspaces don't
> > enable it in pass-through-cpuid mode.
>
> Did you mean something like:
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 0099e10..8ceb503 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -211,7 +211,8 @@ int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
> }
> for (i = 0; i < cpuid->nent; i++) {
> vcpu->arch.cpuid_entries[i].function = cpuid_entries[i].function;
> - vcpu->arch.cpuid_entries[i].eax = cpuid_entries[i].eax;
> + vcpu->arch.cpuid_entries[i].eax = cpuid_entries[i].eax &
> + ~KVM_FEATURE_PV_DEDICATED;
> vcpu->arch.cpuid_entries[i].ebx = cpuid_entries[i].ebx;
> vcpu->arch.cpuid_entries[i].ecx = cpuid_entries[i].ecx;
> vcpu->arch.cpuid_entries[i].edx = cpuid_entries[i].edx;
>
>
> But I do not see any other KVM_FEATURE_* being enforced (e.g. PV_UNHALT).
> Do you mind elaborating a bit here?
Sorry, nothing is needed. I somehow though that we need to expose this
to the userspace through CPUID, but KVM just needs to consider the flag
as reserved.
prev parent reply other threads:[~2017-11-08 18:41 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-10-24 0:44 [PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set Eduardo Valentin
2017-10-24 8:13 ` Peter Zijlstra
2017-10-24 15:37 ` Eduardo Valentin
2017-10-24 16:07 ` Waiman Long
2017-10-24 16:26 ` Eduardo Valentin
2017-10-24 11:18 ` Radim Krčmář
2017-10-31 17:02 ` Eduardo Valentin
2017-11-08 18:41 ` Radim Krčmář [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171108184119.GB3664@flask \
--to=rkrcmar@redhat.com \
--cc=aliguori@amazon.com \
--cc=corbet@lwn.net \
--cc=eduval@amazon.com \
--cc=hpa@zytor.com \
--cc=jschoenh@amazon.de \
--cc=kvm@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=longman@redhat.com \
--cc=mingo@redhat.com \
--cc=msw@amazon.com \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.