From: Eduardo Habkost <ehabkost@redhat.com>
To: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>,
	kvm@vger.kernel.org, pbonzini@redhat.com, berrange@redhat.com,
	Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>,
	Peter Krempa <pkrempa@redhat.com>,
	John Ferlan <jferlan@redhat.com>,
	libvir-list@redhat.com
Subject: Re: [RFC] kvm: x86: export vCPU halted state to sysfs
Date: Thu, 1 Feb 2018 18:26:49 -0200
Message-ID: <20180201202649.GG26425@localhost.localdomain>
In-Reply-To: <20180201201514.GB660@flask>

On Thu, Feb 01, 2018 at 09:15:15PM +0100, Radim Krčmář wrote:
> 2018-02-01 12:54-0500, Luiz Capitulino:
> > 
> > Libvirt needs to know when a vCPU is halted. To get this information,
> 
> I don't see why upper level management should care about that, a single
> bit about halted state that can be incorrect at the time it is processed
> seems of very limited use.

I don't see why, either.

I'm CCing libvir-list and the people involved in the code that
added halt state to libvirt domain statistics.

> 
> (A much more sensible data point would be the fraction of time when VCPU
>  was running or runnable, which is roughly what you get by sampling the
>  halted state.)
> 
> A halted vCPU might even be halted in guest mode, so KVM doesn't know
> about that state (unless you force a VM exit), which would complicate
> the collection a bit more ... but really, what is the data being used
> for?
> 
> User might care about the state, for obscure reasons, but that isn't a
> performance problem.
> 
> > libvirt has started using the query-cpus command from QEMU. However,
> > if the in-kernel irqchip is in use, query-cpus will force all vCPUs
> > to user-space, since they have to issue the KVM_GET_MP_STATE ioctl.
> 
> Libvirt knows if KVM exits to userspace on halts, so it can just query
> QEMU in that case and in the other case, there is a very dirty
> "solution" that works on all architectures right now:
> 
>   grep kvm_vcpu_block /proc/$vcpu_task/stack
> 
> If you get something, the vcpu is halted in KVM.

Nice.
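
For anyone who wants to try the same check from a program instead of
a shell, here is a minimal sketch in C. It only mirrors the grep
above; it is not how libvirt or QEMU do anything today. The function
names are made up for illustration, the kvm_vcpu_block symbol name is
taken from the grep line as-is, and reading /proc/<tid>/stack
typically needs root and a kernel built with stack trace support.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns 1 if a chunk of /proc/<tid>/stack text
 * mentions kvm_vcpu_block, 0 otherwise (including for NULL input).
 */
int stack_text_shows_halt(const char *stack_text)
{
	return stack_text != NULL &&
	       strstr(stack_text, "kvm_vcpu_block") != NULL;
}

/*
 * Hypothetical helper: scans /proc/<tid>/stack for the vCPU thread.
 * Returns 1 if the thread appears to be blocked in KVM, 0 if not,
 * and -1 if the stack file could not be opened.
 */
int vcpu_task_is_halted(pid_t vcpu_tid)
{
	char path[64];
	char line[256];
	FILE *f;
	int halted = 0;

	snprintf(path, sizeof(path), "/proc/%ld/stack", (long)vcpu_tid);
	f = fopen(path, "r");
	if (!f)
		return -1;

	/* Same test as the grep: any line naming kvm_vcpu_block counts. */
	while (fgets(line, sizeof(line), f)) {
		if (stack_text_shows_halt(line)) {
			halted = 1;
			break;
		}
	}
	fclose(f);
	return halted;
}
```

Like the grep, the result is only a snapshot and can be stale by the
time the caller looks at it.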


> 
> > This has catastrophic implications to low-latency workloads like
> > KVM-RT and zero packet loss with DPDK. To make matters worse, there's
> > an OpenStack service called ceilometer that causes libvirt to
> > issue query-cpus every few minutes.
> 
> I'd expect that people running these workloads can set up the system. :(
> 
> I bet that ceilometer just mindlessly collects everything, so we should
> be able to configure libvirt to collect only some stats.  Either libvirt
> or upper layer would decide what is too expensive for its usefulness.

Yes.  Including expensive-to-collect halt state in
VIR_DOMAIN_STATS_VCPU is a serious performance regression in
libvirt.

> 
> > The solution proposed in this patch is to export the vCPU
> > halted state in the already existing vcpu directory in sysfs.
> > This way, libvirt can read the vCPU halted state from sysfs and avoid
> > using the query-cpus command. This solution seems to be sufficient
> > for libvirt needs, but it has the following cons:
> > 
> >  * vcpu information in sysfs lives in a debug directory, so
> >    libvirt would be basing its API on debug info
> 
> (It pains me to say there probably already are tools that depend on
>  kvm/debug.)
> 
> It's slightly better than the stack hack, but needs more code in kernel
> and the interface is in a gray compatibility zone, so I'd like to know
> why userspace does that in the first place.
> 
> >  * Currently, only x86 supports the vcpu dir in sysfs, so
> >    we'd have to expand this to other archs (should be doable)
> > 
> > If we agree that this solution is feasible, I'll work on extending
> > the vcpu debug information to other archs for my next posting.
> > 
> > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> > ---
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > @@ -6273,6 +6273,7 @@ void kvm_arch_exit(void)
> >  
> >  int kvm_vcpu_halt(struct kvm_vcpu *vcpu)
> >  {
> > +	kvm_vcpu_set_halted(vcpu);
> 
> There is no point to care about !lapic_in_kernel().  I'd move the logic
> into vcpu_block() to be shared among all architectures.
> 
> >  	++vcpu->stat.halt_exits;
> >  	if (lapic_in_kernel(vcpu)) {
> >  		vcpu->arch.mp_state = KVM_MP_STATE_HALTED;

-- 
Eduardo

