public inbox for kvm@vger.kernel.org
From: Avi Kivity <avi@qumranet.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: kvm@vger.kernel.org
Subject: Re: [patch 1/2] KVM: x86: do not entry guest mode if vcpu is not runnable
Date: Sat, 26 Jul 2008 11:07:48 +0300
Message-ID: <488ADB54.6040005@qumranet.com>
In-Reply-To: <20080721144037.226624791@localhost.localdomain>

Marcelo Tosatti wrote:
> If a vcpu has been offlined, or not initialized at all, signals
> requesting userspace work to be performed will result in KVM attempting
> to re-entry guest mode.
>
> Problem is that the in-kernel irqchip emulation happily executes HALTED
> state vcpu's. This breaks "savevm" on Windows SMP installation (that
> only boots up a single vcpu), for example.
>
> Fix it by blocking halted vcpu's at kvm_arch_vcpu_ioctl_run(). 
>
> Change the promotion from halted to running to happen in the vcpu
> context. Use the information available in kvm_vcpu_block(), and the
> current mpstate to make the decision:
>
> - If there's an in-kernel timer or irq event the halted->running
> promotion evaluation can be performed, no need for userspace assistance.
>
> - If there's a signal, there's either userspace work to be performed
> in the vcpu's context or irqchip emulation is in userspace.
>
> This has the nice side effect of avoiding userspace exit in case 
> of irq injection to a halted vcpu from the iothread.
>
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>
> Index: kvm/arch/x86/kvm/x86.c
> ===================================================================
> --- kvm.orig/arch/x86/kvm/x86.c
> +++ kvm/arch/x86/kvm/x86.c
> @@ -2505,17 +2505,25 @@ void kvm_arch_exit(void)
>  	kvm_mmu_module_exit();
>  }
>  
> +static void kvm_vcpu_promote_runnable(struct kvm_vcpu *vcpu)
> +{
> +	if (vcpu->arch.mp_state == KVM_MP_STATE_HALTED)
> +		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
> +}
> +
>  int kvm_emulate_halt(struct kvm_vcpu *vcpu)
>  {
>  	++vcpu->stat.halt_exits;
>  	KVMTRACE_0D(HLT, vcpu, handler);
>  	if (irqchip_in_kernel(vcpu->kvm)) {
> +		int ret;
>   

Missing blank line after the declaration.

> @@ -2978,10 +2986,12 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_v
>  	if (vcpu->sigset_active)
>  		sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);
>  
> -	if (unlikely(vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED)) {
> -		kvm_vcpu_block(vcpu);
> -		r = -EAGAIN;
> -		goto out;
> +	if (unlikely(!kvm_arch_vcpu_runnable(vcpu))) {
> +		if (kvm_vcpu_block(vcpu)) {
> +			r = -EAGAIN;
> +			goto out;
> +		}
> +		kvm_vcpu_promote_runnable(vcpu);
>  	}
>   


Any reason this is not in __vcpu_run()?

Our main loop could look like:

    while (no reason to stop)
        if (runnable)
            enter guest
        else
            block
        deal with aftermath

kvm_emulate_halt would then simply modify the mp state.
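
As a rough illustration of that loop shape, here is a small userspace model in C. It is only a sketch under stated assumptions: `model_vcpu`, `model_block()`, `model_enter_guest()` and `model_vcpu_run()` are hypothetical names, not KVM functions. A scripted event list stands in for real wakeups, and -1 stands in for the error returned when a signal is pending. What it tries to show is that the halt path only changes the mp state, while the loop alone decides whether to enter the guest or block.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum mp_state { MP_RUNNABLE, MP_HALTED };
enum event { EV_IRQ, EV_SIGNAL };

struct model_vcpu {
	enum mp_state mp_state;
	bool irq_pending;          /* in-kernel timer or irq event arrived */
	bool signal_pending;       /* userspace work is waiting */
	const enum event *events;  /* scripted wakeups, one consumed per block */
	int nr_events;
	int next_event;
	int guest_entries;
	int blocks;
};

/* Stand-in for sleeping: each call is woken by the next scripted event.
 * An exhausted script is treated as a signal so the model terminates. */
static void model_block(struct model_vcpu *v)
{
	v->blocks++;
	if (v->next_event >= v->nr_events ||
	    v->events[v->next_event] == EV_SIGNAL)
		v->signal_pending = true;
	else
		v->irq_pending = true;
	if (v->next_event < v->nr_events)
		v->next_event++;
}

/* Stand-in for guest execution: the guest handles the interrupt and
 * executes HLT again; the halt path only modifies the mp state. */
static void model_enter_guest(struct model_vcpu *v)
{
	assert(v->mp_state == MP_RUNNABLE);
	v->guest_entries++;
	v->irq_pending = false;
	v->mp_state = MP_HALTED;
}

int model_vcpu_run(struct model_vcpu *v)
{
	for (;;) {                              /* while (no reason to stop) */
		if (v->mp_state == MP_RUNNABLE)
			model_enter_guest(v);
		else
			model_block(v);

		/* deal with aftermath */
		if (v->mp_state == MP_HALTED && v->irq_pending)
			v->mp_state = MP_RUNNABLE;  /* halted -> running promotion */
		if (v->signal_pending) {
			v->signal_pending = false;
			return -1;                  /* hand control back to userspace */
		}
	}
}
```

In this model a halted vcpu never enters the guest: it blocks, is promoted in its own context when an irq arrives, and returns to the caller only when a signal is pending.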

>  
>  	/* re-sync apic's tpr */
> Index: kvm/include/linux/kvm_host.h
> ===================================================================
> --- kvm.orig/include/linux/kvm_host.h
> +++ kvm/include/linux/kvm_host.h
> @@ -199,7 +199,7 @@ struct kvm_memory_slot *gfn_to_memslot(s
>  int kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn);
>  void mark_page_dirty(struct kvm *kvm, gfn_t gfn);
>  
> -void kvm_vcpu_block(struct kvm_vcpu *vcpu);
> +int kvm_vcpu_block(struct kvm_vcpu *vcpu);
>  void kvm_resched(struct kvm_vcpu *vcpu);
>  void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
>  void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
> Index: kvm/virt/kvm/kvm_main.c
> ===================================================================
> --- kvm.orig/virt/kvm/kvm_main.c
> +++ kvm/virt/kvm/kvm_main.c
> @@ -818,9 +818,10 @@ void mark_page_dirty(struct kvm *kvm, gf
>  /*
>   * The vCPU has executed a HLT instruction with in-kernel mode enabled.
>   */
> -void kvm_vcpu_block(struct kvm_vcpu *vcpu)
> +int kvm_vcpu_block(struct kvm_vcpu *vcpu)
>  {
>  	DEFINE_WAIT(wait);
> +	int ret = 0;
>  
>  	for (;;) {
>  		prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
> @@ -831,8 +832,10 @@ void kvm_vcpu_block(struct kvm_vcpu *vcp
>  			break;
>  		if (kvm_arch_vcpu_runnable(vcpu))
>  			break;
> -		if (signal_pending(current))
> +		if (signal_pending(current)) {
> +			ret = 1;
>  			break;
> +		}
>   

The return value is ambiguous.  More than one exit condition can be true
at the same time (the vcpu becomes runnable _and_ a signal is pending), so
you can't trust the return code.  I don't think it affects the usage in
the rest of the patch, but it is best to avoid such subtlety.

Can this be done by setting a KVM_REQ_UNHALT bit in vcpu->requests?
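
To make the idea concrete, here is a minimal userspace sketch in C, not kernel code. Everything in it is an assumption for illustration: `req_vcpu`, `REQ_UNHALT_BIT` (the bit number is made up), `req_vcpu_block()` and `req_vcpu_after_block()` are hypothetical names. The block function keeps a void return and records "the vcpu became runnable" as a request bit, so a signal pending at the same time cannot hide it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit number for illustration, not the kernel's value. */
enum { REQ_UNHALT_BIT = 0 };

struct req_vcpu {
	unsigned long requests;   /* bitmask of pending requests */
	bool halted;
	bool wake_event;          /* irq or timer arrived while blocked */
	bool signal_pending;      /* userspace work is waiting */
};

/* Stand-in for the blocking wait. Real code would sleep; the model looks
 * at the wake condition once. The wakeup reason is recorded as a request
 * bit, independently of whether a signal is also pending. */
void req_vcpu_block(struct req_vcpu *v)
{
	if (v->wake_event)
		v->requests |= 1UL << REQ_UNHALT_BIT;
}

/* Return whether the bit was set, and clear it. */
static bool req_test_and_clear(struct req_vcpu *v, int bit)
{
	bool was_set;

	assert(bit >= 0 && bit < (int)(sizeof(unsigned long) * 8));
	was_set = v->requests & (1UL << bit);
	v->requests &= ~(1UL << bit);
	return was_set;
}

/* Caller side: promote the vcpu if an unhalt was requested, then report
 * separately whether a signal requires an exit to userspace. */
bool req_vcpu_after_block(struct req_vcpu *v)
{
	if (req_test_and_clear(v, REQ_UNHALT_BIT))
		v->halted = false;
	return v->signal_pending;
}
```

With both a wake event and a signal pending, the caller sees both facts: the vcpu is promoted and the exit to userspace still happens.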

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

