public inbox for kvm@vger.kernel.org
From: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
To: Andrea Arcangeli <andrea-Vyt77T80VFVWk0Htik3J/w@public.gmane.org>
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
Subject: Re: external module sched_in event
Date: Fri, 21 Dec 2007 19:52:52 +0200
Message-ID: <476BFD74.2040509@qumranet.com>
In-Reply-To: <20071221174048.GB1292-lysg2Xt5kKMAvxtiuMwx3w@public.gmane.org>

Andrea Arcangeli wrote:
> Hello,
>
> [ I already sent it once as andrea-l3A5Bk7waGM@public.gmane.org but it didn't go through
>   for whatever reason, trying again from private email, hope there
>   won't be dups ]
>   
Oh, it was sent to the list. Don't trust the SourceForge site (in case
you did) for the mails on this list; gmane is much better...
> My worst longstanding problem with KVM is that as the uptime of my
> host system increased, my opensuse guest images started to destabilize
> and lockup at boot. The weird thing was that fresh after boot
> everything was always perfectly ok, so I thought it was rmmod/insmod
> or some other sticky effect on the CPU after restarting the guest a
> few times that triggered the crash. Furthermore if I loaded the cpu a
> lot (like with a while :; do true;done), the crash would magically
> disappear. Decreasing cpu frequency and timings didn't help. Debugging
> wasn't trivial because it required a certain uptime and it didn't
> always crash.
>
> So once I debugged this more aggressively I figured out KVM was ok, it
> was the guest that crashed in the tsc clocksource because tsc wasn't
> monotonic. The guest was stuck in an infinite loop with irqs disabled.
> So I tried to pass "notsc" and that fixed the crash just fine.
>
> Initially I thought it was the tsc_offset logic being wrong but then I
> figured out that the vcpu_put/load wasn't always executed, this
> bugcheck triggers with current git and so I recommend to apply this to
> kvm.git to avoid similar nasty hard-to-detect bugs in the future (Avi
> says vmx would crash hard in such a condition, svm is much simpler and
> it somewhat survives the lack of sched_in and only crashes the guest
> due to a non-monotonic tsc):
>
> Signed-off-by: Andrea Arcangeli <andrea-l3A5Bk7waGM@public.gmane.org>
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ac876ec..26372fa 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -742,6 +742,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  
>  void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>  {
> +	WARN_ON(vcpu->cpu != smp_processor_id());
>  	kvm_x86_ops->vcpu_put(vcpu);
>  	kvm_put_guest_fpu(vcpu);
>  }
>
>
>
> So trying to understand why the ->cpu was wrong, I looked into the
> preempt notifiers emulation, and it looked quite fragile without a
> real sched_in hook. I figured out I could provide a real sched_in hook
> by loading the proper values in the
> tsk->thread.debugreg[0/7]. Initially I got the hooking points out of
> objdump -d vmlinux, but Avi preferred no dependency on the vmlinux and
> he suggested to try to find the sched_in hook in the stack. So that's
> what I implemented now and this should provide real robustness to the
> out of tree module compiled against binary kernel images with
> CONFIG_KVM=n. I tried to be compatible with all kernels down to 2.6.5
> but only 2.6.2x host is tested and only on 64bit and only on SVM (no
> vmx system around here at all).
>
> This fixes my longstanding KVM instability and "-smp 2" now works
> flawlessly with svm too! -smp 2 -snapshot crashes in qemu userland but
> that's not kernel related, must be some thread mutex lock recursion or
> lock inversion in the qcow cow code. Removing -snapshot makes -smp 2
> stable. Multiple guests, UP and SMP, seem stable too.
>   
You mean that without -snapshot, userspace does not hang at the
sigwait() in the qcow code?
> To reproduce my crash easily without waiting ages for the two tsc to
> deviate with an error larger than the number of cycles it takes for a
> CPU migration, run write_tsc(0,0) in kernel mode (like in the svm.c
> init function), then insmod kvm-amd; rmmod kvm-amd, and then remove
> write_tsc and recompile kvm-amd.
>
>   
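
To make the failure mode above concrete, here is a minimal user-space
sketch of it. It is only a model under stated assumptions, not the kvm
code: struct sim_vcpu, the sim_* helpers and the TSC values are all
hypothetical names and numbers picked for illustration. It assumes the
guest sees host TSC plus a per-vcpu offset, and that the offset has to
be rebased when the vcpu is loaded on a different CPU.

```c
#include <assert.h>
#include <stdint.h>

#define NR_SIM_CPUS 2

/* Hypothetical user-space model; these are not the KVM structures. */
struct sim_vcpu {
	int cpu;                 /* host CPU the offset was last computed for */
	int64_t tsc_offset;      /* guest TSC = host TSC + tsc_offset */
	uint64_t last_guest_tsc; /* guest TSC recorded at the last put */
};

/* Guest-visible TSC when the vcpu actually runs on running_cpu. */
static uint64_t sim_guest_tsc(const struct sim_vcpu *v,
			      const uint64_t host_tsc[NR_SIM_CPUS],
			      int running_cpu)
{
	return host_tsc[running_cpu] + (uint64_t)v->tsc_offset;
}

/* sched_out side: remember where the guest clock stopped. */
static void sim_vcpu_put(struct sim_vcpu *v,
			 const uint64_t host_tsc[NR_SIM_CPUS])
{
	v->last_guest_tsc = sim_guest_tsc(v, host_tsc, v->cpu);
}

/* sched_in side: after a migration, rebase the offset on the new CPU. */
static void sim_vcpu_load(struct sim_vcpu *v,
			  const uint64_t host_tsc[NR_SIM_CPUS], int cpu)
{
	assert(cpu >= 0 && cpu < NR_SIM_CPUS);
	if (cpu != v->cpu) {
		v->tsc_offset = (int64_t)(v->last_guest_tsc - host_tsc[cpu]);
		v->cpu = cpu;
	}
}

/*
 * Migrate the vcpu from CPU 0 to CPU 1, whose TSC lags behind CPU 0.
 * Returns 1 if the guest TSC stayed monotonic, 0 if it went backwards.
 */
static int sim_migrate_is_monotonic(int with_sched_in_hook)
{
	/* Assumed values: CPU 1 is 400000 cycles behind CPU 0. */
	uint64_t host_tsc[NR_SIM_CPUS] = { 1000000, 600000 };
	struct sim_vcpu v = { .cpu = 0, .tsc_offset = -500000,
			      .last_guest_tsc = 0 };
	uint64_t before, after;

	before = sim_guest_tsc(&v, host_tsc, 0);
	sim_vcpu_put(&v, host_tsc);

	/* Time passes on both CPUs while the task is scheduled out. */
	host_tsc[0] += 50;
	host_tsc[1] += 50;

	if (with_sched_in_hook)
		sim_vcpu_load(&v, host_tsc, 1);
	/* Without the hook the stale offset is applied to CPU 1's TSC. */

	after = sim_guest_tsc(&v, host_tsc, 1);
	return after >= before;
}
```

Calling sim_migrate_is_monotonic(1) models a migration where the
sched_in side runs, and sim_migrate_is_monotonic(0) models the missed
sched_in: the guest then reads a TSC lower than the one it saw before
being scheduled out, which is the kind of backwards jump the guest tsc
clocksource does not tolerate.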

