From: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
To: Andrea Arcangeli <andrea-Vyt77T80VFVWk0Htik3J/w@public.gmane.org>
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
Subject: Re: external module sched_in event
Date: Fri, 21 Dec 2007 19:52:52 +0200
Message-ID: <476BFD74.2040509@qumranet.com>
In-Reply-To: <20071221174048.GB1292-lysg2Xt5kKMAvxtiuMwx3w@public.gmane.org>

Andrea Arcangeli wrote:
> Hello,
>
> [ I already sent it once as andrea-l3A5Bk7waGM@public.gmane.org but it didn't go through
>   for whatever reason, trying again from private email, hope there
>   won't be dups ]
>   
Oh, it did reach the list. Don't trust the SourceForge site (in case
you did) for this list's mail; gmane is much better...
> My worst longstanding problem with KVM is that as the uptime of my
> host system increased, my opensuse guest images started to destabilize
> and lock up at boot. The weird thing was that fresh after boot
> everything was always perfectly ok, so I thought it was rmmod/insmod
> or some other sticky effect on the CPU after restarting the guest a
> few times that triggered the crash. Furthermore if I loaded the cpu a
> lot (like with a while :; do true;done), the crash would magically
> disappear. Decreasing cpu frequency and timings didn't help. Debugging
> wasn't trivial because it required a certain uptime and it didn't
> always crash.
>
> So once I debugged this more aggressively, I figured out KVM was ok;
> it was the guest that crashed in the tsc clocksource because the tsc
> wasn't monotone. The guest was stuck in an infinite loop with irqs
> disabled. So I tried to pass "notsc" and that fixed the crash just fine.
>
> Initially I thought it was the tsc_offset logic being wrong but then I
> figured out that the vcpu_put/load wasn't always executed, this
> bugcheck triggers with current git and so I recommend to apply this to
> kvm.git to avoid similar nasty hard-to-detect bugs in the future (Avi
> says vmx would crash hard in such a condition, svm is much simpler and
> it somewhat survives the lack of sched_in and only crashes the guest
> due to not monotone tsc):
>
> Signed-off-by: Andrea Arcangeli <andrea-l3A5Bk7waGM@public.gmane.org>
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ac876ec..26372fa 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -742,6 +742,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  
>  void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>  {
> +	WARN_ON(vcpu->cpu != smp_processor_id());
>  	kvm_x86_ops->vcpu_put(vcpu);
>  	kvm_put_guest_fpu(vcpu);
>  }
>
>
>
> So trying to understand why the ->cpu was wrong, I looked into the
> preempt notifiers emulation, and it looked quite fragile without a
> real sched_in hook. I figured out I could provide a real sched_in hook
> by loading the proper values in the
> tsk->thread.debugreg[0/7]. Initially I got the hooking points out of
> objdump -d vmlinux, but Avi preferred no dependency on the vmlinux and
> he suggested to try to find the sched_in hook in the stack. So that's
> what I implemented now and this should provide real robustness to the
> out of tree module compiled against binary kernel images with
> CONFIG_KVM=n. I tried to be compatible with all kernels down to 2.6.5
> but only 2.6.2x host is tested and only on 64bit and only on SVM (no
> vmx system around here at all).
>
> This fixes my longstanding KVM instability and "-smp 2" now works
> flawlessly with svm too! -smp 2 -snapshot crashes in qemu userland but
> that's not kernel related, must be some thread mutex lock recursion or
> lock inversion in the qcow cow code. Removing -snapshot makes -smp 2
> stable. Multiple guests, UP and SMP, seem stable too.
>   
You mean that without -snapshot, userspace does not hang at the
sigwait() in the qcow code?
> To reproduce my crash easily without waiting ages for the two tsc to
> deviate with an error larger than the number of cycles it takes for a
> CPU migration, run write_tsc(0,0) in kernel mode (e.g. in the svm.c
> init function), then insmod kvm-amd; rmmod kvm-amd, and then remove
> write_tsc and recompile kvm-amd.
>
>   

