From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1N206s-0008B2-7T
	for qemu-devel@nongnu.org; Sun, 25 Oct 2009 06:14:02 -0400
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1N206l-0008AN-TG
	for qemu-devel@nongnu.org; Sun, 25 Oct 2009 06:14:00 -0400
Received: from [199.232.76.173] (port=36728 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1N206l-0008AK-MC
	for qemu-devel@nongnu.org; Sun, 25 Oct 2009 06:13:55 -0400
Received: from mx1.redhat.com ([209.132.183.28]:13547)
	by monty-python.gnu.org with esmtp (Exim 4.60)
	(envelope-from <mst@redhat.com>) id 1N206l-0000IT-95
	for qemu-devel@nongnu.org; Sun, 25 Oct 2009 06:13:55 -0400
Date: Sun, 25 Oct 2009 12:11:34 +0200
From: "Michael S. Tsirkin" <mst@redhat.com>
Message-ID: <20091025101134.GD9270@redhat.com>
References: <20091022120015.GA28836@redhat.com>
	<20091022205727.GA23092@amt.cnet>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20091022205727.GA23092@amt.cnet>
Subject: [Qemu-devel] Re: qemu-kvm: sigsegv at exit
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org

On Thu, Oct 22, 2009 at 06:57:27PM -0200, Marcelo Tosatti wrote:
> On Thu, Oct 22, 2009 at 02:00:15PM +0200, Michael S. Tsirkin wrote:
> > Hi!
> > I'm sometimes getting segfaults when I kill qemu.
> > This time I caught it when qemu was under gdb:
> > 
> > 
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x411d0940 (LWP 14446)]
> > 0x000000000040afb4 in qemu_mod_timer (ts=0x19f0fd0, expire_time=62275467335)
> >     at /home/mst/scm/qemu-kvm/vl.c:1009
> > 1009            if ((alarm_timer->flags & ALARM_FLAG_EXPIRED) == 0) {
> > (gdb) l
> > 1004        ts->next = *pt;
> > 1005        *pt = ts;
> > 1006
> > 1007        /* Rearm if necessary  */
> > 1008        if (pt == &active_timers[ts->clock->type]) {
> > 1009            if ((alarm_timer->flags & ALARM_FLAG_EXPIRED) == 0) {
> > 1010                qemu_rearm_alarm_timer(alarm_timer);
> > 1011            }
> > 1012            /* Interrupt execution to force deadline recalculation.  */
> > 1013            if (use_icount)
> > (gdb) p alarm_timer
> > $1 = (struct qemu_alarm_timer *) 0x0
> > (gdb) where
> > #0  0x000000000040afb4 in qemu_mod_timer (ts=0x19f0fd0, expire_time=62275467335)
> >     at /home/mst/scm/qemu-kvm/vl.c:1009
> > #1  0x000000000041aadf in virtio_net_handle_tx (vdev=<value optimized out>, vq=0x19f5af0)
> >     at /home/mst/scm/qemu-kvm/hw/virtio-net.c:696
> > #2  0x0000000000421669 in kvm_run (vcpu=0x19d46a0, env=0x19c2250) at /home/mst/scm/qemu-kvm/qemu-kvm.c:797
> > #3  0x00000000004216d6 in kvm_cpu_exec (env=0x83d0f8) at /home/mst/scm/qemu-kvm/qemu-kvm.c:1714
> > #4  0x0000000000422981 in ap_main_loop (_env=<value optimized out>) at /home/mst/scm/qemu-kvm/qemu-kvm.c:1969
> > #5  0x000000377dc06367 in start_thread () from /lib64/libpthread.so.0
> > #6  0x000000377d0d30ad in clone () from /lib64/libc.so.6
> > (gdb)
> > 
> > So this probably means that we have already run quit_timers:
> > 
> > static void quit_timers(void)
> > {
> >     alarm_timer->stop(alarm_timer);
> >     alarm_timer = NULL;
> > }
> > 
> > but kvm vcpu thread is still running.
> > 
> > 
> > Not sure what the right fix is here: should we stop
> > kvm after main loop has exited?
> 
> kvm_main_loop_wait(env, 0) can process the stop request (signalling
> iothread that vcpu is stopped, so its OK to exit) and continue to
> kvm_cpu_exec.
> 
> Can you please try this:

I applied this and have not yet seen any segfaults at exit.
I'm not sure whether this means anything, as the crash is not
100% reproducible. Push it out to Anthony and we'll see, long term?
Now that we know how to fix it,
how would you go about reproducing it?

> diff --git a/qemu-kvm.c b/qemu-kvm.c
> index 87ece3d..141c8b1 100644
> --- a/qemu-kvm.c
> +++ b/qemu-kvm.c
> @@ -1931,7 +1931,8 @@ static int kvm_main_loop_cpu(CPUState *env)
>          }
>          if (run_cpu) {
>              kvm_main_loop_wait(env, 0);
> -            kvm_cpu_exec(env);
> +            if (!is_cpu_stopped(env))
> +                kvm_cpu_exec(env);
>          } else {
>              kvm_main_loop_wait(env, 1000);
>          }
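
To reason about the ordering described above, here is a toy,
single-threaded model of one loop iteration. It is only a sketch: every
name in it (model_vcpu, model_wait, model_exec, model_iteration) is made
up for illustration and is not a QEMU symbol. It assumes that once the
vcpu acknowledges the stop request, the main thread is free to run
quit_timers(), which the model collapses into the wait step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real QEMU types. */
struct model_timer {
    int flags;
};

struct model_vcpu {
    bool stop_requested;        /* set by the "iothread" */
    bool stopped;               /* set once the vcpu acknowledges the stop */
    int unsafe_execs;           /* exec calls made after timer teardown */
    struct model_timer *alarm;  /* NULL after teardown, like alarm_timer */
};

static struct model_timer the_timer;

/*
 * Models kvm_main_loop_wait(): acknowledges a pending stop request.
 * Assumption: after the acknowledgement the main thread may tear down
 * the timers at any time; here that happens immediately.
 */
static void model_wait(struct model_vcpu *v)
{
    assert(v != NULL);
    if (v->stop_requested) {
        v->stopped = true;
        v->alarm = NULL;        /* stands in for quit_timers() */
    }
}

/*
 * Models kvm_cpu_exec() reaching qemu_mod_timer(). Instead of
 * dereferencing a NULL timer, the model counts the unsafe call.
 */
static void model_exec(struct model_vcpu *v)
{
    assert(v != NULL);
    if (v->alarm == NULL) {
        v->unsafe_execs++;
        return;
    }
    v->alarm->flags |= 1;
}

/* One loop iteration, with or without the check added by the patch. */
static void model_iteration(struct model_vcpu *v, bool guarded)
{
    model_wait(v);
    if (!guarded || !v->stopped)
        model_exec(v);
}
```

With guarded set to false, an iteration that sees a stop request still
calls model_exec() after the timer is gone. With guarded set to true,
the call is skipped, which is the behaviour the patch is after.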