From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=53934 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1Pwd7N-0001mk-R1
	for qemu-devel@nongnu.org; Mon, 07 Mar 2011 11:17:10 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1Pwd7M-00053k-3g
	for qemu-devel@nongnu.org; Mon, 07 Mar 2011 11:17:09 -0500
Received: from mx1.redhat.com ([209.132.183.28]:45023)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Pwd7L-000533-ND
	for qemu-devel@nongnu.org; Mon, 07 Mar 2011 11:17:08 -0500
Date: Mon, 7 Mar 2011 13:02:26 -0300
From: Marcelo Tosatti
Message-ID: <20110307160225.GA10021@amt.cnet>
References: <4D68F20D.2020401@web.de> <20110305163558.GA4607@amt.cnet>
	<20110306103059.GL3222@playa.tlv.redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20110306103059.GL3222@playa.tlv.redhat.com>
Subject: [Qemu-devel] Re: kvm crashes with spice while loading qxl
List-Id: qemu-devel.nongnu.org
To: Jan Kiszka, xming, Gerd Hoffmann, kvm@vger.kernel.org, qemu-devel,
	Paolo Bonzini, Avi Kivity

On Sun, Mar 06, 2011 at 12:30:59PM +0200, Alon Levy wrote:
> On Sat, Mar 05, 2011 at 01:35:58PM -0300, Marcelo Tosatti wrote:
> > On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
> > > >     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1466
> > > > #12 0x00007ffff77bb944 in start_thread () from /lib/libpthread.so.0
> > > > #13 0x00007ffff5e491dd in clone () from /lib/libc.so.6
> > > > (gdb)
> > >
> > > That's a spice bug. In fact, there are a lot of
> > > qemu_mutex_lock/unlock_iothread calls in that subsystem. I bet at
> > > least a few of them can cause even more subtle problems.
> > >
> > > Two general issues with dropping the global mutex like this:
> > > - The caller of mutex_unlock is responsible for maintaining
> > >   cpu_single_env across the unlocked phase (that's related to the
> > >   abort above).
> > > - Dropping the lock in the middle of a callback is risky. That may
> > >   enable re-entrances of code sections that weren't designed for this
> > >   (I'm skeptical about the side effects of
> > >   qemu_spice_vm_change_state_handler - why drop the lock here?).
> > >
> > > Spice requires a careful review regarding such issues. Or it should
> > > pioneer introducing its own lock so that we can handle at least
> > > related I/O activities over the VCPUs without holding the global
> > > mutex (but I bet it's not the simplest candidate for such a new
> > > scheme).
> > >
> > > Jan
> >
> > Agree with the concern regarding spice.
> >
> > What are the pros and cons of (re)introducing a spice-specific lock?
>
> + simplicity. Only spice touches the spice lock.
> - ? what were the original reasons for Gerd dropping the spice lock?
>
> I have no problem reintroducing this lock, I'm just concerned that it's
> wasted effort, because after I send that patch someone will jump in and
> remind me why it was removed in the first place.

Well, I can't comment on why it was done or on the proper way to fix it.
The point is that dropping the global lock requires careful review to
verify safety, as Jan mentioned. For example, a potential problem would
be:

    vcpu context                     iothread context

    qxl pio write
      drop lock
                                     acquire lock
                                     1) change state
                                     drop lock
      acquire lock

1) could be device hotunplug, system reset, etc.