From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=53934 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1Pwd7N-0001mk-R1
	for qemu-devel@nongnu.org; Mon, 07 Mar 2011 11:17:10 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1Pwd7M-00053k-3g
	for qemu-devel@nongnu.org; Mon, 07 Mar 2011 11:17:09 -0500
Received: from mx1.redhat.com ([209.132.183.28]:45023)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Pwd7L-000533-ND
	for qemu-devel@nongnu.org; Mon, 07 Mar 2011 11:17:08 -0500
Date: Mon, 7 Mar 2011 13:02:26 -0300
From: Marcelo Tosatti
Message-ID: <20110307160225.GA10021@amt.cnet>
References: <4D68F20D.2020401@web.de> <20110305163558.GA4607@amt.cnet>
	<20110306103059.GL3222@playa.tlv.redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20110306103059.GL3222@playa.tlv.redhat.com>
Subject: [Qemu-devel] Re: kvm crashes with spice while loading qxl
List-Id: qemu-devel.nongnu.org
To: Jan Kiszka, xming, Gerd Hoffmann, kvm@vger.kernel.org, qemu-devel,
	Paolo Bonzini, Avi Kivity

On Sun, Mar 06, 2011 at 12:30:59PM +0200, Alon Levy wrote:
> On Sat, Mar 05, 2011 at 01:35:58PM -0300, Marcelo Tosatti wrote:
> > On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
> > > >     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1466
> > > > #12 0x00007ffff77bb944 in start_thread () from /lib/libpthread.so.0
> > > > #13 0x00007ffff5e491dd in clone () from /lib/libc.so.6
> > > > (gdb)
> > >
> > > That's a spice bug. In fact, there are a lot of
> > > qemu_mutex_lock/unlock_iothread calls in that subsystem. I bet at
> > > least a few of them can cause even more subtle problems.
> > >
> > > Two general issues with dropping the global mutex like this:
> > > - The caller of mutex_unlock is responsible for maintaining
> > >   cpu_single_env across the unlocked phase (that's related to the
> > >   abort above).
> > > - Dropping the lock in the middle of a callback is risky. That may
> > >   enable re-entrances of code sections that weren't designed for this
> > >   (I'm skeptical about the side effects of
> > >   qemu_spice_vm_change_state_handler - why drop the lock here?).
> > >
> > > Spice requires a careful review regarding such issues. Or it should
> > > pioneer introducing its own lock so that we can handle at least
> > > related I/O activities over the VCPUs without holding the global
> > > mutex (but I bet it's not the simplest candidate for such a new
> > > scheme).
> > >
> > > Jan
> >
> > Agree with the concern regarding spice.
> >
> > What are the pros and cons of (re)introducing a spice-specific lock?
>
> + simplicity. Only spice touches the spice lock.
> - ? what were the original reasons for Gerd dropping the spice lock?
>
> I have no problem reintroducing this lock, I'm just concerned that it's
> wasted effort, because after I send that patch someone will jump in and
> remind me why it was removed in the first place.

Well, I can't comment on why it was done or on the proper way to fix it.
The point is that dropping the global lock requires careful review to
verify safety, as Jan mentioned. For example, a potential problem would
be:

    vcpu context                     iothread context

    qxl pio write
      drop lock
                                     acquire lock
                                     1) change state
                                     drop lock
      acquire lock

1) could be device hotunplug, system reset, etc.