From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 11 Sep 2012 11:54:25 -0300
From: Marcelo Tosatti
Message-ID: <20120911145425.GB11792@amt.cnet>
References: <1347349912-15611-1-git-send-email-qemulist@gmail.com>
 <1347349912-15611-11-git-send-email-qemulist@gmail.com>
 <504EF7CC.6040900@redhat.com>
 <504F0B17.5090703@siemens.com>
 <504F2D97.3000705@redhat.com>
 <504F316B.1030500@redhat.com>
In-Reply-To: <504F316B.1030500@redhat.com>
Subject: Re: [Qemu-devel] [PATCH V3 10/11] vcpu: introduce lockmap
To: Avi Kivity
Cc: Stefan Hajnoczi, liu ping fan, "qemu-devel@nongnu.org",
 Anthony Liguori, Jan Kiszka, Paolo Bonzini

On Tue, Sep 11, 2012 at 03:41:15PM +0300, Avi Kivity wrote:
> On 09/11/2012 03:24 PM, Avi Kivity wrote:
> > On 09/11/2012 12:57 PM, Jan Kiszka wrote:
> >> On 2012-09-11 11:44, liu ping fan wrote:
> >>> On Tue, Sep 11, 2012 at 4:35 PM, Avi Kivity wrote:
> >>>> On 09/11/2012 10:51 AM, Liu Ping Fan wrote:
> >>>>> From: Liu Ping Fan
> >>>>>
> >>>>> The func call chain can suffer from recursively hold
> >>>>> qemu_mutex_lock_iothread. We introduce lockmap to record the
> >>>>> lock depth.
> >>>>
> >>>> What is the root cause? io handlers initiating I/O?
> >>>>
> >>> cpu_physical_memory_rw() can be called nested, and when called, it can
> >>> be protected by no-lock/device lock/big-lock.
> >>> I think without big lock, io-dispatcher should face the same issue.
> >>> As to main-loop, have not carefully consider, but at least, dma-helper
> >>> will call cpu_physical_memory_rw() with big-lock.
> >>
> >> That is our core problem: inconsistent invocation of existing services
> >> /wrt locking. For portio, I was lucky that there is no nesting and I was
> >> able to drop the big lock around all (x86) call sites. But MMIO is way
> >> more tricky due to DMA nesting.
> >
> > Maybe we need to switch to a continuation style. Instead of expecting
> > cpu_physical_memory_rw() to complete synchronously, it becomes an
> > asynchronous call and you provide it with a completion. That means
> > devices which use it are forced to drop the lock in between. Block and
> > network clients will be easy to convert since they already use APIs that
> > drop the lock (except for accessing the descriptors).
> >
> >> We could try to introduce a different version of cpu_physical_memory_rw,
> >> cpu_physical_memory_rw_unlocked. But the problem remains that an MMIO
> >> request can trigger the very same access in a nested fashion, and we
> >> will have to detect this to avoid locking up QEMU (locking up the guest
> >> might be OK).
> >
> > An async version of c_p_m_rw() will just cause a completion to bounce
> > around, consuming cpu but not deadlocking anything. If we can keep a
> > count of the bounces, we might be able to stall it indefinitely or at
> > least ratelimit it.
>
> Another option is to require all users of c_p_m_rw() and related to use
> a coroutine or thread. That makes the programming easier (but still
> required a revalidation after the dropped lock).

Unless I am misunderstanding this thread, the iothread flow section of
http://lists.gnu.org/archive/html/qemu-devel/2012-06/msg04315.html
contains possible solutions for this.
Basically, use trylock() and, if it fails, choose one of the bailout
options (more than one possibility is listed there).