From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <503E50F8.5000405@redhat.com>
Date: Wed, 29 Aug 2012 10:27:20 -0700
From: Avi Kivity <avi@redhat.com>
MIME-Version: 1.0
References: <1345801763-24227-1-git-send-email-qemulist@gmail.com>
	<1345801763-24227-11-git-send-email-qemulist@gmail.com>
	<503792F1.4090109@redhat.com>
	<CAJnKYQk2zDFiCmr3Qo=Cj4B21oWd0FoC8c2N3OG3ujm=pMoD9Q@mail.gmail.com>
	<503B1B4B.6050108@redhat.com> <503B260E.70607@web.de>
	<503BA9BC.5010207@redhat.com> <503BAAF0.2020103@siemens.com>
	<503BB7E7.4050709@redhat.com> <503BB9C5.3030605@siemens.com>
	<503BBA77.4090006@redhat.com> <503BBED4.9050705@siemens.com>
	<503BC1EE.4060608@redhat.com> <503BCCA1.10403@siemens.com>
	<503BDE63.7090602@redhat.com> <503C185B.4030401@web.de>
	<503E4DBC.6080802@redhat.com> <503E4F8C.5070901@siemens.com>
In-Reply-To: <503E4F8C.5070901@siemens.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life
	cycle problem
List-Id: <qemu-devel.nongnu.org>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>, Liu Ping Fan <pingfank@linux.vnet.ibm.com>, liu ping fan <qemulist@gmail.com>, Anthony Liguori <anthony@codemonkey.ws>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>

On 08/29/2012 10:21 AM, Jan Kiszka wrote:
> On 2012-08-29 19:13, Avi Kivity wrote:
> > On 08/27/2012 06:01 PM, Jan Kiszka wrote:
> >> On 2012-08-27 22:53, Avi Kivity wrote:
> >>> On 08/27/2012 12:38 PM, Jan Kiszka wrote:
> >>>>>> Even worse, apply
> >>>>>> restrictions on how the dispatched objects, the regions, have to be
> >>>>>> treated because of this.
> >>>>>
> >>>>> Please elaborate.
> >>>>
> >>>> The fact that you can't manipulate a memory region object arbitrarily
> >>>> after removing it from the mapping because you track the references to
> >>>> the object that the region points to, not the region itself. The region
> >>>> remains in use by the dispatching layer and potentially the called
> >>>> device, even after deregistration.
> >>>
> >>> That object will be a container_of() the region, usually literally but
> >>> sometimes only in spirit.  Reference counting the regions means they
> >>> cannot be embedded into other objects any more.
> >>
> >> I cannot follow the logic of the last sentence. Reference counting just
> >> means that we track if there are users left, not necessarily that we
> >> perform asynchronous releases. We can simply wait for those users to
> >> complete.
> > 
> > I don't see how.  Suppose you add a reference count to MemoryRegion. 
> > How do you delay its containing object's destructor from running?  Do
> > you iterate over all member MemoryRegion and examine their reference counts?
>
> memory_region_transaction_commit (or calls that trigger this) will have
> to wait. That is required as the caller may start fiddling with the
> region right afterward.

However, the commit may itself be running within an mmio dispatch, in
which case it would be waiting on its own caller and so wait forever.
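
A minimal sketch of that deadlock, as a single-threaded model.  All
names here are hypothetical and none of this is QEMU code; it only
assumes that a waiting commit returns once the region has no users left.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-threaded model; none of these names are QEMU APIs. */
struct region {
    int users;                      /* dispatches currently inside this region */
};

/* Region whose handler the caller is running in, or NULL if none. */
static const struct region *current_dispatch;

static void dispatch_enter(struct region *r)
{
    r->users++;
    current_dispatch = r;
}

static void dispatch_exit(struct region *r)
{
    r->users--;
    current_dispatch = NULL;
}

/*
 * A waiting commit returns only once r->users drops to zero.  If the
 * caller is itself inside r's handler, it holds one of those users and
 * cannot drop it while it waits, so the wait never finishes.
 */
static bool commit_wait_would_deadlock(const struct region *r)
{
    return r->users > 0 && current_dispatch == r;
}
```

Outside a dispatch the check is false; between dispatch_enter() and
dispatch_exit() on the same region it is true, which is the case a
handler hits when it remaps its own region.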

Ignoring this, how does it wait?  sleep()?  Or wait for a completion?

> > 
> > Usually a reference count controls the lifetime of the reference counted
> > object, what are you suggesting here?
>
> To synchronize on reference count going to zero. 

Quite unorthodox.  Do you have real-life examples?
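
For contrast, a sketch of the orthodox pattern, where the count lives on
the containing object and the last put runs its destructor.  The struct
layout and function names are assumptions made for illustration, and the
counter is deliberately non-atomic to keep the example short.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types; not the QEMU MemoryRegion or DeviceState. */
struct region {
    int unused;
};

struct device {
    int refcount;
    bool released;                  /* stands in for running the destructor */
    struct region mmio;             /* region embedded in its owner */
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the owning device from a pointer to its embedded region. */
static struct device *region_owner(struct region *r)
{
    return container_of(r, struct device, mmio);
}

static void device_ref(struct device *d)
{
    d->refcount++;
}

/* The last reference dropped controls the lifetime of the whole object. */
static void device_unref(struct device *d)
{
    if (--d->refcount == 0) {
        d->released = true;
    }
}
```

A dispatch would take a reference through region_owner() before calling
the handler and drop it afterwards, so an unplug that drops the initial
reference in between does not release the device until the dispatch is
done.  Nobody waits on the count.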

> Or all readers leaving
> the read-side critical sections.

This is rcu.  But again, we may be inside an rcu read-side critical
section when we need to wait.  In fact, this is what I suggested in the
first place, until Marcelo pointed out the deadlock; that is why we came
up with the refcount.

>
> > 
> >>>
> >>> We can probably figure out a way to flush out accesses.  After switching
> >>> to rcu, for example, all we need is synchronize_rcu() in a
> >>> non-deadlocking place.  But my bet is that it will not be needed.
> >>
> >> If you properly flush out accesses, you don't need to track at device
> >> level anymore - and mess with abstraction layers. That's my whole point.
> > 
> > To flush out an access you need either rwlock_write_lock() or
> > synchronize_rcu() (depending on the implementation).  But neither of
> > these can be run from an rcu read-side critical section or
> > rwlock_read_lock().
> > 
> > You could defer the change to a bottom half, but if the hardware demands
> > that the change be complete before returning, that doesn't work.
>
> Right, therefore synchronous transactions.

I don't follow.  A synchronous transaction may run from within a
dispatch, where you can't call synchronize_rcu(), upgrade the lock, or
wait for the refcount.  That's the source of the problem!

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.