qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kiszka <jan.kiszka@siemens.com>
To: Avi Kivity <avi@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Liu Ping Fan <pingfank@linux.vnet.ibm.com>,
	liu ping fan <qemulist@gmail.com>,
	Anthony Liguori <anthony@codemonkey.ws>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem
Date: Wed, 29 Aug 2012 19:41:04 +0200	[thread overview]
Message-ID: <503E5430.2010806@siemens.com> (raw)
In-Reply-To: <503E50F8.5000405@redhat.com>

On 2012-08-29 19:27, Avi Kivity wrote:
> On 08/29/2012 10:21 AM, Jan Kiszka wrote:
>> On 2012-08-29 19:13, Avi Kivity wrote:
>>> On 08/27/2012 06:01 PM, Jan Kiszka wrote:
>>>> On 2012-08-27 22:53, Avi Kivity wrote:
>>>>> On 08/27/2012 12:38 PM, Jan Kiszka wrote:
>>>>>>>> Even worse, apply
>>>>>>>> restrictions on how the dispatched objects, the regions, have to be
>>>>>>>> treated because of this.
>>>>>>>
>>>>>>> Please elaborate.
>>>>>>
>>>>>> The fact that you can't manipulate a memory region object arbitrarily
>>>>>> after removing it from the mapping because you track the references to
>>>>>> the object that the region points to, not the region itself. The region
>>>>>> remains in use by the dispatching layer and potentially the called
>>>>>> device, even after deregistration.
>>>>>
>>>>> That object will be a container_of() the region, usually literally but
>>>>> sometimes only in spirit.  Reference counting the regions means they
>>>>> cannot be embedded into other objects any more.
>>>>
>>>> I cannot follow the logic of the last sentence. Reference counting just
>>>> means that we track if there are users left, not necessarily that we
>>>> perform asynchronous releases. We can simply wait for those users to
>>>> complete.
>>>
>>> I don't see how.  Suppose you add a reference count to MemoryRegion. 
>>> How do you delay its containing object's destructor from running?  Do
>>> you iterate over all member MemoryRegion and examine their reference counts?
>>
>> memory_region_transaction_commit (or calls that trigger this) will have
>> to wait. That is required as the caller may start fiddling with the
>> region right afterward.
> 
> However, it may be running within an mmio dispatch, so it may wait forever.

We won't be able to wait for devices that can acquire the same lock we
hold while waiting, of course. That's why a lock ordering device-lock ->
BQL (the one we hold over memory region changes) won't be allowed. I was
wrong on this before: We must not take course-grained locks while
holding fine-grained ones, only the other way around. Pretty obvious,
specifically for realtime / low-latency.

> 
> Ignoring this, how does it wait? sleep()? or wait for a completion?

Ideally via completion.

> 
>>>
>>> Usually a reference count controls the lifetime of the reference counted
>>> object, what are you suggesting here?
>>
>> To synchronize on reference count going to zero. 
> 
> Quite unorthodox.  Do you have real life examples?

synchronize_rcu.

> 
>> Or all readers leaving
>> the read-side critical sections.
> 
> This is rcu.  But again, we may be in an rcu read-side critical section
> while needing to wait.  In fact this was what I suggested in the first
> place, until Marcelo pointed out the deadlock, so we came up with the
> refcount.

We must avoid that deadlock scenario. I bended my brain around it, and I
think that is the only answer. We can if we apply clear rules regarding
locking and BQL-protected services to those devices that will actually
use fine-grained locking, and there only for those paths that take
per-device locks. I'm pretty sure we won't get far with an
"all-or-nothing" model where every device uses a private lock.

> 
>>
>>>
>>>>>
>>>>> We can probably figure out a way to flush out accesses.  After switching
>>>>> to rcu, for example, all we need is synchronize_rcu() in a
>>>>> non-deadlocking place.  But my bet is that it will not be needed.
>>>>
>>>> If you properly flush out accesses, you don't need to track at device
>>>> level anymore - and mess with abstraction layers. That's my whole point.
>>>
>>> To flush out an access you need either rwlock_write_lock() or
>>> synchronize_rcu() (depending on the implementation).  But neither of
>>> these can be run from an rcu read-side critical section or
>>> rwlock_read_lock().
>>>
>>> You could defer the change to a bottom half, but if the hardware demands
>>> that the change be complete before returning, that doesn't work.
>>
>> Right, therefore synchronous transactions.
> 
> I don't follow.  Synchronous transactions mean you can't
> synchronize_rcu() or upgrade the lock or wait for the refcount.  That's
> the source of the problem!

Our current device models assume synchronous transactions on the memory
layer, actually on all layers. The proposals I saw so far try to change
this. But I'm very skeptical that this would succeed, due to the
conversion effort and due to the fact that it would give us a tricky  to
use API.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

  reply	other threads:[~2012-08-29 17:41 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-08-24  9:49 [Qemu-devel] [PATCH 0/10] rework on hot unplug Liu Ping Fan
2012-08-24  9:49 ` [Qemu-devel] [PATCH 01/10] qom: add, remove of link property need to ref, unref its target Liu Ping Fan
2012-08-24 14:52   ` Paolo Bonzini
2012-08-24  9:49 ` [Qemu-devel] [PATCH 02/10] qdev: change iterator callback seq Liu Ping Fan
2012-08-24  9:49 ` [Qemu-devel] [PATCH 03/10] qom: export object_property_is_child, object_property_is_link Liu Ping Fan
2012-08-24 14:51   ` Paolo Bonzini
2012-08-25  7:43     ` liu ping fan
2012-08-25  8:04   ` Blue Swirl
2012-08-24  9:49 ` [Qemu-devel] [PATCH 04/10] qdev: introduce new interface to remove composite sub-tree Liu Ping Fan
2012-08-24  9:49 ` [Qemu-devel] [PATCH 05/10] qdev: finalize of qbus, qdev will not the right place to free children Liu Ping Fan
2012-08-24 14:50   ` Paolo Bonzini
2012-08-24  9:49 ` [Qemu-devel] [PATCH 06/10] qom: expose object_property_del_child Liu Ping Fan
2012-08-24 14:44   ` Paolo Bonzini
2012-08-24  9:49 ` [Qemu-devel] [PATCH 07/10] unplug: using new intf qdev_delete_subtree in acpi_piix_eject_slot Liu Ping Fan
2012-08-24 10:24   ` Paolo Bonzini
2012-08-25  7:05     ` liu ping fan
2012-08-24  9:49 ` [Qemu-devel] [PATCH 08/10] qdev: rename qdev_unplug to qdev_unplug_req Liu Ping Fan
2012-08-24 14:48   ` Paolo Bonzini
2012-08-24  9:49 ` [Qemu-devel] [PATCH 09/10] mon: release dev's ref hold by qdev_get_peripheral Liu Ping Fan
2012-08-24  9:49 ` [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem Liu Ping Fan
2012-08-24 14:42   ` Paolo Bonzini
2012-08-25  7:42     ` liu ping fan
2012-08-27  7:01       ` Paolo Bonzini
2012-08-27  7:47         ` Jan Kiszka
2012-08-27  8:17           ` liu ping fan
2012-08-27  8:27             ` Jan Kiszka
2012-08-27 17:09           ` Avi Kivity
2012-08-27 17:14             ` Jan Kiszka
2012-08-27 18:09               ` Avi Kivity
2012-08-27 18:17                 ` Jan Kiszka
2012-08-27 18:20                   ` Avi Kivity
2012-08-27 18:39                     ` Jan Kiszka
2012-08-27 18:52                       ` Avi Kivity
2012-08-27 19:38                         ` Jan Kiszka
2012-08-27 20:53                           ` Avi Kivity
2012-08-28  1:01                             ` Jan Kiszka
2012-08-29 17:13                               ` Avi Kivity
2012-08-29 17:21                                 ` Jan Kiszka
2012-08-29 17:27                                   ` Avi Kivity
2012-08-29 17:41                                     ` Jan Kiszka [this message]
2012-09-03  9:09                                       ` Avi Kivity
2012-08-28  3:09                           ` liu ping fan
2012-08-28  3:38                             ` liu ping fan
2012-08-28  9:42                               ` Jan Kiszka
2012-08-28 10:05                                 ` Paolo Bonzini
2012-08-29 17:23                                 ` Avi Kivity
2012-08-29 17:30                                   ` Jan Kiszka
2012-08-29 17:40                                     ` Avi Kivity
2012-08-29 17:49                                       ` Jan Kiszka
2012-09-01  8:31                                         ` Avi Kivity
2012-09-01  8:57                                           ` Jan Kiszka
2012-09-01  9:30                                             ` Avi Kivity
2012-08-30  5:54                                       ` liu ping fan
2012-08-30  7:08                                         ` Jan Kiszka
2012-08-30  7:47                                           ` liu ping fan
2012-09-01  8:46                                           ` Avi Kivity
2012-09-03  7:44                                             ` liu ping fan
2012-09-03  8:52                                               ` Avi Kivity
2012-09-03 10:06                                                 ` liu ping fan
2012-09-03 10:16                                                   ` Avi Kivity
2012-09-04  2:33                                                     ` liu ping fan
2012-09-04  2:34                                                   ` liu ping fan
2012-09-05  8:19                                                     ` liu ping fan
2012-09-05  9:52                                                       ` Avi Kivity
2012-09-05 10:36                                                         ` Jan Kiszka
2012-09-05 10:53                                                           ` Avi Kivity
2012-09-05 11:11                                                             ` Jan Kiszka
2012-09-05 11:25                                                               ` Avi Kivity
2012-09-05 12:02                                                                 ` Jan Kiszka
2012-09-05 12:17                                                                   ` Avi Kivity
2012-08-27 13:19   ` Anthony Liguori
2012-08-27 15:02     ` Jan Kiszka
2012-08-27 15:14       ` Anthony Liguori
2012-08-27 15:26         ` Jan Kiszka
2012-08-27 16:24           ` Anthony Liguori
2012-08-27 16:59             ` Jan Kiszka
2012-08-27 18:35             ` Avi Kivity
2012-08-27 19:17               ` Anthony Liguori
2012-08-27 19:22                 ` Jan Kiszka
2012-08-27 20:58                 ` Avi Kivity
2012-08-27 21:34                   ` Paolo Bonzini
2012-08-27 18:27     ` Avi Kivity

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=503E5430.2010806@siemens.com \
    --to=jan.kiszka@siemens.com \
    --cc=anthony@codemonkey.ws \
    --cc=avi@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=pingfank@linux.vnet.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemulist@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).