From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:42972) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1T61DI-0008HR-DN for qemu-devel@nongnu.org; Mon, 27 Aug 2012 11:26:58 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1T61DB-0006rn-CN for qemu-devel@nongnu.org; Mon, 27 Aug 2012 11:26:52 -0400 Received: from david.siemens.de ([192.35.17.14]:25924) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1T61DB-0006rh-2Q for qemu-devel@nongnu.org; Mon, 27 Aug 2012 11:26:45 -0400 Message-ID: <503B91A9.7010507@siemens.com> Date: Mon, 27 Aug 2012 17:26:33 +0200 From: Jan Kiszka MIME-Version: 1.0 References: <1345801763-24227-1-git-send-email-qemulist@gmail.com> <1345801763-24227-11-git-send-email-qemulist@gmail.com> <874nnoh3pg.fsf@codemonkey.ws> <503B8C1E.6090800@siemens.com> <87pq6cfjtm.fsf@codemonkey.ws> In-Reply-To: <87pq6cfjtm.fsf@codemonkey.ws> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Anthony Liguori Cc: Paolo Bonzini , Avi Kivity , Liu Ping Fan , Liu Ping Fan , "qemu-devel@nongnu.org" On 2012-08-27 17:14, Anthony Liguori wrote: > Jan Kiszka writes: > >> On 2012-08-27 15:19, Anthony Liguori wrote: >>> Liu Ping Fan writes: >>> >>>> From: Liu Ping Fan >>>> >>>> Scene: >>>> obja lies in objA, when objA's ref->0, it will be freed, >>>> but at that time obja can still be in use. >>>> >>>> The real example is: >>>> typedef struct PCIIDEState { >>>> PCIDevice dev; >>>> IDEBus bus[2]; --> create in place >>>> ..... >>>> } >>>> >>>> When without big lock protection for mmio-dispatch, we will hold >>>> obj's refcnt. So memory_region_init_io() will replace the third para >>>> "void *opaque" with "Object *obj". >>>> With this patch, we can protect PCIIDEState from disappearing during >>>> mmio-dispatch hold the IDEBus->ref. >>>> >>>> And the ref circle has been broken when calling qdev_delete_subtree(). >>>> >>>> Signed-off-by: Liu Ping Fan >>> >>> I think this is solving the wrong problem. There are many, many >>> dependencies a device may have on other devices. Memory allocation >>> isn't the only one. >>> >>> The problem is that we want to make sure that a device doesn't "go away" >>> while an MMIO dispatch is happening. This is easy to solve without >>> touching referencing counting. >>> >>> The device will hold a lock while the MMIO is being dispatched. The >>> delete path simply needs to acquire that same lock. This will ensure >>> that a delete operation cannot finish while MMIO is still in flight. >> >> That's a bit too simple. Quite a few MMIO/PIO fast-paths will work >> without any device-specific locking, e.g. just to read a simple register >> value. So we will need reference counting > > But then you'll need to acquire a lock to take the reference/remove the > reference which sort of defeats the purpose of trying to fast path. Atomic ops? RCU? This problem won't be solved for the first time. > >> (for devices using private >> locks), but on the "front-line" object: the memory region. That region >> will block its owner from disappearing by waiting on dispatch when >> someone tries to unregister it. >> >> Also note that "holding a lock" is easily said but will be more tricky >> in practice. Quite a significant share of our code will continue to run >> under BQL, even for devices with their own locks. Init/cleanup functions >> will likely fall into this category, > > I'm not sure I'm convinced of this--but it's hard to tell until we > really start converting. > > BTW, I'm pretty sure we have to tackle main loop functions first before > we try to convert any devices off the BQL. I'm sure we should leave existing code alone wherever possible, focusing on providing alternative versions for those paths that matter. Example: Most timers are fine under BQL. But some sensitive devices (RTC or HPET as clock source) will want their own timers. So the approach is to instantiate a separate, also prioritizeable instance of the timer subsystem for them and be done. We won't convert QEMU in a day, but we surely want results before the last corner is refactored (which would take years, at best). Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux