From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:48454) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UZLAa-0000E1-1r for qemu-devel@nongnu.org; Mon, 06 May 2013 09:09:33 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UZLAY-00062G-Kh for qemu-devel@nongnu.org; Mon, 06 May 2013 09:09:32 -0400 Received: from mx1.redhat.com ([209.132.183.28]:37882) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UZLAY-000629-Cj for qemu-devel@nongnu.org; Mon, 06 May 2013 09:09:30 -0400 Message-ID: <5187AB80.3040906@redhat.com> Date: Mon, 06 May 2013 15:09:20 +0200 From: Paolo Bonzini MIME-Version: 1.0 References: <1353808984-22368-1-git-send-email-qemulist@gmail.com> <51829B30.7020308@siemens.com> <51836FA8.2000501@siemens.com> <5184D942.5020506@redhat.com> <5184E601.9030407@web.de> <518764CD.2080602@redhat.com> <51876C86.5040301@siemens.com> <5187859C.5020305@redhat.com> <51878C4B.7050700@siemens.com> <51878CF0.1080009@redhat.com> <51878FCE.2020907@siemens.com> <518793F7.1010502@redhat.com> <51879674.70706@siemens.com> <51879844.1030101@redhat.com> <51879CDA.3000800@siemens.com> In-Reply-To: <51879CDA.3000800@siemens.com> Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v7 0/7] push mmio dispatch out of big lock List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Jan Kiszka Cc: qemu-devel , Liu Ping Fan Il 06/05/2013 14:06, Jan Kiszka ha scritto: > On 2013-05-06 13:47, Paolo Bonzini wrote: >> Il 06/05/2013 13:39, Jan Kiszka ha scritto: >>> On 2013-05-06 13:28, Paolo Bonzini wrote: >>>> Il 06/05/2013 13:11, Jan Kiszka ha scritto: >>>>> On 2013-05-06 12:58, Paolo Bonzini wrote: >>>>>> Il 06/05/2013 12:56, Jan Kiszka ha scritto: >>>>>>>> The problem is that even if I/O for a region is supposed to happen >>>>>>>> within the BQL, lookup can happen outside the BQL. Lookup will use the >>>>>>>> region even if it is just to discard it: >>>>>>>> >>>>>>>> VCPU thread (under BQL) device thread >>>>>>>> -------------------------------------------------------------------------------------- >>>>>>>> flatview_ref >>>>>>>> memory_region_find returns d->mr >>>>>>>> memory_region_ref(d->mr) /* nop */ >>>>>>>> qdev_free(d) >>>>>>>> object_unparent(d) >>>>>>>> unrealize(d) >>>>>>>> memory_region_del_subregion(d->mr) >>>>>>>> FlatView updated, d->mr not in the new view >>>>>>>> >>>>>>>> flatview_unref >>>>>>>> memory_region_unref(d->mr) >>>>>>>> object_unref(d) >>>>>>>> free(d) >>>>>>>> if (!d->mr->is_ram) { /* BAD! */ >>>>>>>> memory_region_unref(d->mr) /* nop */ >>>>>>>> return error >>>>>>>> } >>>>>>>> >>>>>>>> >>>>>>>> Here, the memory region is dereferenced *before* we know that it is BQL-free >>>>>>>> (in fact, exactly to ascertain whether it is BQL-free). >>>>>>> >>>>>>> Both flatview update and lookup *plus* locking type evaluation (i.e. >>>>>>> memory region dereferencing) always happen under the address space lock. >>>>>>> See Pingfan's patch. >>>>>> >>>>>> That's true of address_space_rw/map, but I don't think it holds for >>>>>> memory_region_find. >>>>> >>>>> It has to, or it would be broken: Either it is called on a region that >>>>> supports reference counting >>>> >>>> You cannot know that in advance, can you? The address is decided by the >>>> guest. >>> >>> Need to help me again to get the context: In which case is this a >>> hot-path that we want to keep BQL-free? Current users of >>> memory_region_find appear to be all relatively slow paths, thus are fine >>> with staying under BQL. >> >> virtio-blk-dataplane is basically redoing memory_region_find with a >> separate data structure, exactly so that it can run outside the BQL >> before we get BQL-free MMIO dispatch. >> >> I can try to post patches later today that actually use >> memory_region_find instead. > > We could define its semantics as follows: return a reference to the > corresponding memory region, provide this is safe. A reference is safe when > - the region supports BQL-free operation (thus provides an owner to > apply reference counting on) This doesn't really work. Regions that are known not to disappear (most importantly, the main RAM region) also support BQL-free operation, but have no owner right now. Also, memory_region_find cannot know if it's returning a valid result, and the callee cannot check it because the region may have disappeared already when it is returned. But I really would be surprised if adding an owner everywhere is so hard... let's try that first, it would solve the problem. > - the caller holds the BQL (check via qemu_mutex_iothread_is_locked() > - to be implemented) > > The latter implies that the BQL is not dropped before returning the > reference, but that's nothing memory_region_find can enforce. Paolo