Date: Tue, 17 Sep 2013 20:07:53 +0300
From: "Michael S. Tsirkin"
To: Paolo Bonzini
Cc: qemu-devel@nongnu.org
Message-ID: <20130917170752.GA20986@redhat.com>
In-Reply-To: <52388A3D.4090909@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 00/38] Delay destruction of memory regions to instance_finalize

On Tue, Sep 17, 2013 at 06:58:37PM +0200, Paolo Bonzini wrote:
> On 17/09/2013 18:29, Michael S. Tsirkin wrote:
> > > BTW, qemu_del_nic is another one that I forgot to mention. You could
> > > have MMIO that triggers a transmit while the device is going down, for
> > > example.
> >
> > Wait a second. This API simply does not make sense.
> > If a region is not visible, its MMIO really mustn't trigger,
> > exit or no exit. Disabling a region and still getting op callbacks
> > afterwards is not what any caller of this API expects.
> >
> > I'm not sure what to do about the bounce buffer thing,
> > but it needs to be fixed some other way, without
> > breaking the API.
>
> I don't think it's breaking the API. The very same thing can happen
> with RAM. The only difference is that MMIO calls ops.

We can argue about RAM, but getting a callback after disable is not
really sane.
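To make the expected contract concrete, here is a toy model in C. This
is not QEMU's actual dispatch code; ToyRegion, toy_dispatch_write and
nic_mmio_write are names made up for illustration. The point is simply
that a disabled region is not there, so its op callback is never
reached, exit or no exit:

/* Toy model of the contract under discussion -- NOT QEMU code. */
#include <inttypes.h>
#include <stdbool.h>
#include <stdio.h>

typedef struct ToyRegion {
    bool enabled;                  /* cleared when the region goes away */
    void (*write)(void *opaque, uint64_t addr, uint64_t val);
    void *opaque;
} ToyRegion;

static void toy_dispatch_write(ToyRegion *r, uint64_t addr, uint64_t val)
{
    if (!r->enabled) {
        return;       /* treated like an unassigned access: no callback */
    }
    r->write(r->opaque, addr, val);
}

static void nic_mmio_write(void *opaque, uint64_t addr, uint64_t val)
{
    printf("op callback: addr=0x%" PRIx64 " val=0x%" PRIx64 "\n",
           addr, val);
}

int main(void)
{
    ToyRegion r = { .enabled = true, .write = nic_mmio_write };

    toy_dispatch_write(&r, 0x10, 1);   /* callback fires                */
    r.enabled = false;                 /* region disabled/removed       */
    toy_dispatch_write(&r, 0x10, 2);   /* must be a no-op               */
    return 0;
}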
> Also, this problem is subject to race conditions from buggy or
> misbehaving guests. If you want to have any hope of breaking devices
> free of the BQL and do "simple" register I/O without taking a lock,
> there is simply no precise moment to stop MMIO at.

I don't see why disabling the MR can't flush whatever is outstanding.

> All these problems do not happen in real hardware because real hardware
> has buffers between the PHY and DMA circuitry, and because bus-master
> transactions transfer a few bytes at a time (for example, in PCI, even
> when a device does burst transactions, the other party can halt them at
> that small a granularity). A device can be quiesced in a matter of
> microseconds, and other times (the time for the OS to react to hotplug
> requests, the time for the driver to shut down, the time for the human
> to physically unplug the connector) can be several orders of magnitude
> larger.

They don't happen on real hardware because once you disable memory in a
PCI device, it does not accept memory transactions.

> Instead we have the opposite scenario, because we want to have as few
> buffers as possible and map large amounts of memory (even the 4K used
> by the bounce buffer is comparatively large) for the host OS's benefit.
> When we do so, and the host backend fails (e.g. a disk is on NFS and
> there is a networking problem), memory can remain mapped for a long
> time.

I don't see why this is a problem. So disabling memory will take a long
time. Who cares? It's not the data path.

> DMA-to-MMIO may be a theoretical problem only, but if we don't cover it
> we have a bogus solution to the problem, because exactly the same thing
> can and will happen for memory hot-unplug.
>
> Paolo

We need to cover it without breaking APIs.

After memory_region_del_subregion returns, it's a promise that there
will not be accesses to the region.

So I'm not even sure we really need to move destroy to finalize
anymore ...
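Concretely, the semantics I have in mind look something like the toy
sketch below. Again, this is not the real memory core: ToyRegion,
toy_access_begin/end and toy_del_subregion are invented names. Removal
first makes the region invisible to new accesses, then drains the
in-flight ones, so that by the time it returns the promise holds:

/* Toy sketch of "removal flushes outstanding accesses" -- NOT QEMU code. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct ToyRegion {
    atomic_bool enabled;
    atomic_int in_flight;          /* accesses currently inside ops     */
} ToyRegion;

static bool toy_access_begin(ToyRegion *r)
{
    atomic_fetch_add(&r->in_flight, 1);
    if (!atomic_load(&r->enabled)) {
        atomic_fetch_sub(&r->in_flight, 1);  /* raced with removal      */
        return false;
    }
    return true;                   /* safe to invoke the region's ops   */
}

static void toy_access_end(ToyRegion *r)
{
    atomic_fetch_sub(&r->in_flight, 1);
}

static void toy_del_subregion(ToyRegion *r)
{
    atomic_store(&r->enabled, false);        /* no new accesses         */
    while (atomic_load(&r->in_flight) > 0) {
        /* flush: in real life, complete or cancel the outstanding
         * bounce-buffer map instead of spinning */
    }
}

int main(void)
{
    ToyRegion r = { true, 0 };

    if (toy_access_begin(&r)) {
        /* ... op callback runs here ... */
        toy_access_end(&r);
    }
    toy_del_subregion(&r);

    bool ok = toy_access_begin(&r);
    assert(!ok);                   /* no callbacks after removal        */
    return 0;
}

--
MST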