From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:58981)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1aDTWa-0000sr-4D for qemu-devel@nongnu.org;
	Mon, 28 Dec 2015 03:51:29 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1aDTWW-0006BO-Tk for qemu-devel@nongnu.org;
	Mon, 28 Dec 2015 03:51:28 -0500
Received: from mx1.redhat.com ([209.132.183.28]:60668)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1aDTWW-0006Ar-Lx for qemu-devel@nongnu.org;
	Mon, 28 Dec 2015 03:51:24 -0500
Date: Mon, 28 Dec 2015 10:51:12 +0200
From: "Michael S. Tsirkin"
Message-ID: <20151228103332-mutt-send-email-mst@redhat.com>
References: <566961C1.6030000@gmail.com>
	<20151210114114.GE2570@work-vm>
	<56698E68.5040207@intel.com>
	<566D9320.8000209@intel.com>
	<567CEA53.5030601@intel.com>
	<20151227103345-mutt-send-email-mst@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Subject: Re: [Qemu-devel] live migration vs device assignment (motivation)
To: Alexander Duyck
Cc: Yang Zhang, "Tantilov, Emil S", kvm@vger.kernel.org, aik@ozlabs.ru,
	qemu-devel@nongnu.org, lcapitulino@redhat.com, Blue Swirl,
	kraxel@redhat.com, "Rustad, Mark D", quintela@redhat.com,
	"Skidmore, Donald C", Alexander Graf, Or Gerlitz,
	"Dr. David Alan Gilbert", Alex Williamson, Anthony Liguori,
	cornelia.huck@de.ibm.com, Lan Tianyu, Ard Biesheuvel,
	"Dong, Eddie", "Jani, Nrupal", amit.shah@redhat.com, Paolo Bonzini

On Sun, Dec 27, 2015 at 01:45:15PM -0800, Alexander Duyck wrote:
> On Sun, Dec 27, 2015 at 1:21 AM, Michael S. Tsirkin wrote:
> > On Fri, Dec 25, 2015 at 02:31:14PM -0800, Alexander Duyck wrote:
> >> The PCI hot-plug specification calls out that the OS can optionally
> >> implement a "pause" mechanism which is meant to be used for high
> >> availability type environments. What I am proposing is basically
> >> extending the standard SHPC capable PCI bridge so that we can support
> >> the DMA page dirtying for everything hosted on it, add a vendor
> >> specific block to the config space so that the guest can notify the
> >> host that it will do page dirtying, and add a mechanism to indicate
> >> that all hot-plug events during the warm-up phase of the migration are
> >> pause events instead of full removals.
> >
> > Two comments:
> >
> > 1. A vendor specific capability will always be problematic.
> > Better to register a capability id with pci sig.
> >
> > 2. There are actually several capabilities:
> >
> > A. support for memory dirtying
> > if not supported, we must stop device before migration
> >
> > This is supported by core guest OS code,
> > using patches similar to posted by you.
> >
> >
> > B. support for device replacement
> > This is a faster form of hotplug, where device is removed and
> > later another device using same driver is inserted in the same slot.
> >
> > This is a possible optimization, but I am convinced
> > (A) should be implemented independently of (B).
> >
>
> My thought on this was that we don't need much to really implement
> either feature. Really only a bit or two for either one. I had
> thought about extending the PCI Advanced Features, but for now it
> might make more sense to just implement it as a vendor capability for
> the QEMU based bridges instead of trying to make this a true PCI
> capability since I am not sure if this in any way would apply to
> physical hardware. The fact is the PCI Advanced Features capability
> is essentially just a vendor specific capability with a different ID

Interesting. I see it more as a backport of pci express features to pci.

> so if we were to use 2 bits that are currently reserved in the
> capability we could later merge the functionality without much
> overhead.

Don't do this. You must not touch reserved bits.

> I fully agree that the two implementations should be separate but
> nothing says we have to implement them completely different. If we
> are just using 3 bits for capability, status, and control of each
> feature there is no reason for them to need to be stored in separate
> locations.

True.

> >> I've been poking around in the kernel and QEMU code and the part I
> >> have been trying to sort out is how to get QEMU based pci-bridge to
> >> use the SHPC driver because from what I can tell the driver never
> >> actually gets loaded on the device as it is left in the control of
> >> ACPI hot-plug.
> >
> > There are ways, but you can just use pci express, it's easier.
>
> That's true. I should probably just give up on trying to do an
> implementation that works with the i440fx implementation. I could
> probably move over to the q35 and once that is done then we could look
> at something like the PCI Advanced Features solution for something
> like the PCI-bridge drivers.
>
> - Alex

Once we have a decent idea of what's required, I can write
an ECN for pci code and id assignment specification.
That's cleaner than vendor specific stuff that's tied to
a specific device/vendor ID.

-- 
MST
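
For illustration only, the "3 bits for capability, status, and control of
each feature" idea discussed above could be sketched roughly as follows.
This is a hypothetical layout, not something taken from any PCI
specification, from QEMU, or from the posted patches: every name
(mig_feature, MIG_BIT_*, mig_mask and the helpers) and the bit positions
are assumptions made up for this sketch. It also assumes the bits live in
a register of a newly assigned capability, since reserved bits of an
existing capability must not be touched.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature numbering; (A) and (B) follow the list in the thread. */
enum mig_feature {
    MIG_FEAT_DIRTYING = 0, /* (A) guest dirties DMA pages during migration */
    MIG_FEAT_REPLACE  = 1, /* (B) fast device replacement ("pause" hot-plug) */
};

/* Assumed per-feature bit roles: three bits for each feature. */
#define MIG_BITS_PER_FEAT 3u
#define MIG_BIT_CAP  0u /* set by host: feature is offered */
#define MIG_BIT_STAT 1u /* set by host: feature is currently active */
#define MIG_BIT_CTRL 2u /* set by guest: feature is requested */

/* Mask for one role bit of one feature inside a shared 8-bit register. */
static inline uint8_t mig_mask(enum mig_feature f, unsigned bit)
{
    return (uint8_t)(1u << ((unsigned)f * MIG_BITS_PER_FEAT + bit));
}

/* True if the host advertises the feature. */
static inline bool mig_has_cap(uint8_t reg, enum mig_feature f)
{
    return (reg & mig_mask(f, MIG_BIT_CAP)) != 0;
}

/* Guest side: request a feature only if the host advertises it.
 * Returns the new register value; unchanged if the feature is absent. */
static inline uint8_t mig_enable(uint8_t reg, enum mig_feature f)
{
    if (!mig_has_cap(reg, f)) {
        return reg;
    }
    return (uint8_t)(reg | mig_mask(f, MIG_BIT_CTRL));
}

/* Host side: without (A) offered and requested, the device has to be
 * stopped before migration. (B) is deliberately not consulted here,
 * so the two features stay independent. */
static inline bool mig_must_stop_device(uint8_t reg)
{
    return !(mig_has_cap(reg, MIG_FEAT_DIRTYING) &&
             (reg & mig_mask(MIG_FEAT_DIRTYING, MIG_BIT_CTRL)) != 0);
}
```

Both features share one register here, but neither helper for (A) looks
at the bits of (B), which is one way to keep the implementations separate
while storing them in the same location.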