Date: Tue, 19 May 2015 18:08:10 +0200
From: "Michael S. Tsirkin"
To: "Daniel P. Berrange"
Cc: libvir-list@redhat.com, "Dr. David Alan Gilbert", Laine Stump, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [libvirt] [RFC 0/7] Live Migration with Pass-through Devices proposal
Message-ID: <20150519180227-mutt-send-email-mst@redhat.com>

On Tue, May 19, 2015 at 04:45:03PM +0100, Daniel P. Berrange wrote:
> On Tue, May 19, 2015 at 05:39:05PM +0200, Michael S. Tsirkin wrote:
> > On Tue, May 19, 2015 at 04:35:08PM +0100, Daniel P. Berrange wrote:
> > > On Tue, May 19, 2015 at 04:03:04PM +0100, Dr. David Alan Gilbert wrote:
> > > > * Daniel P. Berrange (berrange@redhat.com) wrote:
> > > > > On Tue, May 19, 2015 at 10:15:17AM -0400, Laine Stump wrote:
> > > > > > On 05/19/2015 05:07 AM, Michael S. Tsirkin wrote:
> > > > > > > On Wed, Apr 22, 2015 at 10:23:04AM +0100, Daniel P. Berrange wrote:
> > > > > > >> On Fri, Apr 17, 2015 at 04:53:02PM +0800, Chen Fan wrote:
> > > > > > >>> background:
> > > > > > >>> Live migration is one of the most important features of virtualization technology.
> > > > > > >>> With regard to recent virtualization techniques, performance of network I/O is critical.
> > > > > > >>> Current network I/O virtualization (e.g. Para-virtualized I/O, VMDq) has a significant
> > > > > > >>> performance gap with native network I/O. Pass-through network devices have near
> > > > > > >>> native performance, however, they have thus far prevented live migration. No existing
> > > > > > >>> methods solve the problem of live migration with pass-through devices perfectly.
> > > > > > >>>
> > > > > > >>> There was an idea to solve the problem in website:
> > > > > > >>> https://www.kernel.org/doc/ols/2008/ols2008v2-pages-261-267.pdf
> > > > > > >>> Please refer to above document for detailed information.
> > > > > > >>>
> > > > > > >>> So I think this problem maybe could be solved by using the combination of existing
> > > > > > >>> technology. and the following steps are we considering to implement:
> > > > > > >>>
> > > > > > >>> - before boot VM, we anticipate to specify two NICs for creating bonding device
> > > > > > >>> (one plugged and one virtual NIC) in XML. here we can specify the NIC's mac addresses
> > > > > > >>> in XML, which could facilitate qemu-guest-agent to find the network interfaces in guest.
> > > > > > >>>
> > > > > > >>> - when qemu-guest-agent startup in guest it would send a notification to libvirt,
> > > > > > >>> then libvirt will call the previous registered initialize callbacks. so through
> > > > > > >>> the callback functions, we can create the bonding device according to the XML
> > > > > > >>> configuration. and here we use netcf tool which can facilitate to create bonding device
> > > > > > >>> easily.
> > > > > > >> I'm not really clear on why libvirt/guest agent needs to be involved in this.
> > > > > > >> I think configuration of networking is really something that must be left to
> > > > > > >> the guest OS admin to control. I don't think the guest agent should be trying
> > > > > > >> to reconfigure guest networking itself, as that is inevitably going to conflict
> > > > > > >> with configuration attempted by things in the guest like NetworkManager or
> > > > > > >> systemd-networkd.
> > > > > > > There should not be a conflict.
> > > > > > > guest agent should just give NM the information, and have NM do
> > > > > > > the right thing.
> > > > > >
> > > > > > That assumes the guest will have NM running. Unless you want to severely
> > > > > > limit the scope of usefulness, you also need to handle systems that have
> > > > > > NM disabled, and among those the different styles of system network
> > > > > > config. It gets messy very fast.
> > > > >
> > > > > Also OpenStack already has a way to pass guest information about the
> > > > > required network setup, via cloud-init, so it would not be interested
> > > > > in anything that used the QEMU guest agent to configure network
> > > > > manager. Which is really just another example of why this does not
> > > > > belong anywhere in libvirt or lower. The decision to use NM is a
> > > > > policy decision that will always be wrong for a non-negligible set
> > > > > of use cases and as such does not belong in libvirt or QEMU. It is
> > > > > the job of higher level apps to make that kind of policy decision.
> > > >
> > > > This is exactly my worry though; why should every higher level management
> > > > system have its own way of communicating network config for hotpluggable
> > > > devices. You shouldn't need to reconfigure a VM to move it between them.
> > > >
> > > > This just makes it hard to move it between management layers; there needs
> > > > to be some standardisation (or abstraction) of this; if libvirt isn't the place
> > > > to do it, then what is?
> > >
> > > NB, openstack isn't really defining a custom thing for networking here. It
> > > is actually integrating with the standard cloud-init guest tools for this
> > > task. Also note that OpenStack has defined a mechanism that works for
> > > guest images regardless of what hypervisor they are running on - ie does
> > > not rely on any QEMU or libvirt specific functionality here.
> >
> > I'm not sure what the implication is. No new functionality should be
> > implemented unless we also add it to vmware? People that don't want kvm
> > specific functionality, won't use it.
>
> I'm saying that standardization of virtualization policy in libvirt is the
> wrong solution, because different applications will have different viewpoints
> as to what "standardization" is useful / appropriate. Creating a standardized
> policy in libvirt for KVM does not help OpenStack; it may help people who only
> care about KVM, but that is not the entire ecosystem. OpenStack has a
> standardized solution for guest configuration information that works across
> all the hypervisors it targets. This is just yet another example of exactly
> why libvirt aims to design its APIs such that it exposes direct mechanisms
> and leaves usage policy decisions up to the management applications. Libvirt
> is not best placed to decide which policy all these mgmt apps must use for
> this task.
>
> Regards,
> Daniel

I don't think we are pushing policy in libvirt here.
What we want is a mechanism that lets users specify in the XML:
	interface X is fallback for pass-through device Y
Then when requesting migration, specify that it should use
device Z on destination as replacement for Y.

We are asking libvirt to automatically
1.- when migration is requested, request unplug of Y
2.- wait until Y is deleted
3.- start migration
4.- wait until migration is completed
5.- plug device Z on destination

I don't see any policy above: libvirt is in control of migration and
seems best placed to implement this.

> --
> |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org -o- http://virt-manager.org :|
> |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
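
To make the five-step sequence above concrete, here is a rough sketch in Python.
It is only a toy simulation under stated assumptions: Host, MigrationPlan and
migrate_with_fallback are hypothetical names made up for illustration, not
libvirt or QEMU APIs, and the "wait" steps are modelled as completing
immediately rather than waiting on real device-deleted or migration events.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    """Hypothetical stand-in for a host; lists the devices attached to the guest."""
    name: str
    devices: List[str] = field(default_factory=list)


@dataclass
class MigrationPlan:
    """Hypothetical record of what the XML and the migration request would specify."""
    fallback: str      # interface X, stays attached for the whole migration
    passthrough: str   # pass-through device Y on the source
    replacement: str   # device Z to plug on the destination


def migrate_with_fallback(src: Host, dst: Host, plan: MigrationPlan) -> List[str]:
    """Run the five steps in order and return a log of what was done."""
    if plan.fallback not in src.devices:
        raise ValueError("fallback interface is not attached on the source")
    if plan.passthrough not in src.devices:
        raise ValueError("pass-through device is not attached on the source")

    log = []
    # 1. when migration is requested, request unplug of Y
    log.append(f"request unplug of {plan.passthrough}")
    # 2. wait until Y is deleted (simulated: the removal completes at once)
    src.devices.remove(plan.passthrough)
    log.append(f"{plan.passthrough} deleted")
    # 3. start migration (simulated: guest devices move to the destination)
    log.append(f"migration {src.name} -> {dst.name} started")
    dst.devices = list(src.devices)
    src.devices = []
    # 4. wait until migration is completed
    log.append("migration completed")
    # 5. plug device Z on destination
    dst.devices.append(plan.replacement)
    log.append(f"plugged {plan.replacement} on {dst.name}")
    return log


if __name__ == "__main__":
    source = Host("src", ["virtio-net0", "vf-src"])
    dest = Host("dst")
    plan = MigrationPlan(fallback="virtio-net0", passthrough="vf-src",
                         replacement="vf-dst")
    for entry in migrate_with_fallback(source, dest, plan):
        print(entry)
```

The fallback interface X is never touched, so the guest keeps a working network
path throughout; the caller only supplies the X/Y/Z mapping and the ordering is
fixed by the function.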