From: "Lan, Tianyu"
Date: Thu, 10 Dec 2015 22:23:56 +0800
Subject: Re: [Qemu-devel] live migration vs device assignment (motivation)
Message-ID: <56698AFC.7060907@intel.com>
In-Reply-To: <20151210095213-mutt-send-email-mst@redhat.com>
To: "Michael S. Tsirkin"
Cc: qemu-devel@nongnu.org, emil.s.tantilov@intel.com, kvm@vger.kernel.org,
 ard.biesheuvel@linaro.org, aik@ozlabs.ru, donald.c.skidmore@intel.com,
 quintela@redhat.com, eddie.dong@intel.com, nrupal.jani@intel.com,
 agraf@suse.de, blauwirbel@gmail.com, cornelia.huck@de.ibm.com,
 alex.williamson@redhat.com, kraxel@redhat.com, anthony@codemonkey.ws,
 amit.shah@redhat.com, pbonzini@redhat.com, mark.d.rustad@intel.com,
 lcapitulino@redhat.com, gerlitz.or@gmail.com

On 12/10/2015 4:38 PM, Michael S. Tsirkin wrote:
> Let's assume you do save state and do have a way to detect
> whether state matches a given hardware.
> For example, driver could store firmware and hardware versions
> in the state, and then on destination, retrieve them
> and compare. It will be pretty common that you have a mismatch,
> and you must not just fail migration. You need a way to recover,
> maybe with more downtime.
>
>
> Second, you can change the driver but you can not be sure it will have
> the chance to run at all. Host overload is a common reason to migrate
> out of the host. You also can not trust guest to do the right thing.
> So how long do you want to wait until you decide guest is not
> cooperating and kill it? Most people will probably experiment a bit and
> then add a bit of a buffer. This is not robust at all.
>
> Again, maybe you ask driver to save state, and if it does
> not respond for a while, then you still migrate,
> and driver has to recover on destination.
>
>
> With the above in mind, you need to support two paths:
> 1. "good path": driver stores state on source, checks it on destination
>    detects a match and restores state into the device
> 2. "bad path": driver does not store state, or detects a mismatch
>    on destination. driver has to assume device was lost,
>    and reset it
>
> So what I am saying is, implement bad path first. Then good path
> is an optimization - measure whether it's faster, and by how much.
>

These sound reasonable. The driver should be able to do such a check to
ensure hardware or firmware coherence after migration, and to reset the
device when migration happens at some unexpected point.

> Also, it would be nice if on the bad path there was a way
> to switch to another driver entirely, even if that means
> a bit more downtime. For example, have a way for driver to
> tell Linux it has to re-do probing for the device.

I just glanced at the device core code. device_reprobe() does what you
said.

/**
 * device_reprobe - remove driver for a device and probe for a new driver
 * @dev: the device to reprobe
 *
 * This function detaches the attached driver (if any) for the given
 * device and restarts the driver probing process. It is intended
 * to use if probing criteria changed during a devices lifetime and
 * driver attachment should change accordingly.
 */
int device_reprobe(struct device *dev)
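
The "good path" / "bad path" split quoted above could be sketched roughly as follows. This is plain userspace C for illustration only, not kernel or driver code, and every name in it (struct saved_state, struct device_model, device_reset(), resume_after_migration()) is hypothetical. It assumes the saved state carries the hardware and firmware versions, as suggested in the quoted mail, and that a missing or mismatched state always ends in a device reset.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical snapshot a driver might save on the source host. */
struct saved_state {
    bool     valid;       /* false if the driver never got to save state */
    uint32_t hw_version;  /* hardware version seen on the source */
    uint32_t fw_version;  /* firmware version seen on the source */
    uint8_t  regs[16];    /* stand-in for device-specific state */
};

/* Hypothetical view of the device found on the destination host. */
struct device_model {
    uint32_t hw_version;
    uint32_t fw_version;
    uint8_t  regs[16];
};

enum restore_path { PATH_GOOD, PATH_BAD };

/* Bad path: assume the device was lost and bring it back to a clean state. */
static void device_reset(struct device_model *dev)
{
    memset(dev->regs, 0, sizeof(dev->regs));
}

/*
 * Called on the destination after migration. State is restored only if it
 * was saved and both versions match; anything else falls back to a reset.
 */
static enum restore_path resume_after_migration(struct device_model *dev,
                                                const struct saved_state *saved)
{
    if (saved && saved->valid &&
        saved->hw_version == dev->hw_version &&
        saved->fw_version == dev->fw_version) {
        memcpy(dev->regs, saved->regs, sizeof(dev->regs));
        return PATH_GOOD;
    }

    device_reset(dev);
    return PATH_BAD;
}
```

Because the reset branch is the fallback for every failure case, it can be implemented and tested first, with the restore branch added later as the optimization to be measured.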