From: Yang Zhang <yang.zhang.wz@gmail.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: "Lan, Tianyu" <tianyu.lan@intel.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	qemu-devel@nongnu.org, emil.s.tantilov@intel.com,
	kvm@vger.kernel.org, ard.biesheuvel@linaro.org, aik@ozlabs.ru,
	donald.c.skidmore@intel.com, quintela@redhat.com,
	eddie.dong@intel.com, nrupal.jani@intel.com, agraf@suse.de,
	blauwirbel@gmail.com, cornelia.huck@de.ibm.com,
	alex.williamson@redhat.com, kraxel@redhat.com,
	anthony@codemonkey.ws, amit.shah@redhat.com, pbonzini@redhat.com,
	mark.d.rustad@intel.com, lcapitulino@redhat.com,
	gerlitz.or@gmail.com
Subject: Re: [Qemu-devel] live migration vs device assignment (motivation)
Date: Thu, 10 Dec 2015 21:07:32 +0800
Message-ID: <56697914.6090605@gmail.com>
In-Reply-To: <20151210114114.GE2570@work-vm>

On 2015/12/10 19:41, Dr. David Alan Gilbert wrote:
> * Yang Zhang (yang.zhang.wz@gmail.com) wrote:
>> On 2015/12/10 18:18, Dr. David Alan Gilbert wrote:
>>> * Lan, Tianyu (tianyu.lan@intel.com) wrote:
>>>> On 12/8/2015 12:50 AM, Michael S. Tsirkin wrote:
>>>>> I thought about what this is doing at the high level, and I do see some
>>>>> value in what you are trying to do, but I also think we need to clarify
>>>>> the motivation a bit more.  What you are saying is not really what the
>>>>> patches are doing.
>>>>>
>>>>> And with that clearer understanding of the motivation in mind (assuming
>>>>> it actually captures a real need), I would also like to suggest some
>>>>> changes.
>>>>
>>>> Motivation:
>>>> Most current solutions for migration with passthrough devices are based
>>>> on PCI hotplug, but it has side effects and can't work for all devices.
>>>>
>>>> For NIC devices:
>>>> The PCI hotplug solution can work around network device migration
>>>> by switching between VF and PF.
>>>>
>>>> But switching network interfaces introduces service downtime.
>>>>
>>>> I tested the service downtime by putting the VF and PV interfaces
>>>> into a bonded interface and pinging the bonded interface while
>>>> plugging and unplugging the VF:
>>>> 1) About 100ms when adding the VF
>>>> 2) About 30ms when removing the VF
>>>>
>>>> It also requires the guest to do switch configuration. These are hard
>>>> for our customers to manage and deploy. To maintain PV performance
>>>> during migration, the host side also needs to assign a VF to the PV
>>>> device. This affects scalability.
>>>>
>>>> These factors block SRIOV NIC passthrough usage in cloud services and
>>>> OPNFV, which require high network performance and stability.
>>>
>>> Right, I'll agree that it's hard to do migration of a VM which uses
>>> an SRIOV device; and while I think it should be possible to bond a virtio device
>>> to a VF for networking and then hotplug the SR-IOV device I agree it's hard to manage.
>>>
>>>> For other kinds of devices, the hotplug approach is hard to make work.
>>>> We are also adding migration support for the QAT (QuickAssist
>>>> Technology) device.
>>>>
>>>> QAT device use case introduction:
>>>> Server, networking, big data, and storage applications use QuickAssist
>>>> Technology to offload servers from handling compute-intensive operations,
>>>> such as:
>>>> 1) Symmetric cryptography functions including cipher operations and
>>>> authentication operations
>>>> 2) Public key functions including RSA, Diffie-Hellman, and elliptic curve
>>>> cryptography
>>>> 3) Compression and decompression functions including DEFLATE and LZS
>>>>
>>>> PCI hotplug will not work for such devices during migration, and these
>>>> operations will fail when the device is unplugged.
>>>
>>> I don't understand that QAT argument; if the device is purely an offload
>>> engine for performance, then why can't you fall back to doing the
>>> same operations in the VM or in QEMU if the card is unavailable?
>>> The tricky bit is dealing with outstanding operations.
>>>
>>>> So we are trying to implement a new solution that really migrates
>>>> device state to the target machine and won't affect the user during
>>>> migration, with low service downtime.
>>>
>>> Right, that's a good aim - the only question is how to do it.
>>>
>>> It looks like this is always going to need some device-specific code;
>>> the question I see is whether that's in:
>>>      1) qemu
>>>      2) the host kernel
>>>      3) the guest kernel driver
>>>
>>> The objections to this series seem to be that it needs changes to (3);
>>> I can see the worry that the guest kernel driver might not get a chance
>>> to run during the right time in migration and it's painful having to
>>> change every guest driver (although your change is small).
>>>
>>> My question is what stage of the migration process do you expect to tell
>>> the guest kernel driver to do this?
>>>
>>>      If you do it at the start of the migration, and quiesce the device,
>>>      the migration might take a long time (say 30 minutes) - are you
>>>      intending the device to be quiesced for this long? And where are
>>>      you going to send the traffic?
>>>      If you are, then do you need to do it via this PCI trick, or could
>>>      you just do it via something higher level to quiesce the device.
>>>
>>>      Or are you intending to do it just near the end of the migration?
>>>      But then how do we know how long it will take the guest driver to
>>>      respond?
>>
>> Ideally, the guest driver could be left unmodified, but that requires the
>> hypervisor or qemu to be aware of the device, which means we may need a
>> driver in the hypervisor or qemu to handle the device on behalf of the
>> guest driver.
>
> Can you answer the question of when do you use your code -
>     at the start of migration or
>     just before the end?

Tianyu can answer this question. In my initial design, I preferred to put
more of the modifications in the hypervisor and QEMU, with the guest driver
involved only in restoring the state after migration. But I don't know how
the later implementation turned out, since I have left Intel.

>
>>> It would be great if we could avoid changing the guest; but at least your guest
>>> driver changes don't actually seem to be that hardware specific; could your
>>> changes actually be moved to generic PCI level so they could be made
>>> to work for lots of drivers?
>>
>> It is impossible to use one common solution for all devices unless the PCIe
>> spec documents it clearly, and I think one day it will. But before that, we
>> need some workarounds in the guest driver to make it work, even if it looks
>> ugly.
>
> Dave
>
>>
>> --
>> best regards
>> yang
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>


-- 
best regards
yang
