qemu-devel.nongnu.org archive mirror
From: Kirti Wankhede <kwankhede@nvidia.com>
To: "Tian, Kevin" <kevin.tian@intel.com>,
	Kunkun Jiang <jiangkunkun@huawei.com>,
	Tarun Gupta <targupta@nvidia.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Eric Auger <eric.auger@redhat.com>,
	"Shameer Kolothum" <shameerali.kolothum.thodi@huawei.com>,
	"open list:All patches CC here" <qemu-devel@nongnu.org>
Cc: "wanghaibin.wang@huawei.com" <wanghaibin.wang@huawei.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Keqian Zhu <zhukeqian1@huawei.com>,
	"liulongfang@huawei.com" <liulongfang@huawei.com>,
	"tangnianyao@huawei.com" <tangnianyao@huawei.com>,
	"Liu, Yi L" <yi.l.liu@intel.com>,
	"Zhao, Yan Y" <yan.y.zhao@intel.com>
Subject: Re: [question] VFIO Device Migration: The vCPU may be paused during vfio device DMA in iommu nested stage mode && vSVA
Date: Fri, 24 Sep 2021 14:59:20 +0530	[thread overview]
Message-ID: <06cb5bfd-f6f8-b61b-1a7e-60a9ae2f8fac@nvidia.com> (raw)
In-Reply-To: <BN9PR11MB5433E189EEC102256A3348A18CA49@BN9PR11MB5433.namprd11.prod.outlook.com>



On 9/24/2021 12:17 PM, Tian, Kevin wrote:
>> From: Kunkun Jiang <jiangkunkun@huawei.com>
>> Sent: Friday, September 24, 2021 2:19 PM
>>
>> Hi all,
>>
>> I encountered a problem in vfio device migration testing. The
>> vCPU may be paused during vfio-pci DMA in iommu nested
>> stage mode && vSVA. This may lead to migration failure and
>> other problems related to the device hardware and driver
>> implementation.
>>
>> It may be a bit early to discuss this issue, after all, the iommu
>> nested stage mode and vSVA are not yet mature. But judging
>> from the current implementation, we will definitely encounter
>> this problem in the future.
> 
> Yes, this is a known limitation to support migration with vSVA.
> 
>>
>> This is the current process of vSVA processing translation fault
>> in iommu nested stage mode (take SMMU as an example):
>>
>> guest os             4. handle translation fault    5. send CMD_RESUME to vSMMU
>>
>> qemu                 3. inject fault into guest os  6. deliver response to host os
>> (vfio/vsmmu)
>>
>> host os              2. notify the qemu             7. send CMD_RESUME to SMMU
>> (vfio/smmu)
>>
>> SMMU                 1. address translation fault   8. retry or terminate
>>
>> The order is 1 ---> 8.
>>
>> Currently, qemu may pause the vCPUs at any step. It is possible
>> to pause the vCPUs at steps 1-5, that is, in the middle of a DMA.
>> This may lead to migration failure and other problems related to
>> the device hardware and driver implementation. For example, the
>> device state cannot be changed from RUNNING && SAVING to SAVING,
>> because the device DMA has not finished.
>>
>> As far as I can see, the vCPUs should not be paused during a
>> device IO process, such as DMA. However, live migration currently
>> does not consider the state of the vfio device when pausing the
>> vCPUs. And if the vCPUs are not paused, the vfio device is always
>> running. This looks like a *deadlock*.
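The stall described above can be modeled with a toy walk over the eight steps. This is only an illustrative sketch, not QEMU or SMMU code; `run_fault_flow` and its step table are hypothetical names. The point it demonstrates: steps 4 and 5 must run guest code, so pausing the vCPU before step 5 completes leaves CMD_RESUME unsent and the DMA unfinished.

```python
# Toy model of the 8-step nested-stage fault flow quoted above.
# All names are illustrative, not real QEMU/SMMU APIs.

STEPS = [
    ("smmu",  "address translation fault"),    # 1
    ("host",  "notify the qemu"),              # 2
    ("qemu",  "inject fault into guest os"),   # 3
    ("guest", "handle translation fault"),     # 4
    ("guest", "send CMD_RESUME to vSMMU"),     # 5
    ("qemu",  "deliver response to host os"),  # 6
    ("host",  "send CMD_RESUME to SMMU"),      # 7
    ("smmu",  "retry or terminate"),           # 8
]

def run_fault_flow(vcpu_paused_at=None):
    """Walk the flow; return the step at which it stalls, or None if done.

    Steps performed by the guest (4 and 5) require a running vCPU, so
    pausing the vCPU at or before those steps stalls the whole flow:
    CMD_RESUME is never sent and the device DMA cannot finish.
    """
    for i, (actor, _desc) in enumerate(STEPS, start=1):
        if vcpu_paused_at is not None and i >= vcpu_paused_at and actor == "guest":
            return i  # stalled: guest cannot run to answer the fault
    return None  # flow completed end to end
```

For instance, pausing at step 2 stalls the flow at step 4 (the first guest step that can no longer execute), while pausing at step 6 or later is harmless because the guest has already issued CMD_RESUME.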
> 
> Basically this requires:
> 
> 1) stopping vCPU after stopping device (could selectively enable
> this sequence for vSVA);
> 

I don't think this change is required. When the vCPUs are halted, the 
vCPU states are already saved, which takes care of steps 4 and 5. Then, 
when the device is transitioned into the SAVING state, save the qemu 
and host os state in the migration stream, i.e. the state at steps 2 
and 3, and depending on that, take the appropriate action (run step 6 
or 7) while resuming.
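As a rough sketch of that idea: record in the migration stream which step an in-flight fault had reached, and pick the resume action from it. The field and function names below are hypothetical, purely to illustrate the decision, and are not real QEMU migration fields.

```python
# Hypothetical record of where an in-flight fault stood when the device
# entered SAVING. Stage 2 = host notified qemu, stage 3 = fault already
# injected into the guest. Names are illustrative only.

def save_fault_state(stage):
    """Capture the fault-handling stage (2 or 3) in the migration stream."""
    return {"inflight_fault_stage": stage}

def resume_action(saved):
    """Decide what to do on the destination after the vm is resumed."""
    stage = saved["inflight_fault_stage"]
    if stage == 3:
        # The guest already saw the fault; its saved vCPU state covers
        # steps 4-5, so proceed with step 6 (deliver response to host).
        return "deliver_response_to_host"
    # The fault was never injected; redo step 3 first.
    return "reinject_fault_into_guest"
```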

Thanks,
Kirti

> 2) when stopping the device, the driver should block new requests
> from the vCPU (queued to a pending list) and then drain all in-flight
> requests, including faults;
>      * to block this further requires switching from fast-path to
> slow trap-emulation path for the cmd portal before stopping
> the device;
> 
> 3) save the pending requests in the vm image and replay them
> after the vm is resumed;
>      * finally disable blocking by switching back to the fast-path for
> the cmd portal;
> 
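The block/drain/save/replay sequence in points 2) and 3) can be sketched as a small state machine. This is a minimal illustration of the idea, assuming a trappable cmd portal; `CmdPortal` and its methods are made-up names, not a driver API.

```python
from collections import deque

class CmdPortal:
    """Toy model of the stop sequence above: trap the cmd portal
    (fast-path off), queue new requests, drain in-flight ones, then
    save and replay the pending list across migration. Illustrative
    names only, not a real vfio/iommu interface."""

    def __init__(self):
        self.fast_path = True    # normally requests go straight to hw
        self.pending = deque()   # requests queued while trapped
        self.in_flight = []      # requests already issued to hw

    def submit(self, req):
        if self.fast_path:
            self.in_flight.append(req)
        else:
            self.pending.append(req)   # blocked: hold until resume

    def stop_for_migration(self):
        self.fast_path = False         # 2) switch to slow trap path
        drained, self.in_flight = self.in_flight, []
        return drained                 # 2) drain in-flight requests

    def save_state(self):
        return list(self.pending)      # 3) pending goes in the vm image

    def resume(self, saved):
        for req in saved:              # 3) replay saved requests...
            self.in_flight.append(req)
        self.pending.clear()
        self.fast_path = True          # ...then re-enable the fast path
```

A request submitted before the stop gets drained; one submitted after lands on the pending list, survives in the saved state, and is replayed on resume.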
>>
>> Do you have any ideas to solve this problem?
>> Looking forward to your reply.
>>
> 
> We verified that the above flow works in our internal POC.
> 
> Thanks
> Kevin
> 


Thread overview: 6+ messages
2021-09-24  6:18 [question] VFIO Device Migration: The vCPU may be paused during vfio device DMA in iommu nested stage mode && vSVA Kunkun Jiang
2021-09-24  6:47 ` Tian, Kevin
2021-09-24  9:29   ` Kirti Wankhede [this message]
2021-09-26  2:48     ` Tian, Kevin
2021-09-27 12:30   ` Kunkun Jiang
2021-09-27 13:05     ` Tian, Kevin
