From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Tian, Kevin" <kevin.tian@intel.com>
Cc: "alex.williamson@redhat.com" <alex.williamson@redhat.com>,
"cjia@nvidia.com" <cjia@nvidia.com>,
"quintela@redhat.com" <quintela@redhat.com>,
"cohuck@redhat.com" <cohuck@redhat.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"Zhao, Yan Y" <yan.y.zhao@intel.com>,
"lushenming@huawei.com" <lushenming@huawei.com>,
"Kirti Wankhede" <kwankhede@nvidia.com>,
"Tarun Gupta" <targupta@nvidia.com>,
"Daniel P. Berrangé" <berrange@redhat.com>,
"philmd@redhat.com" <philmd@redhat.com>,
"dnigam@nvidia.com" <dnigam@nvidia.com>
Subject: Re: [PATCH v2 1/1] docs/devel: Add VFIO device migration documentation
Date: Tue, 16 Mar 2021 15:46:57 +0000
Message-ID: <YFDS8eavuGHh6EwT@work-vm>
In-Reply-To: <MWHPR11MB1886E79EB08B27A46D3AC8E88C6F9@MWHPR11MB1886.namprd11.prod.outlook.com>

* Tian, Kevin (kevin.tian@intel.com) wrote:
> > From: Qemu-devel <qemu-devel-bounces+kevin.tian=intel.com@nongnu.org>
> > On Behalf Of Dr. David Alan Gilbert
> >
> > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > On Thu, Mar 11, 2021 at 12:50:09AM +0530, Tarun Gupta wrote:
> > > > Document interfaces used for VFIO device migration. Added flow of state
> > changes
> > > > during live migration with VFIO device. Tested by building docs with the
> > new
> > > > vfio-migration.rst file.
> > > >
> > > > v2:
> > > > - Included the new vfio-migration.rst file in index.rst
> > > > - Updated dirty page tracking section, also added details about
> > > > 'pre-copy-dirty-page-tracking' opt-out option.
> > > > - Incorporated comments around wording of doc.
> > > >
> > > > Signed-off-by: Tarun Gupta <targupta@nvidia.com>
> > > > Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
> > > > ---
> > > > MAINTAINERS | 1 +
> > > > docs/devel/index.rst | 1 +
> > > > docs/devel/vfio-migration.rst | 135
> > ++++++++++++++++++++++++++++++++++
> > > > 3 files changed, 137 insertions(+)
> > > > create mode 100644 docs/devel/vfio-migration.rst
> > >
> > >
> > > > +Postcopy
> > > > +========
> > > > +
> > > > +Postcopy migration is not supported for VFIO devices.
> > >
> > > What is the problem here and is there any plan for how to address it ?
> >
> > There's no equivalent to userfaultfd for accesses to RAM made by a
> > device.
> > There's some potential for this to be doable with an IOMMU or the like,
> > but:
> > a) IOMMUs and devices aren't currently happy at recovering from
> > failures
> > b) the fragmentation you get during a postcopy probably isn't pretty
> > when you get to build IOMMU tables.
>
> To overcome such limitations one may adopt a prefault-and-pull scheme if
> the vendor driver has the capability to track pending DMA buffers in the
> migration process (with additional uAPI changes in VFIO or userfaultfd),
> as discussed here:
>
> https://static.sched.com/hosted_files/kvmforum2019/7a/kvm-forum-postcopy-final.pdf

Did that get any further?

I can imagine that might be trickier for a GPU than a network card; the
shaders in a GPU are pretty random as to what they go off and access, so
I can't see how you could prefault.

Dave

> >
> > > Postcopy is essentially the only migration mechanism that can reliably
> > > complete, so it really should be considered the default approach to
> > > migration for all mgmt apps wanting to do migration, except in special
> > > cases. IOW, if we want VFIO migration to be viable, we need postcopy
> > > support.
> >
> > There's lots of other things postcopy doesn't work with; so hmm.
> >
>
> Agree. Also given the amount of work even for pre-copy migration, it makes
> more sense to do things step-by-step.
>
> Thanks
> Kevin
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK