Re: [PATCH RFC] vfio: Revise and update the migration uAPI description

From: Jason Gunthorpe <jgg@nvidia.com>
To: "Tian, Kevin" <kevin.tian@intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	"cohuck@redhat.com" <cohuck@redhat.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"farman@linux.ibm.com" <farman@linux.ibm.com>,
	"mjrosato@linux.ibm.com" <mjrosato@linux.ibm.com>,
	"pasic@linux.ibm.com" <pasic@linux.ibm.com>,
	Yishai Hadas <yishaih@nvidia.com>
Subject: Re: [PATCH RFC] vfio: Revise and update the migration uAPI description
Date: Wed, 26 Jan 2022 21:10:58 -0400
Message-ID: <20220127011058.GW84788@nvidia.com>
In-Reply-To: <BN9PR11MB5276141AC961A04A89235B428C219@BN9PR11MB5276.namprd11.prod.outlook.com>

On Thu, Jan 27, 2022 at 12:53:54AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, January 26, 2022 8:15 PM
> > 
> > On Wed, Jan 26, 2022 at 01:49:09AM +0000, Tian, Kevin wrote:
> > 
> > > > As STOP_PRI can be defined as halting any new PRIs and always return
> > > > immediately.
> > >
> > > The problem is that on such devices PRIs are continuously triggered
> > > when the driver tries to drain the in-fly requests to enter STOP_P2P
> > > or STOP_COPY. If we simply halt any new PRIs in STOP_PRI, it
> > > essentially implies no migration support for such device.
> > 
> > So what can this HW even do? It can't immediately stop and disable its
> > queues?
> > 
> > Are you sure it can support migration?
> 
> It's a draining model thus cannot immediately stop. Instead it has to
> wait for in-fly requests to be completed (even not talking about vPRI).

So, it can't complete draining without completing an unknown number of
vPRIs?

> timeout policy is always in userspace. We just need an interface for the user
> to communicate it to the kernel. 

Can the HW tell when the draining has completed? I.e., can it
trigger an eventfd or something?

The v2 API has this nice feature where it can return an FD, so we
could possibly go into a 'stopping PRI' state and that can return an
eventfd for the user to poll on to know when it is OK to move onwards.

That was the sticking point before: we want completing RUNNING_P2P to
mean the device is halted, but vPRI ideally wants to halt in the
background. Now we have a way to do that.

Returning to running would abort the draining.

Userspace implements the timeout with poll() on the eventfd.

This also logically justifies why this is not backwards compatible, as
one of the rules in the FSM construction is that any arc that can
return an FD must be the final arc.

So, if the FSM sequence is

   RUNNING -> RUNNING_STOP_PRI -> RUNNING_STOP_P2P_AND_PRI -> STOP_COPY

Then by the design rules we cannot pass through RUNNING_STOP_PRI
automatically, it must be explicit.
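
As a sketch of that rule, using the state names from the sequence
above: the enum, the arc_returns_fd[] table and
can_traverse_in_one_request() are all made up for illustration and are
not kernel code. Which arcs return an FD is also an assumption here (an
eventfd on entering RUNNING_STOP_PRI, a data FD on entering STOP_COPY),
and only the forward direction is modelled.

```c
#include <stdbool.h>

enum mig_state {
	RUNNING,
	RUNNING_STOP_PRI,
	RUNNING_STOP_P2P_AND_PRI,
	STOP_COPY,
	NR_STATES,
};

/* arc_returns_fd[s] describes the arc s -> s + 1 (assumed values). */
static const bool arc_returns_fd[NR_STATES - 1] = {
	[RUNNING]                  = true,	/* eventfd for draining */
	[RUNNING_STOP_PRI]         = false,
	[RUNNING_STOP_P2P_AND_PRI] = true,	/* migration data FD */
};

/*
 * A single request may walk several arcs, but an arc that returns an
 * FD is only allowed as the last one. Check every arc except the last.
 */
static bool can_traverse_in_one_request(enum mig_state from,
					enum mig_state to)
{
	int s;

	if (from >= to)
		return false;	/* backward arcs are not modelled here */

	for (s = from; s + 1 < (int)to; s++)
		if (arc_returns_fd[s])
			return false;
	return true;
}
```

With this table RUNNING -> STOP_COPY is rejected because it would step
through RUNNING_STOP_PRI, while RUNNING_STOP_PRI -> STOP_COPY is still
fine as one request.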

A cap like "running_p2p returns an event fd, doesn't finish until the
VCPU does stuff, and stops pri as well as p2p" might be all that is
required here (and not an actual new state)

It is somewhat bizarro from a wording perspective, but does
potentially allow qemu to be almost unchanged for the two cases.

Jason
