Re: [Qemu-devel] [PATCH v3 1/5] VFIO KABI for migration interface

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Cornelia Huck <cohuck@redhat.com>
To: Kirti Wankhede <kwankhede@nvidia.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	cjia@nvidia.com, kevin.tian@intel.com, ziye.yang@intel.com,
	changpeng.liu@intel.com, yi.l.liu@intel.com, mlevitsk@redhat.com,
	eskultet@redhat.com, dgilbert@redhat.com,
	jonathan.davies@nutanix.com, eauger@redhat.com, aik@ozlabs.ru,
	pasic@linux.ibm.com, felipe@nutanix.com,
	Zhengxiao.zx@Alibaba-inc.com, shuangtai.tst@alibaba-inc.com,
	Ken.Xue@amd.com, zhi.a.wang@intel.com, yan.y.zhao@intel.com,
	yulei.zhang@intel.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v3 1/5] VFIO KABI for migration interface
Date: Fri, 1 Mar 2019 14:36:30 +0100	[thread overview]
Message-ID: <20190301143630.474870d2.cohuck@redhat.com> (raw)
In-Reply-To: <f15fecfe-26c0-05fa-da3b-20156145d287@nvidia.com>

On Wed, 27 Feb 2019 01:35:01 +0530
Kirti Wankhede <kwankhede@nvidia.com> wrote:

> Alex,
> 
> On 2/22/2019 3:53 AM, Alex Williamson wrote:
> > On Wed, 20 Feb 2019 02:53:16 +0530
> > Kirti Wankhede <kwankhede@nvidia.com> wrote:
> >   
> >> - Defined MIGRATION region type and sub-type.
> >> - Used 2 bits to define VFIO device states.
> >>     Bit 0 => 0/1 => _STOPPED/_RUNNING
> >>     Bit 1 => 0/1 => _RESUMING/_SAVING
> >>     Combination of these bits defines VFIO device's state during migration
> >>     _RUNNING => Normal VFIO device running state.
> >>     _STOPPED => VFIO device stopped.
> >>     _SAVING | _RUNNING => vCPUs are running, VFIO device is running but start
> >>                           saving state of device i.e. pre-copy state
> >>     _SAVING | _STOPPED => vCPUs are stoppped, VFIO device should be stopped, and
> >>                           save device state,i.e. stop-n-copy state
> >>     _RESUMING => VFIO device resuming state.  
> > 
> > Shouldn't we have a non-_RESUMING/_SAVING run state?  If these are
> > indicating directly flow, maybe we need two bits:
> > 
> >   00b - None, normal runtime
> >   01b - Saving
> >   10b - Resuming
> >   11b - Invalid/reserved (maybe a Failed state indicator)
> >   
> 
> There has to be 2 more states:
> _SAVING | _RUNNING => SAVING while device is RUNNING - pre-copy phase
> _SAVING | _STOPPED => SAVING while device is STOPPED - stop-and-copy phase
> 
> So the 2 bits used in this patch are:
> 00b - _RESUMING
> 01b - _RUNNING - Normal Running
> 10b - _SAVING | _STOPPED - stop-and-copy phase
> 11b - _SAVING | _RUNNING - pre-copy phase

Not sure if the "use two bits" approach is the most elegant here --
rightmost bit unset can mean either _RESUMING or _STOPPED...

What about simply making this four distinct states?

#define _RESUMING 0
#define _RUNNING 1 //or the other way around?
#define _SAVING_STOPPED 2
#define _SAVING_RUNNING 3

You could still check if you're currently SAVING by ANDing with 10b.

(...)

> >> + *
> >> + * data_offset: (read/write)
> >> + *      User application should read data_offset in migration region from where
> >> + *      user application should read data during _SAVING state.
> >> + *      User application would write data_offset in migration region from where
> >> + *      user application is had written data during _RESUMING state.  
> > 
> > Why wouldn't data_offset simply be fixed?  Do we really want to support
> > the user writing a arbitrary offsets in the migration region?  Each
> > chunk can simply start at the kernel provided data_offset.
> >  
> 
> data_offset differs based on region is trapped on mapped. In case when
> region is mmaped, QEMU writes data to mapped region and then write
> data_offset field, indicating data is present in mmaped buffer. Read
> below for more detailed steps.
> 
> >> + *
> >> + * data_size: (write only)
> >> + *      User application should write size of data copied in migration region
> >> + *      during _RESUMING state.  
> > 
> > How much does the user read on _SAVING then?
> >   
> 
> pending_bytes tells bytes that should be read on _SAVING.
> 
> >> + *
> >> + * start_pfn: (write only)
> >> + *      Start address pfn to get bitmap of dirty pages from vendor driver duing
> >> + *      _SAVING state.
> >> + *
> >> + * page_size: (write only)
> >> + *      User application should write the page_size of pfn.
> >> + *
> >> + * total_pfns: (write only)
> >> + *      Total pfn count from start_pfn for which dirty bitmap is requested.  
> > 
> > So effectively the area that begins at data_offset is dual purpose
> > (triple purpose vs Yan's, config, memory, and dirty bitmap) 

I'm wondering whether splitting it would combine the best of the two
approaches: Just having an optional dirty bitmap region would make it
more obvious if dirty page tracking is not available.

next prev parent reply	other threads:[~2019-03-01 13:36 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-19 21:23 [Qemu-devel] [PATCH v3 0/5] Add migration support for VFIO device Kirti Wankhede
2019-02-19 21:23 ` [Qemu-devel] [PATCH v3 1/5] VFIO KABI for migration interface Kirti Wankhede
2019-02-21 22:23   ` Alex Williamson
2019-02-26 20:05     ` Kirti Wankhede
2019-03-01 13:36       ` Cornelia Huck [this message]
2019-03-07 19:45       ` Alex Williamson
2019-02-19 21:23 ` [Qemu-devel] [PATCH v3 2/5] Add save and load functions for VFIO PCI devices Kirti Wankhede
2019-02-19 21:23 ` [Qemu-devel] [PATCH v3 3/5] Add migration functions for VFIO devices Kirti Wankhede
2019-02-21 22:38   ` Alex Williamson
2019-02-26 20:16     ` Kirti Wankhede
2019-02-19 21:23 ` [Qemu-devel] [PATCH v3 4/5] Add vfio_listerner_log_sync to mark dirty pages Kirti Wankhede
2019-02-19 21:23 ` [Qemu-devel] [PATCH v3 5/5] Make vfio-pci device migration capable Kirti Wankhede
2019-02-20 10:22 ` [Qemu-devel] [PATCH v3 0/5] Add migration support for VFIO device Dr. David Alan Gilbert
2019-02-21  5:25   ` Kirti Wankhede
2019-02-21  5:52     ` Tian, Kevin
2019-02-21  7:10       ` Neo Jia
2019-02-27  1:51         ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190301143630.474870d2.cohuck@redhat.com \
    --to=cohuck@redhat.com \
    --cc=Ken.Xue@amd.com \
    --cc=Zhengxiao.zx@Alibaba-inc.com \
    --cc=aik@ozlabs.ru \
    --cc=alex.williamson@redhat.com \
    --cc=changpeng.liu@intel.com \
    --cc=cjia@nvidia.com \
    --cc=dgilbert@redhat.com \
    --cc=eauger@redhat.com \
    --cc=eskultet@redhat.com \
    --cc=felipe@nutanix.com \
    --cc=jonathan.davies@nutanix.com \
    --cc=kevin.tian@intel.com \
    --cc=kwankhede@nvidia.com \
    --cc=mlevitsk@redhat.com \
    --cc=pasic@linux.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=shuangtai.tst@alibaba-inc.com \
    --cc=yan.y.zhao@intel.com \
    --cc=yi.l.liu@intel.com \
    --cc=yulei.zhang@intel.com \
    --cc=zhi.a.wang@intel.com \
    --cc=ziye.yang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).