From: Yan Zhao <yan.y.zhao@intel.com>
To: Cornelia Huck <cohuck@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	"intel-gvt-dev@lists.freedesktop.org" 
	<intel-gvt-dev@lists.freedesktop.org>,
	"arei.gonglei@huawei.com" <arei.gonglei@huawei.com>,
	"aik@ozlabs.ru" <aik@ozlabs.ru>,
	"Zhengxiao.zx@alibaba-inc.com" <Zhengxiao.zx@alibaba-inc.com>,
	"shuangtai.tst@alibaba-inc.com" <shuangtai.tst@alibaba-inc.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"eauger@redhat.com" <eauger@redhat.com>,
	"Liu, Yi L" <yi.l.liu@intel.com>,
	"Yang, Ziye" <ziye.yang@intel.com>,
	"mlevitsk@redhat.com" <mlevitsk@redhat.com>,
	"pasic@linux.ibm.com" <pasic@linux.ibm.com>,
	"felipe@nutanix.com" <felipe@nutanix.com>,
	"Liu, Changpeng" <changpeng.liu@intel.com>,
	"Ken.Xue@amd.com" <Ken.Xue@amd.com>,
	"jonathan.davies@nutanix.com" <jonathan.davies@nutanix.com>,
	"He, Shaopeng" <shaopeng.he@intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"libvir-list@redhat.com" <libvir-list@redhat.com>,
	"eskultet@redhat.com" <eskultet@redhat.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	"zhenyuw@linux.intel.com" <zhenyuw@linux.intel.com>,
	"Wang, Zhi A" <zhi.a.wang@intel.com>,
	"cjia@nvidia.com" <cjia@nvidia.com>,
	"kwankhede@nvidia.com" <kwankhede@nvidia.com>,
	"berrange@redhat.com" <berrange@redhat.com>,
	"dinechin@redhat.com" <dinechin@redhat.com>
Subject: Re: [PATCH v2 1/2] vfio/mdev: add version attribute for mdev device
Date: Sun, 12 May 2019 21:16:26 -0400
Message-ID: <20190513011626.GI24397@joy-OptiPlex-7040>
In-Reply-To: <20190510114838.7e16c3d6.cohuck@redhat.com>

On Fri, May 10, 2019 at 05:48:38PM +0800, Cornelia Huck wrote:
> On Fri, 10 May 2019 10:36:09 +0100
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
> > * Cornelia Huck (cohuck@redhat.com) wrote:
> > > On Thu, 9 May 2019 17:48:26 +0100
> > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > >   
> > > > * Cornelia Huck (cohuck@redhat.com) wrote:  
> > > > > On Thu, 9 May 2019 16:48:57 +0100
> > > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > > >     
> > > > > > * Cornelia Huck (cohuck@redhat.com) wrote:    
> > > > > > > On Tue, 7 May 2019 15:18:26 -0600
> > > > > > > Alex Williamson <alex.williamson@redhat.com> wrote:
> > > > > > >       
> > > > > > > > On Sun,  5 May 2019 21:49:04 -0400
> > > > > > > > Yan Zhao <yan.y.zhao@intel.com> wrote:      
> > > > > > >       
> > > > > > > > > +  Errno:
> > > > > > > > > +  If vendor driver wants to claim a mdev device incompatible to all other mdev
> > > > > > > > > +  devices, it should not register version attribute for this mdev device. But if
> > > > > > > > > +  a vendor driver has already registered version attribute and it wants to claim
> > > > > > > > > +  a mdev device incompatible to all other mdev devices, it needs to return
> > > > > > > > > +  -ENODEV on access to this mdev device's version attribute.
> > > > > > > > > +  If a mdev device is only incompatible to certain mdev devices, write of
> > > > > > > > > +  incompatible mdev devices's version strings to its version attribute should
> > > > > > > > > +  return -EINVAL;        
> > > > > > > > 
> > > > > > > > I think it's best not to define the specific errno returned for a
> > > > > > > > specific situation, let the vendor driver decide, userspace simply
> > > > > > > > needs to know that an errno on read indicates the device does not
> > > > > > > > support migration version comparison and that an errno on write
> > > > > > > > indicates the devices are incompatible or the target doesn't support
> > > > > > > > migration versions.      
> > > > > > > 
> > > > > > > I think I have to disagree here: It's probably valuable to have an
> > > > > > > agreed error for 'cannot migrate at all' vs 'cannot migrate between
> > > > > > > those two particular devices'. Userspace might want to do different
> > > > > > > things (e.g. trying with different device pairs).      
> > > > > > 
> > > > > > Trying to stuff these things down an errno seems a bad idea; we can't
> > > > > > get much information that way.    
> > > > > 
> > > > > So, what would be a reasonable approach? Userspace should first read
> > > > > the version attributes on both devices (to find out whether migration
> > > > > is supported at all), and only then figure out via writing whether they
> > > > > are compatible?
> > > > > 
> > > > > (Or just go ahead and try, if it does not care about the reason.)    
> > > > 
> > > > Well, I'm OK with something like writing to test whether it's
> > > > compatible, it's just we need a better way of saying 'no'.
> > > > I'm not sure if that involves reading back from somewhere after
> > > > the write or what.  
> > > 
> > > Hm, so I basically see two ways of doing that:
> > > - standardize on some error codes... problem: error codes can be hard
> > >   to fit to reasons
> > > - make the error available in some attribute that can be read
> > > 
> > > I'm not sure how we can serialize the readback with the last write,
> > > though (this looks inherently racy).
> > > 
> > > How important is detailed error reporting here?  
> > 
> > I think we need something, otherwise we're just going to get vague
> > user reports of 'but my VM doesn't migrate'; I'd like the error to be
> > good enough to point most users to something they can understand
> > (e.g. wrong card family/too old a driver etc).
> 
> Ok, that sounds like a reasonable point. Not that I have a better idea
> how to achieve that, though... we could also log a more verbose error
> message to the kernel log, but that's not necessarily where a user will
> look first.
> 
> Ideally, we'd want to have the user space program setting up things
> querying the general compatibility for migration (so that it becomes
> their problem on how to alert the user to problems :), but I'm not sure
> how to eliminate the race between asking the vendor driver for
> compatibility and getting the result of that operation.
> 
> Unless we introduce an interface that can retrieve _all_ results
> together with the written value? Or is that not going to be much of a
> problem in practice?
What about defining a migration_errors attribute that stores the 10 most
recent error records, in a format like:
    input string: error
Since identical input strings always produce the same error string, the 10
records can serve more than 10 reason queries. And in practice, I don't think
there would be 10 simultaneous migration requests?
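
A rough userspace sketch of the bookkeeping behind such an attribute, only to
make the idea concrete. All names here (mig_err_log, mig_err_add, mig_err_find,
mig_err_show) and the 64-byte string limit are assumptions for illustration;
nothing like this exists in the patch. Locking, which a real vendor driver
would need, is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MIG_ERR_RECORDS 10  /* "10 most recent error records" from the proposal */
#define MIG_ERR_STRLEN  64  /* assumed maximum string length, not from the patch */

struct mig_err_record {
    char input[MIG_ERR_STRLEN];  /* version string that was written */
    char error[MIG_ERR_STRLEN];  /* reason the write was rejected */
};

struct mig_err_log {
    struct mig_err_record rec[MIG_ERR_RECORDS];
    unsigned int next;   /* slot the next new record overwrites */
    unsigned int count;  /* number of valid records, at most MIG_ERR_RECORDS */
};

/* Return the stored record for this input string, or NULL if there is none. */
const struct mig_err_record *mig_err_find(const struct mig_err_log *log,
                                          const char *input)
{
    for (unsigned int i = 0; i < log->count; i++)
        if (strcmp(log->rec[i].input, input) == 0)
            return &log->rec[i];
    return NULL;
}

/* Record a failed write. Identical input strings share a single record. */
void mig_err_add(struct mig_err_log *log, const char *input, const char *error)
{
    struct mig_err_record *r;

    assert(log && input && error);
    if (mig_err_find(log, input))
        return;
    r = &log->rec[log->next];  /* overwrites the oldest record once full */
    snprintf(r->input, sizeof(r->input), "%s", input);
    snprintf(r->error, sizeof(r->error), "%s", error);
    log->next = (log->next + 1) % MIG_ERR_RECORDS;
    if (log->count < MIG_ERR_RECORDS)
        log->count++;
}

/*
 * Format the records as "input string: error" lines, in slot order.
 * Returns the number of bytes written, not counting the trailing NUL.
 */
size_t mig_err_show(const struct mig_err_log *log, char *buf, size_t len)
{
    size_t off = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';
    for (unsigned int i = 0; i < log->count; i++) {
        int n = snprintf(buf + off, len - off, "%s: %s\n",
                         log->rec[i].input, log->rec[i].error);
        if (n < 0 || (size_t)n >= len - off) {
            buf[off] = '\0';  /* drop a line that does not fit */
            break;
        }
        off += (size_t)n;
    }
    return off;
}
```

Deduplicating on the input string is what lets 10 slots answer more than 10
queries: a repeated failing write does not evict anything.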

Or could we just define some common errnos? For example:
#define ENOMIGRATION         140  /* device does not support migration */
#define EUNATCH              49   /* software version does not match */
#define EHWNM                142  /* hardware does not match */
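
To show how userspace might consume such codes, here is a minimal sketch that
maps them to a hint for the user. The values are the ones listed above;
ENOMIGRATION and EHWNM are proposed names, not existing errnos, and
mig_errno_hint is a hypothetical helper. Note that Linux already defines
EUNATCH as 49 with the meaning "Protocol driver not attached", so the #ifndef
guards only avoid a redefinition if <errno.h> is included.

```c
#include <stdio.h>
#include <string.h>

/* Values copied from the proposal; two of these names are hypothetical. */
#ifndef ENOMIGRATION
#define ENOMIGRATION 140  /* proposed: device does not support migration */
#endif
#ifndef EUNATCH
#define EUNATCH 49        /* existing errno, reused here for a version mismatch */
#endif
#ifndef EHWNM
#define EHWNM 142         /* proposed: hardware does not match */
#endif

/* Translate the errno from a failed version write into a user-facing hint. */
const char *mig_errno_hint(int err)
{
    switch (err) {
    case ENOMIGRATION:
        return "device does not support migration";
    case EUNATCH:
        return "software version does not match";
    case EHWNM:
        return "hardware does not match";
    default:
        return "incompatible for an unspecified reason";
    }
}
```

A management tool could print this hint next to the device pair it tried,
instead of a bare "migration failed".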
