qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Neo Jia <cjia@nvidia.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: "Daniel P. Berrange" <berrange@redhat.com>,
	Kirti Wankhede <kwankhede@nvidia.com>,
	Andy Currid <ACurrid@nvidia.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	"libvir-list@redhat.com" <libvir-list@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	"Song, Jike" <jike.song@intel.com>,
	Gerd Hoffmann <kraxel@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"bjsdjshi@linux.vnet.ibm.com" <bjsdjshi@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [libvirt] [RFC v2] libvirt vGPU QEMU integration
Date: Wed, 28 Sep 2016 13:06:31 -0700	[thread overview]
Message-ID: <20160928200631.GC26800@nvidia.com> (raw)
In-Reply-To: <20160928135547.280daff0@t450s.home>

On Wed, Sep 28, 2016 at 01:55:47PM -0600, Alex Williamson wrote:
> On Wed, 28 Sep 2016 12:22:35 -0700
> Neo Jia <cjia@nvidia.com> wrote:
> 
> > On Thu, Sep 22, 2016 at 03:26:38PM +0100, Daniel P. Berrange wrote:
> > > On Thu, Sep 22, 2016 at 08:19:21AM -0600, Alex Williamson wrote:  
> > > > On Thu, 22 Sep 2016 09:41:20 +0530
> > > > Kirti Wankhede <kwankhede@nvidia.com> wrote:
> > > >   
> > > > > >>>>> My concern is that a type id seems arbitrary but we're specifying that
> > > > > >>>>> it be unique.  We already have something unique, the name.  So why try
> > > > > >>>>> to make the type id unique as well?  A vendor can accidentally create
> > > > > >>>>> their vendor driver so that a given name means something very
> > > > > >>>>> specific.  On the other hand they need to be extremely deliberate to
> > > > > >>>>> coordinate that a type id means a unique thing across all their product
> > > > > >>>>> lines.
> > > > > >>>>>         
> > > > > >>>>
> > > > > >>>> Let me clarify, type id should be unique in the list of
> > > > > >>>> mdev_supported_types. You can't have 2 directories in with same name.      
> > > > > >>>
> > > > > >>> Of course, but does that mean it's only unique to the machine I'm
> > > > > >>> currently running on?  Let's say I have a Tesla P100 on my system and
> > > > > >>> type-id 11 is named "GRID-M60-0B".  At some point in the future I
> > > > > >>> replace the Tesla P100 with a Q1000 (made up).  Is type-id 11 on that
> > > > > >>> new card still going to be a "GRID-M60-0B"?  If not then we've based
> > > > > >>> our XML on the wrong attribute.  If the new device does not support
> > > > > >>> "GRID-M60-0B" then we should generate an error, not simply initialize
> > > > > >>> whatever type-id 11 happens to be on this new card.
> > > > > >>>       
> > > > > >>
> > > > > >> If there are 2 M60 in the system then you would find '11' type directory
> > > > > >> in mdev_supported_types of both M60. If you have P100, '11' type would
> > > > > >> not be there in its mdev_supported_types, it will have different types.
> > > > > >>
> > > > > >> For example, if you replace M60 with P100, but XML is not updated. XML
> > > > > >> have type '11'. When libvirt would try to create mdev device, libvirt
> > > > > >> would have to find 'create' file in sysfs in following directory format:
> > > > > >>
> > > > > >>  --- mdev_supported_types
> > > > > >>      |-- 11
> > > > > >>      |   |-- create
> > > > > >>
> > > > > >> but now for P100, '11' directory is not there, so libvirt should throw
> > > > > >> error on not able to find '11' directory.    
> > > > > > 
> > > > > > This really seems like an accident waiting to happen.  What happens
> > > > > > when the user replaces their M60 with an Intel XYZ device that happens
> > > > > > to expose a type 11 mdev class gpu device?  How is libvirt supposed to
> > > > > > know that the XML used to refer to a GRID-M60-0B and now it's an
> > > > > > INTEL-IGD-XYZ?  Doesn't basing the XML entry on the name and removing
> > > > > > yet another arbitrary requirement that we have some sort of globally
> > > > > > unique type-id database make a lot of sense?  The same issue applies
> > > > > > for simple debug-ability, if I'm reviewing the XML for a domain and the
> > > > > > name is the primary index for the mdev device, I know what it is.
> > > > > > Seeing type-id='11' is meaningless.
> > > > > >    
> > > > > 
> > > > > Let me clarify again, type '11' is a string that vendor driver would
> > > > > define (see my previous reply below) it could be "11" or "GRID-M60-0B".
> > > > > If 2 vendors used same string we can't control that. right?
> > > > > 
> > > > >   
> > > > > >>>> Lets remove 'id' from type id in XML if that is the concern. Supported
> > > > > >>>> types is going to be defined by vendor driver, so let vendor driver
> > > > > >>>> decide what to use for directory name and same should be used in device
> > > > > >>>> xml file, it could be '11' or "GRID M60-0B":
> > > > > >>>>
> > > > > >>>>     <device>
> > > > > >>>>       <name>my-vgpu</name>
> > > > > >>>>       <parent>pci_0000_86_00_0</parent>
> > > > > >>>>       <capability type='mdev'>
> > > > > >>>>         <type='11'/>
> > > > > >>>>         ...
> > > > > >>>>       </capability>
> > > > > >>>>     </device>  
> > > > 
> > > > Then let's get rid of the 'name' attribute and let the sysfs directory
> > > > simply be the name.  Then we can get rid of 'type' altogether so we
> > > > don't have this '11' vs 'GRID-M60-0B' issue.  Thanks,  
> > > 
> > > That sounds nice to me - we don't need two unique identifiers if
> > > one will do.  
> > 
> > Hi Alex and Daniel,
> > 
> > I just had some internal discussions here within NVIDIA and found out that
> > actually the name/label potentially might not be unique and the "id" will be. 
> > So I think we still would like to keep both so the id is the programmatic id
> > and the name/label is a human readable string for it, which might get changed to
> > be non-unique by outside of engineering.
> 
> I think your discovery only means that for your vendor driver, the name
> will be "11" (as a string).  Perhaps you'd like some sort of vendor
> provided description within each type, but I am not in favor of having
> an arbitrary integer value imply something specific within the sysfs
> interface.  IOW, the NVIDIA vendor driver should be able to create:
> 
> 11
> ├── create
> ├── description
> ├── etc
> └── resolution
> 
> While Intel might create:
> 
> Skylake-vGPU
> ├── create
> ├── description
> ├── etc
> └── resolution
> 
> Maybe "description" is optional for vendors that use useful names?
> Thanks,

I think we should be able to have a unique vendor type string instead of an
arbitrary integer value there as long as we are allowed to have a description
field that can be used to show to the end user as "name / label". 

Thanks,
Neo

> 
> Alex

  reply	other threads:[~2016-09-28 20:06 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-19 20:35 [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration Kirti Wankhede
2016-09-19 21:36 ` Alex Williamson
2016-09-19 21:50   ` Paolo Bonzini
2016-09-19 22:25     ` Alex Williamson
2016-09-20 14:35       ` Kirti Wankhede
2016-09-20 14:41         ` Daniel P. Berrange
2016-09-20 14:49           ` Paolo Bonzini
2016-09-20 14:58             ` Daniel P. Berrange
2016-09-20 15:05               ` Paolo Bonzini
2016-09-20 15:14                 ` Daniel P. Berrange
2016-09-20 16:31                   ` Kirti Wankhede
2016-09-20 16:36                     ` Daniel P. Berrange
2016-09-20 16:42                       ` Kirti Wankhede
2016-09-20 16:44                         ` Daniel P. Berrange
2016-09-20 16:46                     ` Daniel P. Berrange
2016-09-20 17:21                   ` Paolo Bonzini
2016-09-21  8:34                     ` Daniel P. Berrange
2016-09-20 14:52         ` Alex Williamson
2016-09-20  1:25   ` Tian, Kevin
2016-09-20 14:21   ` Kirti Wankhede
2016-09-20 14:43     ` Alex Williamson
2016-09-20 16:23       ` Kirti Wankhede
2016-09-20 16:50         ` Alex Williamson
2016-09-21 18:34           ` Kirti Wankhede
2016-09-21 19:03             ` Alex Williamson
2016-09-22  4:11               ` Kirti Wankhede
2016-09-22 14:19                 ` Alex Williamson
2016-09-22 14:26                   ` [Qemu-devel] [libvirt] " Daniel P. Berrange
2016-09-28 19:22                     ` Neo Jia
2016-09-28 19:45                       ` Tian, Kevin
2016-09-28 19:59                         ` Neo Jia
2016-09-28 20:31                           ` Laine Stump
2016-09-28 20:47                             ` Neo Jia
2016-09-28 22:49                           ` Alex Williamson
2016-09-28 19:55                       ` Alex Williamson
2016-09-28 20:06                         ` Neo Jia [this message]
2016-09-28 22:39                           ` Alex Williamson
2016-09-29  8:03                       ` Daniel P. Berrange
2016-09-29  8:12                         ` Neo Jia
2016-09-29 14:22               ` [Qemu-devel] " Kirti Wankhede
2016-09-21  4:10         ` Tian, Kevin
2016-09-21  4:43           ` Alex Williamson
2016-09-22  2:43             ` Tian, Kevin
2016-09-22 19:25         ` Tian, Kevin
2016-09-23 18:34           ` Kirti Wankhede
2016-09-21  3:56     ` Tian, Kevin
2016-09-21  4:36       ` Alex Williamson
2016-09-22  2:33         ` Tian, Kevin
2016-09-22  3:01           ` Alex Williamson
2016-09-22  3:42             ` Tian, Kevin
     [not found]         ` <AADFC41AFE54684AB9EE6CBC0274A5D18DF86F5F@SHSMSX101.ccr.corp.intel.com>
2016-09-22  2:59           ` Tian, Kevin
2016-09-20  1:37 ` Tian, Kevin
2016-09-20  9:47 ` Daniel P. Berrange
2016-09-28 19:48   ` Neo Jia
2016-09-29  8:06     ` Daniel P. Berrange
2016-09-29 14:35       ` Tian, Kevin
2016-09-29 14:38         ` Daniel P. Berrange
2016-09-29 14:42           ` Tian, Kevin
2016-09-30  5:19             ` Kirti Wankhede
2016-10-03  8:20               ` Kirti Wankhede
2016-10-07  5:16                 ` Kirti Wankhede
2016-10-07 19:09                   ` Alex Williamson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160928200631.GC26800@nvidia.com \
    --to=cjia@nvidia.com \
    --cc=ACurrid@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=berrange@redhat.com \
    --cc=bjsdjshi@linux.vnet.ibm.com \
    --cc=jike.song@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kraxel@redhat.com \
    --cc=kwankhede@nvidia.com \
    --cc=libvir-list@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).