qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Neo Jia <cjia@nvidia.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
	Kirti Wankhede <kwankhede@nvidia.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"kraxel@redhat.com" <kraxel@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"Song, Jike" <jike.song@intel.com>,
	"bjsdjshi@linux.vnet.ibm.com" <bjsdjshi@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [PATCH v6 1/4] vfio: Mediated device Core driver
Date: Mon, 15 Aug 2016 15:09:30 -0700	[thread overview]
Message-ID: <20160815220930.GB14876@nvidia.com> (raw)
In-Reply-To: <20160815095926.1d2d6cc2@t450s.home>

On Mon, Aug 15, 2016 at 09:59:26AM -0600, Alex Williamson wrote:
> On Mon, 15 Aug 2016 09:38:52 +0000
> "Tian, Kevin" <kevin.tian@intel.com> wrote:
> 
> > > From: Kirti Wankhede [mailto:kwankhede@nvidia.com]
> > > Sent: Saturday, August 13, 2016 8:37 AM
> > > 
> > > 
> > > 
> > > On 8/13/2016 2:46 AM, Alex Williamson wrote:  
> > > > On Sat, 13 Aug 2016 00:14:39 +0530
> > > > Kirti Wankhede <kwankhede@nvidia.com> wrote:
> > > >  
> > > >> On 8/10/2016 12:30 AM, Alex Williamson wrote:  
> > > >>> On Thu, 4 Aug 2016 00:33:51 +0530
> > > >>> Kirti Wankhede <kwankhede@nvidia.com> wrote:
> > > >>>
> > > >>> This is used later by mdev_device_start() and mdev_device_stop() to get
> > > >>> the parent_device so it can call the start and stop ops callbacks
> > > >>> respectively.  That seems to imply that all of instances for a given
> > > >>> uuid come from the same parent_device.  Where is that enforced?  I'm
> > > >>> still having a hard time buying into the uuid+instance plan when it
> > > >>> seems like each mdev_device should have an actual unique uuid.
> > > >>> Userspace tools can figure out which uuids to start for a given user, I
> > > >>> don't see much value in collecting them to instances within a uuid.
> > > >>>  
> > > >>
> > > >> Initially we started discussion with VM_UUID+instance suggestion, where
> > > >> instance was introduced to support multiple devices in a VM.  
> > > >
> > > > The instance number was never required in order to support multiple
> > > > devices in a VM, IIRC this UUID+instance scheme was to appease NVIDIA
> > > > management tools which wanted to re-use the VM UUID by creating vGPU
> > > > devices with that same UUID and therefore associate udev events to a
> > > > given VM.  Only then does an instance number become necessary since the
> > > > UUID needs to be static for a vGPUs within a VM.  This has always felt
> > > > like a very dodgy solution when we should probably just be querying
> > > > libvirt to give us a device to VM association.  
> > 
> > Agree with Alex here. We'd better not assume that UUID will be a VM_UUID
> > for mdev in the basic design. It's bound to NVIDIA management stack too tightly.
> > 
> > I'm OK to give enough flexibility for various upper level management stacks,
> > e.g. instead of $UUID+INDEX style, would $UUID+STRING provide a better
> > option where either UUID or STRING could be optional? Upper management 
> > stack can choose its own policy to identify a mdev:
> > 
> > a) $UUID only, so each mdev is allocated with a unique UUID
> > b) STRING only, which could be an index (0, 1, 2, ...), or any combination 
> > (vgpu0, vgpu1, etc.)
> > c) $UUID+STRING, where UUID could be a VM UUID, and STRING could be
> > a numeric index
> > 
> > > >  
> > > >> 'mdev_create' creates device and 'mdev_start' is to commit resources of
> > > >> all instances of similar devices assigned to VM.
> > > >>
> > > >> For example, to create 2 devices:
> > > >> # echo "$UUID:0:params" > /sys/devices/../mdev_create
> > > >> # echo "$UUID:1:params" > /sys/devices/../mdev_create
> > > >>
> > > >> "$UUID-0" and "$UUID-1" devices are created.
> > > >>
> > > >> Commit resources for above devices with single 'mdev_start':
> > > >> # echo "$UUID" > /sys/class/mdev/mdev_start
> > > >>
> > > >> Considering $UUID to be a unique UUID of a device, we don't need
> > > >> 'instance', so 'mdev_create' would look like:
> > > >>
> > > >> # echo "$UUID1:params" > /sys/devices/../mdev_create
> > > >> # echo "$UUID2:params" > /sys/devices/../mdev_create
> > > >>
> > > >> where $UUID1 and $UUID2 would be mdev device's unique UUID and 'params'
> > > >> would be vendor specific parameters.
> > > >>
> > > >> Device nodes would be created as "$UUID1" and "$UUID"
> > > >>
> > > >> Then 'mdev_start' would be:
> > > >> # echo "$UUID1, $UUID2" > /sys/class/mdev/mdev_start
> > > >>
> > > >> Similarly 'mdev_stop' and 'mdev_destroy' would be:
> > > >>
> > > >> # echo "$UUID1, $UUID2" > /sys/class/mdev/mdev_stop  
> > > >
> > > > I'm not sure a comma separated list makes sense here, for both
> > > > simplicity in the kernel and more fine grained error reporting, we
> > > > probably want to start/stop them individually.  Actually, why is it
> > > > that we can't use the mediated device being opened and released to
> > > > automatically signal to the backend vendor driver to commit and release
> > > > resources? I don't fully understand why userspace needs this interface.  
> > 
> > There is a meaningful use of start/stop interface, as required in live
> > migration support. Such interface allows vendor driver to quiescent 
> > mdev activity on source device before mdev hardware state is snapshot,
> > and then resume mdev activity on dest device after its state is recovered.
> > Intel has implemented experimental live migration support in KVMGT (soon
> > to release), based on above two interfaces (plus another two to get/set
> > mdev state).
> 
> Ok, that's actually an interesting use case for start/stop.
> 
> > > 
> > > For NVIDIA vGPU solution we need to know all devices assigned to a VM in
> > > one shot to commit resources of all vGPUs assigned to a VM along with
> > > some common resources.  
> > 
> > Kirti, can you elaborate the background about above one-shot commit
> > requirement? It's hard to understand such a requirement. 
> 
> Agree, I know NVIDIA isn't planning to support hotplug initially, but
> this seems like we're precluding hotplug from the design.  I don't
> understand what's driving this one-shot requirement.

Hi Alex,

The requirement here is based on how our internal vGPU device model designed and
with this we are able to pre-allocate resources required for multiple virtual
devices within same domain.

And I don't think this syntax will stop us from supporting hotplug at all.

For example, you can always create a virtual mdev and then do

echo "mdev_UUID" > /sys/class/mdev/mdev_start

then use QEMU monitor to add the device for hotplug.

> 
> > As I relied in another mail, I really hope start/stop become a per-mdev
> > attribute instead of global one, e.g.:
> > 
> > echo "0/1" > /sys/class/mdev/12345678-1234-1234-1234-123456789abc/start
> > 
> > In many scenario the user space client may only want to talk to mdev
> > instance directly, w/o need to contact its parent device. Still take
> > live migration for example, I don't think Qemu wants to know parent
> > device of assigned mdev instances.
> 
> Yep, QEMU won't know the parent device, only libvirt level tools
> managing the creation and destruction of the mdev device would know
> that.  Perhaps in addition to migration uses we could even use
> start/stop for basic power management, device D3 state in the guest
> could translate to a stop command to remove that vGPU from scheduling
> while still retaining most of the state and resource allocations.

Just recap what I have replied to Kevin on his previous email, the current
mdev_start and mdev_stop doesn't require any knowledge of parent device.

Thanks,
Neo

> Thanks,
> 
> Alex

  reply	other threads:[~2016-08-15 22:09 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-03 19:03 [Qemu-devel] [PATCH v6 0/4] Add Mediated device support Kirti Wankhede
2016-08-03 19:03 ` [Qemu-devel] [PATCH v6 1/4] vfio: Mediated device Core driver Kirti Wankhede
2016-08-04  7:21   ` Tian, Kevin
2016-08-05  6:13     ` Kirti Wankhede
2016-08-15  9:15       ` Tian, Kevin
2016-08-09 19:00   ` Alex Williamson
2016-08-12 18:44     ` Kirti Wankhede
2016-08-12 21:16       ` Alex Williamson
2016-08-13  0:37         ` Kirti Wankhede
2016-08-15  9:38           ` Tian, Kevin
2016-08-15 15:59             ` Alex Williamson
2016-08-15 22:09               ` Neo Jia [this message]
2016-08-15 22:52                 ` Alex Williamson
2016-08-15 23:23                   ` Neo Jia
2016-08-16  0:49                     ` Tian, Kevin
2016-08-15 19:59             ` Neo Jia
2016-08-15 22:47               ` Alex Williamson
2016-08-15 23:54                 ` Neo Jia
2016-08-16  0:18                 ` Tian, Kevin
2016-08-16 20:30                 ` Neo Jia
2016-08-16 20:51                   ` Alex Williamson
2016-08-16 21:17                     ` Neo Jia
2016-08-16  0:30               ` Tian, Kevin
2016-08-16  3:45                 ` Neo Jia
2016-08-16  3:50                   ` Tian, Kevin
2016-08-16  4:16                     ` Neo Jia
2016-08-16  4:52                       ` Tian, Kevin
2016-08-16  5:43                         ` Neo Jia
2016-08-16  5:58                           ` Tian, Kevin
2016-08-16  6:13                             ` Neo Jia
2016-08-16 21:03                               ` Alex Williamson
2016-08-16 12:49                         ` Alex Williamson
2016-08-03 19:03 ` [Qemu-devel] [PATCH v6 2/4] vfio: VFIO driver for mediated PCI device Kirti Wankhede
2016-08-03 21:03   ` kbuild test robot
2016-08-04  0:19   ` kbuild test robot
2016-08-09 19:00   ` Alex Williamson
2016-08-10 21:23     ` Kirti Wankhede
2016-08-10 23:00       ` Alex Williamson
2016-08-11 15:59         ` Kirti Wankhede
2016-08-11 16:24           ` Alex Williamson
2016-08-11 17:46             ` Kirti Wankhede
2016-08-11 18:43               ` Alex Williamson
2016-08-12 17:57                 ` Kirti Wankhede
2016-08-12 21:25                   ` Alex Williamson
2016-08-13  0:42                     ` Kirti Wankhede
2016-08-03 19:03 ` [Qemu-devel] [PATCH v6 3/4] vfio iommu: Add support for mediated devices Kirti Wankhede
2016-08-09 19:00   ` Alex Williamson
2016-08-11 14:22     ` Kirti Wankhede
2016-08-11 16:28       ` Alex Williamson
2016-08-03 19:03 ` [Qemu-devel] [PATCH v6 4/4] docs: Add Documentation for Mediated devices Kirti Wankhede
2016-08-04  7:31   ` Tian, Kevin
2016-08-05  7:45     ` Kirti Wankhede
2016-08-24 22:36   ` Daniel P. Berrange

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160815220930.GB14876@nvidia.com \
    --to=cjia@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=bjsdjshi@linux.vnet.ibm.com \
    --cc=jike.song@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kraxel@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).