kvm.vger.kernel.org archive mirror
From: Erik Skultety <eskultet@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Neo Jia <cjia@nvidia.com>,
	kvm@vger.kernel.org, libvirt <libvir-list@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Tina Zhang <tina.zhang@intel.com>,
	Kirti Wankhede <kwankhede@nvidia.com>,
	Gerd Hoffmann <kraxel@redhat.com>, Laine Stump <laine@redhat.com>,
	Jiri Denemark <jdenemar@redhat.com>,
	intel-gvt-dev@lists.freedesktop.org
Subject: Re: [libvirt] Expose vfio device display/migration to libvirt and above, was Re: [PATCH 0/3] sample: vfio mdev display devices.
Date: Thu, 10 May 2018 13:00:29 +0200
Message-ID: <20180510110029.GA9645@erzo-ntb>
In-Reply-To: <20180504100344.221e399f@t450s.home>

...

> > Now, if we (theoretically) can settle on easing the restrictions Alex
> > has mentioned, we in fact could introduce a QMP command to probe
> > these devices and provide libvirt with useful information at that
> > point in time. Of course, since the 3rd party vendor is "de-coupled"
> > from qemu, libvirt would have no way to find out that the driver has
> > changed in the meantime, thus still using the old information we
> > gathered, ergo potentially causing the QEMU process to fail
> > eventually. But then again, there's very often a strong
> > recommendation to reboot your host after a driver update, especially
> > in NVIDIA's case, which means this fact wouldn't matter. However,
> > there's also a significant drawback to my proposal which probably
> > renders it completely useless (but we can continue from there...) and
> > that is the devices would either have to be present already (not an
> > option) or QEMU would need to be enhanced in a way, that it would
> > create a dummy device during QMP probing, open it, collect the
> > information libvirt needs, close it and remove it. If the driver
> > doesn't change in the meantime, this should be sufficient for a VM to
> > be successfully instantiated with a display, right?
>
> I don't think this last requirement is possible, QEMU is as clueless
> about the capabilities of an mdev device as anyone else until that
> device is opened and probed, so how would we invent this "dummy
> device"?  I don't really see how there's any ability for
> pre-determination of the device capabilities, we can only probe the
> actual device we intend to use.

Hmm, let's say libvirt is able to create mdevs. Do the vendor drivers impose
any limitations on whether a specific device type, or a specific instance of a
type, presents features like display or migration differently from the other
types/instances? IOW, I would assume that once a driver version supports
display/migration, any mdev instance of any mdev type the driver supports will
"inherit" that support. If this assumption holds, libvirt, knowing there are
some mdev-capable parent devices, could technically create a dummy instance of
the first type it can for each parent device and pass the UUID to a QEMU QMP
query command. QEMU would then open and probe the device and return the
capabilities, which libvirt would cache. The next time a VM is due to start,
libvirt could use the device UUID to look up the cached capabilities and try
setting the appropriate config options. However, as you've mentioned, this
approach is fairly policy-driven, which doesn't fit libvirt's goal. Would
such a suggestion help at all from QEMU's POV?
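
To make the idea a bit more concrete, here is a rough sketch of the probing
flow in Python. It assumes the mdev sysfs layout from the kernel
documentation (`/sys/class/mdev_bus/<parent>/mdev_supported_types/<type>/create`
and `/sys/bus/mdev/devices/<uuid>/remove`). The QMP query itself is purely
hypothetical, since no such command exists today, so it is passed in as a
callable. Keying the cache by parent device is also my assumption, following
from the "every instance inherits the support" premise above.

```python
import uuid
from pathlib import Path
from typing import Callable, Dict, Optional


def first_creatable_type(parent: Path) -> Optional[Path]:
    """Return the first mdev type directory of `parent` that still has
    available instances, or None if there is none."""
    types_dir = parent / "mdev_supported_types"
    if not types_dir.is_dir():
        return None
    for type_dir in sorted(types_dir.iterdir()):
        avail = type_dir / "available_instances"
        if avail.is_file() and int(avail.read_text().strip()) > 0:
            return type_dir
    return None


def create_dummy_mdev(type_dir: Path) -> str:
    """Create an mdev instance by writing a fresh UUID to the `create` node."""
    dev_uuid = str(uuid.uuid4())
    (type_dir / "create").write_text(dev_uuid)
    return dev_uuid


def query_caps_via_qmp(dev_uuid: str) -> dict:
    """HYPOTHETICAL placeholder for the QMP probe discussed in this thread.
    QEMU has no such command; a real implementation would have to be added."""
    raise NotImplementedError("no QMP command for probing mdev capabilities")


def probe_parents(sysfs_root: Path = Path("/sys"),
                  query: Callable[[str], dict] = query_caps_via_qmp
                  ) -> Dict[str, dict]:
    """Probe one dummy instance per mdev-capable parent and cache the result,
    keyed by the parent device name (an assumption, see the text above)."""
    cache: Dict[str, dict] = {}
    bus = sysfs_root / "class" / "mdev_bus"
    if not bus.is_dir():
        return cache
    for parent in sorted(bus.iterdir()):
        type_dir = first_creatable_type(parent)
        if type_dir is None:
            continue
        dev_uuid = create_dummy_mdev(type_dir)
        try:
            cache[parent.name] = query(dev_uuid)
        finally:
            # Remove the dummy instance again once it has been probed.
            remove = sysfs_root / "bus" / "mdev" / "devices" / dev_uuid / "remove"
            if remove.exists():
                remove.write_text("1")
    return cache
```

The cached result would go stale if the vendor driver changed without a host
reboot, which is the same caveat as in my earlier mail.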

>
> > > The above has pressed the need for investigating some sort of
> > > alternative API through which libvirt might introspect a vfio device
> > > and with vfio device migration on the horizon, it's natural that
> > > some sort of support for migration state compatibility for the
> > > device need be considered as a second user of such an API.
> > > However, we currently have no concept of migration compatibility on
> > > a per-device level as there are no migratable devices that live
> > > outside of the QEMU code base. It's therefore assumed that per
> > > device migration compatibility is encompassed by the versioned
> > > machine type for the overall VM.  We need participation all the way
> > > to the top of the VM management stack to resolve this issue and
> > > it's dragging down the (possibly) more simple question of how do we
> > > resolve the display situation.  Therefore I'm looking for
> > > alternatives for display that work within what we have available to
> > > us at the moment.
> > >
> > > Erik Skultety, who initially raised the display question, has
> > > identified one possible solution, which is to simply make the
> > > display configuration the user's problem (apologies if I've
> > > misinterpreted Erik).  I believe this would work something like:
> > >
> > >  - libvirt identifies a version of QEMU that includes 'display'
> > > support for vfio-pci devices and defaults to adding display=off for
> > > every vfio-pci device [have we chosen the wrong default (auto) in
> > > QEMU?].
> >
> > From libvirt's POV, a new XML attribute 'display' on the host
> > device type mdev should have a default value of 'off', potentially
> > extending this to 'auto' once we have enough information to base our
> > decision on. We'll need to combine this with a new attribute value
> > for the <video> element that would prevent adding an emulated VGA any
> > time <graphics> (spice,VNC) is requested, but that's something we'd
> > need to do anyway, so I'm just mentioning it.
>
> This raises another question: is the configuration of the emulated
> graphics a factor in handling the mdev device's display option?
> AFAIK, neither vGPU vendor provides a VBIOS for boot graphics, so even

Good point, I forgot that we don't have boot graphics yet. In that case, no,
the 'none' value isn't a factor here; libvirt can continue adding an emulated
VGA device just to have some boot output. I'm also curious how the display on
the secondary GPU is going to be presented to the end user, but that's out of
scope for libvirt.
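
For illustration, the proposed attribute could end up looking roughly like
what the snippet below generates. The `<hostdev mode='subsystem' type='mdev'
model='vfio-pci'>` element with a `<source><address uuid=.../>` child is the
existing libvirt syntax for mdevs. The `display` attribute and its 'off'
default are only the proposal from this thread, so treat the name and the
accepted values as assumptions; the helper `mdev_hostdev_xml` is made up for
this example.

```python
import xml.etree.ElementTree as ET


def mdev_hostdev_xml(dev_uuid: str, display: str = "off") -> str:
    """Build a <hostdev> element for an mdev with the PROPOSED 'display'
    attribute, defaulting to 'off' as discussed in this thread."""
    if display not in ("on", "off"):
        raise ValueError("display must be 'on' or 'off'")
    hostdev = ET.Element("hostdev", {
        "mode": "subsystem",
        "type": "mdev",
        "model": "vfio-pci",
        "display": display,  # proposed attribute, not an existing one
    })
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(source, "address", {"uuid": dev_uuid})
    return ET.tostring(hostdev, encoding="unicode")


if __name__ == "__main__":
    print(mdev_hostdev_xml("c2177883-f1bb-47f0-914d-32a22e3a8804", display="on"))
```

The user (or the management layer) would flip the value to 'on' explicitly,
which keeps the policy decision out of libvirt.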

> with a display option, we're mostly targeting a secondary graphics
> head, otherwise the user will be running headless until the guest OS
> drivers initialize.
>
> > >  - New XML support would allow a user to enable display support on
> > > the vfio device.
> > >
> > >  - Resolving any OpenGL dependencies of that change would be left to
> > >    the user.
> > >
> > > A nice aspect of this is that policy decisions are left to the user
> > > and clearly no interface changes are necessary, perhaps with the
> > > exception of deciding whether we've made the wrong default choice
> > > for vfio-pci devices in QEMU.
> >
> > It's a common practice that we offload decisions like this to users
> > (including management layer, i.e. openstack, ovirt).
> >
> > >
> > > On the other hand, if we do want to give libvirt a mechanism to
> > > probe the display support for a device, we can make a simplified
> > > QEMU instance be the mechanism through which we do that.  For
> > > example the script[1] can be provided with either a PCI device or
> > > sysfs path to an mdev device and run a minimal VM instance meeting
> > > the requirements of both GVTg and NVIDIA to report the display
> > > support and GL requirements for a device.  There are clearly some
> > > unrefined and atrocious bits of this script, but it's only a proof
> > > of concept, the process management can be improved and we can
> > > decide whether we want to provide qmp mechanism to introspect the
> > > device rather than grep'ing error messages.  The goal is simply to
> > > show that we could choose to embrace
> >
> > if not for anything else, error messages change, so that's not a way,
> > QMP is a much more standardized approach, but then again, as I
> > mentioned above, at the moment, libvirt probes for capabilities
> > during its start.
>
> Right, and none of these device capabilities are currently present via
> qmp, and in fact the VM fails to start in my example script when GL is
> needed but not present, so there's no QMP interface to probe until a
> configuration is found that the VM at least initializes w/o error.
>
> > > QEMU and use it not as a VM, but simply a tool for poking at a
> > > device given the restrictions the mdev vendor drivers have already
> > > imposed.
> > >
> > > So I think the question bounces back to libvirt, does libvirt want
> > > enough information about the display requirements for a given
> > > device to automatically attempt to add GL support for it,
> > > effectively a policy of 'if it's supported try to enable it', or
> > > should we leave well enough alone and let the user choose to enable
> > > it?
> > >
> > > Maybe some guiding questions:
> > >
> > >  - Will dma-buf always require GL support?
> > >
> > >  - Does GL support limit our ability to have a display over a remote
> > >    connection?
> > >
> > >  - Do region-based displays also work with GL support, even if not
> > >    required?
> >
> > Yeah, these are IMHO really tough to answer because we can't really
> > predict the future, which again favours a new libvirt attribute more.
> > Even if we decided that we truly need a dummy VM as tool for libvirt
> > to probe this info, I still feel like this should be done up in the
> > virtualization stack and libvirt again would be just a tool to do
> > stuff the way it's told to do it. But I'd very much like to hear
> > Dan's opinion, since beside libvirt he can cover openstack too.
>
> I've learned from Gerd offline that remote connections are possible,
> requiring maybe yet a different set of options, so I'm leaning even
> further in the direction that libvirt can really only provide the user
> with options, but cannot reasonably infer the intentions of the user's
> configuration even if device capabilities were exposed.  Thanks,

Agreed, this would turn out to be extremely policy-based, but like Daniel, I'm
really not sure whether these can be determined in an automated way at any
level. Sure, oVirt could present a set of contextual menus so that a 'human'
user would make the call (even a wrong one, for that matter), but I guess that
applies less to OpenStack.

Erik
