qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Ankit Agrawal <ankita@nvidia.com>,
	"clg@redhat.com" <clg@redhat.com>,
	"shannon.zhaosl@gmail.com" <shannon.zhaosl@gmail.com>,
	"peter.maydell@linaro.org" <peter.maydell@linaro.org>,
	"ani@anisinha.ca" <ani@anisinha.ca>,
	"berrange@redhat.com" <berrange@redhat.com>,
	"eduardo@habkost.net" <eduardo@habkost.net>,
	"imammedo@redhat.com" <imammedo@redhat.com>,
	"mst@redhat.com" <mst@redhat.com>,
	"eblake@redhat.com" <eblake@redhat.com>,
	"armbru@redhat.com" <armbru@redhat.com>,
	"david@redhat.com" <david@redhat.com>,
	"gshan@redhat.com" <gshan@redhat.com>,
	"Jonathan.Cameron@huawei.com" <Jonathan.Cameron@huawei.com>,
	Aniket Agashe <aniketa@nvidia.com>, Neo Jia <cjia@nvidia.com>,
	Kirti Wankhede <kwankhede@nvidia.com>,
	"Tarun Gupta (SW-GPU)" <targupta@nvidia.com>,
	Vikram Sethi <vsethi@nvidia.com>,
	Andy Currid <acurrid@nvidia.com>,
	Dheeraj Nigam <dnigam@nvidia.com>, Uday Dhoke <udhoke@nvidia.com>,
	"qemu-arm@nongnu.org" <qemu-arm@nongnu.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	libvir-list@redhat.com, Laine Stump <laine@redhat.com>
Subject: Re: [PATCH v2 3/3] qom: Link multiple numa nodes to device using a new object
Date: Tue, 17 Oct 2023 14:24:24 -0300	[thread overview]
Message-ID: <20231017172424.GN3952@nvidia.com> (raw)
In-Reply-To: <20231017105419.1469e5c9.alex.williamson@redhat.com>

On Tue, Oct 17, 2023 at 10:54:19AM -0600, Alex Williamson wrote:
> On Tue, 17 Oct 2023 12:28:30 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Tue, Oct 17, 2023 at 09:21:16AM -0600, Alex Williamson wrote:
> > 
> > > Do we therefore need some programatic means for the kernel driver to
> > > expose the node configuration to userspace?  What interfaces would
> > > libvirt like to see here?  Is there an opportunity that this could
> > > begin to define flavors or profiles for variant devices like we have
> > > types for mdev devices where the node configuration would be
> > > encompassed in a device profile?   
> > 
> > I don't think we should shift this mess into the kernel..
> > 
> > We have a wide range of things now that the orchestration must do in
> > order to prepare that are fairly device specific. I understand in K8S
> > configurations the preference is using operators (aka user space
> > drivers) to trigger these things.
> > 
> > Supplying a few extra qemu command line options seems minor compared
> > to all the profile and provisioning work that has to happen for other
> > device types.
> 
> This seems to be a growing problem for things like mlx5-vfio-pci where
> there's non-trivial device configuration necessary to enable migration
> support.  It's not super clear to me how those devices are actually
> expected to be used in practice with that configuration burden.

Yes, it is the nature of the situation. There is lots and lots of
stuff in the background here. We can nibble at some things, but I
don't see a way to be completely free of a userspace driver providing
the orchestration piece for every device type.

Maybe someone who knows more about the k8s stuff works can explain
more?

> Are we simply saying here that it's implicit knowledge that the
> orchestration must posses that when assigning devices exactly matching
> 10de:2342 or 10de:2345 when bound to the nvgrace-gpu-vfio-pci driver
> that 8 additional NUMA nodes should be added to the VM and an ACPI
> generic initiator object created linking those additional nodes to the
> assigned GPU?

What I'm trying to say is that orchestration should try to pull in a
userspace driver to provide the non-generic pieces.

But, it isn't clear to me what that driver is generically. Something
like this case is pretty stand alone, but mlx5 needs to interact with
the networking control plane to fully provision the PCI
function. Storage needs a different control plane. Few PCI devices are
so stand alone they can be provisioned without complicated help. 

Even things like IDXD need orchestration to sort of uniquely
understand how to spawn their SIOV functions.

I'm not sure I see a clear vision here from libvirt side how all these
parts interact in the libvirt world, or if the answer is "use
openshift and the operators".

Jason


  reply	other threads:[~2023-10-17 17:24 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-07 20:17 [PATCH v2 0/3] acpi: report numa nodes for device memory using GI ankita
2023-10-07 20:17 ` [PATCH v2 1/3] qom: new object to associate device to numa node ankita
2023-10-09 12:26   ` Jonathan Cameron via
2023-10-09 12:26     ` Jonathan Cameron
2023-10-11 17:37     ` Vikram Sethi
2023-10-12  8:59       ` Jonathan Cameron via
2023-10-12  8:59         ` Jonathan Cameron
2023-10-09 21:16   ` Alex Williamson
2023-10-13 13:16   ` Markus Armbruster
2023-10-17 13:44     ` Ankit Agrawal
2023-10-07 20:17 ` [PATCH v2 2/3] hw/acpi: Implement the SRAT GI affinity structure ankita
2023-10-09 21:16   ` Alex Williamson
2023-10-17 13:51     ` Ankit Agrawal
2023-10-07 20:17 ` [PATCH v2 3/3] qom: Link multiple numa nodes to device using a new object ankita
2023-10-09 12:30   ` Jonathan Cameron via
2023-10-09 12:30     ` Jonathan Cameron
2023-10-09 12:57     ` David Hildenbrand
2023-10-09 21:27     ` Alex Williamson
2023-10-17 14:18       ` Ankit Agrawal
2023-10-09 21:16   ` Alex Williamson
2023-10-17 14:00     ` Ankit Agrawal
2023-10-17 15:21       ` Alex Williamson
2023-10-17 15:28         ` Jason Gunthorpe
2023-10-17 16:54           ` Alex Williamson
2023-10-17 17:24             ` Jason Gunthorpe [this message]
2023-10-13 13:17   ` Markus Armbruster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231017172424.GN3952@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=acurrid@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=ani@anisinha.ca \
    --cc=aniketa@nvidia.com \
    --cc=ankita@nvidia.com \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=cjia@nvidia.com \
    --cc=clg@redhat.com \
    --cc=david@redhat.com \
    --cc=dnigam@nvidia.com \
    --cc=eblake@redhat.com \
    --cc=eduardo@habkost.net \
    --cc=gshan@redhat.com \
    --cc=imammedo@redhat.com \
    --cc=kwankhede@nvidia.com \
    --cc=laine@redhat.com \
    --cc=libvir-list@redhat.com \
    --cc=mst@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=shannon.zhaosl@gmail.com \
    --cc=targupta@nvidia.com \
    --cc=udhoke@nvidia.com \
    --cc=vsethi@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).