public inbox for kvm@vger.kernel.org
From: Jason Gunthorpe <jgg@nvidia.com>
To: Dave Airlie <airlied@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>, Zhi Wang <zhiw@nvidia.com>,
	kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, kevin.tian@intel.com,
	daniel@ffwll.ch, acurrid@nvidia.com, cjia@nvidia.com,
	smitra@nvidia.com, ankita@nvidia.com, aniketa@nvidia.com,
	kwankhede@nvidia.com, targupta@nvidia.com, zhiwang@kernel.org
Subject: Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
Date: Tue, 24 Sep 2024 20:47:37 -0300
Message-ID: <20240924234737.GO9417@nvidia.com>
In-Reply-To: <CAPM=9twKGFV8SA165QufaGUev0tnuHABAi0TMvDQSfa7PJfZaQ@mail.gmail.com>

On Wed, Sep 25, 2024 at 08:52:32AM +1000, Dave Airlie wrote:
> On Wed, 25 Sept 2024 at 05:57, Danilo Krummrich <dakr@kernel.org> wrote:
> >
> > On Tue, Sep 24, 2024 at 01:41:51PM -0300, Jason Gunthorpe wrote:
> > > On Tue, Sep 24, 2024 at 12:50:55AM +0200, Danilo Krummrich wrote:
> > >
> > > > > From the VFIO side I would like to see something like this merged in
> > > > > nearish future as it would bring a previously out of tree approach to
> > > > > be fully intree using our modern infrastructure. This is a big win for
> > > > > the VFIO world.
> > > > >
> > > > > As a commercial product this will be backported extensively to many
> > > > > old kernels and that is harder/impossible if it isn't exclusively in
> > > > > C. So, I think nova needs to co-exist in some way.
> > > >
> > > > We'll surely not support two drivers for the same thing in the long term,
> > > > neither does it make sense, nor is it sustainable.
> > >
> > > What is being done here is the normal correct kernel thing to
> > > do. Refactor the shared core code into a module and stick higher level
> > > stuff on top of it. Ideally Nova/Nouveau would exist as peers
> > > implementing DRM subsystem on this shared core infrastructure. We've
> > > done this sort of thing before in other places in the kernel. It has
> > > been proven to work well.
> >
> > So, that's where you have the wrong understanding of what we're
> > working on: You seem to think that Nova is just another DRM
> > subsystem layer on top of the NVKM parts (what you call the core
> > driver) of Nouveau.

Well, no, by core driver I mean only the very minimal parts that are
actually shared between vfio and drm. It should definitely not
include key parts you want to work on in rust, like the command
marshaling.

I expect there is more work to do in order to make this kind of split,
but this is what I'm thinking/expecting.

> > But the whole point of Nova is to replace the NVKM parts of Nouveau, since
> > that's where the problems we want to solve reside in.
> 
> Just to re-emphasise for Jason who might not be as across this stuff,
> 
> NVKM replacement with rust is the main reason for the nova project,
> 100% the driving force for nova is the unstable NVIDIA firmware API.
> The ability to use rust proc-macros to hide the NVIDIA instability
> instead of trying to do it in C by either generators or abusing C
> macros (which I don't think are sufficient).

I would not include any of this in the innermost core driver. My
thinking is informed by what we've done in RDMA, particularly mlx5
which has a pretty thin PCI driver and each of the drivers stacked on
top form their own command buffers directly. The PCI driver primarily
just does some device bootup, command execution and interrupts because
those are all shared by the subsystem drivers.

We have a lot of experience now building these kinds of
multi-subsystem structures and this pattern works very well.

So, broadly, build your rust proc macros on the DRM Nova driver and
call a core function to submit a command buffer to the device and get
back a response.

VFIO will make its command buffers with C and call the same core
function.
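
To make the shape of that split concrete, here is a rough userspace
sketch in C. Every name in it (core_dev, core_cmd_exec, vfio_demo_*,
drm_demo_*) is made up for illustration and is not an existing nvkm,
nova or mlx5 interface, and the "device" just echoes the command back.
The only point is that the core function treats the buffer as opaque
bytes while each upper driver marshals its own format:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical core-driver state: boot, command execution, interrupts. */
struct core_dev {
	unsigned int cmds_executed;
};

/*
 * Hypothetical shared entry point. The core never parses the payload;
 * it only moves bytes to the device and copies a response back. This
 * stand-in "device" simply echoes the command.
 */
static int core_cmd_exec(struct core_dev *dev, const void *cmd, size_t cmd_len,
			 void *resp, size_t resp_len)
{
	if (!dev || !cmd || !resp || cmd_len > resp_len)
		return -EINVAL;

	memcpy(resp, cmd, cmd_len);
	dev->cmds_executed++;
	return 0;
}

/* Stand-in for VFIO: builds its own command layout in plain C. */
struct vfio_demo_cmd {
	uint32_t opcode;
	uint32_t vf_index;
};

/* The layout is owned by this upper driver, not by the core. */
static_assert(sizeof(struct vfio_demo_cmd) == 8, "unexpected padding");

static int vfio_demo_enable_vf(struct core_dev *dev, uint32_t vf_index)
{
	struct vfio_demo_cmd cmd = { .opcode = 0x1, .vf_index = vf_index };
	struct vfio_demo_cmd resp;
	int ret;

	ret = core_cmd_exec(dev, &cmd, sizeof(cmd), &resp, sizeof(resp));
	if (ret)
		return ret;

	return resp.vf_index == vf_index ? 0 : -EIO;
}

/* Stand-in for DRM: a different format, same core function. */
static int drm_demo_ping(struct core_dev *dev)
{
	const uint8_t cmd[4] = { 'p', 'i', 'n', 'g' };
	uint8_t resp[4];
	int ret;

	ret = core_cmd_exec(dev, cmd, sizeof(cmd), resp, sizeof(resp));
	if (ret)
		return ret;

	return memcmp(cmd, resp, sizeof(cmd)) == 0 ? 0 : -EIO;
}
```

In this sketch the DRM side could generate its buffers however it
likes (rust proc macros included) without the core or the VFIO side
having to know about it.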

> I think the idea of a nova drm and nova core driver architecture is
> acceptable to most of us, but long term trying to maintain a nouveau based
> nvkm is definitely not acceptable due to the unstable firmware APIs.

? nova core, meaning nova rust, meaning vfio depends on rust, doesn't
seem acceptable ? We need to keep rust isolated to DRM for the
foreseeable future. Just need to find a separation that can do that.

Jason
