public inbox for kvm@vger.kernel.org
From: "Danilo Krummrich" <dakr@kernel.org>
To: "Zhi Wang" <zhiw@nvidia.com>
Cc: <kvm@vger.kernel.org>, <alex.williamson@redhat.com>,
	<kevin.tian@intel.com>, <jgg@nvidia.com>, <airlied@gmail.com>,
	<daniel@ffwll.ch>, <acurrid@nvidia.com>, <cjia@nvidia.com>,
	<smitra@nvidia.com>, <ankita@nvidia.com>, <aniketa@nvidia.com>,
	<kwankhede@nvidia.com>, <targupta@nvidia.com>,
	<zhiwang@kernel.org>, <acourbot@nvidia.com>,
	<joelagnelf@nvidia.com>, <apopple@nvidia.com>,
	<jhubbard@nvidia.com>, <nouveau@lists.freedesktop.org>
Subject: Re: [RFC v2 03/14] vfio/nvidia-vgpu: introduce vGPU type uploading
Date: Thu, 04 Sep 2025 11:41:03 +0200	[thread overview]
Message-ID: <DCJX0ZBB1ATN.1WPXONLVV8RYD@kernel.org> (raw)
In-Reply-To: <DCJWXVLI2GWB.3UBHWIZCZXKD2@kernel.org>

(Cc: Alex, John, Joel, Alistair, nouveau)

On Thu Sep 4, 2025 at 11:37 AM CEST, Danilo Krummrich wrote:
> On Thu Sep 4, 2025 at 12:11 AM CEST, Zhi Wang wrote:
>> diff --git a/drivers/vfio/pci/nvidia-vgpu/include/nvrm/gsp.h b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/gsp.h
>> new file mode 100644
>> index 000000000000..c3fb7b299533
>> --- /dev/null
>> +++ b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/gsp.h
>> @@ -0,0 +1,18 @@
>> +/* SPDX-License-Identifier: MIT */
>> +#ifndef __NVRM_GSP_H__
>> +#define __NVRM_GSP_H__
>> +
>> +#include <nvrm/nvtypes.h>
>> +
>> +/* Excerpt of RM headers from https://github.com/NVIDIA/open-gpu-kernel-modules/tree/570 */
>> +
>> +#define NV2080_CTRL_CMD_GSP_GET_FEATURES (0x20803601)
>> +
>> +typedef struct NV2080_CTRL_GSP_GET_FEATURES_PARAMS {
>> +	NvU32  gspFeatures;
>> +	NvBool bValid;
>> +	NvBool bDefaultGspRmGpu;
>> +	NvU8   firmwareVersion[GSP_MAX_BUILD_VERSION_LENGTH];
>> +} NV2080_CTRL_GSP_GET_FEATURES_PARAMS;
>> +
>> +#endif
>
> <snip>
>
>> +static struct version supported_version_list[] = {
>> +	{ 18, 1, "570.144" },
>> +};
>
> nova-core won't provide any firmware-specific APIs; it is meant to serve as a
> hardware and firmware abstraction layer for higher-level drivers such as vGPU
> or nova-drm.
>
> As a general rule the interface between nova-core and higher level drivers must
> not leak any hardware or firmware specific details, but work on a higher level
> abstraction layer.
>
> Now, I recognize that at some point it might be necessary to do some kind of
> versioning in this API anyways. For instance, when the semantics of the firmware
> API changes too significantly.
>
> However, this would be a separate API where nova-core, at the initial handshake,
> then asks clients to use e.g. v2 of the nova-core API, still hiding any firmware
> and hardware details from the client.
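To make the handshake idea concrete, here is a rough userspace sketch of version negotiation between nova-core and a client. Every identifier is hypothetical; this illustrates only the general pattern, not any actual nova-core API.

```c
#include <stdint.h>

/*
 * Hypothetical sketch: the client states the interface versions it can
 * speak, and the core replies with the highest version both sides
 * support (0 if there is no overlap). All names are made up.
 */

#define NOVA_CORE_IF_VERSION_MIN 1
#define NOVA_CORE_IF_VERSION_MAX 2

static uint32_t nova_core_negotiate(uint32_t client_min, uint32_t client_max)
{
	uint32_t max = client_max < NOVA_CORE_IF_VERSION_MAX ?
		       client_max : NOVA_CORE_IF_VERSION_MAX;

	/* No common version: the client must not attach. */
	if (max < client_min || max < NOVA_CORE_IF_VERSION_MIN)
		return 0;

	return max;
}
```

The point is that only an abstract interface version crosses the boundary, never a firmware version string like "570.144".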
>
> Some more general notes, since I also had a look at the nova-core <-> vGPU
> interface patches in your tree (even though I'm aware that they're not part of
> the RFC of course):
>
> The interface for the general lifecycle management for any clients attaching to
> nova-core (VGPU, nova-drm) should be common and not specific to vGPU. (The same
> goes for interfaces that will be used by vGPU and nova-drm.)
>
> The interface nova-core provides for that should be designed in Rust, so we can
> take full advantage of the type system when connecting Rust clients
> (nova-drm).
>
> For vGPU, we can then monomorphize those types into the corresponding C
> structures and provide the corresponding functions very easily.
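As an illustration of what such a monomorphized C view could look like, here is a minimal self-contained sketch. Every identifier is hypothetical, and the stub bodies exist purely so the sketch compiles; in reality the types and implementation would be defined on the Rust side and only this C surface would be generated from them.

```c
#include <stdlib.h>

/* Client-provided callbacks; hypothetical names for illustration only. */
struct nova_core_client_ops {
	/* Called back by nova-core, e.g. on teardown or recovery. */
	void (*reset_notify)(void *priv);
};

/* Opaque to C callers in practice; laid out here only for the stub. */
struct nova_core_client {
	const struct nova_core_client_ops *ops;
	void *priv;
};

static struct nova_core_client *
nova_core_client_register(const struct nova_core_client_ops *ops, void *priv)
{
	struct nova_core_client *client = malloc(sizeof(*client));

	if (!client)
		return NULL;

	client->ops = ops;
	client->priv = priv;
	return client;
}

static void nova_core_client_unregister(struct nova_core_client *client)
{
	free(client);
}
```

A C client would only ever see the register/unregister pair and an opaque handle, keeping hardware and firmware details on the Rust side.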
>
> Doing it the other way around would be a very bad idea, since the Rust type
> system is much more powerful and hence it'd be very hard to avoid introducing
> limitations on the Rust side of things.
>
> Hence, I recommend starting with some patches defining the API in nova-core for
> the general lifecycle (in Rust), so we can take it from there.
>
> Another note: I don't see any use of the auxiliary bus in vGPU. Any clients
> should attach via the auxiliary bus API, which provides proper matching when
> there's more than one compatible GPU in the system. nova-core already registers
> an auxiliary device for each bound PCI device.
>
> Please don't re-implement what the auxiliary bus already does for us.
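For reference, attaching via the auxiliary bus would look roughly like the sketch below (a non-buildable fragment, not a definitive implementation). The id string is an assumption: matching uses the "modname.devname" form, and the exact name is whatever nova-core registers.

```c
// SPDX-License-Identifier: GPL-2.0
#include <linux/auxiliary_bus.h>
#include <linux/module.h>

static int nvidia_vgpu_probe(struct auxiliary_device *adev,
			     const struct auxiliary_device_id *id)
{
	/* adev->dev.parent is the device nova-core is bound to. */
	dev_info(&adev->dev, "bound to a nova-core instance\n");
	return 0;
}

static void nvidia_vgpu_remove(struct auxiliary_device *adev)
{
}

static const struct auxiliary_device_id nvidia_vgpu_id_table[] = {
	/* "<modname>.<devname>"; the exact name is nova-core's to define. */
	{ .name = "nova_core.nova-core" },
	{ }
};
MODULE_DEVICE_TABLE(auxiliary, nvidia_vgpu_id_table);

static struct auxiliary_driver nvidia_vgpu_driver = {
	.name = "nvidia-vgpu",
	.probe = nvidia_vgpu_probe,
	.remove = nvidia_vgpu_remove,
	.id_table = nvidia_vgpu_id_table,
};
module_auxiliary_driver(nvidia_vgpu_driver);

MODULE_LICENSE("GPL");
```

With this, the driver core handles matching, hotplug, and per-GPU instances for free.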
>
> - Danilo



Thread overview: 23+ messages
2025-09-03 22:10 [RFC v2 00/14] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
2025-09-03 22:10 ` [RFC v2 01/14] vfio/nvidia-vgpu: introduce vGPU lifecycle management prelude Zhi Wang
2025-09-03 22:10 ` [RFC v2 02/14] vfio/nvidia-vgpu: allocate GSP RM client for NVIDIA vGPU manager Zhi Wang
2025-09-03 22:11 ` [RFC v2 03/14] vfio/nvidia-vgpu: introduce vGPU type uploading Zhi Wang
2025-09-04  9:37   ` Danilo Krummrich
2025-09-04  9:41     ` Danilo Krummrich [this message]
2025-09-04 12:15       ` Jason Gunthorpe
2025-09-04 12:45         ` Danilo Krummrich
2025-09-04 13:58           ` Jason Gunthorpe
2025-09-04 15:43       ` Zhi Wang
2025-09-06 10:34         ` Danilo Krummrich
2025-09-03 22:11 ` [RFC v2 04/14] vfio/nvidia-vgpu: allocate vGPU channels when creating vGPUs Zhi Wang
2025-09-03 22:11 ` [RFC v2 05/14] vfio/nvidia-vgpu: allocate vGPU FB memory " Zhi Wang
2025-09-03 22:11 ` [RFC v2 06/14] vfio/nvidia-vgpu: allocate mgmt heap " Zhi Wang
2025-09-03 22:11 ` [RFC v2 07/14] vfio/nvidia-vgpu: map mgmt heap when creating a vGPU Zhi Wang
2025-09-03 22:11 ` [RFC v2 08/14] vfio/nvidia-vgpu: allocate GSP RM client when creating vGPUs Zhi Wang
2025-09-03 22:11 ` [RFC v2 09/14] vfio/nvidia-vgpu: bootload the new vGPU Zhi Wang
2025-09-03 22:11 ` [RFC v2 10/14] vfio/nvidia-vgpu: introduce vGPU host RPC channel Zhi Wang
2025-09-03 22:36   ` Timur Tabi
2025-09-03 22:11 ` [RFC v2 11/14] vfio/nvidia-vgpu: introduce NVIDIA vGPU VFIO variant driver Zhi Wang
2025-09-03 22:11 ` [RFC v2 12/14] vfio/nvidia-vgpu: scrub the guest FB memory of a vGPU Zhi Wang
2025-09-03 22:11 ` [RFC v2 13/14] vfio/nvidia-vgpu: introduce vGPU logging Zhi Wang
2025-09-03 22:11 ` [RFC v2 14/14] vfio/nvidia-vgpu: add a kernel doc to introduce NVIDIA vGPU Zhi Wang
