* [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
@ 2024-09-22 12:49 Zhi Wang
  2024-09-22 12:49 ` [RFC 01/29] nvkm/vgpu: introduce NVIDIA vGPU support prelude Zhi Wang
                   ` (31 more replies)
  0 siblings, 32 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

1. Background
=============

NVIDIA vGPU[1] software enables powerful GPU performance for workloads
ranging from graphics-rich virtual workstations to data science and AI,
enabling IT to leverage the management and security benefits of
virtualization as well as the performance of NVIDIA GPUs required for
modern workloads. Installed on a physical GPU in a cloud or enterprise
data center server, NVIDIA vGPU software creates virtual GPUs that can
be shared across multiple virtual machines.

The vGPU architecture[2] can be illustrated as follows:

 +--------------------+    +--------------------+ +--------------------+ +--------------------+ 
 | Hypervisor         |    | Guest VM           | | Guest VM           | | Guest VM           | 
 |                    |    | +----------------+ | | +----------------+ | | +----------------+ | 
 | +----------------+ |    | |Applications... | | | |Applications... | | | |Applications... | | 
 | |  NVIDIA        | |    | +----------------+ | | +----------------+ | | +----------------+ | 
 | |  Virtual GPU   | |    | +----------------+ | | +----------------+ | | +----------------+ | 
 | |  Manager       | |    | |  Guest Driver  | | | |  Guest Driver  | | | |  Guest Driver  | | 
 | +------^---------+ |    | +----------------+ | | +----------------+ | | +----------------+ | 
 |        |           |    +---------^----------+ +----------^---------+ +----------^---------+ 
 |        |           |              |                       |                      |           
 |        |           +--------------+-----------------------+----------------------+---------+ 
 |        |                          |                       |                      |         | 
 |        |                          |                       |                      |         | 
 +--------+--------------------------+-----------------------+----------------------+---------+ 
+---------v--------------------------+-----------------------+----------------------+----------+
| NVIDIA                  +----------v---------+ +-----------v--------+ +-----------v--------+ |
| Physical GPU            |   Virtual GPU      | |   Virtual GPU      | |   Virtual GPU      | |
|                         +--------------------+ +--------------------+ +--------------------+ |
+----------------------------------------------------------------------------------------------+

Each NVIDIA vGPU is analogous to a conventional GPU, having a fixed amount
of GPU framebuffer, and one or more virtual display outputs or "heads".
The vGPU’s framebuffer is allocated out of the physical GPU’s framebuffer
at the time the vGPU is created, and the vGPU retains exclusive use of
that framebuffer until it is destroyed.

The number of physical GPUs that a board has depends on the board. Each
physical GPU can support several different types of virtual GPU (vGPU).
vGPU types have a fixed amount of frame buffer, number of supported
display heads, and maximum resolutions. They are grouped into different
series according to the different classes of workload for which they are
optimized. Each series is identified by the last letter of the vGPU type
name.

NVIDIA vGPU supports Windows and Linux guest VM operating systems. The
supported vGPU types depend on the guest VM OS.

2. Proposal for upstream
========================

2.1 Architecture
----------------

Moving to upstream, the proposed architecture can be illustrated as follows:

                            +--------------------+ +--------------------+ +--------------------+ 
                            | Linux VM           | | Windows VM         | | Guest VM           | 
                            | +----------------+ | | +----------------+ | | +----------------+ | 
                            | |Applications... | | | |Applications... | | | |Applications... | | 
                            | +----------------+ | | +----------------+ | | +----------------+ | ... 
                            | +----------------+ | | +----------------+ | | +----------------+ | 
                            | |  Guest Driver  | | | |  Guest Driver  | | | |  Guest Driver  | | 
                            | +----------------+ | | +----------------+ | | +----------------+ | 
                            +---------^----------+ +----------^---------+ +----------^---------+ 
                                      |                       |                      |           
                           +--------------------------------------------------------------------+
                           |+--------------------+ +--------------------+ +--------------------+|
                           ||       QEMU         | |       QEMU         | |       QEMU         ||
                           ||                    | |                    | |                    ||
                           |+--------------------+ +--------------------+ +--------------------+|
                           +--------------------------------------------------------------------+
                                      |                       |                      |
+-----------------------------------------------------------------------------------------------+
|                           +----------------------------------------------------------------+  |
|                           |                                VFIO                            |  |
|                           |                                                                |  |
| +-----------------------+ | +------------------------+  +---------------------------------+|  |
| |  Core Driver vGPU     | | |                        |  |                                 ||  |
| |       Support        <--->|                       <---->                                ||  |
| +-----------------------+ | | NVIDIA vGPU Manager    |  | NVIDIA vGPU VFIO Variant Driver ||  |
| |    NVIDIA GPU Core    | | |                        |  |                                 ||  |
| |        Driver         | | +------------------------+  +---------------------------------+|  |
| +--------^--------------+ +----------------------------------------------------------------+  |
|          |                          |                       |                      |          |
+-----------------------------------------------------------------------------------------------+
           |                          |                       |                      |           
+----------|--------------------------|-----------------------|----------------------|----------+
|          v               +----------v---------+ +-----------v--------+ +-----------v--------+ |
|  NVIDIA                  |       PCI VF       | |       PCI VF       | |       PCI VF       | |
|  Physical GPU            |                    | |                    | |                    | |
|                          |   (Virtual GPU)    | |   (Virtual GPU)    | |    (Virtual GPU)   | |
|                          +--------------------+ +--------------------+ +--------------------+ |
+-----------------------------------------------------------------------------------------------+

The supported GPU generation will be Ada, which comes with SR-IOV support.
Each vGPU is backed by a PCI virtual function (VF).

The NVIDIA vGPU VFIO variant driver, together with VFIO, sits on the VFs and
provides extended management and features, e.g. selecting the vGPU type,
live migration, and driver warm update.

As with other devices supported by VFIO, the standard VFIO userspace APIs
are provided for device lifecycle management and advanced feature support.

The NVIDIA vGPU manager provides the necessary support to the NVIDIA vGPU
VFIO variant driver to create/destroy vGPUs, query available vGPU types,
select the vGPU type, etc.

On the other side, the NVIDIA vGPU manager talks to the NVIDIA GPU core
driver, which provides the necessary support to reach the HW functions.
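
As a rough illustration of this boundary, the sketch below is based on the
interfaces introduced later in this series (patch 10 onwards); it is not a
final API and error handling is elided. The VFIO side reaches the core
driver support through an ops table exported by nvkm:

  struct nvkm_vgpu_mgr_vfio_ops *ops;

  /* 'handle' is the nvkm device handle shared with the VFIO module. */
  ops = nvkm_vgpu_mgr_get_vfio_ops(handle);
  if (!ops->vgpu_mgr_is_enabled(handle))
          return -ENODEV;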

2.2 Requirements to the NVIDIA GPU core driver
----------------------------------------------

The primary use case for CSPs and enterprises is a standalone, minimal set
of drivers consisting of the vGPU manager and the other necessary components.

The NVIDIA vGPU manager talks to the NVIDIA GPU core driver, which provides
the necessary support to:

- Load the GSP firmware, boot the GSP, and provide the communication
  channel.
- Manage the shared/partitioned HW resources, e.g. reserving FB memory and
  channels for the vGPU manager to create vGPUs.
- Exception handling, e.g. delivering GSP events to the vGPU manager.
- Host event dispatch, e.g. suspend/resume.
- Enumeration of the HW configuration.

The NVIDIA GPU core driver, which sits on the PCI device interface of the
NVIDIA GPU, provides support to both the DRM driver and the vGPU manager.

In this RFC, the split nouveau GPU driver[3] is used as an example to
demonstrate the requirements of the vGPU manager on the core driver. The
nouveau driver is split into nouveau (the DRM driver) and nvkm (the core
driver).

3. Try the RFC patches
======================

The RFC supports creating one VM to test a simple GPU workload.

- Host kernel: https://github.com/zhiwang-nvidia/linux/tree/zhi/vgpu-mgr-rfc
- Guest driver package: NVIDIA-Linux-x86_64-535.154.05.run [4]

  Install guest driver:
  # export GRID_BUILD=1
  # ./NVIDIA-Linux-x86_64-535.154.05.run

- Tested platforms: L40.
- Tested guest OS: Ubuntu 24.04 LTS.
- Supported experience: rich Linux desktop experience with simple 3D
  workloads, e.g. glmark2.
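
- Host setup (assumptions based on patches 01 and 18, not verified
  commands): vGPU manager support in nvkm is gated by the support_vgpu_mgr
  module parameter, and the VFs backing the vGPUs are expected to be
  created through the standard SR-IOV sysfs interface wired up by
  pci_driver.sriov_configure(), e.g.:

  # modprobe nvkm support_vgpu_mgr=1
  # echo 1 > /sys/bus/pci/devices/<PF BDF>/sriov_numvfs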

4. Demo
=======

A demo video can be found at: https://youtu.be/YwgIvvk-V94

[1] https://www.nvidia.com/en-us/data-center/virtual-solutions/
[2] https://docs.nvidia.com/vgpu/17.0/grid-vgpu-user-guide/index.html#architecture-grid-vgpu
[3] https://lore.kernel.org/dri-devel/20240613170211.88779-1-bskeggs@nvidia.com/T/
[4] https://us.download.nvidia.com/XFree86/Linux-x86_64/535.154.05/NVIDIA-Linux-x86_64-535.154.05.run

Zhi Wang (29):
  nvkm/vgpu: introduce NVIDIA vGPU support prelude
  nvkm/vgpu: attach to nvkm as a nvkm client
  nvkm/vgpu: reserve a larger GSP heap when NVIDIA vGPU is enabled
  nvkm/vgpu: set the VF partition count when NVIDIA vGPU is enabled
  nvkm/vgpu: populate GSP_VF_INFO when NVIDIA vGPU is enabled
  nvkm/vgpu: set RMSetSriovMode when NVIDIA vGPU is enabled
  nvkm/gsp: add a notify handler for GSP event
    GPUACCT_PERFMON_UTIL_SAMPLES
  nvkm/vgpu: get the size of the VMMU segment from GSP firmware
  nvkm/vgpu: introduce the reserved channel allocator
  nvkm/vgpu: introduce interfaces for NVIDIA vGPU VFIO module
  nvkm/vgpu: introduce GSP RM client alloc and free for vGPU
  nvkm/vgpu: introduce GSP RM control interface for vGPU
  nvkm: move chid.h to nvkm/engine.
  nvkm/vgpu: introduce channel allocation for vGPU
  nvkm/vgpu: introduce FB memory allocation for vGPU
  nvkm/vgpu: introduce BAR1 map routines for vGPUs
  nvkm/vgpu: introduce engine bitmap for vGPU
  nvkm/vgpu: introduce pci_driver.sriov_configure() in nvkm
  vfio/vgpu_mgr: introduce vGPU lifecycle management prelude
  vfio/vgpu_mgr: allocate GSP RM client for NVIDIA vGPU manager
  vfio/vgpu_mgr: introduce vGPU type uploading
  vfio/vgpu_mgr: allocate vGPU FB memory when creating vGPUs
  vfio/vgpu_mgr: allocate vGPU channels when creating vGPUs
  vfio/vgpu_mgr: allocate mgmt heap when creating vGPUs
  vfio/vgpu_mgr: map mgmt heap when creating a vGPU
  vfio/vgpu_mgr: allocate GSP RM client when creating vGPUs
  vfio/vgpu_mgr: bootload the new vGPU
  vfio/vgpu_mgr: introduce vGPU host RPC channel
  vfio/vgpu_mgr: introduce NVIDIA vGPU VFIO variant driver

 .../drm/nouveau/include/nvkm/core/device.h    |   3 +
 .../drm/nouveau/include/nvkm/engine/chid.h    |  29 +
 .../gpu/drm/nouveau/include/nvkm/subdev/gsp.h |   1 +
 .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  45 ++
 .../nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h    |  12 +
 drivers/gpu/drm/nouveau/nvkm/Kbuild           |   1 +
 drivers/gpu/drm/nouveau/nvkm/device/pci.c     |  33 +-
 .../gpu/drm/nouveau/nvkm/engine/fifo/chid.c   |  49 +-
 .../gpu/drm/nouveau/nvkm/engine/fifo/chid.h   |  26 +-
 .../gpu/drm/nouveau/nvkm/engine/fifo/r535.c   |   3 +
 .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    |  14 +-
 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild  |   3 +
 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c  | 302 +++++++++++
 .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c  | 234 ++++++++
 drivers/vfio/pci/Kconfig                      |   2 +
 drivers/vfio/pci/Makefile                     |   2 +
 drivers/vfio/pci/nvidia-vgpu/Kconfig          |  13 +
 drivers/vfio/pci/nvidia-vgpu/Makefile         |   8 +
 drivers/vfio/pci/nvidia-vgpu/debug.h          |  18 +
 .../nvidia/inc/ctrl/ctrl0000/ctrl0000system.h |  30 +
 .../nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h    |  33 ++
 .../ctrl/ctrl2080/ctrl2080vgpumgrinternal.h   | 152 ++++++
 .../common/sdk/nvidia/inc/ctrl/ctrla081.h     | 109 ++++
 .../nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h | 213 ++++++++
 .../common/sdk/nvidia/inc/nv_vgpu_types.h     |  51 ++
 .../common/sdk/vmioplugin/inc/vmioplugin.h    |  26 +
 .../pci/nvidia-vgpu/include/nvrm/nvtypes.h    |  24 +
 drivers/vfio/pci/nvidia-vgpu/nvkm.h           |  94 ++++
 drivers/vfio/pci/nvidia-vgpu/rpc.c            | 242 +++++++++
 drivers/vfio/pci/nvidia-vgpu/vfio.h           |  43 ++
 drivers/vfio/pci/nvidia-vgpu/vfio_access.c    | 297 ++++++++++
 drivers/vfio/pci/nvidia-vgpu/vfio_main.c      | 511 ++++++++++++++++++
 drivers/vfio/pci/nvidia-vgpu/vgpu.c           | 352 ++++++++++++
 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c       | 144 +++++
 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h       |  89 +++
 drivers/vfio/pci/nvidia-vgpu/vgpu_types.c     | 466 ++++++++++++++++
 include/drm/nvkm_vgpu_mgr_vfio.h              |  61 +++
 37 files changed, 3702 insertions(+), 33 deletions(-)
 create mode 100644 drivers/gpu/drm/nouveau/include/nvkm/engine/chid.h
 create mode 100644 drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/Kconfig
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/Makefile
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/debug.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl0000/ctrl0000system.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrla081.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/nv_vgpu_types.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/vmioplugin/inc/vmioplugin.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/nvtypes.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/nvkm.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/rpc.c
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/vfio.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/vfio_access.c
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/vfio_main.c
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/vgpu.c
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/vgpu_types.c
 create mode 100644 include/drm/nvkm_vgpu_mgr_vfio.h

-- 
2.34.1


* [RFC 01/29] nvkm/vgpu: introduce NVIDIA vGPU support prelude
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-26  9:20   ` Greg KH
  2024-09-22 12:49 ` [RFC 02/29] nvkm/vgpu: attach to nvkm as a nvkm client Zhi Wang
                   ` (30 subsequent siblings)
  31 siblings, 1 reply; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

NVIDIA GPU virtualization is a technology that allows multiple virtual
machines (VMs) to share the power of a single GPU, enabling greater
flexibility, efficiency, and cost-effectiveness in data centers and cloud
environments.

The first step of supporting NVIDIA vGPU in nvkm is to introduce the
necessary vGPU data structures and functions to hook into the
(de)initialization path of nvkm.

Introduce NVIDIA vGPU data structures and functions hooking into the
(de)initialization path of nvkm to support the following patches.

Cc: Neo Jia <cjia@nvidia.com>
Cc: Surath Mitra <smitra@nvidia.com>
Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 .../drm/nouveau/include/nvkm/core/device.h    |  3 +
 .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  | 17 +++++
 drivers/gpu/drm/nouveau/nvkm/Kbuild           |  1 +
 drivers/gpu/drm/nouveau/nvkm/device/pci.c     | 19 +++--
 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild  |  2 +
 .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c  | 76 +++++++++++++++++++
 6 files changed, 112 insertions(+), 6 deletions(-)
 create mode 100644 drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/core/device.h b/drivers/gpu/drm/nouveau/include/nvkm/core/device.h
index fef8ca74968d..497c52f51593 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/core/device.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/core/device.h
@@ -3,6 +3,7 @@
 #define __NVKM_DEVICE_H__
 #include <core/oclass.h>
 #include <core/intr.h>
+#include <vgpu_mgr/vgpu_mgr.h>
 enum nvkm_subdev_type;
 
 #include <linux/auxiliary_bus.h>
@@ -80,6 +81,8 @@ struct nvkm_device {
 		bool legacy_done;
 	} intr;
 
+	struct nvkm_vgpu_mgr vgpu_mgr;
+
 	struct auxiliary_device auxdev;
 	const struct nvif_driver_func *driver;
 };
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
new file mode 100644
index 000000000000..3163fff1085b
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: MIT */
+#ifndef __NVKM_VGPU_MGR_H__
+#define __NVKM_VGPU_MGR_H__
+
+#define NVIDIA_MAX_VGPUS 2
+
+struct nvkm_vgpu_mgr {
+	bool enabled;
+	struct nvkm_device *nvkm_dev;
+};
+
+bool nvkm_vgpu_mgr_is_supported(struct nvkm_device *device);
+bool nvkm_vgpu_mgr_is_enabled(struct nvkm_device *device);
+int nvkm_vgpu_mgr_init(struct nvkm_device *device);
+void nvkm_vgpu_mgr_fini(struct nvkm_device *device);
+
+#endif
diff --git a/drivers/gpu/drm/nouveau/nvkm/Kbuild b/drivers/gpu/drm/nouveau/nvkm/Kbuild
index 9e1a6ab937e1..d310467487c1 100644
--- a/drivers/gpu/drm/nouveau/nvkm/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/Kbuild
@@ -8,3 +8,4 @@ include $(src)/nvkm/device/Kbuild
 include $(src)/nvkm/falcon/Kbuild
 include $(src)/nvkm/subdev/Kbuild
 include $(src)/nvkm/engine/Kbuild
+include $(src)/nvkm/vgpu_mgr/Kbuild
diff --git a/drivers/gpu/drm/nouveau/nvkm/device/pci.c b/drivers/gpu/drm/nouveau/nvkm/device/pci.c
index b8d2125a9f59..1543902b20e9 100644
--- a/drivers/gpu/drm/nouveau/nvkm/device/pci.c
+++ b/drivers/gpu/drm/nouveau/nvkm/device/pci.c
@@ -1688,6 +1688,9 @@ nvkm_device_pci_remove(struct pci_dev *dev)
 {
 	struct nvkm_device *device = pci_get_drvdata(dev);
 
+	if (nvkm_vgpu_mgr_is_enabled(device))
+		nvkm_vgpu_mgr_fini(device);
+
 	if (device->runpm) {
 		pm_runtime_get_sync(device->dev);
 		pm_runtime_forbid(device->dev);
@@ -1835,12 +1838,6 @@ nvkm_device_pci_probe(struct pci_dev *pci_dev, const struct pci_device_id *id)
 	}
 
 	quirk_broken_nv_runpm(pdev);
-done:
-	if (ret) {
-		nvkm_device_del(&device);
-		return ret;
-	}
-
 	pci_set_drvdata(pci_dev, &pdev->device);
 
 	if (nvkm_runpm) {
@@ -1852,12 +1849,22 @@ nvkm_device_pci_probe(struct pci_dev *pci_dev, const struct pci_device_id *id)
 		}
 	}
 
+	if (nvkm_vgpu_mgr_is_supported(device)) {
+		ret = nvkm_vgpu_mgr_init(&pdev->device);
+		if (ret)
+			goto done;
+	}
+
 	if (device->runpm) {
 		pm_runtime_allow(device->dev);
 		pm_runtime_put(device->dev);
 	}
 
 	return 0;
+
+done:
+	nvkm_device_del(&device);
+	return ret;
 }
 
 static struct pci_device_id
diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild
new file mode 100644
index 000000000000..244e967d4edc
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: MIT
+nvkm-y += nvkm/vgpu_mgr/vgpu_mgr.o
diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
new file mode 100644
index 000000000000..a506414e5ba2
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
@@ -0,0 +1,76 @@
+/* SPDX-License-Identifier: MIT */
+#include <core/device.h>
+#include <core/pci.h>
+#include <vgpu_mgr/vgpu_mgr.h>
+
+static bool support_vgpu_mgr = false;
+module_param_named(support_vgpu_mgr, support_vgpu_mgr, bool, 0400);
+
+static inline struct pci_dev *nvkm_to_pdev(struct nvkm_device *device)
+{
+	struct nvkm_device_pci *pci = container_of(device, typeof(*pci),
+						   device);
+
+	return pci->pdev;
+}
+
+/**
+ * nvkm_vgpu_mgr_is_supported - check if a platform supports vGPU
+ * @device: the nvkm_device pointer
+ *
+ * Returns: true on a supported platform, i.e. Ada Lovelace with
+ * SR-IOV support.
+ */
+bool nvkm_vgpu_mgr_is_supported(struct nvkm_device *device)
+{
+	struct pci_dev *pdev = nvkm_to_pdev(device);
+
+	if (!support_vgpu_mgr)
+		return false;
+
+	return device->card_type == AD100 &&  pci_sriov_get_totalvfs(pdev);
+}
+
+/**
+ * nvkm_vgpu_mgr_is_enabled - check if vGPU support is enabled on a PF
+ * @device: the nvkm_device pointer
+ *
+ * Returns: true if vGPU enabled.
+ */
+bool nvkm_vgpu_mgr_is_enabled(struct nvkm_device *device)
+{
+	return device->vgpu_mgr.enabled;
+}
+
+/**
+ * nvkm_vgpu_mgr_init - Initialize the vGPU manager support
+ * @device: the nvkm_device pointer
+ *
+ * Returns: 0 on success, -ENODEV on platforms that are not supported.
+ */
+int nvkm_vgpu_mgr_init(struct nvkm_device *device)
+{
+	struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr;
+
+	if (!nvkm_vgpu_mgr_is_supported(device))
+		return -ENODEV;
+
+	vgpu_mgr->nvkm_dev = device;
+	vgpu_mgr->enabled = true;
+
+	pci_info(nvkm_to_pdev(device),
+		 "NVIDIA vGPU mananger support is enabled.\n");
+
+	return 0;
+}
+
+/**
+ * nvkm_vgpu_mgr_fini - De-initialize the vGPU manager support
+ * @device: the nvkm_device pointer
+ */
+void nvkm_vgpu_mgr_fini(struct nvkm_device *device)
+{
+	struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr;
+
+	vgpu_mgr->enabled = false;
+}
-- 
2.34.1


* [RFC 02/29] nvkm/vgpu: attach to nvkm as a nvkm client
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
  2024-09-22 12:49 ` [RFC 01/29] nvkm/vgpu: introduce NVIDIA vGPU support prelude Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-26  9:21   ` Greg KH
  2024-09-22 12:49 ` [RFC 03/29] nvkm/vgpu: reserve a larger GSP heap when NVIDIA vGPU is enabled Zhi Wang
                   ` (29 subsequent siblings)
  31 siblings, 1 reply; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

nvkm is a HW abstraction layer (HAL) that initializes the HW and allows
its clients to manipulate the GPU functions regardless of the GPU HW
generation. At the top layer, it provides generic APIs for a client to
connect to nvkm, enumerate the GPU functions, and manipulate the GPU HW.

To reach nvkm, the client needs to connect to nvkm layer by layer: the
driver layer, the client layer, and eventually the device layer, which
provides all the access routines to the GPU functions. After a client
attaches, nvkm initializes the HW and is able to serve the client.

Attach to nvkm as an nvkm client.

Cc: Neo Jia <cjia@nvidia.com>
Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  8 ++++
 .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c  | 48 ++++++++++++++++++-
 2 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
index 3163fff1085b..9e10e18306b0 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
@@ -7,6 +7,14 @@
 struct nvkm_vgpu_mgr {
 	bool enabled;
 	struct nvkm_device *nvkm_dev;
+
+	const struct nvif_driver *driver;
+
+	const struct nvif_client_impl *cli_impl;
+	struct nvif_client_priv *cli_priv;
+
+	const struct nvif_device_impl *dev_impl;
+	struct nvif_device_priv *dev_priv;
 };
 
 bool nvkm_vgpu_mgr_is_supported(struct nvkm_device *device);
diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
index a506414e5ba2..0639596f8a96 100644
--- a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
@@ -1,5 +1,7 @@
 /* SPDX-License-Identifier: MIT */
 #include <core/device.h>
+#include <core/driver.h>
+#include <nvif/driverif.h>
 #include <core/pci.h>
 #include <vgpu_mgr/vgpu_mgr.h>
 
@@ -42,6 +44,44 @@ bool nvkm_vgpu_mgr_is_enabled(struct nvkm_device *device)
 	return device->vgpu_mgr.enabled;
 }
 
+static void detach_nvkm(struct nvkm_vgpu_mgr *vgpu_mgr)
+{
+	if (vgpu_mgr->dev_impl) {
+		vgpu_mgr->dev_impl->del(vgpu_mgr->dev_priv);
+		vgpu_mgr->dev_impl = NULL;
+	}
+
+	if (vgpu_mgr->cli_impl) {
+		vgpu_mgr->cli_impl->del(vgpu_mgr->cli_priv);
+		vgpu_mgr->cli_impl = NULL;
+	}
+}
+
+static int attach_nvkm(struct nvkm_vgpu_mgr *vgpu_mgr)
+{
+	struct nvkm_device *device = vgpu_mgr->nvkm_dev;
+	int ret;
+
+	ret = nvkm_driver_ctor(device, &vgpu_mgr->driver,
+			       &vgpu_mgr->cli_impl, &vgpu_mgr->cli_priv);
+	if (ret)
+		return ret;
+
+	ret = vgpu_mgr->cli_impl->device.new(vgpu_mgr->cli_priv,
+					     &vgpu_mgr->dev_impl,
+					     &vgpu_mgr->dev_priv);
+	if (ret)
+		goto fail_device_new;
+
+	return 0;
+
+fail_device_new:
+	vgpu_mgr->cli_impl->del(vgpu_mgr->cli_priv);
+	vgpu_mgr->cli_impl = NULL;
+
+	return ret;
+}
+
 /**
  * nvkm_vgpu_mgr_init - Initialize the vGPU manager support
  * @device: the nvkm_device pointer
@@ -51,13 +91,18 @@ bool nvkm_vgpu_mgr_is_enabled(struct nvkm_device *device)
 int nvkm_vgpu_mgr_init(struct nvkm_device *device)
 {
 	struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr;
+	int ret;
 
 	if (!nvkm_vgpu_mgr_is_supported(device))
 		return -ENODEV;
 
 	vgpu_mgr->nvkm_dev = device;
-	vgpu_mgr->enabled = true;
 
+	ret = attach_nvkm(vgpu_mgr);
+	if (ret)
+		return ret;
+
+	vgpu_mgr->enabled = true;
 	pci_info(nvkm_to_pdev(device),
 		 "NVIDIA vGPU mananger support is enabled.\n");
 
@@ -72,5 +117,6 @@ void nvkm_vgpu_mgr_fini(struct nvkm_device *device)
 {
 	struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr;
 
+	detach_nvkm(vgpu_mgr);
 	vgpu_mgr->enabled = false;
 }
-- 
2.34.1


* [RFC 03/29] nvkm/vgpu: reserve a larger GSP heap when NVIDIA vGPU is enabled
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
  2024-09-22 12:49 ` [RFC 01/29] nvkm/vgpu: introduce NVIDIA vGPU support prelude Zhi Wang
  2024-09-22 12:49 ` [RFC 02/29] nvkm/vgpu: attach to nvkm as a nvkm client Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 04/29] nvkm/vgpu: set the VF partition count " Zhi Wang
                   ` (28 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

A larger GSP heap is required when enabling NVIDIA vGPU.

Cc: Neo Jia <cjia@nvidia.com>
Cc: Surath Mitra <smitra@nvidia.com>
Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index 18bae367b194..a38a6abcac6f 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -2638,7 +2638,10 @@ r535_gsp_oneinit(struct nvkm_gsp *gsp)
 	gsp->fb.wpr2.elf.size = gsp->fw.len;
 	gsp->fb.wpr2.elf.addr = ALIGN_DOWN(gsp->fb.wpr2.boot.addr - gsp->fb.wpr2.elf.size, 0x10000);
 
-	{
+	if (nvkm_vgpu_mgr_is_supported(device)) {
+		gsp->fb.wpr2.heap.size = SZ_256M;
+	} else {
+
 		u32 fb_size_gb = DIV_ROUND_UP_ULL(gsp->fb.size, 1 << 30);
 
 		gsp->fb.wpr2.heap.size =
-- 
2.34.1


* [RFC 04/29] nvkm/vgpu: set the VF partition count when NVIDIA vGPU is enabled
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (2 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 03/29] nvkm/vgpu: reserve a larger GSP heap when NVIDIA vGPU is enabled Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-26 22:51   ` Jason Gunthorpe
  2024-09-22 12:49 ` [RFC 05/29] nvkm/vgpu: populate GSP_VF_INFO " Zhi Wang
                   ` (27 subsequent siblings)
  31 siblings, 1 reply; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

GSP firmware needs to know the maximum number of supported vGPUs at
initialization time.

The VF partition count field in the GSP WPR2 is required to be set
according to the maximum number of supported vGPUs.

Set the VF partition count in the GSP WPR2 when nvkm loads the GSP
firmware and initializes the GSP WPR2, if vGPU is enabled.

Cc: Neo Jia <cjia@nvidia.com>
Cc: Surath Mitra <smitra@nvidia.com>
Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h | 1 +
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
index 3fbc57b16a05..f52143df45c1 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
@@ -61,6 +61,7 @@ struct nvkm_gsp {
 			} frts, boot, elf, heap;
 			u64 addr;
 			u64 size;
+			u8 vf_partition_count;
 		} wpr2;
 		struct {
 			u64 addr;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index a38a6abcac6f..14fc152d6859 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -2037,6 +2037,7 @@ r535_gsp_wpr_meta_init(struct nvkm_gsp *gsp)
 	meta->vgaWorkspaceOffset = gsp->fb.bios.vga_workspace.addr;
 	meta->vgaWorkspaceSize = gsp->fb.bios.vga_workspace.size;
 	meta->bootCount = 0;
+	meta->gspFwHeapVfPartitionCount = gsp->fb.wpr2.vf_partition_count;
 	meta->partitionRpcAddr = 0;
 	meta->partitionRpcRequestOffset = 0;
 	meta->partitionRpcReplyOffset = 0;
@@ -2640,6 +2641,7 @@ r535_gsp_oneinit(struct nvkm_gsp *gsp)
 
 	if (nvkm_vgpu_mgr_is_supported(device)) {
 		gsp->fb.wpr2.heap.size = SZ_256M;
+		gsp->fb.wpr2.vf_partition_count = NVIDIA_MAX_VGPUS;
 	} else {
 
 		u32 fb_size_gb = DIV_ROUND_UP_ULL(gsp->fb.size, 1 << 30);
-- 
2.34.1


* [RFC 05/29] nvkm/vgpu: populate GSP_VF_INFO when NVIDIA vGPU is enabled
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (3 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 04/29] nvkm/vgpu: set the VF partition count " Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-26 22:52   ` Jason Gunthorpe
  2024-09-22 12:49 ` [RFC 06/29] nvkm/vgpu: set RMSetSriovMode " Zhi Wang
                   ` (26 subsequent siblings)
  31 siblings, 1 reply; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

GSP firmware needs to know the VF BAR offsets to correctly calculate the
VF events.

The VF BAR information is stored in GSP_VF_INFO, which needs to be
initialized and uploaded together with the GSP_SYSTEM_INFO.

Populate GSP_VF_INFO when nvkm uploads the GSP_SYSTEM_INFO if NVIDIA
vGPU is enabled.

Cc: Surath Mitra <smitra@nvidia.com>
Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  2 +
 .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    |  3 ++
 .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c  | 50 +++++++++++++++++++
 3 files changed, 55 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
index 9e10e18306b0..6bc10fa40cde 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
@@ -21,5 +21,7 @@ bool nvkm_vgpu_mgr_is_supported(struct nvkm_device *device);
 bool nvkm_vgpu_mgr_is_enabled(struct nvkm_device *device);
 int nvkm_vgpu_mgr_init(struct nvkm_device *device);
 void nvkm_vgpu_mgr_fini(struct nvkm_device *device);
+void nvkm_vgpu_mgr_populate_gsp_vf_info(struct nvkm_device *device,
+					void *info);
 
 #endif
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index 14fc152d6859..49552d7df88f 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -1717,6 +1717,9 @@ r535_gsp_rpc_set_system_info(struct nvkm_gsp *gsp)
 	info->pciConfigMirrorSize = 0x001000;
 	r535_gsp_acpi_info(gsp, &info->acpiMethodData);
 
+	if (nvkm_vgpu_mgr_is_supported(device))
+		nvkm_vgpu_mgr_populate_gsp_vf_info(device, info);
+
 	return nvkm_gsp_rpc_wr(gsp, info, false);
 }
 
diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
index 0639596f8a96..d6ddb1f02275 100644
--- a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
@@ -3,6 +3,10 @@
 #include <core/driver.h>
 #include <nvif/driverif.h>
 #include <core/pci.h>
+
+#include <nvrm/nvtypes.h>
+#include <nvrm/535.113.01/nvidia/inc/kernel/gpu/gsp/gsp_static_config.h>
+
 #include <vgpu_mgr/vgpu_mgr.h>
 
 static bool support_vgpu_mgr = false;
@@ -120,3 +124,49 @@ void nvkm_vgpu_mgr_fini(struct nvkm_device *device)
 	detach_nvkm(vgpu_mgr);
 	vgpu_mgr->enabled = false;
 }
+
+/**
+ * nvkm_vgpu_mgr_populate_gsp_vf_info - populate GSP_VF_INFO when vGPU
+ * is enabled
+ * @device: the nvkm_device pointer
+ * @info: GSP_VF_INFO data structure
+ */
+void nvkm_vgpu_mgr_populate_gsp_vf_info(struct nvkm_device *device,
+					void *info)
+{
+	struct pci_dev *pdev = nvkm_to_pdev(device);
+	GspSystemInfo *gsp_info = info;
+	GSP_VF_INFO *vf_info = &gsp_info->gspVFInfo;
+	u32 lo, hi;
+	u16 v;
+	int pos;
+
+	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_SRIOV);
+
+	pci_read_config_word(pdev, pos + PCI_SRIOV_TOTAL_VF, &v);
+	vf_info->totalVFs = v;
+
+	pci_read_config_word(pdev, pos + PCI_SRIOV_VF_OFFSET, &v);
+	vf_info->firstVFOffset = v;
+
+	pci_read_config_dword(pdev, pos + PCI_SRIOV_BAR, &lo);
+	vf_info->FirstVFBar0Address = lo & 0xFFFFFFF0;
+
+	pci_read_config_dword(pdev, pos + PCI_SRIOV_BAR + 4, &lo);
+	pci_read_config_dword(pdev, pos + PCI_SRIOV_BAR + 8, &hi);
+
+	vf_info->FirstVFBar1Address = (((u64)hi) << 32) + (lo & 0xFFFFFFF0);
+
+	pci_read_config_dword(pdev, pos + PCI_SRIOV_BAR + 12, &lo);
+	pci_read_config_dword(pdev, pos + PCI_SRIOV_BAR + 16, &hi);
+
+	vf_info->FirstVFBar2Address = (((u64)hi) << 32) + (lo & 0xFFFFFFF0);
+
+#define IS_BAR_64(i) (((i) & 0x00000006) == 0x00000004)
+
+	v = nvkm_rd32(device, 0x88000 + 0xbf4);
+	vf_info->b64bitBar1 = IS_BAR_64(v);
+
+	v = nvkm_rd32(device, 0x88000 + 0xbfc);
+	vf_info->b64bitBar2 = IS_BAR_64(v);
+}
-- 
2.34.1


* [RFC 06/29] nvkm/vgpu: set RMSetSriovMode when NVIDIA vGPU is enabled
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (4 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 05/29] nvkm/vgpu: populate GSP_VF_INFO " Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-26 22:53   ` Jason Gunthorpe
  2024-09-22 12:49 ` [RFC 07/29] nvkm/gsp: add a notify handler for GSP event GPUACCT_PERFMON_UTIL_SAMPLES Zhi Wang
                   ` (25 subsequent siblings)
  31 siblings, 1 reply; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

The registry object "RMSetSriovMode" is required to be set when vGPU is
enabled.

Set "RMSetSriovMode" to 1 when nvkm is loading the GSP firmware and
initialize the GSP registry objects, if vGPU is enabled.

Cc: Neo Jia <cjia@nvidia.com>
Cc: Surath Mitra <smitra@nvidia.com>
Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index 49552d7df88f..a7db2a7880dd 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -1500,6 +1500,9 @@ r535_gsp_rpc_set_registry(struct nvkm_gsp *gsp)
 		kfree(p);
 	}
 
+	if (nvkm_vgpu_mgr_is_supported(gsp->subdev.device))
+		add_registry_num(gsp, "RMSetSriovMode", 1);
+
 	rpc = nvkm_gsp_rpc_get(gsp, NV_VGPU_MSG_FUNCTION_SET_REGISTRY, gsp->registry_rpc_size);
 	if (IS_ERR(rpc)) {
 		ret = PTR_ERR(rpc);
-- 
2.34.1


* [RFC 07/29] nvkm/gsp: add a notify handler for GSP event GPUACCT_PERFMON_UTIL_SAMPLES
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (5 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 06/29] nvkm/vgpu: set RMSetSriovMode " Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 08/29] nvkm/vgpu: get the size of the VMMU segment from GSP firmware Zhi Wang
                   ` (24 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

GSP firmware will periodically send the GPUACCT_PERFMON_UTIL_SAMPLES event
when vGPU is enabled.

nvkm keeps dumping the entire GSP message queue to dmesg when it receives
an unknown GSP message, which is too noisy.

Add a notify handler so that nvkm no longer dumps the noisy GSP message
queue for this known, but unhandled, GSP event.

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index a7db2a7880dd..3eea6ccb6bd2 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -2628,6 +2628,7 @@ r535_gsp_oneinit(struct nvkm_gsp *gsp)
 	r535_gsp_msg_ntfy_add(gsp, NV_VGPU_MSG_EVENT_PERF_BRIDGELESS_INFO_UPDATE, NULL, NULL);
 	r535_gsp_msg_ntfy_add(gsp, NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT, NULL, NULL);
 	r535_gsp_msg_ntfy_add(gsp, NV_VGPU_MSG_EVENT_GSP_SEND_USER_SHARED_DATA, NULL, NULL);
+	r535_gsp_msg_ntfy_add(gsp, NV_VGPU_MSG_EVENT_GPUACCT_PERFMON_UTIL_SAMPLES, NULL, NULL);
 	ret = r535_gsp_rm_boot_ctor(gsp);
 	if (ret)
 		return ret;
-- 
2.34.1


* [RFC 08/29] nvkm/vgpu: get the size of the VMMU segment from GSP firmware
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (6 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 07/29] nvkm/gsp: add a notify handler for GSP event GPUACCT_PERFMON_UTIL_SAMPLES Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 09/29] nvkm/vgpu: introduce the reserved channel allocator Zhi Wang
                   ` (23 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

The allocation of FBMEM for vGPUs is required to be aligned to the size of
the VMMU segment. Before reserving FBMEM for vGPUs, the VMMU segment size
must be known.

Send a GSP RM control call to get the VMMU segment size from the GSP
firmware during vGPU support initialization.
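
For illustration, an FB allocation for a vGPU would then be rounded up to
this granularity. A minimal sketch (the helper name is hypothetical; the
vmmu_segment_size field is the one cached by this patch, and the segment
sizes reported by GSP are powers of two):

	/* Round a vGPU FB allocation size up to the VMMU segment size. */
	static inline u64 vgpu_align_fb_size(struct nvkm_vgpu_mgr *vgpu_mgr,
					     u64 size)
	{
		return ALIGN(size, vgpu_mgr->vmmu_segment_size);
	}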

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  2 ++
 .../nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h    | 12 ++++++++
 .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c  | 30 +++++++++++++++++++
 3 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
index 6bc10fa40cde..aaba6d9a88b4 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
@@ -15,6 +15,8 @@ struct nvkm_vgpu_mgr {
 
 	const struct nvif_device_impl *dev_impl;
 	struct nvif_device_priv *dev_priv;
+
+	u64 vmmu_segment_size;
 };
 
 bool nvkm_vgpu_mgr_is_supported(struct nvkm_device *device);
diff --git a/drivers/gpu/drm/nouveau/include/nvrm/535.113.01/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h b/drivers/gpu/drm/nouveau/include/nvrm/535.113.01/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h
index 29d7a1052142..4d57d8664ee5 100644
--- a/drivers/gpu/drm/nouveau/include/nvrm/535.113.01/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h
+++ b/drivers/gpu/drm/nouveau/include/nvrm/535.113.01/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h
@@ -97,4 +97,16 @@ typedef struct NV2080_CTRL_GPU_GET_GID_INFO_PARAMS {
     NvU8  data[NV2080_GPU_MAX_GID_LENGTH];
 } NV2080_CTRL_GPU_GET_GID_INFO_PARAMS;
 
+#define NV2080_CTRL_CMD_GPU_GET_VMMU_SEGMENT_SIZE (0x2080017e)
+
+typedef struct NV2080_CTRL_GPU_GET_VMMU_SEGMENT_SIZE_PARAMS {
+	NV_DECLARE_ALIGNED(NvU64 vmmuSegmentSize, 8);
+} NV2080_CTRL_GPU_GET_VMMU_SEGMENT_SIZE_PARAMS;
+
+#define NV2080_CTRL_GPU_VMMU_SEGMENT_SIZE_32MB     0x02000000U
+#define NV2080_CTRL_GPU_VMMU_SEGMENT_SIZE_64MB     0x04000000U
+#define NV2080_CTRL_GPU_VMMU_SEGMENT_SIZE_128MB    0x08000000U
+#define NV2080_CTRL_GPU_VMMU_SEGMENT_SIZE_256MB    0x10000000U
+#define NV2080_CTRL_GPU_VMMU_SEGMENT_SIZE_512MB    0x20000000U
+
 #endif
diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
index d6ddb1f02275..d2ea5a07cbfc 100644
--- a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
@@ -4,6 +4,8 @@
 #include <nvif/driverif.h>
 #include <core/pci.h>
 
+#include <subdev/gsp.h>
+
 #include <nvrm/nvtypes.h>
 #include <nvrm/535.113.01/nvidia/inc/kernel/gpu/gsp/gsp_static_config.h>
 
@@ -86,6 +88,26 @@ static int attach_nvkm(struct nvkm_vgpu_mgr *vgpu_mgr)
 	return ret;
 }
 
+static int get_vmmu_segment_size(struct nvkm_vgpu_mgr *mgr)
+{
+	struct nvkm_device *device = mgr->nvkm_dev;
+	struct nvkm_gsp *gsp = device->gsp;
+	NV2080_CTRL_GPU_GET_VMMU_SEGMENT_SIZE_PARAMS *ctrl;
+
+	ctrl = nvkm_gsp_rm_ctrl_rd(&gsp->internal.device.subdevice,
+				    NV2080_CTRL_CMD_GPU_GET_VMMU_SEGMENT_SIZE,
+				    sizeof(*ctrl));
+	if (IS_ERR(ctrl))
+		return PTR_ERR(ctrl);
+
+	nvdev_debug(device, "VMMU segment size: %llx\n", ctrl->vmmuSegmentSize);
+
+	mgr->vmmu_segment_size = ctrl->vmmuSegmentSize;
+
+	nvkm_gsp_rm_ctrl_done(&gsp->internal.device.subdevice, ctrl);
+	return 0;
+}
+
 /**
  * nvkm_vgpu_mgr_init - Initialize the vGPU manager support
  * @device: the nvkm_device pointer
@@ -106,11 +128,19 @@ int nvkm_vgpu_mgr_init(struct nvkm_device *device)
 	if (ret)
 		return ret;
 
+	ret = get_vmmu_segment_size(vgpu_mgr);
+	if (ret)
+		goto err_get_vmmu_seg_size;
+
 	vgpu_mgr->enabled = true;
 	pci_info(nvkm_to_pdev(device),
 		 "NVIDIA vGPU mananger support is enabled.\n");
 
 	return 0;
+
+err_get_vmmu_seg_size:
+	detach_nvkm(vgpu_mgr);
+	return ret;
 }
 
 /**
-- 
2.34.1


* [RFC 09/29] nvkm/vgpu: introduce the reserved channel allocator
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (7 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 08/29] nvkm/vgpu: get the size of the VMMU segment from GSP firmware Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 10/29] nvkm/vgpu: introduce interfaces for NVIDIA vGPU VFIO module Zhi Wang
                   ` (22 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

Creating a vGPU requires a dedicated portion of the channels.

As nvkm manages all the channels, the vGPU host needs to reserve the
channels from nvkm when vGPU is enabled, and allocate the reserved
channels from the reserved channel pool when creating vGPUs.

Introduce a simple reserved channel allocator. Reserve 1536 channels for
vGPUs from nvkm and leave 512 CHIDs for nvkm when vGPU is enabled.
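
A minimal usage sketch of the allocator (the per-vGPU channel count of 512
below is illustrative only, not a value taken from this series):

	/* Carve a contiguous block of CHIDs out of the pool set aside by
	 * nvkm_chid_reserve(fifo->chid, 512, 1536). */
	int first = nvkm_chid_reserved_alloc(fifo->chid, 512);

	if (first < 0)
		return -ENOSPC;

	/* ... hand CHIDs [first, first + 512) to the vGPU ... */

	nvkm_chid_reserved_free(fifo->chid, first, 512);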

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 .../gpu/drm/nouveau/nvkm/engine/fifo/chid.c   | 49 ++++++++++++++++++-
 .../gpu/drm/nouveau/nvkm/engine/fifo/chid.h   |  4 ++
 .../gpu/drm/nouveau/nvkm/engine/fifo/r535.c   |  3 ++
 3 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chid.c b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chid.c
index 23944d95efd5..0328ee9386d4 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chid.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chid.c
@@ -89,13 +89,14 @@ nvkm_chid_new(const struct nvkm_event_func *func, struct nvkm_subdev *subdev,
 	struct nvkm_chid *chid;
 	int id;
 
-	if (!(chid = *pchid = kzalloc(struct_size(chid, used, nr), GFP_KERNEL)))
+	if (!(chid = *pchid = kzalloc(struct_size(chid, used, 2 * nr), GFP_KERNEL)))
 		return -ENOMEM;
 
 	kref_init(&chid->kref);
 	chid->nr = nr;
 	chid->mask = chid->nr - 1;
 	spin_lock_init(&chid->lock);
+	chid->reserved = chid->used + nr;
 
 	if (!(chid->data = kvzalloc(sizeof(*chid->data) * nr, GFP_KERNEL))) {
 		nvkm_chid_unref(pchid);
@@ -109,3 +110,49 @@ nvkm_chid_new(const struct nvkm_event_func *func, struct nvkm_subdev *subdev,
 
 	return nvkm_event_init(func, subdev, 1, nr, &chid->event);
 }
+
+void
+nvkm_chid_reserved_free(struct nvkm_chid *chid, int first, int count)
+{
+	int id;
+
+	for (id = first; id < first + count; id++)
+		__clear_bit(id, chid->reserved);
+}
+
+int
+nvkm_chid_reserved_alloc(struct nvkm_chid *chid, int count)
+{
+	int id, start, end;
+
+	start = end = 0;
+
+	while (start != chid->nr) {
+		start = find_next_zero_bit(chid->reserved, chid->nr, end);
+		end = find_next_bit(chid->reserved, chid->nr, start);
+
+		if (end - start >= count) {
+			for (id = start; id < start + count; id++)
+				__set_bit(id, chid->reserved);
+			return start;
+		}
+	}
+
+	return -1;
+}
+
+void
+nvkm_chid_reserve(struct nvkm_chid *chid, int first, int count)
+{
+	int id;
+
+	if (WARN_ON(first + count - 1 >= chid->nr))
+		return;
+
+	for (id = 0; id < first; id++)
+		__set_bit(id, chid->reserved);
+	for (id = first + count; id < chid->nr; id++)
+		__set_bit(id, chid->reserved);
+	for (id = first; id < first + count; id++)
+		__set_bit(id, chid->used);
+}
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chid.h b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chid.h
index 2a42efb18401..b9e507af6725 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chid.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chid.h
@@ -13,6 +13,7 @@ struct nvkm_chid {
 	void **data;
 
 	spinlock_t lock;
+	unsigned long *reserved;
 	unsigned long used[];
 };
 
@@ -22,4 +23,7 @@ struct nvkm_chid *nvkm_chid_ref(struct nvkm_chid *);
 void nvkm_chid_unref(struct nvkm_chid **);
 int nvkm_chid_get(struct nvkm_chid *, void *data);
 void nvkm_chid_put(struct nvkm_chid *, int id, spinlock_t *data_lock);
+int nvkm_chid_reserved_alloc(struct nvkm_chid *chid, int count);
+void nvkm_chid_reserved_free(struct nvkm_chid *chid, int first, int count);
+void nvkm_chid_reserve(struct nvkm_chid *chid, int first, int count);
 #endif
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/r535.c b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/r535.c
index 3454c7d29502..4c18dc1060fc 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/r535.c
@@ -548,6 +548,9 @@ r535_fifo_runl_ctor(struct nvkm_fifo *fifo)
 	    (ret = nvkm_chid_new(&nvkm_chan_event, subdev, chids, 0, chids, &fifo->chid)))
 		return ret;
 
+	if (nvkm_vgpu_mgr_is_supported(subdev->device))
+		nvkm_chid_reserve(fifo->chid, 512, 1536);
+
 	ctrl = nvkm_gsp_rm_ctrl_rd(&gsp->internal.device.subdevice,
 				   NV2080_CTRL_CMD_FIFO_GET_DEVICE_INFO_TABLE, sizeof(*ctrl));
 	if (WARN_ON(IS_ERR(ctrl)))
-- 
2.34.1


* [RFC 10/29] nvkm/vgpu: introduce interfaces for NVIDIA vGPU VFIO module
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (8 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 09/29] nvkm/vgpu: introduce the reserved channel allocator Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 11/29] nvkm/vgpu: introduce GSP RM client alloc and free for vGPU Zhi Wang
                   ` (21 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

The NVIDIA vGPU VFIO module requires interfaces from the core driver to
issue GSP RPCs, allocate FBMEM, create/destroy/reset vGPUs, etc.

Implement interfaces to expose the core driver functions to the NVIDIA
vGPU VFIO module.
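
A minimal usage sketch from the VFIO module side (error handling elided;
'handle' is the nvkm device handle and 'vfio_priv' is a placeholder for the
VFIO module's private data):

	struct nvidia_vgpu_vfio_handle_data data = { .priv = vfio_priv };
	struct nvkm_vgpu_mgr_vfio_ops *ops;
	int ret;

	ops = nvkm_vgpu_mgr_get_vfio_ops(handle);
	if (!ops->vgpu_mgr_is_enabled(handle))
		return -ENODEV;

	/* Register the VFIO side with the core driver... */
	ret = ops->attach_handle(handle, &data);

	/* ...and unregister it on teardown. */
	ops->detach_handle(handle);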

Cc: Neo Jia <cjia@nvidia.com>
Cc: Surath Mitra <smitra@nvidia.com>
Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  6 ++
 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild  |  1 +
 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c  | 69 +++++++++++++++++++
 .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c  |  5 ++
 include/drm/nvkm_vgpu_mgr_vfio.h              | 24 +++++++
 5 files changed, 105 insertions(+)
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
 create mode 100644 include/drm/nvkm_vgpu_mgr_vfio.h

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
index aaba6d9a88b4..5a856fa905f9 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
@@ -2,6 +2,8 @@
 #ifndef __NVKM_VGPU_MGR_H__
 #define __NVKM_VGPU_MGR_H__
 
+#include <drm/nvkm_vgpu_mgr_vfio.h>
+
 #define NVIDIA_MAX_VGPUS 2
 
 struct nvkm_vgpu_mgr {
@@ -17,6 +19,9 @@ struct nvkm_vgpu_mgr {
 	struct nvif_device_priv *dev_priv;
 
 	u64 vmmu_segment_size;
+
+	void *vfio_ops;
+	struct nvidia_vgpu_vfio_handle_data vfio_handle_data;
 };
 
 bool nvkm_vgpu_mgr_is_supported(struct nvkm_device *device);
@@ -25,5 +30,6 @@ int nvkm_vgpu_mgr_init(struct nvkm_device *device);
 void nvkm_vgpu_mgr_fini(struct nvkm_device *device);
 void nvkm_vgpu_mgr_populate_gsp_vf_info(struct nvkm_device *device,
 					void *info);
+void nvkm_vgpu_mgr_init_vfio_ops(struct nvkm_vgpu_mgr *vgpu_mgr);
 
 #endif
diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild
index 244e967d4edc..a62c10cb1446 100644
--- a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild
@@ -1,2 +1,3 @@
 # SPDX-License-Identifier: MIT
 nvkm-y += nvkm/vgpu_mgr/vgpu_mgr.o
+nvkm-y += nvkm/vgpu_mgr/vfio.o
diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
new file mode 100644
index 000000000000..e98c9e83ee60
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
@@ -0,0 +1,69 @@
+/* SPDX-License-Identifier: MIT */
+
+#include <core/device.h>
+
+#include <vgpu_mgr/vgpu_mgr.h>
+#include <drm/nvkm_vgpu_mgr_vfio.h>
+
+static bool vgpu_mgr_is_enabled(void *handle)
+{
+	struct nvkm_device *device = handle;
+
+	return nvkm_vgpu_mgr_is_enabled(device);
+}
+
+static void get_handle(void *handle,
+		       struct nvidia_vgpu_vfio_handle_data *data)
+{
+	struct nvkm_device *device = handle;
+	struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr;
+
+	if (vgpu_mgr->vfio_handle_data.priv)
+		memcpy(data, &vgpu_mgr->vfio_handle_data, sizeof(*data));
+}
+
+static void detach_handle(void *handle)
+{
+	struct nvkm_device *device = handle;
+	struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr;
+
+	vgpu_mgr->vfio_handle_data.priv = NULL;
+}
+
+static int attach_handle(void *handle,
+			 struct nvidia_vgpu_vfio_handle_data *data)
+{
+	struct nvkm_device *device = handle;
+	struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr;
+
+	if (vgpu_mgr->vfio_handle_data.priv)
+		return -EEXIST;
+
+	memcpy(&vgpu_mgr->vfio_handle_data, data, sizeof(*data));
+	return 0;
+}
+
+struct nvkm_vgpu_mgr_vfio_ops nvkm_vgpu_mgr_vfio_ops = {
+	.vgpu_mgr_is_enabled = vgpu_mgr_is_enabled,
+	.get_handle = get_handle,
+	.attach_handle = attach_handle,
+	.detach_handle = detach_handle,
+};
+
+/**
+ * nvkm_vgpu_mgr_init_vfio_ops - init the callbacks for VFIO
+ * @vgpu_mgr: the nvkm vGPU manager
+ */
+void nvkm_vgpu_mgr_init_vfio_ops(struct nvkm_vgpu_mgr *vgpu_mgr)
+{
+	vgpu_mgr->vfio_ops = &nvkm_vgpu_mgr_vfio_ops;
+}
+
+struct nvkm_vgpu_mgr_vfio_ops *nvkm_vgpu_mgr_get_vfio_ops(void *handle)
+{
+	struct nvkm_device *device = handle;
+	struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr;
+
+	return vgpu_mgr->vfio_ops;
+}
+EXPORT_SYMBOL(nvkm_vgpu_mgr_get_vfio_ops);
diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
index d2ea5a07cbfc..caeb805cf1c3 100644
--- a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
@@ -1,4 +1,7 @@
 /* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2024 NVIDIA Corporation
+ */
 #include <core/device.h>
 #include <core/driver.h>
 #include <nvif/driverif.h>
@@ -132,6 +135,8 @@ int nvkm_vgpu_mgr_init(struct nvkm_device *device)
 	if (ret)
 		goto err_get_vmmu_seg_size;
 
+	nvkm_vgpu_mgr_init_vfio_ops(vgpu_mgr);
+
 	vgpu_mgr->enabled = true;
 	pci_info(nvkm_to_pdev(device),
 		 "NVIDIA vGPU mananger support is enabled.\n");
diff --git a/include/drm/nvkm_vgpu_mgr_vfio.h b/include/drm/nvkm_vgpu_mgr_vfio.h
new file mode 100644
index 000000000000..09ecc3dc454f
--- /dev/null
+++ b/include/drm/nvkm_vgpu_mgr_vfio.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2024 NVIDIA Corporation
+ */
+
+#ifndef __NVKM_VGPU_MGR_VFIO_H__
+#define __NVKM_VGPU_MGR_VFIO_H__
+
+struct nvidia_vgpu_vfio_handle_data {
+	void *priv;
+};
+
+struct nvkm_vgpu_mgr_vfio_ops {
+	bool (*vgpu_mgr_is_enabled)(void *handle);
+	void (*get_handle)(void *handle,
+		           struct nvidia_vgpu_vfio_handle_data *data);
+	int (*attach_handle)(void *handle,
+		             struct nvidia_vgpu_vfio_handle_data *data);
+	void (*detach_handle)(void *handle);
+};
+
+struct nvkm_vgpu_mgr_vfio_ops *nvkm_vgpu_mgr_get_vfio_ops(void *handle);
+
+#endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 11/29] nvkm/vgpu: introduce GSP RM client alloc and free for vGPU
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (9 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 10/29] nvkm/vgpu: introduce interfaces for NVIDIA vGPU VFIO module Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 12/29] nvkm/vgpu: introduce GSP RM control interface " Zhi Wang
                   ` (20 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

To talk to the GSP firmware, the first step is allocating a GSP RM client.

The NVIDIA vGPU VFIO module requires a GSP RM client to obtain system
information and to create and configure vGPUs.

Implement the GSP RM client allocation and free routines.
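
Below is an illustrative sketch (not part of this patch) of how the
NVIDIA vGPU VFIO module is expected to consume the two callbacks through
the VFIO ops table. The "handle" variable stands for the PF drvdata (the
struct nvkm_device) and is an assumption of the example:

  struct nvkm_vgpu_mgr_vfio_ops *ops = nvkm_vgpu_mgr_get_vfio_ops(handle);
  struct nvidia_vgpu_gsp_client client = {};
  int ret;

  /* allocate an RmClient/RmDevice pair backed by nvkm GSP objects */
  ret = ops->alloc_gsp_client(handle, &client);
  if (ret)
          return ret;

  /* ... issue GSP RM controls on behalf of this client ... */

  /* tear the RmDevice and RmClient down again */
  ops->free_gsp_client(&client);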

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c | 58 ++++++++++++++++++++
 include/drm/nvkm_vgpu_mgr_vfio.h             | 10 ++++
 2 files changed, 68 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
index e98c9e83ee60..a0b4be2e1085 100644
--- a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: MIT */
 
 #include <core/device.h>
+#include <subdev/gsp.h>
 
 #include <vgpu_mgr/vgpu_mgr.h>
 #include <drm/nvkm_vgpu_mgr_vfio.h>
@@ -43,11 +44,68 @@ static int attach_handle(void *handle,
 	return 0;
 }
 
+static int alloc_gsp_client(void *handle,
+			    struct nvidia_vgpu_gsp_client *client)
+{
+	struct nvkm_device *device = handle;
+	struct nvkm_gsp *gsp = device->gsp;
+	int ret = -ENOMEM;
+
+	client->gsp_device = kzalloc(sizeof(struct nvkm_gsp_device),
+				     GFP_KERNEL);
+	if (!client->gsp_device)
+		return ret;
+
+	client->gsp_client = kzalloc(sizeof(struct nvkm_gsp_client),
+				     GFP_KERNEL);
+	if (!client->gsp_client)
+		goto fail_alloc_client;
+
+	ret = nvkm_gsp_client_device_ctor(gsp, client->gsp_client,
+					  client->gsp_device);
+	if (ret)
+		goto fail_client_device_ctor;
+
+	return 0;
+
+fail_client_device_ctor:
+	kfree(client->gsp_client);
+	client->gsp_client = NULL;
+
+fail_alloc_client:
+	kfree(client->gsp_device);
+	client->gsp_device = NULL;
+
+	return ret;
+}
+
+static void free_gsp_client(struct nvidia_vgpu_gsp_client *client)
+{
+	nvkm_gsp_device_dtor(client->gsp_device);
+	nvkm_gsp_client_dtor(client->gsp_client);
+
+	kfree(client->gsp_device);
+	client->gsp_device = NULL;
+
+	kfree(client->gsp_client);
+	client->gsp_client = NULL;
+}
+
+static u32 get_gsp_client_handle(struct nvidia_vgpu_gsp_client *client)
+{
+	struct nvkm_gsp_client *c = client->gsp_client;
+
+	return c->object.handle;
+}
+
 struct nvkm_vgpu_mgr_vfio_ops nvkm_vgpu_mgr_vfio_ops = {
 	.vgpu_mgr_is_enabled = vgpu_mgr_is_enabled,
 	.get_handle = get_handle,
 	.attach_handle = attach_handle,
 	.detach_handle = detach_handle,
+	.alloc_gsp_client = alloc_gsp_client,
+	.free_gsp_client = free_gsp_client,
+	.get_gsp_client_handle = get_gsp_client_handle,
 };
 
 /**
diff --git a/include/drm/nvkm_vgpu_mgr_vfio.h b/include/drm/nvkm_vgpu_mgr_vfio.h
index 09ecc3dc454f..79920cc27055 100644
--- a/include/drm/nvkm_vgpu_mgr_vfio.h
+++ b/include/drm/nvkm_vgpu_mgr_vfio.h
@@ -10,6 +10,12 @@ struct nvidia_vgpu_vfio_handle_data {
 	void *priv;
 };
 
+/* A combo of handles of RmClient and RmDevice */
+struct nvidia_vgpu_gsp_client {
+	void *gsp_client;
+	void *gsp_device;
+};
+
 struct nvkm_vgpu_mgr_vfio_ops {
 	bool (*vgpu_mgr_is_enabled)(void *handle);
 	void (*get_handle)(void *handle,
@@ -17,6 +23,10 @@ struct nvkm_vgpu_mgr_vfio_ops {
 	int (*attach_handle)(void *handle,
 		             struct nvidia_vgpu_vfio_handle_data *data);
 	void (*detach_handle)(void *handle);
+	int (*alloc_gsp_client)(void *handle,
+				struct nvidia_vgpu_gsp_client *client);
+	void (*free_gsp_client)(struct nvidia_vgpu_gsp_client *client);
+	u32 (*get_gsp_client_handle)(struct nvidia_vgpu_gsp_client *client);
 };
 
 struct nvkm_vgpu_mgr_vfio_ops *nvkm_vgpu_mgr_get_vfio_ops(void *handle);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 12/29] nvkm/vgpu: introduce GSP RM control interface for vGPU
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (10 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 11/29] nvkm/vgpu: introduce GSP RM client alloc and free for vGPU Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 13/29] nvkm: move chid.h to nvkm/engine Zhi Wang
                   ` (19 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

To talk to the GSP firmware, the first step is allocating a GSP RM client.
The second step is issuing GSP RM controls to access the functions
provided by GSP firmware.

The NVIDIA vGPU VFIO module requires a GSP RM control interface to obtain
system information and to create and configure vGPUs.

Implement the GSP RM control interface based on nvkm routines.
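
As an illustration only (not part of this patch), a caller in the NVIDIA
vGPU VFIO module would drive a GSP RM control roughly as below.
EXAMPLE_CMD and struct example_params are placeholders for a real control
ID and its parameter layout, and the buffer lifetime is assumed to follow
the underlying nvkm helpers (IS_ERR() on failure, rm_ctrl_wr() consumes
the buffer, rm_ctrl_done() releases a read buffer):

  struct example_params *ctrl;
  int ret;

  /* get a parameter buffer for the control */
  ctrl = ops->rm_ctrl_get(&client, EXAMPLE_CMD, sizeof(*ctrl));
  if (IS_ERR(ctrl))
          return PTR_ERR(ctrl);

  ctrl->some_field = some_value;        /* fill in the request */

  /* issue the control; the buffer is consumed by the write */
  ret = ops->rm_ctrl_wr(&client, ctrl);
  if (ret)
          return ret;

  /* read-style controls return a result buffer that must be released */
  ctrl = ops->rm_ctrl_rd(&client, EXAMPLE_CMD, sizeof(*ctrl));
  if (IS_ERR(ctrl))
          return PTR_ERR(ctrl);
  /* ... consume the returned fields ... */
  ops->rm_ctrl_done(&client, ctrl);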

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c | 34 ++++++++++++++++++++
 include/drm/nvkm_vgpu_mgr_vfio.h             |  8 +++++
 2 files changed, 42 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
index a0b4be2e1085..9732e43a5d6b 100644
--- a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
@@ -98,6 +98,36 @@ static u32 get_gsp_client_handle(struct nvidia_vgpu_gsp_client *client)
 	return c->object.handle;
 }
 
+static void *rm_ctrl_get(struct nvidia_vgpu_gsp_client *client, u32 cmd,
+			 u32 size)
+{
+	struct nvkm_gsp_device *device = client->gsp_device;
+
+	return nvkm_gsp_rm_ctrl_get(&device->subdevice, cmd, size);
+}
+
+static int rm_ctrl_wr(struct nvidia_vgpu_gsp_client *client, void *ctrl)
+{
+	struct nvkm_gsp_device *device = client->gsp_device;
+
+	return nvkm_gsp_rm_ctrl_wr(&device->subdevice, ctrl);
+}
+
+static void *rm_ctrl_rd(struct nvidia_vgpu_gsp_client *client, u32 cmd,
+			u32 size)
+{
+	struct nvkm_gsp_device *device = client->gsp_device;
+
+	return nvkm_gsp_rm_ctrl_rd(&device->subdevice, cmd, size);
+}
+
+static void rm_ctrl_done(struct nvidia_vgpu_gsp_client *client, void *ctrl)
+{
+	struct nvkm_gsp_device *device = client->gsp_device;
+
+	nvkm_gsp_rm_ctrl_done(&device->subdevice, ctrl);
+}
+
 struct nvkm_vgpu_mgr_vfio_ops nvkm_vgpu_mgr_vfio_ops = {
 	.vgpu_mgr_is_enabled = vgpu_mgr_is_enabled,
 	.get_handle = get_handle,
@@ -106,6 +136,10 @@ struct nvkm_vgpu_mgr_vfio_ops nvkm_vgpu_mgr_vfio_ops = {
 	.alloc_gsp_client = alloc_gsp_client,
 	.free_gsp_client = free_gsp_client,
 	.get_gsp_client_handle = get_gsp_client_handle,
+	.rm_ctrl_get = rm_ctrl_get,
+	.rm_ctrl_wr = rm_ctrl_wr,
+	.rm_ctrl_rd = rm_ctrl_rd,
+	.rm_ctrl_done = rm_ctrl_done,
 };
 
 /**
diff --git a/include/drm/nvkm_vgpu_mgr_vfio.h b/include/drm/nvkm_vgpu_mgr_vfio.h
index 79920cc27055..29ff9b39d0b2 100644
--- a/include/drm/nvkm_vgpu_mgr_vfio.h
+++ b/include/drm/nvkm_vgpu_mgr_vfio.h
@@ -27,6 +27,14 @@ struct nvkm_vgpu_mgr_vfio_ops {
 				struct nvidia_vgpu_gsp_client *client);
 	void (*free_gsp_client)(struct nvidia_vgpu_gsp_client *client);
 	u32 (*get_gsp_client_handle)(struct nvidia_vgpu_gsp_client *client);
+	void *(*rm_ctrl_get)(struct nvidia_vgpu_gsp_client *client,
+			     u32 cmd, u32 size);
+	int (*rm_ctrl_wr)(struct nvidia_vgpu_gsp_client *client,
+			  void *ctrl);
+	void *(*rm_ctrl_rd)(struct nvidia_vgpu_gsp_client *client, u32 cmd,
+			    u32 size);
+	void (*rm_ctrl_done)(struct nvidia_vgpu_gsp_client *client,
+			     void *ctrl);
 };
 
 struct nvkm_vgpu_mgr_vfio_ops *nvkm_vgpu_mgr_get_vfio_ops(void *handle);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 13/29] nvkm: move chid.h to nvkm/engine.
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (11 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 12/29] nvkm/vgpu: introduce GSP RM control interface " Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 14/29] nvkm/vgpu: introduce channel allocation for vGPU Zhi Wang
                   ` (18 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

Move chid.h to nvkm/engine so that vGPU manager support can expose the
routines for allocating CHIDs from the reserved CHID pool to the NVIDIA
vGPU VFIO module when creating a vGPU.

No functional change intended.

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 .../drm/nouveau/include/nvkm/engine/chid.h    | 29 ++++++++++++++++++
 .../gpu/drm/nouveau/nvkm/engine/fifo/chid.h   | 30 +------------------
 2 files changed, 30 insertions(+), 29 deletions(-)
 create mode 100644 drivers/gpu/drm/nouveau/include/nvkm/engine/chid.h

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/engine/chid.h b/drivers/gpu/drm/nouveau/include/nvkm/engine/chid.h
new file mode 100644
index 000000000000..b9e507af6725
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/include/nvkm/engine/chid.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: MIT */
+#ifndef __NVKM_CHID_H__
+#define __NVKM_CHID_H__
+#include <core/event.h>
+
+struct nvkm_chid {
+	struct kref kref;
+	int nr;
+	u32 mask;
+
+	struct nvkm_event event;
+
+	void **data;
+
+	spinlock_t lock;
+	unsigned long *reserved;
+	unsigned long used[];
+};
+
+int nvkm_chid_new(const struct nvkm_event_func *, struct nvkm_subdev *,
+		  int nr, int first, int count, struct nvkm_chid **pchid);
+struct nvkm_chid *nvkm_chid_ref(struct nvkm_chid *);
+void nvkm_chid_unref(struct nvkm_chid **);
+int nvkm_chid_get(struct nvkm_chid *, void *data);
+void nvkm_chid_put(struct nvkm_chid *, int id, spinlock_t *data_lock);
+int nvkm_chid_reserved_alloc(struct nvkm_chid *chid, int count);
+void nvkm_chid_reserved_free(struct nvkm_chid *chid, int first, int count);
+void nvkm_chid_reserve(struct nvkm_chid *chid, int first, int count);
+#endif
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chid.h b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chid.h
index b9e507af6725..a9c3e7143165 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chid.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chid.h
@@ -1,29 +1 @@
-/* SPDX-License-Identifier: MIT */
-#ifndef __NVKM_CHID_H__
-#define __NVKM_CHID_H__
-#include <core/event.h>
-
-struct nvkm_chid {
-	struct kref kref;
-	int nr;
-	u32 mask;
-
-	struct nvkm_event event;
-
-	void **data;
-
-	spinlock_t lock;
-	unsigned long *reserved;
-	unsigned long used[];
-};
-
-int nvkm_chid_new(const struct nvkm_event_func *, struct nvkm_subdev *,
-		  int nr, int first, int count, struct nvkm_chid **pchid);
-struct nvkm_chid *nvkm_chid_ref(struct nvkm_chid *);
-void nvkm_chid_unref(struct nvkm_chid **);
-int nvkm_chid_get(struct nvkm_chid *, void *data);
-void nvkm_chid_put(struct nvkm_chid *, int id, spinlock_t *data_lock);
-int nvkm_chid_reserved_alloc(struct nvkm_chid *chid, int count);
-void nvkm_chid_reserved_free(struct nvkm_chid *chid, int first, int count);
-void nvkm_chid_reserve(struct nvkm_chid *chid, int first, int count);
-#endif
+#include <engine/chid.h>
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 14/29] nvkm/vgpu: introduce channel allocation for vGPU
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (12 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 13/29] nvkm: move chid.h to nvkm/engine Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 15/29] nvkm/vgpu: introduce FB memory " Zhi Wang
                   ` (17 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

Creating a vGPU requires allocating a portion of the CHIDs from the
reserved channel pool.

Expose the routine for allocating channels from the reserved channel
pool to the NVIDIA vGPU VFIO module for creating a vGPU.
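
A minimal usage sketch (not part of this patch); it assumes, following
nvkm_chid_reserved_alloc(), that alloc_chids() returns the starting
offset of the reserved block on success and a negative errno on failure:

  int chid_offset;

  /* reserve a contiguous block of channel IDs for the new vGPU */
  chid_offset = ops->alloc_chids(handle, num_channels);
  if (chid_offset < 0)
          return chid_offset;

  /* ... the vGPU owns [chid_offset, chid_offset + num_channels) ... */

  /* return the block to the reserved pool when the vGPU is destroyed */
  ops->free_chids(handle, chid_offset, num_channels);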

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  2 ++
 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c  | 28 +++++++++++++++++++
 .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c  |  2 ++
 include/drm/nvkm_vgpu_mgr_vfio.h              |  2 ++
 4 files changed, 34 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
index 5a856fa905f9..a351e8bfc772 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
@@ -22,6 +22,8 @@ struct nvkm_vgpu_mgr {
 
 	void *vfio_ops;
 	struct nvidia_vgpu_vfio_handle_data vfio_handle_data;
+
+	struct mutex chid_alloc_lock;
 };
 
 bool nvkm_vgpu_mgr_is_supported(struct nvkm_device *device);
diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
index 9732e43a5d6b..44d901a0474d 100644
--- a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
@@ -1,6 +1,9 @@
 /* SPDX-License-Identifier: MIT */
 
 #include <core/device.h>
+#include <engine/chid.h>
+#include <engine/fifo.h>
+#include <subdev/fb.h>
 #include <subdev/gsp.h>
 
 #include <vgpu_mgr/vgpu_mgr.h>
@@ -128,6 +131,29 @@ static void rm_ctrl_done(struct nvidia_vgpu_gsp_client *client, void *ctrl)
 	nvkm_gsp_rm_ctrl_done(&device->subdevice, ctrl);
 }
 
+static void free_chids(void *handle, int offset, int count)
+{
+	struct nvkm_device *device = handle;
+	struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr;
+
+	mutex_lock(&vgpu_mgr->chid_alloc_lock);
+	nvkm_chid_reserved_free(device->fifo->chid, offset, count);
+	mutex_unlock(&vgpu_mgr->chid_alloc_lock);
+}
+
+static int alloc_chids(void *handle, int count)
+{
+	struct nvkm_device *device = handle;
+	struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr;
+	int ret;
+
+	mutex_lock(&vgpu_mgr->chid_alloc_lock);
+	ret = nvkm_chid_reserved_alloc(device->fifo->chid, count);
+	mutex_unlock(&vgpu_mgr->chid_alloc_lock);
+
+	return ret;
+}
+
 struct nvkm_vgpu_mgr_vfio_ops nvkm_vgpu_mgr_vfio_ops = {
 	.vgpu_mgr_is_enabled = vgpu_mgr_is_enabled,
 	.get_handle = get_handle,
@@ -140,6 +166,8 @@ struct nvkm_vgpu_mgr_vfio_ops nvkm_vgpu_mgr_vfio_ops = {
 	.rm_ctrl_wr = rm_ctrl_wr,
 	.rm_ctrl_rd = rm_ctrl_rd,
 	.rm_ctrl_done = rm_ctrl_done,
+	.alloc_chids = alloc_chids,
+	.free_chids = free_chids,
 };
 
 /**
diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
index caeb805cf1c3..3654bd43b68a 100644
--- a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
@@ -127,6 +127,8 @@ int nvkm_vgpu_mgr_init(struct nvkm_device *device)
 
 	vgpu_mgr->nvkm_dev = device;
 
+	mutex_init(&vgpu_mgr->chid_alloc_lock);
+
 	ret = attach_nvkm(vgpu_mgr);
 	if (ret)
 		return ret;
diff --git a/include/drm/nvkm_vgpu_mgr_vfio.h b/include/drm/nvkm_vgpu_mgr_vfio.h
index 29ff9b39d0b2..001306fb0b5b 100644
--- a/include/drm/nvkm_vgpu_mgr_vfio.h
+++ b/include/drm/nvkm_vgpu_mgr_vfio.h
@@ -35,6 +35,8 @@ struct nvkm_vgpu_mgr_vfio_ops {
 			    u32 size);
 	void (*rm_ctrl_done)(struct nvidia_vgpu_gsp_client *client,
 			     void *ctrl);
+	int (*alloc_chids)(void *handle, int count);
+	void (*free_chids)(void *handle, int offset, int count);
 };
 
 struct nvkm_vgpu_mgr_vfio_ops *nvkm_vgpu_mgr_get_vfio_ops(void *handle);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 15/29] nvkm/vgpu: introduce FB memory allocation for vGPU
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (13 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 14/29] nvkm/vgpu: introduce channel allocation for vGPU Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 16/29] nvkm/vgpu: introduce BAR1 map routines for vGPUs Zhi Wang
                   ` (16 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

Creating a vGPU requires allocating a mgmt heap from the FB memory. The
size of the mgmt heap that a vGPU requires is determined by its vGPU type.

Expose the FB memory allocation routine to the NVIDIA vGPU VFIO module so
that it can allocate the mgmt heap when creating a vGPU.
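
A minimal usage sketch (not part of this patch); mgmt_heap_size is a
placeholder that would come from the vGPU type (e.g. its gspHeapSize):

  struct nvidia_vgpu_mem *mgmt_heap;

  /* carve the mgmt heap out of the physical FB, VMMU-segment aligned */
  mgmt_heap = ops->alloc_fbmem(handle, mgmt_heap_size, true);
  if (IS_ERR(mgmt_heap))
          return PTR_ERR(mgmt_heap);

  /* mgmt_heap->addr and mgmt_heap->size describe the FB allocation */

  ops->free_fbmem(mgmt_heap);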

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  6 +++
 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c  | 51 +++++++++++++++++++
 include/drm/nvkm_vgpu_mgr_vfio.h              |  8 +++
 3 files changed, 65 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
index a351e8bfc772..b6e0321a53ad 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
@@ -6,6 +6,12 @@
 
 #define NVIDIA_MAX_VGPUS 2
 
+struct nvkm_vgpu_mem {
+	struct nvidia_vgpu_mem base;
+	struct nvkm_memory *mem;
+	struct nvkm_vgpu_mgr *vgpu_mgr;
+};
+
 struct nvkm_vgpu_mgr {
 	bool enabled;
 	struct nvkm_device *nvkm_dev;
diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
index 44d901a0474d..2aabb2c5f142 100644
--- a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
@@ -3,6 +3,7 @@
 #include <core/device.h>
 #include <engine/chid.h>
 #include <engine/fifo.h>
+#include <subdev/bar.h>
 #include <subdev/fb.h>
 #include <subdev/gsp.h>
 
@@ -154,6 +155,54 @@ static int alloc_chids(void *handle, int count)
 	return ret;
 }
 
+static void free_fbmem(struct nvidia_vgpu_mem *base)
+{
+	struct nvkm_vgpu_mem *mem =
+		container_of(base, struct nvkm_vgpu_mem, base);
+	struct nvkm_vgpu_mgr *vgpu_mgr = mem->vgpu_mgr;
+	struct nvkm_device *device = vgpu_mgr->nvkm_dev;
+
+	nvdev_debug(device, "free fb mem: addr %llx size %llx\n",
+		    base->addr, base->size);
+
+	nvkm_memory_unref(&mem->mem);
+	kfree(mem);
+}
+
+static struct nvidia_vgpu_mem *alloc_fbmem(void *handle, u64 size,
+					   bool vmmu_aligned)
+{
+	struct nvkm_device *device = handle;
+	struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr;
+	struct nvidia_vgpu_mem *base;
+	struct nvkm_vgpu_mem *mem;
+	u32 shift = vmmu_aligned ? ilog2(vgpu_mgr->vmmu_segment_size) :
+		    NVKM_RAM_MM_SHIFT;
+	int ret;
+
+	mem = kzalloc(sizeof(*mem), GFP_KERNEL);
+	if (!mem)
+		return ERR_PTR(-ENOMEM);
+
+	base = &mem->base;
+
+	ret = nvkm_ram_get(device, NVKM_RAM_MM_NORMAL, 0x1, shift, size,
+			   true, true, &mem->mem);
+	if (ret) {
+		kfree(mem);
+		return ERR_PTR(ret);
+	}
+
+	mem->vgpu_mgr = vgpu_mgr;
+	base->addr = mem->mem->func->addr(mem->mem);
+	base->size = mem->mem->func->size(mem->mem);
+
+	nvdev_debug(device, "alloc fb mem: addr %llx size %llx\n",
+		    base->addr, base->size);
+
+	return base;
+}
+
 struct nvkm_vgpu_mgr_vfio_ops nvkm_vgpu_mgr_vfio_ops = {
 	.vgpu_mgr_is_enabled = vgpu_mgr_is_enabled,
 	.get_handle = get_handle,
@@ -168,6 +217,8 @@ struct nvkm_vgpu_mgr_vfio_ops nvkm_vgpu_mgr_vfio_ops = {
 	.rm_ctrl_done = rm_ctrl_done,
 	.alloc_chids = alloc_chids,
 	.free_chids = free_chids,
+	.alloc_fbmem = alloc_fbmem,
+	.free_fbmem = free_fbmem,
 };
 
 /**
diff --git a/include/drm/nvkm_vgpu_mgr_vfio.h b/include/drm/nvkm_vgpu_mgr_vfio.h
index 001306fb0b5b..4841e9cf0d40 100644
--- a/include/drm/nvkm_vgpu_mgr_vfio.h
+++ b/include/drm/nvkm_vgpu_mgr_vfio.h
@@ -16,6 +16,11 @@ struct nvidia_vgpu_gsp_client {
 	void *gsp_device;
 };
 
+struct nvidia_vgpu_mem {
+	u64 addr;
+	u64 size;
+};
+
 struct nvkm_vgpu_mgr_vfio_ops {
 	bool (*vgpu_mgr_is_enabled)(void *handle);
 	void (*get_handle)(void *handle,
@@ -37,6 +42,9 @@ struct nvkm_vgpu_mgr_vfio_ops {
 			     void *ctrl);
 	int (*alloc_chids)(void *handle, int count);
 	void (*free_chids)(void *handle, int offset, int count);
+	struct nvidia_vgpu_mem *(*alloc_fbmem)(void *handle, u64 size,
+					       bool vmmu_aligned);
+	void (*free_fbmem)(struct nvidia_vgpu_mem *mem);
 };
 
 struct nvkm_vgpu_mgr_vfio_ops *nvkm_vgpu_mgr_get_vfio_ops(void *handle);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 16/29] nvkm/vgpu: introduce BAR1 map routines for vGPUs
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (14 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 15/29] nvkm/vgpu: introduce FB memory " Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 17/29] nvkm/vgpu: introduce engine bitmap for vGPU Zhi Wang
                   ` (15 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

The mgmt heap is a block of FBMEM shared between the GSP firmware and
the vGPU manager. It is used to support vGPU RPCs and vGPU logging.

To access the vGPU RPC and vGPU logging data structures, the mgmt heap
FBMEM needs to be mapped into BAR1, and the resulting BAR1 region must
then be mapped to a CPU virtual address.

Expose the BAR1 map routines to the NVIDIA vGPU VFIO module to map the
mgmt heap.
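
A minimal usage sketch (not part of this patch), continuing the mgmt heap
allocation from the previous patch:

  int ret;

  /* map the mgmt heap into BAR1 and obtain a CPU mapping of it */
  ret = ops->bar1_map_mem(mgmt_heap);
  if (ret)
          return ret;

  /* the heap is now CPU-accessible through the __iomem pointer */
  memset_io(mgmt_heap->bar1_vaddr, 0, mgmt_heap->size);

  ops->bar1_unmap_mem(mgmt_heap);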

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  1 +
 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c  | 47 +++++++++++++++++++
 include/drm/nvkm_vgpu_mgr_vfio.h              |  3 ++
 3 files changed, 51 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
index b6e0321a53ad..882965fd25ce 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
@@ -10,6 +10,7 @@ struct nvkm_vgpu_mem {
 	struct nvidia_vgpu_mem base;
 	struct nvkm_memory *mem;
 	struct nvkm_vgpu_mgr *vgpu_mgr;
+	struct nvkm_vma *bar1_vma;
 };
 
 struct nvkm_vgpu_mgr {
diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
index 2aabb2c5f142..535c2922d3af 100644
--- a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
@@ -6,6 +6,7 @@
 #include <subdev/bar.h>
 #include <subdev/fb.h>
 #include <subdev/gsp.h>
+#include <subdev/mmu.h>
 
 #include <vgpu_mgr/vgpu_mgr.h>
 #include <drm/nvkm_vgpu_mgr_vfio.h>
@@ -203,6 +204,50 @@ static struct nvidia_vgpu_mem *alloc_fbmem(void *handle, u64 size,
 	return base;
 }
 
+static void bar1_unmap_mem(struct nvidia_vgpu_mem *base)
+{
+	struct nvkm_vgpu_mem *mem =
+		container_of(base, struct nvkm_vgpu_mem, base);
+	struct nvkm_vgpu_mgr *vgpu_mgr = mem->vgpu_mgr;
+	struct nvkm_device *device = vgpu_mgr->nvkm_dev;
+	struct nvkm_vmm *vmm = nvkm_bar_bar1_vmm(device);
+
+	iounmap(base->bar1_vaddr);
+	base->bar1_vaddr = NULL;
+	nvkm_vmm_put(vmm, &mem->bar1_vma);
+	mem->bar1_vma = NULL;
+}
+
+static int bar1_map_mem(struct nvidia_vgpu_mem *base)
+{
+	struct nvkm_vgpu_mem *mem =
+		container_of(base, struct nvkm_vgpu_mem, base);
+	struct nvkm_vgpu_mgr *vgpu_mgr = mem->vgpu_mgr;
+	struct nvkm_device *device = vgpu_mgr->nvkm_dev;
+	struct nvkm_vmm *vmm = nvkm_bar_bar1_vmm(device);
+	unsigned long paddr;
+	int ret;
+
+	if (WARN_ON(base->bar1_vaddr || mem->bar1_vma))
+		return -EEXIST;
+
+	ret = nvkm_vmm_get(vmm, 12, base->size, &mem->bar1_vma);
+	if (ret)
+		return ret;
+
+	ret = nvkm_memory_map(mem->mem, 0, vmm, mem->bar1_vma, NULL, 0);
+	if (ret) {
+		nvkm_vmm_put(vmm, &mem->bar1_vma);
+		return ret;
+	}
+
+	paddr = device->func->resource_addr(device, 1) +
+		mem->bar1_vma->addr;
+
+	base->bar1_vaddr = ioremap(paddr, base->size);
+	return 0;
+}
+
 struct nvkm_vgpu_mgr_vfio_ops nvkm_vgpu_mgr_vfio_ops = {
 	.vgpu_mgr_is_enabled = vgpu_mgr_is_enabled,
 	.get_handle = get_handle,
@@ -219,6 +264,8 @@ struct nvkm_vgpu_mgr_vfio_ops nvkm_vgpu_mgr_vfio_ops = {
 	.free_chids = free_chids,
 	.alloc_fbmem = alloc_fbmem,
 	.free_fbmem = free_fbmem,
+	.bar1_map_mem = bar1_map_mem,
+	.bar1_unmap_mem = bar1_unmap_mem,
 };
 
 /**
diff --git a/include/drm/nvkm_vgpu_mgr_vfio.h b/include/drm/nvkm_vgpu_mgr_vfio.h
index 4841e9cf0d40..38b4cf5786fa 100644
--- a/include/drm/nvkm_vgpu_mgr_vfio.h
+++ b/include/drm/nvkm_vgpu_mgr_vfio.h
@@ -19,6 +19,7 @@ struct nvidia_vgpu_gsp_client {
 struct nvidia_vgpu_mem {
 	u64 addr;
 	u64 size;
+	void __iomem *bar1_vaddr;
 };
 
 struct nvkm_vgpu_mgr_vfio_ops {
@@ -45,6 +46,8 @@ struct nvkm_vgpu_mgr_vfio_ops {
 	struct nvidia_vgpu_mem *(*alloc_fbmem)(void *handle, u64 size,
 					       bool vmmu_aligned);
 	void (*free_fbmem)(struct nvidia_vgpu_mem *mem);
+	int (*bar1_map_mem)(struct nvidia_vgpu_mem *mem);
+	void (*bar1_unmap_mem)(struct nvidia_vgpu_mem *mem);
 };
 
 struct nvkm_vgpu_mgr_vfio_ops *nvkm_vgpu_mgr_get_vfio_ops(void *handle);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 17/29] nvkm/vgpu: introduce engine bitmap for vGPU
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (15 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 16/29] nvkm/vgpu: introduce BAR1 map routines for vGPUs Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 18/29] nvkm/vgpu: introduce pci_driver.sriov_configure() in nvkm Zhi Wang
                   ` (14 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

Activating a new vGPU requires configuring the allocated CHIDs on the
engines. Thus, an engine bitmap is required.

Expose the engine bitmap to the NVIDIA vGPU VFIO module for activating
the new vGPU.
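
A minimal usage sketch (not part of this patch); the bitmap size of 64 is
an assumed upper bound on the engine IDs reported by nvkm, and
configure_engine_chids() is a hypothetical per-engine setup helper:

  DECLARE_BITMAP(engine_bitmap, 64) = { 0 };
  unsigned int i;

  /* collect the IDs of all engines present on the runlists */
  ops->get_engine_bitmap(handle, engine_bitmap);

  /* program the vGPU's CHID range on every engine */
  for_each_set_bit(i, engine_bitmap, 64)
          configure_engine_chids(vgpu, i);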

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c | 15 +++++++++++++++
 include/drm/nvkm_vgpu_mgr_vfio.h             |  1 +
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
index 535c2922d3af..84c13d678ffa 100644
--- a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
@@ -3,6 +3,7 @@
 #include <core/device.h>
 #include <engine/chid.h>
 #include <engine/fifo.h>
+#include <engine/fifo/runl.h>
 #include <subdev/bar.h>
 #include <subdev/fb.h>
 #include <subdev/gsp.h>
@@ -248,6 +249,19 @@ static int bar1_map_mem(struct nvidia_vgpu_mem *base)
 	return 0;
 }
 
+static void get_engine_bitmap(void *handle, unsigned long *bitmap)
+{
+	struct nvkm_device *nvkm_dev = handle;
+	struct nvkm_runl *runl;
+	struct nvkm_engn *engn;
+
+	nvkm_runl_foreach(runl, nvkm_dev->fifo) {
+		nvkm_runl_foreach_engn(engn, runl) {
+			__set_bit(engn->id, bitmap);
+		}
+	}
+}
+
 struct nvkm_vgpu_mgr_vfio_ops nvkm_vgpu_mgr_vfio_ops = {
 	.vgpu_mgr_is_enabled = vgpu_mgr_is_enabled,
 	.get_handle = get_handle,
@@ -266,6 +280,7 @@ struct nvkm_vgpu_mgr_vfio_ops nvkm_vgpu_mgr_vfio_ops = {
 	.free_fbmem = free_fbmem,
 	.bar1_map_mem = bar1_map_mem,
 	.bar1_unmap_mem = bar1_unmap_mem,
+	.get_engine_bitmap = get_engine_bitmap,
 };
 
 /**
diff --git a/include/drm/nvkm_vgpu_mgr_vfio.h b/include/drm/nvkm_vgpu_mgr_vfio.h
index 38b4cf5786fa..d9ed2cd202ff 100644
--- a/include/drm/nvkm_vgpu_mgr_vfio.h
+++ b/include/drm/nvkm_vgpu_mgr_vfio.h
@@ -48,6 +48,7 @@ struct nvkm_vgpu_mgr_vfio_ops {
 	void (*free_fbmem)(struct nvidia_vgpu_mem *mem);
 	int (*bar1_map_mem)(struct nvidia_vgpu_mem *mem);
 	void (*bar1_unmap_mem)(struct nvidia_vgpu_mem *mem);
+	void (*get_engine_bitmap)(void *handle, unsigned long *bitmap);
 };
 
 struct nvkm_vgpu_mgr_vfio_ops *nvkm_vgpu_mgr_get_vfio_ops(void *handle);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 18/29] nvkm/vgpu: introduce pci_driver.sriov_configure() in nvkm
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (16 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 17/29] nvkm/vgpu: introduce engine bitmap for vGPU Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-26 22:56   ` Jason Gunthorpe
  2024-09-22 12:49 ` [RFC 19/29] vfio/vgpu_mgr: introduce vGPU lifecycle management prelude Zhi Wang
                   ` (13 subsequent siblings)
  31 siblings, 1 reply; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

The kernel PCI core provides a sysfs UAPI for the user to specify the
number of VFs to be enabled.

To support this UAPI, the driver is required to implement
pci_driver.sriov_configure(). The number of VFs can only be changed when
there are no active vGPUs.

Implement pci_driver.sriov_configure() in nvkm. Introduce a blocking
notifier call chain to let the NVIDIA vGPU manager handle this event.
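
The VF count is changed from userspace through the standard SR-IOV sysfs
attribute, e.g. by writing the desired count to
/sys/bus/pci/devices/<PF BDF>/sriov_numvfs. The sketch below (not part of
this patch) shows how the vGPU manager side could subscribe to the new
event; only the event ID and the notifier_block field are introduced
here, the notifier registration path itself is an assumption:

  static int vgpu_mgr_sriov_notify(struct notifier_block *nb,
                                   unsigned long event, void *data)
  {
          if (event == NVIDIA_VGPU_EVENT_PCI_SRIOV_CONFIGURE) {
                  /* react to the VF count change, e.g. refresh VF state */
          }
          return NOTIFY_OK;
  }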

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  1 +
 drivers/gpu/drm/nouveau/nvkm/device/pci.c     | 14 +++++++++++
 .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c  | 25 +++++++++++++++++++
 include/drm/nvkm_vgpu_mgr_vfio.h              |  5 ++++
 4 files changed, 45 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
index 882965fd25ce..388758fa7ce8 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
@@ -40,5 +40,6 @@ void nvkm_vgpu_mgr_fini(struct nvkm_device *device);
 void nvkm_vgpu_mgr_populate_gsp_vf_info(struct nvkm_device *device,
 					void *info);
 void nvkm_vgpu_mgr_init_vfio_ops(struct nvkm_vgpu_mgr *vgpu_mgr);
+int nvkm_vgpu_mgr_pci_sriov_configure(struct nvkm_device *device, int num_vfs);
 
 #endif
diff --git a/drivers/gpu/drm/nouveau/nvkm/device/pci.c b/drivers/gpu/drm/nouveau/nvkm/device/pci.c
index 1543902b20e9..f39d2727d653 100644
--- a/drivers/gpu/drm/nouveau/nvkm/device/pci.c
+++ b/drivers/gpu/drm/nouveau/nvkm/device/pci.c
@@ -1766,6 +1766,9 @@ nvkm_device_pci_probe(struct pci_dev *pci_dev, const struct pci_device_id *id)
 	struct nvkm_device *device;
 	int ret, bits;
 
+	if (pci_dev->is_virtfn)
+		return -EINVAL;
+
 	if (vga_switcheroo_client_probe_defer(pci_dev))
 		return -EPROBE_DEFER;
 
@@ -1867,6 +1870,16 @@ nvkm_device_pci_probe(struct pci_dev *pci_dev, const struct pci_device_id *id)
 	return ret;
 }
 
+static int nvkm_device_pci_sriov_configure(struct pci_dev *dev, int num_vfs)
+{
+	struct nvkm_device *device = pci_get_drvdata(dev);
+
+	if (!nvkm_vgpu_mgr_is_enabled(device))
+		return -ENODEV;
+
+	return nvkm_vgpu_mgr_pci_sriov_configure(device, num_vfs);
+}
+
 static struct pci_device_id
 nvkm_device_pci_id_table[] = {
 	{
@@ -1889,6 +1902,7 @@ nvkm_device_pci_driver = {
 	.probe = nvkm_device_pci_probe,
 	.remove = nvkm_device_pci_remove,
 	.driver.pm = &nvkm_device_pci_pm,
+	.sriov_configure = nvkm_device_pci_sriov_configure,
 };
 
 MODULE_DEVICE_TABLE(pci, nvkm_device_pci_id_table);
diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
index 3654bd43b68a..47c459f93950 100644
--- a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
@@ -207,3 +207,28 @@ void nvkm_vgpu_mgr_populate_gsp_vf_info(struct nvkm_device *device,
 	v = nvkm_rd32(device, 0x88000 + 0xbfc);
 	vf_info->b64bitBar2 = IS_BAR_64(v);
 }
+
+/**
+ * nvkm_vgpu_mgr_pci_sriov_configure - Configure SRIOV VFs
+ * @device: the nvkm_device pointer
+ * @num_vfs: Number of VFs
+ *
+ * Returns: 0 on success, negative on failure.
+ */
+int nvkm_vgpu_mgr_pci_sriov_configure(struct nvkm_device *device, int num_vfs)
+{
+	struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr;
+	struct nvidia_vgpu_vfio_handle_data *vfio = &vgpu_mgr->vfio_handle_data;
+	struct pci_dev *pdev = nvkm_to_pdev(device);
+	int ret = 0;
+
+	if (vfio->priv)
+		return -EBUSY;
+
+	if (num_vfs)
+		ret = pci_enable_sriov(pdev, num_vfs);
+	else
+		pci_disable_sriov(pdev);
+
+	return ret ? ret : num_vfs;
+}
diff --git a/include/drm/nvkm_vgpu_mgr_vfio.h b/include/drm/nvkm_vgpu_mgr_vfio.h
index d9ed2cd202ff..5c2c650c2df9 100644
--- a/include/drm/nvkm_vgpu_mgr_vfio.h
+++ b/include/drm/nvkm_vgpu_mgr_vfio.h
@@ -6,8 +6,13 @@
 #ifndef __NVKM_VGPU_MGR_VFIO_H__
 #define __NVKM_VGPU_MGR_VFIO_H__
 
+enum {
+	NVIDIA_VGPU_EVENT_PCI_SRIOV_CONFIGURE = 0,
+};
+
 struct nvidia_vgpu_vfio_handle_data {
 	void *priv;
+	struct notifier_block notifier;
 };
 
 /* A combo of handles of RmClient and RmDevice */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 19/29] vfio/vgpu_mgr: introduce vGPU lifecycle management prelude
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (17 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 18/29] nvkm/vgpu: introduce pci_driver.sriov_configure() in nvkm Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 20/29] vfio/vgpu_mgr: allocate GSP RM client for NVIDIA vGPU manager Zhi Wang
                   ` (12 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

To introduce the vGPU creation routines one by one in the following
patches, first introduce the prelude of the vGPU lifecycle management as
the skeleton.

Introduce the NVIDIA vGPU manager core module that hosts the vGPU
lifecycle management data structures and routines.
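
An illustrative sketch (not part of this patch) of how a VFIO variant
driver is expected to drive the lifecycle; the gfid, dbdf and vgpu_type
values are placeholders that the caller must provide:

  struct nvidia_vgpu *vgpu;
  int ret;

  vgpu = kzalloc(sizeof(*vgpu), GFP_KERNEL);
  if (!vgpu)
          return -ENOMEM;

  /* the caller must fill these in before creating the vGPU */
  vgpu->vgpu_mgr = vgpu_mgr;      /* from nvidia_vgpu_mgr_get(vf_pdev) */
  vgpu->pdev = vf_pdev;           /* the VF pci_dev backing the vGPU */
  vgpu->info.id = 0;              /* slot index, < NVIDIA_MAX_VGPUS */
  vgpu->info.gfid = gfid;         /* the VF's GFID, must be non-zero */
  vgpu->info.dbdf = dbdf;         /* PCI domain/bus/device/function */

  ret = nvidia_vgpu_mgr_create_vgpu(vgpu, vgpu_type);
  if (ret) {
          kfree(vgpu);
          return ret;
  }

  /* ... the vGPU is registered and usable ... */

  nvidia_vgpu_mgr_destroy_vgpu(vgpu);
  kfree(vgpu);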

Cc: Neo Jia <cjia@nvidia.com>
Cc: Surath Mitra <smitra@nvidia.com>
Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/vfio/pci/Kconfig                |  2 +
 drivers/vfio/pci/Makefile               |  2 +
 drivers/vfio/pci/nvidia-vgpu/Kconfig    | 13 ++++
 drivers/vfio/pci/nvidia-vgpu/Makefile   |  3 +
 drivers/vfio/pci/nvidia-vgpu/nvkm.h     | 46 ++++++++++++
 drivers/vfio/pci/nvidia-vgpu/vgpu.c     | 83 +++++++++++++++++++++
 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c | 99 +++++++++++++++++++++++++
 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h | 43 +++++++++++
 8 files changed, 291 insertions(+)
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/Kconfig
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/Makefile
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/nvkm.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/vgpu.c
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h

diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 15821a2d77d2..4b42378afc1a 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -69,4 +69,6 @@ source "drivers/vfio/pci/virtio/Kconfig"
 
 source "drivers/vfio/pci/nvgrace-gpu/Kconfig"
 
+source "drivers/vfio/pci/nvidia-vgpu/Kconfig"
+
 endmenu
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index ce7a61f1d912..88f722c5c161 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -17,3 +17,5 @@ obj-$(CONFIG_PDS_VFIO_PCI) += pds/
 obj-$(CONFIG_VIRTIO_VFIO_PCI) += virtio/
 
 obj-$(CONFIG_NVGRACE_GPU_VFIO_PCI) += nvgrace-gpu/
+
+obj-$(CONFIG_NVIDIA_VGPU_VFIO_PCI) += nvidia-vgpu/
diff --git a/drivers/vfio/pci/nvidia-vgpu/Kconfig b/drivers/vfio/pci/nvidia-vgpu/Kconfig
new file mode 100644
index 000000000000..a9b28e944902
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/Kconfig
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config NVIDIA_VGPU_MGR
+	tristate
+
+config NVIDIA_VGPU_VFIO_PCI
+	tristate "VFIO support for the NVIDIA vGPU"
+	select NVIDIA_VGPU_MGR
+	select VFIO_PCI_CORE
+	help
+	  VFIO support for the NVIDIA vGPU is required to assign the vGPU
+	  to userspace using KVM/qemu/etc.
+
+	  If you don't know what to do here, say N.
diff --git a/drivers/vfio/pci/nvidia-vgpu/Makefile b/drivers/vfio/pci/nvidia-vgpu/Makefile
new file mode 100644
index 000000000000..1d2c0eb1fa5c
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0-only
+obj-$(CONFIG_NVIDIA_VGPU_MGR) += nvidia-vgpu-mgr.o
+nvidia-vgpu-mgr-y := vgpu_mgr.o vgpu.o
diff --git a/drivers/vfio/pci/nvidia-vgpu/nvkm.h b/drivers/vfio/pci/nvidia-vgpu/nvkm.h
new file mode 100644
index 000000000000..4c75431ee1f6
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/nvkm.h
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright © 2024 NVIDIA Corporation
+ */
+#ifndef __NVIDIA_VGPU_MGR_NVKM_H__
+#define __NVIDIA_VGPU_MGR_NVKM_H__
+
+#include <linux/pci.h>
+#include <drm/nvkm_vgpu_mgr_vfio.h>
+
+struct nvidia_vgpu_mgr_handle {
+	void *pf_drvdata;
+	struct nvkm_vgpu_mgr_vfio_ops *ops;
+	struct nvidia_vgpu_vfio_handle_data data;
+};
+
+static inline int nvidia_vgpu_mgr_get_handle(struct pci_dev *pdev,
+		struct nvidia_vgpu_mgr_handle *h)
+{
+	struct pci_dev *pf_dev;
+
+	if (!pdev->is_virtfn)
+		return -EINVAL;
+
+	pf_dev = pdev->physfn;
+
+	if (strcmp(pf_dev->driver->name, "nvkm"))
+		return -EINVAL;
+
+	h->pf_drvdata = pci_get_drvdata(pf_dev);
+	h->ops = nvkm_vgpu_mgr_get_vfio_ops(h->pf_drvdata);
+	h->ops->get_handle(h->pf_drvdata, &h->data);
+
+	return 0;
+}
+
+#define nvidia_vgpu_mgr_support_is_enabled(h) \
+	(h).ops->vgpu_mgr_is_enabled((h).pf_drvdata)
+
+#define nvidia_vgpu_mgr_attach_handle(h) \
+	(h)->ops->attach_handle((h)->pf_drvdata, &(h)->data)
+
+#define nvidia_vgpu_mgr_detach_handle(h) \
+	(h)->ops->detach_handle((h)->pf_drvdata)
+
+#endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu.c b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
new file mode 100644
index 000000000000..34f6adb9dfe4
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
@@ -0,0 +1,83 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright © 2024 NVIDIA Corporation
+ */
+
+#include "vgpu_mgr.h"
+
+static void unregister_vgpu(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr = vgpu->vgpu_mgr;
+
+	mutex_lock(&vgpu_mgr->vgpu_id_lock);
+
+	vgpu_mgr->vgpus[vgpu->info.id] = NULL;
+	atomic_dec(&vgpu_mgr->num_vgpus);
+
+	mutex_unlock(&vgpu_mgr->vgpu_id_lock);
+}
+
+static int register_vgpu(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr = vgpu->vgpu_mgr;
+
+	mutex_lock(&vgpu_mgr->vgpu_id_lock);
+
+	if (vgpu_mgr->vgpus[vgpu->info.id]) {
+		mutex_unlock(&vgpu_mgr->vgpu_id_lock);
+		return -EBUSY;
+	}
+	vgpu_mgr->vgpus[vgpu->info.id] = vgpu;
+	atomic_inc(&vgpu_mgr->num_vgpus);
+
+	mutex_unlock(&vgpu_mgr->vgpu_id_lock);
+	return 0;
+}
+
+/**
+ * nvidia_vgpu_mgr_destroy_vgpu - destroy a vGPU instance
+ * @vgpu: the vGPU instance going to be destroyed.
+ *
+ * Returns: 0 on success, others on failure.
+ */
+int nvidia_vgpu_mgr_destroy_vgpu(struct nvidia_vgpu *vgpu)
+{
+	if (!atomic_cmpxchg(&vgpu->status, 1, 0))
+		return -ENODEV;
+
+	unregister_vgpu(vgpu);
+	return 0;
+}
+EXPORT_SYMBOL(nvidia_vgpu_mgr_destroy_vgpu);
+
+/**
+ * nvidia_vgpu_mgr_create_vgpu - create a vGPU instance
+ * @vgpu: the vGPU instance going to be created.
+ * @vgpu_type: the vGPU type of the vGPU instance.
+ *
+ * The caller must initialize vgpu->vgpu_mgr, gpu->info, vgpu->pdev.
+ *
+ * Returns: 0 on success, others on failure.
+ */
+int nvidia_vgpu_mgr_create_vgpu(struct nvidia_vgpu *vgpu, u8 *vgpu_type)
+{
+	int ret;
+
+	if (WARN_ON(vgpu->info.id >= NVIDIA_MAX_VGPUS))
+		return -EINVAL;
+
+	if (WARN_ON(!vgpu->vgpu_mgr || !vgpu->info.gfid || !vgpu->info.dbdf))
+		return -EINVAL;
+
+	mutex_init(&vgpu->lock);
+	vgpu->vgpu_type = vgpu_type;
+
+	ret = register_vgpu(vgpu);
+	if (ret)
+		return ret;
+
+	atomic_set(&vgpu->status, 1);
+
+	return 0;
+}
+EXPORT_SYMBOL(nvidia_vgpu_mgr_create_vgpu);
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c
new file mode 100644
index 000000000000..dc2a73f95650
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c
@@ -0,0 +1,99 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright © 2024 NVIDIA Corporation
+ */
+
+#include "vgpu_mgr.h"
+
+DEFINE_MUTEX(vgpu_mgr_attach_lock);
+
+static void vgpu_mgr_release(struct kref *kref)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr =
+		container_of(kref, struct nvidia_vgpu_mgr, refcount);
+
+	nvidia_vgpu_mgr_detach_handle(&vgpu_mgr->handle);
+	kvfree(vgpu_mgr);
+}
+
+/**
+ * nvidia_vgpu_mgr_put - put the vGPU manager
+ * @vgpu: the vGPU manager to put.
+ *
+ */
+void nvidia_vgpu_mgr_put(struct nvidia_vgpu_mgr *vgpu_mgr)
+{
+	if (!nvidia_vgpu_mgr_support_is_enabled(vgpu_mgr->handle))
+		return;
+
+	mutex_lock(&vgpu_mgr_attach_lock);
+	kref_put(&vgpu_mgr->refcount, vgpu_mgr_release);
+	mutex_unlock(&vgpu_mgr_attach_lock);
+}
+EXPORT_SYMBOL(nvidia_vgpu_mgr_put);
+
+/**
+ * nvidia_vgpu_mgr_get - get the vGPU manager
+ * @dev: the VF pci_dev.
+ *
+ * Returns: pointer to vgpu_mgr on success, IS_ERR() on failure.
+ */
+struct nvidia_vgpu_mgr *nvidia_vgpu_mgr_get(struct pci_dev *dev)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr;
+	struct nvidia_vgpu_mgr_handle handle;
+	int ret;
+
+	mutex_lock(&vgpu_mgr_attach_lock);
+
+	memset(&handle, 0, sizeof(handle));
+
+	ret = nvidia_vgpu_mgr_get_handle(dev, &handle);
+	if (ret) {
+		mutex_unlock(&vgpu_mgr_attach_lock);
+		return ERR_PTR(ret);
+	}
+
+	if (!nvidia_vgpu_mgr_support_is_enabled(handle)) {
+		mutex_unlock(&vgpu_mgr_attach_lock);
+		return ERR_PTR(-ENODEV);
+	}
+
+	if (handle.data.priv) {
+		vgpu_mgr = handle.data.priv;
+		kref_get(&vgpu_mgr->refcount);
+		mutex_unlock(&vgpu_mgr_attach_lock);
+		return vgpu_mgr;
+	}
+
+	vgpu_mgr = kvzalloc(sizeof(*vgpu_mgr), GFP_KERNEL);
+	if (!vgpu_mgr) {
+		ret = -ENOMEM;
+		goto fail_alloc_vgpu_mgr;
+	}
+
+	vgpu_mgr->handle = handle;
+	vgpu_mgr->handle.data.priv = vgpu_mgr;
+
+	ret = nvidia_vgpu_mgr_attach_handle(&handle);
+	if (ret)
+		goto fail_attach_handle;
+
+	kref_init(&vgpu_mgr->refcount);
+	mutex_init(&vgpu_mgr->vgpu_id_lock);
+
+	mutex_unlock(&vgpu_mgr_attach_lock);
+	return vgpu_mgr;
+
+fail_attach_handle:
+	kvfree(vgpu_mgr);
+fail_alloc_vgpu_mgr:
+	mutex_unlock(&vgpu_mgr_attach_lock);
+	vgpu_mgr = ERR_PTR(ret);
+	return vgpu_mgr;
+}
+EXPORT_SYMBOL(nvidia_vgpu_mgr_get);
+
+MODULE_LICENSE("Dual MIT/GPL");
+MODULE_AUTHOR("Zhi Wang <zhiw@nvidia.com>");
+MODULE_DESCRIPTION("NVIDIA VGPU manager - core module to support VFIO PCI driver for NVIDIA vGPU");
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
new file mode 100644
index 000000000000..2efd96644098
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright © 2024 NVIDIA Corporation
+ */
+#ifndef __NVIDIA_VGPU_MGR_H__
+#define __NVIDIA_VGPU_MGR_H__
+
+#include "nvkm.h"
+
+#define NVIDIA_MAX_VGPUS 2
+
+struct nvidia_vgpu_info {
+	int id;
+	u32 gfid;
+	u32 dbdf;
+};
+
+struct nvidia_vgpu {
+	struct mutex lock;
+	atomic_t status;
+	struct pci_dev *pdev;
+
+	u8 *vgpu_type;
+	struct nvidia_vgpu_info info;
+	struct nvidia_vgpu_mgr *vgpu_mgr;
+};
+
+struct nvidia_vgpu_mgr {
+	struct kref refcount;
+	struct nvidia_vgpu_mgr_handle handle;
+
+	struct mutex vgpu_id_lock;
+	struct nvidia_vgpu *vgpus[NVIDIA_MAX_VGPUS];
+	atomic_t num_vgpus;
+};
+
+struct nvidia_vgpu_mgr *nvidia_vgpu_mgr_get(struct pci_dev *dev);
+void nvidia_vgpu_mgr_put(struct nvidia_vgpu_mgr *vgpu_mgr);
+
+int nvidia_vgpu_mgr_destroy_vgpu(struct nvidia_vgpu *vgpu);
+int nvidia_vgpu_mgr_create_vgpu(struct nvidia_vgpu *vgpu, u8 *vgpu_type);
+
+#endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 20/29] vfio/vgpu_mgr: allocate GSP RM client for NVIDIA vGPU manager
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (18 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 19/29] vfio/vgpu_mgr: introduce vGPU lifecycle management prelude Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 21/29] vfio/vgpu_mgr: introduce vGPU type uploading Zhi Wang
                   ` (11 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

A GSP RM client is required when talking to the GSP firmware via GSP RM
controls.

In order to create vGPUs, the NVIDIA vGPU manager requires a GSP RM
client to upload vGPU types to the GSP firmware.

Allocate a dedicated GSP RM client for the NVIDIA vGPU manager.
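
A minimal sketch (not part of this patch) of how the new wrappers in
nvkm.h are used; client_handle would later be plugged into GSP RM control
parameters:

  struct nvidia_vgpu_gsp_client *client = &vgpu_mgr->gsp_client;
  u32 client_handle;
  int ret;

  /* allocate the manager's dedicated RmClient/RmDevice pair */
  ret = nvidia_vgpu_mgr_alloc_gsp_client(vgpu_mgr, client);
  if (ret)
          return ret;

  /* raw RM client handle, usable in GSP RM control parameters */
  client_handle = nvidia_vgpu_mgr_get_gsp_client_handle(vgpu_mgr, client);

  /* released again when the vGPU manager is torn down */
  nvidia_vgpu_mgr_free_gsp_client(vgpu_mgr, client);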

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/vfio/pci/nvidia-vgpu/nvkm.h     | 9 +++++++++
 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c | 8 ++++++++
 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h | 2 ++
 3 files changed, 19 insertions(+)

diff --git a/drivers/vfio/pci/nvidia-vgpu/nvkm.h b/drivers/vfio/pci/nvidia-vgpu/nvkm.h
index 4c75431ee1f6..939f3b420bb3 100644
--- a/drivers/vfio/pci/nvidia-vgpu/nvkm.h
+++ b/drivers/vfio/pci/nvidia-vgpu/nvkm.h
@@ -43,4 +43,13 @@ static inline int nvidia_vgpu_mgr_get_handle(struct pci_dev *pdev,
 #define nvidia_vgpu_mgr_detach_handle(h) \
 	(h)->ops->detach_handle((h)->pf_drvdata)
 
+#define nvidia_vgpu_mgr_alloc_gsp_client(m, c) \
+	m->handle.ops->alloc_gsp_client(m->handle.pf_drvdata, c)
+
+#define nvidia_vgpu_mgr_free_gsp_client(m, c) \
+	m->handle.ops->free_gsp_client(c)
+
+#define nvidia_vgpu_mgr_get_gsp_client_handle(m, c) \
+	m->handle.ops->get_gsp_client_handle(c)
+
 #endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c
index dc2a73f95650..812b7be00bee 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c
@@ -12,6 +12,7 @@ static void vgpu_mgr_release(struct kref *kref)
 	struct nvidia_vgpu_mgr *vgpu_mgr =
 		container_of(kref, struct nvidia_vgpu_mgr, refcount);
 
+	nvidia_vgpu_mgr_free_gsp_client(vgpu_mgr, &vgpu_mgr->gsp_client);
 	nvidia_vgpu_mgr_detach_handle(&vgpu_mgr->handle);
 	kvfree(vgpu_mgr);
 }
@@ -82,9 +83,16 @@ struct nvidia_vgpu_mgr *nvidia_vgpu_mgr_get(struct pci_dev *dev)
 	kref_init(&vgpu_mgr->refcount);
 	mutex_init(&vgpu_mgr->vgpu_id_lock);
 
+	ret = nvidia_vgpu_mgr_alloc_gsp_client(vgpu_mgr,
+					       &vgpu_mgr->gsp_client);
+	if (ret)
+		goto fail_alloc_gsp_client;
+
 	mutex_unlock(&vgpu_mgr_attach_lock);
 	return vgpu_mgr;
 
+fail_alloc_gsp_client:
+	nvidia_vgpu_mgr_detach_handle(&vgpu_mgr->handle);
 fail_attach_handle:
 	kvfree(vgpu_mgr);
 fail_alloc_vgpu_mgr:
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
index 2efd96644098..f4416e6ed8f9 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
@@ -32,6 +32,8 @@ struct nvidia_vgpu_mgr {
 	struct mutex vgpu_id_lock;
 	struct nvidia_vgpu *vgpus[NVIDIA_MAX_VGPUS];
 	atomic_t num_vgpus;
+
+	struct nvidia_vgpu_gsp_client gsp_client;
 };
 
 struct nvidia_vgpu_mgr *nvidia_vgpu_mgr_get(struct pci_dev *dev);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 21/29] vfio/vgpu_mgr: introduce vGPU type uploading
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (19 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 20/29] vfio/vgpu_mgr: allocate GSP RM client for NVIDIA vGPU manager Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 22/29] vfio/vgpu_mgr: allocate vGPU FB memory when creating vGPUs Zhi Wang
                   ` (10 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

Each type of vGPU is designed to meet specific requirements, from
supporting multiple users with demanding graphics applications to
powering AI workloads in virtualized environments.

To create a vGPU of a given vGPU type, the vGPU type specs must first be
uploaded to the GSP firmware.

Introduce the vGPU type uploading definitions and upload the vGPU type
specs when vGPU support is enabled. Upload a default vGPU type: L40-24Q.
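
As a rough sketch (not part of this patch) of the upload path, the type
specs are packed into the ADD_VGPU_TYPE control and pushed through the
vGPU manager's GSP RM client; vgpu_type_id and fb_length are placeholders
taken from a vGPU type spec:

  struct nvkm_vgpu_mgr_vfio_ops *ops = vgpu_mgr->handle.ops;
  NV2080_CTRL_VGPU_MGR_INTERNAL_PGPU_ADD_VGPU_TYPE_PARAMS *ctrl;

  ctrl = ops->rm_ctrl_get(&vgpu_mgr->gsp_client,
                          NV2080_CTRL_CMD_VGPU_MGR_INTERNAL_PGPU_ADD_VGPU_TYPE,
                          sizeof(*ctrl));
  if (IS_ERR(ctrl))
          return PTR_ERR(ctrl);

  ctrl->discardVgpuTypes = 1;            /* replace previously uploaded types */
  ctrl->vgpuInfoCount = 1;
  ctrl->vgpuInfo[0].vgpuType = vgpu_type_id;   /* e.g. the L40-24Q type ID */
  ctrl->vgpuInfo[0].fbLength = fb_length;      /* per-type framebuffer size */
  /* ... the remaining NVA081_CTRL_VGPU_INFO fields come from the type spec ... */

  return ops->rm_ctrl_wr(&vgpu_mgr->gsp_client, ctrl);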

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/vfio/pci/nvidia-vgpu/Makefile         |   4 +-
 .../nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h    |  33 ++
 .../ctrl/ctrl2080/ctrl2080vgpumgrinternal.h   |  62 +++
 .../common/sdk/nvidia/inc/ctrl/ctrla081.h     | 109 ++++
 .../pci/nvidia-vgpu/include/nvrm/nvtypes.h    |  24 +
 drivers/vfio/pci/nvidia-vgpu/nvkm.h           |  12 +
 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c       |   6 +
 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h       |   5 +
 drivers/vfio/pci/nvidia-vgpu/vgpu_types.c     | 466 ++++++++++++++++++
 9 files changed, 720 insertions(+), 1 deletion(-)
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrla081.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/nvtypes.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/vgpu_types.c

diff --git a/drivers/vfio/pci/nvidia-vgpu/Makefile b/drivers/vfio/pci/nvidia-vgpu/Makefile
index 1d2c0eb1fa5c..bd65fa548ea1 100644
--- a/drivers/vfio/pci/nvidia-vgpu/Makefile
+++ b/drivers/vfio/pci/nvidia-vgpu/Makefile
@@ -1,3 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0-only
+ccflags-y += -I$(srctree)/$(src)/include
+
 obj-$(CONFIG_NVIDIA_VGPU_MGR) += nvidia-vgpu-mgr.o
-nvidia-vgpu-mgr-y := vgpu_mgr.o vgpu.o
+nvidia-vgpu-mgr-y := vgpu_mgr.o vgpu.o vgpu_types.o
diff --git a/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h
new file mode 100644
index 000000000000..5b6750f6f5a2
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h
@@ -0,0 +1,33 @@
+#ifndef __src_common_sdk_nvidia_inc_ctrl_ctrl2080_ctrl2080gpu_h__
+#define __src_common_sdk_nvidia_inc_ctrl_ctrl2080_ctrl2080gpu_h__
+
+/* Excerpt of RM headers from https://github.com/NVIDIA/open-gpu-kernel-modules/tree/535.113.01 */
+
+/*
+ * SPDX-FileCopyrightText: Copyright (c) 2006-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-License-Identifier: MIT
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#define NV_GRID_LICENSE_INFO_MAX_LENGTH (128)
+
+#define NV2080_GPU_MAX_NAME_STRING_LENGTH                  (0x0000040U)
+
+#endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h
new file mode 100644
index 000000000000..f44cdc733229
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h
@@ -0,0 +1,62 @@
+#ifndef __src_common_sdk_nvidia_inc_ctrl_ctrl2080_ctrl2080vgpumgrinternal_h__
+#define __src_common_sdk_nvidia_inc_ctrl_ctrl2080_ctrl2080vgpumgrinternal_h__
+
+/* Excerpt of RM headers from https://github.com/NVIDIA/open-gpu-kernel-modules/tree/535.113.01 */
+
+/*
+ * SPDX-FileCopyrightText: Copyright (c) 2021-2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-License-Identifier: MIT
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+/*
+ * NV2080_CTRL_CMD_VGPU_MGR_INTERNAL_PGPU_ADD_VGPU_TYPE
+ *
+ * This command is used to add a new vGPU config to the pGPU in physical RM.
+ * Unlike NVA081_CTRL_CMD_VGPU_CONFIG_SET_INFO, it does no validation
+ * and is only to be used internally.
+ *
+ * discardVgpuTypes [IN]
+ *  This parameter specifies if existing vGPU configuration should be
+ *  discarded for given pGPU
+ *
+ * vgpuInfoCount [IN]
+ *   This parameter specifies the number of entries of virtual GPU type
+ *   information
+ *
+ * vgpuInfo [IN]
+ *   This parameter specifies virtual GPU type information
+ *
+ * Possible status values returned are:
+ *   NV_OK
+ *   NV_ERR_OBJECT_NOT_FOUND
+ *   NV_ERR_NOT_SUPPORTED
+ */
+#define NV2080_CTRL_CMD_VGPU_MGR_INTERNAL_PGPU_ADD_VGPU_TYPE (0x20804003) /* finn: Evaluated from "(FINN_NV20_SUBDEVICE_0_VGPU_MGR_INTERNAL_INTERFACE_ID << 8) | NV2080_CTRL_VGPU_MGR_INTERNAL_PGPU_ADD_VGPU_TYPE_PARAMS_MESSAGE_ID" */
+
+#define NV2080_CTRL_VGPU_MGR_INTERNAL_PGPU_ADD_VGPU_TYPE_PARAMS_MESSAGE_ID (0x3U)
+
+typedef struct NV2080_CTRL_VGPU_MGR_INTERNAL_PGPU_ADD_VGPU_TYPE_PARAMS {
+	NvBool discardVgpuTypes;
+	NvU32  vgpuInfoCount;
+	NV_DECLARE_ALIGNED(NVA081_CTRL_VGPU_INFO vgpuInfo[NVA081_MAX_VGPU_TYPES_PER_PGPU], 8);
+} NV2080_CTRL_VGPU_MGR_INTERNAL_PGPU_ADD_VGPU_TYPE_PARAMS;
+
+#endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrla081.h b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrla081.h
new file mode 100644
index 000000000000..c40740e3f9df
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrla081.h
@@ -0,0 +1,109 @@
+#ifndef __src_common_sdk_nvidia_inc_ctrl_ctrla081_h__
+#define __src_common_sdk_nvidia_inc_ctrl_ctrla081_h__
+
+/* Excerpt of RM headers from https://github.com/NVIDIA/open-gpu-kernel-modules/tree/535.113.01 */
+
+/*
+ * SPDX-FileCopyrightText: Copyright (c) 2014-2020 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-License-Identifier: MIT
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h>
+
+#define NVA081_CTRL_VGPU_CONFIG_INVALID_TYPE 0x00
+#define NVA081_MAX_VGPU_TYPES_PER_PGPU       0x40
+#define NVA081_MAX_VGPU_PER_PGPU             32
+#define NVA081_VM_UUID_SIZE                  16
+#define NVA081_VGPU_STRING_BUFFER_SIZE       32
+#define NVA081_VGPU_SIGNATURE_SIZE           128
+#define NVA081_VM_NAME_SIZE                  128
+#define NVA081_PCI_CONFIG_SPACE_SIZE         0x100
+#define NVA081_PGPU_METADATA_STRING_SIZE     256
+#define NVA081_EXTRA_PARAMETERS_SIZE         1024
+
+/*
+ * NVA081_CTRL_VGPU_CONFIG_INFO
+ *
+ * This structure represents the per vGPU information
+ *
+ */
+typedef struct NVA081_CTRL_VGPU_INFO {
+	// This structure should be in sync with NVA082_CTRL_CMD_HOST_VGPU_DEVICE_GET_VGPU_TYPE_INFO_PARAMS
+	NvU32 vgpuType;
+	NvU8  vgpuName[NVA081_VGPU_STRING_BUFFER_SIZE];
+	NvU8  vgpuClass[NVA081_VGPU_STRING_BUFFER_SIZE];
+	NvU8  vgpuSignature[NVA081_VGPU_SIGNATURE_SIZE];
+	NvU8  license[NV_GRID_LICENSE_INFO_MAX_LENGTH];
+	NvU32 maxInstance;
+	NvU32 numHeads;
+	NvU32 maxResolutionX;
+	NvU32 maxResolutionY;
+	NvU32 maxPixels;
+	NvU32 frlConfig;
+	NvU32 cudaEnabled;
+	NvU32 eccSupported;
+	NvU32 gpuInstanceSize;
+	NvU32 multiVgpuSupported;
+	NV_DECLARE_ALIGNED(NvU64 vdevId, 8);
+	NV_DECLARE_ALIGNED(NvU64 pdevId, 8);
+	NV_DECLARE_ALIGNED(NvU64 profileSize, 8);
+	NV_DECLARE_ALIGNED(NvU64 fbLength, 8);
+	NV_DECLARE_ALIGNED(NvU64 gspHeapSize, 8);
+	NV_DECLARE_ALIGNED(NvU64 fbReservation, 8);
+	NV_DECLARE_ALIGNED(NvU64 mappableVideoSize, 8);
+	NvU32 encoderCapacity;
+	NV_DECLARE_ALIGNED(NvU64 bar1Length, 8);
+	NvU32 frlEnable;
+	NvU8  adapterName[NV2080_GPU_MAX_NAME_STRING_LENGTH];
+	NvU16 adapterName_Unicode[NV2080_GPU_MAX_NAME_STRING_LENGTH];
+	NvU8  shortGpuNameString[NV2080_GPU_MAX_NAME_STRING_LENGTH];
+	NvU8  licensedProductName[NV_GRID_LICENSE_INFO_MAX_LENGTH];
+	NvU32 vgpuExtraParams[NVA081_EXTRA_PARAMETERS_SIZE];
+	NvU32 ftraceEnable;
+	NvU32 gpuDirectSupported;
+	NvU32 nvlinkP2PSupported;
+	NvU32 multiVgpuExclusive;
+	NvU32 exclusiveType;
+	NvU32 exclusiveSize;
+	// used only by NVML
+	NvU32 gpuInstanceProfileId;
+} NVA081_CTRL_VGPU_INFO;
+
+/*
+ * NVA081_CTRL_VGPU_CONFIG_INFO_PARAMS
+ *
+ * This structure represents the vGPU configuration information
+ *
+ */
+#define NVA081_CTRL_VGPU_CONFIG_INFO_PARAMS_MESSAGE_ID (0x1U)
+
+typedef struct NVA081_CTRL_VGPU_CONFIG_INFO_PARAMS {
+	NvBool discardVgpuTypes;
+	NV_DECLARE_ALIGNED(NVA081_CTRL_VGPU_INFO vgpuInfo, 8);
+	NvU32  vgpuConfigState;
+} NVA081_CTRL_VGPU_CONFIG_INFO_PARAMS;
+
+/* VGPU Config state values */
+#define NVA081_CTRL_VGPU_CONFIG_STATE_UNINITIALIZED         0
+#define NVA081_CTRL_VGPU_CONFIG_STATE_IN_PROGRESS           1
+#define NVA081_CTRL_VGPU_CONFIG_STATE_READY                 2
+
+#endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/include/nvrm/nvtypes.h b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/nvtypes.h
new file mode 100644
index 000000000000..e6833df1ccc7
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/nvtypes.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: MIT */
+#ifndef __NVRM_NVTYPES_H__
+#define __NVRM_NVTYPES_H__
+
+#define NV_ALIGN_BYTES(a) __attribute__ ((__aligned__(a)))
+#define NV_DECLARE_ALIGNED(f,a) f __attribute__ ((__aligned__(a)))
+
+typedef u32 NvV32;
+
+typedef u8 NvU8;
+typedef u16 NvU16;
+typedef u32 NvU32;
+typedef u64 NvU64;
+
+typedef void* NvP64;
+
+typedef NvU8 NvBool;
+typedef NvU32 NvHandle;
+typedef NvU64 NvLength;
+
+typedef NvU64 RmPhysAddr;
+
+typedef NvU32 NV_STATUS;
+#endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/nvkm.h b/drivers/vfio/pci/nvidia-vgpu/nvkm.h
index 939f3b420bb3..065bb7aa55f8 100644
--- a/drivers/vfio/pci/nvidia-vgpu/nvkm.h
+++ b/drivers/vfio/pci/nvidia-vgpu/nvkm.h
@@ -52,4 +52,16 @@ static inline int nvidia_vgpu_mgr_get_handle(struct pci_dev *pdev,
 #define nvidia_vgpu_mgr_get_gsp_client_handle(m, c) \
 	m->handle.ops->get_gsp_client_handle(c)
 
+#define nvidia_vgpu_mgr_rm_ctrl_get(m, g, c, s) \
+	m->handle.ops->rm_ctrl_get(g, c, s)
+
+#define nvidia_vgpu_mgr_rm_ctrl_wr(m, g, c) \
+	m->handle.ops->rm_ctrl_wr(g, c)
+
+#define nvidia_vgpu_mgr_rm_ctrl_rd(m, g, c, s) \
+	m->handle.ops->rm_ctrl_rd(g, c, s)
+
+#define nvidia_vgpu_mgr_rm_ctrl_done(m, g, c) \
+	m->handle.ops->rm_ctrl_done(g, c)
+
 #endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c
index 812b7be00bee..dcb314b14f91 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c
@@ -88,9 +88,15 @@ struct nvidia_vgpu_mgr *nvidia_vgpu_mgr_get(struct pci_dev *dev)
 	if (ret)
 		goto fail_alloc_gsp_client;
 
+	ret = nvidia_vgpu_mgr_init_vgpu_types(vgpu_mgr);
+	if (ret)
+		goto fail_init_vgpu_types;
+
 	mutex_unlock(&vgpu_mgr_attach_lock);
 	return vgpu_mgr;
 
+fail_init_vgpu_types:
+	nvidia_vgpu_mgr_free_gsp_client(vgpu_mgr, &vgpu_mgr->gsp_client);
 fail_alloc_gsp_client:
 	nvidia_vgpu_mgr_detach_handle(&vgpu_mgr->handle);
 fail_attach_handle:
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
index f4416e6ed8f9..eb2df9f0fe07 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
@@ -33,6 +33,9 @@ struct nvidia_vgpu_mgr {
 	struct nvidia_vgpu *vgpus[NVIDIA_MAX_VGPUS];
 	atomic_t num_vgpus;
 
+	u8 **vgpu_types;
+	u32 num_vgpu_types;
+
 	struct nvidia_vgpu_gsp_client gsp_client;
 };
 
@@ -42,4 +45,6 @@ void nvidia_vgpu_mgr_put(struct nvidia_vgpu_mgr *vgpu_mgr);
 int nvidia_vgpu_mgr_destroy_vgpu(struct nvidia_vgpu *vgpu);
 int nvidia_vgpu_mgr_create_vgpu(struct nvidia_vgpu *vgpu, u8 *vgpu_type);
 
+int nvidia_vgpu_mgr_init_vgpu_types(struct nvidia_vgpu_mgr *vgpu_mgr);
+
 #endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu_types.c b/drivers/vfio/pci/nvidia-vgpu/vgpu_types.c
new file mode 100644
index 000000000000..d402663e01bc
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu_types.c
@@ -0,0 +1,466 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright © 2024 NVIDIA Corporation
+ */
+
+#include <linux/kernel.h>
+
+#include <nvrm/nvtypes.h>
+#include <nvrm/common/sdk/nvidia/inc/ctrl/ctrla081.h>
+#include <nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h>
+
+#include "vgpu_mgr.h"
+
+unsigned char vgpu_type_869[] = {
+  0x65, 0x03, 0x00, 0x00, 0x4e, 0x56, 0x49, 0x44, 0x49, 0x41, 0x20, 0x4c,
+  0x34, 0x30, 0x2d, 0x31, 0x42, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x4e, 0x56, 0x53, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x44, 0x6c, 0xab, 0x5b,
+  0x8f, 0xdb, 0xd5, 0x37, 0x09, 0x9a, 0x35, 0x04, 0x91, 0xde, 0x94, 0xb3,
+  0x63, 0x78, 0xff, 0xca, 0x38, 0x2f, 0x0a, 0xee, 0xe2, 0x76, 0xb0, 0xe9,
+  0xda, 0x9d, 0xf2, 0x5c, 0x2f, 0xb2, 0xe3, 0xe8, 0xfb, 0x96, 0x64, 0x9f,
+  0x5a, 0x59, 0x5f, 0x1e, 0x13, 0x19, 0x5a, 0x45, 0x3a, 0xf4, 0x85, 0x04,
+  0xba, 0xc5, 0x53, 0x1c, 0xfc, 0x3a, 0x8b, 0x78, 0x98, 0x36, 0x49, 0x66,
+  0x91, 0x0d, 0x99, 0x55, 0x59, 0xec, 0x8d, 0xd7, 0x13, 0x4a, 0x0a, 0x60,
+  0x2b, 0xd1, 0x96, 0x05, 0x13, 0x92, 0xde, 0xe0, 0xf8, 0x83, 0xac, 0x0b,
+  0xf7, 0x3c, 0xb5, 0x35, 0x44, 0x90, 0xcb, 0x6b, 0x63, 0xba, 0xc8, 0xaa,
+  0x3c, 0x34, 0xfd, 0xf6, 0x94, 0xdd, 0x74, 0x85, 0x3e, 0x44, 0xb5, 0x4d,
+  0xff, 0x99, 0x40, 0x52, 0x8c, 0xd2, 0xec, 0x99, 0xc5, 0x77, 0x79, 0x30,
+  0x8d, 0xb4, 0xba, 0x70, 0x47, 0x52, 0x49, 0x44, 0x2d, 0x56, 0x69, 0x72,
+  0x74, 0x75, 0x61, 0x6c, 0x2d, 0x50, 0x43, 0x2c, 0x32, 0x2e, 0x30, 0x3b,
+  0x51, 0x75, 0x61, 0x64, 0x72, 0x6f, 0x2d, 0x56, 0x69, 0x72, 0x74, 0x75,
+  0x61, 0x6c, 0x2d, 0x44, 0x57, 0x53, 0x2c, 0x35, 0x2e, 0x30, 0x3b, 0x47,
+  0x52, 0x49, 0x44, 0x2d, 0x56, 0x69, 0x72, 0x74, 0x75, 0x61, 0x6c, 0x2d,
+  0x57, 0x53, 0x2c, 0x32, 0x2e, 0x30, 0x3b, 0x47, 0x52, 0x49, 0x44, 0x2d,
+  0x56, 0x69, 0x72, 0x74, 0x75, 0x61, 0x6c, 0x2d, 0x57, 0x53, 0x2d, 0x45,
+  0x78, 0x74, 0x2c, 0x32, 0x2e, 0x30, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x20, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00,
+  0x40, 0x0b, 0x00, 0x00, 0x00, 0x00, 0xfa, 0x00, 0x2d, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x6d, 0x17, 0xb5, 0x26,
+  0x00, 0x00, 0x00, 0x00, 0xb5, 0x26, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+
+static unsigned char *vgpu_types[] = {
+	vgpu_type_869,
+};
+
+/**
+ * nvidia_vgpu_mgr_init_vgpu_types - Upload the default vGPU types.
+ * @vgpu_mgr: NVIDIA vGPU manager
+ *
+ * Returns: 0 on success, or a negative error code on failure.
+ */
+int nvidia_vgpu_mgr_init_vgpu_types(struct nvidia_vgpu_mgr *vgpu_mgr)
+{
+	NV2080_CTRL_VGPU_MGR_INTERNAL_PGPU_ADD_VGPU_TYPE_PARAMS *ctrl;
+	int i, ret;
+
+	ctrl = nvidia_vgpu_mgr_rm_ctrl_get(vgpu_mgr, &vgpu_mgr->gsp_client,
+			NV2080_CTRL_CMD_VGPU_MGR_INTERNAL_PGPU_ADD_VGPU_TYPE,
+			sizeof(*ctrl));
+	if (IS_ERR(ctrl))
+		return PTR_ERR(ctrl);
+
+	ctrl->discardVgpuTypes = true;
+	ctrl->vgpuInfoCount = ARRAY_SIZE(vgpu_types);
+
+	for (i = 0; i < ARRAY_SIZE(vgpu_types); i++)
+		memcpy(&ctrl->vgpuInfo[i], vgpu_types[i], sizeof(NVA081_CTRL_VGPU_INFO));
+
+	ret = nvidia_vgpu_mgr_rm_ctrl_wr(vgpu_mgr, &vgpu_mgr->gsp_client,
+					 ctrl);
+	if (ret)
+		return ret;
+
+	vgpu_mgr->vgpu_types = vgpu_types;
+	vgpu_mgr->num_vgpu_types = ARRAY_SIZE(vgpu_types);
+
+	return 0;
+}
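
For reference only, and not part of the diff above: the raw blob is a
serialized NVA081_CTRL_VGPU_INFO entry. Assuming that layout, its leading
fields decode as shown in the illustrative helper below (the helper name
is made up):

	/* Illustrative: view the blob through the RM control structure.
	 * The leading bytes decode as vgpuType = 869 (0x365),
	 * vgpuName = "NVIDIA L40-1B", vgpuClass = "NVS". */
	static inline const NVA081_CTRL_VGPU_INFO *vgpu_type_869_info(void)
	{
		return (const NVA081_CTRL_VGPU_INFO *)vgpu_type_869;
	}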
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 22/29] vfio/vgpu_mgr: allocate vGPU FB memory when creating vGPUs
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (20 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 21/29] vfio/vgpu_mgr: introduce vGPU type uploading Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 23/29] vfio/vgpu_mgr: allocate vGPU channels " Zhi Wang
                   ` (9 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

Creating a vGPU requires allocating a portion of the FB memory from
NVKM. The size of the FB memory that a vGPU requires is determined by
its vGPU type.

Read the required FB memory size from the vGPU type and allocate the
FB memory from NVKM when creating a vGPU.
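
The FBMEM wrappers added to nvkm.h below expand to calls on the NVKM
handle ops. A sketch of the callback shape they assume (the struct name
and exact signatures here are illustrative, not the series' actual
definition):

	struct example_fbmem_ops {
		/* @heap selects the vGPU FB heap allocator when true. */
		struct nvidia_vgpu_mem *(*alloc_fbmem)(void *pf_drvdata,
						       u64 size, bool heap);
		void (*free_fbmem)(struct nvidia_vgpu_mem *mem);
	};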

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/vfio/pci/nvidia-vgpu/nvkm.h     |  6 ++++
 drivers/vfio/pci/nvidia-vgpu/vgpu.c     | 38 +++++++++++++++++++++++++
 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h |  2 ++
 3 files changed, 46 insertions(+)

diff --git a/drivers/vfio/pci/nvidia-vgpu/nvkm.h b/drivers/vfio/pci/nvidia-vgpu/nvkm.h
index 065bb7aa55f8..d3c77d26c734 100644
--- a/drivers/vfio/pci/nvidia-vgpu/nvkm.h
+++ b/drivers/vfio/pci/nvidia-vgpu/nvkm.h
@@ -64,4 +64,10 @@ static inline int nvidia_vgpu_mgr_get_handle(struct pci_dev *pdev,
 #define nvidia_vgpu_mgr_rm_ctrl_done(m, g, c) \
 	m->handle.ops->rm_ctrl_done(g, c)
 
+#define nvidia_vgpu_mgr_alloc_fbmem_heap(m, s) \
+	m->handle.ops->alloc_fbmem(m->handle.pf_drvdata, s, true)
+
+#define nvidia_vgpu_mgr_free_fbmem_heap(m, h) \
+	m->handle.ops->free_fbmem(h)
+
 #endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu.c b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
index 34f6adb9dfe4..54e27823820e 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu.c
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
@@ -3,6 +3,11 @@
  * Copyright © 2024 NVIDIA Corporation
  */
 
+#include <linux/kernel.h>
+
+#include <nvrm/nvtypes.h>
+#include <nvrm/common/sdk/nvidia/inc/ctrl/ctrla081.h>
+
 #include "vgpu_mgr.h"
 
 static void unregister_vgpu(struct nvidia_vgpu *vgpu)
@@ -34,6 +39,29 @@ static int register_vgpu(struct nvidia_vgpu *vgpu)
 	return 0;
 }
 
+static void clean_fbmem_heap(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr = vgpu->vgpu_mgr;
+
+	nvidia_vgpu_mgr_free_fbmem_heap(vgpu_mgr, vgpu->fbmem_heap);
+	vgpu->fbmem_heap = NULL;
+}
+
+static int setup_fbmem_heap(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr = vgpu->vgpu_mgr;
+	NVA081_CTRL_VGPU_INFO *info =
+		(NVA081_CTRL_VGPU_INFO *)vgpu->vgpu_type;
+	struct nvidia_vgpu_mem *mem;
+
+	mem = nvidia_vgpu_mgr_alloc_fbmem_heap(vgpu_mgr, info->fbLength);
+	if (IS_ERR(mem))
+		return PTR_ERR(mem);
+
+	vgpu->fbmem_heap = mem;
+	return 0;
+}
+
 /**
  * nvidia_vgpu_mgr_destroy_vgpu - destroy a vGPU instance
  * @vgpu: the vGPU instance going to be destroyed.
@@ -45,6 +73,7 @@ int nvidia_vgpu_mgr_destroy_vgpu(struct nvidia_vgpu *vgpu)
 	if (!atomic_cmpxchg(&vgpu->status, 1, 0))
 		return -ENODEV;
 
+	clean_fbmem_heap(vgpu);
 	unregister_vgpu(vgpu);
 	return 0;
 }
@@ -76,8 +105,17 @@ int nvidia_vgpu_mgr_create_vgpu(struct nvidia_vgpu *vgpu, u8 *vgpu_type)
 	if (ret)
 		return ret;
 
+	ret = setup_fbmem_heap(vgpu);
+	if (ret)
+		goto err_setup_fbmem_heap;
+
 	atomic_set(&vgpu->status, 1);
 
 	return 0;
+
+err_setup_fbmem_heap:
+	unregister_vgpu(vgpu);
+
+	return ret;
 }
 EXPORT_SYMBOL(nvidia_vgpu_mgr_create_vgpu);
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
index eb2df9f0fe07..35312d814996 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
@@ -23,6 +23,8 @@ struct nvidia_vgpu {
 	u8 *vgpu_type;
 	struct nvidia_vgpu_info info;
 	struct nvidia_vgpu_mgr *vgpu_mgr;
+
+	struct nvidia_vgpu_mem *fbmem_heap;
 };
 
 struct nvidia_vgpu_mgr {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 23/29] vfio/vgpu_mgr: allocate vGPU channels when creating vGPUs
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (21 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 22/29] vfio/vgpu_mgr: allocate vGPU FB memory when creating vGPUs Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 24/29] vfio/vgpu_mgr: allocate mgmt heap " Zhi Wang
                   ` (8 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

Creating a vGPU requires allocating a portion of the channels from the
reserved channel pool.

Allocate the channels from the reserved channel pool when creating a vGPU.
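
For illustration, the contract assumed by the channel wrappers added to
nvkm.h below: alloc_chids() returns the base channel ID on success or a
negative errno. A minimal sketch that mirrors setup_chids() with the
channel count as a parameter (the helper name is made up):

	static int example_reserve_chids(struct nvidia_vgpu *vgpu, int count)
	{
		struct nvidia_vgpu_chid *chid = &vgpu->chid;
		int base;

		base = nvidia_vgpu_mgr_alloc_chids(vgpu->vgpu_mgr, count);
		if (base < 0)
			return base;

		chid->chid_offset = base;
		chid->num_chid = count;
		chid->num_plugin_channels = 0;
		return 0;
	}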

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/vfio/pci/nvidia-vgpu/nvkm.h     |  6 +++++
 drivers/vfio/pci/nvidia-vgpu/vgpu.c     | 32 +++++++++++++++++++++++++
 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h |  7 ++++++
 3 files changed, 45 insertions(+)

diff --git a/drivers/vfio/pci/nvidia-vgpu/nvkm.h b/drivers/vfio/pci/nvidia-vgpu/nvkm.h
index d3c77d26c734..b95b48edeb03 100644
--- a/drivers/vfio/pci/nvidia-vgpu/nvkm.h
+++ b/drivers/vfio/pci/nvidia-vgpu/nvkm.h
@@ -70,4 +70,10 @@ static inline int nvidia_vgpu_mgr_get_handle(struct pci_dev *pdev,
 #define nvidia_vgpu_mgr_free_fbmem_heap(m, h) \
 	m->handle.ops->free_fbmem(h)
 
+#define nvidia_vgpu_mgr_alloc_chids(m, s) \
+	m->handle.ops->alloc_chids(m->handle.pf_drvdata, s)
+
+#define nvidia_vgpu_mgr_free_chids(m, o, s) \
+	m->handle.ops->free_chids(m->handle.pf_drvdata, o, s)
+
 #endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu.c b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
index 54e27823820e..c48c22d8fbb4 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu.c
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
@@ -62,6 +62,31 @@ static int setup_fbmem_heap(struct nvidia_vgpu *vgpu)
 	return 0;
 }
 
+static void clean_chids(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr = vgpu->vgpu_mgr;
+	struct nvidia_vgpu_chid *chid = &vgpu->chid;
+
+	nvidia_vgpu_mgr_free_chids(vgpu_mgr, chid->chid_offset, chid->num_chid);
+}
+
+static int setup_chids(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr = vgpu->vgpu_mgr;
+	struct nvidia_vgpu_chid *chid = &vgpu->chid;
+	int ret;
+
+	ret = nvidia_vgpu_mgr_alloc_chids(vgpu_mgr, 512);
+	if (ret < 0)
+		return ret;
+
+	chid->chid_offset = ret;
+	chid->num_chid = 512;
+	chid->num_plugin_channels = 0;
+
+	return 0;
+}
+
 /**
  * nvidia_vgpu_mgr_destroy_vgpu - destroy a vGPU instance
  * @vgpu: the vGPU instance going to be destroyed.
@@ -73,6 +98,7 @@ int nvidia_vgpu_mgr_destroy_vgpu(struct nvidia_vgpu *vgpu)
 	if (!atomic_cmpxchg(&vgpu->status, 1, 0))
 		return -ENODEV;
 
+	clean_chids(vgpu);
 	clean_fbmem_heap(vgpu);
 	unregister_vgpu(vgpu);
 	return 0;
@@ -109,10 +135,16 @@ int nvidia_vgpu_mgr_create_vgpu(struct nvidia_vgpu *vgpu, u8 *vgpu_type)
 	if (ret)
 		goto err_setup_fbmem_heap;
 
+	ret = setup_chids(vgpu);
+	if (ret)
+		goto err_setup_chids;
+
 	atomic_set(&vgpu->status, 1);
 
 	return 0;
 
+err_setup_chids:
+	clean_fbmem_heap(vgpu);
 err_setup_fbmem_heap:
 	unregister_vgpu(vgpu);
 
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
index 35312d814996..0918823fdde7 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
@@ -15,6 +15,12 @@ struct nvidia_vgpu_info {
 	u32 dbdf;
 };
 
+struct nvidia_vgpu_chid {
+	int chid_offset;
+	int num_chid;
+	int num_plugin_channels;
+};
+
 struct nvidia_vgpu {
 	struct mutex lock;
 	atomic_t status;
@@ -25,6 +31,7 @@ struct nvidia_vgpu {
 	struct nvidia_vgpu_mgr *vgpu_mgr;
 
 	struct nvidia_vgpu_mem *fbmem_heap;
+	struct nvidia_vgpu_chid chid;
 };
 
 struct nvidia_vgpu_mgr {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 24/29] vfio/vgpu_mgr: allocate mgmt heap when creating vGPUs
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (22 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 23/29] vfio/vgpu_mgr: allocate vGPU channels " Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 25/29] vfio/vgpu_mgr: map mgmt heap when creating a vGPU Zhi Wang
                   ` (7 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

The mgmt heap is a block of FBMEM shared between the GSP firmware and
the vGPU host. It is used to support vGPU RPCs and vGPU logging.

Creating a vGPU requires allocating a mgmt heap from the FBMEM. The size
of the mgmt heap that a vGPU requires is determined by its vGPU type.

Read the mgmt heap size from the vGPU type and allocate the mgmt heap
from NVKM when creating a vGPU.
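
Note that the wrapper added here and the one used for the vGPU FB heap
in the previous patch call the same NVKM op; only the final flag of the
underlying callback differs. A minimal sketch of the mgmt heap
allocation, assuming the NVA081_CTRL_VGPU_INFO layout of the vGPU type
blob (the helper name is made up):

	static struct nvidia_vgpu_mem *
	example_alloc_mgmt_heap(struct nvidia_vgpu *vgpu)
	{
		NVA081_CTRL_VGPU_INFO *info =
			(NVA081_CTRL_VGPU_INFO *)vgpu->vgpu_type;

		/* gspHeapSize is a per-type property, like fbLength. */
		return nvidia_vgpu_mgr_alloc_fbmem(vgpu->vgpu_mgr,
						   info->gspHeapSize);
	}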

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/vfio/pci/nvidia-vgpu/nvkm.h     |  6 +++++
 drivers/vfio/pci/nvidia-vgpu/vgpu.c     | 32 +++++++++++++++++++++++++
 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h |  6 +++++
 3 files changed, 44 insertions(+)

diff --git a/drivers/vfio/pci/nvidia-vgpu/nvkm.h b/drivers/vfio/pci/nvidia-vgpu/nvkm.h
index b95b48edeb03..50b860e7967d 100644
--- a/drivers/vfio/pci/nvidia-vgpu/nvkm.h
+++ b/drivers/vfio/pci/nvidia-vgpu/nvkm.h
@@ -76,4 +76,10 @@ static inline int nvidia_vgpu_mgr_get_handle(struct pci_dev *pdev,
 #define nvidia_vgpu_mgr_free_chids(m, o, s) \
 	m->handle.ops->free_chids(m->handle.pf_drvdata, o, s)
 
+#define nvidia_vgpu_mgr_alloc_fbmem(m, s) \
+	m->handle.ops->alloc_fbmem(m->handle.pf_drvdata, s, false)
+
+#define nvidia_vgpu_mgr_free_fbmem(m, h) \
+	m->handle.ops->free_fbmem(h)
+
 #endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu.c b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
index c48c22d8fbb4..4b04b13944d5 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu.c
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
@@ -87,6 +87,31 @@ static int setup_chids(struct nvidia_vgpu *vgpu)
 	return 0;
 }
 
+static void clean_mgmt_heap(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr = vgpu->vgpu_mgr;
+	struct nvidia_vgpu_mgmt *mgmt = &vgpu->mgmt;
+
+	nvidia_vgpu_mgr_free_fbmem(vgpu_mgr, mgmt->heap_mem);
+	mgmt->heap_mem = NULL;
+}
+
+static int setup_mgmt_heap(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr = vgpu->vgpu_mgr;
+	struct nvidia_vgpu_mgmt *mgmt = &vgpu->mgmt;
+	NVA081_CTRL_VGPU_INFO *info =
+		(NVA081_CTRL_VGPU_INFO *)vgpu->vgpu_type;
+	struct nvidia_vgpu_mem *mem;
+
+	mem = nvidia_vgpu_mgr_alloc_fbmem(vgpu_mgr, info->gspHeapSize);
+	if (IS_ERR(mem))
+		return PTR_ERR(mem);
+
+	mgmt->heap_mem = mem;
+	return 0;
+}
+
 /**
  * nvidia_vgpu_mgr_destroy_vgpu - destroy a vGPU instance
  * @vgpu: the vGPU instance going to be destroyed.
@@ -98,6 +123,7 @@ int nvidia_vgpu_mgr_destroy_vgpu(struct nvidia_vgpu *vgpu)
 	if (!atomic_cmpxchg(&vgpu->status, 1, 0))
 		return -ENODEV;
 
+	clean_mgmt_heap(vgpu);
 	clean_chids(vgpu);
 	clean_fbmem_heap(vgpu);
 	unregister_vgpu(vgpu);
@@ -139,10 +165,16 @@ int nvidia_vgpu_mgr_create_vgpu(struct nvidia_vgpu *vgpu, u8 *vgpu_type)
 	if (ret)
 		goto err_setup_chids;
 
+	ret = setup_mgmt_heap(vgpu);
+	if (ret)
+		goto err_setup_mgmt_heap;
+
 	atomic_set(&vgpu->status, 1);
 
 	return 0;
 
+err_setup_mgmt_heap:
+	clean_chids(vgpu);
 err_setup_chids:
 	clean_fbmem_heap(vgpu);
 err_setup_fbmem_heap:
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
index 0918823fdde7..f4ebeadb2b86 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
@@ -21,6 +21,11 @@ struct nvidia_vgpu_chid {
 	int num_plugin_channels;
 };
 
+struct nvidia_vgpu_mgmt {
+	struct nvidia_vgpu_mem *heap_mem;
+	/* more to come */
+};
+
 struct nvidia_vgpu {
 	struct mutex lock;
 	atomic_t status;
@@ -32,6 +37,7 @@ struct nvidia_vgpu {
 
 	struct nvidia_vgpu_mem *fbmem_heap;
 	struct nvidia_vgpu_chid chid;
+	struct nvidia_vgpu_mgmt mgmt;
 };
 
 struct nvidia_vgpu_mgr {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 25/29] vfio/vgpu_mgr: map mgmt heap when creating a vGPU
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (23 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 24/29] vfio/vgpu_mgr: allocate mgmt heap " Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 26/29] vfio/vgpu_mgr: allocate GSP RM client when creating vGPUs Zhi Wang
                   ` (6 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

The mgmt heap is a block of FB memory shared between the GSP firmware
and the vGPU host. It is used to support vGPU RPCs and vGPU logging.

To access the vGPU RPC and vGPU logging data structures, the mgmt heap
FB memory needs to be mapped into BAR1, and that BAR1 region must then
be mapped into a CPU virtual address.

Map the mgmt heap FB memory into BAR1 and map the related BAR1 region
into a CPU virtual address. Initialize the pointers into the mgmt heap
FB memory.
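
The offsets used below imply the following layout of the BAR1-mapped
mgmt heap. This is a sketch derived from init_task_log_buff_offset()
and the log buffer sizes; the contents of the leading control region
are not spelled out in this patch:

	/*
	 * Mgmt heap as seen through BAR1 (offsets from mgmt->ctrl_vaddr):
	 *
	 * 0x000000 +------------------------------+
	 *          | control area                 | 3 * SZ_4K + SZ_2M + SZ_4K
	 * 0x204000 +------------------------------+
	 *          | init task log buffer         | SZ_128K
	 * 0x224000 +------------------------------+
	 *          | vGPU task log buffer         | SZ_128K
	 * 0x244000 +------------------------------+
	 */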

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/vfio/pci/nvidia-vgpu/nvkm.h     |  6 +++++
 drivers/vfio/pci/nvidia-vgpu/vgpu.c     | 29 +++++++++++++++++++++++++
 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h |  4 +++-
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/nvidia-vgpu/nvkm.h b/drivers/vfio/pci/nvidia-vgpu/nvkm.h
index 50b860e7967d..8ad2241f7c5e 100644
--- a/drivers/vfio/pci/nvidia-vgpu/nvkm.h
+++ b/drivers/vfio/pci/nvidia-vgpu/nvkm.h
@@ -82,4 +82,10 @@ static inline int nvidia_vgpu_mgr_get_handle(struct pci_dev *pdev,
 #define nvidia_vgpu_mgr_free_fbmem(m, h) \
 	m->handle.ops->free_fbmem(h)
 
+#define nvidia_vgpu_mgr_bar1_map_mem(m, mem) \
+	m->handle.ops->bar1_map_mem(mem)
+
+#define nvidia_vgpu_mgr_bar1_unmap_mem(m, mem) \
+	m->handle.ops->bar1_unmap_mem(mem)
+
 #endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu.c b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
index 4b04b13944d5..de7857fe8af2 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu.c
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
@@ -87,12 +87,29 @@ static int setup_chids(struct nvidia_vgpu *vgpu)
 	return 0;
 }
 
+static inline u64 init_task_log_buff_offset(void)
+{
+	return (3 * SZ_4K) + SZ_2M + SZ_4K;
+}
+
+static inline u64 init_task_log_buff_size(void)
+{
+	return SZ_128K;
+}
+
+static inline u64 vgpu_task_log_buff_size(void)
+{
+	return SZ_128K;
+}
+
 static void clean_mgmt_heap(struct nvidia_vgpu *vgpu)
 {
 	struct nvidia_vgpu_mgr *vgpu_mgr = vgpu->vgpu_mgr;
 	struct nvidia_vgpu_mgmt *mgmt = &vgpu->mgmt;
 
+	nvidia_vgpu_mgr_bar1_unmap_mem(vgpu_mgr, mgmt->heap_mem);
 	nvidia_vgpu_mgr_free_fbmem(vgpu_mgr, mgmt->heap_mem);
+	mgmt->init_task_log_vaddr = mgmt->vgpu_task_log_vaddr = NULL;
 	mgmt->heap_mem = NULL;
 }
 
@@ -103,11 +120,23 @@ static int setup_mgmt_heap(struct nvidia_vgpu *vgpu)
 	NVA081_CTRL_VGPU_INFO *info =
 		(NVA081_CTRL_VGPU_INFO *)vgpu->vgpu_type;
 	struct nvidia_vgpu_mem *mem;
+	int ret;
 
 	mem = nvidia_vgpu_mgr_alloc_fbmem(vgpu_mgr, info->gspHeapSize);
 	if (IS_ERR(mem))
 		return PTR_ERR(mem);
 
+	ret = nvidia_vgpu_mgr_bar1_map_mem(vgpu_mgr, mem);
+	if (ret) {
+		nvidia_vgpu_mgr_free_fbmem(vgpu_mgr, mem);
+		return ret;
+	}
+
+	mgmt->ctrl_vaddr = mem->bar1_vaddr;
+	mgmt->init_task_log_vaddr = mgmt->ctrl_vaddr +
+				    init_task_log_buff_offset();
+	mgmt->vgpu_task_log_vaddr = mgmt->init_task_log_vaddr +
+				    init_task_log_buff_size();
 	mgmt->heap_mem = mem;
 	return 0;
 }
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
index f4ebeadb2b86..404fc67a0c0a 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
@@ -23,7 +23,9 @@ struct nvidia_vgpu_chid {
 
 struct nvidia_vgpu_mgmt {
 	struct nvidia_vgpu_mem *heap_mem;
-	/* more to come */
+	void __iomem *ctrl_vaddr;
+	void __iomem *init_task_log_vaddr;
+	void __iomem *vgpu_task_log_vaddr;
 };
 
 struct nvidia_vgpu {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 26/29] vfio/vgpu_mgr: allocate GSP RM client when creating vGPUs
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (24 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 25/29] vfio/vgpu_mgr: map mgmt heap when creating a vGPU Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 27/29] vfio/vgpu_mgr: bootload the new vGPU Zhi Wang
                   ` (5 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

A GSP RM client is required when talking to the GSP firmware via GSP RM
controls.

So far, all the vGPU GSP RPCs have been sent via the GSP RM client
allocated for the vGPU manager, but some vGPU GSP RPCs need a per-vGPU
GSP RM client.

Allocate a dedicated GSP RM client for each vGPU.
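
As a rough sketch (helper names as introduced earlier in the series),
the per-vGPU client simply brackets the vGPU lifetime:

  /* On create: per-vGPU controls are issued against this client. */
  ret = nvidia_vgpu_mgr_alloc_gsp_client(vgpu_mgr, &vgpu->gsp_client);
  if (ret)
          goto err_alloc_gsp_client;

  /* On destroy: release it before tearing down the mgmt heap. */
  nvidia_vgpu_mgr_free_gsp_client(vgpu_mgr, &vgpu->gsp_client);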

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/vfio/pci/nvidia-vgpu/vgpu.c     | 11 +++++++++++
 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h |  1 +
 2 files changed, 12 insertions(+)

diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu.c b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
index de7857fe8af2..124a1a4593ae 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu.c
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
@@ -149,9 +149,12 @@ static int setup_mgmt_heap(struct nvidia_vgpu *vgpu)
  */
 int nvidia_vgpu_mgr_destroy_vgpu(struct nvidia_vgpu *vgpu)
 {
+	struct nvidia_vgpu_mgr *vgpu_mgr = vgpu->vgpu_mgr;
+
 	if (!atomic_cmpxchg(&vgpu->status, 1, 0))
 		return -ENODEV;
 
+	nvidia_vgpu_mgr_free_gsp_client(vgpu_mgr, &vgpu->gsp_client);
 	clean_mgmt_heap(vgpu);
 	clean_chids(vgpu);
 	clean_fbmem_heap(vgpu);
@@ -171,6 +174,7 @@ EXPORT_SYMBOL(nvidia_vgpu_mgr_destroy_vgpu);
  */
 int nvidia_vgpu_mgr_create_vgpu(struct nvidia_vgpu *vgpu, u8 *vgpu_type)
 {
+	struct nvidia_vgpu_mgr *vgpu_mgr = vgpu->vgpu_mgr;
 	int ret;
 
 	if (WARN_ON(vgpu->info.id >= NVIDIA_MAX_VGPUS))
@@ -198,10 +202,17 @@ int nvidia_vgpu_mgr_create_vgpu(struct nvidia_vgpu *vgpu, u8 *vgpu_type)
 	if (ret)
 		goto err_setup_mgmt_heap;
 
+	ret = nvidia_vgpu_mgr_alloc_gsp_client(vgpu_mgr,
+			&vgpu->gsp_client);
+	if (ret)
+		goto err_alloc_gsp_client;
+
 	atomic_set(&vgpu->status, 1);
 
 	return 0;
 
+err_alloc_gsp_client:
+	clean_mgmt_heap(vgpu);
 err_setup_mgmt_heap:
 	clean_chids(vgpu);
 err_setup_chids:
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
index 404fc67a0c0a..6f05b285484c 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
@@ -36,6 +36,7 @@ struct nvidia_vgpu {
 	u8 *vgpu_type;
 	struct nvidia_vgpu_info info;
 	struct nvidia_vgpu_mgr *vgpu_mgr;
+	struct nvidia_vgpu_gsp_client gsp_client;
 
 	struct nvidia_vgpu_mem *fbmem_heap;
 	struct nvidia_vgpu_chid chid;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 27/29] vfio/vgpu_mgr: bootload the new vGPU
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (25 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 26/29] vfio/vgpu_mgr: allocate GSP RM client when creating vGPUs Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-25  0:31   ` Dave Airlie
  2024-09-22 12:49 ` [RFC 28/29] vfio/vgpu_mgr: introduce vGPU host RPC channel Zhi Wang
                   ` (4 subsequent siblings)
  31 siblings, 1 reply; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

All the resources required by a new vGPU have been set up. It is time
to activate it.

Send the NV2080_CTRL_CMD_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK
GSP RPC to activate the new vGPU.
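
For orientation, a condensed sketch of the key fields carried by this
control (taken from the patch below; the single guest FB segment and the
mgmt heap come from the earlier patches in this series):

  ctrl->gfid                     = vgpu->info.gfid;
  ctrl->numChannels              = vgpu->chid.num_chid;
  ctrl->numGuestFbSegments       = 1;
  ctrl->guestFbPhysAddrList[0]   = vgpu->fbmem_heap->addr;
  ctrl->guestFbLengthList[0]     = vgpu->fbmem_heap->size;
  ctrl->pluginHeapMemoryPhysAddr = mgmt->heap_mem->addr;
  ctrl->pluginHeapMemoryLength   = mgmt->heap_mem->size;
  ctrl->initTaskLogBuffOffset    = mgmt->heap_mem->addr +
                                   init_task_log_buff_offset();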

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 .../ctrl/ctrl2080/ctrl2080vgpumgrinternal.h   | 90 ++++++++++++++++++
 drivers/vfio/pci/nvidia-vgpu/nvkm.h           |  3 +
 drivers/vfio/pci/nvidia-vgpu/vgpu.c           | 94 +++++++++++++++++++
 3 files changed, 187 insertions(+)

diff --git a/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h
index f44cdc733229..58c6bff72f44 100644
--- a/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h
+++ b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h
@@ -59,4 +59,94 @@ typedef struct NV2080_CTRL_VGPU_MGR_INTERNAL_PGPU_ADD_VGPU_TYPE_PARAMS {
 	NV_DECLARE_ALIGNED(NVA081_CTRL_VGPU_INFO vgpuInfo[NVA081_MAX_VGPU_TYPES_PER_PGPU], 8);
 } NV2080_CTRL_VGPU_MGR_INTERNAL_PGPU_ADD_VGPU_TYPE_PARAMS;
 
+/*
+ * NV2080_CTRL_CMD_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK
+ *
+ * This command is used to bootload GSP VGPU plugin task.
+ * Can be called only with SR-IOV and with VGPU_GSP_PLUGIN_OFFLOAD feature.
+ *
+ * dbdf                        - domain (31:16), bus (15:8), device (7:3), function (2:0)
+ * gfid                        - Gfid
+ * vgpuType                    - The Type ID for VGPU profile
+ * vmPid                       - Plugin process ID of vGPU guest instance
+ * swizzId                     - SwizzId
+ * numChannels                 - Number of channels
+ * numPluginChannels           - Number of plugin channels
+ * bDisableSmcPartitionRestore - If set to true, SMC default execution partition
+ *                               save/restore will not be done in host-RM
+ * guestFbPhysAddrList         - list of VMMU segment aligned physical address of guest FB memory
+ * guestFbLengthList           - list of guest FB memory length in bytes
+ * pluginHeapMemoryPhysAddr    - plugin heap memory offset
+ * pluginHeapMemoryLength      - plugin heap memory length in bytes
+ * bDeviceProfilingEnabled     - If set to true, profiling is allowed
+ */
+#define NV2080_CTRL_CMD_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK (0x20804001) /* finn: Evaluated from "(FINN_NV20_SUBDEVICE_0_VGPU_MGR_INTERNAL_INTERFACE_ID << 8) | NV2080_CTRL_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK_PARAMS_MESSAGE_ID" */
+
+#define NV2080_CTRL_MAX_VMMU_SEGMENTS                                   384
+
+/* Must match NV2080_ENGINE_TYPE_LAST from cl2080.h */
+#define NV2080_GPU_MAX_ENGINES                                          0x3e
+
+#define NV2080_CTRL_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK_PARAMS_MESSAGE_ID (0x1U)
+
+typedef struct NV2080_CTRL_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK_PARAMS {
+	NvU32  dbdf;
+	NvU32  gfid;
+	NvU32  vgpuType;
+	NvU32  vmPid;
+	NvU32  swizzId;
+	NvU32  numChannels;
+	NvU32  numPluginChannels;
+	NvU32  chidOffset[NV2080_GPU_MAX_ENGINES];
+	NvBool bDisableDefaultSmcExecPartRestore;
+	NvU32  numGuestFbSegments;
+	NV_DECLARE_ALIGNED(NvU64 guestFbPhysAddrList[NV2080_CTRL_MAX_VMMU_SEGMENTS], 8);
+	NV_DECLARE_ALIGNED(NvU64 guestFbLengthList[NV2080_CTRL_MAX_VMMU_SEGMENTS], 8);
+	NV_DECLARE_ALIGNED(NvU64 pluginHeapMemoryPhysAddr, 8);
+	NV_DECLARE_ALIGNED(NvU64 pluginHeapMemoryLength, 8);
+	NV_DECLARE_ALIGNED(NvU64 ctrlBuffOffset, 8);
+	NV_DECLARE_ALIGNED(NvU64 initTaskLogBuffOffset, 8);
+	NV_DECLARE_ALIGNED(NvU64 initTaskLogBuffSize, 8);
+	NV_DECLARE_ALIGNED(NvU64 vgpuTaskLogBuffOffset, 8);
+	NV_DECLARE_ALIGNED(NvU64 vgpuTaskLogBuffSize, 8);
+	NvBool bDeviceProfilingEnabled;
+} NV2080_CTRL_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK_PARAMS;
+
+/*
+ * NV2080_CTRL_CMD_VGPU_MGR_INTERNAL_SHUTDOWN_GSP_VGPU_PLUGIN_TASK
+ *
+ * This command is used to shutdown GSP VGPU plugin task.
+ * Can be called only with SR-IOV and with VGPU_GSP_PLUGIN_OFFLOAD feature.
+ *
+ * gfid                        - Gfid
+ */
+#define NV2080_CTRL_CMD_VGPU_MGR_INTERNAL_SHUTDOWN_GSP_VGPU_PLUGIN_TASK (0x20804002) /* finn: Evaluated from "(FINN_NV20_SUBDEVICE_0_VGPU_MGR_INTERNAL_INTERFACE_ID << 8) | NV2080_CTRL_VGPU_MGR_INTERNAL_SHUTDOWN_GSP_VGPU_PLUGIN_TASK_PARAMS_MESSAGE_ID" */
+
+#define NV2080_CTRL_VGPU_MGR_INTERNAL_SHUTDOWN_GSP_VGPU_PLUGIN_TASK_PARAMS_MESSAGE_ID (0x2U)
+
+typedef struct NV2080_CTRL_VGPU_MGR_INTERNAL_SHUTDOWN_GSP_VGPU_PLUGIN_TASK_PARAMS {
+	NvU32 gfid;
+} NV2080_CTRL_VGPU_MGR_INTERNAL_SHUTDOWN_GSP_VGPU_PLUGIN_TASK_PARAMS;
+
+/*
+ * NV2080_CTRL_CMD_VGPU_MGR_INTERNAL_VGPU_PLUGIN_CLEANUP
+ *
+ * This command is used to cleanup all the GSP VGPU plugin task allocated resources after its shutdown.
+ * Can be called only with SR-IOV and with VGPU_GSP_PLUGIN_OFFLOAD feature.
+ *
+ * gfid [IN]
+ *  This parameter specifies the gfid of vGPU assigned to VM.
+ *
+ * Possible status values returned are:
+ *   NV_OK
+ *   NV_ERR_NOT_SUPPORTED
+ */
+#define NV2080_CTRL_CMD_VGPU_MGR_INTERNAL_VGPU_PLUGIN_CLEANUP (0x20804008) /* finn: Evaluated from "(FINN_NV20_SUBDEVICE_0_VGPU_MGR_INTERNAL_INTERFACE_ID << 8) | NV2080_CTRL_VGPU_MGR_INTERNAL_VGPU_PLUGIN_CLEANUP_PARAMS_MESSAGE_ID" */
+
+#define NV2080_CTRL_VGPU_MGR_INTERNAL_VGPU_PLUGIN_CLEANUP_PARAMS_MESSAGE_ID (0x8U)
+
+typedef struct NV2080_CTRL_VGPU_MGR_INTERNAL_VGPU_PLUGIN_CLEANUP_PARAMS {
+	NvU32 gfid;
+} NV2080_CTRL_VGPU_MGR_INTERNAL_VGPU_PLUGIN_CLEANUP_PARAMS;
+
 #endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/nvkm.h b/drivers/vfio/pci/nvidia-vgpu/nvkm.h
index 8ad2241f7c5e..8e07422f99e5 100644
--- a/drivers/vfio/pci/nvidia-vgpu/nvkm.h
+++ b/drivers/vfio/pci/nvidia-vgpu/nvkm.h
@@ -88,4 +88,7 @@ static inline int nvidia_vgpu_mgr_get_handle(struct pci_dev *pdev,
 #define nvidia_vgpu_mgr_bar1_unmap_mem(m, mem) \
 	m->handle.ops->bar1_unmap_mem(mem)
 
+#define nvidia_vgpu_mgr_get_engine_bitmap(m, b) \
+	m->handle.ops->get_engine_bitmap(m->handle.pf_drvdata, b)
+
 #endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu.c b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
index 124a1a4593ae..e06d5155bb38 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu.c
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
@@ -7,6 +7,7 @@
 
 #include <nvrm/nvtypes.h>
 #include <nvrm/common/sdk/nvidia/inc/ctrl/ctrla081.h>
+#include <nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h>
 
 #include "vgpu_mgr.h"
 
@@ -141,6 +142,91 @@ static int setup_mgmt_heap(struct nvidia_vgpu *vgpu)
 	return 0;
 }
 
+static int shutdown_vgpu_plugin_task(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr = vgpu->vgpu_mgr;
+	NV2080_CTRL_VGPU_MGR_INTERNAL_SHUTDOWN_GSP_VGPU_PLUGIN_TASK_PARAMS *ctrl;
+
+	ctrl = nvidia_vgpu_mgr_rm_ctrl_get(vgpu_mgr, &vgpu->gsp_client,
+			NV2080_CTRL_CMD_VGPU_MGR_INTERNAL_SHUTDOWN_GSP_VGPU_PLUGIN_TASK,
+			sizeof(*ctrl));
+	if (IS_ERR(ctrl))
+		return PTR_ERR(ctrl);
+
+	ctrl->gfid = vgpu->info.gfid;
+
+	return nvidia_vgpu_mgr_rm_ctrl_wr(vgpu_mgr, &vgpu->gsp_client,
+					  ctrl);
+}
+
+static int cleanup_vgpu_plugin_task(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr = vgpu->vgpu_mgr;
+	NV2080_CTRL_VGPU_MGR_INTERNAL_VGPU_PLUGIN_CLEANUP_PARAMS *ctrl;
+
+	ctrl = nvidia_vgpu_mgr_rm_ctrl_get(vgpu_mgr, &vgpu->gsp_client,
+			NV2080_CTRL_CMD_VGPU_MGR_INTERNAL_VGPU_PLUGIN_CLEANUP,
+			sizeof(*ctrl));
+	if (IS_ERR(ctrl))
+		return PTR_ERR(ctrl);
+
+	ctrl->gfid = vgpu->info.gfid;
+
+	return nvidia_vgpu_mgr_rm_ctrl_wr(vgpu_mgr, &vgpu->gsp_client,
+					  ctrl);
+}
+
+static int bootload_vgpu_plugin_task(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr = vgpu->vgpu_mgr;
+	struct nvidia_vgpu_mgmt *mgmt = &vgpu->mgmt;
+	NV2080_CTRL_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK_PARAMS *ctrl;
+	DECLARE_BITMAP(engine_bitmap, NV2080_GPU_MAX_ENGINES);
+	int ret, i;
+
+	ctrl = nvidia_vgpu_mgr_rm_ctrl_get(vgpu_mgr, &vgpu->gsp_client,
+			NV2080_CTRL_CMD_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK,
+			sizeof(*ctrl));
+	if (IS_ERR(ctrl))
+		return PTR_ERR(ctrl);
+
+	ctrl->dbdf = vgpu->info.dbdf;
+	ctrl->gfid = vgpu->info.gfid;
+	ctrl->vmPid = 0;
+	ctrl->swizzId = 0;
+	ctrl->numChannels = vgpu->chid.num_chid;
+	ctrl->numPluginChannels = 0;
+
+	bitmap_clear(engine_bitmap, 0, NV2080_GPU_MAX_ENGINES);
+
+	/* FIXME: nvkm does not seem to record the engines correctly; two engines are missing. */
+	nvidia_vgpu_mgr_get_engine_bitmap(vgpu_mgr, engine_bitmap);
+
+	for_each_set_bit(i, engine_bitmap, NV2080_GPU_MAX_ENGINES)
+		ctrl->chidOffset[i] = vgpu->chid.chid_offset;
+
+	ctrl->bDisableDefaultSmcExecPartRestore = false;
+	ctrl->numGuestFbSegments = 1;
+	ctrl->guestFbPhysAddrList[0] = vgpu->fbmem_heap->addr;
+	ctrl->guestFbLengthList[0] = vgpu->fbmem_heap->size;
+	ctrl->pluginHeapMemoryPhysAddr = mgmt->heap_mem->addr;
+	ctrl->pluginHeapMemoryLength = mgmt->heap_mem->size;
+	ctrl->ctrlBuffOffset = 0;
+	ctrl->initTaskLogBuffOffset = mgmt->heap_mem->addr +
+				      init_task_log_buff_offset();
+	ctrl->initTaskLogBuffSize = init_task_log_buff_size();
+	ctrl->vgpuTaskLogBuffOffset = ctrl->initTaskLogBuffOffset +
+				      ctrl->initTaskLogBuffSize;
+	ctrl->vgpuTaskLogBuffSize = vgpu_task_log_buff_size();
+	ctrl->bDeviceProfilingEnabled = false;
+
+	ret = nvidia_vgpu_mgr_rm_ctrl_wr(vgpu_mgr, &vgpu->gsp_client,
+					 ctrl);
+	if (ret)
+		return ret;
+	return 0;
+}
+
 /**
  * nvidia_vgpu_mgr_destroy_vgpu - destroy a vGPU instance
  * @vgpu: the vGPU instance going to be destroyed.
@@ -154,6 +240,8 @@ int nvidia_vgpu_mgr_destroy_vgpu(struct nvidia_vgpu *vgpu)
 	if (!atomic_cmpxchg(&vgpu->status, 1, 0))
 		return -ENODEV;
 
+	WARN_ON(shutdown_vgpu_plugin_task(vgpu));
+	WARN_ON(cleanup_vgpu_plugin_task(vgpu));
 	nvidia_vgpu_mgr_free_gsp_client(vgpu_mgr, &vgpu->gsp_client);
 	clean_mgmt_heap(vgpu);
 	clean_chids(vgpu);
@@ -207,10 +295,16 @@ int nvidia_vgpu_mgr_create_vgpu(struct nvidia_vgpu *vgpu, u8 *vgpu_type)
 	if (ret)
 		goto err_alloc_gsp_client;
 
+	ret = bootload_vgpu_plugin_task(vgpu);
+	if (ret)
+		goto err_bootload_vgpu_plugin_task;
+
 	atomic_set(&vgpu->status, 1);
 
 	return 0;
 
+err_bootload_vgpu_plugin_task:
+	nvidia_vgpu_mgr_free_gsp_client(vgpu_mgr, &vgpu->gsp_client);
 err_alloc_gsp_client:
 	clean_mgmt_heap(vgpu);
 err_setup_mgmt_heap:
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 28/29] vfio/vgpu_mgr: introduce vGPU host RPC channel
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (26 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 27/29] vfio/vgpu_mgr: bootload the new vGPU Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 12:49 ` [RFC 29/29] vfio/vgpu_mgr: introduce NVIDIA vGPU VFIO variant driver Zhi Wang
                   ` (3 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang

A newly created vGPU requires some runtime configuration to be uploaded
before it can be used.

Introduce the vGPU host RPC channel manipulation APIs and use them to
send the vGPU RPCs that upload the runtime configuration of a vGPU.
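
At a high level, each host RPC is a payload copy into the shared message
buffer, a sequence-number bump in the control buffer, a doorbell write,
and a poll of the response buffer; a condensed sketch of the rpc.c added
below:

  mutex_lock(&rpc->lock);
  memcpy_toio(rpc->msg_buf, data, size);           /* payload into the message buffer */
  ctrl_buf->message_type    = msg_type;
  ctrl_buf->message_seq_num = ++rpc->msg_seq_num;  /* tag the request */
  trigger_doorbell(vgpu);                          /* notify the GSP plugin */
  /* poll resp_buf->message_seq_num_processed until it matches, then check result_code */
  mutex_unlock(&rpc->lock);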

Cc: Neo Jia <cjia@nvidia.com>
Cc: Surath Mitra <smitra@nvidia.com>
Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/vfio/pci/nvidia-vgpu/Makefile         |   2 +-
 drivers/vfio/pci/nvidia-vgpu/debug.h          |  18 ++
 .../nvidia/inc/ctrl/ctrl0000/ctrl0000system.h |  30 +++
 .../nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h | 213 +++++++++++++++
 .../common/sdk/nvidia/inc/nv_vgpu_types.h     |  51 ++++
 .../common/sdk/vmioplugin/inc/vmioplugin.h    |  26 ++
 drivers/vfio/pci/nvidia-vgpu/rpc.c            | 242 ++++++++++++++++++
 drivers/vfio/pci/nvidia-vgpu/vgpu.c           |  11 +
 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c       |  31 +++
 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h       |  21 ++
 10 files changed, 644 insertions(+), 1 deletion(-)
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/debug.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl0000/ctrl0000system.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/nv_vgpu_types.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/vmioplugin/inc/vmioplugin.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/rpc.c

diff --git a/drivers/vfio/pci/nvidia-vgpu/Makefile b/drivers/vfio/pci/nvidia-vgpu/Makefile
index bd65fa548ea1..fade9d49df97 100644
--- a/drivers/vfio/pci/nvidia-vgpu/Makefile
+++ b/drivers/vfio/pci/nvidia-vgpu/Makefile
@@ -2,4 +2,4 @@
 ccflags-y += -I$(srctree)/$(src)/include
 
 obj-$(CONFIG_NVIDIA_VGPU_MGR) += nvidia-vgpu-mgr.o
-nvidia-vgpu-mgr-y := vgpu_mgr.o vgpu.o vgpu_types.o
+nvidia-vgpu-mgr-y := vgpu_mgr.o vgpu.o vgpu_types.o rpc.o
diff --git a/drivers/vfio/pci/nvidia-vgpu/debug.h b/drivers/vfio/pci/nvidia-vgpu/debug.h
new file mode 100644
index 000000000000..bc1c4273f089
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/debug.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright © 2024 NVIDIA Corporation
+ */
+
+#ifndef __NVIDIA_VGPU_DEBUG_H__
+#define __NVIDIA_VGPU_DEBUG_H__
+
+#define nv_vgpu_dbg(v, f, a...) \
+	pci_dbg(v->pdev, "nvidia-vgpu %d: "f, v->info.id, ##a)
+
+#define nv_vgpu_info(v, f, a...) \
+	pci_info(v->pdev, "nvidia-vgpu %d: "f, v->info.id, ##a)
+
+#define nv_vgpu_err(v, f, a...) \
+	pci_err(v->pdev, "nvidia-vgpu %d: "f, v->info.id, ##a)
+
+#endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl0000/ctrl0000system.h b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl0000/ctrl0000system.h
new file mode 100644
index 000000000000..871c498fb666
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl0000/ctrl0000system.h
@@ -0,0 +1,30 @@
+#ifndef __src_common_sdk_nvidia_inc_ctrl_ctrl0000_ctrl0000system_h__
+#define __src_common_sdk_nvidia_inc_ctrl_ctrl0000_ctrl0000system_h__
+
+/* Excerpt of RM headers from https://github.com/NVIDIA/open-gpu-kernel-modules/tree/535.113.01 */
+
+/*
+ * SPDX-FileCopyrightText: Copyright (c) 2005-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-License-Identifier: MIT
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#define NV0000_CTRL_CMD_SYSTEM_GET_VGX_SYSTEM_INFO_BUFFER_SIZE 256U
+#endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h
new file mode 100644
index 000000000000..8f3ea48ef10d
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h
@@ -0,0 +1,213 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright © 2024 NVIDIA Corporation
+ */
+#ifndef __src_common_sdk_nvidia_inc_vgpu_dev_nv_vgpu_gsp_h__
+#define __src_common_sdk_nvidia_inc_vgpu_dev_nv_vgpu_gsp_h__
+
+#include "nv_vgpu_types.h"
+
+#define GSP_PLUGIN_BOOTLOADED 0x4E654A6F
+
+/******************************************************************************/
+/* GSP Control buffer shared between CPU Plugin and GSP Plugin - START        */
+/******************************************************************************/
+
+/*    GSP Plugin heap memory layout
+      +--------------------------------+ offset = 0
+      |         CONTROL BUFFER         |
+      +--------------------------------+
+      |        RESPONSE BUFFER         |
+      +--------------------------------+
+      |         MESSAGE BUFFER         |
+      +--------------------------------+
+      |        MIGRATION BUFFER        |
+      +--------------------------------+
+      |    GSP PLUGIN ERROR BUFFER     |
+      +--------------------------------+
+      |    INIT TASK LOG BUFFER        |
+      +--------------------------------+
+      |    VGPU TASK LOG BUFFER        |
+      +--------------------------------+
+      |      MEMORY AVAILABLE FOR      |
+      | GSP PLUGIN INTERNAL HEAP USAGE |
+      +--------------------------------+
+ */
+#define VGPU_CPU_GSP_CTRL_BUFF_VERSION              0x1
+#define VGPU_CPU_GSP_CTRL_BUFF_REGION_SIZE          4096
+#define VGPU_CPU_GSP_RESPONSE_BUFF_REGION_SIZE      4096
+#define VGPU_CPU_GSP_MESSAGE_BUFF_REGION_SIZE       4096
+#define VGPU_CPU_GSP_MIGRATION_BUFF_REGION_SIZE     (2 * 1024 * 1024)
+#define VGPU_CPU_GSP_ERROR_BUFF_REGION_SIZE         4096
+#define VGPU_CPU_GSP_INIT_TASK_LOG_BUFF_REGION_SIZE (128 * 1024)
+#define VGPU_CPU_GSP_VGPU_TASK_LOG_BUFF_REGION_SIZE (256 * 1024)
+#define VGPU_CPU_GSP_COMMUNICATION_BUFF_TOTAL_SIZE  (VGPU_CPU_GSP_CTRL_BUFF_REGION_SIZE          + \
+		VGPU_CPU_GSP_RESPONSE_BUFF_REGION_SIZE      + \
+		VGPU_CPU_GSP_MESSAGE_BUFF_REGION_SIZE       + \
+		VGPU_CPU_GSP_MIGRATION_BUFF_REGION_SIZE     + \
+		VGPU_CPU_GSP_ERROR_BUFF_REGION_SIZE         + \
+		VGPU_CPU_GSP_INIT_TASK_LOG_BUFF_REGION_SIZE + \
+		VGPU_CPU_GSP_VGPU_TASK_LOG_BUFF_REGION_SIZE)
+
+//
+// Control buffer: CPU Plugin -> GSP Plugin
+// CPU Plugin - Write only
+// GSP Plugin - Read only
+//
+typedef union {
+	NvU8 buf[VGPU_CPU_GSP_CTRL_BUFF_REGION_SIZE];
+	struct {
+		volatile NvU32  version;                        // Version of format
+		volatile NvU32  message_type;                   // Task to be performed by GSP Plugin
+		volatile NvU32  message_seq_num;                // Incrementing sequence number to identify the RPC packet
+		volatile NvU64  response_buff_offset;           // Buffer used to send data from GSP Plugin -> CPU Plugin
+		volatile NvU64  message_buff_offset;            // Buffer used to send RPC data between CPU and GSP Plugin
+		volatile NvU64  migration_buff_offset;          // Buffer used to send migration data between CPU and GSP Plugin
+		volatile NvU64  error_buff_offset;              // Buffer used to send error data from GSP Plugin -> CPU Plugin
+		volatile NvU32  migration_buf_cpu_access_offset;// CPU plugin GET/PUT offset of migration buffer
+		volatile NvBool is_migration_in_progress;       // Is migration active or cancelled
+		volatile NvU32  error_buff_cpu_get_idx;         // GET pointer into ERROR Buffer for CPU Plugin
+		volatile NvU32 attached_vgpu_count;
+		volatile struct {
+			NvU32 vgpu_type_id;
+			NvU32 host_gpu_pci_id;
+			NvU32 pci_dev_id;
+			NvU8  vgpu_uuid[VM_UUID_SIZE];
+		} host_info[VMIOPD_MAX_INSTANCES];
+	};
+} VGPU_CPU_GSP_CTRL_BUFF_REGION;
+
+//
+// Specify actions intended on getting
+// notification from CPU Plugin -> GSP plugin
+//
+typedef enum {
+	NV_VGPU_CPU_RPC_MSG_VERSION_NEGOTIATION = 1,
+	NV_VGPU_CPU_RPC_MSG_SETUP_CONFIG_PARAMS_AND_INIT,
+	NV_VGPU_CPU_RPC_MSG_RESET,
+	NV_VGPU_CPU_RPC_MSG_MIGRATION_STOP_WORK,
+	NV_VGPU_CPU_RPC_MSG_MIGRATION_CANCEL_STOP,
+	NV_VGPU_CPU_RPC_MSG_MIGRATION_SAVE_STATE,
+	NV_VGPU_CPU_RPC_MSG_MIGRATION_CANCEL_SAVE,
+	NV_VGPU_CPU_RPC_MSG_MIGRATION_RESTORE_STATE,
+	NV_VGPU_CPU_RPC_MSG_MIGRATION_RESTORE_DEFERRED_STATE,
+	NV_VGPU_CPU_RPC_MSG_MIGRATION_RESUME_WORK,
+	NV_VGPU_CPU_RPC_MSG_CONSOLE_VNC_STATE,
+	NV_VGPU_CPU_RPC_MSG_VF_BAR0_REG_ACCESS,
+	NV_VGPU_CPU_RPC_MSG_UPDATE_BME_STATE,
+	NV_VGPU_CPU_RPC_MSG_GET_GUEST_INFO,
+	NV_VGPU_CPU_RPC_MSG_MAX,
+} MESSAGE;
+
+//
+// Params structure for NV_VGPU_CPU_RPC_MSG_VERSION_NEGOTIATION
+//
+typedef struct {
+	volatile NvU32 version_cpu;        /* Sent by CPU Plugin */
+	volatile NvU32 version_negotiated; /* Updated by GSP Plugin */
+} NV_VGPU_CPU_RPC_DATA_VERSION_NEGOTIATION;
+
+//
+// Host CPU arch
+//
+typedef enum {
+	NV_VGPU_HOST_CPU_ARCH_AARCH64 = 1,
+	NV_VGPU_HOST_CPU_ARCH_X86_64,
+} NV_VGPU_HOST_CPU_ARCH;
+
+//
+// Params structure for NV_VGPU_CPU_RPC_MSG_COPY_CONFIG_PARAMS
+//
+typedef struct {
+	volatile NvU8   vgpu_uuid[VM_UUID_SIZE];
+	volatile NvU32  dbdf;
+	volatile NvU32  driver_vm_vf_dbdf;
+	volatile NvU32  vgpu_device_instance_id;
+	volatile NvU32  vgpu_type;
+	volatile NvU32  vm_pid;
+	volatile NvU32  swizz_id;
+	volatile NvU32  num_channels;
+	volatile NvU32  num_plugin_channels;
+	volatile NvU32  vmm_cap;
+	volatile NvU32  migration_feature;
+	volatile NvU32  hypervisor_type;
+	volatile NvU32  host_cpu_arch;
+	volatile NvU64  host_page_size;
+	volatile NvBool rev1[2];
+	volatile NvBool enable_uvm;
+	volatile NvBool linux_interrupt_optimization;
+	volatile NvBool vmm_migration_supported;
+	volatile NvBool rev2;
+	volatile NvBool enable_console_vnc;
+	volatile NvBool use_non_stall_linux_events;
+	volatile NvU32  rev3;
+} NV_VGPU_CPU_RPC_DATA_COPY_CONFIG_PARAMS;
+
+// Params structure for NV_VGPU_CPU_RPC_MSG_UPDATE_BME_STATE
+typedef struct {
+	volatile NvBool enable;
+	volatile NvBool allowed;
+} NV_VGPU_CPU_RPC_DATA_UPDATE_BME_STATE;
+//
+// Message Buffer:
+// CPU Plugin - Read/Write
+// GSP Plugin - Read/Write
+//
+typedef union {
+	NvU8 buf[VGPU_CPU_GSP_MESSAGE_BUFF_REGION_SIZE];
+	NV_VGPU_CPU_RPC_DATA_VERSION_NEGOTIATION    version_data;
+	NV_VGPU_CPU_RPC_DATA_COPY_CONFIG_PARAMS     config_data;
+	NV_VGPU_CPU_RPC_DATA_UPDATE_BME_STATE       bme_state;
+} VGPU_CPU_GSP_MSG_BUFF_REGION;
+
+typedef struct {
+	volatile NvU64                          sequence_update_start;
+	volatile NvU64                          sequence_update_end;
+	volatile NvU32                          effective_fb_page_size;
+	volatile NvU32                          rect_width;
+	volatile NvU32                          rect_height;
+	volatile NvU32                          surface_width;
+	volatile NvU32                          surface_height;
+	volatile NvU32                          surface_size;
+	volatile NvU32                          surface_offset;
+	volatile NvU32                          surface_format;
+	volatile NvU32                          surface_kind;
+	volatile NvU32                          surface_pitch;
+	volatile NvU32                          surface_type;
+	volatile NvU8                           surface_block_height;
+	volatile vmiop_bool_t                   is_blanking_enabled;
+	volatile vmiop_bool_t                   is_flip_pending;
+	volatile vmiop_bool_t                   is_free_pending;
+	volatile vmiop_bool_t                   is_memory_blocklinear;
+} VGPU_CPU_GSP_DISPLAYLESS_SURFACE;
+
+//
+// GSP Plugin Response Buffer:
+// CPU Plugin - Read only
+// GSP Plugin - Write only
+//
+typedef union {
+	NvU8 buf[VGPU_CPU_GSP_RESPONSE_BUFF_REGION_SIZE];
+	struct {
+		// Updated by GSP Plugin once task is complete
+		volatile NvU32                              message_seq_num_processed;
+		// Updated by GSP on completion of RPC
+		volatile NvU32                              result_code;
+		volatile NvU32                              guest_rpc_version;
+		// GSP plugin GET/PUT offset pointer of migration buffer
+		volatile NvU32                              migration_buf_gsp_access_offset;
+		// Current state of migration
+		volatile NvU32                              migration_state_save_complete;
+		// Console VNC surface information
+		volatile VGPU_CPU_GSP_DISPLAYLESS_SURFACE   surface[VMIOPD_MAX_HEADS];
+		// PUT pointer into ERROR Buffer for GSP Plugin
+		volatile NvU32                              error_buff_gsp_put_idx;
+		// Updated grid license state as received from guest
+		volatile NvU32                              grid_license_state;
+	};
+} VGPU_CPU_GSP_RESPONSE_BUFF_REGION;
+
+/******************************************************************************/
+/* GSP Control buffer shared between CPU Plugin and GSP Plugin - END          */
+/******************************************************************************/
+#endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/nv_vgpu_types.h b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/nv_vgpu_types.h
new file mode 100644
index 000000000000..903a5840366c
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/nv_vgpu_types.h
@@ -0,0 +1,51 @@
+#ifndef __src_common_sdk_nvidia_inc_nv_vgpu_types_h__
+#define __src_common_sdk_nvidia_inc_nv_vgpu_types_h__
+
+/* Excerpt of RM headers from https://github.com/NVIDIA/open-gpu-kernel-modules/tree/535.113.01 */
+
+/*
+ * SPDX-FileCopyrightText: Copyright (c) 2016-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-License-Identifier: MIT
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#define VM_UUID_SIZE            16
+#define INVALID_VGPU_DEV_INST   0xFFFFFFFFU
+#define MAX_VGPU_DEVICES_PER_VM 16U
+
+/* This enum represents the current state of guest dependent fields */
+typedef enum GUEST_VM_INFO_STATE {
+	GUEST_VM_INFO_STATE_UNINITIALIZED = 0,
+	GUEST_VM_INFO_STATE_INITIALIZED = 1,
+} GUEST_VM_INFO_STATE;
+
+/* This enum represents types of VM identifiers */
+typedef enum VM_ID_TYPE {
+	VM_ID_DOMAIN_ID = 0,
+	VM_ID_UUID = 1,
+} VM_ID_TYPE;
+
+/* This structure represents VM identifier */
+typedef union VM_ID {
+	NvU8 vmUuid[VM_UUID_SIZE];
+	NV_DECLARE_ALIGNED(NvU64 vmId, 8);
+} VM_ID;
+
+#endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/vmioplugin/inc/vmioplugin.h b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/vmioplugin/inc/vmioplugin.h
new file mode 100644
index 000000000000..58a473309e42
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/vmioplugin/inc/vmioplugin.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright © 2024 NVIDIA Corporation
+ */
+#ifndef __src_common_sdk_vmioplugin_inc_vmioplugin_h__
+#define __src_common_sdk_vmioplugin_inc_vmioplugin_h__
+
+#define VMIOPD_MAX_INSTANCES 16
+#define VMIOPD_MAX_HEADS     4
+
+/**
+ * Boolean type.
+ */
+
+enum vmiop_bool_e {
+	vmiop_false = 0,        /*!< Boolean false */
+	vmiop_true = 1          /*!< Boolean true */
+};
+
+/**
+ * Boolean type.
+ */
+
+typedef enum vmiop_bool_e vmiop_bool_t;
+
+#endif
diff --git a/drivers/vfio/pci/nvidia-vgpu/rpc.c b/drivers/vfio/pci/nvidia-vgpu/rpc.c
new file mode 100644
index 000000000000..c316941f4b97
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/rpc.c
@@ -0,0 +1,242 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright © 2024 NVIDIA Corporation
+ */
+
+#include <linux/delay.h>
+#include <linux/kernel.h>
+
+#include <nvrm/nvtypes.h>
+#include <nvrm/common/sdk/nvidia/inc/ctrl/ctrl0000/ctrl0000system.h>
+#include <nvrm/common/sdk/vmioplugin/inc/vmioplugin.h>
+#include <nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h>
+#include <nvrm/common/sdk/nvidia/inc/ctrl/ctrla081.h>
+
+#include "debug.h"
+#include "vgpu_mgr.h"
+
+static void trigger_doorbell(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr = vgpu->vgpu_mgr;
+
+	u32 v = vgpu->info.gfid * 32 + 17;
+
+	writel(v, vgpu_mgr->bar0_vaddr + 0x00B80000 + 0x2200);
+	readl(vgpu_mgr->bar0_vaddr + 0x00B80000 + 0x2200);
+}
+
+static void send_rpc_request(struct nvidia_vgpu *vgpu, u32 msg_type,
+			    void *data, u64 size)
+{
+	struct nvidia_vgpu_rpc *rpc = &vgpu->rpc;
+	VGPU_CPU_GSP_CTRL_BUFF_REGION *ctrl_buf = rpc->ctrl_buf;
+
+	if (data && size)
+		memcpy_toio(rpc->msg_buf, data, size);
+
+	ctrl_buf->message_type = msg_type;
+
+	rpc->msg_seq_num++;
+	ctrl_buf->message_seq_num = rpc->msg_seq_num;
+
+	trigger_doorbell(vgpu);
+}
+
+static int wait_for_response(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_rpc *rpc = &vgpu->rpc;
+	VGPU_CPU_GSP_RESPONSE_BUFF_REGION *resp_buf = rpc->resp_buf;
+
+	u64 timeout = 120 * 1000000; /* 120s */
+
+	do {
+		if (resp_buf->message_seq_num_processed == rpc->msg_seq_num)
+			break;
+
+		usleep_range(1, 2);
+	} while (--timeout);
+
+	return timeout ? 0 : -ETIMEDOUT;
+}
+
+static int recv_rpc_response(struct nvidia_vgpu *vgpu, void *data,
+			     u64 size, u32 *result)
+{
+	struct nvidia_vgpu_rpc *rpc = &vgpu->rpc;
+	VGPU_CPU_GSP_RESPONSE_BUFF_REGION *resp_buf = rpc->resp_buf;
+	int ret;
+
+	ret = wait_for_response(vgpu);
+	if (result)
+		*result = resp_buf->result_code;
+
+	if (ret)
+		return ret;
+
+	if (data && size)
+		memcpy_fromio(data, rpc->msg_buf, size);
+
+	return 0;
+}
+
+int nvidia_vgpu_rpc_call(struct nvidia_vgpu *vgpu, u32 msg_type,
+			 void *data, u64 size)
+{
+	struct nvidia_vgpu_rpc *rpc = &vgpu->rpc;
+	u32 result;
+	int ret;
+
+	if (WARN_ON(msg_type >= NV_VGPU_CPU_RPC_MSG_MAX) ||
+		   (size > VGPU_CPU_GSP_MESSAGE_BUFF_REGION_SIZE) ||
+		   ((size != 0) && (data == NULL)))
+		return -EINVAL;
+
+	mutex_lock(&rpc->lock);
+
+	send_rpc_request(vgpu, msg_type, data, size);
+	ret = recv_rpc_response(vgpu, data, size, &result);
+
+	mutex_unlock(&rpc->lock);
+	if (ret || result) {
+		nv_vgpu_err(vgpu, "fail to recv RPC: result %u\n",
+			    result);
+		return -EINVAL;
+	}
+	return ret;
+}
+
+void nvidia_vgpu_clean_rpc(struct nvidia_vgpu *vgpu)
+{
+}
+
+static void init_rpc_buf_pointers(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_mgmt *mgmt = &vgpu->mgmt;
+	struct nvidia_vgpu_rpc *rpc = &vgpu->rpc;
+
+	rpc->ctrl_buf = mgmt->ctrl_vaddr;
+	rpc->resp_buf = rpc->ctrl_buf + VGPU_CPU_GSP_CTRL_BUFF_REGION_SIZE;
+	rpc->msg_buf = rpc->resp_buf + VGPU_CPU_GSP_RESPONSE_BUFF_REGION_SIZE;
+	rpc->migration_buf = rpc->msg_buf + VGPU_CPU_GSP_MESSAGE_BUFF_REGION_SIZE;
+	rpc->error_buf = rpc->migration_buf + VGPU_CPU_GSP_MIGRATION_BUFF_REGION_SIZE;
+}
+
+static void init_ctrl_buf_offsets(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_rpc *rpc = &vgpu->rpc;
+	VGPU_CPU_GSP_CTRL_BUFF_REGION *ctrl_buf;
+	u64 offset = 0;
+
+	ctrl_buf = rpc->ctrl_buf;
+
+	ctrl_buf->version = VGPU_CPU_GSP_CTRL_BUFF_VERSION;
+
+	offset = VGPU_CPU_GSP_CTRL_BUFF_REGION_SIZE;
+	ctrl_buf->response_buff_offset = offset;
+
+	offset += VGPU_CPU_GSP_RESPONSE_BUFF_REGION_SIZE;
+	ctrl_buf->message_buff_offset = offset;
+
+	offset += VGPU_CPU_GSP_MESSAGE_BUFF_REGION_SIZE;
+	ctrl_buf->migration_buff_offset = offset;
+
+	offset += VGPU_CPU_GSP_MIGRATION_BUFF_REGION_SIZE;
+	ctrl_buf->error_buff_offset = offset;
+}
+
+static int wait_vgpu_plugin_task_bootloaded(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_rpc *rpc = &vgpu->rpc;
+	VGPU_CPU_GSP_CTRL_BUFF_REGION *ctrl_buf = rpc->ctrl_buf;
+
+	u64 timeout = 10 * 1000000; /* 10 s */
+
+	do {
+		if (ctrl_buf->message_seq_num == GSP_PLUGIN_BOOTLOADED)
+			break;
+
+		usleep_range(1, 2);
+	} while (--timeout);
+
+	return timeout ? 0 : -ETIMEDOUT;
+}
+
+static int negotiate_rpc_version(struct nvidia_vgpu *vgpu)
+{
+	return nvidia_vgpu_rpc_call(vgpu, NV_VGPU_CPU_RPC_MSG_VERSION_NEGOTIATION,
+				    NULL, 0);
+}
+
+unsigned char config_params[] = {
+	0x24, 0xef, 0x8f, 0xf7, 0x3e, 0xd5, 0x11, 0xef, 0xae, 0x36, 0x97, 0x58,
+	0xb1, 0xcb, 0x0c, 0x87, 0x04, 0xc1, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x14, 0x00, 0xd0, 0xc1, 0x65, 0x03, 0x00, 0x00, 0xa1, 0x0e, 0x00, 0x00,
+	0xff, 0xff, 0xff, 0xff, 0x40, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00,
+	0x02, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00
+};
+
+static int send_config_params_and_init(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr = vgpu->vgpu_mgr;
+	NV_VGPU_CPU_RPC_DATA_COPY_CONFIG_PARAMS params = {0};
+	NVA081_CTRL_VGPU_INFO *info = (NVA081_CTRL_VGPU_INFO *)
+				      vgpu->vgpu_type;
+
+	memcpy(&params, config_params, sizeof(config_params));
+
+	params.dbdf = vgpu->info.dbdf;
+	params.vgpu_device_instance_id =
+		nvidia_vgpu_mgr_get_gsp_client_handle(vgpu_mgr, &vgpu->gsp_client);
+	params.vgpu_type = info->vgpuType;
+	params.vm_pid = 0;
+	params.swizz_id = 0;
+	params.num_channels = vgpu->chid.num_chid;
+	params.num_plugin_channels = vgpu->chid.num_plugin_channels;
+
+	return nvidia_vgpu_rpc_call(vgpu, NV_VGPU_CPU_RPC_MSG_SETUP_CONFIG_PARAMS_AND_INIT,
+				    &params, sizeof(params));
+}
+
+/**
+ * nvidia_vgpu_setup_rpc - setup the vGPU host RPC channel and send runtime
+ * configuration.
+ * @vgpu: the vGPU instance.
+ *
+ * Returns: 0 on success, or a negative error code on failure.
+ */
+int nvidia_vgpu_setup_rpc(struct nvidia_vgpu *vgpu)
+{
+	struct nvidia_vgpu_rpc *rpc = &vgpu->rpc;
+	int ret;
+
+	mutex_init(&rpc->lock);
+
+	init_rpc_buf_pointers(vgpu);
+	init_ctrl_buf_offsets(vgpu);
+
+	ret = wait_vgpu_plugin_task_bootloaded(vgpu);
+	if (ret) {
+		nv_vgpu_err(vgpu, "waiting bootload timeout!\n");
+		return ret;
+	}
+
+	ret = negotiate_rpc_version(vgpu);
+	if (ret) {
+		nv_vgpu_err(vgpu, "fail to negotiate rpc version!\n");
+		return ret;
+	}
+
+	ret = send_config_params_and_init(vgpu);
+	if (ret) {
+		nv_vgpu_err(vgpu, "fail to init vgpu plugin task!\n");
+		return ret;
+	}
+
+	nv_vgpu_dbg(vgpu, "vGPU RPC initialization is done.\n");
+
+	return 0;
+}
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu.c b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
index e06d5155bb38..93d27db30a41 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu.c
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
@@ -8,6 +8,9 @@
 #include <nvrm/nvtypes.h>
 #include <nvrm/common/sdk/nvidia/inc/ctrl/ctrla081.h>
 #include <nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h>
+#include <nvrm/common/sdk/nvidia/inc/ctrl/ctrl0000/ctrl0000system.h>
+#include <nvrm/common/sdk/vmioplugin/inc/vmioplugin.h>
+#include <nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h>
 
 #include "vgpu_mgr.h"
 
@@ -240,6 +243,7 @@ int nvidia_vgpu_mgr_destroy_vgpu(struct nvidia_vgpu *vgpu)
 	if (!atomic_cmpxchg(&vgpu->status, 1, 0))
 		return -ENODEV;
 
+	nvidia_vgpu_clean_rpc(vgpu);
 	WARN_ON(shutdown_vgpu_plugin_task(vgpu));
 	WARN_ON(cleanup_vgpu_plugin_task(vgpu));
 	nvidia_vgpu_mgr_free_gsp_client(vgpu_mgr, &vgpu->gsp_client);
@@ -299,10 +303,17 @@ int nvidia_vgpu_mgr_create_vgpu(struct nvidia_vgpu *vgpu, u8 *vgpu_type)
 	if (ret)
 		goto err_bootload_vgpu_plugin_task;
 
+	ret = nvidia_vgpu_setup_rpc(vgpu);
+	if (ret)
+		goto err_setup_rpc;
+
 	atomic_set(&vgpu->status, 1);
 
 	return 0;
 
+err_setup_rpc:
+	shutdown_vgpu_plugin_task(vgpu);
+	cleanup_vgpu_plugin_task(vgpu);
 err_bootload_vgpu_plugin_task:
 	nvidia_vgpu_mgr_free_gsp_client(vgpu_mgr, &vgpu->gsp_client);
 err_alloc_gsp_client:
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c
index dcb314b14f91..e84cf4a845d4 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c
@@ -7,11 +7,35 @@
 
 DEFINE_MUTEX(vgpu_mgr_attach_lock);
 
+static void unmap_pf_mmio(struct nvidia_vgpu_mgr *vgpu_mgr)
+{
+	iounmap(vgpu_mgr->bar0_vaddr);
+}
+
+static int map_pf_mmio(struct nvidia_vgpu_mgr *vgpu_mgr)
+{
+	struct pci_dev *pdev = vgpu_mgr->pdev;
+	resource_size_t start, size;
+	void *vaddr;
+
+	start = pci_resource_start(pdev, 0);
+	size = pci_resource_len(pdev, 0);
+
+	vaddr = ioremap(start, size);
+	if (!vaddr)
+		return -ENOMEM;
+
+	vgpu_mgr->bar0_vaddr = vaddr;
+
+	return 0;
+}
+
 static void vgpu_mgr_release(struct kref *kref)
 {
 	struct nvidia_vgpu_mgr *vgpu_mgr =
 		container_of(kref, struct nvidia_vgpu_mgr, refcount);
 
+	unmap_pf_mmio(vgpu_mgr);
 	nvidia_vgpu_mgr_free_gsp_client(vgpu_mgr, &vgpu_mgr->gsp_client);
 	nvidia_vgpu_mgr_detach_handle(&vgpu_mgr->handle);
 	kvfree(vgpu_mgr);
@@ -83,6 +107,8 @@ struct nvidia_vgpu_mgr *nvidia_vgpu_mgr_get(struct pci_dev *dev)
 	kref_init(&vgpu_mgr->refcount);
 	mutex_init(&vgpu_mgr->vgpu_id_lock);
 
+	vgpu_mgr->pdev = dev->physfn;
+
 	ret = nvidia_vgpu_mgr_alloc_gsp_client(vgpu_mgr,
 					       &vgpu_mgr->gsp_client);
 	if (ret)
@@ -92,9 +118,14 @@ struct nvidia_vgpu_mgr *nvidia_vgpu_mgr_get(struct pci_dev *dev)
 	if (ret)
 		goto fail_init_vgpu_types;
 
+	ret = map_pf_mmio(vgpu_mgr);
+	if (ret)
+		goto fail_map_pf_mmio;
+
 	mutex_unlock(&vgpu_mgr_attach_lock);
 	return vgpu_mgr;
 
+fail_map_pf_mmio:
 fail_init_vgpu_types:
 	nvidia_vgpu_mgr_free_gsp_client(vgpu_mgr, &vgpu_mgr->gsp_client);
 fail_alloc_gsp_client:
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
index 6f05b285484c..af922d8e539c 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
@@ -28,6 +28,16 @@ struct nvidia_vgpu_mgmt {
 	void __iomem *vgpu_task_log_vaddr;
 };
 
+struct nvidia_vgpu_rpc {
+	struct mutex lock;
+	u32 msg_seq_num;
+	void __iomem *ctrl_buf;
+	void __iomem *resp_buf;
+	void __iomem *msg_buf;
+	void __iomem *migration_buf;
+	void __iomem *error_buf;
+};
+
 struct nvidia_vgpu {
 	struct mutex lock;
 	atomic_t status;
@@ -41,6 +51,7 @@ struct nvidia_vgpu {
 	struct nvidia_vgpu_mem *fbmem_heap;
 	struct nvidia_vgpu_chid chid;
 	struct nvidia_vgpu_mgmt mgmt;
+	struct nvidia_vgpu_rpc rpc;
 };
 
 struct nvidia_vgpu_mgr {
@@ -55,6 +66,9 @@ struct nvidia_vgpu_mgr {
 	u32 num_vgpu_types;
 
 	struct nvidia_vgpu_gsp_client gsp_client;
+
+	struct pci_dev *pdev;
+	void __iomem *bar0_vaddr;
 };
 
 struct nvidia_vgpu_mgr *nvidia_vgpu_mgr_get(struct pci_dev *dev);
@@ -65,4 +79,11 @@ int nvidia_vgpu_mgr_create_vgpu(struct nvidia_vgpu *vgpu, u8 *vgpu_type);
 
 int nvidia_vgpu_mgr_init_vgpu_types(struct nvidia_vgpu_mgr *vgpu_mgr);
 
+int nvidia_vgpu_rpc_call(struct nvidia_vgpu *vgpu, u32 msg_type,
+			 void *data, u64 size);
+void nvidia_vgpu_clean_rpc(struct nvidia_vgpu *vgpu);
+int nvidia_vgpu_setup_rpc(struct nvidia_vgpu *vgpu);
+
+int nvidia_vgpu_mgr_reset_vgpu(struct nvidia_vgpu *vgpu);
+
 #endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [RFC 29/29] vfio/vgpu_mgr: introduce NVIDIA vGPU VFIO variant driver
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (27 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 28/29] vfio/vgpu_mgr: introduce vGPU host RPC channel Zhi Wang
@ 2024-09-22 12:49 ` Zhi Wang
  2024-09-22 13:11 ` [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (2 subsequent siblings)
  31 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 12:49 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiw, zhiwang,
	Vinay Kabra

A VFIO variant driver module is designed to extend the capabilities of
the existing VFIO (Virtual Function I/O) framework, offering device
management interfaces to the userspace and support for advanced features.

For the userspace to use an NVIDIA vGPU, a new vGPU VFIO variant driver
is introduced to provide vGPU management, such as selecting and creating
a vGPU instance, and to support advanced features like live migration.

Introduce the NVIDIA vGPU VFIO variant driver to support the vGPU
lifecycle management UABI and future advanced features.
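
At its core, the variant driver emulates the vGPU's PCI config space and
forwards BAR0 accesses to the VF; a condensed sketch of the access
dispatch implemented in vfio_access.c below:

  switch (VFIO_PCI_OFFSET_TO_INDEX(ppos)) {
  case VFIO_PCI_CONFIG_REGION_INDEX:
          return pci_config_rw(nvdev, buf, count, ppos, iswrite); /* emulated config space */
  case VFIO_PCI_BAR0_REGION_INDEX:
          return bar0_rw(nvdev, buf, count, ppos, iswrite);       /* forwarded to the VF BAR0 */
  default:
          return -EINVAL;
  }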

Cc: Neo Jia <cjia@nvidia.com>
Cc: Surath Mitra <smitra@nvidia.com>
Cc: Kirti Wankhede <kwankhede@nvidia.com>
Cc: Vinay Kabra <vkabra@nvidia.com>
Cc: Ankit Agrawal <ankita@nvidia.com>
Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/vfio/pci/nvidia-vgpu/Makefile      |   3 +
 drivers/vfio/pci/nvidia-vgpu/vfio.h        |  43 ++
 drivers/vfio/pci/nvidia-vgpu/vfio_access.c | 297 ++++++++++++
 drivers/vfio/pci/nvidia-vgpu/vfio_main.c   | 511 +++++++++++++++++++++
 drivers/vfio/pci/nvidia-vgpu/vgpu.c        |  22 +
 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h    |   2 +-
 6 files changed, 877 insertions(+), 1 deletion(-)
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/vfio.h
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/vfio_access.c
 create mode 100644 drivers/vfio/pci/nvidia-vgpu/vfio_main.c

diff --git a/drivers/vfio/pci/nvidia-vgpu/Makefile b/drivers/vfio/pci/nvidia-vgpu/Makefile
index fade9d49df97..99c47e2f436d 100644
--- a/drivers/vfio/pci/nvidia-vgpu/Makefile
+++ b/drivers/vfio/pci/nvidia-vgpu/Makefile
@@ -3,3 +3,6 @@ ccflags-y += -I$(srctree)/$(src)/include
 
 obj-$(CONFIG_NVIDIA_VGPU_MGR) += nvidia-vgpu-mgr.o
 nvidia-vgpu-mgr-y := vgpu_mgr.o vgpu.o vgpu_types.o rpc.o
+
+obj-$(CONFIG_NVIDIA_VGPU_VFIO_PCI) += nvidia-vgpu-vfio-pci.o
+nvidia-vgpu-vfio-pci-y := vfio_main.o vfio_access.o
diff --git a/drivers/vfio/pci/nvidia-vgpu/vfio.h b/drivers/vfio/pci/nvidia-vgpu/vfio.h
new file mode 100644
index 000000000000..fa6bbf81552d
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/vfio.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright © 2024 NVIDIA Corporation
+ */
+
+#ifndef _NVIDIA_VGPU_VFIO_H__
+#define _NVIDIA_VGPU_VFIO_H__
+
+#include <linux/vfio_pci_core.h>
+
+#include <nvrm/nvtypes.h>
+#include <nvrm/common/sdk/nvidia/inc/ctrl/ctrla081.h>
+#include <nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h>
+
+#include "vgpu_mgr.h"
+
+#define VGPU_CONFIG_PARAMS_MAX_LENGTH 1024
+#define DEVICE_CLASS_LENGTH 5
+#define PCI_CONFIG_SPACE_LENGTH 4096
+
+#define CAP_LIST_NEXT_PTR_MSIX 0x7c
+#define MSIX_CAP_SIZE   0xc
+
+struct nvidia_vgpu_vfio {
+	struct vfio_pci_core_device core_dev;
+	u8 config_space[PCI_CONFIG_SPACE_LENGTH];
+
+	void __iomem *bar0_map;
+
+	u8 **vgpu_types;
+	NVA081_CTRL_VGPU_INFO *curr_vgpu_type;
+	u32 num_vgpu_types;
+
+	struct nvidia_vgpu_mgr *vgpu_mgr;
+	struct nvidia_vgpu *vgpu;
+};
+
+void nvidia_vgpu_vfio_setup_config(struct nvidia_vgpu_vfio *nvdev);
+ssize_t nvidia_vgpu_vfio_access(struct nvidia_vgpu_vfio *nvdev,
+				char __user *buf, size_t count,
+				loff_t ppos, bool iswrite);
+
+#endif /* _NVIDIA_VGPU_VFIO_H__ */
diff --git a/drivers/vfio/pci/nvidia-vgpu/vfio_access.c b/drivers/vfio/pci/nvidia-vgpu/vfio_access.c
new file mode 100644
index 000000000000..320c72a07dbe
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/vfio_access.c
@@ -0,0 +1,297 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright © 2024 NVIDIA Corporation
+ */
+
+#include <linux/string.h>
+#include <linux/pci.h>
+#include <linux/pci_regs.h>
+
+#include "vfio.h"
+
+void nvidia_vgpu_vfio_setup_config(struct nvidia_vgpu_vfio *nvdev)
+{
+	u8 *buffer = NULL;
+
+	memset(nvdev->config_space, 0, sizeof(nvdev->config_space));
+
+	/* Header type 0 (normal devices) */
+	*(u16 *)&nvdev->config_space[PCI_VENDOR_ID] = 0x10de;
+	*(u16 *)&nvdev->config_space[PCI_DEVICE_ID] =
+		FIELD_GET(GENMASK(31, 16), nvdev->curr_vgpu_type->vdevId);
+	*(u16 *)&nvdev->config_space[PCI_COMMAND] = 0x0000;
+	*(u16 *)&nvdev->config_space[PCI_STATUS] = 0x0010;
+
+	buffer = &nvdev->config_space[PCI_CLASS_REVISION];
+	pci_read_config_byte(nvdev->core_dev.pdev, PCI_CLASS_REVISION, buffer);
+
+	nvdev->config_space[PCI_CLASS_PROG] = 0; /* VGA-compatible */
+	nvdev->config_space[PCI_CLASS_DEVICE] = 0; /* VGA controller */
+	nvdev->config_space[PCI_CLASS_DEVICE + 1] = 3; /* display controller */
+
+	/* BAR0: 32-bit */
+	*(u32 *)&nvdev->config_space[PCI_BASE_ADDRESS_0] = 0x00000000;
+	/* BAR1: 64-bit, prefetchable */
+	*(u32 *)&nvdev->config_space[PCI_BASE_ADDRESS_1] = 0x0000000c;
+	/* BAR2: 64-bit, prefetchable */
+	*(u32 *)&nvdev->config_space[PCI_BASE_ADDRESS_3] = 0x0000000c;
+	/* Disable BAR3: I/O */
+	*(u32 *)&nvdev->config_space[PCI_BASE_ADDRESS_5] = 0x00000000;
+
+	*(u16 *)&nvdev->config_space[PCI_SUBSYSTEM_VENDOR_ID] = 0x10de;
+	*(u16 *)&nvdev->config_space[PCI_SUBSYSTEM_ID] =
+		FIELD_GET(GENMASK(15, 0), nvdev->curr_vgpu_type->vdevId);
+
+	nvdev->config_space[PCI_CAPABILITY_LIST] = CAP_LIST_NEXT_PTR_MSIX;
+	nvdev->config_space[CAP_LIST_NEXT_PTR_MSIX + 1] = 0x0;
+
+	/* INTx disabled */
+	nvdev->config_space[0x3d] = 0;
+}
+
+static void read_hw_pci_config(struct pci_dev *pdev, char *buf,
+			       size_t count, loff_t offset)
+{
+	switch (count) {
+	case 4:
+		pci_read_config_dword(pdev, offset, (u32 *)buf);
+		break;
+
+	case 2:
+		pci_read_config_word(pdev, offset, (u16 *)buf);
+		break;
+
+	case 1:
+		pci_read_config_byte(pdev, offset, (u8 *)buf);
+		break;
+	default:
+		WARN_ONCE(1, "Not supported access len\n");
+		break;
+	}
+}
+
+static void write_hw_pci_config(struct pci_dev *pdev, char *buf,
+				size_t count, loff_t offset)
+{
+	switch (count) {
+	case 4:
+		pci_write_config_dword(pdev, offset, *(u32 *)buf);
+		break;
+
+	case 2:
+		pci_write_config_word(pdev, offset, *(u16 *)buf);
+		break;
+
+	case 1:
+		pci_write_config_byte(pdev, offset, *(u8 *)buf);
+		break;
+	default:
+		WARN_ONCE(1, "Not supported access len\n");
+		break;
+	}
+}
+
+static void hw_pci_config_rw(struct pci_dev *pdev, char *buf,
+			     size_t count, loff_t offset,
+			     bool is_write)
+{
+	is_write ? write_hw_pci_config(pdev, buf, count, offset) :
+		   read_hw_pci_config(pdev, buf, count, offset);
+}
+
+static ssize_t bar0_rw(struct nvidia_vgpu_vfio *nvdev, char *buf,
+		       size_t count, loff_t ppos, bool iswrite)
+{
+	struct pci_dev *pdev = nvdev->core_dev.pdev;
+	int index = VFIO_PCI_OFFSET_TO_INDEX(ppos);
+	loff_t offset = ppos;
+	void __iomem *map;
+	u32 val;
+	int ret;
+
+	if (index != VFIO_PCI_BAR0_REGION_INDEX)
+		return -EINVAL;
+
+	offset &= VFIO_PCI_OFFSET_MASK;
+
+	if (nvdev->bar0_map == NULL) {
+		ret = pci_request_selected_regions(pdev, 1 << index, "nvidia-vgpu-vfio");
+		if (ret)
+			return ret;
+
+		if (!(pci_resource_flags(pdev, index) & IORESOURCE_MEM)) {
+			pci_release_selected_regions(pdev, 1 << index);
+			return -EIO;
+		}
+
+		map = ioremap(pci_resource_start(pdev, index), pci_resource_len(pdev, index));
+		if (!map) {
+			pci_err(pdev, "Can't map BAR0 MMIO space\n");
+			pci_release_selected_regions(pdev, 1 << index);
+			return -ENOMEM;
+		}
+		nvdev->bar0_map = map;
+	} else
+		map = nvdev->bar0_map;
+
+	if (!iswrite) {
+		switch (count) {
+		case 4:
+			val = ioread32(map + offset);
+			break;
+		case 2:
+			val = ioread16(map + offset);
+			break;
+		case 1:
+			val = ioread8(map + offset);
+			break;
+		}
+		memcpy(buf, (u8 *)&val, count);
+	} else {
+		switch (count) {
+		case 4:
+			iowrite32(*(u32 *)buf, map + offset);
+			break;
+		case 2:
+			iowrite16(*(u16 *)buf, map + offset);
+			break;
+		case 1:
+			iowrite8(*(u8 *)buf, map + offset);
+			break;
+		}
+	}
+	return count;
+}
+
+static ssize_t pci_config_rw(struct nvidia_vgpu_vfio *nvdev, char *buf,
+			     size_t count, loff_t ppos, bool iswrite)
+{
+	struct pci_dev *pdev = nvdev->core_dev.pdev;
+	int index = VFIO_PCI_OFFSET_TO_INDEX(ppos);
+	loff_t offset = ppos;
+	u32 bar_mask, cfg_addr;
+	u32 val = 0;
+
+	if (index != VFIO_PCI_CONFIG_REGION_INDEX)
+		return -EINVAL;
+
+	offset &= VFIO_PCI_OFFSET_MASK;
+
+	if ((offset >= CAP_LIST_NEXT_PTR_MSIX) && (offset <
+				(CAP_LIST_NEXT_PTR_MSIX + MSIX_CAP_SIZE))) {
+		hw_pci_config_rw(pdev, buf, count, offset, iswrite);
+		return count;
+	}
+
+	if (!iswrite) {
+		memcpy(buf, (u8 *)&nvdev->config_space[offset], count);
+
+		switch (offset) {
+		case PCI_COMMAND:
+			hw_pci_config_rw(pdev, (char *)&val, count, offset, iswrite);
+
+			switch (count) {
+			case 4:
+				val = (u32)(val & 0xFFFF0000) | (val &
+					(PCI_COMMAND_PARITY | PCI_COMMAND_SERR));
+				break;
+			case 2:
+				val = (val & (PCI_COMMAND_PARITY | PCI_COMMAND_SERR));
+				break;
+			default:
+				WARN_ONCE(1, "Not supported access len\n");
+				break;
+			}
+			break;
+		case PCI_STATUS:
+			hw_pci_config_rw(pdev, (char *)&val, count, offset, iswrite);
+			break;
+
+		default:
+			break;
+		}
+		*(u32 *)buf = *(u32 *)buf | val;
+	} else {
+		switch (offset) {
+		case PCI_VENDOR_ID:
+		case PCI_DEVICE_ID:
+		case PCI_CAPABILITY_LIST:
+			break;
+
+		case PCI_STATUS:
+			hw_pci_config_rw(pdev, buf, count, offset, iswrite);
+			break;
+
+		case PCI_COMMAND:
+			if (count == 4) {
+				val = (u32)((*(u32 *)buf & 0xFFFF0000) >> 16);
+				hw_pci_config_rw(pdev, (char *)&val, 2, PCI_STATUS, iswrite);
+
+				val = (u32)(*(u32 *)buf & 0x0000FFFF);
+				*(u32 *)buf = val;
+			}
+
+			memcpy((u8 *)&nvdev->config_space[offset], buf, count);
+			break;
+
+		case PCI_BASE_ADDRESS_0:
+		case PCI_BASE_ADDRESS_1:
+		case PCI_BASE_ADDRESS_2:
+		case PCI_BASE_ADDRESS_3:
+		case PCI_BASE_ADDRESS_4:
+			cfg_addr = *(u32 *)buf;
+
+			switch (offset) {
+			case PCI_BASE_ADDRESS_0:
+				bar_mask = (u32)((~(pci_resource_len(pdev, VFIO_PCI_BAR0_REGION_INDEX)) + 1) & ~0xFul);
+				cfg_addr = (cfg_addr & bar_mask) | (nvdev->config_space[offset] & 0xFul);
+				break;
+			case PCI_BASE_ADDRESS_1:
+				bar_mask = (u32)((~(nvdev->curr_vgpu_type->bar1Length * 1024 * 1024) + 1) & ~0xFul);
+				cfg_addr = (cfg_addr & bar_mask) | (nvdev->config_space[offset] & 0xFul);
+				break;
+
+			case PCI_BASE_ADDRESS_2:
+				bar_mask = (u32)(((~(nvdev->curr_vgpu_type->bar1Length * 1024 * 1024) + 1) & ~0xFul) >> 32);
+				cfg_addr = (cfg_addr & bar_mask);
+				break;
+
+			case PCI_BASE_ADDRESS_3:
+				bar_mask = (u32)((~(pci_resource_len(pdev, VFIO_PCI_BAR3_REGION_INDEX)) + 1) & ~0xFul);
+				cfg_addr = (cfg_addr & bar_mask) | (nvdev->config_space[offset] & 0xFul);
+				break;
+
+			case PCI_BASE_ADDRESS_4:
+				bar_mask = (u32)(((~(pci_resource_len(pdev, VFIO_PCI_BAR3_REGION_INDEX)) + 1) & ~0xFul) >> 32);
+				cfg_addr = (cfg_addr & bar_mask);
+				break;
+			}
+			*(u32 *)&nvdev->config_space[offset] = cfg_addr;
+			break;
+		default:
+			break;
+
+		}
+	}
+	return count;
+}
+
+ssize_t nvidia_vgpu_vfio_access(struct nvidia_vgpu_vfio *nvdev, char *buf,
+				size_t count, loff_t ppos, bool iswrite)
+{
+	int index = VFIO_PCI_OFFSET_TO_INDEX(ppos);
+
+	if (index >= VFIO_PCI_NUM_REGIONS)
+		return -EINVAL;
+
+	switch (index) {
+	case VFIO_PCI_CONFIG_REGION_INDEX:
+		return pci_config_rw(nvdev, buf, count, ppos,
+				     iswrite);
+	case VFIO_PCI_BAR0_REGION_INDEX:
+		return bar0_rw(nvdev, buf, count, ppos, iswrite);
+	default:
+		return -EINVAL;
+	}
+	return count;
+}
diff --git a/drivers/vfio/pci/nvidia-vgpu/vfio_main.c b/drivers/vfio/pci/nvidia-vgpu/vfio_main.c
new file mode 100644
index 000000000000..667ed6fb48f6
--- /dev/null
+++ b/drivers/vfio/pci/nvidia-vgpu/vfio_main.c
@@ -0,0 +1,511 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/*
+ * Copyright © 2024 NVIDIA Corporation
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/pci.h>
+#include <linux/vfio_pci_core.h>
+#include <linux/types.h>
+
+#include "vfio.h"
+
+static int pdev_to_gfid(struct pci_dev *pdev)
+{
+	return pci_iov_vf_id(pdev) + 1;
+}
+
+static int destroy_vgpu(struct nvidia_vgpu_vfio *nvdev)
+{
+	int ret;
+
+	ret = nvidia_vgpu_mgr_destroy_vgpu(nvdev->vgpu);
+	if (ret)
+		return ret;
+
+	kfree(nvdev->vgpu);
+	nvdev->vgpu = NULL;
+	return 0;
+}
+
+static int create_vgpu(struct nvidia_vgpu_vfio *nvdev)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr = nvdev->vgpu_mgr;
+	struct pci_dev *pdev = nvdev->core_dev.pdev;
+	struct nvidia_vgpu *vgpu;
+	int ret;
+
+	vgpu = kzalloc(sizeof(*vgpu), GFP_KERNEL);
+	if (!vgpu)
+		return -ENOMEM;
+
+	vgpu->info.id = pci_iov_vf_id(pdev);
+	vgpu->info.dbdf = (0 << 16) | pci_dev_id(pdev);
+	vgpu->info.gfid = pdev_to_gfid(pdev);
+
+	vgpu->vgpu_mgr = vgpu_mgr;
+	vgpu->pdev = pdev;
+
+	ret = nvidia_vgpu_mgr_create_vgpu(vgpu,
+			(u8 *)nvdev->curr_vgpu_type);
+	if (ret) {
+		kfree(vgpu);
+		return ret;
+	}
+
+	pr_err("create_vgpu() called\n");
+	nvdev->vgpu = vgpu;
+	return 0;
+}
+
+static inline struct vfio_pci_core_device *
+vdev_to_core_dev(struct vfio_device *vdev)
+{
+	return container_of(vdev, struct vfio_pci_core_device, vdev);
+}
+
+static inline struct nvidia_vgpu_vfio *
+core_dev_to_nvdev(struct vfio_pci_core_device *core_dev)
+{
+	return container_of(core_dev, struct nvidia_vgpu_vfio, core_dev);
+}
+
+static void detach_vgpu_mgr(struct nvidia_vgpu_vfio *nvdev)
+{
+	nvidia_vgpu_mgr_put(nvdev->vgpu_mgr);
+
+	nvdev->vgpu_mgr = NULL;
+	nvdev->vgpu_types = NULL;
+	nvdev->num_vgpu_types = 0;
+}
+
+static int attach_vgpu_mgr(struct nvidia_vgpu_vfio *nvdev,
+			   struct pci_dev *pdev)
+{
+	struct nvidia_vgpu_mgr *vgpu_mgr;
+
+	vgpu_mgr = nvidia_vgpu_mgr_get(pdev);
+	if (IS_ERR(vgpu_mgr))
+		return PTR_ERR(vgpu_mgr);
+
+	nvdev->vgpu_mgr = vgpu_mgr;
+	nvdev->vgpu_types = nvdev->vgpu_mgr->vgpu_types;
+	nvdev->num_vgpu_types = nvdev->vgpu_mgr->num_vgpu_types;
+
+	return 0;
+}
+
+static NVA081_CTRL_VGPU_INFO *
+find_vgpu_type(struct nvidia_vgpu_vfio *nvdev, u32 type_id)
+{
+	NVA081_CTRL_VGPU_INFO *vgpu_type;
+	u32 i;
+
+	for (i = 0; i < nvdev->num_vgpu_types; i++) {
+		vgpu_type = (NVA081_CTRL_VGPU_INFO *)nvdev->vgpu_types[i];
+		if (vgpu_type->vgpuType == type_id)
+			return vgpu_type;
+	}
+
+	return NULL;
+}
+
+static int
+nvidia_vgpu_vfio_open_device(struct vfio_device *vdev)
+{
+	struct vfio_pci_core_device *core_dev = vdev_to_core_dev(vdev);
+	struct nvidia_vgpu_vfio *nvdev = core_dev_to_nvdev(core_dev);
+	struct pci_dev *pdev = core_dev->pdev;
+	u64 pf_dma_mask;
+	int ret;
+
+	if (!nvdev->curr_vgpu_type)
+		return -ENODEV;
+
+	if (!pdev->physfn)
+		return -EINVAL;
+
+	ret = create_vgpu(nvdev);
+	if (ret)
+		return ret;
+
+	ret = pci_enable_device(pdev);
+	if (ret)
+		goto err_enable_device;
+
+	pci_set_master(pdev);
+
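+	/* The VF inherits the DMA addressing capability of its parent PF. */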
+	pf_dma_mask = dma_get_mask(&pdev->physfn->dev);
+	dma_set_mask(&pdev->dev, pf_dma_mask);
+	dma_set_coherent_mask(&pdev->dev, pf_dma_mask);
+
+	ret = pci_try_reset_function(pdev);
+	if (ret)
+		goto err_reset_function;
+
+	ret = nvidia_vgpu_mgr_enable_bme(nvdev->vgpu);
+	if (ret)
+		goto err_enable_bme;
+
+	return 0;
+
+err_enable_bme:
+err_reset_function:
+	pci_clear_master(pdev);
+	pci_disable_device(pdev);
+err_enable_device:
+	destroy_vgpu(nvdev);
+	return ret;
+}
+
+static void
+nvidia_vgpu_vfio_close_device(struct vfio_device *vdev)
+{
+	struct vfio_pci_core_device *core_dev = vdev_to_core_dev(vdev);
+	struct nvidia_vgpu_vfio *nvdev = core_dev_to_nvdev(core_dev);
+	struct pci_dev *pdev = core_dev->pdev;
+
+	WARN_ON(destroy_vgpu(nvdev));
+
+	if (nvdev->bar0_map) {
+		iounmap(nvdev->bar0_map);
+		pci_release_selected_regions(pdev, 1 << 0);
+		nvdev->bar0_map = NULL;
+	}
+
+	pci_clear_master(pdev);
+	pci_disable_device(pdev);
+}
+
+static int
+get_region_info(struct vfio_pci_core_device *core_dev, unsigned long arg)
+{
+	struct nvidia_vgpu_vfio *nvdev = core_dev_to_nvdev(core_dev);
+	struct pci_dev *pdev = core_dev->pdev;
+	struct vfio_region_info info;
+	unsigned long minsz;
+	int ret = 0;
+
+	minsz = offsetofend(struct vfio_region_info, offset);
+	if (copy_from_user(&info, (void __user *)arg, minsz))
+		return -EFAULT;
+
+	if (info.argsz < minsz)
+		return -EINVAL;
+
+	switch (info.index) {
+	case VFIO_PCI_CONFIG_REGION_INDEX:
+		info.offset = VFIO_PCI_INDEX_TO_OFFSET(info.index);
+		info.size = PCI_CONFIG_SPACE_LENGTH;
+		info.flags = VFIO_REGION_INFO_FLAG_READ |
+			VFIO_REGION_INFO_FLAG_WRITE;
+		break;
+
+	case VFIO_PCI_BAR0_REGION_INDEX ... VFIO_PCI_BAR4_REGION_INDEX: {
+		struct vfio_info_cap caps = { .buf = NULL, .size = 0 };
+
+		info.offset = VFIO_PCI_INDEX_TO_OFFSET(info.index);
+		info.size = pci_resource_len(pdev, info.index);
+
+		if (info.index == VFIO_PCI_BAR1_REGION_INDEX)
+			info.size = nvdev->curr_vgpu_type->bar1Length * 1024 * 1024;
+
+		if (!info.size) {
+			info.flags = 0;
+			break;
+		}
+		info.flags = VFIO_REGION_INFO_FLAG_READ |
+			VFIO_REGION_INFO_FLAG_WRITE |
+			VFIO_REGION_INFO_FLAG_MMAP;
+
+		if (caps.size) {
+			info.flags |= VFIO_REGION_INFO_FLAG_CAPS;
+			if (info.argsz < sizeof(info) + caps.size) {
+				info.argsz = sizeof(info) + caps.size;
+				info.cap_offset = 0;
+			} else {
+				vfio_info_cap_shift(&caps, sizeof(info));
+				if (copy_to_user((void __user *)arg +
+							sizeof(info), caps.buf,
+							caps.size)) {
+					kfree(caps.buf);
+					ret = -EFAULT;
+					break;
+				}
+				info.cap_offset = sizeof(info);
+			}
+			kfree(caps.buf);
+		}
+		break;
+	}
+	case VFIO_PCI_BAR5_REGION_INDEX:
+	case VFIO_PCI_ROM_REGION_INDEX:
+	case VFIO_PCI_VGA_REGION_INDEX:
+		info.size = 0;
+		break;
+
+	default:
+		if (info.index >= VFIO_PCI_NUM_REGIONS)
+			ret = -EINVAL;
+		break;
+	}
+
+	if (!ret)
+		ret = copy_to_user((void __user *)arg, &info, minsz) ? -EFAULT : 0;
+
+	return ret;
+}
+
+static long nvidia_vgpu_vfio_ioctl(struct vfio_device *vdev,
+				   unsigned int cmd,
+				   unsigned long arg)
+{
+	struct vfio_pci_core_device *core_dev = vdev_to_core_dev(vdev);
+	struct nvidia_vgpu_vfio *nvdev = core_dev_to_nvdev(core_dev);
+	int ret = 0;
+
+	if (!nvdev->curr_vgpu_type)
+		return -ENODEV;
+
+	switch (cmd) {
+	case VFIO_DEVICE_GET_REGION_INFO:
+		ret = get_region_info(core_dev, arg);
+		break;
+	case VFIO_DEVICE_GET_PCI_HOT_RESET_INFO:
+	case VFIO_DEVICE_PCI_HOT_RESET:
+	case VFIO_DEVICE_RESET:
+		break;
+
+	default:
+		ret = vfio_pci_core_ioctl(vdev, cmd, arg);
+		break;
+	}
+
+	return ret;
+}
+
+static ssize_t nvidia_vgpu_vfio_read(struct vfio_device *vdev,
+				     char __user *buf, size_t count,
+				     loff_t *ppos)
+{
+	struct vfio_pci_core_device *core_dev = vdev_to_core_dev(vdev);
+	struct nvidia_vgpu_vfio *nvdev = core_dev_to_nvdev(core_dev);
+	u64 val;
+	size_t done = 0;
+	int ret = 0, size;
+
+	if (!nvdev->curr_vgpu_type)
+		return -ENODEV;
+
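+	/*
+	 * Split the access into naturally aligned 4-, 2- or 1-byte chunks so
+	 * that the emulated config space and BAR0 handlers only ever see
+	 * aligned accesses.
+	 */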
+	while (count) {
+		if (count >= 4 && !(*ppos % 4))
+			size = 4;
+		else if (count >= 2 && !(*ppos % 2))
+			size = 2;
+		else
+			size = 1;
+
+		ret = nvidia_vgpu_vfio_access(nvdev, (char *)&val, size, *ppos, false);
+
+		if (ret <= 0)
+			return ret;
+
+		if (copy_to_user(buf, &val, size) != 0)
+			return -EFAULT;
+
+		*ppos += size;
+		buf += size;
+		count -= size;
+		done += size;
+	}
+
+	return done;
+}
+
+static ssize_t nvidia_vgpu_vfio_write(struct vfio_device *vdev,
+				      const char __user *buf, size_t count,
+				      loff_t *ppos)
+{
+	struct vfio_pci_core_device *core_dev = vdev_to_core_dev(vdev);
+	struct nvidia_vgpu_vfio *nvdev = core_dev_to_nvdev(core_dev);
+	u64 val;
+	size_t done = 0;
+	int ret = 0, size;
+
+	if (!nvdev->curr_vgpu_type)
+		return -ENODEV;
+
+	while (count) {
+		if (count >= 4 && !(*ppos % 4))
+			size = 4;
+		else if (count >= 2 && !(*ppos % 2))
+			size = 2;
+		else
+			size = 1;
+
+		if (copy_from_user(&val, buf, size) != 0)
+			return -EFAULT;
+
+		ret = nvidia_vgpu_vfio_access(nvdev, (char *)&val, size, *ppos, true);
+
+		if (ret <= 0)
+			return ret;
+
+		*ppos += size;
+		buf += size;
+		count -= size;
+		done += size;
+	}
+
+	return done;
+}
+
+static int nvidia_vgpu_vfio_mmap(struct vfio_device *vdev,
+				 struct vm_area_struct *vma)
+{
+	struct vfio_pci_core_device *core_dev = vdev_to_core_dev(vdev);
+	struct nvidia_vgpu_vfio *nvdev = core_dev_to_nvdev(core_dev);
+	struct pci_dev *pdev = core_dev->pdev;
+	u64 phys_len, req_len, pgoff, req_start;
+	unsigned int index;
+
+	if (!nvdev->curr_vgpu_type)
+		return -ENODEV;
+
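+	/*
+	 * The VFIO region index is encoded in the upper bits of vm_pgoff
+	 * (see VFIO_PCI_OFFSET_SHIFT); the remaining bits are the page
+	 * offset into that BAR.
+	 */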
+	index = vma->vm_pgoff >> (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT);
+
+	if (index >= VFIO_PCI_BAR5_REGION_INDEX)
+		return -EINVAL;
+	if (vma->vm_end < vma->vm_start)
+		return -EINVAL;
+	if ((vma->vm_flags & VM_SHARED) == 0)
+		return -EINVAL;
+
+	phys_len = PAGE_ALIGN(pci_resource_len(pdev, index));
+	req_len = vma->vm_end - vma->vm_start;
+	pgoff = vma->vm_pgoff &
+		((1U << (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT)) - 1);
+	req_start = pgoff << PAGE_SHIFT;
+
+	if (req_len == 0)
+		return -EINVAL;
+
+	if ((req_start + req_len > phys_len) || (phys_len == 0))
+		return -EINVAL;
+
+	vma->vm_private_data = vdev;
+	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+	vma->vm_pgoff = (pci_resource_start(pdev, index) >> PAGE_SHIFT) + pgoff;
+	vm_flags_set(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP);
+
+	return remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff, req_len, vma->vm_page_prot);
+}
+
+static const struct vfio_device_ops nvidia_vgpu_vfio_ops = {
+	.name           = "nvidia-vgpu-vfio-pci",
+	.init		= vfio_pci_core_init_dev,
+	.release	= vfio_pci_core_release_dev,
+	.open_device    = nvidia_vgpu_vfio_open_device,
+	.close_device   = nvidia_vgpu_vfio_close_device,
+	.ioctl          = nvidia_vgpu_vfio_ioctl,
+	.device_feature = vfio_pci_core_ioctl_feature,
+	.read           = nvidia_vgpu_vfio_read,
+	.write          = nvidia_vgpu_vfio_write,
+	.mmap           = nvidia_vgpu_vfio_mmap,
+	.request	= vfio_pci_core_request,
+	.match		= vfio_pci_core_match,
+	.bind_iommufd	= vfio_iommufd_physical_bind,
+	.unbind_iommufd	= vfio_iommufd_physical_unbind,
+	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
+	.detach_ioas	= vfio_iommufd_physical_detach_ioas,
+};
+
+static int setup_vgpu_type(struct nvidia_vgpu_vfio *nvdev)
+{
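+	/*
+	 * The vGPU type is hard-coded for this RFC; 869 is assumed to be a
+	 * vGPU type ID reported by the GSP firmware on the tested board. A
+	 * later revision would let userspace select the type.
+	 */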
+	nvdev->curr_vgpu_type = find_vgpu_type(nvdev, 869);
+	if (!nvdev->curr_vgpu_type)
+		return -ENODEV;
+	return 0;
+}
+
+static int nvidia_vgpu_vfio_probe(struct pci_dev *pdev,
+				  const struct pci_device_id *id_table)
+{
+	struct nvidia_vgpu_vfio *nvdev;
+	int ret;
+
+	if (!pdev->is_virtfn)
+		return -EINVAL;
+
+	nvdev = vfio_alloc_device(nvidia_vgpu_vfio, core_dev.vdev,
+				  &pdev->dev, &nvidia_vgpu_vfio_ops);
+	if (IS_ERR(nvdev))
+		return PTR_ERR(nvdev);
+
+	ret = attach_vgpu_mgr(nvdev, pdev);
+	if (ret)
+		goto err_attach_vgpu_mgr;
+
+	ret = setup_vgpu_type(nvdev);
+	if (ret)
+		goto err_setup_vgpu_type;
+
+	nvidia_vgpu_vfio_setup_config(nvdev);
+
+	dev_set_drvdata(&pdev->dev, &nvdev->core_dev);
+
+	ret = vfio_pci_core_register_device(&nvdev->core_dev);
+	if (ret)
+		goto err_setup_vgpu_type;
+
+	return 0;
+
+err_setup_vgpu_type:
+	detach_vgpu_mgr(nvdev);
+
+err_attach_vgpu_mgr:
+	vfio_put_device(&nvdev->core_dev.vdev);
+
+	pci_err(pdev, "VF probe failed with ret: %d\n", ret);
+	return ret;
+}
+
+static void nvidia_vgpu_vfio_remove(struct pci_dev *pdev)
+{
+	struct vfio_pci_core_device *core_dev = dev_get_drvdata(&pdev->dev);
+	struct nvidia_vgpu_vfio *nvdev = core_dev_to_nvdev(core_dev);
+
+	vfio_pci_core_unregister_device(core_dev);
+	detach_vgpu_mgr(nvdev);
+	vfio_put_device(&core_dev->vdev);
+}
+
+struct pci_device_id nvidia_vgpu_vfio_table[] = {
+	{
+		.vendor      = PCI_VENDOR_ID_NVIDIA,
+		.device      = PCI_ANY_ID,
+		.subvendor   = PCI_ANY_ID,
+		.subdevice   = PCI_ANY_ID,
+		.class       = (PCI_CLASS_DISPLAY_3D << 8),
+		.class_mask  = ~0,
+	},
+	{ }
+};
+MODULE_DEVICE_TABLE(pci, nvidia_vgpu_vfio_table);
+
+struct pci_driver nvidia_vgpu_vfio_driver = {
+	.name               = "nvidia-vgpu-vfio",
+	.id_table           = nvidia_vgpu_vfio_table,
+	.probe              = nvidia_vgpu_vfio_probe,
+	.remove             = nvidia_vgpu_vfio_remove,
+	.driver_managed_dma = true,
+};
+
+module_pci_driver(nvidia_vgpu_vfio_driver);
+
+MODULE_LICENSE("Dual MIT/GPL");
+MODULE_AUTHOR("Vinay Kabra <vkabra@nvidia.com>");
+MODULE_AUTHOR("Kirti Wankhede <kwankhede@nvidia.com>");
+MODULE_AUTHOR("Zhi Wang <zhiw@nvidia.com>");
+MODULE_DESCRIPTION("NVIDIA vGPU VFIO Variant Driver - User Level driver for NVIDIA vGPU");
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu.c b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
index 93d27db30a41..003ca116b4a8 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu.c
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu.c
@@ -328,3 +328,25 @@ int nvidia_vgpu_mgr_create_vgpu(struct nvidia_vgpu *vgpu, u8 *vgpu_type)
 	return ret;
 }
 EXPORT_SYMBOL(nvidia_vgpu_mgr_create_vgpu);
+
+static int update_bme_state(struct nvidia_vgpu *vgpu)
+{
+	NV_VGPU_CPU_RPC_DATA_UPDATE_BME_STATE params = {0};
+
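+	/*
+	 * Ask the firmware side (presumably the GSP vGPU plugin) to enable
+	 * bus mastering (BME) for this vGPU over the host RPC channel.
+	 */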
+	params.enable = true;
+
+	return nvidia_vgpu_rpc_call(vgpu, NV_VGPU_CPU_RPC_MSG_UPDATE_BME_STATE,
+				    &params, sizeof(params));
+}
+
+/**
+ * nvidia_vgpu_mgr_enable_bme - handle the BME (Bus Master Enable) sequence
+ * @vgpu: the vGPU instance
+ *
+ * Returns: 0 on success, others on failure.
+ */
+int nvidia_vgpu_mgr_enable_bme(struct nvidia_vgpu *vgpu)
+{
+	return update_bme_state(vgpu);
+}
+EXPORT_SYMBOL(nvidia_vgpu_mgr_enable_bme);
diff --git a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
index af922d8e539c..2c9e0eebcb99 100644
--- a/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
+++ b/drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
@@ -84,6 +84,6 @@ int nvidia_vgpu_rpc_call(struct nvidia_vgpu *vgpu, u32 msg_type,
 void nvidia_vgpu_clean_rpc(struct nvidia_vgpu *vgpu);
 int nvidia_vgpu_setup_rpc(struct nvidia_vgpu *vgpu);
 
-int nvidia_vgpu_mgr_reset_vgpu(struct nvidia_vgpu *vgpu);
+int nvidia_vgpu_mgr_enable_bme(struct nvidia_vgpu *vgpu);
 
 #endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (28 preceding siblings ...)
  2024-09-22 12:49 ` [RFC 29/29] vfio/vgpu_mgr: introduce NVIDIA vGPU VFIO variant driver Zhi Wang
@ 2024-09-22 13:11 ` Zhi Wang
  2024-09-23  8:38   ` Danilo Krummrich
  2024-09-23  6:22 ` Tian, Kevin
  2024-09-23  8:49 ` Danilo Krummrich
  31 siblings, 1 reply; 86+ messages in thread
From: Zhi Wang @ 2024-09-22 13:11 UTC (permalink / raw)
  To: kvm, nouveau
  Cc: alex.williamson, kevin.tian, jgg, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiwang, bskeggs

On Sun, 22 Sep 2024 05:49:22 -0700
Zhi Wang <zhiw@nvidia.com> wrote:

+Ben.

Forgot to add you. My bad.
 

> 1. Background
> =============
> 
> NVIDIA vGPU[1] software enables powerful GPU performance for workloads
> ranging from graphics-rich virtual workstations to data science and
> AI, enabling IT to leverage the management and security benefits of
> virtualization as well as the performance of NVIDIA GPUs required for
> modern workloads. Installed on a physical GPU in a cloud or enterprise
> data center server, NVIDIA vGPU software creates virtual GPUs that can
> be shared across multiple virtual machines.
> 
> The vGPU architecture[2] can be illustrated as follow:
> 
> [vGPU architecture diagram omitted here; the ASCII art was mangled by
> line re-wrapping in this reply. See the original cover letter for the
> readable version.]
> 
> Each NVIDIA vGPU is analogous to a conventional GPU, having a fixed
> amount of GPU framebuffer, and one or more virtual display outputs or
> "heads". The vGPU’s framebuffer is allocated out of the physical
> GPU’s framebuffer at the time the vGPU is created, and the vGPU
> retains exclusive use of that framebuffer until it is destroyed.
> 
> The number of physical GPUs that a board has depends on the board.
> Each physical GPU can support several different types of virtual GPU
> (vGPU). vGPU types have a fixed amount of frame buffer, number of
> supported display heads, and maximum resolutions. They are grouped
> into different series according to the different classes of workload
> for which they are optimized. Each series is identified by the last
> letter of the vGPU type name.
> 
> NVIDIA vGPU supports Windows and Linux guest VM operating systems. The
> supported vGPU types depend on the guest VM OS.
> 
> 2. Proposal for upstream
> ========================
> 
> 2.1 Architecture
> ----------------
> 
> Moving to the upstream, the proposed architecture can be illustrated
> as followings:
> 
> [Proposed upstream architecture diagram omitted here; the ASCII art was
> mangled by line re-wrapping in this reply. See the original cover letter
> for the readable version.]
> 
> The supported GPU generations will be Ada which come with the
> supported GPU architecture. Each vGPU is backed by a PCI virtual
> function.
> 
> The NVIDIA vGPU VFIO module together with VFIO sits on VFs, provides
> extended management and features, e.g. selecting the vGPU types,
> support live migration and driver warm update.
> 
> Like other devices that VFIO supports, VFIO provides the standard
> userspace APIs for device lifecycle management and advance feature
> support.
> 
> The NVIDIA vGPU manager provides necessary support to the NVIDIA vGPU
> VFIO variant driver to create/destroy vGPUs, query available vGPU
> types, select the vGPU type, etc.
> 
> On the other side, NVIDIA vGPU manager talks to the NVIDIA GPU core
> driver, which provide necessary support to reach the HW functions.
> 
> 2.2 Requirements to the NVIDIA GPU core driver
> ----------------------------------------------
> 
> The primary use case of CSP and enterprise is a standalone minimal
> drivers of vGPU manager and other necessary components.
> 
> NVIDIA vGPU manager talks to the NVIDIA GPU core driver, which provide
> necessary support to:
> 
> - Load the GSP firmware, boot the GSP, provide commnication channel.
> - Manage the shared/partitioned HW resources. E.g. reserving FB
> memory, channels for the vGPU mananger to create vGPUs.
> - Exception handling. E.g. delivering the GSP events to vGPU manager.
> - Host event dispatch. E.g. suspend/resume.
> - Enumerations of HW configuration.
> 
> The NVIDIA GPU core driver, which sits on the PCI device interface of
> NVIDIA GPU, provides support to both DRM driver and the vGPU manager.
> 
> In this RFC, the split nouveau GPU driver[3] is used as an example to
> demostrate the requirements of vGPU manager to the core driver. The
> nouveau driver is split into nouveau (the DRM driver) and nvkm (the
> core driver).
> 
> 3 Try the RFC patches
> -----------------------
> 
> The RFC supports to create one VM to test the simple GPU workload.
> 
> - Host kernel:
> https://github.com/zhiwang-nvidia/linux/tree/zhi/vgpu-mgr-rfc
> - Guest driver package: NVIDIA-Linux-x86_64-535.154.05.run [4]
> 
>   Install guest driver:
>   # export GRID_BUILD=1
>   # ./NVIDIA-Linux-x86_64-535.154.05.run
> 
> - Tested platforms: L40.
> - Tested guest OS: Ubutnu 24.04 LTS.
> - Supported experience: Linux rich desktop experience with simple 3D
>   workload, e.g. glmark2
> 
> 4 Demo
> ------
> 
> A demo video can be found at: https://youtu.be/YwgIvvk-V94
> 
> [1] https://www.nvidia.com/en-us/data-center/virtual-solutions/
> [2]
> https://docs.nvidia.com/vgpu/17.0/grid-vgpu-user-guide/index.html#architecture-grid-vgpu
> [3]
> https://lore.kernel.org/dri-devel/20240613170211.88779-1-bskeggs@nvidia.com/T/
> [4]
> https://us.download.nvidia.com/XFree86/Linux-x86_64/535.154.05/NVIDIA-Linux-x86_64-535.154.05.run
> 
> Zhi Wang (29):
>   nvkm/vgpu: introduce NVIDIA vGPU support prelude
>   nvkm/vgpu: attach to nvkm as a nvkm client
>   nvkm/vgpu: reserve a larger GSP heap when NVIDIA vGPU is enabled
>   nvkm/vgpu: set the VF partition count when NVIDIA vGPU is enabled
>   nvkm/vgpu: populate GSP_VF_INFO when NVIDIA vGPU is enabled
>   nvkm/vgpu: set RMSetSriovMode when NVIDIA vGPU is enabled
>   nvkm/gsp: add a notify handler for GSP event
>     GPUACCT_PERFMON_UTIL_SAMPLES
>   nvkm/vgpu: get the size VMMU segment from GSP firmware
>   nvkm/vgpu: introduce the reserved channel allocator
>   nvkm/vgpu: introduce interfaces for NVIDIA vGPU VFIO module
>   nvkm/vgpu: introduce GSP RM client alloc and free for vGPU
>   nvkm/vgpu: introduce GSP RM control interface for vGPU
>   nvkm: move chid.h to nvkm/engine.
>   nvkm/vgpu: introduce channel allocation for vGPU
>   nvkm/vgpu: introduce FB memory allocation for vGPU
>   nvkm/vgpu: introduce BAR1 map routines for vGPUs
>   nvkm/vgpu: introduce engine bitmap for vGPU
>   nvkm/vgpu: introduce pci_driver.sriov_configure() in nvkm
>   vfio/vgpu_mgr: introdcue vGPU lifecycle management prelude
>   vfio/vgpu_mgr: allocate GSP RM client for NVIDIA vGPU manager
>   vfio/vgpu_mgr: introduce vGPU type uploading
>   vfio/vgpu_mgr: allocate vGPU FB memory when creating vGPUs
>   vfio/vgpu_mgr: allocate vGPU channels when creating vGPUs
>   vfio/vgpu_mgr: allocate mgmt heap when creating vGPUs
>   vfio/vgpu_mgr: map mgmt heap when creating a vGPU
>   vfio/vgpu_mgr: allocate GSP RM client when creating vGPUs
>   vfio/vgpu_mgr: bootload the new vGPU
>   vfio/vgpu_mgr: introduce vGPU host RPC channel
>   vfio/vgpu_mgr: introduce NVIDIA vGPU VFIO variant driver
> 
>  .../drm/nouveau/include/nvkm/core/device.h    |   3 +
>  .../drm/nouveau/include/nvkm/engine/chid.h    |  29 +
>  .../gpu/drm/nouveau/include/nvkm/subdev/gsp.h |   1 +
>  .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  45 ++
>  .../nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h    |  12 +
>  drivers/gpu/drm/nouveau/nvkm/Kbuild           |   1 +
>  drivers/gpu/drm/nouveau/nvkm/device/pci.c     |  33 +-
>  .../gpu/drm/nouveau/nvkm/engine/fifo/chid.c   |  49 +-
>  .../gpu/drm/nouveau/nvkm/engine/fifo/chid.h   |  26 +-
>  .../gpu/drm/nouveau/nvkm/engine/fifo/r535.c   |   3 +
>  .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    |  14 +-
>  drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild  |   3 +
>  drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c  | 302 +++++++++++
>  .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c  | 234 ++++++++
>  drivers/vfio/pci/Kconfig                      |   2 +
>  drivers/vfio/pci/Makefile                     |   2 +
>  drivers/vfio/pci/nvidia-vgpu/Kconfig          |  13 +
>  drivers/vfio/pci/nvidia-vgpu/Makefile         |   8 +
>  drivers/vfio/pci/nvidia-vgpu/debug.h          |  18 +
>  .../nvidia/inc/ctrl/ctrl0000/ctrl0000system.h |  30 +
>  .../nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h    |  33 ++
>  .../ctrl/ctrl2080/ctrl2080vgpumgrinternal.h   | 152 ++++++
>  .../common/sdk/nvidia/inc/ctrl/ctrla081.h     | 109 ++++
>  .../nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h | 213 ++++++++
>  .../common/sdk/nvidia/inc/nv_vgpu_types.h     |  51 ++
>  .../common/sdk/vmioplugin/inc/vmioplugin.h    |  26 +
>  .../pci/nvidia-vgpu/include/nvrm/nvtypes.h    |  24 +
>  drivers/vfio/pci/nvidia-vgpu/nvkm.h           |  94 ++++
>  drivers/vfio/pci/nvidia-vgpu/rpc.c            | 242 +++++++++
>  drivers/vfio/pci/nvidia-vgpu/vfio.h           |  43 ++
>  drivers/vfio/pci/nvidia-vgpu/vfio_access.c    | 297 ++++++++++
>  drivers/vfio/pci/nvidia-vgpu/vfio_main.c      | 511
> ++++++++++++++++++ drivers/vfio/pci/nvidia-vgpu/vgpu.c           |
> 352 ++++++++++++ drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c       | 144
> +++++ drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h       |  89 +++
>  drivers/vfio/pci/nvidia-vgpu/vgpu_types.c     | 466 ++++++++++++++++
>  include/drm/nvkm_vgpu_mgr_vfio.h              |  61 +++
>  37 files changed, 3702 insertions(+), 33 deletions(-)
>  create mode 100644 drivers/gpu/drm/nouveau/include/nvkm/engine/chid.h
>  create mode 100644
> drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h create mode
> 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild create mode
> 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c create mode
> 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c create mode
> 100644 drivers/vfio/pci/nvidia-vgpu/Kconfig create mode 100644
> drivers/vfio/pci/nvidia-vgpu/Makefile create mode 100644
> drivers/vfio/pci/nvidia-vgpu/debug.h create mode 100644
> drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl0000/ctrl0000system.h
> create mode 100644
> drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h
> create mode 100644
> drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h
> create mode 100644
> drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrla081.h
> create mode 100644
> drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h
> create mode 100644
> drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/nv_vgpu_types.h
> create mode 100644
> drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/vmioplugin/inc/vmioplugin.h
> create mode 100644
> drivers/vfio/pci/nvidia-vgpu/include/nvrm/nvtypes.h create mode
> 100644 drivers/vfio/pci/nvidia-vgpu/nvkm.h create mode 100644
> drivers/vfio/pci/nvidia-vgpu/rpc.c create mode 100644
> drivers/vfio/pci/nvidia-vgpu/vfio.h create mode 100644
> drivers/vfio/pci/nvidia-vgpu/vfio_access.c create mode 100644
> drivers/vfio/pci/nvidia-vgpu/vfio_main.c create mode 100644
> drivers/vfio/pci/nvidia-vgpu/vgpu.c create mode 100644
> drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c create mode 100644
> drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h create mode 100644
> drivers/vfio/pci/nvidia-vgpu/vgpu_types.c create mode 100644
> include/drm/nvkm_vgpu_mgr_vfio.h
> 


^ permalink raw reply	[flat|nested] 86+ messages in thread

* RE: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (29 preceding siblings ...)
  2024-09-22 13:11 ` [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
@ 2024-09-23  6:22 ` Tian, Kevin
  2024-09-23 15:02   ` Jason Gunthorpe
  2024-09-23  8:49 ` Danilo Krummrich
  31 siblings, 1 reply; 86+ messages in thread
From: Tian, Kevin @ 2024-09-23  6:22 UTC (permalink / raw)
  To: Zhi Wang, kvm@vger.kernel.org, nouveau@lists.freedesktop.org
  Cc: alex.williamson@redhat.com, jgg@nvidia.com, airlied@gmail.com,
	daniel@ffwll.ch, Currid, Andy, cjia@nvidia.com, smitra@nvidia.com,
	ankita@nvidia.com, aniketa@nvidia.com, kwankhede@nvidia.com,
	targupta@nvidia.com, zhiwang@kernel.org

> From: Zhi Wang <zhiw@nvidia.com>
> Sent: Sunday, September 22, 2024 8:49 PM
> 
[...]
> 
> The NVIDIA vGPU VFIO module together with VFIO sits on VFs, provides
> extended management and features, e.g. selecting the vGPU types, support
> live migration and driver warm update.
> 
> Like other devices that VFIO supports, VFIO provides the standard
> userspace APIs for device lifecycle management and advance feature
> support.
> 
> The NVIDIA vGPU manager provides necessary support to the NVIDIA vGPU VFIO
> variant driver to create/destroy vGPUs, query available vGPU types, select
> the vGPU type, etc.
> 
> On the other side, NVIDIA vGPU manager talks to the NVIDIA GPU core driver,
> which provide necessary support to reach the HW functions.
> 

I'm not sure VFIO is the right place to host the NVIDIA vGPU manager. 
It's very NVIDIA specific and would naturally fit in the PF driver.

The VFIO side should focus on what's necessary for managing userspace
access to the VF hw, i.e. patch 29.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-22 13:11 ` [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
@ 2024-09-23  8:38   ` Danilo Krummrich
  2024-09-24 19:49     ` Zhi Wang
  0 siblings, 1 reply; 86+ messages in thread
From: Danilo Krummrich @ 2024-09-23  8:38 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, nouveau, alex.williamson, kevin.tian, jgg, airlied, daniel,
	acurrid, cjia, smitra, ankita, aniketa, kwankhede, targupta,
	zhiwang, bskeggs

On Sun, Sep 22, 2024 at 04:11:21PM +0300, Zhi Wang wrote:
> On Sun, 22 Sep 2024 05:49:22 -0700
> Zhi Wang <zhiw@nvidia.com> wrote:
> 
> +Ben.
> 
> Forget to add you. My bad. 

Please also add the driver maintainers!

I had to fetch the patchset from the KVM list, since it did not hit the
nouveau list (I'm trying to get @nvidia.com addresses whitelisted).

- Danilo

>  
> 
> > 1. Background
> > =============
> > 
> > NVIDIA vGPU[1] software enables powerful GPU performance for workloads
> > ranging from graphics-rich virtual workstations to data science and
> > AI, enabling IT to leverage the management and security benefits of
> > virtualization as well as the performance of NVIDIA GPUs required for
> > modern workloads. Installed on a physical GPU in a cloud or enterprise
> > data center server, NVIDIA vGPU software creates virtual GPUs that can
> > be shared across multiple virtual machines.
> > 
> > The vGPU architecture[2] can be illustrated as follow:
> > 
> > [vGPU architecture diagram omitted here; the ASCII art was mangled by
> > line re-wrapping. See the original cover letter for the readable
> > version.]
> > 
> > Each NVIDIA vGPU is analogous to a conventional GPU, having a fixed
> > amount of GPU framebuffer, and one or more virtual display outputs or
> > "heads". The vGPU’s framebuffer is allocated out of the physical
> > GPU’s framebuffer at the time the vGPU is created, and the vGPU
> > retains exclusive use of that framebuffer until it is destroyed.
> > 
> > The number of physical GPUs that a board has depends on the board.
> > Each physical GPU can support several different types of virtual GPU
> > (vGPU). vGPU types have a fixed amount of frame buffer, number of
> > supported display heads, and maximum resolutions. They are grouped
> > into different series according to the different classes of workload
> > for which they are optimized. Each series is identified by the last
> > letter of the vGPU type name.
> > 
> > NVIDIA vGPU supports Windows and Linux guest VM operating systems. The
> > supported vGPU types depend on the guest VM OS.
> > 
> > 2. Proposal for upstream
> > ========================
> > 
> > 2.1 Architecture
> > ----------------
> > 
> > Moving to the upstream, the proposed architecture can be illustrated
> > as followings:
> > 
> > [Proposed upstream architecture diagram omitted here; the ASCII art was
> > mangled by line re-wrapping. See the original cover letter for the
> > readable version.]
> > The supported GPU generations will be Ada which come with the
> > supported GPU architecture. Each vGPU is backed by a PCI virtual
> > function.
> > 
> > The NVIDIA vGPU VFIO module together with VFIO sits on VFs, provides
> > extended management and features, e.g. selecting the vGPU types,
> > support live migration and driver warm update.
> > 
> > Like other devices that VFIO supports, VFIO provides the standard
> > userspace APIs for device lifecycle management and advance feature
> > support.
> > 
> > The NVIDIA vGPU manager provides necessary support to the NVIDIA vGPU
> > VFIO variant driver to create/destroy vGPUs, query available vGPU
> > types, select the vGPU type, etc.
> > 
> > On the other side, NVIDIA vGPU manager talks to the NVIDIA GPU core
> > driver, which provide necessary support to reach the HW functions.
> > 
> > 2.2 Requirements to the NVIDIA GPU core driver
> > ----------------------------------------------
> > 
> > The primary use case of CSP and enterprise is a standalone minimal
> > drivers of vGPU manager and other necessary components.
> > 
> > NVIDIA vGPU manager talks to the NVIDIA GPU core driver, which provide
> > necessary support to:
> > 
> > - Load the GSP firmware, boot the GSP, provide commnication channel.
> > - Manage the shared/partitioned HW resources. E.g. reserving FB
> > memory, channels for the vGPU mananger to create vGPUs.
> > - Exception handling. E.g. delivering the GSP events to vGPU manager.
> > - Host event dispatch. E.g. suspend/resume.
> > - Enumerations of HW configuration.
> > 
> > The NVIDIA GPU core driver, which sits on the PCI device interface of
> > NVIDIA GPU, provides support to both DRM driver and the vGPU manager.
> > 
> > In this RFC, the split nouveau GPU driver[3] is used as an example to
> > demostrate the requirements of vGPU manager to the core driver. The
> > nouveau driver is split into nouveau (the DRM driver) and nvkm (the
> > core driver).
> > 
> > 3 Try the RFC patches
> > -----------------------
> > 
> > The RFC supports to create one VM to test the simple GPU workload.
> > 
> > - Host kernel:
> > https://github.com/zhiwang-nvidia/linux/tree/zhi/vgpu-mgr-rfc
> > - Guest driver package: NVIDIA-Linux-x86_64-535.154.05.run [4]
> > 
> >   Install guest driver:
> >   # export GRID_BUILD=1
> >   # ./NVIDIA-Linux-x86_64-535.154.05.run
> > 
> > - Tested platforms: L40.
> > - Tested guest OS: Ubutnu 24.04 LTS.
> > - Supported experience: Linux rich desktop experience with simple 3D
> >   workload, e.g. glmark2
> > 
> > 4 Demo
> > ------
> > 
> > A demo video can be found at: https://youtu.be/YwgIvvk-V94
> > 
> > [1] https://www.nvidia.com/en-us/data-center/virtual-solutions/
> > [2]
> > https://docs.nvidia.com/vgpu/17.0/grid-vgpu-user-guide/index.html#architecture-grid-vgpu
> > [3]
> > https://lore.kernel.org/dri-devel/20240613170211.88779-1-bskeggs@nvidia.com/T/
> > [4]
> > https://us.download.nvidia.com/XFree86/Linux-x86_64/535.154.05/NVIDIA-Linux-x86_64-535.154.05.run
> > 
> > Zhi Wang (29):
> >   nvkm/vgpu: introduce NVIDIA vGPU support prelude
> >   nvkm/vgpu: attach to nvkm as a nvkm client
> >   nvkm/vgpu: reserve a larger GSP heap when NVIDIA vGPU is enabled
> >   nvkm/vgpu: set the VF partition count when NVIDIA vGPU is enabled
> >   nvkm/vgpu: populate GSP_VF_INFO when NVIDIA vGPU is enabled
> >   nvkm/vgpu: set RMSetSriovMode when NVIDIA vGPU is enabled
> >   nvkm/gsp: add a notify handler for GSP event
> >     GPUACCT_PERFMON_UTIL_SAMPLES
> >   nvkm/vgpu: get the size VMMU segment from GSP firmware
> >   nvkm/vgpu: introduce the reserved channel allocator
> >   nvkm/vgpu: introduce interfaces for NVIDIA vGPU VFIO module
> >   nvkm/vgpu: introduce GSP RM client alloc and free for vGPU
> >   nvkm/vgpu: introduce GSP RM control interface for vGPU
> >   nvkm: move chid.h to nvkm/engine.
> >   nvkm/vgpu: introduce channel allocation for vGPU
> >   nvkm/vgpu: introduce FB memory allocation for vGPU
> >   nvkm/vgpu: introduce BAR1 map routines for vGPUs
> >   nvkm/vgpu: introduce engine bitmap for vGPU
> >   nvkm/vgpu: introduce pci_driver.sriov_configure() in nvkm
> >   vfio/vgpu_mgr: introdcue vGPU lifecycle management prelude
> >   vfio/vgpu_mgr: allocate GSP RM client for NVIDIA vGPU manager
> >   vfio/vgpu_mgr: introduce vGPU type uploading
> >   vfio/vgpu_mgr: allocate vGPU FB memory when creating vGPUs
> >   vfio/vgpu_mgr: allocate vGPU channels when creating vGPUs
> >   vfio/vgpu_mgr: allocate mgmt heap when creating vGPUs
> >   vfio/vgpu_mgr: map mgmt heap when creating a vGPU
> >   vfio/vgpu_mgr: allocate GSP RM client when creating vGPUs
> >   vfio/vgpu_mgr: bootload the new vGPU
> >   vfio/vgpu_mgr: introduce vGPU host RPC channel
> >   vfio/vgpu_mgr: introduce NVIDIA vGPU VFIO variant driver
> > 
> >  .../drm/nouveau/include/nvkm/core/device.h    |   3 +
> >  .../drm/nouveau/include/nvkm/engine/chid.h    |  29 +
> >  .../gpu/drm/nouveau/include/nvkm/subdev/gsp.h |   1 +
> >  .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  45 ++
> >  .../nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h    |  12 +
> >  drivers/gpu/drm/nouveau/nvkm/Kbuild           |   1 +
> >  drivers/gpu/drm/nouveau/nvkm/device/pci.c     |  33 +-
> >  .../gpu/drm/nouveau/nvkm/engine/fifo/chid.c   |  49 +-
> >  .../gpu/drm/nouveau/nvkm/engine/fifo/chid.h   |  26 +-
> >  .../gpu/drm/nouveau/nvkm/engine/fifo/r535.c   |   3 +
> >  .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    |  14 +-
> >  drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild  |   3 +
> >  drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c  | 302 +++++++++++
> >  .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c  | 234 ++++++++
> >  drivers/vfio/pci/Kconfig                      |   2 +
> >  drivers/vfio/pci/Makefile                     |   2 +
> >  drivers/vfio/pci/nvidia-vgpu/Kconfig          |  13 +
> >  drivers/vfio/pci/nvidia-vgpu/Makefile         |   8 +
> >  drivers/vfio/pci/nvidia-vgpu/debug.h          |  18 +
> >  .../nvidia/inc/ctrl/ctrl0000/ctrl0000system.h |  30 +
> >  .../nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h    |  33 ++
> >  .../ctrl/ctrl2080/ctrl2080vgpumgrinternal.h   | 152 ++++++
> >  .../common/sdk/nvidia/inc/ctrl/ctrla081.h     | 109 ++++
> >  .../nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h | 213 ++++++++
> >  .../common/sdk/nvidia/inc/nv_vgpu_types.h     |  51 ++
> >  .../common/sdk/vmioplugin/inc/vmioplugin.h    |  26 +
> >  .../pci/nvidia-vgpu/include/nvrm/nvtypes.h    |  24 +
> >  drivers/vfio/pci/nvidia-vgpu/nvkm.h           |  94 ++++
> >  drivers/vfio/pci/nvidia-vgpu/rpc.c            | 242 +++++++++
> >  drivers/vfio/pci/nvidia-vgpu/vfio.h           |  43 ++
> >  drivers/vfio/pci/nvidia-vgpu/vfio_access.c    | 297 ++++++++++
> >  drivers/vfio/pci/nvidia-vgpu/vfio_main.c      | 511
> > ++++++++++++++++++ drivers/vfio/pci/nvidia-vgpu/vgpu.c           |
> > 352 ++++++++++++ drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c       | 144
> > +++++ drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h       |  89 +++
> >  drivers/vfio/pci/nvidia-vgpu/vgpu_types.c     | 466 ++++++++++++++++
> >  include/drm/nvkm_vgpu_mgr_vfio.h              |  61 +++
> >  37 files changed, 3702 insertions(+), 33 deletions(-)
> >  create mode 100644 drivers/gpu/drm/nouveau/include/nvkm/engine/chid.h
> >  create mode 100644
> > drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h create mode
> > 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild create mode
> > 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c create mode
> > 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c create mode
> > 100644 drivers/vfio/pci/nvidia-vgpu/Kconfig create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/Makefile create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/debug.h create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl0000/ctrl0000system.h
> > create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h
> > create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h
> > create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrla081.h
> > create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h
> > create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/nv_vgpu_types.h
> > create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/vmioplugin/inc/vmioplugin.h
> > create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/include/nvrm/nvtypes.h create mode
> > 100644 drivers/vfio/pci/nvidia-vgpu/nvkm.h create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/rpc.c create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/vfio.h create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/vfio_access.c create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/vfio_main.c create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/vgpu.c create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h create mode 100644
> > drivers/vfio/pci/nvidia-vgpu/vgpu_types.c create mode 100644
> > include/drm/nvkm_vgpu_mgr_vfio.h
> > 
> 

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
                   ` (30 preceding siblings ...)
  2024-09-23  6:22 ` Tian, Kevin
@ 2024-09-23  8:49 ` Danilo Krummrich
  2024-09-23 15:01   ` Jason Gunthorpe
  31 siblings, 1 reply; 86+ messages in thread
From: Danilo Krummrich @ 2024-09-23  8:49 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, nouveau, alex.williamson, kevin.tian, jgg, airlied, daniel,
	acurrid, cjia, smitra, ankita, aniketa, kwankhede, targupta,
	zhiwang

Hi Zhi,

Thanks for the very detailed cover letter.

On Sun, Sep 22, 2024 at 05:49:22AM -0700, Zhi Wang wrote:
> 1. Background
> =============
> 
> NVIDIA vGPU[1] software enables powerful GPU performance for workloads
> ranging from graphics-rich virtual workstations to data science and AI,
> enabling IT to leverage the management and security benefits of
> virtualization as well as the performance of NVIDIA GPUs required for
> modern workloads. Installed on a physical GPU in a cloud or enterprise
> data center server, NVIDIA vGPU software creates virtual GPUs that can
> be shared across multiple virtual machines.
> 
> The vGPU architecture[2] can be illustrated as follow:
> 
>  +--------------------+    +--------------------+ +--------------------+ +--------------------+ 
>  | Hypervisor         |    | Guest VM           | | Guest VM           | | Guest VM           | 
>  |                    |    | +----------------+ | | +----------------+ | | +----------------+ | 
>  | +----------------+ |    | |Applications... | | | |Applications... | | | |Applications... | | 
>  | |  NVIDIA        | |    | +----------------+ | | +----------------+ | | +----------------+ | 
>  | |  Virtual GPU   | |    | +----------------+ | | +----------------+ | | +----------------+ | 
>  | |  Manager       | |    | |  Guest Driver  | | | |  Guest Driver  | | | |  Guest Driver  | | 
>  | +------^---------+ |    | +----------------+ | | +----------------+ | | +----------------+ | 
>  |        |           |    +---------^----------+ +----------^---------+ +----------^---------+ 
>  |        |           |              |                       |                      |           
>  |        |           +--------------+-----------------------+----------------------+---------+ 
>  |        |                          |                       |                      |         | 
>  |        |                          |                       |                      |         | 
>  +--------+--------------------------+-----------------------+----------------------+---------+ 
> +---------v--------------------------+-----------------------+----------------------+----------+
> | NVIDIA                  +----------v---------+ +-----------v--------+ +-----------v--------+ |
> | Physical GPU            |   Virtual GPU      | |   Virtual GPU      | |   Virtual GPU      | |
> |                         +--------------------+ +--------------------+ +--------------------+ |
> +----------------------------------------------------------------------------------------------+
> 
> Each NVIDIA vGPU is analogous to a conventional GPU, having a fixed amount
> of GPU framebuffer, and one or more virtual display outputs or "heads".
> The vGPU’s framebuffer is allocated out of the physical GPU’s framebuffer
> at the time the vGPU is created, and the vGPU retains exclusive use of
> that framebuffer until it is destroyed.
> 
> The number of physical GPUs that a board has depends on the board. Each
> physical GPU can support several different types of virtual GPU (vGPU).
> vGPU types have a fixed amount of frame buffer, number of supported
> display heads, and maximum resolutions. They are grouped into different
> series according to the different classes of workload for which they are
> optimized. Each series is identified by the last letter of the vGPU type
> name.
> 
> NVIDIA vGPU supports Windows and Linux guest VM operating systems. The
> supported vGPU types depend on the guest VM OS.
> 
> 2. Proposal for upstream
> ========================

What is the strategy in the mid / long term with this?

As you know, we're trying to move to Nova; the blockers around the device /
driver infrastructure have been resolved, so we're able to move forward. Besides
that, Dave has made great progress on the firmware abstraction side of things.

Is this more of a proof of concept? Do you plan to work on Nova in general and
vGPU support for Nova?

> 
> 2.1 Architecture
> ----------------
> 
> Moving to the upstream, the proposed architecture can be illustrated as followings:
> 
>                             +--------------------+ +--------------------+ +--------------------+ 
>                             | Linux VM           | | Windows VM         | | Guest VM           | 
>                             | +----------------+ | | +----------------+ | | +----------------+ | 
>                             | |Applications... | | | |Applications... | | | |Applications... | | 
>                             | +----------------+ | | +----------------+ | | +----------------+ | ... 
>                             | +----------------+ | | +----------------+ | | +----------------+ | 
>                             | |  Guest Driver  | | | |  Guest Driver  | | | |  Guest Driver  | | 
>                             | +----------------+ | | +----------------+ | | +----------------+ | 
>                             +---------^----------+ +----------^---------+ +----------^---------+ 
>                                       |                       |                      |           
>                            +--------------------------------------------------------------------+
>                            |+--------------------+ +--------------------+ +--------------------+|
>                            ||       QEMU         | |       QEMU         | |       QEMU         ||
>                            ||                    | |                    | |                    ||
>                            |+--------------------+ +--------------------+ +--------------------+|
>                            +--------------------------------------------------------------------+
>                                       |                       |                      |
> +-----------------------------------------------------------------------------------------------+
> |                           +----------------------------------------------------------------+  |
> |                           |                                VFIO                            |  |
> |                           |                                                                |  |
> | +-----------------------+ | +------------------------+  +---------------------------------+|  |
> | |  Core Driver vGPU     | | |                        |  |                                 ||  |
> | |       Support        <--->|                       <---->                                ||  |
> | +-----------------------+ | | NVIDIA vGPU Manager    |  | NVIDIA vGPU VFIO Variant Driver ||  |
> | |    NVIDIA GPU Core    | | |                        |  |                                 ||  |
> | |        Driver         | | +------------------------+  +---------------------------------+|  |
> | +--------^--------------+ +----------------------------------------------------------------+  |
> |          |                          |                       |                      |          |
> +-----------------------------------------------------------------------------------------------+
>            |                          |                       |                      |           
> +----------|--------------------------|-----------------------|----------------------|----------+
> |          v               +----------v---------+ +-----------v--------+ +-----------v--------+ |
> |  NVIDIA                  |       PCI VF       | |       PCI VF       | |       PCI VF       | |
> |  Physical GPU            |                    | |                    | |                    | |
> |                          |   (Virtual GPU)    | |   (Virtual GPU)    | |    (Virtual GPU)   | |
> |                          +--------------------+ +--------------------+ +--------------------+ |
> +-----------------------------------------------------------------------------------------------+
> 
> The supported GPU generations will be Ada which come with the supported
> GPU architecture. Each vGPU is backed by a PCI virtual function.
> 
> The NVIDIA vGPU VFIO module together with VFIO sits on VFs, provides
> extended management and features, e.g. selecting the vGPU types, support
> live migration and driver warm update.
> 
> Like other devices that VFIO supports, VFIO provides the standard
> userspace APIs for device lifecycle management and advance feature
> support.
> 
> The NVIDIA vGPU manager provides necessary support to the NVIDIA vGPU VFIO
> variant driver to create/destroy vGPUs, query available vGPU types, select
> the vGPU type, etc.
> 
> On the other side, the NVIDIA vGPU manager talks to the NVIDIA GPU core
> driver, which provides the necessary support to reach the HW functions.
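
For illustration only, a minimal C sketch of the kind of interface the vGPU
manager could expose to the VFIO variant driver for the lifecycle operations
described above; all names below are hypothetical and are not taken from the
patches (the series defines its own interfaces, e.g. in
include/drm/nvkm_vgpu_mgr_vfio.h per the diffstat):

  /* Hypothetical sketch only -- assumes <linux/types.h> and <linux/pci.h>. */
  struct nvidia_vgpu_mgr;               /* per-PF vGPU manager instance */

  struct nvidia_vgpu_type {
          u32 type_id;
          u64 fb_size;          /* FB carved out of the physical GPU */
          u32 num_heads;        /* virtual display heads */
  };

  /* enumerate the vGPU types the physical GPU can back */
  int nvidia_vgpu_mgr_get_types(struct nvidia_vgpu_mgr *mgr,
                                struct nvidia_vgpu_type *types, u32 *count);

  /* create a vGPU of the chosen type on a given VF */
  int nvidia_vgpu_mgr_create(struct nvidia_vgpu_mgr *mgr,
                             struct pci_dev *vf, u32 type_id);

  /* tear the vGPU down and release its FB memory and channels */
  void nvidia_vgpu_mgr_destroy(struct nvidia_vgpu_mgr *mgr,
                               struct pci_dev *vf);
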
> 
> 2.2 Requirements to the NVIDIA GPU core driver
> ----------------------------------------------
> 
> The primary use case of CSPs and enterprises is a standalone, minimal set
> of drivers consisting of the vGPU manager and other necessary components.
> 
> The NVIDIA vGPU manager talks to the NVIDIA GPU core driver, which provides
> the necessary support to:
> 
> - Load the GSP firmware, boot the GSP, and provide the communication
>   channel.
> - Manage the shared/partitioned HW resources, e.g. reserving FB memory and
>   channels for the vGPU manager to create vGPUs.
> - Exception handling, e.g. delivering GSP events to the vGPU manager.
> - Host event dispatch, e.g. suspend/resume.
> - Enumeration of the HW configuration.
> 
> The NVIDIA GPU core driver, which sits on the PCI device interface of the
> NVIDIA GPU, provides support to both the DRM driver and the vGPU manager.
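
As a rough illustration of the requirement list above, the core driver might
export helpers along these lines to the vGPU manager; the function names are
invented for this sketch and do not match the ones used in the series:

  /* Hypothetical sketch only -- assumes <linux/types.h>; names invented. */
  struct nvkm_core;                     /* the split core (nvkm) driver instance */

  /* reserve FB memory and channels out of the shared/partitioned HW */
  int nvkm_core_reserve_fb(struct nvkm_core *core, u64 size, u64 *offset);
  int nvkm_core_alloc_chids(struct nvkm_core *core, int count, int *base);

  /* send a command to the GSP over the communication channel and wait */
  int nvkm_core_gsp_rpc(struct nvkm_core *core, u32 function,
                        void *payload, u32 payload_size);

  /* let the vGPU manager subscribe to GSP events, e.g. perfmon samples */
  int nvkm_core_register_event(struct nvkm_core *core, u32 event,
                               void (*handler)(void *priv), void *priv);
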
> 
> In this RFC, the split nouveau GPU driver[3] is used as an example to
> demonstrate the requirements of the vGPU manager on the core driver. The
> nouveau driver is split into nouveau (the DRM driver) and nvkm (the core
> driver).
> 
> 3 Try the RFC patches
> -----------------------
> 
> The RFC supports creating one VM to test a simple GPU workload.
> 
> - Host kernel: https://github.com/zhiwang-nvidia/linux/tree/zhi/vgpu-mgr-rfc
> - Guest driver package: NVIDIA-Linux-x86_64-535.154.05.run [4]
> 
>   Install guest driver:
>   # export GRID_BUILD=1
>   # ./NVIDIA-Linux-x86_64-535.154.05.run
> 
> - Tested platforms: L40.
> - Tested guest OS: Ubuntu 24.04 LTS.
> - Supported experience: Linux rich desktop experience with simple 3D
>   workload, e.g. glmark2
> 
> 4 Demo
> ------
> 
> A demo video can be found at: https://youtu.be/YwgIvvk-V94
> 
> [1] https://www.nvidia.com/en-us/data-center/virtual-solutions/
> [2] https://docs.nvidia.com/vgpu/17.0/grid-vgpu-user-guide/index.html#architecture-grid-vgpu
> [3] https://lore.kernel.org/dri-devel/20240613170211.88779-1-bskeggs@nvidia.com/T/
> [4] https://us.download.nvidia.com/XFree86/Linux-x86_64/535.154.05/NVIDIA-Linux-x86_64-535.154.05.run
> 
> Zhi Wang (29):
>   nvkm/vgpu: introduce NVIDIA vGPU support prelude
>   nvkm/vgpu: attach to nvkm as a nvkm client
>   nvkm/vgpu: reserve a larger GSP heap when NVIDIA vGPU is enabled
>   nvkm/vgpu: set the VF partition count when NVIDIA vGPU is enabled
>   nvkm/vgpu: populate GSP_VF_INFO when NVIDIA vGPU is enabled
>   nvkm/vgpu: set RMSetSriovMode when NVIDIA vGPU is enabled
>   nvkm/gsp: add a notify handler for GSP event
>     GPUACCT_PERFMON_UTIL_SAMPLES
>   nvkm/vgpu: get the size VMMU segment from GSP firmware
>   nvkm/vgpu: introduce the reserved channel allocator
>   nvkm/vgpu: introduce interfaces for NVIDIA vGPU VFIO module
>   nvkm/vgpu: introduce GSP RM client alloc and free for vGPU
>   nvkm/vgpu: introduce GSP RM control interface for vGPU
>   nvkm: move chid.h to nvkm/engine.
>   nvkm/vgpu: introduce channel allocation for vGPU
>   nvkm/vgpu: introduce FB memory allocation for vGPU
>   nvkm/vgpu: introduce BAR1 map routines for vGPUs
>   nvkm/vgpu: introduce engine bitmap for vGPU
>   nvkm/vgpu: introduce pci_driver.sriov_configure() in nvkm
>   vfio/vgpu_mgr: introdcue vGPU lifecycle management prelude
>   vfio/vgpu_mgr: allocate GSP RM client for NVIDIA vGPU manager
>   vfio/vgpu_mgr: introduce vGPU type uploading
>   vfio/vgpu_mgr: allocate vGPU FB memory when creating vGPUs
>   vfio/vgpu_mgr: allocate vGPU channels when creating vGPUs
>   vfio/vgpu_mgr: allocate mgmt heap when creating vGPUs
>   vfio/vgpu_mgr: map mgmt heap when creating a vGPU
>   vfio/vgpu_mgr: allocate GSP RM client when creating vGPUs
>   vfio/vgpu_mgr: bootload the new vGPU
>   vfio/vgpu_mgr: introduce vGPU host RPC channel
>   vfio/vgpu_mgr: introduce NVIDIA vGPU VFIO variant driver
> 
>  .../drm/nouveau/include/nvkm/core/device.h    |   3 +
>  .../drm/nouveau/include/nvkm/engine/chid.h    |  29 +
>  .../gpu/drm/nouveau/include/nvkm/subdev/gsp.h |   1 +
>  .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  45 ++
>  .../nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h    |  12 +
>  drivers/gpu/drm/nouveau/nvkm/Kbuild           |   1 +
>  drivers/gpu/drm/nouveau/nvkm/device/pci.c     |  33 +-
>  .../gpu/drm/nouveau/nvkm/engine/fifo/chid.c   |  49 +-
>  .../gpu/drm/nouveau/nvkm/engine/fifo/chid.h   |  26 +-
>  .../gpu/drm/nouveau/nvkm/engine/fifo/r535.c   |   3 +
>  .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    |  14 +-
>  drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild  |   3 +
>  drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c  | 302 +++++++++++
>  .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c  | 234 ++++++++
>  drivers/vfio/pci/Kconfig                      |   2 +
>  drivers/vfio/pci/Makefile                     |   2 +
>  drivers/vfio/pci/nvidia-vgpu/Kconfig          |  13 +
>  drivers/vfio/pci/nvidia-vgpu/Makefile         |   8 +
>  drivers/vfio/pci/nvidia-vgpu/debug.h          |  18 +
>  .../nvidia/inc/ctrl/ctrl0000/ctrl0000system.h |  30 +
>  .../nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h    |  33 ++
>  .../ctrl/ctrl2080/ctrl2080vgpumgrinternal.h   | 152 ++++++
>  .../common/sdk/nvidia/inc/ctrl/ctrla081.h     | 109 ++++
>  .../nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h | 213 ++++++++
>  .../common/sdk/nvidia/inc/nv_vgpu_types.h     |  51 ++
>  .../common/sdk/vmioplugin/inc/vmioplugin.h    |  26 +
>  .../pci/nvidia-vgpu/include/nvrm/nvtypes.h    |  24 +
>  drivers/vfio/pci/nvidia-vgpu/nvkm.h           |  94 ++++
>  drivers/vfio/pci/nvidia-vgpu/rpc.c            | 242 +++++++++
>  drivers/vfio/pci/nvidia-vgpu/vfio.h           |  43 ++
>  drivers/vfio/pci/nvidia-vgpu/vfio_access.c    | 297 ++++++++++
>  drivers/vfio/pci/nvidia-vgpu/vfio_main.c      | 511 ++++++++++++++++++
>  drivers/vfio/pci/nvidia-vgpu/vgpu.c           | 352 ++++++++++++
>  drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c       | 144 +++++
>  drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h       |  89 +++
>  drivers/vfio/pci/nvidia-vgpu/vgpu_types.c     | 466 ++++++++++++++++
>  include/drm/nvkm_vgpu_mgr_vfio.h              |  61 +++
>  37 files changed, 3702 insertions(+), 33 deletions(-)
>  create mode 100644 drivers/gpu/drm/nouveau/include/nvkm/engine/chid.h
>  create mode 100644 drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
>  create mode 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/Kbuild
>  create mode 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vfio.c
>  create mode 100644 drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/Kconfig
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/Makefile
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/debug.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl0000/ctrl0000system.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080gpu.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrl2080/ctrl2080vgpumgrinternal.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/ctrl/ctrla081.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/dev_vgpu_gsp.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/nvidia/inc/nv_vgpu_types.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/common/sdk/vmioplugin/inc/vmioplugin.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/include/nvrm/nvtypes.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/nvkm.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/rpc.c
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/vfio.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/vfio_access.c
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/vfio_main.c
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/vgpu.c
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.c
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/vgpu_mgr.h
>  create mode 100644 drivers/vfio/pci/nvidia-vgpu/vgpu_types.c
>  create mode 100644 include/drm/nvkm_vgpu_mgr_vfio.h
> 
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-23  8:49 ` Danilo Krummrich
@ 2024-09-23 15:01   ` Jason Gunthorpe
  2024-09-23 22:50     ` Danilo Krummrich
  2024-09-26  9:14     ` Greg KH
  0 siblings, 2 replies; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-23 15:01 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Zhi Wang, kvm, nouveau, alex.williamson, kevin.tian, airlied,
	daniel, acurrid, cjia, smitra, ankita, aniketa, kwankhede,
	targupta, zhiwang

On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote:
> > 2. Proposal for upstream
> > ========================
> 
> What is the strategy in the mid / long term with this?
> 
> As you know, we're trying to move to Nova and the blockers with the device /
> driver infrastructure have been resolved and we're able to move forward. Besides
> that, Dave made great progress on the firmware abstraction side of things.
> 
> Is this more of a proof of concept? Do you plan to work on Nova in general and
> vGPU support for Nova?

This is intended to be a real product that customers would use; it is
not a proof of concept. There is a lot of demand for this kind of
simplified virtualization infrastructure on the host side. The series
here is the first attempt at making thin host infrastructure, and
Zhi/etc are doing it with an upstream-first approach.

From the VFIO side I would like to see something like this merged in the
nearish future, as it would bring a previously out-of-tree approach fully
in-tree using our modern infrastructure. This is a big win for
the VFIO world.

As a commercial product this will be backported extensively to many
old kernels and that is harder/impossible if it isn't exclusively in
C. So, I think nova needs to co-exist in some way.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-23  6:22 ` Tian, Kevin
@ 2024-09-23 15:02   ` Jason Gunthorpe
  2024-09-26  6:43     ` Tian, Kevin
  0 siblings, 1 reply; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-23 15:02 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Zhi Wang, kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, airlied@gmail.com, daniel@ffwll.ch,
	Currid, Andy, cjia@nvidia.com, smitra@nvidia.com,
	ankita@nvidia.com, aniketa@nvidia.com, kwankhede@nvidia.com,
	targupta@nvidia.com, zhiwang@kernel.org

On Mon, Sep 23, 2024 at 06:22:33AM +0000, Tian, Kevin wrote:
> > From: Zhi Wang <zhiw@nvidia.com>
> > Sent: Sunday, September 22, 2024 8:49 PM
> > 
> [...]
> > 
> > The NVIDIA vGPU VFIO module together with VFIO sits on VFs, provides
> > extended management and features, e.g. selecting the vGPU types, support
> > live migration and driver warm update.
> > 
> > Like other devices that VFIO supports, VFIO provides the standard
> > userspace APIs for device lifecycle management and advance feature
> > support.
> > 
> > The NVIDIA vGPU manager provides necessary support to the NVIDIA vGPU VFIO
> > variant driver to create/destroy vGPUs, query available vGPU types, select
> > the vGPU type, etc.
> > 
> > On the other side, NVIDIA vGPU manager talks to the NVIDIA GPU core driver,
> > which provide necessary support to reach the HW functions.
> > 
> 
> I'm not sure VFIO is the right place to host the NVIDIA vGPU manager. 
> It's very NVIDIA specific and naturally fit in the PF driver.

drm isn't a particularly logical place for that either :|

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-23 15:01   ` Jason Gunthorpe
@ 2024-09-23 22:50     ` Danilo Krummrich
  2024-09-24 16:41       ` Jason Gunthorpe
  2024-09-26  9:14     ` Greg KH
  1 sibling, 1 reply; 86+ messages in thread
From: Danilo Krummrich @ 2024-09-23 22:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Zhi Wang, kvm, nouveau, alex.williamson, kevin.tian, airlied,
	daniel, acurrid, cjia, smitra, ankita, aniketa, kwankhede,
	targupta, zhiwang

On Mon, Sep 23, 2024 at 12:01:40PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote:
> > > 2. Proposal for upstream
> > > ========================
> > 
> > What is the strategy in the mid / long term with this?
> > 
> > As you know, we're trying to move to Nova and the blockers with the device /
> > driver infrastructure have been resolved and we're able to move forward. Besides
> > that, Dave made great progress on the firmware abstraction side of things.
> > 
> > Is this more of a proof of concept? Do you plan to work on Nova in general and
> > vGPU support for Nova?
> 
> This is intended to be a real product that customers would use, it is
> not a proof of concept. There is alot of demand for this kind of
> simplified virtualization infrastructure in the host side.

I see...

> The series
> here is the first attempt at making thin host infrastructure and
> Zhi/etc are doing it with an upstream-first approach.

This is great!

> 
> From the VFIO side I would like to see something like this merged in
> nearish future as it would bring a previously out of tree approach to
> be fully intree using our modern infrastructure. This is a big win for
> the VFIO world.
> 
> As a commercial product this will be backported extensively to many
> old kernels and that is harder/impossible if it isn't exclusively in
> C. So, I think nova needs to co-exist in some way.

We'll surely not support two drivers for the same thing in the long term;
it neither makes sense, nor is it sustainable.

We have a lot of good reasons why we decided to move forward with Nova as a
successor of Nouveau for GSP-based GPUs in the long term -- I also just gave a
talk about this at LPC.

For the short/mid term I think it may be reasonable to start with Nouveau, but
this must be based on some agreements, for instance:

- take responsibility, e.g. commitment to help with the maintenance of some of
  NVKM / NVIDIA GPU core (or whatever we want to call it) within Nouveau
- commitment to help with Nova in general and, once applicable, move the vGPU
  parts over to Nova

But I think the very last one naturally happens if we stop further support for
new HW in Nouveau at some point.

> 
> Jason
> 

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-23 22:50     ` Danilo Krummrich
@ 2024-09-24 16:41       ` Jason Gunthorpe
  2024-09-24 19:56         ` Danilo Krummrich
  0 siblings, 1 reply; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-24 16:41 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Zhi Wang, kvm, nouveau, alex.williamson, kevin.tian, airlied,
	daniel, acurrid, cjia, smitra, ankita, aniketa, kwankhede,
	targupta, zhiwang

On Tue, Sep 24, 2024 at 12:50:55AM +0200, Danilo Krummrich wrote:

> > From the VFIO side I would like to see something like this merged in
> > nearish future as it would bring a previously out of tree approach to
> > be fully intree using our modern infrastructure. This is a big win for
> > the VFIO world.
> > 
> > As a commercial product this will be backported extensively to many
> > old kernels and that is harder/impossible if it isn't exclusively in
> > C. So, I think nova needs to co-exist in some way.
> 
> We'll surely not support two drivers for the same thing in the long term,
> neither does it make sense, nor is it sustainable.

What is being done here is the normal, correct kernel thing to
do: refactor the shared core code into a module and stick higher-level
stuff on top of it. Ideally Nova/Nouveau would exist as peers
implementing the DRM subsystem on this shared core infrastructure. We've
done this sort of thing before in other places in the kernel. It has
been proven to work well.

So, I'm not sure why you think there should be two drivers in the long
term? Do you have some technical reason why Nova can't fit into this
modular architecture?

Regardless, assuming Nova will eventually propose merging the duplicated
bootup code, I suggest it should be able to fully replace the C
code with a kconfig switch and provide C-compatible interfaces for
VFIO. When Rust is sufficiently mature we can consider a deprecation
schedule for the C version.

I agree duplication doesn't make sense, but if it is essential to make
everyone happy then we should do it to accommodate the ongoing Rust
experiment.

> We have a lot of good reasons why we decided to move forward with Nova as a
> successor of Nouveau for GSP-based GPUs in the long term -- I also just held a
> talk about this at LPC.

I know, but this series is adding a VFIO driver to the kernel, and a
complete Nova driver doesn't even exist yet. It is fine to think about
future plans, but let's not get too far ahead of ourselves here.

> For the short/mid term I think it may be reasonable to start with
> Nouveau, but this must be based on some agreements, for instance:
> 
> - take responsibility, e.g. commitment to help with maintainance with some of
>   NVKM / NVIDIA GPU core (or whatever we want to call it) within Nouveau

I fully expect NVIDIA teams to own this core driver and the VFIO parts. I
see there are no changes to the MAINTAINERS file in this RFC; that
will need to be corrected.

> - commitment to help with Nova in general and, once applicable, move the vGPU
>   parts over to Nova

I think you will get help with Nova based on its own merit, but I
don't like where you are going with this. Linus has had negative
things to say about this sort of cross-linking and I agree with
him. We should not be trying to extract unrelated promises on Nova as
a condition for progressing a VFIO series. :\

> But I think the very last one naturally happens if we stop further support for
> new HW in Nouveau at some point.

I expect the core code would continue to support new HW going forward
to support the VFIO driver, even if nouveau doesn't use it, until Rust
reaches full ecosystem readiness for the server space.

There are going to be a lot of users of this code, let's not rush to
harm them please.

Fortunately there is no use case for DRM and VFIO to coexist in a
hypervisor, so this does not turn into such a technical problem as in
most other dual-driver situations.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-23  8:38   ` Danilo Krummrich
@ 2024-09-24 19:49     ` Zhi Wang
  0 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-09-24 19:49 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, kevin.tian@intel.com, Jason Gunthorpe,
	airlied@gmail.com, daniel@ffwll.ch, Andy Currid, Neo Jia,
	Surath Mitra, Ankit Agrawal, Aniket Agashe, Kirti Wankhede,
	Tarun Gupta (SW-GPU), zhiwang@kernel.org, Ben Skeggs

On 23/09/2024 11.38, Danilo Krummrich wrote:
> External email: Use caution opening links or attachments
> 
> 
> On Sun, Sep 22, 2024 at 04:11:21PM +0300, Zhi Wang wrote:
>> On Sun, 22 Sep 2024 05:49:22 -0700
>> Zhi Wang <zhiw@nvidia.com> wrote:
>>
>> +Ben.
>>
>> Forget to add you. My bad.
> 
> Please also add the driver maintainers!
> 
> I had to fetch the patchset from the KVM list, since they did not hit the
> nouveau list (I'm trying to get @nvidia.com addresses whitelisted).
> 
> - Danilo
> 

My bad. Will do in the next iteration. Weird... never thought this could
happen since all my previous emails landed in the mailing list. Did you
see any email discussion in the thread on the nouveau list? Feel free to
let me know if I should send them again to the nouveau list. Maybe it is
also easier if you pull the patches from my tree.

Note that I will be on vacation until Oct 11th. Email replies might be
slow, but I will read the emails in the mailing list.

Thanks,
Zhi.
>>
>>
>>> [...]
>>


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-24 16:41       ` Jason Gunthorpe
@ 2024-09-24 19:56         ` Danilo Krummrich
  2024-09-24 22:52           ` Dave Airlie
  2024-09-25  0:53           ` Jason Gunthorpe
  0 siblings, 2 replies; 86+ messages in thread
From: Danilo Krummrich @ 2024-09-24 19:56 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Zhi Wang, kvm, nouveau, alex.williamson, kevin.tian, airlied,
	daniel, acurrid, cjia, smitra, ankita, aniketa, kwankhede,
	targupta, zhiwang

On Tue, Sep 24, 2024 at 01:41:51PM -0300, Jason Gunthorpe wrote:
> On Tue, Sep 24, 2024 at 12:50:55AM +0200, Danilo Krummrich wrote:
> 
> > > From the VFIO side I would like to see something like this merged in
> > > nearish future as it would bring a previously out of tree approach to
> > > be fully intree using our modern infrastructure. This is a big win for
> > > the VFIO world.
> > > 
> > > As a commercial product this will be backported extensively to many
> > > old kernels and that is harder/impossible if it isn't exclusively in
> > > C. So, I think nova needs to co-exist in some way.
> > 
> > We'll surely not support two drivers for the same thing in the long term,
> > neither does it make sense, nor is it sustainable.
> 
> What is being done here is the normal correct kernel thing to
> do. Refactor the shared core code into a module and stick higher level
> stuff on top of it. Ideally Nova/Nouveau would exist as peers
> implementing DRM subsystem on this shared core infrastructure. We've
> done this sort of thing before in other places in the kernel. It has
> been proven to work well.

So, that's where you have the wrong understanding of what we're working on: You
seem to think that Nova is just another DRM subsystem layer on top of the NVKM
parts (what you call the core driver) of Nouveau.

But the whole point of Nova is to replace the NVKM parts of Nouveau, since
that's where the problems we want to solve reside.

> 
> So, I'm not sure why you think there should be two drivers in the long
> term? Do you have some technical reason why Nova can't fit into this
> modular architecture?

Like I said above, the whole point of Nova is to be the core driver; the DRM
parts on top are more like "the icing on the cake".

> 
> Regardless, assuming Nova will eventually propose merging duplicated
> bootup code then I suggest it should be able to fully replace the C
> code with a kconfig switch and provide C compatible interfaces for
> VFIO. When Rust is sufficiently mature we can consider a deprecation
> schedule for the C version.
> 
> I agree duplication doesn't make sense, but if it is essential to make
> everyone happy then we should do it to accommodate the ongoing Rust
> experiment.
> 
> > We have a lot of good reasons why we decided to move forward with Nova as a
> > successor of Nouveau for GSP-based GPUs in the long term -- I also just held a
> > talk about this at LPC.
> 
> I know, but this series is adding a VFIO driver to the kernel, and a

I have no concerns regarding the VFIO driver; this is about the new features
that you intend to add to Nouveau.

> complete Nova driver doesn't even exist yet. It is fine to think about
> future plans, but let's not get too far ahead of ourselves here..

Well, that's true, but we can't just add new features to something that has been
agreed to be replaced without having a transition strategy for the successor.

> 
> > For the short/mid term I think it may be reasonable to start with
> > Nouveau, but this must be based on some agreements, for instance:
> > 
> > - take responsibility, e.g. commitment to help with maintainance with some of
> >   NVKM / NVIDIA GPU core (or whatever we want to call it) within Nouveau
> 
> I fully expect NVIDIA teams to own this core driver and VFIO parts. I
> see there are no changes to the MAINTAINERs file in this RFC, that
> will need to be corrected.

Well, I did not say to just take over the biggest part of Nouveau.

Currently - and please correct me if I'm wrong - you make it sound to me as if
you're not willing to respect the decisions that have been taken by Nouveau and
DRM maintainers.

> 
> > - commitment to help with Nova in general and, once applicable, move the vGPU
> >   parts over to Nova
> 
> I think you will get help with Nova based on its own merit, but I
> don't like where you are going with this. Linus has had negative
> things to say about this sort of cross-linking and I agree with
> him. We should not be trying to extract unrelated promises on Nova as
> a condition for progressing a VFIO series. :\

No cross-linking, no unrelated promises.

Again, we're working on a successor of Nouveau, and if we keep adding features to
Nouveau in the meantime, we have to have a strategy for the transition;
otherwise we're effectively just ignoring this decision.

So, I really need you to respect the fact that there has been a decision for a
successor and that this *is* in fact relevant for all major changes to Nouveau
as well.

Once you do this, we get the chance to work things out for the short/mid term
and for the long term and make everyone benefit.

I welcome that NVIDIA wants to move things upstream, and I'm absolutely willing
to collaborate and help with the use-cases and goals NVIDIA has. But it really
has to be a collaboration, and this starts with acknowledging the goals of *each
other*.

> 
> > But I think the very last one naturally happens if we stop further support for
> > new HW in Nouveau at some point.
> 
> I expect the core code would continue to support new HW going forward
> to support the VFIO driver, even if nouveau doesn't use it, until Rust
> reaches some full ecosystem readyness for the server space.

From an upstream perspective the kernel doesn't need to consider OOT drivers,
i.e. the guest driver.

This doesn't mean that we can't work something out for a seamless transition
though.

But again, this can only really work if we acknowledge the goals of each other.

> 
> There are going to be a lot of users of this code, let's not rush to
> harm them please.

Please abstain from this kind of unconstructive insinuation; it's ridiculous to
imply that upstream kernel developers and maintainers would harm the users of
NVIDIA GPUs.

> 
> Fortunately there is no use case for DRM and VFIO to coexist in a
> hypervisor, so this does not turn into a such a technical problem like
> most other dual-driver situations.
> 
> Jason
> 

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-24 19:56         ` Danilo Krummrich
@ 2024-09-24 22:52           ` Dave Airlie
  2024-09-24 23:47             ` Jason Gunthorpe
  2024-09-25  0:53           ` Jason Gunthorpe
  1 sibling, 1 reply; 86+ messages in thread
From: Dave Airlie @ 2024-09-24 22:52 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Jason Gunthorpe, Zhi Wang, kvm, nouveau, alex.williamson,
	kevin.tian, daniel, acurrid, cjia, smitra, ankita, aniketa,
	kwankhede, targupta, zhiwang

On Wed, 25 Sept 2024 at 05:57, Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Tue, Sep 24, 2024 at 01:41:51PM -0300, Jason Gunthorpe wrote:
> > On Tue, Sep 24, 2024 at 12:50:55AM +0200, Danilo Krummrich wrote:
> >
> > > > From the VFIO side I would like to see something like this merged in
> > > > nearish future as it would bring a previously out of tree approach to
> > > > be fully intree using our modern infrastructure. This is a big win for
> > > > the VFIO world.
> > > >
> > > > As a commercial product this will be backported extensively to many
> > > > old kernels and that is harder/impossible if it isn't exclusively in
> > > > C. So, I think nova needs to co-exist in some way.
> > >
> > > We'll surely not support two drivers for the same thing in the long term,
> > > neither does it make sense, nor is it sustainable.
> >
> > What is being done here is the normal correct kernel thing to
> > do. Refactor the shared core code into a module and stick higher level
> > stuff on top of it. Ideally Nova/Nouveau would exist as peers
> > implementing DRM subsystem on this shared core infrastructure. We've
> > done this sort of thing before in other places in the kernel. It has
> > been proven to work well.
>
> So, that's where you have the wrong understanding of what we're working on: You
> seem to think that Nova is just another DRM subsystem layer on top of the NVKM
> parts (what you call the core driver) of Nouveau.
>
> But the whole point of Nova is to replace the NVKM parts of Nouveau, since
> that's where the problems we want to solve reside in.

Just to re-emphasise for Jason who might not be as across this stuff,

NVKM replacement with rust is the main reason for the nova project;
100% the driving force for nova is the unstable NVIDIA firmware API
and the ability to use rust proc-macros to hide the NVIDIA instability
instead of trying to do it in C by either generators or abusing C
macros (which I don't think are sufficient).

The lower level nvkm driver needs to start being in rust before we can
add support for newer stuff.

Now there is possibly some scope for evolving the rust pieces in it,
e.g. rust wrapped in C APIs, to make things easier for backports or
avoid some pitfalls, but that is a discussion that we need to have
here.

I think the idea of a nova drm and nova core driver architecture is
acceptable to most of us, but long term trying to maintain a nouveau-based
nvkm is definitely not acceptable due to the unstable firmware APIs.

Dave.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-24 22:52           ` Dave Airlie
@ 2024-09-24 23:47             ` Jason Gunthorpe
  2024-09-25  0:18               ` Dave Airlie
  0 siblings, 1 reply; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-24 23:47 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Danilo Krummrich, Zhi Wang, kvm, nouveau, alex.williamson,
	kevin.tian, daniel, acurrid, cjia, smitra, ankita, aniketa,
	kwankhede, targupta, zhiwang

On Wed, Sep 25, 2024 at 08:52:32AM +1000, Dave Airlie wrote:
> On Wed, 25 Sept 2024 at 05:57, Danilo Krummrich <dakr@kernel.org> wrote:
> >
> > On Tue, Sep 24, 2024 at 01:41:51PM -0300, Jason Gunthorpe wrote:
> > > On Tue, Sep 24, 2024 at 12:50:55AM +0200, Danilo Krummrich wrote:
> > >
> > > > > From the VFIO side I would like to see something like this merged in
> > > > > nearish future as it would bring a previously out of tree approach to
> > > > > be fully intree using our modern infrastructure. This is a big win for
> > > > > the VFIO world.
> > > > >
> > > > > As a commercial product this will be backported extensively to many
> > > > > old kernels and that is harder/impossible if it isn't exclusively in
> > > > > C. So, I think nova needs to co-exist in some way.
> > > >
> > > > We'll surely not support two drivers for the same thing in the long term,
> > > > neither does it make sense, nor is it sustainable.
> > >
> > > What is being done here is the normal correct kernel thing to
> > > do. Refactor the shared core code into a module and stick higher level
> > > stuff on top of it. Ideally Nova/Nouveau would exist as peers
> > > implementing DRM subsystem on this shared core infrastructure. We've
> > > done this sort of thing before in other places in the kernel. It has
> > > been proven to work well.
> >
> > So, that's where you have the wrong understanding of what we're
> > working on: You seem to think that Nova is just another DRM
> > subsystem layer on top of the NVKM parts (what you call the core
> > driver) of Nouveau.

Well, no, by a core driver I mean the very minimal parts that
are actually shared between vfio and drm. It should definitely not
include key parts you want to work on in rust, like the command
marshaling.

I expect there is more work to do in order to make this kind of split,
but this is what I'm thinking/expecting.

> > But the whole point of Nova is to replace the NVKM parts of Nouveau, since
> > that's where the problems we want to solve reside in.
> 
> Just to re-emphasise for Jason who might not be as across this stuff,
> 
> NVKM replacement with rust is the main reason for the nova project,
> 100% the driving force for nova is the unstable NVIDIA firmware API.
> The ability to use rust proc-macros to hide the NVIDIA instability
> instead of trying to do it in C by either generators or abusing C
> macros (which I don't think are sufficient).

I would not include any of this in the innermost core driver. My
thinking is informed by what we've done in RDMA, particularly mlx5,
which has a pretty thin PCI driver where each of the drivers stacked on
top forms its own command buffers directly. The PCI driver primarily
just does some device bootup, command execution and interrupts, because
those are all shared by the subsystem drivers.

We have a lot of experience now building these kinds of
multi-subsystem structures, and this pattern works very well.

So, broadly, build your rust proc macros on the DRM Nova driver and
call a core function to submit a command buffer to the device and get
back a response.

VFIO will make its command buffers with C and call the same core
function.
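
A minimal sketch, assuming invented names, of what such a shared submission
entry point might look like; both the C VFIO side and a Rust DRM driver would
marshal their own command buffers and funnel them through one call like this:

  /* Hypothetical sketch of a thin core submission API -- not real code. */
  /* Assumes <linux/types.h>. */
  struct gpu_core;                      /* the shared PCI/core driver instance */

  struct gpu_core_cmd {
          u32 function;         /* GSP RPC function number */
          u32 len;              /* length of the caller-marshalled payload */
          void *payload;        /* command buffer built by the caller */
  };

  /* queue the command to the device and copy back the reply, if any */
  int gpu_core_submit(struct gpu_core *core, struct gpu_core_cmd *cmd,
                      void *reply, u32 reply_len);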

> I think the idea of a nova drm and nova core driver architecture is
> acceptable to most of us, but long term trying to main a nouveau based
> nvkm is definitely not acceptable due to the unstable firmware APIs.

? nova core, meaning nova rust, meaning vfio depends on rust, doesn't
seem acceptable ? We need to keep rust isolated to DRM for the
foreseeable future. Just need to find a separation that can do that.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-24 23:47             ` Jason Gunthorpe
@ 2024-09-25  0:18               ` Dave Airlie
  2024-09-25  1:29                 ` Jason Gunthorpe
  0 siblings, 1 reply; 86+ messages in thread
From: Dave Airlie @ 2024-09-25  0:18 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Danilo Krummrich, Zhi Wang, kvm, nouveau, alex.williamson,
	kevin.tian, daniel, acurrid, cjia, smitra, ankita, aniketa,
	kwankhede, targupta, zhiwang

>
> Well, no, I am calling a core driver to be the very minimal parts that
> are actually shared between vfio and drm. It should definitely not
> include key parts you want to work on in rust, like the command
> marshaling.

Unfortunately not; the fw ABI is the unsolved problem, and rust is our
best solution.

>
> I expect there is more work to do in order to make this kind of split,
> but this is what I'm thinking/expecting.
>
> > > But the whole point of Nova is to replace the NVKM parts of Nouveau, since
> > > that's where the problems we want to solve reside in.
> >
> > Just to re-emphasise for Jason who might not be as across this stuff,
> >
> > NVKM replacement with rust is the main reason for the nova project,
> > 100% the driving force for nova is the unstable NVIDIA firmware API.
> > The ability to use rust proc-macros to hide the NVIDIA instability
> > instead of trying to do it in C by either generators or abusing C
> > macros (which I don't think are sufficient).
>
> I would not include any of this in the very core most driver. My
> thinking is informed by what we've done in RDMA, particularly mlx5
> which has a pretty thin PCI driver and each of the drivers stacked on
> top form their own command buffers directly. The PCI driver primarily
> just does some device bootup, command execution and interrupts because
> those are all shared by the subsystem drivers.
>
> We have a lot of experiance now building these kinds of
> multi-subsystem structures and this pattern works very well.
>
> So, broadly, build your rust proc macros on the DRM Nova driver and
> call a core function to submit a command buffer to the device and get
> back a response.
>
> VFIO will make it's command buffers with C and call the same core
> function.
>
> > I think the idea of a nova drm and nova core driver architecture is
> > acceptable to most of us, but long term trying to main a nouveau based
> > nvkm is definitely not acceptable due to the unstable firmware APIs.
>
> ? nova core, meaning nova rust, meaning vfio depends on rust, doesn't
> seem acceptable ? We need to keep rust isolated to DRM for the
> foreseeable future. Just need to find a separation that can do that.

That isn't going to happen; if we start with that as the default
positioning, it won't get us very far.

The core has to be rust, because NVIDIA has an unstable firmware API.
The unstable firmware API isn't just some command marshalling; it goes
deep down into the depths of it, like memory sizing requirements, base
message queue layout and encoding, and firmware init procedures. These are
all changeable at any time with no regard for upstream development, so
upstream development needs to be insulated from these as much as
possible. Using rust provides that insulation layer. The unstable ABI
isn't a solvable problem in the short term; using rust is the
maintainable answer.

Now there are maybe some on/off ramps we can use here that might
provide some solutions to bridge the gap. Using rust in the kernel has
various levels, which we currently tie into one place, but if we
consider different longer term progressions it might be possible to
start with some rust that is easier to backport than other rust might
be etc.

Strategies for moving the nvkm core from C to rust in steps, or along a
sliding scale of supported firmwares, could be open for discussion.

The end result, though, is to have nova core and nova drm in rust; that
is the decision upstream made 6-12 months ago, and I don't see that any of
the initial reasons for using rust have been invalidated or removed in a
way that warrants revisiting that decision.

Dave.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 27/29] vfio/vgpu_mgr: bootload the new vGPU
  2024-09-22 12:49 ` [RFC 27/29] vfio/vgpu_mgr: bootload the new vGPU Zhi Wang
@ 2024-09-25  0:31   ` Dave Airlie
  0 siblings, 0 replies; 86+ messages in thread
From: Dave Airlie @ 2024-09-25  0:31 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, nouveau, alex.williamson, kevin.tian, jgg, daniel, acurrid,
	cjia, smitra, ankita, aniketa, kwankhede, targupta, zhiwang

On Sun, 22 Sept 2024 at 22:51, Zhi Wang <zhiw@nvidia.com> wrote:
>
> All the resources that required by a new vGPU has been set up. It is time
> to activate it.
>
> Send the NV2080_CTRL_CMD_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK
> GSP RPC to activate the new vGPU.

This patch is probably the best example of how this can't work.

The GSP firmware interfaces are not guaranteed stable. Exposing these
interfaces outside the nvkm core is unacceptable, as otherwise we
would have to adapt the whole kernel depending on the loaded firmware.

You cannot use any nvidia sdk headers; these all have to be abstracted
behind things that have no bearing on the API.

If a new NVIDIA driver release were to add a parameter inside
NV2080_CTRL_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK_PARAMS, how
would you handle it?
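
To make that concrete, a purely illustrative sketch (struct and field names
invented, not the real SDK layout) of how one firmware release inserting a
field silently shifts the layout a C caller was built against:

  /* firmware release N (illustrative only, NvU32/NvU64 from nvtypes.h) */
  typedef struct {
          NvU32 gfid;
          NvU64 logBufferAddr;
  } EXAMPLE_BOOTLOAD_PARAMS;

  /* firmware release N+1 inserts a field; every later offset moves */
  typedef struct {
          NvU32 gfid;
          NvU32 chidOffset;     /* new in N+1 */
          NvU64 logBufferAddr;
  } EXAMPLE_BOOTLOAD_PARAMS;

A kernel that marshals the release-N layout would then hand the GSP a
structure the release-N+1 firmware no longer understands.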

Outside of the other discussion, this is the fundamental problem with
working on the GSP firmware. We cannot trust that any API exposed
won't change, and NVIDIA aren't in a position to guarantee it.

Dave.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-24 19:56         ` Danilo Krummrich
  2024-09-24 22:52           ` Dave Airlie
@ 2024-09-25  0:53           ` Jason Gunthorpe
  2024-09-25  1:08             ` Dave Airlie
  2024-09-25 10:55             ` Danilo Krummrich
  1 sibling, 2 replies; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-25  0:53 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Zhi Wang, kvm, nouveau, alex.williamson, kevin.tian, airlied,
	daniel, acurrid, cjia, smitra, ankita, aniketa, kwankhede,
	targupta, zhiwang

On Tue, Sep 24, 2024 at 09:56:58PM +0200, Danilo Krummrich wrote:

> Currently - and please correct me if I'm wrong - you make it sound to me as if
> you're not willing to respect the decisions that have been taken by Nouveau and
> DRM maintainers.

I've never said anything about your work, go do Nova, have fun.

I'm just not agreeing to being forced into taking Rust dependencies in
VFIO because Nova is participating in the Rust Experiment.

I think the reasonable answer is to accept some code duplication, or
try to consolidate around a small C core. I understand this is
different from what you may have planned so far for Nova, but all projects
are subject to community feedback, especially when faced with new
requirements.

I think this discussion is getting a little overheated, there is lots
of space here for everyone to do their things. Let's not get too
excited.

> I encourage that NVIDIA wants to move things upstream and I'm absolutely willing
> to collaborate and help with the use-cases and goals NVIDIA has. But it really
> has to be a collaboration and this starts with acknowledging the goals of *each
> other*.

I've always acknowledged Nova's goal - it is fine.

It is just quite incompatible with the VFIO side requirement of no
Rust in our stack until the ecosystem can consume it.

I believe there is no reason we can't find an agreeable compromise.

> > I expect the core code would continue to support new HW going forward
> > to support the VFIO driver, even if nouveau doesn't use it, until Rust
> > reaches some full ecosystem readiness for the server space.
> 
> From an upstream perspective the kernel doesn't need to consider OOT drivers,
> i.e. the guest driver.

?? VFIO already took the decision that it is agnostic to what is
running in the VM. Run Windows-only VMs for all we care, it is still
supposed to be virtualized correctly.

> > There are going to be a lot of users of this code, let's not rush to
> > harm them please.
> 
> Please abstain from such kind of unconstructive insinuations; it's ridiculous to
> imply that upstream kernel developers and maintainers would harm the users of
> NVIDIA GPUs.

You literally just said you'd want to effectively block usable VFIO
support for new GPU HW when "we stop further support for new HW in
Nouveau at some point" and "move the vGPU parts over to Nova(& rust)".

I don't agree to that, it harms VFIO users, and is not acknowledging
that conflicting goals exist.

VFIO will decide when it starts to depend on rust, Nova should not
force that decision on VFIO. They are very different ecosystems with
different needs.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-25  0:53           ` Jason Gunthorpe
@ 2024-09-25  1:08             ` Dave Airlie
  2024-09-25 15:28               ` Jason Gunthorpe
  2024-09-25 10:55             ` Danilo Krummrich
  1 sibling, 1 reply; 86+ messages in thread
From: Dave Airlie @ 2024-09-25  1:08 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Danilo Krummrich, Zhi Wang, kvm, nouveau, alex.williamson,
	kevin.tian, daniel, acurrid, cjia, smitra, ankita, aniketa,
	kwankhede, targupta, zhiwang

On Wed, 25 Sept 2024 at 10:53, Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Sep 24, 2024 at 09:56:58PM +0200, Danilo Krummrich wrote:
>
> > Currently - and please correct me if I'm wrong - you make it sound to me as if
> > you're not willing to respect the decisions that have been taken by Nouveau and
> > DRM maintainers.
>
> I've never said anything about your work, go do Nova, have fun.
>
> I'm just not agreeing to being forced into taking Rust dependencies in
> VFIO because Nova is participating in the Rust Experiment.
>
> I think the reasonable answer is to accept some code duplication, or
> try to consolidate around a small C core. I understand this is
> different than you may have planned so far for Nova, but all projects
> are subject to community feedback, especially when faced with new
> requirements.
>
> I think this discussion is getting a little overheated, there is lots
> of space here for everyone to do their things. Let's not get too
> excited.

How do you intend to solve the stable ABI problem caused by the GSP firmware?

If you haven't got an answer to that, that is reasonable, you can talk
about VFIO and DRM and who is in charge all you like, but it doesn't
matter.

Fundamentally the problem is that the unstable API exposure isn't
something you can build a castle on top of; the nova idea is to use
rust to solve a fundamental problem that the NVIDIA driver design
process forces on us (vfio included). I'm not seeing anything
constructive as to why doing that in C would be worth the investment.
Nothing has changed from when we designed nova; this idea was on the
table then, and it has all sorts of problems leaking the unstable ABI
that have to be solved. When I see a solution for that in C that is
maintainable and doesn't leak like a sieve I might be interested, but
you keep thinking we are using rust so we can have fun, not because we
are using it to solve maintainability problems caused by an internal
NVIDIA design decision over which we have zero influence.

Dave.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-25  0:18               ` Dave Airlie
@ 2024-09-25  1:29                 ` Jason Gunthorpe
  0 siblings, 0 replies; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-25  1:29 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Danilo Krummrich, Zhi Wang, kvm, nouveau, alex.williamson,
	kevin.tian, daniel, acurrid, cjia, smitra, ankita, aniketa,
	kwankhede, targupta, zhiwang

On Wed, Sep 25, 2024 at 10:18:44AM +1000, Dave Airlie wrote:

> > ? nova core, meaning nova rust, meaning vfio depends on rust, doesn't
> > seem acceptable ? We need to keep rust isolated to DRM for the
> > foreseeable future. Just need to find a separation that can do that.
> 
> That isn't going to happen, if we start with that as the default
> positioning it won't get us very far.

What do you want me to say to that? We can't have rust in VFIO right
now, we don't have that luxury. This is just a fact, I can't change
it.

If you say upstream has to be rust then there just won't be upstream
and this will all go OOT and stay as C code. That isn't a good
outcome. Having rust usage actively harm participation in the kernel
seems like the exact opposite of the consensus of the maintainer
summit.

> The core has to be rust, because NVIDIA has an unstable firmware API.
> The unstable firmware API isn't some command marshalling, it's deep
> down into the depths of it, like memory sizing requirements, base
> message queue layout and encoding, firmware init procedures.

I get the feeling the vast majority of the work, and primary rust
benefit, lies in the command marshalling.

If the init *procedures* change, for instance, you are going to have to
write branches no matter what language you use.

I don't know, it is just a suggestion to consider.

> Now there are maybe some on/off ramps we can use here that might
> provide some solutions to bridge the gap. Using rust in the kernel has
> various levels, which we currently tie into one place, but if we
> consider different longer term progressions it might be possible to
> start with some rust that is easier to backport than other rust might
> be etc.

That seems to be entirely unexplored territory. Certainly if the
backporting can be shown to be solved then I have much less objection
to having VFIO depend on rust.

This is part of why I suggested that a rust core driver could expose
the C APIs that VFIO needs with a kconfig switch. Then people can
experiment and give feedback on what backporting this rust stuff is
actually like. That would be valuable for everyone I think. Especially
if the feedback is that backporting is no problem.
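
A very rough sketch of what that C-facing surface could look like --
all of these names are hypothetical, just to make the idea concrete:

/*
 * Hypothetical header exported by a rust nova-core, only built when a
 * hypothetical CONFIG_NOVA_CORE_C_API kconfig switch is enabled.
 */
struct nova_vgpu;

struct nova_vgpu *nova_vgpu_create(struct pci_dev *vf, u32 type_id);
int nova_vgpu_bootload(struct nova_vgpu *vgpu);
void nova_vgpu_destroy(struct nova_vgpu *vgpu);

The VFIO variant driver would only ever see these C symbols, so the
rust dependency stays confined to whoever provides them.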

Yes we have duplication while that is ongoing, but I think that is
inevitable, and at least everyone could agree to the duplication and I
expect NVIDIA would sign up to maintain the C VFIO stack top to
bottom.

> The end result though is to have nova core and nova drm in rust; that
> is the decision upstream made 6-12 months ago. I don't see that any of
> the initial reasons for using rust have been invalidated or removed in
> a way that warrants revisiting that decision.

Never said they did, but your decision to use Rust in Nova does not
automatically mean a decision to use Rust in VFIO, and now we have a
new requirement to couple the two together. It still must be resolved
satisfactorily.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-25  0:53           ` Jason Gunthorpe
  2024-09-25  1:08             ` Dave Airlie
@ 2024-09-25 10:55             ` Danilo Krummrich
  1 sibling, 0 replies; 86+ messages in thread
From: Danilo Krummrich @ 2024-09-25 10:55 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Zhi Wang, kvm, nouveau, alex.williamson, kevin.tian, airlied,
	daniel, acurrid, cjia, smitra, ankita, aniketa, kwankhede,
	targupta, zhiwang

On Tue, Sep 24, 2024 at 09:53:19PM -0300, Jason Gunthorpe wrote:
> On Tue, Sep 24, 2024 at 09:56:58PM +0200, Danilo Krummrich wrote:
> 
> > Currently - and please correct me if I'm wrong - you make it sound to me as if
> > you're not willing to respect the decisions that have been taken by Nouveau and
> > DRM maintainers.
> 
> I've never said anything about your work, go do Nova, have fun.

See, that's the attitude that doesn't get us anywhere.

You act as if we'd just be toying around to have fun, position yourself as the
one who wants to do the "real deal" and just claim that our decisions would harm
users.

And at the same time you prove that you did not get up to speed on the
reasons to move in this direction and the problems we are trying to solve.

This just won't lead to a constructive discussion that addresses your concerns.

Try to not go like a bull at a gate. Instead start with asking questions to
understand why we chose this direction and then feel free to raise concerns.

I assure you, we will hear and recognize them! And I'm also sure that we'll find
solutions and compromises.

> 
> I'm just not agreeing to being forced into taking Rust dependencies in
> VFIO because Nova is participating in the Rust Experiment.
> 
> I think the reasonable answer is to accept some code duplication, or
> try to consolidate around a small C core. I understand this is
> different than you may have planned so far for Nova, but all projects
> are subject to community feedback, especially when faced with new
> requirements.

Fully agree, and I'm absolutely open to consider feedback and new requirements.

But again, consider what I said above -- you're creating counterproposals out of
thin air, without considering at all what we have planned so far.

So, I wonder what kind of reaction you expect approaching things this way?

> 
> I think this discussion is getting a little overheated, there is lots
> of space here for everyone to do their things. Let's not get too
> excited.
> 
> > I encourage that NVIDIA wants to move things upstream and I'm absolutely willing
> > to collaborate and help with the use-cases and goals NVIDIA has. But it really
> > has to be a collaboration and this starts with acknowledging the goals of *each
> > other*.
> 
> I've always acknowledged Nova's goal - it is fine.
> 
> It is just quite incompatible with the VFIO side requirement of no
> Rust in our stack until the ecosystem can consume it.
> 
> I believe there is no reason we can't find an agreeable compromise.

I'm pretty sure we can indeed find an agreeable compromise. But again, please
understand that the way you've chosen to approach this so far won't get us
there.

> 
> > > I expect the core code would continue to support new HW going forward
> > > to support the VFIO driver, even if nouveau doesn't use it, until Rust
> > > reaches some full ecosystem readiness for the server space.
> > 
> > From an upstream perspective the kernel doesn't need to consider OOT drivers,
> > i.e. the guest driver.
> 
> ?? VFIO already took the decision that it is agnostic to what is
> running in the VM. Run Windows-only VMs for all we care, it is still
> supposed to be virtualized correctly.
> 
> > > There are going to be a lot of users of this code, let's not rush to
> > > harm them please.
> > 
> > Please abstain from such kind of unconstructive insinuations; it's ridiculous to
> > imply that upstream kernel developers and maintainers would harm the users of
> > NVIDIA GPUs.
> 
> You literally just said you'd want to effectively block usable VFIO
> support for new GPU HW when "we stop further support for new HW in
> Nouveau at some point" and "move the vGPU parts over to Nova(& rust)".

Well, working on a successor means that once it's in place the support for the
replaced thing has to end at some point.

This doesn't mean that we can't work out ways to address your concerns.

You just make it a binary thing and claim that if we don't choose 1 we harm
users.

This effectively denies looking for solutions to your concerns in the first
place. And again, this won't get us anywhere. It just creates the impression
that you're not interested in solutions, but want to push through your agenda.

> 
> I don't agree to that, it harms VFIO users, and is not acknowledging
> that conflicting goals exist.
> 
> VFIO will decide when it starts to depend on rust, Nova should not
> force that decision on VFIO. They are very different ecosystems with
> different needs.
> 
> Jason
> 

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-25  1:08             ` Dave Airlie
@ 2024-09-25 15:28               ` Jason Gunthorpe
  0 siblings, 0 replies; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-25 15:28 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Danilo Krummrich, Zhi Wang, kvm, nouveau, alex.williamson,
	kevin.tian, daniel, acurrid, cjia, smitra, ankita, aniketa,
	kwankhede, targupta, zhiwang

On Wed, Sep 25, 2024 at 11:08:40AM +1000, Dave Airlie wrote:
> On Wed, 25 Sept 2024 at 10:53, Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Tue, Sep 24, 2024 at 09:56:58PM +0200, Danilo Krummrich wrote:
> >
> > > Currently - and please correct me if I'm wrong - you make it sound to me as if
> > > you're not willing to respect the decisions that have been taken by Nouveau and
> > > DRM maintainers.
> >
> > I've never said anything about your work, go do Nova, have fun.
> >
> > I'm just not agreeing to being forced into taking Rust dependencies in
> > VFIO because Nova is participating in the Rust Experiment.
> >
> > I think the reasonable answer is to accept some code duplication, or
> > try to consolidate around a small C core. I understand this is
> > different than you may have planned so far for Nova, but all projects
> > are subject to community feedback, especially when faced with new
> > requirements.
> >
> > I think this discussion is getting a little overheated, there is lots
> > of space here for everyone to do their things. Let's not get too
> > excited.
> 
> How do you intend to solve the stable ABI problem caused by the GSP firmware?
> 
> If you haven't got an answer to that, that is reasonable, you can talk
> about VFIO and DRM and who is in charge all you like, but it doesn't
> matter.

I suggest the same answer everyone else building HW in the kernel
operates under. You get to update your driver with your new HW once
per generation.

Not once per FW release, once per generation. That is a similar level
of burden to maintain as most drivers. It is not as good as the
excellence Mellanox does (no SW change for a new HW generation), but
it is still good.

I would apply this logic to Nova as well, no reason to be supporting
random ABI changes coming out every month(s).

> Fundamentally the problem is that the unstable API exposure isn't
> something you can build a castle on top of; the nova idea is to use
> rust to solve a fundamental problem that the NVIDIA driver design
> process forces on us (vfio included).

I firmly believe you can't solve a stable ABI problem with language
features in an OS. The ABI is totally unstable, it will change
semantically, the order and nature of functions you need will
change. New HW will need new behaviors and semantics.

Language support can certainly handle the mindless churn that ideally
shouldn't even be happening in the first place.

The way you solve this is at the root, in the FW. Don't churn
everything. I'm a big believer and supporter of the Mellanox
super-stable approach that has really proven how valuable this concept
is to everyone.

So I agree with you, the extreme instability is not OK in upstream;
it needs to slow down a lot to be acceptable. I don't necessarily
agree to the Mellanox-like gold standard as the bar, but it certainly
must be way better than it is now.

FWIW when I discussed the VFIO patches I was given some impression
there would not be high levels of ABI churn on the VFIO side, and that
there was awareness and understanding of this issue on Zhi's side.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* RE: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-23 15:02   ` Jason Gunthorpe
@ 2024-09-26  6:43     ` Tian, Kevin
  2024-09-26 12:55       ` Jason Gunthorpe
  0 siblings, 1 reply; 86+ messages in thread
From: Tian, Kevin @ 2024-09-26  6:43 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Zhi Wang, kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, airlied@gmail.com, daniel@ffwll.ch,
	Currid, Andy, cjia@nvidia.com, smitra@nvidia.com,
	ankita@nvidia.com, aniketa@nvidia.com, kwankhede@nvidia.com,
	targupta@nvidia.com, zhiwang@kernel.org

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Monday, September 23, 2024 11:02 PM
> 
> On Mon, Sep 23, 2024 at 06:22:33AM +0000, Tian, Kevin wrote:
> > > From: Zhi Wang <zhiw@nvidia.com>
> > > Sent: Sunday, September 22, 2024 8:49 PM
> > >
> > [...]
> > >
> > > The NVIDIA vGPU VFIO module together with VFIO sits on VFs, provides
> > > extended management and features, e.g. selecting the vGPU types,
> support
> > > live migration and driver warm update.
> > >
> > > Like other devices that VFIO supports, VFIO provides the standard
> > > userspace APIs for device lifecycle management and advance feature
> > > support.
> > >
> > > The NVIDIA vGPU manager provides necessary support to the NVIDIA
> vGPU VFIO
> > > variant driver to create/destroy vGPUs, query available vGPU types, select
> > > the vGPU type, etc.
> > >
> > > On the other side, NVIDIA vGPU manager talks to the NVIDIA GPU core
> driver,
> > > which provide necessary support to reach the HW functions.
> > >
> >
> > I'm not sure VFIO is the right place to host the NVIDIA vGPU manager.
> > It's very NVIDIA specific and naturally fit in the PF driver.
> 
> drm isn't a particularly logical place for that either :|
> 

This RFC doesn't expose any new uAPI in the vGPU manager, e.g. with
the vGPU type hard-coded to L40-24Q. In this way the boundary between
code in VFIO and code in PF driver is probably more a vendor specific
choice.

However, according to the cover letter it's a reasonable future extension
to implement new uAPI for the admin to select the vGPU type and potentially
do more manual configuration before the target VF can be used.

Then there comes an open question whether VFIO is the right place to host
such a vendor-specific provisioning interface. The existing mdev-type-based
provisioning mechanism was considered a bad fit already.

IIRC the previous discussion came to suggest putting the provisioning
interface in the PF driver. There may be a chance to generalize and
move it to VFIO, but there is no idea what that will look like until multiple
drivers have demonstrated their own implementations as the basis for discussion.

But now it seems you prefer vendors putting their own provisioning
interfaces in VFIO directly?

Thanks
Kevin

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-23 15:01   ` Jason Gunthorpe
  2024-09-23 22:50     ` Danilo Krummrich
@ 2024-09-26  9:14     ` Greg KH
  2024-09-26 12:42       ` Jason Gunthorpe
  1 sibling, 1 reply; 86+ messages in thread
From: Greg KH @ 2024-09-26  9:14 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Danilo Krummrich, Zhi Wang, kvm, nouveau, alex.williamson,
	kevin.tian, airlied, daniel, acurrid, cjia, smitra, ankita,
	aniketa, kwankhede, targupta, zhiwang

On Mon, Sep 23, 2024 at 12:01:40PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote:
> > > 2. Proposal for upstream
> > > ========================
> > 
> > What is the strategy in the mid / long term with this?
> > 
> > As you know, we're trying to move to Nova and the blockers with the device /
> > driver infrastructure have been resolved and we're able to move forward. Besides
> > that, Dave made great progress on the firmware abstraction side of things.
> > 
> > Is this more of a proof of concept? Do you plan to work on Nova in general and
> > vGPU support for Nova?
> 
> This is intended to be a real product that customers would use, it is
> not a proof of concept. There is alot of demand for this kind of
> simplified virtualization infrastructure in the host side. The series
> here is the first attempt at making thin host infrastructure and
> Zhi/etc are doing it with an upstream-first approach.
> 
> >From the VFIO side I would like to see something like this merged in
> nearish future as it would bring a previously out of tree approach to
> be fully intree using our modern infrastructure. This is a big win for
> the VFIO world.
> 
> As a commercial product this will be backported extensively to many
> old kernels and that is harder/impossible if it isn't exclusively in
> C. So, I think nova needs to co-exist in some way.

Please never make design decisions based on old ancient commercial
kernels that have any relevance to upstream kernel development today.
If you care about those kernels, work with the companies that get paid
to support such things.  Otherwise development upstream would just
completely stall and never go forward, as you well know.

As it seems that future support for this hardware is going to be in
rust, just use those apis going forward and backport the small number of
missing infrastructure patches to the relevant ancient kernels as well,
it's not like that would even be noticed in the overall number of
patches they take for normal subsystem improvements :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 01/29] nvkm/vgpu: introduce NVIDIA vGPU support prelude
  2024-09-22 12:49 ` [RFC 01/29] nvkm/vgpu: introduce NVIDIA vGPU support prelude Zhi Wang
@ 2024-09-26  9:20   ` Greg KH
  2024-10-14  9:59     ` Zhi Wang
  0 siblings, 1 reply; 86+ messages in thread
From: Greg KH @ 2024-09-26  9:20 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, nouveau, alex.williamson, kevin.tian, jgg, airlied, daniel,
	acurrid, cjia, smitra, ankita, aniketa, kwankhede, targupta,
	zhiwang

On Sun, Sep 22, 2024 at 05:49:23AM -0700, Zhi Wang wrote:
> NVIDIA GPU virtualization is a technology that allows multiple virtual
> machines (VMs) to share the power of a single GPU, enabling greater
> flexibility, efficiency, and cost-effectiveness in data centers and cloud
> environments.
> 
> The first step of supporting NVIDIA vGPU in nvkm is to introduce the
> necessary vGPU data structures and functions to hook into the
> (de)initialization path of nvkm.
> 
> Introduce NVIDIA vGPU data structures and functions hooking into the
> (de)initialization path of nvkm and support the following patches.
> 
> Cc: Neo Jia <cjia@nvidia.com>
> Cc: Surath Mitra <smitra@nvidia.com>
> Signed-off-by: Zhi Wang <zhiw@nvidia.com>

Some minor comments that are a hint you all aren't running checkpatch on
your code...

> --- /dev/null
> +++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
> @@ -0,0 +1,17 @@
> +/* SPDX-License-Identifier: MIT */

Wait, what?  Why?  Ick.  You all also forgot the copyright line :(

> --- /dev/null
> +++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
> @@ -0,0 +1,76 @@
> +/* SPDX-License-Identifier: MIT */
> +#include <core/device.h>
> +#include <core/pci.h>
> +#include <vgpu_mgr/vgpu_mgr.h>
> +
> +static bool support_vgpu_mgr = false;

A global variable for the whole system?  Are you sure that will work
well over time?  Why isn't this a per-device thing?

> +module_param_named(support_vgpu_mgr, support_vgpu_mgr, bool, 0400);

This is not the 1990's, please never add new module parameters, use
per-device variables.  And no documentation?  That's not ok either even
if you did want to have this.
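
Something per-device along these lines would be the usual shape -- a
sketch only, reusing the per-device vgpu_mgr state this patch already
adds; whether the nvkm device is reachable via dev_get_drvdata() here
is an assumption:

static ssize_t vgpu_mgr_enable_store(struct device *dev,
				     struct device_attribute *attr,
				     const char *buf, size_t count)
{
	struct nvkm_device *device = dev_get_drvdata(dev); /* assumption */
	bool enable;
	int ret;

	ret = kstrtobool(buf, &enable);
	if (ret)
		return ret;

	/* per-device decision instead of a module-wide knob */
	device->vgpu_mgr.enabled = enable &&
				   nvkm_vgpu_mgr_is_supported(device);
	return count;
}
static DEVICE_ATTR_WO(vgpu_mgr_enable);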

> +static inline struct pci_dev *nvkm_to_pdev(struct nvkm_device *device)
> +{
> +	struct nvkm_device_pci *pci = container_of(device, typeof(*pci),
> +						   device);
> +
> +	return pci->pdev;
> +}
> +
> +/**
> + * nvkm_vgpu_mgr_is_supported - check if a platform support vGPU
> + * @device: the nvkm_device pointer
> + *
> + * Returns: true on supported platform which is newer than ADA Lovelace
> + * with SRIOV support.
> + */
> +bool nvkm_vgpu_mgr_is_supported(struct nvkm_device *device)
> +{
> +	struct pci_dev *pdev = nvkm_to_pdev(device);
> +
> +	if (!support_vgpu_mgr)
> +		return false;
> +
> +	return device->card_type == AD100 &&  pci_sriov_get_totalvfs(pdev);

checkpatch please.

And "AD100" is an odd #define, as you know.

> +}
> +
> +/**
> + * nvkm_vgpu_mgr_is_enabled - check if vGPU support is enabled on a PF
> + * @device: the nvkm_device pointer
> + *
> + * Returns: true if vGPU enabled.
> + */
> +bool nvkm_vgpu_mgr_is_enabled(struct nvkm_device *device)
> +{
> +	return device->vgpu_mgr.enabled;

What happens if this changes right after you look at it?


> +}
> +
> +/**
> + * nvkm_vgpu_mgr_init - Initialize the vGPU manager support
> + * @device: the nvkm_device pointer
> + *
> + * Returns: 0 on success, -ENODEV on platforms that are not supported.
> + */
> +int nvkm_vgpu_mgr_init(struct nvkm_device *device)
> +{
> +	struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr;
> +
> +	if (!nvkm_vgpu_mgr_is_supported(device))
> +		return -ENODEV;
> +
> +	vgpu_mgr->nvkm_dev = device;
> +	vgpu_mgr->enabled = true;
> +
> +	pci_info(nvkm_to_pdev(device),
> +		 "NVIDIA vGPU mananger support is enabled.\n");

When drivers work properly, they are quiet.

Why can't you see this all in the sysfs tree instead to know if support
is there or not?  You all are properly tying in your "sub driver" logic
to the driver model, right?  (hint, I don't think so as it looks like
that isn't happening, but I could be missing it...)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 02/29] nvkm/vgpu: attach to nvkm as a nvkm client
  2024-09-22 12:49 ` [RFC 02/29] nvkm/vgpu: attach to nvkm as a nvkm client Zhi Wang
@ 2024-09-26  9:21   ` Greg KH
  2024-10-14 10:16     ` Zhi Wang
  0 siblings, 1 reply; 86+ messages in thread
From: Greg KH @ 2024-09-26  9:21 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, nouveau, alex.williamson, kevin.tian, jgg, airlied, daniel,
	acurrid, cjia, smitra, ankita, aniketa, kwankhede, targupta,
	zhiwang

On Sun, Sep 22, 2024 at 05:49:24AM -0700, Zhi Wang wrote:
> nvkm is a HW abstraction layer (HAL) that initializes the HW and
> allows its clients to manipulate the GPU functions regardless of the
> generations of GPU HW. On the top layer, it provides generic APIs for a
> client to connect to NVKM, enumerate the GPU functions, and manipulate
> the GPU HW.
> 
> To reach nvkm, the client needs to connect to NVKM layer by layer: driver
> layer, client layer, and eventually, the device layer, which provides all
> the access routines to GPU functions. After a client attaches to NVKM,
> it initializes the HW and is able to serve the clients.
> 
> Attach to nvkm as a nvkm client.
> 
> Cc: Neo Jia <cjia@nvidia.com>
> Signed-off-by: Zhi Wang <zhiw@nvidia.com>
> ---
>  .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  8 ++++
>  .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c  | 48 ++++++++++++++++++-
>  2 files changed, 55 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
> index 3163fff1085b..9e10e18306b0 100644
> --- a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
> +++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
> @@ -7,6 +7,14 @@
>  struct nvkm_vgpu_mgr {
>  	bool enabled;
>  	struct nvkm_device *nvkm_dev;
> +
> +	const struct nvif_driver *driver;

Meta-comment, why is this attempting to act like a "driver" and yet not
tying into the driver model code at all?  Please fix that up, it's not
ok to add more layers on top of a broken one like this.  We have
infrastructure for this type of thing, please don't route around it.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-26  9:14     ` Greg KH
@ 2024-09-26 12:42       ` Jason Gunthorpe
  2024-09-26 12:54         ` Greg KH
  0 siblings, 1 reply; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-26 12:42 UTC (permalink / raw)
  To: Greg KH
  Cc: Danilo Krummrich, Zhi Wang, kvm, nouveau, alex.williamson,
	kevin.tian, airlied, daniel, acurrid, cjia, smitra, ankita,
	aniketa, kwankhede, targupta, zhiwang

On Thu, Sep 26, 2024 at 11:14:27AM +0200, Greg KH wrote:
> On Mon, Sep 23, 2024 at 12:01:40PM -0300, Jason Gunthorpe wrote:
> > On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote:
> > > > 2. Proposal for upstream
> > > > ========================
> > > 
> > > What is the strategy in the mid / long term with this?
> > > 
> > > As you know, we're trying to move to Nova and the blockers with the device /
> > > driver infrastructure have been resolved and we're able to move forward. Besides
> > > that, Dave made great progress on the firmware abstraction side of things.
> > > 
> > > Is this more of a proof of concept? Do you plan to work on Nova in general and
> > > vGPU support for Nova?
> > 
> > This is intended to be a real product that customers would use, it is
> > not a proof of concept. There is alot of demand for this kind of
> > simplified virtualization infrastructure in the host side. The series
> > here is the first attempt at making thin host infrastructure and
> > Zhi/etc are doing it with an upstream-first approach.
> > 
> > >From the VFIO side I would like to see something like this merged in
> > nearish future as it would bring a previously out of tree approach to
> > be fully intree using our modern infrastructure. This is a big win for
> > the VFIO world.
> > 
> > As a commercial product this will be backported extensively to many
> > old kernels and that is harder/impossible if it isn't exclusively in
> > C. So, I think nova needs to co-exist in some way.
> 
> Please never make design decisions based on old ancient commercial
> kernels that have any relevance to upstream kernel development
> today.

Greg, you are being too extreme. Those "ancient commercial kernels"
have a huge relevance to alot of our community because they are the
users that actually run the code we are building and pay for it to be
created. Yes we usually (but not always!) push back on accommodations
upstream, but taking hard dependencies on rust is currently a very
different thing.

> If you care about those kernels, work with the companies that get paid
> to support such things.  Otherwise development upstream would just
> completely stall and never go forward, as you well know.

They seem to be engaged, but upstream rust isn't even done yet. So
what exactly do you expect them to do? Throw out whole architectures
from their products?

I know how things work, I just don't think we are ready to elevate
Rust to the category of decisions where upstream can ignore the
downstream side readiness. In my view the community needs to agree to
remove the experimental status from Rust first.

> As it seems that future support for this hardware is going to be in
> rust, just use those apis going forward and backport the small number of

"those apis" don't even exist yet! There is a big multi-year gap
between when pure upstream would even be ready to put something like
VFIO on top of Nova and Rust and where we are now with this series.

This argument is *way too early*. I'm deeply hoping we never have to
actually have it, that by the time Nova gets merged Rust will be 100%
ready upstream and there will be no issue. Please? Can that happen?

Otherwise, let's slow down here. Nova is still years away from being
finished. Nouveau is the in-tree driver for this HW. This series
improves on Nouveau. We are definitely not at the point of refusing
new code because it is not written in Rust, RIGHT?

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-26 12:42       ` Jason Gunthorpe
@ 2024-09-26 12:54         ` Greg KH
  2024-09-26 13:07           ` Danilo Krummrich
  2024-09-26 14:40           ` Jason Gunthorpe
  0 siblings, 2 replies; 86+ messages in thread
From: Greg KH @ 2024-09-26 12:54 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Danilo Krummrich, Zhi Wang, kvm, nouveau, alex.williamson,
	kevin.tian, airlied, daniel, acurrid, cjia, smitra, ankita,
	aniketa, kwankhede, targupta, zhiwang

On Thu, Sep 26, 2024 at 09:42:39AM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 26, 2024 at 11:14:27AM +0200, Greg KH wrote:
> > On Mon, Sep 23, 2024 at 12:01:40PM -0300, Jason Gunthorpe wrote:
> > > On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote:
> > > > > 2. Proposal for upstream
> > > > > ========================
> > > > 
> > > > What is the strategy in the mid / long term with this?
> > > > 
> > > > As you know, we're trying to move to Nova and the blockers with the device /
> > > > driver infrastructure have been resolved and we're able to move forward. Besides
> > > > that, Dave made great progress on the firmware abstraction side of things.
> > > > 
> > > > Is this more of a proof of concept? Do you plan to work on Nova in general and
> > > > vGPU support for Nova?
> > > 
> > > This is intended to be a real product that customers would use, it is
> > > not a proof of concept. There is alot of demand for this kind of
> > > simplified virtualization infrastructure in the host side. The series
> > > here is the first attempt at making thin host infrastructure and
> > > Zhi/etc are doing it with an upstream-first approach.
> > > 
> > > >From the VFIO side I would like to see something like this merged in
> > > nearish future as it would bring a previously out of tree approach to
> > > be fully intree using our modern infrastructure. This is a big win for
> > > the VFIO world.
> > > 
> > > As a commercial product this will be backported extensively to many
> > > old kernels and that is harder/impossible if it isn't exclusively in
> > > C. So, I think nova needs to co-exist in some way.
> > 
> > Please never make design decisions based on old ancient commercial
> > kernels that have any relevance to upstream kernel development
> > today.
> 
> Greg, you are being too extreme. Those "ancient commercial kernels"
> have a huge relevance to a lot of our community because they are the
> users that actually run the code we are building and pay for it to be
> created. Yes we usually (but not always!) push back on accommodations
> upstream, but taking hard dependencies on rust is currently a very
> different thing.

That's fine, but again, do NOT make design decisions based on what you
feel you can, and can not, slide by one of these companies to get it
into their old kernels.  That's what I take objection to here.

Also always remember please, that the % of overall Linux kernel
installs, even counting out Android and embedded, is VERY tiny for these
companies.  The huge % overall is doing the "right thing" by using
upstream kernels.  And with the laws in place now that % is only going
to grow and those older kernels will rightfully fall away into even
smaller %.

I know those companies pay for many developers, I'm not saying that
their contributions are any less or more important than others, they all
are equal.  You wouldn't want design decisions for a patch series to be
dictated by some really old Yocto kernel restrictions that are only in
autos, right?  We are a large community, that's what I'm saying.

> Otherwise, let's slow down here. Nova is still years away from being
> finished. Nouveau is the in-tree driver for this HW. This series
> improves on Nouveau. We are definitely not at the point of refusing
> new code because it is not written in Rust, RIGHT?

No, I do object to "we are ignoring the driver being proposed by the
developers involved for this hardware by adding to the old one instead"
which it seems like is happening here.

Anyway, let's focus on the code, there's already real issues with this
patch series as pointed out by me and others that need to be addressed
before it can go anywhere.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-26  6:43     ` Tian, Kevin
@ 2024-09-26 12:55       ` Jason Gunthorpe
  2024-09-26 22:57         ` Jason Gunthorpe
  0 siblings, 1 reply; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-26 12:55 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Zhi Wang, kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, airlied@gmail.com, daniel@ffwll.ch,
	Currid, Andy, cjia@nvidia.com, smitra@nvidia.com,
	ankita@nvidia.com, aniketa@nvidia.com, kwankhede@nvidia.com,
	targupta@nvidia.com, zhiwang@kernel.org

On Thu, Sep 26, 2024 at 06:43:44AM +0000, Tian, Kevin wrote:

> Then there comes an open question whether VFIO is the right place to host
> such a vendor-specific provisioning interface. The existing mdev-type-based
> provisioning mechanism was considered a bad fit already.

> IIRC the previous discussion came to suggest putting the provisioning
> interface in the PF driver. There may be a chance to generalize and
> move it to VFIO, but there is no idea what that will look like until multiple
> drivers have demonstrated their own implementations as the basis for discussion.

I am looking at fwctl to do a lot of this in the SRIOV world.

You'd provision the VF prior to opening VFIO using the fwctl interface,
and VFIO would perceive a VF that has exactly the required
properties. At least for SRIOV where the VM is talking directly to
device FW, mdev/paravirtualization would be different.

> But now it seems you prefer vendors putting their own provisioning
> interfaces in VFIO directly?

Maybe not, just that drm isn't the right place either. If we do the
fwctl stuff then the VF provisioning would be done through a fwctl
driver.

I'm not entirely sure yet what this whole 'mgr' component is actually
doing though.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-26 12:54         ` Greg KH
@ 2024-09-26 13:07           ` Danilo Krummrich
  2024-09-26 14:40           ` Jason Gunthorpe
  1 sibling, 0 replies; 86+ messages in thread
From: Danilo Krummrich @ 2024-09-26 13:07 UTC (permalink / raw)
  To: Jason Gunthorpe, Greg KH
  Cc: Zhi Wang, kvm, nouveau, alex.williamson, kevin.tian, airlied,
	daniel, acurrid, cjia, smitra, ankita, aniketa, kwankhede,
	targupta, zhiwang

On Thu, Sep 26, 2024 at 02:54:38PM +0200, Greg KH wrote:
> On Thu, Sep 26, 2024 at 09:42:39AM -0300, Jason Gunthorpe wrote:
> > On Thu, Sep 26, 2024 at 11:14:27AM +0200, Greg KH wrote:
> > > On Mon, Sep 23, 2024 at 12:01:40PM -0300, Jason Gunthorpe wrote:
> > > > On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote:
> > > > > > 2. Proposal for upstream
> > > > > > ========================
> > > > > 
> > > > > What is the strategy in the mid / long term with this?
> > > > > 
> > > > > As you know, we're trying to move to Nova and the blockers with the device /
> > > > > driver infrastructure have been resolved and we're able to move forward. Besides
> > > > > that, Dave made great progress on the firmware abstraction side of things.
> > > > > 
> > > > > Is this more of a proof of concept? Do you plan to work on Nova in general and
> > > > > vGPU support for Nova?
> > > > 
> > > > This is intended to be a real product that customers would use, it is
> > > > not a proof of concept. There is alot of demand for this kind of
> > > > simplified virtualization infrastructure in the host side. The series
> > > > here is the first attempt at making thin host infrastructure and
> > > > Zhi/etc are doing it with an upstream-first approach.
> > > > 
> > > > >From the VFIO side I would like to see something like this merged in
> > > > nearish future as it would bring a previously out of tree approach to
> > > > be fully intree using our modern infrastructure. This is a big win for
> > > > the VFIO world.
> > > > 
> > > > As a commercial product this will be backported extensively to many
> > > > old kernels and that is harder/impossible if it isn't exclusively in
> > > > C. So, I think nova needs to co-exist in some way.
> > > 
> > > Please never make design decisions based on old ancient commercial
> > > kernels that have any relevance to upstream kernel development
> > > today.
> > 
> > Greg, you are being too extreme. Those "ancient commercial kernels"
> > have a huge relevance to a lot of our community because they are the
> > users that actually run the code we are building and pay for it to be
> > created. Yes we usually (but not always!) push back on accommodations
> > upstream, but taking hard dependencies on rust is currently a very
> > different thing.
> 
> That's fine, but again, do NOT make design decisions based on what you
> feel you can, and can not, slide by one of these companies to get it
> into their old kernels.  That's what I take objection to here.
> 
> Also always remember please, that the % of overall Linux kernel
> installs, even counting out Android and embedded, is VERY tiny for these
> companies.  The huge % overall is doing the "right thing" by using
> upstream kernels.  And with the laws in place now that % is only going
> to grow and those older kernels will rightfully fall away into even
> smaller %.
> 
> I know those companies pay for many developers, I'm not saying that
> their contributions are any less or more important than others, they all
> are equal.  You wouldn't want design decisions for a patch series to be
> dictated by some really old Yocto kernel restrictions that are only in
> autos, right?  We are a large community, that's what I'm saying.
> 
> > Otherwise, let's slow down here. Nova is still years away from being
> > finished. Nouveau is the in-tree driver for this HW. This series
> > improves on Nouveau. We are definitely not at the point of refusing
> > new code because it is not written in Rust, RIGHT?

Just a reminder on what I said and did not say. I never said we can't
support this in Nouveau for the short and mid term.

But we can't add new features and support new use-cases in Nouveau *without*
considering the way forward to the new driver.

> 
> No, I do object to "we are ignoring the driver being proposed by the
> developers involved for this hardware by adding to the old one instead"
> which it seems like is happening here.
> 
> Anyway, let's focus on the code, there's already real issues with this
> patch series as pointed out by me and others that need to be addressed
> before it can go anywhere.
> 
> thanks,
> 
> greg k-h
> 

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-26 12:54         ` Greg KH
  2024-09-26 13:07           ` Danilo Krummrich
@ 2024-09-26 14:40           ` Jason Gunthorpe
  2024-09-26 18:07             ` Andy Ritger
  2024-09-26 22:42             ` Danilo Krummrich
  1 sibling, 2 replies; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-26 14:40 UTC (permalink / raw)
  To: Greg KH
  Cc: Danilo Krummrich, Zhi Wang, kvm, nouveau, alex.williamson,
	kevin.tian, airlied, daniel, acurrid, cjia, smitra, ankita,
	aniketa, kwankhede, targupta, zhiwang

On Thu, Sep 26, 2024 at 02:54:38PM +0200, Greg KH wrote:

> That's fine, but again, do NOT make design decisions based on what you
> feel you can, and can not, slide by one of these companies to get it
> into their old kernels.  That's what I take objection to here.

It is not slide by. It is a recognition that participating in the
community gives everyone value. If you excessively deny value from one
side they will have no reason to participate.

In this case the value is that, with enough light work, the
kernel-fork community can deploy this code to their users. This has
been the accepted bargain for a long time now.

There is a great big question mark over Rust regarding what impact it
actually has on this dynamic. It is definitely not just backporting a few
hundred upstream patches. There is clearly new upstream development
work needed still - arch support being a very obvious one.

> Also always remember please, that the % of overall Linux kernel
> installs, even counting out Android and embedded, is VERY tiny for these
> companies.  The huge % overall is doing the "right thing" by using
> upstream kernels.  And with the laws in place now that % is only going
> to grow and those older kernels will rightfully fall away into even
> smaller %.

Who is "doing the right thing"? That is not what I see, we sell
server HW to *everyone*. There are a couple sites that are "near"
upstream, but that is not too common. Everyone is running some kind of
kernel fork.

I dislike this generalization you do with % of users. Almost 100% of
NVIDIA server HW are running forks. I would estimate around 10% is
above a 6.0 baseline. It is not tiny either, NVIDIA sold like $60B of
server HW running Linux last year with this kind of demographic. So
did Intel, AMD, etc.

I would not describe this as "VERY tiny". Maybe you mean RHEL-alike
specifically, and yes, they are a diminishing install share. However,
the hyperscale companies more than make up for that with their
internal secret proprietary forks :(

> > Otherwise, let's slow down here. Nova is still years away from being
> > finished. Nouveau is the in-tree driver for this HW. This series
> > improves on Nouveau. We are definitely not at the point of refusing
> > new code because it is not written in Rust, RIGHT?
> 
> No, I do object to "we are ignoring the driver being proposed by the
> developers involved for this hardware by adding to the old one instead"
> which it seems like is happening here.

That is too harsh. We've consistently taken a community position that
OOT stuff doesn't matter, and yes that includes OOT stuff that people
we trust and respect are working on. Until it is ready for submission,
and ideally merged, it is an unknown quantity. Good well meaning
people routinely drop their projects, good projects run into
unexpected roadblocks, and life happens.

Nova is not being ignored, there is dialog, and yes some disagreement.

Again, nobody here is talking about disrupting Nova. We just want to
keep going as-is until we can all agree together it is ready to make a
change.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-26 14:40           ` Jason Gunthorpe
@ 2024-09-26 18:07             ` Andy Ritger
  2024-09-26 22:23               ` Danilo Krummrich
  2024-09-26 22:42             ` Danilo Krummrich
  1 sibling, 1 reply; 86+ messages in thread
From: Andy Ritger @ 2024-09-26 18:07 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Greg KH, Danilo Krummrich, Zhi Wang, kvm, nouveau,
	alex.williamson, kevin.tian, airlied, daniel, acurrid, cjia,
	smitra, ankita, aniketa, kwankhede, targupta, zhiwang


I hope and expect the nova and vgpu_mgr efforts to ultimately converge.

First, for the fw ABI debacle: yes, it is unfortunate that we still don't
have a stable ABI from GSP.  We /are/ working on it, though there isn't
anything to show, yet.  FWIW, I expect the end result will be a much
simpler interface than what is there today, and a stable interface that
NVIDIA can guarantee.

But, for now, we have a timing problem like Jason described:

- We have customers eager for upstream vfio support in the near term,
  and that seems like something NVIDIA can develop/contribute/maintain in
  the near term, as an incremental step forward.

- Nova is still early in its development, relative to nouveau/nvkm.

- From NVIDIA's perspective, we're nervous about the backportability of
  rust-based components to enterprise kernels in the near term.

- The stable GSP ABI is not going to be ready in the near term.


I agree with what Dave said in one of the forks of this thread, in the context of
NV2080_CTRL_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK_PARAMS:

> The GSP firmware interfaces are not guaranteed stable. Exposing these
> interfaces outside the nvkm core is unacceptable, as otherwise we
> would have to adapt the whole kernel depending on the loaded firmware.
>
> You cannot use any nvidia sdk headers, these all have to be abstracted
> behind things that have no bearing on the API.

Agreed.  Though not infinitely scalable, and not
as clean as in rust, it seems possible to abstract
NV2080_CTRL_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK_PARAMS behind
a C-implemented abstraction layer in nvkm, at least for the short term.

Is there a potential compromise where vgpu_mgr starts its life with a
dependency on nvkm, and as things mature we migrate it to instead depend
on nova?
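
One way to keep that migration tractable -- purely a sketch with
made-up names -- would be for vgpu_mgr to call through a small backend
ops table rather than into nvkm symbols directly, so the provider could
later be swapped for nova:

/* Hypothetical contract between vgpu_mgr and whichever core backs it. */
struct vgpu_core_ops {
	int  (*vgpu_create)(void *core_priv, u32 gfid, u32 type_id);
	int  (*vgpu_bootload)(void *core_priv, u32 gfid);
	void (*vgpu_destroy)(void *core_priv, u32 gfid);
};

/* Registered by the core driver (nvkm today, possibly nova later). */
int vgpu_mgr_attach_core(struct pci_dev *pf,
			 const struct vgpu_core_ops *ops, void *core_priv);

nvkm would register such an ops table today, and a future nova-core
could register the same contract without touching the VFIO side.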


On Thu, Sep 26, 2024 at 11:40:57AM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 26, 2024 at 02:54:38PM +0200, Greg KH wrote:
> 
> > That's fine, but again, do NOT make design decisions based on what you
> > feel you can, and can not, slide by one of these companies to get it
> > into their old kernels.  That's what I take objection to here.
> 
> It is not slide by. It is a recognition that participating in the
> community gives everyone value. If you excessively deny value from one
> side they will have no reason to participate.
> 
> In this case the value is that, with enough light work, the
> kernel-fork community can deploy this code to their users. This has
> been the accepted bargain for a long time now.
> 
> There is a great big question mark over Rust regarding what impact it
> actually has on this dynamic. It is definitely not just backporting a few
> hundred upstream patches. There is clearly new upstream development
> work needed still - arch support being a very obvious one.
> 
> > Also always remember please, that the % of overall Linux kernel
> > installs, even counting out Android and embedded, is VERY tiny for these
> > companies.  The huge % overall is doing the "right thing" by using
> > upstream kernels.  And with the laws in place now that % is only going
> > to grow and those older kernels will rightfully fall away into even
> > smaller %.
> 
> Who is "doing the right thing"? That is not what I see, we sell
> server HW to *everyone*. There are a couple sites that are "near"
> upstream, but that is not too common. Everyone is running some kind of
> kernel fork.
> 
> I dislike this generalization you do with % of users. Almost 100% of
> NVIDIA server HW are running forks. I would estimate around 10% is
> above a 6.0 baseline. It is not tiny either, NVIDIA sold like $60B of
> server HW running Linux last year with this kind of demographic. So
> did Intel, AMD, etc.
> 
> I would not describe this as "VERY tiny". Maybe you mean RHEL-alike
> specifically, and yes, they are a diminishing install share. However,
> the hyperscale companies more than make up for that with their
> internal secret proprietary forks :(
> 
> > > Otherwise, let's slow down here. Nova is still years away from being
> > > finished. Nouveau is the in-tree driver for this HW. This series
> > > improves on Nouveau. We are definitely not at the point of refusing
> > > new code because it is not written in Rust, RIGHT?
> > 
> > No, I do object to "we are ignoring the driver being proposed by the
> > developers involved for this hardware by adding to the old one instead"
> > which it seems like is happening here.
> 
> That is too harsh. We've consistently taken a community position that
> OOT stuff doesn't matter, and yes that includes OOT stuff that people
> we trust and respect are working on. Until it is ready for submission,
> and ideally merged, it is an unknown quantity. Good well meaning
> people routinely drop their projects, good projects run into
> unexpected roadblocks, and life happens.
> 
> Nova is not being ignored, there is dialog, and yes some disagreement.
> 
> Again, nobody here is talking about disrupting Nova. We just want to
> keep going as-is until we can all agree together it is ready to make a
> change.
> 
> Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-26 18:07             ` Andy Ritger
@ 2024-09-26 22:23               ` Danilo Krummrich
  0 siblings, 0 replies; 86+ messages in thread
From: Danilo Krummrich @ 2024-09-26 22:23 UTC (permalink / raw)
  To: Andy Ritger
  Cc: Jason Gunthorpe, Greg KH, Zhi Wang, kvm, nouveau, alex.williamson,
	kevin.tian, airlied, daniel, acurrid, cjia, smitra, ankita,
	aniketa, kwankhede, targupta, zhiwang

On Thu, Sep 26, 2024 at 11:07:56AM -0700, Andy Ritger wrote:
> 
> I hope and expect the nova and vgpu_mgr efforts to ultimately converge.
> 
> First, for the fw ABI debacle: yes, it is unfortunate that we still don't
> have a stable ABI from GSP.  We /are/ working on it, though there isn't
> anything to show, yet.  FWIW, I expect the end result will be a much
> simpler interface than what is there today, and a stable interface that
> NVIDIA can guarantee.
> 
> But, for now, we have a timing problem like Jason described:
> 
> - We have customers eager for upstream vfio support in the near term,
>   and that seems like something NVIDIA can develop/contribute/maintain in
>   the near term, as an incremental step forward.
> 
> - Nova is still early in its development, relative to nouveau/nvkm.
> 
> - From NVIDIA's perspective, we're nervous about the backportability of
>   rust-based components to enterprise kernels in the near term.
> 
> - The stable GSP ABI is not going to be ready in the near term.
> 
> 
> I agree with what Dave said in one of the forks of this thread, in the context of
> NV2080_CTRL_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK_PARAMS:
> 
> > The GSP firmware interfaces are not guaranteed stable. Exposing these
> > interfaces outside the nvkm core is unacceptable, as otherwise we
> > would have to adapt the whole kernel depending on the loaded firmware.
> >
> > You cannot use any nvidia sdk headers, these all have to be abstracted
> > behind things that have no bearing on the API.
> 
> Agreed.  Though not infinitely scalable, and not
> as clean as in rust, it seems possible to abstract
> NV2080_CTRL_VGPU_MGR_INTERNAL_BOOTLOAD_GSP_VGPU_PLUGIN_TASK_PARAMS behind
> a C-implemented abstraction layer in nvkm, at least for the short term.
> 
> Is there a potential compromise where vgpu_mgr starts its life with a
> dependency on nvkm, and as things mature we migrate it to instead depend
> on nova?
> 

Of course, I've always said that it's perfectly fine to go with Nouveau as long
as Nova is not ready yet.

But, and that's very central, the condition must be that we agree on the long
term goal and agree on working towards this goal *together*.

Having two competing upstream strategies is not acceptable.

The baseline for the long term goal that we have set so far is Nova. And this
must also be the baseline for a discussion.

Raising concerns about that is perfectly valid, we can discuss them and look for
solutions.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-26 14:40           ` Jason Gunthorpe
  2024-09-26 18:07             ` Andy Ritger
@ 2024-09-26 22:42             ` Danilo Krummrich
  2024-09-27 12:51               ` Jason Gunthorpe
  1 sibling, 1 reply; 86+ messages in thread
From: Danilo Krummrich @ 2024-09-26 22:42 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Greg KH, Zhi Wang, kvm, nouveau, alex.williamson, kevin.tian,
	airlied, daniel, acurrid, cjia, smitra, ankita, aniketa,
	kwankhede, targupta, zhiwang

On Thu, Sep 26, 2024 at 11:40:57AM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 26, 2024 at 02:54:38PM +0200, Greg KH wrote:
> > 
> > No, I do object to "we are ignoring the driver being proposed by the
> > developers involved for this hardware by adding to the old one instead"
> > which it seems like is happening here.
> 
> That is too harsh. We've consistently taken a community position that
> OOT stuff doesn't matter, and yes that includes OOT stuff that people
> we trust and respect are working on. Until it is ready for submission,
> and ideally merged, it is an unknown quantity. Good well meaning
> people routinely drop their projects, good projects run into
> unexpected roadblocks, and life happens.

That's not the point -- at least it never was my point.

Upstream has set a strategy, and it's totally fine to raise concerns, discuss
them, look for solutions, draw conclusions and do adjustments where needed.

But, we have to agree on a long term strategy and work towards the corresponding
goals *together*.

I don't want to end up in a situation where everyone just does their own thing.

So, when you say things like "go do Nova, have fun", it really just sounds as
if you just want to do your own thing and ignore the existing upstream
strategy instead of collaborating and shaping it.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 04/29] nvkm/vgpu: set the VF partition count when NVIDIA vGPU is enabled
  2024-09-22 12:49 ` [RFC 04/29] nvkm/vgpu: set the VF partition count " Zhi Wang
@ 2024-09-26 22:51   ` Jason Gunthorpe
  2024-10-13 18:54     ` Zhi Wang
  0 siblings, 1 reply; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-26 22:51 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, nouveau, alex.williamson, kevin.tian, airlied, daniel,
	acurrid, cjia, smitra, ankita, aniketa, kwankhede, targupta,
	zhiwang

On Sun, Sep 22, 2024 at 05:49:26AM -0700, Zhi Wang wrote:
> GSP firmware needs to know the number of max-supported vGPUs when
> initialization.
> 
> The field of VF partition count in the GSP WPR2 is required to be set
> according to the number of max-supported vGPUs.
> 
> Set the VF partition count in the GSP WPR2 when NVKM is loading the GSP
> firmware and initializes the GSP WPR2, if vGPU is enabled.

How/why is this different from the SRIOV num_vfs concept?

The way the SRIOV flow should work is you boot the PF, startup the
device, then userspace sets num_vfs and you get the SRIOV VFs.

Why would you want less/more partitions than VFs? Is there some way to
consume more than one partition per VF?

At least based on the commit message this seems like a very poor FW
interface.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 05/29] nvkm/vgpu: populate GSP_VF_INFO when NVIDIA vGPU is enabled
  2024-09-22 12:49 ` [RFC 05/29] nvkm/vgpu: populate GSP_VF_INFO " Zhi Wang
@ 2024-09-26 22:52   ` Jason Gunthorpe
  0 siblings, 0 replies; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-26 22:52 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, nouveau, alex.williamson, kevin.tian, airlied, daniel,
	acurrid, cjia, smitra, ankita, aniketa, kwankhede, targupta,
	zhiwang

On Sun, Sep 22, 2024 at 05:49:27AM -0700, Zhi Wang wrote:
> +void nvkm_vgpu_mgr_populate_gsp_vf_info(struct nvkm_device *device,
> +					void *info)
> +{
> +	struct pci_dev *pdev = nvkm_to_pdev(device);
> +	GspSystemInfo *gsp_info = info;
> +	GSP_VF_INFO *vf_info = &gsp_info->gspVFInfo;
> +	u32 lo, hi;
> +	u16 v;
> +	int pos;
> +
> +	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_SRIOV);
> +
> +	pci_read_config_word(pdev, pos + PCI_SRIOV_TOTAL_VF, &v);
> +	vf_info->totalVFs = v;
> +
> +	pci_read_config_word(pdev, pos + PCI_SRIOV_VF_OFFSET, &v);
> +	vf_info->firstVFOffset = v;
> +
> +	pci_read_config_dword(pdev, pos + PCI_SRIOV_BAR, &lo);
> +	vf_info->FirstVFBar0Address = lo & 0xFFFFFFF0;
> +
> +	pci_read_config_dword(pdev, pos + PCI_SRIOV_BAR + 4, &lo);
> +	pci_read_config_dword(pdev, pos + PCI_SRIOV_BAR + 8, &hi);
> +
> +	vf_info->FirstVFBar1Address = (((u64)hi) << 32) + (lo & 0xFFFFFFF0);
> +
> +	pci_read_config_dword(pdev, pos + PCI_SRIOV_BAR + 12, &lo);
> +	pci_read_config_dword(pdev, pos + PCI_SRIOV_BAR + 16, &hi);
> +
> +	vf_info->FirstVFBar2Address = (((u64)hi) << 32) + (lo & 0xFFFFFFF0);
> +
> +#define IS_BAR_64(i) (((i) & 0x00000006) == 0x00000004)

This should come from the PCI core, not re-read with pci_read_config
and hand-rolled macros.
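
A rough sketch of what that could look like, using only existing PCI core
helpers. GSP_VF_INFO and its field names are taken from the patch above; the
function name and the BAR slot layout are assumptions, not code from the
series:

#include <linux/pci.h>

/*
 * Sketch only, not part of the posted series: fill the FW struct from
 * state the PCI core has already decoded instead of re-reading config
 * space by hand.
 */
static void sketch_populate_gsp_vf_info(struct pci_dev *pdev,
					GSP_VF_INFO *vf_info)
{
	u16 pf_rid = PCI_DEVID(pdev->bus->number, pdev->devfn);
	u16 vf0_rid = PCI_DEVID(pci_iov_virtfn_bus(pdev, 0),
				pci_iov_virtfn_devfn(pdev, 0));

	vf_info->totalVFs = pci_sriov_get_totalvfs(pdev);
	vf_info->firstVFOffset = vf0_rid - pf_rid;

	/*
	 * The core stores VF BARs by their config-space slot, so with the
	 * layout in the quoted patch (BAR0 32-bit, BAR1/BAR2 64-bit) the
	 * second 64-bit BAR sits in slot 3.
	 */
	vf_info->FirstVFBar0Address = pci_resource_start(pdev, PCI_IOV_RESOURCES + 0);
	vf_info->FirstVFBar1Address = pci_resource_start(pdev, PCI_IOV_RESOURCES + 1);
	vf_info->FirstVFBar2Address = pci_resource_start(pdev, PCI_IOV_RESOURCES + 3);

	/*
	 * IS_BAR_64() becomes a check of the decoded resource flags, e.g.
	 * pci_resource_flags(pdev, PCI_IOV_RESOURCES + 1) & IORESOURCE_MEM_64.
	 */
}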

From a mlx perspective this is really weird; I'd expect the FW to be
able to read its own config space.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 06/29] nvkm/vgpu: set RMSetSriovMode when NVIDIA vGPU is enabled
  2024-09-22 12:49 ` [RFC 06/29] nvkm/vgpu: set RMSetSriovMode " Zhi Wang
@ 2024-09-26 22:53   ` Jason Gunthorpe
  2024-10-14  7:38     ` Zhi Wang
  0 siblings, 1 reply; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-26 22:53 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, nouveau, alex.williamson, kevin.tian, airlied, daniel,
	acurrid, cjia, smitra, ankita, aniketa, kwankhede, targupta,
	zhiwang

On Sun, Sep 22, 2024 at 05:49:28AM -0700, Zhi Wang wrote:
> The registry object "RMSetSriovMode" is required to be set when vGPU is
> enabled.
> 
> Set "RMSetSriovMode" to 1 when nvkm is loading the GSP firmware and
> initialize the GSP registry objects, if vGPU is enabled.

Also really weird, this sounds like what the PCI sriov enable is for.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 18/29] nvkm/vgpu: introduce pci_driver.sriov_configure() in nvkm
  2024-09-22 12:49 ` [RFC 18/29] nvkm/vgpu: introduce pci_driver.sriov_configure() in nvkm Zhi Wang
@ 2024-09-26 22:56   ` Jason Gunthorpe
  2024-10-14  8:32     ` Zhi Wang
  2024-10-14  8:36     ` Zhi Wang
  0 siblings, 2 replies; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-26 22:56 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, nouveau, alex.williamson, kevin.tian, airlied, daniel,
	acurrid, cjia, smitra, ankita, aniketa, kwankhede, targupta,
	zhiwang

On Sun, Sep 22, 2024 at 05:49:40AM -0700, Zhi Wang wrote:

> diff --git a/include/drm/nvkm_vgpu_mgr_vfio.h b/include/drm/nvkm_vgpu_mgr_vfio.h
> index d9ed2cd202ff..5c2c650c2df9 100644
> --- a/include/drm/nvkm_vgpu_mgr_vfio.h
> +++ b/include/drm/nvkm_vgpu_mgr_vfio.h
> @@ -6,8 +6,13 @@
>  #ifndef __NVKM_VGPU_MGR_VFIO_H__
>  #define __NVKM_VGPU_MGR_VFIO_H__
>  
> +enum {
> +	NVIDIA_VGPU_EVENT_PCI_SRIOV_CONFIGURE = 0,
> +};
> +
>  struct nvidia_vgpu_vfio_handle_data {
>  	void *priv;
> +	struct notifier_block notifier;
>  };

Nothing references this? Why would you need it?

It looks approx correct to me to just directly put your function in
the sriov_configure callback.

This is the callback that indicates the admin has decided to turn on
the SRIOV feature.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-26 12:55       ` Jason Gunthorpe
@ 2024-09-26 22:57         ` Jason Gunthorpe
  2024-09-27  0:13           ` Tian, Kevin
  0 siblings, 1 reply; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-26 22:57 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Zhi Wang, kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, airlied@gmail.com, daniel@ffwll.ch,
	Currid, Andy, cjia@nvidia.com, smitra@nvidia.com,
	ankita@nvidia.com, aniketa@nvidia.com, kwankhede@nvidia.com,
	targupta@nvidia.com, zhiwang@kernel.org

On Thu, Sep 26, 2024 at 09:55:28AM -0300, Jason Gunthorpe wrote:

> I'm not entirely sure yet what this whole 'mgr' component is actually
> doing though.

Looking more closely, I think some of it is certainly appropriate to be
in vfio. Like when something opens the VFIO device, it should allocate
the PF device resources from FW, set up kernel structures and so on to
allow the about-to-be-opened VF to work. Those are good VFIO topics. IOW,
if you don't open any VFIO devices there would be minimal overhead.

But that stuff shouldn't be shunted into some weird "mgr", it should
just be inside the struct vfio_device subclass inside the variant
driver.
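
The shape being suggested is roughly the one existing variant drivers such as
mlx5 already use: per-VF state hangs off a vfio_device subclass. A minimal
sketch (struct and field names here are illustrative, not from the series):

#include <linux/vfio_pci_core.h>

/*
 * Per-VF vGPU state embedded in the variant driver's device, allocated
 * and torn down around open_device()/close_device() rather than kept in
 * a separate "mgr" object.
 */
struct sketch_vgpu_vfio_device {
	struct vfio_pci_core_device core_device;
	/* illustrative per-VF state formerly kept in the mgr */
	void *fw_vgpu_handle;
	u64 fb_offset;
	u64 fb_size;
};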

How to get the provisioning into the kernel prior to VFIO open, and
what kind of control object should exist for the hypervisor side of
the VF, I'm not sure. In mlx5 we used devlink and a netdev/rdma
"representor" for a lot of this complex control stuff.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* RE: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-26 22:57         ` Jason Gunthorpe
@ 2024-09-27  0:13           ` Tian, Kevin
  0 siblings, 0 replies; 86+ messages in thread
From: Tian, Kevin @ 2024-09-27  0:13 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Zhi Wang, kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, airlied@gmail.com, daniel@ffwll.ch,
	Currid, Andy, cjia@nvidia.com, smitra@nvidia.com,
	ankita@nvidia.com, aniketa@nvidia.com, kwankhede@nvidia.com,
	targupta@nvidia.com, zhiwang@kernel.org

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, September 27, 2024 6:57 AM
> 
> On Thu, Sep 26, 2024 at 09:55:28AM -0300, Jason Gunthorpe wrote:
> 
> > I'm not entirely sure yet what this whole 'mgr' component is actually
> > doing though.
> 
> Looking more closely I think some of it is certainly appropriate to be
> in vfio. Like when something opens the VFIO device it should allocate
> the PF device resources from FW, setup kernel structures and so on to
> allow the about to be opened VF to work. That is good VFIO topics. IOW
> if you don't open any VFIO devices there would be a minimal overhead
> 
> But that stuff shouldn't be shunted into some weird "mgr", it should
> just be inside the struct vfio_device subclass inside the variant
> driver.

Yes. That's why I said earlier that the current way looks fine as long as
it doesn't expand to carry a vendor-specific provisioning interface. The
majority of the series is about allocating backend resources when the
device is opened; that's perfectly a VFIO topic.

Just the point of hardcoding a vGPU type now, while stating the mgr
will support selecting a vGPU type later, implies something not
clearly designed.

> 
> How to get the provisioning into the kernel prior to VFIO open, and
> what kind of control object should exist for the hypervisor side of
> the VF, I'm not sure. In mlx5 we used devlink and a netdev/rdma
> "representor" for a lot of this complex control stuff.
> 

The mlx5 approach is what I envisioned, or the fwctl option is
also fine after it's merged.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-26 22:42             ` Danilo Krummrich
@ 2024-09-27 12:51               ` Jason Gunthorpe
  2024-09-27 14:22                 ` Danilo Krummrich
  0 siblings, 1 reply; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-27 12:51 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Greg KH, Zhi Wang, kvm, nouveau, alex.williamson, kevin.tian,
	airlied, daniel, acurrid, cjia, smitra, ankita, aniketa,
	kwankhede, targupta, zhiwang

On Fri, Sep 27, 2024 at 12:42:56AM +0200, Danilo Krummrich wrote:
> On Thu, Sep 26, 2024 at 11:40:57AM -0300, Jason Gunthorpe wrote:
> > On Thu, Sep 26, 2024 at 02:54:38PM +0200, Greg KH wrote:
> > > 
> > > No, I do object to "we are ignoring the driver being proposed by the
> > > developers involved for this hardware by adding to the old one instead"
> > > which it seems like is happening here.
> > 
> > That is too harsh. We've consistently taken a community position that
> > OOT stuff doesn't matter, and yes that includes OOT stuff that people
> > we trust and respect are working on. Until it is ready for submission,
> > and ideally merged, it is an unknown quantity. Good well meaning
> > people routinely drop their projects, good projects run into
> > unexpected roadblocks, and life happens.
> 
> That's not the point -- at least it never was my point.
> 
> Upstream has set a strategy, and it's totally fine to raise concerns, discuss
> them, look for solutions, draw conclusions and do adjustments where needed.

We don't really do strategy in the kernel. This language is a bit
off-putting. Linux runs on community consensus, and if any strategy
exists it is reflected by the code actually merged.

When you say things like this it comes across as though you are
implying there are two tiers to the community, i.e. those that set the
strategy and those that don't.

> But, we have to agree on a long term strategy and work towards the corresponding
> goals *together*.

I think we went over all the options already. IMHO the right one is
for nova and vfio to share some kind of core driver. The choice of
Rust for nova complicates planning this, but it doesn't mean anyone is
saying no to it.

My main point is when this switches from VFIO on nouveau to VFIO on
Nova is something that needs to be a mutual decision with the VFIO
side and user community as well.

> So, when you say things like "go do Nova, have fun", it really just sounds like
> as if you just want to do your own thing and ignore the existing upstream
> strategy instead of collaborate and shape it.

I am saying I have no interest in interfering with your
project. Really, I read your responses as though you feel Nova is
under attack and I'm trying hard to say that is not at all my
intention.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-27 12:51               ` Jason Gunthorpe
@ 2024-09-27 14:22                 ` Danilo Krummrich
  2024-09-27 15:27                   ` Jason Gunthorpe
  0 siblings, 1 reply; 86+ messages in thread
From: Danilo Krummrich @ 2024-09-27 14:22 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Greg KH, Zhi Wang, kvm, nouveau, alex.williamson, kevin.tian,
	airlied, daniel, acurrid, cjia, smitra, ankita, aniketa,
	kwankhede, targupta, zhiwang

On Fri, Sep 27, 2024 at 09:51:15AM -0300, Jason Gunthorpe wrote:
> On Fri, Sep 27, 2024 at 12:42:56AM +0200, Danilo Krummrich wrote:
> > On Thu, Sep 26, 2024 at 11:40:57AM -0300, Jason Gunthorpe wrote:
> > > On Thu, Sep 26, 2024 at 02:54:38PM +0200, Greg KH wrote:
> > > > 
> > > > No, I do object to "we are ignoring the driver being proposed by the
> > > > developers involved for this hardware by adding to the old one instead"
> > > > which it seems like is happening here.
> > > 
> > > That is too harsh. We've consistently taken a community position that
> > > OOT stuff doesn't matter, and yes that includes OOT stuff that people
> > > we trust and respect are working on. Until it is ready for submission,
> > > and ideally merged, it is an unknown quantity. Good well meaning
> > > people routinely drop their projects, good projects run into
> > > unexpected roadblocks, and life happens.
> > 
> > That's not the point -- at least it never was my point.
> > 
> > Upstream has set a strategy, and it's totally fine to raise concerns, discuss
> > them, look for solutions, draw conclusions and do adjustments where needed.
> 
> We don't really do strategy in the kernel. This language is a bit
> off putting. Linux runs on community consensus and if any strategy
> exists it is reflected by the code actually merged.

We can also just call it "goals", but either way, of course maintainers set
goals for the components they maintain and hence have some sort of "strategy"
for how they want to evolve their components, to solve existing or foreseeable
problems.

However, I agree that those things may be reevaluated based on community
feedback and consensus. And I'm happy to do that.

See, you're twisting my words and implying that we wouldn't look for community
consensus, while I'm *explicitly* asking you to let us do exactly that. I want
to find consensus on the long-term goals that we all work on *together*, because
I don't want to end up with competing projects.

And I think it's reasonable to first consider the goals that have been set
already. Again, feel free to raise concerns and we'll discuss them and look for
solutions, but please don't just ignore the existing goals.

> 
> When you say things like this it comes across as though you are
> implying there are two tiers to the community. Ie those that set the
> strategy and those that don't.

This isn't true. I just ask you to consider the goals that have been set
already, because we have been working on this for a while.

*We can discuss them*, but I do ask you to accept the current direction as a
baseline for discussion. I don't think this is unreasonable, is it?

> 
> > But, we have to agree on a long term strategy and work towards the corresponding
> > goals *together*.
> 
> I think we went over all the options already. IMHO the right one is
> for nova and vfio to share some kind of core driver. The choice of
> Rust for nova complicates planning this, but it doesn't mean anyone is
> saying no to it.

This is the problem: you're many steps ahead.

You should start with understanding why we want the core driver to be in Rust.
You then can raise your concerns about it and then we can discuss them and see
if we can find solutions / consensus.

But you're not even considering it, and instead start with a counter proposal.
This isn't acceptable to me.

> 
> My main point is when this switches from VFIO on nouveau to VFIO on
> Nova is something that needs to be a mutual decision with the VFIO
> side and user community as well.

To me it's important that we agree on the goals and work towards them together.
If we seriously do that, then the "when" should be trivial to agree on.

> 
> > So, when you say things like "go do Nova, have fun", it really just sounds like
> > as if you just want to do your own thing and ignore the existing upstream
> > strategy instead of collaborate and shape it.
> 
> I am saying I have no interest in interfering with your
> project. Really, I read your responses as though you feel Nova is
> under attack and I'm trying hard to say that is not at all my
> intention.

I don't read this as Nova "being under attack" at all. I read it as "I don't
care about the goal to have the core driver in Rust, nor do I care about the
reasons you have for this.".

> 
> Jason
> 

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-27 14:22                 ` Danilo Krummrich
@ 2024-09-27 15:27                   ` Jason Gunthorpe
  2024-09-30 15:59                     ` Danilo Krummrich
  0 siblings, 1 reply; 86+ messages in thread
From: Jason Gunthorpe @ 2024-09-27 15:27 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Greg KH, Zhi Wang, kvm, nouveau, alex.williamson, kevin.tian,
	airlied, daniel, acurrid, cjia, smitra, ankita, aniketa,
	kwankhede, targupta, zhiwang

On Fri, Sep 27, 2024 at 04:22:32PM +0200, Danilo Krummrich wrote:
> > When you say things like this it comes across as though you are
> > implying there are two tiers to the community. Ie those that set the
> > strategy and those that don't.
> 
> This isn't true, I just ask you to consider the goals that have been set
> already, because we have been working on this already.

Why do you keep saying I haven't?

I have no intention of becoming involved in your project or
nouveau. My only interest here is to get an agreement that we can get
a VFIO driver (to improve the VFIO subsystem and community!) in the
near term on top of in-tree nouveau.

> > > But, we have to agree on a long term strategy and work towards the corresponding
> > > goals *together*.
> > 
> > I think we went over all the options already. IMHO the right one is
> > for nova and vfio to share some kind of core driver. The choice of
> > Rust for nova complicates planning this, but it doesn't mean anyone is
> > saying no to it.
> 
> This is the problem, you're many steps ahead.
> 
> You should start with understanding why we want the core driver to be in Rust.
> You then can raise your concerns about it and then we can discuss them and see
> if we can find solutions / consensus.

I don't want to debate with you about Nova. It is too far in the
future, and it doesn't intersect with anything I am doing.

> But you're not even considering it, and instead start with a counter proposal.
> This isn't acceptable to me.

I'm even agreeing to a transition into a core driver in Rust, someday,
when the full community can agree it is the right time.

What more do you want from me?

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
  2024-09-27 15:27                   ` Jason Gunthorpe
@ 2024-09-30 15:59                     ` Danilo Krummrich
  0 siblings, 0 replies; 86+ messages in thread
From: Danilo Krummrich @ 2024-09-30 15:59 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Greg KH, Zhi Wang, kvm, nouveau, alex.williamson, kevin.tian,
	airlied, daniel, acurrid, cjia, smitra, ankita, aniketa,
	kwankhede, targupta, zhiwang

On Fri, Sep 27, 2024 at 12:27:24PM -0300, Jason Gunthorpe wrote:
> On Fri, Sep 27, 2024 at 04:22:32PM +0200, Danilo Krummrich wrote:
> > > When you say things like this it comes across as though you are
> > > implying there are two tiers to the community. Ie those that set the
> > > strategy and those that don't.
> > 
> > This isn't true, I just ask you to consider the goals that have been set
> > already, because we have been working on this already.
> 
> Why do you keep saying I haven't?

Because I haven't seen you acknowledge that the current direction we're moving
in is to move away from Nouveau and start over with a new GSP-only solution.

Instead you propose a huge architectural rework of Nouveau, extracting a core
driver from Nouveau and making this the long-term solution.

> 
> I have no intention of becoming involved in your project or
> nouveau. My only interest here is to get an agreement that we can get
> a VFIO driver (to improve the VFIO subsystem and community!) in the
> near term on top of in-tree nouveau.

Two aspects to this.

First, Nova isn't a different project in this sense; it's the continuation of
Nouveau to overcome several problems we have with Nouveau.

Second, of course you have the intention of becoming involved in the Nouveau /
Nova project. You ask for huge architectural changes to Nouveau, including new
interfaces for a VFIO driver on top. If that's not becoming involved, what else
would it be?

> 
> > > > But, we have to agree on a long term strategy and work towards the corresponding
> > > > goals *together*.
> > > 
> > > I think we went over all the options already. IMHO the right one is
> > > for nova and vfio to share some kind of core driver. The choice of
> > > Rust for nova complicates planning this, but it doesn't mean anyone is
> > > saying no to it.
> > 
> > This is the problem, you're many steps ahead.
> > 
> > You should start with understanding why we want the core driver to be in Rust.
> > You then can raise your concerns about it and then we can discuss them and see
> > if we can find solutions / consensus.
> 
> I don't want to debate with you about Nova. It is too far in the
> future, and it doesn't intersect with anything I am doing.

Sure it does. Again, Nova is intended to be the continuation of Nouveau. So, if
you want to do a major rework in Nouveau (and hence become involved in the
project) we have to make sure that we progress things in the same direction.

How do you expect the project to be successful in the long term, if the involved
parties are not willing to agree on a direction and common goals for the
project?

Or is it that you are simply not interested in the long term? Do you have reasons
to think that the problems we have with Nouveau will just go away in the long
term? Do you plan to solve them within Nouveau? If so, how do you plan to do that?

> 
> > But you're not even considering it, and instead start with a counter proposal.
> > This isn't acceptable to me.
> 
> I'm even agreeing to a transition into a core driver in Rust, someday,
> when the full community can agree it is the right time.
> 
> What more do you want from me?

I want the people involved in the project to seriously discuss and align on
the long-term direction and goals for the project, and to work towards them
together.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 04/29] nvkm/vgpu: set the VF partition count when NVIDIA vGPU is enabled
  2024-09-26 22:51   ` Jason Gunthorpe
@ 2024-10-13 18:54     ` Zhi Wang
  2024-10-15 12:20       ` Jason Gunthorpe
  0 siblings, 1 reply; 86+ messages in thread
From: Zhi Wang @ 2024-10-13 18:54 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, kevin.tian@intel.com,
	airlied@gmail.com, daniel@ffwll.ch, Andy Currid, Neo Jia,
	Surath Mitra, Ankit Agrawal, Aniket Agashe, Kirti Wankhede,
	Tarun Gupta (SW-GPU), zhiwang@kernel.org

On 27/09/2024 1.51, Jason Gunthorpe wrote:
> On Sun, Sep 22, 2024 at 05:49:26AM -0700, Zhi Wang wrote:
>> GSP firmware needs to know the number of max-supported vGPUs when
>> initialization.
>>
>> The field of VF partition count in the GSP WPR2 is required to be set
>> according to the number of max-supported vGPUs.
>>
>> Set the VF partition count in the GSP WPR2 when NVKM is loading the GSP
>> firmware and initializes the GSP WPR2, if vGPU is enabled.
> 
> How/why is this different from the SRIOV num_vfs concept?
> 

1) The VF is considered the HW interface of a vGPU exposed to the VMM/VM.

2) The number of VFs is not always equal to the maximum number of vGPUs
supported, which depends on a) the size of the metadata in the video memory
space allocated for the FW to manage the vGPUs, and b) how the user divides
the resources. E.g. if a card has 48GB of video memory and the user creates
two vGPUs with 24GB of video memory each, only two VFs are usable even if
SRIOV num_vfs can be larger than that.
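
As a rough sketch of that bound (names are made up for illustration, not from
the series):

#include <linux/math64.h>

/*
 * The usable vGPU count is limited both by SRIOV TotalVFs and by how
 * the framebuffer is carved up for the chosen vGPU type.
 */
static unsigned int usable_vgpu_count(u64 fb_total, u64 fb_per_vgpu,
				      unsigned int total_vfs)
{
	unsigned int by_memory = div64_u64(fb_total, fb_per_vgpu);

	return by_memory < total_vfs ? by_memory : total_vfs;
}

/* 48GB card, 24GB per vGPU, 16 VFs advertised -> only 2 usable vGPUs */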

> The way the SRIOV flow should work is you boot the PF, startup the
> device, then userspace sets num_vfs and you get the SRIOV VFs.
> 
> Why would you want less/more partitions than VFs? Is there some way to
> consume more than one partition per VF?

No.

> At least based on the commit message this seems like a very poor FW
> interface.
>
> Jason


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 06/29] nvkm/vgpu: set RMSetSriovMode when NVIDIA vGPU is enabled
  2024-09-26 22:53   ` Jason Gunthorpe
@ 2024-10-14  7:38     ` Zhi Wang
  2024-10-15  3:49       ` Christoph Hellwig
  2024-10-15 12:23       ` Jason Gunthorpe
  0 siblings, 2 replies; 86+ messages in thread
From: Zhi Wang @ 2024-10-14  7:38 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, kevin.tian@intel.com,
	airlied@gmail.com, daniel@ffwll.ch, Andy Currid, Neo Jia,
	Surath Mitra, Ankit Agrawal, Aniket Agashe, Kirti Wankhede,
	Tarun Gupta (SW-GPU), zhiwang@kernel.org

On 27/09/2024 1.53, Jason Gunthorpe wrote:
> On Sun, Sep 22, 2024 at 05:49:28AM -0700, Zhi Wang wrote:
>> The registry object "RMSetSriovMode" is required to be set when vGPU is
>> enabled.
>>
>> Set "RMSetSriovMode" to 1 when nvkm is loading the GSP firmware and
>> initialize the GSP registry objects, if vGPU is enabled.
> 
> Also really weird, this sounds like what the PCI sriov enable is for.
> 

As explained in the reply to PATCH 4, the concepts of vGPU and VF are not
identical. A PCI SRIOV VF is the HW interface for reaching a vGPU, and there
were generations in which the HW didn't have SRIOV VFs and a vGPU was reached
via other means.

The "RMSetSriovMode" here is not the same as the PCI SRIOV enable, which
activates the VFs and lets them appear on the PCI bus. It tells the GSP FW to
enable the mode of "vGPUs are reached by VFs".

> Jason


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 18/29] nvkm/vgpu: introduce pci_driver.sriov_configure() in nvkm
  2024-09-26 22:56   ` Jason Gunthorpe
@ 2024-10-14  8:32     ` Zhi Wang
  2024-10-15 12:27       ` Jason Gunthorpe
  2024-10-14  8:36     ` Zhi Wang
  1 sibling, 1 reply; 86+ messages in thread
From: Zhi Wang @ 2024-10-14  8:32 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, kevin.tian@intel.com,
	airlied@gmail.com, daniel@ffwll.ch, Andy Currid, Neo Jia,
	Surath Mitra, Ankit Agrawal, Aniket Agashe, Kirti Wankhede,
	Tarun Gupta (SW-GPU), zhiwang@kernel.org

On 27/09/2024 1.56, Jason Gunthorpe wrote:
> On Sun, Sep 22, 2024 at 05:49:40AM -0700, Zhi Wang wrote:
> 
>> diff --git a/include/drm/nvkm_vgpu_mgr_vfio.h b/include/drm/nvkm_vgpu_mgr_vfio.h
>> index d9ed2cd202ff..5c2c650c2df9 100644
>> --- a/include/drm/nvkm_vgpu_mgr_vfio.h
>> +++ b/include/drm/nvkm_vgpu_mgr_vfio.h
>> @@ -6,8 +6,13 @@
>>   #ifndef __NVKM_VGPU_MGR_VFIO_H__
>>   #define __NVKM_VGPU_MGR_VFIO_H__
>>   
>> +enum {
>> +	NVIDIA_VGPU_EVENT_PCI_SRIOV_CONFIGURE = 0,
>> +};
>> +
>>   struct nvidia_vgpu_vfio_handle_data {
>>   	void *priv;
>> +	struct notifier_block notifier;
>>   };
> 
> Nothing references this? Why would you need it?
> 

Oops, these are leftovers of discarded changes. Will remove them
accordingly in the next iteration. Thanks so much for catching this.

> It looks approx correct to me to just directly put your function in
> the sriov_configure callback.
> 
> This is the callback that indicates the admin has decided to turn on
> the SRIOV feature.

Turning on the SRIOV feature is just a part of the process of enabling a
vGPU. The VF is not usable until a vGPU type is chosen via another
userspace interface (e.g. fwctl).

Besides, the admin has to enable vGPU support by some means (a kernel
parameter is just one candidate), and the GSP firmware needs to be
configured accordingly when it is loaded.

As this is related to the user-space interface, I am leaning towards putting
some restrictions/checks for the pre-condition in driver.sriov_configure(),
so the admin would know something is wrong in his configuration as early as
possible, instead of failing to create vGPUs again and again and only then
finding out he forgot to enable vGPU support.

Thanks,
Zhi.

> Jason


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 18/29] nvkm/vgpu: introduce pci_driver.sriov_configure() in nvkm
  2024-09-26 22:56   ` Jason Gunthorpe
  2024-10-14  8:32     ` Zhi Wang
@ 2024-10-14  8:36     ` Zhi Wang
  1 sibling, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-10-14  8:36 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, kevin.tian@intel.com,
	airlied@gmail.com, daniel@ffwll.ch, Andy Currid, Neo Jia,
	Surath Mitra, Ankit Agrawal, Aniket Agashe, Kirti Wankhede,
	Tarun Gupta (SW-GPU), zhiwang@kernel.org

On 27/09/2024 1.56, Jason Gunthorpe wrote:

Re-sending this email as there were weird spaces in the previous one.

> On Sun, Sep 22, 2024 at 05:49:40AM -0700, Zhi Wang wrote:
> 
>> diff --git a/include/drm/nvkm_vgpu_mgr_vfio.h b/include/drm/nvkm_vgpu_mgr_vfio.h
>> index d9ed2cd202ff..5c2c650c2df9 100644
>> --- a/include/drm/nvkm_vgpu_mgr_vfio.h
>> +++ b/include/drm/nvkm_vgpu_mgr_vfio.h
>> @@ -6,8 +6,13 @@
>>   #ifndef __NVKM_VGPU_MGR_VFIO_H__
>>   #define __NVKM_VGPU_MGR_VFIO_H__
>>   
>> +enum {
>> +	NVIDIA_VGPU_EVENT_PCI_SRIOV_CONFIGURE = 0,
>> +};
>> +
>>   struct nvidia_vgpu_vfio_handle_data {
>>   	void *priv;
>> +	struct notifier_block notifier;
>>   };
> 
> Nothing references this? Why would you need it?
> 
> It looks approx correct to me to just directly put your function in
> the sriov_configure callback.
> 

Oops, these are leftovers of discarded changes. Will remove them
accordingly in the next iteration. Thanks so much for catching this.

> This is the callback that indicates the admin has decided to turn on
> the SRIOV feature.
> 

As this is related to the user-space interface, I am leaning towards putting
some restrictions/checks for the pre-condition in
driver.sriov_configure(), so the admin would know something is wrong in his
configuration as early as possible, instead of failing to create vGPUs again
and again and only then finding out he forgot to enable vGPU support.
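
A minimal sketch of such an early check, assuming a lookup helper from the
pci_dev to the nvkm device (the lookup name and error code are illustrative;
nvkm_vgpu_mgr_is_enabled() is the helper from patch 1):

static int sketch_sriov_configure(struct pci_dev *pdev, int num_vfs)
{
	/* assumed lookup from pci_dev to the nvkm device */
	struct nvkm_device *device = sketch_pdev_to_nvkm(pdev);
	int ret;

	if (num_vfs == 0) {
		pci_disable_sriov(pdev);
		return 0;
	}

	/* Fail at "echo N > sriov_numvfs" time if vGPU support is off. */
	if (!nvkm_vgpu_mgr_is_enabled(device))
		return -EOPNOTSUPP;

	ret = pci_enable_sriov(pdev, num_vfs);
	return ret ? ret : num_vfs;
}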

Thanks,
Zhi.

> Jason


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 01/29] nvkm/vgpu: introduce NVIDIA vGPU support prelude
  2024-09-26  9:20   ` Greg KH
@ 2024-10-14  9:59     ` Zhi Wang
  2024-10-14 11:36       ` Greg KH
  0 siblings, 1 reply; 86+ messages in thread
From: Zhi Wang @ 2024-10-14  9:59 UTC (permalink / raw)
  To: Greg KH
  Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, kevin.tian@intel.com, Jason Gunthorpe,
	airlied@gmail.com, daniel@ffwll.ch, Andy Currid, Neo Jia,
	Surath Mitra, Ankit Agrawal, Aniket Agashe, Kirti Wankhede,
	Tarun Gupta (SW-GPU), zhiwang@kernel.org

On 26/09/2024 12.20, Greg KH wrote:
> On Sun, Sep 22, 2024 at 05:49:23AM -0700, Zhi Wang wrote:
>> NVIDIA GPU virtualization is a technology that allows multiple virtual
>> machines (VMs) to share the power of a single GPU, enabling greater
>> flexibility, efficiency, and cost-effectiveness in data centers and cloud
>> environments.
>>
>> The first step of supporting NVIDIA vGPU in nvkm is to introduce the
>> necessary vGPU data structures and functions to hook into the
>> (de)initialization path of nvkm.
>>
>> Introduce NVIDIA vGPU data structures and functions hooking into the
>> the (de)initialization path of nvkm and support the following patches.
>>
>> Cc: Neo Jia <cjia@nvidia.com>
>> Cc: Surath Mitra <smitra@nvidia.com>
>> Signed-off-by: Zhi Wang <zhiw@nvidia.com>
> 
> Some minor comments that are a hint you all aren't running checkpatch on
> your code...
> 
>> --- /dev/null
>> +++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
>> @@ -0,0 +1,17 @@
>> +/* SPDX-License-Identifier: MIT */
> 
> Wait, what?  Why?  Ick.  You all also forgot the copyright line :(
> 

Will fix it accordingly.

Back to the reason: I am trying to follow the majority in nouveau, since
this is a change to nouveau.

What are your guidelines about those already in the code?

inno@inno-linux:~/vgpu-linux-rfc/drivers/gpu/drm/nouveau$ grep -A 3 -R ": MIT" *

dispnv04/disp.h:/* SPDX-License-Identifier: MIT */
dispnv04/disp.h-#ifndef __NV04_DISPLAY_H__
dispnv04/disp.h-#define __NV04_DISPLAY_H__
dispnv04/disp.h-#include <subdev/bios.h>
--
dispnv04/cursor.c:// SPDX-License-Identifier: MIT
dispnv04/cursor.c-#include <drm/drm_mode.h>
dispnv04/cursor.c-#include "nouveau_drv.h"
dispnv04/cursor.c-#include "nouveau_reg.h"
--
dispnv04/Kbuild:# SPDX-License-Identifier: MIT
dispnv04/Kbuild-nouveau-y += dispnv04/arb.o
dispnv04/Kbuild-nouveau-y += dispnv04/crtc.o
dispnv04/Kbuild-nouveau-y += dispnv04/cursor.o
--
dispnv50/crc.h:/* SPDX-License-Identifier: MIT */
dispnv50/crc.h-#ifndef __NV50_CRC_H__
dispnv50/crc.h-#define __NV50_CRC_H__
dispnv50/crc.h-
--
dispnv50/handles.h:/* SPDX-License-Identifier: MIT */
dispnv50/handles.h-#ifndef __NV50_KMS_HANDLES_H__
dispnv50/handles.h-#define __NV50_KMS_HANDLES_H__
dispnv50/handles.h-
--
dispnv50/crcc37d.h:/* SPDX-License-Identifier: MIT */
dispnv50/crcc37d.h-
dispnv50/crcc37d.h-#ifndef __CRCC37D_H__
dispnv50/crcc37d.h-#define __CRCC37D_H__
--
dispnv50/Kbuild:# SPDX-License-Identifier: MIT
dispnv50/Kbuild-nouveau-y += dispnv50/disp.o
dispnv50/Kbuild-nouveau-y += dispnv50/lut.o

>> --- /dev/null
>> +++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
>> @@ -0,0 +1,76 @@
>> +/* SPDX-License-Identifier: MIT */
>> +#include <core/device.h>
>> +#include <core/pci.h>
>> +#include <vgpu_mgr/vgpu_mgr.h>
>> +
>> +static bool support_vgpu_mgr = false;
> 
> A global variable for the whole system?  Are you sure that will work
> well over time?  Why isn't this a per-device thing?
> 
>> +module_param_named(support_vgpu_mgr, support_vgpu_mgr, bool, 0400);
> 
> This is not the 1990's, please never add new module parameters, use
> per-device variables.  And no documentation?  That's not ok either even
> if you did want to have this.
> 

Thanks for the comments. I am mostly collecting people's opinions on the
means of enabling/disabling vGPU; a kernel parameter is just one of the
options. Even if it is chosen, a global kernel parameter is not expected to
be in the !RFC patch.

>> +static inline struct pci_dev *nvkm_to_pdev(struct nvkm_device *device)
>> +{
>> +     struct nvkm_device_pci *pci = container_of(device, typeof(*pci),
>> +                                                device);
>> +
>> +     return pci->pdev;
>> +}
>> +
>> +/**
>> + * nvkm_vgpu_mgr_is_supported - check if a platform support vGPU
>> + * @device: the nvkm_device pointer
>> + *
>> + * Returns: true on supported platform which is newer than ADA Lovelace
>> + * with SRIOV support.
>> + */
>> +bool nvkm_vgpu_mgr_is_supported(struct nvkm_device *device)
>> +{
>> +     struct pci_dev *pdev = nvkm_to_pdev(device);
>> +
>> +     if (!support_vgpu_mgr)
>> +             return false;
>> +
>> +     return device->card_type == AD100 &&  pci_sriov_get_totalvfs(pdev);
> 
> checkpatch please.
> 

I did run it before sending, but it doesn't complain about this line.

My command line:
$ scripts/checkpatch.pl [this patch]

> And "AD100" is an odd #define, as you know.

I agree, and people commented about it in the internal review. But it is
from the nouveau driver and has been used in many other places in the
nouveau driver. What would be your guidelines in this situation?

> 
>> +}
>> +
>> +/**
>> + * nvkm_vgpu_mgr_is_enabled - check if vGPU support is enabled on a PF
>> + * @device: the nvkm_device pointer
>> + *
>> + * Returns: true if vGPU enabled.
>> + */
>> +bool nvkm_vgpu_mgr_is_enabled(struct nvkm_device *device)
>> +{
>> +     return device->vgpu_mgr.enabled;
> 
> What happens if this changes right after you look at it?
> 

Nice catch. Will fix it.

> 
>> +}
>> +
>> +/**
>> + * nvkm_vgpu_mgr_init - Initialize the vGPU manager support
>> + * @device: the nvkm_device pointer
>> + *
>> + * Returns: 0 on success, -ENODEV on platforms that are not supported.
>> + */
>> +int nvkm_vgpu_mgr_init(struct nvkm_device *device)
>> +{
>> +     struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr;
>> +
>> +     if (!nvkm_vgpu_mgr_is_supported(device))
>> +             return -ENODEV;
>> +
>> +     vgpu_mgr->nvkm_dev = device;
>> +     vgpu_mgr->enabled = true;
>> +
>> +     pci_info(nvkm_to_pdev(device),
>> +              "NVIDIA vGPU mananger support is enabled.\n");
> 
> When drivers work properly, they are quiet.
>

I totally understand the rule that a driver should be quiet. But this is
not the same as "driver is loaded"; this is feature reporting like many
others.

My concern is that, as nouveau is a kernel driver, when a user meets a
kernel panic and offers a dmesg to analyze, it would at least be nice to
know whether the vGPU feature is turned on or not. Sysfs is doable, but
dmesg helps in different scenarios.
> Why can't you see this all in the sysfs tree instead to know if support
> is there or not?  You all are properly tieing in your "sub driver" logic
> to the driver model, right?  (hint, I don't think so as it looks like
> that isn't happening, but I could be missing it...)
> 
> thanks,
> 
> greg k-h


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 02/29] nvkm/vgpu: attach to nvkm as a nvkm client
  2024-09-26  9:21   ` Greg KH
@ 2024-10-14 10:16     ` Zhi Wang
  2024-10-14 11:33       ` Greg KH
  0 siblings, 1 reply; 86+ messages in thread
From: Zhi Wang @ 2024-10-14 10:16 UTC (permalink / raw)
  To: Greg KH
  Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, kevin.tian@intel.com, Jason Gunthorpe,
	airlied@gmail.com, daniel@ffwll.ch, Andy Currid, Neo Jia,
	Surath Mitra, Ankit Agrawal, Aniket Agashe, Kirti Wankhede,
	Tarun Gupta (SW-GPU), zhiwang@kernel.org, Ben Skeggs

On 26/09/2024 12.21, Greg KH wrote:
> On Sun, Sep 22, 2024 at 05:49:24AM -0700, Zhi Wang wrote:
>> nvkm is a HW abstraction layer(HAL) that initializes the HW and
>> allows its clients to manipulate the GPU functions regardless of the
>> generations of GPU HW. On the top layer, it provides generic APIs for a
>> client to connect to NVKM, enumerate the GPU functions, and manipulate
>> the GPU HW.
>>
>> To reach nvkm, the client needs to connect to NVKM layer by layer: driver
>> layer, client layer, and eventually, the device layer, which provides all
>> the access routines to GPU functions. After a client attaches to NVKM,
>> it initializes the HW and is able to serve the clients.
>>
>> Attach to nvkm as a nvkm client.
>>
>> Cc: Neo Jia <cjia@nvidia.com>
>> Signed-off-by: Zhi Wang <zhiw@nvidia.com>
>> ---
>>   .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  8 ++++
>>   .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c  | 48 ++++++++++++++++++-
>>   2 files changed, 55 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
>> index 3163fff1085b..9e10e18306b0 100644
>> --- a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
>> +++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
>> @@ -7,6 +7,14 @@
>>   struct nvkm_vgpu_mgr {
>>        bool enabled;
>>        struct nvkm_device *nvkm_dev;
>> +
>> +     const struct nvif_driver *driver;
> 
> Meta-comment, why is this attempting to act like a "driver" and yet not
> tieing into the driver model code at all?  Please fix that up, it's not
> ok to add more layers on top of a broken one like this.  We have
> infrastructure for this type of thing, please don't route around it.
> 

Thanks for the guidelines. Will try to work with folks and figure out a 
solution.

Ben is doing quite a few clean-ups of the nouveau driver[1]; they have been
reviewed and merged by Danilo. Also, the split-driver patchset[2] he is
working on seems a meaningful pre-step to fixing this, as it also includes
a refactor of the interface between the nvkm and the nvif stuff.

[1] https://lore.kernel.org/nouveau/CAPM=9tyW=YuDQrRwrYK_ayuvEnp+9irTuze=MP-zkowm3CFJ9A@mail.gmail.com/T/

[2] https://lore.kernel.org/dri-devel/20240613170211.88779-1-bskeggs@nvidia.com/T/

> thanks,
> 
> greg k-h


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 02/29] nvkm/vgpu: attach to nvkm as a nvkm client
  2024-10-14 10:16     ` Zhi Wang
@ 2024-10-14 11:33       ` Greg KH
  0 siblings, 0 replies; 86+ messages in thread
From: Greg KH @ 2024-10-14 11:33 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, kevin.tian@intel.com, Jason Gunthorpe,
	airlied@gmail.com, daniel@ffwll.ch, Andy Currid, Neo Jia,
	Surath Mitra, Ankit Agrawal, Aniket Agashe, Kirti Wankhede,
	Tarun Gupta (SW-GPU), zhiwang@kernel.org, Ben Skeggs

On Mon, Oct 14, 2024 at 10:16:21AM +0000, Zhi Wang wrote:
> On 26/09/2024 12.21, Greg KH wrote:
> > On Sun, Sep 22, 2024 at 05:49:24AM -0700, Zhi Wang wrote:
> >> nvkm is a HW abstraction layer(HAL) that initializes the HW and
> >> allows its clients to manipulate the GPU functions regardless of the
> >> generations of GPU HW. On the top layer, it provides generic APIs for a
> >> client to connect to NVKM, enumerate the GPU functions, and manipulate
> >> the GPU HW.
> >>
> >> To reach nvkm, the client needs to connect to NVKM layer by layer: driver
> >> layer, client layer, and eventually, the device layer, which provides all
> >> the access routines to GPU functions. After a client attaches to NVKM,
> >> it initializes the HW and is able to serve the clients.
> >>
> >> Attach to nvkm as a nvkm client.
> >>
> >> Cc: Neo Jia <cjia@nvidia.com>
> >> Signed-off-by: Zhi Wang <zhiw@nvidia.com>
> >> ---
> >>   .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  8 ++++
> >>   .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c  | 48 ++++++++++++++++++-
> >>   2 files changed, 55 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
> >> index 3163fff1085b..9e10e18306b0 100644
> >> --- a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
> >> +++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
> >> @@ -7,6 +7,14 @@
> >>   struct nvkm_vgpu_mgr {
> >>        bool enabled;
> >>        struct nvkm_device *nvkm_dev;
> >> +
> >> +     const struct nvif_driver *driver;
> > 
> > Meta-comment, why is this attempting to act like a "driver" and yet not
> > tieing into the driver model code at all?  Please fix that up, it's not
> > ok to add more layers on top of a broken one like this.  We have
> > infrastructure for this type of thing, please don't route around it.
> > 
> 
> Thanks for the guidelines. Will try to work with folks and figure out a 
> solution.
> 
> Ben is doing quite some clean-ups of nouveau driver[1], they had been 
> reviewed and merged by Danilo. Also, the split driver patchset he is 
> working on seems a meaningful pre-step to fix this, as it also includes 
> the re-factor of the interface between the nvkm and the nvif stuff.

What we need is the auxbus code changes that are pointed to somewhere in
those long threads. That needs to land first, and this series rebased on it,
before it can be reviewed properly, as that is going to change your
device lifetime rules a lot.

Please do that and then move this "nvif_driver" to be the proper driver
core type to tie into the kernel correctly.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 01/29] nvkm/vgpu: introduce NVIDIA vGPU support prelude
  2024-10-14  9:59     ` Zhi Wang
@ 2024-10-14 11:36       ` Greg KH
  0 siblings, 0 replies; 86+ messages in thread
From: Greg KH @ 2024-10-14 11:36 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, kevin.tian@intel.com, Jason Gunthorpe,
	airlied@gmail.com, daniel@ffwll.ch, Andy Currid, Neo Jia,
	Surath Mitra, Ankit Agrawal, Aniket Agashe, Kirti Wankhede,
	Tarun Gupta (SW-GPU), zhiwang@kernel.org

On Mon, Oct 14, 2024 at 09:59:18AM +0000, Zhi Wang wrote:
> On 26/09/2024 12.20, Greg KH wrote:
> > On Sun, Sep 22, 2024 at 05:49:23AM -0700, Zhi Wang wrote:
> >> NVIDIA GPU virtualization is a technology that allows multiple virtual
> >> machines (VMs) to share the power of a single GPU, enabling greater
> >> flexibility, efficiency, and cost-effectiveness in data centers and cloud
> >> environments.
> >>
> >> The first step of supporting NVIDIA vGPU in nvkm is to introduce the
> >> necessary vGPU data structures and functions to hook into the
> >> (de)initialization path of nvkm.
> >>
> >> Introduce NVIDIA vGPU data structures and functions hooking into the
> >> the (de)initialization path of nvkm and support the following patches.
> >>
> >> Cc: Neo Jia <cjia@nvidia.com>
> >> Cc: Surath Mitra <smitra@nvidia.com>
> >> Signed-off-by: Zhi Wang <zhiw@nvidia.com>
> > 
> > Some minor comments that are a hint you all aren't running checkpatch on
> > your code...
> > 
> >> --- /dev/null
> >> +++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
> >> @@ -0,0 +1,17 @@
> >> +/* SPDX-License-Identifier: MIT */
> > 
> > Wait, what?  Why?  Ick.  You all also forgot the copyright line :(
> > 
> 
> Will fix it accordingly.
> 
> Back to the reason, I am trying to follow the majority in the nouveau 
> since this is the change of nouveau.
> 
> What's your guidelines about those already in the code?

My "guidelines" are that your lawyers agree on what needs to be done, and
that you do that.

After that, my opinion is you do the proper thing and follow the kernel
licenses here, ESPECIALLY as you will be talking to gpl-only symbols
(hint, MIT licensed code doesn't make any sense there, and go get your
legal approval if you think it does...)

> >> +static bool support_vgpu_mgr = false;
> > 
> > A global variable for the whole system?  Are you sure that will work
> > well over time?  Why isn't this a per-device thing?
> > 
> >> +module_param_named(support_vgpu_mgr, support_vgpu_mgr, bool, 0400);
> > 
> > This is not the 1990's, please never add new module parameters, use
> > per-device variables.  And no documentation?  That's not ok either even
> > if you did want to have this.
> 
> Thanks for the comments. I am most collecting people opinion on the 
> means of enabling/disabling the vGPU, via kernel parameter or not is 
> just one of the options. If it is chosen, having a global kernel 
> parameter is not expected to be in the !RFC patch.

That wasn't explained anywhere that I noticed; did I miss it?

Please do this properly, again, kernel module parameters is not the
proper way.
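
As a sketch of what "per-device" could mean here, one option is a sysfs
attribute on the GPU's PCI device instead of a module parameter. The
attribute name and the device lookup below are hypothetical, and the real
thing would still have to feed into GSP bring-up as discussed in this thread;
only vgpu_mgr.enabled is taken from the posted patch:

static ssize_t vgpu_mgr_enable_store(struct device *dev,
				     struct device_attribute *attr,
				     const char *buf, size_t count)
{
	/* assumed lookup from struct device to the nvkm device */
	struct nvkm_device *device = sketch_dev_to_nvkm(dev);
	bool enable;
	int ret;

	ret = kstrtobool(buf, &enable);
	if (ret)
		return ret;

	device->vgpu_mgr.enabled = enable;	/* per device, not global */
	return count;
}
static DEVICE_ATTR_WO(vgpu_mgr_enable);

/* registered per GPU, e.g. via device_create_file() at probe time */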

> >> +static inline struct pci_dev *nvkm_to_pdev(struct nvkm_device *device)
> >> +{
> >> +     struct nvkm_device_pci *pci = container_of(device, typeof(*pci),
> >> +                                                device);
> >> +
> >> +     return pci->pdev;
> >> +}
> >> +
> >> +/**
> >> + * nvkm_vgpu_mgr_is_supported - check if a platform support vGPU
> >> + * @device: the nvkm_device pointer
> >> + *
> >> + * Returns: true on supported platform which is newer than ADA Lovelace
> >> + * with SRIOV support.
> >> + */
> >> +bool nvkm_vgpu_mgr_is_supported(struct nvkm_device *device)
> >> +{
> >> +     struct pci_dev *pdev = nvkm_to_pdev(device);
> >> +
> >> +     if (!support_vgpu_mgr)
> >> +             return false;
> >> +
> >> +     return device->card_type == AD100 &&  pci_sriov_get_totalvfs(pdev);
> > 
> > checkpatch please.
> > 
> 
> I did before sending it, but it doesn't complain this line.
> 
> My command line
> $ scripts/checkpatch.pl [this patch]

Then something is odd as that '  ' should have been caught.

> > And "AD100" is an odd #define, as you know.
> 
> I agree and people commented about it in the internal review. But it is 
> from the nouveau driver and it has been used in many other places in 
> nouveau driver. What would be your guidelines in this situation?

Something properly namespaced?

> >> +/**
> >> + * nvkm_vgpu_mgr_init - Initialize the vGPU manager support
> >> + * @device: the nvkm_device pointer
> >> + *
> >> + * Returns: 0 on success, -ENODEV on platforms that are not supported.
> >> + */
> >> +int nvkm_vgpu_mgr_init(struct nvkm_device *device)
> >> +{
> >> +     struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr;
> >> +
> >> +     if (!nvkm_vgpu_mgr_is_supported(device))
> >> +             return -ENODEV;
> >> +
> >> +     vgpu_mgr->nvkm_dev = device;
> >> +     vgpu_mgr->enabled = true;
> >> +
> >> +     pci_info(nvkm_to_pdev(device),
> >> +              "NVIDIA vGPU mananger support is enabled.\n");
> > 
> > When drivers work properly, they are quiet.
> >
> 
> I totally understand this rule that driver should be quiet. But this is 
> not the same as "driver is loaded". This is a feature reporting like 
> many others

And again, those "many others" need to be quiet too; we have many ways
to properly gather system information, and the kernel boot log is not
that.

> My concern is as nouveau is a kernel driver, when a user mets a kernel 
> panic and offers a dmesg to analyze, it would be at least nice to know 
> if the vGPU feature is turned on or not. Sysfs is doable, but it helps 
> in different scenarios.

A kernel panic is usually way way way after this initial boot time
message.  Again, keep the boot fast, and quiet please.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 06/29] nvkm/vgpu: set RMSetSriovMode when NVIDIA vGPU is enabled
  2024-10-14  7:38     ` Zhi Wang
@ 2024-10-15  3:49       ` Christoph Hellwig
  2024-10-15 12:23       ` Jason Gunthorpe
  1 sibling, 0 replies; 86+ messages in thread
From: Christoph Hellwig @ 2024-10-15  3:49 UTC (permalink / raw)
  To: Zhi Wang
  Cc: Jason Gunthorpe, kvm@vger.kernel.org,
	nouveau@lists.freedesktop.org, alex.williamson@redhat.com,
	kevin.tian@intel.com, airlied@gmail.com, daniel@ffwll.ch,
	Andy Currid, Neo Jia, Surath Mitra, Ankit Agrawal, Aniket Agashe,
	Kirti Wankhede, Tarun Gupta (SW-GPU), zhiwang@kernel.org

On Mon, Oct 14, 2024 at 07:38:03AM +0000, Zhi Wang wrote:
> As what has been explained in PATCH 4's reply, the concept of vGPU and 
> VF are not identically equal. PCI SRIOV VF is the HW interface of 
> reaching a vGPU and there were generations in which HW didn't have SRIOV 
> VFs and a vGPU is reached via other means.

What does "were" mean?  Are they supported by this driver?  If so, how?
If not, that's entirely irrelevant.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 04/29] nvkm/vgpu: set the VF partition count when NVIDIA vGPU is enabled
  2024-10-13 18:54     ` Zhi Wang
@ 2024-10-15 12:20       ` Jason Gunthorpe
  2024-10-15 15:19         ` Zhi Wang
  0 siblings, 1 reply; 86+ messages in thread
From: Jason Gunthorpe @ 2024-10-15 12:20 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, kevin.tian@intel.com,
	airlied@gmail.com, daniel@ffwll.ch, Andy Currid, Neo Jia,
	Surath Mitra, Ankit Agrawal, Aniket Agashe, Kirti Wankhede,
	Tarun Gupta (SW-GPU), zhiwang@kernel.org

On Sun, Oct 13, 2024 at 06:54:32PM +0000, Zhi Wang wrote:
> On 27/09/2024 1.51, Jason Gunthorpe wrote:
> > On Sun, Sep 22, 2024 at 05:49:26AM -0700, Zhi Wang wrote:
> >> GSP firmware needs to know the number of max-supported vGPUs when
> >> initialization.
> >>
> >> The field of VF partition count in the GSP WPR2 is required to be set
> >> according to the number of max-supported vGPUs.
> >>
> >> Set the VF partition count in the GSP WPR2 when NVKM is loading the GSP
> >> firmware and initializes the GSP WPR2, if vGPU is enabled.
> > 
> > How/why is this different from the SRIOV num_vfs concept?
> > 
> 
> 1) The VF is considered as an HW interface of vGPU exposed to the VMM/VM.
> 
> 2) Number of VF is not always equal to number of max vGPU supported, 
> which depends on a) the size of metadata of video memory space allocated 
> for FW to manage the vGPUs. b) how user divide the resources. E.g. if a 
> card has 48GB video memory, and user creates two vGPUs each has 24GB 
> video memory. Only two VFs are usable even SRIOV num_vfs can be large 
> than that.

But that can't be determined at driver load time; the profiling of the
VFs must happen at run time, when the orchestration determines what kind
of VM instance type to run.

Which again gets back to the question of why you need to specify
the number of VFs at FW boot time. Why isn't it just fully dynamic and
driven by the SRIOV enable?

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 06/29] nvkm/vgpu: set RMSetSriovMode when NVIDIA vGPU is enabled
  2024-10-14  7:38     ` Zhi Wang
  2024-10-15  3:49       ` Christoph Hellwig
@ 2024-10-15 12:23       ` Jason Gunthorpe
  1 sibling, 0 replies; 86+ messages in thread
From: Jason Gunthorpe @ 2024-10-15 12:23 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, kevin.tian@intel.com,
	airlied@gmail.com, daniel@ffwll.ch, Andy Currid, Neo Jia,
	Surath Mitra, Ankit Agrawal, Aniket Agashe, Kirti Wankhede,
	Tarun Gupta (SW-GPU), zhiwang@kernel.org

On Mon, Oct 14, 2024 at 07:38:03AM +0000, Zhi Wang wrote:
> On 27/09/2024 1.53, Jason Gunthorpe wrote:
> > On Sun, Sep 22, 2024 at 05:49:28AM -0700, Zhi Wang wrote:
> >> The registry object "RMSetSriovMode" is required to be set when vGPU is
> >> enabled.
> >>
> >> Set "RMSetSriovMode" to 1 when nvkm is loading the GSP firmware and
> >> initialize the GSP registry objects, if vGPU is enabled.
> > 
> > Also really weird, this sounds like what the PCI sriov enable is for.
> > 
> 
> As what has been explained in PATCH 4's reply, the concept of vGPU and 
> VF are not identically equal. PCI SRIOV VF is the HW interface of 
> reaching a vGPU and there were generations in which HW didn't have SRIOV 
> VFs and a vGPU is reached via other means.
> 
> The "RMSetSriovMode" here is not equal to PCI SRIOV enable, which 
> activates the VFs and let them present on PCI bus. It is to tell the GSP 
> FW to enable the mode of "vGPUs are reached by VFs".

Which is useless if you don't enable SRIOV, so again, this seems like
it should be dynamic, and whatever activating this does should be
shifted to SRIOV enable time and not FW load time.

There is a fundamental issue in Linux with trying to configure drivers
statically when they are probed. We want to avoid that as much as
possible.

If it can't be properly dynamic then the driver needs to take its
configuration from device flash, or you need to build a whole system
to allow configuring and rebooting the device - this is pretty hard.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 18/29] nvkm/vgpu: introduce pci_driver.sriov_configure() in nvkm
  2024-10-14  8:32     ` Zhi Wang
@ 2024-10-15 12:27       ` Jason Gunthorpe
  2024-10-15 15:14         ` Zhi Wang
  0 siblings, 1 reply; 86+ messages in thread
From: Jason Gunthorpe @ 2024-10-15 12:27 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, kevin.tian@intel.com,
	airlied@gmail.com, daniel@ffwll.ch, Andy Currid, Neo Jia,
	Surath Mitra, Ankit Agrawal, Aniket Agashe, Kirti Wankhede,
	Tarun Gupta (SW-GPU), zhiwang@kernel.org

On Mon, Oct 14, 2024 at 08:32:03AM +0000, Zhi Wang wrote:

> Turning on the SRIOV feature is just a part of the process of enabling 
> a vGPU. The VF is not immediately usable until a vGPU type is chosen 
> via another userspace interface (e.g. fwctl).

That's OK, that has become pretty normal now that VFs are just empty
handles when they are created until they are properly profiled.

> Besides, the admin has to enable vGPU support by some means (e.g. a 
> kernel parameter is just one candidate) and the GSP firmware needs to 
> be configured accordingly when it is loaded.

Definitely not a kernel parameter..

> As this is related to the user space interface, I am leaning towards 
> putting some restrictions/checks for the pre-condition in 
> driver.sriov_configure(), so the admin would know there is something 
> wrong in the configuration as early as possible, instead of repeatedly 
> failing to create vGPUs and only then finding out that vGPU support 
> was never enabled.

Well, as I've said, this is poor, you shouldn't have a FW SRIOV enable
bit at all, or at least it shouldn't be user configurable.

If the PCI function supports SRIOV then it should work to turn it on.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 18/29] nvkm/vgpu: introduce pci_driver.sriov_configure() in nvkm
  2024-10-15 12:27       ` Jason Gunthorpe
@ 2024-10-15 15:14         ` Zhi Wang
  0 siblings, 0 replies; 86+ messages in thread
From: Zhi Wang @ 2024-10-15 15:14 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, kevin.tian@intel.com,
	airlied@gmail.com, daniel@ffwll.ch, Andy Currid, Neo Jia,
	Surath Mitra, Ankit Agrawal, Aniket Agashe, Kirti Wankhede,
	Tarun Gupta (SW-GPU), zhiwang@kernel.org

On 15/10/2024 15.27, Jason Gunthorpe wrote:
> On Mon, Oct 14, 2024 at 08:32:03AM +0000, Zhi Wang wrote:
> 
>> Turning on the SRIOV feature is just a part of the process of enabling
>> a vGPU. The VF is not immediately usable until a vGPU type is chosen
>> via another userspace interface (e.g. fwctl).
> 
> That's OK, that has become pretty normal now that VFs are just empty
> handles when they are created until they are properly profiled.
> 
>> Besides, the admin has to enable vGPU support by some means (e.g. a
>> kernel parameter is just one candidate) and the GSP firmware needs to
>> be configured accordingly when it is loaded.
> 
> Definitely not a kernel parameter..
> 
>> As this is related to the user space interface, I am leaning towards
>> putting some restrictions/checks for the pre-condition in
>> driver.sriov_configure(), so the admin would know there is something
>> wrong in the configuration as early as possible, instead of repeatedly
>> failing to create vGPUs and only then finding out that vGPU support
>> was never enabled.
> 
> Well, as I've said, this is poor, you shouldn't have a FW SRIOV enable
> bit at all, or at least it shouldn't be user configurable.
> 
> If the PCI function supports SRIOV then it should work to turn it on.
> 
> Jason

Makes sense. Then we don't need a user-configurable option for 
enabling/disabling SRIOV, at least so far. We just enable it when we see 
that the HW supports SRIOV.
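
(A minimal sketch of that direction, with hypothetical nvkm-side names:
just honour the standard sysfs sriov_numvfs request whenever the HW
supports SRIOV, with no extra enable knob.)

    #include <linux/pci.h>

    static int nvkm_pci_sriov_configure(struct pci_dev *pdev, int num_vfs)
    {
            int ret;

            if (num_vfs == 0) {
                    pci_disable_sriov(pdev);
                    return 0;
            }

            if (num_vfs > pci_sriov_get_totalvfs(pdev))
                    return -EINVAL;

            /* vGPU profiling of the new VFs happens later, at run time. */
            ret = pci_enable_sriov(pdev, num_vfs);
            return ret ? ret : num_vfs;
    }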

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 04/29] nvkm/vgpu: set the VF partition count when NVIDIA vGPU is enabled
  2024-10-15 12:20       ` Jason Gunthorpe
@ 2024-10-15 15:19         ` Zhi Wang
  2024-10-15 16:35           ` Jason Gunthorpe
  0 siblings, 1 reply; 86+ messages in thread
From: Zhi Wang @ 2024-10-15 15:19 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, kevin.tian@intel.com,
	airlied@gmail.com, daniel@ffwll.ch, Andy Currid, Neo Jia,
	Surath Mitra, Ankit Agrawal, Aniket Agashe, Kirti Wankhede,
	Tarun Gupta (SW-GPU), zhiwang@kernel.org

On 15/10/2024 15.20, Jason Gunthorpe wrote:
> On Sun, Oct 13, 2024 at 06:54:32PM +0000, Zhi Wang wrote:
>> On 27/09/2024 1.51, Jason Gunthorpe wrote:
>>> On Sun, Sep 22, 2024 at 05:49:26AM -0700, Zhi Wang wrote:
> >>>> GSP firmware needs to know the number of max-supported vGPUs at
> >>>> initialization.
>>>>
>>>> The field of VF partition count in the GSP WPR2 is required to be set
>>>> according to the number of max-supported vGPUs.
>>>>
>>>> Set the VF partition count in the GSP WPR2 when NVKM is loading the GSP
>>>> firmware and initializes the GSP WPR2, if vGPU is enabled.
>>>
>>> How/why is this different from the SRIOV num_vfs concept?
>>>
>>
>> 1) The VF is considered a HW interface of the vGPU exposed to the VMM/VM.
>>
>> 2) The number of VFs is not always equal to the number of max-supported
>> vGPUs, which depends on a) the size of the metadata in video memory
>> allocated for the FW to manage the vGPUs, and b) how the user divides
>> the resources. E.g. if a card has 48GB of video memory and the user
>> creates two vGPUs with 24GB of video memory each, only two VFs are
>> usable even if SRIOV num_vfs can be larger than that.
> 
> But that can't be determined at driver load time; the profiling of the
> VFs must happen at run time when the orchestration determines what kind
> of VM instance type to run.
> 
> Which again gets back to the question of why you need to specify the
> number of VFs at FW boot time. Why isn't it just fully dynamic and
> driven by the SRIOV enable?
> 

The FW needs to pre-calculate the reserved video memory for its own use, 
which includes the size of the metadata for the max-supported vGPUs. 
This needs to be decided at FW loading time. We can always set it to the 
max number; the trade-off is that we lose some usable video memory, 
around (549-256)MB so far.
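
(Back-of-the-envelope illustration only; the constants are just the
numbers quoted above, assuming ~256MB is the heap without vGPU metadata
and ~549MB the heap sized for the max vGPU count -- the real sizes come
from the GSP WPR2 layout.)

    #define GSP_HEAP_DEFAULT_MB     256     /* without vGPU metadata */
    #define GSP_HEAP_MAX_VGPU_MB    549     /* sized for max vGPUs   */

    /* (549 - 256)MB ~= 293MB of framebuffer lost if we always reserve
     * for the maximum number of vGPUs.
     */
    static unsigned int vgpu_heap_overhead_mb(void)
    {
            return GSP_HEAP_MAX_VGPU_MB - GSP_HEAP_DEFAULT_MB;
    }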

> Jason


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC 04/29] nvkm/vgpu: set the VF partition count when NVIDIA vGPU is enabled
  2024-10-15 15:19         ` Zhi Wang
@ 2024-10-15 16:35           ` Jason Gunthorpe
  0 siblings, 0 replies; 86+ messages in thread
From: Jason Gunthorpe @ 2024-10-15 16:35 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm@vger.kernel.org, nouveau@lists.freedesktop.org,
	alex.williamson@redhat.com, kevin.tian@intel.com,
	airlied@gmail.com, daniel@ffwll.ch, Andy Currid, Neo Jia,
	Surath Mitra, Ankit Agrawal, Aniket Agashe, Kirti Wankhede,
	Tarun Gupta (SW-GPU), zhiwang@kernel.org

On Tue, Oct 15, 2024 at 03:19:33PM +0000, Zhi Wang wrote:

> The FW needs to pre-calculate the reserved video memory for its own use, 
> which includes the size of the metadata for the max-supported vGPUs. 
> This needs to be decided at FW loading time. We can always set it to the 
> max number; the trade-off is that we lose some usable video memory, 
> around (549-256)MB so far.

I think that is where you have to end up, as we don't really want
probe-time configurables; consider later updating how the FW works
to make this more optimal.

Jason

^ permalink raw reply	[flat|nested] 86+ messages in thread

end of thread, other threads:[~2024-10-15 16:36 UTC | newest]

Thread overview: 86+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-09-22 12:49 [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
2024-09-22 12:49 ` [RFC 01/29] nvkm/vgpu: introduce NVIDIA vGPU support prelude Zhi Wang
2024-09-26  9:20   ` Greg KH
2024-10-14  9:59     ` Zhi Wang
2024-10-14 11:36       ` Greg KH
2024-09-22 12:49 ` [RFC 02/29] nvkm/vgpu: attach to nvkm as a nvkm client Zhi Wang
2024-09-26  9:21   ` Greg KH
2024-10-14 10:16     ` Zhi Wang
2024-10-14 11:33       ` Greg KH
2024-09-22 12:49 ` [RFC 03/29] nvkm/vgpu: reserve a larger GSP heap when NVIDIA vGPU is enabled Zhi Wang
2024-09-22 12:49 ` [RFC 04/29] nvkm/vgpu: set the VF partition count " Zhi Wang
2024-09-26 22:51   ` Jason Gunthorpe
2024-10-13 18:54     ` Zhi Wang
2024-10-15 12:20       ` Jason Gunthorpe
2024-10-15 15:19         ` Zhi Wang
2024-10-15 16:35           ` Jason Gunthorpe
2024-09-22 12:49 ` [RFC 05/29] nvkm/vgpu: populate GSP_VF_INFO " Zhi Wang
2024-09-26 22:52   ` Jason Gunthorpe
2024-09-22 12:49 ` [RFC 06/29] nvkm/vgpu: set RMSetSriovMode " Zhi Wang
2024-09-26 22:53   ` Jason Gunthorpe
2024-10-14  7:38     ` Zhi Wang
2024-10-15  3:49       ` Christoph Hellwig
2024-10-15 12:23       ` Jason Gunthorpe
2024-09-22 12:49 ` [RFC 07/29] nvkm/gsp: add a notify handler for GSP event GPUACCT_PERFMON_UTIL_SAMPLES Zhi Wang
2024-09-22 12:49 ` [RFC 08/29] nvkm/vgpu: get the size VMMU segment from GSP firmware Zhi Wang
2024-09-22 12:49 ` [RFC 09/29] nvkm/vgpu: introduce the reserved channel allocator Zhi Wang
2024-09-22 12:49 ` [RFC 10/29] nvkm/vgpu: introduce interfaces for NVIDIA vGPU VFIO module Zhi Wang
2024-09-22 12:49 ` [RFC 11/29] nvkm/vgpu: introduce GSP RM client alloc and free for vGPU Zhi Wang
2024-09-22 12:49 ` [RFC 12/29] nvkm/vgpu: introduce GSP RM control interface " Zhi Wang
2024-09-22 12:49 ` [RFC 13/29] nvkm: move chid.h to nvkm/engine Zhi Wang
2024-09-22 12:49 ` [RFC 14/29] nvkm/vgpu: introduce channel allocation for vGPU Zhi Wang
2024-09-22 12:49 ` [RFC 15/29] nvkm/vgpu: introduce FB memory " Zhi Wang
2024-09-22 12:49 ` [RFC 16/29] nvkm/vgpu: introduce BAR1 map routines for vGPUs Zhi Wang
2024-09-22 12:49 ` [RFC 17/29] nvkm/vgpu: introduce engine bitmap for vGPU Zhi Wang
2024-09-22 12:49 ` [RFC 18/29] nvkm/vgpu: introduce pci_driver.sriov_configure() in nvkm Zhi Wang
2024-09-26 22:56   ` Jason Gunthorpe
2024-10-14  8:32     ` Zhi Wang
2024-10-15 12:27       ` Jason Gunthorpe
2024-10-15 15:14         ` Zhi Wang
2024-10-14  8:36     ` Zhi Wang
2024-09-22 12:49 ` [RFC 19/29] vfio/vgpu_mgr: introduce vGPU lifecycle management prelude Zhi Wang
2024-09-22 12:49 ` [RFC 20/29] vfio/vgpu_mgr: allocate GSP RM client for NVIDIA vGPU manager Zhi Wang
2024-09-22 12:49 ` [RFC 21/29] vfio/vgpu_mgr: introduce vGPU type uploading Zhi Wang
2024-09-22 12:49 ` [RFC 22/29] vfio/vgpu_mgr: allocate vGPU FB memory when creating vGPUs Zhi Wang
2024-09-22 12:49 ` [RFC 23/29] vfio/vgpu_mgr: allocate vGPU channels " Zhi Wang
2024-09-22 12:49 ` [RFC 24/29] vfio/vgpu_mgr: allocate mgmt heap " Zhi Wang
2024-09-22 12:49 ` [RFC 25/29] vfio/vgpu_mgr: map mgmt heap when creating a vGPU Zhi Wang
2024-09-22 12:49 ` [RFC 26/29] vfio/vgpu_mgr: allocate GSP RM client when creating vGPUs Zhi Wang
2024-09-22 12:49 ` [RFC 27/29] vfio/vgpu_mgr: bootload the new vGPU Zhi Wang
2024-09-25  0:31   ` Dave Airlie
2024-09-22 12:49 ` [RFC 28/29] vfio/vgpu_mgr: introduce vGPU host RPC channel Zhi Wang
2024-09-22 12:49 ` [RFC 29/29] vfio/vgpu_mgr: introduce NVIDIA vGPU VFIO variant driver Zhi Wang
2024-09-22 13:11 ` [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support Zhi Wang
2024-09-23  8:38   ` Danilo Krummrich
2024-09-24 19:49     ` Zhi Wang
2024-09-23  6:22 ` Tian, Kevin
2024-09-23 15:02   ` Jason Gunthorpe
2024-09-26  6:43     ` Tian, Kevin
2024-09-26 12:55       ` Jason Gunthorpe
2024-09-26 22:57         ` Jason Gunthorpe
2024-09-27  0:13           ` Tian, Kevin
2024-09-23  8:49 ` Danilo Krummrich
2024-09-23 15:01   ` Jason Gunthorpe
2024-09-23 22:50     ` Danilo Krummrich
2024-09-24 16:41       ` Jason Gunthorpe
2024-09-24 19:56         ` Danilo Krummrich
2024-09-24 22:52           ` Dave Airlie
2024-09-24 23:47             ` Jason Gunthorpe
2024-09-25  0:18               ` Dave Airlie
2024-09-25  1:29                 ` Jason Gunthorpe
2024-09-25  0:53           ` Jason Gunthorpe
2024-09-25  1:08             ` Dave Airlie
2024-09-25 15:28               ` Jason Gunthorpe
2024-09-25 10:55             ` Danilo Krummrich
2024-09-26  9:14     ` Greg KH
2024-09-26 12:42       ` Jason Gunthorpe
2024-09-26 12:54         ` Greg KH
2024-09-26 13:07           ` Danilo Krummrich
2024-09-26 14:40           ` Jason Gunthorpe
2024-09-26 18:07             ` Andy Ritger
2024-09-26 22:23               ` Danilo Krummrich
2024-09-26 22:42             ` Danilo Krummrich
2024-09-27 12:51               ` Jason Gunthorpe
2024-09-27 14:22                 ` Danilo Krummrich
2024-09-27 15:27                   ` Jason Gunthorpe
2024-09-30 15:59                     ` Danilo Krummrich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox