From: Cornelia Huck <cohuck@redhat.com>
To: Max Gurtovoy <mgurtovoy@nvidia.com>
Cc: <alex.williamson@redhat.com>, <kvm@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <jgg@nvidia.com>,
	<liranl@nvidia.com>, <oren@nvidia.com>, <tzahio@nvidia.com>,
	<leonro@nvidia.com>, <yarong@nvidia.com>, <aviadye@nvidia.com>,
	<shahafs@nvidia.com>, <artemp@nvidia.com>, <kwankhede@nvidia.com>,
	<ACurrid@nvidia.com>, <gmataev@nvidia.com>, <cjia@nvidia.com>
Subject: Re: [PATCH RFC v1 0/3] Introduce vfio-pci-core subsystem
Date: Mon, 18 Jan 2021 14:38:06 +0100
Message-ID: <20210118143806.036c8dbc.cohuck@redhat.com>
In-Reply-To: <20210117181534.65724-1-mgurtovoy@nvidia.com>

On Sun, 17 Jan 2021 18:15:31 +0000
Max Gurtovoy <mgurtovoy@nvidia.com> wrote:

> Hi Alex and Cornelia,
> 
> This series splits the vfio_pci driver into 2 parts: a pci driver and a
> subsystem driver that will also be a library of code. The pci driver,
> vfio_pci.ko, will be used as before and it will bind to the subsystem
> driver vfio_pci_core.ko to register to the VFIO subsystem. This patchset
> is fully backward compatible. This is typical Linux subsystem
> framework behaviour. This framework can also be adopted by vfio_mdev
> devices as we'll see in the below sketch.
> 
> This series aims to solve the issues that were raised in the
> previous attempt at extending vfio-pci for vendor specific
> functionality: https://lkml.org/lkml/2020/5/17/376 by Yan Zhao.
> 
> This solution is also deterministic in the sense that when a user
> binds to a vendor specific vfio-pci driver, they will get all the
> special goodies of the HW.
> 
> This subsystem framework will also ease adding vendor specific
> functionality to VFIO devices in the future by allowing another module
> to provide the pci_driver that can set up a number of details before
> registering to the VFIO subsystem (such as injecting its own operations).
> 
> Below we can see the proposed changes (this patchset only deals with
> VFIO_PCI subsystem but it can easily be extended to VFIO_MDEV subsystem
> as well):
> 
> +----------------------------------------------------------------------+
> |                                                                      |
> |                                VFIO                                  |
> |                                                                      |
> +----------------------------------------------------------------------+
> 
> +--------------------------------+    +--------------------------------+
> |                                |    |                                |
> |          VFIO_PCI_CORE         |    |          VFIO_MDEV_CORE        |
> |                                |    |                                |
> +--------------------------------+    +--------------------------------+
> 
> +---------------+ +--------------+    +---------------+ +--------------+
> |               | |              |    |               | |              |
> |               | |              |    |               | |              |
> | VFIO_PCI      | | MLX5_VFIO_PCI|    | VFIO_MDEV     | |MLX5_VFIO_MDEV|
> |               | |              |    |               | |              |
> |               | |              |    |               | |              |
> +---------------+ +--------------+    +---------------+ +--------------+
> 
> The first 2 patches introduce the above changes for vfio_pci and
> vfio_pci_core.
> 
> Patch (3/3) introduces a new mlx5 vfio-pci module that registers to the
> VFIO subsystem using vfio_pci_core. It also registers to the Auxiliary
> bus for binding to mlx5_core, which is the parent of mlx5-vfio-pci
> devices. This will allow extending mlx5-vfio-pci devices with HW
> specific features such as Live Migration (mlx5_core patches are not
> part of this series, which proposes the changes needed for the vfio pci
> subsystem).
> 
> These devices will be seen on the Auxiliary bus as:
> mlx5_core.vfio_pci.2048 -> ../../../devices/pci0000:00/0000:00:02.0/0000:05:00.0/0000:06:00.0/0000:07:00.0/mlx5_core.vfio_pci.2048
> mlx5_core.vfio_pci.2304 -> ../../../devices/pci0000:00/0000:00:02.0/0000:05:00.0/0000:06:00.0/0000:07:00.1/mlx5_core.vfio_pci.2304
> 
> 2048 represents BDF 08:00.0 and 2304 represents BDF 09:00.0 in decimal
> view. In this manner, the administrator will be able to locate the
> correct vfio-pci module it should bind the desired BDF to (by finding
> the pointer to the module according to the Auxiliary driver of that
> BDF).
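
For reference, the decimal suffixes quoted above can be decoded with a
few lines of user-space code. This is only a minimal sketch, assuming
the suffix is the usual 16-bit PCI encoding (bus << 8) | devfn with
devfn = (device << 3) | function; the helper name decode_aux_id is
hypothetical and not part of the series.

```python
def decode_aux_id(aux_id: int) -> str:
    """Decode a decimal auxiliary-device suffix into a bus:device.function string.

    Assumption: the suffix is (bus << 8) | devfn, where
    devfn = (device << 3) | function, as in the standard PCI encoding.
    """
    if not 0 <= aux_id <= 0xFFFF:
        raise ValueError(f"suffix out of 16-bit range: {aux_id}")
    bus = aux_id >> 8               # upper 8 bits
    device = (aux_id >> 3) & 0x1F   # 5-bit device number
    function = aux_id & 0x7         # 3-bit function number
    return f"{bus:02x}:{device:02x}.{function:x}"


# The two auxiliary device names quoted in the cover letter.
for name in ("mlx5_core.vfio_pci.2048", "mlx5_core.vfio_pci.2304"):
    suffix = int(name.rsplit(".", 1)[1])
    print(name, "->", decode_aux_id(suffix))
```

Under that assumption, 2048 (0x0800) maps to 08:00.0 and 2304 (0x0900)
maps to 09:00.0, matching the values stated in the cover letter.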

I'm not familiar with that auxiliary framework (it seems to be fairly
new?); but can you maybe create an auxiliary device unconditionally and
contain all hardware-specific things inside a driver for it? Or is that
not flexible enough?

> 
> In this way, we'll use the HW vendor driver core to manage the lifecycle
> of these devices. This is reasonable since only the vendor driver knows
> exactly the status of its internal state and the capabilities of
> its accelerators, for example.
> 
> TODOs:
> 1. For this RFC we still haven't cleaned up all the vendor specific
>    stuff that was merged in the past into vfio_pci (such as
>    VFIO_PCI_IGD and VFIO_PCI_NVLINK2).
> 2. Create a subsystem module for VFIO_MDEV. This can be used for vendor
>    specific scalable functions (SFs), for example.
> 3. Add Live migration functionality for mlx5 SNAP devices
>    (NVMe/Virtio-BLK).
> 4. Add Live migration functionality for mlx5 VFs.
> 5. Add the needed functionality for mlx5_core.
> 
> I would like to thank the great team that was involved in this
> development, design and internal review:
> Oren, Liran, Jason, Leon, Aviad, Shahaf, Gary, Artem, Kirti, Neo, Andy
> and others.
> 
> This series applies cleanly on top of kernel 5.11-rc2+ commit 2ff90100ace8:
> "Merge tag 'hwmon-for-v5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging"
> from Linus.
> 
> Note: Live migration for MLX5 SNAP devices is WIP and will be the first
>       example of adding a vendor extension to vfio-pci devices. As the
>       changes to the subsystem must be defined as a pre-condition for
>       this work, we've decided to split the submission for now.
> 
> Max Gurtovoy (3):
>   vfio-pci: rename vfio_pci.c to vfio_pci_core.c
>   vfio-pci: introduce vfio_pci_core subsystem driver
>   mlx5-vfio-pci: add new vfio_pci driver for mlx5 devices
> 
>  drivers/vfio/pci/Kconfig            |   22 +-
>  drivers/vfio/pci/Makefile           |   16 +-
>  drivers/vfio/pci/mlx5_vfio_pci.c    |  253 +++
>  drivers/vfio/pci/vfio_pci.c         | 2386 +--------------------------
>  drivers/vfio/pci/vfio_pci_core.c    | 2311 ++++++++++++++++++++++++++

Especially regarding this diffstat... from a quick glance at patch 3,
it mostly forwards to vfio_pci_core anyway. Do you expect a huge amount
of device-specific callback invocations?

[I have not looked at this in detail yet.]
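
To make the "mostly forwards to vfio_pci_core" pattern concrete, here is
a rough user-space model of it. It is not kernel code and does not
reflect the actual interfaces in the patches: the class names, the
call() dispatcher and the callback names are all hypothetical. It only
illustrates a thin driver that forwards every callback to shared core
code unless it supplies a device-specific override.

```python
class PciCoreDevice:
    """Hypothetical stand-in for the shared library code (the 'core')."""

    def __init__(self, bdf):
        self.bdf = bdf

    def open(self):
        return f"core open {self.bdf}"

    def ioctl(self, cmd):
        return f"core ioctl {cmd}"


class ThinDriver:
    """Hypothetical thin driver: forwards to the core unless overridden."""

    def __init__(self, core, overrides=None):
        self.core = core
        # Maps a callback name to a device-specific handler(core, *args).
        self.overrides = dict(overrides or {})

    def call(self, op, *args):
        handler = self.overrides.get(op)
        if handler is not None:
            # Device-specific path; the handler may still use the core.
            return handler(self.core, *args)
        # Default path: plain forwarding to the shared core code.
        return getattr(self.core, op)(*args)


core = PciCoreDevice("0000:08:00.0")
generic = ThinDriver(core)
vendor = ThinDriver(
    core,
    {"ioctl": lambda c, cmd: f"vendor hook, then {c.ioctl(cmd)}"},
)

print(generic.call("ioctl", "GET_INFO"))
print(vendor.call("ioctl", "GET_INFO"))
print(vendor.call("open"))
```

In this model the number of entries in the overrides table is the
number of device-specific callbacks, which is the quantity the question
above is asking about.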

>  drivers/vfio/pci/vfio_pci_private.h |   21 +
>  include/linux/mlx5/vfio_pci.h       |   36 +
>  7 files changed, 2734 insertions(+), 2311 deletions(-)
>  create mode 100644 drivers/vfio/pci/mlx5_vfio_pci.c
>  create mode 100644 drivers/vfio/pci/vfio_pci_core.c
>  create mode 100644 include/linux/mlx5/vfio_pci.h
> 

