From: "Michael S. Tsirkin" <mst@redhat.com>
To: Parav Pandit <parav@nvidia.com>
Cc: virtio-dev@lists.oasis-open.org, cohuck@redhat.com,
	david.edmondson@oracle.com, sburla@marvell.com,
	jasowang@redhat.com, yishaih@nvidia.com, maorg@nvidia.com,
	virtio-comment@lists.oasis-open.org, shahafs@nvidia.com
Subject: [virtio-dev] Re: [PATCH v3 0/3] transport-pci: Introduce legacy registers access using AQ
Date: Sun, 4 Jun 2023 09:34:23 -0400
Message-ID: <20230604092441-mutt-send-email-mst@kernel.org>
In-Reply-To: <20230602203604.627661-1-parav@nvidia.com>

On Fri, Jun 02, 2023 at 11:36:01PM +0300, Parav Pandit wrote:
> This short series introduces legacy registers access commands for the owner
> group member PCI PF to access the legacy registers of the member VFs.

Note that some work will be needed here to fix up grammar and spelling
mistakes.

> If any future SIOV devices need to support legacy registers, they
> can easily be supported with the same commands by using the group
> member identifiers of those SIOV devices.

Yes, with one exception: VIRTIO_ADMIN_CMD_LQ_NOTIFY_QUERY currently
refers to a VF BAR, which subfunctions do not have.
Can we find a way to have it in the PF BAR instead?
E.g. the notification could include VF# + VQ#?
At least as an option?
If not, can you add some info explaining why not?
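
As a purely illustrative sketch of the VF# + VQ# idea: a single
notification location in the PF BAR could accept a value that packs the
member VF number together with the virtqueue index. The 16/16-bit split
and all names below are hypothetical; nothing here is taken from the
patch or from the spec.

```c
#include <stdint.h>

/*
 * Hypothetical notification value layout (an assumption for illustration):
 *   bits 31..16  VF number of the group member
 *   bits 15..0   virtqueue index within that VF
 */
#define NOTIFY_VQ_BITS 16u
#define NOTIFY_VQ_MASK 0xFFFFu

/* Pack a VF number and a virtqueue index into one 32-bit value. */
static inline uint32_t notify_encode(uint16_t vf, uint16_t vq)
{
    return ((uint32_t)vf << NOTIFY_VQ_BITS) | (uint32_t)vq;
}

/* Extract the VF number from a packed notification value. */
static inline uint16_t notify_vf(uint32_t value)
{
    return (uint16_t)(value >> NOTIFY_VQ_BITS);
}

/* Extract the virtqueue index from a packed notification value. */
static inline uint16_t notify_vq(uint32_t value)
{
    return (uint16_t)(value & NOTIFY_VQ_MASK);
}
```

With such an encoding the driver would write notify_encode(vf, vq) to
one PF BAR location, so no per-VF BAR is required for notifications.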


> More details as overview, motivation, use case are further described
> below.
> 
> Patch summary:
> --------------
> patch-1 split rows of admin opcode tables by a line
> patch-2 adds administrative virtqueue commands
> patch-3 adds its conformance section
> 
> This short series is on top of latest work [1] from Michael.
> It uses the newly introduced administrative virtqueue facility with 3 new
> commands which use the existing virtio_admin_cmd.
> 
> [1] https://lists.oasis-open.org/archives/virtio-comment/202305/msg00112.html
> 
> Usecase:
> --------
> 1. A hypervisor/system needs to provide transitional
>    virtio devices to the guest VM at scale of thousands,
>    typically, one to eight devices per VM.
> 
> 2. A hypervisor/system needs to provide such devices using a
>    vendor agnostic driver in the hypervisor system.
> 
> 3. A hypervisor system prefers to have single stack regardless of
>    virtio device type (net/blk) and be future compatible with a
>    single vfio stack using SR-IOV or other scalable device
>    virtualization technology to map PCI devices to the guest VM.
>    (as transitional or otherwise)
> 
> Motivation/Background:
> ----------------------
> The existing virtio transitional PCI device is missing support for
> PCI SR-IOV based devices. Currently it does not work beyond
> PCI PF, or as software emulated device in reality. Currently it
> has below cited system level limitations:
> 
> [a] PCIe spec citation:
> VFs do not support I/O Space and thus VF BARs shall not indicate I/O Space.
> 
> [b] CPU architecture citation:
> Intel 64 and IA-32 Architectures Software Developer’s Manual:
> The processor’s I/O address space is separate and distinct from
> the physical-memory address space. The I/O address space consists
> of 64K individually addressable 8-bit I/O ports, numbered 0 through FFFFH.
> 
> [c] PCIe spec citation:
> If a bridge implements an I/O address range,...I/O address range will be
> aligned to a 4 KB boundary.
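
To make the scale limit implied by citations [b] and [c] concrete, the
arithmetic can be sketched as below. It assumes, for illustration only,
that each device exposing an I/O BAR sits behind its own bridge and so
consumes one 4 KB-aligned window; the constant and function names are
made up for this sketch, and real platforms also reserve part of the
I/O space for other uses.

```c
#include <stdint.h>

/* From citation [b]: 64K individually addressable I/O ports. */
#define IO_SPACE_SIZE 0x10000u
/* From citation [c]: a bridge I/O address range is aligned to 4 KB. */
#define BRIDGE_IO_WINDOW_ALIGN 0x1000u

/* Upper bound on the number of 4 KB-aligned bridge I/O windows. */
static inline uint32_t max_bridge_io_windows(void)
{
    return IO_SPACE_SIZE / BRIDGE_IO_WINDOW_ALIGN;
}
```

Under that assumption the bound is 16 windows, far short of the
thousands of devices named in the use case.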
> 
> Overview:
> ---------
> Above usecase requirements can be solved by PCI PF group owner accessing
> its group member PCI VFs legacy registers using an admin virtqueue of
> the group owner PCI PF.
> 
> Two new admin virtqueue commands are added which read/write PCI VF
> registers.
> 
> The third command suggested by Jason queries the VF device's driver
> notification region.
> 
> Software usage example:
> -----------------------
> One way to use and map to the guest VM is by using vfio driver
> framework in Linux kernel.
> 
>                 +----------------------+
>                 |pci_dev_id = 0x100X   |
> +---------------|pci_rev_id = 0x0      |-----+
> |vfio device    |BAR0 = I/O region     |     |
> |               |Other attributes      |     |
> |               +----------------------+     |
> |                                            |
> +   +--------------+     +-----------------+ |
> |   |I/O BAR to AQ |     | Other vfio      | |
> |   |rd/wr mapper  |     | functionalities | |
> |   +--------------+     +-----------------+ |
> |                                            |
> +------+-------------------------+-----------+
>        |                         |
>        |                    Driver notification
>        |                         |
>        |                         |
>   +----+------------+       +----+------------+
>   | +-----+         |       | PCI VF device A |
>   | | AQ  |-------------+---->+-------------+ |
>   | +-----+         |   |   | | legacy regs | |
>   | PCI PF device   |   |   | +-------------+ |
>   +-----------------+   |   +-----------------+
>                         |
>                         |   +----+------------+
>                         |   | PCI VF device N |
>                         +---->+-------------+ |
>                             | | legacy regs | |
>                             | +-------------+ |
>                             +-----------------+
> 
> 2. Virtio pci driver to bind to the listed device id and
>    use it as native device in the host.
> 
> 3. Use it in a light weight hypervisor to run bare-metal OS.
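
The "I/O BAR to AQ rd/wr mapper" box in the diagram above could be
sketched roughly as follows. The legacy header sizes (20 bytes, or 24
with MSI-X enabled) come from the legacy interface layout; the struct,
enum and function names are hypothetical, and the actual admin command
layout from patch 2 is deliberately not reproduced here. The sketch only
classifies a trapped guest access into the common config or device
specific region, matching the v3 split into two kinds of commands.

```c
#include <stdbool.h>
#include <stdint.h>

/* Legacy virtio PCI header size without and with MSI-X enabled. */
#define LEGACY_COMMON_CFG_SIZE      20u
#define LEGACY_COMMON_CFG_SIZE_MSIX 24u

/* Hypothetical: which kind of legacy access command to issue on the AQ. */
enum legacy_region { LEGACY_COMMON_CFG, LEGACY_DEVICE_CFG };

/* Hypothetical request descriptor handed to the AQ submission code. */
struct legacy_access {
    enum legacy_region region;
    uint16_t member_id;  /* group member (VF) identifier */
    uint32_t offset;     /* offset within the selected region */
    uint8_t size;        /* access width in bytes */
    bool is_write;
};

/*
 * Translate a trapped guest access to the emulated I/O BAR into a
 * region-relative request. Returns 0 on success, -1 if the access has
 * an unsupported width or straddles the common/device config boundary.
 */
static int map_io_access(uint16_t member_id, uint32_t bar_off, uint8_t size,
                         bool is_write, bool msix_enabled,
                         struct legacy_access *out)
{
    uint32_t common_size = msix_enabled ? LEGACY_COMMON_CFG_SIZE_MSIX
                                        : LEGACY_COMMON_CFG_SIZE;

    if (size != 1 && size != 2 && size != 4)
        return -1;

    if (bar_off < common_size) {
        if (bar_off + size > common_size)
            return -1;  /* would cross into device specific config */
        out->region = LEGACY_COMMON_CFG;
        out->offset = bar_off;
    } else {
        out->region = LEGACY_DEVICE_CFG;
        out->offset = bar_off - common_size;
    }

    out->member_id = member_id;
    out->size = size;
    out->is_write = is_write;
    return 0;
}
```

For example, a 1-byte guest write at BAR offset 18 (device status) maps
to a common config write at offset 18, while a read at BAR offset 20
without MSI-X maps to device specific offset 0.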
> 
> Please review.
> 
> Alternatives considered:
> ========================
> 1. Exposing BAR0 as MMIO BAR that follows legacy registers template
> Pros:
> a. Kind of works with legacy drivers as some of them have API
>    which is agnostic to MMIO vs IOBAR.
> b. Does not require hypervisor intervention
> Cons:
> a. Device reset is extremely hard to implement in device at scale as
>    driver does not wait for reset completion
> b. Device register width related problems persist, and the hypervisor
>    cannot fix them even if it wishes to.
> 
> 2. Accessing VF registers by tunneling it through new legacy PCI capability
> Pros:
> a. Self contained, but cannot work with future PCI SIOV devices
> Cons:
> a. Equally slow as AQ access
> b. Still requires new capability for notification access
> 
> conclusion for picking AQ approach:
> ==================================
> 1. Overall AQ based access is simpler to implement with combination of
>    best from software and device so that legacy registers do not get baked
>    in the device hardware
> 2. AQ allows hypervisor software to intercept legacy registers and make
>    corrections if needed
> 3. Provides trade-off between performance, device complexity vs spec,
>    while still maintaining passthrough mode for the VFs with minimal
>    hypervisor intercepts only for legacy registers access
> 4. AQ mechanism is designed for accessing other member devices registers
>    as noted in AQ submission, it utilizes the existing infrastructure over
>    other alternatives.
> 
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/167
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> 
> ---
> changelog:
> v2->v3:
> - added new patch to split rows of admin vq opcode table
> - addressed Jason and Michael's comment to split single register
>   access command to common config and device specific commands.
> - dropped the suggestion to introduce enable/disable command as
>   admin command cap bit already covers it.
> - added other alternative design considered and discussed in detail in v0, v1 and v2
> 
> v1->v2:
> - addressed comments from Michael
> - added theory of operation
> - grammar corrections
> - removed group fields description from individual commands as
>   it is already present in generic section
> - added endianness normative for legacy device registers region
> - renamed the file to drop vf and add legacy prefix
> - added overview in commit log
> - renamed subsection to reflect command
> 
> v0->v1:
> - addressed comments, suggestions and ideas from Michael Tsirkin and Jason Wang
> - far simpler design than MMR access
> - removed complexities of MMR device ids
> - removed complexities of MMR registers and extended capabilities
> - dropped adding new extended capabilities because if they are
>   added, a pci device still needs to have existing capabilities
>   in the legacy configuration space and hypervisor driver does not
>   need to access them
> 
> 
> 
> Parav Pandit (3):
>   admin: Split opcode table rows with a line
>   transport-pci: Introduce legacy registers access commands
>   transport-pci: Add legacy register access conformance section
> 
>  admin.tex                     |  14 ++-
>  conformance.tex               |   2 +
>  transport-pci-legacy-regs.tex | 189 ++++++++++++++++++++++++++++++++++
>  transport-pci.tex             |   2 +
>  4 files changed, 206 insertions(+), 1 deletion(-)
>  create mode 100644 transport-pci-legacy-regs.tex
> 
> -- 
> 2.26.2

