From: Manivannan Sadhasivam <manisadhasivam.linux@gmail.com>
To: Parav Pandit <parav@nvidia.com>
Cc: "virtio-comment@lists.linux.dev" <virtio-comment@lists.linux.dev>,
	"mie@igel.co.jp" <mie@igel.co.jp>
Subject: Re: MSI for Virtio PCI transport
Date: Tue, 25 Jun 2024 14:41:28 +0530
Message-ID: <20240625091128.GC2642@thinkpad>
In-Reply-To: <PH0PR12MB54817F3736022FFE3B9C570FDCD52@PH0PR12MB5481.namprd12.prod.outlook.com>

On Tue, Jun 25, 2024 at 06:18:46AM +0000, Parav Pandit wrote:
> 
> 
> > From: Manivannan Sadhasivam <manisadhasivam.linux@gmail.com>
> > Sent: Tuesday, June 25, 2024 11:14 AM
> > 
> > On Tue, Jun 25, 2024 at 04:09:07AM +0000, Parav Pandit wrote:
> > > Hi,
> > >
> > > > From: Manivannan Sadhasivam <manisadhasivam.linux@gmail.com>
> > > > Sent: Monday, June 24, 2024 9:50 PM
> > > >
> > > > Hi,
> > > >
> > > > We are looking into adapting Virtio spec for configurable physical
> > > > PCIe endpoint devices to expose Virtio devices to the host machine
> > > > connected over PCIe. This allows us to use the existing frontend
> > > > drivers on the host machine, thus minimizing the development
> > > > efforts. This idea is not new as some vendors like NVidia have
> > > > already released customized PCIe devices exposing Virtio devices to
> > > > the host machines. But we are working on making the configurable
> > > > PCIe devices running Linux kernel to expose Virtio devices using the PCI
> > > > Endpoint (EP) subsystem.
> > > >
> > > > Below is a simplistic representation of the idea with virt-net as an
> > > > example. But this could be extended to any supported Virtio device:
> > > >
> > > >            HOST                                    ENDPOINT
> > > >
> > > > +-----------------------------+         +-----------------------------+
> > > > |                             |         |                             |
> > > > |         Linux Kernel        |         |         Linux Kernel        |
> > > > |                             |         |                             |
> > > > |                             |         |     +------------------+    |
> > > > |                             |         |     |                  |    |
> > > > |                             |         |     |       Modem      |    |
> > > > |                             |         |     |                  |    |
> > > > |                             |         |     +---------|--------+    |
> > > > |                             |         |               |             |
> > > > |     +------------------+    |         |     +---------|--------+    |
> > > > |     |                  |    |         |     |                  |    |
> > > > |     |     Virt-net     |    |         |     |    Virtio EPF    |    |
> > > > |     |                  |    |         |     |                  |    |
> > > > |     +---------|--------+    |         |     +---------|--------+    |
> > > > |               |             |         |               |             |
> > > > |     +---------|--------+    |         |     +---------|--------+    |
> > > > |     |                  |    |         |     |                  |    |
> > > > |     |    Virtio PCI    |    |         |     | PCI EP Subsystem |    |
> > > > |     |                  |    |         |     |                  |    |
> > > > |     +---------|--------+    |         |     +---------|--------+    |
> > > > | SW            |             |         | SW            |             |
> > > > ----------------|--------------         ----------------|--------------
> > > > | HW            |             |         | HW            |             |
> > > > |     +---------|--------+    |         |     +---------|--------+    |
> > > > |     |                  |    |         |     |                  |    |
> > > > |     |      PCIe RC     |    |         |     |     PCIe EP      |    |
> > > > |     |                  |    |         |     |                  |    |
> > > > +-----+---------|--------+----+         +-----+---------|--------+----+
> > > >                 |                                       |
> > > >                 |                                       |
> > > >                 |                                       |
> > > >                 |                PCIe                   |
> > > >                 -----------------------------------------
> > > >
> > > Can you please explain what the PCIe EP subsystem is?
> > > I assume it is a subsystem to somehow configure the PCIe EP HW instances?
> > > If yes, it is not connected to any PCIe RC in your diagram.
> > >
> > 
> > PCIe EP subsystem is a Linux kernel framework to configure the PCIe EP IP
> > inside an SoC/device. Here 'Endpoint' is a separate SoC/device that is running
> > Linux kernel and uses PCIe EP subsystem in kernel [1] to configure the PCIe
> > EP IP based on product usecase like GPU card, NVMe, Modem, WLAN etc...
> >
> 
> I understood the PCI EP subsystem.
> I didn’t follow it in context of virtio device as you have "virtio EPF".
> 
> > > So how does the MSI help in this case?
> > >
> > 
> > I think you are missing the point that 'Endpoint' is a separate SoC/device that
> > is connected to a host machine over PCIe. 
> Understood.
> 
> > Just like how you would connect a PCIe based GPU card to a Desktop PC. 
> Also understood.
> 
> > Only difference is, most of the PCIe
> > cards will run on a proprietary firmware supplied by the vendor, but here the
> > firmware itself can be built by the user and configurable. And this is where
> > Virtio is going to be exposed.
> >
> This part I read few times, but not understanding.
> 
> A PCI EP can be a virtio or nvme or any device.
> A PCI controller driver in Linux can call devm_pci_epc_create() and implement virtio specific configuration headers.

Right. Although, there is one more component called the EPF (Endpoint Function)
driver that implements the header for the device. I didn't go into detail because
I thought that would sidetrack the discussion. But if you want to know more about
the PCI Endpoint subsystem, please take a look at my ELEC presentation:

https://elinux.org/images/3/3a/PCI_Endpoint_drivers_in_Linux_kernel_and_How_to_write_one_.pdf

> I don’t see how this is in any way related to MSI-X.

MSI/MSI-X is a PCIe endpoint controller (hardware) capability. Based on that,
the PCI EP subsystem will expose those functionalities to the host. But if the
underlying hardware does not support MSI-X, then it won't be exposed.

> A PCI controller driver may operate a non virtio SoC device, right?
> 

If you go over my presentation, then you will get the internals of the EP
subsystem. According to that, a PCIe endpoint device may expose itself as any
kind of device to the host. That behavior is defined by the EPF driver. Currently,
the mainline Linux kernel supports the test, MHI (Qcom specific) and NTB function
drivers. And if Virtio is supported, then there would be a common Virtio driver
together with a usecase specific driver like virt-net, virt-blk etc...

> Are you trying to create a new kind of virtio device that is actually bind to PCI controller driver?
> and if so, it likely needs a new device id as this is the control point of PCIe EP device.
>  

No new Virtio device. We are just trying to expose transitional Virtio devices
like virt-net, virt-blk, virt-console etc... The idea is to just expose these
devices and make use of the existing frontend drivers on the host machine.

Just like how a hypervisor would expose Virtio devices to the guest. Here,
hypervisor is replaced by the PCIe endpoint device running Linux kernel.

> and idea is to consume MSI vectors by this PCI controller driver like updated virtio PCI driver , it looks fine to me.
> 

Yeah, since our devices support MSI only, we want the host side Virtio stack to
make use of it.
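
As a rough sketch of what "making use of it" could mean for vector sizing
(nothing here comes from the Virtio spec or the current driver): MSI can only
be enabled in power-of-two counts up to 32 per function, so a driver would
have to round its request up. The helper name msi_vectors_wanted(), the extra
vector for configuration changes (borrowed from the MSI-X layout) and the
"share one vector when it doesn't fit" policy are all assumptions made for
illustration.

```c
#include <assert.h>

#define MSI_MAX_VECTORS 32u	/* per-function limit for plain MSI */

/* Round up to the next power of two; MSI is only enabled in such counts. */
static unsigned int roundup_pow2(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Hypothetical helper: how many MSI vectors to ask for, given nvqs
 * virtqueues plus one vector for configuration changes (assumption).
 * Returns 1 when per-queue vectors do not fit into 32, meaning every
 * interrupt source would share a single vector.
 */
static unsigned int msi_vectors_wanted(unsigned int nvqs)
{
	assert(nvqs > 0);

	if (nvqs >= MSI_MAX_VECTORS)
		return 1;
	return roundup_pow2(nvqs + 1);
}
```

With three virtqueues this asks for 4 vectors, with 31 virtqueues it asks for
all 32, and with 32 or more it drops to a single shared vector.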

> > >
> > > > While doing so, we faced an issue due to lack of MSI support defined
> > > > in Virtio spec for PCI transport. Currently, the PCI transport
> > > > (starting from 0.9.5) has only defined INTx (legacy) and MSI-X
> > > > interrupts for the device to send notifications to the guest. While
> > > > it works well for the hypervisor to guest communication, when a
> > > > physical PCIe device is used as a Virtio device, lack of MSI support is
> > > > hurting the performance (when there is no MSI-X).
> > > >
> > > I am familiar with the scale issue of MSI-X, which is better for MSI
> > > (relative to MSI-X).
> > > What prevents implementing the MSI-X?
> > >
> > 
> > As I said, most of the devices I'm aware of don't support MSI-X in hardware
> > itself (I mean in the PCIe EP IP inside the SoC/device). For simple usecases
> > like WLAN, modem, MSI-X is not really required.
> > 
> > > > Most of the physical PCIe endpoint devices support MSI interrupts
> > > > over MSI-
> > > I am not sure if this is true. :)
> > > But not a concern either.
> > >
> > 
> > It really depends on the usecase I would say.
> > 
> Right. And also depends from which side you see it.
> i.e. to see PCIe EP device from RC side or from Linux PCIe EP subsystem side.
>  
> A PCIe EP device does not support MSI-X is confusing term to say,
> Because from a host (server) root complex point of view, which is seeing virtio or nvme PCI device, it is a PCIe EP device.
> 
> Therefore, for simplicity, just say,
> 
> A virtio PCI device does not support MSI interrupt mode.
> And it is useful to support it, as it is more optimized than MSI-X at lower scale.
> 

Ok, I now get what you are saying. I was describing more from a PCIe endpoint
device point of view.

> > > > X for simplicity and with Virtio not supporting MSI, falling back to
> > > > legacy INTx interrupts is affecting the performance.
> > > >
> > > > First of all, INTx requires the PCIe devices to send two MSG TLPs
> > > > (Assert/Deassert) to emulate level triggered interrupt on the host.
> > > > And there could be some delay between assert and deassert messages
> > > > to make sure that the host recognizes it as an interrupt (level
> > > > trigger). Also, the INTx interrupts are limited to 1 per function,
> > > > so all the notifications from the device have to share this single
> > > > interrupt (INTA).
> > > >
> > > Yes, INTx deprecation is on my list but I didn’t get there yet.
> > >
> > > > On the other hand, MSI requires only one MWr TLP from the device to
> > > > host and since it is a posted write, there is no delay involved.
> > > > Also, a single PCIe function can use up to 32 MSIs, thus making it
> > > > possible to use one MSI vector per virtqueue (32 is more than enough for
> > > > most of the usecases).
> > > >
> > > > So my question is, why does the Virtio spec not support MSI? If
> > > > there are no major blockers in supporting MSI, could we propose
> > > > adding MSI to the Virtio spec?
> > > >
> > > MSI addition is good for virtio for small scale devices of 1 to 32.
> > > PCIe EP may support both the MSI-X and MSI capabilities, and sw can give
> > > preference to MSI when the need is <= 32 vectors.
> > >
> > 
> > PCIe specification only mandates the devices to support either MSI or MSI-X.
> > 
> > Reference: PCIe spec r5.0, sec 6.1.4:
> > 
> > "All PCI Express device Functions that are capable of generating interrupts
> > must support MSI or MSI-X or both."
> >
> I am referring to the last word "both".
> 

But there is an 'or' before 'both', that's what I'm trying to highlight.
There is no necessity for a device to support MSI-X. But the current Virtio
Linux driver expects the device to support MSI-X, otherwise it falls back to
INTx.

>  
> > So MSI-X is clearly an optional feature which simple devices tend to ignore.
> > But if both are supported, then obviously Virtio will make use of MSI-X, but
> > that's not the case here.
> > 
> If both are supported, and required scale by the driver is <=32, driver can choose MSI due to its lightweight nature.
> Why do you say "obviously virtio will make use of MSI-X?" do you mean current code or future code?
> 

Current code:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/virtio/virtio_pci_common.c#n102

The API vp_request_msix_vectors() just requests MSI-X using the flag
PCI_IRQ_MSIX. Because of this, even if the device supports MSI, it won't be used
by Virtio, hence falling back to legacy INTx.
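
To illustrate, here is a minimal userspace model of that selection logic. It
is not the kernel code: the struct, enum and function names are hypothetical,
and only the ordering in pick_irq_mode_current() (MSI-X, otherwise straight to
INTx) reflects the behaviour described above. pick_irq_mode_proposed() shows
the fallback order being asked for, assuming MSI were defined for the Virtio
PCI transport.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the interrupt capabilities a PCI function exposes. */
struct dev_caps {
	bool msix;
	bool msi;
};

enum irq_mode { IRQ_MODE_MSIX, IRQ_MODE_MSI, IRQ_MODE_INTX };

/* Current behaviour as described above: MSI-X if present, otherwise INTx. */
static enum irq_mode pick_irq_mode_current(const struct dev_caps *caps)
{
	assert(caps);
	return caps->msix ? IRQ_MODE_MSIX : IRQ_MODE_INTX;
}

/* Proposed order (assumption): try MSI before giving up and using INTx. */
static enum irq_mode pick_irq_mode_proposed(const struct dev_caps *caps)
{
	assert(caps);
	if (caps->msix)
		return IRQ_MODE_MSIX;
	if (caps->msi)
		return IRQ_MODE_MSI;
	return IRQ_MODE_INTX;
}
```

For an MSI-only device, pick_irq_mode_current() ends up at IRQ_MODE_INTX while
pick_irq_mode_proposed() picks IRQ_MODE_MSI. Devices with MSI-X see no change.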

> > > Though I don’t see it in any way related to PCIe EP configuration in your
> > > diagram.
> > > In other words, PCI EP subsystem can still work with MSI-X.
> > > Can you please elaborate it?
> > >
> > 
> > I hope the above info clarifies. If not, please let me know.
> > 
> 
> The only part that is not clear to me is, the PCIe EP controller driver is attaching to virtio device or some vendor specific SoC platform device?

I think the explanation above should clarify this.

- Mani

-- 
மணிவண்ணன் சதாசிவம்
