* MSI for Virtio PCI transport
@ 2024-06-24 16:19 Manivannan Sadhasivam
2024-06-25 4:09 ` Parav Pandit
2024-06-25 7:52 ` Michael S. Tsirkin
0 siblings, 2 replies; 13+ messages in thread
From: Manivannan Sadhasivam @ 2024-06-24 16:19 UTC (permalink / raw)
To: virtio-comment; +Cc: mie
Hi,
We are looking into adapting the Virtio spec for configurable physical PCIe endpoint
devices to expose Virtio devices to the host machine connected over PCIe. This
allows us to use the existing frontend drivers on the host machine, thus
minimizing the development effort. This idea is not new, as some vendors like
NVIDIA have already released customized PCIe devices exposing Virtio devices to
the host machines. But we are working on making configurable PCIe devices
running the Linux kernel expose Virtio devices using the PCI Endpoint (EP)
subsystem.
Below is a simplified representation of the idea with virt-net as an
example, but this could be extended to any supported Virtio device:
HOST ENDPOINT
+-----------------------------+ +-----------------------------+
| | | |
| Linux Kernel | | Linux Kernel |
| | | |
| | | +------------------+ |
| | | | | |
| | | | Modem | |
| | | | | |
| | | +---------|--------+ |
| | | | |
| +------------------+ | | +---------|--------+ |
| | | | | | | |
| | Virt-net | | | | Virtio EPF | |
| | | | | | | |
| +---------|--------+ | | +---------|--------+ |
| | | | | |
| +---------|--------+ | | +---------|--------+ |
| | | | | | | |
| | Virtio PCI | | | | PCI EP Subsystem | |
| | | | | | | |
| +---------|--------+ | | +---------|--------+ |
| SW | | | SW | |
----------------|-------------- ----------------|--------------
| HW | | | HW | |
| +---------|--------+ | | +---------|--------+ |
| | | | | | | |
| | PCIe RC | | | | PCIe EP | |
| | | | | | | |
+-----+---------|--------+----+ +-----+---------|--------+----+
| |
| |
| |
| PCIe |
-----------------------------------------
While doing so, we ran into an issue due to the lack of MSI support in the Virtio
spec for the PCI transport. Currently, the PCI transport (starting from 0.9.5) has
only defined INTx (legacy) and MSI-X interrupts for the device to send
notifications to the guest. While this works well for hypervisor-to-guest
communication, when a physical PCIe device is used as a Virtio device, the lack
of MSI support hurts performance (when there is no MSI-X).
Most physical PCIe endpoint devices support MSI interrupts rather than MSI-X for
simplicity, and with Virtio not supporting MSI, falling back to legacy INTx
interrupts hurts performance.
First of all, INTx requires the PCIe device to send two MSG TLPs
(Assert_INTx/Deassert_INTx) to emulate a level-triggered interrupt on the host.
And there may need to be some delay between the assert and deassert messages to
make sure that the host recognizes it as a (level-triggered) interrupt. Also,
INTx interrupts are limited to one per function, so all the notifications from
the device have to share this single interrupt (INTA).
On the other hand, MSI requires only one MWr TLP from the device to the host,
and since it is a posted write, there is no delay involved. Also, a single PCIe
function can use up to 32 MSI vectors, thus making it possible to use one MSI
vector per virtqueue (32 is more than enough for most use cases).
So my question is: why does the Virtio spec not support MSI? If there are no
major blockers to supporting MSI, could we propose adding MSI to the Virtio spec?
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 13+ messages in thread* RE: MSI for Virtio PCI transport 2024-06-24 16:19 MSI for Virtio PCI transport Manivannan Sadhasivam @ 2024-06-25 4:09 ` Parav Pandit 2024-06-25 5:43 ` Manivannan Sadhasivam 2024-06-25 7:52 ` Michael S. Tsirkin 1 sibling, 1 reply; 13+ messages in thread From: Parav Pandit @ 2024-06-25 4:09 UTC (permalink / raw) To: Manivannan Sadhasivam, virtio-comment@lists.linux.dev; +Cc: mie@igel.co.jp Hi, > From: Manivannan Sadhasivam <manisadhasivam.linux@gmail.com> > Sent: Monday, June 24, 2024 9:50 PM > > Hi, > > We are looking into adapting Virtio spec for configurable physical PCIe > endpoint devices to expose Virtio devices to the host machine connected > over PCIe. This allows us to use the existing frontend drivers on the host > machine, thus minimizing the development efforts. This idea is not new as > some vendors like NVidia have already released customized PCIe devices > exposing Virtio devices to the host machines. But we are working on making > the configurable PCIe devices running Linux kernel to expose Virtio devices > using the PCI Endpoint (EP) subsystem. > > Below is the simplistic represenation of the idea with virt-net as an example. 
> But this could be extended to any supported Virtio devices: > > HOST ENDPOINT > > +-----------------------------+ +-----------------------------+ > | | | | > | Linux Kernel | | Linux Kernel | > | | | | > | | | +------------------+ | > | | | | | | > | | | | Modem | | > | | | | | | > | | | +---------|--------+ | > | | | | | > | +------------------+ | | +---------|--------+ | > | | | | | | | | > | | Virt-net | | | | Virtio EPF | | > | | | | | | | | > | +---------|--------+ | | +---------|--------+ | > | | | | | | > | +---------|--------+ | | +---------|--------+ | > | | | | | | | | > | | Virtio PCI | | | | PCI EP Subsystem | | > | | | | | | | | > | +---------|--------+ | | +---------|--------+ | > | SW | | | SW | | > ----------------|-------------- ----------------|-------------- > | HW | | | HW | | > | +---------|--------+ | | +---------|--------+ | > | | | | | | | | > | | PCIe RC | | | | PCIe EP | | > | | | | | | | | > +-----+---------|--------+----+ +-----+---------|--------+----+ > | | > | | > | | > | PCIe | > ----------------------------------------- > Can you please explain what is PCIe EP subsystem is? I assume, it is a subsystem to somehow configure the PCIe EP HW instances? If yes, it is not connected to any PCIe RC in your diagram. So how does the MSI help in this case? > While doing so, we faced an issue due to lack of MSI support defined in Virtio > spec for PCI transport. Currently, the PCI transport (starting from 0.9.5) has > only defined INTx (legacy) and MSI-X interrupts for the device to send > notifications to the guest. While it works well for the hypervisor to guest > communcation, when a physical PCIe device is used as a Virtio device, lack of > MSI support is hurting the performance (when there is no MSI-X). > I am familiar with the scale issue of MSI-X, which is better for MSI (relative to MSI-X). What prevents implementing the MSI-X? > Most of the physical PCIe endpoint devices support MSI interrupts over MSI- I am not sure if this is true. 
:) But not a concern either. > X for simplicity and with Virtio not supporting MSI, falling back to legacy INTx > interrupts is affecting the performance. > > First of all, INTx requires the PCIe devices to send two MSG TLPs > (Assert/Deassert) to emulate level triggered interrupt on the host. And there > could be some delay between assert and deassert messages to make sure > that the host recognizes it as an interrupt (level trigger). Also, the INTx > interrupts are limited to 1 per function, so all the notifications from device > has to share this single interrupt (INTA). > Yes, INTx deprecation is in my list but didn’t get their yet. > On the other hand, MSI requires only one MWr TLP from the device to host > and since it is a posted write, there is no delay involved. Also, a single PCIe > function can use upto 32 MSIs, thus making it possible to use one MSI vector > per virtqueue (32 is more than enough for most of the usecases). > > So my question is, why does the Virtio spec not supporting MSI? If there are > no major blocker in supporting MSI, could we propose adding MSI to the > Virtio spec? > MSI addition is good for virtio for small scale devices of 1 to 32. PCIe EP may support MSI-X and MSI both the capabilities and sw can give preference to MSI when the need is <= 32 vectors. Though I don’t see it anyway related to PCIe EP configuration in your diagram. In other words, PCI EP subsystem can still work with MSI-X. Can you please elaborate it? > - Mani > > -- > மணிவண்ணன் சதாசிவம் ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: MSI for Virtio PCI transport 2024-06-25 4:09 ` Parav Pandit @ 2024-06-25 5:43 ` Manivannan Sadhasivam 2024-06-25 6:18 ` Parav Pandit 0 siblings, 1 reply; 13+ messages in thread From: Manivannan Sadhasivam @ 2024-06-25 5:43 UTC (permalink / raw) To: Parav Pandit; +Cc: virtio-comment@lists.linux.dev, mie@igel.co.jp On Tue, Jun 25, 2024 at 04:09:07AM +0000, Parav Pandit wrote: > Hi, > > > From: Manivannan Sadhasivam <manisadhasivam.linux@gmail.com> > > Sent: Monday, June 24, 2024 9:50 PM > > > > Hi, > > > > We are looking into adapting Virtio spec for configurable physical PCIe > > endpoint devices to expose Virtio devices to the host machine connected > > over PCIe. This allows us to use the existing frontend drivers on the host > > machine, thus minimizing the development efforts. This idea is not new as > > some vendors like NVidia have already released customized PCIe devices > > exposing Virtio devices to the host machines. But we are working on making > > the configurable PCIe devices running Linux kernel to expose Virtio devices > > using the PCI Endpoint (EP) subsystem. > > > > Below is the simplistic represenation of the idea with virt-net as an example. 
> > But this could be extended to any supported Virtio devices: > > > > HOST ENDPOINT > > > > +-----------------------------+ +-----------------------------+ > > | | | | > > | Linux Kernel | | Linux Kernel | > > | | | | > > | | | +------------------+ | > > | | | | | | > > | | | | Modem | | > > | | | | | | > > | | | +---------|--------+ | > > | | | | | > > | +------------------+ | | +---------|--------+ | > > | | | | | | | | > > | | Virt-net | | | | Virtio EPF | | > > | | | | | | | | > > | +---------|--------+ | | +---------|--------+ | > > | | | | | | > > | +---------|--------+ | | +---------|--------+ | > > | | | | | | | | > > | | Virtio PCI | | | | PCI EP Subsystem | | > > | | | | | | | | > > | +---------|--------+ | | +---------|--------+ | > > | SW | | | SW | | > > ----------------|-------------- ----------------|-------------- > > | HW | | | HW | | > > | +---------|--------+ | | +---------|--------+ | > > | | | | | | | | > > | | PCIe RC | | | | PCIe EP | | > > | | | | | | | | > > +-----+---------|--------+----+ +-----+---------|--------+----+ > > | | > > | | > > | | > > | PCIe | > > ----------------------------------------- > > > Can you please explain what is PCIe EP subsystem is? > I assume, it is a subsystem to somehow configure the PCIe EP HW instances? > If yes, it is not connected to any PCIe RC in your diagram. > PCIe EP subsystem is a Linux kernel framework to configure the PCIe EP IP inside an SoC/device. Here 'Endpoint' is a separate SoC/device that is running Linux kernel and uses PCIe EP subsystem in kernel [1] to configure the PCIe EP IP based on product usecase like GPU card, NVMe, Modem, WLAN etc... [1] https://docs.kernel.org/PCI/endpoint/pci-endpoint.html > So how does the MSI help in this case? > I think you are missing the point that 'Endpoint' is a separate SoC/device that is connected to a host machine over PCIe. Just like how you would connect a PCIe based GPU card to a Desktop PC. 
Only difference is, most of the PCIe cards will run on a proprietary firmware supplied by the vendor, but here the firmware itself can be built by the user and configurable. And this is where Virtio is going to be exposed. > > > While doing so, we faced an issue due to lack of MSI support defined in Virtio > > spec for PCI transport. Currently, the PCI transport (starting from 0.9.5) has > > only defined INTx (legacy) and MSI-X interrupts for the device to send > > notifications to the guest. While it works well for the hypervisor to guest > > communcation, when a physical PCIe device is used as a Virtio device, lack of > > MSI support is hurting the performance (when there is no MSI-X). > > > I am familiar with the scale issue of MSI-X, which is better for MSI (relative to MSI-X). > What prevents implementing the MSI-X? > As I said, most of the devices I'm aware doesn't support MSI-X in hardware itself (I mean in the PCIe EP IP inside the SoC/device). For simple usecases like WLAN, modem, MSI-X is not really required. > > Most of the physical PCIe endpoint devices support MSI interrupts over MSI- > I am not sure if this is true. :) > But not a concern either. > It really depends on the usecase I would say. > > X for simplicity and with Virtio not supporting MSI, falling back to legacy INTx > > interrupts is affecting the performance. > > > > First of all, INTx requires the PCIe devices to send two MSG TLPs > > (Assert/Deassert) to emulate level triggered interrupt on the host. And there > > could be some delay between assert and deassert messages to make sure > > that the host recognizes it as an interrupt (level trigger). Also, the INTx > > interrupts are limited to 1 per function, so all the notifications from device > > has to share this single interrupt (INTA). > > > Yes, INTx deprecation is in my list but didn’t get their yet. 
> > > On the other hand, MSI requires only one MWr TLP from the device to host > > and since it is a posted write, there is no delay involved. Also, a single PCIe > > function can use upto 32 MSIs, thus making it possible to use one MSI vector > > per virtqueue (32 is more than enough for most of the usecases). > > > > So my question is, why does the Virtio spec not supporting MSI? If there are > > no major blocker in supporting MSI, could we propose adding MSI to the > > Virtio spec? > > > MSI addition is good for virtio for small scale devices of 1 to 32. > PCIe EP may support MSI-X and MSI both the capabilities and sw can give preference to MSI when the need is <= 32 vectors. > PCIe specification only mandates the devices to support either MSI or MSI-X. Reference: PCIe spec r5.0, sec 6.1.4: "All PCI Express device Functions that are capable of generating interrupts must support MSI or MSI-X or both." So MSI-X is clearly an optional feature which simple devices tend to ignore. But if both are supported, then obviously Virtio will make use of MSI-X, but that's not the case here. > Though I don’t see it anyway related to PCIe EP configuration in your diagram. > In other words, PCI EP subsystem can still work with MSI-X. > Can you please elaborate it? > I hope the above info clarifies. If not, please let me know. - Mani -- மணிவண்ணன் சதாசிவம் ^ permalink raw reply [flat|nested] 13+ messages in thread
* RE: MSI for Virtio PCI transport 2024-06-25 5:43 ` Manivannan Sadhasivam @ 2024-06-25 6:18 ` Parav Pandit 2024-06-25 7:55 ` Michael S. Tsirkin 2024-06-25 9:11 ` Manivannan Sadhasivam 0 siblings, 2 replies; 13+ messages in thread From: Parav Pandit @ 2024-06-25 6:18 UTC (permalink / raw) To: Manivannan Sadhasivam; +Cc: virtio-comment@lists.linux.dev, mie@igel.co.jp > From: Manivannan Sadhasivam <manisadhasivam.linux@gmail.com> > Sent: Tuesday, June 25, 2024 11:14 AM > > On Tue, Jun 25, 2024 at 04:09:07AM +0000, Parav Pandit wrote: > > Hi, > > > > > From: Manivannan Sadhasivam <manisadhasivam.linux@gmail.com> > > > Sent: Monday, June 24, 2024 9:50 PM > > > > > > Hi, > > > > > > We are looking into adapting Virtio spec for configurable physical > > > PCIe endpoint devices to expose Virtio devices to the host machine > > > connected over PCIe. This allows us to use the existing frontend > > > drivers on the host machine, thus minimizing the development > > > efforts. This idea is not new as some vendors like NVidia have > > > already released customized PCIe devices exposing Virtio devices to > > > the host machines. But we are working on making the configurable > > > PCIe devices running Linux kernel to expose Virtio devices using the PCI > Endpoint (EP) subsystem. > > > > > > Below is the simplistic represenation of the idea with virt-net as an > example. 
> > > But this could be extended to any supported Virtio devices: > > > > > > HOST ENDPOINT > > > > > > +-----------------------------+ > > > +-----------------------------+ +-----------------------------+ > > > | | | | > > > | Linux Kernel | | Linux Kernel | > > > | | | | > > > | | | +------------------+ | > > > | | | | | | > > > | | | | Modem | | > > > | | | | | | > > > | | | +---------|--------+ | > > > | | | | | > > > | +------------------+ | | +---------|--------+ | > > > | | | | | | | | > > > | | Virt-net | | | | Virtio EPF | | > > > | | | | | | | | > > > | +---------|--------+ | | +---------|--------+ | > > > | | | | | | > > > | +---------|--------+ | | +---------|--------+ | > > > | | | | | | | | > > > | | Virtio PCI | | | | PCI EP Subsystem | | > > > | | | | | | | | > > > | +---------|--------+ | | +---------|--------+ | > > > | SW | | | SW | | > > > ----------------|-------------- ----------------|-------------- > > > | HW | | | HW | | > > > | +---------|--------+ | | +---------|--------+ | > > > | | | | | | | | > > > | | PCIe RC | | | | PCIe EP | | > > > | | | | | | | | > > > +-----+---------|--------+----+ > > > +-----+---------|--------+----+ +-----+---------|--------+----+ > > > | | > > > | | > > > | | > > > | PCIe | > > > ----------------------------------------- > > > > > Can you please explain what is PCIe EP subsystem is? > > I assume, it is a subsystem to somehow configure the PCIe EP HW > instances? > > If yes, it is not connected to any PCIe RC in your diagram. > > > > PCIe EP subsystem is a Linux kernel framework to configure the PCIe EP IP > inside an SoC/device. Here 'Endpoint' is a separate SoC/device that is running > Linux kernel and uses PCIe EP subsystem in kernel [1] to configure the PCIe > EP IP based on product usecase like GPU card, NVMe, Modem, WLAN etc... > I understood the PCI EP subsystem. I didn’t follow it in context of virtio device as you have "virtio EPF". > > So how does the MSI help in this case? 
> > > > I think you are missing the point that 'Endpoint' is a separate SoC/device that > is connected to a host machine over PCIe. Understood. > Just like how you would connect a PCIe based GPU card to a Desktop PC. Also understood. > Only difference is, most of the PCIe > cards will run on a proprietary firmware supplied by the vendor, but here the > firmware itself can be built by the user and configurable. And this is where > Virtio is going to be exposed. > This part I read few times, but not understanding. A PCI EP can be a virtio or nvme or any device. A PCI controller driver in Linux can call devm_pci_epc_create() and implement virtio specific configuration headers. Don’t see this anyway related to MSI-X. A PCI controller driver may operate a non virtio SoC device, right? Are you trying to create a new kind of virtio device that is actually bind to PCI controller driver? and if so, it likely needs a new device id as this is the control point of PCIe EP device. and idea is to consume MSI vectors by this PCI controller driver like updated virtio PCI driver , it looks fine to me. > > > > > While doing so, we faced an issue due to lack of MSI support defined > > > in Virtio spec for PCI transport. Currently, the PCI transport > > > (starting from 0.9.5) has only defined INTx (legacy) and MSI-X > > > interrupts for the device to send notifications to the guest. While > > > it works well for the hypervisor to guest communcation, when a > > > physical PCIe device is used as a Virtio device, lack of MSI support is > hurting the performance (when there is no MSI-X). > > > > > I am familiar with the scale issue of MSI-X, which is better for MSI (relative > to MSI-X). > > What prevents implementing the MSI-X? > > > > As I said, most of the devices I'm aware doesn't support MSI-X in hardware > itself (I mean in the PCIe EP IP inside the SoC/device). For simple usecases > like WLAN, modem, MSI-X is not really required. 
> > > > Most of the physical PCIe endpoint devices support MSI interrupts > > > over MSI- > > I am not sure if this is true. :) > > But not a concern either. > > > > It really depends on the usecase I would say. > Right. And also depends from which side you see it. i.e. to see PCIe EP device from RC side or from Linux PCIe EP subsystem side. A PCIe EP device does not support MSI-X is confusing term to say, Because from a host (server) root complex point of view, which is seeing virtio or nvme PCI device, it is a PCIe EP device. Therefore, for simplicity, just say, A virtio PCI device does not support MSI interrupt mode. And it is useful to support it, that is optimized then MSI-X at lower scale. > > > X for simplicity and with Virtio not supporting MSI, falling back to > > > legacy INTx interrupts is affecting the performance. > > > > > > First of all, INTx requires the PCIe devices to send two MSG TLPs > > > (Assert/Deassert) to emulate level triggered interrupt on the host. > > > And there could be some delay between assert and deassert messages > > > to make sure that the host recognizes it as an interrupt (level > > > trigger). Also, the INTx interrupts are limited to 1 per function, > > > so all the notifications from device has to share this single interrupt > (INTA). > > > > > Yes, INTx deprecation is in my list but didn’t get their yet. > > > > > On the other hand, MSI requires only one MWr TLP from the device to > > > host and since it is a posted write, there is no delay involved. > > > Also, a single PCIe function can use upto 32 MSIs, thus making it > > > possible to use one MSI vector per virtqueue (32 is more than enough for > most of the usecases). > > > > > > So my question is, why does the Virtio spec not supporting MSI? If > > > there are no major blocker in supporting MSI, could we propose > > > adding MSI to the Virtio spec? > > > > > MSI addition is good for virtio for small scale devices of 1 to 32. 
> > PCIe EP may support MSI-X and MSI both the capabilities and sw can give > preference to MSI when the need is <= 32 vectors. > > > > PCIe specification only mandates the devices to support either MSI or MSI-X. > > Reference: PCIe spec r5.0, sec 6.1.4: > > "All PCI Express device Functions that are capable of generating interrupts > must support MSI or MSI-X or both." > I am referring to the last word "both". > So MSI-X is clearly an optional feature which simple devices tend to ignore. > But if both are supported, then obviously Virtio will make use of MSI-X, but > that's not the case here. > If both are supported, and required scale by the driver is <=32, driver can choose MSI due to its lightweight nature. Why do you say "obviously virtio will make use of MSI-X?" do you mean current code or future code? > > Though I don’t see it anyway related to PCIe EP configuration in your > diagram. > > In other words, PCI EP subsystem can still work with MSI-X. > > Can you please elaborate it? > > > > I hope the above info clarifies. If not, please let me know. > The only part that is not clear to me is, the PCIe EP controller driver is attaching to virtio device or some vendor specific SoC platform device? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: MSI for Virtio PCI transport
  2024-06-25  6:18     ` Parav Pandit
@ 2024-06-25  7:55       ` Michael S. Tsirkin
  2024-06-25  8:00         ` Parav Pandit
  2024-06-25  9:11       ` Manivannan Sadhasivam
  1 sibling, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2024-06-25  7:55 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Manivannan Sadhasivam, virtio-comment@lists.linux.dev, mie@igel.co.jp

On Tue, Jun 25, 2024 at 06:18:46AM +0000, Parav Pandit wrote:
> > So MSI-X is clearly an optional feature which simple devices tend to ignore.
> > But if both are supported, then obviously Virtio will make use of MSI-X, but
> > that's not the case here.
> >
> If both are supported, and required scale by the driver is <=32, driver can
> choose MSI due to its lightweight nature.

Unlikely, MSI vectors are tricky to mask and this is a problem for interrupt
balancing. So MSI-X is better for performance even if the # of vectors is low.

-- 
MST

^ permalink raw reply	[flat|nested] 13+ messages in thread
* RE: MSI for Virtio PCI transport
  2024-06-25  7:55       ` Michael S. Tsirkin
@ 2024-06-25  8:00         ` Parav Pandit
  2024-06-25  8:09           ` Michael S. Tsirkin
  0 siblings, 1 reply; 13+ messages in thread
From: Parav Pandit @ 2024-06-25  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Manivannan Sadhasivam, virtio-comment@lists.linux.dev, mie@igel.co.jp

> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, June 25, 2024 1:25 PM
> To: Parav Pandit <parav@nvidia.com>
> Cc: Manivannan Sadhasivam <manisadhasivam.linux@gmail.com>; virtio-
> comment@lists.linux.dev; mie@igel.co.jp
> Subject: Re: MSI for Virtio PCI transport
>
> On Tue, Jun 25, 2024 at 06:18:46AM +0000, Parav Pandit wrote:
> > > So MSI-X is clearly an optional feature which simple devices tend to ignore.
> > > But if both are supported, then obviously Virtio will make use of
> > > MSI-X, but that's not the case here.
> > >
> > If both are supported, and required scale by the driver is <=32, driver can
> choose MSI due to its lightweight nature.
>
> Unlikely, MSI vectors are tricky to mask and this is a problem for interrupt
> balancing. So MSIX is better for performance even if the # of vectors is low.

Masking to my knowledge is not used by MSI-X.
Didn't follow how MSI-X helps with performance.
The benefit of MSI is it does not need to store an addr+data pair per vector.

>
> --
> MST

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: MSI for Virtio PCI transport
  2024-06-25  8:00         ` Parav Pandit
@ 2024-06-25  8:09           ` Michael S. Tsirkin
  2024-06-25  8:18             ` Parav Pandit
  0 siblings, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2024-06-25  8:09 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Manivannan Sadhasivam, virtio-comment@lists.linux.dev, mie@igel.co.jp

On Tue, Jun 25, 2024 at 08:00:45AM +0000, Parav Pandit wrote:
>
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, June 25, 2024 1:25 PM
> > To: Parav Pandit <parav@nvidia.com>
> > Cc: Manivannan Sadhasivam <manisadhasivam.linux@gmail.com>; virtio-
> > comment@lists.linux.dev; mie@igel.co.jp
> > Subject: Re: MSI for Virtio PCI transport
> >
> > On Tue, Jun 25, 2024 at 06:18:46AM +0000, Parav Pandit wrote:
> > > > So MSI-X is clearly an optional feature which simple devices tend to ignore.
> > > > But if both are supported, then obviously Virtio will make use of
> > > > MSI-X, but that's not the case here.
> > > >
> > > If both are supported, and required scale by the driver is <=32, driver can
> > choose MSI due to its lightweight nature.
> >
> > Unlikely, MSI vectors are tricky to mask and this is a problem for interrupt
> > balancing. So MSIX is better for performance even if the # of vectors is low.
> Masking to my knowledge is not used by MSIX.

There's a mask bit per vector, yes.

> Didn't follow how MSIX helps with performance.

Linux uses mask/change/unmask to balance interrupts between CPUs.
*That* is important for performance.

> The benefit of MSI is it does not need to store addr+data pair per vector.

The addr/data thing wasn't invented just to make hardware costs go up.

> > --
> > MST

^ permalink raw reply	[flat|nested] 13+ messages in thread
* RE: MSI for Virtio PCI transport
  2024-06-25  8:09           ` Michael S. Tsirkin
@ 2024-06-25  8:18             ` Parav Pandit
  2024-06-25  8:29               ` Michael S. Tsirkin
  0 siblings, 1 reply; 13+ messages in thread
From: Parav Pandit @ 2024-06-25  8:18 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Manivannan Sadhasivam, virtio-comment@lists.linux.dev, mie@igel.co.jp

> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, June 25, 2024 1:40 PM
>
> On Tue, Jun 25, 2024 at 08:00:45AM +0000, Parav Pandit wrote:
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Tuesday, June 25, 2024 1:25 PM
> > > To: Parav Pandit <parav@nvidia.com>
> > > Cc: Manivannan Sadhasivam <manisadhasivam.linux@gmail.com>; virtio-
> > > comment@lists.linux.dev; mie@igel.co.jp
> > > Subject: Re: MSI for Virtio PCI transport
> > >
> > > On Tue, Jun 25, 2024 at 06:18:46AM +0000, Parav Pandit wrote:
> > > > > So MSI-X is clearly an optional feature which simple devices tend to ignore.
> > > > > But if both are supported, then obviously Virtio will make use
> > > > > of MSI-X, but that's not the case here.
> > > > >
> > > > If both are supported, and required scale by the driver is <=32,
> > > > driver can choose MSI due to its lightweight nature.
> > >
> > > Unlikely, MSI vectors are tricky to mask and this is a problem for
> > > interrupt balancing. So MSIX is better for performance even if the # of
> > > vectors is low.
> > Masking to my knowledge is not used by MSIX.
>
> There's a mask bit per vector, yes.
>
> > Didn't follow how MSIX helps with performance.
>
> Linux uses mask/change/unmask to balance interrupts between CPUs.
> *That* is important for performance.

Ah, I see, didn't know this.
Frequently reprogramming the mask in the device is expensive and also requires
synchronization to not miss an interrupt.
In case you have a pointer to the Linux code, please share.

A while back, I recollect seeing reprogramming done by the IOAPIC at the
interrupt table level within the CPU (without involving the device) to forward
to a different CPU.

> > The benefit of MSI is it does not need to store addr+data pair per vector.
>
> The addr/data thing wasn't invented just to make hardware costs go up.

I am aware of it. :)
I will let other non-virtio forums finish their ongoing optimization work in
this area to reduce hardware costs.

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: MSI for Virtio PCI transport
  2024-06-25  8:18             ` Parav Pandit
@ 2024-06-25  8:29               ` Michael S. Tsirkin
  0 siblings, 0 replies; 13+ messages in thread
From: Michael S. Tsirkin @ 2024-06-25  8:29 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Manivannan Sadhasivam, virtio-comment@lists.linux.dev, mie@igel.co.jp

On Tue, Jun 25, 2024 at 08:18:03AM +0000, Parav Pandit wrote:
>
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, June 25, 2024 1:40 PM
> >
> > On Tue, Jun 25, 2024 at 08:00:45AM +0000, Parav Pandit wrote:
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Tuesday, June 25, 2024 1:25 PM
> > > > To: Parav Pandit <parav@nvidia.com>
> > > > Cc: Manivannan Sadhasivam <manisadhasivam.linux@gmail.com>; virtio-
> > > > comment@lists.linux.dev; mie@igel.co.jp
> > > > Subject: Re: MSI for Virtio PCI transport
> > > >
> > > > On Tue, Jun 25, 2024 at 06:18:46AM +0000, Parav Pandit wrote:
> > > > > > So MSI-X is clearly an optional feature which simple devices tend to ignore.
> > > > > > But if both are supported, then obviously Virtio will make use
> > > > > > of MSI-X, but that's not the case here.
> > > > > >
> > > > > If both are supported, and required scale by the driver is <=32,
> > > > > driver can choose MSI due to its lightweight nature.
> > > >
> > > > Unlikely, MSI vectors are tricky to mask and this is a problem for
> > > > interrupt balancing. So MSIX is better for performance even if the # of
> > > > vectors is low.
> > > Masking to my knowledge is not used by MSIX.
> >
> > There's a mask bit per vector, yes.
> >
> > > Didn't follow how MSIX helps with performance.
> >
> > Linux uses mask/change/unmask to balance interrupts between CPUs.
> > *That* is important for performance.
> Ah, I see, didn't know this.
> Frequently reprogramming the mask in device is expensive and also requires
> synchronization to not miss the interrupt.
> In case if you have pointer to the Linux code please share.

static inline void pci_msix_mask(struct msi_desc *desc)
{
	desc->pci.msix_ctrl |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
	pci_msix_write_vector_ctrl(desc, desc->pci.msix_ctrl);
	/* Flush write to device */
	readl(desc->pci.mask_base);
}

and

static inline void pci_write_msg_msix(struct msi_desc *desc, struct msi_msg *msg)
{
	void __iomem *base = pci_msix_desc_addr(desc);
	u32 ctrl = desc->pci.msix_ctrl;
	bool unmasked = !(ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT);

	if (desc->pci.msi_attrib.is_virtual)
		return;

	/*
	 * The specification mandates that the entry is masked
	 * when the message is modified:
	 *
	 * "If software changes the Address or Data value of an
	 * entry while the entry is unmasked, the result is
	 * undefined."
	 */
	if (unmasked)
		pci_msix_write_vector_ctrl(desc, ctrl | PCI_MSIX_ENTRY_CTRL_MASKBIT);

	writel(msg->address_lo, base + PCI_MSIX_ENTRY_LOWER_ADDR);
	writel(msg->address_hi, base + PCI_MSIX_ENTRY_UPPER_ADDR);
	writel(msg->data, base + PCI_MSIX_ENTRY_DATA);

	if (unmasked)
		pci_msix_write_vector_ctrl(desc, ctrl);

	/* Ensure that the writes are visible in the device */
	readl(base + PCI_MSIX_ENTRY_DATA);
}

both do this.

> A while back, I recollect seeing reprogramming done by the IOAPIC at the
> interrupt table level within the cpu (without involving the device) to
> forward to different cpu.

AFAIK MSI goes to lapic not to ioapic. I might be confused though.

> > > > The benefit of MSI is it does not need to store addr+data pair per vector.
> > >
> > > The addr/data thing wasn't invented just to make hardware costs go up.
> >
> I am aware of it. :)
> I will let other non virtio forums finish their ongoing optimization work in
> this area to reduce hardware costs.

Absolutely.

-- 
MST

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: MSI for Virtio PCI transport
  2024-06-25  6:18     ` Parav Pandit
  2024-06-25  7:55       ` Michael S. Tsirkin
@ 2024-06-25  9:11       ` Manivannan Sadhasivam
  2024-06-25  9:59         ` Parav Pandit
  1 sibling, 1 reply; 13+ messages in thread
From: Manivannan Sadhasivam @ 2024-06-25 9:11 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-comment@lists.linux.dev, mie@igel.co.jp

On Tue, Jun 25, 2024 at 06:18:46AM +0000, Parav Pandit wrote:
>
> > From: Manivannan Sadhasivam <manisadhasivam.linux@gmail.com>
> > Sent: Tuesday, June 25, 2024 11:14 AM
> >
> > On Tue, Jun 25, 2024 at 04:09:07AM +0000, Parav Pandit wrote:
> > > Hi,
> > >
> > > > From: Manivannan Sadhasivam <manisadhasivam.linux@gmail.com>
> > > > Sent: Monday, June 24, 2024 9:50 PM
> > > >
> > > > Hi,
> > > >
> > > > We are looking into adapting the Virtio spec for configurable physical
> > > > PCIe endpoint devices to expose Virtio devices to the host machine
> > > > connected over PCIe. This allows us to use the existing frontend
> > > > drivers on the host machine, thus minimizing the development efforts.
> > > > This idea is not new, as some vendors like NVIDIA have already
> > > > released customized PCIe devices exposing Virtio devices to the host
> > > > machines. But we are working on making configurable PCIe devices
> > > > running the Linux kernel expose Virtio devices using the PCI Endpoint
> > > > (EP) subsystem.
> > > >
> > > > Below is a simplistic representation of the idea with virt-net as an
> > > > example. But this could be extended to any supported Virtio devices:
> > > >
> > > > [HOST <-> ENDPOINT software-stack diagram snipped; see the first
> > > > message in the thread]
> > > >
> > > Can you please explain what the PCIe EP subsystem is?
> > > I assume it is a subsystem to somehow configure the PCIe EP HW
> > > instances?
> > > If yes, it is not connected to any PCIe RC in your diagram.
> > >
> >
> > The PCIe EP subsystem is a Linux kernel framework to configure the PCIe EP
> > IP inside an SoC/device. Here, 'Endpoint' is a separate SoC/device that
> > runs the Linux kernel and uses the PCIe EP subsystem in the kernel [1] to
> > configure the PCIe EP IP based on the product usecase like GPU card, NVMe,
> > Modem, WLAN, etc.
> >
> I understood the PCI EP subsystem.
> I didn't follow it in the context of a virtio device, as you have "Virtio EPF".
>
> > > So how does the MSI help in this case?
> > >
> >
> > I think you are missing the point that the 'Endpoint' is a separate
> > SoC/device that is connected to a host machine over PCIe.
> Understood.
>
> > Just like how you would connect a PCIe based GPU card to a desktop PC.
> Also understood.
>
> > The only difference is that most PCIe cards will run a proprietary
> > firmware supplied by the vendor, but here the firmware itself can be built
> > by the user and is configurable. And this is where Virtio is going to be
> > exposed.
> >
> This part I read a few times, but I am not understanding.
>
> A PCI EP can be a virtio or nvme or any device.
> A PCI controller driver in Linux can call devm_pci_epc_create() and implement
> virtio specific configuration headers.

Right. Although, there is one more component called the EPF (Endpoint
Function) driver that implements the configuration header for the device. I
didn't go into detail because I thought that would mislead the discussion. But
if you want to know more about the PCI Endpoint subsystem, please take a look
at my ELEC presentation:

https://elinux.org/images/3/3a/PCI_Endpoint_drivers_in_Linux_kernel_and_How_to_write_one_.pdf

> Don't see this anyway related to MSI-X.

MSI/MSI-X is a PCIe endpoint controller (hardware) capability. Based on that,
the PCI EP subsystem will expose those functionalities to the host. But if the
underlying hardware is not supporting MSI-X, then it won't be exposed.

> A PCI controller driver may operate a non-virtio SoC device, right?
>

If you go over my presentation, then you will get the internals of the EP
subsystem. According to that, a PCIe endpoint device may expose itself as any
kind of device to the host. That behavior is defined by the EPF driver.
Currently, the mainline Linux kernel supports a test driver, MHI (Qcom
specific), and NTB function drivers. And if Virtio is supported, then there
would be a common Virtio driver together with a usecase specific driver like
virt-net, virt-blk, etc.

> Are you trying to create a new kind of virtio device that actually binds to
> the PCI controller driver?
> And if so, it likely needs a new device id, as this is the control point of
> the PCIe EP device.
>

No new Virtio device. We are just trying to expose transitional Virtio devices
like virt-net, virt-blk, virt-console, etc. The idea is to just expose these
devices and make use of the existing frontend drivers on the host machine.
Just like how a hypervisor would expose Virtio devices to the guest. Here, the
hypervisor is replaced by the PCIe endpoint device running the Linux kernel.

> And if the idea is to consume MSI vectors by this PCI controller driver, like
> an updated virtio PCI driver, it looks fine to me.
>

Yeah, since our devices support only MSI, we want the host side Virtio stack
to make use of it.

> > > > > While doing so, we faced an issue due to the lack of MSI support
> > > > > defined in the Virtio spec for the PCI transport. Currently, the
> > > > > PCI transport (starting from 0.9.5) has only defined INTx (legacy)
> > > > > and MSI-X interrupts for the device to send notifications to the
> > > > > guest. While it works well for hypervisor to guest communication,
> > > > > when a physical PCIe device is used as a Virtio device, the lack
> > > > > of MSI support is hurting the performance (when there is no MSI-X).
> > > > >
> > > > I am familiar with the scale issue of MSI-X, which is better for MSI
> > > > (relative to MSI-X).
> > > > What prevents implementing the MSI-X?
> > > >
> > >
> > > As I said, most of the devices I'm aware of don't support MSI-X in the
> > > hardware itself (I mean in the PCIe EP IP inside the SoC/device). For
> > > simple usecases like WLAN and modem, MSI-X is not really required.
> > >
> > > > > Most of the physical PCIe endpoint devices support MSI interrupts
> > > > > over MSI-
> > > > I am not sure if this is true. :)
> > > > But not a concern either.
> > > >
> > >
> > > It really depends on the usecase, I would say.
> > >
> > Right. And it also depends from which side you see it,
> > i.e. whether you see the PCIe EP device from the RC side or from the Linux
> > PCIe EP subsystem side.
> >
> > "A PCIe EP device does not support MSI-X" is a confusing thing to say,
> > because from a host (server) root complex point of view, which is seeing a
> > virtio or nvme PCI device, it is a PCIe EP device.
> >
> > Therefore, for simplicity, just say:
> >
> > A virtio PCI device does not support MSI interrupt mode.
> > And it is useful to support it, as it is more optimized than MSI-X at
> > lower scale.
> >

Ok, I now get what you are saying. I was describing more from a PCIe endpoint
device point of view.

> > > > > X for simplicity, and with Virtio not supporting MSI, falling back
> > > > > to legacy INTx interrupts is affecting the performance.
> > > > >
> > > > > First of all, INTx requires the PCIe devices to send two MSG TLPs
> > > > > (Assert/Deassert) to emulate a level triggered interrupt on the
> > > > > host. And there could be some delay between the assert and deassert
> > > > > messages to make sure that the host recognizes it as an interrupt
> > > > > (level trigger). Also, the INTx interrupts are limited to 1 per
> > > > > function, so all the notifications from the device have to share
> > > > > this single interrupt (INTA).
> > > > >
> > > > Yes, INTx deprecation is on my list but I didn't get there yet.
> > > >
> > > > > On the other hand, MSI requires only one MWr TLP from the device to
> > > > > the host, and since it is a posted write, there is no delay
> > > > > involved. Also, a single PCIe function can use up to 32 MSIs, thus
> > > > > making it possible to use one MSI vector per virtqueue (32 is more
> > > > > than enough for most of the usecases).
> > > > >
> > > > > So my question is, why does the Virtio spec not support MSI? If
> > > > > there are no major blockers in supporting MSI, could we propose
> > > > > adding MSI to the Virtio spec?
> > > > >
> > > > MSI addition is good for virtio for small scale devices of 1 to 32
> > > > vectors.
> > > > PCIe EP may support both the MSI-X and MSI capabilities, and sw can
> > > > give preference to MSI when the need is <= 32 vectors.
> > > >
> > >
> > > The PCIe specification only mandates the devices to support either MSI
> > > or MSI-X.
> > >
> > > Reference: PCIe spec r5.0, sec 6.1.4:
> > >
> > > "All PCI Express device Functions that are capable of generating
> > > interrupts must support MSI or MSI-X or both."
> I am referring to the last word, "both".

But there is an 'or' before 'both'; that's what I'm trying to highlight. There
is no necessity for a device to support MSI-X. But the current Virtio Linux
driver expects the device to support MSI-X, and otherwise falls back to INTx.

> > So MSI-X is clearly an optional feature which simple devices tend to
> > ignore.
> > But if both are supported, then obviously Virtio will make use of MSI-X,
> > but that's not the case here.
> >
> If both are supported, and the scale required by the driver is <= 32, the
> driver can choose MSI due to its lightweight nature.
> Why do you say "obviously virtio will make use of MSI-X"? Do you mean current
> code or future code?
>

Current code:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/virtio/virtio_pci_common.c#n102

The API vp_request_msix_vectors() just requests MSI-X using the flag
PCI_IRQ_MSIX. Because of this, even if the device supports MSI, it won't be
used by Virtio, which hence falls back to legacy INTx.

> > > Though I don't see it anyway related to the PCIe EP configuration in
> > > your diagram.
> > > In other words, the PCI EP subsystem can still work with MSI-X.
> > > Can you please elaborate?
> > >
> >
> > I hope the above info clarifies. If not, please let me know.
> >

The only part that is not clear to me is: is the PCIe EP controller driver
attaching to a virtio device or to some vendor specific SoC platform device?
I think the above justification will clarify this.

- Mani

-- 
மணிவண்ணன் சதாசிவம்
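[Editorial aside: the selection behavior discussed above can be sketched as a standalone mock. The `MOCK_*` flag values mirror the `PCI_IRQ_*` flags in include/linux/pci.h, but the struct and function here are illustrative, not the kernel's pci_alloc_irq_vectors().]

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative copies of the kernel's PCI_IRQ_* flag bits. */
#define MOCK_IRQ_INTX  (1u << 0)
#define MOCK_IRQ_MSI   (1u << 1)
#define MOCK_IRQ_MSIX  (1u << 2)

/* What interrupt capabilities the endpoint hardware exposes. */
struct mock_dev {
	bool has_msix;
	bool has_msi;
};

/*
 * Pick the best interrupt mode that both the caller allowed (flags)
 * and the device supports, falling back to the single shared INTx
 * line when nothing better is available.
 */
static unsigned int mock_pick_irq_mode(const struct mock_dev *d,
				       unsigned int flags)
{
	if ((flags & MOCK_IRQ_MSIX) && d->has_msix)
		return MOCK_IRQ_MSIX;
	if ((flags & MOCK_IRQ_MSI) && d->has_msi)
		return MOCK_IRQ_MSI;
	return MOCK_IRQ_INTX;
}
```

With an MSI-only device, passing only the MSI-X flag (the behavior described for vp_request_msix_vectors()) lands on INTx, while also allowing the MSI flag picks MSI; a device with both capabilities still prefers MSI-X. That is the one-line policy change the thread is arguing for.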
* RE: MSI for Virtio PCI transport
  2024-06-25  9:11       ` Manivannan Sadhasivam
@ 2024-06-25  9:59         ` Parav Pandit
  0 siblings, 0 replies; 13+ messages in thread
From: Parav Pandit @ 2024-06-25 9:59 UTC (permalink / raw)
  To: Manivannan Sadhasivam; +Cc: virtio-comment@lists.linux.dev, mie@igel.co.jp

> From: Manivannan Sadhasivam <manisadhasivam.linux@gmail.com>
> Sent: Tuesday, June 25, 2024 2:41 PM
>
> On Tue, Jun 25, 2024 at 06:18:46AM +0000, Parav Pandit wrote:
> > [earlier quoted discussion of the proposal and the HOST <-> ENDPOINT
> > diagram trimmed]
> >
> > A PCI EP can be a virtio or nvme or any device.
> > A PCI controller driver in Linux can call devm_pci_epc_create() and
> > implement virtio specific configuration headers.
>
> Right. Although, there is one more component called the EPF (Endpoint
> Function) driver that implements the configuration header for the device. I
> didn't go into detail because I thought that would mislead the discussion.
> But if you want to know more about the PCI Endpoint subsystem, please take a
> look at my ELEC presentation:
>

Yes. Bringing PCI EPF to this discussion was confusing. But I am familiar with
building composable PCI EPs for virtio and others.
MSI has no relation to how one composes PCIe EPs.

> > Don't see this anyway related to MSI-X.
>
> MSI/MSI-X is a PCIe endpoint controller (hardware) capability. Based on
> that, the PCI EP subsystem will expose those functionalities to the host.
> But if the underlying hardware is not supporting MSI-X, then it won't be
> exposed.

Yes.

> > A PCI controller driver may operate a non-virtio SoC device, right?
> >
> If you go over my presentation, then you will get the internals of the EP
> subsystem. According to that, a PCIe endpoint device may expose itself as
> any kind of device to the host. That behavior is defined by the EPF driver.
> Currently, the mainline Linux kernel supports a test driver, MHI (Qcom
> specific), and NTB function drivers. And if Virtio is supported, then there
> would be a common Virtio driver together with a usecase specific driver like
> virt-net, virt-blk, etc.
>

Yes. I am clear.

> > Are you trying to create a new kind of virtio device that actually binds
> > to the PCI controller driver?
> > And if so, it likely needs a new device id, as this is the control point
> > of the PCIe EP device.
> >
>
> No new Virtio device. We are just trying to expose transitional Virtio
> devices like virt-net, virt-blk, virt-console, etc. The idea is to just
> expose these devices and make use of the existing frontend drivers on the
> host machine.

Right. I went one step further to see if you are exploring the "struct
device" in devm_pci_epc_create() to also be a virtio device or not. I
understood now that is not your goal.
You are just replacing the hypervisor sw with the semi hw to compose the
hardware PCIe EPs.

> Just like how a hypervisor would expose Virtio devices to the guest. Here,
> the hypervisor is replaced by the PCIe endpoint device running the Linux
> kernel.
>
> > And if the idea is to consume MSI vectors by this PCI controller driver,
> > like an updated virtio PCI driver, it looks fine to me.
> >
>
> Yeah, since our devices support only MSI, we want the host side Virtio
> stack to make use of it.
>

Yeah, makes sense.

> [re-quoted INTx/MSI/MSI-X discussion and PCIe spec r5.0 sec 6.1.4 quote
> trimmed]
>
> But there is an 'or' before 'both'; that's what I'm trying to highlight.
> There is no necessity for a device to support MSI-X. But the current Virtio
> Linux driver expects the device to support MSI-X, and otherwise falls back
> to INTx.
>

Yes. We are aligned.

> > > So MSI-X is clearly an optional feature which simple devices tend to
> > > ignore.
> > > But if both are supported, then obviously Virtio will make use of
> > > MSI-X, but that's not the case here.
> > >
> > If both are supported, and the scale required by the driver is <= 32, the
> > driver can choose MSI due to its lightweight nature.
> > Why do you say "obviously virtio will make use of MSI-X"? Do you mean
> > current code or future code?
> >
>
> Current code: ..

Ok. Whatever PCIe EPs you make with MSI-only capability will work, once the
virtio spec and driver support MSI.

The basic motivation to add MSI support to the virtio PCI transport is to
support lightweight composable hw PCI devices, which are more performant than
INTx and lighter weight than MSI-X.

Looks good to me. I am ignoring the masking bits limitation of MSI for now.

Thanks.

> [remainder of quoted message trimmed]
* Re: MSI for Virtio PCI transport
  2024-06-24 16:19 MSI for Virtio PCI transport Manivannan Sadhasivam
  2024-06-25  4:09 ` Parav Pandit
@ 2024-06-25  7:52 ` Michael S. Tsirkin
  2024-06-25  9:19   ` Manivannan Sadhasivam
  1 sibling, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2024-06-25 7:52 UTC (permalink / raw)
  To: Manivannan Sadhasivam; +Cc: virtio-comment, mie

On Mon, Jun 24, 2024 at 09:49:57PM +0530, Manivannan Sadhasivam wrote:
> Hi,
>
> We are looking into adapting the Virtio spec for configurable physical PCIe
> endpoint devices to expose Virtio devices to the host machine connected over
> PCIe. This allows us to use the existing frontend drivers on the host
> machine, thus minimizing the development efforts. This idea is not new, as
> some vendors like NVIDIA have already released customized PCIe devices
> exposing Virtio devices to the host machines. But we are working on making
> configurable PCIe devices running the Linux kernel expose Virtio devices
> using the PCI Endpoint (EP) subsystem.
>
> Below is a simplistic representation of the idea with virt-net as an
> example. But this could be extended to any supported Virtio devices:
>
> [HOST <-> ENDPOINT software-stack diagram snipped; see the first message in
> the thread]
>
> While doing so, we faced an issue due to the lack of MSI support defined in
> the Virtio spec for the PCI transport. Currently, the PCI transport
> (starting from 0.9.5) has only defined INTx (legacy) and MSI-X interrupts
> for the device to send notifications to the guest. While it works well for
> hypervisor to guest communication, when a physical PCIe device is used as a
> Virtio device, the lack of MSI support is hurting the performance (when
> there is no MSI-X).
>
> Most of the physical PCIe endpoint devices support MSI interrupts over MSI-X
> for simplicity, and with Virtio not supporting MSI, falling back to legacy
> INTx interrupts is affecting the performance.
>
> First of all, INTx requires the PCIe devices to send two MSG TLPs
> (Assert/Deassert) to emulate a level triggered interrupt on the host. And
> there could be some delay between the assert and deassert messages to make
> sure that the host recognizes it as an interrupt (level trigger). Also, the
> INTx interrupts are limited to 1 per function, so all the notifications
> from the device have to share this single interrupt (INTA).
>
> On the other hand, MSI requires only one MWr TLP from the device to the
> host, and since it is a posted write, there is no delay involved. Also, a
> single PCIe function can use up to 32 MSIs, thus making it possible to use
> one MSI vector per virtqueue (32 is more than enough for most of the
> usecases).
>
> So my question is, why does the Virtio spec not support MSI? If there are no
> major blockers in supporting MSI, could we propose adding MSI to the Virtio
> spec?
>
> - Mani
>
> -- 
> மணிவண்ணன் சதாசிவம்

Yes, it's possible to add - however, you also said EP requires more
changes from virtio. So maybe we need "virtio over EP" then.
Let's try to figure out the full list of issues, to see which makes
more sense.
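[Editorial aside: one practical wrinkle behind the "up to 32 MSIs, one per virtqueue" argument is that MSI grants vectors only in power-of-two blocks — the Multiple Message Capable/Enable fields encode 1, 2, 4, 8, 16, or 32 — so a device has to round its virtqueue count up. A small standalone helper, illustrative rather than kernel code:]

```c
#include <assert.h>

/*
 * MSI allocates vectors in power-of-two blocks of at most 32
 * (PCI Multiple Message Capable field).  Round a requested count,
 * e.g. one vector per virtqueue, up to what MSI can actually grant.
 * Returns 0 if the request cannot fit in a single MSI capability.
 */
static unsigned int msi_vectors_for(unsigned int nvqs)
{
	unsigned int n;

	if (nvqs == 0 || nvqs > 32)
		return 0;

	/* Find the smallest power of two >= nvqs. */
	for (n = 1; n < nvqs; n <<= 1)
		;
	return n;
}
```

So a device with 3 virtqueues would advertise 4 MSI vectors, and anything beyond 32 queues is where MSI runs out and MSI-X becomes the only option.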
* Re: MSI for Virtio PCI transport
  2024-06-25  7:52 ` Michael S. Tsirkin
@ 2024-06-25  9:19   ` Manivannan Sadhasivam
  0 siblings, 0 replies; 13+ messages in thread
From: Manivannan Sadhasivam @ 2024-06-25 9:19 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-comment, mie

On Tue, Jun 25, 2024 at 03:52:30AM -0400, Michael S. Tsirkin wrote:
> On Mon, Jun 24, 2024 at 09:49:57PM +0530, Manivannan Sadhasivam wrote:
> > Hi,
> >
> > [original proposal, HOST <-> ENDPOINT diagram, and INTx/MSI discussion
> > quoted in full; trimmed]
> >
> > - Mani
> >
> > -- 
> > மணிவண்ணன் சதாசிவம்
>
> Yes, it's possible to add - however, you also said EP requires more
> changes from virtio. So maybe we need "virtio over EP" then.

I don't think we need a separate 'Virtio over EP'. The EP uses the PCI
transport, so 'Virtio over PCI' is fine as it is. It's just that, because
'Virtio over PCI' was designed based on virtual PCI devices exposed by the
hypervisor, the real world limitations were not taken into account. And that's
what we are trying to add.

> Let's try to figure out the full list of issues, to see which makes
> more sense.
>

Sure. But each one would need a separate discussion; that's why I started this
thread for MSI. Let me check with Shunsuke and come up with an exhaustive
list.

- Mani

-- 
மணிவண்ணன் சதாசிவம்
end of thread, other threads: [~2024-06-25 9:59 UTC | newest]

Thread overview: 13+ messages
2024-06-24 16:19 MSI for Virtio PCI transport Manivannan Sadhasivam
2024-06-25  4:09 ` Parav Pandit
2024-06-25  5:43   ` Manivannan Sadhasivam
2024-06-25  6:18     ` Parav Pandit
2024-06-25  7:55       ` Michael S. Tsirkin
2024-06-25  8:00         ` Parav Pandit
2024-06-25  8:09           ` Michael S. Tsirkin
2024-06-25  8:18             ` Parav Pandit
2024-06-25  8:29               ` Michael S. Tsirkin
2024-06-25  9:11       ` Manivannan Sadhasivam
2024-06-25  9:59         ` Parav Pandit
2024-06-25  7:52 ` Michael S. Tsirkin
2024-06-25  9:19   ` Manivannan Sadhasivam