* [RFC] Flexible SR-IOV support for virtio-net
@ 2023-11-18 12:10 Akihiko Odaki
2023-11-28 8:47 ` Yui Washizu
0 siblings, 1 reply; 3+ messages in thread
From: Akihiko Odaki @ 2023-11-18 12:10 UTC (permalink / raw)
To: Michael S. Tsirkin, Marcel Apfelbaum, Washizu Yui,
qemu-devel@nongnu.org
Hi,
We are planning to add PCIe SR-IOV support to the virtio-net driver for
Windows ("NetKVM")[1], and we want an SR-IOV feature in QEMU's virtio-net
emulation code to test it. I expect there are other people
interested in such a feature, considering that people are using igb[2]
to test SR-IOV support in VMs.
Washizu Yui has already proposed an RFC patch to add an SR-IOV feature
to virtio-net emulation[3][4], but it's preliminary and has no
configurability for VFs.
Now I'm proposing to add SR-IOV support to virtio-net with full
configurability for VFs by following the implementation of virtio-net
failover[5]. I'm planning to write patches myself, but I know there are
people interested in such patches so I'd like to let you know the idea
beforehand.
The idea:
The problem when implementing configurability for VFs is that SR-IOV VFs
can be realized and unrealized at runtime with a request from the guest.
So a naive implementation cannot deal with a command line like the
following:
-device virtio-net-pci,addr=0x0.0x0,sriov=on
-device virtio-net-pci,addr=0x0.0x1
-device virtio-net-pci,addr=0x0.0x2
This will realize the virtio-net functions in 0x0.0x1 and 0x0.0x2 when
the guest starts instead of when the guest requests to enable VFs.
However, reviewing the virtio-net emulation code, I realized the
virtio-net failover also "hides" devices when the guest starts. The
following command line hides hostdev0 when the guest starts, and adds it
when the guest negotiates the VIRTIO_NET_F_STANDBY feature:
-device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:6f:55:cc, \
bus=root2,failover=on
-device vfio-pci,host=5e:00.2,id=hostdev0,bus=root1,failover_pair_id=net1
So it should also be possible to "hide" VFs in a similar way and
realize/unrealize them when the guest requests.
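As a rough illustration of the "hide at startup, add later" idea, the following toy Python model keeps the options of hidden devices aside and plugs them only on request. It is not QEMU code: ToyMachine, hide_if, reveal and conceal are hypothetical names made up for this sketch, and the real failover code differs in detail.

```python
# Toy model only: every name here is hypothetical, not a QEMU API.

class ToyMachine:
    def __init__(self, hide_if):
        self.hide_if = hide_if   # predicate deciding which devices to hide
        self.plugged = {}        # device id -> options visible to the guest
        self.hidden = {}         # device id -> options kept for later

    def device_add(self, opts):
        """Handle one -device option at startup."""
        target = self.hidden if self.hide_if(opts) else self.plugged
        target[opts["id"]] = dict(opts)

    def reveal(self, dev_id):
        """Plug a hidden device once the guest asks for it."""
        self.plugged[dev_id] = self.hidden.pop(dev_id)

    def conceal(self, dev_id):
        """Unplug the device again but keep its options for the next request."""
        self.hidden[dev_id] = self.plugged.pop(dev_id)


# Hide every device that names a failover pair, as in the failover example.
machine = ToyMachine(hide_if=lambda opts: "failover_pair_id" in opts)
machine.device_add({"driver": "virtio-net-pci", "id": "net1", "failover": "on"})
machine.device_add({"driver": "vfio-pci", "id": "hostdev0",
                    "failover_pair_id": "net1"})
```

Calling machine.reveal("hostdev0") moves the device into `plugged`; the same bookkeeping could hold VF options until the guest enables VFs.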
There are two things I hate about this idea when contrasting it with the
conventional multifunction feature[6], though. One is that the PF must be
added before VFs; a similar limitation is imposed for failover.
Another is that it will be specific to virtio-net. I considered
implementing a "generic" SR-IOV feature that would work on various
devices, but I realized that it would need a lot of configuration
validation. We may eventually want it, but it's probably better to avoid
such a big leap as the first step.
Please tell me if you have questions or suggestions.
Regards,
Akihiko Odaki
[1] https://github.com/virtio-win/kvm-guest-drivers-windows
[2] https://qemu.readthedocs.io/en/v8.1.0/system/devices/igb.html
[3] https://patchew.org/QEMU/1689731808-3009-1-git-send-email-yui.washidu@gmail.com/
[4] https://netdevconf.info/0x17/sessions/talk/unleashing-sr-iov-offload-on-virtual-machines.html
[5] https://qemu.readthedocs.io/en/v8.1.0/system/virtio-net-failover.html
[6] https://gitlab.com/qemu-project/qemu/-/blob/v8.1.2/docs/pcie.txt
* Re: [RFC] Flexible SR-IOV support for virtio-net
2023-11-18 12:10 [RFC] Flexible SR-IOV support for virtio-net Akihiko Odaki
@ 2023-11-28 8:47 ` Yui Washizu
2023-11-28 9:34 ` Akihiko Odaki
0 siblings, 1 reply; 3+ messages in thread
From: Yui Washizu @ 2023-11-28 8:47 UTC (permalink / raw)
To: Akihiko Odaki, Michael S. Tsirkin, Marcel Apfelbaum,
qemu-devel@nongnu.org
On 2023/11/18 21:10, Akihiko Odaki wrote:
> [...]
> Please tell me if you have questions or suggestions.
Hi, Odaki-san
The idea appears to be practical and convenient.
I have some things I want to confirm.
I understand that your idea can make devices for VFs,
created by the qdev_new or qdev_realize function, invisible to the guest OS.
Is my understanding correct?
And, if your idea is realized,
will it be possible to specify the backend device for the virtio-net-pci
device?
Could you provide insights into the next steps
beyond the implementation details?
About when do you expect your implementation
to be merged into QEMU?
Do you have a timeline for this plan?
Moreover, is there any way
we can collaborate on the implementation you're planning?
Regards,
Yui Washizu
* Re: [RFC] Flexible SR-IOV support for virtio-net
2023-11-28 8:47 ` Yui Washizu
@ 2023-11-28 9:34 ` Akihiko Odaki
0 siblings, 0 replies; 3+ messages in thread
From: Akihiko Odaki @ 2023-11-28 9:34 UTC (permalink / raw)
To: Yui Washizu, Michael S. Tsirkin, Marcel Apfelbaum,
qemu-devel@nongnu.org
On 2023/11/28 17:47, Yui Washizu wrote:
> On 2023/11/18 21:10, Akihiko Odaki wrote:
>> [...]
>> Please tell me if you have questions or suggestions.
>
> Hi, Odaki-san
Hi,
> The idea appears to be practical and convenient.
>
> I have some things I want to confirm.
> I understand that your idea can make devices for VFs,
> created by the qdev_new or qdev_realize function, invisible to the guest OS.
> Is my understanding correct?
Yes, the guest will request to enable VFs with the standard SR-IOV
capability, and the virtio-net implementation will use appropriate
QEMU-internal APIs to create and realize VFs accordingly.
> And, if your idea is realized,
> will it be possible to specify the backend device for the virtio-net-pci
> device?
Yes, you can specify netdev like conventional virtio-net devices.
> Could you provide insights into the next steps
> beyond the implementation details?
> About when do you expect your implementation
> to be merged into QEMU?
> Do you have a timeline for this plan?
> Moreover, is there any way
> we can collaborate on the implementation you're planning?
I intend to upstream my implementation. The flexibility of this design
will make the SR-IOV support useful for many people and make it suitable
for upstreaming. I also expect the implementation will be clean enough
for upstreaming. I'll submit it to the mailing list when I finish the
implementation, so I'd like you to test and review it.
By the way, I started the implementation and realized it may be better
to change the design, so I present the design changes below:
First, I intend to change the CLI. The interface in my last proposal
expects that there is only one PF in a bus and that it is marked with
the "sriov" property. However, the specification allows multiple PFs in
a bus, so it's better to design the CLI to allow multiple PFs, though
I'm not going to implement such a feature at first.
The new CLI will instead add an "sriov-pf" property to VFs, which
designates the PF paired with them. Below is an example of a command
line conforming to the new interface:
-device virtio-net-pci,addr=0x0.0x3,netdev=tap3,sriov-pf=pf1
-device virtio-net-pci,addr=0x0.0x2,netdev=tap2,id=pf1
-device virtio-net-pci,addr=0x0.0x1,netdev=tap1,sriov-pf=pf0
-device virtio-net-pci,addr=0x0.0x0,netdev=tap0,id=pf0
Another design change is *not* to use the "device hiding" API of
failover. This is because fully-realized devices are useful when
validating the configuration. In particular, VFs must have a consistent
BAR configuration, and that can be validated only after they are
realized.
So I'm now considering having "prototype VFs" realized before the PF
gets realized. Prototype VFs will be fully realized, but
virtio_write_config() and virtio_read_config() will do nothing for those
VFs, which effectively disables them. This is similar to how functions
are disabled until function 0 gets plugged for a conventional
multifunction device (cf. pci_host_config_write_common() and
pci_host_config_read_common()).
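To make the "prototype VF" idea concrete, here is a toy Python model of a function whose configuration accessors do nothing while it is a prototype. ToyFunction and its fields are hypothetical and this is not QEMU code. Returning all ones for an ignored read is an assumption of this sketch (it is how an absent PCI function typically reads), not something stated in the proposal.

```python
# Toy model only: ToyFunction is hypothetical and is not QEMU code.

ALL_ONES = 0xFFFFFFFF  # assumption: value a guest sees for an ignored read


class ToyFunction:
    def __init__(self, prototype):
        self.prototype = prototype      # True while acting as a prototype VF
        self.config = bytearray(256)    # conventional PCI config space size

    def write_config(self, offset, value, size=4):
        if self.prototype:
            return                      # ignored: the guest cannot change it
        self.config[offset:offset + size] = value.to_bytes(size, "little")

    def read_config(self, offset, size=4):
        if self.prototype:
            return ALL_ONES >> (8 * (4 - size))   # looks like no device
        return int.from_bytes(self.config[offset:offset + size], "little")


proto = ToyFunction(prototype=True)
proto.write_config(0x10, 0x1234)        # dropped while it is a prototype
blocked = proto.read_config(0x10)

real = ToyFunction(prototype=False)     # an ordinary, guest-visible function
real.write_config(0x10, 0x1234)
visible = real.read_config(0x10)
```

The prototype still carries its full state in this model, so another component can inspect it even though the guest cannot reach it through configuration accesses.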
When the PF gets realized, the PF will validate the configuration by
inspecting the prototype VFs. If the configuration looks valid, the PF
backs up their DeviceState::opts and unplugs them. The PF will later use
the backed-up device options to realize VFs when the guest requests.
This design change forces VFs to be created before the PF in the command
line. This is similar to how the conventional multifunction feature
requires function 0 to be realized after the other functions.
I may make other design changes as the implementation progresses, but
the above is the current design I have in mind.
Regards,
Akihiko Odaki
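The prototype-VF flow described in the message above (validate, back up options, unplug, then realize on request) can be sketched as a toy Python model. ToyBus, ToyPF, set_num_vfs and the bar_sizes argument are hypothetical names invented for this sketch; only the "sriov-pf" property comes from the proposal, and real QEMU code would look quite different.

```python
# Toy model only: ToyBus, ToyPF and bar_sizes are hypothetical, not QEMU code.

class ToyBus:
    def __init__(self):
        self.devices = {}                       # addr -> (opts, bar_sizes)

    def realize(self, opts, bar_sizes):
        self.devices[opts["addr"]] = (dict(opts), tuple(bar_sizes))

    def unplug(self, addr):
        return self.devices.pop(addr)


class ToyPF:
    def __init__(self, bus, pf_id):
        self.bus, self.pf_id = bus, pf_id
        self.saved = []                         # backed-up VF options

    def realize(self):
        """Validate prototype VFs, back up their options, then unplug them."""
        mine = [(o, b) for o, b in self.bus.devices.values()
                if o.get("sriov-pf") == self.pf_id]
        if len({bars for _, bars in mine}) > 1:
            raise ValueError("VFs of one PF must share a BAR configuration")
        for opts, bars in sorted(mine, key=lambda vf: vf[0]["addr"]):
            self.bus.unplug(opts["addr"])
            self.saved.append((opts, bars))

    def set_num_vfs(self, count):
        """Guest request: realize the first `count` VFs from saved options."""
        for opts, bars in self.saved[:count]:
            self.bus.realize(opts, bars)
        for opts, _ in self.saved[count:]:
            self.bus.devices.pop(opts["addr"], None)


bus = ToyBus()
bus.realize({"addr": "0x0.0x1", "netdev": "tap1", "sriov-pf": "pf0"}, [0x4000])
bus.realize({"addr": "0x0.0x2", "netdev": "tap2", "sriov-pf": "pf0"}, [0x4000])
pf = ToyPF(bus, "pf0")
pf.realize()          # prototypes validated, backed up and unplugged
```

After pf.realize() the bus is empty, and pf.set_num_vfs(1) brings back only the first VF from the saved options. Because the prototypes must exist when the PF is realized, the model also shows why VFs have to come first on the command line.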