* Using virtio as a physical (wire-level) transport
From: Ira W. Snyder @ 2010-08-04 23:04 UTC
To: Michael S. Tsirkin, Rusty Russell; +Cc: netdev, Zang Roy, virtualization

Hello Michael, Rusty,

I'm trying to figure out how to use virtio-net and vhost-net to
communicate over a physical transport (a PCI bus) instead of shared
memory (as in a qemu/kvm guest). We've talked about this several times
in the past, and I currently have some time to devote to it again. I'm
trying to figure out whether virtio is still a viable solution, or
whether it has evolved to the point of being unusable for this
application.

I am trying to create a generic system to allow the type of
communication described below. I would like to create something that
can be easily ported to any slave computer which meets the following
requirements:

1) it is a PCI slave (agent), i.e. it acts like any other PCI card
2) it has an inter-processor communications mechanism
3) it has a DMA engine

There is a reasonable amount of demand for such a system. I get
inquiries about the prototype code I posted to linux-netdev at least
once a month. This sort of system is used regularly in the
telecommunications industry, among others.

Here is a quick drawing of the system I work with. Please forgive my
poor ASCII art skills.

+------------------+
| master computer  |
|                  |
|                  |                             +-------------------+
|   PCI slot #1    | <-- physical connection --> | slave computer #1 |
|  virtio-net if#1 |                             |   vhost-net if#1  |
|                  |                             +-------------------+
|                  |
|                  |                             +-------------------+
|   PCI slot #2    | <-- physical connection --> | slave computer #2 |
|  virtio-net if#2 |                             |   vhost-net if#2  |
|                  |                             +-------------------+
|                  |
|                  |                             +-------------------+
|   PCI slot #n    | <-- physical connection --> | slave computer #n |
|  virtio-net if#n |                             |   vhost-net if#n  |
|                  |                             +-------------------+
+------------------+

The reason for using vhost-net on the "slave" side is that vhost-net is
the component that performs the data copies. In most cases, the slave
computers are non-x86 and have DMA controllers. DMA is an absolute
necessity when copying data across the PCI bus.

Do you think virtio is a viable solution to this problem? If not, can
you suggest anything else? Another reason I ask is that I previously
invested several months implementing a similar solution, only to have
it outright rejected for "not being the right way". If you don't think
something like this has any hope, I'd rather not waste another month of
my life. If you can think of a solution that is likely to be "the right
way", I'd rather you told me before I implement any code.

Making my life harder since the last time I tried this, mainline commit
7c5e9ed0c (virtio_ring: remove a level of indirection) has removed the
possibility of using an alternative virtqueue implementation. The
commit message suggests that you might be willing to add this
capability back. Would this be an option?

Thanks for your time,
Ira
* Re: Using virtio as a physical (wire-level) transport
From: Michael S. Tsirkin @ 2010-08-05 21:30 UTC
To: Ira W. Snyder; +Cc: Rusty Russell, virtualization, Zang Roy, netdev

Hi Ira,

> Making my life harder since the last time I tried this, mainline commit
> 7c5e9ed0c (virtio_ring: remove a level of indirection) has removed the
> possibility of using an alternative virtqueue implementation. The
> commit message suggests that you might be willing to add this
> capability back. Would this be an option?

Sorry about that.

With respect to this commit, we only had one implementation upstream,
and the extra level of indirection made extending the API much harder
for no apparent benefit.

When there is more than one ring implementation with only a small
amount of common code, I think it might make sense to add the
indirection back, to separate the code cleanly.

OTOH, if the two implementations share a lot of code, I think it might
be better to just add a couple of if statements here and there. This
way the compiler might even have a chance to compile the code out if
the feature is disabled in the kernel config.

-- 
MST
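For context, a minimal sketch of what "a level of indirection" means
here: the virtio core calls through a per-implementation ops table, so
a second ring implementation can be slotted in without touching
callers. All names below are hypothetical; this is not the mainline
API either before or after commit 7c5e9ed0c.

#include <linux/types.h>
#include <linux/scatterlist.h>

struct virtqueue;

/* Hypothetical ops table; one instance per ring implementation. */
struct virtqueue_ops {
	int   (*add_buf)(struct virtqueue *vq, struct scatterlist sg[],
			 unsigned int out_num, unsigned int in_num,
			 void *data);
	void  (*kick)(struct virtqueue *vq);
	void *(*get_buf)(struct virtqueue *vq, unsigned int *len);
	void  (*disable_cb)(struct virtqueue *vq);
	bool  (*enable_cb)(struct virtqueue *vq);
};

struct virtqueue {
	const struct virtqueue_ops *ops;  /* filled in by the implementation */
	void *priv;                       /* implementation-private state */
};

/* Callers never see which ring implementation is behind the queue. */
static inline void virtqueue_kick(struct virtqueue *vq)
{
	vq->ops->kick(vq);
}

The alternative described above is to keep a single implementation and
branch on a flag inside each function, which the compiler can eliminate
entirely when the second ring is configured out.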
* Re: Using virtio as a physical (wire-level) transport
From: Ira W. Snyder @ 2010-08-05 23:01 UTC
To: Michael S. Tsirkin; +Cc: Rusty Russell, virtualization, Zang Roy, netdev

On Fri, Aug 06, 2010 at 12:30:50AM +0300, Michael S. Tsirkin wrote:
> When there is more than one ring implementation with only a small
> amount of common code, I think it might make sense to add the
> indirection back, to separate the code cleanly.
>
> OTOH, if the two implementations share a lot of code, I think it might
> be better to just add a couple of if statements here and there. This
> way the compiler might even have a chance to compile the code out if
> the feature is disabled in the kernel config.
>

The virtqueue implementation I envision will be almost identical to the
current virtio_ring virtqueue implementation, with the following
exceptions:

* the "shared memory" will actually be remote, on the PCI BAR of a device
* iowrite32(), ioread32() and friends will be needed to access the memory
* there will only be a fixed number of virtqueues available, due to the
  PCI BAR size
* cross-endian virtqueues must work
* kick needs to be cross-machine (using PCI IRQs)

I don't think it is feasible to add this to the existing
implementation. I think the requirement of being cross-endian will be
the hardest to overcome. Rusty did not envision the cross-endian use
case when he designed this, and it shows, in virtio_ring, virtio_net
and vhost. I have no idea what to do about this. Do you have any ideas?

I plan to create a custom socket similar to tun/macvtap which will use
DMA to transfer the data around. This, along with a few other tricks,
will allow me to use vhost_net to operate the device. Along with a
custom virtqueue implementation meeting the requirements above, this
seems like a good plan.

Thanks for responding,
Ira
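To make the list above concrete, here is a very rough sketch of what
one such access might look like, assuming a device that exposes the
ring and a doorbell register inside a BAR. Every name here is invented;
this is not existing code. Note that ioread16()/iowrite16() transfer
little-endian values on the bus, which is also where the
fixed-endianness question comes from.

#include <linux/io.h>
#include <linux/kernel.h>
#include <linux/virtio_ring.h>

/* Hypothetical driver state; none of these names exist in mainline. */
struct remote_vring {
	void __iomem *avail;     /* struct vring_avail inside the PCI BAR */
	void __iomem *doorbell;  /* device register that raises an IRQ on the peer */
	unsigned int num;        /* fixed ring size dictated by the BAR layout */
	u16 avail_idx;           /* local shadow of avail->idx */
};

/* Publish one descriptor head, then bump the index the peer watches. */
static void remote_vring_add_avail(struct remote_vring *vq, u16 head)
{
	iowrite16(head, vq->avail + offsetof(struct vring_avail, ring) +
			(vq->avail_idx % vq->num) * sizeof(u16));
	wmb();  /* make the ring entry visible before the index update */
	iowrite16(++vq->avail_idx,
		  vq->avail + offsetof(struct vring_avail, idx));
}

/* "kick": instead of a hypercall or PIO exit, poke a doorbell register. */
static void remote_vring_kick(struct remote_vring *vq)
{
	iowrite32(1, vq->doorbell);
}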
* Re: Using virtio as a physical (wire-level) transport
From: Michael S. Tsirkin @ 2010-08-05 23:20 UTC
To: Ira W. Snyder; +Cc: Rusty Russell, virtualization, Zang Roy, netdev

On Thu, Aug 05, 2010 at 04:01:03PM -0700, Ira W. Snyder wrote:
> The virtqueue implementation I envision will be almost identical to the
> current virtio_ring virtqueue implementation, with the following
> exceptions:
>
> * the "shared memory" will actually be remote, on the PCI BAR of a device
> * iowrite32(), ioread32() and friends will be needed to access the memory
> * there will only be a fixed number of virtqueues available, due to the
>   PCI BAR size
> * cross-endian virtqueues must work
> * kick needs to be cross-machine (using PCI IRQs)
>
> I don't think it is feasible to add this to the existing
> implementation. I think the requirement of being cross-endian will be
> the hardest to overcome. Rusty did not envision the cross-endian use
> case when he designed this, and it shows, in virtio_ring, virtio_net
> and vhost. I have no idea what to do about this. Do you have any ideas?

My guess is that sticking an if around each access in virtio would
hurt, if this is what you are asking about.

Just a crazy idea: vhost already uses wrappers like get_user etc.;
maybe when building the kernel for your board you could redefine these
to also byteswap?

-- 
MST
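As a rough illustration of that idea, the wrappers might look like the
following. Nothing like this exists in mainline vhost; the macro names
are invented, and a real build would need equivalents for every access
width the ring uses.

#include <linux/types.h>
#include <linux/uaccess.h>
#include <asm/byteorder.h>

/*
 * Purely illustrative: a board-specific build could funnel vhost's
 * ring accesses through wrappers like these, so 16-bit fields are
 * always interpreted as little-endian even on a big-endian slave CPU.
 */
#define vhost_get_le16(val, ptr)				\
({								\
	__u16 __raw;						\
	int __ret = get_user(__raw, (ptr));			\
	(val) = le16_to_cpu((__force __le16)__raw);		\
	__ret;							\
})

#define vhost_put_le16(val, ptr)				\
	put_user((__force __u16)cpu_to_le16(val), (ptr))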
* Re: Using virtio as a physical (wire-level) transport
From: Ira W. Snyder @ 2010-08-06 15:34 UTC
To: Michael S. Tsirkin; +Cc: Rusty Russell, virtualization, Zang Roy, netdev

On Fri, Aug 06, 2010 at 02:20:42AM +0300, Michael S. Tsirkin wrote:
> My guess is that sticking an if around each access in virtio would
> hurt, if this is what you are asking about.
>

Yes, I think so too. I think using a fixed little-endian byte order
everywhere in virtio would be a good thing. In addition, it means that
on x86 things continue to work as-is, so it would have no overhead in
the most common case: x86-on-x86.

This problem is not limited to my new use of virtio. Virtio is
completely useless in a relatively common virtualization scenario: an
x86 host with a qemu-ppc guest, or any other big-endian guest system.

> Just a crazy idea: vhost already uses wrappers like get_user etc.;
> maybe when building the kernel for your board you could redefine these
> to also byteswap?
>

I think the idea is clever, but also psychotic :) I'm sure it would
work, but it only solves the problem of the virtio ring descriptors.
The virtio-net header contains several __u16 fields which would also
need a fixed endianness.

Thanks,
Ira
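The fields in question are the __u16 members of struct virtio_net_hdr
from <linux/virtio_net.h> (hdr_len, gso_size, csum_start, csum_offset).
A sketch of what a fixed-endianness variant could look like, purely to
illustrate the change being discussed; this structure is not defined
anywhere in the tree:

#include <linux/types.h>

/*
 * Hypothetical little-endian variant of struct virtio_net_hdr.  Both
 * sides would convert with cpu_to_le16()/le16_to_cpu() when filling
 * in or parsing the header.
 */
struct virtio_net_hdr_le {
	__u8   flags;
	__u8   gso_type;
	__le16 hdr_len;       /* length of the header the packet starts with */
	__le16 gso_size;      /* GSO segment size */
	__le16 csum_start;    /* offset where checksumming starts */
	__le16 csum_offset;   /* offset from csum_start to place the checksum */
};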
* Re: Using virtio as a physical (wire-level) transport
From: Alexander Graf @ 2010-08-14 11:34 UTC
To: Ira W. Snyder; +Cc: Michael S. Tsirkin, netdev@vger.kernel.org, Zang Roy, virtualization@lists.linux-foundation.org

On 06.08.2010 at 11:34, "Ira W. Snyder" <iws@ovro.caltech.edu> wrote:

> This problem is not limited to my new use of virtio. Virtio is
> completely useless in a relatively common virtualization scenario: an
> x86 host with a qemu-ppc guest, or any other big-endian guest system.

This one actually works, because we know that we're building for a BE
guest. But I agree that it's a mess and clearly a very incorrect design
decision.

>> Just a crazy idea: vhost already uses wrappers like get_user etc.;
>> maybe when building the kernel for your board you could redefine these
>> to also byteswap?
>
> I think the idea is clever, but also psychotic :) I'm sure it would
> work, but it only solves the problem of the virtio ring descriptors.
> The virtio-net header contains several __u16 fields which would also
> need a fixed endianness.

I'd vote for defining a virtio v2 that makes everything LE. Maybe we
could even have an LE capability, with a grace period for phasing out
non-LE-capable hosts and guests.

Alex
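A back-of-the-envelope sketch of what the "LE capability" half of that
proposal might look like, using an invented feature bit. Neither the
bit number nor the helper below exists; only virtio_has_feature() is
the real kernel interface.

#include <linux/virtio_config.h>
#include <asm/byteorder.h>

/* Hypothetical feature bit; no such flag is defined anywhere. */
#define VIRTIO_F_RING_LE	31

/*
 * During the proposed grace period, new hosts/guests negotiate the bit
 * and use a fixed byte order; old ones keep guest-native behaviour.
 */
static inline u16 vring_load16(struct virtio_device *vdev, const __u16 *p)
{
	if (virtio_has_feature(vdev, VIRTIO_F_RING_LE))
		return le16_to_cpu(*(const __le16 *)p);
	return *p;	/* legacy: guest-native endianness */
}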
* Re: Using virtio as a physical (wire-level) transport
From: Rusty Russell @ 2010-08-16 0:19 UTC
To: virtualization; +Cc: Alexander Graf, Ira W. Snyder, netdev@vger.kernel.org, Zang Roy, Michael S. Tsirkin

On Sat, 14 Aug 2010 09:04:19 pm Alexander Graf wrote:
> On 06.08.2010 at 11:34, "Ira W. Snyder" <iws@ovro.caltech.edu> wrote:
> > This problem is not limited to my new use of virtio. Virtio is
> > completely useless in a relatively common virtualization scenario: an
> > x86 host with a qemu-ppc guest, or any other big-endian guest system.
>
> This one actually works, because we know that we're building for a BE
> guest. But I agree that it's a mess and clearly a very incorrect design
> decision.

Yes, since you need to know the guest's endianness to virtualize it,
the correct interpretation of the virtio ring seemed the least of the
problems. Perhaps I went overboard in simplification here, but it
seemed pure legacy.

If we did a virtio2, as has been suggested, it would be possible to
address this.

You could of course do a hack where you detect the ring endianness the
first time the guest uses it (based on avail.flags, avail.index and the
descriptors, it would be quite reliable in practice).

Cheers,
Rusty.
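A sketch of the kind of detection being suggested, assuming the host
reads the raw avail index the first time the guest kicks. This is
illustrative only; a real implementation would also look at
avail.flags and the first descriptor's fields before deciding.

#include <linux/virtio_ring.h>

static bool ring_looks_byteswapped(const struct vring *vr)
{
	u16 idx = vr->avail->idx;	/* read raw, in guest byte order */

	/*
	 * Right after the guest starts using the ring, the index is a
	 * small number.  Seeing 0x0100 where 0x0001 is expected is a
	 * strong hint that the other side uses the opposite endianness.
	 */
	return (idx & 0xff) == 0 && (idx >> 8) != 0;
}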
* Re: Using virtio as a physical (wire-level) transport
From: Michael S. Tsirkin @ 2010-09-06 11:19 UTC
To: Alexander Graf; +Cc: Ira W. Snyder, netdev@vger.kernel.org, Zang Roy, virtualization@lists.linux-foundation.org

On Sat, Aug 14, 2010 at 07:34:19AM -0400, Alexander Graf wrote:
> I'd vote for defining a virtio v2 that makes everything LE. Maybe we
> could even have an LE capability, with a grace period for phasing out
> non-LE-capable hosts and guests.

So there are multiple ideas floating around for modifying the ring, and
together they might warrant a virtio2. These include removing the
available ring, publishing consumer indexes, and possibly some
interrupt mitigation ideas; we could put endianness there as well.

-- 
MST
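One of those ideas, publishing consumer indexes, might look roughly
like this on the wire. The layout is hypothetical and sketched only to
make the idea concrete; it is not part of any virtio spec or header at
the time of this thread.

#include <linux/types.h>

/*
 * Each side exports how far it has consumed, so the producer can
 * suppress needless kicks and interrupts.
 */
struct vring_consumer_idx {
	__le16 used_event;   /* driver: "notify me once used.idx passes this" */
	__le16 avail_event;  /* device: "kick me once avail.idx passes this" */
};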