From: Wei Wang <wei.w.wang@intel.com>
To: "Marc-André Lureau" <marcandre.lureau@gmail.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Stefan Hajnoczi" <stefanha@gmail.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"virtio-dev@lists.oasis-open.org"
	<virtio-dev@lists.oasis-open.org>
Subject: Re: [Qemu-devel] Vhost-pci RFC2.0
Date: Wed, 19 Apr 2017 16:33:36 +0800
Message-ID: <58F720E0.6070709@intel.com>
In-Reply-To: <CAJ+F1CL3P96gSVcZCCPatZGp7yUGeryEVSV4aX4rDXRjVBo0uw@mail.gmail.com>

On 04/19/2017 03:31 PM, Marc-André Lureau wrote:
> Hi
>
> On Wed, Apr 19, 2017 at 10:38 AM Wang, Wei W <wei.w.wang@intel.com> wrote:
>
>     Hi,
>     We made some design changes to the original vhost-pci design, and want
>     to open a discussion about the latest design (labelled 2.0) and its
>     extension (2.1).
>     2.0 design: One VM shares the entire memory of another VM.
>     2.1 design: One VM uses an intermediate memory, shared with another VM,
>     for packet transmission.
>     For the convenience of discussion, I have some pictures presented at
>     this link:
>     https://github.com/wei-w-wang/vhost-pci-discussion/blob/master/vhost-pci-rfc2.0.pdf
>     Fig. 1 shows the common driver frame that we want to use to build the
>     2.0 and 2.1 designs. A TX/RX engine consists of a local ring and an
>     exotic ring.
>
>
> Isn't "external" (or "remote") more appropriate than "exotic"?

OK, probably we can use "remote" here.

>
>     Local ring:
>     1) allocated by the driver itself;
>     2) registered with the device (i.e. virtio_add_queue()).
>     Exotic ring:
>     1) the ring memory comes from outside the driver, and is exposed to
>     the driver via a BAR MMIO;
>     2) has no registration in the device, so no ioeventfd/irqfd or
>     configuration registers are allocated for it in the device.
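
To make the local/remote-ring split above concrete, here is a minimal
sketch of the device side. It is not taken from the draft code; the
function names and the queue size are made up for illustration:

#include "qemu/osdep.h"
#include "hw/virtio/virtio.h"

/* Illustrative sketch only (names below are not from the draft code). */
static void vpnet_handle_local_tx(VirtIODevice *vdev, VirtQueue *vq)
{
    /* drain the local TX ring here */
}

static void vpnet_handle_local_rx(VirtIODevice *vdev, VirtQueue *vq)
{
    /* refill/process the local RX ring here */
}

/* Called from the device's realize function, after virtio_init(). */
static void vpnet_add_local_rings(VirtIODevice *vdev)
{
    /* Local rings: ordinary virtqueues, so the virtio core gives them
     * ioeventfd/irqfd and configuration registers as usual. */
    virtio_add_queue(vdev, 256, vpnet_handle_local_tx);
    virtio_add_queue(vdev, 256, vpnet_handle_local_rx);

    /* The remote ("exotic") rings are intentionally not added here:
     * their memory comes from outside the driver and is only mapped
     * into a BAR, so no ioeventfd/irqfd or configuration registers
     * are allocated for them in the device. */
}
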
>     Fig. 2 shows how the driver frame is used to build the 2.0 design.
>     1) Asymmetric: vhost-pci-net <-> virtio-net
>     2) VM1 shares the entire memory of VM2, and the exotic rings are the
>     rings from VM2.
>     3) Performance (in terms of copies between VMs):
>         TX: 0-copy (packets are put into VM2's RX ring directly)
>         RX: 1-copy (the green arrow line in VM1's RX engine)
>
>
> Why is the copy necessary?

Because the packet from the remote ring can't be delivered to the
network stack directly. To be more precise:
1) The buffer from the remote ring is not allocated by the guest driver.
   If that buffer were delivered to the network stack directly, the
   network stack would end up freeing a buffer it did not allocate;
2) Thinking about the vring operation: after getting a buffer from the
   avail ring, we need to put the used buffer back onto the used ring
   to tell the other end that the buffer has been consumed. The network
   stack won't do that operation.

So, based on these two points, I think we need to use a local ring and
copy the packet into a buffer from the local ring (i.e. buffer memory
allocated by the guest driver), and the driver will do the "give back
the used buffer" operation explained in 2).
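
To illustrate, the RX path in the guest driver would look roughly like
the sketch below. This is only a sketch of the idea, not code from the
draft; struct vpnet_dev, struct remote_desc, remote_ring_get_avail()
and remote_ring_add_used() are hypothetical names standing in for direct
accesses to the remote ring mapped through the BAR:

#include <linux/types.h>
#include <linux/string.h>
#include <linux/errno.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <linux/etherdevice.h>

/* Hypothetical view of one buffer on the remote (exotic) ring. */
struct remote_desc {
    void *addr;   /* mapping of the remote buffer (the other VM's memory) */
    u32   len;
};

struct vpnet_remote_ring;   /* lives in the BAR mapping; layout omitted */

struct vpnet_dev {
    struct net_device        *netdev;
    struct vpnet_remote_ring *remote_rx;
};

/* Hypothetical helpers that walk the remote ring through the BAR mapping. */
struct remote_desc *remote_ring_get_avail(struct vpnet_remote_ring *r);
void remote_ring_add_used(struct vpnet_remote_ring *r, struct remote_desc *d);

static int vpnet_rx_one(struct vpnet_dev *vp)
{
    struct remote_desc *desc;
    struct sk_buff *skb;

    /* 1. Take the next available buffer from the remote avail ring.
     *    This memory belongs to the other VM, not to this guest. */
    desc = remote_ring_get_avail(vp->remote_rx);
    if (!desc)
        return -EAGAIN;

    /* 2. The one copy: move the packet into a buffer allocated by this
     *    guest driver, because the network stack will eventually free
     *    whatever skb it is handed, and it must not free the other VM's
     *    memory (point 1 above). */
    skb = netdev_alloc_skb(vp->netdev, desc->len);
    if (!skb)
        return -ENOMEM;
    memcpy(skb_put(skb, desc->len), desc->addr, desc->len);

    /* 3. Return the buffer via the remote used ring so the other end
     *    knows it has been consumed; the network stack would never do
     *    this step for us (point 2 above). */
    remote_ring_add_used(vp->remote_rx, desc);

    /* 4. Deliver the locally owned copy to the network stack. */
    skb->protocol = eth_type_trans(skb, vp->netdev);
    netif_receive_skb(skb);
    return 0;
}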


>     Fig. 3 shows how the driver frame is used to build the 2.1 design.
>     1) Symmetric: vhost-pci-net <-> vhost-pci-net
>     2) Share an intermediate memory, allocated by VM1's vhost-pci device,
>     for data exchange; the exotic rings are built on the shared memory.
>     3) Performance:
>         TX: 1-copy
>         RX: 1-copy
>     Fig. 4 shows the inter-VM notification path for 2.0 (2.1 is similar).
>     The four eventfds are allocated by virtio-net and shared with
>     vhost-pci-net:
>     virtio-net's TX/RX kickfd is used as vhost-pci-net's RX/TX callfd;
>     virtio-net's TX/RX callfd is used as vhost-pci-net's RX/TX kickfd.
>     Example of how it works:
>     After packets are put into vhost-pci-net's TX ring, the driver kicks
>     TX, which causes an interrupt associated with fd3 to be injected into
>     virtio-net.
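
Spelling out the fd swap without the TX/RX slash notation, the wiring is
as follows (an illustrative fragment; the struct and variable names are
made up, not from the draft code):

/* Illustrative only: reusing the four eventfds allocated by virtio-net
 * on the vhost-pci-net side. */
struct vpnet_notify {
    int tx_kickfd, tx_callfd;
    int rx_kickfd, rx_callfd;
};

static void vpnet_wire_eventfds(struct vpnet_notify *vp,
                                int vnet_tx_kickfd, int vnet_tx_callfd,
                                int vnet_rx_kickfd, int vnet_rx_callfd)
{
    /* virtio-net's TX/RX kickfd become vhost-pci-net's RX/TX callfd... */
    vp->rx_callfd = vnet_tx_kickfd;
    vp->tx_callfd = vnet_rx_kickfd;
    /* ...and virtio-net's TX/RX callfd become vhost-pci-net's RX/TX kickfd. */
    vp->rx_kickfd = vnet_tx_callfd;
    vp->tx_kickfd = vnet_rx_callfd;
    /* A kick on vhost-pci-net's TX ring therefore signals the same eventfd
     * as virtio-net's RX callfd, i.e. it injects an RX interrupt into the
     * virtio-net VM (the fd3 example above, assuming fd3 is virtio-net's
     * RX callfd). */
}
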
>     The draft code of the 2.0 design is ready, and can be found here:
>     Qemu: https://github.com/wei-w-wang/vhost-pci-device
>
>
> The repository contains a single big commit
> (https://github.com/wei-w-wang/vhost-pci-device/commit/fa01ec5e41de176197dae505c05b659f5483187f).
> Please try to provide a separate patch, or a series of patches, based on
> an upstream commit/release point.

It's the testable version of the 2.0 design. I will split it up.
If possible, I hope we can review the design first, especially the common
driver frame. Then I can make the related changes based on the discussion
and post the patch series.

Best,
Wei
