From: Wei Wang <wei.w.wang@intel.com>
To: "Jan Kiszka" <jan.kiszka@siemens.com>,
	"Marc-André Lureau" <marcandre.lureau@gmail.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Stefan Hajnoczi" <stefanha@gmail.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"virtio-dev@lists.oasis-open.org"
	<virtio-dev@lists.oasis-open.org>
Cc: Jailhouse <jailhouse-dev@googlegroups.com>
Subject: Re: [Qemu-devel] [virtio-dev] Re: Vhost-pci RFC2.0
Date: Wed, 19 Apr 2017 17:09:22 +0800
Message-ID: <58F72942.3030802@intel.com>
In-Reply-To: <d96b8097-b541-9908-8141-86267e726471@siemens.com>

On 04/19/2017 04:49 PM, Jan Kiszka wrote:
> On 2017-04-19 10:42, Wei Wang wrote:
>> On 04/19/2017 03:35 PM, Jan Kiszka wrote:
>>> On 2017-04-19 08:38, Wang, Wei W wrote:
>>>> Hi,
>>>>    We made some design changes to the original vhost-pci design, and
>>>> want to open a discussion about the latest design (labelled 2.0) and
>>>> its extension (2.1).
>>>> 2.0 design: One VM shares the entire memory of another VM.
>>>> 2.1 design: One VM uses an intermediate memory shared with another VM
>>>>             for packet transmission.
>>>>    For the convenience of discussion, I have some pictures presented at
>>>> this link:
>>>> https://github.com/wei-w-wang/vhost-pci-discussion/blob/master/vhost-pci-rfc2.0.pdf
>>>>
>>>>    Fig. 1 shows the common driver frame that we want to use to build
>>>> the 2.0 and 2.1 designs. A TX/RX engine consists of a local ring and
>>>> an exotic ring.
>>>> Local ring:
>>>> 1) allocated by the driver itself;
>>>> 2) registered with the device (i.e. virtio_add_queue())
>>>> Exotic ring:
>>>> 1) ring memory comes from outside the driver, and is exposed to the
>>>> driver via a BAR (MMIO);
>>> Small additional requirement: In order to make this usable with
>>> Jailhouse as well, we need [also] a side-channel configuration for the
>>> regions, i.e. likely via a PCI capability. There are too few BARs, and
>>> they suggest relocatability, which is not available under Jailhouse for
>>> simplicity reasons (IOW, the shared regions are statically mapped by the
>>> hypervisor into the affected guest address spaces).
>> What kind of configuration would you need for the regions?
>> I think adding a PCI capability should be easy.
> Basically address and size, see
> https://github.com/siemens/jailhouse/blob/wip/ivshmem2/Documentation/ivshmem-v2-specification.md#vendor-specific-capability-id-09h
Got it, thanks. That should be easy to add to 2.1.
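
For reference, a minimal sketch of how a Linux guest driver could read such a
capability, assuming an illustrative layout (a 64-bit region address followed
by a 64-bit size behind the vendor-specific capability header); the vpci_*
names and the field offsets are assumptions, not taken from the ivshmem-v2
spec or the vhost-pci code:

#include <linux/pci.h>
#include <linux/errno.h>

/* Illustrative only: region address and size carried in a vendor-specific
 * capability (ID 09h); the offsets below are an assumed layout. */
struct vpci_region_cap {
	u64 addr;	/* guest-physical base of the shared region */
	u64 size;	/* region size in bytes */
};

static int vpci_read_region_cap(struct pci_dev *pdev,
				struct vpci_region_cap *out)
{
	u8 pos = 0;
	u32 lo, hi;

	/* Walk the PCI capability list; a real driver would also check an
	 * ID byte inside the capability instead of taking the first one. */
	while ((pos = pci_find_next_capability(pdev, pos, PCI_CAP_ID_VNDR))) {
		/* Assumed layout: addr at +4, size at +12, little-endian. */
		pci_read_config_dword(pdev, pos + 4, &lo);
		pci_read_config_dword(pdev, pos + 8, &hi);
		out->addr = ((u64)hi << 32) | lo;
		pci_read_config_dword(pdev, pos + 12, &lo);
		pci_read_config_dword(pdev, pos + 16, &hi);
		out->size = ((u64)hi << 32) | lo;
		return 0;
	}
	return -ENODEV;
}

Since the region's address and size come from config space rather than a BAR,
the hypervisor can keep the region statically mapped while the driver still
discovers where it lives.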

>>>> 2) does not have a registration in the device, so no ioeventfd/irqfd or
>>>> configuration registers are allocated in the device.
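
To make the two ring types above concrete, here is a minimal QEMU-side
sketch; the function and variable names (vpnet_setup_rings, exotic_bar) are
illustrative and not from the RFC branches, and the BAR number and queue size
are arbitrary:

#include "qemu/osdep.h"
#include "hw/pci/pci.h"
#include "hw/virtio/virtio.h"

/* Normally part of the device state; file-scope here only for brevity. */
static MemoryRegion exotic_bar;

static void vpnet_setup_rings(VirtIODevice *vdev, PCIDevice *pdev,
                              void *exotic_mem, uint64_t exotic_size,
                              VirtIOHandleOutput handle_tx)
{
    /* Local ring: allocated by the guest driver and registered with the
     * device via virtio_add_queue(), so it gets the usual ioeventfd/irqfd
     * and configuration registers. */
    virtio_add_queue(vdev, 256, handle_tx);

    /* Exotic ring: its memory comes from outside the driver (the peer VM's
     * rings in 2.0, a shared intermediate region in 2.1); the device only
     * exposes it through a BAR, with no virtqueue registration behind it. */
    memory_region_init_ram_ptr(&exotic_bar, OBJECT(pdev),
                               "vhost-pci-exotic", exotic_size, exotic_mem);
    pci_register_bar(pdev, 2,
                     PCI_BASE_ADDRESS_SPACE_MEMORY |
                     PCI_BASE_ADDRESS_MEM_PREFETCH |
                     PCI_BASE_ADDRESS_MEM_TYPE_64,
                     &exotic_bar);
}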
>>>>    Fig. 2 shows how the driver frame is used to build the 2.0 design.
>>>> 1) Asymmetric: vhost-pci-net <-> virtio-net
>>>> 2) VM1 shares the entire memory of VM2, and the exotic rings are the
>>>>    rings from VM2.
>>>> 3) Performance (in terms of copies between VMs):
>>>>       TX: 0-copy (packets are put into VM2’s RX ring directly)
>>>>       RX: 1-copy (the green arrow line in the VM1’s RX engine)
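
As an illustration of why TX can be 0-copy: once VM2's memory is visible
through the BAR, a buffer address taken from the exotic ring is just an
offset into that mapping. A minimal sketch, assuming a single contiguous
peer region starting at guest-physical address 0 (a real layout would need a
region table); the name peer_gpa_to_va is hypothetical:

#include <stdint.h>
#include <stddef.h>

/* Translate a buffer address found in the peer's (exotic) ring into a
 * pointer inside VM1's BAR mapping of the peer's memory. */
static void *peer_gpa_to_va(void *bar_map, uint64_t peer_gpa,
                            uint64_t peer_mem_size)
{
    if (peer_gpa >= peer_mem_size)
        return NULL;                 /* outside the peer's memory */
    return (uint8_t *)bar_map + peer_gpa;
}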
>>>>    Fig. 3 shows how the driver frame is used to build the 2.1 design.
>>>> 1) Symmetric: vhost-pci-net <-> vhost-pci-net
>>> This is interesting!
>>>
>>>> 2) Share an intermediate memory, allocated by VM1’s vhost-pci device,
>>>> for data exchange, and the exotic rings are built on the shared memory
>>>> 3) Performance:
>>>>       TX: 1-copy
>>>>       RX: 1-copy
>>> I'm not yet sure I got this right: there are two different MMIO regions
>>> involved, right? One is used for VM1's RX / VM2's TX, and the other for
>>> the reverse path? That would allow our requirement to have those regions
>>> mapped with asymmetric permissions (RX read-only, TX read/write).
>> The design presented here intends to use only one BAR to expose
>> both TX and RX. Since the two VMs share an intermediate memory
>> here, why couldn't we give the same permission to TX and RX?
>>
> For security and/or safety reasons: the TX side can then safely prepare
> and sign a message in-place because the RX side cannot mess around with
> it while not yet being signed (or check-summed). Saves one copy from a
> secure place into the shared memory.

If we allow guest1 to write to RX, what safety issue would it cause to
guest2?
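
For context, a minimal POSIX sketch of the asymmetric mapping Jan is asking
for, assuming the VMM hands each side a file descriptor for the shared
segment (the function names are illustrative): the TX side maps the region
writable, the RX side read-only, so the sender can checksum or sign a message
in place without the receiver being able to modify it first.

#include <sys/mman.h>
#include <stddef.h>

/* Writer side: may compose and sign/checksum a message in place. */
static void *map_tx_region(int shm_fd, size_t size)
{
    return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, shm_fd, 0);
}

/* Reader side: read-only, so it cannot alter data the sender is still
 * validating. */
static void *map_rx_region(int shm_fd, size_t size)
{
    return mmap(NULL, size, PROT_READ, MAP_SHARED, shm_fd, 0);
}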


>>>>    Fig. 4 shows the inter-VM notification path for 2.0 (2.1 is similar).
>>>> The four eventfds are allocated by virtio-net, and shared with
>>>> vhost-pci-net:
>>>> Uses virtio-net’s TX/RX kickfd as the vhost-pci-net’s RX/TX callfd
>>>> Uses virtio-net’s TX/RX callfd as the vhost-pci-net’s RX/TX kickfd
>>>> Example of how it works:
>>>> After packets are put into vhost-pci-net’s TX, the driver kicks TX,
>>>> which causes an interrupt associated with fd3 to be injected into
>>>> virtio-net.
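
A small sketch of the fd swap described above; the structures and field
names are illustrative, not from the RFC branches:

struct ring_fds {
    int kickfd;   /* driver -> device notification */
    int callfd;   /* device -> driver interrupt */
};

struct endpoint_fds {
    struct ring_fds tx, rx;
};

/* virtio-net's TX/RX kickfds become vhost-pci-net's RX/TX callfds, and
 * virtio-net's TX/RX callfds become vhost-pci-net's RX/TX kickfds. */
static void wire_notifications(const struct endpoint_fds *virtio_net,
                               struct endpoint_fds *vhost_pci_net)
{
    vhost_pci_net->rx.callfd = virtio_net->tx.kickfd;
    vhost_pci_net->tx.callfd = virtio_net->rx.kickfd;
    vhost_pci_net->rx.kickfd = virtio_net->tx.callfd;
    vhost_pci_net->tx.kickfd = virtio_net->rx.callfd;
}

With this wiring, a kick on one side's ring surfaces as an interrupt on the
peer's opposite ring, which is what the fd3 example above describes.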
>>>>    The draft code of the 2.0 design is ready, and can be found here:
>>>> Qemu: https://github.com/wei-w-wang/vhost-pci-device
>>>> Guest driver: https://github.com/wei-w-wang/vhost-pci-driver
>>>>    We tested the 2.0 implementation using the Spirent packet
>>>> generator to transmit 64B packets. The results show that the
>>>> throughput of vhost-pci reaches around 1.8 Mpps, roughly twice
>>>> that of the legacy OVS+DPDK setup. vhost-pci also shows
>>>> better scalability than OVS+DPDK.
>>>>    
>>> Do you have numbers for the symmetric 2.1 case as well? Or is the driver
>>> not ready for that yet? Otherwise, I could try to make it work over
>>> a simplistic vhost-pci 2.1 version in Jailhouse as well. That would give
>>> a better picture of how much additional complexity this would mean
>>> compared to our ivshmem 2.0.
>>>
>> Implementation of 2.1 is not ready yet. We can extend it to 2.1 after
>> the common driver frame is reviewed.
> Can you assess the needed effort?
>
> For us, this is a critical feature, because we need to decide if
> vhost-pci can be an option at all. In fact, the "exotic ring" will be
> the only way to provide secure inter-partition communication on Jailhouse.
>
If what is here for 2.0 is suitable to be upstreamed, I think it will be
easy to extend it to 2.1 (probably within one month).

Best,
Wei

Thread overview: 27+ messages
2017-04-19  6:38 [Qemu-devel] Vhost-pci RFC2.0 Wang, Wei W
2017-04-19  7:31 ` Marc-André Lureau
2017-04-19  8:33   ` Wei Wang
2017-04-19  7:35 ` Jan Kiszka
2017-04-19  8:42   ` Wei Wang
2017-04-19  8:49     ` [Qemu-devel] [virtio-dev] " Jan Kiszka
2017-04-19  9:09       ` Wei Wang [this message]
2017-04-19  9:31         ` Jan Kiszka
2017-04-19 10:02           ` Wei Wang
2017-04-19 10:36             ` Jan Kiszka
2017-04-19 11:11               ` Wei Wang
2017-04-19 11:21                 ` Jan Kiszka
2017-04-19 14:33                   ` Wang, Wei W
2017-04-19 14:52                     ` Jan Kiszka
2017-04-20  6:51                       ` Wei Wang
2017-04-20  7:05                         ` Jan Kiszka
2017-04-20  8:58                           ` Wei Wang
2017-04-19  9:57 ` [Qemu-devel] [virtio-dev] " Stefan Hajnoczi
2017-04-19 10:42   ` Wei Wang
2017-04-19 15:24     ` Stefan Hajnoczi
2017-04-20  5:51       ` Wei Wang
2017-05-02 12:48         ` Stefan Hajnoczi
2017-05-03  6:02           ` Wei Wang
2017-05-05  4:05 ` Jason Wang
2017-05-05  6:18   ` Wei Wang
2017-05-05  9:18     ` Jason Wang
2017-05-08  1:39       ` Wei Wang
