From: Wei Wang <wei.w.wang@intel.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: "Stefan Hajnoczi" <stefanha@redhat.com>,
	"Marc-André Lureau" <marcandre.lureau@gmail.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"virtio-dev@lists.oasis-open.org"
	<virtio-dev@lists.oasis-open.org>
Subject: Re: [Qemu-devel] [virtio-dev] Vhost-pci RFC2.0
Date: Wed, 03 May 2017 14:02:28 +0800
Message-ID: <59097274.6050204@intel.com>
In-Reply-To: <20170502124804.GB22502@stefanha-x1.localdomain>

On 05/02/2017 08:48 PM, Stefan Hajnoczi wrote:
> On Thu, Apr 20, 2017 at 01:51:24PM +0800, Wei Wang wrote:
>> On 04/19/2017 11:24 PM, Stefan Hajnoczi wrote:
>>> On Wed, Apr 19, 2017 at 11:42 AM, Wei Wang <wei.w.wang@intel.com> wrote:
>>>> On 04/19/2017 05:57 PM, Stefan Hajnoczi wrote:
>>>>> On Wed, Apr 19, 2017 at 06:38:11AM +0000, Wang, Wei W wrote:
>>>>>> We made some design changes to the original vhost-pci design, and
>>>>>> want to open a discussion about the latest design (labelled 2.0)
>>>>>> and its extension (2.1).
>>>>>> 2.0 design: One VM shares the entire memory of another VM.
>>>>>> 2.1 design: One VM uses an intermediate memory shared with another
>>>>>>             VM for packet transmission.
>>>>> Hi,
>>>>> Can you talk a bit about the motivation for the 2.x design and major
>>>>> changes compared to 1.x?
>>>> 1.x refers to the design we presented at KVM Forum before. The major
>>>> changes include:
>>>> 1) inter-VM notification support
>>>> 2) TX engine and RX engine, which are structures built in the driver.
>>>> From the device's point of view, the local rings of the engines need
>>>> to be registered.
>>> It would be great to support any virtio device type.
>> Yes, the current design already supports the creation of devices of
>> different types.
>> The support is added to the vhost-user protocol and the vhost-user slave.
>> Once the slave handler receives the request to create the device (with
>> the specified device type), the remaining process (e.g. device realization)
>> is device-specific.
>> This part remains the same as presented before
>> (i.e. Page 12 @ http://www.linux-kvm.org/images/5/55/02x07A-Wei_Wang-Design_of-Vhost-pci.pdf).
>>> The use case I'm thinking of is networking and storage appliances in
>>> cloud environments (e.g. OpenStack).  vhost-user doesn't fit nicely
>>> because users may not be allowed to run host userspace processes.  VMs
>>> are first-class objects in compute clouds.  It would be natural to
>>> deploy networking and storage appliances as VMs using vhost-pci.
>>>
>>> In order to achieve this vhost-pci needs to be a virtio transport and
>>> not a virtio-net-specific PCI device.  It would extend the VIRTIO 1.x
>>> spec alongside virtio-pci, virtio-mmio, and virtio-ccw.
>> Actually, it is designed as a device under the virtio-pci transport. I'm
>> not sure about the value of having a new transport.
>>
>>> When you say TX and RX I'm not sure if the design only supports
>>> virtio-net devices?
>> The current design focuses on the vhost-pci-net device. That's why we
>> have TX/RX here. As mentioned above, when the slave invokes the device
>> creation function, execution goes to the device-specific code.
>>
>> The TX/RX design comes after device creation, so it is specific to
>> vhost-pci-net. A future vhost-pci-blk device could have its own
>> request queue.
> Here is my understanding based on your vhost-pci GitHub repo:
>
> VM1 sees a normal virtio-net-pci device.  VM1 QEMU is invoked with a
> vhost-user netdev.
>
> VM2 sees a hotplugged vhost-pci-net virtio-pci device once VM1
> initializes the device and a message is sent over vhost-user.

Right.

>
> There is no integration with Linux drivers/vhost/ code for VM2.  Instead
> you are writing a 3rd virtio-net driver specifically for vhost-pci.  Not
> sure if it's possible to reuse drivers/vhost/ cleanly but that would be
> nicer than implementing virtio-net again.

vhost-pci-net is a standalone network device with its own unique
device id, and the device itself is different from virtio-net (e.g.
different virtqueues), so I think it would be more reasonable to
let vhost-pci-net have its own driver.
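
As a rough illustration of the "own driver" point (just a sketch of the
usual Linux virtio driver registration pattern, not the actual
vhost-pci-net driver; the ID value and the vpnet_* names are
placeholders), the driver would simply bind to its own device ID rather
than virtio-net's:

#include <linux/module.h>
#include <linux/virtio.h>
#include <linux/virtio_ids.h>

/* Placeholder: the real ID is whatever the spec assigns to vhost-pci-net. */
#define VIRTIO_ID_VHOST_PCI_NET  0x1000

static int vpnet_probe(struct virtio_device *vdev)
{
        /* device-specific setup (virtqueues, netdev registration, ...) */
        return 0;
}

static void vpnet_remove(struct virtio_device *vdev)
{
        /* device-specific teardown */
}

static const struct virtio_device_id vpnet_id_table[] = {
        { VIRTIO_ID_VHOST_PCI_NET, VIRTIO_DEV_ANY_ID },
        { 0 },
};

static struct virtio_driver vpnet_driver = {
        .driver.name = "vhost-pci-net",
        .id_table    = vpnet_id_table,
        .probe       = vpnet_probe,
        .remove      = vpnet_remove,
};
module_virtio_driver(vpnet_driver);
MODULE_LICENSE("GPL");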

There are indeed some functions in vhost-pci-net that look similar
to those in virtio-net (e.g. try_fill_recv()). I haven't thought of a
good way to reuse them yet, because the interfaces are not completely
the same; for example, vpnet_info and virtnet_info, which need to be
passed to the functions, are different.
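
One possible direction (only a rough sketch with made-up names, not
something I have implemented) would be to factor the refill logic around
a small common structure that both virtnet_info and vpnet_info could
embed, so the shared helper no longer depends on either info struct:

#include <linux/scatterlist.h>
#include <linux/slab.h>
#include <linux/virtio.h>

/* Hypothetical common receive context shared by virtio-net and
 * vhost-pci-net; each driver would embed it in its own info struct. */
struct vnet_rx_common {
        struct virtqueue *vq;        /* receive virtqueue to refill */
        unsigned int buf_len;        /* size of each receive buffer */
};

/* Simplified stand-in for try_fill_recv() that only needs the common
 * part: keep adding buffers until the ring is full or allocation fails. */
static bool common_try_fill_recv(struct vnet_rx_common *rx, gfp_t gfp)
{
        bool added = false;

        for (;;) {
                struct scatterlist sg;
                void *buf = kzalloc(rx->buf_len, gfp);

                if (!buf)
                        break;
                sg_init_one(&sg, buf, rx->buf_len);
                if (virtqueue_add_inbuf(rx->vq, &sg, 1, buf, gfp) < 0) {
                        kfree(buf);
                        break;
                }
                added = true;
        }
        if (added)
                virtqueue_kick(rx->vq);
        return added;
}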

>
> Is the VM1 vhost-user netdev a normal vhost-user device or does it know
> about vhost-pci?

Let me share the QEMU boot commands, which may help clarify:
VM1 (vhost-pci-net):
-chardev socket,id=slave1,server,wait=off,path=${PATH_SLAVE1} \
-vhost-pci-slave socket,chardev=slave1

VM2 (virtio-net):
-chardev socket,id=sock2,path=${PATH_SLAVE1} \
-netdev type=vhost-user,id=net2,chardev=sock2,vhostforce \
-device virtio-net-pci,mac=52:54:00:00:00:02,netdev=net2
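
Note that both sides point at the same unix socket path: VM1's chardev is
created with "server", so it listens on ${PATH_SLAVE1} and acts as the
vhost-user slave, while VM2's vhost-user netdev connects to that path as
the master.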

The netdev doesn't know about vhost-pci, but the vhost_dev does, via:
vhost_dev->protocol_features &
    (1ULL << VHOST_USER_PROTOCOL_F_VHOST_PCI)

The vhost-pci-specific messages need to be sent in the vhost-pci case.
For example, at the end of vhost_net_start(), if the master detects that
the slave is vhost-pci, it sends a VHOST_USER_SET_VHOST_PCI_START
message to the slave (VM1).
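
In code, the idea is roughly the following (a self-contained sketch, not
the actual QEMU patch; the feature bit number, the request value and
send_msg() are stand-ins for the draft implementation):

#include <stdbool.h>
#include <stdint.h>

#define VHOST_USER_PROTOCOL_F_VHOST_PCI  20u  /* placeholder bit number */
#define VHOST_USER_SET_VHOST_PCI_START   30u  /* placeholder request id */

struct vhost_dev {
    uint64_t protocol_features;
    int sock_fd;                     /* vhost-user slave socket */
};

/* Stand-in for the real vhost-user message send path (in QEMU this would
 * write a vhost-user message header + payload to the slave socket). */
static int send_msg(struct vhost_dev *dev, uint32_t request)
{
    (void)dev;
    (void)request;
    return 0;
}

static bool slave_is_vhost_pci(const struct vhost_dev *dev)
{
    return dev->protocol_features &
           (1ULL << VHOST_USER_PROTOCOL_F_VHOST_PCI);
}

/* Called once all virtqueues are set up, i.e. at the end of
 * vhost_net_start(): tell a vhost-pci slave (VM1) that it may now start
 * its device; a normal vhost-user slave needs nothing extra. */
static int maybe_start_vhost_pci(struct vhost_dev *dev)
{
    if (!slave_is_vhost_pci(dev))
        return 0;
    return send_msg(dev, VHOST_USER_SET_VHOST_PCI_START);
}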

>
> It's hard to study code changes in your vhost-pci repo because
> everything (QEMU + Linux + your changes) was committed in a single
> commit.  Please keep your changes in separate commits so it's easy to
> find them.
>
Thanks a lot for reading the draft code. I'm working on cleaning it up
and splitting it into patches. I will post the QEMU-side patches soon.


Best,
Wei

