Discussion of the implementations of VIRTIO specification
From: Tony Lu <tonylu@linux.alibaba.com>
To: Dust Li <dust.li@linux.alibaba.com>
Cc: Jason Wang <jasowang@redhat.com>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
	virtio-dev@lists.oasis-open.org, hans@linux.alibaba.com,
	herongguang@linux.alibaba.com, zmlcc@linux.alibaba.com,
	zhenzao@linux.alibaba.com, helinguo@linux.alibaba.com,
	gerry@linux.alibaba.com, mst@redhat.com, cohuck@redhat.com,
	Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [virtio-dev] [PATCH 0/2] introduce virtio-ism: internal shared memory device
Date: Fri, 21 Oct 2022 13:13:03 +0800
Message-ID: <Y1IqX2uVpcD7cvRF@TonyMac-Alibaba>
In-Reply-To: <20221021045422.GC63658@linux.alibaba.com>

On Fri, Oct 21, 2022 at 12:54:22PM +0800, Dust Li wrote:
> On Fri, Oct 21, 2022 at 11:53:10AM +0800, Tony Lu wrote:
> >On Fri, Oct 21, 2022 at 11:09:19AM +0800, Jason Wang wrote:
> >> On Fri, Oct 21, 2022 at 11:05 AM Tony Lu <tonylu@linux.alibaba.com> wrote:
> >> >
> >> > On Fri, Oct 21, 2022 at 10:47:29AM +0800, Jason Wang wrote:
> >> > > On Wed, Oct 19, 2022 at 6:01 PM Tony Lu <tonylu@linux.alibaba.com> wrote:
> >> > > >
> >> > > > On Wed, Oct 19, 2022 at 05:04:58PM +0800, Jason Wang wrote:
> >> > > > >
> >> > > > > On 2022/10/19 16:07, Tony Lu wrote:
> >> > > > > > On Wed, Oct 19, 2022 at 02:02:21PM +0800, Xuan Zhuo wrote:
> >> > > > > > > On Wed, 19 Oct 2022 12:36:35 +0800, Jason Wang <jasowang@redhat.com> wrote:
> >> > > > > > > > On Wed, Oct 19, 2022 at 12:22 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >> > > > > > > > > On Wed, 19 Oct 2022 11:56:52 +0800, Jason Wang <jasowang@redhat.com> wrote:
> >> > > > > > > > > > On Wed, Oct 19, 2022 at 10:42 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >> > > > > > > > > > > On Mon, 17 Oct 2022 16:17:31 +0800, Jason Wang <jasowang@redhat.com> wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > Hi Jason,
> >> > > > > > > > > > >
> >> > > > > > > > > > > I think there may be some problems with the direction we are discussing.
> >> > > > > > > > > > Probably not.
> >> > > > > > > > > >
> >> > > > > > > > > > As long as we are focusing on technology, there's nothing wrong from my
> >> > > > > > > > > > perspective. And this is how the community works. Your idea needs to
> >> > > > > > > > > > be justified and people are free to raise any technical questions,
> >> > > > > > > > > > especially considering you've posted a spec change with prototype
> >> > > > > > > > > > code and not only the idea.
> >> > > > > > > > > >
> >> > > > > > > > > > > Our
> >> > > > > > > > > > > goal is to add a new ism device. As far as the spec is concerned, we are not
> >> > > > > > > > > > > concerned with the implementation of the backend.
> >> > > > > > > > > > >
> >> > > > > > > > > > > The direction we should discuss is what is the difference between the ism device
> >> > > > > > > > > > > and other devices such as virtio-net, and whether it is necessary to introduce
> >> > > > > > > > > > > this new device.
> >> > > > > > > > > > This is somehow what I want to ask, actually it's not a comparison
> >> > > > > > > > > > with virtio-net but:
> >> > > > > > > > > >
> >> > > > > > > > > > - virtio-roce
> >> > > > > > > > > > - virtio-vhost-user
> >> > > > > > > > > > - virtio-(p)mem
> >> > > > > > > > > >
> >> > > > > > > > > > or whether we can simply add features to those devices to achieve what
> >> > > > > > > > > > you want to do here.
> >> > > > > > > > >
> >> > > > > > > > > Yes, this is my priority to discuss.
> >> > > > > > > > >
> >> > > > > > > > > At the moment, I think the most similar to ism is the Vhost-user Device Backend
> >> > > > > > > > > of virtio-vhost-user.
> >> > > > > > > > >
> >> > > > > > > > > My understanding of it is to map any virtio device to another vm as a vvu
> >> > > > > > > > > device.
> >> > > > > > > > Yes, so a possible way is to have a device with memory zone/region
> >> > > > > > > > provision and management then map it via virtio-vhost-user.
> >> > > > > > >
> >> > > > > > > Yes, there is such a possibility. virtio-vhost-user makes me feel that what can
> >> > > > > > > be shared is the function implementation of map.
> >> > > > > > >
> >> > > > > > > But in the vm to provide the interface to the upper layer, I think this is the
> >> > > > > > > work of ism.
> >> > > > > > >
> >> > > > > > > But one of the reasons why I didn't use virtio-vhost-user directly is that in
> >> > > > > > > another vm, the guest can operate the vvu device, whereas we want both sides
> >> > > > > > > to be equal with respect to the ism device.
> >> > > > > > >
> >> > > > > > > So I want to agree on a question first: who will provide the upper layer with
> >> > > > > > > the ability to share the memory area?
> >> > > > > > >
> >> > > > > > > Our answer is a new ism device. How this device achieves memory sharing is,
> >> > > > > > > I think, the second question.
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > > >  From this design purpose, I think the two are different.
> >> > > > > > > > >
> >> > > > > > > > > Of course, you might want to extend it, it does have some similarities and uses
> >> > > > > > > > > a lot of similar techniques.
> >> > > > > > > > I don't have any preference so far. If you think your idea makes more
> >> > > > > > > > sense, then try your best to justify it in the list.
> >> > > > > > > >
> >> > > > > > > > > So we can really discuss in this direction, whether
> >> > > > > > > > > the vvu device can be extended to achieve the purpose of ism, or whether the
> >> > > > > > > > > design goals can be agreed.
> >> > > > > > > > I've added Stefan in the loop, let's hear from him.
> >> > > > > > > >
> >> > > > > > > > > Or, in the direction of memory sharing in the backend, can ism and vvu be merged?
> >> > > > > > > > > Should device/driver APIs remain independent?
> >> > > > > > > > Btw, you mentioned that one possible user of ism is the smc, but I
> >> > > > > > > > don't see how it connects to that with your prototype driver.
> >> > > > > > > Yes, we originally had plans, but the virtio spec was considered for submission,
> >> > > > > > > so this was not included. Maybe, we should have included this part @Tony
> >> > > > > > >
> >> > > > > > > A brief introduction is that SMC currently has a corresponding
> >> > > > > > > s390/net/ism_drv.c and we will replace this in the virtualization scenario.
> >> > > > >
> >> > > > >
> >> > > > > Ok, I see. So I think the goal is to implement something in virtio that is
> >> > > > > functionally equivalent to the IBM ISM device.
> >> > > > >
> >> > > >
> >> > > > Yes, IBM ISM devices do something similar and it inspired this.
> >> > >
> >> > > Ok, it would be better to mention this in the cover letter of the next
> >> > > version. This can make things easier for reviewers (IBM has some good
> >> > > docs on those on its website).
> >> > >
> >> >
> >> > Yes, we will do it.
> >> 
> >> Btw, I wonder about the plan to support live migration. E.g., do we need
> >> to hot unplug the ism device before the migration so that we can fall back
> >> to TCP/IP?
> >> 
> >
> >From the point of view of SMC, SMC-R maintains multiple links (RDMA QPs), so it
> >can live-migrate existing connections to a new link.
> >
> >Currently, yes, for SMC-D.
> 
> I think Jason means VM live migration from one host to another. Am I
> right, Jason?
> 
> In that case, the shared memory from the ISM device is no longer valid.
> I think we have to hot unplug before the migration to notify SMC that
> the SMC-D link is no longer usable.

Yes, this is what I mean ;-) SMC-D needs to unplug the device.

> IIUC, SMC-D doesn't support transparent fallback to TCP/IP in this case
> now. But I think we could make that happen, since SMC already supports link
> migration between different RDMA devices.

Yes, currently SMC-D supports neither migration to another device nor
fallback. SMC-R supports migration to another link, but no fallback.
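
To make the current behaviour concrete, here is a minimal sketch in Python.
It is only a toy model of what is described in this thread, not SMC code:
the names (Transport, choose_transport, on_device_lost) are hypothetical,
and preferring ISM over RDMA when both are usable is an assumption.

```python
from enum import Enum


class Transport(Enum):
    SMC_D = "SMC-D (ISM)"
    SMC_R = "SMC-R (RDMA)"
    TCP = "TCP"


def choose_transport(same_host: bool, ism_available: bool,
                     rdma_available: bool) -> Transport:
    """Pick a transport during the handshake, falling back to TCP."""
    # Assumption: ISM is only usable between peers on the same host.
    if same_host and ism_available:
        return Transport.SMC_D
    if rdma_available:
        return Transport.SMC_R
    return Transport.TCP


def on_device_lost(transport: Transport, spare_links: int) -> str:
    """What an established connection can do when its device goes away."""
    if transport is Transport.TCP:
        return "unaffected"
    if transport is Transport.SMC_R and spare_links > 0:
        return "migrate to another link"
    # SMC-D, or SMC-R without a spare link: no fallback to TCP today.
    return "connection lost"
```

In this model, hot unplugging the ism device before VM live migration maps
to on_device_lost(Transport.SMC_D, 0), which is the case that would need
new fallback support.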

Cheers,
Tony Lu

> Thanks
> 
> >
> >Cheers,
> >Tony Lu
> >
> >
> >> Thanks
> >> 
> >> >
> >> > > >
> >> > > > >
> >> > > > > > >
> >> > > > > > > Thanks.
> >> > > > > > >
> >> > > > > > SMC is a network protocol which is modeled on shared memory rather than
> >> > > > > > packets.
> >> > > > >
> >> > > > >
> >> > > > > After reading more about SMC on the IBM website, I think you meant SMC-D here.
> >> > > > > And I wonder whether, in order to have a complete SMC solution, we still need
> >> > > > > virtio-ROCE for inter-host communication?
> >> > > > >
> >> > > >
> >> > > > Mostly yes.
> >> > > >
> >> > > > SMC-D is one part of the whole SMC solution. SMC supports multiple
> >> > > > underlying devices: -D means an ISM device, -R means an RDMA device. The key
> >> > > > data model is shared memory: SMC uses RDMA (-R) or ISM (-D) to *share*
> >> > > > memory between peers, and it chooses a suitable device on demand
> >> > > > during the handshake. If there is no suitable device, it falls back
> >> > > > to TCP. So virtio-ROCE is not required.
> >> > >
> >> > > So for communicating peers on the same host we need SMC-D; in the future
> >> > > we need to use RDMA to offload the communication among peers on
> >> > > different hosts. Then we get fully transparent offload no matter
> >> > > whether the peer is local or not.
> >> > >
> >> >
> >> > Yes, this is what we want to do.
> >> >
> >> > > >
> >> > > > >
> >> > > > > >   Actually the basic required interfaces of an SMC device are:
> >> > > > > >
> >> > > > > >    - alloc / free memory region: each connection peer dynamically gets
> >> > > > > >     two memory regions, used as sending and receiving ring buffers.
> >> > > > > >    - attach / detach memory region: the remote peer attaches the locally
> >> > > > > >     allocated sending region as its receiving region, and vice versa.
> >> > > > > >    - notify: tell the peer to read data and update the cursor.
> >> > > > > >
> >> > > > > > Then the device can be registered as an SMC ISM device. Of course, SMC
> >> > > > > > also requires some modification to adapt to it.
> >> > > > >
> >> > > > >
> >> > > > > Looking at the s390 ism driver, it requires other things like vlan add/remove
> >> > > > > or gid query; do we need them as well?
> >> > > >
> >> > > > vlan is not required in this use case. ISM uses the gid to identify peers
> >> > > > to each other; maybe we could implement it in a virtio way.
> >> > >
> >> > > I'd suggest adding the code to register the driver to SMC/ISM in the
> >> > > next version (instead of a simple procfs hook). Then people can
> >> > > easily play with it or review it.
> >> > >
> >> >
> >> > Ok, I will add the code in the next version.
> >> >
> >> > Cheers,
> >> > Tony Lu
> >> >
> >> > > Thanks
> >> > >
> >> > > >
> >> > > > To support virtio-ism smoothly, the interfaces of ISM driver still need
> >> > > > to be adjusted. I will put it on the table with IBM people.
> >> > > >
> >> > > > Cheers,
> >> > > > Tony Lu
> >> > > >
> >> > > > >
> >> > > > > Thanks
> >> > > > >
> >> > > > >
> >> > > > > >
> >> > > > > > Cheers,
> >> > > > > > Tony Lu
> >> > > > > >
> >> > > > > > > > Thanks
> >> > > > > > > >
> >> > > > > > > > > Thanks.
> >> > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > > > How to share the backend with other devices is another problem.
> >> > > > > > > > > > Yes, anything that is used for your virtio-ism prototype can be used
> >> > > > > > > > > > for other devices.
> >> > > > > > > > > >
> >> > > > > > > > > > > Our goal is to dynamically obtain a piece of memory to share with other vms.
> >> > > > > > > > > > So at this level, I don't see the exact difference compared to
> >> > > > > > > > > > virtio-vhost-user. Let's just focus on the API that carries the
> >> > > > > > > > > > semantics:
> >> > > > > > > > > >
> >> > > > > > > > > > - map/unmap
> >> > > > > > > > > > - permission update
> >> > > > > > > > > >
> >> > > > > > > > > > The only missing piece is the per region notification.
> >> > > > > > > > > >
> >> > > > > > > > > > > In a connection, this memory will be used repeatedly. As far as SMC is concerned,
> >> > > > > > > > > > > it will use it as a ring. Of course, we also need a notify mechanism.
> >> > > > > > > > > > >
> >> > > > > > > > > > > That's what we're aiming for, so we should first discuss whether this
> >> > > > > > > > > > > requirement is reasonable.
> >> > > > > > > > > > So unless somebody said "no", it is fine until now.
> >> > > > > > > > > >
> >> > > > > > > > > > > I think it's a feature currently not supported by
> >> > > > > > > > > > > other devices specified by the current virtio spec.
> >> > > > > > > > > > Probably, but we've already had rfcs for roce and vhost-user.
> >> > > > > > > > > >
> >> > > > > > > > > > Thanks
> >> > > > > > > > > >
> >> > > > > > > > > > > Thanks.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > >
> >> >
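
The basic SMC device interfaces listed in the quoted discussion (alloc /
free, attach / detach, notify) can be sketched as a toy in-process model.
This is only an illustration under stated assumptions: ToyIsmDevice, Region,
the integer token and the callback-based notify are all hypothetical names
and choices, not the virtio-ism spec or the s390 ism driver API.

```python
from typing import Callable, Dict, Optional


class Region:
    """A shared memory region, written by its owner and read by one attacher."""

    def __init__(self, token: int, size: int) -> None:
        self.token = token
        self.buf = bytearray(size)
        # Callback of the attached peer; receives the writer's new cursor.
        self.on_notify: Optional[Callable[[int], None]] = None


class ToyIsmDevice:
    """Stands in for the backend that shares regions between two peers."""

    def __init__(self) -> None:
        self._regions: Dict[int, Region] = {}
        self._next_token = 1

    def alloc(self, size: int) -> Region:
        region = Region(self._next_token, size)
        self._regions[region.token] = region
        self._next_token += 1
        return region

    def free(self, token: int) -> None:
        self._regions.pop(token, None)

    def attach(self, token: int, on_notify: Callable[[int], None]) -> Region:
        region = self._regions[token]  # raises KeyError for an unknown token
        region.on_notify = on_notify
        return region

    def detach(self, token: int) -> None:
        self._regions[token].on_notify = None

    def notify(self, token: int, cursor: int) -> None:
        region = self._regions[token]
        if region.on_notify is not None:
            region.on_notify(cursor)


dev = ToyIsmDevice()
tx = dev.alloc(16)                       # peer A allocates its sending region
cursors = []
rx = dev.attach(tx.token, cursors.append)  # peer B attaches it as receiving region
tx.buf[0:5] = b"hello"                   # peer A writes into the shared ring
dev.notify(tx.token, 5)                  # peer A tells peer B to read up to 5
received = bytes(rx.buf[:cursors[-1]])
dev.detach(tx.token)
dev.free(tx.token)
```

A full connection would use two such regions, one per direction, so that
each peer's sending region is the other peer's receiving region.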

