Date: Fri, 14 Dec 2018 08:31:58 -0500
From: "Michael S. Tsirkin"
Message-ID: <20181214082455-mutt-send-email-mst@kernel.org>
References: <5794e090-9a9b-ca30-1066-ef697c9b67be@redhat.com> <7520e2cd-59cc-c133-f913-e7397df684dd@redhat.com> <20181213095516-mutt-send-email-mst@kernel.org>
Subject: Re: [Qemu-devel] [PATCH for-4.0 0/6] vhost-user-blk: Add support for backend reconnecting
To: Jason Wang
Cc: Yongji Xie , nixun@baidu.com, zhangyu31@baidu.com, lilin24@baidu.com, qemu-devel@nongnu.org, chaiwen@baidu.com, marcandre.lureau@redhat.com, Xie Yongji , maxime.coquelin@redhat.com

On Fri, Dec 14, 2018 at 12:36:01PM +0800, Jason Wang wrote:
>
> On 2018/12/13 10:56 PM, Michael S. Tsirkin wrote:
> > On Thu, Dec 13, 2018 at 11:41:06AM +0800, Yongji Xie wrote:
> > > On Thu, 13 Dec 2018 at 10:58, Jason Wang wrote:
> > > >
> > > > On 2018/12/12 5:18 PM, Yongji Xie wrote:
> > > > > > > > Ok, then we can simply forbid increasing the avail_idx in this case?
> > > > > > > >
> > > > > > > > Basically, it's a question of whether or not it's better to do it at
> > > > > > > > the level of virtio instead of vhost. I'm pretty sure that if we expose
> > > > > > > > sufficient information, it could be done without touching vhost-user.
> > > > > > > > And we won't have to deal with e.g. migration and other cases.
> > > > > > > >
> > > > > > > OK, I get your point. That's indeed an alternative way. But this feature
> > > > > > > seems to be useful only to vhost-user backends.
> > > > > > I admit I could not think of a use case other than vhost-user.
> > > > > >
> > > > > >
> > > > > > > I'm not sure whether it makes sense to
> > > > > > > touch the virtio protocol for this feature.
> > > > > > Some possible advantages:
> > > > > >
> > > > > > - The feature could be determined and noticed by the user or management layer.
> > > > > >
> > > > > > - There's no need to invent a ring-layout-specific protocol to record
> > > > > > in-flight descriptors. E.g. if my understanding is correct, for this series
> > > > > > and for the example above, it still cannot work for the packed virtqueue,
> > > > > > since the descriptor id is not sufficient (a descriptor could be overwritten
> > > > > > by a used one). You probably need to keep a (partial) copy of the descriptor
> > > > > > ring for this.
> > > > > >
> > > > > > - No need to deal with migration; all the information is in guest memory.
> > > > > >
> > > > > Yes, we have those advantages. But it seems like handling this at the
> > > > > vhost-user level could be easier to maintain in a production environment.
> > > > > We can support old guests, and bug fixes will not depend on guest kernel
> > > > > updates.
> > > >
> > > > Yes. But my main concern is the layout-specific data structure. If
> > > > it could be done through a generic structure (can it?), it would be
> > > > fine.
Otherwise, I believe we don't want another negotiation about what
> > > > kind of layout the backend supports for reconnect.
> > > >
> > > Yes, the current layout in shared memory doesn't support the packed
> > > virtqueue, because the information of a descriptor in the descriptor
> > > ring is no longer available once the device fetches it.
> > >
> > > I also thought about a generic structure before, but I failed... So I
> > > tried another way to achieve that in this series. On the QEMU side, we
> > > just provide a shared memory region to the backend, and we don't define
> > > anything for this memory. On the backend side, it should know how to
> > > use that memory to record in-flight I/O no matter what kind of
> > > virtqueue it uses. Thus, if the virtqueue is updated for a new virtio
> > > spec in the future, we don't need to touch QEMU or the guest. What do
> > > you think about it?
> > >
> > > Thanks,
> > > Yongji
> > I think that's a good direction to take, yes.
> > Backends need to be very careful about the layout,
> > with versioning etc.
>
>
> I'm not sure this could be done 100% transparently to qemu. E.g. you need
> to deal with reset, I think, and you need to carefully choose the size of
> the region. Which means you need to negotiate the size and layout with the
> backend.

I am not sure I follow. The point is that all this state is internal
to the backend. QEMU does not care at all - it just
helps a little by hanging on to it.

> And you
> need to deal with migration for them.

Good catch. There definitely is an issue in that you cannot migrate
while the backend is disconnected: migration needs to flush the backend,
and we can't do that when it's disconnected. This needs to be addressed.
I think it's cleanest to just defer migration until the backend
reconnects.

Backend cross-version migration is all messed up in vhost-user, I agree.
There was a plan to fix it that was never executed, unfortunately.

Maxime, do you still plan to look into it?
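As an aside, to make the opaque-region idea concrete: below is a minimal sketch of what a backend-defined layout for a split virtqueue might look like. Everything here is hypothetical illustration, not the layout used by this series or anything defined by the vhost-user protocol - the names `InflightRegion`, `inflight_set`, `inflight_clear`, `inflight_pending`, and the version field are all invented for the example. The point of the proposal is precisely that QEMU never interprets this memory, so the backend alone picks (and must version) the structure.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical backend-private layout for the shared in-flight region.
 * QEMU only preserves the bytes across a backend crash/reconnect; it
 * never reads them, so the backend must version the layout itself. */
#define INFLIGHT_VERSION 1
#define QUEUE_SIZE 256

typedef struct InflightRegion {
    uint16_t version;              /* layout version, checked on reconnect */
    uint16_t desc_num;             /* queue size this layout was built for */
    uint8_t  inflight[QUEUE_SIZE]; /* 1 = descriptor submitted, not yet used */
} InflightRegion;

/* Mark descriptor `id` before starting the I/O; clear it only after the
 * used element has been pushed. On crash, set entries identify requests
 * that must be resubmitted. */
static void inflight_set(InflightRegion *r, uint16_t id)   { r->inflight[id] = 1; }
static void inflight_clear(InflightRegion *r, uint16_t id) { r->inflight[id] = 0; }

/* After reconnect: collect ids of descriptors that never completed. */
static int inflight_pending(const InflightRegion *r, uint16_t *out)
{
    int n = 0;
    for (uint16_t i = 0; i < r->desc_num; i++) {
        if (r->inflight[i]) {
            out[n++] = i;
        }
    }
    return n;
}
```

Note this per-id bitmap is exactly the scheme that breaks for the packed ring, as pointed out above: there the descriptor content can be overwritten before completion, so a backend would have to extend its private layout with a (partial) shadow copy of the descriptors - which it can do without touching QEMU, since the region is opaque.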
> This is another sin of splitting
> the virtio dataplane from qemu anyway.
>
>
> Thanks

It wasn't split as such - dpdk was never a part of qemu.
We just enabled it without fuse hacks.

-- 
MST