qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Yongji Xie <elohimes@gmail.com>
Cc: "Stefan Hajnoczi" <stefanha@gmail.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"Jason Wang" <jasowang@redhat.com>,
	"Coquelin, Maxime" <maxime.coquelin@redhat.com>,
	"Yury Kotov" <yury-kotov@yandex-team.ru>,
	"Евгений Яковлев" <wrfsh@yandex-team.ru>,
	qemu-devel <qemu-devel@nongnu.org>,
	zhangyu31@baidu.com, chaiwen@baidu.com, nixun@baidu.com,
	lilin24@baidu.com, "Xie Yongji" <xieyongji@baidu.com>
Subject: Re: [Qemu-devel] [PATCH v5 1/6] vhost-user: Support transferring inflight buffer between qemu and backend
Date: Tue, 29 Jan 2019 23:07:48 -0500	[thread overview]
Message-ID: <20190129230733-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CAONzpcZWmmVYzzAbS4HYTpAaNn6Xo+nG9XoRut9N_ro6NfRzLg@mail.gmail.com>

On Wed, Jan 30, 2019 at 11:49:56AM +0800, Yongji Xie wrote:
> On Wed, 30 Jan 2019 at 10:30, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jan 30, 2019 at 10:07:28AM +0800, Yongji Xie wrote:
> > > On Tue, 29 Jan 2019 at 22:15, Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Tue, Jan 29, 2019 at 02:15:35PM +0800, Yongji Xie wrote:
> > > > > On Tue, 29 Jan 2019 at 12:26, Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Tue, Jan 29, 2019 at 12:11:55PM +0800, Stefan Hajnoczi wrote:
> > > > > > > On Tue, Jan 22, 2019 at 04:31:47PM +0800, elohimes@gmail.com wrote:
> > > > > > > > +typedef struct DescState {
> > > > > > > > +    uint8_t inuse;
> > > > > > > > +    uint8_t version;
> > > > > > > > +    uint16_t used_idx;
> > > > > > > > +    uint16_t avail_idx;
> > > > > > > > +    uint16_t reserved;
> > > > > > > > +} DescState;
> > > > > > > > +
> > > > > > > > +typedef struct QueueRegion {
> > > > > > > > +    uint8_t valid;
> > > > > >
> > > > > > what's this?
> > > > > >
> > > > >
> > > > > We can use this to check whether this buffer is reset by qemu.
> > > >
> > > >
> > > > I'd use version here. Document that it's 1 currently.
> > >
> > > If we put version into the shared buffer, QEMU will reset it when vm
> > > reset. Then if backend restart at the same time, the version of this
> > > buffer will be lost. So I still let QEMU maintain it and send it back
> > > through message's payload.
> >
> > I don't get it. If it's reset there's no contents.
> > If there's no contents then who cares what the version is?
> >
> 
> What I thought before is that we should not update the buffer when vm
> reset. But now seems like it's unnecessary. I will update this patch
> as you said. Thank you!
> 
> > > > Or do we want feature flags so we can support downgrades too?
> > > >
> > >
> > > We faced the same problem that the feature flags will be lost if QEMU
> > > do not maintain it. So maybe we should put this into message's
> > > payload?
> >
> > I don't understand why we care. We only maintain inflight so we
> > don't need to reset the device. if device is reset we don't
> > need to maintain them at all.
> >
> > > >
> > > > > > > > +    uint16_t desc_num;
> > > > > >
> > > > > > there's padding before this field. Pls make it explicit.
> > > > > >
> > > > >
> > > > > Will do it.
> > > > >
> > > > > > > > +    DescState desc[0];
> > > > > > > > +} QueueRegion;
> > > > > > > > +
> > > > > > > > +The struct DescState is used to describe one head-descriptor's state. The
> > > > > > > > +fields have following meanings:
> > > > > > > > +
> > > > > > > > +    inuse: Indicate whether the descriptor is inuse or not.
> > > > > >
> > > > > > inuse by what?
> > > > > >
> > > > >
> > > > > Maybe inflight is better?
> > > > >
> > > > > > > > +
> > > > > > > > +    version: Indicate whether we have an atomic update to used ring and
> > > > > > > > +    inflight buffer when slave crash at that point. This field should be
> > > > > > > > +    increased by one before and after this two updates. An odd version
> > > > > > > > +    indicates an in-progress update.
> > > > > >
> > > > > > I'm not sure I understand what does the above say. Also does this
> > > > > > require two atomics? Seems pretty expensive. And why is it called
> > > > > > version?
> > > > > >
> > > > > > > > +
> > > > > > > > +    used_idx: Store old index of used ring before we update used ring and
> > > > > > > > +    inflight buffer so that slave can know whether an odd version inflight
> > > > > > > > +    head-descriptor in inflight buffer is processed or not.
> > > > > >
> > > > > > Here too.
> > > > > >
> > > > >
> > > > > Sorry, the above description may be not clear. This two fields are
> > > > > used to indicate whether we have an in-progress update to used ring
> > > > > and inflight buffer. If slave crash before the update to used_ring and
> > > > > after the update to inflight buffer, the version should be odd and
> > > > > used_idx should be equal to used_ring.idx. Then we need to roll back
> > > > > the update to inflight buffer. As for the name of the version filed,
> > > > > actually I didn't find a good one, so I just copy it from struct
> > > > > kvm_steal_time...
> > > > >
> > > > > > > > +
> > > > > > > > +    avail_idx: Used to preserve the descriptor's order in avail ring so that
> > > > > > > > +    slave can resubmit descriptors in order.
> > > > > >
> > > > > > Why would that be necessary?
> > > > > >
> > > > >
> > > > > Maybe some devices will be able to use it to preserve order after
> > > > > reconnecting in future?
> > > >
> > > > If buffers are used in order then old entries in the ring
> > > > are not overwritten so inflight tracking is not
> > > > necessary. This is exactly the case with vhost user net today.
> > > >
> > >
> > > OK, looks reasonable to me.
> > >
> > > > > > >
> > > > > > > Will a completely new "packed vring" inflight shm layout be necessary to
> > > > > > > support the packed vring layout in VIRTIO 1.1?
> > > > > > >
> > > > > > > https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html#x1-610007
> > > > > >
> > > > > > Probably.
> > > > > >
> > > > >
> > > > > How about supporting packed virtqueue in guest driver?
> > > > >
> > > > > Thanks,
> > > > > Yongji
> > > >
> > > > Depends on the guest right? Linux has it:
> > > >
> > >
> > > Sorry, actually I mean I prefer to support inflight tracking for
> > > packed virtqueue in guest driver. This feature is only used by legacy
> > > virtio 1.0 or virtio 0.9 device. What do you think about it?
> > >
> > > Thanks,
> > > Yongji
> >
> > I don't see what does one have to do with the other.  Either we do or we
> > don't want to do it without downtime and retries. If we don't mind
> > resets then let's do it by making guest changes.
> >
> 
> OK, I see. But seems like now we don't support packed virtqueue in
> qemu. Is it better to do that in a splited patchset? Add packed
> virtqueue implenment in contrib/libvhost-user.c, then document the
> inflight I/O tracking behavior.
> 
> Thanks,
> Yongji

it's just documentation so it does not have to depend
on actual code.

  reply	other threads:[~2019-01-30  4:08 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-22  8:31 [Qemu-devel] [PATCH v5 0/6] vhost-user-blk: Add support for backend reconnecting elohimes
2019-01-22  8:31 ` [Qemu-devel] [PATCH v5 1/6] vhost-user: Support transferring inflight buffer between qemu and backend elohimes
2019-01-29  4:11   ` Stefan Hajnoczi
2019-01-29  4:26     ` Michael S. Tsirkin
2019-01-29  6:15       ` Yongji Xie
2019-01-29 14:15         ` Michael S. Tsirkin
2019-01-30  2:07           ` Yongji Xie
2019-01-30  2:30             ` Michael S. Tsirkin
2019-01-30  3:49               ` Yongji Xie
2019-01-30  4:07                 ` Michael S. Tsirkin [this message]
2019-01-30  4:11                   ` Yongji Xie
2019-01-22  8:31 ` [Qemu-devel] [PATCH v5 2/6] libvhost-user: Introduce vu_queue_map_desc() elohimes
2019-01-22  8:31 ` [Qemu-devel] [PATCH v5 3/6] libvhost-user: Support tracking inflight I/O in shared memory elohimes
2019-01-30  2:31   ` Jason Wang
2019-01-30  3:14     ` Michael S. Tsirkin
2019-01-30  9:52       ` Jason Wang
2019-01-30  3:58     ` Yongji Xie
2019-02-01  2:26       ` Jason Wang
2019-01-30  5:48     ` Yongji Xie
2019-02-01  2:27       ` Jason Wang
2019-02-05  1:37         ` Yongji Xie
2019-01-22  8:31 ` [Qemu-devel] [PATCH v5 4/6] vhost-user-blk: Add support to get/set inflight buffer elohimes
2019-01-22  8:31 ` [Qemu-devel] [PATCH v5 5/6] vhost-user-blk: Add support to reconnect backend elohimes
2019-01-22  8:31 ` [Qemu-devel] [PATCH v5 6/6] contrib/vhost-user-blk: enable inflight I/O tracking elohimes
2019-01-30  2:29 ` [Qemu-devel] [PATCH v5 0/6] vhost-user-blk: Add support for backend reconnecting Jason Wang
2019-01-30  3:40   ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190129230733-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=berrange@redhat.com \
    --cc=chaiwen@baidu.com \
    --cc=elohimes@gmail.com \
    --cc=jasowang@redhat.com \
    --cc=lilin24@baidu.com \
    --cc=marcandre.lureau@redhat.com \
    --cc=maxime.coquelin@redhat.com \
    --cc=nixun@baidu.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@gmail.com \
    --cc=wrfsh@yandex-team.ru \
    --cc=xieyongji@baidu.com \
    --cc=yury-kotov@yandex-team.ru \
    --cc=zhangyu31@baidu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).