From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Michael S. Tsirkin"
Subject: Re: [PATCH 06/19] virtio: update last_avail_idx when inuse is decreased.
Date: Fri, 24 Dec 2010 15:23:11 +0200
Message-ID: <20101224132311.GE24424@redhat.com>
References: <1293160708-30881-1-git-send-email-tamura.yoshiaki@lab.ntt.co.jp>
 <1293160708-30881-7-git-send-email-tamura.yoshiaki@lab.ntt.co.jp>
 <20101224094416.GB23271@redhat.com>
 <20101224124035.GC24424@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org, avi@redhat.com,
 anthony@codemonkey.ws, aliguori@us.ibm.com, mtosatti@redhat.com,
 dlaor@redhat.com, kwolf@redhat.com, ananth@in.ibm.com,
 psuriset@linux.vnet.ibm.com, vatsa@linux.vnet.ibm.com,
 stefanha@linux.vnet.ibm.com, ohmura.kei@lab.ntt.co.jp
To: Yoshiaki Tamura
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:32157 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1752033Ab0LXNXz (ORCPT ); Fri, 24 Dec 2010 08:23:55 -0500
Content-Disposition: inline
In-Reply-To:
Sender: kvm-owner@vger.kernel.org
List-ID:

On Fri, Dec 24, 2010 at 10:14:50PM +0900, Yoshiaki Tamura wrote:
> 2010/12/24 Michael S. Tsirkin :
> > On Fri, Dec 24, 2010 at 08:22:00PM +0900, Yoshiaki Tamura wrote:
> >> 2010/12/24 Michael S. Tsirkin :
> >> > On Fri, Dec 24, 2010 at 12:18:15PM +0900, Yoshiaki Tamura wrote:
> >> >> virtio save/load is currently sending last_avail_idx, but inuse isn't.
> >> >> This causes inconsistent state when using Kemari which replays
> >> >> outstanding requests on the secondary.  By letting last_avail_idx to
> >> >> be updated after inuse is decreased, it would be possible to replay
> >> >> the outstanding requests.  Note that live migration shouldn't be
> >> >> affected because it waits until flushing all requests.  Also in
> >> >> conjunction with event-tap, requests inversion should be avoided.
> >> >>
> >> >> Signed-off-by: Yoshiaki Tamura
> >> >
> >> > I think I understood the request inversion. My question now is,
> >> > event-tap transfers inuse events as well, won't the same
> >> > request be repeated twice?
> >> >
> >> >> ---
> >> >>  hw/virtio.c |    8 +++++++-
> >> >>  1 files changed, 7 insertions(+), 1 deletions(-)
> >> >>
> >> >> diff --git a/hw/virtio.c b/hw/virtio.c
> >> >> index 07dbf86..f915c46 100644
> >> >> --- a/hw/virtio.c
> >> >> +++ b/hw/virtio.c
> >> >> @@ -72,7 +72,7 @@ struct VirtQueue
> >> >>      VRing vring;
> >> >>      target_phys_addr_t pa;
> >> >>      uint16_t last_avail_idx;
> >> >> -    int inuse;
> >> >> +    uint16_t inuse;
> >> >>      uint16_t vector;
> >> >>      void (*handle_output)(VirtIODevice *vdev, VirtQueue *vq);
> >> >>      VirtIODevice *vdev;
> >> >> @@ -671,6 +671,7 @@ void virtio_save(VirtIODevice *vdev, QEMUFile *f)
> >> >>          qemu_put_be32(f, vdev->vq[i].vring.num);
> >> >>          qemu_put_be64(f, vdev->vq[i].pa);
> >> >>          qemu_put_be16s(f, &vdev->vq[i].last_avail_idx);
> >> >> +        qemu_put_be16s(f, &vdev->vq[i].inuse);
> >> >>          if (vdev->binding->save_queue)
> >> >>              vdev->binding->save_queue(vdev->binding_opaque, i, f);
> >> >>      }
> >> >> @@ -710,6 +711,11 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
> >> >>          vdev->vq[i].vring.num = qemu_get_be32(f);
> >> >>          vdev->vq[i].pa = qemu_get_be64(f);
> >> >>          qemu_get_be16s(f, &vdev->vq[i].last_avail_idx);
> >> >> +        qemu_get_be16s(f, &vdev->vq[i].inuse);
> >> >> +
> >> >> +        /* revert last_avail_idx if there are outstanding emulation. */
> >> >
> >> > if there are outstanding emulation -> if requests
> >> > are outstanding in event-tap?
> >> >
> >> >> +        vdev->vq[i].last_avail_idx -= vdev->vq[i].inuse;
> >> >> +        vdev->vq[i].inuse = 0;
> >> >>
> >> >
> >> > I don't understand it, if this is all we do we can equivalently
> >> > decrement on the sender side and avoid breaking migration compatibility?
> >>
> >> It seems I sent the old patch...  I'm really sorry.  Currently
> >> I'm taking the approach to update last_avail_idx later.
> >> Decreasing looks scary to me if the guest already knows about it.
> >
> > It seems exactly the same functionally.
>
> If it is the same I'm fine to go with the decreasing approach.
> Is it fine for the guest? Is last_avail_idx irrelevant to the
> guest's behavior?
>
> Yoshi

At least at the moment, yes.

> >> commit 8ac6ba51cc558b3bfcac7a5814d92f275ee874e9
> >> Author: Yoshiaki Tamura
> >> Date:   Mon May 17 10:36:14 2010 +0900
> >>
> >>     virtio: update last_avail_idx when inuse is decreased.
> >>
> >>     virtio save/load is currently sending last_avail_idx, but inuse isn't.
> >>     This causes inconsistent state when using Kemari which replays
> >>     outstanding requests on the secondary.  By letting last_avail_idx to
> >>     be updated after inuse is decreased, it would be possible to replay
> >>     the outstanding requests.  Note that live migration shouldn't be
> >>     affected because it waits until flushing all requests.  Also in
> >>     conjunction with event-tap, requests inversion should be avoided.
> >>
> >>     Signed-off-by: Yoshiaki Tamura
> >>
> >> diff --git a/hw/virtio.c b/hw/virtio.c
> >> index 07dbf86..b1586da 100644
> >> --- a/hw/virtio.c
> >> +++ b/hw/virtio.c
> >> @@ -198,7 +198,7 @@ int virtio_queue_ready(VirtQueue *vq)
> >>
> >>  int virtio_queue_empty(VirtQueue *vq)
> >>  {
> >> -    return vring_avail_idx(vq) == vq->last_avail_idx;
> >> +    return vring_avail_idx(vq) == vq->last_avail_idx + vq->inuse;
> >>  }
> >>
> >>  void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
> >> @@ -238,6 +238,7 @@ void virtqueue_flush(VirtQueue *vq, unsigned int count)
> >>      wmb();
> >>      trace_virtqueue_flush(vq, count);
> >>      vring_used_idx_increment(vq, count);
> >> +    vq->last_avail_idx += count;
> >>      vq->inuse -= count;
> >>  }
> >>
> >> @@ -306,7 +307,7 @@ int virtqueue_avail_bytes(VirtQueue *vq, int in_bytes, int o
> >>      unsigned int idx;
> >>      int total_bufs, in_total, out_total;
> >>
> >> -    idx = vq->last_avail_idx;
> >> +    idx = vq->last_avail_idx + vq->inuse;
> >>
> >>      total_bufs = in_total = out_total = 0;
> >>      while (virtqueue_num_heads(vq, idx)) {
> >> @@ -386,7 +387,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
> >>      unsigned int i, head, max;
> >>      target_phys_addr_t desc_pa = vq->vring.desc;
> >>
> >> -    if (!virtqueue_num_heads(vq, vq->last_avail_idx))
> >> +    if (!virtqueue_num_heads(vq, vq->last_avail_idx + vq->inuse))
> >>          return 0;
> >>
> >>      /* When we start there are none of either input nor output. */
> >> @@ -394,7 +395,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
> >>
> >>      max = vq->vring.num;
> >>
> >> -    i = head = virtqueue_get_head(vq, vq->last_avail_idx++);
> >> +    i = head = virtqueue_get_head(vq, vq->last_avail_idx + vq->inuse);
> >>
> >>      if (vring_desc_flags(desc_pa, i) & VRING_DESC_F_INDIRECT) {
> >>          if (vring_desc_len(desc_pa, i) % sizeof(VRingDesc)) {
> >> @@ -626,7 +627,7 @@ void virtio_notify(VirtIODevice *vdev, VirtQueue *vq)
> >>      /* Always notify when queue is empty (when feature acknowledge) */
> >>      if ((vring_avail_flags(vq) & VRING_AVAIL_F_NO_INTERRUPT) &&
> >>          (!(vdev->guest_features & (1 << VIRTIO_F_NOTIFY_ON_EMPTY)) ||
> >> -         (vq->inuse || vring_avail_idx(vq) != vq->last_avail_idx)))
> >> +         (vq->inuse || vring_avail_idx(vq) != vq->last_avail_idx + vq->inuse)))
> >>          return;
> >>
> >>      trace_virtio_notify(vdev, vq);
> >>
> >>
> >> >
> >> >>          if (vdev->vq[i].pa) {
> >> >>              uint16_t nheads;
> >> >> --
> >> >> 1.7.1.2
> >> > --
> >> > To unsubscribe from this list: send the line "unsubscribe kvm" in
> >> > the body of a message to majordomo@vger.kernel.org
> >> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> >
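
For readers following the "exactly the same functionally" point above, here is a minimal sketch of the index arithmetic. It is not QEMU code: struct toy_vq and every toy_* function are hypothetical names made up for illustration, and the sketch assumes requests complete in the order they were popped. Scheme A mirrors the first patch (advance last_avail_idx on pop, send inuse, rewind on load); scheme B mirrors the second patch (advance last_avail_idx only on flush).

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the two VirtQueue fields discussed in the thread.
 * Hypothetical; not QEMU's struct VirtQueue. */
struct toy_vq {
    uint16_t last_avail_idx;
    uint16_t inuse;
};

/* Scheme A: the index advances on pop; the receiver rewinds by inuse. */
static inline void toy_a_pop(struct toy_vq *vq)
{
    vq->last_avail_idx++;
    vq->inuse++;
}

static inline void toy_a_flush(struct toy_vq *vq, uint16_t count)
{
    vq->inuse -= count;
}

static inline uint16_t toy_a_loaded_idx(const struct toy_vq *vq)
{
    /* Same arithmetic as "last_avail_idx -= inuse" in virtio_load. */
    return (uint16_t)(vq->last_avail_idx - vq->inuse);
}

/* Scheme B: the index advances only on flush; the next head to pop
 * is last_avail_idx + inuse. */
static inline void toy_b_pop(struct toy_vq *vq)
{
    vq->inuse++;
}

static inline void toy_b_flush(struct toy_vq *vq, uint16_t count)
{
    vq->last_avail_idx += count;
    vq->inuse -= count;
}

static inline uint16_t toy_b_next_head(const struct toy_vq *vq)
{
    return (uint16_t)(vq->last_avail_idx + vq->inuse);
}

/* Pops `pops` requests, flushes `flushed` of them under both schemes,
 * and returns 1 if the secondary would resume from the same index and
 * the primary would pop the same next head in both schemes. */
static inline int toy_schemes_agree(uint16_t start, uint16_t pops,
                                    uint16_t flushed)
{
    struct toy_vq a = { start, 0 };
    struct toy_vq b = { start, 0 };
    uint16_t i;

    assert(flushed <= pops);

    for (i = 0; i < pops; i++) {
        toy_a_pop(&a);
        toy_b_pop(&b);
    }
    toy_a_flush(&a, flushed);
    toy_b_flush(&b, flushed);

    return toy_a_loaded_idx(&a) == b.last_avail_idx &&
           a.last_avail_idx == toy_b_next_head(&b) &&
           a.inuse == b.inuse;
}
```

Calling toy_schemes_agree(65534, 5, 2) starts just below the 16-bit wrap point, which is why every sum and difference is cast back to uint16_t. The sketch says nothing about what the guest observes or about out-of-order completion; those remain the open questions in the thread.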