From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=45317 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1PW6xK-00040J-G4
	for qemu-devel@nongnu.org; Fri, 24 Dec 2010 07:41:11 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1PW6xJ-0007mC-2I
	for qemu-devel@nongnu.org; Fri, 24 Dec 2010 07:41:10 -0500
Received: from mx1.redhat.com ([209.132.183.28]:52660)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from ) id 1PW6xI-0007m3-MX
	for qemu-devel@nongnu.org; Fri, 24 Dec 2010 07:41:09 -0500
Date: Fri, 24 Dec 2010 14:40:35 +0200
From: "Michael S. Tsirkin"
Message-ID: <20101224124035.GC24424@redhat.com>
References: <1293160708-30881-1-git-send-email-tamura.yoshiaki@lab.ntt.co.jp>
	<1293160708-30881-7-git-send-email-tamura.yoshiaki@lab.ntt.co.jp>
	<20101224094416.GB23271@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Subject: [Qemu-devel] Re: [PATCH 06/19] virtio: update last_avail_idx when inuse is decreased.
List-Id: qemu-devel.nongnu.org
To: Yoshiaki Tamura
Cc: kwolf@redhat.com, aliguori@us.ibm.com, dlaor@redhat.com,
	ananth@in.ibm.com, kvm@vger.kernel.org, ohmura.kei@lab.ntt.co.jp,
	mtosatti@redhat.com, qemu-devel@nongnu.org, vatsa@linux.vnet.ibm.com,
	avi@redhat.com, psuriset@linux.vnet.ibm.com,
	stefanha@linux.vnet.ibm.com

On Fri, Dec 24, 2010 at 08:22:00PM +0900, Yoshiaki Tamura wrote:
> 2010/12/24 Michael S. Tsirkin :
> > On Fri, Dec 24, 2010 at 12:18:15PM +0900, Yoshiaki Tamura wrote:
> >> virtio save/load is currently sending last_avail_idx, but inuse isn't.
> >> This causes inconsistent state when using Kemari which replays
> >> outstanding requests on the secondary.  By letting last_avail_idx to
> >> be updated after inuse is decreased, it would be possible to replay
> >> the outstanding requests.  Note that live migration shouldn't be
> >> affected because it waits until flushing all requests.  Also in
> >> conjunction with event-tap, requests inversion should be avoided.
> >>
> >> Signed-off-by: Yoshiaki Tamura
> >
> > I think I understood the request inversion. My question now is,
> > event-tap transfers inuse events as well, won't the same
> > request be repeated twice?
> >
> >> ---
> >>  hw/virtio.c |    8 +++++++-
> >>  1 files changed, 7 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/hw/virtio.c b/hw/virtio.c
> >> index 07dbf86..f915c46 100644
> >> --- a/hw/virtio.c
> >> +++ b/hw/virtio.c
> >> @@ -72,7 +72,7 @@ struct VirtQueue
> >>      VRing vring;
> >>      target_phys_addr_t pa;
> >>      uint16_t last_avail_idx;
> >> -    int inuse;
> >> +    uint16_t inuse;
> >>      uint16_t vector;
> >>      void (*handle_output)(VirtIODevice *vdev, VirtQueue *vq);
> >>      VirtIODevice *vdev;
> >> @@ -671,6 +671,7 @@ void virtio_save(VirtIODevice *vdev, QEMUFile *f)
> >>          qemu_put_be32(f, vdev->vq[i].vring.num);
> >>          qemu_put_be64(f, vdev->vq[i].pa);
> >>          qemu_put_be16s(f, &vdev->vq[i].last_avail_idx);
> >> +        qemu_put_be16s(f, &vdev->vq[i].inuse);
> >>          if (vdev->binding->save_queue)
> >>              vdev->binding->save_queue(vdev->binding_opaque, i, f);
> >>      }
> >> @@ -710,6 +711,11 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
> >>          vdev->vq[i].vring.num = qemu_get_be32(f);
> >>          vdev->vq[i].pa = qemu_get_be64(f);
> >>          qemu_get_be16s(f, &vdev->vq[i].last_avail_idx);
> >> +        qemu_get_be16s(f, &vdev->vq[i].inuse);
> >> +
> >> +        /* revert last_avail_idx if there are outstanding emulation. */
> >
> > if there are outstanding emulation -> if requests
> > are outstanding in event-tap?
> >
> >> +        vdev->vq[i].last_avail_idx -= vdev->vq[i].inuse;
> >> +        vdev->vq[i].inuse = 0;
> >>
> >
> > I don't understand it, if this is all we do we can equivalently
> > decrement on the sender side and avoid breaking migration compatibility?
>
> It seems I sent the old patch...  I'm really sorry.  Currently
> I'm taking the approach to update last_avail_idx later.
> Decreasing looks scary to me if the guest already knows about it.

It seems exactly the same functionally.

> commit 8ac6ba51cc558b3bfcac7a5814d92f275ee874e9
> Author: Yoshiaki Tamura
> Date:   Mon May 17 10:36:14 2010 +0900
>
>     virtio: update last_avail_idx when inuse is decreased.
>
>     virtio save/load is currently sending last_avail_idx, but inuse isn't.
>     This causes inconsistent state when using Kemari which replays
>     outstanding requests on the secondary.  By letting last_avail_idx to
>     be updated after inuse is decreased, it would be possible to replay
>     the outstanding requests.  Note that live migration shouldn't be
>     affected because it waits until flushing all requests.  Also in
>     conjunction with event-tap, requests inversion should be avoided.
>
>     Signed-off-by: Yoshiaki Tamura
>
> diff --git a/hw/virtio.c b/hw/virtio.c
> index 07dbf86..b1586da 100644
> --- a/hw/virtio.c
> +++ b/hw/virtio.c
> @@ -198,7 +198,7 @@ int virtio_queue_ready(VirtQueue *vq)
>
>  int virtio_queue_empty(VirtQueue *vq)
>  {
> -    return vring_avail_idx(vq) == vq->last_avail_idx;
> +    return vring_avail_idx(vq) == vq->last_avail_idx + vq->inuse;
>  }
>
>  void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
> @@ -238,6 +238,7 @@ void virtqueue_flush(VirtQueue *vq, unsigned int count)
>      wmb();
>      trace_virtqueue_flush(vq, count);
>      vring_used_idx_increment(vq, count);
> +    vq->last_avail_idx += count;
>      vq->inuse -= count;
>  }
>
> @@ -306,7 +307,7 @@ int virtqueue_avail_bytes(VirtQueue *vq, int in_bytes, int o
>      unsigned int idx;
>      int total_bufs, in_total, out_total;
>
> -    idx = vq->last_avail_idx;
> +    idx = vq->last_avail_idx + vq->inuse;
>
>      total_bufs = in_total = out_total = 0;
>      while (virtqueue_num_heads(vq, idx)) {
> @@ -386,7 +387,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
>      unsigned int i, head, max;
>      target_phys_addr_t desc_pa = vq->vring.desc;
>
> -    if (!virtqueue_num_heads(vq, vq->last_avail_idx))
> +    if (!virtqueue_num_heads(vq, vq->last_avail_idx + vq->inuse))
>          return 0;
>
>      /* When we start there are none of either input nor output. */
> @@ -394,7 +395,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
>
>      max = vq->vring.num;
>
> -    i = head = virtqueue_get_head(vq, vq->last_avail_idx++);
> +    i = head = virtqueue_get_head(vq, vq->last_avail_idx + vq->inuse);
>
>      if (vring_desc_flags(desc_pa, i) & VRING_DESC_F_INDIRECT) {
>          if (vring_desc_len(desc_pa, i) % sizeof(VRingDesc)) {
> @@ -626,7 +627,7 @@ void virtio_notify(VirtIODevice *vdev, VirtQueue *vq)
>      /* Always notify when queue is empty (when feature acknowledge) */
>      if ((vring_avail_flags(vq) & VRING_AVAIL_F_NO_INTERRUPT) &&
>          (!(vdev->guest_features & (1 << VIRTIO_F_NOTIFY_ON_EMPTY)) ||
> -         (vq->inuse || vring_avail_idx(vq) != vq->last_avail_idx)))
> +         (vq->inuse || vring_avail_idx(vq) != vq->last_avail_idx + vq->inuse)))
>          return;
>
>      trace_virtio_notify(vdev, vq);
>
>
> >
> >>          if (vdev->vq[i].pa) {
> >>              uint16_t nheads;
> >> --
> >> 1.7.1.2
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
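
To make the "exactly the same functionally" remark concrete, here is a toy model of the two index schemes. It is only a sketch and not QEMU code: struct toy_vq and every function name below are made up for illustration, only the two counters are modeled (no vring, no guest memory), and requests are assumed to complete in the order they were popped. Under those assumptions, the value the first patch computes on load (last_avail_idx - inuse) should match the last_avail_idx that the second patch maintains directly, including across uint16_t wraparound.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two VirtQueue fields under discussion. */
struct toy_vq {
    uint16_t last_avail_idx;
    uint16_t inuse;          /* popped but not yet flushed */
};

/* First patch / existing code: pop advances last_avail_idx right away. */
static void old_pop(struct toy_vq *vq)
{
    vq->last_avail_idx++;
    vq->inuse++;
}

static void old_flush(struct toy_vq *vq, uint16_t count)
{
    vq->inuse -= count;
}

/* What the first patch's virtio_load computes from the saved pair. */
static uint16_t old_restored_idx(const struct toy_vq *vq)
{
    return (uint16_t)(vq->last_avail_idx - vq->inuse);
}

/* Second patch: pop reads slot last_avail_idx + inuse and leaves
 * last_avail_idx alone; flush is what advances it. */
static uint16_t new_pop(struct toy_vq *vq)
{
    uint16_t slot = (uint16_t)(vq->last_avail_idx + vq->inuse);
    vq->inuse++;
    return slot;
}

static void new_flush(struct toy_vq *vq, uint16_t count)
{
    vq->last_avail_idx += count;
    vq->inuse -= count;
}

/* Run the same pop/flush sequence through both schemes and compare the
 * index a secondary would resume from. Returns 1 when they agree. */
static int schemes_agree(uint16_t start, uint16_t pops, uint16_t flushed)
{
    struct toy_vq old_vq = { start, 0 };
    struct toy_vq new_vq = { start, 0 };
    uint16_t i;

    assert(flushed <= pops); /* cannot complete more than was popped */

    for (i = 0; i < pops; i++) {
        old_pop(&old_vq);
        (void)new_pop(&new_vq);
    }
    old_flush(&old_vq, flushed);
    new_flush(&new_vq, flushed);

    return old_restored_idx(&old_vq) == new_vq.last_avail_idx
        && old_vq.inuse == new_vq.inuse;
}
```

Calling schemes_agree(65534, 3, 1) from a small main() exercises the wraparound case: three requests popped starting at index 65534, one completed, two still outstanding. The model says nothing about the migration stream format, which is where the two patches do differ: the first one adds a field to the saved state, the second does not.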