From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 24 Dec 2010 15:23:11 +0200
From: "Michael S. Tsirkin"
Message-ID: <20101224132311.GE24424@redhat.com>
References: <1293160708-30881-1-git-send-email-tamura.yoshiaki@lab.ntt.co.jp> <1293160708-30881-7-git-send-email-tamura.yoshiaki@lab.ntt.co.jp> <20101224094416.GB23271@redhat.com> <20101224124035.GC24424@redhat.com>
Subject: [Qemu-devel] Re: [PATCH 06/19] virtio: update last_avail_idx when inuse is decreased.
List-Id: qemu-devel.nongnu.org
To: Yoshiaki Tamura
Cc: kwolf@redhat.com, aliguori@us.ibm.com, dlaor@redhat.com, ananth@in.ibm.com, kvm@vger.kernel.org, ohmura.kei@lab.ntt.co.jp, mtosatti@redhat.com, qemu-devel@nongnu.org, vatsa@linux.vnet.ibm.com, avi@redhat.com, psuriset@linux.vnet.ibm.com, stefanha@linux.vnet.ibm.com

On Fri, Dec 24, 2010 at 10:14:50PM +0900, Yoshiaki Tamura wrote:
> 2010/12/24 Michael S. Tsirkin :
> > On Fri, Dec 24, 2010 at 08:22:00PM +0900, Yoshiaki Tamura wrote:
> >> 2010/12/24 Michael S. Tsirkin :
> >> > On Fri, Dec 24, 2010 at 12:18:15PM +0900, Yoshiaki Tamura wrote:
> >> >> virtio save/load is currently sending last_avail_idx, but inuse isn't.
> >> >> This causes inconsistent state when using Kemari, which replays
> >> >> outstanding requests on the secondary.  By letting last_avail_idx
> >> >> be updated after inuse is decreased, it becomes possible to replay
> >> >> the outstanding requests.  Note that live migration shouldn't be
> >> >> affected because it waits until all requests are flushed.  Also, in
> >> >> conjunction with event-tap, request inversion should be avoided.
> >> >>
> >> >> Signed-off-by: Yoshiaki Tamura
> >> >
> >> > I think I understood the request inversion.  My question now is:
> >> > event-tap transfers inuse events as well, so won't the same
> >> > request be repeated twice?
> >> >
> >> >> ---
> >> >>  hw/virtio.c |    8 +++++++-
> >> >>  1 files changed, 7 insertions(+), 1 deletions(-)
> >> >>
> >> >> diff --git a/hw/virtio.c b/hw/virtio.c
> >> >> index 07dbf86..f915c46 100644
> >> >> --- a/hw/virtio.c
> >> >> +++ b/hw/virtio.c
> >> >> @@ -72,7 +72,7 @@ struct VirtQueue
> >> >>      VRing vring;
> >> >>      target_phys_addr_t pa;
> >> >>      uint16_t last_avail_idx;
> >> >> -    int inuse;
> >> >> +    uint16_t inuse;
> >> >>      uint16_t vector;
> >> >>      void (*handle_output)(VirtIODevice *vdev, VirtQueue *vq);
> >> >>      VirtIODevice *vdev;
> >> >> @@ -671,6 +671,7 @@ void virtio_save(VirtIODevice *vdev, QEMUFile *f)
> >> >>          qemu_put_be32(f, vdev->vq[i].vring.num);
> >> >>          qemu_put_be64(f, vdev->vq[i].pa);
> >> >>          qemu_put_be16s(f, &vdev->vq[i].last_avail_idx);
> >> >> +        qemu_put_be16s(f, &vdev->vq[i].inuse);
> >> >>          if (vdev->binding->save_queue)
> >> >>              vdev->binding->save_queue(vdev->binding_opaque, i, f);
> >> >>      }
> >> >> @@ -710,6 +711,11 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
> >> >>          vdev->vq[i].vring.num = qemu_get_be32(f);
> >> >>          vdev->vq[i].pa = qemu_get_be64(f);
> >> >>          qemu_get_be16s(f, &vdev->vq[i].last_avail_idx);
> >> >> +        qemu_get_be16s(f, &vdev->vq[i].inuse);
> >> >> +
> >> >> +        /* revert last_avail_idx if there are outstanding emulation. */
> >> >
> >> > "if there are outstanding emulation" -> "if requests
> >> > are outstanding in event-tap"?
> >> >
> >> >> +        vdev->vq[i].last_avail_idx -= vdev->vq[i].inuse;
> >> >> +        vdev->vq[i].inuse = 0;
> >> >>
> >> >
> >> > I don't understand: if this is all we do, can't we equivalently
> >> > decrement on the sender side and avoid breaking migration compatibility?
> >>
> >> It seems I sent the old patch...  I'm really sorry.  Currently
> >> I'm taking the approach of updating last_avail_idx later.
> >> Decreasing looks scary to me if the guest already knows about it.
> >
> > It seems exactly the same functionally.
>
> If it is the same, I'm fine going with the decreasing approach.
> Is it fine for the guest?  Is last_avail_idx irrelevant to the
> guest's behavior?
>
> Yoshi

At least at the moment, yes.

> >> commit 8ac6ba51cc558b3bfcac7a5814d92f275ee874e9
> >> Author: Yoshiaki Tamura
> >> Date:   Mon May 17 10:36:14 2010 +0900
> >>
> >>     virtio: update last_avail_idx when inuse is decreased.
> >>
> >>     virtio save/load is currently sending last_avail_idx, but inuse isn't.
> >>     This causes inconsistent state when using Kemari, which replays
> >>     outstanding requests on the secondary.  By letting last_avail_idx
> >>     be updated after inuse is decreased, it becomes possible to replay
> >>     the outstanding requests.  Note that live migration shouldn't be
> >>     affected because it waits until all requests are flushed.  Also, in
> >>     conjunction with event-tap, request inversion should be avoided.
> >>
> >>     Signed-off-by: Yoshiaki Tamura
> >>
> >> diff --git a/hw/virtio.c b/hw/virtio.c
> >> index 07dbf86..b1586da 100644
> >> --- a/hw/virtio.c
> >> +++ b/hw/virtio.c
> >> @@ -198,7 +198,7 @@ int virtio_queue_ready(VirtQueue *vq)
> >>
> >>  int virtio_queue_empty(VirtQueue *vq)
> >>  {
> >> -    return vring_avail_idx(vq) == vq->last_avail_idx;
> >> +    return vring_avail_idx(vq) == vq->last_avail_idx + vq->inuse;
> >>  }
> >>
> >>  void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
> >> @@ -238,6 +238,7 @@ void virtqueue_flush(VirtQueue *vq, unsigned int count)
> >>      wmb();
> >>      trace_virtqueue_flush(vq, count);
> >>      vring_used_idx_increment(vq, count);
> >> +    vq->last_avail_idx += count;
> >>      vq->inuse -= count;
> >>  }
> >>
> >> @@ -306,7 +307,7 @@ int virtqueue_avail_bytes(VirtQueue *vq, int in_bytes, int o
> >>      unsigned int idx;
> >>      int total_bufs, in_total, out_total;
> >>
> >> -    idx = vq->last_avail_idx;
> >> +    idx = vq->last_avail_idx + vq->inuse;
> >>
> >>      total_bufs = in_total = out_total = 0;
> >>      while (virtqueue_num_heads(vq, idx)) {
> >> @@ -386,7 +387,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
> >>      unsigned int i, head, max;
> >>      target_phys_addr_t desc_pa = vq->vring.desc;
> >>
> >> -    if (!virtqueue_num_heads(vq, vq->last_avail_idx))
> >> +    if (!virtqueue_num_heads(vq, vq->last_avail_idx + vq->inuse))
> >>          return 0;
> >>
> >>      /* When we start there are none of either input nor output. */
> >> @@ -394,7 +395,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
> >>
> >>      max = vq->vring.num;
> >>
> >> -    i = head = virtqueue_get_head(vq, vq->last_avail_idx++);
> >> +    i = head = virtqueue_get_head(vq, vq->last_avail_idx + vq->inuse);
> >>
> >>      if (vring_desc_flags(desc_pa, i) & VRING_DESC_F_INDIRECT) {
> >>          if (vring_desc_len(desc_pa, i) % sizeof(VRingDesc)) {
> >> @@ -626,7 +627,7 @@ void virtio_notify(VirtIODevice *vdev, VirtQueue *vq)
> >>      /* Always notify when queue is empty (when feature acknowledge) */
> >>      if ((vring_avail_flags(vq) & VRING_AVAIL_F_NO_INTERRUPT) &&
> >>          (!(vdev->guest_features & (1 << VIRTIO_F_NOTIFY_ON_EMPTY)) ||
> >> -         (vq->inuse || vring_avail_idx(vq) != vq->last_avail_idx)))
> >> +         (vq->inuse || vring_avail_idx(vq) != vq->last_avail_idx + vq->inuse)))
> >>          return;
> >>
> >>      trace_virtio_notify(vdev, vq);
> >>
> >> >
> >> >>          if (vdev->vq[i].pa) {
> >> >>              uint16_t nheads;
> >> >> --
> >> >> 1.7.1.2
> >> > --
> >> > To unsubscribe from this list: send the line "unsubscribe kvm" in
> >> > the body of a message to majordomo@vger.kernel.org
> >> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> >