From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39102) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1c9XPk-0002B0-3s for qemu-devel@nongnu.org; Wed, 23 Nov 2016 08:16:41 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1c9XPd-00047B-Mk for qemu-devel@nongnu.org; Wed, 23 Nov 2016 08:16:40 -0500 Received: from mx1.redhat.com ([209.132.183.28]:37120) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1c9XPd-00045x-Du for qemu-devel@nongnu.org; Wed, 23 Nov 2016 08:16:33 -0500 Date: Wed, 23 Nov 2016 15:16:31 +0200 From: "Michael S. Tsirkin" Message-ID: <20161123151550-mutt-send-email-mst@kernel.org> References: <1478704922-3400-1-git-send-email-yuri.benditovich@daynix.com> <1478704922-3400-4-git-send-email-yuri.benditovich@daynix.com> <20161109221300-mutt-send-email-mst@kernel.org> <20161110154953-mutt-send-email-mst@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH v2 3/3] net: virtio-net discards TX data after link down List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Yuri Benditovich Cc: Jason Wang , qemu-devel@nongnu.org, Dmitry Fleytman , Yan Vugenfirer On Wed, Nov 23, 2016 at 11:52:25AM +0200, Yuri Benditovich wrote: >=20 >=20 > On Thu, Nov 10, 2016 at 3:54 PM, Michael S. Tsirkin wr= ote: >=20 > On Thu, Nov 10, 2016 at 01:56:05AM +0200, Yuri Benditovich wrote: > > > > > > On Wed, Nov 9, 2016 at 10:28 PM, Michael S. Tsirkin > wrote: > > > >=A0 =A0 =A0On Wed, Nov 09, 2016 at 05:22:02PM +0200, yuri.benditov= ich@daynix.com > >=A0 =A0 =A0wrote: > >=A0 =A0 =A0> From: Yuri Benditovich > >=A0 =A0 =A0> > >=A0 =A0 =A0> https://bugzilla.redhat.com/show_bug.cgi?id=3D1295637 > >=A0 =A0 =A0> Upon set_link monitor command or upon netdev deletion > >=A0 =A0 =A0> virtio-net sends link down indication to the guest > >=A0 =A0 =A0> and stops vhost if one is used. > >=A0 =A0 =A0> Guest driver can still submit data for TX until it > >=A0 =A0 =A0> recognizes link loss. If these packets not returned b= y > >=A0 =A0 =A0> the host, the Windows guest will never be able to fin= ish > >=A0 =A0 =A0> disable/removal/shutdown. > >=A0 =A0 =A0> Now each packet sent by guest after NIC indicated lin= k > >=A0 =A0 =A0> down will be completed immediately. > >=A0 =A0 =A0> > >=A0 =A0 =A0> Signed-off-by: Yuri Benditovich > >=A0 =A0 =A0> --- > >=A0 =A0 =A0>=A0 hw/net/virtio-net.c | 28 +++++++++++++++++++++++++= +++ > >=A0 =A0 =A0>=A0 1 file changed, 28 insertions(+) > >=A0 =A0 =A0> > >=A0 =A0 =A0> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.= c > >=A0 =A0 =A0> index 06bfe4b..ab4e18a 100644 > >=A0 =A0 =A0> --- a/hw/net/virtio-net.c > >=A0 =A0 =A0> +++ b/hw/net/virtio-net.c > >=A0 =A0 =A0> @@ -218,6 +218,16 @@ static void virtio_net_vnet_endi= an_status( > VirtIONet > >=A0 =A0 =A0*n, uint8_t status) > >=A0 =A0 =A0>=A0 =A0 =A0 } > >=A0 =A0 =A0>=A0 } > >=A0 =A0 =A0> > >=A0 =A0 =A0> +static void virtio_net_drop_tx_queue_data(VirtIODevi= ce *vdev, > VirtQueue > >=A0 =A0 =A0*vq) > >=A0 =A0 =A0> +{ > >=A0 =A0 =A0> +=A0 =A0 VirtQueueElement *elem; > >=A0 =A0 =A0> +=A0 =A0 while ((elem =3D virtqueue_pop(vq, sizeof(Vi= rtQueueElement)))) { > >=A0 =A0 =A0> +=A0 =A0 =A0 =A0 virtqueue_push(vq, elem, 0); > >=A0 =A0 =A0> +=A0 =A0 =A0 =A0 virtio_notify(vdev, vq); > >=A0 =A0 =A0> +=A0 =A0 =A0 =A0 g_free(elem); > >=A0 =A0 =A0> +=A0 =A0 } > >=A0 =A0 =A0> +} > >=A0 =A0 =A0> + > >=A0 =A0 =A0>=A0 static void virtio_net_set_status(struct VirtIODev= ice *vdev, > uint8_t > >=A0 =A0 =A0status) > >=A0 =A0 =A0>=A0 { > >=A0 =A0 =A0>=A0 =A0 =A0 VirtIONet *n =3D VIRTIO_NET(vdev); > > > >=A0 =A0 =A0I don't like this part. This does too much queue parsin= g, > >=A0 =A0 =A0I would like to just copy head from avail to used ring. > > > >=A0 =A0 =A0For example, people want to support rings >1K in size. > >=A0 =A0 =A0Let's add bool virtqueue_drop(vq) and be done with it. > > > > > > Please note that this code works only when link is down. > > For me this was too complicated to write simpler procedure > > with the same result. >=20 > Yes - it's somewhat problematic and risky that we process > the ring in qemu, but I don't see an easy way around that. > But at least let's limit the processing and assumptions we > make. > =20 >=20 >=20 > So, what is the status and how do we make the progress. > What kind of change in the patch you suggest? >=20 > Thanks, > Yuri Add an API that copies entries from avail to used ring without looking at the desc buffer. >=20 > =20 > > > > > >=A0 =A0 =A0> @@ -262,6 +272,14 @@ static void virtio_net_set_statu= s(struct > >=A0 =A0 =A0VirtIODevice *vdev, uint8_t status) > >=A0 =A0 =A0>=A0 =A0 =A0 =A0 =A0 =A0 =A0 } else { > >=A0 =A0 =A0>=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 qemu_bh_cancel(q->= tx_bh); > >=A0 =A0 =A0>=A0 =A0 =A0 =A0 =A0 =A0 =A0 } > >=A0 =A0 =A0> +=A0 =A0 =A0 =A0 =A0 =A0 if ((n->status & VIRTIO_NET_= S_LINK_UP) =3D=3D 0 && > >=A0 =A0 =A0> +=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 (queue_status & VIRT= IO_CONFIG_S_DRIVER_OK)) { > >=A0 =A0 =A0> +=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 /* if tx is waiting = we are likely have some > packets in >=20 > ... we likely have some ... > =20 > >=A0 =A0 =A0tx queue > >=A0 =A0 =A0> +=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0* and disabled no= tification */ >=20 > what does this refer to? > =20 > >=A0 =A0 =A0> +=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 q->tx_waiting =3D 0; > >=A0 =A0 =A0> +=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 virtio_queue_set_not= ification(q->tx_vq, 1); > >=A0 =A0 =A0> +=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 virtio_net_drop_tx_q= ueue_data(vdev, q->tx_vq); > >=A0 =A0 =A0> +=A0 =A0 =A0 =A0 =A0 =A0 } > >=A0 =A0 =A0>=A0 =A0 =A0 =A0 =A0 } > >=A0 =A0 =A0>=A0 =A0 =A0 } > >=A0 =A0 =A0>=A0 } > > > >=A0 =A0 =A0OK but what if guest keeps sending packets? What will d= rop them? > > > > > > This code fixes following problem in original code (example): > > We are in vhost=3Doff and receive kick ->virtio_net_handle_tx_tim= er > > -> tx_waiting=3D1, notification disabled, timer set > > Now we receive link loss, cancel the timer and stay with packets = in the > queue > > and with > > disabled notification. Nobody will return them. (easy to reproduc= e with > timer > > set to 5ms) > > > > Added code drops packets we already have and ensure we will repor= t them > > as completed to guest. If guest keeps sending packets, they will = be > dropped=A0 > > in=A0virtio_net_handle_tx_timer and=A0in =A0virtio_net_handle_tx_= bh (in > procedures > > just below) > > as we already with link down.=A0 >=20 > Yes I get that. I'm just not 100% sure all paths have > us listen on the ioeventfd and handle kicks without races - > this was previously assumed not to matter. >=20 >=20 > > > > > >=A0 =A0 =A0> @@ -1319,6 +1337,11 @@ static void virtio_net_handle_= tx_timer( > >=A0 =A0 =A0VirtIODevice *vdev, VirtQueue *vq) > >=A0 =A0 =A0>=A0 =A0 =A0 VirtIONet *n =3D VIRTIO_NET(vdev); > >=A0 =A0 =A0>=A0 =A0 =A0 VirtIONetQueue *q =3D &n->vqs[vq2q(virtio_= get_queue_index(vq))]; > >=A0 =A0 =A0> > >=A0 =A0 =A0> +=A0 =A0 if (unlikely((n->status & VIRTIO_NET_S_LINK_= UP) =3D=3D 0)) { > >=A0 =A0 =A0> +=A0 =A0 =A0 =A0 virtio_net_drop_tx_queue_data(vdev, = vq); > >=A0 =A0 =A0> +=A0 =A0 =A0 =A0 return; > >=A0 =A0 =A0> +=A0 =A0 } > >=A0 =A0 =A0> + > >=A0 =A0 =A0>=A0 =A0 =A0 /* This happens when device was stopped bu= t VCPU wasn't. */ > >=A0 =A0 =A0>=A0 =A0 =A0 if (!vdev->vm_running) { > >=A0 =A0 =A0>=A0 =A0 =A0 =A0 =A0 q->tx_waiting =3D 1; > >=A0 =A0 =A0> @@ -1345,6 +1368,11 @@ static void virtio_net_handle_= tx_bh( > VirtIODevice > >=A0 =A0 =A0*vdev, VirtQueue *vq) > >=A0 =A0 =A0>=A0 =A0 =A0 VirtIONet *n =3D VIRTIO_NET(vdev); > >=A0 =A0 =A0>=A0 =A0 =A0 VirtIONetQueue *q =3D &n->vqs[vq2q(virtio_= get_queue_index(vq))]; > >=A0 =A0 =A0> > >=A0 =A0 =A0> +=A0 =A0 if (unlikely((n->status & VIRTIO_NET_S_LINK_= UP) =3D=3D 0)) { > >=A0 =A0 =A0> +=A0 =A0 =A0 =A0 virtio_net_drop_tx_queue_data(vdev, = vq); > >=A0 =A0 =A0> +=A0 =A0 =A0 =A0 return; > >=A0 =A0 =A0> +=A0 =A0 } > >=A0 =A0 =A0> + > >=A0 =A0 =A0>=A0 =A0 =A0 if (unlikely(q->tx_waiting)) { > >=A0 =A0 =A0>=A0 =A0 =A0 =A0 =A0 return; > >=A0 =A0 =A0>=A0 =A0 =A0 } > >=A0 =A0 =A0> -- > >=A0 =A0 =A0> 1.9.1 > > > > >=20 >=20