From mboxrd@z Thu Jan 1 00:00:00 1970 From: Sridhar Samudrala Subject: Re: [PATCH net-next-2.6] vhost: Restart tx poll when socket send queue is full Date: Fri, 19 Feb 2010 13:19:45 -0800 Message-ID: <1266614385.17881.7.camel@w-sridhar.beaverton.ibm.com> References: <1266526751.15681.71.camel@w-sridhar.beaverton.ibm.com> <20100218223021.GB14847@redhat.com> <1266544807.15681.259.camel@w-sridhar.beaverton.ibm.com> <20100219144203.GD14847@redhat.com> Mime-Version: 1.0 Content-Type: text/plain Content-Transfer-Encoding: 7bit Cc: David Miller , netdev To: "Michael S. Tsirkin" Return-path: Received: from e36.co.us.ibm.com ([32.97.110.154]:47038 "EHLO e36.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754719Ab0BSVT6 (ORCPT ); Fri, 19 Feb 2010 16:19:58 -0500 Received: from d03relay01.boulder.ibm.com (d03relay01.boulder.ibm.com [9.17.195.226]) by e36.co.us.ibm.com (8.14.3/8.13.1) with ESMTP id o1JLH4W2017024 for ; Fri, 19 Feb 2010 14:17:04 -0700 Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by d03relay01.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id o1JLJk75177542 for ; Fri, 19 Feb 2010 14:19:51 -0700 Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1]) by d03av02.boulder.ibm.com (8.14.3/8.13.1/NCO v10.0 AVout) with ESMTP id o1JLJj5w016462 for ; Fri, 19 Feb 2010 14:19:46 -0700 In-Reply-To: <20100219144203.GD14847@redhat.com> Sender: netdev-owner@vger.kernel.org List-ID: On Fri, 2010-02-19 at 16:42 +0200, Michael S. Tsirkin wrote: > > > Hmm. We already do > > > if (wmem >= sock->sk->sk_sndbuf * 3 / 4) { > > > tx_poll_start(net, sock); > > > set_bit(SOCK_ASYNC_NOSPACE, &sock->flags); > > > break; > > > } > > > why does not this code trigger here? > > > > This check is done only when the ring is empty(head == vq->num). > > But we are breaking out of the loop here. > > if (unlikely(total_len >= VHOST_NET_WEIGHT)) { > > vhost_poll_queue(&vq->poll); > > break; > > } > > > > I guess tx_poll_start() is missing here. The following patch fixes > > the hang and may be a better fix. > > > > Signed-off-by: Sridhar Samudrala > > > > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c > > index 4c89283..fe9d296 100644 > > --- a/drivers/vhost/net.c > > +++ b/drivers/vhost/net.c > > @@ -172,6 +172,7 @@ static void handle_tx(struct vhost_net *net) > > vhost_add_used_and_signal(&net->dev, vq, head, 0); > > total_len += len; > > if (unlikely(total_len >= VHOST_NET_WEIGHT)) { > > + tx_poll_start(net, sock); > > vhost_poll_queue(&vq->poll); > > break; > > } > > > > Thanks > > Sridhar > > > Hmm, this happens when > we have polled a lot of packets, and want to > give another vq a chance to poll. > Looks like a strange place to add it. I am also seeing sendmsg() calls failing with EAGAIN. Could be a bug in handling this error. The check for sendq full is done outside the for loop. It is possible that we can run out of sendq space within the for loop. Should we check for wmem within the for loop? Thanks Sridhar