All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: Willem de Bruijn <willemb@google.com>,
	virtualization@lists.linux-foundation.org
Subject: Re: virtio-net: locking in tx napi
Date: Tue, 13 Apr 2021 00:44:48 -0400	[thread overview]
Message-ID: <20210413004420-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <cebebed3-7a92-5471-117f-774286731cf5@redhat.com>

On Tue, Apr 13, 2021 at 10:29:03AM +0800, Jason Wang wrote:
> 
> 在 2021/4/13 上午6:31, Michael S. Tsirkin 写道:
> > On Mon, Apr 12, 2021 at 06:03:50PM -0400, Michael S. Tsirkin wrote:
> > > I was working on the spurios interrupt problem and
> > > I noticed something weird.
> > > 
> > > static int virtnet_poll_tx(struct napi_struct *napi, int budget)
> > > {
> > >          struct send_queue *sq = container_of(napi, struct send_queue, napi);
> > >          struct virtnet_info *vi = sq->vq->vdev->priv;
> > >          unsigned int index = vq2txq(sq->vq);
> > >          struct netdev_queue *txq;
> > > 
> > >          if (unlikely(is_xdp_raw_buffer_queue(vi, index))) {
> > >                  /* We don't need to enable cb for XDP */
> > >                  napi_complete_done(napi, 0);
> > >                  return 0;
> > >          }
> > > 
> > >          txq = netdev_get_tx_queue(vi->dev, index);
> > >          __netif_tx_lock(txq, raw_smp_processor_id());
> > >          free_old_xmit_skbs(sq, true);
> > >          __netif_tx_unlock(txq);
> > >          virtqueue_napi_complete(napi, sq->vq, 0);
> > >          if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > >                  netif_tx_wake_queue(txq);
> > >          return 0;
> > > }
> > > 
> > > So virtqueue_napi_complete is called when txq is not locked,
> > > thinkably start_xmit can happen right?
> 
> 
> Yes, I think so.
> 
> 
> > > 
> > > Now virtqueue_napi_complete
> > > 
> > > static void virtqueue_napi_complete(struct napi_struct *napi,
> > >                                      struct virtqueue *vq, int processed)
> > > {
> > >          int opaque;
> > > 
> > >          opaque = virtqueue_enable_cb_prepare(vq);
> > >          if (napi_complete_done(napi, processed)) {
> > >                  if (unlikely(virtqueue_poll(vq, opaque)))
> > >                          virtqueue_napi_schedule(napi, vq);
> > >          } else {
> > >                  virtqueue_disable_cb(vq);
> > >          }
> > > }
> > > 
> > > 
> > > So it is calling virtqueue_enable_cb_prepare but tx queue
> > > could be running and can process things in parallel ...
> > > What gives? I suspect this corrupts the ring, and explains
> > > why we are seeing weird hangs with vhost packed ring ...
> > > 
> > > Jason?
> 
> 
> It might cause the interrupt to be disabled unexpectedly I think.
> 
> 
> > > 
> > > 
> > > -- 
> > > MST
> > and wouldn't the following untested patch make sense then?
> > 
> > 
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 82e520d2cb12..c23341b18eb5 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -1504,6 +1505,8 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget)
> >   	struct virtnet_info *vi = sq->vq->vdev->priv;
> >   	unsigned int index = vq2txq(sq->vq);
> >   	struct netdev_queue *txq;
> > +	int opaque;
> > +	bool done;
> >   	if (unlikely(is_xdp_raw_buffer_queue(vi, index))) {
> >   		/* We don't need to enable cb for XDP */
> > @@ -1514,9 +1517,26 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget)
> >   	txq = netdev_get_tx_queue(vi->dev, index);
> >   	__netif_tx_lock(txq, raw_smp_processor_id());
> >   	free_old_xmit_skbs(sq, true);
> > +
> > +	opaque = virtqueue_enable_cb_prepare(sq->vq);
> > +
> > +	done = napi_complete_done(napi, 0);
> > +
> > +	if (!done)
> > +		virtqueue_disable_cb(sq->vq);
> > +
> >   	__netif_tx_unlock(txq);
> > -	virtqueue_napi_complete(napi, sq->vq, 0);
> > +	if (done) {
> > +		if (unlikely(virtqueue_poll(vq, opaque))) {
> > +			if (napi_schedule_prep(napi)) {
> > +				__netif_tx_lock(txq, raw_smp_processor_id());
> > +				virtqueue_disable_cb(sq->vq);
> > +				__netif_tx_unlock(txq);
> > +				__napi_schedule(napi);
> > +			}
> > +		}
> > +	}
> >   	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> >   		netif_tx_wake_queue(txq);
> 
> 
> I wonder why not simply protect the whole poll_tx with tx lock in this case?
> 
> Thanks
> 

Well it takes __netif_tx_lock internally does it not? Sounds like a
deadlock to me ..,

-- 
MST

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

  reply	other threads:[~2021-04-13  4:44 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-12 22:03 virtio-net: locking in tx napi Michael S. Tsirkin
2021-04-12 22:31 ` Michael S. Tsirkin
2021-04-13  2:29   ` Jason Wang
2021-04-13  4:44     ` Michael S. Tsirkin [this message]
2021-04-13 13:29       ` Willem de Bruijn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210413004420-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=willemb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.