From: Jason Wang <jasowang@redhat.com>
To: Herbert Xu <herbert@gondor.apana.org.au>,
David Vrabel <david.vrabel@citrix.com>
Cc: netdev@vger.kernel.org, xen-devel@lists.xenproject.org,
konrad.wilk@oracle.com, boris.ostrovsky@oracle.com,
edumazet@google.com, "David S. Miller" <davem@davemloft.net>
Subject: Re: virtio_net: Fix napi poll list corruption
Date: Mon, 22 Dec 2014 16:18:33 +0800
Message-ID: <5497D3D9.2070509@redhat.com>
In-Reply-To: <20141220002327.GA31975@gondor.apana.org.au>

On 12/20/2014 08:23 AM, Herbert Xu wrote:
> David Vrabel <david.vrabel@citrix.com> wrote:
>> After d75b1ade567ffab085e8adbbdacf0092d10cd09c (net: less interrupt
>> masking in NAPI) the napi instance is removed from the per-cpu list
>> prior to calling the n->poll(), and is only requeued if all of the
>> budget was used. This inadvertently broke netfront because netfront
>> does not use NAPI correctly.
> A similar bug exists in virtio_net.
>
> -- >8 --
> The commit d75b1ade567ffab085e8adbbdacf0092d10cd09c (net: less
> interrupt masking in NAPI) breaks virtio_net in an insidious way.
>
> It is now required that if the entire budget is consumed when poll
> returns, the napi poll_list must remain empty. However, like some
> other drivers virtio_net tries to do a last-ditch check and if
> there is more work it will call napi_schedule and then immediately
> process some of this new work. Should the entire budget be consumed
> while processing such new work then we will violate the new caller
> contract.
>
> This patch fixes this by not touching any work when we reschedule
> in virtio_net.
>
> The worst part of this bug is that the list corruption causes other
> napi users to be moved off-list. In my case I was chasing a stall
> in IPsec (IPsec uses netif_rx) and I only belatedly realised that it
> was virtio_net which caused the stall even though the virtio_net
> poll was still functioning perfectly after IPsec stalled.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index b8bd719..5ca9771 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -760,7 +760,6 @@ static int virtnet_poll(struct napi_struct *napi, int budget)
>  		container_of(napi, struct receive_queue, napi);
>  	unsigned int r, received = 0;
>
> -again:
>  	received += virtnet_receive(rq, budget - received);
>
>  	/* Out of packets? */
> @@ -771,7 +770,6 @@ again:
>  		    napi_schedule_prep(napi)) {
>  			virtqueue_disable_cb(rq->vq);
>  			__napi_schedule(napi);
> -			goto again;
>  		}
>  	}
>
> Cheers,

Acked-by: Jason Wang <jasowang@redhat.com>

btw, it looks like at least caif_virtio has the same issue.
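
For readers following the thread, the caller contract described in the quoted commit message can be illustrated with a toy model in plain C. This is only a sketch under stated assumptions, not kernel code: every name below (toy_napi, toy_receive, toy_poll_old, toy_poll_fixed, toy_contract_ok) is hypothetical, and the poll list is reduced to a single boolean. It only mirrors the control flow of the diff above: the pre-patch poll jumps back after rescheduling and can consume the whole budget while the instance is already queued again, whereas the post-patch poll returns without touching the new work.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; all names are hypothetical and nothing here is kernel API. */
struct toy_napi {
	int pending;        /* packets currently waiting in the ring */
	int late_arrivals;  /* packets that show up during the last-ditch check */
	bool on_poll_list;  /* stands in for membership of the per-cpu poll list */
};

/* Consume up to 'budget' packets; stands in for virtnet_receive(). */
static int toy_receive(struct toy_napi *n, int budget)
{
	assert(budget >= 0);
	int done = n->pending < budget ? n->pending : budget;
	n->pending -= done;
	return done;
}

/*
 * Stands in for napi_complete() followed by the last-ditch check.
 * Returns true if new work was found and the instance was rescheduled.
 */
static bool toy_complete_and_recheck(struct toy_napi *n)
{
	n->on_poll_list = false;
	if (n->late_arrivals > 0) {
		n->pending += n->late_arrivals;
		n->late_arrivals = 0;
		n->on_poll_list = true;	/* stands in for __napi_schedule() */
		return true;
	}
	return false;
}

/* Pre-patch shape: after rescheduling, jump back and keep receiving. */
static int toy_poll_old(struct toy_napi *n, int budget)
{
	int received = 0;
again:
	received += toy_receive(n, budget - received);
	if (received < budget) {
		if (toy_complete_and_recheck(n))
			goto again;
	}
	return received;
}

/* Post-patch shape: reschedule, but leave the new work for the next poll. */
static int toy_poll_fixed(struct toy_napi *n, int budget)
{
	int received = toy_receive(n, budget);

	if (received < budget)
		toy_complete_and_recheck(n);
	return received;
}

/* The contract: a poll that used its whole budget must not be queued. */
static bool toy_contract_ok(const struct toy_napi *n, int received, int budget)
{
	return !(received == budget && n->on_poll_list);
}
```

With 3 packets pending, 100 late arrivals and a budget of 8, toy_poll_old() returns 8 while on_poll_list is already true, so toy_contract_ok() fails. toy_poll_fixed() returns 3 with the instance rescheduled, which satisfies the contract because the budget was not exhausted.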