public inbox for netdev@vger.kernel.org
From: Shawn Bohrer <sbohrer@cloudflare.com>
To: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, bjorn@kernel.org,
	magnus.karlsson@intel.com, kernel-team@cloudflare.com
Subject: Re: Possible race with xsk_flush
Date: Fri, 16 Dec 2022 17:42:36 -0600
Message-ID: <Y50CbIe2efR/JMwz@sbohrer-cf-dell>
In-Reply-To: <CAJ8uoz1GKvoaM0DCo1Ki8q=LHR1cjrNC=1BK7chTKKW9Po5F5A@mail.gmail.com>

On Fri, Dec 16, 2022 at 11:05:19AM +0100, Magnus Karlsson wrote:
> To summarize, we are expecting this ordering:
> 
> CPU 0 __xsk_rcv_zc()
> CPU 0 __xsk_map_flush()
> CPU 2 __xsk_rcv_zc()
> CPU 2 __xsk_map_flush()
> 
> But we are seeing this order:
> 
> CPU 0 __xsk_rcv_zc()
> CPU 2 __xsk_rcv_zc()
> CPU 0 __xsk_map_flush()
> CPU 2 __xsk_map_flush()
> 
> Here is the veth NAPI poll loop:
> 
> static int veth_poll(struct napi_struct *napi, int budget)
> {
>     struct veth_rq *rq =
>         container_of(napi, struct veth_rq, xdp_napi);
>     struct veth_stats stats = {};
>     struct veth_xdp_tx_bq bq;
>     int done;
> 
>     bq.count = 0;
> 
>     xdp_set_return_frame_no_direct();
>     done = veth_xdp_rcv(rq, budget, &bq, &stats);
> 
>     if (done < budget && napi_complete_done(napi, done)) {
>         /* Write rx_notify_masked before reading ptr_ring */
>         smp_store_mb(rq->rx_notify_masked, false);
>         if (unlikely(!__ptr_ring_empty(&rq->xdp_ring))) {
>             if (napi_schedule_prep(&rq->xdp_napi)) {
>                 WRITE_ONCE(rq->rx_notify_masked, true);
>                 __napi_schedule(&rq->xdp_napi);
>             }
>         }
>     }
> 
>     if (stats.xdp_tx > 0)
>         veth_xdp_flush(rq, &bq);
>     if (stats.xdp_redirect > 0)
>         xdp_do_flush();
>     xdp_clear_return_frame_no_direct();
> 
>     return done;
> }
> 
> Something I have never seen before is that there is
> napi_complete_done() and a __napi_schedule() before xdp_do_flush().
> Let us check if this has something to do with it. So two suggestions
> to be executed separately:
> 
> * Put a probe at the __napi_schedule() above and check if it gets
> triggered before this problem
> * Move the "if (stats.xdp_redirect > 0) xdp_do_flush();" to just
> before "if (done < budget && napi_complete_done(napi, done)) {"

I've built a kernel that moves xdp_do_flush() before
napi_complete_done() and will leave it running over the weekend.  I
also spent a while reading the NAPI code to see whether that ordering
could actually cause the bug.  I'll admit I still don't fully
understand the NAPI code or how the bug can happen.  We'll see if this
appears to fix the issue, but since it is somewhat hard to reproduce
I'd love for someone to be able to explain why it fixes it.
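
For reference, the reordering I built looks roughly like this (a
sketch of the idea only, not the exact diff I'm testing), with
xdp_do_flush() moved so the per-CPU flush runs before
napi_complete_done() can let this NAPI context be scheduled again,
possibly on another CPU:

static int veth_poll(struct napi_struct *napi, int budget)
{
    struct veth_rq *rq =
        container_of(napi, struct veth_rq, xdp_napi);
    struct veth_stats stats = {};
    struct veth_xdp_tx_bq bq;
    int done;

    bq.count = 0;

    xdp_set_return_frame_no_direct();
    done = veth_xdp_rcv(rq, budget, &bq, &stats);

    /* Flush redirects (including the per-CPU xsk flush list) while we
     * still own this NAPI context, i.e. before napi_complete_done().
     */
    if (stats.xdp_redirect > 0)
        xdp_do_flush();

    if (done < budget && napi_complete_done(napi, done)) {
        /* Write rx_notify_masked before reading ptr_ring */
        smp_store_mb(rq->rx_notify_masked, false);
        if (unlikely(!__ptr_ring_empty(&rq->xdp_ring))) {
            if (napi_schedule_prep(&rq->xdp_napi)) {
                WRITE_ONCE(rq->rx_notify_masked, true);
                __napi_schedule(&rq->xdp_napi);
            }
        }
    }

    if (stats.xdp_tx > 0)
        veth_xdp_flush(rq, &bq);
    xdp_clear_return_frame_no_direct();

    return done;
}

If the problem really is that this NAPI context can run again on
another CPU before the first CPU has called xdp_do_flush(), this
should close that window.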

Thanks,
Shawn Bohrer

Thread overview: 21+ messages
2022-12-14 22:32 Possible race with xsk_flush Shawn Bohrer
2022-12-15 10:22 ` Magnus Karlsson
2022-12-15 19:50   ` Alex Forster
2022-12-16  0:14   ` Shawn Bohrer
2022-12-16 10:05     ` Magnus Karlsson
2022-12-16 16:48       ` Shawn Bohrer
2022-12-16 23:42       ` Shawn Bohrer [this message]
2022-12-20  1:31       ` Shawn Bohrer
2022-12-20  9:06         ` Magnus Karlsson
2022-12-20 15:25           ` Shawn Bohrer
2022-12-20 15:59             ` Magnus Karlsson
2022-12-20 16:02               ` Shawn Bohrer
2022-12-20 18:59 ` [PATCH] veth: Fix race with AF_XDP exposing old or uninitialized descriptors Shawn Bohrer
2022-12-22  1:07   ` Jakub Kicinski
2022-12-22 14:30     ` Paolo Abeni
2022-12-22 10:18   ` Paolo Abeni
2023-01-11 14:02     ` Magnus Karlsson
2023-01-11 14:21       ` Toke Høiland-Jørgensen
2023-01-11 16:24         ` Magnus Karlsson
2023-01-11 23:06           ` Toke Høiland-Jørgensen
2022-12-22 14:40   ` patchwork-bot+netdevbpf
