public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: "Nelson, Shannon" <shannon.nelson@amd.com>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"David Ahern" <dsahern@gmail.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	netdev@vger.kernel.org, bpf@vger.kernel.org,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Andrii Nakryiko" <andrii@kernel.org>
Subject: Re: BPF/XDP: kernel panic when removing an interface that is an xdp_redirect target
Date: Wed, 8 Nov 2023 13:30:21 -0800	[thread overview]
Message-ID: <fa95d5d0-35c0-497e-aea8-a35f9f6304f4@amd.com> (raw)
In-Reply-To: <87h6lxy3zq.fsf@toke.dk>

On 11/7/2023 7:31 AM, Toke Høiland-Jørgensen wrote:
> 
> "Nelson, Shannon" <shannon.nelson@amd.com> writes:
> 
>> While testing new code to support XDP in the ionic driver we found that
>> we could panic the kernel by running a bind/unbind loop on the target
>> interface of an xdp_redirect action.  Obviously this is a stress test
>> that is abusing the system, but it does point to a window of opportunity
>> in bq_enqueue() and bq_xmit_all().  I believe that while the validity of
>> the target interface has been checked in __xdp_enqueue(), the interface
>> can be unbound by the time either bq_enqueue() or bq_xmit_all() tries to
>> use the interface.  There is no locking or reference taken on the
>> interface to hold it in place before the target’s ndo_xdp_xmit() is called.
>>
>> Below is a stack trace that our tester captured while running our test
>> code on a RHEL 9.2 kernel – yes, I know, unpublished driver code on a
>> non-upstream kernel.  But if you look at the current upstream code in
>> kernel/bpf/devmap.c I think you can see what we ran into.
>>
>> Other than telling users to not abuse the system with a bind/unbind
>> loop, is there something we can do to limit the potential pain here?
>> Without knowing what interfaces might be targeted by the users’ XDP
>> programs, is there a step the originating driver can do to take
>> precautions?  Did we simply miss a step in the driver, or is this an
>> actual problem in the devmap code?
> 
> Sounds like a driver bug :)

Entirely possible, wouldn't be our first ... :-)

> 
> The XDP redirect flow guarantees that all outstanding packets are
> flushed within a single NAPI cycle, as documented here:
> https://docs.kernel.org/bpf/redirect.html
> 
> So basically, the driver should be doing a two-step teardown: remove
> global visibility of the resource in question, wait for all concurrent
> users to finish, and *then* free the data structure. This corresponds to
> the usual RCU protection: resources should be kept alive until all
> concurrent RCU critical sections have exited on all CPUs. So if your
> driver is removing an interface's data structure without waiting for
> concurrent NAPI cycles to finish, that's a bug in the driver.
> 
> This kind of thing is what the synchronize_net() function is for; for a
> usage example, see veth_napi_del_range(). My guess would be that you're
> missing this as part of your driver teardown flow?

Essentially, the first thing we do in the remove function is to call 
unregister_netdev(), which has synchronize_net() in the path, so I don't 
think this is missing from our scenario, but thanks for the hint, I'll 
keep this in mind.  I do see there are a couple of net drivers that are 
more aggressive about calling it directly in some other parts of the 
logic - I don't think that has a bearing on this issue, but I'll keep it 
in mind.

> 
> Another source of a bug like this could be that your driver does not in
> fact call xdp_do_flush() before exiting its NAPI cycle, so that there
> will be packets from the previous cycle in the bq queue, in which case
> the assumption mentioned in the linked document obviously breaks down.
> But that would also be a driver bug :)

We do call the xdp_do_flush() at the end of the NAPI cycle, just before 
calling napi_complete_done().

> 
> -Toke
> 

Thanks for the notes - I'll have our tester spend some more time with 
this using other drivers/interfaces as the targets to see if we can 
gather more information on the scenario.

sln


  reply	other threads:[~2023-11-08 21:30 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-06 23:03 BPF/XDP: kernel panic when removing an interface that is an xdp_redirect target Nelson, Shannon
2023-11-07 15:31 ` Toke Høiland-Jørgensen
2023-11-08 21:30   ` Nelson, Shannon [this message]
2023-11-08 23:10     ` Toke Høiland-Jørgensen
2023-11-09  1:52     ` Jakub Kicinski
2023-11-09  2:11       ` Nelson, Shannon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fa95d5d0-35c0-497e-aea8-a35f9f6304f4@amd.com \
    --to=shannon.nelson@amd.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=dsahern@gmail.com \
    --cc=hawk@kernel.org \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=toke@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox