From: John Fastabend <john.fastabend@gmail.com>
To: Daniel Borkmann <daniel@iogearbox.net>, martin.lau@kernel.org
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
Daniel Borkmann <daniel@iogearbox.net>,
Pedro Pinto <xten@osec.io>, Hyunwoo Kim <v4bel@theori.io>,
Wongi Lee <qwerty@theori.io>
Subject: RE: [PATCH bpf 1/2] bpf: Fix too early release of tcx_entry
Date: Mon, 08 Jul 2024 17:21:12 -0700
Message-ID: <668c82787f16_d77208e0@john.notmuch>
In-Reply-To: <20240708133130.11609-1-daniel@iogearbox.net>
Daniel Borkmann wrote:
> Pedro Pinto and later independently also Hyunwoo Kim and Wongi Lee reported
> an issue that the tcx_entry can be released too early leading to a use
> after free (UAF) when an active old-style ingress or clsact qdisc with a
> shared tc block is later replaced by another ingress or clsact instance.
>
> Essentially, the sequence to trigger the UAF (one example) can be as follows:
>
> 1. A network namespace is created
> 2. An ingress qdisc is created. This allocates a tcx_entry, and
>    &tcx_entry->miniq is stored in the qdisc's miniqp->p_miniq. At the
>    same time, a tcf block with index 1 is created.
> 3. chain0 is attached to the tcf block. chain0 must be connected to
>    the block linked to the ingress qdisc to later reach the function
>    tcf_chain0_head_change_cb_del() which triggers the UAF.
> 4. Create and graft a clsact qdisc. This causes the ingress qdisc
>    created in step 2 to be removed, thus freeing the previously linked
>    tcx_entry:
>
>    rtnetlink_rcv_msg()
>      => tc_modify_qdisc()
>        => qdisc_create()
>          => clsact_init() [a]
>        => qdisc_graft()
>          => qdisc_destroy()
>            => __qdisc_destroy()
>              => ingress_destroy() [b]
>                => tcx_entry_free()
>                  => kfree_rcu() // tcx_entry freed
>
> 5. Finally, the network namespace is closed. This registers the
>    cleanup_net worker, and during the process of releasing the
>    remaining clsact qdisc, it accesses the tcx_entry that was
>    already freed in step 4, causing the UAF to occur:
>
>    cleanup_net()
>      => ops_exit_list()
>        => default_device_exit_batch()
>          => unregister_netdevice_many()
>            => unregister_netdevice_many_notify()
>              => dev_shutdown()
>                => qdisc_put()
>                  => clsact_destroy() [c]
>                    => tcf_block_put_ext()
>                      => tcf_chain0_head_change_cb_del()
>                        => tcf_chain_head_change_item()
>                          => clsact_chain_head_change()
>                            => mini_qdisc_pair_swap() // UAF
>
> There are also other variants; the gist is to add an ingress (or clsact)
> qdisc with a specific shared block, then to replace that qdisc, waiting
> for the tcx_entry kfree_rcu() to be executed and subsequently accessing
> the currently active qdisc's miniq one way or another.
>
> The correct fix is to turn the miniq_active boolean into a counter. With
> that, at step 2 above the counter transitions from 0->1, at step [a] from
> 1->2 (in order for the miniq object to remain active during the
> replacement), then at [b] from 2->1 and finally at [c] from 1->0 with the
> eventual release. The reference counter in general ranges from [0,2] and
> it does not need to be atomic since all access to the counter is protected
> by the rtnl mutex. With this in place, there is no longer a UAF happening
> and the tcx_entry is freed at the correct time.
>
> Fixes: e420bed02507 ("bpf: Add fd-based tcx multi-prog infra with link support")
> Reported-by: Pedro Pinto <xten@osec.io>
> Co-developed-by: Pedro Pinto <xten@osec.io>
> Signed-off-by: Pedro Pinto <xten@osec.io>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Hyunwoo Kim <v4bel@theori.io>
> Cc: Wongi Lee <qwerty@theori.io>
> Cc: Martin KaFai Lau <martin.lau@kernel.org>
> ---
Acked-by: John Fastabend <john.fastabend@gmail.com>
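
For readers following the thread, the counter transitions described in the
quoted commit message (0->1 at step 2, 1->2 at [a], 2->1 at [b], 1->0 at [c])
can be modeled in a few lines of userspace C. This is only a minimal sketch
and not the kernel code: struct model_entry and the model_* helpers are
hypothetical names invented for illustration, the "freed" flag stands in for
the real tcx_entry release, and no locking is modeled because the commit
message states that all access is serialized by the rtnl mutex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a tcx_entry (not the kernel struct). */
struct model_entry {
	int miniq_active;	/* counter replacing the former boolean, range [0,2] */
	bool freed;		/* set where the real code would release the entry */
};

/* Called where a qdisc starts using the entry (step 2 and [a]). */
static void model_miniq_inc(struct model_entry *e)
{
	assert(!e->freed);
	e->miniq_active++;
}

/* Called where a qdisc stops using the entry ([b] and [c]). */
static void model_miniq_dec(struct model_entry *e)
{
	assert(!e->freed && e->miniq_active > 0);
	if (--e->miniq_active == 0)
		e->freed = true;	/* release only when the last user is gone */
}

/*
 * Replays the sequence from the commit message and returns true if the
 * entry was still live after [b] and released exactly at [c].
 */
static bool model_replay(void)
{
	struct model_entry e = { 0, false };
	bool live_after_b;

	model_miniq_inc(&e);	/* step 2: ingress qdisc created, 0->1 */
	model_miniq_inc(&e);	/* [a]: clsact_init() during replacement, 1->2 */
	model_miniq_dec(&e);	/* [b]: ingress_destroy(), 2->1, entry must survive */
	live_after_b = !e.freed;
	model_miniq_dec(&e);	/* [c]: clsact_destroy(), 1->0, entry released */

	printf("live after [b]: %d, freed after [c]: %d\n", live_after_b, e.freed);
	return live_after_b && e.freed && e.miniq_active == 0;
}
```

With a plain boolean in place of the counter, the decrement at [b] would
already clear the flag and trigger the release while the clsact qdisc still
holds a pointer to the miniq, which is the UAF from step 5.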