From: John Fastabend <john.fastabend@gmail.com>
To: Daniel Borkmann <daniel@iogearbox.net>, martin.lau@kernel.org
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
Daniel Borkmann <daniel@iogearbox.net>,
Pedro Pinto <xten@osec.io>, Hyunwoo Kim <v4bel@theori.io>,
Wongi Lee <qwerty@theori.io>
Subject: RE: [PATCH bpf 1/2] bpf: Fix too early release of tcx_entry
Date: Mon, 08 Jul 2024 17:21:12 -0700
Message-ID: <668c82787f16_d77208e0@john.notmuch>
In-Reply-To: <20240708133130.11609-1-daniel@iogearbox.net>
Daniel Borkmann wrote:
> Pedro Pinto and, later and independently, Hyunwoo Kim and Wongi Lee reported
> an issue where the tcx_entry can be released too early, leading to a
> use-after-free (UAF) when an active old-style ingress or clsact qdisc with a
> shared tc block is later replaced by another ingress or clsact instance.
>
> Essentially, the sequence to trigger the UAF (one example) can be as follows:
>
> 1. A network namespace is created
> 2. An ingress qdisc is created. This allocates a tcx_entry, and
>    &tcx_entry->miniq is stored in the qdisc's miniqp->p_miniq. At the
>    same time, a tcf block with index 1 is created.
> 3. chain0 is attached to the tcf block. chain0 must be connected to
>    the block linked to the ingress qdisc to later reach the function
>    tcf_chain0_head_change_cb_del() which triggers the UAF.
> 4. Create and graft a clsact qdisc. This causes the ingress qdisc
>    created in step 2 to be removed, thus freeing the previously linked
>    tcx_entry:
>
>    rtnetlink_rcv_msg()
>      => tc_modify_qdisc()
>        => qdisc_create()
>          => clsact_init() [a]
>        => qdisc_graft()
>          => qdisc_destroy()
>            => __qdisc_destroy()
>              => ingress_destroy() [b]
>                => tcx_entry_free()
>                  => kfree_rcu() // tcx_entry freed
>
> 5. Finally, the network namespace is closed. This registers the
>    cleanup_net worker, and during the process of releasing the
>    remaining clsact qdisc, it accesses the tcx_entry that was
>    already freed in step 4, causing the UAF to occur:
>
>    cleanup_net()
>      => ops_exit_list()
>        => default_device_exit_batch()
>          => unregister_netdevice_many()
>            => unregister_netdevice_many_notify()
>              => dev_shutdown()
>                => qdisc_put()
>                  => clsact_destroy() [c]
>                    => tcf_block_put_ext()
>                      => tcf_chain0_head_change_cb_del()
>                        => tcf_chain_head_change_item()
>                          => clsact_chain_head_change()
>                            => mini_qdisc_pair_swap() // UAF
>
> There are also other variants; the gist is to add an ingress (or clsact)
> qdisc with a specific shared block, then to replace that qdisc, wait
> for the tcx_entry kfree_rcu() to be executed, and subsequently access
> the currently active qdisc's miniq one way or another.
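
The replace sequence above can be illustrated with a small userspace model.
This is only a sketch under stated assumptions, not the kernel code: struct
flag_entry, qdisc_init_model(), qdisc_destroy_model() and
boolean_flag_frees_early() are hypothetical names, and the "freed" field
merely stands in for kfree_rcu() having run. It shows why a single boolean
cannot track two qdiscs that briefly share one entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for tcx_entry; only the activity flag is modeled. */
struct flag_entry {
	bool miniq_active;	/* pre-fix state: one boolean for all users */
	bool freed;		/* models kfree_rcu() having released the entry */
};

/* Models ingress/clsact init: mark the shared entry as in use. */
static void qdisc_init_model(struct flag_entry *e)
{
	e->miniq_active = true;
}

/* Models ingress/clsact destroy: clear the flag and free if inactive. */
static void qdisc_destroy_model(struct flag_entry *e)
{
	e->miniq_active = false;
	if (!e->miniq_active)
		e->freed = true;
}

/*
 * Replays steps 2 and 4: the old qdisc is created, the new one is
 * initialized [a], then the old one is destroyed [b]. Returns true if
 * the entry was freed while the new qdisc still refers to it.
 */
static bool boolean_flag_frees_early(void)
{
	struct flag_entry e = { false, false };

	qdisc_init_model(&e);		/* step 2: ingress qdisc created */
	qdisc_init_model(&e);		/* [a]: replacement clsact initialized */
	qdisc_destroy_model(&e);	/* [b]: old ingress qdisc destroyed */

	return e.freed;			/* new qdisc is still alive here */
}
```

Calling boolean_flag_frees_early() from a test harness reports that the
entry is already freed after [b], which is the window in which the later
access from [c] becomes a use-after-free.
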
>
> The correct fix is to turn the miniq_active boolean into a counter. With
> that change, the counter transitions from 0->1 at step 2 above, from 1->2
> at [a] (so that the miniq object remains active during the replacement),
> from 2->1 at [b], and finally from 1->0 at [c], at which point the entry
> is released. The counter generally stays within [0,2] and does not need to
> be atomic, since all accesses to it are protected by the rtnl mutex. With
> this in place, the UAF no longer occurs and the tcx_entry is freed at the
> correct time.
>
> Fixes: e420bed02507 ("bpf: Add fd-based tcx multi-prog infra with link support")
> Reported-by: Pedro Pinto <xten@osec.io>
> Co-developed-by: Pedro Pinto <xten@osec.io>
> Signed-off-by: Pedro Pinto <xten@osec.io>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Hyunwoo Kim <v4bel@theori.io>
> Cc: Wongi Lee <qwerty@theori.io>
> Cc: Martin KaFai Lau <martin.lau@kernel.org>
> ---
Acked-by: John Fastabend <john.fastabend@gmail.com>
Thread overview: 6+ messages
2024-07-08 13:31 [PATCH bpf 1/2] bpf: Fix too early release of tcx_entry Daniel Borkmann
2024-07-08 13:31 ` [PATCH bpf 2/2] selftests/bpf: Extend tcx tests to cover late tcx_entry release Daniel Borkmann
2024-07-08 22:34 ` Martin KaFai Lau
2024-07-09 19:48 ` Daniel Borkmann
2024-07-08 22:30 ` [PATCH bpf 1/2] bpf: Fix too early release of tcx_entry patchwork-bot+netdevbpf
2024-07-09 0:21 ` John Fastabend [this message]