From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
Yonghong Song <yhs@fb.com>, Andrii Nakryiko <andriin@fb.com>,
"David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>,
Jesper Dangaard Brouer <brouer@redhat.com>,
John Fastabend <john.fastabend@gmail.com>,
Lorenz Bauer <lmb@cloudflare.com>, Andrey Ignatov <rdna@fb.com>
Subject: Re: [PATCH bpf-next v3 1/4] xdp: Support specifying expected existing program when attaching XDP
Date: Wed, 25 Mar 2020 17:48:04 +0100 [thread overview]
Message-ID: <87a744peij.fsf@toke.dk> (raw)
In-Reply-To: <20200325014622.ilhqpfdwgb65jpnq@ast-mbp>
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
> On Tue, Mar 24, 2020 at 07:12:53PM +0100, Toke Høiland-Jørgensen wrote:
>> From: Toke Høiland-Jørgensen <toke@redhat.com>
>>
>> While it is currently possible for userspace to specify that an existing
>> XDP program should not be replaced when attaching to an interface, there is
>> no mechanism to safely replace a specific XDP program with another.
>>
>> This patch adds a new netlink attribute, IFLA_XDP_EXPECTED_ID, which can be
>> set along with IFLA_XDP_FD. If set, the kernel will check that the program
>> currently loaded on the interface matches the expected one, and fail the
>> operation if it does not. This corresponds to a 'cmpxchg' memory operation.
>> Setting the new attribute with a negative value means that no program is
>> expected to be attached, which corresponds to setting the UPDATE_IF_NOEXIST
>> flag.
>>
>> A new companion flag, XDP_FLAGS_EXPECT_ID, is also added to explicitly
>> request checking of the EXPECTED_ID attribute. This is needed for userspace
>> to discover whether the kernel supports the new attribute.
>>
>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>
> Over the years of defining apis to attach bpf progs to different kernel
> components the key mistake we made is that we tried to use corresponding
> subsystem way of doing thing and it failed every time. What made the problem
> worse that this failure we only learned after many years. Attaching xdp via
> netlink felt great back then. Attaching clsbpf via tc felt awesome too. Doing
> cgroup-bpf via bpf syscall with three different attaching ways also felt great.
> All of these attaching things turned out to be broken because all of them
> failed to provide the concept of ownership of the attachment. Which caused 10k
> clsbpf progs on one netdev, 64 cgroup-bpf progs in one cgroup, nuked xdp progs.
> Hence we have to add the ownership model first. Doing mini extensions to
> existing apis is not addressing the giant issue of existing apis.
>
> The idea of this patch is to do atomic replacement of xdp prog. It's a good
> idea and useful feature, but I don't want to extend existing broken apis to do
> add this feature. atomic replacement has to come with thought through owner
> model first.
Setting aside the question of which is the best abstraction to represent
an attachment, it seems to me that the actual behavioural problem (XDP
programs being overridden by mistake) would be solvable by this patch,
assuming well-behaved userspace applications.
If you do need kernel support, we could extend the netlink API to accept
bpf_link FDs in addition to prog FDs. Then applications that fit the
bpf_link model could use netlink to setup the initial link (and any
other device configuration they need to do anyway), and any subsequent
replacements would be done by LINK_UPDATE bpf() operations. The lack of
CAP_NET_ADMIN could even restrict applications from removing the
bpf_link attachment from the netdev, without preventing them from
swapping out the bpf_prog attached to it.
This would also keep netlink solely in charge of netdev configuration,
and prevent any issues with a netdev being locked due to a bpf_link
attachment.
-Toke
next prev parent reply other threads:[~2020-03-25 16:48 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-24 18:12 [PATCH bpf-next v3 0/4] XDP: Support atomic replacement of XDP interface attachments Toke Høiland-Jørgensen
2020-03-24 18:12 ` [PATCH bpf-next v3 1/4] xdp: Support specifying expected existing program when attaching XDP Toke Høiland-Jørgensen
2020-03-24 19:40 ` Jakub Kicinski
2020-03-25 0:54 ` Andrii Nakryiko
2020-03-25 1:43 ` Jakub Kicinski
2020-03-25 16:48 ` Toke Høiland-Jørgensen
2020-03-25 1:46 ` Alexei Starovoitov
2020-03-25 16:48 ` Toke Høiland-Jørgensen [this message]
2020-03-24 18:12 ` [PATCH bpf-next v3 2/4] tools: Add EXPECTED_ID-related definitions in if_link.h Toke Høiland-Jørgensen
2020-03-24 18:12 ` [PATCH bpf-next v3 3/4] libbpf: Add function to set link XDP fd while specifying old program Toke Høiland-Jørgensen
2020-03-24 18:12 ` [PATCH bpf-next v3 4/4] selftests/bpf: Add tests for attaching XDP programs Toke Høiland-Jørgensen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87a744peij.fsf@toke.dk \
--to=toke@redhat.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andriin@fb.com \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=brouer@redhat.com \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=john.fastabend@gmail.com \
--cc=kafai@fb.com \
--cc=kuba@kernel.org \
--cc=lmb@cloudflare.com \
--cc=netdev@vger.kernel.org \
--cc=rdna@fb.com \
--cc=songliubraving@fb.com \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).