From: "Toke Høiland-Jørgensen" <toke@kernel.org>
To: Florian Westphal <fw@strlen.de>, bpf@vger.kernel.org
Cc: netfilter-devel@vger.kernel.org, Florian Westphal <fw@strlen.de>
Subject: Re: [RFC] bpf: add bpf_link support for BPF_NETFILTER programs
Date: Mon, 30 Jan 2023 18:38:03 +0100 [thread overview]
Message-ID: <87zg9zx6ro.fsf@toke.dk> (raw)
In-Reply-To: <20230130150432.24924-1-fw@strlen.de>
Florian Westphal <fw@strlen.de> writes:
> Doesn't apply, doesn't work -- there is no BPF_NETFILTER program type.
>
> Sketches the uapi. Example usage:
>
> union bpf_attr attr = { };
>
> attr.link_create.prog_fd = progfd;
> attr.link_create.attach_type = BPF_NETFILTER;
> attr.link_create.netfilter.pf = PF_INET;
> attr.link_create.netfilter.hooknum = NF_INET_LOCAL_IN;
> attr.link_create.netfilter.priority = -128;
>
> err = bpf(BPF_LINK_CREATE, &attr, sizeof(attr));
>
> ... this would attach progfd to ipv4:input hook.
>
> Is BPF_LINK the right place? Hook gets removed automatically if the calling program
> exits, afaict this is intended.
Yes, this is indeed intended for bpf_link. This plays well with
applications that use the API and stick around (because things get
cleaned up after them automatically even if they crash, say), but it
doesn't work so well for programs that don't (which, notably, includes
command line utilities like 'nft').
This is why I personally never really liked those semantics for
networking hooks: If I run a utility that attaches an XDP program I
generally expect that to stick around until the netdev disappears unless
something else explicitly removes it. (Yes you can pin a bpf_link, but
then you have the opposite problem: if the netdev disappears some entity
has to remove the pinned link, or you'll have a zombie program present
in the kernel until the next reboot).
For XDP and TC users can choose between bpf_link and netlink for
attachment and get one of the two semantics (goes away on close or stays
put). Not sure if it would make sense to do the same for nftables?
> Should a program running in init_netns be allowed to attach hooks in other netns too?
>
> I could do what BPF_LINK_TYPE_NETNS is doing and fetch net via
> get_net_ns_by_fd(attr->link_create.target_fd);
We don't allow that for any other type of BPF program; the expectation
is that the entity doing the attachment will move to the right ns first.
Is there any particular use case for doing something different for
nftables?
> For the actual BPF_NETFILTER program type I plan to follow what the bpf
> flow dissector is doing, i.e. pretend prototype is
>
> func(struct __sk_buff *skb)
>
> but pass a custom program specific context struct on kernel side.
> Verifier will rewrite accesses as needed.
This sounds reasonable, and also promotes code reuse between program
types (say, you can write some BPF code to parse a packet and reuse it
between the flow dissector, TC and netfilter).
> Things like nf_hook_state->in (net_device) could then be exposed via
> kfuncs.
Right, so like:
state = bpf_nf_get_hook_state(ctx); ?
Sounds OK to me.
> nf_hook_run_bpf() (c-function that creates the program context and
> calls the real bpf prog) would be "updated" to use the bpf dispatcher to
> avoid the indirect call overhead.
What 'bpf dispatcher' are you referring to here? We have way too many
things with that name :P
> Does that seem ok to you? I'd ignore the bpf dispatcher for now and
> would work on the needed verifier changes first.
Getting something that works first seems reasonable, sure! :)
-Toke
next prev parent reply other threads:[~2023-01-30 17:39 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-30 15:04 [RFC] bpf: add bpf_link support for BPF_NETFILTER programs Florian Westphal
2023-01-30 17:38 ` Toke Høiland-Jørgensen [this message]
2023-01-30 18:01 ` Florian Westphal
2023-01-30 21:10 ` Toke Høiland-Jørgensen
2023-01-30 21:44 ` Alexei Starovoitov
2023-01-31 14:18 ` Florian Westphal
2023-01-31 16:19 ` Toke Høiland-Jørgensen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87zg9zx6ro.fsf@toke.dk \
--to=toke@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=fw@strlen.de \
--cc=netfilter-devel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox