From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B1C46C6379F for ; Mon, 30 Jan 2023 21:10:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229847AbjA3VKy (ORCPT ); Mon, 30 Jan 2023 16:10:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36784 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230433AbjA3VKv (ORCPT ); Mon, 30 Jan 2023 16:10:51 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B7EFB1CADF; Mon, 30 Jan 2023 13:10:50 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 5169C6122C; Mon, 30 Jan 2023 21:10:50 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7D8C4C433D2; Mon, 30 Jan 2023 21:10:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1675113049; bh=DnHr1LbMcoGKDw/UbF+UJ3q/uIAGq2Km3uH30O8+GYw=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=eOe+m0dD6nNDHlwAZASSqvRRwH+M55+EUNWE8dTSk4sUmMqeBO/fkU3lxWbC7nmvN kgClo8Xp3sEfPJlZ5mtG+ThwvpJwVj6+D3RA4r8KNKohSk5H6eUH71sRrbJLt1OFC/ KfC5Qnn+FBzaDH6EhbGghsmIPtWcUgk8oQA0Dxx18SbQdeNF0O5rFPvv0NJgEAiNXP +ox09edXaF+5fftqcJNVyuhxs2xBiNQFRUGj065xeDylbF1Oi9LwUyYKm7LOQAZR2j wbzLnd1jycond8VRuQNVZqtap7wZ1R/mDGobiIawlt3QOEWaByyrRGltcZMqciTEuz I1MQHQgATe3/w== Received: by alrua-x1.borgediget.toke.dk (Postfix, from userid 1000) id 4A55C97272C; Mon, 30 Jan 2023 22:10:45 +0100 (CET) From: Toke =?utf-8?Q?H=C3=B8iland-J=C3=B8rgensen?= To: Florian Westphal Cc: Florian Westphal , bpf@vger.kernel.org, netfilter-devel@vger.kernel.org Subject: Re: [RFC] bpf: add bpf_link support for BPF_NETFILTER programs In-Reply-To: <20230130180115.GB12902@breakpoint.cc> References: <20230130150432.24924-1-fw@strlen.de> <87zg9zx6ro.fsf@toke.dk> <20230130180115.GB12902@breakpoint.cc> X-Clacks-Overhead: GNU Terry Pratchett Date: Mon, 30 Jan 2023 22:10:45 +0100 Message-ID: <87o7qfwwx6.fsf@toke.dk> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org Florian Westphal writes: > Toke H=C3=B8iland-J=C3=B8rgensen wrote: >> > Is BPF_LINK the right place? Hook gets removed automatically if the c= alling program >> > exits, afaict this is intended. >>=20 >> Yes, this is indeed intended for bpf_link. This plays well with >> applications that use the API and stick around (because things get >> cleaned up after them automatically even if they crash, say), but it >> doesn't work so well for programs that don't (which, notably, includes >> command line utilities like 'nft'). > > Right, but I did not want to create a dependency on nfnetlink or > nftables netlink right from the start. Dependency how? For userspace consumers, you mean? >> For XDP and TC users can choose between bpf_link and netlink for >> attachment and get one of the two semantics (goes away on close or stays >> put). Not sure if it would make sense to do the same for nftables? > > For nftables I suspect that, if nft can emit bpf, it would make sense to > pass the bpf descriptor together with nftables netlink, i.e. along with > the normal netlink data. > > nftables kernel side would then know to use the bpf prog for the > datapath instead of the interpreter and could even fallback to > interpreter. > > But for the raw hook use case that Alexei and Daniel preferred (cf. > initial proposal to call bpf progs from nf_tables interpreter) I think > that there should be no nftables dependency. > > I could add a new nfnetlink subtype for nf-bpf if bpf_link is not > appropriate as an alternative. I don't think there's anything wrong with bpf_link as an initial interface at least. I just think it should (eventually) be possible to load a BPF-based firewall from the command line via this interface, without having to resort to pinning. There was some talk about adding this as a mode to the bpf_link interface itself at some point, but that never materialised (probably because the need is not great since the netlink interface serves this purpose for TC/XDP). >> > Things like nf_hook_state->in (net_device) could then be exposed via >> > kfuncs. >>=20 >> Right, so like: >>=20 >> state =3D bpf_nf_get_hook_state(ctx); ? >>=20 >> Sounds OK to me. > > Yes, something like that. Downside is that the nf_hook_state struct > is no longer bound by any abi contract, but as I understood it thats > fine. Well, there's an ongoing discussion about what, if any, should be the expectations around kfunc stability: https://lore.kernel.org/r/20230117212731.442859-1-toke@redhat.com I certainly don't think it's problematic for a subsystem to give *more* stability guarantees than BPF core. I mean, if you want the kfunc interface to be stable you just... don't change it? :) >> > nf_hook_run_bpf() (c-function that creates the program context and >> > calls the real bpf prog) would be "updated" to use the bpf dispatcher = to >> > avoid the indirect call overhead. >>=20 >> What 'bpf dispatcher' are you referring to here? We have way too many >> things with that name :P > > I meant 'DEFINE_BPF_DISPATCHER(nf_user_progs);' Ah, right. Yeah, that can definitely be added later! -Toke