From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0EE59C433DB for ; Mon, 1 Feb 2021 13:49:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C428664E94 for ; Mon, 1 Feb 2021 13:49:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231821AbhBANtD (ORCPT ); Mon, 1 Feb 2021 08:49:03 -0500 Received: from correo.us.es ([193.147.175.20]:51082 "EHLO mail.us.es" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231405AbhBANtA (ORCPT ); Mon, 1 Feb 2021 08:49:00 -0500 Received: from antivirus1-rhel7.int (unknown [192.168.2.11]) by mail.us.es (Postfix) with ESMTP id 969D715C102 for ; Mon, 1 Feb 2021 14:48:16 +0100 (CET) Received: from antivirus1-rhel7.int (localhost [127.0.0.1]) by antivirus1-rhel7.int (Postfix) with ESMTP id 8494DDA72F for ; Mon, 1 Feb 2021 14:48:16 +0100 (CET) Received: by antivirus1-rhel7.int (Postfix, from userid 99) id 79FA6DA704; Mon, 1 Feb 2021 14:48:16 +0100 (CET) Received: from antivirus1-rhel7.int (localhost [127.0.0.1]) by antivirus1-rhel7.int (Postfix) with ESMTP id 3D8E6DA78B; Mon, 1 Feb 2021 14:48:14 +0100 (CET) Received: from 192.168.1.97 (192.168.1.97) by antivirus1-rhel7.int (F-Secure/fsigk_smtp/550/antivirus1-rhel7.int); Mon, 01 Feb 2021 14:48:14 +0100 (CET) X-Virus-Status: clean(F-Secure/fsigk_smtp/550/antivirus1-rhel7.int) Received: from us.es (unknown [90.77.255.23]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) (Authenticated sender: 1984lsi) by entrada.int (Postfix) with ESMTPSA id 1F6EE42DC6DF; Mon, 1 Feb 2021 14:48:14 +0100 (CET) Date: Mon, 1 Feb 2021 14:48:13 +0100 X-SMTPAUTHUS: auth mail.us.es From: Pablo Neira Ayuso To: Florian Westphal Cc: netfilter-devel@vger.kernel.org Subject: Re: [PATCH nf-next] netfilter: nftables: introduce table ownership Message-ID: <20210201134813.GA24566@salvia> References: <20210127021928.2444-1-pablo@netfilter.org> <20210201122455.GE12443@breakpoint.cc> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20210201122455.GE12443@breakpoint.cc> User-Agent: Alpine 2.23 (DEB 453 2020-06-18) X-Virus-Scanned: ClamAV using ClamSMTP Precedence: bulk List-ID: X-Mailing-List: netfilter-devel@vger.kernel.org On Mon, Feb 01, 2021 at 01:24:55PM +0100, Florian Westphal wrote: > Pablo Neira Ayuso wrote: > > A userspace daemon like firewalld might need to monitor for netlink > > updates to detect its ruleset removal by the (global) flush ruleset > > command to ensure ruleset persistence. This adds extra complexity from > > userspace and, for some little time, the firewall policy is not in > > place. > > > > This patch adds the NFT_MSG_SETOWNER netlink command which allows a > > userspace program to own the table that creates in exclusivity. > > > > Tables that are owned... > > > > - can only be updated and removed by the owner, non-owners hit EPERM if > > they try to update it or remove it. > > - are destroyed when the owner send the NFT_MSG_UNSETOWNER command, > > or the netlink socket is closed or the process is gone (implicit > > netlink socket closure). > > - are skipped by the global flush ruleset command. > > - are listed in the global ruleset. > > > > The userspace process that sends the new NFT_MSG_SETOWNER command need > > to leave open the netlink socket. > > > > The NFTA_TABLE_OWNER netlink attribute specifies the netlink port ID to > > identify the owner. > > At least for systemd use case, there would be a need to allow > add/removal of set elements from other user. Then, probably a flag for this? Such flag would work like this? - Allow for set element updates (from any process, no ownership). - nft flush ruleset skips flushing the set. - nft flush set x y flushes the content of this set. The table owner would set on such flag. Would this work for the scenario you describe below? > At the moment, table is created by systemd-networkd which will update > the masquerade set. > > In case systemd-nspawn is used and configured to expose container > services via dnat that will need to add the translation map: > > add table ip io.systemd.nat > add chain ip io.systemd.nat prerouting { type nat hook prerouting priority dstnat + 1; policy accept; } > [..] > # new generation 2 by process 1378 (systemd-network) > add element ip io.systemd.nat masq_saddr { 192.168.159.192/28 } > # new generation 3 by process 1378 (systemd-network) > add element ip io.systemd.nat map_port_ipport { tcp . 2222 : 192.168.159.201 . 22 } > # new generation 4 by process 1512 (systemd-nspawn) > > > +struct nft_owner { > > + struct list_head list; > > + possible_net_t net; > > + u32 nlpid; > > +}; > > I don't see why this is needed. > Isn't it enough to record the nlpid in the table and set a flag that the table is > owned by that pid? I'll have a look. > > + nft_active_genmask(table, genmask)) { > > + if (nlpid && table->nlpid && table->nlpid != nlpid) > > + return ERR_PTR(-EPERM); > > + > > i.e., (table->flags & OWNED) && table->nlpid != nlpid)? > > On netlink sk destruction the owner flag could be cleared or table > could be auto-zapped. Default behaviour right now is: table is released if owner is gone. It should be possible to add a flag to leave the ruleset in place (owner flag would be cleared from NETLINK_RELEASE event path).