* [nf-next PATCH v3 1/3] netfilter: nf_tables: Open-code audit log call in nf_tables_getrule()
2023-10-19 11:33 [nf-next PATCH v3 0/3] Introduce locking for rule reset requests Phil Sutter
@ 2023-10-19 11:33 ` Phil Sutter
2023-10-19 11:52 ` Florian Westphal
2023-10-19 11:33 ` [nf-next PATCH v3 2/3] netfilter: nf_tables: Introduce nf_tables_getrule_single() Phil Sutter
2023-10-19 11:33 ` [nf-next PATCH v3 3/3] netfilter: nf_tables: Add locking for NFT_MSG_GETRULE_RESET requests Phil Sutter
2 siblings, 1 reply; 11+ messages in thread
From: Phil Sutter @ 2023-10-19 11:33 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel
The table lookup will be dropped from that function, so remove that
dependency from audit logging code. Using whatever is in
nla[NFTA_RULE_TABLE] is sufficient as long as the previous rule info
filling succeeded.
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
Changes since v1:
- New patch
---
net/netfilter/nf_tables_api.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 72ed4d2045c5..3c65ce7a2f51 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -3589,15 +3589,19 @@ static int nf_tables_dump_rules_done(struct netlink_callback *cb)
static int nf_tables_getrule(struct sk_buff *skb, const struct nfnl_info *info,
const struct nlattr * const nla[])
{
+ struct nftables_pernet *nft_net = nft_pernet(info->net);
struct netlink_ext_ack *extack = info->extack;
u8 genmask = nft_genmask_cur(info->net);
u8 family = info->nfmsg->nfgen_family;
+ u32 portid = NETLINK_CB(skb).portid;
const struct nft_chain *chain;
const struct nft_rule *rule;
struct net *net = info->net;
struct nft_table *table;
struct sk_buff *skb2;
bool reset = false;
+ char *tablename;
+ char *buf;
int err;
if (info->nlh->nlmsg_flags & NLM_F_DUMP) {
@@ -3637,16 +3641,23 @@ static int nf_tables_getrule(struct sk_buff *skb, const struct nfnl_info *info,
if (NFNL_MSG_TYPE(info->nlh->nlmsg_type) == NFT_MSG_GETRULE_RESET)
reset = true;
- err = nf_tables_fill_rule_info(skb2, net, NETLINK_CB(skb).portid,
+ err = nf_tables_fill_rule_info(skb2, net, portid,
info->nlh->nlmsg_seq, NFT_MSG_NEWRULE, 0,
family, table, chain, rule, 0, reset);
if (err < 0)
goto err_fill_rule_info;
- if (reset)
- audit_log_rule_reset(table, nft_pernet(net)->base_seq, 1);
+ if (!reset)
+ return nfnetlink_unicast(skb2, net, portid);
- return nfnetlink_unicast(skb2, net, NETLINK_CB(skb).portid);
+ tablename = nla_strdup(nla[NFTA_RULE_TABLE], GFP_ATOMIC);
+ buf = kasprintf(GFP_ATOMIC, "%s:%u", tablename, nft_net->base_seq);
+ audit_log_nfcfg(buf, info->nfmsg->nfgen_family, 1,
+ AUDIT_NFT_OP_RULE_RESET, GFP_ATOMIC);
+ kfree(buf);
+ kfree(tablename);
+
+ return nfnetlink_unicast(skb2, net, portid);
err_fill_rule_info:
kfree_skb(skb2);
--
2.41.0
* Re: [nf-next PATCH v3 1/3] netfilter: nf_tables: Open-code audit log call in nf_tables_getrule()
2023-10-19 11:33 ` [nf-next PATCH v3 1/3] netfilter: nf_tables: Open-code audit log call in nf_tables_getrule() Phil Sutter
@ 2023-10-19 11:52 ` Florian Westphal
2023-10-19 12:29 ` Phil Sutter
0 siblings, 1 reply; 11+ messages in thread
From: Florian Westphal @ 2023-10-19 11:52 UTC (permalink / raw)
To: Phil Sutter; +Cc: Pablo Neira Ayuso, Florian Westphal, netfilter-devel
> - return nfnetlink_unicast(skb2, net, NETLINK_CB(skb).portid);
> + tablename = nla_strdup(nla[NFTA_RULE_TABLE], GFP_ATOMIC);
> + buf = kasprintf(GFP_ATOMIC, "%s:%u", tablename, nft_net->base_seq);
You can use "%.*s:%u", nla_len(nla[NFTA_RULE_TABLE]), nla_data(nla[NFTA_RULE_TABLE]) ...
here to avoid the extra strdup.
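
For illustration, the precision-limited format can be tried in userspace. This is only a sketch: `struct attr_payload` and `format_audit_entry()` are hypothetical stand-ins for the netlink attribute and the kasprintf() call, and snprintf() replaces kasprintf().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a netlink attribute payload: raw bytes plus
 * a length, with no guarantee of a trailing NUL byte. */
struct attr_payload {
	const char *data;
	int len;
};

/* Format "<table>:<seq>" straight from the payload. The "%.*s" precision
 * caps how many bytes are read, so no NUL-terminated temporary copy is
 * needed. Output also stops early at a NUL inside the payload. */
static int format_audit_entry(char *out, size_t outlen,
			      const struct attr_payload *tbl,
			      unsigned int base_seq)
{
	return snprintf(out, outlen, "%.*s:%u", tbl->len, tbl->data, base_seq);
}
```

Note that the precision argument must be an int and the string argument a char pointer, so real kernel code would probably need a cast on the nla_data() result.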
* Re: [nf-next PATCH v3 1/3] netfilter: nf_tables: Open-code audit log call in nf_tables_getrule()
2023-10-19 11:52 ` Florian Westphal
@ 2023-10-19 12:29 ` Phil Sutter
0 siblings, 0 replies; 11+ messages in thread
From: Phil Sutter @ 2023-10-19 12:29 UTC (permalink / raw)
To: Florian Westphal; +Cc: Pablo Neira Ayuso, netfilter-devel
On Thu, Oct 19, 2023 at 01:52:52PM +0200, Florian Westphal wrote:
> > - return nfnetlink_unicast(skb2, net, NETLINK_CB(skb).portid);
> > + tablename = nla_strdup(nla[NFTA_RULE_TABLE], GFP_ATOMIC);
> > + buf = kasprintf(GFP_ATOMIC, "%s:%u", tablename, nft_net->base_seq);
>
> You can use "%.*s:%u", nla_len(nla[NFTA_RULE_TABLE]), nla_data(nla[NFTA_RULE_TABLE]) ...
> here to avoid the extra strdup.
Nice, thanks!
* [nf-next PATCH v3 2/3] netfilter: nf_tables: Introduce nf_tables_getrule_single()
2023-10-19 11:33 [nf-next PATCH v3 0/3] Introduce locking for rule reset requests Phil Sutter
2023-10-19 11:33 ` [nf-next PATCH v3 1/3] netfilter: nf_tables: Open-code audit log call in nf_tables_getrule() Phil Sutter
@ 2023-10-19 11:33 ` Phil Sutter
2023-10-19 11:33 ` [nf-next PATCH v3 3/3] netfilter: nf_tables: Add locking for NFT_MSG_GETRULE_RESET requests Phil Sutter
2 siblings, 0 replies; 11+ messages in thread
From: Phil Sutter @ 2023-10-19 11:33 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel
Outsource the reply skb preparation for non-dump getrule requests into a
distinct function. Prep work for rule reset locking.
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
net/netfilter/nf_tables_api.c | 76 ++++++++++++++++++++---------------
1 file changed, 44 insertions(+), 32 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 3c65ce7a2f51..584d3b204372 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -3586,66 +3586,82 @@ static int nf_tables_dump_rules_done(struct netlink_callback *cb)
}
/* called with rcu_read_lock held */
-static int nf_tables_getrule(struct sk_buff *skb, const struct nfnl_info *info,
- const struct nlattr * const nla[])
+static struct sk_buff *
+nf_tables_getrule_single(u32 portid, const struct nfnl_info *info,
+ const struct nlattr * const nla[], bool reset)
{
- struct nftables_pernet *nft_net = nft_pernet(info->net);
struct netlink_ext_ack *extack = info->extack;
u8 genmask = nft_genmask_cur(info->net);
u8 family = info->nfmsg->nfgen_family;
- u32 portid = NETLINK_CB(skb).portid;
const struct nft_chain *chain;
const struct nft_rule *rule;
struct net *net = info->net;
struct nft_table *table;
struct sk_buff *skb2;
- bool reset = false;
- char *tablename;
- char *buf;
int err;
- if (info->nlh->nlmsg_flags & NLM_F_DUMP) {
- struct netlink_dump_control c = {
- .start= nf_tables_dump_rules_start,
- .dump = nf_tables_dump_rules,
- .done = nf_tables_dump_rules_done,
- .module = THIS_MODULE,
- .data = (void *)nla,
- };
-
- return nft_netlink_dump_start_rcu(info->sk, skb, info->nlh, &c);
- }
-
table = nft_table_lookup(net, nla[NFTA_RULE_TABLE], family, genmask, 0);
if (IS_ERR(table)) {
NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_TABLE]);
- return PTR_ERR(table);
+ return ERR_CAST(table);
}
chain = nft_chain_lookup(net, table, nla[NFTA_RULE_CHAIN], genmask);
if (IS_ERR(chain)) {
NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_CHAIN]);
- return PTR_ERR(chain);
+ return ERR_CAST(chain);
}
rule = nft_rule_lookup(chain, nla[NFTA_RULE_HANDLE]);
if (IS_ERR(rule)) {
NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_HANDLE]);
- return PTR_ERR(rule);
+ return ERR_CAST(rule);
}
skb2 = alloc_skb(NLMSG_GOODSIZE, GFP_ATOMIC);
if (!skb2)
- return -ENOMEM;
-
- if (NFNL_MSG_TYPE(info->nlh->nlmsg_type) == NFT_MSG_GETRULE_RESET)
- reset = true;
+ return ERR_PTR(-ENOMEM);
err = nf_tables_fill_rule_info(skb2, net, portid,
info->nlh->nlmsg_seq, NFT_MSG_NEWRULE, 0,
family, table, chain, rule, 0, reset);
- if (err < 0)
- goto err_fill_rule_info;
+ if (err < 0) {
+ kfree_skb(skb2);
+ return ERR_PTR(err);
+ }
+
+ return skb2;
+}
+
+static int nf_tables_getrule(struct sk_buff *skb, const struct nfnl_info *info,
+ const struct nlattr * const nla[])
+{
+ struct nftables_pernet *nft_net = nft_pernet(info->net);
+ u32 portid = NETLINK_CB(skb).portid;
+ struct net *net = info->net;
+ struct sk_buff *skb2;
+ bool reset = false;
+ char *tablename;
+ char *buf;
+
+ if (info->nlh->nlmsg_flags & NLM_F_DUMP) {
+ struct netlink_dump_control c = {
+ .start= nf_tables_dump_rules_start,
+ .dump = nf_tables_dump_rules,
+ .done = nf_tables_dump_rules_done,
+ .module = THIS_MODULE,
+ .data = (void *)nla,
+ };
+
+ return nft_netlink_dump_start_rcu(info->sk, skb, info->nlh, &c);
+ }
+
+ if (NFNL_MSG_TYPE(info->nlh->nlmsg_type) == NFT_MSG_GETRULE_RESET)
+ reset = true;
+
+ skb2 = nf_tables_getrule_single(portid, info, nla, reset);
+ if (IS_ERR(skb2))
+ return PTR_ERR(skb2);
if (!reset)
return nfnetlink_unicast(skb2, net, portid);
@@ -3658,10 +3674,6 @@ static int nf_tables_getrule(struct sk_buff *skb, const struct nfnl_info *info,
kfree(tablename);
return nfnetlink_unicast(skb2, net, portid);
-
-err_fill_rule_info:
- kfree_skb(skb2);
- return err;
}
void nf_tables_rule_destroy(const struct nft_ctx *ctx, struct nft_rule *rule)
--
2.41.0
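
The patch above switches the helper from returning an int to returning either a valid skb or an error encoded in the pointer. A minimal userspace sketch of that convention follows. The lowercase helpers are re-implementations written for this example and only mimic the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); `struct reply`, `build_reply()` and `handle_request()` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel helpers: small negative errno
 * values are stored in the top page of the address space. */
static inline void *err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct reply { int len; };	/* hypothetical stand-in for the reply skb */

/* Returns a valid object or an encoded errno, never NULL. */
static struct reply *build_reply(int have_rule)
{
	struct reply *r;

	if (!have_rule)
		return err_ptr(-ENOENT);
	r = malloc(sizeof(*r));
	if (!r)
		return err_ptr(-ENOMEM);
	r->len = 0;
	return r;
}

/* Caller side: unwrap the error code or use and release the object. */
static int handle_request(int have_rule)
{
	struct reply *r = build_reply(have_rule);

	if (is_err(r))
		return (int)ptr_err(r);
	free(r);
	return 0;
}
```

The benefit for the refactoring is that the helper cleans up after itself on failure, so the caller no longer needs an error label of its own.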
* [nf-next PATCH v3 3/3] netfilter: nf_tables: Add locking for NFT_MSG_GETRULE_RESET requests
2023-10-19 11:33 [nf-next PATCH v3 0/3] Introduce locking for rule reset requests Phil Sutter
2023-10-19 11:33 ` [nf-next PATCH v3 1/3] netfilter: nf_tables: Open-code audit log call in nf_tables_getrule() Phil Sutter
2023-10-19 11:33 ` [nf-next PATCH v3 2/3] netfilter: nf_tables: Introduce nf_tables_getrule_single() Phil Sutter
@ 2023-10-19 11:33 ` Phil Sutter
2023-10-19 11:38 ` Pablo Neira Ayuso
2 siblings, 1 reply; 11+ messages in thread
From: Phil Sutter @ 2023-10-19 11:33 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel
Rule reset is not concurrency-safe per-se, so multiple CPUs may reset
the same rule at the same time. At least counter and quota expressions
will suffer from value underruns in this case.
Prevent this by introducing dedicated locking callbacks for nfnetlink
and the asynchronous dump handling to serialize access.
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
Changes since v2:
- Keep local variable 'nft_net' in nf_tables_getrule_reset()
- No need for local variable 'family' in same function (used only once
after all the churn)
---
net/netfilter/nf_tables_api.c | 74 ++++++++++++++++++++++++++++-------
1 file changed, 60 insertions(+), 14 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 584d3b204372..fbb688c9903c 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -3551,6 +3551,19 @@ static int nf_tables_dump_rules(struct sk_buff *skb,
return skb->len;
}
+static int nf_tables_dumpreset_rules(struct sk_buff *skb,
+ struct netlink_callback *cb)
+{
+ struct nftables_pernet *nft_net = nft_pernet(sock_net(skb->sk));
+ int ret;
+
+ mutex_lock(&nft_net->commit_mutex);
+ ret = nf_tables_dump_rules(skb, cb);
+ mutex_unlock(&nft_net->commit_mutex);
+
+ return ret;
+}
+
static int nf_tables_dump_rules_start(struct netlink_callback *cb)
{
struct nft_rule_dump_ctx *ctx = (void *)cb->ctx;
@@ -3570,12 +3583,18 @@ static int nf_tables_dump_rules_start(struct netlink_callback *cb)
return -ENOMEM;
}
}
- if (NFNL_MSG_TYPE(cb->nlh->nlmsg_type) == NFT_MSG_GETRULE_RESET)
- ctx->reset = true;
-
return 0;
}
+static int nf_tables_dumpreset_rules_start(struct netlink_callback *cb)
+{
+ struct nft_rule_dump_ctx *ctx = (void *)cb->ctx;
+
+ ctx->reset = true;
+
+ return nf_tables_dump_rules_start(cb);
+}
+
static int nf_tables_dump_rules_done(struct netlink_callback *cb)
{
struct nft_rule_dump_ctx *ctx = (void *)cb->ctx;
@@ -3636,13 +3655,9 @@ nf_tables_getrule_single(u32 portid, const struct nfnl_info *info,
static int nf_tables_getrule(struct sk_buff *skb, const struct nfnl_info *info,
const struct nlattr * const nla[])
{
- struct nftables_pernet *nft_net = nft_pernet(info->net);
u32 portid = NETLINK_CB(skb).portid;
struct net *net = info->net;
struct sk_buff *skb2;
- bool reset = false;
- char *tablename;
- char *buf;
if (info->nlh->nlmsg_flags & NLM_F_DUMP) {
struct netlink_dump_control c = {
@@ -3656,15 +3671,46 @@ static int nf_tables_getrule(struct sk_buff *skb, const struct nfnl_info *info,
return nft_netlink_dump_start_rcu(info->sk, skb, info->nlh, &c);
}
- if (NFNL_MSG_TYPE(info->nlh->nlmsg_type) == NFT_MSG_GETRULE_RESET)
- reset = true;
-
- skb2 = nf_tables_getrule_single(portid, info, nla, reset);
+ skb2 = nf_tables_getrule_single(portid, info, nla, false);
if (IS_ERR(skb2))
return PTR_ERR(skb2);
- if (!reset)
- return nfnetlink_unicast(skb2, net, portid);
+ return nfnetlink_unicast(skb2, net, portid);
+}
+
+static int nf_tables_getrule_reset(struct sk_buff *skb,
+ const struct nfnl_info *info,
+ const struct nlattr * const nla[])
+{
+ struct nftables_pernet *nft_net = nft_pernet(info->net);
+ u32 portid = NETLINK_CB(skb).portid;
+ struct net *net = info->net;
+ char *tablename, *buf;
+ struct sk_buff *skb2;
+
+ if (info->nlh->nlmsg_flags & NLM_F_DUMP) {
+ struct netlink_dump_control c = {
+ .start= nf_tables_dumpreset_rules_start,
+ .dump = nf_tables_dumpreset_rules,
+ .done = nf_tables_dump_rules_done,
+ .module = THIS_MODULE,
+ .data = (void *)nla,
+ };
+
+ return nft_netlink_dump_start_rcu(info->sk, skb, info->nlh, &c);
+ }
+
+ if (!try_module_get(THIS_MODULE))
+ return -EINVAL;
+ rcu_read_unlock();
+ mutex_lock(&nft_net->commit_mutex);
+ skb2 = nf_tables_getrule_single(portid, info, nla, true);
+ mutex_unlock(&nft_net->commit_mutex);
+ rcu_read_lock();
+ module_put(THIS_MODULE);
+
+ if (IS_ERR(skb2))
+ return PTR_ERR(skb2);
tablename = nla_strdup(nla[NFTA_RULE_TABLE], GFP_ATOMIC);
buf = kasprintf(GFP_ATOMIC, "%s:%u", tablename, nft_net->base_seq);
@@ -9003,7 +9049,7 @@ static const struct nfnl_callback nf_tables_cb[NFT_MSG_MAX] = {
.policy = nft_rule_policy,
},
[NFT_MSG_GETRULE_RESET] = {
- .call = nf_tables_getrule,
+ .call = nf_tables_getrule_reset,
.type = NFNL_CB_RCU,
.attr_count = NFTA_RULE_MAX,
.policy = nft_rule_policy,
--
2.41.0
* Re: [nf-next PATCH v3 3/3] netfilter: nf_tables: Add locking for NFT_MSG_GETRULE_RESET requests
2023-10-19 11:33 ` [nf-next PATCH v3 3/3] netfilter: nf_tables: Add locking for NFT_MSG_GETRULE_RESET requests Phil Sutter
@ 2023-10-19 11:38 ` Pablo Neira Ayuso
2023-10-19 11:59 ` Florian Westphal
2023-10-19 12:33 ` Phil Sutter
0 siblings, 2 replies; 11+ messages in thread
From: Pablo Neira Ayuso @ 2023-10-19 11:38 UTC (permalink / raw)
To: Phil Sutter; +Cc: Florian Westphal, netfilter-devel
On Thu, Oct 19, 2023 at 01:33:47PM +0200, Phil Sutter wrote:
> Rule reset is not concurrency-safe per-se, so multiple CPUs may reset
> the same rule at the same time. At least counter and quota expressions
> will suffer from value underruns in this case.
>
> Prevent this by introducing dedicated locking callbacks for nfnetlink
> and the asynchronous dump handling to serialize access.
>
> Signed-off-by: Phil Sutter <phil@nwl.cc>
> ---
> Changes since v2:
> - Keep local variable 'nft_net' in nf_tables_getrule_reset()
> - No need for local variable 'family' in same function (used only once
> after all the churn)
> ---
> net/netfilter/nf_tables_api.c | 74 ++++++++++++++++++++++++++++-------
> 1 file changed, 60 insertions(+), 14 deletions(-)
>
> diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
> index 584d3b204372..fbb688c9903c 100644
> --- a/net/netfilter/nf_tables_api.c
> +++ b/net/netfilter/nf_tables_api.c
[...]
> +static int nf_tables_dumpreset_rules(struct sk_buff *skb,
> + struct netlink_callback *cb)
> +{
> + struct nftables_pernet *nft_net = nft_pernet(sock_net(skb->sk));
> + int ret;
> +
> + mutex_lock(&nft_net->commit_mutex);
> + ret = nf_tables_dump_rules(skb, cb);
> + mutex_unlock(&nft_net->commit_mutex);
NACK.
This just mitigates the problem we are discussing, when there is an
interference with an ongoing transaction.
* Re: [nf-next PATCH v3 3/3] netfilter: nf_tables: Add locking for NFT_MSG_GETRULE_RESET requests
2023-10-19 11:38 ` Pablo Neira Ayuso
@ 2023-10-19 11:59 ` Florian Westphal
2023-10-19 13:38 ` Pablo Neira Ayuso
2023-10-19 12:33 ` Phil Sutter
1 sibling, 1 reply; 11+ messages in thread
From: Florian Westphal @ 2023-10-19 11:59 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: Phil Sutter, Florian Westphal, netfilter-devel
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Thu, Oct 19, 2023 at 01:33:47PM +0200, Phil Sutter wrote:
> > Rule reset is not concurrency-safe per-se, so multiple CPUs may reset
> > the same rule at the same time. At least counter and quota expressions
> > will suffer from value underruns in this case.
> >
> > Prevent this by introducing dedicated locking callbacks for nfnetlink
> > and the asynchronous dump handling to serialize access.
> >
> > Signed-off-by: Phil Sutter <phil@nwl.cc>
> > ---
> > Changes since v2:
> > - Keep local variable 'nft_net' in nf_tables_getrule_reset()
> > - No need for local variable 'family' in same function (used only once
> > after all the churn)
> > ---
> > net/netfilter/nf_tables_api.c | 74 ++++++++++++++++++++++++++++-------
> > 1 file changed, 60 insertions(+), 14 deletions(-)
> >
> > diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
> > index 584d3b204372..fbb688c9903c 100644
> > --- a/net/netfilter/nf_tables_api.c
> > +++ b/net/netfilter/nf_tables_api.c
> [...]
> > +static int nf_tables_dumpreset_rules(struct sk_buff *skb,
> > + struct netlink_callback *cb)
> > +{
> > + struct nftables_pernet *nft_net = nft_pernet(sock_net(skb->sk));
> > + int ret;
> > +
> > + mutex_lock(&nft_net->commit_mutex);
> > + ret = nf_tables_dump_rules(skb, cb);
> > + mutex_unlock(&nft_net->commit_mutex);
>
> NACK.
>
> This just mitigates the problem we are discussing, when there is an
> interference with an ongoing transaction.
It resolves corrupting the internal state when two parallel resets
are done.
If you believe that we have to make the entire dump consistent even
when the reset flag is given, I see no choice but to completely remove
reset-from-dump support.
What is your suggested solution?
AFAICS, with this series, userspace can, in theory, merge partial
dumps into consistent output by manually collecting the partial
dumps.
That said, I think it's not very realistic that userspace will
get this right.
That leaves: userspace does a dump (without reset) and, if that
was consistent, walks it and does a per-handle get-with-reset request
for each rule, then updates the (not-yet-printed) dump with the
newly obtained stateful results.
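
The last approach can be sketched against an in-memory model. Everything here is an assumption made for illustration: `struct ruleset` stands in for the kernel side, `dump_rules()` and `get_rule_reset()` stand in for the netlink dump and the per-handle reset request, and a generation ID comparison stands in for whatever consistency check a real client would use (for example the NLM_F_DUMP_INTR flag).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RULES 8

struct rule { uint64_t handle; uint64_t packets; };

/* Hypothetical in-memory stand-in for the kernel ruleset. */
struct ruleset {
	struct rule rules[MAX_RULES];
	size_t n;
	unsigned int genid;
};

/* Plain dump without reset: copy all rules, report the generation seen. */
static size_t dump_rules(const struct ruleset *rs, struct rule *out,
			 unsigned int *genid)
{
	size_t i;

	*genid = rs->genid;
	for (i = 0; i < rs->n; i++)
		out[i] = rs->rules[i];
	return rs->n;
}

/* Per-handle get-with-reset: return the counter value and zero it.
 * Returns -1 if the rule no longer exists. */
static int get_rule_reset(struct ruleset *rs, uint64_t handle,
			  uint64_t *packets)
{
	size_t i;

	for (i = 0; i < rs->n; i++) {
		if (rs->rules[i].handle == handle) {
			*packets = rs->rules[i].packets;
			rs->rules[i].packets = 0;
			return 0;
		}
	}
	return -1;
}

/* Dump first, then reset rule by rule and patch the stateful values
 * into the not-yet-printed dump. */
static int dump_then_reset(struct ruleset *rs, struct rule *out, size_t *n)
{
	unsigned int genid;
	uint64_t fresh;
	size_t i;

	*n = dump_rules(rs, out, &genid);
	if (genid != rs->genid)
		return -1;	/* inconsistent dump, caller should retry */

	for (i = 0; i < *n; i++) {
		if (get_rule_reset(rs, out[i].handle, &fresh) == 0)
			out[i].packets = fresh;
	}
	return 0;
}
```

The sketch ignores rules that vanish between the dump and the reset pass. A real client would have to decide whether to drop them from the output or restart.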
* Re: [nf-next PATCH v3 3/3] netfilter: nf_tables: Add locking for NFT_MSG_GETRULE_RESET requests
2023-10-19 11:59 ` Florian Westphal
@ 2023-10-19 13:38 ` Pablo Neira Ayuso
0 siblings, 0 replies; 11+ messages in thread
From: Pablo Neira Ayuso @ 2023-10-19 13:38 UTC (permalink / raw)
To: Florian Westphal; +Cc: Phil Sutter, netfilter-devel
On Thu, Oct 19, 2023 at 01:59:09PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > On Thu, Oct 19, 2023 at 01:33:47PM +0200, Phil Sutter wrote:
> > > Rule reset is not concurrency-safe per-se, so multiple CPUs may reset
> > > the same rule at the same time. At least counter and quota expressions
> > > will suffer from value underruns in this case.
> > >
> > > Prevent this by introducing dedicated locking callbacks for nfnetlink
> > > and the asynchronous dump handling to serialize access.
> > >
> > > Signed-off-by: Phil Sutter <phil@nwl.cc>
> > > ---
> > > Changes since v2:
> > > - Keep local variable 'nft_net' in nf_tables_getrule_reset()
> > > - No need for local variable 'family' in same function (used only once
> > > after all the churn)
> > > ---
> > > net/netfilter/nf_tables_api.c | 74 ++++++++++++++++++++++++++++-------
> > > 1 file changed, 60 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
> > > index 584d3b204372..fbb688c9903c 100644
> > > --- a/net/netfilter/nf_tables_api.c
> > > +++ b/net/netfilter/nf_tables_api.c
> > [...]
> > > +static int nf_tables_dumpreset_rules(struct sk_buff *skb,
> > > + struct netlink_callback *cb)
> > > +{
> > > + struct nftables_pernet *nft_net = nft_pernet(sock_net(skb->sk));
> > > + int ret;
> > > +
> > > + mutex_lock(&nft_net->commit_mutex);
> > > + ret = nf_tables_dump_rules(skb, cb);
> > > + mutex_unlock(&nft_net->commit_mutex);
> >
> > NACK.
> >
> > This just mitigates the problem we are discussing, when there is an
> > interference with an ongoing transaction.
>
> It resolves corrupting the internal state when two parallel resets
> are done.
>
> If you believe that we have to make entire dump consistent even
> when reset flag is given I see no choice but to completely remove
> reset-from-dump support.
It is just the commit_mutex in this section that I don't like. But OK,
this operation and scenario should be rare, and adding yet another lock
for a rare case might make no sense.
> What is your suggested solution?
>
> AFAICS, with this series, userspace can, in theory, merge partial
> dumps into consistent output by manually collecting the partial
> dumps.
>
> That said, I think it's not very realistic that userspace will
> get this right.
I think this is possible; I would like to see userspace code (just a
simple sketch) that shows how complicated this is.
> That leaves: userspace does a dump (without reset), and if that
> was consistent walk it and do a per-handle get-with-reset request
> for each rule, then update the (not-yet-printed) dump with the
> newly obtained stateful results.
Counter expressions in rules are probably the most complicated to
restore. Named objects should be rather easy.
Another possibility is to explore the transaction control plane path and
use NLM_F_ECHO to get back the original counters, but I would not
really follow this path.
I have been considering netlink dumps based on ruleset snapshots. The
idea is: make a snapshot of the ruleset at the beginning of the
netlink dump. This requires a new class of objects that can be used as
a container to store what is going to be dumped. The snapshot should
happen right before initiating the netlink dump, while holding the
commit_mutex, and it would take memory to store this temporary ruleset
snapshot. The snapshot would be done per dump request.
This ensures no EINTR can happen. The ruleset at the end of the
netlink dump will be consistent, though it might be slightly behind the
real ruleset if updates have happened. Using the generation ID, it
should be possible to at least report this to the user.
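
A rough userspace model of the snapshot idea follows. It is a sketch under assumptions: `commit_lock` stands in for the commit_mutex, and `struct snapshot`, `snapshot_take()`, `snapshot_is_stale()` and `snapshot_free()` are names invented for this example, not existing kernel code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct rule { uint64_t handle; uint64_t packets; };

/* Hypothetical live ruleset with a generation counter. */
struct ruleset { struct rule *rules; size_t n; unsigned int genid; };

/* Per-dump container holding a private copy of the rules. */
struct snapshot { struct rule *rules; size_t n; unsigned int genid; };

/* Stand-in for the commit_mutex. */
static pthread_mutex_t commit_lock = PTHREAD_MUTEX_INITIALIZER;

/* Copy the ruleset under the lock, remembering its generation ID.
 * The dump then reads only the copy, so it cannot be interrupted. */
static int snapshot_take(const struct ruleset *rs, struct snapshot *snap)
{
	int err = 0;

	pthread_mutex_lock(&commit_lock);
	snap->n = rs->n;
	snap->genid = rs->genid;
	snap->rules = malloc((rs->n ? rs->n : 1) * sizeof(*snap->rules));
	if (!snap->rules)
		err = -1;
	else if (rs->n)
		memcpy(snap->rules, rs->rules, rs->n * sizeof(*snap->rules));
	pthread_mutex_unlock(&commit_lock);

	return err;
}

/* After the dump: tell the user if the live ruleset has moved on. */
static int snapshot_is_stale(const struct ruleset *rs,
			     const struct snapshot *snap)
{
	return rs->genid != snap->genid;
}

static void snapshot_free(struct snapshot *snap)
{
	free(snap->rules);
	snap->rules = NULL;
	snap->n = 0;
}
```

The memory cost mentioned above is visible here: each dump request holds a full copy until snapshot_free() runs.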
* Re: [nf-next PATCH v3 3/3] netfilter: nf_tables: Add locking for NFT_MSG_GETRULE_RESET requests
2023-10-19 11:38 ` Pablo Neira Ayuso
2023-10-19 11:59 ` Florian Westphal
@ 2023-10-19 12:33 ` Phil Sutter
2023-10-19 13:04 ` Pablo Neira Ayuso
1 sibling, 1 reply; 11+ messages in thread
From: Phil Sutter @ 2023-10-19 12:33 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel
On Thu, Oct 19, 2023 at 01:38:04PM +0200, Pablo Neira Ayuso wrote:
> On Thu, Oct 19, 2023 at 01:33:47PM +0200, Phil Sutter wrote:
> > Rule reset is not concurrency-safe per-se, so multiple CPUs may reset
> > the same rule at the same time. At least counter and quota expressions
> > will suffer from value underruns in this case.
> >
> > Prevent this by introducing dedicated locking callbacks for nfnetlink
> > and the asynchronous dump handling to serialize access.
> >
> > Signed-off-by: Phil Sutter <phil@nwl.cc>
> > ---
> > Changes since v2:
> > - Keep local variable 'nft_net' in nf_tables_getrule_reset()
> > - No need for local variable 'family' in same function (used only once
> > after all the churn)
> > ---
> > net/netfilter/nf_tables_api.c | 74 ++++++++++++++++++++++++++++-------
> > 1 file changed, 60 insertions(+), 14 deletions(-)
> >
> > diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
> > index 584d3b204372..fbb688c9903c 100644
> > --- a/net/netfilter/nf_tables_api.c
> > +++ b/net/netfilter/nf_tables_api.c
> [...]
> > +static int nf_tables_dumpreset_rules(struct sk_buff *skb,
> > + struct netlink_callback *cb)
> > +{
> > + struct nftables_pernet *nft_net = nft_pernet(sock_net(skb->sk));
> > + int ret;
> > +
> > + mutex_lock(&nft_net->commit_mutex);
> > + ret = nf_tables_dump_rules(skb, cb);
> > + mutex_unlock(&nft_net->commit_mutex);
>
> NACK.
>
> This just mitigates the problem we are discussing, when there is an
> interference with an ongoing transaction.
This fixes user space's ability to underrun counters and quotas, which
exists because expressions' dump callbacks are not concurrency-safe in
reset mode.
What you're concerned with is a different issue.
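
The underrun can be shown with plain arithmetic. The sketch below assumes a reset works in two steps, reading the current value and then subtracting what was read. `struct counter` and all functions are hypothetical userspace stand-ins, and the racy interleaving is replayed sequentially so the result is deterministic.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

struct counter { uint64_t packets; };

/* Stand-in for the lock that serializes reset requests. */
static pthread_mutex_t reset_lock = PTHREAD_MUTEX_INITIALIZER;

/* Step 1 of a reset: read the value that will be reported. */
static uint64_t counter_fetch(const struct counter *c)
{
	return c->packets;
}

/* Step 2 of a reset: subtract what was reported. */
static void counter_sub(struct counter *c, uint64_t seen)
{
	c->packets -= seen;
}

/* Replays the bad interleaving of two unserialized resets: both fetch
 * before either subtracts, so the value is subtracted twice and the
 * unsigned counter wraps below zero. */
static uint64_t racy_double_reset(struct counter *c)
{
	uint64_t a = counter_fetch(c);
	uint64_t b = counter_fetch(c);

	counter_sub(c, a);
	counter_sub(c, b);
	return c->packets;
}

/* Serialized reset: fetch and subtract form one critical section, so a
 * second caller can only observe the already-reset value. */
static uint64_t locked_reset(struct counter *c)
{
	uint64_t seen;

	pthread_mutex_lock(&reset_lock);
	seen = counter_fetch(c);
	counter_sub(c, seen);
	pthread_mutex_unlock(&reset_lock);

	return seen;
}
```

With the lock held around both steps, two resets of a counter at 10 report 10 and 0 instead of leaving a wrapped value behind.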
* Re: [nf-next PATCH v3 3/3] netfilter: nf_tables: Add locking for NFT_MSG_GETRULE_RESET requests
2023-10-19 12:33 ` Phil Sutter
@ 2023-10-19 13:04 ` Pablo Neira Ayuso
0 siblings, 0 replies; 11+ messages in thread
From: Pablo Neira Ayuso @ 2023-10-19 13:04 UTC (permalink / raw)
To: Phil Sutter, Florian Westphal, netfilter-devel
On Thu, Oct 19, 2023 at 02:33:50PM +0200, Phil Sutter wrote:
> On Thu, Oct 19, 2023 at 01:38:04PM +0200, Pablo Neira Ayuso wrote:
> > On Thu, Oct 19, 2023 at 01:33:47PM +0200, Phil Sutter wrote:
> > > Rule reset is not concurrency-safe per-se, so multiple CPUs may reset
> > > the same rule at the same time. At least counter and quota expressions
> > > will suffer from value underruns in this case.
> > >
> > > Prevent this by introducing dedicated locking callbacks for nfnetlink
> > > and the asynchronous dump handling to serialize access.
> > >
> > > Signed-off-by: Phil Sutter <phil@nwl.cc>
> > > ---
> > > Changes since v2:
> > > - Keep local variable 'nft_net' in nf_tables_getrule_reset()
> > > - No need for local variable 'family' in same function (used only once
> > > after all the churn)
> > > ---
> > > net/netfilter/nf_tables_api.c | 74 ++++++++++++++++++++++++++++-------
> > > 1 file changed, 60 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
> > > index 584d3b204372..fbb688c9903c 100644
> > > --- a/net/netfilter/nf_tables_api.c
> > > +++ b/net/netfilter/nf_tables_api.c
> > [...]
> > > +static int nf_tables_dumpreset_rules(struct sk_buff *skb,
> > > + struct netlink_callback *cb)
> > > +{
> > > + struct nftables_pernet *nft_net = nft_pernet(sock_net(skb->sk));
> > > + int ret;
> > > +
> > > + mutex_lock(&nft_net->commit_mutex);
> > > + ret = nf_tables_dump_rules(skb, cb);
> > > + mutex_unlock(&nft_net->commit_mutex);
> >
> > NACK.
> >
> > This just mitigates the problem we are discussing, when there is an
> > interference with an ongoing transaction.
>
> This fixes for user space's ability to underrun counters and quotas
> because expressions' dump callbacks are not concurrency safe in reset
> mode.
>
> What you're concerned with is a different issue.
I'd suggest you add a comment to this code; feel free to improve the
wording:
/* Mutex is held to prevent two concurrent dump-and-reset calls
 * from underrunning counters and quotas. The commit_mutex is used for
 * lack of a better lock; this is not the transaction path.
 */
mutex_lock(&nft_net->commit_mutex);
ret = nf_tables_dump_rules(skb, cb);
mutex_unlock(&nft_net->commit_mutex);