* [PATCH nf-next 0/7] netfilter: nf_tables: avoid PROVE_RCU_LIST splats
@ 2024-10-25 13:32 Florian Westphal
2024-10-25 13:32 ` [PATCH nf-next 1/7] netfilter: nf_tables: avoid false-positive lockdep splat on rule deletion Florian Westphal
` (6 more replies)
0 siblings, 7 replies; 13+ messages in thread
From: Florian Westphal @ 2024-10-25 13:32 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
Mathieu reported a lockdep splat on rule deletion with
CONFIG_RCU_LIST=y.
Unfortunately there are many more errors, and not all are false
positives.
First patches pass lockdep_commit_lock_is_held() to the rcu
list traversal macro so that those splats are avoided.
The last two patches are real code change as opposed to
'pass the transaction mutex to relax rcu check':
Those two lists are not protected by transaction mutex so
could be altered in parallel.
Aside from context these patches could be applied in any order.
This targets nf-next because these are long-standing issues so
it seems wrong to fix it this late in the release cycle.
Florian Westphal (7):
netfilter: nf_tables: avoid false-positive lockdep splat on rule
deletion
netfilter: nf_tables: avoid false-positive lockdep splats with sets
netfilter: nf_tables: avoid false-positive lockdep splats with
flowtables
netfilter: nf_tables: avoid false-positive lockdep splats in set
walker
netfilter: nf_tables: avoid false-positive lockdep splats with
basechain hook
netfilter: nf_tables: must hold rcu read lock while iterating
expression type list
netfilter: nf_tables: must hold rcu read lock while iterating object
type list
include/net/netfilter/nf_tables.h | 3 +-
net/netfilter/nf_tables_api.c | 110 ++++++++++++++++++------------
net/netfilter/nft_flow_offload.c | 4 +-
net/netfilter/nft_set_bitmap.c | 10 +--
net/netfilter/nft_set_hash.c | 3 +-
5 files changed, 79 insertions(+), 51 deletions(-)
--
2.45.2
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH nf-next 1/7] netfilter: nf_tables: avoid false-positive lockdep splat on rule deletion
2024-10-25 13:32 [PATCH nf-next 0/7] netfilter: nf_tables: avoid PROVE_RCU_LIST splats Florian Westphal
@ 2024-10-25 13:32 ` Florian Westphal
2024-10-25 15:09 ` Matthieu Baerts
2024-10-25 13:32 ` [PATCH nf-next 2/7] netfilter: nf_tables: avoid false-positive lockdep splats with sets Florian Westphal
` (5 subsequent siblings)
6 siblings, 1 reply; 13+ messages in thread
From: Florian Westphal @ 2024-10-25 13:32 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal, Matthieu Baerts
On rule delete we get:
WARNING: suspicious RCU usage
net/netfilter/nf_tables_api.c:3420 RCU-list traversed in non-reader section!!
1 lock held by iptables/134:
#0: ffff888008c4fcc8 (&nft_net->commit_mutex){+.+.}-{3:3}, at: nf_tables_valid_genid (include/linux/jiffies.h:101) nf_tables
Code is fine, no other CPU can change the list because we're holding
transaction mutex.
Pass the needed lockdep annotation to the iterator and fix
two comments for functions that are no longer restricted to rcu-only
context.
This is enough to resolve rule delete, but there are several other
missing annotations, added in followup-patches.
Fixes: 28875945ba98 ("rcu: Add support for consolidated-RCU reader checking")
Reported-by: Matthieu Baerts <matttbe@kernel.org>
Closes: https://lore.kernel.org/netfilter-devel/da27f17f-3145-47af-ad0f-7fd2a823623e@kernel.org/
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nf_tables_api.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 30331688301e..80c285ac7e07 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -3411,13 +3411,15 @@ void nft_expr_destroy(const struct nft_ctx *ctx, struct nft_expr *expr)
* Rules
*/
-static struct nft_rule *__nft_rule_lookup(const struct nft_chain *chain,
+static struct nft_rule *__nft_rule_lookup(const struct net *net,
+ const struct nft_chain *chain,
u64 handle)
{
struct nft_rule *rule;
// FIXME: this sucks
- list_for_each_entry_rcu(rule, &chain->rules, list) {
+ list_for_each_entry_rcu(rule, &chain->rules, list,
+ lockdep_commit_lock_is_held(net)) {
if (handle == rule->handle)
return rule;
}
@@ -3425,13 +3427,14 @@ static struct nft_rule *__nft_rule_lookup(const struct nft_chain *chain,
return ERR_PTR(-ENOENT);
}
-static struct nft_rule *nft_rule_lookup(const struct nft_chain *chain,
+static struct nft_rule *nft_rule_lookup(const struct net *net,
+ const struct nft_chain *chain,
const struct nlattr *nla)
{
if (nla == NULL)
return ERR_PTR(-EINVAL);
- return __nft_rule_lookup(chain, be64_to_cpu(nla_get_be64(nla)));
+ return __nft_rule_lookup(net, chain, be64_to_cpu(nla_get_be64(nla)));
}
static const struct nla_policy nft_rule_policy[NFTA_RULE_MAX + 1] = {
@@ -3732,7 +3735,7 @@ static int nf_tables_dump_rules_done(struct netlink_callback *cb)
return 0;
}
-/* called with rcu_read_lock held */
+/* Caller must hold rcu read lock or transaction mutex */
static struct sk_buff *
nf_tables_getrule_single(u32 portid, const struct nfnl_info *info,
const struct nlattr * const nla[], bool reset)
@@ -3759,7 +3762,7 @@ nf_tables_getrule_single(u32 portid, const struct nfnl_info *info,
return ERR_CAST(chain);
}
- rule = nft_rule_lookup(chain, nla[NFTA_RULE_HANDLE]);
+ rule = nft_rule_lookup(net, chain, nla[NFTA_RULE_HANDLE]);
if (IS_ERR(rule)) {
NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_HANDLE]);
return ERR_CAST(rule);
@@ -4057,7 +4060,7 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info,
if (nla[NFTA_RULE_HANDLE]) {
handle = be64_to_cpu(nla_get_be64(nla[NFTA_RULE_HANDLE]));
- rule = __nft_rule_lookup(chain, handle);
+ rule = __nft_rule_lookup(net, chain, handle);
if (IS_ERR(rule)) {
NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_HANDLE]);
return PTR_ERR(rule);
@@ -4079,7 +4082,7 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info,
if (nla[NFTA_RULE_POSITION]) {
pos_handle = be64_to_cpu(nla_get_be64(nla[NFTA_RULE_POSITION]));
- old_rule = __nft_rule_lookup(chain, pos_handle);
+ old_rule = __nft_rule_lookup(net, chain, pos_handle);
if (IS_ERR(old_rule)) {
NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_POSITION]);
return PTR_ERR(old_rule);
@@ -4296,7 +4299,7 @@ static int nf_tables_delrule(struct sk_buff *skb, const struct nfnl_info *info,
if (chain) {
if (nla[NFTA_RULE_HANDLE]) {
- rule = nft_rule_lookup(chain, nla[NFTA_RULE_HANDLE]);
+ rule = nft_rule_lookup(info->net, chain, nla[NFTA_RULE_HANDLE]);
if (IS_ERR(rule)) {
if (PTR_ERR(rule) == -ENOENT &&
NFNL_MSG_TYPE(info->nlh->nlmsg_type) == NFT_MSG_DESTROYRULE)
@@ -8101,7 +8104,7 @@ static int nf_tables_dump_obj_done(struct netlink_callback *cb)
return 0;
}
-/* called with rcu_read_lock held */
+/* Caller must hold rcu read lock or transaction mutex */
static struct sk_buff *
nf_tables_getobj_single(u32 portid, const struct nfnl_info *info,
const struct nlattr * const nla[], bool reset)
--
2.45.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH nf-next 2/7] netfilter: nf_tables: avoid false-positive lockdep splats with sets
2024-10-25 13:32 [PATCH nf-next 0/7] netfilter: nf_tables: avoid PROVE_RCU_LIST splats Florian Westphal
2024-10-25 13:32 ` [PATCH nf-next 1/7] netfilter: nf_tables: avoid false-positive lockdep splat on rule deletion Florian Westphal
@ 2024-10-25 13:32 ` Florian Westphal
2024-10-25 13:32 ` [PATCH nf-next 3/7] netfilter: nf_tables: avoid false-positive lockdep splats with flowtables Florian Westphal
` (4 subsequent siblings)
6 siblings, 0 replies; 13+ messages in thread
From: Florian Westphal @ 2024-10-25 13:32 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
Same as previous patch. All set handling functions here can be called
with transaction mutex held (but not the rcu read lock).
The transaction mutex prevents concurrent add/delete, so this is fine.
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nf_tables_api.c | 27 ++++++++++++++++-----------
1 file changed, 16 insertions(+), 11 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 80c285ac7e07..a51731d76401 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -3986,7 +3986,8 @@ int nft_set_catchall_validate(const struct nft_ctx *ctx, struct nft_set *set)
struct nft_set_ext *ext;
int ret = 0;
- list_for_each_entry_rcu(catchall, &set->catchall_list, list) {
+ list_for_each_entry_rcu(catchall, &set->catchall_list, list,
+ lockdep_commit_lock_is_held(ctx->net)) {
ext = nft_set_elem_ext(set, catchall->elem);
if (!nft_set_elem_active(ext, dummy_iter.genmask))
continue;
@@ -4459,7 +4460,8 @@ static const struct nla_policy nft_set_desc_policy[NFTA_SET_DESC_MAX + 1] = {
[NFTA_SET_DESC_CONCAT] = NLA_POLICY_NESTED_ARRAY(nft_concat_policy),
};
-static struct nft_set *nft_set_lookup(const struct nft_table *table,
+static struct nft_set *nft_set_lookup(const struct net *net,
+ const struct nft_table *table,
const struct nlattr *nla, u8 genmask)
{
struct nft_set *set;
@@ -4467,7 +4469,8 @@ static struct nft_set *nft_set_lookup(const struct nft_table *table,
if (nla == NULL)
return ERR_PTR(-EINVAL);
- list_for_each_entry_rcu(set, &table->sets, list) {
+ list_for_each_entry_rcu(set, &table->sets, list,
+ lockdep_commit_lock_is_held(net)) {
if (!nla_strcmp(nla, set->name) &&
nft_active_genmask(set, genmask))
return set;
@@ -4517,7 +4520,7 @@ struct nft_set *nft_set_lookup_global(const struct net *net,
{
struct nft_set *set;
- set = nft_set_lookup(table, nla_set_name, genmask);
+ set = nft_set_lookup(net, table, nla_set_name, genmask);
if (IS_ERR(set)) {
if (!nla_set_id)
return set;
@@ -4893,7 +4896,7 @@ static int nf_tables_getset(struct sk_buff *skb, const struct nfnl_info *info,
if (!nla[NFTA_SET_TABLE])
return -EINVAL;
- set = nft_set_lookup(table, nla[NFTA_SET_NAME], genmask);
+ set = nft_set_lookup(net, table, nla[NFTA_SET_NAME], genmask);
if (IS_ERR(set)) {
NL_SET_BAD_ATTR(extack, nla[NFTA_SET_NAME]);
return PTR_ERR(set);
@@ -5229,7 +5232,7 @@ static int nf_tables_newset(struct sk_buff *skb, const struct nfnl_info *info,
nft_ctx_init(&ctx, net, skb, info->nlh, family, table, NULL, nla);
- set = nft_set_lookup(table, nla[NFTA_SET_NAME], genmask);
+ set = nft_set_lookup(net, table, nla[NFTA_SET_NAME], genmask);
if (IS_ERR(set)) {
if (PTR_ERR(set) != -ENOENT) {
NL_SET_BAD_ATTR(extack, nla[NFTA_SET_NAME]);
@@ -5431,7 +5434,7 @@ static int nf_tables_delset(struct sk_buff *skb, const struct nfnl_info *info,
set = nft_set_lookup_byhandle(table, attr, genmask);
} else {
attr = nla[NFTA_SET_NAME];
- set = nft_set_lookup(table, attr, genmask);
+ set = nft_set_lookup(net, table, attr, genmask);
}
if (IS_ERR(set)) {
@@ -5495,7 +5498,8 @@ static int nft_set_catchall_bind_check(const struct nft_ctx *ctx,
struct nft_set_ext *ext;
int ret = 0;
- list_for_each_entry_rcu(catchall, &set->catchall_list, list) {
+ list_for_each_entry_rcu(catchall, &set->catchall_list, list,
+ lockdep_commit_lock_is_held(ctx->net)) {
ext = nft_set_elem_ext(set, catchall->elem);
if (!nft_set_elem_active(ext, genmask))
continue;
@@ -6261,7 +6265,7 @@ static int nft_set_dump_ctx_init(struct nft_set_dump_ctx *dump_ctx,
return PTR_ERR(table);
}
- set = nft_set_lookup(table, nla[NFTA_SET_ELEM_LIST_SET], genmask);
+ set = nft_set_lookup(net, table, nla[NFTA_SET_ELEM_LIST_SET], genmask);
if (IS_ERR(set)) {
NL_SET_BAD_ATTR(extack, nla[NFTA_SET_ELEM_LIST_SET]);
return PTR_ERR(set);
@@ -7493,7 +7497,8 @@ static int nft_set_catchall_flush(const struct nft_ctx *ctx,
struct nft_set_ext *ext;
int ret = 0;
- list_for_each_entry_rcu(catchall, &set->catchall_list, list) {
+ list_for_each_entry_rcu(catchall, &set->catchall_list, list,
+ lockdep_commit_lock_is_held(ctx->net)) {
ext = nft_set_elem_ext(set, catchall->elem);
if (!nft_set_elem_active(ext, genmask))
continue;
@@ -7543,7 +7548,7 @@ static int nf_tables_delsetelem(struct sk_buff *skb,
return PTR_ERR(table);
}
- set = nft_set_lookup(table, nla[NFTA_SET_ELEM_LIST_SET], genmask);
+ set = nft_set_lookup(net, table, nla[NFTA_SET_ELEM_LIST_SET], genmask);
if (IS_ERR(set)) {
NL_SET_BAD_ATTR(extack, nla[NFTA_SET_ELEM_LIST_SET]);
return PTR_ERR(set);
--
2.45.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH nf-next 3/7] netfilter: nf_tables: avoid false-positive lockdep splats with flowtables
2024-10-25 13:32 [PATCH nf-next 0/7] netfilter: nf_tables: avoid PROVE_RCU_LIST splats Florian Westphal
2024-10-25 13:32 ` [PATCH nf-next 1/7] netfilter: nf_tables: avoid false-positive lockdep splat on rule deletion Florian Westphal
2024-10-25 13:32 ` [PATCH nf-next 2/7] netfilter: nf_tables: avoid false-positive lockdep splats with sets Florian Westphal
@ 2024-10-25 13:32 ` Florian Westphal
2024-10-25 13:32 ` [PATCH nf-next 4/7] netfilter: nf_tables: avoid false-positive lockdep splats in set walker Florian Westphal
` (3 subsequent siblings)
6 siblings, 0 replies; 13+ messages in thread
From: Florian Westphal @ 2024-10-25 13:32 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
The transaction mutex prevents concurrent add/delete, its ok to iterate
those lists outside of rcu read side critical sections.
Signed-off-by: Florian Westphal <fw@strlen.de>
---
include/net/netfilter/nf_tables.h | 3 ++-
net/netfilter/nf_tables_api.c | 15 +++++++++------
net/netfilter/nft_flow_offload.c | 4 ++--
3 files changed, 13 insertions(+), 9 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 91ae20cb7648..c1513bd14568 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -1463,7 +1463,8 @@ struct nft_flowtable {
struct nf_flowtable data;
};
-struct nft_flowtable *nft_flowtable_lookup(const struct nft_table *table,
+struct nft_flowtable *nft_flowtable_lookup(const struct net *net,
+ const struct nft_table *table,
const struct nlattr *nla,
u8 genmask);
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index a51731d76401..9e367e134691 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -8378,12 +8378,14 @@ static const struct nla_policy nft_flowtable_policy[NFTA_FLOWTABLE_MAX + 1] = {
[NFTA_FLOWTABLE_FLAGS] = { .type = NLA_U32 },
};
-struct nft_flowtable *nft_flowtable_lookup(const struct nft_table *table,
+struct nft_flowtable *nft_flowtable_lookup(const struct net *net,
+ const struct nft_table *table,
const struct nlattr *nla, u8 genmask)
{
struct nft_flowtable *flowtable;
- list_for_each_entry_rcu(flowtable, &table->flowtables, list) {
+ list_for_each_entry_rcu(flowtable, &table->flowtables, list,
+ lockdep_commit_lock_is_held(net)) {
if (!nla_strcmp(nla, flowtable->name) &&
nft_active_genmask(flowtable, genmask))
return flowtable;
@@ -8739,7 +8741,7 @@ static int nf_tables_newflowtable(struct sk_buff *skb,
return PTR_ERR(table);
}
- flowtable = nft_flowtable_lookup(table, nla[NFTA_FLOWTABLE_NAME],
+ flowtable = nft_flowtable_lookup(net, table, nla[NFTA_FLOWTABLE_NAME],
genmask);
if (IS_ERR(flowtable)) {
err = PTR_ERR(flowtable);
@@ -8933,7 +8935,7 @@ static int nf_tables_delflowtable(struct sk_buff *skb,
flowtable = nft_flowtable_lookup_byhandle(table, attr, genmask);
} else {
attr = nla[NFTA_FLOWTABLE_NAME];
- flowtable = nft_flowtable_lookup(table, attr, genmask);
+ flowtable = nft_flowtable_lookup(net, table, attr, genmask);
}
if (IS_ERR(flowtable)) {
@@ -9003,7 +9005,8 @@ static int nf_tables_fill_flowtable_info(struct sk_buff *skb, struct net *net,
if (!hook_list)
hook_list = &flowtable->hook_list;
- list_for_each_entry_rcu(hook, hook_list, list) {
+ list_for_each_entry_rcu(hook, hook_list, list,
+ lockdep_commit_lock_is_held(net)) {
if (nla_put_string(skb, NFTA_DEVICE_NAME, hook->ops.dev->name))
goto nla_put_failure;
}
@@ -9145,7 +9148,7 @@ static int nf_tables_getflowtable(struct sk_buff *skb,
return PTR_ERR(table);
}
- flowtable = nft_flowtable_lookup(table, nla[NFTA_FLOWTABLE_NAME],
+ flowtable = nft_flowtable_lookup(net, table, nla[NFTA_FLOWTABLE_NAME],
genmask);
if (IS_ERR(flowtable)) {
NL_SET_BAD_ATTR(extack, nla[NFTA_FLOWTABLE_NAME]);
diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c
index 2f732fae5a83..65199c23c75c 100644
--- a/net/netfilter/nft_flow_offload.c
+++ b/net/netfilter/nft_flow_offload.c
@@ -409,8 +409,8 @@ static int nft_flow_offload_init(const struct nft_ctx *ctx,
if (!tb[NFTA_FLOW_TABLE_NAME])
return -EINVAL;
- flowtable = nft_flowtable_lookup(ctx->table, tb[NFTA_FLOW_TABLE_NAME],
- genmask);
+ flowtable = nft_flowtable_lookup(ctx->net, ctx->table,
+ tb[NFTA_FLOW_TABLE_NAME], genmask);
if (IS_ERR(flowtable))
return PTR_ERR(flowtable);
--
2.45.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH nf-next 4/7] netfilter: nf_tables: avoid false-positive lockdep splats in set walker
2024-10-25 13:32 [PATCH nf-next 0/7] netfilter: nf_tables: avoid PROVE_RCU_LIST splats Florian Westphal
` (2 preceding siblings ...)
2024-10-25 13:32 ` [PATCH nf-next 3/7] netfilter: nf_tables: avoid false-positive lockdep splats with flowtables Florian Westphal
@ 2024-10-25 13:32 ` Florian Westphal
2024-10-25 13:32 ` [PATCH nf-next 5/7] netfilter: nf_tables: avoid false-positive lockdep splats with basechain hook Florian Westphal
` (2 subsequent siblings)
6 siblings, 0 replies; 13+ messages in thread
From: Florian Westphal @ 2024-10-25 13:32 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
Its not possible to add or delete elements from hash and bitmap sets,
as long as caller is holding the transaction mutex, so its ok to iterate
the list outside of rcu read side critical section.
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nft_set_bitmap.c | 10 ++++++----
net/netfilter/nft_set_hash.c | 3 ++-
2 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/net/netfilter/nft_set_bitmap.c b/net/netfilter/nft_set_bitmap.c
index 1caa04619dc6..12390d2e994f 100644
--- a/net/netfilter/nft_set_bitmap.c
+++ b/net/netfilter/nft_set_bitmap.c
@@ -88,13 +88,15 @@ bool nft_bitmap_lookup(const struct net *net, const struct nft_set *set,
}
static struct nft_bitmap_elem *
-nft_bitmap_elem_find(const struct nft_set *set, struct nft_bitmap_elem *this,
+nft_bitmap_elem_find(const struct net *net,
+ const struct nft_set *set, struct nft_bitmap_elem *this,
u8 genmask)
{
const struct nft_bitmap *priv = nft_set_priv(set);
struct nft_bitmap_elem *be;
- list_for_each_entry_rcu(be, &priv->list, head) {
+ list_for_each_entry_rcu(be, &priv->list, head,
+ lockdep_is_held(&nft_pernet(net)->commit_mutex)) {
if (memcmp(nft_set_ext_key(&be->ext),
nft_set_ext_key(&this->ext), set->klen) ||
!nft_set_elem_active(&be->ext, genmask))
@@ -132,7 +134,7 @@ static int nft_bitmap_insert(const struct net *net, const struct nft_set *set,
u8 genmask = nft_genmask_next(net);
u32 idx, off;
- be = nft_bitmap_elem_find(set, new, genmask);
+ be = nft_bitmap_elem_find(net, set, new, genmask);
if (be) {
*elem_priv = &be->priv;
return -EEXIST;
@@ -201,7 +203,7 @@ nft_bitmap_deactivate(const struct net *net, const struct nft_set *set,
nft_bitmap_location(set, elem->key.val.data, &idx, &off);
- be = nft_bitmap_elem_find(set, this, genmask);
+ be = nft_bitmap_elem_find(net, set, this, genmask);
if (!be)
return NULL;
diff --git a/net/netfilter/nft_set_hash.c b/net/netfilter/nft_set_hash.c
index daa56dda737a..65bd291318f2 100644
--- a/net/netfilter/nft_set_hash.c
+++ b/net/netfilter/nft_set_hash.c
@@ -647,7 +647,8 @@ static void nft_hash_walk(const struct nft_ctx *ctx, struct nft_set *set,
int i;
for (i = 0; i < priv->buckets; i++) {
- hlist_for_each_entry_rcu(he, &priv->table[i], node) {
+ hlist_for_each_entry_rcu(he, &priv->table[i], node,
+ lockdep_is_held(&nft_pernet(ctx->net)->commit_mutex)) {
if (iter->count < iter->skip)
goto cont;
--
2.45.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH nf-next 5/7] netfilter: nf_tables: avoid false-positive lockdep splats with basechain hook
2024-10-25 13:32 [PATCH nf-next 0/7] netfilter: nf_tables: avoid PROVE_RCU_LIST splats Florian Westphal
` (3 preceding siblings ...)
2024-10-25 13:32 ` [PATCH nf-next 4/7] netfilter: nf_tables: avoid false-positive lockdep splats in set walker Florian Westphal
@ 2024-10-25 13:32 ` Florian Westphal
2024-10-25 14:40 ` Phil Sutter
2024-10-25 13:32 ` [PATCH nf-next 6/7] netfilter: nf_tables: must hold rcu read lock while iterating expression type list Florian Westphal
2024-10-25 13:32 ` [PATCH nf-next 7/7] netfilter: nf_tables: must hold rcu read lock while iterating object " Florian Westphal
6 siblings, 1 reply; 13+ messages in thread
From: Florian Westphal @ 2024-10-25 13:32 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
Like previous patches: iteration is ok if the lost cannot be altered in
parallel.
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nf_tables_api.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 9e367e134691..3b5154f2dd79 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1824,7 +1824,8 @@ static int nft_dump_stats(struct sk_buff *skb, struct nft_stats __percpu *stats)
return -ENOSPC;
}
-static int nft_dump_basechain_hook(struct sk_buff *skb, int family,
+static int nft_dump_basechain_hook(struct sk_buff *skb,
+ const struct net *net, int family,
const struct nft_base_chain *basechain,
const struct list_head *hook_list)
{
@@ -1849,7 +1850,8 @@ static int nft_dump_basechain_hook(struct sk_buff *skb, int family,
if (!hook_list)
hook_list = &basechain->hook_list;
- list_for_each_entry_rcu(hook, hook_list, list) {
+ list_for_each_entry_rcu(hook, hook_list, list,
+ lockdep_commit_lock_is_held(net)) {
if (!first)
first = hook;
@@ -1900,7 +1902,7 @@ static int nf_tables_fill_chain_info(struct sk_buff *skb, struct net *net,
const struct nft_base_chain *basechain = nft_base_chain(chain);
struct nft_stats __percpu *stats;
- if (nft_dump_basechain_hook(skb, family, basechain, hook_list))
+ if (nft_dump_basechain_hook(skb, net, family, basechain, hook_list))
goto nla_put_failure;
if (nla_put_be32(skb, NFTA_CHAIN_POLICY,
--
2.45.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH nf-next 6/7] netfilter: nf_tables: must hold rcu read lock while iterating expression type list
2024-10-25 13:32 [PATCH nf-next 0/7] netfilter: nf_tables: avoid PROVE_RCU_LIST splats Florian Westphal
` (4 preceding siblings ...)
2024-10-25 13:32 ` [PATCH nf-next 5/7] netfilter: nf_tables: avoid false-positive lockdep splats with basechain hook Florian Westphal
@ 2024-10-25 13:32 ` Florian Westphal
2024-10-26 9:39 ` Florian Westphal
2024-10-25 13:32 ` [PATCH nf-next 7/7] netfilter: nf_tables: must hold rcu read lock while iterating object " Florian Westphal
6 siblings, 1 reply; 13+ messages in thread
From: Florian Westphal @ 2024-10-25 13:32 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
nft shell tests trigger:
WARNING: suspicious RCU usage
net/netfilter/nf_tables_api.c:3125 RCU-list traversed in non-reader section!!
1 lock held by nft/2068:
#0: ffff888106c6f8c8 (&nft_net->commit_mutex){+.+.}-{4:4}, at: nf_tables_valid_genid+0x3c/0xf0
But the transaction mutex doesn't protect this list, the nfnl subsystem
mutex would, but we can't acquire it here without risk of ABBA
deadlocks.
Acquire the rcu read lock and also reject any types that reside in
modules, we'd need to extend this to hold a module reference.
Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching")
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nf_tables_api.c | 24 ++++++++++++++++++------
1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 3b5154f2dd79..3db4457f46a2 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -3296,25 +3296,37 @@ int nft_expr_inner_parse(const struct nft_ctx *ctx, const struct nlattr *nla,
if (!tb[NFTA_EXPR_DATA] || !tb[NFTA_EXPR_NAME])
return -EINVAL;
+ rcu_read_lock();
+
type = __nft_expr_type_get(ctx->family, tb[NFTA_EXPR_NAME]);
- if (!type)
- return -ENOENT;
+ if (!type) {
+ err = -ENOENT;
+ goto out_unlock;
+ }
- if (!type->inner_ops)
- return -EOPNOTSUPP;
+ if (!type->inner_ops || !type->owner) {
+ err = -EOPNOTSUPP;
+ goto out_unlock;
+ }
err = nla_parse_nested_deprecated(info->tb, type->maxattr,
tb[NFTA_EXPR_DATA],
type->policy, NULL);
if (err < 0)
- goto err_nla_parse;
+ goto out_unlock;
info->attr = nla;
info->ops = type->inner_ops;
+ /* At this time only meta and payload are supported, both are
+ * builtin. Therefore we can get away without a module
+ * reference.
+ */
+ rcu_read_unlock();
return 0;
-err_nla_parse:
+out_unlock:
+ rcu_read_unlock();
return err;
}
--
2.45.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH nf-next 7/7] netfilter: nf_tables: must hold rcu read lock while iterating object type list
2024-10-25 13:32 [PATCH nf-next 0/7] netfilter: nf_tables: avoid PROVE_RCU_LIST splats Florian Westphal
` (5 preceding siblings ...)
2024-10-25 13:32 ` [PATCH nf-next 6/7] netfilter: nf_tables: must hold rcu read lock while iterating expression type list Florian Westphal
@ 2024-10-25 13:32 ` Florian Westphal
6 siblings, 0 replies; 13+ messages in thread
From: Florian Westphal @ 2024-10-25 13:32 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
Update of stateful object triggers:
WARNING: suspicious RCU usage
net/netfilter/nf_tables_api.c:7759 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by nft/3060:
#0: ffff88810f0578c8 (&nft_net->commit_mutex){+.+.}-{4:4}, [..]
... but this list is not protected by the transaction mutex but the
nfnl nftables subsystem mutex.
Switch to nft_obj_type_get which will acquire rcu read lock,
bump refcount, and returns the result.
Fixes: dad3bdeef45f ("netfilter: nf_tables: fix memory leak during stateful obj update").
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nf_tables_api.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 3db4457f46a2..7712201ecf2a 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -7809,9 +7809,7 @@ static int nf_tables_updobj(const struct nft_ctx *ctx,
struct nft_trans *trans;
int err = -ENOMEM;
- if (!try_module_get(type->owner))
- return -ENOENT;
-
+ /* caller must have obtained type->owner reference. */
trans = nft_trans_alloc(ctx, NFT_MSG_NEWOBJ,
sizeof(struct nft_trans_obj));
if (!trans)
@@ -7879,15 +7877,16 @@ static int nf_tables_newobj(struct sk_buff *skb, const struct nfnl_info *info,
if (info->nlh->nlmsg_flags & NLM_F_REPLACE)
return -EOPNOTSUPP;
- type = __nft_obj_type_get(objtype, family);
- if (WARN_ON_ONCE(!type))
- return -ENOENT;
-
if (!obj->ops->update)
return 0;
+ type = nft_obj_type_get(net, objtype, family);
+ if (WARN_ON_ONCE(!type))
+ return -ENOENT;
+
nft_ctx_init(&ctx, net, skb, info->nlh, family, table, NULL, nla);
+ /* type->owner reference is put when transaction object is released. */
return nf_tables_updobj(&ctx, type, nla[NFTA_OBJ_DATA], obj);
}
--
2.45.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH nf-next 5/7] netfilter: nf_tables: avoid false-positive lockdep splats with basechain hook
2024-10-25 13:32 ` [PATCH nf-next 5/7] netfilter: nf_tables: avoid false-positive lockdep splats with basechain hook Florian Westphal
@ 2024-10-25 14:40 ` Phil Sutter
0 siblings, 0 replies; 13+ messages in thread
From: Phil Sutter @ 2024-10-25 14:40 UTC (permalink / raw)
To: Florian Westphal; +Cc: netfilter-devel
On Fri, Oct 25, 2024 at 03:32:22PM +0200, Florian Westphal wrote:
> Like previous patches: iteration is ok if the lost cannot be altered in
Typo: lost -> list
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH nf-next 1/7] netfilter: nf_tables: avoid false-positive lockdep splat on rule deletion
2024-10-25 13:32 ` [PATCH nf-next 1/7] netfilter: nf_tables: avoid false-positive lockdep splat on rule deletion Florian Westphal
@ 2024-10-25 15:09 ` Matthieu Baerts
2024-10-26 9:35 ` Florian Westphal
0 siblings, 1 reply; 13+ messages in thread
From: Matthieu Baerts @ 2024-10-25 15:09 UTC (permalink / raw)
To: Florian Westphal; +Cc: netfilter-devel
Hi Florian,
On 25/10/2024 15:32, Florian Westphal wrote:
> On rule delete we get:
> WARNING: suspicious RCU usage
> net/netfilter/nf_tables_api.c:3420 RCU-list traversed in non-reader section!!
> 1 lock held by iptables/134:
> #0: ffff888008c4fcc8 (&nft_net->commit_mutex){+.+.}-{3:3}, at: nf_tables_valid_genid (include/linux/jiffies.h:101) nf_tables
>
> Code is fine, no other CPU can change the list because we're holding
> transaction mutex.
>
> Pass the needed lockdep annotation to the iterator and fix
> two comments for functions that are no longer restricted to rcu-only
> context.
>
> This is enough to resolve rule delete, but there are several other
> missing annotations, added in followup-patches.
Thank you for the patch! (and sorry for having somehow pushed you to
open the pandora box for the other cases :) )
I confirm this fix avoids the warning in my case:
Tested-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH nf-next 1/7] netfilter: nf_tables: avoid false-positive lockdep splat on rule deletion
2024-10-25 15:09 ` Matthieu Baerts
@ 2024-10-26 9:35 ` Florian Westphal
2024-10-26 11:20 ` Matthieu Baerts
0 siblings, 1 reply; 13+ messages in thread
From: Florian Westphal @ 2024-10-26 9:35 UTC (permalink / raw)
To: Matthieu Baerts; +Cc: Florian Westphal, netfilter-devel
Matthieu Baerts <matttbe@kernel.org> wrote:
> > This is enough to resolve rule delete, but there are several other
> > missing annotations, added in followup-patches.
>
> Thank you for the patch! (and sorry for having somehow pushed you to
> open the pandora box for the other cases :) )
Right, I still saw more splats when running nftables tests, but those
were in other places of network stack (outside netfilter land).
I will have a look next week.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH nf-next 6/7] netfilter: nf_tables: must hold rcu read lock while iterating expression type list
2024-10-25 13:32 ` [PATCH nf-next 6/7] netfilter: nf_tables: must hold rcu read lock while iterating expression type list Florian Westphal
@ 2024-10-26 9:39 ` Florian Westphal
0 siblings, 0 replies; 13+ messages in thread
From: Florian Westphal @ 2024-10-26 9:39 UTC (permalink / raw)
To: Florian Westphal; +Cc: netfilter-devel
Florian Westphal <fw@strlen.de> wrote:
> + if (!type->inner_ops || !type->owner) {
> + err = -EOPNOTSUPP;
This is wrong, we need to reject the opposite, i.e.
if (!type->inner_ops || type->owner) {
NULL owner means its builtin, and thats what we can
support at this time.
I will wait until Tuesday before sending v2.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH nf-next 1/7] netfilter: nf_tables: avoid false-positive lockdep splat on rule deletion
2024-10-26 9:35 ` Florian Westphal
@ 2024-10-26 11:20 ` Matthieu Baerts
0 siblings, 0 replies; 13+ messages in thread
From: Matthieu Baerts @ 2024-10-26 11:20 UTC (permalink / raw)
To: Florian Westphal; +Cc: netfilter-devel
Hi Florian,
On 26/10/2024 11:35, Florian Westphal wrote:
> Matthieu Baerts <matttbe@kernel.org> wrote:
>>> This is enough to resolve rule delete, but there are several other
>>> missing annotations, added in followup-patches.
>>
>> Thank you for the patch! (and sorry for having somehow pushed you to
>> open the pandora box for the other cases :) )
>
> Right, I still saw more splats when running nftables tests, but those
> were in other places of network stack (outside netfilter land).
>
> I will have a look next week.
Just in case, I had one splat when stopping my VM:
https://lore.kernel.org/f856d105-5463-4b8b-8715-0e6871165616@kernel.org
There is a fix for it that is coming:
https://lore.kernel.org/20240913162340.2142976-1-kan.liang@linux.intel.com
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2024-10-26 11:20 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-25 13:32 [PATCH nf-next 0/7] netfilter: nf_tables: avoid PROVE_RCU_LIST splats Florian Westphal
2024-10-25 13:32 ` [PATCH nf-next 1/7] netfilter: nf_tables: avoid false-positive lockdep splat on rule deletion Florian Westphal
2024-10-25 15:09 ` Matthieu Baerts
2024-10-26 9:35 ` Florian Westphal
2024-10-26 11:20 ` Matthieu Baerts
2024-10-25 13:32 ` [PATCH nf-next 2/7] netfilter: nf_tables: avoid false-positive lockdep splats with sets Florian Westphal
2024-10-25 13:32 ` [PATCH nf-next 3/7] netfilter: nf_tables: avoid false-positive lockdep splats with flowtables Florian Westphal
2024-10-25 13:32 ` [PATCH nf-next 4/7] netfilter: nf_tables: avoid false-positive lockdep splats in set walker Florian Westphal
2024-10-25 13:32 ` [PATCH nf-next 5/7] netfilter: nf_tables: avoid false-positive lockdep splats with basechain hook Florian Westphal
2024-10-25 14:40 ` Phil Sutter
2024-10-25 13:32 ` [PATCH nf-next 6/7] netfilter: nf_tables: must hold rcu read lock while iterating expression type list Florian Westphal
2024-10-26 9:39 ` Florian Westphal
2024-10-25 13:32 ` [PATCH nf-next 7/7] netfilter: nf_tables: must hold rcu read lock while iterating object " Florian Westphal
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.