* [PATCH net 0/3] netfilter: nft_counter: Statistics fixes/ optimisation.
@ 2024-08-20 7:54 Sebastian Andrzej Siewior
2024-08-20 7:54 ` [PATCH net 1/3] netfilter: nft_counter: Disable BH in nft_counter_offload_stats() Sebastian Andrzej Siewior
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-08-20 7:54 UTC (permalink / raw)
To: netfilter-devel, coreteam
Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Thomas Gleixner
Hi,
I've been looking at nft_counter and identified two bugs and then added
an optimisation on top.
This is just compile tested, I didn't manage to trigger some of the
pathes I changed (especially nft_counter_offload_stats()).
Sebastian
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH net 1/3] netfilter: nft_counter: Disable BH in nft_counter_offload_stats().
2024-08-20 7:54 [PATCH net 0/3] netfilter: nft_counter: Statistics fixes/ optimisation Sebastian Andrzej Siewior
@ 2024-08-20 7:54 ` Sebastian Andrzej Siewior
2024-08-20 7:54 ` [PATCH net 2/3] netfilter: nft_counter: Synchronize nft_counter_reset() against reader Sebastian Andrzej Siewior
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-08-20 7:54 UTC (permalink / raw)
To: netfilter-devel, coreteam
Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Thomas Gleixner,
Sebastian Andrzej Siewior
The sequence counter nft_counter_seq is a per-CPU counter. There is no
lock associated with it. nft_counter_do_eval() is using the same counter
and disables BH which suggest that it can be invoked from a softirq.
This in turn means that nft_counter_offload_stats(), which disables only
preemption, can be interrupted by nft_counter_do_eval() leading to two
writer for one seqcount_t.
This can lead to loosing stats or reading statistics while they are
updated.
Disable BH during stats update in nft_counter_offload_stats() to ensure
one writer at a time.
Fixes: b72920f6e4a9d ("netfilter: nftables: counter hardware offload support")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
net/netfilter/nft_counter.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nft_counter.c b/net/netfilter/nft_counter.c
index 291ed2026367e..16f40b503d379 100644
--- a/net/netfilter/nft_counter.c
+++ b/net/netfilter/nft_counter.c
@@ -265,7 +265,7 @@ static void nft_counter_offload_stats(struct nft_expr *expr,
struct nft_counter *this_cpu;
seqcount_t *myseq;
- preempt_disable();
+ local_bh_disable();
this_cpu = this_cpu_ptr(priv->counter);
myseq = this_cpu_ptr(&nft_counter_seq);
@@ -273,7 +273,7 @@ static void nft_counter_offload_stats(struct nft_expr *expr,
this_cpu->packets += stats->pkts;
this_cpu->bytes += stats->bytes;
write_seqcount_end(myseq);
- preempt_enable();
+ local_bh_enable();
}
void nft_counter_init_seqcount(void)
--
2.45.2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH net 2/3] netfilter: nft_counter: Synchronize nft_counter_reset() against reader.
2024-08-20 7:54 [PATCH net 0/3] netfilter: nft_counter: Statistics fixes/ optimisation Sebastian Andrzej Siewior
2024-08-20 7:54 ` [PATCH net 1/3] netfilter: nft_counter: Disable BH in nft_counter_offload_stats() Sebastian Andrzej Siewior
@ 2024-08-20 7:54 ` Sebastian Andrzej Siewior
2024-08-20 7:54 ` [PATCH net-next 3/3] netfilter: nft_counter: Use u64_stats_t for statistic Sebastian Andrzej Siewior
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-08-20 7:54 UTC (permalink / raw)
To: netfilter-devel, coreteam
Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Thomas Gleixner,
Sebastian Andrzej Siewior
nft_counter_reset() resets the counter by subtracting the previously
retrieved value from the counter. This is a write operation on the
counter and as such it requires to be performed with a write sequence of
nft_counter_seq to serialize against its possible reader.
Update the packets/ bytes within write-sequence of nft_counter_seq.
Fixes: d84701ecbcd6a ("netfilter: nft_counter: rework atomic dump and reset")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
net/netfilter/nft_counter.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/net/netfilter/nft_counter.c b/net/netfilter/nft_counter.c
index 16f40b503d379..eab0dc66bee6b 100644
--- a/net/netfilter/nft_counter.c
+++ b/net/netfilter/nft_counter.c
@@ -107,11 +107,16 @@ static void nft_counter_reset(struct nft_counter_percpu_priv *priv,
struct nft_counter *total)
{
struct nft_counter *this_cpu;
+ seqcount_t *myseq;
local_bh_disable();
this_cpu = this_cpu_ptr(priv->counter);
+ myseq = this_cpu_ptr(&nft_counter_seq);
+
+ write_seqcount_begin(myseq);
this_cpu->packets -= total->packets;
this_cpu->bytes -= total->bytes;
+ write_seqcount_end(myseq);
local_bh_enable();
}
--
2.45.2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH net-next 3/3] netfilter: nft_counter: Use u64_stats_t for statistic.
2024-08-20 7:54 [PATCH net 0/3] netfilter: nft_counter: Statistics fixes/ optimisation Sebastian Andrzej Siewior
2024-08-20 7:54 ` [PATCH net 1/3] netfilter: nft_counter: Disable BH in nft_counter_offload_stats() Sebastian Andrzej Siewior
2024-08-20 7:54 ` [PATCH net 2/3] netfilter: nft_counter: Synchronize nft_counter_reset() against reader Sebastian Andrzej Siewior
@ 2024-08-20 7:54 ` Sebastian Andrzej Siewior
2024-08-20 9:25 ` [PATCH net 0/3] netfilter: nft_counter: Statistics fixes/ optimisation Florian Westphal
2024-08-20 10:28 ` Pablo Neira Ayuso
4 siblings, 0 replies; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-08-20 7:54 UTC (permalink / raw)
To: netfilter-devel, coreteam
Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Thomas Gleixner,
Sebastian Andrzej Siewior, Eric Dumazet
The nft_counter uses two s64 counters for statistics. Those two are
protected by a seqcount to ensure that the 64bit variable is always
properly seen during updates even on 32bit architectures where the store
is performed by two writes. A side effect is that the two counter (bytes
and packet) are written and read together in the same window.
This can be replaced with u64_stats_t. write_seqcount_begin()/ end() is
replaced with u64_stats_update_begin()/ end() and behaves the same way
as with seqcount_t on 32bit architectures. Additionally there is a
preempt_disable on PREEMPT_RT to ensure that a reader does not preempt a
writer.
On 64bit architectures the macros are removed and the reads happen
without any retries. This also means that the reader can observe one
counter (bytes) from before the update and the other counter (packets)
but that is okay since there is no requirement to have both counter from
the same update window.
Convert the statistic to u64_stats_t. There is one optimisation:
nft_counter_do_init() and nft_counter_clone() allocate a new per-CPU
counter and assign a value to it. During this assignment preemption is
disabled which is not needed because the counter is not yet exposed to
the system so there can not be another writer or reader. Therefore
disabling preemption is omitted and raw_cpu_ptr() is used to obtain a
pointer to a counter for the assignment.
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
net/netfilter/nft_counter.c | 90 +++++++++++++++++++------------------
1 file changed, 46 insertions(+), 44 deletions(-)
diff --git a/net/netfilter/nft_counter.c b/net/netfilter/nft_counter.c
index eab0dc66bee6b..cc73253294963 100644
--- a/net/netfilter/nft_counter.c
+++ b/net/netfilter/nft_counter.c
@@ -8,7 +8,7 @@
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/module.h>
-#include <linux/seqlock.h>
+#include <linux/u64_stats_sync.h>
#include <linux/netlink.h>
#include <linux/netfilter.h>
#include <linux/netfilter/nf_tables.h>
@@ -17,6 +17,11 @@
#include <net/netfilter/nf_tables_offload.h>
struct nft_counter {
+ u64_stats_t bytes;
+ u64_stats_t packets;
+};
+
+struct nft_counter_tot {
s64 bytes;
s64 packets;
};
@@ -25,25 +30,24 @@ struct nft_counter_percpu_priv {
struct nft_counter __percpu *counter;
};
-static DEFINE_PER_CPU(seqcount_t, nft_counter_seq);
+static DEFINE_PER_CPU(struct u64_stats_sync, nft_counter_sync);
static inline void nft_counter_do_eval(struct nft_counter_percpu_priv *priv,
struct nft_regs *regs,
const struct nft_pktinfo *pkt)
{
+ struct u64_stats_sync *nft_sync;
struct nft_counter *this_cpu;
- seqcount_t *myseq;
local_bh_disable();
this_cpu = this_cpu_ptr(priv->counter);
- myseq = this_cpu_ptr(&nft_counter_seq);
+ nft_sync = this_cpu_ptr(&nft_counter_sync);
- write_seqcount_begin(myseq);
+ u64_stats_update_begin(nft_sync);
+ u64_stats_add(&this_cpu->bytes, pkt->skb->len);
+ u64_stats_inc(&this_cpu->packets);
+ u64_stats_update_end(nft_sync);
- this_cpu->bytes += pkt->skb->len;
- this_cpu->packets++;
-
- write_seqcount_end(myseq);
local_bh_enable();
}
@@ -66,17 +70,16 @@ static int nft_counter_do_init(const struct nlattr * const tb[],
if (cpu_stats == NULL)
return -ENOMEM;
- preempt_disable();
- this_cpu = this_cpu_ptr(cpu_stats);
+ this_cpu = raw_cpu_ptr(cpu_stats);
if (tb[NFTA_COUNTER_PACKETS]) {
- this_cpu->packets =
- be64_to_cpu(nla_get_be64(tb[NFTA_COUNTER_PACKETS]));
+ u64_stats_set(&this_cpu->packets,
+ be64_to_cpu(nla_get_be64(tb[NFTA_COUNTER_PACKETS])));
}
if (tb[NFTA_COUNTER_BYTES]) {
- this_cpu->bytes =
- be64_to_cpu(nla_get_be64(tb[NFTA_COUNTER_BYTES]));
+ u64_stats_set(&this_cpu->bytes,
+ be64_to_cpu(nla_get_be64(tb[NFTA_COUNTER_BYTES])));
}
- preempt_enable();
+
priv->counter = cpu_stats;
return 0;
}
@@ -104,40 +107,41 @@ static void nft_counter_obj_destroy(const struct nft_ctx *ctx,
}
static void nft_counter_reset(struct nft_counter_percpu_priv *priv,
- struct nft_counter *total)
+ struct nft_counter_tot *total)
{
+ struct u64_stats_sync *nft_sync;
struct nft_counter *this_cpu;
- seqcount_t *myseq;
local_bh_disable();
this_cpu = this_cpu_ptr(priv->counter);
- myseq = this_cpu_ptr(&nft_counter_seq);
+ nft_sync = this_cpu_ptr(&nft_counter_sync);
+
+ u64_stats_update_begin(nft_sync);
+ u64_stats_add(&this_cpu->packets, -total->packets);
+ u64_stats_add(&this_cpu->bytes, -total->bytes);
+ u64_stats_update_end(nft_sync);
- write_seqcount_begin(myseq);
- this_cpu->packets -= total->packets;
- this_cpu->bytes -= total->bytes;
- write_seqcount_end(myseq);
local_bh_enable();
}
static void nft_counter_fetch(struct nft_counter_percpu_priv *priv,
- struct nft_counter *total)
+ struct nft_counter_tot *total)
{
struct nft_counter *this_cpu;
- const seqcount_t *myseq;
u64 bytes, packets;
unsigned int seq;
int cpu;
memset(total, 0, sizeof(*total));
for_each_possible_cpu(cpu) {
- myseq = per_cpu_ptr(&nft_counter_seq, cpu);
+ struct u64_stats_sync *nft_sync = per_cpu_ptr(&nft_counter_sync, cpu);
+
this_cpu = per_cpu_ptr(priv->counter, cpu);
do {
- seq = read_seqcount_begin(myseq);
- bytes = this_cpu->bytes;
- packets = this_cpu->packets;
- } while (read_seqcount_retry(myseq, seq));
+ seq = u64_stats_fetch_begin(nft_sync);
+ bytes = u64_stats_read(&this_cpu->bytes);
+ packets = u64_stats_read(&this_cpu->packets);
+ } while (u64_stats_fetch_retry(nft_sync, seq));
total->bytes += bytes;
total->packets += packets;
@@ -148,7 +152,7 @@ static int nft_counter_do_dump(struct sk_buff *skb,
struct nft_counter_percpu_priv *priv,
bool reset)
{
- struct nft_counter total;
+ struct nft_counter_tot total;
nft_counter_fetch(priv, &total);
@@ -237,7 +241,7 @@ static int nft_counter_clone(struct nft_expr *dst, const struct nft_expr *src, g
struct nft_counter_percpu_priv *priv_clone = nft_expr_priv(dst);
struct nft_counter __percpu *cpu_stats;
struct nft_counter *this_cpu;
- struct nft_counter total;
+ struct nft_counter_tot total;
nft_counter_fetch(priv, &total);
@@ -245,11 +249,9 @@ static int nft_counter_clone(struct nft_expr *dst, const struct nft_expr *src, g
if (cpu_stats == NULL)
return -ENOMEM;
- preempt_disable();
- this_cpu = this_cpu_ptr(cpu_stats);
- this_cpu->packets = total.packets;
- this_cpu->bytes = total.bytes;
- preempt_enable();
+ this_cpu = raw_cpu_ptr(cpu_stats);
+ u64_stats_set(&this_cpu->packets, total.packets);
+ u64_stats_set(&this_cpu->bytes, total.bytes);
priv_clone->counter = cpu_stats;
return 0;
@@ -267,17 +269,17 @@ static void nft_counter_offload_stats(struct nft_expr *expr,
const struct flow_stats *stats)
{
struct nft_counter_percpu_priv *priv = nft_expr_priv(expr);
+ struct u64_stats_sync *nft_sync;
struct nft_counter *this_cpu;
- seqcount_t *myseq;
local_bh_disable();
this_cpu = this_cpu_ptr(priv->counter);
- myseq = this_cpu_ptr(&nft_counter_seq);
+ nft_sync = this_cpu_ptr(&nft_counter_sync);
- write_seqcount_begin(myseq);
- this_cpu->packets += stats->pkts;
- this_cpu->bytes += stats->bytes;
- write_seqcount_end(myseq);
+ u64_stats_update_begin(nft_sync);
+ u64_stats_add(&this_cpu->packets, stats->pkts);
+ u64_stats_add(&this_cpu->bytes, stats->bytes);
+ u64_stats_update_end(nft_sync);
local_bh_enable();
}
@@ -286,7 +288,7 @@ void nft_counter_init_seqcount(void)
int cpu;
for_each_possible_cpu(cpu)
- seqcount_init(per_cpu_ptr(&nft_counter_seq, cpu));
+ u64_stats_init(per_cpu_ptr(&nft_counter_sync, cpu));
}
struct nft_expr_type nft_counter_type;
--
2.45.2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH net 0/3] netfilter: nft_counter: Statistics fixes/ optimisation.
2024-08-20 7:54 [PATCH net 0/3] netfilter: nft_counter: Statistics fixes/ optimisation Sebastian Andrzej Siewior
` (2 preceding siblings ...)
2024-08-20 7:54 ` [PATCH net-next 3/3] netfilter: nft_counter: Use u64_stats_t for statistic Sebastian Andrzej Siewior
@ 2024-08-20 9:25 ` Florian Westphal
2024-08-20 10:28 ` Pablo Neira Ayuso
4 siblings, 0 replies; 6+ messages in thread
From: Florian Westphal @ 2024-08-20 9:25 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: netfilter-devel, coreteam, Pablo Neira Ayuso, Jozsef Kadlecsik,
Thomas Gleixner
Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> I've been looking at nft_counter and identified two bugs and then added
> an optimisation on top.
Looks good to me, thanks!
Series: Reviewed-by: Florian Westphal <fw@strlen.de>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net 0/3] netfilter: nft_counter: Statistics fixes/ optimisation.
2024-08-20 7:54 [PATCH net 0/3] netfilter: nft_counter: Statistics fixes/ optimisation Sebastian Andrzej Siewior
` (3 preceding siblings ...)
2024-08-20 9:25 ` [PATCH net 0/3] netfilter: nft_counter: Statistics fixes/ optimisation Florian Westphal
@ 2024-08-20 10:28 ` Pablo Neira Ayuso
4 siblings, 0 replies; 6+ messages in thread
From: Pablo Neira Ayuso @ 2024-08-20 10:28 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: netfilter-devel, coreteam, Jozsef Kadlecsik, Thomas Gleixner
On Tue, Aug 20, 2024 at 09:54:29AM +0200, Sebastian Andrzej Siewior wrote:
> Hi,
>
> I've been looking at nft_counter and identified two bugs and then added
> an optimisation on top.
>
> This is just compile tested, I didn't manage to trigger some of the
> pathes I changed (especially nft_counter_offload_stats()).
Applied 1/3 and 2/3 to nf.git, thanks Sebastian
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-08-20 10:29 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-20 7:54 [PATCH net 0/3] netfilter: nft_counter: Statistics fixes/ optimisation Sebastian Andrzej Siewior
2024-08-20 7:54 ` [PATCH net 1/3] netfilter: nft_counter: Disable BH in nft_counter_offload_stats() Sebastian Andrzej Siewior
2024-08-20 7:54 ` [PATCH net 2/3] netfilter: nft_counter: Synchronize nft_counter_reset() against reader Sebastian Andrzej Siewior
2024-08-20 7:54 ` [PATCH net-next 3/3] netfilter: nft_counter: Use u64_stats_t for statistic Sebastian Andrzej Siewior
2024-08-20 9:25 ` [PATCH net 0/3] netfilter: nft_counter: Statistics fixes/ optimisation Florian Westphal
2024-08-20 10:28 ` Pablo Neira Ayuso
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.