From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: netfilter-devel@vger.kernel.org, coreteam@netfilter.org,
linux-rt-devel@lists.linux.dev,
Jozsef Kadlecsik <kadlec@netfilter.org>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH net-next v2 0/3] Replace xt_recseq with u64_stats.
Date: Thu, 13 Mar 2025 00:16:03 +0100 [thread overview]
Message-ID: <Z9IVs3LD3A1HPSS0@calendula> (raw)
In-Reply-To: <20250221133143.5058-1-bigeasy@linutronix.de>
Hi Sebastian,
On Fri, Feb 21, 2025 at 02:31:40PM +0100, Sebastian Andrzej Siewior wrote:
> The per-CPU xt_recseq is a custom netfilter seqcount. It provides
> synchronisation for the replacement of the xt_table::private pointer and
> ensures that the two counter in xt_counters are properly observed during
> an update on 32bit architectures. xt_recseq also supports recursion.
>
> This construct is less than optimal on PREMPT_RT because the lack of an
> associated lock (with the seqcount) can lead to a deadlock if a high
> priority reader interrupts a writter. Also xt_recseq relies on locking
> with BH-disable which becomes problematic if the lock, currently part of
> local_bh_disable() on PREEMPT_RT, gets removed.
>
> This can be optimized unrelated to PREEMPT_RT:
> - Use RCU for synchronisation. This means ipt_do_table() (and the two
> other) access xt_table::private within a RCU section.
> xt_replace_table() replaces the pointer with rcu_assign_pointer() and
> uses synchronize_rcu() to wait until each reader left RCU section.
>
> - Use u64_stats_t for the statistics. The advantage here is that
> u64_stats_sync which is use a seqcount is optimized away on 64bit
> architectures. The increment becomes just an add, the read just a read
> of the variable without a loop. On 32bit architectures the seqcount
> remains but the scope is smaller.
>
> The struct xt_counters is defined in a user exported header (uapi). So
> in patch #2 I tried to split the regular u64 access and the "internal
> access" which treats the struct either as two counter or a per-CPU
> pointer. In order not to expose u64_stats_t to userland I added a "pad"
> which is cast to the internal type. I hoped that this makes it obvious
> that a function like xt_get_this_cpu_counter() expects the possible
> per-CPU type but mark_source_chains() or get_counters() expect the u64
> type without pointers.
>
> v1…v2 https://lore.kernel.org/all/20250216125135.3037967-1-bigeasy@linutronix.de/
> - Updated kerneldoc in 2/3 so that the renamed parameter is part of
> it.
> - Updated description 1/3 in case there are complains regarding the
> synchronize_rcu(). The suggested course of action is to motivate
> people to move away from "legacy" towards "nft" tooling. Last resort
> is not to wait for the in-flight counter and just copy what is
> there.
Kconfig !PREEMPT_RT for this is not an option, right?
Thanks.
next prev parent reply other threads:[~2025-03-12 23:16 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-21 13:31 [PATCH net-next v2 0/3] Replace xt_recseq with u64_stats Sebastian Andrzej Siewior
2025-02-21 13:31 ` [PATCH net-next v2 1/3] netfilter: Make xt_table::private RCU protected Sebastian Andrzej Siewior
2025-02-21 13:31 ` [PATCH net-next v2 2/3] netfilter: Split the xt_counters type between kernel and user Sebastian Andrzej Siewior
2025-02-21 13:31 ` [PATCH net-next v2 3/3] netfilter: Use u64_stats for counters in xt_counters_k Sebastian Andrzej Siewior
2025-03-07 16:57 ` [PATCH net-next v2 0/3] Replace xt_recseq with u64_stats Sebastian Andrzej Siewior
2025-03-12 23:16 ` Pablo Neira Ayuso [this message]
2025-03-13 8:34 ` Sebastian Andrzej Siewior
2025-03-20 12:41 ` Pablo Neira Ayuso
2025-03-21 10:24 ` Pablo Neira Ayuso
2025-03-24 16:29 ` Sebastian Andrzej Siewior
2025-03-24 17:47 ` Florian Westphal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z9IVs3LD3A1HPSS0@calendula \
--to=pablo@netfilter.org \
--cc=bigeasy@linutronix.de \
--cc=coreteam@netfilter.org \
--cc=kadlec@netfilter.org \
--cc=linux-rt-devel@lists.linux.dev \
--cc=netfilter-devel@vger.kernel.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.