From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: kadlec@netfilter.org, nnamrec@gmail.com,
Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
Linux regressions mailing list <regressions@lists.linux.dev>,
netdev@vger.kernel.org
Subject: Re: 6.10/bisected/regression - commits 4e7aaa6b82d6 cause appearing WARNING at net/netfilter/ipset/ip_set_core.c:1200 suspicious rcu_dereference_protected() usage
Date: Thu, 20 Jun 2024 10:42:57 +0200 [thread overview]
Message-ID: <ZnPrkcWkAxQrODju@calendula> (raw)
In-Reply-To: <CABXGCsND1HmAjCZNS_fg59_qbQfxfcHCD_OYD2tjTYdWFDSajw@mail.gmail.com>
Hi,
A fix has been included in this batch:
https://lore.kernel.org/netfilter-devel/20240619170537.2846-1-pablo@netfilter.org/T/#m39095851d6fc9d1a04d9de66f5580f61c9593389
On Thu, Jun 20, 2024 at 11:55:05AM +0500, Mikhail Gavrilov wrote:
> Hi,
> between 2ef5971ff345 and rc4 I spotted a new regression.
> It is expressed in the appearance warning with stacktrace after one
> minute after boot.
>
> =============================
> WARNING: suspicious RCU usage
> 6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug #1 Tainted:
> G W L ------- ---
> -----------------------------
> net/netfilter/ipset/ip_set_core.c:1200 suspicious
> rcu_dereference_protected() usage!
>
> other info that might help us debug this:
>
>
> rcu_scheduler_active = 2, debug_locks = 1
> 3 locks held by kworker/u128:1/264:
> #0: ffff88810813c958 ((wq_completion)netns){+.+.}-{0:0}, at:
> process_one_work+0xeab/0x1460
> #1: ffffc90001477da0 (net_cleanup_work){+.+.}-{0:0}, at:
> process_one_work+0x82b/0x1460
> #2: ffffffff97d9ae98 (pernet_ops_rwsem){++++}-{3:3}, at:
> cleanup_net+0xb9/0xa90
>
> stack backtrace:
> CPU: 30 PID: 264 Comm: kworker/u128:1 Tainted: G W L
> ------- --- 6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug
> #1
> Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING
> WIFI, BIOS 2611 04/07/2024
> Workqueue: netns cleanup_net
> Call Trace:
> <TASK>
> dump_stack_lvl+0x84/0xd0
> lockdep_rcu_suspicious.cold+0xa1/0x134
> _destroy_all_sets+0x1c7/0x560 [ip_set]
> ip_set_net_exit+0x20/0x50 [ip_set]
> ops_exit_list+0x99/0x170
> cleanup_net+0x4d9/0xa90
> ? __pfx_cleanup_net+0x10/0x10
> process_one_work+0x8a4/0x1460
> ? worker_thread+0xe3/0x1010
> ? __pfx_process_one_work+0x10/0x10
> ? assign_work+0x16c/0x240
> worker_thread+0x5e6/0x1010
> ? __kthread_parkme+0xb1/0x1d0
> ? __pfx_worker_thread+0x10/0x10
> ? __pfx_worker_thread+0x10/0x10
> kthread+0x2d2/0x3a0
> ? _raw_spin_unlock_irq+0x28/0x60
> ? __pfx_kthread+0x10/0x10
> ret_from_fork+0x31/0x70
> ? __pfx_kthread+0x10/0x10
> ret_from_fork_asm+0x1a/0x30
> </TASK>
>
> =============================
> WARNING: suspicious RCU usage
> 6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug #1 Tainted:
> G W L ------- ---
> -----------------------------
> net/netfilter/ipset/ip_set_core.c:1211 suspicious
> rcu_dereference_protected() usage!
>
> other info that might help us debug this:
>
>
> rcu_scheduler_active = 2, debug_locks = 1
> 3 locks held by kworker/u128:1/264:
> #0: ffff88810813c958 ((wq_completion)netns){+.+.}-{0:0}, at:
> process_one_work+0xeab/0x1460
> #1: ffffc90001477da0 (net_cleanup_work){+.+.}-{0:0}, at:
> process_one_work+0x82b/0x1460
> #2: ffffffff97d9ae98 (pernet_ops_rwsem){++++}-{3:3}, at:
> cleanup_net+0xb9/0xa90
>
> stack backtrace:
> CPU: 30 PID: 264 Comm: kworker/u128:1 Tainted: G W L
> ------- --- 6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug
> #1
> Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING
> WIFI, BIOS 2611 04/07/2024
> Workqueue: netns cleanup_net
> Call Trace:
> <TASK>
> dump_stack_lvl+0x84/0xd0
> lockdep_rcu_suspicious.cold+0xa1/0x134
> _destroy_all_sets+0x3a8/0x560 [ip_set]
> ip_set_net_exit+0x20/0x50 [ip_set]
> ops_exit_list+0x99/0x170
> cleanup_net+0x4d9/0xa90
> ? __pfx_cleanup_net+0x10/0x10
> process_one_work+0x8a4/0x1460
> ? worker_thread+0xe3/0x1010
> ? __pfx_process_one_work+0x10/0x10
> ? assign_work+0x16c/0x240
> worker_thread+0x5e6/0x1010
> ? __kthread_parkme+0xb1/0x1d0
> ? __pfx_worker_thread+0x10/0x10
> ? __pfx_worker_thread+0x10/0x10
> kthread+0x2d2/0x3a0
> ? _raw_spin_unlock_irq+0x28/0x60
> ? __pfx_kthread+0x10/0x10
> ret_from_fork+0x31/0x70
> ? __pfx_kthread+0x10/0x10
> ret_from_fork_asm+0x1a/0x30
> </TASK>
> workqueue: gc_worker [nf_conntrack] hogged CPU for >10000us 7 times,
> consider switching to WQ_UNBOUND
> workqueue: gc_worker [nf_conntrack] hogged CPU for >10000us 11 times,
> consider switching to WQ_UNBOUND
>
> Bisect blame this commit:
> commit 4e7aaa6b82d63e8ddcbfb56b4fd3d014ca586f10
> Author: Jozsef Kadlecsik <kadlec@netfilter.org>
> Date: Tue Jun 4 15:58:03 2024 +0200
>
> netfilter: ipset: Fix race between namespace cleanup and gc in the
> list:set type
>
> Lion Ackermann reported that there is a race condition between
> namespace cleanup
> in ipset and the garbage collection of the list:set type. The namespace
> cleanup can destroy the list:set type of sets while the gc of the
> set type is
> waiting to run in rcu cleanup. The latter uses data from the
> destroyed set which
> thus leads use after free. The patch contains the following parts:
>
> - When destroying all sets, first remove the garbage collectors, then wait
> if needed and then destroy the sets.
> - Fix the badly ordered "wait then remove gc" for the destroy a single set
> case.
> - Fix the missing rcu locking in the list:set type in the userspace test
> case.
> - Use proper RCU list handlings in the list:set type.
>
> The patch depends on c1193d9bbbd3 (netfilter: ipset: Add list
> flush to cancel_gc).
>
> Fixes: 97f7cf1cd80e (netfilter: ipset: fix performance regression
> in swap operation)
> Reported-by: Lion Ackermann <nnamrec@gmail.com>
> Tested-by: Lion Ackermann <nnamrec@gmail.com>
> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
>
> net/netfilter/ipset/ip_set_core.c | 81
> ++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------
> net/netfilter/ipset/ip_set_list_set.c | 30 ++++++++++++++----------------
> 2 files changed, 60 insertions(+), 51 deletions(-)
>
> And I can confirm after reverting 4e7aaa6b82d6 the issue is gone.
>
> I also attach the build config and full kernel log.
>
> My hardware specs: https://linux-hardware.org/?probe=80512f0c04
>
> Jozsef can you look into this please?
>
> --
> Best Regards,
> Mike Gavrilov.
prev parent reply other threads:[~2024-06-20 8:43 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-20 6:55 6.10/bisected/regression - commits 4e7aaa6b82d6 cause appearing WARNING at net/netfilter/ipset/ip_set_core.c:1200 suspicious rcu_dereference_protected() usage Mikhail Gavrilov
2024-06-20 8:42 ` Pablo Neira Ayuso [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZnPrkcWkAxQrODju@calendula \
--to=pablo@netfilter.org \
--cc=kadlec@netfilter.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mikhail.v.gavrilov@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=nnamrec@gmail.com \
--cc=regressions@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).