From: Uladzislau Rezki <urezki@gmail.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Florian Westphal <fw@strlen.de>,
linux-kernel@vger.kernel.org, Uladzislau Rezki <urezki@gmail.com>,
Vlastimil Babka <vbabka@suse.cz>,
Kent Overstreet <kent.overstreet@linux.dev>,
Ben Greear <greearb@candelatech.com>
Subject: Re: [PATCH lib] lib: alloc_tag_module_unload must wait for pending kfree_rcu calls
Date: Tue, 8 Oct 2024 09:14:58 +0200
Message-ID: <ZwTb8tMVVqrpZIv2@pc636>
In-Reply-To: <CAJuCfpGZg8Pydy4rGUefOBgwJZ5C6_s3p913oFQJSVV+S9MQoA@mail.gmail.com>
On Mon, Oct 07, 2024 at 06:49:32PM -0700, Suren Baghdasaryan wrote:
> On Mon, Oct 7, 2024 at 6:15 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Mon, 7 Oct 2024 22:52:24 +0200 Florian Westphal <fw@strlen.de> wrote:
> >
> > > Ben Greear reports following splat:
> > > ------------[ cut here ]------------
> > > net/netfilter/nf_nat_core.c:1114 module nf_nat func:nf_nat_register_fn has 256 allocated at module unload
> > > WARNING: CPU: 1 PID: 10421 at lib/alloc_tag.c:168 alloc_tag_module_unload+0x22b/0x3f0
> > > Modules linked in: nf_nat(-) btrfs ufs qnx4 hfsplus hfs minix vfat msdos fat
> > > ...
> > > Hardware name: Default string Default string/SKYBAY, BIOS 5.12 08/04/2020
> > > RIP: 0010:alloc_tag_module_unload+0x22b/0x3f0
> > > codetag_unload_module+0x19b/0x2a0
> > > ? codetag_load_module+0x80/0x80
> > >
> > > nf_nat module exit calls kfree_rcu on those addresses, but the free
> > > operation is likely still pending by the time alloc_tag checks for leaks.
> > >
> > > Wait for outstanding kfree_rcu operations to complete before checking
> > > resolves this warning.
> > >
> > > Reproducer:
> > > unshare -n iptables-nft -t nat -A PREROUTING -p tcp
> > > grep nf_nat /proc/allocinfo # will list 4 allocations
> > > rmmod nft_chain_nat
> > > rmmod nf_nat # will WARN.
> > >
> > > ...
> > >
> > > --- a/lib/codetag.c
> > > +++ b/lib/codetag.c
> > > @@ -228,6 +228,8 @@ bool codetag_unload_module(struct module *mod)
> > >  	if (!mod)
> > >  		return true;
> > >
> > > +	kvfree_rcu_barrier();
> > > +
> > >  	mutex_lock(&codetag_lock);
> > >  	list_for_each_entry(cttype, &codetag_types, link) {
> > >  		struct codetag_module *found = NULL;
> >
> > It's always hard to determine why a thing like this is present, so a
> > comment is helpful:
> >
> > --- a/lib/codetag.c~lib-alloc_tag_module_unload-must-wait-for-pending-kfree_rcu-calls-fix
> > +++ a/lib/codetag.c
> > @@ -228,6 +228,7 @@ bool codetag_unload_module(struct module
> >  	if (!mod)
> >  		return true;
> >
> > +	/* await any module's kfree_rcu() operations to complete */
> >  	kvfree_rcu_barrier();
> >
> >  	mutex_lock(&codetag_lock);
> > _
> >
> > But I do wonder whether this is in the correct place.
> >
> > Waiting for a module's ->exit() function's kfree_rcu()s to complete
> > should properly be done by the core module handling code?
>
> I don't think core module code cares about kfree_rcu()s being complete
> before the module is unloaded.
> Allocation tagging OTOH cares because it is about to destroy tags
> which will be accessed when kfree() actually happens, therefore a
> strict ordering is important.
>
> >
> > free_module() does a full-on synchronize_rcu() prior to freeing the
> > module memory itself and I think codetag_unload_module() could be
> > called after that?
>
> I think we could move codetag_unload_module() after synchronize_rcu()
> inside free_module() but according to the reply in
> https://lore.kernel.org/all/20241007112904.GA27104@breakpoint.cc/
> synchronize_rcu() does not help. I'm not quite sure why...
>
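That is because synchronize_rcu() serves a slightly different purpose:
it only waits for a grace period (GP) to complete. Offloading objects
queued by kfree_rcu() can span several GPs, so one completed GP does not
guarantee that they have been freed.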
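
To illustrate the difference, here is a toy, single-threaded userspace
model. It is not kernel code: struct toy, the three stages and all
toy_*() helpers are made up for this sketch, and the real batching is
more involved. It only assumes that a queued object must first be
attached to a GP and is freed once that GP has completed:

```c
#include <assert.h>

/* Toy model only; every name here is hypothetical, not a kernel API. */
enum stage { QUEUED, WAITING_GP, FREED };

#define NOBJ 4

struct toy {
	enum stage obj[NOBJ];	/* where each deferred free currently is */
	unsigned long gp;	/* number of modelled GPs that completed */
};

/* All objects start out queued, as right after a kfree_rcu()-like call. */
void toy_init(struct toy *t)
{
	for (int i = 0; i < NOBJ; i++)
		t->obj[i] = QUEUED;
	t->gp = 0;
}

/* Count objects that have not been freed yet. */
int toy_pending(const struct toy *t)
{
	int n = 0;

	for (int i = 0; i < NOBJ; i++)
		if (t->obj[i] != FREED)
			n++;
	return n;
}

/*
 * One modelled GP completes: objects already attached to a GP are freed,
 * objects that were only queued get attached to the next GP.
 */
void toy_gp_elapse(struct toy *t)
{
	for (int i = 0; i < NOBJ; i++) {
		if (t->obj[i] == WAITING_GP)
			t->obj[i] = FREED;
		else if (t->obj[i] == QUEUED)
			t->obj[i] = WAITING_GP;
	}
	t->gp++;
}

/* Stand-in for synchronize_rcu(): wait for exactly one GP. */
void toy_synchronize(struct toy *t)
{
	toy_gp_elapse(t);
}

/* Stand-in for a barrier: keep going until nothing is left pending. */
void toy_barrier(struct toy *t)
{
	while (toy_pending(t))
		toy_gp_elapse(t);
	assert(toy_pending(t) == 0);
}
```

In this model, calling toy_synchronize() right after toy_init() still
leaves all NOBJ objects pending, whereas toy_barrier() only returns once
none are pending, which here takes two modelled GPs.
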
> Note that once I'm done upstreaming
> https://lore.kernel.org/all/20240902044128.664075-3-surenb@google.com/,
> this change will not be needed and I'm planning to remove this call,
> however this change is useful for backporting. It should be sent to
> stable@vger.kernel.org # v6.10+
>
kvfree_rcu_barrier() was only added in v6.12:
<snip>
urezki@pc638:~/data/raid0/coding/linux.git$ git tag --contains 2b55d6a42d14c8675e38d6d9adca3014fdf01951
next-20240912
next-20240919
next-20240920
next-20241002
v6.12-rc1
urezki@pc638:~/data/raid0/coding/linux.git$
<snip>
So for 6.10+, the commit mentioned above has to be backported as well.
--
Uladzislau Rezki