From: Roman Gushchin <roman.gushchin@linux.dev>
To: Dave Chinner <david@fromorbit.com>
Cc: Kirill Tkhai <tkhai@ya.ru>,
akpm@linux-foundation.org, vbabka@suse.cz,
viro@zeniv.linux.org.uk, brauner@kernel.org, djwong@kernel.org,
hughd@google.com, paulmck@kernel.org, muchun.song@linux.dev,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org,
zhengqi.arch@bytedance.com
Subject: Re: [PATCH v2 3/3] fs: Use delayed shrinker unregistration
Date: Mon, 5 Jun 2023 19:56:59 -0700 [thread overview]
Message-ID: <ZH6ge3yiGAotYRR9@P9FQF9L96D> (raw)
In-Reply-To: <ZH6K0McWBeCjaf16@dread.disaster.area>
On Tue, Jun 06, 2023 at 11:24:32AM +1000, Dave Chinner wrote:
> On Mon, Jun 05, 2023 at 05:38:27PM -0700, Roman Gushchin wrote:
> > On Mon, Jun 05, 2023 at 10:03:25PM +0300, Kirill Tkhai wrote:
> > > Kernel test robot reports -88.8% regression in stress-ng.ramfs.ops_per_sec
> > > test case caused by commit: f95bdb700bc6 ("mm: vmscan: make global slab
> > > shrink lockless"). Qi Zheng investigated that the reason is in long SRCU's
> > > synchronize_srcu() occuring in unregister_shrinker().
> > >
> > > This patch fixes the problem by using new unregistration interfaces,
> > > which split unregister_shrinker() in two parts. First part actually only
> > > notifies shrinker subsystem about the fact of unregistration and it prevents
> > > future shrinker methods calls. The second part completes the unregistration
> > > and it insures, that struct shrinker is not used during shrinker chain
> > > iteration anymore, so shrinker memory may be freed. Since the long second
> > > part is called from delayed work asynchronously, it hides synchronize_srcu()
> > > delay from a user.
> > >
> > > Signed-off-by: Kirill Tkhai <tkhai@ya.ru>
> > > ---
> > > fs/super.c | 3 ++-
> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/super.c b/fs/super.c
> > > index 8d8d68799b34..f3e4f205ec79 100644
> > > --- a/fs/super.c
> > > +++ b/fs/super.c
> > > @@ -159,6 +159,7 @@ static void destroy_super_work(struct work_struct *work)
> > > destroy_work);
> > > int i;
> > >
> > > + unregister_shrinker_delayed_finalize(&s->s_shrink);
> > > for (i = 0; i < SB_FREEZE_LEVELS; i++)
> > > percpu_free_rwsem(&s->s_writers.rw_sem[i]);
> > > kfree(s);
> > > @@ -327,7 +328,7 @@ void deactivate_locked_super(struct super_block *s)
> > > {
> > > struct file_system_type *fs = s->s_type;
> > > if (atomic_dec_and_test(&s->s_active)) {
> > > - unregister_shrinker(&s->s_shrink);
> > > + unregister_shrinker_delayed_initiate(&s->s_shrink);
> >
> > Hm, it makes the API more complex and easier to mess with. Like what will happen
> > if the second part is never called? Or it's called without the first part being
> > called first?
>
> Bad things.
Agree.
> Also, it doesn't fix the three other unregister_shrinker() calls in
> the XFS unmount path, nor the three in the ext4/mbcache/jbd2 unmount
> path.
>
> Those are just some of the unregister_shrinker() calls that have
> dynamic contexts that would also need this same fix; I haven't
> audited the 3 dozen other unregister_shrinker() calls around the
> kernel to determine if any of them need similar treatment, too.
>
> IOWs, this patchset is purely a band-aid to fix the reported
> regression, not an actual fix for the underlying problems caused by
> moving the shrinker infrastructure to SRCU protection. This is why
> I really want the SRCU changeover reverted.
>
> Not only are the significant changes the API being necessary, it's
> put the entire shrinker paths under a SRCU critical section. AIUI,
> this means while the shrinkers are running the RCU grace period
> cannot expire and no RCU freed memory will actually get freed until
> the srcu read lock is dropped by the shrinker.
>
> Given the superblock shrinkers are freeing dentry and inode objects
> by RCU freeing, this is also a fairly significant change of
> behaviour. i.e. cond_resched() in the shrinker processing loops no
> longer allows RCU grace periods to expire and have memory freed with
> the shrinkers are running.
>
> Are there problems this will cause? I don't know, but I'm pretty
> sure they haven't even been considered until now....
>
> > Isn't it possible to hide it from a user and call the second part from a work
> > context automatically?
>
> Nope, because it has to be done before the struct shrinker is freed.
> Those are embedded into other structures rather than being
> dynamically allocated objects.
This part we might consider to revisit, if it helps to solve other problems.
Having an extra memory allocation (or two) per mount-point doesn't look
that expensive. Again, iff it helps with more important problems.
Thanks!
next prev parent reply other threads:[~2023-06-06 2:57 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-05 19:02 [PATCH v2 0/3] mm: Make unregistration of super_block shrinker more faster Kirill Tkhai
2023-06-05 19:03 ` [PATCH v2 1/3] mm: vmscan: move shrinker_debugfs_remove() before synchronize_srcu() Kirill Tkhai
2023-06-06 0:31 ` Roman Gushchin
2023-06-05 19:03 ` [PATCH v2 2/3] mm: Split unregister_shrinker() in fast and slow part Kirill Tkhai
2023-06-07 4:49 ` kernel test robot
2023-06-07 7:33 ` Yujie Liu
2023-06-05 19:03 ` [PATCH v2 3/3] fs: Use delayed shrinker unregistration Kirill Tkhai
2023-06-06 0:38 ` Roman Gushchin
2023-06-06 1:24 ` Dave Chinner
2023-06-06 2:56 ` Roman Gushchin [this message]
2023-06-06 6:51 ` Dave Chinner
2023-06-06 15:56 ` Roman Gushchin
2023-06-06 21:21 ` Kirill Tkhai
2023-06-06 22:30 ` Dave Chinner
2023-06-08 16:36 ` Theodore Ts'o
2023-06-08 23:17 ` Dave Chinner
2023-06-09 0:27 ` Andrew Morton
2023-06-09 2:50 ` Qi Zheng
2023-06-05 22:32 ` [PATCH v2 0/3] mm: Make unregistration of super_block shrinker more faster Dave Chinner
2023-06-06 21:06 ` Kirill Tkhai
2023-06-06 22:02 ` Dave Chinner
2023-06-07 2:51 ` Qi Zheng
2023-06-08 21:58 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZH6ge3yiGAotYRR9@P9FQF9L96D \
--to=roman.gushchin@linux.dev \
--cc=akpm@linux-foundation.org \
--cc=brauner@kernel.org \
--cc=david@fromorbit.com \
--cc=djwong@kernel.org \
--cc=hughd@google.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-xfs@vger.kernel.org \
--cc=muchun.song@linux.dev \
--cc=paulmck@kernel.org \
--cc=tkhai@ya.ru \
--cc=vbabka@suse.cz \
--cc=viro@zeniv.linux.org.uk \
--cc=zhengqi.arch@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).