From: "Darrick J. Wong" <djwong@kernel.org>
To: Qi Zheng <qi.zheng@linux.dev>
Cc: Dave Chinner <david@fromorbit.com>,
akpm@linux-foundation.org, tkhai@ya.ru, roman.gushchin@linux.dev,
vbabka@suse.cz, viro@zeniv.linux.org.uk, brauner@kernel.org,
hughd@google.com, paulmck@kernel.org, muchun.song@linux.dev,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org,
Qi Zheng <zhengqi.arch@bytedance.com>
Subject: Re: [PATCH 6/8] xfs: introduce xfs_fs_destroy_super()
Date: Fri, 2 Jun 2023 08:15:32 -0700 [thread overview]
Message-ID: <20230602151532.GP16865@frogsfrogsfrogs> (raw)
In-Reply-To: <2f34a702-1a57-06a5-1bd9-de54a67a839e@linux.dev>
On Fri, Jun 02, 2023 at 11:13:09AM +0800, Qi Zheng wrote:
> Hi Dave,
>
> On 2023/6/2 07:06, Dave Chinner wrote:
> > On Thu, Jun 01, 2023 at 04:43:32PM +0800, Qi Zheng wrote:
> > > Hi Dave,
> > > On 2023/6/1 07:48, Dave Chinner wrote:
> > > > On Wed, May 31, 2023 at 09:57:40AM +0000, Qi Zheng wrote:
> > > > > From: Kirill Tkhai <tkhai@ya.ru>
> > > > I don't really like this ->destroy_super() callback, especially as
> > > > it's completely undocumented as to why it exists. This is purely a
> > > > work-around for handling extended filesystem superblock shrinker
> > > > functionality, yet there's nothing that tells the reader this.
> > > >
> > > > It also seems to imply that the superblock shrinker can continue to
> > > > run after the existing unregister_shrinker() call before ->kill_sb()
> > > > is called. This violates the assumption made in filesystems that the
> > > > superblock shrinkers have been stopped and will never run again
> > > > before ->kill_sb() is called. Hence ->kill_sb() implementations
> > > > assume there is nothing else accessing filesystem owned structures
> > > > and it can tear down internal structures safely.
> > > >
> > > > Realistically, the days of XFS using this superblock shrinker
> > > > extension are numbered. We've got a lot of the infrastructure we
> > > > need in place to get rid of the background inode reclaim
> > > > infrastructure that requires this shrinker extension, and it's on my
> > > > list of things that need to be addressed in the near future.
> > > >
> > > > In fact, now that I look at it, I think the shmem usage of this
> > > > superblock shrinker interface is broken - it returns SHRINK_STOP to
> > > > ->free_cached_objects(), but the only valid return value is the
> > > > number of objects freed (i.e. 0 is nothing freed). These special
> > > > superblock extension interfaces do not work like a normal
> > > > shrinker....
> > > >
> > > > Hence I think the shmem usage should be replaced with an separate
> > > > internal shmem shrinker that is managed by the filesystem itself
> > > > (similar to how XFS has multiple internal shrinkers).
> > > >
> > > > At this point, then the only user of this interface is (again) XFS.
> > > > Given this, adding new VFS methods for a single filesystem
> > > > for functionality that is planned to be removed is probably not the
> > > > best approach to solving the problem.
> > >
> > > Thanks for such a detailed analysis. Kirill Tkhai just proposeed a
> > > new method[1], I cc'd you on the email.
> >
> > I;ve just read through that thread, and I've looked at the original
> > patch that caused the regression.
> >
> > I'm a bit annoyed right now. Nobody cc'd me on the original patches
> > nor were any of the subsystems that use shrinkers were cc'd on the
> > patches that changed shrinker behaviour. I only find out about this
>
> Sorry about that, my mistake. I followed the results of
> scripts/get_maintainer.pl before.
Sometimes I wonder if people who contribute a lot to a subsystem should
be more aggressive about listing themselves explicitly in MAINTAINERS
but then I look at the ~600 emails that came in while I was on vacation
for 6 days over a long weekend and ... shut up. :P
> > because someone tries to fix something they broke by *breaking more
> > stuff* and not even realising how broken what they are proposing is.
>
> Yes, this slows down the speed of umount. But the benefit is that
> slab shrink becomes lockless, the mount operation and slab shrink no
> longer affect each other, and the IPC no longer drops significantly,
> etc.
The lockless shrink seems like a good thing to have, but ... is it
really true that the superblock shrinker can still be running after
->kill_sb? /That/ is surprising to me.
--D
> And I used bpftrace to measure the time consumption of
> unregister_shrinker():
>
> ```
> And I just tested it on a physical machine (Intel(R) Xeon(R) Platinum
> 8260 CPU @ 2.40GHz) and the results are as follows:
>
> 1) use synchronize_srcu():
>
> @ns[umount]:
> [8K, 16K) 83 |@@@@@@@ |
> [16K, 32K) 578
> |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [32K, 64K) 78 |@@@@@@@ |
> [64K, 128K) 6 | |
> [128K, 256K) 7 | |
> [256K, 512K) 29 |@@ |
> [512K, 1M) 51 |@@@@ |
> [1M, 2M) 90 |@@@@@@@@ |
> [2M, 4M) 70 |@@@@@@ |
> [4M, 8M) 8 | |
>
> 2) use synchronize_srcu_expedited():
>
> @ns[umount]:
> [8K, 16K) 31 |@@ |
> [16K, 32K) 803
> |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [32K, 64K) 158 |@@@@@@@@@@ |
> [64K, 128K) 4 | |
> [128K, 256K) 2 | |
> [256K, 512K) 2 | |
> ```
>
> With synchronize_srcu(), most of the time consumption is between 16us and
> 32us, the worst case between 4ms and 8ms. Is this totally
> unacceptable?
>
> This performance regression report comes from a stress test. Will the
> umount action be executed so frequently under real workloads?
>
> If there are really unacceptable, after applying the newly proposed
> method, umount will be as fast as before (or even faster).
>
> Thanks,
> Qi
>
> >
> > The previous code was not broken and it provided specific guarantees
> > to subsystems via unregister_shrinker(). From the above discussion,
> > it appears that the original authors of these changes either did not
> > know about or did not understand them, so that casts doubt in my
> > mind about the attempted solution and all the proposed fixes for it.
> >
> > I don't have the time right now unravel this mess and fully
> > understand the original problem, changes or the band-aids that are
> > being thrown around. We are also getting quite late in the cycle to
> > be doing major surgery to critical infrastructure, especially as it
> > gives so little time to review regression test whatever new solution
> > is proposed.
> >
> > Given this appears to be a change introduced in 6.4-rc1, I think the
> > right thing to do is to revert the change rather than make things
> > worse by trying to shove some "quick fix" into the kernel to address
> > it.
> >
> > Andrew, could you please sort out a series to revert this shrinker
> > infrastructure change and all the dependent hacks that have been
> > added to try to fix it so far?
> >
> > -Dave.
>
> --
> Thanks,
> Qi
next prev parent reply other threads:[~2023-06-02 15:15 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-31 9:57 [PATCH 0/8] make unregistration of super_block shrinker more faster Qi Zheng
2023-05-31 9:57 ` [PATCH 1/8] mm: vmscan: move shrinker_debugfs_remove() before synchronize_srcu() Qi Zheng
2023-05-31 10:49 ` Christian Brauner
2023-05-31 12:52 ` Qi Zheng
2023-05-31 9:57 ` [PATCH 2/8] mm: vmscan: split unregister_shrinker() Qi Zheng
2023-05-31 22:57 ` Dave Chinner
2023-05-31 9:57 ` [PATCH 3/8] fs: move list_lru_destroy() to destroy_super_work() Qi Zheng
2023-05-31 23:00 ` Dave Chinner
2023-05-31 9:57 ` [PATCH 4/8] fs: shrink only (SB_ACTIVE|SB_BORN) superblocks in super_cache_scan() Qi Zheng
2023-05-31 23:12 ` Dave Chinner
2023-05-31 9:57 ` [PATCH 5/8] fs: introduce struct super_operations::destroy_super() callback Qi Zheng
2023-05-31 11:19 ` Christian Brauner
2023-05-31 12:54 ` Qi Zheng
2023-05-31 9:57 ` [PATCH 6/8] xfs: introduce xfs_fs_destroy_super() Qi Zheng
2023-05-31 23:48 ` Dave Chinner
2023-06-01 8:43 ` Qi Zheng
2023-06-01 9:58 ` Christian Brauner
2023-06-01 23:06 ` Dave Chinner
2023-06-02 3:13 ` Qi Zheng
2023-06-02 15:15 ` Darrick J. Wong [this message]
2023-06-05 11:50 ` Christian Brauner
2023-06-05 12:16 ` Qi Zheng
2023-05-31 9:57 ` [PATCH 7/8] shmem: implement shmem_destroy_super() Qi Zheng
2023-05-31 9:57 ` [PATCH 8/8] fs: use unregister_shrinker_delayed_{initiate, finalize} for super_block shrinker Qi Zheng
[not found] ` <20230531114054.bf077db642aa9c58c0831687@linux-foundation.org>
2023-06-01 8:46 ` [PATCH 0/8] make unregistration of super_block shrinker more faster Qi Zheng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230602151532.GP16865@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=brauner@kernel.org \
--cc=david@fromorbit.com \
--cc=hughd@google.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-xfs@vger.kernel.org \
--cc=muchun.song@linux.dev \
--cc=paulmck@kernel.org \
--cc=qi.zheng@linux.dev \
--cc=roman.gushchin@linux.dev \
--cc=tkhai@ya.ru \
--cc=vbabka@suse.cz \
--cc=viro@zeniv.linux.org.uk \
--cc=zhengqi.arch@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).