From: Dave Chinner via Virtualization <virtualization@lists.linux-foundation.org>
To: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: djwong@kernel.org, roman.gushchin@linux.dev,
dri-devel@lists.freedesktop.org,
virtualization@lists.linux-foundation.org, linux-mm@kvack.org,
dm-devel@redhat.com, linux-ext4@vger.kernel.org,
paulmck@kernel.org, linux-arm-msm@vger.kernel.org,
intel-gfx@lists.freedesktop.org, linux-nfs@vger.kernel.org,
linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org,
vbabka@suse.cz, brauner@kernel.org, tytso@mit.edu,
linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
linux-fsdevel@vger.kernel.org, akpm@linux-foundation.org,
linux-btrfs@vger.kernel.org, tkhai@ya.ru
Subject: Re: [PATCH 02/29] mm: vmscan: introduce some helpers for dynamically allocating shrinker
Date: Fri, 23 Jun 2023 16:12:03 +1000 [thread overview]
Message-ID: <ZJU3s8tyGsYTVS8f@dread.disaster.area> (raw)
In-Reply-To: <20230622085335.77010-3-zhengqi.arch@bytedance.com>
On Thu, Jun 22, 2023 at 04:53:08PM +0800, Qi Zheng wrote:
> Introduce some helpers for dynamically allocating shrinker instances;
> their uses are as follows:
>
> 1. shrinker_alloc_and_init()
>
> Used to allocate and initialize a shrinker instance; the priv_data
> parameter is used to pass a pointer to the structure in which the
> shrinker instance was previously embedded.
>
> 2. shrinker_free()
>
> Used to free the shrinker instance when the registration of shrinker
> fails.
>
> 3. unregister_and_free_shrinker()
>
> Used to unregister and free the shrinker instance, and the kfree()
> will be changed to kfree_rcu() later.
>
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
> ---
> include/linux/shrinker.h | 12 ++++++++++++
> mm/vmscan.c | 35 +++++++++++++++++++++++++++++++++++
> 2 files changed, 47 insertions(+)
>
> diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> index 43e6fcabbf51..8e9ba6fa3fcc 100644
> --- a/include/linux/shrinker.h
> +++ b/include/linux/shrinker.h
> @@ -107,6 +107,18 @@ extern void unregister_shrinker(struct shrinker *shrinker);
> extern void free_prealloced_shrinker(struct shrinker *shrinker);
> extern void synchronize_shrinkers(void);
>
> +typedef unsigned long (*count_objects_cb)(struct shrinker *s,
> + struct shrink_control *sc);
> +typedef unsigned long (*scan_objects_cb)(struct shrinker *s,
> + struct shrink_control *sc);
> +
> +struct shrinker *shrinker_alloc_and_init(count_objects_cb count,
> + scan_objects_cb scan, long batch,
> + int seeks, unsigned flags,
> + void *priv_data);
> +void shrinker_free(struct shrinker *shrinker);
> +void unregister_and_free_shrinker(struct shrinker *shrinker);
Hmmmm. Not exactly how I envisioned this to be done.
Ok, this will definitely work, but I don't think it is an
improvement. It's certainly not what I was thinking of when I
suggested dynamically allocating shrinkers.
The main issue is that this doesn't simplify the API - it expands it
and creates a minefield of old and new functions that have to be
used in exactly the right order for the right things to happen.
What I was thinking of was moving the entire shrinker setup code
over to the prealloc/register_prepared() algorithm, where the setup
is already separated from the activation of the shrinker.
That is, we start by renaming prealloc_shrinker() to
shrinker_alloc(), adding a flags field to tell it everything that it
needs to alloc (i.e. the NUMA/MEMCG_AWARE flags) and having it
return a fully allocated shrinker ready to register. Initially
this also contains an internal flag to say the shrinker was
allocated so that unregister_shrinker() knows to free it.
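In userspace-C terms, the allocation step might look something like the
sketch below. All names, flag values and fields here are illustrative
stand-ins, not the actual kernel definitions, and the printf-style name
argument the real proposal takes is omitted for brevity:

```c
#include <stdlib.h>

/* Illustrative flag values; the real kernel definitions differ. */
#define SHRINKER_NUMA_AWARE	(1U << 0)
#define SHRINKER_MEMCG_AWARE	(1U << 1)
#define SHRINKER_ALLOCATED	(1U << 7)	/* internal: unregister frees */

struct shrink_control;

struct shrinker {
	unsigned long (*count_objects)(struct shrinker *,
				       struct shrink_control *);
	unsigned long (*scan_objects)(struct shrinker *,
				      struct shrink_control *);
	long batch;
	int seeks;
	unsigned int flags;
	void *private;
};

/*
 * Allocate a shrinker ready for the caller to fill out and then
 * activate with register_shrinker_prepared().  The internal
 * SHRINKER_ALLOCATED flag records that teardown must free it.
 */
struct shrinker *shrinker_alloc(unsigned int flags)
{
	struct shrinker *s = calloc(1, sizeof(*s));

	if (!s)
		return NULL;
	s->flags = flags | SHRINKER_ALLOCATED;
	s->seeks = 2;	/* stand-in for DEFAULT_SEEKS */
	return s;
}
```

The point is that everything needing allocation happens in one call,
driven by the flags, and the "was this allocated?" state lives inside
the shrinker itself rather than in the caller's head.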
The caller then fills out the shrinker functions, seeks, etc. just
like they do now, and then calls register_shrinker_prepared() to make
the shrinker active when it wants to turn it on.
When it is time to tear down the shrinker, no API needs to change.
unregister_shrinker() does all the shutdown and frees all the
internal memory like it does now. If the shrinker is also marked as
allocated, it frees the shrinker via RCU, too.
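In illustrative userspace-C terms (again, hypothetical names, and a
plain free() standing in for what the kernel would do with
kfree_rcu()), the teardown side might look like:

```c
#include <stdlib.h>

#define SHRINKER_ALLOCATED	(1U << 7)	/* illustrative internal flag */

struct shrinker {
	unsigned int flags;
	/* ... callbacks, batch, seeks, per-node counters, etc. ... */
};

/*
 * Tear down a shrinker: deactivate it, release its internal state,
 * and, only if it was dynamically allocated, free the structure
 * itself.  In the kernel the final free would be kfree_rcu() so
 * that concurrent lockless walkers never touch freed memory.
 */
void unregister_shrinker(struct shrinker *s)
{
	/* ... unlink from the shrinker list, free internal memory ... */
	if (s->flags & SHRINKER_ALLOCATED)
		free(s);	/* kernel: kfree_rcu(shrinker, rcu) */
}
```

An embedded shrinker (flag clear) is left for its owner to dispose of,
while an allocated one is reclaimed here, so callers never need a
separate "unregister and free" variant.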
Once everything is converted to this API, we then remove
register_shrinker(), rename register_shrinker_prepared() to
shrinker_register(), rename unregister_shrinker to
shrinker_unregister(), get rid of the internal "allocated" flag
and always free the shrinker.
At the end of the patchset, every shrinker should be set
up in a manner like this:
sb->shrinker = shrinker_alloc(SHRINKER_MEMCG_AWARE|SHRINKER_NUMA_AWARE,
"sb-%s", type->name);
if (!sb->shrinker)
return -ENOMEM;
sb->shrinker->count_objects = super_cache_count;
sb->shrinker->scan_objects = super_cache_scan;
sb->shrinker->batch = 1024;
sb->shrinker->private = sb;
.....
shrinker_register(sb->shrinker);
And teardown is just a call to shrinker_unregister(sb->shrinker)
as it is now.
i.e. the entire shrinker registration API is now just three
functions, down from the current four, and much simpler than the
seven functions this patch set results in...
The other advantage of this is that it will break all the existing
out of tree code and third party modules using the old API and will
no longer work with a kernel using lockless slab shrinkers. They
need to break (both at the source and binary levels) to stop bad
things from happening due to using unconverted shrinkers in the new
setup.
-Dave.
--
Dave Chinner
david@fromorbit.com