From: Steven Whitehouse <swhiteho@redhat.com>
To: Bob Peterson <rpeterso@redhat.com>,
cluster-devel@redhat.com, linux-fsdevel@vger.kernel.org,
Dave Chinner <dchinner@redhat.com>
Cc: linux-kernel@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [Cluster-devel] [PATCH 2/2] GFS2: Add a gfs2-specific prune_icache_sb
Date: Sun, 26 Jun 2016 14:18:52 +0100 [thread overview]
Message-ID: <576FD63C.6020805@redhat.com> (raw)
In-Reply-To: <1466797811-5873-3-git-send-email-rpeterso@redhat.com>
Hi,
I think the idea looks good. A couple of comments below though...
On 24/06/16 20:50, Bob Peterson wrote:
> This patch adds a new prune_icache_sb function for the VFS slab
> shrinker to call. Trying to directly free the inodes from memory
> might deadlock because it evicts inodes, which calls into DLM
> to acquire the glock. The DLM, in turn, may block on a pending
> fence operation, which may already be blocked on memory allocation
> that caused the slab shrinker to be called in the first place.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
> fs/gfs2/incore.h | 2 ++
> fs/gfs2/ops_fstype.c | 1 +
> fs/gfs2/quota.c | 25 +++++++++++++++++++++++++
> fs/gfs2/super.c | 13 +++++++++++++
> 4 files changed, 41 insertions(+)
>
> diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
> index a6a3389..a367459 100644
> --- a/fs/gfs2/incore.h
> +++ b/fs/gfs2/incore.h
> @@ -757,6 +757,8 @@ struct gfs2_sbd {
>
> struct task_struct *sd_logd_process;
> struct task_struct *sd_quotad_process;
> + int sd_iprune; /* inodes to prune */
> + spinlock_t sd_shrinkspin;
>
> /* Quota stuff */
>
> diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
> index 4546360..65a69be 100644
> --- a/fs/gfs2/ops_fstype.c
> +++ b/fs/gfs2/ops_fstype.c
> @@ -95,6 +95,7 @@ static struct gfs2_sbd *init_sbd(struct super_block *sb)
> spin_lock_init(&sdp->sd_jindex_spin);
> mutex_init(&sdp->sd_jindex_mutex);
> init_completion(&sdp->sd_journal_ready);
> + spin_lock_init(&sdp->sd_shrinkspin);
>
> INIT_LIST_HEAD(&sdp->sd_quota_list);
> mutex_init(&sdp->sd_quota_mutex);
> diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
> index ce7d69a..5810a2c 100644
> --- a/fs/gfs2/quota.c
> +++ b/fs/gfs2/quota.c
> @@ -1528,14 +1528,39 @@ void gfs2_wake_up_statfs(struct gfs2_sbd *sdp) {
> int gfs2_quotad(void *data)
> {
> struct gfs2_sbd *sdp = data;
> + struct super_block *sb = sdp->sd_vfs;
> struct gfs2_tune *tune = &sdp->sd_tune;
> unsigned long statfs_timeo = 0;
> unsigned long quotad_timeo = 0;
> unsigned long t = 0;
> DEFINE_WAIT(wait);
> int empty;
> + int rc;
> + struct shrink_control sc = {.gfp_mask = GFP_KERNEL, };
>
> while (!kthread_should_stop()) {
> + /* TODO: Deal with shrinking of dcache */
> + /* Prune any inode cache intended by the shrinker. */
> + spin_lock(&sdp->sd_shrinkspin);
> + if (sdp->sd_iprune > 0) {
> + sc.nr_to_scan = sdp->sd_iprune;
> + if (sc.nr_to_scan > 1024)
> + sc.nr_to_scan = 1024;
> + sdp->sd_iprune -= sc.nr_to_scan;
> + spin_unlock(&sdp->sd_shrinkspin);
> + rc = prune_icache_sb(sb, &sc);
> + if (rc < 0) {
> + spin_lock(&sdp->sd_shrinkspin);
> + sdp->sd_iprune = 0;
> + spin_unlock(&sdp->sd_shrinkspin);
> + }
> + if (sdp->sd_iprune) {
> + cond_resched();
> + continue;
> + }
> + } else {
> + spin_unlock(&sdp->sd_shrinkspin);
> + }
>
> /* Update the master statfs file */
> if (sdp->sd_statfs_force_sync) {
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index 9b2ff353..75e8a85 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -1667,6 +1667,18 @@ static void gfs2_destroy_inode(struct inode *inode)
> call_rcu(&inode->i_rcu, gfs2_i_callback);
> }
>
> +static long gfs2_prune_icache_sb(struct super_block *sb,
> + struct shrink_control *sc)
> +{
> + struct gfs2_sbd *sdp;
> +
> + sdp = sb->s_fs_info;
> + spin_lock(&sdp->sd_shrinkspin);
> + sdp->sd_iprune = sc->nr_to_scan + 1;
> + spin_unlock(&sdp->sd_shrinkspin);
> + return 0;
> +}
This doesn't wake up the thread that will do the reclaim, so that there
may be a significant delay between the request to shrink and the actual
shrink. Also, using quotad is not a good plan, since it might itself
block waiting for memory. This should be done by a thread on its own to
avoid any deadlock possibility here.
There also appears to be a limit of 1024 to scan per run of quotad,
which means it would take a very long time to push out any significant
number, and it does seem a bit arbitrary - was there a reason for
selecting that number? It would probably be better to simply yield every
now and then if there are a lot of items to process,
Steve.
> +
> const struct super_operations gfs2_super_ops = {
> .alloc_inode = gfs2_alloc_inode,
> .destroy_inode = gfs2_destroy_inode,
> @@ -1681,5 +1693,6 @@ const struct super_operations gfs2_super_ops = {
> .remount_fs = gfs2_remount_fs,
> .drop_inode = gfs2_drop_inode,
> .show_options = gfs2_show_options,
> + .prune_icache_sb = gfs2_prune_icache_sb,
> };
>
next prev parent reply other threads:[~2016-06-26 13:19 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-24 19:50 [PATCH 0/2] Per superblock inode reclaim Bob Peterson
2016-06-24 19:50 ` [PATCH 1/2] vfs: Add hooks for filesystem-specific prune_icache_sb Bob Peterson
2016-06-28 1:10 ` Dave Chinner
2016-06-24 19:50 ` [PATCH 2/2] GFS2: Add a gfs2-specific prune_icache_sb Bob Peterson
2016-06-26 13:18 ` Steven Whitehouse [this message]
2016-06-28 2:08 ` Dave Chinner
2016-06-28 9:13 ` [Cluster-devel] " Steven Whitehouse
2016-06-28 22:47 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=576FD63C.6020805@redhat.com \
--to=swhiteho@redhat.com \
--cc=cluster-devel@redhat.com \
--cc=dchinner@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=rpeterso@redhat.com \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).