linux-kernel.vger.kernel.org archive mirror
From: Steven Whitehouse <swhiteho@redhat.com>
To: Bob Peterson <rpeterso@redhat.com>,
	cluster-devel@redhat.com, linux-fsdevel@vger.kernel.org,
	Dave Chinner <dchinner@redhat.com>
Cc: linux-kernel@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [Cluster-devel] [PATCH 2/2] GFS2: Add a gfs2-specific prune_icache_sb
Date: Sun, 26 Jun 2016 14:18:52 +0100
Message-ID: <576FD63C.6020805@redhat.com>
In-Reply-To: <1466797811-5873-3-git-send-email-rpeterso@redhat.com>

Hi,

I think the idea looks good. A couple of comments below though...

On 24/06/16 20:50, Bob Peterson wrote:
> This patch adds a new prune_icache_sb function for the VFS slab
> shrinker to call. Trying to directly free the inodes from memory
> might deadlock because it evicts inodes, which calls into DLM
> to acquire the glock. The DLM, in turn, may block on a pending
> fence operation, which may already be blocked on memory allocation
> that caused the slab shrinker to be called in the first place.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
>   fs/gfs2/incore.h     |  2 ++
>   fs/gfs2/ops_fstype.c |  1 +
>   fs/gfs2/quota.c      | 25 +++++++++++++++++++++++++
>   fs/gfs2/super.c      | 13 +++++++++++++
>   4 files changed, 41 insertions(+)
>
> diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
> index a6a3389..a367459 100644
> --- a/fs/gfs2/incore.h
> +++ b/fs/gfs2/incore.h
> @@ -757,6 +757,8 @@ struct gfs2_sbd {
>   
>   	struct task_struct *sd_logd_process;
>   	struct task_struct *sd_quotad_process;
> +	int sd_iprune; /* inodes to prune */
> +	spinlock_t sd_shrinkspin;
>   
>   	/* Quota stuff */
>   
> diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
> index 4546360..65a69be 100644
> --- a/fs/gfs2/ops_fstype.c
> +++ b/fs/gfs2/ops_fstype.c
> @@ -95,6 +95,7 @@ static struct gfs2_sbd *init_sbd(struct super_block *sb)
>   	spin_lock_init(&sdp->sd_jindex_spin);
>   	mutex_init(&sdp->sd_jindex_mutex);
>   	init_completion(&sdp->sd_journal_ready);
> +	spin_lock_init(&sdp->sd_shrinkspin);
>   
>   	INIT_LIST_HEAD(&sdp->sd_quota_list);
>   	mutex_init(&sdp->sd_quota_mutex);
> diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
> index ce7d69a..5810a2c 100644
> --- a/fs/gfs2/quota.c
> +++ b/fs/gfs2/quota.c
> @@ -1528,14 +1528,39 @@ void gfs2_wake_up_statfs(struct gfs2_sbd *sdp) {
>   int gfs2_quotad(void *data)
>   {
>   	struct gfs2_sbd *sdp = data;
> +	struct super_block *sb = sdp->sd_vfs;
>   	struct gfs2_tune *tune = &sdp->sd_tune;
>   	unsigned long statfs_timeo = 0;
>   	unsigned long quotad_timeo = 0;
>   	unsigned long t = 0;
>   	DEFINE_WAIT(wait);
>   	int empty;
> +	int rc;
> +	struct shrink_control sc = {.gfp_mask = GFP_KERNEL, };
>   
>   	while (!kthread_should_stop()) {
> +		/* TODO: Deal with shrinking of dcache */
> +		/* Prune any inode cache intended by the shrinker. */
> +		spin_lock(&sdp->sd_shrinkspin);
> +		if (sdp->sd_iprune > 0) {
> +			sc.nr_to_scan = sdp->sd_iprune;
> +			if (sc.nr_to_scan > 1024)
> +				sc.nr_to_scan = 1024;
> +			sdp->sd_iprune -= sc.nr_to_scan;
> +			spin_unlock(&sdp->sd_shrinkspin);
> +			rc = prune_icache_sb(sb, &sc);
> +			if (rc < 0) {
> +				spin_lock(&sdp->sd_shrinkspin);
> +				sdp->sd_iprune = 0;
> +				spin_unlock(&sdp->sd_shrinkspin);
> +			}
> +			if (sdp->sd_iprune) {
> +				cond_resched();
> +				continue;
> +			}
> +		} else {
> +			spin_unlock(&sdp->sd_shrinkspin);
> +		}
>   
>   		/* Update the master statfs file */
>   		if (sdp->sd_statfs_force_sync) {
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index 9b2ff353..75e8a85 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -1667,6 +1667,18 @@ static void gfs2_destroy_inode(struct inode *inode)
>   	call_rcu(&inode->i_rcu, gfs2_i_callback);
>   }
>   
> +static long gfs2_prune_icache_sb(struct super_block *sb,
> +				 struct shrink_control *sc)
> +{
> +	struct gfs2_sbd *sdp;
> +
> +	sdp = sb->s_fs_info;
> +	spin_lock(&sdp->sd_shrinkspin);
> +	sdp->sd_iprune = sc->nr_to_scan + 1;
> +	spin_unlock(&sdp->sd_shrinkspin);
> +	return 0;
> +}

This doesn't wake up the thread that will do the reclaim, so there may
be a significant delay between the request to shrink and the actual
shrink. Also, using quotad is not a good plan, since it might itself
block waiting for memory. This should be done by a dedicated thread to
avoid any possibility of deadlock here.
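
To illustrate the wake-up point, here is a rough userspace sketch using
POSIX threads rather than kernel primitives (in the kernel this would be
a kthread plus a wait queue or wake_up_process()). All of the names
below (struct reclaimer, reclaimer_request() and so on) are made up for
the sketch and are not part of the patch. It also assumes requests
accumulate, whereas the patch overwrites sd_iprune.

```c
/* Userspace analogy only, not kernel code. All names are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct reclaimer {
	pthread_mutex_t lock;
	pthread_cond_t wake;
	long pending;	/* objects requested for reclaim, not yet handled */
	long reclaimed;	/* objects handled so far */
	bool stop;
};

static void reclaimer_init(struct reclaimer *r)
{
	pthread_mutex_init(&r->lock, NULL);
	pthread_cond_init(&r->wake, NULL);
	r->pending = 0;
	r->reclaimed = 0;
	r->stop = false;
}

/* Shrinker side: record the request and wake the worker straight away. */
static void reclaimer_request(struct reclaimer *r, long nr)
{
	assert(nr >= 0);
	pthread_mutex_lock(&r->lock);
	r->pending += nr;
	pthread_cond_signal(&r->wake);
	pthread_mutex_unlock(&r->lock);
}

/* Ask the worker to exit once it has drained any pending work. */
static void reclaimer_stop(struct reclaimer *r)
{
	pthread_mutex_lock(&r->lock);
	r->stop = true;
	pthread_cond_signal(&r->wake);
	pthread_mutex_unlock(&r->lock);
}

/* Dedicated worker: sleeps until there is work, does nothing else. */
static void *reclaimer_thread(void *arg)
{
	struct reclaimer *r = arg;

	pthread_mutex_lock(&r->lock);
	for (;;) {
		while (r->pending == 0 && !r->stop)
			pthread_cond_wait(&r->wake, &r->lock);
		if (r->pending == 0 && r->stop)
			break;

		long nr = r->pending;
		r->pending = 0;
		pthread_mutex_unlock(&r->lock);
		/* The real (possibly blocking) reclaim would run here, unlocked. */
		pthread_mutex_lock(&r->lock);
		r->reclaimed += nr;
	}
	pthread_mutex_unlock(&r->lock);
	return NULL;
}
```

The requester never blocks on the reclaim itself; it only takes the lock
briefly and signals, so the work starts as soon as the worker is
scheduled instead of waiting for the next periodic pass.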

There also appears to be a limit of 1024 to scan per run of quotad,
which means it would take a very long time to push out any significant
number of inodes, and the limit does seem a bit arbitrary. Was there a
reason for selecting that number? It would probably be better to simply
yield every now and then if there are a lot of items to process.
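
As a sketch of what I mean by yielding, again in userspace C and with
invented names (reclaim_all(), YIELD_INTERVAL and the callback type are
assumptions for illustration; sched_yield() stands in for what would be
cond_resched() in the kernel):

```c
/* Userspace analogy only, not kernel code. All names are hypothetical. */
#include <assert.h>
#include <sched.h>
#include <stddef.h>

#define YIELD_INTERVAL 128	/* arbitrary illustrative value */

/* Callback that reclaims one object; returns 0 on success. */
typedef int (*reclaim_one_fn)(void *ctx);

/*
 * Process the whole request with no fixed cap, giving up the CPU every
 * YIELD_INTERVAL objects. Returns the number of objects reclaimed and,
 * if yields is non-NULL, stores how many times the loop yielded.
 */
static long reclaim_all(long nr, reclaim_one_fn reclaim_one, void *ctx,
			long *yields)
{
	long done = 0;

	assert(nr >= 0 && reclaim_one != NULL);
	if (yields)
		*yields = 0;

	while (done < nr) {
		if (reclaim_one(ctx) != 0)
			break;	/* stop on the first failure */
		done++;
		if (done % YIELD_INTERVAL == 0) {
			sched_yield();
			if (yields)
				(*yields)++;
		}
	}
	return done;
}

/* Trivial stand-in for the real per-object reclaim: just counts calls. */
static int count_one(void *ctx)
{
	long *counter = ctx;

	(*counter)++;
	return 0;
}
```

Calling reclaim_all(5000, count_one, &counter, NULL) works through all
5000 objects in one go while still letting other tasks run at regular
intervals.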

Steve.

> +
>   const struct super_operations gfs2_super_ops = {
>   	.alloc_inode		= gfs2_alloc_inode,
>   	.destroy_inode		= gfs2_destroy_inode,
> @@ -1681,5 +1693,6 @@ const struct super_operations gfs2_super_ops = {
>   	.remount_fs		= gfs2_remount_fs,
>   	.drop_inode		= gfs2_drop_inode,
>   	.show_options		= gfs2_show_options,
> +	.prune_icache_sb	= gfs2_prune_icache_sb,
>   };
>   

Thread overview: 8+ messages
2016-06-24 19:50 [PATCH 0/2] Per superblock inode reclaim Bob Peterson
2016-06-24 19:50 ` [PATCH 1/2] vfs: Add hooks for filesystem-specific prune_icache_sb Bob Peterson
2016-06-28  1:10   ` Dave Chinner
2016-06-24 19:50 ` [PATCH 2/2] GFS2: Add a gfs2-specific prune_icache_sb Bob Peterson
2016-06-26 13:18   ` Steven Whitehouse [this message]
2016-06-28  2:08   ` Dave Chinner
2016-06-28  9:13     ` [Cluster-devel] " Steven Whitehouse
2016-06-28 22:47       ` Dave Chinner
