cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Steven Whitehouse <swhiteho@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [GFS2 PATCH 2/2] GFS2: Revert 35e478f Flush pending glock work when evicting an inode
Date: Fri, 20 Nov 2015 13:47:04 +0000	[thread overview]
Message-ID: <564F2458.7090902@redhat.com> (raw)
In-Reply-To: <1447958561-2584-3-git-send-email-rpeterso@redhat.com>

Hi,

On 19/11/15 18:42, Bob Peterson wrote:
> Commit 35e478f was added because there were cases where
> the inode's glock still had pending delayed work.
> The theory is that the delayed work referenced the inode after it
> had been deleted from memory by gfs2_evict_inode during its call to
> filemap_fdatawrite. I'm guessing that's wrong. I'm guessing that
> the delayed work was referencing the glock after it had been
> freed due to the call to gfs2_glock_put() in gfs2_evict_inode().
> The previous patch made the last put be done from the delayed
> work queue, so that should make 35e478f obsolete. The other thing
> to consider is that vfs evict() calls invalidate_inode_buffers
> before calling the file system-specific gfs2_evict_inode(), and
> therefore we should have no dirty buffers for the inode itself.
> (We might have dirty buffers for the glock address space, though).
Not only are there likely to be dirty buffers still in the meta data 
address space, that is a requirement for good performance when unlinking 
lots of inodes.

> The reason why the call to flush_delayed_work can't be done here
> is that it can block, waiting like so:
>
> kernel: Call Trace:
> kernel: [<ffffffff815395f5>] schedule_timeout+0x215/0x2e0
> kernel: [<ffffffff81141f1a>] ? shrink_inactive_list+0x53a/0x830
> kernel: [<ffffffff81539273>] wait_for_common+0x123/0x180
> kernel: [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
> kernel: [<ffffffff8153938d>] wait_for_completion+0x1d/0x20
> kernel: [<ffffffff8109b3e7>] flush_work+0x77/0xc0
> kernel: [<ffffffff8109ac50>] ? wq_barrier_func+0x0/0x20
> kernel: [<ffffffff8109b604>] flush_delayed_work+0x54/0x70
> kernel: [<ffffffffa05e3552>] gfs2_clear_inode+0x62/0xf0 [gfs2]
>
> The delayed work can depend on actions from DLM, and DLM can
> hang indefinitely on a fencing action.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
>   fs/gfs2/super.c | 1 -
>   1 file changed, 1 deletion(-)
>
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index 46e5004..4a9b090 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -1613,7 +1613,6 @@ out:
>   	clear_inode(inode);
>   	gfs2_dir_hash_inval(ip);
>   	ip->i_gl->gl_object = NULL;
> -	flush_delayed_work(&ip->i_gl->gl_work);
>   	gfs2_glock_add_to_lru(ip->i_gl);
>   	if (queue_delayed_work(gfs2_glock_workqueue,
>   			       &ip->i_gl->gl_work, 0) == 0)

I'm not convinced that it is safe to do this, without finding some other 
way to flush the workqueue. All glock work items on the queue hold a ref 
count to the glock, so it should not be possible to have a case where 
the glock is referenced by work queue activity after it has been freed  
- unless I've misunderstood the explanation above,

Steve.



  reply	other threads:[~2015-11-20 13:47 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-19 18:42 [Cluster-devel] [GFS2 PATCH 0/2] GFS2: Avoid inode shrinker-related deadlocks Bob Peterson
2015-11-19 18:42 ` [Cluster-devel] [GFS2 PATCH 1/2] GFS2: Make gfs2_clear_inode() queue the final put Bob Peterson
2015-11-20 13:33   ` Steven Whitehouse
2015-11-25 14:22     ` Bob Peterson
2015-11-25 14:26       ` Steven Whitehouse
2015-12-01 15:42         ` Bob Peterson
2015-12-02 10:23           ` Steven Whitehouse
2015-12-02 16:42             ` Bob Peterson
2015-12-02 17:41               ` Bob Peterson
2015-12-03 11:18                 ` Steven Whitehouse
2015-12-04 14:51                   ` Bob Peterson
2015-12-04 15:51                     ` David Teigland
2015-12-04 17:38                       ` Bob Peterson
2015-12-08  7:57               ` Dave Chinner
2015-12-08  9:03                 ` Steven Whitehouse
2015-11-19 18:42 ` [Cluster-devel] [GFS2 PATCH 2/2] GFS2: Revert 35e478f Flush pending glock work when evicting an inode Bob Peterson
2015-11-20 13:47   ` Steven Whitehouse [this message]
2015-11-25 14:36     ` Bob Peterson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=564F2458.7090902@redhat.com \
    --to=swhiteho@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).