Re: [PATCH 3/4] xfs: drain the buffer LRU on mount

linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Brian Foster <bfoster@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 3/4] xfs: drain the buffer LRU on mount
Date: Tue, 24 Oct 2017 10:06:06 -0400	[thread overview]
Message-ID: <20171024140606.GD56184@bfoster.bfoster> (raw)
In-Reply-To: <20171024002334.GK5483@magnolia>

On Mon, Oct 23, 2017 at 05:23:34PM -0700, Darrick J. Wong wrote:
> On Mon, Oct 23, 2017 at 12:54:17PM -0400, Brian Foster wrote:
> > On Mon, Oct 23, 2017 at 09:39:40AM -0700, Darrick J. Wong wrote:
> > > On Mon, Oct 23, 2017 at 10:46:45AM -0400, Brian Foster wrote:
> > > > Log recovery of v4 filesystems does not use buffer verifiers because
> > > > log recovery historically can result in transient buffer corruption
> > > > when target buffers might be ahead of the log after a crash. v5
> > > > filesystems work around this problem with metadata LSN ordering.
> > > > 
> > > > While the log recovery behavior is necessary on v4 supers, it
> > > > currently can result in leaving buffers around in the LRU without
> > > > verifiers attached for a significant amount of time. This can lead
> > > > to use of unverified buffers while the filesystem is in active use,
> > > > long after recovery has completed.
> > > > 
> > > > To address this problem and provide a more consistent clean,
> > > > post-mount buffer cache state, update the log mount sequence to
> > > > unconditionally drain all buffers from the LRU as a final step.
> > > > 
> > > > Reported-by: Darrick Wong <darrick.wong@oracle.com>
> > > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > > ---
> > > >  fs/xfs/xfs_log.c | 13 +++++++++++++
> > > >  1 file changed, 13 insertions(+)
> > > > 
> > > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > > > index dc95a49..e282fd8 100644
> > > > --- a/fs/xfs/xfs_log.c
> > > > +++ b/fs/xfs/xfs_log.c
> > > > @@ -744,6 +744,7 @@ xfs_log_mount_finish(
> > > >  {
> > > >  	int	error = 0;
> > > >  	bool	readonly = (mp->m_flags & XFS_MOUNT_RDONLY);
> > > > +	bool	recovered = mp->m_log->l_flags & XLOG_RECOVERY_NEEDED;
> > > >  
> > > >  	if (mp->m_flags & XFS_MOUNT_NORECOVERY) {
> > > >  		ASSERT(mp->m_flags & XFS_MOUNT_RDONLY);
> > > > @@ -780,6 +781,18 @@ xfs_log_mount_finish(
> > > >  	mp->m_super->s_flags &= ~MS_ACTIVE;
> > > >  	evict_inodes(mp->m_super);
> > > >  
> > > > +	/*
> > > > +	 * Drain the buffer LRU after log recovery. This is required for v4
> > > > +	 * filesystems to avoid leaving around buffers with NULL verifier ops,
> > > > +	 * but we do it unconditionally to make sure we're always in a clean
> > > > +	 * cache state after mount.
> > > > +	 */
> > > > +	if (recovered) {
> > > 
> > > if (recovered && !error) { ?
> > > 
> > > I observed that running xfs/376 on an rmap filesystem fails when it
> > > tries to fuzz the high bit of u3.bmx[0].startoff.  That triggers an
> > > incorrect freeing of what is now a post-eof extent.  The corresponding
> > > rmap free operation fails after the RUI has been logged and shuts down
> > > the filesystem, so a subsequent log recovery attempt also fails when it
> > > tries to remove an rmap that doesn't exist.  If we then try to force the
> > > log we end up deadlocked somehwere... though if we /don't/ then memory
> > > gets corrupted and the kernel blows up anyway. :(
> > > 
> > 
> > Interesting... do you have a stack trace? I'm curious why forcing the
> > log would hang here and not in the subsequent log force in
> > xfs_log_unmount() -> xfs_log_quiesce().
> 
...
> ...after which I see this once before the system totally hangs:
> 
> [ 5405.641103] INFO: task mount:28033 blocked for more than 60 seconds.
> [ 5405.643982]       Tainted: G        W       4.14.0-rc6-xfsx #1
> [ 5405.646067] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 5405.650068] mount           D12768 28033  27973 0x80000000
> [ 5405.655505] Call Trace:
> [ 5405.657210]  ? __schedule+0x40d/0xb20
> [ 5405.658865]  schedule+0x40/0x90
> [ 5405.661340]  xfs_ail_push_all_sync+0xa4/0xe0 [xfs]
> [ 5405.664348]  ? remove_wait_queue+0x60/0x60
> [ 5405.668815]  xfs_log_mount_finish+0xc7/0x110 [xfs]
> [ 5405.671736]  xfs_mountfs+0x68f/0xaf0 [xfs]
> [ 5405.674645]  xfs_fs_fill_super+0x4d7/0x630 [xfs]
> [ 5405.677673]  ? xfs_finish_flags+0x150/0x150 [xfs]
> [ 5405.679966]  mount_bdev+0x178/0x1b0
> [ 5405.681300]  mount_fs+0xf/0x80
> [ 5405.682426]  vfs_kern_mount+0x62/0x160
> [ 5405.683712]  do_mount+0x1b1/0xd50
> [ 5405.687634]  ? rcu_read_lock_sched_held+0x3f/0x70
> [ 5405.688453]  SyS_mount+0x85/0xd0
> [ 5405.689033]  entry_SYSCALL_64_fastpath+0x1f/0xbe
> [ 5405.689955] RIP: 0033:0x7f0226c72b5a
> [ 5405.690533] RSP: 002b:00007fff9ddea318 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
> [ 5405.692074] RAX: ffffffffffffffda RBX: 00007f0226f6b63a RCX: 00007f0226c72b5a
> [ 5405.695604] RDX: 0000000001a3b2d0 RSI: 0000000001a3b310 RDI: 0000000001a3b2f0
> [ 5405.696566] RBP: 0000000001a3b120 R08: 0000000001a3b270 R09: 0000000000000012
> [ 5405.699583] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 00007f022717b83c
> [ 5405.700465] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000005
> [ 5405.702569] 
> [ 5405.702569] Showing all locks held in the system:
> [ 5405.703485] 1 lock held by khungtaskd/37:
> [ 5405.704095]  #0:  (tasklist_lock){.+.+}, at: [<ffffffff810dfbf7>] debug_show_all_locks+0x37/0x190
> [ 5405.705342] 2 locks held by agetty/810:
> [ 5405.705845]  #0:  (&tty->ldisc_sem){++++}, at: [<ffffffff813fecf4>] tty_ldisc_ref_wait+0x24/0x50
> [ 5405.706934]  #1:  (&ldata->atomic_read_lock){+.+.}, at: [<ffffffff813fd034>] n_tty_read+0xc4/0x880
> [ 5405.708030] 1 lock held by in:imklog/864:
> [ 5405.708534]  #0:  (&f->f_pos_lock){+.+.}, at: [<ffffffff8125d8d8>] __fdget_pos+0x48/0x60
> [ 5405.709572] 1 lock held by mount/28033:
> [ 5405.710112]  #0:  (&type->s_umount_key#45/1){+.+.}, at: [<ffffffff8123c993>] sget_userns+0x303/0x4f0
> [ 5405.711210] 
> [ 5405.711408] =============================================
> [ 5405.711408] 
> 

Thanks. Hmm, when I run xfs/376 on the djwong-devel xfs I see a bunch of
mm/usercopy.c BUG()s and what looks like a mount hang during quotacheck,
but it's blocked down in the block layer in io_schedule() attempting a
readahead. I'm not terribly surprised by the lockup after all the BUG()
invocations, but what I'm seeing doesn't appear to be similar to what
you reproduce or related to the AIL in any way.

I assume this mount behavior repeats after the test completes. If so,
could you grab a metadump of the fs?

Brian

> --D
> 
> > 
> > Brian
> > 
> > > --D
> > > 
> > > > +		xfs_log_force(mp, XFS_LOG_SYNC);
> > > > +		xfs_ail_push_all_sync(mp->m_ail);
> > > > +	}
> > > > +	xfs_wait_buftarg(mp->m_ddev_targp);
> > > > +
> > > >  	if (readonly)
> > > >  		mp->m_flags |= XFS_MOUNT_RDONLY;
> > > >  
> > > > -- 
> > > > 2.9.5
> > > > 
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

next prev parent reply	other threads:[~2017-10-24 14:06 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-23 14:46 [PATCH 0/4] xfs: miscellaneous log recovery fixes Brian Foster
2017-10-23 14:46 ` [PATCH 1/4] xfs: sanity check log record range parameters Brian Foster
2017-10-23 23:49   ` Darrick J. Wong
2017-10-24 11:30     ` Brian Foster
2017-10-25  5:09       ` Darrick J. Wong
2017-10-23 14:46 ` [PATCH 2/4] xfs: fix log block underflow during recovery cycle verification Brian Foster
2017-10-23 23:50   ` Darrick J. Wong
2017-10-23 14:46 ` [PATCH 3/4] xfs: drain the buffer LRU on mount Brian Foster
2017-10-23 16:39   ` Darrick J. Wong
2017-10-23 16:54     ` Brian Foster
2017-10-24  0:23       ` Darrick J. Wong
2017-10-24 14:06         ` Brian Foster [this message]
2017-10-24 19:47           ` Brian Foster
2017-10-23 14:46 ` [PATCH RFC 4/4] xfs: enforce a maximum total iclog buffer size Brian Foster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171024140606.GD56184@bfoster.bfoster \
    --to=bfoster@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).