public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: tinguely@sgi.com
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH] xfs: flush workers before stopping log
Date: Thu, 30 Aug 2012 10:23:35 +1000	[thread overview]
Message-ID: <20120830002335.GB15292@dastard> (raw)
In-Reply-To: <20120829134628.835024558@sgi.com>

On Wed, Aug 29, 2012 at 08:46:25AM -0500, tinguely@sgi.com wrote:
> The unmount race continues with our test boxes.
> 
> The below trace gave the clue that there is a write of the superblock
> after the log UNMOUNT record and xfs_logprint confirmed this write.
> 
> A couple different experiments points to the sync worker. The simplest
> solution is to moved the final flush of the workers before the final
> superblock write so there is no other filesystem activity after the
> UNMOUNT record is written to the log.

....
>  #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
>  #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
> #10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
> #11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
> #12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
> #13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
> #14 [c5377f58] process_one_work at c024ee4c
> #15 [c5377f98] worker_thread at c024f43d
> #16 [c5377fbc] kthread at c025326b
> #17 [c5377fe8] kernel_thread_helper at c070e834
> 
> PID: 26653  TASK: e79143b0  CPU: 3   COMMAND: "umount"
>  #0 [cde0fda0] __schedule at c0706595
>  #1 [cde0fe28] schedule at c0706b89
>  #2 [cde0fe30] schedule_timeout at c0705600
>  #3 [cde0fe94] __down_common at c0706098
>  #4 [cde0fec8] __down at c0706122
>  #5 [cde0fed0] down at c025936f
>  #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
>  #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]

OK, so you've got IO on the superblock buffer still active when the
superblock is being freed.

> There should be no more I/O after the UNMOUNT record is written to the log.

That depends - a freeze leaves the filesystem in exactly this state.
:)

> Flush the workers before the final sync of the superblock, write of the
> UNMOUNT log record and tearing down the log.
> 
> This earlier flush prevents a late write of the superblock that raced with
> the fiesystem shutdown.

I'm not sure the xfs_sync_work can be responsible for this - the
xfs_sync_worker() has a MS_ACTIVE guard on it, so it will not log a
dummy record (superblock) during the unmount procedure, nor does it
dispatch supblock buffer IO, so it can not be responsible for the
item in the log after the unmount record or the IO that is being
run.

> Index: b/fs/xfs/xfs_mount.c
> ===================================================================
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -1490,6 +1490,11 @@ xfs_unmountfs(
>  
>  	xfs_qm_unmount(mp);
>  
> +	/* flush the worker queues while the log still exists and
> +	 * before the final sync and unmount record.
> +	 */
> +	xfs_syncd_stop(mp);

xfs_syncd_stop() needs to die rather than being moved from place to
place every time some problem is seemd.  I outlined what we need to
do to solve the problems once and for all a couple of months ago:

http://oss.sgi.com/archives/xfs/2012-06/msg00064.html

I could use a break from backporting and testing - I might do this
today...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  parent reply	other threads:[~2012-08-30  0:22 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20120829134624.316257238@sgi.com>
2012-08-29 13:46 ` [PATCH] xfs: flush workers before stopping log tinguely
2012-08-29 14:31   ` Mark Tinguely
2012-08-30  0:23   ` Dave Chinner [this message]
2012-08-30 17:25     ` Ben Myers
2012-08-30 22:35       ` Dave Chinner
2012-08-31 18:15         ` Mark Tinguely
2012-09-01 23:08           ` Christoph Hellwig
2012-09-12 18:33             ` xfs: stop the sync worker before xfs_unmountfs Ben Myers
2012-09-12 23:14               ` Dave Chinner
2012-09-13 16:43                 ` Ben Myers
2012-09-13 21:18                   ` [PATCH] " Ben Myers
2012-09-14  1:07                     ` Dave Chinner
2012-09-18 13:28                     ` Mark Tinguely
2012-09-13  8:17               ` Christoph Hellwig
2012-09-13 21:19                 ` Ben Myers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120830002335.GB15292@dastard \
    --to=david@fromorbit.com \
    --cc=tinguely@sgi.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox