public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* Review: freezing sometimes leaves the log dirty
@ 2007-01-30 22:03 David Chinner
  2007-02-01  6:52 ` Donald Douwsma
  2007-02-02 11:46 ` Christoph Hellwig
  0 siblings, 2 replies; 8+ messages in thread
From: David Chinner @ 2007-01-30 22:03 UTC (permalink / raw)
  To: xfs-dev; +Cc: xfs

When we freeze the filesystem on a system that is under
heavy load, the fleeze can complete it's flushes while there
are still transactions active. Hence the freeze completes
with a dirty log and dirty metadata buffers still in memory.

The Linux freeze path is a tangled mess - I had to go back
to the irix code to work out exactly what we should be doing
to work out why the linux code was failing because of
the convoluted paths the linux code takes through the
generic layers.

In short, when we freeze the writes, we should not be
quiescing the filesystem at this point. All we should
be doing is a blocking data sync because we haven't shut down
the transaction subsystem yet. We also need to wait
for all direct I/O writes to complete as well.

Once the data sync is complete, we can return to the generic
code for it to freeze new transactions. Then we can wait for
all active transactions to complete before we quiesce the
filesystem which flushes out all the dirty metadata buffers.

At this point we have a clean filesystem and an empty log
so we can safely write the unmount record followed by a
dummy record to dirty the log to ensure unlinked list
processing on remount if we crash or shut down the machine
while the filesystem is frozen.

Comments?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

---
 fs/xfs/linux-2.6/xfs_super.c |   14 +++++++++++---
 fs/xfs/linux-2.6/xfs_vfs.h   |    1 +
 fs/xfs/xfs_vfsops.c          |   26 ++++++++++++++++++++++----
 3 files changed, 34 insertions(+), 7 deletions(-)

Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_super.c	2007-01-08 14:32:40.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c	2007-01-08 22:46:12.520522391 +1100
@@ -730,9 +730,17 @@ xfs_fs_sync_super(
 	int			error;
 	int			flags;
 
-	if (unlikely(sb->s_frozen == SB_FREEZE_WRITE))
-		flags = SYNC_QUIESCE;
-	else
+	if (unlikely(sb->s_frozen == SB_FREEZE_WRITE)) {
+		/*
+		 * First stage of freeze - no more writers will make progress
+		 * now we are here, so we flush delwri and delalloc buffers
+		 * here, then wait for all I/O to complete.  Data is frozen at
+		 * that point. Metadata is not frozen, transactions can still
+		 * occur here so don't bother flushing the buftarg (i.e
+		 * SYNC_QUIESCE) because it'll just get dirty again.
+		 */
+		flags = SYNC_FSDATA | SYNC_DELWRI | SYNC_WAIT | SYNC_DIO_WAIT;
+	} else
 		flags = SYNC_FSDATA | (wait ? SYNC_WAIT : 0);
 
 	error = bhv_vfs_sync(vfsp, flags, NULL);
Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_vfs.h
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_vfs.h	2006-12-22 10:53:22.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_vfs.h	2007-01-08 22:27:26.366619320 +1100
@@ -92,6 +92,7 @@ typedef enum {
 #define SYNC_REFCACHE		0x0040  /* prune some of the nfs ref cache */
 #define SYNC_REMOUNT		0x0080  /* remount readonly, no dummy LRs */
 #define SYNC_QUIESCE		0x0100  /* quiesce fileystem for a snapshot */
+#define SYNC_DIO_WAIT		0x0200  /* wait for direct I/O to complete */
 
 #define SHUTDOWN_META_IO_ERROR	0x0001	/* write attempt to metadata failed */
 #define SHUTDOWN_LOG_IO_ERROR	0x0002	/* write attempt to the log failed */
Index: 2.6.x-xfs-new/fs/xfs/xfs_vfsops.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/xfs_vfsops.c	2007-01-08 20:06:55.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/xfs_vfsops.c	2007-01-08 23:27:54.696637946 +1100
@@ -881,6 +881,10 @@ xfs_statvfs(
  *		       this by simply making sure the log gets flushed
  *		       if SYNC_BDFLUSH is set, and by actually writing it
  *		       out otherwise.
+ *	SYNC_DIO_WAIT - The caller wants us to wait for all direct I/Os
+ *		       as well to ensure all data I/O completes before we
+ *		       return. Forms the drain side of the write barrier needed
+ *		       to safely quiesce the filesystem.
  *
  */
 /*ARGSUSED*/
@@ -892,10 +896,7 @@ xfs_sync(
 {
 	xfs_mount_t	*mp = XFS_BHVTOM(bdp);
 
-	if (unlikely(flags == SYNC_QUIESCE))
-		return xfs_quiesce_fs(mp);
-	else
-		return xfs_syncsub(mp, flags, NULL);
+	return xfs_syncsub(mp, flags, NULL);
 }
 
 /*
@@ -1181,6 +1182,12 @@ xfs_sync_inodes(
 			}
 
 		}
+		/*
+		 * When freezing, we need to wait ensure direct I/O is complete
+		 * as well to ensure all data modification is complete here
+		 */
+		if (flags & SYNC_DIO_WAIT)
+			vn_iowait(vp);
 
 		if (flags & SYNC_BDFLUSH) {
 			if ((flags & SYNC_ATTR) &&
@@ -1959,15 +1966,26 @@ xfs_showargs(
 	return 0;
 }
 
+/*
+ * Second stage of a freeze. The data is already frozen, now we have to take
+ * care of the metadata. New transactions are already blocked, so we need to
+ * wait for any remaining transactions to drain out before proceding.
+ */
 STATIC void
 xfs_freeze(
 	bhv_desc_t	*bdp)
 {
 	xfs_mount_t	*mp = XFS_BHVTOM(bdp);
 
+	/* wait for all modifications to complete */
 	while (atomic_read(&mp->m_active_trans) > 0)
 		delay(100);
 
+	/* flush inodes and push all remaining buffers out to disk */
+	xfs_quiesce_fs(mp);
+
+	BUG_ON(atomic_read(&mp->m_active_trans) > 0);
+
 	/* Push the superblock and write an unmount record */
 	xfs_log_unmount_write(mp);
 	xfs_unmountfs_writesb(mp);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review: freezing sometimes leaves the log dirty
  2007-01-30 22:03 Review: freezing sometimes leaves the log dirty David Chinner
@ 2007-02-01  6:52 ` Donald Douwsma
  2007-02-02 11:46 ` Christoph Hellwig
  1 sibling, 0 replies; 8+ messages in thread
From: Donald Douwsma @ 2007-02-01  6:52 UTC (permalink / raw)
  To: David Chinner; +Cc: xfs-dev, xfs

Hi Dave,

It looks good to me.

Donald

David Chinner wrote:
> When we freeze the filesystem on a system that is under
> heavy load, the fleeze can complete it's flushes while there
> are still transactions active. Hence the freeze completes
> with a dirty log and dirty metadata buffers still in memory.
> 
> The Linux freeze path is a tangled mess - I had to go back
> to the irix code to work out exactly what we should be doing
> to work out why the linux code was failing because of
> the convoluted paths the linux code takes through the
> generic layers.
> 
> In short, when we freeze the writes, we should not be
> quiescing the filesystem at this point. All we should
> be doing is a blocking data sync because we haven't shut down
> the transaction subsystem yet. We also need to wait
> for all direct I/O writes to complete as well.
> 
> Once the data sync is complete, we can return to the generic
> code for it to freeze new transactions. Then we can wait for
> all active transactions to complete before we quiesce the
> filesystem which flushes out all the dirty metadata buffers.
> 
> At this point we have a clean filesystem and an empty log
> so we can safely write the unmount record followed by a
> dummy record to dirty the log to ensure unlinked list
> processing on remount if we crash or shut down the machine
> while the filesystem is frozen.
> 
> Comments?
> 
> Cheers,
> 
> Dave.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review: freezing sometimes leaves the log dirty
  2007-01-30 22:03 Review: freezing sometimes leaves the log dirty David Chinner
  2007-02-01  6:52 ` Donald Douwsma
@ 2007-02-02 11:46 ` Christoph Hellwig
  2007-02-02 14:07   ` David Chinner
  2007-02-04 21:56   ` Nathan Scott
  1 sibling, 2 replies; 8+ messages in thread
From: Christoph Hellwig @ 2007-02-02 11:46 UTC (permalink / raw)
  To: David Chinner; +Cc: xfs-dev, xfs

On Wed, Jan 31, 2007 at 09:03:26AM +1100, David Chinner wrote:
> -	if (unlikely(sb->s_frozen == SB_FREEZE_WRITE))
> -		flags = SYNC_QUIESCE;
> -	else
> +	if (unlikely(sb->s_frozen == SB_FREEZE_WRITE)) {
> +		/*
> +		 * First stage of freeze - no more writers will make progress
> +		 * now we are here, so we flush delwri and delalloc buffers
> +		 * here, then wait for all I/O to complete.  Data is frozen at
> +		 * that point. Metadata is not frozen, transactions can still
> +		 * occur here so don't bother flushing the buftarg (i.e
> +		 * SYNC_QUIESCE) because it'll just get dirty again.
> +		 */
> +		flags = SYNC_FSDATA | SYNC_DELWRI | SYNC_WAIT | SYNC_DIO_WAIT;
> +	} else

You remove all uses of SYNC_QUIESCE in this patch, so please kill the
definition aswell.

> + *	SYNC_DIO_WAIT - The caller wants us to wait for all direct I/Os
> + *		       as well to ensure all data I/O completes before we
> + *		       return. Forms the drain side of the write barrier needed
> + *		       to safely quiesce the filesystem.
>   *
>   */
>  /*ARGSUSED*/
> @@ -892,10 +896,7 @@ xfs_sync(
>  {
>  	xfs_mount_t	*mp = XFS_BHVTOM(bdp);
>  
> -	if (unlikely(flags == SYNC_QUIESCE))
> -		return xfs_quiesce_fs(mp);
> -	else
> -		return xfs_syncsub(mp, flags, NULL);
> +	return xfs_syncsub(mp, flags, NULL);
>  }
>  
>  /*
> @@ -1181,6 +1182,12 @@ xfs_sync_inodes(
>  			}
>  
>  		}
> +		/*
> +		 * When freezing, we need to wait ensure direct I/O is complete
> +		 * as well to ensure all data modification is complete here
> +		 */
> +		if (flags & SYNC_DIO_WAIT)
> +			vn_iowait(vp);

vn_iowait waits for v_iocount decrementing to zero.  We use v_iocount
for tracking ioend structures that are used both for buffered and direct
I/O.  Because of that the flag should probably be SYNC_IOWAIT and the comment
updated to reflect this.

> +/*
> + * Second stage of a freeze. The data is already frozen, now we have to take
> + * care of the metadata. New transactions are already blocked, so we need to
> + * wait for any remaining transactions to drain out before proceding.
> + */
>  STATIC void
>  xfs_freeze(
>  	bhv_desc_t	*bdp)
>  {
>  	xfs_mount_t	*mp = XFS_BHVTOM(bdp);
>  
> +	/* wait for all modifications to complete */
>  	while (atomic_read(&mp->m_active_trans) > 0)
>  		delay(100);
>  
> +	/* flush inodes and push all remaining buffers out to disk */
> +	xfs_quiesce_fs(mp);
> +
> +	BUG_ON(atomic_read(&mp->m_active_trans) > 0);
> +

xfs_vfsops.c is considered common code, so you should probably use
ASSERT here, not BUG_ON.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review: freezing sometimes leaves the log dirty
  2007-02-02 11:46 ` Christoph Hellwig
@ 2007-02-02 14:07   ` David Chinner
  2007-02-05 21:02     ` David Chinner
  2007-02-04 21:56   ` Nathan Scott
  1 sibling, 1 reply; 8+ messages in thread
From: David Chinner @ 2007-02-02 14:07 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: David Chinner, xfs-dev, xfs

On Fri, Feb 02, 2007 at 11:46:23AM +0000, Christoph Hellwig wrote:
> On Wed, Jan 31, 2007 at 09:03:26AM +1100, David Chinner wrote:
> > -	if (unlikely(sb->s_frozen == SB_FREEZE_WRITE))
> > -		flags = SYNC_QUIESCE;
> > -	else
> > +	if (unlikely(sb->s_frozen == SB_FREEZE_WRITE)) {
> > +		/*
> > +		 * First stage of freeze - no more writers will make progress
> > +		 * now we are here, so we flush delwri and delalloc buffers
> > +		 * here, then wait for all I/O to complete.  Data is frozen at
> > +		 * that point. Metadata is not frozen, transactions can still
> > +		 * occur here so don't bother flushing the buftarg (i.e
> > +		 * SYNC_QUIESCE) because it'll just get dirty again.
> > +		 */
> > +		flags = SYNC_FSDATA | SYNC_DELWRI | SYNC_WAIT | SYNC_DIO_WAIT;
> > +	} else
> 
> You remove all uses of SYNC_QUIESCE in this patch, so please kill the
> definition aswell.

Yup, didn't think of that.

> > +		/*
> > +		 * When freezing, we need to wait ensure direct I/O is complete
> > +		 * as well to ensure all data modification is complete here
> > +		 */
> > +		if (flags & SYNC_DIO_WAIT)
> > +			vn_iowait(vp);
> 
> vn_iowait waits for v_iocount decrementing to zero.  We use v_iocount
> for tracking ioend structures that are used both for buffered and direct
> I/O.  Because of that the flag should probably be SYNC_IOWAIT and the comment
> updated to reflect this.

Ok, that's probably a better reflection of what the code does. We've already
waited for all buffered I/O via the SYNC_WAIT flag, so I was only really
thinking about the direct I/o case here.

> > +/*
> > + * Second stage of a freeze. The data is already frozen, now we have to take
> > + * care of the metadata. New transactions are already blocked, so we need to
> > + * wait for any remaining transactions to drain out before proceding.
> > + */
> >  STATIC void
> >  xfs_freeze(
> >  	bhv_desc_t	*bdp)
> >  {
> >  	xfs_mount_t	*mp = XFS_BHVTOM(bdp);
> >  
> > +	/* wait for all modifications to complete */
> >  	while (atomic_read(&mp->m_active_trans) > 0)
> >  		delay(100);
> >  
> > +	/* flush inodes and push all remaining buffers out to disk */
> > +	xfs_quiesce_fs(mp);
> > +
> > +	BUG_ON(atomic_read(&mp->m_active_trans) > 0);
> > +
> 
> xfs_vfsops.c is considered common code, so you should probably use
> ASSERT here, not BUG_ON.

good catch.

Patch below cleans all this up and also fixes the 2.4 tree as well.

BTW, i think further cleanup in xfs_quiesce_fs() can be done - that
flush loop looks redundant - neither Irix nor 2.4 linux need it,
and I can't see why it would be needed on 2.6. It looks to me like
it was trying to fix the problem I'm fixing right now. I'll look
into it further at some point...

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

---
 fs/xfs/linux-2.4/xfs_super.c |    8 +++-----
 fs/xfs/linux-2.4/xfs_vfs.h   |    2 +-
 fs/xfs/linux-2.6/xfs_super.c |    2 +-
 fs/xfs/linux-2.6/xfs_vfs.h   |    3 +--
 fs/xfs/xfs_vfsops.c          |   17 +++++++++--------
 5 files changed, 15 insertions(+), 17 deletions(-)

Index: 2.6.x-xfs-new/fs/xfs/linux-2.4/xfs_super.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/linux-2.4/xfs_super.c	2007-02-02 16:35:36.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/linux-2.4/xfs_super.c	2007-02-03 00:59:00.453115972 +1100
@@ -669,13 +669,11 @@ struct super_block *freeze_bdev(struct b
 		wmb();
 
 		/* Flush the refcache */
-		bhv_vfs_sync(vfsp, SYNC_REFCACHE | SYNC_WAIT, NULL);;
+		bhv_vfs_sync(vfsp, SYNC_REFCACHE | SYNC_WAIT, NULL);
 
 		/* Flush delalloc and delwri data */
-		bhv_vfs_sync(vfsp, SYNC_DELWRI | SYNC_WAIT, NULL);;
-
-		/* Flush out everything to it's normal place */
-		bhv_vfs_sync(vfsp, SYNC_QUIESCE, NULL);
+		bhv_vfs_sync(vfsp,
+			SYNC_FSDATA|SYNC_DELWRI|SYNC_WAIT|SYNC_IOWAIT, NULL);
 
 		/* Pause transaction subsystem */
 		vfsp->vfs_frozen = SB_FREEZE_TRANS;
Index: 2.6.x-xfs-new/fs/xfs/linux-2.4/xfs_vfs.h
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/linux-2.4/xfs_vfs.h	2007-01-16 10:54:15.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/linux-2.4/xfs_vfs.h	2007-02-03 00:51:38.255434540 +1100
@@ -92,7 +92,7 @@ typedef enum {
 #define SYNC_FSDATA		0x0020	/* flush fs data (e.g. superblocks) */
 #define SYNC_REFCACHE		0x0040  /* prune some of the nfs ref cache */
 #define SYNC_REMOUNT		0x0080  /* remount readonly, no dummy LRs */
-#define SYNC_QUIESCE		0x0100  /* quiesce fileystem for a snapshot */
+#define SYNC_IOWAIT		0x0100  /* wait for all I/O to complete */
 
 #define SHUTDOWN_META_IO_ERROR	0x0001	/* write attempt to metadata failed */
 #define SHUTDOWN_LOG_IO_ERROR	0x0002	/* write attempt to the log failed */
Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_super.c	2007-02-02 16:35:36.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c	2007-02-03 00:59:03.236749054 +1100
@@ -674,7 +674,7 @@ xfs_fs_sync_super(
 		 * occur here so don't bother flushing the buftarg (i.e
 		 * SYNC_QUIESCE) because it'll just get dirty again.
 		 */
-		flags = SYNC_FSDATA | SYNC_DELWRI | SYNC_WAIT | SYNC_DIO_WAIT;
+		flags = SYNC_FSDATA | SYNC_DELWRI | SYNC_WAIT | SYNC_IOWAIT;
 	} else
 		flags = SYNC_FSDATA | (wait ? SYNC_WAIT : 0);
 
Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_vfs.h
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_vfs.h	2007-02-02 16:35:02.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_vfs.h	2007-02-03 00:51:41.638988060 +1100
@@ -91,8 +91,7 @@ typedef enum {
 #define SYNC_FSDATA		0x0020	/* flush fs data (e.g. superblocks) */
 #define SYNC_REFCACHE		0x0040  /* prune some of the nfs ref cache */
 #define SYNC_REMOUNT		0x0080  /* remount readonly, no dummy LRs */
-#define SYNC_QUIESCE		0x0100  /* quiesce fileystem for a snapshot */
-#define SYNC_DIO_WAIT		0x0200  /* wait for direct I/O to complete */
+#define SYNC_IOWAIT		0x0100  /* wait for all I/O to complete */
 
 #define SHUTDOWN_META_IO_ERROR	0x0001	/* write attempt to metadata failed */
 #define SHUTDOWN_LOG_IO_ERROR	0x0002	/* write attempt to the log failed */
Index: 2.6.x-xfs-new/fs/xfs/xfs_vfsops.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/xfs_vfsops.c	2007-02-02 16:35:36.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/xfs_vfsops.c	2007-02-03 00:49:10.946876638 +1100
@@ -881,10 +881,10 @@ xfs_statvfs(
  *		       this by simply making sure the log gets flushed
  *		       if SYNC_BDFLUSH is set, and by actually writing it
  *		       out otherwise.
- *	SYNC_DIO_WAIT - The caller wants us to wait for all direct I/Os
- *		       as well to ensure all data I/O completes before we
- *		       return. Forms the drain side of the write barrier needed
- *		       to safely quiesce the filesystem.
+ *	SYNC_IOWAIT  - The caller wants us to wait for all data I/O to complete
+ *		       before we return (including direct I/O). Forms the drain
+ *		       side of the write barrier needed to safely quiesce the
+ *		       filesystem.
  *
  */
 /*ARGSUSED*/
@@ -1183,10 +1183,11 @@ xfs_sync_inodes(
 
 		}
 		/*
-		 * When freezing, we need to wait ensure direct I/O is complete
-		 * as well to ensure all data modification is complete here
+		 * When freezing, we need to wait ensure all I/O (including direct
+		 * I/O) is complete to ensure no further data modification can take
+		 * place after this point
 		 */
-		if (flags & SYNC_DIO_WAIT)
+		if (flags & SYNC_IOWAIT)
 			vn_iowait(vp);
 
 		if (flags & SYNC_BDFLUSH) {
@@ -1984,7 +1985,7 @@ xfs_freeze(
 	/* flush inodes and push all remaining buffers out to disk */
 	xfs_quiesce_fs(mp);
 
-	BUG_ON(atomic_read(&mp->m_active_trans) > 0);
+	ASSERT(atomic_read(&mp->m_active_trans) == 0);
 
 	/* Push the superblock and write an unmount record */
 	xfs_log_unmount_write(mp);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review: freezing sometimes leaves the log dirty
  2007-02-02 11:46 ` Christoph Hellwig
  2007-02-02 14:07   ` David Chinner
@ 2007-02-04 21:56   ` Nathan Scott
  2007-02-04 23:45     ` David Chinner
  1 sibling, 1 reply; 8+ messages in thread
From: Nathan Scott @ 2007-02-04 21:56 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: David Chinner, xfs-dev, xfs

On Fri, 2007-02-02 at 11:46 +0000, Christoph Hellwig wrote:
> On Wed, Jan 31, 2007 at 09:03:26AM +1100, David Chinner wrote:
> > +	/* flush inodes and push all remaining buffers out to disk */
> > +	xfs_quiesce_fs(mp);
> > +
> > +	BUG_ON(atomic_read(&mp->m_active_trans) > 0);
> > +
> 
> xfs_vfsops.c is considered common code, so you should probably use
> ASSERT here, not BUG_ON.

There's also an ASSERT_ALWAYS macro IIRC, if you want the equivalent
functionality of BUG_ON (i.e. always check).

cheers.

-- 
Nathan

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review: freezing sometimes leaves the log dirty
  2007-02-04 21:56   ` Nathan Scott
@ 2007-02-04 23:45     ` David Chinner
  0 siblings, 0 replies; 8+ messages in thread
From: David Chinner @ 2007-02-04 23:45 UTC (permalink / raw)
  To: Nathan Scott; +Cc: Christoph Hellwig, David Chinner, xfs-dev, xfs

On Mon, Feb 05, 2007 at 08:56:55AM +1100, Nathan Scott wrote:
> On Fri, 2007-02-02 at 11:46 +0000, Christoph Hellwig wrote:
> > On Wed, Jan 31, 2007 at 09:03:26AM +1100, David Chinner wrote:
> > > +	/* flush inodes and push all remaining buffers out to disk */
> > > +	xfs_quiesce_fs(mp);
> > > +
> > > +	BUG_ON(atomic_read(&mp->m_active_trans) > 0);
> > > +
> > 
> > xfs_vfsops.c is considered common code, so you should probably use
> > ASSERT here, not BUG_ON.
> 
> There's also an ASSERT_ALWAYS macro IIRC, if you want the equivalent
> functionality of BUG_ON (i.e. always check).

True, I forgot about that one. Thx, Nathan....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review: freezing sometimes leaves the log dirty
  2007-02-02 14:07   ` David Chinner
@ 2007-02-05 21:02     ` David Chinner
  2007-02-05 21:09       ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: David Chinner @ 2007-02-05 21:02 UTC (permalink / raw)
  To: David Chinner; +Cc: Christoph Hellwig, xfs-dev, xfs

On Sat, Feb 03, 2007 at 01:07:06AM +1100, David Chinner wrote:
> Patch below cleans all this up and also fixes the 2.4 tree as well.

Can I get an ack from someone on this?

> BTW, i think further cleanup in xfs_quiesce_fs() can be done - that
> flush loop looks redundant - neither Irix nor 2.4 linux need it,
> and I can't see why it would be needed on 2.6. It looks to me like
> it was trying to fix the problem I'm fixing right now. I'll look
> into it further at some point...
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> Principal Engineer
> SGI Australian Software Group
> 
> ---
>  fs/xfs/linux-2.4/xfs_super.c |    8 +++-----
>  fs/xfs/linux-2.4/xfs_vfs.h   |    2 +-
>  fs/xfs/linux-2.6/xfs_super.c |    2 +-
>  fs/xfs/linux-2.6/xfs_vfs.h   |    3 +--
>  fs/xfs/xfs_vfsops.c          |   17 +++++++++--------
>  5 files changed, 15 insertions(+), 17 deletions(-)
> 
> Index: 2.6.x-xfs-new/fs/xfs/linux-2.4/xfs_super.c
> ===================================================================
> --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.4/xfs_super.c	2007-02-02 16:35:36.000000000 +1100
> +++ 2.6.x-xfs-new/fs/xfs/linux-2.4/xfs_super.c	2007-02-03 00:59:00.453115972 +1100
> @@ -669,13 +669,11 @@ struct super_block *freeze_bdev(struct b
>  		wmb();
>  
>  		/* Flush the refcache */
> -		bhv_vfs_sync(vfsp, SYNC_REFCACHE | SYNC_WAIT, NULL);;
> +		bhv_vfs_sync(vfsp, SYNC_REFCACHE | SYNC_WAIT, NULL);
>  
>  		/* Flush delalloc and delwri data */
> -		bhv_vfs_sync(vfsp, SYNC_DELWRI | SYNC_WAIT, NULL);;
> -
> -		/* Flush out everything to it's normal place */
> -		bhv_vfs_sync(vfsp, SYNC_QUIESCE, NULL);
> +		bhv_vfs_sync(vfsp,
> +			SYNC_FSDATA|SYNC_DELWRI|SYNC_WAIT|SYNC_IOWAIT, NULL);
>  
>  		/* Pause transaction subsystem */
>  		vfsp->vfs_frozen = SB_FREEZE_TRANS;
> Index: 2.6.x-xfs-new/fs/xfs/linux-2.4/xfs_vfs.h
> ===================================================================
> --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.4/xfs_vfs.h	2007-01-16 10:54:15.000000000 +1100
> +++ 2.6.x-xfs-new/fs/xfs/linux-2.4/xfs_vfs.h	2007-02-03 00:51:38.255434540 +1100
> @@ -92,7 +92,7 @@ typedef enum {
>  #define SYNC_FSDATA		0x0020	/* flush fs data (e.g. superblocks) */
>  #define SYNC_REFCACHE		0x0040  /* prune some of the nfs ref cache */
>  #define SYNC_REMOUNT		0x0080  /* remount readonly, no dummy LRs */
> -#define SYNC_QUIESCE		0x0100  /* quiesce fileystem for a snapshot */
> +#define SYNC_IOWAIT		0x0100  /* wait for all I/O to complete */
>  
>  #define SHUTDOWN_META_IO_ERROR	0x0001	/* write attempt to metadata failed */
>  #define SHUTDOWN_LOG_IO_ERROR	0x0002	/* write attempt to the log failed */
> Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c
> ===================================================================
> --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_super.c	2007-02-02 16:35:36.000000000 +1100
> +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c	2007-02-03 00:59:03.236749054 +1100
> @@ -674,7 +674,7 @@ xfs_fs_sync_super(
>  		 * occur here so don't bother flushing the buftarg (i.e
>  		 * SYNC_QUIESCE) because it'll just get dirty again.
>  		 */
> -		flags = SYNC_FSDATA | SYNC_DELWRI | SYNC_WAIT | SYNC_DIO_WAIT;
> +		flags = SYNC_FSDATA | SYNC_DELWRI | SYNC_WAIT | SYNC_IOWAIT;
>  	} else
>  		flags = SYNC_FSDATA | (wait ? SYNC_WAIT : 0);
>  
> Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_vfs.h
> ===================================================================
> --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_vfs.h	2007-02-02 16:35:02.000000000 +1100
> +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_vfs.h	2007-02-03 00:51:41.638988060 +1100
> @@ -91,8 +91,7 @@ typedef enum {
>  #define SYNC_FSDATA		0x0020	/* flush fs data (e.g. superblocks) */
>  #define SYNC_REFCACHE		0x0040  /* prune some of the nfs ref cache */
>  #define SYNC_REMOUNT		0x0080  /* remount readonly, no dummy LRs */
> -#define SYNC_QUIESCE		0x0100  /* quiesce fileystem for a snapshot */
> -#define SYNC_DIO_WAIT		0x0200  /* wait for direct I/O to complete */
> +#define SYNC_IOWAIT		0x0100  /* wait for all I/O to complete */
>  
>  #define SHUTDOWN_META_IO_ERROR	0x0001	/* write attempt to metadata failed */
>  #define SHUTDOWN_LOG_IO_ERROR	0x0002	/* write attempt to the log failed */
> Index: 2.6.x-xfs-new/fs/xfs/xfs_vfsops.c
> ===================================================================
> --- 2.6.x-xfs-new.orig/fs/xfs/xfs_vfsops.c	2007-02-02 16:35:36.000000000 +1100
> +++ 2.6.x-xfs-new/fs/xfs/xfs_vfsops.c	2007-02-03 00:49:10.946876638 +1100
> @@ -881,10 +881,10 @@ xfs_statvfs(
>   *		       this by simply making sure the log gets flushed
>   *		       if SYNC_BDFLUSH is set, and by actually writing it
>   *		       out otherwise.
> - *	SYNC_DIO_WAIT - The caller wants us to wait for all direct I/Os
> - *		       as well to ensure all data I/O completes before we
> - *		       return. Forms the drain side of the write barrier needed
> - *		       to safely quiesce the filesystem.
> + *	SYNC_IOWAIT  - The caller wants us to wait for all data I/O to complete
> + *		       before we return (including direct I/O). Forms the drain
> + *		       side of the write barrier needed to safely quiesce the
> + *		       filesystem.
>   *
>   */
>  /*ARGSUSED*/
> @@ -1183,10 +1183,11 @@ xfs_sync_inodes(
>  
>  		}
>  		/*
> -		 * When freezing, we need to wait ensure direct I/O is complete
> -		 * as well to ensure all data modification is complete here
> +		 * When freezing, we need to wait ensure all I/O (including direct
> +		 * I/O) is complete to ensure no further data modification can take
> +		 * place after this point
>  		 */
> -		if (flags & SYNC_DIO_WAIT)
> +		if (flags & SYNC_IOWAIT)
>  			vn_iowait(vp);
>  
>  		if (flags & SYNC_BDFLUSH) {
> @@ -1984,7 +1985,7 @@ xfs_freeze(
>  	/* flush inodes and push all remaining buffers out to disk */
>  	xfs_quiesce_fs(mp);
>  
> -	BUG_ON(atomic_read(&mp->m_active_trans) > 0);
> +	ASSERT(atomic_read(&mp->m_active_trans) == 0);
>  
>  	/* Push the superblock and write an unmount record */
>  	xfs_log_unmount_write(mp);

-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review: freezing sometimes leaves the log dirty
  2007-02-05 21:02     ` David Chinner
@ 2007-02-05 21:09       ` Christoph Hellwig
  0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2007-02-05 21:09 UTC (permalink / raw)
  To: David Chinner; +Cc: Christoph Hellwig, xfs-dev, xfs

On Tue, Feb 06, 2007 at 08:02:45AM +1100, David Chinner wrote:
> On Sat, Feb 03, 2007 at 01:07:06AM +1100, David Chinner wrote:
> > Patch below cleans all this up and also fixes the 2.4 tree as well.
> 
> Can I get an ack from someone on this?
> 

Looks good to me.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2007-02-05 22:08 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-01-30 22:03 Review: freezing sometimes leaves the log dirty David Chinner
2007-02-01  6:52 ` Donald Douwsma
2007-02-02 11:46 ` Christoph Hellwig
2007-02-02 14:07   ` David Chinner
2007-02-05 21:02     ` David Chinner
2007-02-05 21:09       ` Christoph Hellwig
2007-02-04 21:56   ` Nathan Scott
2007-02-04 23:45     ` David Chinner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox