[PATCH 0/8] Sync fixes and cleanups (version 4)


* [PATCH 0/8] Sync fixes and cleanups (version 4)
@ 2009-04-27 14:43 Jan Kara
  2009-04-27 14:43 ` [PATCH 1/8] vfs: Fix sys_sync() and fsync_super() reliability " Jan Kara
                   ` (8 more replies)
  0 siblings, 9 replies; 12+ messages in thread
From: Jan Kara @ 2009-04-27 14:43 UTC
  To: LKML
  Cc: Al Viro, linux-fsdevel, Christoph Hellwig, Trond Myklebust,
	Andrew Morton

  Hi,

  here comes the next version of the sync fixes and cleanups. This version also
includes moving some code from super.c to sync.c, a quota sync cleanup and a
slight speedup. If no one has any more objections, who is going to merge it?
Al?

								Bye
									Honza

* [PATCH 1/8] vfs: Fix sys_sync() and fsync_super() reliability (version 4)
  2009-04-27 14:43 [PATCH 0/8] Sync fixes and cleanups (version 4) Jan Kara
@ 2009-04-27 14:43 ` Jan Kara
  2009-04-27 19:38   ` Andrew Morton
  2009-04-27 14:43 ` [PATCH 2/8] vfs: Call ->sync_fs() even if s_dirt is 0 " Jan Kara
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2009-04-27 14:43 UTC
  To: LKML
  Cc: Al Viro, linux-fsdevel, Christoph Hellwig, Trond Myklebust,
	Andrew Morton, Jan Kara

So far, do_sync() called:
  sync_inodes(0);
  sync_supers();
  sync_filesystems(0);
  sync_filesystems(1);
  sync_inodes(1);

This ordering makes it kind of hard for filesystems as sync_inodes(0) need not
submit all the IO (for example it skips inodes with I_SYNC set) so e.g. forcing
transaction to disk in ->sync_fs() is not really enough. Therefore sys_sync has
not been completely reliable on some filesystems (ext3, ext4, reiserfs, ocfs2
and others are hit by this) when racing e.g. with background writeback. A
similar problem hits also other filesystems (e.g. ext2) because of
write_supers() being called before the sync_inodes(1).

Change the ordering of calls in do_sync() - this requires a new function
sync_blockdevs() to preserve the property that block devices are always synced
after write_super() / sync_fs() call.

The same issue is fixed in __fsync_super() function used on umount /
remount read-only.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/super.c         |   27 ++++++++++++++++++++++++++-
 fs/sync.c          |    3 ++-
 include/linux/fs.h |    2 ++
 3 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index 786fe7d..4826540 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -267,6 +267,7 @@ void __fsync_super(struct super_block *sb)
 {
 	sync_inodes_sb(sb, 0);
 	vfs_dq_sync(sb);
+	sync_inodes_sb(sb, 1);
 	lock_super(sb);
 	if (sb->s_dirt && sb->s_op->write_super)
 		sb->s_op->write_super(sb);
@@ -274,7 +275,6 @@ void __fsync_super(struct super_block *sb)
 	if (sb->s_op->sync_fs)
 		sb->s_op->sync_fs(sb, 1);
 	sync_blockdev(sb->s_bdev);
-	sync_inodes_sb(sb, 1);
 }
 
 /*
@@ -502,6 +502,31 @@ restart:
 	mutex_unlock(&mutex);
 }
 
+/*
+ *  Sync all block devices underlying some superblock
+ */
+void sync_blockdevs(void)
+{
+	struct super_block *sb;
+
+	spin_lock(&sb_lock);
+restart:
+	list_for_each_entry(sb, &super_blocks, s_list) {
+		if (!sb->s_bdev)
+			continue;
+		sb->s_count++;
+		spin_unlock(&sb_lock);
+		down_read(&sb->s_umount);
+		if (sb->s_root)
+			sync_blockdev(sb->s_bdev);
+		up_read(&sb->s_umount);
+		spin_lock(&sb_lock);
+		if (__put_super_and_need_restart(sb))
+			goto restart;
+	}
+	spin_unlock(&sb_lock);
+}
+
 /**
  *	get_super - get the superblock of a device
  *	@bdev: device to get the superblock for
diff --git a/fs/sync.c b/fs/sync.c
index 7abc65f..fa14e42 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -26,10 +26,11 @@ static void do_sync(unsigned long wait)
 	wakeup_pdflush(0);
 	sync_inodes(0);		/* All mappings, inodes and their blockdevs */
 	vfs_dq_sync(NULL);
+	sync_inodes(wait);	/* Mappings, inodes and blockdevs, again. */
 	sync_supers();		/* Write the superblocks */
 	sync_filesystems(0);	/* Start syncing the filesystems */
 	sync_filesystems(wait);	/* Waitingly sync the filesystems */
-	sync_inodes(wait);	/* Mappings, inodes and blockdevs, again. */
+	sync_blockdevs();
 	if (!wait)
 		printk("Emergency Sync complete\n");
 	if (unlikely(laptop_mode))
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5bed436..4bad02e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1942,6 +1942,7 @@ extern void bdput(struct block_device *);
 extern struct block_device *open_by_devnum(dev_t, fmode_t);
 extern void invalidate_bdev(struct block_device *);
 extern int sync_blockdev(struct block_device *bdev);
+extern void sync_blockdevs(void);
 extern struct super_block *freeze_bdev(struct block_device *);
 extern void emergency_thaw_all(void);
 extern int thaw_bdev(struct block_device *bdev, struct super_block *sb);
@@ -1951,6 +1952,7 @@ extern int fsync_no_super(struct block_device *);
 #else
 static inline void bd_forget(struct inode *inode) {}
 static inline int sync_blockdev(struct block_device *bdev) { return 0; }
+static inline void sync_blockdevs(void) { }
 static inline void invalidate_bdev(struct block_device *bdev) {}
 
 static inline struct super_block *freeze_bdev(struct block_device *sb)
-- 
1.6.0.2
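
To make the ordering problem addressed by patch 1/8 concrete, here is a
minimal user-space sketch in C. It is a toy model, not kernel code:
struct toy_inode, struct toy_fs and all toy_* functions are hypothetical
names invented for this illustration, and the assumption that ->sync_fs()
commits only what has already been written is a simplification of what
journalling filesystems do. The non-waiting pass skips an inode that
background writeback currently holds (modelled by in_sync, standing in for
I_SYNC), so in the old order the commit happens before that inode's data
is written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_INODES 4

/* Toy model only: every name below is hypothetical, not a kernel API. */
struct toy_inode {
	bool dirty;	/* has data not yet handed to the filesystem */
	bool in_sync;	/* stands in for I_SYNC: another writer owns it */
};

struct toy_fs {
	struct toy_inode inodes[TOY_MAX_INODES];
	size_t nr_inodes;
	unsigned written;	/* inodes whose data has been written */
	unsigned committed;	/* inodes covered by the last commit */
};

/* Like sync_inodes(wait): the non-waiting pass skips busy inodes. */
void toy_sync_inodes(struct toy_fs *fs, bool wait)
{
	assert(fs->nr_inodes <= TOY_MAX_INODES);

	for (size_t i = 0; i < fs->nr_inodes; i++) {
		struct toy_inode *ino = &fs->inodes[i];

		if (!ino->dirty)
			continue;
		if (ino->in_sync && !wait)
			continue;	/* skipped, stays dirty */
		ino->in_sync = false;	/* "waited" for the other writer */
		ino->dirty = false;
		fs->written++;
	}
}

/* Like ->sync_fs(sb, 1): commits only what has been written so far. */
void toy_sync_fs(struct toy_fs *fs)
{
	fs->committed = fs->written;
}

/* Two dirty inodes, the second one held by background writeback. */
struct toy_fs toy_make_fs(void)
{
	struct toy_fs fs = { .nr_inodes = 2 };

	fs.inodes[0] = (struct toy_inode){ .dirty = true, .in_sync = false };
	fs.inodes[1] = (struct toy_inode){ .dirty = true, .in_sync = true };
	return fs;
}

/* Old order: the commit runs before the waiting inode pass. */
unsigned toy_old_order(void)
{
	struct toy_fs fs = toy_make_fs();

	toy_sync_inodes(&fs, false);
	toy_sync_fs(&fs);
	toy_sync_inodes(&fs, true);
	return fs.written - fs.committed;	/* written after the commit */
}

/* New order: both inode passes run before the commit. */
unsigned toy_new_order(void)
{
	struct toy_fs fs = toy_make_fs();

	toy_sync_inodes(&fs, false);
	toy_sync_inodes(&fs, true);
	toy_sync_fs(&fs);
	return fs.written - fs.committed;
}
```

Under these assumptions toy_old_order() returns 1 (one inode is written
only after the commit) and toy_new_order() returns 0.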


* Re: [PATCH 1/8] vfs: Fix sys_sync() and fsync_super() reliability (version 4)
  2009-04-27 14:43 ` [PATCH 1/8] vfs: Fix sys_sync() and fsync_super() reliability " Jan Kara
@ 2009-04-27 19:38   ` Andrew Morton
  2009-04-28 11:56     ` Jan Kara
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2009-04-27 19:38 UTC
  To: Jan Kara; +Cc: linux-kernel, viro, linux-fsdevel, hch, trond.myklebust, jack

On Mon, 27 Apr 2009 16:43:48 +0200
Jan Kara <jack@suse.cz> wrote:

> So far, do_sync() called:
>   sync_inodes(0);
>   sync_supers();
>   sync_filesystems(0);
>   sync_filesystems(1);
>   sync_inodes(1);

The description has me all confused.

> This ordering makes it kind of hard for filesystems as sync_inodes(0) need not
> submit all the IO (for example it skips inodes with I_SYNC set) so e.g. forcing
> transaction to disk in ->sync_fs() is not really enough.

Is not really enough for what?

sync_fs(wait==0) is not supposed to be reliable - it's an advice to the
fs that it should push as much "easy" writeback into the queue as
possible.  We'll do the real sync later, with sync_fs(wait==1).

> Therefore sys_sync has
> not been completely reliable on some filesystems (ext3, ext4, reiserfs, ocfs2
> and others are hit by this) when racing e.g. with background writeback.

No sync can ever be reliable in the presence of concurrent write
activity, unless we freeze userspace.

> A
> similar problem hits also other filesystems (e.g. ext2) because of
> write_supers() being called before the sync_inodes(1).
> 
> Change the ordering of calls in do_sync() - this requires a new function
> sync_blockdevs() to preserve the property that block devices are always synced
> after write_super() / sync_fs() call.
> 
> The same issue is fixed in __fsync_super() function used on umount /
> remount read-only.

So it's all a bit unclear (to me) what this patch is trying to fix?


> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/super.c         |   27 ++++++++++++++++++++++++++-
>  fs/sync.c          |    3 ++-
>  include/linux/fs.h |    2 ++
>  3 files changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/super.c b/fs/super.c
> index 786fe7d..4826540 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -267,6 +267,7 @@ void __fsync_super(struct super_block *sb)
>  {
>  	sync_inodes_sb(sb, 0);
>  	vfs_dq_sync(sb);
> +	sync_inodes_sb(sb, 1);
>  	lock_super(sb);
>  	if (sb->s_dirt && sb->s_op->write_super)
>  		sb->s_op->write_super(sb);
> @@ -274,7 +275,6 @@ void __fsync_super(struct super_block *sb)
>  	if (sb->s_op->sync_fs)
>  		sb->s_op->sync_fs(sb, 1);
>  	sync_blockdev(sb->s_bdev);
> -	sync_inodes_sb(sb, 1);
>  }
>  
>  /*
> @@ -502,6 +502,31 @@ restart:
>  	mutex_unlock(&mutex);
>  }
>  
> +/*
> + *  Sync all block devices underlying some superblock
> + */
> +void sync_blockdevs(void)
> +{
> +	struct super_block *sb;
> +
> +	spin_lock(&sb_lock);
> +restart:
> +	list_for_each_entry(sb, &super_blocks, s_list) {
> +		if (!sb->s_bdev)
> +			continue;
> +		sb->s_count++;
> +		spin_unlock(&sb_lock);
> +		down_read(&sb->s_umount);
> +		if (sb->s_root)
> +			sync_blockdev(sb->s_bdev);
> +		up_read(&sb->s_umount);
> +		spin_lock(&sb_lock);
> +		if (__put_super_and_need_restart(sb))
> +			goto restart;
> +	}
> +	spin_unlock(&sb_lock);
> +}

The comment doesn't match the implementation.  This function syncs all
blockdevs underlying _all_ superblocks.


* Re: [PATCH 1/8] vfs: Fix sys_sync() and fsync_super() reliability (version 4)
  2009-04-27 19:38   ` Andrew Morton
@ 2009-04-28 11:56     ` Jan Kara
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2009-04-28 11:56 UTC
  To: Andrew Morton; +Cc: linux-kernel, viro, linux-fsdevel, hch, trond.myklebust

On Mon 27-04-09 12:38:25, Andrew Morton wrote:
> On Mon, 27 Apr 2009 16:43:48 +0200
> Jan Kara <jack@suse.cz> wrote:
> > So far, do_sync() called:
> >   sync_inodes(0);
> >   sync_supers();
> >   sync_filesystems(0);
> >   sync_filesystems(1);
> >   sync_inodes(1);
> 
> The description has me all confused.
> 
> > This ordering makes it kind of hard for filesystems as sync_inodes(0) need not
> > submit all the IO (for example it skips inodes with I_SYNC set) so e.g. forcing
> > transaction to disk in ->sync_fs() is not really enough.
> 
> Is not really enough for what?
> 
> sync_fs(wait==0) is not supposed to be reliable - it's an advice to the
> fs that it should push as much "easy" writeback into the queue as
> possible.  We'll do the real sync later, with sync_fs(wait==1).
  Yes, but note that after sync_fs(wait==1) we do sync_inodes(wait==1) and
only this last sync_inodes() call is guaranteed to get all the inode data
to disk. So sync_fs() is called *before* all the dirty data is actually
written. That goes against what the sync_fs() implementations of most
filesystems expect...

> > Therefore sys_sync has
> > not been completely reliable on some filesystems (ext3, ext4, reiserfs, ocfs2
> > and others are hit by this) when racing e.g. with background writeback.
> 
> No sync can ever be reliable in the presence of concurrent write
> activity, unless we freeze userspace.
  Of course, but it should be reliable in the presence of pdflush()
flushing dirty data. Currently it is not, because even background
writeback sets the I_SYNC flag of the inode and sync_inodes(wait==0) skips
these inodes.
  This is the real bug this patch is trying to fix, but more generally it
tries to make the code more robust so that the reliability of sys_sync() does
not depend on the exact behavior of WB_SYNC_NONE writeback done by
sync_inodes(wait==0).

> > A
> > similar problem hits also other filesystems (e.g. ext2) because of
> > write_supers() being called before the sync_inodes(1).
> > 
> > Change the ordering of calls in do_sync() - this requires a new function
> > sync_blockdevs() to preserve the property that block devices are always synced
> > after write_super() / sync_fs() call.
> > 
> > The same issue is fixed in __fsync_super() function used on umount /
> > remount read-only.
> 
> So it's all a bit unclear (to me) what this patch is trying to fix?
  Hopefully explained above ;).

									Honza
> 
> 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/super.c         |   27 ++++++++++++++++++++++++++-
> >  fs/sync.c          |    3 ++-
> >  include/linux/fs.h |    2 ++
> >  3 files changed, 30 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/super.c b/fs/super.c
> > index 786fe7d..4826540 100644
> > --- a/fs/super.c
> > +++ b/fs/super.c
> > @@ -267,6 +267,7 @@ void __fsync_super(struct super_block *sb)
> >  {
> >  	sync_inodes_sb(sb, 0);
> >  	vfs_dq_sync(sb);
> > +	sync_inodes_sb(sb, 1);
> >  	lock_super(sb);
> >  	if (sb->s_dirt && sb->s_op->write_super)
> >  		sb->s_op->write_super(sb);
> > @@ -274,7 +275,6 @@ void __fsync_super(struct super_block *sb)
> >  	if (sb->s_op->sync_fs)
> >  		sb->s_op->sync_fs(sb, 1);
> >  	sync_blockdev(sb->s_bdev);
> > -	sync_inodes_sb(sb, 1);
> >  }
> >  
> >  /*
> > @@ -502,6 +502,31 @@ restart:
> >  	mutex_unlock(&mutex);
> >  }
> >  
> > +/*
> > + *  Sync all block devices underlying some superblock
> > + */
> > +void sync_blockdevs(void)
> > +{
> > +	struct super_block *sb;
> > +
> > +	spin_lock(&sb_lock);
> > +restart:
> > +	list_for_each_entry(sb, &super_blocks, s_list) {
> > +		if (!sb->s_bdev)
> > +			continue;
> > +		sb->s_count++;
> > +		spin_unlock(&sb_lock);
> > +		down_read(&sb->s_umount);
> > +		if (sb->s_root)
> > +			sync_blockdev(sb->s_bdev);
> > +		up_read(&sb->s_umount);
> > +		spin_lock(&sb_lock);
> > +		if (__put_super_and_need_restart(sb))
> > +			goto restart;
> > +	}
> > +	spin_unlock(&sb_lock);
> > +}
> 
> The comment doesn't match the implementation.  This function syncs all
> blockdevs underlying _all_ superblocks.
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* [PATCH 2/8] vfs: Call ->sync_fs() even if s_dirt is 0 (version 4)
  2009-04-27 14:43 [PATCH 0/8] Sync fixes and cleanups (version 4) Jan Kara
  2009-04-27 14:43 ` [PATCH 1/8] vfs: Fix sys_sync() and fsync_super() reliability " Jan Kara
@ 2009-04-27 14:43 ` Jan Kara
  2009-04-27 14:43 ` [PATCH 3/8] vfs: Make __fsync_super() a static function " Jan Kara
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2009-04-27 14:43 UTC
  To: LKML
  Cc: Al Viro, linux-fsdevel, Christoph Hellwig, Trond Myklebust,
	Andrew Morton, Jan Kara

sync_filesystems() has a condition that if wait == 0 and s_dirt == 0, then
->sync_fs() isn't called. This does not really make much sense since s_dirt is
generally used by a filesystem to mean that ->write_super() needs to be called.
But ->sync_fs() does different things. I even suspect that some filesystems
(btrfs?) set s_dirt just to fool this logic.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/super.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index 4826540..d9759e0 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -490,7 +490,7 @@ restart:
 		spin_unlock(&sb_lock);
 		down_read(&sb->s_umount);
 		async_synchronize_full_domain(&sb->s_async_list);
-		if (sb->s_root && (wait || sb->s_dirt))
+		if (sb->s_root)
 			sb->s_op->sync_fs(sb, wait);
 		up_read(&sb->s_umount);
 		/* restart only when sb is no longer on the list */
-- 
1.6.0.2
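
The one-line change in patch 2/8 can be read as a change to a predicate.
The following user-space sketch in C spells out the two conditions side
by side. struct mini_sb and the mini_* functions are hypothetical names
made up for this illustration; only the boolean expressions mirror the
diff above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two super_block fields the check reads. */
struct mini_sb {
	bool has_root;	/* models sb->s_root != NULL */
	bool s_dirt;	/* set when ->write_super() needs to be called */
};

/* Before the patch: a clean superblock skips the non-waiting pass. */
bool mini_calls_sync_fs_old(const struct mini_sb *sb, bool wait)
{
	assert(sb);
	return sb->has_root && (wait || sb->s_dirt);
}

/* After the patch: s_dirt no longer gates the ->sync_fs() call. */
bool mini_calls_sync_fs_new(const struct mini_sb *sb, bool wait)
{
	assert(sb);
	(void)wait;	/* the wait mode is passed on, not tested here */
	return sb->has_root;
}
```

For a mounted filesystem with s_dirt == 0 and wait == 0, the old
predicate is false and the new one is true, so only the patched code
reaches ->sync_fs() in the non-waiting pass.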


* [PATCH 3/8] vfs: Make __fsync_super() a static function (version 4)
  2009-04-27 14:43 [PATCH 0/8] Sync fixes and cleanups (version 4) Jan Kara
  2009-04-27 14:43 ` [PATCH 1/8] vfs: Fix sys_sync() and fsync_super() reliability " Jan Kara
  2009-04-27 14:43 ` [PATCH 2/8] vfs: Call ->sync_fs() even if s_dirt is 0 " Jan Kara
@ 2009-04-27 14:43 ` Jan Kara
  2009-04-27 14:43 ` [PATCH 4/8] vfs: Make sys_sync() use fsync_super() " Jan Kara
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2009-04-27 14:43 UTC
  To: LKML
  Cc: Al Viro, linux-fsdevel, Christoph Hellwig, Trond Myklebust,
	Andrew Morton, Jan Kara

__fsync_super() does the same thing as fsync_super(). So change the only
caller to use fsync_super() and make __fsync_super() static. This removes
an unnecessarily duplicated call to sync_blockdev() and prepares the ground
for the changes to __fsync_super() in the following patches.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/block_dev.c     |    2 +-
 fs/super.c         |    7 +++----
 include/linux/fs.h |    1 -
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index f45dbc1..48d1290 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -240,7 +240,7 @@ struct super_block *freeze_bdev(struct block_device *bdev)
 		sb->s_frozen = SB_FREEZE_WRITE;
 		smp_wmb();
 
-		__fsync_super(sb);
+		fsync_super(sb);
 
 		sb->s_frozen = SB_FREEZE_TRANS;
 		smp_wmb();
diff --git a/fs/super.c b/fs/super.c
index d9759e0..05f32a0 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -263,7 +263,7 @@ EXPORT_SYMBOL(unlock_super);
  * device.  Takes the superblock lock.  Requires a second blkdev
  * flush by the caller to complete the operation.
  */
-void __fsync_super(struct super_block *sb)
+static int __fsync_super(struct super_block *sb)
 {
 	sync_inodes_sb(sb, 0);
 	vfs_dq_sync(sb);
@@ -274,7 +274,7 @@ void __fsync_super(struct super_block *sb)
 	unlock_super(sb);
 	if (sb->s_op->sync_fs)
 		sb->s_op->sync_fs(sb, 1);
-	sync_blockdev(sb->s_bdev);
+	return sync_blockdev(sb->s_bdev);
 }
 
 /*
@@ -284,8 +284,7 @@ void __fsync_super(struct super_block *sb)
  */
 int fsync_super(struct super_block *sb)
 {
-	__fsync_super(sb);
-	return sync_blockdev(sb->s_bdev);
+	return __fsync_super(sb);
 }
 EXPORT_SYMBOL_GPL(fsync_super);
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4bad02e..47a67c9 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2084,7 +2084,6 @@ extern int filemap_fdatawrite_range(struct address_space *mapping,
 extern int vfs_fsync(struct file *file, struct dentry *dentry, int datasync);
 extern void sync_supers(void);
 extern void sync_filesystems(int wait);
-extern void __fsync_super(struct super_block *sb);
 extern void emergency_sync(void);
 extern void emergency_remount(void);
 extern int do_remount_sb(struct super_block *sb, int flags,
-- 
1.6.0.2


* [PATCH 4/8] vfs: Make sys_sync() use fsync_super() (version 4)
  2009-04-27 14:43 [PATCH 0/8] Sync fixes and cleanups (version 4) Jan Kara
                   ` (2 preceding siblings ...)
  2009-04-27 14:43 ` [PATCH 3/8] vfs: Make __fsync_super() a static function " Jan Kara
@ 2009-04-27 14:43 ` Jan Kara
  2009-04-27 14:43 ` [PATCH 5/8] vfs: Move syncing code from super.c to sync.c " Jan Kara
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2009-04-27 14:43 UTC
  To: LKML
  Cc: Al Viro, linux-fsdevel, Christoph Hellwig, Trond Myklebust,
	Andrew Morton, Jan Kara

It is unnecessarily fragile to have two places (fsync_super() and do_sync())
doing data integrity sync of the filesystem. Alter __fsync_super() to
accommodate the needs of both callers and use it. So after this patch
__fsync_super() is the only place where we gather all the calls needed to
properly send all data on a filesystem to disk.

A nice bonus is that we get complete livelock avoidance and write_supers()
is now only used for periodic writeback of superblocks.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/block_dev.c            |   15 ++++++---
 fs/fs-writeback.c         |   49 -------------------------------
 fs/super.c                |   70 +++++++++++++++------------------------------
 fs/sync.c                 |   31 ++++++-------------
 include/linux/fs.h        |    4 +-
 include/linux/writeback.h |    1 -
 6 files changed, 45 insertions(+), 125 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 48d1290..2609cce 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -175,17 +175,22 @@ blkdev_direct_IO(int rw, struct kiocb *iocb, const struct iovec *iov,
 				iov, offset, nr_segs, blkdev_get_blocks, NULL);
 }
 
+int __sync_blockdev(struct block_device *bdev, int wait)
+{
+	if (!bdev)
+		return 0;
+	if (!wait)
+		return filemap_flush(bdev->bd_inode->i_mapping);
+	return filemap_write_and_wait(bdev->bd_inode->i_mapping);
+}
+
 /*
  * Write out and wait upon all the dirty data associated with a block
  * device via its mapping.  Does not take the superblock lock.
  */
 int sync_blockdev(struct block_device *bdev)
 {
-	int ret = 0;
-
-	if (bdev)
-		ret = filemap_write_and_wait(bdev->bd_inode->i_mapping);
-	return ret;
+	return __sync_blockdev(bdev, 1);
 }
 EXPORT_SYMBOL(sync_blockdev);
 
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 91013ff..e0fb2e7 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -679,55 +679,6 @@ void sync_inodes_sb(struct super_block *sb, int wait)
 }
 
 /**
- * sync_inodes - writes all inodes to disk
- * @wait: wait for completion
- *
- * sync_inodes() goes through each super block's dirty inode list, writes the
- * inodes out, waits on the writeout and puts the inodes back on the normal
- * list.
- *
- * This is for sys_sync().  fsync_dev() uses the same algorithm.  The subtle
- * part of the sync functions is that the blockdev "superblock" is processed
- * last.  This is because the write_inode() function of a typical fs will
- * perform no I/O, but will mark buffers in the blockdev mapping as dirty.
- * What we want to do is to perform all that dirtying first, and then write
- * back all those inode blocks via the blockdev mapping in one sweep.  So the
- * additional (somewhat redundant) sync_blockdev() calls here are to make
- * sure that really happens.  Because if we call sync_inodes_sb(wait=1) with
- * outstanding dirty inodes, the writeback goes block-at-a-time within the
- * filesystem's write_inode().  This is extremely slow.
- */
-static void __sync_inodes(int wait)
-{
-	struct super_block *sb;
-
-	spin_lock(&sb_lock);
-restart:
-	list_for_each_entry(sb, &super_blocks, s_list) {
-		sb->s_count++;
-		spin_unlock(&sb_lock);
-		down_read(&sb->s_umount);
-		if (sb->s_root) {
-			sync_inodes_sb(sb, wait);
-			sync_blockdev(sb->s_bdev);
-		}
-		up_read(&sb->s_umount);
-		spin_lock(&sb_lock);
-		if (__put_super_and_need_restart(sb))
-			goto restart;
-	}
-	spin_unlock(&sb_lock);
-}
-
-void sync_inodes(int wait)
-{
-	__sync_inodes(0);
-
-	if (wait)
-		__sync_inodes(1);
-}
-
-/**
  * write_inode_now	-	write an inode to disk
  * @inode: inode to write to disk
  * @sync: whether the write should be synchronous or not
diff --git a/fs/super.c b/fs/super.c
index 05f32a0..b5d7dfb 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -258,23 +258,23 @@ EXPORT_SYMBOL(lock_super);
 EXPORT_SYMBOL(unlock_super);
 
 /*
- * Write out and wait upon all dirty data associated with this
- * superblock.  Filesystem data as well as the underlying block
- * device.  Takes the superblock lock.  Requires a second blkdev
- * flush by the caller to complete the operation.
+ * Do the filesystem syncing work. For simple filesystems sync_inodes_sb(sb, 0)
+ * just dirties buffers with inodes so we have to submit IO for these buffers
+ * via __sync_blockdev(). This also speeds up the wait == 1 case since in that
+ * case write_inode() functions do sync_dirty_buffer() and thus effectively
+ * write one block at a time.
  */
-static int __fsync_super(struct super_block *sb)
+static int __fsync_super(struct super_block *sb, int wait)
 {
-	sync_inodes_sb(sb, 0);
 	vfs_dq_sync(sb);
-	sync_inodes_sb(sb, 1);
+	sync_inodes_sb(sb, wait);
 	lock_super(sb);
 	if (sb->s_dirt && sb->s_op->write_super)
 		sb->s_op->write_super(sb);
 	unlock_super(sb);
 	if (sb->s_op->sync_fs)
-		sb->s_op->sync_fs(sb, 1);
-	return sync_blockdev(sb->s_bdev);
+		sb->s_op->sync_fs(sb, wait);
+	return __sync_blockdev(sb->s_bdev, wait);
 }
 
 /*
@@ -284,7 +284,12 @@ static int __fsync_super(struct super_block *sb)
  */
 int fsync_super(struct super_block *sb)
 {
-	return __fsync_super(sb);
+	int ret;
+
+	ret = __fsync_super(sb, 0);
+	if (ret < 0)
+		return ret;
+	return __fsync_super(sb, 1);
 }
 EXPORT_SYMBOL_GPL(fsync_super);
 
@@ -448,20 +453,18 @@ restart:
 }
 
 /*
- * Call the ->sync_fs super_op against all filesystems which are r/w and
- * which implement it.
+ * Sync all the data for all the filesystems (called by sys_sync() and
+ * emergency sync)
  *
  * This operation is careful to avoid the livelock which could easily happen
- * if two or more filesystems are being continuously dirtied.  s_need_sync_fs
+ * if two or more filesystems are being continuously dirtied.  s_need_sync
  * is used only here.  We set it against all filesystems and then clear it as
  * we sync them.  So redirtied filesystems are skipped.
  *
  * But if process A is currently running sync_filesystems and then process B
- * calls sync_filesystems as well, process B will set all the s_need_sync_fs
+ * calls sync_filesystems as well, process B will set all the s_need_sync
  * flags again, which will cause process A to resync everything.  Fix that with
  * a local mutex.
- *
- * (Fabian) Avoid sync_fs with clean fs & wait mode 0
  */
 void sync_filesystems(int wait)
 {
@@ -471,18 +474,16 @@ void sync_filesystems(int wait)
 	mutex_lock(&mutex);		/* Could be down_interruptible */
 	spin_lock(&sb_lock);
 	list_for_each_entry(sb, &super_blocks, s_list) {
-		if (!sb->s_op->sync_fs)
-			continue;
 		if (sb->s_flags & MS_RDONLY)
 			continue;
-		sb->s_need_sync_fs = 1;
+		sb->s_need_sync = 1;
 	}
 
 restart:
 	list_for_each_entry(sb, &super_blocks, s_list) {
-		if (!sb->s_need_sync_fs)
+		if (!sb->s_need_sync)
 			continue;
-		sb->s_need_sync_fs = 0;
+		sb->s_need_sync = 0;
 		if (sb->s_flags & MS_RDONLY)
 			continue;	/* hm.  Was remounted r/o meanwhile */
 		sb->s_count++;
@@ -490,7 +491,7 @@ restart:
 		down_read(&sb->s_umount);
 		async_synchronize_full_domain(&sb->s_async_list);
 		if (sb->s_root)
-			sb->s_op->sync_fs(sb, wait);
+			__fsync_super(sb, wait);
 		up_read(&sb->s_umount);
 		/* restart only when sb is no longer on the list */
 		spin_lock(&sb_lock);
@@ -501,31 +502,6 @@ restart:
 	mutex_unlock(&mutex);
 }
 
-/*
- *  Sync all block devices underlying some superblock
- */
-void sync_blockdevs(void)
-{
-	struct super_block *sb;
-
-	spin_lock(&sb_lock);
-restart:
-	list_for_each_entry(sb, &super_blocks, s_list) {
-		if (!sb->s_bdev)
-			continue;
-		sb->s_count++;
-		spin_unlock(&sb_lock);
-		down_read(&sb->s_umount);
-		if (sb->s_root)
-			sync_blockdev(sb->s_bdev);
-		up_read(&sb->s_umount);
-		spin_lock(&sb_lock);
-		if (__put_super_and_need_restart(sb))
-			goto restart;
-	}
-	spin_unlock(&sb_lock);
-}
-
 /**
  *	get_super - get the superblock of a device
  *	@bdev: device to get the superblock for
diff --git a/fs/sync.c b/fs/sync.c
index fa14e42..86c6a86 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -17,35 +17,24 @@
 #define VALID_FLAGS (SYNC_FILE_RANGE_WAIT_BEFORE|SYNC_FILE_RANGE_WRITE| \
 			SYNC_FILE_RANGE_WAIT_AFTER)
 
-/*
- * sync everything.  Start out by waking pdflush, because that writes back
- * all queues in parallel.
- */
-static void do_sync(unsigned long wait)
+SYSCALL_DEFINE0(sync)
 {
-	wakeup_pdflush(0);
-	sync_inodes(0);		/* All mappings, inodes and their blockdevs */
-	vfs_dq_sync(NULL);
-	sync_inodes(wait);	/* Mappings, inodes and blockdevs, again. */
-	sync_supers();		/* Write the superblocks */
-	sync_filesystems(0);	/* Start syncing the filesystems */
-	sync_filesystems(wait);	/* Waitingly sync the filesystems */
-	sync_blockdevs();
-	if (!wait)
-		printk("Emergency Sync complete\n");
+	sync_filesystems(0);
+	sync_filesystems(1);
 	if (unlikely(laptop_mode))
 		laptop_sync_completion();
-}
-
-SYSCALL_DEFINE0(sync)
-{
-	do_sync(1);
 	return 0;
 }
 
 static void do_sync_work(struct work_struct *work)
 {
-	do_sync(0);
+	/*
+	 * Sync twice to reduce the possibility we skipped some inodes / pages
+	 * because they were temporarily locked
+	 */
+	sync_filesystems(0);
+	sync_filesystems(0);
+	printk("Emergency Sync complete\n");
 	kfree(work);
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 47a67c9..be2be8d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1321,7 +1321,7 @@ struct super_block {
 	struct rw_semaphore	s_umount;
 	struct mutex		s_lock;
 	int			s_count;
-	int			s_need_sync_fs;
+	int			s_need_sync;
 	atomic_t		s_active;
 #ifdef CONFIG_SECURITY
 	void                    *s_security;
@@ -1942,7 +1942,7 @@ extern void bdput(struct block_device *);
 extern struct block_device *open_by_devnum(dev_t, fmode_t);
 extern void invalidate_bdev(struct block_device *);
 extern int sync_blockdev(struct block_device *bdev);
-extern void sync_blockdevs(void);
+extern int __sync_blockdev(struct block_device *bdev, int wait);
 extern struct super_block *freeze_bdev(struct block_device *);
 extern void emergency_thaw_all(void);
 extern int thaw_bdev(struct block_device *bdev, struct super_block *sb);
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 9c1ed1f..943d1c9 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -79,7 +79,6 @@ struct writeback_control {
 void writeback_inodes(struct writeback_control *wbc);
 int inode_wait(void *);
 void sync_inodes_sb(struct super_block *, int wait);
-void sync_inodes(int wait);
 
 /* writeback.h requires fs.h; it, too, is not included from here. */
 static inline void wait_on_inode(struct inode *inode)
-- 
1.6.0.2
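
The livelock avoidance that the sync_filesystems() comment in patch 4/8
describes (mark every filesystem first, then clear each mark as the
filesystem is synced) can be sketched in user space as follows. This is
a simplified illustration under stated assumptions, not the kernel
implementation: struct toy_sb and toy_sync_filesystems() are
hypothetical names, the superblock list is a plain array, and the local
mutex and reference counting are left out. The sketch assumes the worst
case on purpose: after every sync the filesystem is redirtied at once
and the scan restarts from the head of the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct super_block; only what is needed. */
struct toy_sb {
	bool read_only;	/* models MS_RDONLY */
	bool need_sync;	/* models s_need_sync */
	bool dirty;	/* filesystem has unsynced data */
	unsigned syncs;	/* how often this filesystem was synced */
};

/*
 * Sync every writable filesystem once and return how many were synced.
 * Because the mark is cleared before the sync, a filesystem that gets
 * redirtied during the pass is not visited again.
 */
unsigned toy_sync_filesystems(struct toy_sb *sbs, size_t n)
{
	unsigned visited = 0;
	size_t i;

	assert(sbs || n == 0);

	/* Phase 1: mark all writable filesystems. */
	for (i = 0; i < n; i++)
		if (!sbs[i].read_only)
			sbs[i].need_sync = true;

restart:
	/* Phase 2: sync the marked ones, clearing the mark first. */
	for (i = 0; i < n; i++) {
		if (!sbs[i].need_sync)
			continue;
		sbs[i].need_sync = false;
		sbs[i].syncs++;
		visited++;
		/* A concurrent writer redirties the filesystem right away. */
		sbs[i].dirty = true;
		/* Worst case: the list changed, so rescan from the head. */
		goto restart;
	}
	return visited;
}
```

Even with the constant redirtying and the restarts, the loop terminates:
each restart finds one fewer marked entry, so with two writable
filesystems and one read-only filesystem the function syncs exactly two
of them, once each.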


* [PATCH 5/8] vfs: Move syncing code from super.c to sync.c (version 4)
  2009-04-27 14:43 [PATCH 0/8] Sync fixes and cleanups (version 4) Jan Kara
                   ` (3 preceding siblings ...)
  2009-04-27 14:43 ` [PATCH 4/8] vfs: Make sys_sync() use fsync_super() " Jan Kara
@ 2009-04-27 14:43 ` Jan Kara
  2009-04-27 14:43 ` [PATCH 6/8] vfs: Rename fsync_super() to sync_filesystem() " Jan Kara
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2009-04-27 14:43 UTC
  To: LKML
  Cc: Al Viro, linux-fsdevel, Christoph Hellwig, Trond Myklebust,
	Andrew Morton, Jan Kara

Move sync_filesystems(), __fsync_super() and fsync_super() from
super.c to sync.c where they fit better.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/super.c         |   86 ----------------------------------------------------
 fs/sync.c          |   86 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h |    1 -
 3 files changed, 86 insertions(+), 87 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index b5d7dfb..c0302a5 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -257,42 +257,6 @@ void unlock_super(struct super_block * sb)
 EXPORT_SYMBOL(lock_super);
 EXPORT_SYMBOL(unlock_super);
 
-/*
- * Do the filesystem syncing work. For simple filesystems sync_inodes_sb(sb, 0)
- * just dirties buffers with inodes so we have to submit IO for these buffers
- * via __sync_blockdev(). This also speeds up the wait == 1 case since in that
- * case write_inode() functions do sync_dirty_buffer() and thus effectively
- * write one block at a time.
- */
-static int __fsync_super(struct super_block *sb, int wait)
-{
-	vfs_dq_sync(sb);
-	sync_inodes_sb(sb, wait);
-	lock_super(sb);
-	if (sb->s_dirt && sb->s_op->write_super)
-		sb->s_op->write_super(sb);
-	unlock_super(sb);
-	if (sb->s_op->sync_fs)
-		sb->s_op->sync_fs(sb, wait);
-	return __sync_blockdev(sb->s_bdev, wait);
-}
-
-/*
- * Write out and wait upon all dirty data associated with this
- * superblock.  Filesystem data as well as the underlying block
- * device.  Takes the superblock lock.
- */
-int fsync_super(struct super_block *sb)
-{
-	int ret;
-
-	ret = __fsync_super(sb, 0);
-	if (ret < 0)
-		return ret;
-	return __fsync_super(sb, 1);
-}
-EXPORT_SYMBOL_GPL(fsync_super);
-
 /**
  *	generic_shutdown_super	-	common helper for ->kill_sb()
  *	@sb: superblock to kill
@@ -452,56 +416,6 @@ restart:
 	spin_unlock(&sb_lock);
 }
 
-/*
- * Sync all the data for all the filesystems (called by sys_sync() and
- * emergency sync)
- *
- * This operation is careful to avoid the livelock which could easily happen
- * if two or more filesystems are being continuously dirtied.  s_need_sync
- * is used only here.  We set it against all filesystems and then clear it as
- * we sync them.  So redirtied filesystems are skipped.
- *
- * But if process A is currently running sync_filesystems and then process B
- * calls sync_filesystems as well, process B will set all the s_need_sync
- * flags again, which will cause process A to resync everything.  Fix that with
- * a local mutex.
- */
-void sync_filesystems(int wait)
-{
-	struct super_block *sb;
-	static DEFINE_MUTEX(mutex);
-
-	mutex_lock(&mutex);		/* Could be down_interruptible */
-	spin_lock(&sb_lock);
-	list_for_each_entry(sb, &super_blocks, s_list) {
-		if (sb->s_flags & MS_RDONLY)
-			continue;
-		sb->s_need_sync = 1;
-	}
-
-restart:
-	list_for_each_entry(sb, &super_blocks, s_list) {
-		if (!sb->s_need_sync)
-			continue;
-		sb->s_need_sync = 0;
-		if (sb->s_flags & MS_RDONLY)
-			continue;	/* hm.  Was remounted r/o meanwhile */
-		sb->s_count++;
-		spin_unlock(&sb_lock);
-		down_read(&sb->s_umount);
-		async_synchronize_full_domain(&sb->s_async_list);
-		if (sb->s_root)
-			__fsync_super(sb, wait);
-		up_read(&sb->s_umount);
-		/* restart only when sb is no longer on the list */
-		spin_lock(&sb_lock);
-		if (__put_super_and_need_restart(sb))
-			goto restart;
-	}
-	spin_unlock(&sb_lock);
-	mutex_unlock(&mutex);
-}
-
 /**
  *	get_super - get the superblock of a device
  *	@bdev: device to get the superblock for
diff --git a/fs/sync.c b/fs/sync.c
index 86c6a86..a0a163a 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -17,6 +17,92 @@
 #define VALID_FLAGS (SYNC_FILE_RANGE_WAIT_BEFORE|SYNC_FILE_RANGE_WRITE| \
 			SYNC_FILE_RANGE_WAIT_AFTER)
 
+/*
+ * Do the filesystem syncing work. For simple filesystems sync_inodes_sb(sb, 0)
+ * just dirties buffers with inodes so we have to submit IO for these buffers
+ * via __sync_blockdev(). This also speeds up the wait == 1 case since in that
+ * case write_inode() functions do sync_dirty_buffer() and thus effectively
+ * write one block at a time.
+ */
+static int __fsync_super(struct super_block *sb, int wait)
+{
+	vfs_dq_sync(sb);
+	sync_inodes_sb(sb, wait);
+	lock_super(sb);
+	if (sb->s_dirt && sb->s_op->write_super)
+		sb->s_op->write_super(sb);
+	unlock_super(sb);
+	if (sb->s_op->sync_fs)
+		sb->s_op->sync_fs(sb, wait);
+	return __sync_blockdev(sb->s_bdev, wait);
+}
+
+/*
+ * Write out and wait upon all dirty data associated with this
+ * superblock.  Filesystem data as well as the underlying block
+ * device.  Takes the superblock lock.
+ */
+int fsync_super(struct super_block *sb)
+{
+	int ret;
+
+	ret = __fsync_super(sb, 0);
+	if (ret < 0)
+		return ret;
+	return __fsync_super(sb, 1);
+}
+EXPORT_SYMBOL_GPL(fsync_super);
+
+/*
+ * Sync all the data for all the filesystems (called by sys_sync() and
+ * emergency sync)
+ *
+ * This operation is careful to avoid the livelock which could easily happen
+ * if two or more filesystems are being continuously dirtied.  s_need_sync
+ * is used only here.  We set it against all filesystems and then clear it as
+ * we sync them.  So redirtied filesystems are skipped.
+ *
+ * But if process A is currently running sync_filesystems and then process B
+ * calls sync_filesystems as well, process B will set all the s_need_sync
+ * flags again, which will cause process A to resync everything.  Fix that with
+ * a local mutex.
+ */
+static void sync_filesystems(int wait)
+{
+	struct super_block *sb;
+	static DEFINE_MUTEX(mutex);
+
+	mutex_lock(&mutex);		/* Could be down_interruptible */
+	spin_lock(&sb_lock);
+	list_for_each_entry(sb, &super_blocks, s_list) {
+		if (sb->s_flags & MS_RDONLY)
+			continue;
+		sb->s_need_sync = 1;
+	}
+
+restart:
+	list_for_each_entry(sb, &super_blocks, s_list) {
+		if (!sb->s_need_sync)
+			continue;
+		sb->s_need_sync = 0;
+		if (sb->s_flags & MS_RDONLY)
+			continue;	/* hm.  Was remounted r/o meanwhile */
+		sb->s_count++;
+		spin_unlock(&sb_lock);
+		down_read(&sb->s_umount);
+		async_synchronize_full_domain(&sb->s_async_list);
+		if (sb->s_root)
+			__fsync_super(sb, wait);
+		up_read(&sb->s_umount);
+		/* restart only when sb is no longer on the list */
+		spin_lock(&sb_lock);
+		if (__put_super_and_need_restart(sb))
+			goto restart;
+	}
+	spin_unlock(&sb_lock);
+	mutex_unlock(&mutex);
+}
+
 SYSCALL_DEFINE0(sync)
 {
 	sync_filesystems(0);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index be2be8d..cbef739 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2083,7 +2083,6 @@ extern int filemap_fdatawrite_range(struct address_space *mapping,
 
 extern int vfs_fsync(struct file *file, struct dentry *dentry, int datasync);
 extern void sync_supers(void);
-extern void sync_filesystems(int wait);
 extern void emergency_sync(void);
 extern void emergency_remount(void);
 extern int do_remount_sb(struct super_block *sb, int flags,
-- 
1.6.0.2
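
The s_need_sync scheme described in the sync_filesystems() comment above can
be modelled in userspace. The following is only a minimal sketch under stated
assumptions: struct model_sb, sync_marked() and the always_restart flag are
hypothetical names, there is no locking, and the "sync" is just a counter. It
shows why marking first and clearing the mark before syncing means each
filesystem is synced at most once per call, even when the walk restarts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct super_block. */
struct model_sb {
	bool read_only;		/* models MS_RDONLY */
	bool need_sync;		/* models s_need_sync */
	int sync_count;		/* how often this entry was "synced" */
};

/*
 * Mark every writable entry, then sync only marked entries, clearing the
 * mark before the sync. always_restart models the worst case in which the
 * list walk has to start over after every single sync.
 */
int sync_marked(struct model_sb *sbs, size_t n, bool always_restart)
{
	size_t i;
	int synced = 0;

	assert(sbs != NULL || n == 0);

	for (i = 0; i < n; i++)
		if (!sbs[i].read_only)
			sbs[i].need_sync = true;

restart:
	for (i = 0; i < n; i++) {
		if (!sbs[i].need_sync)
			continue;
		sbs[i].need_sync = false;
		/* Mirrors the remount check; never true in this model. */
		if (sbs[i].read_only)
			continue;
		sbs[i].sync_count++;
		synced++;
		if (always_restart)
			goto restart;
	}
	return synced;
}
```

Every restart finds one marked entry fewer, so the walk terminates; entries
that are dirtied again during the walk are not re-marked and are therefore
skipped until the next call.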


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 6/8] vfs: Rename fsync_super() to sync_filesystem() (version 4)
  2009-04-27 14:43 [PATCH 0/8] Sync fixes and cleanups (version 4) Jan Kara
                   ` (4 preceding siblings ...)
  2009-04-27 14:43 ` [PATCH 5/8] vfs: Move syncing code from super.c to sync.c " Jan Kara
@ 2009-04-27 14:43 ` Jan Kara
  2009-04-27 14:43 ` [PATCH 7/8] quota: cleanup dquota sync functions " Jan Kara
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2009-04-27 14:43 UTC (permalink / raw)
  To: LKML
  Cc: Al Viro, linux-fsdevel, Christoph Hellwig, Trond Myklebust,
	Andrew Morton, Jan Kara

Rename the function so that it better describes what it really does. Also
remove the unnecessary include of buffer_head.h.

Signed-off-by: Jan Kara <jack@suse.cz>
---
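The renamed sync_filesystem() keeps the two-pass structure of fsync_super():
a non-waiting pass that submits IO, followed by a waiting pass, with an early
return if the first pass fails. A minimal userspace sketch of that control
flow follows; struct pass_sb, one_pass() and sync_two_pass() are hypothetical
stand-ins, and -5 merely models an error such as -EIO.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a superblock; records what was done to it. */
struct pass_sb {
	int fail_wait;		/* pass (0 or 1) that should fail, -1 for none */
	int passes_run;		/* number of passes executed so far */
	int last_wait;		/* wait argument of the most recent pass */
};

/* Stand-in for __sync_filesystem(sb, wait). */
int one_pass(struct pass_sb *sb, int wait)
{
	sb->passes_run++;
	sb->last_wait = wait;
	return sb->fail_wait == wait ? -5 : 0;
}

/* Same shape as sync_filesystem(): submit first, then wait. */
int sync_two_pass(struct pass_sb *sb)
{
	int ret;

	assert(sb != NULL);
	ret = one_pass(sb, 0);
	if (ret < 0)
		return ret;
	return one_pass(sb, 1);
}
```

The first pass lets IO for many blocks be queued at once, so the waiting pass
does not end up writing one block at a time.
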
 fs/block_dev.c     |    4 ++--
 fs/super.c         |    5 ++---
 fs/sync.c          |   14 +++++++-------
 include/linux/fs.h |    2 +-
 4 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 2609cce..40370c5 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -203,7 +203,7 @@ int fsync_bdev(struct block_device *bdev)
 {
 	struct super_block *sb = get_super(bdev);
 	if (sb) {
-		int res = fsync_super(sb);
+		int res = sync_filesystem(sb);
 		drop_super(sb);
 		return res;
 	}
@@ -245,7 +245,7 @@ struct super_block *freeze_bdev(struct block_device *bdev)
 		sb->s_frozen = SB_FREEZE_WRITE;
 		smp_wmb();
 
-		fsync_super(sb);
+		sync_filesystem(sb);
 
 		sb->s_frozen = SB_FREEZE_TRANS;
 		smp_wmb();
diff --git a/fs/super.c b/fs/super.c
index c0302a5..a7df36c 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -28,7 +28,6 @@
 #include <linux/blkdev.h>
 #include <linux/quotaops.h>
 #include <linux/namei.h>
-#include <linux/buffer_head.h>		/* for fsync_super() */
 #include <linux/mount.h>
 #include <linux/security.h>
 #include <linux/syscalls.h>
@@ -278,7 +277,7 @@ void generic_shutdown_super(struct super_block *sb)
 
 	if (sb->s_root) {
 		shrink_dcache_for_umount(sb);
-		fsync_super(sb);
+		sync_filesystem(sb);
 		lock_super(sb);
 		sb->s_flags &= ~MS_ACTIVE;
 
@@ -561,7 +560,7 @@ int do_remount_sb(struct super_block *sb, int flags, void *data, int force)
 	if (flags & MS_RDONLY)
 		acct_auto_close(sb);
 	shrink_dcache_sb(sb);
-	fsync_super(sb);
+	sync_filesystem(sb);
 
 	/* If we are remounting RDONLY and current sb is read/write,
 	   make sure there are no rw files opened */
diff --git a/fs/sync.c b/fs/sync.c
index a0a163a..c3ef04e 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -12,7 +12,7 @@
 #include <linux/linkage.h>
 #include <linux/pagemap.h>
 #include <linux/quotaops.h>
-#include <linux/buffer_head.h>
+#include <linux/async.h>
 
 #define VALID_FLAGS (SYNC_FILE_RANGE_WAIT_BEFORE|SYNC_FILE_RANGE_WRITE| \
 			SYNC_FILE_RANGE_WAIT_AFTER)
@@ -24,7 +24,7 @@
  * case write_inode() functions do sync_dirty_buffer() and thus effectively
  * write one block at a time.
  */
-static int __fsync_super(struct super_block *sb, int wait)
+static int __sync_filesystem(struct super_block *sb, int wait)
 {
 	vfs_dq_sync(sb);
 	sync_inodes_sb(sb, wait);
@@ -42,16 +42,16 @@ static int __fsync_super(struct super_block *sb, int wait)
  * superblock.  Filesystem data as well as the underlying block
  * device.  Takes the superblock lock.
  */
-int fsync_super(struct super_block *sb)
+int sync_filesystem(struct super_block *sb)
 {
 	int ret;
 
-	ret = __fsync_super(sb, 0);
+	ret = __sync_filesystem(sb, 0);
 	if (ret < 0)
 		return ret;
-	return __fsync_super(sb, 1);
+	return __sync_filesystem(sb, 1);
 }
-EXPORT_SYMBOL_GPL(fsync_super);
+EXPORT_SYMBOL_GPL(sync_filesystem);
 
 /*
  * Sync all the data for all the filesystems (called by sys_sync() and
@@ -92,7 +92,7 @@ restart:
 		down_read(&sb->s_umount);
 		async_synchronize_full_domain(&sb->s_async_list);
 		if (sb->s_root)
-			__fsync_super(sb, wait);
+			__sync_filesystem(sb, wait);
 		up_read(&sb->s_umount);
 		/* restart only when sb is no longer on the list */
 		spin_lock(&sb_lock);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index cbef739..71a59ee 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1947,7 +1947,7 @@ extern struct super_block *freeze_bdev(struct block_device *);
 extern void emergency_thaw_all(void);
 extern int thaw_bdev(struct block_device *bdev, struct super_block *sb);
 extern int fsync_bdev(struct block_device *);
-extern int fsync_super(struct super_block *);
+extern int sync_filesystem(struct super_block *);
 extern int fsync_no_super(struct block_device *);
 #else
 static inline void bd_forget(struct inode *inode) {}
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 7/8] quota: cleanup dquota sync functions (version 4)
  2009-04-27 14:43 [PATCH 0/8] Sync fixes and cleanups (version 4) Jan Kara
                   ` (5 preceding siblings ...)
  2009-04-27 14:43 ` [PATCH 6/8] vfs: Rename fsync_super() to sync_filesystem() " Jan Kara
@ 2009-04-27 14:43 ` Jan Kara
  2009-04-27 14:43 ` [PATCH 8/8] quota: Introduce writeout_quota_sb() " Jan Kara
  2009-04-27 17:36 ` [PATCH 0/8] Sync fixes and cleanups " Al Viro
  8 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2009-04-27 14:43 UTC (permalink / raw)
  To: LKML
  Cc: Al Viro, linux-fsdevel, Christoph Hellwig, Trond Myklebust,
	Andrew Morton, Christoph Hellwig, Jan Kara

From: Christoph Hellwig <hch@lst.de>

Currently the VFS calls vfs_dq_sync() to sync out disk quotas for a given
superblock.  This is a small wrapper around sync_dquots(), which for the
case of a non-NULL superblock is itself a small wrapper around
quota_sync_sb().

Just make quota_sync_sb() global (renaming it to sync_quota_sb()) and call it
directly.  Also call it directly in those cases in quota.c that have a
superblock, leaving sync_dquots() purely an iterator over sync_quota_sb(),
and remove its superblock argument.

To make this nicer, move the check for a missing quota_sync method
from the callers into sync_quota_sb().

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
---
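The resulting call structure can be sketched in userspace as below. This is
only an illustration under stated assumptions: struct quota_model_sb,
sync_one(), sync_all() and do_q_sync() are hypothetical names standing in for
the superblock, sync_quota_sb(), sync_dquots() and the Q_SYNC case, and the
superblock list is a plain array without locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a superblock with an optional quota_sync op. */
struct quota_model_sb {
	void (*quota_sync)(struct quota_model_sb *sb, int type); /* may be NULL */
	int synced;		/* how often quota_sync ran */
};

/* Example quota_sync method that only counts its invocations. */
void count_sync(struct quota_model_sb *sb, int type)
{
	(void)type;
	sb->synced++;
}

/* Models sync_quota_sb(): the NULL-method check now lives in the callee. */
void sync_one(struct quota_model_sb *sb, int type)
{
	if (!sb->quota_sync)
		return;
	sb->quota_sync(sb, type);
}

/* Models sync_dquots(): purely an iterator, no superblock argument. */
void sync_all(struct quota_model_sb *sbs, size_t n, int type)
{
	for (size_t i = 0; i < n; i++)
		sync_one(&sbs[i], type);
}

/* Models the Q_SYNC case: one superblock if given, otherwise all of them. */
void do_q_sync(struct quota_model_sb *sb, struct quota_model_sb *all,
	       size_t n, int type)
{
	assert(all != NULL || n == 0);
	if (sb)
		sync_one(sb, type);
	else
		sync_all(all, n, type);
}
```

With the check inside the callee, no caller has to test for a missing method
before calling.
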
 fs/quota/quota.c         |   23 ++++++++++++-----------
 fs/sync.c                |    2 +-
 include/linux/quotaops.h |   11 +++--------
 3 files changed, 16 insertions(+), 20 deletions(-)

diff --git a/fs/quota/quota.c b/fs/quota/quota.c
index b7f5a46..5edb20d 100644
--- a/fs/quota/quota.c
+++ b/fs/quota/quota.c
@@ -159,10 +159,13 @@ static int check_quotactl_valid(struct super_block *sb, int type, int cmd,
 	return error;
 }
 
-static void quota_sync_sb(struct super_block *sb, int type)
+void sync_quota_sb(struct super_block *sb, int type)
 {
 	int cnt;
 
+	if (!sb->s_qcop->quota_sync)
+		return;
+
 	sb->s_qcop->quota_sync(sb, type);
 
 	if (sb_dqopt(sb)->flags & DQUOT_QUOTA_SYS_FILE)
@@ -192,16 +195,11 @@ static void quota_sync_sb(struct super_block *sb, int type)
 	mutex_unlock(&sb_dqopt(sb)->dqonoff_mutex);
 }
 
-void sync_dquots(struct super_block *sb, int type)
+static void sync_dquots(int type)
 {
+	struct super_block *sb;
 	int cnt;
 
-	if (sb) {
-		if (sb->s_qcop->quota_sync)
-			quota_sync_sb(sb, type);
-		return;
-	}
-
 	spin_lock(&sb_lock);
 restart:
 	list_for_each_entry(sb, &super_blocks, s_list) {
@@ -222,8 +220,8 @@ restart:
 		sb->s_count++;
 		spin_unlock(&sb_lock);
 		down_read(&sb->s_umount);
-		if (sb->s_root && sb->s_qcop->quota_sync)
-			quota_sync_sb(sb, type);
+		if (sb->s_root)
+			sync_quota_sb(sb, type);
 		up_read(&sb->s_umount);
 		spin_lock(&sb_lock);
 		if (__put_super_and_need_restart(sb))
@@ -301,7 +299,10 @@ static int do_quotactl(struct super_block *sb, int type, int cmd, qid_t id,
 			return sb->s_qcop->set_dqblk(sb, type, id, &idq);
 		}
 		case Q_SYNC:
-			sync_dquots(sb, type);
+			if (sb)
+				sync_quota_sb(sb, type);
+			else
+				sync_dquots(type);
 			return 0;
 
 		case Q_XQUOTAON:
diff --git a/fs/sync.c b/fs/sync.c
index c3ef04e..4914f0a 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -26,7 +26,7 @@
  */
 static int __sync_filesystem(struct super_block *sb, int wait)
 {
-	vfs_dq_sync(sb);
+	sync_quota_sb(sb, -1);
 	sync_inodes_sb(sb, wait);
 	lock_super(sb);
 	if (sb->s_dirt && sb->s_op->write_super)
diff --git a/include/linux/quotaops.h b/include/linux/quotaops.h
index 36353d9..047310f 100644
--- a/include/linux/quotaops.h
+++ b/include/linux/quotaops.h
@@ -20,7 +20,7 @@ static inline struct quota_info *sb_dqopt(struct super_block *sb)
 /*
  * declaration of quota_function calls in kernel.
  */
-void sync_dquots(struct super_block *sb, int type);
+void sync_quota_sb(struct super_block *sb, int type);
 
 int dquot_initialize(struct inode *inode, int type);
 int dquot_drop(struct inode *inode);
@@ -253,12 +253,7 @@ static inline void vfs_dq_free_inode(struct inode *inode)
 		inode->i_sb->dq_op->free_inode(inode, 1);
 }
 
-/* The following two functions cannot be called inside a transaction */
-static inline void vfs_dq_sync(struct super_block *sb)
-{
-	sync_dquots(sb, -1);
-}
-
+/* Cannot be called inside a transaction */
 static inline int vfs_dq_off(struct super_block *sb, int remount)
 {
 	int ret = -ENOSYS;
@@ -334,7 +329,7 @@ static inline void vfs_dq_free_inode(struct inode *inode)
 {
 }
 
-static inline void vfs_dq_sync(struct super_block *sb)
+static inline void sync_quota_sb(struct super_block *sb, int type)
 {
 }
 
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 8/8] quota: Introduce writeout_quota_sb() (version 4)
  2009-04-27 14:43 [PATCH 0/8] Sync fixes and cleanups (version 4) Jan Kara
                   ` (6 preceding siblings ...)
  2009-04-27 14:43 ` [PATCH 7/8] quota: cleanup dquota sync functions " Jan Kara
@ 2009-04-27 14:43 ` Jan Kara
  2009-04-27 17:36 ` [PATCH 0/8] Sync fixes and cleanups " Al Viro
  8 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2009-04-27 14:43 UTC (permalink / raw)
  To: LKML
  Cc: Al Viro, linux-fsdevel, Christoph Hellwig, Trond Myklebust,
	Andrew Morton, Jan Kara

Introduce writeout_quota_sb(), which just writes out all the quota structures
but skips the syncing and cache pruning work that is needed only to expose
the quota structures to userspace. Use it from __sync_filesystem() when
wait == 0.

Signed-off-by: Jan Kara <jack@suse.cz>
---
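The effect on the two-pass sync can be sketched with a userspace model. The
names struct quota_cost_sb, writeout_quota(), sync_quota() and sync_pass()
are hypothetical, and the expensive part is reduced to a counter; the sketch
only shows that the non-waiting pass no longer pays for the syncing and cache
pruning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in that counts the work done on a superblock. */
struct quota_cost_sb {
	int writeouts;		/* calls to the ->quota_sync() stand-in */
	int heavy_syncs;	/* stand-in for extra syncing and cache pruning */
};

/* Models writeout_quota_sb(): write the quota structures, nothing more. */
void writeout_quota(struct quota_cost_sb *sb)
{
	assert(sb != NULL);
	sb->writeouts++;
}

/* Models sync_quota_sb(): writeout plus the expensive follow-up work. */
void sync_quota(struct quota_cost_sb *sb)
{
	writeout_quota(sb);
	sb->heavy_syncs++;
}

/* Models the quota part of __sync_filesystem(sb, wait). */
void sync_pass(struct quota_cost_sb *sb, int wait)
{
	if (!wait)
		writeout_quota(sb);
	else
		sync_quota(sb);
}
```

Running the non-waiting and then the waiting pass writes the quota structures
twice but performs the expensive follow-up work only once.
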
 fs/sync.c                |    6 +++++-
 include/linux/quotaops.h |    9 +++++++++
 2 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/fs/sync.c b/fs/sync.c
index 4914f0a..8a14e20 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -26,7 +26,11 @@
  */
 static int __sync_filesystem(struct super_block *sb, int wait)
 {
-	sync_quota_sb(sb, -1);
+	/* Avoid doing the syncing and cache pruning twice for quota sync */
+	if (!wait)
+		writeout_quota_sb(sb, -1);
+	else
+		sync_quota_sb(sb, -1);
 	sync_inodes_sb(sb, wait);
 	lock_super(sb);
 	if (sb->s_dirt && sb->s_op->write_super)
diff --git a/include/linux/quotaops.h b/include/linux/quotaops.h
index 047310f..7bc4575 100644
--- a/include/linux/quotaops.h
+++ b/include/linux/quotaops.h
@@ -21,6 +21,11 @@ static inline struct quota_info *sb_dqopt(struct super_block *sb)
  * declaration of quota_function calls in kernel.
  */
 void sync_quota_sb(struct super_block *sb, int type);
+static inline void writeout_quota_sb(struct super_block *sb, int type)
+{
+	if (sb->s_qcop->quota_sync)
+		sb->s_qcop->quota_sync(sb, type);
+}
 
 int dquot_initialize(struct inode *inode, int type);
 int dquot_drop(struct inode *inode);
@@ -333,6 +338,10 @@ static inline void sync_quota_sb(struct super_block *sb, int type)
 {
 }
 
+static inline void writeout_quota_sb(struct super_block *sb, int type)
+{
+}
+
 static inline int vfs_dq_off(struct super_block *sb, int remount)
 {
 	return 0;
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/8] Sync fixes and cleanups (version 4)
  2009-04-27 14:43 [PATCH 0/8] Sync fixes and cleanups (version 4) Jan Kara
                   ` (7 preceding siblings ...)
  2009-04-27 14:43 ` [PATCH 8/8] quota: Introduce writeout_quota_sb() " Jan Kara
@ 2009-04-27 17:36 ` Al Viro
  8 siblings, 0 replies; 12+ messages in thread
From: Al Viro @ 2009-04-27 17:36 UTC (permalink / raw)
  To: Jan Kara
  Cc: LKML, linux-fsdevel, Christoph Hellwig, Trond Myklebust,
	Andrew Morton

On Mon, Apr 27, 2009 at 04:43:47PM +0200, Jan Kara wrote:
>   Hi,
> 
>   here comes the next version of sync fixes and cleanups. This version also
> includes some moving of code from super.c to sync.c and quota sync cleanup and
> slight speedup. If noone has any objections anymore, who's going to merge it?
> Al?

Yes.  I've got the previous version of patchset in the local queue, BTW,
but I'm still going through that code.  Woke up to find the update...

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2009-04-28 11:56 UTC | newest]

Thread overview: 12+ messages
2009-04-27 14:43 [PATCH 0/8] Sync fixes and cleanups (version 4) Jan Kara
2009-04-27 14:43 ` [PATCH 1/8] vfs: Fix sys_sync() and fsync_super() reliability " Jan Kara
2009-04-27 19:38   ` Andrew Morton
2009-04-28 11:56     ` Jan Kara
2009-04-27 14:43 ` [PATCH 2/8] vfs: Call ->sync_fs() even if s_dirt is 0 " Jan Kara
2009-04-27 14:43 ` [PATCH 3/8] vfs: Make __fsync_super() a static function " Jan Kara
2009-04-27 14:43 ` [PATCH 4/8] vfs: Make sys_sync() use fsync_super() " Jan Kara
2009-04-27 14:43 ` [PATCH 5/8] vfs: Move syncing code from super.c to sync.c " Jan Kara
2009-04-27 14:43 ` [PATCH 6/8] vfs: Rename fsync_super() to sync_filesystem() " Jan Kara
2009-04-27 14:43 ` [PATCH 7/8] quota: cleanup dquota sync functions " Jan Kara
2009-04-27 14:43 ` [PATCH 8/8] quota: Introduce writeout_quota_sb() " Jan Kara
2009-04-27 17:36 ` [PATCH 0/8] Sync fixes and cleanups " Al Viro
