linux-arch.vger.kernel.org archive mirror
* [PATCH v3] introduce sys_syncfs to sync a single file system
       [not found]     ` <Pine.LNX.4.64.1103071515070.11152@cobra.newdream.net>
@ 2011-03-10 19:31       ` Sage Weil
  2011-03-10 22:08         ` Arnd Bergmann
                           ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Sage Weil @ 2011-03-10 19:31 UTC
  To: linux-fsdevel
  Cc: linux-kernel, Aneesh Kumar K. V, Jonathan Nieder, akpm, linux-api,
	arnd, mtk.manpages, viro, hch, linux-arch

It is frequently useful to sync a single file system, instead of all
mounted file systems via sync(2):

 - On machines with many mounts, it is not at all uncommon for some of
   them to hang (e.g. unresponsive NFS server).  sync(2) will get stuck on
   those and may never get to the one you do care about (e.g., /).
 - Some applications write lots of data to the file system and then
   want to make sure it is flushed to disk.  Calling fsync(2) on each
   file introduces unnecessary ordering constraints that result in a large
   amount of sub-optimal writeback/flush/commit behavior by the file
   system.

There are currently two ways (that I know of) to sync a single super_block:

 - BLKFLSBUF ioctl on the block device: That also invalidates the bdev
   mapping, which isn't usually desirable, and doesn't work for non-block
   file systems.
 - 'mount -o remount,rw' will call sync_filesystem as an artifact of the
   current implementation.  Relying on this little-known side effect for
   something like data safety sounds foolish.

Both of these approaches require root privileges, which some applications
do not have, and arguably should not need, given that sync(2) is an
unprivileged operation.
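
For comparison, the first workaround looks roughly like this from
userspace. This is a sketch only: flush_blockdev() is a hypothetical
helper, and the device path passed to it is whatever block device backs
the file system in question. BLKFLSBUF comes from <linux/fs.h>; as noted
above, the call needs root and also drops the block device's cached
mapping.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>	/* BLKFLSBUF */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Flush (and invalidate) the buffer cache of block device `dev`.
 * Returns 0 on success, -1 with errno set on failure, e.g. when the
 * caller lacks the required privileges or `dev` is not a block device.
 */
static int flush_blockdev(const char *dev)
{
	int fd = open(dev, O_RDONLY);
	int ret, saved;

	if (fd < 0)
		return -1;

	ret = ioctl(fd, BLKFLSBUF, 0);

	saved = errno;
	close(fd);
	errno = saved;
	return ret;
}
```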

This patch introduces a new system call syncfs(2) that takes an fd and
syncs only the file system it references.  Maybe someday we can

 $ sync /some/path

and not get

 sync: ignoring all arguments

The syscall is motivated by comments by Al and Christoph at the last LSF.
syncfs(2) seems like an appropriate name given statfs(2).
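
A minimal sketch of how a caller might use the new syscall, roughly what
a path-aware sync(1) would have to do. It is not part of the patch.
sync_path() is a hypothetical helper, and the code assumes no libc
wrapper exists yet, so it goes through syscall(2) with SYS_syncfs taken
from headers generated against a kernel carrying this change. The
numbers assigned in this v3 (341 on i386, 303 on x86-64) should not be
hard-coded, since they may shift before merge.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Flush the file system that contains `path`.
 * Any fd on the file system will do; a read-only open of a file or
 * directory is enough.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int sync_path(const char *path)
{
	int fd = open(path, O_RDONLY);
	long ret;
	int saved;

	if (fd < 0)
		return -1;

#ifdef SYS_syncfs
	ret = syscall(SYS_syncfs, fd);
#else
	ret = -1;
	errno = ENOSYS;	/* headers predate the syscall */
#endif

	saved = errno;
	close(fd);
	errno = saved;
	return ret == 0 ? 0 : -1;
}
```

A caller would then write sync_path("/some/path") and check the result;
on a kernel without the syscall the call fails with ENOSYS.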

A similar ioctl was also proposed a while back, see
	http://marc.info/?l=linux-fsdevel&m=127970513829285&w=2

Signed-off-by: Sage Weil <sage@newdream.net>
---
ChangeLog:
  v3: Update include/linux/syscalls.h and asm-generic/unistd.h
  v2: Rename to syncfs, simplify to just take an fd.
  v1: syncat

 arch/x86/ia32/ia32entry.S          |    1 +
 arch/x86/include/asm/unistd_32.h   |    3 ++-
 arch/x86/include/asm/unistd_64.h   |    2 ++
 arch/x86/kernel/syscall_table_32.S |    1 +
 fs/sync.c                          |   24 ++++++++++++++++++++++++
 include/asm-generic/unistd.h       |    4 +++-
 include/linux/syscalls.h           |    1 +
 7 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index 518bb99..24082b8 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -851,4 +851,5 @@ ia32_sys_call_table:
 	.quad sys_fanotify_init
 	.quad sys32_fanotify_mark
 	.quad sys_prlimit64		/* 340 */
+	.quad sys_syncfs
 ia32_syscall_end:
diff --git a/arch/x86/include/asm/unistd_32.h b/arch/x86/include/asm/unistd_32.h
index b766a5e..da3903f 100644
--- a/arch/x86/include/asm/unistd_32.h
+++ b/arch/x86/include/asm/unistd_32.h
@@ -346,10 +346,11 @@
 #define __NR_fanotify_init	338
 #define __NR_fanotify_mark	339
 #define __NR_prlimit64		340
+#define __NR_syncfs             341
 
 #ifdef __KERNEL__
 
-#define NR_syscalls 341
+#define NR_syscalls 342
 
 #define __ARCH_WANT_IPC_PARSE_VERSION
 #define __ARCH_WANT_OLD_READDIR
diff --git a/arch/x86/include/asm/unistd_64.h b/arch/x86/include/asm/unistd_64.h
index 363e9b8..a2e7516 100644
--- a/arch/x86/include/asm/unistd_64.h
+++ b/arch/x86/include/asm/unistd_64.h
@@ -669,6 +669,8 @@ __SYSCALL(__NR_fanotify_init, sys_fanotify_init)
 __SYSCALL(__NR_fanotify_mark, sys_fanotify_mark)
 #define __NR_prlimit64				302
 __SYSCALL(__NR_prlimit64, sys_prlimit64)
+#define __NR_syncfs                             303
+__SYSCALL(__NR_syncfs, sys_syncfs)
 
 #ifndef __NO_STUBS
 #define __ARCH_WANT_OLD_READDIR
diff --git a/arch/x86/kernel/syscall_table_32.S b/arch/x86/kernel/syscall_table_32.S
index b35786d..d40bd16 100644
--- a/arch/x86/kernel/syscall_table_32.S
+++ b/arch/x86/kernel/syscall_table_32.S
@@ -340,3 +340,4 @@ ENTRY(sys_call_table)
 	.long sys_fanotify_init
 	.long sys_fanotify_mark
 	.long sys_prlimit64		/* 340 */
+	.long sys_syncfs
diff --git a/fs/sync.c b/fs/sync.c
index ba76b96..92ca208 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -7,6 +7,7 @@
 #include <linux/fs.h>
 #include <linux/slab.h>
 #include <linux/module.h>
+#include <linux/namei.h>
 #include <linux/sched.h>
 #include <linux/writeback.h>
 #include <linux/syscalls.h>
@@ -128,6 +129,29 @@ void emergency_sync(void)
 	}
 }
 
+/*
+ * sync a single super
+ */
+SYSCALL_DEFINE1(syncfs, int, fd)
+{
+	struct file *file;
+	struct super_block *sb;
+	int ret;
+	int fput_needed;
+
+	file = fget_light(fd, &fput_needed);
+	if (!file)
+		return -EBADF;
+	sb = file->f_dentry->d_sb;
+
+	down_read(&sb->s_umount);
+	ret = sync_filesystem(sb);
+	up_read(&sb->s_umount);
+
+	fput_light(file, fput_needed);
+	return ret;
+}
+
 /**
  * vfs_fsync_range - helper to sync a range of data & metadata to disk
  * @file:		file to sync
diff --git a/include/asm-generic/unistd.h b/include/asm-generic/unistd.h
index b969770..3cf62eb 100644
--- a/include/asm-generic/unistd.h
+++ b/include/asm-generic/unistd.h
@@ -646,9 +646,11 @@ __SYSCALL(__NR_prlimit64, sys_prlimit64)
 __SYSCALL(__NR_fanotify_init, sys_fanotify_init)
 #define __NR_fanotify_mark 263
 __SYSCALL(__NR_fanotify_mark, sys_fanotify_mark)
+#define __NR_syncfs 264
+__SYSCALL(__NR_syncfs, sys_syncfs)
 
 #undef __NR_syscalls
-#define __NR_syscalls 264
+#define __NR_syscalls 265
 
 /*
  * All syscalls below here should go away really,
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 98664db..0ceed21 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -820,6 +820,7 @@ asmlinkage long sys_fanotify_init(unsigned int flags, unsigned int event_f_flags
 asmlinkage long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
 				  u64 mask, int fd,
 				  const char  __user *pathname);
+asmlinkage long sys_syncfs(int fd);
 
 int kernel_execve(const char *filename, const char *const argv[], const char *const envp[]);
 
-- 
1.7.0


* Re: [PATCH v3] introduce sys_syncfs to sync a single file system
  2011-03-10 19:31       ` [PATCH v3] introduce sys_syncfs to sync a single file system Sage Weil
@ 2011-03-10 22:08         ` Arnd Bergmann
       [not found]         ` <Pine.LNX.4.64.1103101125150.4190-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
  2011-03-13 20:59         ` Christoph Hellwig
  2 siblings, 0 replies; 6+ messages in thread
From: Arnd Bergmann @ 2011-03-10 22:08 UTC
  To: Sage Weil
  Cc: linux-fsdevel, linux-kernel, Aneesh Kumar K. V, Jonathan Nieder,
	akpm, linux-api, mtk.manpages, viro, hch, linux-arch

On Thursday 10 March 2011 20:31:30 Sage Weil wrote:
> It is frequently useful to sync a single file system, instead of all
> mounted file systems via sync(2):
> 
>  - On machines with many mounts, it is not at all uncommon for some of
>    them to hang (e.g. unresponsive NFS server).  sync(2) will get stuck on
>    those and may never get to the one you do care about (e.g., /).
>  - Some applications write lots of data to the file system and then
>    want to make sure it is flushed to disk.  Calling fsync(2) on each
>    file introduces unnecessary ordering constraints that result in a large
>    amount of sub-optimal writeback/flush/commit behavior by the file
>    system.
> 
> ...
> Signed-off-by: Sage Weil <sage@newdream.net>

Reviewed-by: Arnd Bergmann <arnd@arndb.de>


* Re: [PATCH v3] introduce sys_syncfs to sync a single file system
  2011-03-11  4:44           ` Aneesh Kumar K. V
@ 2011-03-11  4:44             ` Aneesh Kumar K. V
  0 siblings, 0 replies; 6+ messages in thread
From: Aneesh Kumar K. V @ 2011-03-11  4:44 UTC
  To: Sage Weil, linux-fsdevel
  Cc: linux-kernel, Jonathan Nieder, akpm, linux-api, arnd,
	mtk.manpages, viro, hch, linux-arch

On Thu, 10 Mar 2011 11:31:30 -0800 (PST), Sage Weil <sage@newdream.net> wrote:
> It is frequently useful to sync a single file system, instead of all
> mounted file systems via sync(2):
> 
>  - On machines with many mounts, it is not at all uncommon for some of
>    them to hang (e.g. unresponsive NFS server).  sync(2) will get stuck on
>    those and may never get to the one you do care about (e.g., /).
>  - Some applications write lots of data to the file system and then
>    want to make sure it is flushed to disk.  Calling fsync(2) on each
>    file introduces unnecessary ordering constraints that result in a large
>    amount of sub-optimal writeback/flush/commit behavior by the file
>    system.
> 
> There are currently two ways (that I know of) to sync a single super_block:
> 
>  - BLKFLSBUF ioctl on the block device: That also invalidates the bdev
>    mapping, which isn't usually desirable, and doesn't work for non-block
>    file systems.
>  - 'mount -o remount,rw' will call sync_filesystem as an artifact of the
>    current implementation.  Relying on this little-known side effect for
>    something like data safety sounds foolish.
> 
> Both of these approaches require root privileges, which some applications
> do not have, and arguably should not need, given that sync(2) is an
> unprivileged operation.
> 
> This patch introduces a new system call syncfs(2) that takes an fd and
> syncs only the file system it references.  Maybe someday we can
> 
>  $ sync /some/path
> 
> and not get
> 
>  sync: ignoring all arguments
> 
> The syscall is motivated by comments by Al and Christoph at the last LSF.
> syncfs(2) seems like an appropriate name given statfs(2).
> 
> A similar ioctl was also proposed a while back, see
> 	http://marc.info/?l=linux-fsdevel&m=127970513829285&w=2
> 
> Signed-off-by: Sage Weil <sage@newdream.net>

Reviewed-by: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>



* Re: [PATCH v3] introduce sys_syncfs to sync a single file system
  2011-03-10 19:31       ` [PATCH v3] introduce sys_syncfs to sync a single file system Sage Weil
  2011-03-10 22:08         ` Arnd Bergmann
       [not found]         ` <Pine.LNX.4.64.1103101125150.4190-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
@ 2011-03-13 20:59         ` Christoph Hellwig
  2011-03-13 20:59           ` Christoph Hellwig
  2 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2011-03-13 20:59 UTC
  To: Sage Weil
  Cc: linux-fsdevel, linux-kernel, Aneesh Kumar K. V, Jonathan Nieder,
	akpm, linux-api, arnd, mtk.manpages, viro, hch, linux-arch

> +/*
> + * sync a single super
> + */

Not exactly the most descriptive comment.

Otherwise looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

