linux-arch.vger.kernel.org archive mirror
* [PATCH] implement posix O_SYNC and O_DSYNC semantics
       [not found] <20090914165419.GD25549@duck.suse.cz>
@ 2009-09-15 13:12 ` Christoph Hellwig
  2009-09-15 14:10   ` Jan Kara
                     ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Christoph Hellwig @ 2009-09-15 13:12 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-kernel, linux-arch, akpm, drepper, viro, kyle, sct

While Linux has provided an O_SYNC flag basically since day 1, it took
until Linux 2.4.0-test12pre2 to actually get it implemented for
filesystems.  Since that day we have had generic_osync_around with only
minor changes and the great "For now, when the user asks for O_SYNC,
we'll actually give O_DSYNC" comment.  This patch intends to actually
give us real O_SYNC semantics in addition to the O_DSYNC semantics.
After Jan's O_SYNC patches, which are required before this patch, it's
actually surprisingly simple: we just need to figure out when to pass
the datasync flag to vfs_fsync_range and when not.

This patch renames the existing O_SYNC flag to O_DSYNC while keeping
its numerical value to keep binary compatibility, and adds a new real
O_SYNC flag.  To guarantee backwards compatibility it is defined as
expanding to both the O_DSYNC and the new additional binary flag
(__O_SYNC) to make sure we are backwards-compatible when compiled against
the new headers.
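
As a rough userspace illustration of that encoding (not part of the
patch): the EX_ names and helper functions below are hypothetical, and
the octal values are simply copied from the asm-generic hunk further
down.  It sketches why a binary built against the new headers still gets
at least O_DSYNC behaviour from a kernel that only knows the old bit.

```c
#include <assert.h>

/* Hypothetical copies of the asm-generic values from this patch. */
#define EX_O_DSYNC	00010000	/* numerical value of the old O_SYNC */
#define EX_O_SYNC_BIT	04000000	/* stands in for __O_SYNC */
#define EX_O_SYNC	(EX_O_SYNC_BIT | EX_O_DSYNC)

/* An old kernel only tests the old bit and ignores the new one. */
int ex_old_kernel_sees_sync(int open_flags)
{
	return (open_flags & EX_O_DSYNC) != 0;
}

/* A patched kernel can tell the cases apart: 0 none, 1 O_DSYNC, 2 O_SYNC. */
int ex_new_kernel_sync_level(int open_flags)
{
	if (!(open_flags & EX_O_DSYNC))
		return 0;
	return (open_flags & EX_O_SYNC_BIT) ? 2 : 1;
}

/* Sanity check: the new O_SYNC always contains the O_DSYNC bit. */
void ex_check_encoding(void)
{
	assert((EX_O_SYNC & EX_O_DSYNC) == EX_O_DSYNC);
}
```

An old binary passes only the old value (EX_O_DSYNC here) and keeps the
data-sync behaviour it always had; a new binary passing EX_O_SYNC is
still recognised as synchronous by an old kernel.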

This also means that all places that don't care about the differences
can just check O_DSYNC and get the right behaviour for O_SYNC, too - only
places that actually care need to check __O_SYNC in addition.  Drivers
and network filesystems have been updated in a fail-safe way to always
do the full sync magic if O_DSYNC is set.  The few places setting O_SYNC
for lower layers are kept that way for now to stay fail-safe.
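
The two kinds of check can be sketched in plain C as follows.  This is
only an illustration modelled on the generic_write_sync hunk below; the
EX_ constants and function names are hypothetical stand-ins, not kernel
symbols.

```c
#include <assert.h>

/* Hypothetical stand-ins for the asm-generic flag values in this patch. */
#define EX_O_DSYNC	00010000
#define EX_O_SYNC_BIT	04000000	/* stands in for __O_SYNC */
#define EX_O_SYNC	(EX_O_SYNC_BIT | EX_O_DSYNC)

/* Callers that do not care about the difference test O_DSYNC only. */
int ex_needs_sync(unsigned int f_flags)
{
	return (f_flags & EX_O_DSYNC) != 0;
}

/*
 * Callers that do care derive the datasync argument from the extra bit:
 * 1 means data-only sync (O_DSYNC), 0 means full sync (O_SYNC).
 */
int ex_datasync_arg(unsigned int f_flags)
{
	return (f_flags & EX_O_SYNC_BIT) ? 0 : 1;
}

/* Sanity check: both flag spellings require some form of syncing. */
void ex_check_sync_tests(void)
{
	assert(ex_needs_sync(EX_O_DSYNC));
	assert(ex_needs_sync(EX_O_SYNC));
}
```

A write path in this style would first call ex_needs_sync() and, only if
that is true, pass ex_datasync_arg() on to its fsync helper.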

We enforce that O_DSYNC is set when __O_SYNC is set early in the
open path to make sure we always get these sane options.
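
A minimal sketch of that normalisation, mirroring the fs/namei.c hunk
below; the EX_ constants and the function name are hypothetical and only
serve the illustration.

```c
#include <assert.h>

/* Hypothetical stand-ins for the asm-generic flag values in this patch. */
#define EX_O_DSYNC	00010000
#define EX_O_SYNC_BIT	04000000	/* stands in for __O_SYNC */
#define EX_O_SYNC	(EX_O_SYNC_BIT | EX_O_DSYNC)

/*
 * If the caller set only the new bit, force the O_DSYNC bit on as well so
 * that every later "flags & O_DSYNC" test still triggers.
 */
int ex_normalize_open_flags(int open_flag)
{
	if (open_flag & EX_O_SYNC_BIT)
		open_flag |= EX_O_DSYNC;
	return open_flag;
}

/* Sanity check: a lone new bit is promoted to the full O_SYNC value. */
void ex_check_normalize(void)
{
	assert(ex_normalize_open_flags(EX_O_SYNC_BIT) == EX_O_SYNC);
}
```

Flags that already carry O_DSYNC, or no sync bit at all, pass through
unchanged.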

Note that parisc really fucked up their headers as they already define
an O_DSYNC that has always been a no-op.  We try to repair it by using it
for the new O_DSYNC and redefining O_SYNC to send both the traditional
O_SYNC numerical value _and_ the O_DSYNC one.


Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>

Index: linux-2.6/arch/x86/mm/pat.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/pat.c	2009-09-15 00:46:32.911256267 -0300
+++ linux-2.6/arch/x86/mm/pat.c	2009-09-15 09:41:27.301253948 -0300
@@ -541,7 +541,7 @@ int phys_mem_access_prot_allowed(struct 
 	if (!range_is_allowed(pfn, size))
 		return 0;
 
-	if (file->f_flags & O_SYNC) {
+	if (file->f_flags & O_DSYNC) {
 		flags = _PAGE_CACHE_UC_MINUS;
 	}
 
Index: linux-2.6/drivers/char/mem.c
===================================================================
--- linux-2.6.orig/drivers/char/mem.c	2009-09-15 00:46:33.096254330 -0300
+++ linux-2.6/drivers/char/mem.c	2009-09-15 09:41:27.302253936 -0300
@@ -44,7 +44,7 @@ static inline int uncached_access(struct
 {
 #if defined(CONFIG_IA64)
 	/*
-	 * On ia64, we ignore O_SYNC because we cannot tolerate memory attribute aliases.
+	 * On ia64, we ignore O_DSYNC because we cannot tolerate memory attribute aliases.
 	 */
 	return !(efi_mem_attributes(addr) & EFI_MEMORY_WB);
 #elif defined(CONFIG_MIPS)
@@ -57,9 +57,9 @@ static inline int uncached_access(struct
 #else
 	/*
 	 * Accessing memory above the top the kernel knows about or through a file pointer
-	 * that was marked O_SYNC will be done non-cached.
+	 * that was marked O_DSYNC will be done non-cached.
 	 */
-	if (file->f_flags & O_SYNC)
+	if (file->f_flags & O_DSYNC)
 		return 1;
 	return addr >= __pa(high_memory);
 #endif
Index: linux-2.6/drivers/staging/me4000/me4000.c
===================================================================
--- linux-2.6.orig/drivers/staging/me4000/me4000.c	2009-09-15 00:46:33.130254399 -0300
+++ linux-2.6/drivers/staging/me4000/me4000.c	2009-09-15 09:41:27.305253618 -0300
@@ -1985,8 +1985,8 @@ static ssize_t me4000_ao_write_cont(stru
 			spin_unlock_irqrestore(&ao_context->int_lock, flags);
 		}
 
-		/* Wait until the state machine is stopped if O_SYNC is set */
-		if (filep->f_flags & O_SYNC) {
+		/* Wait until the state machine is stopped if O_DSYNC is set */
+		if (filep->f_flags & O_DSYNC) {
 			while (inl(ao_context->status_reg) &
 			       ME4000_AO_STATUS_BIT_FSM) {
 				interruptible_sleep_on_timeout(&queue, 1);
Index: linux-2.6/drivers/usb/gadget/file_storage.c
===================================================================
--- linux-2.6.orig/drivers/usb/gadget/file_storage.c	2009-09-15 00:46:33.138253951 -0300
+++ linux-2.6/drivers/usb/gadget/file_storage.c	2009-09-15 09:41:27.311253752 -0300
@@ -1713,7 +1713,7 @@ static int do_write(struct fsg_dev *fsg)
 		}
 		if (fsg->cmnd[1] & 0x08) {	// FUA
 			spin_lock(&curlun->filp->f_lock);
-			curlun->filp->f_flags |= O_SYNC;
+			curlun->filp->f_flags |= O_DSYNC;
 			spin_unlock(&curlun->filp->f_lock);
 		}
 	}
Index: linux-2.6/fs/afs/write.c
===================================================================
--- linux-2.6.orig/fs/afs/write.c	2009-09-15 00:46:33.144254016 -0300
+++ linux-2.6/fs/afs/write.c	2009-09-15 09:41:27.316253550 -0300
@@ -692,8 +692,9 @@ ssize_t afs_file_write(struct kiocb *ioc
 	}
 
 	/* return error values for O_SYNC and IS_SYNC() */
-	if (IS_SYNC(&vnode->vfs_inode) || iocb->ki_filp->f_flags & O_SYNC) {
-		ret = afs_fsync(iocb->ki_filp, dentry, 1);
+	if (IS_SYNC(&vnode->vfs_inode) || iocb->ki_filp->f_flags & O_DSYNC) {
+		ret = afs_fsync(iocb->ki_filp, dentry,
+				(iocb->ki_filp->f_flags & __O_SYNC) ? 0 : 1);
 		if (ret < 0)
 			result = ret;
 	}
Index: linux-2.6/fs/btrfs/file.c
===================================================================
--- linux-2.6.orig/fs/btrfs/file.c	2009-09-15 00:46:33.151254279 -0300
+++ linux-2.6/fs/btrfs/file.c	2009-09-15 09:41:27.316253550 -0300
@@ -924,7 +924,7 @@ static ssize_t btrfs_file_write(struct f
 	unsigned long last_index;
 	int will_write;
 
-	will_write = ((file->f_flags & O_SYNC) || IS_SYNC(inode) ||
+	will_write = ((file->f_flags & O_DSYNC) || IS_SYNC(inode) ||
 		      (file->f_flags & O_DIRECT));
 
 	nrptrs = min((count + PAGE_CACHE_SIZE - 1) / PAGE_CACHE_SIZE,
@@ -1077,7 +1077,7 @@ out_nolock:
 		if (err)
 			num_written = err;
 
-		if ((file->f_flags & O_SYNC) || IS_SYNC(inode)) {
+		if ((file->f_flags & O_DSYNC) || IS_SYNC(inode)) {
 			trans = btrfs_start_transaction(root, 1);
 			ret = btrfs_log_dentry_safe(trans, root,
 						    file->f_dentry);
Index: linux-2.6/fs/cifs/dir.c
===================================================================
--- linux-2.6.orig/fs/cifs/dir.c	2009-09-15 00:46:33.156254147 -0300
+++ linux-2.6/fs/cifs/dir.c	2009-09-15 09:41:27.319254141 -0300
@@ -214,7 +214,8 @@ int cifs_posix_open(char *full_path, str
 		posix_flags |= SMB_O_TRUNC;
 	if (oflags & O_APPEND)
 		posix_flags |= SMB_O_APPEND;
-	if (oflags & O_SYNC)
+	/* be safe and imply O_SYNC for O_DSYNC */
+	if (oflags & O_DSYNC)
 		posix_flags |= SMB_O_SYNC;
 	if (oflags & O_DIRECTORY)
 		posix_flags |= SMB_O_DIRECTORY;
Index: linux-2.6/fs/cifs/file.c
===================================================================
--- linux-2.6.orig/fs/cifs/file.c	2009-09-15 00:46:33.162254422 -0300
+++ linux-2.6/fs/cifs/file.c	2009-09-15 09:41:27.323254719 -0300
@@ -96,8 +96,10 @@ static inline fmode_t cifs_posix_convert
 	   reopening a file.  They had their effect on the original open */
 	if (flags & O_APPEND)
 		posix_flags |= (fmode_t)O_APPEND;
-	if (flags & O_SYNC)
-		posix_flags |= (fmode_t)O_SYNC;
+	if (flags & O_DSYNC)
+		posix_flags |= (fmode_t)O_DSYNC;
+	if (flags & __O_SYNC)
+		posix_flags |= (fmode_t)__O_SYNC;
 	if (flags & O_DIRECTORY)
 		posix_flags |= (fmode_t)O_DIRECTORY;
 	if (flags & O_NOFOLLOW)
Index: linux-2.6/fs/namei.c
===================================================================
--- linux-2.6.orig/fs/namei.c	2009-09-15 00:46:33.168253161 -0300
+++ linux-2.6/fs/namei.c	2009-09-15 09:45:26.694256679 -0300
@@ -1678,6 +1678,15 @@ struct file *do_filp_open(int dfd, const
 	int will_write;
 	int flag = open_to_namei_flags(open_flag);
 
+	/*
+	 * O_SYNC is implemented as __O_SYNC|O_DSYNC.  As many places only
+	 * check for O_DSYNC if they need any syncing at all we enforce it's
+	 * always set instead of having to deal with possibly weird behaviour
+	 * for malicious applications setting only __O_SYNC.
+	 */
+	if (open_flag & __O_SYNC)
+		open_flag |= O_DSYNC;
+
 	if (!acc_mode)
 		acc_mode = MAY_OPEN | ACC_MODE(flag);
 
Index: linux-2.6/fs/nfs/file.c
===================================================================
--- linux-2.6.orig/fs/nfs/file.c	2009-09-15 00:46:33.174254134 -0300
+++ linux-2.6/fs/nfs/file.c	2009-09-15 09:41:27.330253653 -0300
@@ -580,7 +580,7 @@ static int nfs_need_sync_write(struct fi
 {
 	struct nfs_open_context *ctx;
 
-	if (IS_SYNC(inode) || (filp->f_flags & O_SYNC))
+	if (IS_SYNC(inode) || (filp->f_flags & O_DSYNC))
 		return 1;
 	ctx = nfs_file_open_context(filp);
 	if (test_bit(NFS_CONTEXT_ERROR_WRITE, &ctx->flags))
@@ -621,7 +621,7 @@ static ssize_t nfs_file_write(struct kio
 
 	nfs_add_stats(inode, NFSIOS_NORMALWRITTENBYTES, count);
 	result = generic_file_aio_write(iocb, iov, nr_segs, pos);
-	/* Return error values for O_SYNC and IS_SYNC() */
+	/* Return error values for O_DSYNC and IS_SYNC() */
 	if (result >= 0 && nfs_need_sync_write(iocb->ki_filp, inode)) {
 		int err = nfs_do_fsync(nfs_file_open_context(iocb->ki_filp), inode);
 		if (err < 0)
Index: linux-2.6/fs/nfs/write.c
===================================================================
--- linux-2.6.orig/fs/nfs/write.c	2009-09-15 00:46:33.180254200 -0300
+++ linux-2.6/fs/nfs/write.c	2009-09-15 09:41:27.332254187 -0300
@@ -774,7 +774,7 @@ int nfs_updatepage(struct file *file, st
 	 */
 	if (nfs_write_pageuptodate(page, inode) &&
 			inode->i_flock == NULL &&
-			!(file->f_flags & O_SYNC)) {
+			!(file->f_flags & O_DSYNC)) {
 		count = max(count + offset, nfs_page_length(page));
 		offset = 0;
 	}
Index: linux-2.6/include/asm-generic/fcntl.h
===================================================================
--- linux-2.6.orig/include/asm-generic/fcntl.h	2009-09-15 00:46:33.211253817 -0300
+++ linux-2.6/include/asm-generic/fcntl.h	2009-09-15 09:41:27.335253940 -0300
@@ -3,8 +3,6 @@
 
 #include <linux/types.h>
 
-/* open/fcntl - O_SYNC is only implemented on blocks devices and on files
-   located on an ext2 file system */
 #define O_ACCMODE	00000003
 #define O_RDONLY	00000000
 #define O_WRONLY	00000001
@@ -27,8 +25,8 @@
 #ifndef O_NONBLOCK
 #define O_NONBLOCK	00004000
 #endif
-#ifndef O_SYNC
-#define O_SYNC		00010000
+#ifndef O_DSYNC
+#define O_DSYNC		00010000	/* used to be O_SYNC, see below */
 #endif
 #ifndef FASYNC
 #define FASYNC		00020000	/* fcntl, for BSD compatibility */
@@ -51,6 +49,25 @@
 #ifndef O_CLOEXEC
 #define O_CLOEXEC	02000000	/* set close_on_exec */
 #endif
+
+/*
+ * Before Linux 2.6.32 only O_DSYNC semantics were implemented, but using
+ * the O_SYNC flag.  We continue to use the existing numerical value
+ * for O_DSYNC semantics now, but using the correct symbolic name for it.
+ * This new value is used to request true Posix O_SYNC semantics.  It is
+ * defined in this strange way to make sure applications compiled against
+ * new headers get at least O_DSYNC semantics on older kernels.
+ *
+ * This has the nice side-effect that we can simply test for O_DSYNC
+ * wherever we do not care if O_DSYNC or O_SYNC is used.
+ *
+ * Note: __O_SYNC must never be used directly.
+ */
+#ifndef O_SYNC
+#define __O_SYNC	04000000
+#define O_SYNC		(__O_SYNC|O_DSYNC)
+#endif
+
 #ifndef O_NDELAY
 #define O_NDELAY	O_NONBLOCK
 #endif
Index: linux-2.6/fs/ocfs2/file.c
===================================================================
--- linux-2.6.orig/fs/ocfs2/file.c	2009-09-15 00:46:33.186253776 -0300
+++ linux-2.6/fs/ocfs2/file.c	2009-09-15 09:41:27.338254042 -0300
@@ -1878,7 +1878,7 @@ out_dio:
 	/* buffered aio wouldn't have proper lock coverage today */
 	BUG_ON(ret == -EIOCBQUEUED && !(file->f_flags & O_DIRECT));
 
-	if ((file->f_flags & O_SYNC && !direct_io) || IS_SYNC(inode)) {
+	if ((file->f_flags & O_DSYNC && !direct_io) || IS_SYNC(inode)) {
 		ret = filemap_fdatawrite_range(file->f_mapping, pos,
 					       pos + count - 1);
 		if (ret < 0)
Index: linux-2.6/fs/ubifs/file.c
===================================================================
--- linux-2.6.orig/fs/ubifs/file.c	2009-09-15 00:46:33.192253912 -0300
+++ linux-2.6/fs/ubifs/file.c	2009-09-15 09:41:27.341254213 -0300
@@ -1403,7 +1403,7 @@ static ssize_t ubifs_aio_write(struct ki
 	if (ret < 0)
 		return ret;
 
-	if (ret > 0 && (IS_SYNC(inode) || iocb->ki_filp->f_flags & O_SYNC)) {
+	if (ret > 0 && (IS_SYNC(inode) || iocb->ki_filp->f_flags & O_DSYNC)) {
 		err = ubifs_sync_wbufs_by_inode(c, inode);
 		if (err)
 			return err;
Index: linux-2.6/fs/xfs/linux-2.6/xfs_lrw.c
===================================================================
--- linux-2.6.orig/fs/xfs/linux-2.6/xfs_lrw.c	2009-09-15 00:46:33.198253488 -0300
+++ linux-2.6/fs/xfs/linux-2.6/xfs_lrw.c	2009-09-15 09:41:27.344254176 -0300
@@ -811,7 +811,7 @@ write_retry:
 	XFS_STATS_ADD(xs_write_bytes, ret);
 
 	/* Handle various SYNC-type writes */
-	if ((file->f_flags & O_SYNC) || IS_SYNC(inode)) {
+	if ((file->f_flags & O_DSYNC) || IS_SYNC(inode)) {
 		int error2;
 
 		xfs_iunlock(xip, iolock);
Index: linux-2.6/sound/core/rawmidi.c
===================================================================
--- linux-2.6.orig/sound/core/rawmidi.c	2009-09-15 00:46:33.219253718 -0300
+++ linux-2.6/sound/core/rawmidi.c	2009-09-15 09:41:27.347253859 -0300
@@ -1258,7 +1258,7 @@ static ssize_t snd_rawmidi_write(struct 
 			break;
 		count -= count1;
 	}
-	if (file->f_flags & O_SYNC) {
+	if (file->f_flags & O_DSYNC) {
 		spin_lock_irq(&runtime->lock);
 		while (runtime->avail != runtime->buffer_size) {
 			wait_queue_t wait;
Index: linux-2.6/arch/alpha/include/asm/fcntl.h
===================================================================
--- linux-2.6.orig/arch/alpha/include/asm/fcntl.h	2009-09-15 00:46:32.945006724 -0300
+++ linux-2.6/arch/alpha/include/asm/fcntl.h	2009-09-15 09:41:27.348253497 -0300
@@ -1,8 +1,6 @@
 #ifndef _ALPHA_FCNTL_H
 #define _ALPHA_FCNTL_H
 
-/* open/fcntl - O_SYNC is only implemented on blocks devices and on files
-   located on an ext2 file system */
 #define O_CREAT		 01000	/* not fcntl */
 #define O_TRUNC		 02000	/* not fcntl */
 #define O_EXCL		 04000	/* not fcntl */
@@ -10,13 +8,28 @@
 
 #define O_NONBLOCK	 00004
 #define O_APPEND	 00010
-#define O_SYNC		040000
+#define O_DSYNC		040000	/* used to be O_SYNC, see below */
 #define O_DIRECTORY	0100000	/* must be a directory */
 #define O_NOFOLLOW	0200000 /* don't follow links */
 #define O_LARGEFILE	0400000 /* will be set by the kernel on every open */
 #define O_DIRECT	02000000 /* direct disk access - should check with OSF/1 */
 #define O_NOATIME	04000000
 #define O_CLOEXEC	010000000 /* set close_on_exec */
+/*
+ * Before Linux 2.6.32 only O_DSYNC semantics were implemented, but using
+ * the O_SYNC flag.  We continue to use the existing numerical value
+ * for O_DSYNC semantics now, but using the correct symbolic name for it.
+ * This new value is used to request true Posix O_SYNC semantics.  It is
+ * defined in this strange way to make sure applications compiled against
+ * new headers get at least O_DSYNC semantics on older kernels.
+ *
+ * This has the nice side-effect that we can simply test for O_DSYNC
+ * wherever we do not care if O_DSYNC or O_SYNC is used.
+ *
+ * Note: __O_SYNC must never be used directly.
+ */
+#define __O_SYNC	020000000
+#define O_SYNC		(__O_SYNC|O_DSYNC)
 
 #define F_GETLK		7
 #define F_SETLK		8
Index: linux-2.6/arch/blackfin/include/asm/fcntl.h
===================================================================
--- linux-2.6.orig/arch/blackfin/include/asm/fcntl.h	2009-09-15 00:46:32.978006455 -0300
+++ linux-2.6/arch/blackfin/include/asm/fcntl.h	2009-09-15 09:41:27.351254088 -0300
@@ -1,8 +1,6 @@
 #ifndef _BFIN_FCNTL_H
 #define _BFIN_FCNTL_H
 
-/* open/fcntl - O_SYNC is only implemented on blocks devices and on files
-   located on an ext2 file system */
 #define O_DIRECTORY	 040000	/* must be a directory */
 #define O_NOFOLLOW	0100000	/* don't follow links */
 #define O_DIRECT	0200000	/* direct disk access hint - currently ignored */
Index: linux-2.6/arch/mips/include/asm/fcntl.h
===================================================================
--- linux-2.6.orig/arch/mips/include/asm/fcntl.h	2009-09-15 00:46:33.002006368 -0300
+++ linux-2.6/arch/mips/include/asm/fcntl.h	2009-09-15 09:41:27.354254050 -0300
@@ -10,7 +10,7 @@
 
 
 #define O_APPEND	0x0008
-#define O_SYNC		0x0010
+#define O_DSYNC		0x0010	/* used to be O_SYNC, see below */
 #define O_NONBLOCK	0x0080
 #define O_CREAT         0x0100	/* not fcntl */
 #define O_TRUNC		0x0200	/* not fcntl */
@@ -18,6 +18,21 @@
 #define O_NOCTTY	0x0800	/* not fcntl */
 #define FASYNC		0x1000	/* fcntl, for BSD compatibility */
 #define O_LARGEFILE	0x2000	/* allow large file opens */
+/*
+ * Before Linux 2.6.32 only O_DSYNC semantics were implemented, but using
+ * the O_SYNC flag.  We continue to use the existing numerical value
+ * for O_DSYNC semantics now, but using the correct symbolic name for it.
+ * This new value is used to request true Posix O_SYNC semantics.  It is
+ * defined in this strange way to make sure applications compiled against
+ * new headers get at least O_DSYNC semantics on older kernels.
+ *
+ * This has the nice side-effect that we can simply test for O_DSYNC
+ * wherever we do not care if O_DSYNC or O_SYNC is used.
+ *
+ * Note: __O_SYNC must never be used directly.
+ */
+#define __O_SYNC	0x4000
+#define O_SYNC		(__O_SYNC|O_DSYNC)
 #define O_DIRECT	0x8000	/* direct disk access hint */
 
 #define F_GETLK		14
Index: linux-2.6/arch/mips/kernel/kspd.c
===================================================================
--- linux-2.6.orig/arch/mips/kernel/kspd.c	2009-09-15 00:46:33.021004807 -0300
+++ linux-2.6/arch/mips/kernel/kspd.c	2009-09-15 09:41:27.357254082 -0300
@@ -82,6 +82,7 @@ static int sp_stopping = 0;
 #define MTSP_O_SHLOCK		0x0010
 #define MTSP_O_EXLOCK		0x0020
 #define MTSP_O_ASYNC		0x0040
+/* XXX: check which of these is actually O_SYNC vs O_DSYNC */
 #define MTSP_O_FSYNC		O_SYNC
 #define MTSP_O_NOFOLLOW		0x0100
 #define MTSP_O_SYNC		0x0080
Index: linux-2.6/arch/mips/lemote/lm2e/mem.c
===================================================================
--- linux-2.6.orig/arch/mips/lemote/lm2e/mem.c	2009-09-15 00:46:33.054254081 -0300
+++ linux-2.6/arch/mips/lemote/lm2e/mem.c	2009-09-15 09:41:27.357254082 -0300
@@ -11,7 +11,7 @@
 /* override of arch/mips/mm/cache.c: __uncached_access */
 int __uncached_access(struct file *file, unsigned long addr)
 {
-	if (file->f_flags & O_SYNC)
+	if (file->f_flags & O_DSYNC)
 		return 1;
 
 	/*
Index: linux-2.6/arch/mips/mm/cache.c
===================================================================
--- linux-2.6.orig/arch/mips/mm/cache.c	2009-09-15 00:46:33.074254183 -0300
+++ linux-2.6/arch/mips/mm/cache.c	2009-09-15 09:41:27.360254044 -0300
@@ -194,7 +194,7 @@ void __devinit cpu_cache_init(void)
 
 int __weak __uncached_access(struct file *file, unsigned long addr)
 {
-	if (file->f_flags & O_SYNC)
+	if (file->f_flags & O_DSYNC)
 		return 1;
 
 	return addr >= __pa(high_memory);
Index: linux-2.6/arch/parisc/include/asm/fcntl.h
===================================================================
--- linux-2.6.orig/arch/parisc/include/asm/fcntl.h	2009-09-15 00:46:33.082254364 -0300
+++ linux-2.6/arch/parisc/include/asm/fcntl.h	2009-09-15 09:41:27.363254007 -0300
@@ -1,14 +1,13 @@
 #ifndef _PARISC_FCNTL_H
 #define _PARISC_FCNTL_H
 
-/* open/fcntl - O_SYNC is only implemented on blocks devices and on files
-   located on an ext2 file system */
 #define O_APPEND	000000010
 #define O_BLKSEEK	000000100 /* HPUX only */
 #define O_CREAT		000000400 /* not fcntl */
 #define O_EXCL		000002000 /* not fcntl */
 #define O_LARGEFILE	000004000
-#define O_SYNC		000100000
+#define __O_SYNC	000100000
+#define O_SYNC		(__O_SYNC|O_DSYNC)
 #define O_NONBLOCK	000200004 /* HPUX has separate NDELAY & NONBLOCK */
 #define O_NOCTTY	000400000 /* not fcntl */
 #define O_DSYNC		001000000 /* HPUX only */
Index: linux-2.6/arch/sparc/include/asm/fcntl.h
===================================================================
--- linux-2.6.orig/arch/sparc/include/asm/fcntl.h	2009-09-15 00:46:33.090254335 -0300
+++ linux-2.6/arch/sparc/include/asm/fcntl.h	2009-09-15 09:41:27.367253956 -0300
@@ -1,14 +1,12 @@
 #ifndef _SPARC_FCNTL_H
 #define _SPARC_FCNTL_H
 
-/* open/fcntl - O_SYNC is only implemented on blocks devices and on files
-   located on an ext2 file system */
 #define O_APPEND	0x0008
 #define FASYNC		0x0040	/* fcntl, for BSD compatibility */
 #define O_CREAT		0x0200	/* not fcntl */
 #define O_TRUNC		0x0400	/* not fcntl */
 #define O_EXCL		0x0800	/* not fcntl */
-#define O_SYNC		0x2000
+#define O_DSYNC		0x2000	/* used to be O_SYNC, see below */
 #define O_NONBLOCK	0x4000
 #if defined(__sparc__) && defined(__arch64__)
 #define O_NDELAY	0x0004
@@ -20,6 +18,21 @@
 #define O_DIRECT        0x100000 /* direct disk access hint */
 #define O_NOATIME	0x200000
 #define O_CLOEXEC	0x400000
+/*
+ * Before Linux 2.6.32 only O_DSYNC semantics were implemented, but using
+ * the O_SYNC flag.  We continue to use the existing numerical value
+ * for O_DSYNC semantics now, but using the correct symbolic name for it.
+ * This new value is used to request true Posix O_SYNC semantics.  It is
+ * defined in this strange way to make sure applications compiled against
+ * new headers get at least O_DSYNC semantics on older kernels.
+ *
+ * This has the nice side-effect that we can simply test for O_DSYNC
+ * wherever we do not care if O_DSYNC or O_SYNC is used.
+ *
+ * Note: __O_SYNC must never be used directly.
+ */
+#define __O_SYNC	0x800000
+#define O_SYNC		(__O_SYNC|O_DSYNC)
 
 #define F_GETOWN	5	/*  for sockets. */
 #define F_SETOWN	6	/*  for sockets. */
Index: linux-2.6/fs/sync.c
===================================================================
--- linux-2.6.orig/fs/sync.c	2009-09-15 00:46:33.205253612 -0300
+++ linux-2.6/fs/sync.c	2009-09-15 09:41:27.370254058 -0300
@@ -287,10 +287,11 @@ SYSCALL_DEFINE1(fdatasync, unsigned int,
  */
 int generic_write_sync(struct file *file, loff_t pos, loff_t count)
 {
-	if (!(file->f_flags & O_SYNC) && !IS_SYNC(file->f_mapping->host))
+	if (!(file->f_flags & O_DSYNC) && !IS_SYNC(file->f_mapping->host))
 		return 0;
 	return vfs_fsync_range(file, file->f_path.dentry, pos,
-			       pos + count - 1, 1);
+			       pos + count - 1,
+			       (file->f_flags & __O_SYNC) ? 0 : 1);
 }
 EXPORT_SYMBOL(generic_write_sync);
 


* Re: [PATCH] implement posix O_SYNC and O_DSYNC semantics
  2009-09-15 13:12 ` [PATCH] implement posix O_SYNC and O_DSYNC semantics Christoph Hellwig
@ 2009-09-15 14:10   ` Jan Kara
  2009-09-15 14:50   ` Ulrich Drepper
  2009-09-17 21:03   ` Kyle McMartin
  2 siblings, 0 replies; 5+ messages in thread
From: Jan Kara @ 2009-09-15 14:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, linux-kernel, linux-arch, akpm, drepper, viro, kyle,
	sct

On Tue 15-09-09 15:12:52, Christoph Hellwig wrote:
> While Linux has provided an O_SYNC flag basically since day 1, it took
> until Linux 2.4.0-test12pre2 to actually get it implemented for
> filesystems.  Since that day we have had generic_osync_around with only
> minor changes and the great "For now, when the user asks for O_SYNC,
> we'll actually give O_DSYNC" comment.  This patch intends to actually
> give us real O_SYNC semantics in addition to the O_DSYNC semantics.
> After Jan's O_SYNC patches, which are required before this patch, it's
> actually surprisingly simple: we just need to figure out when to pass
> the datasync flag to vfs_fsync_range and when not.
> 
> This patch renames the existing O_SYNC flag to O_DSYNC while keeping
> its numerical value to keep binary compatibility, and adds a new real
> O_SYNC flag.  To guarantee backwards compatibility it is defined as
> expanding to both the O_DSYNC and the new additional binary flag
> (__O_SYNC) to make sure we are backwards-compatible when compiled against
> the new headers.
> 
> This also means that all places that don't care about the differences
> can just check O_DSYNC and get the right behaviour for O_SYNC, too - only
> places that actually care need to check __O_SYNC in addition.  Drivers
> and network filesystems have been updated in a fail-safe way to always
> do the full sync magic if O_DSYNC is set.  The few places setting O_SYNC
> for lower layers are kept that way for now to stay fail-safe.
> 
> We enforce that O_DSYNC is set when __O_SYNC is set early in the
> open path to make sure we always get these sane options.
> 
> Note that parisc really fucked up their headers as they already define
> an O_DSYNC that has always been a no-op.  We try to repair it by using it
> for the new O_DSYNC and redefining O_SYNC to send both the traditional
> O_SYNC numerical value _and_ the O_DSYNC one.
> 
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  The patch looks fine now.
  Acked-by: Jan Kara <jack@suse.cz>

> Index: linux-2.6/arch/x86/mm/pat.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/mm/pat.c	2009-09-15 00:46:32.911256267 -0300
> +++ linux-2.6/arch/x86/mm/pat.c	2009-09-15 09:41:27.301253948 -0300
> @@ -541,7 +541,7 @@ int phys_mem_access_prot_allowed(struct 
>  	if (!range_is_allowed(pfn, size))
>  		return 0;
>  
> -	if (file->f_flags & O_SYNC) {
> +	if (file->f_flags & O_DSYNC) {
>  		flags = _PAGE_CACHE_UC_MINUS;
>  	}
>  
> Index: linux-2.6/drivers/char/mem.c
> ===================================================================
> --- linux-2.6.orig/drivers/char/mem.c	2009-09-15 00:46:33.096254330 -0300
> +++ linux-2.6/drivers/char/mem.c	2009-09-15 09:41:27.302253936 -0300
> @@ -44,7 +44,7 @@ static inline int uncached_access(struct
>  {
>  #if defined(CONFIG_IA64)
>  	/*
> -	 * On ia64, we ignore O_SYNC because we cannot tolerate memory attribute aliases.
> +	 * On ia64, we ignore O_DSYNC because we cannot tolerate memory attribute aliases.
>  	 */
>  	return !(efi_mem_attributes(addr) & EFI_MEMORY_WB);
>  #elif defined(CONFIG_MIPS)
> @@ -57,9 +57,9 @@ static inline int uncached_access(struct
>  #else
>  	/*
>  	 * Accessing memory above the top the kernel knows about or through a file pointer
> -	 * that was marked O_SYNC will be done non-cached.
> +	 * that was marked O_DSYNC will be done non-cached.
>  	 */
> -	if (file->f_flags & O_SYNC)
> +	if (file->f_flags & O_DSYNC)
>  		return 1;
>  	return addr >= __pa(high_memory);
>  #endif
> Index: linux-2.6/drivers/staging/me4000/me4000.c
> ===================================================================
> --- linux-2.6.orig/drivers/staging/me4000/me4000.c	2009-09-15 00:46:33.130254399 -0300
> +++ linux-2.6/drivers/staging/me4000/me4000.c	2009-09-15 09:41:27.305253618 -0300
> @@ -1985,8 +1985,8 @@ static ssize_t me4000_ao_write_cont(stru
>  			spin_unlock_irqrestore(&ao_context->int_lock, flags);
>  		}
>  
> -		/* Wait until the state machine is stopped if O_SYNC is set */
> -		if (filep->f_flags & O_SYNC) {
> +		/* Wait until the state machine is stopped if O_DSYNC is set */
> +		if (filep->f_flags & O_DSYNC) {
>  			while (inl(ao_context->status_reg) &
>  			       ME4000_AO_STATUS_BIT_FSM) {
>  				interruptible_sleep_on_timeout(&queue, 1);
> Index: linux-2.6/drivers/usb/gadget/file_storage.c
> ===================================================================
> --- linux-2.6.orig/drivers/usb/gadget/file_storage.c	2009-09-15 00:46:33.138253951 -0300
> +++ linux-2.6/drivers/usb/gadget/file_storage.c	2009-09-15 09:41:27.311253752 -0300
> @@ -1713,7 +1713,7 @@ static int do_write(struct fsg_dev *fsg)
>  		}
>  		if (fsg->cmnd[1] & 0x08) {	// FUA
>  			spin_lock(&curlun->filp->f_lock);
> -			curlun->filp->f_flags |= O_SYNC;
> +			curlun->filp->f_flags |= O_DSYNC;
>  			spin_unlock(&curlun->filp->f_lock);
>  		}
>  	}
> Index: linux-2.6/fs/afs/write.c
> ===================================================================
> --- linux-2.6.orig/fs/afs/write.c	2009-09-15 00:46:33.144254016 -0300
> +++ linux-2.6/fs/afs/write.c	2009-09-15 09:41:27.316253550 -0300
> @@ -692,8 +692,9 @@ ssize_t afs_file_write(struct kiocb *ioc
>  	}
>  
>  	/* return error values for O_SYNC and IS_SYNC() */
> -	if (IS_SYNC(&vnode->vfs_inode) || iocb->ki_filp->f_flags & O_SYNC) {
> -		ret = afs_fsync(iocb->ki_filp, dentry, 1);
> +	if (IS_SYNC(&vnode->vfs_inode) || iocb->ki_filp->f_flags & O_DSYNC) {
> +		ret = afs_fsync(iocb->ki_filp, dentry,
> +				(iocb->ki_filp->f_flags & __O_SYNC) ? 0 : 1);
>  		if (ret < 0)
>  			result = ret;
>  	}
> Index: linux-2.6/fs/btrfs/file.c
> ===================================================================
> --- linux-2.6.orig/fs/btrfs/file.c	2009-09-15 00:46:33.151254279 -0300
> +++ linux-2.6/fs/btrfs/file.c	2009-09-15 09:41:27.316253550 -0300
> @@ -924,7 +924,7 @@ static ssize_t btrfs_file_write(struct f
>  	unsigned long last_index;
>  	int will_write;
>  
> -	will_write = ((file->f_flags & O_SYNC) || IS_SYNC(inode) ||
> +	will_write = ((file->f_flags & O_DSYNC) || IS_SYNC(inode) ||
>  		      (file->f_flags & O_DIRECT));
>  
>  	nrptrs = min((count + PAGE_CACHE_SIZE - 1) / PAGE_CACHE_SIZE,
> @@ -1077,7 +1077,7 @@ out_nolock:
>  		if (err)
>  			num_written = err;
>  
> -		if ((file->f_flags & O_SYNC) || IS_SYNC(inode)) {
> +		if ((file->f_flags & O_DSYNC) || IS_SYNC(inode)) {
>  			trans = btrfs_start_transaction(root, 1);
>  			ret = btrfs_log_dentry_safe(trans, root,
>  						    file->f_dentry);
> Index: linux-2.6/fs/cifs/dir.c
> ===================================================================
> --- linux-2.6.orig/fs/cifs/dir.c	2009-09-15 00:46:33.156254147 -0300
> +++ linux-2.6/fs/cifs/dir.c	2009-09-15 09:41:27.319254141 -0300
> @@ -214,7 +214,8 @@ int cifs_posix_open(char *full_path, str
>  		posix_flags |= SMB_O_TRUNC;
>  	if (oflags & O_APPEND)
>  		posix_flags |= SMB_O_APPEND;
> -	if (oflags & O_SYNC)
> +	/* be safe and imply O_SYNC for O_DSYNC */
> +	if (oflags & O_DSYNC)
>  		posix_flags |= SMB_O_SYNC;
>  	if (oflags & O_DIRECTORY)
>  		posix_flags |= SMB_O_DIRECTORY;
> Index: linux-2.6/fs/cifs/file.c
> ===================================================================
> --- linux-2.6.orig/fs/cifs/file.c	2009-09-15 00:46:33.162254422 -0300
> +++ linux-2.6/fs/cifs/file.c	2009-09-15 09:41:27.323254719 -0300
> @@ -96,8 +96,10 @@ static inline fmode_t cifs_posix_convert
>  	   reopening a file.  They had their effect on the original open */
>  	if (flags & O_APPEND)
>  		posix_flags |= (fmode_t)O_APPEND;
> -	if (flags & O_SYNC)
> -		posix_flags |= (fmode_t)O_SYNC;
> +	if (flags & O_DSYNC)
> +		posix_flags |= (fmode_t)O_DSYNC;
> +	if (flags & __O_SYNC)
> +		posix_flags |= (fmode_t)__O_SYNC;
>  	if (flags & O_DIRECTORY)
>  		posix_flags |= (fmode_t)O_DIRECTORY;
>  	if (flags & O_NOFOLLOW)
> Index: linux-2.6/fs/namei.c
> ===================================================================
> --- linux-2.6.orig/fs/namei.c	2009-09-15 00:46:33.168253161 -0300
> +++ linux-2.6/fs/namei.c	2009-09-15 09:45:26.694256679 -0300
> @@ -1678,6 +1678,15 @@ struct file *do_filp_open(int dfd, const
>  	int will_write;
>  	int flag = open_to_namei_flags(open_flag);
>  
> +	/*
> +	 * O_SYNC is implemented as __O_SYNC|O_DSYNC.  As many places only
> +	 * check for O_DSYNC if they need any syncing at all we enforce it's
> +	 * always set instead of having to deal with possibly weird behaviour
> +	 * for malicious applications setting only __O_SYNC.
> +	 */
> +	if (open_flag & __O_SYNC)
> +		open_flag |= O_DSYNC;
> +
>  	if (!acc_mode)
>  		acc_mode = MAY_OPEN | ACC_MODE(flag);
>  
> Index: linux-2.6/fs/nfs/file.c
> ===================================================================
> --- linux-2.6.orig/fs/nfs/file.c	2009-09-15 00:46:33.174254134 -0300
> +++ linux-2.6/fs/nfs/file.c	2009-09-15 09:41:27.330253653 -0300
> @@ -580,7 +580,7 @@ static int nfs_need_sync_write(struct fi
>  {
>  	struct nfs_open_context *ctx;
>  
> -	if (IS_SYNC(inode) || (filp->f_flags & O_SYNC))
> +	if (IS_SYNC(inode) || (filp->f_flags & O_DSYNC))
>  		return 1;
>  	ctx = nfs_file_open_context(filp);
>  	if (test_bit(NFS_CONTEXT_ERROR_WRITE, &ctx->flags))
> @@ -621,7 +621,7 @@ static ssize_t nfs_file_write(struct kio
>  
>  	nfs_add_stats(inode, NFSIOS_NORMALWRITTENBYTES, count);
>  	result = generic_file_aio_write(iocb, iov, nr_segs, pos);
> -	/* Return error values for O_SYNC and IS_SYNC() */
> +	/* Return error values for O_DSYNC and IS_SYNC() */
>  	if (result >= 0 && nfs_need_sync_write(iocb->ki_filp, inode)) {
>  		int err = nfs_do_fsync(nfs_file_open_context(iocb->ki_filp), inode);
>  		if (err < 0)
> Index: linux-2.6/fs/nfs/write.c
> ===================================================================
> --- linux-2.6.orig/fs/nfs/write.c	2009-09-15 00:46:33.180254200 -0300
> +++ linux-2.6/fs/nfs/write.c	2009-09-15 09:41:27.332254187 -0300
> @@ -774,7 +774,7 @@ int nfs_updatepage(struct file *file, st
>  	 */
>  	if (nfs_write_pageuptodate(page, inode) &&
>  			inode->i_flock == NULL &&
> -			!(file->f_flags & O_SYNC)) {
> +			!(file->f_flags & O_DSYNC)) {
>  		count = max(count + offset, nfs_page_length(page));
>  		offset = 0;
>  	}
> Index: linux-2.6/include/asm-generic/fcntl.h
> ===================================================================
> --- linux-2.6.orig/include/asm-generic/fcntl.h	2009-09-15 00:46:33.211253817 -0300
> +++ linux-2.6/include/asm-generic/fcntl.h	2009-09-15 09:41:27.335253940 -0300
> @@ -3,8 +3,6 @@
>  
>  #include <linux/types.h>
>  
> -/* open/fcntl - O_SYNC is only implemented on blocks devices and on files
> -   located on an ext2 file system */
>  #define O_ACCMODE	00000003
>  #define O_RDONLY	00000000
>  #define O_WRONLY	00000001
> @@ -27,8 +25,8 @@
>  #ifndef O_NONBLOCK
>  #define O_NONBLOCK	00004000
>  #endif
> -#ifndef O_SYNC
> -#define O_SYNC		00010000
> +#ifndef O_DSYNC
> +#define O_DSYNC		00010000	/* used to be O_SYNC, see below */
>  #endif
>  #ifndef FASYNC
>  #define FASYNC		00020000	/* fcntl, for BSD compatibility */
> @@ -51,6 +49,25 @@
>  #ifndef O_CLOEXEC
>  #define O_CLOEXEC	02000000	/* set close_on_exec */
>  #endif
> +
> +/*
> + * Before Linux 2.6.32 only O_DSYNC semantics were implemented, but using
> + * the O_SYNC flag.  We continue to use the existing numerical value
> + * for O_DSYNC semantics now, but using the correct symbolic name for it.
> + * This new value is used to request true Posix O_SYNC semantics.  It is
> + * defined in this strange way to make sure applications compiled against
> + * new headers get at least O_DSYNC semantics on older kernels.
> + *
> + * This has the nice side-effect that we can simply test for O_DSYNC
> + * wherever we do not care if O_DSYNC or O_SYNC is used.
> + *
> + * Note: __O_SYNC must never be used directly.
> + */
> +#ifndef O_SYNC
> +#define __O_SYNC	04000000
> +#define O_SYNC		(__O_SYNC|O_DSYNC)
> +#endif
> +
>  #ifndef O_NDELAY
>  #define O_NDELAY	O_NONBLOCK
>  #endif
> Index: linux-2.6/fs/ocfs2/file.c
> ===================================================================
> --- linux-2.6.orig/fs/ocfs2/file.c	2009-09-15 00:46:33.186253776 -0300
> +++ linux-2.6/fs/ocfs2/file.c	2009-09-15 09:41:27.338254042 -0300
> @@ -1878,7 +1878,7 @@ out_dio:
>  	/* buffered aio wouldn't have proper lock coverage today */
>  	BUG_ON(ret == -EIOCBQUEUED && !(file->f_flags & O_DIRECT));
>  
> -	if ((file->f_flags & O_SYNC && !direct_io) || IS_SYNC(inode)) {
> +	if ((file->f_flags & O_DSYNC && !direct_io) || IS_SYNC(inode)) {
>  		ret = filemap_fdatawrite_range(file->f_mapping, pos,
>  					       pos + count - 1);
>  		if (ret < 0)
> Index: linux-2.6/fs/ubifs/file.c
> ===================================================================
> --- linux-2.6.orig/fs/ubifs/file.c	2009-09-15 00:46:33.192253912 -0300
> +++ linux-2.6/fs/ubifs/file.c	2009-09-15 09:41:27.341254213 -0300
> @@ -1403,7 +1403,7 @@ static ssize_t ubifs_aio_write(struct ki
>  	if (ret < 0)
>  		return ret;
>  
> -	if (ret > 0 && (IS_SYNC(inode) || iocb->ki_filp->f_flags & O_SYNC)) {
> +	if (ret > 0 && (IS_SYNC(inode) || iocb->ki_filp->f_flags & O_DSYNC)) {
>  		err = ubifs_sync_wbufs_by_inode(c, inode);
>  		if (err)
>  			return err;
> Index: linux-2.6/fs/xfs/linux-2.6/xfs_lrw.c
> ===================================================================
> --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_lrw.c	2009-09-15 00:46:33.198253488 -0300
> +++ linux-2.6/fs/xfs/linux-2.6/xfs_lrw.c	2009-09-15 09:41:27.344254176 -0300
> @@ -811,7 +811,7 @@ write_retry:
>  	XFS_STATS_ADD(xs_write_bytes, ret);
>  
>  	/* Handle various SYNC-type writes */
> -	if ((file->f_flags & O_SYNC) || IS_SYNC(inode)) {
> +	if ((file->f_flags & O_DSYNC) || IS_SYNC(inode)) {
>  		int error2;
>  
>  		xfs_iunlock(xip, iolock);
> Index: linux-2.6/sound/core/rawmidi.c
> ===================================================================
> --- linux-2.6.orig/sound/core/rawmidi.c	2009-09-15 00:46:33.219253718 -0300
> +++ linux-2.6/sound/core/rawmidi.c	2009-09-15 09:41:27.347253859 -0300
> @@ -1258,7 +1258,7 @@ static ssize_t snd_rawmidi_write(struct 
>  			break;
>  		count -= count1;
>  	}
> -	if (file->f_flags & O_SYNC) {
> +	if (file->f_flags & O_DSYNC) {
>  		spin_lock_irq(&runtime->lock);
>  		while (runtime->avail != runtime->buffer_size) {
>  			wait_queue_t wait;
> Index: linux-2.6/arch/alpha/include/asm/fcntl.h
> ===================================================================
> --- linux-2.6.orig/arch/alpha/include/asm/fcntl.h	2009-09-15 00:46:32.945006724 -0300
> +++ linux-2.6/arch/alpha/include/asm/fcntl.h	2009-09-15 09:41:27.348253497 -0300
> @@ -1,8 +1,6 @@
>  #ifndef _ALPHA_FCNTL_H
>  #define _ALPHA_FCNTL_H
>  
> -/* open/fcntl - O_SYNC is only implemented on blocks devices and on files
> -   located on an ext2 file system */
>  #define O_CREAT		 01000	/* not fcntl */
>  #define O_TRUNC		 02000	/* not fcntl */
>  #define O_EXCL		 04000	/* not fcntl */
> @@ -10,13 +8,28 @@
>  
>  #define O_NONBLOCK	 00004
>  #define O_APPEND	 00010
> -#define O_SYNC		040000
> +#define O_DSYNC		040000	/* used to be O_SYNC, see below */
>  #define O_DIRECTORY	0100000	/* must be a directory */
>  #define O_NOFOLLOW	0200000 /* don't follow links */
>  #define O_LARGEFILE	0400000 /* will be set by the kernel on every open */
>  #define O_DIRECT	02000000 /* direct disk access - should check with OSF/1 */
>  #define O_NOATIME	04000000
>  #define O_CLOEXEC	010000000 /* set close_on_exec */
> +/*
> + * Before Linux 2.6.32 only O_DSYNC semantics were implemented, but using
> + * the O_SYNC flag.  We continue to use the existing numerical value
> + * for O_DSYNC semantics now, but using the correct symbolic name for it.
> + * This new value is used to request true Posix O_SYNC semantics.  It is
> + * defined in this strange way to make sure applications compiled against
> + * new headers get at least O_DSYNC semantics on older kernels.
> + *
> + * This has the nice side-effect that we can simply test for O_DSYNC
> + * wherever we do not care if O_DSYNC or O_SYNC is used.
> + *
> + * Note: __O_SYNC must never be used directly.
> + */
> +#define __O_SYNC	020000000
> +#define O_SYNC		(__O_SYNC|O_DSYNC)
>  
>  #define F_GETLK		7
>  #define F_SETLK		8
> Index: linux-2.6/arch/blackfin/include/asm/fcntl.h
> ===================================================================
> --- linux-2.6.orig/arch/blackfin/include/asm/fcntl.h	2009-09-15 00:46:32.978006455 -0300
> +++ linux-2.6/arch/blackfin/include/asm/fcntl.h	2009-09-15 09:41:27.351254088 -0300
> @@ -1,8 +1,6 @@
>  #ifndef _BFIN_FCNTL_H
>  #define _BFIN_FCNTL_H
>  
> -/* open/fcntl - O_SYNC is only implemented on blocks devices and on files
> -   located on an ext2 file system */
>  #define O_DIRECTORY	 040000	/* must be a directory */
>  #define O_NOFOLLOW	0100000	/* don't follow links */
>  #define O_DIRECT	0200000	/* direct disk access hint - currently ignored */
> Index: linux-2.6/arch/mips/include/asm/fcntl.h
> ===================================================================
> --- linux-2.6.orig/arch/mips/include/asm/fcntl.h	2009-09-15 00:46:33.002006368 -0300
> +++ linux-2.6/arch/mips/include/asm/fcntl.h	2009-09-15 09:41:27.354254050 -0300
> @@ -10,7 +10,7 @@
>  
>  
>  #define O_APPEND	0x0008
> -#define O_SYNC		0x0010
> +#define O_DSYNC		0x0010	/* used to be O_SYNC, see below */
>  #define O_NONBLOCK	0x0080
>  #define O_CREAT         0x0100	/* not fcntl */
>  #define O_TRUNC		0x0200	/* not fcntl */
> @@ -18,6 +18,21 @@
>  #define O_NOCTTY	0x0800	/* not fcntl */
>  #define FASYNC		0x1000	/* fcntl, for BSD compatibility */
>  #define O_LARGEFILE	0x2000	/* allow large file opens */
> +/*
> + * Before Linux 2.6.32 only O_DSYNC semantics were implemented, but using
> + * the O_SYNC flag.  We continue to use the existing numerical value
> + * for O_DSYNC semantics now, but using the correct symbolic name for it.
> + * This new value is used to request true Posix O_SYNC semantics.  It is
> + * defined in this strange way to make sure applications compiled against
> + * new headers get at least O_DSYNC semantics on older kernels.
> + *
> + * This has the nice side-effect that we can simply test for O_DSYNC
> + * wherever we do not care if O_DSYNC or O_SYNC is used.
> + *
> + * Note: __O_SYNC must never be used directly.
> + */
> +#define __O_SYNC	0x4000
> +#define O_SYNC		(__O_SYNC|O_DSYNC)
>  #define O_DIRECT	0x8000	/* direct disk access hint */
>  
>  #define F_GETLK		14
> Index: linux-2.6/arch/mips/kernel/kspd.c
> ===================================================================
> --- linux-2.6.orig/arch/mips/kernel/kspd.c	2009-09-15 00:46:33.021004807 -0300
> +++ linux-2.6/arch/mips/kernel/kspd.c	2009-09-15 09:41:27.357254082 -0300
> @@ -82,6 +82,7 @@ static int sp_stopping = 0;
>  #define MTSP_O_SHLOCK		0x0010
>  #define MTSP_O_EXLOCK		0x0020
>  #define MTSP_O_ASYNC		0x0040
> +/* XXX: check which of these is actually O_SYNC vs O_DSYNC */
>  #define MTSP_O_FSYNC		O_SYNC
>  #define MTSP_O_NOFOLLOW		0x0100
>  #define MTSP_O_SYNC		0x0080
> Index: linux-2.6/arch/mips/lemote/lm2e/mem.c
> ===================================================================
> --- linux-2.6.orig/arch/mips/lemote/lm2e/mem.c	2009-09-15 00:46:33.054254081 -0300
> +++ linux-2.6/arch/mips/lemote/lm2e/mem.c	2009-09-15 09:41:27.357254082 -0300
> @@ -11,7 +11,7 @@
>  /* override of arch/mips/mm/cache.c: __uncached_access */
>  int __uncached_access(struct file *file, unsigned long addr)
>  {
> -	if (file->f_flags & O_SYNC)
> +	if (file->f_flags & O_DSYNC)
>  		return 1;
>  
>  	/*
> Index: linux-2.6/arch/mips/mm/cache.c
> ===================================================================
> --- linux-2.6.orig/arch/mips/mm/cache.c	2009-09-15 00:46:33.074254183 -0300
> +++ linux-2.6/arch/mips/mm/cache.c	2009-09-15 09:41:27.360254044 -0300
> @@ -194,7 +194,7 @@ void __devinit cpu_cache_init(void)
>  
>  int __weak __uncached_access(struct file *file, unsigned long addr)
>  {
> -	if (file->f_flags & O_SYNC)
> +	if (file->f_flags & O_DSYNC)
>  		return 1;
>  
>  	return addr >= __pa(high_memory);
> Index: linux-2.6/arch/parisc/include/asm/fcntl.h
> ===================================================================
> --- linux-2.6.orig/arch/parisc/include/asm/fcntl.h	2009-09-15 00:46:33.082254364 -0300
> +++ linux-2.6/arch/parisc/include/asm/fcntl.h	2009-09-15 09:41:27.363254007 -0300
> @@ -1,14 +1,13 @@
>  #ifndef _PARISC_FCNTL_H
>  #define _PARISC_FCNTL_H
>  
> -/* open/fcntl - O_SYNC is only implemented on blocks devices and on files
> -   located on an ext2 file system */
>  #define O_APPEND	000000010
>  #define O_BLKSEEK	000000100 /* HPUX only */
>  #define O_CREAT		000000400 /* not fcntl */
>  #define O_EXCL		000002000 /* not fcntl */
>  #define O_LARGEFILE	000004000
> -#define O_SYNC		000100000
> +#define __O_SYNC	000100000
> +#define O_SYNC		(__O_SYNC|O_DSYNC)
>  #define O_NONBLOCK	000200004 /* HPUX has separate NDELAY & NONBLOCK */
>  #define O_NOCTTY	000400000 /* not fcntl */
>  #define O_DSYNC		001000000 /* HPUX only */
> Index: linux-2.6/arch/sparc/include/asm/fcntl.h
> ===================================================================
> --- linux-2.6.orig/arch/sparc/include/asm/fcntl.h	2009-09-15 00:46:33.090254335 -0300
> +++ linux-2.6/arch/sparc/include/asm/fcntl.h	2009-09-15 09:41:27.367253956 -0300
> @@ -1,14 +1,12 @@
>  #ifndef _SPARC_FCNTL_H
>  #define _SPARC_FCNTL_H
>  
> -/* open/fcntl - O_SYNC is only implemented on blocks devices and on files
> -   located on an ext2 file system */
>  #define O_APPEND	0x0008
>  #define FASYNC		0x0040	/* fcntl, for BSD compatibility */
>  #define O_CREAT		0x0200	/* not fcntl */
>  #define O_TRUNC		0x0400	/* not fcntl */
>  #define O_EXCL		0x0800	/* not fcntl */
> -#define O_SYNC		0x2000
> +#define O_DSYNC		0x2000	/* used to be O_SYNC, see below */
>  #define O_NONBLOCK	0x4000
>  #if defined(__sparc__) && defined(__arch64__)
>  #define O_NDELAY	0x0004
> @@ -20,6 +18,21 @@
>  #define O_DIRECT        0x100000 /* direct disk access hint */
>  #define O_NOATIME	0x200000
>  #define O_CLOEXEC	0x400000
> +/*
> + * Before Linux 2.6.32 only O_DSYNC semantics were implemented, but using
> + * the O_SYNC flag.  We continue to use the existing numerical value
> + * for O_DSYNC semantics now, but using the correct symbolic name for it.
> + * This new value is used to request true Posix O_SYNC semantics.  It is
> + * defined in this strange way to make sure applications compiled against
> + * new headers get at least O_DSYNC semantics on older kernels.
> + *
> + * This has the nice side-effect that we can simply test for O_DSYNC
> + * wherever we do not care if O_DSYNC or O_SYNC is used.
> + *
> + * Note: __O_SYNC must never be used directly.
> + */
> +#define __O_SYNC	0x800000
> +#define O_SYNC		(__O_SYNC|O_DSYNC)
>  
>  #define F_GETOWN	5	/*  for sockets. */
>  #define F_SETOWN	6	/*  for sockets. */
> Index: linux-2.6/fs/sync.c
> ===================================================================
> --- linux-2.6.orig/fs/sync.c	2009-09-15 00:46:33.205253612 -0300
> +++ linux-2.6/fs/sync.c	2009-09-15 09:41:27.370254058 -0300
> @@ -287,10 +287,11 @@ SYSCALL_DEFINE1(fdatasync, unsigned int,
>   */
>  int generic_write_sync(struct file *file, loff_t pos, loff_t count)
>  {
> -	if (!(file->f_flags & O_SYNC) && !IS_SYNC(file->f_mapping->host))
> +	if (!(file->f_flags & O_DSYNC) && !IS_SYNC(file->f_mapping->host))
>  		return 0;
>  	return vfs_fsync_range(file, file->f_path.dentry, pos,
> -			       pos + count - 1, 1);
> +			       pos + count - 1,
> +			       (file->f_flags & __O_SYNC) ? 0 : 1);
>  }
>  EXPORT_SYMBOL(generic_write_sync);
>  
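For anyone following the flag logic in the quoted patch, here is a minimal user-space sketch of it. It is not kernel code: the DEMO_ macros and demo_* helpers are hypothetical names that copy the asm-generic values from the patch, and the IS_SYNC(inode) half of the test is left out on purpose.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the asm-generic values in the patch.  The
 * DEMO_ prefix avoids clashing with the real <fcntl.h> macros.
 */
#define DEMO_O_DSYNC	00010000	/* the old O_SYNC bit */
#define DEMO___O_SYNC	04000000	/* the new additional bit */
#define DEMO_O_SYNC	(DEMO___O_SYNC | DEMO_O_DSYNC)

/* Models the do_filp_open() hunk: __O_SYNC on its own is promoted to O_SYNC. */
static unsigned int demo_normalize(unsigned int open_flag)
{
	if (open_flag & DEMO___O_SYNC)
		open_flag |= DEMO_O_DSYNC;
	return open_flag;
}

/* Models the first test in generic_write_sync(), ignoring IS_SYNC(). */
static bool demo_needs_sync(unsigned int f_flags)
{
	return (f_flags & DEMO_O_DSYNC) != 0;
}

/* Models the datasync argument: 1 for O_DSYNC only, 0 for full O_SYNC. */
static int demo_datasync(unsigned int f_flags)
{
	return (f_flags & DEMO___O_SYNC) ? 0 : 1;
}
```

With these definitions a single test for DEMO_O_DSYNC catches both modes, and only the datasync decision has to look at the extra bit, which is the property the header comment describes.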
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: [PATCH] implement posix O_SYNC and O_DSYNC semantics
  2009-09-15 13:12 ` [PATCH] implement posix O_SYNC and O_DSYNC semantics Christoph Hellwig
  2009-09-15 14:10   ` Jan Kara
@ 2009-09-15 14:50   ` Ulrich Drepper
  2009-09-17 17:16     ` Christoph Hellwig
  2009-09-17 21:03   ` Kyle McMartin
  2 siblings, 1 reply; 5+ messages in thread
From: Ulrich Drepper @ 2009-09-15 14:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, linux-kernel, linux-arch, akpm, viro, kyle, sct

On 09/15/2009 06:12 AM, Christoph Hellwig wrote:

> Signed-off-by: Christoph Hellwig<hch@lst.de>
> Acked-by: Trond Myklebust<Trond.Myklebust@netapp.com>

Looks OK to me:

Acked-by: Ulrich Drepper <drepper@redhat.com>

-- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖

* Re: [PATCH] implement posix O_SYNC and O_DSYNC semantics
  2009-09-15 14:50   ` Ulrich Drepper
@ 2009-09-17 17:16     ` Christoph Hellwig
  0 siblings, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2009-09-17 17:16 UTC (permalink / raw)
  To: Ulrich Drepper
  Cc: Christoph Hellwig, Jan Kara, linux-kernel, linux-arch, akpm, viro,
	kyle, sct

Btw, a little update on O_RSYNC:  I have a patch that should work,
but surprisingly enough it doesn't.  It seems the O_ flags grew too
large and get truncated somewhere along the way.  Here's what I
have so far:

Index: linux-2.6/fs/splice.c
===================================================================
--- linux-2.6.orig/fs/splice.c	2009-09-15 00:06:09.737003454 -0300
+++ linux-2.6/fs/splice.c	2009-09-15 00:08:23.669254032 -0300
@@ -501,6 +501,10 @@ ssize_t generic_file_splice_read(struct 
 	if (unlikely(left < len))
 		len = left;
 
+	ret = generic_read_sync(in, *ppos, len);
+	if (ret)
+		return ret;
+
 	ret = __generic_file_splice_read(in, ppos, pipe, len, flags);
 	if (ret > 0) {
 		*ppos += ret;
Index: linux-2.6/fs/sync.c
===================================================================
--- linux-2.6.orig/fs/sync.c	2009-09-15 00:08:23.180271144 -0300
+++ linux-2.6/fs/sync.c	2009-09-15 00:28:41.359031442 -0300
@@ -295,6 +295,33 @@ int generic_write_sync(struct file *file
 }
 EXPORT_SYMBOL(generic_write_sync);
 
+/**
+ * generic_read_sync - perform syncing before a read
+ * @file:	file to which the read happens
+ * @pos:	offset where the read starts
+ * @count:	length of the read
+ *
+ * This implements the O_RSYNC semantics:
+ *   O_RSYNC on its own just means the data is successfully transferred to
+ *   the calling process (always the case).
+ *
+ *   O_RSYNC|O_DSYNC means that if a read request hits data that is currently
+ *   in a cache and not yet on the medium, then the write to medium is
+ *   successful before the read succeeds.
+ *
+ *   O_RSYNC|O_SYNC means the same plus the integrity of file meta information
+ *   (access time etc).
+ */
+int generic_read_sync(struct file *file, loff_t pos, loff_t count)
+{
+	if (((file->f_flags & (O_RSYNC|O_DSYNC)) != (O_RSYNC|O_DSYNC)))
+		return 0;
+	return vfs_fsync_range(file, file->f_path.dentry, pos,
+			       pos + count - 1,
+			       (file->f_flags & __O_SYNC) ? 0 : 1);
+}
+EXPORT_SYMBOL(generic_read_sync);
+
 /*
  * sys_sync_file_range() permits finely controlled syncing over a segment of
  * a file in the range offset .. (offset+nbytes-1) inclusive.  If nbytes is
Index: linux-2.6/include/asm-generic/fcntl.h
===================================================================
--- linux-2.6.orig/include/asm-generic/fcntl.h	2009-09-15 00:08:23.162254189 -0300
+++ linux-2.6/include/asm-generic/fcntl.h	2009-09-15 00:08:23.672254134 -0300
@@ -68,6 +68,10 @@
 #define O_SYNC		(__O_SYNC|O_DSYNC)
 #endif
 
+#ifndef O_RSYNC
+#define O_RSYNC		010000000
+#endif
+
 #ifndef O_NDELAY
 #define O_NDELAY	O_NONBLOCK
 #endif
Index: linux-2.6/include/linux/fs.h
===================================================================
--- linux-2.6.orig/include/linux/fs.h	2009-09-15 00:06:09.758004312 -0300
+++ linux-2.6/include/linux/fs.h	2009-09-15 00:08:23.673254191 -0300
@@ -2097,6 +2097,7 @@ extern int vfs_fsync_range(struct file *
 			   loff_t start, loff_t end, int datasync);
 extern int vfs_fsync(struct file *file, struct dentry *dentry, int datasync);
 extern int generic_write_sync(struct file *file, loff_t pos, loff_t count);
+extern int generic_read_sync(struct file *file, loff_t pos, loff_t count);
 extern void sync_supers(void);
 extern void emergency_sync(void);
 extern void emergency_remount(void);
Index: linux-2.6/mm/filemap.c
===================================================================
--- linux-2.6.orig/mm/filemap.c	2009-09-15 00:06:09.764004377 -0300
+++ linux-2.6/mm/filemap.c	2009-09-15 00:08:23.676300248 -0300
@@ -1285,6 +1285,10 @@ generic_file_aio_read(struct kiocb *iocb
 	if (retval)
 		return retval;
 
+	retval = generic_read_sync(filp, pos, count);
+	if (retval)
+		return retval;
+
 	/* coalesce the iovecs and go direct-to-BIO for O_DIRECT */
 	if (filp->f_flags & O_DIRECT) {
 		loff_t size;
Index: linux-2.6/arch/alpha/include/asm/fcntl.h
===================================================================
--- linux-2.6.orig/arch/alpha/include/asm/fcntl.h	2009-09-15 00:08:23.169254241 -0300
+++ linux-2.6/arch/alpha/include/asm/fcntl.h	2009-09-15 00:08:23.678253988 -0300
@@ -30,6 +30,7 @@
  */
 #define __O_SYNC	020000000
 #define O_SYNC		(__O_SYNC|O_DSYNC)
+#define O_RSYNC		040000000
 
 #define F_GETLK		7
 #define F_SETLK		8
Index: linux-2.6/arch/mips/include/asm/fcntl.h
===================================================================
--- linux-2.6.orig/arch/mips/include/asm/fcntl.h	2009-09-15 00:08:23.172253854 -0300
+++ linux-2.6/arch/mips/include/asm/fcntl.h	2009-09-15 00:08:23.678253988 -0300
@@ -34,6 +34,7 @@
 #define __O_SYNC	0x4000
 #define O_SYNC		(__O_SYNC|O_DSYNC)
 #define O_DIRECT	0x8000	/* direct disk access hint */
+#define O_RSYNC		0x10000
 
 #define F_GETLK		14
 #define F_SETLK		6
Index: linux-2.6/arch/parisc/include/asm/fcntl.h
===================================================================
--- linux-2.6.orig/arch/parisc/include/asm/fcntl.h	2009-09-15 00:08:23.178298896 -0300
+++ linux-2.6/arch/parisc/include/asm/fcntl.h	2009-09-15 00:08:23.680301735 -0300
@@ -14,6 +14,7 @@
 #define O_RSYNC		002000000 /* HPUX only */
 #define O_NOATIME	004000000
 #define O_CLOEXEC	010000000 /* set close_on_exec */
+#define O_RSYNC		020000000
 
 #define O_DIRECTORY	000010000 /* must be a directory */
 #define O_NOFOLLOW	000000200 /* don't follow links */
Index: linux-2.6/arch/sparc/include/asm/fcntl.h
===================================================================
--- linux-2.6.orig/arch/sparc/include/asm/fcntl.h	2009-09-15 00:08:23.179254674 -0300
+++ linux-2.6/arch/sparc/include/asm/fcntl.h	2009-09-15 00:08:23.681254370 -0300
@@ -33,6 +33,7 @@
  */
 #define __O_SYNC	0x800000
 #define O_SYNC		(__O_SYNC|O_DSYNC)
+#define O_RSYNC		0x1000000
 
 #define F_GETOWN	5	/*  for sockets. */
 #define F_SETOWN	6	/*  for sockets. */
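The check in generic_read_sync() above can be tried out in user space with a small sketch like the following. This is only an illustration under assumptions: the DEMO_ macros and demo_* helpers are hypothetical names, and the values are the asm-generic ones from the patches (including the proposed O_RSYNC value), not what any released kernel defines.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the asm-generic values used in the patches. */
#define DEMO_O_DSYNC	00010000
#define DEMO___O_SYNC	04000000
#define DEMO_O_SYNC	(DEMO___O_SYNC | DEMO_O_DSYNC)
#define DEMO_O_RSYNC	010000000	/* value proposed in the patch above */

/*
 * Models the test in generic_read_sync(): a read only syncs first when
 * O_RSYNC is combined with at least O_DSYNC.  O_RSYNC alone is a no-op.
 */
static bool demo_read_needs_sync(unsigned int f_flags)
{
	const unsigned int both = DEMO_O_RSYNC | DEMO_O_DSYNC;

	return (f_flags & both) == both;
}

/* Models the datasync argument: 0 when the __O_SYNC bit asks for metadata. */
static int demo_read_datasync(unsigned int f_flags)
{
	return (f_flags & DEMO___O_SYNC) ? 0 : 1;
}
```

Because O_SYNC contains the O_DSYNC bit, O_RSYNC|O_SYNC passes the same test as O_RSYNC|O_DSYNC and differs only in the datasync value.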

* Re: [PATCH] implement posix O_SYNC and O_DSYNC semantics
  2009-09-15 13:12 ` [PATCH] implement posix O_SYNC and O_DSYNC semantics Christoph Hellwig
  2009-09-15 14:10   ` Jan Kara
  2009-09-15 14:50   ` Ulrich Drepper
@ 2009-09-17 21:03   ` Kyle McMartin
  2 siblings, 0 replies; 5+ messages in thread
From: Kyle McMartin @ 2009-09-17 21:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, linux-kernel, linux-arch, akpm, drepper, viro, kyle,
	sct

On Tue, Sep 15, 2009 at 03:12:52PM +0200, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> 

Acked-by: Kyle McMartin <kyle@redhat.com>

Parisc bits look ok to me.
