From: Jan Kara <jack@suse.cz>
To: LKML <linux-kernel@vger.kernel.org>
Cc: hch@lst.de, linux-fsdevel@vger.kernel.org,
Jan Kara <jack@suse.cz>, Evgeniy Polyakov <zbr@ioremap.net>,
ocfs2-devel@oss.oracle.com, Joel Becker <joel.becker@oracle.com>,
Felix Blyakher <felixb@sgi.com>,
xfs@oss.sgi.com, Anton Altaparmakov <aia21@cantab.net>,
linux-ntfs-dev@lists.sourceforge.net,
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>,
linux-ext4@vger.kernel.org, tytso@mit.edu
Subject: [PATCH 07/17] vfs: Introduce new helpers for syncing after writing to O_SYNC file or IS_SYNC inode
Date: Fri, 21 Aug 2009 18:59:51 +0200 [thread overview]
Message-ID: <1250874001-15483-7-git-send-email-jack@suse.cz> (raw)
In-Reply-To: <1250874001-15483-1-git-send-email-jack@suse.cz>
Introduce new function for generic inode syncing (generic_sync_file) and use
it from fsync() path. Introduce also new helper for syncing after a sync
write (generic_write_sync) using the generic function.
Use these new helpers for syncing from generic VFS functions. This makes
O_SYNC writes to block devices acquire i_mutex for syncing. If we really
care about this, we can make block_fsync() drop the i_mutex and reacquire
it before it returns.
CC: Evgeniy Polyakov <zbr@ioremap.net>
CC: ocfs2-devel@oss.oracle.com
CC: Joel Becker <joel.becker@oracle.com>
CC: Felix Blyakher <felixb@sgi.com>
CC: xfs@oss.sgi.com
CC: Anton Altaparmakov <aia21@cantab.net>
CC: linux-ntfs-dev@lists.sourceforge.net
CC: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
CC: linux-ext4@vger.kernel.org
CC: tytso@mit.edu
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/splice.c | 22 ++++----------
fs/sync.c | 81 +++++++++++++++++++++++++++++++++++++++++++---------
include/linux/fs.h | 7 ++++
mm/filemap.c | 18 ++++--------
4 files changed, 86 insertions(+), 42 deletions(-)
diff --git a/fs/splice.c b/fs/splice.c
index 73766d2..8190237 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -976,25 +976,15 @@ generic_file_splice_write(struct pipe_inode_info *pipe, struct file *out,
if (ret > 0) {
unsigned long nr_pages;
+ int err;
- *ppos += ret;
nr_pages = (ret + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
- /*
- * If file or inode is SYNC and we actually wrote some data,
- * sync it.
- */
- if (unlikely((out->f_flags & O_SYNC) || IS_SYNC(inode))) {
- int err;
-
- mutex_lock(&inode->i_mutex);
- err = generic_osync_inode(inode, mapping,
- OSYNC_METADATA|OSYNC_DATA);
- mutex_unlock(&inode->i_mutex);
-
- if (err)
- ret = err;
- }
+ err = generic_write_sync(out, *ppos, ret);
+ if (err)
+ ret = err;
+ else
+ *ppos += ret;
balance_dirty_pages_ratelimited_nr(mapping, nr_pages);
}
diff --git a/fs/sync.c b/fs/sync.c
index 3422ba6..fc320aa 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -176,23 +176,30 @@ int file_fsync(struct file *filp, struct dentry *dentry, int datasync)
}
/**
- * vfs_fsync - perform a fsync or fdatasync on a file
+ * generic_sync_file - helper to sync data & metadata to disk
* @file: file to sync
* @dentry: dentry of @file
- * @data: only perform a fdatasync operation
+ * @start: offset in bytes of the beginning of data range to sync
+ * @end: offset in bytes of the end of data range (inclusive)
+ * @what: what should be synced
*
- * Write back data and metadata for @file to disk. If @datasync is
- * set only metadata needed to access modified file data is written.
+ * What the function exactly does is controlled by the @what parameter:
*
- * In case this function is called from nfsd @file may be %NULL and
- * only @dentry is set. This can only happen when the filesystem
- * implements the export_operations API.
+ * If SYNC_SUBMIT_DATA is set, the function submits all pages in the given
+ * range to disk.
+ *
+ * The function always calls ->fsync() callback of the filesystem. If
+ * SYNC_INODE is not set, we pass down the fact that it is just a datasync.
+ *
+ * If SYNC_WAIT_DATA is set, the function waits for writeback to finish
+ * in the given range.
*/
-int vfs_fsync(struct file *file, struct dentry *dentry, int datasync)
+int generic_sync_file(struct file *file, struct dentry *dentry, loff_t start,
+ loff_t end, int what)
{
const struct file_operations *fop;
struct address_space *mapping;
- int err, ret;
+ int err, ret = 0;
/*
* Get mapping and operations from the file in case we have
@@ -212,23 +219,50 @@ int vfs_fsync(struct file *file, struct dentry *dentry, int datasync)
goto out;
}
- ret = filemap_fdatawrite(mapping);
+ if (what & SYNC_SUBMIT_DATA)
+ ret = filemap_fdatawrite_range(mapping, start, end);
/*
* We need to protect against concurrent writers, which could cause
* livelocks in fsync_buffers_list().
*/
mutex_lock(&mapping->host->i_mutex);
- err = fop->fsync(file, dentry, datasync);
+ err = fop->fsync(file, dentry, !(what & SYNC_INODE));
if (!ret)
ret = err;
mutex_unlock(&mapping->host->i_mutex);
- err = filemap_fdatawait(mapping);
- if (!ret)
- ret = err;
+
+ if (what & SYNC_WAIT_DATA) {
+ err = filemap_fdatawait_range(mapping, start, end);
+ if (!ret)
+ ret = err;
+ }
out:
return ret;
}
+EXPORT_SYMBOL(generic_sync_file);
+
+/**
+ * vfs_fsync - perform a fsync or fdatasync on a file
+ * @file: file to sync
+ * @dentry: dentry of @file
+ * @datasync: only perform a fdatasync operation
+ *
+ * Write back data and metadata for @file to disk. If @datasync is
+ * set only metadata needed to access modified file data is written.
+ *
+ * In case this function is called from nfsd @file may be %NULL and
+ * only @dentry is set. This can only happen when the filesystem
+ * implements the export_operations API.
+ */
+int vfs_fsync(struct file *file, struct dentry *dentry, int datasync)
+{
+ int what = SYNC_SUBMIT_DATA | SYNC_WAIT_DATA;
+
+ if (!datasync)
+ what |= SYNC_INODE;
+ return generic_sync_file(file, dentry, 0, LLONG_MAX, what);
+}
EXPORT_SYMBOL(vfs_fsync);
static int do_fsync(unsigned int fd, int datasync)
@@ -254,6 +288,25 @@ SYSCALL_DEFINE1(fdatasync, unsigned int, fd)
return do_fsync(fd, 1);
}
+/**
+ * generic_write_sync - perform syncing after a write if file / inode is sync
+ * @file: file to which the write happened
+ * @pos: offset where the write started
+ * @count: length of the write
+ *
+ * This is just a simple wrapper about our general syncing function.
+ * FIXME: Make it inline?
+ */
+int generic_write_sync(struct file *file, loff_t pos, loff_t count)
+{
+ if (!(file->f_flags & O_SYNC) && !IS_SYNC(file->f_mapping->host))
+ return 0;
+ return generic_sync_file(file, file->f_path.dentry, pos,
+ pos + count - 1,
+ SYNC_SUBMIT_DATA | SYNC_WAIT_DATA);
+}
+EXPORT_SYMBOL(generic_write_sync);
+
/*
* sys_sync_file_range() permits finely controlled syncing over a segment of
* a file in the range offset .. (offset+nbytes-1) inclusive. If nbytes is
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 29fc8da..648001c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2088,7 +2088,14 @@ extern int __filemap_fdatawrite_range(struct address_space *mapping,
extern int filemap_fdatawrite_range(struct address_space *mapping,
loff_t start, loff_t end);
+/* Flags for generic_sync_file */
+#define SYNC_INODE 1
+#define SYNC_SUBMIT_DATA 2
+#define SYNC_WAIT_DATA 4
+extern int generic_sync_file(struct file *file, struct dentry *dentry,
+ loff_t start, loff_t end, int what);
extern int vfs_fsync(struct file *file, struct dentry *dentry, int datasync);
+extern int generic_write_sync(struct file *file, loff_t pos, loff_t count);
extern void sync_supers(void);
extern void emergency_sync(void);
extern void emergency_remount(void);
diff --git a/mm/filemap.c b/mm/filemap.c
index ef9f635..d0802a9 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -39,11 +39,10 @@
/*
* FIXME: remove all knowledge of the buffer layer from the core VM
*/
-#include <linux/buffer_head.h> /* for generic_osync_inode */
+#include <linux/buffer_head.h> /* for try_to_free_buffers */
#include <asm/mman.h>
-
/*
* Shared mappings implemented 30.11.1994. It's not fully working yet,
* though.
@@ -2480,19 +2479,16 @@ ssize_t device_aio_write(struct kiocb *iocb, const struct iovec *iov,
unsigned long nr_segs, loff_t pos)
{
struct file *file = iocb->ki_filp;
- struct address_space *mapping = file->f_mapping;
- struct inode *inode = mapping->host;
ssize_t ret;
BUG_ON(iocb->ki_pos != pos);
ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos);
- if ((ret > 0 || ret == -EIOCBQUEUED) &&
- ((file->f_flags & O_SYNC) || IS_SYNC(inode))) {
+ if (ret > 0 || ret == -EIOCBQUEUED) {
ssize_t err;
- err = sync_page_range_nolock(inode, mapping, pos, ret);
+ err = generic_write_sync(file, pos, ret);
if (err < 0)
ret = err;
}
@@ -2515,8 +2511,7 @@ ssize_t generic_file_aio_write(struct kiocb *iocb, const struct iovec *iov,
unsigned long nr_segs, loff_t pos)
{
struct file *file = iocb->ki_filp;
- struct address_space *mapping = file->f_mapping;
- struct inode *inode = mapping->host;
+ struct inode *inode = file->f_mapping->host;
ssize_t ret;
BUG_ON(iocb->ki_pos != pos);
@@ -2525,11 +2520,10 @@ ssize_t generic_file_aio_write(struct kiocb *iocb, const struct iovec *iov,
ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos);
mutex_unlock(&inode->i_mutex);
- if ((ret > 0 || ret == -EIOCBQUEUED) &&
- ((file->f_flags & O_SYNC) || IS_SYNC(inode))) {
+ if (ret > 0 || ret == -EIOCBQUEUED) {
ssize_t err;
- err = sync_page_range(inode, mapping, pos, ret);
+ err = generic_write_sync(file, pos, ret);
if (err < 0)
ret = err;
}
--
1.6.0.2
next prev parent reply other threads:[~2009-08-21 16:59 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-08-21 16:59 [PATCH 01/17] vfs: Introduce filemap_fdatawait_range Jan Kara
2009-08-21 16:59 ` [PATCH 02/17] vfs: Export __generic_file_aio_write() and add some comments Jan Kara
2009-08-21 16:59 ` [PATCH 03/17] vfs: Remove syncing from generic_file_direct_write() and generic_file_buffered_write() Jan Kara
2009-08-21 16:59 ` [PATCH 04/17] pohmelfs: Use __generic_file_aio_write instead of generic_file_aio_write_nolock Jan Kara
2009-08-21 16:59 ` [PATCH 05/17] ocfs2: " Jan Kara
2009-08-21 16:59 ` [PATCH 06/17] vfs: Rename generic_file_aio_write_nolock Jan Kara
2009-08-21 16:59 ` Jan Kara [this message]
2009-08-21 16:59 ` [PATCH 08/17] ext2: Update comment about generic_osync_inode Jan Kara
2009-08-21 16:59 ` [PATCH 09/17] ext3: Remove syncing logic from ext3_file_write Jan Kara
2009-08-21 16:59 ` [PATCH 10/17] ext4: Remove syncing logic from ext4_file_write Jan Kara
-- strict thread matches above, loose matches on Subject: below --
2009-08-21 17:23 [PATCH 0/17] Make O_SYNC handling use standard syncing path (Version 2) Jan Kara
2009-08-21 17:23 ` [PATCH 07/17] vfs: Introduce new helpers for syncing after writing to O_SYNC file or IS_SYNC inode Jan Kara
2009-08-27 17:35 ` Christoph Hellwig
2009-08-30 16:35 ` Jamie Lokier
2009-08-30 16:39 ` Christoph Hellwig
2009-08-30 17:29 ` Jamie Lokier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1250874001-15483-7-git-send-email-jack@suse.cz \
--to=jack@suse.cz \
--cc=aia21@cantab.net \
--cc=felixb@sgi.com \
--cc=hch@lst.de \
--cc=hirofumi@mail.parknet.co.jp \
--cc=joel.becker@oracle.com \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-ntfs-dev@lists.sourceforge.net \
--cc=ocfs2-devel@oss.oracle.com \
--cc=tytso@mit.edu \
--cc=xfs@oss.sgi.com \
--cc=zbr@ioremap.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).