From: npiggin@suse.de
To: linux-fsdevel@vger.kernel.org
Cc: hch@infradead.org, viro@zeniv.linux.org.uk, jack@suse.cz
Subject: [patch 3/5] fs: introduce new truncate sequence
Date: Fri, 10 Jul 2009 17:30:31 +1000
Message-ID: <20090710073230.895009841@suse.de>
In-Reply-To: 20090710073028.782561541@suse.de
[-- Attachment #1: fs-new-truncate.patch --]
[-- Type: text/plain, Size: 8305 bytes --]
Introduce a new truncate calling sequence into the fs/mm subsystems. Rather
than setattr > vmtruncate > truncate, have filesystems call their truncate
sequence from ->setattr if filesystem specific operations are required.
vmtruncate is deprecated; the truncate_pagecache and inode_newsize_ok helpers
introduced previously should be used instead.
simple_setsize is also introduced to perform the equivalent of vmtruncate.
simple_setsize gets called by inode_setattr when ATTR_SIZE is passed. So
filesystems implementing their own truncate code in setattr then calling
through to inode_setattr should clear ATTR_SIZE.
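To illustrate the ATTR_SIZE rule above, here is a minimal userspace sketch, not kernel code. All model_* and myfs_* names, the structs, the 4096-byte block size and the size limit are assumptions made up for the example; only the flag name ATTR_SIZE and the overall call order come from this patch. The page cache step is left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins, NOT the kernel definitions. */
#define ATTR_MODE (1u << 0)
#define ATTR_SIZE (1u << 3)
#define MODEL_MAX_SIZE (1LL << 40)   /* hypothetical size limit */

struct model_inode {
	long long i_size;
	long long blocks;      /* hypothetical: allocated 4096-byte blocks */
	unsigned int i_mode;
};

struct model_iattr {
	unsigned int ia_valid;
	long long ia_size;
	unsigned int ia_mode;
};

/* Models simple_setsize(): validate, then update i_size. */
static int model_simple_setsize(struct model_inode *inode, long long newsize)
{
	if (newsize > MODEL_MAX_SIZE)
		return -EFBIG;
	inode->i_size = newsize;
	return 0;
}

/* Models inode_setattr() for a filesystem that sets new_truncate. */
static int model_inode_setattr(struct model_inode *inode,
			       const struct model_iattr *attr)
{
	if ((attr->ia_valid & ATTR_SIZE) && attr->ia_size != inode->i_size) {
		int error = model_simple_setsize(inode, attr->ia_size);
		if (error)
			return error;
	}
	if (attr->ia_valid & ATTR_MODE)
		inode->i_mode = attr->ia_mode;
	return 0;
}

/* Hypothetical filesystem truncate: set the size, then trim blocks. */
static int myfs_setsize(struct model_inode *inode, long long newsize)
{
	long long needed = (newsize + 4095) / 4096;
	int error = model_simple_setsize(inode, newsize);

	if (error)
		return error;
	if (needed < inode->blocks)
		inode->blocks = needed;
	return 0;
}

/* Hypothetical ->setattr: truncate first, then clear ATTR_SIZE. */
static int myfs_setattr(struct model_inode *inode, struct model_iattr *attr)
{
	assert(inode != NULL && attr != NULL);

	if (attr->ia_valid & ATTR_SIZE) {
		int error = myfs_setsize(inode, attr->ia_size);
		if (error)
			return error;
		/* Size is handled; keep the generic code from redoing it. */
		attr->ia_valid &= ~ATTR_SIZE;
	}
	return model_inode_setattr(inode, attr);
}
```

In this model a failing myfs_setsize() returns before any other attribute is applied, so the caller sees the error and the inode is left unchanged.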
A new attribute is introduced into the inode_operations structure;
.new_truncate is a temporary hack to distinguish filesystems that implement
the new truncate system. Generic code no longer calls vmtruncate to trim
blocks past i_size for these filesystems, so they must handle that in fs
code. This gives better opportunity to catch errors etc anyway.
.new_truncate and .truncate will go away once all filesystems are converted.
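The gating on .new_truncate can be pictured with a small userspace model of the write error path. This is only a sketch under assumptions: the model_* and newfs_* names, the alloc_end field and the two ops tables are invented for illustration, and only the new_truncate check mirrors what the patch does in block_write_begin and friends.

```c
#include <assert.h>
#include <stddef.h>

struct model_inode;

/* Userspace stand-in for the relevant part of inode_operations. */
struct model_inode_ops {
	void (*truncate)(struct model_inode *inode); /* legacy callback */
	int new_truncate;                            /* transition flag */
};

struct model_inode {
	long long i_size;
	long long alloc_end; /* hypothetical: end of allocated space, bytes */
	const struct model_inode_ops *i_op;
};

/* Models vmtruncate(): set the size, then call the legacy callback. */
static void model_vmtruncate(struct model_inode *inode, long long size)
{
	inode->i_size = size;
	if (inode->i_op->truncate)
		inode->i_op->truncate(inode);
}

/* Legacy ->truncate: drop allocation beyond the (already set) i_size. */
static void legacy_truncate(struct model_inode *inode)
{
	if (inode->alloc_end > inode->i_size)
		inode->alloc_end = inode->i_size;
}

/* Models the generic error path after a failed extending write. */
static void model_write_failed(struct model_inode *inode,
			       long long pos, long long len)
{
	assert(inode != NULL && inode->i_op != NULL);

	if (pos + len > inode->i_size) {
		/* Converted filesystems are skipped here. */
		if (!inode->i_op->new_truncate)
			model_vmtruncate(inode, inode->i_size);
	}
}

/* Hypothetical converted filesystem: trims in its own code. */
static int newfs_write_failed(struct model_inode *inode,
			      long long pos, long long len)
{
	model_write_failed(inode, pos, len); /* no-op for this filesystem */
	if (pos + len > inode->i_size && inode->alloc_end > inode->i_size)
		inode->alloc_end = inode->i_size;
	return 0; /* a real filesystem could report a trim failure here */
}

static const struct model_inode_ops legacy_ops = {
	.truncate = legacy_truncate,
	.new_truncate = 0,
};

static const struct model_inode_ops newfs_ops = {
	.truncate = NULL,
	.new_truncate = 1,
};
```

With legacy_ops the generic path trims the stray allocation; with newfs_ops it leaves it alone, and the allocation is only trimmed once the filesystem's own handler runs.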
Big problem with the previous calling sequence: the filesystem is not called
until i_size has already changed. This means it is not allowed to fail the
call, and also it does not know what the previous i_size was. Also, generic
code calling vmtruncate to truncate allocated blocks in case of error had
no good way to return a meaningful error (or, for example, atomically handle
block deallocation).
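The difference between the two sequences can be sketched in userspace as follows. Everything here is a simplified assumption: the struct, the fail_free switch and the function names are hypothetical, and the ordering chosen in new_sequence() is just one option a filesystem might pick, not what any converted filesystem necessarily does.

```c
#include <assert.h>
#include <errno.h>

struct model_inode {
	long long i_size;
	long long allocated; /* hypothetical: bytes of allocated space */
	int fail_free;       /* hypothetical: force block freeing to fail */
};

/* Frees space beyond newsize; returns 0 or -EIO without side effects. */
static int model_free_blocks(struct model_inode *inode, long long newsize)
{
	if (inode->fail_free)
		return -EIO;
	if (newsize < inode->allocated)
		inode->allocated = newsize;
	return 0;
}

/* Old style callback: returns void and only sees the new i_size. */
static void old_truncate_cb(struct model_inode *inode)
{
	/* The error has nowhere to go, and the old size is unknown. */
	(void)model_free_blocks(inode, inode->i_size);
}

/* Old sequence: i_size changes before the filesystem is called. */
static int old_sequence(struct model_inode *inode, long long newsize)
{
	inode->i_size = newsize;
	old_truncate_cb(inode);
	return 0;
}

/* New sequence: the filesystem drives the whole operation itself. */
static int new_sequence(struct model_inode *inode, long long newsize)
{
	long long oldsize = inode->i_size; /* still available here */
	int error;

	assert(newsize >= 0);
	if (newsize < oldsize) {
		error = model_free_blocks(inode, newsize);
		if (error)
			return error; /* i_size has not been touched */
	}
	inode->i_size = newsize;
	return 0;
}
```

In the old model a failed deallocation is silently lost and i_size no longer matches the allocation; in the new model the caller gets -EIO and the inode keeps its previous size.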
Signed-off-by: Nick Piggin <npiggin@suse.de>
---
Documentation/filesystems/vfs.txt | 7 ++++++-
fs/attr.c | 7 ++++++-
fs/buffer.c | 12 +++++++++---
fs/direct-io.c | 7 ++++---
fs/libfs.c | 17 +++++++++++++++++
include/linux/fs.h | 2 ++
mm/truncate.c | 6 ++----
7 files changed, 46 insertions(+), 12 deletions(-)
Index: linux-2.6/fs/libfs.c
===================================================================
--- linux-2.6.orig/fs/libfs.c
+++ linux-2.6/fs/libfs.c
@@ -329,6 +329,22 @@ int simple_rename(struct inode *old_dir,
return 0;
}
+int simple_setsize(struct inode *inode, loff_t newsize)
+{
+ loff_t oldsize;
+ int error;
+
+ error = inode_newsize_ok(inode, newsize);
+ if (error)
+ return error;
+
+ oldsize = inode->i_size;
+ i_size_write(inode, newsize);
+ truncate_pagecache(inode, oldsize, newsize);
+
+ return error;
+}
+
int simple_readpage(struct file *file, struct page *page)
{
clear_highpage(page);
@@ -840,6 +856,7 @@ EXPORT_SYMBOL(generic_read_dir);
EXPORT_SYMBOL(get_sb_pseudo);
EXPORT_SYMBOL(simple_write_begin);
EXPORT_SYMBOL(simple_write_end);
+EXPORT_SYMBOL(simple_setsize);
EXPORT_SYMBOL(simple_dir_inode_operations);
EXPORT_SYMBOL(simple_dir_operations);
EXPORT_SYMBOL(simple_empty);
Index: linux-2.6/include/linux/fs.h
===================================================================
--- linux-2.6.orig/include/linux/fs.h
+++ linux-2.6/include/linux/fs.h
@@ -1527,6 +1527,7 @@ struct inode_operations {
void * (*follow_link) (struct dentry *, struct nameidata *);
void (*put_link) (struct dentry *, struct nameidata *, void *);
void (*truncate) (struct inode *);
+ int new_truncate; /* nasty hack to transition to new truncate code */
int (*permission) (struct inode *, int);
int (*setattr) (struct dentry *, struct iattr *);
int (*getattr) (struct vfsmount *mnt, struct dentry *, struct kstat *);
@@ -2332,6 +2333,7 @@ extern int simple_link(struct dentry *,
extern int simple_unlink(struct inode *, struct dentry *);
extern int simple_rmdir(struct inode *, struct dentry *);
extern int simple_rename(struct inode *, struct dentry *, struct inode *, struct dentry *);
+extern int simple_setsize(struct inode *inode, loff_t newsize);
extern int simple_sync_file(struct file *, struct dentry *, int);
extern int simple_empty(struct dentry *);
extern int simple_readpage(struct file *file, struct page *page);
Index: linux-2.6/fs/buffer.c
===================================================================
--- linux-2.6.orig/fs/buffer.c
+++ linux-2.6/fs/buffer.c
@@ -1992,9 +1992,14 @@ int block_write_begin(struct file *file,
* prepare_write() may have instantiated a few blocks
* outside i_size. Trim these off again. Don't need
* i_size_read because we hold i_mutex.
+ *
+ * Filesystems which set i_op->new_truncate must
+ * handle this themselves. Eventually this will go
+ * away because everyone will be converted.
*/
if (pos + len > inode->i_size)
- vmtruncate(inode, inode->i_size);
+ if (!inode->i_op->new_truncate)
+ vmtruncate(inode, inode->i_size);
}
}
@@ -2371,7 +2376,7 @@ int block_commit_write(struct page *page
*
* We are not allowed to take the i_mutex here so we have to play games to
* protect against truncate races as the page could now be beyond EOF. Because
- * vmtruncate() writes the inode size before removing pages, once we have the
+ * truncate writes the inode size before removing pages, once we have the
* page lock we can determine safely if the page is beyond EOF. If it is not
* beyond EOF, then the page is guaranteed safe against truncation until we
* unlock the page.
@@ -2595,7 +2600,8 @@ out_release:
*pagep = NULL;
if (pos + len > inode->i_size)
- vmtruncate(inode, inode->i_size);
+ if (!inode->i_op->new_truncate)
+ vmtruncate(inode, inode->i_size);
return ret;
}
Index: linux-2.6/fs/direct-io.c
===================================================================
--- linux-2.6.orig/fs/direct-io.c
+++ linux-2.6/fs/direct-io.c
@@ -1210,14 +1210,15 @@ __blockdev_direct_IO(int rw, struct kioc
/*
* In case of error extending write may have instantiated a few
* blocks outside i_size. Trim these off again for DIO_LOCKING.
- * NOTE: DIO_NO_LOCK/DIO_OWN_LOCK callers have to handle this by
- * it's own meaner.
+ * NOTE: DIO_NO_LOCK/DIO_OWN_LOCK callers have to handle this in
+ * their own manner.
*/
if (unlikely(retval < 0 && (rw & WRITE))) {
loff_t isize = i_size_read(inode);
if (end > isize && dio_lock_type == DIO_LOCKING)
- vmtruncate(inode, isize);
+ if (!inode->i_op->new_truncate)
+ vmtruncate(inode, isize);
}
if (rw == READ && dio_lock_type == DIO_LOCKING)
Index: linux-2.6/fs/attr.c
===================================================================
--- linux-2.6.orig/fs/attr.c
+++ linux-2.6/fs/attr.c
@@ -111,7 +111,12 @@ int inode_setattr(struct inode * inode,
if (ia_valid & ATTR_SIZE &&
attr->ia_size != i_size_read(inode)) {
- int error = vmtruncate(inode, attr->ia_size);
+ int error;
+
+ if (inode->i_op->new_truncate)
+ error = simple_setsize(inode, attr->ia_size);
+ else
+ error = vmtruncate(inode, attr->ia_size);
if (error)
return error;
}
Index: linux-2.6/Documentation/filesystems/vfs.txt
===================================================================
--- linux-2.6.orig/Documentation/filesystems/vfs.txt
+++ linux-2.6/Documentation/filesystems/vfs.txt
@@ -401,11 +401,16 @@ otherwise noted.
started might not be in the page cache at the end of the
walk).
- truncate: called by the VFS to change the size of a file. The
+ truncate: Deprecated. This will not be called if ->setsize is defined.
+ Called by the VFS to change the size of a file. The
i_size field of the inode is set to the desired size by the
VFS before this method is called. This method is called by
the truncate(2) system call and related functionality.
+ Note: ->truncate and vmtruncate are deprecated. Do not add new
+ instances/calls of these. Filesystems should be converted to do their
+ truncate sequence via ->setattr().
+
permission: called by the VFS to check for access rights on a POSIX-like
filesystem.
Index: linux-2.6/mm/truncate.c
===================================================================
--- linux-2.6.orig/mm/truncate.c
+++ linux-2.6/mm/truncate.c
@@ -520,12 +520,10 @@ int vmtruncate(struct inode * inode, lof
loff_t oldsize;
int error;
- error = inode_newsize_ok(inode, offset);
+ error = simple_setsize(inode, offset);
if (error)
return error;
- oldsize = inode->i_size;
- i_size_write(inode, offset);
- truncate_pagecache(inode, oldsize, offset);
+
if (inode->i_op->truncate)
inode->i_op->truncate(inode);