* [patch 1/8] 2.6.22-rc4-mm2 buffered write fixes
@ 2007-06-14 2:34 Nick Piggin
2007-06-14 2:35 ` [patch 2/8] " Nick Piggin
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: Nick Piggin @ 2007-06-14 2:34 UTC
To: Andrew Morton
Cc: linux-fsdevel, cmm, Badari Pulavarty, Dmitriy Monakhov,
mark.fasheh
Here are several fixes for issues that Dmitriy found, and also several
fixlets for non-compiling filesystems.
These only caused one trivial down-stack reject, so I won't worry about
sending you the fix for that.
These have had some testing with various filesystems, block sizes, and
journal modes with fsx-linux, fsstress and swapping-kbuilds.
--
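Dmitriy noticed that __page_symlink asked for a full PAGE_CACHE_SIZE
write and zero-filled the tail of the page, which is wrong. Write only
the len-1 bytes of the symlink name.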
Fixes: fs-introduce-write_begin-write_end-and-perform_write-aops.patch
Signed-off-by: Nick Piggin <npiggin@suse.de>
Index: linux-2.6/fs/namei.c
===================================================================
--- linux-2.6.orig/fs/namei.c
+++ linux-2.6/fs/namei.c
@@ -2702,21 +2702,20 @@ int __page_symlink(struct inode *inode,
char *kaddr;
retry:
- err = pagecache_write_begin(NULL, mapping, 0, PAGE_CACHE_SIZE,
+ err = pagecache_write_begin(NULL, mapping, 0, len-1,
AOP_FLAG_UNINTERRUPTIBLE, &page, &fsdata);
if (err)
goto fail;
kaddr = kmap_atomic(page, KM_USER0);
memcpy(kaddr, symname, len-1);
- memset(kaddr+len-1, 0, PAGE_CACHE_SIZE-(len-1));
kunmap_atomic(kaddr, KM_USER0);
- err = pagecache_write_end(NULL, mapping, 0, PAGE_CACHE_SIZE, PAGE_CACHE_SIZE,
+ err = pagecache_write_end(NULL, mapping, 0, len-1, len-1,
page, fsdata);
if (err < 0)
goto fail;
- if (err < PAGE_CACHE_SIZE)
+ if (err < len-1)
goto retry;
mark_inode_dirty(inode);
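For readers following along outside the kernel tree, the retry logic in this hunk can be sketched in userspace. This is only an illustration under assumptions: demo_store_symlink, demo_commit_fn and demo_commit_short_once are hypothetical names, the commit callback stands in for the pagecache_write_begin/pagecache_write_end pair, and len is assumed to count the terminating NUL (which is what the len-1 in the memcpy suggests).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the write_begin/copy/write_end sequence.
 * Returns the number of bytes committed (possibly fewer than len),
 * or a negative value on error.
 */
typedef int (*demo_commit_fn)(char *page, const char *src, size_t len,
			      void *state);

/* Test helper: commits only half the bytes on the first call. */
int demo_commit_short_once(char *page, const char *src, size_t len,
			   void *state)
{
	int *calls = state;
	size_t n = ((*calls)++ == 0) ? len / 2 : len;

	memcpy(page, src, n);
	return (int)n;
}

/*
 * Store a symlink name of len bytes (including the NUL) into page.
 * Only len-1 bytes are written; a short commit restarts the whole
 * write, mirroring the "goto retry" in the patch.
 */
int demo_store_symlink(char *page, const char *symname, size_t len,
		       demo_commit_fn commit, void *state)
{
	size_t want;
	int done;

	assert(len >= 1);
	want = len - 1;

	for (;;) {
		done = commit(page, symname, want, state);
		if (done < 0)
			return done;	/* hard error: give up */
		if ((size_t)done == want)
			return 0;	/* everything committed */
		/* short commit: retry from the start */
	}
}
```

Note that nothing past len-1 bytes is touched, so the caller no longer depends on the rest of the page being zeroed by this function.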
* [patch 2/8] 2.6.22-rc4-mm2 buffered write fixes
From: Nick Piggin @ 2007-06-14 2:35 UTC
To: Andrew Morton
Cc: linux-fsdevel, cmm, Badari Pulavarty, Dmitriy Monakhov,
mark.fasheh
Dmitriy noticed that iov_iter_fault_in_readable could go past the end
of the first iov in a multi-iov situation, and that could be considered
an EFAULT by the caller. Fix and comment.
Fixes: fs-introduce-write_begin-write_end-and-perform_write-aops.patch
Signed-off-by: Nick Piggin <npiggin@suse.de>
Index: linux-2.6/mm/filemap.c
===================================================================
--- linux-2.6.orig/mm/filemap.c
+++ linux-2.6/mm/filemap.c
@@ -1794,9 +1794,19 @@ void iov_iter_advance(struct iov_iter *i
i->count -= bytes;
}
+/*
+ * Fault in the first iovec of the given iov_iter, to a maximum length
+ * of bytes. Returns 0 on success, or non-zero if the memory could not be
+ * accessed (ie. because it is an invalid address).
+ *
+ * writev-intensive code may want this to prefault several iovecs -- that
+ * would be possible (callers must not rely on the fact that _only_ the
+ * first iovec will be faulted with the current implementation).
+ */
int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes)
{
char __user *buf = i->iov->iov_base + i->iov_offset;
+ bytes = min(bytes, i->iov->iov_len - i->iov_offset);
return fault_in_pages_readable(buf, bytes);
}
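The clamp added above can be tried in isolation. The following is a minimal userspace sketch, not the kernel code: struct demo_iovec, struct demo_iov_iter and demo_first_segment_bytes are hypothetical stand-ins for struct iovec, struct iov_iter and the min() expression in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct iovec. */
struct demo_iovec {
	void *iov_base;
	size_t iov_len;
};

/* Hypothetical, stripped-down stand-in for struct iov_iter. */
struct demo_iov_iter {
	const struct demo_iovec *iov;	/* current segment */
	size_t iov_offset;		/* bytes already consumed from *iov */
};

/*
 * Return how many of the requested bytes lie inside the current
 * segment, i.e. min(bytes, iov_len - iov_offset). Touching more than
 * this could run off the end of the first segment into memory the
 * caller never handed us.
 */
size_t demo_first_segment_bytes(const struct demo_iov_iter *i, size_t bytes)
{
	size_t remaining;

	assert(i->iov_offset <= i->iov->iov_len);
	remaining = i->iov->iov_len - i->iov_offset;
	return bytes < remaining ? bytes : remaining;
}
```

With a 16-byte first segment and an iov_offset of 4, a request for 40 bytes is clamped to 12, while a request for 8 bytes is left alone.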
* [patch 3/8] 2.6.22-rc4-mm2 buffered write fixes
From: Nick Piggin @ 2007-06-14 2:36 UTC
To: Andrew Morton
Cc: linux-fsdevel, cmm, Badari Pulavarty, Dmitriy Monakhov,
mark.fasheh
Don't move journal_stop out from under the page lock: unlock the page
only after ext3_journal_stop has run.
Fixes: ext3-convert-to-new-aops.patch
Signed-off-by: Nick Piggin <npiggin@suse.de>
Index: linux-2.6/fs/ext3/inode.c
===================================================================
--- linux-2.6.orig/fs/ext3/inode.c
+++ linux-2.6/fs/ext3/inode.c
@@ -36,6 +36,7 @@
#include <linux/mpage.h>
#include <linux/uio.h>
#include <linux/bio.h>
+#include <linux/swap.h> /* mark_page_accessed */
#include "xattr.h"
#include "acl.h"
@@ -1217,6 +1218,31 @@ static int write_end_fn(handle_t *handle
}
/*
+ * Generic write_end handler for ordered and writeback ext3 journal modes.
+ * We can't use generic_write_end, because that unlocks the page and we need to
+ * unlock the page after ext3_journal_stop, but ext3_journal_stop must run
+ * after block_write_end.
+ */
+static int ext3_generic_write_end(struct file *file,
+ struct address_space *mapping,
+ loff_t pos, unsigned len, unsigned copied,
+ struct page *page, void *fsdata)
+{
+ struct inode *inode = file->f_mapping->host;
+
+ copied = block_write_end(file, mapping, pos, len, copied, page, fsdata);
+
+ mark_page_accessed(page);
+
+ if (pos+copied > inode->i_size) {
+ i_size_write(inode, pos+copied);
+ mark_inode_dirty(inode);
+ }
+
+ return copied;
+}
+
+/*
* We need to pick up the new inode size which generic_commit_write gave us
* `file' can be NULL - eg, when called from page_symlink().
*
@@ -1250,17 +1276,17 @@ static int ext3_ordered_write_end(struct
new_i_size = pos + copied;
if (new_i_size > EXT3_I(inode)->i_disksize)
EXT3_I(inode)->i_disksize = new_i_size;
- copied = generic_write_end(file, mapping, pos, len, copied,
+ copied = ext3_generic_write_end(file, mapping, pos, len, copied,
page, fsdata);
if (copied < 0)
ret = copied;
- } else {
- unlock_page(page);
- page_cache_release(page);
}
ret2 = ext3_journal_stop(handle);
if (!ret)
ret = ret2;
+ unlock_page(page);
+ page_cache_release(page);
+
return ret ? ret : copied;
}
@@ -1278,7 +1304,7 @@ static int ext3_writeback_write_end(stru
if (new_i_size > EXT3_I(inode)->i_disksize)
EXT3_I(inode)->i_disksize = new_i_size;
- copied = generic_write_end(file, mapping, pos, len, copied,
+ copied = ext3_generic_write_end(file, mapping, pos, len, copied,
page, fsdata);
if (copied < 0)
ret = copied;
@@ -1286,6 +1312,9 @@ static int ext3_writeback_write_end(stru
ret2 = ext3_journal_stop(handle);
if (!ret)
ret = ret2;
+ unlock_page(page);
+ page_cache_release(page);
+
return ret ? ret : copied;
}
@@ -1313,8 +1342,6 @@ static int ext3_journalled_write_end(str
to, &partial, write_end_fn);
if (!partial)
SetPageUptodate(page);
- unlock_page(page);
- page_cache_release(page);
if (pos+copied > inode->i_size)
i_size_write(inode, pos+copied);
EXT3_I(inode)->i_state |= EXT3_STATE_JDATA;
@@ -1328,6 +1355,9 @@ static int ext3_journalled_write_end(str
ret2 = ext3_journal_stop(handle);
if (!ret)
ret = ret2;
+ unlock_page(page);
+ page_cache_release(page);
+
return ret ? ret : copied;
}
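The ordering this patch enforces in all three write_end handlers can be modelled with a small trace. This is a simplified sketch under assumptions, not ext3 code: every demo_ name is hypothetical, the kernel calls are replaced by entries in a trace array, and the "copied" and "journal_err" parameters stand in for the results of block_write_end and ext3_journal_stop.

```c
#include <assert.h>
#include <stddef.h>

enum demo_step {
	DEMO_BLOCK_WRITE_END,
	DEMO_JOURNAL_STOP,
	DEMO_UNLOCK_PAGE,
	DEMO_PAGE_RELEASE
};

struct demo_trace {
	enum demo_step steps[8];
	size_t n;
};

static void demo_record(struct demo_trace *t, enum demo_step s)
{
	assert(t->n < sizeof t->steps / sizeof t->steps[0]);
	t->steps[t->n++] = s;
}

/*
 * Model of the write_end tail after the patch: finish the block write,
 * stop the journal handle while the page is still locked, and only
 * then unlock and release the page. Returns the first error seen, or
 * the number of bytes copied.
 */
int demo_write_end(struct demo_trace *t, int copied, int journal_err)
{
	int ret = 0, ret2;

	demo_record(t, DEMO_BLOCK_WRITE_END);
	if (copied < 0)
		ret = copied;

	demo_record(t, DEMO_JOURNAL_STOP);
	ret2 = journal_err;
	if (!ret)
		ret = ret2;

	demo_record(t, DEMO_UNLOCK_PAGE);
	demo_record(t, DEMO_PAGE_RELEASE);

	return ret ? ret : copied;
}
```

The unlock and release happen on every path, including the error paths, which is why the patch can drop the separate else branch that used to unlock the page.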
* [patch 4/8] 2.6.22-rc4-mm2 buffered write fixes
From: Nick Piggin @ 2007-06-14 2:37 UTC
To: Andrew Morton
Cc: linux-fsdevel, cmm, Badari Pulavarty, Dmitriy Monakhov,
mark.fasheh
Don't move journal_stop out from under the page lock: unlock the page
only after ext4_journal_stop has run.
Fixes: ext4-convert-to-new-aops.patch
Signed-off-by: Nick Piggin <npiggin@suse.de>
Index: linux-2.6/fs/ext4/inode.c
===================================================================
--- linux-2.6.orig/fs/ext4/inode.c
+++ linux-2.6/fs/ext4/inode.c
@@ -36,6 +36,7 @@
#include <linux/mpage.h>
#include <linux/uio.h>
#include <linux/bio.h>
+#include <linux/swap.h> /* mark_page_accessed */
#include "xattr.h"
#include "acl.h"
@@ -1215,6 +1216,31 @@ static int write_end_fn(handle_t *handle
}
/*
+ * Generic write_end handler for ordered and writeback ext4 journal modes.
+ * We can't use generic_write_end, because that unlocks the page and we need to
+ * unlock the page after ext4_journal_stop, but ext4_journal_stop must run
+ * after block_write_end.
+ */
+static int ext4_generic_write_end(struct file *file,
+ struct address_space *mapping,
+ loff_t pos, unsigned len, unsigned copied,
+ struct page *page, void *fsdata)
+{
+ struct inode *inode = file->f_mapping->host;
+
+ copied = block_write_end(file, mapping, pos, len, copied, page, fsdata);
+
+ mark_page_accessed(page);
+
+ if (pos+copied > inode->i_size) {
+ i_size_write(inode, pos+copied);
+ mark_inode_dirty(inode);
+ }
+
+ return copied;
+}
+
+/*
* We need to pick up the new inode size which generic_commit_write gave us
* `file' can be NULL - eg, when called from page_symlink().
*
@@ -1248,17 +1274,17 @@ static int ext4_ordered_write_end(struct
new_i_size = pos + copied;
if (new_i_size > EXT4_I(inode)->i_disksize)
EXT4_I(inode)->i_disksize = new_i_size;
- copied = generic_write_end(file, mapping, pos, len, copied,
+ copied = ext4_generic_write_end(file, mapping, pos, len, copied,
page, fsdata);
if (copied < 0)
ret = copied;
- } else {
- unlock_page(page);
- page_cache_release(page);
}
ret2 = ext4_journal_stop(handle);
if (!ret)
ret = ret2;
+ unlock_page(page);
+ page_cache_release(page);
+
return ret ? ret : copied;
}
@@ -1276,7 +1302,7 @@ static int ext4_writeback_write_end(stru
if (new_i_size > EXT4_I(inode)->i_disksize)
EXT4_I(inode)->i_disksize = new_i_size;
- copied = generic_write_end(file, mapping, pos, len, copied,
+ copied = ext4_generic_write_end(file, mapping, pos, len, copied,
page, fsdata);
if (copied < 0)
ret = copied;
@@ -1284,6 +1310,9 @@ static int ext4_writeback_write_end(stru
ret2 = ext4_journal_stop(handle);
if (!ret)
ret = ret2;
+ unlock_page(page);
+ page_cache_release(page);
+
return ret ? ret : copied;
}
@@ -1311,8 +1340,6 @@ static int ext4_journalled_write_end(str
to, &partial, write_end_fn);
if (!partial)
SetPageUptodate(page);
- unlock_page(page);
- page_cache_release(page);
if (pos+copied > inode->i_size)
i_size_write(inode, pos+copied);
EXT4_I(inode)->i_state |= EXT4_STATE_JDATA;
@@ -1326,6 +1353,9 @@ static int ext4_journalled_write_end(str
ret2 = ext4_journal_stop(handle);
if (!ret)
ret = ret2;
+ unlock_page(page);
+ page_cache_release(page);
+
return ret ? ret : copied;
}
* [patch 5/8] minix: convert to new aops fix
From: Nick Piggin @ 2007-06-14 2:39 UTC
To: Andrew Morton; +Cc: linux-fsdevel, mark.fasheh
Index: linux-2.6/fs/minix/minix.h
===================================================================
--- linux-2.6.orig/fs/minix/minix.h
+++ linux-2.6/fs/minix/minix.h
@@ -54,6 +54,9 @@ extern int minix_new_block(struct inode
extern void minix_free_block(struct inode *inode, unsigned long block);
extern unsigned long minix_count_free_blocks(struct minix_sb_info *sbi);
extern int minix_getattr(struct vfsmount *, struct dentry *, struct kstat *);
+extern int __minix_write_begin(struct file *file, struct address_space *mapping,
+ loff_t pos, unsigned len, unsigned flags,
+ struct page **pagep, void **fsdata);
extern void V1_minix_truncate(struct inode *);
extern void V2_minix_truncate(struct inode *);
* [patch 6/8] sysv: convert to new aops fix
From: Nick Piggin @ 2007-06-14 2:40 UTC
To: Andrew Morton; +Cc: linux-fsdevel, mark.fasheh
Index: linux-2.6/fs/sysv/sysv.h
===================================================================
--- linux-2.6.orig/fs/sysv/sysv.h
+++ linux-2.6/fs/sysv/sysv.h
@@ -136,6 +136,9 @@ extern unsigned long sysv_count_free_blo
/* itree.c */
extern void sysv_truncate(struct inode *);
+extern int __sysv_write_begin(struct file *file, struct address_space *mapping,
+ loff_t pos, unsigned len, unsigned flags,
+ struct page **pagep, void **fsdata);
/* inode.c */
extern int sysv_write_inode(struct inode *, int);
* [patch 7/8] ufs: convert to new aops fix
From: Nick Piggin @ 2007-06-14 2:41 UTC
To: Andrew Morton; +Cc: linux-fsdevel, mark.fasheh
Index: linux-2.6/fs/ufs/util.h
===================================================================
--- linux-2.6.orig/fs/ufs/util.h
+++ linux-2.6/fs/ufs/util.h
@@ -231,6 +231,9 @@ ufs_set_inode_gid(struct super_block *sb
extern dev_t ufs_get_inode_dev(struct super_block *, struct ufs_inode_info *);
extern void ufs_set_inode_dev(struct super_block *, struct ufs_inode_info *, dev_t);
+extern int __ufs_write_begin(struct file *file, struct address_space *mapping,
+ loff_t pos, unsigned len, unsigned flags,
+ struct page **pagep, void **fsdata);
/*
* These functions manipulate ufs buffers
* [patch 8/8] reiser4: fix for new aops patches
From: Nick Piggin @ 2007-06-14 2:42 UTC
To: Andrew Morton; +Cc: linux-fsdevel, mark.fasheh
Index: linux-2.6/fs/reiser4/plugin/item/extent_file_ops.c
===================================================================
--- linux-2.6.orig/fs/reiser4/plugin/item/extent_file_ops.c
+++ linux-2.6/fs/reiser4/plugin/item/extent_file_ops.c
@@ -7,7 +7,6 @@
#include <linux/quotaops.h>
#include <linux/swap.h>
-#include "../../../../mm/filemap.h"
static inline reiser4_extent *ext_by_offset(const znode *node, int offset)
{
@@ -937,6 +936,31 @@ static int write_extent_reserve_space(st
return reiser4_grab_space(count, 0 /* flags */);
}
+/*
+ * filemap_copy_from_user no longer exists in generic code, because it
+ * is deadlocky (copying from user while holding the page lock is bad).
+ * As a temporary fix for reiser4, just define it here.
+ */
+static inline size_t
+filemap_copy_from_user(struct page *page, unsigned long offset,
+ const char __user *buf, unsigned bytes)
+{
+ char *kaddr;
+ int left;
+
+ kaddr = kmap_atomic(page, KM_USER0);
+ left = __copy_from_user_inatomic_nocache(kaddr + offset, buf, bytes);
+ kunmap_atomic(kaddr, KM_USER0);
+
+ if (left != 0) {
+ /* Do it the slow way */
+ kaddr = kmap(page);
+ left = __copy_from_user_nocache(kaddr + offset, buf, bytes);
+ kunmap(page);
+ }
+ return bytes - left;
+}
+
/**
* reiser4_write_extent - write method of extent item plugin
* @file: file to write to
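The fast-path/slow-path shape of the filemap_copy_from_user helper added above can be sketched in userspace as follows. This is an illustration under assumptions: all demo_ names are hypothetical, the copy callbacks stand in for __copy_from_user_inatomic_nocache and __copy_from_user_nocache, and, like those routines, each callback returns the number of bytes it could NOT copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy routine contract: returns the number of bytes left uncopied. */
typedef size_t (*demo_copy_fn)(char *dst, const char *src, size_t n);

/* Stand-in for a copy that always succeeds. */
size_t demo_copy_all(char *dst, const char *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

/* Stand-in for an atomic copy that "faults" after at most 4 bytes. */
size_t demo_copy_faulting(char *dst, const char *src, size_t n)
{
	size_t done = n < 4 ? n : 4;

	memcpy(dst, src, done);
	return n - done;
}

/*
 * Try the fast copy first. If it leaves anything behind, redo the
 * whole range with the slow copy, as the reiser4 helper does. Returns
 * the number of bytes copied by whichever path ran last.
 */
size_t demo_copy_with_fallback(char *dst, const char *src, size_t n,
			       demo_copy_fn fast, demo_copy_fn slow)
{
	size_t left = fast(dst, src, n);

	if (left != 0)
		left = slow(dst, src, n);	/* do it the slow way */
	return n - left;
}
```

The sketch only shows the control flow. It does not model the page lock, so it says nothing about the deadlock the comment in the patch warns about.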