linux-fsdevel.vger.kernel.org archive mirror
* Re: + fs-introduce-write_begin-write_end-and-perform_write-aops.patch added to -mm tree
       [not found] <200705292119.l4TLJtAD011726@shell0.pdx.osdl.net>
@ 2007-05-30  3:13 ` Nick Piggin
  2007-05-30  3:24   ` Andrew Morton
  2007-05-30 10:39   ` Steven Whitehouse
  0 siblings, 2 replies; 6+ messages in thread
From: Nick Piggin @ 2007-05-30  3:13 UTC
  To: akpm; +Cc: mark.fasheh, Steven Whitehouse, linux-fsdevel

On Tue, May 29, 2007 at 02:19:55PM -0700, Andrew Morton wrote:
> 
> The patch titled
>      fs: introduce write_begin, write_end, and perform_write aops
> has been added to the -mm tree.  Its filename is
>      fs-introduce-write_begin-write_end-and-perform_write-aops.patch
> 
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
> 
> See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
> out what to do about this
> 
> ------------------------------------------------------
> Subject: fs: introduce write_begin, write_end, and perform_write aops
> From: Nick Piggin <npiggin@suse.de>
> 
> These are intended to replace prepare_write and commit_write with more
> flexible alternatives that are also able to avoid the buffered write
> deadlock problems efficiently (which prepare_write is unable to do).

OK, well now Andrew's merged a significant chunk of this work, I
would like to try getting the clustered filesystem patches back
in too (Steven, the last GFS2 patch you sent had rejects against this
tree, so I dropped it... hope it isn't too much work to bring it back
uptodate?).

The cluster filesystems aren't 100% happy with the backward-compat
code, because pagecache_write_end cannot handle AOP_TRUNCATED_PAGE from
->commit_write... so if you were to try using loop over GFS2, it might
go BUG. This is a bit bad of me, however the compat code would have been
a whole lot uglier to support that, and I figure the cluster filesystems
want to convert to the new aops ASAP anyway.

I doubt anybody but the filesystem developers would be using -mm in such
a way, but even so I hope we can fix this before long.
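
The AOP_TRUNCATED_PAGE issue above can be illustrated with a minimal
user-space sketch (not kernel code). It assumes that an old-style caller
owns the whole prepare/copy/commit sequence and can therefore redo it,
while a compat-style write_end only wraps the commit step and has nothing
left to retry. The names fake_fs, fake_commit_write, old_style_write and
compat_write_end are hypothetical; only AOP_TRUNCATED_PAGE is a real kernel
symbol, and its value here is copied on the assumption that it matches
include/linux/fs.h.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to match the kernel's positive aop return code. */
#define AOP_TRUNCATED_PAGE 0x80001

/* Hypothetical stand-in for a cluster filesystem whose ->commit_write
 * returns AOP_TRUNCATED_PAGE for the first `contended` calls. */
struct fake_fs {
	int contended;
};

static int fake_commit_write(struct fake_fs *fs)
{
	if (fs->contended > 0) {
		fs->contended--;
		return AOP_TRUNCATED_PAGE;
	}
	return 0;
}

/* Old-style caller: it drives prepare, copy and commit itself, so it
 * can simply restart the sequence when asked to. */
static int old_style_write(struct fake_fs *fs, int *attempts)
{
	int ret;

	assert(fs != NULL && attempts != NULL);
	do {
		(*attempts)++;
		/* prepare_write and the data copy would be redone here */
		ret = fake_commit_write(fs);
	} while (ret == AOP_TRUNCATED_PAGE);
	return ret;
}

/* Compat-style write_end: only the commit step is available, so a
 * retry request cannot be honoured. Returns -1 where the real compat
 * code might BUG. */
static int compat_write_end(struct fake_fs *fs)
{
	int ret = fake_commit_write(fs);

	if (ret == AOP_TRUNCATED_PAGE)
		return -1;
	return ret;
}
```

With a fake_fs that is contended twice, old_style_write succeeds on the
third attempt, whereas compat_write_end fails on the first contended call.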

Meanwhile, I'll look at redoing the rest of the filesystems that got
left behind.



* Re: + fs-introduce-write_begin-write_end-and-perform_write-aops.patch added to -mm tree
  2007-05-30  3:13 ` + fs-introduce-write_begin-write_end-and-perform_write-aops.patch added to -mm tree Nick Piggin
@ 2007-05-30  3:24   ` Andrew Morton
  2007-05-30 22:44     ` Mark Fasheh
  2007-05-30 10:39   ` Steven Whitehouse
  1 sibling, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2007-05-30  3:24 UTC
  To: Nick Piggin; +Cc: mark.fasheh, Steven Whitehouse, linux-fsdevel

On Wed, 30 May 2007 05:13:54 +0200 Nick Piggin <npiggin@suse.de> wrote:

> On Tue, May 29, 2007 at 02:19:55PM -0700, Andrew Morton wrote:
> > 
> > The patch titled
> >      fs: introduce write_begin, write_end, and perform_write aops
> > has been added to the -mm tree.  Its filename is
> >      fs-introduce-write_begin-write_end-and-perform_write-aops.patch
> > 
> > *** Remember to use Documentation/SubmitChecklist when testing your code ***
> > 
> > See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
> > out what to do about this
> > 
> > ------------------------------------------------------
> > Subject: fs: introduce write_begin, write_end, and perform_write aops
> > From: Nick Piggin <npiggin@suse.de>
> > 
> > These are intended to replace prepare_write and commit_write with more
> > flexible alternatives that are also able to avoid the buffered write
> > deadlock problems efficiently (which prepare_write is unable to do).
> 
> OK, well now Andrew's merged a significant chunk of this work, I
> would like to try getting the clustered filesystem patches back
> in too (Steven, the last GFS2 patch you sent had rejects against this
> tree, so I dropped it... hope it isn't too much work to bring it back
> uptodate?).
>
> The cluster filesystems aren't 100% happy with the backward-compat
> code, because pagecache_write_end cannot handle AOP_TRUNCATED_PAGE from
> ->commit_write... so if you were to try using loop over GFS2, it might
> go BUG. This is a bit bad of me, however the compat code would have been
> a whole lot uglier to support that, and I figure the cluster filesystems
> want to convert to the new aops ASAP anyway.
> 
> I doubt anybody but the filesystem developers would be using -mm in such
> a way, but even so I hope we can fix this before long.
> 
> Meanwhile, I'll look at redoing the rest of the filesystems that got
> left behind.

hm, I suppose that means I need to undrop git-ocfs2.patch.  It has a mild
disagreement with the fault-vs-invalidate patches which I didn't feel like
fixing.



* Re: + fs-introduce-write_begin-write_end-and-perform_write-aops.patch added to -mm tree
  2007-05-30  3:13 ` + fs-introduce-write_begin-write_end-and-perform_write-aops.patch added to -mm tree Nick Piggin
  2007-05-30  3:24   ` Andrew Morton
@ 2007-05-30 10:39   ` Steven Whitehouse
  2007-05-31  5:50     ` Nick Piggin
  1 sibling, 1 reply; 6+ messages in thread
From: Steven Whitehouse @ 2007-05-30 10:39 UTC
  To: Nick Piggin; +Cc: akpm, mark.fasheh, linux-fsdevel

Hi,

On Wed, 2007-05-30 at 05:13 +0200, Nick Piggin wrote:
> On Tue, May 29, 2007 at 02:19:55PM -0700, Andrew Morton wrote:
> > 
> > The patch titled
> >      fs: introduce write_begin, write_end, and perform_write aops
> > has been added to the -mm tree.  Its filename is
> >      fs-introduce-write_begin-write_end-and-perform_write-aops.patch
> > 
> > *** Remember to use Documentation/SubmitChecklist when testing your code ***
> > 
> > See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
> > out what to do about this
> > 
> > ------------------------------------------------------
> > Subject: fs: introduce write_begin, write_end, and perform_write aops
> > From: Nick Piggin <npiggin@suse.de>
> > 
> > These are intended to replace prepare_write and commit_write with more
> > flexible alternatives that are also able to avoid the buffered write
> > deadlock problems efficiently (which prepare_write is unable to do).
> 
> OK, well now Andrew's merged a significant chunk of this work, I
> would like to try getting the clustered filesystem patches back
> in too (Steven, the last GFS2 patch you sent had rejects against this
> tree, so I dropped it... hope it isn't too much work to bring it back
> uptodate?).
> 
I think the following should do the trick... sorry for the delay, I was
on holiday last week and I'm just catching up again. There is not a lot
of change from the previous version, just a few small changes in the
upstream code which had caused one chunk not to apply.

Steve.

---------------------------------------------------------------------------------
diff --git a/fs/gfs2/ops_address.c b/fs/gfs2/ops_address.c
index fb84478..ad84a55 100644
--- a/fs/gfs2/ops_address.c
+++ b/fs/gfs2/ops_address.c
@@ -17,6 +17,7 @@
 #include <linux/mpage.h>
 #include <linux/fs.h>
 #include <linux/writeback.h>
+#include <linux/swap.h>
 #include <linux/gfs2_ondisk.h>
 #include <linux/lm_interface.h>
 
@@ -349,45 +350,49 @@ out_unlock:
 }
 
 /**
- * gfs2_prepare_write - Prepare to write a page to a file
+ * gfs2_write_begin - Begin to write to a file
  * @file: The file to write to
- * @page: The page which is to be prepared for writing
- * @from: From (byte range within page)
- * @to: To (byte range within page)
+ * @mapping: The mapping in which to write
+ * @pos: The file offset at which to start writing
+ * @len: Length of the write
+ * @flags: Various flags
+ * @pagep: Pointer to return the page
+ * @fsdata: Pointer to return fs data (unused by GFS2)
  *
  * Returns: errno
  */
 
-static int gfs2_prepare_write(struct file *file, struct page *page,
-			      unsigned from, unsigned to)
+static int gfs2_write_begin(struct file *file, struct address_space *mapping,
+			    loff_t pos, unsigned len, unsigned flags,
+			    struct page **pagep, void **fsdata)
 {
-	struct gfs2_inode *ip = GFS2_I(page->mapping->host);
-	struct gfs2_sbd *sdp = GFS2_SB(page->mapping->host);
+	struct gfs2_inode *ip = GFS2_I(mapping->host);
+	struct gfs2_sbd *sdp = GFS2_SB(mapping->host);
 	unsigned int data_blocks, ind_blocks, rblocks;
 	int alloc_required;
 	int error = 0;
-	loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + from;
-	loff_t end = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to;
 	struct gfs2_alloc *al;
-	unsigned int write_len = to - from;
+	pgoff_t index = pos >> PAGE_CACHE_SHIFT;
+	unsigned from = pos & (PAGE_CACHE_SIZE - 1);
+	unsigned to = from + len;
+	struct page *page;
 
-
-	gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, GL_ATIME|LM_FLAG_TRY_1CB, &ip->i_gh);
+	gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, GL_ATIME, &ip->i_gh);
 	error = gfs2_glock_nq_atime(&ip->i_gh);
-	if (unlikely(error)) {
-		if (error == GLR_TRYFAILED) {
-			unlock_page(page);
-			error = AOP_TRUNCATED_PAGE;
-			yield();
-		}
+	if (unlikely(error))
 		goto out_uninit;
-	}
 
-	gfs2_write_calc_reserv(ip, write_len, &data_blocks, &ind_blocks);
+	error = -ENOMEM;
+	page = __grab_cache_page(mapping, index);
+	*pagep = page;
+	if (!page)
+		goto out_unlock;
+
+	gfs2_write_calc_reserv(ip, len, &data_blocks, &ind_blocks);
 
-	error = gfs2_write_alloc_required(ip, pos, write_len, &alloc_required);
+	error = gfs2_write_alloc_required(ip, pos, len, &alloc_required);
 	if (error)
-		goto out_unlock;
+		goto out_putpage;
 
 
 	ip->i_alloc.al_requested = 0;
@@ -419,7 +424,7 @@ static int gfs2_prepare_write(struct file *file, struct page *page,
 		goto out;
 
 	if (gfs2_is_stuffed(ip)) {
-		if (end > sdp->sd_sb.sb_bsize - sizeof(struct gfs2_dinode)) {
+		if (pos + len > sdp->sd_sb.sb_bsize - sizeof(struct gfs2_dinode)) {
 			error = gfs2_unstuff_dinode(ip, page);
 			if (error == 0)
 				goto prepare_write;
@@ -441,6 +446,10 @@ out_qunlock:
 out_alloc_put:
 			gfs2_alloc_put(ip);
 		}
+out_putpage:
+		page_cache_release(page);
+		if (pos + len > ip->i_inode.i_size)
+			vmtruncate(&ip->i_inode, ip->i_inode.i_size);
 out_unlock:
 		gfs2_glock_dq_m(1, &ip->i_gh);
 out_uninit:
@@ -476,65 +485,118 @@ static void adjust_fs_space(struct inode *inode)
 }
 
 /**
- * gfs2_commit_write - Commit write to a file
+ * gfs2_stuffed_write_end - Write end for stuffed files
+ * @inode: The inode
+ * @dibh: The buffer_head containing the on-disk inode
+ * @pos: The file position
+ * @len: The length of the write
+ * @copied: How much was actually copied by the VFS
+ * @page: The page
+ *
+ * This copies the data from the page into the inode block after
+ * the inode data structure itself.
+ *
+ * Returns: errno
+ */
+static int gfs2_stuffed_write_end(struct inode *inode, struct buffer_head *dibh,
+				  loff_t pos, unsigned len, unsigned copied,
+				  struct page *page)
+{
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_sbd *sdp = GFS2_SB(inode);
+	u64 to = pos + copied;
+	void *kaddr;
+	unsigned char *buf = dibh->b_data + sizeof(struct gfs2_dinode);
+	struct gfs2_dinode *di = (struct gfs2_dinode *)dibh->b_data;
+
+	BUG_ON((pos + len) > (dibh->b_size - sizeof(struct gfs2_dinode)));
+	kaddr = kmap_atomic(page, KM_USER0);
+	memcpy(buf + pos, kaddr + pos, copied);
+	memset(kaddr + pos + copied, 0, len - copied);
+	flush_dcache_page(page);
+	kunmap_atomic(kaddr, KM_USER0);
+
+	if (!PageUptodate(page))
+		SetPageUptodate(page);
+	unlock_page(page);
+	mark_page_accessed(page);
+	page_cache_release(page);
+
+	if (inode->i_size < to) {
+		i_size_write(inode, to);
+		ip->i_di.di_size = inode->i_size;
+		di->di_size = cpu_to_be64(inode->i_size);
+		mark_inode_dirty(inode);
+	}
+
+	if (inode == sdp->sd_rindex)
+		adjust_fs_space(inode);
+
+	brelse(dibh);
+	gfs2_trans_end(sdp);
+	gfs2_glock_dq(&ip->i_gh);
+	gfs2_holder_uninit(&ip->i_gh);
+	return copied;
+}
+
+/**
+ * gfs2_write_end
  * @file: The file to write to
- * @page: The page containing the data
- * @from: From (byte range within page)
- * @to: To (byte range within page)
+ * @mapping: The address space to write to
+ * @pos: The file position
+ * @len: The length of the data
+ * @copied:
+ * @page: The page that has been written
+ * @fsdata: The fsdata (unused in GFS2)
+ *
+ * The main write_end function for GFS2. We have a separate one for
+ * stuffed files as they are slightly different, otherwise we just
+ * put our locking around the VFS provided functions.
  *
  * Returns: errno
  */
 
-static int gfs2_commit_write(struct file *file, struct page *page,
-			     unsigned from, unsigned to)
+static int gfs2_write_end(struct file *file, struct address_space *mapping,
+			  loff_t pos, unsigned len, unsigned copied,
+			  struct page *page, void *fsdata)
 {
 	struct inode *inode = page->mapping->host;
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
-	int error = -EOPNOTSUPP;
 	struct buffer_head *dibh;
 	struct gfs2_alloc *al = &ip->i_alloc;
 	struct gfs2_dinode *di;
+	unsigned int from = pos & (PAGE_CACHE_SIZE - 1);
+	unsigned int to = from + len;
+	int ret;
 
-	if (gfs2_assert_withdraw(sdp, gfs2_glock_is_locked_by_me(ip->i_gl)))
-                goto fail_nounlock;
+	BUG_ON(gfs2_glock_is_locked_by_me(ip->i_gl) == 0);
 
-	error = gfs2_meta_inode_buffer(ip, &dibh);
-	if (error)
-		goto fail_endtrans;
+	ret = gfs2_meta_inode_buffer(ip, &dibh);
+	if (unlikely(ret)) {
+		unlock_page(page);
+		page_cache_release(page);
+		goto failed;
+	}
 
 	gfs2_trans_add_bh(ip->i_gl, dibh, 1);
-	di = (struct gfs2_dinode *)dibh->b_data;
 
-	if (gfs2_is_stuffed(ip)) {
-		u64 file_size;
-		void *kaddr;
+	if (gfs2_is_stuffed(ip))
+		return gfs2_stuffed_write_end(inode, dibh, pos, len, copied, page);
 
-		file_size = ((u64)page->index << PAGE_CACHE_SHIFT) + to;
+	if (sdp->sd_args.ar_data == GFS2_DATA_ORDERED || gfs2_is_jdata(ip))
+		gfs2_page_add_databufs(ip, page, from, to);
 
-		kaddr = kmap_atomic(page, KM_USER0);
-		memcpy(dibh->b_data + sizeof(struct gfs2_dinode) + from,
-		       kaddr + from, to - from);
-		kunmap_atomic(kaddr, KM_USER0);
+	ret = generic_write_end(file, mapping, pos, len, copied, page, fsdata);
 
-		SetPageUptodate(page);
-
-		if (inode->i_size < file_size) {
-			i_size_write(inode, file_size);
+	if (likely(ret >= 0)) {
+		copied = ret;
+		if  ((pos + copied) > inode->i_size) {
+			di = (struct gfs2_dinode *)dibh->b_data;
+			ip->i_di.di_size = inode->i_size;
+			di->di_size = cpu_to_be64(inode->i_size);
 			mark_inode_dirty(inode);
 		}
-	} else {
-		if (sdp->sd_args.ar_data == GFS2_DATA_ORDERED ||
-		    gfs2_is_jdata(ip))
-			gfs2_page_add_databufs(ip, page, from, to);
-		error = generic_commit_write(file, page, from, to);
-		if (error)
-			goto fail;
-	}
-
-	if (ip->i_di.di_size < inode->i_size) {
-		ip->i_di.di_size = inode->i_size;
-		di->di_size = cpu_to_be64(inode->i_size);
 	}
 
 	if (inode == sdp->sd_rindex)
@@ -542,33 +604,15 @@ static int gfs2_commit_write(struct file *file, struct page *page,
 
 	brelse(dibh);
 	gfs2_trans_end(sdp);
+failed:
 	if (al->al_requested) {
 		gfs2_inplace_release(ip);
 		gfs2_quota_unlock(ip);
 		gfs2_alloc_put(ip);
 	}
-	unlock_page(page);
-	gfs2_glock_dq_m(1, &ip->i_gh);
-	lock_page(page);
+	gfs2_glock_dq(&ip->i_gh);
 	gfs2_holder_uninit(&ip->i_gh);
-	return 0;
-
-fail:
-	brelse(dibh);
-fail_endtrans:
-	gfs2_trans_end(sdp);
-	if (al->al_requested) {
-		gfs2_inplace_release(ip);
-		gfs2_quota_unlock(ip);
-		gfs2_alloc_put(ip);
-	}
-	unlock_page(page);
-	gfs2_glock_dq_m(1, &ip->i_gh);
-	lock_page(page);
-	gfs2_holder_uninit(&ip->i_gh);
-fail_nounlock:
-	ClearPageUptodate(page);
-	return error;
+	return ret;
 }
 
 /**
@@ -837,8 +881,8 @@ const struct address_space_operations gfs2_file_aops = {
 	.readpage = gfs2_readpage,
 	.readpages = gfs2_readpages,
 	.sync_page = block_sync_page,
-	.prepare_write = gfs2_prepare_write,
-	.commit_write = gfs2_commit_write,
+	.write_begin = gfs2_write_begin,
+	.write_end = gfs2_write_end,
 	.bmap = gfs2_bmap,
 	.invalidatepage = gfs2_invalidatepage,
 	.releasepage = gfs2_releasepage,
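
The main interface change in the patch above is that write_begin and
write_end receive a file position and length (pos, len) instead of a page
plus a byte range (from, to), and derive the page index and in-page range
themselves. A minimal user-space sketch of that arithmetic follows. It
assumes a 4096-byte page; SKETCH_PAGE_SHIFT, SKETCH_PAGE_SIZE, struct
page_range, split_pos and join_pos are hypothetical names standing in for
PAGE_CACHE_SHIFT, PAGE_CACHE_SIZE and the local variables in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page geometry; the kernel uses PAGE_CACHE_SHIFT/PAGE_CACHE_SIZE. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

struct page_range {
	uint64_t index;	/* which page of the file */
	unsigned from;	/* first byte within the page */
	unsigned to;	/* one past the last byte within the page */
};

/* New-style direction: (pos, len) -> (index, from, to), as computed at
 * the top of gfs2_write_begin and gfs2_write_end in the patch. */
static struct page_range split_pos(uint64_t pos, unsigned len)
{
	struct page_range r;

	r.index = pos >> SKETCH_PAGE_SHIFT;
	r.from = (unsigned)(pos & (SKETCH_PAGE_SIZE - 1));
	r.to = r.from + len;
	/* A single call is assumed not to cross a page boundary. */
	assert(r.to <= SKETCH_PAGE_SIZE);
	return r;
}

/* Old-style direction: (index, from) -> pos, as the removed
 * gfs2_prepare_write computed it from page->index. */
static uint64_t join_pos(struct page_range r)
{
	return (r.index << SKETCH_PAGE_SHIFT) + r.from;
}
```

For example, pos = 10000 with len = 100 falls in page 2 and covers bytes
1808 to 1908 of that page, and join_pos maps the result back to 10000.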





* Re: + fs-introduce-write_begin-write_end-and-perform_write-aops.patch added to -mm tree
  2007-05-30  3:24   ` Andrew Morton
@ 2007-05-30 22:44     ` Mark Fasheh
  2007-05-30 22:49       ` Andrew Morton
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Fasheh @ 2007-05-30 22:44 UTC
  To: Andrew Morton; +Cc: Nick Piggin, Steven Whitehouse, linux-fsdevel

On Tue, May 29, 2007 at 08:24:41PM -0700, Andrew Morton wrote:
> hm, I suppose that means I need to undrop git-ocfs2.patch.  It has a mild
> disagreement with the fault-vs-invalidate patches which I didn't feel like
> fixing.

Erf, somehow I missed that it was dropped. I guess I'll take a look...
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@oracle.com


* Re: + fs-introduce-write_begin-write_end-and-perform_write-aops.patch added to -mm tree
  2007-05-30 22:44     ` Mark Fasheh
@ 2007-05-30 22:49       ` Andrew Morton
  0 siblings, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2007-05-30 22:49 UTC
  To: Mark Fasheh; +Cc: Nick Piggin, Steven Whitehouse, linux-fsdevel

On Wed, 30 May 2007 15:44:43 -0700
Mark Fasheh <mark.fasheh@oracle.com> wrote:

> On Tue, May 29, 2007 at 08:24:41PM -0700, Andrew Morton wrote:
> > hm, I suppose that means I need to undrop git-ocfs2.patch.  It has a mild
> > disagreement with the fault-vs-invalidate patches which I didn't feel like
> > fixing.
> 
> Erf, somehow I missed that it was dropped.

I didn't tell you ;)

> I guess I'll take a look...

It's the sort of mm-vs-mm thing which I get to fix, only I've been putting
it off.  As it now looks like the fault patches are moving away from death's
door, I shall do that.




* Re: + fs-introduce-write_begin-write_end-and-perform_write-aops.patch added to -mm tree
  2007-05-30 10:39   ` Steven Whitehouse
@ 2007-05-31  5:50     ` Nick Piggin
  0 siblings, 0 replies; 6+ messages in thread
From: Nick Piggin @ 2007-05-31  5:50 UTC
  To: Steven Whitehouse; +Cc: akpm, mark.fasheh, linux-fsdevel

On Wed, May 30, 2007 at 11:39:54AM +0100, Steven Whitehouse wrote:
> Hi,
> 
> On Wed, 2007-05-30 at 05:13 +0200, Nick Piggin wrote:
> > On Tue, May 29, 2007 at 02:19:55PM -0700, Andrew Morton wrote:
> > > 
> > > The patch titled
> > >      fs: introduce write_begin, write_end, and perform_write aops
> > > has been added to the -mm tree.  Its filename is
> > >      fs-introduce-write_begin-write_end-and-perform_write-aops.patch
> > > 
> > > *** Remember to use Documentation/SubmitChecklist when testing your code ***
> > > 
> > > See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
> > > out what to do about this
> > > 
> > > ------------------------------------------------------
> > > Subject: fs: introduce write_begin, write_end, and perform_write aops
> > > From: Nick Piggin <npiggin@suse.de>
> > > 
> > > These are intended to replace prepare_write and commit_write with more
> > > flexible alternatives that are also able to avoid the buffered write
> > > deadlock problems efficiently (which prepare_write is unable to do).
> > 
> > OK, well now Andrew's merged a significant chunk of this work, I
> > would like to try getting the clustered filesystem patches back
> > in too (Steven, the last GFS2 patch you sent had rejects against this
> > tree, so I dropped it... hope it isn't too much work to bring it back
> > uptodate?).
> > 
> I think the following should do the trick... sorry for the delay, I was
> on holiday last week and I'm just catching up again. There is not a lot
> of change from the previous version, just a few small changes in the
> upstream code which had caused one chunk not to apply.

OK thanks! I'll send this to Andrew after he releases the next -mm and
hopefully picks up the patchset again.
 

