From: Steven Whitehouse <swhiteho@redhat.com>
To: Nick Piggin <npiggin@suse.de>
Cc: akpm@linux-foundation.org, mark.fasheh@oracle.com,
linux-fsdevel@vger.kernel.org
Subject: Re: + fs-introduce-write_begin-write_end-and-perform_write-aops.patch added to -mm tree
Date: Wed, 30 May 2007 11:39:54 +0100 [thread overview]
Message-ID: <1180521594.25918.62.camel@quoit> (raw)
In-Reply-To: <20070530031354.GE27054@wotan.suse.de>
Hi,
On Wed, 2007-05-30 at 05:13 +0200, Nick Piggin wrote:
> On Tue, May 29, 2007 at 02:19:55PM -0700, Andrew Morton wrote:
> >
> > The patch titled
> > fs: introduce write_begin, write_end, and perform_write aops
> > has been added to the -mm tree. Its filename is
> > fs-introduce-write_begin-write_end-and-perform_write-aops.patch
> >
> > *** Remember to use Documentation/SubmitChecklist when testing your code ***
> >
> > See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
> > out what to do about this
> >
> > ------------------------------------------------------
> > Subject: fs: introduce write_begin, write_end, and perform_write aops
> > From: Nick Piggin <npiggin@suse.de>
> >
> > These are intended to replace prepare_write and commit_write with more
> > flexible alternatives that are also able to avoid the buffered write
> > deadlock problems efficiently (which prepare_write is unable to do).
>
> OK, well now Andrew's merged a significant chunk of this work, I
> would like to try getting the clustered filesystem patches back
> in too (Steven, the last GFS2 patch you sent had rejects against this
> tree, so I dropped it... hope it isn't too much work to bring it back
> uptodate?).
>
I think the following should do the trick... sorry for the delay, I was
on holiday last week and I'm just catching up again. There is not a lot
of change from the previous version, just a few small changes in the
upstream code which had caused one chunk not to apply,
Steve.
---------------------------------------------------------------------------------
diff --git a/fs/gfs2/ops_address.c b/fs/gfs2/ops_address.c
index fb84478..ad84a55 100644
--- a/fs/gfs2/ops_address.c
+++ b/fs/gfs2/ops_address.c
@@ -17,6 +17,7 @@
#include <linux/mpage.h>
#include <linux/fs.h>
#include <linux/writeback.h>
+#include <linux/swap.h>
#include <linux/gfs2_ondisk.h>
#include <linux/lm_interface.h>
@@ -349,45 +350,49 @@ out_unlock:
}
/**
- * gfs2_prepare_write - Prepare to write a page to a file
+ * gfs2_write_begin - Begin to write to a file
* @file: The file to write to
- * @page: The page which is to be prepared for writing
- * @from: From (byte range within page)
- * @to: To (byte range within page)
+ * @mapping: The mapping in which to write
+ * @pos: The file offset at which to start writing
+ * @len: Length of the write
+ * @flags: Various flags
+ * @pagep: Pointer to return the page
+ * @fsdata: Pointer to return fs data (unused by GFS2)
*
* Returns: errno
*/
-static int gfs2_prepare_write(struct file *file, struct page *page,
- unsigned from, unsigned to)
+static int gfs2_write_begin(struct file *file, struct address_space *mapping,
+ loff_t pos, unsigned len, unsigned flags,
+ struct page **pagep, void **fsdata)
{
- struct gfs2_inode *ip = GFS2_I(page->mapping->host);
- struct gfs2_sbd *sdp = GFS2_SB(page->mapping->host);
+ struct gfs2_inode *ip = GFS2_I(mapping->host);
+ struct gfs2_sbd *sdp = GFS2_SB(mapping->host);
unsigned int data_blocks, ind_blocks, rblocks;
int alloc_required;
int error = 0;
- loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + from;
- loff_t end = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to;
struct gfs2_alloc *al;
- unsigned int write_len = to - from;
+ pgoff_t index = pos >> PAGE_CACHE_SHIFT;
+ unsigned from = pos & (PAGE_CACHE_SIZE - 1);
+ unsigned to = from + len;
+ struct page *page;
-
- gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, GL_ATIME|LM_FLAG_TRY_1CB, &ip->i_gh);
+ gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, GL_ATIME, &ip->i_gh);
error = gfs2_glock_nq_atime(&ip->i_gh);
- if (unlikely(error)) {
- if (error == GLR_TRYFAILED) {
- unlock_page(page);
- error = AOP_TRUNCATED_PAGE;
- yield();
- }
+ if (unlikely(error))
goto out_uninit;
- }
- gfs2_write_calc_reserv(ip, write_len, &data_blocks, &ind_blocks);
+ error = -ENOMEM;
+ page = __grab_cache_page(mapping, index);
+ *pagep = page;
+ if (!page)
+ goto out_unlock;
+
+ gfs2_write_calc_reserv(ip, len, &data_blocks, &ind_blocks);
- error = gfs2_write_alloc_required(ip, pos, write_len, &alloc_required);
+ error = gfs2_write_alloc_required(ip, pos, len, &alloc_required);
if (error)
- goto out_unlock;
+ goto out_putpage;
ip->i_alloc.al_requested = 0;
@@ -419,7 +424,7 @@ static int gfs2_prepare_write(struct file *file, struct page *page,
goto out;
if (gfs2_is_stuffed(ip)) {
- if (end > sdp->sd_sb.sb_bsize - sizeof(struct gfs2_dinode)) {
+ if (pos + len > sdp->sd_sb.sb_bsize - sizeof(struct gfs2_dinode)) {
error = gfs2_unstuff_dinode(ip, page);
if (error == 0)
goto prepare_write;
@@ -441,6 +446,10 @@ out_qunlock:
out_alloc_put:
gfs2_alloc_put(ip);
}
+out_putpage:
+ page_cache_release(page);
+ if (pos + len > ip->i_inode.i_size)
+ vmtruncate(&ip->i_inode, ip->i_inode.i_size);
out_unlock:
gfs2_glock_dq_m(1, &ip->i_gh);
out_uninit:
@@ -476,65 +485,118 @@ static void adjust_fs_space(struct inode *inode)
}
/**
- * gfs2_commit_write - Commit write to a file
+ * gfs2_stuffed_write_end - Write end for stuffed files
+ * @inode: The inode
+ * @dibh: The buffer_head containing the on-disk inode
+ * @pos: The file position
+ * @len: The length of the write
+ * @copied: How much was actually copied by the VFS
+ * @page: The page
+ *
+ * This copies the data from the page into the inode block after
+ * the inode data structure itself.
+ *
+ * Returns: errno
+ */
+static int gfs2_stuffed_write_end(struct inode *inode, struct buffer_head *dibh,
+ loff_t pos, unsigned len, unsigned copied,
+ struct page *page)
+{
+ struct gfs2_inode *ip = GFS2_I(inode);
+ struct gfs2_sbd *sdp = GFS2_SB(inode);
+ u64 to = pos + copied;
+ void *kaddr;
+ unsigned char *buf = dibh->b_data + sizeof(struct gfs2_dinode);
+ struct gfs2_dinode *di = (struct gfs2_dinode *)dibh->b_data;
+
+ BUG_ON((pos + len) > (dibh->b_size - sizeof(struct gfs2_dinode)));
+ kaddr = kmap_atomic(page, KM_USER0);
+ memcpy(buf + pos, kaddr + pos, copied);
+ memset(kaddr + pos + copied, 0, len - copied);
+ flush_dcache_page(page);
+ kunmap_atomic(kaddr, KM_USER0);
+
+ if (!PageUptodate(page))
+ SetPageUptodate(page);
+ unlock_page(page);
+ mark_page_accessed(page);
+ page_cache_release(page);
+
+ if (inode->i_size < to) {
+ i_size_write(inode, to);
+ ip->i_di.di_size = inode->i_size;
+ di->di_size = cpu_to_be64(inode->i_size);
+ mark_inode_dirty(inode);
+ }
+
+ if (inode == sdp->sd_rindex)
+ adjust_fs_space(inode);
+
+ brelse(dibh);
+ gfs2_trans_end(sdp);
+ gfs2_glock_dq(&ip->i_gh);
+ gfs2_holder_uninit(&ip->i_gh);
+ return copied;
+}
+
+/**
+ * gfs2_write_end
* @file: The file to write to
- * @page: The page containing the data
- * @from: From (byte range within page)
- * @to: To (byte range within page)
+ * @mapping: The address space to write to
+ * @pos: The file position
+ * @len: The length of the data
+ * @copied:
+ * @page: The page that has been written
+ * @fsdata: The fsdata (unused in GFS2)
+ *
+ * The main write_end function for GFS2. We have a separate one for
+ * stuffed files as they are slightly different, otherwise we just
+ * put our locking around the VFS provided functions.
*
* Returns: errno
*/
-static int gfs2_commit_write(struct file *file, struct page *page,
- unsigned from, unsigned to)
+static int gfs2_write_end(struct file *file, struct address_space *mapping,
+ loff_t pos, unsigned len, unsigned copied,
+ struct page *page, void *fsdata)
{
struct inode *inode = page->mapping->host;
struct gfs2_inode *ip = GFS2_I(inode);
struct gfs2_sbd *sdp = GFS2_SB(inode);
- int error = -EOPNOTSUPP;
struct buffer_head *dibh;
struct gfs2_alloc *al = &ip->i_alloc;
struct gfs2_dinode *di;
+ unsigned int from = pos & (PAGE_CACHE_SIZE - 1);
+ unsigned int to = from + len;
+ int ret;
- if (gfs2_assert_withdraw(sdp, gfs2_glock_is_locked_by_me(ip->i_gl)))
- goto fail_nounlock;
+ BUG_ON(gfs2_glock_is_locked_by_me(ip->i_gl) == 0);
- error = gfs2_meta_inode_buffer(ip, &dibh);
- if (error)
- goto fail_endtrans;
+ ret = gfs2_meta_inode_buffer(ip, &dibh);
+ if (unlikely(ret)) {
+ unlock_page(page);
+ page_cache_release(page);
+ goto failed;
+ }
gfs2_trans_add_bh(ip->i_gl, dibh, 1);
- di = (struct gfs2_dinode *)dibh->b_data;
- if (gfs2_is_stuffed(ip)) {
- u64 file_size;
- void *kaddr;
+ if (gfs2_is_stuffed(ip))
+ return gfs2_stuffed_write_end(inode, dibh, pos, len, copied, page);
- file_size = ((u64)page->index << PAGE_CACHE_SHIFT) + to;
+ if (sdp->sd_args.ar_data == GFS2_DATA_ORDERED || gfs2_is_jdata(ip))
+ gfs2_page_add_databufs(ip, page, from, to);
- kaddr = kmap_atomic(page, KM_USER0);
- memcpy(dibh->b_data + sizeof(struct gfs2_dinode) + from,
- kaddr + from, to - from);
- kunmap_atomic(kaddr, KM_USER0);
+ ret = generic_write_end(file, mapping, pos, len, copied, page, fsdata);
- SetPageUptodate(page);
-
- if (inode->i_size < file_size) {
- i_size_write(inode, file_size);
+ if (likely(ret >= 0)) {
+ copied = ret;
+ if ((pos + copied) > inode->i_size) {
+ di = (struct gfs2_dinode *)dibh->b_data;
+ ip->i_di.di_size = inode->i_size;
+ di->di_size = cpu_to_be64(inode->i_size);
mark_inode_dirty(inode);
}
- } else {
- if (sdp->sd_args.ar_data == GFS2_DATA_ORDERED ||
- gfs2_is_jdata(ip))
- gfs2_page_add_databufs(ip, page, from, to);
- error = generic_commit_write(file, page, from, to);
- if (error)
- goto fail;
- }
-
- if (ip->i_di.di_size < inode->i_size) {
- ip->i_di.di_size = inode->i_size;
- di->di_size = cpu_to_be64(inode->i_size);
}
if (inode == sdp->sd_rindex)
@@ -542,33 +604,15 @@ static int gfs2_commit_write(struct file *file, struct page *page,
brelse(dibh);
gfs2_trans_end(sdp);
+failed:
if (al->al_requested) {
gfs2_inplace_release(ip);
gfs2_quota_unlock(ip);
gfs2_alloc_put(ip);
}
- unlock_page(page);
- gfs2_glock_dq_m(1, &ip->i_gh);
- lock_page(page);
+ gfs2_glock_dq(&ip->i_gh);
gfs2_holder_uninit(&ip->i_gh);
- return 0;
-
-fail:
- brelse(dibh);
-fail_endtrans:
- gfs2_trans_end(sdp);
- if (al->al_requested) {
- gfs2_inplace_release(ip);
- gfs2_quota_unlock(ip);
- gfs2_alloc_put(ip);
- }
- unlock_page(page);
- gfs2_glock_dq_m(1, &ip->i_gh);
- lock_page(page);
- gfs2_holder_uninit(&ip->i_gh);
-fail_nounlock:
- ClearPageUptodate(page);
- return error;
+ return ret;
}
/**
@@ -837,8 +881,8 @@ const struct address_space_operations gfs2_file_aops = {
.readpage = gfs2_readpage,
.readpages = gfs2_readpages,
.sync_page = block_sync_page,
- .prepare_write = gfs2_prepare_write,
- .commit_write = gfs2_commit_write,
+ .write_begin = gfs2_write_begin,
+ .write_end = gfs2_write_end,
.bmap = gfs2_bmap,
.invalidatepage = gfs2_invalidatepage,
.releasepage = gfs2_releasepage,
next prev parent reply other threads:[~2007-05-30 10:45 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <200705292119.l4TLJtAD011726@shell0.pdx.osdl.net>
2007-05-30 3:13 ` + fs-introduce-write_begin-write_end-and-perform_write-aops.patch added to -mm tree Nick Piggin
2007-05-30 3:24 ` Andrew Morton
2007-05-30 22:44 ` Mark Fasheh
2007-05-30 22:49 ` Andrew Morton
2007-05-30 10:39 ` Steven Whitehouse [this message]
2007-05-31 5:50 ` Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1180521594.25918.62.camel@quoit \
--to=swhiteho@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=mark.fasheh@oracle.com \
--cc=npiggin@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).