linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sonny Rao <sonny@burdell.org>
To: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Andrew Morton <akpm@osdl.org>,
	ext2-devel <ext2-devel@lists.sourceforge.net>,
	linux-fsdevel@vger.kernel.org, sct@redhat.com
Subject: Re: [Ext2-devel] Re: [RFC] ext3 writepages for writeback mode
Date: Mon, 14 Feb 2005 15:22:10 -0500	[thread overview]
Message-ID: <20050214202210.GA3873@kevlar.burdell.org> (raw)
In-Reply-To: <20050214200256.GA3690@kevlar.burdell.org>

[-- Attachment #1: Type: text/plain, Size: 2239 bytes --]

On Mon, Feb 14, 2005 at 03:02:56PM -0500, Sonny Rao wrote:
> On Fri, Feb 11, 2005 at 04:09:46PM -0800, Badari Pulavarty wrote:
> > On Fri, 2005-02-11 at 15:58, Andrew Morton wrote:
> > > Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > > >
> > > > Due to lack of interesting suggestions to solve 
> > > > mpage_writepages() -> ext3_writeback_writepage() problem,
> > > > I fixed it in the dumbest possible way.
> > > 
> > > I've actually forgotten what the problem was.  It was 100 patches ago :(
> > 
> > The problem was 
> > 	ext3_writeback_writepages() -> mpage_writepages() could
> > call back ext3_writeback_writepage() in the "confused" case.
> > ext3_writeback_writepage() could end up doing nothing, since the
> > we already have a journal handle.
> > 
> > I added a writepage_helper to handle this case.
> > 
> > > 
> > > > Let me know, what you think.
> > > 
> > > 
> > > If it works, let's get some benchmark numbers so we can decide whether it
> > > justifies more development?
> > > 
> > 
> > Yep. I will get some numbers to see ..
> > 
> 
> I'm helping Badari collecting data on this...
> 
> My setup is a P4 2.0Ghz booted with 1GB of RAM and 1 cpu attached via
> Fiber to a seven disk raid0 array with write-caching turned off
> (write-cacheing can skew numbers significantly if you aren't careful
> to let the cache drain between runs, etc.) 
> 
> The test is a single-threaded sequential overwrite of a 20GB data set
> divided into 512MB files which are selected randomly and overwritten.
> 
> All of these numbers represent the average of three five-minute runs.
> 
> All ext3 tests are in writeback mode
> 
> FS		Throughput	Cpu Utilizaiton
> --		----------	---------------
> Ext3		78 MB/sec	75.9 %
> Ext3 + wpages	85 MB/sec	74.7 %
> 
> Just for comparison:
> 
> Ext2		88.5 MB/sec	74.2 %
> Ext2 + nobh	89.6 MB/sec	71.7 %
> 
> JFS		94.8 MB/sec	85.6 %
> XFS		100 MB/sec	95.5 %
> 
> 
> So, Badari's writepages patch improves performance on this particular
> setup by almost 10 %
> 
> 
> I can rerun with more processors/ram, or different disk configurations
> if anyone is interested.
> 
> Sonny

One other detail,

I had to fix the patch, for some reason it was malformed ?

I've attached my version of the patch.

Sonny



[-- Attachment #2: ext3-writeback-writepages.patch2-fixed --]
[-- Type: text/plain, Size: 5168 bytes --]

diff -Naurp linux-2.6.10-orig/fs/ext3/inode.c linux-2.6.10/fs/ext3/inode.c
--- linux-2.6.10-orig/fs/ext3/inode.c	2005-01-21 00:51:27.000000000 -0600
+++ linux-2.6.10/fs/ext3/inode.c	2005-02-14 11:36:52.387332295 -0600
@@ -856,6 +856,12 @@ get_block:
 	return ret;
 }
 
+static int ext3_writepages_get_block(struct inode *inode, sector_t iblock,
+			struct buffer_head *bh, int create)
+{
+	return ext3_direct_io_get_blocks(inode, iblock, 1, bh, create);
+}
+
 /*
  * `handle' can be NULL if create is zero
  */
@@ -1321,6 +1327,44 @@ out_fail:
 	return ret;
 }
 
+static int ext3_writeback_writepage_helper(struct page *page,
+				struct writeback_control *wbc)
+{
+	return block_write_full_page(page, ext3_get_block, wbc);
+}
+
+static int
+ext3_writeback_writepages(struct address_space *mapping, 
+				struct writeback_control *wbc)
+{
+	struct inode *inode = mapping->host;
+	handle_t *handle = NULL;
+	int err, ret = 0;
+
+	if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
+		return ret;
+
+	handle = ext3_journal_start(inode, ext3_writepage_trans_blocks(inode));
+	if (IS_ERR(handle)) {
+		ret = PTR_ERR(handle);
+		return ret;
+	}
+
+	ret = __mpage_writepages(mapping, wbc, ext3_writepages_get_block,
+					ext3_writeback_writepage_helper);
+
+	/*
+	 * Need to reaquire the handle since ext3_writepages_get_block()
+	 * can restart the handle
+	 */
+	handle = journal_current_handle();
+
+	err = ext3_journal_stop(handle);
+	if (!ret)
+		ret = err;
+	return ret;
+}
+
 static int ext3_writeback_writepage(struct page *page,
 				struct writeback_control *wbc)
 {
@@ -1552,6 +1596,7 @@ static struct address_space_operations e
 	.readpage	= ext3_readpage,
 	.readpages	= ext3_readpages,
 	.writepage	= ext3_writeback_writepage,
+	.writepages	= ext3_writeback_writepages,
 	.sync_page	= block_sync_page,
 	.prepare_write	= ext3_prepare_write,
 	.commit_write	= ext3_writeback_commit_write,
diff -Naurp linux-2.6.10-orig/fs/mpage.c linux-2.6.10/fs/mpage.c
--- linux-2.6.10-orig/fs/mpage.c	2004-10-18 16:53:43.000000000 -0500
+++ linux-2.6.10/fs/mpage.c	2005-02-14 11:36:52.383332918 -0600
@@ -387,7 +387,8 @@ EXPORT_SYMBOL(mpage_readpage);
  */
 static struct bio *
 mpage_writepage(struct bio *bio, struct page *page, get_block_t get_block,
-	sector_t *last_block_in_bio, int *ret, struct writeback_control *wbc)
+	sector_t *last_block_in_bio, int *ret, struct writeback_control *wbc,
+	writepage_t writepage_helper)
 {
 	struct address_space *mapping = page->mapping;
 	struct inode *inode = page->mapping->host;
@@ -580,7 +581,7 @@ alloc_new:
 confused:
 	if (bio)
 		bio = mpage_bio_submit(WRITE, bio);
-	*ret = page->mapping->a_ops->writepage(page, wbc);
+	*ret = writepage_helper(page, wbc);
 	/*
 	 * The caller has a ref on the inode, so *mapping is stable
 	 */
@@ -619,6 +620,15 @@ int
 mpage_writepages(struct address_space *mapping,
 		struct writeback_control *wbc, get_block_t get_block)
 {
+	return __mpage_writepages(mapping, wbc, get_block,
+			 mapping->a_ops->writepage);
+}
+
+int
+__mpage_writepages(struct address_space *mapping,
+		struct writeback_control *wbc, get_block_t get_block,
+		writepage_t writepage_helper)
+{
 	struct backing_dev_info *bdi = mapping->backing_dev_info;
 	struct bio *bio = NULL;
 	sector_t last_block_in_bio = 0;
@@ -707,7 +717,8 @@ retry:
 				}
 			} else {
 				bio = mpage_writepage(bio, page, get_block,
-						&last_block_in_bio, &ret, wbc);
+						&last_block_in_bio, &ret, wbc,
+						writepage_helper);
 			}
 			if (ret || (--(wbc->nr_to_write) <= 0))
 				done = 1;
@@ -735,3 +746,4 @@ retry:
 	return ret;
 }
 EXPORT_SYMBOL(mpage_writepages);
+EXPORT_SYMBOL(__mpage_writepages);
diff -Naurp linux-2.6.10-orig/include/linux/fs.h linux-2.6.10/include/linux/fs.h
--- linux-2.6.10-orig/include/linux/fs.h	2005-01-21 00:51:40.000000000 -0600
+++ linux-2.6.10/include/linux/fs.h	2005-02-14 11:36:19.214499355 -0600
@@ -27,6 +27,7 @@ struct poll_table_struct;
 struct kstatfs;
 struct vm_area_struct;
 struct vfsmount;
+struct writeback_control;
 
 /*
  * It's silly to have NR_OPEN bigger than NR_FILE, but you can change
@@ -244,6 +245,7 @@ typedef int (get_blocks_t)(struct inode 
 			struct buffer_head *bh_result, int create);
 typedef void (dio_iodone_t)(struct inode *inode, loff_t offset,
 			ssize_t bytes, void *private);
+typedef int (writepage_t)(struct page *page, struct writeback_control *wbc);
 
 /*
  * Attribute flags.  These should be or-ed together to figure out what
diff -Naurp linux-2.6.10-orig/include/linux/mpage.h linux-2.6.10/include/linux/mpage.h
--- linux-2.6.10-orig/include/linux/mpage.h	2004-10-18 16:53:51.000000000 -0500
+++ linux-2.6.10/include/linux/mpage.h	2005-02-14 11:36:52.381333229 -0600
@@ -17,6 +17,9 @@ int mpage_readpages(struct address_space
 int mpage_readpage(struct page *page, get_block_t get_block);
 int mpage_writepages(struct address_space *mapping,
 		struct writeback_control *wbc, get_block_t get_block);
+int __mpage_writepages(struct address_space *mapping,
+		struct writeback_control *wbc, get_block_t get_block,
+		writepage_t writepage);
 
 static inline int
 generic_writepages(struct address_space *mapping, struct writeback_control *wbc)

  reply	other threads:[~2005-02-14 20:35 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-02-11  1:31 [RFC] ext3 writepages for writeback mode Badari Pulavarty
2005-02-11  1:53 ` Andrew Morton
2005-02-11  3:06   ` Badari Pulavarty
2005-02-11 23:29   ` Badari Pulavarty
2005-02-11 23:58     ` Andrew Morton
2005-02-12  0:09       ` Badari Pulavarty
2005-02-14 20:02         ` Sonny Rao
2005-02-14 20:22           ` Sonny Rao [this message]
2005-02-12  0:51       ` Badari Pulavarty
2005-02-12  1:00         ` Andrew Morton
2005-02-12  1:55           ` Badari Pulavarty
2005-02-12 12:20         ` [Ext2-devel] " Alex Tomas
2005-02-12 18:47           ` Badari Pulavarty
2005-02-12 21:43             ` Alex Tomas
2005-02-12 23:26           ` Badari Pulavarty
2005-02-12 23:29             ` Alex Tomas
2005-02-14 15:58               ` Badari Pulavarty
2005-02-14 16:34                 ` Alex Tomas
2005-02-14 16:50 ` Thiago Rondon
2005-02-14 18:08   ` Badari Pulavarty

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20050214202210.GA3873@kevlar.burdell.org \
    --to=sonny@burdell.org \
    --cc=akpm@osdl.org \
    --cc=ext2-devel@lists.sourceforge.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=pbadari@us.ibm.com \
    --cc=sct@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).