From: Mingming Cao <cmm@us.ibm.com>
To: Jan Kara <jack@suse.cz>
Cc: linux-ext4@vger.kernel.org, sandeen@redhat.com
Subject: Re: Delayed allocation and page_lock vs transaction start ordering
Date: Tue, 15 Apr 2008 16:28:25 -0700 [thread overview]
Message-ID: <1208302106.3636.47.camel@localhost.localdomain> (raw)
In-Reply-To: <1208282932.3636.9.camel@localhost.localdomain>
On Tue, 2008-04-15 at 11:08 -0700, Mingming Cao wrote:
> On Tue, 2008-04-15 at 18:14 +0200, Jan Kara wrote:
> > Hi,
> >
> > I've ported my patch inversing locking ordering of page_lock and
> > transaction start to ext4 (on top of ext4 patch queue). Everything except
> > delayed allocation is converted (the patch is below for interested
> > readers). The question is how to proceed with delayed allocation. Its
> > current implementation in VFS is designed to work well with the old
> > ordering (page lock first, then start a transaction). We could bend it to
> > work with the new locking ordering but I really see no point since ext4 is
> > the only user.
>
> I think the plan is port the changes to ext2/3/JFS and support delayed
> allocation on those filesystems.
>
> > Also XFS has AFAIK ordering first start transaction, then
> > lock pages so if we should ever merge delayed alloc implementations the new
> > ordering would make it easier.
> > So what do people think here? Do you agree with reimplementing current
> > mpage_da_... functions?
>
> It worth a try, but I could not see how to bend delayed allocation to
> work the new ordering:( With delayed allocation Ext4 gets into
> writepage() directly with page locked, but we need to start transaction
> to do block allocation...:(
Looked again it seems possible to reservse the order with delayed
allocation. with ext3_da_writepgaes() we could start the journal before
calling mpage_da_writepages()(which will lock the pages), instead of
start the journal inside ext4_da_get_block_write(). So that we could get
the locking order right. Just need to taking care of the estimated
credits right.
How about this? (untested, just throw out for comment)
---
fs/ext4/inode.c | 53 ++++++++++++++++++++++++++++++++++++++++-------------
1 file changed, 40 insertions(+), 13 deletions(-)
Index: linux-2.6.25-rc9/fs/ext4/inode.c
===================================================================
--- linux-2.6.25-rc9.orig/fs/ext4/inode.c 2008-04-15 15:40:33.000000000 -0700
+++ linux-2.6.25-rc9/fs/ext4/inode.c 2008-04-15 16:12:27.000000000 -0700
@@ -1437,18 +1437,12 @@ static int ext4_da_get_block_prep(struct
static int ext4_da_get_block_write(struct inode *inode, sector_t iblock,
struct buffer_head *bh_result, int create)
{
- int ret, needed_blocks = ext4_writepage_trans_blocks(inode);
+ int ret;
unsigned max_blocks = bh_result->b_size >> inode->i_blkbits;
loff_t disksize = EXT4_I(inode)->i_disksize;
handle_t *handle = NULL;
- if (create) {
- handle = ext4_journal_start(inode, needed_blocks);
- if (IS_ERR(handle)) {
- ret = PTR_ERR(handle);
- goto out;
- }
- }
+ handle = ext4_journal_current_handle();
ret = ext4_get_blocks_wrap(handle, inode, iblock, max_blocks,
bh_result, create, 0);
@@ -1483,17 +1477,50 @@ static int ext4_da_get_block_write(struc
ret = 0;
}
-out:
- if (handle && !IS_ERR(handle))
- ext4_journal_stop(handle);
-
return ret;
}
+/*
+ * For now just follow the DIO way to estimate the max credits
+ * needed to write out EXT4_MAX_BUF_BLOCKS pages.
+ * todo: need to calculate the max credits need for
+ * extent based files, currently the DIO credits is based on
+ * indirect-blocks mapping way.
+ *
+ * Probably should have a generic way to calculate credits
+ * for DIO, writepages, and truncate
+ */
+#define EXT4_MAX_BUF_BLOCKS DIO_MAX_BLOCKS
+#define EXT4_MAX_BUF_CREDITS DIO_CREDITS
+
static int ext4_da_writepages(struct address_space *mapping,
struct writeback_control *wbc)
{
- return mpage_da_writepages(mapping, wbc, ext4_da_get_block_write);
+ handle_t *handle = NULL;
+ int needed_blocks;
+ int ret;
+
+ /*
+ * Estimate the worse case needed credits to write out
+ * EXT4_MAX_BUF_BLOCKS pages
+ */
+ needed_blocks = ext4_writepages_trans_blocks(inode);
+
+ /* start the transaction with credits*/
+ handle = ext4_journal_start(inode, needed_blocks);
+ if (IS_ERR(handle)) {
+ ret = PTR_ERR(handle);
+ return ret;
+ }
+
+ /* set the max pages could be write-out at a time */
+ wbc->range_end = (wbc->range_start >> PAGE_CACHE_SHIFT
+ + EXT4_MAX_BUF_BLOCKS) << PAGE_CACHE_SHIFT;
+
+ ret = mpage_da_writepages(mapping, wbc, ext4_da_get_block_write);
+ ext4_journal_stop(handle);
+
+ return ret;
}
static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
next prev parent reply other threads:[~2008-04-15 23:28 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-04-15 16:14 Delayed allocation and page_lock vs transaction start ordering Jan Kara
2008-04-15 17:58 ` Badari Pulavarty
2008-04-16 9:26 ` Jan Kara
2008-04-15 18:08 ` Mingming Cao
2008-04-15 23:28 ` Mingming Cao [this message]
2008-04-15 23:33 ` Mingming Cao
2008-04-16 10:35 ` Jan Kara
2008-04-16 18:24 ` Mingming Cao
2008-04-16 19:55 ` Badari Pulavarty
2008-04-16 9:38 ` Jan Kara
2008-04-18 18:54 ` Andreas Dilger
2008-04-18 19:38 ` Mingming Cao
2008-04-21 17:13 ` Jan Kara
2008-05-21 8:21 ` Aneesh Kumar K.V
2008-05-26 17:21 ` Jan Kara
2008-05-26 18:00 ` Aneesh Kumar K.V
2008-05-27 12:43 ` Jan Kara
2008-05-27 15:11 ` Aneesh Kumar K.V
2008-05-28 9:33 ` Jan Kara
2008-05-28 9:43 ` Aneesh Kumar K.V
2008-05-28 10:33 ` Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1208302106.3636.47.camel@localhost.localdomain \
--to=cmm@us.ibm.com \
--cc=jack@suse.cz \
--cc=linux-ext4@vger.kernel.org \
--cc=sandeen@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.