From: Mingming Cao <cmm@us.ibm.com>
To: Jan Kara <jack@suse.cz>
Cc: linux-ext4@vger.kernel.org, sandeen@redhat.com
Subject: Re: Delayed allocation and page_lock vs transaction start ordering
Date: Tue, 15 Apr 2008 16:28:25 -0700 [thread overview]
Message-ID: <1208302106.3636.47.camel@localhost.localdomain> (raw)
In-Reply-To: <1208282932.3636.9.camel@localhost.localdomain>
On Tue, 2008-04-15 at 11:08 -0700, Mingming Cao wrote:
> On Tue, 2008-04-15 at 18:14 +0200, Jan Kara wrote:
> > Hi,
> >
> > I've ported my patch inversing locking ordering of page_lock and
> > transaction start to ext4 (on top of ext4 patch queue). Everything except
> > delayed allocation is converted (the patch is below for interested
> > readers). The question is how to proceed with delayed allocation. Its
> > current implementation in VFS is designed to work well with the old
> > ordering (page lock first, then start a transaction). We could bend it to
> > work with the new locking ordering but I really see no point since ext4 is
> > the only user.
>
> I think the plan is port the changes to ext2/3/JFS and support delayed
> allocation on those filesystems.
>
> > Also XFS has AFAIK ordering first start transaction, then
> > lock pages so if we should ever merge delayed alloc implementations the new
> > ordering would make it easier.
> > So what do people think here? Do you agree with reimplementing current
> > mpage_da_... functions?
>
> It worth a try, but I could not see how to bend delayed allocation to
> work the new ordering:( With delayed allocation Ext4 gets into
> writepage() directly with page locked, but we need to start transaction
> to do block allocation...:(
Looked again it seems possible to reservse the order with delayed
allocation. with ext3_da_writepgaes() we could start the journal before
calling mpage_da_writepages()(which will lock the pages), instead of
start the journal inside ext4_da_get_block_write(). So that we could get
the locking order right. Just need to taking care of the estimated
credits right.
How about this? (untested, just throw out for comment)
---
fs/ext4/inode.c | 53 ++++++++++++++++++++++++++++++++++++++++-------------
1 file changed, 40 insertions(+), 13 deletions(-)
Index: linux-2.6.25-rc9/fs/ext4/inode.c
===================================================================
--- linux-2.6.25-rc9.orig/fs/ext4/inode.c 2008-04-15 15:40:33.000000000 -0700
+++ linux-2.6.25-rc9/fs/ext4/inode.c 2008-04-15 16:12:27.000000000 -0700
@@ -1437,18 +1437,12 @@ static int ext4_da_get_block_prep(struct
static int ext4_da_get_block_write(struct inode *inode, sector_t iblock,
struct buffer_head *bh_result, int create)
{
- int ret, needed_blocks = ext4_writepage_trans_blocks(inode);
+ int ret;
unsigned max_blocks = bh_result->b_size >> inode->i_blkbits;
loff_t disksize = EXT4_I(inode)->i_disksize;
handle_t *handle = NULL;
- if (create) {
- handle = ext4_journal_start(inode, needed_blocks);
- if (IS_ERR(handle)) {
- ret = PTR_ERR(handle);
- goto out;
- }
- }
+ handle = ext4_journal_current_handle();
ret = ext4_get_blocks_wrap(handle, inode, iblock, max_blocks,
bh_result, create, 0);
@@ -1483,17 +1477,50 @@ static int ext4_da_get_block_write(struc
ret = 0;
}
-out:
- if (handle && !IS_ERR(handle))
- ext4_journal_stop(handle);
-
return ret;
}
+/*
+ * For now just follow the DIO way to estimate the max credits
+ * needed to write out EXT4_MAX_BUF_BLOCKS pages.
+ * todo: need to calculate the max credits need for
+ * extent based files, currently the DIO credits is based on
+ * indirect-blocks mapping way.
+ *
+ * Probably should have a generic way to calculate credits
+ * for DIO, writepages, and truncate
+ */
+#define EXT4_MAX_BUF_BLOCKS DIO_MAX_BLOCKS
+#define EXT4_MAX_BUF_CREDITS DIO_CREDITS
+
static int ext4_da_writepages(struct address_space *mapping,
struct writeback_control *wbc)
{
- return mpage_da_writepages(mapping, wbc, ext4_da_get_block_write);
+ handle_t *handle = NULL;
+ int needed_blocks;
+ int ret;
+
+ /*
+ * Estimate the worse case needed credits to write out
+ * EXT4_MAX_BUF_BLOCKS pages
+ */
+ needed_blocks = ext4_writepages_trans_blocks(inode);
+
+ /* start the transaction with credits*/
+ handle = ext4_journal_start(inode, needed_blocks);
+ if (IS_ERR(handle)) {
+ ret = PTR_ERR(handle);
+ return ret;
+ }
+
+ /* set the max pages could be write-out at a time */
+ wbc->range_end = (wbc->range_start >> PAGE_CACHE_SHIFT
+ + EXT4_MAX_BUF_BLOCKS) << PAGE_CACHE_SHIFT;
+
+ ret = mpage_da_writepages(mapping, wbc, ext4_da_get_block_write);
+ ext4_journal_stop(handle);
+
+ return ret;
}
static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
next prev parent reply other threads:[~2008-04-15 23:28 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-04-15 16:14 Delayed allocation and page_lock vs transaction start ordering Jan Kara
2008-04-15 17:58 ` Badari Pulavarty
2008-04-16 9:26 ` Jan Kara
2008-04-15 18:08 ` Mingming Cao
2008-04-15 23:28 ` Mingming Cao [this message]
2008-04-15 23:33 ` Mingming Cao
2008-04-16 10:35 ` Jan Kara
2008-04-16 18:24 ` Mingming Cao
2008-04-16 19:55 ` Badari Pulavarty
2008-04-16 9:38 ` Jan Kara
2008-04-18 18:54 ` Andreas Dilger
2008-04-18 19:38 ` Mingming Cao
2008-04-21 17:13 ` Jan Kara
2008-05-21 8:21 ` Aneesh Kumar K.V
2008-05-26 17:21 ` Jan Kara
2008-05-26 18:00 ` Aneesh Kumar K.V
2008-05-27 12:43 ` Jan Kara
2008-05-27 15:11 ` Aneesh Kumar K.V
2008-05-28 9:33 ` Jan Kara
2008-05-28 9:43 ` Aneesh Kumar K.V
2008-05-28 10:33 ` Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1208302106.3636.47.camel@localhost.localdomain \
--to=cmm@us.ibm.com \
--cc=jack@suse.cz \
--cc=linux-ext4@vger.kernel.org \
--cc=sandeen@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox