From: Dmitry Monakhov <dmonakhov@openvz.org>
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu
Subject: [PATCH 2/2] ext4: fix delalloc retry loop logic
Date: Wed, 03 Feb 2010 21:39:30 +0300 [thread overview]
Message-ID: <87zl3qrwnx.fsf@openvz.org> (raw)
[-- Attachment #1: Type: text/plain, Size: 423 bytes --]
Theodore please review this patch ASAP, currently ext4+quota is
fatally broken due to your patch. Christmas holidays when you
submit your patch is not good time for good review, IMHO
i was too lazy to review it carefully.
Testcase is trivial it is enough just hit a quota barrier.
dmon$ set-quota-limit /mnt id=dmon --bsoft=1000 --bsoft=1000
dmon$ dd if=/dev/zefo of=/mnt/file
kernel BUG at fs/jbd2/transaction.c:1027!
[-- Attachment #2: 0002-ext4-fix-delalloc-retry-loop-logic.patch --]
[-- Type: text/plain, Size: 3667 bytes --]
>From d66a56204ec7f79947423411972c460133e36ae5 Mon Sep 17 00:00:00 2001
From: Dmitry Monakhov <dmonakhov@openvz.org>
Date: Wed, 3 Feb 2010 18:24:13 +0300
Subject: [PATCH 2/2] ext4: fix delalloc retry loop logic
Current delalloc write path is broken:
ext4_da_write_begin()
ext4_journal_start(inode, 1); -> current->journal != NULL
block_write_begin
ext4_da_get_block_prep()
ext4_da_reserve_space()
ext4_should_retry_alloc() -> deadlock
write_inode_now() -> BUG_ON due to lack of journal credits
Bug was partly introduced by following commit:
0637c6f4135f592f094207c7c21e7c0fc5557834
ext4: Patch up how we claim metadata blocks for quota purposes
In order to preserve retry logic and eliminate bugs we have to
move retry loop to ext4_da_write_begin()
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
fs/ext4/inode.c | 41 ++++++++++++++++++-----------------------
1 files changed, 18 insertions(+), 23 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 2d3fe4d..022ce18 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1815,7 +1815,6 @@ static int ext4_journalled_write_end(struct file *file,
*/
static int ext4_da_reserve_space(struct inode *inode, sector_t lblock)
{
- int retries = 0;
struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
struct ext4_inode_info *ei = EXT4_I(inode);
unsigned long md_needed, md_reserved;
@@ -1825,7 +1824,6 @@ static int ext4_da_reserve_space(struct inode *inode, sector_t lblock)
* in order to allocate nrblocks
* worse case is one extent per block
*/
-repeat:
spin_lock(&ei->i_block_reservation_lock);
md_reserved = ei->i_reserved_meta_blocks;
md_needed = ext4_calc_metadata_amount(inode, lblock);
@@ -1836,27 +1834,11 @@ repeat:
* later. Real quota accounting is done at pages writeout
* time.
*/
- if (vfs_dq_reserve_block(inode, md_needed + 1)) {
- /*
- * We tend to badly over-estimate the amount of
- * metadata blocks which are needed, so if we have
- * reserved any metadata blocks, try to force out the
- * inode and see if we have any better luck.
- */
- if (md_reserved && retries++ <= 3)
- goto retry;
+ if (vfs_dq_reserve_block(inode, md_needed + 1))
return -EDQUOT;
- }
if (ext4_claim_free_blocks(sbi, md_needed + 1)) {
vfs_dq_release_reservation_block(inode, md_needed + 1);
- if (ext4_should_retry_alloc(inode->i_sb, &retries)) {
- retry:
- if (md_reserved)
- write_inode_now(inode, (retries == 3));
- yield();
- goto repeat;
- }
return -ENOSPC;
}
spin_lock(&ei->i_block_reservation_lock);
@@ -3033,7 +3015,7 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
loff_t pos, unsigned len, unsigned flags,
struct page **pagep, void **fsdata)
{
- int ret, retries = 0;
+ int ret, dqretry, retries = 0;
struct page *page;
pgoff_t index;
unsigned from, to;
@@ -3090,9 +3072,22 @@ retry:
ext4_truncate_failed_write(inode);
}
- if (!(flags & EXT4_AOP_FLAG_NORETRY) &&
- ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
- goto retry;
+ dqretry = (ret == -EDQUOT) || EXT4_I(inode)->i_reserved_meta_blocks;
+ if ( !(flags & EXT4_AOP_FLAG_NORETRY) &&
+ (ret == -ENOSPC || dqretry) &&
+ ext4_should_retry_alloc(inode->i_sb, &retries)) {
+ if (dqretry) {
+ /*
+ * We tend to badly over-estimate the amount of
+ * metadata blocks which are needed, so if we have
+ * reserved any metadata blocks, try to force out the
+ * inode and see if we have any better luck.
+ */
+ write_inode_now(inode, (retries == 3));
+ }
+ yield();
+ goto retry;
+ }
out:
return ret;
}
--
1.6.3.3
next reply other threads:[~2010-02-03 18:39 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-02-03 18:39 Dmitry Monakhov [this message]
2010-02-03 19:07 ` [PATCH 2/2] ext4: fix delalloc retry loop logic v2 Dmitry Monakhov
2010-02-03 21:16 ` tytso
2010-02-04 11:29 ` Aneesh Kumar K. V
2010-02-04 19:45 ` tytso
2010-02-04 21:50 ` Dmitry Monakhov
2010-02-05 3:55 ` tytso
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87zl3qrwnx.fsf@openvz.org \
--to=dmonakhov@openvz.org \
--cc=linux-ext4@vger.kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.