* [PATCH] ext4: fix reference counting bug on block allocation error
From: Vegard Nossum @ 2016-07-06 13:57 UTC
To: tytso; +Cc: linux-ext4, Vegard Nossum, Aneesh Kumar K.V
If we hit this error when mounted with errors=continue or
errors=remount-ro:
EXT4-fs error (device loop0): ext4_mb_mark_diskspace_used:2940: comm ext4.exe: Allocating blocks 5090-6081 which overlap fs metadata
then ext4_mb_new_blocks() will call ext4_mb_release_context() and try to
continue. However, ext4_mb_release_context() is the wrong thing to call
here since we are still actually using the allocation context.
Instead, handle it the same way that we handle other errors, except that
we retry the allocation instead of immediately returning an error (if we
were mounted with errors=continue, then ext4_mb_mark_diskspace_used()
should have fixed the original error and will either succeed or give a
different error; if we were mounted with errors=remount-ro, then it will
not be able to fix the original error and will raise a different error).
Fixes: 8556e8f3b6 ("ext4: Don't allow new groups to be added during block allocation")
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
---
fs/ext4/mballoc.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index c1ab3ec..0370f76 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4514,11 +4514,7 @@ repeat:
 	if (likely(ac->ac_status == AC_STATUS_FOUND)) {
 		*errp = ext4_mb_mark_diskspace_used(ac, handle, reserv_clstrs);
 		if (*errp == -EAGAIN) {
-			/*
-			 * drop the reference that we took
-			 * in ext4_mb_use_best_found
-			 */
-			ext4_mb_release_context(ac);
+			ext4_discard_allocated_blocks(ac);
 			ac->ac_b_ex.fe_group = 0;
 			ac->ac_b_ex.fe_start = 0;
 			ac->ac_b_ex.fe_len = 0;
--
1.9.1
* Re: [PATCH] ext4: fix reference counting bug on block allocation error
From: Vegard Nossum @ 2016-07-07 11:21 UTC
To: tytso; +Cc: linux-ext4, Aneesh Kumar K.V
On 07/06/2016 03:57 PM, Vegard Nossum wrote:
> If we hit this error when mounted with errors=continue or
> errors=remount-ro:
>
> EXT4-fs error (device loop0): ext4_mb_mark_diskspace_used:2940: comm ext4.exe: Allocating blocks 5090-6081 which overlap fs metadata
>
> then ext4_mb_new_blocks() will call ext4_mb_release_context() and try to
> continue. However, ext4_mb_release_context() is the wrong thing to call
> here since we are still actually using the allocation context.
>
> Instead, handle it the same way that we handle other errors, except that
> we retry the allocation instead of immediately returning an error (if we
> were mounted with errors=continue, then ext4_mb_mark_diskspace_used()
> should have fixed the original error and will either succeed or give a
> different error; if we were mounted with errors=remount-ro, then it will
> not be able to fix the original error and will raise a different error).
This didn't really work as I thought and I'm now getting stuck in an
infinite loop here where it tries (and fails) to allocate the same
blocks over and over.
The attached new patch just returns on error instead of trying to fix
the problem.
Vegard
[-- Attachment #2: 0001-ext4-fix-reference-counting-bug-on-block-allocation-.patch --]
[-- Type: text/x-patch, Size: 1790 bytes --]
From 09084a94d2b176e33ee52a0bb41409988e6ff744 Mon Sep 17 00:00:00 2001
From: Vegard Nossum <vegard.nossum@oracle.com>
Date: Wed, 6 Jul 2016 15:01:36 +0200
Subject: [PATCH] ext4: fix reference counting bug on block allocation error
If we hit this error when mounted with errors=continue or
errors=remount-ro:
EXT4-fs error (device loop0): ext4_mb_mark_diskspace_used:2940: comm ext4.exe: Allocating blocks 5090-6081 which overlap fs metadata
then ext4_mb_new_blocks() will call ext4_mb_release_context() and try to
continue. However, ext4_mb_release_context() is the wrong thing to call
here since we are still actually using the allocation context.
Instead, just error out. We could retry the allocation, but there is a
possibility of getting stuck in an infinite loop instead, so this seems
safer.
Fixes: 8556e8f3b6 ("ext4: Don't allow new groups to be added during block allocation")
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
---
fs/ext4/mballoc.c | 13 +------------
1 file changed, 1 insertion(+), 12 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index c1ab3ec..e65400a 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4513,18 +4513,7 @@ repeat:
 	}
 	if (likely(ac->ac_status == AC_STATUS_FOUND)) {
 		*errp = ext4_mb_mark_diskspace_used(ac, handle, reserv_clstrs);
-		if (*errp == -EAGAIN) {
-			/*
-			 * drop the reference that we took
-			 * in ext4_mb_use_best_found
-			 */
-			ext4_mb_release_context(ac);
-			ac->ac_b_ex.fe_group = 0;
-			ac->ac_b_ex.fe_start = 0;
-			ac->ac_b_ex.fe_len = 0;
-			ac->ac_status = AC_STATUS_CONTINUE;
-			goto repeat;
-		} else if (*errp) {
+		if (*errp) {
 			ext4_discard_allocated_blocks(ac);
 			goto errout;
 		} else {
--
1.9.1
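
For readers following the v1/v2 difference, here is a rough userspace sketch of the two control flows. It is a toy model, not kernel code: `struct toy_alloc`, `toy_pick_extent()`, `toy_mark_used()` and the `max_tries` cap are all hypothetical names invented for illustration, and the model simply assumes the allocator keeps proposing the same overlapping extent, as described in the report above.

```c
#include <errno.h>

/* Hypothetical stand-in for the allocation state; not a kernel structure. */
struct toy_alloc {
	int attempts;	/* how many times the allocator was invoked */
	int bad_start;	/* first block of the extent that overlaps metadata */
};

/* Toy allocator: assumed to propose the same extent on every call. */
static int toy_pick_extent(struct toy_alloc *a)
{
	a->attempts++;
	return a->bad_start;
}

/* Toy "overlaps fs metadata" check: -EAGAIN for the bad extent, else 0. */
static int toy_mark_used(const struct toy_alloc *a, int start)
{
	return start == a->bad_start ? -EAGAIN : 0;
}

/*
 * v1-style flow: retry whenever the check returns -EAGAIN.
 * max_tries exists only so that this sketch terminates.
 */
static int toy_new_blocks_retry(struct toy_alloc *a, int max_tries)
{
	int err;

	do {
		err = toy_mark_used(a, toy_pick_extent(a));
	} while (err == -EAGAIN && a->attempts < max_tries);
	return err;
}

/* v2-style flow: any error is returned after a single attempt. */
static int toy_new_blocks_errout(struct toy_alloc *a)
{
	return toy_mark_used(a, toy_pick_extent(a));
}
```

With `bad_start` set to 5090, the retry variant uses every attempt the cap allows and still returns `-EAGAIN`, while the error-out variant returns after one attempt. Note that the error-out variant still hands `-EAGAIN` back to the caller in this model.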
* Re: [PATCH] ext4: fix reference counting bug on block allocation error
From: Theodore Ts'o @ 2016-07-15 3:01 UTC
To: Vegard Nossum; +Cc: linux-ext4, Aneesh Kumar K.V
On Thu, Jul 07, 2016 at 01:21:55PM +0200, Vegard Nossum wrote:
>
> The attached new patch just returns on error instead of trying to fix
> the problem.
The issue with your v2 patch is that we end up returning EAGAIN to
userspace, which is confusing. So I've fixed up your patch by adding
the following changes.
Thanks,
- Ted
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 5399b88..1156216 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2943,7 +2943,7 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
 		ext4_error(sb, "Allocating blocks %llu-%llu which overlap "
 			   "fs metadata", block, block+len);
 		/* File system mounted not to panic on error
-		 * Fix the bitmap and repeat the block allocation
+		 * Fix the bitmap and return EFSCORRUPTED
 		 * We leak some of the blocks here.
 		 */
 		ext4_lock_group(sb, ac->ac_b_ex.fe_group);
@@ -2952,7 +2952,7 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
 		ext4_unlock_group(sb, ac->ac_b_ex.fe_group);
 		err = ext4_handle_dirty_metadata(handle, NULL, bitmap_bh);
 		if (!err)
-			err = -EAGAIN;
+			err = -EFSCORRUPTED;
 		goto out_err;
 	}
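
As a small illustration of what the fixup changes for callers, the sketch below mirrors the `if (!err) err = -EFSCORRUPTED;` step in userspace. `toy_overlap_result()` is a hypothetical helper, not a kernel function. In the ext4 sources `EFSCORRUPTED` is defined as `EUCLEAN`; the numeric fallback here is an assumption that only applies on hosts whose `<errno.h>` lacks `EUCLEAN`.

```c
#include <errno.h>

#ifndef EUCLEAN
#define EUCLEAN 117		/* assumed fallback; used only if the host lacks EUCLEAN */
#endif
#define EFSCORRUPTED EUCLEAN	/* same aliasing as in the ext4 headers */

/*
 * Hypothetical helper: dirty_err stands for the result of writing back
 * the repaired bitmap. A write-back failure is passed through unchanged;
 * otherwise the overlap is reported as filesystem corruption.
 */
static int toy_overlap_result(int dirty_err)
{
	int err = dirty_err;

	if (!err)
		err = -EFSCORRUPTED;
	return err;
}
```

A successful bitmap write-back therefore yields `-EUCLEAN` rather than `-EAGAIN`, and a failed one (for example `-EIO`) is returned as-is.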