* Re: ext4: fix reference counting bug on block allocation error
[not found] <20160727052921.5B41C35574@git2.kroah.org>
@ 2016-08-14 18:32 ` Greg KH
2016-08-14 18:37 ` Vegard Nossum
0 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2016-08-14 18:32 UTC (permalink / raw)
To: Vegard Nossum, Theodore Ts'o, adilger.kernel
Cc: linux-ext4, Aneesh Kumar K.V
Hi Vegard and ext4 developers,
The patch below, in Linus's tree, references a patch in the Fixes: line
that is not in Linus's tree (neither the git commit id, nor the subject
line.)
That's a bit confusing, what is this patch supposed to be fixing up?
What stable tree(s) should it go to if the original patch it fixes isn't
even in any tree?
confused,
greg k-h
On Wed, Jul 27, 2016 at 05:29:21AM +0000, Gregs git-bot wrote:
> commit: 554a5ccc4e4a20c5f3ec859de0842db4b4b9c77e
> From: Vegard Nossum <vegard.nossum@oracle.com>
> Date: Thu, 14 Jul 2016 23:02:47 -0400
> Subject: ext4: fix reference counting bug on block allocation error
>
> If we hit this error when mounted with errors=continue or
> errors=remount-ro:
>
> EXT4-fs error (device loop0): ext4_mb_mark_diskspace_used:2940: comm ext4.exe: Allocating blocks 5090-6081 which overlap fs metadata
>
> then ext4_mb_new_blocks() will call ext4_mb_release_context() and try to
> continue. However, ext4_mb_release_context() is the wrong thing to call
> here since we are still actually using the allocation context.
>
> Instead, just error out. We could retry the allocation, but there is a
> possibility of getting stuck in an infinite loop instead, so this seems
> safer.
>
> [ Fixed up so we don't return EAGAIN to userspace. --tytso ]
>
> Fixes: 8556e8f3b6 ("ext4: Don't allow new groups to be added during block allocation")
> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Cc: stable@vger.kernel.org
> ---
> fs/ext4/mballoc.c | 17 +++--------------
> 1 file changed, 3 insertions(+), 14 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 77249e1..1156216 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2943,7 +2943,7 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
> ext4_error(sb, "Allocating blocks %llu-%llu which overlap "
> "fs metadata", block, block+len);
> /* File system mounted not to panic on error
> - * Fix the bitmap and repeat the block allocation
> + * Fix the bitmap and return EFSCORRUPTED
> * We leak some of the blocks here.
> */
> ext4_lock_group(sb, ac->ac_b_ex.fe_group);
> @@ -2952,7 +2952,7 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
> ext4_unlock_group(sb, ac->ac_b_ex.fe_group);
> err = ext4_handle_dirty_metadata(handle, NULL, bitmap_bh);
> if (!err)
> - err = -EAGAIN;
> + err = -EFSCORRUPTED;
> goto out_err;
> }
>
> @@ -4517,18 +4517,7 @@ repeat:
> }
> if (likely(ac->ac_status == AC_STATUS_FOUND)) {
> *errp = ext4_mb_mark_diskspace_used(ac, handle, reserv_clstrs);
> - if (*errp == -EAGAIN) {
> - /*
> - * drop the reference that we took
> - * in ext4_mb_use_best_found
> - */
> - ext4_mb_release_context(ac);
> - ac->ac_b_ex.fe_group = 0;
> - ac->ac_b_ex.fe_start = 0;
> - ac->ac_b_ex.fe_len = 0;
> - ac->ac_status = AC_STATUS_CONTINUE;
> - goto repeat;
> - } else if (*errp) {
> + if (*errp) {
> ext4_discard_allocated_blocks(ac);
> goto errout;
> } else {
> --
> 2.9.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: ext4: fix reference counting bug on block allocation error
2016-08-14 18:32 ` ext4: fix reference counting bug on block allocation error Greg KH
@ 2016-08-14 18:37 ` Vegard Nossum
2016-08-14 18:51 ` Greg KH
0 siblings, 1 reply; 9+ messages in thread
From: Vegard Nossum @ 2016-08-14 18:37 UTC (permalink / raw)
To: Greg KH, Theodore Ts'o, adilger.kernel; +Cc: linux-ext4, Aneesh Kumar K.V
On 08/14/2016 08:32 PM, Greg KH wrote:
> Hi Vegard and ext4 developers,
>
> The patch below, in Linus's tree, references a patch in the Fixes: line
> that is not in Linus's tree (neither the git commit id, nor the subject
> line.)
>
It seems to exist?
>> Fixes: 8556e8f3b6 ("ext4: Don't allow new groups to be added during
block allocation")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8556e8f3b6c4c11601ce1e9ea8090a6d8bd5daae
> That's a bit confusing, what is this patch supposed to be fixing up?
> What stable tree(s) should it go to if the original patch it fixes isn't
> even in any tree?
The referenced commit adds the ext4_mb_release_context(ac); line which
is what is causing problems because that releases the context which is
still in fact in use.
>
> confused,
Confused too :-)
Vegard
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: ext4: fix reference counting bug on block allocation error
2016-08-14 18:37 ` Vegard Nossum
@ 2016-08-14 18:51 ` Greg KH
2016-08-14 18:58 ` Greg KH
0 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2016-08-14 18:51 UTC (permalink / raw)
To: Vegard Nossum
Cc: Theodore Ts'o, adilger.kernel, linux-ext4, Aneesh Kumar K.V
On Sun, Aug 14, 2016 at 08:37:56PM +0200, Vegard Nossum wrote:
> On 08/14/2016 08:32 PM, Greg KH wrote:
> > Hi Vegard and ext4 developers,
> >
> > The patch below, in Linus's tree, references a patch in the Fixes: line
> > that is not in Linus's tree (neither the git commit id, nor the subject
> > line.)
> >
>
> It seems to exist?
>
> >> Fixes: 8556e8f3b6 ("ext4: Don't allow new groups to be added during block
> allocation")
>
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8556e8f3b6c4c11601ce1e9ea8090a6d8bd5daae
>
> > That's a bit confusing, what is this patch supposed to be fixing up?
> > What stable tree(s) should it go to if the original patch it fixes isn't
> > even in any tree?
>
> The referenced commit adds the ext4_mb_release_context(ac); line which
> is what is causing problems because that releases the context which is
> still in fact in use.
Oh doh, sorry for the noise, I was only looking at 4.8-rc1 and older, my
fault.
nevermind...
greg k-h
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: ext4: fix reference counting bug on block allocation error
2016-08-14 18:51 ` Greg KH
@ 2016-08-14 18:58 ` Greg KH
2016-08-14 19:02 ` Vegard Nossum
0 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2016-08-14 18:58 UTC (permalink / raw)
To: Vegard Nossum
Cc: Theodore Ts'o, adilger.kernel, linux-ext4, Aneesh Kumar K.V
On Sun, Aug 14, 2016 at 08:51:25PM +0200, Greg KH wrote:
> On Sun, Aug 14, 2016 at 08:37:56PM +0200, Vegard Nossum wrote:
> > On 08/14/2016 08:32 PM, Greg KH wrote:
> > > Hi Vegard and ext4 developers,
> > >
> > > The patch below, in Linus's tree, references a patch in the Fixes: line
> > > that is not in Linus's tree (neither the git commit id, nor the subject
> > > line.)
> > >
> >
> > It seems to exist?
> >
> > >> Fixes: 8556e8f3b6 ("ext4: Don't allow new groups to be added during block
> > allocation")
> >
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8556e8f3b6c4c11601ce1e9ea8090a6d8bd5daae
> >
> > > That's a bit confusing, what is this patch supposed to be fixing up?
> > > What stable tree(s) should it go to if the original patch it fixes isn't
> > > even in any tree?
> >
> > The referenced commit adds the ext4_mb_release_context(ac); line which
> > is what is causing problems because that releases the context which is
> > still in fact in use.
>
> Oh doh, sorry for the noise, I was only looking at 4.8-rc1 and older, my
> fault.
Hm, wait. Commit 8556e8f3b6 didn't show up in my filters for some
reason (which it not good, and makes me worry), but also, it doesn't
apply to the stable trees at all.
So can you please send backports of that commit, and this one, if you
want them both queued up to any stable kernel tree?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: ext4: fix reference counting bug on block allocation error
2016-08-14 18:58 ` Greg KH
@ 2016-08-14 19:02 ` Vegard Nossum
2016-08-14 19:09 ` Greg KH
0 siblings, 1 reply; 9+ messages in thread
From: Vegard Nossum @ 2016-08-14 19:02 UTC (permalink / raw)
To: Greg KH; +Cc: Theodore Ts'o, adilger.kernel, linux-ext4, Aneesh Kumar K.V
On 08/14/2016 08:58 PM, Greg KH wrote:
> On Sun, Aug 14, 2016 at 08:51:25PM +0200, Greg KH wrote:
>> On Sun, Aug 14, 2016 at 08:37:56PM +0200, Vegard Nossum wrote:
>>> On 08/14/2016 08:32 PM, Greg KH wrote:
>>>> Hi Vegard and ext4 developers,
>>>>
>>>> The patch below, in Linus's tree, references a patch in the Fixes: line
>>>> that is not in Linus's tree (neither the git commit id, nor the subject
>>>> line.)
>>>>
>>>
>>> It seems to exist?
>>>
>>>>> Fixes: 8556e8f3b6 ("ext4: Don't allow new groups to be added during block
>>> allocation")
>>>
>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8556e8f3b6c4c11601ce1e9ea8090a6d8bd5daae
>>>
>>>> That's a bit confusing, what is this patch supposed to be fixing up?
>>>> What stable tree(s) should it go to if the original patch it fixes isn't
>>>> even in any tree?
>>>
>>> The referenced commit adds the ext4_mb_release_context(ac); line which
>>> is what is causing problems because that releases the context which is
>>> still in fact in use.
>>
>> Oh doh, sorry for the noise, I was only looking at 4.8-rc1 and older, my
>> fault.
>
> Hm, wait. Commit 8556e8f3b6 didn't show up in my filters for some
> reason (which it not good, and makes me worry), but also, it doesn't
> apply to the stable trees at all.
>
> So can you please send backports of that commit, and this one, if you
> want them both queued up to any stable kernel tree?
The commit which is being fixed is ancient:
$ git describe 8556e8f3b6
v2.6.28-5758-g8556e8f3
It's probably already in the base of every current stable tree, no?
Vegard
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: ext4: fix reference counting bug on block allocation error
2016-08-14 19:02 ` Vegard Nossum
@ 2016-08-14 19:09 ` Greg KH
2016-08-14 19:35 ` Vegard Nossum
2016-08-14 20:19 ` Theodore Ts'o
0 siblings, 2 replies; 9+ messages in thread
From: Greg KH @ 2016-08-14 19:09 UTC (permalink / raw)
To: Vegard Nossum
Cc: Theodore Ts'o, adilger.kernel, linux-ext4, Aneesh Kumar K.V
On Sun, Aug 14, 2016 at 09:02:48PM +0200, Vegard Nossum wrote:
> On 08/14/2016 08:58 PM, Greg KH wrote:
> > On Sun, Aug 14, 2016 at 08:51:25PM +0200, Greg KH wrote:
> > > On Sun, Aug 14, 2016 at 08:37:56PM +0200, Vegard Nossum wrote:
> > > > On 08/14/2016 08:32 PM, Greg KH wrote:
> > > > > Hi Vegard and ext4 developers,
> > > > >
> > > > > The patch below, in Linus's tree, references a patch in the Fixes: line
> > > > > that is not in Linus's tree (neither the git commit id, nor the subject
> > > > > line.)
> > > > >
> > > >
> > > > It seems to exist?
> > > >
> > > > > > Fixes: 8556e8f3b6 ("ext4: Don't allow new groups to be added during block
> > > > allocation")
> > > >
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8556e8f3b6c4c11601ce1e9ea8090a6d8bd5daae
> > > >
> > > > > That's a bit confusing, what is this patch supposed to be fixing up?
> > > > > What stable tree(s) should it go to if the original patch it fixes isn't
> > > > > even in any tree?
> > > >
> > > > The referenced commit adds the ext4_mb_release_context(ac); line which
> > > > is what is causing problems because that releases the context which is
> > > > still in fact in use.
> > >
> > > Oh doh, sorry for the noise, I was only looking at 4.8-rc1 and older, my
> > > fault.
> >
> > Hm, wait. Commit 8556e8f3b6 didn't show up in my filters for some
> > reason (which it not good, and makes me worry), but also, it doesn't
> > apply to the stable trees at all.
> >
> > So can you please send backports of that commit, and this one, if you
> > want them both queued up to any stable kernel tree?
>
> The commit which is being fixed is ancient:
>
> $ git describe 8556e8f3b6
> v2.6.28-5758-g8556e8f3
>
> It's probably already in the base of every current stable tree, no?
Huh? Ok, this odd:
$ git describe --contains 8556e8f3b6
fatal: cannot describe '8556e8f3b6c4c11601ce1e9ea8090a6d8bd5daae'
Yet just a plain 'git describe' does work...
That's what threw me off, I only use --contains as that shows the
release the commit is in.
Ok, that makes me feel a bit better (that my scripts didn't miss the
patch, it was just old), but I wonder what is going on with git...
thanks again,
greg k-h
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: ext4: fix reference counting bug on block allocation error
2016-08-14 19:09 ` Greg KH
@ 2016-08-14 19:35 ` Vegard Nossum
2016-08-14 19:46 ` Greg KH
2016-08-14 20:19 ` Theodore Ts'o
1 sibling, 1 reply; 9+ messages in thread
From: Vegard Nossum @ 2016-08-14 19:35 UTC (permalink / raw)
To: Greg KH; +Cc: Theodore Ts'o, adilger.kernel, linux-ext4, Aneesh Kumar K.V
On 08/14/2016 09:09 PM, Greg KH wrote:
> On Sun, Aug 14, 2016 at 09:02:48PM +0200, Vegard Nossum wrote:
>> The commit which is being fixed is ancient:
>>
>> $ git describe 8556e8f3b6
>> v2.6.28-5758-g8556e8f3
>>
>> It's probably already in the base of every current stable tree, no?
>
> Huh? Ok, this odd:
> $ git describe --contains 8556e8f3b6
> fatal: cannot describe '8556e8f3b6c4c11601ce1e9ea8090a6d8bd5daae'
>
> Yet just a plain 'git describe' does work...
>
> That's what threw me off, I only use --contains as that shows the
> release the commit is in.
>
> Ok, that makes me feel a bit better (that my scripts didn't miss the
> patch, it was just old), but I wonder what is going on with git...
>
How odd.
There is this:
http://www.spinics.net/lists/git/msg246837.html
"""
Yes, the "describe --contains" algorithm uses timestamps to cut off the
traversal, so it can do the wrong thing if there's clock skew. It has a
"slop" margin of one day, but skew larger than that can fool it.
"""
It looks like ancestors of 87d8fe1 don't work:
$ for commit in $(git rev-list 87d8fe1^^..87d8fe1); do git describe
--contains $commit; done
v2.6.29-rc1~40^2~12
fatal: cannot describe '0087d9fb3f29f59e8d42c8b058376d80e5adde4c'
So maybe it's because the parent commit is 2 days in the future:
$ git log --format=fuller 87d8fe1^^..87d8fe1
commit 87d8fe1ee6b8d2f95076142d58c440dba4e7bdc2
Author: Theodore Ts'o <tytso@mit.edu>
AuthorDate: Sat Jan 3 09:47:09 2009 -0500
Commit: Theodore Ts'o <tytso@mit.edu>
CommitDate: Sat Jan 3 09:47:09 2009 -0500
[...]
commit 0087d9fb3f29f59e8d42c8b058376d80e5adde4c
Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
AuthorDate: Mon Jan 5 21:49:12 2009 -0500
Commit: Theodore Ts'o <tytso@mit.edu>
CommitDate: Mon Jan 5 21:49:12 2009 -0500
[...]
Vegard
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: ext4: fix reference counting bug on block allocation error
2016-08-14 19:35 ` Vegard Nossum
@ 2016-08-14 19:46 ` Greg KH
0 siblings, 0 replies; 9+ messages in thread
From: Greg KH @ 2016-08-14 19:46 UTC (permalink / raw)
To: Vegard Nossum
Cc: Theodore Ts'o, adilger.kernel, linux-ext4, Aneesh Kumar K.V
On Sun, Aug 14, 2016 at 09:35:59PM +0200, Vegard Nossum wrote:
> On 08/14/2016 09:09 PM, Greg KH wrote:
> > On Sun, Aug 14, 2016 at 09:02:48PM +0200, Vegard Nossum wrote:
> > > The commit which is being fixed is ancient:
> > >
> > > $ git describe 8556e8f3b6
> > > v2.6.28-5758-g8556e8f3
> > >
> > > It's probably already in the base of every current stable tree, no?
> >
> > Huh? Ok, this odd:
> > $ git describe --contains 8556e8f3b6
> > fatal: cannot describe '8556e8f3b6c4c11601ce1e9ea8090a6d8bd5daae'
> >
> > Yet just a plain 'git describe' does work...
> >
> > That's what threw me off, I only use --contains as that shows the
> > release the commit is in.
> >
> > Ok, that makes me feel a bit better (that my scripts didn't miss the
> > patch, it was just old), but I wonder what is going on with git...
> >
>
> How odd.
>
> There is this:
>
> http://www.spinics.net/lists/git/msg246837.html
>
> """
> Yes, the "describe --contains" algorithm uses timestamps to cut off the
> traversal, so it can do the wrong thing if there's clock skew. It has a
> "slop" margin of one day, but skew larger than that can fool it.
> """
>
> It looks like ancestors of 87d8fe1 don't work:
>
> $ for commit in $(git rev-list 87d8fe1^^..87d8fe1); do git describe
> --contains $commit; done
> v2.6.29-rc1~40^2~12
> fatal: cannot describe '0087d9fb3f29f59e8d42c8b058376d80e5adde4c'
>
> So maybe it's because the parent commit is 2 days in the future:
>
> $ git log --format=fuller 87d8fe1^^..87d8fe1
> commit 87d8fe1ee6b8d2f95076142d58c440dba4e7bdc2
> Author: Theodore Ts'o <tytso@mit.edu>
> AuthorDate: Sat Jan 3 09:47:09 2009 -0500
> Commit: Theodore Ts'o <tytso@mit.edu>
> CommitDate: Sat Jan 3 09:47:09 2009 -0500
> [...]
>
> commit 0087d9fb3f29f59e8d42c8b058376d80e5adde4c
> Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> AuthorDate: Mon Jan 5 21:49:12 2009 -0500
> Commit: Theodore Ts'o <tytso@mit.edu>
> CommitDate: Mon Jan 5 21:49:12 2009 -0500
> [...]
Ouch, that's some clock skew.
Thanks for working this out, I've now applied the patch to the stable
trees.
greg k-h
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: ext4: fix reference counting bug on block allocation error
2016-08-14 19:09 ` Greg KH
2016-08-14 19:35 ` Vegard Nossum
@ 2016-08-14 20:19 ` Theodore Ts'o
1 sibling, 0 replies; 9+ messages in thread
From: Theodore Ts'o @ 2016-08-14 20:19 UTC (permalink / raw)
To: Greg KH; +Cc: Vegard Nossum, adilger.kernel, linux-ext4, Aneesh Kumar K.V
On Sun, Aug 14, 2016 at 09:09:19PM +0200, Greg KH wrote:
> Huh? Ok, this odd:
> $ git describe --contains 8556e8f3b6
> fatal: cannot describe '8556e8f3b6c4c11601ce1e9ea8090a6d8bd5daae'
>
> Yet just a plain 'git describe' does work...
>
> That's what threw me off, I only use --contains as that shows the
> release the commit is in.
>
> Ok, that makes me feel a bit better (that my scripts didn't miss the
> patch, it was just old), but I wonder what is going on with git...
Yeah, that's partially my fault. Back in 2009, I was using a version
of guilt (quilt for git) that didn't enforce the requirement that
within a particular linear sequence of commits, the Committer time
must be always be increasing. In other words, things like this are
never supposed to happen:
% git log --pretty="%h %ci %s" -7 e8134b2
e8134b2 2009-01-05 21:38:26 -0500 ext4: Fix race between read_block_bitmap() and mark_diskspace_used()
5d1b1b3 2009-01-05 22:19:52 -0500 ext4: fix BUG when calling ext4_error with locked block group
b7be019 2008-11-23 23:51:53 -0500 ext4: Fix lockdep recursive locking warning
7a2fcbf 2009-01-05 21:36:55 -0500 ext4: don't use blocks freed but not yet committed in buddy cache init
fb68407 2008-11-06 17:50:21 -0500 jbd2: Call journal commit callback without holding j_list_lock
c3a326a 2008-11-25 15:11:52 -0500 ext4: cleanup mballoc header files
920313a 2009-01-05 21:36:19 -0500 ext4: Use EXT4_GROUP_INFO_NEED_INIT_BIT during resize
... because it hopelessly confuses commands like "git describe". As
you have rediscovered. :-)
The guilt command was fixed a long time ago to not do this thing, but
the nasty commits from late 2008 and early 2009 are still in the tree.
Fortunately, it's rare that we need to refer to commits this old, and
so people don't run into this often.
- Ted
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2016-08-14 20:19 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20160727052921.5B41C35574@git2.kroah.org>
2016-08-14 18:32 ` ext4: fix reference counting bug on block allocation error Greg KH
2016-08-14 18:37 ` Vegard Nossum
2016-08-14 18:51 ` Greg KH
2016-08-14 18:58 ` Greg KH
2016-08-14 19:02 ` Vegard Nossum
2016-08-14 19:09 ` Greg KH
2016-08-14 19:35 ` Vegard Nossum
2016-08-14 19:46 ` Greg KH
2016-08-14 20:19 ` Theodore Ts'o
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox